Perl China Blog Spot

Understood, but why are my two functions wrong?  agentzh wrote:
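We generally use the Encode::Guess module from CPAN. It works very well on long text, but on very short text, say two or three characters, it is not very accurate, heh.

    use Encode::Guess;

    my @enc = qw( UTF-8 GB2312 Big5 GBK Latin1 );
    my $charset;
    for my $enc (@enc) {
        # guess_encoding returns an object on success, an error string otherwise
        my $decoder = guess_encoding($data, $enc);
        if (ref $decoder) {
            $charset = $decoder->name;
            last;
        }
    }
    if (!$charset) {
        die "Can't determine the charset of the input.\n";
    }

Here @enc is the list of candidate charsets to try. Actually, I think you could also do this directly with Encode's decode function: just pass a parameter that makes it throw an exception as soon as it hits an invalid byte ;)

-agentzh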
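The decode-based alternative mentioned at the end can be sketched roughly as follows. This is only an illustration, not code from the thread: the helper name first_valid_charset is hypothetical, and it assumes the "parameter" meant is Encode's FB_CROAK check value, which makes decode die on malformed input.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: return the first candidate charset under which
# $bytes decodes without error, or undef if none of them fits.
sub first_valid_charset {
    my ($bytes, @candidates) = @_;
    for my $enc (@candidates) {
        # Work on a copy: decode() may modify its input when a CHECK value is given.
        my $copy = $bytes;
        # FB_CROAK makes decode() die on the first malformed byte sequence.
        my $ok = eval { decode($enc, $copy, Encode::FB_CROAK); 1 };
        return $enc if $ok;
    }
    return;
}

# Usage: the GB2312 bytes for a two-character Chinese word are not valid UTF-8,
# so the UTF-8 attempt fails and the second candidate is tried.
my $charset = first_valid_charset("\xD6\xD0\xCE\xC4", qw( UTF-8 GB2312 ));
```

The order of candidates matters, because many byte strings are valid in more than one encoding. Latin1 in particular accepts every byte sequence, so it only makes sense as the last entry.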

Wednesday, November 26, 2008

Re: [PerlChina] How can I use a regex to tell whether a variable's content is utf8 or gb2312?
