Perl China Blog Spot: [PerlChina] Re: A question about multi-byte (UTF) characters

Monday, May 25, 2009

[PerlChina] Re: A question about multi-byte (UTF) characters

On 5/25/09, wqphoenix <wqphoenix@gmail.com> wrote:

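I'd like to ask a question: I'm using Perl to serve some data and found that it contains multi-byte characters. How can I detect and remove these wide characters?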

Sorry, I can't type Chinese in my Opera :)

The most naive way might be:

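    use Encode 'decode';
    # $charset must name the encoding your data is actually in;
    # 'UTF-8' here is only an assumed placeholder.
    my $charset = 'UTF-8';
    my $s = 'some multi-byte chars here...';
    # Decode the bytes into characters, then delete every non-ASCII run.
    ($s = decode($charset, $s)) =~ s/[^[:ascii:]]+//g;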

Here's a realistic example:

    use Encode 'decode';
    my $s = '你好abc么';
    ($s = decode('utf8', $s)) =~ s/[^[:ascii:]]+//g;
    print $s, "\n";

And you'll get "abc" if your .pl file is saved as UTF-8, since the string literal then holds UTF-8 bytes :)
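
The original question also asked how to detect such characters, not only how to remove them. A minimal sketch of the same decode-then-match idea used for detection follows; the helper name has_non_ascii is hypothetical, and the input is assumed to be raw bytes in a known encoding:

```perl
use strict;
use warnings;
use Encode 'decode';

# Hypothetical helper (not from the original post): returns 1 if the
# decoded string contains at least one non-ASCII character, else 0.
sub has_non_ascii {
    my ($bytes, $charset) = @_;
    my $chars = decode($charset, $bytes);   # bytes -> Perl characters
    return $chars =~ /[^[:ascii:]]/ ? 1 : 0;
}

# The bytes below are the UTF-8 encoding of "你好", followed by "abc".
print has_non_ascii("\xe4\xbd\xa0\xe5\xa5\xbdabc", 'UTF-8') ? "yes\n" : "no\n";
print has_non_ascii("abc", 'UTF-8') ? "yes\n" : "no\n";
```

Writing the bytes as \x escapes keeps the sketch independent of how the .pl file itself is saved.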

Well, you might have GBK data on your side though ;) In that case, pass 'gbk' to decode instead of 'utf8'.
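
For the GBK case, a sketch along the same lines might look like this. The helper name strip_non_ascii is made up for illustration, and the input is assumed to really be GBK-encoded bytes:

```perl
use strict;
use warnings;
use Encode 'decode';

# Hypothetical helper (not from the original post): decode the bytes using
# the given charset, then delete every run of non-ASCII characters.
sub strip_non_ascii {
    my ($bytes, $charset) = @_;
    (my $chars = decode($charset, $bytes)) =~ s/[^[:ascii:]]+//g;
    return $chars;
}

# The bytes below are the GBK encoding of "你好", followed by "abc".
my $gbk_bytes = "\xc4\xe3\xba\xc3abc";
print strip_non_ascii($gbk_bytes, 'gbk'), "\n";   # prints "abc"
```

The only change from the UTF-8 example is the charset name handed to decode; the substitution itself is the same.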

Ciao,
-agentzh

