Wednesday, June 2, 2010

Re: [PerlChina] Re: A question about XPath

Hi,

It seems your mail client interpreted my GB2312-encoded email as UTF-8.

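Please read `perldoc HTML::Element` more carefully. descendants() returns
the list of elements only in list context. In `$node->descendants->content_list`
it is called in scalar context, so it returns a count, and a plain number has
no content_list method. Call content_list on the node itself: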


my @nodes = $tree->findnodes('/html/body/table/tr');
for my $node (@nodes) {
    # $node is a 'tr' element
    my @tds = $node->content_list;

    print $tds[0]->as_trimmed_text, " ", $tds[2]->as_trimmed_text, "\n";
}
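
The context-dependent return value of descendants(), as described in the
documentation quoted below, can be checked with a small script. This is only
a sketch: it assumes HTML::Element (from the HTML-Tree distribution) is
installed, and the one-row table is made up for illustration.

```perl
use strict;
use warnings;
use HTML::Element;

# Hypothetical one-row table, built in memory for illustration only.
my $tr = HTML::Element->new_from_lol(
    ['tr', ['td', 'a'], ['td', 'b'], ['td', 'c']]
);

# List context: descendants() returns the element objects themselves.
my @cells = $tr->descendants;

# Scalar context: the same call returns only how many there are.
my $count = $tr->descendants;

# The invocant of a method call is evaluated in scalar context, so the
# chained form calls content_list on that number and dies.
my $ok = eval { $tr->descendants->content_list; 1 };

print "cells: ", scalar(@cells), ", count: $count, chained call ",
    ($ok ? "worked" : "died"), "\n";

$tr->delete;    # free the tree
```

The exact wording of the error depends on the Perl version, so the script
only reports whether the chained call died.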

As for the encoding problem, first find out which encoding you are getting and
which encoding you need, then use Encode::decode(), Encode::encode(), or
Encode::from_to() to convert.
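
As a rough sketch of that conversion, assuming the input turns out to be
GB2312 and the output should be UTF-8 (both are assumptions here, and the
sample bytes are made up), the three Encode functions fit together like this:

```perl
use strict;
use warnings;
use Encode qw(decode encode from_to);

# Illustrative input: the bytes of "你好" in GB2312 (an assumed sample).
my $gb_bytes = "\xC4\xE3\xBA\xC3";

# Step 1: decode the bytes you received into Perl's internal characters.
my $text = decode('GB2312', $gb_bytes);

# Step 2: encode those characters into the encoding you need.
my $utf8_bytes = encode('UTF-8', $text);

# from_to() does both steps at once, converting a byte string in place.
my $copy = $gb_bytes;
from_to($copy, 'GB2312', 'UTF-8');

printf "%d characters, %d UTF-8 bytes, from_to %s\n",
    length($text), length($utf8_bytes),
    ($copy eq $utf8_bytes ? "agrees" : "differs");
```

decode() and encode() are the better fit when you work with the text in
between; from_to() is handy when you only need to convert bytes from one
encoding to another.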

perlw01f wrote:
> Thanks for your comments
> It works!
> So I have another question,
> my $nodes = $tree->findnodes( '/html/body/table/tr]); ## parent node
> of aim nodes
> for my $node ( $nodes->get_nodelist() ){
> my @tds = $node->descendents->content_list; ## I want to
> reach my aim via method "descendents" of it's parent node
> print $tds[2]->as_trimmed_text, "\n";
> }
> However, it throws some error like:
> "Can't call method "content_list" without a package or object
> reference at C:\DOCUME~1\root\LOCALS~1\Temp\dirBB9.tmp\ptest3.pl line
> 29."
>
> The involved info from HTML::Element,
> "$h->descendants()
> In list context, returns the list of all $h's descendant elements,
> listed in pre-order (i.e., an element appears before its content-
> elements). Text segments DO NOT appear in the list. In scalar context,
> returns a count of all such elements."
> "$h->parent() or $h->parent($new_parent)
> Returns (optionally sets) the parent (aka "container") for this
> element. The parent should either be undef, or should be another
> element."
> two methods are similar, and returns are HTML::Element
>
> What's the matter about the error?
>
> By the way, please note my quote, it seems the texts are abnormal,
> such as "룬 ת ʲô 롣", in your reply? So I got nothing about Encode.
> Thanks anyway, and also thanks for your attention and patience.
>
> On June 3, 10:08 AM, Liu Yubao <yubao....@gmail.com> wrote:
>> Hi,
>>
>> Ҫ Ŀ ĵ :
>>
>> perldoc HTML::Element
>>
>> my @tds = $node->parent->content_list;
>> print $tds[2]->as_trimmed_text, "\n";
>>
>> ⣬ ޷ Encode ת ˣ ֱ ӵ Encode ת 룬 ǿ
>> HTML::TreeBuilder ģ ܷ Զ html ı Ϣ Զ ת 룬 ֮
>> ȸ õ ʲô 룬 ת ʲô 롣
>
