Re^3: TreeBuilder and encoding

by Khen1950fx (Canon)
on Jul 15, 2013 at 23:08 UTC

in reply to Re^2: TreeBuilder and encoding
in thread TreeBuilder and encoding

I think that you are working it a little too hard. There is no "utf-8", but there is ":utf8". I always use ":encoding(UTF-8)", just to be safe.
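To make the layer advice concrete, here is a minimal sketch of what ":encoding(UTF-8)" does on output. It uses only core Perl; the helper name bytes_via_layer and the sample string are made up for illustration. As far as I know, ":utf8" would produce the same bytes here, but ":encoding(UTF-8)" also validates the data, which is why it is the safer habit.

```perl
use strict;
use warnings;

# Hypothetical helper: write a string through a PerlIO layer into an
# in-memory buffer and return the raw bytes that were produced.
sub bytes_via_layer {
    my ($layer, $text) = @_;
    my $buffer = '';
    open my $fh, ">$layer", \$buffer
        or die "Cannot open in-memory handle: $!";
    print {$fh} $text;
    close $fh or die "Cannot close in-memory handle: $!";
    return $buffer;
}

my $text  = "caf\x{e9}";    # character string; U+00E9 is e-acute
my $bytes = bytes_via_layer(':encoding(UTF-8)', $text);

# Show each output byte in hex; U+00E9 is written as two bytes.
print join(' ', map { sprintf '%02X', ord } split //, $bytes), "\n";
```

The same layer string works with binmode on STDOUT, which is how it is used in this post.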

Here's what I did. If you use the new_from_url method, it calls LWP::UserAgent for you, so you don't have to fetch the page yourself.
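    #!/usr/bin/perl
    use strict;
    use warnings;
    use HTML::TreeBuilder 5 -weak;

    my $url = ' +et-sauvignon-napa-rutherford';

    # new_from_url fetches the page via LWP::UserAgent and parses it,
    # so no separate parse_content call is needed.
    my $tree = HTML::TreeBuilder->new_from_url( $url );

    # Find the first element whose itemprop attribute is "reviewBody".
    my $review_et = $tree->look_down('itemprop', 'reviewBody');

    # Encode output as UTF-8 before printing the review text.
    binmode STDOUT, ":encoding(UTF-8)";
    print $review_et->as_text;

    $tree->delete;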
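If you want to try the look_down part without hitting the network, something like the following should work. It is only a sketch: the HTML snippet is invented sample markup standing in for the real page, and it uses new_from_content, which takes markup rather than a URL.

```perl
use strict;
use warnings;
use HTML::TreeBuilder 5 -weak;

# Invented sample markup; the real page is not reproduced here.
my $html = <<'HTML';
<html><body>
<p itemprop="reviewBody">Dark fruit, firm tannins, caf&eacute; notes.</p>
</body></html>
HTML

# new_from_content parses a string of HTML instead of fetching a URL.
my $tree = HTML::TreeBuilder->new_from_content($html);

# look_down returns the first matching element, or nothing if none match.
my $review = $tree->look_down('itemprop', 'reviewBody');
die "No element with itemprop=reviewBody\n" unless $review;

# The text contains a non-ASCII character, so set the output layer first.
binmode STDOUT, ':encoding(UTF-8)';
print $review->as_text, "\n";
```

Checking the return value of look_down before calling as_text avoids a "Can't call method on undefined value" error when the page layout changes.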
