The cause was a version difference in LWP::Simple: whether get() returns decoded characters or undecoded bytes depends on the version.
LWP::Simple 5.810 does not use decoded_content(); it uses the content() method. LWP::Simple 5.835, on the other hand, uses decoded_content(), so it returns decoded characters.
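To make the difference between the two methods concrete, here is a minimal offline sketch. It assumes HTTP::Response (from libwww-perl / HTTP::Message) and Encode are installed; the response is built by hand rather than fetched, and the sample text and variable names are only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode);
use HTTP::Response;

# Hypothetical sample body: a character string containing non-ASCII text.
my $text  = "caf\x{e9} \x{65e5}\x{672c}\x{8a9e}";
my $bytes = encode('UTF-8', $text);    # what a server would send on the wire

# Build a response by hand (no network access needed).
my $res = HTTP::Response->new(200, 'OK');
$res->header('Content-Type' => 'text/html; charset=UTF-8');
$res->content($bytes);

my $raw     = $res->content;            # undecoded bytes
my $decoded = $res->decoded_content;    # characters, decoded per the charset

printf "content:         length=%d utf8 flag=%s\n",
    length($raw), utf8::is_utf8($raw) ? 'on' : 'off';
printf "decoded_content: length=%d utf8 flag=%s\n",
    length($decoded), utf8::is_utf8($decoded) ? 'on' : 'off';
```

The byte string from content() is longer than the character string from decoded_content(), and only the latter carries the utf8 flag, which is the same symptom the version difference produces with get().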
I decided to use LWP::UserAgent and decoded_content() explicitly, so that the script works in either environment. Below is a test script run with LWP::Simple 5.810:
#!/usr/bin/perl
use strict;
use warnings;
use lib './libwww-perl-5.818/lib';
use LWP::Simple;
use HTTP::Response;
use LWP::UserAgent;
my $url='http://www.youtube.com/user/tbsnewsi/videos?sort=dd&view=0&page=1';
sub test1 { ### with get
my($html);
$html=get($url);
print "perl=$]\n";
print "LWPsimple v=$LWP::Simple::VERSION\n";
print "HTTP::Response v=$HTTP::Response::VERSION\n";
print "url=$url\n";
my $message=(utf8::is_utf8($html)) ? "html is utf8 flaged" : "html is not utf8 flaged";
print "$message\n";
}
sub test2 { ### with LWP::UserAgent and an explicit request
my ($ua, $req, $res, $html);
$ua = LWP::UserAgent->new;
$ua->agent("test user agent");
$req = HTTP::Request->new(GET => $url);
$res = $ua->request($req);
if ($res->is_success) {
$html=$res->decoded_content;
} else {
print $res->status_line, "\n";
return undef;
}
print "perl=$]\n";
print "LWPsimple v=$LWP::Simple::VERSION\n";
print "HTTP::Response v=$HTTP::Response::VERSION\n";
print "url=$url\n";
my $message=(utf8::is_utf8($html)) ? "html is utf8 flaged" : "html is not utf8 flaged";
print "$message\n";
}
print join ',', @INC, "\n";
print "###with Simple,get#############\n";
test1();
print "###with ua,decoded content #############\n";
test2();
__DATA__
[tetsu]$/usr/home/tetsu/perl/videonews/Simple_Test% perl Simple_Test.pl
./libwww-perl-5.818/lib,/usr/local/lib/perl5/5.12.2/BSDPAN,/usr/local/lib/perl5/site_perl/5.12.2/mach,/usr/local/lib/perl5/site_perl/5.12.2,/usr/local/lib/perl5/5.12.2/mach,/usr/local/lib/perl5/5.12.2,.,
###with Simple,get#############
perl=5.012002
LWPsimple v=5.810
HTTP::Response v=5.818
url=http://www.youtube.com/user/tbsnewsi/videos?sort=dd&view=0&page=1
html is not utf8 flaged
###with ua,decoded content #############
perl=5.012002
LWPsimple v=5.810
HTTP::Response v=5.818
url=http://www.youtube.com/user/tbsnewsi/videos?sort=dd&view=0&page=1
html is utf8 flaged
Thanks a lot.