Rajeshk has asked for the wisdom of the Perl Monks concerning the following question:
Hi Monks,
I have a problem downloading HTML files with LWP::UserAgent.
The downloaded files contain junk characters.
Is there any way to download the file without the junk?
Note: I am on Windows. Download this page to see the junk characters:
'http://www.whitecase.com/attorneys/detail.aspx?attorney=1148';
Here are some sample junk characters:

Downloaded files Input -- Original Output
===========================================
 1. jury trial. For his -- jury trial. For his
 2. Börries Ahrens -- Börries Ahrens
 3. Aldejohann’s main -- Mr. Aldejohann’s
 4. University of MĂĽnster -- University of Münster
 5. the €625 million senior and €130 -- €625 million senior and €130
 6. acquisition of a properties’ -- acquisition of a properties’
 7. Westfield College – University -- Westfield College – University
 8. TelĂ©fonos -- Teléfonos
 9. (CelumĂłvil S -- (Celumóvil S
10. Dr. jur., 1990, with a dissertation on “Die Unabhängigkeit des genossenschaftlichen PrĂĽfungsverbandes” (“The Independence of the Cooperative Inspection Association”)
    -- Dr. jur., 1990, with a dissertation on "Die Unabhängigkeit des genossenschaftlichen Prüfungsverbandes" ("The Independence of the Cooperative Inspection Association")
Here is my try:
use LWP::UserAgent;

my $ua = new LWP::UserAgent;
$ua->proxy(['http'] => 'http://00.00.0.00:0000');
my $url = 'http://www.whitecase.com/attorneys/detail.aspx?attorney=1148';

# Create a request
my $req = HTTP::Request->new('GET' => $url);
$req->proxy_authorization_basic("xxxxx", "xxxxx");
my $res = $ua->request($req);

if ($res->is_success) {
    my $file_cnt = $res->content;
    print "$file_cnt";
    open WOUT, ">out.html" or die "Can't open File: out.html";
    print WOUT $file_cnt;
    close WOUT;
}
else {
    print "Download Error\n";
}
Thanks & Regards,
Rajesh.K
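The samples in the question look like UTF-8 bytes being displayed as a single-byte Windows code page rather than a corrupted download. Sample 4 ("MĂĽnster") is what you get when the UTF-8 bytes of "Münster" are read as cp1250. The following is a minimal sketch of that diagnosis using only the core Encode module. The helper names misread_utf8_as, repair_mojibake and save_utf8 are hypothetical, and cp1250 is an assumption inferred from the samples, not something the post states.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper: show what correctly encoded UTF-8 bytes look like
# when a viewer wrongly assumes a single-byte code page.
sub misread_utf8_as {
    my ($text, $codepage) = @_;
    my $bytes = encode('UTF-8', $text);    # bytes as sent by the server
    return decode($codepage, $bytes);      # characters a wrong viewer shows
}

# Hypothetical helper: undo the misreading by reversing both steps.
# Only works if the assumed code page is the one that caused the damage.
sub repair_mojibake {
    my ($garbled, $codepage) = @_;
    return decode('UTF-8', encode($codepage, $garbled));
}

# Hypothetical helper: write character data with an explicit encoding
# layer so the bytes on disk are unambiguous UTF-8.
sub save_utf8 {
    my ($path, $text) = @_;
    open my $out, '>:encoding(UTF-8)', $path
        or die "Can't open $path: $!";
    print {$out} $text;
    close $out or die "Can't close $path: $!";
}

my $original = "M\x{FC}nster";                           # "Münster"
my $garbled  = misread_utf8_as($original, 'cp1250');     # assumed code page
my $repaired = repair_mojibake($garbled, 'cp1250');

binmode STDOUT, ':encoding(UTF-8)';
print "garbled:  $garbled\n";
print "repaired: $repaired\n";
```

If this diagnosis holds, the file written by the script in the question may already contain valid UTF-8, and opening it in an editor or browser set to UTF-8 should show the text correctly. In the download script itself, one option is to take the character string from $res->decoded_content (instead of the raw bytes from $res->content) and write it through an explicit ':encoding(UTF-8)' layer, as save_utf8 does above.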
Replies are listed 'Best First'.
Re: How to Remove Junk Characters
by abcde (Scribe) on Jan 05, 2006 at 13:33 UTC
by Rajeshk (Scribe) on Jan 06, 2006 at 05:47 UTC
by wfsp (Abbot) on Jan 06, 2006 at 19:56 UTC
Re: How to Remove Junk Characters
by wfsp (Abbot) on Jan 05, 2006 at 10:18 UTC
Re: How to Remove Junk Characters
by zentara (Archbishop) on Jan 05, 2006 at 12:31 UTC
by wfsp (Abbot) on Jan 05, 2006 at 13:51 UTC
by holli (Abbot) on Jan 05, 2006 at 17:58 UTC