PerlMonks  

download pdf from url which has unicode characters

by srikrishnan (Beadle)
on Jun 11, 2015 at 11:22 UTC

srikrishnan has asked for the wisdom of the Perl Monks concerning the following question:

Hi all,

I want to download a PDF file from a web address that contains Unicode characters, e.g. "www.tamildeal.com/home/upload/pdf/வைத்திய%20அனுகூல%20ஜீவரட்சணி.pdf". LWP::Simple does not support such web addresses. Is there any way to do this?

Thanks in advance, Srikrishnan


Replies are listed 'Best First'.
Re: download pdf from url which has unicode characters
by ikegami (Patriarch) on Jun 11, 2015 at 14:32 UTC
    Encode the characters using UTF-8 and url-encode the resulting bytes. URI::Escape's uri_escape_utf8 does this.

      Thanks for your response.

      Your suggestion works about 90% of the way, but some unwanted bytes remain that make the URL dead.
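
One possible source of the unwanted bytes, offered as a guess because the failing code is not shown in the thread: by default uri_escape_utf8 escapes every character outside the unreserved set, so applying it to the whole URL also escapes "/", ":" and the "%" of the existing %20 (which becomes %2520). A minimal sketch that avoids this by escaping each path segment separately follows. The helper name encode_url_path is hypothetical, and the sketch assumes the URL is a decoded character string with no query string or fragment.

```perl
use strict;
use warnings;
use Encode qw(encode);
use URI::Escape qw(uri_escape uri_unescape);

# Hypothetical helper: percent-encode only the path segments of a URL,
# leaving the optional scheme, the host, and the "/" separators untouched.
# Assumes $url is a character string (not raw UTF-8 bytes) and has no
# query string or fragment.
sub encode_url_path {
    my ($url) = @_;
    my ($prefix, $path) = $url =~ m{\A((?:https?://)?[^/]+)(/.*)?\z}s
        or die "Cannot parse URL: $url";
    $path = '' unless defined $path;

    my @segments = map {
        # Characters -> UTF-8 bytes, then undo existing escapes such as %20
        # so that "%" is not escaped a second time.
        my $bytes = uri_unescape(encode('UTF-8', $_));
        uri_escape($bytes);
    } split m{/}, $path, -1;

    return $prefix . join('/', @segments);
}

# Usage (example.com is a placeholder host):
#   my $encoded = encode_url_path('http://example.com/some dir/file%20name.pdf');
```

Unescaping before escaping makes the helper safe to run on URLs that are already partly encoded, which matches the mixed form of the address in the question.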

Re: download pdf from url which has unicode characters
by Corion (Patriarch) on Jun 11, 2015 at 11:28 UTC

    Find out how the URL gets encoded and then use the encoded URL. Most likely, the characters get percent-encoded ( / becomes %2F, etc.). Maybe you can write a Perl script using URI::Encode to do the proper encoding of the URL for you.
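
To make the percent-encoding step concrete, here is a minimal sketch using only core modules. It is an illustration of the general scheme (UTF-8 bytes, each written as %XX), not the code of any module named in this thread; the helper name percent_encode is hypothetical.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: percent-encode every byte of $text that is not an
# RFC 3986 "unreserved" character (letters, digits, "-", ".", "_", "~").
sub percent_encode {
    my ($text) = @_;
    my $bytes = encode('UTF-8', $text);    # characters -> UTF-8 bytes
    $bytes =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge;
    return $bytes;
}

print percent_encode('/'), "\n";           # %2F
print percent_encode("\x{0BB5}"), "\n";    # %E0%AE%B5 (Tamil letter VA)
```

Because "/" is itself escaped here, a helper like this is meant for a single path segment or file name rather than a complete URL.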

Re: download pdf from url which has unicode characters
by vinoth.ree (Monsignor) on Jun 11, 2015 at 12:40 UTC
    Hi srikrishnan

    Use this URL to fetch the PDF:

    http://www.tamildeal.com/home/upload/pdf/%E0%AE%B5%E0%AF%88%E0%AE%A4%E0%AF%8D%E0%AE%A4%E0%AE%BF%E0%AE%AF%20%E0%AE%85%E0%AE%A9%E0%AF%81%E0%AE%95%E0%AF%82%E0%AE%B2%20%E0%AE%9C%E0%AF%80%E0%AE%B5%E0%AE%B0%E0%AE%9F%E0%AF%8D%E0%AE%9A%E0%AE%A3%E0%AE%BF.pdf

    I just pasted your URL into Notepad, got this encoded URL, and was able to access your PDF. Try it with your code and let us know.


    All is well. I learn by answering your questions...

      Thanks for your reply.

      Your URL works well. Can you explain the process? How did you produce this encoded URL? I have many links that I need to download.
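
For the many-links case, the download step could look roughly like the sketch below. It assumes the URLs have already been percent-encoded, like the one posted above, and it uses LWP::Simple's getstore, the module mentioned in the question. The helper names local_name and download_all and the numbered output file names are hypothetical choices for illustration.

```perl
use strict;
use warnings;
use LWP::Simple;    # exports getstore and is_success by default

# Hypothetical helper: choose a local file name for the Nth download.
sub local_name {
    my ($index) = @_;
    return sprintf('download_%03d.pdf', $index);
}

# Fetch each already percent-encoded URL and return the URLs that failed.
# $fetch may be replaced for testing; it defaults to LWP::Simple's getstore,
# which saves the response body to a file and returns the HTTP status code.
sub download_all {
    my ($urls, $fetch) = @_;
    $fetch ||= \&getstore;
    my @failed;
    my $n = 0;
    for my $url (@$urls) {
        my $status = $fetch->($url, local_name(++$n));
        push @failed, $url unless is_success($status);
    }
    return @failed;
}

# Usage:
#   my @failed = download_all(\@encoded_urls);
#   warn "Failed: $_\n" for @failed;
```

Collecting the failures instead of dying on the first one lets a long batch finish and shows which links still need attention.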
