URL white space issues

by upaksh (Novice)
on Sep 28, 2012 at 12:35 UTC (#996199)
upaksh has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I have a problem with some links getting white space added automatically, e.g. because of a %20 or a literal white space character the file doesn't get called. Is there any Perl module that can clean the URL? Thanks...

Replies are listed 'Best First'.
Re: URL white space issues
by MidLifeXis (Monsignor) on Sep 28, 2012 at 12:51 UTC

    Is this a Garbage In, Garbage Out issue? Those URLs are valid, properly formed URLs; they just do not point to anything on the server. What is creating the original URLs with the %20 embedded?


Re: URL white space issues
by choroba (Bishop) on Sep 28, 2012 at 12:42 UTC
    If you want to remove all the spaces, you need no module. Just use substitution:
    $url =~ s/%20//g;
    If you want to remove only certain spaces, please specify how to recognise them.
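    If only the stray spaces at the ends of the URL are the problem, a narrower substitution may be safer than deleting every %20. A minimal sketch, assuming the unwanted spaces are leading or trailing ones; the helper name trim_url and the example.com URL are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: strip literal whitespace and %20 sequences
# from both ends of a URL, leaving any in the middle untouched.
sub trim_url {
    my ($url) = @_;
    $url =~ s/\A(?:\s|%20)+//;   # leading whitespace or %20
    $url =~ s/(?:\s|%20)+\z//;   # trailing whitespace or %20
    return $url;
}

print trim_url(" %20http://example.com/sample%20file.html%20 "), "\n";
# prints http://example.com/sample%20file.html
```

    The %20 inside the path is kept on purpose, since it may be a legitimate part of the file name.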
    لսႽ ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
Re: URL white space issues
by sierpinski (Chaplain) on Sep 28, 2012 at 14:32 UTC
    "I have a problem with some links getting white spaces automatically"

    How does the "automatically" part occur? The whitespace fairy doesn't pay you a visit... so when are these whitespaces being introduced? Usually that occurs when a file has a space in the filename, and a browser intelligently converts that space to a %20 to avoid a global meltdown. I think rather than converting the spaces, you should avoid them in the first place by fixing the source filenames (or what creates them if it's another process.)

    In other words, if you have a file called "sample file.html", and you want to access it via a browser, you will have to use "" to reference it. Instead of trying to fix the URL, remove the space in the file name, perhaps "sample_file.html", so your URL will not have any %20 characters present. Would that solve your problem?
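    To see where the %20 comes from, here is a small sketch using URI::Escape. The file name is the one from the example above; replacing the space with an underscore is just one possible naming choice, not the only fix:

```perl
use strict;
use warnings;
use URI::Escape qw(uri_escape);

my $name = "sample file.html";

# Percent-encoding the name, roughly what a browser does with the space.
my $escaped = uri_escape($name);
print "$escaped\n";               # sample%20file.html

# The suggested fix: take the space out of the name itself.
(my $clean = $name) =~ tr/ /_/;
print "$clean\n";                 # sample_file.html
```

    Once the file is stored under the cleaned name, the URL needs no escaping at all.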
      Yes ... the whitespace fairy is a BUG coming from somewhere, and you need to trace that BUG to its true source, not just regex your way past the surface symptom.
Re: URL white space issues
by 2teez (Vicar) on Sep 28, 2012 at 15:32 UTC

    There are two modules you might like to check.

    Using URI::Escape, you can get back your original URL, like so:
    use warnings;
    use strict;
    use URI::Escape;

    my $url = " due + to %20";
    my $uri = uri_unescape($url);
    print $uri; # dex.cgi?action=vi ew due + to
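    For a self-contained illustration of the same call, here is a sketch with a made-up URL (example.com and the file name are assumptions, not the poster's actual link). Note that uri_unescape only decodes %XX sequences and leaves a literal "+" alone:

```perl
use strict;
use warnings;
use URI::Escape qw(uri_unescape);

# Hypothetical URL, for illustration only.
my $url     = "http://example.com/sample%20file.html";
my $decoded = uri_unescape($url);
print "$decoded\n";               # http://example.com/sample file.html

# A "+" is not treated as a space by uri_unescape.
my $plus = uri_unescape("a+b%20c");
print "$plus\n";                  # a+b c
```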
    But really, I would advise you to seriously consider sierpinski's wisdom on this issue.
    Hope this helps

    If you tell me, I'll forget.
    If you show me, I'll remember.
    if you involve me, I'll understand.
    --- Author unknown to me
Re: URL white space issues
by prashantktyagi (Scribe) on Sep 28, 2012 at 12:40 UTC
    Please explain problem with code examples, so that we can get your context.

Node Type: perlquestion [id://996199]
Approved by NetWallah