URL white space issues

by upaksh (Novice)
on Sep 28, 2012 at 12:35 UTC (#996199)
upaksh has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I have a problem with some links getting white space added automatically. For example, because of a %20 or a whitespace character in the URL, the file doesn't get called. Is there any Perl module that can clean the URL? Thanks...

Re: URL white space issues
by MidLifeXis (Monsignor) on Sep 28, 2012 at 12:51 UTC

    Is this a Garbage In, Garbage Out issue? Those URLs are valid, properly formed URLs; they just do not point to anything on the server. What is creating the original URLs with the %20 embedded?


Re: URL white space issues
by choroba (Chancellor) on Sep 28, 2012 at 12:42 UTC
    If you want to remove all the spaces, you need no module. Just use substitution:
    $url =~ s/%20//g;
    If you want to remove only certain spaces, please specify how to recognise them.
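
    A slightly fuller sketch of the substitution approach, assuming the stray spaces can appear either as %20 or as literal whitespace. The strip_spaces helper and the sample URL are hypothetical, made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: delete every encoded (%20) or literal whitespace
# character from a URL string. It removes spaces; it does not decode them.
sub strip_spaces {
    my ($url) = @_;
    $url =~ s/%20|\s//g;
    return $url;
}

# The URL below is an invented example.
print strip_spaces('http://example.com/sample%20 file.html'), "\n";
# prints http://example.com/samplefile.html
```

    Note that this only helps if the spaces were never meant to be part of the name; if the file on the server really contains a space, stripping it produces a URL for a different resource.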
Re: URL white space issues
by sierpinski (Hermit) on Sep 28, 2012 at 14:32 UTC
    "I have a problem with some links getting white spaces automatically"

    How does the "automatically" part occur? The whitespace fairy doesn't pay you a visit... so when are these whitespaces being introduced? Usually that occurs when a file has a space in the filename, and a browser intelligently converts that space to a %20 to avoid a global meltdown. I think rather than converting the spaces, you should avoid them in the first place by fixing the source filenames (or what creates them if it's another process.)

    In other words, if you have a file called "sample file.html" and you want to access it via a browser, you will have to use "" to reference it. Instead of trying to fix the URL, remove the space from the file name, perhaps renaming it to "sample_file.html", so your URL will not contain any %20 sequences. Would that solve your problem?
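
    The renaming suggestion could be sketched roughly as follows, assuming the files are under your control. The safe_name helper is hypothetical, and the actual rename is left commented out so the sketch has no side effects:

```perl
use strict;
use warnings;

# Hypothetical helper: replace each run of whitespace in a file name
# with a single underscore.
sub safe_name {
    my ($name) = @_;
    (my $safe = $name) =~ s/\s+/_/g;
    return $safe;
}

my $name = 'sample file.html';
my $safe = safe_name($name);
print "$name -> $safe\n";    # sample file.html -> sample_file.html

# To rename the file on disk, something like this would do it:
# rename $name, $safe or die "Cannot rename '$name': $!";
```

    Any links that already point at the old name would need updating as well.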
      Yes ... the whitespace fairy is a BUG coming from somewhere, and you need to trace that BUG to its true source, not just regex your way past the surface symptom.
Re: URL white space issues
by 2teez (Priest) on Sep 28, 2012 at 15:32 UTC

    There are two modules you might like to check.

    Using URI::Escape, you can get back your original URL, like so:
    use warnings;
    use strict;
    use URI::Escape;

    my $url = " due + to %20";
    my $uri = uri_unescape($url);
    print $uri;    # dex.cgi?action=vi ew due + to
    But really, I would advise you to seriously consider sierpinski's wisdom on this issue.
    Hope this helps
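
    For reference, a minimal round trip with URI::Escape, assuming the URI distribution is installed (it is not part of core Perl). The file name is a hypothetical example:

```perl
use strict;
use warnings;
use URI::Escape qw(uri_escape uri_unescape);

my $file    = 'sample file.html';        # hypothetical file name
my $encoded = uri_escape($file);         # space becomes %20
my $decoded = uri_unescape($encoded);    # %20 becomes a space again

print "$encoded\n";    # sample%20file.html
print "$decoded\n";    # sample file.html
```

    Decoding turns %20 back into a literal space rather than removing it, so this is only useful when the server-side name really does contain the space.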

Re: URL white space issues
by prashantktyagi (Scribe) on Sep 28, 2012 at 12:40 UTC
    Please explain the problem with code examples, so that we can understand your context.
