http://www.perlmonks.org?node_id=89750

Buckaroo Buddha has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to replace the URL escape codes (the %XX sequences) in the GET string.

As you may well know, special characters
get turned into their hexadecimal codes,
e.g.: , would become %2C
      % would become %25

and so on.

I need to make sure I do it for every occurrence of a percent sign.

Can anyone help me write the regex for that?

Replies are listed 'Best First'.
Re: Regex to cange HTML %?? to char(0x??);
by bikeNomad (Priest) on Jun 19, 2001 at 21:52 UTC
    If you use CGI, it will decode the parameters for you.
Re: Regex to cange HTML %?? to char(0x??);
by myocom (Deacon) on Jun 19, 2001 at 21:59 UTC
Re: Regex to cange HTML %?? to char(0x??);
by dimmesdale (Friar) on Jun 19, 2001 at 21:54 UTC
    Well, here's a solution, but there's a module that does this--I just can't think of it now. Give CPAN a good look, and someone will probably respond with it. Here's the solution:
    $str =~ s/%([0-9A-Fa-f][0-9A-Fa-f])/chr(hex($1))/eg;
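
A minimal sketch of how that substitution might be wrapped up for a GET string, using only core Perl. The name url_decode is a hypothetical helper, not something from a module, and the '+' handling is an addition on the assumption that the data came from a form submission, where a space is sent as '+'.

```perl
use strict;
use warnings;

# url_decode is a hypothetical helper name, not part of any module.
sub url_decode {
    my ($str) = @_;
    $str =~ tr/+/ /;                               # form encoding sends a space as '+'
    $str =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;   # %XX -> the character with that hex code
    return $str;
}

print url_decode('a%2Cb+c%25'), "\n";   # prints "a,b c%"
```

The tr/// runs before the s/// on purpose: a literal plus sign arrives as %2B, and decoding it first would turn it into a space by mistake.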

      How about these?

      # The above shrunk a bit.
      $str =~ s/%([0-9A-F][0-9A-F])/chr(hex($1))/ieg;

      # - or more cryptic
      $str =~ s/%([0-9A-F][0-9A-F])/sprintf("%1c",hex($1))/ieg;

      Can anyone think of something that doesn't use hex?


      TGI says moo

      Update

      Bill's post and Abigail's clever method taken together let us chop off another character or so:

      s/%([0-9A-F]{2})/"chr 0x$1"/igee;

      TGI - Syncretic Cretin

        Can anyone think of something that doesn't use hex?
        s/%([0-9A-F][0-9A-F])/"chr 0x$1"/ieeg;

        -- Abigail
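
For anyone puzzling over the doubled /e, here is a small sketch of what each evaluation pass does. The variable names are made up for illustration.

```perl
use strict;
use warnings;

my $in = 'a%2Cb';

# One /e: the replacement is evaluated once, which only interpolates $1
# into the string, leaving the text of an expression behind.
(my $one_eval = $in) =~ s/%([0-9A-F]{2})/"chr 0x$1"/ieg;

# Two /e's: that resulting string is evaluated again as Perl code,
# so chr 0x2C actually runs.
(my $two_eval = $in) =~ s/%([0-9A-F]{2})/"chr 0x$1"/ieeg;

print "$one_eval\n";   # prints "achr 0x2Cb"
print "$two_eval\n";   # prints "a,b"
```

The second pass is a string eval, so it is only safe here because the pattern lets nothing but two hex digits into $1.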

Re: Regex to cange HTML %?? to char(0x??);
by TGI (Parson) on Jun 19, 2001 at 23:09 UTC

    All your escaping and unescaping is handled by CGI. Try this (untested) code.

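    use strict;
    use warnings;
    use CGI;

    my $q = CGI->new;

    # Get a list of parameter names.
    my @params = $q->param();

    # Start HTML output.
    print $q->header(), $q->start_html('FOO');

    # Print a table of params and values.
    # (In a real script the EHTML terminators must start at column 0.)
    print "<TABLE>";
    print <<EHTML;
    <TR> <TH>Parameter</TH> <TH>Value</TH> </TR>
    EHTML

    foreach my $name (@params) {
        # Escape both fields so user input cannot inject markup.
        my $label = $q->escapeHTML($name);
        my $value = $q->escapeHTML( scalar $q->param($name) );
        print <<EHTML;
    <TR> <TD>$label</TD> <TD>$value</TD> </TR>
    EHTML
    }

    print "</TABLE>", $q->end_html();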

    Note that I use here docs for most of the HTML generation. While CGI.pm can do HTML generation, I prefer to make minimal use of that capability. I always use its parameter handling, which is great.


    TGI says moo

Re: Regex to cange HTML %?? to char(0x??);
by Anonymous Monk on Jun 20, 2001 at 00:18 UTC
    As they already said, there's a Module to do this ... or several (TMTOWTDI) ... but there's also a really cute expression that's worth grokking-in-fullness, even if you should use the module instead, since it's a lovely example of s///g and s///e together.

    From Effective Perl Programming (Hall with Schwartz, 0-201-41975-0 not linked to Fatbrain, please support your local meatspace bookstore),

    $_ = "a%5eb";
    s/%([0-9a-fA-F]{2})/pack("c",hex($1))/ge;
    which also says only one paragraph later
    use URI::Escape;
    $_ = uri_unescape "a%5eb";
    Both will result in $_ eq "a^b".

    And yes, it will handle "9%25EA" correctly, transforming it to "9%EA" and not eating the output %. See s///g. (What's 9%EA? Maybe a fair share for an 11-way split, less crumbs.)
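
A quick sketch of that point, with made-up variable names: s///g carries on from where the last match ended and never rescans text it has just substituted in, so the % produced from %25 is left alone. Decoding a second time is what would eat it.

```perl
use strict;
use warnings;

my $in = '9%25EA';

# One pass: %25 becomes %, and the scan resumes after it, at "EA".
(my $once = $in) =~ s/%([0-9a-fA-F]{2})/chr(hex($1))/ge;

# A second, separate pass now sees "%EA" and decodes that as well.
(my $twice = $once) =~ s/%([0-9a-fA-F]{2})/chr(hex($1))/ge;

print "$once\n";             # prints "9%EA"
print length($twice), "\n";  # prints 2: "9" followed by chr(0xEA)
```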

    -- Bill / n1vux