Re: Re: PayPal Advice Sought

by epoptai (Curate)
on Jul 07, 2001 at 03:37 UTC (#94653)

in reply to Re: PayPal Advice Sought
in thread PayPal Advice Sought

I have a question about that regex. A look at sub unescape in CGI reveals a regex that's nearly identical to the one in question. The first difference is trivial: CGI uses a {2} quantifier where the other one repeats the character class. What I'm curious about is how significant it is that the CGI regex uses a signed pack (c), in contrast to the unsigned pack (C) in the other one?
$value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;   # cargo
$todecode =~ s/%([0-9a-fA-F]{2})/pack("c",hex($1))/ge;         # CGI
For reference's sake here's sub unescape from version 2.46:
# unescape URL-encoded data
sub unescape {
    shift() if ref($_[0]);
    my $todecode = shift;
    return undef unless defined($todecode);
    $todecode =~ tr/+/ /;       # pluses become spaces
    $todecode =~ s/%([0-9a-fA-F]{2})/pack("c",hex($1))/ge;
    return $todecode;
}
thanks - epoptai


pack 'c',... -vs.- pack 'C',... was: Re: Re: Re: PayPal Advice Sought
by ariels (Curate) on Jul 08, 2001 at 01:07 UTC
    epoptai's right -- there's no difference between pack 'c',$number and pack 'C',$number, ever. There is a difference when unpacking, of course.

    What the templates 'c' and 'C' do when packing is translate an integer to the corresponding character value. Your character values are most likely single bytes. Each byte corresponds to a residue class of integers modulo 256. In particular, the two most popular ways to assign representative integer values to the 256 bytes are 0..255 ("unsigned char") and -128..127 ("signed char").

    But of course any integer is congruent (modulo 256) to exactly one value in either range, so any integer has a unique translation to a byte, whichever of the 2 ranges you pick. The reverse direction (unpack) is not single-valued: for instance, unpack 'C',(pack 'C',-1) == 255, while unpack 'c',(pack 'C',-1) == -1. Here unpack has to choose a specific range from which to pick an integer representing the byte value, and the two letter codes make a difference.
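
As a rough check of the claim above, the following sketch packs every value in 0..255 with both templates and counts disagreements, then unpacks one byte both ways. The variable names are illustrative only, and the 'pack' warning category is switched off because newer perls warn when a value above 127 is wrapped by 'c'.

```perl
use strict;
use warnings;
no warnings 'pack';   # pack 'c' warns on newer perls when a value above 127 wraps

# Count the values in 0..255 for which the two templates produce different bytes.
my $mismatches = grep { pack('c', $_) ne pack('C', $_) } 0 .. 255;

# Unpack the same single byte with each template.
my $byte     = pack 'C', 255;
my $signed   = unpack 'c', $byte;
my $unsigned = unpack 'C', $byte;

print "mismatches: $mismatches\n";
print "unpack 'c': $signed, unpack 'C': $unsigned\n";
```

If the reasoning holds, the mismatch count is zero while the two unpacked values differ by 256.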

    The same thing occurs for the other signed/unsigned letters for integer conversions in pack/unpack.
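
A small sketch of the same effect for one other pair, the 16-bit templates 's' and 'S'. The values are chosen so that each is in range for its own template; the variable names are made up for illustration.

```perl
use strict;
use warnings;

# -1 as a signed 16-bit value and 65535 as an unsigned one share a bit pattern.
my $from_signed   = pack 's', -1;
my $from_unsigned = pack 'S', 65535;
my $same_bytes    = $from_signed eq $from_unsigned ? 1 : 0;

# Unpacking is where the template letter decides which integer you get back.
my $as_signed   = unpack 's', $from_unsigned;
my $as_unsigned = unpack 'S', $from_signed;

print "same bytes: $same_bytes\n";
print "unpack 's': $as_signed, unpack 'S': $as_unsigned\n";
```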

Re: Re: Re: PayPal Advice Sought
by MeowChow (Vicar) on Jul 08, 2001 at 01:11 UTC
    My version (2.752) of CGI's unescape uses chr instead of pack, by the way:
    $todecode =~ s/%([0-9a-fA-F]{2})/chr hex($1)/ge;
    I wonder if it's faster...
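
One way to find out would be the core Benchmark module. This is only a sketch: the sample string and the two helper subs (decode_pack, decode_chr) are made up for the comparison and are not taken from CGI.pm, and timings will vary with perl version and input.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical sample input: a short URL-encoded string repeated to give the regex some work.
my $encoded = 'name%3DJoe%20Bloggs%26item%3Dwidget' x 50;

# Decode with pack, as in the older unescape.
sub decode_pack {
    my $s = shift;
    $s =~ s/%([0-9a-fA-F]{2})/pack("C", hex($1))/ge;
    return $s;
}

# Decode with chr, as in the newer unescape.
sub decode_chr {
    my $s = shift;
    $s =~ s/%([0-9a-fA-F]{2})/chr hex($1)/ge;
    return $s;
}

# Sanity check before timing: both decoders must agree on the sample.
die "decoders disagree\n" unless decode_pack($encoded) eq decode_chr($encoded);

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    'pack' => sub { decode_pack($encoded) },
    'chr'  => sub { decode_chr($encoded) },
});
```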
      I found differences regarding UTF-8 strings, so maybe it was changed because using pack broke when characters were longer than one byte. That shouldn't affect your specific example, because two hex digits always decode to a value that fits in one byte, but maybe he got rid of pack altogether throughout his code.
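
This sketch does not reproduce the differences mentioned above. It only illustrates the one-byte-at-a-time point: each %XX escape yields a single byte, so a character that UTF-8 encodes in two bytes arrives as two escapes and needs a separate decoding step. The sample string is made up, and using the core Encode module for that step is an assumption, not something CGI.pm is claimed to do.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical input: "caf" followed by e-acute, URL-encoded as its two UTF-8 bytes.
my $encoded = 'caf%C3%A9';

# Each %XX escape becomes exactly one byte.
(my $bytes = $encoded) =~ s/%([0-9a-fA-F]{2})/chr hex($1)/ge;
my $byte_length = length $bytes;

# Turning the bytes into characters is a separate step.
my $chars       = decode('UTF-8', $bytes);
my $char_length = length $chars;

print "bytes: $byte_length, characters: $char_length\n";
```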

