PerlMonks  

CGI param character strangeness

by wardk (Deacon)
on Feb 26, 2002 at 16:35 UTC (#147630=perlquestion)
wardk has asked for the wisdom of the Perl Monks concerning the following question:

Fellow Monks, I have an interesting situation with a CGI parameter. I am somewhat convinced it's a character set issue, but I'm not sure what the resolution is. It appears that param is having trouble decoding the parameter.

I can see in the raw query_string that the encoded string is coming in OK, but when I yank that parameter out via CGI->param it's getting hosed. I'd rather not parse the query_string myself if I don't have to.

Any thoughts on this? I am on an AIX box using WebSphere.

given this URL parameter:

/cgi-bin/sidtest?sid=%32%2D%9F%A5%54%12%23%AA%C8%C7%CF%CE%8E%AC
and this code:
#!/ots/perl/bin/perl
$|++;
use strict;
use CGI;

my $q = CGI->new;
my $sid = $q->param("sid");

print "Content-type: text/plain\n\n";
print "\nsid is: $sid\n";
print "\nsid escaped is: " . $q->escape($sid);
print "\nsid unescaped is: " . $q->unescape($sid);
print "\nENV{QUERY_STRING} is: " . $ENV{QUERY_STRING};
exit;
I am getting this:
sid is: 2-T#Ύ
sid escaped is: 2-%9F%A5T%12%23%AA%C8%C7%CF%CE%8E%AC
sid unescaped is: 2-T#Ύ
ENV{QUERY_STRING} is: sid=%32%2D%9F%A5%54%12%23%AA%C8%C7%CF%CE%8E%AC
note that if I execute from the command line, I am seeing similar behavior:
gen31$ ./sidtest sid=%32%2D%9F%A5%54%12%23%AA%C8%C7%CF%CE%8E%AC
Content-type: text/plain

sid is: 2-T#ά
sid escaped is: 2-%9F%A5T%12%23%AA%C8%C7%CF%CE%8E%AC
sid unescaped is: 2-T#ά
ENV{QUERY_STRING} is:
Any thoughts/clues are most welcome, especially if I am just being a dumbass and not seeing the obvious.

Re: CGI param character strangeness
by count0 (Friar) on Feb 26, 2002 at 17:01 UTC
    The way those characters are displayed depends entirely upon the terminal settings / locale / charset, etc.

    I don't see what's hosed about it though. Many of the hex values in the query string do not correspond to "normal" printable ASCII characters, so you should expect some odd output when printing the value to the terminal, or when viewing a file to which it has been written. But while those values are still in your variable, after yanking them with param, they should be fine. This is just a display issue.
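One way to check this claim is to look at the byte values rather than at whatever the terminal makes of them. The sketch below is not the code from the question: `percent_decode` is a hypothetical stand-in for the decoding that param does, written with a plain regex so that it runs without CGI.pm installed.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of CGI.pm): turn each %XX escape
# into the single byte with that hex value.
sub percent_decode {
    my ($s) = @_;
    $s =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $s;
}

# The sid value from the URL in the question.
my $raw = '%32%2D%9F%A5%54%12%23%AA%C8%C7%CF%CE%8E%AC';
my $sid = percent_decode($raw);

# Print every byte as two hex digits instead of printing it raw,
# so the terminal's charset cannot distort what we see.
my $hex = join ' ', map { sprintf '%02X', ord($_) } split //, $sid;

print "length: ", length($sid), "\n";
print "bytes:  $hex\n";
```

If the hex dump matches the escapes in the query string one for one, the variable holds the right bytes and only the display is misleading.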

    If you're writing these to a file, perhaps it might be better to pack() the data, and write() it to the file, and later read() it when it's needed. Or even simply using binmode would suffice. (This keeps print from inserting any extra or different junk - normal text file laziness promotion ;)
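The binmode suggestion can be tried with a small round trip. This is a minimal sketch, assuming the session id is kept as a raw byte string and that a temporary file is an acceptable stand-in for wherever the value would really be stored. It uses print on a binmode handle to write the bytes.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# The 14 bytes that the sid from the question decodes to.
my $sid = join '', map { chr(hex($_)) }
    qw(32 2D 9F A5 54 12 23 AA C8 C7 CF CE 8E AC);

# Write the bytes untouched: binmode disables any newline or
# encoding translation on the handle.
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;
print {$out} $sid;
close $out or die "close failed: $!";

# Read them back the same way and compare.
open my $in, '<', $path or die "cannot open $path: $!";
binmode $in;
my $read_back = do { local $/; <$in> };
close $in;

print $read_back eq $sid ? "round trip ok\n" : "bytes changed\n";
```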
Re: CGI param character strangeness
by wardk (Deacon) on Feb 26, 2002 at 17:09 UTC
    OK, it appears that I did miss something fairly obvious. This string (which is being passed from another system as a session id) was not URL-encoded before being delivered to my site. So param is decoding it properly and returning what is essentially junk, and if I pass that value anywhere else, the session id is mucked up. The calling system needs to URL-encode the string first, so that it looks like:
    sid=%2532%252D%259F%25A5%2554%2512%2523%25AF%25CE%25C7%25D6%25D0%2591%25AB
    which then (properly) produces from the same program above:
    sid is: %32%2D%9F%A5%54%12%23%AF%CE%C7%D6%D0%91%AB
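The effect of encoding on the sending side can be sketched as follows. Both helpers are hypothetical regex-based stand-ins, not the calling system's code and not CGI.pm's escape and unescape. The token is the sid string from the original question, treated here as literal text that must survive one round of decoding.

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration only.
sub percent_encode {
    my ($s) = @_;
    # Escape everything except unreserved URL characters; '%' becomes '%25'.
    $s =~ s/([^A-Za-z0-9\-_.~])/sprintf('%%%02X', ord($1))/ge;
    return $s;
}

sub percent_decode {
    my ($s) = @_;
    $s =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $s;
}

# The session id is itself a string full of percent signs.
my $token = '%32%2D%9F%A5%54%12%23%AA%C8%C7%CF%CE%8E%AC';

# Sender: encode before placing the token in the URL.
my $sent = percent_encode($token);

# Receiver: a single decode (what param does) restores the token.
my $received = percent_decode($sent);

print "sent:     sid=$sent\n";
print "received: $received\n";
```

Without the sender-side encode, the single decode on the receiving end turns the token into raw bytes, which is the junk seen in the original output.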
