http://www.perlmonks.org?node_id=788093

markjugg has asked for the wisdom of the Perl Monks concerning the following question:

I have a patch pending for CGI.pm that I'd like peer feedback on. The code change is dead simple; it's the conceptual change itself I'd like feedback on.

Currently, the query_string() method returns $ENV{QUERY_STRING} more or less how it received it, although it always processes and rebuilds the query string internally, so it's possible it's different. One change it will make is to convert the joining character to either "&" or ";", as you prefer.

My patch would cause a further change: it would cause the keys to be returned in a sorted order.

In support of this change, it "canonicalizes" query strings, so that you can easily compare two queries to see if they have equivalent names and values, even if they are in a different order or use a different joiner.
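
To make that concrete, here is a minimal sketch of the canonicalization in plain Perl. It is not the actual patch: canonical_query_string() is a hypothetical helper name, it works on the raw string rather than going through CGI.pm, and it compares the percent-encoded text as-is, so "%7E" and "~" would still count as different.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of CGI.pm: split a raw query string on
# "&" or ";", order the pairs by parameter name, and rejoin them with a
# single joiner ("&" unless told otherwise).
sub canonical_query_string {
    my ($qs, $joiner) = @_;
    $qs     = '' unless defined $qs;
    $joiner = '&' unless defined $joiner;

    # Drop empty segments such as the one produced by "a=1&&b=2".
    my @pairs = grep { length $_ } split /[&;]/, $qs;

    # Sort by name only; the original position is the tie-breaker, so
    # repeated names keep their relative order.
    my $i = 0;
    my @sorted =
        map  { $_->[2] }
        sort { $a->[0] cmp $b->[0] or $a->[1] <=> $b->[1] }
        map  { my ($name) = split /=/, $_, 2; [ $name, $i++, $_ ] }
        @pairs;

    return join $joiner, @sorted;
}

print canonical_query_string('b=2;a=1'), "\n";        # a=1&b=2
print canonical_query_string('x=2&a=1&x=1'), "\n";    # a=1&x=2&x=1
```

With something like this, two query strings can be compared with a plain "eq" after both have been canonicalized.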

Without this change, the comparison would be much harder: you'd have to do something like turning each query back into a hash, then comparing the two hashes for equality.
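
For comparison, the hash route might look roughly like this. Again, this is only a sketch: query_hash() and same_query() are made-up names, not CGI.pm methods, and the decoding is deliberately simplistic (bytes only, no UTF-8 handling).

```perl
use strict;
use warnings;

# Hypothetical helper: parse a raw query string into a hash that maps
# each decoded name to an array ref of its decoded values, in order.
sub query_hash {
    my ($qs) = @_;
    $qs = '' unless defined $qs;

    my %h;
    for my $pair (grep { length $_ } split /[&;]/, $qs) {
        my ($name, $value) = split /=/, $pair, 2;
        $value = '' unless defined $value;
        for ($name, $value) {
            tr/+/ /;                              # "+" means space
            s/%([0-9A-Fa-f]{2})/chr hex $1/ge;    # undo %XX escapes
        }
        push @{ $h{$name} }, $value;
    }
    return \%h;
}

# True if both query strings carry the same names with the same values.
# The order of different names is ignored; the order of values under
# one name is not.
sub same_query {
    my ($x, $y) = map { query_hash($_) } @_;
    return 0 unless keys %$x == keys %$y;
    for my $name (keys %$x) {
        my $vx = $x->{$name};
        my $vy = $y->{$name} or return 0;
        return 0 unless @$vx == @$vy;
        for my $i (0 .. $#$vx) {
            return 0 unless $vx->[$i] eq $vy->[$i];
        }
    }
    return 1;
}

print same_query('a=1&b=2', 'b=2;a=1') ? "same\n" : "different\n";
```

It works, but it is a fair amount of code compared with a single string comparison.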

On the other hand, you could speak against the change: For one, it's a change that could be considered unnecessary, and somebody, somewhere likely depends on the old behavior.

I tried to find an RFC which spoke to the point, but I couldn't find anything useful.

Although I have already submitted the patch, I'm having second thoughts about how "safe" a change it is. There are other alternatives for making query comparison easier, such as providing a "sorted_query_string()" method, either directly in CGI.pm or accessible through a plugin.

What do you think?

Re: CGI.pm: Want your query_string() sorted or unsorted?
by ikegami (Patriarch) on Aug 13, 2009 at 02:22 UTC

    I don't know how relevant it is here, but the order of the parameters in a query is well defined to be the same order as the fields appear in the form. In general, it's not acceptable to reorder parameters since the resulting URL is not equivalent.

    This information comes from the HTML4 spec. I don't know what the URI spec says.

Re: CGI.pm: Want your query_string() sorted or unsorted?
by Burak (Chaplain) on Aug 13, 2009 at 10:08 UTC
    I think that this is a job for subclassing :p BTW, congrats on your work on CGI.pm. The bugs were out of hand the last time I checked :)
      Thanks.

      Yes, there were about 150 active bugs in the bug tracker when I became co-maintainer. Many of them represented things which were long ago resolved, but never updated in the bug tracker. I've gone through and triaged every bug there in the last few weeks, so they should all have an up-to-date status. Several are waiting on volunteer help, and the "Needs..." subject lines should reflect that.

Re: CGI.pm: Want your query_string() sorted or unsorted?
by dsheroh (Monsignor) on Aug 13, 2009 at 11:01 UTC
    Reordering the query string doesn't sound like a particularly safe operation to me. It seems like something where order-dependency may be somewhat common. Given ikegami's point about the HTML4 spec specifying how it's to be ordered (which I assume is accurate, but I haven't checked to verify), it would appear that reordering it would be non-standards-compliant and, therefore, definitively wrong.

    What about adding either a sorted_query_string (as already suggested) or a query_hash method? Personally, query_hash seems like the more general and more likely to be useful of the two, but either would handle the comparability case that you've made.

Re: CGI.pm: Want your query_string() sorted or unsorted?
by scorpio17 (Canon) on Aug 13, 2009 at 14:38 UTC
    Are you familiar with PayPal's IPN (instant payment notification) system? The idea is that if you have an e-commerce site, and someone chooses to pay via paypal, paypal can notify you when it gets a payment. To help prevent abuse, your payment processing script has to send back an acknowledgment to paypal - this has to contain all the same parameters, IN THE SAME ORDER. If all goes well, you get back a 'verified' signal, and can complete the transaction. Else you know something has gone wrong, etc. But if CGI.pm changed the order of the parameters in the query string, this would cause a problem. At least, I know it would break MY paypal IPN processing script (and probably that of a lot of other people). If you add this feature, I suggest making it a) optional, and b) NOT the default. Thanks!
Re: CGI.pm: Want your query_string() sorted or unsorted?
by JavaFan (Canon) on Aug 13, 2009 at 11:06 UTC
    It's a long time since I used QUERY_STRING or CGI.pm, but isn't the only worthwhile feature of CGI.pm the fact that you don't have to bother with QUERY_STRING yourself? And if I really want to compare QUERY_STRINGs, I could just call the Vars method and compare the resulting hashes. There are modules out there that will compare hashes for you.
      Thanks. I rarely use "Vars" any more in favor of param(), but I agree it sounds like it could be a good fit here, in combination with a hash comparison function.

      <thinks>

      I'm recalling that the other piece of wanting this is that the query strings will be stored in a database column, and I want them in a canonical form for that, so that two identical queries in a different order will be represented the same.

      For that, it sounds like I really do need a "sorted_query_string()" method, which I could supply as a plugin or as a new method in CGI.pm.

      Thanks to everyone for the feedback!

        I'm recalling that the other piece of wanting this is that the query strings will be stored in a database column, and I want them in a canonical form for that, so that two identical queries in a different order will be represented the same.
        That sounds like a database design disaster. Why not store the individual items of the query string as columns?
Re: CGI.pm: Want your query_string() sorted or unsorted?
by Anonymous Monk on Aug 13, 2009 at 03:46 UTC
    overload ==?