Your skill will accomplish
what the force of many cannot
The 10 stages of Bug hunting and the sweet smell of successby tachyon (Chancellor)
|on Sep 18, 2003 at 13:42 UTC||Need Help??|
So we have this rather large application. mod_perl, squid, Apache, MySQL and some Perl and C glue. So I am looking at the database and go "I don't like the look of that..." and so it begins - the hunt for the bug.....
Stage 1 - Begin Self Delusion (aka How Hard Can it Be?)
On the surface it seems like it will be pretty simple. A script which accesses the DB is iterating through a list. When it presents an item from the list it updates its status from 0 ie to be done, to 1 as in in progress. When it finishes if updates the flag to 2 ie done. Simple enough. Only problem is that there are a hell of a lot more 1s in there than there should be. For some reason the task is being commenced but not completed. A quick test show that for every task completed we have effectively 3 iterations through the DB list leaving one 2 and two 1 flags. Should be simple I think ;-) Someone has just called the get widget more than once. Well actually no.....they have not.....
Stage 2 - End Self Delusion (aka This might take a while)OK so there are no obvious errors. It ain't gonna be that easy. So we start to strip things away. Now I was accessing the widget through the proxy. And it was working before I was doing this. That MUST be the problem. Strip off the proxy.....and.....problem remains.....
OK I think. It was running fine under vanilla cgi and it is now not. Copy to vanilla CGI and all is well....and so I get the first sniff of the problem.....
Stage 3 - Back to Basics (aka Warn is your friend)
Time to pepper the code with warn "blah" and see what is going on. Lets have a look at the error log now. Ah interesting, we are getting a double POST to the script (the first debug warn was warn "Entered script $$\n") followed by others:
So now we can see the problem happening but why.....
Stage 4 - I think I am getting close (aka You wish)
Ah the almighty power of self delusion. So it now looks like it is a mod perl related double POST issue. But how is it related exactly? We are using a highly cut down version of CGI::Simple with just the basic header, param, query string and cookie methods. Why? Well it is twice as fast as CGI.pm, 10% the size, has methods that Apache::Registry doesn't and is damn near as fast as A::R without cookies and faster if you are using cookies. Doesn't care if it is running under mod_perl or vanilla. Big deal if it is broken! Perhaps that is the problem? Surely it must be.....
Stage 5 - Assumptions Assumptions (aka Begin Fruitless Search)
So let's convert to CGI.pm and see if the problem goes away I think. OK so the interface is the same, so it should not be hard. Oops 500 internal - forgot I added that $q->cookie()->bake method. Oops another 500 - $q->no_cache. OK so we have it running. And the problem seems fixed.....
Stage 6 - More Self Delusion
So it LOOKS like changing to CGI.pm has fixed it, must be a bug in my measly 300 lines of code. So off we go. Looking this way and looking that. When it was still broken using almost identical code to CGI.pm (mod perl associated) I begin to wonder.....I was sure CGI.pm fixed it....Double check. Oh, it actually didn't.....
Stage 6 - s/Fruitless Search|More Self Delusion/Denial/
Surely I didn't really just waste all that time? Must be a browser/mod_perl/somebody_elses_problem BUG. Sheesh it even happens with CGI.pm. Google for 'double POST bug mod_perl'....waste more time.....what about 'IE browser bug'....Hang on why don't I just try Netscape and Firebird (after all I am only using IE because that is what the end users will be using)....Damn, can't blame Bill.....Double check just to be sure ;-).....
Stage 7 - KISS (aka Strip it down to the bare minimum test case)
OK so lets prepare a really simple test case. Useful to post to Perlmonks in desparation if it gets really desperate. Yup there we go < 15 lines that reproduces the problem. Hang on.....
Stage 8 - The light bulb goes on
So I now have a trivial case that reproduces the problem, but that's the problem, it is so trivial it is almost self evident that there is no problem with that code....
Stage 9 - You always find your keys (and the bug) in the last place you lookSo what else was there? Only httpd.conf left really. Berate self, how could I have neglected that. Anyway here is the snippet from httpd.conf:
Yes we are using Apache::Dynagzip - great module BTW. Check the online docs (not carefully enough) Yup all looks OK. So comment out bits, restart Apache, test, problem refuses to go away although some amusing things happen. The most interesting occurred when I commented out the bit shown (killing the gzip stream...sort of):
Back to Test ScriptAAG@%#(*&GDG&$%#!@SES@%$$%TGSKUJYHT* ^%@%#$DFGF(*(%$FSDGF^@%$$DGDF*%%#Q#$DSG(*&^$WDS
So what was that? Well that is the output of Apache::Registry (the legible bit of the test case) and an undecoded gzip stream. Why two? Wasn't two the problem. Could my presumption about Apache::Dynagzip (i.e., it just could not be the problem because it is on the output stream) be WRONG? Reread the docs, carefully this time. Oops, you don't use Apache::Registry with Apache::RegistryFilter and Apache::Dynagzip. If I had read the docs more carefully in the first place I would have noted that. So the problem was that we had two different Apache modules essentially splitting one request into two.....amazing it has not caused more problems really!
Stage 10 - The sweet taste of success
Nothing like ending a session on a satisfactory note. Being Australian there is only one way to celebrate. Did I say smell? I meant taste. If you ever get to OZ try a VB. It's a beer. Bottoms up, and happy hunting.
Fix one problem create another. The config directives shown above using Apache::RegistryFilter and Apache::Dynagzip (but removing Apache::Registry from the chain) fix the double POST....BUT...stop cookies from working properly. This config apparently gives you mod_perl, cookies and on the fly gzip compression. Here's hoping! Some small test cases look promising.