mod_perl go boom, mod_cgi works
by hacker (Priest) on Mar 06, 2005 at 01:40 UTC
hacker has asked for the wisdom of the Perl Monks concerning the following question:
I've written a little mod_perl application that takes an rss/rdf/atom feed or a usenet newsgroup by name, and converts it to HTML output, via some custom XSL stylesheets I've written. It uses XML::LibXML and XML::LibXSLT as well as a smattering of XML::RSS and Net::NNTP.
So far, so good, and it works great. The output looks great too. Clean and effective. The goal (as with much of my Perl and screen-scraping code) is output that can be converted for viewing on a handheld through Plucker.
But there's a problem...
Every once in a while, at unpredictable intervals, running the script to convert the feed to HTML causes the Apache child servicing that request to segfault. It happens reliably, but at random: I can resubmit the same POSTDATA, and it will work some of the time and segfault Apache other times.
Normally, this wouldn't be a problem, because Apache would just spawn another child to replace the segfaulted one, but I'm running Apache behind Squid as an accelerator, and when an Apache child segfaults, Squid loses the handle and drops the socket.
Before you ask, I'm using strict, warnings and diagnostics. There are no closures. Everything that could possibly be causing this has been checked and checked again. The code itself runs very clean, and it is probably one of the best pieces of Perl I've written to date.
The weirder part is that it only segfaults Apache when it's running as a mod_perl application. It does not segfault when running as mod_cgi, in the same directory. Of course, running it as mod_cgi is about 80% slower, so that's not an option if I want to launch this tool publicly. I even tried using Memoize to gain a bit more control over the way functions are evaluated, but that didn't seem to improve matters at all.
I've tried Apache 1.3.33 as source-built, Debian packaged, and on FreeBSD. I've tried threaded and non-threaded apache+mod_perl builds (-lpthread and USE_THREADS=1). I've tried DSO and static. I've tried separating the mod_perl and mod_php servers into their own instances on separate ports, with physically-separate config files. Still no success in stopping the segfaults.
I've also tried using Apache 2.0.53 in the same configuration, threaded, non-threaded, DSO, static, shared instances, separate instances, etc. Again, no luck.
I've tried using stock linuxthreads and also NPTL, with the same negative results.
After days of frustration, I tried another wild approach, with Apache 2.1.3 sitting behind Squid, acting as a ProxyPass agent, talking to two physically-separate instances of 1.3.33 running on separate ports. Still no luck.
So I'm at an impasse. I don't quite know how to figure out why this is happening, and how to avoid it. I even went through the Apache debugging steps one-by-one, and still could not find a reproducible way to get it to crash in the same place so I could identify which apache or mod_perl interface has the bug. I've run it through gdb and strace hundreds of times (yes tye, I've used 'attach' here <grin>), but it keeps dying in different places; it's very inconsistent.
The last step, before I finally give up and write this in another language, is to go through the mod_perl Porting Guide that belg4mit referred me to, and see if there are any last things I might have missed.
Is there a way to get even more granular with this, to really see if there are some hidden globals I can't see, or closures not being reported by strict/warnings/taint/diagnostics? It's just about driving me mad now, 5 days and counting, so I'd like to solve it and move on to the next project.
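For concreteness, one way to look for Perl-level globals is to walk a package's symbol table and list every scalar, array, and hash it holds. The following is only a minimal sketch under stated assumptions: the package name My::FeedApp, its variables, and the helper package_globals are all hypothetical and not taken from the actual application. It also only sees Perl package variables, so it would not reveal state held inside C libraries such as the ones behind XML::LibXML and XML::LibXSLT.

```perl
use strict;
use warnings;

# Hypothetical package standing in for the real application code.
{
    package My::FeedApp;
    our $cache = {};
    our @queue = (1, 2);
    our %seen  = (example => 1);
    sub handler { return 1 }
}

# Hypothetical helper: return the names of the package variables
# (scalars, arrays, hashes) found in the given package's symbol table.
sub package_globals {
    my ($pkg) = @_;
    my @found;
    no strict 'refs';    # symbol-table inspection needs symbolic references
    for my $name (sort keys %{"${pkg}::"}) {
        next if $name =~ /::$/;    # skip nested package stashes
        my $full = "${pkg}::$name";
        push @found, "\$$name" if defined ${$full};
        push @found, "\@$name" if *{$full}{ARRAY};
        push @found, "\%$name" if *{$full}{HASH};
    }
    return @found;
}

print "$_\n" for package_globals('My::FeedApp');
```

Under a registry-style handler the script would be compiled into a generated package, so the name passed to the helper would have to be adjusted accordingly; that part is left as an assumption here.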