LukeyBoy has asked for the wisdom of the Perl Monks concerning the following question:

I've got a core dump from one of my server-side Perl CGI scripts, and the main line at the end points to /gateway/cgi-bin/test/nph-proxy.cgi (which is a Perl script obviously). When I use gdb on the Perl binary and cross-reference it to the core dump, all I get for a backtrace is:

#0 0x805efe1 in S_my_exit_jump ()
#1 0x805eea9 in Perl_my_exit ()
#2 0x8088d55 in Perl_safemalloc ()
#3 0x8099d0a in S_more_sv ()
#4 0x809e1d5 in Perl_newSV ()
#5 0x805a720 in perl_construct ()
#6 0x8059d38 in main ()
#7 0x400699cb in __libc_start_main (main=0x8059d10 <main>, argc=2, argv=0xbffffb04, init=0x8058fa8 <_init>, fini=0x80d681c <_fini>, rtld_fini=0x4000ae60 <_dl_fini>, stack_end=0xbffffafc) at ../sysdeps/generic/libc-start.c:92

Does anyone know how I'd go about finding why this crashed the Perl binary? Thanks!


Re: What do you do with a core dump?
by dragonchild (Archbishop) on Mar 07, 2002 at 20:42 UTC
    Unless you feel like combing through the Perl source, don't bother. According to the backtrace it died in Perl_safemalloc(), which called Perl_my_exit() while perl_construct() was still setting up the interpreter. At least that's what it looks like.

    If you don't feel like patching the binary, I'd just start debugging the script and try to fix whatever situation in it caused the binary to crash.

    I would highly recommend trying to find a minimal case that reproduces the bug and forwarding it to perlbug.

    ------
    We are the carpenters and bricklayers of the Information Age.

    Don't go borrowing trouble. For programmers, this means Worry only about what you need to implement.

      I would forward it to perlbug, but the script calls a hell of a lot of other code in modules, so I don't even have a clue which section would reproduce this. Also, I'm not sure how to run the debugger in the context of a server-script.
Re (tilly) 1: What do you do with a core dump?
by tilly (Archbishop) on Mar 08, 2002 at 03:08 UTC
    The odds are that you don't. Suppose that the problem was data corruption from a C library that it uses. Then the bug could quite literally be anywhere, and where it shows up is not very informative.

    First of all, try to reproduce it. Get all of the information that you can from the logs and try to trigger the crash yourself. If you can, then try to do it in a controllable test environment. Assuming that you can get it there, try removing chunks of the program and reproducing it again. If removing a chunk seems to fix it, put that chunk back and remove other chunks instead. You are trying to produce a test case, or else a good bug report. Doing this is a bit of an art, and a lot of luck. (The element of needed luck rises tremendously if the error is not deterministic. When something happens one time in 10,000, it is really hard to get feedback on whether you have made a difference or not with your latest edit...)
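
    One way to get a CGI script into a controllable environment is to run it from the command line with a hand-built CGI environment and look at how the child process ended. The following is only a sketch: run_cgi and describe_status are hypothetical helper names, and the request values are placeholders that you would replace with whatever your server logs show for the failing request.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Run a CGI script outside the web server and report how it ended.
# The default request values below are placeholders, not real data.
sub run_cgi {
    my ($script, %env) = @_;

    my %request = (
        GATEWAY_INTERFACE => 'CGI/1.1',
        REQUEST_METHOD    => 'GET',
        QUERY_STRING      => '',
        %env,                      # caller-supplied values win
    );

    # Localize only the keys we set; the rest of %ENV is inherited.
    local @ENV{ keys %request } = values %request;

    # $^X is the perl binary running this script, so the child uses
    # the same interpreter.
    system($^X, $script);

    return describe_status($?);
}

# Decode a wait status as described for $? in perlvar.
sub describe_status {
    my ($status) = @_;
    return 'failed to start' if $status == -1;

    my $signal = $status & 127;    # low 7 bits: terminating signal
    if ($signal) {
        my $core = ($status & 128) ? ' (core dumped)' : '';
        return "killed by signal $signal$core";
    }
    return 'exited with status ' . ($status >> 8);
}
```

    Calling run_cgi with the script path and a QUERY_STRING taken from the logs gives a one-line answer each time you remove a chunk, which makes the trial-and-error loop much quicker than going through the web server.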

    A second approach is to make a list of the modules it uses. Find out which ones load C libraries. For each one, search for reports that it is associated with core dumps. (For instance, I have seen reports that some database drivers will dump core every so often.) This is kind of hit or miss, but sometimes you can identify a problem this way. An intimate familiarity with a wide variety of reported bugs may improve your odds slightly...
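
    To find out which modules load C libraries, you can ask the running interpreter which shared objects it has loaded. This is a rough sketch that assumes a dynamically linked perl; loaded_shared_objects is a hypothetical helper name, and POSIX only stands in for the modules your script really uses.

```perl
use strict;
use warnings;

# Return the shared object files that loaded modules have pulled in.
# DynaLoader records them in @DynaLoader::dl_shared_objects.
sub loaded_shared_objects {
    no warnings 'once';
    my @files = sort @DynaLoader::dl_shared_objects;
    return @files;
}

# Load the same modules as the failing script, then print the list.
# POSIX is a stand-in here; substitute the script's own "use" lines.
require POSIX;
print "$_\n" for loaded_shared_objects();
```

    Every file in that list is compiled code running inside the Perl process, so those are the modules worth searching bug reports for first.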

    Speaking of which, a distinct possibility to look into is whether or not it is using a signal handler. Perl does not have safe signal handling, and in particular allocating memory within a signal handler can cause core dumps.
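
    If the script does install a signal handler, the usual way to reduce the risk is to have the handler do nothing except set a flag, and to do the real work later from the main program. A minimal sketch, assuming a Unix-like system with SIGUSR1; handle_pending_signal is a hypothetical name:

```perl
use strict;
use warnings;

# Set by the handler; checked by the main program at a safe point.
my $got_signal = 0;

# Keep the handler trivial: no printing, no string building and no
# data structures, so it has as little reason as possible to allocate.
$SIG{USR1} = sub { $got_signal = 1 };

# Call this from the main loop, outside the handler.
# Returns 1 if a signal had arrived since the last call, otherwise 0.
sub handle_pending_signal {
    return 0 unless $got_signal;
    $got_signal = 0;
    # Real work (logging, cleanup, rereading config) would go here.
    return 1;
}
```

    This does not make signal handling safe, but it keeps the amount of code that runs inside the handler as small as it can be.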

    But unless you can find someone with an intimate familiarity with the internals of Perl, odds are that trying to debug a complex Perl script using gdb is going to be an uphill battle. (When I say "uphill", think Nepal...)