http://www.perlmonks.org?node_id=759203


in reply to New to me crash message

I gather that line 68 (where the error is reported to have occurred) is here:
    while ( my ( $url, $text ) = each(%hrefs) ) {
        if ( !defined( $text->[-1] ) ) {
            print $fh_ERR "$url|_____NO_TEXT_____\n";
        }
        elsif ( $text->[-1] ne $success ) {
            $notfound++;
            if (defined($text->[0])) {
                my @text = collapse(@$text);    ### <--- line 68
                print $fh_ERR "$url|", join( ",", @text ), "\n";
            }
            else {
                print $fh_ERR "$url|\n";
            }
        }
        $total++;
    }
I'm stumped about the reference to "aelem" in the error message -- no clue what this might be referring to. Apart from that, when you mention "the usual runtime", do you mean that this script has "usually" worked prior to this failure?

If so, the question becomes: what was different about this run relative to previous runs (when it worked as intended)? More input data? Corrupted input data? (I don't see much in the way of checking for bad input... what would happen if a line in your "$base_file.links" does not contain a "|" (vertical bar) character?)

Probably not related to your problem, but you could replace the "collapse" function call with:

grep { defined } @$text
Also, I see you checking for hash elements with if(defined($hash{$key})), and it might make more sense to use if(exists($hash{$key})) instead.
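As a rough illustration of both suggestions, here is a minimal sketch. The %hrefs data and the defined_only name are made up for the example, and it assumes that collapse() does nothing more than drop undefined entries, which I can't confirm from the code shown:

```perl
use strict;
use warnings;

# Hypothetical data; the OP's real %hrefs is not shown in this thread.
my %hrefs = (
    'http://example.com/a' => [ 'first', undef, 'last' ],
    'http://example.com/b' => undef,
);

# Assumption: collapse() does nothing more than drop undefined entries.
sub defined_only { return grep { defined } @_ }

my @text = defined_only( @{ $hrefs{'http://example.com/a'} } );
print join( ',', @text ), "\n";

# exists: is the key present at all?
# defined: does the key hold a defined value?
for my $key ( 'http://example.com/b', 'http://example.com/c' ) {
    printf "%s exists=%d defined=%d\n", $key,
        ( exists( $hrefs{$key} )  ? 1 : 0 ),
        ( defined( $hrefs{$key} ) ? 1 : 0 );
}
```

The key ending in /b is present but holds undef, so only exists reports it; the key ending in /c is absent, so both tests are false.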

Re^2: New to me crash message
by grinder (Bishop) on Apr 22, 2009 at 08:03 UTC
    I'm stumped about the reference to "aelem" in the error message

    It is internals-speak for "array element", at the C level. It only surfaces when you grovel deeply in magic, or XS.

    • another intruder with the mooring in the heart of the Perl

      More precisely, it's one of the opcodes that implements $a[$i]. (There appears to be an aelemfast as well.)
      $ perl -MO=Concise -e'$a[$i]'
      7  <@> leave[1 ref] vKP/REFC ->(end)
      1     <0> enter ->2
      2     <;> nextstate(main 1 -e:1) v ->3
      6     <2> aelem vK/2 ->7
      4        <1> rv2av sKR/1 ->5
      3           <#> gv[*a] s ->4
      -        <1> ex-rv2sv sK/1 ->6
      5           <#> gvsv[*i] s ->6
      -e syntax OK

      But oddly, there's no array indexing at the source line number given by the message.

Re^2: New to me crash message
by Anonymous Monk on Apr 22, 2009 at 08:03 UTC
    aelem seems to be an opcode, for example:
        # l <|> mapwhile(other->m)[t26] lK
        # m <#> gv[*_] s
        # n <1> rv2sv sKM/DREFAV,1
        # o <1> rv2av[t4] sKR/1
        # p <$> const[IV 0] s
        # q <2> aelem sK/2
    Simpler way to generate aelem
Re^2: New to me crash message
by hsmyers (Canon) on Apr 22, 2009 at 14:32 UTC
    Improvements are also very welcome. Much of this code was written in quick-and-dirty mode; sensible refactoring need not apply. While anything is possible, this is the second of two scripts. The first writes the file this one reads at an earlier point (the one with '|' as divider), so at least that is unlikely. Thanks for the better design suggestions! I'm thinking that the best candidate for a 'change' as smoking gun is a significant jump in data size. This is of course complicated by the time it takes to process a large file (hours, with the companion script taking the longest portion).

    --hsm

    "Never try to teach a pig to sing...it wastes your time and it annoys the pig."
      ... this is the second part of two scripts, the first writes the file this one reads at an earlier point, the one with '|' as divider so at least that is unlikely.

      In my experience, a statement like "that is unlikely" doesn't suffice for debugging the sort of problem you're having. Only testing will suffice. Test the output of that first script for the data set in question, and confirm that every record matches /\S\|\S/.
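One way to run that check, as a sketch only: the bad_record_lines name and the sample records below are invented for illustration and are not taken from the OP's data.

```perl
use strict;
use warnings;

# Returns the 1-based line numbers of records that fail the /\S\|\S/ check.
sub bad_record_lines {
    my @lines = @_;
    my @bad;
    for my $i ( 0 .. $#lines ) {
        push @bad, $i + 1 unless $lines[$i] =~ /\S\|\S/;
    }
    return @bad;
}

# Hypothetical sample records.
my @sample = (
    "http://example.com/a|some text\n",
    "http://example.com/b\n",       # no divider at all
    "http://example.com/c| \n",     # divider, but nothing useful after it
);
print "bad record at line $_\n" for bad_record_lines(@sample);
```

For a quick pass over the real file, feeding its lines to the same function (or using the same regex in a one-liner) should be enough to confirm or rule out malformed records.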

      You can also create a small test set of data for input to the OP script, include a record that does not match that regex, and see what happens. If it blows up, that's a good cue for adding some defensive code in the OP script, to do something sensible when such data comes in (skip the record or die, with a suitable message to stderr).
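The defensive code could look something like the sketch below. The parse_record function is hypothetical (the OP's actual record-splitting code isn't shown in this thread), and this version chooses to warn and skip rather than die:

```perl
use strict;
use warnings;

# Hypothetical parser; not the OP's actual code.
# Returns ($url, $text) for a well-formed record,
# or an empty list after warning on stderr.
sub parse_record {
    my ( $line, $line_no ) = @_;
    chomp $line;
    my ( $url, $text ) = split /\|/, $line, 2;
    unless ( defined($url) && length($url) && defined($text) && $text =~ /\S/ ) {
        warn "skipping malformed record at line $line_no: '$line'\n";
        return;
    }
    return ( $url, $text );
}

# Small test set: one good record, one without the '|' divider.
my @sample = ( "http://example.com/a|found\n", "http://example.com/b\n" );
my $line_no = 0;
for my $line (@sample) {
    $line_no++;
    my ( $url, $text ) = parse_record( $line, $line_no ) or next;
    print "ok: $url => $text\n";
}
```

Swapping the warn/return pair for a die would give the stricter behavior, if a single bad record should stop the run.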

        A little like asking the husband for his fingerprints to rule him out. While it may upset him, it can in fact rule the poor guy out. And in the code, it would be one less thing to worry about. Good suggestion.

        --hsm

        "Never try to teach a pig to sing...it wastes your time and it annoys the pig."