Hi All-

This error is confusing me (a Perl beginner, but with some experience) as well as someone I consider to be very far from a beginner, hence this post. I have a tab-delimited file that comes from the output of another bit of code. It is opened and read line by line in what should be a simple operation to populate a complex data structure (Hash{key}{value}->(Array)), where the key and value are fields pulled from the line and the array is the entire line, stored for later use.

The issue is that sometimes, when I split the line on tabs, it *misses* one of the fields and leaves it as uninitialized. The very strange thing is that I can see this in the debugger, but if I execute the split command from within the debugger, it's just fine. Any help would be great. A code snippet and a debugger view of the error are below.

FYI, perl -v tells me: This is perl 5, version 12, subversion 3 (v5.12.3) built for darwin-thread-multi-2level (with 2 registered patches, see perl -V for more detail)

43  my $count = 0;
44  my $priorScaffold;
45  my $priorChr;
46  my @finalArray;
47  open (IN, '<', $fileName) or die "Cannot open $fileName\n";
48  LINE: while (my $line = <IN>){
49      chomp $line;
50      if ($line =~ m/^>\w+/){
51          next LINE;
52      }
53      my @dataLine = split /\t/, $line;
54      my $chr = $dataLine[5];
55      next LINE if ($chr !~ /^\d{1,2}?$/);
56      my $scaffold = $dataLine[0];
57      if ($count != 0){
58          if ($scaffold ne $priorScaffold){    #If we found a new scaffold
59              push @{$global{$priorChr}{$priorScaffold}}, @finalArray;    #write to hash @finalArray using prior accumulated data for last scaffold
60              $count = 0 ;
61              $allScaffolds{$priorScaffold}=0;    #Save priorScaffold in allScaffolds Hash for list use later- value is meaningless
62              @finalArray = ();
63          }
64          elsif ($chr != $priorChr){    #If we just switched Chromosomes but are still on the same scaffold
65              push @{$global{$priorChr}{$priorScaffold}}, @finalArray;    #write to global the @finalArray accumulated to the point it switches chromosomes
66              $count = 0;
67              @finalArray = ();
68          }
69      }
70      push @{$finalArray[$count]}, @dataLine;
71      $priorScaffold = $scaffold;
72      $priorChr = $chr;
73      $count ++;
74      print "$line\n";
75      print join ("\t", @dataLine)."\n";
76  }

The debugger transcript showing this very odd (to me) behavior is below. NOTE that when I execute the split on the line from the debugger prompt it works perfectly, but somehow the code as actually executed is failing.

  DB<1> b 54 $dataLine[2] = uninitialized
  DB<2> c
This program requires 4 command line arguments
Input File from as MizBee output from SyntenyFinder as first
Input File listing all scaffolds and total lengths from the query genome
Output FILE name to be used as basis for creation of MizBee input file and tmp file you can delete
A NUMERIC integer value specifying the cut off value for scaffold EXCLUSION from visualization
This value is used to eliminate scaffolds with less then VAL hits on any given chromosome
THIS ONLY ELIMINATES THE SCAFFOLD VISUALIZATION FOR THAT CHROMOSOME
main::syntenyLoad(
54:         my $chr = $dataLine[5];
  DB<2> x $line
0  "GL429767\cI42604\cI226589\cI0\cI1\cI7\cI100487615\cI100493753\cI550\cI-1\cI0.8748\cIENSMLUG00000029214\cIENSG00000087085\cI"
  DB<3> x split /\t/, $line
0  'GL429767'
1  42604
2  226589
3  0
4  1
5  7
6  100487615
7  100493753
8  550
9  '-1'
10  0.8748
11  'ENSMLUG00000029214'
12  'ENSG00000087085'
  DB<4> x @dataLine
0  'GL429767'
1  42604
2  'uninitialized'
3  0
4  1
5  7
6  100487615
7  100493753
8  550
9  '-1'
10  0.8748
11  'ENSMLUG00000029214'
12  'ENSG00000087085'
  DB<5>
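One thing that may be worth ruling out in a transcript like this: the debugger's `b LINE CONDITION` form evaluates the condition as an ordinary Perl expression whenever the line is reached, so a condition written with `=` performs an assignment instead of a test. The sketch below is not the debugger itself; it imitates that evaluation with a string `eval` on a shortened, made-up line. The helper name `run_condition` is hypothetical, and the bareword from the transcript is written here as the quoted string 'uninitialized', on the assumption that the debugger treats it that way.

```perl
use strict;
use warnings;

# Evaluate a condition string against a freshly split line, roughly the way a
# conditional breakpoint would, and return whether it "hit" plus the fields.
sub run_condition {
    my ($condition) = @_;
    my @fields = split /\t/, "GL429767\t42604\t226589\t0";
    my $hit = eval $condition;    # the string eval can see the lexical @fields
    die $@ if $@;
    return ($hit ? 1 : 0, @fields);
}

# '=' assigns, so the condition is true and field 2 is overwritten.
my ($hit, @after) = run_condition(q{$fields[2] = 'uninitialized'});
print "assignment: hit=$hit, field 2 is now '$after[2]'\n";

# A real test leaves the data alone.
($hit, @after) = run_condition(q{!defined $fields[2]});
print "test:       hit=$hit, field 2 is still '$after[2]'\n";
```

If this is what is happening, a condition such as `!defined $dataLine[2]` would stop on genuinely undefined fields without changing them.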

So I went ahead and wrote a test script that does only the split, using the same file, and prints out both the line and the split line. That script works just fine, in that the print statements show the line and the split line to be identical throughout. (Well, I admit I didn't look at all 80,000 lines, but I cannot find any errors.) So the input file is OK; it's just somehow this code that's causing issues. But it's so simple! HAHAH... it's always the simple stuff that kills.
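A standalone check along these lines can be sketched as follows. This is a minimal sketch, not the test script described above: the helper name `check_fields` and the expected column count of 13 are assumptions taken from the single line shown in the debugger transcript. It also passes -1 as the limit to `split`, because without a limit `split` silently drops trailing empty fields, and the sample line ends in a tab.

```perl
use strict;
use warnings;

# Split one tab-delimited line and list the indexes of any of the first
# $expected fields that are missing or empty.
sub check_fields {
    my ($line, $expected) = @_;
    chomp $line;
    # The -1 limit keeps trailing empty fields instead of discarding them.
    my @fields = split /\t/, $line, -1;
    my @problems;
    for my $i (0 .. $expected - 1) {
        push @problems, $i unless defined $fields[$i] && length $fields[$i];
    }
    return (\@fields, \@problems);
}

# The line from the debugger transcript, including its trailing tab.
my $line = join("\t", 'GL429767', 42604, 226589, 0, 1, 7, 100487615,
                100493753, 550, -1, 0.8748, 'ENSMLUG00000029214',
                'ENSG00000087085') . "\t\n";

my ($fields, $problems) = check_fields($line, 13);
printf "%d fields, %d problem(s)\n", scalar @$fields, scalar @$problems;
```

Running the same check over every line of the input file, rather than comparing printed output by eye, would show whether any line really is short a field.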

Update: Hold.... working on this for a minute... had an idea to test before bothering everyone.

In reply to uninitialized error by rufessor
