BioGeek has asked for the wisdom of the Perl Monks concerning the following question:

Hey Monks,

I have the following snippet of code.
my @disease_name_pocus = @_;
open( POCUS, "/home/BioGeek/results_100.out" ||die $!\n";
while (<POCUS>) {
    push @results, [split];
}
open( MARKER, ">marker_list.txt" ) || die $!;
for ( my $i = 0 ; $i < scalar @results ; $i++ ) {
    if ( $results[$i]->[0] eq $disease_name_pocus[0] ) {
        if ( $results[$i]->[1] ne $results[ ( $i + 1 ) ]->[1] ) {
            print MARKER "$results[$i]->[1]\t$results[$i]->[4]\n";
        }
    }

It compares $disease_name_pocus[0] with the first element of each array in an array of array refs. If they are equal, it prints the second and fifth elements of that array (indices 1 and 4), but it should not print two arrays that have the same value in $results[$i]->[1].
The code above warns "Use of uninitialized value in string ne". Is there a better way of testing, so that no duplicate arrays are printed?

Update: added a snippet of the input file.
lca ENSG00000179314 6 100 0.0030279 + 7 3 23 ;
lca ENSG00000179314 6 100 0.0030279 + 7 3 23 ;
lca ENSG00000179314 6 100 0.0030279 + 7 3 23 ;
lca ENSG00000176287 6 100 0 + 17 3 23 ;
lca ENSG00000176287 6 100 0 + 17 3 23 ;
lca ENSG00000176287 6 100 0 + 17 3 23 ;
lca ENSG00000176287 6 100 0 + 17 3 23 ;
lca ENSG00000176287 6 100 0 + 17 3 23 ;
lca ENSG00000176287 6 100 0 + 17 3 23 ;
lca ENSG00000161940 6 100 8.7149e-09 + 9 4 23 ;
lca ENSG00000161940 6 100 8.7149e-09 + 9 4 23 ;
lca ENSG00000161940 6 100 8.7149e-09 + 9 4 23 ;
lca ENSG00000161940 6 100 8.7149e-09 + 9 4 23 ;
lca ENSG00000161940 6 100 8.7149e-09 + 9 4 23 ;
lca ENSG00000161940 6 100 8.7149e-09 + 9 4 23 ;
lca ENSG00000108561 6 100 1.02585e-13 + 13 4 23 ;
lca ENSG00000108561 6 100 1.02585e-13 + 13 4 23 ;
lca ENSG00000108561 6 100 1.02585e-13 + 13 4 23 ;
lca ENSG00000108561 6 100 1.02585e-13 + 13 4 23 ;

Replies are listed 'Best First'.
Re: use of uninitialized value in string ne
by Joost (Canon) on Aug 19, 2004 at 09:48 UTC
    Please copy & paste your code next time. This code doesn't even compile.

    Also, it's a good idea to post a snippet of the input file, so we can see what that code is actually reading.

    In this specific instance the uninitialized value warning probably indicates a bug, assuming @results and @disease_name_pocus are supposed to be filled without undef values.

    A nice way of checking these arrays is

    use Data::Dumper qw(Dumper);
    warn Dumper(\@results,\@disease_name_pocus);
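To illustrate what such a dump can reveal, here is a minimal sketch. The rows are hypothetical: two are taken from the sample input, and the empty one is what `[split]` stores for a blank line. The field-count check at the end is an addition for illustration, not part of the suggestion above.

```perl
use strict;
use warnings;
use Data::Dumper qw(Dumper);

# Hypothetical data: two well-formed rows and one empty row, which is
# what "push @results, [split]" stores when it reads a blank line.
my @results = (
    [ 'lca', 'ENSG00000179314', 6, 100, '0.0030279' ],
    [],
    [ 'lca', 'ENSG00000176287', 6, 100, 0 ],
);

# Dump the whole structure to STDERR; the empty row shows up as [].
warn Dumper( \@results );

# List the indices of rows that have fewer than five fields.
my @short = grep { @{ $results[$_] } < 5 } 0 .. $#results;
print "short rows at index: @short\n";
```

Any row flagged this way would yield undef for `$results[$i]->[1]` or `$results[$i]->[4]` and so trigger the warning.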
      This snippet was copied and pasted, but it's part of a larger program, of course.

      I also added typical sample input with lots of duplicate lines.
        This:
        open( POCUS, "/home/BioGeek/results_100.out" ||die $!\n";
        has several syntax errors so it can't come from any working code.

        Anyway, it looks to me (if this resembles the actual code, and assuming the warning is from the $results[$i]->[0] eq $disease_name_pocus[0] line) that reading the input file is ok, so probably $disease_name_pocus[0] is undefined. It gets set from @_ which is generally used for passing subroutine arguments. Maybe this is part of a sub and you're calling it incorrectly?

        Of course, this code will also warn if you have blank lines or other lines in your input that don't conform to the sample input you gave above.

        Also,

        $results[$i]->[1] ne $results[ ( $i + 1 ) ]->[1]

        will test one element past the current length of @results at the last iteration, and will also give the warning.
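One way to avoid reading past the end is to compare each row with the previous one rather than the next one. The following is only a sketch under assumptions: the rows are hypothetical stand-ins for @results with just the first two fields, and the disease-name filter is left out for brevity.

```perl
use strict;
use warnings;

# Hypothetical rows standing in for @results (fields 0 and 1 only).
my @results = (
    [ 'lca', 'ENSG00000179314' ],
    [ 'lca', 'ENSG00000179314' ],
    [ 'lca', 'ENSG00000176287' ],
);

my @kept;
for my $i ( 0 .. $#results ) {
    # The previous row always exists when $i > 0, whereas the next row
    # does not exist on the last pass through the loop.
    next if $i > 0 && $results[$i][1] eq $results[ $i - 1 ][1];
    push @kept, $results[$i][1];
}
print "$_\n" for @kept;
```

Note that this keeps the first row of each run of equal IDs, whereas the original loop keeps the last one; the set of IDs is the same either way.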

        I'd probably write it something like this:

        my $disease_name_pocus = 'something'; # don't need an array, we only ever use the first value anyway

        open POCUS, "<", "/home/BioGeek/results_100.out"
            or die "Can't open /home/BioGeek/results_100.out: $!\n";
        open MARKER, ">", "marker_list.txt"
            or die "Can't open marker_list.txt: $!";

        my $lastline;
        while (<POCUS>) {
            chomp;               # remove newline
            next unless /\S/;    # only use lines that contain something else than spaces
            my @result = split;  # plain list; [split] would store a single array ref
            next unless $result[0] eq $disease_name_pocus;
            next if (defined $lastline and $lastline eq $result[1]);
            $lastline = $result[1];
            print MARKER "$result[1]\t$result[4]\n";
        }
        close POCUS;
        close MARKER;
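If duplicate IDs are not guaranteed to sit on adjacent lines, a hash of already-seen IDs is a common alternative to remembering only the last line. This is a sketch, not the code from the post: `unique_markers` is a hypothetical helper, and the sample rows are copied from the input snippet but deliberately reordered so that the duplicates are not adjacent.

```perl
use strict;
use warnings;

# Hypothetical helper: returns "id\tvalue" strings for the first
# occurrence of each ID among the lines whose first field is $disease.
sub unique_markers {
    my ( $disease, @lines ) = @_;
    my ( %seen, @out );
    for my $line (@lines) {
        my @f = split ' ', $line;
        next unless @f >= 5;               # skip blank or short lines
        next unless $f[0] eq $disease;
        next if $seen{ $f[1] }++;          # true from the second sighting on
        push @out, "$f[1]\t$f[4]";
    }
    return @out;
}

# Rows from the sample input, reordered so duplicates are not adjacent.
my @sample = (
    'lca ENSG00000179314 6 100 0.0030279 + 7 3 23 ;',
    'lca ENSG00000176287 6 100 0 + 17 3 23 ;',
    'lca ENSG00000179314 6 100 0.0030279 + 7 3 23 ;',
);
print "$_\n" for unique_markers( 'lca', @sample );
```

The field-count check also guards against the uninitialized-value warning for malformed lines.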
Re: use of uninitialized value in string ne
by Random_Walk (Prior) on Aug 19, 2004 at 11:19 UTC
    You can add a check that the second element has some value.
    perl -wle '$a[1]=[1];print $a[1]->[0];if($a[1]->[1]){print$a[1]->[1]}'
    1
    perl -wle '$a[1]=[1,2];print$a[1]->[0];if($a[1]->[1]){print$a[1]->[1]}'
    1
    2
    Something like this should do it in your code:
    for ( my $i = 0 ; $i < scalar @results ; $i++ ) {
        if ( $results[$i]->[0] eq $disease_name_pocus[0] ) {
            next unless $results[$i]->[1];
            if ( $results[$i]->[1] ne $results[ ( $i + 1 ) ]->[1] ) {
                print MARKER "$results[$i]->[1]\t$results[$i]->[4]\n";
            }
        }
    }

    Cheers,
    Random.