http://www.perlmonks.org?node_id=978738


in reply to Re^4: difficulty in matching 2 fields of a same line
in thread difficulty in matching 2 fields of a same line

Do you mean that there are lines where you want to compare fields 3 and 11, and other lines where you want to compare, say, 6 and 8, and others where fields 2 and 9 are relevant? If that's what you mean, then I suggest you provide us with the rules that decide which fields are of relevance. If this is not what you mean, then yeah, maybe you should give us an example to show what you mean.

Re^6: difficulty in matching 2 fields of a same line
by Anonymous Monk on Jun 27, 2012 at 18:26 UTC
    example
    14975:50417 1:N:0:CGATGT - chr11 108607098 GCTAGCTTACAGGTCACCTTGCTTGGTGTGGACAGT JJIIJJJGGGIJHHHFJJJIJJJHHHHHFFFFFCCC 0 33:A>T,34:G>C 14975:50417 2:N:0:CGATGT + chr11 89267281 TCATGAAGTATAAAGTTACAGGGTGGTGATGTGATT CCCFFFFFBHHHFIJIIIIIIIJAFH<FFHIGIIIJ 0

    My code should not be printing this line, because here both fields are equal, but it does because there are some extra tabs. How do I tell it to compare only those fields starting with chr in a given line?

      You should begin every program with these two lines, unless you have a very good reason not to. They force you to write cleaner code and help you catch errors much more quickly. Originally you did that, but somewhere along the road you seem to have dropped them.

      use strict;
      use warnings;

       

      This looks okay.

      print "\n Enter the file name :>";
      chomp( my $filename1 = <STDIN> );

       

      Because we use strict; we'll have to declare the array now. You did it right in the original code, but again, somewhere you strayed from the right path.

      my @lines = ();

       

      I really disagree with the exact wording of the error message here, as I've pointed out in the other thread. But bah. If this is how ya wanna put it, don't lemme stop ya!

      open my $fh, '<', $filename1 or die "Cannot find the file $filename1: $!";

       

      I'm tempted to rewrite your loop so that it writes out lines as it reads them. There's really no point in keeping everything in memory, but I'll assume you have a reason for doing it this way. At any rate, I'm getting rid of your $count variable. You don't really need it. This, too, has been pointed out in that other thread.

      while (<$fh>) { push @lines, $_; }

      Or just push @lines, $_ while <$fh>; or, while we're at it, @lines = <$fh>;, as has been mentioned in... you know where. Anyway, I've moved the close statement up. There's no point in keeping the file handle open if you're not using it anymore.

      close $fh;
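For what it's worth, here is a rough sketch of the "write lines out as you read them" idea mentioned above. It is not the original poster's code: the helper name copy_matching_lines and its $keep callback are made up for illustration, and the demo uses in-memory file handles so that it runs without any real files.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): copy every line for
# which $keep->($line) returns true from $in to $out, one line at a time,
# so the whole file never has to sit in an array.
sub copy_matching_lines {
    my ($in, $out, $keep) = @_;
    my $kept = 0;
    while (my $line = <$in>) {
        next unless $keep->($line);
        print {$out} $line;
        $kept++;
    }
    return $kept;
}

# Demo on in-memory file handles; the sample text is invented.
my $input  = "keep me\ndrop me\nkeep me too\n";
my $result = '';
open my $in,  '<', \$input  or die "Cannot read the in-memory input: $!";
open my $out, '>', \$result or die "Cannot write the in-memory output: $!";

my $kept = copy_matching_lines($in, $out, sub { $_[0] =~ /^keep/ });

close $in;
close $out;
print $result;
```

With real files you would open the input and Result.txt exactly as in the rest of this node and pass those two handles instead.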

       

      This looks okay.

      open my $w, '>', 'Result.txt' or die "Cannot open the file Result.txt: $!";

       

      I've rewritten your for loop to be more Perl-ish. This, too, has been demonstrated.

      foreach my $line (@lines) {
          my @chr_fields = grep { $_ =~ m/^chr/ } split( "\t", $line );
          if (                                      # If...
              @chr_fields == 2                      # ...there are exactly two chr* fields...
              && $chr_fields[0] ne $chr_fields[1]   # ...which are not exactly the same...
          ) {
              print $w $line;
          }
      }
      close $w;
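If you want to convince yourself that stray tabs no longer matter, the same test can be pulled out into a small sub and tried on a few toy lines. This is only a sketch: the name chr_fields_differ is made up, and the sample lines are invented and merely modeled on your example (I'm assuming the fields really are tab-separated, with an empty field standing in for an "extra tab").

```perl
use strict;
use warnings;

# Hypothetical helper: true if the tab-separated line contains exactly two
# fields starting with "chr" and those two fields are not the same string.
sub chr_fields_differ {
    my ($line) = @_;
    chomp $line;    # so a chr field at the end of the line loses its newline
    my @chr_fields = grep { /^chr/ } split /\t/, $line;
    return ( @chr_fields == 2 && $chr_fields[0] ne $chr_fields[1] ) ? 1 : 0;
}

# Invented sample lines, built with join so the tabs are explicit.
# The empty string in $same produces two tabs in a row (an "extra tab").
my $same = join( "\t", 'read1', '-', 'chr11', '108607098', '',
                       'read1', '+', 'chr11', '89267281' ) . "\n";
my $diff = join( "\t", 'read2', '-', 'chr11', '108607098',
                       'read2', '+', 'chr7',  '89267281' ) . "\n";
my $one  = join( "\t", 'read3', '-', 'chr11', '108607098' ) . "\n";

# Only $diff has two different chr fields, so only that line is printed.
print for grep { chr_fields_differ($_) } ( $same, $diff, $one );
```

Because grep picks the fields by content rather than by position, it doesn't matter how many other columns sit in between.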