PerlMonks  

difficulty in matching 2 fields of a same line

by Anonymous Monk
on Jun 27, 2012 at 17:06 UTC (#978721=perlquestion)
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I have a file in which each line looks something like this:
4829:71370 1:N:0:CGATGT + chr6 126912761 GAAGGCATAGCCCGTTGGGCTGTGGTCATCAGCCTC CCCFFFFFHGHHHJHIJJJHIJIGHCGIIJJJJIJI 0 4829:71370 2:N:0:CGATGT + chr7 89349071 AGCCCTGCCCCCACCCCCCATTCTTCTTGACTGTCT C@@FFFFFHHHGHJ JIJIJIIIIJJJJJJJJIIJIJ 0

Now I have used the following code to compare the 3rd and 11th fields.

print "\n Enter the file name :>";
chomp( my $filename1 = <STDIN> );
open my $fh, '<', $filename1 or die "Cannot find the file $filename1: $!";
while (<$fh>) {
    push (@line, $_);
    $count++;
}
open my $w, '>', 'Result.txt' or die "Cannot open the file Result.txt: $!";
for ($i = 0; $i < $count; $i++) {
    @st = split ("\t", $line [$i]);
    if ($st[3] ne $st [11]) {
        print $w $line [$i];
    }
}
close $w;
close $fh;
However, although it is a tab-delimited file, it gives me the wrong output:
it also prints lines that have identical values in the 3rd and 11th positions.
Is there any way to get rid of this error?

Re: difficulty in matching 2 fields of a same line
by choroba (Abbot) on Jun 27, 2012 at 17:23 UTC
    Perl numbers arrays from 0, so 3rd and 11th positions are [2] and [10].

      Yes, I took that into account; that is why I used 3.
      4829:71370 [0]
      1:N:0:CGATGT [1]
      + [2]

        Oh, I see. Can you give an example of a line that is being printed, but should not be? It seems to work for me. Please, use <c> tags around the data as well as code.
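One way to see what `split` actually puts at each index is to print every field next to its position. This is only a minimal sketch, assuming tab-delimited input; the helper name `indexed_fields` is hypothetical and the sample line is just the first five fields quoted in the question.

```perl
use strict;
use warnings;

# Hypothetical helper: split a tab-delimited line and label each field
# with its 0-based index, e.g. "[3] chr6".
sub indexed_fields {
    my ($line) = @_;
    chomp $line;
    my @fields = split /\t/, $line;
    return map { "[$_] $fields[$_]" } 0 .. $#fields;
}

# Sample built from the first five fields shown in the question.
my $sample = join "\t", '4829:71370', '1:N:0:CGATGT', '+', 'chr6', '126912761';
print "$_\n" for indexed_fields($sample);
```

Running this on a line that is printed unexpectedly shows quickly whether an extra column has shifted the fields of interest away from the expected indices.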
Re: difficulty in matching 2 fields of a same line
by 2teez (Priest) on Jun 27, 2012 at 18:02 UTC

    You can use an array slice like so:

    my @line;
    while (<DATA>) {
        chomp;
        push @line, split;
    }
    print @line[2,10];    # print 3, 11
    __DATA__
    4829:71370 1:N:0:CGATGT + chr6 126912761 GAAGGCATAGCCCGTTGGGCTGTGGTCATCAGCCTC CCCFFFFFHGHHHJHIJJJHIJIGHCGIIJJJJIJI 0 4829:71370 2:N:0:CGATGT + chr7 89349071 AGCCCTGCCCCCACCCCCCATTCTTCTTGACTGTCT C@@FFFFFHHHGHJ JIJIJIIIIJJJJJJJJIIJIJ 0

    However, it really depends on what you want to print out.
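A slice can also be taken line by line with a list slice on the result of `split`, instead of slicing one array that accumulates the fields of every line. The following is a minimal sketch, assuming tab-delimited input and the 0-based indices 3 and 11 used in the original question; `pick_fields` is a hypothetical name, and `SEQ` and `QUAL` are placeholders for the sequence and quality columns.

```perl
use strict;
use warnings;

# Hypothetical helper: return the fields at the given 0-based indices
# of one tab-delimited line, using a list slice on split's result.
sub pick_fields {
    my ($line, @idx) = @_;
    chomp $line;
    return (split /\t/, $line)[@idx];
}

# Sample line shaped like the data in the question (placeholders for
# the sequence and quality strings).
my $line = join "\t",
    '4829:71370', '1:N:0:CGATGT', '+', 'chr6', '126912761', 'SEQ', 'QUAL', '0',
    '4829:71370', '2:N:0:CGATGT', '+', 'chr7', '89349071',  'SEQ', 'QUAL', '0';

my ($first, $second) = pick_fields($line, 3, 11);
print "$first $second\n";    # prints "chr6 chr7"
```

Fixed indices only work while every line has the same number of columns, which is exactly what fails when a line carries an extra field.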

      Example:
      14975:50417    1:N:0:CGATGT    -    chr11    108607098    GCTAGCTTACAGGTCACCTTGCTTGGTGTGGACAGT    JJIIJJJGGGIJHHHFJJJIJJJHHHHHFFFFFCCC    0    33:A>T,34:G>C    14975:50417    2:N:0:CGATGT    +    chr11    89267281    TCATGAAGTATAAAGTTACAGGGTGGTGATGTGATT    CCCFFFFFBHHHFIJIIIIIIIJAFH<FFHIGIIIJ    0
      My code should not print this line, because both fields are equal here,
      but it does, because there are some extra tabs.

      How do I tell it to compare only the fields starting with chr?

        Let a regex find what you need
        my @cmp = $line[$i] =~ /\tchr([0-9]+)\t/g;
        if (2 == @cmp and $cmp[0] != $cmp[1]) {
            print $line[$i];
        }
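The regex above captures only digits and compares them numerically, so a name such as chrX would not be matched. A variation on the same idea is to pick the chr fields with `grep` and compare them as strings. This is a sketch rather than a definitive fix: it assumes tab-delimited input and that no other column begins with "chr"; `chr_fields_differ` is a hypothetical name, and `SEQ` and `QUAL` stand in for the sequence and quality columns.

```perl
use strict;
use warnings;

# Hypothetical helper: true (1) when the line has exactly two fields
# beginning with "chr" and they name different chromosomes, else 0.
sub chr_fields_differ {
    my ($line) = @_;
    chomp $line;
    my @chr = grep { /^chr/ } split /\t/, $line;
    return (@chr == 2 && $chr[0] ne $chr[1]) ? 1 : 0;
}

my @samples = (
    # Shaped like the problem line: an extra column, both reads on chr11.
    join("\t", '14975:50417', '1:N:0:CGATGT', '-', 'chr11', '108607098',
               'SEQ', 'QUAL', '0', '33:A>T,34:G>C',
               '14975:50417', '2:N:0:CGATGT', '+', 'chr11', '89267281',
               'SEQ', 'QUAL', '0'),
    # Shaped like the first sample line: chr6 versus chr7.
    join("\t", '4829:71370', '1:N:0:CGATGT', '+', 'chr6', '126912761',
               'SEQ', 'QUAL', '0',
               '4829:71370', '2:N:0:CGATGT', '+', 'chr7', '89349071',
               'SEQ', 'QUAL', '0'),
);

for my $line (@samples) {
    print chr_fields_differ($line) ? "differ\n" : "same, or not exactly two chr fields\n";
}
```

Because the fields are found by content rather than by position, an extra column such as `33:A>T,34:G>C` no longer changes which values are compared.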
