PerlMonks  

difficulty in matching 2 fields of a same line

by Anonymous Monk
on Jun 27, 2012 at 17:06 UTC ( #978721=perlquestion )
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I have a file in which each line looks something like this:
4829:71370 1:N:0:CGATGT + chr6 126912761 GAAGGCATAGCCCGTTGGGCTGTGGTCATCAGCCTC CCCFFFFFHGHHHJHIJJJHIJIGHCGIIJJJJIJI 0 4829:71370 2:N:0:CGATGT + chr7 89349071 AGCCCTGCCCCCACCCCCCATTCTTCTTGACTGTCT C@@FFFFFHHHGHJ JIJIJIIIIJJJJJJJJIIJIJ 0

I used the following code to compare the 3rd and 11th fields:

print "\n Enter the file name :>";
chomp( my $filename1 = <STDIN> );
open my $fh, '<', $filename1 or die "Cannot find the file $filename1: $!";
while (<$fh>) {
    push (@line, $_);
    $count++;
}
open my $w, '>', 'Result.txt' or die "Cannot open the file Result.txt: $!";
for ($i = 0; $i < $count; $i++) {
    @st = split ("\t", $line [$i]);
    if ($st[3] ne $st [11]) {
        print $w $line [$i];
    }
}
close $w;
close $fh;
However, although it is a tab-delimited file, the code gives me the wrong output:
it also prints lines that have identical fields in the 3rd and 11th positions.
Is there any way to get rid of this error?

Replies are listed 'Best First'.
Re: difficulty in matching 2 fields of a same line
by choroba (Canon) on Jun 27, 2012 at 17:23 UTC
    Perl numbers arrays from 0, so 3rd and 11th positions are [2] and [10].

      Yes, I took that into account; that is why I used 3.
      4829:71370 [0]
      1:N:0:CGATGT [1]
      + [2]

        Oh, I see. Can you give an example of a line that is being printed, but should not be? It seems to work for me. Please, use <c> tags around the data as well as code.
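One way to check which index a field really lands on is to print every field of a suspect line next to its index. The following is only a minimal sketch: the helper name `indexed_fields` and the shortened sample line are illustrative assumptions, not code from the thread.

```perl
use strict;
use warnings;

# Return one "[index] value" string per tab-separated field of a line.
# (indexed_fields is a hypothetical helper name used for illustration.)
sub indexed_fields {
    my ($line) = @_;
    chomp $line;
    my @fields = split /\t/, $line, -1;   # -1 keeps trailing empty fields
    return map { "[$_] $fields[$_]" } 0 .. $#fields;
}

# Hypothetical sample, shortened from the data in the question.
my $sample = join "\t", '4829:71370', '1:N:0:CGATGT', '+', 'chr6', '126912761';
print "$_\n" for indexed_fields($sample);
```

Running the same helper on a line that is printed unexpectedly should show whether an extra field has shifted the chr columns away from indices 3 and 11.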
Re: difficulty in matching 2 fields of a same line
by 2teez (Priest) on Jun 27, 2012 at 18:02 UTC

    You can use an array slice like so:

    my @line;
    while(<DATA>){
        chomp;
        push @line,split;
    }
    print @line[2,10]; #print 3,11
    __DATA__
    4829:71370 1:N:0:CGATGT + chr6 126912761 GAAGGCATAGCCCGTTGGGCTGTGGTCATCAGCCTC CCCFFFFFHGHHHJHIJJJHIJIGHCGIIJJJJIJI 0 4829:71370 2:N:0:CGATGT + chr7 89349071 AGCCCTGCCCCCACCCCCCATTCTTCTTGACTGTCT C@@FFFFFHHHGHJ JIJIJIIIIJJJJJJJJIIJIJ 0

    However, it really depends on what you want to print out.

      Example:
      14975:50417    1:N:0:CGATGT    -    chr11    108607098    GCTAGCTTACAGGTCACCTTGCTTGGTGTGGACAGT    JJIIJJJGGGIJHHHFJJJIJJJHHHHHFFFFFCCC    0    33:A>T,34:G>C    14975:50417    2:N:0:CGATGT    +    chr11    89267281    TCATGAAGTATAAAGTTACAGGGTGGTGATGTGATT    CCCFFFFFBHHHFIJIIIIIIIJAFH<FFHIGIIIJ    0
      My code should not print this line, because both fields are equal here,
      but it does, because there are some extra tabs.

      How do I tell it to compare only the fields that start with chr?

        Let a regex find what you need:

        my @cmp = $line[$i] =~ /\tchr([0-9]+)\t/g;
        if (2 == @cmp and $cmp[0] != $cmp[1]) {
            print $line[$i];
        }
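A variation on the same idea, offered only as a sketch: select the fields that start with `chr` after splitting on tabs, and compare them as strings. String comparison would also cope with non-numeric names such as `chrX`, which is an assumption here since no such name appears in the thread's data. The sub name `chr_fields_differ` is hypothetical.

```perl
use strict;
use warnings;

# Return 1 when a line has exactly two tab-separated fields starting
# with "chr" and they name different chromosomes; return 0 otherwise.
# (chr_fields_differ is a hypothetical name used for illustration.)
sub chr_fields_differ {
    my ($line) = @_;
    chomp $line;
    my @chr = grep { /^chr/ } split /\t/, $line;
    return ( @chr == 2 && $chr[0] ne $chr[1] ) ? 1 : 0;
}

# Shortened, hypothetical lines modelled on the data in the thread.
my @lines = (
    join( "\t", 'id', '1:N:0:CGATGT', '+', 'chr6',  '126912761', 'id', '2:N:0:CGATGT', '+', 'chr7',  '89349071' ),
    join( "\t", 'id', '1:N:0:CGATGT', '-', 'chr11', '108607098', '33:A>T,34:G>C', 'id', '2:N:0:CGATGT', '+', 'chr11', '89267281' ),
);

# Only the first line is printed; the second has two equal chr fields,
# even though its extra field shifts the column positions.
print "$_\n" for grep { chr_fields_differ($_) } @lines;
```

Because the fields are found by content rather than by position, an extra column such as `33:A>T,34:G>C` no longer changes which values are compared.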
