PerlMonks  

Re^4: compare lines within a file

by kennethk (Abbot)
on Mar 10, 2011 at 15:29 UTC ( [id://892442]=note )


in reply to Re^3: compare lines within a file
in thread compare lines within a file

Define "doesn't work". Please provide your input file (wrapped in <code> tags) as well as your observed and expected outputs.

One problem is that you have written '\t' in place of "\t". These mean different things in Perl. Double quotes interpolate, whereas single quotes do not. This means '\t' yields the literal string of a backslash followed by a t; "\t" yields a tab character. There is a list of escape sequences in the link I have provided.
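
The difference can be checked directly. The following is a minimal sketch, not taken from the original script; the variable names are made up for illustration.

```perl
use strict;
use warnings;

my $single = '\t';   # two characters: a backslash and the letter t
my $double = "\t";   # one character: a tab

printf "single-quoted length: %d\n", length $single;   # 2
printf "double-quoted length: %d\n", length $double;   # 1

# The difference shows up wherever the string is used as data,
# for example as a separator passed to join.
my $with_literal = join '\t', qw(a b);   # a, backslash, t, b
my $with_tab     = join "\t", qw(a b);   # a, tab, b

print "literal separator: $with_literal\n";
print "tab separator:     $with_tab\n";
```

Note that split is a special case: its first argument is treated as a pattern, so split '\t' still splits on tabs, because the regex engine itself understands \t.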

Re^5: compare lines within a file
by garyboyd (Acolyte) on Mar 11, 2011 at 13:56 UTC

    I tried the code but I get an error: Use of uninitialized value in printf at parse_result.txt.pl line 23, <$fh> line 45.

    If I change '\t' to "\t" I still get the error.

    The input file is:

    HWUSI-EAS95L_0025_FC:3:1:2031:1075#0/1 + 2770970 2771005
    HWUSI-EAS95L_0025_FC:3:1:2031:1075#0/2 + 2771158 2771190
    HWUSI-EAS95L_0025_FC:3:1:2229:1075#0/1 - 1449587 1449620
    HWUSI-EAS95L_0025_FC:3:1:2229:1075#0/2 - 1449425 1449460
    HWUSI-EAS95L_0025_FC:3:1:5001:1079#0/1 - 1449311 1449346
    HWUSI-EAS95L_0025_FC:3:1:5001:1079#0/2 - 1449301 1449336
    HWUSI-EAS95L_0025_FC:3:1:5232:1082#0/1 - 1449586 1449619
    HWUSI-EAS95L_0025_FC:3:1:5232:1082#0/2 - 1449544 1449577
    HWUSI-EAS95L_0025_FC:3:1:6417:1078#0/1 - 4744083 4744113
    HWUSI-EAS95L_0025_FC:3:1:6417:1078#0/2 - 4744011 4744042
    HWUSI-EAS95L_0025_FC:3:1:6539:1083#0/1 - 4867122 4867157
    HWUSI-EAS95L_0025_FC:3:1:6539:1083#0/2 - 4866942 4866977
    HWUSI-EAS95L_0025_FC:3:1:10260:1083#0/1 + 1930232 1930266
    HWUSI-EAS95L_0025_FC:3:1:10260:1083#0/2 + 1930354 1930389
    HWUSI-EAS95L_0025_FC:3:1:10916:1076#0/1 - 4874098 4874133
    HWUSI-EAS95L_0025_FC:3:1:10916:1076#0/2 - 4874089 4874121
    HWUSI-EAS95L_0025_FC:3:1:11022:1076#0/1 + 749842 749877
    HWUSI-EAS95L_0025_FC:3:1:11022:1076#0/2 + 749905 749936
    HWUSI-EAS95L_0025_FC:3:1:11305:1077#0/1 + 2083459 2083494
    HWUSI-EAS95L_0025_FC:3:1:11305:1077#0/2 + 2083661 2083696
    HWUSI-EAS95L_0025_FC:3:1:11824:1080#0/1 + 1930341 1930376
    HWUSI-EAS95L_0025_FC:3:1:11824:1080#0/2 + 1930373 1930408
    HWUSI-EAS95L_0025_FC:3:1:12409:1075#0/1 - 4359407 4359442
    HWUSI-EAS95L_0025_FC:3:1:12409:1075#0/2 - 4359384 4359419
    HWUSI-EAS95L_0025_FC:3:1:15014:1078#0/1 + 742090 742125
    HWUSI-EAS95L_0025_FC:3:1:15014:1078#0/2 + 742134 742168
    HWUSI-EAS95L_0025_FC:3:1:15074:1080#0/1 - 2697450 2697485
    HWUSI-EAS95L_0025_FC:3:1:15074:1080#0/2 - 2697347 2697381
    HWUSI-EAS95L_0025_FC:3:1:15895:1077#0/1 - 3870810 3870845
    HWUSI-EAS95L_0025_FC:3:1:15895:1077#0/2 - 3870798 3870832
    HWUSI-EAS95L_0025_FC:3:1:16241:1078#0/1 + 3726316 3726351
    HWUSI-EAS95L_0025_FC:3:1:16241:1078#0/2 + 3726444 3726479
    HWUSI-EAS95L_0025_FC:3:1:16990:1084#0/1 + 4485745 4485780
    HWUSI-EAS95L_0025_FC:3:1:16990:1084#0/2 + 4485764 4485797
    HWUSI-EAS95L_0025_FC:3:1:1360:1089#0/1 - 4848206 4848241
    HWUSI-EAS95L_0025_FC:3:1:2719:1087#0/1 - 1449535 1449570
    HWUSI-EAS95L_0025_FC:3:1:2719:1087#0/2 - 1449425 1449460
    HWUSI-EAS95L_0025_FC:3:1:2763:1085#0/1 - 1449423 1449458
    HWUSI-EAS95L_0025_FC:3:1:2763:1085#0/2 - 1449427 1449460
    HWUSI-EAS95L_0025_FC:3:1:3151:1099#0/1 - 4867745 4867773
    HWUSI-EAS95L_0025_FC:3:1:3151:1099#0/2 - 4867750 4867774
    HWUSI-EAS95L_0025_FC:3:1:4137:1088#0/1 - 4359723 4359758
    HWUSI-EAS95L_0025_FC:3:1:4137:1088#0/2 - 4359622 4359657
    HWUSI-EAS95L_0025_FC:3:1:4196:1093#0/1 + 2145336 2145371
    HWUSI-EAS95L_0025_FC:3:1:4196:1093#0/2 + 2145456 2145490

    I was hoping to get output something like 2770970 and 2771190 for the first two entries, and so on.

    Hope that makes sense!

      The line

          if ($row->[0] =~ m{\QHWUSI-EAS95L_0025_FC:3:1:5232:1082#0//E}) {

      should have read

          if ($row->[0] =~ m{\QHWUSI-EAS95L_0025_FC:3:1:5232:1082#0/\E}) {

      There is a typo in the original code: a forward slash where a backslash belongs before the trailing E. The pair \Q and \E tells Perl (when interpolating) to escape all characters that have special meaning in a regular expression; see Quote and Quote-like Operators in perlop. The typo was of course mine, and I have corrected the original post accordingly. With that change, I get the output: 1449586    1449577. Obviously, you should modify that matching condition to fit your requirements.
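
The effect of the typo can be reproduced in isolation. This is a small sketch rather than the original script; $read, $fixed and $typo are names chosen here for illustration. Without a closing \E, the quoting started by \Q runs to the end of the pattern, so the stray "//E" becomes literal text that the read name never contains.

```perl
use strict;
use warnings;

# One read name from the sample input.
my $read = 'HWUSI-EAS95L_0025_FC:3:1:5232:1082#0/1';

# Corrected pattern: \E ends the quoted region.
my $fixed = qr{\QHWUSI-EAS95L_0025_FC:3:1:5232:1082#0/\E};

# Pattern with the typo: no \E, so "//E" is quoted as literal text too.
my $typo  = qr{\QHWUSI-EAS95L_0025_FC:3:1:5232:1082#0//E};

print "fixed pattern matches\n" if $read =~ $fixed;
print "typo pattern matches\n"  if $read =~ $typo;   # not printed

# \Q...\E does at interpolation time what quotemeta does to a string.
my $quoted = quotemeta 'a.b';
print "quotemeta gives: $quoted\n";                  # a\.b
```

Since the mistyped pattern never matches, a variable that is only assigned inside the if block stays undefined, which is one plausible source of an "uninitialized value" warning when it is printed later.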

        Thanks kennethk, this works, but I cannot figure out how to incorporate it into a program that takes a list and checks each line to see whether a subsequent entry matches and, if it does, prints the result.

        It is possible to do this if the name is specified in the code:

        if ($row->[0] =~ m{\QHWUSI-EAS95L_0025_FC:3:1:5232:1082#0/\E}) {

        but how do I do this if I do not know in advance what each line is?

        I tried using the following regex, which will match each line:

                 if ($header2 =~  m{HWUSI-EAS95L_0025_FC:3:1:\d*:\d*#0/}) {

        but I am not sure how to take it from there.
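
One possible way to take it from there, offered as a sketch under assumptions rather than as the thread's answer: capture the part of the name before the /1 or /2 suffix instead of hard-coding it, and use that as a hash key to pair mates. The sub name pair_reads, the whitespace-separated field layout, and the choice to report the /1 start with the /2 end (inferred from the outputs quoted above) are all assumptions.

```perl
use strict;
use warnings;

# A few records from the sample input, assumed whitespace-separated.
my @lines = (
    'HWUSI-EAS95L_0025_FC:3:1:2031:1075#0/1 + 2770970 2771005',
    'HWUSI-EAS95L_0025_FC:3:1:2031:1075#0/2 + 2771158 2771190',
    'HWUSI-EAS95L_0025_FC:3:1:1360:1089#0/1 - 4848206 4848241',
    'HWUSI-EAS95L_0025_FC:3:1:5232:1082#0/1 - 1449586 1449619',
    'HWUSI-EAS95L_0025_FC:3:1:5232:1082#0/2 - 1449544 1449577',
);

# Hypothetical helper: returns [start of /1, end of /2] for each pair.
sub pair_reads {
    my @records = @_;
    my (%start_of_first, @pairs);
    for my $record (@records) {
        my ($name, $strand, $start, $end) = split ' ', $record;
        next unless defined $end;                 # skip short lines
        # Capture the shared ID and the mate number (1 or 2).
        my ($id, $mate) = $name =~ m{^(.+)/([12])\z} or next;
        if ($mate == 1) {
            $start_of_first{$id} = $start;        # remember the /1 start
        }
        elsif (exists $start_of_first{$id}) {
            push @pairs, [ delete $start_of_first{$id}, $end ];
        }
    }
    return @pairs;
}

printf "%d\t%d\n", @$_ for pair_reads(@lines);
```

With the five sample records this prints 2770970 and 2771190 for the first pair and 1449586 and 1449577 for the second; the 1360:1089 read has no /2 mate in the input and is skipped. Keying on the captured ID means the two mates do not have to sit on adjacent lines.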
