Re: pattern search then remove duplicacy

by hexcoder (Curate)
on Jun 21, 2014 at 10:03 UTC ( [id://1090749] )


in reply to pattern search then remove duplicacy

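The problem is that you use the match operator in scalar context instead of list context. In scalar context it returns only whether the match succeeded (1 or the empty string), not the matched text.
my $pat = $line =~ m/^LOC_Os0[1-7]g[0-9]*.[0-9]\s/;
should be
my ($pat) = $line =~ m/^(LOC_Os0[1-7]g[0-9]*.[0-9])\s/;
Note the capturing parentheses: in list context a successful match returns the captured text, but without any capture group it returns just the list (1).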
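A minimal sketch of the context difference, runnable on its own. The sample line is made up for illustration (the ID format is only inferred from the pattern), and the pattern is copied unchanged. It also shows that list context yields the matched text only when the pattern contains capturing parentheses; without them a successful match returns the list (1).

```perl
use strict;
use warnings;

# Hypothetical sample line; the ID format is an assumption based on the pattern.
my $line = "LOC_Os01g01010.1 some annotation\n";

# Scalar context: only a success flag (1 on success, empty string on failure).
my $flag = $line =~ m/^LOC_Os0[1-7]g[0-9]*.[0-9]\s/;

# List context, no capture group: the list (1) on success.
my ($no_capture) = $line =~ m/^LOC_Os0[1-7]g[0-9]*.[0-9]\s/;

# List context with a capture group: the captured text.
my ($pat) = $line =~ m/^(LOC_Os0[1-7]g[0-9]*.[0-9])\s/;

print "scalar:           $flag\n";
print "list, no capture: $no_capture\n";
print "list, capture:    $pat\n";
```

With this sample line, only the third form leaves the ID itself in the variable.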
I would write the script more like this in order to have error handling and avoid rewriting the whole file for each new entry.
#!/usr/local/bin/perl
use strict;
use warnings;
use autodie;

open (FILE, "<:utf8", "outputps_scan_chr1_.out");
my %seen = ();
open (MYFILE, ">:utf8", "data.txt");
open (WASTE, ">:utf8", "waste.txt");

while (defined(my $line = <FILE>)) {
    my $pat;
    next if ($line !~ m/^(LOC_Os0[1-7]g[0-9]*.[0-9])\s/);
    $pat = $1;
    if (!$seen{$pat}++) {
        print MYFILE $line;
    }
    else {
        print WASTE $line;
    }
}

close (MYFILE);
close (WASTE);
close (FILE);
Update: I forgot to mention that I also changed the pattern matching.

First I check whether the line matches at all, and skip the line if it does not.
Instead of matching a second time to get the ID, I put a capture group (...) in the pattern. The matched string is then available in $1, which I assign to $pat.
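To try the capture-and-%seen idea without touching any files, here is a small sketch that runs on in-memory data. The input lines and the @kept/@waste arrays are hypothetical stand-ins for the real input file and the two output files.

```perl
use strict;
use warnings;

# Hypothetical input lines standing in for the real file.
my @lines = (
    "LOC_Os01g01010.1 first hit\n",
    "LOC_Os01g01010.1 second hit\n",
    "LOC_Os01g01019.1 other gene\n",
    "unrelated header line\n",
);

my (%seen, @kept, @waste);
for my $line (@lines) {
    # Skip lines that carry no ID; on a successful match the ID is in $1.
    next if $line !~ m/^(LOC_Os0[1-7]g[0-9]*.[0-9])\s/;
    my $pat = $1;

    # Post-increment returns the old count, so this is true only
    # the first time a given ID is seen.
    if (!$seen{$pat}++) {
        push @kept, $line;
    }
    else {
        push @waste, $line;
    }
}

print "kept:  $_" for @kept;
print "waste: $_" for @waste;
```

The first occurrence of each ID ends up in @kept, later repeats go to @waste, and lines without an ID are dropped.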
