PerlMonks  

Re: merge lines removing duplicates in a file

by AnomalousMonk (Archbishop)
on Oct 01, 2016 at 14:34 UTC


in reply to merge lines removing duplicates in a file

In general:

c:\@Work\Perl\monks>perl -wMstrict -le
"use List::MoreUtils qw(uniq);
 ;;
 my @data = qw(
   NC_009565:0 NC_017524:0 NC_017522:0 NC_018143:0
   NC_017026:0 NC_017523:0 NC_016934:1 NC_018078:0
   NC_017026:0 NC_017523:0 NC_016934:1 NC_018078:0
   NC_999999:0 NC_999999:1
   NC_021193:0 NC_016768:0 NC_021251:0 NC_021192:0
   NC_012943:0 NC_002755:0 NC_020559:0 NC_020089:0
   NC_999999:1 NC_999999:0
   );
 ;;
 my @uniq = uniq @data;
 ;;
 printf qq{%d in \@data \n}, scalar @data;
 printf qq{%d in \@uniq \n}, scalar @uniq;
 print qq{'$_'} for @uniq;
 "
24 in @data
18 in @uniq
'NC_009565:0'
'NC_017524:0'
'NC_017522:0'
'NC_018143:0'
'NC_017026:0'
'NC_017523:0'
'NC_016934:1'
'NC_018078:0'
'NC_999999:0'
'NC_999999:1'
'NC_021193:0'
'NC_016768:0'
'NC_021251:0'
'NC_021192:0'
'NC_012943:0'
'NC_002755:0'
'NC_020559:0'
'NC_020089:0'

I leave to you the task of extracting the protein (?) and NC_ info from all records and associating the latter (after uniq-ification) with the former. See List::MoreUtils::uniq().
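
For what it's worth, here is one rough way the remaining step might look. This is only a sketch under assumptions: the actual record layout from the original question is not shown in this reply, so the input format (a key followed by whitespace-separated NC_ ids on each line), the sample keys protA and protB, and the sub name merge_records are all hypothetical. It uses a plain %seen hash instead of List::MoreUtils so that it runs on a bare Perl install; uniq() could be applied per key instead.

```perl
use strict;
use warnings;

# Merge records that share a key, keeping each id once per key,
# in first-seen order. The record format is an assumption:
# "<key> <id> <id> ..." separated by whitespace.
sub merge_records {
    my @records = @_;
    my (%ids_for, %seen, @order);

    for my $record (@records) {
        my ($key, @ids) = split ' ', $record;
        next unless defined $key;          # skip blank lines

        unless (exists $ids_for{$key}) {   # remember key order
            push @order, $key;
            $ids_for{$key} = [];
        }

        # keep an id only the first time it appears for this key
        push @{ $ids_for{$key} }, grep { !$seen{$key}{$_}++ } @ids;
    }

    return map { join ' ', $_, @{ $ids_for{$_} } } @order;
}

# Hypothetical sample data, not taken from the original question.
my @merged = merge_records(
    'protA NC_009565:0 NC_017524:0',
    'protB NC_999999:0 NC_999999:1',
    'protA NC_017524:0 NC_017522:0',
);

print "$_\n" for @merged;
```

Reading the lines from a file instead of a literal list would just mean chomp-ing each line and passing it to the same sub.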


Give a man a fish:  <%-{-{-{-<
