Re: Grouping unique lines into a file.

by Limbic~Region (Chancellor)
on Apr 21, 2014 at 15:31 UTC (#1083031)


in reply to Grouping unique lines into a file.

Anonymous Monk,
If I have understood what you are trying to do, it boils down to these requirements:
  • A single file contains multiple lines.
  • Each line consists of an account followed by data.
  • The desired end state is for each account to be in its own file.
  • Any duplicate data for an account should be ignored.

Assuming the above is correct, here is how I would do it:

    #!/usr/bin/perl
    use strict;
    use warnings;

    my %seen;
    while (<DATA>) {
        chomp;
        my ($acct, $data) = $_ =~ m{^(\d\d)(.*)};
        next if ! defined $acct;          # skip lines without a two-digit account
        next if $seen{$acct}{$data}++;    # skip data already seen for this account
        append_data($acct, $_);
    }

    # If you know you are not going to exceed the open filehandle limit
    # You can improve performance by caching filehandles
    sub append_data {
        my ($acct, $line) = @_;
        open(my $fh, '>>', "$acct.txt")
            or die "Unable to open '$acct.txt' for appending: $!\n";
        print $fh "$line\n";              # restore the newline removed by chomp
    }

Here is how it would look if you know you will be safe caching file handles.

    #!/usr/bin/perl
    use strict;
    use warnings;

    my (%seen, %fh);
    while (<DATA>) {
        chomp;
        my ($acct, $data) = $_ =~ m{^(\d\d)(.*)};
        next if ! defined $acct;          # skip lines without a two-digit account
        next if $seen{$acct}{$data}++;    # skip data already seen for this account
        append_data($acct, $_, \%fh);
    }

    # Improved performance by caching filehandles
    sub append_data {
        my ($acct, $line, $fh) = @_;
        if (! $fh->{$acct}) {
            open($fh->{$acct}, '>>', "$acct.txt")
                or die "Unable to open '$acct.txt' for appending: $!\n";
        }
        print { $fh->{$acct} } "$line\n"; # restore the newline removed by chomp
    }
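
If you are not sure whether you will hit the open filehandle limit, a middle ground is to cap the cache and close the oldest handle when it fills up. What follows is only a rough sketch of that idea, not something from the original thread: the names $MAX_OPEN, @order and split_by_account are made up for illustration, and the cap of 100 is an arbitrary assumption you would tune for your system.

```perl
use strict;
use warnings;

# Assumed cap on simultaneously open files; tune it for your system.
our $MAX_OPEN = 100;

my %fh;       # account => open filehandle
my @order;    # accounts in the order their handles were opened

sub append_data {
    my ($acct, $line) = @_;
    if (! $fh{$acct}) {
        # Cache is full: close the handle that has been open the longest
        if (@order >= $MAX_OPEN) {
            my $oldest = shift @order;
            my $old_fh = delete $fh{$oldest};
            close($old_fh) or die "Unable to close '$oldest.txt': $!\n";
        }
        open($fh{$acct}, '>>', "$acct.txt")
            or die "Unable to open '$acct.txt' for appending: $!\n";
        push @order, $acct;
    }
    print { $fh{$acct} } "$line\n";
}

# Same de-duplication loop as above, reading from any input handle
sub split_by_account {
    my ($in) = @_;
    my %seen;
    while (my $line = <$in>) {
        chomp $line;
        my ($acct, $data) = $line =~ m{^(\d\d)(.*)};
        next if ! defined $acct;
        next if $seen{$acct}{$data}++;
        append_data($acct, $line);
    }
    # Close whatever is still cached so everything is flushed to disk
    for my $acct (@order) {
        close($fh{$acct}) or die "Unable to close '$acct.txt': $!\n";
    }
    %fh    = ();
    @order = ();
}
```

Calling split_by_account(\*STDIN) would process standard input. Because the files are opened for appending, an account whose handle was evicted simply gets reopened and picks up where it left off.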

Cheers - L~R
