Re: how to speed up dupe checking of arrays

by wind (Priest)
on Jul 31, 2007 at 09:57 UTC (#629780)

in reply to how to speed up dupe checking of arrays

Most likely you're simply going to be throttled by I/O speed. However, the code above could in theory be simplified by limiting the split to only 3 parts and by moving the dup check into the while loop, assuming that the dup count is all you really care about.
if ($file =~ $spec_text) {
    my $file_date = (split(/\./, $file))[3];
    open(IN, '<', $file) or die("open failed: $!");
    my $count_uniq = 0;
    my %seen;
    while (<IN>) {
        chomp;
        my ($ele0, $ele1, undef) = split ';', $_, 3;
        $count_uniq++ if !$seen{"$ele0;$ele1;$file_date"}++;
    }
    print "$.\n";            # Total number of lines.
    print "$count_uniq\n";
    close(IN);
}
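To illustrate what limiting the split buys you, here is a minimal sketch. The sample record is made up, since the real file layout isn't shown in this thread. With a limit of 3, the third element simply holds the unsplit remainder of the line, so Perl doesn't need to break up fields you never use.

```perl
use strict;
use warnings;

# Hypothetical sample record; the real file layout is not shown in the thread.
my $line = "a;b;c;d;e";

# Without a limit, split produces every field.
my @all = split /;/, $line;

# With a limit of 3, the third element holds the unsplit remainder.
my @limited = split /;/, $line, 3;

print scalar(@all), " fields: @all\n";
print scalar(@limited), " fields: @limited\n";
```

How much time this saves in practice depends on how many fields each line has.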
- Miller

Re^2: how to speed up dupe checking of arrays
by ultibuzz (Monk) on Jul 31, 2007 at 10:25 UTC

    I need them in an array, so I adjusted your code like this:

    my @rows;
    my %seen;
    while (<IN>) {
        chomp;
        my ($ele0, $ele1, undef) = split ';', $_, 3;
        push @rows, "$ele0;$ele1;$file_date" if !$seen{"$ele0;$ele1;$file_date"}++;
    }
    close(IN);
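For anyone who wants to try the idea without the original data, here is a self-contained sketch of the same seen-hash pattern. The input lines and the `$file_date` value are invented for illustration. The array keeps the first occurrence of each key, in input order.

```perl
use strict;
use warnings;

# Hypothetical input; stands in for the lines read from the file.
my @lines = ("a;b;1", "a;b;2", "c;d;3", "a;b;4");
my $file_date = '20070731';    # assumed value, normally taken from the file name

my (@rows, %seen);
for my $line (@lines) {
    # Only the first two fields matter for the key; the rest stays unsplit.
    my ($ele0, $ele1) = split /;/, $line, 3;
    my $key = "$ele0;$ele1;$file_date";

    # The post-increment returns the old count, so this is true only
    # the first time a key is seen.
    push @rows, $key unless $seen{$key}++;
}

print "$_\n" for @rows;
```

Each line costs one hash lookup, so the total work grows roughly linearly with the number of lines.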

    And what should I say: AWESOME. From 203 seconds down to around 11 seconds. Great, so no overtime in the office needed ;)
    Thanks a lot.

    kd ultibuzz

      You already have them: the keys of %seen are the unique strings.
      If you need the data you can, for example, do:
      foreach (keys %seen) { .... }
      The value stored for each key is the number of times that string occurred.
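A small sketch of that, with made-up keys standing in for the real data. Note that `keys` returns the hash keys in no guaranteed order, so this example sorts them; if the original input order matters, a separate array is still the simpler route.

```perl
use strict;
use warnings;

# Hypothetical keys; in the thread these are built from each line of the file.
my %seen;
$seen{$_}++ for qw(a;b c;d a;b a;b);

# Sort the keys to get stable output, since hash order is not guaranteed.
for my $key (sort keys %seen) {
    print "$key seen $seen{$key} time(s)\n";
}
```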

