Re: To Hash or to Array--Uniqueness is the question.

by Ryszard (Priest)
on Dec 02, 2005 at 08:28 UTC (#513532)


in reply to To Hash or to Array--Uniqueness is the question.

Warning, untested code:
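    my %stathash;
    while (<FH>) {
        chomp;              # strip the newline so "foo\n" and a final "foo" share one key
        $stathash{$_}++;    # count how often each line occurs
    }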
This has the extra advantage of counting how many times each unique value occurs.

You can then do some groovy stuff, like pulling out records that occur n times, records that appear in one set and not the other, or records that appear in both sets (the last two need two hashes, one per dataset).
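A rough sketch of those three lookups, assuming the records are already in memory. The arrays @set_a and @set_b and their contents are made-up stand-ins for two datasets, not anything from the original question:

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for two record sets.
my @set_a = qw(alice bob bob carol carol carol);
my @set_b = qw(bob dave carol);

# Count occurrences of each record, one hash per dataset.
my (%count_a, %count_b);
$count_a{$_}++ for @set_a;
$count_b{$_}++ for @set_b;

# Records that occur exactly $n times in the first set.
my $n = 2;
my @n_times = grep { $count_a{$_} == $n } sort keys %count_a;

# Records that appear in the first set but not the second.
my @only_a = grep { !exists $count_b{$_} } sort keys %count_a;

# Records that appear in both sets.
my @both = grep { exists $count_b{$_} } sort keys %count_a;

print "occurs $n times: @n_times\n";
print "only in A: @only_a\n";
print "in both: @both\n";
```

The sort is only there to make the output order predictable; drop it if you don't care, since hash key order is otherwise arbitrary.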

I regularly do this with sets of about 500k records to determine where my data integrity issues lie, and it's pretty damn fast.
