PerlMonks  

printing hash values, (don't need the keys)

by punklrokk (Beadle)
on Jul 06, 2006 at 20:25 UTC (#559676, perlquestion)
punklrokk has asked for the wisdom of the Perl Monks concerning the following question:

Hello

I have the following code:

$data_file="ActiveItems2.txt";
open(DAT, $data_file) || die("Could not open file!");
@raw_data=<DAT>;
close(DAT);

my %uniques;
foreach my $key(@raw_data){
    $uniques{$key}++;
}

foreach $line(%uniques){
    print "$line \n";
}
and the following data:

-331816 -331816 .25 X 10.25 X 17 .25 X 10.25 X 17 .250 ROLL #3 CUT .250 ROLL #3 CUT .250 SOLID ROD .250 SOLID ROD .250 SOLID ROD .375 ROLL #3 CUT .375 ROLL #3 CUT .375 X .049 ROD .375 X .049 ROD .437 DIA 1018 CRS .437 DIA 1018 CRS .437 DIA 1018 CRS .437 DIA 1018 CRS .437 DIA 1018 CRS .437 DIA 1018 CRS .437 DIA 1018 CRS .437 DIA 1018 CRS .437 DIA 1018 CRS .437 DIA 1018 CRS .437 DIA 1018 CRS .437 DIA 1018 CRS .437 DIA 1018 CRS .437 DIA 1018 CRS .437 DIA 1018 CRS .437 DIA 1018 CRS
My real problem is a 200,000+ line file from which I need to remove the duplicates and then print what is left. I can't figure out how to output only the original line, without a count of how many there were. Right now I get:
C:\scripts>jr2.pl
.250 ROLL #3 CUT
2
.250 SOLID ROD
3
-331816
2
.25 X 10.25 X 17
2
.375 ROLL #3 CUT
2
.437 DIA 1018 CRS
15
.437 DIA 1018 CRS
1
.375 X .049 ROD
2
I need to print just, say, ".375 X .049 ROD" instead of "1 .375 X .049 ROD".
Can any of the monks help me please? Your time is much appreciated!!

JP Bourget (punklrokk) MS Information and Security Rochester Institute of Technology Rochester, NY

Re: printing hash values, (don't need the keys)
by ww (Bishop) on Jul 06, 2006 at 20:34 UTC
    Very little time needed for this help: search, supersearch and the various perl docs and tuts.

    IIRC, this has been covered here at least half a dozen times in the past 6 months.
Re: printing hash values, (don't need the keys)
by philcrow (Priest) on Jul 06, 2006 at 20:40 UTC
    I think you do want the keys. Try the keys builtin:
    foreach $line( keys %uniques ) { print "$line\n"; }
    You might also want to add sort in front of keys.

    Phil
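A minimal sketch of why the original loop printed counts, assuming a small in-memory list in place of the data file (the `@lines` sample and the `@flattened` and `@unique_lines` names are illustrative, not from the thread). A hash used in list context flattens to alternating keys and values, so `foreach $line (%uniques)` visits the counts as well; `keys` returns only the keys.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for lines read from the file.
my @lines = ('.250 SOLID ROD', '.250 SOLID ROD', '-331816');

my %uniques;
$uniques{$_}++ for @lines;

# A hash in list context flattens to key, value, key, value, ...
# Here that is four elements: two keys and their two counts.
my @flattened = %uniques;

# keys returns only the keys; sort gives a repeatable order,
# since hash order is otherwise unspecified.
my @unique_lines = sort keys %uniques;

print "$_\n" for @unique_lines;
```

The counts are still in `%uniques` if they are ever wanted; the loop simply no longer walks over them.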

      Sort keys worked!!!

      Thanks so much!!


Re: printing hash values, (don't need the keys)
by planetscape (Canon) on Jul 06, 2006 at 20:53 UTC
Re: printing hash values, (don't need the keys)
by trammell (Priest) on Jul 06, 2006 at 22:07 UTC
    One problem with your code is that you're reading the entire file into memory when you don't need to. Here's an alternative that doesn't (untested):
    my $data_file = "ActiveItems2.txt";
    open(DAT, $data_file) || die("Could not open file: $!");
    my %seen;
    while (<DAT>) {
        chomp;
        $seen{$_}++;
    }
    close(DAT);
    for (sort keys %seen){
        print "$_\n";
    }
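The `chomp` above may matter for more than tidiness. The original output lists ".437 DIA 1018 CRS" twice, with counts 15 and 1, which is consistent with the final line of the file lacking a trailing newline (an assumption; the thread does not confirm it). A rough sketch of that effect, using a hypothetical three-line string opened as an in-memory filehandle so it runs without the data file:

```perl
use strict;
use warnings;

# Hypothetical input: the last line has no trailing newline.
my $input = ".437 DIA 1018 CRS\n.437 DIA 1018 CRS\n.437 DIA 1018 CRS";

# Open the string as an in-memory filehandle (illustrative only).
open(my $fh, '<', \$input) or die "Could not open in-memory file: $!";

my (%raw, %chomped);
while (my $line = <$fh>) {
    $raw{$line}++;        # key keeps its "\n", or has none on the last line
    chomp $line;
    $chomped{$line}++;    # key is the text alone
}
close($fh);

printf "without chomp: %d distinct keys\n", scalar keys %raw;
printf "with chomp:    %d distinct keys\n", scalar keys %chomped;
```

Without `chomp`, the newline-terminated lines and the unterminated last line are different strings, so they become separate hash keys.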

      If you omit the chomp then you can:

      my $data_file = "ActiveItems2.txt";
      open(DAT, $data_file) || die("Could not open file: $!");
      my %seen;
      $seen{$_}++ while <DAT>;
      print for sort keys %seen;
      close(DAT);

      Another interesting variant (although it may suffer the slurp problem) is to use a hash slice:

      @seen{<DAT>} = (); # Hash slice assignment
      print for sort keys %seen;

      It's worth seeing the hash slice a few times to remember that it is there and what the syntax is. Like many things in Perl, occasionally it is exactly what you need to achieve a clean solution to a problem. Perhaps not in this case though. :)
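For readers who have not met the syntax, here is a small sketch of the hash slice on an in-memory list (the `@lines` sample and `@unique_lines` name are illustrative assumptions; the thread's version slices on `<DAT>` directly):

```perl
use strict;
use warnings;

# Hypothetical in-memory list standing in for the lines of the data file.
my @lines = ('.375 X .049 ROD', '.250 SOLID ROD', '.375 X .049 ROD');

my %seen;
# Hash slice: @seen{LIST} addresses several keys at once.
# Assigning the empty list creates each key with an undef value;
# duplicates collapse because a hash holds each key only once.
@seen{@lines} = ();

my @unique_lines = sort keys %seen;
print "$_\n" for @unique_lines;
```

Unlike the `$seen{$_}++` loop, the slice stores no counts, so it only suits cases where the number of occurrences is not needed.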


      DWIM is Perl's answer to Gödel
Re: printing hash values, (don't need the keys)
by clinton (Priest) on Jul 06, 2006 at 22:41 UTC
    Not Perl I know, but surely this would be easier:

    sort -u filename > sorted_filename
