As a programmer and teacher of the Perl programming language, I often get questions that catch me off guard. In one of the last classes I gave, while I was talking about hashes, someone asked me "What is it used for? When would I ever need that?" Of course, for me (and you too, probably) hashes are quite practical, but being asked that on the spot, I didn't know what to say, so I talked about the %ENV hash and made an example with it.
Today I found an interesting use for hashes. I wish I had thought of it during my class, but I didn't, so I would like to share it with you for the benefit of newer Perl programmers.
Imagine you have to read a Space Separated Value file or Comma Separated Value (CSV) file. It's easy because the fields are always in the same order. For example:
    # firstname lastname age
    joe builder 9
    bob plumber 66
    dora squarepants 10
    diego simpson 11
You can do this:
    open( my $fh, '<', 'file' ) or die "Error : $!";
    my @lines = <$fh>;
    close( $fh );

    foreach my $line ( @lines ) {
        # Skip the line if it is empty or a comment
        next if ( $line =~ /^\s*$/ );
        next if ( $line =~ /^\s*#/ );

        my ($firstname, $lastname, $age) = split( /\s+/, $line );
        # then do whatever you have to
    }

But then someday someone gives you a new file with the fields in a different order, plus extra fields you don't need. Here is the new file:
    # lastname firstname age gender phone
    mcgee bobby 27 M 555-555-5555
    kincaid marl 67 M 555-666-6666
    hofhazards duke 22 M 555-696-6969

What do you do? Do you change your code with an if statement? Do you alter the file to change the order of the fields and remove the extra fields? No! You use hashes!
Here is the solution:
    open( my $fh, '<', 'file' ) or die "Error : $!";
    my @lines = <$fh>;
    close( $fh );

    # The first line is the header: it names the fields in order
    my @keys = split( /\s+/, $lines[0] );
    shift( @keys ); # to remove the # as the first field

    foreach my $line ( @lines ) {
        # Skip the line if it is empty or a comment
        next if ( $line =~ /^\s*$/ );
        next if ( $line =~ /^\s*#/ );

        my %hash;
        @hash{ @keys } = split( /\s+/, $line );
        # then do whatever you have to
    }
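To try this without creating a file, here is a minimal sketch that runs the same logic on two in-memory samples with different field orders. The helper name `parse_records` and the sample strings are mine, made up for illustration. I also use `split ' '` here, which additionally ignores leading whitespace.

```perl
use strict;
use warnings;

# parse_records is a hypothetical helper (not part of the code above).
# It takes the file contents as one string and returns a list of hash refs.
sub parse_records {
    my ($text) = @_;
    my @lines = split /\n/, $text;

    # The first line is the header, e.g. "# lastname firstname age"
    my @keys = split ' ', $lines[0];
    shift @keys;    # drop the leading '#'

    my @records;
    foreach my $line (@lines) {
        next if $line =~ /^\s*$/;    # skip empty lines
        next if $line =~ /^\s*#/;    # skip comments, including the header
        my %hash;
        @hash{@keys} = split ' ', $line;    # hash slice: keys paired with fields
        push @records, \%hash;
    }
    return @records;
}

# Two made-up samples with different field orders
my $old_file = "# firstname lastname age\njoe builder 9\n";
my $new_file = "# lastname firstname age gender phone\nmcgee bobby 27 M 555-555-5555\n";

# The same code handles both layouts
foreach my $record ( parse_records($old_file), parse_records($new_file) ) {
    print "$record->{firstname} $record->{lastname} is $record->{age}\n";
}
```

Both samples go through the same loop, and the code that uses the records only ever refers to fields by name.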
Note that the first line in the file is important: it gives you the order of the fields. Even if it's not there when you receive the file, you can easily add it.
Note the @hash{ } syntax. This is called a hash slice. The @ sigil makes the expression a list, which lets you read or assign several elements of the hash at once. The @keys array contains the keys in the same order as they are written at the top of the file. Therefore, doing @hash{ @keys } is like doing @hash{ qw(lastname firstname age gender phone) } or @hash{ 'lastname', 'firstname', 'age', 'gender', 'phone' }, except that it keeps working when the fields in a file are not in the same order as in the previous file.
The split of the line returns a list so doing this:
@hash{ @keys } = split( /\s+/, $line );
is the same as this:
@hash{'lastname', 'firstname', 'age', 'gender', 'phone' } = split( /\s+/, $line );
or this:
($hash{'lastname'}, $hash{'firstname'}, $hash{'age'}, $hash{'gender'}, $hash{'phone'}) = split( /\s+/, $line );
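If you want to convince yourself that the three forms really do the same thing, here is a small sketch that builds one hash per form and compares them. The variable names and the `same_hash` helper are made up for this illustration.

```perl
use strict;
use warnings;

my @keys = qw(lastname firstname age gender phone);
my $line = "mcgee bobby 27 M 555-555-5555\n";

# Form 1: slice with the keys held in an array
my %by_array;
@by_array{@keys} = split( /\s+/, $line );

# Form 2: slice with the keys written out as a literal list
my %by_list;
@by_list{ 'lastname', 'firstname', 'age', 'gender', 'phone' } = split( /\s+/, $line );

# Form 3: one scalar element per field
my %by_element;
( $by_element{'lastname'}, $by_element{'firstname'}, $by_element{'age'},
  $by_element{'gender'}, $by_element{'phone'} ) = split( /\s+/, $line );

# same_hash is a hypothetical helper: true if both hashes hold
# exactly the same keys with the same string values.
sub same_hash {
    my ( $x, $y ) = @_;
    return 0 unless keys %$x == keys %$y;
    foreach my $k ( keys %$x ) {
        return 0 unless exists $y->{$k} && $x->{$k} eq $y->{$k};
    }
    return 1;
}

my $ok = same_hash( \%by_array, \%by_list ) && same_hash( \%by_list, \%by_element );
my $verdict = $ok ? 'identical' : 'different';
print "The three hashes are $verdict\n";
```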
Also, if the file contains fields you don't need, you can simply ignore them. As long as all the required fields are there, your code will keep working.
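"As long as all the required fields are there" is something you can check up front. Here is a sketch of one way to do it, assuming you know which field names your code needs. The `missing_fields` helper and the `@required` list are hypothetical, not part of the code above.

```perl
use strict;
use warnings;

# missing_fields is a hypothetical helper: it returns the required
# field names that do not appear in the header keys.
sub missing_fields {
    my ( $required, $keys ) = @_;
    my %present;
    @present{@$keys} = ();    # hash slice used as a set of header names
    return grep { !exists $present{$_} } @$required;
}

my @required = qw(firstname lastname age);                  # what the code needs
my @keys     = qw(lastname firstname age gender phone);     # what the header says

my @missing = missing_fields( \@required, \@keys );
die "Missing required fields: @missing\n" if @missing;

# Parse one line, then keep only the required fields with a second slice
my %hash;
@hash{@keys} = split ' ', "mcgee bobby 27 M 555-555-5555\n";

my %wanted;
@wanted{@required} = @hash{@required};

print join( ',', map { "$_=$wanted{$_}" } @required ), "\n";
# prints: firstname=bobby,lastname=mcgee,age=27
```

The second slice, `@hash{@required}`, reads several values at once, so slices work on both sides of the assignment.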
I hope this will be useful for you someday! Good luck!
Re: Cool way to parse Space Separated Value and CSV files
by Anonymous Monk on Apr 10, 2013 at 07:18 UTC
by greengaroo (Hermit) on Apr 10, 2013 at 13:12 UTC
Re: Cool way to parse Space Separated Value and CSV files
by johngg (Canon) on Apr 12, 2013 at 22:53 UTC
by Anonymous Monk on May 21, 2018 at 07:23 UTC
by Corion (Patriarch) on May 21, 2018 at 07:35 UTC