qball has asked for the wisdom of the Perl Monks concerning the following question:

Once again I'm faced with using a hash, and if you've chatted with me, you know that I'm still a newbie at the hash concept. After doing some research in both books and perlmonks, I still can't find a good way of doing the following:

First, take a look at my code.
open (DATA, "<.txt") or die "Can't open file $!\n";
@DATA = <DATA>;
close (DATA);
foreach $rec (@DATA) {
    chomp $rec;
    my ($var1, $var2, $var3, $var4) = split(/,/,$rec);
    if ($var1) {
        $var1 =~ s/^\s+$//g;
        $var2 =~ s/^\s+$//g;
        $var3 =~ s/^\s+$//g;
        $var4 =~ s/^\s+$//g;
        print "$var1 $var2 $var3 $var4\n";
    }
}
I want to use $var3 as the unique key in a new hash. I've taken out some of the garbage code that I used when attempting to do this on my own. So how would I do this?

qball~"I have node idea?!"

Replies are listed 'Best First'.
Re: Hash Variable
by OeufMayo (Curate) on Apr 06, 2001 at 00:15 UTC

    Hey qball, Where in your example are you trying to use a hash?
    Now here are some tips you might want to follow:

    • Use the -w switch and use strict.
    • You probably don't want to put your file in an array. There's no real use for it here, and it can be a performance issue if your file is several hundred MB large. Use the while (<DATA>){} construction instead. The line will be stored in the scalar $_.
    • Use the map function when you have to repeat the same operation several times on array elements and get another array back (or use a foreach loop if you don't want a new array).
    • Your loop and regex seem to be stripping the commas and whitespace on the current line. There are more efficient ways to do it (ex: s/(?:\s|,)//g, and/or use a join to print the array back without commas).
    • You can save some keystrokes on your split by assigning it directly to an array, and later access the array elements with $array[n], where n is the array index (starting from 0).
    • To assign a value to a key in a hash, it's really easy: just say $hash{$key}=$value. $value can be just anything, but it has to be something (even undef or '').

    By combining this advice with your current code, you'll probably get the results you want. (I hope so!)
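A minimal sketch of how the tips above might fit together. The sample records in `$text` and the in-memory filehandle are assumptions made so the example runs on its own; in the real script you would open your data file instead.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the real file.
my $text = "alice , 30 , k1 , red\nbob, 25 ,k2, blue \n";

# Open an in-memory filehandle so the sketch is self-contained.
open(my $fh, '<', \$text) or die "Can't open in-memory data: $!\n";

my %hash;
while (my $line = <$fh>) {
    chomp $line;
    # Split straight into an array, then trim each field in place.
    my @fields = split /,/, $line;
    s/^\s+|\s+$//g for @fields;
    next unless @fields >= 4;
    # The third field (index 2) is the key; the other fields are the value.
    $hash{ $fields[2] } = [ @fields[0, 1, 3] ];
}
close $fh;

print "$_: @{ $hash{$_} }\n" for sort keys %hash;
```

Storing an array reference as the value keeps the other three fields separate, so they can be printed in any order later.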

    --
    my $OeufMayo = new PerlMonger::Paris({http => 'paris.mongueurs.net'});
Re: Hash Variable
by jeroenes (Priest) on Apr 06, 2001 at 00:16 UTC
    Try to avoid repeating code. It's a bad coding habit in general. Bad for Debugging, Modularity and Maintainability. Hashes are nice, if you get the hang of it.
    my %hash;
    while( <DATA> ){
        chomp;
        # return $_ from the block: s/// on its own yields its success flag, not the string
        my @vars = map { s/^\s+|\s+$//g; $_ } split /,/;
        $hash{ $vars[2] } = [ @vars[ qw/0 1 3/ ] ] if $vars[0];
    }
    The code is more dense, but therefore more fun. Look at perldata, map, perlop and perlref for docs.

    Hope this helps,

    Jeroen
    "We are not alone"(FZ)
    Update: mr.nick /msg'ed me a warning: What if $vars[0] equals '0'? If you still want it to be valid, add or $vars[0] =~ m/0/ to the if statement.
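To illustrate the '0' problem from the update: a plain truth test treats the string '0' as false, while a length test keeps it. This is only a sketch; `keep_field` is an illustrative helper name and the sample values are made up.

```perl
use strict;
use warnings;

# Hypothetical first-field values to test.
my @samples = ('abc', '0', '', '0.0');

# keep_field is an illustrative helper, not something from the thread.
# It accepts any defined, non-empty string, including '0'.
sub keep_field {
    my ($value) = @_;
    return (defined($value) && length($value)) ? 1 : 0;
}

for my $value (@samples) {
    printf "%-6s truth=%d keep=%d\n",
        "'$value'", ($value ? 1 : 0), keep_field($value);
}
```

Only '0' and the empty string are false among defined strings, which is why the extra condition in the update is enough.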

Re: Hash Variable
by ton (Friar) on Apr 06, 2001 at 00:13 UTC
    And while we're at it, there is no need to read the entire file into memory if you're just going to iterate over the lines anyway. Read it in line by line:
    open (DATA, "<.txt") or die "Can't open file $!\n";
    while ($rec = <DATA>) {
        # Loop stuff here
    }
    close(DATA);
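The same line-by-line loop can also be written with a lexical filehandle and the three-argument form of open, which works under use strict. This is a sketch only: the in-memory string `$text` is an assumption that stands in for the real file.

```perl
use strict;
use warnings;

# Hypothetical data; an in-memory filehandle stands in for the real file.
my $text = "a,b,c,d\ne,f,g,h\n";
open(my $fh, '<', \$text) or die "Can't open data: $!\n";

my $count = 0;
while (my $rec = <$fh>) {
    chomp $rec;
    $count++;    # loop stuff goes here
}
close($fh);

print "read $count records\n";
```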
(jptxs) Re: Hash Variable
by jptxs (Curate) on Apr 06, 2001 at 05:36 UTC
    there are many good examples of code here. I will not presume to put another. What I will do is try and explain what a hash is a bit more in detail because I think (like I once was) you are really struggling with that. Before I get going, may I suggest a book that really helped me a year ago: Elements of Programming with Perl. The link is to my review of it. It's a great hands on way to get to know perl end to end.

    in perl programs there are three ways to store the data you want to throw around: scalars, arrays and hashes. scalars, as you know by now, are the simplest. you put something in and you take it out in the same form. basically it just holds what you put in there. an array will take a whole bunch of things and order them for you. you give it a list and it will hold all the elements and give you whichever you want when you ask for it. if you want the 3rd item you placed in, you simply ask for the item which is associated with the index [2] (because the first is always 0!), and you will get it.

    a hash is something you can think of as an array with a few different features. in fact, another name for a hash is an associative array. basically it is an unordered array which associates its elements with names instead of numbers. above we said we ask for elements from an array with their associated numbers or indexes. with a hash, you would ask for things by their name, called a "key". i also said that a hash is unordered. this means that unlike an array it does not keep things in the order in which you put them in. perl decides what order to keep things in according to black magic in the guts which optimizes the storage and retrieval of the data in the hash. let's look at some simple stuff to illustrate this.

    first of all, proving how like an array a hash is, you can form one just like you would an array:

    @array = ( 'one', 'monkey', 'two', 'donkey', 'three', 'funky' ); # makes an array
    %hash = ( 'one', 'monkey', 'two', 'donkey', 'three', 'funky' );  # makes a hash
    # where each of one, two and three is a key
    # for the data to their right, but better written as
    %hash = ( one => 'monkey', two => 'donkey', three => 'funky' );
    # in both, one, two, three are keys and the rest the values.
    # the main difference is that the second is clearer for a
    # human. => is just another comma for all perl cares, except
    # that it also quotes a bare word on its left.
    print $array[1]; # will print monkey
    print $hash{one}; # also prints monkey
    # notice both have stored simple scalars
    # so when we get the info for the print
    # we ask for a scalar.

    what you store in a hash doesn't have to be a simple scalar. you can store whole arrays, through references (another one I struggled with), and even other hashes.
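A small sketch of storing arrays and other hashes inside a hash through references. The `%zoo` hash and its contents are made up for illustration.

```perl
use strict;
use warnings;

my %zoo = (
    primates => [ 'monkey', 'gibbon' ],              # value is an array reference
    donkey   => { legs => 4, sound => 'hee-haw' },   # value is a hash reference
);

print $zoo{primates}[1], "\n";            # second element of the stored array
print $zoo{donkey}{sound}, "\n";          # look up a key in the nested hash
print scalar @{ $zoo{primates} }, "\n";   # number of elements in the stored array
```

The square brackets and curly braces on the right-hand side build anonymous arrays and hashes and hand back references, which are themselves scalars and so fit into a hash value.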

    I hope you find this useful. I also hope you don't see it as me talking down to you in any way. I struggled with this really hard a year ago. with no math or comp sci background all the references I found were leaving me spinning. i just hope to help you avoid that =)

    "A man's maturity -- consists in having found again the seriousness one had as a child, at play." --Nietzsche
      One of the most useful offhand comments that clarified hashes when I was learning Perl is the following gem from the Camel.

      A hash lookup is linguistically "of". So for instance you would write:

      $wife{Adam} = "Eve";
      That reads, "The wife of Adam is Eve."

      So when you sit down to do a problem, describe it to yourself. Whenever you start saying phrases like, "of", "lookup", "check if we have seen this before", and so on, that is generally an excellent sign that a hash fits, and the above tip is frequently a good guide to naming the hash as well!
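As a sketch of the "check if we have seen this before" case, here is the common %seen idiom. The word list is made up for the example.

```perl
use strict;
use warnings;

my @words = qw(eve adam eve cain adam);   # hypothetical input

# %seen answers "the number of times we have seen WORD".
my %seen;
my @unique = grep { !$seen{$_}++ } @words;   # keep a word only the first time

print "@unique\n";
print "count of eve: $seen{eve}\n";
```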

Re: Hash Variable
by suaveant (Parson) on Apr 06, 2001 at 00:08 UTC
    A hash storing what? You should pretty much just be able to do...
    $hash{$var3} = whatever;

                    - Ant
Re: Hash Variable
by mr.nick (Chaplain) on Apr 06, 2001 at 00:20 UTC
    Here's my poke at it with some corrections...
    my %hash;
    open DATA,"<filename.txt" or die "Couldn't open $!";
    ## we'll process each line individually instead of
    ## slurping it into an array. That way any size file
    ## will be usable
    while (<DATA>) {
        chomp;
        my @vars=split /,/,$_;
        ## ignore it if no first value
        next unless defined $vars[0];
        ## I will also assume your regex were to remove all
        ## leading and trailing spaces. If that is true,
        ## you did it incorrectly
        map { s/^\s*//; s/\s*$// } @vars;
        ## and store it based on the 3rd var
        push @{$hash{$vars[2]}},@vars[0,1,3];
    }
    close DATA;
    Then to access your data, simply use it like this:
    $hash{somevalue}[0] = the first value (your $var1)
    $hash{somevalue}[1] = the second value (your $var2)
    $hash{somevalue}[2] = the third value (your $var4)

    Untested again, but it should work
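One thing worth knowing about the push approach: if the same key turns up more than once, the new values are appended to the existing list rather than replacing it. A small sketch with made-up records, two of which share the key 'k1':

```perl
use strict;
use warnings;

my %hash;
# Hypothetical records; the third field is the key.
for my $rec ('a,b,k1,c', 'd,e,k2,f', 'g,h,k1,i') {
    my @vars = split /,/, $rec;
    push @{ $hash{ $vars[2] } }, @vars[0, 1, 3];
}

print "k1 holds: @{ $hash{k1} }\n";   # values from both k1 records
print "k2 holds: @{ $hash{k2} }\n";
```

If the key really is unique, a plain assignment of an array reference gives the same result with exactly three elements per key.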

Re: Hash Variable
by telesto (Novice) on Apr 06, 2001 at 00:25 UTC
    I know this isn't what you asked, but I'm not following the logic of the regexes. Are you trying to remove whitespace before and after the input? If so, make it part of what you split on:

    my @vars = split(/\s*,\s*/,$rec);
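A quick sketch of what that split does and does not remove. The record is made up; note that whitespace at the very start and end of the line is not next to a comma, so it survives unless the record is trimmed first.

```perl
use strict;
use warnings;

my $rec = "  one , two,three  ,  four  ";   # hypothetical record

my @vars = split /\s*,\s*/, $rec;
# Whitespace around the commas is gone, but the outer edges remain.
print join('|', @vars), "\n";

# Trimming the record first removes the outer whitespace as well.
(my $trimmed = $rec) =~ s/^\s+|\s+$//g;
my @clean = split /\s*,\s*/, $trimmed;
print join('|', @clean), "\n";
```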

    Also, if you are only interested in $var3 for the hash, why work on the rest? Do something like:

    open (DATA, "<data.txt") or die "Can't open file $!\n";
    @DATA = <DATA>;
    close (DATA);
    foreach $rec (@DATA) {
        chomp $rec;
        my @vars = split(/\s*,\s*/,$rec);
        # debug - you might want to join with '|' to see
        # whitespace in any vars
        print "VARS = " . join('|', @vars). "\n";
        # now insert into the hash
        $var_hash{$vars[2]} = 'whatever';
    }
    If you could relay more info on the format of the input and how you want to make use of the hash, you might get more (better) suggestions.

    best of luck

    --telesto

      I want to print out each variable--$var1 through $var4--using $var3 as the primary key. And I want the data to be printed on each row according to my specifications. For example, if I want to:
      print "Row Data: $var1, $var2, $var4, $var3\n";
      -or-
      print "Row Data: $var3, $var1, $var2\n";
      I do want to trim whitespace before and after the input.

      Many thanks to all your input.

      qball~"I have node idea?!"
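One possible way to get that kind of output is a hash of hashes keyed on the third field, so each field can be printed by name in whatever order is needed. This is only a sketch; the records and the field names var1, var2 and var4 used as inner keys are assumptions for illustration.

```perl
use strict;
use warnings;

my %rows;
# Hypothetical records; the third field is the unique key.
for my $rec ('a1, b1, key1, d1', 'a2, b2, key2, d2') {
    my ($var1, $var2, $var3, $var4) = split /\s*,\s*/, $rec;
    $rows{$var3} = { var1 => $var1, var2 => $var2, var4 => $var4 };
}

for my $key (sort keys %rows) {
    my $r = $rows{$key};
    print "Row Data: $r->{var1}, $r->{var2}, $r->{var4}, $key\n";
    print "Row Data: $key, $r->{var1}, $r->{var2}\n";
}
```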
Re: Hash Variable
by FrankG (Scribe) on Apr 06, 2001 at 00:30 UTC
    Here ye go:
    #!/usr/bin/perl
    use strict;
    my %hash;
    open (DATA, "<.txt") or die "Can't open file $!\n";
    my @DATA = <DATA>;
    close (DATA);
    foreach my $rec (@DATA) {
        chomp $rec;
        my @vars = split(/,/,$rec);
        if ($vars[0]) {
            # you can use map to use the regex
            # on each element of the array
            map $_ =~ s/^\s+$//g, @vars;
            print "$vars[0] $vars[1] $vars[2] $vars[3]\n";
            if( $hash{$vars[2]} ) {
                warn "key '$vars[2]' already exists in hash\n";
            }
            else {
                $hash{$vars[2]} = join(' ', @vars[0,1,3]); # or whatever you wanted
            }                                              # your hash to be built as
        }
    }
    I threw the if( $hash{$vars[2]} ) in there in case you don't really know if the 3rd element is uniq throughout.
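A side note on that duplicate check, as a sketch: testing the stored value for truth misses keys whose value is '0' or empty, whereas exists asks only whether the key is present. The key/value pairs below are made up.

```perl
use strict;
use warnings;

my %hash;
my @duplicates;
# Hypothetical (key, value) pairs; 'k1' appears twice and its first value is '0'.
for my $pair ([ k1 => '0' ], [ k2 => 'x' ], [ k1 => 'y' ]) {
    my ($key, $value) = @$pair;
    if (exists $hash{$key}) {    # true even when the stored value is '0' or ''
        push @duplicates, $key;
        next;
    }
    $hash{$key} = $value;
}

print "duplicates: @duplicates\n";
print "k1 is still: $hash{k1}\n";
```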

    I'm assuming that with the s/^\s+$//g your intent is to truncate all records that are all spaces. Is this correct?

    - FrankG

      I see where you're creating hash:
      $hash{$vars[2]} = join(' ', @vars[0,1,3]);
      How would I print that out?
      foreach $var2 (keys %hash){
          print "value: $hash{$var2}\n";
      }
      ...doesn't work too well.

      qball~"I have node idea?!"
        What do you mean "doesn't work too well"? That should work.
        Is an error occurring, is there no output, or is it not the way you want it to look?

        - FrankG
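When output from a hash looks wrong, it can help to print the key next to the value, wrap both in brackets so stray whitespace shows up, and sort the keys so the order is stable between runs. A sketch with a made-up hash built with join, in the same style as the code in question:

```perl
use strict;
use warnings;

# Hypothetical hash whose values were built with join(' ', ...).
my %hash = (
    k2 => join(' ', 'd', 'e', 'f'),
    k1 => join(' ', 'a', 'b', 'c'),
);

# keys returns the keys in no particular order, so sort them for stable output.
my @lines = map { "key: [$_] value: [$hash{$_}]" } sort keys %hash;
print "$_\n" for @lines;
```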

      Okay, I previously made a comment that FrankG's code didn't work. I decided to dig further and break the code down piece by piece. After a few hours of working with FrankG's code, I learned that my data.txt file wasn't a true CSV file. Some of the fields were combined and were fixed widths.

      Excel came in handy as I imported the data.txt file and created data.csv. I then ran FrankG's code and voila!

      Now I'm going back to break the code down again, this time using a real CSV file.

      I'll keep you all posted on my progress. Many thanks to all of your comments. And please comment further if need be.
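For fixed-width fields like the ones described here, Perl's built-in unpack can split a record by column widths without a detour through a spreadsheet. This is a sketch only: the record and the widths 6, 4, 5 and 5 are assumptions for illustration, not the real layout of data.txt.

```perl
use strict;
use warnings;

# Hypothetical fixed-width record: 6, 4, 5 and 5 characters per field.
my $rec = 'alice 30  k1   red  ';

# 'A' fields are space-padded text; unpack strips their trailing spaces.
my @vars = unpack 'A6 A4 A5 A5', $rec;

print join('|', @vars), "\n";
```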

      qball~"I have node idea?!"
Re: Hash Variable
by Anonymous Monk on Apr 06, 2001 at 00:08 UTC
    You would use $some_hash{$var3} = $some_value;