PerlMonks  

Creating CSV term document matrix from a hash stored in multideminsional array

by lobs (Acolyte)
on Mar 03, 2017 at 19:06 UTC ( [id://1183589]=perlquestion )

lobs has asked for the wisdom of the Perl Monks concerning the following question:

PROBLEM AT HAND: every row in my CSV file has the same value. I am trying to implement a term document matrix and store it in a CSV file. In my program I store the features and their values in a hash for each document. Each hash is stored in an array. That array is then stored within another array to separate the document classes. I build the hash by going through an array and adding the terms to the hash as follows (hash %termsFreq is specific to the document and hash %docTerms holds the features of all documents):
while ($element = shift(@numOfWordArr)) {
    $termsFreq{$element}++;
    if (!exists($docTerms{$element})) {
        $docTerms{$element}++;
    }
}
Then I add the hash to an array of documents as follows:
push(@docArray, \%termsFreq);
Finally, I push that array onto another array to separate the classes of the documents:
push(@classArr, \@docArray);
I took the advice from another thread and now iterate through the arrays to get each hash and print the document matrix to a CSV file:
foreach my $subArr_ref (@classArr) {
    print "subArr_ref: ".@{$subArr_ref}."\n";
    foreach my $hashRef (@{$subArr_ref}) {
        print "hashRef: ".$hashRef."\n";
        foreach my $key (sort keys %{$hashRef}) {
            #print $csv $key.":".$hashRef->{$key}.",";
            print $csv "$key : ${$hashRef}{$key},";
        }
        # foreach my $feat(@featureVector) {
        #     print $csv $hashRef->{$feat}.",";
        # }
        print $csv $i."\n";
    }
}
My problem is that I get rows of the same value since it is accessing the same hash per iteration. Help is much appreciated. Thanks!
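
One possible cause of the identical rows, assuming %termsFreq is declared once outside the per-document loop (the question does not show where it is declared): every push(@docArray, \%termsFreq) then stores a reference to the same hash, so all entries show whatever that hash holds at the end. The sketch below illustrates the effect and one way around it. The sample documents and the helper count_terms are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample data: each inner array is one tokenized document.
my @docs = ( [qw(apple banana apple)], [qw(cherry banana)] );

# Problem pattern: one shared hash, so every stored reference points at it.
my (%shared, @aliased);
for my $doc (@docs) {
    $shared{$_}++ for @$doc;
    push @aliased, \%shared;      # the same reference is pushed each time
}

# One fix: build a fresh lexical hash for every document.
sub count_terms {
    my ($words) = @_;
    my %termsFreq;                # a new hash on every call
    $termsFreq{$_}++ for @$words;
    return \%termsFreq;
}

my @docArray = map { count_terms($_) } @docs;

# References compare equal numerically when they point at the same hash.
print "aliased refs identical: ", ($aliased[0] == $aliased[1] ? "yes" : "no"), "\n";
print "fresh refs identical:   ", ($docArray[0] == $docArray[1] ? "yes" : "no"), "\n";
```

Pushing a shallow copy with push(@docArray, { %termsFreq }) and then clearing %termsFreq before the next document would be another way to get independent hashes.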

Replies are listed 'Best First'.
Re: Creating CSV term document matrix from a hash stored in multideminsional array
by stevieb (Canon) on Mar 03, 2017 at 19:13 UTC

    This:

    for my $i($#classArr) {

    Does not increment $i like you think it does. It loops only once, because $#classArr is a single number: the index of the last element in the array.

    You want something more like:

    for my $i (0..$#classArr){

    Which will put the current element index into $i on each iteration, starting from 0.
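
To make the difference concrete, here is a small sketch that runs both loop forms over a hypothetical three-element @classArr (the contents are placeholders) and records which values $i takes.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the poster's array of classes.
my @classArr = ('classA', 'classB', 'classC');

# A one-element list: the body runs once, with $i set to the last index.
my @single;
for my $i ($#classArr) {
    push @single, $i;
}

# A range: the body runs once per index, from 0 through the last index.
my @range;
for my $i (0 .. $#classArr) {
    push @range, $i;
}

print "single: @single\n";   # single: 2
print "range:  @range\n";    # range:  0 1 2
```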

    update: Or, more idiomatically (untested):

    for my $class (@classArr){
        for my $href (@$class){
            for my $key (sort keys %$href){
                print $csv "$key-- $href->{$key},";
            }
        }
    }
      Yeah, I changed it recently and it's now:
      foreach my $subArr_ref (@classArr) {
          print "subArr_ref: ".@{$subArr_ref}."\n";
          foreach my $hashRef (@{$subArr_ref}) {
              print "hashRef: ".$hashRef."\n";
              foreach my $key (sort keys %{$hashRef}) {
                  #print $csv $key.":".$hashRef->{$key}.",";
                  print $csv "$key : ${$hashRef}{$key},";
              }
              # foreach my $feat(@featureVector) {
              #     print $csv $hashRef->{$feat}.",";
              # }
              print $csv $i."\n";
          }
      }

        Did that work?

        Note I updated my original reply before you posted this update, but it would be more idiomatic Perl to rewrite it something like this (untested):

        for my $class (@classArr){
            for my $href (@$class){
                for my $key (sort keys %$href){
                    print $csv "$key-- $href->{$key},";
                }
            }
        }
Re: Creating CSV term document matrix from a hash stored in multideminsional array
by Cow1337killr (Monk) on Mar 03, 2017 at 21:27 UTC

    I discovered Text::CSV recently (because I hang out at PerlMonks). Also, DBD::CSV lets one use SQL on CSV files.
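
For the matrix itself, each row needs the same columns in the same order, which printing "key : value" pairs per document does not give. The following is a rough sketch under assumptions, not the poster's actual program: the sample data, the leading class-index column, and the helpers csv_field and term_document_rows are all invented for illustration. It sticks to core Perl with a small hand-written quoting helper so it runs without extra modules; Text::CSV, mentioned above, would be the more robust choice for real data.

```perl
use strict;
use warnings;

# Hypothetical data in the poster's layout: classes -> documents -> term counts.
my @classArr = (
    [ { apple => 2, banana => 1 }, { banana => 3 } ],
    [ { cherry => 1, apple => 1 } ],
);

# Quote a field only when it contains a comma, double quote, or line break.
sub csv_field {
    my ($value) = @_;
    if ($value =~ /[",\r\n]/) {
        $value =~ s/"/""/g;
        return qq{"$value"};
    }
    return $value;
}

# Build the matrix as a list of rows: a header, then one row per document.
sub term_document_rows {
    my ($classes) = @_;

    # Collect every term once to get a fixed, sorted column order.
    my %docTerms;
    for my $class (@$classes) {
        for my $href (@$class) {
            $docTerms{$_} = 1 for keys %$href;
        }
    }
    my @featureVector = sort keys %docTerms;

    my @rows = ( [ 'class', @featureVector ] );
    for my $classIdx (0 .. $#$classes) {
        for my $href (@{ $classes->[$classIdx] }) {
            # Terms missing from a document count as 0, so rows stay aligned.
            push @rows, [ $classIdx, map { $href->{$_} // 0 } @featureVector ];
        }
    }
    return @rows;
}

for my $row (term_document_rows(\@classArr)) {
    print join(',', map { csv_field($_) } @$row), "\n";
}
```

With the sample data this prints a header line class,apple,banana,cherry followed by the rows 0,2,1,0 then 0,0,3,0 then 1,1,0,1. To write to a file instead, open a filehandle and print to it in the final loop.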

Node Type: perlquestion [id://1183589]
Approved by Corion