Scalar data & collection data with Tree::Trie

by Endless (Beadle)
on Jul 04, 2013 at 14:40 UTC (#1042466, perlquestion)

Endless has asked for the wisdom of the Perl Monks concerning the following question:

Hello all! I'm just getting started with Perl and am converting a Java program of mine to Perl, both to learn the language and because I think Perl should handle the task well. I've read through a couple of books on Perl but am still getting my Perl-coding legs (and will be for some time, it seems).

Here we are: I'm creating a Tree::Trie that reads from a 4-column CSV to create a lexicon. The first column is the actual word to add to the tree, and works fine. But now I am trying to add the other columns (starting with just one) as data onto that node and could use some help. Here's what I've got:

#!/usr/bin/perl
use feature 'say';
use strict;
use warnings;
use Text::CSV;
use Tree::Trie;

my $file = $ARGV[0] or die "Need to get CSV file on the command line\n";
my $csv = Text::CSV->new ({
    binary    => 1,
    auto_diag => 1,
});
my($trie) = new Tree::Trie;
open(my $data, '<:encoding(utf8)', $file) or die "Could not open '$file' $!\n";
while (my $fields = $csv->getline( $data )) {
    $trie->add_data(($fields->[0])=>($fields->[1]));
    my(@sent) = $trie->lookup_data($fields->[0]); #<-- only seems to get the right data in array context
    printf "Just added %s\n with sentiment %s\n", $fields->[0], $sent[1];
}
if (not $csv->eof) {
    $csv->error_diag();
}

I just arrived at this and it actually works, but it came after hours of failing to get a data retrieval in scalar context to function; it would always just return the word again.

So, I have two questions: first, what do I need to do to get a single piece of data out in scalar context with Tree::Trie? I have a suspicion that I am making some Perl-novice mistake in that.

Second, how do I attach and retrieve multiple pieces of data for a word? I assume it involves attaching an array as the data I'm adding; can anyone give me a friendly example of how the attachment/retrieval would look in that case?

Thank you much! I am pleased to be here with the monks.

Replies are listed 'Best First'.
Re: Scalar data & collection data with Tree::Trie
by hdb (Monsignor) on Jul 04, 2013 at 14:57 UTC

    According to the documentation of Tree::Trie the return value of lookup_data depends on the setting of the deepsearch option when used in scalar context. 'exact' is probably the setting you want.

Re: Scalar data & collection data with Tree::Trie
by tangent (Vicar) on Jul 04, 2013 at 16:23 UTC
    failing to get a data retrieval in scalar context to function; it would always just return the word again.
    I tried your code and using scalar context worked for me, i.e. it returned the data associated with the word, not the word itself.
    use Tree::Trie;
    my $trie = new Tree::Trie;
    $trie->add_data( word => 'data');
    my @sent = $trie->lookup_data('word');
    print "List context: @sent\n";
    my $data = $trie->lookup_data('word');
    print "Scalar context: $data\n";

    List context: word data
    Scalar context: data

    Maybe you could post up some of your input file? It would help to answer your second question as well.

    Update: Is it possible you are using "scalar context" like this:

    my ($data) = $trie->lookup_data($fields->[0]);
    If so, that will indeed return the word and not the data. You need to leave out the parentheses around $data.

      Thanks! You are exactly right about why my scalar wasn't working; I was calling it as 'my ($var)'. Can you give me a quick line on why I should or shouldn't have the parentheses there?

      I seem to have the multi-data working now, and I'll show you what I did. I suspect it is pretty sloppy, un-idiomatic code, so any suggestions for a cleaner implementation are welcome. Here we are. First, my CSV format (for reference):

      "PrimaryTest","normalSentiment","negSentiment","topics"
      "a waste",-1,0,""

      Now my code:

      while (my $fields = $csv->getline( $data )) {
          my $word = $fields->[0];
          my $pos_sent = $fields->[1];
          my $neg_sent = $fields->[2];
          my $topics = $fields->[3];
          my @word_info = ($pos_sent, $neg_sent, $topics);
          # Trying with a reference
          my $pnt_word_info = \@word_info;
          $trie->add_data($word => $pnt_word_info);
          my $info = $trie->lookup_data($word);
          printf "Just added %s\n Sentiment: %s \t Neg Sentiment: %s \t Topic: %s\n", $word, $info->[0], $info->[1], $info->[2];
      }

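        my ($var) imposes list context on the right-hand side, so $var will contain the first element of the list returned (in this case, the word). Without the parentheses, the call is made in scalar context and returns the data.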
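
The difference between the two assignment forms can be seen without Tree::Trie. The following is a minimal sketch: the lookup sub is a hypothetical stand-in, not the module's code, and uses wantarray to mimic a method that returns (word, data) in list context and only the data in scalar context, as observed earlier in this thread.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a context-sensitive method such as lookup_data.
# List context: (word, data). Scalar context: data only.
sub lookup {
    my ($word) = @_;
    my %store = ( 'a waste' => -1 );
    return wantarray ? ( $word, $store{$word} ) : $store{$word};
}

my $scalar  = lookup('a waste');   # scalar context: receives the data
my ($first) = lookup('a waste');   # list context: receives the first element (the word)
my @both    = lookup('a waste');   # list context: receives word and data

print "scalar: $scalar\n";
print "first:  $first\n";
print "both:   @both\n";
```

The parentheses around $first turn the left-hand side into a list, which is why it ends up holding the word rather than the data.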

        With regard to your code, there's nothing really wrong with it, but unless you need the intermediate variables for something else, it could be shortened to:

        while (my $fields = $csv->getline( $data )) {
            my $word = $fields->[0];
            $trie->add_data( $word => [ $fields->[1], $fields->[2], $fields->[3] ] );
            my $info = $trie->lookup_data($word);
            printf "Just added %s\n Sentiment: %s \t Neg Sentiment: %s \t Topic: %s\n", $word, $info->[0], $info->[1], $info->[2];
        }
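
The anonymous array could also be built with an array slice, or the columns could be kept under names in a hash so that later code does not depend on index positions. A minimal sketch, assuming a hard-coded row in place of $csv->getline and a plain hash (%lexicon, a hypothetical name) in place of the trie:

```perl
use strict;
use warnings;

# Stand-in for one parsed CSV row, matching the sample line in the thread.
my $fields = [ 'a waste', -1, 0, '' ];
my $word   = $fields->[0];

# Plain hash used instead of the trie, purely for illustration.
my %lexicon;

# Array slice on the reference: columns 1..3 copied into a new anonymous array.
$lexicon{$word} = [ @{$fields}[ 1 .. 3 ] ];

# Alternative: a hash slice gives each column a name (names are made up here).
my %named;
@named{qw(sentiment neg_sentiment topics)} = @{$fields}[ 1 .. 3 ];

printf "%s: sentiment %s, neg sentiment %s\n",
    $word, $lexicon{$word}[0], $named{neg_sentiment};
```

Either structure can be passed by reference as the data value; the array form is more compact, while the hash form reads more clearly at the point of use.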
