
Re: objects and duplicates

by stiller (Friar)
on Apr 27, 2008 at 18:20 UTC (#683169)

in reply to objects and duplicates

First, you have an ordering bug: you use $seen{$_} ... a few lines before you reset it with %seen = ();

You can reduce the amount of work (and code) by doing:

my %seen; $seen{$_}++ for @temp;

Now you have one entry in %seen for each distinct sentence, and its value tells you how many times that sentence occurred. You can then use wfsp's package to make the objects.
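A minimal sketch of that counting idiom. The sentences in @temp are made-up sample data, not taken from the original post:

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the original @temp.
my @temp = ('the cat sat', 'a dog ran', 'the cat sat', 'the cat sat');

# One hash entry per distinct sentence; the value is its number of occurrences.
my %seen;
$seen{$_}++ for @temp;

# Report each distinct sentence with its count, in sorted order.
for my $sentence (sort keys %seen) {
    print "$seen{$sentence} x $sentence\n";
}
```

With this sample data, %seen ends up with two keys: 'the cat sat' with a count of 3 and 'a dog ran' with a count of 1.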

Re^2: objects and duplicates
by Anonymous Monk on Apr 27, 2008 at 18:45 UTC
    Wfsp and stiller, thank you both. It worked. I now use
    for $record (@records){ $duplicates{$_}++ for $record->src; }

    to store each sentence in a hash, with the number of times it appears. This is great.

    I still need to change the $record->duplicate of each object to the number of times the sentence appears. Do I have to write a new loop for this, or can we do it at the same time we count the duplicate sentences?

    I was thinking of something like this:

    for $record (@records){ $duplicates{$_}++ for $record->src; $record->duplicate++; }

    Thank you

      Does $record->src return a string or a list? If it returns a string you'll want:
      my %count;
      for my $record (@records) {
          if ($count{$record->src}++) {
              $record->duplicate(1);  # or however you set the duplicate flag
          }
      }
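      Note: if a string is duplicated, the first Entry object with that src value will not have its duplicate flag set, because the post-increment returns the old count (zero) the first time a string is seen. All later matching Entry objects will have it set.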


        $record->src does return a string. Your code works perfectly fine, thank you.

        I didn't express myself clearly: I called $record->duplicate a flag, but in fact it should hold the number of times the sentence is duplicated, so it's more a count than a flag.

        I still think it's possible to achieve this, but how?

        Thank you

      Just jot down the pseudocode, e.g.:
      • read the file, each line into a hash, incrementing the number of occurrences of that sentence.
      • create an object from each sentence, for which I already know the number of occurrences...
      • and so on
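The steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: the Entry package here is a hypothetical minimal stand-in (wfsp's actual package is not shown in this thread), the sentences come from an in-memory list instead of a file, and duplicate is assumed to hold the total number of occurrences.

```perl
use strict;
use warnings;

# Hypothetical minimal stand-in for the Entry class discussed in the thread;
# the constructor arguments and accessors are assumptions.
package Entry;

sub new {
    my ($class, %args) = @_;
    return bless { src => $args{src}, duplicate => $args{duplicate} }, $class;
}

# Accessor for the sentence text.
sub src { return $_[0]{src} }

# Accessor for the number of times the sentence occurs.
sub duplicate { return $_[0]{duplicate} }

package main;

# Hypothetical input; in the thread these sentences are read from a file.
my @sentences = ('the cat sat', 'a dog ran', 'the cat sat');

# Step 1: count how often each sentence occurs.
my %count;
$count{$_}++ for @sentences;

# Step 2: create one object per sentence; the totals are final by now,
# so each object can be given its count at construction time.
my @records = map { Entry->new(src => $_, duplicate => $count{$_}) } @sentences;

printf "%d x %s\n", $_->duplicate, $_->src for @records;
```

Counting first and building the objects afterwards matters because a sentence's total is only known once every line has been seen; setting the count inside the counting loop would give earlier objects a lower number than later ones. If duplicate should instead mean "extra copies beyond the first", subtract one when constructing.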
