
Re: makeing refering faster ?

by perldeveloper (Scribe)
on Aug 15, 2004 at 15:20 UTC

in reply to makeing refering faster ?

Judging from your code, you are trying to build, for every sentence Si, a list of all sentences Si1, Si2, ..., Sik that start with any of the words in Si. However, your code keeps overwriting this list with the sentences that start with the last word of each sentence, which makes the code, if not incorrect, at least suspicious and inefficient. If you are trying to do what I think you are, here is how I'd do it:
use strict;
use warnings;

my $dat0 = 'a.txt';
open (DAT, "$dat0") or die "Could not open file `$dat0'.\n";
my @all=<DAT>;
close (DAT);

my @words = ();
my $sentences = {}; # 'word' => [ sentences that start with `word' ]
foreach my $sentence (@all) {
    chomp ($sentence);
    push (@words, [ split (/[ \t]+/, $sentence) ]);
    my $firstWord = $words[-1]->[0];
    $sentences->{$firstWord} = [] if not exists $sentences->{$firstWord};
    push (@{$sentences->{$firstWord}}, $#words);
}

my @temp = ('', '');
for (my $i = 0; $i <= $#words; $i++) {
    push (@temp, $all[$i]);
    my @referencedSentences = ();
    foreach my $j (@{$words[$i]}) {
        if (($j ne "$j") || ($j ne "v")) { # I don't get this so I leave it intact
            if (exists $sentences->{$j}) {
                push (@referencedSentences, $sentences->{$j});
            }
        }
    }
    push (@temp, \@referencedSentences);
}
print "Done.\n";
# ...

As you can see, I first build a hash keyed by the first word of every sentence, whose values are references to arrays holding the indices of the sentences that start with that word. Then, for every sentence, I build an array of these hash values, one for each of its words that happens to start some sentence (including the sentence under scrutiny). I believe this code is a better starting point for optimization: it ran within a second on a 3,000-line file.
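
To make the resulting structure concrete, here is a minimal sketch of the same two-pass idea run on three in-memory sentences rather than on a.txt. The sample sentences and the names %starts_with and @refs are made up for illustration, and the sketch stores flat lists of indices instead of references to the hash values, so treat it as a variation under those assumptions rather than a drop-in replacement:

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the lines of a.txt.
my @all = ("cat eats fish", "fish swim fast", "eats nothing");

# Pass 1: split each sentence once and index the sentences by first word.
my @words;
my %starts_with;    # 'word' => [ indices of sentences starting with 'word' ]
for my $i (0 .. $#all) {
    $words[$i] = [ split ' ', $all[$i] ];
    push @{ $starts_with{ $words[$i][0] } }, $i;
}

# Pass 2: for each sentence, collect the indices of all sentences that
# start with any of its words (the sentence itself included).
my @refs;
for my $i (0 .. $#words) {
    $refs[$i] = [ map { @{ $starts_with{$_} || [] } } @{ $words[$i] } ];
}

print "$_: @{ $refs[$_] }\n" for 0 .. $#refs;
```

With this sample data, sentence 0 ends up referencing sentences 0, 2 and 1 (through "cat", "eats" and "fish"), while sentences 1 and 2 reference only themselves.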

A few other remarks:
  • Initialize an empty list with my @list = ();, not with @list = '', which creates a one-element list whose first element is an empty string.
  • Always use my, always stick to warnings and strict.
  • Avoid recalculating the same value over and over; compute it once and cache the result (as with the split calls in your code).
  • Use descriptive names and comments, especially when asking for assistance :)
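
The first and third remarks can be checked with a few lines. This is only a small sketch; the variable names and the sample line are invented for illustration:

```perl
use strict;
use warnings;

# Assigning '' creates a list with one element (an empty string),
# whereas assigning () creates a truly empty list.
my @not_empty = '';
my @empty     = ();
print scalar(@not_empty), " vs ", scalar(@empty), "\n";

# Cache the result of split instead of calling it again for each use.
my $line   = "the quick brown fox";
my @fields = split ' ', $line;    # split once
my $first  = $fields[0];          # reuse the cached list
my $count  = scalar @fields;
print "$first ($count words)\n";
```

The element counts printed on the first line differ, which is why a loop over a list initialized with '' runs once even though nothing was pushed onto it.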
