PerlMonks  

Re: Word frequency in an array

by cool (Scribe)
on Jun 10, 2007 at 15:28 UTC (#620315)


in reply to Word frequency in an array

Its just avoiding grep; again a trivial soln.
#! /usr/bin/perl
use strict;
use warnings;
my $x=1;
my @arr= qw(foo cho roh foo kho foo moo foo);
foreach(@arr){print $x++ if (/foo/)}
But it can be done using spl variable of reg ex also, if I am right?? Any takers?
#! /usr/bin/perl
use strict;
use warnings;
my $x=1;
my @arr= qw(foo cho roh foo kho foo moo foo);
my $str=join ' ',@arr;
$str=~ /foo/;
print $&;
#### In place of $&; we can use that for no
#### for no. of matches.

Replies are listed 'Best First'.
Re^2: Word frequency in an array
by davido (Cardinal) on Jun 10, 2007 at 16:20 UTC

    Ok, your solutions:

    The first one is less than optimal. First, you're starting with $x = 1, which means that after the loop terminates $x will overstate the count by one. Why not start with $x = 0, and then pre-increment instead of post-incrementing $x? In other words, ++$x, instead of $x++. The next issue is the regexp you used. It will match just about anything containing "foo", including "foolish". Is that intentional? Maybe /^foo$/ would be better, or perhaps /\bfoo\b/. And the last thing to mention is the use of print within the loop. You're printing on each iteration, which creates an IO bottleneck, plus a lot of clutter. If $x started at zero, you could print after the loop terminates.
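    As a rough sketch of those three changes taken together (zero start, pre-increment, anchored pattern, a single print after the loop), assuming an exact match on "foo" is what is wanted:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @arr = qw(foo cho roh foo kho foo moo foo);

# Start at zero so the counter equals the number of matches seen so far.
my $x = 0;
foreach (@arr) {
    # Anchored pattern: counts "foo" itself, not "foolish".
    ++$x if /^foo$/;
}

# Print once, after the loop, instead of on every iteration.
print "$x\n";
```

    With the sample array this prints 4.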

    Your second solution goes to a lot of extra work and memory inefficiency by creating $str as a temporary stringified version of @arr. And the other problem is that $& only shows the actual most recent match, not some count of the number of possible times the regular expression could have matched. Don't use a special variable, use this:

    my $count = () = $str =~ m/\bfoo\b/g;

    But I still feel it's a bad solution because you're creating a temporary string unnecessarily.

    The grep solution is probably the best for a one-time count. The hash solution is probably better if you're doing the count several times, but it does have two problems: you're still creating the temporary copy (the hash), and the creation of a hash is a more computationally expensive operation than running through the array one time counting, as is done in the grep method.
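    For comparison, here is roughly what I take the grep and hash approaches from elsewhere in the thread to look like. The variable names ($foo_count, %freq) are mine, not the original posters', so treat this as a sketch rather than a copy of their code:

```perl
use strict;
use warnings;

my @arr = qw(foo cho roh foo kho foo moo foo);

# One-time count: grep in scalar context returns the number of matching elements.
my $foo_count = grep { $_ eq 'foo' } @arr;

# Repeated lookups: build the frequency table once, then query it.
my %freq;
$freq{$_}++ for @arr;

print "grep: $foo_count\n";
print "hash: $freq{foo}\n";
print "hash: ", $freq{bar} // 0, "\n";   # absent words count as zero (// needs perl 5.10+)
```

    The grep version walks the array once and keeps nothing; the hash version pays for the table up front but answers any later word lookup without another pass.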

    One other thing: "utioecia". There you go; the keystrokes you saved by abbreviating "special" and "solution." You can cut and paste them into your future posts so that you can retain clarity without wasting those eight keystrokes. ;)


    Dave

      Why not start with $x = 0, and then pre-increment instead of post-incrementing $x? In other words, ++$x, instead of $x++.

      Indeed. In fact, it's also worth recalling that {pre,post}-{increment,decrement} behave intelligently on an undefined variable: first, they do not complain under warnings, and second, the post- forms "coerce to numeric value", that is, they return 0:

      errol:~ [19:01:32]$ perl -wMstrict -le 'my $x; print map $x++, (1) x 3'
      012

      Hi Dave,

      Thanks for the insight into the solutions, and for giving me prototypes to copy and paste ;)

      And the other problem is that $& only shows the actual most recent match, not some count of the number of possible times

      Actually, that is what I mentioned; please read the comments in:

      #! /usr/bin/perl
      use strict;
      use warnings;
      my $x=1;
      my @arr= qw(foo cho roh foo kho foo moo foo);
      my $str=join ' ',@arr;
      $str=~ /foo/;
      print $&;
      #### In place of $&; we can use that for no
      #### for no. of matches.

      Now, I posted this piece to get suggestions from people on what in regular expressions can be used (in place of $&) to count the number of matches in one go using a special variable, if there is any! I think I encountered that somewhere. That is what I was asking with:

      But it can be done using spl variable of reg ex also, if I am right?? Any takers?

Re^2: Word frequency in an array
by blazar (Canon) on Jun 11, 2007 at 17:29 UTC
    Its just avoiding grep; again a trivial soln.

    cool, I know that u r c00l and all, but could u plz stop im-talking? Many find it plain annoying and it hinders communication between people here. It is appropriate where it is appropriate, that is, in IMs et similia, where your primary goal is speed and not clarity. But here it's just the opposite.

    my $x=1; foreach(@arr){print $x++ if (/foo/)}

    Initialization and "efficiency" issues apart, which were duly pointed out by davido, just reasoning solely in terms of user interaction, what benefit could come of incrementally printing the counter at each iteration? You're only interested in the final value anyway. Not to mention the /foo/ gotcha mentioned several times in this thread.

    But it can be done using spl variable of reg ex also, if I am right?? Any takers?

    cool, I know that u r c00l and all, but could u plz stop im-talking?

    my $str=join ' ',@arr;

    That is just like my $str="@arr"; that is, unless you've changed $". And if you haven't then it's a very convenient idiom. If you have, then you should have done so locally in a block anyway, unless yours is a very very special situation.
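    A small sketch of both points, the equivalence under the default separator and the block-scoped change; the comma separator and the variable names are only for illustration:

```perl
use strict;
use warnings;

my @arr = qw(foo cho roh foo);

# With the default list separator ($" is a single space) the two are identical.
my $joined       = join ' ', @arr;
my $interpolated = "@arr";
print $joined eq $interpolated ? "same\n" : "different\n";

# Changing $" changes the interpolation, so confine the change to a block.
my $with_commas;
{
    local $" = ',';
    $with_commas = "@arr";
}
print "$with_commas\n";
print "@arr\n";   # the separator is back to a space here
```

    Because of the local, the last line interpolates with a space again once the block has ended.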

    $str=~ /foo/;

    That is just like "@arr" =~ /foo/; no need for an intermediate variable.

    print $&; #### In place of $&; we can use that for no #### for no. of matches.

    I understand what you mean, but:

    • your match is not a global one (you have to use the /g modifier for that), so the number of matches will always be at most one;
    • no, there is no such variable and no need for one, since a global match in list context returns the list of all the matches (or of all the captures if capturing parens are present), and assigning that list in scalar context gives their number.
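    To make the two points concrete, here is a minimal sketch that uses the "@arr" interpolation directly, with no intermediate string variable; I've used \bfoo\b rather than the bare /foo/ because of the gotcha mentioned above, and $single and $count are just illustrative names:

```perl
use strict;
use warnings;

my @arr = qw(foo cho roh foo kho foo moo foo);

# Without /g the match simply succeeds or fails, so it can report at most one hit.
my $single = "@arr" =~ /\bfoo\b/ ? 1 : 0;

# With /g in list context the match returns every match; assigning that list
# to an empty list in scalar context yields the number of elements.
my $count = () = "@arr" =~ /\bfoo\b/g;

print "single: $single\n";
print "global: $count\n";
```

    With the sample array this reports 1 for the plain match and 4 for the global one.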
