PerlMonks  

Counting the number of items returned by split without using a named array

by Anonymous Monk
on May 03, 2006 at 09:46 UTC ( #547098=perlquestion )

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi all,

I'm getting the warning "Use of implicit split to @_ is deprecated" on this line:

$entries=split(/\s+/);

I really just want to count the elements there; I'm not interested in the resulting array. So what's the most elegant way to get the number of entries and throw away the resulting array (and remove the warning)?


Thx
I.

2006-05-04 Retitled by Arunbear, as per consideration
Original title: 'strip into @_ deprecated'

Replies are listed 'Best First'.
Re: Counting the number of items returned by split without using a named array
by borisz (Canon) on May 03, 2006 at 10:02 UTC
    $_ = 'Hi There! x'; my $entries = () = /\S+/g; print $entries;
    Boris
Re: Counting the number of items returned by split without using a named array
by BrowserUk (Pope) on May 03, 2006 at 10:16 UTC

    Another way that avoids a named array.

    $entries = @{[ split /\s+/ ]};

    Also, split /\s+/ is similar to the slightly magical split ' ', except that split ' ' suppresses the empty leading field that leading whitespace would otherwise produce.

    In turn, split ' ' is the same as split with no arguments, so you could reduce your code to:

    $entries = @{[ split ]};

    That works if you don't have leading whitespace, or if you don't want to count the empty field that any leading whitespace would produce as an entry.
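
    To make the leading-whitespace difference concrete, here is a minimal sketch. The two sub names and the sample string are made up for illustration; only split itself comes from the post. Note that the extra leading field is an empty string.

```perl
use strict;
use warnings;

# Hypothetical helpers, named only for this illustration.
sub count_regex_split { my ($s) = @_; return scalar @{[ split /\s+/, $s ]}; }
sub count_awk_split   { my ($s) = @_; return scalar @{[ split ' ', $s ]}; }

my $line = "  foo bar baz";              # note the leading whitespace

print count_regex_split($line), "\n";    # 4: the empty leading field is counted
print count_awk_split($line), "\n";      # 3: leading whitespace is skipped
```

    Without leading whitespace the two helpers should agree, which is why the shorter form is usually good enough.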


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      My bias goes towards either
      $entries = @{[ split ]};

      or
      $entries = () = /\S+/g;

      as "most elegant".
      Both are sufficiently short, although I'm not sure which one is easier to understand for the uninitiated reader :)
      Just out of curiosity (it doesn't really matter in my case):
      which one would be the more (CPU- and memory-) efficient one?

      I.

        C:\test>p1
        our $s = join ' ', 'aa'..'zz';;
        cmpthese -3, {
            split => q[ $_=$s; my $n = @{[ split ]}; ],
            regex => q[ $_=$s; my $n = () = /\S+/g; ]
        };;
                Rate regex split
        regex  546/s    --  -50%
        split 1102/s  102%    --

        Assuming I didn't goof on the benchmark, split appears to be quicker.
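
        For anyone who wants to repeat the comparison outside an interactive shell, a standalone sketch might look like the following. It uses the core Benchmark module; writing the candidates as code refs (rather than strings) and the sanity check are my own additions, and the timings will of course differ by machine and Perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

our $s = join ' ', 'aa' .. 'zz';    # 676 two-letter words, as in the post

# The same two candidates, written as code refs instead of strings.
my %candidates = (
    split => sub { local $_ = $s; my $n = @{[ split ]}; return $n },
    regex => sub { local $_ = $s; my $n = () = /\S+/g;  return $n },
);

# Sanity check: both approaches must agree before timing them.
die "counts differ" unless $candidates{split}->() == $candidates{regex}->();

cmpthese(-1, \%candidates);    # run each candidate for at least 1 CPU second
```

        Checking that both candidates return the same count first guards against the "goofed benchmark" case where one snippet is fast only because it does less work.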


        "Just out of curiosity (it doesn't really matter in my case): which one would be the more (CPU- and memory-) efficient one?"

        It hardly ever matters. As a wild guess, I would say that the former is more computationally intensive, since it involves doing something (taking a reference) and then undoing it. In case of doubt:

        use Benchmark;

        I may well (and happily!) be proven wrong...

Re: Counting the number of items returned by split without using a named array
by prasadbabu (Prior) on May 03, 2006 at 09:58 UTC

    Here is one way to do it.

    use strict;
    use warnings;
    $_ = 'here the text goes';
    my @entries;
    print scalar (@entries = split /\s+/, $_);

    Without a return list, here is one way:

    use strict;
    use warnings;
    $_ = 'here the text goes';
    my $count;
    print $count = $_ =~ s/\S+//g;

    Updated: added a second method without a return list, as blazar pointed out, though it is not the most elegant way. Thanks.

    Prasad

      "So what's the most elegant way to get the number of entries and throw away the resulting array?"

      I think he means: "discarding the return list, retaining only its length".

      Ha, great
      print $count = $_ =~ s/\S+//g;

      was probably what I wanted - I just didn't know you could do that with a simple search (although I wondered whether one couldn't just use a search in some way)

      Thx a lot

      Why does a simple
      print $count = s/\S+//g;

      not work, though?

      This (as suggested below) did not count anything either:

      $count = () = s/\S+//g;

      I.
        I use m//g below, __not__ s///g.
        Boris

        Although this may occasionally work for you in this circumstance, it's not logical to modify the original string just to count the number of occurrences. Just use /\S+/g instead.
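
        A short sketch of the difference, with a made-up helper name (count_words) and the sample string from the earlier reply. s///g returns the number of substitutions as one scalar and empties the string as a side effect, so wrapping it in =()= counts that single return value rather than the words. One possible reason the plain s///g form seemed not to work is that $_ had already been emptied by an earlier run; that is only a guess.

```perl
use strict;
use warnings;

# Hypothetical helper: count non-whitespace runs without touching the input.
sub count_words {
    my ($text) = @_;
    my $count = () = $text =~ /\S+/g;    # list assignment in scalar context
    return $count;
}

my $text = 'here the text goes';

my $copy = $text;
my $subs = ( $copy =~ s/\S+//g );        # number of substitutions made
print "$subs\n";                         # 4
print length($copy), "\n";               # 3: only the three spaces are left

$copy = $text;
my $wrong = () = ( $copy =~ s/\S+//g );  # s/// returns ONE scalar
print "$wrong\n";                        # 1

print count_words($text), "\n";          # 4
print "$text\n";                         # here the text goes
```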

Re: Counting the number of items returned by split without using a named array
by Tanalis (Curate) on May 03, 2006 at 10:12 UTC
    Why not just count the instances of whitespace?
    my $str = "this is a test string";
    my $cnt = 1;                     # num tokens = whitespace + 1
    ++$cnt while $str =~ /\s+/g;
    print $cnt, "\n";                # prints 5

      The =()= does work with matches:

      $ perl -lpe '($_=()=/\s+/g)++'
      foo
      1
      bar baz
      2
      foo bar baz
      3

      But I would use split, especially with the smart behaviour provided by the default ' ' argument.

        That fails when input has leading or trailing blanks:

        $ perl -lpe '($_=()=/\s+/g)++' ## leading
         foo bar
        3
        $ perl -lpe '($_=()=/\s+/g)++' ## trailing
        foo bar 
        3
        $ perl -lpe '($_=()=/\s+/g)++' ## both
         foo bar 
        4
        $ _

        It's better to count occurrences of actual elements (\S+):

        $ perl -wle 'print scalar (()=/\S+/g) for "a b c", " a b c", "a b c ", " a b c "'
        3
        3
        3
        3
        $ _

        Anyway I prefer split too, like salva++'s 0e0 solution.

        --
        David Serrano
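
        The two counting strategies can be put side by side in a small script. This is only a sketch; the helper names count_by_gaps and count_by_words are invented here, and the four test strings are the ones from the one-liner above.

```perl
use strict;
use warnings;

# Hypothetical helpers for comparison only.
sub count_by_gaps  { my ($s) = @_; my $n = () = $s =~ /\s+/g; return $n + 1; }
sub count_by_words { my ($s) = @_; my $n = () = $s =~ /\S+/g; return $n; }

for my $s ('a b c', ' a b c', 'a b c ', ' a b c ') {
    printf "%-9s gaps+1=%d words=%d\n",
        "'$s'", count_by_gaps($s), count_by_words($s);
}
```

        Only the word-based count stays at 3 for all four inputs; the gap-based count grows by one for each padded end.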

Re: Counting the number of items returned by split without using a named array
by blazar (Canon) on May 03, 2006 at 10:07 UTC

    The so-called "goatse operator":

    =()=

    BTW: if you don't know what goatse is, then chances are you don't want to!

    Incidentally, do you really want \s+? The default, which is ' ', is a special case and does what you mean in the vast majority of cases.

      The =()= trick cannot be used with split, as it is "optimized" into split /foo/, $bar, 1:
      $ perl -MO=Deparse -e '$a = () = split'
      $a = () = split(" ", $_, 1);

        GAWD! Well, the fact that you write "optimized" yourself suggests that it is really an unwanted side effect of an optimization... may I push it as far as to dare to say that it is a bug?

        Well, another trick that I verified not to be flawed is:

        my $count=map $_, split;

        of course it doesn't just taste as good... hmmm, how 'bout:

        my $count=+(split); # ?!?

        (also verified!)

        $ perl -lpe '$_=+(split)'
        foo
        1
        bar baz
        2
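
        Written out as a script, the two tricks look roughly like this (the sub names are invented for the example). The second one is simply split in scalar context: according to perlfunc, before Perl 5.11 that form also overwrote @_, which is what the deprecation warning in the original question was about, so treat it as something to use only on newer Perls.

```perl
use strict;
use warnings;

# Hypothetical helpers, named only for this illustration.

# map in scalar context returns the number of elements it generated.
sub count_via_map { my ($s) = @_; my $n = map $_, split ' ', $s; return $n; }

# split in scalar context returns the number of fields.
sub count_via_scalar { my ($s) = @_; my $n = split ' ', $s; return $n; }

print count_via_map('foo bar baz'), "\n";       # 3
print count_via_scalar('foo bar baz'), "\n";    # 3
```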
        Pretty amazing, that "optimization":
        $ perl -MO=Deparse -e '$a = () = split(" ",$_,0)'
        $a = () = split(" ", $_, 1);

        BUT:
        $ perl -MO=Deparse -e '$a = () = split(" ",$_,100000)'
        $a = () = split(" ", $_, 100000);

        So it's not completely impossible to use it with split; it is just limited to a fixed maximum number of fields in the split.
        It's a bit unexpected (at least for me), though, that an explicit split(" ",$_,0) was 'optimized' as well.

        Btw, just out of curiosity I put this version in the benchmark too: it's faster than the regexp version, but still slower than $n = @{[ split ]};

        I.
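
        The effect of the implicit limit can be seen directly in a few lines. This is a sketch under the assumption that the Perl in use still applies the rewrite shown by Deparse; the variable names are made up.

```perl
use strict;
use warnings;

my $line = 'foo bar baz';

# No explicit limit: the empty list on the left makes Perl pick a limit of 1,
# so split hands back the whole line as a single field.
my $limited = () = split ' ', $line;

# Explicit non-zero limit, as in the Deparse experiment: left alone by Perl.
my $bounded = () = split ' ', $line, 100_000;

# Anonymous-array form, which is not a list assignment and so is unaffected.
my $wrapped = @{[ split ' ', $line ]};

print "$limited $bounded $wrapped\n";
```

        On the Perls I would expect this to print "1 3 3". Be aware that, as far as I understand perlfunc, an explicit non-zero limit also keeps trailing empty fields, so the bounded form may count one more than the others when the line ends in whitespace.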

      a) You're right, I didn't want to know that

      b) hm, I just wanted to catch spaces and tabs and thought \s+ most appropriate.

      Never seen =()= in the perldoc before :-/
      I.

        Because it's not an operator in itself. It's an assignment to a list, whose result is further piped into another assignment. It's just a means of creating a list context. Others may find a better wording to describe it: possibly mine is not as technically accurate as it could be. Unfortunately, as others have already explained, it's not reliable to use it with split.
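
        One way to see what is going on: a list assignment evaluated in scalar context yields the number of elements on its right-hand side, no matter how many the left-hand side actually keeps. The sketch below (variable names invented) shows that the empty list is just the extreme case of "keep nothing".

```perl
use strict;
use warnings;

my $text = 'one two three four';

# Keep the first two matches, but still learn how many there were.
my $kept_two = ( my ($first, $second) = $text =~ /\S+/g );

# Keep nothing at all: this is the =()= idiom spelled out.
my $kept_none = ( () = $text =~ /\S+/g );

print "$first $second $kept_two $kept_none\n";    # one two 4 4
```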

      Hm, sorry to have asked, as I see the two prominent solutions to the problem had already been discussed here
      (hope linking works as expected now, preview was fine... if not: was meant to link here http://www.perlmonks.org/?node_id=527973) I.

        You should not be sorry. Even if the topic seems trivial and elementary, it turned out to be more complex than one would probably think, and thus the discussion has been very interesting.

        BTW: to insert a link you should use [id://527973] or [id://527973|here], which render like Perl Idioms Explained - my $count = () = /.../g and here respectively. This is the preferred way since they will bring up the correct link both if you're in http://perlmonks.org and http://www.perlmonks.org, or any other possible mirror. See this node for more info.

Re: Counting the number of items returned by split without using a named array
by lima1 (Curate) on May 03, 2006 at 10:09 UTC
    don't know if it's the "most elegant way", but if you don't want to generate an array, count the blanks, e.g.:
    my $entries = 1; s/\s/$entries++/eg;

      No, no, no, that would modify the original string in a most probably unwanted way. And if you really wanted to do it, then probably it should have been \s+. But you do not want to do so: a match would be better suited.

        I know that, of course. It is just one quick-and-dirty way to count blanks (hence the "e.g."). But the goatse thing is nicer, I must admit.
Re: Counting the number of items returned by split without using a named array
by smokemachine (Hermit) on May 03, 2006 at 21:16 UTC
    my$s=my@a=split/\s+/;

Node Type: perlquestion [id://547098]
Approved by Corion