Re: Counting the number of items returned by split without using a named array
by BrowserUk (Patriarch) on May 03, 2006 at 10:16 UTC
|
$entries = @{[ split /\s+/ ]};
Also, split /\s+/ is similar to the slightly magical split ' ', except that split ' ' suppresses the empty leading field that leading whitespace would otherwise produce.
In turn, split ' ' is the same as split with no arguments, so you could reduce your code to:
$entries = @{[ split ]};
That works if you don't have leading whitespace, or if you don't want to count the empty field that any leading whitespace would produce as an entry.
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
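A minimal sketch of the difference described above, assuming a made-up input string with leading whitespace (the variable names are illustrative and not from the original post):

```perl
use strict;
use warnings;

# Hypothetical input with leading whitespace (not from the thread).
my $line = "  foo bar baz";

# split /\s+/ keeps an empty leading field when the string starts with whitespace.
my $n_regex = @{[ split /\s+/, $line ]};

# split ' ' skips leading whitespace before splitting.
my $n_magic = @{[ split ' ', $line ]};

print "split /\\s+/: $n_regex\n";   # 4
print "split ' ':   $n_magic\n";    # 3
```

Which count is "right" depends on whether that empty leading field should be treated as an entry.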
|
My bias goes towards either
$entries = @{[ split ]};
or
$entries = () = /\S+/g;
as "most elegant".
Both are sufficiently short, although I'm not sure which one is easier to understand for the uninitiated reader :)
Just out of curiosity (it doesn't really matter in my case): which one would be more efficient in terms of CPU and memory?
I.
|
C:\test>p1
our $s = join ' ', 'aa'..'zz';;
cmpthese -3, {
split => q[ $_=$s; my $n = @{[ split ]}; ],
regex => q[ $_=$s; my $n = () = /\S+/g; ]
};;
        Rate regex split
regex  546/s    --  -50%
split 1102/s  102%    --
Assuming I didn't goof on the benchmark, split appears to be quicker.
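For anyone who wants to rerun the comparison outside an interactive shell, here is a rough standalone sketch using the core Benchmark module. The helper names count_split and count_regex are made up for illustration, and the timings will of course differ from machine to machine and between perl versions:

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Same test data as above: 676 two-letter tokens joined by single spaces.
my $s = join ' ', 'aa' .. 'zz';

# Helper names are illustrative, not from the thread.
sub count_split { local $_ = shift; return scalar @{[ split ]}; }
sub count_regex { local $_ = shift; my $n = () = /\S+/g; return $n; }

# Sanity check: both approaches must agree before timing them.
die "counts differ" unless count_split($s) == count_regex($s);

# Run each candidate for at least 1 CPU second and print a comparison table.
cmpthese(-1, {
    split => sub { count_split($s) },
    regex => sub { count_regex($s) },
});
```

Both candidates pay the same sub-call overhead here, so the relative ranking should still be meaningful even if the absolute rates are lower than in the one-liner version.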
|
use Benchmark;
I may well (and happily!) be proved wrong...
Re: Counting the number of items returned by split without using a named array
by borisz (Canon) on May 03, 2006 at 10:02 UTC
|
$_ = 'Hi There! x';
my $entries = () = /\S+/g;
print $entries;
Re: Counting the number of items returned by split without using a named array
by prasadbabu (Prior) on May 03, 2006 at 09:58 UTC
|
use strict;
use warnings;
$_ = 'here the text goes';
my @entries;
print scalar (@entries = split /\s+/, $_);
Without a return list, here is one way:
use strict;
use warnings;
$_ = 'here the text goes';
my $count;
print $count = $_ =~ s/\S+//g;
Update: added the second method, which works without a return list, as blazar pointed out, though it is not the most elegant way. Thanks.
|
"So what's the most elegant way to get the number of entries and throw away the resulting array?"
I think he means: "discarding the return list, retaining only its length".
|
print $count = $_ =~ s/\S+//g;
was probably what I wanted. I just didn't know you could do that with a simple search (although I wondered whether one couldn't just use a search in some way).
Thanks a lot.
Why does a simple
print $count = s/\S+//g;
not work, though?
This (as suggested below) did not count anything either:
$count = () = s/\S+//g;
I.
|
I used m//g below, __not__ s///g.
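To illustrate the distinction, here is a small sketch (the sample string is borrowed from the post above, the variable names are made up). It assumes the usual behaviour that s///g returns a single number, the count of substitutions, whereas m//g in list context returns the matches themselves:

```perl
use strict;
use warnings;

my $text = 'here the text goes';

# s///g returns the number of substitutions, and it edits the string in place.
my $copy = $text;
my $subs = ($copy =~ s/\S+//g);
print "s///g count: $subs, string left: '$copy'\n";   # 4, three spaces left

# In list context s///g still returns that single number, so =()= sees one element.
$copy = $text;
my $one = () = ($copy =~ s/\S+//g);
print "=()= with s///g: $one\n";   # 1

# m//g in list context returns every match, so =()= counts them.
my $count = () = $text =~ /\S+/g;
print "=()= with m//g: $count\n";  # 4
```

So s///g can count on its own, at the price of destroying the string, while the =()= idiom only makes sense with a match.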
Re: Counting the number of items returned by split without using a named array
by Tanalis (Curate) on May 03, 2006 at 10:12 UTC
|
Why not just count the instances of whitespace?
my $str = "this is a test string";
my $cnt = 1; # num tokens = whitespace + 1
++$cnt while $str =~ /\s+/g;
print $cnt, "\n"; # prints 5
|
$ perl -lpe '($_=()=/\s+/g)++'
foo
1
bar baz
2
foo bar baz
3
But I would use split, especially with the smart behaviour provided by the default ' ' argument.
|
$ perl -lpe '($_=()=/\s+/g)++' ## leading whitespace: " foo bar"
 foo bar
3
$ perl -lpe '($_=()=/\s+/g)++' ## trailing whitespace: "foo bar "
foo bar 
3
$ perl -lpe '($_=()=/\s+/g)++' ## both: " foo bar "
 foo bar 
4
$ _
It's better to count occurrences of actual elements (\S+):
$ perl -wle 'print scalar (()=/\S+/g) for "a b c", " a b c", "a b c ",
+ " a b c "'
3
3
3
3
$ _
Anyway I prefer split too, like salva++'s 0e0 solution.
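The two counting strategies can be compared side by side. This is only a sketch with illustrative variable names, using the same four test strings as the one-liner above:

```perl
use strict;
use warnings;

my @rows;   # collected as [whitespace runs + 1, tokens, split count]
for my $str ("a b c", " a b c", "a b c ", " a b c ") {
    my $ws_plus_one = 1 + (() = $str =~ /\s+/g);   # whitespace runs + 1
    my $tokens      = () = $str =~ /\S+/g;         # actual tokens
    my $via_split   = @{[ split ' ', $str ]};      # split ' ' for comparison
    push @rows, [ $ws_plus_one, $tokens, $via_split ];
    printf "%-9s ws+1=%d  tokens=%d  split=%d\n",
        "'$str'", $ws_plus_one, $tokens, $via_split;
}
```

Only the whitespace-based count drifts (3, 4, 4, 5) as leading and trailing whitespace is added; the token match and split ' ' both stay at 3.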
Re: Counting the number of items returned by split without using a named array
by blazar (Canon) on May 03, 2006 at 10:07 UTC
|
=()=
BTW: if you don't know what goatse is, then chances are you don't want to!
Incidentally, do you really want \s+? The default, which is ' ', is a special case and does what you mean in the vast majority of cases.
|
The =()= trick cannot be used with split, because it is "optimized" into split /foo/, $bar, 1:
$ perl -MO=Deparse -e '$a = () = split'
$a = () = split(" ", $_, 1);
|
GAWD! Well, the fact that you write "optimized" yourself suggests that it is really an unwanted side effect of an optimization... may I push it as far as to dare to say that it is a bug?
Well, another trick that I verified not to be flawed is:
my $count=map $_, split;
of course it doesn't just taste as good... hmmm, how 'bout:
my $count=+(split); # ?!?
(also verified!)
$ perl -lpe '$_=+(split)'
foo
1
bar baz
2
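As a script rather than a one-liner, the two alternatives look roughly like this. The input string is hypothetical; the scalar-context form relies on split returning the number of fields in scalar context, which per perlfunc also overwrote @_ before Perl 5.11:

```perl
use strict;
use warnings;

my $str = "foo bar baz";   # hypothetical input

# map in scalar context returns the number of elements it generated.
my $via_map = map $_, split ' ', $str;

# split called in scalar context returns the number of fields.
my $via_scalar = split ' ', $str;

print "$via_map $via_scalar\n";   # 3 3
```

Neither form assigns split to a list, so the implicit limit shown in the Deparse output does not come into play.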
|
pretty amazing, that "optimization":
$ perl -MO=Deparse -e '$a = () = split(" ",$_,0)'
$a = () = split(" ", $_, 1);
BUT:
$ perl -MO=Deparse -e '$a = () = split(" ",$_,100000)'
$a = () = split(" ", $_, 100000);
So it's not completely impossible to use it with split; it is just limited to a fixed maximum number of fields. It's a bit unexpected (at least for me), though, that an explicit split(" ",$_,0) was 'optimized' as well.
Btw, just out of curiosity I put this version in the benchmark as well: it's faster than the regexp version, but still slower than $n = @{[ split ]};
I.
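A hedged sketch of the same idea with a negative limit instead of a large fixed one. This assumes, as the Deparse output above suggests, that only an omitted or zero limit gets rewritten; a negative limit is documented to mean "no limit", with the side effect that trailing empty fields are kept:

```perl
use strict;
use warnings;

my $str = "a b c";   # hypothetical input without trailing whitespace

# Implicit (or zero) limit: per the Deparse output above, split is compiled
# with limit 1 here, so at most one field comes back.
my $implicit = () = split ' ', $str;

# Explicit negative limit: no cap on fields; trailing empty fields are kept.
my $explicit = () = split ' ', $str, -1;

print "implicit limit: $implicit\n";
print "explicit -1:    $explicit\n";   # 3
```

Note that with trailing whitespace in the input ("a b c "), the -1 form would count the trailing empty field too, so it is not a drop-in replacement for plain split.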
|
a) You're right, I didn't want to know that
b) Hm, I just wanted to catch spaces and tabs and thought \s+ most appropriate.
Never seen =()= in the perldoc before :-/
I.
|
Because it's not an operator in itself. It's an assignment to a list, whose result is in turn fed into another assignment. It's just a means to create a list context. Others may find a better wording to describe it: possibly mine is not as technically accurate as it could be. Unfortunately, as others already explained, it's not reliable to use it with split.
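Spelled out step by step, the idiom might be read like this (a sketch only; the sample string comes from an earlier post in the thread and @tmp is a throwaway name introduced for illustration):

```perl
use strict;
use warnings;

my $text = 'Hi There! x';

# Step 1: assigning to the empty list () puts the match in list context,
# so m//g returns all matches (which are then discarded).
# Step 2: that list assignment, evaluated in scalar context, yields the
# number of elements on its right-hand side.
my $count = () = $text =~ /\S+/g;

# The same thing spelled out with a named (throwaway) array.
my $spelled_out = scalar( my @tmp = $text =~ /\S+/g );

print "$count $spelled_out\n";   # 3 3
```

The empty list simply saves naming an array that would never be used again.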
|
Hm, sorry to have asked, as I see the two prominent solutions to the problem had already been discussed here
(hope linking works as expected now, preview was fine... if not: was meant to link here http://www.perlmonks.org/?node_id=527973)
I.
|
You should not be sorry. Even if the topic seems trivial and elementary, it turned out to be more complex than one would probably think, and thus the discussion has been very interesting.
BTW: to insert a link you should use [id://527973] or [id://527973|here], which render like Perl Idioms Explained - my $count = () = /.../g and here respectively. This is the preferred way since they will bring up the correct link both if you're in http://perlmonks.org and http://www.perlmonks.org, or any other possible mirror. See this node for more info.
Re: Counting the number of items returned by split without using a named array
by smokemachine (Hermit) on May 03, 2006 at 21:16 UTC
Re: Counting the number of items returned by split without using a named array
by lima1 (Curate) on May 03, 2006 at 10:09 UTC
|
I don't know if it's the "most elegant way", but if you don't want to generate an array, count the blanks, e.g.:
my $entries = 1;
s/\s/$entries++/eg;
|
No, no, no, that would modify the original string in a most probably unwanted way. And if you really wanted to do it, then probably it should have been \s+. But you do not want to do so: a match would be better suited.
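A small sketch of the side effect in question, next to a match-based count that leaves the string alone (the input strings and variable names are made up for illustration):

```perl
use strict;
use warnings;

# Destructive: the substitution replaces each whitespace character
# with the current value of the counter.
my $work    = "a b c";
my $entries = 1;
$work =~ s/\s/$entries++/eg;
print "entries=$entries, string is now '$work'\n";   # entries=3, 'a1b2c'

# Non-destructive: a global match in list context leaves the string alone.
my $str   = "a b c";
my $count = () = $str =~ /\S+/g;
print "count=$count, string is still '$str'\n";      # count=3, 'a b c'
```

The match-based version also copes with runs of several whitespace characters, since it counts the tokens rather than the gaps.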
|
I know that, of course. It is just one quick and dirty way to count blanks (hence the "e.g."), but the goatse thing is nicer, I must admit.