ignoring empty returns from split

by DeusVult (Scribe)
on Apr 24, 2001 at 01:26 UTC ( [id://74883]=perlquestion )

DeusVult has asked for the wisdom of the Perl Monks concerning the following question:

Is there any way to call split in such a way that it will return an array without any empty entries? What I want to do is something like the following:
my ( $one, $two ) = split /exp/, $line;

I can guarantee that the resulting array will have at least two entries, but it might have more than that, and some of them might be empty. I want $one and $two to contain the first two nonempty array entries.

The best way I can think of doing it now (and I use the word "best" very, very loosely) is declaring an array to hold the return from split, looping through it and pushing all nonempty entries onto yet another array, and taking $yetAnotherArray[0] and $yetAnotherArray[1] and storing them in $one and $two. That solution is simply so hideous I'd be embarrassed to use it.

I sort of thought I might be able to use map to do it, but I'm a map newbie and couldn't figure out exactly how to go about it.

If you have any trouble sounding condescending, find a Unix user to show you how it's done.
- Scott Adams

Replies are listed 'Best First'.
(Ovid) Re: ignoring empty returns from split
by Ovid (Cardinal) on Apr 24, 2001 at 01:36 UTC
    If I understand you correctly, the following should work:
    use strict;
    use warnings;

    my $line = "one====two==three";
    my ( $one, $two ) = grep { length > 0 } split /=/, $line;
    print "$one\n$two";

    Cheers,
    Ovid

    Update: Modified code sample to more accurately reflect poster's intention.

    Update 2: tadman does raise a point about grep{length} being sufficient. I'm of the opinion that while this may be sufficient, it might not be as immediately clear to the casual observer as grep{length>0}. Any monks care to comment on the style issue?

    Join the Perlmonks Setiathome Group or just click on the link and check out our stats.

      If it's not possible that length() could return a negative value, then isn't this sufficient:

           grep { length }

      Just curious. Not trying to be snarky.
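For what it's worth, here is a minimal sketch of the comparison, assuming every field is a defined string (which is what split hands back for plain delimiters like these). The sample list is made up for illustration. Since length never returns a negative number, both filters should keep exactly the same elements, including the string "0".

```perl
use strict;
use warnings;

# Hypothetical sample data: defined strings only, including "0" and "".
my @fields = ( "one", "", "0", "", "two" );

my @implicit = grep { length } @fields;        # keep if length is true (non-zero)
my @explicit = grep { length > 0 } @fields;    # keep if length is greater than 0

print join( ",", @implicit ), "\n";    # one,0,two
print join( ",", @explicit ), "\n";    # one,0,two
```

So the choice between the two looks like a readability question rather than a correctness one.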
      Ok, I tried this, but for some reason it still didn't work. I have code as follows:
      my @line = grep { length > 0 } split ( /\S+/, $_ );
      print "$line[0]\t$line[1]\n";
      Where $_ will look something like this:
      STATUS mandatory
      Now, I'd assume that it would therefore print:
      STATUS mandatory
      But instead, all I get are blank lines. Can anyone think what's going wrong?

      Also, grep{length>0} is much preferable to grep{length}. The latter accomplishes absolutely nothing, fails for returns of "" so is actually somewhat buggy, and detracts from readability. The only real reason I can see to use it is a desire to show off, an impulse which should be squelched at every opportunity.

      UPDATE: Ok, I'm an idiot, but I will leave my stupidity on display for others' benefit. I just noticed that I'm splitting on \S+, not \s+. So amazingly, perl did exactly what I told it to do. I just told it to do something extremely counter-productive : )

      "If you have any trouble sounding condescending, find a Unix user to show you how it's done."
      - Scott Adams

        You're splitting by \S+, this is a capital S and means anything but whitespace.
        This is probably not what you want, use a lowercase s.
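A small sketch of the difference, using the sample line from the post above (the variable names are made up for illustration). With \S+ the words themselves are treated as delimiters, so only the whitespace between them is left over as a field.

```perl
use strict;
use warnings;

my $input = "STATUS mandatory";    # sample line from the post above

# \S+ matches runs of NON-whitespace, so the words are the delimiters
# and only the text between them survives.
my @wrong = grep { length > 0 } split /\S+/, $input;

# \s+ matches runs of whitespace, so the words are the fields.
my @right = grep { length > 0 } split /\s+/, $input;

print scalar(@wrong), " field(s): [", join( "][", @wrong ), "]\n";    # 1 field(s): [ ]
print scalar(@right), " field(s): [", join( "][", @right ), "]\n";    # 2 field(s): [STATUS][mandatory]
```

That single space field is why the original print produced what looked like blank lines.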
Re: ignoring empty returns from split
by Anonymous Monk on Apr 24, 2001 at 06:10 UTC
    This is possible to solve using map. map removes the element in question from the list if an empty list is returned:

        @after = map { (length) ? $_ : () } @before;

    FYI, you can also return lists to insert entries. From perlfunc under map:

        Evaluates BLOCK or EXPR in list context, so each element of LIST may produce zero, one, or more elements in the returned value.
    However, grep is more suitable for this purpose. It's also faster:
    use Benchmark;

    @b = (1, 2, 3, undef, 4, 5, undef, 5, 6, 7, undef, 8, undef, 9,
          0, 0, 0, 0, 1, 3, 4, undef, undef, undef, 9);

    timethese(50_000, {
        'map'  => sub { my (@a); @a = map { (length) ? $_ : () } @b; },
        'grep' => sub { my (@a); @a = grep { length } @b; }
    });

    Benchmark: timing 50000 iterations of grep, map...
          grep:  6 wallclock secs ( 6.48 usr +  0.00 sys =  6.48 CPU) @ 7716.05/s (n=50000)
           map: 10 wallclock secs ( 9.28 usr +  0.00 sys =  9.28 CPU) @ 5387.93/s (n=50000)
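To illustrate the "zero, one, or more elements" part of the perlfunc quote, here is a rough sketch. The input list and variable names are made up, and it uses defined strings only so that it runs cleanly under warnings.

```perl
use strict;
use warnings;

my @before = ( "a", "", "b", "", "c" );    # hypothetical input, defined strings only

# Return () to drop an element, or the element itself to keep it.
my @kept = map { length($_) ? $_ : () } @before;

# Return a two-element list to insert an extra entry for each kept element.
my @doubled = map { length($_) ? ( $_, uc $_ ) : () } @before;

print "@kept\n";       # a b c
print "@doubled\n";    # a A b B c C
```

For plain filtering, grep still seems like the more natural fit; map earns its keep when the output needs to be reshaped as well.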
Re: ignoring empty returns from split
by Corion (Patriarch) on Apr 24, 2001 at 01:39 UTC

    My, admittedly lame, approach would be to first split and then fish out the wanted elements:

    use strict;

    my $string = "a,b,c,,d,e,,f";
    my @array = grep { $_ } split ",", $string;
    print join ":", @array;
Re: ignoring empty returns from split
by tadman (Prior) on Apr 24, 2001 at 01:36 UTC
    You could knock yourself out, such as:

         my ($one, $two) = grep { $_ } split (/exp/, $line);

    Or, you could just use the split parameters:

         my ($one, $two) = split (/exp/, $line, 2);

    Where 2 is the limit on the number of parameters you want back.
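A quick sketch of what the third argument to split does, using a made-up input string and a hypothetical helper called show that just brackets each field. As far as I can tell the limit only caps how many fields come back; it does not skip empty ones, and the last field keeps whatever was left unsplit.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap each field in brackets so empty ones are visible.
sub show { return join "|", map( "[$_]", @_ ) }

my $line = "=0==one=two";    # hypothetical input with a leading delimiter

# LIMIT caps the number of fields; the last field holds the unsplit remainder.
my @limited = split /=/, $line, 2;

# Without LIMIT every delimiter splits, and empty fields show up in between.
my @all = split /=/, $line;

print show(@limited), "\n";    # []|[0==one=two]
print show(@all), "\n";        # []|[0]|[]|[one]|[two]
```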
      Or, you could just use the split parameters:
      my ($one, $two) = split (/exp/, $line, 2);

      This will break for cases of "=0==one=two" since you'll return an empty and a zero.

      For those suggesting a simple $_ test for the grep, you will not return the 0. Best to go with the length() function.

      Update: I'd have to go with Ovid that "length > 0" is clearer than just length, but might there be a performance hit? (And would this code really be hurt by the performance hit?)
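A minimal sketch of the difference between the two tests, using the sample string from this reply (variable names are made up). A bare $_ test is a truth test, and the string "0" is false in Perl, so it gets thrown out along with the empty strings.

```perl
use strict;
use warnings;

my $line = "=0==one=two";    # sample input from the reply above

my @by_truth  = grep { $_ } split /=/, $line;        # drops "" and "0"
my @by_length = grep { length } split /=/, $line;    # drops only ""

print "@by_truth\n";     # one two
print "@by_length\n";    # 0 one two
```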

Re: ignoring empty returns from split
by Yohimbe (Pilgrim) on Apr 24, 2001 at 03:06 UTC
    my ($one,$two)= ( split /exp/,$line)[0,1];
    Just because there is always another way to do it.
    --
    Jay "Yohimbe" Thorne, alpha geek for UserFriendly
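One caveat worth sketching: a list slice picks fields by position, so on its own it will happily return empty ones. The input string below is made up for illustration. Filtering before slicing is one possible way to get the first two non-empty fields.

```perl
use strict;
use warnings;

my $line = "=0==one=two";    # hypothetical input with a leading delimiter

# A plain slice takes positions 0 and 1, empty or not.
my ( $first, $second ) = ( split /=/, $line )[ 0, 1 ];

# Filtering before slicing takes the first two non-empty fields instead.
my ( $one, $two ) = ( grep { length } split /=/, $line )[ 0, 1 ];

print "'$first' '$second'\n";    # '' '0'
print "'$one' '$two'\n";         # '0' 'one'
```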
Re: ignoring empty returns from split
by BrotherAde (Pilgrim) on Apr 24, 2001 at 03:39 UTC

    I think this can be done by choosing the split-regexp carefully:

    $line="one===two==three===four";
    ($one,$two,$three,$four)=split (/=+/,$line);
    print "$one\n$two\n$three\n$four";

    The + in the expression effectively tells split to treat multiple delimiters as one.

    Hope that helps
    Brother Ade

    Update
    Bug killed: $a!=$line, thanks sachmet...
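To make the effect of the + concrete, here is a small sketch that counts the fields with and without it, using the sample string from this post (the array names are made up).

```perl
use strict;
use warnings;

my $line = "one===two==three===four";    # sample input from the post above

my @single = split /=/,  $line;    # each "=" is its own delimiter
my @run    = split /=+/, $line;    # a run of "=" counts as one delimiter

print scalar(@single), "\n";    # 9
print scalar(@run), "\n";       # 4
```

The five extra fields in the first case are the empty strings between adjacent delimiters.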

      That fails on the case where $line starts with a =:
      $line = "==one==two==three";
      ($one,$two,$three,$four)=split (/=+/,$line);
      print "$one\n$two\n$three\n$four";
      gives me an empty string, 'one', 'two', and 'three', with $four left undefined.

      As a side note, '$line' != '$a' :-)

      Afterthought: This would work:
      $line = "==one==two==three=four";
      ($one,$two,$three,$four)=split (/=+/,($line=~/^=*(.*?)$/,$1)[1]);
      print "$one\n$two\n$three\n$four";
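For comparison, here is a sketch of a possibly more readable variation on the same idea: strip the leading delimiters from a copy of the string, then split. This is an alternative to the one-liner above, not a restatement of it, and the variable names are made up.

```perl
use strict;
use warnings;

my $line = "==one==two==three=four";    # sample input from the reply above

# A delimiter match at the very start of the string yields a leading "" field.
my @raw = split /=+/, $line;

# Assumed alternative: remove leading delimiters from a copy, then split.
( my $trimmed = $line ) =~ s/^=+//;
my @clean = split /=+/, $trimmed;

print scalar(@raw), "\n";    # 5
print "@clean\n";            # one two three four
```

A grep { length } after the split would get to the same place without the extra variable.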
