PerlMonks  

How to get split $var to work like split ' '?

by QM (Vicar)
on Sep 10, 2013 at 15:35 UTC ( #1053295=perlquestion )
QM has asked for the wisdom of the Perl Monks concerning the following question:

I have a script that optionally allows the split pattern to be specified on the command line. The default pattern is:
my $split_pattern = ' ';

However, this doesn't DWIM with the default:

my @x = split $split_pattern, $line;

Is there a nifty-keen way of getting the default ' ' behavior from a $variable pattern? Or am I stuck with something like this:

my $split_pattern;
...
my @x = defined($split_pattern) ? split $split_pattern, $line : split ' ', $line;

or even

my @x = split defined($split_pattern)?$split_pattern:' ', $line;

I'm on 5.10.1.

Update: See Re^2: How to get split $var to work like split ' '?.

-QM
--
Quantum Mechanics: The dreams stuff is made of

Re: How to get split $var to work like split ' '?
by Laurent_R (Parson) on Sep 10, 2013 at 15:50 UTC
    $split_pattern = ' ' unless defined $split_pattern;

    Update: I have just given a way to set the default pattern you asked for when no other pattern has been defined. But you might consider using something other than ' ', because of its special behavior. For example:

    $split_pattern = qr/ / unless defined $split_pattern;
Re: How to get split $var to work like split ' '?
by keszler (Priest) on Sep 10, 2013 at 15:53 UTC
Re: How to get split $var to work like split ' '?
by LanX (Canon) on Sep 10, 2013 at 15:58 UTC
    I can only guess which default behavior you want, b/c you didn't make it clear.

    If you want repeated whitespace to be ignored, then simply tell split to do so.

    DB<113> split ' ', 'abc  def'
     => ("abc", "def")
    DB<114> $del=qr/ /
     => qr/ /
    DB<115> split $del, 'abc  def'
     => ("abc", "", "def")
    DB<116> $del=qr/\s+/
     => qr/\s+/
    DB<117> split $del, 'abc  def'
     => ("abc", "def")

    Cheers Rolf

    ( addicted to the Perl Programming Language)

    update

    split :

    As a special case, specifying a PATTERN of space (' ') will split on white space just as "split" with no arguments does. Thus, "split(' ')" can be used to emulate awk's default behavior, whereas "split(/ /)" will give you as many null initial fields as there are leading spaces. A "split" on "/\s+/" is like a "split(' ')" except that any leading whitespace produces a null first field. A "split" with no arguments really does a "split(' ', $_)" internally.
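The three cases in the perldoc passage above can be compared side by side. This is a minimal sketch; the sample `$line` is an assumed example, not one from the thread.

```perl
use strict;
use warnings;

# Assumed sample: two leading spaces, a double space in the middle,
# and two trailing spaces.
my $line = "  one two  three  ";

# Literal ' ': awk-style. Leading whitespace is skipped and runs collapse.
my @awk = split ' ', $line;

# / /: every single space is a separator, so empty fields appear.
my @single = split / /, $line;

# /\s+/: runs collapse, but leading whitespace yields one empty first field.
my @runs = split /\s+/, $line;

printf "%d: [%s]\n", scalar @awk,    join '|', @awk;     # 3: [one|two|three]
printf "%d: [%s]\n", scalar @single, join '|', @single;  # 6: [||one|two||three]
printf "%d: [%s]\n", scalar @runs,   join '|', @runs;    # 4: [|one|two|three]
```

In all three cases trailing empty fields are dropped, because no LIMIT is given.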

      I'd like the split ' ' behavior, unless changed from the command line option. Something like this:
      my $split_pattern = ' '; # default
      ...
      # $split_pattern might get changed here from a command line option
      $split_pattern = $foo if $bar;
      ...
      $line = "   one two three   \n";
      my @words = split $split_pattern, $line;
      for my $i (0..$#words) {
          print "$i: ($words[$i])\n";
      }

      # Expected output for default case
      0: one
      1: two
      2: three

      # Actual output for default case
      0: ()
      1: ()
      2: ()
      3: (one)
      4: (two)
      5: (three)
      6: ()
      7: ()
      8: (
      )

      -QM
      --
      Quantum Mechanics: The dreams stuff is made of

        Now I understand what you are looking for; it wasn't clear to me until now. If you store ' ' in a variable, you don't get the same behavior as when you hard-code it:

        my @fields = split ' ', $line;

        Using qr/\s+/ as the split pattern improves the result but still does not yield what you want (you still get an extra empty element at the beginning of the array).

        Well, then I don't see any real simple direct solution. Even something like this:

        my @fields = split $pattern? $pattern : ' ', $line;

        does not do what you are looking for. The only solutions I can think of are to use the above qr/\s+/ and shift off the first element if it is empty, to use an if/else construct, or to use this equivalent conditional expression:

        my @fields  = defined $pattern? split $pattern, $line : split ' ', $line;
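The "qr/\s+/ and shift" option mentioned above could look roughly like this. The helper name `split_drop_leading_empty` is made up for illustration, and the sample line assumes three leading and three trailing spaces.

```perl
use strict;
use warnings;

# Hypothetical helper: split on $pattern, then drop a single empty
# leading field, which is what qr/\s+/ produces for leading whitespace.
sub split_drop_leading_empty {
    my ($pattern, $line) = @_;
    my @fields = split $pattern, $line;
    shift @fields if @fields && $fields[0] eq '';
    return @fields;
}

my @fields = split_drop_leading_empty(qr/\s+/, "   one two three   \n");
print join(',', @fields), "\n";   # one,two,three
```

One caveat with this approach: it also discards an empty first field for patterns where that field is meaningful, for example splitting ",a,b" on qr/,/.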

      Many thanks. I had not seen that bit.

Re: How to get split $var to work like split ' '?
by BrowserUk (Pope) on Sep 10, 2013 at 16:45 UTC

    You'll need to use an if statement:

    my $split_pattern = getOpt();   # placeholder for the command-line lookup
    my @x;
    unless( defined $split_pattern ) {
        @x = split ' ', $line;
    }
    else {
        @x = split $split_pattern, $line;
    }

    Not elegant, but the only way. The special behaviour of split ' ' is triggered by the use of exactly ' ' in the source code.

    Substituting a variable set to a space, or even a conditional statement where one branch is ' ', will not do the same thing.
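If the if/else is needed in more than one place, it can be hidden in a small wrapper. This is only a sketch: `split_fields` is a hypothetical name, and an undefined pattern is assumed to mean "use the awk-style default".

```perl
use strict;
use warnings;

# Hypothetical wrapper: the literal ' ' appears in the source, so the
# awk-style special case applies whenever no pattern is supplied.
sub split_fields {
    my ($pattern, $line) = @_;
    return defined $pattern
        ? split($pattern, $line)
        : split(' ', $line);
}

my @default = split_fields(undef,   "   one two three   \n");
my @custom  = split_fields(qr/\|/,  "a|b|c");

print scalar(@default), " default fields\n";   # 3 default fields
print scalar(@custom),  " custom fields\n";    # 3 custom fields
```

As far as I can tell from perl5180delta, Perl 5.18 and later also apply the awk-style rule when the pattern expression evaluates to a single-space string, so the workaround matters mainly on older perls such as the 5.10.1 mentioned in the question. The wrapper behaves the same either way.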


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      UPDATE: Pls ignore this post. Thanks to LanX above I now understand what the problem is.

      Can you explain what the supposedly special behavior of split ' ' is? I cannot find any documentation saying anything about it. For me this works:

      use strict;
      use warnings;
      use Data::Dumper;

      my $str = "I want a split pattern";
      my $pattern = $ARGV[0] // ' ';
      my @pieces = split /$pattern/, $str;
      print Dumper \@pieces;

      hdb$ perl split4.pl
      $VAR1 = [
                'I',
                'want',
                'a',
                'split',
                'pattern'
              ];
      hdb$ perl split4.pl a
      $VAR1 = [
                'I w',
                'nt ',
                ' split p',
                'ttern'
              ];
        Search for "specifying a PATTERN of space" in perldoc -f split. The difference relates to leading empty strings.

        You may also want to use quotemeta. e.g.

        ski@anito:~$ perl -e 'use strict; use warnings; my $x = "a.*b.*c.*d"; my $pattern = ".*"; my @x = split /$pattern/, $x; print join " - ",@x'
        ski@anito:~$

        vs.

        ski@anito:~$ perl -le 'use strict; use warnings; my $x = "a.*b.*c.*d"; my $pattern = quotemeta(".*"); my @x = split /$pattern/, $x; print join " - ",@x'
        a - b - c - d
        ski@anito:~$

        perldoc -f quotemeta.
Re: How to get split $var to work like split ' '?
by Eily (Hermit) on Sep 10, 2013 at 19:25 UTC

    There's always the eval solution.

    my $sep = ' ';
    $, = ', ';
    $\ = $/;
    my $string = "          Hello World";
    print split ' ', $string;
    print split $sep, $string; # This does not work
    print eval qq< split '$sep', \$string >;

    Hello, World
    , , , , , , , , , , Hello, World
    Hello, World

    This could do the trick as long as you don't try to use regexes. And maybe escaping the quotes in $sep would be a good idea too :P

      Yes, that occurred to me later.

      Still, to follow the Principle of Least Astonishment, I'd like to be able to specify a regex. To avoid further if then's, I'd probably set the default like so:

      my $sep = q/' '/;
      $sep = shift @ARGV if @ARGV; # you get the idea
      my $line = " one two three \n";
      my @words = eval qq(split $sep, \$line);

      Then something evaluating to a regex must be used, such as:

      $sep = q(/\s+/);
      $sep = q(//);
      $sep = q(/two|\s+/);
      $sep = q(m|/|);

      But the delimiters must be part of the string.

      (I'm ignoring for now the security issue of letting the user specify a regex, as this is a script just for me. And I'm not up to untainting the full regex language, so I can't see letting this into the wild hands of colleagues.)
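For comparison, here is a sketch that accepts a regex from the command line without a string eval. It assumes the user passes the bare regex text (no delimiters), and `compile_separator` is a hypothetical helper name. The block eval only traps regex syntax errors; it does not execute user-supplied code.

```perl
use strict;
use warnings;

# Hypothetical helper: compile an optional regex string.
# Returns undef when nothing was supplied, meaning "use the ' ' default".
sub compile_separator {
    my ($text) = @_;
    return unless defined $text;
    my $re = eval { qr/$text/ };   # block eval: catches a malformed pattern
    die "Bad separator pattern '$text': $@" unless defined $re;
    return $re;
}

my $sep   = compile_separator(shift @ARGV);
my $line  = "   one two three   \n";
my @words = defined $sep ? split($sep, $line) : split(' ', $line);

print "$_: ($words[$_])\n" for 0 .. $#words;
```

With no argument this takes the literal ' ' branch and prints three fields. A user-supplied pattern can still be slow or surprising, so this narrows the risk of the eval approach rather than removing it.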

      -QM
      --
      Quantum Mechanics: The dreams stuff is made of
