PerlMonks  

How to get split $var to work like split ' '?

by QM (Vicar)
on Sep 10, 2013 at 15:35 UTC ( #1053295=perlquestion )
QM has asked for the wisdom of the Perl Monks concerning the following question:

I have a script that optionally allows the split pattern to be specified on the command line. The default pattern is:
my $split_pattern = ' ';

However, this doesn't DWIM with the default:

my @x = split $split_pattern, $line;

Is there a nifty-keen way of getting the default ' ' behavior from a $variable pattern? Or am I stuck with something like this:

my $split_pattern;
...
my @x = defined($split_pattern)
      ? split $split_pattern, $line
      : split ' ', $line;

or even

my @x = split defined($split_pattern)?$split_pattern:' ', $line;

I'm on 5.10.1.

Update: See Re^2: How to get split $var to work like split ' '?.

-QM
--
Quantum Mechanics: The dreams stuff is made of

Replies are listed 'Best First'.
Re: How to get split $var to work like split ' '?
by BrowserUk (Pope) on Sep 10, 2013 at 16:45 UTC

    You'll need to use an if statement:

    my $split_pattern = getOpt();   # undef when no pattern was given
    my @x;
    if ( defined $split_pattern ) {
        @x = split $split_pattern, $line;
    }
    else {
        @x = split ' ', $line;
    }

    Not elegant, but the only way. The special behaviour of split ' ', is triggered by the use of exactly ' ' in the source code.

    Substituting a variable set to a space, or even a conditional statement where one branch is ' ', will not do the same thing.
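The if/else above can be wrapped once in a small helper so the rest of the script never repeats the branch. This is only a sketch: `split_fields` is a hypothetical name, and treating an undefined pattern (or a single-space string) as "use the default" is an assumption about how the option is stored. On the 5.10.1 discussed here only a literal ' ' triggers the special case; perldoc -f split notes that from Perl 5.18 on any expression evaluating to a single space does too, and the helper behaves the same either way.

```perl
use strict;
use warnings;

# Hypothetical helper: undef (or a single space) selects awk-style splitting.
sub split_fields {
    my ($pattern, $line) = @_;

    # The literal ' ' in the source is what triggers the special case.
    return split ' ', $line
        if !defined $pattern || $pattern eq ' ';

    # Anything else is used as an ordinary split pattern.
    return split $pattern, $line;
}

my @default = split_fields(undef, "  one two  three \n");
my @custom  = split_fields(',',   "a,b,,c");

print join('|', @default), "\n";   # one|two|three
print join('|', @custom),  "\n";   # a|b||c
```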



      UPDATE: Please ignore this post. Thanks to LanX above I now understand what the problem is.

      Can you explain what the supposedly special behavior of split ' ' is? I cannot find any documentation saying anything about it. For me this works:

      use strict;
      use warnings;
      use Data::Dumper;

      my $str     = "I want a split pattern";
      my $pattern = $ARGV[0] // ' ';
      my @pieces  = split /$pattern/, $str;
      print Dumper \@pieces;

      hdb$ perl split4.pl
      $VAR1 = [ 'I', 'want', 'a', 'split', 'pattern' ];
      hdb$ perl split4.pl a
      $VAR1 = [ 'I w', 'nt ', ' split p', 'ttern' ];

        You may also want to use quotemeta. e.g.

        ski@anito:~$ perl -e 'use strict; use warnings; my $x = "a.*b.*c.*d"; my $pattern = ".*"; my @x = split /$pattern/, $x; print join " - ",@x'
        ski@anito:~$

        vs.

        ski@anito:~$ perl -le 'use strict; use warnings; my $x = "a.*b.*c.*d"; my $pattern = quotemeta(".*"); my @x = split /$pattern/, $x; print join " - ",@x'
        a - b - c - d
        ski@anito:~$

        perldoc -f quotemeta.
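The same escaping can be written inline with \Q...\E, which applies quotemeta to the interpolated variable. A minimal sketch, assuming the separator is meant as a literal string rather than a regex (the variable names are illustrative only):

```perl
use strict;
use warnings;

my $sep = '.*';            # hypothetical user-supplied literal separator
my $str = 'a.*b.*c.*d';

# \Q...\E quotes regex metacharacters in the interpolated value.
my @parts = split /\Q$sep\E/, $str;

print join(' - ', @parts), "\n";   # a - b - c - d
```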
        Search for "specifying a PATTERN of space" in perldoc -f split. The difference relates to leading empty strings.
Re: How to get split $var to work like split ' '?
by LanX (Canon) on Sep 10, 2013 at 15:58 UTC
    I can only guess which default behavior you want, b/c you didn't make it clear.

    If you want repeated whitespaces to be ignored, then simply tell split to do so.

    DB<113> split ' ', 'abc  def'
     => ("abc", "def")
    DB<114> $del=qr/ /
     => qr/ /
    DB<115> split $del, 'abc  def'
     => ("abc", "", "def")
    DB<116> $del=qr/\s+/
     => qr/\s+/
    DB<117> split $del, 'abc  def'
     => ("abc", "def")

    Cheers Rolf

    ( addicted to the Perl Programming Language)

    update

    split :

    As a special case, specifying a PATTERN of space (' ') will split on white space just as "split" with no arguments does. Thus, "split(' ')" can be used to emulate awk's default behavior, whereas "split(/ /)" will give you as many null initial fields as there are leading spaces. A "split" on "/\s+/" is like a "split(' ')" except that any leading whitespace produces a null first field. A "split" with no arguments really does a "split(' ', $_)" internally.
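The three cases in that passage can be compared side by side. This is a small illustrative sketch using literal patterns only, so it does not depend on how a variable pattern is treated; the sample string and the `show` helper are made up for the demonstration.

```perl
use strict;
use warnings;

my $line = "  a b";   # two leading spaces

my @awk    = split ' ',   $line;   # leading whitespace is skipped
my @single = split / /,   $line;   # one empty field per leading space
my @runs   = split /\s+/, $line;   # a single empty first field

# Hypothetical helper: label, field count, then each field in parentheses.
sub show {
    my ($label, @fields) = @_;
    return sprintf "%-6s %d: %s", $label, scalar @fields,
        join ',', map { "($_)" } @fields;
}

print show("' '",   @awk),    "\n";   # 2 fields: (a),(b)
print show('/ /',   @single), "\n";   # 4 fields: (),(),(a),(b)
print show('/\s+/', @runs),   "\n";   # 3 fields: (),(a),(b)
```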

      I'd like the split ' ' behavior, unless changed from the command line option. Something like this:
      my $split_pattern = ' '; # default
      ...
      # $split_pattern might get changed here from a command line option
      $split_pattern = $foo if $bar;
      ...
      $line = "   one two three   \n";
      my @words = split $split_pattern, $line;
      for my $i (0..$#words) {
          print "$i: ($words[$i])\n";
      }

      # Expected output for default case
      0: one
      1: two
      2: three

      # Actual output for default case
      0: ()
      1: ()
      2: ()
      3: (one)
      4: (two)
      5: (three)
      6: ()
      7: ()
      8: (
      )

      -QM
      --
      Quantum Mechanics: The dreams stuff is made of

        Now I understand what you are looking for; it wasn't clear to me before. If you store ' ' in a variable, you don't get the same behavior as when you hard-code it:

        my @fields = split ' ', $line;

        Using "qr/\s+/;" as a split pattern improves the result but still does not yield what you want (you still get an extra empty element at the beginning of the array).

        Well, then I don't see any really simple, direct solution. Even something like this:

        my @fields = split $pattern? $pattern : ' ', $line;

        does not do what you are looking for. The only solutions I can think of are either to use the above "qr/\s+/;" and shift the first element if it is empty, or to use an if/else construct, or this equivalent construct:

        my @fields  = defined $pattern? split $pattern, $line : split ' ', $line;
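The first option mentioned above (split on qr/\s+/ and drop an empty first element) might look like the sketch below. The sample line is an assumption. Note that with any other pattern an empty first field may be meaningful, so the shift should only be applied when the default whitespace pattern is in use.

```perl
use strict;
use warnings;

my $pattern = qr/\s+/;                    # stand-in for the default
my $line    = "   one two three   \n";    # illustrative input

my @fields = split $pattern, $line;

# Leading whitespace produces one empty first field; discard it.
shift @fields if @fields && $fields[0] eq '';

print join('|', @fields), "\n";   # one|two|three
```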

      Many thanks. I had not seen that bit.

Re: How to get split $var to work like split ' '?
by Laurent_R (Monsignor) on Sep 10, 2013 at 15:50 UTC
    $split_pattern = ' ' unless defined $split_pattern;

    Update: I have just given a way to set the default pattern you asked for, to be used if no other pattern has been defined. But you might consider using something other than ' ', because of its special behavior. For example:

    $split_pattern = qr/ / unless defined $split_pattern;
Re: How to get split $var to work like split ' '?
by Eily (Curate) on Sep 10, 2013 at 19:25 UTC

    There's always the eval solution.

    my $sep = ' ';
    $, = ', ';
    $\ = $/;
    my $string = "          Hello World";
    print split ' ', $string;
    print split $sep, $string; # This does not work
    print eval qq< split '$sep', \$string >;

    Hello, World
    , , , , , , , , , , Hello, World
    Hello, World
    This could do the trick as long as you don't try to use regexes. And maybe escaping the quotes in $sep would be a good idea too :P

      Yes, that occurred to me later.

      Still, to follow the Principle of Least Astonishment, I'd like to be able to specify a regex. To avoid further if then's, I'd probably set the default like so:

      my $sep = q/' '/;
      $sep = shift @ARGV if @ARGV; # you get the idea
      my $line = " one two three \n";
      my @words = eval qq(split $sep, \$line);

      Then something evaluating to a regex must be used, such as:

      $sep = q(/\s+/);
      $sep = q(//);
      $sep = q(/two|\s+/);
      $sep = q(m|/|);

      But the delimiters must be part of the string.

      (I'm ignoring for now the security issue of letting the user specify a regex, as this is a script just for me. And I'm not up to untainting the full regex language, so I can't see letting this into the wild hands of colleagues.)
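One way to accept a regex without a string eval is to compile the argument with qr// inside a block eval, which catches syntax errors, and to keep undef as the marker for the default. This is a sketch under assumptions: `compile_separator` is a hypothetical name, and the argument is taken to be the regex body without delimiters. An interpolated pattern cannot run embedded code unless `use re 'eval'` is in effect, though a hostile pattern can still be slow.

```perl
use strict;
use warnings;

# Hypothetical helper: undef means "use the default"; otherwise compile
# the argument as a regex and die with a readable message if it is invalid.
sub compile_separator {
    my ($arg) = @_;
    return unless defined $arg;

    my $re = eval { qr/$arg/ };
    die "Invalid split pattern '$arg': $@" unless defined $re;
    return $re;
}

# Branch once: a literal ' ' for the default, the compiled regex otherwise.
sub split_line {
    my ($sep, $line) = @_;
    return defined $sep ? split($sep, $line) : split(' ', $line);
}

my $sep   = compile_separator( shift @ARGV );   # undef if no argument given
my @words = split_line($sep, " one two three \n");
print scalar(@words), " fields\n";
```

With no argument the line splits awk-style into three fields; an argument such as `,` or `\s*,\s*` is used as an ordinary pattern instead.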

      -QM
      --
      Quantum Mechanics: The dreams stuff is made of

Re: How to get split $var to work like split ' '?
by keszler (Priest) on Sep 10, 2013 at 15:53 UTC
