PerlMonks  

What am I doing wrong with 'split'

by flexvault (Monsignor)
on Aug 11, 2013 at 09:36 UTC ( [id://1048991]=perlquestion )

flexvault has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

Usually when I use 'split', I define the separator as a non-printable character:

perl -e '$sep=chr(254);$s="85$sep"."mat\@com";($key,$email)=split($sep,$s);print"$key\t$email\n";'

And the result is:

85 mat@com

Now a client wants to see the separator in the 'select' drop down box, so I defined the separator with the following:

perl -e '$s="85|mat\@com";$sep="\|"; ($key,$email)=split($sep,$s); print "$key\t$email\n";'

And the result is:

8 5

If I do it this way, it works:

perl -e '$s="85|mat\@com";($key,$email)=split(/\|/,$s);print "$key\t$email\n";'

I know "|" is a special character, but what is going on?

Regards...Ed

"Well done is better than well said." - Benjamin Franklin

Replies are listed 'Best First'.
Re: What am I doing wrong with 'split'
by rnewsham (Curate) on Aug 11, 2013 at 09:41 UTC

    You need to escape your escape.

    perl -e '$s="85|mat\@com";$sep="\\\|"; ($key,$email)=split($sep,$s); print "$key\t$email\n";'
    85 mat@com
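
To make "escape your escape" concrete, here is a minimal sketch (the variable names are made up for illustration, not taken from the thread) that compares what each quoting style actually stores in the separator before split compiles it into a pattern:

```perl
use strict;
use warnings;

# What each quoting style stores in the variable.
my $dq_single = "\|";     # double quotes consume the backslash: the string is |
my $dq_double = "\\|";    # an escaped backslash survives: the string is \|
my $sq        = '\|';     # single quotes keep the backslash: the string is \|

my $s = "85|mat\@com";

for my $sep ($dq_single, $dq_double, $sq) {
    # split compiles the string in $sep as a regular expression.
    my ($key, $email) = split $sep, $s;
    printf "sep=%s key=%s email=%s\n", $sep, $key, $email;
}
```

Only the first form reproduces the "8 5" result from the question; the other two hand split a pattern with an escaped pipe.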

      rnewsham,

      Thanks, makes sense now that you pointed it out.

      Regards...Ed

      "Well done is better than well said." - Benjamin Franklin

Re: What am I doing wrong with 'split'
by Laurent_R (Canon) on Aug 11, 2013 at 10:45 UTC

    I think the point that you are perhaps missing is that split really uses a regular expression as pattern for splitting the input, not simply a character, whether printable or not. Just to illustrate this point:

    $ perl -e '$s="foobaaarbaz"; print join " ", split /ba+r/, $s;'
    foo baz

    As you can see, this code determined that "baaar" matches the pattern /ba+r/ and split the string on it. Once you realize this, split gives you much more expressive power. In the example you gave, the problem is that your $sep variable holds a string, and that string contains a character that is special in regexes. In a double-quoted string, "\|" is just "|": the backslash is consumed before split ever sees it, so the string is compiled into a regex containing an unescaped |. You don't have the problem if you define $sep as a regular expression instead of a string:

    $ perl -e '$s="85|mat\@com";$sep=qr/\|/;($key,$email)=split($sep,$s); print "$key\t$email\n";'
    85 mat@com

    Defining the $sep variable as a regex, using $sep=qr/\|/, suffices to solve the issue.

    To see how your string is transformed into a regex and compare to the real regex, consider this session under the Perl debugger:

      DB<1> $s = "\|"
      DB<2> $t = qr/$s/
      DB<3> p $t
    (?^:|)
      DB<4> $u = qr/\|/
      DB<5> p $u
    (?^:\|)

    As you can see, your $sep string is compiled into a bare alternation (the $t variable); it is no longer the escaped | ('\|') that you need (the $u variable). A bare alternation ('|') has an empty pattern on each side, so it matches the empty string and split breaks the input into individual characters. Your code was retrieving the first two elements of the list returned by split, i.e. the first two characters of your string.
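
The character-by-character behaviour described above can be checked directly. This is only an illustrative sketch (the array names are invented here), using the same input string as the question:

```perl
use strict;
use warnings;

my $s = "85|mat\@com";

# A bare | matches the empty string, so split cuts between every character.
my @chars = split /|/, $s;
print scalar(@chars), " pieces: @chars\n";

# Assigning to two scalars keeps only the first two pieces.
my ($key, $email) = split /|/, $s;
print "$key\t$email\n";

# Escaping the pipe makes it a literal separator again.
my @fields = split /\|/, $s;
print "@fields\n";
```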

Re: What am I doing wrong with 'split'
by AnomalousMonk (Archbishop) on Aug 11, 2013 at 17:28 UTC

    Another approach is to use the power of Q... er, quotemeta to unmajick those pesky regex metacharacters.

    >perl -wMstrict -le
    "my $s = '85|mat@com';
     print qq{'$s'};
     ;;
     my $sep = '|';
     my ($key, $email) = split qq{\Q$sep\E}, $s;
     ;;
     print qq{key '$key' email '$email'};
    "
    '85|mat@com'
    key '85' email 'mat@com'
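
If the separator may change again later, the quotemeta idea could be wrapped in a small helper. The sub below is hypothetical (split_on_literal is a name invented for this sketch, not something from the thread or from core Perl); it assumes the separator should always be treated as literal text:

```perl
use strict;
use warnings;

# Hypothetical helper: split $string on $sep, treating $sep as literal text.
sub split_on_literal {
    my ($sep, $string) = @_;
    my $re = quotemeta $sep;    # same effect as \Q$sep\E in a pattern
    return split /$re/, $string;
}

# Try a few separators that are regex metacharacters.
for my $sep ('|', '.', '+', '||') {
    my $record = join $sep, 85, 'mat@com';
    my ($key, $email) = split_on_literal($sep, $record);
    printf "sep %s: key %s email %s\n", $sep, $key, $email;
}
```

With this approach the calling code never needs to know whether the chosen separator happens to be special in a regex.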
