Re^2: Split function

by Rajsai2825 (Novice)
on Dec 04, 2012 at 07:06 UTC (#1007021)

in reply to Re: Split function
in thread Split function

Hi all,

My requirement is: I need to process a text file that is comma- or tab-delimited. Example input file:

    ABC,DEF,GHI,JKL

This code will process that file:

    my ($a,$b,$c,$d) = split(/[,\t]/, $_);

If a text file instead contains input like the following, the same split line should die:

    ABC|DEF|GHU|IJK

Replies are listed 'Best First'.
Re^3: Split function
by davido (Archbishop) on Dec 04, 2012 at 08:10 UTC

    This evil goes against my sense of sane coding, but it seems to be what you're asking for:

        use strict;
        use warnings;

        while( my $line = <DATA> ) {
            chomp $line;
            my( $a, $b, $c, $d ) = split /(?(?=^[^|]*\|)(?{die "Pipe [|] detected in input."})|)[,\t]/, $line;
            print "[($a)($b)($c)($d)]\n";
        }

        __DATA__
        ABC,DEF,GHI,JKL
        ABC|DEF|GHI|JKL

    This throws an exception from within the regex passed to split if the input string contains a pipe character. I wouldn't recommend bringing that to a code review, but given that none of the other solutions already provided seem to satisfy you, I am thinking that you'll only be happy when an exception is thrown as part of the split line. Despite the hackish nature of the code, it produces what you're requesting. Here's the output:

        [(ABC)(DEF)(GHI)(JKL)]
        Pipe [|] detected in input. at (re_eval 1) line 1, <DATA> line 2.

    It would be a lot better to just follow the advice of bart's post, or Colonel_Panic's post, in this same thread. And if neither of those posts does what you need, rather than just repeating your question again, explain exactly how their code fails to meet your needs. I find it hard to believe that your requirement is for the exact line containing the split to throw an exception. It seems a lot more reasonable to just ensure that an exception is thrown once split fails to produce reasonable output, or possibly to pre-screen the line of text and throw before you split, if a pipe character is found.
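
    As a rough sketch of the pre-screening idea (this is not the code from bart's or Colonel_Panic's posts), something like the following might do. The helper name split_record and the "exactly four fields" check are assumptions made for illustration only:

```perl
use strict;
use warnings;

# Hypothetical helper, not from the thread: reject lines containing a pipe,
# then split on commas or tabs and sanity-check the result.
sub split_record {
    my ($line) = @_;

    # Pre-screen: throw before splitting if a pipe character is found.
    die "Pipe [|] detected in input.\n" if index($line, '|') >= 0;

    my @fields = split /[,\t]/, $line;

    # Assumption: a valid record always has exactly four fields.
    die "Expected 4 fields, got " . scalar(@fields) . ".\n"
        unless @fields == 4;

    return @fields;
}

for my $line ("ABC,DEF,GHI,JKL", "ABC|DEF|GHI|JKL") {
    my @fields = eval { split_record($line) };
    if (my $err = $@) {
        print "Rejected: $err";
        next;
    }
    print "[", join("", map { "($_)" } @fields), "]\n";
}
```

    The exception comes from an ordinary statement next to the split rather than from inside the regex, which keeps the pattern itself a plain delimiter match.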

    Update: Just for fun, an explanation of the regex:

    (?(condition)true_regex|false_regex) creates a conditional. For our condition, we use a zero-width lookahead assertion, (?=^[^|]*\|), that detects whether a pipe character is found anywhere in the string. If that condition is satisfied, the "true_regex" gets tested. The "true_regex" that we use is a (?{code}) construct, which is used (or abused) to execute Perl code from within a regular expression. The code we execute is the die statement. For our "false_regex", we use an empty expression, which will not affect the rest of the split match. The remainder of the regex is just what we would normally pass to 'split'.
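
    For readers who want to see those pieces side by side, here is the same pattern laid out with the /x modifier so each part carries a comment. This is only an annotated sketch: the names $delimiter and split_or_die are made up for illustration, and a trailing newline has been added to the die message.

```perl
use strict;
use warnings;

# The pattern from the post, spread out under /x. Whitespace and comments
# inside the pattern are ignored by the regex engine.
my $delimiter = qr{
    (?(?=^[^|]*\|)                              # condition: from string start, is there a pipe ahead
        (?{ die "Pipe [|] detected in input.\n" })  # true branch: run Perl code that dies
      |                                         # false branch: empty, matches nothing
    )
    [,\t]                                       # the ordinary delimiter: comma or tab
}x;

# Hypothetical wrapper so the behaviour can be exercised on single lines.
sub split_or_die {
    my ($line) = @_;
    return split $delimiter, $line;
}

for my $line ("ABC,DEF,GHI,JKL", "ABC|DEF|GHI|JKL") {
    my @fields = eval { split_or_die($line) };
    if (my $err = $@) {
        print "died: $err";
        next;
    }
    print "[", join("", map { "($_)" } @fields), "]\n";
}
```

    Because the lookahead is anchored with ^, the condition can only be true on the match attempt at the very start of the string, which is why the die fires once per line and not once per delimiter.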

