Beefy Boxes and Bandwidth Generously Provided by pair Networks
P is for Practical
 
PerlMonks  

Re^2: Regex Extraction Help

by Flexx (Pilgrim)
on Aug 09, 2012 at 17:02 UTC ( #986558=note: print w/ replies, xml ) Need Help??


in reply to Re: Regex Extraction Help
in thread Regex Extraction Help

Well, I guess all the fields are variable, and what invaderzard meant, was to get that second field.

So I'd suggest this:

# assuming the raw data is in $line. $line =~ m/^[^;]*;\s*([^;]*?)\s*;/ # $1 now holds whatever is between the second and third # semicolon, leading and trailing spaces trimmed.

Now, what am I doing here?

First I say: Let's start at the beginning (^). This is important, since we can't exclude the possibility that the pattern repeats in one instance of $line.

Next, I say: give me zero or more non-semicolon characters ([^;]*), followed by exactly one semicolon (;).

Now our "cursor" would be in the second field, quasi. We say, well, there might or might not be some leading space (\s*). Then comes the data we want, that's why we use parentheses to capture it. What do we wanna capture? Well, again, anything not a semicolon ([^;]*?), but this time, non-greedily (using the *? quantifier.). Well, that's because we want any trailing space to go into the \s* that follows, instead of it being captured. Lastly, we need to require that the field is terminated by exactly one semicolon (;).

If you want to capture other fields as well, then a solution using split, like it's been suggested below is a more efficient way of doing it. If you want just a few fields of a long CSV record (which this seems to be, only demimited by semicola instead of kommas, then you also could expand on the regexp above, which might be a bit more performant than split. But I didn't really check that with benchmarks. Just an inkling I'd have, and very dependent on the length of the input, and the number of fields in it.

Cheers,
Flexx


Comment on Re^2: Regex Extraction Help
Select or Download Code

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://986558]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others studying the Monastery: (10)
As of 2014-12-18 08:44 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    Is guessing a good strategy for surviving in the IT business?





    Results (47 votes), past polls