PerlMonks  

Re^2: Match a comma between two words

by davido (Cardinal)
on Jul 03, 2014 at 00:13 UTC ( [id://1092066]=note )


in reply to Re: Match a comma between two words
in thread Match a comma between two words

  1. My post did not suggest that you use the CSV module. It suggested that my solution correctly splits your sample input, but that things could get a lot more complicated quickly, and that is why people often recommend and use the CSV module.

  2. I tried to answer your questions, but they were incomplete sentence fragments that contained ambiguities and didn't fully correlate with your sample input. If you didn't have the initiative to investigate and understand the regex I used, how do you expect to solve the harder problems?

  3. My regex matches the comma between two words if it is surrounded by a quote on the left, and a quote on the right, which is exactly how your sample input is formatted. If you want to just match a comma between words, the regex would look like:

    /(?<=\p{Alpha}),(?=\p{Alpha})/

    But that doesn't take into consideration the quotes your sample input demonstrated. Now you've got two answers: one to the question you asked, and one adapted to the question you implied by posting sample input that differs slightly from the question as worded.

  4. My split example does that. But if you prefer to match multiple commas on a single line, the answer to that exact question is:

    while( $line =~ /,/g ) { print "Matched a comma.\n"; }

    That's probably not the question you really wanted to ask, but we got in trouble for misinterpreting ambiguous or incorrectly specified questions already. I must be sick for being willing to try again after being subjected to hostility in response to my voluntary effort, but I will do so. Perhaps you mean something like this:

    while( $line =~ m/(?:^|,)"([\p{Alpha}\s]+)"(?=,|$)/g ) { print "$1\n"; }

    This works as long as none of the fields fall into the pitfalls I discussed in my original post. If they do, I recommend looking at the source code for Text::CSV to learn how that module handles the elevated rigors of balanced quotes and escaped delimiters.

    Here is the same regex in a slightly different scaffolding:

    print "$_\n" for $line =~ m/(?:^|,)"([\p{Alpha}\s]+)"(?=,|$)/g;
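To see how the three patterns above behave side by side, here is a minimal, self-contained sketch. The sample strings are assumptions made up for illustration; they are not the original poster's input. The last case shows what can go wrong when a quoted field contains a comma: the character class does not allow it, so that field is silently skipped rather than reported as an error.

```perl
use strict;
use warnings;

# Hypothetical sample line (an assumption, not the thread's actual input).
my $line = '"apple","banana split","cherry"';

# Count every comma on the line: m//g in list context, forced to a count.
my $commas = () = $line =~ /,/g;

# Capture each quoted field made up only of letters and whitespace.
my @fields = $line =~ m/(?:^|,)"([\p{Alpha}\s]+)"(?=,|$)/g;

# Split an unquoted line on commas that sit directly between two letters.
my @words = split /(?<=\p{Alpha}),(?=\p{Alpha})/, 'red,green,blue';

# A quoted field with an embedded comma falls outside the character class,
# so the regex skips it without complaint.
my $tricky  = '"a","b, c","d"';
my @partial = $tricky =~ m/(?:^|,)"([\p{Alpha}\s]+)"(?=,|$)/g;

print "commas: $commas\n";
print "field: $_\n"   for @fields;
print "word: $_\n"    for @words;
print "partial: $_\n" for @partial;
```

With these inputs the field regex returns all three fields of the first line but only `a` and `d` for the second, which is the kind of quiet data loss that makes a real CSV parser worth considering once the input gets messier.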

Dave
