Beefy Boxes and Bandwidth Generously Provided by pair Networks
good chemistry is complicated,
and a little bit messy -LW

Re: Regex Conundrum

by SciDude (Friar)
on Jun 12, 2004 at 19:38 UTC ( #363646=note: print w/replies, xml ) Need Help??

in reply to Regex Conundrum

There is a standard solution to this problem, mostly from Mastering Regular Expressions:

# You need to match a double quoted string with the following regex # [^"\\]*(\\.[^"\\]*)*",? # # But to get the text between double quotes use some ( ) # ([^"\\]*(\\.[^"\\]*)*)",? # gets text inside quotes as $1 # # but you also have non quoted fields, thus # ([^,]+),? # which should match things optionally followed by a comma # # and then a match for separation commas # , # # this must be repeated with m/.../g

Before attempting this yourself, take at look at Text::ParseWords and the quotewords routine. This should solve your problem. If the module is not available to you then the following untested code from Mastering Regular Expressions should work:

@fields = (); while ($text =~ m/"([^"\\]*(\\.[^"\\]*)*)",?|([^,]+),?|,/g { push (@fields, defined ($1) ? $1 : $3) ; } push (@fields, undef) if $text =~ m/,$/; # Account for the special cas +e of an empty last field. # all data is now in @fields

Note: untested.

The first dog barks... all other dogs bark at the first dog.

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://363646]
and all is quiet...

How do I use this? | Other CB clients
Other Users?
Others perusing the Monastery: (4)
As of 2017-03-24 02:23 GMT
Find Nodes?
    Voting Booth?
    Should Pluto Get Its Planethood Back?

    Results (295 votes). Check out past polls.