PerlMonks  

Reg Exp to handle variations in the matched pattern

by markjrouse (Initiate)
on Feb 22, 2012 at 13:04 UTC

markjrouse has asked for the wisdom of the Perl Monks concerning the following question:

How can I search for a specific string pattern when the matches to that pattern come in several variations? I'm trying to find two regexp patterns:

    pattern 1 = \s-\r
    pattern 2 = :\r

So I use (\s-\r)|(:\r). This works, but it's pattern 2 that matches different variations. As an example, in my text file I can have different pattern 2 cases like this:

    match2a = "this is text:\r" or
    match2b = "this is text:\rAAAADDD" or
    match2c = "this is text:\r82323"

What I'm looking for is to perhaps modify my reg exp in such a way that pattern 2 only matches match2a. I want to exclude the match2b/c matches. I thought along the lines of:

(\s-\r)|(:\r)([^:\r(\d|\w]))

But of course this doesn't work. Any suggestions?

Replies are listed 'Best First'.
Re: Reg Exp to handle variations in the matched pattern
by moritz (Cardinal) on Feb 22, 2012 at 13:12 UTC

    I don't understand your question. It would be nice if you provided several pieces of text that are supposed to match, and several that are supposed not to match, and what problem you encounter.

    One thing that looks suspicious is your use of character classes. For example [^:\r(\d|\w] matches everything except the colon, \r, the vertical pipe, the opening paren, digits and word characters. That's not what you want, is it?
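A minimal sketch of the point above about character classes: inside [...] the paren and the pipe are literal characters, so the negated class rejects them along with the colon, \r, digits and word characters. The test characters are chosen here purely for illustration.

```perl
use strict;
use warnings;

# Inside [...], '(' and '|' are ordinary characters, not grouping or alternation.
my $class = qr/[^:\r(\d|\w]/;

# Characters the negated class rejects: colon, CR, '(', '|', digits, word characters.
my @rejected = grep { $_ !~ $class } (':', "\r", '(', '|', '7', 'a', '_');

# Characters it accepts: anything else, for example a space, a dash, a newline or ')'.
my @accepted = grep { $_ =~ $class } (' ', '-', "\n", ')');

printf "rejected: %d of 7, accepted: %d of 4\n", scalar @rejected, scalar @accepted;
```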

    Also, your last regex has an unbalanced ).

    What I'm looking for is to perhaps modify my reg exp in such a way that pattern 2 only matches match2a

    The regex /^this is text:\r$/ would do that trick. Is that what you want?
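As a quick check of the anchored regex above, here is a small sketch that runs it against the three sample strings from the question. It assumes each sample is tested as a whole string on its own, which the thread does not confirm.

```perl
use strict;
use warnings;

# The fully anchored pattern suggested above.
my $re = qr/^this is text:\r$/;

# match2a, match2b and match2c from the question, in that order.
my @samples = (
    "this is text:\r",
    "this is text:\rAAAADDD",
    "this is text:\r82323",
);

# 1 where the whole string matches, 0 otherwise.
my @flags = map { /$re/ ? 1 : 0 } @samples;

print "@flags\n";
```

Only the first sample should pass, because the $ anchor refuses anything after the \r.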

      Essentially, it's match any text where there is:

        a space, followed by a dash, followed by a carriage return OR a colon, followed by a carriage return BUT NOT a colon, followed by carriage return, followed by a digit, or a letter.

      One of the text files is actually located here: http://www.treasury.gov/resource-center/sanctions/SDN-List/Documents/sdnew02.txt

      I'm not interested in the text before the colon, as I want to search and replace, but I'm having problems getting the regexp just right.

        a space, followed by a dash, followed by a carriage return OR a colon, followed by a carriage return

        So far that's simple: / -[:\r]\r/

        BUT NOT a colon, followed by carriage return

        If you're looking for two carriage returns in a row, then you'll never find something where the first carriage return is followed by a colon (because then it's not two carriage returns in a row, d'oh), so I don't see why you emphasize it like that.

        followed by carriage return, followed by a digit, or a letter.

        \r\w

        One of the text files is actually located here: http://www.treasury.gov/resource-center/sanctions/SDN-List/Documents/sdnew02.txt

        The pattern you describe matches nowhere in that file; in fact I can't find a single occurrence of a carriage return in that file.

        If you describe what information you want to extract from that file, we might be able to help you. But right now it seems that you don't have a clear mental image yourself, so it's pretty hard to help you.
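For what it's worth, one possible reading of the description quoted in this subthread can be sketched with a negative lookahead. This is only an assumption about the intent: " -\r" always matches, and ":\r" matches only when it is not followed by a letter or a digit. The sample named pattern1 is made up here for illustration; the other three come from the question.

```perl
use strict;
use warnings;

# Assumed reading of the requirement (not confirmed in the thread):
#   " -\r" always matches; ":\r" matches only when no letter or digit follows.
my $re = qr/\s-\r|:\r(?![A-Za-z0-9])/;

my %samples = (
    match2a  => "this is text:\r",
    match2b  => "this is text:\rAAAADDD",
    match2c  => "this is text:\r82323",
    pattern1 => "some text -\r",    # hypothetical sample for pattern 1
);

# Record 1 for a match and 0 for no match, per sample name.
my %result;
for my $name (keys %samples) {
    $result{$name} = $samples{$name} =~ $re ? 1 : 0;
}

for my $name (sort keys %result) {
    printf "%-8s %s\n", $name, $result{$name} ? 'matches' : 'no match';
}
```

The lookahead (?!...) consumes nothing, so a search and replace built on this pattern would leave whatever follows the \r untouched.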

Re: Reg Exp to handle variations in the matched pattern
by bitingduck (Chaplain) on Feb 22, 2012 at 15:46 UTC

    I agree that something's not quite clear about what you're trying to match.

    Do you want the \r to be at the end of the line? Then you can use the $ anchor at the end of the regex.
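A small sketch of the $ suggestion above, assuming the text is held in one string with \r\n line endings (the sample text is invented for illustration). With the /m modifier, $ matches before each "\n" as well as at the end of the string, so a ":\r" that is followed by more text on the same line is skipped.

```perl
use strict;
use warnings;

# Hypothetical sample: three "lines", the middle one has text after ":\r".
my $text = "header:\r\nbody:\rAAAADDD\r\nlast:\r";

# With /m, '$' matches at the end of the string and before every "\n".
# The capture group collects the word in front of each anchored ":\r".
my @hits = $text =~ /(\w+):\r$/mg;

print "anchored matches: @hits\n";
```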

Node Type: perlquestion [id://955515]
Approved by moritz