PerlMonks  

pcre regex

by Anonymous Monk
on Apr 06, 2012 at 10:14 UTC ( #963798=perlquestion )
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi

I have the following regex, which matches a string of four digits followed by one final character that is either a digit or an uppercase letter:

^\d{4,4}[A-Z0-9]$

When I match this regex against user-supplied input, it matches easily. But when I run the same regex against a large text file, it fails to recognize the pattern. I fail to understand why. Help.

Re: pcre regex
by nemesdani (Friar) on Apr 06, 2012 at 10:22 UTC
    Post the code you use to read the text file; maybe we can work something out.
    Oh, and \d{4,4} == \d{4}

    I'm too lazy to be proud of being impatient.

      When I give it user input, it matches the regex:

      print "Please enter the string ";
      chomp($string=<STDIN>);
      $regex= "^\d{4,4}[A-Z0-9]$ ";
      if ($string =~ m!($regex)!g) {
          print 'match';
          print "$regex";
      }
      else {
          print 'no match';
          print "$regex";
      }

      It matches the regex in this case, but when the input is a text file it fails to recognize any patterns.
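
      A side note on the snippet above, offered as a sketch rather than a diagnosis: in a double-quoted Perl string the backslash in \d is consumed before the regex engine ever sees it, so the pattern is safer to build with qr// or single quotes. The variable names below are made up for illustration.

```perl
use strict;
use warnings;

# Double-quoted string: Perl handles the backslash itself, so "\d" is stored
# as a plain "d". The warning about the unrecognized escape is silenced here.
my $from_string = do { no warnings 'misc'; "^\d{4}[A-Z0-9]\$" };

# qr// (or a single-quoted string) keeps \d intact for the regex engine.
my $from_qr = qr/^\d{4}[A-Z0-9]$/;

print "string form: $from_string\n";
print "string form matches 1234X\n" if '1234X' =~ /$from_string/;
print "qr form matches 1234X\n"     if '1234X' =~ $from_qr;
```

      Printing the pattern, as the original snippet already does, is a quick way to see whether the backslash survived.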

Re: pcre regex
by GrandFather (Cardinal) on Apr 06, 2012 at 10:30 UTC

    Show us some sample data that fails and some test code. If the problem is real you shouldn't need more than a few lines of data and a few lines of code to reproduce the problem.

    Are you sure the pattern you want to match is at the start of the line? Could it be that you are reading the entire file into a string and are then trying to match against that string? If so you probably need to use the m (multiple line match) switch. Consider:

    use strict;
    use warnings;

    my $str = <<TEXT;
    First line
    1234X
    last line
    TEXT

    print "Matched\n" if $str =~ /^\d{4,4}[A-Z0-9]$/;
    print "Multi-matched\n" if $str =~ /^\d{4,4}[A-Z0-9]$/m;

    prints:

    Multi-matched
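
    The same distinction can be sketched for the two usual ways of reading a file: line by line, where each line is its own string, and slurped, where the whole file is one string. The helper names count_line_matches and count_slurp_matches and the sample data are made up for illustration, and an in-memory filehandle stands in for a real file.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the contents of a text file.
my $data = "First line\n1234X\nlast line\n5678Y\n";

# Line by line: every line is matched on its own, so plain ^ and $ suffice.
sub count_line_matches {
    my ($text) = @_;
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    my $count = 0;
    while (my $line = <$fh>) {
        $count++ if $line =~ /^\d{4}[A-Z0-9]$/;
    }
    close $fh;
    return $count;
}

# Slurped: the whole file is one string, so /m lets ^ and $ match at every
# line boundary and /g collects every occurrence.
sub count_slurp_matches {
    my ($text) = @_;
    my @found = $text =~ /^\d{4}[A-Z0-9]$/mg;
    return scalar @found;
}

print count_line_matches($data), "\n";
print count_slurp_matches($data), "\n";
```

    Both calls report two matches for this sample. Dropping the /m from the slurped version makes it report none.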
    True laziness is hard work

      Yes, I have to search across multiple lines, so I used /m, but it still does not give any matches. I think the problem is with ^ and $, since only the regexes containing these anchors give me problems. The rest are working fine, such as this one for email validation:

      /[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?/g
Re: pcre regex
by moritz (Cardinal) on Apr 06, 2012 at 10:32 UTC

      How do I substitute ^ and $ in my regex?

        Well, what semantics do you want?

        If you just want to remove the restriction that the regex has to match the whole string, remove them.
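
        A minimal sketch of what removing the anchors does, assuming the goal is to find the code anywhere inside a longer line. The helper names and the sample line are hypothetical, and the \b variant is an addition that the thread does not suggest.

```perl
use strict;
use warnings;

# Unanchored: return every five-character code found anywhere in the text.
sub find_codes {
    my ($text) = @_;
    return $text =~ /\d{4}[A-Z0-9]/g;
}

# Variant with word boundaries, so a code buried inside a longer run of
# digits or letters is not reported.
sub find_whole_codes {
    my ($text) = @_;
    return $text =~ /\b\d{4}[A-Z0-9]\b/g;
}

my $line = "order 1234X shipped, ref 9876543";
print join(",", find_codes($line)), "\n";
print join(",", find_whole_codes($line)), "\n";
```

        The unanchored pattern also picks up 98765 out of the longer number, which is why the semantics matter before simply deleting ^ and $.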

Re: pcre regex
by Anonymous Monk on Apr 06, 2012 at 12:48 UTC
    The lines in the text file may end with '\r\n', and '$' matches only at the end of the string or before a final '\n', so the leftover '\r' blocks the match.
    Opening the file with the '<:crlf' layer, or adding an optional "\r" to the regex (/^\d{4}[A-Z0-9]\r?$/), may help.
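
    Both suggestions can be tried on a small CRLF file. This is a sketch under stated assumptions: the temporary file, its contents and the helper count_matches are made up for illustration, and '<:raw' is used for the untranslated read so the result does not depend on the platform's default layers.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a hypothetical sample file with Windows-style (CRLF) line endings.
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;    # write the bytes exactly as given
print {$out} "First line\r\n1234X\r\nlast line\r\n";
close $out or die "Cannot close $path: $!";

# Count the lines of $file that match $re when opened with mode $mode.
sub count_matches {
    my ($file, $mode, $re) = @_;
    open my $fh, $mode, $file or die "Cannot open $file: $!";
    my $count = 0;
    while (my $line = <$fh>) {
        $count++ if $line =~ $re;
    }
    close $fh;
    return $count;
}

my $strict  = qr/^\d{4}[A-Z0-9]$/;       # the original pattern
my $lenient = qr/^\d{4}[A-Z0-9]\r?$/;    # tolerates a trailing "\r"

print count_matches($path, '<:raw',  $strict),  "\n";  # "\r" blocks the match
print count_matches($path, '<:crlf', $strict),  "\n";  # layer turns "\r\n" into "\n"
print count_matches($path, '<:raw',  $lenient), "\n";  # regex accepts the "\r"
```

    With this sample the three calls print 0, 1 and 1.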
