Beefy Boxes and Bandwidth Generously Provided by pair Networks
Do you know where your variables are?
 
PerlMonks  

Common Regex's

by coreolyn (Parson)
on Aug 07, 2000 at 19:26 UTC ( [id://26583]=CUFP: print w/replies, xml ) Need Help??

Common Regex's for your coding pleasure.
$NAME_REGEX = '[a-zA-Z]+'; $ADDRESS_REGEX = '[.-\w]+'; $CITY_REGEX = '\w+'; $PHONE_REGEX = '^1?[-\s\.\/]?\s*\(?(\d\d\d)\)?[-\s\.\/]?\s*(\d\d\d) +?[-\s\.\/]?(\d\d\d\d)\s*$'; $ZIP_REGEX = '^\d{5}(([-\s])?\d{4})?\s*$'; $DOB_REGEX = '^\d?\d[-\s\.\/]\s*\d?\d[-\s\.\/]\d\d\d?\d?s*$'; $EMAIL_REGEX = '[-.\w]+\@[-.\w]+'; $COMMENT_REGEX = '\w+'; $SSN_REGEX = '^\d\d\d([\s\.\/-])?\d\d([\s\.\/-]>?\d\d\d\d\s*$'; $HTTP_CLEAN_HEADER_REGEX =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex( +$1))/eg; #i.e., get rid of %20 et. al nauseaum from post headers.

Replies are listed 'Best First'.
RE: Common Regex's
by merlyn (Sage) on Aug 07, 2000 at 19:27 UTC
    The $EMAIL_REGEX is very wrong. The right one (for some version of "right") is over 6000 characters. Please don't use the one given here for ANYTHING.

    -- Randal L. Schwartz, Perl hacker

RE: Common Regex's
by KM (Priest) on Aug 07, 2000 at 19:53 UTC
    You can find the script which generates the pattern in Mastering Regular Expressions chapter 7 or here. Just add a print $mailbox at the end and redirect to a new file. If there is anything out there newer than this, let us know.

    Cheers,
    KM

      Or you can just use this. This came from the same book as KM mentioned above.

      [\040\t]*(?:\([^\\\x80-\xff\n\015()]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\ +xff\n\015()]*(?:\\[^\x80-\xff][^\\\x80-\ xff\n\015()]*)*\))[^\\\x80-\xff\n\015()]*)*\)[\040\t]*)*(?:(?:[^(\040) +<>@,;:".\\\[\]\000-\037\x80-\xff]+(?![^( \040)<>@,;:".\\\[\]\000-\037\x80-\xff])|"[^\\\x80-\xff\n\015"]*(?:\\[^ +\x80-\xff][^\\\x80-\xff\n\015"]*)*")[\04 0\t]*(?:\([^\\\x80-\xff\n\015()]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\ +n\015()]*(?:\\[^\x80-\xff][^\\\x80-\xff\ n\015()]*)*\))[^\\\x80-\xff\n\015()]*)*\)[\040\t]*)*(?:\.[\040\t]*(?:\ +([^\\\x80-\xff\n\015()]*(?:(?:\\[^\x80-\ xff]|\([^\\\x80-\xff\n\015()]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015()]* +)*\))[^\\\x80-\xff\n\015()]*)*\)[\040\t] *)*(?:[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+(?![^(\040)<>@,;:".\\\ +[\]\000-\037\x80-\xff])|"[^\\\x80-\xff\n \015"]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015"]*)*")[\040\t]*(?:\([^\\\x +80-\xff\n\015()]*(?:(?:\\[^\x80-\xff]|\( [^\\\x80-\xff\n\015()]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015()]*)*\))[^ +\\\x80-\xff\n\015()]*)*\)[\040\t]*)*)*@[ \040\t]*(?:\([^\\\x80-\xff\n\015()]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\x +ff\n\015()]*(?:\\[^\x80-\xff][^\\\x80-\x ff\n\015()]*)*\))[^\\\x80-\xff\n\015()]*)*\)[\040\t]*)*(?:[^(\040)<>@, +;:".\\\[\]\000-\037\x80-\xff]+(?![^(\040 )<>@,;:".\\\[\]\000-\037\x80-\xff])|\[(?:[^\\\x80-\xff\n\015\[\]]|\\[^ +\x80-\xff])*\])[\040\t]*(?:\([^\\\x80-\x ff\n\015()]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\n\015()]*(?:\\[^\x80- +\xff][^\\\x80-\xff\n\015()]*)*\))[^\\\x8 0-\xff\n\015()]*)*\)[\040\t]*)*(?:\.[\040\t]*(?:\([^\\\x80-\xff\n\015( +)]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\ n\015()]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015()]*)*\))[^\\\x80-\xff\n\ +015()]*)*\)[\040\t]*)*(?:[^(\040)<>@,;:" .\\\[\]\000-\037\x80-\xff]+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff +])|\[(?:[^\\\x80-\xff\n\015\[\]]|\\[^\x8 0-\xff])*\])[\040\t]*(?:\([^\\\x80-\xff\n\015()]*(?:(?:\\[^\x80-\xff]| +\([^\\\x80-\xff\n\015()]*(?:\\[^\x80-\xf f][^\\\x80-\xff\n\015()]*)*\))[^\\\x80-\xff\n\015()]*)*\)[\040\t]*)*)* +|(?:[^(\040)<>@,;:".\\\[\]\000-\037\x80- \xff]+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff])|"[^\\\x80-\xff\n\0 +15"]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\01 5"]*)*")[^()<>@,;:".\\\[\]\x80-\xff\000-\010\012-\037]*(?:(?:\([^\\\x8 +0-\xff\n\015()]*(?:(?:\\[^\x80-\xff]|\([ ^\\\x80-\xff\n\015()]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015()]*)*\))[^\ +\\x80-\xff\n\015()]*)*\)|"[^\\\x80-\xff\ n\015"]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015"]*)*")[^()<>@,;:".\\\[\]\ +x80-\xff\000-\010\012-\037]*)*<[\040\t]* (?:\([^\\\x80-\xff\n\015()]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\n\015 +()]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015 ()]*)*\))[^\\\x80-\xff\n\015()]*)*\)[\040\t]*)*(?:@[\040\t]*(?:\([^\\\ +x80-\xff\n\015()]*(?:(?:\\[^\x80-\xff]|\ ([^\\\x80-\xff\n\015()]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015()]*)*\))[ +^\\\x80-\xff\n\015()]*)*\)[\040\t]*)*(?: [^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+(?![^(\040)<>@,;:".\\\[\]\00 +0-\037\x80-\xff])|\[(?:[^\\\x80-\xff\n\0 15\[\]]|\\[^\x80-\xff])*\])[\040\t]*(?:\([^\\\x80-\xff\n\015()]*(?:(?: +\\[^\x80-\xff]|\([^\\\x80-\xff\n\015()]* (?:\\[^\x80-\xff][^\\\x80-\xff\n\015()]*)*\))[^\\\x80-\xff\n\015()]*)* +\)[\040\t]*)*(?:\.[\040\t]*(?:\([^\\\x80 -\xff\n\015()]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\n\015()]*(?:\\[^\x +80-\xff][^\\\x80-\xff\n\015()]*)*\))[^\\ \x80-\xff\n\015()]*)*\)[\040\t]*)*(?:[^(\040)<>@,;:".\\\[\]\000-\037\x +80-\xff]+(?![^(\040)<>@,;:".\\\[\]\000-\ 037\x80-\xff])|\[(?:[^\\\x80-\xff\n\015\[\]]|\\[^\x80-\xff])*\])[\040\ +t]*(?:\([^\\\x80-\xff\n\015()]*(?:(?:\\[ ^\x80-\xff]|\([^\\\x80-\xff\n\015()]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\ +015()]*)*\))[^\\\x80-\xff\n\015()]*)*\)[ \040\t]*)*)*(?:,[\040\t]*(?:\([^\\\x80-\xff\n\015()]*(?:(?:\\[^\x80-\x +ff]|\([^\\\x80-\xff\n\015()]*(?:\\[^\x80 -\xff][^\\\x80-\xff\n\015()]*)*\))[^\\\x80-\xff\n\015()]*)*\)[\040\t]* +)*@[\040\t]*(?:\([^\\\x80-\xff\n\015()]* (?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\n\015()]*(?:\\[^\x80-\xff][^\\\x8 +0-\xff\n\015()]*)*\))[^\\\x80-\xff\n\015 ()]*)*\)[\040\t]*)*(?:[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+(?![^( +\040)<>@,;:".\\\[\]\000-\037\x80-\xff])| \[(?:[^\\\x80-\xff\n\015\[\]]|\\[^\x80-\xff])*\])[\040\t]*(?:\([^\\\x8 +0-\xff\n\015()]*(?:(?:\\[^\x80-\xff]|\([ ^\\\x80-\xff\n\015()]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015()]*)*\))[^\ +\\x80-\xff\n\015()]*)*\)[\040\t]*)*(?:\. [\040\t]*(?:\([^\\\x80-\xff\n\015()]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\ +xff\n\015()]*(?:\\[^\x80-\xff][^\\\x80-\ xff\n\015()]*)*\))[^\\\x80-\xff\n\015()]*)*\)[\040\t]*)*(?:[^(\040)<>@ +,;:".\\\[\]\000-\037\x80-\xff]+(?![^(\04 0)<>@,;:".\\\[\]\000-\037\x80-\xff])|\[(?:[^\\\x80-\xff\n\015\[\]]|\\[ +^\x80-\xff])*\])[\040\t]*(?:\([^\\\x80-\ xff\n\015()]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\n\015()]*(?:\\[^\x80 +-\xff][^\\\x80-\xff\n\015()]*)*\))[^\\\x 80-\xff\n\015()]*)*\)[\040\t]*)*)*)*:[\040\t]*(?:\([^\\\x80-\xff\n\015 +()]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff \n\015()]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015()]*)*\))[^\\\x80-\xff\n +\015()]*)*\)[\040\t]*)*)?(?:[^(\040)<>@, ;:".\\\[\]\000-\037\x80-\xff]+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\ +xff])|"[^\\\x80-\xff\n\015"]*(?:\\[^\x80 -\xff][^\\\x80-\xff\n\015"]*)*")[\040\t]*(?:\([^\\\x80-\xff\n\015()]*( +?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\n\01 5()]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015()]*)*\))[^\\\x80-\xff\n\015( +)]*)*\)[\040\t]*)*(?:\.[\040\t]*(?:\([^\ \\x80-\xff\n\015()]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\n\015()]*(?:\ +\[^\x80-\xff][^\\\x80-\xff\n\015()]*)*\) )[^\\\x80-\xff\n\015()]*)*\)[\040\t]*)*(?:[^(\040)<>@,;:".\\\[\]\000-\ +037\x80-\xff]+(?![^(\040)<>@,;:".\\\[\]\ 000-\037\x80-\xff])|"[^\\\x80-\xff\n\015"]*(?:\\[^\x80-\xff][^\\\x80-\ +xff\n\015"]*)*")[\040\t]*(?:\([^\\\x80-\ xff\n\015()]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\n\015()]*(?:\\[^\x80 +-\xff][^\\\x80-\xff\n\015()]*)*\))[^\\\x 80-\xff\n\015()]*)*\)[\040\t]*)*)*@[\040\t]*(?:\([^\\\x80-\xff\n\015() +]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\n \015()]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015()]*)*\))[^\\\x80-\xff\n\0 +15()]*)*\)[\040\t]*)*(?:[^(\040)<>@,;:". \\\[\]\000-\037\x80-\xff]+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff] +)|\[(?:[^\\\x80-\xff\n\015\[\]]|\\[^\x80 -\xff])*\])[\040\t]*(?:\([^\\\x80-\xff\n\015()]*(?:(?:\\[^\x80-\xff]|\ +([^\\\x80-\xff\n\015()]*(?:\\[^\x80-\xff ][^\\\x80-\xff\n\015()]*)*\))[^\\\x80-\xff\n\015()]*)*\)[\040\t]*)*(?: +\.[\040\t]*(?:\([^\\\x80-\xff\n\015()]*( ?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\n\015()]*(?:\\[^\x80-\xff][^\\\x80 +-\xff\n\015()]*)*\))[^\\\x80-\xff\n\015( )]*)*\)[\040\t]*)*(?:[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+(?![^(\ +040)<>@,;:".\\\[\]\000-\037\x80-\xff])|\ [(?:[^\\\x80-\xff\n\015\[\]]|\\[^\x80-\xff])*\])[\040\t]*(?:\([^\\\x80 +-\xff\n\015()]*(?:(?:\\[^\x80-\xff]|\([^ \\\x80-\xff\n\015()]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015()]*)*\))[^\\ +\x80-\xff\n\015()]*)*\)[\040\t]*)*)*>)

      LeGo <hr

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: CUFP [id://26583]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others wandering the Monastery: (4)
As of 2025-11-12 19:40 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?
    What's your view on AI coding assistants?





    Results (68 votes). Check out past polls.

    Notices?
    hippoepoptai's answer Re: how do I set a cookie and redirect was blessed by hippo!
    erzuuliAnonymous Monks are no longer allowed to use Super Search, due to an excessive use of this resource by robots.