Re: Parse mailing addresses with a regex

by tos (Deacon)
on Jun 23, 2003 at 15:03 UTC (#268198)


in reply to Parse mailing addresses with a regex

Hi,

You only have to translate your remarks into regex-speak. For clarity, it helps to use the x modifier. Here is my result, based on your remarks.

    while (my $line = <DATA>) {
        #if ($line =~ m!^(\d+)\s+(([A-Za-z]+\s+[A-Za-z].\s+[A-Za-z]+)|([A-Za-z]+\s+[A-Za-z]+) )!) {
        if ($line =~ m!
                ^(\d+)\s+
                ([^\d]+)
                ((?:\w+\s)+)
                (\w+)\s
                (\w\w)\s
                (\d{5})\s
                ([\d\-]+)\s
                (.*)\s$
            !x) {
            print "\n";
            print "\$1: $1 \n";
            print "\$2: $2 \n";
            print "\$3: $3 \n";
            print "\$4: $4 \n";
            print "\$5: $5 \n";
            print "\$6: $6 \n";
            print "\$7: $7 \n";
            $custNum      = $1;  # First number field.
            $custName     = $2;  # Name styles can vary so match everything between two numbers.
            $custStreet   = $3;  # Street is everything after name and before CITY.
            $custCity     = $4;  # City is after address and before the TWO char state identifier.
            $custState    = $5;  # State is after address and before FIVE digit zip number.
            $custZip      = $6;  # Zip is before telephone number and after State id.
            $custTel      = $7;  # Telephone no. is after zip and before comments field.
            $custComments = $8;  # Last remaining part after telephone number.
        }
    }
    __DATA__
    141 Martha Lynn Costello 11750 Old Mill Drive Media PA 19063 610-555-1212 no detail
    178 Edgar Jones Jr. 18013 Highfield Road Ashton Ma 20861 323-774-1339 no detail
    161 Joyce W. Whang 18 Long Point Lane Media PA 19063 610-891-2344 no detail
    188 Alex Smith 1979 Biltmore St NW Apt B Washington DC 20009 202-913-6685 no detail
produces
    # perl re

    $1: 141
    $2: Martha Lynn Costello
    $3: 11750 Old Mill Drive
    $4: Media
    $5: PA
    $6: 19063
    $7: 610-555-1212

    $1: 178
    $2: Edgar Jones Jr.
    $3: 18013 Highfield Road
    $4: Ashton
    $5: Ma
    $6: 20861
    $7: 323-774-1339

    $1: 161
    $2: Joyce W. Whang
    $3: 18 Long Point Lane
    $4: Media
    $5: PA
    $6: 19063
    $7: 610-891-2344

    $1: 188
    $2: Alex Smith
    $3: 1979 Biltmore St NW Apt B
    $4: Washington
    $5: DC
    $6: 20009
    $7: 202-913-6685
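A further benefit of the x modifier is that the pattern can carry its own comments, so each remark can sit next to the piece of regex it describes. The following is only a sketch of that idea using the same pattern as above; the sub name parse_line and the hash keys are made up for illustration and are not part of the original code.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a hash ref of fields, or nothing if the
# line does not fit the expected layout.
sub parse_line {
    my ($line) = @_;
    my @f = $line =~ m{
        ^(\d+)\s+          # customer number
        ([^\d]+)           # name: everything up to the next digit
        ((?:\w+\s)+)       # street: words up to the city
        (\w+)\s            # city: a single word
        (\w\w)\s           # state: two characters
        (\d{5})\s          # zip: five digits
        ([\d\-]+)\s        # telephone: digits and dashes
        (.*)\s$            # comments: the rest of the line
    }x or return;

    s/\s+$// for @f;       # name and street capture a trailing space

    my %rec;
    @rec{qw(num name street city state zip tel comments)} = @f;
    return \%rec;
}

my $rec = parse_line(
    "161 Joyce W. Whang 18 Long Point Lane Media PA 19063 610-891-2344 no detail\n"
);
print "$rec->{name} | $rec->{street} | $rec->{city}\n" if $rec;
```

Comments inside an x-mode pattern must not contain the closing delimiter, which is why the sketch keeps braces out of them.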
Greetings, tos

Replies are listed 'Best First'.
Re^2: Parse mailing addresses with a regex
by agron (Initiate) on Jul 08, 2007 at 00:32 UTC
    And how do you get around the problem when the streets are actually numbers, like "19010 20th Ave NE Apt. 505" or "19010 SE 20th Ave Apt. 505", and the apartments are numbered instead of "lettered"? Addresses like this are very common in the state of Washington.
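    One possible way around it, offered as a sketch rather than a tested solution: a numbered street such as "20th" already matches \w+, so with the regex from the parent node the token that actually breaks the street group in these examples is "Apt." with its period. Loosening the street group to runs of non-whitespace and letting backtracking find the state/zip pair handles both sample addresses. The customer number, name, city, phone and comments below are made-up test data, parse_address is a hypothetical name, and the city is still assumed to be a single word.

```perl
use strict;
use warnings;

# Same field order as the parent node; only the street group and the
# end-of-line handling are changed.
my $re = qr{
    ^(\d+)\s+
    ([^\d]+)          # name: everything up to the next digit
    ((?:\S+\s)+)      # street: any non-blank tokens, so "Apt." and "505" fit
    (\w+)\s           # city: still a single word (assumption)
    (\w\w)\s          # state
    (\d{5})\s         # zip
    ([\d\-]+)\s       # telephone
    (.*?)\s*$         # comments, with or without a trailing newline
}x;

# Hypothetical helper: returns the eight fields, or an empty list.
sub parse_address {
    my ($line) = @_;
    my @f = $line =~ $re or return;
    s/\s+$// for @f;  # name and street capture a trailing space
    return @f;
}

for my $line (
    "200 Pat Example 19010 20th Ave NE Apt. 505 Seattle WA 98155 206-555-0100 no detail",
    "201 Pat Example 19010 SE 20th Ave Apt. 505 Seattle WA 98155 206-555-0100 no detail",
) {
    my @f = parse_address($line) or next;
    print "street: $f[2], city: $f[3]\n";
}
```

    This still breaks for multi-word cities or names that contain digits; for those, a lookup table of known cities or a dedicated address-parsing module is probably a better fit than a single regex.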
