Beefy Boxes and Bandwidth Generously Provided by pair Networks
more useful options
 
PerlMonks  

Grab zip codes out of an HTML page

by Falkkin (Chaplain)
on Feb 24, 2001 at 08:56 UTC ( #60611=snippet: print w/ replies, xml ) Need Help??

Description: Takes in an HTML page and extracts everything that looks like a ZIP code (that is, 5 digits in a row.) Prints them out, separated by commas. I'm sure there's an better way to do this, but thought I'd share. :)
cat zips.html | perl -ne 's/(\d{5,})//g; print "$1," if $1'
Comment on Grab zip codes out of an HTML page
Download Code
Re: Grab zip codes out of an HTML page
by damian1301 (Curate) on Feb 24, 2001 at 09:31 UTC
    Wouldn't the s/(\d{5,})//g; match 5 or more instances of consecutive numbers? So if, in a webpage, there is 123456778990, it would match all of the numbers and return them in $1 A better solution to this would be to omit the comma and result with this

    s/(\d{5})//g;

    But, since your not actually making a substitution in that code, you should just use a match.

    m/(\d{5})(-\d{4})?/g;

    That way you don't have to falsely delete anything and its much more tidier :). Also, now you can catch the full zip code for better accuracy (eg. 12345-1234). Hope I helped.

    UPDATE:Thanks albannach for pointing out my typing mistake and for suggesting the (-\d{4})? part :)

    Almost a Perl hacker.
    Dave AKA damian

    I encourage you to email me

Back to Snippets Section

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: snippet [id://60611]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others romping around the Monastery: (10)
As of 2014-12-18 10:47 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    Is guessing a good strategy for surviving in the IT business?





    Results (49 votes), past polls