Re: Parse file for email address

by RyuMaou (Chaplain)
on Jan 22, 2010 at 15:31 UTC


in reply to Parse file for email address

Have you looked in the Code Catacombs? There are a couple of scripts in there for finding e-mail in files. I know because I wrote them. They're not perfect, and I'm not sure if they'll be faster or slower than anything you've already tried, but it might be a place to start.

(I will warn you, though, I'm a Network Admin, not a Perl Programmer, so the code reflects the, um, "utilitarian" nature of the effort and the speed with which I needed a solution.)


Re^2: Parse file for email address
by sri1230 (Novice) on Jan 22, 2010 at 16:14 UTC
    Thank you both. The files are large, and I have many of them that get processed in a loop. It takes forever to check them line by line. I also looked in the Code Catacombs but did not find anything that does what I am looking for. Please let me know if you have any other thoughts. Basically, I would like to LWP "get" a web page and look for the first valid email address in the page source.
      Oh, from the description of your initial question, I thought you had the files already, which is what my scripts were all about doing. Though, there is one in there for verifying the e-mail addresses after they've been gathered.

      What you're talking about, though, is an e-mail harvester. Because so many spammers use them, I doubt too many people are going to be willing to help with that.
      Good luck, though, and be sure to post your results for everyone to see!
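
For the case of files already on disk, one thing to try instead of checking line by line is to read each file into a single string and let the regex engine stop at the first match. The following is only a minimal sketch under stated assumptions: the names `first_email` and `slurp` are hypothetical, this is not one of the Code Catacombs scripts, and the pattern is deliberately loose, so it finds strings that look like addresses and does not prove that an address is valid or deliverable.

```perl
use strict;
use warnings;

# Loose pattern (an assumption for illustration): matches strings that look
# like addresses, not everything that RFC 5322 permits.
my $EMAIL_RE = qr/\b[A-Za-z0-9._%+-]+\@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+\b/;

# Return the first address-like string in $text, or undef if there is none.
sub first_email {
    my ($text) = @_;
    return $text =~ /($EMAIL_RE)/ ? $1 : undef;
}

# Read a whole file into one string (hypothetical helper).
sub slurp {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    local $/;    # undefined record separator: read everything in one go
    my $text = <$fh>;
    close $fh;
    return $text;
}
```

A call would then look like `my $addr = first_email(slurp($file));`, where `$file` is whatever path the surrounding loop supplies. Whether this is actually faster than a per-line loop depends on the file sizes and available memory, so it is worth timing both on real data.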
