Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I've got some mailing lists in this format:




What I am working on is a bit of code to pull a clean email address out of either format and store it in a scalar (or perhaps two scalars.... one for the username and one for the domain). (The formats are mixed up within the mailing list file.)

Is there a simple regexp to do this?

Can someone help me? :)

(I can handle the code for looping through each line of the file. But I am stumped as to the precise regexp I need to use to extract the email address and whether or not I need to use split() or simply s/// or perhaps m// to do the trick.)

Perhaps there is more than one way to do it. :)

Re: Shuckin' the Email Addy
by chromatic (Archbishop) on Sep 07, 2000 at 06:17 UTC
    The perlfaq correctly points out that there is no simple regex to match an RFC-822 compliant e-mail address. Please keep this in mind!

    With that in mind, if your data file is as simple as what you say, I would do something like this:

    (undef, $address, undef) = split '', $line;

    I wouldn't bother with trying to match anything else, I'd just split. Again, this depends on your text file being *very* simple, as simple as you've described it above.
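
    As a rough sketch of the split approach: the exact line format isn't shown here, so the example below assumes (hypothetically) that each address sits between angle brackets, as in 'Jane Doe <jane@example.com>'. The helper name extract_address and the bracket delimiter are illustrative assumptions, not the poster's actual data format. It also splits the result into username and domain, as the question asked.

```perl
use strict;
use warnings;

# Hypothetical helper: assumes the address is wrapped in angle brackets,
# e.g. 'Jane Doe <jane@example.com>'. Adjust the delimiter to the real file.
sub extract_address {
    my ($line) = @_;

    # Field 0 is whatever precedes '<'; field 1 is the text between '<' and '>'.
    my (undef, $address) = split /[<>]/, $line;

    # Return an empty list if nothing that looks like an address was found.
    return unless defined $address && index($address, '@') >= 0;

    # A limit of 2 keeps everything after the first '@' together in $domain.
    my ($user, $domain) = split /\@/, $address, 2;

    return ($address, $user, $domain);
}

my ($address, $user, $domain) = extract_address('Jane Doe <jane@example.com>');
print "$address => user=$user domain=$domain\n";
```

    This only pulls text out of a known position; it does not validate the address in any way.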

    For anything else, the modules Email::Find and Email::Valid might come in handy.

      There is a module Email::Find specifically for searching text for e-mail addresses. I don't know how well it works. It does come with RFC822 validation code (Email::Valid, mentioned elsewhere), though. There is also a module RFC::RFC822::Address that checks (strictly) for the validity of an RFC822 e-mail address.
      chromatic, that is exactly what I need!


      By the way, your advice about trying to validate e-mail addresses is well taken, but the format of the source file is pretty simple, so your split idea should fit the bill.

      Thanks again.