
Extract field from pipe delimited flat file

by peppiv (Curate)
on Nov 16, 2001 at 22:00 UTC

peppiv has asked for the wisdom of the Perl Monks concerning the following question:

I should change my username to "Lots to Learn". Anyway, I've got a pipe delimited text file and it looks like this:

1005238613||John Q. Public|1678 Evergreen Terrance.|NEW ORLEANS |
1005239353||James Q. Smith|1313 Monster Rd.|brook Park|OH|
1005241949||Jane Q. Anon|123 Main St.|Naperville|IL|

What I need to do is extract the email addresses and write to another file to do a mail merge. I haven't been successful splitting and listing the second element. (first is timestamp - obviously).

Also, is there an easy way to mail to multiple recipients with hand-rolled code as opposed to a pm?

Thanks for the help oh brotherly monks.

Edit Masem - CODE tags add, and personal data changed to protect the innocent

Replies are listed 'Best First'.
Re: Extract field from pipe delimited flat file
by MZSanford (Curate) on Nov 16, 2001 at 22:04 UTC
    maybe ... (untested code ahead) :
    while (my $line = <INPUT>) {
        my ($time, $email, $name, @addr) = split(/\|/, $line);
        # use $email for cool stuff here
        # or add to an array for mailing.
    }

    As for mailing multiple recipients, I do suggest using Net::SMTP, but you could pass a comma-separated list of addresses to something like sendmail.
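    A minimal sketch of the hand-rolled sendmail route, assuming a sendmail binary that accepts the -t flag (which reads recipients from the message headers). The build_message helper, the addresses, and the /usr/sbin/sendmail path are illustrative assumptions, not anything from the original post. The sending step is left commented out so the script runs without a mail setup.

```perl
use strict;
use warnings;

# Hypothetical helper: builds a message whose To: header carries a
# comma-separated recipient list, the form `sendmail -t` reads.
sub build_message {
    my ($from, $subject, $body, @recipients) = @_;
    my $to = join ', ', @recipients;    # comma-separated recipient list
    return "From: $from\n"
         . "To: $to\n"
         . "Subject: $subject\n"
         . "\n"
         . "$body\n";
}

# Made-up placeholder addresses.
my @emails  = ('john@example.com', 'jane@example.com');
my $message = build_message('me@example.com', 'Hello', 'Mail merge test.', @emails);
print $message;

# To actually send it (assumes a sendmail binary at this path):
# open(my $mail, '|-', '/usr/sbin/sendmail', '-t') or die "Cannot run sendmail: $!";
# print {$mail} $message;
# close($mail) or warn "sendmail exited with status $?";
```

    Note that every recipient sees the full To: list this way. For a true mail merge you would more likely loop over the addresses and build one message per recipient.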
    i had a memory leak once, and it ruined my favorite shirt.
Re: Extract field from pipe delimited flat file
by {NULE} (Hermit) on Nov 16, 2001 at 22:35 UTC

    I really, sincerely hope that this is not real data. I would be terribly upset to see my name, e-mail and full address posted on some random web page.

    Please respond or update your post to acknowledge that this data is fabricated, or I must recommend to the monks that this post be sanitized.

    Update: Thanks to Masem for cleaning the code - I hadn't realized that a root SoPW node couldn't be updated (thanks to ChemBoy for pointing that out). I work in the healthcare field and am forced to think constantly about protecting patient data, so forgive me if I over-reacted.


      Personally, I don't think you over-reacted. We should all take great care to protect personal data as much as possible. I think this is a good reminder for us all.

      I have a link in a bookmark file lying around somewhere to a Usenet post recorded in Deja/Google's archive. Some poor fool posted his credit card number on a newsgroup.

      Fortunately we can rectify such mistakes here on this site. On Usenet, you're out of luck.

      g r i n d e r
      It is not real data. I changed it to protect the innocent (or something like that).

      Thanks for looking out for dumb mistakes. I know I've made plenty. I will make sure in the future that I include verbiage to let you know that it is not real.

Re: Extract field from pipe delimited flat file
by gryphon (Abbot) on Nov 16, 2001 at 22:46 UTC

    Greetings peppiv,

    If all you want to do is take data from one file and dump it into another, just set up your filehandles and perform the following:

    print OUTPUT join "\n", grep /\S+@\S+\.\S+/, map { split(/\|/) } <INPUT>;

    The regex on the grep isn't all that great. merlyn has a few that work much better for finding all sorts of valid email addresses. However, for your example data, this works just fine.

    So here's how it works (starting right, moving left):

    1. Grab the file data and use map to split every field of every line into an element of an array.
    2. Then we're going to grep that array for all elements that look like they might be email addresses.
    3. Then we're going to join that array of maybe-valid email addresses into a single string with line-breaks.
    4. Finally, print that to whatever output file you set up.
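    The four steps above can be spelled out one statement at a time. This is only an unrolled sketch of the same pipeline. The sample lines and addresses are invented for illustration, and it prints to STDOUT instead of an OUTPUT filehandle.

```perl
use strict;
use warnings;

# Invented sample records; the addresses are placeholders.
my @lines = (
    '1000000001|alice@example.com|Alice Example|1 First St.|Springfield|IL|',
    '1000000002|bob@example.org|Bob Example|2 Second Ave.|Columbus|OH|',
);

# Step 1: split every line on the pipe, flattening all fields into one list.
my @fields = map { split /\|/ } @lines;

# Step 2: keep only the fields that look roughly like email addresses.
my @emails = grep { /\S+\@\S+\.\S+/ } @fields;

# Step 3: join them into a single string with line breaks.
my $output = join "\n", @emails;

# Step 4: print the result (STDOUT here, for the sake of the example).
print $output, "\n";
```

    Because the grep looks at every field, an address-like string in any column would be picked up, not just the one in the email column.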

    code('Perl') || die;

Re: Extract field from pipe delimited flat file
by Jazz (Curate) on Nov 17, 2001 at 02:11 UTC

    Please forgive me if I've misunderstood the question (and subsequent replies), but it sounds like searching for email addresses in the line isn't necessary -- that the email addresses are always the second element of the | separated line.

    Here's a way to extract only the second element and dump it (one per line) to another file.

    print OUTFILE join "\n", map { ( split( /\|/ ) )[1] } <INFILE>;

    But if I may ask, why dump the email addresses into another file if you'll just need to re-open the new file to grab the email addresses for further processing? You can use this to grab the emails from your INFILE, then loop through them to perform your actual mailing:

    foreach my $email ( map { ( split( /\|/ ) )[1] } <INFILE> ) {
        # Do something with $email
    }
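    Here is a slightly fuller sketch of the same second-field idea, with a lexical filehandle, error checking, and a guard for records whose email field is empty (as in the sanitized sample). The data is invented and is read from an in-memory string so the example is self-contained. With a real file you would open a path instead.

```perl
use strict;
use warnings;

# Invented stand-in for the flat file; the second record has no address.
my $data = join '',
    "1000000001|alice\@example.com|Alice Example|1 First St.|Springfield|IL|\n",
    "1000000002||No Email|2 Second Ave.|Columbus|OH|\n",
    "1000000003|carol\@example.net|Carol Example|3 Third Rd.|Naperville|IL|\n";

# Open the string as if it were a file (in-memory filehandle).
open(my $in, '<', \$data) or die "Cannot open input: $!";

my @emails;
while (my $line = <$in>) {
    chomp $line;
    my $email = (split /\|/, $line)[1];             # second pipe-delimited field
    next unless defined $email && length $email;    # skip records without an address
    push @emails, $email;
}
close($in);

print "$_\n" for @emails;
```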

    As you can see from the benchmark results below, the foreach loop isn't that much slower than throwing the data into another file, which will then need to be re-opened, re-read, re-closed.

    Benchmark: timing 250000 iterations of foreachmapsplit, greprxmapsplit, mapsplit...
    foreachmapsplit:  4 wallclock secs ( 3.46 usr +  0.00 sys =  3.46 CPU) @ 72254.34/s (n=250000)
     greprxmapsplit:  4 wallclock secs ( 3.24 usr +  0.00 sys =  3.24 CPU) @ 77160.49/s (n=250000)
           mapsplit:  4 wallclock secs ( 3.24 usr +  0.00 sys =  3.24 CPU) @ 77160.49/s (n=250000)

    The greprxmapsplit code is gryphon's. I included it because the grep-regex/map/split consistently matched the plain map/split. One day I'll understand why :)
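    The benchmark code itself isn't shown above, so the following is only a guess at how such a comparison might be set up with the core Benchmark module. The three labels match the results, but the sample data, the in-memory @lines array (instead of filehandles), and the iteration count are assumptions.

```perl
use strict;
use warnings;
use Benchmark qw(timethese);

# Invented sample records held in memory so every run sees the same input.
my @lines = (
    "1000000001|alice\@example.com|Alice Example|1 First St.|Springfield|IL|\n",
    "1000000002|bob\@example.org|Bob Example|2 Second Ave.|Columbus|OH|\n",
    "1000000003|carol\@example.net|Carol Example|3 Third Rd.|Naperville|IL|\n",
);

my %tests = (
    # Take the second field of each line.
    mapsplit => sub {
        my @e = map { ( split /\|/ )[1] } @lines;
        return \@e;
    },
    # Split everything, then keep fields that look like addresses.
    greprxmapsplit => sub {
        my @e = grep { /\S+\@\S+\.\S+/ } map { split /\|/ } @lines;
        return \@e;
    },
    # Same as mapsplit, but loop over the result as a mailer would.
    foreachmapsplit => sub {
        my @e;
        foreach my $email ( map { ( split /\|/ )[1] } @lines ) {
            push @e, $email;
        }
        return \@e;
    },
);

# Run each variant the same number of times and print the timings.
timethese(10_000, \%tests);
```

    Timings from a run like this depend heavily on the machine and on the size of the input, so they should not be expected to reproduce the figures quoted above.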

Re: Extract field from pipe delimited flat file
by Purdy (Hermit) on Nov 16, 2001 at 22:25 UTC
    Not sure if that previous example will work if your text file is all one line like it is in your example. I'd recommend using grep() (this is also untested code):

    # this is a lazy regexp - you can refine it better, I'm sure.
    # $line is the line of text from your example
    @emails = grep ( /.*\@.*/, split( /\|/, $line ) );


    update - before the sanitization of the data above, it was all in one line. No, really! ;)
