http://www.perlmonks.org?node_id=787699

Grey Fox has asked for the wisdom of the Perl Monks concerning the following question:

Hello Fellow Monks
I wrote a Perl script to separate long lists of email addresses separated by semicolons. What I would like to do is combine the split with the trimming of white space so I don't need two arrays. Is there a way to trim while loading the first array? Output is a sorted list of names.
Thanks.
#!/pw/prod/svr4/bin/perl
use warnings;
use strict;

my $file_data = 'Builder, Bob ;Stein, Franklin MSW; Boop, Elizabeth PHD Cc: Bear, Izzy';
my @email_list;

$file_data =~ s/CC:/;/ig;
$file_data =~ s/PHD//ig;
$file_data =~ s/MSW//ig;

my @tmp_data = split( /;/, $file_data );
foreach my $entry (@tmp_data) {
    $entry =~ s/^[ \t]+|[ \t]+$//g;
    push( @email_list, $entry );
}
foreach my $name ( sort(@email_list) ) {
    print "$name \n";
}
-- Grey Fox
"We are grey. We stand between the darkness and the light" B5

Replies are listed 'Best First'.
Re: How do I combine SPLIT with trimming white space
by Fletch (Bishop) on Aug 11, 2009 at 16:58 UTC

    Foreach doesn't require an actual array to work from; it's perfectly content to work on the list returned from split:

    foreach my $entry ( split( /;/, $file_data ) ) {
        do { s/^\s+//; s/\s+$//; } for $entry;
        push @email_list, $entry;
    }

    Of course you can use map and get everything in one statement.

    print join( "\n",
        sort map { (my $t=$_)=~s/^\s+//;$t=~s/\s+$//;$t }
        split( /;/, $file_data ) ), "\n";

    Update: Of course if you need the list later you can certainly store the return from the sort into a variable and print that instead.

    my @email_list = sort map ... split( /;/, $file_data );
    print join( "\n", @email_list ), "\n";

    The cake is a lie.
    The cake is a lie.
    The cake is a lie.

        do { s/^\s+//; s/\s+$//; } for $entry;

      That's a weird way to write $entry =~ s/\A\s+|\s+\z//g; :)

        Wouldn't argue it too hard, but that one has a bit more leaning-toothpick-itis and the two separate s///'s are a bit more explicit in what's being done. But yes, mea culpa on the slightly evil use of postfix for to get $_ set though. :)

        The cake is a lie.
        The cake is a lie.
        The cake is a lie.
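
A small sketch comparing the two trimming styles discussed above, to show that they give the same result on typical input. The sub names trim_single and trim_double are hypothetical, chosen only for this illustration.

```perl
use strict;
use warnings;

# Trim with one substitution. The alternation needs /g so that both the
# leading and the trailing whitespace runs are replaced.
sub trim_single {
    my ($s) = @_;
    $s =~ s/\A\s+|\s+\z//g;
    return $s;
}

# Trim with two substitutions, one for each end of the string.
sub trim_double {
    my ($s) = @_;
    $s =~ s/^\s+//;
    $s =~ s/\s+$//;
    return $s;
}

for my $sample ( "  Bear, Izzy \t", "Boop, Elizabeth", " \n" ) {
    my $verdict = trim_single($sample) eq trim_double($sample) ? 'same' : 'different';
    print "$verdict\n";
}
```

Which form to prefer is mostly a readability question, as the replies above suggest.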

Re: How do I combine SPLIT with trimming white space
by moritz (Cardinal) on Aug 11, 2009 at 17:01 UTC
    How about splitting on /\s*;\s*/?

    Then the items returned from split will have neither leading nor trailing whitespace, assuming that your original string doesn't have them.
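
A minimal sketch of this suggestion, using a shortened version of the sample data from the question. The helper name split_trimmed is hypothetical. Because the split pattern only consumes whitespace next to a separator, the sketch strips whitespace at the very start and end of the string first.

```perl
use strict;
use warnings;

# Hypothetical helper: split on semicolons and swallow the whitespace
# on either side of each separator.
sub split_trimmed {
    my ($text) = @_;
    # The split pattern cannot reach whitespace at the outer edges of
    # the string, so remove that separately.
    $text =~ s/\A\s+//;
    $text =~ s/\s+\z//;
    return split /\s*;\s*/, $text;
}

my @names = split_trimmed(' Builder, Bob ;Stein, Franklin; Boop, Elizabeth ');
print "[$_]\n" for sort @names;
```

The brackets in the output make any leftover whitespace easy to spot.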

Re: How do I combine SPLIT with trimming white space
by Grey Fox (Chaplain) on Aug 11, 2009 at 17:57 UTC
    Thanks;
    Fletch, I incorporated a couple of your ideas. I also didn't realize you can use do blocks like that in Perl; very helpful.
    Also, Moritz, thanks. I didn't realize you can use a complex regex in split; nice to know.
    I changed the code to:
    foreach my $entry ( split( /;/, $file_data ) ) {
        do { s/^\s+//; s/\s+$//; } for $entry;
        push @email_list, $entry;
    }
    print join( "\n", sort(@email_list)), "\n";
    Thanks again.
    -- Grey Fox
    "We are grey. We stand between the darkness and the light" B5
      Hi, if you want to chain multiple s/// statements, you can use a comma and get rid of the ugly do and the curlies.
      foreach my $entry ( split( /;/, $file_data ) ) {
          s/^\s+//, s/\s+$// for $entry;
          push @email_list, $entry;
      }
      print join( "\n", sort(@email_list)), "\n";
      If you are a Perl::Critic follower, then you are not allowed to use the for statement modifier (Perl::Critic::Policy::ControlStructures::ProhibitPostfixControls). In that case, consider this version using the apply function from List::MoreUtils.
      use List::MoreUtils qw(apply);
      foreach my $entry ( split( /;/, $file_data ) ) {
          push @email_list, apply { s/^\s+//; s/\s+$// } $entry;
      }
      print join( "\n", sort(@email_list)), "\n";

      print+qq(\L@{[ref\&@]}@{['@'x7^'!#2/"!4']});
Re: How do I combine SPLIT with trimming white space
by Marshall (Canon) on Aug 11, 2009 at 20:26 UTC
    I tried to post before and for some reason it didn't work.
    I dunno why not.

    Anyway this is a different approach. Have fun!

    #!/usr/bin/perl -w
    use strict;

    # I suspect that you left off a ";" after Boop, Elizabeth PHD
    # I added that to the test data below.

    my $file_data =qq(
    Builder, Bob ;Stein, Franklin MSW; Boop, Elizabeth PHD ; Cc: Bear, Izzy;
    SomeGuy, Guy; Einstein , Albert PHD;
    );

    #
    #yes, you can open a scalar for read with a filehandle!
    #there is some "weirdness" about this, but the idea does work!

    open (my $in, "<", \$file_data) or die "can't open input file $!";

    while (my $line =<$in>)
    {
        next if $line =~ m/^\s*$/;   # skip blank lines
        chomp ($line);
        $line =~ s/CC://i;
        $line =~ s/\s+$//;           #gets rid of trailing \n
                                     #I suspect that this is an artifact
                                     #of the way the file is opened??
        my @whole_name_titles = split(/;/,$line);
        foreach my $name (@whole_name_titles)
        {
            $name =~ s/\s+[A-Z]+\s*$//;   #Skip trailing cap letters
            $name =~ s/\s+//g;            #compress spaces
            print "$name\n";              #push to DB or whatever here...
        }
    }
    __END__
    Prints:
    Builder,Bob
    Stein,Franklin
    Boop,Elizabeth
    Bear,Izzy
    SomeGuy,Guy
    Einstein,Albert
      Hi Marshall
      One of the reasons there was no semicolon after Betty was that she was at the end of the original e-mail To: list. One of the things I wanted to do is pull the To: and CC: lists as one cut and paste. That leaves a newline between the last name and the first CC:, and the original code changes the CC: to a semicolon.
      Also, part of the original requirements is a sorted list. Thanks though, some nice ideas.
      -- Grey Fox
      "We are grey. We stand between the darkness and the light" B5
        Grey Fox,
        Glad you got some good ideas from this thread!

        I thought something was weird with the missing semicolon - just a small detail.

        As far as sorting goes, the line print "$name\n"; #push to DB or whatever here... is where you can just push to a list like @full_names. Here is one way of many to do the sort. Note that the "Boop" family winds up in the correct sort order.

        #!/usr/bin/perl -w
        use strict;

        my @names = (
            "Builder,Bob",
            "Stein,Franklin",
            "Boop,Elizabeth",
            "Boop,Albert",
            "Bear,Izzy",
            "SomeGuy,Guy",
            "Einstein,Albert",
        );

        @names = sort by_last_name @names;
        print join("\n",@names),"\n";

        sub by_last_name
        {
            my ($a_last, $a_first) = split (/,/,$a);
            my ($b_last, $b_first) = split (/,/,$b);
            $a_last cmp $b_last or $a_first cmp $b_first
        }
        __END__
        Prints:
        Bear,Izzy
        Boop,Albert
        Boop,Elizabeth
        Builder,Bob
        Einstein,Albert
        SomeGuy,Guy
        Stein,Franklin
        I added the simple code to sort by first name when the last names match.
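
One possible way to put the thread's ideas together, as a sketch rather than a definitive version. The sub name sorted_names and the sample input are hypothetical. It assumes the To: and Cc: lists are pasted as one string with a newline between them, that MSW and PHD are the only titles to drop, and that Perl 5.10 or later is available for the // operator. Unlike the original s/PHD//ig, the title pattern here uses word boundaries so that it does not touch letters inside a name.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the names sorted by last name, then first name.
sub sorted_names {
    my ($text) = @_;

    $text =~ s/Cc://ig and 1;            # placeholder no-op guard, see next line
    return () unless defined $text;

    return sort {
        my ( $a_last, $a_first ) = split /\s*,\s*/, $a, 2;
        my ( $b_last, $b_first ) = split /\s*,\s*/, $b, 2;
        $a_last cmp $b_last or ( $a_first // '' ) cmp ( $b_first // '' )
    } _clean_entries($text);
}

# Hypothetical helper: drop titles, trim, and split into non-empty entries.
sub _clean_entries {
    my ($text) = @_;
    $text =~ s/\b(?:MSW|PHD)\b//ig;      # remove the titles
    $text =~ s/\A\s+//;                  # trim the outer edges
    $text =~ s/\s+\z//;
    # Newlines and semicolons both separate entries here, because the
    # "Cc:" label that sat after the newline has already been removed.
    return grep { length } split /\s*[;\n]\s*/, $text;
}

my $pasted = "Builder, Bob ;Stein, Franklin MSW; Boop, Elizabeth PHD\n"
           . "Cc: Bear, Izzy; Boop, Albert";
print "$_\n" for sorted_names($pasted);
```

The sort block repeats the comparison from by_last_name above, with a limit of 2 on split so that a stray comma in the first-name part does not lose data.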