Grey Fox has asked for the wisdom of the Perl Monks concerning the following question:
Hello Fellow Monks
I wrote a Perl script to separate long lists of email entries separated by semicolons. I would like to combine the split with the trimming of white space so that I don't need two arrays. Is there a way to trim while loading the first array? The output is a sorted list of names. Thanks.
#!/pw/prod/svr4/bin/perl
use warnings;
use strict;
my $file_data =
'Builder, Bob ;Stein, Franklin MSW; Boop, Elizabeth PHD Cc: Bear, Izzy';
my @email_list;
$file_data =~ s/CC:/;/ig;
$file_data =~ s/PHD//ig;
$file_data =~ s/MSW//ig;
my @tmp_data = split( /;/, $file_data );
foreach my $entry (@tmp_data) {
$entry =~ s/^[ \t]+|[ \t]+$//g;
push( @email_list, $entry );
}
foreach my $name ( sort(@email_list) ) {
print "$name \n";
}
-- Grey Fox
"We are grey. We stand between the darkness and the light" B5
Re: How do I combine SPLIT with trimming white space
by Fletch (Bishop) on Aug 11, 2009 at 16:58 UTC
Foreach doesn't require an actual array to work from; it's perfectly content to work on the list returned from split:
foreach my $entry ( split( /;/, $file_data ) ) {
do { s/^\s+//; s/\s+$//; } for $entry;
push @email_list, $entry;
}
Of course you can use map and get everything in one statement.
print join( "\n", sort map { (my $t=$_)=~s/^\s+//;$t=~s/\s+$//;$t } split( /;/, $file_data ) ), "\n";
Update: Of course if you need the list later you can certainly store the return from the sort into a variable and print that instead.
my @email_list = sort map ... split( /;/, $file_data );
print join( "\n", @email_list ), "\n";
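For readers on a newer perl, here is a minimal sketch of the same one-statement idea. It assumes Perl 5.14 or later, where the /r modifier makes s/// return a modified copy instead of changing $_. The sample data is shortened from the original post and is only illustrative.

```perl
use strict;
use warnings;
use 5.014;    # assumption: s///r needs Perl 5.14 or later

my $file_data = 'Builder, Bob ;Stein, Franklin; Boop, Elizabeth';

# s///r returns a trimmed copy and leaves $_ untouched,
# so the map block needs no temporary variable.
my @email_list = sort map { s/^\s+|\s+$//gr } split /;/, $file_data;

print join( "\n", @email_list ), "\n";
```

This keeps the sorted list in @email_list, so it can be reused after printing.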
The cake is a lie.
The cake is a lie.
The cake is a lie.
Wouldn't argue it too hard, but that one has a bit more leaning-toothpick-itis and the two separate s///'s are a bit more explicit in what's being done. But yes, mea culpa on the slightly evil use of postfix for to get $_ set though. :)
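A small sketch of what the postfix for is doing may help here. It runs the statement once with $_ aliased to the variable, so the bare s/// calls edit that variable in place. The sample string is made up for illustration.

```perl
use strict;
use warnings;

my $entry = "  Bear, Izzy \t";

# "for $entry" aliases $_ to $entry for a single pass,
# so both substitutions modify $entry itself.
do { s/^\s+//; s/\s+$//; } for $entry;

print "[$entry]\n";
```

Because $_ is an alias rather than a copy, no assignment back to $entry is needed.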
The cake is a lie.
The cake is a lie.
The cake is a lie.
Re: How do I combine SPLIT with trimming white space
by moritz (Cardinal) on Aug 11, 2009 at 17:01 UTC
How about splitting on /\s*;\s*/?
Then the items returned from split will have neither leading nor trailing whitespace, assuming that your original string doesn't have them.
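A minimal sketch of this suggestion, with one extra step for the caveat: the ends of the whole string are trimmed once before splitting, since the split pattern only removes whitespace next to a semicolon. The sample data is illustrative.

```perl
use strict;
use warnings;

my $file_data = '  Builder, Bob ;Stein, Franklin ;  Boop, Elizabeth  ';

# Trim the two ends of the whole string once...
$file_data =~ s/^\s+|\s+$//g;

# ...then let split consume the whitespace around each semicolon.
my @email_list = split /\s*;\s*/, $file_data;

print "[$_]\n" for sort @email_list;
```

The brackets in the output only make any stray whitespace visible.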
Re: How do I combine SPLIT with trimming white space
by Grey Fox (Chaplain) on Aug 11, 2009 at 17:57 UTC
Thanks;
Fletch, I incorporated a couple of your ideas. I also didn't realize you can use do blocks like that in Perl; very helpful. Moritz, thanks as well: I didn't realize you can use a complex regex in split. Nice to know.
The code was changed adding:
foreach my $entry ( split( /;/, $file_data ) ) {
do { s/^\s+//; s/\s+$//; }
for $entry;
push @email_list, $entry;
}
print join( "\n", sort(@email_list)), "\n";
Thanks again.
-- Grey Fox
"We are grey. We stand between the darkness and the light" B5
Hi,
If you want to chain multiple s/// statements, then you can use a comma and get rid of the ugly do and the curlies.
foreach my $entry ( split( /;/, $file_data ) ) {
s/^\s+//, s/\s+$// for $entry;
push @email_list, $entry;
}
print join( "\n", sort(@email_list)), "\n";
If you are a Perl::Critic follower, then you are not allowed to use the for statement modifier (see Perl::Critic::Policy::ControlStructures::ProhibitPostfixControls).
However, consider this version, which uses the apply function from List::MoreUtils.
use List::MoreUtils qw(apply);
foreach my $entry ( split( /;/, $file_data ) ) {
push @email_list, apply { s/^\s+//; s/\s+$// } $entry;
}
print join( "\n", sort(@email_list)), "\n";
print+qq(\L@{[ref\&@]}@{['@'x7^'!#2/"!4']});
Re: How do I combine SPLIT with trimming white space
by Marshall (Canon) on Aug 11, 2009 at 20:26 UTC
#!/usr/bin/perl -w
use strict;
# I suspect that you left off a ";" after Boop, Elizabeth PHD
# I added that to the test data below.
my $file_data =qq(
Builder, Bob ;Stein, Franklin MSW; Boop, Elizabeth PHD ; Cc: Bear, Izzy;
SomeGuy, Guy; Einstein , Albert PHD; );
#
#yes, you can open a scalar for read with a filehandle!
#there is some "weirdness" about this, but the idea does work!
open (my $in, "<", \$file_data) or die "can't open input file $!";
while (my $line =<$in>)
{
next if $line =~ m/^\s*$/; # skip blank lines
chomp ($line);
$line =~ s/CC://i;
$line =~ s/\s+$//; #strip trailing whitespace (chomp already
#removed the newline) so that a trailing "; "
#doesn't produce a blank field from split
my @whole_name_titles = split(/;/,$line);
foreach my $name (@whole_name_titles)
{
$name =~ s/\s+[A-Z]+\s*$//; #Skip trailing cap letters
$name =~ s/\s+//g; #compress spaces
print "$name\n"; #push to DB or whatever here...
}
}
__END__
Prints:
Builder,Bob
Stein,Franklin
Boop,Elizabeth
Bear,Izzy
SomeGuy,Guy
Einstein,Albert
Grey Fox,
Glad you got some good ideas from this thread!
I thought something was weird with the missing semicolon - just a small detail.
As far as sorting goes, the line print "$name\n"; #push to DB or whatever here... is where you can just push onto a list such as @full_names.
Here is one way of many to do the sort. Note that the "Boop" family winds up in the correct sort order.
#!/usr/bin/perl -w
use strict;
my @names =
(
"Builder,Bob",
"Stein,Franklin",
"Boop,Elizabeth",
"Boop,Albert",
"Bear,Izzy",
"SomeGuy,Guy",
"Einstein,Albert",);
@names = sort by_last_name @names;
print join("\n",@names),"\n";
sub by_last_name
{
my ($a_last, $a_first) = split (/,/,$a);
my ($b_last, $b_first) = split (/,/,$b);
$a_last cmp $b_last
or
$a_first cmp $b_first
}
__END__
Prints:
Bear,Izzy
Boop,Albert
Boop,Elizabeth
Builder,Bob
Einstein,Albert
SomeGuy,Guy
Stein,Franklin
I added the simple code to sort by first name.
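Putting the pieces of this thread together, here is a rough end-to-end sketch: split and trim in one step, then sort by last name with first name as the tie-breaker. The sample data and the variable $trimmed are illustrative only, and the comparison assumes every entry has the form "Last, First".

```perl
use strict;
use warnings;

my $file_data = 'Stein, Franklin ; Boop, Elizabeth;Bear, Izzy ;Boop, Albert';

# Compare last names first, then first names to break ties.
# Assumes each entry looks like "Last, First".
sub by_last_name {
    my ( $a_last, $a_first ) = split /\s*,\s*/, $a;
    my ( $b_last, $b_first ) = split /\s*,\s*/, $b;
    return $a_last cmp $b_last || $a_first cmp $b_first;
}

# Trim a copy of the whole string, then split on semicolons
# together with any whitespace around them.
( my $trimmed = $file_data ) =~ s/^\s+|\s+$//g;
my @names = sort by_last_name split /\s*;\s*/, $trimmed;

print join( "\n", @names ), "\n";
```

Note the use of || rather than or inside the return statement: or binds more loosely than return, so the tie-breaker would never be evaluated.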