PerlMonks  

regex problem

by Anonymous Monk
on Mar 04, 2018 at 19:31 UTC (#1210321=perlquestion)
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

The desired outcome should be :
abcd 723
abcd 724
abcde 552
abcde 554
abcde 553
abcded 756
but instead I get :
abcd 723
abcd -724
abcde 552
abcde -554-553
abcdef 756
abcdef
The code:
while ($line=<DATA>) {
    my @c=($line=~/^(\w+)\t(\d+)((?:-\d+)*)/);
    my @d=@c[1..$#c];
    foreach $e (@d) {
        print $c[0]," ", $e,"\n";
    }
}
__DATA__
abcd 723-724
abcde 552-554-553
abcdef 756

Replies are listed 'Best First'.
Re: regex problem
by tybalt89 (Priest) on Mar 04, 2018 at 19:55 UTC

    regex doesn't work that way.

    #!/usr/bin/perl
    # http://perlmonks.org/?node_id=1210321

    use strict;
    use warnings;

    while(<DATA>)
      {
      my ($first, $rest) = /^(\w+)\s+([-\d]+)/;
      print "$first $_\n" for $rest =~ /\d+/g;
      }

    __DATA__
    abcd 723-724
    abcde 552-554-553
    abcdef 756
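
To spell out why the original pattern cannot return one element per number: a capture group yields one value per match, no matter how often a quantifier repeats what is inside it. So `((?:-\d+)*)` returns the whole tail as a single string, and a capture nested in a repeated group, such as `(?:-(\d+))*`, keeps only its last repetition. A minimal sketch of both behaviours next to a `/g` match in list context (the variable names and the single sample line are illustrative, not from the thread):

```perl
use strict;
use warnings;

my $line = "abcde\t552-554-553";

# The original pattern has three capture groups, so it returns exactly three values.
my @orig = $line =~ /^(\w+)\t(\d+)((?:-\d+)*)/;
print join('|', @orig), "\n";        # abcde|552|-554-553

# A capture group inside a repeated group keeps only the last repetition.
my @last = $line =~ /^(\w+)\t(\d+)(?:-(\d+))*/;
print join('|', @last), "\n";        # abcde|552|553

# A /g match in list context returns every match of the pattern.
my ($name, $rest) = $line =~ /^(\w+)\t([-\d]+)/;
my @all = $rest =~ /\d+/g;
print join('|', $name, @all), "\n";  # abcde|552|554|553
```

This is why the solutions in this thread either match twice (once for the line, once with `/g` for the numbers) or use `split`.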
Re: regex problem
by johngg (Abbot) on Mar 05, 2018 at 00:43 UTC

    A solution using splits and maps to build a hash. The order of the output might be problematic if you desire something other than sorted.

    johngg@shiraz ~/perl/Monks $ perl -Mstrict -Mwarnings -E '
    open my $inFH, q{<}, \ <<EOD or die $!;
    abcd 723-724
    abcde 552-554-553
    abcdef 756
    EOD

    my %hash =
        map { $_->[ 0 ] => [ split m{-}, $_->[ 1 ] ] }
        map { [ split ] }
        <$inFH>;

    foreach my $key ( sort keys %hash )
    {
        say qq{$key $_} for @{ $hash{ $key } };
    }'
    abcd 723
    abcd 724
    abcde 552
    abcde 554
    abcde 553
    abcdef 756

    I hope this is useful.
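
A further caveat with a hash built this way, beyond output order: hash keys are unique, so if the same name appears on more than one input line, the later line silently replaces the earlier one. The sketch below demonstrates this with made-up input (the repeated `abcd` line is an assumption for illustration; the data in the question has no duplicate names) and shows one possible way to accumulate instead.

```perl
use strict;
use warnings;

# Hypothetical input: the name "abcd" appears on two lines.
my @lines = ( "abcd 723-724\n", "abcde 552\n", "abcd 999\n" );

# Building the hash with map: the second "abcd" entry overwrites the first.
my %hash =
    map { $_->[0] => [ split m{-}, $_->[1] ] }
    map { [ split ] }
    @lines;
print "abcd => @{ $hash{abcd} }\n";    # abcd => 999

# Accumulating with push keeps the numbers from every line.
my %all;
for my $line (@lines) {
    my ( $key, $rest ) = split ' ', $line;
    push @{ $all{$key} }, split m{-}, $rest;
}
print "abcd => @{ $all{abcd} }\n";     # abcd => 723 724 999
```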

    Update: This version dispenses with the hash so items will be output in the same order as input.

    johngg@shiraz ~/perl/Monks $ perl -Mstrict -Mwarnings -E '
    open my $inFH, q{<}, \ <<EOD or die $!;
    abcd 723-724
    abcde 552-554-553
    abcdef 756
    EOD

    say qq{@$_} for
        map {
            my $key = $_->[ 0 ];
            my @nos = split m{-}, $_->[ 1 ];
            map { [ $key => shift @nos ] } 1 .. scalar @nos;
        }
        map { [ split ] }
        <$inFH>;'
    abcd 723
    abcd 724
    abcde 552
    abcde 554
    abcde 553
    abcdef 756

    Update 2: Even simpler.

    johngg@shiraz ~/perl/Monks $ perl -Mstrict -Mwarnings -E '
    open my $inFH, q{<}, \ <<EOD or die $!;
    abcd 723-724
    abcde 552-554-553
    abcdef 756
    EOD

    say qq{@$_} for
        map {
            my( $key, $rest ) = split;
            my @nos = split m{-}, $rest;
            map { [ $key => shift @nos ] } 1 .. scalar @nos;
        }
        <$inFH>;'
    abcd 723
    abcd 724
    abcde 552
    abcde 554
    abcde 553
    abcdef 756

    Cheers,

    JohnGG

Re: regex problem
by AnomalousMonk (Chancellor) on Mar 04, 2018 at 19:58 UTC

    Try something like:

    c:\@Work\Perl\monks>perl -wMstrict -le
    "for my $s (
       qq{abcd\t723-724}, qq{abcde\t552-554-553}, qq{abcdef\t756},
       qq{abcdef\tfoo},
       ) {
       my $parsed = my ($base, $groups) =
           $s =~ m{ \A ([[:alpha:]]+) \t (\d+ (?: - \d+)*) \z }xms;
       ;;
       die qq{bad string '$s'} unless $parsed;
       ;;
       print qq{'$s' -> };
       for my $g ($groups =~ /\d+/g) {
         print qq{ '$base' '$g'};
         }
       }
    "
    'abcd 723-724' ->
     'abcd' '723'
     'abcd' '724'
    'abcde 552-554-553' ->
     'abcde' '552'
     'abcde' '554'
     'abcde' '553'
    'abcdef 756' ->
     'abcdef' '756'
    bad string 'abcdef foo' at -e line 1.

    Update: Well, basically the same idea as tybalt89's approach, but more effort at data validation.


    Give a man a fish:  <%-{-{-{-<

Re: regex problem
by Marshall (Abbot) on Mar 05, 2018 at 19:47 UTC
    I figure that your thinking is too complicated, especially when it comes to regex and processing the input lines.
    Consider this code:
    #!/usr/bin/perl
    use strict;
    use warnings;

    while (my $line = <DATA>)
    {
        next if $line =~ /^\s*$/;  # skip blank lines
        $line =~ s/^\s*//;         # remove leading spaces
        $line =~ s/\s*$//;         # remove trailing space and line ending

        my ($name, @nums) = split /[\s-]+/, $line;
        foreach my $num (@nums)
        {
            print "$name\t$num\n";
        }
    }
    # PRINTS
    #abcd 723
    #abcd 724
    #abcde 552
    #abcde 554
    #abcde 553
    #abcdef 756
    __DATA__
    abcd 723-724
    abcde 552-554-553
    abcdef 756
    I don't think the statements that trim leading and trailing spaces are needed here, given your DATA. However, you should become familiar with how to do that.

    Update: As a general rule:

    • Use Regex when you know what to keep.
    • Use Split when you know what to throw away.
    This code works the same:
    #!/usr/bin/perl
    use strict;
    use warnings;

    while (my $line = <DATA>)
    {
        next if $line =~ /^\s*$/;  # skip blank lines

        my ($name, @nums) = $line =~ /(\w+)/g;
        foreach my $num (@nums)
        {
            print "$name\t$num\n";
        }
    }
    # PRINTS
    #abcd 723
    #abcd 724
    #abcde 552
    #abcde 554
    #abcde 553
    #abcdef 756
    __DATA__
    abcd 723-724
    abcde 552-554-553
    abcdef 756
    I almost always include a statement to throw away blank lines. They often appear at the end of a file and are hard to see when you just type or cat the file.
Re: regex problem
by writch (Sexton) on Mar 06, 2018 at 16:06 UTC
    I just split it into the two thoughts you had: grabbing the text, and then grabbing the potential numbers in the line.
    while ($line=<DATA>) {
        my @c=($line=~/^(\w+)\t/);
        my @d=$line =~ /(\d+)/gsm;
        foreach $e (@d) {
            print $c[0]," ", $e,"\n";
        }
    }
    __DATA__
    abcd 723-724
    abcde 552-554-553
    abcdef 756
Re: regex problem
by hippo (Canon) on Mar 08, 2018 at 12:29 UTC

    Taking our anonymous brother's spec literally, here is a solution which produces the actual output he says is desired:

    use strict;
    use warnings;

    while (my $line = <DATA>) {
        my ($key, @nums) = split /[\s-]+/, $line;
        $key =~ s/f$/d/;
        print "$key $_\n" for @nums;
    }
    __DATA__
    abcd 723-724
    abcde 552-554-553
    abcdef 756
Re: regex problem SHORT SOLUTION
by python_guy (Initiate) on Mar 08, 2018 at 05:34 UTC
    Here! If printing like that is all you are worried about, I would recommend this short piece of code!
    while ($line=<DATA>) {
        my @items = split /\s+/,$line;
        $items[1] =~ s/\-/\n$items[0] /g;
        print join " ",@items,"\n";
    }
