Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl Monk, Perl Meditation
 
PerlMonks  

Pattern Matching with a Selected Sub-String

by ozboomer (Friar)
on Jul 26, 2012 at 23:53 UTC ( [id://983971]=perlquestion: print w/replies, xml ) Need Help??

ozboomer has asked for the wisdom of the Perl Monks concerning the following question:

Please consider the following code:-

$sample = "A-E-H-L"; @options = ( "A-B-F-G-H-K-M", # idx = 0 "A-E-G-H-L", # idx = 1 "A-C-E-G-H-J-L", # idx = 2 "B-F-H-K", # idx = 3 "A-B-F-G-H-K-L", # idx = 4 "C-H" # idx = 5 ); $mc = "[A,E,H,L]"; @matches = grep(/$mc/, @options); foreach $x (@matches) { printf("> %s\n", $x); }

The intention is to find each of the elements in @options that includes all of the elements in $sample. The hyphens ("-") are sort-of important, mainly because they are delimiters between each 'sub-element'.

The code I've shown obviously doesn't work -- I was trying to establish a 'class' for a regex -- but let me explain it using the contents of $sample:...

For the example of "A-E-H-L", I'm looking in the @options array for those elements that contain "A" and "E" and "H" and "L". If $options[$i] contains all of those items ("A", "E", etc), then it will be recorded as a match. Thus, the @matches array should, ideally contain the elements:-

@matches = (1,2);

Note that the elements of $sample will always be in 'increasing' order (the data could be "A-F-H" but it won't be "H-A-F").

An interesting generalization of this would be to match on a subset of $sample. For example, if we allow a subset of elements from $sample to match, we would have the resulting @matches = (0,1,2,3,4,5).

There are probably a lot of other options that could be considered... but this is a good start, I think.

Would appreciate any pointers, please...

Replies are listed 'Best First'.
Re: Pattern Matching with a Selected Sub-String
by choroba (Cardinal) on Jul 27, 2012 at 00:17 UTC
    You can create a regular expression from the $sample and then just match each option against it:
    #!/usr/bin/perl use warnings; use strict; use feature 'say'; my $sample = "A-E-H-L"; my @options = ( "A-B-F-G-H-K-M", "A-E-G-H-L", "A-C-E-G-H-J-L", "B-F-H-K", "A-B-F-G-H-K-L", "C-H" ); my @search_for = split /-/, $sample; my $re = '(?:^|-)' . join('(?:-|-.*-)', @search_for) . '(?:-|$)'; say for grep /$re/, @options;
Re: Pattern Matching with a Selected Sub-String
by toolic (Bishop) on Jul 27, 2012 at 00:24 UTC
    use warnings; use strict; use Data::Dumper; my $sample = "AEHL"; my $mc = join '.*', split //, $sample; my @options = ( "ABFGHKM", # idx = 0 "AEGHL", # idx = 1 "ACEGHJL", # idx = 2 "BFHK", # idx = 3 "ABFGHKL", # idx = 4 "CH" # idx = 5 ); my @matches; for my $i (0 .. $#options) { push @matches, $i if $options[$i] =~ /$mc/; } print Dumper(\@matches); __END__ $VAR1 = [ 1, 2 ];
Re: Pattern Matching with a Selected Sub-String
by tobyink (Canon) on Jul 27, 2012 at 06:25 UTC
    use 5.010; use strict; my $sample = "A-E-H-L"; my @sample_arr = split '-', $sample; my @options = ( "A-B-F-G-H-K-M", # idx = 0 "A-E-G-H-L", # idx = 1 "A-C-E-G-H-J-L", # idx = 2 "B-F-H-K", # idx = 3 "A-B-F-G-H-K-L", # idx = 4 "C-H" # idx = 5 ); my @matches = map { join '-', @$_ } grep { my $ok = 1; for my $s (@sample_arr) { next if $s ~~ @$_; $ok = 0; last }; $ok; } map { [split '-' ] } @options; say for @matches;
    perl -E'sub Monkey::do{say$_,for@_,do{($monkey=[caller(0)]->[3])=~s{::}{ }and$monkey}}"Monkey say"->Monkey::do'
Re: Pattern Matching with a Selected Sub-String
by Anonymous Monk on Jul 27, 2012 at 00:17 UTC

    I don't like how you keep calling many things elements :)

    #!/usr/bin/perl -- use strict; use warnings; use List::AllUtils qw' indexes '; use Data::Dump qw' dd '; my @options = ( "A-B-F-G-H-K-M", "A-E-G-H-L", "A-C-E-G-H-J-L", "B-F-H-K", "A-B-F-G-H-K-L", "C-H" ); #~ my @idx = findem( "A-E-H-L", \@options ); my @idx = findem( "A-E-H-L", @options ); dd \@options; dd \@idx; dd [ @options[@idx] ]; exit 0; sub findem { #~ my( $query, $options ) = @_; my $query = shift; $query = join '.+', map { "\Q$_\E" } split '-', $query; #~ return indexes { /$query/ } @$options; return indexes { /$query/ } @_; } __END__ [ "A-B-F-G-H-K-M", "A-E-G-H-L", "A-C-E-G-H-J-L", "B-F-H-K", "A-B-F-G-H-K-L", "C-H", ] [1, 2] ["A-E-G-H-L", "A-C-E-G-H-J-L"]
Re: Pattern Matching with a Selected Sub-String
by Marshall (Canon) on Jul 27, 2012 at 12:34 UTC
    Here is another way to do this...
    #!/usr/bin/perl -w use strict; use List::MoreUtils qw{uniq}; my $sample = "A-E-H-L"; my @options = ( "A-B-F-G-H-K-M", # idx = 0 "A-E-G-H-L", # idx = 1 "A-C-E-G-H-J-L", # idx = 2 "B-F-H-K", # idx = 3 "A-B-F-G-H-K-L", # idx = 4 "C-H" , # idx = 5 "A-E-H-L", # idx = 6 "H-E-A-L" # idx = 7 ); my @indicies = find_indexes (\@options, $sample); print "sample $sample occurs in indicies: @indicies\n"; sub find_indexes { my ($aref, $template) = @_; my @indicies; my @patterns = $template =~ /\w+/g; my $regex = join ("|", @patterns); # "OR" is easy with regex # "AND" is hard my $index; foreach my $line (@$aref) { my (@matches) = $line =~ /$regex/g; push @indicies, $index if (uniq (@matches) >= @patterns); $index++; } return @indicies; } __END__ Prints: sample A-E-H-L occurs in indicies: 1 2 6 7
Re: Pattern Matching with a Selected Sub-String
by Kenosis (Priest) on Jul 27, 2012 at 17:09 UTC

    And yet another option:

    use Modern::Perl; my $sample = "A-E-H-L"; my @options = ( "A-B-C-E-H-J-L", # idx = 0 "A-B-F-G-H-K-M", # idx = 1 "A-E-G-H-L", # idx = 2 "A-C-E-G-H-J-L", # idx = 3 "B-F-H-K", # idx = 4 "A-B-F-G-H-K-L", # idx = 5 "C-H" # idx = 6 ); $sample =~ s/-/.*/g; my @matches = grep $options[$_] =~ /$sample/, 0 .. $#options; say for @matches;

    Output:

    0 2 3
Re: Pattern Matching with a Selected Sub-String
by AnomalousMonk (Archbishop) on Jul 28, 2012 at 04:42 UTC

    Another approach, with a 'figure of merit':

    >perl -wMstrict -le "use Text::LevenshteinXS qw(distance); ;;;; my $sample = 'A-E-H-L'; ;; my @options = qw( A-B-F-G-H-K-M A-E-G-H-L A-C-E-G-H-J-L B-F-H-K A-B-F-G-H-K-L C-H A-F-H H-A-F L-H-E-A A A-E X-A-X X-A-E-X L A-L X-A-L-X W-X-Y-Z Z Y-Z ); ;;;; my @interesting = $sample =~ m{ \w }xmsg; my $mask = join '', @interesting; my ($dull) = map qr{[^$_]}xms, join '', @interesting; my ($ordered) = map qr{$_?}xms, join '? ', @interesting; ;;;; for my $option (@options) { (my $primary = $option) =~ s{ $dull }{}xmsg; my ($sequence) = $primary =~ m{ ($ordered)? }xms; my $dist = distance $mask, $sequence; my $percent = 100 * (1 - ($dist / length $mask)); printf qq{%-16s -> %-6s -> %3.0f%% \n}, qq{'$option'}, qq{'$sequence'}, $percent; } " 'A-B-F-G-H-K-M' -> 'AH' -> 50% 'A-E-G-H-L' -> 'AEHL' -> 100% 'A-C-E-G-H-J-L' -> 'AEHL' -> 100% 'B-F-H-K' -> 'H' -> 25% 'A-B-F-G-H-K-L' -> 'AHL' -> 75% 'C-H' -> 'H' -> 25% 'A-F-H' -> 'AH' -> 50% 'H-A-F' -> 'H' -> 25% 'L-H-E-A' -> 'L' -> 25% 'A' -> 'A' -> 25% 'A-E' -> 'AE' -> 50% 'X-A-X' -> 'A' -> 25% 'X-A-E-X' -> 'AE' -> 50% 'L' -> 'L' -> 25% 'A-L' -> 'AL' -> 50% 'X-A-L-X' -> 'AL' -> 50% 'W-X-Y-Z' -> '' -> 0% 'Z' -> '' -> 0% 'Y-Z' -> '' -> 0%
Re: Pattern Matching with a Selected Sub-String
by ozboomer (Friar) on Jul 28, 2012 at 01:20 UTC

    Many thanks for all the options, folks.. I'll study them and will post the final usage..

    I still have to get my head around regex.. after 15+ years of 'perling', I still have to get my head around regex.. *mutter*

    Fanx! again for all your help....

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://983971]
Approved by toolic
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others drinking their drinks and smoking their pipes about the Monastery: (8)
As of 2024-04-25 15:22 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found