PerlMonks  

Subphrases from a phrase

by themage (Friar)
on Aug 10, 2005 at 00:43 UTC (#482475=perlquestion)
themage has asked for the wisdom of the Perl Monks concerning the following question:

Hi Enlightened Ones,

I'm here looking for wisdom from you. I have an expression, and I need all the sub-expressions contained in it. For example, if I have:
    camel perl book
I would like to get:
    camel
    perl
    book
    camel perl
    perl book
    camel perl book
I wrote this small sub that returns the intended list, but I would like to know your opinions on it:

    sub list {
        my $words = shift;
        chomp $words;
        my @words = split /\s/, $words;
        my @longs = ();
        for my $i (1..$#words-1) {
            push @longs, map { join " ", @words[$_..$_+$i] } (0..$#words-$i);
        }
        push @words, @longs, $words;
        return @words;
    }
This is intended to create a query for a pay-per-view search engine, where the records whose search expression appears complete in the query will be shown.

Is there a better way to do this?

Thank you very much for sharing your wisdom.


Re: Subphrases from a phrase
by BrowserUk (Pope) on Aug 10, 2005 at 00:52 UTC

    You missed one combination, "camel book":

        #! perl -slw
        use strict;
        use List::Util qw[ sum ];

        sub Cnr{
            my( $n, @r ) = shift;
            return [] unless $n--;
            for my $x ( 0 .. ($#_ - $n) ) {
                push @r, map{ [ $_[$x], @$_ ] } Cnr( $n, @_[ ($x + 1) .. $#_ ] );
            }
            return @r;
        }

        our $N ||= 0;
        print map "@$_\n", Cnr( $N, @ARGV ) and exit if $N;
        print map{ "@$_\n" } Cnr( $_, @ARGV ) for 1 .. @ARGV;

        __END__
        P:\test>cnr camel perl book
        camel
        perl
        book
        camel perl
        camel book
        perl book
        camel perl book

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    The "good enough" maybe good enough for the now, and perfection maybe unobtainable, but that should not preclude us from striving for perfection, when time, circumstance or desire allow.
      It's entirely possible that the OP does not want "camel book" to be among the outputs in this example -- the task may be limited to all possible substrings consisting of single words or two or more adjacent words in sequence.

        Good point. I hadn't looked at it that way.


Re: Subphrases from a phrase
by chester (Hermit) on Aug 10, 2005 at 01:02 UTC
    Update: Mine gives combinations, not "phrases", so may not be a wanted solution. Thanks, ikegami.

    Using Math::Combinatorics:

        use warnings;
        use strict;
        use Math::Combinatorics;
        use Data::Dumper;

        my $phrase = 'camel perl book';
        my @data = split /\s/, $phrase;
        my @sets;

        foreach my $count (1..@data) {
            my $combinat = Math::Combinatorics->new(
                count => $count,
                data  => [@data]
            );
            while (my @combo = $combinat->next_combination()) {
                push @sets, join ' ', @combo;
            }
        }

        print Dumper \@sets;
      I don't think he wants combinations. There are 7 combinations, but he only has 6 results. Your solution lists "camel book", which is not a subexpression of "camel perl book". The difference becomes much more evident when there are four words.
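To make the four-word case concrete, here is a minimal sketch that builds both lists and prints the entries that appear only among the combinations. The fourth word "llama" and the helper names `contiguous_subphrases` and `combinations` are made up for illustration; they are not from any post in this thread.

```perl
use strict;
use warnings;

# Contiguous runs of words: every (start, end) pair with start <= end.
sub contiguous_subphrases {
    my @w = @_;
    my @out;
    for my $s (0 .. $#w) {
        for my $e ($s .. $#w) {
            push @out, join ' ', @w[$s .. $e];
        }
    }
    return @out;
}

# Order-preserving combinations: every non-empty subset of the words,
# chosen by the bits of a counter (bit $i set means word $i is kept).
sub combinations {
    my @w = @_;
    my @out;
    for my $mask (1 .. 2**@w - 1) {
        push @out, join ' ', map { $w[$_] } grep { $mask & (1 << $_) } 0 .. $#w;
    }
    return @out;
}

# "llama" is a hypothetical fourth word added for the comparison.
my @words   = qw(camel perl book llama);
my @phrases = contiguous_subphrases(@words);
my @combos  = combinations(@words);

printf "%d contiguous subphrases, %d combinations\n",
    scalar @phrases, scalar @combos;

# Show the combinations that skip over at least one word.
my %is_phrase = map { $_ => 1 } @phrases;
print "combination only: $_\n" for grep { !$is_phrase{$_} } @combos;
```

With four words there are 10 contiguous subphrases but 15 combinations, so five of the combinations (such as "camel book" and "perl llama") are not subphrases at all.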
Re: Subphrases from a phrase
by AReed (Pilgrim) on Aug 10, 2005 at 01:44 UTC
    I'm making the assumption that your definition of "phrase" requires that the words be contiguous in the original text. That would explain the missing "Camel Book" phrase. What I came up with is no shorter, but it works.
        #!/usr/bin/perl
        use strict;
        use warnings;

        my @words = qw(Camel Perl Book);

        for my $phrase_length (1..@words) {
            for my $start_idx (0..(@words-$phrase_length)) {
                print "@words[$start_idx .. ($start_idx + ($phrase_length - 1))]\n";
            }
        }
Re: Subphrases from a phrase
by sk (Curate) on Aug 10, 2005 at 04:46 UTC
    Something like this -

        #!/usr/bin/perl
        use strict;
        use warnings;

        my @phrase = qw(Camel Perl Book);
        my @subph = ();

        while (@phrase) {
            @subph = subphrases(@phrase);
            print +($_,$/) for (@subph);
            shift(@phrase);
        }

        sub subphrases {
            my $i = 0;
            my @sph = ();
            my @ph = (@_);
            while ($i < @phrase) {
                $sph[$i] = join (' ', map { $ph[$_]} (0..$i));
                $i++;
            }
            return (@sph);
        }

        __END__
        Camel
        Camel Perl
        Camel Perl Book
        Perl
        Perl Book
        Book
Re: Subphrases from a phrase
by graff (Chancellor) on Aug 10, 2005 at 05:30 UTC
    I think the only changes I would make involve a few trivial simplifications:
        sub list {
            my @words = split( /\s+/, $_[0] );
            my $lastword = $#words;
            for my $i (1..$lastword) {
                push @words, map { join " ", @words[$_..$_+$i] } (0..$lastword-$i);
            }
            return @words;
        }
    Notes:
    • Assuming that $/ has its default value, splitting on white-space makes "chomp" unnecessary
    • Your map-within-foreach covers everything, including the "substring" containing all input words
    • You can eliminate most of your temporary storage variables, but ...
    • ... it's important not to use "$#array" inside a loop when altering the size of @array in the loop (unless you're doing something different from this case, and you really know what you're doing)
    (update: fixed  @_[0] --> $_[0] in first line of sub -- and this could just as well have been "shift".)
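The last note can be illustrated with a small sketch. It repeats the loop from the sub above, and next to the start positions computed from the saved `$lastword` it also computes, purely for comparison, what the inner range would be if it read the live `$#words`. The variable names `@starts_ok` and `@starts_bad` are made up for this illustration.

```perl
use strict;
use warnings;

my @words    = qw(camel perl book);
my $lastword = $#words;    # snapshot taken before the array starts growing

for my $i (1 .. $lastword) {
    # Start positions computed from the snapshot (what the sub above uses).
    my @starts_ok = (0 .. $lastword - $i);

    # Start positions computed from the live last index, for comparison only.
    # $#words grows with every push, so this range gets too long.
    my @starts_bad = (0 .. $#words - $i);

    print "i=$i ok=(@starts_ok) bad=(@starts_bad)\n";

    push @words, map { join ' ', @words[$_ .. $_ + $i] } @starts_ok;
}

print "$_\n" for @words;
```

On the first pass the two ranges agree. On the second pass the array already holds five elements, so the live index yields start positions 0, 1 and 2 where only 0 is wanted, and the extra slices would mix original words with subphrases that were pushed earlier.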
Re: Subphrases from a phrase
by ikegami (Pope) on Aug 10, 2005 at 05:47 UTC
    I think the following is much easier to read than using map:
        sub list {
            my $words = shift;
            my @words = split /\s/, $words;
            my @rv;
            for my $s (0..$#words) {
                for my $e ($s..$#words) {
                    push(@rv, join(' ', @words[$s..$e]));
                }
            }
            return @rv;
        }
    And you can keep the optimization yours had:
        sub list {
            my $words = shift;
            my @words = split /\s/, $words;
            my @rv = @words;
            for my $s (0..$#words) {
                for my $e ($s+1..$#words) {
                    push(@rv, join(' ', @words[$s..$e]));
                }
            }
            return @rv;
        }
Re: Subphrases from a phrase
by Elliott (Pilgrim) on Aug 10, 2005 at 11:01 UTC
    I think more info on the application would be helpful. What are you trying to achieve and why? It's possible a simple regexp would do what you want (depending on what you want).
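As one possible reading of the regexp idea, here is a hedged sketch that checks the other way round: instead of generating every subphrase of the query, it tests whether a stored search expression occurs in the query as a run of whole words. The helper name `query_contains` and the sample records are assumptions made for illustration; nothing in the thread says how the search engine actually stores or matches its expressions.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $expr occurs in $query as a run of whole words.
sub query_contains {
    my ($query, $expr) = @_;

    # Collapse runs of whitespace and trim the ends of both strings,
    # so that "camel  perl" is treated the same as "camel perl".
    for ($query, $expr) {
        s/\s+/ /g;
        s/^ | $//g;
    }
    return 0 unless length $expr;

    # \Q...\E quotes any regex metacharacters in the expression; the
    # lookarounds require whitespace or a string edge on both sides.
    return $query =~ /(?<!\S)\Q$expr\E(?!\S)/ ? 1 : 0;
}

my $query  = 'camel perl book';
my @stored = ('perl book', 'camel book', 'perl', 'llama');   # made-up records

print "$_\n" for grep { query_contains($query, $_) } @stored;
```

For this sample data only "perl book" and "perl" are printed; "camel book" is rejected because its words are not adjacent in the query. Whether this is preferable to generating the subphrases up front depends on how many stored expressions there are and how they can be looked up.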
