
Re: List::MoreUtils before, after and ... between?

by LanX (Bishop)
on Feb 21, 2012 at 16:43 UTC

in reply to List::MoreUtils before, after and ... between?

sometimes I wish we could simply apply regexes to lists (preferably in a lazy, functional way) ...

DB<152> @list= ("a".."c","DBIC","A".."C","DANCER","a".."c")
 => ("a", "b", "c", "DBIC", "A", "B", "C", "DANCER", "a", "b", "c")

DB<153> $list = join "\0", @list
 => "a\0b\0c\0DBIC\0A\0B\0C\0DANCER\0a\0b\0c"

DB<154> ($match)= $list =~ /DBIC\0(.*)\0DANCER/
 => "A\0B\0C"

DB<155> split /\0/,$match
 => ("A", "B", "C")

this will find the longest interval between the first DBIC and the last DANCER, and you are free to use more powerful regexes.

UPDATE: hmm, maybe newlines are a better choice of delimiter here.

Cheers Rolf

Replies are listed 'Best First'.
Re^2: List::MoreUtils before, after and ... between? (1 regex)
by tye (Sage) on Feb 22, 2012 at 07:34 UTC

    Heh, I like that solution. Note that adding a second 'DBIC' and/or a second 'DANCER' demonstrates flaws in the exact regex you chose. But both of those flaws are easily overcome:

    $list =~ /.*DBIC\0(.*?)\0DANCER/
    #         ^^         ^

    But perhaps people less comfortable with regexes would like it less...
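
    A small sketch of the difference between the two regexes. The test list with two 'DBIC' and two 'DANCER' entries is made up for illustration and does not come from the thread:

```perl
use strict;
use warnings;

# Hypothetical test data: two 'DBIC' markers and two 'DANCER' markers.
my @list = ('a', 'DBIC', 'x', 'DBIC', 'A', 'B', 'C', 'DANCER', 'y', 'DANCER', 'z');
my $list = join "\0", @list;

# Original regex: leftmost DBIC, greedy capture up to the last DANCER.
my ($greedy) = $list =~ /DBIC\0(.*)\0DANCER/;

# Adjusted regex: the leading .* pushes the match to the last DBIC,
# and the non-greedy .*? stops at the first DANCER after it.
my ($tight) = $list =~ /.*DBIC\0(.*?)\0DANCER/;

my @greedy = split /\0/, $greedy;
my @tight  = split /\0/, $tight;

print "original: @greedy\n";   # original: x DBIC A B C DANCER y
print "adjusted: @tight\n";    # adjusted: A B C
```

    Note that "." matches "\0" here because "." only excludes newline by default.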

    - tye        

      And did you think about the edge case of choosing a wrong delimiter which already appears in the elements? ;)

      FWIW, I'm quite often in a situation where I would prefer to apply regexes to lists, so yesterday I started meditating about a module that abstracts the delimiter problem away by locally redefining a special var like $; or $\ for the delimiter and $_ for the flattened list.

      something like

      @newlist = flat { s/START$;(.*)$;END/$1/ } @list;

      sub flat (&@) {
          my ($code,@list) = @_;
          local ( $;, $_ ) = join_reliably (@list);
          $code->();
          return split $;, $_
      }


      Cheers Rolf

        I thought that was obvious enough to enough readers and far enough "beside the point" that I needn't belabor it. (:

        I expect a reported stack trace to be highly likely to already prevent the inclusion of literal control characters and so find "\0" to be a completely reasonable choice.

        MIME has a mechanism for choosing a delimiter that I particularly like. Pick a starting delimiter however you like. Do index on the body of text that must not contain the delimiter. If a match is found, then look at the character after the match and pick a character that isn't that and append it to your delimiter. Perform an index from that point. Repeat until no match.

        Most of the time, your original delimiter is just fine. Rarely, you have to append a single character to it. In the worst possible case, you only traverse the body of text a single time and always come up with a functional delimiter.

        You could probably use that on "@list" to pick your delimiter if you are careful about a couple of edge cases (picking "aba" as your delimiter because it appears nowhere in "@list" but other-than-the-last-item ending in "ab" would still be a problem).
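
        A minimal sketch of the delimiter-growing loop described above. The name pick_delimiter and the choice of 'x'/'y' as filler characters are assumptions for illustration. It takes one already-built string and does not handle the list-boundary edge case (an item ending in a prefix of the delimiter):

```perl
use strict;
use warnings;

# Hypothetical helper: grow $delim until it no longer occurs in $text.
sub pick_delimiter {
    my ($text, $delim) = @_;
    my $pos = index($text, $delim);
    while ($pos >= 0) {
        # Character right after the match ('' if the match ends the text).
        my $next = substr($text, $pos + length($delim), 1);
        # Append any character that differs from it (arbitrary choice of x/y).
        $delim .= $next eq 'x' ? 'y' : 'x';
        # Earlier positions cannot match the longer delimiter, so resume here.
        $pos = index($text, $delim, $pos);
    }
    return $delim;
}

my $text  = "a--b--xc";
my $delim = pick_delimiter($text, "--");
print "$delim\n";   # --xx
```

        Each search resumes where the previous match was found, so the text is scanned only once overall.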

        But in practice, in Perl at least, I'm more likely to just escape embedded delimiters if such were required.
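
        One way such escaping could look, as a rough sketch. The helper names join_escaped and split_escaped and the backslash-based encoding are assumptions, not something from this thread, and a single-character delimiter other than backslash is assumed:

```perl
use strict;
use warnings;

# Hypothetical helpers. Encoding: '\' becomes '\\', the delimiter becomes '\d'.
sub join_escaped {
    my ($delim, @items) = @_;
    return join $delim, map {
        my $s = $_;
        $s =~ s/\\/\\\\/g;           # protect the escape character first
        $s =~ s/\Q$delim\E/\\d/g;    # then replace embedded delimiters
        $s;
    } @items;
}

sub split_escaped {
    my ($delim, $joined) = @_;
    return map {
        my $s = $_;
        # Undo the encoding: '\d' is the delimiter, '\\' is a backslash.
        $s =~ s/\\(.)/$1 eq 'd' ? $delim : $1/ge;
        $s;
    } split /\Q$delim\E/, $joined, -1;
}

my @items  = ('a,b', 'c\\d', 'e');
my $joined = join_escaped(',', @items);
my @back   = split_escaped(',', $joined);
print "$joined\n";
print join('|', @back), "\n";
```

        After encoding, the delimiter appears in the joined string only as a separator, so a plain split is safe.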

        - tye        

        As a proof of concept:

        use Data::Dumper qw/Dumper/;

        sub flat (&@) {
            my ($code, @list) = @_;
            local ( $;, $_ ) = join_reliably (@list);
            $code->();
            return split $;, $_
        }

        sub join_reliably {
            # just a stub
            # TODO search for reliable delimiter
            my $delim="\0";
            return $delim, join ($delim, @_);
        }

        #------------------------------
        # examples
        #------------------------------

        @list= ("a".."c","DBIC","A".."C","DANCER","a".."c");

        @newlist = flat { s/.*$;DBIC$;(.*)$;DANCER$;.*/$1/ } @list;
        print Dumper \@newlist;

        # TIMTOWTDI
        @newlist = flat { m/DBIC$;(.*)$;DANCER/; $_=$1 } @list;
        print Dumper \@newlist;

        Output:

        $VAR1 = [
                  'A',
                  'B',
                  'C'
                ];
        $VAR1 = [
                  'A',
                  'B',
                  'C'
                ];

        UPDATE: maybe splitjoin() or sploin() are better names than just flat().

        Cheers Rolf

Re^2: List::MoreUtils before, after and ... between?
by Boldra (Deacon) on Feb 22, 2012 at 09:08 UTC

    This looks great, because the list is generated by splitting a string on newlines in the first place:

    eval { confess }; my @stack = split /\n/, $@;
    The regex becomes a bit more complicated, but that usually also means I'm a bit more certain about what I'm matching.

