Beefy Boxes and Bandwidth Generously Provided by pair Networks
Don't ask to ask, just ask
 
PerlMonks  

Find Length Of Longest Ascending/Descending Sequence

by Limbic~Region (Chancellor)
on May 09, 2011 at 17:45 UTC ( #903800=perlquestion: print w/ replies, xml ) Need Help??
Limbic~Region has asked for the wisdom of the Perl Monks concerning the following question:

All,
I asked about a regex solution in the CB to the following problem and was prompted to post a SoPW since others might enjoy it.

Assume you have a string of integers (no other characters) such as '82665409266027476709324472'. The task is to find the longest sequence of contiguous integers in ascending or descending order wrapping around 0 with a difference in adjacent pairs of abs(1). In other words, '2468' would be a length of 1 even though they are contiguous and in ascending order. It would have to be '2345678' to count. Regarding wrapping around 0, '78901' is a length of 5 just as '32109' is.

Originally, I was looking for the most compact solution but now I am interested in any creative ways you might think of to solve it (experimental features or not). You might think of this as a modified version of longest increasing sequence with additional constraints (I have solved that problem before).

Cheers - L~R

Comment on Find Length Of Longest Ascending/Descending Sequence
Re: Find Length Of Longest Ascending/Descending Sequence
by wind (Priest) on May 09, 2011 at 18:10 UTC

    Well, the first step to solving this problem is coming up with some more interesting fake data to play with, so the following __DATA__ segment contains 20 60 character strings with sequences of 4 numbers or more.

    Script included so you can expand your data set if you so desire:

    for (1..20) { my @str = map {int rand 10} (1..60); my $diff = join '', map {($str[$_] - $str[$_+1]) % 10} (0..$#str-1 +); redo unless $diff =~ /1{4}|9{4}/; print join('', @str) . "\n"; } __DATA__ 461771621368210983721913243963580233112903255149955120374576 412182871611285727937329810487417347645678899662426008558163 806445168622715729854145415356850576789033724164441981190358 737993205942843210944632043755783070487691419376068748520835 797836732611301651049157941402677493925908901268033467721123 800343673335834413425690063499636965547527671535789010360222 364556954738232740095412905428192849218058869868234154321153 656886016318465389850845261620694916565611559209695154321950 439823085483918818243602360632486757938483367890464586909210 275010825726438419734041987657085162773815821872771619322197 362862003186986435628956392997462270571538785287890111165953 865788969800483978909876396956896428059483834654443783640994 084943262444682256649196170826699756789404699917511813404654 038022435612197850835139478901471838789042407199858583654421 244712325541896615697820925442735727638540309574604984321042 799272593665633014792964321006295907297545039391442339309443 875134840933937552505069624569692178901372764210973501240868 160604373233208765433548554842941708483446447649679473623104 787966403724853567890040276258105492183424704761915858524942 234565329824361008677075492264130762647177370538202678949041
    I verified that each of the above data sets all contain an absolute maximum. Here are the solutions:
    pos = 12, str = '21098' pos = 37, str = '45678' pos = 35, str = '67890' pos = 13, str = '432109' pos = 41, str = '89012' pos = 48, str = '78901' pos = 52, str = '54321' pos = 52, str = '54321' pos = 43, str = '67890' pos = 24, str = '98765' pos = 47, str = '78901' pos = 19, str = '09876' pos = 34, str = '56789' pos = 25, str = '78901' pos = 53, str = '43210' pos = 23, str = '43210' pos = 34, str = '78901' pos = 14, str = '876543' pos = 15, str = '567890' pos = 0, str = '23456'

    Update: Added expected results for fake data.

      Here's my round 1 solution. Fancier solutions to come, work willing.

      while (<DATA>) { chomp; my $line = $_; my @chars = split ''; my $diff = join '', map {($chars[$_+1] - $chars[$_]) % 10} (0..$#c +hars-1); my $maxlen = 0; my $maxpos = 0; while ($diff =~ m/(1+|9+)/g) { if ($maxlen < length $1) { $maxlen = length $1; $maxpos = pos($diff) - $maxlen; } } printf "pos = % 2s, str = '%s'\n", $maxpos, substr($line, $maxpos, + $maxlen + 1); }
Re: Find Length Of Longest Ascending/Descending Sequence
by ikegami (Pope) on May 09, 2011 at 18:21 UTC

    How about

    my ($longest) = reduce { length($a) >= length($b) ? $a : $b } / ( [0-9] (?: (?: (?<=0)1 | (?<=1)2 | (?<=2)3 | (?<=3)4 | (?<=4)5 | (?<=5)6 | (?<=6)7 | (?<=7)8 | (?<=8)9 | (?<=9)0 )+ | (?: (?<=0)9 | (?<=1)0 | (?<=2)1 | (?<=3)2 | (?<=4)3 | (?<=5)4 | (?<=6)5 | (?<=7)6 | (?<=8)7 | (?<=9)8 )+ ) ) /xg;

    And I thought contiguous would be easier than non-contiguous.

    Update: No, that's not right. Endpoints can belong to an ascending and a descending series.

    my ($longest) = reduce { length($a) >= length($b) ? $a : $b } / ( [0-9] (?: (?<=0)1 | (?<=1)2 | (?<=2)3 | (?<=3)4 | (?<=4)5 | (?<=5)6 | (?<=6)7 | (?<=7)8 | (?<=8)9 | (?<=9)0 )+ ) /xg, / ( [0-9] (?: (?<=0)9 | (?<=1)0 | (?<=2)1 | (?<=3)2 | (?<=4)3 | (?<=5)4 | (?<=6)5 | (?<=7)6 | (?<=8)7 | (?<=9)8 )+ ) /xg;

    Update: This can be simplified if you just want the length of the longest rather than the sequence itself.

    my $longest = max map length, / ( (?: 1(?=2) | 2(?=3) | 3(?=4) | 4(?=5) | 5(?=6) | 6(?=7) | 7(?=8) | 8(?=9) | 9(?=0) | 0(?=1) )+ | (?: 1(?=0) | 2(?=1) | 3(?=2) | 4(?=3) | 5(?=4) | 6(?=5) | 7(?=6) | 8(?=7) | 9(?=8) | 0(?=9) )+ ) /xg; if (defined($longest)) { ++$longest; } else { $longest = /[0-9]/ ? 1 : 0; }

    Ok, maybe not simpler.

    I could save typing by using reverse, but it would be a bit slower.

      As far as I can tell, when using a purely regex solution, you pretty much have to keep the scanning for ascending and descending sequences separated. Otherwise overlapping sequences will be missed:

      123xxxx56543

      Capturing in one go will return qw(123 56 543) instead of qw(123 56 6543);

      my $asc = join '|', map {"$_(?=".(($_+1)%10).")"} (0..9); my $dsc = join '|', map {"$_(?=".(($_+9)%10).")"} (0..9); while (<DATA>) { my $longest = max map length, /((?:$asc)+)/g, /((?:$dsc)+)/g; print "$longest\n"; }

      Update: This was in reference to your last solution, as previous ones did take this into account.

        Capturing in one go will return qw(123 56 543) instead of qw(123 56 6543);

        Which isn't a problem when you just want the length. The result is 1 more than the length of the longest capture. That's why I increment $longest at the end.

        ...or rather, that's why I was planning on incrementing $longest at the end. Fixed.

Re: Find Length Of Longest Ascending/Descending Sequence
by ikegami (Pope) on May 09, 2011 at 19:03 UTC

    A non-regex solution:

    my @digits = /./sg; my @next = (1,2,3,4,5,6,7,8,9,0); my @prev = (9,0,1,2,3,4,5,6,7,8); my $longest_start = my $start = 0; my $longest_len = my $len = 1; my $slope = 0; for (1..$#digits) { my $new_slope = $digits[$_-0] == $next[$digits[$_-1]] ? +1 : $digits[$_-0] == $prev[$digits[$_-1]] ? -1 : 0; if ($new_slope) { if ($new_slope == $slope) { ++$len; } else { $start = $_ - 1; $len = 1; } } $slope = $new_slope; if ($len > $longest_len) { $longest_start = $start; $longest_len = $len; } } my $longest = substr($_, $longest_start, $longest_len+1);

    Update: Fixed lack of wrapping.

Re: Find Length Of Longest Ascending/Descending Sequence
by BrowserUk (Pope) on May 09, 2011 at 19:58 UTC

    How does 01234567890987654321 measure?

    A single compliant sequence? Two that overlap? Other?


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      I'm going with two that overlap:
      qw(01234567890 0987654321);
      BrowserUk,
      The sequence is defined as either ascending or descending - not both at the same time so two overlapping. In the end, I am only interested in the length of the longest so if there are ties for first place pick one.

      Cheers - L~R

        I am only interested in the length of the longest

        In that case, this should work. Nothing special, but just one pass of the string:

        #! perl -slw use strict; sub z { my $soFar = 1; my $lastC = substr $_[0], 0, 1; my $lastS = $lastC <=> substr $_[0], 1, 1; my $bestN = 0; for my $p ( 1 .. length( $_[0] )-1 ) { my $this = substr $_[0], $p, 1; my $d = abs( $this - $lastC ); my $s = $this <=> $lastC; $s *= -1 if $d == 9; if( $s != $lastS or $d != 1 and $d != 9 ) { $soFar = 1; $lastS = $s; } if( ++$soFar > $bestN ) { $bestN = $soFar; } $lastC = $this; } return $bestN; } while( <DATA> ) { chomp; my $l = z( $_ ); print $_, ' : ', $l; } __DATA__ 01234567890987654321 01234567890123456789 78901234567890123456 98765432109876543210 21098765432109876543 012345678890123456789 789012345677890123456 987654321099876543210 210987654322109876543 012345678900987654321

        That is one nasty, messy problem :)


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Find Length Of Longest Ascending/Descending Sequence
by chrestomanci (Priest) on May 09, 2011 at 20:31 UTC

    Interesting problem. ++ Perhaps a subject for golf?

    Though if you where not a long term resident of the monastery and I did not know you, I would be thinking Homework problem and it would be --

Re: Find Length Of Longest Ascending/Descending Sequence
by BrowserUk (Pope) on May 09, 2011 at 21:15 UTC

    It won't win "most compact", but it might prove more efficient than many:

    #! perl -slw use strict; sub z { my $soFar = 1; my $lastC = substr $_[0], 0, 1; my( $lastP, $bestN, $bestP ) = ( 0 ) x 3; for my $p ( 1 .. length( $_[0] )-1 ) { my $this = substr $_[0], $p, 1; my $d = abs( $this - $lastC ); if( $d != 1 and $d != 9 ) { $soFar = 1; $lastP = $p; } ++$soFar; if( $soFar > $bestN ) { $bestN = $soFar, $bestP = $lastP; } $lastC = $this; } return $bestN, $bestP; } while( <DATA> ) { chomp; print; my( $l, $p ) = z( $_ ); print ' ' x $p, substr( $_, $p, $l ), "\t", $l, "\n"; } __DATA__ 01234567890123456789 78901234567890123456 98765432109876543210 21098765432109876543 01234567890987654321 012345678890123456789 789012345677890123456 987654321099876543210 210987654322109876543 012345678900987654321 012345678909876543210

    In the absence of an answer re: 090, I've noted your abs(1) criteria. Also, first of equal longest.

    Could be simplified if you don't need to know where the longest occurs.

    Output:

    C:\test>junk 01234567890123456789 01234567890123456789 20 78901234567890123456 78901234567890123456 20 98765432109876543210 98765432109876543210 20 21098765432109876543 21098765432109876543 20 01234567890987654321 01234567890987654321 20 012345678890123456789 890123456789 13 789012345677890123456 78901234567 11 987654321099876543210 98765432109 11 210987654322109876543 21098765432 11 012345678900987654321 01234567890 11 012345678909876543210 012345678909876543210 21

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Find Length Of Longest Ascending/Descending Sequence
by ambrus (Abbot) on May 09, 2011 at 21:40 UTC

    Let me try to make a solution without looking at the other replies.

    I apologize in advance, for such an APL-style solution translates to very ugly code in perl. It would look nicer in haskell or some ml or apl language, but I'll stick to perl in this reply. In fact the code looks a bit like some of my prolog homework, where more than half of the code is defining several generic list handling functions.

    use warnings; use strict; my $s = @ARGV ? $ARGV[0] : "82665409266027476709324472"; use List::Util qw"min"; sub zip { my($p, $x, $y) = @_; map { scalar &$p($$x[$_], $$y[$_]) } 0 .. min(@$x - 1, @$y - 1); } sub zipadj2 { my($p, @y) = @_; zip $p, [@y[0 .. @y - 2]], [@y[1 .. @y - 1]]; } sub scanr1 (&@) { my($p, @y) = @_; my @r = my $m = pop @y; for my $y (reverse @y) { unshift @r, ($m = &$p($y, $m)); } @r; } sub maxind { my(@y) = @_; my($p, $m) = (0, $y[0]); for my $i (1 .. @y - 1) { if ($m < $y[$i]) { ($p, $m) = ($i, $y[$i]); } } wantarray ? ($p, $m) : $p; } my @s = split //, $s; my @d = map { do { no integer; $_ % 10 } } zipadj2 sub { my($a, $w) = @_; $a - $w }, @s; my $lu = sub { my($k) = @_; my($p, $m) = maxind scanr1 sub { my($x, $y) = @_; $x * (1 + $y) }, + map { $k == $_ } @d; $p, $m + 1; }; my($p1, $l1) = &$lu(1); # decreasing my($p9, $l9) = &$lu(9); # increasing my($p, $l) = $l9 <= $l1 ? ($p1, $l1) : ($p9, $l9); print "The longest subsequence you want starts at position ", $p, " and has length ", $l, " and its contents are (", join("", @s[$p .. $p + $l - 1]), ").\n"; my $v = $s; substr $v, $p + $l, 0, ")"; substr $v, $p, 0, "("; print "In context, it's here: ", $v, ".\n"; __END__

    Update: thanks for the test cases, wind.

    Update: It turns out, it comes out pretty ugly and long in Haskell too. (Maybe that's because I don't have too much practice in Haskell.)

    import Data.Char (digitToInt, intToDigit); import Data.List (elemIndex); import Data.Maybe (fromJust); import System (getArgs); egin :: String; egin = "82665409266027476709324472"; zipadj2 :: (a -> a -> b) -> [a] -> [b]; zipadj2 p y = zipWith p (init y) (tail y); main = do { args <- getArgs; let { ss = head (args ++ [egin]); s = map digitToInt ss; d = map (flip mod 10) (zipadj2 (-) s); gr x y = x * succ y; lu k = let { h = scanr1 gr (map (fromEnum . (k ==)) d :: [Int]); mh = maximum h; } in (fromJust (elemIndex mh h), succ mh); pl1@(_, l1) = lu 1; pl9@(_, l9) = lu 9; (p, l) = if l9 <= l1 then pl1 else pl9; fs = map intToDigit; }; putStr ("The longest subsequence you want starts at position " ++ show p ++ " and has length " ++ show l ++ " and its contents a +re (" ++ fs (take l (drop p s)) ++ ").\n" ++ "In context, it's here: " ++ fs (take p s) ++ "(" ++ fs (take l (drop p s)) ++ ")" ++ fs (drop l (drop p s)) ++ ".\n"); };

      My haskell approach. Only prints the length and the longest substring which I think was the original problem. And there's probably a module function that does take2While that I just don't know about.

      import Data.List (sort,tails) isAscP x y | succ x == y = True isAscP x y | x == '9', y == '0' = True isAscP _ _ = False isDescP x y | succ y == x = True isDescP x y | x == '0', y == '9' = True isDescP _ _ = False take2While :: (a -> a -> Bool) -> [a] -> [a] take2While _ [] = [] take2While _ (x:[]) = [x] take2While p (x:xs@(y:ys)) | p x y == True = x : take2While p xs | otherwise = [x] longestSeq p s = last . sort . map (\x -> (length x, x)) . map (take2W +hile p) $ tails s longestAsc s = longestSeq isAscP s longestDesc s = longestSeq isDescP s main = do mapM_ print $ map ( \x -> last $ sort [ longestAsc x, longestDesc x +] ) test_data test_data = ["46177162136821098372191324396358023311290325514995512037 +4576", "41218287161128572793732981048741734764567889966242600855 +8163", "80644516862271572985414541535685057678903372416444198119 +0358", "73799320594284321094463204375578307048769141937606874852 +0835", "79783673261130165104915794140267749392590890126803346772 +1123", "80034367333583441342569006349963696554752767153578901036 +0222", "36455695473823274009541290542819284921805886986823415432 +1153", "65688601631846538985084526162069491656561155920969515432 +1950", "43982308548391881824360236063248675793848336789046458690 +9210", "27501082572643841973404198765708516277381582187277161932 +2197", "36286200318698643562895639299746227057153878528789011116 +5953", "86578896980048397890987639695689642805948383465444378364 +0994", "08494326244468225664919617082669975678940469991751181340 +4654", "03802243561219785083513947890147183878904240719985858365 +4421", "24471232554189661569782092544273572763854030957460498432 +1042", "79927259366563301479296432100629590729754503939144233930 +9443", "87513484093393755250506962456969217890137276421097350124 +0868", "16060437323320876543354855484294170848344644764967947362 +3104", "78796640372485356789004027625810549218342470476191585852 +4942", "23456532982436100867707549226413076264717737053820267894 +9041" ]

      Produces the same strings as the first response for the same test data.

      $ runhaskell ascruns.hs (5,"21098") (5,"45678") (5,"67890") (6,"432109") (5,"89012") (5,"78901") (5,"54321") (5,"54321") (5,"67890") (5,"98765") (5,"78901") (5,"09876") (5,"56789") (5,"78901") (5,"43210") (5,"43210") (5,"78901") (6,"876543") (6,"567890") (5,"23456")

      Update: Revised version that produces offsets as well as being a bit shorter terser (thanks ambrus for maximum vice last . sort)

      longestSeq p s = maximum tuples where ts = tails s runs = map (take2While p) ts tuples = zip3 (map length runs) [0..] runs $ runhaskell ascruns.hs | head -n 4 (len,pos,str) (5,12,"21098") (5,37,"45678") (5,35,"67890")

      Further Duh: isDescP is really just flip isAscP . . .

      The cake is a lie.
      The cake is a lie.
      The cake is a lie.

Re: Find Length Of Longest Ascending/Descending Sequence
by moritz (Cardinal) on May 10, 2011 at 08:41 UTC
    Not a working solution, just a thought experiment:

    Perl 6 has a regex construct spelled <*abcd>. It is a shortcut for a|ab|abc|abcd (also note that in Perl 6 regexes, the longest pipe-delimited alternative wins, not the first one as in Perl 5).

    Sadly no compiler implements the <*abcd> feature yet, so I can't demonstrate it, but I think something like this should work:

    my token longest_asc_desc { # ascending | <*0123456789> ** <?after 9> | <*1234567890> ** <?after 0> | <*2345678901> ** <?after 1> ... # descending: | <*9876543210> ** <?after 0> | <*8765432109> ** <?after 9> | <*7654321098> ** <?after 8> ... }

      nice feature :-)

      How will it match data like the following?

      11101234567890123411111 => expected result: 15
      Will you need to create the pattern in <*0123456789> dynamically based on the length of the data?

      Rata
        Notice the repetition:
        <*0123456789> ** <?after 9>

        is short for

        <*0123456789> [ <?after 9> <*0123456789>]*

        So if it matches the first one, and upto 9 (that's what the <?after 9> checks), it tries again to start with 0.

        To extract the longest match, you need to collect all, and overlapping (because the last digits of an ascending sequence can be the first of a descending sequence):

        my regex seq { ... } my $length = [max] $teststring.match(&seq, :g, :overlap)>>.chars;
Re: Find Length Of Longest Ascending/Descending Sequence
by kennethk (Monsignor) on May 10, 2011 at 16:34 UTC
    A little late to the party, but I didn't see any other posts that leveraged the cyclical nature of Z10, so seemed worth putting in my two cents. Should be pretty efficient as well, given the once-through approach.

    #!/usr/bin/perl -w use strict; while (<DATA>) { chomp; my $max = 0; my $count = 1; my $direction = 0; my $last = substr $_, 0, 1; for my $this (split //) { local $_ = $this - $last; ++$count and next if $direction and $_ % 10 == $direction; $count = 2 if $direction = abs==1 || abs==9 ? $_ % 10 : 0; } continue { $max = $max > $count ? $max : $count; $last = $this; } print "$_\t=> $max\n"; } __DATA__ 82665409266027476709324472 2468 2345678 78901 78909 32109 32101 909 09090

    outputs

    82665409266027476709324472 => 3 2468 => 1 2345678 => 7 78901 => 5 78909 => 4 32109 => 5 32101 => 4 909 => 2 09090 => 2

    Update: Ambrus caught a mistake in my $direction assignment for the case of 909. Corrected bugs (changed $this <=> $last to $_ % 10) and added cases to test set.

Re: Find Length Of Longest Ascending/Descending Sequence
by masak (Scribe) on May 10, 2011 at 19:59 UTC
    My Perl 6 contribution:
    my $seq = "82665409266027476709324472"; my $diff = join "", map { ($^b - $^a) % 10 }, ($seq.comb Z $seq.comb.rotate); my $longest = $diff.comb(/9+ | 1+/, :match).max({.to - .from}); say $seq.substr($longest.from, $longest.to - $longest.from + 1); # 654

      Would this work for strings like "834567"? I think you may need to replace ($seq.comb Z $seq.comb.rotate) with something like ($seq.comb[0..*-2] Z $seq.comb[1..*-1]).

Re: Find Length Of Longest Ascending/Descending Sequence
by LanX (Canon) on May 10, 2011 at 23:14 UTC
    2 approaches by calculating the string of modulo distances of two neighboring digits

    Any sequence of consecutive 1 or 9 reflects ascending or descending sequences.

    The difficulty is to get all overlapping pairs in an elegant way w/o trouble with edge cases.

    Version 1 manipulates pos() of m//g

    Version 2 uses reduce {...;$b}

    use List::Util qw/max reduce/; while(<DATA>){ chomp; #-- Version 1 my $diff=""; while (/(\d)(\d)/g) { $diff.=($2-$1)%10; pos($_)--; } my $v1= 1 + max map { length } $diff =~ m/(1+|9+)/g; #--- Version 2 $diff=""; reduce { $diff.=($a-$b)%10; $b} split //; my $v2= 1 + max map { length } $diff =~ m/(1+|9+)/g; print "V1: $v1 V2: $v2 $_\n"; } __DATA__ 461771621368210983721913243963580233112903255149955120374576

    Cheers Rolf

      Does this /(\d)(\d)/g; pos($_)--; thing have some advantage over /(\d)(?=\d)/g; ? I mean, I wouldn't be surprised if it made sense given the strange hacks with pos() I've seen, but could you give some explanation please?

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://903800]
Approved by LanX
Front-paged by LanX
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others taking refuge in the Monastery: (10)
As of 2014-12-18 07:39 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    Is guessing a good strategy for surviving in the IT business?





    Results (44 votes), past polls