Beefy Boxes and Bandwidth Generously Provided by pair Networks
Your skill will accomplish
what the force of many cannot

comment on

( #3333=superdoc: print w/replies, xml ) Need Help??

Nice one. Now send me a version that I don't have to compile:)

Seriously, thanks for doing the benchmark. I am doing the same right now, but am having trouble ensuring that I'm comparing apples with apples. I have one (perl) solution that hasn't been posted yet which might do better, but it's very difficult to benchmark as it relies on @- to get the positions and lengths and this system var goes out of scope very quickly.

Here is my simplistic test program run on a 1920 char string. One run only and a different string from yours so direct comparison is dodgey, but it finds 351 triplets in the 1920 chars string in a little under 0.4 of a second, which may mean it isn't doing so bad by comparison with your C version. Maybe you could include it into you benchmark (HARD!) or at least time one run of it using the same data as you did for your other tests for comparison.

#! perl -slw use strict; use re 'eval'; #! Took me ages to find this. use Benchmark::Timer; my $t = new Benchmark::Timer; #! Set up big regex. 1-time hit. my $re ='(?:(.)(??{"$+*"}))?' x 500; $re = qr/$re/o; # Test data(+2) x 32 to make it more comparable $_ = ' aaaa bbbbccccccccbbbb aaaabbbbbcdddddddddddddddddddddd' x +32; #! All the data is gathered here in one shot. #! Chars in @c - start positions in @-, lengths derived: length N = $- +[N+2] - $-[N+1] #! Caveat. lengths must be used immediately (or stored elsewhere which + costs) else the go stale:) $t->start('bigre'); my @c = m/$re/; #! THIS LINE DOES ALL THE WORK. #! This truncates the list to exclude null matches returned from regex +. $#c = $#- -1; $t->stop('bigre'); #! data accessed here. printf "('%1s', %3d, %3d)\n", $c[$_], $-[$_+1], ( $-[$_+2] || $+[0] ) +- $-[$_+1] for 0..$#c; print $#c, ' triplets found in '; $t->report; __END__ C:\test>212796-2 (' ', 0, 3) ('a', 3, 4) (' ', 7, 3) .... 345 values omitted .... ('b', 1892, 5) ('c', 1897, 1) ('d', 1898, 22) 351 triplets found in 1 trial of bigre (371ms total) C:\test>

The basic idea was to push the loop into the regex engine instead of external to it, and use the regex engines well tuned optimisations to do the data gathering for me. I originally though that using the /g modifier would do the trick, but it only returns the positions for the last match made. Then I thought of using a repeated regex ({1..n}) but again this is treated such that only the positions for the last capture group are available.

The answer was to construct a single large regex with enough capture groups to gather the data I needed. This was made harder by the need to only capture the first char of the group, but also to know its end position. That's when I discovered the (??{}) re and combined this with $+ in order to achieve my aims.

I'm not sure that this is good or maintainable Perl, but it seems to work and appears on preliminary tests to be very fast, which was and is the only criteria by which I am judging it.

Okay you lot, get your wings on the left, halos on the right. It's one size fits all, and "No!", you can't have a different color.
Pick up your cloud down the end and "Yes" if you get allocated a grey one they are a bit damp under foot, but someone has to get them.
Get used to the wings fast cos its an 8 hour day...unless the Govenor calls for a cyclone or hurricane, in which case 16 hour shifts are mandatory.
Just be grateful that you arrived just as the tornado season finished. Them buggers are real work.

In reply to Re: Re: Re: Efficient run determination. by BrowserUk
in thread Efficient run determination. by BrowserUk

Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":

  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.
  • Log In?

    What's my password?
    Create A New User
    and the web crawler heard nothing...

    How do I use this? | Other CB clients
    Other Users?
    Others romping around the Monastery: (3)
    As of 2019-05-20 09:08 GMT
    Find Nodes?
      Voting Booth?
      Do you enjoy 3D movies?

      Results (125 votes). Check out past polls.