Beefy Boxes and Bandwidth Generously Provided by pair Networks
go ahead... be a heretic
 
PerlMonks  

PWC 244 task 2 in linear time

by Anonymous Monk
on Nov 27, 2023 at 18:22 UTC ( [id://11155873]=perlmeditation: print w/replies, xml ) Need Help??

Disclaimer: it's clickbait. The plot is curved, solution isn't linear, despite lack of nested loops, -- but fast.

Task 2: Group Hero
Submitted by: Mohammad S Anwar

You are given an array of integers representing the strength.

Write a script to return the sum of the powers of all possible 
combinations; power is defined as the square of the largest number 
in a sequence, multiplied by the smallest.

Example 1

Input: @nums = (2, 1, 4)
Output: 141

Group 1: (2) => square(max(2)) * min(2) => 4 * 2 => 8
Group 2: (1) => square(max(1)) * min(1) => 1 * 1 => 1
Group 3: (4) => square(max(4)) * min(4) => 16 * 4 => 64
Group 4: (2,1) => square(max(2,1)) * min(2,1) => 4 * 1 => 4
Group 5: (2,4) => square(max(2,4)) * min(2,4) => 16 * 2 => 32
Group 6: (1,4) => square(max(1,4)) * min(1,4) => 16 * 1 => 16
Group 7: (2,1,4) => square(max(2,1,4)) * min(2,1,4) => 16 * 1 => 16

Sum: 8 + 1 + 64 + 4 + 32 + 16 + 16 => 141

I waited for a recap to be published to write this. Most solutions were built around Algoritm::Combinatorics or similar. The one I have (accidentally) chosen, below, uses different module, I didn't investigate if A::C would be faster. They are all prohibitively slow for array of a couple dozen items or so. Some solutions avoid CPAN modules, but employ, technically, the same combinatorics rolled out manually. They may be ten-fold faster, but scale exponentially all the same.

Later in the week, significantly more optimal, no-combinatorics, solutions were added; I think there were 2-3 of them; one is benchmarked below and is self-described to be "quadratic". It easily can process arrays of hundreds of items -- with help of bigint, of course. With input as such, the only interest is in speed itself, I don't think any practical application would calculate "strength" as big integer (bigint is required for arrays of relatively small integers, starting from approx. 50 items).

Subrotines code to compare to was taken from PWC GH repository; I only changed sub names and localized bigint scope to sub body. Section titles could also be more imaginative and descriptive, sorry about that. The benchmarking sub is a bit rough and ad-hoc.

The result:

******************************* *** Math::Combinatorics ******* ******************************* +------------+------------+ | Array size | Time, sec. | +------------+------------+ | 10 | 0.010 | | 11 | 0.031 | | 12 | 0.063 | | 13 | 0.135 | | 14 | 0.276 | | 15 | 0.578 | | 16 | 1.198 | | 17 | 2.474 | | 18 | 5.104 | +------------+------------+ + +----------------------------------------------------------+ | + + + + + + + + + | 10 |-+ +-| |+ +| |+ * +| |+ * +| | * | |+ * * +| |+ * * +| | * * | | * * * | 1 |-+ * * * +-| |+ * * * +| |+ * * * * +| |+ * * * * +| |+ * * * * +| |+ * * * * * +| |+ * * * * * +| | * * * * * | | * * * * * * | 0.1 |-+ * * * * * * +-| |+ * * * * * * +| |+ * * * * * * * +| |+ * * * * * * * +| |+ * * * * * * * +| |+ * * * * * * * * +| |+ * * * * * * * * +| | * * * * * * * * | | * * * * * * * * | 0.01 |-+ * * * * * * * * * +-| |+ * * * * * * * * * +| +----------------------------------------------------------+ 10 11 12 13 14 15 16 17 18 ******************************* *** Combinatorics (manual) **** ******************************* +------------+------------+ | Array size | Time, sec. | +------------+------------+ | 13 | 0.016 | | 14 | 0.021 | | 15 | 0.052 | | 16 | 0.104 | | 17 | 0.224 | | 18 | 0.464 | | 19 | 0.990 | | 20 | 2.031 | | 21 | 4.219 | +------------+------------+ + +----------------------------------------------------------+ | + + + + + + + + + | 10 |-+ +-| |+ +| |+ +| |+ * +| | * | |+ * +| |+ * * +| | * * | | * * | 1 |-+ * * * +-| |+ * * * +| |+ * * * +| |+ * * * * +| |+ * * * * +| |+ * * * * +| |+ * * * * * +| | * * * * * | | * * * * * | 0.1 |-+ * * * * * * +-| |+ * * * * * * +| |+ * * * * * * +| |+ * * * * * * * +| |+ * * * * * * * +| |+ * * * * * * * +| |+ * * * * * * * * +| | * * * * * * * * * | | * * * * * * * * * | 0.01 |-+ * * * * * * * * * +-| |+ * * * * * * * * * +| +----------------------------------------------------------+ 13 14 15 16 17 18 19 20 21 ******************************* *** Min-max (quadratic) ******* ******************************* +------------+------------+ | Array size | Time, sec. | +------------+------------+ | 20 | 0.036 | | 40 | 0.156 | | 60 | 0.354 | | 80 | 0.635 | | 100 | 1.010 | | 120 | 1.474 | | 140 | 2.011 | | 160 | 2.651 | | 180 | 3.380 | +------------+------------+ + +-----------------------------------------------------------+ | + + + + + + + + + | 3.5 |-+ +-| | * | | * | | * | 3 |-+ * +-| | * | | * * | 2.5 |-+ * * +-| | * * | | * * | | * * | 2 |-+ * * * +-| | * * * | | * * * | | * * * | 1.5 |-+ * * * * +-| | * * * * | | * * * * | | * * * * | 1 |-+ * * * * * +-| | * * * * * | | * * * * * | | * * * * * * | 0.5 |-+ * * * * * * +-| | * * * * * * * | | * * * * * * * * | | * * * * * * * * * | 0 |-+ * * * * * * * * * +-| | + + + + + + + + + | +-----------------------------------------------------------+ 20 40 60 80 100 120 140 160 180 ******************************* *** Faster (linear?) ********** ******************************* +------------+------------+ | Array size | Time, sec. | +------------+------------+ | 800 | 0.297 | | 1200 | 0.484 | | 1600 | 0.698 | | 2000 | 0.948 | | 2400 | 1.245 | | 2800 | 1.599 | | 3200 | 2.021 | | 3600 | 2.516 | | 4000 | 3.083 | +------------+------------+ + +-----------------------------------------------------------+ | + + + + + + + | | | 3 |-+ * +-| | * | | * | | * | | * | 2.5 |-+ * * +-| | * * | | * * | | * * | | * * * | 2 |-+ * * * +-| | * * * | | * * * | | * * * * | 1.5 |-+ * * * * +-| | * * * * | | * * * * | | * * * * * | | * * * * * | 1 |-+ * * * * * * +-| | * * * * * * | | * * * * * * | | * * * * * * * | | * * * * * * * | 0.5 |-+ * * * * * * * * +-| | * * * * * * * * * | | * * * * * * * * * | | * + * + * * * + * + * + * * | +-----------------------------------------------------------+ 500 1000 1500 2000 2500 3000 3500 4000

Code:

use strict; use warnings; use experimental qw{ say postderef signatures state }; use Benchmark qw/ timeit :hireswallclock /; use POSIX qw/ floor ceil log10 /; use File::Spec::Functions 'catfile'; use File::Basename 'dirname'; use List::Util qw{ min max }; use Math::Combinatorics; use Term::Table; use Chart::Gnuplot; my $GP = catfile dirname( $^X ), '../../c/bin/gnuplot.exe'; die unless -x $GP; # assume Strawberry Perl say ' ******************************* *** Math::Combinatorics ******* ******************************* '; benchmark( \&group_hero_dave_jacoby, [ 10 .. 18 ], 'logscale' ); say ' ******************************* *** Combinatorics (manual) **** ******************************* '; benchmark( \&group_hero_e_choroba, [ 13 .. 21 ], 'logscale' ); say ' ******************************* *** Min-max (quadratic) ******* ******************************* '; benchmark( \&group_hero_jo_37, [ map $_ * 20, 1 .. 9 ], '' ); say ' ******************************* *** Faster (linear?) ********** ******************************* '; benchmark( \&group_hero_mine, [ map $_ * 400, 2 .. 10 ], '' ); sub benchmark( $coderef, $x, $logscale ) { my @xdata = @$x; my @ydata; for my $size ( @xdata ) { my $t = timeit 3, sub { $coderef-> ( 1 .. $size ) }; push @ydata, $t-> [ 1 ] / $t-> [ -1 ]; } my $table = Term::Table-> new( header => [ 'Array size', 'Time, sec.' ], rows => [ map [ $xdata[ $_ ], sprintf '%.3f', $ydata[ $_ ]], 0 .. $#xdata ], ); say for $table-> render; my $chart = Chart::Gnuplot-> new( gnuplot => $GP, terminal => 'dumb size 70, 35', ); my $dataset = Chart::Gnuplot::DataSet-> new( xdata => \@xdata, ydata => \@ydata, style => 'impulses', ); my $dx = .1 * ( max( @xdata ) - min( @xdata )); my $dy = .1 * ( max( @ydata ) - min( @ydata )); $chart-> set( xrange => [ -$dx + min( @xdata ), $dx + max( @xdata )], yrange => [ -$dy + min( @ydata ), $dy + max( @ydata )], ); if ( $logscale ) { $chart-> set( logscale => 'y', yrange => [ 10 ** ( -.2 + floor log10 min( @ydata )), 10 ** ( .2 + ceil log10 max( @ydata )) ], ) } $chart-> plot2d( $dataset ) } sub group_hero_dave_jacoby (@input) { my $output = 0; for my $c ( 1 .. scalar @input ) { my $comb = Math::Combinatorics->new( count => $c, data => [@in +put], ); while ( my @combo = $comb->next_combination ) { my $min = min @combo; my $max = max @combo; my $str = $max**2 * $min; $output += $str; } } return $output; } sub group_hero_e_choroba(@nums) { my @indicator = (0) x @nums; $indicator[-1] = 1; my $sum = 0; while (1) { my @group = @nums[grep $indicator[$_], 0 .. $#nums]; $sum += max(@group) ** 2 * min(@group); my $i = $#indicator; $indicator[$i--] = 0 while $indicator[$i]; ++$indicator[$i]; last if $i < 0; } return $sum } sub group_hero_jo_37 { use bigint; my @s = sort {$a <=> $b} @_; my $power; while (defined (my $min = $s[0])) { while (my ($offs, $max) = each @s) { $power += $min * $max**2 * ($offs ? 2**($offs - 1) : 1); } } continue { shift @s; } $power; } sub group_hero_mine { use bigint; my @nums = sort { $a <=> $b } @_; my $big = 0; for my $i ( 0 .. $#nums - 1 ) { $big += $nums[ $i ] * 2 ** ( $#nums - $i- 1 ) } my $sum = 0; for my $i ( reverse 0 .. $#nums ) { $sum += ( $big + $nums[ $i ]) * $nums[ $i ] ** 2; $big -= $nums[ $i - 1 ]; $big /= 2 } return $sum }

How? Perhaps it's better to illustrate with a picture, drawn with pen and paper. Number of combinations is 2**n. For 5-items sorted array, we are looking for sum of these terms:

f(4,0)*2**3 f(3,0)*2**2 f(2,0)*2**1 f(1,0)*2**0 f(0,0)*2**0 f(4,1)*2**2 f(3,1)*2**1 f(2,1)*2**0 f(1,1)*2**0 f(4,2)*2**1 f(3,2)*2**0 f(2,2)*2**0 f(4,3)*2**0 f(3,3)*2**0 f(4,4)*2**0

where f(i,j) is function of items indexed i, j. Direct translation to code:

sub func { $_[0] ** 2 * $_[1] } sub group_hero_blunt { use bigint; my @nums = sort { $a <=> $b } @_; my $sum = 0; for my $i ( 0 .. $#nums ) { for my $j ( 0 .. $i - 1 ) { $sum += func( $nums[ $i ], $nums[ $j ]) * 2 ** ( $i - $j - 1 ) } $sum += func( $nums[ $i ], $nums[ $i ]) } return $sum }

With function inlined and square factored out, it's ~2x faster:

sub group_hero_better { use bigint; my @nums = sort { $a <=> $b } @_; my $sum = 0; for my $i ( 0 .. $#nums ) { my $s = 0; for my $j ( 0 .. $i - 1 ) { $s += $nums[ $j ] * 2 ** ( $i - $j - 1 ) } $sum += ( $s + $nums[ $i ]) * $nums[ $i ] ** 2 } return $sum }

But looking at picture above, it's not difficult to note that nested loops aren't required. Pre-calculate one big number, and then chip away small (or not so "small") pieces in one loop. Which leads to further ~10x speed improvement, and is what has been benchmarked with 4000 items array.

Replies are listed 'Best First'.
Re: PWC 244 task 2 in linear time
by LanX (Saint) on Nov 28, 2023 at 15:58 UTC
    > powers of all possible combinations; power is defined as the square of the largest number in a sequence

    My problem with this challenge definition is that a sequence is an ordered set, much like Perl arrays.

    But it seems to me the solutions presented are based on unordered subsets.

    For clarification:

    [2,3,4,5] may be a subset of [0 ..10] and there is only one subset of length 4 possible with 2 minimal and 5 maximal.

    But 2 sequences for length 4

    • [2,3,4,5]
    • [2,4,3,5]
    Am I mistaken and is this a case of TL;DR?

    Thanks for clarification.

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    see Wikisyntax for the Monastery

    PS: Mohammed's example doesn't help because you need longer arrays to see the difference.

    Update

    ) And generally n! for length n+2

      There is a big charm in the ambiguities in the PWC tasks, IMHO. Don't take the task description too serious. Just find an interpretation of your own. It's fun!

      Greetings,
      -jo

      $gryYup$d0ylprbpriprrYpkJl2xyl~rzg??P~5lp2hyl0p$
        "Interpretation of your own" becomes particular "charming" when np-complete problems are solved in polynomial time. ;)

        Anyway, my question wasn't answered, which problem was solved here?

        Seeing terms of 2**n let's me think it's not for sub sequences.(?)

        Cheers Rolf
        (addicted to the Perl Programming Language :)
        see Wikisyntax for the Monastery

Re: PWC 244 task 2 in linear time
by jo37 (Deacon) on Nov 28, 2023 at 20:42 UTC

    The linear time solution was not submitted to PWC yet. What a pity!

    Greetings,
    -jo

    $gryYup$d0ylprbpriprrYpkJl2xyl~rzg??P~5lp2hyl0p$
Re: PWC 244 task 2 in linear time
by jo37 (Deacon) on Nov 28, 2023 at 09:27 UTC

    Cool!

    Greetings,
    -jo

    $gryYup$d0ylprbpriprrYpkJl2xyl~rzg??P~5lp2hyl0p$

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlmeditation [id://11155873]
Approved by choroba
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others having an uproarious good time at the Monastery: (5)
As of 2024-04-25 10:31 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found