PerlMonks  

Re^3: Table shuffling challenge

by poj (Priest)
on Aug 23, 2013 at 22:11 UTC (#1050743)


in reply to Re^2: Table shuffling challenge
in thread Table shuffling challenge

In the quest for speed I've written this code in a way I normally wouldn't, but hopefully it reflects your requirement. 100 iterations take about a minute on my desktop, so the million would take roughly 170 hours (10,000 minutes)! I'll work on speeding it up.

#!perl
use strict;
use List::Util 'shuffle';

# parameters
my $size   = 100_000;   # rows in the table
my $repeat = 100;       # number of shuffle iterations

# one array per data column
my @col1=(); my @col2=(); my @col3=(); my @col4=(); my @col5=();
my @col6=(); my @col7=(); my @col8=(); my @col9=(); my @col10=();

# create a test file
my $file = 'table.dat';
test_data($file,$size);

# load file into arrays (field 0 is the line number, fields 1..10 are 0/1 values)
open IN,'<',$file or die "Could not open $file : $!";
my $total=0;
while (<IN>){
  chomp;
  my @f = split "\t",$_;
  push @col1,$f[1]; push @col2,$f[2]; push @col3,$f[3]; push @col4,$f[4];
  push @col5,$f[5]; push @col6,$f[6]; push @col7,$f[7]; push @col8,$f[8];
  push @col9,$f[9]; push @col10,$f[10];
  for my $i (1..10){
    $total += $f[$i];
  }
}
print "$. lines read from $file. Total 1's = $total\n";
close IN;

my @count=();
my $sum=0;
my $t0 = time();

# shuffle arrays and count
my @c1=(); my @c2=(); my @c3=(); my @c4=(); my @c5=();
my @c6=(); my @c7=(); my @c8=(); my @c9=(); my @c10=();
for my $n (1..$repeat){
  @c1 = shuffle @col1; @c2 = shuffle @col2; @c3 = shuffle @col3;
  @c4 = shuffle @col4; @c5 = shuffle @col5; @c6 = shuffle @col6;
  @c7 = shuffle @col7; @c8 = shuffle @col8; @c9 = shuffle @col9;
  @c10 = shuffle @col10;
  # pop one value from each shuffled column to form a row, then tally the row sum
  for my $i (1..$size){
    $sum=0;
    $sum += pop @c1; $sum += pop @c2; $sum += pop @c3; $sum += pop @c4;
    $sum += pop @c5; $sum += pop @c6; $sum += pop @c7; $sum += pop @c8;
    $sum += pop @c9; $sum += pop @c10;
    ++$count[$sum];
  }
}

# report stats (reuse $total; a second "my" would redeclare it)
$total=0;
print "Sum Count\n";
print "--- ----------\n";
for my $i (1..10){
  printf "%2d %10d\n",$i,$count[$i];
  $total += $i * $count[$i];
}
print "Total = $total\n";

# run time
my $dur = (time() - $t0);
print "$repeat repeats for $size line table took $dur secs\n";

# test file: line number followed by ten tab-separated random 0/1 values
sub test_data {
  my ($file,$lines) = @_;
  open OUT,'>',$file or die "Could not open $file : $!";
  for (1..$lines){
    print OUT $_;
    for (1..10){
      print OUT "\t".int rand(2);
    }
    print OUT "\n";
  }
  close OUT;
  print "$lines created in $file\n";
}
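For comparison, the ten named column arrays can also be held in an array of arrays, which is closer to how I would normally write it. This is only a sketch: the helper name shuffled_row_sum_counts and the small table are made up for illustration, and I have not benchmarked it against the version above, so it may well be slower.

```perl
#!perl
use strict;
use warnings;
use List::Util 'shuffle';

# Hypothetical helper (not part of the script above): shuffle every column
# independently $repeat times and tally how often each row sum occurs.
# $cols is a reference to an array of column arrays holding 0/1 values.
sub shuffled_row_sum_counts {
    my ($cols, $repeat) = @_;
    my $rows  = @{ $cols->[0] };
    my @count = (0) x (@$cols + 1);    # possible sums: 0 .. number of columns
    for (1 .. $repeat) {
        my @sum = (0) x $rows;         # one running sum per row
        for my $col (@$cols) {
            my @s = shuffle @$col;
            $sum[$_] += $s[$_] for 0 .. $rows - 1;
        }
        ++$count[$_] for @sum;
    }
    return \@count;
}

# Small made-up table: 3 columns of 4 rows each.
my @cols  = ( [1,0,1,0], [1,1,0,0], [0,0,0,1] );
my $count = shuffled_row_sum_counts(\@cols, 5);
printf "%2d %10d\n", $_, $count->[$_] for 0 .. $#$count;
```

Shuffling only moves values within a column, so the sum of $i * $count->[$i] over all sums should equal $repeat times the number of 1's in the table. That is the same sanity check the "Total" line in the report performs.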
poj

