Re^2: Memory usage while tallying instances of lines in a .txt file

Re^3: Memory usage while tallying instances of lines in a .txt file by BrowserUk (Patriarch) on Dec 05, 2016 at 20:19 UTC
The following code produces identical results to choroba's code but uses less than 1/4 of the memory (180MB vs 795MB for my test dataset) and runs more quickly:

    #! perl -slw
    use strict;
    use List::Util qw[ first ];

    # Locate the 'Strand' column from the header line.
    my @headers = split ' ', scalar <>;
    my $f = first { $headers[$_] eq 'Strand' } 0 .. $#headers;

    # Two packed strings hold one 8-bit counter per distinct key pair.
    my( $cCounts, $wCounts, $n, %index ) = ( '', '', 0 );
    while( <> ) {
        chomp;
        my @F = split ' ';
        # Give each new pair of the two columns after 'Strand' the next slot number.
        my $index = $index{ $F[ $f+1 ] }{ $F[ $f + 2 ] } //= $n++;
        ++vec( $F[ $f ] eq 'w' ? $wCounts : $cCounts, $index, 8 );
    }

    # Print both counters for every key pair.
    while( my( $key, $subhash ) = each %index ) {
        while( my( $subkey, $index ) = each %{ $subhash } ) {
            print join "\t", $key, $subkey, vec( $cCounts, $index, 8 ), vec( $wCounts, $index, 8 );
        }
    }
    __END__
    1177246.pl 1177246.dat > 1177246.out

It assumes no count will be greater than 256. If that's too small, change the three 8s to 16s for a small increase in memory consumption.
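For readers unfamiliar with `vec`, here is a minimal sketch of the same idea on a tiny in-memory dataset. The records and the variable name `$slot` are made up for illustration and are not from the original post; the point is only that each distinct key pair gets a slot number, and the counters live in two packed strings with one byte per slot.

```perl
use strict;
use warnings;

# Hypothetical sample records: strand, then the two key fields.
my @records = (
    [ 'w', 'chr1', '100' ],
    [ 'c', 'chr1', '100' ],
    [ 'w', 'chr1', '100' ],
    [ 'c', 'chr2', '7'   ],
);

my( $cCounts, $wCounts, $n, %index ) = ( '', '', 0 );

for my $rec ( @records ) {
    my( $strand, $k1, $k2 ) = @$rec;

    # First sighting of a key pair assigns it the next free slot number.
    my $slot = $index{ $k1 }{ $k2 } //= $n++;

    # Each slot is one 8-bit field inside a plain string.
    if( $strand eq 'w' ) { ++vec( $wCounts, $slot, 8 ) }
    else                 { ++vec( $cCounts, $slot, 8 ) }
}

for my $k1 ( sort keys %index ) {
    for my $k2 ( sort keys %{ $index{ $k1 } } ) {
        my $slot = $index{ $k1 }{ $k2 };
        print join( "\t", $k1, $k2,
            vec( $cCounts, $slot, 8 ), vec( $wCounts, $slot, 8 ) ), "\n";
    }
}

# The counter strings cost one byte per slot actually written.
printf "cCounts: %d bytes, wCounts: %d bytes\n",
    length $cCounts, length $wCounts;
```

Reading a slot beyond the end of a string simply yields 0, which is why the two counter strings do not need to be the same length.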
Re^4: Memory usage while tallying instances of lines in a .txt file by RonW (Parson) on Dec 07, 2016 at 22:11 UTC
"It assumes no count will be greater than 256. If that's too small, change the three 8s to 16s"

Actually, the 8s give a max count of 255 (`0b00000000 .. 0b11111111`, where `0b11111111 == 255`). 16s will give you counts up to 65535 (`0b1111111111111111`).
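The limits above can be checked with a short sketch. The helpers `max_count` and `bump` are hypothetical names introduced here, not part of the posted script; `bump` stops incrementing once a field is full, which is one possible way to avoid overflowing a fixed-width counter.

```perl
use strict;
use warnings;

# Largest value an unsigned field of $bits bits can hold.
sub max_count {
    my( $bits ) = @_;
    return 2**$bits - 1;
}

# Hypothetical helper: increment a packed counter, but stop at the
# field's maximum instead of overflowing it.
sub bump {
    my( $bufref, $slot, $bits ) = @_;
    my $cur = vec( $$bufref, $slot, $bits );
    vec( $$bufref, $slot, $bits ) = $cur + 1 if $cur < max_count( $bits );
    return vec( $$bufref, $slot, $bits );
}

my $counts = '';
bump( \$counts, 0, 8 ) for 1 .. 300;

printf "8-bit max: %d, 16-bit max: %d\n", max_count( 8 ), max_count( 16 );
printf "slot 0 after 300 bumps: %d\n", vec( $counts, 0, 8 );
```

With 16-bit fields the same number of slots takes twice as many bytes, which is the "small increase in memory consumption" mentioned earlier in the thread.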
Re^5: Memory usage while tallying instances of lines in a .txt file by BrowserUk (Patriarch) on Dec 08, 2016 at 00:57 UTC
Indeed.

