Beefy Boxes and Bandwidth Generously Provided by pair Networks
XP is just a number
 
PerlMonks  

Re: mem usage

by crag (Novice)
on May 26, 2010 at 22:13 UTC ( #841833=note: print w/ replies, xml ) Need Help??


in reply to mem usage

If you don't mind reading the source file twice and a whole lot of random seeks the second time through, you can generate a list of offsets, shuffle those and then loop through a seek, read, write cycle:

#!/usr/bin/perl use strict; use warnings; use List::Util qw(shuffle); my @offsets; print STDERR "Scanning..."; open(IN, $ARGV[0]); do { push @offsets, tell(IN) } while (<IN>); close(IN); pop @offsets; print STDERR "Done. ($#offsets)\nScrambling..."; @offsets = shuffle(@offsets); print STDERR "Done.\nWriting scrambled..."; open(IN, $ARGV[0]); for (@offsets) { seek(IN, $_, 0); my $line = <IN>; $line .= $/ if $line !~ qr{$/}; print $line; } print STDERR "Done.\n";
This scrambled an 800k line/60M file I had handy in eight seconds with minimal memory usage in process space. I assume the kernel kept the entire file cached in memory.


Comment on Re: mem usage
Download Code

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://841833]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others chilling in the Monastery: (7)
As of 2014-09-19 06:06 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    How do you remember the number of days in each month?











    Results (131 votes), past polls