
Re: Perl's poor disk IO performance

by snoopy (Deacon)
on Apr 29, 2010 at 23:09 UTC


in reply to Perl's poor disk IO performance

Do you really want to skip over records, as in your example above?

Memory mapping can be a better choice if I/O has been identified as a bottleneck and you want 'semi-random' access to your data, i.e. if you can be a bit selective and skip records based on their headers, thereby skipping over significant blocks of data.

For example, the following uses Sys::Mmap:

#!/usr/bin/perl
use common::sense;
use Sys::Mmap;

my $path = '/tmp/stuff';
my $file_size = -s $path;
die "empty or missing file: $path" unless $file_size;

open (my $fh, '+<', $path)
    or die "unable to open $path for read: $!";

mmap(my $data, 0, PROT_READ, MAP_SHARED, $fh)
    or die "mmap: $!";

my $pos = 0;

while ($pos < $file_size) {
    my ($size, $code, $ftype) = unpack("nCC", substr($data, $pos, 4));
    $pos += 4;           # advance past header

    $size = $size - 4;
    if ($size > 0) {
        $pos += $size;   # advance past record
    }
}
If you've identified I/O as a bottleneck, it's worth benchmarking this against your solution above anyway, even if you are reading sequentially. It'll help determine whether read really is imposing a performance penalty!
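
For the comparison itself, the core Benchmark module is handy. Here's a minimal sketch, assuming the mmap loop above and your original read-based loop have each been wrapped in a sub; scan_mmap and scan_read are hypothetical names I've made up for illustration, not anything from the code above:

#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# scan_mmap() and scan_read() are assumed wrappers around the mmap-based
# scan above and the original read-based scan, respectively.
cmpthese(-10, {              # negative count: run each for at least 10 CPU seconds
    mmap => sub { scan_mmap('/tmp/stuff') },
    read => sub { scan_read('/tmp/stuff') },
});

cmpthese prints a table comparing the rates of the two approaches, so you can see at a glance whether mmap actually buys you anything on your data.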

