http://www.perlmonks.org?node_id=692427


in reply to 4k read buffer is too small

You might want to look at your NFS client to see if it can be of any help. Readahead could help here a great deal without changing Perl; look at the rsize NFS option, and any other options you have in your NFS client. You will need to test by running tcpdump or looking at your NFS stats, since Perl will still be doing 4K reads, but the OS will be doing larger reads behind the scenes.

If you're only reading the file from beginning to end, another useful trick is to write a small program to read files in whatever blocksize you need (for example with sysread) and write them to standard output; then you can run that program and pipe its output to your actual program, which can read from the pipe in 4KB blocks without affecting how the NFS server is accessed. If you need to seek around this won't work, but sometimes it can be helpful.
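
As a rough sketch of such a reader (the name blockcat.pl, the 8MB default block size, and the argument handling are my own placeholders, not anything from the original post):

    #!/usr/bin/perl
    # blockcat.pl: read a file in large blocks and copy it to STDOUT,
    # so a downstream program can keep doing its small reads against a
    # pipe instead of hitting NFS directly.
    use strict;
    use warnings;

    my ($file, $blocksize) = @ARGV;
    $blocksize ||= 8 * 1024 * 1024;        # default: 8MB per sysread

    open my $in, '<', $file or die "open $file: $!";
    binmode $in;
    binmode STDOUT;

    my $buf;
    while (1) {
        my $n = sysread $in, $buf, $blocksize;
        die "read $file: $!" unless defined $n;
        last if $n == 0;                   # EOF
        my $off = 0;
        while ($off < $n) {                # syswrite may be partial
            my $w = syswrite STDOUT, $buf, $n - $off, $off;
            die "write: $!" unless defined $w;
            $off += $w;
        }
    }
    close $in;

You would then run something like "perl blockcat.pl bigfile | your_program", and the consumer reads from the pipe in its usual 4KB chunks.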

Replies are listed 'Best First'.
Re^2: 4k read buffer is too small
by voeckler (Sexton) on Jun 17, 2008 at 04:10 UTC
    If you're only reading the file from beginning to end, another useful trick is to write a small program to read files in whatever blocksize you need (for example with sysread) and write them to standard output; then you can run that program and pipe its output to your actual program, which can read from the pipe in 4KB blocks without affecting how the NFS server is accessed. If you need to seek around this won't work, but sometimes it can be helpful.

    Yes, I strongly agree with this trick. My office neighbor suggested the same work-around: we have between 2 and 8 CPUs per node, but the actual computation usually needs only one of them. CPU cycles are cheap!

    As for the NFS client tuning, I will pass the message along, but I suspect the admins have already done quite a bit of tuning. After all, our directory requests are served from a different physical machine than the data blocks. I myself don't have god privileges on any of the machines.

    XXX:/export/samfs-XXX01 /auto/XXX-01 nfs rw,nosuid,noatime,rsize=32768,wsize=32768,timeo=15,retrans=7,tcp,intr,noquota,rsize=32768,wsize=32768,addr=10.125.0.8 0 0

    The readahead sounds intriguing. How would it work, if 200 clients tried to read the same file, though slightly offset in start time? Wouldn't read-ahead aggravate the server load in this case?

      XXX:/export/samfs-XXX01 /auto/XXX-01 nfs rw,nosuid,noatime,rsize=32768,wsize=32768,timeo=15,retrans=7,tcp,intr,noquota,rsize=32768,wsize=32768,addr=10.125.0.8 0 0
      Interesting; with those options the client should be reading in 32KB blocks. strace would still show 4K reads, though, since it only sees Perl's read() calls, and that may be throwing off your analysis. Check with nfsstat or tcpdump whether the transfers on the wire are actually larger than what strace reports. If they are, your sysadmins can try increasing rsize further.

      Also, I seem to recall that you need NFSv3 to read blocks larger than 16K, so if you're not getting the full 32K you are asking for, you might want to look at that.

      The readahead sounds intriguing. How would it work, if 200 clients tried to read the same file, though slightly offset in start time? Wouldn't read-ahead aggravate the server load in this case?
      I'm not familiar with the internals of the Linux NFS code, but generally readahead fills the buffer cache, and subsequent client requests are then served from there. As long as it doesn't run out of memory, it should do the right thing in the scenario you describe.
Re^2: 4k read buffer is too small
by voeckler (Sexton) on Jun 17, 2008 at 20:32 UTC
    ... to write a small program to read files in whatever blocksize you need ...

    It just occurred to me: The small program is called dd:

    dd if=largefile ibs=8M | perl ... | dd of=newfile obs=8M
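
    If the consuming script is under your control, it can also start dd itself and read from the pipe, so no shell wrapper is needed. A rough sketch along those lines (the file name and block size are placeholders, not taken from this thread):

        # let dd do the large reads; Perl then reads from the pipe as usual
        open my $fh, '-|', 'dd if=largefile ibs=8M 2>/dev/null'
            or die "cannot start dd: $!";
        while (my $line = <$fh>) {
            # process each line as before; Perl's small reads now hit
            # the pipe instead of the NFS server
        }
        close $fh or warn "dd exited with status $?";
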
      genius!