PerlMonks  

Re^12: selecting columns from a tab-separated-values file

by ibm1620 (Beadle)
on Jan 24, 2013 at 21:26 UTC (#1015237)


in reply to Re^11: selecting columns from a tab-separated-values file
in thread selecting columns from a tab-separated-values file

I'm on CentOS, a Linux distro, not Windows. I did issue the 'sync' command, but it apparently only commits buffer cache to disk rather than emptying it. I also tried touch'ing the file. In all cases, I'm able to cat my testfile to /dev/null in under one second. So I don't know what's up.


Re^13: selecting columns from a tab-separated-values file
by BrowserUk (Pope) on Jan 24, 2013 at 23:43 UTC
    I'm on CentOS, a Linux distro, not Windows.

    Sorry. I remembered this reference to XP, but misremembered who made it.

    I did issue the 'sync' command, but it apparently only commits buffer cache to disk rather than emptying it.

    A crude but usually effective way of flushing one file from the cache is to cat a file that is bigger than the cache. Say, copy/append your 80GB datafile to another file 5 times (=400GB), and then cat that to /dev/null before running your tests. Might work for you.
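    To check whether a flush like this had any effect, the read can be timed before and after it. The following is only a rough sketch of a timed equivalent of `cat file > /dev/null`; the function names and the 1 MiB chunk size are my own choices for illustration, not anything from this thread.

```python
import time


def time_sequential_read(path, chunk_size=1024 * 1024):
    """Read the whole file and discard the data.

    Returns (bytes_read, elapsed_seconds). The chunk size is an
    arbitrary illustrative default, not a tuned value.
    """
    total = 0
    start = time.perf_counter()
    # buffering=0 gives a raw file object, so Python adds no buffer of its own
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty bytes object means end of file
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    return total, elapsed


def throughput_mb_per_s(bytes_read, seconds):
    """Throughput in MB/s (1 MB = 10**6 bytes); inf if no time elapsed."""
    if seconds <= 0:
        return float("inf")
    return bytes_read / seconds / 1e6
```

    A figure far above what the drive can physically deliver suggests the data came from the cache rather than from disk.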

    I'm able to cat my testfile to /dev/null in under one second.

    Assuming this is your 10e6 record testfile, and it is representative of your 80GB file and has an average of 86 characters/line, that gives a filesize of ~820MB.

    The very best sequential-read throughput figure I can find for a non-RAID 15k rpm local drive is a little over 100MB/s.

    That pretty much confirms that your testing is reading from cache rather than from disk. Even the most optimistic read-ahead algorithm cannot drive the interface 8 times faster than its maximum throughput.
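    The arithmetic behind that conclusion can be spelled out as a short sketch. The inputs are the figures quoted above (10e6 records, 86 bytes per line, a read time of at most one second, roughly 100MB/s for the drive); the function names are hypothetical, and the one-second figure is an upper bound, so the real ratio is at least this large.

```python
def file_size_bytes(records, avg_line_bytes):
    """Estimated file size from record count and average line length."""
    return records * avg_line_bytes


def implied_throughput(size_bytes, seconds):
    """Bytes per second needed to read size_bytes in the given time."""
    return size_bytes / seconds


records = 10_000_000          # 10e6 records, as assumed in the post
avg_line_bytes = 86           # average characters per line, from the post
read_seconds = 1.0            # "under one second", so an upper bound
disk_bytes_per_s = 100e6      # ~100MB/s best-case sequential read

size = file_size_bytes(records, avg_line_bytes)
size_mib = size / 2**20       # the ~820MB figure is in binary megabytes
needed = implied_throughput(size, read_seconds)
ratio = needed / disk_bytes_per_s

print(f"file size: {size} bytes (~{size_mib:.0f} MiB)")
print(f"implied throughput: {needed / 1e6:.0f} MB/s")
print(f"ratio to best-case disk throughput: {ratio:.1f}x")
```

    Whether the size is counted in decimal or binary megabytes, the ratio comes out at a bit over 8, which matches the "8 times" quoted above.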


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
