PerlMonks
Re^9: Can you improve upon my algorithm.

by salva (Canon)
on Mar 13, 2015 at 23:28 UTC ( [id://1120009] )


in reply to Re^8: Can you improve upon my algorithm.
in thread Can you improve upon my algorithm.

Here we go!

n-way-merge.c

On my old home computer it merges at around 3GB/min, 100GB in half an hour.

As it is just a proof of concept, it can only handle records made up of integers of type int64_t.

The meanings of the constants are as follows:

    N = the maximum number of files that can be merged
    CHUNKSIZE = the size of the per-file memory buffer
    KEYSIZE = the number of int64_t values per record

The source files are passed as arguments on the command line, and the sorted output is sent to STDOUT. Only tested on 64-bit Ubuntu.
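
The real thing is in the attachment above; the following is just a minimal sketch of the shape of the technique -- a binary min-heap kept over the head record of each per-file read buffer. The constants N, CHUNKSIZE and KEYSIZE follow the description above, while the rest of the names and the buffer handling are illustrative assumptions rather than the actual n-way-merge.c:

/* Illustrative sketch only: a binary min-heap over per-file read buffers.
 * Records are KEYSIZE int64_t values; each source file gets a CHUNKSIZE-
 * byte buffer that is refilled as it drains. */
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

#define N          64                         /* max number of input files */
#define KEYSIZE    1                          /* int64_t values per record */
#define RECBYTES   (KEYSIZE * sizeof(int64_t))
#define CHUNKSIZE  (1 << 20)                  /* per-file buffer, in bytes */
#define CHUNKRECS  (CHUNKSIZE / RECBYTES)

typedef struct {
    FILE    *fh;
    int64_t *buf;
    size_t   pos, count;          /* next record index, records in buffer */
} source;

static int refill(source *s) {    /* returns 0 when the file is exhausted */
    s->count = fread(s->buf, RECBYTES, CHUNKRECS, s->fh);
    s->pos = 0;
    return s->count > 0;
}

static int cmp(const source *a, const source *b) {
    const int64_t *ka = a->buf + a->pos * KEYSIZE;
    const int64_t *kb = b->buf + b->pos * KEYSIZE;
    for (int i = 0; i < KEYSIZE; i++)
        if (ka[i] != kb[i]) return ka[i] < kb[i] ? -1 : 1;
    return 0;
}

static void sift_down(source **heap, int n, int i) {
    for (;;) {                    /* restore the min-heap property below i */
        int l = 2 * i + 1, smallest = i;
        if (l < n     && cmp(heap[l],     heap[smallest]) < 0) smallest = l;
        if (l + 1 < n && cmp(heap[l + 1], heap[smallest]) < 0) smallest = l + 1;
        if (smallest == i) return;
        source *t = heap[i]; heap[i] = heap[smallest]; heap[smallest] = t;
        i = smallest;
    }
}

int main(int argc, char *argv[]) {
    static source src[N];
    source *heap[N];
    int n = 0;

    for (int i = 1; i < argc && n < N; i++) {
        src[n].fh  = fopen(argv[i], "rb");
        src[n].buf = malloc(CHUNKRECS * RECBYTES);
        if (!src[n].fh || !src[n].buf) { perror(argv[i]); return 1; }
        if (refill(&src[n])) { heap[n] = &src[n]; n++; }
    }

    for (int i = n / 2 - 1; i >= 0; i--)      /* heapify the initial heads */
        sift_down(heap, n, i);

    while (n > 0) {                       /* emit smallest, advance, sift  */
        source *top = heap[0];
        fwrite(top->buf + top->pos * KEYSIZE, RECBYTES, 1, stdout);
        if (++top->pos == top->count && !refill(top))
            heap[0] = heap[--n];          /* source exhausted: shrink heap */
        sift_down(heap, n, 0);
    }
    return 0;
}

The hot path is the final while loop: one write, one buffer-position bump, and a sift-down of at most log2(N) levels per output record.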

Update: It is difficult to count the number of instructions inside the loop on the hot path. It is certainly below 100; I don't know if I have managed to reach the 50 I predicted... I blame that on gcc -O3 unrolling small loops such as the key comparison.

Replies are listed 'Best First'.
Re^10: Can you improve upon my algorithm.
by BrowserUk (Patriarch) on Mar 14, 2015 at 01:09 UTC

    Case and analysis proven.

    Setting KEYSIZE = 1, compiling with /Ox, and running it against my 50x1GB files, it sustains a pretty steady 60MB in / 60MB out per second at a CPU utilisation of ~22-23%; given that my wc -c baseline test of 171MB/s is unidirectional, that probably means it is IO-bound rather than CPU-bound -- testimony to a well-implemented heap.

    Total runtime for 50GB was a gnat's over 14 minutes, which comes out at 61MB/s each way. Very impressive.

    Just makes you wonder about gnu/ms sort!

    I'll try and add the sort to split files into it tomorrow.


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority". I'm with torvalds on this
    In the absence of evidence, opinion is indistinguishable from prejudice. Agile (and TDD) debunked
      Case and analysis proven

      Actually, there was an error in the analysis. I predicted the in-memory merge to be RAM-bound because of the log2(N) comparisons per record all going to RAM to fetch the keys. That is wrong: the records at the top of the buffers stay in the L1 cache. (With 50 input files and KEYSIZE = 1, those 50 head records occupy just 400 bytes.) Only the replacement for the one at the top of the heap has to be loaded from RAM. So it is actually CPU-bound.

      I'll try and add the sort to split files into it tomorrow.

      I used my own Sort::Packed for that.

        I'm still going to finish my file-mapped attempt.

        For a 100GB file, using the traditional method, you need to read the original file; write the spill files; read the spill files; write the final file. That's 400GB of reads and writes at 61MB/s, which is a bit under 2 hours (1.865) of IO.

        If I can do it "in-place", half of that (56 minutes) is up for grabs.

        Given that I can swap the two halves of a 4GB buffer in RAM in 1.177196260s, that gives the potential to move around 12TB of data in the same time as it takes just to write and re-read the spill files.
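
        For reference, a rough POSIX sketch of how that kind of half-buffer swap could be timed; the 1MB block size and the memcpy-based three-way swap are just for illustration, not the code I actually used:

        /* Illustrative only: time a block-wise swap of the two halves of a 4GB
         * buffer, using a 1MB bounce buffer.  The pages are touched first so
         * the timing measures RAM traffic rather than page faults. */
        #include <stdio.h>
        #include <stdlib.h>
        #include <string.h>
        #include <time.h>

        #define BUFBYTES (4ULL << 30)           /* the 4GB buffer           */
        #define BLOCK    (1ULL << 20)           /* swap in 1MB blocks       */

        int main(void) {
            unsigned char *buf = malloc(BUFBYTES);
            unsigned char *tmp = malloc(BLOCK);
            if (!buf || !tmp) { perror("malloc"); return 1; }
            memset(buf, 1, BUFBYTES);           /* fault all the pages in   */

            struct timespec t0, t1;
            clock_gettime(CLOCK_MONOTONIC, &t0);

            unsigned char *lo = buf, *hi = buf + BUFBYTES / 2;
            for (size_t off = 0; off < BUFBYTES / 2; off += BLOCK) {
                memcpy(tmp,      lo + off, BLOCK);  /* three-way block swap */
                memcpy(lo + off, hi + off, BLOCK);
                memcpy(hi + off, tmp,      BLOCK);
            }

            clock_gettime(CLOCK_MONOTONIC, &t1);
            printf("swapped the two halves of a 4GB buffer in %.3fs\n",
                   (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9);

            free(buf); free(tmp);
            return 0;
        }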

        Of course, that's optimistic, because mem-mapped data gets written back to disk at some point; but given the 3 layers of cache between RAM and disk, the right choice of algorithm(s), and the 3072-times headroom, there ought to be the possibility of saving a substantial chunk of time.

        If you view the mem-mapped file, with its set of sorted sub-partitions, as the spill files, you only need access to the top of each partition if you're doing an N-way merge. So if you have a minimal mapped view (say 64KB) at the top of each of 50 2GB sorted sub-partitions, that's a bit over 3MB, leaving the rest of memory for use as the 'output file'. It does mean writing the accumulated output to a separate file (in chunks), but maybe that's okay.
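
        A rough sketch of that windowing idea -- map only 64KB at the current read position of each of the 50 2GB sub-partitions and slide the window forward as it is consumed. The names, the 100GB single-file layout and the slide policy are just for illustration, not an actual implementation:

        /* Illustrative only: map a 64KB read-only window at the current
         * position of each sorted 2GB sub-partition of one big file, and
         * slide each window forward with a fresh mmap() as it is consumed.
         * Offsets advance in 64KB steps, so they stay page-aligned. */
        #include <stdio.h>
        #include <stdint.h>
        #include <fcntl.h>
        #include <unistd.h>
        #include <sys/mman.h>

        #define NPART    50                    /* sorted sub-partitions     */
        #define PARTSIZE (2ULL << 30)          /* 2GB per sub-partition     */
        #define WINSIZE  (64ULL << 10)         /* 64KB view per partition   */

        typedef struct {
            off_t  base;                /* partition start within the file  */
            off_t  off;                 /* bytes of this partition consumed */
            char  *win;                 /* current 64KB view, NULL if done  */
        } window;

        /* (re)map the 64KB view covering the partition's current position */
        static char *slide(int fd, window *w) {
            if (w->win) munmap(w->win, WINSIZE);
            w->win = NULL;
            if ((uint64_t)w->off >= PARTSIZE) return NULL;  /* exhausted   */
            char *p = mmap(NULL, WINSIZE, PROT_READ, MAP_PRIVATE,
                           fd, w->base + w->off);
            if (p != MAP_FAILED) w->win = p;
            return w->win;
        }

        int main(int argc, char *argv[]) {
            if (argc < 2) { fprintf(stderr, "usage: %s bigfile\n", argv[0]); return 1; }
            int fd = open(argv[1], O_RDONLY);
            if (fd < 0) { perror("open"); return 1; }

            static window part[NPART];
            for (int i = 0; i < NPART; i++) {
                part[i].base = (off_t)i * PARTSIZE;
                part[i].off  = 0;
                part[i].win  = NULL;
                slide(fd, &part[i]);
            }

            /* an N-way merge would read records from part[i].win, advance
             * part[i].off in WINSIZE steps and call slide() as each window
             * is used up: ~50 x 64KB = 3.2MB mapped instead of the whole
             * 100GB, leaving the rest of memory for the output */

            for (int i = 0; i < NPART; i++)
                if (part[i].win) munmap(part[i].win, WINSIZE);
            close(fd);
            return 0;
        }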

        Another possibility is to move the first 2GB buffer into unmapped RAM, run the N-way merge, writing the output over the freed-up mem-mapped first buffer. Once that is full, move (the remainder of) the second buffer into unmapped RAM and repeat until done.

        Now take that a step further: instead of copying the whole of the first 2GB buffer into unmapped RAM, do it in 64KB or 1MB chunks as needed to free space for the N-way merge to write to.

        Just a bunch of random thoughts at the moment...


        I used my own Sort::Packed for that.

        I'm splitting the mapped file into buffers such that I can perform the sub-partition sorts on 4 threads in parallel, which divides that part of the time by 4 -- at the cost of increased cache misses. (I haven't got a handle on that last part yet.)
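
        A rough C sketch of the shape of that step: 4 threads, each sorting its own disjoint slice of int64_t records, so no locking is needed. The thread count, the demo data and the use of qsort() are stand-ins for illustration, not the actual implementation:

        /* Illustrative only: sort NTHREADS disjoint slices of an int64_t
         * array in parallel, producing NTHREADS sorted runs for a merge. */
        #include <stdio.h>
        #include <stdlib.h>
        #include <stdint.h>
        #include <pthread.h>

        #define NTHREADS 4

        typedef struct { int64_t *base; size_t nrec; } slice;

        static int cmp64(const void *a, const void *b) {
            int64_t x = *(const int64_t *)a, y = *(const int64_t *)b;
            return (x > y) - (x < y);
        }

        static void *sort_slice(void *arg) {    /* each thread sorts its slice */
            slice *s = arg;
            qsort(s->base, s->nrec, sizeof(int64_t), cmp64);
            return NULL;
        }

        int main(void) {
            size_t nrec = (size_t)1 << 24;      /* 16M records for the demo   */
            int64_t *data = malloc(nrec * sizeof(int64_t));
            if (!data) { perror("malloc"); return 1; }
            for (size_t i = 0; i < nrec; i++)   /* fill with pseudo-random    */
                data[i] = ((int64_t)rand() << 32) ^ rand();

            pthread_t tid[NTHREADS];
            slice     part[NTHREADS];
            size_t    per = nrec / NTHREADS;

            for (int t = 0; t < NTHREADS; t++) {    /* one disjoint slice each */
                part[t].base = data + (size_t)t * per;
                part[t].nrec = (t == NTHREADS - 1) ? nrec - (size_t)t * per : per;
                pthread_create(&tid[t], NULL, sort_slice, &part[t]);
            }
            for (int t = 0; t < NTHREADS; t++)
                pthread_join(tid[t], NULL);

            /* data now holds NTHREADS sorted runs, ready for the N-way merge */
            printf("sorted %zu records as %d runs\n", nrec, NTHREADS);
            free(data);
            return 0;
        }

        Because the slices are disjoint, the threads never need to lock; they do, however, compete for memory bandwidth and cache, which is presumably where the increased cache misses come from.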

        I don't think there is any mileage in parallelising the N-way merge?

        There are some possibilities for parallelising parts of the in-place merge (via insertion sort).

        By using a smallish maximum heap as an "input buffer" for each of the partitions, records displaced from earlier partitions by insertions can be rippled into place in reverse position order (in large part), which should reduce the number & depth of ripples involved.

        So many possibilities. All I gotta do is wrap my brain around programming them :)


