in reply to Re^2: [OT] A measure of 'sortedness'?
in thread [OT] A measure of 'sortedness'?

The more I've thought about this, the more I doubt the first pass is really going to buy much at all that'll make your merge more efficient.

In the long run, you still have to do the same comparisons and the same moves. In terms of performance, it doesn't matter much which loop they're in. My original thought above, if it's feasible at all, only really adds overhead in almost every case.

The only little chance of saving anything, I think, is if you have a chunk of data at the beginning or end of the combined buffer that doesn't have to move. You might be able to identify that on your first past and shorten your second pass to the data in between.

That's all I got.

-sauoq
"My two cents aren't worth a dime.";

Replies are listed 'Best First'.
Re^4: [OT] A measure of 'sortedness'?
by BrowserUk (Pope) on Mar 19, 2015 at 21:18 UTC
    The only little chance of saving anything, I think, is if you have a chunk of data at the beginning or end of the combined buffer that doesn't have to move.

    That's exactly what the pre-scan is doing. "Topping and tailing" the two buffers for bits that either don't need to move, or only need to be block-moved rather than merged:

    |---| ---| | ___ --- | ___--- --- | ___--- --- |__--- ---| | --- | |

    Don't read too much into the ascii art, but there are obviously two chunks, the left and right ends of the left buffer, that can be either left where they are or just block-moved into place, respectively.

    There are 5 other possibilities:

    1. 1 requires no further processing: all of the left buffer is less than all of the right buffer.

      (Full scan required.)

    2. 1 requires a single block swap; all of the left buffer is greater than all of the right buffer.

      (Full scan required.

    3. 4 (including the above) that either leave in place or block-move two smaller chunks and need to recurse on the other two.

      (Partial scan if left in place or full if moved.)

    What I was looking for was something discoverable or doable during that first scan, that abetted the next stage.


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority". I'm with torvalds on this
    In the absence of evidence, opinion is indistinguishable from prejudice. Agile (and TDD) debunked
Re^4: [OT] A measure of 'sortedness'?
by marinersk (Priest) on Mar 19, 2015 at 19:04 UTC

    I'm coming to the same conclusion. Inevitably, all my creativity winds up solving the "how to conserve space" and to a lesser degree the "how to minimize data moves" issues -- and none of my thinking in those areas is yielding a first-pass-scan benefit.

    This is maddeningly entertaining because I can't count the number of times in my career I could optimize the inner algorithm if only I had the privilege of a pre-processing linear scan, which was too costly to be worth implementing.

    Here, I have the exact opposite problem -- I am going to get a pre-processing linear scan, and I can't make use of it to optimize the inner process.

    Somewhere, some deity of software development is having a huge guffaw at my expense.

    :-)