http://www.perlmonks.org?node_id=1131284



Hi BrowserUK, I am afraid that the divergence between the downstream results of my work and prior published results goes into hardcore genomics. I am not willing to discuss the details on a public forum, especially because my work is unpublished, and I have no idea who is reading and responding to my posts, or whether they might be competitors.

This is just a compulsion of the publish-or-perish paradigm in science today! Please do not take it personally. :)

But the issue I am posting about here in this thread is related to our previous discussion thread on shuffling DNA and using that to estimate False Discovery Rates of a certain type of "feature" in the genome.

Re^3: Random shuffling
by BrowserUk (Patriarch) on Jun 20, 2015 at 19:39 UTC
    I am not willing to discuss the details on a public forum, especially because my work is unpublished

    Understood.

    I probably wouldn't understand the hardcore genomics anyway :)

    But ... could you isolate the differences in non-genomic terms? E.g., are the results more random, less random, or way too random?

    Or if you want to discuss it off-line ...

    I am posting about here in this thread is related to our previous discussion thread on shuffling DNA and using that to estimate False Discovery Rates of a certain type of "feature" in the genome

    I thought the processing was familiar :)

    One question. Why are you shuffling the data 10 times? Is this your innovation; or part of some procedure described somewhere that you are following?

    The reason I ask is: it shouldn't be necessary. Actually, I'll re-state that. It isn't necessary!

    The whole point of the Fisher-Yates shuffle is that every possible reordering of the data can occur, and they all have equal probability.

    Thus, by shuffling multiple times, all you are doing is choosing a different mix. Not a better one; nor a more random one; just a different one.

    And that could equally well be achieved by seeding the PRNG differently: srand( time * rand() ) or similar.

    But as all outcomes are possible from every shuffle -- including the first one -- doing multiple shuffles is just wasting time; nothing more.
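
    To make that concrete, here is a minimal sketch, not the code from the original post: the sub names fisher_yates_shuffle and tally_orderings, the 3-letter test data and the trial count are all made up for illustration. It tallies how often each ordering of a 3-element list turns up after 1 pass and after 10 passes of the shuffle.

```perl
use strict;
use warnings;

# In-place Fisher-Yates shuffle of the array referenced by $aref.
# (Hypothetical helper name, written for this illustration.)
sub fisher_yates_shuffle {
    my ($aref) = @_;
    for ( my $i = $#{$aref}; $i > 0; $i-- ) {
        my $j = int rand( $i + 1 );                  # 0 <= $j <= $i
        @{$aref}[ $i, $j ] = @{$aref}[ $j, $i ];     # swap elements $i and $j
    }
    return $aref;
}

# Tally how often each ordering of a 3-element list appears
# when the list is shuffled $passes times in a row, over $trials runs.
sub tally_orderings {
    my ( $passes, $trials ) = @_;
    my %count;
    for ( 1 .. $trials ) {
        my @data = qw( A C G );                      # toy data, assumed
        fisher_yates_shuffle( \@data ) for 1 .. $passes;
        $count{ join '', @data }++;
    }
    return \%count;
}

# Compare a single shuffle against ten consecutive shuffles.
for my $passes ( 1, 10 ) {
    my $count = tally_orderings( $passes, 60_000 );
    print "$passes pass(es): ",
        join( ' ', map { "$_=$count->{$_}" } sort keys %$count ), "\n";
}
```

    With 60,000 trials and 6 possible orderings, each tally should land near 10,000 in both rows; the 10-pass row costs ten times the work and should look no flatter. Calling srand with a fixed seed before the loop makes the tallies repeatable, and in real code shuffle from the core List::Util module does the same job.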



      Hi BrowserUK, please see my UPDATE 1 to the original post. Perhaps your answer will change or need modification in light of the additional info I have provided?