PerlMonks  

Re^2: Node 600_000 - when will it appear?

by castaway (Parson)
on Feb 09, 2007 at 14:31 UTC (#599215)


in reply to Re: Node 600_000 - when will it appear?
in thread Node 600_000 - when will it appear?

That's a good question. But since I started quite late, how about we come up with an algorithm that relates the closeness of a guess to the actual result, taking into account when the guess was made?

So, another challenge: given the date of the actual 600k node, the date guessed, and the date the guess was made, produce a fair result... somehow ;)

C.


Re^3: Node 600_000 - when will it appear?
by ambrus (Abbot) on Feb 09, 2007 at 15:25 UTC

    Good idea. I'll try to think about the formula, but maybe someone more knowledgeable about mathematical statistics can tell it right away.

    Guessing should still close at some point, because a very short time before node 600_000 appears the model used for the formula becomes unusable: the guesser can influence the date (by posting the 600_000th node himself). And, obviously, there's no point in guessing after the 600_000th node has come out. (Update. Let me clarify that you needn't pick a given date at which to stop accepting guesses; it could instead be when a given node id is created.)

    Update. Let me try to build a model.

    Suppose that the nodes on perlmonks get created periodically with exactly tau time between them, where tau is a statistical parameter. Now if the current node is 600_000 - k, then the actual time of node 600_000 is now + k * tau. If your estimate of tau is tau_hat, then you will guess now + k * tau_hat. I think we can suppose that you do not get any new information on the parameter tau as time progresses, so the error of your guess, k * (tau_hat - tau), will be linearly proportional to k, or (equivalently) to the time between your guess and the creation time of node 600_000. Thus, our formula should simply divide the error of the guess by either k (which is 600_000 minus the id of the node posted at the same time as the guess was made), or the difference between the time of the guess and the time of 600_000.
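
    The formula above could be sketched roughly like this. It is only an illustration in Python: the function names and the choice of seconds as the time unit are my own assumptions, not something agreed on in this thread. A lower score means a better guess.

```python
from datetime import datetime


def score_by_time(actual, guessed, guessed_on):
    """Absolute error of the guess divided by how far ahead it was made.

    actual     -- creation time of node 600_000
    guessed    -- the time the guesser predicted
    guessed_on -- the time at which the guess was posted
    """
    lead = (actual - guessed_on).total_seconds()
    if lead <= 0:
        # Guessing at or after the creation of node 600_000 makes no sense.
        raise ValueError("guess must be made before node 600_000 appears")
    return abs((guessed - actual).total_seconds()) / lead


def score_by_nodes(actual, guessed, k):
    """Absolute error in seconds divided by k, the number of nodes
    still missing when the guess was made (600_000 minus the current id)."""
    if k <= 0:
        raise ValueError("k must be positive")
    return abs((guessed - actual).total_seconds()) / k


# Made-up dates, purely for illustration.
actual = datetime(2007, 2, 20, 12, 0)
early = score_by_time(actual, datetime(2007, 2, 22, 12, 0), datetime(2007, 2, 10, 12, 0))
late = score_by_time(actual, datetime(2007, 2, 21, 12, 0), datetime(2007, 2, 18, 12, 0))
print(early, late)
```

    With these invented dates the early guess is off by two days but was made ten days ahead, while the late guess is off by one day but was made only two days ahead, so the early guess gets the better (smaller) score.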

    The problem with this argument is that as we get closer to node 600_000, our model gets more and more inaccurate, and in reality, even knowing the parameters of the distribution of posting times on perlmonks, you can't know the date of node 600_000 with certainty. At that point, our problem becomes a complicated one that is both statistical and probability-theoretical in nature, and I've no idea how to solve it. (At that point, you would probably have to actually make a guess on tau, which we did not need above.)

    So if anyone has a better solution, just tell.
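
    To get a feeling for how the simple model breaks down near the end, here is a small simulation sketch. It assumes that the gaps between nodes are independent and exponentially distributed with mean tau. That is purely my assumption for the sake of illustration, not a claim about how posting on perlmonks really behaves, and the function name and parameters are made up. The guesser here knows tau exactly, so all of the remaining error comes from randomness.

```python
import random


def mean_error_per_node(k, tau=1.0, trials=2000, seed=0):
    """Average of |actual - guess| / k over many simulated runs.

    Assumes (hypothetically) exponential gaps with mean tau between nodes.
    The guess is k * tau, i.e. the guesser knows tau perfectly.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # Time until k more nodes have been posted.
        actual = sum(rng.expovariate(1.0 / tau) for _ in range(k))
        total += abs(actual - k * tau) / k
    return total / trials


for k in (4, 25, 100):
    print(k, mean_error_per_node(k))
```

    Under this assumption the error divided by k is not constant: it gets larger as k gets smaller, so dividing by k alone would be too harsh on guesses made shortly before node 600_000.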
