http://www.perlmonks.org?node_id=493924


in reply to Spooky math problem

The trick here – and there is a trick – is a subtle use of equivocation: You make a claim about one problem but then explain the claim in terms of another problem that is subtly yet significantly different.

In the first problem, each envelope can contain any number. In the second problem, however, you require that the distribution of numbers have a "continuous probability distribution with non-zero density everywhere." These problems are not the same.

To see why, imagine an infinite number line representing all numbers. If you pick any segment on this line, say that between 0 and 1, the portion of the line that the segment represents is zero. Hence the probability density function over the segment is also zero, which contradicts the assumption made in your analysis.

If we repeat your analysis, this time using the distribution of numbers for your originally stated problem, we find that the probability of picking a number in between your two numbers is zero, and thus knowing the first number provides no benefit. Our intuition turns out to be correct.

Replies are listed 'Best First'.
Re^2: Spooky math problem
by tilly (Archbishop) on Sep 22, 2005 at 15:30 UTC
    There is no trick.

    In the problem, each envelope can contain any number. The requirement of a distribution of numbers that has a continuous probability distribution with non-zero density everywhere is on the one you create for your algorithm. That number is made up out of thin air and has no connection to either of the numbers in the envelopes. The trick is that it gives us a chance of telling the two numbers apart.

    The probability density function that you are trying to describe, "uniform over the real numbers", is not a valid probability density function. It is a classic result both in real analysis and probability that it can't be. (The real analysis statement is that no such measure exists, the probability theory statement is that no such probability distribution exists. Those statements are the same.)

    This is why you have to be very careful in the wording to even get a well-defined problem. Given any two numbers and the algorithm, there is a well-defined probability that you're right, and that probability is over 0.5. Prior to the numbers and algorithm, the probability of your being right is undefined and undefinable.

    Again I'd like to point people to Bently Preece's excellent explanation. The goal of the question is to come up with a single strategy that will give better than even odds for every game in a particular class of games. And the described strategy does that.

    Now I haven't explained why this particular probability distribution can't exist. The technical reason is that the axioms of probability require the probability of a countable disjoint union to be the sum of the probabilities, and that the probability of the whole set is 1. But the real line is a countable disjoint union of intervals, each of which has probability 0 (for the reason that you explained), so the probability of the whole set is a countable sum of 0's, which is 0. Contradicting the requirement that it be 1.

    Now it is easy to object that one could just define probabilities differently. And it is true, one could. But any way you define it, you'll have a lot of subtleties around infinity, and attempting to think about a uniform probability distribution on the real numbers will be one place that you'll encounter lots of them.

    BTW in the USA this fact is generally covered in a real analysis class in advanced undergraduate or beginning graduate school mathematics.

      There is no trick.

      Yes, there is – even though it is probably not intentional – and it is equivocation. First, you say:

      In the problem, each envelope can contain any number.
      Here, you present "any number" as meaning "any number, with absolutely no restrictions." Later, however, you do place restrictions on the numbers by requiring that it be possible for a guessed third number to fall between two such "any numbers" with a probability of greater than zero:
      Given any two numbers and the algorithm, there is a well-defined probability that you're right, and that probability is over 0.5.
      In other words, you subtly (and perhaps unknowingly) redefined "any number" to effectively mean "any number within a finite range."
      This is why you have to be very careful in the wording to even get a well-defined problem.... Prior to the numbers and algorithm, the probability of your being right is undefined and undefinable.
      Precisely. Prior to the numbers and the algorithm, the probability of your being right is undefinable. How, then, did you arrive at a concrete statement about that probability? You redefined the problem to make it possible. You did it while explaining the "numbers and the algorithm," which made it harder to see, but you did do it.

      I'll say it again: The problem you originally presented and the problem you ultimately analyzed are not the same. The original problem's numbers were free of restrictions, but the analyzed problem's numbers were not. Two different problems.

      Cheers,
      Tom

        There is no trick.

        Yes, there is – even though it is probably not intentional – and it is equivocation. First, you say:

        In the problem, each envelope can contain any number.

        Here, you present "any number" as meaning "any number, with absolutely no restrictions." Later, however, you do place restrictions on the numbers by requiring that it be possible for a guessed third number to fall between two such "any numbers" with a probability of greater than zero:

        That is not a restriction. That is a property of the two numbers both being real. Real numbers are always finite. Given 2 finite numbers, there is an interval between them. A continuous distribution with non-zero probability density everywhere has a positive probability of falling into that interval.

        However you're partially right. I did not state that the numbers had to be real. I don't think that there is any confusion on that point though.

        Given any two numbers and the algorithm, there is a well-defined probability that you're right, and that probability is over 0.5.

        In other words, you subtly (and perhaps unknowingly) redefined "any number" to effectively mean "any number within a finite range."

        No. I did not.

        In this case "any number" means "any real number". I have not put any bounds on how large those numbers may be. But whatever they are, any real number will be finite. That's part of being a real number.

        This is why you have to be very careful in the wording to even get a well-defined problem.... Prior to the numbers and algorithm, the probability of your being right is undefined and undefinable.

        Precisely. Prior to the numbers and the algorithm, the probability of your being right is undefinable. How, then, did you arrive at a concrete statement about that probability? You redefined the problem to make it possible. You did it while explaining the "numbers and the algorithm," which made it harder to see, but you did do it.

        Let's be careful here.

        If I ask what the probability is of your being right without knowing the numbers and algorithm, I can't get an answer. The exact answer is undefined because it depends on the numbers and algorithm.

        If I ask what the probability is of your being right while knowing the numbers and algorithm, I can get an answer. That problem is always well-defined.

        But not knowing the answer is not the same as being unable to prove anything about it. Even though I don't at the moment have the numbers, given the algorithm, I still can prove something about what the probability must be. In this case I can prove that the answer must be greater than 50%. (I can also prove that the answer must be less than 100%.)

        This is like being able to prove that (x+y)+z is x+(y+z). Even though I don't know what x, y, z, or their sum is, I can still prove that the sum adds up the same both ways. (Incidentally, that particular proof takes a surprising amount of work.)

        I'll say it again: The problem you originally presented and the problem you ultimately analyzed are not the same. The original problem's numbers were free of restrictions, but the analyzed problem's numbers were not. Two different problems.
        They were the same problem. Even though the set of real numbers is unbounded, every real number is finite, and every pair of distinct real numbers has a finite interval between them. A continuous probability distribution with non-zero density everywhere must have a positive probability of landing on that interval.

        To object that I'm wrong because I'm limiting my numbers to be finite is like objecting that the rules of algebra are wrong because they only work for finite numbers. Real numbers are finite. It is only the set of real numbers that is infinite.