Re: Bayesian boxes and Independence of Scales

From: <GSLevy.domain.name.hidden>
Date: Mon, 10 May 1999 14:41:41 EDT

I agree that the implicit distribution of m is essential in determining
whether we should switch after making our first choice. Interestingly, if we
assume a logarithmic (scale-invariant) distribution, then on finding x in the
selected box the other box is equally likely to hold x/2 or 2x, and the
expected value of the logarithm of the other box's content is identical to
the logarithm of the content of the selected box:

Expected Value = (1/2) (Log(x/2) + Log(2x))
               = (1/2) (Log(x) - Log(2) + Log(x) + Log(2))
               = Log(x)

This suggests that the logarithmic distribution is the one assumed by common
sense: don't switch. In other words, under that prior every scale for m is
equally likely - the distribution is INDEPENDENT OF SCALE. This independence
of scale has some intriguing connection with the MWI and with the issue that
the probability of survival after death is extremely low but FROM THE POINT
OF VIEW of the observer it must be equal to one.
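
For anyone who wants to see it with numbers, here is a minimal check of the
identity above (plain Python; x stands for the amount found in the selected
box, and the sample values are arbitrary):

    import math

    # If the selected box holds x and the other box is assumed to hold x/2
    # or 2x with probability 1/2 each, the expected *logarithm* of the other
    # box's content equals log(x) - so, in log terms, switching buys nothing.
    for x in (0.01, 1.0, 7.3, 1234.5):
        expected_log_other = 0.5 * (math.log(x / 2) + math.log(2 * x))
        assert abs(expected_log_other - math.log(x)) < 1e-12
        print(f"x = {x:10.4f}   E[log(other)] = {expected_log_other:.6f}")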


In a message dated 99-05-05 12:31:25 EDT, you write:

<<
 Stefan Rueger <smr3.domain.name.hidden> wrote:
 
> Iain,
>
> Let box one contain m pounds and box two 2m pounds, where m is
> any real positive number. You may choose a box, look at the
> amount of money inside and decide either to keep this amount or
> the other box's amount. (For this exercise you may think of
> cheques with real numbers on them and a special bank account where
> you can actually bank any real number of pounds.)
>
> In Peter's inaugural, he suggested that if you pick a box at
> random and find x pounds, you might conclude that the other box
> either contains x/2 or 2x pounds, the average of which is 1.25x.
> So you are better off (on average by 25%!) choosing the other
> box. He left it to the audience to decide whether this conclusion
> is valid or not.
>
> It was immediately clear to me - and I am sure, it is immediately
> clear to you - that the argument is flawed.
 
 Errrm... do you mean the argument exactly as given above (with the
 conclusion that choosing randomly and then always switching - which is
 of course just a different style of "choosing randomly"! - is 1.25 times
 better than not switching) is flawed? If so, then you and I certainly
 agree it's flawed, and probably Pete agrees too! (Though he'll have to
 speak for himself of course.) Or do you mean the "meta-argument" (which
 correctly points out the absurdity of one random 50-50 choice generation
 protocol being 1.25 times better than another, but goes on to conclude
 sweepingly that *no* choice protocol, random or non-random, can be
 better than any other for this problem) is flawed? If this latter, then
 I agree with you that the "meta-argument" is flawed - though maybe for
 different reasons than yours! - and I think Pete possibly *disagrees*
 and takes the meta-argument to be just fine, sweeping conclusion and
 all! (Again, of course, he'll have to speak for himself... I really
 shouldn't play guessing games with other people's opinions like this. I
 wasn't even at his inaugural talk after all!!)
 
>
> We both know that
> choosing a box at random (eg, using a fair coin) and keeping the
> contents gives you 1.5m pounds on average. The same is true by
> symmetry if you choose a box randomly and keep the contents of
> the **other** box: the two averages must add up to the 3m pounds.
>
 
 Exactly. That's the "correctly points out" clause at the beginning of
 what I'm calling the meta-argument above. Fine so far. And indeed any
 other *non-quantity-of-money-driven* elaboration of "50-50 random
 choice" would be no better or worse too. You can think up really
 glorious protocols here, like choosing a box, choosing an integer to say
 how many times to "dither back and forth" (of course this integer may as
 well be restricted to 0 or 1 since dithering back and forth is cyclic
 mod 2), and then duly "dithering" that many times before finally
 announcing the box one settles on. All these
 non-quantity-of-money-driven elaborations are the same as just flipping
 a coin and getting the thing over with.
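 
 To put a number on that, here is a quick simulation sketch - the fixed value
 m = 100 and the trial count are arbitrary choices of mine, not part of the
 problem:
 
     import random
 
     # For a fixed (unknown-to-the-player) m, compare two protocols:
     #   keep:   pick a box at random and keep its contents
     #   switch: pick a box at random and take the *other* box instead
     # Both are just coin flips in disguise, so both should average 1.5m.
     def play(m, always_switch, rng):
         boxes = (m, 2 * m)
         i = rng.randrange(2)                 # the coin flip
         return boxes[1 - i] if always_switch else boxes[i]
 
     rng = random.Random(0)
     m, trials = 100.0, 200_000
     keep = sum(play(m, False, rng) for _ in range(trials)) / trials
     switch = sum(play(m, True, rng) for _ in range(trials)) / trials
     print(f"keep ~ {keep:.2f}, switch ~ {switch:.2f}, 1.5m = {1.5 * m}")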
 
>
> Peter's argument has the following flaw: The core argument is
> that reading one box's cheque (say, amount x) makes the contents
> of the other box's cheque amount a random variable with values 2x
> and x/2 and a uniform distribution (50% probability for either).
> But the contents of the other box is not random at all! It is
> either 2x or x/2, and this depends on the experimenter's mood
> when the boxes were set up.
 
 I agree here - everything depends on the distribution the experimenter
 used to generate the secret number m (the number which is used to fill
 the boxes with m and 2m) in the first place. Even if the experimenter
 later says something like "But I didn't use a distribution! I just shut
 my eyes and thought of a number on a whim!", that statement is
 scientifically meaningless - after all their "whim" has to be encoded
 somehow in their brain state, and so the distribution is "whatever
 probabilities brains in that whim-state have for thinking up numbers".
 By hook or by crook, there's a distribution for m lurking in there!
 
> This is a classic example of assuming a probability distribution
> for something that you don't know. People who do this
 
 ...(including me!)...
 
> call themselves Bayesians.
> The problem
> with the Bayesian approach is that in some cases, when you are
> not careful, it can produce misleading results.
 
 Ah, but all good Bayesians are careful. :-)
 
 I think what you mean by being "not careful", in the context of this
 example, is taking the distribution for the question "Is the box I've
 just opened the m box or the 2m box?" to always be 50-50, *even after
 you've read the quantity of money it contains*. That would indeed be not
 just careless but wrong - it is, after all, 50-50 *before* you've read
 the quantity of money, and *therefore*, for that very reason, is not in
 general 50-50 *after* you've read the quantity of money. If you had
 access to the experimenter's secret distribution for generating m in the
 first place (which is not the same thing as access to the value of m
 itself on this occasion, I hasten to add!), you could perform the update
 in a fully proper and knowledgeable way. If you are a Bayesian, then
 even in the absence of access to the experimenter's distribution for m,
 you take it upon yourself to do the next best thing to this: you make
 your own guess as to "what kind of distributions experimenters like when
 playing games with people", you weight *these* distributions according
 to a guessed set of weightings ("meta-distribution"?) of your own, you
 collapse the meta-distribution of "ordinary" distributions into one
 consolidated distribution by the standard
 "multiply-it-all-out-and-add-it-all-up" method, and you use *that*
 next-best-thing consolidated distribution (and your just-obtained
 knowledge of the quantity of money in the box you opened) to update your
 weightings - previously 50-50 - for that dangling question this long and
 rambling paragraph started off with: "Is the box I've just opened the m
 box or the 2m box?". Then, of course, you switch iff your updated
 weightings favour the unopened box. Phew! :-)
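 
 Since nobody in this thread has actually named a consolidated distribution,
 here is a sketch of that final update step with a purely made-up discrete
 prior for m (geometric weights over powers of two - my illustrative
 assumption and nothing more):
 
     # Hypothetical consolidated prior: m is a power of two, 2**k for
     # k = 0..19, with weight proportional to 0.7**k. This is only a
     # stand-in for the "consolidated distribution" described above.
     PRIOR = {2.0 ** k: 0.7 ** k for k in range(20)}
 
     def posterior_opened_is_smaller(x):
         # P(the opened box is the m box | it contains x), under PRIOR.
         w_small = PRIOR.get(x, 0.0)       # opened box holds m, with m = x
         w_large = PRIOR.get(x / 2, 0.0)   # opened box holds 2m, with m = x/2
         total = w_small + w_large
         return 0.5 if total == 0.0 else w_small / total
 
     def should_switch(x):
         # Switch iff the expected content of the unopened box exceeds x.
         p = posterior_opened_is_smaller(x)
         return p * (2 * x) + (1 - p) * (x / 2) > x
 
 With these particular made-up weights the rule ends up switching at every
 amount except the largest one the prior allows - the one place where blind
 switching is guaranteed to halve your money.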
 
>
> Thinking deeper about Peter's box problem after the inaugural,
> I thought that there might be a similar experiment that actually
> increases your expected return beyond 1.5m.
>
> You may find this astonishing, but I actually came up with a
> strategy that has an expected return of **more than** 1.5m -
> no matter what m is (as long as m is bigger than zero).
>
> In order to increase the suspense a little and give you some time
> to think about this, I'll reveal my method to you - after you
> reply to this mail saying something to the effect of "It would be
> astonishing and completely unbelievable that such a method should
> exist".
>
> Regards!
>
> Stefan
 
 Well, as should now be clear, I wouldn't use *those* words - and in fact
 the algorithm I said a good Bayesian would follow (the business about
 the consolidated distribution, in my long and rambling paragraph above)
 *would*, I believe, increase the expected return above 1.5m in most
 real-life cases. Even using a very crude caricature of what I called the
 meta-distribution in that paragraph would probably raise the expected
 return a wee bit above 1.5m. So, I'm not saying such an algorithm would
 be "astonishing and completely unbelievable". Not at all! But of course
 I'd still like you to tell me your method. I'm too tired to work out a
 particular instantiation of my own "algorithm in words only" in my long
 and rambling paragraph above! :-)
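 
 For what it's worth, such a crude caricature can be checked exactly rather
 than by hand-waving. The sketch below re-uses the same made-up
 geometric-over-powers-of-two prior as above (redefined here so the snippet
 stands alone), and it assumes the experimenter really does draw m from that
 same prior - which is, of course, the whole question:
 
     # Crude caricature, evaluated exactly: assume m is drawn from a made-up
     # geometric prior over powers of two, and compare the expected return
     # of blind keeping, blind switching, and "switch iff the update favours
     # the unopened box".
     PRIOR = {2.0 ** k: 0.7 ** k for k in range(20)}
 
     def should_switch(x):
         # Posterior that the opened box is the m box, then switch iff the
         # expected content of the unopened box beats x.
         w_small, w_large = PRIOR.get(x, 0.0), PRIOR.get(x / 2, 0.0)
         p = 0.5 if w_small + w_large == 0.0 else w_small / (w_small + w_large)
         return p * (2 * x) + (1 - p) * (x / 2) > x
 
     def expected_return(strategy):
         # Average over m (weighted by the prior) and over the coin flip
         # that decides which box gets opened first.
         z = sum(PRIOR.values())
         total = 0.0
         for m, w in PRIOR.items():
             boxes = (m, 2 * m)
             for i in (0, 1):
                 x, other = boxes[i], boxes[1 - i]
                 if strategy == "keep":
                     payoff = x
                 elif strategy == "switch":
                     payoff = other
                 else:                             # "bayes"
                     payoff = other if should_switch(x) else x
                 total += (w / z) * 0.5 * payoff
         return total
 
     for s in ("keep", "switch", "bayes"):
         print(f"{s:7s}: expected return = {expected_return(s):.1f}")
 
 With these weights, "keep" and "switch" come out identical - the symmetry
 argument's 1.5m on average - and the Bayesian rule comes out noticeably
 higher. The whole edge comes from declining to switch at the top of the
 finite support, and how big an edge you get in real life depends entirely
 on how good your guessed distribution is.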
 
 Bye for now... (and apologies to those on the Theories of Everything
 mailing list for whom this might all be a bit out of context - but it is
 a fairly well-known "toy probability paradox" in the literature, and
 Stefan's explanation of it at the beginning is quite self-contained.)
 
   Iain.
 
 --
 I have discovered a truly marvellous Theory of Everything. Unfortunately
 this T-shirt is too small to contain it.
 (Or: yeah, sure, the Theory of Everything will fit on a T-shirt. But in
 what size of font?)
 
>>