RE: Observation selection effects

From: Jesse Mazer
Date: Tue, 05 Oct 2004 19:00:49 -0400

>>-----Original Message-----
>>From: Jesse Mazer
>>Sent: Tuesday, October 05, 2004 8:45 PM
>>Subject: RE: Observation selection effects
>>If the range of the smaller amount is infinite,
>>as in my P(x)=1/e^x
>>example, then it would no longer make sense to say that
>>the range of the
>>larger amount is r times larger.
>Sure it does; r*inf=inf.  P(s)=exp(-x) -> P(l)=exp(-x/r)

But it would make just as much sense to say that the second range is 3r
times wider, since by the same logic 3r*inf=inf. In other words, this step
in your proof doesn't make sense:

>In other words, the range of possible
>amounts is such that the larger and smaller amount do not overlap.
>Then, for any interval of the range (x,x+dx) for the smaller
>amount with probability p, there is a corresponding interval (r*x,
>r*x+r*dx) with probability p for the larger amount. Since the
>latter interval is longer by a factor of r
> P(l|m)/P(s|m) = r ,
>In other words, no matter what m is, it is r-times more likely to
>fall in a large-amount interval than in a small-amount interval.

As for your statement that "P(s)=exp(-x) -> P(l)=exp(-x/r)", that can't be
true. It doesn't make sense that the value of the second probability
distribution at x would be exp(-x/r), since the range of possible values for
the amount in that envelope is 0 to infinity, but the integral of exp(-x/r)
from 0 to infinity is not equal to 1, so that's not a valid probability
distribution.
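
To spell out the normalization point, here is a short worked check. It assumes the larger amount l is exactly r times the smaller amount x, and that x has density exp(-x) on 0 to infinity; the rescaled density below is my own illustration of what a properly normalized version would look like, not something from the quoted proof.

```latex
\int_0^\infty e^{-x/r}\,dx = r \neq 1 \quad \text{unless } r = 1,
\qquad\text{whereas}\qquad
l = r x,\; x \sim e^{-x}
\;\Longrightarrow\;
p_l(l) = \frac{1}{r}\,e^{-l/r},
\quad
\int_0^\infty \frac{1}{r}\,e^{-l/r}\,dl = 1 .
```

So under that assumption the density for the larger amount picks up a factor of 1/r, which is what keeps its total probability at 1.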

Also, now that I think more about it I'm not even sure the step in your
proof I quoted above actually makes sense even in the case of a probability
distribution with finite range. What exactly does the equation
"P(l|m)/P(s|m) = r" mean, anyway? It can't mean that if I choose an envelope
at random, before I even open it I can say that the amount m inside is r
times more likely to have been picked from the larger distribution, since I
know there is a 50% chance I will pick the envelope whose amount was picked
from the larger distribution. Is it supposed to mean that if we let the
number of trials go to infinity and then look at the subset of trials where
the envelope I opened contained m dollars, it is r times more likely that
the envelope was picked from the larger distribution on any given trial?
This can't be true for every specific m--for example, if the smaller
distribution had a range of 0 to 100 and the larger had a range of 0 to 200,
if I set m=150, then in every single trial where I found 150 dollars in the
envelope it must have been selected from the larger distribution. You could
do a weighted average over all possible values of m, like "integral over all
possible values of m of P('I found m dollars in the envelope I
selected')*P('the envelope I selected had an amount taken from the smaller
distribution' | 'I found m dollars in the envelope I selected')", which you
could write as "integral over m of P(m)*P(s|m)", but I don't think it would
be true that the ratio "integral over m of P(m)*P(l|m)"/"integral over m of
P(m)*P(s|m)" would be equal to r. In fact, I think both integrals would
always come out to 1/2, so the ratio would always be 1. And even if I'm
wrong, replacing P(l|m)/P(s|m) with this ratio of integrals would mess up
the rest of your proof.
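
The 0-to-100 versus 0-to-200 example can be checked numerically. This is only a rough sketch under assumptions of my own: the smaller amount is drawn uniformly from 0 to 100, the larger amount is exactly r = 2 times the smaller, and the opened envelope is chosen by a fair coin flip. The function name `simulate` and its parameters are made up for illustration.

```python
import random

def simulate(trials=200_000, r=2.0, upper=100.0, seed=0):
    """Monte Carlo sketch of the two-envelope setup (assumed uniform prior)."""
    rng = random.Random(seed)
    opened_larger = 0        # trials where the opened envelope held the larger amount
    above_upper = 0          # trials where the observed amount m exceeded `upper`
    above_upper_larger = 0   # of those, trials where m came from the larger envelope
    for _ in range(trials):
        small = rng.uniform(0.0, upper)   # assumption: smaller amount is uniform
        large = r * small                 # assumption: larger is exactly r times smaller
        picked_larger = rng.random() < 0.5  # fair choice between the two envelopes
        m = large if picked_larger else small
        opened_larger += picked_larger
        if m > upper:
            above_upper += 1
            above_upper_larger += picked_larger
    return opened_larger / trials, above_upper, above_upper_larger

frac_larger, n_above, n_above_larger = simulate()
print("fraction of trials opening the larger envelope:", frac_larger)
print("trials with m > 100:", n_above, "of which from the larger:", n_above_larger)
```

Averaged over all m, the fraction of trials that open the larger envelope should sit near 1/2, which is what the weighted-average integrals describe. Restricted to m above 100, every trial must come from the larger envelope, so the conditional ratio cannot be the same constant r for every m.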

Received on Tue Oct 05 2004 - 19:05:09 PDT
