Re: SSA and game theory (was: self-sampling assumption is incorrect)

From: Hal Finney <hal.domain.name.hidden>
Date: Wed, 17 Jul 2002 18:49:04 -0700

Wei wrote:
> Here's a simplified thought experiment that illustrates the issue. Two
> copies of the subject S, A and B, are asked to choose option 1 or option
> 2. If A chooses 1, S wins a TV (TV), otherwise S wins a worse TV (TV2). If
> B chooses 1, S wins a stereo, otherwise S wins TV. S prefers TV to TV2 to
> stereo, but would rather have a TV and a stereo than two TVs. The copies
> have to choose without knowing whether they are A or B.

OK, I understand now that the utilities below are the utilities for A
and B when S gets the various items. So U(TV) is the utility for A for
S to get a TV, which is the same as the utility for B since they are
identical copies.

> According to my incorrect analysis, SSA would imply that you choose option
> 2, because that gives you .5*U(TV2) + .5*U(TV) > .5*U(TV) + .5*U(stereo)
> since U(TV2) > U(stereo). I argued that you should consider yourself A and
> B simultaneously so you could rationally choose option 1, because
> U({TV,stereo}) > U({TV2, TV}).

Yes, that makes sense.
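
To spell out the arithmetic, here is the comparison with made-up utility
numbers (a quick sketch in Python; the particular values are only
assumptions chosen to respect the stated preferences, TV over TV2 over
stereo, and a TV plus a stereo over two TVs):

    # Illustrative utilities only; any numbers with U(TV) > U(TV2) > U(stereo)
    # and U({TV,stereo}) > U({TV2,TV}) would do.
    U = {"TV": 10, "TV2": 8, "stereo": 5}               # per-copy utility of each item
    U_pair = {("TV", "stereo"): 15, ("TV2", "TV"): 13}  # utility of the combined outcome

    # SSA, per copy: you are A or B with probability 1/2.
    eu_option1 = 0.5 * U["TV"] + 0.5 * U["stereo"]      # A wins TV, B wins stereo
    eu_option2 = 0.5 * U["TV2"] + 0.5 * U["TV"]         # A wins TV2, B wins TV
    print(eu_option1, eu_option2)                       # 7.5 vs 9.0: option 2 looks better

    # Treating yourself as A and B at once, compare combined outcomes instead.
    print(U_pair[("TV", "stereo")], U_pair[("TV2", "TV")])  # 15 vs 13: option 1 looks better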

> However taking both SSA and game theory
> into account implies that option 2 is rational. Furthermore, my earlier
> suggestion leads to unintuitive results in general, when the two players
> do not share the same utility function.

I take it you meant to write that game theory implies that option 2 is
irrational, i.e. that option 1 is the rational choice.

> The game theoretic analysis goes like this. There are two possible
> outcomes with pure strategies (I'll ignore mixed strategies for now).
> Either A and B both choose 1, or they both choose 2. The first one is a
> Nash equilibrium, the second may or may not be. To understand what this
> means, suppose you are one of the players in this game (either A or B but
> you don't know which) and you expect the other player to choose option 1.
> Then your expected utility if you choose option 1 is .5*U({TV,stereo}) +
> .5*U({TV,stereo}). If you choose option 2, the expected utility is
> .5*U({TV2,stereo}) + .5*U({TV,TV}) which is strictly less. So you have no
> reason not to choose option 1 if you expect the other player to choose
> option 1. Whether or not the second possible outcome is also a Nash
> equilibrium depends on whether U({TV2,TV}) > .5*U({TV2,stereo}) +
> .5*U({TV,TV}). But even if it is, the players can just coordinate ahead of
> time (or implicitly) to choose option 1 and obtain the better equilibrium.

If option 2 is also a Nash equilibrium, then the equilibrium analysis by
itself does not rule it out, right? That would leave some room for the
first analysis, which preferred option 2. However I see that under this
reasoning there are utility assignments which make option 1 be a Nash
equilibrium while option 2 is not, hence option 1 would be preferred in
those cases, despite the earlier reasoning which would choose option 2.
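
Here is a quick Python check of that; the utility numbers are again only
assumptions that respect the stated preferences. With these particular
values both symmetric profiles happen to be equilibria; lowering
U({TV2,TV}) below 11.5 is one way to get a case where option 1 is an
equilibrium and option 2 is not.

    # A's choice of 1/2 wins TV/TV2 for S; B's choice of 1/2 wins stereo/TV.
    def outcome(a_choice, b_choice):
        item_a = "TV" if a_choice == 1 else "TV2"
        item_b = "stereo" if b_choice == 1 else "TV"
        return tuple(sorted([item_a, item_b]))

    # Utilities of the combined outcomes; the numbers are illustrative only.
    U = {("TV", "stereo"): 15, ("TV", "TV2"): 13,
         ("TV", "TV"): 12, ("TV2", "stereo"): 11}

    def expected_u(my_choice, other_choice):
        # I am A or B with probability 1/2; the other copy has the other role.
        return (0.5 * U[outcome(my_choice, other_choice)]
                + 0.5 * U[outcome(other_choice, my_choice)])

    for c in (1, 2):
        stay, deviate = expected_u(c, c), expected_u(3 - c, c)
        print("both choose %d: %.1f staying vs %.1f deviating, equilibrium: %s"
              % (c, stay, deviate, stay >= deviate))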

I have a problem with this application of game theory to a situation where
A and B both know that they are going to choose the same thing, which I
believe is the case here. Let me make this more specific by assuming that
A and B (and S) are deterministic computational systems. Their needs
for randomness are met by an internal pseudo-random number generator (PRNG).
When S is duplicated to form A and B, the PRNG state is duplicated as
well, so that A and B are running exactly the same deterministic program.
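
In Python terms the setup is just this (a toy sketch, assuming nothing
beyond what I described above):

    import random

    def subject(rng):
        # Stand-in for the subject's deliberation; any "randomness" it uses
        # comes only from the generator state that was copied along with it.
        return 1 if rng.random() < 0.7 else 2

    s_state = random.Random(12345).getstate()  # S's PRNG state when copied

    copy_a = random.Random()
    copy_a.setstate(s_state)
    copy_b = random.Random()
    copy_b.setstate(s_state)

    # A and B are the same deterministic program in the same state, so they
    # cannot help but make the same choice.
    assert subject(copy_a) == subject(copy_b)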

This is the situation which most sharply appeals to the intuition that A
and B should be thought of as "the same person". They are two instances
of the same deterministic calculation, with exactly the same steps being
executed for both.

Under these circumstances, I don't see how considerations of Nash
equilibria can arise. These require implicitly assuming that the other
side may choose a different value than you do. But with the setup I
give, it is physically impossible for that to happen. The other player
has no more freedom to behave differently than does an image in a mirror.

Likewise with the amnesiac prisoner's dilemma, if the amnesia is provided
in the manner I have described, so that both parties are running exactly
the same program and both know that they are doing so, it seems perfectly
reasonable to choose to cooperate. There are actually fewer degrees
of freedom than the game matrix implies; only two possible outcomes,
rather than four. And the best of the two possible outcomes is when
both parties cooperate rather than defect.
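
A toy version of that point, with the usual textbook payoff numbers
(my own assumption, not anything from the discussion above):

    # Row player's payoffs in the standard prisoner's dilemma.
    payoff = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

    # With both parties running the same deterministic program, only the
    # diagonal outcomes are reachable: my move always equals the other's.
    reachable = {move: payoff[(move, move)] for move in ("C", "D")}
    print(reachable)                          # {'C': 3, 'D': 1}
    print(max(reachable, key=reachable.get))  # 'C': mutual cooperation wins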

This approach suggests a question with regard to the causal interpretation
of Newcomb's paradox. First, as something of a digression, suppose
it turns out that the experimenter's eerie accuracy in the Newcomb
setup is because he has a time machine. After the subject's decision
is made to choose one or two boxes, the experimenter goes back in time
and fills the boxes appropriately. In this case, it seems to me that
the causalist may decide that taking one box is the preferred outcome,
because his choice does *cause* the filling of the boxes. The effect
takes place earlier in time, but given that there is a time machine in
the picture, we have to accept reversed causality. OTOH the causalist may
reject this reasoning, arguing along his usual lines that the boxes have
already been filled, and taking two has to give him more than taking one.
I don't know which conclusion he would choose.

But more relevantly, suppose that the experimenter's secret is as
follows. Let the subject be a deterministic computational system as in
the APD and other examples above. What the experimenter does is to run
the computation forward until it makes a choice. Then he rewinds the
computation to the state it was in at the beginning, and fills the boxes.
Now he runs the computation forward again, where it will make the same
decision (being deterministic) and so the "prediction" is always correct.
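
Continuing the deterministic-computation sketch from before (the dollar
amounts are illustrative, chosen by me), the experimenter's procedure
amounts to:

    import random

    def subject(rng):
        # Stand-in for the subject's deliberation; it cannot see inside the
        # opaque box, so its choice depends only on its (copied) state.
        return "one box" if rng.random() < 0.9 else "two boxes"

    initial_state = random.Random(99).getstate()

    # First run: the experimenter just watches what the subject would choose.
    trial = random.Random()
    trial.setstate(initial_state)
    predicted = subject(trial)

    # "Rewind" to the initial state and fill the opaque box accordingly.
    opaque_box = 1_000_000 if predicted == "one box" else 0

    # Second run, from exactly the same state: being deterministic, the
    # subject makes the same choice, so the "prediction" is always correct.
    real = random.Random()
    real.setstate(initial_state)
    assert subject(real) == predicted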

Of course, this description is not much different from standard variants
where the experimenter is an alien with a perfect grasp of human
psychology, or God, able to predict with perfection what people will do.
But by making it concrete in terms of deterministic computations, it
allows for a different view from the causal perspective.

Specifically, when the subject is asked to make his decision, he knows
that he is put into this state twice: once when the experimenter runs
him to find out what he will do, and again when the actual choice is
made. From the point of view of shared minds, the subject
must view himself as being in a superposition of these two states.

The point is that in one of those two states, his decision does in
fact have a causal effect on the outcome. It is the direct effect of
his decision that lets the experimenter fill the boxes. So from his
subjective perspective, where he doesn't know if this is the first or
second run, he can at least figure that there is a 50% chance that his
decision has a causal effect on the outcome. It seems to me that this
might be enough to justify choosing one box even from a causal analysis.
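
For what it's worth, here is the rough expected-value comparison that
argument suggests, using the usual Newcomb amounts and the 50/50 split
over the two runs (both my assumptions for the sketch). Whatever already
sits in the opaque box on the run where the choice has no causal effect
appears on both sides of the comparison, so its value drops out.

    BIG, SMALL = 1_000_000, 1_000  # opaque box if one-boxing predicted; transparent box
    p_causal = 0.5                 # chance this is the run whose choice fills the boxes
    X = 0                          # whatever is already in the opaque box otherwise;
                                   # it cancels out of the comparison

    eu_one_box = p_causal * BIG + (1 - p_causal) * X
    eu_two_box = p_causal * (0 + SMALL) + (1 - p_causal) * (X + SMALL)
    print(eu_one_box, eu_two_box)  # 500000.0 vs 1000.0: one box wins by 0.5*BIG - SMALL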

Hal Finney