Newcomb’s paradox is the name usually given to the following problem. You are playing a game against another player, often called Omega, who claims to be omniscient; in particular, Omega claims to be able to predict how you will play in the game. Assume that Omega has convinced you in some way that it is, if not omniscient, at least remarkably accurate: for example, perhaps it has accurately predicted your behavior many times in the past.
Omega places before you two opaque boxes. Box A, it informs you, contains $1,000. Box B, it informs you, contains either $1,000,000 or nothing. You must decide whether to take only Box B or to take both Box A and Box B, with the following caveat: Omega filled Box B with $1,000,000 if and only if it predicted that you would take only Box B.
What do you do?
(If you haven’t heard this problem before, please take a minute to decide on an option before continuing.)
The paradox is that there appear to be two reasonable arguments about which option to take, but unfortunately the two arguments support opposite conclusions.
The two-box argument is that you should clearly take both boxes. You take Box B either way, so the only decision you’re making is whether to also take Box A. No matter what Omega did before offering the boxes to you, Box A is guaranteed to contain $1,000, so taking it is guaranteed to make you $1,000 richer.
The one-box argument is that you should clearly take only Box B. By hypothesis, if you take only Box B, Omega will predict that and will fill Box B, so you get $1,000,000; if you take both boxes, Omega will predict that and won’t fill Box B, so you only get $1,000.
The two-boxer might respond to the one-boxer as follows: “it sounds like you think a decision you make in the present, at the moment Omega offers you the boxes, will affect what Omega did in the past, at the moment Omega filled the boxes. That’s absurd.”
The one-boxer might respond to the two-boxer as follows: “it sounds like you think you can just make decisions without Omega predicting them. But by hypothesis it can predict them. That’s absurd.”
Now what do you do?
(Again, please take a minute to reassess your original choice before continuing.)
The von Neumann-Morgenstern theorem
Let’s avoid the above question entirely by asking some other questions instead. For example, a question one might want to ask after having thought about Newcomb’s paradox for a bit is “in general, how should I think about the process of making decisions?” This is the subject of decision theory, which is roughly about decisions in the same sense that game theory is about games. The things that make decisions in decision theory are abstractions that we will refer to as agents. Agents have some preferences about the world and are making decisions in an attempt to satisfy their preferences.
One model of preferences is as follows: there is a set of (mutually exclusive) outcomes, and we will model preferences by a binary relation $\succeq$ on outcomes, where $A \succeq B$ describes pairs of outcomes such that the agent weakly prefers $A$ to $B$. This means either that in a decision between the two the agent would pick $A$ over $B$ (the agent strictly prefers $A$ to $B$; we write this as $A \succ B$) or that the agent is indifferent between them. The weak preference relation $\succeq$ should be a total preorder; that is, it should satisfy the following axioms:
- Reflexivity: $A \succeq A$. (The agent is indifferent between an outcome and itself.)
- Transitivity: If $A \succeq B$ and $B \succeq C$, then $A \succeq C$. (The agent’s preferences are transitive.)
- Totality: Either $A \succeq B$ or $B \succeq A$. (The agent has a preference about every pair of outcomes.)
If $A \succeq B$ and $B \succeq A$ then this means that the agent is indifferent between the two outcomes; we write this as $A \sim B$. The axioms above imply that indifference is an equivalence relation.
The strong assumptions here are transitivity and totality. One reason totality is a reasonable axiom is that an agent whose preferences aren’t total may be incapable of making a decision if presented with a choice between two outcomes the agent doesn’t have a defined preference between, and this seems undesirable. For example, if we were trying to write a program to make medical decisions, we wouldn’t want the program to crash if faced with the wrong kind of medical crisis.
One reason transitivity is a reasonable axiom is that an agent whose preferences aren’t transitive can be money pumped. For example, if an agent strictly prefers apples to oranges, oranges to bananas, and bananas to apples, then I can offer the agent an apple, then offer to trade it a banana for the apple and a penny (say), then offer to trade it an orange for the banana and a penny (say), and so forth. Again, if we were trying to write a program to make important decisions of some kind, this kind of vulnerability would be very dangerous.
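The money pump above is simple enough to simulate. A minimal sketch (the fruit cycle is from the text; the penny accounting is simplified to counting trades):

```python
# The agent's cyclic strict preferences from the text:
# apples > oranges, oranges > bananas, bananas > apples.
# prefers[x] is the item the agent strictly prefers to x.
prefers = {"orange": "apple", "banana": "orange", "apple": "banana"}

def money_pump(holding, rounds):
    """Each round, offer the agent the item it strictly prefers to its
    current holding, in exchange for that holding plus a penny; by
    hypothesis the agent always accepts."""
    cents_paid = 0
    for _ in range(rounds):
        holding = prefers[holding]  # the agent trades "up" in its own eyes
        cents_paid += 1             # ...and pays a penny for the privilege
    return holding, cents_paid

holding, cents_paid = money_pump("apple", 30)
# After 30 trades (a multiple of the cycle length 3) the agent holds
# the same kind of fruit it started with, 30 cents poorer.
```

Nothing stops the pump from running indefinitely, which is exactly the vulnerability the text warns about.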
In this model, an agent makes decisions as follows. Each time it makes a decision, it must choose from some number of actions. It needs to determine what outcomes result from each of these actions. Then it needs to determine which of these outcomes is greatest in its preference ordering, and it selects the corresponding action.
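The certainty model above can be sketched in a few lines (the outcome names and the numerical ranking standing in for the preference ordering are my own inventions, purely for illustration):

```python
# Decision under certainty: each available action leads to a known
# outcome, and the agent selects the action whose outcome ranks
# highest in its preference ordering.
rank = {"get nothing": 0, "get $1,000": 1, "get $1,000,000": 2}

outcome_of = {
    "walk away": "get nothing",
    "take Box A": "get $1,000",
}

best_action = max(outcome_of, key=lambda action: rank[outcome_of[action]])
# best_action is "take Box A", the action with the most-preferred outcome.
```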
This is very unsatisfying as a model of decision making because it fails to take into account uncertainty. In practice, agents making decisions cannot completely determine what outcomes result from their actions: instead, they have some uncertainty about possible outcomes, and that uncertainty should be factored into the decision-making process. We will take uncertainty into account as follows. Define a lottery over outcomes to be a formal linear combination

$$p_1 A_1 + p_2 A_2 + \dots + p_n A_n$$

of outcomes, where the $p_i$ are nonnegative real numbers summing to $1$ and should be interpreted as the probabilities that the outcomes $A_i$ occur. (Equivalently, a lottery is a particularly simple kind of probability measure on the space of outcomes, which is given the discrete $\sigma$-algebra as a measurable space, but we will not need to use this language.) We now want our agent to have preferences over lotteries rather than preferences over outcomes. That is, the agent’s preferences are now modeled by a total preorder $\succeq$ on lotteries.
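As a concrete sketch of this formalism (the dictionary representation is my own choice, not anything from the text), a lottery over finitely many outcomes can be stored as a map from outcomes to probabilities, and forming a compound lottery is a coordinatewise convex combination:

```python
def mix(p, lottery1, lottery2):
    """Compound lottery p*lottery1 + (1-p)*lottery2, where a lottery is
    a dict mapping outcomes to probabilities that sum to 1."""
    outcomes = set(lottery1) | set(lottery2)
    return {o: p * lottery1.get(o, 0.0) + (1 - p) * lottery2.get(o, 0.0)
            for o in outcomes}

sure_million = {"$1,000,000": 1.0}
sure_nothing = {"$0": 1.0}

# A fair coin flip between the two sure outcomes:
coin_flip = mix(0.5, sure_million, sure_nothing)
# coin_flip assigns probability 0.5 to each outcome, and its
# probabilities still sum to 1, as a lottery's must.
```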
Aside from the axioms defining a total preorder, what other axioms seem reasonable? First, suppose that $A, B$ are two lotteries such that $A \succ B$. Now consider the modified lotteries $pA + (1-p)C$ and $pB + (1-p)C$, where with probability $p$ the original lotteries occur but with probability $1-p$ some other fixed lottery $C$ occurs. Whichever case occurs, we either strictly prefer what happens in the first modified lottery or are indifferent between the two, so the following seems reasonable.
- Independence: If $A \succeq B$, then for all $p \in [0, 1]$ and all lotteries $C$ we have $pA + (1-p)C \succeq pB + (1-p)C$. Moreover, if $A \succ B$ and $p > 0$, then $pA + (1-p)C \succ pB + (1-p)C$.
Note that by taking the contrapositive of the second part of independence we get a partial converse of the first part: if $p > 0$ is such that $pA + (1-p)C \succeq pB + (1-p)C$, then $A \succeq B$. In particular, if $pA + (1-p)C \sim pB + (1-p)C$ with $p > 0$, then $A \sim B$. This will be useful later.
Another reasonable axiom is the following. Suppose $A, B, C$ are three lotteries such that $A \succ B \succ C$. Now consider the family of lotteries $pA + (1-p)C$. When $p = 1$ the agent weakly prefers this lottery to $B$, but when $p = 0$ the agent weakly prefers $B$ to this lottery. What happens for intermediate values of $p$? It seems reasonable for an “intermediate value theorem” to hold here: the agent’s preferences should not jump as $p$ varies. So the following seems reasonable.
- Continuity: If $A \succ B \succ C$, then there exists some $p \in (0, 1)$ such that $pA + (1-p)C \sim B$.
With these axioms we can now state the following foundational theorem.
Theorem (von Neumann-Morgenstern): Suppose an agent’s preferences satisfy the above axioms. Then there exists a function $u$ on outcomes, the utility function of the agent, such that $L_1 \succeq L_2$ if and only if

$$\sum_i p_i u(A_i) \ge \sum_j q_j u(B_j)$$

where $L_1 = \sum_i p_i A_i$ and $L_2 = \sum_j q_j B_j$. The utility function $u$ is unique up to affine transformations $u \mapsto au + b$ where $a > 0$.
If $L = \sum_i p_i A_i$ is a lottery, the corresponding sum $\sum_i p_i u(A_i)$ is the expected utility of $u$ with respect to the lottery, so the von Neumann-Morgenstern theorem allows us to describe the goal of an agent (a VNM-rational agent) satisfying the above axioms as maximizing expected utility.
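To make this concrete (the utility values below are invented for illustration), expected utility turns comparing lotteries into comparing numbers, and the indifference probability promised by the continuity axiom can be solved for directly:

```python
def expected_utility(lottery, u):
    """Expected utility of a lottery (a dict outcome -> probability)
    under a utility function u (a dict outcome -> real number)."""
    return sum(prob * u[outcome] for outcome, prob in lottery.items())

# Invented utilities with A preferred to B preferred to C:
u = {"A": 10.0, "B": 4.0, "C": 0.0}

# Continuity promises some p with p*A + (1-p)*C indifferent to B.
# In utility terms: p*u(A) + (1-p)*u(C) = u(B), so:
p = (u["B"] - u["C"]) / (u["A"] - u["C"])

mixture = {"A": p, "C": 1 - p}
# expected_utility(mixture, u) equals u["B"], so a VNM-rational agent
# is indifferent between the mixture and B.
```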
Proof. First observe that we can reduce to the case that the set of outcomes is finite. If the theorem were false in the infinite case, then for any proposed utility function $u$ we would be able to find a pair of lotteries $L_1 = \sum_i p_i A_i$ and $L_2 = \sum_j q_j B_j$ such that $L_1 \succeq L_2$ but $\sum_i p_i u(A_i) < \sum_j q_j u(B_j)$, or vice versa. But since $L_1, L_2$ in total only involve finitely many outcomes, $u$ restricts to a utility function with the same property on the finitely many outcomes involved in $L_1, L_2$, so the theorem is false in the finite case.
Now for the proof. It is possible to take a fairly concrete but tedious approach by first constructing $u$ using continuity and then proving that $u$ satisfies the conclusions of the theorem by induction. We will instead take a more abstract approach by appealing to the hyperplane separation theorem. To start with, think of the set of lotteries over the outcomes $A_1, \dots, A_n$ as sitting inside Euclidean space $\mathbb{R}^n$ as the probability simplex

$$\Delta = \{ (p_1, \dots, p_n) : p_i \ge 0, p_1 + \dots + p_n = 1 \}.$$

Let $A_{\min}$ resp. $A_{\max}$ be outcomes which are minimal resp. maximal in the agent’s preference ordering. For $0 \le p \le 1$, let $L_p = p A_{\max} + (1-p) A_{\min}$.
We would like to show that the subset

$$U_p = \{ L \in \Delta : L \succ L_p \}$$

(of lotteries the agent strictly prefers to $L_p$) and the subset

$$V_p = \{ L \in \Delta : L_p \succ L \}$$

(of lotteries the agent strictly prefers $L_p$ to) are disjoint convex open subsets of $\Delta$. That they are disjoint follows from the definition of strict preference. That they are convex can be seen as follows: if $L_1, L_2$ are two lotteries such that $L_1, L_2 \succ L_p$, then by independence we have

$$q L_1 + (1-q) L_2 \succ q L_p + (1-q) L_2 \succ q L_p + (1-q) L_p = L_p$$

for all $0 < q < 1$, hence $q L_1 + (1-q) L_2 \in U_p$ for all $0 \le q \le 1$. Applying the same argument with the preferences reversed shows that $V_p$ is convex.

Finally, that they are open (in $\Delta$) can be seen as follows: let $L$ be a lottery such that $L \succ L_p$. By inspection, every point of $\Delta$ in a small open ball around $L$ has the form $(1-\epsilon) L + \epsilon M$, where $M$ is some other lottery. So it suffices by convexity to show that for any such $M$ there exists some $\epsilon > 0$ such that $(1-\epsilon) L + \epsilon M \succ L_p$.

In the case that $M \succeq L_p$ (in particular, if the agent is indifferent between $M$ and $L_p$) this is straightforward; by independence

$$(1-\epsilon) L + \epsilon M \succ (1-\epsilon) L_p + \epsilon M \succeq (1-\epsilon) L_p + \epsilon L_p = L_p$$

for all $0 < \epsilon < 1$. In the case that $L_p \succ M$ we have $L \succ L_p \succ M$, so by continuity there exists some $q \in (0,1)$ such that $q L + (1-q) M \sim L_p$, and a similar application of independence gives

$$(1-\epsilon) L + \epsilon M \succ q L + (1-q) M \sim L_p$$

whenever $1 - \epsilon > q$. Again, applying the same argument with the preferences reversed shows that $V_p$ is open.
Now by the hyperplane separation theorem there exists a hyperplane $\sum_i u_i x_i = c$ separating $U_p$ and $V_p$, where $u_1, \dots, u_n$ and $c$ are constants. These constants are in fact essentially independent of $p$, and the $u_i$ are (up to affine transformation, and in particular we may need to flip their signs) the utility function we seek. To see this, let $L_1, L_2$ be two lotteries. Then by independence $A_{\max} \succeq L_1, L_2 \succeq A_{\min}$, and by continuity there are constants $p_1, p_2$ such that

$$L_1 \sim L_{p_1}, \quad L_2 \sim L_{p_2}.$$
If $p_1 = p_2 = p$, then $L_1 \sim L_2$, and the separating hyperplane must pass through both $L_1$ and $L_2$ (since $L_1, L_2$ are in neither $U_p$ nor $V_p$, and the complement of their union consists of lotteries equivalent to $L_p$), so they have the same utility. Conversely, if a separating hyperplane passes through two lotteries then they must be equivalent to the same $L_p$ and hence must be equivalent.
Otherwise, say $p_1 > p_2$, so that $L_1 \succ L_2$, and for any $p$ with $p_2 < p < p_1$ the separating hyperplane separates $L_1$ and $L_2$, since $L_1 \in U_p$ and $L_2 \in V_p$. With the correct choice of signs, it follows that $L_1$ has strictly greater expected utility $\sum_i u_i p_i$ than $L_2$, as desired. Conversely, if a separating hyperplane separates two lotteries then they cannot have the same expected utility and hence cannot be equivalent; with the correct choice of signs, if $L_1 \succ L_2$ then the expected utility of $L_1$ exceeds that of $L_2$.
It remains to address the uniqueness claim. The above discussion shows that the utility function is uniquely determined by its values on $A_{\min}$ and $A_{\max}$, subject to the constraint that $u(A_{\max}) > u(A_{\min})$. To fix the correct choice of signs above we may set $u(A_{\min}) = 0$ and $u(A_{\max}) = 1$; any other choice is related to this choice by a unique affine transformation $u \mapsto au + b$ with $a > 0$.
But what about the paradox?
The relevance of the von Neumann-Morgenstern theorem to Newcomb’s paradox is that a particular interpretation of Newcomb’s paradox in the context of expected utility maximization supports the one-box argument. A VNM-rational agent participating in Newcomb’s paradox should be acting in order to maximize expected utility. For the purposes of recasting Newcomb’s paradox in this framework, it’s reasonable to equate utility with money; agents certainly don’t need to have the property that their utility functions are linear in money, but Newcomb’s paradox can just be restated in units of utility (utilons) rather than money.
So, it remains to determine the expected utility of the lottery that occurs if the agent takes one box and the lottery that occurs if the agent takes two boxes. Newcomb’s paradox can be interpreted as saying that in the first lottery, the box contains $1,000,000 with high probability (whatever probability the agent assigns to Omega being an accurate predictor), while in the second lottery, the two boxes together contain $1,000 with high probability. Provided that this probability is sufficiently high, which again can be absorbed into a suitable restatement of Newcomb’s paradox, it seems clear that a VNM-rational agent should take one box. (Note that stating the one-box argument in this way shows that it does not depend on Omega being a perfect predictor; Omega need only be a sufficiently good predictor, where the meaning of “sufficiently” depends on the ratio of the amounts of money in each box.)
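The arithmetic behind this paragraph can be made explicit. A sketch, equating dollars with utilons as discussed above and writing acc for the agent's probability that Omega predicts correctly:

```python
def ev_one_box(acc):
    """Expected dollars from one-boxing: Box B holds $1,000,000
    exactly when Omega correctly foresaw one-boxing."""
    return acc * 1_000_000

def ev_two_box(acc):
    """Expected dollars from two-boxing: Box A's $1,000 is guaranteed,
    and Box B is full only if Omega mispredicted."""
    return 1_000 + (1 - acc) * 1_000_000

# One-boxing wins once acc * 1e6 > 1000 + (1 - acc) * 1e6, i.e.
# acc > 1,001,000 / 2,000,000 = 0.5005 -- with the stated 1000:1 payoff
# ratio, Omega only needs to be slightly better than chance.
threshold = 1_001_000 / 2_000_000
```

This is the quantitative content of the remark that Omega need only be a sufficiently good predictor, not a perfect one.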
This version of the one-box argument is therefore based on the principle of expected utility (to be distinguished from the von Neumann-Morgenstern theorem); roughly speaking, that rational agents should act so as to maximize expected utility. Relative to the definition of expected utility given above this says exactly that rational agents should be VNM-rational.
The two-box argument can also be based on a decision-making principle, namely the principle of dominance, which says the following. Suppose an agent is choosing between two options $A$ and $B$. Say that $A$ dominates $B$ if there is a way to partition possible states of the world such that in each part of the partition, the agent would prefer choice $A$ to choice $B$. (The notion of dominance does not depend on having a probability distribution over world states; it requires something much weaker, namely a set of possible world states.) The principle of dominance asserts that rational agents should choose dominant options.
This seems plausible. But it also seems to be the case that taking two boxes dominates taking one box in Newcomb’s paradox:
- If Omega has filled Box B with $1,000,000, then taking both boxes gives you $1,001,000 rather than $1,000,000, so it’s $1,000 better.
- If Omega hasn’t filled Box B with $1,000,000, then taking both boxes gives you $1,000 rather than $0, so it’s still $1,000 better.
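The two bullets above amount to a mechanical check over the payoff table. A sketch, with state and option names of my own choosing:

```python
# Payoffs in dollars, indexed by (world state, option).
payoff = {
    ("Box B full",  "two-box"): 1_001_000,
    ("Box B full",  "one-box"): 1_000_000,
    ("Box B empty", "two-box"): 1_000,
    ("Box B empty", "one-box"): 0,
}
states = ["Box B full", "Box B empty"]

def dominates(a, b):
    """Option a dominates option b if a does strictly better than b
    in every world state of the partition."""
    return all(payoff[(s, a)] > payoff[(s, b)] for s in states)

# Two-boxing is $1,000 better in each state, so it dominates:
# dominates("two-box", "one-box") is True.
```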
One situation in which the principle of dominance doesn’t make sense is if the choice between options itself affects which partition of world-states you’re in. For example, if you chose which boxes to open and then Omega chose whether to fill Box B based on your choice, then the above reasoning doesn’t seem to apply since Omega gets to choose which partition of world-states you’re in after seeing your choice between the two options. But in the setting of Newcomb’s paradox itself this doesn’t seem to be the case: Omega has already made its decision in the past, and it seems absurd to think of the agent’s decision in the present as having an effect on Omega’s past decision.
So Newcomb’s paradox appears to show that the principle of expected utility maximization and the principle of dominance are inconsistent.
Now what do you do?
Newcomb’s paradox remains, as far as I can tell, a hotly debated topic in the philosophical literature, and in particular is considered unresolved. Campbell and Sowden’s Paradoxes of Rationality and Cooperation is a thorough, if somewhat outdated, overview of some aspects of Newcomb’s paradox and its relationship to the prisoner’s dilemma.