The principle of maximum entropy asserts that when trying to determine an unknown probability distribution (for example, the distribution of results of tossing a possibly unfair die), you should pick the distribution of maximum entropy consistent with your knowledge.
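As a concrete sketch (the function names and the particular biased die are my own illustration, not from the post), here is the Shannon entropy of a distribution over a finite set, together with a numerical check that a fair die has higher entropy than a biased one:

```python
import math

def entropy(p):
    """Shannon entropy H(p) = -sum_i p_i log(p_i), natural log,
    with the convention 0 log 0 = 0."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

uniform_die = [1/6] * 6                       # a fair die
biased_die = [0.5, 0.1, 0.1, 0.1, 0.1, 0.1]   # a possibly unfair die

# The uniform distribution attains the maximum possible entropy log 6;
# any biased distribution has strictly smaller entropy.
print(entropy(uniform_die))   # log 6 ≈ 1.7918
print(entropy(biased_die))    # ≈ 1.4979, strictly smaller
```
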

The goal of this post is to derive the principle of maximum entropy in the special case of probability distributions over finite sets from

- Bayes’ theorem and
- the principle of indifference: assign probability 1/n to each of n possible outcomes if you have no additional knowledge. (The slogan in statistical mechanics is “all microstates are equally likely.”)

We’ll do this by first deriving an arguably more fundamental principle, the principle of maximum relative entropy, using only Bayes’ theorem.
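To see why relative entropy is the more fundamental quantity, one can use the standard identity H(p) = log n − D(p ‖ u), where D(p ‖ q) = Σ p_i log(p_i / q_i) is the relative entropy (KL divergence) and u is the uniform distribution on n outcomes: maximizing entropy is the same as minimizing relative entropy to the uniform prior. A minimal numerical check of this identity (the helper names and the example distribution are my own):

```python
import math

def entropy(p):
    """Shannon entropy, natural log, with 0 log 0 = 0."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def relative_entropy(p, q):
    """Relative entropy D(p || q) = sum_i p_i log(p_i / q_i),
    with the convention 0 log 0 = 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

n = 6
uniform = [1/n] * n
p = [0.5, 0.1, 0.1, 0.1, 0.1, 0.1]

# Check the identity H(p) = log n - D(p || uniform): maximizing H(p)
# over distributions on n outcomes is minimizing D(p || uniform).
print(entropy(p))
print(math.log(n) - relative_entropy(p, uniform))  # same value
```

Relative entropy also makes sense against any prior q, not just the uniform one, which is why the relative-entropy form survives when the principle of indifference no longer applies.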