Examples of Probability: What do these mean?

What is the probability that Penn will abolish football (as it has boxing) within 5 years?

What was the probability of a health-reform bill in spring 2009?

What is the probability that the Republicans will have a majority in the House after the 2012 election? The Senate?

Other examples:
Intrade
Iowa Electronic Markets
NWS (National Weather Service)
National Hurricane Center

Normative Theory of Probability

What is probability?
A numerical measure of the strength of a belief in a certain proposition: p(proposition).

Theories: frequency, logical, personal.

Rules of coherence: addition, multiplication, conditional probability, independence.

Theories

The Frequency Theory

Book of Odds

Out of all the times it might have happened in the past, the proportion of times it actually did, e.g., p("I get run over crossing 38th Street after class") =

   (Number of times people got run over crossing 38th st.)
   -------------------------------------------------------
         (Number of times people crossed 38th st.)

But why this denominator? Why this numerator?
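Whatever the right answer to those questions, the arithmetic itself is just a ratio. A minimal Python sketch, with made-up counts (both numbers are hypothetical):

    # Frequency theory: probability as a relative frequency of past cases.
    crossings = 1_000_000   # hypothetical: times people crossed 38th St.
    run_over  = 2           # hypothetical: times someone was run over
    print(run_over / crossings)   # 2e-06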

Other theories

The Logical Theory

The proportion of all possible exchangeable (i.e., equally likely) worlds in which our proposition of interest is true. E.g., playing cards: each of the 52 cards is an equally likely draw.

But how often can we apply this in the real world?

The Personal Theory

Probability is a subjective judgment based on all of the knowledge and beliefs you have. There is no objectively perfect way to determine the "correct" probability. Reasonable people can disagree, because they have different evidence available to them.

Rules of coherence

p(A) + p(not A) = 1

(not A is called the "complement" of A)

[diagram: "rain" vs. "not rain"]

Additivity

Mutually Exclusive

Propositions A and B are "mutually exclusive" if they cannot both be true at the same time.

I.e., if one of the propositions is true, that "excludes" the possibility of the other being true: the two propositions "mutually exclude" each other.

When propositions A and B are mutually exclusive: p(A or B) = p(A) + p(B)

e.g. p(yes) = p(female-yes or male-yes) = p(male-yes) + p(female-yes)

[diagram: "male-yes" vs. "female-yes"]
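A minimal Python sketch of these two rules, with hypothetical survey counts (the numbers are made up):

    # Hypothetical counts of yes/no answers, broken down by sex.
    counts = {("male", "yes"): 30, ("male", "no"): 20,
              ("female", "yes"): 35, ("female", "no"): 15}
    total = sum(counts.values())                       # 100
    p = {k: n / total for k, n in counts.items()}

    # male-yes and female-yes are mutually exclusive,
    # so their probabilities add up to p(yes).
    p_yes = p[("male", "yes")] + p[("female", "yes")]  # 0.65
    p_no  = p[("male", "no")]  + p[("female", "no")]   # 0.35
    print(p_yes + p_no)   # 1.0 -- p(A) + p(not A) = 1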

Where does additivity come from?

From betting. The expected value of a bet on an event is its probability times the amount to win. (Later we'll see that this works for "expected utility.")

For example, the expected value of "$10 if a coin comes up heads" is $5.

What is the EV of "$10 if a coin comes up heads twice (in 2 flips)"?

EV is (roughly) the average payoff per play if the bet were repeated many times.
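To answer that question: assuming a fair coin and independent flips, two heads has probability .5 × .5 = .25, so the bet is worth $10 × .25 = $2.50. A minimal Python sketch of the "long-run average" reading:

    import random

    def ev_two_heads(trials=100_000):
        # Average payoff of "$10 if a fair coin comes up heads twice in 2 flips".
        total = sum(10.0 if random.random() < 0.5 and random.random() < 0.5
                    else 0.0
                    for _ in range(trials))
        return total / trials

    print(ev_two_heads())   # close to 2.50, i.e. $10 x .25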

EV and additivity

We can bet on two events at once.

The value of a bet on an event should not change when you break that event into two mutually exclusive parts.

For example, the expected value of (and willingness to pay for)

$10 if "Red card"

should be the same as:

$10 if "Heart"
and
$10 if "Diamond"

Example, continued

Suppose you are willing to pay or accept:

$4 for $10 if "Red card"

$2.50 for $10 if "Heart"

$2.50 for $10 if "Diamond"

You have a ticket for the first bet. I buy it from you for $4. You are now ahead by $4.

Then I sell you tickets for the second and third bets for $2.50 each. You are now behind by $1.

Then I point out that the second and third bets together are the same as the first bet, so you are willing to trade those two tickets for a ticket for the first bet. You are still behind by $1, and we start over....
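A minimal Python sketch of that money pump, using the prices above; each trip around the cycle costs you $1 and costs me nothing:

    # Dutch book: prices that violate additivity guarantee a sure loss.
    price_red, price_heart, price_diamond = 4.00, 2.50, 2.50

    your_cash = 0.0
    for cycle in 1, 2, 3:
        your_cash += price_red                    # I buy your "Red card" ticket
        your_cash -= price_heart + price_diamond  # you buy "Heart" and "Diamond"
        # You then trade "Heart" + "Diamond" for a "Red card" ticket
        # (they pay off in exactly the same cases), and we start over.
        print(f"after cycle {cycle}: you are behind ${-your_cash:.2f}")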

Conditional Probability Defined

The conditional probability of proposition A given proposition B is the probability that we would assign to A if we knew that B were true, that is, the probability of A conditional on B being true. We write p(A|B) or p(A/B); here the "|" (or "/") does not mean "divided by".

For example, what is the probability that Obama wins if Perry is the nominee?

Multiplication Rule - p(A and B)

The Conditional Probability Rule is: p(A | B) = p(A and B) / p(B)

(This time the / does mean "divided by".)

In other words, multiplying both sides by p(B):

p(A and B) = p(A | B) × p(B), the multiplication rule.

p(Obama-win and Perry-nom) = p(Obama-win | Perry-nom) × p(Perry-nom)
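A minimal Python sketch, with hypothetical judgments (nothing here fixes the real values):

    # Multiplication rule: p(A and B) = p(A | B) * p(B).
    p_perry_nom       = 0.3   # hypothetical p(Perry-nom)
    p_win_given_perry = 0.6   # hypothetical p(Obama-win | Perry-nom)
    print(p_win_given_perry * p_perry_nom)   # p(Obama-win and Perry-nom) = 0.18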

Where does the multiplication rule come from?

[image: two-stage gamble]

Your choice should not change when you narrow the space of possibilities down to what matters: the gamble ($45, .20) is just ($45, .80) embedded in a first stage that is reached with probability .25, and ($30, .25) is ($30, 1.00) embedded in the same stage.

Thus, if you prefer ($45, .20) to ($30, .25),
you should prefer ($45, .80) to ($30, 1.00). And indeed:
U($45) × .20 > U($30) × .25 iff
U($45) × .80 > U($30) × 1.00
(just divide both sides by .25).

Conditioned assessment (Kleinmuntz et al., 1996)

1. Estimate the probability of some event E.

2. Estimate the probability of E given some other event F, p(E|F).

3. Estimate p(E|not-F).

4. Estimate p(F).

5. Compute a revised estimate E' = p(F)×p(E|F) + p(not-F)×p(E|not-F), and compare it with the direct estimate from step 1 (see the sketch below).
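A minimal Python sketch of the procedure; all four estimates are hypothetical:

    # Conditioned assessment: rebuild p(E) from conditional judgments.
    p_E_direct   = 0.30   # step 1: hypothetical direct estimate of p(E)
    p_E_given_F  = 0.60   # step 2: hypothetical p(E | F)
    p_E_given_nF = 0.10   # step 3: hypothetical p(E | not-F)
    p_F          = 0.40   # step 4: hypothetical p(F)

    # Step 5: the rebuilt estimate.
    p_E_rebuilt = p_F * p_E_given_F + (1 - p_F) * p_E_given_nF
    print(p_E_direct, p_E_rebuilt)   # both ~0.30 -- coherent in this case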

Independent Propositions

Two propositions A and B are independent if believing that A is true does not change your belief about whether B is true. (This relation is symmetric: if A is independent of B, then B is independent of A.)

Are "It is raining" and "Melissa is mowing the lawn" independent?

Are "The first car to pass us will be a Dodge" and "The second car to pass us will be a BMW" independent?

When A and B are independent, then the multiplication rule can be simplified, because p(A | B) = p(A). Hence: p(A and B) = p(A) × p(B).
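E.g., for the two cars, a minimal Python sketch with hypothetical base rates:

    # Independence: p(A and B) = p(A) * p(B) when p(A | B) = p(A).
    p_dodge = 0.05   # hypothetical: first car to pass is a Dodge
    p_bmw   = 0.03   # hypothetical: second car to pass is a BMW
    print(p_dodge * p_bmw)   # ~0.0015, treating the two as independent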

A diagram of Bayes's theorem

[diagram omitted]

Deriving Bayes's Theorem

The Conditional Probability Rule tells us a way to calculate p(H | D):

(1) p(H | D) = p(H and D) / p(D)

But we don't know p(H and D) or p(D). Let's try to use what we do know to find out what they are. The Multiplication Rule is:

(2) p(D and H) = p(D | H) × p(H)

We can change (D and H) to (H and D), since they mean the same thing:

(3) p(H and D) = p(D | H) × p(H)

So now we can work out p(H and D) from two things we know: p(D | H) and p(H). If we use (3) to change (1), we can get a form of Bayes's theorem:

(4) p(H | D) = p(D | H) × p(H) / p(D)

Getting p(D)

(we have 4) p(H | D) = p(D | H) × p(H) / p(D)

Now we just need to get p(D). We can use the fact that D must happen either with H or with not-H:

(5) p(D) = p(D and H) + p(D and not-H)

Then we can use the Multiplication Rule again to replace the "and" terms:

(6) p(D) = p(D | H) × p(H) + p(D | not-H) × p(not-H)

By replacing "p(D)" in (4) with our expression from (6) we get:

(7)

                    p(D|H)×p(H)
p(H|D) = ---------------------------------------------
           [ p(D|H)×p(H) + p(D|not-H)×p(not-H) ]

Finishing the example

(we have 7)

                    p(D|H)×p(H)
p(H|D) = ---------------------------------------------
           [ p(D|H)×p(H) + p(D|not-H)×p(not-H) ]

So the doctor could use that formula to figure out whether a mammogram would be worthwhile. Putting in the numbers from that example (p(H) = .1, p(D|H) = .9, p(D|not-H) = .2):

                .9×.1          .09     .09
p(H | D) = --------------- = ------- = --- = .33
           [.9×.1 + .2×.9]   .09+.18   .27
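A minimal Python sketch of equation (7), checking that arithmetic:

    def bayes(p_d_given_h, p_h, p_d_given_not_h):
        # Equation (7): p(H | D) via the multiplication rule and p(D).
        p_d = p_d_given_h * p_h + p_d_given_not_h * (1 - p_h)
        return p_d_given_h * p_h / p_d

    print(bayes(0.9, 0.1, 0.2))   # 0.333..., matching the .33 above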

Relation between Bayes's theorem and conditional assessment

[tree diagram: m and f, each branching into y and n]

We can compute p(y) from p(y|m), p(y|f), and p(m):

p(y) = p(y|m)×p(m) + p(y|f)×(1-p(m))

which is the same as p(y&m) + p(y&f).

Now suppose we want to compute p(m|y). By Bayes's theorem,

p(m|y) = p(m&y) / [p(m&y) + p(f&y)]

or p(m|y) = p(m&y) / p(y).

In other words, conditional assessment gives us the denominator of Bayes's theorem, for this calculation.
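A minimal Python sketch, with hypothetical values for p(m), p(y|m), and p(y|f):

    # Hypothetical judgments.
    p_m         = 0.5
    p_y_given_m = 0.8
    p_y_given_f = 0.4

    # Conditional assessment gives p(y)...
    p_y = p_y_given_m * p_m + p_y_given_f * (1 - p_m)   # 0.6

    # ...which is exactly the denominator Bayes's theorem needs:
    p_m_given_y = p_y_given_m * p_m / p_y
    print(p_m_given_y)   # ~0.667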

Ratio form of Bayes's theorem

                    p(D | H) × p(H)
p(H | D) = ---------------------------------------------
           [ p(D | H) × p(H) + p(D | not-H) × p(not-H) ]

                         p(D | not-H) × p(not-H)
p(not-H | D) = ---------------------------------------------
               [ p(D | H) × p(H) + p(D | not-H) × p(not-H) ]

  p(H | D)         p(D | H) × p(H)          p(D | H)     p(H)
------------ = ----------------------- = ------------ × --------
p(not-H | D)   p(D | not-H) × p(not-H)   p(D | not-H)   p(not-H)

posterior odds = diagnostic ratio × prior odds

log(posterior odds) = log(diagnostic ratio) + log(prior odds)
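A minimal Python sketch of the odds form, reusing the mammogram numbers from above:

    import math

    p_h, p_d_h, p_d_nh = 0.1, 0.9, 0.2   # numbers from the example above

    prior_odds       = p_h / (1 - p_h)               # 1/9
    diagnostic_ratio = p_d_h / p_d_nh                # 4.5
    posterior_odds   = diagnostic_ratio * prior_odds # 0.5

    # Convert odds back to a probability: odds / (1 + odds).
    print(posterior_odds / (1 + posterior_odds))     # ~0.333, as before

    # The log form turns the product into a sum.
    print(math.log(posterior_odds),
          math.log(diagnostic_ratio) + math.log(prior_odds))  # equal, up to rounding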