The Monty Hall Problem
Jun
8
Written by:
Charles Flock
6/8/2010 5:04 PM
Using the SeriesInt function to create a Monte Carlo simulation analyzing a classic probability problem.
Suppose you're on a game show, and you're given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what's behind the doors, opens another door, say No. 3, which has a goat. He then says to you, "Do you want to pick door No. 2?" Is it to your advantage to switch your choice?
The reason this is called the Monty Hall Problem, is that it is loosely based on the game show Let’s Make A Deal, which was hosted by Monty Hall.
We can use the SeriesInt function to randomly generate integers. We can use this function to create a Monte Carlo simulation to answer the question. First, we will have to figure out which door the car is behind. And, let’s keep the number of simulations small (10) until we have built our model. The first thing we want to do is randomly generate integers between 1 and 3 (representing the 3 doors). We can do that in the following statement:
SELECT seq, seriesvalue FROM wct.SeriesInt(1,3,1,10,'R')
I have told the SeriesInt function to randomly (as opposed to linearly) generate numbers between 1 and 3 ten times. Since this is a non-determinstic function, your results may vary, but this query returned the following results:
seq seriesvalue
----------- -----------
1 2
2 3
3 3
4 3
5 3
6 2
7 3
8 2
9 1
10 2
(10 row(s) affected)
We also need to determine which of the 3 doors ‘you’ picked and pair it up with the door that the car is behind. We will use a derived table to do that, and we will call the 2 columns Car and Pick.
SELECT n.seriesvalue as Car
,m.seriesvalue as Pick
FROM (
SELECT seq, seriesvalue FROM wct.SeriesInt(1,3,1,10,'R')
) n
,wct.SeriesInt(1,3,1,10,'R') m
WHERE m.seq = n.seq
This produces the following result (though your results will almost definitely be different):
Car Pick
----------- -----------
3 2
1 2
2 1
3 3
3 3
2 2
3 3
3 1
1 3
2 1
(10 row(s) affected)
Now, let’s figure out which door Monty Hall is going to open for us. The wording of the problem implies that he will always open a door that has a goat behind it (which seems reasonable to me), and we will build our model under that assumption. The following statement will use the values generated for Car and Pick to determine which door Monty Hall will open.
SELECT Car
,Pick
,CASE
WHEN Pick % 3 = Car - 1 THEN 6-car-pick
ELSE Pick % 3 + 1
END as [Monty Opens]
FROM (SELECT n.seriesvalue as Car
,m.seriesvalue as Pick
FROM (SELECT seq, seriesvalue FROM wct.SeriesInt(1,3,1,10,'R')) n
,wct.SeriesInt(1,3,1,10,'R') m
WHERE m.seq = n.seq) o
Which returns the following result:
Car Pick Monty Opens
----------- ----------- -----------
3 1 2
3 2 1
2 2 3
3 2 1
2 1 3
2 2 3
3 3 1
3 1 2
1 2 3
3 2 1
(10 row(s) affected)
The trick here was to have Monty open a door other than the door we picked and other than the door that hides the car. When Car <> Pick, that’s easy; it’s the only door that’s left. Otherwise we have to pick one of the two remaining doors.
The hard part is done. Now we just have to compute the results. We will add 2 more columns to our report, called [Stay and Win] (which means we didn’t change the door that we had originally selected) and [Switch and Win] (which means we changed the door that we originally selected). We will set [Stay and Win] equal to True when Car is equal to Pick and we will set [Switch and Win] equal to True when Car is not equal to Pick. We do that in the following statement:
SELECT Car
,Pick
,CASE
WHEN Pick % 3 = Car - 1 THEN 6-car-pick
ELSE Pick % 3 + 1
END as [Monty Opens]
,CASE
WHEN Car = Pick Then 'True'
ELSE 'False'
END as [Stay and Win]
,CASE
WHEN Car = Pick Then 'False'
ELSE 'True'
END as [Switch and Win]
FROM (SELECT n.seriesvalue as Car
,m.seriesvalue as Pick
FROM (SELECT seq, seriesvalue FROM wct.SeriesInt(1,3,1,10,'R')) n
,wct.SeriesInt(1,3,1,10,'R') m
WHERE m.seq = n.seq) o
Which produced the following results:
Car Pick Monty Opens Stay and Win Switch and Win
----------- ----------- ----------- ------------ --------------
1 2 3 False True
3 2 1 False True
2 2 3 True False
1 2 3 False True
3 2 1 False True
1 3 2 False True
3 1 2 False True
1 1 2 True False
2 3 1 False True
2 1 3 False True
(10 row(s) affected)
Finally, we will increase the number of trials and count the results:
SELECT COUNT(*) as [Stay and Win Count]
FROM (Select Car
,Pick
,CASE
WHEN Pick % 3 = Car - 1 THEN 6-car-pick
ELSE Pick % 3 + 1
END as [Monty Opens]
,CASE
WHEN Car = Pick Then 'True'
ELSE 'False'
END as [Stay and Win]
,CASE
WHEN Car = Pick Then 'False'
ELSE 'True'
END as [Switch and Win]
from (Select n.seriesvalue as Car
,m.seriesvalue as Pick
FROM (select seq, seriesvalue from wct.SeriesInt(1,3,1,1000,'R')) n
,wct.SeriesInt(1,3,1,1000,'R') m
where m.seq = n.seq) o) p
where [Stay and Win] = 'True'
This produces the following result:
Stay and Win Count
------------------
349
(1 row(s) affected)
In this simulation, which consisted of 1,000 trials, the strategy of not switching, Stay and Win, was successful in 349 of the trials, or 34.9% of time. We can conclude that the Switch and Win strategy is more successful.
The beauty of doing the simulation in SQL Server is that if I wanted to increase the number of trials to ten thousand or a hundred thousand, or even a million, it’s effortless to do so. I simply need to change the fourth parameter in the SeriesInt function which specifies the number of iterations. And if I wanted to change the model, it’s just a few minor changes to the SQL.
Of course, we don’t need to run a Monte Carlo simulation to answer the question. It can be derived in a very straightforward manner that the Switch and Win strategy will be successful 2/3 of the time, which is consistent with our simulation.