Dora’s Choice

March 9, 2014 | Curiosities, Simulation | ggplot, Monty Hall, R, simulation | @aschinchon

Arithmetic is being able to count up to twenty without taking off your shoes (Mickey Mouse)

On her last mission, Dora The Explorer sails down the Amazon river to save her friend Isa The Iguana from the claws of Swiper The Fox. After some hours of navigation, Dora sees the river divide into 3 branches and has to choose which one to follow. Before leaving, her friend Map told her that just one of these branches is safe. The other two end in terrible waterfalls, both impossible to escape alive. Although Dora does not know which one is the good one, she decides to take branch number 1. Suddenly, her friend Boots The Monkey yells from the top of a palm tree:

– Dora, do not take branch number 3! I can see from here that it ends in a horrible waterfall!

After listening to Boots, Dora changes her mind and decides to take branch number 2. Why does Dora switch? Because she knows that this change has significantly increased her probability of ending the mission alive.

There are several ways to convince yourself of this. One is to simulate the situation Dora faces and compare the results of switching and not switching. If she switches, Dora saves her life in 2 of every 3 simulations, while if she does not, she survives in only 1 of every 3. By changing her mind, Dora doubles her chances of survival!

Carefully considering what happens, you can see that by switching Dora saves herself whenever her first choice is erroneous, which occurs with probability 2/3. On the other hand, if Dora remains faithful to her first choice, she only saves herself with probability 1/3.
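The 2/3 versus 1/3 argument can also be checked by listing the three equally likely cases. This is only a minimal sketch, assuming that Dora always starts on branch 1 and that the warning always names a waterfall branch she did not pick; the names `stay.wins`, `switch.wins`, `p.stay` and `p.switch` are made up for this illustration and are not part of the simulation further down.

```r
# Exact enumeration of the three equally likely positions of the safe branch.
# Assumption: the helper knows the river and only ever warns about a
# waterfall branch that Dora did not choose.
branches <- 1:3
first.choice <- 1

# Staying wins only when the first choice was already the safe branch
stay.wins <- sapply(branches, function(safe) first.choice == safe)

switch.wins <- sapply(branches, function(safe) {
  # Branches the helper is allowed to warn about: not safe, not chosen
  candidates <- setdiff(branches, c(safe, first.choice))
  warned <- candidates[1]  # which candidate is named does not change the outcome
  # After the warning, switching means taking the only branch left
  switched <- setdiff(branches, c(first.choice, warned))
  switched == safe
})

p.stay <- mean(stay.wins)
p.switch <- mean(switch.wins)
print(c(stay = p.stay, switch = p.switch))
```

Switching wins in the two cases where the first choice was a waterfall, which is where the 2/3 comes from.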

This is my own example of the famous Monty Hall Problem. You can see a nice explanation of it in an episode of Numb3rs or in the film 21 Black Jack. Not long ago I presented the problem at a family gathering. Only my mum said she would switch (there were 6 of us). It is fun to share this experiment and ask what people would do. Try it with your friends and family. The first time I heard about the problem I thought there was no difference between switching and not switching, since I gave both possibilities a probability of 1/2. If I had been Dora, I am pretty sure I would have tumbled over a terrible waterfall. What about you?

Note: this is an update of the post, which originally was not a correct formulation of the Monty Hall Problem. Thanks to David Robinson and Scott Kostyshak for showing me my error. A correct formulation of the problem may be this:

On her last mission, Dora The Explorer sails down the Amazon river to meet her cousin Diego. After some hours of navigation, Dora sees the river divide into 3 branches and has to choose which one to follow. Before leaving, her friend Map told her that just one of these branches is safe. The other two end in terrible waterfalls, both impossible to escape alive. Although Dora does not know which one is the good one, she decides to take branch number 1. After pointing the bow towards branch number 1, Dora sees Swiper The Fox smiling from the shore, on a high spot from which he can obviously see the end of all three branches. Dora yells to him:

– Help me Swiper! Which one should I take?

Swiper replies:

– I am the villain of this story so I will give you only one piece of advice: do not take branch number 3. It ends in a terrible waterfall.

Dora, who has a sixth sense to notice when Swiper is lying, knows he is telling the truth and immediately changes her mind and decides to take branch number 2. Why does Dora switch? Because she knows that this change has significantly increased her probability of ending the mission alive.

Here you have the code:

library(ggplot2)
library(extrafont)
nchoices <- 3
nsims <- 500
choices <- seq(from=1, to=nchoices, by=1)
good.choice <- sample(choices, nsims, replace=TRUE)
choice1 <- sample(choices, nsims, replace=TRUE)
dfsims <- as.data.frame(cbind(good.choice, choice1))
dfsims$advice <- apply(dfsims, 1, function(x) choices[!choices %in% as.vector(x)][sample(1:length(choices[!choices %in% as.vector(x)]), 1)])
dfsims$choice2 <- apply(dfsims, 1, function(x) choices[!choices %in% as.vector(c(x[2], x[3]))][sample(1:length(choices[!choices %in% as.vector(c(x[2], x[3]))]), 1)])
dfsims$win1 <- apply(dfsims, 1, function(x) (x[1]==x[2])*1)
dfsims$win2 <- apply(dfsims, 1, function(x) (x[1]==x[4])*1)
dfsims$csumwin1 <- cumsum(dfsims$win1)/as.numeric(rownames(dfsims))
dfsims$csumwin2 <- cumsum(dfsims$win2)/as.numeric(rownames(dfsims))
dfsims$nsims <- as.numeric(rownames(dfsims))
dfsims$xaxis <- 0
### XKCD theme
theme_xkcd <- theme(
panel.background = element_rect(fill="darkolivegreen1"),
panel.border = element_rect(colour="black", fill=NA),
axis.line = element_line(size = 0.5, colour = "black"),
axis.ticks = element_line(colour="black"),
panel.grid = element_line(colour="white", linetype = 2),
axis.text.y = element_text(colour="black"),
axis.text.x = element_text(colour="black"),
text = element_text(size=18, family="Humor Sans"),
plot.title = element_text(size = 50)
)
### Plot the chart
p <- ggplot(data=dfsims, aes(x=nsims, y=csumwin1))+
geom_line(aes(y=csumwin2), colour="green4", size=1.5)+
geom_line(colour="green4", size=1.5)+
geom_text(data=dfsims[400, ], family="Humor Sans", aes(x=nsims), colour="green4", y=0.7, label="if Dora switches ...", size=5.5, hjust=1)+
geom_text(data=dfsims[400, ], family="Humor Sans", aes(x=nsims), colour="green4", y=0.3, label="if Dora does not switch ...", size=5.5, hjust=1)+
coord_cartesian(ylim=c(0, 1), xlim=c(1, nsims))+
scale_y_continuous(breaks = c(0,round(1/3, digits = 2),round(2/3, digits = 2),1), minor_breaks = c(round(1/3, digits = 2),round(2/3, digits = 2)))+
scale_x_continuous(minor_breaks = seq(100, 400, 100))+
labs(x="Number Of Simulations", y="Rate Of Survival", title="Dora's Choice")+
theme_xkcd
ggsave("doras_choice.jpg", plot=p, width=8, height=5)

27 thoughts on “Dora’s Choice”

Scott Kostyshak says:

March 10, 2014 at 12:40 am

Hi Antonio, thank you for your great blog! I’ve enjoyed many of your articles. I like the creativity of your story but I think it is missing part of the original monty hall problem. In the game host problem, we know that the host will never show what’s behind the door that the contestant chose. Knowing this is important information. It tells us that if the contestant stays, there is the same chance as there was at the beginning (one third). Note that this property of the problem is reflected in your R code with <>. In Dora’s story, it’s not clear why the monkey would never be able to see down the branch that Dora chooses. I tried to think of a way to modify the story to reflect this. Maybe “whichever branch Dora chooses, the abnormally tall sail on her raft will block the monkey from seeing what’s down the branch she chose.” Or perhaps simply “the monkey was unable to see down the branch that Dora chose” (this is less of a stretch than an abnormally tall sail and might be believable if we think this event just happened once but if we are thinking about simulations, it would seem strange that Dora always happens to choose the branch that the monkey cannot see down).

Reply
1. aschinchon says:
  
  March 10, 2014 at 9:06 am
  
  Thanks for your comment. It is true, the monkey doesn’t have as much information as the host has in the original problem, but Dora’s point of view is the same as the contestant’s. This is what makes both experiments comparable. What do you think?
  
  Reply
  1. David Robinson says:
    
    March 10, 2014 at 2:55 pm
    
    Scott is right: this version of the problem is very different from traditional Monty Hall, so much so that it is no longer beneficial to switch. This is the equivalent of the Monty Hall variation where Monty brings on a new contestant with no information, who then guesses a door (that the original contestant hadn’t picked) and gets it wrong. In the cases that the second contestant (the monkey) gets it right, you don’t have to choose whether to switch because you know the correct answer.
    
    You can see this in your simulation. Your line:
    
    dfsims$advice <- apply(dfsims, 1, function(x) choices[!choices %in% as.vector(x)][sample(1:length(choices[!choices %in% as.vector(x)]), 1)])
    
    guarantees that the monkey will never look down the river that is safe (how could that be guaranteed without knowing in advance which river it is?). If the only guarantee is that the monkey doesn't look down the same river Dora had originally chosen, you simply change that line to:
    
    dfsims$advice <- sapply(dfsims$choice1, function(ch) sample(choices[!choices %in% ch], 1))
    
    And because Dora doesn't have to pick a river in the cases when Boots yells down "River 3 is all clear!" (there is no choice whether to switch or not), you add the following line:
    
    dfsims <- dfsims[dfsims$advice != dfsims$good.choice, ]
    
    And you'll see that it no longer matters whether one switches or doesn't switch.
    
    Reply
    1. aschinchon says:
      
      March 10, 2014 at 3:25 pm
      
      In the original problem, Monty Hall asks the contestant to pick one of the three doors. Once the contestant has done so, Monty opens one of the two remaining doors to reveal what’s behind it, but is careful never to open the door hiding the car. After Monty has opened one of these other two doors, he offers the contestant the chance to switch doors. In my case Dora is the contestant and the Monkey has the role of Monty Hall, which is no more than revealing a wrong option. This is why Monkey never gives advice about the river that Dora chooses as her first option, nor about the good one. He has this role unintentionally, of course, but the effect on Dora’s next action is the same as the effect of Monty’s revelation in the original problem. I do not understand why you say this Dora version is so different “that it is no longer beneficial to switch”. Wouldn’t you switch your first choice after listening to Boots?
      
      Reply
  2. David Robinson says:
    
    March 10, 2014 at 3:43 pm
    
    Imagine the counterfactual in which 3 is the safe path. Wouldn’t Boots tell Dora about it? Monty wouldn’t- so Boots is not in the same position as Monty, and Dora is not in the same position having heard the advice. In the counterfactual in which 3 is the safe path, Boots would tell Dora that 3 is the safe path, and Monty would tell Dora that 2 is a waterfall.
    
    Consider the situation where Dora originally chooses path 1, Boots looks down path 3, and Monty always warns about a waterfall that Dora didn’t pick. The three possibilities are:
    
    *Path 1 is safe*: Boots gives a warning about path 3, Monty also gives a warning about path 3 (or 2, doesn’t matter). Staying wins.
    
    *Path 2 is safe*: Boots gives a warning about path 3, Monty gives a warning about path 3. Switching wins.
    
    *Path 3 is safe*: Boots says “All clear, Dora!” Monty says “Path 2 is a waterfall.” IN THIS CASE: If you’re playing with Monty, switching wins. If you’re playing with Boots, you never face a difficult choice: you just always head down path 3.
    
    Notice that if you’re playing with Monty, switching wins 2 out of 3 times- it’s the better strategy. If you’re playing with Boots, switching wins 1 out of 2 times: doesn’t matter.
    
    Most importantly, *did you try my change to your R simulation*? If so, you saw that when Boots doesn’t know which river was safe (your simulation explicitly ensured Boots never looked down the safe path), but happened to look down a waterfall path anyway, switching is equal to not switching.
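A self-contained way to check this claim, separate from the post's data frame code, is sketched below. It assumes Boots looks down one of the two branches Dora did not pick, chosen at random, and that the runs where he happens to see the safe branch are simply discarded. The names `looked`, `keep`, `boots.stay` and `boots.switch` are hypothetical and exist only for this illustration.

```r
set.seed(42)  # arbitrary seed, only for reproducibility
n <- 100000
safe  <- sample(1:3, n, replace = TRUE)  # the safe branch in each run
first <- sample(1:3, n, replace = TRUE)  # Dora's first choice

# Boots looks down one of the two branches Dora did not pick, at random
# and with no knowledge of which branch is safe
looked <- ((first - 1 + sample(1:2, n, replace = TRUE)) %% 3) + 1

# Keep only the runs where Boots reports a waterfall
keep <- looked != safe

# Branch numbers sum to 6, so this is the branch neither picked nor looked at
switched <- 6 - first - looked

boots.stay   <- mean(first[keep] == safe[keep])
boots.switch <- mean(switched[keep] == safe[keep])
print(c(stay = boots.stay, switch = boots.switch))
```

Under these assumptions both rates should come out close to 1/2. Replacing the random `looked` with a branch that is guaranteed to be a waterfall gives back the informed-host case, where switching wins about 2/3 of the time.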
    
    Reply
    1. aschinchon says:
      
      March 10, 2014 at 4:16 pm
      
      Of course Monkey would tell Dora 3 is the safe path! Boots is her best friend! But unfortunately, 3 is not the right path. And Dora does the right thing by changing her decision. It is the same thing as the original problem, don’t you really think so? I didn’t try your change but obviously what you say will happen, I trust you. But this is another problem. It is not the Monty Hall problem. You can think of Boots as a monkey that is always in the wrong place but wants to help his friend.
      
      Reply
  3. David Robinson says:
    
    March 10, 2014 at 4:32 pm
    
    What is causing the monkey to *always* be in the wrong place? If Dora is aware that Boots always looks down a waterfall path (does she know that the author of the story arranged that to be so, for dramatic tension?) then the logic applies. But if Boots just happened to pick a path to look down and see it was a waterfall (the sensible interpretation), then it is *not* helpful to switch.
    
    You are right that this is not the Monty Hall problem: it is a different problem. (Let’s call it the Boots Problem). But your simulation, graph and answer are in response to the Monty Hall problem. In the Boots problem it does *not* matter whether Dora switches (your non-mother family members were correct).
    
    Shouldn’t you change either your explanation of the problem or your graph and simulations, since pairing one with the other is misleading?
    
    Reply
    1. aschinchon says:
      
      March 10, 2014 at 4:43 pm
      
      I never said mine is not the original problem. I think mine is EXACTLY the original one. Look at the graph with the simulations. Obviously trying to compare a cartoon monkey with a human host of a TV contest can cause some philosophical discussion like the one we are having. What is causing the monkey to *always* be in the wrong place? Some comments on the post can give you a possible explanation. Maybe you wouldn’t switch after listening to Boots. I would for sure.
      
      Reply
  4. David Robinson says:
    
    March 10, 2014 at 4:56 pm
    
    According to your written example, Boots tries to help Dora, so he climbs up a tree and looks down a river. In your simulation, Boots knows in advance which path is the safe one and is very careful not to look down that path (!choices %in% as.vector(x)). Boots doesn’t sound like much of a friend if this is the case.
    
    This is not a “philosophical” distinction at all: the fact that Boots doesn’t know in advance which path is safe makes yours a different problem from Monty Hall.
    
    Let’s try a slightly different version of the Monty Hall problem. After the first contestant has made his choice, Monty brings on a second contestant (whose name is Boots), who doesn’t know which door holds the car. After being informed that the first contestant picked door 1, Boots picks door 3, and it is revealed that door 3 does not have the car.
    
    Do you agree that:
    
    A) This version of the Monty Hall problem is equivalent to your Dora formulation?
    B) This version of the Monty Hall problem is different from the original?
    
    Reply
    1. aschinchon says:
      
      March 10, 2014 at 5:09 pm
      
      Do you agree from the point of view of Dora (the contestant) both problems (original and mine) are exactly the same? Would you change your choice after listening to Boots? According to my simulation Boots unfortunately always reveals a wrong path (call it “bad luck” or whatever you want).
      
      Reply
  5. David Robinson says:
    
    March 10, 2014 at 5:18 pm
    
    “Do you agree from the point of view of Dora (the contestant) both problems (original and mine) are exactly the same?”
    
    Nope, not at all. Dora knows that Monty knows the correct answer and intentionally opens a door with a goat (path with a waterfall), whereas Dora knows that Boots didn’t know in advance. This knowledge changes everything.
    
    Take a look at my breakdown again of Path 1 is safe/Path 2 is safe/Path 3 is safe. In the Monty version, in 2 out of 3 of those it is better to switch. In the Boots version, in 1 out of 2 of those it is better to switch (in the third version he looked down the safe path).
    
    “According to my simulation Boots unfortunately always reveals a wrong path (call it “bad luck” or whatever you want).”
    
    By that logic, you could easily create a simulation in which Dora always chooses the safe path on her first try, and just call it “good luck”. In that simulation, it is always better for Dora to *stay*. But that’s not what’s happening.
    
    Reply
    1. aschinchon says:
      
      March 10, 2014 at 5:39 pm
      
      It doesn’t change anything from my point of view.
      
      My problem from Dora’s knowledge point of view:
      Step 1. Dora chooses a path randomly. She does not know if it is the correct one
      Step 2. She receives information: another path she didn’t pick is wrong
      
      The original problem:
      Step 1. Contestant chooses a door randomly. He/she does not know if it is the correct one
      Step 2. He/she receives information: another door he/she didn’t pick is wrong
      
      It is exactly the same.
      
      Would you change your decision or not?
      
      Reply
  6. David Robinson says:
    
    March 10, 2014 at 5:34 pm
    
    Put another way: there are two ways you could say that “Boots is always in the wrong place.” One of them is “Boots picks a random path to look down, and then we ignore all those cases when he looked down the right path” (the version I coded). The other is “Boots knows which path is safe and specifically chooses a different one” (your original simulation).
    
    You would probably say those two cases are equivalent (both are “bad luck”). That is incorrect. If Dora chose the safe path originally, Boots will always look down a waterfall path, but if Dora chooses a waterfall path originally, Boots has a 50% chance of looking down a safe path and a 50% chance of looking down a waterfall path.
    
    This means that by saying “Look only at the bad luck cases where Boots got it wrong”, you are removing half of the cases where Dora was originally wrong, and removing none of the cases where Dora was originally right. This changes the balance of probability: that is why it is different from your simulation and the original Monty Hall problem.
    
    Reply
  7. David Robinson says:
    
    March 10, 2014 at 5:44 pm
    
    In the Boots formulation I would not switch, or more precisely I wouldn’t bother switching since it doesn’t help. (Again, because it’s different from Monty Hall. In original Monty Hall I would switch).
    
    If my most recent post (“there are two ways you could say that Boots is always in the wrong place”) doesn’t convince you, please try taking a look at my change to your simulation.
    
    In your simulation, Boots is careful not to pick the correct door. In my simulation, Boots picks a random door, and then we filter out the cases where Boots is right.
    
    So if it’s true that nothing is different, here is my question: *why do those two simulations give different results*?
    
    Reply
    1. aschinchon says:
      
      March 10, 2014 at 6:12 pm
      
      I will try your simulations before giving you my opinion. It surprises me that you wouldn’t switch in the Boots simulation. Imagine there were 100 paths instead of 3 (and again only one good one). You choose one randomly and Boots tells you that 98 of the others are wrong. Following your logic, I suppose you wouldn’t switch either. But in this case you would almost surely go over the waterfall, whereas if you switch you will almost surely save your life. Make your calculations. My case is the same but with only 3 paths instead of 100. Thanks a lot for your stimulating comments. Keep in touch.
      
      Reply
  8. David Robinson says:
    
    March 10, 2014 at 6:24 pm
    
    Your 100 paths version has the same issue as the three paths version. Of course if Boots knew in advance which path was safe, and crossed out 98 of the other ones, then it would be advantageous to switch.
    
    But let’s say Boots checked 98 paths randomly, and they all happened to be waterfalls (poor unlucky Boots). Consider this: if you had picked the wrong path initially, the chance of that happening (Boots picking all waterfalls) was 1/99. If you had picked the right path originally, then that was *guaranteed* to happen (whatever 98 paths Boots picked, they would all be waterfalls).
    
    Let S mean that Dora picked a safe path originally, let B mean that Boots picked 98 waterfalls. We wish to calculate P(S|B), the probability that Dora picked a safe path *given* that Boots picked 98 waterfalls. We can calculate this using Bayes Theorem.
    
    P(S) = 1/100 (prior probability you originally picked a safe path)
    P(~S) = 99/100 (prior probability you did not pick a safe path)
    P(B|S) = 1 (if you picked a safe path, Boots’s 98 will always be waterfalls)
    P(B|~S) = 1/99 (if you did not pick a safe path, Boots will pick waterfalls only 1/99 of the time)
    
    By Bayes Theorem:
    
    P(S|B) = P(B|S)P(S) / (P(B|S)P(S) + P(B|~S)P(~S))
    P(S|B) = (1 * (1/100)) / (1 * (1/100) + (1/99) * (99/100))
    P(S|B) = (1/100) / ((1/100) + (1/100))
    P(S|B) = 1/2
    
    Given that Boots picked 98 waterfalls, the probability that you originally picked a safe path is 50% (and therefore there is no advantage to switching).
    
    (The reason this is not equivalent to Monty Hall is that Monty knows which paths are waterfalls. Therefore P(B|~S)=1 rather than 1/99).
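The Bayes calculation above is easy to repeat for both values of P(B|~S). The helper below is only a sketch: the function name `posterior.safe` and its arguments are made up for this illustration, and it assumes P(B|S) = 1 as in the comment.

```r
# Posterior probability that the first pick is safe, given that every
# path that was checked turned out to be a waterfall (event B).
# n.paths:         total number of paths, exactly one of them safe
# p.b.given.not.s: P(B | ~S), the chance of seeing only waterfalls
#                  when the first pick was itself a waterfall
posterior.safe <- function(n.paths, p.b.given.not.s) {
  prior.s <- 1 / n.paths  # P(S)
  p.b.given.s <- 1        # if the first pick is safe, B always happens
  (p.b.given.s * prior.s) /
    (p.b.given.s * prior.s + p.b.given.not.s * (1 - prior.s))
}

boots.100 <- posterior.safe(100, 1 / 99)  # uninformed helper, 100 paths
monty.100 <- posterior.safe(100, 1)       # informed host, 100 paths
boots.3   <- posterior.safe(3, 1 / 2)     # uninformed helper, 3 paths
monty.3   <- posterior.safe(3, 1)         # informed host, 3 paths
print(c(boots.100 = boots.100, monty.100 = monty.100,
        boots.3 = boots.3, monty.3 = monty.3))
```

With the uninformed helper the posterior is 1/2 for both 3 and 100 paths, so switching gains nothing. With the informed host it stays at the prior (1/100 or 1/3), which is why switching pays off there.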
    
    Reply
    1. aschinchon says:
      
      March 10, 2014 at 6:36 pm
      
      Monty Hall knows everything, I agree. Monkey Boots does not. Apart from this (I know for you it is a key question, not for me) their role is to give information to Dora about some other wrong option. Both of them do exactly the same, in my opinion. This is why I don’t need to call on Bayes to do calculations.
      
      Reply
  9. David Robinson says:
    
    March 10, 2014 at 6:42 pm
    
    It is a key question whether one ignores it or not. Consider that when Boots yells down, he is providing two pieces of information:
    
    1) Path 3 is a waterfall
    2) When I picked a random path that was different from yours, it turned out to be a waterfall.
    
    The second piece of information is relevant: it is evidence that improves the odds that your original path was safe (by Bayesian updating). Monty Hall does *not* provide that second piece of information; only the first.
    
    Please do try using Bayes Theorem to calculate the probability (you are clearly mathematically educated and savvy: why would you be afraid to look at the problem rigorously?). Try it once where P(B|~S) is 1 (Monty Hall), and try it once where P(B|~S) is 1/99 (Boots).
    
    Reply
    1. aschinchon says:
      
      March 10, 2014 at 9:43 pm
      
      You are definitely right. I apologize for being so obstinate. I think I have a simple way to reformulate the problem, and I will do it as soon as possible. Thanks for your patience.
      
      Reply
Tom says:

March 10, 2014 at 1:48 am

Clearly the educational value of Dora has increased exponentially since I last viewed.

Reply
1. Woodstock says:
  
  March 11, 2014 at 2:42 pm
  
  Holy cow! I’d say so.
  
  Reply
jstuartmill says:

March 10, 2014 at 2:00 am

This is great. I look forward to reading more of your blog.

Reply
Harold Baize says:

March 10, 2014 at 4:48 pm

The package “extrafont” is no longer available on CRAN. Is there an easy work around?

Reply
Ian says:

March 10, 2014 at 6:18 pm

Seeing that it was her last adventure, I’m guessing switching didn’t really help her.

Reply
aschinchon says:

March 10, 2014 at 10:27 pm

I reformulated the problem already. Sincerely, thanks again for your help.

Reply
1. David Robinson says:
  
  March 11, 2014 at 2:59 am
  
  I am very glad I could be helpful, and I appreciate you discussing it in an open-minded way! This variation is at least as counter-intuitive as the original Monty Hall problem so I enjoyed working through it. (I even think it would be cool to show the alternative variation in the blog entry, along with a simulation showing that the answer is different).
  
  I like your reformulation, though I wonder if there’s a way for Swiper to provide proof (a photograph?) rather than Dora trusting him.
  
  Reply
Scott Kostyshak says:

March 11, 2014 at 6:31 am

I apologize for not taking the time to read in detail the back-and-forth exchange of comments (I did skim them). Looks like a good discussion! If the following has been covered, feel free to ignore it. I read the update and I still don’t think this is a direct analog to the Monty Hall problem even if the concern of Swiper providing proof is satisfied. It all depends on what Dora assumes about Swiper. To see my concern, let’s add some structure on this tricky Swiper. We are assuming that Swiper knows the “good” branch. Let’s also suppose that Swiper will pick one of the two “bad” branches at random and report the one he chose to Dora. Let’s consider two strategies:
(Strategy 1) Dora always switches
(Strategy 2) Dora only switches when Swiper tells her that the branch she chose is bad
(it seems silly to consider the “always stays” strategy).
Should Dora go with Strategy 1 or Strategy 2? To see this analytically, with Strategy 2:
P(Dora survives) = P(Dora’s original choice was correct) + P(Dora’s original choice was incorrect) * P(Dora’s second choice is correct | Dora’s original choice was incorrect)
Dora never switches if her original choice is correct (because Swiper will never say her branch was bad). So if she chose correctly the first time she’s good.
If she chose incorrectly at the start, there’s a 1/2 chance Swiper will tell her that her branch is bad. She will switch and be on the good branch with probability 1/2. We thus have
= 1/3 + (2/3)[(1/2)*(1/2)] = 1/2

What about Strategy 1? Is it 2/3 like in the Monty Hall problem or did this change under the setup here?
Well, if the branch Dora originally chose is good, she will never survive because she always switches. The only way she survives is if she switches to a good branch.
P(Dora survives) = P(good branch was not the one she started with)*P(she switches to good branch | …)
There are two scenarios. Swiper might tell her that the branch she is on is bad or that a branch she is not on is bad. Telling her that the branch she is on is bad does not help at all because she was planning to switch anyway. The following probabilities are conditional on the event that the original branch Dora chose was incorrect.
= (2/3)*[P(Swiper told her the branch she started with is bad) * P(Dora switches to a good branch | …) + P(Swiper told her a branch she was not on was bad) * P(Dora switches to a good branch | …)]
= (2/3)*[(1/2) * (1/2) + (1/2)*1] = 1/2

So it doesn’t matter whether Dora follows Strategy 1 or Strategy 2.

To see this in the simulations, change

dfsims$advice <- apply(dfsims, 1, function(x) choices[!choices %in% as.vector(x)][sample(1:length(choices[!choices %in% as.vector(x)]), 1)])

to

dfsims$advice <- apply(dfsims, 1, function(x) choices[!choices %in% as.vector(x[1])][sample(1:length(choices[!choices %in% as.vector(x[1])]), 1)])
dfsims$choice1 <- apply(dfsims, 1, function(x) if (x[["choice1"]] == x[["advice"]]) sample(choices[!choices %in% x[["choice1"]]], 1) else x[["choice1"]])

The second line reflects the observation that Dora will of course switch if she knows she's on a bad branch.

To get rid of this scenario, Swiper could specifically say "Dora, choose a branch. Whichever branch you choose I will look at the *other* two branches and tell you which one of them is bad."
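The comparison of the two strategies in this comment can also be checked with a self-contained sketch that does not depend on the post's `dfsims` data frame. It assumes Swiper names one of the two waterfall branches at random (possibly Dora's own), and that a Dora who is told her own branch is bad moves to one of the other two at random. All variable names here are hypothetical.

```r
set.seed(7)  # arbitrary seed, only for reproducibility
n <- 100000
safe  <- sample(1:3, n, replace = TRUE)  # the safe branch in each run
first <- sample(1:3, n, replace = TRUE)  # Dora's first choice

# Swiper names one of the two waterfall branches at random;
# it may or may not be the branch Dora is on
warned <- ((safe - 1 + sample(1:2, n, replace = TRUE)) %% 3) + 1
told.own <- warned == first

# If her own branch is named, Dora moves to one of the other two at random
random.other <- ((first - 1 + sample(1:2, n, replace = TRUE)) %% 3) + 1
# Otherwise, switching means taking the branch neither chosen nor named
# (branch numbers sum to 6; the value is ignored when told.own is TRUE)
remaining <- 6 - first - warned

always.switch <- ifelse(told.own, random.other, remaining)  # Strategy 1
only.if.told  <- ifelse(told.own, random.other, first)      # Strategy 2

p.always.switch <- mean(always.switch == safe)
p.only.if.told  <- mean(only.if.told == safe)
print(c(strategy1 = p.always.switch, strategy2 = p.only.if.told))
```

Under these assumptions both survival rates should be close to 1/2, in line with the two calculations in the comment.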

Reply