Dishashree Gupta — April 10, 2017

## Introduction

Probability forms the backbone of many important data science concepts, from inferential statistics to Bayesian networks. It would not be wrong to say that the journey of mastering statistics begins with probability. This skill test was conducted to help you identify your skill level in probability.

A total of 1249 people registered for this skill test, which was designed to test conceptual knowledge of probability. If you are one of those who missed out, here are the questions and solutions. You missed the real-time test, but you can read this article to find out how you could have answered correctly.

Here are the leaderboard rankings for all the participants.

Are you preparing for your next data science interview? Then look no further! Check out the comprehensive ‘Ace Data Science Interviews’ course, which encompasses hundreds of questions like these along with plenty of videos, support and resources. And if you’re looking to brush up your probability skills even more, we have covered it comprehensively in the ‘Introduction to Data Science’ course!

## Overall Scores

You can access the final scores here. More than 300 people participated in the skill test and the highest score obtained was 38. Here are a few statistics about the distribution.

Mean Score: 19.56

Median Score: 20

Mode Score: 15

This was also the first test where someone scored as high as 38! The community is getting serious about DataFest.

## Useful Resources

Basics of Probability for Data Science explained with examples

Introduction to Conditional Probability and Bayes theorem for data science professionals

1) Let A and B be events on the same sample space, with P(A) = 0.6 and P(B) = 0.7. Can these two events be disjoint?

A) Yes

B) No
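One way to check question 1: disjoint events would need P(A ∪ B) = P(A) + P(B), and no probability can exceed 1. The short Python sketch below applies that test and also computes the smallest overlap the two events must have. The helper name `can_be_disjoint` is illustrative only and is not part of the original test.

```python
from fractions import Fraction


def can_be_disjoint(p_a, p_b):
    """Disjoint events need P(A or B) = P(A) + P(B), which must not exceed 1."""
    return p_a + p_b <= 1


# Values from question 1, kept as exact fractions to avoid float rounding.
p_a = Fraction(6, 10)
p_b = Fraction(7, 10)

# Since P(A or B) <= 1, the overlap P(A and B) is at least P(A) + P(B) - 1.
min_overlap = max(Fraction(0), p_a + p_b - 1)

print(can_be_disjoint(p_a, p_b))  # False
print(min_overlap)                # 3/10
```

Because 0.6 + 0.7 = 1.3 exceeds 1, the events must share probability of at least 0.3, so they cannot be disjoint.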

2) Alice has 2 kids and one of them is a girl. What is the probability that the other child is also a girl?

You can assume that there are an equal number of males and females in the world.

A) 0.5
B) 0.25
C) 0.333
D) 0.75
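Question 2 can be checked by listing the outcomes. The sketch below assumes the four (elder, younger) combinations are equally likely and counts the cases consistent with each reading of the question. The function and variable names are hypothetical, chosen only for this illustration.

```python
from fractions import Fraction
from itertools import product

# All equally likely (elder, younger) combinations for two children.
families = list(product("BG", repeat=2))


def conditional_probability(event, given):
    """P(event | given), computed by counting equally likely outcomes."""
    conditioned = [f for f in families if given(f)]
    favourable = [f for f in conditioned if event(f)]
    return Fraction(len(favourable), len(conditioned))


def both_girls(family):
    return family == ("G", "G")


# Reading 1: we only know that at least one of the two children is a girl.
at_least_one_girl = conditional_probability(both_girls, lambda f: "G" in f)

# Reading 2: we know that a specific child (here, the elder) is a girl.
elder_is_girl = conditional_probability(both_girls, lambda f: f[0] == "G")

print(at_least_one_girl)  # 1/3
print(elder_is_girl)      # 1/2
```

The first reading gives 1/3 and the second gives 1/2, so the result depends on how the information about the girl was obtained.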

3) A fair six-sided die is rolled twice. What is the probability of getting 2 on the first roll and not getting 4 on the second roll?

A) 1/36
B) 1/18
C) 5/36
D) 1/6
E) 1/3

###### Dishashree Gupta

Dishashree is passionate about statistics and is a machine learning enthusiast. She has 1.5 years of experience in market research using R, advanced Excel and Azure ML.

## 40 thoughts on "40 Questions on Probability for data science"

###### Kenneth Singh says: April 10, 2017 at 7:32 am
Hi, none of the options in question 7 are probability values, right? I think all of them are numerators. Shouldn't the correct option be "None of the above"?

###### Chen Luo says: April 10, 2017 at 12:51 pm
Maybe this problem should be 'the number of possible cases'.

###### Arcady Novosyolov says: April 10, 2017 at 1:25 pm
Hello,

Thanks for publishing this set of problems. It looks like the answer to question 21 is right (option A), but the solution is wrong, since the actual probabilities of the events are P(A) = 0.402, P(B) = 0.296, and P(C) = 0.245; see the explanation at https://github.com/arcadynovosyolov/math_and_prob/blob/master/note_on_question_21.ipynb

Best regards, Arcady

###### santiago says: April 10, 2017 at 2:12 pm
About question 2: why do you do that? If you ask something with a real example, please give a real answer. Do you really think that the probability is 0.33? The result is 0.5. The events are independent.

###### Dishashree Gupta says: April 10, 2017 at 2:31 pm
This is a problem of conditional probability, and we have defined the case where one child is already a girl. Now the sample set reduces, and hence we have only 3 choices in the sample set.

###### santiago says: April 10, 2017 at 2:40 pm
I understand the idea, but if your objective is to test conditional probability, maybe another example could be better. In real life, the sex of one of your children doesn't affect the sex of those who will come.

###### Oshan says: April 10, 2017 at 2:58 pm
For Q21: the cases considered are for exactly 1, 2 and 3 sixes respectively. However, the question states "at least" 1, 2 and 3 sixes. I am not sure if this would make a difference in the final answer, but this seems like a mistake. Kindly clarify in case I am missing something here.

###### Andrew Morris says: April 10, 2017 at 5:12 pm
The answer to question 7 must be wrong, because it is greater than 1. The English grammar through these questions also has many mistakes, e.g. "are common" should be "are in common".

###### [email protected] says: April 10, 2017 at 8:47 pm
Hi! Could you please clarify the following:

Q7 - Answer A) cannot be the correct one since it is > 1 (see Q6 :) ). So D) should be the right choice.

Q28 - It is the binomial schema, isn't it? Probability of success p = 0.7, q = 0.3, n = 3 and m = 2. Then the answer is 3 * 0.7^2 * 0.3 = 0.44. Choice C) is incorrect since 0.7 * 0.3 * 0.7 and 0.3 * 0.7 * 0.7 have to be considered as well.

Thanks.

###### [email protected] says: April 10, 2017 at 9:33 pm
Also, regarding Q2: it is the well-known "Boy or Girl paradox" (https://en.wikipedia.org/wiki/Boy_or_Girl_paradox). Both answers, C) and A), are correct depending on the randomizing procedure.

###### Aditya Royal Matturi says: April 11, 2017 at 5:16 am
The answer for question 7 is (1/52 c 4)*(48 c 4/52 c 4); please correct that.

###### Aditya Royal Matturi says: April 11, 2017 at 7:33 am
As we are concerned about statistics here, it definitely has an effect statistically. I am also sure that it has no effect genetically; it's pure luck, and as always luck has some role in many things, like tossing coins, but we are ignorant of it.

###### Soham Lawar says: April 11, 2017 at 7:53 am
Hello, I have the following queries:

1) I request you to elaborate more on question 9.
2) In question 19 the range of test scores is 18 to 24, but the explanation is given for the range 20 to 26.
3) In the explanation of question 40, I think 'since the probabilities are continuous, the probabilities form a distribution function' is correct.

Regards, Soham

###### potterhead says: April 11, 2017 at 8:18 am
The answer for question 7 should be D - None of these. None of the options is a probability value.

###### Santhosh says: April 11, 2017 at 10:48 am
Q28 & Q33 both look similar to me, as both have binomial outputs. Could you please explain why it is different for Q27, and why we can't use the binomial distribution for it, unlike Q33?

###### Aditya Royal Matturi says: April 11, 2017 at 10:50 am
Question 23 is invalid because there can be an infinite number of ways of throwing a coin. Though there is only one possible way to satisfy the condition, there are infinite total outcomes for that test case, so I guess 1/infinite =~ 0.

###### Aditya Royal Matturi says: April 11, 2017 at 10:54 am

###### hanna yang says: April 11, 2017 at 12:49 pm
I think the explanation of question 4, "P(A ∩ C^c) will be only P(A)", is wrong. It should be: P(A ∩ C^c) will be P(A − C). If A ∩ C = ∅, then the explanation "P(A ∩ C^c) will be only P(A)" must be right.

###### Stu says: April 11, 2017 at 4:42 pm
I believe the correct answer to 7 is B, not A. Your reasoning is correct, but I think you made an error in reducing the fraction.

###### M Zakaria says: April 12, 2017 at 8:11 am
Regarding Q4: I think we should state explicitly that A, B, and C are independent. This way P(A ∪ B ∪ C) = P(A) + P(B) + P(C). Otherwise we need to subtract P(A and B), P(A and C), and P(B and C) from the result.

###### Mustafa V says: April 12, 2017 at 9:46 am

###### Mustafa V says: April 12, 2017 at 2:16 pm
Q2. The answer should be 0.5. It can be reasoned in 2 ways. The 1st child and 2nd child are independent events, and it is mentioned that the population is balanced, so p = 0.5. Also, since the 1st child is a girl, only gb and gg are valid possibilities, so again p = 0.5.

###### David Harper says: April 13, 2017 at 5:19 pm
Hi Dishashree,

The flaw is that {BB, GG, BG, GB} represent a permutation. When you reveal that the first is a girl, you are revealing GX or XG and excluding BB and BX so that 2/4 remain. Put another way, there are 3 combinations {BB, BG or GB, and GG} such that revealing one girl eliminates BB and what remains is 1/2. I hope that's helpful!

###### Mustafa V says: April 14, 2017 at 5:53 am
Hi,

Regarding Q10, the solution I have is as follows:

- Jack & Jill in one section: the number of ways is (58C18)*(40C20) [A]
- The total number of ways is (60C20)*(40C20) [B]

Thus the required probability is A/B, which is 19/177. But this option is not there at all. Can you review my solution?

###### Nimrod Ifrach says: May 12, 2017 at 7:24 am
Hi, I'm not sure the answer to Q28 is correct. 0.147 is the probability of only 1 case that can happen, but there are 3 possible cases:

1. Customers 1+2 buy eggs
2. Customers 1+3 buy eggs
3. Customers 2+3 buy eggs

So actually it's 0.147 * 3 = 0.44.

###### Dishashree Gupta says: May 12, 2017 at 1:39 pm
Both Q7 and Q28 were removed from the final score calculation due to a discrepancy!

###### Dishashree Gupta says: May 12, 2017 at 1:43 pm
Can you give more detail on your A and B calculation?

###### Dishashree Gupta says: May 12, 2017 at 1:44 pm
Yes, there was an issue with it. It has been removed from the final score calculation.

###### Dishashree Gupta says: May 12, 2017 at 1:44 pm
Yes, it has been removed from the final score calculation.

###### Dishashree Gupta says: May 12, 2017 at 1:46 pm
Yes, the answer for Q7 is not mentioned. We have removed it from the final calculation.

###### Dishashree Gupta says: May 12, 2017 at 1:48 pm
Yes, we have removed Q7 from the score calculation.

###### Dishashree Gupta says: May 12, 2017 at 2:18 pm
1) Elaborating on Q9: what I wanted you to calculate is the probability that when you throw a die 6 times, you get 1, 2, 3, 4, 5, 6 in some order.
2) Q19: this has been rectified.
3) Yes, so in the case of a distribution function, the probability of a random variable being exactly equal to a particular value is 0. We can only calculate the probability of a random variable being in a range.

###### Dishashree Gupta says: May 24, 2017 at 4:39 pm
We removed Q7 from the calculation due to a discrepancy!

###### Sam Gu says: June 02, 2017 at 5:27 pm
There are 2 variants of this question:

1. Alice has 2 kids and the elder one is a girl. What is the probability that the younger child is also a girl? The answer is 50%.
2. Alice has 2 kids and one of them is a girl. What is the probability that the other child is also a girl? The answer is 33%, as explained in the answer key.

###### Sarah Nogueira says: October 30, 2017 at 7:24 pm
For Q11, the solution says: "If coin A is selected then the number of times the coin would be tossed for a guaranteed Heads is 2". This does not seem right: since the probability of getting heads with coin A is 1/2 and each toss is independent, we could toss coin A a hundred times and never observe heads...

###### Dishashree Gupta says: October 30, 2017 at 7:36 pm
Hi Sarah, if the probability of getting heads from a coin is 1/2, can you really say that you would get heads if you throw the coin twice?

###### MJ says: April 10, 2018 at 1:44 pm
No, we can't; that's why I think the solution is flawed.