Booooogus answer to disease puzzler (7/4)

The answer to last week’s puzzler about the man’s chance of having the disease if he tests positive (1 in 51, or 1.97%) is based on some faulty probability calculations. Here’s the right way to look at it:

Suppose you have 1 million people. If 0.1% of them catch the disease, that’s 1,000 infected and 999,000 healthy. They all get tested, with 95% accuracy in the results. Now we have:

Healthy, negative test: 95% of 999,000 = 949,050
Healthy, positive test: 5% of 999,000 = 49,950
Infected, positive test: 95% of 1,000 = 950
Infected, negative test: 5% of 1,000 = 50

Total positive tests = 950 + 49,950 = 50,900

Chance of being infected given a positive test = 950 true positives / 50,900 total positives = 19/1018, or 1.87%
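For anyone who wants to check the tally, here is a minimal Python sketch of the same arithmetic. The variable names are made up for illustration, and it assumes (as the breakdown above does) that the 95% accuracy applies equally to infected and healthy people.

```python
from fractions import Fraction

# Figures from the puzzler as used in the post; names are illustrative only.
population = 1_000_000
prevalence = Fraction(1, 1000)    # 0.1% of people are infected
accuracy = Fraction(95, 100)      # assumed the same for both groups

infected = population * prevalence
healthy = population - infected

true_positives = infected * accuracy           # infected, positive test
false_positives = healthy * (1 - accuracy)     # healthy, positive test
total_positives = true_positives + false_positives

# Of everyone who tests positive, what share is actually infected?
chance = true_positives / total_positives
print(chance)                     # 19/1018
print(f"{float(chance):.2%}")     # 1.87%
```

Using exact fractions instead of floats keeps the 19/1018 result free of rounding noise.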

“Suppose you have 1 million people.”

Sorry but your probability factor is flawed too. When you start any mathematical calculation with “suppose” then it’s not really a true mathematical calculation in my book. You could insert any number into the equation and the answer would be different every time.

Huh? Try it. Plug in 10 million. All the numbers go up by a factor of 10, and the factors cancel in the percentage division, giving you the same percentage.
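The cancellation is easy to see in a quick sketch. The function name below is hypothetical, and the rates are the ones from the puzzler (0.1% infected, 95% accuracy for both groups); exact fractions are used so that even awkward population sizes work.

```python
from fractions import Fraction

def chance_infected_given_positive(population):
    """Share of positive tests that are true positives for a given population size."""
    infected = Fraction(population, 1000)              # 0.1% infected
    healthy = population - infected
    true_positives = infected * Fraction(95, 100)      # infected, positive test
    false_positives = healthy * Fraction(5, 100)       # healthy, positive test
    return true_positives / (true_positives + false_positives)

# The population size cancels out of the ratio every time.
for n in (20_000, 1_000_000, 10_000_000):
    print(n, chance_infected_given_positive(n))        # always 19/1018
```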

Chance of being infected given a positive test = 950 true positives / 50,900 total positives = 19/1018, or 1.87%

Aren’t the people who are infected but with a negative test just as infected? Why would those be deleted from the infected population?

When you start any mathematical calculation with "suppose" then it's not really a true mathematical calculation in my book. You could insert any number into the equation and the answer would be different every time.

We’re talking percentages…so it’s a ratio. As long as the ratio is the same…then it doesn’t matter what number you use.

missileman, It’s just easier to explain when you presuppose a number. If you want to be a mathematical purist, just use X for that number. The percent will come out the same.

I chose 1 million to make sure that the four groups would come out as whole numbers. Based on the percentages given in the problem, the total number of people has to be a multiple of 20,000 in order to avoid any decimals.
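The multiple-of-20,000 claim can be checked by brute force. This is only a sketch with made-up helper names, assuming the puzzler's rates of 0.1% infected and 95% accuracy.

```python
from fractions import Fraction

def groups(population):
    """Sizes of the four groups, kept as exact fractions."""
    infected = Fraction(population, 1000)      # 0.1% infected
    healthy = population - infected
    return [
        healthy * Fraction(95, 100),    # healthy, negative test
        healthy * Fraction(5, 100),     # healthy, positive test
        infected * Fraction(95, 100),   # infected, positive test
        infected * Fraction(5, 100),    # infected, negative test
    ]

def all_whole(population):
    """True if every group is a whole number of people."""
    return all(g.denominator == 1 for g in groups(population))

# Search upward for the smallest population with no fractional people.
smallest = next(n for n in range(1, 100_001) if all_whole(n))
print(smallest)                  # 20000
print(all_whole(1_000_000))      # True
```

The tightest constraint is the infected, negative-test group, which is the population divided by 20,000.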

insightful: I left out the people who were infected and tested negative because they aren’t important. The question was: if you test positive, what are the chances that you actually do have the disease? All we care about here are the people who test positive, and figuring out what percentage of them are infected.

This is an example of conditional probability and involves the use of Bayes’ theorem. You are given the condition that your test came back positive. You are now concerned that you have the disease. This is a standard problem in probability theory. The answer makes it seem as though the test isn’t reliable. However, a more important question is this: A person is tested and the test is negative. What is the probability that the person really has the disease?
This theorem, derived by Thomas Bayes, resulted in a branch of statistics known as Bayesian statistics. Bayes was a minister who liked to play around with probability theory. He thought his big contribution to society was a couple of theology papers he published.
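For the negative-test question, here is a rough sketch using the counts from the 1-million-person breakdown earlier in the thread. It assumes that breakdown's figures (95% accuracy for both groups); the variable names are illustrative.

```python
from fractions import Fraction

# Counts out of 1,000,000 people, taken from the breakdown in the thread.
healthy_negative = 949_050     # healthy, negative test
infected_negative = 50         # infected, negative test (false negatives)

# Of everyone who tests negative, what share is actually infected?
chance = Fraction(infected_negative, healthy_negative + infected_negative)
print(chance)                  # 1/18982
```

Under those assumptions a negative result is wrong about once in every 18,982 negative tests, so the test looks far more reliable from this side.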

Probability theory always had one big flaw in my book. Sure, each flip of a coin has a 50/50 probability, but after 4 heads in a row it seems the probability of a fifth head should be less.

What you’ve touched on is the difference between a priori (theoretical) and empirical probability. The latter, based on observational data, can be used as an estimate of the true probability of the coin coming down heads. The more times you flip, the better your estimate will be.

If the true probability of a head is 50%, and if every flip is independent of all the others before it, then half your flips will come down heads in the long run. There’s no way to predict how the next flip will turn out, and any sequence of results is just as likely as any other.
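One way to convince yourself is to list every possible run of 5 flips and look only at the runs that start with 4 heads. The sketch below assumes a fair coin with independent flips, so all 32 runs are equally likely.

```python
from fractions import Fraction
from itertools import product

# Every equally likely sequence of 5 flips of a fair coin.
sequences = list(product("HT", repeat=5))

# Keep only the sequences whose first four flips are all heads.
four_heads_first = [s for s in sequences if s[:4] == ("H",) * 4]

# Among those, count how many also have a head on the fifth flip.
fifth_is_head = [s for s in four_heads_first if s[4] == "H"]

print(len(sequences))                                        # 32
print(Fraction(len(fifth_is_head), len(four_heads_first)))   # 1/2
```

Only HHHHH and HHHHT survive the filter, and they are equally likely.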

@Barkydog
Probability: Given that I have a fair coin, what is the probability of tossing 4 heads in a row?

Statistics: A coin is tossed 4 times and each toss results in a head. What is the probability that the coin is fair? (Head or tail equally likely?)
When I taught probability and statistics, I would say “We will ignore the probability that the coin lands on its edge or rolls down the sewer”. One time, after making that statement, I tossed a coin and it did land on its edge. Another time, after making that statement, I tossed the coin too high. The classroom had a ceiling panel missing and the coin disappeared into the ceiling.
The strangest experience I had teaching a statistics class was having 2 students in the class turn in identical test papers on each exam, even to the notes and calculations in the margins and on the scratch sheet. Yet the 2 students sat on opposite sides of the classroom and could not communicate with each other. What are the chances of this happening?
The 2 students did not cheat. They were identical twins!

@triedaq I am betting against heads for the fifth flip, double up on the 6th, quadruple on the 7th. Guaranteed a fair coin. How would you choose?
P.S. Dropped a nickel and it landed standing on its edge, bought a lottery ticket and did not win.

If the coin is guaranteed fair, then the probability of a head on the fifth toss is 1/2.
Think about the 4 tosses: 0 heads in 4 tosses: 1/16; 1 head in 4 tosses: 1/4; 2 heads in 4 tosses: 3/8; 3 heads in 4 tosses: 1/4; 4 heads in 4 tosses: 1/16. Here are the possibilities: TTTT, HTTT, THTT, TTHT, TTTH, HHTT, HTHT, HTTH, THHT, THTH, TTHH, HHHT, HHTH, HTHH, THHH, HHHH
The probability is only 1 in 16 of all four tosses being heads, so if you hadn’t guaranteed a fair coin, I would bet heads on the fifth toss (you are using a 2-headed coin)…
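The 16 possibilities and the fractions for each number of heads can be generated rather than typed out by hand. This is a small sketch assuming a fair coin, so each of the 16 outcomes is equally likely.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 16 equally likely outcomes of 4 tosses of a fair coin.
outcomes = ["".join(t) for t in product("HT", repeat=4)]

# Tally how many outcomes contain 0, 1, 2, 3, or 4 heads.
head_counts = Counter(o.count("H") for o in outcomes)

# Convert the tallies to exact probabilities.
distribution = {k: Fraction(v, len(outcomes)) for k, v in sorted(head_counts.items())}

for heads, p in distribution.items():
    print(heads, p)
```

The printed probabilities are 1/16, 1/4, 3/8, 1/4, and 1/16 for 0 through 4 heads.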

Is your point OP that the correct answer is 1.87% compared to what Ray said, 1.97%? Could be, but it seems to be sort of a gnat’s eyebrow thing. I was thinking the question as posed couldn’t be answered exactly, b/c you also need to know the chance of actually having the disease, and the test saying you don’t.

That was exactly my point, with an analysis of where the explanation given for it went wrong. Yes, the difference between Ray’s answer and mine is small, but it’s the principle of the thing. If you pose a question and put a prize on the line, no matter how small or large, you had better be dead sure that the answer you have in mind is the right one.

Where Ray’s line of reasoning falls down is in assuming the test to be 95% accurate in testing people who don’t have the disease, but 100% accurate in testing the ones who do. I have no background in medical research, but from what little I’ve read, most lab tests don’t work that way; it’s the same accuracy for both groups.
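To see how much that assumption matters, here is a sketch that keeps the two accuracies separate. The parameter names `sensitivity` (accuracy on infected people) and `specificity` (accuracy on healthy people) are my own labels, not terms from the puzzler.

```python
from fractions import Fraction

def chance_infected_given_positive(prevalence, sensitivity, specificity):
    """Bayes' theorem: P(infected | positive test)."""
    true_pos = prevalence * sensitivity                # infected and positive
    false_pos = (1 - prevalence) * (1 - specificity)   # healthy and positive
    return true_pos / (true_pos + false_pos)

prevalence = Fraction(1, 1000)   # 0.1% infected

# 95% accuracy for both groups.
same = chance_infected_given_positive(prevalence, Fraction(95, 100), Fraction(95, 100))

# 100% accuracy on infected people, 95% on healthy people.
perfect = chance_infected_given_positive(prevalence, Fraction(1), Fraction(95, 100))

print(same)      # 19/1018
print(perfect)   # 20/1019
```

With the 100% assumption the result is 20/1019, which is close to 1 in 51; with 95% for both groups it is 19/1018.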

As I said earlier, the people who get a negative test result don’t matter for the purposes of this problem. The question was: if you test positive, what are the chances that you have the disease?

I stand by my comment right or wrong. Math can be manipulated so many ways and I’m one that hates fractions. Split a dime among 3 people and there will never be a correct answer. Just a close one. I don’t want it close…I want it exactly. OK…off the soapbox for now.

“after 4 heads in a row it seems the probability of a fifth head should be less.”

Barkey: one way to disprove this is that this assumes the coin has memory, and it doesn’t. Think about it. You flip 4 heads in a row, then the coin sits for an hour, then you pick it up and flip it again. If the odds are not still 50-50, that means the coin somehow remembered the results of the previous flips.

Suppose you waited a year? Does the coin (or some other part of the universe) actually remember for that year what the previous flips were? Suppose during that year the coin was in your pocket and accidentally flipped many times. Do those count?

I stand by my comment right or wrong. Math can be manipulated so many ways and I'm one that hates fractions.

You can manipulate the equations…but the answer MUST be the same. If not then you manipulated the equations WRONG.

Usually, these medical tests are biased to be much more accurate in not letting infected people slip through. So there would be 5% false positives, but only, say, 0.1% false negatives. Not so booooogus in that light.

Edit: Never mind.