
> The chance of any individual drug having obtained its positive results by chance is 31%

And that's different from the chance that randomness could have produced the effect? How?



Sorry, I wasn't very clear. Usually people quote the p value as the chance the result is a fluke; the p value in CERN's case is p = 1/1,740,000. But that's the chance that the effect would be produced if the Higgs did not exist, which is different.

By analogy, take the medical case with p = 0.05. The incorrect interpretation is that only 5% of the drugs with statistically significant benefits got those benefits through luck. The right interpretation is that 5% of the nonfunctional drugs appeared to work anyway.
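To make the two readings concrete, here is a rough sketch of the arithmetic. The inputs (10% of tested drugs really work, trials detect a working drug 80% of the time) are made-up assumptions for illustration, not the numbers behind the 31% figure, and `false_discovery_rate` is just a name I picked.

```python
def false_discovery_rate(base_rate, power, alpha):
    """Fraction of 'significant' results that come from drugs that do nothing.

    base_rate: assumed fraction of tested drugs that really work
    power:     assumed chance a working drug reaches significance
    alpha:     significance threshold (chance a useless drug passes)
    """
    true_positives = base_rate * power
    false_positives = (1 - base_rate) * alpha
    return false_positives / (true_positives + false_positives)


# Hypothetical inputs, chosen only to show the gap between alpha and the result.
fdr = false_discovery_rate(base_rate=0.10, power=0.80, alpha=0.05)
print(f"alpha = 5%, but {fdr:.0%} of the significant results are flukes")
```

With these assumed inputs the answer is 36%, far from 5%. The threshold alone doesn't determine it; the base rate and the power matter just as much.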

You could also imagine testing 200,000,000 hypotheses which were all completely false. Even if you used CERN's level of statistical significance, you'd expect about 115 of them to appear true simply by chance (200,000,000 / 1,740,000 ≈ 115). The chance of any such hypothesis being false is 100%, despite the significance level of 1 in 1,740,000.
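A quick check of that thought experiment, assuming the 200,000,000 tests are independent (my assumption; real experiments rarely are):

```python
import math

p_false_positive = 1 / 1_740_000   # significance level quoted above
n_hypotheses = 200_000_000         # all of them false, by construction

# Average number of false hypotheses that pass the threshold anyway.
expected_false_positives = n_hypotheses * p_false_positive

# P(at least one passes) = 1 - (1 - p)^n, computed via logs for numerical stability.
p_at_least_one = -math.expm1(n_hypotheses * math.log1p(-p_false_positive))

print(f"expected false positives: {expected_false_positives:.1f}")
print(f"P(at least one): {p_at_least_one:.6f}")
```

Every one of those "discoveries" is false, even though each cleared a 1 in 1,740,000 bar.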

So yes, 31% is exactly the chance that randomness produced the effect in the trial. But people will try to tell you that it's actually 5%, and they're wrong.


This thread started with the claim that "the chance that the results occurred by chance" was different from "the chance that randomness could produce [the result]".

But you're saying that in your example both are 31%. So again, I ask, are we talking about two separate things? And if so, can you give an example where the two things have different values?


In my medical example, "the chance that the results occurred by chance" is 31%. "The chance that randomness could produce [the result]" was only 5%.

For CERN, the chance that randomness could produce this result is 1 in 1.74 million; the chance that the results occurred by chance is larger, but not computable with the information we have.

The guide I linked to above gives a much better explanation than this. I rushed my first post here, and I think I was unclear.


I swear I'm not trying to give you a hard time, but I don't see how these two things you said could both be true:

> "The chance that randomness could produce [the result]" was only 5%.

> 31% is exactly the chance that randomness produced the effect in the trial.


Whoops, poor wording on my part.

Imagine flipping a perfectly fair coin 100 times. You'd expect to see 50 heads, but you don't always -- it's just an average. Suppose you see 75 heads. What is the chance that you'd see 75 heads with a fair coin? Very very small. The chance that randomness could produce such a result is small.

Now, imagine you test tens of millions of perfectly fair coins. A few of them give 75 or more heads, just by luck. You conclude they're unfair, since the result is unlikely otherwise. The chance that randomness produced the effects you saw is actually 100%, because all the coins are fair.

There's a difference between the question "How likely is this outcome to happen if the coin is fair?" and "Given that this outcome happened, how likely is it that the coin is fair?" Statistical significance addresses the first question, not the second.
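Here's a small sketch that puts numbers on both questions. The first is an exact binomial tail; the second needs Bayes' rule plus a prior. The 75-heads cutoff comes from the example above, but the 50/50 mix and the biased coin that lands heads 75% of the time are hypothetical, added only to show the answer moving.

```python
from math import comb


def tail_prob(n_flips, min_heads, p_heads=0.5):
    """P(at least min_heads heads in n_flips flips of a coin with P(heads) = p_heads)."""
    return sum(
        comb(n_flips, k) * p_heads**k * (1 - p_heads) ** (n_flips - k)
        for k in range(min_heads, n_flips + 1)
    )


def prob_fair_given_flagged(prior_fair, p_flag_if_fair, p_flag_if_biased):
    """Bayes' rule: P(coin is fair | it showed the extreme outcome)."""
    flagged_fair = prior_fair * p_flag_if_fair
    flagged_biased = (1 - prior_fair) * p_flag_if_biased
    return flagged_fair / (flagged_fair + flagged_biased)


# Question 1: how likely is 75+ heads in 100 flips if the coin is fair?
p_if_fair = tail_prob(100, 75)

# Question 2 depends on what else is in the pile of coins.
p_if_biased = tail_prob(100, 75, p_heads=0.75)  # hypothetical biased coin
all_fair = prob_fair_given_flagged(1.0, p_if_fair, p_if_biased)
half_fair = prob_fair_given_flagged(0.5, p_if_fair, p_if_biased)

print(f"P(75+ heads | fair)          = {p_if_fair:.2e}")
print(f"P(fair | 75+ heads), all fair = {all_fair}")
print(f"P(fair | 75+ heads), 50/50    = {half_fair:.2e}")
```

The first number is the same in both scenarios. The second swings from certainty to nearly zero depending on the prior, which a significance test by itself never sees.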


Thanks; I feel like I learned something. Your infinite patience was appreciated.



