  #19  
Old 02-28-2021, 09:23 PM
jadier
Sarnak


Join Date: Dec 2019
Posts: 220

Quote:
Originally Posted by Isomorphic
The standard deviation is a property of the underlying distribution, not a sample. Regardless of what you think/feel about OP's reporting, my claim still stands. Given the assumption that the drop rate is 3.5% nothing I said is false. This leads me to your second point. I am assuming what the drop rate is, meaning I am deriving the standard deviation from this assumption. I am not assuming there is some estimation of the drop rate, which suggests that the drop rate in question is a random variable, it's a constant.
Yes, a standard deviation is a property of the distribution. But the reason for applying it is to figure out the chance you'd observe some data, given that distribution. That doesn't work if the data are biased.

That is, if you know that the normal chance for something is 10% with a sd of 2% and I tell you, "hey, I got a drop rate of 5% with this dataset where I just ignored all the people who got the drop more often than 7%", you can't just go "oh well, 5% is more than 2 sigma away from 10%, so I guess the drop rate changed". [ I'm not saying DMN did this. This is an extreme example of why using Z scores to interpret data relies on fair sampling. If the sampling isn't fair, it's meaningless to calculate Z scores. ]

In other words, you can only meaningfully interpret a Z score if the sample's fair. If it's a biased sample, it doesn't readily translate to the probability expected from the standard deviation...because it's biased.
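To make the extreme example concrete, here's a rough simulation sketch. The setup is my own assumption, not anything from OP's data: 2000 hypothetical players, 225 kills each (picked so that a true 10% rate gives a sd of about 2%), and a filter that throws away everyone who saw more than 7%.

```python
import random

# Hypothetical setup for illustration only; none of these numbers come from OP's data.
TRUE_RATE = 0.10   # the "normal chance" from the example
KILLS = 225        # sqrt(0.10 * 0.90 / 225) = 0.02, i.e. a sd of 2%
PLAYERS = 2000     # assumed number of players reporting
CUTOFF = 0.07      # the biased filter: ignore anyone who saw more than 7%

rng = random.Random(0)  # fixed seed so the run is repeatable


def observed_rate(rng, kills, rate):
    """Simulate one player's kills and return their observed drop rate."""
    drops = sum(rng.random() < rate for _ in range(kills))
    return drops / kills


rates = [observed_rate(rng, KILLS, TRUE_RATE) for _ in range(PLAYERS)]
kept = [r for r in rates if r <= CUTOFF]  # the biased subsample

fair_mean = sum(rates) / len(rates)
biased_mean = sum(kept) / len(kept)

print(f"fair sample mean:   {fair_mean:.3f}")
print(f"biased sample mean: {biased_mean:.3f} (from {len(kept)} of {PLAYERS} players)")
```

The drop rate never changes in the simulation, yet the filtered sample can't average more than 7%, so its Z score against 10% says nothing about whether the rate changed.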

Regarding the assumption: the OP's question was whether the drop rate changed. At 95% confidence, 8/224 is consistent with a drop rate of roughly 1.5 - 7%, 1/160 (even with the bias) with roughly <0.1 - 3.5%, and the combined 9/384 with roughly 1 - 4%. Those ranges all overlap, so the samples are consistent with one another.
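If you want to check ranges like these yourself, here's a minimal sketch using only the Python standard library. It assumes an exact (Clopper-Pearson) binomial interval at 95%; I'm not claiming that's the only reasonable method, and other interval methods (Wilson, etc.) will move the endpoints a little. The function names are just ones I made up for this sketch.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def solve_decreasing(f, target):
    """Bisection for f(p) = target, where f decreases as p goes from 0 to 1."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_interval(k, n, alpha=0.05):
    """Exact (Clopper-Pearson) confidence interval for k drops in n kills."""
    # Lower bound: the rate at which seeing k or more drops has probability alpha/2.
    lower = 0.0 if k == 0 else solve_decreasing(
        lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    # Upper bound: the rate at which seeing k or fewer drops has probability alpha/2.
    upper = 1.0 if k == n else solve_decreasing(
        lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper


for drops, kills in [(8, 224), (1, 160), (9, 384)]:
    low, high = exact_interval(drops, kills)
    print(f"{drops}/{kills}: {100 * low:.2f}% - {100 * high:.2f}%")

# Two intervals overlap when the larger lower bound is below the smaller upper bound.
old_low, old_high = exact_interval(8, 224)
new_low, new_high = exact_interval(1, 160)
print("intervals overlap:", max(old_low, new_low) <= min(old_high, new_high))
```

The endpoints should land close to the rounded ranges quoted above. The upper end for 1/160 sits right around 3.5%, so exactly where it falls depends on the method and rounding; the overlap between the 8/224 and 1/160 intervals doesn't.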

So you're correct that assuming a 3.5% drop rate, 1/160 is an unlikely observation...but my point is that although the point-estimate may vary by 2 sigma, when you account for sample size, even this biased dataset doesn't preclude 3.5%.
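For anyone who wants to see the "unlikely observation" part in numbers, here's a quick sketch. It takes the 3.5% rate as a given (that's Isomorphic's assumption) and treats the 160 kills as a fair sample, which is exactly the part I'm disputing, so treat it as illustration only.

```python
from math import comb, sqrt

ASSUMED_RATE = 0.035  # the drop rate taken as a given constant
KILLS = 160
DROPS = 1

# Normal-approximation view: how many sd below the expected count is 1 drop?
expected = KILLS * ASSUMED_RATE
sd = sqrt(KILLS * ASSUMED_RATE * (1 - ASSUMED_RATE))
z = (DROPS - expected) / sd

# Exact binomial view: the chance of seeing 1 or fewer drops in 160 kills.
tail = sum(
    comb(KILLS, i) * ASSUMED_RATE**i * (1 - ASSUMED_RATE) ** (KILLS - i)
    for i in range(DROPS + 1)
)

print(f"expected drops: {expected:.1f}, sd: {sd:.2f}, z: {z:.2f}")
print(f"P(at most {DROPS} drop in {KILLS} kills): {tail:.4f}")
```

The Z score comes out at about 2 sigma below expectation and the exact tail is a couple of percent: unlikely, but it's the kind of bad luck that happens all the time across a server full of players.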

I.e., nothing posted here implies the drop rate changed at all, even if you assume the true drop rate was exactly 3.5%. OP's friends just had bad luck. The true drop rate is uncertain, but we'd need way, way, way more evidence to suspect a change.

(edit: tl;dr: if you assume a 3.5% drop rate, the point estimate of 1/160 is, as Isomorphic says, >2 standard deviations away from 3.5%. However, (1) the data are biased, so there's no way to actually connect a probability to the observation, and (2) even if they weren't biased, 1/160 is still not inconsistent at 95% confidence with a 3.5% drop rate. An assumed distribution doesn't have error, but 160 is a small sample size when dealing with a 3.5% chance event, so the confidence interval still overlaps the previously-estimated drop rate.)
Last edited by jadier; 02-28-2021 at 09:52 PM..