> Choosing a null model for this purpose requires more assumptions than doing null hypothesis testing with frequentist statistics.

How so?

I could choose the same null model that predicts that the observation is distributed as, say, a standard Gaussian. What additional assumptions are required?



Mean and standard deviation up front, not from the sample. At least, pulling them from the sample would go against my understanding of Bayes factors and how I’ve calculated them. You can do other stats, a t-test for example, without declaring them up front.


I don’t understand what you’re saying, sorry.

If my choice of null model is p(x) = exp(-x^2/2) and I get some observation xobs, I can do frequentist things with it - like calculating the p-value p(|x| > |xobs|), for example - and I can compare it with some alternative model p’(x) using the Bayes factor p(xobs)/p’(xobs).
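
For concreteness, here is a minimal sketch of that comparison (the observation value, and the alternative being another fixed normal centred at 2, are made-up illustrative choices, not anything from the discussion):

    from scipy import stats

    xobs = 1.5                   # hypothetical single observation
    null = stats.norm(0, 1)      # null model: standard Gaussian
    alt = stats.norm(2, 1)       # illustrative alternative p'(x)

    # Frequentist calculation with the null: two-sided p-value P(|x| > |xobs|).
    p_value = 2 * null.sf(abs(xobs))

    # Bayesian comparison of the same null against the alternative:
    # Bayes factor p(xobs) / p'(xobs).
    bayes_factor = null.pdf(xobs) / alt.pdf(xobs)

    print(p_value, bayes_factor)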

What does “Mean and standard deviation up front, not from the sample.” mean in this context?


Mean and standard deviation up front would mean that you set your null as a normal distribution with a mean of 0 and a standard deviation of 2%, for example. This is different from just saying it’s normal and taking the mean and standard deviation from the sample.

I mean, I could be wrong on this. You could do that if you want. I just think of Bayes factors as a competition and it needs to be “fair”. So it doesn’t make sense to let the null update as data comes in but not the alternative.


Ok.

As you said "Bayes factors work with comparing models."

I don't think that there is a concept of "fairness" that prevents us from including in the comparison a very simple model.

"There is no null model. What, 0% effect?"

For example. But even if the true effect is zero, you may get a non-zero observation because measurements are not perfect.

If what you measure is precisely what you want to know, there is no need for statistical analysis!

Let's say for the sake of this example that you know that your measurement error is distributed normally with unit variance.

"Ok, there was a non-zero effect."

There was a non-zero _observation_.

"That model loses since it put the probability of 0% at 1 and everything else at 0."

It didn't. The prediction of the null model is that observations will be normally distributed around 0.

That's precisely what the author of the blogpost complains about: that if your observation is 1, the prediction of the null model (close to 0) was more accurate than the prediction of the second model (somewhere between 1 and 10), and the observation favours the former over the latter.
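
A rough numerical check of that claim (assuming, as an illustrative choice, that the second model puts a uniform prior on the true effect between 1 and 10 and shares the unit-variance normal measurement error):

    from scipy import stats
    from scipy.integrate import quad

    xobs = 1.0
    noise = stats.norm(0, 1)     # unit-variance measurement error

    # Null model: the true effect is exactly 0, so the predicted density
    # of the observation is just the noise density at xobs.
    p_null = noise.pdf(xobs)

    # Second model: effect ~ Uniform(1, 10), marginalised over that prior.
    p_alt, _ = quad(lambda effect: noise.pdf(xobs - effect) / 9.0, 1.0, 10.0)

    # A ratio above 1 means the observation favours the null.
    print(p_null, p_alt, p_null / p_alt)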


As far as I see it, you have three options:

Pick a measure. That’s what I mean by “effect is 0%”. It’s a straw man here.

Pick a fully specified model. This is a model that, up front, you could ask what is the probability of event E? For a normal distribution, this would require choosing concrete mean and standard deviation.

Pick an under-specified model. This would be that it’s normal, but you don’t pick the mean and standard deviation. You pull them from the sample. As I’ve described it here, you can’t get P(E) from that.

The expectation from our alternative hypothesis and model is that it’s fully formed before we look at the data. It’s a choice whether you want that to be the case or not with the null model. “Fair” as I’m describing it is that you would pick something.
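
For concreteness, a small sketch of the difference between options B and C (the 2% standard deviation and the 1% event are made-up numbers):

    from scipy import stats

    # Option B: fully specified up front, e.g. normal with mean 0 and sd 2%.
    null_b = stats.norm(loc=0.0, scale=0.02)
    # You can already ask for the probability of any event E,
    # e.g. E = "the effect is larger than 1% in absolute value".
    p_event = 2 * null_b.sf(0.01)
    print(p_event)

    # Option C: "it's normal, with mean and sd taken from the sample".
    # There is no distribution to build yet, so no P(E) can be computed
    # before the data are in; the model is under-specified up front.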


A> Pick a measure. That’s what I mean by “effect is 0%”. It’s a straw man here.

I don't understand what you mean by "pick a measure", but maybe the "it’s a straw man here" (which I don't really understand either) indicates that looking at the other two options is enough.

B> Pick a fully specified model. This is a model that, up front, you could ask what is the probability of event E? For a normal distribution, this would require choosing concrete mean and standard deviation.

Ok. That seems to describe a simple classical null hypothesis like the example I gave in my previous comment. The underlying thing of interest is zero and the sampling distribution for the data is normally distributed around zero.

C> Pick an under-specified model. This would be that it’s normal, but you don’t pick the mean and standard deviation. You pull them from the sample. As I’ve described it here, you can’t get P(E) from that.

That is not the kind of null hypothesis I gave in my example, I think we can agree on that.

> The expectation from our alternative hypothesis and model is that it’s fully formed before we look at the data.

I don't understand that sentence. What is "it" that is fully formed before we look at the data? The alternative hypothesis and model?

> It’s a choice whether you want that to be the case or not with the null model.

What is "that"? Being formed before we look at the data? (In that case I hope that the null model I described would satisfy that.)

> “Fair” as I’m describing it is that you would pick something.

Pick something of what? I'm completely lost, I'm afraid.

I'm just saying that I can have a null model of the form B, like in the example "the underlying thing is zero and the data generated by this model has a probability distribution p(x)=exp(-x^2/2)".

And I can compare that model with any other model described by a probability distribution for the underlying thing which, taking into account the measurement error, results in a probability distribution p'(x) for the data generated.


“Pick a measure” just meant that you’re predicting the difference will be exactly 0%. P(0) = 1.

The difference between a Bayes factor and a likelihood ratio is that the Bayes factor uses the marginal likelihood. So you need to pick your parameters ahead of time, weighted by priors. With a likelihood ratio, you can use the best parameters given the data.

You can do the likelihood ratio in an objective way, because you’re choosing whatever has the least error given the data. With a Bayes factor you can’t be totally objective; you need to choose ahead of time. The upside is that it reduces overfitting.
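
A sketch of that contrast (the fixed null N(0, 1), the alternative family N(mu, 1), the Normal(0, 2) prior on mu and the fake data are all illustrative choices, not anything fixed by the thread):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    data = rng.normal(0.3, 1.0, size=20)      # made-up sample

    def likelihood(mu):
        # Likelihood of the whole sample under N(mu, 1).
        return np.exp(stats.norm(mu, 1).logpdf(data).sum())

    # Likelihood ratio: plug in the best-fitting mu (here the sample mean),
    # so the alternative fits at least as well as the null by construction.
    lr = likelihood(data.mean()) / likelihood(0.0)

    # Bayes factor: average the alternative's likelihood over a prior on mu
    # chosen before seeing the data, e.g. mu ~ Normal(0, 2). A crude grid
    # integration is enough for the illustration.
    prior = stats.norm(0, 2)
    mus = np.linspace(-5.0, 5.0, 2001)
    marginal = sum(likelihood(m) * prior.pdf(m) for m in mus) * (mus[1] - mus[0])
    bf = marginal / likelihood(0.0)

    print(lr, bf)   # lr is never below 1; bf can go either way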


> “Pick a measure” just meant that you’re predicting the difference will be exactly 0%.

The difference of what? If you mean, for example, the difference between the population means of two groups, that doesn’t mean that the observed difference between two sample means is zero. A non-zero observation doesn’t mean that “the model loses”. A non-zero observed difference is not just something that can happen, it’s what is expected.

If you mean that the null hypothesis is really “the difference between the observed means is exactly zero”, that doesn’t seem very useful and I’ve never seen anyone do that. You don’t need statistics of any kind to reject the model “the observation is zero” when the observation is not zero.

Apart from that, I agree that different hypothesis testing procedures do different things and that their respective merits are debatable. My point was just that if you have a well-defined “null” model you can do different things with it, using the same exact model.
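
To illustrate the earlier point that a non-zero observed difference is expected even when the true effect is zero, a quick simulation (two groups of 30 draws from the same normal population; the group sizes are arbitrary):

    import numpy as np

    rng = np.random.default_rng(1)
    diffs = [rng.normal(size=30).mean() - rng.normal(size=30).mean()
             for _ in range(5)]
    # The observed differences are essentially never exactly zero,
    # even though the true difference in population means is.
    print(diffs)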


> that doesn’t seem very useful and I’ve never seen anyone do that.

Yes. That’s why it’s a straw man. I’m being sort of uncharitable in that description. My point is that it’s a starting point. To go from there, you need to choose a model, which will have parameters or priors.


That’s where the “null hypothesis” comes in. If it fixes the parameters in the model, you get a well-defined “null” model with a well-defined probability distribution for the observation, and - just like you can take this null hypothesis model and do frequentist calculations with it - you can also take it and calculate a Bayes factor relative to some other model.

(To be clear, if the null hypothesis doesn’t fully specify the parameters, the preceding paragraph doesn’t apply and the situation is more complex.)



