Do I need to go Bayesian?

This week’s research methods blog posed a simple topical question: “Do I need to go Bayesian?”. For those who haven’t yet travelled down this particular rabbit hole, Bayesian statistics use Bayes’ theorem and prior information to arrive at posterior distributions describing the relative plausibility of model parameter values. To you and me, this means results from Bayesian models have the uncertainty in parameter estimates baked in. This is functionally different from classical frequentist (or null hypothesis significance testing) models, which typically produce point estimates and rely on p-values for interpretation. The two camps often hold diametrically opposed views on how to conduct statistical tests, with particular criticism levelled at the use and interpretation of p-values.
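
The contrast can be made concrete with a minimal sketch (the data and prior here are illustrative, not from the discussion): in a Beta-Binomial model, a frequentist analysis returns a single point estimate, while the Bayesian posterior is an entire distribution whose spread encodes the uncertainty directly.

```python
# Illustrative data: 7 successes in 10 trials (hypothetical numbers).
k, n = 7, 10

# Frequentist point estimate: a single number, no built-in uncertainty.
mle = k / n  # 0.7

# Bayesian: with a uniform Beta(1, 1) prior, the posterior for the
# success probability is Beta(k + 1, n - k + 1) -- a whole distribution.
a, b = k + 1, n - k + 1
post_mean = a / (a + b)                           # posterior mean
post_var = a * b / ((a + b) ** 2 * (a + b + 1))   # posterior variance

print(f"MLE point estimate: {mle:.3f}")
print(f"Posterior mean:     {post_mean:.3f} (variance {post_var:.4f})")
```

The posterior variance here is the “baked-in” uncertainty: it shrinks as more trials are observed, without any separate significance test.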

Naturally, the conversation began with direct discussion of this debate. Several positive features of Bayesian models quickly emerged. For the field of neuroscience, a Bayesian framework more closely approximates how the human brain makes decisions and processes perceptual information. Where one member raised concerns about shrinkage in multilevel models with small sample sizes, it was pointed out that a Bayesian approach could mitigate this with priors, or would at least reflect it in the uncertainty of the estimates. The posterior distributions from Bayesian models can also be used directly for evolutionary simulations, something point estimates do not readily support.
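
The shrinkage point can be illustrated with a toy partial-pooling calculation (all numbers are hypothetical, and the variances below stand in for quantities a real multilevel model would estimate): groups with fewer observations are pulled harder toward the grand mean, which is exactly the regularising role a prior plays.

```python
# Toy partial pooling: shrink each group's mean toward the grand mean,
# with the shrinkage weight depending on the group's sample size.
groups = {"A": [4.1, 3.9, 4.3], "B": [6.0], "C": [5.1, 4.8, 5.0, 5.3]}
sigma2, tau2 = 1.0, 0.25  # assumed within- and between-group variances

all_obs = [x for xs in groups.values() for x in xs]
grand_mean = sum(all_obs) / len(all_obs)

shrunk = {}
for name, xs in groups.items():
    n = len(xs)
    gmean = sum(xs) / n
    # Weight on the group's own mean: higher n => less shrinkage.
    w = tau2 / (tau2 + sigma2 / n)
    shrunk[name] = w * gmean + (1 - w) * grand_mean
    print(f"group {name}: n={n}, raw mean={gmean:.2f}, shrunk={shrunk[name]:.2f}")
```

Group B, with a single observation, keeps only a fifth of its raw mean and borrows the rest from the other groups; with small samples this pooling is what a Bayesian prior makes explicit.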

What followed were lengthy discussions on the logistics of priors. In principle, a Bayesian model should incorporate any prior information one has about an effect of interest, which the model then uses to inform which parameter values are more or less plausible. The issue is that in science one often has no idea what kind of effect to expect. Even those sympathetic to the Bayesian approach admitted that they have rarely considered anything more than weakly regularising priors (though even this offers benefits). Further, with enough data, frequentist and Bayesian models will often produce nearly identical results. As noted earlier, priors are most useful when data are lacking; however, this is also the case where priors must be considered most carefully for their influence on the model. This process often invites concern about the subjective nature of priors, though it is worth pointing out that the conventional threshold of p < 0.05 for “statistical significance” is just as arbitrary as any choice of prior (arguably more so). Subjectivity is certainly not uniquely Bayesian.
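
The “enough data” point is easy to check in the conjugate Beta-Binomial setting (the two priors below are illustrative choices): very different priors give noticeably different posteriors with ten observations, and near-identical ones with ten thousand.

```python
def posterior_mean(k, n, a, b):
    """Posterior mean of a Beta(a, b) prior updated with k successes in n trials."""
    return (a + k) / (a + b + n)

# Two deliberately different priors (illustrative choices):
flat = (1, 1)        # uniform prior
skeptical = (2, 18)  # prior belief that success is rare

for n in (10, 100, 10_000):
    k = int(0.6 * n)  # data with a 60% success rate
    m1 = posterior_mean(k, n, *flat)
    m2 = posterior_mean(k, n, *skeptical)
    print(f"n={n:>6}: flat prior -> {m1:.3f}, skeptical prior -> {m2:.3f}")
```

As the sample grows, the likelihood swamps the prior and the two posterior means converge, mirroring the observation that frequentist and Bayesian answers often agree once data are plentiful.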

To find common ground, the discussion concluded with more general concerns about data analysis, which neither approach can magically fix. Conducting analysis on excessively large or excessively small samples remains problematic: too large and you will almost always find some sort of relationship (spurious or otherwise), while too small risks learning almost nothing of value and estimating parameters poorly. Many models focus on the population average rather than individual variation, which can be helped by collecting repeated observations and employing multilevel modelling. Care should also always be taken when attempting to extrapolate conclusions beyond the sampled cases. Neither approach, Bayesian or frequentist, can stand in for good data practices.
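
The large-sample pitfall is worth a quick sketch (the effect size is a hypothetical choice): with a fixed, practically negligible mean difference, the z statistic grows with the square root of the sample size, so a large enough sample makes almost any nonzero effect “significant”.

```python
import math

effect, sd = 0.02, 1.0  # hypothetical tiny mean difference and known SD
pvalues = {}
for n in (100, 10_000, 1_000_000):
    z = effect / (sd / math.sqrt(n))               # z grows with sqrt(n)
    pvalues[n] = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    print(f"n={n:>9}: z={z:6.2f}, p = {pvalues[n]:.3g}")
```

The effect itself never changes; only the sample size does, which is why a tiny p-value on its own says nothing about whether a relationship is worth caring about.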

The discussion was a particularly interesting one, with many perspectives offered from a variety of backgrounds. By way of a final statement, one attendee offered the advice “Don’t be afraid of using multiple methods”. Indeed, the ways models disagree can often be just as interesting as the ways they agree. For those still sceptical about Bayesian statistics, computational software has advanced tremendously in the past 10-15 years, making it easier than ever to access powerful samplers such as Hamiltonian Monte Carlo (via rstan) or Gibbs sampling (via JAGS), to name but a few. Two highly recommended introductory texts are Statistical Rethinking by Richard McElreath and Bayesian Statistics the Fun Way by Will Kurt.

In an attempt to answer the question, “Do I need to go Bayesian?”, the truthful response seems to be “it depends”. What it depends on are your research goals, your intended audience and, to an extent, your own opinion. Having said that, spending a little time learning about the other side of the coin is probably beneficial for anyone conducting data analysis.

*The views expressed in this blog are those of the author and reflect the author’s interpretation of comments made during the conversation.

Written by Robin Watson, PhD student in the Anthropology Department 

