Introducing the eefAnalytics Package

The Education Endowment Foundation (EEF) has paved the way for research-informed interventions that investigate new ways to reduce educational inequality among disadvantaged students in schools. To support the EEF's efforts, we developed a statistical package that allows researchers to analyse data from Randomised Controlled Trials (RCTs) using state-of-the-art methods. The eefAnalytics package is a set of user-friendly commands, developed for both R and Stata, that allows researchers in education to use an optimal model to quantify causal effects in RCTs. These commands can also be used in other disciplines, provided the RCTs have continuous outcomes and do not exceed a two-level structure (e.g. participants nested within schools). Advanced knowledge of the specific calculations used by the package is not a prerequisite; users with an understanding of ordinary least squares (OLS) regression and multilevel modelling should find the package easy to use and its output easy to interpret. This blog post introduces the functionality of the package and its contribution to the analysis of RCTs in educational research.

Although RCTs have been used for a long time, choosing an appropriate methodology for their analysis remains contextually and theoretically challenging. Educational RCTs pose various statistical difficulties that need to be considered and controlled for. When discussing the analytical aspect of these trials, it is important to remember that the interventions proposed have the potential to benefit the lives of many children, which challenges the researcher to find an appropriate language for interpreting the effect of a particular trial. To facilitate this, the eefAnalytics package provides estimates of the Hedges' g effect size. This metric is calculated as the estimated difference in the outcome variable between the intervention group(s) and the control group, standardised using the conditional variance (model-based variance after adjusting for covariates) and the unconditional variance (total variance without adjusting for covariates). In recent years the effect size has been used more widely alongside classical p-values, allowing one to contextualise the results of an intervention in terms of its practical, and not just statistical, importance. More importantly, one of the main contributions of the eefAnalytics package is that it provides a suite of functions to estimate the effect size and its associated uncertainty.
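To make the definition concrete, here is a minimal sketch of an unconditional (covariate-unadjusted) Hedges' g in Python. The function name and data are purely illustrative, not the package's API; eefAnalytics derives the effect size from fitted models rather than raw group summaries:

```python
import math
import statistics

def hedges_g(treatment, control):
    """Standardised mean difference with the small-sample correction
    that distinguishes Hedges' g from Cohen's d."""
    n1, n2 = len(treatment), len(control)
    mean_diff = statistics.mean(treatment) - statistics.mean(control)
    # Pooled variance plays the role of the unconditional variance here:
    # total variability, with no covariate adjustment.
    pooled_var = ((n1 - 1) * statistics.variance(treatment)
                  + (n2 - 1) * statistics.variance(control)) / (n1 + n2 - 2)
    d = mean_diff / math.sqrt(pooled_var)
    # Small-sample correction factor J.
    j = 1 - 3 / (4 * (n1 + n2 - 2) - 1)
    return j * d
```

A conditional version would replace the pooled variance with the residual variance from a regression that adjusts for covariates such as a pre-test score.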

As with any test statistic, the interpretation of an effect size relies on the uncertainty surrounding it. For instance, a statistically significant effect size of relatively large magnitude will be of little practical use to a researcher if its confidence interval is very wide, with a lower bound covering values that are not practically meaningful. In this case, even though the mean effect may be large and significant, the true value could still lie in a region of the interval that is of little or no practical importance. Therefore, the calculation of the uncertainty associated with an effect size is just as important as its point estimate. This is especially true in educational RCTs, where it is imperative to understand whether more time and funding should be invested in expanding an intervention to a wider population of children and schools.

More specifically, eefAnalytics provides analytical methods underpinned by study design: it uses ordinary least squares regression for Simple Randomised Trials (SRTs), and mixed models for Cluster Randomised Trials (CRTs) and Multi-Site Trials (MSTs). For example, in a CRT design, where randomisation to the intervention and control groups happens between clusters (e.g. schools), there is a need to quantify the similarity of subjects within each cluster, as students in a particular cluster may share attributes that affect their outcome data. In an MST design, by contrast, randomisation of participants to the intervention and control groups occurs within each school or cluster. This introduces an additional level of variation stemming from potential differences in the way the intervention is administered (for instance, the quality of delivery may vary between schools). As a result, the eefAnalytics functions for MSTs explicitly model intervention-by-school interactions as random effects for educational trials, or intervention-by-site random effects for clinical trials. More detailed information on MSTs and their effect size modelling can be found in Akansha Singh's blog post.
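The within-cluster similarity described above is usually summarised by the intra-cluster correlation (ICC). eefAnalytics estimates it from fitted mixed models; purely for intuition, a back-of-envelope one-way ANOVA estimator for balanced clusters can be sketched as follows (function name and data are hypothetical):

```python
import statistics

def anova_icc(clusters):
    """One-way ANOVA estimate of the intra-cluster correlation for
    balanced clusters: ICC = (MSB - MSW) / (MSB + (n - 1) * MSW)."""
    k = len(clusters)                  # number of clusters (e.g. schools)
    n = len(clusters[0])               # members per cluster (balanced)
    grand_mean = statistics.mean(x for c in clusters for x in c)
    cluster_means = [statistics.mean(c) for c in clusters]
    # Between-cluster and within-cluster mean squares.
    msb = n * sum((m - grand_mean) ** 2 for m in cluster_means) / (k - 1)
    msw = sum((x - m) ** 2
              for c, m in zip(clusters, cluster_means)
              for x in c) / (k * (n - 1))
    return (msb - msw) / (msb + (n - 1) * msw)
```

An ICC near 1 means outcomes are driven almost entirely by which cluster a student belongs to, which is exactly the situation where ignoring the clustering would badly overstate the precision of an effect size.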

Even when the aforementioned clustering effects within and between schools are controlled for, the assumptions of the parametric models may still be violated. There has therefore been increasing interest in applying permutation tests and bootstrapping to randomised controlled trials to obtain reliable results in such cases, and the eefAnalytics package provides permutation and bootstrapping options to give researchers more flexibility in testing their models.

Bootstrapping is a re-sampling technique that randomly samples participants from the observed data to generate replicated samples. The sampling is done with replacement, and each new dataset yields its own effect size. Repeating this process many times accumulates effect sizes into a sampling distribution, which can be used to make direct inference about the variability of the effect size. Bootstrapping is particularly useful because it captures heterogeneity between participants and shows how the effect size varies across different snapshots of the original data. Permutation testing, by contrast, relies on randomly shuffling the intervention and control group labels under the null hypothesis of no difference between the groups. For each permuted dataset the effect size is calculated, contributing to a null distribution that can then be used to quantify how likely the observed effect size would be to arise by chance under the null hypothesis. It is important to note that both of these methods take the design of each study into account when estimating the effect size.
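The two procedures can be illustrated side by side. The sketch below is not the package's implementation: it uses a plain mean difference on an unclustered (SRT-style) sample, with illustrative function names and data, whereas the eefAnalytics versions apply the same logic to the design-appropriate effect size:

```python
import random
import statistics

def mean_diff(a, b):
    return statistics.mean(a) - statistics.mean(b)

def bootstrap_ci(treatment, control, reps=2000, alpha=0.05, seed=1):
    """Percentile bootstrap CI: resample each arm with replacement,
    then read the interval off the sorted replicate statistics."""
    rng = random.Random(seed)
    stats = sorted(
        mean_diff(rng.choices(treatment, k=len(treatment)),
                  rng.choices(control, k=len(control)))
        for _ in range(reps)
    )
    return stats[int(alpha / 2 * reps)], stats[int((1 - alpha / 2) * reps) - 1]

def permutation_p(treatment, control, reps=2000, seed=1):
    """Two-sided permutation p-value: shuffle group labels under the
    null of no difference and count replicates at least as extreme."""
    rng = random.Random(seed)
    observed = abs(mean_diff(treatment, control))
    pooled = list(treatment) + list(control)
    count = 0
    for _ in range(reps):
        rng.shuffle(pooled)
        if abs(mean_diff(pooled[:len(treatment)],
                         pooled[len(treatment):])) >= observed:
            count += 1
    return (count + 1) / (reps + 1)
```

For a clustered design, the resampling or shuffling would have to respect the clusters (e.g. resample whole schools) rather than individual participants, which is the design effect the package handles for you.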

Alternatively, if a researcher is faced with heterogeneous or noisy data, or has quantifiable prior knowledge that can be incorporated into the model itself, Bayesian analysis and diagnostics are also available for all three RCT designs (SRT, CRT, and MST). The package can optionally report a posterior probability, which serves as an intuitive and easily understood way of reporting the weight of evidence. The posterior distribution can be used to support an intervention by expressing the probability that its effect size exceeds a pre-specified, educationally relevant threshold. Please see Germaine Uwimpuhwe's blog for a more thorough introduction to quantifying statistical evidence using posterior probabilities.
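As a simple illustration of the idea, and assuming a normal approximation to the posterior of the effect size (in practice such probabilities are computed from posterior samples, and the function name and numbers here are hypothetical), the probability of exceeding a threshold is just a tail area:

```python
import math

def prob_above(posterior_mean, posterior_sd, threshold):
    """P(effect size > threshold) under a normal approximation
    to the posterior distribution of the effect size."""
    z = (threshold - posterior_mean) / posterior_sd
    # Standard normal survival function via the complementary error function.
    return 0.5 * math.erfc(z / math.sqrt(2))
```

A statement such as "there is a 0.98 probability that the effect size exceeds 0.1" is often easier for practitioners to act on than a p-value.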

Overall, the effect size, re-sampling techniques, and Bayesian analysis are all examples of highly comprehensive methods. Although they were introduced to the scientific community a long time ago, they are now used more frequently, as easy access to high computing power makes it possible to deliver more intricate and interpretable findings. Therefore, beyond its practical use and the methodology it encompasses, this package can also serve as a platform for the familiarisation and adoption of these methods, not just in education, but across all disciplines concerned with the analysis of RCTs.

This blog is written by Dimitris Vallis, a Research Assistant in Statistics in the Anthropology Department at Durham University.

This blog expresses the author’s views and interpretation of comments made during the conversation. Any errors or omissions are the author's.
