Thursday 12 Oct 2017: Statistical science - Using directed acyclic graphs (DAGs) to strengthen statistical modelling for causal inference: a cautionary tale
Prof George Ellison
Directed acyclic graphs (DAGs) help improve statistical modelling in three specific ways: (i) they establish an a priori theoretical framework which, as specified, is open to scrutiny and debate; (ii) as specified, they prescribe the covariate adjustment sets required to minimise confounding bias from known/measured covariates; and (iii) they help to ensure that these adjustment sets, as specified, do not include covariates acting as 'mediators' or 'descendants of the outcome'. These benefits have made DAGs popular in theory-driven observational epidemiological studies that aim to assess the 'total causal effect' of one variable (the 'exposure') on another (the 'outcome'), not least since Johannes Textor developed, first, the online tool DAGitty.net (which automatically identifies confounders and mediators) and, second, the R package 'dagitty' (which, amongst other things, can assess whether a DAG, as specified, is consistent with the dataset it is intended to represent). Yet DAGs may prove even more useful for generating all of the alternative, data-driven hypotheses, in the form of data-compatible DAGs (again using the R package 'dagitty'). ALL of these can then be assessed using temporal logic (and any functional relationships between variables that are known with certainty), so that every DAG that remains a plausible theory is retained for causal interpretation, rather than only the single DAG that analysts tend to prefer on the basis of their unique, yet cognitively flawed, theories and experiences. This talk will explain how DAGs are used and how to draw DAGs with limited reference to (potentially flawed) a priori 'knowledge' of functional relationships between measured variables. It will also include a cautionary tale of how DAGs can help identify past analytical mistakes in published analyses.
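As a rough illustration of points (ii) and (iii) above, the sketch below sorts the covariates in a small DAG into potential confounders, mediators and descendants of the outcome. It is a minimal Python sketch, not the algorithm used by DAGitty.net or the 'dagitty' package. The graph, the variable names and the helper functions `descendants` and `classify_covariates` are all hypothetical. The confounder rule used here (a common ancestor of both exposure and outcome) is a deliberate simplification of the full back-door criterion.

```python
# Hypothetical toy DAG: each key points to the variables it directly causes.
# Variable names are illustrative only and are not taken from any study.
EDGES = {
    "age": ["exercise", "blood_pressure"],
    "exercise": ["weight", "blood_pressure"],
    "weight": ["blood_pressure"],
    "blood_pressure": ["medication"],
    "medication": [],
}


def descendants(graph, node):
    """Return every variable reachable from `node` along directed edges."""
    seen = set()
    stack = list(graph.get(node, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(graph.get(current, []))
    return seen


def classify_covariates(graph, exposure, outcome):
    """Label each covariate by its role relative to exposure and outcome.

    Simplified heuristic (an assumption of this sketch):
      - descendant of outcome: lies downstream of the outcome
      - mediator: downstream of the exposure and upstream of the outcome
      - potential confounder: upstream of both exposure and outcome
    """
    after_exposure = descendants(graph, exposure)
    after_outcome = descendants(graph, outcome)
    roles = {}
    for node in graph:
        if node in (exposure, outcome):
            continue
        reaches = descendants(graph, node)
        if node in after_outcome:
            roles[node] = "descendant of outcome"
        elif node in after_exposure and outcome in reaches:
            roles[node] = "mediator"
        elif exposure in reaches and outcome in reaches:
            roles[node] = "potential confounder"
        else:
            roles[node] = "other"
    return roles


if __name__ == "__main__":
    roles = classify_covariates(EDGES, "exercise", "blood_pressure")
    for name, role in sorted(roles.items()):
        print(f"{name}: {role}")
```

In this toy graph, only the covariate labelled as a potential confounder would belong in an adjustment set for the total causal effect. Adjusting for the mediator or for the descendant of the outcome is exactly the mistake that point (iii) warns against.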
Bio: George is Associate Professor of Epidemiology at the Leeds Institute for Data Analytics and Leeds Institute for Cardiovascular and Metabolic Medicine. Originally trained as a zoologist, he has taught across the natural and applied social sciences in the UK and southern Africa, and has a keen interest in 'Social Studies of Knowledge'.