Causal inference can be attempted using different statistical methods, each of which requires some (untestable) assumptions. Common methods, with their key assumptions in parentheses, include multivariable regression (no unmeasured confounding); variations on regression such as propensity scores; g-methods (no unmeasured confounding); and instrumental variables (no association between instrument and outcome other than via the exposure). Less attention has been given to the impact on causal inference of selection (e.g. selection into a study, analysis of cases only) or missing data (e.g. dropout from a study, death due to other causes). Using directed acyclic graphs (DAGs), I will show some of the ways in which bias can occur due to selection or missing data. I will also present a recently developed method that, under some assumptions, uses genetic data to overcome selection bias. Applied work shows evidence of non-random selection into, and dropout from, studies including ALSPAC and UK Biobank, and I will discuss how this might affect causal analyses using these datasets.
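
As a rough illustration of how selection alone can induce bias, the following minimal simulation sketch draws an exposure and an outcome that are independent by construction, then keeps only individuals whose chance of being selected depends on both. The variable names, the selection rule and the sample size are assumptions chosen for illustration; they are not taken from ALSPAC, UK Biobank or any method presented in the talk.

```python
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate_selection(n=50_000, seed=1):
    """Return (correlation in everyone, correlation in the selected only)."""
    rng = random.Random(seed)

    # Exposure and outcome are generated independently: no causal effect.
    exposure = [rng.gauss(0, 1) for _ in range(n)]
    outcome = [rng.gauss(0, 1) for _ in range(n)]

    # Hypothetical selection rule: participation is more likely when
    # exposure and outcome are both high (plus random noise), so selection
    # is a common effect (collider) of exposure and outcome.
    selected = [
        (x, y)
        for x, y in zip(exposure, outcome)
        if x + y + rng.gauss(0, 1) > 0
    ]
    sel_x, sel_y = zip(*selected)

    return pearson(exposure, outcome), pearson(sel_x, sel_y)


full_corr, selected_corr = simulate_selection()
print(f"correlation in full population: {full_corr:.3f}")
print(f"correlation among selected:     {selected_corr:.3f}")
```

In the full population the correlation should be close to zero, whereas among the selected it should be clearly negative, even though the exposure has no effect on the outcome. This is the collider structure that a DAG makes explicit: conditioning on selection opens a non-causal path between exposure and outcome.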