Advancing methodology for predicting healthcare study

Project Overview

The project aims to improve the accuracy and generalisability of risk prediction models by accounting for heterogeneity from various sources (such as systematic differences in important risk factors that are not included in the models). It will also examine how the way patients interact with health services can be exploited to improve predictive performance, by drawing information from the frequency and timing of the data collected within electronic health records (EHRs). A key limitation of current risk prediction models is that they do not allow consideration of ‘what-if’ scenarios. To enable ‘what-if’ queries, the project will use a potential outcomes (causal) framework to incorporate counterfactual scenarios (such as medication use, surgery, and lifestyle changes) into a clinical prediction model.

Start: October 2018


Funded By:

The Alan Turing Institute

Data Source

UK Biobank, New Zealand primary care data


Developing statistical methods in causal inference and prediction.


Improving the way that we perform clinical prediction, which ultimately means we can better support decisions across the healthcare system.

Early Findings

– Literature reviews on methods for handling informative presence and on counterfactual reasoning in risk prediction models are in submission.

– Methodological work in progress.

Researchers Involved

Matt Sperrin

Niels Peek

Lijing Lin

Kenneth Muir

Artitaya (‘Li’) Lophatananon