National Awareness and Early Diagnosis Initiative


National Awareness and Early Diagnosis Initiative: Development of a risk prediction tool for early cancer detection in patients with Type 2 diabetes (T2D)

Project Overview

Patients with T2D are at increased risk of developing several common cancer types compared with the non-diabetic population. Current cancer risk prediction tools do not differentiate explicitly between the T2D and non-T2D populations.

The project aims to develop and validate a risk prediction tool for identifying individuals with T2D who are at high risk of developing cancer. The risk prediction tool will be easy to use and simple to apply in a General Practice setting. The tool will address the statistical complexity of baseline and time-varying factors such as BMI and drug types and exposures, which are routinely measured in the Primary Care setting.

Start: September 2014

End: August 2017

Funded by: Cancer Research UK

Disease Area Impacted

Diabetes and Cancer

Data Source

To develop and validate the risk prediction tool, anonymised data is linked from multiple regional and national sources.

These databases include the Salford Integrated Record (SIR), drawn from the population of Salford, UK (N=248 752), which has a defined T2D population of N=14 380. This data is linked to secondary care data through the national Hospital Episode Statistics (HES) and cancer intelligence service (CIS) databases. The Salford data covers a relatively stable population with uniform management of T2D across the locality, and includes information on deprivation levels and exposures.

The data is further linked to the Clinical Practice Research Datalink (CPRD), a database of anonymised longitudinal data drawn from 667 general practices in the UK, which provides information on approximately N=300 000 people with T2D.


A variety of standard statistical methods have been explored initially, such as:

  • Multiple imputation methods to address missing data issues
  • Logistic and Cox proportional hazards regression to estimate the effects of risk factors
  • The use of performance diagnostics such as ROC curves
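As an illustration of the last item, the area under a ROC curve (AUC) can be read as the probability that a randomly chosen patient who developed cancer was assigned a higher predicted risk than a randomly chosen patient who did not. The sketch below computes it by direct pairwise comparison. The function name `roc_auc` and the toy data are assumptions made for illustration, not the project's own code or data.

```python
def roc_auc(outcomes, risks):
    """Area under the ROC curve by pairwise comparison.

    outcomes: iterable of 0/1 (1 = developed cancer), hypothetical labels
    risks:    iterable of predicted risks, same length as outcomes
    Ties between a case and a non-case count as half a concordant pair.
    """
    pairs = list(zip(outcomes, risks))
    cases = [r for y, r in pairs if y == 1]
    non_cases = [r for y, r in pairs if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")

    concordant = 0.0
    for case_risk in cases:
        for non_case_risk in non_cases:
            if case_risk > non_case_risk:
                concordant += 1.0
            elif case_risk == non_case_risk:
                concordant += 0.5
    return concordant / (len(cases) * len(non_cases))


# Toy example (invented values): three of the four case/non-case
# pairs are ranked correctly, so the AUC is 0.75.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

An AUC of 0.5 corresponds to ranking no better than chance and 1.0 to perfect separation. The pairwise loop is quadratic in the number of patients, so a rank-based calculation would be preferable for data of the size described above.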

However, the richness of the linked data allows more complex statistical methods to be applied to enhance the accuracy and precision of the final risk prediction tool. We are currently exploring:

  • Inclusion of time-varying explanatory variables
  • The choice of timescale in predictive modelling, given the unresolved debate over using age rather than time since diagnosis of a condition, including the use of multiple timescales
  • Period and cohort effects using temporal validation techniques
  • Competing risk time-to-event analyses
  • Mixed and latent class analyses to determine risk factor trajectories, in particular BMI trajectories, and the associated probabilities of cancer risk
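To make the first item concrete, time-varying explanatory variables such as repeated BMI measurements are commonly prepared for time-to-event models in a counting-process layout, with one row per interval during which the covariate value is held constant. The sketch below shows one way this could be done. The function name, the tuple layout and the example values are assumptions made for illustration, and carrying the last measurement forward is only one of several possible choices.

```python
def to_counting_process(measurements, follow_up_end, event):
    """Split one patient's follow-up into (start, stop, value, event) rows.

    measurements:  list of (time, value) pairs, time in years since baseline
    follow_up_end: time at which the patient had the event or was censored
    event:         True if follow-up ended with a cancer diagnosis
    Each value is carried forward until the next measurement (an assumption).
    """
    # Keep measurements taken before the end of follow-up; if two share
    # the same time, the later entry in the list wins.
    latest = {}
    for time, value in measurements:
        if time < follow_up_end:
            latest[time] = value
    ordered = sorted(latest.items())
    if not ordered:
        raise ValueError("no measurement before end of follow-up")

    rows = []
    for i, (start, value) in enumerate(ordered):
        is_last = i == len(ordered) - 1
        stop = follow_up_end if is_last else ordered[i + 1][0]
        # Only the final interval can carry the event indicator.
        rows.append((start, stop, value, int(event and is_last)))
    return rows


# Hypothetical patient: BMI recorded at 0, 1.5 and 3 years,
# cancer diagnosed at 4.2 years.
for row in to_counting_process([(0.0, 31.2), (1.5, 29.8), (3.0, 30.5)], 4.2, True):
    print(row)
```

Rows in this layout can then be passed to software that accepts start and stop times, so that each patient contributes their current BMI rather than only the baseline value.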

Additional complexity is present because patients with T2D have a lower risk of prostate cancer, and because there is potential for reverse causality in pancreatic cancer risk prediction: subclinical cancer developing within the pancreas may lead to symptoms of diabetes.


The project will identify markers of greater cancer risk for T2D patients, but it will not be able to address the mechanistic question of which aspects of T2D confer any increase in cancer risk.

The prediction model will raise awareness among patients and professionals to increase cancer screening and to change health behaviours.

Further benefits include improved statistical modelling practice, in particular how best to include repeated BMI measures in risk prediction models and subsequently analyse differential patterns in the final prediction model.

Intended Outcomes

If a robust prediction model is developed, it is vital that it is incorporated into annual reviews for people with T2D and embedded within existing GP IT systems, to ensure the model is kept up to date and used in practice. Adaptation of screening practices and lifestyle advice for people found to be at high risk is also being explored.

For further information on our collaborations see our Diabetes and Cancer Study Group.


Prof Andrew Renehan

Dr Matthew Sperrin

Ellena Badrick

Dr Hannah Lennon

Prof Iain Buchan

Dr Martin Rutter

Dr Evangelos Kontopantelis

Prof Darren Ashcroft


Sperrin M, Candlish J, Badrick E, Renehan A, Buchan I. Collider bias is only a Partial Explanation for the Obesity Paradox. Epidemiology, eScholarID:293239

Badrick E, Renehan AG. Diabetes and cancer: 5 years into the recent controversy. Eur J Cancer. 2014 Aug;50(12):2119-25. doi: 10.1016/j.ejca.2014.04.032. Epub 2014 Jun 11. Review.

Badrick E, Renehan A. Colorectal cancer [internet]. 2013 [cited 2014 Jan 13]; Diapedia 61044601108 rev. no. 13. Available from: