The Statistics and Data Science team plays an integral part in the research of all scientific programmes in IMS Epidemiology.
Statistics work is led by Stephen Sharp and includes:
- Good Analytical Practice (see below).
- Statistical input into ongoing department research work.
- Library of exemplar Stata code for application of specific methods.
- Cambridge Epidemiology and Trials Unit (CETU).
- National Diet and Nutrition Survey (NDNS) rolling programme.
- Quality control (QC), imputation and analysis of large-scale data from genome-wide and omics platforms.
- Collaborations with external statisticians (e.g. MRC Biostatistics Unit) on statistical topics of relevance to department research.
- Contributions to the training of statisticians and epidemiologists (e.g. University of Cambridge MPhil in Population Health Sciences).
- Provision of statistical reviews for papers submitted to medical and epidemiological journals.
Data science work is led by Tom Bishop and includes:
- Federated meta-analysis, which enables cross-cohort analyses without physically pooling data from each study (InterConnect, EUCAN-Connect projects).
- Application of novel methods for data acquisition and processing, including web-scraping techniques and deep neural networks.
- Collaboration with external experts (e.g. University of Cambridge Department of Applied Mathematics and Theoretical Physics, Department of Computer Science, Health Data Research UK) on data science issues of relevance to department research.
- Development of a Trusted Research Environment (TRE) for the department, which allows researchers to access and analyse our data without being able to remove it from the environment.
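As an illustration of the federated meta-analysis idea listed above, the following is a minimal Python sketch of one common approach: a two-stage, fixed-effect inverse-variance meta-analysis, in which each cohort shares only an effect estimate and its standard error rather than individual-level data. It is not necessarily the method used in the projects named above; the cohort labels, the `CohortEstimate` class, the `fixed_effect_meta` function and all numbers are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class CohortEstimate:
    """Summary statistics a cohort shares instead of individual-level data."""
    cohort: str   # hypothetical cohort label
    beta: float   # effect estimate computed locally (e.g. a log hazard ratio)
    se: float     # standard error of that estimate


def fixed_effect_meta(estimates):
    """Pool cohort-level estimates using inverse-variance weights."""
    weights = [1.0 / (e.se ** 2) for e in estimates]
    total_weight = sum(weights)
    pooled_beta = sum(w * e.beta for w, e in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se


# Illustrative numbers only, not real study results.
estimates = [
    CohortEstimate("cohort_a", 0.10, 0.05),
    CohortEstimate("cohort_b", 0.20, 0.10),
]
pooled_beta, pooled_se = fixed_effect_meta(estimates)
print(f"pooled estimate = {pooled_beta:.3f}, SE = {pooled_se:.3f}")
```

Only the two summary values per cohort cross study boundaries in this sketch; a random-effects model would be a natural alternative where between-cohort heterogeneity is expected.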
IMS Epidemiology has a Standard Operating Procedure (SOP) for Good Analytical Practice. This SOP applies to all employees, students and visitors affiliated to any of the department research programmes who perform any type of analysis using data and generate outputs for which IMS Epidemiology has primary responsibility. Examples of outputs include papers, reports, PhD theses, MPhil project reports and conference presentations. The purpose of this SOP is to ensure that all analytical work is clearly justified, accurate, transparent and reproducible.
Topics covered by the SOP include:
- Statistical Analysis Plans.
- Analysis software.
- Analysis programs.
- Datasets.
- Location of analysis work.
- Internal peer review of analysis work.
For further information about the SOP or the work of the Statistics and Data Science team, please email Stephen Sharp (stephen.sharp@ims.cam.ac.uk).