Adopting a common data model

Keeping data research ready

Status: Completed

Insight

Across NHS hospitals, patient data are captured using different electronic patient record (EPR) systems. Combining these data into a single format that is useful for research is difficult, because different EPR systems may use different data structures, terminology and units of measurement.

There are also information governance complexities when combining data from different sources. Even when researchers have permission to use data in the research environments of different organisations, moving data between locations can be costly, complicated and time-consuming.

To compare or combine data from multiple sources on a like-for-like basis, the source data need to be mapped to an agreed Common Data Model (CDM).
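As a rough illustration of what such a mapping involves, the sketch below takes the same kind of observation (a body weight) from two imaginary EPR systems that use different local codes and units, and converts both into one shared row shape. The field names follow the OMOP CDM measurement table, but the system names, local codes and concept IDs are hypothetical placeholders chosen for this example; real concept IDs come from the OMOP standard vocabularies.

```python
from dataclasses import dataclass

# Hypothetical placeholder concept IDs, used only for this sketch.
# Real OMOP concept IDs come from the standard vocabularies.
WEIGHT_CONCEPT_ID = 1001
KILOGRAM_CONCEPT_ID = 2001

# Hypothetical local codes from two imaginary EPR systems.
# Both codes mean "body weight" and map to the same standard concept.
LOCAL_CODE_TO_CONCEPT = {
    ("epr_a", "WT"): WEIGHT_CONCEPT_ID,
    ("epr_b", "body_weight_lb"): WEIGHT_CONCEPT_ID,
}

# Conversion factors from each source unit into kilograms.
TO_KILOGRAMS = {"kg": 1.0, "lb": 0.45359237}


@dataclass(frozen=True)
class Measurement:
    """A small subset of the fields in the OMOP CDM measurement table."""
    person_id: int
    measurement_concept_id: int
    value_as_number: float
    unit_concept_id: int
    measurement_source_value: str  # original local code, kept for traceability
    unit_source_value: str         # original unit, kept for traceability


def to_measurement(system, person_id, local_code, value, unit):
    """Map one source record to the shared structure, terminology and unit."""
    concept_id = LOCAL_CODE_TO_CONCEPT[(system, local_code)]
    value_kg = round(value * TO_KILOGRAMS[unit], 1)
    return Measurement(
        person_id=person_id,
        measurement_concept_id=concept_id,
        value_as_number=value_kg,
        unit_concept_id=KILOGRAM_CONCEPT_ID,
        measurement_source_value=local_code,
        unit_source_value=unit,
    )


row_a = to_measurement("epr_a", 1, "WT", 70.0, "kg")
row_b = to_measurement("epr_b", 2, "body_weight_lb", 154.0, "lb")
print(row_a)
print(row_b)
```

After mapping, both rows carry the same concept and unit, so they can be compared directly, while the original code and unit are retained alongside for audit.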

Intervention

The Observational Medical Outcomes Partnership (OMOP) CDM is an open community data standard, designed to standardise the structure and content of observational data and to enable efficient analyses that can produce reliable evidence (1). Data from different EPR systems can be mapped to a CDM to enable collaborative research, large-scale analytics and the sharing of sophisticated tools and methodologies. Standardised data also mean that research studies can be carried out across multiple locations and much larger populations, with a greater diversity of data subjects, which in turn improves the validity and applicability of the results. Standardisation also helps maintain data security, as researchers can run the same query across multiple secure data environments without the data ever having to move or leave their secure location.
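The idea of running one query in several places can be sketched as follows. Two in-memory SQLite databases stand in for two secure data environments that share the same table layout (a cut-down version of the OMOP condition_occurrence table). The same query text runs at each site and only an aggregate count is returned. The site names, the rows and the concept ID are hypothetical placeholders, and real environments add governance and disclosure controls that are not modelled here.

```python
import sqlite3

# The same query text is sent to every site; only a count comes back.
COUNT_QUERY = """
SELECT COUNT(DISTINCT person_id)
FROM condition_occurrence
WHERE condition_concept_id = ?
"""

# Hypothetical placeholder concept ID for the condition of interest.
CONDITION_CONCEPT_ID = 9001


def make_site(rows):
    """Create an in-memory database standing in for one secure environment."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE condition_occurrence "
        "(person_id INTEGER, condition_concept_id INTEGER)"
    )
    conn.executemany("INSERT INTO condition_occurrence VALUES (?, ?)", rows)
    return conn


def count_at_site(conn, concept_id):
    """Run the shared query locally and return only the aggregate result."""
    (count,) = conn.execute(COUNT_QUERY, (concept_id,)).fetchone()
    return count


# Made-up rows: (person_id, condition_concept_id)
sites = {
    "site_a": make_site([(1, 9001), (1, 9001), (2, 9002)]),
    "site_b": make_site([(10, 9001), (11, 9001), (12, 9002)]),
}

site_counts = {
    name: count_at_site(conn, CONDITION_CONCEPT_ID)
    for name, conn in sites.items()
}
# Summing assumes no person is recorded at more than one site.
total = sum(site_counts.values())
print(site_counts, total)
```

Because every site uses the same table and column names, the query does not need to be rewritten for each one, and no row-level data leave any site.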

Health Innovation East expanded on an approach called ‘omop es’, initially developed by University College London Hospitals NHS Foundation Trust (UCLH) to map data from its internal EPR system, Epic, to OMOP, so that the same approach could be adapted for different EPR systems in health and care. Funding for this work was secured by Professor Serena Nik-Zainal from NHS England as part of an ongoing data infrastructure project at the NIHR Cambridge Biomedical Research Centre.

Using data to optimise treatments used for multiple sclerosis

Multiple sclerosis (MS) is a condition that affects the brain and spinal cord, causing a wide range of potential symptoms, including problems with vision, arm or leg movements, sensation or balance (2). There is currently no cure for MS, but several treatments can help control the condition and ease symptoms (2). In the UK, 1 in 400 people, more than 150,000 in total, have MS (3).

Professor Stephen Sawcer is leading work at the National Institute for Health and Care Research (NIHR) Cambridge Biomedical Research Centre to find ways to optimise treatments for MS, to reduce disability and prolong healthy life. Professor Sawcer’s group use genetic data alongside patient record data to identify genetic variants associated with MS. This helps them better understand the immunological and neurobiological consequences for patients: how MS progresses differently in different people, and why people respond differently to treatments.

Professor Sawcer’s research group is using clinical data from an MS research study in which patients’ data were held in two places: a secure environment, CYNAPSE, and the Cambridge University Hospitals NHS Foundation Trust (CUH) EPR system, Epic. The data in these two systems could not easily be compared. By using omop es to convert the data, data from the two separate data models can be processed within two minutes and compared like-for-like. The process optimises data for research, allows them to be easily updated or refreshed, and is replicable and efficient, supporting the NHS’s drive to increase transformational research.

Impact

Being able to map data quickly and efficiently is significant for any research collaboration – locally, nationally or internationally.

In the East of England, OMOP is increasingly used in research environments including the East of England Sub-National Secure Data Environment (EoE SN-SDE).

Embedding routine automated extraction of EPR data and mapping to the OMOP CDM would allow research datasets to be kept up to date and ‘research ready’ without the need for human intervention. This would make datasets and cohorts easily ‘findable, accessible, interoperable, and reusable’ (FAIR), in many cases removing the need for further data curation, and would reduce the time needed to generate up-to-date, standardised and interoperable data for a given cohort to minutes rather than days or weeks.

References

(1) Observational Health Data Sciences and Informatics. (2024). OMOP Common Data Model. [Online]. OMOP Common Data Model. Available at: https://ohdsi.github.io/CommonDataModel/ [Accessed 9 August 2024].

(2) NHS. (2022). Multiple sclerosis. [Online]. NHS. Last updated: 22 March 2022. Available at: https://www.nhs.uk/conditions/multiple-sclerosis/ [Accessed 9 August 2024].

(3) MS Society. (2024). MS in the UK. [Online]. MS Society. Available at: https://www.mssociety.org.uk/what-we-do/our-work/our-evidence/ms-in-the-uk [Accessed 9 August 2024].

Why this impact story is relevant to you

Healthcare providers

We can help partner you with academic research communities to make better use of the data you hold.

Academics and researchers

We can help you create or connect trusted research environments of your own in a safe, consented and usable way to supercharge your research capabilities.

Innovators

We can connect you with researchers who can help you develop and test your innovations.

Patients and members of the public

We want to build your voice into any programmes which deal with how patient data is used.
