Case study

Follow-COVID Cohort: From paper-based data capture to HDR UK Innovation Gateway

In the wake of the COVID-19 pandemic, the Follow-COVID project, led by Dr David Connell at the University of Dundee, investigated long-term health impacts on patients who experienced severe illness.

Published on 3 December 2025

Teams across HIC worked collaboratively to transform paper-based participant questionnaires into a searchable digital cohort on the Health Data Research UK (HDR UK) Innovation Gateway as part of the CO-CONNECT project. This work enabled secure, anonymised search and cross-cohort research access via the Gateway.

HIC’s role and services delivered

  • Secure data collection and entry infrastructure
  • De-identification and metadata generation
  • Data mapping and validation via CaRROT tools
  • Cohort integration into HDR UK Innovation Gateway

HIC managed the manual digitisation of extensive paper questionnaires. The original data, collected across Tayside, Lanarkshire, and the Highlands, were shipped in cardboard boxes for secure handling and manual data entry. Over two months, the University of Dundee’s Health Informatics Centre (HIC) manually entered 47,842 data points from 83 participants.

Upon data entry, personal identifiers were removed to ensure anonymisation. HIC then collaborated on metadata creation using OHDSI’s WhiteRabbit tool, generating descriptive data profiles (counts, types, column headers) necessary for accurately mapping to a common data model.
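As a rough illustration of these two steps, the Python sketch below drops identifier columns from entered records and then builds a simple per-column profile (headers, counts, inferred types). It is not WhiteRabbit and does not reproduce HIC's actual process; the column names and the helper functions `deidentify`, `infer_type` and `profile` are hypothetical, and removing direct identifiers is only one part of anonymisation in practice.

```python
from collections import Counter

# Hypothetical identifier columns; the study's real field names are not given here.
IDENTIFIER_COLUMNS = {"name", "date_of_birth", "address"}


def deidentify(rows, identifier_columns=IDENTIFIER_COLUMNS):
    """Return copies of the rows with the identifier columns dropped."""
    return [
        {col: val for col, val in row.items() if col not in identifier_columns}
        for row in rows
    ]


def infer_type(value):
    """Classify a raw string value as 'empty', 'int', 'float' or 'text'."""
    if value is None or value.strip() == "":
        return "empty"
    for caster, label in ((int, "int"), (float, "float")):
        try:
            caster(value)
            return label
        except ValueError:
            pass
    return "text"


def profile(rows):
    """Summarise each column: non-empty count, inferred types, value frequencies."""
    summary = {}
    for row in rows:
        for col, val in row.items():
            stats = summary.setdefault(
                col, {"non_empty": 0, "types": Counter(), "values": Counter()}
            )
            kind = infer_type(val)
            stats["types"][kind] += 1
            if kind != "empty":
                stats["non_empty"] += 1
                stats["values"][val] += 1
    return summary


# Synthetic example records, invented for illustration only.
rows = [
    {"name": "A", "participant_id": "P001", "fatigue_score": "7"},
    {"name": "B", "participant_id": "P002", "fatigue_score": ""},
]
column_profile = profile(deidentify(rows))
```

A profile of this kind describes the shape of the data without exposing who the records belong to, which is what makes it suitable as input to a mapping exercise.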

HIC supported the mapping of the metadata to the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM), facilitated through the CaRROT-Mapper and CaRROT-CDM tools developed in collaboration with the Universities of Nottingham and Edinburgh. Using synthetic test data, CO-CONNECT and HIC validated the mapping procedures before applying them to real, de-identified study data.
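To give a sense of what a mapping rule involves, the sketch below converts one source record into a few columns of the OMOP `person` table and checks the result against a synthetic record. It is a simplified illustration, not the CaRROT tooling or the rules used for Follow-COVID: the source field names `sex` and `birth_year` and the function `to_person_row` are hypothetical. The gender concept IDs 8507 and 8532 are the standard OMOP concepts for male and female.

```python
# Standard OMOP gender concept IDs; 0 is the OMOP convention for "no matching concept".
GENDER_CONCEPTS = {"male": 8507, "female": 8532}


def to_person_row(source_row, person_id):
    """Map one de-identified source record to a subset of OMOP `person` columns.

    The source keys 'sex' and 'birth_year' are assumed for illustration.
    """
    raw_sex = source_row.get("sex", "")
    return {
        "person_id": person_id,
        "gender_concept_id": GENDER_CONCEPTS.get(raw_sex.strip().lower(), 0),
        "year_of_birth": int(source_row["birth_year"]),
        "gender_source_value": raw_sex,  # keep the original value for traceability
    }


# Validate the rule on a synthetic record before running it on study data.
synthetic = {"sex": "Female", "birth_year": "1961"}
person = to_person_row(synthetic, person_id=1)
```

Checking rules against synthetic records first means mistakes in the mapping surface without anyone needing to inspect real participant data.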

After successfully mapping to the OMOP CDM, HIC made the Follow-COVID dataset discoverable through HDR UK’s Cohort Discovery Tool, which does not itself hold any patient data, empowering researchers to find and request access to the cohort securely and efficiently.

Impact and benefits

  • Timely and inclusive data collection
  • High-fidelity, privacy-preserving data integration
  • Greater research access and collaboration

Although digital tools are often assumed to be essential, the paper-based approach allowed data collection to begin rapidly and without technological barriers, capturing recruits’ interviews verbally and comfortably. This was a critical advantage during an emergency public health response.

Through meticulous manual entry and de-identification, HIC preserved data quality and participant confidentiality. Metadata and OMOP CDM mapping allowed the cohort to be standardised, enabling cross-study searches and integrating with broader research infrastructure.

Through the integration of Follow-COVID into the HDR UK Gateway, the visibility and discoverability of the cohort were expanded in line with FAIR data principles, making the dataset more Findable, Accessible, Interoperable, and Reusable. This supports secure, governed access for external researchers and enables collaboration on COVID’s long-term effects across UK datasets without compromising privacy or data ownership.

“HIC added real value by turning a local, paper-based study into a high-quality digital cohort discoverable nationwide amplifying the reach and impact of Follow-COVID.”

Dr David Connell, Principal Investigator

Story category: Research