Data Safe Havens: Keeping data secure using the "Five Safes" framework
Published on 27 January 2023
To mark Data Protection Day (28 January), we explore how Data Safe Havens prioritise and maintain personal data protection and privacy when it comes to public health data.
Data Safe Havens in Scotland, such as Dundee’s Health Informatics Centre, hold some of the highest-quality, population-wide, longitudinal health data in the world. When it comes to cutting-edge research and the development of public health solutions, this invaluable data can be mined for healthcare innovation.
As the opportunities offered by ‘Big Data’ in healthcare become increasingly evident, it is essential that this data is collected, maintained and accessed securely.
But how exactly do Data Safe Havens keep data secure? And what role do the “Five Safes” play in this?
Ensuring data security via Data Safe Havens
Data Safe Havens (also known as ‘Trusted Research Environments’ or ‘TREs’) provide secure, remote access to Scotland’s wealth of healthcare data for innovation. Dundee’s Data Safe Haven, the Health Informatics Centre (HIC), hosts and manages NHS patient data representing the Tayside and Fife populations. HIC was the first centre in Scotland to operate as a Trusted Research Environment, providing opportunities for the secure, restricted access and use of eHealth data. Before the requested data is made available to an Approved Data User, it needs to be processed within the Data Safe Haven so that a specific individual cannot be reidentified from the dataset.
In Scotland, Data Safe Havens fulfil the needs of the Chief Scientist Office in supporting research using NHS data and must comply with the Safe Haven Charter.
Safeguarding health data to maintain patient confidentiality
Sensitive data managed within a Data Safe Haven is not released externally to Approved Data Users but can be accessed through a remote computing environment managed by the centre. This creates a secure, controlled environment for researchers to access the data they require for healthcare innovation whilst protecting the anonymity of the original patients.
At the University of Dundee’s Health Informatics Centre (HIC), a project-specific data anonymisation process is applied in order to maximise data security and data subject confidentiality. This allows for patient confidentiality to be protected whilst still permitting the accurate linkage of patient data across multiple datasets.
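The article does not describe how HIC's anonymisation process works internally. As a minimal sketch of the general idea, assuming a keyed hash (HMAC-SHA256) is used to derive pseudonyms, the example below shows how a project-specific key can keep linkage across datasets accurate within one project while preventing records from being matched between projects. The function names, keys, field names and records are all hypothetical.

```python
import hashlib
import hmac


def pseudonymise(patient_id: str, project_key: bytes) -> str:
    """Derive a project-specific pseudonym from a patient identifier.

    HMAC-SHA256 is a keyed hash: the same identifier and key always give
    the same pseudonym, while a different key gives an unrelated one.
    """
    digest = hmac.new(project_key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]  # truncated only to keep the example readable


def anonymise(records: list[dict], project_key: bytes) -> list[dict]:
    """Replace the 'patient_id' field of each record with a pseudonym."""
    return [
        {
            "pseudo_id": pseudonymise(r["patient_id"], project_key),
            **{k: v for k, v in r.items() if k != "patient_id"},
        }
        for r in records
    ]


# Hypothetical identifiers, keys and records, for illustration only.
KEY_PROJECT_A = b"secret-key-for-project-a"
KEY_PROJECT_B = b"secret-key-for-project-b"

prescriptions = [{"patient_id": "P001", "drug": "metformin"}]
lab_results = [{"patient_id": "P001", "hba1c_mmol_mol": 53}]

rx_a = anonymise(prescriptions, KEY_PROJECT_A)
labs_a = anonymise(lab_results, KEY_PROJECT_A)
rx_b = anonymise(prescriptions, KEY_PROJECT_B)

# Within project A the two datasets still link on the pseudonym ...
print(rx_a[0]["pseudo_id"] == labs_a[0]["pseudo_id"])
# ... but the same patient has a different pseudonym in project B.
print(rx_a[0]["pseudo_id"] == rx_b[0]["pseudo_id"])
```

In this sketch the key never leaves the party doing the anonymisation, so a researcher who only sees `pseudo_id` values cannot recompute them from a known patient identifier.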
Enabling safe research using the "Five Safes" framework
When it comes to Data Safe Havens, data security and patient confidentiality are the utmost priority. With this in mind, the Office for National Statistics (ONS), alongside other data providers, developed a set of principles which enable data services to provide safe research access to data: the “Five Safes” framework.
- Safe Projects – Approved Data Users may only access relevant, minimal data within the Data Safe Haven, and that data may only be used for research that has passed ethics and government requirements and has demonstrated potential public benefit.
- Safe People – Any researchers or professionals requesting data access must undergo training and authorisation before being granted access to the data they require.
- Safe Places – Data can only be accessed within a restricted, secure environment for analyses and through industry-standard remote access technologies.
- Safe Data – Data is treated to address any confidentiality concerns, and researchers are only given access to the minimum data required to answer their research question.
- Safe Outputs – When all analysis is complete, any data or project work can only be removed from the safe place once it has been screened by the centre’s team of experts, who ensure that all outputs are aggregate-level information or summary statistics, to avoid the risk of re-identification.
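To make the "Safe Outputs" principle more concrete, the sketch below shows one common kind of check applied to aggregate tables: masking counts that fall below a minimum cell size, since very small groups can make individuals identifiable. This is an illustration only. The article does not say which checks the centre's experts apply, and the threshold of 5, the function name and the example table are all assumptions.

```python
def suppress_small_counts(counts: dict[str, int], threshold: int = 5) -> dict[str, str]:
    """Mask any count below `threshold` so that small groups are not disclosed.

    The default threshold of 5 is an assumed value for illustration.
    """
    return {
        group: str(n) if n >= threshold else f"<{threshold}"
        for group, n in counts.items()
    }


# Hypothetical aggregate table: hospital admissions by age band.
admissions_by_age_band = {"0-17": 42, "18-64": 310, "65+": 3}

safe_table = suppress_small_counts(admissions_by_age_band)
print(safe_table)  # {'0-17': '42', '18-64': '310', '65+': '<5'}
```

An automated rule like this would only be a first pass. A masked cell can sometimes be recovered from row or column totals, which is one reason manual screening by experts, as described above, remains necessary.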
Opportunities to enhance projects through safe data
In terms of academic research, the significant value of accessible, accurate data linkage is evident. For example, research data compiled from recruited, consenting individuals within a specific study can be enriched through linkage to additional datasets captured within the Data Safe Haven (e.g., laboratory test results, prescribing history and hospitalisation information). Research data can also be linked to records in external, non-healthcare related sectors, such as education.
However, secure data access and linkage also provide tremendous opportunities for industry professionals and decision-makers within the healthcare sector. As in research studies, data collection can be expensive and labour-intensive. Compiling and analysing data from existing NHS health records can therefore be a more proactive, time-efficient way of reducing project costs.
The Health Informatics Centre’s expertise includes a Data Service that can support Data Safe Haven users with this linkage across datasets. With appropriate approvals from Data Controllers, the Data Team is able to help access new datasets, adding missing identifiers with bespoke look-up tools if required, to enable linkage and facilitate new areas of research. This can be for project-specific use, or can involve hosting a copy of the new dataset to allow future research to proceed more quickly.
HIC works closely with experts in health informatics and research methodology who can advise data users on how to make the best use of the data. They also provide expertise on how to safely make their data available for research, for example through the separation of dataset indexing (adding an anonymised identifier) and linking activities.