Secure Data Environments

Source #

The policy guidelines for secure data environments for NHS health and social care data were published on 6th September 2022.

Much of this work is reproduced verbatim below for my own reference under the terms of the OGL (with which I comply by linking the original source above).

Introduction #

In Data saves lives: reshaping health and social care with data, we committed to implementing secure data environments (SDEs) as the default way to access NHS health and social care data for research and analysis. This strategy sets out intentions as well as 12 guidelines to implement SDEs.

The federated data platform (FDP) #

The FDP will enable, and must apply, secure data environment policy for any use of NHS health and social care data beyond direct patient care, for example when using data to support population health management and operational planning. This procurement will also support integrated care systems to implement secure data environment policy.

Guidelines #

Purpose #

These guidelines have been developed to:

strengthen public confidence and trust in the transition to using secure data environments to access NHS health and social care data
provide additional information about the use of secure data environments, as outlined in the Data saves lives strategy
describe the foundations on which the NHS Transformation Directorate will further develop secure data environment policy, in collaboration with the public and expert stakeholders
communicate the direction of travel for secure data environment policy, signalling areas that require further development
communicate the fundamental principles which secure data environments must adhere to

Five safes #

These guidelines are arranged according to the five safes:

safe settings - the environment prevents inappropriate access, or misuse
safe data - information is protected and is treated to protect confidentiality
safe people - individuals accessing the data are trained, and authorised, to use it appropriately
safe projects - research projects are approved by data owners for the public good
safe outputs - summarised data taken away is checked to make sure it protects privacy

Safe settings #

The principle of ‘safe setting’ is about preventing inappropriate access, or misuse, of data.

The safe settings principle will be upheld by secure data environments because data security is integral to their design.

Secure data environments will be the default way to access NHS health and social care data for research and analysis… Instances of analysing or disseminating data outside of a secure data environment will be extremely limited. Any exceptions will require significant justification, such as where explicit consent from clinical trial participants has been obtained.

Secure data environments providing access to NHS health and social care data must meet defined criteria

The design, implementation and management of secure data environments must meet minimum requirements. This will include technical, behavioural, governance, and training specifications.

Secure data environments must maintain the highest level of cyber security to prevent unauthorised access to data

Secure data environments must adhere to the principle of ‘security by design’. All aspects of cyber security must be integrated into the design and implementation of these environments… Security by design will make sure that secure data environments comply with the UK General Data Protection Regulation (UK GDPR) requirement of data protection by design and by default.

Secure data environment owners must be transparent about how data is used within their environment

Owners of secure data environments must be open about the way data is used within their secure data environment. They must be able to detail who is accessing the data and for what purpose. This may be achieved, for example, by organisations ensuring that clear and accessible reporting is in place for their secure data environment… Transparency about how data is used also increases the accountability of data controllers and data users.
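As a rough illustration of what "who is accessing the data and for what purpose" could look like in practice, the sketch below summarises an access log into a publishable report. The guidelines do not prescribe any log format; the `AccessEvent` fields and the `access_report` function are hypothetical names chosen for this example.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AccessEvent:
    """One hypothetical access-log entry; the field names are assumptions."""
    user_org: str       # organisation of the person accessing data
    project_id: str     # approved project the access was made under
    purpose: str        # stated purpose of that project
    accessed_on: date


def access_report(events):
    """Count access events per (organisation, project, purpose)."""
    counts = Counter((e.user_org, e.project_id, e.purpose) for e in events)
    # Sort the keys so the report is stable and easy to compare over time.
    return [
        {"organisation": org, "project": proj, "purpose": purpose, "accesses": n}
        for (org, proj, purpose), n in sorted(counts.items())
    ]


if __name__ == "__main__":
    log = [
        AccessEvent("University A", "P1", "cancer outcomes research", date(2022, 9, 6)),
        AccessEvent("University A", "P1", "cancer outcomes research", date(2022, 9, 7)),
        AccessEvent("Trust B", "P2", "operational planning", date(2022, 9, 7)),
    ]
    for row in access_report(log):
        print(row)
```

The report deliberately names organisations and projects rather than individual users, which is one possible design choice and not something the guidelines specify.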

Safe people #

The principle of ‘safe people’ is about ensuring that individuals accessing data are trained, and authorised, to use it appropriately.

The safe people principle will be upheld by secure data environments by making sure that users are verified before access is granted and are able to access appropriate data only. Patients and the public will also be engaged in decisions about who can access their data.

The secure data environment may only be accessed by appropriate, verified users

Access to NHS health and social care data within a secure data environment must be carefully controlled. Only authorised users will be granted access to data for approved purposes… This will enable a variety of users - with sufficient levels of training, qualifications, and expertise - to analyse NHS health and social care data.
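The access rule above (verified users, approved purposes, appropriate data only) can be sketched as a simple check. This is a minimal illustration under stated assumptions, not how any real secure data environment implements access control: the `User` fields, the `may_access` function and the project-to-dataset mapping are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    """Hypothetical user record; the attributes are illustrative assumptions."""
    user_id: str
    verified: bool              # identity has been verified
    training_complete: bool     # required training is up to date
    approved_projects: set = field(default_factory=set)


def may_access(user, project_id, dataset, project_datasets):
    """Return True only if a verified, trained user requests a dataset
    that belongs to a project they have been approved for."""
    if not (user.verified and user.training_complete):
        return False
    if project_id not in user.approved_projects:
        return False
    # Users see only the datasets approved for that specific project.
    return dataset in project_datasets.get(project_id, set())


if __name__ == "__main__":
    datasets = {"P1": {"hospital_episodes"}}
    alice = User("alice", verified=True, training_complete=True, approved_projects={"P1"})
    print(may_access(alice, "P1", "hospital_episodes", datasets))
    print(may_access(alice, "P1", "gp_records", datasets))
```

The check denies by default: access is granted only when every condition holds, so a missing project or an unknown dataset results in refusal.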

Secure data environments must make sure that patients and the public are actively involved in the decision making processes to build trust in how their data is used

Owners of secure data environments must make sure that the public are properly informed and meaningfully involved in ongoing decisions about who can access their data and how their data is used. For example, by ensuring that relevant technical information is presented in an accessible way (that is, through publishing privacy notices and data protection impact assessments).

Secure data environment owners must also be able to demonstrate that they have undertaken, or plan to undertake, active patient and public involvement activities. Patient and public involvement and engagement (PPIE) activities must follow the NHS Health Research Authority’s principles.

Safe data #

The principle of ‘safe data’ is about making sure that information is protected and is treated to protect confidentiality.

The safe data principle will be upheld by secure data environments by their design and function, which prevents the dissemination of identifiable data.

Data made available for analysis in a secure data environment must protect patient confidentiality

Data must be treated in a secure data environment to protect confidentiality using techniques such as data minimisation and de-identification. De-identification practices mean that personal identifiers are removed from datasets to protect patient confidentiality. This includes techniques such as aggregation, anonymisation, and pseudonymisation. The level of de-identification applied to data may vary based on user roles and requirements for accessing the data.
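To make data minimisation and pseudonymisation concrete, here is a minimal sketch. It assumes a keyed hash (HMAC-SHA256) as the pseudonymisation method, which is one common approach and not one named in the guidelines; the function names, the record fields and the key handling are hypothetical. Pseudonymised data can still be re-identified by whoever holds the key, so it remains personal data under UK GDPR.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed hash (assumed technique: HMAC-SHA256).
    The same identifier and key always give the same pseudonym, so records
    can still be linked without exposing the original value."""
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def minimise(record: dict, allowed_fields: set) -> dict:
    """Data minimisation: keep only the fields the project actually needs."""
    return {k: v for k, v in record.items() if k in allowed_fields}


def de_identify(record: dict, secret_key: bytes, id_field: str, allowed_fields: set) -> dict:
    """Drop unneeded fields and swap the direct identifier for a pseudonym."""
    out = minimise(record, allowed_fields)
    out["pseudo_id"] = pseudonymise(str(record[id_field]), secret_key)
    return out


if __name__ == "__main__":
    key = b"example-key-held-by-the-environment-owner"  # illustrative only
    record = {"nhs_number": "9999999999", "name": "Jane Doe",
              "age_band": "40-49", "diagnosis": "J45"}
    print(de_identify(record, key, "nhs_number", {"age_band", "diagnosis"}))
```

Varying `allowed_fields` per project is one way to express the idea that the level of de-identification may differ by user role and requirement.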

Data protection law will continue to apply. This means there must always be a valid lawful basis for the collection and processing of personal information (including special category information) within secure data environments, as defined under data protection legislation. Where the data being accessed is confidential patient information, the requirements of the common law duty of confidentiality must also be met. More information on this can be found in the Transformation Directorate’s guidance on confidential patient information.

Inputs to a secure data environment must be assessed and approved

Owners of secure data environments must have robust processes in place for checking external inputs before they are approved to enter the environment. This includes data, code, tools, and any other inputs.
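One possible building block for such an input check is to record a cryptographic digest of each input when it is approved, then refuse anything that does not match at the point of entry. The sketch below assumes this digest-based approach; the guidelines do not specify a mechanism, and `sha256_hex`, `check_input` and the manifest structure are hypothetical.

```python
import hashlib


def sha256_hex(content: bytes) -> str:
    """Hex-encoded SHA-256 digest of an input's bytes."""
    return hashlib.sha256(content).hexdigest()


def check_input(name: str, content: bytes, approved: dict) -> bool:
    """Allow an input only if its digest matches the one recorded at approval.
    `approved` maps input name -> SHA-256 hex digest (an assumed manifest format)."""
    expected = approved.get(name)
    # Unknown inputs and inputs changed since approval are both rejected.
    return expected is not None and expected == sha256_hex(content)


if __name__ == "__main__":
    script = b"print('hello')\n"
    manifest = {"analysis.py": sha256_hex(script)}
    print(check_input("analysis.py", script, manifest))
    print(check_input("analysis.py", b"tampered", manifest))
```

A digest match only shows that the input is the one that was reviewed; the review itself (of data, code, or tools) is a separate human and governance process.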

Owners of secure data environments must have processes in place to make sure that the linking of NHS health and social care data with other datasets is carried out within the environment itself. They must also make sure that only approved and appropriately qualified individuals conduct dataset linking.

Safe projects #

The principle of ‘safe projects’ is about making sure that research projects are approved by data owners for the public good.

The safe projects principle will be upheld by secure data environments by:

a) supporting open working practices that deliver efficiencies and improve the quality of analysis and findings
b) making data available for a range of uses intended for the public good

Secure data environments must adhere to a policy of open-working and support code-sharing

Secure data environments must support open working, ensuring that code developed in these environments is reusable. Examples of how this could be achieved include:

applying the principles of the NHS Open Source Code policy
using the Reproducible Analytical Pipelines (RAP) strategy

Code developed in secure data environments must be published in the open unless there is a specific rationale for not doing so. Publication may include making the code available in open repositories. We will engage further on these exceptions and publish guidance in due course.

Secure data environments must be able to support flexible and high-quality analysis for a diverse range of uses

Owners of secure data environments must engage with their intended users to make sure that they provide the necessary functionality and tools required for analysis. A range of users with different requirements and skill sets will need to access data within these environments. They will need to analyse different data to produce different outputs.

All uses of data within secure data environments must be for the public good

The use of NHS health and social care data must be ethical, for the public good, and comply with all existing law. It must also be intended for health purposes or the promotion of health. Data access must never be provided for marketing or insurance purposes.

Safe outputs #

The principle of ‘safe outputs’ is about making sure that any summarised data taken away is checked to confirm that it protects privacy.

The safe outputs principle will be upheld by secure data environments by making sure that the results of analysis contain only aggregated, non-identifiable results that match the approvals of users and their projects.

Outputs from a secure data environment must be assessed and approved and must not identify individuals

All information must be checked before it leaves a secure data environment, including data, code, tools, and any other outputs.

There must be robust processes in place to maintain patient confidentiality and to make sure that outputs align with the intentions of individual projects.
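One widely used output check is small-number suppression, where low counts in an aggregated table are withheld because they could point to individuals. The sketch below is illustrative only: the guidelines do not set a threshold, so the default of 5 is an assumption, and `suppress_small_counts` is a hypothetical helper.

```python
def suppress_small_counts(table: dict, threshold: int = 5) -> dict:
    """Return a copy of an aggregated table with counts below `threshold`
    replaced by None. The threshold value is an assumption for illustration."""
    return {
        group: (count if count >= threshold else None)
        for group, count in table.items()
    }


if __name__ == "__main__":
    admissions_by_region = {"North": 120, "South": 3, "East": 0, "West": 5}
    print(suppress_small_counts(admissions_by_region))
```

Suppression on its own does not guarantee safety (a suppressed cell can sometimes be recovered from row or column totals), so a check like this would complement, not replace, human review of outputs.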

Next steps for secure data environment policy #

Public and patient engagement #

Engaging with patients and the public on how we store and use their data is key to getting secure data environment policy right. We have started this process, having engaged public and patient groups on the contents of a simple explainer for secure data environment policy.

This engagement will scale up from Autumn 2022, forming part of the Data saves lives strategy engagement campaign. Further information about our plans for engagement and how you can get involved will be published this Autumn.

Technical and accreditation guidance #

By the end of 2022, we will publish:

technical guidance for secure data environments, including details about the core capabilities for these environments. This will apply to all environments that provide access to NHS health and social care data for research and analysis
an outline of the accreditation process that all NHS secure data environments will need to meet

We will continue to publish simple explainers alongside this more technical information to make sure that the public understands what we are doing and why.

Delivery and implementation #

The transition to secure data environments for access to NHS health and care data is a positive step forward. However, it is a complex and rapidly developing field, and careful thought must be given to ensure successful implementation. For example, we intend to provide greater clarity on the following in the next phase of this work:

what the target ecosystem of NHS accredited secure data environments should look like and what needs to change to achieve the desired end-state
the requirements of an accreditation process, our overall approach to ensuring compliance, and the capabilities of an accreditation body
exemptions to the use of secure data environments, the justifications required, and how this may change over time as technology develops and platforms improve
a realistic transition timeline for adoption of the policy that is both ambitious and achievable

We will be working with a wide range of stakeholders to develop and publish information about these plans and timescales for transition and welcome all views. This process will also be informed by the NHS’s continued investment in a number of flagship programmes:

the federated data platform (FDP)
NHS Digital’s national secure data environment
sub-national secure data environments