Your study can be run using aggregate data
You have indicated that you do not need to be able to distinguish individuals.
It is likely you will be able to make use of aggregate data, in which data about individuals are grouped by similar characteristics and summarised. These datasets provide statistical summaries without giving data users individual-level data points.
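As a rough illustration of what "grouped by similar characteristics and summarised" can mean in practice, the sketch below turns a handful of individual-level records into one summary row per group. The records, the field names (age_band, region, systolic_bp) and the aggregate helper are all invented for this example; they are assumptions, not part of any real dataset or provider's method.

```python
from collections import defaultdict

# Hypothetical individual-level records, invented purely for illustration.
records = [
    {"age_band": "40-49", "region": "North", "systolic_bp": 128},
    {"age_band": "40-49", "region": "North", "systolic_bp": 134},
    {"age_band": "40-49", "region": "South", "systolic_bp": 121},
    {"age_band": "50-59", "region": "North", "systolic_bp": 141},
    {"age_band": "50-59", "region": "North", "systolic_bp": 137},
    {"age_band": "50-59", "region": "North", "systolic_bp": 145},
]


def aggregate(rows, group_keys, value_key):
    """Group rows by shared characteristics and summarise each group."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[k] for k in group_keys)
        groups[key].append(row[value_key])
    # Only the count and the mean leave this function, not the individual values.
    return {
        key: {"count": len(values), "mean": sum(values) / len(values)}
        for key, values in groups.items()
    }


summary = aggregate(records, ["age_band", "region"], "systolic_bp")
for key, stats in sorted(summary.items()):
    print(key, stats)
```

A data user who receives only the summary sees one row per combination of age band and region, rather than one row per person.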
Does the use of aggregate data call for approval?
Using aggregate data is likely to be safe… but it might not be.
Data are considered ‘safe’ when they are ‘effectively anonymised’, i.e., when the risk of re-identifying individuals is low enough in the eyes of the law. There have been cases in which individuals were re-identified from aggregate data.
Below is a link to a paper investigating re-identification attacks on medical data:
A Systematic Review of Re-Identification Attacks on Health Data | PLOS ONE
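One way aggregate data can still carry re-identification risk is through very small groups: if a group contains only one or two people, its summary may describe them almost directly. The sketch below flags such groups in a table of counts. The table, the small_cells helper and the threshold of 5 are assumptions made for illustration only; the threshold that applies to a real dataset is for the data provider to decide, and this check is not a substitute for the provider's assessment.

```python
# Hypothetical aggregate table: number of people per (age band, region) group.
aggregate_counts = {
    ("40-49", "North"): 212,
    ("40-49", "South"): 187,
    ("90+", "North"): 2,
    ("90+", "South"): 1,
}

# Assumed threshold for illustration; real thresholds are set by the data provider.
MIN_CELL_COUNT = 5


def small_cells(counts, threshold=MIN_CELL_COUNT):
    """Return the groups whose non-zero count falls below the threshold."""
    return {group: n for group, n in counts.items() if 0 < n < threshold}


risky = small_cells(aggregate_counts)
for group, n in sorted(risky.items()):
    print(f"Group {group} has only {n} individual(s)")
```

If a check like this flags any groups, that is a prompt to ask the data provider how small groups were handled when the dataset was assessed.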
If you are using an existing dataset of aggregate data, please check with the data provider that the dataset has been assessed and is considered ‘effectively anonymised’. You can also find more information about data anonymisation and risks of re-identification on the Information Commissioner’s Office (ICO) website.
If the dataset is not considered ‘effectively anonymised’, your use of the data requires approval.
Examples of repositories that host aggregate datasets include:
A note on using existing datasets
Data providers each have their own protocols for granting access to their data. If you plan on using an existing dataset, please factor in appropriate time and resources to go through these data access protocols.
Understanding the data requirements for your project is the first step in your research journey. This tool should have helped you think about the essential considerations for using data in health and social care research. Please make sure that you consider which data are most appropriate for your study and whether your data access needs meet the statutory and legal governance requirements in the UK. It is imperative that data used to develop AI and data-driven interventions are accessed to the highest privacy and ethical standards. Taking the time to identify the data you need, understand what data are available, and consult the relevant people and organisations can help your project start as soon as possible with minimal delays.
We value all feedback
If you would like to tell us about your experience of our Data Tool, then please complete our feedback form:
HRA Data Decision Tool Feedback Form