How to deposit personal data?
The European Data Protection Day marks the anniversary of the Council of Europe’s Convention 108 on the protection of personal data, the first legally binding international instrument on data protection. It is celebrated every year on 28 January by the 47 countries of the Council of Europe and the EU institutions. As a researcher, you are responsible for protecting the privacy of the data you collect during your research. This article gives you a quick overview of how best to do this.
Datasets containing personal data can be deposited in a repository, e.g. one of the DANS Data Stations, but there are additional things to consider. For example, you need ‘informed consent’ from the research participants. You can also choose to publish your dataset under ‘restricted access’, which means that users must request your permission before they can access the data.
What is personal data?
Personal data within the meaning of the General Data Protection Regulation (GDPR) are data that can directly or indirectly identify a living person. Examples include names, identification numbers, location data, online identifiers, or elements characterising a person’s physical, physiological, genetic, mental, economic, cultural, or social identity. Anonymous data, where the identifying information has been removed and no key exists to convert it back, are not considered personal data. However, research data are most often pseudonymized: the identifying information is removed, but a key linking the data to the identifiable information still exists. Pseudonymized data are considered personal data and fall under the GDPR.
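To make the difference between pseudonymized and anonymous data concrete, here is a minimal sketch in Python. The function name `pseudonymize`, the field names, and the sample records are all hypothetical and chosen for illustration only; they are not part of any DANS tool. The sketch replaces a directly identifying field with a random code and returns the linking key separately. As long as that key exists, the data remain personal data under the GDPR.

```python
import secrets


def pseudonymize(records, id_field="name"):
    """Replace the identifying field with a random code.

    Returns the pseudonymized records and the linking key. The key must be
    stored separately and securely; while it exists, the data are still
    personal data.
    """
    linking_key = {}  # original identifier -> pseudonym
    pseudonymized = []
    for record in records:
        identifier = record[id_field]
        if identifier not in linking_key:
            code = "P-" + secrets.token_hex(4)
            # Regenerate in the unlikely event of a duplicate code.
            while code in linking_key.values():
                code = "P-" + secrets.token_hex(4)
            linking_key[identifier] = code
        # Copy every field except the identifying one.
        cleaned = {k: v for k, v in record.items() if k != id_field}
        cleaned["participant_id"] = linking_key[identifier]
        pseudonymized.append(cleaned)
    return pseudonymized, linking_key


# Hypothetical sample data for illustration.
records = [
    {"name": "Participant A", "score": 7},
    {"name": "Participant B", "score": 5},
    {"name": "Participant A", "score": 9},
]
data, linking_key = pseudonymize(records)
```

Note that removing the name alone is rarely enough: other fields may still identify a person indirectly, and destroying the key does not by itself make a dataset anonymous.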
Protecting your data
There are many tools and guidelines available to help you protect your data or de-identify it. One way of de-identifying data is recoding: for example, date of birth to year of birth, postal code to numerals only, or occupation to a standard classification (in Dutch). The appropriate de-identification method always depends on context, because you need to weigh the gain in anonymity against the loss of information.
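The recoding examples above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a recommended procedure: the function `recode`, the field names, and the `OCCUPATION_CODES` mapping are hypothetical, and the codes are placeholders rather than entries from a real classification.

```python
import re
from datetime import date

# Hypothetical mapping from free-text occupations to placeholder codes.
# In practice you would use a real standard classification.
OCCUPATION_CODES = {
    "nurse": "OCC-01",
    "teacher": "OCC-02",
}


def recode(record):
    """Return a coarser version of a record with less identifying detail."""
    return {
        # Date of birth -> year of birth only.
        "year_of_birth": record["date_of_birth"].year,
        # Postal code -> numerals only (all non-digit characters removed).
        "postal_code_digits": re.sub(r"\D", "", record["postal_code"]),
        # Occupation -> classification code, or "unknown" if not in the mapping.
        "occupation_code": OCCUPATION_CODES.get(
            record["occupation"].strip().lower(), "unknown"
        ),
    }


# Hypothetical sample record for illustration.
original = {
    "date_of_birth": date(1985, 3, 14),
    "postal_code": "2593 HW",
    "occupation": "Nurse",
}
recoded = recode(original)
```

Whether this level of coarsening is sufficient depends on the dataset: in a small sample, a combination of birth year, postal area, and occupation may still point to one person.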
We have listed some tools that can help you protect personal data:
- SURF Wikiwijs e-learning module ‘Privacy in research’
- CESSDA Data Management Expert Guide – chapter ‘Protect’
- Universities of The Netherlands (UNL): Guideline for using personal data in scientific research (this guideline is currently being finalised; Dutch only)
- European Data Protection Board (EDPB): GDPR: Guidelines, Recommendations, Best Practices
How to make qualitative data reusable?
Qualitative data, such as interview or case study data, can be challenging to share because they are rich and complex and difficult to de-identify without losing crucial information. We have therefore created a practical guidebook, ‘Making Qualitative Data Reusable’. The guidebook gives an overview of the challenges specific to making qualitative data reusable and offers guidance on how to improve reusability at every stage of the research data life cycle. It also includes a decision tree that researchers and data stewards can use to evaluate which options for making qualitative data reusable are most relevant to their projects. The guidebook and decision tree are both available on Zenodo.
Depositing in a DANS Data Station
DANS Data Stations are domain-specific repositories that provide a secure digital environment in which individual researchers or groups of researchers can store datasets, with comprehensive metadata, version control, and tools to auto-complete information. Datasets in these Data Stations can also be linked to data portals, platforms, and specific websites, making them even more findable and reusable for both scientific and non-scientific users. Your research data can be digitally archived and shared under all Creative Commons open licences, but you can also protect sensitive (personal) data and determine per dataset whether other users can access the data. If a dataset should only become available after a certain period of time, it can be stored with an embargo.
If you would like more information about one of the Data Stations or need assistance depositing datasets, please contact the Data Station Managers, attend our weekly online Open Hour on Monday mornings, or send us a message via our contact form.