The synergy between GitHub, Python and DataverseNL

19 April 2024

The ASReview project uses state-of-the-art active learning techniques to solve one of the most interesting challenges in systematically screening large amounts of text: there’s not enough time to read everything!

As a spin-off from the ASReview project, the SYNERGY dataset (De Bruin et al. 2023) in DataverseNL can be downloaded via a Python package. The fully open dataset contains information on 26 systematic reviews; details can be found on GitHub.
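
As a rough illustration of what working with the Python package might look like, the sketch below loads the inclusion labels of one review and computes the share of relevant records. It assumes the package is installed with `pip install synergy-dataset`, that it exposes `Dataset(name).to_frame()`, and that the resulting table has a `label_included` column. These names and the helpers `inclusion_rate` and `load_labels` are assumptions made for illustration, so check the package documentation on GitHub before relying on them.

```python
# Sketch only: assumes `pip install synergy-dataset` and that the package
# exposes `Dataset(name).to_frame()` with a `label_included` column.


def inclusion_rate(labels):
    """Return the fraction of records labelled as relevant (1)."""
    labels = list(labels)
    if not labels:
        raise ValueError("no labels given")
    return sum(1 for label in labels if label == 1) / len(labels)


def load_labels(name):
    """Download one SYNERGY review and return its inclusion labels.

    Hypothetical helper: the import and column name are assumptions.
    """
    # Imported lazily so inclusion_rate() works without the package.
    from synergy_dataset import Dataset  # assumed API

    frame = Dataset(name).to_frame()  # assumed: one row per record
    return frame["label_included"].tolist()
```

Under these assumptions, `inclusion_rate(load_labels("van_de_Schoot_2018"))` would download one review and report how sparse its relevant records are. The dataset name is used here as an example only.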

Because many variables are available per record (such as titles, abstracts, authors, references, and topics), this dataset is useful for researchers in NLP, machine learning, network analysis, and more. In total, the dataset contains 82,668,134 trainable data points. Its usefulness is also reflected in the number of downloads: over 750,000 as of April 16, 2024.

The SYNERGY dataset is available via DataverseNL, a research data repository co-provided by DANS and participating institutions. DANS manages the technical infrastructure, while the institutions using DataverseNL are responsible for managing and curating the research data deposited in the repository.

Do you have questions about the SYNERGY dataset, or want to add your data? Go to this page.
More information about DataverseNL can be found here.

