"Awesome Research Data Management" is a curated list of awesome RDM resources for researchers and organizations.
"Research data are objects that you use and produce during your research life cycle, encompassing datasets, software, code, workflow, models, figures, tables, images and videos, interviews, articles. Data are your research asset." The Turing Way / Guide for Reproducible Research / Research Data Management / Research Data
Research "data management refers to the storage, access and preservation of data produced from a given investigation. Data management practices cover the entire lifecycle of the data, from planning the investigation to conducting it, and from backing up data as it is created and used to long term preservation of data deliverables after the research investigation has concluded. Specific activities and issues that fall within the category of data management include: File naming (the proper way to name computer files); data quality control and quality assurance; data access; data documentation (including levels of uncertainty); metadata creation and controlled vocabularies; data storage; data archiving and preservation; data sharing and reuse; data integrity; data security; data privacy; data rights; notebook protocols (lab or field)." CODATA RDM-Terminology / RDM
In a wider sense, research data management also includes research information management and research knowledge management.
- General resources
- RDM for researchers
- RDM for organizations
- Discipline-specific RDM
- Discipline-specific tools
- Discipline-specific repositories
- Domain-specific NFDI consortia
- Research data management (RDM) open training materials ZENODO Community
- Data Management Training (DMT) Clearinghouse is a registry for online learning resources focusing on RDM
- Data Management Skillbuilding Hub is a repository for open educational resources regarding data management
- FAIRsharing is a curated, informative and educational resource on data and metadata standards, inter-related to databases and data policies
- re3data is a registry of research data repositories
- BARTOC registry of terminology registries
- The Basel Register of Thesauri, Ontologies and Classifications (BARTOC) includes all types of KOS in any format, across all subject areas.
- FAIRsharing is a curated, informative and educational resource on data and metadata standards, inter-related to databases and data policies
- Linked Open Vocabularies (LOV) is a directory of RDF vocabularies
- Linked Data Catalogue
- Elixir RDMkit is an RDM toolkit for Life Sciences [git]
- JISC RDM Toolkit
- RDMtoolkit of the library at the University of Western Australia
- Data Management Toolkit @ UNH at University of New Hampshire
- LEARN Toolkit of Best Practice for RDM by fosteropenscience
- Research Data Toolkit at UNC library
- Data Management Expert Guide CESSDA is an advanced guide designed by European experts for social science researchers
- Essentials 4 Data Support is an introductory course
- MANTRA Research Data Management Training is a classic online course
- Datatree - Data Training Engaging End-users is a course for research students and early career researchers in the environmental sciences.
- Research Data Management and Sharing is a Coursera course by Helen Tibbo & Sarah Jones
- Carpentries-style lesson on Research Data Management [git]
- McMaster RDM Webinars [git]
- Research Data Management with DataLad [git]
- RDM at Griffith Uni Library is a short guide for self-paced RDM [git]
- RDM E-Learning Platform is an e-learning website on research data management by HTW Chur and HEG Genf
- PARTHENOS training is a set of training modules in digital humanities and research infrastructures
- FOSTER open science courses
- Managing and Sharing Research Data is an introductory course
- Research data bootcamp is a general online course from University of Bristol
- Data Management Short Course for Scientists by the ESIP Federation in cooperation with NOAA and the Data Conservancy
- RDM Knowledge Base by Uni Bochum
- EUDAT Training
- EOSC-Pillar: RDM Training and support catalogue
- Library Carpentry: FAIR Data and Software [git]
- Research Data Management Promotion Materials [git]
- Data Steward Certificate Course (paid) at the Vienna University Library.
- openAIRE Workshops on various open access and open science topics.
- Essentials 4 Data Support course by Research Data Netherlands (RDNL).
- The Turing Way: a how to guide for reproducible data science [git]
- A Research Data Management Handbook OpenAIRE
- Research Data Management Handbook by Leiden University Faculty of Archaeology
- Research Data Management and Data Literacies by Koltay Tibor
- The Data Book: Collection and Management of Research Data by Meredith Zozus
- Hand-book of the modern development specialist is a Complete, Illustrated Guide to Responsible Data Usage, Manners, and General Deportment
- Data Management for Social Scientists by Nils B. Weidmann
- The FAIR Guiding Principles for scientific data management and stewardship is a classic paper on FAIR principles
- FAIR principles at go-FAIR.org with detailed descriptions
- The FAIR Cookbook [git]
- Three-point FAIRification Framework "How to go FAIR" at go-FAIR.org
- FAIR Maturity Indicators and Tools
- FAIR Data Week at Uni Mannheim
- "FAIR Data Resources" group at Zotero by Atif Latif
- awesome-fair GitHub repo is a curated list of awesome stuff around the FAIR principles
Use case: a researcher wants to plan, run and finish a research project.
- Benefits of data management by CESSDA
- Benefits of RDM by UP
- Advantages of research data management by UB Erlangen-Nürnberg
- Benefits of good data management by the University of St Andrews
- What are the advantages of managing research data? by UB TUM
- Benefits of Data Management by UCDL
- Benefits of RDM by FU Berlin
- Benefits of Open Data by Emory Library
Check out the research data center at your university. They will guide you in RDM for free.
- Cost-benefit analysis for FAIR research data - Cost of not having FAIR research data by Directorate-General for Research and Innovation (European Commission) and PwC EU Services
- Data management costing tool and checklist by UK Data Service
- Guide on research data management costs by LCRDM
- How to identify and assess RDM costs is an OpenAIRE guide for H2020 grants
- Tips for your proposal by DFG
- Planning and Writing a Grant Proposal: The Basics by The Writing Center at University of Wisconsin – Madison
Data management coordination makes sense only in big research projects.
- Data management coordination in RDMkit
A Data Management Plan (DMP) describes how research data will be handled, ideally before the project has commenced, and ensures the traceability of data during the project and beyond. DMPs are often required in a formalized form when submitting a funding application or during the project period, for example for Horizon Europe or ERC grants. The DFG also asks for information on data management, although this is not explicitly a DMP.
A DMP typically contains the following elements:
- Data Description/Data Collection
- Documentation and Data Quality
- Storage and Backup
- Legal and Ethical Requirements
- Data Sharing and Archiving
- Data Management, who and what?
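The elements above can be turned into a simple checklist while drafting. The following is only a minimal sketch: the section titles are copied from the list above, while the `DmpDraft` class, its fields and the example answer are hypothetical and do not correspond to any funder's template or to any of the DMP tools listed further below.

```python
from dataclasses import dataclass, field

# Section titles copied from the list above; everything else is illustrative.
DMP_SECTIONS = [
    "Data Description/Data Collection",
    "Documentation and Data Quality",
    "Storage and Backup",
    "Legal and Ethical Requirements",
    "Data Sharing and Archiving",
    "Data Management, who and what?",
]


@dataclass
class DmpDraft:
    """Hypothetical container for a DMP draft: one free-text answer per section."""

    project: str
    answers: dict = field(default_factory=dict)

    def missing_sections(self):
        """Return the sections that have no non-blank answer yet."""
        return [s for s in DMP_SECTIONS if not self.answers.get(s, "").strip()]

    def to_markdown(self):
        """Render the draft as a simple Markdown outline, marking gaps as TODO."""
        lines = [f"# Data Management Plan: {self.project}"]
        for section in DMP_SECTIONS:
            lines.append(f"\n## {section}\n")
            lines.append(self.answers.get(section, "").strip() or "TODO")
        return "\n".join(lines)


draft = DmpDraft(project="Example project")
draft.answers["Storage and Backup"] = "Daily backup to the university file server."
print(draft.missing_sections())
print(draft.to_markdown())
```

Keeping the outline in a plain structure like this makes it easy to see which sections are still empty before copying the text into the template your funder actually requires.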
DMPs and Research Funding in Germany:
DMP requirements differ depending on the funding institution within Germany (see below).
Funding Institution | DMP Requirements | DMP Template |
---|---|---|
German Research Foundation (DFG) | Not explicitly a DMP, but information on data management is usually required in section 2.4 of the application. There are also some subject-specific and program-specific recommendations on how to handle research data in grant applications, but researchers are not obligated to go further than the general guidance provided. See DFG Guidelines on the Handling of Research Data. | Yes (unofficially) |
Federal Ministry of Education and Research (BMBF) | There are no general requirements with regard to research data. The requirements are defined individually for each tender. See Federal Ministry of Education and Research. | No |
Volkswagen Foundation | Yes, it is a requirement for research funding from the Volkswagen Stiftung that applicants submit a DMP with their application for funding. See Volkswagen Stiftung Open Science Policy. | Yes |
Baden-Württemberg Stiftung | No | No |
Fritz Thyssen Foundation | No | No |
Hans Böckler Foundation | No | No |
Adapted from CESSDA Training Team (2017 – 2022). CESSDA Data Management Expert Guide. Bergen, Norway: CESSDA ERIC. Table by: CESSDA
And Across Europe:
Funding Institution | DMP Requirements | DMP Template |
---|---|---|
European Research Council (ERC) | Yes, applicants must submit a DMP after the first 6 months of the funding period and must continuously update the DMP if significant changes occur, see ERC, Open Science, Section 2. Research Data in Horizon Europe. | Yes |
Horizon Europe | Yes, applicants must submit a DMP after the first 6 months of the funding period and must continuously update the DMP if significant changes occur, see Horizon Europe. | Yes |
European Science Foundation | No | No |
You may also want to consider the data management standards in your own research field (e.g., Humanities, Social Sciences, Business and Economics) which might inform what you should include in your DMP, in which case, check out the following:
- Standardised Data Management Plan for Educational Research of the Research Data Education Network
- Research Data Management in the Social, Behavioral and Economic Sciences, Chapter 2
- Guidelines for Effective Data Management Plans of the Inter-university Consortium for Political and Social Research
- DMP Wizard by CLARIN-D, Humanities
- DMP template of the EU project PARTHENOS, Humanities
Need some inspiration? You can find examples of DMPs from successful research applications at the Digital Curation Centre (DCC) here, and get an idea of what reviewers might be looking for here.
Applying for funding outside of Germany? You can find out more information about DMP requirements for research funding applications abroad here.
For further information you can also check out the following:
- RDMO Research Data Management Organiser is funded by DFG
- DMPonline is created by Digital Curation Centre (DCC), UK
- Data Stewardship Wizard is created by ELIXIR CZ and NL
- DMPTool by CDL, USA
- Data Management Plan Catalogue by the LIBER Research Data Management Working Group
- Practical Guide to the International Alignment of Research Data Management [zenodo]
- Creating a data management plan (DMP) document by OSF
(Research-) Data Policies are guidelines and recommendations for handling research data. They can be on different levels and from different actors, such as:
- Data policies of third-party funders (e.g. DFG or ERC)
- Data policies of publishers and/or journals (e.g. American Economic Association, Oxford University Press, other policies can be found at ReplicationWiki)
- Discipline-specific guidelines (e.g. DFG)
- Guidelines of research institutions (e.g. institute-specific and project-specific)
There are also tips, toolkits and other materials to help institutions and projects develop a data policy. For example, you can find materials from these projects:
- FDMentor: Empfehlungen zur Erstellung institutioneller Forschungsdaten-Policies, Strategischer Leitfaden zur Etablierung einer institutionellen Forschungsdaten-Policy, RISE-DE
- LEARN: Toolkit of Best Practice for Research Data Management
- Existing data in RDMkit
- Reusing existing data by Ghent University
First, find it.
- Search in a suitable data repository which you can find in Registries.
- Search at
- Check the links in data papers:
- list of data journals at Forschungsdaten.org
- list of data journals at GitHub repo data-journals
- list of data journals at Wikidata
- examples of data journals:
  1. Data by MDPI
  2. Scientific Data by Nature
  3. Data in brief by ScienceDirect
Check the quality of the data. Check the licenses. If you reuse data, cite it.
The focus here is on:
- Reusing existing data
- Collecting new data
General info on collecting data
- Collecting data in RDMkit
Lists of data sources:
- 10 Great Places to Find Free Datasets for Your Next Project
- 21 Places to Find Free Datasets for Data Science Projects
Registries of data repositories:
- FAIRsharing is a curated, informative and educational resource on data and metadata standards, inter-related to databases and data policies
- re3data is a registry of research data repositories
Metadata and data portals:
Methods of collecting data:
- 7 Data Collection Methods in Business Analytics by Harvard Business School
- User guide on creating metadata by Stanford University Libraries offers advice on gathering basic and semantic metadata for research data, including a list of some metadata standards, ontologies, and metadata creation tools.
- Data organization in RDMkit
- Folder structure, file names, and versioning by Swedish National Data Service
- File Naming and Versioning
- Naming files and folders by Imperial College London
- File Naming Conventions & Version Control
- File Naming Conventions: simple rules save time and effort
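As a concrete illustration of the file naming guidance linked above, here is a minimal sketch of one possible convention: an ISO 8601 date, a project slug, a short description and a zero-padded version number, joined by underscores. The pattern itself, and the helper names `build_filename` and `follows_convention`, are assumptions made for this example and are not taken from any specific guide in the list.

```python
import re
from datetime import date

# Assumed convention: YYYY-MM-DD_project_description_vNN.ext
NAME_PATTERN = re.compile(
    r"^\d{4}-\d{2}-\d{2}_[a-z0-9-]+_[a-z0-9-]+_v\d{2,}\.[a-z0-9]+$"
)


def slug(text):
    """Lowercase the text and replace runs of other characters with one hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_filename(project, description, day, version, extension):
    """Build a file name that follows the assumed convention."""
    ext = extension.lstrip(".").lower()
    return f"{day.isoformat()}_{slug(project)}_{slug(description)}_v{version:02d}.{ext}"


def follows_convention(name):
    """Check whether an existing file name matches the assumed convention."""
    return NAME_PATTERN.match(name) is not None


name = build_filename("RDM Survey", "raw responses", date(2023, 5, 17), 2, "csv")
print(name)  # 2023-05-17_rdm-survey_raw-responses_v02.csv
print(follows_convention(name))                    # True
print(follows_convention("final FINAL (2).xlsx"))  # False
```

Putting the date first makes files sort chronologically in a directory listing, and avoiding spaces and special characters keeps the names safe across operating systems and scripts.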
Separate storage for sensitive data
- Data backups 101: A complete guide for 2023
- Data Backup Strategies for Your PhD/Research Data
- Backup, Storage & Security
- The Ultimate Guide to Data Cleaning by Omar Elgabry
- What Is Data Cleaning and Why Does It Matter? by Will Hillier
- Data cleaning tutorial at Kaggle
- Top ten ways to clean your data from Microsoft
- 15 Data Exploration techniques to go from Data to Insights
- Comprehensive data exploration with Python
- 11 Open Source Data Exploration Tools You Need to Know in 2023
- Data Exploration in R (9 Examples) | Exploratory Analysis & Visualization
- Data Interpretation by Unacademy
- Data Interpretation: Method, Types, Tips with Solved Examples by Testbook
- Basic data interpretation by University of Portsmouth
- Data anonymization at Wikipedia
- Anonymisation and Pseudonymisation at University College London
- Anonymising quantitative data by UK Data Service
- Anonymisation and pseudonymisation at dataprotection.ie
Topics:
- Data Masking
- Pseudonymisation
- Aggregation
- Derived Data
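The first three topics above can be sketched in a few lines using only the Python standard library. This is an illustrative sketch, not a vetted anonymisation procedure: the record fields, the `SECRET_KEY` value and the helper names are all hypothetical, and keyed hashing (HMAC-SHA256) is just one of several ways to produce pseudonyms.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical key: in practice, store it separately from the dataset.
SECRET_KEY = b"replace-with-a-secret-key"


def mask_email(email):
    """Data masking: keep only the first character of the local part."""
    local, _, domain = email.partition("@")
    return local[:1] + "***@" + domain


def pseudonymise(identifier, key=SECRET_KEY):
    """Pseudonymisation: replace an identifier with a keyed hash (HMAC-SHA256).

    The digest is truncated to 12 hex characters for readability, which
    raises the collision risk for very large datasets.
    """
    digest = hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:12]


def age_band(age, width=10):
    """Aggregation: report a range such as '30-39' instead of the exact age."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


# Made-up example records.
records = [
    {"name": "Alice Example", "email": "alice@example.org", "age": 34},
    {"name": "Bob Example", "email": "bob@example.org", "age": 37},
    {"name": "Carol Example", "email": "carol@example.org", "age": 52},
]

safe = [
    {
        "id": pseudonymise(r["name"]),
        "email": mask_email(r["email"]),
        "age_band": age_band(r["age"]),
    }
    for r in records
]
band_counts = Counter(r["age_band"] for r in safe)
print(safe[0]["email"])   # a***@example.org
print(dict(band_counts))  # {'30-39': 2, '50-59': 1}
```

Note that anyone holding the key can recompute the pseudonyms from the original names, so pseudonymised records can still be re-linked and are not anonymous.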
- Data Protection Notice for Research Funding from the German Research Foundation (DFG)
- Germany - Data Protection Overview
- Data protection in research by Helmholtz.de
- Data provenance in RDMkit
- FAIRmat Guide to Legal Aspects in Research Data Management
- Research Data Management - Legal and Practical Aspects by Anja Perry & Jan-Ocko Heuer (2022, July 15)
- Research Data Management - Legal Aspects by Maurice Schleußinger (2019, July 24)
The difference between sharing, publishing & archiving is:
- sharing: any way of passing on information, which could even mean emailing. It also means making research data available throughout the research lifecycle, especially during the active research phase, typically via cloud storage.
- publishing: citable artifact, discoverable.
- archiving: long-term preservation.
There are many benefits to sharing data.
You can share the data via GitHub
Data journals:
- list of data journals at Forschungsdaten.org
- list of data journals at GitHub repo data-journals
- list of data journals at Wikidata
"from infographics to narrative reports, case studies and long form investigative articles, to graffiti or conceptual art"
- License chooser
- RDA & CODATA Legal Interoperability Of Research Data: Principles And Implementation Guidelines
- How do I license my research data? OpenAIRE
- What is the most appropriate license for my data?
- The Legal Side of Open Source
For restricted access data:
- Restrictive Licence Template
- Data Availability Statements for Restricted Data
- Aligning restricted access data with FAIR: a systematic review
- LEARN Project resources are resources to help Research Performing Institutions manage their research data
- Engaging Researchers with Data Management: The Cookbook [pdf]
- How to Develop RDM Services - a guide for HEIs
- The Realities of Research Data Management. Part Four: Sourcing and Scaling University RDM Services
- Ten simple rules for starting FAIR discussions in your community
- DOI registration agencies is a list of current DOI registration agencies
- URN is a list of all registered namespaces provided by the Internet Assigned Numbers Authority (IANA)
- Auffinden-Zitieren-Dokumentieren by ZBW, GESIS and RatSWD
- Ending principles for digital humanities projects
- Paper: Digitale Werkzeuge zur textbasierten Annotation, Korpusanalyse und Netzwerkanalyse in den Geisteswissenschaften that sums up and explains different tools
- The Programming Historian provides tutorials about DH topics for humanists
- OpenMethods.Dariah is a list of digital humanities tools and methods
- CLARIN-D is a research infrastructure that helps researchers in the Humanities, Cultural and Social Sciences with accessing, preparing and analysing research data
- TAPoR is a list of research tools for text analysis
- SSH Open Market Place is a place for resources for research in Social Sciences and Humanities
- BAS is a set of tools for speech sciences and technology
- TextGrid is a virtual research environment for the humanities that is optimised for working with TEI-coded resources
- Awesome Digital Humanities is a curated list of tools, resources, and services supporting the Digital Humanities
- Patrick Sahle's Catalog of Digital Scholarly Editions
- RIDE – A review journal for digital editions and resources
- TEI Publisher is a software for publishing digital editions
- ediarum is a software for creating and publishing digital editions
- KONDE - Kompetenzzentrum Digitale Edition is a guideline to publish a digital edition
- Dig-Ed-Cat is a Catalogue of Digital Editions
There are 26 domain-specific NFDI consortia aiming to ensure FAIR data in Germany.
- BERD@NFDI: NFDI for Business, Economic and Related Data
- KonsortSWD: Consortium for the Social, Educational, Behavioural and Economic Sciences
- NFDI4Culture: Consortium for Research Data on Material and Immaterial Cultural Heritage
- NFDI4Memory: The Consortium for the Historically Oriented Humanities
- NFDI4Objects: Research Data Infrastructure for the Material Remains of Human History
- Text+: Language and text-based research data infrastructure
- NFDI4DataScience: NFDI for Data Science and Artificial Intelligence
- NFDI4Energy: National Research Data Infrastructure for Interdisciplinary Energy System Research
- NFDI4Ing: NFDI for Engineering Sciences
- NFDI-MatWerk: National Research Data Infrastructure for Materials Science and Materials Engineering
- NFDIxCS: National Research Data Infrastructure for and with Computer Science
- DataPLANT: Plant research data
- FAIRagro: FAIR Data Infrastructure for Agrosystems
- NFDI4Immuno: National Research Data Infrastructure for Immunology
- GHGA: German Human Genome-Phenome Archive
- NFDI4Biodiversity: Biodiversity, Ecology and Environmental Data
- NFDI4BIOIMAGE: National research data infrastructure for microscopy and bioimage analysis
- NFDI4Health: NFDI for Personal Health Data
- NFDI4Microbiota: NFDI for Microbiota Research
- DAPHNE4NFDI: Data from PHoton and Neutron Experiments for NFDI
- FAIRmat: FAIR Data Infrastructure for Condensed-Matter Physics and the Chemical Physics of Solids
- NFDI4Cat: NFDI for sciences related to catalysis
- MaRDI: Mathematical Research Data Initiative
- NFDI4Chem: Chemistry consortium for the NFDI
- NFDI4Earth: NFDI Consortium Earth System Sciences
- PUNCH4NFDI: Particles, Universe, NuClei and Hadrons for the NFDI