DMP

From EERAdata Wiki

Draft for EERAdata DMP

Our promise

  • The DMP should comply with maDMP (machine-actionable DMP), so the suggested format is XML (e.g., see an example of an maDMP mockup).
  • The DMP should fulfill the following objectives:
    • Manage all data relevant for the project.
    • Adhere to FAIR and open data principles.
    • Establish a proof-of-concept for a DMP blueprint for the low carbon energy research community (e.g. ready to use for H2020 projects or other funding agencies, and supporting recommendation no. 1 of the Technopolis Report: ‘Request systematically a data management plan (DMP) for all energy research applications to H 2020’).
    • Ensure that all project activities comply with the General Data Protection Regulation (GDPR).

In the application we promised that the DMP would fulfill the 10 principles or functionalities of maDMP (following Miksa et al. 2019):

  1. Integrate DMPs with the workflow of all stakeholders in the research data ecosystem: EERAdata will provide a collaborative workspace enabling this functionality.
  2. Allow an automated system to act on behalf of stakeholders: EERAdata will explore the possibility to automatically extract information to DMPs and/or to entries in the federated database (see WP 3). This includes administrative information (e.g. information on funding agency and participants etc. as collected in CORDIS), license information (e.g. the type of license, using the wizard from EUDAT), automated booking of necessary storage (e.g. using repositories such as DATAVERSE), automatic deposits of project data and associated metadata (e.g. towards automated reporting in H2020), and validation & compliance (e.g. by funding agencies).
  3. Make policies for machines and people: EERAdata follows strict templates for documents.
  4. Describe - for both machines and humans - the components of the data management ecosystem: EERAdata tests to what extent this request is general or project-specific (with implications for the suggested DMP blueprint).
  5. Use PIDs and controlled vocabularies: EERAdata follows this in the design of all WP activities.
  6. Follow a common data model for DMPs: EERAdata builds on the models suggested by the Working Group on DMP of the Research Data Alliance (RDA), see the structure below. Moreover, EERAdata involves Ana Slavec (RDA), who is an expert on DMPs, in the advisory board.
  7. Make DMPs available for human and machine consumption: This is the core of EERAdata and its DMP adheres to FAIR and open data principles.
  8. Support data management evaluation and monitoring: EERAdata explores whether the periodic reporting functionality of the EC Portal can be improved through linking to a project’s DMP.
  9. Make DMPs updatable, living, versioned documents: EERAdata understands its DMP as a living document designed for versioning.
  10. Make DMPs publicly available.

The EERAdata DMP has dissemination level ‘PU’ (see D1.3).
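
Principle 9 (a living, versioned document) can be illustrated with a small sketch. It assumes the DMP is held as a Python dictionary with hypothetical `version`, `modified`, and `history` fields; these names are illustrative only and are not taken from the RDA model.

```python
import copy
from datetime import datetime, timezone


def revise(dmp, changes, note):
    """Return a new DMP revision; the input revision is left untouched."""
    revision = copy.deepcopy(dmp)
    revision.update(changes)
    revision["version"] = dmp.get("version", 0) + 1
    revision["modified"] = datetime.now(timezone.utc).isoformat()
    # Append a change-log entry so earlier states stay traceable.
    revision["history"] = dmp.get("history", []) + [
        {"version": revision["version"], "note": note}
    ]
    return revision


# Hypothetical usage: two successive revisions of a draft DMP.
v1 = revise({"title": "EERAdata DMP"}, {}, "Initial draft")
v2 = revise(v1, {"language": "eng"}, "Added language field")
```

Keeping every revision as a separate object is one possible design; storing the DMP file in a version-controlled repository would achieve a similar effect.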

The structure follows the hierarchy of the RDA model and comprises: Contact information, Cost information, Track of changes, Staff involved, Datasets generated (incl. data quality assurance, data identification number, license, distribution, keywords, metadata, type), Description, Ethical issues, Language, Project information, and Title.
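
As a rough illustration of this structure, the sketch below builds a minimal record and serialises it to JSON and XML. The field names are assumptions loosely modelled on the RDA DMP Common Standard and should be checked against the published schema. All values (names, e-mail address, identifiers) are placeholders, and the XML mapping in `to_xml` is a hypothetical convention, not an official serialisation.

```python
import json
import xml.etree.ElementTree as ET

# Placeholder record. Field names are assumptions modelled on the RDA DMP
# Common Standard and must be verified against the published schema.
dmp = {
    "title": "EERAdata Data Management Plan",
    "description": "Draft DMP for the EERAdata project",
    "language": "eng",
    "ethical_issues_exist": "unknown",
    "contact": {"name": "Jane Doe", "mbox": "jane.doe@example.org"},
    "contributor": [{"name": "Jane Doe", "role": ["data manager"]}],
    "cost": [{"title": "Storage", "value": 0, "currency_code": "EUR"}],
    "project": [{"title": "EERAdata"}],
    "dataset": [
        {
            "title": "Example dataset",
            "type": "dataset",
            "keyword": ["low carbon energy", "FAIR"],
            "dataset_id": {"identifier": "placeholder-id", "type": "other"},
            "data_quality_assurance": ["Reviewed by a second partner"],
            "metadata": [{"description": "Placeholder standard", "language": "eng"}],
            "distribution": [
                {"title": "Public copy", "license": [{"license_ref": "CC0"}]}
            ],
        }
    ],
}


def to_xml(tag, value):
    """Map dicts to child elements, lists to repeated elements, scalars to text."""
    element = ET.Element(tag)
    if isinstance(value, dict):
        for key, child in value.items():
            items = child if isinstance(child, list) else [child]
            for item in items:
                element.append(to_xml(key, item))
    else:
        element.text = str(value)
    return element


dmp_json = json.dumps({"dmp": dmp}, indent=2)
dmp_xml = ET.tostring(to_xml("dmp", dmp), encoding="unicode")
```

Holding the record as plain data and generating both serialisations from it keeps the human-readable and machine-readable views in sync.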

Stakeholders of EERAdata DMP

The table aligns with the stakeholders listed in Miksa et al. 2019 and with the definitions of stakeholders provided therein (second column).

Stakeholder | Definition in Miksa et al. 2019 | Specs in EERAdata
Funder | funding agencies and foundations that specify requirements for DMPs and monitor compliance | H2020 program, so our reports and deliverables should be included in the DMP. Link to the other projects funded (through the CORDIS portal).
Ethics review | IRBs/REBs that authorize human subjects research | Our "new" deliverables should be included here and the agencies who use them.
Legal expert | technology transfer offices; copyright and patent lawyers | Name our institution's legal experts here? Links to documents: GA, CA, Project Management Handbook.
Researcher | Principal Investigator and collaborators, including postdoctoral researchers and graduate and undergraduate students | All consortium members with ORCID and ResearchGate?
Publisher | purveyors of article and data publication services | ??? Link to publications of EERAdata, with DOIs.
Repository operator | general (e.g., Zenodo), disciplinary (e.g., GenBank, ICPSR), and institutional data repositories | Project and post-project hosts of EERAdata platform, WIKI, etc. (EERA; AIT; ENEA); additional EERAdata repository at GitHub. During the project: OnlyOffice.
Infrastructure provider | providers of systems for creating DMPs (DMPTool, DMPonline), grants administration, researcher profiles, etc. | ??? Same as hosts for the repositories? In case we use the DMP generation template, e.g. from TU WIEN. Not yet working.
Research support staff | data managers/curators, research administrators, and data librarians | Wider EERAdata consortium with links to admin staff.
Institutional administrator | office of research/sponsored programs, chief information officers, university librarians, others | H2020 EU portal, project officer.

Abbreviations: DMP, data management plan; ICPSR, Inter-university Consortium for Political and Social Research; IRB, institutional review board; maDMP, machine-actionable DMP; REB, research ethics board.

Structure

Chapters

I. Administrative details

  • source project data from CORDIS
  • source consortium member data with roles from OnlyOffice (OO)

II. Data and project management policies

  • data policies as described in (D1, D2, D3)
  • project management policies as described in (CA, GA, Project handbook)

III. Re-using data

  • Linking to other EU projects and existing data hubs and databases: database with links
  • Linking to FAIR/O standards
  • Linking to existing metadata frameworks

IV. Creating and collecting data

  • pool of experts - link to database & consent forms from the other chapter
  • user data collection during workshops - link to the outputs in open repositories (GitHub, storyboard, dataverse, project wiki)
  • production of project output - link to deliverables (stored in repositories of the project, published paper DOIs, project deliverables)

V. Processing data

  • platform specs (incl. WIKI, website, project repositories)

VI. Interpreting data

  • linking to publications, WIKI, platform, project repositories

VII. Preserving data

  • linking to platform
  • preservation policies adhering to FAIR/O
  • persistent identifier for the platform and repositories

VIII. Giving access to data

  • Linking to data policies
  • Linking to project output (pool of expert, deliverables, repositories, platform)
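
One simple compliance check on this outline is to verify that every chapter links to at least one resource. The sketch below assumes a hypothetical representation of the DMP as a mapping from chapter title to a list of links. The chapter titles are taken from the outline above; the example link is a placeholder.

```python
# Chapter titles from the outline, in order (numerals omitted on purpose).
CHAPTERS = [
    "Administrative details",
    "Data and project management policies",
    "Re-using data",
    "Creating and collecting data",
    "Processing data",
    "Interpreting data",
    "Preserving data",
    "Giving access to data",
]


def missing_chapters(dmp_links):
    """Return the chapters that have no linked resource yet, in outline order."""
    return [title for title in CHAPTERS if not dmp_links.get(title)]


# Hypothetical draft: one chapter filled in, one present but still empty.
draft = {
    "Administrative details": ["https://example.org/project-record"],
    "Preserving data": [],
}
gaps = missing_chapters(draft)
```

A chapter with an empty list counts as missing, so the check reports every chapter except "Administrative details" for this draft.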

Options for automated workflows and acting on behalf

  • Collating administrative data: Use Current Research Information Systems (CRIS). At the European level there is euroCRIS. Open question: how can we do this practically and automatically for our DMP? This is a question for OpenAIRE.
  • License selection: If the institutional policy recommends open access publishing and the data do not contain sensitive information, then CC0 could be the default setting for data, and CC BY for text and media. There is already a wizard from EUDAT.
  • Data depositing and compliance/validation checks: not much seems to be available yet.
  • C2Metadata - continuous capture of metadata: http://c2metadata.org/
  • Common Workflow Language (CWL): https://www.commonwl.org/
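
The license default described above can be written as a small rule. This is a sketch under stated assumptions: the function name, its parameters, and the `None` fallback (meaning that a person has to decide) are hypothetical and are not taken from the EUDAT wizard.

```python
def default_license(kind, open_access_recommended, contains_sensitive_data):
    """Suggest a default license, or None when a manual decision is needed.

    kind: "data", "text", or "media".
    """
    if not open_access_recommended or contains_sensitive_data:
        return None  # no automatic default; a person has to decide
    if kind == "data":
        return "CC0"
    if kind in ("text", "media"):
        return "CC BY"
    raise ValueError(f"unknown kind: {kind!r}")


# Hypothetical usage for a non-sensitive dataset under an open access policy.
suggestion = default_license("data", True, False)
```

Returning `None` instead of a restrictive license keeps the automated step conservative: it only acts on behalf of a stakeholder in the case the rule explicitly covers.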

Review of H2020 project DMPs

List of project DMPs

Project DMP and link | Purpose/type of project | Structure/elements | Notes
REEEM, D8.2 DMP | Output: Stakeholder Interaction Portal, a Pathways Diagnostic Tool and an Energy System Learning Simulation. DMP for "data collection to populate models, calibrate them, as well as allow for data exchange between different types of models and different partners" | TOC: Project info, authors, history of changes, project summary, about, principles & summary, 1. DMP checklist (data collection, documentation & metadata, ethics & legal compliance, storage & backup, selection and preservation, data sharing, responsibilities & resources, data project impact assessment), 2. Definition and Matter, 3. Links. | PDF, not an XML document. Not machine-actionable.
HYbuild DMP | Output: develop two innovative compact hybrid electrical/thermal storage systems for stand-alone and district connected buildings. DMP outlines how data are collected or generated by the HYBUILD project, in terms of how they will be organized, stored, and shared. It specifies which data will be open access and which will be confidential within the consortium, as far as it is possible to do so at this stage. The report has been developed following the Horizon 2020 guidelines (EC DG R&I, 2017) with additional guidance from the joint OpenAIRE and EUDAT webinar “How to write a Data Management Plan” (OpenAIRE and EUDAT, 2018) | TOC: Executive summary, Acronyms & abbreviations, Glossary, 1. Introduction (aims of project, relation with other project activities, structure, partner contributions), 2. Approach (data availability and open access, data storage & sharing), 3. Descriptions of datasets (template, plus 39 individual data set descriptions), 4. Conclusions, 5. References | PDF, not an XML document. Not machine-actionable.
RESLAG D1.2 DMP | Four large-scale demonstrations to recycle steel slag are considered: Extraction of non-ferrous high added metals; TES for heat recovery applications; TES to increase dispatchability of the CSP plant electricity; Production of innovative refractory ceramic compounds. DMP is to ensure the accessibility and intelligibility of the data that will be generated during the RESLAG project in order to comply with the Guidelines of the “Open Research Data Pilot” (annex II). | TOC: Ex. sum, nomenclature, list of figs & tabs, 1. Intro, 2. Metadata strategy & standardization, 3. Fact sheet (data set descriptions, data set metadata), 4. Data sharing, 5. Storage and preservation, Conclusion, two annexes | No maDMP, link to Zenodo repository.
PANTERA, D.15 DMP | Identify and implement initiatives aimed at raising the participation of EU countries in the needed R&I for developing technologies, systems and markets in support of the common energy market and the energy transition. DMP: final version of the (open) Data Management Plan for the PANTERA project in month 2 of the project. This Data Management Handling Plan investigates the appropriate methodologies and open repositories for data management and dissemination and tries to offer through open access as much information generated by the PANTERA project as possible. | TOC: Abbr., Exec sum, 1 Objective of the report, 2 Framework for DMP, 3 Data archiving and preserving infrastructure, 4 Datasets and publications for DMP, 5 Ethics Management Plan, 6 Conclusion, 7 References, 8 Annex (List of figs, tabs, Ethics Manual, Consent form, Opt form, Privacy Policy) | No maDMP, final DMP already in Month 2 (so it is actually never used).

Sources to learn about DMPs

MaDMP diagram from RDA model

https://ddi-alliance.atlassian.net/wiki/spaces/DDI4/pages/7864356/Active+Data+Management+Plans+Team

DMP tools

Literature

  • Miksa T, Simms S, Mietchen D, Jones S (2019) Ten principles for machine-actionable data management plans. PLoS Comput Biol 15(3): e1006750. https://doi.org/10.1371/journal.pcbi.1006750
  • Jones S, Pergl R, Hooft R, Miksa T, Samors R, Ungvari J, Davis RI, Lee T (2020) Data Management Planning: How Requirements and Solutions are Beginning to Converge. Data Intelligence 2(1-2): 208-219. https://doi.org/10.1162/dint_a_00043