Introduction to Data Repositories in Scholarly Spaces
Depositing research data in a secure, trusted repository is a cornerstone of open science and funder compliance. However, with hundreds of generalist, institutional, and domain-specific repositories available, selecting the correct archive can be a daunting task for researchers.
The Hierarchy of Repositories
Data repositories fall into three primary tiers:
1. Domain-specific (e.g., GenBank, Protein Data Bank), which specialize in highly structured data formats.
2. Institutional (e.g., university repositories), which preserve an institution's research outputs.
3. Generalist (e.g., Zenodo, Figshare, Dryad), which accept almost any file format and subject.
Funder Requirements and Core Desirable Characteristics
Agencies like the NIH and NSF outline specific ‘desirable characteristics’ for data repositories. These include assigning unique persistent identifiers (such as DOIs), maintaining long-term preservation policies, supporting standardized metadata schemas, and ensuring that data and metadata are retrievable via secure, open protocols.
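As a small illustration of the persistent-identifier and open-protocol characteristics, the sketch below turns a DOI string into its HTTPS resolver URL. The function name `doi_to_url` is hypothetical, the regular expression is only a common heuristic for modern DOIs (it will not match every legacy identifier), and the DOI in the usage line is a made-up example.

```python
import re

# Heuristic only (an assumption): "10.", a 4-9 digit registrant code,
# a slash, then a non-empty suffix without whitespace.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# Prefixes people commonly paste along with the DOI itself.
KNOWN_PREFIXES = ("https://doi.org/", "http://doi.org/", "doi:")


def doi_to_url(doi: str) -> str:
    """Return the HTTPS resolver URL for a DOI, or raise ValueError."""
    cleaned = doi.strip()
    for prefix in KNOWN_PREFIXES:
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    if not DOI_PATTERN.match(cleaned):
        raise ValueError(f"not a plausible DOI: {doi!r}")
    return f"https://doi.org/{cleaned}"


if __name__ == "__main__":
    # Made-up identifier, used purely for illustration.
    print(doi_to_url("doi:10.1234/example.5678"))
```

A check like this only confirms that an identifier looks like a DOI. Whether it actually resolves, and whether the repository commits to keeping it resolvable, still has to be verified against the repository's own policies.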
Step-by-Step Decision Matrix for Depositing Data
To make an informed choice, researchers should first evaluate their files. If the data conforms to a standard schema in a specific field and a domain-specific repository serves that field, that repository is generally the preferred choice. If no suitable domain repository exists, use a generalist or institutional repository, and confirm that the selected archive offers open licenses and structured metadata.
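The decision logic above can be sketched as a short function. This is a minimal illustration rather than an official policy: the names `Dataset`, `Tier`, `choose_tier`, and `fallback_is_acceptable` are hypothetical, and the sketch assumes that data which does not fit a field schema falls through to the generalist or institutional route even when a domain repository exists.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    DOMAIN_SPECIFIC = "domain-specific"
    GENERALIST_OR_INSTITUTIONAL = "generalist or institutional"


@dataclass(frozen=True)
class Dataset:
    # Hypothetical yes/no answers a researcher supplies about their data.
    conforms_to_field_schema: bool
    domain_repository_exists: bool


def choose_tier(dataset: Dataset) -> Tier:
    """Prefer a domain repository when the data fits one; otherwise fall back."""
    if dataset.conforms_to_field_schema and dataset.domain_repository_exists:
        return Tier.DOMAIN_SPECIFIC
    return Tier.GENERALIST_OR_INSTITUTIONAL


def fallback_is_acceptable(offers_open_licenses: bool,
                           has_structured_metadata: bool) -> bool:
    """A fallback archive should offer both open licenses and structured metadata."""
    return offers_open_licenses and has_structured_metadata


if __name__ == "__main__":
    sequencing_run = Dataset(conforms_to_field_schema=True,
                             domain_repository_exists=True)
    print(choose_tier(sequencing_run).value)  # prints "domain-specific"
```

Keeping the tier choice separate from the fallback check mirrors the text: the first question is where the data belongs, and the second is whether the chosen fallback archive meets the minimum requirements.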
Key Data and Comparative Metrics
| Repository Tier | Target Audience | Key Strengths | Optimal Use Case |
|---|---|---|---|
| Domain-Specific | Specialized field experts | Deep integration with field-specific analysis tools and standard schemas. | Genomics, protein crystallography, astronomy datasets. |
| Generalist | Multidisciplinary researchers | Accepts almost any file type and typically issues a DOI soon after deposit. | Interdisciplinary projects, custom code, supplementary tables. |
| Institutional | Affiliated faculty, staff, and students | Direct support from library staff, tailored long-term preservation. | Affiliated student and faculty datasets, thesis supplements. |
Actionable Checklist for Choosing a Data Repository
- Identify whether a domain-specific repository is standard in your scientific field.
- Verify that the chosen repository issues permanent, citeable Digital Object Identifiers (DOIs).
- Confirm that the repository has a clear long-term preservation plan (10+ years).
- Check that the repository supports standard metadata formats and open data licensing.
- Consult the library’s Research Data Management team for localized deposit assistance.
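For readers who track deposits in a script or notebook, the checklist above can be recorded as a simple structure. This is a sketch under stated assumptions: the item keys and the `outstanding_items` helper are hypothetical names, and each answer still comes from a person reading the repository's documentation.

```python
from typing import List, Mapping

# Hypothetical keys, one per checklist item, in the same order as the list.
CHECKLIST = (
    "domain_repository_considered",
    "issues_dois",
    "preservation_plan_10_plus_years",
    "standard_metadata_and_open_licensing",
    "rdm_team_consulted",
)


def outstanding_items(answers: Mapping[str, bool]) -> List[str]:
    """Return checklist items that are unanswered or answered False."""
    return [item for item in CHECKLIST if not answers.get(item, False)]


if __name__ == "__main__":
    answers = {
        "domain_repository_considered": True,
        "issues_dois": True,
        "preservation_plan_10_plus_years": False,
    }
    for item in outstanding_items(answers):
        print("still to do:", item)
```

Treating a missing answer the same as a "no" keeps the helper conservative: a repository is only considered ready once every item has been explicitly confirmed.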