Digital Humanities Research Data: Curation and Long-Term Preservation

Introduction to Digital Humanities in Scholarly Spaces

Digital Humanities (DH) research generates diverse, complex, and non-traditional datasets, including digital text editions, GIS maps, linguistic corpora, and interactive databases. Curating and preserving these digital assets presents unique challenges for research libraries.

The Ephemeral Nature of Digital Humanities Data

Unlike the standardized tabular datasets common in the physical sciences, DH data is often tightly coupled to specific software, web applications, or custom user interfaces. If the underlying software becomes obsolete or the server is shut down, the research output can become unusable or inaccessible even when the underlying files survive. DH preservation should therefore prioritize decoupling content from its software dependencies.

Metadata Standards for Digital Curation

To make DH data discoverable and reusable, curators apply rich metadata schemas. Standards such as Dublin Core, the Metadata Object Description Schema (MODS), and the metadata header of Text Encoding Initiative (TEI) XML documents are used to record descriptive, structural, and administrative information about digital artifacts, including their preservation history.
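As a rough illustration of what a simple descriptive record can look like, the sketch below builds a small Dublin Core record with Python's standard library. The wrapping `metadata` element, the `build_dc_record` helper, and all field values are assumptions made for this example; a real repository will usually prescribe its own wrapper element and required fields.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_dc_record(fields):
    """Return a <metadata> element with one dc:* child per (name, value) pair."""
    root = ET.Element("metadata")  # hypothetical wrapper element
    for name, value in fields:
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return root


# Hypothetical collection values, for illustration only.
record = build_dc_record([
    ("title", "Example Letters Digital Edition"),
    ("creator", "Example Project Team"),
    ("type", "Dataset"),
    ("format", "application/tei+xml"),
    ("rights", "CC BY 4.0"),
])

dc_xml = ET.tostring(record, encoding="unicode")
print(dc_xml)
```

Keeping the record as plain XML means it can be stored next to the data files and read without any project-specific software.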

Developing Institutional Digital Preservation Models

Research libraries must establish long-term curation models for DH. This includes:

1. Migrating data to open, non-proprietary formats (e.g., XML, plain text, SVG).
2. Utilizing containerization to archive interactive web platforms.
3. Partnering with established digital preservation services (such as Portico or CLOCKSS) to support long-term access.
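The first step, migration to open formats, usually starts with an inventory of what a project actually contains. The following is a minimal sketch of such an inventory, assuming an allowlist of extensions drawn from the formats named in this article; the `needs_migration` and `scan_directory` helpers and the sample paths are hypothetical. A file extension is only a rough proxy for the real format, so the result is a list of candidates for review, not a verdict.

```python
from pathlib import Path

# Extensions treated as "open" for this sketch (an assumption based on the
# formats mentioned in the article, not an authoritative registry).
OPEN_EXTENSIONS = {".xml", ".txt", ".svg", ".csv", ".wav", ".flac"}


def needs_migration(paths, open_extensions=OPEN_EXTENSIONS):
    """Return, sorted, the paths whose extension is not in the allowlist."""
    flagged = []
    for p in paths:
        if Path(p).suffix.lower() not in open_extensions:
            flagged.append(str(p))
    return sorted(flagged)


def scan_directory(root):
    """Apply needs_migration to every regular file below a project directory."""
    files = [p for p in Path(root).rglob("*") if p.is_file()]
    return needs_migration(files)


# Hypothetical project files, for illustration only.
sample = [
    "edition/letters.xml",
    "maps/parish.SVG",
    "db/catalogue.mdb",
    "notes/readme.docx",
]
flagged = needs_migration(sample)
print(flagged)  # ['db/catalogue.mdb', 'notes/readme.docx']
```

Calling `scan_directory` on a project folder gives the same kind of list for files on disk.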

Key Data and Comparative Metrics

DH Dataset Format | Primary Preservation Vulnerability | Curation Best Practice
Interactive Database (SQL) | Server obsolescence, database software deprecation. | Export schema and data to flat CSV files alongside standard SQL scripts.
Digital Scholarly Edition (TEI) | Custom stylesheet incompatibility, broken links. | Enforce strict XML validation, embed stylesheets, use permanent DOIs.
Linguistic Audio Corpora | Audio codec obsolescence, lack of transcript synchronization. | Convert audio to open WAV/FLAC formats, sync with plain text transcripts.
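The database row above can be sketched in a few lines of Python. The example assumes the project database is SQLite, which the standard library can read directly; server databases such as MySQL or PostgreSQL would need their own dump tools. The `export_database` helper and the `letters` table are hypothetical, and the output is kept in memory here, where a real export would write each result to a file.

```python
import csv
import io
import sqlite3


def export_database(conn):
    """Return (sql_script, {table_name: csv_text}) for an open SQLite connection."""
    # iterdump() yields the SQL statements needed to recreate schema and data.
    sql_script = "\n".join(conn.iterdump())

    # List user tables, skipping SQLite's internal bookkeeping tables.
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type='table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    )]

    csv_files = {}
    for table in tables:
        cursor = conn.execute(f'SELECT * FROM "{table}"')
        buffer = io.StringIO()
        writer = csv.writer(buffer, lineterminator="\n")
        writer.writerow([col[0] for col in cursor.description])  # header row
        writer.writerows(cursor)
        csv_files[table] = buffer.getvalue()
    return sql_script, csv_files


# Hypothetical in-memory database standing in for a project database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE letters (id INTEGER PRIMARY KEY, sender TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO letters (sender, year) VALUES (?, ?)",
    [("A. Writer", 1843), ("B. Reader", 1851)],
)
conn.commit()

sql_script, csv_files = export_database(conn)
print(csv_files["letters"])
```

The CSV files keep the content readable without any database software, while the SQL script preserves column types and keys that CSV cannot express.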

Actionable Checklist for Digital Humanities

  • Establish a data management plan tailored specifically to digital humanities workflows.
  • Utilize open, standardized file formats (e.g., TXT, XML, CSV, WAV) for final archive files.
  • Apply rich descriptive metadata using the Text Encoding Initiative (TEI) guidelines.
  • Assign permanent Digital Object Identifiers (DOIs) to all completed digital collections.
  • Deposit DH artifacts in a trusted, library-managed repository with digital preservation guarantees.
