As with the cataloguing of traditional analogue material, digital objects also need metadata. 'Metadata is structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource' [NISO, 2004]. A range of metadata is required in order to manage and preserve digital objects successfully.
There are 3 broad categories of metadata: descriptive, administrative and structural.
MARC21 is a format standard for the representation and communication of bibliographic and related information in machine-readable form. All MARC standards conform to ANSI/NISO Z39.2 and ISO 2709:2008.
The Dublin Core Metadata Element Set is a vocabulary of 15 properties for use in resource description. The 15-element set comprises contributor, coverage, creator, date, description, format, identifier, language, publisher, relation, rights, source, subject, title and type. These have been formally endorsed in the following standards: ISO 15836:2009, ANSI/NISO Z39.85-2007 and IETF RFC 5013.
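As an illustration of how the 15 elements might be used, the following Python sketch builds a simple Dublin Core description with the standard library. The wrapper element name `record`, the function name `build_dc_record` and the sample values are assumptions made for the example; only the element names and the `http://purl.org/dc/elements/1.1/` namespace come from the Dublin Core element set itself.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
DC_ELEMENTS = (
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
)


def build_dc_record(fields):
    """Return a wrapper element holding one dc:* child per supplied value."""
    ET.register_namespace("dc", DC_NS)
    record = ET.Element("record")  # hypothetical wrapper, not part of Dublin Core
    for name, values in fields.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {name}")
        # Every element is optional and repeatable, so accept one value or many.
        for value in ([values] if isinstance(values, str) else values):
            child = ET.SubElement(record, f"{{{DC_NS}}}{name}")
            child.text = value
    return record


# Made-up sample values, for illustration only.
record = build_dc_record({
    "title": "Example photograph",
    "creator": "Unknown",
    "subject": ["Harbours", "Ships"],
    "type": "StillImage",
})
print(ET.tostring(record, encoding="unicode"))
```

Rejecting names outside the 15-element list is a design choice of this sketch; a real application profile might well allow additional terms.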
EAD is an XML DTD used throughout the archival community for the encoding of finding aids (collection-level description). It is capable of describing a digital collection and its internal structure, from the topmost collection-level, down to individual items.
This is a schema developed by the Library of Congress Audio-Visual Prototyping Project. The main aspects of the schema that the Library makes use of are:
Audio Metadata (AMD), which contains technical metadata specific to audio files e.g. sampling frequency.
Video Metadata (VMD), which contains technical metadata that describes a digital video object e.g. bit rate, compression codec.
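To make the idea of technical audio metadata concrete, the sketch below reads properties such as the sampling frequency from a WAV file using Python's standard `wave` module. It is only an illustration: the dictionary keys are hypothetical names chosen for readability, not element names from the schema, and the silent test file is generated in memory so that the example is self-contained.

```python
import io
import wave


def make_silent_wav(sample_rate=44100, channels=2, sample_width=2, seconds=1):
    """Create a small silent WAV file in memory, for demonstration only."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)  # bytes per sample
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00" * (sample_rate * channels * sample_width * seconds))
    buffer.seek(0)
    return buffer


def read_audio_technical_metadata(source):
    """Return basic technical properties of a WAV file or file-like object."""
    with wave.open(source, "rb") as wav:
        frame_rate = wav.getframerate()
        return {
            # Key names are illustrative assumptions, not schema element names.
            "sampling_frequency_hz": frame_rate,
            "channels": wav.getnchannels(),
            "bits_per_sample": wav.getsampwidth() * 8,
            "duration_seconds": wav.getnframes() / frame_rate,
        }


metadata = read_audio_technical_metadata(make_silent_wav())
for key, value in metadata.items():
    print(f"{key}: {value}")
```

Values gathered in this way would then be recorded in whatever technical metadata schema the repository has adopted.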
MODS is a schema for a bibliographic element set that may be used for a variety of purposes, and particularly for library applications. As an XML schema it is intended to be able to carry selected data from existing MARC 21 records as well as to enable the creation of original resource description records.
MIX is a set of technical data elements, expressed in XML schema language, that are required to manage digital image collections. The schema provides a format for interchange and/or storage of the data specified in the Data Dictionary – Technical Metadata for Digital Still Images.
PREMIS is the name of the international working group that produced a report called PREMIS Data Dictionary for Preservation Metadata. When people speak of PREMIS they therefore often mean the Data Dictionary, although they may also be referring to the XML schema that is used to represent PREMIS. Preservation metadata supports activities intended to ensure the long-term usability of a digital resource. It is 'the information a repository uses to support the digital preservation process.' The PREMIS data model organizes the semantic units defined in the Data Dictionary. The data model defines 5 entities that are important to digital preservation activities: Intellectual Entities, Objects, Events, Rights, and Agents.
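The relationships between the 5 entities can be sketched with a few Python dataclasses. This is a deliberately simplified model under stated assumptions: the class fields, the identifiers and the helper `history_of` are hypothetical and do not reproduce the semantic units of the Data Dictionary. It only shows how an Object might be linked to an Intellectual Entity, to the Events in its history, and to Rights and Agents.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IntellectualEntity:
    identifier: str
    title: str


@dataclass
class Agent:
    identifier: str
    name: str


@dataclass
class Rights:
    identifier: str
    statement: str


@dataclass
class Event:
    identifier: str
    event_type: str
    date_time: str  # ISO 8601 text, so plain string sorting is chronological
    agent_ids: List[str] = field(default_factory=list)


@dataclass
class DigitalObject:
    identifier: str
    intellectual_entity_id: str
    event_ids: List[str] = field(default_factory=list)
    rights_ids: List[str] = field(default_factory=list)


def history_of(obj, events):
    """Return the events recorded against an object, oldest first."""
    linked = [e for e in events if e.identifier in obj.event_ids]
    return sorted(linked, key=lambda e: e.date_time)


# Made-up sample data, for illustration only.
manuscript = IntellectualEntity("ie-1", "Example manuscript")
unit = Agent("agent-1", "Digitisation unit")
licence = Rights("rights-1", "Example rights statement")
events = [
    Event("ev-2", "migration", "2011-06-01T09:00:00", ["agent-1"]),
    Event("ev-1", "capture", "2010-03-15T14:30:00", ["agent-1"]),
]
master = DigitalObject("obj-1", manuscript.identifier, ["ev-1", "ev-2"], [licence.identifier])

for event in history_of(master, events):
    print(event.date_time, event.event_type)
```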
The Text Encoding Initiative (TEI) is a consortium that collectively develops and maintains a standard for the representation of texts in digital form. Its chief deliverable is a set of Guidelines that specify encoding methods for machine-readable texts, chiefly in the humanities, social sciences and linguistics. Since 1994, the TEI Guidelines have been widely used by libraries, museums, publishers, and individual scholars to present texts for online research, teaching, and preservation.
textMD is an XML schema that details technical metadata for text-based digital objects. The schema allows for detailing properties such as: encoding information (quality, platform, software, agent), character information (character set and size, byte order and size, line terminators), languages, fonts, markup information, processing and textual notes, technical requirements for printing and viewing, page ordering and sequencing.
The METS schema is a standard for encoding descriptive, administrative, and structural metadata regarding objects within a digital library, expressed using the XML schema language of the World Wide Web Consortium. A METS document can be used in the role of Submission Information Package (SIP), Archival Information Package (AIP), or Dissemination Information Package (DIP) within the Open Archival Information System (OAIS) Reference model.
METS document
A METS document consists of 7 major sections:
1. METS Header: contains metadata describing the METS document itself, including information such as creator and editor.
2. Descriptive metadata: this section may point to descriptive metadata external to the METS document (e.g. a MARC record in an OPAC or an EAD finding aid), contain internally embedded descriptive metadata (e.g. MODS, DC), or both.
3. Administrative metadata: this section provides information on how the files were created and stored, intellectual property rights, metadata regarding the original source object, and information regarding the provenance of the files comprising the digital library object (i.e. master, derivative file relationships and migration / transformation information). Administrative metadata can also be internal or external, and types include PREMIS, TEI, MIX.
4. File section: lists all files containing the content that makes up the electronic versions of the digital object. <file> elements may be grouped within <fileGrp> elements to subdivide the files by object version.
5. Structural map: this is the heart of the METS document. It outlines a hierarchical structure for the digital library object and links the elements of that structure to the content files and metadata that pertain to each element.
6. Structural links: this section allows the METS creator to record the existence of hyperlinks between nodes in the hierarchy outlined in the structural map.
7. Behaviour: this section can be used to associate executable behaviours with content in the METS object.
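The order of the 7 sections can be seen in the following Python sketch, which assembles a bare METS skeleton with the standard library. It is an outline only and has not been validated against the METS schema: several sections are left empty, which the schema may not permit, and the function name, IDs, `USE` value and file location are made-up placeholders.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def m(tag):
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"


def build_mets_skeleton(file_id, file_href):
    """Build an unvalidated outline showing the seven METS sections in order."""
    root = ET.Element(m("mets"))
    ET.SubElement(root, m("metsHdr"))              # 1. METS header
    ET.SubElement(root, m("dmdSec"), ID="DMD1")    # 2. descriptive metadata
    ET.SubElement(root, m("amdSec"), ID="AMD1")    # 3. administrative metadata

    file_sec = ET.SubElement(root, m("fileSec"))   # 4. file section
    file_grp = ET.SubElement(file_sec, m("fileGrp"), USE="master")
    file_el = ET.SubElement(file_grp, m("file"), ID=file_id)
    ET.SubElement(file_el, m("FLocat"),
                  {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": file_href})

    struct_map = ET.SubElement(root, m("structMap"))  # 5. structural map
    div = ET.SubElement(struct_map, m("div"), DMDID="DMD1")
    ET.SubElement(div, m("fptr"), FILEID=file_id)     # links the div to the file

    ET.SubElement(root, m("structLink"))           # 6. structural links (left empty)
    ET.SubElement(root, m("behaviorSec"))          # 7. behaviour (left empty)
    return root


# Placeholder identifier and location, for illustration only.
mets = build_mets_skeleton("FILE1", "file:///example/master.tif")
print(ET.tostring(mets, encoding="unicode"))
```

The `fptr` element in the structural map carries the same identifier as the `file` element, which is how the hierarchy is tied back to the content files.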
The following is an example of a METS document from NLW.