3  Metadata

3.1 Metadata?

  • Metadata is the data about your “outputs”
  • Citations are just metadata for a research article
  • Metadata can include codebooks for datasets
  • But also can be included for nearly everything we do
  • You will check out our proposed metadata categories
    • Are we missing any?
    • What should be included with each category?

3.2 Metadata?

  • Action: tracking when things happen within a project
  • Project: the global project level information
  • Contributor: person information about who is doing stuff on project
  • Contributorship System: how you will track CRediT or others
  • Organization: the university/ngo/organization a contributor is attached to

3.3 Metadata?

  • Documents: data, stimuli, materials, ethics, and much more
    • Text
    • Audio
    • Video
    • Pictures
  • Data: a special category of documents that will include special extra parts
    • Primary Data
    • Secondary Data


https://cmr.earthdata.nasa.gov/search/site/docs/search/api.html https://project-open-data.cio.gov/v1.1/schema/

    • Input: XSD for the schema
    • Output: XML for the filled in (Native format)
  • XML from w3
  • RDF (which is often converted to json or xml)
    • Other objects related to N3, turtle, etc.
    • Turtle, Triples, Quads, CSV
  • JSON
    • JSON-schema is the validation
    • JSON-LD is the output

3.4 these words aren’t yours

While the metadata application is manifold, covering a large variety of fields, there are specialized and well-accepted models to specify types of metadata. Bretherton & Singley (1994) distinguish between two distinct classes: structural/control metadata and guide metadata.[24] Structural metadata describes the structure of database objects such as tables, columns, keys and indexes. Guide metadata helps humans find specific items and is usually expressed as a set of keywords in a natural language. According to Ralph Kimball, metadata can be divided into three categories: technical metadata (or internal metadata), business metadata (or external metadata), and process metadata.

NISO distinguishes three types of metadata: descriptive, structural, and administrative.[22] Descriptive metadata is typically used for discovery and identification, as information to search and locate an object, such as title, authors, subjects, keywords, and publisher. Structural metadata describes how the components of an object are organized. An example of structural metadata would be how pages are ordered to form chapters of a book. Finally, administrative metadata gives information to help manage the source. Administrative metadata refers to the technical information, such as file type, or when and how the file was created. Two sub-types of administrative metadata are rights management metadata and preservation metadata. Rights management metadata explains intellectual property rights, while preservation metadata contains information to preserve and save a resource.[8]

3.5 end

To do: Add glossary terms throughout this document.

https://www.nature.com/articles/s41597-022-01815-3 https://direct.mit.edu/qss/article/1/1/414/15577/Crossref-The-sustainable-source-of-community-owned https://www.cell.com/patterns/pdf/S2666-3899(21)00170-7.pdf

Why do I need it

Types of metadata

What is json

Why is json useful

3.6 STAPLE Metadata

3.6.1 Action

  • done

3.6.2 Project

Required elements of project-level metadata:

  • Name of the project
  • Project description
  • Authors
  • Corresponding Author
  • Citation
  • Persistent identifier (DOI)
  • Subject/keywords
  • Funders
  • Date


3.6.3 Data

Data gets a special type of schema, etc.

https://schema.org/Dataset Primary Data Secondary Data

3.6.4 Project Outputs

Note that data can be text, but this is the spot we classify other objects you may output. Text

https://schema.org/TextObject Audio

https://schema.org/AudioObject Video


3.6.5 Image


3.6.6 Author


3.6.7 Organization

3.6.8 Funder

3.6.9 CRediT

3.7 Add your specific requirements

json format

metadata builder?