Metadata elements


Knowing which data elements the Knowledge for All system will include and manage is essential for designing and developing the technological infrastructure and for planning content collection and creation.  Metadata elements have been identified by examining existing citation databases in a variety of subject areas and by reading scholarly literature about metadata in digital collections.

The Knowledge for All database will contain data about three main types of content: journals, journal articles, and scholars/authors. These content types will be related: journal articles will be part of journals, and scholars will be linked to journal articles through the author/creator field. A journal issue content type could also be used to link articles to journals. Alternatively, journal issue information could be included in every article record, but a separate journal issue content type may be desirable in order to minimize the data needed for journal article records and to organize indexing workflows. Another significant node type in the Knowledge for All system will be contributors (indexers, editors, developers, etc.), which will be addressed in the Contributors and Workflows section of the planning documentation. There may be considerable overlap between scholars and contributors.

Below I have made initial recommendations for data elements needed for each type of content. These elements are not mapped to any particular metadata schema. 

Journal:

  • Title
  • Alternative title
  • Past titles - should include information about date and issue when journal changed title
  • ISSN
  • E-ISSN
  • Publisher
  • Language(s)
  • Country of publication
  • Start year
  • End year
  • Subject(s)
  • Other subjects (community tagging field)
  • Peer-reviewed or not
  • Number of issues per year (accommodate significant variations)
  • Number of articles per issue (accommodate significant variations)
  • Website
  • Copyright information
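
Because the ISSN and E-ISSN fields carry a standard check digit, indexers' entries could be checked for typing errors at data-entry time. The following is a minimal sketch of the standard ISSN check-digit calculation; the function name is_valid_issn is hypothetical and is not part of any planned Knowledge for All component.

```python
def is_valid_issn(issn: str) -> bool:
    """Return True if the string is a well-formed ISSN with a correct check digit."""
    chars = issn.replace("-", "").strip().upper()
    # An ISSN has seven digits followed by a check character (0-9 or X).
    if len(chars) != 8 or any(c not in "0123456789" for c in chars[:7]):
        return False
    # Weight the first seven digits 8, 7, ..., 2 and sum the products.
    total = sum(int(d) * weight for d, weight in zip(chars[:7], range(8, 1, -1)))
    check = (11 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    return chars[7] == expected
```

A check like this only catches malformed numbers; it cannot confirm that an ISSN actually belongs to the journal being described.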

Journal Issue:

  • Volume
  • Issue number
  • Year
  • Date (open text field to accommodate other date information, which could be month, season, or date)
  • Theme

Journal article:

  • Title
  • Author(s) (personal or corporate)
  • Journal name
  • Journal issue
  • Pages
  • Abstract
  • Subjects
  • DOI
  • Datasets included or not
  • Version
  • Format availability
  • Open URL to the article
  • Copyright information
  • Citations/bibliography
  • Related resources
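
The relationship between the three lists above (articles belong to issues, and issues belong to journals) can be sketched as linked records. This is only an illustration under stated assumptions: the class and attribute names are hypothetical, only a subset of the recommended elements is shown, and the sample values are invented. It is not tied to any metadata schema or platform.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Journal:
    title: str
    issn: Optional[str] = None
    e_issn: Optional[str] = None
    publisher: Optional[str] = None
    languages: List[str] = field(default_factory=list)
    subjects: List[str] = field(default_factory=list)
    peer_reviewed: Optional[bool] = None  # None means "not yet recorded"


@dataclass
class JournalIssue:
    journal: Journal  # link from the issue to its journal
    volume: Optional[str] = None
    issue_number: Optional[str] = None
    year: Optional[int] = None
    date_text: Optional[str] = None  # open text: month, season, or full date
    theme: Optional[str] = None


@dataclass
class JournalArticle:
    title: str
    issue: JournalIssue  # link from the article to its issue
    authors: List[str] = field(default_factory=list)
    pages: Optional[str] = None
    abstract: Optional[str] = None
    subjects: List[str] = field(default_factory=list)
    doi: Optional[str] = None

    @property
    def journal_name(self) -> str:
        # Derived through the issue, so it is not re-entered on every article.
        return self.issue.journal.title


# Invented sample records, for illustration only.
journal = Journal(title="Example Journal of Metadata")
issue = JournalIssue(journal=journal, volume="12", issue_number="3",
                     year=2011, date_text="Fall")
article = JournalArticle(title="A Sample Article", issue=issue,
                         authors=["Doe, Jane"])
```

With a separate issue record, volume, number, and date are entered once per issue, and each article record only needs a link to that issue.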

Some subject-specific databases include other metadata elements that would largely be relevant only within that subject.  These include classification, population, location, age group, tests and measures, grant information, and methodology for psychology; type of literature, time period, subject author, subject work, literary theme, literary genre, and media for literature; and study design, place of study, period of study, materials, methods, edited by, and reviewed by for medicine.  We intend to include these in the Knowledge for All system to allow for highly refined searching within disciplines.  However, these fields will be searchable only when a user is searching within a particular discipline rather than doing a general, interdisciplinary search. If the data for specialized fields is not available elsewhere, volunteer indexers will need to create original data.
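
The rule that specialized fields are offered only within a discipline can be sketched as a simple lookup. The discipline field lists below are taken from the paragraph above; the function name searchable_fields and the choice of which general fields to expose are assumptions made for illustration.

```python
from typing import Dict, List, Optional

# Assumed subset of the general journal article elements exposed in every search.
GENERAL_FIELDS: List[str] = ["title", "authors", "journal name", "abstract", "subjects"]

# Discipline-specific elements named in the text above.
DISCIPLINE_FIELDS: Dict[str, List[str]] = {
    "psychology": ["classification", "population", "location", "age group",
                   "tests and measures", "grant information", "methodology"],
    "literature": ["type of literature", "time period", "subject author",
                   "subject work", "literary theme", "literary genre", "media"],
    "medicine": ["study design", "place of study", "period of study",
                 "materials", "methods", "edited by", "reviewed by"],
}


def searchable_fields(discipline: Optional[str] = None) -> List[str]:
    """Return the fields offered for a general or a discipline-scoped search."""
    if discipline is None:
        # General, interdisciplinary search: specialized fields are hidden.
        return list(GENERAL_FIELDS)
    return GENERAL_FIELDS + DISCIPLINE_FIELDS.get(discipline.lower(), [])
```

For example, searchable_fields("medicine") would include "study design", while searchable_fields() with no discipline would return only the general fields.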

An article's list of citations or its bibliography is not a necessary element, but ideally it will be included to allow for citation analysis.  As discussed in Legal Issues, citations may be protected under copyright.

Administrative metadata

Additional administrative metadata that could be recorded for the above content types includes:

  • Source
  • Record created by
  • Record indexed by
  • Record edited by

Other administrative metadata elements will likely be added as the system's technical infrastructure and workflows are developed.

As the Knowledge for All database will not actually include digitized objects, metadata related to preservation of digital objects is not relevant.

Scholar names/Personal authors:

  • Given name
  • Family name
  • Institutional affiliation
  • Title/occupation
  • Area of expertise/subject specialty
  • Webpage
  • Postal address
  • Phone number
  • Email address

Other data possibly needed for name disambiguation (discussed further in Scholar Name Data Collection and Creation):

  • Preferred or authorized form of the name
  • Other forms of the name, including earlier names, nicknames, pseudonyms, shortened or longer forms of the name, name in other languages or scripts, names associated with the person’s office
  • Birth date
  • Death date
  • Gender
  • Life events
  • Family
  • Works
  • Languages the person normally writes or creates in or the person’s native language(s)
  • Brief biography
  • Unique identifier
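
To show how the preferred form, the other forms of the name, and the unique identifier might work together, here is a minimal sketch of variant-name lookup. It assumes exact matching after simple normalization (case, accents, punctuation), which is far cruder than real name disambiguation; the names ScholarRecord, normalize_name, and find_scholars are hypothetical, and the sample scholar is invented.

```python
import unicodedata
from dataclasses import dataclass, field
from typing import List


def normalize_name(name: str) -> str:
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in no_accents)
    return " ".join(cleaned.lower().split())


@dataclass
class ScholarRecord:
    identifier: str       # unique identifier
    preferred_name: str   # preferred or authorized form of the name
    other_names: List[str] = field(default_factory=list)

    def matches(self, name: str) -> bool:
        # A name matches if it equals any recorded form after normalization.
        target = normalize_name(name)
        candidates = [self.preferred_name] + self.other_names
        return any(normalize_name(c) == target for c in candidates)


def find_scholars(name: str, records: List[ScholarRecord]) -> List[ScholarRecord]:
    """Return every record with a recorded form of the name matching the query."""
    return [r for r in records if r.matches(name)]


# Invented sample record, for illustration only.
records = [ScholarRecord("scholar-0001", "García, María", ["M. García"])]
```

A lookup that returns more than one record signals that further data, such as dates, affiliation, or works, is needed to tell the scholars apart.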