Subject indexing and abstracts
The Knowledge for All system should have two levels of subject classification for articles and, ideally, an abstract in every article record. Each article in the database should be classified in one or more broad subject categories and have several more specific subject terms assigned to it. This system of classification will allow us to provide rich searching and browsing capabilities in the citation database. It will also be useful for connecting volunteer indexers and editors with journals in which they have subject expertise.
Broad subject classification
The primary purpose of broad subject classification in the K4All database is to allow users to limit their searching to a specific discipline or set of disciplines. While the goal should be to classify an article or journal in a single broad subject category (and categories should be broad enough to allow that), classification in more than one category should be allowed for multidisciplinary articles or journals. Broad subjects should be selected and defined thoroughly so that classification is easy. The set of subject categories could be a flat or a hierarchical list. JournalSeek provides a model flat list, while the Directory of Open Access Journals provides a model hierarchical list.
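As a rough illustration of the difference between the two list shapes, the Python sketch below stores a hierarchical list as a child-to-parent mapping and expands a selected category to include its descendants, which is what limiting a search to a discipline would require. The category names, the `broad_subjects` field, and the `expand_category` and `limit_search` helpers are hypothetical placeholders, not the K4All vocabulary or design. A flat list is simply the case in which no category has a parent.

```python
# Hypothetical categories for illustration only; not the K4All vocabulary.
# Each category maps to its parent; None marks a top-level category.
PARENT = {
    "Science": None,
    "Biology": "Science",
    "Chemistry": "Science",
    "Social Sciences": None,
    "Sociology": "Social Sciences",
}


def expand_category(category, parent=PARENT):
    """Return the category plus all of its descendants in the hierarchy."""
    if category not in parent:
        raise KeyError(f"unknown category: {category}")
    selected = {category}
    changed = True
    # Repeatedly add any category whose parent is already selected.
    while changed:
        changed = False
        for child, par in parent.items():
            if par in selected and child not in selected:
                selected.add(child)
                changed = True
    return selected


def limit_search(articles, category):
    """Keep articles classified under the category or any of its descendants."""
    allowed = expand_category(category)
    return [a for a in articles if allowed & set(a["broad_subjects"])]
```

With a hierarchy, limiting a search to a top-level category also returns articles classified only under its subcategories; with a flat list, each category matches only itself.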
Broad subject classification terms have been assigned to Thesauri for Subject Indexing records and Metadata Sources records, but the vocabulary has not been 'controlled' yet. The working, uncontrolled list of subject categories can be viewed here.
Classification of articles into one or more broad subject classes can occur at either the article level or the journal level. An advantage of classifying entire journals in one or more broad subject categories is that indexers do not need to make this classification for each individual article. However, with interdisciplinary journals, or journals whose articles might typically fall into more than one broad subject class, this could result in misclassification of journal articles. As a compromise, it is recommended that journals be classified under broad subjects but that indexers be able to override that classification on a per-article basis if needed. Interdisciplinary journals should be flagged so that indexers assigned to those journals pay special attention to this field and correct it where necessary.
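The recommended compromise can be sketched as a small data model. The following Python is a minimal sketch under stated assumptions, not a specification of the K4All schema: the class names, the field names (`broad_subjects`, `interdisciplinary`, `broad_subjects_override`), and both helper functions are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Journal:
    title: str
    broad_subjects: List[str]        # journal-level classification
    interdisciplinary: bool = False  # flag so indexers check each article


@dataclass
class Article:
    title: str
    journal: Journal
    # None means "no override": the journal's classification applies.
    broad_subjects_override: Optional[List[str]] = None


def effective_broad_subjects(article):
    """Return the article's override if set, else the journal's subjects."""
    if article.broad_subjects_override is not None:
        return list(article.broad_subjects_override)
    return list(article.journal.broad_subjects)


def needs_indexer_review(article):
    """Articles in flagged journals with no override yet should be checked."""
    return (
        article.journal.interdisciplinary
        and article.broad_subjects_override is None
    )
```

In this sketch an article inherits its journal's broad subjects until an indexer sets an override, and articles in flagged journals are reported for review until that happens. An indexer who confirms the journal's classification for such an article would need to record it explicitly as an override, which is one possible design choice among several.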
Subject indexing
Subject indexing, the application of subject terms from controlled vocabularies to journal articles, will be an important aspect of the Knowledge for All system. It will provide search precision that is absent from tools that search only for keywords in full text, such as Google Scholar, and from the many commercial databases whose subject terms do not come from controlled vocabularies.
Unless the source's license allows it, Knowledge for All will not be able to copy subject terms from existing journal article records because of copyright restrictions. This is discussed in more detail in Copyright of Journal Article Metadata. There may be exceptions when the subject terms were chosen from the same thesaurus used by Knowledge for All (such as the widely used MeSH) and the metadata record is part of an open data set. Otherwise, volunteer indexers will select subject terms from controlled vocabularies for all articles in the Knowledge for All database.
Many thesauri are currently available online for different subject areas and in different languages. They are being located and listed on the Thesauri for Subject Indexing page. Once a near-complete list has been compiled, the Knowledge for All community will need to decide which thesauri will be used for which subject areas and ensure that indexers use those thesauri consistently.
It may be necessary to adapt and modify existing thesauri for Knowledge for All or to import terms and structures into the Knowledge for All system. If so, thesauri whose licenses allow this should be selected, or Knowledge for All can request permission from thesaurus maintainers. Another option is to develop thesauri from scratch for Knowledge for All, but this takes considerable time and skill and so should be avoided where possible.
Some original thesaurus construction and modification will likely be necessary to represent and describe adequately all subjects covered by the published scholarly journal literature, particularly in areas that may be neglected or misrepresented by existing thesauri. These gaps will be identified once a complete list of existing thesauri has been compiled and the thesauri have been analyzed by subject specialists, or as indexers begin to use them. Thesaurus construction and indexing can be controversial because of the politics of language and naming (de la tierra, 2008). It is recommended that Knowledge for All make every effort to include diverse perspectives in the construction and modification of its thesauri.
Abstracts
Copying abstracts from journal metadata records is also restricted by copyright unless the source's license allows it. As discussed in detail in Copyright of Journal Article Metadata, creating new abstracts for most articles in the system would take considerable time and is not a feasible option. Thus, it is recommended that Knowledge for All find a lawful way to resolve the copyright issue with abstracts and make every effort to provide access to existing abstracts of articles in the database.
References
de la tierra, tatiana. "Latina Lesbian Subject Headings: The Power of Naming." Radical Cataloging: Essays at the Front. Ed. K.R. Roberto. Jefferson: McFarland, 2008. 94-102.
