SAC Subcommittee on Metadata and Subject Analysis
Midwinter 1999 - Philadelphia, PA
February 1, 1999

Present: Chan, Culbertson, Dede, El-Hoshy, Greever, Glassel, Harken, Trumble, Wool, Childress, Hickey, Cervone, Roe, Mugridge, Hoffman, Pillow, Clarke, Casey

The meeting began with a welcome from Diane Dates Casey, chair, and introductions of subcommittee members and guests.

Working Group on Recommendations for the Dublin Core Record Subject Element

Lois Mai Chan agreed to lead the working group, which will draw up the recommendations and their rationale for defining the subject element of the Dublin Core record (DC) based on the subcommittee discussions. Shannon Hoffman, Bonnie Dede, Rebecca Culbertson and Rebecca Mugridge will work with her. A summary of the recommendations and their rationale will be developed between Midwinter and Annual.

CORC

Thomas Hickey of OCLC described the new CORC project, a system for cooperative cataloging of Internet resources using Dublin Core and MARC. A traditional MARC approach is being taken toward subject analysis, with automatic assignment of Dewey class numbers. Eric Childress of OCLC also participated in the subcommittee discussion of the subject element in the Dublin Core. Casey invited Childress or another OCLC representative to serve as liaison to the subcommittee.

Discussion of the Subject Element in the Dublin Core Record

Audience

The consensus of the group was that the primary audience for these recommendations is the library community, but that the metadata community and web page creators should also be taken into consideration. [Perhaps a separate set of recommendations should be developed for the metadata community and web page creators --- Suggested at CCS Exec.] Chan emphasized that the characteristics of the DC -- simplicity, semantic interoperability and flexibility -- need to be taken into account as recommendations are developed. The subject element should be adaptable and flexible enough to accommodate simple schemes as well as elaborate ones. In addition, eight other elements of the DC can be used to express subject, so their use needs to be taken into account.

Other Groups Working on Subject Element

Casey reported that both she and Chan had searched on the Web to identify other groups working on the subject element of the DC and could find none. Childress confirmed that he was unaware of any other groups. Casey proposed that she contact Stuart Weibel of OCLC and offer the subcommittee as the Working Group on the DC Subject Element. The group agreed. Also, subcommittee members were encouraged to subscribe to the electronic discussion list on DC subject and description, dc-subdesc. Subscription information is found at http://www.mailbase.ac.uk/lists/dc-subdesc.

Keyword/Controlled Vocabulary

The consensus of the group was that both keywords and controlled vocabulary should be accommodated in the subject element. All existing controlled vocabularies relevant to the subject of an Internet resource are appropriate. However, the scheme should be specified. A simplified vocabulary for web page creators would be helpful. The revised caption headings of DDC might serve such a purpose, as might the controlled vocabularies being used in state digital library projects, such as California's. Lynn El-Hoshy noted the challenge of maintaining a simplified controlled vocabulary, which is necessarily dynamic and tends to become more complex. A combination of keywords and controlled vocabulary could obtain the desired level of specificity. Chan noted that simplicity can be applied both to semantics and to syntax in a controlled vocabulary. Classificatory numbers, such as Dewey, address the multilingual challenge of subject analysis. The consensus of the group was that mappings, crosswalks, and harmonizations, such as the Unified Medical Language System (UMLS) and the H.W. Wilson harmonization of Sears and LCSH, were needed for interoperability.
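As an illustration only, the idea of accommodating keywords and controlled terms side by side, with the scheme specified for the latter, might be sketched as follows. The class name, field names, and sample values are hypothetical and were not part of the discussion; this does not represent any Dublin Core syntax.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SubjectValue:
    """One value of the subject element (hypothetical structure)."""
    value: str
    scheme: Optional[str] = None  # None marks an uncontrolled keyword

    def label(self) -> str:
        # Show the scheme when one is specified, otherwise flag as keyword.
        tag = self.scheme if self.scheme else "keyword"
        return f"{self.value} [{tag}]"


# Illustrative sample values, not taken from the minutes.
subjects = [
    SubjectValue("metadata"),
    SubjectValue("Cataloging of computer network resources", scheme="LCSH"),
    SubjectValue("025.3", scheme="DDC"),
]
labels = [s.label() for s in subjects]
```

Keeping the scheme alongside each value is what would allow a later mapping or crosswalk to tell controlled terms from free keywords.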

Specificity

The level of specificity depends upon the application. The consensus was that the most specific level of subject analysis should be applied. Greg Wool pointed out the use of structured subject references in assisting end users to find the desired level of specificity. However, this would be a design issue for a search engine or database system. Specificity could also be brought out in the description element and the coverage element, which reflects temporal and spatial data, as well as in the format and type elements, which accommodate form data. While at this time most search engines cannot accommodate subject strings, subject strings would still be one way of expressing subject for those with the expertise to construct them. Many individuals might want to deconstruct subject strings using a combination of the DC elements mentioned above. Deconstructed LCSH subject strings should be so designated. The consensus of the group was that all combinations should be permissible: local subject headings, semantics without syntax, syntax without semantics, both semantics and syntax, as well as a combination of the nine DC elements related to subject. A system design recommendation would be to develop search engines which could combine DC element searches to bring the subject together. Simplified syntax facilitates mappings and semantic interoperability.
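A rough sketch of what deconstructing a subject string across DC elements could look like is given below. It assumes each part of the string has already been labeled by a cataloger as topical, geographic, chronological, or form. The mapping table, function name, "(deconstructed)" marker, and sample heading are all hypothetical, chosen only to mirror the discussion above.

```python
from collections import defaultdict

# Assumed mapping from kind of subdivision to DC element, following the
# discussion: coverage for temporal and spatial data, type for form data.
KIND_TO_ELEMENT = {
    "topical": "subject",
    "geographic": "coverage",
    "chronological": "coverage",
    "form": "type",
}


def deconstruct(parts, scheme="LCSH"):
    """Distribute labeled (text, kind) parts of a string across DC elements."""
    record = defaultdict(list)
    for text, kind in parts:
        record[KIND_TO_ELEMENT[kind]].append(text)
    # Designate the source scheme and the fact that the string was taken apart.
    record["subject_scheme"].append(f"{scheme} (deconstructed)")
    return dict(record)


# Made-up heading in the style of a subject string:
# Libraries--Illinois--History--20th century--Periodicals
heading = [
    ("Libraries", "topical"),
    ("Illinois", "geographic"),
    ("History", "topical"),
    ("20th century", "chronological"),
    ("Periodicals", "form"),
]
record = deconstruct(heading)
```

A search system that combined the subject, coverage, and type elements at query time could then bring the parts of the original heading back together.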

Consistency

The library community already accepts the value of consistency, but the case might need to be made to non-librarians. Within a specified digital collection or project, the application of subject analysis should be consistent; in other words, the same semantics and syntax should be applied throughout by catalogers. Compatibility with other metadata schemes is desirable. When a controlled vocabulary is used, the version of the vocabulary should be indicated along with the date on which the document was cataloged.

Class Numbers

The consensus was that in a library environment class numbers should be applied to DC records. The class number serves to express subject to a multilingual audience. Because in a digital environment the class number is not used for shelf-position organization, multiple class numbers could be applied to a resource. Cutters that express subject should be included, while cutters that determine shelf position should be eliminated. The level of specificity of the subject headings and the class number must be compatible. The more information that is given through subject analysis, the greater the possibilities for precoordination and postcoordination.

Assignments

Everyone who took notes on the discussion of recommendations and rationale for defining the subject element of the Dublin Core record is requested to send their notes to Lois (loischan@pop.uky.edu) by February 15. Lois will use the notes to create a rough draft and share that draft with other members of the working group in April. The summary will be shared with the subcommittee by the beginning of June. The focus of the subcommittee’s annual meeting will be discussion and, if need be, revision of the recommendations and rationale. Hopefully, the final recommendations and rationale will be completed by Midwinter 2000 for submission to SAC.

All subcommittee members are encouraged to subscribe to the electronic discussion group, dc-subdesc.

Casey may be contacting subcommittee members about developing a program on metadata and subject analysis.

Respectfully submitted,

Diane Dates Casey, Chair
d-casey@govst.edu