CIMI: Consortium for the Interchange of Museum Information Dublin Core (DC) Metadata Testbed

Lynn Ann Underwood July 1999
Museum Records Manager
Solomon R. Guggenheim Museum

What is CIMI?
"A group of institutions and organizations that encourages an open standards-based approach to the management and delivery of digital museum information."
Formed 1990.
Recent Projects:
Z39.50
IIM (Integrated Information Management)
Dublin Core (DC) Metadata Testbed

Metadata? What Are We Talking About?
Metadata is a fashionable term.
Used to describe People, Places, & Objects (Resources).
Structured data about data.
Cataloguing, indexing, and documentation are all types of metadata.
Commonly associated with electronic and networked information.
Databases & Web Pages
CIMI's definition acknowledges that museums document objects/items, collections, programs, staff, etc.
Purpose for CIMI is information retrieval.

How is Metadata Used?

Metadata as part of a Resource Description Community
A resource description community is characterized by common semantic, structural and syntactic conventions used for the exchange of information.
Through the use of detailed standards such as MARC and AACR2, the library community promotes interoperability.
The art community developed the Art & Architecture Thesaurus (AAT) and the Categories for the Description of Works of Art (CDWA); the art museum community in particular can use these alongside metadata to share resources.

Why Use Dublin Core?
A useful tool to refine web searching.
Repurpose information that already exists.
It is easier to adopt an interdisciplinary standard already in use.
Interoperability: Allows different communities (libraries, archives, businesses, museums, etc.) to search for data using a common basis.
Establishes a basis for next-generation projects.

The Dublin Core

DC "Simple"
"Simple" or unqualified DC is comprised of the 15 elements with no further content definition.
Current "simple" definitions are based on IETF (Internet Engineering Task Force) RFC 2413 document.
The CIMI working group resisted the temptation to move directly to qualified DC.
Instead, CIMI rigorously tested DC "Simple"; the testbed is considered the primary application testing "Simple".
This process heightened the group's awareness of the need for qualifiers (element and value).
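A minimal sketch of what a DC "Simple" record can look like when embedded in a web page. The element names are the 15 from RFC 2413; the `DC.` prefix on `<meta>` names is a widely used convention assumed here, and the function name and sample values are hypothetical placeholders.

```python
from html import escape

# The 15 element names defined in RFC 2413.
DC_ELEMENTS = (
    "Title", "Creator", "Subject", "Description", "Publisher",
    "Contributor", "Date", "Type", "Format", "Identifier",
    "Source", "Language", "Relation", "Coverage", "Rights",
)


def to_meta_tags(record):
    """Render an unqualified DC record as HTML <meta> tags.

    `record` is a list of (element, value) pairs, so any element
    may be omitted or repeated.
    """
    lines = []
    for element, value in record:
        if element not in DC_ELEMENTS:
            raise ValueError(f"not a DC element: {element}")
        content = escape(value, quote=True)
        lines.append(f'<meta name="DC.{element}" content="{content}">')
    return "\n".join(lines)


# Placeholder values, not taken from any real museum record.
example = [
    ("Title", "Example Artwork"),
    ("Creator", "Example, Artist"),
    ("Type", "physical object"),
]
print(to_meta_tags(example))
```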

DC Qualified
Qualified adds descriptive precision in retrieving a resource. This is achieved through the development of a substructure. For instance "Role" is a desired term to further describe, or "qualify", the CREATOR element.
Creator=Name.Creator Role=Artist
Qualified also allows for terms to be drawn from controlled vocabularies (LCSH, AAT) or classification schemes (DDC). The use of hierarchies provides further definition (semantic specificity).
Guggenheim family -- art patronage
Caution: when using DC Qualified, elements must degrade gracefully to preserve interoperability.
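One way to picture graceful degradation is as a function that drops qualifiers and keeps the element and its value. This is a sketch under assumptions: the `Statement` structure, the `degrade` name, and the `Scheme` qualifier label are hypothetical, and tagging the subject string as LCSH is for illustration only.

```python
from typing import NamedTuple


class Statement(NamedTuple):
    element: str      # one of the 15 DC elements
    value: str        # the element's value
    qualifiers: dict  # hypothetical qualifier name -> qualifier value


def degrade(statements):
    """Reduce qualified statements to DC "Simple" (element, value) pairs.

    Qualifiers are discarded; the value must still make sense on its
    own under the unqualified element.
    """
    return [(s.element, s.value) for s in statements]


qualified = [
    Statement("Creator", "Name", {"Role": "Artist"}),
    # The scheme label below is an assumption made for this example.
    Statement("Subject", "Guggenheim family -- art patronage", {"Scheme": "LCSH"}),
]

for element, value in degrade(qualified):
    print(f"{element}={value}")
```

A client that understands only DC "Simple" still receives a usable Creator and Subject; it simply loses the added precision.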

DC Qualified
DC Qualified is currently under development by DC working groups.
Working Groups:

DC Requirements
All 15 DC elements are optional.
All 15 DC elements may be repeated.
Proposed changes to the 15 core elements must be made through the framework of the DC working group.

DC Requirements
1:1 Principle
"...one object (or collection), resource, or instantiation can only be described within a single metadata record."
1:1 is not formally adopted.
This principle, along with the DC Type field, assists with description of the resource.
RDF (Resource Description Framework) reinforces the 1:1 rule.
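A sketch of the 1:1 principle applied to an artwork and a photograph of it: each gets its own record, and the surrogate points back to the original. The identifiers, the Type values, the title pattern, and the use of Source as the link are assumptions made for illustration, not prescribed practice.

```python
def make_surrogate_record(original, surrogate_id, surrogate_format):
    """Build a separate DC record for a surrogate of `original`.

    Records are dicts mapping an element name to a list of values.
    The surrogate's Source carries the original's identifier(s).
    """
    return {
        "Identifier": [surrogate_id],
        "Title": [f"Photograph of: {original['Title'][0]}"],
        "Type": ["image"],
        "Format": [surrogate_format],
        "Source": list(original["Identifier"]),
    }


# Hypothetical record for the physical object itself.
artwork = {
    "Identifier": ["urn:example:artwork-1"],
    "Title": ["Example Artwork"],
    "Type": ["physical object"],
}

# The photograph is described in its own record, not inside the artwork's.
photo = make_surrogate_record(artwork, "urn:example:photo-1", "image/jpeg")
print(photo)
```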

XML: eXtensible Markup Language
Based on SGML.
Encoding syntax.
Tools under development.

RDF: Resource Description Framework
A scalable or "extensible" data model.
It provides a framework for exchanging different types of metadata.
Types of Metadata (GILS, INDECS, IMS)
Intended to be machine generated and understandable.
The Request for Comment (RFC) was announced in March 1999
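To make the XML and RDF slides concrete, here is a sketch that serializes a DC "Simple" record as RDF/XML using Python's standard library. The DC namespace URI, the lowercase property names, and the function name are assumptions for this example; the resource identifier and values are placeholders.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Assumption: the DC element namespace URI used for this sketch.
DC_NS = "http://purl.org/dc/elements/1.1/"


def to_rdf_xml(about, pairs):
    """Serialize (element, value) pairs as one rdf:Description."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": about}
    )
    for element, value in pairs:
        # Lowercase property names are a convention assumed here.
        child = ET.SubElement(description, f"{{{DC_NS}}}{element.lower()}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


print(to_rdf_xml(
    "urn:example:artwork-1",  # hypothetical identifier
    [("Title", "Example Artwork"), ("Creator", "Example, Artist")],
))
```

Each rdf:Description is about exactly one resource, which is how RDF reinforces the 1:1 rule.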

Why DC for Museums
Museum community requires a method to access databases with different underlying schemas because the community historically lacks content standards.
The Web provides museums with an opportunity to share with other museums, libraries, archives, and individuals through the use of commonly understood semantics.

What is Museum Specific?
Emphasis on attributes of physical objects.
Associate physical object with persons, places, and events.
Need to describe items, collections, institutions, people, and events.
Need to account for surrogates such as photographs.

Museum Metadata Model

CIMI Assumptions for Museums
DC is appropriate for use in describing both physical and digital resources.
DC is easy to learn and simple to use: Is it usable by non-cataloguers?
Information can be meaningfully and efficiently extracted from existing museum systems in order to populate DC records.
The creation of a DC record to describe a museum is cost-effective.
DC aids the discovery of resources more than access to the underlying Collection Management System might.

CIMI Identifies DC Challenges for Museums
Tension: functionality and simplicity.
Tension: extensibility and interoperability.
Human and machine creation and use.
Community-specific functionality, creation, administration, access.

Testbed Participants
Over 18 participants were involved in both 1998 and 1999.
Access Providers
Software Vendors
Technical Support Personnel
Content Providers
Cultural Heritage
Art
Natural History

Guggenheim Records
The Guggenheim has approximately 5,600 records in an Access database. Of the 15 DC Elements only a handful could be mapped.

Guggenheim Records
Because the Guggenheim records only sparsely populated the 15 DC elements, my methodology for testing the elements was to build 134 records from scratch. This process of creating more robust records helped identify documentation projects, such as the addition of subject terms. It also helped address information integration issues within the museum.

Guggenheim Records
Creating Object, Collection, Institution, & Event records required information to be brought together from different departments. For object records I combined information from the database with data from the curatorial and registrar files. Data for collection records was drawn from electronic and paper files in addition to our web site. Institution records were created using our web site and print catalogue information. For event records I used exhibition publications, brochures, and our web site.

Guggenheim Contribution
The 134 full or "rich" records describe individual artworks, collections, the museum, and events. Also contributed were over 5,600 collection records exported from the collection database. Although DC record creation is intended to be an export routine, most museums may find, as we did, that their exported DC records are not very robust. Providing the testbed with both rich and sparse records will benefit further user testing.
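An export routine of this kind can be sketched as a field mapping from a collection database row to DC elements. The column names, the mapping choices (for example, medium to Format), and the sample row are hypothetical; they do not describe the Guggenheim database.

```python
# The 15 element names defined in RFC 2413.
DC_ELEMENTS = (
    "Title", "Creator", "Subject", "Description", "Publisher",
    "Contributor", "Date", "Type", "Format", "Identifier",
    "Source", "Language", "Relation", "Coverage", "Rights",
)

# Hypothetical source columns mapped to DC elements.
FIELD_MAP = {
    "artist": "Creator",
    "title": "Title",
    "date_made": "Date",
    "medium": "Format",
    "accession_no": "Identifier",
}


def export_row(row):
    """Map one database row (a dict) to (element, value) pairs.

    Empty or missing columns are skipped, so sparse rows give
    sparse DC records.
    """
    record = []
    for column, element in FIELD_MAP.items():
        value = (row.get(column) or "").strip()
        if value:
            record.append((element, value))
    return record


def unpopulated(record):
    """List the DC elements that the record leaves empty."""
    used = {element for element, _ in record}
    return [e for e in DC_ELEMENTS if e not in used]


row = {"title": "Example Artwork", "artist": "", "accession_no": "X.1"}
record = export_row(row)
print(record)
print("Unpopulated:", unpopulated(record))
```

Listing the unpopulated elements is one way to spot documentation projects, such as adding subject terms.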

Testbed Products
Guide to Best Practice: Dublin Core
http://www.cimi.org/documents/meta_bestprac_VO31.html
Drafted Winter 1998
Peer Review Spring 1999
Published Summer 1999
Repository of over 300,000 records
Contains museums, collections, artifacts
DC "Simple" records either created by hand or exported from legacy systems.

Outcomes

Outcomes: CIMI Institute
Responses included:
Need for more concrete examples, DC, XML, RDF.
Would like guidance on how to implement, including storage strategies for archiving, retrievability, and architecture.
Fuller description of tools.
More discussion on cost.
Practical examples from the end user's perspective: what does this look like to the user who is searching for the resource (delivery mechanism)?

Summary

WWW Infrastructure Evolving
Resource Description Framework (RDF) will allow rich metadata semantics for documents
http://www.w3.org/RDF/
Extensible Markup Language (XML)
will allow highly structured documents and rich linking (relationship) capabilities
http://www.w3.org/XML/
Uniform Resource Names (URNs)
will allow for persistent, globally unique identifiers

Resources
Dublin Core Home Page
http://purl.org/dc
Metadata Matters
http://www.nla.gov.au/meta
IFLA Metadata Resources page
http://www.ifla.org/II/metadata.htm
D-Lib Magazine (all DC workshop reports)

DC Resources
Proposed Recommendation of the DC Metadata Initiative
http://purl.org/dc/elements/1:1
Modifications to this document will replace RFC 2413
RFC 2413
http://www.ietf.org/rfc/rfc2413.txt

Resources: Metadata Tools
DC Dot (UKOLN)
http://www.ukoln.ac.uk/metadata/dcdot
Reggie (DSTC)
http://metadata.net
The aim of the Reggie Metadata Editor is to enable the easy creation of various forms of metadata with the one flexible program. As it stands, the Reggie applet can create metadata using the HTML 3.2 standard, the HTML 4.0 standard, the RDF (Resource Description Framework) format and the RDF Abbreviated format.

Resources: Metadata Tools
Nordic DC Metadata Template
http://www.lub.lu.se/cgi-bin/nmdc.pl
CORC (OCLC)
http://purl.oclc.org/corc
SEED (Search Engine Evaluation & Development), University of Wolverhampton
Researched the automatic classification of web pages, initial work focused on Dewey Decimal Classification
http://scitsd.wlv.ac.uk:8080/metadata.html

DC Example Record

Appendix B- Original Artwork (Concept)


Example B.7 - Item Record describing the Fabrication of a Conceptual Artwork


DC Dot Dublin Core Generator


DC Dot Dublin Core Generator: RDF


DC Dot Guggenheim Enhanced DC Dot


Thank You!
Lynn Ann Underwood
Museum Records Manager
Documentation & Records
Solomon R. Guggenheim Museum
575 Broadway, 3rd floor
New York, NY 10012-4233
lunderwood@guggenheim.org
Telephone: (212) 423-3871
Telefax: (212) 360-4340