American Library Association
Annual Conference (129th : 2010 June 24-29 : Washington, D.C.)
Report on cataloging, etc. meetings

ALCTS/LITA/RUSA Machine-Readable Bibliographic Information Committee (MARBI) (Saturday morning and Sunday afternoon)

The agenda for MARBI may be found at http://www.loc.gov/marc/marbi/an2010_age.html and papers are linked from the agenda. The report below is in order by proposals and discussion papers.

Proposal No. 2010-06: Encoding the International Standard Name Identifier (ISNI) in the MARC 21 Bibliographic and Authority Formats
Based on a discussion paper at the last MARBI meetings, this proposal passed and will allow for the encoding of the ISNI in subfield $0 (zero) in heading fields in a bibliographic record. This subfield is defined more broadly so that it could handle other standard identifiers, such as CONA or LC/NAF. In authority records, the 024 would be used for the entity.
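
For readers who want to see what an ISNI in subfield $0 might look like in practice, here is a minimal Python sketch. It rests on two assumptions that are not spelled out in this report: that the identifier in $0 is preceded by a source code in parentheses (shown here as "(isni)"), and that the final character of a 16-character ISNI is an ISO 7064 MOD 11-2 check character. The function names and the sample digits are purely illustrative.

```python
def isni_check_char(base15: str) -> str:
    """Return the ISO 7064 MOD 11-2 check character for 15 base digits."""
    if len(base15) != 15 or not (base15.isascii() and base15.isdigit()):
        raise ValueError("expected exactly 15 ASCII digits")
    total = 0
    for ch in base15:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def compact_isni(isni: str) -> str:
    """Strip the spaces or hyphens used for display and uppercase a final x."""
    return isni.replace(" ", "").replace("-", "").upper()


def is_valid_isni(isni: str) -> bool:
    """True if the string has 15 digits plus a matching check character."""
    compact = compact_isni(isni)
    base = compact[:15]
    if len(compact) != 16 or not (base.isascii() and base.isdigit()):
        return False
    return isni_check_char(base) == compact[15]


def subfield_0(isni: str) -> str:
    """Format a validated ISNI as a $0 value (the prefix form is an assumption)."""
    if not is_valid_isni(isni):
        raise ValueError("not a well-formed ISNI: " + isni)
    return "$0(isni)" + compact_isni(isni)


# Sample digits only; not taken from the proposal.
print(subfield_0("0000 0002 1825 0097"))
```

A check like this only catches mistyped digits; it says nothing about whether the identifier has actually been assigned to the entity in the heading.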

Proposal No. 2010-07: ISBD Punctuation in the MARC 21 Bibliographic Format
Most American libraries use explicit ISBD punctuation in their bibliographic records. The Germans and Austrians have adopted MARC 21 and this proposal was to accommodate their practice of using encoding to provide some of the punctuation. When AACR2 was adopted thirty years ago, the definition of Leader/18 conflated descriptive practice with cataloging rules. With the projected adoption of RDA in the next year or so, prospective coding will use Leader/18 only to indicate descriptive practice, mostly whether the record is ISBD or not. The cataloging code will be indicated in 040 $e, with "rda" for RDA or other codes as appropriate. There was quite a convoluted discussion of this paper (you had to be there) but code "c" will be added for the German situation. The definitions of the other codes have also been shortened.

Proposal No. 2010-08: Encoding Scheme of Coordinate Data in Field 034 (Coded Cartographic Mathematical Data) of the MARC 21 Bibliographic and Authority Formats
The proposal, as written, did not accommodate decimal minutes and will therefore come back with that and a couple of other matters clarified, e.g., zero fill to the left of the decimal. It was stated that decimal minutes could probably be checked by their structure (character and number of digits) but that the scheme should be explicit.
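
The idea of recognizing a coordinate form by its structure can be illustrated with a small Python sketch. The four patterns below (a hemisphere letter followed by degrees/minutes/seconds, decimal degrees, decimal minutes, or decimal seconds, all zero-filled to three digits of degrees) are an assumption about the forms under discussion, not the wording of the proposal, and the function name is made up for this example.

```python
import re

# Each pattern is distinguished only by where the decimal point falls
# and by how many digits precede it (assumed forms, see note above).
PATTERNS = [
    ("degrees/minutes/seconds", re.compile(r"[NSEW]\d{3}\d{2}\d{2}")),
    ("decimal degrees", re.compile(r"(?:[NSEW]|[+-])?\d{3}\.\d+")),
    ("decimal minutes", re.compile(r"[NSEW]\d{3}\d{2}\.\d+")),
    ("decimal seconds", re.compile(r"[NSEW]\d{3}\d{2}\d{2}\.\d+")),
]


def classify_coordinate(value):
    """Return the name of the first pattern the whole value matches, else None."""
    for name, pattern in PATTERNS:
        if pattern.fullmatch(value):
            return name
    return None


for sample in ["E0793200", "E079.533265", "E07932.5332", "E0793235.575", "E79.5"]:
    print(sample, "->", classify_coordinate(sample))
```

The last sample is rejected because the degrees are not zero-filled, which is the kind of ambiguity that led to the request that the scheme be stated explicitly rather than inferred.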

Proposal No. 2010-09: Addition of Subfield $u to Field 561 (Ownership and Custodial History) to the MARC 21 Bibliographic and Holdings Formats
The proposal, which passed, would allow for links to provenance information. The subfield $u was made repeatable, with its position indicating which part of the provenance it covers. Since provenance is important with cultural objects, being able to link from a bibliographic record to a published (posted) catalog or other information could be very beneficial.

Discussion Paper No. 2010-DP04: Encoding the International Standard Text Code (ISTC) in the MARC 21 Bibliographic and Authority Formats
This paper suggested three options for including the ISTC in bibliographic and authority records, both for the item as a whole and for components and related works. The second option, i.e., using a related work field, was preferred by most everyone at MARBI. This does not involve format changes but examples will be added. So far, the ISTC is mostly used by publishers in their ONIX implementation.

Discussion Paper No. 2010-DP05: Language Coding for Moving Images in Field 041 of the MARC 21 Bibliographic Format
Films and videos can have a real mixed bag of languages, e.g., original language, language of subtitles, language of accompanying material, language of titles (and subtitles) in silent films, sign language. This paper addressed issues of intermediary language of translation and other complications. There were also those who said that some resources are complicated beyond summary and a note is perhaps the only way to say it (that is, uncodable). From the discussion, LC and OLAC will draft some wording for online discussion between now and the next MARBI meetings at Midwinter 2011. (I volunteered Liz O'Keefe and me to come up with a good example of an exhibition catalog that had language complexity, something along the lines of a bilingual catalog in which the essays started in a variety of languages and are all translated into one or two languages unless they are already in that language. If you happen to have such a catalog up your sleeve, please let me or Liz know.)

LC report
The MARC source codes have been reorganized and amalgamated where possible. Though most folks are now looking at the MARC documentation online, the full format is in the mail and Concise is at the printer. In the future, they may print only the full Bibliographic Format and the concise Bibliographic, Authority, and Holdings formats. The SKOS service at id.loc.gov has been expanded to include the relators lists, TGM, PREMIS (3 lists), language code list, countries code list, and Geographic Area Codes. MODS 3.4 has been published. A form/genre list will be derived from the full LCSH and will be coded specifically as such rather than as LCSH; this will mean more care in coding LCSH vs "lcgft" in $2 with indicator 7.


Subject Analysis Committee (Sunday morning and Monday afternoon)

SAC is establishing a task force to look at faceting, e.g., faceted vocabularies such as AAT and FAST, and software such as PRIMO that uses "facets" in an opac interface. The group is starting out with a literature search. I am on the task force and Scott Opasik (UIUC) is the chair. There will be a wiki and I'll let arlis-cpdg know if it becomes public.

Janis Young reported on subject cataloging developments at the Library of Congress. The ILS is to be upgraded in November to the Tomcat interface (WebVoyage 7). LC is experimenting with XML Data Store for aggregating metadata from the opac, EAD, and several digital collections; a beta version is expected in early fall 2010. They are also looking at MetaProxy, which will replace Z39.50. The Prints & Photographs catalog interface has been revised. They prepared 200 EAD finding aids this year and 947 are now available. The Copyright Office and CIP Program are now using a claim feature which has improved receipts of depository items by some 40%. Four catalogers are in a pilot project to use ONIX data for CIP cataloging. Geographic coordinates have been added to 77,000 name authority records for places, using FAST metadata. The new Korean romanization scheme is being implemented.

The Virtual International Authority File (VIAF) now includes data from Library and Archives Canada, the national library of Poland, and the Union List of Artist Names. ULAN is one of the first special library vocabularies; most of the other data is drawn from national libraries.

A "subject suggester function" has been added to http://id.loc.gov and LC has received one suggestion about African peoples. About 800 subject headings related to Cooking (formerly Cookery) have been revised, with changes to references on several hundred more. About 45,000 validation records have been created; these are combinations of topics and subdivisions that would not NEED to be explicitly established but are helpful in automated authority processing. Subject subdivisions with "video" have been replaced by "film" or "film and television." Headings related to religions, e.g., Buddhist art, are now entered directly; if any inverted forms are discovered in LCSH, a SACO proposal should be prepared. Shopping centers have been moved from SAF to NAF. Frankenstein has been divided into separate headings for the doctor and the monster. The Folger Library is working on SACO proposals for the Shakespearean characters; non-specific headings will be established for repeated characters such as clown and ghost.

The genre/form terms in LCSH are being spun off into a separate thesaurus and it will be given its own MARC code, i.e., lcgft. LC will implement this in late 2010 or early 2011. LC has now completed the moving image headings and about 65 cartographic ones were approved in May. Janis will be working with ATLA on religious genre/form terms. A separate SACO proposal form will be developed for LCGFT.

Much more information on LC activities in the first half of 2010 may be read at http://www.loc.gov/ala/an-2010-update.html

Joseph Miller reported that Sears will be using a single heading for all periods of Russia, rather than LC's Russia, Soviet Union, and Russia (Federation).

The Music Library Association is forming a SACO funnel. Michael Colby (UC Davis) will be the coordinator and will be a member of the MLA Bibliographic Control Committee. They are continuing to look at genre/form headings.

Patricia Dragon reported on the work of the Genre/Form subcommittee. They are using ALA Connect's wiki for their work, in spreadsheet mode. They will determine if they can make it available to the public, read only. New subject areas that they are considering working on include theater, dance, military, art and photography, and children's headings. They have talked some about where genre/form fits in RDA, probably at the manifestation/item level. A preconference is under development for Annual 2011. Adam Schiff (U of Washington) is the incoming chair of the subcommittee.

SAC is undertaking a review of H1095 subdivisions. LC has asked for SAC help in determining which subdivisions might be consolidated and which might be compiled as specific categories. Phase One identified categories; Phase Two will put the subdivisions into these categories; Phase Three will look at headings which could be made obsolete, synonyms, new patterns, or possible revisions or rewordings. Again, the group will look at making its preliminary work available.

Marcia Zeng did a presentation on the Functional Requirements for Subject Authority Data (FRSAD). It is available at http://connect.ala.org/files/28555/frsad_sac2010_pdf_15575.pdf


SACO-at-Large (Sunday morning)

Funnels: Music SACO being formed (Michael Colby, coordinator); Slavic SACO Funnel (Joanna Epstein, coordinator; 14 institutions have signed up); MELA Arabic Funnel (Joyce Bell, coordinator; looking for more members).

Kevin Ford did a presentation on the LC Authorities and Vocabularies Web Service at http://id.loc.gov and on Linked Data initiatives.

In RDA, fictitious entities, e.g., Barbara Bush's dog, can be creators/authors and will therefore be established in NAF rather than SAF as they now would be. They should continue as SACO proposals until RDA is implemented. The same goes for family names. Either of these categories of headings might appear on RDA test records but the existing division between the authority files (NAF and SAF) should be observed.


"Cataloging alchemy: making your data work harder" (OCLC) (Monday morning)
streaming video: http://www.oclc.org/ca/en/multimedia/2010/cataloging_alchemy_ala_2010.htm

Glenn Patton talked about WorldCat today with its 195 million records representing 1.6 billion holdings. LC is about 6.19%, national libraries 17.27%, vendors 1.81%, OCLC members 74.73% (based on the squawking about vendor records, you'd think that percentage would be WAY higher; guess it's those new new new imprints!). OCLC is cranking up its Duplicate Detection and Resolution (DDR) program again, processing the whole database. The new version of the program can look at all formats, unlike the earlier version, which only worked on books. Of 45 million records analyzed so far, 2.2 million have been merged.

Jean Godby talked about mapping bibliographic metadata and working with combinations of ONIX metadata from publishers with MARC metadata. MARC is more packaged than some other schemes which complicates some of the mapping and manipulation.

Rich Greene talked about the Global Library Manifestation Identifier (GLIMIR) which OCLC hopes to use to connect records for the same manifestation, e.g., different records by language of cataloging, different printings, different records by reproduction. The DDR software is the basis for this work but will follow a different flow. For example, records that differ only in language of cataloging remain separate bib records but would share one GLIMIR. Lessons learned: how to deal with ss, sz, and ß; encoding differences; differences in rules and practices, especially if non-AACR and non-MARC21; mistakes and differences of opinion; clustering easier in books than other formats, especially tough in videos. Phase One is focusing on reproduction clusters including reprints. GLIMIR identifiers will be applied to the clustered results. The matching software is projected to be ready by fall 2010. It is hoped that the clustering would enable OCLC to, for example, display a record in German for a germanophone searcher.
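
To make the clustering idea concrete, here is a toy Python sketch. It is not OCLC's DDR or GLIMIR matching logic, which compares far more data. It assumes each record is a small dictionary with made-up keys (id, title, publisher, cat_lang), uses Python's casefold() to treat the eszett (ß) as ss, and gives every group of matching records one hypothetical cluster identifier while leaving the records themselves separate.

```python
from collections import defaultdict


def normalize(text):
    """Casefold (which maps the eszett to ss), drop punctuation, squeeze spaces."""
    folded = text.casefold()
    kept = "".join(ch for ch in folded if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def match_key(record):
    """Build a comparison key that ignores the language of cataloging."""
    return (normalize(record["title"]), normalize(record["publisher"]))


def cluster(records):
    """Group record ids by match key and label each group with a made-up id."""
    groups = defaultdict(list)
    for record in records:
        groups[match_key(record)].append(record["id"])
    return {
        "cluster-%d" % number: ids
        for number, (_, ids) in enumerate(sorted(groups.items()), start=1)
    }


# Invented sample records: rec1 and rec2 differ in cataloging language and
# in the spelling of ß/ss; rec3 is a different title.
SAMPLE_RECORDS = [
    {"id": "rec1", "title": "Die große Straße", "publisher": "Beispiel Verlag", "cat_lang": "ger"},
    {"id": "rec2", "title": "Die grosse Strasse.", "publisher": "Beispiel Verlag", "cat_lang": "eng"},
    {"id": "rec3", "title": "Eine andere Straße", "publisher": "Beispiel Verlag", "cat_lang": "eng"},
]

print(cluster(SAMPLE_RECORDS))
```

Note that this simple folding does nothing for the sz spelling mentioned among the lessons learned, which would need its own rule.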

Glenn Patton then described work with VIAF and RDA. Currently, VIAF reflects 17 institutions, 20 files, and 14 million records for personal names (11.5 million clusters), with full display of records and links to the native metadata. The VIAF group wants to move forward with corporate and geographic names, more participants, ISNI and ORCID, linked data, and FRBR-matching.

In the Q&A, they admitted that DDR does not use PR 936s for detecting different-cataloging-language twins.


Ronald J. Murray, with Barbara Tillett: "From Moby-Dick to mashups" (Monday morning)

I cannot possibly describe this work very well but, as you were living it, the talk was exciting, the concepts were compelling, and the diagrams are beautiful. If you google "from moby dick to mashups" or "frbr paper tool," you can see some relevant work as it evolves or you can see the slides for the presentation to CC:DA at http://files.me.com/kandroma1/h3h5oo

