ALCTS NRMIG panel — “Issues related to metadata creation and management”

ALCTS Networked Resources and Metadata Interest Group
Sunday, January 21, 2007
8:00 A.M.-9:30 A.M. Panel Discussion | 9:30 A.M.-10:00 A.M. Business Meeting
Washington State Convention & Trade Center, Room 310

Panel Discussion on Issues Related to Metadata Creation and Management

Panelists and their presentations are:

>> Diane Hillmann
Research Librarian
Cornell University
Metadata management and updating in institutional environments: problems and potential solutions. Describing the “Metadata Management Layer” that any institution pushing content should be thinking about.

[Note: MD is my abbreviation for metadata.]

NSDL (National Science Digital Library) was the source of this work with metadata aggregation, which dealt with “stuff” people had created over the past decade.

Characteristics of library digital projects: grant based, frozen in time, content maintained in a variety of systems & not optimized for MD management. The application layer tried to coordinate access to all this material (a big band-aid).
The MD remains static. These little silos were each different (images, text) and each had different MD.

Our tendency is to think that MD problems need to be solved at the provider end. But that doesn’t fit with reality (we already have this stuff, and some doesn’t fit into that environment). That worked in the MARC world but not in this one.

Better strategy:
Recognize that MD diversity is here to stay; understand that MD management has different requirements than content management. We need to think about it in an aggregated way & leave room for quality improvement—a “future proofing” strategy for digital projects.

Diane showed a diagram of an MD layer sandwiched between the individual objects with their metadata and the user layer.

The layer, silos, and services can all be based on OAI-PMH (easy to set up & it accommodates many schemas). Then you can introduce new services for users. The management processes can be operated by non-technical staff.
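As a rough illustration of why OAI-PMH is easy to set up, here is a minimal sketch of the harvesting side. `ListRecords`, `metadataPrefix`, `oai_dc`, and `resumptionToken` are part of the protocol; the base URL, the sample response, and the function names are hypothetical and not from the talk.

```python
from typing import Optional
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def list_records_url(base_url: str, metadata_prefix: str = "oai_dc",
                     resumption_token: Optional[str] = None) -> str:
    """Build an OAI-PMH ListRecords request URL."""
    if resumption_token:
        # resumptionToken is an exclusive argument: it is sent with the verb only.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def next_token(response_xml: str) -> Optional[str]:
    """Return the resumptionToken in a response, or None when the list is complete."""
    root = ET.fromstring(response_xml)
    token = root.find(f".//{OAI_NS}resumptionToken")
    if token is None or not (token.text or "").strip():
        return None
    return token.text.strip()


# Hypothetical, heavily trimmed response used only to exercise the parser.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

# Hypothetical silo endpoint.
first_url = list_records_url("https://silo.example.org/oai")
follow_up_url = list_records_url("https://silo.example.org/oai",
                                 resumption_token=next_token(SAMPLE))
print(first_url)
print(follow_up_url)
```

A scheduled job would fetch each URL in turn until no token comes back; the fetching itself is left out here so the sketch runs offline.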

MD is most effectively managed at the statement level, not the record level, with each statement sourced (origin, services, age). You don’t have to improve the MD at the silo level.
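To make "statement level, not record level" concrete, here is a minimal sketch of what a sourced statement might look like. The field names, the `shred` helper, and the sample record are my own assumptions for illustration, not Diane's data model.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List


@dataclass(frozen=True)
class Statement:
    resource: str      # identifier of the described object
    prop: str          # e.g. "dc:title"
    value: str
    origin: str        # which silo or provider asserted it
    service: str       # which process produced it ("harvest", "normalize", ...)
    asserted_on: date  # lets you reason about the statement's age


def shred(resource: str, record: Dict[str, List[str]], origin: str,
          service: str, asserted_on: date) -> List[Statement]:
    """Split one record into one sourced statement per property value."""
    return [
        Statement(resource, prop, value, origin, service, asserted_on)
        for prop, values in record.items()
        for value in values
    ]


# Hypothetical record as it might arrive from a silo.
record = {"dc:title": ["Example title"], "dc:creator": ["Example, Author"]}
statements = shred("item-1", record, origin="silo-a", service="harvest",
                   asserted_on=date(2007, 1, 21))
for s in statements:
    print(s)
```

Because every statement carries its own origin, service, and date, an improvement can be added alongside the harvested value without touching the silo.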

Why OAI? Because it supports many formats, and many extensions are available.

What kinds of services? Normalization, terminology, crosswalking, geo-referencing.

JISC [Joint Information Systems Committee, http://www.jisc.ac.uk/ ]: how we deal with vocabularies, with machine processes. Associate string values with URIs so you can detect when things change, with access to a full thesaural structure, & you can support mapping. She recommends reading the JISC report (I’ll ask Diane just what this report is and give you the link later).
Examples: linking resource views & ratings to MD about resources
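The idea of associating string values with URIs so that changes can be detected might look something like the sketch below. The vocabulary, its URIs, and both function names are hypothetical; a real terminology service would sit behind a network API rather than a dictionary.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical vocabulary: URI -> current preferred label.
VOCAB: Dict[str, str] = {
    "http://vocab.example.org/term/1": "Botany",
    "http://vocab.example.org/term/2": "Zoology",
}


def uri_for(label: str, vocab: Dict[str, str]) -> Optional[str]:
    """Find the URI whose preferred label matches the string (case-insensitive)."""
    wanted = label.strip().casefold()
    for uri, preferred in vocab.items():
        if preferred.casefold() == wanted:
            return uri
    return None


def changed_labels(stored: Dict[str, str],
                   vocab: Dict[str, str]) -> List[Tuple[str, str, str]]:
    """Report (uri, old label, new label) wherever the vocabulary has moved on."""
    return [
        (uri, old, vocab[uri])
        for uri, old in stored.items()
        if uri in vocab and vocab[uri] != old
    ]


# The MD keeps the URI alongside the label that was current when it was matched.
uri = uri_for("botany", VOCAB)
stored = {uri: VOCAB[uri]}

# Later the vocabulary maintainers rename the term; the URI stays the same.
updated = {**VOCAB, uri: "Plant sciences"}
print(changed_labels(stored, updated))
```

The string alone could not tell you that "Botany" and "Plant sciences" are the same concept; the stable URI can.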

Geo-referencing: can be considered a subset of terminology services. Add geo coordinates where text strings are insufficient to support mapping.

Augmentation services: “smartening up” MD by adding statements that are missing, etc. Can be based on automation or human review.
Most service interactions (harvesting, etc.) can be automated/scheduled.
The mgmt system would recognize when the MD footprint has changed, so you can re-evaluate.
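One way a management system could notice that harvested MD has changed is to compare a stable hash of each harvest. This is only a sketch: reading "footprint" as a content hash is my assumption, and the function names and sample records are hypothetical.

```python
import hashlib
import json
from typing import Dict, List

Record = Dict[str, List[str]]


def footprint(records: List[Record]) -> str:
    """Hash a harvest so that record order and key order do not matter."""
    canonical = sorted(json.dumps(r, sort_keys=True, ensure_ascii=False)
                       for r in records)
    return hashlib.sha256("\n".join(canonical).encode("utf-8")).hexdigest()


def needs_review(previous_footprint: str, records: List[Record]) -> bool:
    """True when the new harvest differs from the one already evaluated."""
    return footprint(records) != previous_footprint


# Hypothetical harvests from the same silo on two different days.
first_harvest = [{"dc:title": ["Example title"]}, {"dc:title": ["Another title"]}]
same_but_reordered = list(reversed(first_harvest))
second_harvest = [{"dc:title": ["Example title (revised)"]},
                  {"dc:title": ["Another title"]}]

last_seen = footprint(first_harvest)
print(needs_review(last_seen, same_but_reordered))
print(needs_review(last_seen, second_harvest))
```

A scheduled job could store the last footprint per silo and queue normalization or human review only when it differs.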

Why bother? We have to figure this out. There are more different kinds of MD every day, and it doesn’t age well. There is no one dominant service provider (as with MARC & OCLC) to do this for us. There could be a decentralized growth of services.

Watch this space:
http://managemetadata.org

Q. How do you synchronize the MD in the silos? A. You harvest it from your silo and shred it into statements. Anything else that is done is a different statement. You don’t overwrite; you just add. Example: you want to normalize some MD from a silo, and each of the statements is sourced. So if your TEI header changes & you reharvest, you have a new statement and a new normalization, and the statements can be reassembled so you can send the original MD out to another provider, and then separately send the normalization, in whatever combination.

Q. So does the statement require a unique ID at the provider end? A. Yes.
Q. What is this for? For bringing together MD you don’t control? Or the MD you manage? A. That line is not easy to draw; you may not have control due to lack of resources.
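The "you don’t overwrite, you just add" answer can be sketched as an append-only store in which harvested and normalized values live side by side and are reassembled on demand. The class, field names, toy normalization rule, and sample values below are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Statement:
    resource: str
    prop: str
    value: str
    service: str   # "harvest" or "normalize" in this sketch
    harvest: int   # which harvest run produced it


class StatementStore:
    """Append-only: nothing is overwritten, new statements are only added."""

    def __init__(self) -> None:
        self._statements: List[Statement] = []

    def add(self, statement: Statement) -> None:
        self._statements.append(statement)

    def __len__(self) -> int:
        return len(self._statements)

    def view(self, resource: str, service: str) -> List[Statement]:
        """Reassemble the most recent statements from one service for one resource."""
        matching = [s for s in self._statements
                    if s.resource == resource and s.service == service]
        if not matching:
            return []
        latest = max(s.harvest for s in matching)
        return [s for s in matching if s.harvest == latest]


def normalize_date(value: str) -> str:
    """Toy normalization: keep the first four digits as a year."""
    return "".join(ch for ch in value if ch.isdigit())[:4]


store = StatementStore()
# Two harvest runs; the silo's date string changed between them.
for run, raw in enumerate(["[1753?]", "1753-05"], start=1):
    store.add(Statement("item-1", "dc:date", raw, "harvest", run))
    store.add(Statement("item-1", "dc:date", normalize_date(raw), "normalize", run))

original = store.view("item-1", "harvest")       # what the silo currently says
normalized = store.view("item-1", "normalize")   # what the service derived from it
print(len(store), original[0].value, normalized[0].value)
```

Either view can be sent to another provider on its own, and the earlier statements remain available for comparison.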

>> Suzanne Pilsk
Librarian, Metadata Specialist
Smithsonian Institution Libraries
The importance of getting the right literature to the right people in the right format for them to use. This talk will focus on the taxonomic literature that is constantly referenced by current researchers worldwide.

[Note: I followed this quickly-moving discussion as best I could. Sorry if the notes are a bit scattered; maybe some other folks could jump in and fill the gaps.]

All of our MD is in these silos and how do we manage it? Smithsonian is involved in the Biodiversity Heritage Library Project; Suzanne is the MD person & is trying to think about this in a unique way.

http://bhl.si.edu/BHL_content.cfm?page=About
About the Biodiversity Heritage Library

“Ten major natural history museum libraries, botanical libraries, and research institutions have joined to form the Biodiversity Heritage Library Project. The group is developing a strategy and operational plan to digitize the published literature of biodiversity held in their respective collections. This literature will be available through a global “biodiversity commons.””

Suzanne explained the need of biologists in the field to look online at printed documentation about specimens, and their desire to not have to come into the library.

She needs to push out the “inside” of her printed material to the people in the field. They need to know where they are within a specific text. Her MD is only at title level, though.

She showed an example of a digitized rare book from the Missouri Botanical Garden, which is letting people tag the pages from the scanned book (del.icio.us).

Botanicus example from Linnaei. They’ve been able to pull out some names from this page of a scanned Latin book and include them as text (via OCR). Suzanne needs to provide them other access points into their databases directly, without requiring extra searches. She clicked on a name link that went out to the universal biological indexer, linked to the name there, and then linked back out.

Suzanne requested input from the group about how to capture page and volume data for the project. It is bringing out lots of MD issues we have not dealt with before.

>> Jody Perkins
Metadata Librarian
Miami University Libraries
Project planning and management / work flow issues with an emphasis on what makes metadata creation different than traditional cataloging. Observations regarding the impact of politics on digital library projects.

Very informal comments as food for thought.

Similarities between the traditional cataloging environment & the MD creation environment:

production-oriented
result in tangible product
coordinate with several processes
both employ practices
both are standards-based

Differences:

MD—items are usually unique (no copy cataloging can be done), all cataloging is original cataloging, formats are not standardized.
Traditional cataloging—the opposite.
MD – versions are infinite; grouping & relationships between items are more important.
Items rarely stand alone; specialized cataloging & controlled vocabularies often needed.
Standards are still evolving; can mix & match. For cataloging, it’s a package deal.

Functions & purpose of the cataloging:
MD—items are usually available immediately (they’re digital), so the surrogate function is not necessary. Emphasis is on discovery & access, not description.
MD—creating occurs at the project level & requires planning (not a linear workflow).

Environment:
Cat: in the established hierarchy of the organization
MD – usually in a project structure, with a variety of people. Politics can be a problem here, at the project level.

Reasons that politics are a problem:

Project managers have responsibility without authority;
team members can outrank you;
team members usually have other priorities;
they have divergent backgrounds;
no precedents to fall back on.

She cited the Scott Berkun book on project management:
The art of project management [electronic resource] / Scott Berkun
Sebastopol, CA : O’Reilly, c2005.

If a project lacks leadership, clear roles, clear goals, trust, or a big picture, that can cause problems.

Conflict can bring important issues to light & create common bonds between team members. E.g., different people advocate for different standards. Or disagreements crop up between full-text search “radicals” and traditional catalogers.

She occupies the progressive middle ground.

Philosophical differences among team members (top down vs. bottom up approaches).


Comments:

- MD is OK done in a project environment; for sustainability it needs to be part of normal workflow. A certain % of a staff’s work is in non-MARC formats, as regular work.

But small institutions can usually only do this as a project (grant).

- Q. about workflow and outputs. There are differences in the way we create MD vs. traditional cataloging. Management of APs vs. management of data in individual records.

There was a lively discussion about project vs. production workflow, staffing, etc., and about how cataloging staff see metadata creation (they like it, find it a burden, etc.). At what point in time do you make this a regular workflow? Initially it is all new, but some of that is change management. You have to move it over. But when is it stable enough to move it over?

Internal vs. external projects. Internal is done w/in the libraries; external is done in partnership, where the library hosts the collection but has less control over the MD.

Blogger: Louise Ratliff

This entry was posted in ALA Midwinter 2007.

One Response to ALCTS NRMIG panel — “Issues related to metadata creation and management”

  1. The URL for the JISC Terminology Services and Technology is: http://www.jisc.ac.uk/Terminology_Services_and_Technology_Review_Sep_06

    Highly recommended!
