Metadata Mashup: Creating and Publishing Application Profiles

ALCTS Program at the 2008 Annual Conference

*Slides are or will be available at: http://presentations.ala.org/index.php?title=Saturday%2C_June_28

This One is Just Right: Metadata Design and Implementation at the University of Maryland
Jennifer O’Brien Roper, University of Virginia (Previously Metadata/Electronic Resources Librarian at University of Maryland)

Jennifer O’Brien Roper presented the metadata application profile employed by the University of Maryland’s Digital Repository, UMDM (University of Maryland Descriptive Metadata), which combines Dublin Core, VRA Core, and local elements with a customized DTD for element parsing and validation. She described how they developed this rich, customized metadata standard to support both individual collections and cross-collection searching in the UM Digital Repository.

The UM Digital Repository is built on the Fedora platform. To meet metadata standards requirements while accommodating local needs, they created an application profile that enforces a rigorous minimum standard yet remains flexible enough to provide both standard and collection-specific descriptive information by collection. The profile was based on one originally developed by the University of Virginia, and it specifies required base elements, optional base elements, and the obligation and repeatability of each. Roper went into great detail on what these elements are and how they are presented. She discussed the input standards, including consistent input (key to cross-searchability), existing standards (such as DCMI media types, the ISO language standard, the LC Name Authority File, and the Getty Thesaurus of Geographic Names), local standards (such as local lists and terms created for the culture and style elements), and non-mandatory standards. The application profile documentation can be used by MARC catalogers, XML encoders, subject specialists, and students for metadata creation, maintenance, and data migration. It documents the minimum requirements for base elements, elements, child elements, attributes and their variations, input standards (such as ISO 8601 for date and time formats), local standards, and a combination of terms borrowed from LCSH, AAT, TGM, and MeSH.
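The kind of per-element obligation, repeatability, and input-standard checking Roper described can be sketched as a small validator. This is a hypothetical illustration, not the actual UMDM profile (which is defined by its customized DTD): the element names and rules below are invented, with the date pattern loosely following the ISO 8601 input standard mentioned above.

```python
import re

# Hypothetical slice of an application profile: each element's obligation,
# repeatability, and (optionally) an input-standard pattern. The real UMDM
# profile is defined by its customized DTD; these names are illustrative.
PROFILE = {
    "title":   {"required": True,  "repeatable": False},
    "date":    {"required": True,  "repeatable": True,
                # ISO 8601 calendar date, e.g. 2008-06-28, 2008-06, or 2008
                "pattern": r"^\d{4}(-\d{2}(-\d{2})?)?$"},
    "culture": {"required": False, "repeatable": True},
}

def check_record(record):
    """Return a list of violations for a record (dict of element -> list of values)."""
    errors = []
    for name, rule in PROFILE.items():
        values = record.get(name, [])
        if rule["required"] and not values:
            errors.append(f"missing required element: {name}")
        if not rule["repeatable"] and len(values) > 1:
            errors.append(f"element not repeatable: {name}")
        pattern = rule.get("pattern")
        if pattern:
            for v in values:
                if not re.match(pattern, v):
                    errors.append(f"bad value for {name}: {v!r}")
    return errors
```

A conforming record returns an empty error list; a record with a missing title or a free-text date would be flagged before ingestion.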

Roper concluded that one should not reinvent everything when creating a project-specific application profile, and that it is important to make the documentation self-contained, to record when each decision was made, and to keep the documentation consistent.

Application Profiles at the University of Tennessee
Melanie Feltner-Reichert, Director, Digital Library Initiatives, University of Tennessee-Knoxville

Feltner-Reichert presented the development of metadata application profiles at the University of Tennessee in view of the changing world of metadata standards and the expanding landscape of today’s digital initiatives. She elaborated on how they tailor complex schemas for project-specific usage and collaborate with all project stakeholders.

She quoted at the beginning that “Metadata is expected to follow existing and emerging standards in order to facilitate integrated access to multiple information providers over the web…” “…and it is rare that the requirements of a particular project or site can all be met by any one standard ‘straight from the box’” (Baker, Dekkers, Heery, Patel, Salokhe).

Feltner-Reichert discussed in detail “Volunteer Voices: The Growth of Democracy in Tennessee,” an IMLS-funded, statewide digital library project covering diverse materials from various cultural heritage institutions. Making such a statewide cooperative project work required many decisions, such as branding and accommodating the differing metadata needs of the partners. Like every other application profile, theirs covers the selection of elements, child elements, and attributes, along with their usage and constraints. To implement the profile, they established policies and guidelines, training, metadata creation tools, and pre-ingestion quality control. Feltner-Reichert then introduced the tools they use, such as an administrative database that facilitates metadata quality control at the institutional, collection, and item levels, and the Metadata Workbook, an open web form that generates valid MODS XML and enforces compliance with their application profile.
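Under the hood, a form-to-MODS tool like the Metadata Workbook maps form fields onto MODS elements. A minimal sketch using Python's standard library (illustrative only; the actual Workbook's fields, validation, and output are far richer):

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"
ET.register_namespace("mods", MODS_NS)

def make_mods(title, creator):
    """Build a minimal MODS record from two form fields (illustrative sketch)."""
    mods = ET.Element(f"{{{MODS_NS}}}mods")
    # <titleInfo><title>...</title></titleInfo>
    title_info = ET.SubElement(mods, f"{{{MODS_NS}}}titleInfo")
    ET.SubElement(title_info, f"{{{MODS_NS}}}title").text = title
    # <name><namePart>...</namePart></name>
    name = ET.SubElement(mods, f"{{{MODS_NS}}}name")
    ET.SubElement(name, f"{{{MODS_NS}}}namePart").text = creator
    return ET.tostring(mods, encoding="unicode")
```

A real workflow would validate the serialized record against the MODS schema and the project's application profile before ingestion.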

Application (of METS) Profiles for Documentation
Arwen Hutt, University of California at San Diego

Hutt started by defining what an application profile is in plain terms: “documentation, and documentation.” In detail, an application profile declares how standard schemas are being used, and it documents the purpose of the profile, the schemas used, the specific elements used, controlled vocabularies, constant data, content guidelines, and encoding guidelines.

She further discussed how to document a profile, why UCSD uses METS to document theirs, and the pros and cons of that approach.

Traditional documentation tools include text documents, HTML, and wikis; a newer option is the METS profile schema. A very recent one is the Singapore Framework for DC Application Profiles, presented at the International Conference on Dublin Core and Metadata Applications in Singapore in September 2007.

Hutt reported that they have created five application profiles since 2004, two of which have been registered with the METS Editorial Board: a simple object profile and a complex object profile (the latter can, for example, represent the structure of a photograph object as well as its zoomed portions, an approach METS allows). The draft application profiles are an electronic theses and dissertations profile (a complex profile customized for theses), an Archivists’ Toolkit profile (for data exported from the Archivists’ Toolkit, with more focus on structure), and a file preservation transfer package profile (which deals with data validation and format migration).
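The nesting Hutt described for the complex object profile (a photograph plus its zoomed portions) is the kind of structure a METS structural map encodes; conceptually it is just a labeled tree. A toy sketch with invented labels and filenames:

```python
# Hypothetical complex-object structure of the kind a METS structMap encodes:
# a parent photograph with child divisions for its zoomed portions.
photograph = {
    "label": "Photograph",
    "files": ["full.tif"],
    "children": [
        {"label": "Zoomed portion 1", "files": ["zoom1.tif"], "children": []},
        {"label": "Zoomed portion 2", "files": ["zoom2.tif"], "children": []},
    ],
}

def all_files(node):
    """Collect every file referenced in the structure, depth-first."""
    files = list(node["files"])
    for child in node["children"]:
        files.extend(all_files(child))
    return files
```

In an actual METS document the tree would be expressed as nested `div` elements pointing at entries in the file section.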

The pros of their METS approach are that it employs a standardized structure for encoding information (an XML schema), unique IDs for specific rules, and required sample documents (METS files), and that a community forum is provided for sharing data and profiles. The cons are that it is not very readable (a style sheet is needed for regenerating and editing the data), it is not modular in dealing with simple versus complex objects, and it is not machine actionable (profiles cannot be validated against automatically).

Thinking about Application Profiles:
Providing Interoperability in a World of Silos

Diane Hillmann, Director of Metadata Initiatives, Information Institute of Syracuse

Hillmann discussed application profiles from the higher-level view of the semantic web and the broader perspective of domain application profiles. She first addressed human-readable profiles and profile interoperability. While interoperability is needed within an institution, a profile should also be extensible and reusable (e.g., able to operate at the project, institution, community, and domain levels). She called for a shift from internal services to aggregated access.

She promoted community consensus building and thinking at a macro level. There are two aspects to consider: human aspects (consensus building, documentation of consensus, communication of data intentions) and machine aspects (validation, increasingly specific expectations for content, improved ability to assess and improve metadata). Hillmann pointed out that application profiles are not just a documentation activity, and she asked us to rethink their functional requirements and domain models.

Hillmann urged us to think of the web as a platform and to think of web standards when creating application profiles. She talked about the semantic web community’s work and the Singapore Framework for DC Application Profiles model. According to this framework, a Dublin Core Application Profile is a packet of documentation consisting of functional requirements, a domain model, a Description Set Profile (DSP), usage guidelines (optional), and encoding syntax guidelines (optional). This model shows “how the components of a Dublin Core Application Profile relate to ‘domain standards’ — models and specifications in broader use by communities — and to the W3C standard Resource Description Framework (RDF), the default foundation for machine-processable semantics in our time.” “Description Set Profiles are based on the DCMI Abstract Model (DCAM) inasmuch as they specify how the entities of the DCAM are used in a specific set of metadata. In this sense, the DCAM constitutes a broadly recognized model of the structural components of metadata records. The DCAM, in turn, is grounded in RDF.” “Description Set Profiles typically use properties and classes defined in standard Metadata Vocabularies such as the DCMI Metadata Terms. Metadata Vocabularies, in turn, are expressed on the basis of the RDF Vocabulary Description Language (also known as RDF Schema, or RDFS).” (See http://dublincore.org/documents/singapore-framework/)
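To make the layering concrete, the sketch below treats metadata as RDF-style triples and checks one description against a toy Description Set Profile's occurrence constraints. The `dcterms` property URIs are real DCMI Metadata Terms; the subject, the values, and the constraints are invented, and real DSPs are written to the DCMI DSP specification rather than as Python dicts.

```python
# RDF-style triples: (subject, predicate, object). The predicate URIs are
# real DCMI Metadata Terms; the subject and literal values are made up.
DCTERMS = "http://purl.org/dc/terms/"
triples = [
    ("doc1", DCTERMS + "title",   "Metadata Mashup"),
    ("doc1", DCTERMS + "creator", "Example Author"),
]

# A toy Description Set Profile: (min, max) occurrences per property.
dsp = {
    DCTERMS + "title":   (1, 1),  # exactly one title
    DCTERMS + "creator": (0, 3),  # up to three creators
}

def conforms(subject, triples, dsp):
    """Check one description against the profile's occurrence constraints."""
    for prop, (lo, hi) in dsp.items():
        n = sum(1 for s, p, o in triples if s == subject and p == prop)
        if not (lo <= n <= hi):
            return False
    return True
```

Because the constraints are expressed as data rather than prose, a profile written this way is machine actionable in the sense Hillmann described: records can be validated against it automatically.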

Hillmann gave some examples, such as SWAP (the Scholarly Works Application Profile), which uses a modified FRBR model. The model’s entity and relationship labels are based on FRBR but have been modified to make them more intuitive for eprints, for example “ScholarlyWork” “isFundedBy” “Agent” and “ScholarlyWork” “isExpressedAs” “Expression”.

She addressed the importance of data modeling in making application profiles, while acknowledging that most of us are not modelers (we are librarians, but we should not feel daunted by the XML and RDF machinery). She talked about quality assessment, open tools and templates, conformance measurement, and profile maintenance. She indicated that it is vital to create application profiles in light of semantic web standards and RDF developments.

* Some issues raised and questions answered from the Q&A session after the presentations:
– DC Conference on Metadata, Sept. 22-26, 2008
(solely on metadata, good for social networking)
“It’ll change your life…” (Diane Hillmann) 🙂

– Maintenance of Application Profiles
Eric Childress from OCLC asked whether the application profiles presented were being maintained. The answer from all three presenters was “no”: Jennifer is no longer at UM, and Arwen commented that registered application profiles cannot be revised. Diane replied that APs can be made available in different versions, with conformance declared against a specific version. She commented later, “this is a really important reason to try and move APs ‘up the stack’ to the community or domain level, instead of just at the project level.”

– What to Document?
APs are declarations of the functional requirements for (and of) a domain. We don’t need to document general material; for example, in documenting an education profile, the education-specific components, such as audience and media, are much more important (Diane).
Arwen added that one should document required content, content derived from standards, and usage guidelines.

– The Singapore Framework will be open for comments soon.

– Machine Validation against Application Profiles
Louise Ratliff from UCLA asked whether there is any effort underway in the standards community to implement validation against application profiles, in the same way that OCLC validates our MARC records (not her exact words, but a similar idea). Diane answered that such work is still in its infancy (at a conceptual stage for now).

– Terminology services
Diane said it is not advisable to develop a vocabulary inside an application profile; there are vocabulary registries (such as metadataregistry.org) that can provide terminology services.

If you have anything to add or modify, please reply to this post or write to me at sai.deng@wichita.edu. I will be very glad to make changes or add amendments.

This entry was posted in ALA Annual 2008.
