Call for Proposals on the Topic of Linked Open Data Implementation!

Exposing bibliographic and cultural heritage information as linked open data makes possible new modes of resource access and discovery. It also supports reuse and visualization of collections in novel ways. This program will introduce participants to linked open data in the real world through both introductory and intermediate level presentations on its application, including challenges inherent to moving towards a linked open data ecosystem.

Potential topics include:

  • Implementation of linked data projects
  • Metadata analysis and evaluation
  • Linked open data visualizations
  • Entity reconciliation
  • Ontology alignment/mapping

If you have any questions, feel free to contact Darnelle Melvin (darnelle.melvin@unlv.edu) or Anne Washington (awashington@uh.edu), ALCTS MIG Programming Co-Chairs.

Please fill out the submission form with your proposal abstract by Wednesday, February 21, 2018. “Implementing Linked Open Data in the Real World” will take place during the 2018 ALA Annual Conference in New Orleans.

Posted in ALA Annual 2018, Conferences | Leave a comment

Metadata Interest Group Meeting at ALA Midwinter 2018

Join the ALCTS Metadata Interest Group during the ALA Midwinter Meeting in Denver to hear project updates about two IMLS-funded grants on Sunday, February 11th from 8:30-10:00 AM in Room 703 during our regularly scheduled meeting. Ayla Stein, Metadata Librarian at the University of Illinois at Urbana-Champaign, will present the current work of the Digital Library Federation Assessment Interest Group to create a framework for measuring digital object reuse. Chew Chiat Naun, Head of Metadata Creation at Harvard University, will discuss current work on sharing and scaling local name authorities. The business meeting will follow. We look forward to seeing you all in Denver; be sure to add this event to your ALA Conference Scheduler.

Title: Developing A Framework for Measuring Reuse of Digital Objects: Project Update at the Metadata Interest Group, ALA Midwinter 2018

Presenter: Ayla Stein, Metadata Librarian, University of Illinois at Urbana-Champaign

Abstract: Content reuse, or how often and in what ways digital library materials are utilized and repurposed, is a key indicator of the impact and value of a digital collection. Traditional library analytics focus almost entirely on simple access statistics, which do not show how users utilize or transform unique materials from digital collections. This lack of distinction, combined with a lack of standardized assessment approaches, makes it difficult to develop user-responsive collections or highlight the value of these materials. Developing a Framework for Measuring Reuse of Digital Objects, an IMLS-funded project (LG-73-17-0002-17) by the Digital Library Federation Assessment Interest Group, is working to address this critical area. This presentation will illustrate the variety of ways digital library objects (including metadata) are being reused and share the results of the grant team’s work, including preliminary findings from the initial survey as well as from in-person and virtual focus group sessions. The presentation will conclude with the team’s early findings and an invitation for the audience to contribute feedback on the project and deliverables.

Title: National Strategy for Shareable Local Name Authorities

Presenter: Chew Chiat Naun, Head, Metadata Creation, Harvard University

Abstract: Libraries create local authorities to serve a variety of purposes, usually within an institutional context. It is becoming increasingly evident, however, that identities have much greater potential value if they can be shared. The IMLS Shareable Authorities forum brought together representatives from a wide range of stakeholders to explore themes including minimum viable specifications, data provider obligations, and reconciliation as a service. The objective of the forum is to identify services and practices that will be needed, and assumptions that will have to be made – or changed – to allow authorities to work across domains and at scale.

Presenter Bio: Chew Chiat Naun is a graduate of Monash University in Melbourne, Australia, and recently joined Harvard Library as Head of Metadata Creation. Previously he was at Cornell University, where he and his then colleagues hosted the first IMLS Shareable Local Authorities forum. He is active in the Program for Cooperative Cataloging and co-chairs (with Ed Jones) the Standing Committee on Standards.

Posted in ALA Midwinter 2018, Conferences | Leave a comment

FAST and not so FAST faceting in digital collections

My library recently migrated away from a vendor-based digital asset management system to a homegrown system built with open source components. For additional background, check out this article in D-Lib, which addresses an aspect of the migration. We also recently published an article focusing on the tools and processes we used to migrate our metadata. While we did do some metadata clean-up prior to migration, there’s still a great deal of work to do with metadata remediation and enhancement after the migration. In our new system, one of the features I was particularly eager to try out was the ability to add custom facets for different collections. I developed a workflow for a part-time student to work on enhanced faceting. We’ve been experimenting with adding FAST headings to many of our oral history collections such as Interviews with Jews in Utah and the Carbon County Oral Histories. Right now, the default display shows all the facets in place for a collection, but showcasing just the top facets by record count with an option to expand is part of the future development plan for the system.
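
The planned "top facets by record count" display can be illustrated with a small sketch. This is not the code behind our system; the field name `fast_subject` and the sample records are made up for illustration, and it assumes each record stores its headings as a list.

```python
from collections import Counter

def top_facets(records, field, limit=5):
    """Return the `limit` most common values of `field` with their record counts."""
    counts = Counter()
    for record in records:
        # A record may carry several headings; count each heading once per record.
        for value in set(record.get(field, [])):
            counts[value] += 1
    return counts.most_common(limit)

# Hypothetical sample records; "fast_subject" is an assumed field name.
records = [
    {"fast_subject": ["Jews", "Utah"]},
    {"fast_subject": ["Utah", "Coal mines and mining"]},
    {"fast_subject": ["Utah"]},
]

print(top_facets(records, "fast_subject", limit=1))
```

Everything past the `limit` cutoff would sit behind the "expand" option.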

There are a few collections where going only with FAST headings didn’t make sense, and I thought I would highlight them in this post and ask what other people are doing with custom facets for their digital collections. One of the developers at the library, Alan Witkowski, implemented custom faceting for our Sanborn maps collection, where patrons can browse by year and by location. A librarian, Jessica Colbert, recently completed metadata enhancements in our Football Videos collection, which blends FAST headings for the teams with high-interest facets specific to that collection, such as “Away Games” and “Losses”.

Just having the ability to easily create custom facets for digital collections when we weren’t able to do that before is opening up new possibilities for digital collections at the University of Utah. For those of you who have added FAST headings to your digital collections, have you also run into situations where you wanted to add some additional faceting terms? What were your strategies for doing so? Feel free to share here, in a future guest post or in comments on this blog!

Anna Neatrour is the Digital Initiatives Librarian at the University of Utah J. Willard Marriott Library. You can find her on Twitter as @annaneat.

Posted in General | Leave a comment

Twice the Metadata is Twice the Fun!

Join the ALCTS Metadata Interest Group for exciting programming at both our annual program and annual meeting!

Annual Program

The ALCTS Metadata Interest Group will be sponsoring Metadata Migrations: Managing Methods and Mayhem on Sunday, June 25th from 3-4 pm in Room W185bc. Come hear experiences from the front lines, with presentations from Maggie Dickson, Metadata Architect at Duke University Libraries, and Gretchen Gueguen, Data Services Coordinator at DPLA. We look forward to seeing you all in Chicago. Do not forget to add this event to your ALA Conference Scheduler.

Maggie Dickson
Metadata Architect
Duke University Libraries

Title: Looking Back, Moving Forward: Remediating 20+ Years of Digital Collections Metadata

Abstract: In 2015, DUL began the process of migrating its digital collections to the Duke Digital Repository, a Fedora/Hydra/Blacklight-based platform. In preparation for this migration, we undertook a large-scale analysis and remediation of metadata describing approximately 112,000 items, created over the course of twenty years, by many different people, and using many different schemas and standards (or not). We formed a task group to make decisions, identify and engage stakeholders, and guide the workflow. This involved reviewing existing properties and values and evaluating the adoption of standards and vocabularies, with an eye toward linked open data and sharing our resources with the DPLA and beyond. The remediation itself (which at the time of this proposal is ongoing) is being completed using OpenRefine, scripting, and many good old spreadsheets. This presentation will describe the process, its challenges and successes, and future directions.

Gretchen Gueguen
Data Services Coordinator
Digital Public Library of America

Title: The Never-Ending Migration

Abstract: What if all you did was migrate metadata from one system to another? In a sense, that is what metadata mapping at DPLA is like. The first 2.5 million records were harvested and mapped in 2013 from 500 initial partners. Since then DPLA’s collection has grown to nearly 15 million records from more than 2000 contributing institutions. Since the project relies on metadata harvesting and synchronization, metadata is continually being harvested and mapped. This presentation will explore the tools and techniques that DPLA uses to analyze and map metadata from a variety of standard and bespoke metadata formats into a normalized application profile. Recently DPLA has been developing a new open source tool that can be used by anyone to harvest and map and analyze metadata from common data sources such as OAI feeds. Work on the creation of these tools as well as data quality efforts at DPLA will be reviewed.

Annual Meeting

Join the ALCTS Metadata Interest Group in Chicago for our meeting at ALA Annual 2017 at McCormick Place, Room W102A, Sunday, June 25, 8:30 AM – 10:00 AM. We will have a presentation by the ALCTS/LITA Metadata Standards Committee on evaluating metadata standards, followed by our business meeting and election. Please join us!

Evaluating Metadata Standards – Principles into Practice

Jenn Riley, Lauren Corbett, and Erik Mitchell will present on their work in the Metadata Standards Committee applying the principles (http://metaware.buzz/2016/08/04/principles-for-evaluating-metadata-standards/) to an example standard (the NISO Standards Tag Suite). The principles for evaluation were developed in 2016 to give metadata communities a common tool to explore standards design. The team will discuss the process for identifying standards to evaluate and the approach to reviewing them, as well as the outcomes, lessons learned, and next steps for the metadata principles.

Executive Committee Elections

The ALCTS Metadata Interest Group has the following offices open for election:

  • Vice-Chair/Chair Elect (Vice-Chair 2017-2018, Chair 2018-2019)
  • Program Co-Chair (2017-2019)
  • Secretary (2017-2019)

Terms are two years and begin following ALA Annual 2017. Officers must be able to commit to attending both ALA Midwinter and ALA Annual during their terms.

Elections will be held during the Metadata Interest Group meeting on Sunday, June 25th, 8:30 am to 10:00 am, McCormick Place W179b.

Anyone interested in standing for election to one of these offices is invited to get in touch with Mike Bolam (mrbst20@pitt.edu) and/or Liz Woolcott (liz.woolcott@usu.edu) prior to ALA. Please feel free to contact us if you have any questions or wish to announce your intent to run in advance. Additional nominations will be taken prior to the election at the meeting.

For more information on the roles and responsibilities of the positions, see our announcement on ALA Connect: http://connect.ala.org/node/266318.

Posted in ALA Annual 2017, Conferences | Leave a comment

Reminder: ALCTS Virtual Preconference, June 6-7

Join ALCTS for “Diverse, Inclusive, and Equitable Metadata,”  a virtual program in two sessions:

  • Session 1, Outreach and Inclusivity in Digital Libraries and Institutional Repositories, Tuesday, June 6, 2017, 1:00 p.m. – 2:00 p.m. CT
  • Session 2, Metadata Creation and Remediation in Zine and Digital Library Collections, Wednesday, June 7, 2017, 1:00 p.m. – 2:00 p.m. CT

For more information, or to register, visit http://www.ala.org/alcts/events/ac/2017/vc.

Posted in ALCTS Virtual Preconferences | Leave a comment

Metadata Librarian’s Little Helper: OpenRefine Reconciliation Services

This is the third in our series of follow-up posts by Midwinter Lightning Talk presenters.


When our archive opened to the public two years ago, the catalog of nearly 5,000 records was findable by keyword search and little else. The data was devoid of authorities and controlled vocabularies, and had not been compiled into finding aids. Attempting a migration from our legacy records management software to ArchivesSpace involved a great deal of cleanup, and authority reconciliation proved to be the most challenging part. A reconciliation service for OpenRefine called Reconcile-csv helped with this task.

Since our legacy metadata lacked authority control, the issues were predictable: corporate and personal names were inconsistent, acronym-filled, and sometimes included related terms in parentheses. Also, the records did not link to each other, so the same name could be formed differently in an authorities record, collection record and accession record. Cleaning up the names was going to be a large task since it meant addressing inconsistencies everywhere that a name appeared. Thankfully, OpenRefine’s reconciliation service can automate some of this work: by plugging in a URL, the application will match data in your spreadsheet against a controlled vocabulary on the web, such as the Library of Congress (LC) Authorities.

Our plan was to export batches of 100 names from our catalog’s name authority file, which consisted of 1,700 names. Next, we would reconcile against LC Authorities in OpenRefine. Another great thing about the reconciliation service is that it can pull in other values from the controlled vocabularies, such as URIs. In doing this, our cleaned authority records could incorporate linked data.
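
As a rough illustration of the batching step, here is a minimal Python sketch. We did our exports from the catalog itself, so the `batches` helper and the placeholder names below are only assumptions for illustration.

```python
def batches(names, size=100):
    """Yield consecutive slices of `names`, each at most `size` long."""
    for start in range(0, len(names), size):
        yield names[start:start + size]

# Placeholder data standing in for the 1,700-name authority file.
names = [f"Name {i}" for i in range(1700)]
chunks = list(batches(names))

print(len(chunks), len(chunks[0]), len(chunks[-1]))  # 17 100 100
```

Working in batches of 100 means 17 separate reconciliation runs for a file of this size.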

OpenRefine’s reconciliation service is an amazing feature, but our metadata was in such rough shape that this stage took longer than anticipated. The name reconciliation matched or suggested matches for half of our names, and the remaining names either did not exist in LC Authorities, or they were so messy that no match could be found. Also, the names that were matched had to be evaluated for accuracy to make sure that our Michael Smith is the same person as the matched Michael Smith. On average, the reconciliation service took two minutes to run on 100 records, but the evaluation stage took an hour.
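
For readers curious about what happens behind the OpenRefine interface, the sketch below shows the general shape of a reconciliation request and one way to sort the results into matched, needs-review, and unmatched names. It is only an approximation of the idea: the sample response, its identifiers, and the `triage` helper are hypothetical, and no real endpoint is called.

```python
import json

def build_queries(names, type_id=None, limit=3):
    """Build a `queries` payload in the shape reconciliation services expect."""
    queries = {}
    for i, name in enumerate(names):
        query = {"query": name, "limit": limit}
        if type_id:
            query["type"] = type_id
        queries[f"q{i}"] = query
    return queries

def triage(names, response):
    """Split names into confident matches, suggestions to review, and misses."""
    matched, review, unmatched = {}, {}, []
    for i, name in enumerate(names):
        candidates = response.get(f"q{i}", {}).get("result", [])
        confident = [c for c in candidates if c.get("match")]
        if confident:
            matched[name] = confident[0]["id"]
        elif candidates:
            review[name] = [c["name"] for c in candidates]
        else:
            unmatched.append(name)
    return matched, review, unmatched

names = ["Smith, Michael", "Univ. Dept. of Physics", "Zzyzx Committee"]

# Hypothetical response; the ids are placeholders, not real LC identifiers.
sample_response = {
    "q0": {"result": [{"id": "placeholder-1", "name": "Smith, Michael",
                       "score": 100, "match": True}]},
    "q1": {"result": [{"id": "placeholder-2", "name": "University Department of Physics",
                       "score": 62, "match": False}]},
    "q2": {"result": []},
}

print(json.dumps(build_queries(names[:1]), indent=2))
matched, review, unmatched = triage(names, sample_response)
print(matched, review, unmatched)
```

Even the names in `matched` still need a human to confirm that the person or body is the right one, which is where most of our hour per batch went.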

Our name cleanup also required us to standardize local names according to RDA. Since the bulk of our collection consists of university records, we had many variant spellings of university departments and offices. With an intern’s help, I made these edits in our name database, and added them, with a unique identifier, to a Google sheet that was to become our local name authority.

Now, our name authority records were clean, and ready to be imported into ArchivesSpace. But we were far from finished: all of the names within our collection and accession records were still a mess. Did this mean that we had to repeat the entire reconciliation process for these other record types? It did not, thanks to Reconcile-csv, a free reconciliation service developed by Open Knowledge Labs.

Two documents were by-products of the work that we had just done: (1) a list of names from our catalog matched to LC authorities with their identifiers, and (2) a list of RDA-formed local names. Putting these together essentially gave us a master CSV of authority-controlled names. Now, we could use reconcile-csv, which matches data against a local CSV file. So, instead of matching our remaining messy names against the millions of entries in LC Authorities, then manually cross-referencing names against our local name authority, we simply matched against our master spreadsheet of 800 authoritative (local and LC) names that had been reconciled and evaluated for accuracy. This time, our match rate was higher and more accurate than with the LC Authorities reconciliation. As a result, our evaluation stage took significantly less time: 15 minutes per 100 records instead of an hour.
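
To give a feel for matching against a small local file, here is a minimal Python sketch. It is not how reconcile-csv works internally: it swaps in the standard library's `difflib` for the fuzzy comparison, and the column names (`name`, `identifier`), the sample rows, and the 0.85 threshold are all assumptions made for illustration.

```python
import csv
import io
from difflib import SequenceMatcher

def normalize(name):
    """Lowercase, drop periods and commas, and collapse whitespace."""
    return " ".join(name.lower().replace(".", " ").replace(",", " ").split())

def load_authorities(csv_text):
    """Read (name, identifier) pairs from CSV text with those two columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row["name"], row["identifier"]) for row in reader]

def best_match(messy, authorities, threshold=0.85):
    """Return (name, identifier, score) for the closest authority, or None."""
    scored = [
        (SequenceMatcher(None, normalize(messy), normalize(name)).ratio(), name, ident)
        for name, ident in authorities
    ]
    score, name, ident = max(scored)
    return (name, ident, round(score, 2)) if score >= threshold else None

# Hypothetical master file; the identifiers are placeholders.
master_csv = '''name,identifier
"Smith, Michael",local-0001
Department of Physics,local-0002
'''

authorities = load_authorities(master_csv)
print(best_match("Smith, Micheal", authorities))
print(best_match("Zzyzx Quux", authorities))
```

Because the master list is short and already vetted, a close score is much more likely to be the right entity than it would be against millions of LC entries, which is why evaluation went faster.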

The Open Knowledge Labs website offers simple three-step instructions for downloading and running reconcile-csv from the command line. It behaves like any other OpenRefine reconciliation service: you can specify the column to be matched, view matches and suggested matches, and pull in other values from your master spreadsheet, like URIs. Reconcile-csv is great for metadata that requires a lot of authority reconciliation, since compiling a segment of authoritative terms will make that reconciliation go much faster. For that reason, it’s really helpful for subject term reconciliation as well.

Resources:

Reconcile-csv: http://okfnlabs.org/reconcile-csv/
Reconcile-csv in GitHub: https://github.com/okfn/reconcile-csv
Reconcilable data sources for OpenRefine: https://github.com/OpenRefine/OpenRefine/wiki/Reconcilable-Data-Sources
How to use OpenRefine reconciliation services: https://github.com/OpenRefine/OpenRefine/wiki/Reconciliation
Slides: https://goo.gl/3N0Gw3

Greer Martin, Discovery & Metadata Librarian, Illinois Institute of Technology

Posted in ALA Midwinter 2017, Conferences | Leave a comment

Metadata Interest Group Meeting at ALA Annual 2017

Join the ALCTS Metadata Interest Group in Chicago for our meeting at ALA Annual 2017 at McCormick Place, Room W102A, 8:30 AM – 10:00 AM. We will have a presentation by the ALCTS/LITA Metadata Standards Committee on evaluating metadata standards, followed by our business meeting and election. Please join us!

Evaluating Metadata Standards – Principles into Practice

Jenn Riley, Lauren Corbett, and Erik Mitchell will present on their work in the Metadata Standards Committee applying the principles (http://metaware.buzz/2016/08/04/principles-for-evaluating-metadata-standards/) to an example standard (the NISO Standards Tag Suite). The principles for evaluation were developed in 2016 to give metadata communities a common tool to explore standards design. The team will discuss the process for identifying standards to evaluate and the approach to reviewing them, as well as the outcomes, lessons learned, and next steps for the metadata principles.

Posted in ALA Annual 2017, Conferences | Tagged | 1 Comment

Metadata Migrations: Managing Methods and Mayhem

Are you preparing to migrate out of a legacy system? Do you have questions about metadata remediation, repurposing, or enhancement? Of course you do, and we are here to help. During ALA Annual in Chicago, the ALCTS Metadata Interest Group will be sponsoring Metadata Migrations: Managing Methods and Mayhem on Sunday, June 25th from 3-4 pm in Room W185bc. Come hear experiences from the front lines, with presentations from Maggie Dickson, Metadata Architect at Duke University Libraries, and Gretchen Gueguen, Data Services Coordinator at DPLA. We look forward to seeing you all in Chicago. Do not forget to add this event to your ALA Conference Scheduler.

Title: Looking Back, Moving Forward: Remediating 20+ Years of Digital Collections Metadata

Presenter: Maggie Dickson, Metadata Architect, Duke University Libraries

Abstract: In 2015, DUL began the process of migrating its digital collections to the Duke Digital Repository, a Fedora/Hydra/Blacklight-based platform. In preparation for this migration, we undertook a large-scale analysis and remediation of metadata describing approximately 112,000 items, created over the course of twenty years, by many different people, and using many different schemas and standards (or not). We formed a task group to make decisions, identify and engage stakeholders, and guide the workflow. This involved reviewing existing properties and values and evaluating the adoption of standards and vocabularies, with an eye toward linked open data and sharing our resources with the DPLA and beyond. The remediation itself (which at the time of this proposal is ongoing) is being completed using OpenRefine, scripting, and many good old spreadsheets. This presentation will describe the process, its challenges and successes, and future directions.

Title: The Never-Ending Migration

Presenter: Gretchen Gueguen, Data Services Coordinator, Digital Public Library of America

Abstract: What if all you did was migrate metadata from one system to another? In a sense, that is what metadata mapping at DPLA is like. The first 2.5 million records were harvested and mapped in 2013 from 500 initial partners. Since then DPLA’s collection has grown to nearly 15 million records from more than 2000 contributing institutions. Since the project relies on metadata harvesting and synchronization, metadata is continually being harvested and mapped. This presentation will explore the tools and techniques that DPLA uses to analyze and map metadata from a variety of standard and bespoke metadata formats into a normalized application profile. Recently DPLA has been developing a new open source tool that can be used by anyone to harvest and map and analyze metadata from common data sources such as OAI feeds. Work on the creation of these tools as well as data quality efforts at DPLA will be reviewed.

Posted in ALA Annual 2017, Conferences | 1 Comment

Metadata Interest Group: Call for Nominations 2017

The ALCTS Metadata Interest Group has the following offices open for election:

  • Vice-Chair/Chair Elect (Vice-Chair 2017-2018, Chair 2018-2019)
  • Program Co-Chair (2017-2019)
  • Secretary (2017-2019)

Terms are two years and begin following ALA Annual 2017. Officers must be able to commit to attending both ALA Midwinter and ALA Annual during their terms.

Elections will be held during the Metadata Interest Group meeting on Sunday, June 25th, 8:30 am to 10:00 am, McCormick Place W179b.

Anyone interested in standing for election to one of these offices is invited to get in touch with Mike Bolam (mrbst20@pitt.edu) and/or Liz Woolcott (liz.woolcott@usu.edu) prior to ALA. Please feel free to contact us if you have any questions or wish to announce your intent to run in advance. Additional nominations will be taken prior to the election at the meeting.

For more information on the roles and responsibilities of the positions, see our announcement on ALA Connect: http://connect.ala.org/node/266318.

Posted in ALA Annual 2017, Conferences | 2 Comments

Using MarcEdit to Retool Existing MARC Records of Paper Maps for Use in an Online Geoportal

This is the second in our series of follow-up posts by Midwinter Lightning Talk presenters.


The Michigan State University Libraries recently joined the Big Ten Academic Alliance Geoportal, a consortial online discovery tool for maps and geographic data. While the principal focus of the geoportal’s map-based interface is access to geospatial data for use in GIS applications, the geoportal also accommodates map-based discovery of digital scans of paper maps. Contributing our scanned paper maps to the geoportal requires submission of records suitable for the generation of ISO 19115-compliant metadata. To accomplish this, we devised a MarcEdit workflow using our existing MARC records for paper maps to create new MARC records for digital maps — which could then be delivered as MARCXML records to the geoportal staff, who used them to generate the ISO 19115 metadata for display in the geoportal. An additional benefit of the workflow was the creation of new MARC records for the digital scans, for use in our own library catalog.

We opted to start with MARC records for paper maps that have already been cataloged and scanned. The first step in our workflow was deciding which MARC fields could be programmatically edited using the paper-based record as a starting point, and which fields would require human review with manual entry.

Examples of programmatic changes included:

  • changing the 300$a field to “1 online resource”
  • changing some coding in the fixed fields
  • changing the 338 field’s carrier type to “online resource”
  • adding 655_7 “Digital Maps.”
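
The programmatic changes listed above were made in MarcEdit, but the logic can be sketched in a few lines of Python for readers who prefer to see it spelled out. This is an illustration only: the record is modeled as a plain list of dictionaries rather than with a MARC library, the sample field values are invented, and the `$2 lcgft` source code on the 655 is an assumption (second indicator 7 requires a `$2`). The fixed-field changes are left out because they depend on the individual record.

```python
import copy

def retool_for_scan(paper_record):
    """Return a new record for the digital scan, derived from the paper-map record."""
    record = copy.deepcopy(paper_record)  # leave the paper record untouched
    for field in record:
        if field["tag"] == "300":
            # Swap the extent statement; other subfields are kept for manual review.
            field["subfields"] = [
                ("a", "1 online resource") if code == "a" else (code, value)
                for code, value in field["subfields"]
            ]
        elif field["tag"] == "338":
            # RDA carrier type for online resources.
            field["subfields"] = [("a", "online resource"), ("b", "cr"), ("2", "rdacarrier")]
    # Add the genre/form term; the $2 value is an assumption, not from our spreadsheet.
    record.append({"tag": "655", "ind": " 7",
                   "subfields": [("a", "Digital Maps."), ("2", "lcgft")]})
    return record

# Invented sample fields from a paper-map record.
paper = [
    {"tag": "300", "ind": "  ", "subfields": [("a", "1 map"), ("c", "45 x 60 cm")]},
    {"tag": "338", "ind": "  ", "subfields": [("a", "sheet"), ("b", "nb"), ("2", "rdacarrier")]},
]

for field in retool_for_scan(paper):
    print(field["tag"], field["ind"], field["subfields"])
```

Copying the record first mirrors our workflow, in which the paper-based record stays in the catalog and the scan gets its own new record.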

Examples of manual edits applied after new records were generated in MarcEdit included:

  • conversion to RDA standards, including spelling out abbreviations and removing brackets in titles
  • removal of FAST headings so as to trigger OCLC’s process for automated re-analysis and re-application of FAST headings
  • miscellaneous punctuation and formatting issues.

Some fields, such as a 776 linking back to the original paper-based record, could be created programmatically for the new scan-based record, but required human review afterward. Our complete spreadsheet of changes can be viewed here.

As a result of this project, MSU Libraries now has 44 maps represented in the geoportal. An example geoportal record may be viewed here, and its corresponding record in the MSU Libraries catalog may be viewed here. We are happy with our initial results, although in hindsight we would have adhered to the PCC Provider Neutral Guidelines, and we have modified our procedure to do so in the future. The MSU Map Library staff are also pleased with our results, and we are excited to apply our workflow to additional records.

Tim Kiser, Special Materials Catalog Librarian
Nicole Smeltekop, Special Materials Catalog Librarian

Posted in ALA Midwinter 2017, Conferences | Leave a comment