Ethical AI and Librarianship

A Resource Guide

Code and documentation for Humans in the Loop (HITL)

Field Description
Title Code and documentation for Humans in the Loop (HITL)
Type Guidelines & Policies
Creator Library of Congress Labs, AVP (metadata solution provider)
Link https://github.com/LibraryOfCongress/hitl/tree/main
Creation Date 09/2020
Last Updated Date 06/2021
Summary This GitHub repository contains data, code, documentation, and design artifacts for a machine learning (ML) framework in the library context. The framework is one of the outcomes of the Humans in the Loop (HITL) initiative, run by Library of Congress (LC) Labs in collaboration with AVP (a metadata solution provider) from September 2020 to June 2021. It demonstrates how to combine crowdsourcing and ML to support metadata enrichment in engaging, ethical, and useful ways. Cultural heritage institutions face a growing need to make digitized collections more searchable and accessible, especially materials not suited to OCR, through tasks such as audio/text transcription and image classification. By enlisting volunteers for annotation tasks through crowdsourcing, these institutions can generate accurate structured data to train ML models at scale. The framework addresses key challenges such as bias in ML algorithms, keeping machine-generated structured data trustworthy, and communicating data provenance and potential inaccuracies to library users who trust library collections. Using the LC’s collection of U.S. Telephone Directory Yellow Pages as a case study, the framework presents two human-in-the-loop workflows: one in which humans create training data, and another in which humans correct ML outputs (see the “scribe-hitl” folder under “humans-in-the-loop-files”). The framework also incorporates user testing to understand volunteer attitudes toward participating in crowdsourcing tasks, as well as the design of an interface for presenting the data output of the ML processes. It offers a practical model for institutions seeking to apply ML ethically in library and cultural heritage settings.
Topic AI and Librarianship. Crowdsourcing. Digital collection. Metadata.
Source and Link Humans in the Loop project. https://labs.loc.gov/work/experiments/humans-loop/
Access Open
Accessibility Open
Audience Librarians – general. Information professionals. Scholars and Students.
Platform or Format Code (.py), document (.pdf), media (.png, .mov)
Length --
Geography USA
Language ENG
Description Date 05/30/2025
