Ethical AI and Librarianship

A Resource Guide

Code and documentation for Humans in the Loop (HITL)

Field	Description
Title	Code and documentation for Humans in the Loop (HITL)
Type	Guidelines & Policies
Creator	Library of Congress Labs, AVP (metadata solution provider)
Link	https://github.com/LibraryOfCongress/hitl/tree/main
Creation Date	09/2020
Last Updated Date	06/2021
Summary	This GitHub repository contains data, code, documentation, and design artifacts for a machine learning framework in the library context. The framework was one of the outcomes of the Humans in the Loop (HITL) initiative, which LC (Library of Congress) Labs ran in collaboration with AVP (a metadata solution provider) from September 2020 to June 2021. It demonstrates how to combine crowdsourcing and machine learning (ML) to support metadata enrichment in engaging, ethical, and useful ways. Cultural heritage institutions face growing needs to make digitized collections more searchable and accessible, especially materials not suited for OCR, through tasks such as audio/text transcription and image classification. By enlisting volunteers to participate in annotation tasks through crowdsourcing endeavors, cultural heritage institutions can generate accurate structured data to train ML models at scale. The framework addresses key challenges such as bias in ML algorithms, maintaining trustworthy machine-generated structured data, and communicating data provenance and potential inaccuracies to library users who trust library collections. Using the LC’s collection of U.S. Telephone Directory Yellow Pages as a case study, the framework presents two human-in-the-loop workflows: one where humans create training data, and another where humans correct ML outputs (see the “scribe-hitl” folder under “humans-in-the-loop-files”). The framework also incorporates user testing to help understand volunteer attitudes towards participating in crowdsourcing tasks, as well as the design of an interface for presenting the data output of the ML processes. This framework offers a practical model for institutions seeking to apply ML ethically in library and cultural heritage settings.
Topic	AI and Librarianship. Crowdsourcing. Digital collection. Metadata.
Source and Link	Humans in the Loop project. https://labs.loc.gov/work/experiments/humans-loop/
Access	Open
Accessibility	Open
Audience	Librarians – general. Information professionals. Scholars and Students.
Platform or Format	Code (.py), document (.pdf), media (.png, .mov)
Length	--
Geography	USA
Language	ENG
Description Date	05/30/2025
