Ethical AI and Librarianship
A Resource Guide
Code and documentation for Humans in the Loop (HITL)
Field | Description |
---|---|
Title | Code and documentation for Humans in the Loop (HITL) |
Type | Guidelines & Policies |
Creator | Library of Congress Labs, AVP (metadata solution provider) |
Link | https://github.com/LibraryOfCongress/hitl/tree/main |
Creation Date | 09/2020 |
Last Updated Date | 06/2021 |
Summary | This GitHub repository contains data, code, documentation, and design artifacts for a machine learning framework in the library context. The framework was one of the outcomes of the Humans in the Loop (HITL) initiative, run by LC (Library of Congress) Labs in collaboration with AVP (a metadata solution provider) from September 2020 to June 2021. It demonstrates how to combine crowdsourcing and machine learning (ML) to support metadata enrichment in engaging, ethical, and useful ways. Cultural heritage institutions face growing needs to make digitized collections more searchable and accessible, especially materials not well suited to OCR, through tasks such as audio/text transcription and image classification. By enlisting volunteers to participate in annotation tasks through crowdsourcing endeavors, cultural heritage institutions can generate accurate structured data to train ML models at scale. The framework addresses key challenges such as bias in ML algorithms, maintaining trustworthy machine-generated structured data, and communicating data provenance and potential inaccuracies to library users who trust library collections. Using the LC’s collection of U.S. Telephone Directory Yellow Pages as a case study, the framework presents two human-in-the-loop workflows: one where humans create training data, and another where humans correct ML outputs (see the “scribe-hitl” folder under “humans-in-the-loop-files”). The framework also incorporates user testing to help understand volunteer attitudes towards participating in crowdsourcing tasks, as well as the design of an interface for presenting the data output of the ML processes. This framework offers a practical model for institutions seeking to apply ML ethically in library and cultural heritage settings. |
Topic | AI and Librarianship. Crowdsourcing. Digital collection. Metadata. |
Source and Link | Humans in the Loop project. https://labs.loc.gov/work/experiments/humans-loop/ |
Access | Open |
Accessibility | Open |
Audience | Librarians – general. Information professionals. Scholars and Students. |
Platform or Format | Code (.py), document (.pdf), media (.png, .mov) |
Length | -- |
Geography | USA |
Language | ENG |
Description Date | 05/30/2025 |