Ethical AI and Librarianship
A Resource Guide
Speech to Text
Field | Description | |
---|---|---|
Title | Speech to Text | |
Type | Videos | |
Creator | | |
Link | | |
Creation Date | 09/26/2024 | |
Last Updated Date | -- | |
Summary | This video records a webinar in the 2024 “AI in Libraries” series hosted by CENL (the Conference of European National Librarians). It contains two presentations. The first, by Lars Mydtskov, Lasse Rogers, and Ditte Laursen of the Royal Danish Library, showcases a project that uses advanced Automatic Speech Recognition (ASR) systems to enable text-based search within the Library’s extensive Danish-language radio and television archives. The project demonstrates how transcribing audiovisual (AV) materials can improve their accessibility and searchability. The presentation also addresses two key legal and ethical challenges: 1) under Danish law, the creator of the original audio retains copyright, including rights over reproductions such as transcriptions, so uses beyond research require licensing; 2) transcription may involve processing personally identifiable data of speakers, and such processing must be justifiable and adhere to data minimization principles. The second presentation, by Per Egil Kummervold, Senior Researcher at the National Library of Norway, introduces the Norwegian Speech Transformer Model Project (NOSTRAM), which fine-tunes OpenAI’s Whisper model on a large Norwegian speech corpus to significantly improve transcription accuracy for Norwegian. The training data consisted primarily of legally deposited materials, including NRK (the national broadcaster) subtitles and audiobooks. During the Q&A, Kummervold clarified that although the team is not allowed to redistribute the copyrighted materials (e.g., audiobooks), Norwegian law, specifically the Legal Deposit Act, permits their use for training language models for research purposes. The video is freely and openly viewable in a web browser. The presentation slides are available for download in PDF format. A link to the full “AI in Libraries” webinar series is also provided (see Link field). | |
Topic | Ethical AI. Copyright. Libraries. AI and librarianship. | |
Source and Link | CENL 2024 webinar series: https://www.cenl.org/network-group-ai-in-libraries-webinars-2024/ | |
Access | Open. | |
Accessibility | Captions not available. | |
Audience | Librarians – General. Librarians – Academic. | |
Platform or Format | Web. PDF. | |
Length | 00:48:23 | |
Geography | DNK/NOR/Europe | |
Language | ENG | |
Description Date | 06/25/2025 |