A3/D4. Difficulty locating or accessing information related to visual items – Digital Library Accessibility and Usability Guidelines (DLAUG)

Situation Definition:

A situation that arises from difficulty finding and gaining access to alternative text, transcripts, or description for visual items in a DL.

Factor(s) Leading to the Situation:

- Complex information presentation: illogical structure
- Complex information presentation: multiple layers (or steps) to relevant DL items
- Inappropriate labeling: applications
- Lack of features/functions/items/information: a shortcut to the desired content
- Lack of features/functions/items/information: transcripts for visual items

Guideline or Design Recommendation:

1. Provide concise and meaningful alternative text for all visual items
2. Provide transcripts for images of text documents
3. Provide audio descriptions for video materials
4. Provide descriptive metadata for each image or video item
5. Ensure alternative text, transcripts and descriptions for visual items can be easily located
6. Ensure alternative text, transcripts and descriptions for visual items can be easily accessed
7. Provide help information regarding alternative text, transcript, or description access

Rationale and Objective:

Providing a text transcript for an image or a video item helps BVI (blind and visually impaired) users search for a keyword or scan the overall content ⁽¹⁾. Because DLs hold heterogeneous content, including illustrated books, photos, maps, and manuscripts, the transcript needs to be provided in an accessible document format. Preferably, the transcript should be in a web markup language such as HTML or XML, which is considered the format most easily accessed by screen readers. For downloadable formats, Microsoft Word might be more appropriate than PDF, because PDF files typically require up-to-date assistive technology and a plug-in. A caption and metadata should also be provided to offer additional information along with the transcript ⁽²⁾. The transcript also needs a clear label: the “text” metadata value refers to the text type, but BVI users may mistake this term to mean “transcript.” Adding invisible headings would make navigation easier for BVI users ⁽³⁾.

Techniques and Methods:

1.1. Provide a supplementary description for a visual item next to it with a clear label
2.1. Provide an option to download a transcript of a visual item
2.2. Provide partial transcripts if full transcripts are unavailable, and clearly indicate that a partial transcript is available for a visual item
3.1. Offer audio description for video scenes that fall between dialogue
4.1. Provide a clear and concise description of visual items in metadata
5.1. Provide an option for transcript display next to a visual item
5.2. Add “Transcript” as an invisible part of the title for a heading
5.1/6.1. Add an invisible “Skip to transcript” link
7.1. Provide a context-sensitive description for information embedded in a visual item. For example, if an image portrays protestors holding a sign for their protest, BVI audiences should be provided with contextual information on the broader meaning of the sign in addition to the text or the visual image.

Recommended Features:

1.1.A. Alt text (See example 1.1.A.a and 1.1.A.b)
1.1.B. Meaningful labels (See example 1.1.B.a and 1.1.B.b)
2.1. Downloadable transcript feature (See example 2.1)
2.2. Transcript display feature (See example 2.2.a and 2.2.b)
3.1. Invisible audio description controls (See example 3.1.a and 3.1.b)
4.1. Description element in metadata (See example 4.1)
5.1/6.1. Skip-to-transcript links (See example 5.1/6.1)
5.2. Invisible headings (See example 5.2.a and 5.2.b)
7.1. Context-sensitive description (See example 7.1)

Examples:

1.1.A.a. Alt text: How-to example

Choose alternative text by considering the context of an image and any text description or summary displayed with it. For example, if an image is presented without any accompanying text (e.g., a title or summary), appropriate alt text should include a title and brief information about the image. If an image has a title but no summary, the alt text should provide a summary without repeating the title. If an image is a link, the alt text should state where the link leads.
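
The decision rule above can be sketched in Python. This is only an illustration, not part of the guideline: the function name `choose_alt_text` and its parameters are hypothetical, and the two fallback cases at the end are assumptions because the guideline text does not cover them.

```python
from typing import Optional


def choose_alt_text(
    title: str,
    summary: str,
    *,
    title_shown: bool,
    summary_shown: bool,
    link_destination: Optional[str] = None,
) -> str:
    """Pick alt text based on what already accompanies the image on the page."""
    if link_destination:
        # Linked image: tell the user where the link leads.
        return f"Link to {link_destination}"
    if not title_shown and not summary_shown:
        # No accompanying text: give the title plus brief information.
        return f"{title}: {summary}"
    if title_shown and not summary_shown:
        # Title is already on the page: give the summary, do not repeat the title.
        return summary
    if summary_shown and not title_shown:
        # Assumption (not stated in the guideline): supply the missing title.
        return title
    # Assumption (not stated in the guideline): both are shown, keep it brief.
    return title
```

For instance, an untitled photo with no summary on the page would get alt text combining both pieces, while the same photo shown under a visible title would get only the summary.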

1.1.A.b. Alt text: How-to example

For links to media, use action-oriented text such as “access interview,” “view movie,” or “listen to audio recording.”

In the media example, the links opened the media in a new browser tab, but this behavior was not indicated on the link. Adding aria-label="view movie opens in a new window" would have addressed the issue.
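
As a rough sketch of how such a link could be generated, the Python helper below builds an anchor element whose accessible name announces the new-window behavior. The function name `media_link`, the example URL, and the use of `target="_blank"` with `rel="noopener"` are assumptions for illustration; only the `aria-label` wording comes from the example above.

```python
from html import escape


def media_link(href: str, action: str, opens_new_window: bool = False) -> str:
    """Return an <a> element whose link text states the action."""
    attrs = [f'href="{escape(href, quote=True)}"']
    if opens_new_window:
        # Announce the new-window behavior in the accessible name.
        label = f"{action} opens in a new window"
        attrs.append('target="_blank"')
        attrs.append('rel="noopener"')
        attrs.append(f'aria-label="{escape(label, quote=True)}"')
    return f"<a {' '.join(attrs)}>{escape(action)}</a>"


# Hypothetical media URL, used only for demonstration.
print(media_link("/media/42", "view movie", opens_new_window=True))
```

Because `aria-label` replaces the link text as the accessible name, the label repeats the visible action ("view movie") before adding the behavior note.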

1.1.B.a. A supplementary description for a visual item: Good design

Replace the first image with the second image, which has the “image” tab selected, so that screen readers can access both versions of the content.

1.1.B.b. A supplementary description for a visual item: Bad design

No alt text for an image, and the file name is not a meaningful label.

Corresponds to this markup:

2.1. Downloadable transcript feature: Good design

Make the transcript downloadable in .txt format (e.g., via a “Download transcript here” link), similar to the HathiTrust digital library, where metadata is downloadable in text format (.txt) (HathiTrust, 2018).
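
A minimal sketch of this feature in Python follows. It assumes the transcript is already available as a string; the function name `publish_transcript`, the file naming scheme, and the output directory are hypothetical and do not describe how HathiTrust implements its downloads.

```python
from html import escape
from pathlib import Path


def publish_transcript(item_id: str, transcript: str, out_dir: Path) -> str:
    """Write the transcript as a UTF-8 .txt file and return a download link."""
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{item_id}-transcript.txt"
    path.write_text(transcript, encoding="utf-8")
    # The download attribute asks the browser to save the file
    # instead of navigating to it.
    href = escape(path.name, quote=True)
    return f'<a href="{href}" download>Download transcript here</a>'
```

Plain text is used here because it needs no plug-in and is readable by any assistive technology; the same approach would work for an HTML transcript.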

2.2.a. Provide partial transcripts: Good design

A transcript feature is enabled and provides a text-based representation of the visual content.

2.2.b. Provide partial transcripts: Bad design

No semantic markup around the transcript data: the transcript is set as p elements, with nothing in the markup to indicate that it is the transcript text.
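
One way to avoid this problem is to wrap the transcript paragraphs in a labeled region. The Python sketch below generates such markup; the function name `transcript_section`, the id scheme, and the choice of a `section` element with an `h2` heading are assumptions, not markup taken from the evaluated DL.

```python
from html import escape
from typing import Iterable


def transcript_section(item_id: str, paragraphs: Iterable[str],
                       partial: bool = False) -> str:
    """Wrap transcript paragraphs in a section labeled by its own heading."""
    safe_id = escape(item_id, quote=True)
    # State clearly when only a partial transcript is available.
    heading = "Partial transcript" if partial else "Transcript"
    heading_id = f"transcript-heading-{safe_id}"
    body = "\n".join(f"  <p>{escape(p)}</p>" for p in paragraphs)
    return (
        f'<section id="transcript-{safe_id}" aria-labelledby="{heading_id}">\n'
        f'  <h2 id="{heading_id}">{heading}</h2>\n'
        f"{body}\n"
        "</section>"
    )
```

The heading gives screen reader users a target for heading navigation, and `aria-labelledby` ties the region's name to that heading.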

3.1.a. Provide audio descriptions: Good design

Audio description (Art Beyond Sight, n.d.) with transcription.

3.1.b. Provide audio descriptions: Bad design

No audio description or no closed captions (CC) on a video file.
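
A simple automated check can flag some of these cases. The Python sketch below scans HTML for video elements and reports which of the `captions` and `descriptions` track kinds are missing. It is an illustration under stated assumptions: the class and function names are hypothetical, and the check only sees `<track>` markup, so it would not detect audio description delivered another way (for example, as a separate described version of the video).

```python
from html.parser import HTMLParser
from typing import List, Set


class VideoTrackAudit(HTMLParser):
    """Collect, for each <video>, the kinds of <track> elements inside it."""

    def __init__(self) -> None:
        super().__init__()
        self.videos: List[Set[str]] = []
        self._in_video = False

    def handle_starttag(self, tag, attrs):
        if tag == "video":
            self.videos.append(set())
            self._in_video = True
        elif tag == "track" and self._in_video:
            # In HTML, a track without a kind attribute defaults to subtitles.
            kind = dict(attrs).get("kind") or "subtitles"
            self.videos[-1].add(kind.lower())

    def handle_endtag(self, tag):
        if tag == "video":
            self._in_video = False


def missing_tracks(html: str) -> List[List[str]]:
    """Return, per video, which of the required track kinds are absent."""
    audit = VideoTrackAudit()
    audit.feed(html)
    required = ("captions", "descriptions")
    return [[k for k in required if k not in kinds] for kinds in audit.videos]
```

An empty list for a video means both track kinds were found; the result is best treated as a prompt for manual review rather than a pass or fail verdict.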

4.1. Provide a clear and concise description of visual items in metadata: How-to example

Use clear metadata labels to avoid confusion between transcript and metadata. For example: Object description -> Folder level metadata; Description -> Item level metadata.
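
Reading the arrows above as renames applied at display time, the relabeling could be sketched in Python as follows. The mapping reproduces the two labels from the example; the function name `relabel_metadata` and the sample record are hypothetical.

```python
from typing import Dict

# Display labels taken from the example above.
DISPLAY_LABELS: Dict[str, str] = {
    "Object description": "Folder level metadata",
    "Description": "Item level metadata",
}


def relabel_metadata(record: Dict[str, str]) -> Dict[str, str]:
    """Swap ambiguous field names for clearer display labels."""
    return {DISPLAY_LABELS.get(field, field): value
            for field, value in record.items()}


# Hypothetical record, for demonstration only.
print(relabel_metadata({"Title": "Letter, 1862", "Description": "Handwritten letter"}))
```

Fields without an entry in the mapping keep their original label, so the change stays limited to the labels known to cause confusion.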

5.1/6.1. Add an invisible “Skip to transcript” link: Good design

Invisible link: a “Skip to Transcript” link that leads to the “Text” tab when a transcript is available.
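
The conditional behavior described above might look like the following Python sketch. The function name `skip_to_transcript_link`, the `visually-hidden` class name, and the anchor id are assumptions; the class is expected to be defined in the site's stylesheet so the link is hidden visually but still exposed to screen readers.

```python
from html import escape
from typing import Optional


def skip_to_transcript_link(transcript_anchor: Optional[str]) -> str:
    """Return a screen-reader-only skip link, or nothing without a transcript."""
    if not transcript_anchor:
        # No transcript available: do not emit a link that leads nowhere.
        return ""
    target = escape(transcript_anchor, quote=True)
    return f'<a class="visually-hidden" href="#{target}">Skip to Transcript</a>'
```

Placing the returned link near the top of the item page lets screen reader users reach the transcript without stepping through the viewer controls first.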

5.2.a. Provide invisible heading: Good design

Invisible headings: a heading not visible to sighted users but readable by screen readers
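
A common way to build such a heading is a CSS class that clips the element to a single pixel instead of removing it from the page. The Python sketch below holds one widely used version of that CSS and a helper that emits the heading; the class name `visually-hidden` and the function name `invisible_heading` are assumptions, not names from the evaluated DL.

```python
from html import escape

# Clips the element visually while keeping it in the accessibility tree.
# display: none or visibility: hidden would hide it from screen readers too.
VISUALLY_HIDDEN_CSS = """
.visually-hidden {
  position: absolute;
  width: 1px;
  height: 1px;
  overflow: hidden;
  clip: rect(0 0 0 0);
  clip-path: inset(50%);
  white-space: nowrap;
}
""".strip()


def invisible_heading(text: str, level: int = 2) -> str:
    """Return a heading that screen readers announce but sighted users do not see."""
    if not 1 <= level <= 6:
        raise ValueError("heading level must be between 1 and 6")
    return f'<h{level} class="visually-hidden">{escape(text)}</h{level}>'


print(invisible_heading("Transcript"))
```

The heading level should match the surrounding page outline so that heading navigation in a screen reader stays consistent.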

5.2.b. Provide invisible heading: Bad design

No option to skip to the transcript, and no invisible heading.

7.1. Provide a context-sensitive description for information embedded in a visual item: Good design

Context-sensitive description (providing clear descriptions based on image or scanned document’s content)

Related Resources:

1. W3C. (2018). Understanding Success Criterion 2.4.6: Headings and Labels. Retrieved from https://www.w3.org/WAI/WCAG21/Understanding/headings-and-labels.html
2. Bitstreams. (2018). Interactive Transcripts have arrived. Retrieved from https://blogs.library.duke.edu/bitstreams/2018/02/16/interactive-transcripts-arrived/
3. Xie, I., Babu, R., Joo, S., & Fuller, P. (2015). Using digital libraries non-visually: Understanding the help-seeking situations of blind users. Information Research: An International Electronic Journal, 20(2), paper 673. Retrieved from http://InformationR.net/ir/20-2/paper673.html
4. Griffin, E. (2015). Tips for making web video & audio accessible. Retrieved from http://www.3playmedia.com/2015/07/13/tips-for-making-web-video-audio-accessible/
5. Graves, A., Mohamed, A. R., & Hinton, G. (2013). Speech recognition with deep recurrent neural networks. In Acoustics, speech and signal processing (ICASSP), 2013 IEEE International Conference (pp. 6645-6649). IEEE.
6. HathiTrust. (2018). Retrieved from https://babel.hathitrust.org/cgi/mb?a=listis;c=464226859;sort=title_a;pn=1;lmt=ft
7. Ingle, R. R., Fujii, Y., Deselaers, T., Baccash, J., & Popat, A. C. (2019). A Scalable Handwritten Text Recognition System. arXiv preprint arXiv:1904.09150.
8. Muehlberger, G., Seaward, L., Terras, M., Oliveira, S. A., Bosch, V., Bryan, M., & Gatos, B. (2019). Transforming scholarship in the archives through handwritten text recognition. Journal of Documentation.
9. Penn State. Caption Guidelines and policy. Retrieved from https://accessibility.psu.edu/video/captions/
10. Transcripts. (2019). Retrieved from https://www.w3.org/WAI/media/av/transcripts/#where-to-put-transcripts
11. Video Captions. (2019). Retrieved from https://www.w3.org/WAI/perspective-videos/captions/
12. WebAIM. (2019). Alternative Text. Retrieved from https://webaim.org/techniques/alttext/
13. Zhong, Y., Raman, T. V., Burkhardt, C., Biadsy, F., & Bigham, J. P. (2014, April). JustSpeak: enabling universal voice control on Android. In Proceedings of the 11th Web for All Conference (p. 36). ACM.