Captions and Audio Descriptions


Captions are on-screen text representing the dialogue and sound effects in a video. They are synchronized with the video and provide an accessible alternative for individuals who cannot hear the content. Captions include all spoken information as well as the relevant parts of the soundtrack, such as background noises and sound effects, along with speaker identification and any other audio cues that help the viewer understand the video.

Transcripts, unlike captions, are a text alternative to audio files, such as a podcast or pre-recorded radio show, and are not synchronized with the presentation. Transcripts should include speaker information or any other informational cues appropriate to understanding the recording. While transcripts may be provided for pre-recorded videos, they must be provided for audio-only content.


Videos to caption

  • Required instructional videos needed to pass the course
  • Optional instructional videos supporting student success

If you have a question about what types of videos to caption, please reach out to the Online Learning Pathways Instructional Design Supervisor, Liesl Boswell, to discuss your situation.


Captioning and live events

Zoom meetings and webinars

Zoom supports captions from a professional captioner as well as auto-generated live captions from automatic speech recognition (ASR) engines. A professional captioner is more accurate than auto-generated captions, but the service must be scheduled in advance.

For live events that are not public-facing or in which there is no request for captioning services, you can enable the live transcription feature in Zoom:
  1. Go to Zoom Settings and open In Meeting (Advanced)
  2. Move to the Closed Captioning option
  3. Check the option “Enable live transcription service to show transcript on the side panel in-meeting.”
  4. Participants will then be able to choose to display auto-generated live captions during the Zoom meeting or event.

Before hosting a meeting or live event, verify whether real-time captions from a professional captioner will be needed. Auto-generated live captions can include critical errors that change the information or context of what is communicated. For assistance with real-time captioning as an accommodation, contact SDCCD Online Learning Pathways or your campus DSPS office.

Caption Quality

The Captioning Key from the Described and Captioned Media Program provides specific guidance for producing quality captions for video presentations.

While auto-generated captions have made significant progress, they are still not as accurate as those produced by a professional captioner and may include critical errors that impact the information or context of what is communicated. Such errors may include incorrect text, a lack of correct punctuation and grammar, and missing speaker identification. Auto-generated captions may be used as a starting point from which to edit and create a more accurate captioned video.


Audio Descriptions

Audio descriptions provide a verbal depiction of the key visual elements in a video presentation. For individuals who are blind, visually impaired, or unable to view the video directly, audio descriptions communicate the important information relevant to understanding the video content. For example, a video may display a speaker’s name and title or specific instructions to follow. If this information is not included as part of the spoken dialogue, then it needs to be communicated as part of a separate audio description.

The Description Key from the Described and Captioned Media Program provides guidance for how to produce audio descriptions, including what to describe and how to describe on-screen information.

Recommended approach for existing videos

Please reach out to the Online Learning Pathways Instructional Design Supervisor, Liesl Boswell.

For new videos

One solution is to write your script so that any relevant on-screen information or cues are described in the spoken dialogue of the video, thus reducing the need for a separate audio description.