Accessibility FAST

AI Generated Image Descriptions

Generative AI chatbots have rapidly improved their ability to recognize and describe images. CoPilot, ChatGPT, and Gemini have proven useful at describing images and can provide a starting point for writing text equivalents, figure captions, and alt text. For an introduction to alt text, consult the Alternative Text Quick Guide earlier in this book.

When creating alt text, consider the context and purpose of the image. Focus on the image type, the main subject of the image, and the important details.

  CoPilot

For generating alt text and image descriptions using Microsoft CoPilot:

The More Precise setting produces more accurate image descriptions.

Select the Add an image button to upload a file.

When using CoPilot in the Edge browser sidebar, select the Add a screenshot button to take a screenshot of an active webpage.

CoPilot requires specific prompting to produce useful image descriptions:

Basic prompts
  • “What does this image depict?”
  • “Describe this image to me.”
  • “Tell me the information in this image.”
Adding details
  • “This image is a flowchart. Can you outline the process depicted?”
  • “Here is a map. Can you describe the information, scope, and scale?”

Generative AI may be more accurate and effective when it is asked to assume a role. Its output may also be more accurate and useful when the prompt includes parameters, such as a length limit.

Roles

Include specific roles in your prompt, such as:

  • “Act as a college biology instructor and describe the information in this cell diagram.”
  • “Act as a digital accessibility expert writing alternative text for a college course. Write a description of this image.”

Parameters

Including instructions that limit the length of the response is essential, as alt text must be concise.

  • “Act as a college biology instructor and describe the information in this cell diagram in 2 or 3 sentences please.”
  • “Act as a digital accessibility expert writing alternative text for a college course. Write a description of this image in 3 sentences.”

Removing the length parameter may be useful for complex images to ensure all information is included. Consult the Alt Text for Complex Images chapter for information on providing longer descriptions.
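When many images need the same kind of prompt, the role and parameter pattern described above can be assembled in a few lines of code. The following is a minimal sketch only: the function name `build_prompt` and its arguments are hypothetical, chosen for illustration, and are not part of CoPilot or any other chatbot. The resulting text would still be pasted or sent to the chatbot along with the image.

```python
from typing import Optional


def build_prompt(role: str, task: str, max_sentences: Optional[int] = None) -> str:
    """Combine a role, a task, and an optional length limit into one prompt.

    All names here are illustrative assumptions, not a chatbot API.
    """
    prompt = f"Act as {role} and {task}"
    if max_sentences is not None:
        unit = "sentence" if max_sentences == 1 else "sentences"
        prompt += f" in {max_sentences} {unit}"
    return prompt + "."


# Short alt text: a role plus a length parameter.
short_prompt = build_prompt(
    "a college biology instructor",
    "describe the information in this cell diagram",
    max_sentences=3,
)

# Complex image: leave the length parameter out so no detail is cut.
long_prompt = build_prompt(
    "a digital accessibility expert writing alternative text for a college course",
    "write a description of this image",
)

print(short_prompt)
print(long_prompt)
```

Keeping the length limit optional mirrors the advice above: include it for basic images and drop it for complex images that need a longer description.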

Try:

Act as an accessibility expert writing alternative text for a college textbook. Good alt text should identify the image type, describe the most important information, and add details that contribute information or meaning. Use proper sentence structure and grammar. Limit your response to 200 characters maximum.
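A chatbot may not keep exactly to a character limit, so it can help to count the characters in a draft before using it. The sketch below assumes the 200-character limit from the prompt above; the function name `check_alt_text` and the sample draft are hypothetical and for illustration only.

```python
from typing import Tuple


def check_alt_text(alt_text: str, max_chars: int = 200) -> Tuple[bool, int]:
    """Return (fits, length) for a draft alt text after trimming whitespace."""
    length = len(alt_text.strip())
    return length <= max_chars, length


# Sample draft, invented for this example.
draft = "Bar chart comparing enrolment in three programs from 2020 to 2023."
fits, length = check_alt_text(draft)
print(f"{length} characters, within limit: {fits}")
```

A draft that passes the length check still needs to be read and edited for accuracy and clarity.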

A longer prompt may work to create both a short and long description at once.

Try:

You are a university-level instructor working to ensure images in your online course materials conform to WCAG 2.1 Web Content Accessibility Guidelines at Level AA. You need to accommodate for blindness, low vision, and/or color blindness to ensure equal accessibility of course content to all students.

You will be provided with an image and must return a succinct title, alternate text (alt text), and a long description.

The alt text should be no more than 150 characters and should briefly describe the meaning conveyed by the image in the context of the course material.

The detailed description may be one or more paragraphs and must describe the image in great detail. The detailed description should be easily scannable and use headings, paragraph breaks, and lists as appropriate.

Do not include the encoded image itself in your response.


From Purdue University Innovative Learning

Interacting with a chatbot like CoPilot is an iterative process of prompt, response, and re-prompt. When CoPilot provides an image description, indicate what was incorrect or ask CoPilot to focus more on a specific element. With CoPilot, asking to make a long description shorter often crops, rather than condenses, the text to the requested length. Keep in mind that cropping may exclude essential information. If you need a specific length, include a length parameter in your initial prompt. Remember, use AI as a starting point to understand the main information and visual structure of an image and then edit the output for accuracy, clarity, and brevity. Read more about AI Prompting.
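The prompt, response, and re-prompt cycle can be pictured as a short loop. This is a sketch under stated assumptions, not how CoPilot works internally: `ask` is a placeholder the caller would supply to send prompts to a chatbot, and `fake_chatbot` is a stand-in invented here so the example runs without any real service.

```python
from typing import Callable, List


def refine_description(
    ask: Callable[[List[str]], str],
    first_prompt: str,
    max_chars: int,
    max_rounds: int = 3,
) -> str:
    """Re-prompt until the reply fits max_chars or the rounds run out.

    `ask` is a placeholder supplied by the caller: it receives the prompts
    sent so far and returns the chatbot's latest reply.
    """
    prompts = [first_prompt]
    reply = ask(prompts)
    for _ in range(max_rounds):
        if len(reply) <= max_chars:
            break
        # Ask for condensing, not cropping, so key details are kept.
        prompts.append(
            f"Condense your description to {max_chars} characters or fewer. "
            "Keep the image type and the most important information."
        )
        reply = ask(prompts)
    return reply  # Still needs human review for accuracy.


# Stand-in for a real chatbot, used only to show the flow.
def fake_chatbot(prompts: List[str]) -> str:
    if len(prompts) == 1:
        return "A very long description. " * 20
    return "Flowchart of the four-step enrolment process."


result = refine_description(fake_chatbot, "Describe this flowchart.", max_chars=150)
print(result)
```

The loop only checks length. Judging whether the description is correct and complete remains a task for the content creator.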

Note: Microsoft products like PowerPoint and Word do not have CoPilot’s ability to describe images. The AI-generated image descriptions in those platforms are of poor quality and should not be relied on.

Google Gemini and ChatGPT Plus also accept image uploads. Prompts similar to those discussed above can be used on those platforms.

 Alt Text Assistant (GPT4)

Alt Text Assistant is a custom interface for ChatGPT with predefined parameters to help write alt text and image descriptions. Alt Text Assistant requires a ChatGPT Plus account, which provides access to GPT-4.

If you have a ChatGPT Plus account, navigate to Alt Text Assistant:

  1. Select Can you help me create alt text? or Can you help me create descriptive text?

    Alt text is best for basic images. Descriptive text is necessary for complex images.
  2. Select the paper clip icon to upload an image.
  3. Select your image file.
  4. Press Enter or select the Send Message (up arrow icon) button on the right of the text field.

Based on testing, the Alt Text Assistant is the most user-friendly and accurate AI alt text tool. If you have access to ChatGPT Plus, Alt Text Assistant is recommended.

For further information on using CoPilot, Gemini, and ChatGPT consult the National Centre for AI Empowering Educators series.

Remember…

Always verify AI output for accuracy. Using AI to generate alt text and image descriptions is a great starting point but requires you, as the content creator and expert, to double check and refine what AI generates. The intention is to use AI to reduce your workload without creating junk descriptions that will create additional work and confusion for users.

There are privacy concerns with AI platforms. We recommend using caution when inputting – or having your students input – private, personal, or sensitive information (e.g. resumes or other identifying data). AI relies on large language models that are incomplete and biased. To generate content, chatbots use predictive text and any output should be verified for accuracy.

License

Accessibility Handbook for Teaching and Learning Copyright © 2023 by Briana Fraser and Luke McKnight is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.