a11y-llm-eval

by microsoft · Agent Tool · ★ 34

About a11y-llm-eval

A11y LLM Evaluation Harness and Dataset

This is a research project to evaluate how well various LLMs generate accessible HTML content.

Problem

LLMs currently generate code with accessibility bugs, resulting in blockers for people with disabilities and costly rework and fixes downstream.

Goal

Create a public test suite that can be used to benchmark how well various LLMs generate accessible HTML code. Eventually, it could also be used to help train models to generate more accessible code by default.

Quick Facts

Stars: 34
Forks: 5
Language: Python
Category: Agent Tool
License: MIT
Quality Score: 50.948/100
Last Updated: 2026-05-07
Created: 2025-09-24
Platforms: python
Est. Tokens: ~70k

Frequently Asked Questions

What is a11y-llm-eval?

a11y-llm-eval is an evaluation tool for benchmarking how well LLMs generate accessible HTML. It is categorized as an Agent Tool and has 34 GitHub stars.

What programming language is a11y-llm-eval written in?

a11y-llm-eval is primarily written in Python.

How do I install or use a11y-llm-eval?

Installation instructions and usage details are in the a11y-llm-eval GitHub repository at github.com/microsoft/a11y-llm-eval. The project has 34 stars and 5 forks.

What license does a11y-llm-eval use?

a11y-llm-eval is released under the MIT license, so it is free to use and modify under the license terms.
