a11y-llm-eval

About a11y-llm-eval

A11y LLM Evaluation Harness and Dataset

This is a research project to evaluate how well various LLMs generate accessible HTML content.

Problem

LLMs currently generate code with accessibility bugs, which creates blockers for people with disabilities and leads to costly rework and fixes downstream.

Goal

Create a public test suite that can be used to benchmark how well various LLMs generate accessible HTML code. Eventually, it could also be used to help train models to generate more accessible code by default.

Quick Facts

Stars	34
Forks	5
Language	Python
Category	Agent Tool
License	MIT
Quality Score	67.0/100
Last Updated	2026-05-07
Created	2025-09-24
Platforms	python
Est. Tokens	~70k

Frequently Asked Questions

What is a11y-llm-eval?

a11y-llm-eval is an eval tool to benchmark how well LLMs generate accessible HTML. It is categorized as an Agent Tool and has 34 GitHub stars.

What programming language is a11y-llm-eval written in?

a11y-llm-eval is primarily written in Python.

How do I install or use a11y-llm-eval?

Installation instructions and usage details are in the a11y-llm-eval GitHub repository at github.com/microsoft/a11y-llm-eval. The project has 34 stars and 5 forks.

What license does a11y-llm-eval use?

a11y-llm-eval is released under the MIT license, making it free to use and modify according to the license terms.
