AgentUI is a skill-driven benchmark project to improve UI implementation quality for coding agents (Codex + Claude).
- Improve trigger accuracy for the `ui-architect` skill frontmatter
- Benchmark output quality without and with skill instructions
- Produce auditable side-by-side artifacts and scoring
- `.agents/skills/ui-architect/`: skill package used by agents
- `benchmarks/prompts/`: test prompts
- `benchmarks/baseline/`: outputs without skill guidance
- `benchmarks/with-skill/`: outputs with skill guidance
- `benchmarks/analysis/`: side-by-side evaluation docs
- `docs/`: reports and drive-share artifacts
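
As a rough illustration of how the baseline and with-skill directories could be lined up for side-by-side review, the sketch below pairs output files by filename stem. It is not part of the project: the helper names `pair_outputs` and `missing_pairs` are hypothetical, and it assumes (without confirmation from the repo) that each prompt produces one output file with the same name in both directories.

```python
from __future__ import annotations

from pathlib import Path


def index_by_stem(directory: Path) -> dict[str, Path]:
    """Map file stem -> path for files directly inside `directory`."""
    # A missing directory yields an empty index rather than an error.
    if not directory.is_dir():
        return {}
    return {p.stem: p for p in sorted(directory.iterdir()) if p.is_file()}


def pair_outputs(root: Path) -> dict[str, tuple[Path | None, Path | None]]:
    """Pair outputs as (baseline, with-skill), keyed by shared file stem.

    Assumption: matching outputs share a filename stem in both directories.
    """
    baseline = index_by_stem(root / "benchmarks" / "baseline")
    with_skill = index_by_stem(root / "benchmarks" / "with-skill")
    names = sorted(baseline.keys() | with_skill.keys())
    return {name: (baseline.get(name), with_skill.get(name)) for name in names}


def missing_pairs(pairs: dict[str, tuple[Path | None, Path | None]]) -> list[str]:
    """Names that lack either the baseline or the with-skill output."""
    return [name for name, (b, w) in pairs.items() if b is None or w is None]


if __name__ == "__main__":
    # Run from the repository root to list the pairs found on disk.
    for name, (b, w) in pair_outputs(Path(".")).items():
        print(f"{name}: baseline={b} with-skill={w}")
```

Checking `missing_pairs` before writing an analysis doc is one way to catch a prompt that was run in only one condition.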
- Fintech dashboard
- Healthcare settings/preferences
- Ecommerce search results + filters