
LLM Leaderboard
Compare how different large language models perform at writing Clerk code and select the one that best fits your requirements.
| Rank | Model | Average | Organizations | Billing | Webhooks | Auth | Quickstarts | UI Components | Upgrades | User Management |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | GPT-5.4 | 79% | 77% | 89% | 87% | 69% | 73% | 69% | 80% | 92% |
| 2 | v0-1.5-md | 71% | 91% | 62% | 75% | 60% | 76% | 86% | — | 50% |
| 3 | Claude Opus 4.6 | 68% | 68% | 63% | 91% | 49% | 93% | 68% | — | 42% |
| 4 | GPT-5.2 | 67% | 77% | 67% | 87% | 54% | 91% | 66% | — | 25% |
| 5 | Claude Opus 4.5 | 66% | 75% | 60% | 87% | 54% | 93% | 71% | — | 25% |
| 6 | GPT-5 | 66% | 67% | 68% | 87% | 37% | 91% | 62% | — | 50% |
| 7 | Gemini 3 Pro Preview | 64% | 54% | 62% | 68% | 54% | 91% | 78% | — | 42% |
| 8 | Claude Sonnet 4.5 | 64% | 48% | 63% | 74% | 43% | 83% | 86% | — | 50% |
| 9 | Claude Sonnet 4 | 63% | 53% | 60% | 72% | 49% | 91% | 86% | — | 33% |
| 10 | GPT-5 Chat | 63% | 69% | 66% | 81% | 38% | 89% | 66% | — | 33% |
| 11 | v0-1.5-lg | 62% | 49% | 60% | 59% | 59% | 67% | 74% | — | 67% |
| 12 | Claude Opus 4 | 60% | 40% | 60% | 78% | 38% | 91% | 81% | — | 33% |
| 13 | GPT-5.2 Codex | 59% | 74% | 63% | 78% | 38% | 54% | 64% | — | 42% |
| 14 | Claude Haiku 4.5 | 58% | 62% | 66% | 68% | 44% | 71% | 71% | — | 25% |
| 15 | GPT-4o | 39% | 58% | 0% | 60% | 16% | 91% | 34% | — | 17% |
| 16 | Gemini 2.5 Flash | 24% | 0% | 0% | 0% | 0% | 91% | 79% | — | 0% |
Last updated: March 6, 2026

