
LLM Leaderboard
Compare how different large language models perform at writing Clerk code and select the one that best fits your requirements.
| Rank | Model | Average | Authentication | Users | Organizations | Webhooks | API Routes | Checkout Flow |
|---|---|---|---|---|---|---|---|---|
| 1 | Claude Opus 4.5 | 70% | 40% | 42% | 86% | 60% | 89% | 18% |
| 2 | GPT-5 | 66% | 20% | 58% | 57% | 75% | 44% | 100% |
| 3 | GPT-5 Chat | 64% | 10% | 33% | 57% | 100% | 56% | 89% |
| 4 | Claude Haiku 4.5 | 62% | 0% | 42% | 71% | 82% | 78% | 9% |
| 5 | v0-1.5-md | 60% | 30% | 33% | 29% | 83% | 100% | 89% |
| 6 | Gemini 3 Pro Preview | 59% | 20% | 58% | 71% | 50% | 78% | 89% |
| 7 | Gemini 2.5 Flash | 59% | 20% | 50% | 57% | 83% | 44% | 89% |
| 8 | Claude Sonnet 4.5 | 59% | 30% | 50% | 71% | 92% | 56% | 89% |
| 9 | Claude Sonnet 4 | 55% | 30% | 33% | 43% | 64% | 56% | 89% |
| 10 | Claude Opus 4 | 54% | 20% | 33% | 43% | 64% | 56% | 100% |
| 11 | GPT-4o | 49% | 20% | 17% | 0% | 75% | 11% | 78% |
Last updated: December 2, 2025
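
If only some categories matter for your project, one way to pick a model is to re-rank the table by a weighted mean of the per-category scores. The sketch below is a minimal illustration of that idea, not part of the leaderboard itself: the `rank_models` helper and the weighting scheme are assumptions introduced here, and the result is not the same as the Average column, whose calculation is not described on this page. The scores are copied from the table above.

```python
# Per-category scores (percent), copied from the leaderboard table.
CATEGORIES = [
    "Authentication", "Users", "Organizations",
    "Webhooks", "API Routes", "Checkout Flow",
]

SCORES = {
    "Claude Opus 4.5":      [40, 42, 86, 60, 89, 18],
    "GPT-5":                [20, 58, 57, 75, 44, 100],
    "GPT-5 Chat":           [10, 33, 57, 100, 56, 89],
    "Claude Haiku 4.5":     [0, 42, 71, 82, 78, 9],
    "v0-1.5-md":            [30, 33, 29, 83, 100, 89],
    "Gemini 3 Pro Preview": [20, 58, 71, 50, 78, 89],
    "Gemini 2.5 Flash":     [20, 50, 57, 83, 44, 89],
    "Claude Sonnet 4.5":    [30, 50, 71, 92, 56, 89],
    "Claude Sonnet 4":      [30, 33, 43, 64, 56, 89],
    "Claude Opus 4":        [20, 33, 43, 64, 56, 100],
    "GPT-4o":               [20, 17, 0, 75, 11, 78],
}


def rank_models(weights):
    """Rank models by a weighted mean of the chosen categories.

    `weights` maps a category name to a non-negative weight. Categories
    that are left out are ignored. This helper is hypothetical and is
    not how the leaderboard's own Average column is computed.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    ranked = []
    for model, scores in SCORES.items():
        by_category = dict(zip(CATEGORIES, scores))
        value = sum(by_category[c] * w for c, w in weights.items()) / total
        ranked.append((model, round(value, 1)))
    # Highest weighted score first.
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Example: webhooks matter twice as much as the checkout flow.
    for model, value in rank_models({"Webhooks": 2, "Checkout Flow": 1})[:3]:
        print(f"{model}: {value}%")
```

Changing the weights changes the ordering, so a model that ranks mid-table overall can come out on top for a narrow use case.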

