
LLM Leaderboard

Compare how different large language models perform at writing Clerk code and select the one that best fits your requirements.

| Provider | Model | Average | Authentication | Users | Organizations | Webhooks | API Routes | Checkout Flow |
|---|---|---|---|---|---|---|---|---|
| Anthropic | Claude Opus 4.5 | 70% | 40% | 42% | 86% | 60% | 89% | 18% |
| OpenAI | GPT-5 | 66% | 20% | 58% | 57% | 75% | 44% | 100% |
| OpenAI | GPT-5 Chat | 64% | 10% | 33% | 57% | 100% | 56% | 89% |
| Anthropic | Claude Haiku 4.5 | 62% | 0% | 42% | 71% | 82% | 78% | 9% |
| v0 | v0-1.5-md | 60% | 30% | 33% | 29% | 83% | 100% | 89% |
| Gemini | Gemini 3 Pro Preview | 59% | 20% | 58% | 71% | 50% | 78% | 89% |
| Gemini | Gemini 2.5 Flash | 59% | 20% | 50% | 57% | 83% | 44% | 89% |
| Anthropic | Claude Sonnet 4.5 | 59% | 30% | 50% | 71% | 92% | 56% | 89% |
| Anthropic | Claude Sonnet 4 | 55% | 30% | 33% | 43% | 64% | 56% | 89% |
| Anthropic | Claude Opus 4 | 54% | 20% | 33% | 43% | 64% | 56% | 100% |
| OpenAI | GPT-4o | 49% | 20% | 17% | 0% | 75% | 11% | 78% |

Last updated: December 2, 2025
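
One way to act on "select the one that best fits your requirements" is to re-rank the models using only the categories that matter for your project. The following is a minimal sketch, not part of the leaderboard itself: the `rank_models` helper and the example weights are hypothetical, the per-category scores are copied from the table above, and the weighted score it computes is an illustration only. It makes no assumption about how the published Average column is derived.

```python
# Per-category scores (percent), copied from the leaderboard table.
CATEGORIES = [
    "Authentication", "Users", "Organizations",
    "Webhooks", "API Routes", "Checkout Flow",
]

SCORES = {
    "Claude Opus 4.5":      [40, 42, 86, 60, 89, 18],
    "GPT-5":                [20, 58, 57, 75, 44, 100],
    "GPT-5 Chat":           [10, 33, 57, 100, 56, 89],
    "Claude Haiku 4.5":     [0, 42, 71, 82, 78, 9],
    "v0-1.5-md":            [30, 33, 29, 83, 100, 89],
    "Gemini 3 Pro Preview": [20, 58, 71, 50, 78, 89],
    "Gemini 2.5 Flash":     [20, 50, 57, 83, 44, 89],
    "Claude Sonnet 4.5":    [30, 50, 71, 92, 56, 89],
    "Claude Sonnet 4":      [30, 33, 43, 64, 56, 89],
    "Claude Opus 4":        [20, 33, 43, 64, 56, 100],
    "GPT-4o":               [20, 17, 0, 75, 11, 78],
}


def rank_models(weights):
    """Rank models by a weighted mean of their category scores.

    `weights` maps category name -> non-negative weight (hypothetical,
    chosen by the reader). Categories that are left out count as 0.
    Returns a list of (model, score) pairs, best first.
    """
    total = sum(weights.get(c, 0) for c in CATEGORIES)
    if total <= 0:
        raise ValueError("at least one category needs a positive weight")

    ranked = []
    for model, scores in SCORES.items():
        weighted = sum(weights.get(c, 0) * s for c, s in zip(CATEGORIES, scores))
        ranked.append((model, round(weighted / total, 1)))

    # sorted() is stable, so tied models keep their leaderboard order.
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Example: a project that cares mostly about webhooks, somewhat about organizations.
    for model, score in rank_models({"Webhooks": 2, "Organizations": 1})[:3]:
        print(f"{model}: {score}%")
```

With the example weights shown (Webhooks counted twice as heavily as Organizations), GPT-5 Chat comes out on top at 85.7%, just ahead of Claude Sonnet 4.5 at 85.0%, even though neither leads the overall Average column. Different weights will give a different ordering, which is the point of the exercise.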
