# Authentication for AI Applications

Authentication for AI applications is the practice of verifying the identity of **both** the human user and the AI agent acting on their behalf, then issuing short-lived, scoped, auditable credentials to each. Modern AI apps are dual-principal systems: a request typically carries a user identity (`sub`), a client identity (`azp`), and an agent identity (`act`) — and the server must know all three to authorize safely. There are two core patterns: **user-delegated** agents that act inside a human's session with that user's consent, and **autonomous** machine-to-machine (M2M) agents that act without a user in the loop. Both are live today — Anthropic reported **97 million+ monthly [Model Context Protocol](https://modelcontextprotocol.io/) SDK downloads** in its [December 2025 donation announcement](https://www.anthropic.com/news/donating-the-model-context-protocol-and-establishing-of-the-agentic-ai-foundation) — and both are frequently mis-scoped: **53% of organizations report their AI agents exceeded intended permissions** in the past year ([CSA, April 2026](https://cloudsecurityalliance.org/press-releases/2026/04/16/more-than-half-of-organizations-experience-ai-agent-scope-violations-cloud-security-alliance-study-finds)).

## Introduction

AI changes authentication because a single request now carries two identities — the human who asked for the action and the agent that executed it — each with different lifetimes, blast radii, and audit requirements.

### What Authentication for AI Applications Means

[Authentication](https://clerk.com/glossary/authentication.md) proves identity; [authorization](https://clerk.com/glossary/authorization.md) decides what that identity may do. Both matter more for AI agents than for humans because agents operate at higher request volume, make autonomous decisions, and chain actions across multiple APIs. An AI application is any system where a language model or autonomous process invokes a backend API — in-product copilots, background workers that classify tickets overnight, and external [Model Context Protocol](https://modelcontextprotocol.io/) (MCP) clients like Claude, Cursor, or ChatGPT invoking your tools. [AI authentication](https://clerk.com/glossary/ai-authentication.md) therefore covers two identity problems at once: verifying the human user in the normal way (session, OAuth, MFA), and verifying the non-human agent with a separate credential that can be revoked, attenuated, and audited independently.

### How It Differs From Traditional Web App Authentication

Traditional web auth assumes one human, one session, one cookie. Modern AI auth assumes one human plus N agents hitting M downstream APIs, with each hop producing its own token. Three shifts result: **non-human actors become first-class** (machine identities outnumber humans **82:1** per [CyberArk 2025](https://www.cyberark.com/press/machine-identities-outnumber-humans-by-more-than-80-to-1-new-report-exposes-the-exponential-threats-of-fragmented-identity-security/) and **144:1** per [NHIMG 2025](https://nhimg.org/2025-state-of-non-human-identities-and-secrets-in-cybersecurity)); **token volume and lifetime pressure** (an agent can issue thousands of requests per minute — short TTLs become mandatory); and **delegation chains** where each hop needs a verifiable link back to the original human principal. The mechanics (OAuth 2.1, [JWT](https://clerk.com/glossary/json-web-token.md), OIDC) are the same — scale and audit requirements change.

### What This Article Covers

This article covers, in order:

- Why AI apps have unique authentication needs
- The two core patterns: user-delegated vs. autonomous M2M
- Token scoping across time, resource, and action
- Delegation patterns
- Multi-tenant isolation
- MCP authentication
- Security and structured error responses
- Implementation with Clerk + Next.js 16
- Choosing an authentication provider
- Quick-reference checklists and an FAQ

## Why AI Applications Have Unique Authentication Needs

AI introduces three structural shifts: non-human identities at scale, dual-principal requests, and architectural diversity that blurs the line between client, agent, and server.

### Non-Human Identities Enter the Application

A non-human identity (NHI) is any principal that is not a person — service accounts, [API keys](https://clerk.com/glossary/api-key.md), workload identities, [SPIFFE](https://www.hashicorp.com/en/blog/spiffe-securing-the-identity-of-agentic-ai-and-non-human-actors) SVIDs, and now autonomous AI agents. Enterprises carry roughly **250,000 machine identities each** (up from 50,000 in 2021, [CyberArk](https://www.cyberark.com/press/machine-identities-outnumber-humans-by-more-than-80-to-1-new-report-exposes-the-exponential-threats-of-fragmented-identity-security/)), and **[Gartner predicts](https://www.gartner.com/en/newsroom/press-releases/2025-08-26-gartner-predicts-40-percent-of-enterprise-apps-will-feature-task-specific-ai-agents-by-2026-up-from-less-than-5-percent-in-2025) 40% of enterprise apps will feature task-specific AI agents by the end of 2026 (up from under 5% in 2025)**. **97% of NHIs possess excessive privileges** and **91% of former-employee tokens remain active** ([NHIMG](https://nhimg.org/2025-state-of-non-human-identities-and-secrets-in-cybersecurity)). Traditional [identity management](https://clerk.com/glossary/identity-management.md) was built for humans — MFA, password rotation, leavers processes — and those primitives do not map to an agent that may be created mid-request, act once, and never return. **82% of enterprises already have AI agents they did not knowingly deploy** ([CSA, April 2026](https://cloudsecurityalliance.org/press-releases/2026/04/21/new-cloud-security-alliance-survey-reveals-82-of-enterprises-have-unknown-ai-agents-in-their-environments)) — you cannot authenticate what you cannot see.

### Authenticating Both Human Users and AI Agents Simultaneously

Every agent request has three identities the server must track: `sub` (the user the action is for), `azp` (the client that initiated the agent, usually your web/mobile app), and `act` (the agent itself, [RFC 8693](https://www.rfc-editor.org/rfc/rfc8693)). Dropping `act` breaks attribution; dropping `sub` breaks user-scoped access control; dropping `azp` means you cannot revoke a compromised client without revoking everyone. The IETF draft [OBO for AI Agents v02](https://datatracker.ietf.org/doc/html/draft-oauth-ai-agents-on-behalf-of-user-02) codifies this triple with a `requested_actor` parameter on the authorization request and an `actor_token` parameter on the token request. Design every token so middleware can answer "who asked?", "through which app?", and "via which agent?" without extra network calls.
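As a minimal sketch of the "answer all three questions locally" idea, the function below pulls the three principals out of claims that have already been signature-verified. The `VerifiedClaims` and `Principals` types and the `extractPrincipals` name are illustrative assumptions, not part of any SDK; the nested `act.sub` shape follows RFC 8693.

```typescript
// Claims after signature verification (verification itself is out of scope here).
interface VerifiedClaims {
  sub?: string;          // user the action is for
  azp?: string;          // client that initiated the agent
  act?: { sub: string }; // RFC 8693 actor claim: the agent itself
}

interface Principals {
  userId: string;
  clientId: string;
  agentId: string;
}

// Returns all three principals, or throws naming the claim that is missing.
function extractPrincipals(claims: VerifiedClaims): Principals {
  const userId = claims.sub;
  const clientId = claims.azp;
  const actor = claims.act;
  if (!userId) throw new Error("missing sub: cannot apply user-scoped access control");
  if (!clientId) throw new Error("missing azp: cannot attribute or revoke the client");
  if (!actor || !actor.sub) throw new Error("missing act: cannot attribute the agent");
  return { userId, clientId, agentId: actor.sub };
}
```

Failing closed on any missing claim keeps a partially populated token from being treated as an ordinary user request.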

### Common AI Application Architectures

Three architectures dominate, each with a different auth shape.

- **In-session copilot.** A sidebar or chat UI reading the signed-in user's data. Reuse the user's session token on each tool call — the agent executes inside the browser's trust boundary. [Clerk session tokens](https://clerk.com/docs/guides/sessions/session-tokens.md) (60s TTL, auto-refresh) pass safely into AI SDK `tool()` calls for low-risk reads ([Vercel AI SDK](https://ai-sdk.dev/docs/ai-sdk-core/tools-and-tool-calling)).
- **Autonomous worker.** Nightly classifier, webhook-driven triage, scheduled cleanup. No user session to inherit. Use Clerk M2M tokens, OAuth Client Credentials, cloud workload identities, or SPIFFE SVIDs; obtain per-user data on demand via on-behalf-of (OBO) token exchange (see [On-Behalf-Of Token Exchange](#on-behalf-of-token-exchange)) or token vault.
- **External MCP client.** Your SaaS exposes tools over MCP to Claude / Cursor / ChatGPT. The external agent is an OAuth client of your app. Auth uses OAuth 2.1 + PKCE; the user consents once, and tool invocations are verified server-side. The [MCP Authorization Spec](https://modelcontextprotocol.io/specification/2025-11-25/basic/authorization) defines OAuth 2.1 authorization for HTTP transports and says stdio servers should not use it, retrieving credentials from the environment instead.

Modern AI SaaS typically runs all three at once.

| Architecture        | Primary auth             | Who authenticates the agent | Typical TTL                    |
| ------------------- | ------------------------ | --------------------------- | ------------------------------ |
| In-session copilot  | User session token       | Inherited from user         | 60s–15m (with refresh)         |
| Autonomous worker   | M2M / Client Credentials | Your backend or cloud IAM   | 15m–1h                         |
| External MCP client | OAuth 2.1 + PKCE         | End user via consent        | ≤ 1h access / rotating refresh |

### Foundational Authentication Concepts (Quick Refresher)

Skip this subsection if you already know OAuth, JWTs, and token lifetimes.

[OAuth](https://clerk.com/glossary/oauth.md) is a delegation protocol; [OpenID Connect](https://clerk.com/glossary/openid-connect.md) layers identity on top. Four grant types matter for AI: [Authorization Code](https://clerk.com/glossary/authorization-code-flow.md) + [PKCE](https://clerk.com/glossary/pkce.md) for user-delegated agents ([RFC 6749](https://www.rfc-editor.org/rfc/rfc6749), [RFC 7636](https://www.rfc-editor.org/rfc/rfc7636)), [Client Credentials](https://clerk.com/glossary/client-credentials-flow.md) for M2M, Token Exchange for on-behalf-of ([RFC 8693](https://www.rfc-editor.org/rfc/rfc8693)), and Device Authorization for headless CLIs ([RFC 8628](https://www.rfc-editor.org/rfc/rfc8628)). **OAuth 2.1** ([draft](https://datatracker.ietf.org/doc/html/draft-ietf-oauth-v2-1)) removes the Implicit and Resource Owner Password Credentials grants, mandates PKCE for the authorization code flow, and recommends sender-constrained tokens.

Three credential types you will issue to agents:

- **[Session tokens](https://clerk.com/glossary/session-token.md)** — short-lived [JWTs](https://clerk.com/glossary/json-web-token.md) tied to a browser session (Clerk default: 60s TTL, 50s refresh).
- **API keys** — long-lived, scope-bounded, no built-in expiration, server-revocable. Use for simple internal S2S ([Cloudflare comparison](https://www.cloudflare.com/learning/access-management/api-key-vs-oauth/)).
- **Service credentials** — cloud-managed workload tokens (AWS IAM Roles, Entra Workload Identities, GCP Service Accounts, SPIFFE SVIDs) that auto-rotate and never hold static secrets.

Token lifetimes: access tokens ≤ 1 hour (≤ 15 minutes for high-privilege actions per [RFC 6750](https://www.rfc-editor.org/rfc/rfc6750) / [RFC 9700](https://datatracker.ietf.org/doc/html/rfc9700)); [refresh tokens](https://clerk.com/glossary/refresh-token.md) rotated every use with automatic reuse detection ([Auth0](https://auth0.com/docs/secure/tokens/refresh-tokens/refresh-token-rotation)). [Oso](https://www.osohq.com/learn/best-practices-of-authorizing-ai-agents) documents an Okta benchmark: shortening tokens from 24 hours to 5 minutes reduced credential theft incidents by **92%**.

## Two Core Patterns for AI Agent Authentication

Every AI auth design starts with one binary question: is a human in the loop? If yes, use **Pattern 1: user-delegated** — the agent acts inside the user's session, with consent, bounded by the user's permissions. If no, use **Pattern 2: autonomous** machine-to-machine — the agent acts on its own identity with a separate credential. Most production systems run both.

### Pattern 1: User-Delegated AI Agents (Human-in-the-Loop)

**When to use:** the agent acts synchronously during a user session with the user's consent — in-product copilots, chat UIs, prompt-triggered tool calls. The decisive test: if the action should never outlive the user's active session, use this pattern.

**Flow and token shape:** OAuth 2.1 Authorization Code + [PKCE](https://clerk.com/glossary/pkce.md), producing an access token with `sub = user_id`, `aud` = your resource server (mandatory per [RFC 8725](https://datatracker.ietf.org/doc/html/rfc8725)), short TTL (1 hour max, 15 minutes for sensitive actions), and scopes limited to the user's current capabilities. Optionally issue a derived [JWT](https://clerk.com/glossary/json-web-token.md) from a [Clerk JWT template](https://clerk.com/docs/guides/sessions/jwt-templates.md) embedding agent context (`agent_id`, `tool_scopes`, `session_id`). The [consent screen](https://clerk.com/glossary/consent-screen.md) should surface the agent identity explicitly ([Curity](https://curity.io/blog/user-consent-best-practices-in-the-age-of-ai-agents/)).

**Example — chatbot that reads a user's calendar:** user signs in → chat UI asks about tomorrow's calendar → agent triggers a Google OAuth consent flow for `calendar.readonly` → backend exchanges the code, stores the refresh token in a server-side token vault, returns a short-lived access token to the agent → agent calls Google Calendar, never touching the refresh token. Vault pattern: [Auth0 Token Vault](https://auth0.com/ai/docs/intro/token-vault), [Anthropic Managed Agents Vaults](https://platform.claude.com/docs/en/managed-agents/vaults), [AWS Bedrock AgentCore Identity](https://aws.amazon.com/blogs/machine-learning/introducing-amazon-bedrock-agentcore-identity-securing-agentic-ai-at-scale/).

### Pattern 2: Autonomous AI Agents (Machine-to-Machine)

**When to use:** the agent runs without a user in the loop — background workers, scheduled jobs, webhook-driven triage, batch processors. The decisive test: if the agent acts when no user is online, use this pattern.

**Flow and credential shape:** two industry approaches. (1) **OAuth Client Credentials** ([RFC 6749 §4.4](https://www.rfc-editor.org/rfc/rfc6749)) with **JWT Bearer Assertions** ([RFC 7523](https://datatracker.ietf.org/doc/html/rfc7523)) replacing the static client secret — MCP's M2M extension uses this shape ([MCP OAuth Client Credentials extension](https://modelcontextprotocol.io/extensions/auth/oauth-client-credentials)). (2) **Clerk M2M tokens** — a Clerk M2M "scope" is a communication graph (which machine may talk to which), not an OAuth capability scope. As of April 2026, Clerk does not yet support the OAuth Client Credentials flow ([Clerk Machine Auth Overview](https://clerk.com/docs/guides/development/machine-auth/overview.md)). Both issue an [access token](https://clerk.com/glossary/access-token.md) whose `sub` is the machine identity; custom [claims](https://clerk.com/glossary/claim.md) identify the specific agent; tokens are short-lived and revocable.

**Example — background ticket classifier:** worker mints a Clerk M2M token with `clerkClient.m2m.createToken({ tokenFormat: 'jwt', secondsUntilExpiration: 3600 })`, calls your backend; the route handler runs `auth({ acceptsToken: ['m2m_token'] })` and verifies custom claims. Every update records `agent_id` alongside `updated_by`. For actions needing the ticket owner's permissions, the agent performs OBO token exchange (see [On-Behalf-Of Token Exchange](#on-behalf-of-token-exchange)) instead of reusing its own M2M token.

### Choosing Between the Two Patterns

Use this decision table for any new agent action.

| Question                                     | User-delegated           | Autonomous M2M                       | Hybrid (OBO)                                |
| -------------------------------------------- | ------------------------ | ------------------------------------ | ------------------------------------------- |
| Is a user in the request?                    | Yes                      | No                                   | Yes (origin) + No (execution)               |
| Does the action need the user's permissions? | Yes                      | No                                   | Yes                                         |
| Blast radius if token stolen                 | One user's data          | All data the agent can reach         | One user's data, one agent                  |
| Revocation granularity                       | Sign-out cascades        | Per-machine revocation               | Per-exchange revocation                     |
| Recommended token format                     | JWT with `sub = user_id` | JWT / opaque with `sub = machine_id` | JWT with `sub = user_id` + `act = agent_id` |

## Token Scoping for AI Agents

Scoping is the single most impactful control you can apply to AI agents. Layer three controls — **time**, **resource**, **action** — and you bound the blast radius of every compromised token. Skip any layer and a stolen or manipulated credential gains far more reach than the agent was ever supposed to have.

### Why Scoping Matters More for AI Than for Humans

Agents make more requests per unit time and take more autonomous decisions than humans, so over-permission compounds fast: a single scope bug multiplied across 10,000 nightly runs is a different class of incident than one human misclick. **53% of organizations** reported AI agents exceeding intended permissions in the past year ([CSA, April 2026](https://cloudsecurityalliance.org/press-releases/2026/04/16/more-than-half-of-organizations-experience-ai-agent-scope-violations-cloud-security-alliance-study-finds)); **51% lack a formal revocation process** and credentials remain active **47 days on average** past need ([Okta](https://www.okta.com/blog/ai/ai-agent-security-when-authorization-outlives-intent/)). [Prompt injection](https://genai.owasp.org/llmrisk/llm01-prompt-injection/) (OWASP LLM01) turns any over-scoped token into a potential lateral-movement vector; the [OWASP Top 10 for Agentic Applications 2026](https://genai.owasp.org/2025/12/09/owasp-top-10-for-agentic-applications-the-benchmark-for-agentic-security-in-the-age-of-autonomous-ai/) lists Identity / Privilege Abuse (ASI03) and Tool Misuse (ASI02) as top-3 risks.

### Time-Limited Tokens

Default to ≤ 1 hour for user-context access tokens and ≤ 15 minutes for high-privilege actions ([RFC 6750](https://www.rfc-editor.org/rfc/rfc6750), [RFC 9700](https://datatracker.ietf.org/doc/html/rfc9700)). Rotate refresh tokens on every exchange with automatic reuse detection — Auth0's pattern invalidates the entire token family on a stale submission ([Auth0](https://auth0.com/docs/secure/tokens/refresh-tokens/refresh-token-rotation)) — and pair with sender-constrained refresh tokens via [DPoP (Demonstrating Proof-of-Possession)](https://www.rfc-editor.org/rfc/rfc9449) or mTLS so a stolen token cannot be replayed without the private key ([WorkOS — DPoP](https://workos.com/blog/dpop-rfc-9449-explained)). Refresh proactively with a 5-minute buffer ([Scalekit](https://www.scalekit.com/blog/oauth-ai-agents-architecture)); Clerk Core 3 does this inside the SDK ([changelog](https://clerk.com/changelog/2026-03-03-core-3.md)). Verify `exp` server-side — never trust client-side time. If an agent may outlive the user session, swap to a service credential before the session ends.
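The two timing rules above (server-side `exp` checks and a proactive refresh buffer) reduce to a pair of comparisons. This is a sketch only: the function names are hypothetical, `exp` is assumed to be in seconds since the epoch as in a JWT, and the caller is assumed to pass in the server's clock.

```typescript
const REFRESH_BUFFER_SECONDS = 5 * 60; // the 5-minute proactive refresh window

// True once the token's `exp` (seconds since epoch) has passed on the server clock.
function isExpired(expSeconds: number, nowMs: number): boolean {
  return nowMs >= expSeconds * 1000;
}

// True once the token is inside the refresh buffer, including after expiry.
function shouldRefresh(
  expSeconds: number,
  nowMs: number,
  bufferSeconds: number = REFRESH_BUFFER_SECONDS,
): boolean {
  return nowMs >= (expSeconds - bufferSeconds) * 1000;
}
```

Passing the clock in as an argument, rather than reading it inside the function, keeps the check deterministic in tests and makes it obvious which clock is trusted.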

### Resource-Specific Permissions

[OAuth scopes](https://clerk.com/glossary/oauth-scopes.md) are coarse capability boundaries (`contacts:read`, `contacts:write`, `crm.write:acme`). Users can grant less than requested during consent ([Auth0](https://auth0.com/docs/get-started/apis/scopes)); Descope's [progressive-scoping patterns](https://www.descope.com/blog/post/progressive-scoping) map agent tools to scope bundles. Scopes alone are too coarse for agent safety — enforce per-endpoint and per-record decisions in middleware from the verified JWT claims. **Rich Authorization Requests** ([RFC 9396](https://www.rfc-editor.org/rfc/rfc9396)) add an `authorization_details` parameter that carries structured permissions like `{"type":"payment","amount":500,"merchant":"acme"}` — a better fit for agent actions than a blunt `payment:write` scope ([Stytch](https://stytch.com/blog/ai-agent-authentication-guide/), [Curity](https://curity.io/resources/learn/api-security-best-practice-for-ai-agents/)).
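To illustrate why structured permissions fit agent actions better than a flat scope, here is a hedged sketch of a check against `authorization_details` entries shaped like the payment example above. Treating `amount` as an upper bound is an assumption made for this illustration; RFC 9396 leaves the semantics of each `type` to the API that defines it, and the type and function names are hypothetical.

```typescript
// One entry of the `authorization_details` array, using the payment shape from the text.
interface PaymentDetail {
  type: "payment";
  amount: number;   // assumed here to be the maximum amount the user approved
  merchant: string;
}

interface PaymentRequest {
  amount: number;
  merchant: string;
}

// Allows the payment only if a granted detail names the same merchant and covers the amount.
function isPaymentAuthorized(details: PaymentDetail[], request: PaymentRequest): boolean {
  return details.some(
    (d) =>
      d.type === "payment" &&
      d.merchant === request.merchant &&
      request.amount > 0 &&
      request.amount <= d.amount,
  );
}
```

With a grant of `{"type":"payment","amount":500,"merchant":"acme"}`, a 200-unit payment to `acme` passes, while a 501-unit payment or any payment to another merchant is refused, which a plain `payment:write` scope cannot express.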

### Action-Based Restrictions

Default to read-only; require explicit opt-in for write. The GitHub MCP server exposes a production example — `X-MCP-Readonly: "true"` disables every mutating tool without changing scopes ([GitHub](https://github.blog/ai-and-ml/generative-ai/a-practical-guide-on-how-to-use-the-github-mcp-server/)).
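A read-only default can be sketched as a filter over the tool registry. The `ToolDefinition` shape and `visibleTools` function are hypothetical and do not model the GitHub MCP server's internals; the point is only that writes require an explicit opt-in rather than an explicit opt-out.

```typescript
interface ToolDefinition {
  name: string;
  mutates: boolean; // true if the tool writes, deletes, or otherwise changes state
}

// Read-only unless the caller explicitly opts in to writes.
function visibleTools(tools: ToolDefinition[], allowWrites: boolean = false): ToolDefinition[] {
  return allowWrites ? tools : tools.filter((t) => !t.mutates);
}
```

Because mutating tools are never exposed to the model in read-only mode, a prompt-injected instruction to call one has nothing to bind to.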

For finer control, add **fine-grained authorization** (FGA), typically relationship-based access control ([ReBAC](https://openfga.dev/docs/concepts)) in the style of [Google Zanzibar](https://research.google/pubs/zanzibar-googles-consistent-global-authorization-system/). Scopes answer "can this token write to the CRM?"; FGA answers "can this specific agent, acting for this specific user, update _this_ contact record?". Clerk does not provide FGA natively; the recommended pattern is a composition: **Clerk** supplies identity context in the [JWT](https://clerk.com/glossary/json-web-token.md) (`user_id`, `org_id`, `org_role`, agent context from `user.public_metadata`), and an FGA engine ([OpenFGA](https://openfga.dev/docs/modeling/agents), [Auth0 FGA](https://auth0.com/blog/genai-tool-calling-intro/), [Oso](https://www.osohq.com/learn/best-practices-of-authorizing-ai-agents), [Cerbos](https://www.cerbos.dev/blog/dynamic-authorization-for-ai-agents-guide-to-fine-grained-permissions-mcp-servers), or [Permit.io](https://www.permit.io/blog/announcing-permit-ai-access-control-ai-identity-fga)) reads those claims and answers resource-level questions. Clerk's [org permissions](https://clerk.com/docs/guides/organizations/control-access/roles-and-permissions.md) handle most RBAC needs.

### Combining All Three Scoping Strategies in Practice

Layered scoping is the norm for production agents. A single tool call might carry:

- A token valid for **15 minutes** (time limit).
- Scope `crm.write:acme` (resource scope bound to a specific tenant — [Descope](https://www.descope.com/blog/post/progressive-scoping)).
- A middleware check that calls FGA with `(agent_id, "update", contact_123)` before the update executes (per-action).

Compromise any one layer and the other two still bound the blast radius.
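The three layers can be sketched as one ordered check. Everything here is illustrative: `AgentToken`, `FgaCheck`, and `authorizeToolCall` are hypothetical names, and the FGA lookup is an injected function standing in for a call to a real engine.

```typescript
interface AgentToken {
  exp: number;      // expiry, seconds since epoch
  scopes: string[]; // e.g. ["crm.write:acme"]
  agentId: string;
}

// Stand-in for an FGA engine call such as (agent_id, "update", contact_123).
type FgaCheck = (agentId: string, action: string, resource: string) => boolean;

type Decision =
  | { allowed: true }
  | { allowed: false; reason: "expired" | "missing_scope" | "fga_denied" };

function authorizeToolCall(
  token: AgentToken,
  requiredScope: string,
  action: string,
  resource: string,
  fga: FgaCheck,
  nowMs: number,
): Decision {
  // Layer 1: time limit.
  if (nowMs >= token.exp * 1000) return { allowed: false, reason: "expired" };
  // Layer 2: resource scope.
  if (token.scopes.indexOf(requiredScope) === -1) return { allowed: false, reason: "missing_scope" };
  // Layer 3: per-action, per-record decision.
  if (!fga(token.agentId, action, resource)) return { allowed: false, reason: "fga_denied" };
  return { allowed: true };
}
```

Returning a reason alongside the denial lets the caller log which layer stopped the request without revealing anything about other tenants' data.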

## AI Agent Delegation Patterns

Delegation means an agent acts with another principal's authority — a user's, another agent's, or the organization's. Every production agent hits four delegation surfaces: **third-party APIs** (Gmail, Slack, GitHub), **your internal database**, **in-app user data** (via your backend), and **other agents** in a multi-agent pipeline. Each surface has a canonical pattern.

### Delegation Pattern: Agent Accessing Third-Party APIs

For Gmail, Slack, GitHub, and similar, use OAuth 2.1 Authorization Code + PKCE requested **in the user's name**. The critical design point is that the agent **never sees the refresh token** — it lives server-side in a token vault, and the agent asks the vault for a short-lived access token on each call.

Canonical implementations: [Auth0 Token Vault](https://auth0.com/ai/docs/intro/token-vault) (RFC 8693 internally, pre-built Google / Slack / GitHub / Microsoft connections), [Anthropic Managed Agents Vaults](https://platform.claude.com/docs/en/managed-agents/vaults) (`mcp_oauth` and `static_bearer` types, workspace-scoped, write-only fields), [AWS Bedrock AgentCore Identity](https://aws.amazon.com/blogs/machine-learning/introducing-amazon-bedrock-agentcore-identity-securing-agentic-ai-at-scale/), and [Descope Outbound Apps](https://www.descope.com/press-release/agentic-identity-hub-2.0).

#### Storing and Refreshing Third-Party Tokens

A production token vault needs: KMS-backed encryption at rest; a write-only API (secret fields never leave the vault); per-user isolation; refresh orchestration on expiry; reuse detection that invalidates the family on replay; and [revocation](https://clerk.com/glossary/token-revocation.md) hooks fired on user sign-out, role change, or disconnect.
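Of the requirements above, reuse detection is the least obvious to implement, so here is a deliberately small in-memory sketch. The `RefreshTokenStore` class is hypothetical; a real vault would persist records, encrypt them, and generate unguessable random tokens instead of the counter-based placeholders used here.

```typescript
interface RefreshRecord {
  familyId: string;
  used: boolean;
}

class RefreshTokenStore {
  // Prototype-free objects so lookups never hit inherited keys such as "constructor".
  private records: Record<string, RefreshRecord | undefined> = Object.create(null);
  private revokedFamilies: Record<string, boolean | undefined> = Object.create(null);
  private counter = 0;

  // Starts or continues a token family, e.g. after the user completes consent.
  issue(familyId: string): string {
    this.counter += 1;
    const token = "rt_" + familyId + "_" + this.counter; // placeholder, NOT secure
    this.records[token] = { familyId, used: false };
    return token;
  }

  // Exchanges a refresh token for its successor; replaying a used token revokes the family.
  rotate(token: string): string {
    const record = this.records[token];
    if (!record || this.revokedFamilies[record.familyId]) {
      throw new Error("invalid_grant");
    }
    if (record.used) {
      this.revokedFamilies[record.familyId] = true;
      throw new Error("invalid_grant: reuse detected, family revoked");
    }
    record.used = true;
    return this.issue(record.familyId);
  }
}
```

After a replay, even the newest token in the family stops working, so whichever party holds the stolen copy loses access along with the legitimate one and the user must re-consent.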

### Delegation Pattern: Agent Performing Database Operations

Agent-to-database delegation enforces tenancy at two layers: the application layer (middleware checks on every request) and the database layer (row-level policies the agent cannot bypass). Agents that hit the database directly should connect as a dedicated Postgres role without `BYPASSRLS`. Enable [Postgres Row-Level Security](https://www.postgresql.org/docs/current/ddl-rowsecurity.html) on every multi-tenant table with a policy keyed to a session variable the middleware sets per request ([Supabase RLS](https://supabase.com/docs/guides/database/postgres/row-level-security), [Drizzle ORM RLS](https://orm.drizzle.team/docs/rls)).

```sql
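-- Turn on row-level security for the multi-tenant table.
ALTER TABLE contacts ENABLE ROW LEVEL SECURITY;

-- Rows are visible only when tenant_id matches the per-transaction setting.
-- Assumes tenant_id is TEXT holding the org_id claim; add a cast if your column is numeric.
CREATE POLICY agent_tenant_scope ON contacts
  USING (tenant_id = current_setting('app.current_tenant_id'));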
```

Middleware sets `app.current_tenant_id` from the verified JWT's `org_id` claim at the start of each transaction. RLS becomes the last line of defense — even if a route handler forgets a tenant check, the database refuses cross-tenant reads.

For audit, include `agent_id`, `user_id`, and `trace_id` on every log row, and set a per-transaction application name so Postgres logs carry the agent identity on every query. `SET LOCAL` accepts only literal values, so build the name with `set_config`, whose third argument scopes the change to the current transaction: `SELECT set_config('application_name', 'agent-' || current_setting('app.agent_id'), true);`. See [LoginRadius](https://www.loginradius.com/blog/engineering/auditing-and-logging-ai-agent-activity) and [ISACA — Auditing Agentic AI](https://www.isaca.org/resources/news-and-trends/industry-news/2025/the-growing-challenge-of-auditing-agentic-ai) for the full field set.
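One way the middleware might establish that per-transaction context is to run a few `set_config` statements immediately after `BEGIN`. The sketch below only builds the statements; the `SqlStatement` shape, the function name, and the `$1` placeholder style (as used by node-postgres) are assumptions, and executing them is left to whatever database client the application uses.

```typescript
interface SqlStatement {
  text: string;
  values: string[];
}

// Builds the statements to run right after BEGIN, before any tenant-scoped query.
// set_config(name, value, true) limits the setting to the current transaction and,
// being an ordinary function call, accepts bind parameters.
function tenantContextStatements(orgId: string, agentId: string): SqlStatement[] {
  if (!orgId) throw new Error("org_id claim is required");
  if (!agentId) throw new Error("agent id is required");
  return [
    { text: "SELECT set_config('app.current_tenant_id', $1, true)", values: [orgId] },
    { text: "SELECT set_config('app.agent_id', $1, true)", values: [agentId] },
    { text: "SELECT set_config('application_name', $1, true)", values: ["agent-" + agentId] },
  ];
}
```

Binding the values as parameters matters here: the org and agent identifiers come from a token, and interpolating them into SQL text would turn the tenancy mechanism itself into an injection point.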

### Delegation Pattern: Agent Managing User Data Inside Your App

#### On-Behalf-Of Token Exchange

An autonomous agent often needs to act on a specific user's data. The standards-based approach is **OAuth 2.0 Token Exchange** ([RFC 8693](https://www.rfc-editor.org/rfc/rfc8693)): `subject_token` = user's token, `actor_token` = agent's token, response = new token with `sub = user_id` and an `act` claim whose own `sub` identifies the agent (`act = { "sub": agent_id }`). The most widely deployed on-behalf-of implementation is [Microsoft Entra's `jwt-bearer` OBO flow](https://learn.microsoft.com/en-us/entra/identity-platform/v2-oauth2-on-behalf-of-flow), which uses the JWT bearer grant type rather than RFC 8693's token-exchange grant, and which [Entra Agent ID](https://learn.microsoft.com/en-us/entra/agent-id/identity-platform/agent-user-oauth-flow) extends specifically for agent clients.
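For orientation, this is roughly what the form-encoded body of an RFC 8693 exchange request looks like. The parameter names and URN values come from the RFC; the `buildTokenExchangeBody` helper and the choice to send both tokens as access tokens are assumptions for this sketch, and a real request would also carry client authentication.

```typescript
// Builds the application/x-www-form-urlencoded body for a token-exchange request.
function buildTokenExchangeBody(userToken: string, agentToken: string, audience: string): string {
  const params: Record<string, string> = {
    grant_type: "urn:ietf:params:oauth:grant-type:token-exchange",
    subject_token: userToken, // the user the agent acts for
    subject_token_type: "urn:ietf:params:oauth:token-type:access_token",
    actor_token: agentToken,  // the agent doing the acting
    actor_token_type: "urn:ietf:params:oauth:token-type:access_token",
    audience: audience,       // the downstream service the new token is for
  };
  return Object.keys(params)
    .map((k) => encodeURIComponent(k) + "=" + encodeURIComponent(params[k]))
    .join("&");
}
```

The body would be POSTed to the authorization server's token endpoint; on success the issued token carries the user in `sub` and the agent in the nested `act` claim.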

**Clerk note:** Clerk does not currently implement RFC 8693 natively. The practical workaround: validate the user session at the edge, then issue a [JWT](https://clerk.com/glossary/json-web-token.md) from a [custom template](https://clerk.com/docs/guides/sessions/jwt-templates.md) that embeds user identity (`sub`, `org_id`, `org_role`) and agent context (`agent_id`, `tool_scopes`). Downstream services verify that JWT against Clerk's JWKS.

#### Propagating User Identity to Downstream Services

Validate once at the edge, then propagate a signed identity token to each downstream service, which verifies locally against the JWKS. A service mesh or API gateway enforces that every internal call carries a valid token ([GitGuardian — OAuth for MCP enterprise patterns](https://blog.gitguardian.com/oauth-for-mcp-emerging-enterprise-patterns-for-agent-authorization/)); [HashiCorp Vault for AI Agents](https://developer.hashicorp.com/validated-patterns/vault/ai-agent-identity-with-hashicorp-vault) threads an `X-Correlation-ID` through the stack for end-to-end tracing.

### Delegation Pattern: Agent-to-Agent Handoffs

The [A2A Protocol](https://a2a-protocol.org/latest/specification/) (April 2025, now Linux Foundation-governed) defines HTTP + JSON-RPC 2.0 for agent-to-agent comms with five auth schemes (Bearer, OAuth Authorization Code, OAuth Client Credentials, API Key, HTTP Basic). Each agent publishes an **agent card** advertising capabilities and required auth. When agent A calls agent B, A obtains a token scoped to B's resource server via a token-exchange endpoint that records A as `azp` and appends A to the `act` chain ([Stytch A2A OAuth guide](https://stytch.com/blog/agent-to-agent-oauth-guide/); [IETF attenuating tokens draft](https://datatracker.ietf.org/doc/html/draft-niyikiza-oauth-attenuating-agent-tokens-00) for scope reduction).

Every action in a chain must remain attributable to the original user. Three mechanisms: pass `sub` through unchanged; append each agent to a nested `act` claim for the full delegation trail ([IANA JWT Claims Registry](https://www.iana.org/assignments/jwt/jwt.xhtml)); and reduce scopes at each hop so downstream agents cannot exceed the caller ([OIDC-A paper](https://arxiv.org/html/2509.25974v1), [AAuth draft](https://datatracker.ietf.org/doc/html/draft-rosenberg-oauth-aauth-00)).
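The nested `act` claim and per-hop scope reduction can be sketched with three small helpers. The nesting direction follows RFC 8693 (the outermost `act` is the current actor, earlier actors are nested inside); the helper names are hypothetical.

```typescript
// RFC 8693 actor claim: the current actor, with earlier actors nested inside.
interface ActClaim {
  sub: string;
  act?: ActClaim;
}

// Wraps the existing chain so the new agent becomes the current (outermost) actor.
function appendActor(agentId: string, previous?: ActClaim): ActClaim {
  return previous ? { sub: agentId, act: previous } : { sub: agentId };
}

// Lists actors from most recent to earliest, for audit logs.
function actorChain(act?: ActClaim): string[] {
  const chain: string[] = [];
  for (let current = act; current; current = current.act) {
    chain.push(current.sub);
  }
  return chain;
}

// A downstream hop may keep or drop scopes, never add them.
function attenuateScopes(callerScopes: string[], requested: string[]): string[] {
  return requested.filter((s) => callerScopes.indexOf(s) !== -1);
}
```

If agent A (holding `read` and `write`) hands off to agent B, which asks for `write` and `delete`, B receives only `write`, and the resulting chain reads B then A while `sub` still names the original user.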

## Multi-Tenant AI Architecture

Multi-tenant AI architecture isolates each user's and organization's data from every other tenant, even when agents share compute, caches, and model context. One user's agent must not read another user's data; one organization's agent must not cross into another organization's data; and prompt injection or state bleeding must never let an agent act under the wrong principal. This section covers per-user isolation, organization scope, boundary enforcement, and the leak vectors unique to AI.

### Per-User Agent Isolation

An agent bound to a single user session must only see that user's data. Two failure modes dominate: **stale binding** (an agent initialized for user A serves a request from user B because the session changed mid-flight) and **prompt injection** (external content tricks the agent into calling a tool with another user's identifier — [Unit 42](https://unit42.paloaltonetworks.com/ai-agent-prompt-injection/)). [LayerX's leakage analysis](https://layerxsecurity.com/generative-ai/multi-tenant-ai-leakage/) names five identities the server must keep straight: trigger, execution, authorization, tenant, and attribution.

Every agent credential should carry `sub` and a session identifier. Middleware rejects tokens where `sub` does not match the session cookie subject. Revoke agent credentials on sign-out via the Clerk `session.ended` webhook ([Clerk Webhooks](https://clerk.com/docs/guides/development/webhooks/overview.md)). [Scalekit](https://www.scalekit.com/blog/access-control-multi-tenant-ai-agents) catalogs three isolation layers — config isolation, named connection binding, and code boundaries — that map onto the middleware.

### Organization-Scoped AI Agents

Some agents belong to an organization, not a user — e.g., a marketing assistant for Acme Corp that runs for the whole team. Org-scoped tokens must carry `org_id` and `org_role` claims, verified on every request. [Clerk Organizations](https://clerk.com/docs/guides/organizations/overview.md) provide the multi-tenant primitive and expose org claims in every session JWT; background agents must pass the session token in the `Authorization` header (not cookies) to apply the correct org context.

Role-based access control ([RBAC](https://clerk.com/glossary/role-based-access-control-rbac.md)) for agents mirrors RBAC for humans. Clerk ships `org:admin` and `org:member` defaults with up to 10 [custom roles](https://clerk.com/glossary/custom-roles.md) per org, [custom permissions](https://clerk.com/glossary/custom-permissions.md) in the format `org:<feature>:<permission>` ([Clerk Org Roles](https://clerk.com/docs/guides/organizations/control-access/roles-and-permissions.md)), and full BAPI CRUD over roles and permissions ([changelog](https://clerk.com/changelog/2025-11-24-organization-roles-and-permission-bapi-management.md)). System permissions do not appear in session claims — check them explicitly with `has()`.

### Tenant Boundary Enforcement

Every agent request MUST carry an `org_id` (or tenant-equivalent) claim. Middleware extracts it, compares against the requested resource, and rejects mismatches with a 403 + structured error body (see [Structured Auth Error Responses for AI Agents](#structured-auth-error-responses-for-ai-agents)). A single missing check on a single endpoint is sufficient for cross-tenant leakage — there is no optimization that justifies skipping it.

In Next.js 16, edge routing protection lives in `proxy.ts` (`middleware.ts` is deprecated). The per-request token check runs inside the route handler via `auth({ acceptsToken: [...] })`, which accepts session / OAuth / M2M / API-key tokens; the handler then enforces the tenant boundary. The full TypeScript example lives in [Step 5: Enforce Multi-Tenant Isolation on Every Request](#step-5-enforce-multi-tenant-isolation-on-every-request). 403 responses must use an agent-parseable shape, `application/problem+json` ([RFC 9457](https://www.rfc-editor.org/rfc/rfc9457)), so a misrouted agent can fail gracefully.
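As a framework-neutral sketch of the tenant check and its 403 body, the function below compares the token's `org_id` with the resource's tenant and returns an RFC 9457 problem document on mismatch. The `HttpResult` shape, the `enforceTenant` name, and the `example.com` problem type URI are placeholders for illustration; `type`, `title`, `status`, and `detail` are the standard RFC 9457 members.

```typescript
interface ProblemDetails {
  type: string;
  title: string;
  status: number;
  detail: string;
}

interface HttpResult {
  status: number;
  headers: Record<string, string>;
  body: string;
}

// Returns null when the token's org matches the resource's org, otherwise a 403 response.
// A missing org_id claim is treated as a mismatch.
function enforceTenant(tokenOrgId: string | undefined, resourceOrgId: string): HttpResult | null {
  if (tokenOrgId && tokenOrgId === resourceOrgId) return null;
  const problem: ProblemDetails = {
    type: "https://example.com/problems/tenant-mismatch", // hypothetical problem type URI
    title: "Tenant mismatch",
    status: 403,
    detail: "The token's org_id does not grant access to this resource.",
  };
  return {
    status: 403,
    headers: { "Content-Type": "application/problem+json" },
    body: JSON.stringify(problem),
  };
}
```

The `detail` text deliberately omits both tenant identifiers, so the error helps an agent stop and report without confirming which organization owns the resource.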

### Preventing Cross-Tenant Data Leakage from Agents

LayerX identifies five leak vectors specific to multi-tenant AI: **context-window bleeding** (a prior user's messages remain in model context), **KV-cache side channels** (shared inference caches leak across tenants), **state management failures** (per-request state rebinds to the wrong tenant), **broad unscoped queries** (no tenant filter), and **shared encryption keys** (one compromise cascades). Mitigate with a separate agent context per tenant, tenant-keyed database queries enforced at the RLS layer (see [Delegation Pattern: Agent Performing Database Operations](#delegation-pattern-agent-performing-database-operations)), tenant-aware caches, and per-tenant encryption keys. Palo Alto Unit 42's ["Double Agents" research](https://unit42.paloaltonetworks.com/double-agents-vertex-ai/) on Vertex AI shows how a metadata-service leak can defeat every application-layer check — the database and cache layers must independently enforce tenancy.
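
A tenant-aware cache can be sketched as a wrapper that refuses any read or write without a tenant id and folds that id into every key. This is a minimal in-memory illustration, not a production cache; `TenantScopedCache` and its methods are hypothetical names, not part of Clerk or any library.

```ts
// Hypothetical helper: an in-memory cache whose keys are always namespaced by tenant.
class TenantScopedCache<V> {
  private readonly store = new Map<string, V>()

  // Build the composite key. Prefixing the tenant id with its length prevents
  // collisions such as ("a:b", "c") versus ("a", "b:c").
  private compositeKey(orgId: string, key: string): string {
    if (!orgId) throw new Error('orgId is required for every cache access')
    return `${orgId.length}:${orgId}:${key}`
  }

  get(orgId: string, key: string): V | undefined {
    return this.store.get(this.compositeKey(orgId, key))
  }

  set(orgId: string, key: string, value: V): void {
    this.store.set(this.compositeKey(orgId, key), value)
  }
}
```

Because callers cannot construct a key without supplying `orgId`, a forgotten tenant filter becomes a thrown error rather than a silent cross-tenant hit.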

## Authenticating MCP Servers and Modern AI Protocols

The [Model Context Protocol](https://modelcontextprotocol.io/) is the dominant external-agent surface in 2026: **97 million+ monthly SDK downloads** as of Anthropic's December 2025 donation announcement ([Anthropic](https://www.anthropic.com/news/donating-the-model-context-protocol-and-establishing-of-the-agentic-ai-foundation)) and **10,000+ active public MCP servers** ([MCP Manager](https://mcpmanager.ai/blog/mcp-adoption-statistics/)). This section covers the protocol, the OAuth 2.1 + PKCE authorization flow, the Dynamic Client Registration (DCR) and Client ID Metadata Documents (CIMD) client-identification standards, and the host-side requirements for exposing a SaaS application safely.

### What Is the Model Context Protocol (MCP)?

MCP is an open JSON-RPC 2.0 protocol that lets AI tools (Claude, ChatGPT, Cursor, VS Code, Copilot) invoke tools on remote servers. It defines a host / client / server triad with primitives (Tools, Resources, Prompts) and two transports: **stdio** (local subprocess) and **Streamable HTTP** (remote, single endpoint, mandatory `Origin` validation for local servers) ([MCP Architecture](https://modelcontextprotocol.io/docs/concepts/architecture), [MCP Transports](https://modelcontextprotocol.io/docs/concepts/transports)). Anthropic [donated MCP to the Linux Foundation's Agentic AI Foundation on December 9, 2025](https://www.anthropic.com/news/donating-the-model-context-protocol-and-establishing-of-the-agentic-ai-foundation).

### OAuth 2.1 Authorization for MCP Servers

Streamable HTTP MCP servers SHOULD use OAuth 2.1 + [PKCE](https://clerk.com/glossary/pkce.md). The discovery + auth sequence:

1. Client sends an unauthenticated request; server returns `401` with `WWW-Authenticate: Bearer resource_metadata="<url>"`.
2. Client fetches the Protected Resource Metadata (PRM) document at `/.well-known/oauth-protected-resource` ([RFC 9728](https://datatracker.ietf.org/doc/html/rfc9728)) and the Authorization Server (AS) metadata at `/.well-known/oauth-authorization-server` ([RFC 8414](https://datatracker.ietf.org/doc/html/rfc8414)).
3. Client registers (via DCR, see [Dynamic Client Registration for AI Tools](#dynamic-client-registration-for-ai-tools)) or identifies itself (via CIMD), runs Authorization Code + PKCE, and presents the access token as `Authorization: Bearer` on every JSON-RPC request.
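
The PKCE part of step 3 can be sketched with Node's built-in `crypto` module. The function names are illustrative; the derivation itself follows RFC 7636 (S256 method), assuming a Node.js runtime that supports the `base64url` encoding.

```ts
import { createHash, randomBytes } from 'node:crypto'

// RFC 7636 §4.1: the verifier is 43-128 characters from the unreserved set.
// 32 random bytes base64url-encode to exactly 43 characters.
function createCodeVerifier(): string {
  return randomBytes(32).toString('base64url')
}

// RFC 7636 §4.2 (S256): challenge = BASE64URL(SHA256(ASCII(verifier))).
function createCodeChallenge(verifier: string): string {
  return createHash('sha256').update(verifier, 'ascii').digest('base64url')
}
```

The client sends the challenge (with `code_challenge_method=S256`) on the authorization request and the original verifier on the token request, so an intercepted authorization code is useless on its own.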

The JWT **audience must be validated** on every request ([MCP Authorization Tutorial](https://modelcontextprotocol.io/docs/tutorials/security/authorization)); tokens must never be logged. Aaron Parecki's ["Let's Fix OAuth in MCP"](https://aaronparecki.com/2025/04/03/15/oauth-for-model-context-protocol) is the foundational critique that shaped the current MCP spec.
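
As a rough sketch of the audience check, assuming the token's signature has already been verified by your JWT library and the claims are available as a plain object (the `JwtClaims` type and function name below are hypothetical):

```ts
// The `aud` claim may be a single string or an array of strings (RFC 7519 §4.1.3).
type JwtClaims = { aud?: string | string[]; exp?: number }

// Returns true only if the token was issued for this resource server and is unexpired.
// Run this AFTER signature verification; it does not verify the signature itself.
function isTokenForThisResource(
  claims: JwtClaims,
  expectedAudience: string,
  nowSeconds: number,
): boolean {
  const audiences = Array.isArray(claims.aud) ? claims.aud : claims.aud ? [claims.aud] : []
  if (!audiences.includes(expectedAudience)) return false
  return typeof claims.exp === 'number' && claims.exp > nowSeconds
}
```

A token minted for a different MCP server fails the first check even when its signature is valid, which is the property audience binding exists to provide.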

### Dynamic Client Registration for AI Tools

With N AI clients × M MCP servers, manual registration is impractical. Two solutions: **[Dynamic Client Registration](https://clerk.com/glossary/dynamic-client-registration.md)** ([RFC 7591](https://datatracker.ietf.org/doc/html/rfc7591)) where clients self-register at runtime, and **Client ID Metadata Documents (CIMD)** — the Nov 2025 evolution where clients identify via a URL they control for DNS-based trust ([Aaron Parecki summary](https://aaronparecki.com/2025/11/25/1/mcp-authorization-spec-update)). Expect variance: GitHub's MCP server [does not support DCR](https://github.blog/ai-and-ml/generative-ai/a-practical-guide-on-how-to-use-the-github-mcp-server/). Clerk exposes DCR as a Dashboard toggle under OAuth applications ([changelog](https://clerk.com/changelog/2025-06-13-oauth-improvements.md)).

### Connecting an MCP Server to a SaaS Application Safely

Host-side requirements: expose PRM (`/.well-known/oauth-protected-resource`) and AS metadata (`/.well-known/oauth-authorization-server`); enforce audience binding, short TTLs, least-privilege scopes, and PKCE on every flow; restrict the `Origin` header on local servers; and for enterprise deployments support **ID-JAG** / Enterprise-Managed Authorization ([MCP extension](https://modelcontextprotocol.io/extensions/auth/enterprise-managed-authorization)) so a central IdP can federate agents across all your MCP servers.
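
If you serve the PRM document yourself rather than through a helper library, its shape is small. The sketch below uses field names from RFC 9728; the function names are hypothetical and every URL or scope value passed in is a placeholder you would replace.

```ts
// Field names follow RFC 9728 §2; the values are supplied by the caller.
function buildProtectedResourceMetadata(
  resource: string,
  authorizationServer: string,
  scopes: string[],
) {
  return {
    resource, // the MCP server's own URL, which clients compare to the URL they called
    authorization_servers: [authorizationServer],
    scopes_supported: scopes,
    bearer_methods_supported: ['header'],
  }
}

// Value for the WWW-Authenticate header on a 401, pointing clients at the PRM document.
function challengeHeader(metadataUrl: string): string {
  return `Bearer resource_metadata="${metadataUrl}"`
}
```

Serve the object as JSON from the well-known path and return `challengeHeader(...)` on unauthenticated requests so clients can discover it.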

For **Next.js**, Clerk's primitive is `verifyClerkToken` (from `@clerk/mcp-tools/next`), passed as the verifier callback into Vercel's `withMcpAuth` wrapper (from `mcp-handler` — not a Clerk export). The canonical PRM path via `resourceMetadataPath` is `/.well-known/oauth-protected-resource/mcp` ([Clerk MCP Next.js guide](https://clerk.com/docs/nextjs/guides/ai/mcp/build-mcp-server.md)). For **Express**, Clerk ships `mcpAuthClerk` middleware (from `@clerk/mcp-tools/express`) plus `protectedResourceHandlerClerk`, `authServerMetadataHandlerClerk`, and `streamableHttpHandler` — a one-import pattern ([Clerk MCP Express guide](https://clerk.com/docs/expressjs/guides/ai/mcp/build-mcp-server.md)). Clerk also hosts a docs/snippets MCP server at `https://mcp.clerk.com/mcp` exposing `clerk_sdk_snippet` and `list_clerk_sdk_snippets` to AI assistants ([Clerk MCP server guide](https://clerk.com/docs/guides/ai/mcp/clerk-mcp-server.md), [changelog](https://clerk.com/changelog/2026-01-20-clerk-mcp-server.md)) — a documentation tool, not an auth proxy. The endpoint is a Streamable HTTP MCP surface (not a browser URL); wire it into your AI assistant's MCP config rather than visiting it directly.

## Security Considerations for Agentic AI

Agentic AI security is the set of authentication, authorization, and audit controls that prevent AI agents from exceeding their intended authority, leaking tenant data, or being hijacked by prompt injection. It matters because **53% of organizations reported AI agents exceeding permissions in the past year** ([CSA, April 2026](https://cloudsecurityalliance.org/press-releases/2026/04/16/more-than-half-of-organizations-experience-ai-agent-scope-violations-cloud-security-alliance-study-finds)) and **88% confirmed or suspected security incidents involving AI agents** ([Gravitee, Feb 2026](https://www.gravitee.io/blog/state-of-ai-agent-security-2026-report-when-adoption-outpaces-control)). The [OWASP Top 10 for Agentic Applications 2026](https://genai.owasp.org/2025/12/09/owasp-top-10-for-agentic-applications-the-benchmark-for-agentic-security-in-the-age-of-autonomous-ai/) enumerates ten risk classes; this section covers the seven defense layers every production AI app needs.

### Least-Privilege by Default

Start each agent with **zero standing access**. Grant narrow, time-bound permissions per task — the Task-Based Authorization pattern documented in [OpenFGA's agent guide](https://openfga.dev/docs/modeling/agents) and [Oso's agent best practices](https://www.osohq.com/learn/best-practices-of-authorizing-ai-agents). Replace long-lived service-account credentials with task-scoped tokens; [Auth0's guide to mitigating excessive agency](https://auth0.com/blog/mitigate-excessive-agency-ai-agents/) pairs this with CIBA (Client-Initiated Backchannel Authentication) for asynchronous human approval on critical actions.

### Prompt Injection and Authorization Boundaries

Prompt injection ([OWASP LLM01](https://genai.owasp.org/llmrisk/llm01-prompt-injection/)) is the #1 LLM vulnerability and **cannot be fully prevented at the model layer** — enforcement must happen in the auth and authorization boundary around the agent. Indirect injection hides payloads in RAG corpora, PDFs, emails, and MCP tool descriptions; [Lakera](https://www.lakera.ai/blog/indirect-prompt-injection) showed **5 crafted documents can manipulate AI responses 90% of the time**, and Unit 42 catalogued [12 production attack cases](https://unit42.paloaltonetworks.com/ai-agent-prompt-injection/) including database destruction. Defenses that hold: the agent never sees raw credentials (token vault, see [Storing and Refreshing Third-Party Tokens](#storing-and-refreshing-third-party-tokens)); tool calls are per-task scoped with FGA decisions server-side; human-in-the-loop via CIBA for high-risk actions ([Auth0](https://auth0.com/blog/secure-human-in-the-loop-interactions-for-ai-agents/)). Assume the model will be manipulated and let the auth layer refuse.

### Audit Logging for Non-Human Actors

AI agent [audit logs](https://clerk.com/glossary/audit-logs.md) must capture **intent**, not just state ([ISACA — Auditing Agentic AI](https://www.isaca.org/resources/news-and-trends/industry-news/2025/the-growing-challenge-of-auditing-agentic-ai)). Mandatory fields per [LoginRadius](https://www.loginradius.com/blog/engineering/auditing-and-logging-ai-agent-activity) and the [NIST AI RMF](https://www.nist.gov/itl/ai-risk-management-framework): `agent_id`, `parent_identity`, `delegation_scope`, `tool_name`, `tool_params_hash` (SHA-256 — never log raw params that may contain secrets), `policy_decision`, and `trace_id`.

Use structured JSON aligned to [OpenTelemetry GenAI semantic conventions](https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-agent-spans/) (spans `create_agent`, `invoke_agent`, `execute_tool`), retain 1–7 years, and align to [ISO/IEC 42001](https://aws.amazon.com/blogs/security/ai-lifecycle-risk-management-iso-iec-420012023-for-ai-governance/). The [EU AI Act Article 12 logging requirements](https://www.helpnetsecurity.com/2026/04/16/eu-ai-act-logging-requirements/) mandate 6-month minimum retention for high-risk systems from August 2, 2026.
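
One possible way to compute `tool_params_hash` is to hash a canonical serialization of the parameters, so that key order does not change the digest. The helper names are illustrative, and the sketch assumes the params are JSON-compatible (no `undefined`, functions, or cyclic references).

```ts
import { createHash } from 'node:crypto'

// Serialize JSON-compatible values with object keys sorted, so that
// logically equal params always produce the same string.
function canonicalize(value: unknown): string {
  if (Array.isArray(value)) return `[${value.map((v) => canonicalize(v)).join(',')}]`
  if (value !== null && typeof value === 'object') {
    const obj = value as Record<string, unknown>
    const entries = Object.keys(obj)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${canonicalize(obj[k])}`)
    return `{${entries.join(',')}}`
  }
  return JSON.stringify(value)
}

// SHA-256 hex digest of the canonical form; this is what goes in the audit event.
function hashToolParams(params: unknown): string {
  return createHash('sha256').update(canonicalize(params)).digest('hex')
}
```

The digest lets an auditor confirm that two invocations used identical parameters without the log ever storing the parameters themselves.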

### Structured Auth Error Responses for AI Agents

Agents need **machine-parseable** failure responses so they can refresh, retry, or escalate without human intervention. Use [RFC 9457 Problem Details](https://www.rfc-editor.org/rfc/rfc9457) (`application/problem+json`) for the body, [RFC 6750 §3](https://www.rfc-editor.org/rfc/rfc6750#section-3) OAuth machine codes in both the body and `WWW-Authenticate`, and on 429 / 503 send `Retry-After` ([RFC 9110](https://www.rfc-editor.org/rfc/rfc9110)) alongside the `RateLimit-*` fields, which are specified in a separate IETF HTTPAPI draft rather than in RFC 9110 itself.

```json
{
  "type": "urn:problems:insufficient-scope",
  "title": "Insufficient scope",
  "status": 403,
  "detail": "Token is missing scope 'crm.write:acme'",
  "error": "insufficient_scope",
  "required_scopes": ["crm.write:acme"]
}
```

RFC 9457 accepts any URI for `type` — a URN avoids implying an HTTP document exists and matches the helper in [Step 5: Enforce Multi-Tenant Isolation on Every Request](#step-5-enforce-multi-tenant-isolation-on-every-request) (`urn:error:${error}`).

Paired header: `WWW-Authenticate: Bearer error="insufficient_scope", scope="crm.write:acme"`.

Agent recovery matrix:

| HTTP | OAuth error                | Agent action                                                                    |
| ---- | -------------------------- | ------------------------------------------------------------------------------- |
| 401  | `invalid_token`            | Refresh or re-acquire. M2M: call `clerkClient.m2m.createToken` again.           |
| 403  | `insufficient_scope`       | Re-prompt for consent with `required_scopes`; headless agents queue for review. |
| 403  | `invalid_request` (tenant) | Fail, surface typed error to human review — never retry silently.               |
| 429  | (none)                     | Wait at least `Retry-After`; if absent, use exponential backoff with jitter.    |
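
On the agent side, the matrix can be expressed as a small dispatch function. This is a sketch only: the `AgentAction` labels and both function names are hypothetical, and the backoff constants are arbitrary defaults rather than recommended values.

```ts
type AgentAction = 'refresh_token' | 'request_consent' | 'escalate_to_human' | 'back_off' | 'fail'

// Maps an HTTP status and OAuth error code to a recovery action from the matrix.
function chooseRecovery(status: number, oauthError?: string): AgentAction {
  if (status === 401 && oauthError === 'invalid_token') return 'refresh_token'
  if (status === 403 && oauthError === 'insufficient_scope') return 'request_consent'
  if (status === 403 && oauthError === 'invalid_request') return 'escalate_to_human'
  if (status === 429) return 'back_off'
  return 'fail' // unknown combinations are never retried automatically
}

// Delay before retrying a 429: exponential backoff with full jitter, capped at maxMs,
// but never shorter than the server's Retry-After value when one was sent.
function backoffDelayMs(
  attempt: number,
  retryAfterSeconds: number | undefined,
  random: () => number = Math.random,
  baseMs = 500,
  maxMs = 60_000,
): number {
  const exponential = Math.min(maxMs, baseMs * 2 ** attempt)
  const jittered = random() * exponential
  return Math.max(jittered, (retryAfterSeconds ?? 0) * 1000)
}
```

Passing `random` as a parameter keeps the delay deterministic in tests while defaulting to `Math.random` in production.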

### Revoking and Rotating Agent Credentials

Revoke on user sign-out, role change, detected compromise, time-based rotation, and end-of-task. Hard constraint: **opaque tokens revoke instantly; JWTs do not** — JWTs cannot be invalidated mid-TTL without a denylist or a short TTL you can tolerate waiting out. **91% of former-employee tokens remain active** ([NHIMG](https://nhimg.org/2025-state-of-non-human-identities-and-secrets-in-cybersecurity)); **64% of 2022-era secrets are still not revoked in 2026** ([GitGuardian](https://blog.gitguardian.com/the-state-of-secrets-sprawl-2026/)). Clerk's [token formats guide](https://clerk.com/docs/guides/development/machine-auth/token-formats.md) frames the tradeoff: JWT for performance, opaque for instant invalidation.
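
A JWT denylist can be kept small because an entry only has to live until the token would have expired anyway. The sketch below is an in-memory illustration that assumes your JWTs carry a unique `jti` claim; `JtiDenylist` is a hypothetical class, and a real deployment would back it with a shared store so every verifier sees the same list.

```ts
// Hypothetical in-memory denylist keyed by the JWT `jti` claim.
class JtiDenylist {
  // jti -> the token's `exp` in seconds since the epoch
  private readonly revoked = new Map<string, number>()

  revoke(jti: string, exp: number): void {
    this.revoked.set(jti, exp)
  }

  // Check on every request, after signature verification.
  isRevoked(jti: string, nowSeconds: number): boolean {
    this.prune(nowSeconds)
    return this.revoked.has(jti)
  }

  // Entries for tokens that have already expired are dropped: normal `exp`
  // validation rejects those tokens, so the denylist no longer needs them.
  private prune(nowSeconds: number): void {
    this.revoked.forEach((exp, jti) => {
      if (exp <= nowSeconds) this.revoked.delete(jti)
    })
  }
}
```

With a 60-second TTL the list rarely holds more than a handful of entries, which is part of why short TTLs and denylists work well together.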

### Secret Storage for AI Agents

Agents must never hold long-lived static secrets. Prefer ephemeral credentials: SPIFFE SVIDs ([HashiCorp](https://www.hashicorp.com/en/blog/spiffe-securing-the-identity-of-agentic-ai-and-non-human-actors)), workload identities, or vault-issued short TTLs. **24,008 unique secrets** were found in MCP config files in year one, with **2,117 confirmed exploitable** ([GitGuardian, 2026](https://blog.gitguardian.com/the-state-of-secrets-sprawl-2026/)).

> **MCP stdio arbitrary command execution (April 2026).** OX Security disclosed a systemic vulnerability in the MCP stdio transport enabling arbitrary OS command execution on \~**200,000 servers across 150M+ downloads**, with 10+ high/critical CVEs ([The Register](https://www.theregister.com/2026/04/16/anthropic_mcp_design_flaw/); [OX Security](https://www.ox.security/blog/the-mother-of-all-ai-supply-chains-critical-systemic-vulnerability-at-the-core-of-the-mcp/)). Anthropic declined to modify the protocol architecture. Streamable HTTP is not affected. **Production MCP servers should use Streamable HTTP, not stdio.**

### Rate Limiting and Abuse Prevention

Agents amplify request volume — 100× a human is normal. [Rate-limit](https://clerk.com/glossary/api-rate-limits.md) on **both** request count (RPM) and token count (input + output tokens per minute). A defensible ceiling is Anthropic's Tier 2 benchmark (April 2026): **1,000 RPM, 450,000 input TPM, 90,000 output TPM** for Claude Sonnet 4.x and Haiku 4.5 ([Anthropic — API rate limits](https://platform.claude.com/docs/en/api/rate-limits)).

| Workload                  | RPM                             | Input TPM         | Output TPM |
| ------------------------- | ------------------------------- | ----------------- | ---------- |
| Background / batch agents | 50–4,000                        | 30k–2M            | 8k–400k    |
| User-facing chat          | 500–10,000                      | 30k–800k combined | —          |
| Per-endpoint ceiling      | \~1,500 (AWS Bedrock AgentCore) | —                 | —          |
| Minimum viable floor      | 50                              | 30,000            | 8,000      |

Use a sliding-window algorithm for LLM traffic ([Zuplo](https://zuplo.com/learning-center/token-based-rate-limiting-ai-agents)), and return structured 429 responses with `Retry-After` (RFC 9110) and the IETF-draft `RateLimit-*` headers.
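
A sliding-window limiter that enforces both budgets at once might look like the following. It is a single-process sketch using a per-agent usage log; the class name is hypothetical, and a multi-instance deployment would need a shared store instead of a `Map`.

```ts
type Usage = { at: number; tokens: number }

class SlidingWindowLimiter {
  private readonly log = new Map<string, Usage[]>()
  private readonly maxRequests: number
  private readonly maxTokens: number
  private readonly windowMs: number

  constructor(maxRequests: number, maxTokens: number, windowMs = 60_000) {
    this.maxRequests = maxRequests
    this.maxTokens = maxTokens
    this.windowMs = windowMs
  }

  // Returns true and records the usage if the call fits both the request budget
  // and the token budget for the trailing window; otherwise returns false.
  tryConsume(agentId: string, tokens: number, nowMs: number): boolean {
    const cutoff = nowMs - this.windowMs
    const recent = (this.log.get(agentId) ?? []).filter((u) => u.at > cutoff)
    const usedTokens = recent.reduce((sum, u) => sum + u.tokens, 0)
    const allowed = recent.length < this.maxRequests && usedTokens + tokens <= this.maxTokens
    if (allowed) recent.push({ at: nowMs, tokens })
    this.log.set(agentId, recent)
    return allowed
  }
}
```

When `tryConsume` returns `false`, respond with a 429 and a `Retry-After` derived from the oldest entry still inside the window.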

### Shared Responsibility Between the Agent Framework and the Auth Layer

Draw a clear boundary: the **agent framework** (LangGraph, Vercel AI SDK, Mastra, Claude Agent SDK) handles tool invocation, prompt routing, and conversation state; the **auth layer** handles identity, token issuance, and verification. Never embed secrets in agent framework config — the auth layer injects scoped, short-lived credentials at call time. See [LangGraph custom auth](https://docs.langchain.com/langgraph-platform/custom-auth), [Mastra custom auth](https://mastra.ai/docs/server/auth/custom-auth-provider), [Vercel AI SDK 6](https://vercel.com/blog/ai-sdk-6), [Claude Agent SDK MCP](https://platform.claude.com/docs/en/agent-sdk/mcp), and the [Clerk Agent Toolkit](https://clerk.com/changelog/2025-03-7-clerk-agent-toolkit.md) for framework-side integration patterns.

## Implementation Guide: Wiring Up AI Agent Authentication

This section shows a working Next.js 16 + Clerk setup that handles users, in-app agents, M2M workers, multi-tenant isolation, and audit. Every snippet is real Clerk TypeScript — not pseudo-code.

### Step 1: Set Up Human User Authentication First

Install `@clerk/nextjs` and wrap the app in `<ClerkProvider>`; `auth()` then verifies the signed-in user on every request.

```tsx
// app/layout.tsx
import { ClerkProvider } from '@clerk/nextjs'

export default function RootLayout({ children }: { children: React.ReactNode }) {
  return (
    <ClerkProvider>
      <html lang="en">
        <body>{children}</body>
      </html>
    </ClerkProvider>
  )
}
```

A minimal route handler confirms the setup:

```ts
// app/api/me/route.ts
import { auth } from '@clerk/nextjs/server'

export async function GET() {
  const { userId, isAuthenticated } = await auth()
  if (!isAuthenticated) return new Response('Unauthorized', { status: 401 })
  return Response.json({ userId })
}
```

### Step 2: Add Session Handling for In-App AI Agents

For an in-session copilot (see [Pattern 1: User-Delegated AI Agents](#pattern-1-user-delegated-ai-agents-human-in-the-loop)), forward the current user's session token into your AI tool. `getToken()` returns a short-lived JWT with 60s TTL and automatic refresh ([Clerk session tokens](https://clerk.com/docs/guides/sessions/session-tokens.md)).

```ts
// app/api/agent/route.ts
import { auth } from '@clerk/nextjs/server'
import { generateText } from 'ai'
import { myToolWithAuth } from '@/lib/tools'

export async function POST(req: Request) {
  const { userId, isAuthenticated, getToken } = await auth()
  if (!isAuthenticated) return new Response('Unauthorized', { status: 401 })

  const sessionToken = await getToken()
  const { prompt } = await req.json()

  const result = await generateText({
    model: 'claude-sonnet-4-6',
    prompt,
    tools: { query: myToolWithAuth({ sessionToken, userId }) },
  })
  return Response.json(result)
}
```

### Step 3: Issue Scoped Tokens for Delegated Agent Actions

Use a Clerk [JWT template](https://clerk.com/docs/guides/sessions/jwt-templates.md) to mint a downstream token carrying agent context. Define the template in the Dashboard (or via BAPI):

```json
{
  "aud": "https://example.com",
  "agent_id": "{{user.public_metadata.active_agent_id}}",
  "org_id": "{{org.id}}",
  "org_role": "{{org.role}}",
  "tool_scopes": "{{user.public_metadata.tool_scopes}}"
}
```

Replace `https://example.com` with the URL of your downstream resource server — the `aud` claim binds the token to that audience so the resource can validate it ([RFC 8725 §2.1](https://datatracker.ietf.org/doc/html/rfc8725)).

Request the templated token server-side:

```ts
const { getToken } = await auth()
const agentToken = await getToken({ template: 'agent-context' })
// Pass agentToken as Authorization: Bearer to the downstream service.
```

Auto-included claims (`sub`, `iat`, `exp`, `nbf`) cannot be overridden by the template.

### Step 4: Add Machine-to-Machine Credentials for Autonomous Agents

For a background worker (see [Pattern 2: Autonomous AI Agents](#pattern-2-autonomous-ai-agents-machine-to-machine)), create a Clerk M2M token. Namespace: `clerkClient.m2m` (singular); custom-claims param: `claims` (not `customClaims`); `tokenFormat` defaults to `'opaque'`; `secondsUntilExpiration` defaults to `null` (no expiry).

```ts
// Worker process: mint a JWT M2M token scoped to one agent identity.
import { clerkClient } from '@clerk/nextjs/server'

const client = await clerkClient()
const { token } = await client.m2m.createToken({
  tokenFormat: 'jwt',
  secondsUntilExpiration: 3600,
  claims: { agentId: 'agent-001', workload: 'ticket-classifier' },
})

// Downstream service: verify the M2M token without a network round-trip (JWT only).
const result = await client.m2m.verify({ token })
if (result.tokenType === 'm2m_token') {
  const agentId = result.claims?.agentId
  // Proceed — the token is valid and the agent identity is trusted.
}
```

JWT M2M tokens verify locally (no network) but cannot be revoked post-issuance ([changelog, Mar 2026](https://clerk.com/changelog/2026-02-24-m2m-jwt-tokens.md)). Opaque M2M tokens require a network call per verify but support instant revocation ([token formats](https://clerk.com/docs/guides/development/machine-auth/token-formats.md)) — choose per compromise-recovery SLA. Clerk M2M "scopes" are a communication graph (which machine talks to which), not OAuth capability scopes; Client Credentials is on the Clerk roadmap ([overview](https://clerk.com/docs/guides/development/machine-auth/overview.md)).

### Step 5: Enforce Multi-Tenant Isolation on Every Request

Next.js 16 uses `proxy.ts` for edge routing protection (`middleware.ts` is deprecated — same import, body, and matcher; only the filename changes per [Clerk's middleware reference](https://clerk.com/docs/references/nextjs/clerk-middleware.md)). The per-request token check runs inside each route handler via `auth({ acceptsToken: [...] })`.

```ts
// proxy.ts (Next.js 16 — replaces middleware.ts)
import { clerkMiddleware } from '@clerk/nextjs/server'

export default clerkMiddleware()

export const config = {
  matcher: ['/((?!_next|[^?]*\\.(?:html?|css|js|png|svg|woff2?)).*)', '/(api|trpc)(.*)'],
}
```

The handler accepts session / OAuth / M2M / API-key tokens and runs the tenant check with a 403 `application/problem+json` response on mismatch. `acceptsToken` is valid on `auth()` and `authenticateRequest()` — not on `clerkMiddleware()`. Prefer the explicit array over `'any'`.

```ts
// app/api/contacts/[id]/route.ts
import { auth } from '@clerk/nextjs/server'
import { getContactOrg } from '@/lib/contacts'

export async function GET(_req: Request, { params }: { params: Promise<{ id: string }> }) {
  // Route params are a Promise in Next.js 15+ and must be awaited.
  const { id } = await params
  const a = await auth({
    acceptsToken: ['session_token', 'oauth_token', 'm2m_token', 'api_key'],
  })
  if (!a.isAuthenticated) return problem(401, 'invalid_token')

  const resourceOrg = await getContactOrg(id)
  const orgId = a.tokenType === 'session_token' ? a.orgId : a.claims?.org_id
  if (!orgId || orgId !== resourceOrg) return problem(403, 'invalid_request', 'Tenant mismatch')

  return Response.json({ id })
}

function problem(status: number, error: string, detail?: string) {
  return new Response(
    JSON.stringify({ type: `urn:error:${error}`, title: error, status, detail, error }),
    {
      status,
      headers: {
        'Content-Type': 'application/problem+json',
        'WWW-Authenticate': `Bearer error="${error}"`,
      },
    },
  )
}
```

`orgId` is a session-token concept — machine-token branches read it from `claims`. Rely on `tokenType` rather than the presence of `userId` to choose the verification path ([Verifying API Keys](https://clerk.com/docs/guides/development/verifying-api-keys.md), [Verifying OAuth Access Tokens](https://clerk.com/docs/nextjs/guides/development/verifying-oauth-access-tokens.md)).

### Step 6: Add Observability, Audit Trails, and Revocation

Log a structured event per agent action and expose a revocation endpoint. Clerk's M2M revoke signature is `clerkClient.m2m.revokeToken({ m2mTokenId, revocationReason?, machineSecretKey? })` ([docs](https://clerk.com/docs/reference/backend/m2m-tokens/revoke-token.md)) — only opaque M2M tokens can be revoked server-side; JWT M2M tokens rely on TTL + denylist.

```ts
// lib/audit.ts
import { clerkClient } from '@clerk/nextjs/server'

export type AgentAuditEvent = {
  ts: string
  agent_id: string
  parent_identity: string
  delegation_scope: string[]
  tool_name: string
  tool_params_hash: string
  policy_decision: 'allow' | 'deny'
  trace_id: string
}

export async function logAgentEvent(e: AgentAuditEvent) {
  // Forward to your log sink (Datadog, Elasticsearch, OTel).
  console.log(JSON.stringify({ level: 'info', ...e }))
}

// Revoke an opaque M2M token on user sign-out or detected compromise.
// Pass the token's ID (for example 'm2m_01H...'), not the token secret.
export async function revokeAgentToken(m2mTokenId: string) {
  const client = await clerkClient()
  await client.m2m.revokeToken({
    m2mTokenId,
    revocationReason: 'user_revoked',
  })
}
```

## Choosing an Authentication Provider for AI SaaS Applications

### Build vs Buy for AI Agent Authentication

OAuth 2.1 + PKCE + JWT validation + DCR + CIMD + token vaults + FGA hooks + audit is a very deep stack. **89% of AI-powered APIs rely on insecure authentication** ([Wallarm, 2025](https://www.wallarm.com/press-releases/wallarm-releases-2025-api-threatstats-report)) and **48% of cybersecurity professionals identify agentic AI as the single most dangerous attack vector** ([Bessemer, 2026](https://www.bvp.com/atlas/securing-ai-agents-the-defining-cybersecurity-challenge-of-2026)). Buy unless you have a strong reason not to.

### Evaluation Criteria for AI-Era Auth

Six criteria worth checking against any shortlist:

1. **Non-human identity support** — first-class M2M credentials with custom claims and revocation, not "use a service account" ([WorkOS criteria](https://workos.com/blog/best-oauth-oidc-providers-for-authenticating-ai-agents-2025)).
2. **Granular token scoping** — scopes + claims + clean hand-off to an FGA engine. Progressive and tenant-scoped scopes signal maturity ([Descope](https://www.descope.com/blog/post/progressive-scoping)).
3. **Multi-tenant primitives** — Organizations, roles, per-org JWT claims, RBAC without SQL hand-rolling.
4. **MCP-compatible flows** — OAuth 2.1 + PKCE, PRM + AS metadata, DCR **or** CIMD.
5. **OAuth provider capability** — your app as an OAuth IdP so external agents authenticate via user consent ([Clerk as OAuth IdP](https://clerk.com/docs/advanced-usage/clerk-idp.md), [Stytch Connected Apps](https://stytch.com/docs/guides/connected-apps/ai-agents), [Descope Inbound Apps](https://www.descope.com/press-release/agentic-identity-hub-2.0)).
6. **Audit logging and session management** — structured events, session introspection API, session lifecycle webhooks, compatibility with [OpenTelemetry GenAI](https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-agent-spans/).

### Why Clerk Fits Modern AI Applications

Clerk ships five AI-specific capabilities:

- **M2M tokens** — Public Beta Aug 2025, GA Oct 14 2025, JWT format Mar 5 2026 ([changelog](https://clerk.com/changelog/2026-02-24-m2m-jwt-tokens.md)). Custom `claims`; opaque (instant revocation) or JWT (local verification). Revoke via `clerkClient.m2m.revokeToken({ m2mTokenId })`.
- **Organizations** — org-scoped identity, up to 10 [custom roles](https://clerk.com/glossary/custom-roles.md) per org, BAPI CRUD over roles and permissions, org claims on every session JWT ([changelog](https://clerk.com/changelog/2025-11-24-organization-roles-and-permission-bapi-management.md)).
- **OAuth Provider** — Authorization Code + PKCE, DCR (RFC 7591) as a Dashboard toggle, JWT OAuth access tokens for networkless verification ([changelog](https://clerk.com/changelog/2026-01-08-jwt-oauth-access-tokens.md)).
- **Short-TTL sessions** — 60-second default with proactive refresh in Core 3 ([changelog](https://clerk.com/changelog/2026-03-03-core-3.md)); JWT templates for agent context.
- **MCP primitives** — Next.js: `verifyClerkToken` from `@clerk/mcp-tools/next` plugs into Vercel's `withMcpAuth` wrapper. Express: `mcpAuthClerk` middleware collapses the pattern into one line. Clerk also hosts a docs MCP server at `https://mcp.clerk.com/mcp` for AI coding assistants ([Clerk MCP server guide](https://clerk.com/docs/guides/ai/mcp/clerk-mcp-server.md), [changelog](https://clerk.com/changelog/2026-01-20-clerk-mcp-server.md)) — a developer tool, not an auth proxy.

### Getting Started With Clerk for AI Agent Authentication

A 4-step path:

1. Install `@clerk/nextjs` (Next.js 16) or `@clerk/express`. Optionally scaffold via the Clerk CLI (`npm install -g clerk`, then `clerk init`) — [shipped 2026-04-22](https://clerk.com/changelog/2026-04-22-clerk-cli.md) with three commands: `clerk init` (detects your framework and scaffolds Clerk into the project), `clerk config` (manages application settings from the command line), and `clerk api` (interacts with the Backend API). A `clerk deploy` command is in development.
2. Enable machine auth in the Dashboard — M2M tokens, API keys, OAuth applications with DCR toggle.
3. Define a JWT template for agent context (see [Step 3: Issue Scoped Tokens for Delegated Agent Actions](#step-3-issue-scoped-tokens-for-delegated-agent-actions)).
4. In route handlers, call `auth({ acceptsToken: ['session_token', 'oauth_token', 'm2m_token', 'api_key'] })` to accept every token type on the same endpoint.

The manual quickstart (wrap `app/layout.tsx` in `<ClerkProvider>`, create `proxy.ts` with `clerkMiddleware()`) remains supported.

## Quick Reference Checklists

Three copy-pastable checklists for AI agent auth. Each item is independently verifiable.

### Token Scoping Checklist

- [ ] Access token TTL ≤ 1 hour (≤ 15 minutes for high-risk actions).
- [ ] Refresh tokens rotate on every use with automatic reuse detection.
- [ ] Tokens are sender-constrained (DPoP or mTLS) where the client supports it.
- [ ] Scopes are minimum-required; never `*` / `all`.
- [ ] Action-specific permissions enforced server-side (middleware + FGA).
- [ ] `aud` claim validated on every request.
- [ ] `exp`, `iat`, `nbf` verified against server clock.
- [ ] Tokens include `agent_id` and `parent_identity` claims.

### Multi-Tenant Agent Checklist

- [ ] Every agent token carries `org_id` (or tenant equivalent).
- [ ] Middleware rejects cross-tenant access with 403 + audit log.
- [ ] DB policies enforce `tenant_id` at row level (RLS or equivalent).
- [ ] Agents never hold long-lived credentials spanning tenants.
- [ ] Agent memory / context is isolated per tenant (no shared caches).
- [ ] Separate encryption keys per tenant for sensitive data.
- [ ] Revocation cascades to all per-tenant tokens on tenancy changes.

### Security Review Checklist

- [ ] No long-lived static secrets in agent code or config.
- [ ] MCP servers on Streamable HTTP, never stdio in production (April 2026 CVE cluster).
- [ ] Token vault for third-party OAuth credentials (agent never sees refresh tokens).
- [ ] Per-agent identity (no shared credentials across agents).
- [ ] Structured audit logs with tool invocation hashes.
- [ ] Human-in-the-loop (CIBA) required for high-risk actions.
- [ ] Rate limits on both RPM and token-per-minute.
- [ ] Incident response plan includes token revocation at scale.

## FAQ

### What is authentication for AI applications?

Verifying both the human user and the AI agent acting on their behalf, issuing short-lived, scoped, auditable credentials to each. Agents are first-class non-human identities with their own tokens, claims, and revocation lifecycle. See the [OpenID Foundation AI Identity whitepaper](https://openid.net/wp-content/uploads/2025/10/Identity-Management-for-Agentic-AI.pdf).

### How is AI agent authentication different from traditional OAuth?

Same mechanics — Authorization Code + PKCE, Client Credentials, JWTs — but with dual principals (user + agent), tighter TTLs, mandatory audience binding, and explicit delegation via the `act` claim ([RFC 8693](https://www.rfc-editor.org/rfc/rfc8693), [IETF OBO for AI Agents](https://datatracker.ietf.org/doc/html/draft-oauth-ai-agents-on-behalf-of-user-02)).
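
To make the `act` claim concrete: RFC 8693 nests each prior actor inside the current one, so a resource server can walk the chain to see every party that handled the request. The type and function names below are illustrative.

```ts
// RFC 8693 §4.1: `act` identifies the acting party; a nested `act` is a prior actor.
type ActorClaim = { sub: string; act?: ActorClaim }
type DelegatedClaims = { sub: string; act?: ActorClaim }

// Returns actor subjects from the current actor back to the earliest one.
// The top-level `sub` (the user being acted for) is not part of the chain.
function delegationChain(claims: DelegatedClaims): string[] {
  const chain: string[] = []
  for (let actor = claims.act; actor; actor = actor.act) chain.push(actor.sub)
  return chain
}
```

An empty chain means the subject is acting directly; a non-empty one tells the audit log which agent acted on the user's behalf.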

### What is the best authentication method for AI SaaS apps in 2026?

OAuth 2.1 + PKCE for user-delegated flows; JWT Bearer Assertions ([RFC 7523](https://datatracker.ietf.org/doc/html/rfc7523)) for M2M; Token Exchange ([RFC 8693](https://www.rfc-editor.org/rfc/rfc8693)) for on-behalf-of. MCP servers use OAuth 2.1 + PKCE with PRM discovery ([RFC 9728](https://datatracker.ietf.org/doc/html/rfc9728)).

### How do I authenticate AI agents in a B2B SaaS application?

Issue per-agent identities with `org_id`, short-lived JWTs, and scope-limited capabilities. [Clerk Organizations](https://clerk.com/docs/guides/organizations/overview.md) provide the tenant primitive; every agent request carries `org_id` and `org_role` claims enforced in middleware.

### Should AI agents use API keys or OAuth tokens?

OAuth tokens by default — scoped, short-lived, revocable. API keys are acceptable for simple internal S2S where user context is absent ([Cloudflare](https://www.cloudflare.com/learning/access-management/api-key-vs-oauth/)).

### What are machine-to-machine (M2M) tokens and when should AI agents use them?

Credentials issued to a machine identity with no user in the flow. Use them for autonomous background workers, scheduled jobs, and service-to-service calls ([Clerk M2M](https://clerk.com/docs/guides/development/machine-auth/m2m-tokens.md), [RFC 6749 §4.4](https://www.rfc-editor.org/rfc/rfc6749)).

### How do I scope a token so an AI agent can only do one specific action?

Combine OAuth scopes, resource indicators ([RFC 8707](https://www.rfc-editor.org/rfc/rfc8707)), and Rich Authorization Requests ([RFC 9396](https://www.rfc-editor.org/rfc/rfc9396)) for per-action, per-resource, per-amount limits — e.g., "transfer up to $500 to Merchant A" ([Stytch](https://stytch.com/blog/ai-agent-authentication-guide/)).
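
As a rough sketch of the per-amount example above, a resource server could compare each incoming request against the `authorization_details` granted with the token. The `PaymentDetail` shape, the `creditorName` field, and `isTransferAllowed` are hypothetical names used for illustration; RFC 9396 only requires the `type` field and leaves the remaining fields to the API.

```typescript
// Hypothetical authorization detail, loosely modeled on the payment example in RFC 9396.
interface PaymentDetail {
  type: "payment_initiation";
  actions: string[];
  creditorName: string; // assumed field: the only payee the agent may pay
  instructedAmount: { currency: string; amount: string }; // upper limit, decimal string
}

// Hypothetical shape of the action the agent is attempting.
interface TransferRequest {
  action: string;
  payee: string;
  currency: string;
  amount: number;
}

// Allow the request only if some granted detail covers the action, payee,
// currency, and amount. Real payment code should compare integer minor units
// (cents) rather than floating-point numbers.
function isTransferAllowed(details: PaymentDetail[], req: TransferRequest): boolean {
  return details.some(
    (d) =>
      d.type === "payment_initiation" &&
      d.actions.some((a) => a === req.action) &&
      d.creditorName === req.payee &&
      d.instructedAmount.currency === req.currency &&
      req.amount > 0 &&
      req.amount <= Number(d.instructedAmount.amount),
  );
}

// "Transfer up to $500 to Merchant A", expressed as a granted detail.
const grantedDetails: PaymentDetail[] = [
  {
    type: "payment_initiation",
    actions: ["transfer"],
    creditorName: "Merchant A",
    instructedAmount: { currency: "USD", amount: "500.00" },
  },
];

const withinLimit = isTransferAllowed(grantedDetails, {
  action: "transfer",
  payee: "Merchant A",
  currency: "USD",
  amount: 120,
});
```

A request for a different payee, a different currency, or an amount above the limit fails the same check, so the token is useless outside the one action it was issued for.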

### How do I authenticate an MCP server?

Streamable HTTP + OAuth 2.1 + PKCE. Expose `/.well-known/oauth-protected-resource` ([RFC 9728](https://datatracker.ietf.org/doc/html/rfc9728)) and `/.well-known/oauth-authorization-server` ([RFC 8414](https://datatracker.ietf.org/doc/html/rfc8414)). Support DCR ([RFC 7591](https://datatracker.ietf.org/doc/html/rfc7591)) or CIMD. See the [MCP Authorization Specification](https://modelcontextprotocol.io/specification/2025-11-25/basic/authorization).
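
A minimal sketch of the protected resource metadata document and the matching `WWW-Authenticate` challenge, assuming a resource identifier with no path component. The hosts `mcp.example.com` and `auth.example.com`, the scope name, and both function names are placeholders, not part of any SDK.

```typescript
// Subset of the RFC 9728 protected resource metadata fields used in this sketch.
interface ProtectedResourceMetadata {
  resource: string;
  authorization_servers: string[];
  scopes_supported?: string[];
  bearer_methods_supported?: string[];
}

// Body served at /.well-known/oauth-protected-resource.
function buildResourceMetadata(
  resource: string,
  authorizationServer: string,
  scopes: string[],
): ProtectedResourceMetadata {
  return {
    resource,
    authorization_servers: [authorizationServer],
    scopes_supported: scopes,
    bearer_methods_supported: ["header"],
  };
}

// Challenge returned with a 401 so an unauthenticated client can discover the
// metadata URL. Assumes `resource` is an origin without a path.
function wwwAuthenticateChallenge(resource: string): string {
  const base = resource.replace(/\/+$/, "");
  return `Bearer resource_metadata="${base}/.well-known/oauth-protected-resource"`;
}

const prmDocument = buildResourceMetadata(
  "https://mcp.example.com",
  "https://auth.example.com",
  ["tools:read"], // hypothetical scope name
);
const challengeHeader = wwwAuthenticateChallenge("https://mcp.example.com/");
```

The client reads `authorization_servers` from this document, then fetches that server's own metadata to find its authorization and token endpoints.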

### How do I prevent an AI agent from accessing another tenant's data?

Require `org_id` on every request, enforce the tenant check in middleware (see [Tenant Boundary Enforcement](#tenant-boundary-enforcement)), and layer database row-level security. PostgreSQL RLS with `USING (tenant_id = current_setting('app.current_tenant_id')::INT)` is standard ([Supabase RLS](https://supabase.com/docs/guides/database/postgres/row-level-security)).

### How do I revoke an AI agent's access after it has been issued tokens?

Use opaque tokens for instant revocation (Clerk: `clerkClient.m2m.revokeToken`). For JWTs, pair short TTLs with a denylist. Rotate refresh tokens on every exchange with reuse detection ([Auth0](https://auth0.com/docs/secure/tokens/refresh-tokens/refresh-token-rotation)).
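
The JWT half of that answer (short TTL plus a denylist) might look like the sketch below. It assumes each token carries a unique `jti` and an `exp` in epoch seconds; `TokenDenylist` and `isTokenUsable` are hypothetical names, and the in-memory store stands in for whatever shared cache a real deployment would use.

```typescript
// Claims this sketch needs from an already-verified JWT.
interface TokenClaims {
  jti: string; // unique token ID
  exp: number; // expiry, epoch seconds
}

// In-memory denylist for illustration only; multiple server instances would
// need a shared store instead.
class TokenDenylist {
  // Maps revoked jti -> exp. Object.create(null) avoids inherited keys such as "constructor".
  private revoked: Record<string, number> = Object.create(null);

  revoke(claims: TokenClaims): void {
    this.revoked[claims.jti] = claims.exp;
  }

  isRevoked(jti: string): boolean {
    return this.revoked[jti] !== undefined;
  }

  // Entries for tokens that have already expired can be dropped, so the list
  // never grows beyond one TTL window of revocations.
  prune(nowSec: number): void {
    for (const jti of Object.keys(this.revoked)) {
      const exp = this.revoked[jti];
      if (exp !== undefined && exp <= nowSec) delete this.revoked[jti];
    }
  }

  size(): number {
    return Object.keys(this.revoked).length;
  }
}

// A token is usable only if it has not expired and has not been revoked.
function isTokenUsable(claims: TokenClaims, denylist: TokenDenylist, nowSec: number): boolean {
  return claims.exp > nowSec && !denylist.isRevoked(claims.jti);
}
```

Short TTLs are what keep this cheap: the denylist only has to remember a revoked token until its `exp` passes, after which normal expiry rejects it anyway.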

### Do I need a separate identity for each AI agent, or can they share credentials?

Per-agent identity is required for auditable attribution and revocation granularity. Only 21.9% of orgs treat agents as independent identities and 45.6% still share API keys ([Gravitee 2026](https://www.gravitee.io/blog/state-of-ai-agent-security-2026-report-when-adoption-outpaces-control)).

### How do I handle AI agents that act on behalf of a user across multiple APIs?

Token Exchange ([RFC 8693](https://www.rfc-editor.org/rfc/rfc8693)) or a federated token vault ([Auth0 Token Vault](https://auth0.com/ai/docs/intro/token-vault), [Anthropic Managed Agents Vaults](https://platform.claude.com/docs/en/managed-agents/vaults)). Store user-scoped refresh tokens server-side; the agent never sees them.

### What are the biggest security risks for authentication in agentic AI systems?

Prompt injection, over-scoped tokens, stolen bearer tokens, stale credentials, and missing audit logs. [OWASP Top 10 for Agentic Applications 2026](https://genai.owasp.org/2025/12/09/owasp-top-10-for-agentic-applications-the-benchmark-for-agentic-security-in-the-age-of-autonomous-ai/) and [MITRE ATLAS v5.1.0](https://atlas.mitre.org/) are the canonical lists.

### How does token refresh work for long-running AI agents?

Refresh about 5 minutes before expiry using rotating refresh tokens with reuse detection ([Scalekit](https://www.scalekit.com/blog/oauth-ai-agents-architecture)). Workers with no refresh endpoint use M2M tokens and re-issue at each TTL boundary.
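
The timing rule can be sketched as two small helpers. The five-minute skew comes from the answer above; the function names are illustrative, and `exp` is assumed to be in epoch seconds as in a JWT.

```typescript
// Refresh this many seconds before the access token expires.
const REFRESH_SKEW_SECONDS = 5 * 60;

// True once the token is inside the refresh window (or already expired).
function shouldRefresh(
  expSec: number,
  nowSec: number,
  skewSec: number = REFRESH_SKEW_SECONDS,
): boolean {
  return expSec - nowSec <= skewSec;
}

// How long a long-running agent can sleep before its next refresh attempt.
function secondsUntilRefresh(
  expSec: number,
  nowSec: number,
  skewSec: number = REFRESH_SKEW_SECONDS,
): number {
  return Math.max(0, expSec - nowSec - skewSec);
}

// Example: a one-hour token issued at t = 1000.
const issuedAt = 1000;
const expiresAt = issuedAt + 3600;
const sleepFor = secondsUntilRefresh(expiresAt, issuedAt); // 3600 - 300 = 3300
```

For tokens whose TTL is shorter than the skew, this rule refreshes on every check, so a smaller skew (for example, a fraction of the TTL) would be a reasonable adjustment.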

### Can Clerk be used as the authentication layer for AI applications?

Yes. Clerk provides session tokens, M2M tokens, API keys, an OAuth provider (with DCR), Organizations, and MCP primitives (`verifyClerkToken` for Next.js, `mcpAuthClerk` for Express). As of April 2026, Client Credentials and native RFC 8693 token exchange are not yet implemented; JWT templates cover agent-context propagation and M2M covers autonomous agents.

### What is the OWASP Top 10 for Agentic Applications?

A ten-item ranked list of the highest-risk agentic-AI vulnerability classes, published in December 2025. The top three are Agent Goal Hijack (ASI01), Tool Misuse and Exploitation (ASI02), and Identity and Privilege Abuse (ASI03). See the [OWASP Top 10 for Agentic Applications 2026](https://genai.owasp.org/2025/12/09/owasp-top-10-for-agentic-applications-the-benchmark-for-agentic-security-in-the-age-of-autonomous-ai/) and [MITRE ATLAS](https://atlas.mitre.org/) for the canonical threat taxonomies.

### What is the difference between DCR and CIMD for MCP clients?

Dynamic Client Registration ([RFC 7591](https://datatracker.ietf.org/doc/html/rfc7591)) lets a client self-register with the authorization server at runtime, producing a server-issued `client_id`. Client ID Metadata Documents (CIMD) — the November 2025 evolution — replace registration with a URL the client controls that serves its metadata, giving DNS-based trust without the registration-table growth and rate-limiting problems DCR hit at scale ([Aaron Parecki, Nov 2025](https://aaronparecki.com/2025/11/25/1/mcp-authorization-spec-update)). The [2025-11-25 MCP spec](https://modelcontextprotocol.io/specification/2025-11-25/basic/authorization) prefers CIMD for public MCP clients.

### How do I implement the act claim for AI agent delegation chains?

Set `sub` to the original user and add an `act` claim whose value is a JSON object identifying the agent (`{"sub": "agent-001", "client_id": "..."}`) per [RFC 8693 §4.1](https://www.rfc-editor.org/rfc/rfc8693#section-4.1). For multi-hop chains, nest `act` inside `act`: the outermost `act` is the current actor and each nested `act` is a prior actor, so every agent in the delegation appears in the trail. Middleware reads the whole chain to audit which agent acted for which user and through which upstream agents. See [IANA JWT Claims Registry](https://www.iana.org/assignments/jwt/jwt.xhtml).
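
Reading the chain in middleware could look like the following sketch, which assumes the token has already been verified and decoded. `actorChain`, `describeDelegation`, and the depth cap are illustrative choices, not part of RFC 8693.

```typescript
// An actor entry inside the act claim; nested `act` values are prior actors.
interface Actor {
  sub: string;
  client_id?: string;
  act?: Actor;
}

// Decoded claims of a delegated token: `sub` is the original user.
interface DelegatedClaims {
  sub: string;
  act?: Actor;
}

// Assumed cap so a malformed or hostile token cannot force unbounded work.
const MAX_CHAIN_DEPTH = 8;

// Returns actor subjects from the current actor (outermost) to the earliest (innermost).
function actorChain(claims: DelegatedClaims): string[] {
  const chain: string[] = [];
  let actor: Actor | undefined = claims.act;
  while (actor !== undefined) {
    if (chain.length >= MAX_CHAIN_DEPTH) {
      throw new Error("delegation chain too deep");
    }
    chain.push(actor.sub);
    actor = actor.act;
  }
  return chain;
}

// One-line audit summary of who acted for whom.
function describeDelegation(claims: DelegatedClaims): string {
  const chain = actorChain(claims);
  if (chain.length === 0) return `${claims.sub} acted directly`;
  const upstream = chain.length > 1 ? ` via ${chain.slice(1).join(", ")}` : "";
  return `${chain[0]} acted for ${claims.sub}${upstream}`;
}

// Two-hop example: agent-001 received the delegation first, then handed off to agent-002.
const twoHopClaims: DelegatedClaims = {
  sub: "user-123",
  act: { sub: "agent-002", act: { sub: "agent-001" } },
};
const auditLine = describeDelegation(twoHopClaims);
```

Logging the full chain rather than only the outermost actor is what lets an auditor trace a downstream call back through every intermediate agent to the consenting user.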
