What is the difference between syncing via webhooks and the Backend API?

Webhooks push data in real-time as changes occur without rate limits, making them ideal for ongoing synchronization. The Backend API requires polling or querying and is subject to rate limits (1,000 req/10s), making it best suited for one-time bulk migrations.

How should I handle data privacy requests like GDPR right to erasure?

When receiving a user.deleted webhook, evaluate if the user has critical business data. If not, perform a hard delete. If they do, anonymize the user record to preserve referential integrity while fulfilling the erasure request.

Articles

How to sync Clerk user data to your database - Part 2

Brian Morrison II

Dec 19, 2025 (last updated Jun 27, 2026)

Part 2 of 2. Start with How to sync Clerk user data to your database.

Real-time synchronization handles ongoing changes, but applications often need to import historical data and ensure compliance. This second part of the series covers bulk user migration, privacy requirements, alternative database schemas, and production monitoring.

Initial data migration with the Backend API

For applications with existing Clerk users, webhooks only capture future changes. Use the Backend API's getUserList() method to perform an initial sync:

scripts/migrate-users.ts

import { clerkClient } from '@clerk/nextjs/server'

async function migrateExistingUsers() {
  const client = await clerkClient()
  let offset = 0
  const limit = 100 // Max 500 per request

  while (true) {
    const { data: users, totalCount } = await client.users.getUserList({
      limit,
      offset,
      orderBy: 'created_at',
    })

    if (users.length === 0) break

    for (const user of users) {
      const email = user.emailAddresses[0]?.emailAddress
      if (!email) continue

      await prisma.user.upsert({
        where: { clerkId: user.id },
        update: {
          email,
          firstName: user.firstName,
          lastName: user.lastName,
          imageUrl: user.imageUrl,
        },
        create: {
          clerkId: user.id,
          email,
          firstName: user.firstName,
          lastName: user.lastName,
          imageUrl: user.imageUrl,
        },
      })
    }

    offset += limit
    console.log(`Migrated ${offset} of ${totalCount} users`)

    // Respect rate limits: 1000 req/10s production, 100 req/10s development
    await new Promise((resolve) => setTimeout(resolve, 100))
  }
}

For even better performance, you can leverage Prisma's createMany operation to insert multiple records in a single statement. This approach reduces the number of database round trips and can significantly speed up large migrations:

scripts/batch-migrate-users.ts

// Alternative: Prisma batch operations for better performance
async function batchMigrateUsers() {
  const client = await clerkClient()
  let offset = 0
  const limit = 100

  while (true) {
    const { data: users } = await client.users.getUserList({
      limit,
      offset,
      orderBy: 'created_at',
    })

    if (users.length === 0) break

    const usersWithEmail = users.filter((u) => u.emailAddresses.length > 0)

    if (usersWithEmail.length > 0) {
      // createMany batch-inserts in a single statement (all SQL connectors; SQLite
      // since Prisma 5.12.0). skipDuplicates relies on ON CONFLICT DO NOTHING, so it
      // works on PostgreSQL/MySQL/CockroachDB but not on SQLite, SQL Server, or MongoDB.
      await prisma.user.createMany({
        data: usersWithEmail.map((user) => ({
          clerkId: user.id,
          email: user.emailAddresses[0].emailAddress,
          firstName: user.firstName,
          lastName: user.lastName,
          imageUrl: user.imageUrl,
        })),
        skipDuplicates: true, // Ignore conflicts on unique constraints
      })
    }

    offset += limit
    await new Promise((resolve) => setTimeout(resolve, 100))
  }
}

The Backend API supports filtering by email, phone, user ID (up to 100 values), and free-text search via the query parameter. For production environments, you have 1,000 requests per 10 seconds; development instances are limited to 100 requests per 10 seconds.

Connection pooling becomes critical during bulk migrations. PostgreSQL's connection documentation provides the foundational formula: active connections should be near (core_count × 2) + effective_spindle_count. PgBouncer's configuration reference documents pool modes and key settings like max_client_conn (default 100) and default_pool_size (default 20). (PostgreSQL connection documentation, 2024) (PgBouncer's configuration reference, 2024)

Prisma upsert operations can encounter race conditions during concurrent access. The Prisma CRUD documentation covers upsert behavior, and race conditions may throw P2002 errors on concurrent upserts. The transactions documentation covers four transaction patterns, with interactive transactions ($transaction(async (tx) => {...})) supporting custom logic including explicit rollback and isolation levels. (Prisma CRUD documentation, 2024) (Prisma transactions documentation, 2024)

Combining auth migration strategies

When migrating from existing authentication systems to Clerk, Auth0's migration documentation defines two proven strategies:

Automatic (lazy) migration where users migrate transparently during sign-in without password resets—"Over the course of a few weeks or months, a majority of users will have been automatically migrated without noticing anything has changed"—and bulk migration for urgent scenarios via Management API import with optional password hash preservation.

For Clerk migrations, the webhook-based sync approach supports both patterns: configure webhooks before migration to capture new users automatically, then use the Backend API for bulk historical data import. (Auth0's migration documentation, 2023)

Use cases that justify database sync

Analytics and reporting dashboards

When building dashboards that aggregate user data—signups over time, geographic distribution, engagement metrics—querying your own database is dramatically more efficient than calling Clerk's API repeatedly:

lib/analytics.ts

// Get signup trends by week
const signupsByWeek = await prisma.$queryRaw`
  SELECT
    DATE_TRUNC('week', created_at) as week,
    COUNT(*) as signups
  FROM users
  GROUP BY DATE_TRUNC('week', created_at)
  ORDER BY week DESC
  LIMIT 12
`

Custom user profiles

Clerk's metadata fields are typically sufficient for most applications, but if you need additional user attributes—bio, company, preferences, subscription tier—store them alongside the synced Clerk data:

prisma/schema.prisma

model User {
  id        Int      @id @default(autoincrement())
  clerkId   String   @unique
  email     String   @unique
  // Clerk-synced fields
  firstName String?
  lastName  String?
  imageUrl  String?
  deletedAt DateTime?
  // Application-specific fields
  bio       String?
  company   String?
  role      Role     @default(USER)
  tier      Tier     @default(FREE)
}

enum Role { USER ADMIN MODERATOR }
enum Tier { FREE PRO ENTERPRISE }

Note

If your application requires complex role-based access control or subscription tiers, consider using Clerk Organizations for roles and permissions, and Clerk Billing for subscription management. These built-in features handle the complexity of multi-tenant access control and recurring billing, saving you from building and maintaining these systems yourself.

Compliance and audit logging

Regulated industries often require immutable audit trails of user data changes. Capture each webhook event:

app/api/webhooks/clerk/route.ts

import { NextRequest } from 'next/server'
import { verifyWebhook, WebhookEvent } from '@clerk/nextjs/webhooks'
import { prisma } from '@/lib/prisma'

export async function POST(req: NextRequest) {
  let evt: WebhookEvent

  try {
    evt = await verifyWebhook(req)
  } catch (err) {
    console.error('Error verifying webhook:', err)
    return new Response('Error occurred', { status: 400 })
  }

  // Handle audit logging for user updates
  if (evt.type === 'user.updated') {
    const email = evt.data.email_addresses[0]?.email_address
    if (!email) return new Response('User has no email', { status: 200 })

    await prisma.$transaction([
      // upsert (not update) so an out-of-order event that arrives before
      // user.created still creates the row instead of throwing P2025
      prisma.user.upsert({
        where: { clerkId: evt.data.id },
        update: {
          email,
          firstName: evt.data.first_name,
          lastName: evt.data.last_name,
          imageUrl: evt.data.image_url,
          updatedAt: new Date(evt.data.updated_at),
        },
        create: {
          clerkId: evt.data.id,
          email,
          firstName: evt.data.first_name,
          lastName: evt.data.last_name,
          imageUrl: evt.data.image_url,
          updatedAt: new Date(evt.data.updated_at),
        },
      }),
      prisma.auditLog.create({
        data: {
          userId: evt.data.id,
          action: 'USER_UPDATED',
          payload: JSON.stringify({
            updated_at: evt.data.updated_at,
            id: evt.data.id,
          }),
          timestamp: new Date(),
          ipAddress: req.headers.get('x-forwarded-for') || 'unknown',
          userAgent: req.headers.get('user-agent') || 'unknown',
        },
      }),
    ])
  }

  return new Response('', { status: 200 })
}

CRM and third-party integrations

When integrating with external systems that need user data—email marketing platforms, support tools, analytics services—syncing to your database provides a single integration point:

app/api/webhooks/clerk/route.ts

import { NextRequest } from 'next/server'
import { verifyWebhook, WebhookEvent } from '@clerk/nextjs/webhooks'
import { prisma } from '@/lib/prisma'
import { emailService } from '@/lib/email'
import { analytics } from '@/lib/analytics'

export async function POST(req: NextRequest) {
  let evt: WebhookEvent

  try {
    evt = await verifyWebhook(req)
  } catch (err) {
    console.error('Error verifying webhook:', err)
    return new Response('Error occurred', { status: 400 })
  }

  // Handle user creation and downstream integrations
  if (evt.type === 'user.created') {
    const email = evt.data.email_addresses[0]?.email_address
    if (!email) return new Response('User has no email', { status: 200 })

    await prisma.user.upsert({
      where: { clerkId: evt.data.id },
      update: {}, // no-op on duplicate delivery (Svix is at-least-once)
      create: {
        clerkId: evt.data.id,
        email,
        firstName: evt.data.first_name,
        lastName: evt.data.last_name,
        imageUrl: evt.data.image_url,
        createdAt: new Date(evt.data.created_at),
        updatedAt: new Date(evt.data.updated_at),
      },
    })

    // Trigger downstream integrations
    await emailService.addContact({
      email: evt.data.email_addresses[0]?.email_address,
      firstName: evt.data.first_name,
      lastName: evt.data.last_name,
      tags: ['new-user'],
    })

    await analytics.identify(evt.data.id, {
      email: evt.data.email_addresses[0]?.email_address,
      createdAt: evt.data.created_at,
      source: 'clerk_webhook',
    })
  }

  return new Response('', { status: 200 })
}

Apply data minimization principles when deciding what to sync, keeping GDPR requirements in mind. The more user data you store locally, the greater your compliance burden.

Typically safe to sync:

id (required for linking)
email_addresses (if needed for application features)
first_name, last_name (for display purposes)
image_url (for avatars)

Consider not syncing:

phone_numbers (unless essential)
external_accounts (OAuth provider details)
private_metadata (admin-only data)
last_sign_in_at (often unnecessary)

For GDPR's right to erasure, handle user.deleted events properly. (GDPR Article 17, 2018) Right to Erasure mandates: "The controller shall have the obligation to erase personal data without undue delay." GDPR Article 17(2) extends this to downstream processors: controllers must "take reasonable steps, including technical measures, to inform controllers which are processing the personal data that the data subject has requested the erasure."

(GDPR Article 5(1)(c), 2018) defines data minimization: "Personal data shall be... adequate, relevant and limited to what is necessary in relation to the purposes for which they are processed." The (UK ICO's practical guidance, 2024) addresses backup systems: "The key issue is to put the backup data 'beyond use'." Response time requirement: one calendar month from request receipt.

app/api/webhooks/clerk/route.ts

import { NextRequest } from 'next/server'
import { verifyWebhook, WebhookEvent } from '@clerk/nextjs/webhooks'
import { prisma } from '@/lib/prisma'
import { auditLogger } from '@/lib/audit'

export async function POST(req: NextRequest) {
  let evt: WebhookEvent

  try {
    evt = await verifyWebhook(req)
  } catch (err) {
    console.error('Error verifying webhook:', err)
    return new Response('Error occurred', { status: 400 })
  }

  // Handle user deletion with GDPR compliance options
  if (evt.type === 'user.deleted') {
    const userId = evt.data.id

    // The user.deleted payload's id is optional; bail if it is missing
    if (!userId) {
      return new Response('', { status: 200 })
    }

    try {
      // Check if user has critical business data
      const userRelations = await prisma.user.findUnique({
        where: { clerkId: userId },
        include: {
          orders: { select: { id: true } },
          posts: { select: { id: true } },
          comments: { select: { id: true } },
        },
      })

      if (!userRelations) {
        // Already removed or never synced — nothing to erase (idempotent)
        return new Response('', { status: 200 })
      }

      const hasBusinessData =
        userRelations.orders.length > 0 ||
        userRelations.posts.length > 0 ||
        userRelations.comments.length > 0

      if (!hasBusinessData) {
        // Option 1: Hard delete (safe when no business data exists)
        await prisma.user.delete({ where: { clerkId: userId } })
        await auditLogger.log('USER_HARD_DELETED', { userId })
      } else {
        // Option 2: Anonymize (preserves referential integrity)
        await prisma.user.update({
          where: { clerkId: userId },
          data: {
            email: `deleted-${userId}@anonymized.local`,
            firstName: 'Deleted',
            lastName: 'User',
            imageUrl: null,
            deletedAt: new Date(),
          },
        })
        await auditLogger.log('USER_ANONYMIZED', { userId, reason: 'has_business_data' })
      }
    } catch (error) {
      console.error('Error handling user deletion:', error)
      return new Response('Error processing deletion', { status: 500 })
    }
  }

  return new Response('', { status: 200 })
}

Clerk provides compliance documentation and a Data Processing Agreement for GDPR compliance, including data retention policies.

MySQL and MongoDB alternatives

For MySQL databases, the schema structure remains similar with syntax adjustments:

schema.sql

CREATE TABLE users (
  id INT AUTO_INCREMENT PRIMARY KEY,
  clerk_id VARCHAR(255) NOT NULL UNIQUE,
  email VARCHAR(255) NOT NULL UNIQUE,
  first_name VARCHAR(255),
  last_name VARCHAR(255),
  image_url TEXT,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  deleted_at TIMESTAMP NULL,

  INDEX idx_clerk_id (clerk_id),
  INDEX idx_email (email)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

MongoDB's document model offers natural flexibility for user profiles:

models/User.ts

const userSchema = new mongoose.Schema(
  {
    clerkId: { type: String, required: true, unique: true, index: true },
    email: { type: String, required: true, unique: true },
    profile: {
      firstName: String,
      lastName: String,
      imageUrl: String,
    },
    metadata: mongoose.Schema.Types.Mixed, // Flexible additional data
  },
  { timestamps: true },
)

MongoDB excels when user profiles have highly variable structures, though PostgreSQL with JSONB columns provides similar flexibility with stronger querying capabilities.

Soft delete considerations

While the examples above show hard deletion, some applications require soft deletes for audit trails or data recovery. However, soft deletes carry significant tradeoffs. The technical analysis argues against soft deletes: "The main problem with soft deletion is that you're systematically misleading the database... foreign keys are effectively lost." (brandur.org, 2024)

Domain-Driven Design authority frames deletion as a domain concept requiring explicit modeling rather than database-level implementation: "Don't delete—just don't." For implementations requiring soft deletes, PostgreSQL's partial unique indexes can exclude deleted records from index overhead: (Udi Dahan, 2009)

migrations/add_partial_index.sql

CREATE UNIQUE INDEX users_email_active_idx
ON users (email)
WHERE deleted_at IS NULL;

Common pitfalls and solutions

Webhook verification fails in Express: The express.json() middleware parses the body before your handler, breaking signature verification. Use express.raw() specifically for the webhook route.

308 redirect errors in production: Vercel and similar platforms may redirect between www and non-www domains, causing webhook failures. Configure your endpoint URL in Clerk Dashboard to match your canonical domain exactly.

Race condition on user creation: Users may access your application before the user.created webhook arrives and creates their database record. Solve this with the upsert pattern in both webhooks and your initial page load, or implement a loading state that waits for the record.

Payload parsing errors with Svix: If you see "Expected payload to be of type string or Buffer," your framework has pre-parsed the body. Ensure raw body access or use JSON.stringify() if reconstruction is necessary.

Monitoring and observability

Webhook handlers are critical infrastructure—when they fail silently, your database drifts out of sync with Clerk, leading to bugs that are difficult to diagnose. Implementing proper monitoring ensures you catch issues before they impact users.

Why monitoring matters:

Silent failures compound over time: A single failed webhook might not cause obvious problems, but hundreds of failed user.updated events mean your application shows stale user data everywhere
Debugging is difficult after the fact: Without logs, you won't know why a user's profile didn't update or why a deleted user still appears in your app
Performance degradation affects reliability: If your handler takes too long, Svix will timeout and retry, potentially causing duplicate processing or cascading failures

Track these metrics for production webhook handlers:

Processing latency: How long webhook handling takes (aim for under 1 second)
Failure rate: Percentage of webhooks returning non-2xx status
Queue depth: For async processing, how many events await processing
Duplicate rate: How often you receive the same svix-id

These align with Google's SRE book definition of the Four Golden Signals for monitoring distributed systems: Latency (track error latency separately), Traffic (requests per second), Errors (explicit HTTP 500s, implicit wrong content, policy violations), and Saturation (queue depth, processing backlog). The SRE guidance provides retry recommendations: "Always use randomized exponential backoff when scheduling retries. Limit retries per request. Don't retry indefinitely. Consider having a server-wide retry budget." (Google's SRE book, 2016) (addressing cascading failures, 2016)

app/api/webhooks/clerk/route.ts

import { NextRequest } from 'next/server'
import { verifyWebhook, WebhookEvent } from '@clerk/nextjs/webhooks'
import { prisma } from '@/lib/prisma'
import { logger } from '@/lib/logger'
import { metrics } from '@/lib/metrics'

export async function POST(req: NextRequest) {
  const startTime = Date.now()
  const webhookId = req.headers.get('svix-id')

  let evt: WebhookEvent

  try {
    evt = await verifyWebhook(req)
  } catch (err) {
    const processingTime = Date.now() - startTime
    logger.error('Webhook verification failed', {
      webhookId,
      processingTimeMs: processingTime,
      error: err instanceof Error ? err.message : String(err),
      status: 'verification_failed',
    })
    metrics.increment('webhook.verification.failed')
    return new Response('Error occurred', { status: 400 })
  }

  // Process webhook with comprehensive logging
  try {
    logger.info('Processing webhook', {
      webhookId,
      eventType: evt.type,
      userId: evt.data.id,
      timestamp: new Date().toISOString(),
    })

    // Handle different event types
    switch (evt.type) {
      case 'user.created': {
        const email = evt.data.email_addresses[0]?.email_address
        if (!email) break

        await prisma.user.upsert({
          where: { clerkId: evt.data.id },
          update: {}, // no-op on duplicate delivery (Svix is at-least-once)
          create: {
            clerkId: evt.data.id,
            email,
            firstName: evt.data.first_name,
            lastName: evt.data.last_name,
            imageUrl: evt.data.image_url,
            createdAt: new Date(evt.data.created_at),
          },
        })
        metrics.increment('webhook.user.created')
        break
      }

      case 'user.updated': {
        const email = evt.data.email_addresses[0]?.email_address
        if (!email) break

        // upsert so an out-of-order update (before user.created) creates the row
        await prisma.user.upsert({
          where: { clerkId: evt.data.id },
          update: {
            email,
            firstName: evt.data.first_name,
            lastName: evt.data.last_name,
            imageUrl: evt.data.image_url,
            updatedAt: new Date(evt.data.updated_at),
          },
          create: {
            clerkId: evt.data.id,
            email,
            firstName: evt.data.first_name,
            lastName: evt.data.last_name,
            imageUrl: evt.data.image_url,
            updatedAt: new Date(evt.data.updated_at),
          },
        })
        metrics.increment('webhook.user.updated')
        break
      }

      case 'user.deleted':
        // deleteMany returns { count: 0 } instead of throwing P2025 when the row
        // is already gone; guard the optional id so the filter is never empty
        if (evt.data.id) {
          await prisma.user.deleteMany({ where: { clerkId: evt.data.id } })
        }
        metrics.increment('webhook.user.deleted')
        break
    }

    const processingTime = Date.now() - startTime

    // Log successful processing with metrics
    logger.info('Webhook processed successfully', {
      webhookId,
      eventType: evt.type,
      userId: evt.data.id,
      processingTimeMs: processingTime,
      status: 'success',
    })

    // Track performance metrics
    metrics.timing('webhook.processing_time', processingTime)
    metrics.increment('webhook.processed.success')
  } catch (error) {
    const processingTime = Date.now() - startTime

    logger.error('Webhook processing failed', {
      webhookId,
      eventType: evt.type,
      userId: evt.data?.id,
      processingTimeMs: processingTime,
      error: error instanceof Error ? error.message : String(error),
      stack: error instanceof Error ? error.stack : undefined,
      status: 'processing_failed',
    })

    metrics.increment('webhook.processed.failed')
    return new Response('Processing failed', { status: 500 })
  }

  return new Response('', { status: 200 })
}

For production applications, consider integrating with observability platforms:

Structured logging: Use JSON-formatted logs with consistent fields for easier searching and alerting
Error tracking: Services like Sentry or Bugsnag can alert you immediately when webhook handlers throw exceptions
Metrics dashboards: Tools like Datadog, Grafana, or CloudWatch let you visualize webhook performance trends and set threshold alerts
Distributed tracing: For complex webhook processing with multiple downstream services, tracing helps you understand the full request lifecycle

OpenTelemetry tracing concepts define span types relevant to webhooks: Server spans for incoming webhook requests, Producer/Consumer spans for async queue patterns. HTTP semantic conventions specify key attributes including http.request.resend_count for retry tracking. (OpenTelemetry tracing concepts, 2024) (HTTP semantic conventions, 2024)

The Clerk Dashboard provides webhook delivery logs with detailed request and response information—essential for debugging production issues.

Best practices checklist

Before deploying your webhook endpoint to production, verify:

Security & Verification:

Signature verification using Clerk's verifyWebhook() helper
Raw body access configured (no JSON parsing before verification)
CLERK_WEBHOOK_SIGNING_SECRET stored in environment variables
Signing secret treated as sensitive—never logged or exposed
Server clock synchronized via NTP (required for timestamp validation)
Consider IP allowlisting from Svix servers for additional protection

Performance & Reliability:

Response returned within 15 seconds to avoid timeout retries
Idempotent endpoints using upserts or webhook ID tracking
Return 400 status for invalid signatures (triggers Svix retry)
Return 200 status for successful processing

Implementation:

All subscribed event types handled (user.created, user.updated, user.deleted)
Database indexes on clerkId column for fast lookups
Error logging for debugging failed webhooks
Separate webhook endpoints configured for development and production

Data & Compliance:

Initial migration script for existing users
GDPR-compliant user deletion handling
Monitoring and alerting configured

Conclusion

Syncing Clerk user data to your own database unlocks powerful capabilities—from analytics dashboards to custom profiles to reduced API dependency—but comes with meaningful trade-offs in infrastructure complexity and eventual consistency. Webhooks remain the recommended approach for real-time sync, with the Backend API serving bulk migrations and recovery scenarios.

The key insight from Clerk's own documentation bears repeating: if session data and metadata can serve your needs, you may not need to sync at all. When you do sync, focus on minimal data, idempotent handlers, and proper verification. Your webhook handler should be defensive, assuming duplicate deliveries and occasional out-of-order events.

For most Next.js applications with PostgreSQL, the combination of Prisma schemas, the verifyWebhook() helper, and upsert-based event handling provides a robust foundation. Start simple, sync only what you need, and expand the scope of synced data only when concrete requirements demand it.

Frequently asked questions

In this series

How to sync Clerk user data to your database
How to sync Clerk user data to your database - Part 2 (you are here)