PDF Security Blog

PDF Tamper Detection in Next.js: App Router Integration Guide

HTPBE Team · 20 min read

This article is a snapshot — content was accurate as of June 2026 (code examples tested against the API as of August 2026). The product evolves actively; specific counts, examples, and detection rules may have changed since publication — see the changelog for the current state.

A Next.js app that accepts document uploads has a specific problem: the file never touches a long-lived server you control. It lands in a serverless function with a body-size ceiling, runs for a few hundred milliseconds, and disappears. If a user submits a bank statement, a payslip, or an invoice, your App Router handler has one chance to decide whether that file is trustworthy before it influences a payment, a credit limit, or a hire.

This guide shows the production pattern for wiring PDF tamper detection into a Next.js 14/15 App Router application: a route handler that runs on the Node.js runtime, a presigned client-side upload so large PDFs never hit your function body, a server-side call to the analysis API, and a typed result UI. The code below is end-to-end TypeScript — it compiles, runs the request flow, and handles the documented error codes — but you still need to harden it for your own environment (CORS on the bucket, bucket lifecycle policy, abuse protection on the upload-url route, log redaction, auth on routes that mint URLs). The hardening checklist is in the security section near the end. (If you are on plain Node.js or Express, the Node.js integration guide covers that surface; for Python and Django/Flask, see the Python guide. For the conceptual model, start with how PDF tamper detection works.)

Why the obvious Next.js approach breaks

The instinct is to accept a multipart/form-data upload in a route handler, read the file into a buffer, and forward it to the analysis service. On Vercel and most serverless platforms this fails for anything above roughly 4.5 MB — the request body never reaches your function. A 6 MB scanned statement returns a 413 before your code runs.

The second instinct is to set runtime = 'edge' for speed. That also breaks: the Edge runtime has no Node.js built-ins, and PDF parsing libraries depend on them. Any handler that touches PDF bytes must declare the Node.js runtime explicitly.

The pattern that holds up in production splits the work in two:

  1. The browser uploads the PDF directly to object storage using a short-lived presigned URL — the bytes never pass through your Next.js function.
  2. Your route handler receives only the resulting URL, then calls the analysis API server-side. The API downloads the file from that URL, analyses it synchronously, and returns a verdict.

This is exactly how the HTPBE? API is designed to be consumed. It does not accept file uploads at all — it takes a publicly reachable URL and fetches the PDF itself. That constraint, which looks like a limitation, is what lets a Next.js app stay within serverless limits and keep large files off the function body entirely.

Step 1: The presigned-upload route handler

First, a route handler that mints a presigned PUT URL for the browser. This example uses Cloudflare R2 (S3-compatible), but the same @aws-sdk calls work against S3 or GCS with one endpoint change.

app/api/upload-url/route.ts:

import { NextRequest, NextResponse } from 'next/server';
import { S3Client, PutObjectCommand } from '@aws-sdk/client-s3';
import { getSignedUrl } from '@aws-sdk/s3-request-presigner';
import { randomUUID } from 'node:crypto';

export const runtime = 'nodejs';

const s3 = new S3Client({
  region: 'auto',
  endpoint: process.env.R2_ENDPOINT,
  credentials: {
    accessKeyId: process.env.R2_ACCESS_KEY_ID!,
    secretAccessKey: process.env.R2_SECRET_ACCESS_KEY!,
  },
});

export async function POST(request: NextRequest): Promise<NextResponse> {
  const { filename } = (await request.json()) as { filename?: string };

  // Force a .pdf key and strip anything unsafe from the original name.
  const safeBase = (filename ?? 'upload')
    .replace(/\.[^.]+$/, '')
    .replace(/[^a-zA-Z0-9_-]/g, '_')
    .slice(0, 50);
  const key = `incoming/${safeBase}--${randomUUID()}.pdf`;

  const uploadUrl = await getSignedUrl(
    s3,
    new PutObjectCommand({
      Bucket: process.env.R2_BUCKET!,
      Key: key,
      ContentType: 'application/pdf',
    }),
    { expiresIn: 300 } // 5 minutes
  );

  // The public URL the analysis API will fetch from.
  const publicUrl = `${process.env.R2_PUBLIC_URL}/${key}`;

  return NextResponse.json({ uploadUrl, publicUrl, originalFilename: filename });
}

Two details matter for security. The expiresIn: 300 window means the presigned link is dead five minutes after it is issued: long enough for a slow upload to finish, short enough that a leaked link is soon worthless. (The analysis step does not use this link; it reads the file through publicUrl.) And the route only mints a PUT for a key it generated itself, so a caller cannot overwrite arbitrary objects.
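The key-building rules are easier to unit-test once they live in a pure function. Here is a minimal sketch, assuming a hypothetical helper named buildObjectKey that takes the random id as an argument (the route above would pass randomUUID()). It applies the same three steps as the handler and is not a change to the pattern.

```typescript
// Hypothetical helper: the same sanitising rules as the route handler,
// extracted so they can be tested without S3 or Next.js.
function buildObjectKey(filename: string | undefined, id: string): string {
  const safeBase = (filename ?? 'upload')
    .replace(/\.[^.]+$/, '') // drop the last extension, whatever it was
    .replace(/[^a-zA-Z0-9_-]/g, '_') // dots, slashes, spaces become "_"
    .slice(0, 50); // cap the user-controlled part of the key
  // The caller supplies the id, so the key stays unguessable in production
  // and deterministic in tests.
  return `incoming/${safeBase}--${id}.pdf`;
}

// A path-traversal attempt collapses into a flat key under incoming/.
console.log(buildObjectKey('../../etc/passwd.exe', 'id-1'));
// prints "incoming/______etc_passwd--id-1.pdf"
```

Because every character outside [a-zA-Z0-9_-] is replaced, the user-supplied part can never contain a slash or a dot, so the object always lands directly under incoming/.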

Step 2: The client-side upload

On the client, a small component requests the presigned URL, PUTs the file straight to storage, then hands the resulting publicUrl to the verification route. The file bytes go from the browser to R2 and never enter your Next.js function.

app/components/UploadForm.tsx:

'use client';

import { useState } from 'react';

interface VerdictResponse {
  status: 'intact' | 'modified' | 'inconclusive';
  statusReason?: string;
  markers: string[];
  producer: string | null;
}

export function UploadForm() {
  const [verdict, setVerdict] = useState<VerdictResponse | null>(null);
  const [error, setError] = useState<string | null>(null);
  const [busy, setBusy] = useState(false);

  async function handleSubmit(e: React.FormEvent<HTMLFormElement>) {
    e.preventDefault();
    const file = (e.currentTarget.elements.namedItem('pdf') as HTMLInputElement)
      .files?.[0];
    if (!file) return;

    // Client-side pre-checks. Cheap, not authoritative — re-validate on the
    // server side (the verify route + the bucket policy) because the browser
    // input is trivially bypassable. These exist to fail fast on the user's
    // own mistake (wrong file type, file too large), not to gate anything.
    const MAX_BYTES = 10 * 1024 * 1024; // matches API limit
    if (file.size > MAX_BYTES) {
      setError('File is larger than 10 MB — the API will reject it.');
      return;
    }
    if (file.type !== 'application/pdf' || !file.name.toLowerCase().endsWith('.pdf')) {
      setError('Only PDF files are accepted.');
      return;
    }

    setBusy(true);
    setError(null);
    setVerdict(null);

    try {
      // 1. Ask our own route for a presigned PUT URL.
      const tokenRes = await fetch('/api/upload-url', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ filename: file.name }),
      });
      if (!tokenRes.ok) throw new Error('Could not get an upload URL');
      const { uploadUrl, publicUrl, originalFilename } = await tokenRes.json();

      // 2. Upload the bytes straight to storage — bypasses the function body.
      const putRes = await fetch(uploadUrl, {
        method: 'PUT',
        headers: { 'Content-Type': 'application/pdf' },
        body: file,
      });
      if (!putRes.ok) throw new Error('Upload failed');

      // 3. Ask our verification route to analyse the stored URL.
      const verifyRes = await fetch('/api/verify', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ url: publicUrl, originalFilename }),
      });

      if (!verifyRes.ok) {
        const body = await verifyRes.json();
        throw new Error(body.error ?? 'Verification failed');
      }

      setVerdict((await verifyRes.json()) as VerdictResponse);
    } catch (err) {
      setError(err instanceof Error ? err.message : 'Something went wrong');
    } finally {
      setBusy(false);
    }
  }

  return (
    <form onSubmit={handleSubmit}>
      <input type="file" name="pdf" accept="application/pdf" required />
      <button type="submit" disabled={busy}>
        {busy ? 'Analysing…' : 'Verify document'}
      </button>
      {error && <p role="alert">{error}</p>}
      {verdict && <VerdictBadge verdict={verdict} />}
    </form>
  );
}

VerdictBadge is the result UI; we build it in Step 5.
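The MIME type and extension checked in handleSubmit are typically inferred from the file name, so a renamed .docx can pass both. As an optional extra pre-check, this minimal sketch looks for the %PDF- signature at the start of the file. The helper names and the SliceableFile type are illustrative and not part of the component above. Like the other client-side checks, it only catches honest mistakes.

```typescript
// Structural type so the sketch compiles without DOM typings; a browser
// File (or Blob) satisfies it.
interface SliceableFile {
  slice(start: number, end: number): { arrayBuffer(): Promise<ArrayBuffer> };
}

// Hypothetical helper: true when the bytes begin with the "%PDF-" signature.
function looksLikePdf(head: Uint8Array): boolean {
  const signature = [0x25, 0x50, 0x44, 0x46, 0x2d]; // "%PDF-"
  return head.length >= signature.length && signature.every((byte, i) => head[i] === byte);
}

// Reads only the first five bytes, so it stays cheap even for a 10 MB file.
async function fileLooksLikePdf(file: SliceableFile): Promise<boolean> {
  const head = new Uint8Array(await file.slice(0, 5).arrayBuffer());
  return looksLikePdf(head);
}
```

In the form, this would sit next to the size and type checks: if fileLooksLikePdf(file) resolves to false, show the same "Only PDF files are accepted." message and return.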

Step 3: The verification route handler

This is the heart of the integration. The route receives the stored URL, calls POST /analyze, then GET /result/{id}, and maps the three possible verdicts onto an HTTP response your client can render. Note the explicit runtime = 'nodejs' and the fact that the API key is read from process.env — it never reaches the browser.

The HTPBE? API is synchronous. POST /analyze blocks until the verdict is computed and returns 201 Created with the check ID plus a Location header. There is no queue to poll and no webhook to register — by the time analyze returns, the full result is already retrievable. Two calls, one verdict.

app/api/verify/route.ts:

import { NextRequest, NextResponse } from 'next/server';

export const runtime = 'nodejs';
export const maxDuration = 30; // allow time for download + analysis

const HTPBE_BASE = 'https://api.htpbe.tech/v1';

export async function POST(request: NextRequest): Promise<NextResponse> {
  const { url, originalFilename } = (await request.json()) as {
    url?: string;
    originalFilename?: string;
  };

  if (!url) {
    return NextResponse.json({ error: 'url is required' }, { status: 400 });
  }

  const auth = { Authorization: `Bearer ${process.env.HTPBE_API_KEY}` };

  // Step 1: submit the stored URL for analysis.
  const analyzeRes = await fetch(`${HTPBE_BASE}/analyze`, {
    method: 'POST',
    headers: { ...auth, 'Content-Type': 'application/json' },
    body: JSON.stringify({ url, original_filename: originalFilename }),
  });

  if (!analyzeRes.ok) {
    return mapError(analyzeRes);
  }

  const { id } = (await analyzeRes.json()) as { id: string };

  // Step 2: retrieve the full flat result.
  const resultRes = await fetch(`${HTPBE_BASE}/result/${id}`, { headers: auth });
  if (!resultRes.ok) {
    return mapError(resultRes);
  }

  const result = await resultRes.json();

  // Hand the client only what the UI needs — never the raw API payload.
  return NextResponse.json({
    status: result.status,
    statusReason: result.status_reason,
    markers: result.modification_markers,
    producer: result.producer,
  });
}

The route hands the browser a trimmed object, not the full forensic payload. Leaking every field (object counts, xref geometry, producer strings) to a public client gives a fraudster a feedback loop for tuning their forgery. Send the verdict and the marker ids; keep the rest server-side for your audit log.
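One gap in the handler as written is that it forwards whatever url the client posts, so anyone who can reach /api/verify can spend your credits on arbitrary URLs. Here is a minimal sketch of a guard, assuming a hypothetical helper named isAllowedStorageUrl and that uploads always live under the incoming/ prefix of the R2_PUBLIC_URL base from Step 1.

```typescript
// Hypothetical guard: accept only https URLs on our own public storage
// origin, under the "incoming/" prefix that the upload-url route writes to.
function isAllowedStorageUrl(candidate: string, publicBase: string): boolean {
  let url: URL;
  let base: URL;
  try {
    url = new URL(candidate);
    base = new URL(publicBase);
  } catch {
    return false; // not an absolute URL at all
  }
  // Tolerate a trailing slash on the configured base.
  const basePath = base.pathname.replace(/\/$/, '');
  return (
    url.protocol === 'https:' &&
    url.origin === base.origin &&
    // URL parsing has already resolved ".." segments in pathname.
    url.pathname.startsWith(`${basePath}/incoming/`) &&
    url.pathname.endsWith('.pdf')
  );
}
```

In the verify route, this check would run right after the missing-url check and return a 400 when it fails, with process.env.R2_PUBLIC_URL passed as publicBase.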

Step 4: Error handling that matters

The API returns specific status codes you should not paper over. Two deserve explicit treatment in a Next.js handler because neither should be blindly retried: 402 will keep failing until billing is fixed, and 429 succeeds only after the Retry-After wait.

mapError, referenced above:

async function mapError(res: Response): Promise<NextResponse> {
  // Surface the Retry-After header to the client for 429.
  const retryAfter = res.headers.get('Retry-After');

  let body: { error?: string; code?: string } = {};
  try {
    body = await res.json();
  } catch {
    // non-JSON body — fall through with status only
  }

  switch (res.status) {
    case 402:
      // PAYMENT_REQUIRED: the key has no credit source. Monthly quota,
      // paid batch, and welcome credits are all exhausted, or there is no
      // active plan on a live key. Retrying will fail identically. This is a
      // billing event, not a user error, so log it for alerting and do not
      // show it raw.
      console.error('HTPBE returned 402 PAYMENT_REQUIRED: credits exhausted or no active plan');
      return NextResponse.json({ error: 'Verification temporarily unavailable' }, { status: 503 });

    case 422:
      // INVALID_PDF — the URL did not return a parseable PDF (corrupt,
      // encrypted, or not a PDF at all). This is the user's file, not your bug.
      return NextResponse.json(
        { error: 'That file is not a readable PDF. Try re-exporting it.' },
        { status: 422 }
      );

    case 413:
      return NextResponse.json({ error: 'PDF exceeds the 10 MB limit.' }, { status: 413 });

    case 429:
      // SERVER_AT_CAPACITY — server-wide concurrency, not per-key rate
      // limiting. The Retry-After header (seconds) tells you how long to wait.
      return NextResponse.json(
        { error: 'Server busy, retry shortly.' },
        { status: 429, headers: retryAfter ? { 'Retry-After': retryAfter } : {} }
      );

    case 401:
      // Your key is wrong or missing. This is a config error, never the
      // user's problem — log it loudly and hide it behind a generic 503.
      console.error('HTPBE_API_KEY invalid or missing');
      return NextResponse.json({ error: 'Verification temporarily unavailable' }, { status: 503 });

    default:
      return NextResponse.json(
        { error: body.error ?? 'Verification failed' },
        { status: res.status >= 500 ? 502 : res.status }
      );
  }
}

The two codes worth internalising:

  • 402 PAYMENT_REQUIRED — credits are universal at HTPBE?: a subscription’s monthly quota, a one-time top-up batch, and the welcome credits all draw from one pool. A 402 means all three are empty (or there is no active plan on a live key). It is a billing signal, not a per-request failure. Catch it, alert your team, top up at the pricing page. Never expose it to the end user and never retry it in a loop.
  • 429 SERVER_AT_CAPACITY — this is server-wide concurrency, not per-key throttling. The response carries a Retry-After header in seconds. Wait that long, then retry the same request. In a Next.js handler that is processing one document per request, the simplest correct behaviour is to forward the Retry-After to the client and let it re-submit; for batch jobs, respect it in a queue worker.
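For the batch case, a queue worker can wait out the header itself. The following is a minimal sketch under stated assumptions: the helper names, the 1-second fallback, the 30-second cap, and the three-attempt limit are all illustrative choices, not values documented by the API.

```typescript
// Minimal response shape, so the sketch works with fetch's Response or a stub.
interface RetryableResponse {
  status: number;
  headers: { get(name: string): string | null };
}

// Hypothetical helper: convert a Retry-After value in seconds into a bounded
// wait in milliseconds. Missing or non-numeric values (including the
// HTTP-date form) fall back to `fallbackMs`.
function retryDelayMs(retryAfter: string | null, fallbackMs = 1000, maxMs = 30000): number {
  const raw = retryAfter === null ? '' : retryAfter.trim();
  const seconds = raw === '' ? NaN : Number(raw);
  if (!Number.isFinite(seconds) || seconds < 0) return fallbackMs;
  return Math.min(seconds * 1000, maxMs);
}

// Retries only on 429, up to `maxAttempts` total calls. Every other status,
// including 402, is returned to the caller untouched.
async function fetchWith429Retry<T extends RetryableResponse>(
  doFetch: () => Promise<T>,
  maxAttempts = 3,
): Promise<T> {
  let res = await doFetch();
  for (let attempt = 1; attempt < maxAttempts && res.status === 429; attempt++) {
    const delay = retryDelayMs(res.headers.get('Retry-After'));
    await new Promise<void>((resolve) => setTimeout(resolve, delay));
    res = await doFetch();
  }
  return res;
}
```

A worker would wrap the analyze call as fetchWith429Retry(() => fetch(...)). Inside a request-scoped route handler the waits count against maxDuration, which is why forwarding the header to the client is the simpler choice there.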

For the full code-by-code breakdown — download_failed, invalid_url_format, the analysis-timeout variant of invalid_pdf — see the /analyze error reference.

Step 5: Rendering the three verdicts in the UI

The API returns exactly one of three verdicts. Each maps to a different action and a different message. The single most common integration mistake is treating inconclusive as an error — it is not. It is a finding.

app/components/VerdictBadge.tsx:

interface Props {
  verdict: {
    status: 'intact' | 'modified' | 'inconclusive';
    statusReason?: string;
    markers: string[];
    producer: string | null;
  };
}

export function VerdictBadge({ verdict }: Props) {
  switch (verdict.status) {
    case 'intact':
      return (
        <div data-status="intact">
          <strong>Intact.</strong> No post-creation modification detected.
        </div>
      );

    case 'modified':
      return (
        <div data-status="modified">
          <strong>Modified.</strong> This document was changed after it was
          created. Findings: {verdict.markers.join(', ')}.
        </div>
      );

    case 'inconclusive':
      // NOT a failure. The document was made with consumer software, an
      // online editor, an HTML renderer, or a scanner — so there is no
      // institutional "original" to verify integrity against.
      return (
        <div data-status="inconclusive">
          <strong>Inconclusive.</strong> Created with consumer software
          {verdict.producer ? ` (${verdict.producer})` : ''}. Integrity cannot
          be verified. Route to manual review if this claims to be from a bank
          or institution.
        </div>
      );
  }
}

The inconclusive verdict means the document was created with consumer software (Microsoft Word, Excel, LibreOffice), an online editor (iLovePDF, Smallpdf), an HTML-to-PDF renderer, or a scanner — rather than an institutional document-management system. The status_reason field tells you exactly which, and the distinction matters: a payslip that returns inconclusive with reason html_renderer_origin is a different risk than a personal cover letter that returns consumer_software_origin. For documents that claim institutional origin — a bank statement, a diploma, a signed contract — treat inconclusive the same way you treat modified: do not auto-accept, route to a human. For a fuller explanation, see what “inconclusive” really means.
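That routing rule fits in one small function. This is a minimal sketch, assuming a hypothetical routeVerdict helper and a claimsInstitutionalOrigin flag that your own application supplies (for example, from the document type the user selected). Auto-accepting a non-institutional inconclusive is an assumption of this sketch, not a recommendation from the API.

```typescript
type Verdict = 'intact' | 'modified' | 'inconclusive';
type Route = 'auto_accept' | 'manual_review';

// Hypothetical policy function mapping a verdict to a workflow step.
function routeVerdict(status: Verdict, claimsInstitutionalOrigin: boolean): Route {
  switch (status) {
    case 'intact':
      return 'auto_accept';
    case 'modified':
      return 'manual_review';
    case 'inconclusive':
      // For a bank statement, diploma, or contract, inconclusive is handled
      // exactly like modified. The other branch is this sketch's assumption;
      // your own risk policy may send everything to review.
      return claimsInstitutionalOrigin ? 'manual_review' : 'auto_accept';
  }
}
```

Keeping the policy in one function means the UI, the audit log, and any downstream automation all make the same decision for the same verdict.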

The modification_markers array carries stable HTPBE_* ids — for example HTPBE_SIGNATURE_REMOVED, HTPBE_DATES_DISAGREE, HTPBE_POST_SIGNATURE_EDIT, HTPBE_MULTIPLE_REVISION_LAYERS. These are part of the public contract and never change once shipped. Branch your logic on the id; render the human-readable label from the dictionary published on htpbe.tech/how.
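A minimal sketch of that split between logic and display follows. The four ids are the ones named above. The label strings are placeholders invented for illustration, not the published dictionary, and the helper names are hypothetical.

```typescript
// Placeholder labels for the four ids named in the text. The real wording
// lives in the published dictionary; these strings are assumptions.
const MARKER_LABELS = new Map<string, string>([
  ['HTPBE_SIGNATURE_REMOVED', 'A digital signature was removed'],
  ['HTPBE_DATES_DISAGREE', 'Creation and modification dates disagree'],
  ['HTPBE_POST_SIGNATURE_EDIT', 'Edited after signing'],
  ['HTPBE_MULTIPLE_REVISION_LAYERS', 'Multiple revision layers'],
]);

// Render a label, falling back to the raw id for any marker this build of
// the app does not know yet.
function markerLabel(id: string): string {
  return MARKER_LABELS.get(id) ?? id;
}

// Branch on ids, never on label text.
const SIGNATURE_MARKERS = new Set(['HTPBE_SIGNATURE_REMOVED', 'HTPBE_POST_SIGNATURE_EDIT']);

function hasSignatureFinding(markers: string[]): boolean {
  return markers.some((id) => SIGNATURE_MARKERS.has(id));
}
```

The fallback to the raw id matters: a marker added to the API after your last deploy still renders, and your branching logic ignores it instead of crashing.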

Step 6: Server Actions instead of route handlers

If you prefer Server Actions over route handlers, the verification logic moves into an action and the upload stays on the client. The body-size constraint is the same, so the presigned upload pattern does not change — only the server entry point does.

app/actions/verify.ts:

'use server';

const HTPBE_BASE = 'https://api.htpbe.tech/v1';

export async function verifyDocument(url: string, originalFilename?: string) {
  const auth = { Authorization: `Bearer ${process.env.HTPBE_API_KEY}` };

  const analyzeRes = await fetch(`${HTPBE_BASE}/analyze`, {
    method: 'POST',
    headers: { ...auth, 'Content-Type': 'application/json' },
    body: JSON.stringify({ url, original_filename: originalFilename }),
  });

  if (!analyzeRes.ok) {
    throw new Error(`Analysis failed: ${analyzeRes.status}`);
  }

  const { id } = (await analyzeRes.json()) as { id: string };

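  const resultRes = await fetch(`${HTPBE_BASE}/result/${id}`, { headers: auth });
  if (!resultRes.ok) {
    throw new Error(`Result fetch failed: ${resultRes.status}`);
  }
  const result = await resultRes.json();

  return {
    status: result.status as 'intact' | 'modified' | 'inconclusive',
    statusReason: result.status_reason as string | undefined,
    markers: result.modification_markers as string[],
    // VerdictBadge expects producer, so return it here as well.
    producer: (result.producer ?? null) as string | null,
  };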
}

Server Actions run on the server by default, so the API key stays safe automatically. The one caveat: a Server Action cannot receive a 6 MB file in its arguments any more than a route handler can in its body. The file still goes to storage via the presigned PUT first; the action receives only the URL. Do not be tempted to pass a FormData blob to the action for large PDFs.

Step 7: Testing without burning quota

Every HTPBE? plan, including the free tier, ships a test key. Test keys accept a fixed set of mock URLs and return deterministic verdicts without consuming credits or downloading real files — the same idea as Stripe test cards. Point your integration tests at them so every verdict branch is covered in CI.

import { describe, it, expect } from 'vitest';

const BASE = 'https://api.htpbe.tech/v1';
const TEST = `${BASE}/test`;
const auth = { Authorization: `Bearer ${process.env.HTPBE_TEST_API_KEY}` };

async function verify(testUrl: string) {
  const a = await fetch(`${BASE}/analyze`, {
    method: 'POST',
    headers: { ...auth, 'Content-Type': 'application/json' },
    body: JSON.stringify({ url: testUrl }),
  });
  const { id } = await a.json();
  const r = await fetch(`${BASE}/result/${id}`, { headers: auth });
  return r.json();
}

describe('verify route verdicts', () => {
  it('returns intact for a clean document', async () => {
    const res = await verify(`${TEST}/clean.pdf`);
    expect(res.status).toBe('intact');
    expect(res.modification_markers).toHaveLength(0);
  });

  it('returns modified with markers', async () => {
    const res = await verify(`${TEST}/modified-high.pdf`);
    expect(res.status).toBe('modified');
    expect(res.modification_markers.length).toBeGreaterThan(0);
  });

  it('returns inconclusive with a status_reason', async () => {
    const res = await verify(`${TEST}/inconclusive.pdf`);
    expect(res.status).toBe('inconclusive');
    expect(res.status_reason).toBeDefined();
  });
});

The available mock URLs follow the pattern https://api.htpbe.tech/v1/test/{name}.pdf: clean.pdf (intact), modified-high.pdf (modified), signature-removed.pdf (modified), inconclusive.pdf (inconclusive), and several more. A live key, by contrast, accepts any public URL and draws from your credit pool. Keep the two keys in separate environment files (.env.test vs .env.local) and never commit either.

Security and storage hygiene

The pattern above hands the browser a presigned URL that writes into your bucket, then sends a public-ish URL to the API. Both of those are surfaces that need hardening before this goes live. Six concrete pieces of work, none of which the route handlers above do for you:

Tight TTLs on both directions

The PUT URL minted in Step 1 already expires in 300 seconds (five minutes), which is the right shape. Pair it with a separate, short-lived GET URL for the API to fetch the file: a 5-to-10-minute presigned GET (@aws-sdk/s3-request-presigner with GetObjectCommand) instead of a long-lived public URL. The API only needs the file long enough to download and analyse it, usually under 10 seconds, so 10 minutes is a generous safety margin. After expiry, the URL stops working even if it leaks into a log.

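import { GetObjectCommand } from '@aws-sdk/client-s3';
import { getSignedUrl } from '@aws-sdk/s3-request-presigner';

// In the verify route, mint a fresh GET URL right before calling the API.
// `s3` is the client configured in Step 1; `key` is the object key that
// route generated (pass the key to the verify route instead of a public URL).
const downloadUrl = await getSignedUrl(
  s3,
  new GetObjectCommand({ Bucket: process.env.R2_BUCKET!, Key: key }),
  { expiresIn: 600 } // 10 minutes
);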

Bucket lifecycle policy

A presigned URL with a tight TTL stops working — the bytes are still in the bucket. Configure a lifecycle rule on the bucket that auto-deletes any object older than a day (or shorter, depending on your retention policy). On Cloudflare R2 it is in the dashboard’s Object Lifecycle; on S3 it is PutBucketLifecycleConfiguration. The cheapest privacy story is "we cannot leak what we no longer have."
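If you would rather set the rule from code than from a dashboard, this minimal sketch shows the input shape for PutBucketLifecycleConfigurationCommand from @aws-sdk/client-s3. The bucket name and rule ID are placeholders, the incoming/ prefix matches the keys from Step 1, and the sketch assumes your storage provider implements this S3 call.

```typescript
// Input for PutBucketLifecycleConfigurationCommand. 'your-bucket' and the
// rule ID are placeholders; one day is the retention suggested in the text.
const lifecycleInput = {
  Bucket: 'your-bucket',
  LifecycleConfiguration: {
    Rules: [
      {
        ID: 'expire-incoming-uploads',
        Status: 'Enabled' as const,
        Filter: { Prefix: 'incoming/' }, // only touch uploaded PDFs
        Expiration: { Days: 1 }, // delete objects one day after creation
      },
    ],
  },
};
```

Applying it is a single call, s3.send(new PutBucketLifecycleConfigurationCommand(lifecycleInput)), run once from a setup script and not from a route handler.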

Abuse protection on the upload-url route

/api/upload-url is unauthenticated as written. A bot can hit it in a loop and stuff your bucket with junk, even if no presigned URL ever leaks. Three controls cover most of this:

  • Authenticate the route. Require a user session, an API key, or a CAPTCHA challenge before the route mints a presigned URL. The example above intentionally has no auth so the snippet stays minimal — do not ship it that way.
  • Rate-limit per IP and per session. Even with auth, a compromised session can flood the route. Limit to a sensible cap (5 URLs per minute per session is generous).
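  • Constrain what the presigned URL can do. Pass ContentLength into PutObjectCommand to declare the expected upload size (PutObjectCommand has no ContentLengthRange parameter; in S3, content-length-range is a condition of a presigned POST policy, not of a presigned PUT), and pin ContentType: 'application/pdf' so the browser must declare the upload as a PDF (the bucket policy can additionally enforce this). Neither setting inspects the bytes themselves.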
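The rate-limit control can be sketched as a fixed-window counter. This is a minimal in-memory version with a hypothetical class name, using the 5-per-minute cap mentioned above. On serverless platforms each function instance has its own memory, so a production limiter would need a shared store; treat this as an illustration of the logic only.

```typescript
// Hypothetical in-memory fixed-window limiter: at most `limit` hits per
// `windowMs` for each key (a session id or an IP address).
class FixedWindowLimiter {
  private readonly windows = new Map<string, { start: number; count: number }>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true when the request is allowed. `now` is injectable for tests.
  allow(key: string, now: number = Date.now()): boolean {
    const current = this.windows.get(key);
    if (!current || now - current.start >= this.windowMs) {
      // First hit for this key, or the previous window has ended.
      this.windows.set(key, { start: now, count: 1 });
      return true;
    }
    if (current.count >= this.limit) return false;
    current.count += 1;
    return true;
  }
}

// 5 presigned URLs per minute per session.
const uploadUrlLimiter = new FixedWindowLimiter(5, 60_000);
```

In the upload-url route, a false result from uploadUrlLimiter.allow(sessionId) would translate into a 429 response before any presigned URL is minted.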

CORS on the bucket

The browser PUT from Step 2 needs the bucket to accept cross-origin requests from your app’s domain. On R2 (and S3 with the same shape):

[
  {
    "AllowedOrigins": ["https://your-app.example.com"],
    "AllowedMethods": ["PUT", "GET"],
    "AllowedHeaders": ["Content-Type"],
    "ExposeHeaders": ["ETag"],
    "MaxAgeSeconds": 3600
  }
]

Pin AllowedOrigins to your real origin, not * — a wildcard turns the bucket into a free upload endpoint for any site.

Environment variables

The route handlers reference half a dozen environment variables; collect them in a single .env.local (and a matching list in your deploy secrets):

HTPBE_API_KEY=htpbe_live_...
R2_ENDPOINT=https://<account>.r2.cloudflarestorage.com
R2_BUCKET=your-bucket
R2_PUBLIC_URL=https://your-cdn-or-bucket-domain
R2_ACCESS_KEY_ID=...
R2_SECRET_ACCESS_KEY=...

The keys for R2 should be scoped to that bucket only (read + write, nothing else). The HTPBE_API_KEY is server-side only — never NEXT_PUBLIC_*.
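A missing variable otherwise surfaces as a confusing 401 or a broken presigned URL at request time. Here is a minimal sketch of a startup check, assuming hypothetical helpers named missingEnv and assertEnv. The variable list is the six names shown above.

```typescript
// The six variables the route handlers in this guide read.
const REQUIRED_ENV = [
  'HTPBE_API_KEY',
  'R2_ENDPOINT',
  'R2_BUCKET',
  'R2_PUBLIC_URL',
  'R2_ACCESS_KEY_ID',
  'R2_SECRET_ACCESS_KEY',
] as const;

// Hypothetical helper: names that are unset or empty in the given environment.
function missingEnv(env: Record<string, string | undefined>): string[] {
  return REQUIRED_ENV.filter((name) => !env[name]);
}

// Throws one error naming every gap, instead of failing on the first request.
function assertEnv(env: Record<string, string | undefined>): void {
  const missing = missingEnv(env);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
}
```

Calling assertEnv(process.env) from a server-only module that the route handlers import makes a misconfigured deploy fail loudly on the first invocation. The error lists names only, never values.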

Audit trail without leaking the document

Storing the verdict plus the HTPBE? check_id against the entity (loan application, claim, candidate) is the right audit shape — GET /result/{id} is a permanent, re-readable record of the forensic payload. What you should not store, and especially not log, is the full presigned URL or the bucket key in plaintext. Both expire on their own, but they have non-zero lifetime; a leaked log line during that window hands an attacker the document.

function redactUrl(url: string): string {
  try {
    const parsed = new URL(url);
    return `${parsed.origin}/[redacted-key]`;
  } catch {
    return '[unparseable-url]';
  }
}

// In your log emitter:
log.info('verification submitted', {
  entityId,
  filename: originalFilename, // safe to log
  storageUrl: redactUrl(publicUrl), // never log the raw URL
  checkId: result.id,
  verdict: result.status, // 'intact' | 'modified' | 'inconclusive' — safe
});

Treat presigned URLs the same way you treat session tokens: out of logs, out of metrics labels, out of error messages. The entity id, filename, and verdict are the right fields for the audit row.

What this integration does not catch

Being honest about the boundary keeps your automated decisions sound. Structural PDF analysis detects changes made after a file was created. It cannot see two classes of fraud:

  • Born-fake documents. If someone opens Word, types a false salary figure, and exports to PDF once, the file was never modified post-creation — it is structurally intact. The lie was authored, not edited in. This is why a confident intact on a document that claims institutional origin still deserves a second look if the content is implausible.
  • Recreated originals. An attacker who rebuilds a document from scratch in the same software the issuer used, matching the metadata fields, produces output that structural analysis cannot distinguish from the real thing. This takes real effort and is rare, but it is possible.

Both are reasons to treat HTPBE? as one layer — the structural layer — alongside content checks (amount validation, issuer lookups, sender authentication). It closes the gap that KYC and identity platforms leave open: they verify who submitted a document, not whether the file was altered after a bank produced it. For the layered model, see KYC vs. document forensics.

Shipping checklist for Next.js

Before this goes live in your App Router app:

  • runtime = 'nodejs' on every route handler that talks to the API or storage SDK — never edge.
  • maxDuration raised on the verify route (analysis takes 2–5 seconds for typical PDFs, up to 15 for complex ones) so the function does not time out mid-analysis.
  • Presigned upload for any file over a few hundred KB — never stream large PDFs through the function body.
  • API key server-side only — read from process.env in route handlers or Server Actions, never NEXT_PUBLIC_*.
  • 402 and 429 handled deliberately — 402 is a billing alert, 429 respects Retry-After. Neither is retried blindly.
  • inconclusive routed, not failed — for documents claiming institutional origin, treat it like modified.

The integration surface is intentionally small: two API calls, three verdicts, a handful of error codes. The engineering that matters is on the Next.js side — keeping bytes off the function body, keeping the key off the client, hardening the upload and storage surfaces, and routing the verdict correctly. Run the Step 7 tests against the mock URLs with a test key before pointing a live key at real documents. The full API reference has every field and error code.

