AI-Generated Document Fraud Detection

AI now writes convincing fake pay stubs, bank statements, and invoices on demand — perfect on screen, indistinguishable from real exports to the human eye. HTPBE? doesn’t grade the prose. It reads the PDF’s binary structure and flags the seams where AI-generated content was rasterized, stitched, or stamped into the file.

~3 sec per document
50 forensic checks
From $15 per month
1,500 docs / month on Pro
Scope

HTPBE? is a structural PDF forensics tool, not an AI-content classifier. We detect the assembly markers AI workflows leave in the PDF container — incremental updates after a real export, AI-rasterized pages mixed with programmatic content, online-editor producers, design-tool ingredient leaks, mass template-forgery fingerprints, modifications after a digital signature.

What we do not promise: reading a PDF and saying “this was written by ChatGPT.” If a fraudster generates a PDF entirely from scratch with ReportLab, PDFKit, or Puppeteer using a stolen logo and a fabricated IBAN, no structural tool flags that — including ours. That is content forgery, a different product category (Inscribe, Resistant AI, GPTZero). We sit alongside content-AI detectors as the structural layer they don’t analyze.

The problem

AI didn’t invent document fraud. It scaled the assembly line

Until 2023, fabricating a convincing pay stub or bank statement required either a real export to edit or a designer’s hour. Generative AI collapsed that to seconds. A fraud-ops team that saw a dozen forged documents a quarter now sees them weekly — and the rendered output is cleaner than what a human editor produced.

But the rendering layer is not where forensic evidence lives. AI generates pixels; PDFs are containers of objects, fonts, content streams, and cross-reference tables. To put AI-generated content into a PDF that fools a human, the fraudster has to either rebuild a real institutional file (leaving xref and producer traces) or stitch AI output into a real export (leaving page-assembly and font-subset traces). HTPBE? reads those traces.

For risk and fraud-ops teams, the AI-fraud question isn’t “was this written by a model?” — that’s a content question. The operative question is “was this PDF assembled by a human or a tool, after the real document was generated?” That’s the structural question, and it’s answerable in under three seconds per file.

AI-fraud patterns we actually catch

  • AI-generated text printed and rebuilt as PDF in an online editor
  • AI-rasterized pages swapped into a real bank statement export
  • Design-tool rebuild of an institutional template (Illustrator, InDesign)
  • Mass template forgery — same fraud kit run repeatedly through AI
  • AI-generated overlay applied to a stolen template via Photoshop
  • AI modifications appended to an already-signed document

What this looks like

Document fraud in 2026: six concrete patterns

Six real fraud mechanics we catch at the structural PDF layer.

01  AI-generated text printed and rebuilt as PDF in an online editor
02  AI-rasterized pages swapped into a real bank statement export
03  Design-tool rebuild of an institutional template (Illustrator, InDesign)
04  Mass template forgery: same fraud kit run repeatedly through AI
05  AI-generated overlay applied to a stolen template via Photoshop
06  AI modifications appended to an already-signed document

50 layers: forensic checks per document
~3 sec: median analysis time, end to end
From $15: self-serve per month, no sales call

The detection gap

KYC platforms check the document. HTPBE? checks the file.

Two different checks — both matter.

KYC & identity platforms

Plaid · Persona · Alloy · Jumio

  • Is this a real bank statement template?
  • Does the account number match the identity?
  • Is the document format consistent with the issuing bank?

Detects fake documents. Does not detect edited real documents.

HTPBE? tamper detection API

Structural PDF integrity

  • Was this specific PDF file modified after it was generated?
  • Do metadata timestamps match the file structure?
  • Were digital signatures valid at the time of signing?

Catches edits invisible to visual review and template checks.

Results in under 3 seconds · 30 to 1,500+ documents/month · From $15/mo

What HTPBE? checks

What the API detects in AI-assembled PDFs

Six structural patterns that betray AI assembly — results in under 3 seconds

Hybrid raster scan with programmatic content

AI-generated images replacing pages in a real PDF leave a detectable mix: some pages are pure raster (the AI output), others carry programmatic text and fonts (the original export). Y-flip patterns in the current transformation matrix (CTM) and DPI inconsistencies across pages are direct evidence of an AI-rasterized page injection.

Page assembly from different sources

Pages stitched together from independent rendering sessions carry distinct font subset prefixes, object-numbering breaks, and inconsistent producer traces. A bank statement where pages 1–2 are a real export and page 3 is AI-rebuilt fails the assembly-coherence check.
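
A minimal sketch of the font-subset part of that idea, in Python. It is an illustration only, not HTPBE?'s implementation: the function names are hypothetical, and the regex only sees font dictionaries stored uncompressed in the file (fonts inside compressed object streams are missed). Subset fonts are named with six uppercase letters, a plus sign, and the font name, so the same font appearing under several prefixes suggests it was embedded in more than one rendering session.

```python
import re
from collections import defaultdict
from typing import Dict, Set

# Subset fonts are named "ABCDEF+FontName": six uppercase letters, "+", real name.
SUBSET_FONT = re.compile(rb"/(?:BaseFont|FontName)\s*/([A-Z]{6})\+([^\s/<>\[\]()]+)")


def subset_prefixes(pdf_bytes: bytes) -> Dict[str, Set[str]]:
    """Map each subset font name to the set of prefixes it appears under."""
    found: Dict[str, Set[str]] = defaultdict(set)
    for match in SUBSET_FONT.finditer(pdf_bytes):
        prefix = match.group(1).decode("ascii")
        name = match.group(2).decode("latin-1")
        found[name].add(prefix)
    return dict(found)


def fonts_with_multiple_subsets(pdf_bytes: bytes) -> Dict[str, Set[str]]:
    """Keep only fonts that were embedded under more than one subset prefix."""
    return {
        name: prefixes
        for name, prefixes in subset_prefixes(pdf_bytes).items()
        if len(prefixes) > 1
    }
```

Legitimately merged documents can also carry several subsets of one font, so a hit here is a reason to look closer, not a verdict on its own.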

Incremental updates after AI processing

When AI content is pasted into a real PDF and re-saved through an online editor, the editor typically appends an incremental update to the file: the original cross-reference table stays in place and a new one is written after it. More than one xref table in a document that should be a single export is the most common AI-fraud signature.
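
As a rough illustration of what "multiple xref tables" means at the byte level, the Python sketch below counts the `startxref ... %%EOF` trailers in a file. Each saved revision normally ends with one. The function names are hypothetical and this is not HTPBE?'s actual check; in particular, linearized ("fast web view") PDFs can contain two trailers after a single save, so the count is a hint rather than proof.

```python
import re
from typing import List

# Every revision of a PDF ends with: startxref <offset> %%EOF
STARTXREF = re.compile(rb"startxref\s+(\d+)\s+%%EOF")


def xref_offsets(pdf_bytes: bytes) -> List[int]:
    """Return the xref offset declared by each trailer, in file order."""
    return [int(match.group(1)) for match in STARTXREF.finditer(pdf_bytes)]


def looks_incrementally_updated(pdf_bytes: bytes) -> bool:
    """True when the file carries more than one startxref/%%EOF trailer."""
    return len(xref_offsets(pdf_bytes)) > 1
```

Usage is a single read: `looks_incrementally_updated(open("statement.pdf", "rb").read())`, where `statement.pdf` stands in for whatever file is under review.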

Known-editor producer signatures

AI-generated content rarely lands in a PDF directly — it goes through a tool. iLovePDF, Smallpdf, PDF24, online editors, and design suites leave recognizable producer fields. An institutional document carrying a consumer-tool producer was processed after the real export.
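
A hedged Python sketch of a producer check follows. The watch list is hypothetical, built only from the tool names mentioned above rather than from the exact strings those tools write, and the regex reads only plain literal `/Producer (...)` entries. Producers stored as hex strings, in UTF-16, in compressed object streams, or only in XMP would need a real PDF parser.

```python
import re
from typing import List

# Hypothetical watch list: lowercase fragments of the tool names named in the text.
CONSUMER_TOOL_MARKERS = ("ilovepdf", "smallpdf", "pdf24")

# Literal-string form of the Info dictionary entry: /Producer (some text)
PRODUCER = re.compile(rb"/Producer\s*\(((?:\\.|[^\\()])*)\)")


def producers(pdf_bytes: bytes) -> List[str]:
    """Return every literal /Producer value found in the raw file bytes."""
    return [m.group(1).decode("latin-1") for m in PRODUCER.finditer(pdf_bytes)]


def consumer_tool_hits(pdf_bytes: bytes) -> List[str]:
    """Return the producer values that contain a watch-listed tool name."""
    return [
        value
        for value in producers(pdf_bytes)
        if any(marker in value.lower() for marker in CONSUMER_TOOL_MARKERS)
    ]
```

An empty result does not clear a file: the producer field is free text and trivially editable, which is why it is only one signal among several.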

Design-tool ingredient leaks

Illustrator, Photoshop, and InDesign rebuilds of institutional templates leak the operator’s local source-PDF path through xmpMM:Ingredients metadata — a fingerprint of an AI-assisted prepress workflow on a document that should have come straight from a banking system.
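
To make the ingredient leak concrete, here is a small Python sketch that pulls `stRef:filePath` values out of any `xmpMM:Ingredients` block in the raw bytes. It assumes the XMP packet is stored uncompressed (commonly the case, but not guaranteed) and uses regexes instead of a full RDF parser. The function name is hypothetical and this is not a description of HTPBE?'s own parser.

```python
import re
from typing import List

INGREDIENTS = re.compile(rb"<xmpMM:Ingredients>(.*?)</xmpMM:Ingredients>", re.DOTALL)

# filePath can appear as a child element or as an attribute on rdf:li.
FILE_PATH = re.compile(
    rb'<stRef:filePath>(.*?)</stRef:filePath>|stRef:filePath="(.*?)"', re.DOTALL
)


def ingredient_paths(pdf_bytes: bytes) -> List[str]:
    """Return source-file paths recorded under xmpMM:Ingredients."""
    paths: List[str] = []
    for block in INGREDIENTS.finditer(pdf_bytes):
        for match in FILE_PATH.finditer(block.group(1)):
            raw = match.group(1) if match.group(1) is not None else match.group(2)
            paths.append(raw.decode("utf-8", errors="replace"))
    return paths
```

A path that points at a desktop folder on someone's workstation is hard to explain on a statement that supposedly came straight from a banking system.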

Modifications after digital signature

AI-generated content appended to an already-signed contract or institutional document triggers the highest-confidence verdict in the system: “certain.” The signature acts as a structural notarization, and any content added after it is mathematically detectable.
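
One way to picture why this is detectable: a PDF signature records a `/ByteRange` of four integers, and the second range normally runs to the end of the file as it existed at signing time. The Python sketch below, with a hypothetical function name, reports how many bytes sit beyond each signed range. It does not validate the cryptographic signature, and trailing bytes can be legitimate (a later signature, or validation data added for long-term archiving), so treat it as an illustration of the principle rather than the product's check.

```python
import re
from typing import List

# /ByteRange [start1 length1 start2 length2]
BYTE_RANGE = re.compile(rb"/ByteRange\s*\[\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*\]")


def bytes_after_signature(pdf_bytes: bytes) -> List[int]:
    """For each signature, count the bytes that lie beyond its signed range."""
    trailing: List[int] = []
    for match in BYTE_RANGE.finditer(pdf_bytes):
        _start1, _length1, start2, length2 = (int(g) for g in match.groups())
        signed_end = start2 + length2
        trailing.append(len(pdf_bytes) - signed_end)
    return trailing
```

A value of 0 means the signature covered the whole file; a positive value means something was appended after that signature was applied.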

Integrate in minutes

Integrate AI-fraud detection in any stack

Two API calls — submit the suspicious PDF URL, read the structural verdict. Copy-paste examples for cURL, JavaScript, Python, PHP, Go, and Ruby.

bash
# curl is preinstalled on macOS and most Linux distributions

# Step 1: Submit PDF for analysis
curl -X POST https://api.htpbe.tech/v1/analyze \
  -H "Authorization: Bearer htpbe_live_..." \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com/document.pdf"}'
# Returns: {"id":"3f9c8b7a-2e1d-4c5f-9b8e-7a6d5c4b3a21"}

# Step 2: Retrieve the result (jq prints only its status field)
ID="3f9c8b7a-2e1d-4c5f-9b8e-7a6d5c4b3a21"
curl -s "https://api.htpbe.tech/v1/result/$ID" \
  -H "Authorization: Bearer htpbe_live_..." \
  | jq '.status'
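
The same two calls can be sketched in Python with only the standard library. The endpoint paths, the `url` request field, and the `id` response field are taken from the cURL example above; the helper names are hypothetical, and whether a result is ready immediately or needs polling is not stated here, so this sketch makes a single attempt.

```python
import json
import urllib.request

API_BASE = "https://api.htpbe.tech/v1"  # base URL as shown in the cURL example


def build_submit_request(api_key: str, pdf_url: str) -> urllib.request.Request:
    """Step 1: POST the PDF's URL for analysis."""
    body = json.dumps({"url": pdf_url}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/analyze",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def build_result_request(api_key: str, analysis_id: str) -> urllib.request.Request:
    """Step 2: GET the result for the id returned by step 1."""
    return urllib.request.Request(
        f"{API_BASE}/result/{analysis_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def fetch_json(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def check_pdf(api_key: str, pdf_url: str) -> dict:
    """Submit a PDF URL, then fetch its result once (no polling or retries)."""
    submitted = fetch_json(build_submit_request(api_key, pdf_url))
    return fetch_json(build_result_request(api_key, submitted["id"]))
```

Calling `check_pdf(key, "https://example.com/document.pdf")` with a real key returns the decoded result as a dict; production code would add error handling for HTTP failures.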

Pricing

Self-serve plans, no sales call

All plans include the same forensic checks. Pick the quota that matches your monthly document volume.

Starter (manual): $15/mo, 30 checks/mo. Manual spot-checks and integration testing.

Growth (most common): $149/mo, 350 checks/mo. Active document processing pipelines.

Pro (high volume): $499/mo, 1,500 checks/mo. High-volume automation and API integrations.

Enterprise (unlimited, on-premise available): see full pricing.

API key on signup. Free test environment on every plan. No card required.

Customer Stories

Teams that stopped document fraud

Compliance, finance, and risk teams use HTPBE? to catch manipulated PDFs before they become costly mistakes.

“Caught an invoice where the total had been changed by less than a thousand dollars. Without this I would have approved it without a second look.”
Sarah M., AP Manager, United States

“We had three applicants in the same week with bank statements that looked completely fine. Two of them were flagged as modified. You simply cannot see this by reading the document; it is in the file structure.”
Lars V., Risk Analyst, Online Lending, Netherlands

“Salary slips were coming with altered figures. We identified two problematic files before the placement was finalised.”
Priya K., HR Operations Lead, India

“Since we started checking documents this way, we stopped two applications early in the process that would have been very difficult to reverse later.”
Julien R., Fraud Analyst, Fintech, France

“Some applicants were sending PDFs that looked authentic but had been edited in ways not visible to the eye. We now ask for checked originals when something is flagged. Already saved us from a few bad decisions.”
Marta S., Compliance Coordinator, Spain

“One invoice was caught because there was a mismatch between the document dates and structure. That particular case would have cost us significantly.”
Tariq A., Finance Manager, United Arab Emirates

FAQ

Frequently asked questions

Does HTPBE? detect text written by ChatGPT or Claude?

No, and that’s deliberate. Text classifiers (GPTZero, Originality.ai, Inscribe) read the prose and decide whether it looks AI-written. We read the PDF’s binary container — xref tables, fonts, content streams, signatures — and decide whether the file was assembled normally or stitched together. The two are complementary layers of an AI-fraud defense, not substitutes. Run text classifiers on the content; run HTPBE? on the file.

What about a fully fabricated PDF generated from scratch by an AI agent (ReportLab, PDFKit, Puppeteer)?

“Born-synthetic” PDFs are out of structural scope. If a fraudster spins up a clean ReportLab pipeline that emits a single-pass PDF with consistent fonts, no incremental updates, and a plausible producer, no structural tool catches it — including ours. That’s a content-forgery problem (verify the IBAN exists, verify the employer ID, verify the logo against a registry) and the right tools are content-AI detectors and KYC platforms. Our scope is post-creation tampering and assembly artifacts.

How is AI document fraud different from traditional document fraud?

In structural terms, less than you’d expect. The same forensic markers fire: incremental updates, producer mismatch, page assembly, font-subset divergence, hybrid raster scans. AI changed the volume and polish of fraudulent documents, not the underlying way they get built. A pay stub with an AI-rasterized fake amount triggers the same hybrid-scan and page-assembly markers as a Photoshop-edited pay stub from 2018.

Does this replace content-AI detection tools like GPTZero or Inscribe?

No. They check the prose; we check the container. A bank statement that fools GPTZero on text can still betray itself in font subset prefixes; a PDF that’s structurally pristine can still be a born-synthetic forgery their tools would catch. AI-fraud defense is a stack — identity layer (Persona, Onfido, Plaid), content layer (GPTZero, Inscribe, Resistant AI), structural layer (HTPBE?). Run all three on the same upload.

Won’t this trigger false positives on documents legitimately processed by AI tools?

It depends on what “processed by AI” means. A bank that exported a real PDF and then routed it through an AI-OCR pipeline before sending it to the customer would leave post-export markers, but banks don’t do that; they ship the original export. The patterns we flag (hybrid raster scan, online-editor producer, page assembly from different sources) don’t occur in legitimate institutional document workflows. Genuine false positives come from a different class entirely: native design-tool exports like Lucidchart or SafetyCulture, not AI processing.

Can I test it on AI-generated PDFs we’ve already collected?

Yes. The free web tool (no signup) accepts any PDF up to 10 MB — upload your collected AI-fraud samples and see the verdict and named markers per file. For automated pipelines, the API uses the same engine. Fraud-ops teams typically run a backfill of recent flagged applications through HTPBE? as a first-week exercise to size the AI-fraud baseline before integrating into the live queue.

Secure your workflow

Create your account — API key on signup, free test environment on every plan.
From $15/mo. No sales call. Cancel any time.