AI-Generated Document Fraud Detection
AI now writes convincing fake pay stubs, bank statements, and invoices on demand — perfect on screen, indistinguishable from real exports to the human eye. HTPBE? doesn’t grade the prose. It reads the PDF’s binary structure and flags the seams where AI-generated content was rasterized, stitched, or stamped into the file.
HTPBE? is a structural PDF forensics tool, not an AI-content classifier. We detect the assembly markers AI workflows leave in the PDF container — incremental updates after a real export, AI-rasterized pages mixed with programmatic content, online-editor producers, design-tool ingredient leaks, mass template-forgery fingerprints, modifications after a digital signature.
What we do not promise: reading a PDF and saying “this was written by ChatGPT.” If a fraudster generates a PDF entirely from scratch with ReportLab, PDFKit, or Puppeteer using a stolen logo and a fabricated IBAN, no structural tool flags that — including ours. That is content forgery, a different product category (Inscribe, Resistant AI, GPTZero). We sit alongside content-AI detectors as the structural layer they don’t analyze.
The problem
AI didn’t invent document fraud. It scaled the assembly line
Until 2023, fabricating a convincing pay stub or bank statement required either a real export to edit or a designer’s hour. Generative AI collapsed that to seconds. A fraud-ops team that saw a dozen forged documents a quarter now sees them weekly — and the rendered output is cleaner than what a human editor produced.
But the rendering layer is not where forensic evidence lives. AI generates pixels; PDFs are containers of objects, fonts, content streams, and cross-reference tables. To put AI-generated content into a PDF that fools a human, the fraudster has to either rebuild a real institutional file (leaving xref and producer traces) or stitch AI output into a real export (leaving page-assembly and font-subset traces). HTPBE? reads those traces.
For risk and fraud-ops teams, the AI-fraud question isn’t “was this written by a model?” That is a content question. The operative question is “was this PDF reassembled, by a person or a tool, after the real document was generated?” That’s the structural question, and it’s answerable in under three seconds per file.
AI-fraud patterns we actually catch
- AI-generated text printed and rebuilt as PDF in an online editor
- AI-rasterized pages swapped into a real bank statement export
- Design-tool rebuild of an institutional template (Illustrator, InDesign)
- Mass template forgery — same fraud kit run repeatedly through AI
- AI-generated overlay applied to a stolen template via Photoshop
- AI modifications appended to an already-signed document
What this looks like
Document fraud in 2026 — three concrete patterns
Three real fraud mechanics we catch at the structural PDF layer.
The detection gap
KYC platforms check the document. HTPBE? checks the file.
Two different checks — both matter.
KYC & identity platforms
Plaid · Persona · Alloy · Jumio
- Is this a real bank statement template?
- Does the account number match the identity?
- Is the document format consistent with the issuing bank?
Detects fake documents. Does not detect edited real documents.
HTPBE? tamper detection API
Structural PDF integrity
- Was this specific PDF file modified after it was generated?
- Do metadata timestamps match the file structure?
- Were digital signatures valid at the time of signing?
What HTPBE? checks
What the API detects in AI-assembled PDFs
Six structural patterns that betray AI assembly — results in under 3 seconds
Hybrid raster scan with programmatic content
AI-generated images replacing pages in a real PDF leave a detectable mix: some pages are pure raster (the AI output), others carry programmatic text and fonts (the original export). Y-flip CTM patterns and DPI inconsistencies across pages are direct evidence of an AI-rasterized page injection.
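As a rough illustration of the hybrid check, the sketch below labels each page's content stream as raster-only or programmatic and reports whether a document mixes the two. It assumes the streams have already been decompressed by a PDF parser. `classify_page` and `is_hybrid` are hypothetical names, not part of the HTPBE? API, and the operator matching is deliberately naive: a `Do` operator can also draw a form XObject, and `BT` could appear inside a string literal.

```python
import re

# Naive operator matching on an already-decoded content stream (assumption).
TEXT_OP = re.compile(rb"(?<![A-Za-z])BT(?![A-Za-z])")    # begin-text operator
XOBJECT_OP = re.compile(rb"/[^\s/]+\s+Do(?![A-Za-z])")   # paint an XObject (often an image)


def classify_page(content: bytes) -> str:
    """Label one page as 'programmatic', 'raster-only', or 'empty'."""
    if TEXT_OP.search(content):
        return "programmatic"
    if XOBJECT_OP.search(content):
        return "raster-only"
    return "empty"


def is_hybrid(page_contents: list) -> bool:
    """True when raster-only pages sit next to pages that carry real text."""
    kinds = {classify_page(content) for content in page_contents}
    return "raster-only" in kinds and "programmatic" in kinds


text_page = b"BT /F1 12 Tf 72 720 Td (Balance: 1,204.50) Tj ET"
scan_page = b"q 612 0 0 -792 0 792 cm /Im0 Do Q"
print(is_hybrid([text_page, text_page, scan_page]))
```

A fully scanned document is raster-only on every page and would not trip this check; only the mix is treated as a signal here.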
Page assembly from different sources
Pages stitched together from independent rendering sessions carry distinct font subset prefixes, object-numbering breaks, and inconsistent producer traces. A bank statement where pages 1–2 are a real export and page 3 is AI-rebuilt fails the assembly-coherence check.
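One way to picture the font-subset part of this check: subset fonts are conventionally named with a six-letter uppercase tag, as in `ABCDEF+Helvetica`. The sketch below groups those tags by base font and page. It assumes the per-page font names were already extracted by a PDF parser, the function names are hypothetical, and the heuristic is not conclusive on its own, since some legitimate producers also emit several subsets of one font.

```python
import re
from collections import defaultdict

# Subset font names follow the pattern "ABCDEF+BaseFontName".
SUBSET_RE = re.compile(r"^([A-Z]{6})\+(.+)$")


def subset_prefixes_by_font(page_fonts: dict) -> dict:
    """Map base font -> subset prefix -> pages where that prefix appears."""
    grouped = defaultdict(lambda: defaultdict(list))
    for page, names in page_fonts.items():
        for name in names:
            match = SUBSET_RE.match(name)
            if match:  # non-subset fonts (e.g. "Courier") are ignored
                prefix, base = match.groups()
                grouped[base][prefix].append(page)
    return {base: dict(prefixes) for base, prefixes in grouped.items()}


def fonts_with_multiple_subsets(page_fonts: dict) -> list:
    """Base fonts embedded under more than one subset prefix."""
    grouped = subset_prefixes_by_font(page_fonts)
    return sorted(base for base, prefixes in grouped.items() if len(prefixes) > 1)


pages = {
    1: ["ABCDEF+Helvetica"],
    2: ["ABCDEF+Helvetica"],
    3: ["QWERTY+Helvetica", "Courier"],  # page 3 came from another session
}
print(fonts_with_multiple_subsets(pages))
```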
Incremental updates after AI processing
When AI content is pasted into a real PDF and re-saved through an online editor, the editor appends an incremental update to the file. The original cross-reference table stays in place and a new one is written after it. More than one xref table in a document that should be a single export is the most common AI-fraud signature.
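A minimal sketch of the idea, assuming the raw file bytes are available: each saved revision of a PDF ends with a `startxref` offset followed by `%%EOF`, so counting those trailers approximates the number of revisions. The function names are hypothetical, and the count is only a heuristic, because linearized ("fast web view") PDFs can contain two such markers after a single save.

```python
import re

# Each revision of a PDF ends with "startxref <offset> %%EOF".
REVISION_END = re.compile(rb"startxref\s+\d+\s+%%EOF")


def count_revisions(pdf_bytes: bytes) -> int:
    """Count revision trailers found in the raw file bytes."""
    return len(REVISION_END.findall(pdf_bytes))


def has_incremental_update(pdf_bytes: bytes) -> bool:
    """True when the file appears to hold more than one revision."""
    return count_revisions(pdf_bytes) > 1


single = b"%PDF-1.7\n1 0 obj\n<<>>\nendobj\nxref\n0 1\ntrailer\n<<>>\nstartxref\n30\n%%EOF\n"
updated = single + b"2 0 obj\n<<>>\nendobj\nxref\n0 1\ntrailer\n<<>>\nstartxref\n120\n%%EOF\n"
print(count_revisions(single), count_revisions(updated))
```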
Known-editor producer signatures
AI-generated content rarely lands in a PDF directly — it goes through a tool. iLovePDF, Smallpdf, PDF24, online editors, and design suites leave recognizable producer fields. An institutional document carrying a consumer-tool producer was processed after the real export.
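The producer comparison can be sketched as follows, under the assumption that the Info dictionary is stored uncompressed and the `/Producer` value is a plain literal string (hex or UTF-16 encoded values would need a real parser). The watchlist holds only the tools named above; `extract_producer` and `matched_tool` are hypothetical helpers, not HTPBE? functions.

```python
import re

# Consumer tools named in the text; a real watchlist would be longer.
CONSUMER_TOOLS = ("ilovepdf", "smallpdf", "pdf24")

# Literal-string /Producer entry, allowing backslash-escaped characters.
PRODUCER_RE = re.compile(rb"/Producer\s*\(((?:\\.|[^\\)])*)\)")


def extract_producer(pdf_bytes: bytes):
    """Return the /Producer string, or None when it cannot be found."""
    match = PRODUCER_RE.search(pdf_bytes)
    return match.group(1).decode("latin-1") if match else None


def matched_tool(producer):
    """Return the watchlist entry contained in the producer, if any."""
    if not producer:
        return None
    lowered = producer.lower()
    for tool in CONSUMER_TOOLS:
        if tool in lowered:
            return tool
    return None


sample = b"1 0 obj\n<< /Producer (iLovePDF) /Creator (Core Banking) >>\nendobj"
print(matched_tool(extract_producer(sample)))
```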
Design-tool ingredient leaks
Illustrator, Photoshop, and InDesign rebuilds of institutional templates leak the operator’s local source-PDF path through xmpMM:Ingredients metadata — a fingerprint of an AI-assisted prepress workflow on a document that should have come straight from a banking system.
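To make the ingredient leak concrete, the sketch below pulls file paths out of an `xmpMM:Ingredients` block with regular expressions. It assumes the XMP packet is stored uncompressed and that the path is recorded as `stRef:filePath`, either as an attribute or as an element; a production check would use a proper XML parser. `ingredient_paths` is a hypothetical name and the sample path is invented for illustration.

```python
import re

INGREDIENTS_RE = re.compile(
    rb"<xmpMM:Ingredients>(.*?)</xmpMM:Ingredients>", re.DOTALL
)
# The path may be serialized as an attribute or as a child element.
FILEPATH_RE = re.compile(
    rb'stRef:filePath="([^"]*)"|<stRef:filePath>([^<]*)</stRef:filePath>'
)


def ingredient_paths(pdf_bytes: bytes) -> list:
    """Collect source-file paths listed under xmpMM:Ingredients."""
    paths = []
    for block in INGREDIENTS_RE.findall(pdf_bytes):
        for attr_value, elem_value in FILEPATH_RE.findall(block):
            paths.append((attr_value or elem_value).decode("utf-8", "replace"))
    return paths


xmp = (
    b"<rdf:Description><xmpMM:Ingredients><rdf:Bag>"
    b'<rdf:li stRef:filePath="/Users/op/Desktop/statement_source.pdf" '
    b'stRef:documentID="xmp.did:1"/>'
    b"</rdf:Bag></xmpMM:Ingredients></rdf:Description>"
)
print(ingredient_paths(xmp))
```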
Modifications after digital signature
AI-generated content appended to an already-signed contract or institutional document triggers the highest-confidence verdict in the system: “certain.” The signature acts as a structural notarization, and any content added after it is mathematically detectable.
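The reason appended content is detectable can be shown with the signature's `/ByteRange` array, which lists two (offset, length) pairs covering the signed bytes. Anything past the end of the second pair was written after signing. The sketch below is an assumption-laden illustration, not HTPBE?'s method: `bytes_after_signature` is a hypothetical name, the cryptographic digest is not verified, and legitimate later signatures or timestamp updates also append bytes, so a non-zero result calls for a closer look rather than a verdict.

```python
import re
from typing import Optional

# /ByteRange [offset1 length1 offset2 length2] in a signature dictionary.
BYTE_RANGE_RE = re.compile(
    rb"/ByteRange\s*\[\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*\]"
)


def bytes_after_signature(pdf_bytes: bytes) -> Optional[int]:
    """Bytes lying beyond the widest signed range; None if unsigned."""
    ends = [
        int(m.group(3)) + int(m.group(4))
        for m in BYTE_RANGE_RE.finditer(pdf_bytes)
    ]
    if not ends:
        return None
    return len(pdf_bytes) - max(ends)


# Build a tiny stand-in file whose ByteRange covers it to the last byte.
head = b"%PDF-1.7\n/ByteRange [0 10 20 "
tail = b"]\n%%EOF\n"
total = len(head) + 4 + len(tail)
signed = head + str(total - 20).zfill(4).encode() + tail
appended = b"9 0 obj\n<<>>\nendobj\n"

print(bytes_after_signature(signed), bytes_after_signature(signed + appended))
```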
Integrate in minutes
Integrate AI-fraud detection in any stack
Two API calls — submit the suspicious PDF URL, read the structural verdict. Copy-paste examples for cURL, JavaScript, Python, PHP, Go, and Ruby.
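For Python, the same two calls might look like the sketch below, using only the standard library. The endpoints, the `url` request field, and the `id` and `status` response fields are taken from the cURL example on this page; everything else (function names, timeout, error handling left to the caller) is an assumption rather than the official client.

```python
import json
import urllib.request

API_BASE = "https://api.htpbe.tech/v1"


def build_analyze_request(api_key: str, pdf_url: str) -> urllib.request.Request:
    """Step 1: POST the PDF URL for analysis."""
    body = json.dumps({"url": pdf_url}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/analyze",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def build_result_request(api_key: str, check_id: str) -> urllib.request.Request:
    """Step 2: GET the result for the id returned by step 1."""
    return urllib.request.Request(
        f"{API_BASE}/result/{check_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def check_pdf(api_key: str, pdf_url: str) -> dict:
    """Run both calls and return the full result document."""
    check_id = send(build_analyze_request(api_key, pdf_url))["id"]
    return send(build_result_request(api_key, check_id))
```

Calling `check_pdf(api_key, "https://example.com/document.pdf")["status"]` would mirror the `jq '.status'` step of the cURL flow; it performs real network requests and needs a valid key.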
# curl is preinstalled on macOS and most Linux distributions
# Step 1: Submit PDF for analysis
curl -X POST https://api.htpbe.tech/v1/analyze \
  -H "Authorization: Bearer htpbe_live_..." \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com/document.pdf"}'
# Returns: {"id":"3f9c8b7a-2e1d-4c5f-9b8e-7a6d5c4b3a21"}

# Step 2: Retrieve full results
ID="3f9c8b7a-2e1d-4c5f-9b8e-7a6d5c4b3a21"
curl -s "https://api.htpbe.tech/v1/result/$ID" \
  -H "Authorization: Bearer htpbe_live_..." \
  | jq '.status'

Pricing
Self-serve plans, no sales call
All plans include the same forensic checks. Pick the quota that matches your monthly document volume.
- Starter (manual): $15/mo, 30 checks/mo. Manual spot-checks and integration testing.
- Growth (most common): $149/mo, 350 checks/mo. Active document processing pipelines.
- Pro (high volume): $499/mo, 1,500 checks/mo. High-volume automation and API integrations.
Enterprise (unlimited, on-premise available) — see full pricing
API key on signup. Free test environment on every plan. No card required.
Customer Stories
Teams that stopped document fraud
Compliance, finance, and risk teams use HTPBE? to catch manipulated PDFs before they become costly mistakes.
“Caught an invoice where the total had been changed by less than a thousand dollars. Without this I would have approved it without a second look.”
Sarah M., AP Manager, United States

“We had three applicants in the same week with bank statements that looked completely fine. Two of them were flagged as modified. You simply cannot see this by reading the document; it is in the file structure.”
Lars V., Risk Analyst, Online Lending, Netherlands

“Salary slips were coming with altered figures. We identified two problematic files before the placement was finalised.”
Priya K., HR Operations Lead, India

“Since we started checking documents this way, we stopped two applications early in the process that would have been very difficult to reverse later.”
Julien R., Fraud Analyst, Fintech, France

“Some applicants were sending PDFs that looked authentic but had been edited in ways not visible to the eye. We now ask for checked originals when something is flagged. Already saved us from a few bad decisions.”
Marta S., Compliance Coordinator, Spain

“One invoice was caught because there was a mismatch between the document dates and structure. That particular case would have cost us significantly.”
Tariq A., Finance Manager, United Arab Emirates
FAQ
Frequently asked questions
Does HTPBE? detect text written by ChatGPT or Claude?
What about a fully fabricated PDF generated from scratch by an AI agent (ReportLab, PDFKit, Puppeteer)?
How is AI document fraud different from traditional document fraud?
Does this replace content-AI detection tools like GPTZero or Inscribe?
Won’t this trigger false positives on documents legitimately processed by AI tools?
Can I test it on AI-generated PDFs we’ve already collected?
Secure your workflow
Create your account — API key on signup, free test environment on every plan.
From $15/mo. No sales call. Cancel any time.