PDF Authenticity FAQ — Detect Tampering, Check Documents

Question 1

Does HTPBE detect text written by ChatGPT or Claude?

Accepted Answer

No, and that’s deliberate. Text classifiers (GPTZero, Originality.ai, Inscribe) read the prose and decide whether it looks AI-written. We read the PDF’s binary container — xref tables, fonts, content streams, signatures — and decide whether the file was assembled normally or stitched together. The two are complementary layers of an AI-fraud defense, not substitutes. Run text classifiers on the content; run HTPBE on the file.
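To make the container-versus-prose distinction concrete, here is a minimal sketch of one container-level check that never looks at the wording: whether any bytes sit beyond the range covered by a digital signature. It is an illustration only, not HTPBE's implementation. The function name is hypothetical, and the sketch assumes the `/ByteRange` array is visible in the raw bytes.

```python
import re
from typing import Optional

# A signature dictionary records what it signed as /ByteRange [a b c d]:
# bytes a..a+b and c..c+d are covered, the gap between holds the signature.
BYTE_RANGE = re.compile(rb"/ByteRange\s*\[\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*\]")


def bytes_after_last_signature(pdf_bytes: bytes) -> Optional[int]:
    """Return how many bytes lie beyond the widest signed range.

    Returns None when no /ByteRange is found (unsigned, or hidden from a raw scan).
    """
    ends = [int(c) + int(d) for _a, _b, c, d in BYTE_RANGE.findall(pdf_bytes)]
    if not ends:
        return None
    # The widest range belongs to the most recent signature.
    return len(pdf_bytes) - max(ends)
```

A result well above zero suggests content was appended after the last signature; a few trailing whitespace bytes are usually harmless.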

Question 2

What about a fully fabricated PDF generated from scratch by an AI agent (ReportLab, PDFKit, Puppeteer)?

Accepted Answer

“Born-synthetic” PDFs are out of structural scope. If a fraudster spins up a clean ReportLab pipeline that emits a single-pass PDF with consistent fonts, no incremental updates, and a plausible producer, no structural tool catches it — including ours. That’s a content-forgery problem (verify the IBAN exists, verify the employer ID, verify the logo against a registry) and the right tools are content-AI detectors and KYC platforms. Our scope is post-creation tampering and assembly artifacts.
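As a rough illustration of what "single-pass, no incremental updates" means at the byte level, the sketch below counts `%%EOF` markers, since each incremental save appends a new trailer ending in one. This is a simplified heuristic under stated assumptions, not HTPBE's method: linearized PDFs typically carry two markers, and the byte sequence can also occur inside a stream. Function names are hypothetical.

```python
import re

EOF_MARKER = re.compile(rb"%%EOF")


def count_eof_markers(pdf_bytes: bytes) -> int:
    """Count %%EOF markers; each incremental save normally appends one more."""
    return len(EOF_MARKER.findall(pdf_bytes))


def looks_single_pass(pdf_bytes: bytes) -> bool:
    """True when the file has exactly one %%EOF, i.e. no appended revisions."""
    return count_eof_markers(pdf_bytes) == 1
```

A clean generator pipeline of the kind described above would pass this check, which is exactly why it is out of structural scope.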

Question 3

How is AI document fraud different from traditional document fraud?

Accepted Answer

In structural terms, less than you’d expect. The same forensic markers fire: incremental updates, producer mismatch, page assembly, font-subset divergence, hybrid raster scans. AI changed the volume and polish of fraudulent documents, not the underlying way they get built. A pay stub with an AI-rasterized fake amount triggers the same hybrid-scan and page-assembly markers as a Photoshop-edited pay stub from 2018.
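One of the markers named above, producer mismatch, can be sketched as a comparison between the `/Producer` entry in the Info dictionary and the `pdf:Producer` element in the XMP metadata. This is a simplified sketch, not HTPBE's implementation. It assumes both values are stored uncompressed, that the Info value is a plain literal string without escapes, and that XMP uses the element form rather than the attribute form. Function names are hypothetical.

```python
import re
from typing import Optional

# Literal-string form only: /Producer (Some Tool 1.0)
INFO_PRODUCER = re.compile(rb"/Producer\s*\(([^()\\]*)\)")
# XMP element form only: <pdf:Producer>Some Tool 1.0</pdf:Producer>
XMP_PRODUCER = re.compile(rb"<pdf:Producer>([^<]*)</pdf:Producer>")


def _last_match(pattern: "re.Pattern", data: bytes) -> Optional[str]:
    """Return the last occurrence, since the latest revision wins."""
    matches = pattern.findall(data)
    return matches[-1].decode("latin-1").strip() if matches else None


def producer_mismatch(pdf_bytes: bytes) -> bool:
    """True when both producer fields are present and disagree."""
    info = _last_match(INFO_PRODUCER, pdf_bytes)
    xmp = _last_match(XMP_PRODUCER, pdf_bytes)
    if info is None or xmp is None:
        return False  # nothing to compare in this raw scan
    return info != xmp
```

A mismatch is a hint that one tool wrote the file and another rewrote part of it, not proof of fraud on its own.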

Question 4

Does this replace content-AI detection tools like GPTZero or Inscribe?

Accepted Answer

No. They check the prose; we check the container. A bank statement whose text fools GPTZero can still betray itself in font subset prefixes, and a structurally pristine PDF can still be a born-synthetic forgery that content-layer tools would catch. AI-fraud defense is a stack: identity layer (Persona, Onfido, Plaid), content layer (GPTZero, Inscribe, Resistant AI), structural layer (HTPBE). Run all three on the same upload.
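For readers unfamiliar with font subset prefixes: an embedded font subset is named with a six-uppercase-letter tag, a plus sign, and the font name, for example `/ABCDEF+Helvetica`. The sketch below groups those tags by base font and lists fonts that appear under more than one tag, which can indicate pages or text produced in separate passes. It is an illustrative heuristic only, not HTPBE's implementation: some legitimate generators emit several subsets of one font, and font dictionaries stored in compressed object streams are invisible to a raw scan. Function names are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set

# Matches /BaseFont /ABCDEF+FontName and captures the tag and the name.
SUBSET_FONT = re.compile(rb"/BaseFont\s*/([A-Z]{6})\+([^\s/<>\[\]()]+)")


def subset_tags_by_font(pdf_bytes: bytes) -> Dict[str, Set[str]]:
    """Map each base font name to the set of subset tags it appears under."""
    tags: Dict[str, Set[str]] = defaultdict(set)
    for tag, name in SUBSET_FONT.findall(pdf_bytes):
        tags[name.decode("latin-1")].add(tag.decode("ascii"))
    return dict(tags)


def fonts_with_multiple_subsets(pdf_bytes: bytes) -> List[str]:
    """List base fonts embedded under more than one subset tag."""
    found = subset_tags_by_font(pdf_bytes)
    return sorted(name for name, tags in found.items() if len(tags) > 1)
```

Treat a non-empty result as a reason to look closer rather than a verdict.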

Question 5

Won’t this trigger false positives on documents legitimately processed by AI tools?

Accepted Answer

It depends on what “processed by AI” means. A bank that exported a real PDF and then routed it through an AI-OCR pipeline before sending it to the customer would leave post-export markers, but banks don’t do that; they ship the original export. The patterns we flag (hybrid raster scan, online-editor producer, page assembly from different sources) don’t occur in legitimate institutional document workflows. Genuine false positives come from a different class entirely: native design-tool exports such as Lucidchart or SafetyCulture, not AI processing.

Question 6

Can I test it on AI-generated PDFs we’ve already collected?

Accepted Answer

Yes. The web tool accepts any PDF up to 10 MB — upload your collected AI-fraud samples and see the verdict and named markers per file. New accounts get 5 free checks to start; after that, top up with credit packs (from $5) or a subscription. For automated pipelines, the API uses the same engine and the same shared credit pool. Fraud-ops teams typically run a backfill of recent flagged applications through HTPBE as a first-week exercise to size the AI-fraud baseline before integrating into the live queue.
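Before a backfill, it can help to sort the collected samples locally so checks are not spent on files that cannot be uploaded. The sketch below is a hypothetical pre-filter, not part of any HTPBE client: it keeps files within the 10 MB limit that carry a `%PDF-` header near the start. It assumes "10 MB" means 10 × 1024 × 1024 bytes, which the FAQ does not specify, and all names are illustrative.

```python
from pathlib import Path
from typing import List, Tuple

# Assumption: "10 MB" is read as 10 MiB; adjust if the limit is decimal.
MAX_UPLOAD_BYTES = 10 * 1024 * 1024


def eligible_for_upload(path: Path) -> bool:
    """True when the file fits the size limit and looks like a PDF."""
    if path.stat().st_size > MAX_UPLOAD_BYTES:
        return False
    with path.open("rb") as fh:
        # Look for the header in the first 1024 bytes only.
        return b"%PDF-" in fh.read(1024)


def split_backfill(folder: Path) -> Tuple[List[Path], List[Path]]:
    """Split *.pdf files in a folder into (eligible, skipped) lists."""
    eligible: List[Path] = []
    skipped: List[Path] = []
    for path in sorted(folder.glob("*.pdf")):
        (eligible if eligible_for_upload(path) else skipped).append(path)
    return eligible, skipped
```

The eligible list can then be uploaded through the web tool or passed to whatever API client the pipeline uses.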

AI-Generated Document Fraud Detection

AI didn’t invent document fraud. It scaled the assembly line

AI-fraud patterns we actually catch

Document fraud in 2026 — three concrete patterns

AI-generated text printed and rebuilt as PDF in an online editor

AI-rasterized pages swapped into a real bank statement export

Design-tool rebuild of an institutional template (Illustrator, InDesign)

Mass template forgery — same fraud kit run repeatedly through AI

AI-generated overlay applied to a stolen template via Photoshop

AI modifications appended to an already-signed document

KYC platforms check the document. HTPBE? checks the file.

KYC & identity platforms

HTPBE? tamper detection API

What the API detects in AI-assembled PDFs

Hybrid raster scan with programmatic content

Page assembly from different sources

Incremental updates after AI processing

Known-editor producer signatures

Design-tool ingredient leaks

Modifications after digital signature

Wire this into your intake pipeline in under a day

Self-serve plans, no sales call

Starter

Growth

Pro

Teams that stopped document fraud

