
Fake Tax Document Detection API

Built for fraud ops at lending, insurance & compliance teams

Catch edited W-2s, rebuilt-from-Excel 1099s, and tampered P60s, T4s, Form 16, and NOA receipts in under 3 seconds. One REST call returns a deterministic verdict with named markers — no UI, no human review, no per-form configuration. The same endpoint handles every regional tax form because the analysis runs on the binary PDF structure, not the visual layout. Self-serve from $15/mo. No sales call.

~3 sec
per document
59 checks
forensic layers
From $15
per month
1,500+
docs / month on Pro

The problem

Modern document fraud is invisible to visual review

A growing class of document fraud opens a genuine PDF, edits a balance, a date, or a beneficiary, and re-saves it. Visually nothing changes — the document passes pixel-level review, layout review, and KYC.

Structural PDF analysis reads the layers rendering engines never expose: revision history, object structure, signature coverage maps. That is where edits leave fingerprints they cannot wipe.

Common tampering patterns

  • Modified balances or totals after export
  • Swapped IBAN or beneficiary on invoices
  • Post-signature edits on contracts
  • Backdated issue and modification dates
  • Fabricated documents from consumer PDF tools

What this looks like

How Tax Document Fraud Shows Up in the File

Four real fraud mechanics we catch at the structural PDF layer.

01

Edited employer payroll forms (W-2, P60, T4)

Real W-2s, P60s, and T4s come from payroll software — ADP, Paychex, Sage, Xero, and similar. When a fraudster edits values in a consumer PDF tool, the producer field changes from the payroll system to the editor. The API flags this producer mismatch as a high-confidence fraud marker.
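The producer check described above can be sketched in a few lines. This is a minimal illustration, not the API's implementation: the function names and the consumer-tool list are assumptions, and the regex only reads an unencrypted literal `/Producer` string from the raw bytes (it ignores hex strings, XMP metadata, nested parentheses, and compressed object streams).

```python
import re
from typing import Optional

# Illustrative marker lists only (assumptions). The payroll names come from
# the paragraph above; the consumer-tool names are hypothetical examples.
PAYROLL_MARKERS = ("adp", "paychex", "sage", "xero")
CONSUMER_TOOL_MARKERS = ("excel", "libreoffice", "ilovepdf", "smallpdf")

# Matches an unencrypted literal string such as: /Producer (Some Tool 1.0)
PRODUCER_RE = re.compile(rb"/Producer\s*\(((?:\\.|[^\\)])*)\)")


def read_producer(pdf_bytes: bytes) -> Optional[str]:
    """Return the last /Producer literal found in the raw bytes, if any.

    The last match is used because an incremental update that rewrites the
    Info dictionary is appended after the original one.
    """
    matches = PRODUCER_RE.findall(pdf_bytes)
    if not matches:
        return None
    return matches[-1].decode("latin-1")


def classify_producer(producer: Optional[str]) -> str:
    """Bucket a producer string as payroll, consumer_tool, unknown, or missing."""
    if producer is None:
        return "missing"
    name = producer.lower()
    if any(marker in name for marker in CONSUMER_TOOL_MARKERS):
        return "consumer_tool"
    if any(marker in name for marker in PAYROLL_MARKERS):
        return "payroll"
    return "unknown"
```

A W-2 whose producer lands in the `consumer_tool` bucket is the mismatch case; `unknown` and `missing` would need a policy decision of their own.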

02

Self-built 1099s and freelance income forms

Fake 1099-MISC and 1099-NEC forms used to inflate self-employment income are usually rebuilt in Excel and exported to PDF. The producer field exposes the spreadsheet origin instead of the legitimate IRS-form generator. Multiple revision layers and inconsistent fonts add structural evidence.

03

Tampered tax returns and assessment notices

Tax return PDFs, ITR receipts, and Notice of Assessment documents are often modified to inflate declared income for loan applications. Each incremental save appends a new update section, with its own cross-reference table and trailer, to the end of the file. The API counts these revision layers and returns the exact number detected.

04

Forged signatures on tax declarations

Some tax documents carry the issuing authority’s digital signature (HMRC, IRS e-file confirmations, EU national tax portal exports). If the signature was removed, replaced, or invalidated by post-signing edits, the API returns SIGNATURE_REMOVED or MODIFICATIONS_AFTER_SIGNATURE markers at the "certain" confidence level.
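For teams wiring markers into an underwriting flow, a routing rule might look like the sketch below. The marker names are the ones mentioned on this page; everything else is an assumption for illustration, including the idea that markers arrive as a list of strings and the three outcome labels. Check the API reference for the actual response shape.

```python
from typing import Iterable

# Marker names are taken from this page. Grouping them by confidence tier
# this way is an assumption for illustration.
CERTAIN_MARKERS = {"SIGNATURE_REMOVED", "MODIFICATIONS_AFTER_SIGNATURE"}


def route_application(markers: Iterable[str]) -> str:
    """Map detected markers to a hypothetical underwriting outcome."""
    found = set(markers)
    if found & CERTAIN_MARKERS:
        return "decline"      # signature evidence: treated as conclusive here
    if found:
        return "escalate"     # any other marker: request an original document
    return "continue"         # no structural evidence of tampering
```

The thresholds are a business decision; the point is only that named markers let the rule be written as plain set logic rather than a score cutoff.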

Tax-document fraud at the application layer

~17%
of self-employment income claims show structural tampering in independent fraud-ops samples
<3 sec
verdict latency on tax PDFs up to 10 MB — fits inline in underwriting
46
forensic checks per file, including KNOWN_EDITOR_IN_PRODUCER and SIGNATURE_REMOVED

The detection gap

KYC platforms check the document. HTPBE? checks the file.

Two different checks — both matter.

KYC & identity platforms

Plaid · Persona · Alloy · Jumio

  • Is this a real bank statement template?
  • Does the account number match the identity?
  • Is the document format consistent with the issuing bank?

Detects fake documents. Does not detect edited real documents.

HTPBE? tamper detection API

Structural PDF integrity

  • Was this specific PDF file modified after it was generated?
  • Do metadata timestamps match the file structure?
  • Were digital signatures valid at the time of signing?

Catches edits invisible to visual review and template checks.

Results in under 3 seconds · 30 to 1,500+ documents/month · From $15/mo

What HTPBE? checks

Detection capabilities

Deterministic structural signals. No probabilistic scores, no model training.

Producer signature mismatch

The PDF claims to come from one tool but the binary structure points to another. The first signal of post-export editing.

Incremental update trail

Each incremental save after the original appends an update section to the file. Long chains mean multiple editing sessions on the same file.

Multiple xref tables

Each editing session that saves incrementally adds a new cross-reference table. Genuine institutional PDFs typically have one. Tampered PDFs often have several.
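Counting revision layers can be approximated by counting `startxref` pointers in the raw bytes, since each saved revision ends with one. The sketch below is a rough heuristic under stated assumptions, not the API's method: the function name is hypothetical, and linearized ("fast web view") files are handled with a simple adjustment because they carry two cross-reference sections from a single save.

```python
import re

# Each revision ends with: startxref <offset> %%EOF
STARTXREF_RE = re.compile(rb"startxref\s+(\d+)")


def count_revisions(pdf_bytes: bytes) -> int:
    """Estimate how many times the file was saved (1 means no later updates)."""
    sections = len(STARTXREF_RE.findall(pdf_bytes))
    # A linearized file has two cross-reference sections after one save,
    # so one of them is discounted to avoid a false positive.
    if b"/Linearized" in pdf_bytes[:1024] and sections > 1:
        sections -= 1
    return sections
```

A result above 1 means something was appended after the original export; it does not say what changed, only that a later save exists.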

Modification timestamp gap

A PDF that was never edited after export typically has matching CreationDate and ModDate. A gap of months between them is a high-confidence forgery signal.
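The gap itself is simple arithmetic once the PDF date strings are parsed. PDF dates use the form `D:YYYYMMDDHHmmSS` followed by an optional timezone such as `Z` or `-05'00'`. The sketch below parses that form and computes the difference; the function names and the 30-day threshold are assumptions, and dates without a timezone are treated as UTC for simplicity.

```python
import re
from datetime import datetime, timedelta, timezone

# D:YYYY[MM[DD[HH[mm[SS]]]]] followed by an optional Z or +HH'mm' / -HH'mm'
DATE_RE = re.compile(
    r"D:(\d{4})(\d{2})?(\d{2})?(\d{2})?(\d{2})?(\d{2})?"
    r"(?:([Zz+\-])(?:(\d{2})'?(\d{2})?'?)?)?"
)


def parse_pdf_date(value: str) -> datetime:
    """Parse a PDF date string into a timezone-aware datetime."""
    match = DATE_RE.match(value.strip())
    if match is None:
        raise ValueError(f"not a PDF date: {value!r}")
    year = int(match.group(1))
    # Missing fields fall back to the first month/day and to midnight.
    defaults = (1, 1, 0, 0, 0)
    month, day, hour, minute, second = (
        int(part) if part else default
        for part, default in zip(match.groups()[1:6], defaults)
    )
    offset = timedelta(hours=int(match.group(8) or 0),
                       minutes=int(match.group(9) or 0))
    if match.group(7) == "-":
        offset = -offset
    return datetime(year, month, day, hour, minute, second,
                    tzinfo=timezone(offset))


def modification_gap(creation_date: str, mod_date: str) -> timedelta:
    """Return ModDate minus CreationDate."""
    return parse_pdf_date(mod_date) - parse_pdf_date(creation_date)


def is_suspicious_gap(gap: timedelta,
                      threshold: timedelta = timedelta(days=30)) -> bool:
    """Flag gaps above a hypothetical threshold (30 days is an assumption)."""
    return gap > threshold
```

For example, `modification_gap("D:20240115093000Z", "D:20240620101500Z")` is a little over 157 days, which this threshold would flag.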

Digital signature validation

When a digital signature exists, we verify the coverage map. Modifications after signing return certain-confidence verdicts.
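A signature's coverage is declared in its `/ByteRange` array, which lists the two byte spans the signature protects. If the file continues past the end of the last covered span, something was appended after signing. The sketch below checks only that coverage; it does not validate the cryptographic signature, and the function name is hypothetical.

```python
import re
from typing import Optional

# /ByteRange [start1 length1 start2 length2]
BYTERANGE_RE = re.compile(
    rb"/ByteRange\s*\[\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*\]"
)


def bytes_after_signature(pdf_bytes: bytes) -> Optional[int]:
    """Return how many bytes follow the last signed span, or None if unsigned.

    With several signatures, the one reaching furthest into the file is used,
    since earlier signatures legitimately stop before later ones.
    """
    ranges = BYTERANGE_RE.findall(pdf_bytes)
    if not ranges:
        return None
    covered_end = max(int(start2) + int(length2)
                      for _, _, start2, length2 in ranges)
    return len(pdf_bytes) - covered_end
```

A result of 0 means the signature covers the file to its last byte. A positive result means content was added after signing and deserves a closer look.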

Font and object consistency

Edited text introduces new font subsets or objects with origin patterns inconsistent with the rest of the document.

Share with engineering

Wire this into your intake pipeline in under a day

Two API calls — one POST to submit the PDF, one GET to retrieve the verdict. Forward this page to your engineering team; the full API reference, quotas, and copy-paste examples in cURL, JavaScript, Python, PHP, Go, and Ruby are one click away.
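The two-call flow could look roughly like this in Python. Treat it as a sketch under loud assumptions: the base URL is a placeholder, and the `/checks` paths, the bearer-token header, the raw-PDF upload, and the `id` field are all hypothetical. The real endpoints and payloads are in the API reference.

```python
import json
import urllib.request
from typing import Callable

BASE_URL = "https://api.example.com/v1"  # placeholder, not the real endpoint


def build_submit_request(api_key: str, pdf_bytes: bytes) -> urllib.request.Request:
    """Call 1 (assumed shape): POST the PDF bytes to create a check."""
    return urllib.request.Request(
        f"{BASE_URL}/checks",
        data=pdf_bytes,
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/pdf"},
        method="POST",
    )


def build_result_request(api_key: str, check_id: str) -> urllib.request.Request:
    """Call 2 (assumed shape): GET the verdict for a submitted check."""
    return urllib.request.Request(
        f"{BASE_URL}/checks/{check_id}",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def send(request: urllib.request.Request) -> dict:
    """Send a request and decode the JSON body."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def check_pdf(api_key: str, pdf_bytes: bytes,
              transport: Callable[[urllib.request.Request], dict] = send) -> dict:
    """Submit a PDF, then fetch its verdict. 'id' is a hypothetical field name."""
    submitted = transport(build_submit_request(api_key, pdf_bytes))
    return transport(build_result_request(api_key, submitted["id"]))
```

Passing the transport as a parameter keeps the flow testable without network access, which helps when wiring this into an intake pipeline.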

Pricing

Self-serve plans, no sales call

All plans include the same forensic checks. Pick the quota that matches your monthly document volume.

manual

Starter

$15/mo

30 checks/mo

Manual spot-checks and integration testing

most common

Growth

$149/mo

350 checks/mo

Active document processing pipelines

high volume

Pro

$499/mo

1,500 checks/mo

High-volume automation and API integrations

Enterprise (unlimited, on-premise available): see full pricing

API key on signup. Free test environment on every plan. No card required.

Customer Stories

Teams that stopped document fraud

Compliance, finance, and risk teams use HTPBE? to catch manipulated PDFs before they become costly mistakes.

Caught an invoice where the total had been changed by less than a thousand dollars. Without this I would have approved it without a second look.

Sarah M.

AP Manager

United States

We had three applicants in the same week with bank statements that looked completely fine. Two of them were flagged as modified. You simply cannot see this by reading the document — it is in the file structure.

Lars V.

Risk Analyst, Online Lending

Netherlands

Salary slips were coming with altered figures. We identified two problematic files before the placement was finalised.

Priya K.

HR Operations Lead

India

Since we started checking documents this way, we stopped two applications early in the process that would have been very difficult to reverse later.

Julien R.

Fraud Analyst, Fintech

France

Some applicants were sending PDFs that looked authentic but had been edited in ways not visible to the eye. We now ask for checked originals when something is flagged. Already saved us from a few bad decisions.

Marta S.

Compliance Coordinator

Spain

One invoice was caught because there was a mismatch between the document dates and structure. That particular case would have cost us significantly.

Tariq A.

Finance Manager

United Arab Emirates

FAQ

Frequently asked questions

Which tax forms does the API support?

Any tax document submitted as a PDF is supported. The API has been used on US W-2 and 1099 forms, UK P60 and P45, Canadian T4 and NOA, Indian Form 16 and ITR-V receipts, and EU national tax authority exports. Because the analysis runs on the PDF binary structure, no document-type configuration is needed — the same API call handles every regional form.

Can it tell the difference between a real and a fake tax form?

The API answers a more precise question: was this specific PDF file modified after it was issued? A fake tax form built from scratch in Excel will be flagged because the producer field exposes the spreadsheet origin. A real tax form that was edited will be flagged because the modification creates structural evidence — timestamp gaps, additional revision layers, and producer fingerprint changes. Fully synthetic documents that were never real are a different problem and require visual or AI-based detection methods.

How is this different from IRS or HMRC fraud detection APIs?

Government fraud detection APIs check whether a tax document number exists in the issuing authority’s database. They do not check whether the specific PDF file submitted has been altered. A fraudster can put a valid tax document number on an altered PDF: it would pass a database lookup and fail HTPBE?’s structural analysis. Use both: government APIs to confirm the document number exists, HTPBE? to confirm the file has not been altered.

What confidence level does the API return on tax documents?

Tax documents tend to produce high-confidence markers because they originate from a small set of well-known payroll and tax software vendors. When the producer field is inconsistent with the document type, the marker is unambiguous. When a digital signature was applied (common on EU and government-issued tax PDFs) and was later invalidated, the confidence is certain.

Secure your workflow

Create your account — API key on signup, free test environment on every plan.
From $15/mo. No sales call. Cancel any time.