Receipt fraud

Fake Receipt Detection — Catch Regenerated and Edited PDFs

A receipt-generator site can produce a plausible-looking PDF in seconds, and most expense reviewers cannot tell it from a real one. Finance and AP teams approve thousands of expense receipts a month. Insurance claims adjusters see receipts attached to every property and travel claim. Tax preparers receive client-uploaded receipts as substantiation. The fabrication methods have changed (generator sites, AI tools, edited real receipts), but the result still looks right on screen, so it passes visual review.

~3 sec per document
35 checks across forensic layers
From $15 per month
1,500+ docs / month on Growth

Scope

htpbe? analyzes the structural layer of the PDF file — the layer that records every edit, even invisible ones. We don't inspect holograms, phone photos, or ID biometrics. If your fraud problem is a digitally fabricated or tampered receipt, we're the most specific tool for it.

When htpbe? returns INCONCLUSIVE on a receipt that should have come from a POS or platform issuer (Uber, Square, Stripe, hotel PMS, restaurant POS), that is itself a fraud signal in this context: real receipts carry the issuer's producer signature, never a desktop tool's.

How it looks

One REST call, one deterministic verdict

Upload the PDF. The API returns INTACT, MODIFIED, or INCONCLUSIVE with named markers — in about three seconds.

What this looks like

How fake and tampered receipts actually look

Three real fraud mechanics we catch at the structural PDF layer.

01

Generator-tool fabrication

Free online receipt-generator sites and AI tools produce a plausible PDF for any merchant, amount, and date the user types in. These tools leave a recognisable producer signature (Chrome Headless, generic PDF library, or an AI-platform fingerprint) and miss the platform metadata authentic POS or platform exports carry.

02

Real receipt edited after issuance

An authentic Uber receipt or hotel folio is downloaded by an actual customer, opened in a PDF editor, edited to bump the amount or date, and exported again. The producer field changes from the platform engine to whichever editor was used, and the xref chain shows an incremental update: visible structural evidence of post-issuance editing.

03

Same receipt submitted multiple times across expense periods

A real receipt is reused across multiple expense reports with the date subtly edited each time. Cross-document image hash and content hash comparison catches the reuse: the underlying payment was made once but reimbursed several times.

The scale

~5% of annual T&E spend lost to expense fraud on average
$1.9M median annual cost of expense fraud at mid-market companies
~3 sec per receipt via API

Why your existing checks miss this

OCR extracts the amount. It does not verify the file.

Concur and Expensify read whatever the receipt shows — even if the receipt was generated this morning.

Expense management platforms (Concur, Expensify, Ramp, Brex, Pleo, Navan) extract receipt data via OCR and run policy rules. They cannot tell you whether the receipt PDF was issued by a real merchant or fabricated five minutes ago through a generator site. Tax-filing software accepts receipt PDFs at face value. htpbe? catches the receipt PDF the employee or claimant uploaded at the moment of intake — standalone, no platform integration required.

Results in under 3 seconds
30 to 1,500+ documents/month
From $15/mo

How it works

Five forensic layers, one deterministic verdict

Every PDF we receive passes through the same structural pipeline — no model training, no thresholds to tune.

01

Metadata analysis

Creation and modification timestamps, producer and creator fields, XMP metadata — the first layer exposes basic tampering.

02

File structure

Xref tables, trailer chain, incremental updates. Any edit after export leaves a structural fingerprint here.

03

Digital signatures

Signature chain integrity and post-signature modifications produce deterministic markers. Certainty-level signal.

04

Content integrity

Fonts, objects, embedded content, page assembly. Multi-session edits and inserted objects are visible at this layer.

05

Verdict with markers

Deterministic output: INTACT / MODIFIED / INCONCLUSIVE, with named markers for every finding — suitable for audit trail.

Document types

Receipt and adjacent expense PDFs we check

Every type listed below is analyzed at the structural file layer — not the rendered image.

Restaurant receipt PDF (POS-issued)
Hotel folio PDF (PMS-issued)
Flight itinerary / booking confirmation PDF
Uber / Lyft / DoorDash receipt PDF
Online retail receipt PDF (Amazon, Shopify, Stripe)
Rental car receipt PDF
Per-diem claim PDF
Mileage log PDF

What htpbe? checks

Detection capabilities

Deterministic structural signals. No probabilistic scores, no model training.

Producer signature mismatch

Authentic receipts carry the producer signature of the issuing platform (Uber, Lyft, Square, Stripe, Toast, Clover, Marriott PMS, etc.). When the producer is Microsoft Excel, Microsoft Word, LibreOffice, Chrome Headless, or a generic PDF library, the document was authored on a desktop or generator tool — it didn't come from the platform.
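
A minimal client-side sketch of the idea, not the htpbe? implementation. It assumes you already have the PDF's producer string (for example from the `producer` field of an API response). The marker list holds only the desktop and generator producers named above, and the function name and return labels are hypothetical.

```python
from typing import Optional

# Hypothetical marker list: only the desktop/generator producers named in the text.
DESKTOP_OR_GENERATOR_MARKERS = (
    "microsoft excel",
    "microsoft word",
    "libreoffice",
    "chrome headless",
)


def classify_producer(producer: Optional[str]) -> str:
    """Return 'missing', 'desktop_or_generator', or 'unrecognised'."""
    if producer is None or not producer.strip():
        return "missing"
    normalised = producer.strip().lower()
    # Substring match, so versioned strings such as "LibreOffice 7.5" still hit.
    if any(marker in normalised for marker in DESKTOP_OR_GENERATOR_MARKERS):
        return "desktop_or_generator"
    # Not on the list: could be a platform engine or an unlisted tool.
    return "unrecognised"
```

An `unrecognised` result is not proof of authenticity. It only means the string matched none of the listed desktop tools.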

Generator-tool and AI-platform fingerprints

Receipt-generator sites and AI tools produce PDFs through headless browsers or specific PDF libraries with recognisable producer strings. htpbe? flags these patterns separately from legitimate platform exports.

Incremental update trail

A clean platform export normally has one cross-reference section. When an editor saves changes as an incremental update, it appends a second xref section and trailer after the original content, which is visible structural evidence of post-issuance editing.
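
A rough way to see this signal yourself, offered as a sketch rather than as how htpbe? does it: each saved revision of a PDF normally ends with a `startxref` offset followed by `%%EOF`, so counting those pairs in the raw bytes approximates the number of revisions. The function names are hypothetical. Linearized ("fast web view") PDFs legitimately contain two such pairs, so a count above one is a prompt for closer inspection, not a verdict.

```python
import re

# A revision normally ends with: startxref <byte offset> %%EOF
REVISION_END = re.compile(rb"startxref\s+\d+\s+%%EOF")


def count_revisions(pdf_bytes: bytes) -> int:
    """Count the startxref/%%EOF pairs found in the raw file bytes."""
    return len(REVISION_END.findall(pdf_bytes))


def has_incremental_update(pdf_bytes: bytes) -> bool:
    """True when more than one revision marker is present."""
    return count_revisions(pdf_bytes) > 1
```

Typical use would be `has_incremental_update(open(path, "rb").read())` on a downloaded receipt.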

Cross-receipt content hash

When the same expense report contains receipts that share content fingerprints across "different" merchants or dates, the API surfaces the duplication. Real receipt issuance from different vendors produces unique content; reuse-with-edits leaves a trail.
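
The comparison can be sketched as follows, assuming you have already extracted the embedded content blobs (image streams, for instance) from each receipt with a PDF parser of your choice. The input shape and function name are hypothetical, and exact SHA-256 matching stands in for whatever fingerprinting the API actually uses.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List, Set


def shared_fingerprints(receipts: Dict[str, List[bytes]]) -> Dict[str, List[str]]:
    """Map each SHA-256 digest to the receipts containing it,
    keeping only digests that occur in more than one receipt."""
    seen: Dict[str, Set[str]] = defaultdict(set)
    for receipt_id, blobs in receipts.items():
        for blob in blobs:
            seen[hashlib.sha256(blob).hexdigest()].add(receipt_id)
    return {digest: sorted(ids) for digest, ids in seen.items() if len(ids) > 1}
```

Receipts from the same merchant can legitimately share a logo, so a match matters most when the receipts claim different merchants or dates.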

Modification timestamp gap

A real receipt issued at the time of purchase has CreationDate matching the transaction. A months-later modification on a "freshly issued" receipt is a high-confidence flag for post-export editing.
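
As an illustration, the gap can be computed from the `creation_date` and `modification_date` fields of a response, assuming they are Unix epoch seconds in UTC as the sample response further down suggests. The one-day tolerance is a hypothetical client-side default, not an htpbe? setting.

```python
from datetime import datetime, timezone


def modification_gap_days(creation_epoch: int, modification_epoch: int) -> float:
    """Days between creation and modification, both given as epoch seconds."""
    created = datetime.fromtimestamp(creation_epoch, tz=timezone.utc)
    modified = datetime.fromtimestamp(modification_epoch, tz=timezone.utc)
    return (modified - created).total_seconds() / 86400


def is_late_modification(creation_epoch: int, modification_epoch: int,
                         max_gap_days: float = 1.0) -> bool:
    """True when the file was modified more than max_gap_days after creation."""
    return modification_gap_days(creation_epoch, modification_epoch) > max_gap_days
```

With the sample values 1707091200 and 1710720000, the gap is 42 days, which matches the "6 weeks" marker in that response.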

Font subset divergence across pages

Multi-session edits or page assembly leave font subset prefix shifts. Single-session legitimate exports have consistent subsets across all pages.
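
A sketch of the check, assuming font names have already been read per page with a PDF parser. In the PDF format a subset font name is a six-letter uppercase tag, a plus sign, and the base font name (for example `ABCDEF+Helvetica`). The function below, whose name and input shape are hypothetical, reports base fonts that appear under more than one tag.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set

# Subset font names look like "ABCDEF+Helvetica": six uppercase letters, "+", base name.
SUBSET_NAME = re.compile(r"^([A-Z]{6})\+(.+)$")


def divergent_subsets(fonts_by_page: Dict[int, List[str]]) -> Dict[str, List[str]]:
    """Return base fonts that occur with more than one subset tag."""
    tags: Dict[str, Set[str]] = defaultdict(set)
    for names in fonts_by_page.values():
        for name in names:
            match = SUBSET_NAME.match(name)
            if match:  # non-subset fonts carry no tag and are ignored
                tag, base = match.groups()
                tags[base].add(tag)
    return {base: sorted(found) for base, found in tags.items() if len(found) > 1}
```

Some legitimate producers also emit several subsets of one font, so treat a hit as a signal to weigh with the other markers.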

Integrate in minutes

Two HTTP calls to verify any receipt

Buyers can skip this section. Developers: the integration is two HTTP calls.

Step 1 — submit the PDF

curl -X POST https://api.htpbe.tech/v1/analyze \
  -H "Authorization: Bearer $HTPBE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://your-storage/employee-hotel-folio.pdf"}'
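
The same request in Python, using only the standard library. The endpoint, headers, and `url` field are copied from the curl command above, while the function names and the timeout are assumptions. The page does not say whether the verdict comes back in this response or from a follow-up call, so the sketch simply returns the parsed JSON body.

```python
import json
import urllib.request

ANALYZE_URL = "https://api.htpbe.tech/v1/analyze"  # endpoint from the curl example


def build_analyze_request(pdf_url: str, api_key: str) -> urllib.request.Request:
    """Build the POST request without sending it."""
    payload = json.dumps({"url": pdf_url}).encode("utf-8")
    return urllib.request.Request(
        ANALYZE_URL,
        data=payload,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def submit(pdf_url: str, api_key: str, timeout: float = 30.0) -> dict:
    """Send the request and return the decoded JSON response body."""
    request = build_analyze_request(pdf_url, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```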

Step 2 — read the verdict

{
  "id": "r1e2c3i4-5p6t-7c8h-9k0z-r1f2g3h4i5j6",
  "status": "modified",
  "modification_confidence": "high",
  "modification_markers": [
    "Spreadsheet producer detected (Microsoft Excel)",
    "Two cross-reference tables — incremental update",
    "Modification date 6 weeks after creation date"
  ],
  "producer": "Microsoft Excel",
  "creator": "Marriott Opera PMS (original)",
  "creation_date": 1707091200,
  "modification_date": 1710720000,
  "has_digital_signature": false,
  "xref_count": 2,
  "has_incremental_updates": true
}

The original came from Marriott Opera PMS. Six weeks later it was opened in Microsoft Excel and re-saved, which added a second xref table. Verdict: modified at high confidence. The employee edited a real hotel folio after checkout, likely to inflate the room or extras line.
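
One way to act on the verdict in an intake workflow, sketched under assumptions: the `status` field and its three values come from this page, while the routing labels and the `expect_platform_issuer` flag are hypothetical names for your own process. INCONCLUSIVE is escalated only for receipts that should have come from a POS or platform issuer, following the guidance in the Scope section.

```python
from typing import Any, Dict


def route_receipt(result: Dict[str, Any], expect_platform_issuer: bool) -> str:
    """Map an analysis result to a (hypothetical) next step in the review workflow."""
    status = str(result.get("status", "")).lower()
    if status == "modified":
        return "reject_and_escalate"
    if status == "inconclusive":
        # Platform-issued receipts should carry the issuer's producer signature.
        return "manual_verification" if expect_platform_issuer else "standard_review"
    if status == "intact":
        return "continue_processing"
    # Unknown or missing status: fail closed and let a person look at it.
    return "manual_verification"
```

For the sample response above, `route_receipt(response, expect_platform_issuer=True)` returns `"reject_and_escalate"`.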

Customer Stories

Teams that stopped document fraud

Compliance, finance, and risk teams use htpbe? to catch manipulated PDFs before they become costly mistakes.

Caught an invoice where the total had been changed by less than a thousand dollars. Without this I would have approved it without a second look.

Sarah M.

AP Manager

United States

We had three applicants in the same week with bank statements that looked completely fine. Two of them were flagged as modified. You simply cannot see this by reading the document — it is in the file structure.

Lars V.

Risk Analyst, Online Lending

Netherlands

Salary slips were coming with altered figures. We identified two problematic files before the placement was finalised.

Priya K.

HR Operations Lead

India

Since we started checking documents this way, we stopped two applications early in the process that would have been very difficult to reverse later.

Julien R.

Fraud Analyst, Fintech

France

Some applicants were sending PDFs that looked authentic but had been edited in ways not visible to the eye. We now ask for verified originals when something is flagged. Already saved us from a few bad decisions.

Marta S.

Compliance Coordinator

Spain

One invoice was caught because there was a mismatch between the document dates and structure. That particular case would have cost us significantly.

Tariq A.

Finance Manager

United Arab Emirates

FAQ

Frequently asked questions

How is htpbe? different from Concur, Expensify, Ramp, or Brex?
Concur, Expensify, Ramp, and Brex extract receipt data via OCR and apply policy rules. They process whatever the PDF shows, including fabricated PDFs. htpbe? inspects the PDF file structure to detect that the receipt was authored on a desktop, generated by a tool, or edited after issuance. Use both: extraction tools for data capture, htpbe? for integrity verification.

Can htpbe? detect AI-generated receipts?
Many AI-generated receipts carry recognisable producer fingerprints (headless browsers, AI-platform tooling) that distinguish them from authentic POS exports. htpbe? flags these patterns. For a deeper treatment of AI-generated documents specifically, see /ai-generated-document-detection.

What about phone photos of receipts?
Phone photos are raster images with no PDF structure to analyse, so they are outside scope. Require digital PDF receipts where possible. Most modern receipt issuers (Uber, Stripe, Square, hotel PMS) email PDFs directly; require those rather than phone photos for any receipt above your audit threshold.

What does INCONCLUSIVE mean on a receipt?
htpbe? returns INCONCLUSIVE when a receipt PDF lacks the institutional metadata that genuine platform exports carry, typically because the file was authored on a desktop with consumer software (Word, Excel, generator-site renderers) rather than exported from a POS, hotel PMS, or platform issuer. For receipts that should come from issuers (Uber, Lyft, hotel chains, Square, Stripe, e-commerce platforms), INCONCLUSIVE is itself a high-confidence fraud signal: a real receipt always carries the platform's producer signature. Treat INCONCLUSIVE on those receipts as fraud-positive and route the case to manual receipt sourcing or merchant verification.

Secure your workflow

Create your account — API key on signup, free test environment on every plan.
From $15/mo. No sales call. Cancel any time.