Fake Receipt Detection — Catch Regenerated and Edited PDFs
A receipt-generator site can produce a plausible-looking PDF in seconds, and most expense reviewers cannot tell the difference. Finance and AP teams approve thousands of expense receipts a month. Insurance claims adjusters see receipts attached to every property and travel claim. Tax preparers receive client-uploaded receipts as substantiation. The fabrication methods have changed (generator sites, AI tools, edited real receipts), but the result still looks right on screen and passes visual review.
htpbe? analyzes the structural layer of the PDF file — the layer that records every edit, even invisible ones. We don't inspect holograms, phone photos, or ID biometrics. If your fraud problem is a digitally fabricated or tampered receipt, we're the most specific tool for it.
When htpbe? returns INCONCLUSIVE on a receipt that should have come from a POS or platform issuer (Uber, Square, Stripe, hotel PMS, restaurant POS), treat that as a fraud signal in its own right: authentic platform receipts carry the issuer's producer signature, not a desktop tool's.
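One way to encode this rule in a review queue is a small routing function. The sketch below is a hypothetical policy, not part of the htpbe? API: the function name, the `expects_platform_issuer` flag, and the action labels are all assumptions made for illustration.

```python
def route_receipt(status: str, expects_platform_issuer: bool) -> str:
    """Map a verdict to a review action (hypothetical policy, not an API feature)."""
    status = status.lower()
    if status == "modified":
        return "escalate"
    if status == "inconclusive":
        # An inconclusive verdict on a receipt that should be a platform
        # export is treated as a signal in itself.
        return "escalate" if expects_platform_issuer else "manual_review"
    if status == "intact":
        return "pass"
    raise ValueError(f"unexpected status: {status!r}")
```

For example, `route_receipt("INCONCLUSIVE", expects_platform_issuer=True)` sends an Uber-style receipt to escalation, while the same verdict on a handwritten-invoice scan only asks for a manual look.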
One REST call, one deterministic verdict
Upload the PDF. The API returns INTACT, MODIFIED, or INCONCLUSIVE with named markers — in about three seconds.
How fake and tampered receipts actually look
Three real fraud mechanics we catch at the structural PDF layer.
Generator-tool fabrication
Free online receipt-generator sites and AI tools produce a plausible PDF for any merchant, amount, and date the user types in. These tools leave a recognisable producer signature (Chrome Headless, a generic PDF library, or an AI-platform fingerprint) and lack the platform metadata that authentic POS or platform exports carry.
Real receipt edited after issuance
An authentic Uber receipt or hotel folio is downloaded by an actual customer and opened in a PDF editor, the amount or date is changed, and the file is exported again. The producer field changes from the platform engine to whichever editor was used, and the xref chain shows an incremental update, which is visible structural evidence of post-issuance editing.
Same receipt submitted multiple times across expense periods
A real receipt is reused across multiple expense reports with the date subtly edited each time. Comparing image hashes and content hashes across documents catches the reuse: the underlying payment was made once but reimbursed several times.
Why your existing checks miss this
OCR extracts the amount. It does not verify the file.
Concur and Expensify read whatever the receipt shows — even if the receipt was generated this morning.
Expense management platforms (Concur, Expensify, Ramp, Brex, Pleo, Navan) extract receipt data via OCR and run policy rules. They cannot tell you whether the receipt PDF was issued by a real merchant or fabricated five minutes ago through a generator site. Tax-filing software accepts receipt PDFs at face value. htpbe? checks the receipt PDF the employee or claimant uploaded at the moment of intake. It runs standalone, with no platform integration required.
Five forensic layers, one deterministic verdict
Every PDF we receive passes through the same structural pipeline — no model training, no thresholds to tune.
Metadata analysis
Creation and modification timestamps, producer and creator fields, XMP metadata — the first layer exposes basic tampering.
File structure
Xref tables, trailer chain, incremental updates. Any edit after export leaves a structural fingerprint here.
Digital signatures
Signature chain integrity and post-signature modifications produce deterministic markers. Certainty-level signal.
Content integrity
Fonts, objects, embedded content, page assembly. Multi-session edits and inserted objects are visible at this layer.
Verdict with markers
Deterministic output: INTACT / MODIFIED / INCONCLUSIVE, with named markers for every finding — suitable for audit trail.
Receipt and adjacent expense PDFs we check
Every type listed below is analyzed at the structural file layer — not the rendered image.
Detection capabilities
Deterministic structural signals. No probabilistic scores, no model training.
Producer signature mismatch
Authentic receipts carry the producer signature of the issuing platform (Uber, Lyft, Square, Stripe, Toast, Clover, Marriott PMS, etc.). When the producer is Microsoft Excel, Microsoft Word, LibreOffice, Chrome Headless, or a generic PDF library, the document was authored on a desktop or generator tool — it didn't come from the platform.
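A producer check of this kind can be approximated with a plain substring match. The sketch below is illustrative only: the hint list is taken from the tools named above, real producer strings vary by version and platform, and the function name is an assumption rather than anything htpbe? exposes.

```python
# Illustrative hints only, taken from the tools named in the text.
# Real-world producer strings differ by version and operating system.
OFF_PLATFORM_HINTS = (
    "microsoft excel",
    "microsoft word",
    "libreoffice",
    "chrome headless",
)


def producer_looks_off_platform(producer: str) -> bool:
    """Return True when the producer string matches a desktop or generator hint."""
    normalized = producer.lower()
    return any(hint in normalized for hint in OFF_PLATFORM_HINTS)
```

A match is a reason to look closer, not proof of fraud; an allow-list of the producers your real vendors emit is usually the stronger half of this check.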
Generator-tool and AI-platform fingerprints
Receipt-generator sites and AI tools produce PDFs through headless browsers or specific PDF libraries with recognisable producer strings. htpbe? flags these patterns separately from legitimate platform exports.
Incremental update trail
A clean platform export typically has one cross-reference table. Re-saving through an editor that writes incrementally appends a second xref section, which is visible structural evidence of post-issuance editing.
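The idea can be illustrated by counting the `startxref ... %%EOF` footers in the raw file, since each appended revision writes its own. This is a rough heuristic under stated assumptions, not the htpbe? implementation: some legitimate files (linearized PDFs, for instance) can contain more than one footer, and an editor that rewrites the whole file leaves only one.

```python
import re

# Each revision of a PDF ends with "startxref <offset> %%EOF".
_REVISION_FOOTER = re.compile(rb"startxref\s+\d+\s+%%EOF")


def count_revision_footers(pdf_bytes: bytes) -> int:
    """Count startxref/%%EOF footers found in the raw PDF bytes."""
    return len(_REVISION_FOOTER.findall(pdf_bytes))


def has_incremental_update(pdf_bytes: bytes) -> bool:
    """Heuristic: more than one footer suggests an appended revision."""
    return count_revision_footers(pdf_bytes) > 1
```

Reading a file with `open(path, "rb").read()` and passing the bytes in is enough to try it on a known-edited sample.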
Cross-receipt content hash
When the same expense report contains receipts that share content fingerprints across "different" merchants or dates, the API surfaces the duplication. Real receipt issuance from different vendors produces unique content; reuse-with-edits leaves a trail.
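A minimal sketch of the comparison, assuming you have already extracted the embedded content blobs (image streams, for example) from each receipt. How htpbe? computes its fingerprints is not described here; this version uses exact SHA-256 digests, so it only catches byte-identical reuse, and the function and parameter names are hypothetical.

```python
import hashlib
from collections import defaultdict


def find_shared_content(receipts: dict[str, list[bytes]]) -> dict[str, list[str]]:
    """Map each content digest to the receipts that share it.

    `receipts` maps a receipt name to the content blobs extracted from it.
    Only digests that appear in more than one receipt are returned.
    """
    seen: dict[str, set[str]] = defaultdict(set)
    for name, blobs in receipts.items():
        for blob in blobs:
            seen[hashlib.sha256(blob).hexdigest()].add(name)
    return {digest: sorted(names) for digest, names in seen.items() if len(names) > 1}
```

Shared vendor logos will also collide under this scheme, so in practice you would ignore small or known-common blobs before flagging anything.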
Modification timestamp gap
A real receipt issued at the time of purchase has CreationDate matching the transaction. A months-later modification on a "freshly issued" receipt is a high-confidence flag for post-export editing.
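The gap itself is simple date arithmetic once the two Info-dictionary values are parsed. The sketch below assumes full-length PDF date strings of the form `D:YYYYMMDDHHmmSS` and ignores any trailing timezone offset, which is acceptable when the threshold is measured in days; the function names and the one-day tolerance are assumptions for illustration.

```python
import re
from datetime import datetime, timedelta

# Matches the leading "D:YYYYMMDDHHmmSS" part of a PDF date string.
_PDF_DATE = re.compile(r"^D:(\d{14})")


def parse_pdf_date(value: str) -> datetime:
    """Parse a full-length PDF date string, ignoring the timezone suffix."""
    match = _PDF_DATE.match(value)
    if match is None:
        raise ValueError(f"unsupported PDF date: {value!r}")
    return datetime.strptime(match.group(1), "%Y%m%d%H%M%S")


def modification_gap(creation: str, modification: str) -> timedelta:
    """Time between CreationDate and ModDate."""
    return parse_pdf_date(modification) - parse_pdf_date(creation)


def is_late_modification(creation: str, modification: str,
                         tolerance: timedelta = timedelta(days=1)) -> bool:
    """Flag a modification that lands well after creation (tolerance is an assumption)."""
    return modification_gap(creation, modification) > tolerance
```

With a creation date of `D:20240205000000Z` and a modification date of `D:20240318000000Z`, the gap is 42 days, or six weeks.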
Font subset divergence across pages
Multi-session edits or page assembly change the font subset prefixes from page to page. A legitimate single-session export uses consistent subsets across all pages.
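Subset fonts are named with a six-letter uppercase tag, a plus sign, and the base font name (for example `ABCDEF+Helvetica`), so divergence can be spotted by grouping tags per base font. The sketch below scans raw bytes for that pattern across the whole file rather than per page, and it will miss names stored inside compressed object streams; treat it as an illustration of the signal under those assumptions, not a reproduction of the htpbe? check.

```python
import re
from collections import defaultdict

# Subset font names look like /ABCDEF+BaseFontName.
_SUBSET_NAME = re.compile(rb"/([A-Z]{6})\+([A-Za-z0-9_-]+)")


def subset_tags_by_font(pdf_bytes: bytes) -> dict[str, list[str]]:
    """Group the subset tags found in the file by base font name."""
    tags: dict[str, set[str]] = defaultdict(set)
    for tag, base in _SUBSET_NAME.findall(pdf_bytes):
        tags[base.decode("ascii")].add(tag.decode("ascii"))
    return {base: sorted(found) for base, found in tags.items()}


def fonts_with_multiple_subsets(pdf_bytes: bytes) -> dict[str, list[str]]:
    """Return only the base fonts that appear under more than one subset tag."""
    return {b: t for b, t in subset_tags_by_font(pdf_bytes).items() if len(t) > 1}
```

More than one tag for the same base font is a hint that content was embedded in separate sessions; it is worth corroborating with the xref and timestamp checks before acting on it.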
Two HTTP calls to verify any receipt
Buyers can skip this section. Developers: the integration is two HTTP calls.
Step 1 — submit the PDF
curl -X POST https://api.htpbe.tech/v1/analyze \
  -H "Authorization: Bearer $HTPBE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://your-storage/employee-hotel-folio.pdf"}'
Step 2 — read the verdict
{
  "id": "r1e2c3i4-5p6t-7c8h-9k0z-r1f2g3h4i5j6",
  "status": "modified",
  "modification_confidence": "high",
  "modification_markers": [
    "Spreadsheet producer detected (Microsoft Excel)",
    "Two cross-reference tables — incremental update",
    "Modification date 6 weeks after creation date"
  ],
  "producer": "Microsoft Excel",
  "creator": "Marriott Opera PMS (original)",
  "creation_date": 1707091200,
  "modification_date": 1710720000,
  "has_digital_signature": false,
  "xref_count": 2,
  "has_incremental_updates": true
}
The original came from Marriott Opera PMS. Six weeks later it was opened in Microsoft Excel and re-saved, which added a second xref table. Verdict: modified at high confidence. The employee edited a real hotel folio after checkout, likely to inflate the room or extras line.
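For teams working in Python rather than curl, the same request can be built with the standard library. The sketch below mirrors the endpoint, headers, and body from the curl example and reads the fields shown in the sample response. It assumes the two date fields are Unix timestamps in seconds (consistent with the six-week gap in the sample), and it deliberately leaves out how the verdict JSON is retrieved, since that is not spelled out here. The function names are hypothetical.

```python
import json
import urllib.request

API_URL = "https://api.htpbe.tech/v1/analyze"  # endpoint from the curl example


def build_analyze_request(pdf_url: str, api_key: str) -> urllib.request.Request:
    """Build the POST request shown in Step 1 without sending it."""
    body = json.dumps({"url": pdf_url}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def summarize_verdict(result: dict) -> dict:
    """Reduce a verdict payload to the fields a reviewer needs first.

    Assumes creation_date and modification_date are Unix seconds.
    """
    gap_seconds = result["modification_date"] - result["creation_date"]
    return {
        "status": result["status"],
        "confidence": result.get("modification_confidence"),
        "marker_count": len(result.get("modification_markers", [])),
        "gap_days": gap_seconds // 86400,
    }
```

Passing the request to `urllib.request.urlopen` sends it; applied to the sample response above, `summarize_verdict` reports a modified verdict at high confidence with three markers and a 42-day gap.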
Customer Stories
Teams that stopped document fraud
Compliance, finance, and risk teams use htpbe? to catch manipulated PDFs before they become costly mistakes.
Caught an invoice where the total had been changed by less than a thousand dollars. Without this I would have approved it without a second look.
Sarah M.
AP Manager
United States
We had three applicants in the same week with bank statements that looked completely fine. Two of them were flagged as modified. You simply cannot see this by reading the document — it is in the file structure.
Lars V.
Risk Analyst, Online Lending
Netherlands
Salary slips were coming with altered figures. We identified two problematic files before the placement was finalised.
Priya K.
HR Operations Lead
India
Since we started checking documents this way, we stopped two applications early in the process that would have been very difficult to reverse later.
Julien R.
Fraud Analyst, Fintech
France
Some applicants were sending PDFs that looked authentic but had been edited in ways not visible to the eye. We now ask for verified originals when something is flagged. Already saved us from a few bad decisions.
Marta S.
Compliance Coordinator
Spain
One invoice was caught because there was a mismatch between the document dates and structure. That particular case would have cost us significantly.
Tariq A.
Finance Manager
United Arab Emirates
Related solutions and guides
Expense Reimbursement
Receipt + hotel folio + flight itinerary forensics for finance ops and T&E review teams.
Insurance Claims
Receipts attached to property and travel claims — same forensics, claims-ops angle.
AI-Generated Document Detection
Sister page — focused treatment of LLM- and AI-tool-generated PDFs.
Fake Repair Estimate Detection
Same claims cluster — forensics for repair estimate PDFs in insurance and expense workflows.
Secure your workflow
Create your account — API key on signup, free test environment on every plan.
From $15/mo. No sales call. Cancel any time.