Comparison

Why PDF Metadata Tools Miss Most Document Fraud

ExifTool, PDF metadata viewers, and generic document inspection tools show you what metadata says. htpbe? cross-validates metadata against file structure, font patterns, and digital signatures — because metadata is exactly what fraudsters manipulate first.

The core problem

Metadata Is the First Thing Fraudsters Clean

When a fraudster modifies a bank statement, the first thing they do is clean the metadata. Any tool that only reads metadata fields will show a clean result after this step. The real evidence is in the structural layers: cross-reference tables, object streams, font subsets, incremental update history. These layers are very difficult to erase without completely regenerating the file, and regeneration itself leaves a detectable trace.

Side by Side

Metadata Tools vs htpbe?

Both read PDF files. Only one reads what fraudsters can’t erase.

| | PDF Metadata Viewers (ExifTool, etc.) | htpbe? |
|---|---|---|
| What they check | Metadata fields (producer, dates, author) | Metadata + 6 structural layers |
| Fooled by | Clearing metadata fields | Very difficult; structural traces remain even after a metadata wipe |
| Requires original file | No | No |
| Detects edited metadata | Shows values, doesn’t validate them | Cross-validates against internal binary structure |
| Digital signature analysis | No | Yes: detects post-signing edits and removed signatures |
| Font fingerprinting | No | Yes: detects pages from different source documents |
| Result format | Raw field dump | Structured verdict: INTACT / MODIFIED / INCONCLUSIVE |
| API | No (CLI tools) | Yes: REST API, integrates into any workflow |

Structural Analysis

What Structural Analysis Catches That Metadata Tools Miss

Four forensic signals that editing leaves in a PDF’s binary structure, outside any metadata field.

1

Incremental update traces

When a PDF is reopened and edited, the changes are often appended as a new revision rather than rewriting the file. Each appended revision carries its own cross-reference section and trailer, so the trail lives in the file structure, not in any metadata field. A tool that only reads metadata fields does not show it.
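As a rough illustration of this signal, the sketch below counts end-of-file markers and trailer back-pointers in the raw bytes. It is a minimal heuristic under stated assumptions, not htpbe?'s actual check: the function names are made up for this example, and linearized ("fast web view") files can legitimately contain more than one marker, so a hit is a hint rather than proof.

```python
import re

# "/Prev <offset>" in a trailer points at the previous cross-reference section.
# The lookahead skips indirect references such as "/Prev 12 0 R" (outline items).
PREV_OFFSET = re.compile(rb"/Prev\s+(\d+)\b(?!\s+\d+\s+R\b)")


def count_eof_markers(pdf_bytes: bytes) -> int:
    """Count %%EOF markers; each saved revision normally ends with one."""
    return len(re.findall(rb"%%EOF", pdf_bytes))


def prev_xref_offsets(pdf_bytes: bytes) -> list[int]:
    """Return the byte offsets named by /Prev entries in trailers."""
    return [int(offset) for offset in PREV_OFFSET.findall(pdf_bytes)]


def looks_incrementally_updated(pdf_bytes: bytes) -> bool:
    """Heuristic: more than one %%EOF plus at least one /Prev back-pointer."""
    return count_eof_markers(pdf_bytes) > 1 and bool(prev_xref_offsets(pdf_bytes))
```

To try it on a real file, read the bytes with `open(path, "rb").read()` and pass them to `looks_incrementally_updated`.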

2

Font subset divergence

Pages assembled from different source PDFs carry distinct font subset prefixes: the six-letter tag placed before the font name, as in ABCDEF+Helvetica. These prefixes are assigned at PDF generation time and reveal when content originated in a different document. None of this appears in a metadata view.
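The idea can be sketched by collecting subset tags from font dictionaries and flagging any base font that appears under more than one tag. This is an illustrative heuristic with assumed function names, not htpbe?'s method. It only sees dictionaries stored uncompressed in the file, and some generators subset the same font more than once in a single legitimate document, so treat a hit as a prompt for closer inspection.

```python
import re
from collections import defaultdict

# A subset font name is six uppercase letters, a plus sign, then the base name,
# for example "/BaseFont /ABCDEF+Helvetica".
SUBSET_FONT = re.compile(
    rb"/(?:BaseFont|FontName)\s*/([A-Z]{6})\+([^\s/<>\[\]()]+)"
)


def subset_tags_by_font(pdf_bytes: bytes) -> dict[str, set[str]]:
    """Map each base font name to the set of subset tags it appears under."""
    tags: dict[str, set[str]] = defaultdict(set)
    for tag, name in SUBSET_FONT.findall(pdf_bytes):
        tags[name.decode("latin-1")].add(tag.decode("ascii"))
    return dict(tags)


def fonts_with_multiple_subsets(pdf_bytes: bytes) -> dict[str, list[str]]:
    """Return only the fonts embedded under more than one subset tag."""
    return {
        name: sorted(tags)
        for name, tags in subset_tags_by_font(pdf_bytes).items()
        if len(tags) > 1
    }
```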

3

Signature invalidation

A digital signature cryptographically covers the bytes of the file as they existed at the moment of signing. If those bytes change afterward, the signature no longer validates, and anything appended later falls outside the signed range, yet the document may still show a signature field. Only structural analysis reveals the mismatch.
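One structural check along these lines is to read the signature's /ByteRange array, which lists the two byte spans the signature covers, and measure how much of the file lies beyond it. The sketch below does only that, with an assumed function name. It does not verify the cryptography, and some legitimate workflows append data after signing (later signatures or validation material, for example), so trailing bytes are a reason to look closer rather than a verdict.

```python
import re
from typing import Optional

# /ByteRange [start1 length1 start2 length2]; the gap between the two spans
# holds the signature value itself.
BYTE_RANGE = re.compile(
    rb"/ByteRange\s*\[\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*\]"
)


def bytes_after_signed_range(pdf_bytes: bytes) -> Optional[int]:
    """Return how many bytes follow the widest signed range.

    Returns None when the file contains no /ByteRange entry at all.
    """
    signed_ends = []
    for match in BYTE_RANGE.finditer(pdf_bytes):
        _start1, _len1, start2, len2 = map(int, match.groups())
        signed_ends.append(start2 + len2)
    if not signed_ends:
        return None
    return len(pdf_bytes) - max(signed_ends)
```

A result of zero means the last signature covers the file to its final byte; a large positive value means content was added after signing.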

4

Generator fingerprint mismatch

The PDF binary contains a producer fingerprint embedded in its object structure, independent of the declared metadata. When these contradict each other — a known generator signature paired with mismatched metadata — it indicates the metadata was altered after creation.

Decision guide

When to Use Each

Metadata tools (ExifTool, etc.)

Quick manual spot-check

Useful when you already suspect something specific in a single document and want to quickly inspect raw field values. Reasonable starting point for a one-off investigation by someone who knows what they’re looking at.

htpbe?

Automated pipeline at scale

When you need consistent, scalable, tamper-resistant detection across hundreds of documents per month. Integrates via REST API into your lending, compliance, or accounts payable workflow. Returns a structured verdict in under 3 seconds, not a raw field dump to interpret manually.
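In a pipeline, the three-value verdict maps naturally onto a routing decision. The sketch below shows one way to handle it on the client side. The JSON field name `verdict` and the routing labels are assumptions made for illustration; they are not taken from the htpbe? API reference, which should be consulted for the real response shape.

```python
import json
from enum import Enum


class Verdict(Enum):
    INTACT = "INTACT"
    MODIFIED = "MODIFIED"
    INCONCLUSIVE = "INCONCLUSIVE"


# Hypothetical routing policy; adjust to your own workflow.
NEXT_STEP = {
    Verdict.INTACT: "continue",
    Verdict.MODIFIED: "reject",
    Verdict.INCONCLUSIVE: "manual_review",
}


def next_step(response_body: str) -> str:
    """Map a JSON response body to a workflow step.

    Assumes the body is a JSON object with a "verdict" key (an assumed name).
    Anything missing or unrecognized goes to manual review.
    """
    payload = json.loads(response_body)
    try:
        verdict = Verdict(payload["verdict"])
    except (KeyError, ValueError):
        return "manual_review"
    return NEXT_STEP[verdict]
```

Sending unknown or missing verdicts to manual review keeps the pipeline from silently approving a document it could not classify.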

See the full detection surface

35 checks. 7 structural layers. One verdict.

Read the complete breakdown of every signal htpbe? analyzes, or go straight to pricing to see which plan fits your volume.

Secure your workflow

Create your account — API key on signup, free test environment on every plan.
From $15/mo. No sales call. Cancel any time.