How HTPBE? Proves a PDF Was Altered After It Was Issued
61 forensic checks run on every API call — catching structural manipulation that visual review and KYC platforms miss. Under 3 seconds per document, no original file required.
Why structural analysis
Visual tools check appearance. HTPBE? checks what the software wrote.
A growing class of document fraud detection tools operates on what a PDF looks like — scanning for pixel-level inconsistencies, lighting artifacts, or unusual noise patterns. For physical documents that have been photographed, this has genuine value. For native digital PDFs — the kind a bank, payroll platform, or accounting system generates directly — it addresses the wrong question.
Sophisticated document fraud does not alter the image of a number. It replaces the underlying data while leaving the visual presentation intact. The resulting document passes every pixel-level check cleanly, because visually nothing changed.
HTPBE? examines the internal revision history, the object structure, the font assembly records, and the signature coverage maps: layers that rendering engines never expose to the eye. A document can look exactly right and still be structurally compromised, and that structural compromise is what HTPBE? detects.
Detection coverage
What the 61 checks cover
Each domain below answers a question a fraud analyst would otherwise ask by hand: do the dates on a bank statement contradict each other? Was a signature stripped after signing? The 61 forensic checks automate those questions and flag tampered documents in seconds.
All checks run independently on every document. Multiple findings can fire at once, and each is reported separately with its own confidence level.
Metadata & Timestamps
- Modification date postdating the declared creation date
  HTPBE_DATES_DISAGREE
- Two internal timestamp records that contradict each other
  HTPBE_TIMESTAMP_LAYERS_DISAGREE
- Author or title fields that disagree between internal records
  HTPBE_IDENTITY_LAYERS_DISAGREE
- Timestamp values that are physically impossible
  HTPBE_IMPOSSIBLE_TIMESTAMP
- Timezone offset that doesn’t correspond to any real location
  HTPBE_INVALID_TIMEZONE
- Annotation timestamps that predate the document’s own creation date
  HTPBE_ANNOTATION_PREDATES_CREATION
- Document title showing residue from a known editing or unlock tool
  HTPBE_UNLOCK_TOOL_RESIDUE
- Timestamps within the document metadata that disagree with each other, indicating metadata was rewritten after authoring
  HTPBE_XMP_DATES_DESYNCED
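To make the timestamp checks concrete, here is a minimal sketch of how a few of them could be approximated by hand. It is not the HTPBE? implementation, which is not described on this page. It assumes the creation and modification dates have already been extracted as PDF date strings (the `D:YYYYMMDDHHmmSSOHH'mm'` form defined in the PDF specification). The function names are hypothetical, and the marker names are reused from the list above only as readable labels.

```python
import re
from datetime import datetime, timedelta, timezone

# PDF date strings: D:YYYYMMDDHHmmSS followed by Z or +HH'mm' / -HH'mm'.
# Every field after the year is optional.
_PDF_DATE = re.compile(
    r"^D:(\d{4})(\d{2})?(\d{2})?(\d{2})?(\d{2})?(\d{2})?"
    r"(?:(Z)|([+-])(\d{2})'?(\d{2})?'?)?$"
)

def parse_pdf_date(raw: str) -> tuple[datetime, bool]:
    """Return (timestamp normalised to UTC, offset_is_plausible).

    Raises ValueError for malformed strings or out-of-range fields.
    """
    m = _PDF_DATE.match(raw.strip())
    if m is None:
        raise ValueError(f"not a PDF date string: {raw!r}")
    year, month, day, hour, minute, second, _zulu, sign, tz_h, tz_m = m.groups()
    offset = timedelta(0)
    if sign:
        offset = timedelta(hours=int(tz_h), minutes=int(tz_m or 0))
        if sign == "-":
            offset = -offset
    # datetime() itself rejects month 13, day 40, hour 25 and so on.
    local = datetime(int(year), int(month or 1), int(day or 1),
                     int(hour or 0), int(minute or 0), int(second or 0))
    # UTC offsets in real use span -12:00 to +14:00; this is a range check only.
    plausible = timedelta(hours=-12) <= offset <= timedelta(hours=14)
    # local = UTC + offset, so subtract the offset to normalise.
    return (local - offset).replace(tzinfo=timezone.utc), plausible

def date_findings(creation_raw: str, mod_raw: str) -> list[str]:
    """Illustrative labels only; real detection logic is certainly richer."""
    try:
        created, created_ok = parse_pdf_date(creation_raw)
        modified, modified_ok = parse_pdf_date(mod_raw)
    except ValueError:
        return ["HTPBE_IMPOSSIBLE_TIMESTAMP"]
    findings = []
    if not (created_ok and modified_ok):
        findings.append("HTPBE_INVALID_TIMEZONE")
    if modified > created:
        findings.append("HTPBE_DATES_DISAGREE")
    return findings
```

Normalising to UTC before comparing matters: two timestamps written in different offsets can look out of order as raw strings while describing the same instant.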
Digital Signatures
- Content changed after the document was digitally signed
  HTPBE_POST_SIGNATURE_EDIT
- Evidence that a digital signature was stripped after signing
  HTPBE_SIGNATURE_REMOVED
Incremental Update History
- Multiple revision layers, meaning the document was modified after creation
  HTPBE_MULTIPLE_REVISION_LAYERS
- Edit history limited to metadata fields, characteristic of automated timestamp tampering
  HTPBE_METADATA_ONLY_REVISION
- Editing history collapsed into a single revision to mask multiple prior modification rounds inside what claims to be a freshly generated file
  HTPBE_FLATTENED_EDIT_HISTORY
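Revision layers exist because the PDF format allows incremental updates: an editor can append changed objects, a new cross-reference section, and a new trailer to the end of the file instead of rewriting it. The sketch below shows a deliberately naive way to spot that pattern by counting end-of-file markers. The function name is hypothetical and this is an illustration of the idea, not the service's method.

```python
def revision_markers(pdf_bytes: bytes) -> dict:
    """Count the markers that each appended revision normally leaves behind."""
    eof_markers = pdf_bytes.count(b"%%EOF")
    startxref_markers = pdf_bytes.count(b"startxref")
    return {
        "eof_markers": eof_markers,
        "startxref_markers": startxref_markers,
        # More than one %%EOF is a hint, not proof, of an appended revision.
        "possible_incremental_update": eof_markers > 1,
    }
```

A raw count is only a first signal. A linearized ("fast web view") file can legitimately contain more than one marker, and a flattened file shows a single marker regardless of how many edits preceded it.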
Generator Fingerprinting
- Declared authoring software doesn’t match actual file construction
  HTPBE_TOOL_VS_STRUCTURE_MISMATCH
- Font handling inconsistent with the declared authoring tool
  HTPBE_FONT_VS_TOOL_MISMATCH
- Residual generator identity left behind by a tool that tried to claim a different origin
  HTPBE_RESIDUAL_PRIOR_GENERATOR
- Fingerprints of a known PDF editing tool
  HTPBE_EDITING_TOOL_FINGERPRINT
- Document carries the identity of an online PDF editing or conversion service
  HTPBE_ONLINE_EDITOR_ORIGIN
- Generator identity selectively edited rather than fully replaced
  HTPBE_SELECTIVE_IDENTITY_EDIT
- Entire generator identity overwritten to hide origin
  HTPBE_IDENTITY_OVERWRITTEN
- Identity claims that match no legitimate authoring tool
  HTPBE_UNKNOWN_IDENTITY_CLAIM
- Active scripting added on top of a previously re-emitted document
  HTPBE_SCRIPTING_ON_REEMITTED
- Different generator identities across different parts of the file
  HTPBE_LAYERED_IDENTITY_DISAGREEMENT
- Tool combinations that don’t occur in any legitimate publishing pipeline
  HTPBE_IMPOSSIBLE_TOOL_PIPELINE
- Identity fields deliberately blanked by post-processing
  HTPBE_IDENTITY_BLANKED
- Document constructed from a design-tool template
  HTPBE_DESIGN_TEMPLATE_ASSEMBLY
- Document embeds a typeface whose identity shows signs of manual alteration inconsistent with any legitimate type foundry release
  HTPBE_ALTERED_TYPEFACE_IDENTITY
- Document carries default-metadata fingerprints of a generator that contradicts its declared authoring toolchain
  HTPBE_TOOLCHAIN_DEFAULT_METADATA_RESIDUE
- Document was re-printed to erase its editing history, but residual evidence of post-authoring value substitution remains
  HTPBE_PRINT_PIPELINE_TAMPERING
- Evidence the document’s font layer is inconsistent with how genuine institutional issuers assemble their typography
  HTPBE_FONT_LAYER_INCONSISTENT_WITH_ISSUER
- Evidence the document’s declared institutional generator was spoofed by a third-party tool to disguise its origin
  HTPBE_PRODUCER_IDENTITY_FORGED
Document Assembly
- Pages assembled from independently rendered sources
  HTPBE_PAGES_FROM_MULTIPLE_SOURCES
- Residual identity from a different document embedded inside this one
  HTPBE_RESIDUAL_DOCUMENT_STRUCTURE
- Pages produced by inconsistent classes of tooling within the same document
  HTPBE_MIXED_TOOLING_CLASSES
- Scanned image inserted programmatically into a digital document
  HTPBE_SCAN_IN_DIGITAL_DOC
- Pages re-exported through an office-suite editor after creation
  HTPBE_REEXPORTED_THROUGH_OFFICE_SUITE
- Pages of physically different dimensions within the same document
  HTPBE_MIXED_PAGE_DIMENSIONS
- Template coverage that breaks pattern between pages
  HTPBE_TEMPLATE_PATTERN_BREAK
Content Stream Analysis
- Editing markers inside the content stream
  HTPBE_CONTENT_STREAM_EDITING_MARKERS
- Drawing operations inconsistent with the document’s claimed authoring tool
  HTPBE_DRAWING_OPS_INCONSISTENT
- Text converted to vector shapes to defeat downstream extraction
  HTPBE_TEXT_AS_VECTOR_OUTLINES
- Document rebuilt by a consumer print-driver utility, erasing original authoring history
  HTPBE_PRINTED_OUT_OF_PDF_READER
- Text layer floating above scanned images, with values overlaid on a scan
  HTPBE_TEXT_OVERLAY_ON_SCAN
- Redaction markings that don’t actually conceal the underlying content
  HTPBE_INCOMPLETE_REDACTION
- Invisible duplicate text shadowing visible values on the page
  HTPBE_INVISIBLE_DUPLICATE_TEXT
- Targeted character-level overlays added by a desktop PDF editor
  HTPBE_CHARACTER_OVERLAY_EDIT
- Targeted glyph-level edits where individual characters were replaced after authoring
  HTPBE_GLYPH_LEVEL_EDIT
- A single typeface embedded in two incompatible ways on the same page
  HTPBE_MIXED_FONT_EMBEDDING
- Editor re-save patterns where original font embedding was partially replaced
  HTPBE_PARTIAL_FONT_REPLACEMENT
- Annotations from a markup tool burned into the page after original creation
  HTPBE_BURNED_IN_ANNOTATIONS
- Form field value that disagrees with the value visible inside the widget
  HTPBE_WIDGET_APPEARANCE_MISMATCH
Image Forensics
- Embedded image showing signs of having been edited and re-saved
  HTPBE_IMAGE_EDITED_AND_RESAVED
- Document content collapsed to image form to discard the editing history
  HTPBE_COLLAPSED_TO_RASTER
- Image geometry indicating the document is a scan of physical paper
  HTPBE_PHYSICAL_SCAN_GEOMETRY
- Documents that claim scanner origin without matching genuine scanner output
  HTPBE_FAKE_SCANNER_ORIGIN
- Documents presented as a captured image whose pixel content shows machine-rendered uniformity inconsistent with optical capture
  HTPBE_RENDERED_PSEUDO_SCAN
Structural Integrity
- Internal navigation tables that don’t match the rest of the file
  HTPBE_NAVIGATION_TABLE_MISMATCH
- Declared file size that doesn’t match actual content
  HTPBE_DECLARED_SIZE_MISMATCH
- Document identifier inconsistent across the file’s internal records
  HTPBE_IDENTIFIER_MISMATCH
- Extra content appended past the document’s normal end marker
  HTPBE_TRAILING_BYTES_AFTER_EOF
- Internal reference to a structure that’s missing from the file
  HTPBE_DANGLING_STRUCTURE_REFERENCE
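The trailing-bytes check is the easiest of these to picture. A well-formed PDF ends with a `%%EOF` marker followed by, at most, whitespace, so anything substantial after the final marker was appended by something other than a normal PDF writer. A minimal sketch of that idea, with a hypothetical function name and no claim to mirror the service's logic:

```python
# Bytes the PDF format treats as whitespace: NUL, tab, LF, form feed, CR, space.
PDF_WHITESPACE = b"\x00\t\n\x0c\r "

def trailing_bytes_after_eof(pdf_bytes: bytes) -> bytes:
    """Return any non-whitespace content found after the last %%EOF marker."""
    marker = b"%%EOF"
    position = pdf_bytes.rfind(marker)
    if position == -1:
        raise ValueError("no %%EOF marker found; this may not be a PDF")
    tail = pdf_bytes[position + len(marker):]
    return tail.strip(PDF_WHITESPACE)
```

An empty result means the file ends cleanly. A non-empty result is worth inspecting, since appended payloads are sometimes other file formats concatenated onto the PDF.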
How it works
Three steps, under 3 seconds
Send a PDF URL
POST a publicly accessible URL to /api/v1/analyze. No file upload, no size limit for the API. You get back a check ID immediately.
8 forensic domains analyzed
All 61 checks run in parallel across metadata and timestamps, digital signatures, incremental update history, generator fingerprinting, document assembly, content streams, image forensics, and structural integrity.
Verdict + named markers
GET /api/v1/result/{id} for a structured verdict — intact, modified, or inconclusive — with every triggered finding named individually and confidence rated.
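The submit-then-fetch flow can be sketched with the Python standard library. Only the two paths, `/api/v1/analyze` and `/api/v1/result/{id}`, come from this page. The host name is a placeholder, and the `url` body field, the Bearer authorization header, and all function names are assumptions made for illustration, so check the API reference for the real request shape.

```python
import json
import urllib.parse
import urllib.request

# Placeholder host: substitute the real API base URL from the API reference.
BASE_URL = "https://api.example.invalid"

def build_analyze_request(pdf_url: str, api_key: str) -> urllib.request.Request:
    """Step 1: POST a publicly accessible PDF URL for analysis."""
    body = json.dumps({"url": pdf_url}).encode("utf-8")  # field name assumed
    return urllib.request.Request(
        BASE_URL + "/api/v1/analyze",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # auth scheme assumed
        },
    )

def build_result_request(check_id: str, api_key: str) -> urllib.request.Request:
    """Step 3: GET the structured verdict for a previously returned check ID."""
    safe_id = urllib.parse.quote(check_id, safe="")
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/result/{safe_id}",
        method="GET",
        headers={"Authorization": f"Bearer {api_key}"},
    )

def send(request: urllib.request.Request) -> dict:
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)

# Usage (not executed here; the "id" key in the first response is an assumption):
# check = send(build_analyze_request("https://example.com/statement.pdf", "MY_KEY"))
# verdict = send(build_result_request(check["id"], "MY_KEY"))
```

Building the requests separately from sending them keeps the sketch testable offline and makes it easy to swap in a different HTTP client.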
Example API response
What you get back
A bank statement with a removed digital signature and mismatched timestamps returns this. Each marker is a named, actionable forensic finding — not a score.
{
  "id": "9f3a2c1d-8b47-4e6f-a012-3d5e7f890123",
  "status": "modified",
  "modification_confidence": "certain",
  "modification_markers": [
    "HTPBE_SIGNATURE_REMOVED",
    "HTPBE_DATES_DISAGREE"
  ],
  "origin": {
    "type": "institutional",
    "software": null
  },
  "creator": "Adobe Acrobat Pro DC",
  "has_incremental_updates": true,
  "update_chain_length": 3,
  "xref_count": 4,
  "page_count": 12
}
Full schema and all marker descriptions are in the API reference.
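Because the findings arrive as named markers rather than a score, client code can work with them directly. The sketch below parses the sample response above into a small typed record. The field names come from that sample. The `Verdict` class and `parse_verdict` function are hypothetical, and treating the confidence and marker fields as optional is an assumption about responses that are not modified.

```python
import json
from dataclasses import dataclass
from typing import Optional

SAMPLE_RESPONSE = """
{
  "id": "9f3a2c1d-8b47-4e6f-a012-3d5e7f890123",
  "status": "modified",
  "modification_confidence": "certain",
  "modification_markers": ["HTPBE_SIGNATURE_REMOVED", "HTPBE_DATES_DISAGREE"],
  "origin": {"type": "institutional", "software": null},
  "creator": "Adobe Acrobat Pro DC",
  "has_incremental_updates": true,
  "update_chain_length": 3,
  "xref_count": 4,
  "page_count": 12
}
"""

@dataclass(frozen=True)
class Verdict:
    check_id: str
    status: str                 # "intact", "modified", or "inconclusive"
    confidence: Optional[str]   # assumed absent when nothing was modified
    markers: tuple[str, ...]

def parse_verdict(payload: str) -> Verdict:
    """Extract the fields a fraud workflow usually routes on."""
    data = json.loads(payload)
    return Verdict(
        check_id=data["id"],
        status=data["status"],
        confidence=data.get("modification_confidence"),
        markers=tuple(data.get("modification_markers") or ()),
    )

verdict = parse_verdict(SAMPLE_RESPONSE)
print(verdict.status, verdict.confidence, list(verdict.markers))
```

Keeping the markers as a tuple of strings makes it simple to log each finding individually or to match on specific ones, such as a removed signature.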
Close the structural gap your KYC stack leaves open
Start with welcome credits, no card required. Point the API at a PDF URL and read the verdict in seconds.
Verdicts
Three possible outcomes
intact: No modification detected
No structural evidence of post-creation changes. The file structure matches what the generating software would produce without intervention.
An intact verdict means no structural evidence of modification was found. It does not guarantee the content is truthful: a document fabricated from scratch with false data still returns intact.
inconclusive: Origin can’t be confirmed, so escalate
The document was created in consumer-grade software (Microsoft Word, LibreOffice, a free online converter) that does not leave the institutional markers genuinely issued documents always contain.
For fraud teams: real banks and payroll engines generate documents programmatically. Inconclusive on a bank statement or payslip means the file was assembled outside an institutional system.
modified: Forensic evidence of tampering
One or more structural markers confirm post-creation modification. Confidence is certain when the evidence is a signature or date contradiction, and high for all other markers.
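The three verdicts map naturally onto a routing decision in a review pipeline. The sketch below is one possible policy, not a recommendation from the service: the action names and the `triage` function are hypothetical, and a real team would set its own thresholds.

```python
from typing import Optional

def triage(status: str, confidence: Optional[str] = None) -> str:
    """Map a verdict to a next step in a hypothetical review workflow."""
    if status == "intact":
        # No structural evidence of edits; the content itself is still unverified.
        return "continue_with_content_checks"
    if status == "inconclusive":
        # Origin could not be confirmed, so a human should look.
        return "escalate_to_manual_review"
    if status == "modified":
        # "certain" covers signature and date contradictions; other markers are "high".
        if confidence == "certain":
            return "reject_document"
        return "manual_review_high_priority"
    raise ValueError(f"unknown verdict status: {status!r}")
```

Raising on an unknown status, rather than defaulting to a pass, keeps a future or misspelled verdict from silently approving a document.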
Secure your workflow
Create your account — API key on signup, free test environment on every plan.
From $15/mo. No sales call. Cancel any time.