61 Forensic Checks

How HTPBE? Proves a PDF Was Altered After It Was Issued

61 forensic checks run on every API call — catching structural manipulation that visual review and KYC platforms miss. Under 3 seconds per document, no original file required.

Why structural analysis

Visual tools check appearance. HTPBE? checks what the software wrote.

A growing class of document fraud detection tools operates on what a PDF looks like — scanning for pixel-level inconsistencies, lighting artifacts, or unusual noise patterns. For physical documents that have been photographed, this has genuine value. For native digital PDFs — the kind a bank, payroll platform, or accounting system generates directly — it addresses the wrong question.

Sophisticated document fraud does not alter the image of a number. It replaces the underlying data while leaving the visual presentation intact. The resulting document passes every pixel-level check cleanly, because visually nothing changed.

HTPBE? examines the internal revision history, the object structure, the font assembly records, and the signature coverage maps: layers that rendering engines never expose to the eye. A document can look exactly right and still be structurally compromised. HTPBE? is built to find those documents.
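As a rough illustration of what "internal revision history" can mean at the byte level, the sketch below counts end-of-file markers in a raw PDF. It relies on a general property of the PDF format (an incremental save appends a new cross-reference section and a new %%EOF marker to the file) and is not a description of how HTPBE? works internally. The function name `revision_hints` is hypothetical, and the heuristic is deliberately crude: the marker bytes can also occur inside stream data, and some legitimately produced files (linearized ones, for example) can contain more than one marker.

```python
EOF_MARKER = b"%%EOF"


def revision_hints(pdf_bytes: bytes) -> dict:
    """Return crude structural hints from raw PDF bytes.

    Hypothetical helper for illustration only; a real analyzer would
    parse the cross-reference chain instead of scanning for byte patterns.
    """
    eof_count = pdf_bytes.count(EOF_MARKER)
    startxref_count = pdf_bytes.count(b"startxref")

    # Anything other than whitespace after the last %%EOF is unexpected.
    trailing = b""
    last_eof = pdf_bytes.rfind(EOF_MARKER)
    if last_eof != -1:
        trailing = pdf_bytes[last_eof + len(EOF_MARKER):].strip()

    return {
        "eof_markers": eof_count,
        "startxref_sections": startxref_count,
        "trailing_bytes": len(trailing),
    }
```

Reading a file with `open(path, "rb").read()` and passing the bytes to `revision_hints` is enough to try it. More than one marker suggests, but does not prove, that the file was saved again after it was first written.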

61 forensic checks per document
< 3 sec median analysis time
8 forensic domains analyzed
No original needed to compare against

Detection coverage

What the 61 checks cover

Each domain below answers a question a fraud analyst would otherwise check by hand: do the dates on a bank statement contradict each other? Was a signature stripped after signing? The 61 forensic checks automate those questions and flag tampered documents in seconds.

Every document runs all checks independently. Multiple findings can fire simultaneously — each reported separately with its own confidence level.

Metadata & Timestamps (8 checks)

  • Modification date postdating the declared creation date (HTPBE_DATES_DISAGREE)
  • Two internal timestamp records that contradict each other (HTPBE_TIMESTAMP_LAYERS_DISAGREE)
  • Author or title fields that disagree between internal records (HTPBE_IDENTITY_LAYERS_DISAGREE)
  • Timestamp values that are physically impossible (HTPBE_IMPOSSIBLE_TIMESTAMP)
  • Timezone offset that doesn’t correspond to any real location (HTPBE_INVALID_TIMEZONE)
  • Annotation timestamps that predate the document’s own creation date (HTPBE_ANNOTATION_PREDATES_CREATION)
  • Document title showing residue from a known editing or unlock tool (HTPBE_UNLOCK_TOOL_RESIDUE)
  • Timestamps within the document metadata that disagree with each other, indicating metadata was rewritten after authoring (HTPBE_XMP_DATES_DESYNCED)

Digital Signatures (2 checks)

  • Content changed after the document was digitally signed (HTPBE_POST_SIGNATURE_EDIT)
  • Evidence that a digital signature was stripped after signing (HTPBE_SIGNATURE_REMOVED)

Incremental Update History (3 checks)

  • Multiple revision layers, meaning the document was modified after creation (HTPBE_MULTIPLE_REVISION_LAYERS)
  • Edit history limited to metadata fields, characteristic of automated timestamp tampering (HTPBE_METADATA_ONLY_REVISION)
  • Editing history collapsed into a single revision to mask multiple prior modification rounds inside what claims to be a freshly generated file (HTPBE_FLATTENED_EDIT_HISTORY)

Generator Fingerprinting (18 checks)

  • Declared authoring software doesn’t match actual file construction (HTPBE_TOOL_VS_STRUCTURE_MISMATCH)
  • Font handling inconsistent with the declared authoring tool (HTPBE_FONT_VS_TOOL_MISMATCH)
  • Residual generator identity left behind by a tool that tried to claim a different origin (HTPBE_RESIDUAL_PRIOR_GENERATOR)
  • Fingerprints of a known PDF editing tool (HTPBE_EDITING_TOOL_FINGERPRINT)
  • Document carries the identity of an online PDF editing or conversion service (HTPBE_ONLINE_EDITOR_ORIGIN)
  • Generator identity selectively edited rather than fully replaced (HTPBE_SELECTIVE_IDENTITY_EDIT)
  • Entire generator identity overwritten to hide origin (HTPBE_IDENTITY_OVERWRITTEN)
  • Identity claims that match no legitimate authoring tool (HTPBE_UNKNOWN_IDENTITY_CLAIM)
  • Active scripting added on top of a previously re-emitted document (HTPBE_SCRIPTING_ON_REEMITTED)
  • Different generator identities across different parts of the file (HTPBE_LAYERED_IDENTITY_DISAGREEMENT)
  • Tool combinations that don’t occur in any legitimate publishing pipeline (HTPBE_IMPOSSIBLE_TOOL_PIPELINE)
  • Identity fields deliberately blanked by post-processing (HTPBE_IDENTITY_BLANKED)
  • Document constructed from a design-tool template (HTPBE_DESIGN_TEMPLATE_ASSEMBLY)
  • Document embeds a typeface whose identity shows signs of manual alteration inconsistent with any legitimate type foundry release (HTPBE_ALTERED_TYPEFACE_IDENTITY)
  • Document carries default-metadata fingerprints of a generator that contradicts its declared authoring toolchain (HTPBE_TOOLCHAIN_DEFAULT_METADATA_RESIDUE)
  • Document was re-printed to erase its editing history, but residual evidence of post-authoring value substitution remains (HTPBE_PRINT_PIPELINE_TAMPERING)
  • Evidence the document’s font layer is inconsistent with how genuine institutional issuers assemble their typography (HTPBE_FONT_LAYER_INCONSISTENT_WITH_ISSUER)
  • Evidence the document’s declared institutional generator was spoofed by a third-party tool to disguise its origin (HTPBE_PRODUCER_IDENTITY_FORGED)

Document Assembly (7 checks)

  • Pages assembled from independently rendered sources (HTPBE_PAGES_FROM_MULTIPLE_SOURCES)
  • Residual identity from a different document embedded inside this one (HTPBE_RESIDUAL_DOCUMENT_STRUCTURE)
  • Pages produced by inconsistent classes of tooling within the same document (HTPBE_MIXED_TOOLING_CLASSES)
  • Scanned image inserted programmatically into a digital document (HTPBE_SCAN_IN_DIGITAL_DOC)
  • Pages re-exported through an office-suite editor after creation (HTPBE_REEXPORTED_THROUGH_OFFICE_SUITE)
  • Pages of physically different dimensions within the same document (HTPBE_MIXED_PAGE_DIMENSIONS)
  • Template coverage that breaks pattern between pages (HTPBE_TEMPLATE_PATTERN_BREAK)

Content Stream Analysis (13 checks)

  • Editing markers inside the content stream (HTPBE_CONTENT_STREAM_EDITING_MARKERS)
  • Drawing operations inconsistent with the document’s claimed authoring tool (HTPBE_DRAWING_OPS_INCONSISTENT)
  • Text converted to vector shapes to defeat downstream extraction (HTPBE_TEXT_AS_VECTOR_OUTLINES)
  • Document rebuilt by a consumer print-driver utility, erasing original authoring history (HTPBE_PRINTED_OUT_OF_PDF_READER)
  • Text layer floating above scanned images, with values overlaid on a scan (HTPBE_TEXT_OVERLAY_ON_SCAN)
  • Redaction markings that don’t actually conceal the underlying content (HTPBE_INCOMPLETE_REDACTION)
  • Invisible duplicate text shadowing visible values on the page (HTPBE_INVISIBLE_DUPLICATE_TEXT)
  • Targeted character-level overlays added by a desktop PDF editor (HTPBE_CHARACTER_OVERLAY_EDIT)
  • Targeted glyph-level edits where individual characters were replaced after authoring (HTPBE_GLYPH_LEVEL_EDIT)
  • A single typeface embedded in two incompatible ways on the same page (HTPBE_MIXED_FONT_EMBEDDING)
  • Editor re-save patterns where original font embedding was partially replaced (HTPBE_PARTIAL_FONT_REPLACEMENT)
  • Annotations from a markup tool burned into the page after original creation (HTPBE_BURNED_IN_ANNOTATIONS)
  • Form field value that disagrees with the value visible inside the widget (HTPBE_WIDGET_APPEARANCE_MISMATCH)

Image Forensics (5 checks)

  • Embedded image showing signs of having been edited and re-saved (HTPBE_IMAGE_EDITED_AND_RESAVED)
  • Document content collapsed to image form to discard the editing history (HTPBE_COLLAPSED_TO_RASTER)
  • Image geometry indicating the document is a scan of physical paper (HTPBE_PHYSICAL_SCAN_GEOMETRY)
  • Documents that claim scanner origin without matching genuine scanner output (HTPBE_FAKE_SCANNER_ORIGIN)
  • Documents presented as a captured image whose pixel content shows machine-rendered uniformity inconsistent with optical capture (HTPBE_RENDERED_PSEUDO_SCAN)

Structural Integrity (5 checks)

  • Internal navigation tables that don’t match the rest of the file (HTPBE_NAVIGATION_TABLE_MISMATCH)
  • Declared file size that doesn’t match actual content (HTPBE_DECLARED_SIZE_MISMATCH)
  • Document identifier inconsistent across the file’s internal records (HTPBE_IDENTIFIER_MISMATCH)
  • Extra content appended past the document’s normal end marker (HTPBE_TRAILING_BYTES_AFTER_EOF)
  • Internal reference to a structure that’s missing from the file (HTPBE_DANGLING_STRUCTURE_REFERENCE)
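
For teams that want to triage findings by domain, the catalog above can be mirrored in a small lookup. The sketch below records the per-domain check counts listed on this page (they sum to 61) and groups triggered markers by domain. The `MARKER_DOMAIN` table is intentionally partial, and the names `group_by_domain` and `"Unmapped"` are hypothetical helpers, not part of the HTPBE? API.

```python
from collections import defaultdict

# Per-domain check counts as listed in the catalog above.
DOMAIN_CHECK_COUNTS = {
    "Metadata & Timestamps": 8,
    "Digital Signatures": 2,
    "Incremental Update History": 3,
    "Generator Fingerprinting": 18,
    "Document Assembly": 7,
    "Content Stream Analysis": 13,
    "Image Forensics": 5,
    "Structural Integrity": 5,
}

# Partial, illustrative lookup: only a handful of the 61 markers appear here.
MARKER_DOMAIN = {
    "HTPBE_DATES_DISAGREE": "Metadata & Timestamps",
    "HTPBE_POST_SIGNATURE_EDIT": "Digital Signatures",
    "HTPBE_SIGNATURE_REMOVED": "Digital Signatures",
    "HTPBE_MULTIPLE_REVISION_LAYERS": "Incremental Update History",
    "HTPBE_TRAILING_BYTES_AFTER_EOF": "Structural Integrity",
}


def group_by_domain(markers):
    """Group marker names by domain; unknown names land under 'Unmapped'."""
    grouped = defaultdict(list)
    for marker in markers:
        grouped[MARKER_DOMAIN.get(marker, "Unmapped")].append(marker)
    return dict(grouped)
```

Grouping keeps the individual marker names intact, so the analyst still sees each finding separately while the queue can be sorted by domain.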

How it works

Three steps, under 3 seconds

1. Send a PDF URL

POST a publicly accessible URL to /api/v1/analyze. No file upload, no size limit for the API. You get back a check ID immediately.

2. Eight forensic domains analyzed

All 61 checks run in parallel across metadata and timestamps, digital signatures, incremental update history, generator fingerprinting, document assembly, content streams, image forensics, and structural integrity.

3. Verdict + named markers

GET /api/v1/result/{id} for a structured verdict — intact, modified, or inconclusive — with every triggered finding named individually and confidence rated.
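
The two calls in the steps above can be sketched with Python's standard library. Only the paths /api/v1/analyze and /api/v1/result/{id} come from this page. The host `api.example.com`, the `url` field in the request body, and the Bearer-token header are assumptions made for illustration; the API reference is the authority on the real request shape.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.example.com"  # hypothetical host, replace with the real one


def _headers(api_key: str) -> dict:
    # Bearer authentication is an assumption, not confirmed by this page.
    return {"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"}


def build_analyze_request(pdf_url: str, api_key: str) -> urllib.request.Request:
    """Step 1: POST a publicly accessible PDF URL for analysis."""
    body = json.dumps({"url": pdf_url}).encode("utf-8")  # field name assumed
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/analyze", data=body, method="POST", headers=_headers(api_key)
    )


def build_result_request(check_id: str, api_key: str) -> urllib.request.Request:
    """Step 3: GET the structured verdict for a check ID."""
    safe_id = urllib.parse.quote(check_id, safe="")
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/result/{safe_id}", method="GET", headers=_headers(api_key)
    )


def send(request: urllib.request.Request, timeout: float = 10.0) -> dict:
    """Execute a request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Keeping request construction separate from `send` lets the request shape be checked in tests without touching the network.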

Example API response

What you get back

A bank statement with a removed digital signature and mismatched timestamps returns this. Each marker is a named, actionable forensic finding — not a score.

GET /api/v1/result/{id}
{
  "id": "9f3a2c1d-8b47-4e6f-a012-3d5e7f890123",
  "status": "modified",
  "modification_confidence": "certain",
  "modification_markers": [
    "HTPBE_SIGNATURE_REMOVED",
    "HTPBE_DATES_DISAGREE"
  ],
  "origin": {
    "type": "institutional",
    "software": null
  },
  "creator": "Adobe Acrobat Pro DC",
  "has_incremental_updates": true,
  "update_chain_length": 3,
  "xref_count": 4,
  "page_count": 12
}

Full schema and all marker descriptions in the API reference →
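
A response like the example above can be loaded into a small typed structure before it enters a review workflow. This is a minimal sketch based only on the fields shown in the example; the class and function names are hypothetical, and treating `modification_confidence` and `modification_markers` as optional is an assumption, since this page does not show an intact or inconclusive response.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

VALID_STATUSES = {"intact", "modified", "inconclusive"}


@dataclass(frozen=True)
class CheckResult:
    check_id: str
    status: str
    confidence: Optional[str]
    markers: Tuple[str, ...]
    has_incremental_updates: bool


def parse_result(payload: dict) -> CheckResult:
    """Validate the verdict and keep the fields a reviewer needs."""
    status = payload["status"]
    if status not in VALID_STATUSES:
        raise ValueError(f"unexpected status: {status!r}")
    return CheckResult(
        check_id=payload["id"],
        status=status,
        # Assumed to be absent or null when nothing was flagged.
        confidence=payload.get("modification_confidence"),
        markers=tuple(payload.get("modification_markers") or ()),
        has_incremental_updates=bool(payload.get("has_incremental_updates", False)),
    )
```

Rejecting unknown status values early means a schema change surfaces as an error instead of silently passing a document through.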

Close the structural gap your KYC stack leaves open

Start with welcome credits, no card required. Point the API at a PDF URL and read the verdict in seconds.

Verdicts

Three possible outcomes

intact

No modification detected

No structural evidence of post-creation changes. The file structure matches what the generating software would produce without intervention.

Intact means no structural evidence of modification was found. It does not guarantee the content is truthful: a document fabricated from scratch with false data still returns intact.

inconclusive

Origin can’t be confirmed — escalate

The document was created in consumer-grade software (Microsoft Word, LibreOffice, a free online converter) that does not leave the institutional markers that genuinely issued documents always contain.

For fraud teams: real banks and payroll engines generate documents programmatically. Inconclusive on a bank statement or payslip means the file was assembled outside an institutional system.

modified

Forensic evidence of tampering

One or more structural markers confirm post-creation modification. Confidence is certain for signature findings and date contradictions, and high for all other markers.
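
One way to wire the three verdicts into a review queue is a small routing function. Only the "escalate" handling of inconclusive comes from the text above. The other action names and the choice to split modified results by confidence are hypothetical policy decisions that each team would set for itself.

```python
from typing import Optional


def next_action(status: str, confidence: Optional[str] = None) -> str:
    """Map a verdict to a workflow step (action names are illustrative)."""
    if status == "intact":
        # Intact says nothing about truthfulness, so content review continues.
        return "verify_content"
    if status == "inconclusive":
        return "escalate"
    if status == "modified":
        # Assumed policy: auto-reject only at the strongest confidence level.
        return "reject" if confidence == "certain" else "manual_review"
    raise ValueError(f"unexpected status: {status!r}")
```

For the example response shown earlier (status modified, confidence certain), this policy returns "reject".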

Secure your workflow

Create your account — API key on signup, free test environment on every plan.
From $15/mo. No sales call. Cancel any time.