HTPBE?

Structural PDF tamper detection API. Catches edits your KYC stack misses.

© 2024–2026 TMI Iurii Rogulia · VAT ID: FI29845875

Algorithm v2.14.0

FAQ

Why does my official document show INCONCLUSIVE?

If your document was issued by a real institution — a state register, a court, a bank, a payroll system — and HTPBE? returns INCONCLUSIVE instead of INTACT, that does not mean the document is forged. It means HTPBE? cannot structurally prove that the file was not modified after creation. The document may well be genuine; the verdict reflects the limits of structural analysis, not the document itself.

Why a clean file can still be inconclusive

HTPBE? promises only one thing: to detect post-creation modification of the PDF file. To return INTACT, we need positive evidence that the file is structurally unchanged since the moment it was generated. That evidence does not exist when the document was produced by a tool that anyone can install or use:

  • HTML-to-PDF renderers: wkhtmltopdf, Chrome / Chromium print-to-PDF (Skia/PDF), Headless Chrome, Puppeteer, Playwright, WeasyPrint, Prince. A government registry that serves PDFs through wkhtmltopdf produces output that anyone with the same template and the same wkhtmltopdf build can reproduce with an identical structure.
  • Consumer office software: Microsoft Word, LibreOffice, Pages, Google Docs “Download as PDF”, generic print-to-PDF drivers. The same software that an institution might use is freely available to a forger.
  • Online PDF editors and converters: iLovePDF, Smallpdf, PDF24, Sejda. These services strip original metadata as part of their normal pipeline, so the provenance of any document that passed through them is gone.
  • Scanned images: a PDF that contains only raster pages (photos or scans) and no selectable text. Anyone can print a document, alter the printout, and re-scan it. The scanner has no way to record “this is the original physical paper that left the issuer’s office.”
  • Filled-in PDF forms: Acrobat’s “Fill & Sign”, online form fillers. Filling a form is a legitimate edit, but at the file level it is indistinguishable from a malicious edit, so we do not call the result intact.
  • Unverifiable metadata: the producer or creator field is missing, blank, or stripped, leaving nothing for the engine to compare against known institutional patterns.

In every category the same logic applies: the production tool is public, the output is reproducible, and there is no cryptographic anchor (a digital signature, an issuer’s certificate) that ties the file to a specific source. We refuse to call that file INTACT because doing so would let forgers run the same public tool and inherit a green check.
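The shared logic across these categories can be sketched as a small decision function. This is an illustration only, not the HTPBE? engine: the `Evidence` fields, the `PUBLIC_TOOL_HINTS` list, and the order of the checks are all assumptions made for the example, and real producer strings vary by tool and version.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical substrings for publicly available producers, drawn from the
# categories listed above. Not an official or complete list.
PUBLIC_TOOL_HINTS = (
    "wkhtmltopdf", "skia/pdf", "weasyprint", "prince",
    "microsoft word", "libreoffice",
    "ilovepdf", "smallpdf", "pdf24", "sejda",
)


@dataclass
class Evidence:
    """Inputs to the sketch; every field name here is an assumption."""
    producer: Optional[str] = None            # raw producer/creator string
    modification_found: bool = False          # any structural sign of editing
    has_valid_issuer_signature: bool = False  # cryptographic anchor, validated elsewhere
    raster_only: bool = False                 # scanned pages, no selectable text
    form_filled: bool = False                 # form fields were filled in


def verdict(e: Evidence) -> str:
    # Structural evidence of an edit wins over everything else.
    if e.modification_found:
        return "MODIFIED"
    # Assumption: a validated issuer signature counts as positive evidence.
    if e.has_valid_issuer_signature:
        return "INTACT"
    producer = (e.producer or "").strip().lower()
    reproducible_by_anyone = (
        not producer                                        # unverifiable metadata
        or any(hint in producer for hint in PUBLIC_TOOL_HINTS)
        or e.raster_only
        or e.form_filled
    )
    # A clean file from a public tool is not proof of origin.
    return "INCONCLUSIVE" if reproducible_by_anyone else "INTACT"


print(verdict(Evidence(producer="wkhtmltopdf 0.12.6")))  # INCONCLUSIVE
```

The point of the ordering is that a clean structure never upgrades a file to INTACT on its own; only evidence that a forger could not reproduce with the same public tool does.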

What you should do with an inconclusive verdict

  • Get the document from the source. If it is a state-register extract, download it yourself directly from the registry. If it is a bank statement, log in to the bank and re-download the original. A copy received from a third party is the part of the chain you cannot trust; replacing it with a fresh copy from the issuer eliminates the question.
  • Look for a digital signature. Many institutional PDFs (court filings, EU eIDAS-compliant invoices, tax filings) carry a cryptographic signature from the issuer. If the signature is present and validates, that is far stronger evidence of authenticity than any structural analysis.
  • Use the issuer’s official verification API or portal when one exists. Government registers, eInvoicing networks, and academic credentialing bodies often expose a query interface that returns the canonical record by document number.
  • Check the visible content yourself. Compare names, dates, amounts, registration numbers against your own records or a trusted directory. HTPBE? does not read content — it cannot tell you whether the values printed on the page are the values the issuer originally produced.
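As a first pass on the digital-signature step, a rough presence check can be sketched by scanning the raw bytes for a signature `/ByteRange` entry. This is only a hedged illustration using the Python standard library: it does not validate the signature cryptographically, it can miss unusual encodings, and the function names are invented for the example. A real check should use a PDF signature validator and the issuer's certificate chain.

```python
import re
from typing import List, Optional, Tuple

# A PDF signature dictionary records which byte spans of the file were signed:
# /ByteRange [offset1 length1 offset2 length2]
BYTE_RANGE = re.compile(
    rb"/ByteRange\s*\[\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*\]"
)


def signature_byte_ranges(pdf_bytes: bytes) -> List[Tuple[int, ...]]:
    """Return every /ByteRange found in the raw file (presence only)."""
    return [
        tuple(int(g) for g in m.groups())
        for m in BYTE_RANGE.finditer(pdf_bytes)
    ]


def signature_covers_whole_file(pdf_bytes: bytes) -> Optional[bool]:
    """None if no signature marker is found; otherwise whether the
    furthest-reaching ByteRange ends exactly at the end of the file."""
    ranges = signature_byte_ranges(pdf_bytes)
    if not ranges:
        return None
    covered_end = max(off2 + len2 for _, _, off2, len2 in ranges)
    return covered_end == len(pdf_bytes)


# Usage, with a hypothetical file name:
# with open("statement.pdf", "rb") as fh:
#     print(signature_covers_whole_file(fh.read()))
```

A result of `None` means there is no signature to rely on, so the other steps in this list apply. A result of `False` means bytes exist beyond what the signature covers, which is worth investigating before trusting the file.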

What HTPBE? does well, and where it stops

HTPBE? is built to catch the most common attack: a contractor or counterparty receives a legitimate institutional document, opens it in an editor, changes a number or a name, and forwards the modified file. That category leaves structural fingerprints — multiple xref tables, mismatched generator strings, font-subset divergence, signature-coverage gaps — and HTPBE? is designed to surface them as MODIFIED.
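One of these fingerprints, multiple xref tables, can be approximated with a very rough heuristic: a PDF saved with incremental updates ends each revision with its own `startxref ... %%EOF` trailer, so counting those trailers hints at how many times the file was written. The sketch below is an assumption-laden illustration, not how HTPBE? works internally, and `count_revision_trailers` is a name invented here. Linearized ("fast web view") PDFs and signed PDFs can legitimately contain more than one trailer, so a count above one is a prompt for closer inspection rather than proof of tampering.

```python
import re

# Each saved revision of a PDF normally ends with:
#   startxref
#   <byte offset of the cross-reference section>
#   %%EOF
REVISION_TRAILER = re.compile(rb"startxref\s+\d+\s+%%EOF")


def count_revision_trailers(pdf_bytes: bytes) -> int:
    """Count `startxref <offset> %%EOF` trailers in the raw file bytes."""
    return len(REVISION_TRAILER.findall(pdf_bytes))


# Usage, with a hypothetical file name:
# with open("extract.pdf", "rb") as fh:
#     print(count_revision_trailers(fh.read()))
```

An editor that rewrites the whole file instead of appending to it leaves a single trailer, which is why this signal is only one of several listed above.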

What HTPBE? cannot do is detect documents that were fabricated from scratch. A forger who builds a counterfeit registry extract in wkhtmltopdf produces a structurally clean file, because nothing in it was modified after creation. That is exactly the case we mark INCONCLUSIVE rather than INTACT: structural cleanliness is not authenticity, and we will not pretend otherwise.

For details on the categories themselves, see “Can someone create a fake document from scratch?” and “How HTPBE? determines whether a PDF was modified”.
