HTPBE?

Structural PDF tamper detection API. Catches edits your KYC stack misses.

© 2024–2026 TMI Iurii Rogulia · VAT ID: FI29845875 · Made in Finland 🇫🇮

Status

Algorithm v2.23.1

Tool profile

PyPDF

PyPDF appears on both legitimate first-generation output and downstream re-save flows — context (the other tool on the same document) is what flips the signal.

Forensic verdict

Mixed signal

Based on 84 appearances across the HTPBE? corpus.

  • Modification rate: 4% (44pp below the corpus baseline of 48%)
  • Total appearances: 84 (0.40% of corpus)
  • Role split: 0% Creator / 100% Producer (Creator vs Producer share of appearances)
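
The gap to the baseline is a difference in percentage points (pp), not a relative change. A minimal sketch of that arithmetic, using the figures reported on this page; the function name `pp_delta` is a hypothetical helper, not part of any HTPBE? API:

```python
def pp_delta(rate_pct: float, baseline_pct: float) -> float:
    """Difference between two percentages, expressed in percentage points."""
    return rate_pct - baseline_pct


# Figures from this profile: 4% modification rate vs. a 48% corpus baseline.
delta = pp_delta(4, 48)
print(f"{delta:+.0f}pp")  # -44pp
```

A negative value means the tool's documents are flagged as modified less often than the corpus as a whole.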

Corpus profile

How PyPDF shows up in the HTPBE? corpus

PyPDF is a Python PDF library (pypdf / PyPDF2 family) widely used in scripting and back-office automation to merge, split, encrypt, and extract pages from existing PDFs.

PyPDF rarely creates documents from scratch — it almost always re-emits an existing PDF after a programmatic operation. When PyPDF is the latest Producer on a document whose Creator was institutional, the producer/creator mismatch indicates programmatic post-processing.
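
One way to picture the producer/creator mismatch check is the sketch below. It is an illustration only, not the HTPBE? algorithm: the function name, the prefix-based family match, and the decision to ignore documents with an empty Creator are all assumptions made for this example.

```python
import re
from typing import Optional

# Assumption: any metadata string starting with "pypdf" (case-insensitive),
# such as "PyPDF2" or "pypdf", is treated as part of the PyPDF family.
_PYPDF_FAMILY = re.compile(r"^\s*pypdf", re.IGNORECASE)


def is_pypdf_resave(creator: Optional[str], producer: Optional[str]) -> bool:
    """Return True when a PyPDF-family Producer sits on top of a different Creator."""
    if not producer or not _PYPDF_FAMILY.match(producer):
        return False
    # With no Creator recorded there is no mismatch to reason about.
    if not creator or not creator.strip():
        return False
    return not _PYPDF_FAMILY.match(creator)


print(is_pypdf_resave("Quadient", "PyPDF2"))  # True
print(is_pypdf_resave(None, "PyPDF2"))        # False
```

A `True` result only says the file passed through a script after it was authored. Whether that step was benign (a merge or split) or an edit still depends on the rest of the document.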


Role in the workflow

How PyPDF shows up in metadata

Every PDF carries a Creator (the application that produced the original document) and a Producer (the engine that wrote the PDF). The same tool can appear in either slot, with very different modification profiles.

As Creator (0% of appearances)
  • Usage: 0
  • Modification rate: 0%
As Producer (100% of appearances)
  • Usage: 84
  • Modification rate: 4%
  • Avg file size: 810 KB

How to read this

The Creator slot typically reflects where a document started life. The Producer slot reflects whatever wrote the bytes — and is the field that gets overwritten when a PDF is opened, edited, and saved by a downstream tool.

A higher modification rate as Producer than as Creator usually means the tool is acting as a re-saver on documents that originated elsewhere. A higher rate as Creator points to fragile workflows around the original authoring app.
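
A per-role breakdown like the one above can be sketched as a simple tally over document records. The `Doc` record, its `modified` flag, and the sample rows are hypothetical and exist only to make the example runnable; they are not corpus data.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    creator: str
    producer: str
    modified: bool  # hypothetical verdict flag for this sketch


def role_stats(docs, tool):
    """Usage count and modification rate for `tool` in each metadata slot."""
    stats = {}
    for role in ("creator", "producer"):
        hits = [d for d in docs if getattr(d, role) == tool]
        modified = sum(d.modified for d in hits)
        # No appearances means the rate is undefined, so report None.
        rate = modified / len(hits) if hits else None
        stats[role] = {"usage": len(hits), "mod_rate": rate}
    return stats


# Toy records, not real corpus rows.
docs = [
    Doc("Quadient", "PyPDF", True),
    Doc("wkhtmltopdf", "PyPDF", False),
]
print(role_stats(docs, "PyPDF"))
```

The sketch returns `None` rather than 0% for a slot with no appearances, which keeps "never seen in this role" distinct from "seen and never modified".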

Name fingerprints

Also goes by

Different version strings and spellings observed for PyPDF in the wild. All are merged into the same canonical profile.

  • PyPDF2: 88.1%
  • pypdf: 10.7%
  • PyPDF3: 1.2%

Why variants matter

The same tool publishes itself under 3 different metadata strings — version bumps, locale tags, build IDs. We canonicalize them so the corpus reflects one identity, not noise.
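
A minimal sketch of this kind of canonicalization, assuming a simple rule that folds "pypdf" plus an optional trailing digit run into one label. The regex and the `canonical_tool` name are illustrative assumptions, not the rule HTPBE? actually applies.

```python
import re
from collections import Counter

# Assumption: "pypdf" followed by optional digits, in any letter case,
# is one tool family (covers PyPDF2, pypdf, PyPDF3).
_PYPDF_VARIANT = re.compile(r"^pypdf\d*$", re.IGNORECASE)


def canonical_tool(name: str) -> str:
    """Map known PyPDF spellings to one canonical label; pass others through."""
    cleaned = name.strip()
    return "PyPDF" if _PYPDF_VARIANT.match(cleaned) else cleaned


observed = ["PyPDF2", "pypdf", "PyPDF3", "wkhtmltopdf"]
counts = Counter(canonical_tool(n) for n in observed)
print(counts["PyPDF"], counts["wkhtmltopdf"])  # 3 1
```

Anchoring the pattern at both ends keeps unrelated strings that merely contain "pypdf" from being merged into the profile.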

  • Most common: PyPDF2 (88.1% of appearances)
  • Variant spread: 3 distinct strings (long-tail share: 11.9%)
  • Observed range: 28 Jan 2026 → 15 Apr 2026

Distributions

What ships alongside PyPDF

The PDF versions PyPDF writes when acting as Producer, and the other tools that appear in the same documents.

PDF versions written

Most output is PDF 1.4 (96% of files where PyPDF is the Producer).

  • PDF 1.4: 96.2%
  • PDF 1.7: 3.8%

Common Creators when PyPDF is the Producer

Quadient sits upstream in 50% of cases — read this row as “what kinds of documents end up routed through PyPDF.”

  • Quadient: 50.0%
  • wkhtmltopdf: 50.0%

Related profiles

Tools you’ll see next to PyPDF

Other tools that frequently share metadata with PyPDF in the same documents. Each card links to its own forensic profile.

Quadient (Creator, 2% co-occurrence)
  • Appearances: 66
  • Mod rate: 76%
wkhtmltopdf (Creator, 2% co-occurrence)
  • Appearances: 368
  • Mod rate: 6%

Long tail

Notable observations

Smaller cuts of the PyPDF corpus — useful context, but treat each row as a single data point rather than a strong signal.

  • Embedded JavaScript: 1 (files containing JavaScript code)
  • Pages parsed: 154
  • Oldest observed: 28 Jan 2026 (4 months ago)
