  • How it works
  • Why It Matters
  • Statistics
  • Pricing
  • API
HTPBE?

Structural PDF tamper detection API. Catches edits your KYC stack misses.

Product

  • How It Works
  • Why It Matters
  • Use Cases
  • Pricing

Developers

  • API Reference
  • GitHub/docs
  • Changelog v2.23.1

Resources

  • FAQ
  • Blog
  • Comparisons
  • Legal & Imprint

© 2024–2026 TMI Iurii Rogulia · VAT ID: FI29845875 · Made in Finland 🇫🇮

Status

Algorithm v2.23.1

Tool profile

Scanner Xerox

Scanner Xerox appears on both legitimate first-generation output and downstream re-save flows — context (the other tool on the same document) is what flips the signal.

Forensic verdict

Mixed signal

Based on 267 appearances across the HTPBE? corpus.

Modification rate: 2% (46 pp below the corpus baseline of 48%)
Total appearances: 267 (1.3% of corpus)
Role split: 51% Creator / 49% Producer (share of appearances)
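
The headline figures are simple arithmetic, and a short sketch may make the units clearer. It assumes the per-role counts reported further down this profile (136 appearances as Creator, 131 as Producer); the function names are illustrative and not part of any HTPBE? API.

```python
def pp_delta(rate_pct: float, baseline_pct: float) -> float:
    """Difference in percentage points (an absolute gap, not a relative change)."""
    return rate_pct - baseline_pct


def role_split(creator_count: int, producer_count: int) -> tuple:
    """Creator and Producer shares of all appearances, rounded to whole percent."""
    total = creator_count + producer_count
    creator_share = round(100 * creator_count / total)
    producer_share = round(100 * producer_count / total)
    return creator_share, producer_share


print(pp_delta(2, 48))       # -46
print(role_split(136, 131))  # (51, 49)
```

A gap of 46 percentage points is an absolute difference between two rates (2% against 48%), which is why it is not written as a relative percentage.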

Corpus profile

How Scanner Xerox shows up in HTPBE? corpus

Scanner Xerox is one of the PDF-handling tools surfaced in the HTPBE? corpus. Scanner Xerox splits its occurrences between Creator (51%) and Producer (49%) roles, meaning it sometimes originates documents and sometimes re-emits them after another tool created them.

In the HTPBE? corpus the contextual signal we look for is a producer/creator mismatch: when Scanner Xerox appears as the latest Producer on a document whose Creator was an institutional source (e.g. Adobe PDF Library, Microsoft Word, a banking back-end), the document was rebuilt or re-saved after its original creation. That mismatch is the marker — never the tool itself.
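
The mismatch check described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the HTPBE? algorithm: the list of institutional Creator strings and the substring matching are hypothetical.

```python
# Hypothetical examples of "institutional" Creator strings (assumption for illustration).
INSTITUTIONAL_CREATORS = ("adobe pdf library", "microsoft word")


def is_resave_signal(creator: str, producer: str, tool: str = "xerox") -> bool:
    """True when `tool` wrote the file (Producer) but an institutional source created it."""
    c, p = creator.lower(), producer.lower()
    tool_is_producer = tool in p
    tool_is_creator = tool in c
    creator_is_institutional = any(name in c for name in INSTITUTIONAL_CREATORS)
    # The tool alone is never the marker; only the Creator/Producer mismatch is.
    return tool_is_producer and creator_is_institutional and not tool_is_creator


print(is_resave_signal("Microsoft Word 2019", "Xerox EC8056"))  # True
print(is_resave_signal("Xerox EC8056", "Xerox EC8056"))         # False
```

A scanner that is both Creator and Producer returns False here, which matches the idea that first-generation scanner output is not suspicious on its own.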

On documents where Scanner Xerox acts as Creator, 4% carry modification markers; on documents where it acts as Producer, 1% do. These are observed rates inside the HTPBE? corpus and should be read as base rates, not as accusations against Scanner Xerox or its users.

Role in the workflow

How Scanner Xerox shows up in metadata

A PDF's document information dictionary can carry a Creator (the application that produced the original document) and a Producer (the engine that wrote the PDF). Both fields are optional, and the same tool can appear in either slot, with very different modification profiles.
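
To make the two fields concrete, here is a deliberately naive sketch that pulls `/Creator` and `/Producer` literal strings out of raw PDF bytes with a regular expression. It assumes an unencrypted file whose Info dictionary is stored uncompressed as plain literal strings; it does not handle hex strings, UTF-16 text, object streams, or XMP metadata, and the function name is hypothetical.

```python
import re
from typing import Optional


def read_info_field(pdf_bytes: bytes, key: str) -> Optional[str]:
    """Return the last literal-string value found for /key, or None if absent."""
    pattern = re.compile(
        rb"/" + re.escape(key.encode("ascii")) + rb"\s*\(((?:\\.|[^\\()])*)\)"
    )
    matches = pattern.findall(pdf_bytes)
    if not matches:
        return None
    # Take the last match, since later revisions are appended to the end of the file.
    # Escape sequences inside the string are returned as-is, not decoded.
    return matches[-1].decode("latin-1")


sample = (
    b"%PDF-1.4\n1 0 obj\n"
    b"<< /Creator (Microsoft Word) /Producer (Xerox EC8056) >>\nendobj\n"
)
print(read_info_field(sample, "Creator"))   # Microsoft Word
print(read_info_field(sample, "Producer"))  # Xerox EC8056
```

For real files a full PDF parser is the safer choice; the regex only illustrates where the two values live.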

As Creator · 51%
  • Usage: 136
  • Modification rate: 4%
  • Avg file size: 76 KB
As Producer · 49%
  • Usage: 131
  • Modification rate: 1%
  • Avg file size: 68 KB

How to read this

The Creator slot typically reflects where a document started life. The Producer slot reflects whatever wrote the bytes — and is the field that gets overwritten when a PDF is opened, edited, and saved by a downstream tool.

A higher modification rate as Producer than as Creator usually means the tool is acting as a re-saver on documents that originated elsewhere. A higher rate as Creator points to fragile workflows around the original authoring app.
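
The reading rule above can be written down as a small comparison. This is a sketch of the heuristic as stated, with hypothetical labels; with counts as small as those in this profile, a gap of a few points may simply be noise.

```python
def read_role_rates(creator_rate_pct: float, producer_rate_pct: float) -> str:
    """Classify which role carries the higher modification rate."""
    if producer_rate_pct > creator_rate_pct:
        # Tool mostly modifies documents that originated elsewhere.
        return "re-saver pattern"
    if creator_rate_pct > producer_rate_pct:
        # Modifications cluster around documents the tool itself originated.
        return "authoring-workflow pattern"
    return "no role difference"


# Scanner Xerox in this profile: 4% as Creator, 1% as Producer.
print(read_role_rates(4, 1))  # authoring-workflow pattern
```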

Name fingerprints

Also goes by

Different version strings and spellings observed for Scanner Xerox in the wild. All are merged into the same canonical profile.

Xerox EC8056: 93.6%
Xerox WorkCentre 7345: 3.7%
Xerox Print Application: 1.5%
Xerox AltaLink C8155: 1.1%

Why variants matter

The same tool reports itself under 4 different metadata strings. In this profile they are mostly device model names; in others they can be version bumps, locale tags, or build IDs. We canonicalize them so the corpus reflects one identity, not noise.
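
Canonicalization of this kind could look roughly like the sketch below. The single "starts with Xerox" rule and the helper names are assumptions made for illustration; the actual HTPBE? matching rules are not described on this page.

```python
import re
from collections import Counter

# Hypothetical rule: any metadata string starting with "Xerox" maps to one profile.
CANONICAL_RULES = [(re.compile(r"^\s*xerox\b", re.IGNORECASE), "Scanner Xerox")]


def canonicalize(raw: str) -> str:
    """Map a raw Creator/Producer string to its canonical profile name."""
    for pattern, canonical in CANONICAL_RULES:
        if pattern.search(raw):
            return canonical
    return raw.strip()


def variant_shares(raw_names):
    """Percent share of each raw string, rounded to one decimal place."""
    counts = Counter(raw_names)
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


print(canonicalize("Xerox WorkCentre 7345"))  # Scanner Xerox
print(canonicalize("Xerox EC8056"))           # Scanner Xerox
```

Keeping the raw strings alongside the canonical name is what allows a variant breakdown like the one above to be reported at all.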

Most common
Xerox EC8056
93.6% of appearances
Variant spread
4 distinct strings
Long-tail share: 6.4%
Observed range
23 Oct 2025 → 22 Apr 2026

Distributions

What ships alongside Scanner Xerox

The PDF versions Scanner Xerox writes when acting as Producer, and the other tools that appear in the same documents.

PDF versions written

Most output is PDF 1.4 (96% of files where Scanner Xerox is the Producer).

PDF 1.4: 96.2%
PDF 1.3: 3.8%
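
For reference, the PDF version a tool writes is usually visible in the file header (`%PDF-1.4`, for example). The sketch below reads only that header; it is an assumption-laden simplification, since from PDF 1.4 onward a `/Version` entry in the document catalog can override the header, and the function name is hypothetical.

```python
import re
from typing import Optional


def header_version(pdf_bytes: bytes) -> Optional[str]:
    """Return the x.y version from the %PDF-x.y header, or None if not found."""
    # Some files carry a few junk bytes before the header, so search a small window.
    match = re.search(rb"%PDF-(\d\.\d)", pdf_bytes[:1024])
    return match.group(1).decode("ascii") if match else None


print(header_version(b"%PDF-1.4\n%\xe2\xe3\xcf\xd3\n"))  # 1.4
print(header_version(b"not a pdf"))                       # None
```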

Common Producers when Scanner Xerox is the Creator

CrawfordTech writes 3% of these files, making it the most common Producer observed when Scanner Xerox is the Creator.

CrawfordTech: 2.9%
PDF Expert: 0.7%

Related profiles

Tools you’ll see next to Scanner Xerox

Other tools that frequently share metadata with Scanner Xerox in the same documents. Each card links to its own forensic profile.

CrawfordTech (Producer) · 1% co-occurrence
  • Appearances: 67
  • Mod rate: 16%
PDF Expert (Producer) · 0% co-occurrence
  • Appearances: 3
  • Mod rate: 33%

Long tail

Notable observations

Smaller cuts of the Scanner Xerox corpus — useful context, but treat each row as a single data point rather than a strong signal.

Pages parsed
338
Oldest observed
23 Oct 2025 — 7 months ago

Secure your workflow

Create your account — API key on signup, free test environment on every plan.
From $15/mo. No sales call. Cancel any time.

Start free: close the structural fraud gap
See pricing
Read API docs →