REST API

PDF Authenticity & Tamper Detection API

Detect tampered bank statements, invoices, and financial documents before they enter your pipeline. One API call, structural verdict in under 3 seconds.

No credit card needed. Test environment is free on every plan, unlimited calls. Live calls from $15/mo when you're ready.

59 forensic checks per document · Results in under 10 seconds · No original document needed

The detection gap

What this catches that KYC doesn’t

An applicant submits a bank statement. Your KYC platform confirms the account exists and the identity matches. But the balance has been edited — $2,400 inflated to $24,000. The edit is invisible to visual review and passes every template check.

HTPBE? checks the PDF file structure, not the content. The modification shows up in the xref table, the incremental update chain, and the producer field — regardless of what the document looks like on screen.

One API call returns a structured verdict and named markers. Designed to slot into existing intake pipelines alongside Plaid, Persona, and Alloy — not instead of them.

Example response — modified verdict

json
{
  "status": "modified",
  "modification_confidence": "high",
  "modification_markers": [
    "HTPBE_MULTIPLE_REVISION_LAYERS",
    "HTPBE_DATES_DISAGREE"
  ],
  "has_incremental_updates": true,
  "update_chain_length": 3,
  "xref_count": 4,
  "creator": "Microsoft Excel",
  "producer": "Adobe PDF Library 15.0"
}

Full response includes 20+ fields. See complete schema on GitHub →
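A response like the one above can be turned into a routing decision in a few lines. The following is a minimal Python sketch, not an official client: the `triage` function and its action names (`accept`, `reject`, `manual_review`) are assumptions made for illustration. Only the field names and the three verdict values come from the documented response.

```python
import json

# Example payload trimmed from the documented "modified" response.
EXAMPLE = json.loads("""
{
  "status": "modified",
  "modification_confidence": "high",
  "modification_markers": [
    "HTPBE_MULTIPLE_REVISION_LAYERS",
    "HTPBE_DATES_DISAGREE"
  ],
  "has_incremental_updates": true,
  "update_chain_length": 3
}
""")


def triage(result: dict) -> dict:
    """Map a verdict to a hypothetical pipeline action (names are illustrative)."""
    actions = {
        "intact": "accept",
        "modified": "reject",
        "inconclusive": "manual_review",
    }
    status = result.get("status")
    return {
        # Unknown or missing verdicts fall back to manual review, never to accept.
        "action": actions.get(status, "manual_review"),
        "markers": result.get("modification_markers", []),
    }


decision = triage(EXAMPLE)
print(decision["action"], decision["markers"])
```

Falling back to manual review for anything unrecognized keeps a malformed or future response from being treated as a pass.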

How to integrate

From sign-up to verdict in three steps

1

Submit a PDF URL

POST a publicly accessible URL to /v1/analyze. S3 presigned, GCS, Azure, Dropbox, or your own CDN — any HTTPS URL up to 10 MB works. You get back a check ID immediately.

2

We run 59 forensic layers

Metadata, structure, digital signatures, generator fingerprinting, document assembly, content streams, image forensics, and structural integrity — all run in parallel against the binary, not the rendering.

3

Retrieve the verdict

GET /v1/result/{id} for a structured verdict — intact, modified, or inconclusive — with each marker named and confidence rated.
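The submit-then-retrieve flow described in these steps could be sketched in Python using only the standard library. This is an illustrative outline rather than an official SDK: the helper names (`build_analyze_request`, `build_result_request`, `check_pdf`) and the 60-second client timeout are assumptions. The URLs, headers, and the `id` field follow the documentation.

```python
import json
import urllib.request

BASE_URL = "https://api.htpbe.tech/v1"


def build_analyze_request(pdf_url: str, api_key: str) -> urllib.request.Request:
    """Step 1: POST /analyze with the PDF URL in a JSON body."""
    return urllib.request.Request(
        f"{BASE_URL}/analyze",
        data=json.dumps({"url": pdf_url}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def build_result_request(check_id: str, api_key: str) -> urllib.request.Request:
    """Step 3: GET /result/{id} for the structured verdict."""
    return urllib.request.Request(
        f"{BASE_URL}/result/{check_id}",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def check_pdf(pdf_url: str, api_key: str, timeout: float = 60.0) -> dict:
    """Submit a PDF and return the parsed verdict (performs two network calls)."""
    submit = build_analyze_request(pdf_url, api_key)
    with urllib.request.urlopen(submit, timeout=timeout) as resp:
        check_id = json.load(resp)["id"]
    fetch = build_result_request(check_id, api_key)
    with urllib.request.urlopen(fetch, timeout=timeout) as resp:
        return json.load(resp)
```

With a valid key, `check_pdf("https://example.com/document.pdf", "YOUR_API_KEY")["status"]` would yield one of `intact`, `modified`, or `inconclusive`. Splitting request construction from the network call makes the request shape easy to inspect without sending anything.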

Documentation

Read it, or import it

Three endpoints with field-by-field docs on the left. Postman, Scalar, and OpenAPI bundles on the right when you’d rather skip the prose and start calling.

Quick reference

Three endpoints, one auth header

Base URL https://api.htpbe.tech/v1. All requests authenticated with Authorization: Bearer YOUR_API_KEY.

Method | Endpoint | Description
POST | /v1/analyze | Analyze a PDF from URL for modifications
GET | /v1/result/{id} | Retrieve a previously completed check
GET | /v1/checks | List all checks with filters and pagination

Monthly quota depends on your plan — see pricing →

Quick start

First call in 30 seconds

Replace YOUR_API_KEY with the key from your dashboard. Test keys (htpbe_test_...) work the same way and return deterministic synthetic results.

bash
# Basic usage - analyze any publicly accessible PDF
curl -X POST https://api.htpbe.tech/v1/analyze \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com/document.pdf"}'

Endpoint

POST /v1/analyze — submit a PDF

POST https://api.htpbe.tech/v1/analyze

Request headers

http
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json

Your API key is issued automatically when you sign up. Both htpbe_live_... (production) and htpbe_test_... (testing) keys are accepted.

Request body

json
{
  "url": "https://example.com/documents/contract.pdf",
  "original_filename": "contract.pdf"
}

url (required): public URL to your PDF. Must be reachable via HTTP/HTTPS.

original_filename (optional): original filename. Useful when the URL contains a generated or hashed filename (e.g. from R2 or S3) — stored and returned in results instead of what we extract from the URL.

Supported sources: AWS S3 (presigned URLs), Google Cloud Storage, Azure Blob, Dropbox shared links, your own CDN, or any publicly accessible URL.

Limits: 10 MB max file size, 30-second download timeout, 20-second analysis timeout. The URL must be reachable without authentication.
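Some of these limits can be checked on the client before a request is spent. The sketch below is one possible pre-flight check in Python: the `preflight` function, the reading of "10 MB" as 10 × 1024 × 1024 bytes, and the 5 seconds of slack on the client timeout are all assumptions, not documented behavior.

```python
from typing import List, Optional
from urllib.parse import urlparse

# Assumes "10 MB" means 10 MiB; the documentation does not say which.
MAX_FILE_BYTES = 10 * 1024 * 1024
DOWNLOAD_TIMEOUT_S = 30
ANALYSIS_TIMEOUT_S = 20
# A client timeout that outlasts the worst documented server-side case, plus slack.
CLIENT_TIMEOUT_S = DOWNLOAD_TIMEOUT_S + ANALYSIS_TIMEOUT_S + 5


def preflight(url: str, size_bytes: Optional[int] = None) -> List[str]:
    """Return the problems found before submitting; an empty list means OK to send."""
    problems = []
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        problems.append(f"unsupported scheme: {parsed.scheme or '(none)'}")
    if not parsed.netloc:
        problems.append("missing host")
    if size_bytes is not None and size_bytes > MAX_FILE_BYTES:
        problems.append(f"file is {size_bytes} bytes, over the 10 MB limit")
    return problems


print(preflight("https://example.com/documents/contract.pdf"))
print(preflight("ftp://example.com/contract.pdf"))
```

A check like this cannot tell whether a URL is reachable without authentication; only the API's own 400 response can confirm that.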

Response (201 Created)

json
{
  "id": "3f9c8b7a-2e1d-4c5f-9b8e-7a6d5c4b3a21"
}

Analysis runs synchronously. The response contains only the check id — call GET /v1/result/{id} immediately after to retrieve the full analysis.

With test keys the ID is a deterministic UUID v4 like 00000000-0000-4000-8000-000000000001 — passes UUID format validation but is obviously synthetic.
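Because test IDs pass the same format validation as live ones, a single check can guard both environments. Here is a small Python sketch of such a check using the standard `uuid` module; the function name `is_valid_check_id` and the decision to reject non-canonical spellings are assumptions for this example.

```python
import uuid


def is_valid_check_id(value: str) -> bool:
    """True when value is an RFC 4122 version-4 UUID in canonical dashed form."""
    try:
        parsed = uuid.UUID(value)
    except (ValueError, AttributeError, TypeError):
        return False
    # uuid.UUID also accepts braces, "urn:uuid:" prefixes, and undashed hex,
    # so compare against the canonical string to reject those spellings.
    return parsed.version == 4 and str(parsed) == value.lower()


print(is_valid_check_id("3f9c8b7a-2e1d-4c5f-9b8e-7a6d5c4b3a21"))
print(is_valid_check_id("00000000-0000-4000-8000-000000000001"))
print(is_valid_check_id("not-a-uuid"))
```

Validating the ID before calling GET /v1/result/{id} avoids spending a request on input that can only produce a 404.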

Two-step usage

bash
# Step 1: Submit for analysis
curl -s -X POST https://api.htpbe.tech/v1/analyze \
  -H "Authorization: Bearer htpbe_live_..." \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com/contract.pdf"}'
# → { "id": "3f9c8b7a-2e1d-4c5f-9b8e-7a6d5c4b3a21" }

# Step 2: Retrieve full result
curl https://api.htpbe.tech/v1/result/3f9c8b7a-2e1d-4c5f-9b8e-7a6d5c4b3a21 \
  -H "Authorization: Bearer htpbe_live_..."
# → { "status": "modified", "origin": { ... }, ... }

Endpoint

GET /v1/result/{id} — retrieve a check

GET https://api.htpbe.tech/v1/result/{id}

Retrieve a previously completed analysis by its check ID. Returns the full analysis including metadata, structure, signatures, and findings. Only checks that belong to your API client are returned.

Path parameter

id (required): check ID returned from POST /v1/analyze (full UUID v4).

Response (200 OK)

json
{
  "id": "506a6b1b-1360-48a2-b389-abb346f85d04",
  "filename": "contract.pdf",
  "check_date": 1736542583,
  "file_size": 245632,
  "algorithm_version": "2.18.2",
  "current_algorithm_version": "2.18.2",
  "status": "modified",
  "origin": { "type": "institutional", "software": null },
  "creation_date": 1704110400,
  "modification_date": 1707840000,
  "creator": "Adobe Acrobat Pro DC",
  "producer": "Adobe PDF Library 15.0",
  "modification_confidence": "certain",
  "date_sequence_valid": true,
  "metadata_completeness_score": 90,
  "xref_count": 4,
  "has_incremental_updates": true,
  "update_chain_length": 3,
  "pdf_version": "1.7",
  "has_digital_signature": false,
  "signature_count": 0,
  "signature_removed": true,
  "modifications_after_signature": false,
  "page_count": 12,
  "object_count": 487,
  "has_javascript": false,
  "has_embedded_files": false,
  "modification_markers": [
    "HTPBE_SIGNATURE_REMOVED",
    "HTPBE_DATES_DISAGREE"
  ]
}

All date fields (check_date, creation_date, modification_date) are Unix timestamps in seconds.

modification_markers: every modification signal detected, ordered strongest-first.

algorithm_version: the algorithm version in use at the time of analysis. Compare it with current_algorithm_version to see whether the algorithm has changed since the check ran.
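Since the date fields are Unix seconds, they need converting before display or comparison. The Python sketch below shows one way to do that and to compare the two version fields. The helper names are hypothetical, and treating a null date as "no metadata present" is an assumption.

```python
from datetime import datetime, timezone

DATE_FIELDS = ("check_date", "creation_date", "modification_date")


def summarize_dates(result: dict) -> dict:
    """Convert the Unix-second date fields to timezone-aware UTC datetimes."""
    out = {}
    for field in DATE_FIELDS:
        value = result.get(field)
        # Assumption: a field may be null when the PDF carries no such metadata.
        out[field] = (
            datetime.fromtimestamp(value, tz=timezone.utc) if value is not None else None
        )
    return out


def algorithm_version_differs(result: dict) -> bool:
    """True when the check ran on a version other than the one reported as current."""
    return result["algorithm_version"] != result["current_algorithm_version"]


# Values copied from the documented 200 OK response.
example = {
    "check_date": 1736542583,
    "creation_date": 1704110400,
    "modification_date": 1707840000,
    "algorithm_version": "2.18.2",
    "current_algorithm_version": "2.18.2",
}
dates = summarize_dates(example)
print(dates["check_date"].isoformat())
print(algorithm_version_differs(example))
```

Passing `tz=timezone.utc` keeps the result independent of the machine's local time zone.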

Error responses

json
// 404 Not Found - Check doesn't exist or belongs to another client
{
  "error": "Check not found or access denied",
  "code": "not_found"
}

// 401 Unauthorized - Invalid API key
{
  "error": "Invalid API key. Please check your credentials.",
  "code": "invalid_api_key"
}

Endpoint

GET /v1/checks — list with filters

GET https://api.htpbe.tech/v1/checks

Paginated list of all your checks with flexible filtering. Use it to build dashboards, export data, or run custom analytics on your own results.

Query parameters (all optional)

limit (1–500, default 100): results per page.

offset (default 0): number of results to skip before the first returned item.

tool: filter by tool name (matches Creator OR Producer).

creator: filter by Creator only.

producer: filter by Producer only.

status (intact/modified/inconclusive): filter by verdict.

from_date / to_date (Unix timestamp): filter by check date.
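These filters combine into a single query string. As a rough Python sketch, a URL builder might look like the following. The function `build_checks_url` is hypothetical, the tool value is only an illustration, and the client-side validation simply mirrors the ranges listed in the parameters.

```python
from datetime import datetime, timezone
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://api.htpbe.tech/v1"
VERDICTS = ("intact", "modified", "inconclusive")


def build_checks_url(
    status: Optional[str] = None,
    tool: Optional[str] = None,
    from_date: Optional[datetime] = None,
    to_date: Optional[datetime] = None,
    limit: int = 100,
    offset: int = 0,
) -> str:
    """Build a GET /checks URL, converting datetimes to Unix seconds."""
    if not 1 <= limit <= 500:
        raise ValueError("limit must be between 1 and 500")
    if status is not None and status not in VERDICTS:
        raise ValueError(f"unknown status: {status}")
    params = {"limit": limit, "offset": offset}
    if status:
        params["status"] = status
    if tool:
        params["tool"] = tool
    if from_date:
        params["from_date"] = int(from_date.timestamp())
    if to_date:
        params["to_date"] = int(to_date.timestamp())
    return f"{BASE_URL}/checks?{urlencode(params)}"


url = build_checks_url(
    status="modified",
    tool="Adobe PDF Library",
    from_date=datetime(2025, 1, 1, tzinfo=timezone.utc),
    limit=200,
)
print(url)
```

Using timezone-aware datetimes avoids an accidental local-time offset in the generated timestamps.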

Response (200 OK)

json
{
  "data": [
    {
      "id": "a3f5c9d2-1360-48a2-b389-abb346f85d04",
      "filename": "invoice-2024-01.pdf",
      "check_date": 1738368000,
      "status": "modified",
      "metadata_completeness_score": 85,
      "creator": "Microsoft Word for Microsoft 365",
      "producer": "Adobe PDF Library 15.0",
      "file_size": 524288,
      "page_count": 5,
      "pdf_version": "1.7",
      "creation_date": 1735689600,
      "modification_date": 1738281600,
      "has_javascript": false,
      "has_digital_signature": true,
      "has_embedded_files": false,
      "has_incremental_updates": true,
      "update_chain_length": 3,
      "object_count": 234
    }
  ],
  "total": 1250,
  "limit": 100,
  "offset": 0,
  "has_more": true
}

Use cases: export raw data, build custom analytics, discover all tools in your traffic, filter only modified PDFs.

Pagination: increase offset by limit on each request and stop when has_more is false.

Example: /v1/checks?status=modified&limit=200
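Walking every page with `has_more` can be written as a small generator. The Python sketch below takes the HTTP call as a parameter so the loop can be read and tried without network access. `iter_checks` and `fake_fetch` are hypothetical names, and the fake data only mimics the documented response shape.

```python
from typing import Callable, Iterator


def iter_checks(fetch_page: Callable[[int, int], dict], limit: int = 100) -> Iterator[dict]:
    """Yield every check, advancing offset by limit until has_more is false."""
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page["data"]
        if not page.get("has_more"):
            break
        offset += limit


# Stand-in for the real HTTP call: serves five fake checks in pages.
FAKE = [{"id": str(i), "status": "modified"} for i in range(5)]


def fake_fetch(limit: int, offset: int) -> dict:
    chunk = FAKE[offset:offset + limit]
    return {
        "data": chunk,
        "total": len(FAKE),
        "limit": limit,
        "offset": offset,
        "has_more": offset + limit < len(FAKE),
    }


ids = [check["id"] for check in iter_checks(fake_fetch, limit=2)]
print(ids)
```

In real use, `fetch_page` would issue GET /v1/checks with the given `limit` and `offset` and return the parsed JSON body.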

Errors

Error response codes

All errors include an error string and a machine-readable code. Some include a details string with extra context. Requests beyond your monthly quota are charged at overage rates — there is no 429 cutoff.

Code | Status | Description
400 | Bad Request | Invalid URL, malformed body, or download failed
401 | Unauthorized | Missing or invalid API key
402 | Payment Required | No active subscription
403 | Forbidden | Deactivated key, or test key used with non-test URL
404 | Not Found | Check ID not found or belongs to a different API key
413 | Payload Too Large | File exceeds 10 MB
422 | Unprocessable Entity | Invalid or corrupted PDF
500 | Internal Server Error | Processing failed
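On the client side, these codes tend to fall into a few groups that call for different handling. The Python sketch below shows one possible grouping. The `HtpbeError` class and the category names are assumptions for this example, and the grouping is a judgment call rather than documented behavior. Only the `error`, `code`, and `details` fields come from the API.

```python
import json

# Hypothetical client-side policy: which party has to act on each status.
FIX_REQUEST = {400, 413, 422}   # the submitted URL or file needs to change
FIX_ACCOUNT = {401, 402, 403}   # key or subscription problem


class HtpbeError(Exception):
    """Wraps an error body ({"error", "code", optional "details"}) with its HTTP status."""

    def __init__(self, status: int, body: dict):
        self.status = status
        self.code = body.get("code")
        self.details = body.get("details")
        super().__init__(f"{status} {self.code}: {body.get('error')}")

    @property
    def category(self) -> str:
        if self.status in FIX_REQUEST:
            return "fix_request"
        if self.status in FIX_ACCOUNT:
            return "fix_account"
        if self.status == 404:
            return "not_found"
        return "server_error" if self.status >= 500 else "unknown"


def parse_error(status: int, raw_body: str) -> HtpbeError:
    """Build an HtpbeError from a status code and a raw JSON error body."""
    return HtpbeError(status, json.loads(raw_body))


err = parse_error(
    401,
    '{"error": "Invalid API key. Please check your credentials.", "code": "invalid_api_key"}',
)
print(err.category, err.code)
```

Branching on `category` rather than on individual status codes keeps the calling code short, for example alerting an operator on `fix_account` and rejecting the upload on `fix_request`.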

Integration examples

Drop-in code in your stack

Curl, JavaScript, Python, Go, PHP, and Ruby — copy, paste, plug in your key.

bash
# curl is preinstalled on macOS and most Linux distributions

# Step 1: Submit PDF for analysis
curl -X POST https://api.htpbe.tech/v1/analyze \
  -H "Authorization: Bearer htpbe_live_..." \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com/document.pdf"}'
# Returns: {"id":"3f9c8b7a-2e1d-4c5f-9b8e-7a6d5c4b3a21"}

# Step 2: Retrieve full results
ID="3f9c8b7a-2e1d-4c5f-9b8e-7a6d5c4b3a21"
curl -s "https://api.htpbe.tech/v1/result/$ID" \
  -H "Authorization: Bearer htpbe_live_..." \
  | jq '.status'

Working with results

Reading, filtering, and rolling up checks

Examples of using the GET endpoints to pull a single result, find every modified PDF in your traffic, and roll a dashboard out of /checks alone.

Get check result by ID

typescript
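// Retrieve a specific check result
// Assumes the key is exported as HTPBE_API_KEY in the environment
const API_KEY = process.env.HTPBE_API_KEY;
const checkId = '506a6b1b-1360-48a2-b389-abb346f85d04';

const response = await fetch(
  `https://api.htpbe.tech/v1/result/${checkId}`,
  {
    headers: {
      'Authorization': `Bearer ${API_KEY}`
    }
  }
);

// Error bodies carry `error` and `code` instead of the result fields
if (!response.ok) {
  const { error, code } = await response.json();
  throw new Error(`HTPBE request failed (${response.status} ${code}): ${error}`);
}

const result = await response.json();

console.log(`File: ${result.filename}`);
console.log(`Status: ${result.status}`);
console.log(`Markers: ${result.modification_markers.join(', ')}`);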

List all modified PDFs

python
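import os

import requests

# Assumes the key is exported as HTPBE_API_KEY in the environment
API_KEY = os.environ['HTPBE_API_KEY']

# Get all modified PDFs for manual review
response = requests.get(
    'https://api.htpbe.tech/v1/checks',
    params={
        'status': 'modified',
        'limit': 100
    },
    headers={'Authorization': f'Bearer {API_KEY}'},
    timeout=30
)
response.raise_for_status()  # surface 4xx/5xx instead of parsing an error body

data = response.json()

print(f"Found {data['total']} modified PDFs")
print(f"\nShowing first {min(5, len(data['data']))} results:")

for check in data['data'][:5]:
    print(f"\n{check['filename']}")
    # Creator and Producer are separate fields and either may be empty
    print(f"  Creator: {check['creator'] or 'Unknown'}")
    print(f"  Producer: {check['producer'] or 'Unknown'}")
    print(f"  Review: https://htpbe.tech/result/{check['id']}")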

Build a dashboard

typescript
// Build a dashboard from /checks alone, no extra endpoints needed
interface CheckSummary {
  filename: string;
  status: 'intact' | 'modified' | 'inconclusive';
  producer: string | null;
}

async function fetchDashboardData(apiKey: string) {
  const headers = { Authorization: `Bearer ${apiKey}` };
  const limit = 500;
  const checks: CheckSummary[] = [];

  // Page through /checks until has_more is false so the metrics cover every check
  for (let offset = 0; ; offset += limit) {
    const res = await fetch(
      `https://api.htpbe.tech/v1/checks?limit=${limit}&offset=${offset}`,
      { headers }
    );
    if (!res.ok) throw new Error(`GET /v1/checks failed with ${res.status}`);
    const page = await res.json();
    checks.push(...page.data);
    if (!page.has_more) break;
  }

  // Calculate metrics from raw data; total and modified now count the same set
  const total = checks.length;
  const modified = checks.filter((c) => c.status === 'modified').length;
  const toolStats = new Map<string, { count: number; modified: number }>();

  checks.forEach((check) => {
    const tool = check.producer || 'Unknown';
    const current = toolStats.get(tool) || { count: 0, modified: 0 };
    toolStats.set(tool, {
      count: current.count + 1,
      modified: current.modified + (check.status === 'modified' ? 1 : 0)
    });
  });

  return {
    overview: {
      total,
      modified,
      modificationRate: total > 0 ? ((modified / total) * 100).toFixed(1) : '0.0'
    },
    recentModified: checks
      .filter((c) => c.status === 'modified')
      .slice(0, 5)
      .map((c) => ({ filename: c.filename, tool: c.producer })),
    toolBreakdown: Array.from(toolStats.entries())
      .map(([name, data]) => ({
        name,
        count: data.count,
        modificationRate: ((data.modified / data.count) * 100).toFixed(1)
      }))
      .sort((a, b) => b.count - a.count)
  };
}

// Assumes the key is exported as HTPBE_API_KEY in the environment
const API_KEY = process.env.HTPBE_API_KEY ?? '';
const dashboardData = await fetchDashboardData(API_KEY);
console.log('Dashboard Data:', JSON.stringify(dashboardData, null, 2));

Real-world patterns

Three production integrations teams ship in a sprint

bank statement check

Block tampered statements before underwriting

One call before the underwriting decision. Modified verdict = decline and surface markers; inconclusive = manual review queue.

javascript
// Check bank statement before approving a loan application
async function checkBankStatement(statementUrl) {
  // Step 1: Submit for analysis
  const { id } = await fetch('https://api.htpbe.tech/v1/analyze', {
    method: 'POST',
    headers: {
      'Authorization': 'Bearer ' + process.env.HTPBE_API_KEY,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({ url: statementUrl })
  }).then(r => r.json());

  // Step 2: Get full result
  const result = await fetch(`https://api.htpbe.tech/v1/result/${id}`, {
    headers: { 'Authorization': 'Bearer ' + process.env.HTPBE_API_KEY }
  }).then(r => r.json());

  if (result.status === 'modified') {
    return {
      approved: false,
      reason: 'Bank statement has been modified — do not process application',
      markers: result.modification_markers
    };
  }

  if (result.status === 'inconclusive') {
    return {
      approved: false,
      reason: 'Cannot determine document integrity — manual review required',
      origin: result.origin
    };
  }

  return { approved: true, reason: 'Bank statement is structurally intact' };
}

bulk processing

Concurrent scan across hundreds of documents

Submit and retrieve in parallel with aiohttp. Aggregate intact / modified / inconclusive counts for a daily intake report.

python
import asyncio
import os

import aiohttp

API_BASE = 'https://api.htpbe.tech/v1'

async def analyze_one(session: aiohttp.ClientSession, url: str) -> dict:
    """Submit one PDF, then fetch its full result"""
    # Step 1: Submit for analysis (json= also sets the Content-Type header)
    async with session.post(f'{API_BASE}/analyze', json={'url': url}) as resp:
        resp.raise_for_status()
        check_id = (await resp.json())['id']

    # Step 2: Retrieve the result
    async with session.get(f'{API_BASE}/result/{check_id}') as resp:
        resp.raise_for_status()
        return await resp.json()

async def analyze_bulk(urls: list[str], api_key: str):
    """Analyze multiple PDFs concurrently"""
    headers = {'Authorization': f'Bearer {api_key}'}

    # Session-level headers are sent with every request
    async with aiohttp.ClientSession(headers=headers) as session:
        results = await asyncio.gather(*(analyze_one(session, url) for url in urls))

    modified_count = sum(1 for r in results if r['status'] == 'modified')
    inconclusive_count = sum(1 for r in results if r['status'] == 'inconclusive')

    return {
        'total': len(results),
        'modified': modified_count,
        'inconclusive': inconclusive_count,
        'intact': len(results) - modified_count - inconclusive_count,
        'details': results
    }

async def main():
    # Process 100 documents in parallel
    urls = [f'https://storage.example.com/doc_{i}.pdf' for i in range(100)]
    summary = await analyze_bulk(urls, os.environ['HTPBE_API_KEY'])
    print(f"Scanned {summary['total']} docs: {summary['modified']} modified, {summary['inconclusive']} inconclusive, {summary['intact']} intact")

# await is only valid inside a coroutine, so run the entry point explicitly
asyncio.run(main())

document management

Auto-check every upload, alert on modified

Run a check inside the upload handler. Persist the verdict on the document row; notify the security team only on modified.

typescript
// Automatic tamper check on upload
async function handleDocumentUpload(file: File) {
  // 1. Upload to your storage
  const fileUrl = await uploadToS3(file);

  // 2. Submit for analysis
  const { id } = await fetch('https://api.htpbe.tech/v1/analyze', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${HTPBE_API_KEY}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({ url: fileUrl })
  }).then(r => r.json());

  // 3. Retrieve full result
  const result = await fetch(`https://api.htpbe.tech/v1/result/${id}`, {
    headers: { 'Authorization': `Bearer ${HTPBE_API_KEY}` }
  }).then(r => r.json());

  // 4. Store the verdict on the document row
  await db.documents.create({
    filename: file.name,
    url: fileUrl,
    status: result.status, // 'intact' | 'modified' | 'inconclusive'
    uploaded_at: new Date()
  });

  // 5. Alert if modified
  if (result.status === 'modified') {
    await notifySecurityTeam({
      document: file.name,
      findings: result.modification_markers
    });
  }

  return result;
}

Enterprise — on-premise

Deploy the analyzer where the documents live

Designed for banks, healthcare providers, government agencies, and legal firms with strict data privacy requirements (GDPR, HIPAA, PCI DSS, SOX). Documents never leave your network — the analyzer runs as a single Docker container or in your existing Kubernetes cluster.

Custom pricing based on volume and integrations, with a dedicated account manager, priority support (1-hour response time guaranteed), and regular security patches.

View full on-premise deployment documentation →

100% data privacy: Documents analyzed entirely within your infrastructure. No files, metadata, or results ever leave your network.

Compliance ready: GDPR, HIPAA, PCI DSS, and SOX compliant by design. Your legal and security team approves the deployment.

Easy deployment: Single Docker container or Kubernetes. Production-ready in under 30 minutes. No file size limits.

Custom development: Specific integrations, custom webhook logic, or modifications to match your business processes, built to fit.

LLM-friendly documentation

For AI assistants integrating with HTPBE?, the API is mirrored in a machine-readable format optimized for language models.

View llms.txt

Ready to integrate?

API key issued on signup. Test keys free on every plan.
Live calls from $15/mo — no sales call, cancel any time.