PDF Authenticity & Tamper Detection API
Detect tampered bank statements, invoices, and financial documents before they enter your pipeline.
Quick Start Example
# Basic usage - analyze any publicly accessible PDF
curl -X POST https://api.htpbe.tech/v1/analyze \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com/document.pdf"}'
Key issued on signup. Test environment free on every plan. Live calls from $15/mo.
What this catches that KYC doesn’t
KYC platforms confirm that a document looks real. HTPBE? detects whether the specific PDF file was modified after it was generated.
An applicant submits a bank statement. Your KYC platform confirms the account exists and the identity matches. But the balance has been edited — $2,400 inflated to $24,000. The edit is invisible to visual review and passes all template checks.
HTPBE? checks the PDF file structure, not the content. The modification shows up in the xref table, the incremental update chain, and the producer field — regardless of what the document looks like on screen.
- One API call — result in under 10 seconds
- No original document needed — standalone structural analysis
- Works alongside Plaid, Persona, and Alloy — not instead of them
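To make the idea of structural analysis concrete, the sketch below counts the `%%EOF` and `startxref` markers in a PDF's raw bytes. Each incremental save appends a new cross-reference section and end-of-file marker, so more than one of each hints that the file was updated after it was first written. This only illustrates the kind of signal involved and is not HTPBE's actual algorithm; the function name and the returned keys are hypothetical, and legitimate files (for example linearized PDFs) can also contain more than one marker.

```python
def count_update_markers(pdf_bytes: bytes) -> dict:
    """Count end-of-file and startxref markers in raw PDF bytes."""
    eof_count = pdf_bytes.count(b"%%EOF")
    startxref_count = pdf_bytes.count(b"startxref")
    return {
        "eof_count": eof_count,
        "startxref_count": startxref_count,
        # Heuristic only: more than one %%EOF suggests an appended update.
        "possibly_updated": eof_count > 1,
    }


# Synthetic byte strings standing in for real files (not valid PDFs).
original = b"%PDF-1.7\n...objects...\nxref\n...\nstartxref\n120\n%%EOF\n"
updated = original + b"...new objects...\nxref\n...\nstartxref\n480\n%%EOF\n"

print(count_update_markers(original))
print(count_update_markers(updated))
```

A check like this only says that something was appended. It does not say what changed or whether the change was legitimate.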
Example Response — Modified Verdict
{
  "status": "modified",
  "modification_confidence": "high",
  "modification_markers": [
    "Multiple xref tables detected",
    "Incremental update chain length: 3",
    "Creation and modification dates differ"
  ],
  "has_incremental_updates": true,
  "xref_count": 3,
  "creator": "Microsoft Excel",
  "producer": "Adobe PDF Library 15.0"
}
The full response includes 20+ fields. See the complete schema on GitHub.
API Documentation
A simple REST API with comprehensive PDF analysis capabilities
Complete API Reference on GitHub
For field-by-field documentation, including all possible values, error codes, and full examples, see the complete API reference on GitHub.
Quick Reference
| Method | Endpoint | Description |
|---|---|---|
| POST | /api/v1/analyze | Analyze PDF from URL for modifications |
| GET | /api/v1/result/{id} | Retrieve previously completed check by ID |
| GET | /api/v1/checks | List all checks with filtering and pagination |
Base URL: https://api.htpbe.tech/v1
Authentication: All endpoints require Authorization: Bearer YOUR_API_KEY
Monthly Quota: Depends on your plan — see Pricing for details.
Analyze PDF Document
POST https://api.htpbe.tech/v1/analyze
Request Headers
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json
Your API key is issued automatically when you sign up. Both htpbe_live_... (production) and htpbe_test_... (testing) keys are supported.
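Because the key prefix identifies the environment, a small guard can stop a test key from being deployed to production by mistake. The sketch below is one way to do that; the function name is hypothetical, and it relies only on the two prefixes named above.

```python
def key_environment(api_key: str) -> str:
    """Return 'live' or 'test' based on the documented key prefixes."""
    if api_key.startswith("htpbe_live_"):
        return "live"
    if api_key.startswith("htpbe_test_"):
        return "test"
    raise ValueError("Unrecognized API key prefix")


# Placeholder keys for illustration only.
print(key_environment("htpbe_test_example"))
print(key_environment("htpbe_live_example"))
```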
Request Body
{
  "url": "https://example.com/documents/contract.pdf",
  "original_filename": "contract.pdf"
}
url (required): Public URL to your PDF file. Must be accessible via HTTP/HTTPS.
original_filename (optional): Original filename of the document. Useful when the URL contains a generated or hashed filename (e.g. from Vercel Blob or S3). When provided, this name is stored and returned in results instead of the filename extracted from the URL.
Supported sources: AWS S3 (presigned URLs), Google Cloud Storage, Azure Blob, Dropbox shared links, your own CDN, or any publicly accessible URL.
Limitations: Max 10 MB file size, 30-second download timeout, 20-second analysis timeout. URL must be publicly accessible without authentication.
Response (201 Created)
{
  "id": "3f9c8b7a-2e1d-4c5f-9b8e-7a6d5c4b3a21"
}
Analysis is performed synchronously. The response contains only the check id. Call GET /api/v1/result/{id} immediately afterwards to retrieve the full analysis.
With test keys, the ID is a deterministic UUID v4 such as 00000000-0000-4000-8000-000000000001. It passes UUID format validation but is obviously synthetic.
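The point about format validation can be checked with Python's standard `uuid` module. The sketch below (the function name is hypothetical) shows that both the example live ID and the example test ID parse as version 4 UUIDs, so a format check alone will not tell test results apart from live ones.

```python
import uuid


def is_uuid_v4(value: str) -> bool:
    """Return True if the string parses as an RFC 4122 version 4 UUID."""
    try:
        return uuid.UUID(value).version == 4
    except ValueError:
        return False


print(is_uuid_v4("3f9c8b7a-2e1d-4c5f-9b8e-7a6d5c4b3a21"))  # example live ID
print(is_uuid_v4("00000000-0000-4000-8000-000000000001"))  # example test ID
print(is_uuid_v4("not-a-uuid"))
```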
Two-Step Usage
# Step 1: Submit for analysis
curl -s -X POST https://api.htpbe.tech/v1/analyze \
  -H "Authorization: Bearer htpbe_live_..." \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com/contract.pdf"}'
# → { "id": "3f9c8b7a-2e1d-4c5f-9b8e-7a6d5c4b3a21" }
# Step 2: Retrieve full result
curl https://api.htpbe.tech/v1/result/3f9c8b7a-2e1d-4c5f-9b8e-7a6d5c4b3a21 \
  -H "Authorization: Bearer htpbe_live_..."
# → { "status": "modified", "origin": { ... }, ... }
Retrieve Check Result
GET https://api.htpbe.tech/v1/result/{id}
Description
Retrieve a previously completed PDF analysis by its unique check ID. Returns the full analysis data including metadata, structure, signatures, and findings. Only returns checks that belong to your API client.
Request Headers
Authorization: Bearer YOUR_API_KEY
Path Parameters
id (required): Check ID returned from POST /api/v1/analyze (full UUID v4)
Response (200 OK)
{
  "id": "506a6b1b-1360-48a2-b389-abb346f85d04",
  "filename": "contract.pdf",
  "check_date": 1736542583,
  "file_size": 245632,
  "algorithm_version": "2.2.1",
  "current_algorithm_version": "2.2.1",
  "status": "modified",
  "origin": { "type": "institutional", "software": null },
  "creation_date": 1704110400,
  "modification_date": 1707840000,
  "creator": "Adobe Acrobat Pro DC",
  "producer": "Adobe PDF Library 15.0",
  "modification_confidence": "certain",
  "date_sequence_valid": true,
  "metadata_completeness_score": 90,
  "xref_count": 2,
  "has_incremental_updates": true,
  "update_chain_length": 3,
  "pdf_version": "1.7",
  "has_digital_signature": false,
  "signature_count": 0,
  "signature_removed": true,
  "modifications_after_signature": false,
  "page_count": 12,
  "object_count": 487,
  "has_javascript": false,
  "has_embedded_files": false,
  "modification_markers": [
    "Digital signature was removed",
    "Different creation and modification dates"
  ]
}
All date fields (check_date, creation_date, modification_date) are Unix timestamps (seconds since epoch).
modification_markers: All modification signals detected, ordered strongest-first
algorithm_version: The version of the algorithm in use at the time of analysis. The current version (current_algorithm_version) may differ.
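Since the dates arrive as Unix timestamps, they usually need converting before display or comparison. The sketch below converts the creation and modification dates to UTC and checks whether a result was produced by the current algorithm version. The helper names and the shape of the returned summary are hypothetical, and the sketch assumes both date fields are present and non-null.

```python
from datetime import datetime, timezone


def to_utc(timestamp: int) -> datetime:
    """Convert a Unix timestamp (seconds) to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


def summarize_dates(result: dict) -> dict:
    """Summarize the date and version fields of a check result."""
    created = to_utc(result["creation_date"])
    modified = to_utc(result["modification_date"])
    return {
        "created": created.isoformat(),
        "modified": modified.isoformat(),
        "days_between": (modified - created).days,
        "analyzed_with_current_algorithm": (
            result["algorithm_version"] == result["current_algorithm_version"]
        ),
    }


# Values copied from the example response.
sample = {
    "creation_date": 1704110400,
    "modification_date": 1707840000,
    "algorithm_version": "2.2.1",
    "current_algorithm_version": "2.2.1",
}
print(summarize_dates(sample))
```

If the two version fields differ, re-submitting the document is one way to obtain a verdict from the newer algorithm.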
Error Responses
// 404 Not Found - Check doesn't exist or belongs to another client
{
  "error": "Check not found or access denied",
  "code": "not_found"
}
// 401 Unauthorized - Invalid API key
{
  "error": "Invalid API key. Please check your credentials.",
  "code": "invalid_api_key"
}
List All Checks
GET https://api.htpbe.tech/v1/checks
Description
Retrieve a paginated list of all your PDF check results with flexible filtering options. This endpoint provides raw data access for custom analytics, exports, and advanced reporting. Use it to build dashboards, export data, or perform custom analysis on your PDF checks.
Request Headers
Authorization: Bearer YOUR_API_KEY
Query Parameters (All Optional)
limit (1-500, default: 100): Number of results per page
offset (default: 0): Number of results to skip for pagination
tool: Filter by tool name (matches Creator OR Producer)
creator: Filter by Creator tool only
producer: Filter by Producer tool only
status (intact/modified/inconclusive): Filter by verdict
from_date / to_date (Unix timestamp): Filter by check date (when analysis was performed)
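The query parameters combine freely, so it can help to build the URL in one place and validate the documented ranges before sending. The sketch below uses only the standard library; the function name is hypothetical, and it performs no network call.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.htpbe.tech/v1"
VALID_STATUSES = {"intact", "modified", "inconclusive"}


def build_checks_url(limit: int = 100, offset: int = 0, **filters) -> str:
    """Build a /checks URL, enforcing the documented limit and status values."""
    if not 1 <= limit <= 500:
        raise ValueError("limit must be between 1 and 500")
    if offset < 0:
        raise ValueError("offset must be zero or greater")
    status = filters.get("status")
    if status is not None and status not in VALID_STATUSES:
        raise ValueError(f"status must be one of {sorted(VALID_STATUSES)}")
    params = {"limit": limit, "offset": offset}
    # Skip filters explicitly passed as None.
    params.update({k: v for k, v in filters.items() if v is not None})
    return f"{BASE_URL}/checks?{urlencode(params)}"


print(build_checks_url(limit=200, status="modified"))
print(build_checks_url(producer="Adobe PDF Library 15.0", from_date=1735689600))
```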
Response (200 OK)
{
  "data": [
    {
      "id": "a3f5c9d2-1360-48a2-b389-abb346f85d04",
      "filename": "invoice-2024-01.pdf",
      "check_date": 1738368000,
      "status": "modified",
      "metadata_completeness_score": 85,
      "creator": "Microsoft Word for Microsoft 365",
      "producer": "Adobe PDF Library 15.0",
      "file_size": 524288,
      "page_count": 5,
      "pdf_version": "1.7",
      "creation_date": 1735689600,
      "modification_date": 1738281600,
      "has_javascript": false,
      "has_digital_signature": true,
      "has_embedded_files": false,
      "has_incremental_updates": true,
      "update_chain_length": 3,
      "object_count": 234
    }
  ],
  "total": 1250,
  "limit": 100,
  "offset": 0,
  "has_more": true
}
Use cases: Export all data, build custom analytics, discover all tools, filter modified PDFs
Pagination: Use has_more to know when to stop
Example: /api/v1/checks?status=modified&limit=200
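The has_more flag lends itself to a simple loop that advances offset by limit until the flag is false. The sketch below keeps the HTTP call out of the loop by accepting a `fetch_page` callable, which is a hypothetical stand-in for a real request to /checks; here a fake in-memory version is used so the example runs without network access.

```python
from typing import Callable, Iterator


def iter_checks(fetch_page: Callable[[int, int], dict], limit: int = 100) -> Iterator[dict]:
    """Yield every check by following has_more across pages."""
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page["data"]
        if not page.get("has_more"):
            break
        offset += limit


def fake_fetch(limit: int, offset: int) -> dict:
    """In-memory stand-in for GET /checks, returning the documented page shape."""
    items = [{"id": i} for i in range(5)]
    return {
        "data": items[offset:offset + limit],
        "total": len(items),
        "limit": limit,
        "offset": offset,
        "has_more": offset + limit < len(items),
    }


ids = [check["id"] for check in iter_checks(fake_fetch, limit=2)]
print(ids)
```

In real use, `fetch_page` would issue the authenticated GET request and return the parsed JSON body.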
Error Responses
All errors include an error string and a machine-readable code. Some errors also include a details string with additional context. Requests beyond your monthly quota are charged at overage rates — there is no 429 cutoff.
| Code | Description |
|---|---|
| 400 | Bad Request — Invalid URL, malformed body, download failed |
| 401 | Unauthorized — Missing or invalid API key |
| 402 | Payment Required — No active subscription |
| 403 | Forbidden — Deactivated key, or test key used with non-test URL |
| 404 | Not Found — Check ID not found or belongs to a different API key |
| 413 | Payload Too Large — File exceeds 10 MB |
| 422 | Unprocessable Entity — Invalid or corrupted PDF |
| 500 | Internal Server Error — Processing failed |
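Client code typically wants to turn these status codes into a next step. The sketch below maps each documented code to a suggested action and formats the error, code, and optional details fields into one message. The action names and the mapping itself are illustrative assumptions and are not part of the API.

```python
# Hypothetical mapping from documented status codes to a suggested client action.
ERROR_ACTIONS = {
    400: "fix_request",
    401: "check_api_key",
    402: "check_subscription",
    403: "check_api_key",
    404: "check_id",
    413: "reject_file",
    422: "reject_file",
    500: "retry_later",
}


def describe_error(status_code: int, body: dict) -> str:
    """Format an error response and append the suggested action."""
    action = ERROR_ACTIONS.get(status_code, "unknown")
    message = f"{status_code} {body.get('code', 'unknown')}: {body.get('error', '')}"
    if body.get("details"):
        message += f" ({body['details']})"
    return f"{message} -> {action}"


print(describe_error(404, {"error": "Check not found or access denied", "code": "not_found"}))
```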
Integration Examples
Get started quickly with these code examples.
# curl is preinstalled on macOS and most Linux distributions
# Step 1: Submit PDF for analysis
curl -X POST https://api.htpbe.tech/v1/analyze \
  -H "Authorization: Bearer htpbe_live_..." \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com/document.pdf"}'
# Returns: {"id":"3f9c8b7a-2e1d-4c5f-9b8e-7a6d5c4b3a21"}
# Step 2: Retrieve full results
ID="3f9c8b7a-2e1d-4c5f-9b8e-7a6d5c4b3a21"
curl -s "https://api.htpbe.tech/v1/result/$ID" \
  -H "Authorization: Bearer htpbe_live_..." \
  | jq '.status'
Retrieving Results and Check History
Examples of using the GET endpoints to retrieve check results and check history
Get Check Result by ID
// Retrieve a specific check result
const checkId = '506a6b1b-1360-48a2-b389-abb346f85d04';
const response = await fetch(
  `https://api.htpbe.tech/v1/result/${checkId}`,
  {
    headers: {
      'Authorization': `Bearer ${API_KEY}`
    }
  }
);
const result = await response.json();
console.log(`File: ${result.filename}`);
console.log(`Status: ${result.status}`);
console.log(`Markers: ${result.modification_markers.join(', ')}`);
Analyze Specific Tool Usage
import os
import requests
# Read the API key from the environment
API_KEY = os.environ['HTPBE_API_KEY']
# Get all modified PDFs for manual review
response = requests.get(
    'https://api.htpbe.tech/v1/checks',
    params={
        'status': 'modified',
        'limit': 100
    },
    headers={'Authorization': f'Bearer {API_KEY}'}
)
# Stop early on 4xx/5xx responses instead of parsing an error body
response.raise_for_status()
data = response.json()
print(f"Found {data['total']} modified PDFs")
print(f"\nShowing first {len(data['data'])} results:")
for check in data['data'][:5]:
    print(f"\n{check['filename']}")
    print(f"  Tool: {check['creator']} → {check['producer']}")
    print(f"  Review: https://htpbe.tech/result/{check['id']}")
Building a Dashboard
// Build a dashboard from /checks — no extra endpoints needed
async function fetchDashboardData(apiKey: string) {
  const headers = { Authorization: `Bearer ${apiKey}` };
  // Fetch the first page of checks (paginate with offset if has_more is true)
  const checksRes = await fetch(
    'https://api.htpbe.tech/v1/checks?limit=500',
    { headers }
  );
  const { data: checks, total } = await checksRes.json();
  // Calculate metrics from raw data
  const modified = checks.filter((c) => c.status === 'modified').length;
  const toolStats = new Map<string, { count: number; modified: number }>();
  checks.forEach((check) => {
    const tool = check.producer || 'Unknown';
    const current = toolStats.get(tool) || { count: 0, modified: 0 };
    toolStats.set(tool, {
      count: current.count + 1,
      modified: current.modified + (check.status === 'modified' ? 1 : 0)
    });
  });
  return {
    overview: {
      total,
      modified,
      // Rate is computed over the fetched checks, because `modified` only counts those
      modificationRate: checks.length > 0 ? ((modified / checks.length) * 100).toFixed(1) : '0.0'
    },
    recentModified: checks
      .filter((c) => c.status === 'modified')
      .slice(0, 5)
      .map((c) => ({ filename: c.filename, tool: c.producer })),
    toolBreakdown: Array.from(toolStats.entries())
      .map(([name, data]) => ({
        name,
        count: data.count,
        modificationRate: ((data.modified / data.count) * 100).toFixed(1)
      }))
      .sort((a, b) => b.count - a.count)
  };
}
const dashboardData = await fetchDashboardData(API_KEY);
console.log('Dashboard Data:', JSON.stringify(dashboardData, null, 2));
Real-World Use Cases
1. Bank Statement Fraud Detection (Lending)
// Check bank statement before approving a loan application
async function checkBankStatement(statementUrl) {
  // Step 1: Submit for analysis
  const { id } = await fetch('https://api.htpbe.tech/v1/analyze', {
    method: 'POST',
    headers: {
      'Authorization': 'Bearer ' + process.env.HTPBE_API_KEY,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({ url: statementUrl })
  }).then(r => r.json());
  // Step 2: Get full result
  const result = await fetch(`https://api.htpbe.tech/v1/result/${id}`, {
    headers: { 'Authorization': 'Bearer ' + process.env.HTPBE_API_KEY }
  }).then(r => r.json());
  if (result.status === 'modified') {
    return {
      approved: false,
      reason: 'Bank statement has been modified — do not process application',
      markers: result.modification_markers
    };
  }
  if (result.status === 'inconclusive') {
    return {
      approved: false,
      reason: 'Cannot determine document integrity — manual review required',
      origin: result.origin
    };
  }
  return { approved: true, reason: 'Bank statement is structurally intact' };
}
2. Bulk Document Tamper Detection
import asyncio
import os
import aiohttp
async def analyze_bulk(urls: list[str], api_key: str):
    """Analyze multiple PDFs concurrently"""
    headers = {'Authorization': f'Bearer {api_key}'}
    async with aiohttp.ClientSession() as session:
        # Step 1: Submit all PDFs for analysis
        submit_tasks = [
            session.post(
                'https://api.htpbe.tech/v1/analyze',
                headers={**headers, 'Content-Type': 'application/json'},
                json={'url': url}
            )
            for url in urls
        ]
        submit_responses = await asyncio.gather(*submit_tasks)
        ids = [(await r.json())['id'] for r in submit_responses]
        # Step 2: Retrieve all results
        result_tasks = [
            session.get(
                f'https://api.htpbe.tech/v1/result/{check_id}',
                headers=headers
            )
            for check_id in ids
        ]
        result_responses = await asyncio.gather(*result_tasks)
        results = [await r.json() for r in result_responses]
    modified_count = sum(1 for r in results if r['status'] == 'modified')
    inconclusive_count = sum(1 for r in results if r['status'] == 'inconclusive')
    return {
        'total': len(results),
        'modified': modified_count,
        'inconclusive': inconclusive_count,
        'intact': len(results) - modified_count - inconclusive_count,
        'details': results
    }
async def main():
    # Process 100 documents in parallel
    urls = [f'https://storage.example.com/doc_{i}.pdf' for i in range(100)]
    summary = await analyze_bulk(urls, os.environ['HTPBE_API_KEY'])
    print(f"Scanned {summary['total']} docs: {summary['modified']} modified, {summary['inconclusive']} inconclusive, {summary['intact']} intact")
# Top-level await is not valid in a plain script, so run the coroutine explicitly
asyncio.run(main())
3. Document Management System Integration
// Automatic tamper check on upload
async function handleDocumentUpload(file: File) {
  // 1. Upload to your storage
  const fileUrl = await uploadToS3(file);
  // 2. Submit for analysis
  const { id } = await fetch('https://api.htpbe.tech/v1/analyze', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${HTPBE_API_KEY}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({ url: fileUrl })
  }).then(r => r.json());
  // 3. Retrieve full result
  const result = await fetch(`https://api.htpbe.tech/v1/result/${id}`, {
    headers: { 'Authorization': `Bearer ${HTPBE_API_KEY}` }
  }).then(r => r.json());
  // 4. Store in database with detection status
  await db.documents.create({
    filename: file.name,
    url: fileUrl,
    intact: result.status === 'intact',
    uploaded_at: new Date()
  });
  // 5. Alert if modified
  if (result.status === 'modified') {
    await notifySecurityTeam({
      document: file.name,
      findings: result.modification_markers
    });
  }
  return result;
}
LLM-Friendly Documentation
For AI assistants and LLM integration, our API documentation is available in a machine-readable format optimized for language models.
View llms.txt
Ready to integrate?
API key issued on signup. Test keys free on every plan.
Live calls from $15/mo — no sales call, cancel any time.