PDF Tamper Detection API for Go: Integration Guide

This article is a snapshot — content was accurate as of July 2026 (code examples tested against the API as of June 2026). The product evolves actively; specific counts, examples, and detection rules may have changed since publication — see the changelog for the current state.
PDF fraud is a backend problem. The forged bank statement, the altered invoice, the doctored payslip — none of it reaches a human reviewer untouched. By the time your Go handler has written a row to the database and returned 201, the document’s claims have already propagated into your business logic. The right place to catch the structural-tampering layer is at ingress: before your service trusts the file, not after.
This guide walks through integrating the PDF tamper detection API into a Go service — from the first curl command to an idiomatic client built on net/http, encoding/json, and context, with a typed result struct, error handling that distinguishes retryable failures from permanent ones, and a small bank-statement gate that decides accept / reject / review. The code compiles and runs the real request flow against the documented error codes; adapt and harden it for your own traffic profile and threat model. (If you want the conceptual overview first, start with How to Detect PDF Tampering Programmatically. If you are integrating from Node.js, Python, or PHP instead, see the Node.js, Python, and Laravel / PHP guides.)
TL;DR
- Two API calls, three verdicts: POST /analyze returns a check id, and GET /result/{id} returns the flat verdict object with status being one of intact, modified, or inconclusive.
- Minimum integration is the standard library: net/http plus encoding/json, no third-party dependency.
- Production-grade client: a typed Result struct, context.Context timeouts, a retry loop that backs off on 5xx, 429, and transport errors but never retries other 4xx responses, and a parsed Retry-After.
- A Gate example that maps the three verdicts to an Accept / Reject / Review decision for documents that claim institutional origin.
- This is structural PDF tamper and forgery detection: not KYC, not OCR, not AI-text detection. It complements an identity stack; it does not replace one.
Prerequisites
- Go 1.22+ (log/slog needs 1.21; the method-aware "POST /api/documents" route pattern used in the handler below needs 1.22)
- An HTPBE? API key (Dashboard → copy key)
- No external modules — everything below is standard library
Step 1: Test the API with curl
Before writing any Go, confirm your key works. The API uses a two-step flow: POST /analyze submits a PDF URL and returns a check id, then GET /result/{id} retrieves the full verdict. (For a language-agnostic overview of what the API detects, see how PDF tamper detection works.)
Step 1a — submit for analysis:
curl -X POST https://api.htpbe.tech/v1/analyze \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"url": "https://api.htpbe.tech/v1/test/clean.pdf"}'
You will receive: {"id": "00000000-0000-4000-8000-000000000001"}
Step 1b — retrieve the result:
curl https://api.htpbe.tech/v1/result/YOUR_CHECK_ID \
-H "Authorization: Bearer YOUR_API_KEY"
You will receive a flat JSON object with "status": "intact" and the full set of analysis fields:
{
"id": "00000000-0000-4000-8000-000000000001",
"status": "intact",
"origin": { "type": "institutional", "software": null },
"creator": "Adobe Acrobat Pro DC",
"producer": "Adobe PDF Library 15.0",
"modification_confidence": "none",
"has_incremental_updates": false,
"update_chain_length": 1,
"signature_removed": false,
"modifications_after_signature": false,
"modification_markers": []
}
(The real response carries every field documented in the struct below.) The same shape comes back for modified and inconclusive verdicts; only the values change. Two fields are conditional: status_reason appears only when status is inconclusive, and outdated_warning only when the check ran against an older algorithm version.
The URL https://api.htpbe.tech/v1/test/clean.pdf is a test mock — it returns a predictable response without consuming quota. Test keys (prefix htpbe_test_) accept only these mock URLs; live keys (prefix htpbe_live_) accept any public PDF URL.
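Because the two key types differ only by prefix, a service can check at startup which kind of key it was given, for example to refuse a test key in production. A minimal sketch: keyMode is a hypothetical helper, not part of the API, and only the two prefixes come from the text above.

```go
package main

import (
	"fmt"
	"strings"
)

// keyMode classifies an API key by the documented prefixes.
// It is a hypothetical helper for illustration, not part of the API.
func keyMode(key string) string {
	switch {
	case strings.HasPrefix(key, "htpbe_test_"):
		return "test"
	case strings.HasPrefix(key, "htpbe_live_"):
		return "live"
	default:
		return "unknown"
	}
}

func main() {
	fmt.Println(keyMode("htpbe_test_abc123")) // test
	fmt.Println(keyMode("htpbe_live_abc123")) // live
	fmt.Println(keyMode("something_else"))    // unknown
}
```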
Step 2: The Result Struct
Define a struct that mirrors the GET /result/{id} response. Go’s encoding/json ignores unknown fields by default, so new API fields never break unmarshalling. Use pointer types (*int64, *string) for the fields the API documents as nullable — that lets you distinguish “absent” from a genuine zero value, which matters for timestamps and the producer string.
package htpbe
// Result mirrors the flat GET /result/{id} response.
// Nullable fields use pointers so "absent" is distinguishable from zero.
type Result struct {
ID string `json:"id"`
Filename string `json:"filename"`
FileSize int64 `json:"file_size"`
PageCount int `json:"page_count"`
AlgorithmVersion string `json:"algorithm_version"`
CurrentAlgorithmVersion string `json:"current_algorithm_version"`
OutdatedWarning string `json:"outdated_warning"`
// Primary verdict: "intact" | "modified" | "inconclusive"
Status string `json:"status"`
// StatusReason is present only when Status == "inconclusive":
// "consumer_software_origin" | "online_editor_origin" |
// "scanned_document" | "html_renderer_origin"
StatusReason string `json:"status_reason"`
Origin struct {
// "consumer_software" | "institutional" | "unknown" |
// "online_editor" | "scanned"
Type string `json:"type"`
Software *string `json:"software"`
} `json:"origin"`
// "certain" | "high" | "none" | null
ModificationConfidence *string `json:"modification_confidence"`
Creator *string `json:"creator"`
Producer *string `json:"producer"`
CreationDate *int64 `json:"creation_date"` // Unix seconds
ModificationDate *int64 `json:"modification_date"` // Unix seconds
PDFVersion *string `json:"pdf_version"`
DateSequenceValid bool `json:"date_sequence_valid"`
MetadataCompletenessScore int `json:"metadata_completeness_score"`
XrefCount int `json:"xref_count"`
HasIncrementalUpdates bool `json:"has_incremental_updates"`
UpdateChainLength int `json:"update_chain_length"`
HasDigitalSignature bool `json:"has_digital_signature"`
SignatureCount int `json:"signature_count"`
SignatureRemoved bool `json:"signature_removed"`
ModificationsAfterSignature bool `json:"modifications_after_signature"`
ObjectCount int `json:"object_count"`
HasJavascript bool `json:"has_javascript"`
HasEmbeddedFiles bool `json:"has_embedded_files"`
// Stable HTPBE_* marker ids, e.g. ["HTPBE_SIGNATURE_REMOVED"].
// Empty when Status is "intact" or "inconclusive".
ModificationMarkers []string `json:"modification_markers"`
}
Two fields deserve a closer look. StatusReason is populated only when Status is inconclusive, and it carries one of four values: consumer_software_origin, online_editor_origin, scanned_document, or html_renderer_origin. The difference matters: a scanned document is benign for a user-submitted handwritten form, but html_renderer_origin on something that claims to be a payslip is a strong signal to route for review, because a real payroll system would not normally produce that origin. Branch on the specific reason, not just on the top-level inconclusive.
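One way to branch on the specific reason is to map it to a review-queue priority. The sketch below is illustrative only: the priority labels and the claimsInstitutional flag are assumptions, and only the four reason strings come from the API.

```go
package main

import "fmt"

// reviewPriority maps an inconclusive status_reason to a review-queue
// priority. The labels "low", "normal", and "high" and the
// claimsInstitutional flag are assumptions made for this example.
func reviewPriority(reason string, claimsInstitutional bool) string {
	switch reason {
	case "scanned_document":
		if claimsInstitutional {
			return "normal"
		}
		return "low" // e.g. a user-submitted handwritten form
	case "html_renderer_origin", "online_editor_origin", "consumer_software_origin":
		if claimsInstitutional {
			return "high" // e.g. a "payslip" rendered from HTML
		}
		return "normal"
	default:
		return "normal" // empty or unrecognised reason: do not guess
	}
}

func main() {
	fmt.Println(reviewPriority("scanned_document", false))    // low
	fmt.Println(reviewPriority("html_renderer_origin", true)) // high
}
```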
ModificationMarkers returns stable machine-readable ids prefixed HTPBE_ — for example HTPBE_SIGNATURE_REMOVED, HTPBE_DATES_DISAGREE, HTPBE_MULTIPLE_REVISION_LAYERS, HTPBE_POST_SIGNATURE_EDIT. Branch your integration logic on the id; render the human-readable label from the dictionary published on htpbe.tech/how. These ids are part of the public contract and never change once shipped.
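Branching on marker ids can be as small as a membership check. A minimal sketch, assuming you want to treat the two signature-related ids quoted above as one condition; signatureTampered is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"slices"
)

// Marker ids quoted in the paragraph above.
const (
	markerSignatureRemoved  = "HTPBE_SIGNATURE_REMOVED"
	markerPostSignatureEdit = "HTPBE_POST_SIGNATURE_EDIT"
)

// signatureTampered reports whether either signature-related marker is
// present in the modification_markers list of a result.
func signatureTampered(markers []string) bool {
	return slices.Contains(markers, markerSignatureRemoved) ||
		slices.Contains(markers, markerPostSignatureEdit)
}

func main() {
	fmt.Println(signatureTampered([]string{"HTPBE_DATES_DISAGREE"}))                            // false
	fmt.Println(signatureTampered([]string{"HTPBE_DATES_DISAGREE", "HTPBE_SIGNATURE_REMOVED"})) // true
}
```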
It is worth defining named constants for the verdicts so the rest of your codebase never compares against bare string literals:
const (
StatusIntact = "intact"
StatusModified = "modified"
StatusInconclusive = "inconclusive"
)
Step 3: A Typed Error
A 401 means your key is wrong; a 402 means the credit pool is dry; a 500 is transient. The retry loop and your business logic both need to branch on the status code, so wrap every non-2xx response in a typed error that carries it.
package htpbe
import "fmt"
// APIError is returned for any non-2xx response from the API.
type APIError struct {
StatusCode int
Code string // machine-readable code from the JSON body, when present
Message string
RetryAfterSecs int // parsed from Retry-After on 429; 0 when absent
}
func (e *APIError) Error() string {
return fmt.Sprintf("htpbe: %d %s: %s", e.StatusCode, e.Code, e.Message)
}
// Retryable reports whether retrying the same request could succeed.
// Only 5xx and 429 are transient; 4xx (other than 429) are permanent.
func (e *APIError) Retryable() bool {
return e.StatusCode >= 500 || e.StatusCode == 429
}
Step 4: The Client
Here is a complete client on the standard library. Verify chains both calls and returns the full result; callers pass a context.Context so the whole round trip honours a deadline or a cancelled request.
A subtlety the retry logic gets right: POST /analyze is the billable, job-creating step. Each successful POST starts a new analysis and draws a credit. GET /result/{id} is a free read. So the two steps are retried independently — the POST is retried on its own until it yields an id, and once that id is in hand a failed GET is retried by re-reading the same result, never by replaying the POST. Wrapping the whole flow in one retry loop would re-submit (and re-bill) a fresh analysis every time a transient GET failure occurred; this client never does that.
package htpbe
import (
"bytes"
"context"
"encoding/json"
"errors"
"fmt"
"io"
"net/http"
"strconv"
"strings"
"time"
)
const defaultBaseURL = "https://api.htpbe.tech/v1"
// Client is a reusable, concurrency-safe HTPBE API client.
type Client struct {
apiKey string
baseURL string
httpClient *http.Client
maxRetries int
}
// Option configures a Client.
type Option func(*Client)
// WithHTTPClient overrides the default *http.Client. A nil value is ignored
// so the option can never produce a nil-deref at request time.
func WithHTTPClient(hc *http.Client) Option {
return func(c *Client) {
if hc != nil {
c.httpClient = hc
}
}
}
func WithBaseURL(u string) Option { return func(c *Client) { c.baseURL = u } }
func WithMaxRetries(n int) Option { return func(c *Client) { c.maxRetries = n } }
// New constructs a Client. The apiKey is required.
func New(apiKey string, opts ...Option) (*Client, error) {
if apiKey == "" {
return nil, errors.New("htpbe: API key is required")
}
c := &Client{
apiKey: apiKey,
baseURL: defaultBaseURL,
httpClient: &http.Client{Timeout: 35 * time.Second},
maxRetries: 3,
}
for _, opt := range opts {
opt(c)
}
// Tolerate a trailing slash so callers can pass either form of the base URL.
c.baseURL = strings.TrimRight(c.baseURL, "/")
return c, nil
}
type analyzeRequest struct {
URL string `json:"url"`
OriginalFilename string `json:"original_filename,omitempty"`
}
type analyzeResponse struct {
ID string `json:"id"`
}
type errorBody struct {
Error string `json:"error"`
Code string `json:"code"`
}
// Verify submits a PDF URL and returns the full verdict. The two steps are
// retried independently: POST /analyze (the billable step) is retried until
// it yields an id, then GET /result/{id} (a free read) is retried on its own.
// A failed GET never replays the POST, so a transient read failure cannot
// create a duplicate analysis job. The context governs cancellation and the
// overall deadline.
//
// originalFilename is optional; pass it so the result's `filename` field
// shows a human-readable name instead of an opaque storage key.
func (c *Client) Verify(ctx context.Context, pdfURL, originalFilename string) (*Result, error) {
if pdfURL == "" {
return nil, errors.New("htpbe: pdfURL is required")
}
// Step 1: submit for analysis (billable — retried in isolation).
id, err := c.withRetry(ctx, func() (string, error) {
return c.submitAnalysis(ctx, pdfURL, originalFilename)
})
if err != nil {
return nil, err
}
// Step 2: read the result (free — retried in isolation, never replays POST).
var result *Result
_, err = c.withRetry(ctx, func() (string, error) {
r, e := c.getResult(ctx, id)
if e != nil {
return "", e
}
result = r
return id, nil
})
if err != nil {
return nil, err
}
return result, nil
}
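// withRetry runs op until it succeeds or the retry budget is spent. Permanent
// API errors (4xx other than 429) return immediately; Retryable() API errors
// (5xx, 429) and non-API errors such as transport failures are retried with
// backoff. It honours a parsed Retry-After and the context deadline.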
func (c *Client) withRetry(ctx context.Context, op func() (string, error)) (string, error) {
var lastErr error
for attempt := 0; attempt <= c.maxRetries; attempt++ {
if attempt > 0 {
// Honour a server-supplied Retry-After on 429; otherwise
// exponential backoff: 1s, 2s, 4s.
delay := time.Duration(1<<(attempt-1)) * time.Second
var apiErr *APIError
if errors.As(lastErr, &apiErr) && apiErr.RetryAfterSecs > 0 {
delay = time.Duration(apiErr.RetryAfterSecs) * time.Second
}
select {
case <-ctx.Done():
return "", ctx.Err()
case <-time.After(delay):
}
}
v, err := op()
if err == nil {
return v, nil
}
lastErr = err
if apiErr := asAPIError(err); apiErr != nil && !apiErr.Retryable() {
return "", err // permanent — do not retry
}
}
return "", fmt.Errorf("htpbe: exhausted retries: %w", lastErr)
}
func (c *Client) submitAnalysis(ctx context.Context, pdfURL, originalFilename string) (string, error) {
body, err := json.Marshal(analyzeRequest{URL: pdfURL, OriginalFilename: originalFilename})
if err != nil {
return "", fmt.Errorf("htpbe: marshal request: %w", err)
}
req, err := http.NewRequestWithContext(ctx, http.MethodPost,
c.baseURL+"/analyze", bytes.NewReader(body))
if err != nil {
return "", err
}
req.Header.Set("Authorization", "Bearer "+c.apiKey)
req.Header.Set("Content-Type", "application/json")
resp, err := c.httpClient.Do(req)
if err != nil {
return "", fmt.Errorf("htpbe: analyze request: %w", err)
}
defer resp.Body.Close()
if resp.StatusCode/100 != 2 {
return "", parseError(resp)
}
var ar analyzeResponse
if err := json.NewDecoder(resp.Body).Decode(&ar); err != nil {
return "", fmt.Errorf("htpbe: decode analyze response: %w", err)
}
if ar.ID == "" {
return "", errors.New("htpbe: analyze response missing id")
}
return ar.ID, nil
}
func (c *Client) getResult(ctx context.Context, id string) (*Result, error) {
req, err := http.NewRequestWithContext(ctx, http.MethodGet,
c.baseURL+"/result/"+id, nil)
if err != nil {
return nil, err
}
req.Header.Set("Authorization", "Bearer "+c.apiKey)
resp, err := c.httpClient.Do(req)
if err != nil {
return nil, fmt.Errorf("htpbe: result request: %w", err)
}
defer resp.Body.Close()
if resp.StatusCode/100 != 2 {
return nil, parseError(resp)
}
var result Result
if err := json.NewDecoder(resp.Body).Decode(&result); err != nil {
return nil, fmt.Errorf("htpbe: decode result: %w", err)
}
return &result, nil
}
// parseError builds a typed APIError from a non-2xx response.
func parseError(resp *http.Response) error {
raw, _ := io.ReadAll(io.LimitReader(resp.Body, 1<<16))
var eb errorBody
_ = json.Unmarshal(raw, &eb) // best effort — body may not be JSON
apiErr := &APIError{
StatusCode: resp.StatusCode,
Code: eb.Code,
Message: eb.Error,
}
if apiErr.Message == "" {
apiErr.Message = http.StatusText(resp.StatusCode)
}
switch resp.StatusCode {
case http.StatusUnauthorized: // 401
apiErr.Message = "invalid API key — check the HTPBE_API_KEY environment variable"
case http.StatusPaymentRequired: // 402
apiErr.Message = "no credits available for this key — top up or subscribe"
case http.StatusRequestEntityTooLarge: // 413
apiErr.Message = "PDF exceeds the 10 MB size limit"
case http.StatusUnprocessableEntity: // 422
apiErr.Message = "the URL did not return a valid PDF file"
case http.StatusTooManyRequests: // 429
apiErr.RetryAfterSecs = parseRetryAfter(resp.Header.Get("Retry-After"))
}
return apiErr
}
// parseRetryAfter handles the delay-seconds form ("30") and the HTTP-date
// form, clamping the result to a sane [1, 600] range. Returns 0 when absent
// or unparseable so callers fall back to their own backoff.
func parseRetryAfter(h string) int {
if h == "" {
return 0
}
if secs, err := strconv.Atoi(h); err == nil {
return clamp(secs, 1, 600)
}
if t, err := http.ParseTime(h); err == nil {
return clamp(int(time.Until(t).Seconds()), 1, 600)
}
return 0
}
func clamp(v, lo, hi int) int {
if v < lo {
return lo
}
if v > hi {
return hi
}
return v
}
func asAPIError(err error) *APIError {
var apiErr *APIError
if errors.As(err, &apiErr) {
return apiErr
}
return nil
}
Two status codes deserve explicit handling in your own code:
- 402 (Payment Required): the key has no credit source left. Credits are universal: a subscription’s monthly quota, a one-time top-up batch, and the welcome credits all draw from one pool. A 402 means all three are exhausted (or there is no active plan on a live key). APIError.Retryable() returns false for it. Surface it to your billing logic rather than retrying, because retrying fails identically until the account is topped up at the pricing page.
- 429 (Too Many Requests): this is server-wide concurrency, not per-key rate limiting. The response carries a Retry-After header, which parseError parses (handling both delay-seconds and HTTP-date forms, clamped to [1, 600]) and stashes on APIError.RetryAfterSecs. The retry loop reads that value before falling back to exponential backoff.
Step 5: The Bank-Statement Gate
The client returns facts. Turning those facts into an accept / reject / review decision is a policy choice that depends on what the document claims to be. A bank statement, a payslip, or a diploma claims institutional origin, so anything other than intact should stop the automated path. A user-generated form is held to a looser standard.
package htpbe
// Decision is the routing outcome for a verified document.
type Decision string
const (
Accept Decision = "accept"
Reject Decision = "reject"
Review Decision = "review"
)
// GateInstitutional maps a verdict to a decision for documents that
// claim institutional origin (bank statements, payslips, diplomas).
// For these, "inconclusive" is never auto-accepted: a document that
// should have come from a bank's system but looks like it was built
// in Word does not get the benefit of the doubt.
func GateInstitutional(r *Result) Decision {
if r == nil {
return Review // nothing to judge: fail closed
}
switch r.Status {
case StatusIntact:
return Accept
case StatusModified:
return Reject
case StatusInconclusive:
// A bank statement that comes back inconclusive typically came from
// consumer software rather than a bank's own system. That is a signal
// to route for review, not proof of tampering. Route to a human.
return Review
default:
// Unknown or empty status: fail closed rather than auto-accept.
return Review
}
}
Wire the client and the gate into an HTTP handler that accepts a JSON body with a reachable URL. The handler runs the check before any business logic touches the file:
package main
import (
"context"
"encoding/json"
"errors"
"log/slog"
"net/http"
"os"
"time"
"yourapp/htpbe"
)
type verifyRequest struct {
DocumentURL string `json:"document_url"`
OriginalFilename string `json:"original_filename"`
}
func main() {
client, err := htpbe.New(os.Getenv("HTPBE_API_KEY"))
if err != nil {
slog.Error("htpbe init failed", "err", err)
os.Exit(1)
}
http.HandleFunc("POST /api/documents", func(w http.ResponseWriter, req *http.Request) {
var body verifyRequest
if err := json.NewDecoder(req.Body).Decode(&body); err != nil || body.DocumentURL == "" {
http.Error(w, `{"error":"document_url is required"}`, http.StatusBadRequest)
return
}
// Bound the whole two-step round trip to 40 seconds.
ctx, cancel := context.WithTimeout(req.Context(), 40*time.Second)
defer cancel()
result, err := client.Verify(ctx, body.DocumentURL, body.OriginalFilename)
if err != nil {
var apiErr *htpbe.APIError
if errors.As(err, &apiErr) {
switch apiErr.StatusCode {
case http.StatusUnauthorized, http.StatusPaymentRequired:
// Configuration / billing error — never expose details to the caller.
slog.Error("htpbe misconfigured", "code", apiErr.Code)
http.Error(w, `{"error":"verification temporarily unavailable"}`, http.StatusServiceUnavailable)
return
case http.StatusUnprocessableEntity:
http.Error(w, `{"error":"the URL did not return a valid PDF"}`, http.StatusUnprocessableEntity)
return
case http.StatusRequestEntityTooLarge:
http.Error(w, `{"error":"PDF must be under 10 MB"}`, http.StatusRequestEntityTooLarge)
return
}
}
slog.Error("htpbe verify failed", "err", err)
http.Error(w, `{"error":"verification failed"}`, http.StatusBadGateway)
return
}
switch htpbe.GateInstitutional(result) {
case htpbe.Reject:
writeJSON(w, http.StatusUnprocessableEntity, map[string]any{
"decision": "reject",
"reason": "document modified after creation",
"modification_markers": result.ModificationMarkers,
})
case htpbe.Review:
writeJSON(w, http.StatusAccepted, map[string]any{
"decision": "review",
"status_reason": result.StatusReason,
})
default: // Accept
writeJSON(w, http.StatusOK, map[string]any{
"decision": "accept",
"check_id": result.ID,
})
}
})
slog.Info("listening on :8080")
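// ListenAndServe always returns a non-nil error; log it instead of dropping it.
if err := http.ListenAndServe(":8080", nil); err != nil {
slog.Error("server stopped", "err", err)
os.Exit(1)
}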
}
func writeJSON(w http.ResponseWriter, status int, v any) {
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(status)
_ = json.NewEncoder(w).Encode(v)
}
An inconclusive result should not be auto-accepted: it typically indicates the file came from consumer software, an online editor, an HTML renderer, or a scanner rather than an institutional generator. That is a signal to route for review, not proof of tampering. For a deeper explanation, see what “inconclusive” really means. For documents that claim institutional origin, treat inconclusive with the same caution as modified: do not accept automatically, route to a human reviewer.
Step 6: Giving the API a Reachable URL
The API does not accept file uploads — it downloads the PDF from a URL you supply, so the file must be publicly reachable for the 2–5 seconds the analysis takes. The cleanest pattern is a short-lived presigned URL from your object store: you never expose the bucket, the link expires in minutes, and passing originalFilename keeps the audit trail readable instead of showing the opaque storage key.
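// Store the upload privately, mint a 5-minute presigned GET URL, verify.
// Fragment: assumes aws-sdk-go-v2 (aws, service/s3) and github.com/google/uuid,
// with s3Client, presigner (s3.NewPresignClient(s3Client)), bucket, and data
// set up elsewhere.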
key := "incoming/" + uuid.NewString() + ".pdf"
if _, err := s3Client.PutObject(ctx, &s3.PutObjectInput{
Bucket: &bucket, Key: &key,
Body: bytes.NewReader(data), ContentType: aws.String("application/pdf"),
}); err != nil {
return nil, err
}
presigned, err := presigner.PresignGetObject(ctx, &s3.GetObjectInput{
Bucket: &bucket, Key: &key,
}, s3.WithPresignExpires(5*time.Minute))
if err != nil {
return nil, err
}
return client.Verify(ctx, presigned.URL, originalFilename)
The same pattern works with Google Cloud Storage (SignedURL on a bucket handle) or Cloudflare R2 (S3-compatible: reuse this with the R2 endpoint). One security note: the API fetches whatever URL you give it, so if a URL ever comes from untrusted input (a user-pasted link, a webhook payload), validate that it resolves to a public host first. Reject localhost, 169.254.169.254 (cloud metadata), and RFC 1918 ranges to close the SSRF surface.
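The check described in that security note can be sketched with the standard library. This is a minimal illustration, not a complete SSRF defence. validatePublicURL, lookupFunc, and staticLookup are hypothetical names, and limiting schemes to http and https is an assumption. The resolver is injected so tests need no DNS; pass net.LookupIP in production.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"net/url"
)

// lookupFunc resolves a hostname to IP addresses; net.LookupIP satisfies it.
type lookupFunc func(host string) ([]net.IP, error)

// validatePublicURL rejects URLs whose host is, or resolves to, a loopback,
// private (RFC 1918 / fc00::/7), link-local, or unspecified address.
func validatePublicURL(raw string, lookup lookupFunc) error {
	u, err := url.Parse(raw)
	if err != nil {
		return fmt.Errorf("parse url: %w", err)
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return fmt.Errorf("unsupported scheme %q", u.Scheme)
	}
	host := u.Hostname()
	if host == "" {
		return errors.New("url has no host")
	}
	var ips []net.IP
	if ip := net.ParseIP(host); ip != nil {
		ips = []net.IP{ip} // literal IP: no lookup needed
	} else if ips, err = lookup(host); err != nil {
		return fmt.Errorf("resolve %q: %w", host, err)
	}
	if len(ips) == 0 {
		return fmt.Errorf("host %q has no addresses", host)
	}
	for _, ip := range ips {
		if ip.IsLoopback() || ip.IsPrivate() || ip.IsUnspecified() ||
			ip.IsLinkLocalUnicast() || ip.IsLinkLocalMulticast() {
			return fmt.Errorf("host %q resolves to non-public address %s", host, ip)
		}
	}
	return nil
}

// staticLookup returns a resolver with fixed answers, for tests and demos.
func staticLookup(addrs ...string) lookupFunc {
	return func(string) ([]net.IP, error) {
		ips := make([]net.IP, 0, len(addrs))
		for _, a := range addrs {
			if ip := net.ParseIP(a); ip != nil {
				ips = append(ips, ip)
			}
		}
		return ips, nil
	}
}

func main() {
	// Literal metadata address: rejected without any lookup.
	fmt.Println(validatePublicURL("http://169.254.169.254/latest/meta-data", net.LookupIP))
	// Hostname resolving to a public address: accepted.
	fmt.Println(validatePublicURL("https://files.example.com/a.pdf", staticLookup("93.184.216.34")))
}
```

A check like this narrows the attack surface but does not remove it: the API resolves the hostname again when it fetches, so a hostname whose DNS answer changes between the two lookups can still slip through. Minting the URL yourself, as in the presigned pattern above, avoids the problem entirely.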
Batch Processing, Test Mode, and Quota
A few operational details, kept short.
Synchronous flow. Analysis is synchronous: POST /analyze blocks until the verdict is computed, then returns the id, and the response also carries a Location header pointing at the result URL. There is no queue to poll and no webhook to register — by the time analyze returns, GET /result/{id} is ready. Verify chains both, so one call gives you the full verdict.
Batch work. For a backlog of statements or a portfolio of claims, fan out across a bounded number of goroutines and let the shared Client reuse its connection pool. Keep the worker count modest: the client retries on 429, but capping concurrency means you rarely hit capacity in the first place.
// Fragment: needs the context, log/slog, and sync imports plus the htpbe package.
func verifyBatch(ctx context.Context, client *htpbe.Client, urls []string) map[string]*htpbe.Result {
const workers = 8 // stay within your plan's concurrency comfort zone
sem := make(chan struct{}, workers)
results := make(map[string]*htpbe.Result)
var mu sync.Mutex
var wg sync.WaitGroup
for _, u := range urls {
wg.Add(1)
sem <- struct{}{}
go func(pdfURL string) {
defer wg.Done()
defer func() { <-sem }()
r, err := client.Verify(ctx, pdfURL, "")
if err != nil {
slog.Warn("verify failed", "url", pdfURL, "err", err)
return
}
mu.Lock()
results[pdfURL] = r
mu.Unlock()
}(u)
}
wg.Wait()
return results
}
Test mode. Every plan includes a test API key (prefix htpbe_test_) that accepts only mock URLs of the form https://api.htpbe.tech/v1/test/{filename}.pdf and returns deterministic responses, like Stripe test cards, with no quota cost. Useful fixtures: clean.pdf → intact, signature-removed.pdf → modified, dates-mismatch.pdf → modified, and inconclusive.pdf → inconclusive. Point Verify at these in your test suite to cover every branch of the gate; for handler unit tests without the network, point the client at an httptest.Server via WithBaseURL and serve canned JSON for /analyze and /result/{id}. Keep test and live keys in separate environment files and never commit either.
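The httptest approach can be sketched as follows. To keep the example self-contained it uses plain net/http calls in place of the client. In a real test you would build the client with WithBaseURL(srv.URL) and call Verify. newFakeAPI and fetchStatus are hypothetical names, and the canned JSON carries only a few of the documented fields.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// newFakeAPI serves canned JSON for /analyze and /result/{id}, returning
// the given status for every result.
func newFakeAPI(status string) *httptest.Server {
	const id = "00000000-0000-4000-8000-000000000001"
	mux := http.NewServeMux()
	mux.HandleFunc("/analyze", func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		_ = json.NewEncoder(w).Encode(map[string]string{"id": id})
	})
	mux.HandleFunc("/result/", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		_ = json.NewEncoder(w).Encode(map[string]any{
			"id": id, "status": status, "modification_markers": []string{},
		})
	})
	return httptest.NewServer(mux)
}

// fetchStatus runs the two-step flow against baseURL and returns the verdict.
func fetchStatus(baseURL string) (string, error) {
	resp, err := http.Post(baseURL+"/analyze", "application/json",
		strings.NewReader(`{"url":"https://example.com/a.pdf"}`))
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	var created struct {
		ID string `json:"id"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&created); err != nil {
		return "", err
	}
	res, err := http.Get(baseURL + "/result/" + created.ID)
	if err != nil {
		return "", err
	}
	defer res.Body.Close()
	var verdict struct {
		Status string `json:"status"`
	}
	if err := json.NewDecoder(res.Body).Decode(&verdict); err != nil {
		return "", err
	}
	return verdict.Status, nil
}

func main() {
	srv := newFakeAPI("modified")
	defer srv.Close()
	status, err := fetchStatus(srv.URL)
	fmt.Println(status, err)
}
```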
Reviewing past checks. GET /checks (relative to the same /v1 base URL as the other calls) returns a paginated list of every result for your key. Filter by status and limit for audit dashboards or weekly reports (c.baseURL+"/checks?status=modified&limit=50", same Authorization header as the other calls).
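Building that query with net/url avoids hand-escaping the parameters. A small sketch: checksURL is a hypothetical helper, and only the status and limit parameters named above are used.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// checksURL builds the list-endpoint URL with optional status and limit
// filters. Empty status or non-positive limit leaves that filter out.
func checksURL(baseURL, status string, limit int) string {
	q := url.Values{}
	if status != "" {
		q.Set("status", status)
	}
	if limit > 0 {
		q.Set("limit", strconv.Itoa(limit))
	}
	u := baseURL + "/checks"
	if enc := q.Encode(); enc != "" {
		u += "?" + enc // Encode sorts keys, so the output is deterministic
	}
	return u
}

func main() {
	fmt.Println(checksURL("https://api.htpbe.tech/v1", "modified", 50))
	// https://api.htpbe.tech/v1/checks?limit=50&status=modified
}
```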
Quota. When you reach your monthly quota, further requests return 402 PAYMENT_REQUIRED until it resets — add a one-time credit pack or move to a higher tier to keep going. Handle the 402 so a quota boundary never silently drops a check, and watch consumption on the dashboard.
What This Does Not Catch
Structural analysis has honest limits, and a Go service making automated decisions should encode them:
- Content fabricated in one pass. If someone opens Word, types a false salary, and exports once, the file was never modified post-creation, so it is structurally intact. The fraud happened at authorship, not at the byte level. This is why a payslip from a consumer tool tends to return inconclusive rather than intact: the analysis cannot vouch for a document anyone could have created from scratch.
- Documents rebuilt from scratch in the original’s software. A determined attacker who recreates a document in the same institutional tool and matches the metadata fields leaves few structural signals. This is rare and high-effort, but possible.
- Encrypted or password-protected PDFs. The service cannot parse a file it cannot open; remove the password before submitting.
These limits are why structural tamper detection works as one layer in a fraud-detection stack, not the whole stack. Pair the structural verdict with domain checks — amount validation, account-number lookups, sender authentication, and your KYC or OCR provider — for a layered defence. See PDF Fraud Prevention Best Practices.
Decisions Before You Ship
The integration surface is intentionally small: one POST, one GET, three verdicts, the typed error above. The complexity lives on the Go side, and two choices matter most:
- Where verification runs. Synchronous inside the request handler gives the caller an immediate decision but blocks for 2–5 seconds; a goroutine or queue consumer returns instantly and defers the verdict. Sync suits low-volume B2B onboarding; async suits high-volume portals.
- inconclusive routing. For documents that claim institutional origin (bank statements, diplomas, payslips), treat inconclusive with the same caution as modified and route to human review; that strict policy is what GateInstitutional encodes. For genuinely user-generated content an inconclusive result may be acceptable as-is, so you may want a second gate with a looser policy.
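A looser second gate might look like the sketch below. The policy is an assumption made for illustration, not a recommendation from the API: inconclusive is accepted for user-generated content, modified goes to review rather than outright rejection, and anything unrecognised fails closed. GateUserGenerated is a hypothetical name.

```go
package main

import "fmt"

// Decision mirrors the routing outcome type from Step 5.
type Decision string

const (
	Accept Decision = "accept"
	Reject Decision = "reject"
	Review Decision = "review"
)

// GateUserGenerated is a looser policy for documents that make no claim
// of institutional origin (forms, letters, user-authored attachments).
func GateUserGenerated(status string) Decision {
	switch status {
	case "intact", "inconclusive":
		// Consumer-software origin is expected for user-generated content.
		return Accept
	case "modified":
		// Assumption: edits are plausible here, so ask a human instead of rejecting.
		return Review
	default:
		// Unknown or empty status: fail closed.
		return Review
	}
}

func main() {
	for _, s := range []string{"intact", "inconclusive", "modified", ""} {
		fmt.Printf("%q -> %s\n", s, GateUserGenerated(s))
	}
}
```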
To start, sign up for HTPBE? — new accounts get five checks to try, then pay-per-check credits or a subscription from $15/mo — copy your test key, and run the curl call from Step 1. The full API reference documents every response field, error code, and the marker dictionary the Go client branches on.