APRA CPS 234 Document Fraud Detection: A Guide for AU-Licensed Entities

HTPBE Team·09.05.2026·13 min read

This article is a snapshot — content was accurate as of May 2026 (code examples tested against the API as of April 2026). The product evolves actively; specific counts, examples, and detection rules may have changed since publication — see the changelog for the current state.

CPS 234 came into force on 1 July 2019 for all APRA-regulated entities, including authorised deposit-taking institutions, insurers, and registrable superannuation entities. The standard requires that information security controls be commensurate with the vulnerabilities and threats facing information assets, the criticality and sensitivity of those assets, and the potential consequences of a security incident involving them.

Most CPS 234 compliance work has focused on network perimeter controls, access management, and incident response. The document layer has received far less attention.

This is changing. Document fraud detection under CPS 234 is now a live compliance obligation, not a theoretical one. Document fraud in Australian financial services (altered bank statements, fabricated payslips generated in Xero or MYOB, manipulated superannuation statements, fraudulent ATO Notices of Assessment) has increased sharply since 2022 as more lending, insurance, and KYC decisions incorporate submitted PDF documents. Each of those submitted PDFs is an information asset entering a decision-making pipeline. CPS 234 applies directly.

What CPS 234 Requires

Paragraph 21 of CPS 234 requires that an APRA-regulated entity implement information security controls to protect the confidentiality, integrity, and availability of information assets, including those managed by related parties and third parties.

Paragraph 27 requires that an entity test the effectiveness of its information security controls through a systematic testing program, with the nature and frequency of that testing commensurate with, among other factors, the criticality and sensitivity of those assets.

Submitted documents — income evidence, bank statements, superannuation statements, tax returns — qualify as “information assets” under the standard. They enter automated decisioning pipelines. They influence credit assessments, claims assessments, and KYC determinations. A fraudulent document in that pipeline is a security incident: it compromises the integrity of an information asset, produces a corrupted decision output, and may trigger a regulatory breach if it results in a non-compliant lending decision.

Most APRA-regulated entities have a gap here. Controls exist for the security of their own systems and data stores, but the integrity of documents received from external parties is typically checked only at the visual layer — if at all. No structural check confirms whether the PDF that entered the pipeline is the one originally generated by the claimed source.

The Australian Document Fraud Surface

The document types that appear most frequently in Australian lending, insurance, and KYC workflows each carry specific fraud patterns.

Xero and MYOB payslips. Both platforms are widely used by Australian small-to-medium businesses, and both produce PDFs with identifiable producer signatures. A legitimate Xero payslip carries Xero’s PDF generation fingerprint in the document’s structural metadata. When a payslip that claims to come from Xero was actually produced in Microsoft Word, that contradiction is visible inside the file. Similarly, an employee who downloads their MYOB-generated payslip and edits the gross earnings figure in a consumer PDF editor leaves a structural record — the original producer string and the editing tool’s producer string both appear in the file’s modification history.

Australian bank statements. CBA, ANZ, NAB, and Westpac all generate statements through institutional document systems. The resulting PDFs carry the producer signatures of those generation systems — not Microsoft Excel, not iLovePDF, not Google Docs. A bank statement that claims to come from CommBank but was regenerated from a spreadsheet is structurally a different class of document from one generated by CommBank’s own systems. The file itself records which.

Superannuation statements. Industry funds and retail super platforms generate member statements through document management systems. A super statement that was edited to inflate the account balance — a common pattern in property lending and SMSF applications — carries the structural markers of the editing session alongside the original generation event.

ATO Notices of Assessment. The ATO generates NOA documents through its own systems. These documents carry specific structural characteristics that are inconsistent with consumer PDF creation tools. Fabricated NOAs — a documented fraud vector in Australian mortgage applications — frequently originate in tools such as Adobe Acrobat, Word, or online editors, all of which leave different structural fingerprints from genuine ATO-generated documents.
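The contradiction described for each document type above, an institutional claim paired with a consumer-tool producer, can be sketched as a simple comparison. This is a minimal illustration, not HTPBE?'s detection logic: the hint list contains only the consumer tools named in this article, and the source labels and function names are hypothetical.

```python
import re

# Illustrative hints only, drawn from the consumer tools named in this article.
# This is not a complete or official list of producer strings.
CONSUMER_TOOL_HINTS = (
    "microsoft word",
    "microsoft excel",
    "ilovepdf",
    "google docs",
    "adobe acrobat",
)

# Hypothetical labels for sources whose documents come from institutional systems.
INSTITUTIONAL_SOURCES = {"xero", "myob", "cba", "anz", "nab", "westpac", "ato"}


def normalize(producer: str) -> str:
    """Lowercase, drop trademark symbols, and collapse whitespace."""
    cleaned = producer.replace("\u00ae", "").replace("\u2122", "")
    return re.sub(r"\s+", " ", cleaned).strip().lower()


def producer_contradicts_claim(claimed_source: str, producer: str) -> bool:
    """True when an institutional claim is paired with a consumer-tool producer."""
    if claimed_source.strip().lower() not in INSTITUTIONAL_SOURCES:
        return False
    normalized = normalize(producer)
    return any(hint in normalized for hint in CONSUMER_TOOL_HINTS)
```

A statement presented as CBA output with a producer of Microsoft Excel would return True here. An unrecognised producer returns False, which means "no contradiction found", not "genuine".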

Why OCR and Visual Review Miss This

Most document intake platforms in Australian financial services process PDFs through two checks: OCR field extraction (to pull income figures, account numbers, dates) and visual review (to confirm the document looks correct). In some KYC workflows, an identity proofing provider also performs template analysis.

None of these checks reads what the file actually is at the structural level.

OCR extracts the text rendered on the page. It cannot distinguish between text that was in a document when it was generated by CBA’s core banking system and text that was placed on the page by an online editor three days later. The extracted numbers look identical because, on the rendered page, they are.

Visual review has the same limitation. A skilled edit of a Westpac PDF — one where the fraudster preserved fonts, alignment, and formatting — is indistinguishable from the original by a human reviewer looking at the rendered page. The modification is not on the rendered page. It is in the binary structure of the file.

Open Banking data aggregation platforms like Basiq and Illion address a different problem: they retrieve transaction data directly from financial institutions, bypassing the submitted document entirely. That covers bank statements when applicants consent to data sharing. It does not cover payslips, superannuation statements, tax documents, or cases where an applicant submits a document from an institution not connected to the aggregation platform.

The structural forensics layer fills the gap that OCR, visual review, and data aggregation each leave: it confirms whether the PDF that was submitted was generated by the system it claims to originate from, or whether it was constructed or modified after original generation.

The Structural Signals HTPBE? Reads

Document integrity analysis operates on the PDF’s internal structure — the same data that is present in every PDF but is never rendered on the page.

The primary signals are:

Producer and creator fields. Most PDFs record the software that generated them in the Producer and Creator metadata fields. A legitimate Xero payslip shows Xero’s PDF library as the producer. If the file was subsequently opened and re-saved in Adobe Acrobat or an online editor, both the original producer and the editing tool’s producer appear in the modification record. When those two entries belong to different categories of software (institutional document generator versus consumer editing tool), the document’s origin is inconsistent with its claimed source.

Modification history (xref chain). The PDF format records each save operation as a separate section in the file. A document that was generated once and submitted directly has a single revision. A document that was generated, then opened and edited, then saved, has two revisions. This structural record cannot be cleanly erased without rebuilding the file entirely, and most consumer fraud tools do not do so.

Incremental updates. When a PDF is edited using a standard tool, changes are appended to the end of the original file rather than written in place. The result is that both the original content and the modification are present in the binary. The modification is structurally identifiable even without access to the original document.

Digital signature coverage. Some Australian institutional documents carry digital signatures. A document that has been modified after a digital signature was applied will show the signature as covering only a portion of the file, with content appearing outside the signed byte range. This is one of the highest-confidence signals available: whether content was added after signing is a structural fact, not a probabilistic inference.

Timestamp consistency. The file records when it was created and when it was last modified. Inconsistencies between those timestamps, between the metadata timestamps and the claimed document period, or between timestamps across different metadata layers within the same file are a structural flag.
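Two of the signals above, revision count and signature coverage, can be approximated directly from a file's raw bytes. The sketch below is a rough heuristic under stated assumptions, not how HTPBE? performs its analysis: it treats each %%EOF marker as one save and reads the last /ByteRange array it finds. Linearized PDFs and files with several signatures need more careful handling than this. The function names are illustrative.

```python
import re
from typing import Optional


def count_revisions(pdf_bytes: bytes) -> int:
    """Count %%EOF markers. Each save, original or incremental, normally ends with one."""
    return len(re.findall(rb"%%EOF", pdf_bytes))


def signature_covers_whole_file(pdf_bytes: bytes) -> Optional[bool]:
    """Check whether the last /ByteRange reaches the end of the file.

    Returns None when no /ByteRange is present (the file is treated as unsigned).
    """
    pattern = rb"/ByteRange\s*\[\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*\]"
    matches = re.findall(pattern, pdf_bytes)
    if not matches:
        return None
    # /ByteRange holds two (offset, length) pairs; the second pair ends the signed span.
    _, _, second_offset, second_length = (int(value) for value in matches[-1])
    signed_end = second_offset + second_length
    # Anything other than trailing whitespace after the signed span was added later.
    return pdf_bytes[signed_end:].strip() == b""
```

A file generated once and never edited would typically report one revision. A signed file that returns False has bytes after the signed range, which is the condition the signature paragraph describes.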

The verdict returned is one of three: intact (the document shows no structural evidence of post-creation modification), modified (one or more structural signals indicate the document was altered after generation), or inconclusive (the document was created with consumer software and therefore lacks the institutional structural fingerprint against which comparison can be made).

What “Inconclusive” Means in an APRA Context

inconclusive is not a failure of the analysis. It is a precise finding with direct operational significance.

When a document returns inconclusive, it means the file was created with consumer software — Microsoft Word, Google Docs, a print-to-PDF driver — rather than with the institutional document generation system it is presented as coming from. For the document types that appear in APRA-regulated entity workflows, this has a clear operational meaning: institutional documents from CBA, Xero, the ATO, or major super funds do not originate in consumer software.

If a bank statement returns inconclusive with a status_reason of consumer_software_origin and the producer field shows Microsoft Excel, the document presents as a CBA statement but was generated in Excel. This is not a document of uncertain status. It is a document that cannot have been generated by CBA’s systems. The correct operational response is to require an alternative verification channel, such as a direct data feed or a verified document request through secure channels, rather than to re-request the PDF from the applicant.

In a CPS 234 compliance context, inconclusive on an institutional document type is a control trigger, not a neutral outcome. Your procedure should define explicitly what action it requires.
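One way to make that procedure explicit is to encode it as a routing function, so every verdict maps to a named action. The sketch below is an assumption about how an entity might do this. The three status values come from the verdicts described earlier; the document type labels, the action names, and the manual-review fallback for non-institutional documents are hypothetical.

```python
from enum import Enum


class Action(Enum):
    PROCEED = "proceed"
    FRAUD_QUEUE = "fraud_investigation_queue"
    ALTERNATIVE_VERIFICATION = "alternative_verification"
    MANUAL_REVIEW = "manual_review"


# Hypothetical labels for document types that should come from institutional systems.
INSTITUTIONAL_DOC_TYPES = {"bank_statement", "payslip", "super_statement", "ato_noa"}


def route(status: str, doc_type: str) -> Action:
    """Map a verdict and document type to the action a procedure might define."""
    if status == "modified":
        return Action.FRAUD_QUEUE
    if status == "inconclusive":
        # On an institutional document type, inconclusive is a control trigger.
        if doc_type in INSTITUTIONAL_DOC_TYPES:
            return Action.ALTERNATIVE_VERIFICATION
        return Action.MANUAL_REVIEW
    if status == "intact":
        return Action.PROCEED
    # Fail loudly on an unknown verdict instead of letting it pass silently.
    raise ValueError(f"unrecognised status: {status!r}")
```

Raising on an unrecognised status is a deliberate choice: a verdict the procedure does not cover should stop processing, not default to approval.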

The Audit Trail Requirement

The control testing requirement in paragraph 27 of CPS 234 means an entity must be able to demonstrate the effectiveness of its controls. For document integrity controls, this means maintaining records of which documents were assessed, what the assessment returned, and what action was taken.

HTPBE? returns a structured JSON result for every analysis, with a unique check ID that can be stored alongside the document record in your system of record. The response includes the verdict, the specific modification markers that triggered it, the producer and creator fields extracted from the file, the timestamp data, and the modification confidence level.

A typical response for a modified payslip:

{
  "id": "ck_9f2e4b17-8c3a-4d7f-a2e1-6b5c3d8e9f0a",
  "status": "modified",
  "modification_confidence": "high",
  "modification_markers": ["HTPBE_MULTIPLE_REVISION_LAYERS", "HTPBE_EDITING_TOOL_FINGERPRINT"],
  "producer": "iLovePDF",
  "creator": "Xero",
  "creation_date": 1743120000,
  "modification_date": 1743379200,
  "xref_count": 2,
  "has_incremental_updates": true,
  "update_chain_length": 1,
  "has_digital_signature": false
}

This response is audit-ready. The check ID ties the analysis result to the specific document review event. The modification markers name exactly what was detected. The producer and creator fields record the structural evidence. All of this is retrievable via the API for as long as you retain the check ID in your records.
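To show how the fields in that response can be read by a reviewer or a downstream system, here is a small sketch that turns the two date fields into readable timestamps. It assumes creation_date and modification_date are Unix epoch seconds in UTC, which the sample values suggest but which should be confirmed against the API documentation. The summarize function is illustrative.

```python
from datetime import datetime, timezone

# Field names and values copied from the sample response above.
sample = {
    "id": "ck_9f2e4b17-8c3a-4d7f-a2e1-6b5c3d8e9f0a",
    "status": "modified",
    "modification_markers": [
        "HTPBE_MULTIPLE_REVISION_LAYERS",
        "HTPBE_EDITING_TOOL_FINGERPRINT",
    ],
    "creation_date": 1743120000,
    "modification_date": 1743379200,
}


def summarize(result: dict) -> dict:
    """Convert epoch timestamps (assumed UTC seconds) into a reviewer-friendly summary."""
    created = datetime.fromtimestamp(result["creation_date"], tz=timezone.utc)
    modified = datetime.fromtimestamp(result["modification_date"], tz=timezone.utc)
    return {
        "check_id": result["id"],
        "status": result["status"],
        "created_utc": created.isoformat(),
        "modified_utc": modified.isoformat(),
        "days_between": (modified - created).days,
        "markers": list(result["modification_markers"]),
    }
```

Under that assumption, the sample describes a file created on 28 March 2025 and modified three days later, on 31 March 2025.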

For entities subject to CPS 234 reporting requirements, the ability to retrieve structured, timestamped records of document integrity assessments with named finding details is the difference between a demonstrable control and an informal process.
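A record of that kind might look like the following sketch, which pairs the returned check ID with your own document key and the action taken, then serialises it as one JSON line for an append-only log. The record shape, the document_id and action_taken fields, and the function names are assumptions for illustration. Only id, status, and modification_markers come from the response shown above.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class IntegrityAuditRecord:
    document_id: str   # hypothetical: the key of the document in your system of record
    check_id: str      # the "id" field returned by the analysis
    status: str
    markers: tuple
    action_taken: str  # hypothetical: whatever your procedure did with the verdict
    recorded_at: str   # ISO 8601 timestamp in UTC


def build_audit_record(
    document_id: str,
    result: dict,
    action_taken: str,
    now: Optional[datetime] = None,
) -> IntegrityAuditRecord:
    """Combine an analysis result with local context into one immutable record."""
    moment = now or datetime.now(timezone.utc)
    return IntegrityAuditRecord(
        document_id=document_id,
        check_id=result["id"],
        status=result["status"],
        markers=tuple(result.get("modification_markers", [])),
        action_taken=action_taken,
        recorded_at=moment.isoformat(),
    )


def to_json_line(record: IntegrityAuditRecord) -> str:
    """Serialise the record as a single JSON line, suitable for an append-only log."""
    return json.dumps(asdict(record), sort_keys=True)
```

Keeping the record immutable and writing it at the moment of the routing decision makes it easier to show later which assessment led to which action.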

Integration Into Australian Document Intake Workflows

The integration model for APRA-regulated entities maps onto three common workflow patterns.

Lending document intake. At the point where an applicant submits income or financial documents, the PDF is uploaded to your storage layer and the URL is submitted to the HTPBE? API. The analysis returns within seconds. Applications where submitted documents return modified are routed to a fraud investigation queue. Applications where institutional document types return inconclusive (consumer software origin) are held for alternative verification. Documents that return intact proceed through standard processing.

curl -X POST https://api.htpbe.tech/v1/analyze \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://your-storage.example.com/applications/a1b2c3/payslip.pdf"}'

Insurance claims intake. At the point of claim submission, each supporting document — repair estimates, medical reports, receipts, income statements — is assessed before entering the claims assessment queue. Modified documents are flagged for adjuster review with the specific signals identified. The check ID is stored in the claims management system against the document record.

KYC and onboarding. At the point of customer onboarding, supporting documents submitted as income or identity evidence are assessed alongside the KYC provider’s identity proofing flow. The two layers operate independently and answer different questions: the KYC provider answers “is this person who they claim to be?” and the document integrity check answers “was this PDF generated by the source it claims?”
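All three patterns make the same call as the curl command shown under lending intake. For services that run Python, a minimal standard-library equivalent might look like the sketch below. The endpoint, headers, and request body mirror that curl example. The function names, the timeout value, and the absence of retry or error handling are simplifications for illustration.

```python
import json
import urllib.request

API_URL = "https://api.htpbe.tech/v1/analyze"  # endpoint as shown in the curl example


def build_analyze_request(api_key: str, document_url: str) -> urllib.request.Request:
    """Build the POST request without sending it, which keeps it easy to test."""
    body = json.dumps({"url": document_url}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def analyze(api_key: str, document_url: str, timeout: float = 30.0) -> dict:
    """Send the request and parse the JSON response. Network errors are not handled here."""
    request = build_analyze_request(api_key, document_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Separating request construction from sending lets the intake service unit-test the payload and headers without touching the network or a live API key.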

Australian-Specific Document Patterns

Several Australian document types carry structural characteristics that make integrity analysis particularly valuable.

Xero and MYOB payslips both originate from SaaS accounting platforms with consistent PDF generation stacks. Their structural fingerprints are stable across versions. A payslip that claims Xero origin but was produced elsewhere is immediately distinguishable at the structural level.

CBA, ANZ, NAB, and Westpac statement PDFs each carry the producer signatures of the respective bank’s document generation infrastructure. These signatures differ from one another and differ substantially from consumer tools. A statement that claims CBA origin and returns producer: "Microsoft Excel" is a document that cannot have been generated by CBA’s systems.

Documents sourced from myGov, including ATO Notices of Assessment downloaded from the portal, carry structural characteristics of government document generation systems. Fabricated NOAs, which are a documented fraud vector in Australian mortgage applications, almost universally originate in different toolchains.

Superannuation statements from the major industry funds and retail platforms carry fund-specific producer signatures. A super statement fabricated for an SMSF loan application will typically show a consumer software origin that is inconsistent with the fund’s actual document generation system.

Who This Applies To

Any APRA-regulated entity that accepts PDF documents as part of a credit assessment, claims assessment, or onboarding process has a document integrity control gap if submitted PDFs are not subject to structural fraud detection.

The most directly affected entity types:

ADIs with retail or SME lending books that accept income documents, bank statements, and tax documents from applicants
General insurers and life insurers with claims processes that accept supporting documents
RSEs and fund administrators with member services or advice workflows that accept financial documentation
Fintechs operating under an ADI licence or credit licence with document-intensive onboarding flows

The control is not complex to implement. The HTPBE? API accepts a document URL and returns a structured verdict in seconds. Integration into an existing document intake workflow is a matter of adding a single API call at the point of document receipt, storing the returned check ID, and defining the routing logic for each verdict class.

For Australian fintech and lending use cases specifically, the fintech lending workflow covers integration patterns and decision thresholds in detail. For Xero payslip fraud detection, see the Xero payslip fraud detection page. For superannuation statement integrity, see fake super statement detection.

HTPBE? is available immediately on self-serve plans from $15 per month, with no procurement cycle required. API keys are provisioned on signup and include test keys for integration testing against synthetic documents before any live documents are processed.