How HR Teams Can Automate Diploma Verification with a PDF Checking API
Code examples verified against the API as of March 2026. If the API has changed since then, check the changelog.
Credential fraud is no longer a fringe concern for hiring teams. Gartner projects that by 2028, one in four candidate profiles will contain fabricated or AI-generated information. ADP Research found that one in eight resumes contains falsified information — and that number has been climbing steadily as AI-assisted document forgery becomes accessible to anyone with a laptop.
For HR Directors, Talent Acquisition Managers, and People Ops Leads, this creates a concrete operational problem: how do you reliably screen diploma authenticity at scale without paying $20–$150 per document for full background check services on every applicant in your pipeline?
This article explains how PDF metadata analysis works, what it reveals about diploma authenticity, and how to integrate a diploma verification API into your ATS workflow in an afternoon.
The Scale of the Problem
The global fake diploma industry is estimated at $21 billion annually. That figure reflects everything from diploma mills issuing credentials for non-existent degrees to individuals purchasing edited PDFs of real university templates and presenting them as their own.
Checkr reports that 23% of companies have discovered identity fraud among new hires. The damage typically extends well beyond the initial hiring decision:
- Recruiting costs: sourcing, interviews, offer letters, onboarding
- Training investment: weeks or months of role-specific training
- Productivity loss: work assigned to someone unqualified to perform it
- Legal exposure: depending on the role, negligent hiring claims
The average cost of a bad hire exceeds $50,000 when you account for all of these factors. For regulated industries — healthcare, finance, law, engineering — the regulatory and liability exposure can be substantially higher.
The core challenge is not only that fraud has become more common. It is that it has become harder to detect visually.
Why Visual Inspection No Longer Works
A decade ago, fake diplomas were often low-quality printouts with obvious formatting problems: wrong fonts, blurry seals, misaligned text. A trained HR coordinator could frequently spot them on inspection.
That era is over.
Modern PDF editing tools — Adobe Acrobat, Canva, LibreOffice, Affinity Publisher — allow anyone to open a PDF template, replace names and dates with pixel-perfect precision, match fonts exactly, and export a document that is visually indistinguishable from the original. AI-assisted design tools have made this even faster. A convincing fake diploma for any major university can now be produced in under 30 minutes by someone with no technical background.
What visual inspection cannot see is the file’s structural metadata: the information embedded in the PDF itself about how, when, and with what software it was created.
This is where automated verification has a decisive advantage.
What Diploma Metadata Actually Reveals
Every PDF contains metadata fields that record the document’s origin and history. For a legitimately issued university diploma, this metadata follows a consistent and recognizable pattern.
Authentic university diploma — typical metadata profile:
- Producer: institutional document management software (PeopleSoft, Ellucian Banner, Oracle HCM, or a custom registrar system)
- Creator: a system service account, not an individual user
- Creation date: matches or closely precedes the graduation ceremony date
- Modification date: identical to creation date (document was never reopened)
- XRef table structure: single increment, consistent with a document that was generated once and never edited
Forged diploma — typical metadata profile:
- Producer: “Canva” / “LibreOffice Impress” / “Microsoft Word” / “Adobe Acrobat DC”
- Creator: personal user account or application default
- Creation date: days or weeks ago, regardless of the claimed graduation year
- Modification date: present and different from creation date (template was edited)
- XRef table structure: multiple increments, indicating the document was opened and saved multiple times
No legitimate university uses Canva to issue diplomas. No registrar’s office sends graduates a Word document. When a candidate submits a PDF claiming to be a 2019 diploma from a major research university but the metadata shows it was produced in Canva last Tuesday, that is not an inconclusive edge case. That is a red flag that warrants immediate follow-up.
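The two profiles above can be turned into a simple checklist. The following is a minimal sketch, not the HTPBE implementation: the function name, the list of consumer tools, and the "created after the claimed graduation year" rule are illustrative assumptions drawn from the examples in this section.

```python
from datetime import datetime, timezone

# Assumption: illustrative list built from the tools named in this article.
CONSUMER_PRODUCERS = ("canva", "libreoffice", "microsoft word", "adobe acrobat", "google docs")


def metadata_red_flags(producer, created, modified, xref_count, claimed_year):
    """Return human-readable red flags for one PDF's metadata (hypothetical helper)."""
    flags = []
    # Producer field names a consumer tool rather than a registrar system
    if any(name in producer.lower() for name in CONSUMER_PRODUCERS):
        flags.append("consumer software producer")
    # A modification date that differs from the creation date suggests re-editing
    if modified is not None and modified != created:
        flags.append("modification date differs from creation date")
    # More than one xref section means the file was saved incrementally
    if xref_count > 1:
        flags.append("multiple xref increments")
    # File created in a later year than the claimed graduation
    if created.year > claimed_year:
        flags.append("created after claimed graduation year")
    return flags


flags = metadata_red_flags(
    producer="Canva",
    created=datetime(2026, 3, 4, 14, 0, tzinfo=timezone.utc),
    modified=datetime(2026, 3, 4, 14, 29, tzinfo=timezone.utc),
    xref_count=3,
    claimed_year=2019,
)
print(flags)
```

A legitimately reissued diploma could trip the date rule, so a checklist like this is best treated as a prompt for follow-up rather than a verdict.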
A Realistic API Response: Suspicious Diploma
Here is an example of what the HTPBE API returns for a diploma with clear forgery indicators:
{
  "id": "a3f7c219-08b1-4e52-9d6a-bc1234567890",
  "status": "modified",
  "creator": "Canva",
  "producer": "Canva",
  "creation_date": 1772634131,
  "modification_date": 1772635863,
  "file_size": 348210,
  "xref_count": 3,
  "pdf_version": "1.7",
  "origin": { "type": "consumer_software", "software": "Canva" },
  "has_incremental_updates": true,
  "has_digital_signature": false,
  "signature_removed": false,
  "modifications_after_signature": false,
  "has_javascript": false,
  "has_embedded_files": false,
  "modification_markers": [
    "Different creation and modification dates",
    "Multiple cross-reference tables (incremental updates)"
  ]
}
The analysis identified multiple independent signals that converge on the same conclusion. A single anomaly in isolation might warrant follow-up. Four anomalies pointing in the same direction (consumer design software in origin, a recent creation_date, a modification_date that differs from it, and an xref_count of 3) amount to a clear finding.
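To make a response like this readable on a candidate record, it helps to decode the date fields. The sketch below works on a trimmed copy of the example response and assumes that creation_date and modification_date are Unix epoch seconds, which the sample values suggest but this article does not state. The summarize helper and its output keys are hypothetical.

```python
import json
from datetime import datetime, timezone

# Trimmed copy of the example response above; only the fields used below.
raw = """{
  "status": "modified",
  "creation_date": 1772634131,
  "modification_date": 1772635863,
  "xref_count": 3,
  "origin": {"type": "consumer_software", "software": "Canva"},
  "modification_markers": [
    "Different creation and modification dates",
    "Multiple cross-reference tables (incremental updates)"
  ]
}"""


def summarize(result):
    """Build a short recruiter-facing summary (hypothetical helper)."""
    # Assumption: timestamps are Unix epoch seconds in UTC
    created = datetime.fromtimestamp(result["creation_date"], tz=timezone.utc)
    return {
        "status": result["status"],
        "created_utc": created.isoformat(),
        # Seconds between first creation and the last recorded save
        "edit_gap_seconds": result["modification_date"] - result["creation_date"],
        "consumer_origin": result["origin"]["type"] == "consumer_software",
        "markers": result["modification_markers"],
    }


summary = summarize(json.loads(raw))
print(summary)
```

Under that assumption, the file in the example was created in March 2026 and saved again about half an hour later.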
Integrating the API into Your ATS Workflow
The integration pattern is straightforward. Most ATS platforms (Greenhouse, Lever, Workday, SmartRecruiters) support webhooks or custom integrations that can trigger an HTTP request when a candidate uploads a document.
The workflow:
- Candidate uploads diploma to your ATS during application
- ATS stores the file and generates a publicly accessible URL
- Your integration sends a POST request to the HTPBE API with that URL
- API returns a verdict in 2–5 seconds
- Your integration writes the result back to the candidate record
- If status is "modified" or "inconclusive", the application is automatically flagged for HR review
Basic curl example:
curl -X POST https://htpbe.tech/api/v1/analyze \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://your-ats.com/storage/candidate-documents/diploma-123.pdf"
  }'
The response arrives in seconds. You can parse the status field to implement whatever routing logic fits your process — auto-advance intact files, queue modified files for manual review, or send the full findings JSON to the recruiter’s dashboard.
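For an integration written in Python, the same request can be built with the standard library alone. This is a sketch that mirrors the curl example: the endpoint, headers, and body come from that example, while the function names and the 30-second timeout are assumptions. Error handling is omitted for brevity.

```python
import json
import urllib.request

API_URL = "https://htpbe.tech/api/v1/analyze"  # endpoint from the curl example


def build_request(document_url, api_key):
    """Build the POST request without sending it (hypothetical helper)."""
    body = json.dumps({"url": document_url}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def analyze(document_url, api_key, timeout=30):
    """Send the request and return the parsed JSON response."""
    req = build_request(document_url, api_key)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Building the request does not touch the network, so it is safe to inspect.
req = build_request(
    "https://your-ats.com/storage/candidate-documents/diploma-123.pdf",
    "YOUR_API_KEY",
)
print(req.get_method(), req.full_url)
```

Splitting request construction from sending keeps the network call in one place, which makes the integration easier to unit test.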
For teams building custom integrations, the API documentation covers authentication, request formats, error handling, and test mode.
Interpreting the Three Verdicts
The API returns one of three verdicts. Knowing what each one means determines how your team responds.
intact — Proceed with Confidence
The document shows no signs of post-creation modification. The producer software is consistent with institutional document management systems. Creation and modification dates are consistent with the claimed issuance date. XRef structure is clean.
This does not constitute a guarantee — metadata can theoretically be manipulated by someone with technical knowledge — but it is a strong positive signal. For most hiring scenarios, an intact verdict allows you to proceed to the next stage without additional document verification.
modified — Request Official Verification
The analysis detected post-creation modifications. Do not simply reject the candidate based on this result, but do not proceed without verification either.
The appropriate response is to request an official transcript or verification letter sent directly from the institution to your organization, not re-submitted by the candidate. Most universities offer transcript verification services; many work with third-party verification platforms like the National Student Clearinghouse.
inconclusive — Consumer Software Origin
The document was created with consumer software (Microsoft Word, Canva, LibreOffice, Google Docs) rather than institutional document management systems. No modifications were detected after creation, but the origin itself is the problem.
Legitimate universities do not issue diplomas created in Word or Canva. An inconclusive verdict for a university diploma almost always means the candidate produced the document themselves, even if they have not technically edited an existing file.
The correct response is the same as for modified: request official verification directly from the institution. Explain to the candidate that you require an official transcript and that documents created with personal productivity software cannot be accepted as institutional records.
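The three verdicts map naturally onto a small routing table. In the sketch below the status strings are the ones described in this section, while the action names and the fallback for unexpected values are assumptions to adapt to your own ATS.

```python
# Action names are hypothetical placeholders for your ATS workflow steps.
ROUTES = {
    "intact": "advance",
    "modified": "request_official_verification",
    "inconclusive": "request_official_verification",
}


def next_step(status):
    """Map an API verdict to a workflow action."""
    # Unknown or missing values go to a human instead of silently advancing
    return ROUTES.get(status, "manual_review")


for verdict in ("intact", "modified", "inconclusive", "unexpected"):
    print(verdict, "->", next_step(verdict))
```

Sending anything unrecognized to manual review means a future API change cannot silently advance an unchecked document.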
The Economics: First-Pass Filtering vs. Full Background Checks
Full background check services that include credential verification typically charge $20–$150 per document, depending on the provider and the depth of verification (domestic vs. international, contact with institutions, etc.).
These services are thorough, but they create a bottleneck: if you run full background checks on every applicant before advancing them, you are spending significant money on candidates who may not survive a basic phone screen.
HTPBE is designed as a first-pass filter, not a replacement for comprehensive background checks.
| Scenario | Cost |
|---|---|
| Full background check service (per document) | $20–$150 |
| HTPBE Growth plan (350 checks/month) | ~$0.43/check |
| HTPBE Starter plan (30 checks/month) | ~$0.50/check |
For a recruiting team processing 200 diploma submissions per month:
- HTPBE pre-screening: ~$86/month at the Growth plan's ~$0.43 per-check rate (the plan itself is billed at $149/month for 350 checks)
- Full background checks on all 200: $4,000–$30,000/month
The practical workflow is layered: use HTPBE to screen the full applicant pool in the first pass, eliminate obvious fakes immediately, and apply full background check services only to candidates who have cleared initial screening and received an offer — when the investment is genuinely warranted.
This approach concentrates your background check spend on candidates who are likely to be hired, while catching the majority of credential fraud before you invest significant recruiting time.
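The comparison can be reproduced with a few lines of arithmetic. The per-check rate and the full-check price range below are the article's figures. The number of finalists who receive a full background check is a hypothetical value chosen only to illustrate the layered workflow.

```python
SUBMISSIONS = 200          # diploma submissions per month (from the article)
PER_CHECK = 0.43           # approximate Growth-plan rate per check
FULL_CHECK = (20, 150)     # full background check, low and high price per document
FINALISTS = 10             # hypothetical: candidates who reach the offer stage

# Pre-screening every submission at the per-check rate
prescreen = SUBMISSIONS * PER_CHECK

# Full background checks on every submission (low, high)
full_all = tuple(SUBMISSIONS * price for price in FULL_CHECK)

# Layered: pre-screen everyone, full checks only for finalists (low, high)
layered = tuple(prescreen + FINALISTS * price for price in FULL_CHECK)

print(f"pre-screen all:    ${prescreen:,.0f}")
print(f"full checks, all:  ${full_all[0]:,.0f} to ${full_all[1]:,.0f}")
print(f"layered workflow:  ${layered[0]:,.0f} to ${layered[1]:,.0f}")
```

Changing FINALISTS to match your own offer volume shows how much of the background check budget the first-pass filter leaves untouched.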
Building the Business Case Internally
If you are proposing this integration to a CFO or CHRO, the numbers are straightforward.
A single bad hire costs $50,000+ in direct and indirect costs. If your organization makes 50 hires per year and credential fraud affects even 1–2% of candidates (well below the industry averages cited above), you are statistically looking at 0.5 to 1 fraudulent hires per year, or roughly one every one to two years. The $149/month Growth plan costs $1,788 annually. Even at the low end, an expected loss of about $25,000 per year dwarfs the subscription cost.
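For a slide or spreadsheet, the expected-value argument reduces to a short calculation. The inputs are the article's figures. Two simplifying assumptions are made here: the fraud rate among hires equals the rate among candidates, and every bad hire costs exactly $50,000.

```python
HIRES_PER_YEAR = 50
FRAUD_RATES = (0.01, 0.02)      # low and high estimates from the article
BAD_HIRE_COST = 50_000          # article's lower bound per bad hire
PLAN_COST_PER_YEAR = 149 * 12   # Growth plan at $149/month

# Expected number of fraudulent hires per year (low, high)
expected_fraudulent_hires = tuple(HIRES_PER_YEAR * r for r in FRAUD_RATES)

# Expected annual loss if none of them are caught (low, high)
expected_loss = tuple(n * BAD_HIRE_COST for n in expected_fraudulent_hires)

print("expected fraudulent hires per year:", expected_fraudulent_hires)
print("expected annual loss:", expected_loss)
print("annual plan cost:", PLAN_COST_PER_YEAR)
```

The comparison is an upper bound on savings, since no screening step catches every fraudulent document.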
Beyond cost avoidance, there are compliance considerations. In industries where specific qualifications are legally required — nursing licenses, engineering certifications, CPA credentials — hiring someone with fraudulent credentials creates regulatory exposure that far exceeds any direct costs.
Next Steps
If you want to test the analysis before committing to an integration, the HTPBE web tool accepts PDF uploads directly — no API key required. Upload a sample diploma and see the full metadata analysis and verdict in real time.
For API integration, test API keys are available on all plans including free. Test mode returns synthetic responses so your development team can build and validate the integration without spending production quota.
Verify any candidate diploma — free web tool, instant results.
Upload a diploma at htpbe.tech and see the full metadata analysis in under 5 seconds. No account required.
Ready for volume? The Growth plan covers 350 checks per month at $0.43 per check — enough for most mid-size recruiting pipelines. See API plans.