The jsonformatter.org Breach Is a Warning for Every PDF Tool
The November 2025 jsonformatter.org and codebeautify.org incident leaked over 5 GB of credentials and personal data because shared URLs were predictable and uncontrolled. The same architectural pattern describes every server-side PDF tool, and that is the case for a browser-local PDF alternative in one paragraph.
What happened in the breach
On November 25, 2025, The Hacker News published a report titled "Years of JSONFormatter and CodeBeautify Leaks Expose Thousands of Passwords and API Keys." The article — at thehackernews.com — described how researchers harvested over 5 GB of enriched, annotated JSON data captured from two of the internet's most-used developer tools: jsonformatter.org (covering five years of content) and codebeautify.org (covering one year). The dataset spanned more than 80,000 files.
What the dataset contained is the part that should alarm anyone who has ever pasted operational data into a free online tool:
- Usernames and passwords
- Repository authentication keys
- Active Directory credentials
- Database credentials
- FTP credentials
- Cloud environment keys
- LDAP configuration information
- Helpdesk and meeting-room API keys
- SSH session recordings
- Personal information across multiple regulated sectors
The affected sectors named in the reporting span critical infrastructure, government, finance, banking, telecommunications, healthcare, technology, retail, aerospace, education, travel, and cybersecurity. This was not a niche developer tool failure — it was a cross-industry credential leak that pulled production secrets from regulated organizations.
To confirm the data was being actively exploited, the researchers seeded fake AWS access keys into the platforms. Within 48 hours they observed exploitation attempts against the seeded credentials. Attackers were actively scraping the platforms and testing harvested keys against real cloud infrastructure. Both jsonformatter.org and codebeautify.org disabled their save functionality in September 2025 after the researchers alerted affected organizations.
The root cause was architectural
The Hacker News article identifies the technical root cause clearly: predictable URL formats combined with a Recent Links page that enumerated shared content. The two URL schemes were:
https://jsonformatter.org/{id-here}
https://codebeautify.org/{formatter-type}/{id-here}
The {id-here} was sequential or otherwise enumerable. There was no authentication on the shared links, no per-user access control, no rate limiting against mass crawling, and the Recent Links page provided a starting set of valid IDs to expand from. Anyone with a web browser and a script could iterate the address space and pull down whatever the original users had pasted.
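To make the enumeration risk concrete, here is a minimal sketch contrasting a sequential ID scheme with random tokens. The reporting says only that the IDs were predictable, so the integer ID format and the helper names (`neighbours_of`, `new_share_token`) are assumptions for illustration, not a description of either site's actual implementation.

```python
import secrets


def neighbours_of(known_id: int, radius: int) -> list[int]:
    """IDs worth trying next once a single valid sequential ID is known."""
    low = max(0, known_id - radius)
    return [i for i in range(low, known_id + radius + 1) if i != known_id]


def new_share_token(nbytes: int = 32) -> str:
    """Random URL-safe identifier; 32 bytes gives 256 bits of entropy."""
    return secrets.token_urlsafe(nbytes)


if __name__ == "__main__":
    # Hypothetical: one valid ID taken from a "Recent Links" style page.
    print(neighbours_of(1_000_000, radius=3))
    # A random token has no meaningful "next" value to iterate to.
    print(new_share_token())
```

With sequential IDs, every known value points at its neighbours. With a random token, knowing one value says nothing about any other, although the token alone still does not replace authentication.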
None of these decisions was malicious. Each one (sequential IDs, share-by-URL, a Recent page for discoverability) looks reasonable in isolation. The failure was the combination, plus the fact that user pastes routinely contained production secrets. The operator's privacy posture relied on no attacker noticing the share-URL pattern, an assumption that failed and left five years of saved content exposed.
Generalize to every server-side PDF tool
Now substitute "PDF" for "JSON" in that root cause description. A server-side PDF tool accepts user uploads on a server, processes the uploads on the server, and usually offers some form of share-by-link or temporary-download URL for the processed output. The data class is different — PDFs containing bank statements, contracts, medical records, KYC scans, regulatory filings, M&A working files — but the architectural shape is the same:
- User content is held on operator infrastructure during processing.
- Operator chooses retention period via privacy policy.
- Operator may issue share URLs or download URLs that are guessable or enumerable.
- Operator may maintain a Recent or History page for convenience.
- Operator may log file metadata, content fingerprints, or full bodies for service operations.
Every one of those bullets is a place where the jsonformatter and codebeautify failure could repeat with PDFs instead of JSON. A predictable download URL on a PDF compress tool exposes whatever PDF the upstream user just uploaded. A Recent Documents history page on a server-side PDF editor lets enumeration start from a known valid set. Retention policies that nobody reads turn last week's bank statement into this week's data-breach incident.
For the architectural argument expanded with more failure-mode detail, see our companion post on why server-side PDF tools leak data. For the original framing of how the jsonformatter incident applies specifically to API-key paste habits, see never paste API keys into a JSON formatter.
Failure modes server-side PDF tools share with jsonformatter
Three failure modes are documented in the jsonformatter incident and apply directly to PDF tools that share the same architecture:
1. Predictable share or download URLs
jsonformatter's URLs were enumerable. A server-side PDF tool that produces a short URL for the processed output — common for "get the download link" features and for one-click sharing — is subject to the same risk if the URL is guessable without authentication. Long random IDs help; cryptographic share tokens help more; per-user authentication required to access any output is the only structural fix. Many free PDF tools skip all three for the sake of frictionless UX.
2. Convenience pages that enumerate prior work
jsonformatter had a Recent Links page that gave attackers a starting list of valid IDs. A PDF tool's Recent Documents or History page does the same thing for the operator's stored content. Even if individual documents require authentication, an enumeration of recent IDs leaks the existence and timing of user uploads — a side channel that has its own intelligence value.
3. Retention policies the user never sees
Most free server-side PDF tools list retention windows in their privacy policy (one hour, two hours, 24 hours, 30 days). Few users read these. The retention is the breach surface — whatever is stored for that window is what an attacker can potentially exfiltrate if the storage is compromised. The Hacker News report on jsonformatter showed five years of retention captured in a single harvesting run. PDF tools with weeks or months of retention are smaller targets in time scale but the same kind of target.
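The relationship between retention window and exposure can be sketched in a few lines. The file names, timestamps, and helper names below are hypothetical; the only input taken from the text is the idea that a file is exposed for as long as the policy keeps it.

```python
from datetime import datetime, timedelta, timezone


def retention_deadline(uploaded_at: datetime, retention: timedelta) -> datetime:
    """Moment after which the stated policy says the file is purged."""
    return uploaded_at + retention


def possibly_retained(uploads: dict[str, datetime], retention: timedelta,
                      now: datetime) -> list[str]:
    """Names of uploads whose retention window has not yet closed."""
    return sorted(
        name for name, uploaded_at in uploads.items()
        if now < retention_deadline(uploaded_at, retention)
    )


if __name__ == "__main__":
    now = datetime(2025, 11, 25, 12, 0, tzinfo=timezone.utc)
    uploads = {  # hypothetical upload history
        "bank-statement.pdf": now - timedelta(hours=3),
        "contract.pdf": now - timedelta(days=10),
    }
    for label, window in [("2 hours", timedelta(hours=2)),
                          ("30 days", timedelta(days=30))]:
        print(label, possibly_retained(uploads, window, now))
```

The same upload history produces a different exposed set under each policy, which is why the retention window, rather than the upload itself, defines what a storage compromise can reach. The calculation also assumes the operator purges on schedule, which a user cannot verify from outside.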
The browser-local alternative
The alternative is browser-local processing — running the PDF operation entirely inside the user's browser tab using WebAssembly, with no server in the processing path. That is the model at pdfmavericks.com.
What that means in practice:
- The PDF is read from your disk via the File API into the browser tab's memory. No upload request fires.
- The operation runs inside a WebAssembly module that the tab loaded once from a CDN. The module is the same library (pdf-lib, qpdf, MuPDF, or Tesseract for OCR) that drives a lot of server-side PDF infrastructure, recompiled to run in the browser.
- The processed PDF is written back to disk via the browser download API. The in-memory copy is discarded when you close the tab.
- No share URL is generated. No Recent Documents page exists. No retention window applies because no server-side asset was created.
The jsonformatter failure modes do not apply because there is nothing to enumerate. There is no share URL because the file never left the user's tab. There is no Recent Documents page because the operator does not see what the user processes. There is no retention policy because there is no retention. The five-year window of captured content that the researchers harvested from jsonformatter does not have a PDF-Mavericks-equivalent because the corresponding content was never on a pdfmavericks.com server in the first place.
Verification: open pdfmavericks.com/all-tools, pick any tool, open the browser developer tools, switch to the Network panel, clear it, drop a PDF onto the tool, and run the operation. After the clear, the panel should show at most GET requests for static assets from the CDN, and no POST or PUT request carrying the PDF as payload. If one appeared, the file would have been uploaded. It does not appear, because nothing is uploaded.
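The same check can be repeated offline by exporting the Network panel as a HAR file and scanning it. The sketch below assumes the standard HAR layout (`log.entries[].request` with `method`, `url`, `bodySize`, and an optional `postData`); the function names and the file name `session.har` are hypothetical.

```python
import json

# Methods the verification step above looks for.
UPLOAD_METHODS = {"POST", "PUT"}


def find_upload_requests(har: dict) -> list[dict]:
    """Return POST/PUT entries that carried a request body."""
    flagged = []
    for entry in har.get("log", {}).get("entries", []):
        request = entry.get("request", {})
        method = str(request.get("method", "")).upper()
        body_size = request.get("bodySize") or 0  # HAR uses -1 for "unknown"
        has_body = body_size > 0 or "postData" in request
        if method in UPLOAD_METHODS and has_body:
            flagged.append({"method": method,
                            "url": request.get("url", ""),
                            "bodySize": body_size})
    return flagged


def load_har(path: str) -> dict:
    """Read a HAR export from disk (HAR files are JSON)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__":
    # Hypothetical export saved from the browser's Network panel.
    for hit in find_upload_requests(load_har("session.har")):
        print(hit["method"], hit["url"], hit["bodySize"])
```

An empty result is consistent with no upload during the recorded session. It is not proof on its own, since it covers only the requests the browser recorded in that capture.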
What to do if you have already used a server-side PDF tool
If you have routinely uploaded sensitive PDFs to a server-side tool, the response is the same playbook any incident-response team uses for a possible exposure:
- List what was uploaded. Roughly inventory the document categories — bank statements, contracts, KYC scans, medical records, payroll registers — and approximate count over the past 12 months.
- Read the operator's privacy policy. Note the retention window. Anything within that window may still be on the operator's infrastructure. Anything older was retained or purged according to whatever the policy stated.
- Treat any credentials in the uploaded PDFs as potentially exposed. If the PDFs contained API keys, passwords, account numbers, or tokens embedded in the document text, rotate them. The jsonformatter incident specifically captured credentials embedded in pasted JSON — the same risk applies to credentials inside PDFs.
- Document the activity in your internal log. Under GDPR Article 33 (notification of personal-data breach to the supervisory authority) and the DPDPA 2023 Section 8 obligations for data fiduciaries, the uploaded PDF processing may need to be characterized as third-party data processing without a DPA. That is an audit-trail entry, not necessarily a public disclosure — talk to your DPO or legal team.
- Switch routine handling to browser-local tools. For the ongoing workflow, the pdfmavericks.com catalog covers most common operations without creating new uploads. For the umbrella view of what is available browser-locally, see browser-only PDF editor: no upload, no account.
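For the credential-rotation step, a rough triage pass over text already extracted from the uploaded PDFs can help prioritize. The patterns below are illustrative assumptions and far from exhaustive; text extraction is assumed to happen elsewhere, and the sketch reports only counts so that secrets are not copied into logs.

```python
import re

# Illustrative patterns only; a real scan would use a maintained rule set.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "password_assignment": re.compile(r"(?i)\bpassword\s*[:=]\s*\S+"),
}


def find_credentials(text: str) -> dict[str, int]:
    """Count credential-like matches per pattern, omitting patterns with no hits."""
    counts = {}
    for name, pattern in PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            counts[name] = len(matches)
    return counts


if __name__ == "__main__":
    # Hypothetical text extracted from a PDF; the key is AWS's documented example value.
    sample = "user: bob\npassword: hunter2\nkey AKIAIOSFODNN7EXAMPLE"
    print(find_credentials(sample))
```

Any document that produces a hit belongs at the top of the rotation list. A document with no hits may still contain secrets these patterns do not cover.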
None of this requires panic. The jsonformatter data accumulated for five years before it was harvested; most organizations have time to migrate routine PDF handling to a local-first workflow before any specific server-side tool they have used is compromised. The point of the November 2025 disclosure is that the architectural pattern has been proven to fail at scale, not that any specific server-side PDF tool is about to leak in the next 24 hours.
Closing
Server-side file processing is the architecture that produced the jsonformatter and codebeautify breach. It is also the architecture that produces every free-online-PDF-tool retention policy users have to take on faith. Both arrangements ask the user to trust a future operator decision about retention, access control, share URL safety, and incident response. Sometimes those decisions go right for years; sometimes they go wrong for years and the evidence shows up in a story from The Hacker News.
The structural fix is to remove the server from the path. Browser-local PDF tools do that by design. The November 2025 incident is the proof that the structural fix matters — not because every server-side tool is bad, but because the failure mode that took down jsonformatter is repeatable on any tool that shares its shape. PDF documents carrying financial, medical, legal, and personal data are exactly the wrong place to bet on a server-side operator getting every decision right for the next five years.
For the corresponding regulatory framing under EU GDPR, see our redact PDF for GDPR compliance post. For the safest password handling on a bank-statement PDF, see safest way to remove a PDF password.
No upload, no server-side asset, no breach surface
PDF Mavericks runs every operation inside your browser using WebAssembly. The file never leaves your device — so the jsonformatter failure mode cannot repeat here.
Frequently asked questions
What is the jsonformatter.org / codebeautify.org breach?
A November 2025 report from The Hacker News disclosed that researchers harvested more than 5 GB of saved content from jsonformatter.org (five years of data) and codebeautify.org (one year of data). The dataset contained over 80,000 files including usernames, passwords, repository authentication keys, Active Directory credentials, database credentials, FTP credentials, cloud environment keys, LDAP configuration information, helpdesk API keys, meeting-room API keys, SSH session recordings, and personal information. The full reporting is at thehackernews.com.
How did attackers actually access the data?
The root cause was predictable URL formats combined with a Recent Links page that enumerated shared content. The Hacker News report describes URL structures like jsonformatter.org/{id-here} and codebeautify.org/{formatter-type}/{id-here} that bad actors could crawl. There was no authentication on the shared links and no rate limiting that prevented mass enumeration. Researchers seeded fake AWS access keys into the platforms and observed exploitation attempts within 48 hours — the harvested credentials were being actively scraped and tested by attackers.
Which sectors were affected by the leak?
The Hacker News reporting cites affected organizations across critical infrastructure, government, finance, banking, telecommunications, healthcare, technology, retail, aerospace, education, travel, and cybersecurity. The leak was not a niche developer problem — it spanned regulated industries handling personal financial data, patient records, classified communications, and trade secrets. The architectural pattern that caused it (server-side processing of pasted user content with predictable share URLs and no access control) is common to many free online tools, not just JSON formatters.
Why does this matter for PDF tools specifically?
Every server-side PDF tool sits inside the same architectural pattern: accept user input on a server, process the input on the server, retain copies under a published or undocumented policy. The data class is different (PDFs instead of JSON) but the failure mode is the same — anything pasted, uploaded, or processed becomes a server-side asset, subject to whatever retention, share-link, or backup misconfigurations the operator has in place. The November 2025 incident is the proof case for what every server-side file tool risks. Browser-local processing avoids the entire failure class because there is no server-side asset to leak.
Have similar breaches happened to PDF tools before?
Specific server-side PDF tool breaches have been less publicized than the jsonformatter incident, but the architectural risk is identical. Documented credential leaks via developer-tool history pages have hit Pastebin, multiple paste services, and online code beautifiers over the past decade. The pattern — a free service offering convenience in exchange for storing user input on the operator's infrastructure — produces predictable failure modes regardless of file type. A PDF tool with predictable share URLs and a Recent Documents page would produce the same outcome as jsonformatter did, with different data classes (bank statements, contracts, medical records instead of API keys).
What is the safe alternative for routine PDF work?
Browser-local PDF tools that run the entire operation inside the browser tab using WebAssembly. The pdfmavericks.com catalog at pdfmavericks.com/all-tools includes more than 30 operations — merge, split, compress, redact, sign, fill, unlock, OCR, conversion — all running in the browser without an upload. The PDF is read from disk via the File API, edited in tab memory, and written back to disk via the browser download API. No server-side asset is created, so no breach surface exists. Verification: open the browser developer tools Network panel and confirm no POST or PUT requests carry the file after the page loads.
Is this argument just marketing — does every server-side tool really carry the same risk?
The risk is architectural, not marketing. A server-side tool that does everything right (TLS in transit, encryption at rest, short retention, no share URLs, strict access control) can be safe in practice. The jsonformatter and codebeautify incident is a reminder that the architecture allows for failure — a single misconfiguration of share URLs or retention turned a routine operational choice into a 5 GB credential leak. Browser-local processing eliminates the failure surface rather than relying on the operator to configure it correctly forever. Both architectures can be safe; only one cannot fail in the way jsonformatter failed.
What should I do if I have used a server-side PDF tool with sensitive documents?
Assume the document may be retained according to the tool's stated retention policy. Review the operator's privacy policy for retention period (most range from one hour to 90 days for free tiers). If the document contained credentials, personal data, or KYC information, treat it as exposed: rotate any passwords, notify the supervisory authority and affected data subjects where required under GDPR Articles 33 and 34 or DPDPA Section 8, and document the incident in your internal log. Going forward, switch routine sensitive-document handling to browser-local tools where no third-party copy exists in the first place.