Does PDF to CSV work on scanned PDFs?

No. Extraction works on text-based PDFs — documents exported from Excel, accounting software, bank portals, or any digital source. Scanned PDFs store page images with no text layer; OCR support for those is planned for v2.
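A minimal sketch of how a missing text layer could be detected, assuming the text strings for each page have already been extracted (for example from the `str` field of the items PDF.js returns from `getTextContent`). The `PageText` type and `pagesWithoutText` function are hypothetical names used for illustration and are not part of the tool.

```typescript
// Hypothetical shape: the text strings extracted from one page.
type PageText = string[];

// Returns the 1-based numbers of pages that contain no non-blank text,
// which is what a scanned (image-only) page would look like.
function pagesWithoutText(pages: PageText[]): number[] {
  const missing: number[] = [];
  pages.forEach((strings, index) => {
    const hasText = strings.some((s) => s.trim().length > 0);
    if (!hasText) missing.push(index + 1);
  });
  return missing;
}
```

If every page is reported as missing text, the file is most likely a scan and would need OCR before any table extraction.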

Are merged cells handled?

Not automatically. The tool detects columns by x-position gaps in the text stream. A merged cell spans several columns, so values in that row can shift into a neighboring column. Review the preview table before exporting, and export page by page for complex layouts.

Can I edit columns before exporting?

Partly. The preview table shows the detected rows and columns, and you can include or exclude individual pages via the checkbox panel. Column boundaries cannot be adjusted yet; drag-to-move separators are on the v2 roadmap.
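As an illustration of the export step, here is a rough sketch of turning the included pages into CSV text. The `PageTable` shape, the `included` flag (standing in for the checkbox state), and the function names are assumptions made for this example; the quoting follows the common RFC 4180 convention rather than anything confirmed about the tool.

```typescript
// Hypothetical shape: one extracted table per page plus its checkbox state.
interface PageTable {
  pageNumber: number;
  rows: string[][];
  included: boolean;
}

// Quote a field when it contains a comma, a double quote, or a line break,
// doubling any embedded double quotes.
function escapeField(value: string): string {
  return /[",\r\n]/.test(value) ? `"${value.replace(/"/g, '""')}"` : value;
}

// Serialize the rows of all included pages, in page order, as CSV text.
function toCsv(pages: PageTable[]): string {
  const lines: string[] = [];
  for (const page of pages) {
    if (!page.included) continue; // unchecked pages are skipped
    for (const row of page.rows) {
      lines.push(row.map(escapeField).join(","));
    }
  }
  return lines.join("\r\n");
}
```

Quoting matters for bank statements and invoices, where descriptions often contain commas that would otherwise split a cell in two.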

Is the file uploaded to a server?

No. All processing runs in your browser using PDF.js. Your PDF never leaves your device. There is no account, no cloud copy, and no server-side retention.

What about complex multi-table PDFs?

The algorithm works best on PDFs with one consistent table structure per page, such as bank statements and invoices. For PDFs with multiple tables side by side or overlapping layouts, export page by page and cross-reference each section manually.

PDF to CSV

Extract tabular data from PDFs into CSV. Browser-only — nothing uploaded.

How PDF to CSV extraction works

PDF.js reads the text content stream embedded in each page, the same layer used by search engines and screen readers. Every text item comes with an x/y coordinate, a width, and a height. The tool groups items with a similar y-position (within ±5 px) into rows, then treats any x-position gap wider than 2.5× the median character width as a column boundary.
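The row and column rules described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the tool's actual code: the `TextItem` shape is a simplified, hypothetical stand-in for a PDF.js text item, y is assumed to increase down the page, character width is estimated as item width divided by string length, and gaps are measured on the combined horizontal extent of all items on the page.

```typescript
// Hypothetical, simplified text item: left edge x, vertical position y, width, text.
interface TextItem { str: string; x: number; y: number; width: number; }

const ROW_TOLERANCE = 5; // items within ±5 units of a row's first item share that row
const GAP_FACTOR = 2.5;  // a gap wider than 2.5× the median character width splits columns

function median(values: number[]): number {
  if (values.length === 0) return 0;
  const sorted = values.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Group items into rows by comparing each item's y with the first item of the current row.
function groupRows(items: TextItem[]): TextItem[][] {
  const sorted = items.slice().sort((a, b) => a.y - b.y || a.x - b.x);
  const rows: TextItem[][] = [];
  for (const item of sorted) {
    const last = rows[rows.length - 1];
    if (last && Math.abs(item.y - last[0].y) <= ROW_TOLERANCE) last.push(item);
    else rows.push([item]);
  }
  for (const row of rows) row.sort((a, b) => a.x - b.x);
  return rows;
}

// Find x positions where no item on the page covers a stretch wider than the minimum gap.
function columnBoundaries(items: TextItem[]): number[] {
  const charWidths = items.filter((i) => i.str.length > 0).map((i) => i.width / i.str.length);
  const minGap = GAP_FACTOR * median(charWidths);
  const spans = items.map((i) => [i.x, i.x + i.width]).sort((a, b) => a[0] - b[0]);
  const boundaries: number[] = [];
  let coveredTo = -Infinity;
  for (const [start, end] of spans) {
    if (coveredTo !== -Infinity && start - coveredTo > minGap) {
      boundaries.push((coveredTo + start) / 2); // place the boundary mid-gap
    }
    coveredTo = Math.max(coveredTo, end);
  }
  return boundaries;
}

// Build a table: one array of cell strings per row, one cell per detected column.
function toTable(items: TextItem[]): string[][] {
  const bounds = columnBoundaries(items);
  return groupRows(items).map((row) => {
    const cells: string[] = [];
    for (let c = 0; c <= bounds.length; c++) cells.push("");
    for (const item of row) {
      const col = bounds.filter((b) => b < item.x).length;
      cells[col] = cells[col] ? `${cells[col]} ${item.str}` : item.str;
    }
    return cells;
  });
}
```

For example, four items laid out as a header row ("Date", "Amount") and a data row ("01/02", "250.00") with a wide horizontal gap between the two x positions come back as a two-row, two-column table. A cell that spans that gap would hide the boundary, which is one way merged cells can shift values.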

This works well on PDFs exported from Excel, accounting software, bank portals, GST return portals, and financial reporting tools — anywhere the source document was digital. Scanned PDFs (paper fed through a scanner) store page images with no text layer; OCR support for those is on the v2 roadmap.

Use cases

Bank statement reconciliation — extract transaction rows to CSV for Tally or Excel import
GST return preparation — copy invoice line items directly into your accounting system
Financial report analysis — get numbers out of PDF reports into a spreadsheet
Vendor invoice processing — extract item, quantity, and amount columns automatically

All processing runs locally in your browser. Your PDF never touches a server.
