How to split a PDF by its bookmarks
Three steps — drop a bookmarked PDF, pick a depth level, download the ZIP. The whole flow runs in your browser, so the source document and the produced parts never leave your device.
- Load the PDF. Drag it onto the dropzone or click Choose PDF. The tool reads the document outline using pdfjs-dist's getOutline call and resolves each bookmark's destination to a page index, which typically takes a few hundred milliseconds even on a 500-page book.
- Pick the depth. Top-level only is the default: one file per chapter. Switch to Levels 1-2 for one file per section, Levels 1-3 for sub-sections, or All levels to split on every bookmark. The outline preview on the left highlights which entries become split points; the output preview on the right shows the resulting filenames and page ranges before you commit.
- Download the ZIP. Hit Split. pdf-lib copies pages into N output documents, JSZip packs them, and a download link appears. Click it to save the file as <original>-split-by-outline.zip. Each PDF inside is named after its bookmark title (or part number, if you picked numeric mode).
What the outline tree looks like
A typical dissertation PDF outline looks like this. The depth level you pick decides which entries become split points — the ones at or above the chosen depth, marked in bold in the preview.
Chapter 1 — Introduction (level 1, p.1)
1.1 Background (level 2, p.3)
1.2 Research Questions (level 2, p.7)
Chapter 2 — Literature Review (level 1, p.12)
2.1 Prior Work (level 2, p.13)
2.1.1 Browser-Local Tools (level 3, p.14)
2.1.2 Server-Based Tools (level 3, p.20)
2.2 Gaps (level 2, p.28)
Chapter 3 — Methodology (level 1, p.34)
Chapter 4 — Results (level 1, p.55)
Chapter 5 — Discussion (level 1, p.78)
References (level 1, p.95)
Appendix A — Code Samples (level 1, p.108)
With Top-level only, this 120-page dissertation produces 7 files (Chapter 1, Chapter 2, Chapter 3, Chapter 4, Chapter 5, References, Appendix A). With Levels 1-2, 11 files. With All levels, 13 files (the two level-3 sub-sections under 2.1 each become their own part).
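The file counts for this outline can be checked with a short sketch. The OutlineNode shape and the splitPoints helper are illustrative names, not the tool's actual code, and the tree below mirrors the dissertation outline with shortened titles.

```typescript
// Hypothetical shape for one outline entry (not the tool's real type).
interface OutlineNode {
  title: string;
  page: number; // 1-based page the bookmark points to
  children?: OutlineNode[];
}

// Collect every entry whose nesting level is at or above maxDepth (level 1 = top).
function splitPoints(nodes: OutlineNode[], maxDepth: number, level = 1): OutlineNode[] {
  if (level > maxDepth) return [];
  const out: OutlineNode[] = [];
  for (const node of nodes) {
    out.push(node);
    if (node.children) {
      out.push(...splitPoints(node.children, maxDepth, level + 1));
    }
  }
  return out;
}

const dissertation: OutlineNode[] = [
  { title: "Chapter 1", page: 1, children: [
    { title: "1.1 Background", page: 3 },
    { title: "1.2 Research Questions", page: 7 },
  ] },
  { title: "Chapter 2", page: 12, children: [
    { title: "2.1 Prior Work", page: 13, children: [
      { title: "2.1.1 Browser-Local Tools", page: 14 },
      { title: "2.1.2 Server-Based Tools", page: 20 },
    ] },
    { title: "2.2 Gaps", page: 28 },
  ] },
  { title: "Chapter 3", page: 34 },
  { title: "Chapter 4", page: 55 },
  { title: "Chapter 5", page: 78 },
  { title: "References", page: 95 },
  { title: "Appendix A", page: 108 },
];

console.log(splitPoints(dissertation, 1).length);        // 7  (Top-level only)
console.log(splitPoints(dissertation, 2).length);        // 11 (Levels 1-2)
console.log(splitPoints(dissertation, Infinity).length); // 13 (All levels)
```

Passing Infinity as the depth is one simple way to model All levels without a separate code path.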
Why no upload — privacy by architecture
Most online PDF splitters (iLovePDF, Smallpdf, PDF24, Sejda) accept your file on a server, slice it there, and hand back a download link. The privacy policies promise deletion after some interval — but the file did sit on a server you don't control, often in a jurisdiction you don't pick. For a dissertation under embargo, an unpublished manuscript, a client's book draft, an internal technical manual, or any document whose chapters you don't want sitting in a vendor S3 bucket, that's a real problem.
This splitter doesn't have a server endpoint that receives your PDF. The page is a static HTML+JS bundle. When you drop a file, it goes into your browser's memory. pdfjs-dist reads the outline locally. pdf-lib copies pages and produces the output documents in browser memory. JSZip packs them. The download link points at a blob URL inside your browser — the ZIP is built on your machine and never touches our infrastructure. You can disconnect from the internet after the page loads and the splitter still works.
The PostHog analytics on this site track anonymous events — file loaded (size only, no content), tool started with the chosen level, tool completed, tool failed. The PDF's content, filename, and page contents are never read by any tracker. We log these events to find UX problems (where do users abandon mid-flow); we never log document data.
Split-by-Outline vs. Split vs. Extract Pages — pick the right tool
Four tools, four intents. The names overlap in competitor docs, so here is the clean separation:
| Tool | Output | Cut points come from | Use when |
|---|---|---|---|
| Split by Bookmarks (this page) | Many PDFs (ZIP) | The PDF's own outline | The PDF has bookmarks (book, dissertation, manual, generated report) and you want one file per chapter or section without typing page numbers. |
| Split PDF | Many PDFs (ZIP) | Page numbers you provide | Any PDF — bookmarked or not. Useful when the splits don't match the outline (cut after page 50, regardless of where chapter 3 begins) or the PDF has no outline at all. |
| Extract Pages | One PDF | Specific pages you pick | You want a single subset PDF — pages 3-7 of a contract, or just the signed page of a 30-page agreement. One output, not many. |
| Delete Pages | One PDF | Pages you discard | Original document minus a few unwanted pages — drop a cover sheet, remove blank scanner pages, trim a duplicate from a botched merge. |
Quick rule of thumb: if your PDF has bookmarks and you want one file per chapter, use split-by-outline. If it doesn't (or you want different cut points than the outline gives you), use /split. If you want one PDF with a subset of pages, use extract-pages. The browser-local privacy guarantee holds across all four — none of them upload your file.
Common reasons to split a PDF by bookmarks
- Dissertation chapter shipping. A 280-page dissertation with 8 chapter bookmarks. The advisor wants Chapter 3 only. Drop the file, pick Top-level only, download the ZIP, attach the Chapter-3 PDF. The other seven files stay on your laptop.
- Technical manual distribution. A 600-page product manual where each section is the unit a support engineer needs. Pick Levels 1-2, get one PDF per section, add the relevant ones to a ticket reply. Faster for the customer than scrolling a 600-page document.
- E-book chapter loading. You bought a reference book legally and want to load chapter 9 onto your e-reader without the rest. Top-level only produces one PDF per chapter, so you can copy just chapter 9 to the device.
- Compliance bundle splitting. A 1200-page audit pack with bookmarks for every section. Each section routes to a different reviewer. Split, ZIP, distribute.
- Research paper compilation. A combined PDF of 30 papers, each one bookmarked at its title. Split-by-outline produces 30 individual PDFs ready for filing.
- Course materials. A semester reading bundle with one bookmark per week. Split into 14 weekly PDFs, push one to the LMS each week instead of asking students to find the right page in a giant file.
- Legal exhibits. A bundle PDF where each exhibit is bookmarked. Split, name the files after the bookmark titles, and serve the relevant exhibits while the Bates-numbered source file stays intact.
Under the hood
Three libraries do the work. pdfjs-dist (Mozilla's pdf.js) reads the outline tree via doc.getOutline(). Each entry exposes a title, optional nested items, and a destination that is either an explicit destination array, whose first element is the page reference, or a named destination string. Named destinations are first resolved to an explicit destination via doc.getDestination(name). Either way we end up with a page reference, which we feed through doc.getPageIndex(ref) to get a 0-based page index for the bookmark.
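A minimal sketch of that resolution step follows. It is written against a small structural interface rather than the pdfjs-dist types directly, so the DestinationResolver and resolvePageIndex names are assumptions for illustration. Only getDestination and getPageIndex correspond to pdf.js document methods. Returning null for unresolvable destinations is also an assumption about how failures might be handled.

```typescript
// Structural view of the two pdf.js document methods this step relies on.
// A loaded pdfjs-dist document is expected to satisfy this shape.
interface DestinationResolver {
  getDestination(name: string): Promise<unknown[] | null>;
  getPageIndex(ref: unknown): Promise<number>;
}

// An outline entry's destination: named (string), explicit (array), or absent.
type OutlineDest = string | unknown[] | null;

// Resolve a bookmark destination to a 0-based page index, or null if it cannot be resolved.
async function resolvePageIndex(
  doc: DestinationResolver,
  dest: OutlineDest,
): Promise<number | null> {
  // Named destinations need one extra lookup to obtain the explicit array.
  const explicit = typeof dest === "string" ? await doc.getDestination(dest) : dest;
  if (!Array.isArray(explicit) || explicit.length === 0) return null;
  try {
    // The first element of an explicit destination is the page reference.
    return await doc.getPageIndex(explicit[0]);
  } catch {
    return null; // treat a bad reference as "no destination"
  }
}
```

Keeping the function behind an interface makes it easy to exercise with a fake document object, without loading a real PDF.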
pdf-lib does the slicing. After the bookmarks are flattened to a list of page indices and sorted, each consecutive pair of start indices defines one inclusive page range: from the first start through the page just before the next start, with the last range running to the final page. For each range we create a fresh PDFDocument, call copyPages with the indices, add the copied pages, and serialize to bytes. No rasterization, no font re-embedding, no quality loss: the output PDFs are clean structural subsets of the input.
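The range arithmetic can be sketched without any PDF library. The PageRange type and the toRanges and pageIndices helpers are hypothetical names. The sketch assumes the start indices are already sorted and de-duplicated, and that the last part runs to the final page of the document.

```typescript
// 0-based, inclusive page range for one output part.
interface PageRange {
  start: number;
  end: number;
}

// starts: sorted, de-duplicated 0-based start indices; pageCount: total pages in the source.
function toRanges(starts: number[], pageCount: number): PageRange[] {
  return starts.map((start, i) => {
    // Each part ends on the page just before the next start; the last part ends on the final page.
    const next = starts[i + 1] ?? pageCount;
    return { start, end: next - 1 };
  });
}

// Expand a range into the explicit index list that a page-copy call would receive.
function pageIndices(range: PageRange): number[] {
  const indices: number[] = [];
  for (let p = range.start; p <= range.end; p++) indices.push(p);
  return indices;
}

// Chapter starts at 1-based pages 1, 12, 34 in a 120-page file become 0-based 0, 11, 33.
console.log(toRanges([0, 11, 33], 120));
// [ { start: 0, end: 10 }, { start: 11, end: 32 }, { start: 33, end: 119 } ]
```

The array returned by pageIndices is the kind of index list that pdf-lib's copyPages accepts as its second argument.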
JSZip wraps everything. Each output PDF goes into the archive under its sanitized filename. We use DEFLATE level 6 compression — PDFs are already compressed internally, so the ZIP-level pass mostly removes redundant headers (5-15% size reduction in practice).
Encrypted PDFs with owner passwords work — pdf-lib loads them with ignoreEncryption: true. Encrypted PDFs that ask for a user password to even open don't work in this tool — use unlock-pdf first if you have the password, then come back.
Edge cases the tool handles
- Bookmarks with the same target page. If two bookmarks point to the same page (a chapter that starts on the same page as a sub-section), the tool dedupes — only the first one becomes a split point at that page. The output filename uses the higher-level bookmark's title.
- Reverse-ordered or non-monotonic bookmarks. If a bookmark resolves to an earlier page than the previous one (rare, usually a malformed PDF), the tool sorts the flattened list before computing ranges — bookmark order in the tree doesn't have to match page order.
- Bookmarks without destinations. Some bookmarks are pure section labels with no page reference. They're skipped — only entries with a resolvable page index produce a split point.
- Front matter before the first bookmark. If page 1 isn't the first bookmark target, the leading pages get bundled into a Front matter file so the cover, preface, and TOC don't get dropped.
- Duplicate sanitized filenames. Two bookmarks named "Summary" in different chapters would produce identical filenames. The tool appends -2, -3 etc. to keep them distinct in the ZIP.
- Empty titles after sanitization. A bookmark titled with only filesystem-hostile characters (rare, but it happens with non-Latin punctuation) becomes empty after sanitizing. The tool falls back to part-NN.
- Very deep outlines. All levels mode walks the entire tree. A PDF with 500 leaf bookmarks produces 500 output files — the ZIP step takes longer and the browser memory footprint scales linearly. Top-level only is the light-on-memory choice for huge documents.
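Several of these rules can be combined into one normalization pass before ranges are computed. The sketch below is an assumption about how that might look, not the tool's source. Candidate, SplitPoint, and normalize are hypothetical names. When two bookmarks share a page, this version keeps the higher-level title and, on a tie, the first one seen.

```typescript
interface Candidate {
  title: string;
  level: number;            // 1 = top-level
  pageIndex: number | null; // null when the bookmark has no resolvable destination
}

interface SplitPoint {
  title: string;
  pageIndex: number; // 0-based
}

function normalize(candidates: Candidate[]): SplitPoint[] {
  const byPage = new Map<number, Candidate>();
  for (const c of candidates) {
    if (c.pageIndex === null) continue; // label-only bookmark: skip
    const seen = byPage.get(c.pageIndex);
    // Same target page: keep the higher-level (smaller level number) entry.
    if (!seen || c.level < seen.level) byPage.set(c.pageIndex, c);
  }

  // Sort by page so tree order does not have to match page order.
  const points: SplitPoint[] = Array.from(byPage.entries())
    .sort((a, b) => a[0] - b[0])
    .map(([pageIndex, c]) => ({ title: c.title, pageIndex }));

  // Pages before the first bookmark become a Front matter part.
  const first = points[0];
  if (first && first.pageIndex > 0) {
    points.unshift({ title: "Front matter", pageIndex: 0 });
  }
  return points;
}

console.log(normalize([
  { title: "Chapter 2", level: 1, pageIndex: 11 },
  { title: "2.0 Overview", level: 2, pageIndex: 11 },
  { title: "Part I", level: 1, pageIndex: null },
  { title: "Chapter 1", level: 1, pageIndex: 2 },
]).map((p) => `${p.title}@${p.pageIndex}`));
// [ 'Front matter@0', 'Chapter 1@2', 'Chapter 2@11' ]
```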
Frequently asked questions
What is a PDF outline (bookmarks) and how do I know if my PDF has one?
The outline is the table of contents pane that opens on the left in Acrobat, Preview, or any PDF reader — a tree of titled entries each pointing to a page in the document. Books, dissertations, technical manuals, and most generated reports (LaTeX, Word with heading styles, Markdown to PDF) include an outline. Scanned PDFs and many quick-export PDFs from spreadsheets or screenshots don't. To check: open the PDF in any reader and look for a sidebar labelled Bookmarks, Contents, or Outline. If it's empty or missing, this tool won't have anything to split on. Drop the file anyway — the tool detects an empty outline and tells you up front, so you don't waste time.
What happens if my PDF has no bookmarks?
The tool reads the outline before you click Split. If it's empty, you get a clear message and links to the alternatives — /split for splitting by page range or one file per page, and /extract-pages for picking specific pages by number. Nothing is uploaded; the detection runs in the browser. There's no point in pretending we can split on nothing — outline-based splitting only works when the outline exists.
How do the level options work?
PDF outlines are nested. Top-level entries are usually chapters (Chapter 1, Chapter 2). Inside each chapter, level-2 entries are sections (1.1, 1.2). Level-3 are sub-sections. Top-level only treats every chapter as one part — most common choice for splitting a book or dissertation. Levels 1-2 produces one file per section, useful for technical manuals where each section is the unit you want to share. Levels 1-3 goes one level deeper. All levels treats every bookmark as a split point, including the deepest sub-sub-sections — this can produce a lot of tiny files, so use it only when the document is structured that way and you actually want each leaf entry as its own PDF.
What does the Front matter file contain?
If the first bookmark in your PDF doesn't point to page 1, there are pages before the first split point — title page, dedication, copyright, acknowledgements, table of contents itself. The tool puts those leading pages into a file labelled Front matter so they're not lost. If the first bookmark IS at page 1, no Front matter file gets generated. This matches how most book-splitting workflows want it: the chapters split cleanly, and the prologue material rides along as one preserved chunk.
How are the output filenames generated?
Two patterns. Title mode: <original-filename>-<bookmark-title>.pdf — for example, dissertation-Chapter 3 Methodology.pdf. The bookmark title is sanitized to remove filesystem-hostile characters (slashes, colons, asterisks, question marks, quotes, angle brackets, pipe), trimmed to 100 characters, and falls back to part-NN if the title becomes empty after sanitization. Numeric mode: <original-filename>-part-NN.pdf with two-digit zero-padded sequential numbers — useful when bookmark titles are duplicated or non-Latin and you want predictable filenames. If two parts produce the same filename in title mode, the tool appends -2, -3 etc. so nothing silently overwrites another file in the ZIP.
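A rough sketch of those naming rules follows. The function names and the exact character class are assumptions: the pattern below treats both slash directions and the double quote as hostile, which may differ from what the tool actually strips.

```typescript
// Characters assumed to be filesystem-hostile: slashes, colon, asterisk,
// question mark, double quote, angle brackets, pipe.
const HOSTILE = /[\/\\:*?"<>|]/g;

function sanitizeTitle(title: string): string {
  return title.replace(HOSTILE, "").trim().slice(0, 100).trim();
}

// Two-digit, zero-padded part label: part-01, part-02, ...
function partName(n: number): string {
  return `part-${n < 10 ? `0${n}` : String(n)}`;
}

function buildFilenames(base: string, titles: string[], numeric = false): string[] {
  const used = new Map<string, number>();
  return titles.map((title, i) => {
    const clean = numeric ? "" : sanitizeTitle(title);
    // Fall back to part-NN when numeric mode is on or the title sanitized to nothing.
    const stem = `${base}-${clean || partName(i + 1)}`;
    const count = (used.get(stem) ?? 0) + 1;
    used.set(stem, count);
    // The second and later occurrences get -2, -3, ... so nothing is overwritten in the ZIP.
    return count === 1 ? `${stem}.pdf` : `${stem}-${count}.pdf`;
  });
}

console.log(buildFilenames("dissertation", ["Chapter 3: Methodology", "Summary", "Summary", "???"]));
// [ 'dissertation-Chapter 3 Methodology.pdf', 'dissertation-Summary.pdf',
//   'dissertation-Summary-2.pdf', 'dissertation-part-04.pdf' ]
```

This simple counter does not guard against a bookmark that is literally titled "Summary-2"; a production version would need to check the generated name against the used set as well.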
Does it preserve text, fonts, and quality?
Yes. The split uses pdf-lib's copyPages, which does a structure-level page copy — it grabs each page object and the resources it references (fonts, images, color profiles) and embeds them into the output document. There is no rasterization, no font re-embedding, no quality loss. Selectable text stays selectable. Vector graphics stay sharp at any zoom. Each output PDF is a clean subset of the input pages.
Will internal links and cross-references survive?
Page-level content survives intact: text, images, vector graphics, embedded fonts, AcroForm field annotations on the included pages. The document outline of each output file is not re-built — pdf-lib doesn't rewrite outline entries to point at renumbered pages. Cross-page link annotations that pointed to pages in a different output part become dangling references (Acrobat will show a broken-link icon when clicked). For most splitting use cases this is fine — once a chapter is its own file, internal references to other chapters are expected to be absent. If you need preserved cross-document navigation, splitting is the wrong operation.
How is split-by-outline different from regular split?
Regular /split asks where to cut — page 5, every 10 pages, one file per page. You provide the cut points. Split-by-outline reads the cut points from the document's own bookmarks. For a 400-page dissertation with 8 chapters, /split would need you to type the start page of each chapter. Split-by-outline reads the chapter starts from the outline and produces 8 PDFs in one click. The tradeoff: split-by-outline only works on PDFs with an outline; regular /split works on anything.
Does the file get uploaded?
No. The PDF stays on your device. pdfjs reads the outline locally. pdf-lib copies the pages and writes the output PDFs in browser memory. JSZip packs the result into a ZIP, also in browser memory. The download link points at a blob URL — the file is built on your machine and never touches our infrastructure. You can disconnect from the internet after the page loads and the splitter still works. PostHog logs anonymous events (file loaded with size only, tool started with chosen level, tool completed) to find UX problems; document content is never sent.
What is the file size limit?
Soft limit around 200 MB for browser memory reasons. The pdf-lib copyPages step holds the source document and the output documents in memory simultaneously. A 200-page, 50 MB PDF splits into 12 chapters in a few seconds on modern hardware. A 1500-page, 500 MB technical manual will struggle on older devices — the browser may stall mid-split. For very large PDFs, splitting into fewer, larger parts (Top-level only) is much lighter than All levels.