After I submitted my 50-page camera-ready paper to POPL 19, I received an email from the conference publishers indicating my appendix (21 pages) was too long. They requested I split the appendix into a separate document.
There was only one problem: my appendix and paper had references to each other’s sections, which meant they had to be produced in the same run of LaTeX (lest those pesky “??” placeholders start showing up). However, using tools like pdftk to split the resulting document would destroy the nice table of contents generated by the LaTeX run. I did a lot of Googling, but there is no tool available that splits a PDF while preserving these bookmarks out of the box.
To solve this problem, I’ve hacked up a simple Python script that dumps a textual representation of the source PDF’s bookmarks, splits the PDF, and then updates the two resulting PDFs with the bookmarks extracted from the source PDF.
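To make the dump/split/update idea concrete, here is a minimal sketch of the bookkeeping involved. It is not the script itself: it is written for Python 3, the function names and temporary file names are hypothetical, and it assumes the `BookmarkBegin` / `BookmarkTitle` / `BookmarkLevel` / `BookmarkPageNumber` records that `pdftk dump_data` emits. The last function only builds the pdftk command lines and does not run them.

```python
def parse_bookmarks(dump_text):
    """Collect (title, level, page) tuples from `pdftk dump_data` text."""
    records, current = [], None
    for line in dump_text.splitlines():
        if line.strip() == "BookmarkBegin":
            current = {}
            records.append(current)
        elif current is not None and line.startswith("Bookmark"):
            # Split on the first colon only, so titles may contain colons.
            key, _, value = line.partition(":")
            current[key] = value.strip()
        else:
            current = None  # any other record ends the current bookmark
    return [
        (r["BookmarkTitle"], int(r["BookmarkLevel"]), int(r["BookmarkPageNumber"]))
        for r in records
    ]


def split_bookmarks(bookmarks, n):
    """Bookmarks on pages 1..n stay put; later ones are shifted back by n."""
    first = [b for b in bookmarks if b[2] <= n]
    second = [(title, level, page - n) for title, level, page in bookmarks if page > n]
    return first, second


def format_bookmarks(bookmarks):
    """Render bookmarks back into the text format read by `pdftk update_info`."""
    lines = []
    for title, level, page in bookmarks:
        lines += [
            "BookmarkBegin",
            f"BookmarkTitle: {title}",
            f"BookmarkLevel: {level}",
            f"BookmarkPageNumber: {page}",
        ]
    return "\n".join(lines) + "\n"


def pdftk_commands(source, n, first_out, second_out):
    """Build (but do not run) the pdftk calls; temp file names are made up."""
    return [
        ["pdftk", source, "dump_data", "output", "bookmarks.txt"],
        ["pdftk", source, "cat", f"1-{n}", "output", "first_raw.pdf"],
        ["pdftk", source, "cat", f"{n + 1}-end", "output", "second_raw.pdf"],
        ["pdftk", "first_raw.pdf", "update_info", "first_info.txt", "output", first_out],
        ["pdftk", "second_raw.pdf", "update_info", "second_info.txt", "output", second_out],
    ]
```

In this sketch, the text produced by `format_bookmarks` for each half would be saved as `first_info.txt` and `second_info.txt` before the `update_info` calls. Note that it leaves bookmark levels untouched, so a subsection whose parent section stays in the first half keeps its original nesting level in the second.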
You can find the script here. To use it, you’ll need Python 2 and pdftk (>= 1.45) installed somewhere on your system.
The script is invoked as:
python ./split_toc source.pdf n second.pdf
source.pdf is split into two PDFs, one containing pages 1 - n from source.pdf and the second containing pages n + 1 onward. The first PDF (pages 1 - n) overwrites source.pdf, and the second PDF is written to second.pdf. As a part of the splitting process, the table of contents of source.pdf is split between the new source.pdf and second.pdf, with updates to the referenced pages as appropriate. This script doesn’t support arbitrary page ranges, but you can pretty easily accomplish what you want by composing multiple invocations of the script.
I hope this saves you as much hair pulling as it did me!