Merging PDF files while keeping the table of contents
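Some publishers only provide their books as one PDF per chapter, but I prefer reading a book as a whole, so I looked up how to stitch the chapters into a single PDF. It turns out to be easy! Just install MuPDF and run the following command: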
mutool merge -o output.pdf input1.pdf input2.pdf input3.pdf ...
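If the merge needs to be driven from a script, the same command line can be assembled in Python. This is only a minimal sketch: the helper names `build_merge_command` and `merge_with_mutool` are made up for illustration, and it assumes `mutool` is installed and on the `PATH`.

```python
import subprocess


def build_merge_command(inputs, output="output.pdf"):
    """Return the argv list for `mutool merge -o OUTPUT INPUT...`."""
    inputs = [str(path) for path in inputs]
    if not inputs:
        raise ValueError("need at least one input PDF")
    return ["mutool", "merge", "-o", str(output), *inputs]


def merge_with_mutool(inputs, output="output.pdf"):
    """Run mutool; raises CalledProcessError if mutool exits with an error."""
    subprocess.run(build_merge_command(inputs, output), check=True)


# Only build the command here, so nothing is executed by accident.
command = build_merge_command(["input1.pdf", "input2.pdf"], "book.pdf")
print(" ".join(command))
```

Passing the arguments as a list rather than as one shell string means file names containing spaces need no extra quoting.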
Yesterday I assumed that mutool merge, like PyMuPDF's join, would not keep the table of contents, so I wrote the following script:
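import fitz  # PyMuPDF
from sys import argv

# Optional `-o output.pdf`. Check membership first: argv.index("-o")
# raises ValueError when the flag is absent.
if "-o" in argv:
    i = argv.index("-o")
    argv.pop(i)  # Drop the "-o" flag itself
    output = argv.pop(i)  # Assume `-o output.pdf`
else:
    output = "output.pdf"


def page_plus(offset):
    """Return a function that shifts a TOC row's page number by `offset`."""
    def page_plus_offset(row):
        row[2] += offset  # row is [level, title, page]
        return row
    return page_plus_offset


with fitz.open() as doc:
    toc = []
    for chapter in argv[1:]:
        with fitz.open(chapter) as f:
            # Ignore t[3] (get_toc(False)) in case kind == fitz.LINK_NAMED
            # len(doc) is the number of pages merged so far, i.e. the offset
            toc.extend(map(page_plus(len(doc)), f.get_toc()))
            # Metadata is unchanged, that's why we need to manually set toc
            doc.insert_pdf(f)
    doc.set_toc(toc)
    doc.save(output, garbage=3, deflate=True)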
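The key step in the script is the page offset: each chapter's table of contents starts at page 1, so every entry has to be shifted by the number of pages already merged. Here is a small sketch of just that arithmetic on plain lists, with no PyMuPDF involved. The names `shift_toc` and `merge_tocs` and the sample chapters are made up for illustration; rows use the `[level, title, page]` layout from the script above.

```python
def shift_toc(toc, offset):
    """Return a copy of `toc` with every page number moved by `offset`."""
    return [[level, title, page + offset] for level, title, page in toc]


def merge_tocs(chapters):
    """Merge (page_count, toc) pairs given in reading order."""
    merged = []
    pages_so_far = 0
    for page_count, toc in chapters:
        merged.extend(shift_toc(toc, pages_so_far))
        pages_so_far += page_count
    return merged


# Hypothetical input: a 10-page chapter followed by an 8-page chapter.
chapters = [
    (10, [[1, "Chapter 1", 1], [2, "Section 1.1", 3]]),
    (8, [[1, "Chapter 2", 1], [2, "Section 2.1", 5]]),
]
merged = merge_tocs(chapters)
for row in merged:
    print(row)
```

Unlike `page_plus` in the script, `shift_toc` builds new rows instead of modifying the originals, which makes it easier to test on its own.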
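Then I started writing this blog post and figured I had better try mutool before publishing.
After trying it, I found that I had written the script for nothing. 😭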