Among dozens of other file types, FileOptimizer also compresses PDFs, often significantly. The catch is that the plugin it uses for this, smpdf, is free for non-commercial use only and, annoyingly, overwrites the PDF's metadata to say so.
The following tool remedies these metadata changes (but not the license situation!).
#! python
'''
Optimizes a PDF with FileOptimizer. But as "/Producer" and "/Creator" get
spoiled by this, we first save metadata and restore it after optimization.
This means we also accept non-compressed object definitions (as created by
FileOptimizer).
'''
from __future__ import print_function
import os, subprocess, sys, tempfile, time
import fitz                                    # PyMuPDF

# time.clock() was removed in Python 3.8; fall back to it only where
# time.perf_counter() does not exist (Python 2)
timer = getattr(time, "perf_counter", None) or time.clock

assert len(sys.argv) == 2, "need filename parameter"
fn = sys.argv[1]
assert fn.lower().endswith(".pdf"), "must be a PDF file"
fullname = os.path.abspath(fn)                 # get the full path & name
t0 = timer()                                   # save current time
doc = fitz.open(fullname)                      # open PDF to save metadata
meta = doc.metadata
doc.close()
t1 = timer()                                   # save current time again
subprocess.call(["fileoptimizer64", fullname]) # now invoke FileOptimizer
t2 = timer()                                   # save current time again
cdir = os.path.dirname(fullname)               # directory of the PDF
fd, tmpname = tempfile.mkstemp(suffix=".pdf", dir=cdir)  # create temp pdf name
os.close(fd)                                   # we only need the name
doc = fitz.open(fullname)                      # open now optimized PDF
doc.setMetadata(meta)                          # restore old metadata (newer PyMuPDF: set_metadata)
doc.save(tmpname, garbage=4)                   # save temp PDF with it, a little sub opt
doc.close()                                    # close it
os.remove(fullname)                            # remove super optimized file
os.rename(tmpname, fullname)                   # and rename temp PDF to original filename
t3 = timer()                                   # save current time again
# put out runtime statistics
print("Timings:")
print(str(round(t1-t0, 4)).rjust(10), "save old metadata")
print(str(round(t2-t1, 4)).rjust(10), "execute FileOptimizer")
print(str(round(t3-t2, 4)).rjust(10), "restore old metadata")
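To check that the restore step did its job, the metadata read before and after the run can be compared key by key. This is only a minimal sketch: it assumes dictionaries shaped like PyMuPDF's `doc.metadata` (lowercase keys such as `producer` and `creator`), and the helper name `changed_keys` as well as the sample values are made up for illustration.

```python
def changed_keys(before, after):
    """Return the sorted keys whose values differ between two metadata dicts."""
    keys = set(before) | set(after)
    return sorted(k for k in keys if before.get(k) != after.get(k))

# Illustrative values only, not taken from a real file.
original = {"producer": "Some PDF Writer", "creator": "Some App", "title": "Report"}
optimized = {"producer": "Overwritten", "creator": "Overwritten", "title": "Report"}

print(changed_keys(original, optimized))  # ['creator', 'producer']
print(changed_keys(original, original))   # []
```

An empty list after running the script would mean that every field ended up with its old value again.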
Runs under all Python versions 2.7 and up.
FileOptimizer is a Windows tool, but it is said to run on UNIX-like systems under WINE as well.
It must be installed for this script to run, and the script expects to reach it as `fileoptimizer64`.
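Whether the executable can be found is easy to test up front. A small sketch, assuming Python 3.3 or later (for `shutil.which`) and that the program is reachable under the name `fileoptimizer64` used in the script; the helper name `find_tool` is hypothetical.

```python
import shutil

def find_tool(name="fileoptimizer64"):
    """Return the full path of the executable, or None if it is not on PATH."""
    return shutil.which(name)

path = find_tool()
if path is None:
    print("fileoptimizer64 not found on PATH; install FileOptimizer first")
else:
    print("using", path)
```

Doing this check before opening the PDF avoids a less obvious error from `subprocess.call` later on.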
The following example output shows a file size reduction of roughly 50%:
$ dir sdw_2010_11.pdf
17.07.2016  05:59        19.174.030 sdw_2010_11.pdf
               1 Datei(en),     19.174.030 Bytes
               0 Verzeichnis(se), 1.587.349.512.192 Bytes frei
$ python pdf-opt.py sdw_2010_11.pdf
Timings:
    0.0031 save old metadata
   65.4435 execute FileOptimizer
    0.3112 restore old metadata
$ dir sdw_2010_11.pdf
29.10.2016  08:03         9.769.342 sdw_2010_11.pdf
               1 Datei(en),      9.769.342 Bytes
               0 Verzeichnis(se), 1.587.380.068.352 Bytes frei
$
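The reduction can be worked out from the two file sizes in the listing. A short sketch; the function name `reduction_percent` is made up here, and the two byte counts are the ones from the `dir` output.

```python
def reduction_percent(size_before, size_after):
    """Percentage by which the file shrank, relative to its original size."""
    return 100.0 * (size_before - size_after) / size_before

before = 19174030   # bytes before optimization
after = 9769342     # bytes after optimization
print(round(reduction_percent(before, after), 1))  # 49.0
```

In a script, the two sizes could be taken from `os.path.getsize()` before and after the FileOptimizer call.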
Now with proper newlines: