Disliking PDF
I am no fan of PDF as a file format, particularly when compared to PostScript, its older rival. There are many reasons to dislike PDF, such as:
- Because its xref table records byte offsets, it is effectively a binary format: very hard to write from shell scripts and other text-based programming languages, and near impossible to edit in a text editor. PostScript has no such table, and can be written and edited as plain text.
- Its standard evolves too fast. Version 1.0 was introduced in 1993, and by 2006 we had 1.7, the eighth version. Since then the pace has slowed slightly, but we have had Adobe Extension Levels (EL) 1, 3, 5, 6 and 8 to version 1.7, and PDF 2.0. Version 1.5 (2003) was a particularly major revision, as a new format for the xref table was introduced. In contrast, PostScript appeared in 1984, and reached level 3 by 1997, when development ceased.
- PDF uses a single global namespace, which means there is no equivalent of an EPS file which can be trivially included in another document. To include one PDF file in another, one has to rename/renumber all objects to avoid clashes, whereas nothing in an EPS file needs to be parsed.
- Many readers are buggy - this issue it shares with PostScript.
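To make the first point concrete, here is a minimal Python sketch of what a program must do to emit even a tiny PDF in the classic (pre-1.5) layout: it has to track the byte offset of every object and write those offsets into the xref table. The function name `build_pdf` and the five example objects are illustrative assumptions, not taken from any particular tool, and the use of Helvetica as a built-in font is likewise just for the example.

```python
def build_pdf(objects):
    """Assemble a classic-style PDF from a list of object bodies (bytes).

    Object numbers are assigned 1, 2, 3, ... in list order, and object 1
    is assumed to be the document catalog. (Illustrative helper only.)
    """
    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for num, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset at which "N 0 obj" starts
        out += b"%d 0 obj\n" % num + body + b"\nendobj\n"

    xref_pos = len(out)  # the trailer must record where the table starts
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"  # entry for the free object 0
    for off in offsets:
        # Each entry is exactly 20 bytes: 10-digit offset, 5-digit generation.
        out += b"%010d 00000 n \n" % off
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)


# A one-page document; even the content stream needs its length in bytes.
content = b"BT /F1 24 Tf 72 720 Td (Hello) Tj ET"
objects = [
    b"<< /Type /Catalog /Pages 2 0 R >>",
    b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
    b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792]"
    b" /Contents 4 0 R /Resources << /Font << /F1 5 0 R >> >> >>",
    b"<< /Length %d >>\nstream\n" % len(content) + content + b"\nendstream",
    b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>",
]
pdf = build_pdf(objects)
```

Adding a single character to any object by hand changes the offset of everything after it, so the xref entries, the `startxref` value and possibly a `/Length` all need recomputing. That bookkeeping is what makes hand-editing impractical.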
It is the final issue, buggy readers, to which I wish to draw attention. Here is a PDF file. It is short, simple, and invalid. It contains the three words "Do", "not" and "vote" in three different fonts. The font used for "not" is invalid, so that word might not be displayed. But many PDF viewers give no warning, and display the text "Do vote", negating the (probable) meaning. A quick test gives:
| Viewer | Warnings | Text displayed |
|---|---|---|
| chromium 75 | None | "Do not vote" |
| evince 3.28.2 | None | "Do vote" |
| gs 9.25 | Many | "Do not vote" |
| gv (gs 9.25) | One | "Do not vote" |
| firefox 68 | None | "Do vote" |
| okular 1.3.3 | None | "Do vote" |
| preview MacOS 10.13 | None | "vote" |
| safari MacOS 10.13 | None | "vote" |
| xpdf 3.04 | Four | "Do vote" |
So there are three completely different renderings, and each of them is produced by at least one viewer with no warning at all. Similar fun can be had if one attempts to write text with no current font defined at all (rather than, as this example does, naming as a built-in font something which one has no reason to believe will be built in).
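The kind of check a viewer could make before silently dropping text can be sketched in a few lines of Python. This is a deliberately naive scan, assuming an uncompressed PDF whose font dictionaries are visible in the raw bytes. It flags any `/BaseFont` name that is neither one of the standard 14 fonts nor accompanied by a font descriptor. The function name `suspicious_fonts` and the font name `NoSuchFont` are hypothetical; the latter is not the name used in the test file above.

```python
import re

# The 14 fonts a PDF viewer is traditionally expected to supply itself.
STANDARD_14 = {
    "Times-Roman", "Times-Bold", "Times-Italic", "Times-BoldItalic",
    "Helvetica", "Helvetica-Bold", "Helvetica-Oblique", "Helvetica-BoldOblique",
    "Courier", "Courier-Bold", "Courier-Oblique", "Courier-BoldOblique",
    "Symbol", "ZapfDingbats",
}

BASEFONT = re.compile(rb"/BaseFont\s*/([^\s/<>\[\]()]+)")


def suspicious_fonts(pdf_bytes):
    """Return BaseFont names that look unlikely to be available.

    Naive by design: it splits on "endobj" rather than parsing the file,
    so it misses fonts inside object streams and can be fooled by
    binary stream data.
    """
    suspects = []
    for chunk in pdf_bytes.split(b"endobj"):
        match = BASEFONT.search(chunk)
        if match is None:
            continue
        # A descriptor (or descendant font) suggests the font is described
        # or embedded elsewhere in the file, so give it the benefit of the doubt.
        if b"/FontDescriptor" in chunk or b"/DescendantFonts" in chunk:
            continue
        name = match.group(1).decode("latin-1")
        if name not in STANDARD_14:
            suspects.append(name)
    return suspects


# Hypothetical fragment: one genuine built-in font, one invented name.
sample = (
    b"1 0 obj\n<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>\nendobj\n"
    b"2 0 obj\n<< /Type /Font /Subtype /Type1 /BaseFont /NoSuchFont >>\nendobj\n"
)
print(suspicious_fonts(sample))
```

A real viewer already has the parsed font dictionaries to hand, so the equivalent test there would be cheaper and more reliable than this byte-level scan.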
PDF's advantages
So what are some of the advantages of PDF over PostScript?
- A reasonable alpha-based transparency model, rather than the one-bit version of PostScript. PostScript 4 could address this...
- Internal and external hyperlinks. Combined with DSC, PostScript 4 could address this.
- Simpler to interpret. Less of an issue as computing power becomes cheaper.
But to any computer scientist, PDF will always be dull. A Mandelbrot Set in PDF is just an image (unless one embeds a JavaScript object, which is surely cheating). A Mandelbrot Set in PostScript is probably a program which generates the set when run.
Both PostScript and PDF are trademarks of Adobe.