By Adam Pez | 2019 Sep 13
We recently surveyed 57 unique organizations that tried PDF.js and later decided to look for an alternative. Of these, 15.8% cited failure to open files or browser crashes as a reason for switching.
To help you avoid making the same mistakes, we wanted to find out: what types of PDFs does PDF.js crash on?
Our research involved opening 1,663 PDF files in PDF.js. These documents included random PDFs from Google, as well as business documents, financial statements, construction drawings, college textbooks, and more.
What we found is that PDF.js will open 98.6% of PDFs found in the wild. However, some types of PDFs crashed or froze the browser more than others. While simple PDFs, like invoices, performed well enough, graphics-heavy documents tended to have higher failure rates.
(For more information on other topics from performance to supported functionality, read our comprehensive guide to PDF.js.)
PDFs are an incredibly complex file format; this is especially so given that a PDF can be generated a hundred different ways, all of which a renderer needs to handle gracefully.
– Developer, LinkedIn
PDFs found in the wild come in all different sizes and compositions, from small, simple invoices to massive reports and intricate designs shared in workflows across government and enterprise settings.
PDF is an incredibly complex file format—the specification is more than a thousand pages long, not including the extensions and supplements.
– Senior Developer, Dropbox
If at the 11th hour your takeoff system is not reading that last file, even if you rendered the last 10 out of 11 files perfectly, you can’t get your estimate. That person can’t do their job. Effectively, you could lose business by not being able to open one of these PDFs… It’s like getting it 99% right and 1% wrong — and you actually fail.
– Tony Cornwall, Construction Computer Software
As soon as your tool is seen as not 100% reliable, even if it’s still 99% reliable, the customer is going to switch off and default to the next-lowest common denominator — the Adobe Acrobats or pen and paper.
– CEO, AEC Project Collaboration Software
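The quotes above turn on simple arithmetic: a small per-file failure rate compounds across a batch. Here is a minimal sketch of that calculation. It assumes each file fails independently and borrows the overall 1.4% failure rate (100% minus the 98.6% open rate reported earlier) purely for illustration; the function name is ours, and your own files may fail at a very different rate.

```typescript
// Probability that every file in a batch opens.
// ASSUMPTION: failures are independent and share one per-file rate.
function batchSuccessProbability(
  perFileFailureRate: number,
  fileCount: number
): number {
  if (perFileFailureRate < 0 || perFileFailureRate > 1) {
    throw new RangeError("perFileFailureRate must be between 0 and 1");
  }
  if (!Number.isInteger(fileCount) || fileCount < 0) {
    throw new RangeError("fileCount must be a non-negative integer");
  }
  return Math.pow(1 - perFileFailureRate, fileCount);
}

// Illustrative inputs only: 1.4% per-file failure rate, 11-file job.
const allOpen = batchSuccessProbability(0.014, 11);
console.log(`Chance all 11 files open: ${(allOpen * 100).toFixed(1)}%`);
```

Under these assumptions, an 11-file job opens cleanly only about 86% of the time, which is why a viewer that is "99% reliable" per file can still feel unreliable in a workflow that needs every file.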
Our customers reported reliability issues with PDF.js that caused them to seek an alternative.
We also tried PDF.js to render pdf using a blob object. It is working on iPad and iPhones with a few limitations like it is not able to open PDFs bigger than 100MB, and it doesn’t support pinch zoom.
– Developer, Fortune 50 Company
We are using PDF.js now as an embedded viewer for PDF documents in a single page application, and we are having some issues with crashing browsers and suspect issues with the viewer.
– CTO, Training & Compliance Software
At present, we’re working with open-source PDF.js which is great for the 95% of PDFs, but the other 5% is critical. Larger PDFs are tricky.
– Co-founder, eDiscovery Software
To understand these issues better, we opened 1,663 PDFs using Chrome 76 on a new laptop and the latest version of the PDF.js demo viewer (v2.3.146).
These PDFs included:
Note: Even though PDF.js may open documents, it may not render content quickly or accurately. For this benchmark, we only looked at whether documents would crash or hang the browser.
To learn more, check out our detailed guide to PDF.js rendering accuracy.
Documents such as text-based financial filings, government forms, e-magazines, textbooks, and scientific reports opened in PDF.js without any apparent difficulty.
Other documents did not perform as well, particularly graphics-heavy documents.
For example, architecture, construction, and engineering drawings showed a 1% failure rate, while PDFs from GrabCAD.com performed the worst, with as many as 1 in 10 (10%) failing to open or crashing the browser. These PDFs were generated from models using a variety of CAD applications.
Random PDFs found on Google also had a failure rate of 1%. These findings are consistent with an older yet similar PDF.js benchmark, published on Mozilla Hacks.
This study looked at about 7,000 PDFs taken from Google and found 0.8% (roughly 1/100) would crash the browser with PDF.js. It also noted 2.8% of documents produced a “less-than-optimal” UX and that PDF.js had difficulty with graphics-heavy documents.
Documents that crashed PDF.js would do so in a couple of common ways.
First were corrupted documents. PDF.js would throw an exception and close them right away.
Many other documents, however, crashed due to memory issues. PDF.js simply could not allocate memory efficiently enough, especially for graphics-heavy PDFs.
As a result, the browser would throw an exception after trying to load the file.
Other times, PDF.js would open the file, only to hang indefinitely while rendering the page.
As illustrated on the PDF.js GitHub and elsewhere, PDF.js may not allocate memory efficiently, especially on certain browsers. Examples include rendering a large embedded JPEG or an especially large and complicated page.
There are a couple of reasons why this may happen.
First, large canvases are essentially huge bitmaps and thus consume lots of memory. This is especially true when one interacts with (zooms into, pans, and scrolls across) a document, and PDF.js is forced to re-render complicated canvases at a larger size and higher resolution.
Because it lacks canvas tiling, which would break rendering into smaller, manageable pieces, PDF.js renders page content all at once onto a single large canvas. In some cases, that canvas may exceed the size the browser permits or consume too much memory.
PDF.js therefore struggles to handle larger design documents, maps, and blueprints, especially on mobile browsers where memory constraints are tightest.
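To make the memory cost concrete, here is a rough back-of-the-envelope sketch. It assumes an RGBA canvas at 4 bytes per pixel and one canvas pixel per PDF point (1/72 inch) at scale 1, and it ignores any extra copies the browser may keep. The function name and the 36 x 24 inch example sheet are ours, chosen for illustration; they are not part of PDF.js.

```typescript
// ASSUMPTIONS: RGBA backing store (4 bytes per pixel) and one canvas
// pixel per PDF point at scale 1. Real browsers may use more memory.
const BYTES_PER_PIXEL = 4;
const PDF_POINTS_PER_INCH = 72;

interface CanvasEstimate {
  widthPx: number;
  heightPx: number;
  bytes: number;
}

// Estimate the single canvas needed to render a whole page at once.
function estimateCanvas(
  widthInches: number,
  heightInches: number,
  scale: number,            // zoom factor applied to the page
  devicePixelRatio: number  // e.g. 2 on many high-density displays
): CanvasEstimate {
  const pxPerInch = PDF_POINTS_PER_INCH * scale * devicePixelRatio;
  const widthPx = Math.ceil(widthInches * pxPerInch);
  const heightPx = Math.ceil(heightInches * pxPerInch);
  return { widthPx, heightPx, bytes: widthPx * heightPx * BYTES_PER_PIXEL };
}

// Hypothetical 36 x 24 inch drawing sheet on a 2x display.
for (const scale of [1, 2, 4]) {
  const e = estimateCanvas(36, 24, scale, 2);
  const mib = e.bytes / (1024 * 1024);
  console.log(`scale ${scale}: ${e.widthPx} x ${e.heightPx} px, ~${mib.toFixed(0)} MiB`);
}
```

Because both dimensions grow with the zoom factor, memory grows with its square: doubling the zoom roughly quadruples the canvas. Under these assumptions, the example sheet at scale 2 on a 2x display already needs about 273 MiB for a single canvas, before the browser or the document itself uses any memory.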
Another issue pertains to large PDFs with many layers, such as geospatial PDFs, which showed a 3% failure rate.
Geospatial PDFs, for example, may include a street or topographic vector layer on top of a satellite imagery raster background. The latter is switched off by default to ensure readability and performance.
But since PDF.js does not support optional content group (OCG) layers, it will render every layer, even layers switched off by default.
And with an especially big and complex map, it can quickly hit a wall.
You’ll want to see whether your files open in the PDF.js demo viewer on the browsers and devices you expect your users will prefer. You’ll also want to interact with these documents to test whether PDF.js viewer options deliver the desired UX.
Try scrolling and panning across a document, and zooming into and out of areas where you expect users will want to read small text or perform measurements.
If, after 20 to 30 seconds of heavy interaction, performance is still relatively smooth and the browser hasn’t crashed, then PDF.js may work for your PDFs.
However, if your browser hangs or crashes, or if the UX degrades considerably, you may wish to consider alternatives.
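If you have many files to check, part of this test can be scripted. The sketch below is one possible approach, not a PDF.js feature: it wraps any "open this document" promise in a timeout and labels the result as opened, failed, or hung. The names `classifyOpen` and `Outcome` are hypothetical, and the `open` callback is a stand-in for whatever loading code you use.

```typescript
type Outcome = "opened" | "failed" | "hung";

// Classify one attempt to open a document.
// `open` is a caller-supplied (hypothetical) callback. In a browser test
// it might be, for example:
//   () => pdfjsLib.getDocument(url).promise   // assumes pdfjsLib is loaded
function classifyOpen(
  open: () => Promise<unknown>,
  timeoutMs: number
): Promise<Outcome> {
  return new Promise<Outcome>((resolve) => {
    // If nothing settles within the time limit, call it a hang.
    const timer = setTimeout(() => resolve("hung"), timeoutMs);
    Promise.resolve()
      .then(() => open()) // also catches synchronous throws from open()
      .then(
        () => {
          clearTimeout(timer);
          resolve("opened");
        },
        () => {
          clearTimeout(timer);
          resolve("failed");
        }
      );
  });
}

// Example with stand-in callbacks instead of real PDFs.
async function demo(): Promise<void> {
  console.log(await classifyOpen(() => Promise.resolve("doc"), 100));
  console.log(await classifyOpen(() => Promise.reject(new Error("corrupt")), 100));
  console.log(await classifyOpen(() => new Promise(() => {}), 100));
}
demo();
```

A timeout like this only catches loads that never finish. It cannot observe a tab that crashes outright or a page whose main thread is frozen, since the timer itself will not fire, so it complements rather than replaces the hands-on scrolling and zooming test described above.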
If your tests turn up no problems, PDF.js may be a good fit. If you encountered issues of concern, however, you could consider a more robust commercial solution, like Apryse WebViewer.
We always appreciate feedback on our blog. If you have any questions, don’t hesitate to contact us directly.