By Adam Pez | 2019 Aug 30
Tags
guide
pdf.js
Performing due diligence on commercial software can be demanding in its own right. You’re working under pressure to get the project to market as soon as possible — but the consequences of a wrong decision could weigh on your organization for years.
If you’re considering a PDF.js-based project, this article provides a detailed guide sourced from real-world PDF.js implementations.
As a vendor of a commercial PDF SDK, we hear from customers who come to us after trying PDF.js and finding it cannot meet their needs.
Based on our research, the number one reason PDF.js deployments fail is the difficulty of adding more functionality. Developers may underestimate the complexity of PDF and the amount of time and effort a PDF.js-based customization requires.
(To learn more about other aspects like functionality and performance, read our comprehensive guide to PDF.js.)
To test this hypothesis, we surveyed 57 unique organizations that came to us after trying PDF.js and finding it could not meet their needs. Many of these organizations were OEMs and enterprises from industries such as construction and engineering, publishing, finance, legal, education, and life sciences.
These organizations ultimately deemed a PDF.js library unacceptable for one of the following reasons:
Notably, of the 42 organizations that wanted more functionality, 71.4% first tried to implement that functionality themselves with PDF.js and found it too difficult or time-intensive.
If you are thinking about a PDF.js demo or customization, you may wish to consider the following:
The challenge of a PDF.js-based customization is that PDF.js was intended as a Mozilla PDF reader and Firefox’s integrated PDF viewer, as Dropbox wrote after abandoning a PDF.js-based project:
Integrating PDF.js with Dropbox was quite difficult, if not downright hacky. PDF.js was designed to be Firefox’s integrated PDF viewer, rather than a component of another product.
– Senior Developer, Dropbox
Because it is an open-source project not intended for use in other products, PDF.js lacks the conveniences of a commercial PDF SDK that would streamline development.
Out-of-box functionality is limited to viewing capabilities. You will need to build additional functionality in-house or adopt other open-source projects, which vary in quality and completeness.
PDF.js does not have an API for adding functionality to the UI. Adding features such as annotations (attempted by 42% of the 57 surveyed organizations) will therefore prove challenging and time-intensive. You will need to familiarize yourself with the PDF.js code base, which is itself complex and assumes familiarity with the PDF specification.
The PDF specification is very complex:
PDF is an incredibly complex file format — the specification is more than a thousand pages long, not including the extensions and supplements.
– Senior Developer, Dropbox
PDFs are an incredibly complex file format; this is especially so given that a PDF can be generated in a hundred different ways, all of which a renderer needs to handle gracefully.
– Developer, LinkedIn
Achieving familiarity with the PDF specification means acquiring specialized knowledge and expertise, which takes time. When adding annotations to PDF.js, for example, you may need to work through a PDF.js demo to learn how to handle basic rendering instructions, including how to convert PDF annotation coordinates to <canvas> coordinates:
All of this translation is required every time the Annotation moves, whether the movement is caused by the user drawing the annotation, scrolling/resizing the document, etc.
– Senior Developer, Dropbox
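To make the coordinate translation concrete, here is a minimal sketch of the arithmetic involved. PDF user space places its origin at the bottom-left of the page with the y-axis pointing up, while a <canvas> places its origin at the top-left with the y-axis pointing down. The `PageGeometry` type and both function names are hypothetical and exist only for this illustration, and the sketch assumes an unrotated page. PDF.js viewports offer their own conversion helpers (for example `convertToViewportPoint`), which also account for rotation, so treat this as an explanation of the idea rather than a replacement for them.

```typescript
// Hypothetical types for illustration only; these are not part of PDF.js.
interface PageGeometry {
  // Visible page area in PDF points: [x0, y0, x1, y1].
  viewBox: [number, number, number, number];
  // Canvas pixels per PDF point (the current zoom factor).
  scale: number;
}

type PdfRect = [number, number, number, number]; // [llx, lly, urx, ury]

// Map one point from PDF user space (origin bottom-left, y up)
// to canvas space (origin top-left, y down). Assumes no page rotation.
function pdfPointToCanvas(x: number, y: number, page: PageGeometry): [number, number] {
  const [x0, , , y1] = page.viewBox;
  return [(x - x0) * page.scale, (y1 - y) * page.scale];
}

// Convert an annotation rectangle to a canvas rectangle.
// Because the y-axis flips, the PDF upper edge becomes the canvas top edge.
function pdfRectToCanvas(rect: PdfRect, page: PageGeometry) {
  const [ax, ay] = pdfPointToCanvas(rect[0], rect[1], page);
  const [bx, by] = pdfPointToCanvas(rect[2], rect[3], page);
  return {
    left: Math.min(ax, bx),
    top: Math.min(ay, by),
    width: Math.abs(bx - ax),
    height: Math.abs(by - ay),
  };
}

// Example: a US Letter page (612 x 792 points) rendered at 2x zoom.
const letterPage: PageGeometry = { viewBox: [0, 0, 612, 792], scale: 2 };
const box = pdfRectToCanvas([72, 720, 172, 756], letterPage);
console.log(box); // { left: 144, top: 72, width: 200, height: 72 }
```

Since the result depends on `scale`, every zoom or resize changes the canvas rectangle, which is why the translation has to be rerun whenever the view changes.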
Documentation is often absent, stale (with many broken links), or incomplete because PDF.js is maintained by volunteers working without consistent oversight or quality control.
Support may be unreliable. You will have to rely on voluntary forum responses, and depending on the complexity of your request, answers may be slow or inadequate. If your request falls outside the scope of the project, you are largely on your own.
Certain features may need to be rebuilt — like PDF.js text-select and text-search, which may not deliver the desired UX out-of-the-box.
We are developing a document viewer app that provides a secure container and syncs the documents for offline reading. We evaluated PDF.js, but the UX was not the best.
– Senior UX Consultant, Fortune 50 Software Company
All told, your team may spend months learning, building, calibrating, and optimizing new and existing features.
Additionally, you will have to invest time into feature support and maintenance. Since PDF.js is an open-source project, it cannot guarantee code stability and backward compatibility.
It is important to bear in mind that PDF.js, unlike a commercial SDK, is licensed under the Apache License 2.0, which provides no warranty and accepts no liability for defects or regressions that a community contributor may introduce.
With over 6,000 forks of PDF.js, commits happen on average several times a week, and these changes are not necessarily performed with your project in mind:
You may find that community fixes lead to undesired rendering behavior or removal of certain features:
In some cases, PDF.js updates would break any custom-built functionality on top of PDF.js. Some of our customers had to dedicate additional staff to monitoring and testing changes. This made it harder for them to implement changes later on and reduced their capacity to build new features.
– Andrey Safonov, Apryse Solutions Engineer
Additionally, the PDF.js GitHub repository currently has 600+ open issues and has seen a noticeable decline in the issue resolution rate.
Many open issues stem from difficult-to-fix aspects of PDF.js, such as the core rendering engine and the text parsing engine, which is responsible for defining the text overlay used for text select, text extraction, and text search. (PDF.js text-select alone has 90+ open issues, more than any other single issue category.)
You may be required to own these issues as well to satisfy your users’ performance, rendering accuracy, and feature requirements.
After investing months or years into a highly customized JavaScript PDF viewer, you may ultimately find that PDF.js is unable to meet your document performance, reliability, and rendering accuracy requirements.
After difficulty building functionality, almost half (45.6%) of the 57 surveyed organizations cited performance, reliability, or rendering accuracy as their primary reason for switching from PDF.js.
Here are just a few of these customers’ testimonials:
We also tried PDF.js to render pdf using a blob object. It is working on iPad and iPhones with a few limitations like it is not able to open PDFs bigger than 100MB, and it doesn’t support pinch zoom.
– Developer, Fortune 50 Company
...we have a custom PDF viewer, which decrypts the PDFs on the client side and renders them as SVGs using Mozilla’s PDF.js library. But the library is slow, inefficient, and requires the client to handle the rendering.
– Developer, eLearning Software
Customers are complaining about performance (mainly time to first page render). We want to have the same experience across all platforms for our two main use cases...
– Solution Architect, Life Sciences Software
While the document viewer works well and provides zoom, pan, annotation, outline and thumbnail navigation, it is slow since it requires the entire document to be downloaded before it can be viewed. We are looking for something better.
– Technical Director, Document Management Software
We are using PDF.js now as an embedded viewer for PDF documents in a single page application, and we are having some issues with crashing browsers and suspect issues with the viewer.
– CTO, Training & Compliance Software
At present, we’re working with open-source PDF.js which is great for the 95% of PDFs, but the other 5% is critical. Larger PDFs are tricky.
– Co-founder, eDiscovery Software
We are currently using [PDF.js] to view construction plans related to a project being bid on. We have a small percentage of plans that don’t render correctly. In these cases we have a work around for the user to download the plan to Acrobat.
– VP, Software Consulting Firm
We have about 1000 paid users now. PDF.js has some problems: 1) Some weird formatting, such as with really old PDFs in a school database. 2) When the PDF is huge or full of images, for example, textbooks, it loads really slow. Also, it consumes a lot of RAM.
– Developer, eLearning Software
Our drawback with PDF.js is the loss of quality on some large plans on 100% zoom level and beyond. This loss of quality can sometimes block the user’s ability to make correct measurements in the file.
– Developer, 3D Mapping Software
PDF.js affords a few advantages as a simple, short-term solution:
The project layers (core PDF parsing and rendering, the display API, and the example PDF viewer) are nicely separated.
Installation is a breeze if one wants to use the example viewer layer or implement a custom JavaScript PDF viewer with limited functionality.
Most of PDF.js’s dependencies rely on universal web standards.
And basic UI elements, such as buttons, can be restyled quickly via the project CSS and HTML files.
PDF.js may, therefore, prove cost-effective in the following situations:
Three years ago, Slack embedded a PDF.js viewer using only the resources of a single recent hire.
The organization was able to trim the viewer UI to an easy-to-maintain minimum of features and achieved basic viewing primarily for small PDFs (e.g., invoices, contracts, and sales reports).
They then blogged their success:
PDFs are complex documents — structured into different layers of information, data, and objects, and containing different languages, images, and graphics… PDF.js provided basic capabilities, including security and reliability, and helped us abstract away the complexities of the project. For our first pass at inline PDF viewing, we intentionally kept our scope narrow: display and text selection support for small PDF files.
– Senior Developer, Slack
However, despite hinting at further iterations, Slack has not, in the three years since, added features such as annotations, form filling, or signatures to its PDF.js viewer. These features would let users do more with their PDFs in Slack.
Instead, there are now third-party tools in the Slack App Directory that offer a few PDF capabilities, such as form fill. And these tools require a separate purchase or subscription.
Some of our customers testify that adding more features to PDF.js such as annotations, form filling, and signatures may prove very challenging and time-intensive.
Currently I'm evaluating possible solutions to replace PDF.js in a DMS application. We would like to move away from PDF.js because it’s limited in its functionality and we need some advanced stuff like annotations which can’t be easily done with it.
– Senior Developer, DMS Software
PDF.js is great for getting a proof of concept out there. It does 95% of the things we want, but that 5% is crucial to us.
– Developer, Legal Software
Therefore, a PDF.js-based project may not prove cost-effective if any of the following are true:
...you shouldn’t build anything that’s available off the shelf because it’s not a source of competitive advantage if everybody else can avail themselves of it. The only scenario where you should build is if it’s your core technology — the core source of your competitive differentiation and competitive advantage.
– Mark Holst-Knudsen, President ThomasNet @ MIT’s 2014 CIO Symposium
In a majority of cases, the question of build vs. buy comes down to whether the total cost of an in-house build (time spent learning, building, maintaining, and supporting custom features) has a lesser impact on your bottom line than the price of a commercial SDK license.
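The comparison can be sketched as simple arithmetic. The example below is only an illustration: the type, the function names, and every figure in it are hypothetical placeholders rather than survey data or real pricing, and it assumes the license is billed annually. Substitute your own estimates.

```typescript
// Hypothetical cost model for illustration; all names and numbers are assumptions.
interface BuildEstimate {
  learningHours: number;          // time to learn PDF.js and the PDF specification
  buildHours: number;             // time to build the custom features
  annualMaintenanceHours: number; // yearly upkeep and support of those features
  hourlyRate: number;             // fully loaded developer cost per hour
}

// Total cost of an in-house build over a planning horizon, in currency units.
function totalBuildCost(estimate: BuildEstimate, years: number): number {
  const upfront = (estimate.learningHours + estimate.buildHours) * estimate.hourlyRate;
  const ongoing = estimate.annualMaintenanceHours * estimate.hourlyRate * years;
  return upfront + ongoing;
}

// Total cost of a commercial license over the same horizon (assumes annual billing).
function totalBuyCost(annualLicenseFee: number, years: number): number {
  return annualLicenseFee * years;
}

// Placeholder figures only: replace with your own estimates.
const estimate: BuildEstimate = {
  learningHours: 200,
  buildHours: 800,
  annualMaintenanceHours: 300,
  hourlyRate: 100,
};
const horizonYears = 3;

const build = totalBuildCost(estimate, horizonYears); // 190000
const buy = totalBuyCost(40000, horizonYears);        // 120000
console.log(build < buy ? "build is cheaper" : "buy is cheaper");
```

A model like this leaves out factors that are harder to quantify, such as time to market and the opportunity cost of developers not working on core features, so it is a starting point for the discussion rather than a verdict.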
With these considerations in mind, PDF.js may prove good enough where you need a fast, short-term solution for web viewing small and simple PDFs. In contrast, PDF.js may not be as dependable, flexible, or scalable as required if your web viewer or PDF.js editor will be heavily relied upon in a commercial product or organizational setting; where your feature requirements are more advanced; and where performance, reliability, and rendering accuracy are important.
If you require faster performance, reliability, and near-flawless rendering, as well as easy access to hundreds of unique features cross-platform and an accelerated time to market, then you may wish to consider a commercial solution such as Apryse SDK.
We’d love to hear any feedback you may have about this article or our PDF SDK. Don’t hesitate to contact us directly.