By Roger Dunham | 2025 Jun 13
6 min
Tags
python
linux
If you're looking to build powerful PDF and document processing features into your Python applications, then the Apryse SDK (formerly PDFTron) offers a robust and flexible solution.
In this blog, we’ll explore how to get up and running with the Apryse SDK on Linux (in particular Ubuntu) using Python, from installation to basic usage and then on to more advanced usage: leveraging the Barcode module to detect barcodes in a PDF.
The Apryse SDK is available for both Python 2 and Python 3. In this article, though, we are looking only at Python 3. You will need:
This article was written using Ubuntu 24.04. The concepts also apply to Windows and macOS, although, as we will see, Barcode detection is not currently available for macOS.
While you can install the SDK directly using pip, we recommend using python3 -m pip (or python -m pip on Windows), since that guarantees that the pip belonging to the intended interpreter is used, even if you have multiple versions of Python on your machine.
I’ll use that method in this article, but the choice is up to you.
python3 -m pip install apryse-sdk --index-url=https://pypi.apryse.com
Note that we are also using --index-url=https://pypi.apryse.com. This matters since we use a dedicated location for the SDK, not the default PyPI. The version of apryse-sdk on the default PyPI has not been updated since 9.5.
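Since the package on the default PyPI is an old one, it can be worth confirming which version pip actually installed. Here is a minimal sketch that uses only the Python standard library. The helper name installed_version is our own, and the distribution name apryse-sdk is taken from the pip command above.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("apryse-sdk")
    if version is None:
        print("apryse-sdk is not installed in this environment")
    else:
        print("apryse-sdk version:", version)
```

If the reported version is 9.5 or older, the package most likely came from the default PyPI rather than from pypi.apryse.com.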
On Ubuntu, the installation may fail with an “externally-managed-environment” error.
Figure 1 – Ubuntu, as an example, may insist that you use a virtual environment.
If you get this error, then create a new virtual environment. In this example, we will create one called “apryse-venv”.
python3 -m venv apryse-venv
Figure 2 – Creating a virtual environment.
Creating a virtual environment will result in a new folder being created.
Figure 3 – The newly created folder associated with the virtual environment.
Next, you need to activate the virtual environment.
source apryse-venv/bin/activate
You will now be able to install the apryse-sdk.
Figure 4 – Successfully installing the Apryse SDK within a virtual environment.
When you have finished with the virtual environment, you can use deactivate to exit it.
If you want to remove it, then you can just delete the folder (in this case “apryse-venv”) that was created.
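If you are ever unsure whether the virtual environment is active, you can ask Python itself. The sketch below relies on the standard library only. The function name in_virtual_environment is just an illustrative choice.

```python
import sys


def in_virtual_environment():
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the venv folder, while
    # sys.base_prefix still points at the system-wide installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    print("Inside a virtual environment:", in_virtual_environment())
```

With “apryse-venv” activated, the interpreter path should sit inside the apryse-venv folder.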
The Apryse SDK requires a license key. However, it is free to get an Apryse Server SDK trial license.
We will see in a moment how we use the key.
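Whichever way the key ends up being used, it helps to keep it out of your source code. One possible approach is to read it from an environment variable. In this sketch, the variable name APRYSE_LICENSE_KEY and the helper get_license_key are our own assumptions, not something the SDK defines.

```python
import os


def get_license_key(env_var="APRYSE_LICENSE_KEY"):
    """Read the license key from an environment variable (name is an assumption)."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            "No license key found. Set the " + env_var + " environment variable."
        )
    return key


if __name__ == "__main__":
    try:
        key = get_license_key()
        print("License key loaded (" + str(len(key)) + " characters)")
    except RuntimeError as error:
        print(error)
```

The returned string can then be passed to the SDK in place of a hard-coded key.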
You can use whatever code editor you prefer. For this article, I will use VS Code.
Create a new file and paste the following code (taken from the Getting started with Python 3 guide) into it.
# You can add the following line to integrate apryse-sdk
# into your solution from anywhere on your system so long as
# the library was installed successfully via pip
from apryse_sdk import *

def main():
    PDFNet.Initialize("YOUR_APRYSE_LICENSE_KEY")
    try:
        doc = PDFDoc()
        page1 = doc.PageCreate()
        doc.PagePushBack(page1)
        doc.Save(("linearized_output.pdf"), SDFDoc.e_linearized)
        doc.Close()
    except Exception as e:
        print("Unable to create PDF document, error: " + str(e))
    PDFNet.Terminate()

if __name__ == "__main__":
    main()
The essential steps are: import the SDK, initialize it with a license key, do the work, and terminate the SDK.
Great. Let’s run the code and be delighted by the results.
With about 10 lines of code, we have created a new blank PDF.
Figure 5 – Our code and the brand-new empty PDF that it created.
While it is interesting to create an empty PDF, and it shows that you have correctly configured everything, the Apryse SDK does much, much more.
Let’s use it to extract barcodes from a PDF.
As an example, let’s consider a receipt from a grocery store that contains a QR code and a “regular” barcode.
Figure 6 – An example receipt from a grocery store.
It’s a typical receipt with various lines – each showing what was bought, how many, and the cost, then there are three lines for Subtotal, Tax and Grand Total.
The Apryse SDK offers a lot of functionality straight out of the box – you can merge documents, rotate pages, and find text on the pages of a PDF for example.
Even more functionality is available via add-on modules. These add-on modules include “CAD”, “Conversion to Office”, “Advanced Imaging”, “Data Extraction” and, of interest for this article, “Barcode Detection”.
Not all modules are available for all platforms. The “Barcode Detection” module, for example, is available for Windows and Linux only, while the “Data Extraction” module is available for Windows, Linux and macOS.
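If your script may run on several operating systems, you could check the platform before attempting barcode detection. The sketch below encodes only the availability stated above (Windows and Linux). The names BARCODE_PLATFORMS and barcode_module_supported are hypothetical helpers, not part of the SDK.

```python
import platform

# Platforms that this article lists for the Barcode Detection module.
BARCODE_PLATFORMS = {"Windows", "Linux"}


def barcode_module_supported(system=None):
    """Return True if the given (or current) OS is listed as supported."""
    # platform.system() returns "Linux", "Windows", or "Darwin" (macOS).
    system = system or platform.system()
    return system in BARCODE_PLATFORMS


if __name__ == "__main__":
    print("Barcode module listed for this OS:", barcode_module_supported())
```

This is a convenience check only. The SDK's own BarcodeModule.IsModuleAvailable() call, used later in this article, remains the definitive test.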
Three of the modules are available via pip, making them a cinch to install:
python -m pip install apryse-data-extraction --extra-index-url=https://pypi.apryse.com
python -m pip install apryse-cad --extra-index-url=https://pypi.apryse.com
python -m pip install apryse-ocr --extra-index-url=https://pypi.apryse.com
The other modules, including the Barcode module, need to be installed manually. That’s not hard: you just need to download the module, extract the contents, and place them in the location where the SDK expects to find them.
For this article we will create a new folder called “apryse-python” and extract the module into it.
To keep things tidy, we will also copy our receipt file into the same folder.
Figure 7 – The barcode module extracted into the folder where we will put our code. I could also just have copied the Lib folder that it contains.
Even though we will be doing something far more complex than creating an empty PDF, the principles are still the same: “Import the SDK”, “Initialize the SDK with a license key”, “Do the work”, “Terminate the SDK”.
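Those four steps can also be wrapped in a context manager, so that termination happens even if the work raises an unexpected error. This is a sketch of one possible design, not the pattern used in the Apryse samples. The name sdk_session is our own, and the only thing it assumes about the object passed in is that it offers the Initialize and Terminate calls used elsewhere in this article.

```python
from contextlib import contextmanager


@contextmanager
def sdk_session(pdfnet, license_key):
    """Initialize the SDK, hand control to the caller, then always terminate."""
    pdfnet.Initialize(license_key)
    try:
        yield pdfnet
    finally:
        # Runs even if the work inside the `with` block raises.
        pdfnet.Terminate()


# Intended usage (requires the SDK to be installed):
#
#   from apryse_sdk import PDFNet
#   with sdk_session(PDFNet, "YOUR_APRYSE_LICENSE_KEY"):
#       ...  # do the work
```

Passing the PDFNet object in as a parameter keeps the helper easy to test without the SDK installed.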
Let’s create a new python file called “barcode.py” within the “apryse-python” folder and copy the following code into it:
# You can add the following line to integrate apryse-sdk
# into your solution from anywhere on your system so long as
# the library was installed successfully via pip
from apryse_sdk import *

def main():
    PDFNet.Initialize("[Your license key]")
    try:
        # Tell the SDK where the Barcode module's Lib folder is
        # (relative to the current working directory)
        PDFNet.AddResourceSearchPath('./BarcodeModuleLinux/Lib')
        if not BarcodeModule.IsModuleAvailable():
            print("Unable to run BarcodeTest: Apryse SDK Barcode Module not available.")
        else:
            # A) Open the .pdf document
            doc = PDFDoc("receipt.pdf")
            # B) Detect PDF barcodes with the default options
            BarcodeModule.ExtractBarcodes(doc, "receipt.json")
            doc.Close()
    except Exception as e:
        print("Unable to extract barcodes, error: " + str(e))
    PDFNet.Terminate()

if __name__ == "__main__":
    main()
Most of the code is self-explanatory, so I won’t discuss it. There are two lines of real interest.
The first is:
PDFNet.AddResourceSearchPath('./BarcodeModuleLinux/Lib')
This allows the Apryse SDK to know where the Lib folder for the Barcode module is located. This is relative to the current working directory.
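Because a relative path depends on where the script is launched from, one option is to build the path from the location of the script file instead. The following sketch uses pathlib from the standard library. The helper name module_lib_dir is hypothetical, and the folder name BarcodeModuleLinux is the one used in this article.

```python
from pathlib import Path


def module_lib_dir(script_file, module_folder="BarcodeModuleLinux"):
    """Return the absolute Lib folder that sits next to the given script."""
    return Path(script_file).resolve().parent / module_folder / "Lib"


# Intended usage inside barcode.py (requires the SDK to be installed):
#
#   PDFNet.AddResourceSearchPath(str(module_lib_dir(__file__)))
```

With this in place, the script finds the module regardless of the current working directory, as long as the extracted module stays next to barcode.py.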
The second line of interest is:
BarcodeModule.ExtractBarcodes(doc, "receipt.json")
This single line performs the magic: it takes the PDFDoc object that was previously created, scans it for barcodes, extracts the data and writes the results into a JSON file.
Now, if we run the code (having specified that “apryse-python” is the current working directory) then a JSON file will be created that has information about all of the barcodes detected in the document.
Figure 8 – The generated JSON file.
The generated file shows that two barcodes were found (which is correct), that one was a Code-128, and the other was a QR Code. We even get to see where the barcodes are located, the rotation angle of each one, and the text that they contain. The QR code, for example, has the text “Store: Fake Store Inc. Date: 2025-05-12 16:00:30.114610 Total: $14.98”.
Figure 9 – Details of the detected QR code, showing the location, text and rotation angle.
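To work with the results in Python, you can load the JSON file with the standard json module. The exact schema of the file is not reproduced in this article, so the sketch below deliberately avoids assuming one: it walks the whole structure and collects every object that contains a given key. The key name "text" is an assumption on our part, so check your own receipt.json and adjust it as needed.

```python
import json


def find_objects_with_key(node, key):
    """Yield every JSON object (dict) at any depth that contains `key`."""
    if isinstance(node, dict):
        if key in node:
            yield node
        for value in node.values():
            yield from find_objects_with_key(value, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_objects_with_key(item, key)


def load_barcode_entries(json_path, key="text"):
    """Load a JSON file and return all objects containing `key` (assumed name)."""
    with open(json_path, encoding="utf-8") as handle:
        return list(find_objects_with_key(json.load(handle), key))


# Intended usage, once receipt.json has been generated:
#
#   for entry in load_barcode_entries("receipt.json"):
#       print(entry)
```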
Awesome!
What you do with the data, once you have it, is up to you. You could easily extend the Python code to annotate the PDF with where the barcodes were located. In this example, you could even extract the data from the receipt as key/value pairs (using the Data Extraction module) and verify that it matches the QR code’s contents as a form of tamper detection.
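As a rough illustration of the tamper-detection idea, the sketch below parses the QR text shown earlier and compares its total with a value obtained elsewhere. The regular expression assumes that every receipt follows the same “Store: … Date: … Total: $…” layout as our example, and the extracted total is simply passed in as an argument; fetching it with the Data Extraction module is not shown. The function names are our own.

```python
import re
from decimal import Decimal

# Assumes the "Store: ... Date: ... Total: $..." layout of the example receipt.
QR_PATTERN = re.compile(
    r"Store:\s*(?P<store>.+?)\s+Date:\s*(?P<date>.+?)\s+Total:\s*\$(?P<total>\d+\.\d{2})"
)


def parse_receipt_qr(text):
    """Split the QR text into store, date and total fields."""
    match = QR_PATTERN.search(text)
    if match is None:
        raise ValueError("QR text does not match the expected layout")
    return {
        "store": match["store"],
        "date": match["date"],
        "total": Decimal(match["total"]),
    }


def totals_match(qr_text, extracted_total):
    """Return True if the QR total equals the total found on the receipt."""
    return parse_receipt_qr(qr_text)["total"] == Decimal(str(extracted_total))


if __name__ == "__main__":
    qr = "Store: Fake Store Inc. Date: 2025-05-12 16:00:30.114610 Total: $14.98"
    print(parse_receipt_qr(qr))
    print("Totals match:", totals_match(qr, "14.98"))
```

Decimal is used rather than float so that currency values compare exactly.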
The possibilities are endless.
We’ve only scratched the surface of what is possible in this article. Get yourself a trial key, check out the documentation for the SDK, and see how it can quickly let you turn your ideas into solutions, saving you time and letting you get to market faster.
We want you to succeed, so if you have any questions, then please reach out to us on Discord.