How to Build Optical Character Recognition (OCR) in Python

By Isaac Maw | 2025 Jan 16

Summary: If you know basic Python, you’re well on your way to getting OCR up and running and improving productivity in business processes that handle forms, scanned documents, and images of text. This quick guide shows how to install and use the OCR module for the Apryse SDK.

Python is a popular programming language, and it has been for many years. It’s versatile and beginner-friendly, and it’s commonly used for software testing, task automation, and data science applications, such as data analysis or visualization.

These use cases make Python an ideal language for working with optical character recognition (OCR), because OCR is a tool for collecting data: it translates human-readable images of text into machine-readable text. It’s useful for processing high volumes of documents, such as scanned pages of legal discovery, images of ID cards, or insurance claim forms. All of these examples also benefit from a Python programmer who knows how to automate document processing.

The OCR Module for Apryse SDK is a great way to add OCR capabilities to your Python application.

So, let’s get started!

How to Set Up OCR on Server/Desktop in Python

To add OCR functionality to the Apryse SDK, you need to install the OCR add-on module, which is available for download from our documentation.

Setup

Before you can use OCR, tell the SDK where the external add-ons are installed by registering the location of the Lib directory. Do this with the PDFNet.AddResourceSearchPath function. If you pass a relative path, it is resolved relative to the end-user executable.

PDFNet.AddResourceSearchPath("../../../PDFNetC/Lib/") 

Note: specify the parent "Lib" folder, not the actual Windows/Barcode or Linux/Barcode directories where the extension libraries live. This allows you to add a single resource search path for all installed modules.
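
Because a relative path is resolved against the executable rather than your script, one way to avoid surprises is to build an absolute path from the script’s own location. The following is a minimal sketch, assuming the Lib folder sits at the relative location used above. The helper name resolve_lib_dir is hypothetical and not part of the SDK.

```python
import os
from pathlib import Path


def resolve_lib_dir(script_file: str, relative: str = "../../../PDFNetC/Lib/") -> str:
    """Return an absolute path (with trailing separator) to the add-on Lib folder.

    `script_file` is typically __file__. `relative` is an assumption about
    where the Lib folder lives relative to that script; adjust it to match
    your own installation.
    """
    lib_dir = (Path(script_file).resolve().parent / relative).resolve()
    # os.path.join with an empty last component appends the trailing separator.
    return os.path.join(str(lib_dir), "")


# Hypothetical usage, once the SDK has been imported and initialized:
# PDFNet.AddResourceSearchPath(resolve_lib_dir(__file__))
```

The helper only computes a path; it does not check that the folder exists or that any modules are installed there.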

For error handling purposes, it is generally advisable to test whether the module is available via the IsModuleAvailable function.

if not OCRModule.IsModuleAvailable():
    pass  # OCR Module unavailable

If you have the module installed but the function still returns False, double-check that the correct path was passed to AddResourceSearchPath earlier.
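
In a real application, you will probably want to stop with a clear message rather than silently continue. Here is one possible sketch of that check. The function name require_ocr_module and the error text are hypothetical; the only SDK call used is IsModuleAvailable, and the OCRModule class is passed in as an argument so the check can be tried without the SDK installed.

```python
def require_ocr_module(ocr_module) -> None:
    """Raise a descriptive error if the OCR add-on cannot be found.

    `ocr_module` is expected to be the SDK's OCRModule class.
    """
    if not ocr_module.IsModuleAvailable():
        raise RuntimeError(
            "OCR module not found. Check that AddResourceSearchPath "
            "points at the parent Lib folder of the installed add-ons."
        )


# Hypothetical usage, after setting the resource search path:
# require_ocr_module(OCRModule)
```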

Using the OCR

Now you’re ready to use OCR. Here are some sample workflows in Python:

OCR on an Image

The OCR Module makes a searchable PDF by adding invisible text to an image.

doc = PDFDoc()  
 
# Run OCR on the image without options  
OCRModule.ImageToPDF(doc, image_path, None) 
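
The snippet above leaves the result in memory. A slightly fuller sketch that also saves the searchable PDF might look like the following. It assumes the SDK is already initialized and the resource search path is set. The function name image_to_searchable_pdf is hypothetical, and the sdk parameter stands for the imported SDK module (for example via import apryse_sdk as sdk, which is an assumption about how the SDK was installed).

```python
def image_to_searchable_pdf(sdk, image_path: str, output_path: str) -> None:
    """OCR one image and save the result as a searchable PDF.

    `sdk` is the imported SDK module, passed in so the function does not
    depend on a particular package name.
    """
    doc = sdk.PDFDoc()
    # Passing None runs OCR without options, as in the snippet above.
    sdk.OCRModule.ImageToPDF(doc, image_path, None)
    # Write the PDF, which now holds the image plus the invisible text layer.
    doc.Save(output_path, sdk.SDFDoc.e_linearized)


# Hypothetical usage:
# import apryse_sdk as sdk
# image_to_searchable_pdf(sdk, "scan.png", "scan_searchable.pdf")
```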

OCR on a PDF file

Here’s how to use the OCR module to add invisible text to an image-based PDF file, such as a scanned document:

doc = PDFDoc(filename)  
# Set English as the language of choice  
opts = OCROptions()
opts.AddLang("eng")
# Run OCR on the PDF with options  
OCRModule.ProcessPDF(doc, opts) 
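
For high-volume work, the same call is usually wrapped in a loop over a folder. The sketch below shows one way to structure that. The function name ocr_folder, the _ocr.pdf naming scheme, and the ocr_one callback are all hypothetical; ocr_one is where you would open the PDF, call ProcessPDF as shown above, and save the result.

```python
from pathlib import Path
from typing import Callable, List


def ocr_folder(
    input_dir: str,
    output_dir: str,
    ocr_one: Callable[[str, str], None],
) -> List[str]:
    """Run `ocr_one(input_path, output_path)` for every .pdf in `input_dir`.

    Returns the list of output paths that were passed to `ocr_one`.
    """
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    # sorted() gives a stable processing order; the glob pattern matches the
    # lowercase ".pdf" extension only on case-sensitive file systems.
    for pdf in sorted(Path(input_dir).glob("*.pdf")):
        target = out / f"{pdf.stem}_ocr.pdf"
        ocr_one(str(pdf), str(target))
        written.append(str(target))
    return written
```

Keeping the SDK calls inside the callback means the folder logic can be tested on its own, and the same loop can be reused for images by swapping in a different callback.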

Adjusting the Input Resolution

Sometimes, tweaking the input DPI can help produce better results. Here’s how:

opts.AddDPI(300) 
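
If you know the physical size of the scanned page, you can estimate a reasonable DPI value from the image’s pixel dimensions instead of guessing. This is plain arithmetic (pixels divided by inches), shown as a small sketch; the helper name estimate_dpi is hypothetical and not part of the SDK.

```python
def estimate_dpi(pixel_width: int, page_width_inches: float) -> int:
    """Estimate scan resolution as pixels per inch, rounded to an integer."""
    if pixel_width <= 0 or page_width_inches <= 0:
        raise ValueError("pixel_width and page_width_inches must be positive")
    return round(pixel_width / page_width_inches)


# A US Letter page (8.5 inches wide) scanned to an image 2550 pixels wide:
print(estimate_dpi(2550, 8.5))  # prints 300
```

The result could then be passed to opts.AddDPI in place of the hard-coded 300.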

Other Workflows

In addition to these, you can also use the OCR module to work with the raw OCR output and metadata, and more. Check out our OCR documentation to learn more.
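
As a rough illustration of working with raw output, the sketch below flattens an OCR JSON string into a list of words. Treat it as an assumption-heavy example: this article does not show the JSON format, so the Page, Para, Line, and Word nesting and the "text" and "num" fields used here are assumptions to verify against the OCR documentation. The function name extract_words is hypothetical.

```python
import json
from typing import Dict, List


def extract_words(ocr_json: str) -> List[Dict]:
    """Flatten OCR JSON into a list of {"page": ..., "text": ...} records.

    ASSUMPTION: the JSON nests as Page -> Para -> Line -> Word, each page may
    carry a "num" field, and each word has a "text" field. Missing keys are
    tolerated and simply yield fewer records.
    """
    data = json.loads(ocr_json)
    words = []
    for page in data.get("Page", []):
        for para in page.get("Para", []):
            for line in para.get("Line", []):
                for word in line.get("Word", []):
                    words.append(
                        {"page": page.get("num"), "text": word.get("text", "")}
                    )
    return words
```

A list like this is convenient for simple post-processing, such as searching for a keyword or joining the words back into plain text.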

What is Tesseract in Python?

Tesseract is an open-source OCR engine available under Apache 2.0. While it doesn’t have a built-in GUI, it is compatible with many programming languages and frameworks. Tesseract powers many OCR engines available on the market. Tesseract 4 includes a new neural network subsystem which helps recognize lines of text.

Which OCR Engine is best in Python?

Like any good “which is best” question, the boring answer is correct: “it depends.” There are many OCR engines available on the market. Apryse offers three OCR engines to meet a range of user needs:

LEADTOOLS OCR Engine: This is our default engine, powered by LEADTOOLS technology. As part of v11.0 of our SDK, Apryse added this OCR engine to deliver improvements to the accuracy and speed of our OCR offerings. This engine also improves pre-processing, such as deskew and despeckling.

Tesseract OCR Module: Just like other Tesseract-based OCR engines out there, the Apryse Tesseract OCR Module uses Tesseract 4.0 as a foundation, and adds functionality to improve performance, such as improved word and character recognition, as well as features such as multiple language support and JSON output.

Users of Apryse OCR don’t choose between the LEADTOOLS engine and the Tesseract engine. The module automatically uses both, depending on the language and alphabet.

IRIS OCR Module: In addition to the out-of-the-box OCR module, Apryse offers the IRIS OCR Module based on the IRIS iDRS engine. This package is licensed separately from the default OCR module and may provide better results in some cases, especially for pages with multiple disconnected text snippets, as might occur in documents such as magazine covers or CAD documents. The IRIS module is currently available on Windows and Linux platforms.

In Closing

The best OCR for your use case depends on performance for your input documents, resolution, and other factors. Try testing different OCR engines with your input needs and see what performs best.

While OCR is a well-established technology, it relies on computers recognizing the strange glyphs we humans have developed over thousands of years, and even sometimes scratch onto paper using graphite. Tweaking the input resolution, scanning parameters and other funky configuration settings may be required to get you the best results.

In any case, if you know Python and are interested in other Apryse Server SDK functionality, the OCR module is an easy place to start.

Try it now to start improving your productivity with searchable PDFs and more today!

Isaac Maw

Technical Content Creator
