Tutorial: Auto Recognize and Process a Form

By Apryse | 2025 May 19

Processing forms and invoices is a large part of many companies' day-to-day workflow. When a person fills out a copy of a form and it is scanned back in, the information on it then needs to be extracted. Many OCR engines struggle with this step because the form may have been scanned at a lower resolution than the original, the scanner may have introduced noise, or the fields may be unstructured and dynamically generated. The Apryse SDK is built to handle these cases and to remove the need for additional manual processing. Powered by Apryse’s patented machine learning algorithms, its forms recognition and OCR libraries handle both structured and unstructured forms and can help save companies valuable time and money.

The code below shows the core of what is needed to get a .NET forms recognition and OCR application running. If you want a complete step-by-step tutorial, check out the Apryse documentation.

C# code:

    // Requires: using System; using System.IO; plus the Apryse SDK namespaces
    // (not shown in the original listing).

    // Shared extractor, built once and reused for every form.
    static TemplateExtractor extractor;

    static void InitFormsObjects()
    {
        // Load the form templates from a folder (replace with your own path).
        Console.WriteLine("Loading Templates...");
        using TemplatesCollection templates = new TemplatesCollection.Builder()
            .FromSourceFolder(@"C:\Users\TyBrucker\OneDrive - Apryse\Desktop\template_test\Templates")
            .Build();

        // Build the extractor from the templates, runtime folder, and license.
        Console.WriteLine("Creating Extractor...");
        extractor = new TemplateExtractor.Builder()
            .SetTemplatesSource(templates)
            .SetRuntimeFolder(PDFTronResources.TemplateRuntimes)
            .SetLicense([PDFTronResources.LicenseFile, PDFTronResources.LicenseKey])
            .Build();
    }

    static void RecognizeAndProcessForm()
    {
        Console.WriteLine("Begin recognizing forms...\n");
        // Filled-in form to recognize (replace with your own path).
        string formToRecognize = @"C:\Users\TyBrucker\OneDrive - Apryse\Desktop\template_test\Test Forms\72193839.pdf";

        // Open the file read-only and wrap it in the SDK's input stream.
        using var fileStream = new FileStream(formToRecognize, FileMode.Open, FileAccess.Read, FileShare.Read);
        var inputStream = new InputStream(fileStream);

        // Match the form against the loaded templates and extract its fields.
        Result<ExtractResult> recognitionResult = extractor.Extract(inputStream);
        Console.WriteLine("Recognition Results:");
        Console.WriteLine("=========================================================================");
        ShowProcessedResults(recognitionResult);
    }

    static void ShowProcessedResults(Result<ExtractResult> result)
    {
        string resultsMessage = "";

        // One row per field: ID, recognized text, and confidence (two decimals),
        // each left-aligned in a fixed-width column.
        foreach (ExtractedPage page in result.Value.Pages)
        {
            foreach (ExtractedText field in page.Fields)
            {
                resultsMessage += $"{field.Id,-20}{field.Text,-20}{field.Confidence,-10:F2}\n";
            }
        }

        Console.WriteLine("Field Processing Results:");
        Console.WriteLine(resultsMessage);
    }

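
The results table in ShowProcessedResults relies on C# alignment specifiers: `{field.Id,-20}` left-aligns the value in a 20-character column, and `:F2` prints the confidence with two decimal places. As a minimal sketch of how that output could be extended, the snippet below builds the same table and flags fields that fall under a confidence threshold so they can be checked by hand. `FieldResult`, `ResultReport`, and the `reviewBelow` parameter are hypothetical names introduced for illustration only; they are not part of the Apryse SDK, and the sketch assumes confidence is a number where higher means more certain.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;

// Hypothetical stand-in for the SDK's extracted field type. Only the three
// properties the tutorial prints (Id, Text, Confidence) are modeled here.
public record FieldResult(string Id, string Text, double Confidence);

public static class ResultReport
{
    // Builds one fixed-width row per field, using the same column widths as
    // the tutorial, and appends "REVIEW" to rows whose confidence is below
    // the given threshold (an assumed, caller-chosen value).
    public static string Format(IEnumerable<FieldResult> fields, double reviewBelow)
    {
        var sb = new StringBuilder();
        foreach (FieldResult f in fields)
        {
            string flag = f.Confidence < reviewBelow ? "REVIEW" : "";
            // Invariant culture keeps the decimal separator a period
            // regardless of the machine's regional settings.
            sb.AppendFormat(CultureInfo.InvariantCulture,
                "{0,-20}{1,-20}{2,-10:F2}{3}", f.Id, f.Text, f.Confidence, flag);
            sb.Append('\n');
        }
        return sb.ToString();
    }
}
```

Calling `ResultReport.Format(fields, 0.80)` leaves a field recognized at 0.97 unflagged and appends `REVIEW` to one recognized at 0.42. The threshold is a judgment call and would need tuning against your own forms.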

See For Yourself - Free Evaluation

Download the Apryse SDK for free. It’s fully functional and comes with free chat and email support.

Stay Tuned for More Conversion Samples

Stay tuned for more conversion examples to see how Apryse easily fits into any workflow converting PDF files into other document files or images and back again. Need help in the meantime? Contact our support team for free technical support! For pricing or licensing questions, you can contact our sales team.

If you haven't already read our prior post on how to Create a Multipage File from Multiple Images, check that out and stay tuned for more. We'll be featuring a lot more tutorials that programmers can use to develop applications that will directly impact data capture, recognition, exchange, and other pressing business needs.
