
Automatically Recognize Invoices from Different Vendors using C#

By Apryse | 2025 Feb 19

Businesses running a paperless office receive hundreds of forms and invoices from different vendors. Manually finding, extracting, and storing the necessary information is often a major pain point and bottleneck. With a Template Extraction SDK, this work can be automated to improve workflow productivity and efficiency.

With Apryse, users only need to create one template for each invoice or form type. These templates are stored in a repository and used to automatically recognize which type of filled form is being processed.

In the demonstration below, we take a directory of files and compare them to a list of templates.

For more details on Template Extraction, check out "How to Extract ACORD Form Data using Template Extraction in .NET".

private static void RecognizeForms()
{
    // Load every template found in the templates folder.
    Console.WriteLine("Loading Templates...");
    using TemplatesCollection templates = new TemplatesCollection.Builder()
        .FromSourceFolder(templatesDirectory)
        .Build();

    // The classifier matches an incoming file to one of the loaded templates.
    Console.WriteLine("Creating Classifier...");
    using TemplateClassifier classifier = new TemplateClassifier.Builder()
        .SetTemplatesSource(templates)
        .SetRuntimeFolder(@".\TemplateExtraction1.0\Runtimes")
        .SetLicense([license, licenseKey])
        .Build();

    // The extractor reads the field values defined by the matched template.
    Console.WriteLine("Creating Extractor...");
    using TemplateExtractor extractor = new TemplateExtractor.Builder()
        .SetTemplatesSource(templates)
        .SetRuntimeFolder(@".\TemplateExtraction1.0\Runtimes")
        .SetLicense([license, licenseKey])
        .Build();

    Console.WriteLine("Begin recognizing forms...\n");
    string[] formsToRecognize = Directory.GetFiles(filledFormsDirectory, "*.pdf", SearchOption.AllDirectories);
    foreach (string form in formsToRecognize)
    {
        using var fileStream = new FileStream(form, FileMode.Open, FileAccess.Read, FileShare.Read);
        var inputStream = new InputStream(fileStream);

        // Step 1: find out which template this file corresponds to.
        Result<ClassifyResult> classifyResult = classifier.Classify(inputStream);
        if (!classifyResult.IsSuccess)
        {
            Console.WriteLine("Unable to classify file...");
        }
        else
        {
            Console.WriteLine($"Now reading {Path.GetFileName(form)} which has been recognized as a {classifyResult.Value.ClassName}");

            // Step 2: extract the fields and check the result before reading its value.
            Result<ExtractResult> result = extractor.Extract(inputStream);
            if (!result.IsSuccess)
            {
                Console.WriteLine("Unable to extract data from file...");
            }
            else
            {
                // Print one row per field: id, extracted text, confidence.
                Console.WriteLine($"{"Field",-20}{"Result",-20}{"Confidence",-10}");
                foreach (ExtractedPage page in result.Value.Pages)
                {
                    foreach (ExtractedText field in page.Fields)
                    {
                        Console.WriteLine($"{field.Id,-20}{field.Text,-20}{field.Confidence,-10:F2}");
                    }
                }
            }
        }
        Console.WriteLine();
    }
}

Let’s break down this sample code to understand what each part does.

First, loading the templates:

using TemplatesCollection templates = new TemplatesCollection.Builder()
    .FromSourceFolder(templatesDirectory)
    .Build();
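
The sample uses templatesDirectory and filledFormsDirectory without showing where they come from or whether they exist. As a minimal sketch in plain .NET (no Apryse calls), a pre-flight helper could collect the PDFs and return an empty list when a folder is missing. The class and method names here are hypothetical and not part of the SDK.

```csharp
using System;
using System.IO;

public static class FormInputs
{
    // Hypothetical helper: returns every PDF under `root`, or an empty array
    // when the folder does not exist, so the caller can report it cleanly.
    public static string[] FindPdfs(string root)
    {
        if (!Directory.Exists(root))
        {
            return Array.Empty<string>();
        }

        // Same recursive search the sample uses for filled forms.
        return Directory.GetFiles(root, "*.pdf", SearchOption.AllDirectories);
    }
}
```

Calling FormInputs.FindPdfs(filledFormsDirectory) before building the classifier makes an empty or mistyped folder visible up front, before any forms are processed.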

Next, we build the classifier and use it to match each file to the correct template:

using TemplateClassifier classifier = new TemplateClassifier.Builder()
    .SetTemplatesSource(templates)
    .SetRuntimeFolder(@".\TemplateExtraction1.0\Runtimes")
    .SetLicense([license, licenseKey])
    .Build();

// Inside the loop over formsToRecognize:
using var fileStream = new FileStream(form, FileMode.Open, FileAccess.Read, FileShare.Read);
var inputStream = new InputStream(fileStream);
Result<ClassifyResult> classifyResult = classifier.Classify(inputStream);

That’s how Apryse Template Extraction supports automated data classification from a large set of form files, including different types of forms, documents, and invoices.
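
Once every file has a class name, a common follow-up is to group the files by that name, for example to route each vendor's invoices to a different queue. The sketch below shows one way to do that. The classify delegate is a hypothetical stand-in for the SDK call (it is assumed to return the class name, or null when classification fails), and FormRouting is not part of Apryse.

```csharp
using System;
using System.Collections.Generic;

public static class FormRouting
{
    // Groups file paths by recognized class name. `classify` is a hypothetical
    // stand-in for the SDK call: it returns the class name, or null when the
    // file could not be classified.
    public static Dictionary<string, List<string>> GroupByClass(
        IEnumerable<string> files, Func<string, string> classify)
    {
        var groups = new Dictionary<string, List<string>>();
        foreach (string file in files)
        {
            // Unclassified files are collected under a placeholder key.
            string key = classify(file) ?? "(unclassified)";
            if (!groups.TryGetValue(key, out var bucket))
            {
                bucket = new List<string>();
                groups[key] = bucket;
            }
            bucket.Add(file);
        }
        return groups;
    }
}
```

In the article's loop, the delegate would wrap classifier.Classify and return classifyResult.Value.ClassName only when classifyResult.IsSuccess is true.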

Extracting the Data

The code that extracts the specified data from each file using the matched template is also part of the full sample above. See:

using TemplateExtractor extractor = new TemplateExtractor.Builder()
    .SetTemplatesSource(templates)
    .SetRuntimeFolder(@".\TemplateExtraction1.0\Runtimes")
    .SetLicense([license, licenseKey])
    .Build();

And:

Result<ExtractResult> result = extractor.Extract(inputStream);
Console.WriteLine($"{"Field",-20}{"Result",-20}{"Confidence",-10}");
foreach (ExtractedPage page in result.Value.Pages)
{
    foreach (ExtractedText field in page.Fields)
    {
        Console.WriteLine($"{field.Id,-20}{field.Text,-20}{field.Confidence,-10:F2}");
    }
}
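
Because each extracted field carries a confidence value, the output can also be used to flag fields for manual review instead of only printing them. The following is a rough sketch under stated assumptions: FieldResult is a hypothetical record that mirrors the three properties printed above (Id, Text, Confidence), and the threshold is an arbitrary placeholder, since the scale of the SDK's confidence value is not stated here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in that mirrors the three properties printed by the sample.
public record FieldResult(string Id, string Text, double Confidence);

public static class FieldReview
{
    // Returns the fields whose confidence falls below `threshold`,
    // preserving their original order.
    public static List<FieldResult> NeedsReview(
        IEnumerable<FieldResult> fields, double threshold)
    {
        return fields.Where(f => f.Confidence < threshold).ToList();
    }
}
```

A caller could copy each ExtractedText into a FieldResult inside the inner loop, then pass the collected list to FieldReview.NeedsReview with a threshold chosen for the document set.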

To learn more about creating the templates, visit our documentation guide. 

If you’re interested in getting started with Template Extraction, contact us. 
