What’s New in Apryse 10.7?

By Apryse | 2024 Feb 28


Introduction

Apryse offers unrivaled document processing technology for developers. Each new release incrementally improves the software, building on the strengths of the previous one.

Release 10.7 contains a range of improvements across the product portfolio. In this article, we will look at some of the improvements within Apryse SDK, WebViewer, Fluent, and Xodo.

Apryse SDK

Detection of Headers and Footers when converting from PDF to DOCX

Headers and footers in a Word document are those areas at the top and bottom of each page where you can add text or graphics that you want to appear consistently on every (or every other) page in the document. Common use cases include adding page numbers, document titles, chapter titles, author names, dates, and logos.

However, when converting a PDF into a Word document, it can be difficult to identify what is a header or a footer, since very often the PDF contains no clue as to the role of a piece of text or a graphic other than its location.

Until now, Apryse Structured Output has solved this by looking for similar content or patterns on multiple pages within the document. That has given impressive results for more than ten years, allowing the reconstructed DOCX file to behave exactly as you would expect. That’s the reason that Apryse is the best PDF to Office converter.

A fundamental problem with looking for similar content on multiple pages, however, is that it is impossible to detect headers and footers from a single page document.
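The multi-page idea, and the reason it cannot help with a single page, can be illustrated with a short Python sketch. This is not Apryse's implementation: the `Line` type, the `find_repeated_lines` function, the 10% margin, and the sample pages are all assumptions made purely for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Line:
    text: str
    y: float  # vertical position as a fraction of page height (0 = top, 1 = bottom)


def find_repeated_lines(pages, margin=0.1, min_pages=2):
    """Return text found in the top or bottom margin of at least `min_pages` pages.

    Hypothetical illustration of the "similar content on multiple pages" idea.
    """
    counts = Counter()
    for page in pages:
        # Use a set so a line repeated within one page is only counted once.
        candidates = {
            line.text
            for line in page
            if line.y <= margin or line.y >= 1 - margin
        }
        counts.update(candidates)
    return {text for text, n in counts.items() if n >= min_pages}


# Made-up two-page document: same header and footer text on both pages.
report = [
    [Line("Annual Report", 0.03), Line("Introduction", 0.20), Line("Confidential", 0.97)],
    [Line("Annual Report", 0.03), Line("Results", 0.20), Line("Confidential", 0.97)],
]

multi_page = find_repeated_lines(report)
single_page = find_repeated_lines(report[:1])
```

With both pages, "Annual Report" and "Confidential" are reported as repeated margin text. With only the first page, the result is empty, because nothing can repeat across pages when there is only one.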

Apryse 10.7 contains a step-change improvement in header and footer detection. It is now possible to detect headers and footers on many single-page documents.

Figure 1 - In Apryse 10.6, the footer text at the bottom of the page is positioned using newlines. While that looks correct, it makes editing difficult. Modifying any of the text on the page may change the location of this footer text.

Figure 2 - In Apryse 10.7, the text is now correctly located inside the page footer, so that changing the document will not alter the text location.

Data Extraction Module - IDP

The Apryse Data Extraction module has been able to detect the location of form fields since February 2023 using artificial intelligence and computer vision.

In Apryse 10.7, we have added the ability to extract not only the field data, but also to identify the label associated with that field, even in documents that do not have interactive field annotations embedded.

Figure 3 - Part of a form containing data.

Figure 4 - Part of the JSON output when detecting form contents. This includes not just the words in the form fields but also their location and the associated label, which in this case includes ‘Given Name’.

This means that you can take a PDF-based form and extract the contents, or generate a version of that form modified to capture the specific data that you require. Better still, you can do that while avoiding the time-consuming and error-prone process of reconstructing the form manually.
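As a rough idea of how such output might be consumed, the Python sketch below maps each label to the text found in its field. The exact JSON schema is not reproduced in this article, so the key names (`fields`, `label`, `words`, `rect`), the sample values, and the `fields_by_label` helper are all hypothetical.

```python
import json

# Hypothetical JSON, loosely modelled on the description above: each detected
# field has a label, the words found in it, and a bounding rectangle.
# The key names are assumptions, not the documented Apryse schema.
sample = """
{
  "fields": [
    {"label": "Given Name", "words": ["Jane"], "rect": [72, 120, 260, 140]},
    {"label": "Family Name", "words": ["Van", "Dyke"], "rect": [300, 120, 520, 140]}
  ]
}
"""


def fields_by_label(raw):
    """Map each field label to the text extracted from that field."""
    data = json.loads(raw)
    return {field["label"]: " ".join(field["words"]) for field in data["fields"]}


values = fields_by_label(sample)
```

A mapping like this is the sort of structure you could feed into a database record or use to pre-fill a regenerated form.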

We continue to work on Data Extraction, so you can expect even better things in future releases.

WebViewer

Simplifying UI customization

It has long been possible to customize the Apryse WebViewer UI to suit your specific requirements. In fact, since it is open source, you can modify it in whatever way you wish.

Apryse 10.7 includes a modular UI that simplifies customization and avoids the need to maintain a custom fork.

To support this, the various panels have been rewritten and can now be located at the top, sides, or bottom of the viewer.

Figure 5 - An example of the new modular UI, with tools shown at the left-hand side, rather than at the top.

Figure 6 - With just a tiny change in the code, the tools are now at the bottom.

Improvements to PDF Content Editing

Making Annotations Link to the Text

Previously, if an annotation was added to a PDF (for example, a highlight annotation on text), there was no explicit link between the annotation and the underlying text. As a result, if you edited the text, the annotation could end up in the wrong location.

Figure 7 - A PDF with highlighted text.

Figure 8 - The same PDF after adding the words 'complete and' to the existing text. The highlighted area has remained in the same place, resulting in different words being incorrectly highlighted.

Figure 9 - Worse still, in the past, if you moved the text then the annotation stayed in the same place.

In 10.7 that’s been fixed, so the highlighted text remains correct, and with the fully WYSIWYG editing available since 10.6, you can see exactly what is going to happen. The annotation stays associated with the correct text even if you move the entire paragraph, giving the document an intuitive and accurate look.

Figure 10 - The result in 10.7. As new words are added the correct text remains highlighted, and that is also the case if the entire paragraph is moved.
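The difference between a highlight fixed to a position and one linked to the text can be shown with a small Python model. This is only a conceptual sketch, not how WebViewer stores annotations: the `Highlight` class, the `insert_text` function, and the sample sentence are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Highlight:
    start: int  # index of the first highlighted character
    end: int    # index one past the last highlighted character


def insert_text(text, highlight, position, new_text):
    """Insert `new_text` at `position` and keep the highlight on the same words.

    Hypothetical model: the highlight is stored as a character range in the
    text rather than as a fixed rectangle on the page.
    """
    updated = text[:position] + new_text + text[position:]
    shift = len(new_text)
    if position <= highlight.start:
        # Insertion before the highlight: move the whole range along.
        moved = Highlight(highlight.start + shift, highlight.end + shift)
    elif position < highlight.end:
        # Insertion inside the highlight: grow the range to include it.
        moved = Highlight(highlight.start, highlight.end + shift)
    else:
        # Insertion after the highlight: nothing to adjust.
        moved = Highlight(highlight.start, highlight.end)
    return updated, moved


text = "Please read the detailed guide before starting."
start = text.index("guide")
highlight = Highlight(start, start + len("guide"))

new_text, new_highlight = insert_text(
    text, highlight, text.index("detailed"), "complete and "
)

# Linked range: still covers the word "guide".
highlighted = new_text[new_highlight.start:new_highlight.end]
# Unadjusted range: now covers unrelated characters, as in Figure 8.
stale = new_text[highlight.start:highlight.end]
```

Here `highlighted` is still "guide" after the edit, while `stale`, which reuses the old offsets, lands on part of the newly inserted words.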

Making Color Selection easy with the new Color Picker

Often when editing text in one part of a document, you want the color to be the same as that used elsewhere in the document.

While it has been possible to set custom colors for a long time, that doesn’t solve the problem of how to find what color to use. It can be done by inspecting the HTML and CSS, but that is hardly a simple solution.

In 10.7, when text is selected, its color is displayed. The new copy button makes it a cinch to add that color into the custom colors, allowing you to use it wherever you wish.

Figure 11 - The new Text Styles panel, showing how the selected text sets the Current Color button. Clicking on the copy icon will copy that into Custom Colors. Note also that Strikethrough is now available as a style.

Fluent

Fluent, the automated reporting and template creation tool for Microsoft Excel, Word, and PowerPoint, has become even better.

It has been great at creating PDFs and other document types for years. Content can be maintained in one place but used in several templates, avoiding the need to hunt through multiple places when content changes are necessary.

Now Fluent can also create PDFs suitable for archiving (PDF/A) as well as PDFs designed to be accessible (PDF/UA). That’s great for accessibility, and long-term storage.

Note: At the moment it is not possible to create PDF/A documents using a trial license. Please contact Apryse sales for more information.

Figure 12 - The Template designer showing the new options for PDF/A and PDF/UA. These are also accessible from the Fluent Engine.

Head on over to the PDF/A documentation and PDF/UA documentation to learn more about how to use Fluent to generate documents in these formats.

Composition Templates (Templates within Templates)

Fluent templates can refer to other templates as dependencies. This lets you create high-quality, consistent, tailored documents using the latest data by reusing templates that you have already set up the way you want. However, if the main template is available but one that it depends on is not, errors may occur.

Fluent Manager now makes it simple to import templates that have embedded templates. The UI will not only detect any missing dependencies, but also help you to resolve the problem, saving you time and worry.
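Detecting missing dependencies amounts to walking the graph of template references. The Python sketch below shows one way this could be done. It does not describe Fluent Manager's internals: the `missing_dependencies` function and the template names are hypothetical.

```python
def missing_dependencies(root, dependencies, available):
    """Return every template reachable from `root` that is not in `available`.

    `dependencies` maps a template name to the names of templates it embeds.
    All names here are hypothetical.
    """
    missing = set()
    seen = set()
    stack = [root]
    while stack:
        name = stack.pop()
        if name in seen:
            continue  # Already visited; also guards against circular references.
        seen.add(name)
        if name not in available:
            missing.add(name)
            continue  # Cannot inspect the dependencies of a template we lack.
        stack.extend(dependencies.get(name, []))
    return sorted(missing)


dependencies = {
    "invoice": ["letterhead", "terms"],
    "letterhead": ["logo-block"],
}
available = {"invoice", "letterhead", "terms"}

result = missing_dependencies("invoice", dependencies, available)
```

In this made-up example the only unresolved template is "logo-block", which is embedded two levels down from "invoice". That is the kind of indirect dependency that is easy to overlook when importing by hand.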

Xodo.com

Apryse’s awesome SaaS Document tool just got even better!

The new AskPDF feature allows you to open a PDF then ask questions in a straightforward way and have AI answer the questions for you.

For example, having opened the file air-nz-2022-annual-results-presentation.pdf within Xodo, you can ask detailed questions about the content of the PDF, such as “How much did the load factor increase by between 2021 and 2022?” or “Did RASK increase or decrease between 2021 and 2022?”.

AskPDF will interpret your questions, then search the PDF to give you answers: “The load factor increased by 9.8 percentage points between 2021 and 2022” and “The RASK (Revenue per Available Seat Kilometer) decreased between 2021 and 2022.”

It will even tell you the page on which it found the answers, allowing you to verify them if you wish.

Figure 13 - An example of using AskPDF to get answers to industry specific questions with almost zero user effort.

That is truly awesome.

Conclusion

Apryse 10.7 improves an already great product, making life simpler and work better.

Whether you are new to Apryse, or an existing customer, you can try out many aspects of the functionality with the Apryse Showcase. Alternatively, you can download the Apryse SDK or Apryse WebViewer and test them locally.

Additionally, if your interest is automated report and template creation using Microsoft Office, then Fluent has something for you too.

If you have any technical questions, then you can chat with us through our active Discord group.

Alternatively, if you have sales questions then please contact the Sales team who will be happy to help.
