How to Validate PDF/A Documents on Server/Desktop Using Different Methods

By Isaac Maw | 2025 Mar 26

Summary: PDF/A is essential for digital archiving due to its self-contained nature, lack of external dependencies, inclusion of metadata, and prohibition of encryption. Validation methods, including open-source libraries, CLI tools, and SDKs like Apryse PDF/A SDK, help ensure compliance with the PDF/A standard. Using a PDF/A validator helps to verify that your documents meet these requirements.

PDF/A is an essential standard for digital archives. Compared to PDF and other formats, PDF/A is better suited for archiving and storage because it includes features designed to safeguard data against problems such as missing fonts, missing images, and compatibility issues. All high-quality PDF viewers are capable of displaying PDF/A files and are required to be backward-compatible.

To provide these benefits, PDF/A-compliant files must meet requirements such as the following:

  • Self-contained: PDF/A files must be self-contained, meaning all the information needed to display the document is embedded within the file itself (fonts, color profiles, etc.).
  • No external dependencies: They cannot rely on external sources, such as externally referenced content or multimedia, to display the document.
  • Metadata: PDF/A files include metadata to help with document management and retrieval.
  • No encryption: Encryption is not allowed in PDF/A to ensure that the document can always be accessed.

There are a few methods for validating PDF/A compliance to ensure that your documents meet these requirements, including open-source libraries, CLI tools, and framework-specific solutions such as our Apryse PDF/A SDK. Keep reading for more information about some of these options.

Looking to convert your files to PDF/A? Check out How to Convert PDF to PDF/A. 

Why Validate PDF/A Documents?

PDF/A is an ISO-standardized version of PDF, specialized for archiving and long-term preservation of digital documents. A PDF/A file uses the .pdf filename extension, just like a typical PDF, and this is a main reason why PDF/A validation is important. For example, a user converting images from .JPEG to .TIFF can identify the converted files at a glance by their extension and start using them. Because PDF/A is a standard rather than a separate file extension, a PDF file must be validated against the requirements of the PDF/A ISO standard to identify issues that might prevent the document from being displayed or printed properly in the future.

Can’t I Just Check the Properties to Find out if a File is PDF/A?

PDF/A documents do have specific metadata; however, this metadata does not ensure compliance. A PDF document can be identified as PDF/A but contain features not allowed in PDF/A, and a PDF document that doesn’t have PDF/A metadata can otherwise satisfy the standard's technical requirements. So, it’s important to validate PDF/A files rather than trust that a tool which converts or prints to PDF/A produced a compliant file, especially for important documents intended for archiving and preservation.
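
To make the distinction concrete, here is a minimal Node.js sketch that reads only what a file declares about itself. It looks for the `pdfaid:part` and `pdfaid:conformance` properties that PDF/A files carry in their XMP metadata. The helper names `readPdfaClaim` and `readPdfaClaimFromFile` are hypothetical, and the file helper assumes the XMP packet is stored uncompressed, which may not hold for every PDF. The result is a claim, not a validation result.

```javascript
const fs = require('fs');

// Extract the declared PDF/A part and conformance level from XMP text.
// Handles element form (<pdfaid:part>2</pdfaid:part>) and
// attribute form (pdfaid:part="2"). Returns null when no claim is found.
function readPdfaClaim(xmpText) {
  const pick = (name) => {
    const re = new RegExp(
      '<pdfaid:' + name + '>\\s*([^<\\s]+)\\s*</pdfaid:' + name + '>' +
      '|pdfaid:' + name + '\\s*=\\s*["\']([^"\']+)["\']'
    );
    const m = re.exec(xmpText);
    return m ? (m[1] || m[2]) : null;
  };
  const part = pick('part');
  if (part === null) return null;
  // conformance may legitimately be absent, so it can be null here.
  return { part, conformance: pick('conformance') };
}

// Hypothetical helper: scan a file's raw bytes for an uncompressed XMP packet.
function readPdfaClaimFromFile(path) {
  const text = fs.readFileSync(path).toString('latin1');
  return readPdfaClaim(text);
}
```

A non-null result only says that the file identifies itself as PDF/A. Whether it actually complies is a separate question that a validator has to answer.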

PDF/A Validator Options

Command Line Interface: VeraPDF

VeraPDF is a trustworthy, open-source PDF/A validator. This tool was developed by the PDF Association, as part of their goal to facilitate the development of open specifications and ISO standards for PDF technology. You can find VeraPDF on the PDF Association website.

In many ways, you could say VeraPDF is “the” PDF validator CLI tool, as it was developed by a consortium with the express goal of developing an industry-supported PDF/A Validator.
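
If you want to call the CLI from a script, the following Node.js sketch shows one way to wrap it. It rests on several assumptions that you should check against `verapdf --help` for your installed version: that the launcher is on your PATH as `verapdf`, and that `--flavour` and `--format text` are the flag names for selecting the PDF/A flavour and a plain-text report. The helper names `buildVeraPdfArgs` and `runVeraPdf` are hypothetical.

```javascript
const { spawnSync } = require('child_process');

// Build the argument list for one validation run.
// Assumption: `--flavour` and `--format` exist under these names in your veraPDF version.
function buildVeraPdfArgs(pdfPath, flavour = '2b') {
  return ['--flavour', flavour, '--format', 'text', pdfPath];
}

// Run the CLI and return its exit code and raw output without interpreting them.
// Assumption: `command` may need to be `verapdf.bat` or a full path on your system.
function runVeraPdf(pdfPath, flavour = '2b', command = 'verapdf') {
  const result = spawnSync(command, buildVeraPdfArgs(pdfPath, flavour), { encoding: 'utf8' });
  if (result.error) {
    throw new Error('Could not start veraPDF: ' + result.error.message);
  }
  return { exitCode: result.status, output: result.stdout.trim() };
}
```

The sketch deliberately returns the raw output instead of parsing it, since the report layout can differ between versions and formats.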

Framework-Specific Solutions

If you need to include PDF/A validation in your application, you may be looking for a solution that works in your application framework, such as a JavaScript PDF/A validator or a PDF/A SDK. Our cross-platform PDF/A SDK enables conversion of 20+ file formats into ISO-compliant PDF/A files that pass VeraPDF validation.

As part of the Server SDK, this functionality is available in several languages, including C#, C++, Go, Java, JavaScript, PHP, Python, Ruby, and Visual Basic.

PDF/A Validator SDK Key Functions:

Key functions of the PDF/A validation for Server SDK include:

  • Checks if a PDF file is compliant with any of the PDF/A specifications (ISO 19005-1, 19005-2, 19005-3, 19005-4).
  • Converts any PDF to a PDF/A compliant document.
  • Supports all PDF/A versions and conformance levels: PDF/A-1A, PDF/A-1B, PDF/A-2A, PDF/A-2B, PDF/A-2U, PDF/A-3A, PDF/A-3B, PDF/A-3U, PDF/A-4, PDF/A-4E, PDF/A-4F.
  • Produces a detailed report of compliance violations and associated PDF objects.
  • Keeps the required changes to a minimum, preserving the consistency of the original.
  • Tracks all changes to allow for automatic assessment of data loss.
  • Allows users to customize compliance checks or omit specific changes during the conversion process.
  • Preserves tags, logical structure, and color information in existing PDF documents.
  • Supports user-defined color profiles.
  • Offers automatic font substitution, embedding, and subsetting options.
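
The version and level names in the list above follow a regular pattern: the digit is the part of ISO 19005, and the optional letter is the conformance level. As an illustration only, here is a small sketch that splits such a label into its pieces and rejects combinations that are not in the list. The function name `parseConformance` and the lookup table are hypothetical and are not part of the SDK.

```javascript
// Levels from the list above, grouped by PDF/A part.
// An empty string stands for "no level letter", as in plain PDF/A-4.
const LEVELS_BY_PART = {
  1: ['A', 'B'],
  2: ['A', 'B', 'U'],
  3: ['A', 'B', 'U'],
  4: ['', 'E', 'F'],
};

// Parse a label such as "PDF/A-2U" into its part, level, and ISO document.
// Returns null for anything that is not one of the listed combinations.
function parseConformance(label) {
  const m = /^PDF\/A-([1-4])([A-Z]?)$/i.exec(label.trim());
  if (!m) return null;
  const part = Number(m[1]);
  const level = m[2].toUpperCase();
  if (!LEVELS_BY_PART[part].includes(level)) return null;
  return { part, level, iso: 'ISO 19005-' + part };
}
```

A helper like this can be handy for checking user input before choosing which conformance level to validate against.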

How to Validate PDF/A using Apryse SDK

If this is your first time getting started with the Server SDK, follow the steps in the documentation guide. Once you’re set up and ready to go, you can use the sample code below to programmatically convert generic PDF documents into ISO-compliant, VeraPDF-valid PDF/A files, or to validate PDF/A compliance.

Here's the PDF/A Validator SDK sample code in JavaScript:

//---------------------------------------------------------------------------------------
// Copyright (c) 2001-2024 by Apryse Software Inc. All Rights Reserved.
// Consult legal.txt regarding legal and license information.
//---------------------------------------------------------------------------------------

const { PDFNet } = require('@pdftron/pdfnet-node');
const PDFTronLicense = require('../LicenseKey/LicenseKey');

((exports) => {

  exports.runPDFA = () => {

    const printResults = async (pdfa, filename) => {

      const errorCount = await pdfa.getErrorCount();
      if (errorCount === 0) {
        console.log(filename + ': OK.');
      } else {
        console.log(filename + ' is NOT a valid PDFA.');
        for (let i = 0; i < errorCount; i++) {
          const errorCode = await pdfa.getError(i);
          const errorMsg = await PDFNet.PDFACompliance.getPDFAErrorMessage(errorCode);
          console.log(' - e_PDFA ' + errorCode + ': ' + errorMsg + '.');
          const numRefs = await pdfa.getRefObjCount(errorCode);
          if (numRefs > 0) {
            const objs = [];
            for (let j = 0; j < numRefs; j++) {
              const objRef = await pdfa.getRefObj(errorCode, j);
              objs.push(objRef);
            }
            console.log('   Objects: ' + objs.join(', '));
          }
        }
        console.log('');
      }
    }

    //---------------------------------------------------------------------------------------
    // The following sample illustrates how to parse and check if a PDF document meets the
    // PDFA standard, using the PDFACompliance class object.
    //---------------------------------------------------------------------------------------
    const main = async () => {
      const inputPath = '../TestFiles/';
      const outputPath = inputPath + 'Output/';
      await PDFNet.setColorManagement();  // Enable color management (required for PDFA validation).

      //-----------------------------------------------------------
      // Example 1: PDF/A Validation
      //-----------------------------------------------------------
      try {
        const filename = 'newsletter.pdf';
        /* The max_ref_objs parameter to the PDFACompliance constructor controls the maximum number 
        of object numbers that are collected for particular error codes. The default value is 10 
        in order to prevent spam. If you need all the object numbers, pass 0 for max_ref_objs. */
        const pdfa = await PDFNet.PDFACompliance.createFromFile(false, inputPath + filename, '', PDFNet.PDFACompliance.Conformance.e_Level2B);
        await printResults(pdfa, filename);
      } catch (err) {
        console.log(err);
      }

      //-----------------------------------------------------------
      // Example 2: PDF/A Conversion
      //-----------------------------------------------------------
      try {
        let filename = 'fish.pdf';
        const pdfa = await PDFNet.PDFACompliance.createFromFile(true, inputPath + filename, '', PDFNet.PDFACompliance.Conformance.e_Level2B);
        filename = 'pdfa.pdf';
        await pdfa.saveAsFromFileName(outputPath + filename);

        // Re-validate the document after the conversion...
        const comp = await PDFNet.PDFACompliance.createFromFile(false, outputPath + filename, '', PDFNet.PDFACompliance.Conformance.e_Level2B);
        await printResults(comp, filename);
      } catch (err) {
        console.log(err);
      }

      console.log('PDFACompliance test completed.')
    };

    PDFNet.runWithCleanup(main, PDFTronLicense.Key)
      .catch((error) => {
        // Error objects stringify to "{}", so prefer the message when there is one.
        console.log('Error: ' + (error && error.message ? error.message : JSON.stringify(error)));
      })
      .then(() => PDFNet.shutdown());
  };
  exports.runPDFA();
})(exports);
// eslint-disable-next-line spaced-comment
//# sourceURL=PDFATest.js
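
The sample prints its findings to the console. If you would rather act on the result in code, for example to fail a build step, a variation along the following lines could collect the same information into a plain object. This is a sketch, not part of the SDK: the function name `collectPdfaReport` is hypothetical, and it uses only the calls that already appear in `printResults` above (`getErrorCount`, `getError`, `getRefObjCount`, `getRefObj`), with the message lookup passed in as a parameter.

```javascript
// Gather validation findings into a plain object instead of printing them.
// `pdfa` is a PDFACompliance instance created as in the sample above.
// `getMessage` is the message lookup; in the sample that role is played by
// PDFNet.PDFACompliance.getPDFAErrorMessage.
async function collectPdfaReport(pdfa, getMessage) {
  const errorCount = await pdfa.getErrorCount();
  const errors = [];
  for (let i = 0; i < errorCount; i++) {
    const code = await pdfa.getError(i);
    const message = await getMessage(code);
    const numRefs = await pdfa.getRefObjCount(code);
    const objects = [];
    for (let j = 0; j < numRefs; j++) {
      objects.push(await pdfa.getRefObj(code, j));
    }
    errors.push({ code, message, objects });
  }
  return { compliant: errorCount === 0, errors };
}
```

Inside `main`, this could be called as `collectPdfaReport(pdfa, (code) => PDFNet.PDFACompliance.getPDFAErrorMessage(code))`, after which the caller decides what to do when `compliant` is false.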

Wrapping Up

PDF/A is an important and useful standard for maintaining digital documents, but without the right tools to validate and convert to PDF/A, you could be at risk of non-compliance, lost data, and unreadable documents. Try one of the tools above to take full advantage of the powerful PDF/A standard.

For more information on the PDF/A library, visit our product page. You can also find our code sample for Convert to PDF/A here. Or, reach out to us with any questions. 

Isaac Maw

Technical Content Creator
