PDF to PDF/A Conversion Using PHP

By Garry Klooesterman | 2025 Mar 13

Summary: PDF/A, defined in ISO 19005, is the standard for digital document archiving and preservation. In this guide, we'll look at the Apryse PDF/A SDK, which converts 20+ file types to PDF/A, and walk through PDF to PDF/A conversion using PHP.

Introduction

Secure and reliable document archiving and preservation is crucial for businesses in many industries, such as healthcare, finance, legal, and insurance. Just as crucial is finding the right format for preserving those electronic documents. As platforms, standards, and operating systems evolve over time, files in formats that aren’t designed for archiving can become corrupted or unusable.

For example, using standard PDF files or DOCX files for archiving may result in issues such as:

  • Font Embedding: The document may use fonts that aren't embedded in it, which could cause the document to not display correctly on systems that don't have the same fonts installed.
  • Security Risks: DOCX files can be vulnerable to macro viruses and other security threats. Metadata, version history, and tracked changes may be accessible in DOCX files, resulting in compliance issues with security and privacy.
  • Long-Term Accessibility: The software required to open DOCX files may become obsolete, making it difficult to access the documents in the future.

What is PDF/A?

The PDF/A standard, defined in ISO 19005, addresses these issues by requiring that everything needed to render a document, such as fonts, formatting, and raster and vector graphics, is embedded in the file itself. Because the file carries all necessary elements, PDF/A ensures that documents are reliably and consistently rendered in the future.

PDF to PDF/A Conversion

The PDF/A SDK:

  • Converts 20+ file formats, including PDF, JPG, HTML, Word, and TIFF, into veraPDF-valid, ISO-compliant PDF/A files. It supports all PDF/A standards (PDF/A-1, -2, -3 and -4) and conformance levels (a, b, u, 4e and 4f). The SDK can also repair non-compliant PDF/A files.
  • Supports high-volume PDF to PDF/A batch conversion from the command line, or as a PDF conversion library integrated into an automated document workflow.
  • Analyzes the content of existing PDF files and performs a sequence of modifications to produce a PDF/A-compliant document. Features not suitable for long-term archiving, such as encryption, missing fonts, or device-dependent colors, are replaced with PDF/A-compliant equivalents. Information loss is minimal because only necessary changes are applied to the source during conversion. A detailed report for each change is provided, making it easy to inspect changes and to determine whether the conversion loss is acceptable.

How to Use the SDK in PHP

The documentation guide covers using the Apryse PDF/A SDK with PHP and other languages in detail, and includes the sample code below.

The sample first validates a PDF against PDF/A-2b, then converts a second PDF to PDF/A-2b and re-validates the converted file.

<?php
//---------------------------------------------------------------------------------------
// Copyright (c) 2001-2023 by Apryse Software Inc. All Rights Reserved.
// Consult LICENSE.txt regarding license information.
//---------------------------------------------------------------------------------------
if (file_exists("../../../PDFNetC/Lib/PDFNetPHP.php"))
	include("../../../PDFNetC/Lib/PDFNetPHP.php");
include("../../LicenseKey/PHP/LicenseKey.php");

//---------------------------------------------------------------------------------------
// The following sample illustrates how to check whether a PDF document meets the
// PDF/A standard, and how to convert it to PDF/A, using the PDFACompliance class.
//---------------------------------------------------------------------------------------


function PrintResults($pdf_a, $filename) 
{
	$err_cnt = $pdf_a->GetErrorCount();
	if ($err_cnt == 0) 
	{
		echo nl2br($filename.": OK.\n");
	}
	else 
	{
		echo nl2br($filename." is NOT a valid PDFA.\n");
		for ($i=0; $i<$err_cnt; ++$i) 
		{
			$c = $pdf_a->GetError($i);
			$str1 = " - e_PDFA ".$c.": ".PDFACompliance::GetPDFAErrorMessage($c).".";
			// Append the object numbers that triggered this error, if any were collected.
			$num_refs = $pdf_a->GetRefObjCount($c);
			if ($num_refs > 0)
			{
				$str1 = $str1."\n   Objects: ";
				for ($j=0; $j<$num_refs; ++$j)
				{
					$str1 = $str1.$pdf_a->GetRefObj($c, $j);
					if ($j<$num_refs-1)
						$str1 = $str1. ", ";
				}
			}
			echo nl2br($str1."\n");
		}
		echo nl2br("\n");
	}
}

	// Relative path to the folder containing the test files.
	$input_path = getcwd()."/../../TestFiles/";
	$output_path = getcwd()."/../../TestFiles/Output/";

	PDFNet::Initialize($LicenseKey);
	PDFNet::GetSystemFontList();    // Wait for fonts to be loaded if they haven't already. This is done because PHP can run into errors when shutting down if font loading is still in progress.
	PDFNet::SetColorManagement();  // Enable color management (required for PDFA validation).

	//-----------------------------------------------------------
	// Example 1: PDF/A Validation
	//-----------------------------------------------------------
	$filename = "newsletter.pdf";
	// The max_ref_objs parameter to the PDFACompliance constructor controls the maximum number 
	// of object numbers that are collected for particular error codes. The default value is 10 
	// in order to prevent spam. If you need all the object numbers, pass 0 for max_ref_objs.
	$pdf_a = new PDFACompliance(false, $input_path.$filename, "", PDFACompliance::e_Level2B, 0, 0, 10);
	PrintResults($pdf_a, $filename);
	$pdf_a->Destroy();

	//-----------------------------------------------------------
	// Example 2: PDF/A Conversion
	//-----------------------------------------------------------
	$filename = "fish.pdf";
	
	$pdf_a = new PDFACompliance(true, $input_path.$filename, "", PDFACompliance::e_Level2B, 0, 0, 10);
	$filename = "pdfa.pdf";
	$pdf_a->SaveAs($output_path.$filename, false);
	$pdf_a->Destroy();

	// Re-validate the document after the conversion...
	$pdf_a = new PDFACompliance(false, $output_path.$filename, "", PDFACompliance::e_Level2B, 0, 0, 10);		
	PrintResults($pdf_a, $filename);
	$pdf_a->Destroy();
	PDFNet::Terminate();	
	echo nl2br("PDFACompliance test completed.\n");
?>
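
The conversion step in Example 2 handles a single file, while the batch scenario mentioned earlier needs the same step applied to a whole folder. The following is a minimal sketch of one way to do that, not an official Apryse sample. The helper `convertFolderToPdfA` and the closure `$toPdfA2b` are hypothetical names introduced for illustration. The closure uses only the `PDFACompliance` calls that appear in the sample above, and the sketch assumes that a failed conversion surfaces as a PHP exception.

```php
<?php
// Hypothetical helper (not part of the Apryse SDK): passes every *.pdf file in
// $inputDir to a converter callback and records the outcome per file name.
function convertFolderToPdfA(string $inputDir, string $outputDir, callable $convert): array
{
    // Create the output folder if it does not exist yet.
    if (!is_dir($outputDir) && !mkdir($outputDir, 0777, true) && !is_dir($outputDir)) {
        throw new RuntimeException("Cannot create output folder: ".$outputDir);
    }

    $results = [];
    // glob() returns false on error, so fall back to an empty list.
    $files = glob(rtrim($inputDir, "/")."/*.pdf") ?: [];
    sort($files);

    foreach ($files as $source) {
        $name = basename($source);
        $target = rtrim($outputDir, "/")."/".$name;
        try {
            $convert($source, $target);
            $results[$name] = "converted";
        } catch (Exception $e) {
            // Assumption: conversion failures are reported as exceptions.
            // Record the failure and continue with the remaining files.
            $results[$name] = "failed: ".$e->getMessage();
        }
    }
    return $results;
}

// Converter built from the same calls as Example 2 in the sample above.
// It is only defined here; it needs the SDK loaded and initialized when called.
$toPdfA2b = function (string $source, string $target): void {
    $pdf_a = new PDFACompliance(true, $source, "", PDFACompliance::e_Level2B, 0, 0, 10);
    $pdf_a->SaveAs($target, false);
    $pdf_a->Destroy();
};
```

With the SDK initialized as in the sample (`PDFNet::Initialize($LicenseKey)` and `PDFNet::SetColorManagement()`), calling `convertFolderToPdfA($input_path, $output_path, $toPdfA2b)` returns an array that maps each file name to `converted` or to a failure message. Passing the converter in as a callback keeps the folder handling separate from the SDK calls, so the target conformance level can be changed in one place. Note that the `*.pdf` pattern is case-sensitive on most Linux systems.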

Conclusion

Converting files to PDF/A with the Apryse PDF/A SDK and PHP gives you secure, long-term archives that keep your electronic documents reliable and render them consistently in the future.

Download the SDK to get started now or try out the demo to see how it works.

Have questions? Contact our sales team or join our Discord community for support and discussions.

Garry Klooesterman

Senior Technical Content Creator