
How to Convert Office Documents to PDF on a Server Without Installing Office Using Java

By Apryse | 2023 Aug 30


In the modern business landscape, sharing and archiving documents in a reliable and universally accessible format is paramount. Portable Document Format (PDF) stands out as a versatile and widely supported format that ensures consistent presentation across different platforms. Converting Office documents such as Word, Excel, and PowerPoint files into PDFs is a common requirement.

The Apryse PDF SDK can be used with multiple languages and frameworks to convert Office Documents to PDF, with both client-side (within the browser) and server-side conversions.

The Apryse Java library can be downloaded from the Apryse website and contains dozens of samples. In this blog, we will walk through getting started with the sample for converting a Word document to PDF, then extend it to see some of the other functionality that is available.

This blog was written using version 10.3.0 of the Apryse SDK and a Trial license key.

But Surely Creating a PDF is Easy, You Can Just Use Export From Word


If you have Office installed on your own machine, then creating a PDF is simple.

But what do you do if you don’t have Office installed?

Theoretically, a server could be set up that uses Office to perform the conversions within the Web server backend, but there are several issues with that:

  • Some versions of Office are not recommended to be used as a service component.
  • Office sometimes brings up modal user dialogs, particularly when an update is needed, and since services have no UI, the dialog can never be dismissed, and the application will hang.
  • Licensing of Office in this way is not the same as for a single user Desktop machine, so there is additional complexity in using this solution in a legal, licensed way.

The awesome thing about the Apryse solution is that there is no need for Office to be installed, either on the user’s machine, or on a server, for the conversion to occur.

That’s right. The conversion occurs entirely without the need for Office.

Sample Project for Converting a Document to PDF


The sample project is intended to show how specific documents in a hard-coded location can be converted to PDF.

In reality, you are likely to want to be able to specify which document is to be converted and what you want to do with the PDF once the conversion is complete. As such, the code should be considered to be an example of how to convert a file and see the result, rather than as a template of how to write an entire document processing solution.
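
As a minimal sketch of that idea, the helper below takes an input file and an output folder chosen by the caller and derives the name of the PDF to write. The class and method names (`OutputNames`, `pdfPathFor`) are hypothetical and are not part of the sample or the SDK; the code only works out paths and performs no conversion.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class OutputNames {
    // Replaces the last extension of the input file name with ".pdf".
    // A name without an extension simply gets ".pdf" appended.
    static Path pdfPathFor(Path input, Path outputDir) {
        String name = input.getFileName().toString();
        int dot = name.lastIndexOf('.');
        String base = (dot > 0) ? name.substring(0, dot) : name;
        return outputDir.resolve(base + ".pdf");
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: OutputNames <input-file> <output-dir>");
            return;
        }
        System.out.println(pdfPathFor(Paths.get(args[0]), Paths.get(args[1])));
    }
}
```

The resulting path could then be handed to the conversion function in place of the hard-coded file names used by the sample.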

There is a great getting-started guide at https://docs.apryse.com/try-now/ which will lead you through the steps for creating an application on Windows, macOS or Linux.


Figure 1 - Getting started

Prerequisites


You will need the JDK installed and a trial license key that can be downloaded from the Apryse website. If you are having trouble with the JDK, then please see https://docs.apryse.com/documentation/java/faq/

How to Get an Apryse SDK Trial Key


If you don't already have an Apryse account, go to https://dev.apryse.com and register a new account. This allows Apryse to grant you a demo license key which will be used with the Apryse SDK to enable demo functionality.

Figure 2 - The Developer Portal

Log into https://dev.apryse.com with your registered account. The Java SDK is available for Windows, Linux, and macOS, so select the platform that you are using.

Click on the reveal button to get your personalized Trial key.

Figure 3 - Download Center Platform and Trial Key

How to Obtain the Apryse SDK


Having selected the platform and found the trial key, if you scroll a little further, you will see the multitude of languages that are available for download.

Figure 4 - Just some of the supported languages for the SDK. The list of options varies slightly between Windows, Linux and macOS.

The SDK is a zip file called PDFNetJava.zip. This contains binary files for Windows, Linux and macOS, which is typical for Java libraries.

Extract that file to a location of your choice. I chose to extract it to a folder called source. Within the folder there are the executable files, documentation, and a lot of samples.


Within the Samples folder there is a folder called OfficeToPDFTest with a subfolder for JAVA.


Open a terminal and navigate to this folder.

Setting Up Your Project


Within the file OfficeToPDFTest.java you will need to enter the trial license key that you have already acquired. To do this, replace ‘PDFTronLicense.Key()’ with your actual key in double quotes. I used VSCode, but you can use whatever editor you prefer.
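
If you would rather not paste the key into source code, one option is to read it from an environment variable. The sketch below assumes a variable named `APRYSE_LICENSE_KEY`; that name, and the `LicenseKeyLookup` class, are made up for this example and are not defined by the SDK.

```java
import java.util.Map;

final class LicenseKeyLookup {
    // Assumed variable name for this sketch; choose whatever suits your setup.
    static final String ENV_NAME = "APRYSE_LICENSE_KEY";

    // Returns the trimmed key, or fails early with a clear message if it is missing.
    static String fromEnvironment(Map<String, String> env) {
        String key = env.get(ENV_NAME);
        if (key == null || key.trim().isEmpty()) {
            throw new IllegalStateException("Set " + ENV_NAME + " to your trial key");
        }
        return key.trim();
    }

    public static void main(String[] args) {
        String key = fromEnvironment(System.getenv());
        // The sample could then use `key` in place of PDFTronLicense.Key().
        System.out.println("License key loaded (" + key.length() + " characters)");
    }
}
```

Passing the environment in as a map keeps the lookup easy to test without changing the real environment.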


This code specifies the files that should be converted from Office to PDF as hardcoded locations within the zip file that we downloaded.

This is just one of the Word document files, but feel free to investigate the others.

Figure 5 - The first page of one of the Word documents that will be converted by the sample code.

You can start the conversion by entering cmd.exe /c RunTest.bat from PowerShell, or use whatever method you prefer for the platform that you are using.

After a few moments the terminal will show the results of the function.


Now navigate to the Output folder within the samples library.


You can see that there are three files there that have recently been created.

Opening each in turn will show that we have created PDFs from the original documents.

Figure 6 - A PDF that was created from a Word document


Figure 7 - Another PDF that was created from a Word document

Figure 8 - A PDF that was created from an Arabic language Word document, illustrating right-to-left text.
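
On a server there is nobody to open the files by eye, so a quick programmatic check can be handy. The sketch below tests whether a file starts with the `%PDF-` marker that opens every PDF file. It uses only the Java standard library; `PdfHeaderCheck` is a hypothetical helper, not part of the sample, and it does not validate the rest of the file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

final class PdfHeaderCheck {
    // True when the first five bytes of the file are "%PDF-".
    static boolean looksLikePdf(Path file) throws IOException {
        byte[] expected = "%PDF-".getBytes(StandardCharsets.US_ASCII);
        byte[] actual = new byte[expected.length];
        try (InputStream in = Files.newInputStream(file)) {
            int total = 0;
            // read() may return fewer bytes than requested, so loop until done
            while (total < actual.length) {
                int n = in.read(actual, total, actual.length - total);
                if (n < 0) {
                    break; // file is shorter than the marker
                }
                total += n;
            }
            return total == expected.length && Arrays.equals(expected, actual);
        }
    }

    public static void main(String[] args) throws IOException {
        for (String arg : args) {
            System.out.println(arg + ": " + looksLikePdf(Paths.get(arg)));
        }
    }
}
```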

You could now use this code as the back-end to a website with the Word document being uploaded, and the converted PDF being returned.
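
A rough sketch of such a back-end, using the small HTTP server that ships with the JDK, might look like the following. The `OfficeConverter` interface is a hypothetical hook standing in for the conversion step; the `/convert` path and the class names are likewise assumptions for illustration. A real service would also need upload size limits, authentication and logging.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;

final class ConvertEndpoint {
    /** Hypothetical hook: receives the uploaded Office file, returns PDF bytes. */
    interface OfficeConverter {
        byte[] toPdf(byte[] officeBytes) throws Exception;
    }

    /** Builds (but does not start) a server that converts POSTs to /convert. */
    static HttpServer create(int port, OfficeConverter converter) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/convert", exchange -> handle(exchange, converter));
        return server;
    }

    private static void handle(HttpExchange exchange, OfficeConverter converter)
            throws IOException {
        try {
            if (!"POST".equals(exchange.getRequestMethod())) {
                exchange.sendResponseHeaders(405, -1); // -1: no response body
                return;
            }
            byte[] pdf;
            try {
                pdf = converter.toPdf(readAll(exchange.getRequestBody()));
            } catch (Exception e) {
                exchange.sendResponseHeaders(500, -1); // conversion failed
                return;
            }
            exchange.getResponseHeaders().set("Content-Type", "application/pdf");
            exchange.sendResponseHeaders(200, pdf.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(pdf);
            }
        } finally {
            exchange.close();
        }
    }

    // Reads the whole stream into memory; fine for a sketch, not for huge uploads.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return buffer.toByteArray();
    }
}
```

To try it, call `ConvertEndpoint.create(8080, converter).start()` with a converter of your own. One way to write that converter would be to save the upload to a temporary file, run the conversion from the sample on it, and read the resulting PDF back.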

How does the sample code work?


There are two different functions in the sample: simpleDocxConvert and flexibleDocxConvert.

In this blog I will just look at the simpler method.

// input_path and output_path are defined elsewhere in the sample class
public static void simpleDocxConvert(String inputFilename, String outputFilename) {
    // try-with-resources closes the PDFDoc when the block exits
    try (PDFDoc pdfdoc = new PDFDoc()) {

        // perform the conversion with no optional parameters
        Convert.officeToPdf(pdfdoc, input_path + inputFilename, null);

        // save the result
        pdfdoc.save(output_path + outputFilename, SDFDoc.SaveMode.INCREMENTAL, null);
        // output PDF pdfdoc

        // And we're done!
        System.out.println("Done conversion " + output_path + outputFilename);
    } catch (PDFNetException e) {
        System.out.println("Unable to convert MS Office document, error:");
        e.printStackTrace();
        System.out.println(e);
    }
}

Of this code, only three lines do any real work: one creates the PDFDoc, one calls Convert.officeToPdf, and one saves the result. The rest is logging and error handling.

That really is very simple. Talk about simply getting a great result!

I will write more about the function flexibleDocxConvert in a later blog, but, for now, it is enough to say that it illustrates how the code can be used with various options, or within a multithreaded environment to monitor and cancel conversions.

Does Our SDK Do More Than Just Convert Word Documents to PDF?

Absolutely.

In addition to converting from Word, the SDK can convert from Excel and PowerPoint to PDF, and even supports legacy Office formats: .doc, .xls and .ppt.

At its simplest, all that needs to be done to convert from other Office document types is to include the file extension when passing the file to the conversion method.
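
Since the file extension drives the conversion, it can be worth checking the extension before handing a file over. The sketch below covers only the Word, Excel and PowerPoint extensions mentioned in this post; the SDK may accept others, and `OfficeExtensions` is a hypothetical helper rather than an SDK class.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class OfficeExtensions {
    // Modern and legacy Word, Excel and PowerPoint extensions named in this post.
    private static final Set<String> SUPPORTED = new HashSet<>(Arrays.asList(
            "docx", "xlsx", "pptx", "doc", "xls", "ppt"));

    // Case-insensitive check on the text after the last dot in the file name.
    static boolean isOfficeFile(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return false; // no extension, or the name ends with a dot
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return SUPPORTED.contains(ext);
    }
}
```

A caller could use `OfficeExtensions.isOfficeFile(name)` to reject unexpected uploads early, before any conversion is attempted.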

simpleDocxConvert isn’t a great name for the function, as it suggests that the Apryse SDK is less powerful than it really is. The name belongs to the sample code, not the SDK, so feel free to change it if you want.

For example, an Excel Spreadsheet can be converted into a multiple page PDF with each page laid out in the same way as the original spreadsheet by copying the file to the TestFiles folder then using:

simpleDocxConvert("Cashflow.xlsx", "Cashflow.pdf");

The SDK is clever enough to know that .xlsx means that a conversion from Excel to PDF is required.

Figure 9 - An example Excel spreadsheet now converted to a PDF

In the same way, PowerPoint presentations can be converted into multi-page PDFs using:

simpleDocxConvert("WW1Cryptography.pptx", "WW1Cryptography.pdf");

Figure 10 - A PowerPoint presentation now converted to a PDF

Conclusion


Apryse offers a simple mechanism for converting Office documents to PDF without the need for Office to be installed. This can be done with just a few lines of code that use default options. More complex options exist to allow the conversion mechanism to be tailored to your requirements.

These powerful conversion capabilities, coupled with the ease of integration provided by its Java library, make the Apryse SDK the best choice for developers aiming to enhance their document processing workflows. Whether you're building a document management system, an online collaboration platform, or any other application involving Office documents, Apryse can help you provide a seamless and efficient conversion process.

In addition to converting Office documents to PDF, Apryse offers many tools for editing and handling both Office Documents and PDFs, including converting PDFs into Office documents.

If you want to see this code in action, the website https://xodo.com uses the SDK for creating PDFs from Word documents, Excel spreadsheets and PowerPoint presentations. When you are ready to get started, see the documentation for the SDK. Don’t forget, you can also reach out to us on Discord if you have any issues.
