By Apryse | 2023 Aug 30
Tags
office conversion
java
docx to pdf
In the modern business landscape, sharing and archiving documents in a reliable and universally accessible format is paramount. Portable Document Format (PDF) stands out as a versatile and widely supported format that ensures consistent presentation across different platforms. Converting Office documents such as Word, Excel, and PowerPoint files into PDFs is a common requirement.
The Apryse PDF SDK can be used with multiple languages and frameworks to convert Office documents to PDF, and it supports both client-side (within the browser) and server-side conversion.
The Apryse Java library can be downloaded from the Apryse website and contains dozens of samples. In this blog, we will walk through getting started with the sample for converting a Word document to PDF, then extend it to see some of the other functionality that is available.
This blog was written using version 10.3.0 of the Apryse SDK and a Trial license key.
If you have Office installed on your own machine, then creating a PDF is simple.
But what do you do if you don’t have Office installed?
Theoretically, a server could be set up that uses Office to perform the conversions within the Web server backend, but there are several issues with that:
The awesome thing about the Apryse solution is that there is no need for Office to be installed, either on the user’s machine, or on a server, for the conversion to occur.
That’s right. The conversion occurs entirely without the need for Office.
The sample project is intended to show how specific documents in a hard-coded location can be converted to PDF.
In reality, you are likely to want to be able to specify which document is to be converted and what you want to do with the PDF once the conversion is complete. As such, the code should be considered to be an example of how to convert a file and see the result, rather than as a template of how to write an entire document processing solution.
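As a minimal sketch of choosing the input file at run time instead of hard-coding it, the helper below derives the output PDF name from an input path given on the command line. The class `OutputNames` and its methods are hypothetical names made up for this example; they are not part of the Apryse sample or the SDK, and the sketch only works out file names without performing any conversion.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper, not part of the Apryse sample: works out where a converted PDF should go. */
final class OutputNames {

    /** Replaces the final extension with ".pdf", or appends ".pdf" if there is no extension. */
    static String toPdfName(String inputFilename) {
        int dot = inputFilename.lastIndexOf('.');
        // A dot at index 0 (for example ".hidden") is treated as part of the name, not an extension.
        String stem = dot > 0 ? inputFilename.substring(0, dot) : inputFilename;
        return stem + ".pdf";
    }

    /** Builds the output path: the input's file name with a .pdf extension, inside outputDir. */
    static Path outputPathFor(Path input, Path outputDir) {
        return outputDir.resolve(toPdfName(input.getFileName().toString()));
    }

    public static void main(String[] args) {
        if (args.length < 2) {
            System.err.println("usage: OutputNames <input-file> <output-dir>");
            return;
        }
        Path out = outputPathFor(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println("Would write PDF to " + out);
    }
}
```

The resulting file name could then be passed to whatever conversion method you use, in place of a hard-coded name.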
There is a great guide to getting started available from https://docs.apryse.com/try-now/ which will lead you through the steps to creating an application on Windows, macOS or Linux.
Figure 1 - Getting started
You will need the JDK installed and a trial license key that can be downloaded from the Apryse website. If you are having trouble with the JDK, then please see https://docs.apryse.com/documentation/java/faq/
If you don't already have an Apryse account, go to https://dev.apryse.com and register a new account. This allows Apryse to grant you a demo license key which will be used with the Apryse SDK to enable demo functionality.
Figure 2 - The Developer Portal
Log into https://dev.apryse.com with your registered account. The Java SDK is available for Windows, Linux, and macOS, so select the platform that you are using.
Click on the reveal button to get your personalized Trial key.
Figure 3 - Download Center Platform and Trial Key
Having selected the platform and found the trial key, if you scroll a little further, you will see the multitude of languages that are available for download.
Figure 4 - Just some of the supported languages for the SDK. The list of options varies slightly between Windows, Linux and macOS.
The SDK is a zip file called PDFNetJava.zip. This contains binary files for Windows, Linux and macOS, which is typical for Java libraries.
Extract that file to a location of your choice. I chose to extract it to a folder called source. Within the folder there are the executable files, documentation, and a lot of samples.
Within the Samples folder there is a folder called OfficeToPDFTest with a subfolder for JAVA.
Open a terminal and navigate to this folder.
Within the file OfficeToPDFTest.java you will need to enter the trial license key that you have already acquired. To do this, replace ‘PDFTronLicense.Key()’ with your actual key. I used VSCode, but you can use whatever editor you prefer.
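If you would rather not paste the key into source code, one option is to read it from an environment variable. The following is only a sketch of that idea: the variable name `APRYSE_LICENSE_KEY` and the class `LicenseKeySource` are assumptions chosen for this example, not something the SDK defines or looks for.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper, not part of the Apryse sample: looks up a license key outside the source code. */
final class LicenseKeySource {

    // Assumed variable name for this example only; the SDK does not read it itself.
    static final String ENV_VAR = "APRYSE_LICENSE_KEY";

    /** Returns the trimmed key if the variable is set and not blank, otherwise an empty Optional. */
    static Optional<String> fromEnvironment(Map<String, String> env) {
        String value = env.get(ENV_VAR);
        if (value == null || value.trim().isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(value.trim());
    }

    public static void main(String[] args) {
        Optional<String> key = fromEnvironment(System.getenv());
        System.out.println(key.isPresent()
                ? "License key found"
                : "Set " + ENV_VAR + " before running");
    }
}
```

The value returned could then be used in the sample wherever ‘PDFTronLicense.Key()’ appears.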
This code specifies the files that should be converted from Office to PDF as hardcoded locations within the zip file that we downloaded.
This is just one of the Word document files, but feel free to investigate the others.
Figure 5 - The first page of one of the Word documents that will be converted by the sample code.
You can start the conversion by entering cmd.exe /c RunTest.bat from PowerShell, or use whatever method you prefer for the platform that you are using.
After a few moments the terminal will show the results of the function.
Now navigate to the Output folder within the samples library.
You can see that there are three files there that have recently been created.
Opening each in turn will show that we have created PDFs from the original documents.
Figure 6 - A PDF that was created from a Word document
Figure 7 - Another PDF that was created from a Word document
Figure 8 - A PDF that was created from an Arabic language Word document, illustrating right-to-left text.
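If you would rather check the output from code than open each file by hand, a quick sanity check is to confirm that each file starts with the PDF header bytes `%PDF-`. This sketch uses only the Java standard library; the class `PdfHeaderCheck` is a hypothetical name, and the check confirms only the header, not that the whole file is a valid PDF.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper, not part of the Apryse sample: checks for the "%PDF-" file header. */
final class PdfHeaderCheck {

    private static final byte[] MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /** True if the given bytes begin with "%PDF-". */
    static boolean hasPdfHeader(byte[] bytes) {
        if (bytes.length < MAGIC.length) {
            return false;
        }
        for (int i = 0; i < MAGIC.length; i++) {
            if (bytes[i] != MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    /** Reads only the first few bytes of the file and checks them against the header. */
    static boolean looksLikePdf(Path file) throws IOException {
        byte[] head = new byte[MAGIC.length];
        int filled = 0;
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            // Keep reading until the buffer is full or the file ends.
            while (filled < head.length
                    && (n = in.read(head, filled, head.length - filled)) > 0) {
                filled += n;
            }
        }
        return filled == head.length && hasPdfHeader(head);
    }

    public static void main(String[] args) throws IOException {
        for (String name : args) {
            boolean ok = looksLikePdf(Paths.get(name));
            System.out.println(name + ": " + (ok ? "PDF header found" : "no PDF header"));
        }
    }
}
```

Pass the paths of the files in the Output folder as arguments to check them all in one run.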
You could now use this code as the back-end to a website with the Word document being uploaded, and the converted PDF being returned.
There are two different functions in the sample: simpleDocxConvert and flexibleDocxConvert.
In this blog I will just look at the simpler method.
public static void simpleDocxConvert(String inputFilename, String outputFilename) {
    // try-with-resources closes the PDFDoc when the block exits
    try (PDFDoc pdfdoc = new PDFDoc()) {
        // perform the conversion with no optional parameters
        Convert.officeToPdf(pdfdoc, input_path + inputFilename, null);
        // save the result
        pdfdoc.save(output_path + outputFilename, SDFDoc.SaveMode.INCREMENTAL, null);
        // And we're done!
        System.out.println("Done conversion " + output_path + outputFilename);
    } catch (PDFNetException e) {
        System.out.println("Unable to convert MS Office document, error:");
        e.printStackTrace();
        System.out.println(e);
    }
}
Only three lines of this code do any real work: creating the document, converting the file, and saving the result. The rest is logging and error handling.
That really is very simple.
The Word document is converted to a PDF in just three lines of code.
Talk about simply getting a great result!
I will write more about the function flexibleDocxConvert in a later blog, but, for now, it is enough to say that it illustrates how the code can be used with various options, or within a multithreaded environment to monitor and cancel conversions.
Word is not the only format that can be converted.
In addition to converting from Word, the SDK can convert from Excel and PowerPoint to PDF, and even supports legacy Office formats: .doc, .xls and .ppt.
At its simplest, all that needs to be done to convert from other Office document types is to include the file extension when passing the file to the conversion method.
The function simpleDocxConvert isn’t a great name, as it suggests that the Apryse SDK is less powerful than it really is, but that is part of the sample code, not part of the SDK, so feel free to change it if you want.
For example, an Excel Spreadsheet can be converted into a multiple page PDF with each page laid out in the same way as the original spreadsheet by copying the file to the TestFiles folder then using:
simpleDocxConvert("Cashflow.xlsx", "Cashflow.pdf");
The SDK is clever enough to know that .xlsx means that a conversion from Excel to PDF is required.
Figure 9: An example Excel spreadsheet now converted to a PDF
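Because the file extension matters, it can be worth checking it before handing a file to the converter. The sketch below covers only the extensions named in this post; the class `OfficeExtensions` is a hypothetical helper written for this example, and it does not reflect how the SDK detects file types internally or the full list of formats it accepts.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper, not part of the Apryse sample: a caller-side check of file extensions. */
final class OfficeExtensions {

    // Only the extensions mentioned in this post; the SDK may accept others.
    private static final Set<String> KNOWN = new HashSet<>(
            Arrays.asList("docx", "xlsx", "pptx", "doc", "xls", "ppt"));

    /** Returns the lower-cased text after the last dot, or "" if there is no extension. */
    static String extensionOf(String filename) {
        int dot = filename.lastIndexOf('.');
        if (dot < 0 || dot == filename.length() - 1) {
            return "";
        }
        return filename.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    /** True if the file name ends with one of the Office extensions listed above. */
    static boolean isKnownOfficeFile(String filename) {
        return KNOWN.contains(extensionOf(filename));
    }

    public static void main(String[] args) {
        for (String name : args) {
            System.out.println(name + ": "
                    + (isKnownOfficeFile(name) ? "Office extension" : "other extension"));
        }
    }
}
```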
In the same way, PowerPoint presentations can be converted into multi-page PDFs using:
simpleDocxConvert("WW1Cryptography.pptx", "WW1Cryptography.pdf");
Figure 10: A PowerPoint presentation now converted to a PDF
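Since the same method call handles every file type, converting several documents is just a loop. The sketch below shows the shape of such a loop with a pluggable converter so that it runs without the SDK; `BatchConvert` and `convertAll` are hypothetical names, and in the sample class the converter argument could be a method reference to simpleDocxConvert.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiConsumer;

/** Hypothetical helper, not part of the Apryse sample: runs one conversion per input/output pair. */
final class BatchConvert {

    /** Calls the converter once per (input, output) pair, in order, and returns how many ran. */
    static int convertAll(Map<String, String> jobs, BiConsumer<String, String> converter) {
        int count = 0;
        for (Map.Entry<String, String> job : jobs.entrySet()) {
            converter.accept(job.getKey(), job.getValue());
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the jobs in insertion order.
        Map<String, String> jobs = new LinkedHashMap<>();
        jobs.put("Cashflow.xlsx", "Cashflow.pdf");
        jobs.put("WW1Cryptography.pptx", "WW1Cryptography.pdf");

        // Stand-in converter that only prints what it would convert.
        int done = convertAll(jobs, (in, out) -> System.out.println(in + " -> " + out));
        System.out.println(done + " conversions requested");
    }
}
```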
Apryse offers a simple mechanism for converting Office documents to PDF without the need for Office to be installed. This can be done with just a few lines of code that use default options. More complex options exist to allow the conversion mechanism to be tailored to your requirements.
These powerful conversion capabilities, coupled with the ease of integration provided by its Java library, make the Apryse SDK the best choice for developers aiming to enhance their document processing workflows. Whether you're building a document management system, an online collaboration platform, or any other application involving Office documents, Apryse can help you provide a seamless and efficient conversion process.
In addition to converting Office documents to PDF, Apryse offers many tools for editing and handling both Office Documents and PDFs, including converting PDFs into Office documents.
To see this code in action, visit https://xodo.com, which uses the SDK to create PDFs from Word documents, Excel spreadsheets and PowerPoint presentations. When you are ready to get started, see the documentation for the SDK. Don’t forget, you can also reach out to us on Discord if you have any issues.