How to Convert Office Documents to PDF on a Server Without Installing Office using C++.

By Apryse | 2023 Sep 21

Learn how to seamlessly convert Office documents to PDF on a server without the need to install Microsoft Office, leveraging the power of C++ for efficient document processing with Apryse.

Introduction

The ability to share and archive documents in a reliable, accessible format is an essential requirement for many businesses. While Office document formats (whether Word, Excel or PowerPoint) are excellent for generating content, they are less suitable for sharing because the same file can look different in different versions of the software: what looks good on one machine may look terrible on another.

On the other hand, Portable Document Format (PDF) provides a versatile and widely supported format that ensures consistent presentation across different platforms.

As such, converting Office documents into PDFs is a common requirement. While this is easy if Office is available, it is more difficult if Office is not installed, for example on a web server or a Linux machine.

This is where the Apryse PDF SDK offers a solution. It can be used with multiple programming languages and frameworks to convert Office Documents to PDF without the need for Office to be installed. There is even support for both client-side (within the browser) and server-side conversions.

This blog includes:

  • How to download, and run, a simple program written in C++
  • An explanation of how the code works, and how it can be extended
  • Examples of how the Apryse SDK can be used to create PDFs from DOCX, PPTX and XLSX files
  • Advice on the next steps that can be taken to include this in your workflow.


The Apryse C++ library contains not just the executables, but also dozens of samples, and is available for Windows, Linux and macOS. In this blog we will walk through getting started with the sample for converting a Word document to PDF, then extend it to see some of the other functionality that is available.

But Surely Creating a PDF is Easy, You Can Just Use 'Export from Word'

If you have Office installed, then creating a PDF using that on your own machine is simple.

Theoretically a server could be set up that uses Office to perform the conversions within the Web server backend, but there are several issues with that:

  • Some versions of Office are not recommended to be used as a service component
  • Office sometimes brings up modal user dialogs, particularly when an update is needed, and since services have no UI, the dialog can never be dismissed, and the application will hang
  • Licensing Office in this way is different from licensing it for a single-user desktop machine, so there is additional complexity in using this solution in a legal, licensed way.

The awesome thing about the Apryse solution is that there is no need for Office to be installed, either on the user’s machine, or on a server, for the conversion to occur.

Why Would you Want to Convert on a Server Rather Than in the Browser?

While there are some advantages of performing the conversion in a browser, conversion on a server is likely to be faster for complex large files, and is likely to result in the same conversion every time. Converting a file relies, at least in part, on locally available fonts, so a browser-based conversion performed on one machine could potentially look different from the conversion of the same file that was performed on a different machine which had different fonts available.

Apryse offers solutions for both server-side and within-browser conversions, but in this example, we will just look at the server-side option.

See here for a blog about performing conversions within-browser.

Sample Project for Converting a Document to PDF

The sample project is intended to show how documents, in a hard coded location, can be converted to PDF.

You are likely to want to be able to specify which document is to be converted, and what to do with the PDF once the conversion is complete. As such the code should be considered as an example of how to convert a file and see the result, rather than as a template of how to write an entire document processing solution.
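
As a rough illustration of the first point, the sketch below shows one way to take the input document, and optionally the output PDF, from command-line style arguments instead of hard-coding them. It uses only the standard library and is not part of the Apryse sample: ConversionJob, ParseJob and DefaultPdfName are hypothetical names invented for this example.

```cpp
#include <string>
#include <vector>

// Hypothetical description of one conversion request.
struct ConversionJob {
    std::string input;   // Office document to read
    std::string output;  // PDF to write
};

// Hypothetical helper: swap the file extension for ".pdf",
// e.g. "report.docx" becomes "report.pdf".
std::string DefaultPdfName(const std::string& input)
{
    const std::string::size_type slash = input.find_last_of("/\\");
    const std::string::size_type dot = input.find_last_of('.');
    // Only treat the dot as an extension separator when it sits inside the
    // file name and is not the file name's first character (".hidden").
    const bool has_ext = dot != std::string::npos && dot != 0 &&
                         (slash == std::string::npos || dot > slash + 1);
    return (has_ext ? input.substr(0, dot) : input) + ".pdf";
}

// Build a job from command-line style arguments (program name excluded):
//   args[0] = input document (required)
//   args[1] = output PDF (optional, defaults to the input name with ".pdf")
// Returns false, leaving job untouched, when no input was supplied.
bool ParseJob(const std::vector<std::string>& args, ConversionJob& job)
{
    if (args.empty() || args[0].empty()) {
        return false;
    }
    job.input = args[0];
    job.output = (args.size() > 1 && !args[1].empty())
                     ? args[1]
                     : DefaultPdfName(args[0]);
    return true;
}
```

The resulting job.input and job.output could then be handed to whatever function performs the conversion, in place of the hard-coded file names.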

There is a great guide to getting started available from https://docs.apryse.com/try-now/ which will lead you through the steps to creating an application on Windows, macOS or Linux.

Figure 1 - the start of the guide for getting started

Prerequisites

The prerequisites depend on the platform but are outlined in the guide for that platform. In my case I was using Visual Studio 2022 with the Desktop development with C++ workload and the Windows 11 SDK.

How to Get an Apryse SDK Trial Key

If you don't already have an Apryse account, go to https://dev.apryse.com and register a new account. This allows Apryse to grant you a demo license key which will be used with the Apryse SDK to enable demo functionality.

Figure 2 - The Developer Portal

Log into https://dev.apryse.com with your registered account. The C++ SDK is available for Windows, Linux and macOS, so select the platform that you are using.

Click on the reveal button to get your personalized Trial key.

Figure 3 - Download Center Platform and Trial Key.

How to Get the Apryse SDK

Having selected the platform, and found the trial key, if you scroll a little further, you will see options to help you get started with a multitude of languages.

Select the C++ SDK option. This will take you to a page where you can download the SDK for the bit-ness and platform that you require.

Figure 4 – the start of the Getting started guide. The options differ between platforms.

The SDK is a zip file called PDFNetC64.zip.

Extract that file to a location of your choice. Within the extracted folder there are the executable files, documentation, and a lot of samples.

Figure 5 - the contents of the SDK downloaded file after extraction.

We will use Visual Studio 2022 for development and Apryse SDK version 10.3. This is later than the version that the guide covers, so we will need to make some changes as we go.

  • Before going further, you will need to save the license key that you have previously downloaded into the file LicenseKey.h.

Figure 6 - Location of the License Key file that will be used by the project.

Within the Samples folder there is a folder called OfficeToPDFTest. This contains specific folders for C++, C# and Java.

Figure 7 - the contents of the CPP sample folder.

There are a lot of project files there that target different versions of Visual Studio. However, there is no option for Visual Studio 2022, so when I open the file OfficeToPDFTestVC2019.vcxproj I am given the option to retarget the project.

Figure 8 - the option shown in Visual Studio 2022 when the project is opened.

  • Keep the defaults and press OK.

This code specifies the files that should be converted from Office to PDF as hardcoded locations within the zip file that we downloaded.

This is just one of the Word document files, but feel free to investigate the others.

Figure 9 - The location where files to be converted should be placed, before the program has run.

  • Run the program using Visual Studio. After a few seconds the processing will complete, and files will have been created in the output folder.

Figure 10 - The location where files to be converted should be placed, showing the newly created PDF after the program has run.

If you compare the original Word document and the newly created PDF, you can see that they look identical.

Figure 11 - The original document in Word.

Figure 12 - The newly created PDF (shown in Chrome).

Figure 13 - Another PDF that was created from a Word document

Figure 14 - A PDF that was created from an Arabic language Word document, illustrating right-to-left text.

You could now use this code as the back-end to a website with the Word document being uploaded, and the converted PDF being returned.
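
If you do go down that route, the name supplied with an upload should not be trusted as a path on the server. The following is a minimal sketch, assuming uploads are written to a folder of your choosing, of reducing a client-supplied name to a bare file name. SafeUploadName is a hypothetical helper, not part of the Apryse SDK, and the allow-list of characters is just one conservative choice.

```cpp
#include <string>

// Hypothetical helper: reduce a client-supplied name to a bare file name
// that is safe to append to a server-side upload folder.
std::string SafeUploadName(const std::string& client_name)
{
    // Drop any directory part, whichever separator style the client used.
    const std::string::size_type slash = client_name.find_last_of("/\\");
    std::string name = (slash == std::string::npos)
                           ? client_name
                           : client_name.substr(slash + 1);

    // Replace every character outside a conservative allow-list.
    for (char& c : name) {
        const bool allowed = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
                             (c >= '0' && c <= '9') ||
                             c == '.' || c == '-' || c == '_';
        if (!allowed) {
            c = '_';
        }
    }

    // Never return an empty name or a directory reference.
    if (name.empty() || name == "." || name == "..") {
        return "upload";
    }
    return name;
}
```

With this sketch, a client name such as ../../etc/passwd.docx is reduced to passwd.docx before it is combined with the upload folder.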

How Does the Sample Code Work?

There are two different functions in the sample: SimpleDocxConvert and FlexibleDocxConvert.

In this blog I will just look at the simpler one.

void SimpleDocxConvert(UString input_filename, UString output_filename)
{
    // input_path and output_path are the hard-coded folder locations
    // defined elsewhere in the sample.

    // Start with a PDFDoc (the conversion destination)
    PDFDoc pdfdoc;

    // Perform the conversion with no optional parameters
    Convert::OfficeToPDF(pdfdoc, input_path + input_filename, NULL);

    // Save the result
    pdfdoc.Save(output_path + output_filename, SDF::SDFDoc::e_linearized, NULL);

    // And we're done!
    std::cout << "Saved " << output_filename << std::endl;
}

Only three lines of this function do any real work: creating the PDFDoc, calling Convert::OfficeToPDF, and saving the result. The rest is comments and logging.

It's really that simple.

The Word document is converted to a PDF in just three lines of code.

Talk about simply getting a great result!
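
In a server setting you will probably want a failed conversion to be reported rather than to bring the process down. As a sketch of one way to do that, the wrapper below runs any conversion callable and turns exceptions into a simple result. ConversionResult and RunConversion are hypothetical names, and the code deliberately uses only the standard library; which exception types the SDK actually throws should be checked against its documentation.

```cpp
#include <exception>
#include <functional>
#include <string>

// Hypothetical outcome of one conversion attempt.
struct ConversionResult {
    bool ok;              // true when the callable returned normally
    std::string message;  // "converted", or a description of the failure
};

// Runs the supplied callable and reports success or failure instead of
// letting an exception escape. The callable is assumed to wrap the real
// work, for example a call to the sample's conversion function.
ConversionResult RunConversion(const std::function<void()>& convert)
{
    if (!convert) {
        return ConversionResult{false, "no conversion supplied"};
    }
    try {
        convert();
        return ConversionResult{true, "converted"};
    } catch (const std::exception& e) {
        // Standard exceptions carry a message that is worth logging.
        return ConversionResult{false, e.what()};
    } catch (...) {
        // Anything else is reported generically.
        return ConversionResult{false, "unknown error"};
    }
}
```

A caller can then log result.message and return an error response to the client when result.ok is false, while the server carries on handling other requests.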

We will write more about the function FlexibleDocxConvert in a later blog, but, for now, it is enough to say that it illustrates how the code can be used with various options, or within a multithreaded environment to monitor and cancel conversions.
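
Until then, the general shape of a cancellable, page-by-page loop can be sketched without any SDK calls. Everything here is an assumption made for illustration: ConvertPageByPage, LoopOutcome and the convert_next_page callback are hypothetical, and the real sample may structure this differently.

```cpp
#include <atomic>
#include <functional>

// Hypothetical result of the loop below.
enum class LoopOutcome { Completed, Cancelled };

// Calls convert_next_page() repeatedly. The callback is assumed to return
// true when it converted a page and false when no pages remain.
// Another thread may set `cancel` to stop the loop between pages.
// pages_done reports how many pages were converted before the loop ended.
LoopOutcome ConvertPageByPage(const std::function<bool()>& convert_next_page,
                              const std::atomic<bool>& cancel,
                              int& pages_done)
{
    pages_done = 0;
    while (!cancel.load()) {
        if (!convert_next_page()) {
            return LoopOutcome::Completed;  // nothing left to convert
        }
        ++pages_done;  // a monitoring hook could report progress here
    }
    return LoopOutcome::Cancelled;
}
```

The cancel flag is only checked between pages, so a request to stop takes effect after the page currently being converted has finished.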

Can the SDK do More than Convert Just Word Documents to PDF?

Absolutely.

In addition to converting from Word, the SDK can convert from Excel and PowerPoint to PDF, and even supports legacy Office formats: .doc, .xls and .ppt.

At its simplest, all that needs to be done to convert from other Office document types is to include the file extension when passing the file to the conversion method.

SimpleDocxConvert isn’t a great name for the function, as it suggests that the Apryse SDK is less powerful than it really is. A better name might be SimpleLotsOfFileFormatConvert, but that is rather a mouthful!

For example, an Excel Spreadsheet can be converted into a multiple page PDF with each page laid out in the same way as the original spreadsheet by copying the file to the TestFiles folder then using:

SimpleDocxConvert("./Cashflow.xlsx", "./Cashflow.pdf");

The SDK is clever enough to know that .xlsx means that a conversion from Excel to PDF is required.
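
The SDK does this detection itself, but a server may still want to reject obviously unsupported uploads before attempting a conversion. The sketch below checks a file name against the six extensions mentioned in this blog; LooksLikeOfficeFile and LowerExtension are hypothetical helpers, and the list is not a complete statement of what the SDK supports.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: the lower-cased extension including the dot,
// or an empty string when the file name has no extension.
std::string LowerExtension(const std::string& path)
{
    const std::string::size_type slash = path.find_last_of("/\\");
    const std::string::size_type dot = path.find_last_of('.');
    if (dot == std::string::npos) {
        return "";
    }
    // A dot that belongs to a folder name is not a file extension.
    if (slash != std::string::npos && dot < slash) {
        return "";
    }
    std::string ext = path.substr(dot);
    std::transform(ext.begin(), ext.end(), ext.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return ext;
}

// Hypothetical pre-check: true when the name ends in one of the Office
// extensions mentioned in this blog. This is a convenience filter only;
// it says nothing about whether the file contents are valid.
bool LooksLikeOfficeFile(const std::string& path)
{
    static const char* const kOfficeExtensions[] = {
        ".docx", ".xlsx", ".pptx", ".doc", ".xls", ".ppt"};
    const std::string ext = LowerExtension(path);
    for (const char* known : kOfficeExtensions) {
        if (ext == known) {
            return true;
        }
    }
    return false;
}
```

The comparison is case-insensitive, so a file named SLIDES.PPTX passes the check in the same way as slides.pptx.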

Figure 15: An example Excel spreadsheet now converted to a PDF. 

And PowerPoint presentations can be converted in a similar way into multi-page PDFs using:

SimpleDocxConvert("./WW1Cryptography.pptx", "./WW1Cryptography.pdf");

Figure 16: A PowerPoint presentation now converted to a PDF.

Further Examples

The method shown above is extremely simple. However, that means that it does not demonstrate all the possibilities that are available. The FlexibleDocxConvert function supports page ranges and many other options.

Elsewhere in the files that you have already downloaded are dozens of other samples that you may wish to try, demonstrating the range of functionality available in the Apryse SDK.

Conclusion

Apryse offers a simple mechanism for converting Office documents to PDF without Office needing to be installed. This can be done with just a few lines of code that uses default options. More complex options exist to allow the conversion mechanism to be tailored to your requirements.

These powerful conversion capabilities, coupled with the ease of integration provided by its C++ library, make it the best choice for developers aiming to enhance their document processing workflows. Whether you're building a document management system, an online collaboration platform, or any other application involving Office documents, Apryse can help you provide a seamless and efficient conversion process.

With Apryse's capabilities at your disposal, you can enhance your application's functionality and provide users with a reliable way to convert and work with Office documents in PDF format.

In addition to converting Office documents to PDF, Apryse offers many tools for editing and handling both Office Documents and PDFs, including converting PDFs into Office documents.

When you are ready to get started, see the documentation for the SDK. Don’t forget, you can also reach out to us on Discord if you have any issues.
