How to Edit Text in a PDF Using WebViewer JavaScript

By Andrey Safonov, Roger Dunham | 2024 Apr 21

Have you ever come across a converted PDF document and noticed a typo or something out of date, like a heading, title, or revision number? How do we correct this? One approach is to find the original DOCX format file. If the original is unavailable, you might convert from PDF back to Office using a converter such as the Apryse PDF to Word component or a free web-based tool like Xodo.

However, this approach is inefficient in many professional workflows, and it is also less secure. First, your users have to jump through several hoops, using various tools to edit, convert, and upload the file as a new version. Second, converting between formats with most converters is likely to introduce changes, so the edited file will not look like the original.

WebViewer solves this problem by allowing users to edit PDFs and MS Office documents directly in the browser without losing information or having sensitive data leave your application.

Let’s look at how.

Learn more about Apryse's PDF and DOCX editor functionality.

Edit Text in a PDF with WebViewer UI

WebViewer has supported editing PDF text with an out-of-the-box UI since version 8.3, and the feature has improved steadily since then.

To enable editing, call enableElements with 'contentEditButton' in the callback that runs once WebViewer has initialized:

WebViewer({
  path: '/lib',
  preloadWorker: 'contentEdit'
}, document.getElementById('viewer')).then(async instance => {
  instance.UI.enableElements(['contentEditButton']);
});

Notice the new option in the constructor that allows you to preload a PDF text editing worker. This is not required, but can help to improve the user experience. It allows content editing to be initialized faster, but at the expense of slightly slower rendering of WebViewer as a whole. You may wish to use it if you know that users will be editing text.
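
One way to act on this trade-off is to decide at startup whether to preload. The following is a minimal sketch, not a prescribed pattern: `buildViewerOptions` and its `expectEditing` flag are hypothetical names supplied by your own application, and only the `path` and `preloadWorker` options come from the example above.

```javascript
// Hypothetical helper: builds the WebViewer constructor options.
// `expectEditing` is an application-level assumption, not a WebViewer option.
function buildViewerOptions(expectEditing) {
  const options = { path: '/lib' };
  if (expectEditing) {
    // Preload the content-edit worker so edit mode starts faster,
    // accepting a slightly slower initial load of the viewer.
    options.preloadWorker = 'contentEdit';
  }
  return options;
}

// Usage in the browser (WebViewer must already be loaded on the page):
// WebViewer(buildViewerOptions(true), document.getElementById('viewer'))
//   .then(instance => instance.UI.enableElements(['contentEditButton']));
```

Keeping the decision in one place makes it easy to turn preloading off for read-mostly pages.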

After you have enabled the contentEditButton in the code, you will see a new edit tool under the edit tab.

PDF text becomes editable paragraph boxes

You can now select any paragraph and either resize, delete, or edit the text.

The start of editing within the PDF. Since 10.6 a WYSIWYG editor has been available.

The styles are automatically detected and selected. You can now enter new text and set styling options such as Bold, Italic, Underline, Strikethrough, and alignment.

Styling options when editing text in a PDF

To save the edits, simply exit the edit mode. You can also resize or move text by dragging the control handles for the paragraph.

The control handles for a paragraph which allow it to be resized or moved.

WebViewer also detects any embedded images and allows users to select, resize and reposition images. This is useful when, for example, a logo needs to be repositioned.

One of the most popular PDF optimization techniques is to strip font characters and font families that are not in use. In this case, when a user enters the edit mode and types a character that is not embedded in the PDF, the built-in font substitution logic will attempt to pick a font from the same font family.

Learn about updates to WebViewer's real-time WYSIWYG PDF editing.

Edit PDF Text Programmatically

WebViewer allows editing text via APIs that can be utilized to perform edits programmatically or even to help you build your own UI. Let’s preload the worker and enter the edit mode as soon as WebViewer loads.

const { Core } = instance;
const { documentViewer, ContentEditManager } = Core;

// Optional: preload the worker if you know that the user will edit the PDF
const contentEditManager = new ContentEditManager(documentViewer);
Core.ContentEdit.preloadWorker(contentEditManager);

const contentEditTool = documentViewer.getTool(Core.Tools.ToolNames.CONTENT_EDIT);
documentViewer.setToolMode(contentEditTool);

Once the PDF has loaded you may wish to process the content. The editable areas (or paragraphs) are annotations drawn on top of the PDF. You can either loop over all the annotations or get the selected annotation and check if it is editable.
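
Both approaches can be sketched as follows. This is a minimal sketch that assumes `annotationManager` is `instance.Core.annotationManager`; the two helper names are hypothetical, and the code relies only on `getAnnotationsList`, `getSelectedAnnotations`, and `isContentEditPlaceholder`.

```javascript
// Hypothetical helpers; `annotationManager` is assumed to be
// instance.Core.annotationManager.

// Approach 1: loop over every annotation and keep the editable placeholders.
function getEditableAnnotations(annotationManager) {
  return annotationManager
    .getAnnotationsList()
    .filter(annot => annot.isContentEditPlaceholder());
}

// Approach 2: look only at the user's current selection and return the
// first editable placeholder, or null if nothing editable is selected.
function getSelectedEditableAnnotation(annotationManager) {
  const selected = annotationManager.getSelectedAnnotations();
  return selected.find(annot => annot.isContentEditPlaceholder()) || null;
}
```

The first helper suits batch processing; the second suits a custom UI that reacts to whatever the user has clicked.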

The actual content of the annotation is HTML with inline styling. The following example demonstrates one way in which you can update the annotation.

// There are various ways to access the annotations - this is just one example
annotationManager.addEventListener('annotationChanged', async (annotations, action) => {
  if (action === 'add') {
    // @ts-ignore 
    const editAnnotations = annotations.filter(annot => annot.isContentEditPlaceholder());
    if (editAnnotations.length > 0) {
      // @ts-ignore 
      editAnnotations.forEach(async annot => {
        const content = await Core.ContentEdit.getDocumentContent(annot);

        // You could pass the content to a library that can display rich text, for example Quill,
        // but for now just log it
        console.log(content);
      });

      // Later, after the content has been edited, it can be written back to the page.
      // For now a hard-coded string is used to demonstrate this.
      const newContent = '<p><span style="font-family: SourceSansProSemi;font-weight: bold;font-size: 30px;color: #444444;">Important Factors when Choosing a PDF Library</span></p>';
      await Core.ContentEdit.updateDocumentContent(editAnnotations[0], newContent);
    }
  }
});

Effect of programmatically updating the content of the edit Annotation. In this case the number '6' is being removed programmatically
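
Because the content is plain HTML with inline styles, the replacement string can be assembled in code rather than hard-coded. The sketch below uses two hypothetical helpers, `escapeHtml` and `buildParagraphHtml`; the style properties mirror the hard-coded string in the example above. It assumes the style values are trusted, and it does not verify which CSS properties the editor actually honors.

```javascript
// Hypothetical helper: escape characters that are special in HTML.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical helper: wrap text in a paragraph with inline styling.
// `style` maps CSS property names to values, which are assumed to be trusted.
function buildParagraphHtml(text, style) {
  const css = Object.entries(style)
    .map(([prop, value]) => `${prop}: ${value};`)
    .join('');
  return `<p><span style="${css}">${escapeHtml(text)}</span></p>`;
}

const newContent = buildParagraphHtml('Important Factors when Choosing a PDF Library', {
  'font-family': 'SourceSansProSemi',
  'font-weight': 'bold',
  'font-size': '30px',
  'color': '#444444',
});
```

The resulting `newContent` string can then be passed to `Core.ContentEdit.updateDocumentContent` in the same way as the hard-coded one.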

If you want to differentiate between an editable text area and an image, you can do so by calling getContentEditType.


if (annotation.getContentEditType() === Core.ContentEdit.Types.TEXT) {
  // this has text that can be updated
}
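
When several editable areas are present, the same check can be used to sort them into groups. This is a minimal sketch with a hypothetical helper, `splitByContentEditType`, which assumes `Core` is `instance.Core`. It compares only against `Core.ContentEdit.Types.TEXT` and treats everything else as non-text.

```javascript
// Hypothetical helper; `Core` is assumed to be instance.Core.
function splitByContentEditType(Core, annotations) {
  const text = [];
  const other = [];
  annotations
    .filter(annot => annot.isContentEditPlaceholder())
    .forEach(annot => {
      if (annot.getContentEditType() === Core.ContentEdit.Types.TEXT) {
        text.push(annot); // paragraphs whose HTML content can be updated
      } else {
        other.push(annot); // anything else, for example embedded images
      }
    });
  return { text, other };
}
```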

You can also delete annotations, or move the editable areas, paragraphs or images:


annotationManager.deleteAnnotation(myContentEditAnnotation);

myOtherContentEditAnnotation.X = 50;
// Listeners receive (annotations, action, info), so the payload follows that order
annotationManager.trigger(Core.AnnotationManager.Events.ANNOTATION_CHANGED, [[myOtherContentEditAnnotation], 'modify', {}]);

Events for Text Edits

You can set up analytics or have an audit trail for paragraphs or areas that have been edited. Since the editable areas are just annotations with HTML, you can tap into the same events.

const { Core } = instance;
const { annotationManager } = Core;

annotationManager.addEventListener('annotationChanged', (annotations, action) => {
  annotations.forEach(async (annot) => {
    if (annot.isContentEditPlaceholder()) {
      const content = await Core.ContentEdit.getDocumentContent(annot);
      console.log(content);
    }
  });
});

The annotation has properties that include who made the change and when, which you can access within your implementation.
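
For an audit trail, those properties can be gathered into a plain record. The following is a minimal sketch: `toAuditEntry` and the shape of the record are hypothetical, and it assumes the annotation's `Id`, `PageNumber`, `Author`, and `DateModified` properties are populated when the event fires.

```javascript
// Hypothetical audit-record builder. It assumes Id, PageNumber, Author and
// DateModified are populated on the annotation passed in.
function toAuditEntry(annot, action, content) {
  return {
    annotationId: annot.Id,
    page: annot.PageNumber,
    author: annot.Author || 'unknown',
    modifiedAt: annot.DateModified instanceof Date
      ? annot.DateModified.toISOString()
      : null,
    action,  // 'add', 'modify' or 'delete', as reported by the event
    content, // HTML string obtained from getDocumentContent
  };
}
```

Inside the listener, the entry could be built with `toAuditEntry(annot, action, content)` and sent to whatever logging service your application uses.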

More Key Information

You can count on edits made with this component being written into the document accurately.

In older versions of WebViewer, users were shown a warning the first time they entered edit mode, because annotations such as underlines and highlights might no longer line up with the original text once that text was modified.

Warning message when opening the PDF editor in WebViewer

However, since 10.6 this has not been an issue. Now, if editing a paragraph results in a change in the location of specific text, any underlines or highlights associated with that text will automatically adjust to remain with the correct text.

Wrap up

As we have seen, the Apryse WebViewer allows you to edit PDFs directly, and it also allows you to edit Word documents directly within the browser. We are continuing to extend this functionality.

Have any questions about Apryse’s SDK or WebViewer? Start a chat on Discord with our solutions engineers or reach out to our sales team for a personalized demo.

[This blog was originally written in November 2022. The original version can be found here.]

Andrey Safonov

Director of Product

Roger Dunham
