Compare Word or PDF documents in Python

High-fidelity Python via .NET library to compare two documents in PDF, Word, HTML, TXT, MD and other formats

Using our programming API, you can compare two files and find the differences between them. In other words, our Python via .NET library is a powerful file difference checker. After running the Document Comparison API, you can save the result in DOCX, PDF, DOC and some other formats.

What is Document Compare

Comparing documents is a complex task, and our solution is designed to give you an accurate result. Instead of looking for document differences manually, use our Python via .NET API to compare docs.

Documents are compared at the level of characters or whole words. If only a single character in a word was changed, the whole word is highlighted as changed.
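To illustrate what word-level highlighting means, here is a minimal sketch that uses Python's standard difflib module. It is not how the library works internally; the function name changed_words is made up for this illustration. It only shows that a one-character edit is reported as a change to the whole word.

```python
import difflib


def changed_words(old_text, new_text):
    """Return (tag, old_words, new_words) for every span of words that differs."""
    old_words = old_text.split()
    new_words = new_text.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((tag, old_words[i1:i2], new_words[j1:j2]))
    return changes


# One character differs ("o" became "a"), yet the whole word is reported.
print(changed_words("The quick brown fox", "The quick brawn fox"))
# → [('replace', ['brown'], ['brawn'])]
```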

The most popular use cases are Word comparison and PDF comparison. Therefore, we will show the Python via .NET Comparison API using the examples of comparing Word files and comparing PDF documents.

Compare PDF files using Python

Comparing PDF files programmatically is a typical task in a modern digital workflow. It may be required when you are not sure whether your document has been modified, or when you know your original PDF has been updated and you want to know how.

To compare two PDFs, just run them through our Python via .NET library. It allows you to diff PDFs and find even small changes that would be invisible to the human eye.

Compare Word documents in Python

To compare two Word documents in Python you need to do the same: diff them using our powerful Python via .NET library through the example below.

Unlike PDFs, Word documents are easy to change, which is why it can be so important to compare Word documents if you need to make sure that some parts of a file, or the entire file, are unchanged.

Comparing two files

To test how our Python via .NET solution works and to diff two files, import the files you want to compare and choose an export file format. After the files are compared, a document containing the differences will be downloaded automatically.

Note that documents to compare should not have any revisions before calling the compare method, so we took care of that in our example:

Compare documents in Python
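import aspose.words as aw
from datetime import datetime

docA = aw.Document("Input1.docx")
docB = aw.Document("Input2.docx")

# There should be no revisions before comparison.
docA.accept_all_revisions()
docB.accept_all_revisions()

# Record the differences in docA as revisions, then save the result.
docA.compare(docB, "Author Name", datetime.now())
docA.save("Output.docx")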

How to compare two text files in Python

  1. Install 'Aspose.Words for Python via .NET'
  2. Add a library reference (import the library) to your Python project
  3. Load two documents to compare
  4. Accept all revisions before calling the 'compare()' method
  5. Call the 'compare()' method to compare two docs
  6. Call the 'save()' method, passing an output filename with the required extension
  7. Get the result of the comparison as a separate file
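The steps above can be sketched as a small reusable function. This is an illustration under stated assumptions, not a definitive implementation: the names compare_files, check_output_path and SUPPORTED_OUTPUT are hypothetical, the extension list covers only the formats named on this page, and the use of revisions.count to detect differences reflects our reading of the Aspose.Words API and should be checked against the product documentation.

```python
from datetime import datetime
from pathlib import Path

# Illustrative subset: only the output formats named on this page.
SUPPORTED_OUTPUT = {".docx", ".pdf", ".doc"}


def check_output_path(path):
    """Return path unchanged if its extension is in SUPPORTED_OUTPUT."""
    if Path(path).suffix.lower() not in SUPPORTED_OUTPUT:
        raise ValueError("unsupported output extension: " + str(path))
    return path


def compare_files(path_a, path_b, out_path, author="Author Name"):
    """Compare two documents, save the result, and report whether they differ."""
    # Imported here so check_output_path works without the package installed.
    import aspose.words as aw

    out_path = check_output_path(out_path)
    doc_a = aw.Document(path_a)
    doc_b = aw.Document(path_b)

    # There should be no revisions before comparison.
    doc_a.accept_all_revisions()
    doc_b.accept_all_revisions()

    # Differences are recorded in doc_a as revisions.
    doc_a.compare(doc_b, author, datetime.now())
    differs = doc_a.revisions.count > 0  # assumption: revisions expose a count

    doc_a.save(out_path)
    return differs
```

A call such as compare_files("Input1.docx", "Input2.docx", "Output.pdf") would then write the comparison result as a PDF and return True if any differences were found.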

Python library to compare documents

We host our Python packages in the PyPI repository. Please follow the step-by-step instructions on how to install "Aspose.Words for Python via .NET" in your development environment.
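The package is published on PyPI under the name aspose-words and is typically installed with "pip install aspose-words". As a quick sanity check after installation, a sketch like the following can confirm that the module can be imported. The helper name is_installed is hypothetical, and a failure other than ImportError (for example, a missing system dependency on Linux) would still propagate.

```python
import importlib


def is_installed(module_name="aspose.words"):
    """Return True if the named module can be imported in this environment."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


if not is_installed():
    print("aspose.words is not available; try: pip install aspose-words")
```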

System Requirements

This package is compatible with Python 3.5, 3.6, 3.7, 3.8 and 3.9. If you develop software for Linux, please have a look at additional requirements for gcc and libpython in Product Documentation.
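If you want to check the interpreter before installing, a minimal sketch might look like this. The version range is taken only from the list above; newer releases of the package may support other Python versions, so treat the range as an assumption to verify in the Product Documentation. The names LISTED_MINORS and is_listed_version are hypothetical.

```python
import sys

# Python 3.5 through 3.9, as listed in the system requirements above.
LISTED_MINORS = range(5, 10)


def is_listed_version(version_info=sys.version_info):
    """Return True if the given (major, minor, ...) version is in the listed range."""
    major, minor = version_info[0], version_info[1]
    return major == 3 and minor in LISTED_MINORS


if not is_listed_version():
    print("This Python version is not in the list above; check the Product Documentation.")
```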



© Aspose Pty Ltd 2001-2023. All Rights Reserved.