Redact PDF via Python

Redact sensitive information in PDF documents. Use Aspose.PDF for Python via .NET to modify PDF documents programmatically.

How to Redact a PDF File Using a Python Library

To redact a PDF file, we'll use Aspose.PDF for Python via .NET, a feature-rich, powerful, and easy-to-use document manipulation API for the python-net platform. Install the package from PyPI with pip, using the command pip install aspose-pdf.

Redact PDF documents via Python


You need Aspose.PDF for Python via .NET to try the code in your environment.

  1. Load the PDF with an instance of Document.
  2. Create a TextFragmentAbsorber object with the search term as its argument.
  3. Set the search options.
  4. Loop through the collected text fragments and redact each one.
  5. Save the PDF file.

Redact PDF Files - Python

import aspose.pdf as ap

# Path to the folder that holds the input and output files
dataDir = "..."

# Load the PDF document
doc = ap.Document(dataDir + "sample.pdf")

# Search for the term to redact; True enables regular-expression matching
searchTerm = "AsposePDF"
textFragmentAbsorber = ap.text.TextFragmentAbsorber(searchTerm)
textSearchOptions = ap.text.TextSearchOptions(True)
textFragmentAbsorber.text_search_options = textSearchOptions

# Run the absorber over every page and collect the matches
doc.pages.accept(textFragmentAbsorber)
textFragmentCollection = textFragmentAbsorber.text_fragments

# Cover each match with a black redaction annotation and apply it
for textFragment in textFragmentCollection:
    page = textFragment.page
    annotationRectangle = textFragment.rectangle
    annot = ap.annotations.RedactionAnnotation(page, annotationRectangle)
    annot.fill_color = ap.Color.black
    doc.pages[page.number].annotations.add(annot, True)
    annot.redact()

# Save once, after all matches have been redacted
doc.save(dataDir + "output.pdf")
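
Redact Several Terms in One Pass - Python

The steps above can be wrapped in a reusable function that redacts more than one term at a time. What follows is a minimal sketch, not part of the Aspose.PDF API: build_search_pattern and redact_terms are hypothetical helper names introduced for illustration, and the sketch assumes that the regular-expression mode enabled by TextSearchOptions(True) accepts the alternation pattern produced by Python's re.escape. The only Aspose.PDF calls used are the ones that already appear in the sample above.

```python
import re


def build_search_pattern(terms):
    """Combine literal terms into one regex alternation (hypothetical helper)."""
    # Drop empty strings and duplicates, keeping first-seen order.
    unique = list(dict.fromkeys(t for t in terms if t))
    if not unique:
        raise ValueError("at least one non-empty search term is required")
    # Longest first, so a longer term is tried before a shorter term it starts with.
    unique.sort(key=len, reverse=True)
    # Escape each term so characters such as "." are matched literally.
    return "|".join(re.escape(t) for t in unique)


def redact_terms(input_path, output_path, terms):
    """Redact every match of the given terms and return the match count."""
    # Imported here so the pattern helper works without Aspose.PDF installed.
    import aspose.pdf as ap

    doc = ap.Document(input_path)
    absorber = ap.text.TextFragmentAbsorber(build_search_pattern(terms))
    # True switches the absorber to regular-expression matching.
    absorber.text_search_options = ap.text.TextSearchOptions(True)
    doc.pages.accept(absorber)

    count = 0
    for fragment in absorber.text_fragments:
        page = fragment.page
        annot = ap.annotations.RedactionAnnotation(page, fragment.rectangle)
        annot.fill_color = ap.Color.black
        doc.pages[page.number].annotations.add(annot, True)
        annot.redact()
        count += 1

    # Save once, after all matches have been processed.
    doc.save(output_path)
    return count
```

A call such as redact_terms("sample.pdf", "output.pdf", ["AsposePDF", "Confidential"]) would then redact both terms and report how many matches were covered; the file names and terms here are placeholders. Returning the count makes it easy to notice when a search term matched nothing.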