Remove Tables from PDF via Python

Delete tables from a PDF document using the Aspose.PDF for Python via .NET library

How to Delete Tables from a PDF Document Using the Python via .NET Library

To delete a table, we'll use Aspose.PDF for Python via .NET, a feature-rich, powerful, and easy-to-use document manipulation API for Python. The library is distributed as the aspose-pdf package and can be installed with pip using the following command.

Command Line (pip)

pip install aspose-pdf

Delete Tables from PDF via Python


You need Aspose.PDF for Python via .NET to try the code in your environment.

  1. Load the PDF with an instance of Document.
  2. Create TableAbsorber object to find tables.
  3. Visit first page with absorber.
  4. Get first table on the page.
  5. Remove the table.
  6. Save the file.

Delete Tables from PDF - Python

    import aspose.pdf as ap

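    # Placeholder directories (assumed values); point these at your own folders
    DIR_INPUT_TABLE = "./"
    DIR_OUTPUT = "./"

    input_file = DIR_INPUT_TABLE + "Table_input.pdf"
    output_file = DIR_OUTPUT + "Table_out.pdf"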
    # Load existing PDF document
    pdf_document = ap.Document(input_file)
    # Create TableAbsorber object to find tables
    absorber = ap.text.TableAbsorber()
    # Visit the first page with the absorber (page numbering starts at 1)
    absorber.visit(pdf_document.pages[1])
    # Get the first table found on the page (table_list is indexed from 0)
    table = absorber.table_list[0]
    # Remove the table
    absorber.remove(table)
    # Save PDF
    pdf_document.save(output_file)
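
The example above removes only the first table on the first page, and it raises an error if no table is found there. As a minimal sketch, the same calls (visit, table_list, remove) can be wrapped in a helper that removes every table found on a page and returns how many were removed. The function names remove_all_tables and remove_tables_from_file, and the file paths passed to them, are hypothetical and not part of the Aspose.PDF API. The sketch also assumes that pdf_document.pages can be iterated directly, and it copies table_list before the loop because removing a table may change the list while it is being traversed.

```python
def remove_all_tables(absorber, page):
    """Remove every table the absorber finds on one page; return the count."""
    absorber.visit(page)
    # Copy the list first: removing a table may modify absorber.table_list
    tables = list(absorber.table_list)
    for table in tables:
        absorber.remove(table)
    return len(tables)


def remove_tables_from_file(input_path, output_path):
    """Hypothetical wrapper: strip tables from every page and save the result."""
    import aspose.pdf as ap  # requires: pip install aspose-pdf

    pdf_document = ap.Document(input_path)
    removed = 0
    # Assumption: the pages collection supports direct iteration
    for page in pdf_document.pages:
        # A fresh absorber per page keeps table_list scoped to that page
        removed += remove_all_tables(ap.text.TableAbsorber(), page)
    pdf_document.save(output_path)
    return removed
```

Calling remove_tables_from_file("Table_input.pdf", "Table_out.pdf") would then process the whole document. A page without tables simply contributes 0 to the count instead of raising an IndexError.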