Search PDF via Python

Advanced PDF document search. Use Aspose.PDF for Python via .NET to modify PDF documents programmatically.


How to search a PDF file with Python

To search PDF files, we'll use Aspose.PDF for Python via .NET, a feature-rich, powerful, and easy-to-use document manipulation API for the Python platform. The package is published on PyPI as aspose-pdf; install it with the following command from your terminal.

Installation command

pip install aspose-pdf
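
To confirm that the installation succeeded before running the search example, the package version can be queried from Python. This is a minimal sketch using only the standard library (`importlib.metadata`, Python 3.8 or later); `installed_version` is a hypothetical helper name chosen for illustration and is not part of Aspose.PDF.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version of a pip package, or None if it is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Report whether the aspose-pdf distribution is visible to this interpreter
found = installed_version("aspose-pdf")
print(f"aspose-pdf version: {found}" if found else "aspose-pdf is not installed")
```

If the message says the package is not installed, check that `pip` and `python` point to the same environment.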

Search a PDF file via Python

You need Aspose.PDF for Python via .NET installed to try the code in your environment.

Load the PDF with an instance of Document.
Create a TextFragmentAbsorber object, passing the text to find as a parameter.
Accept the absorber for all pages and get the collection of extracted text fragments.
Loop through the fragments to read the details of each one.

Search PDF files: Python

import aspose.pdf as ap

# Open the PDF document to search
pdf_document = ap.Document("c:\\samples\\sample.pdf")

# Create a TextFragmentAbsorber object to find all instances of the input search phrase
text_fragment_absorber = ap.text.TextFragmentAbsorber("PDF")

# Accept the absorber for all the pages
pdf_document.pages.accept(text_fragment_absorber)

# Loop through the fragments and print the details of each match
for text_fragment in text_fragment_absorber.text_fragments:
    print(f"Text : {text_fragment.text}")
    print(f"Position : {text_fragment.position}")
    print(f"XIndent : {text_fragment.position.x_indent}")
    print(f"YIndent : {text_fragment.position.y_indent}")
    print(f"Font - Name : {text_fragment.text_state.font.font_name}")
    print(f"Font - IsAccessible : {text_fragment.text_state.font.is_accessible}")
    print(f"Font - IsEmbedded : {text_fragment.text_state.font.is_embedded}")
    print(f"Font - IsSubset : {text_fragment.text_state.font.is_subset}")
    print(f"Font Size : {text_fragment.text_state.font_size}")
    print(f"Foreground Color : {text_fragment.text_state.foreground_color}")
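
When the matches need further processing instead of printing, the fragments can be copied into plain dictionaries. The following is a minimal sketch and not part of the Aspose.PDF API: `fragment_to_dict` and `collect_matches` are hypothetical helper names, and they read only attributes already used in the example above (`text`, `position.x_indent`, `position.y_indent`, `text_state.font.font_name`, `text_state.font_size`).

```python
def fragment_to_dict(fragment):
    """Copy the commonly used fields of a text fragment into a plain dict."""
    return {
        "text": fragment.text,
        "x": fragment.position.x_indent,
        "y": fragment.position.y_indent,
        "font": fragment.text_state.font.font_name,
        "size": fragment.text_state.font_size,
    }


def collect_matches(fragments):
    """Return one dict per fragment, in the order the fragments are iterated."""
    return [fragment_to_dict(fragment) for fragment in fragments]
```

Passing the absorber's `text_fragments` collection to `collect_matches` after the `accept` call should yield one dictionary per match, which can then be filtered, sorted by position, or serialized with the standard `json` module.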