Extract PDF Metadata via Python

How to Extract PDF Metadata Using Python for .NET Library

To extract metadata from PDF files, we’ll use Aspose.PDF for Python via .NET, a feature-rich, powerful, and easy-to-use document manipulation API for Python. The package is published on PyPI as aspose-pdf; install it by running the following pip command in a terminal.

Python Package Manager Console

pip install aspose-pdf
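
After running the command, it can help to confirm that the package is visible to the interpreter you intend to use. The following is a minimal sketch using only the standard library (`importlib.metadata`, Python 3.8+); the helper name `installed_version` is hypothetical and not part of Aspose.PDF.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a pip distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "aspose-pdf" is the distribution name used in the pip command above.
    version = installed_version("aspose-pdf")
    if version is None:
        print("aspose-pdf is not installed; run: pip install aspose-pdf")
    else:
        print(f"aspose-pdf {version} is installed")
```

If the script reports that the package is missing even though pip succeeded, pip and the interpreter are probably pointing at different environments.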

Extract PDF Metadata via Python


To try the code in your environment, you need Aspose.PDF for Python via .NET.

  1. Load the PDF with an instance of Document.
  2. Get the DocumentInfo object from the Document.info property.
  3. Access and display the DocumentInfo properties you need, such as author, title, and creation date.

System Requirements


Just make sure that you have the following prerequisites.

  • Microsoft Windows or a compatible OS with a Python interpreter installed.
  • Development environment like Microsoft Visual Studio, or any editor that can run Python.
  • Aspose.PDF for Python via .NET installed in your environment, for example with pip install aspose-pdf.

Extract Metadata of PDF - Python.

This sample code shows how to extract metadata information from a PDF file.

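    import aspose.pdf as ap

    # Folder that holds the input file (placeholder; adjust to your environment)
    data_dir = "./"

    # Load the PDF and read its document information
    pdf_document = ap.Document(data_dir + "GetFileInfo.pdf")
    doc_info = pdf_document.info

    # f-strings format dates and empty values without raising TypeError
    print(f"Author: {doc_info.author}")
    print(f"Creation Date: {doc_info.creation_date}")
    print(f"Keywords: {doc_info.keywords}")
    print(f"Modify Date: {doc_info.mod_date}")
    print(f"Subject: {doc_info.subject}")
    print(f"Title: {doc_info.title}")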
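
If the metadata needs to be stored or passed to other code rather than printed, the same fields can be gathered into a dictionary. The sketch below is one possible approach, not part of the Aspose.PDF API. The helpers `metadata_to_dict` and `read_pdf_metadata` are hypothetical. The snake_case attribute names are assumed from the Python binding, as is the guess that date fields come back as `datetime` objects. The demo uses a stand-in object, so it runs without a PDF or the library installed.

```python
import json
from datetime import datetime
from types import SimpleNamespace

# Attribute names assumed from the snake_case Python binding of DocumentInfo.
FIELDS = ("author", "creation_date", "keywords", "mod_date", "subject", "title")


def metadata_to_dict(info):
    """Copy the standard fields of a DocumentInfo-like object into a dict."""
    result = {}
    for name in FIELDS:
        value = getattr(info, name, None)  # missing attributes become None
        if isinstance(value, datetime):
            value = value.isoformat()      # make dates JSON-friendly
        result[name] = value
    return result


def read_pdf_metadata(path):
    """Hypothetical helper: load a PDF with Aspose.PDF and return its metadata."""
    import aspose.pdf as ap  # imported lazily so metadata_to_dict works without it
    return metadata_to_dict(ap.Document(path).info)


if __name__ == "__main__":
    # Stand-in object so the sketch runs without a PDF or the library installed.
    sample = SimpleNamespace(author="Jane Doe", title="Report",
                             creation_date=datetime(2024, 1, 15, 9, 30))
    print(json.dumps(metadata_to_dict(sample), indent=2))
```

With the library installed, calling `read_pdf_metadata("GetFileInfo.pdf")` would return the same dictionary shape for a real file, assuming the attribute names above match your installed version.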

About Aspose.PDF for Python API

Aspose.PDF API can be used for PDF document manipulation and parsing within applications. One can create, modify, compress, secure, print or save PDF to TXT, HTML, PCL, XFA, XML, XPS, EPUB, TEX, Images and more formats. Aspose.PDF is a standalone API and it does not depend on any software including Adobe Acrobat.