Aspose.Total for Java is a suite of APIs that lets developers add PDF to CSV conversion to their Java applications. The conversion uses two of its components: Aspose.PDF for Java and Aspose.Cells for Java.
Aspose.PDF for Java is a powerful PDF manipulation API that enables developers to render PDF documents to XLSX format. It supports a wide range of features such as text extraction, image extraction, page manipulation, annotations, bookmarks, and much more. It also provides support for PDF/A-1, PDF/A-2, and PDF/A-3 standards.
Aspose.Cells for Java is a spreadsheet programming API that enables developers to convert XLSX to CSV format. It supports features such as data validation, formatting, worksheet protection, and charting, and it reads and writes popular spreadsheet formats such as XLS, XLSX, XLSB, XLSM, CSV, and ODS.
With Aspose.Total for Java, PDF to CSV conversion is a two-step process. First, render the PDF to XLSX with Aspose.PDF for Java. Second, convert the XLSX file to CSV with Aspose.Cells for Java.
Convert PDF File to CSV via Java
Conversion Requirements
You can use Aspose.Total for Java directly from a Maven-based project by including Aspose.PDF for Java and Aspose.Cells for Java as dependencies in your pom.xml.
// supports PDF, CGM, EPUB, TeX, PCL, PS, SVG, XPS, MD, MHTML, XML, and XSLFO file format
// load PDF with an instance of Document
Document document = new Document("template.pdf");
// save document in XLSX format
document.save("XlsxOutput.xlsx", SaveFormat.Xlsx);
// load the XLSX file in an instance of Workbook
Workbook book = new Workbook("XlsxOutput.xlsx");
// supports CSV, XLSB, XLSM, XLT, XLTX, XLTM, XLAM, TSV, TXT, ODS, DIF, MD, SXC, and FODS file format
// save XLSX as CSV
book.save("output.csv", SaveFormat.AUTO);
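The snippet above writes the intermediate workbook to a fixed file name (XlsxOutput.xlsx) in the working directory. As a minimal sketch of one way to manage that intermediate file, the helper below runs the two steps through a temporary file and deletes it afterwards. The class TwoStepConversion and its Step interface are hypothetical names introduced for illustration only; they are not part of Aspose.Total, and the actual conversion calls are passed in by the caller.

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper (not part of Aspose.Total): chains two file-to-file
// conversion steps through a temporary intermediate file.
final class TwoStepConversion {

    // One conversion step, for example PDF -> XLSX or XLSX -> CSV.
    @FunctionalInterface
    interface Step {
        void convert(Path input, Path output) throws Exception;
    }

    static void run(Path pdf, Path csv, Step pdfToXlsx, Step xlsxToCsv) throws Exception {
        // Create the intermediate workbook in the system temp directory.
        Path xlsx = Files.createTempFile("pdf-to-csv-", ".xlsx");
        try {
            pdfToXlsx.convert(pdf, xlsx);
            xlsxToCsv.convert(xlsx, csv);
        } finally {
            // Remove the intermediate file whether or not the steps succeeded.
            Files.deleteIfExists(xlsx);
        }
    }
}
```

Each step could be a lambda that wraps the Document and Workbook calls shown above. Both Aspose.PDF and Aspose.Cells define a class named SaveFormat, so when the two libraries are used in the same source file, one of the two typically has to be referenced by its fully qualified name.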
Convert Protected PDF to CSV via Java
If your PDF document is password protected, you cannot convert it to CSV without the password. Using the API, you can first open the protected document with a valid password and then convert it. To open the encrypted file, create a new instance of the Document class and pass the file name and password as arguments.
// supports PDF, CGM, EPUB, TeX, PCL, PS, SVG, XPS, MD, MHTML, XML, and XSLFO file format
// open the protected PDF document with its password
Document document = new Document("input.pdf", "Your@Password");
// save PDF as XLSX format
document.save("XlsxOutput.xlsx", SaveFormat.Xlsx);
// load the XLSX file in an instance of Workbook
Workbook book = new Workbook("XlsxOutput.xlsx");
// supports CSV, XLSB, XLSM, XLT, XLTX, XLTM, XLAM, TSV, TXT, ODS, DIF, MD, SXC, and FODS file format
// save XLSX as CSV
book.save("output.csv", SaveFormat.AUTO);
Convert PDF File to CSV with Watermark via Java
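While converting a PDF file to CSV, you can also add a watermark to the intermediate workbook. To add a watermark, open the converted XLSX file in a new Workbook, select a Worksheet by its index, create a Shape with the addTextEffect function, and set its color, transparency, and other properties. After that, you can save the workbook as CSV. Because CSV is a plain-text format that stores only cell values, the watermark shape is not expected to carry over into the CSV file itself; save the workbook as XLSX as well if the watermark needs to be preserved.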
// supports PDF, CGM, EPUB, TeX, PCL, PS, SVG, XPS, MD, MHTML, XML, and XSLFO file format
// load PDF with an instance of Document
Document document = new Document("template.pdf");
// save document in XLSX format
document.save("XlsxOutput.xlsx", SaveFormat.Xlsx);
// load the XLSX file in an instance of Workbook
Workbook book = new Workbook("XlsxOutput.xlsx");
// get the first default sheet
Worksheet sheet = book.getWorksheets().get(0);
// add Watermark
Shape wordart = sheet.getShapes().addTextEffect(MsoPresetTextEffect.TEXT_EFFECT_1, "CONFIDENTIAL",
        "Arial Black", 50, false, true, 18, 8, 1, 1, 130, 800);
// get the fill format of the word art
FillFormat wordArtFormat = wordart.getFill();
// set the color
wordArtFormat.setOneColorGradient(Color.getRed(), 0.2, GradientStyleType.HORIZONTAL, 2);
// set the transparency
wordArtFormat.setTransparency(0.9);
// make the line invisible
LineFormat lineFormat = wordart.getLine();
lineFormat.setWeight(0.0);
// supports CSV, XLSB, XLSM, XLT, XLTX, XLTM, XLAM, TSV, TXT, ODS, DIF, MD, SXC, and FODS file format
// save XLSX as CSV
book.save("output.csv", SaveFormat.AUTO);
What is PDF File Format?
PDF, or Portable Document Format, is a file format designed for presenting documents in a manner that remains consistent across various software applications, hardware devices, and operating systems. Each PDF file contains a comprehensive description of a fixed-layout document, encompassing text, fonts, graphics, and other necessary information for accurate display. Initially developed by Adobe Systems in the early 1990s, PDF served as a means to share computer documents while preserving text formatting and inline images.
PDF files are typically generated using software like Adobe Acrobat or similar PDF creation tools. PDF is now an open standard maintained by the International Organization for Standardization (ISO) as ISO 32000, which supports compatibility and interoperability across platforms and systems. To view PDF files, users can use free software such as Adobe Reader or another PDF viewer.
One of the significant advantages of PDF is its platform independence, allowing seamless viewing and printing on a wide range of devices and operating systems. Regardless of the hardware or software used, the document’s layout and content will remain intact. This universal accessibility has contributed to the popularity of PDF as a preferred format for sharing and distributing documents across diverse platforms and systems.
PDF’s capability to encapsulate a complete document, including text, fonts, graphics, and formatting, makes it a reliable choice for various applications. Whether it’s sharing important reports, publishing e-books, distributing forms, or delivering professional presentations, PDF ensures consistent document rendering and reliable preservation of content across different environments.
What is CSV File Format?
A CSV (Comma-Separated Values) file is a commonly used format for storing tabular data, resembling a spreadsheet or database. It consists of data separated by commas, where each row represents a record. CSV files can be opened in text editors like Microsoft Notepad or Apple TextEdit, as well as spreadsheet programs such as Microsoft Excel or Apple Numbers.
When opened in a text editor, CSV data appears as plain lines of text. Columns are separated by commas, and each row is on its own line. The first row is often a header row that contains the column names.
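The layout described above can be sketched with a few lines of standard-library Java. This is a deliberately naive illustration that assumes the first row is a header and that no field contains a comma, a quote, or a line break; the class name SimpleCsvReader is made up for this example, and a dedicated CSV parser is the better choice for real files.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: reads simple CSV text (no quoted fields) into records
// keyed by the column names from the header row.
final class SimpleCsvReader {

    static List<Map<String, String>> parse(String text) {
        List<Map<String, String>> records = new ArrayList<>();
        // \R matches any line break sequence (\n, \r\n, \r).
        String[] lines = text.split("\\R");
        if (lines.length == 0 || lines[0].isEmpty()) {
            return records;
        }
        // The limit of -1 keeps trailing empty fields.
        String[] header = lines[0].split(",", -1);
        for (int i = 1; i < lines.length; i++) {
            if (lines[i].isEmpty()) {
                continue; // skip blank lines
            }
            String[] cells = lines[i].split(",", -1);
            Map<String, String> record = new LinkedHashMap<>();
            for (int c = 0; c < header.length; c++) {
                // Missing trailing cells are treated as empty strings.
                record.put(header[c], c < cells.length ? cells[c] : "");
            }
            records.add(record);
        }
        return records;
    }
}
```

For the input "name,qty" followed by the rows "pen,3" and "ink,", parse returns two records, and the second record maps qty to an empty string.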
CSV files allow for easy data exchange between different applications. Data can be exported from spreadsheet programs like Excel or Numbers and saved in a CSV format. Similarly, CSV files can be imported into these programs, allowing data to be transferred from one system to another.
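Exchanging data reliably depends on the writer quoting fields that would otherwise be ambiguous. The sketch below follows the common convention described in RFC 4180: a field that contains a comma, a double quote, or a line break is wrapped in double quotes, and embedded double quotes are doubled. The class name SimpleCsvWriter is hypothetical and is not an Aspose API; Aspose.Cells handles this itself when it saves a workbook as CSV.

```java
import java.util.List;
import java.util.stream.Collectors;

// Illustrative only: builds one CSV line with RFC 4180 style quoting.
final class SimpleCsvWriter {

    // Quote a field only when it contains a comma, a quote, or a line break.
    static String escape(String field) {
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        if (!needsQuotes) {
            return field;
        }
        // Double every embedded quote, then wrap the field in quotes.
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // Join the escaped fields with commas to form a single record line.
    static String toLine(List<String> fields) {
        return fields.stream()
                .map(SimpleCsvWriter::escape)
                .collect(Collectors.joining(","));
    }
}
```

A reader that splits lines on every comma would misread such quoted fields, which is one reason to rely on a full CSV parser when importing files produced by other applications.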
CSV files offer flexibility and compatibility due to their simple and universal structure. They are widely used for data migration, sharing information across platforms, and integrating data from various sources. The straightforward nature of CSV files makes them accessible for data manipulation, analysis, and processing by both humans and computer systems.