PDF File Conversion via C++

Convert PDF to Microsoft Office® Word, HTML, Images and Various Other Formats

 

The C++ PDF manipulation and rendering library makes it easy for developers to add PDF conversion to a C++ application. It supports multiple conversions, including PDF to image, HTML and Microsoft Office Word formats. Programmers can use the code listed below as is or adapt it to their own requirements.

Convert PDF to Word Formats

The PDF C++ library keeps the conversion process simple: converting PDF to Word formats, including DOC and DOCX files, takes just a few lines of C++ code. The C++ API provides the Document class, which loads the source PDF file. After loading, call the Save method with the output path and the target format (such as SaveFormat::Doc) to perform the PDF to Word conversion.

C++ PDF to Word Converter Code
// Load PDF Source File
auto pdftodoc = MakeObject<Document>(u"sourceInput.pdf");

// Save PDF to Word Document
pdftodoc->Save(u"pdf-to-word.doc", SaveFormat::Doc);
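
When several PDF files are converted, the output name is usually derived from the input name rather than typed by hand. The following is a minimal sketch using only the C++17 standard library; MakeOutputPath is a hypothetical helper introduced here for illustration and is not part of the PDF library.

```cpp
#include <filesystem>
#include <string>

// Hypothetical helper (not part of the PDF library): build the output path
// by keeping the source file's directory and name and swapping its extension.
// The extension may be given with or without the leading dot.
std::filesystem::path MakeOutputPath(const std::filesystem::path& sourcePdf,
                                     const std::string& targetExtension)
{
    std::filesystem::path output = sourcePdf;
    output.replace_extension(targetExtension);
    return output;
}
```

For example, MakeOutputPath("sourceInput.pdf", ".docx") yields sourceInput.docx. Converting the result to the string type that the Save method expects is left out of this sketch.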
 

Converting PDF to HTML

Converting PDF to HTML follows almost the same process: load the file and call the Save method with the output HTML document path and SaveFormat::Html as parameters. For specific HTML saving settings, the C++ PDF parser library provides the HtmlSaveOptions class, which offers options such as font settings, splitting the HTML and CSS into multiple pages, a folder for images and more.

PDF to HTML Converter C++ Code
// Load the source PDF document
auto pdftoHtmlObj = MakeObject<Document>(u"sourceFile.pdf");

// Create an instance of the HtmlSaveOptions class
SharedPtr<HtmlSaveOptions> pdftoHTMLoptions = MakeObject<HtmlSaveOptions>();

// Set the required options
pdftoHTMLoptions->PartsEmbeddingMode = HtmlSaveOptions::PartsEmbeddingModes::EmbedAllIntoHtml;
pdftoHTMLoptions->LettersPositioningMethod = HtmlSaveOptions::LettersPositioningMethods::UseEmUnitsAndCompensationOfRoundingErrorsInCss;
pdftoHTMLoptions->RasterImagesSavingMode = HtmlSaveOptions::RasterImagesSavingModes::AsEmbeddedPartsOfPngPageBackground;
pdftoHTMLoptions->FontSavingMode = HtmlSaveOptions::FontSavingModes::SaveInAllFormats;

// PDF to HTML Conversion
pdftoHtmlObj->Save(u"pdfto.html", pdftoHTMLoptions);
 

Save PDF to Images

Converting PDF pages to images, including TIFF, JPEG, BMP and PNG, is easy in C++ applications using the code snippet listed below. Developers can use the PdfConverter class, calling BindPdf to load the file. Convert the pages via DoConvert, then loop through the pages with HasNextImage and save each one in the required image format with GetNextImage.

C++ PDF to Image Converter Code
// Instantiate PdfConverter
System::SharedPtr<Aspose::Pdf::Facades::PdfConverter> PdfImageConverter = System::MakeObject<Aspose::Pdf::Facades::PdfConverter>();

// Load an existing PDF document
// (dir is not defined in this snippet; it is assumed to hold the working directory path)
PdfImageConverter->BindPdf(dir + L"sourceFile.pdf");

// Convert PDF pages to images
PdfImageConverter->DoConvert();

// Save each page in JPG format, numbering the output files from 1
int32_t imageNumber = 1;
while (PdfImageConverter->HasNextImage()) {
    PdfImageConverter->GetNextImage(dir + imageNumber + L".jpg", System::Drawing::Imaging::ImageFormat::get_Jpeg(), 800, 1000);
    imageNumber++;
}
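
The loop above names each image after its page number, so files such as 10.jpg sort before 2.jpg in most file listings. The sketch below builds zero-padded names that sort in page order, using only the C++17 standard library. MakePageImagePath and the "page-" prefix are hypothetical choices made for illustration and are not part of the PDF library.

```cpp
#include <filesystem>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper (not part of the PDF library): build a file path such as
// "out/page-007.jpg" so that a directory listing keeps the pages in order.
std::filesystem::path MakePageImagePath(const std::filesystem::path& outputDir,
                                        int pageNumber,
                                        const std::string& extension)
{
    std::ostringstream name;
    // setw applies only to the next value written, which is the page number.
    name << "page-" << std::setw(3) << std::setfill('0') << pageNumber << extension;
    return outputDir / name.str();
}
```

Three digits cover documents of up to 999 pages in sorted order; widen the padding for longer documents. Passing the resulting path to GetNextImage requires a conversion to the library's string type, which is not shown here.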