Java HTML Files Manipulation APIs

Manipulate and Render HTML documents including CSS styles to PDF & Raster Image formats.

Aspose.HTML for Java



Aspose.HTML for Java is an advanced HTML manipulation API for generating and manipulating HTML within Java applications. The API allows you to insert, remove and replace HTML nodes, extract CSS, and navigate through an HTML document in multiple ways. Moreover, the API can load EPUB and MHTML files and offers scripting support, which allows the HTML DOM to be manipulated via JavaScript.

Aspose.HTML for Java supports inter-file format conversion: it can load an HTML file and render the output in PDF, XPS and raster image formats including JPEG, PNG, BMP and more. It also provides encryption for PDF files.


Advanced Java HTML Processing API Features



Create HTML pages from Scratch


Load existing HTML


Implement W3C HTML specifications


Lightweight & standalone component


Add, replace or remove HTML nodes


Extract CSS styling information


Load EPUB and MHTML file formats


Render HTML to raster image formats


Convert HTML to XPS and PDF

Rendering from HTML to PDF and XPS Format 

Aspose.HTML for Java provides the capabilities to create or load HTML files, and render the output in PDF and XPS.

Render HTML as fixed-layout formats - Java

// load the file to be rendered
HTMLDocument html = new HTMLDocument(dir + "template.html");
// render to PDF & XPS
HtmlRenderer renderer = new HtmlRenderer();
renderer.render(new PdfDevice(new PdfRenderingOptions(), dir + "output.pdf"), html);
renderer.render(new XpsDevice(new XpsRenderingOptions(), dir + "output.xps"), html);

The conversion process is highly customizable, allowing you to configure PageSetup aspects for the resultant fixed-layout formats. For example, you can specify the page numbers to be rendered, adjust the resultant page size, or set the JPEG compression for the embedded images.

Conversion to Raster Images

Aspose.HTML for Java has a high-fidelity rendering engine at its core, which can convert HTML pages to the most commonly used raster image formats, including TIFF, BMP, PNG & JPEG, without requiring any additional software or tool.

Manipulation of ePub and MHTML Files

Aspose.HTML for Java is capable of loading ePub and MHTML files to perform various operations including the conversion to fixed-layout and raster image formats.

HTML Nodes Navigation

Aspose.HTML for Java enables you to navigate through the HTML document by elements, XPath or CSS selector queries, and to extract, insert, remove and replace HTML nodes on the go.

Extract all nodes of type anchor - Java

// create an instance of HTMLDocument & load HTML from URL
HTMLDocument document = new HTMLDocument("");
// get all nodes of type anchor
NodeList nodelist = document.getDocumentElement().querySelectorAll("a");
// display anchor text & href values for all nodes
for (Node node : nodelist) {
    // each match of the "a" selector is an anchor element
    HTMLAnchorElement anchor = (HTMLAnchorElement) node;
    System.out.println("Text: " + node.getTextContent() + " Href: " + anchor.getHref());
}
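
The node operations mentioned above (XPath navigation plus inserting, replacing and removing nodes) follow the W3C DOM model. As a rough illustration only, the sketch below uses the JDK's built-in W3C DOM and XPath classes rather than Aspose.HTML, so it runs without any extra library. The class name DomNavigationSketch, its helper methods and the sample markup are hypothetical and are not part of the Aspose API. The JDK parser also requires well-formed XHTML, which Aspose.HTML may not.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch: JDK W3C DOM + XPath, not the Aspose.HTML API.
class DomNavigationSketch {

    // Parse well-formed XHTML markup into a W3C DOM document.
    static Document parse(String xhtml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xhtml)));
    }

    // Collect the href attribute of every <a> element matched by an XPath query.
    static List<String> hrefs(Document doc) throws Exception {
        NodeList anchors = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate("//a", doc, XPathConstants.NODESET);
        List<String> result = new ArrayList<>();
        for (int i = 0; i < anchors.getLength(); i++) {
            result.add(((Element) anchors.item(i)).getAttribute("href"));
        }
        return result;
    }

    // Insert a new anchor, replace the first <p>, and remove the first <span>.
    static void edit(Document doc) {
        Element body = (Element) doc.getElementsByTagName("body").item(0);

        // insert: append a new anchor at the end of <body>
        Element link = doc.createElement("a");
        link.setAttribute("href", "c.html");
        link.setTextContent("C");
        body.appendChild(link);

        // replace: swap the first paragraph for a new one
        Element oldP = (Element) doc.getElementsByTagName("p").item(0);
        Element newP = doc.createElement("p");
        newP.setTextContent("replaced");
        oldP.getParentNode().replaceChild(newP, oldP);

        // remove: detach the first <span> from its parent
        Element span = (Element) doc.getElementsByTagName("span").item(0);
        span.getParentNode().removeChild(span);
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<html><body><p>old</p><span>x</span>"
                + "<a href=\"a.html\">A</a><a href=\"b.html\">B</a></body></html>");
        edit(doc);
        System.out.println(hrefs(doc)); // prints [a.html, b.html, c.html]
    }
}
```

Because both libraries expose DOM-style nodes, the same insert, replace and remove steps should carry over in spirit, though the exact Aspose.HTML class and method names should be checked against its API reference.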

Configure Sandbox

The HTML API enables you to configure a document sandbox that affects how HTML documents are processed. This matters because, for example, CSS styles in some cases depend on the screen size.

Aspose.HTML for Java allows you to configure this environment independently of the machine on which the code executes.


Support and Learning Resources


Aspose.HTML offers individual HTML processing APIs for other popular development environments as listed below: