Java HTML Files Manipulation APIs

Manipulate HTML documents and render them, CSS styles included, to PDF and raster image formats.

Aspose.HTML for Java


Aspose.HTML for Java is an advanced API for manipulating and generating HTML within Java applications. The API lets you add, delete, and replace nodes, extract CSS, and navigate a document in multiple ways. It can also load EPUB and MHTML files, and it supports scripting, so the DOM can be manipulated via JavaScript.

Aspose.HTML for Java also supports format conversion: it loads an HTML document and saves the output as XPS, PDF, or raster images including JPEG, PNG, BMP and more. It can also encrypt the PDF files it produces.
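The node operations mentioned above (add, replace, delete) follow the W3C DOM model. As a rough illustration of that model only, the sketch below uses the JDK's built-in org.w3c.dom classes on a well-formed XHTML string, not the Aspose.HTML classes; the class name DomEditSketch and the helpers parse and edit are hypothetical names chosen for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical example class; uses only JDK APIs, not Aspose.HTML.
public class DomEditSketch {

    // Parses a well-formed XHTML string into a W3C DOM document.
    static Document parse(String xhtml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xhtml)));
    }

    // Inserts a <p>, replaces the <h1> with an <h2>, and deletes the <hr/>.
    // Assumes <h1> and <hr/> are direct children of <body>.
    static Document edit(Document doc) {
        Element body = (Element) doc.getElementsByTagName("body").item(0);

        // insert: append a new paragraph node to <body>
        Element p = doc.createElement("p");
        p.setTextContent("Added paragraph");
        body.appendChild(p);

        // replace: swap the existing <h1> for an <h2> with the same text
        Node h1 = doc.getElementsByTagName("h1").item(0);
        Element h2 = doc.createElement("h2");
        h2.setTextContent(h1.getTextContent());
        body.replaceChild(h2, h1);

        // delete: remove the <hr/> element
        Node hr = doc.getElementsByTagName("hr").item(0);
        body.removeChild(hr);

        return doc;
    }

    public static void main(String[] args) throws Exception {
        Document doc = edit(parse("<html><body><h1>Title</h1><hr/></body></html>"));
        System.out.println(doc.getElementsByTagName("h2").item(0).getTextContent());
        System.out.println(doc.getElementsByTagName("p").getLength());
    }
}
```

The JDK parser used here only accepts well-formed XML, so real-world HTML would need an HTML-aware parser such as the one the library provides.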


Advanced Java HTML Processing API Features



Create HTML pages from Scratch


Load existing file


Implement W3C specifications


Lightweight and standalone component


Insert, replace or delete nodes


Extract CSS styling information


Load EPUB and MHTML document formats


Render HTML to raster images


Convert HTML to XPS and PDF

Rendering HTML to PDF and XPS Formats

The API renders HTML to PDF, XPS, and the most commonly used raster image formats, including BMP, TIFF, JPEG, and PNG. Developers can customize the resulting fixed-layout output by configuring PageSetup aspects such as the page numbers to be rendered, the resulting page size, or the JPEG compression for embedded images.

Render HTML as fixed-layout formats - Java

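// package names assumed from the Aspose.HTML for Java API reference
import com.aspose.html.HTMLDocument;
import com.aspose.html.rendering.HtmlRenderer;
import com.aspose.html.rendering.pdf.*;
import com.aspose.html.rendering.xps.*;

// load the file to be rendered; dir is a String holding the working directory path
HTMLDocument htmdoc = new HTMLDocument(dir + "template.html");
// render the same document to PDF and to XPS
HtmlRenderer renderer = new HtmlRenderer();
renderer.render(new PdfDevice(new PdfRenderingOptions(), dir + "output.pdf"), htmdoc);
renderer.render(new XpsDevice(new XpsRenderingOptions(), dir + "output.xps"), htmdoc);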

Manipulation of EPUB and MHTML Files

The library can load EPUB and MHTML files and perform various operations on them, including conversion to fixed-layout and raster image formats.

HTML Nodes Navigation

The API supports navigating an HTML file by XPath, by elements, or by CSS selector queries, and makes it easy to insert, extract, remove, or replace nodes.

Extract all nodes of type anchor - Java

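// package names assumed from the Aspose.HTML for Java API reference
import com.aspose.html.HTMLAnchorElement;
import com.aspose.html.HTMLDocument;
import com.aspose.html.collections.NodeList;
import com.aspose.html.dom.Node;

// create an HTMLDocument instance and load HTML from a URL (the URL string is left empty here)
HTMLDocument dct = new HTMLDocument("");
// get all anchor nodes with a CSS selector query
NodeList nodelist = dct.getDocumentElement().querySelectorAll("a");
// display anchor text and href values for all nodes
for (Node node : nodelist) {
    HTMLAnchorElement anchor = (HTMLAnchorElement) node;
    System.out.println("Text: " + node.getTextContent() + " Href: " + anchor.getHref());
}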
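
The anchor extraction above uses a CSS selector query. For comparison, here is a minimal sketch of the same lookup expressed as an XPath query. It relies on the JDK's standard javax.xml.xpath package and a well-formed XHTML string rather than on Aspose.HTML, and the class name XPathAnchorSketch and the method anchorHrefs are hypothetical names used only for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example class; uses only JDK APIs, not Aspose.HTML.
public class XPathAnchorSketch {

    // Returns "text -> href" for every <a> element that carries an href attribute.
    static List<String> anchorHrefs(String xhtml) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // "//a[@href]" selects anchor elements at any depth that have an href
        NodeList anchors = (NodeList) xpath.evaluate(
                "//a[@href]",
                new InputSource(new StringReader(xhtml)),
                XPathConstants.NODESET);

        List<String> result = new ArrayList<>();
        for (int i = 0; i < anchors.getLength(); i++) {
            Element a = (Element) anchors.item(i);
            result.add(a.getTextContent() + " -> " + a.getAttribute("href"));
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String page = "<html><body>"
                + "<a href=\"/docs\">Docs</a>"
                + "<p><a href=\"/api\">API</a></p>"
                + "<a>no link</a>"
                + "</body></html>";
        for (String line : anchorHrefs(page)) {
            System.out.println(line);
        }
    }
}
```

The XPath predicate [@href] filters out anchors without a link target, which the plain "a" selector in the earlier snippet would still return.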

Support and Learning Resources


Aspose.HTML offers individual HTML processing APIs for other popular development environments as listed below: