Aspose.HTML for Java is an advanced HTML manipulation API for creating, editing, and generating HTML within Java applications. The API lets you add, delete, and replace nodes, extract CSS, and navigate a document in several ways. It can also load EPUB and MHTML files, and it offers scripting support that allows manipulating the DOM via JavaScript.
Aspose.HTML for Java also supports inter-format conversion: it loads an HTML document and saves the output as XPS, PDF, or raster images, including JPEG, PNG, BMP, and more, and it can encrypt the resulting PDF files.

Advanced Java HTML Processing API Features

Create HTML pages from scratch

Load existing file

Implement W3C specifications

Lightweight and standalone component

Insert, replace or delete nodes

Extract CSS styling information

Load EPUB and MHTML document formats

Render HTML to raster images

API Features in Documentation

You can see the full list of Aspose.HTML features in our documentation. Using the Aspose.HTML for Java library in your project, you can perform the following tasks:

  • Creating a new HTML document or opening an existing one from different sources (Aspose.HTML.Examples.QuickStart.DocumentOpenTests in the examples project).
  • HTML manipulation: creating, editing, removing, and replacing HTML nodes via the API.
  • Saving an HTML document.
  • Extracting CSS styles for a particular HTML node.
  • Configuring a document sandbox that affects the processing of HTML documents.
  • Navigating through an HTML document in different ways.
  • Converting an HTML document into various supported formats: JPEG, PNG, BMP, TIFF, PDF, XPS, and more.

Convert HTML to PDF and XPS Format

The API renders HTML to the fixed-layout formats PDF and XPS as well as to the most commonly used raster image formats, including BMP, TIFF, JPEG, and PNG. Developers can customize the output by configuring PageSetup options for the resulting fixed-layout formats, such as the page numbers to render, the page size, or the JPEG compression applied to embedded images.

Render HTML as fixed-layout formats - Java


     
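    import com.aspose.html.HTMLDocument;
    import com.aspose.html.rendering.HtmlRenderer;
    import com.aspose.html.rendering.pdf.PdfDevice;
    import com.aspose.html.rendering.pdf.PdfRenderingOptions;
    import com.aspose.html.rendering.xps.XpsDevice;
    import com.aspose.html.rendering.xps.XpsRenderingOptions;

    // Load the file to be rendered; dir is the folder that holds template.html
    HTMLDocument htmlDoc = new HTMLDocument(dir + "template.html");

    // Render the HTML to PDF and XPS with default rendering options
    HtmlRenderer renderer = new HtmlRenderer();
    renderer.render(new PdfDevice(new PdfRenderingOptions(), dir + "output.pdf"), htmlDoc);
    renderer.render(new XpsDevice(new XpsRenderingOptions(), dir + "output.xps"), htmlDoc);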


You can try the online HTML Converter.

You can also convert HTML, XHTML, MHTML, Markdown, EPUB, or SVG into many other file formats, including a few listed below:

Conversion to Raster Images

Aspose.HTML for Java has a high-fidelity rendering engine at its core that converts HTML pages to the most commonly used raster image formats, including TIFF, BMP, PNG, and JPEG, without requiring any additional software or tools.

Manipulating EPUB and MHTML files

The library can load EPUB and MHTML files and perform various operations on them, including conversion to fixed-layout and raster image formats.

HTML Nodes Navigation

The API supports navigating an HTML document by XPath, by element traversal, or by CSS selector queries, and it lets you insert, extract, remove, or replace nodes.

Extract all nodes of type anchor - Java


     
    import com.aspose.html.HTMLAnchorElement;
    import com.aspose.html.HTMLDocument;
    import com.aspose.html.collections.NodeList;
    import com.aspose.html.dom.Node;

    // Create an HTMLDocument by loading HTML from a URL
    HTMLDocument document = new HTMLDocument("https://www.aspose.com");

    // Select every anchor (<a>) element with a CSS selector query
    NodeList anchors = document.getDocumentElement().querySelectorAll("a");

    // Print the text and href value of each anchor
    for (Node node : anchors) {
        HTMLAnchorElement anchor = (HTMLAnchorElement) node;
        System.out.println("Text: " + anchor.getTextContent() + " Href: " + anchor.getHref());
    }
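
The sample above uses a CSS selector query. To illustrate the XPath style of navigation, here is a minimal sketch that deliberately uses only the JDK's built-in DOM parser and XPath engine (javax.xml) rather than the Aspose.HTML API, so it only accepts well-formed XHTML. The class name XPathNavigationSketch and the helper extractLinks are hypothetical names chosen for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical illustration: JDK-only XPath navigation, not the Aspose.HTML API.
final class XPathNavigationSketch {

    // Returns "text -> href" for every <a> element that has an href attribute,
    // in document order. The input must be well-formed XHTML.
    static List<String> extractLinks(String xhtml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));

            // The XPath expression selects anchors anywhere in the tree
            // and filters out those without an href attribute.
            NodeList anchors = (NodeList) XPathFactory.newInstance()
                    .newXPath()
                    .evaluate("//a[@href]", doc, XPathConstants.NODESET);

            List<String> links = new ArrayList<>();
            for (int i = 0; i < anchors.getLength(); i++) {
                Element anchor = (Element) anchors.item(i);
                links.add(anchor.getTextContent() + " -> " + anchor.getAttribute("href"));
            }
            return links;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse or query the XHTML input", e);
        }
    }

    public static void main(String[] args) {
        String xhtml = "<html><body>"
                + "<a href=\"https://www.aspose.com\">Aspose</a>"
                + "<a name=\"top\">No link</a>"
                + "<p><a href=\"/docs\">Docs</a></p>"
                + "</body></html>";
        // Prints one line per linked anchor; the <a name="top"> element is skipped.
        extractLinks(xhtml).forEach(System.out::println);
    }
}
```

The predicate in `//a[@href]` does the filtering inside the query itself, which is the main practical difference from selecting all `a` elements and checking each one in Java code.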



Configure Sandbox

The API lets you configure a document sandbox that affects how HTML documents are processed; for example, CSS styles in some cases depend on the screen size.



  

Aspose.HTML offers individual HTML processing APIs for other popular development environments as listed below: