.NET API for working with real-world HTML

Class library to create, edit, extract data & convert HTML pages to PDF, XPS, Images and other formats.

Aspose.HTML for .NET

 
 

Aspose.HTML for .NET is an advanced HTML processing API for performing a wide range of management and manipulation tasks within cross-platform applications. The API can generate, modify, extract data from, convert and render HTML documents without any external software. It also supports popular file formats such as EPUB, MHTML, SVG and Markdown, and renders to PDF, XPS and image file formats.

Moreover, the HTML Document Object Model is integrated out-of-the-box with embedded formats and specifications such as CSS, HTML Canvas, SVG, XPath and JavaScript, which extends the manipulation functionality and improves rendering quality.

 

Advanced .NET HTML Manipulation API Features

 

 

Create HTML pages from scratch
Load existing HTML from file, stream or URL
Implement W3C specifications
Implement templates using template merger
Fill the template with various data sources
Render HTML Canvas 2D to PDF
Add, replace or remove nodes
Extract data from HTML documents
Load EPUB and MHTML file formats
Render HTML to raster image formats
Render multiple documents at once
Implement Markdown to HTML converter
Apply header and footer during HTML to PDF
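
Two of the features listed above, creating a page from scratch and adding nodes, can be sketched as follows. This is a minimal illustration rather than a definitive implementation: the paragraph text and the output file name "created.html" are assumptions made for the example.

```csharp
using Aspose.Html;

class CreateFromScratchExample
{
    static void Main()
    {
        // Create an empty HTML document
        using (var document = new HTMLDocument())
        {
            // Create a paragraph element and a text node for its content
            var paragraph = document.CreateElement("p");
            var text = document.CreateTextNode("Hello from Aspose.HTML"); // illustrative text

            // Attach the text to the paragraph, then the paragraph to the body
            paragraph.AppendChild(text);
            document.Body.AppendChild(paragraph);

            // Save the result; "created.html" is a hypothetical output path
            document.Save("created.html");
        }
    }
}
```

Removing or replacing a node follows the same DOM pattern, using RemoveChild or ReplaceChild on the parent node.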

Convert HTML to PDF, Image and Other Formats

With just a few lines of code, the API lets you implement HTML to PDF, HTML to image, or any other supported conversion in your .NET applications.

Convert HTML to PDF and PNG - C#

// Load the HTML file to be converted
using (var document = new Aspose.Html.HTMLDocument("document.html"))
{
    // Convert HTML to PDF
    Aspose.Html.Converters.Converter.ConvertHTML(document, new Aspose.Html.Saving.PdfSaveOptions(), "output.pdf");

    // Convert HTML to Image (PNG)
    Aspose.Html.Converters.Converter.ConvertHTML(document, new Aspose.Html.Saving.ImageSaveOptions(Aspose.Html.Rendering.Image.ImageFormat.Png), "output.png");
}
You can check the quality of conversion here.

Markdown Support

Markdown is a lightweight markup language with a plain-text formatting syntax. It is often used for documentation and README files because it is easy to read and write. Aspose.HTML provides a flexible Markdown converter that works in both directions: from Markdown to HTML and from HTML to Markdown. The converter API also ships with predefined rule sets, so you can convert HTML to Markdown using the original Markdown syntax or the GitLab Flavored Markdown variant, or configure the rules to suit your needs.

Convert HTML to Markdown - C#

// Load HTML file
using (var document = new Aspose.Html.HTMLDocument("document.html"))
{
    // Convert HTML to Markdown using a set of features supported by GitLab Flavored Markdown
    document.Save("output.md", Aspose.Html.Saving.MarkdownSaveOptions.Git);
}
The reverse conversion is just as simple:

Convert Markdown to HTML - C#

// Convert Markdown to HTML
Aspose.Html.Converters.Converter.ConvertMarkdown("document.md", "output.html");
You can try Markdown Converter here.

Electronic Books and Web Archives

The electronic book (EPUB) and web archive (MHTML) formats are supported out-of-the-box. The API offers high-fidelity rendering of EPUB and MHTML files to the supported output formats, such as PDF, XPS and image file formats.

Convert EPUB to PDF - C#

//  Convert EPUB to PDF.
Aspose.Html.Converters.Converter.ConvertEPUB("document.epub", new Aspose.Html.Saving.PdfSaveOptions(), "output.pdf");

Convert MHTML to PDF - C#

//  Convert MHTML to PDF.
Aspose.Html.Converters.Converter.ConvertMHTML("document.mht", new Aspose.Html.Saving.PdfSaveOptions(), "output.pdf");

Web Scraping

Web scraping, also known as web harvesting, web data extraction or web crawling, is a technique for extracting data from a website. Aspose.HTML doesn't include a dedicated web scraping module out-of-the-box. However, because the Aspose.HTML API is based entirely on W3C specifications and supports XPath and CSS selector queries, you can easily inspect the content of any HTML document and build your own web scraping solution.

Simple Web Data Extraction - C#

// Create an instance of the HTML document with a website as a parameter.
using (var document = new Aspose.Html.HTMLDocument("https://en.wikipedia.org/wiki/Aspose_API"))
{
    // Get all anchor elements and dump their data to the console.
    // The typed foreach casts each node to HTMLAnchorElement, so no System.Linq import is needed.
    foreach (Aspose.Html.HTMLAnchorElement anchor in document.QuerySelectorAll("a"))
    {
        System.Console.WriteLine("[Href]: " + anchor.Href);
        System.Console.WriteLine("[Content]: " + anchor.TextContent);
    }
}
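
The same kind of extraction can also be written as an XPath query instead of a CSS selector. The sketch below assumes a small inline HTML fragment and an illustrative class name "item" so that it does not depend on a live website; the expression and markup are examples only.

```csharp
using System;
using Aspose.Html;
using Aspose.Html.Dom;
using Aspose.Html.Dom.XPath;

class XPathExtractionExample
{
    static void Main()
    {
        // Illustrative markup; in practice the document would come from a file or URL
        var html = "<ul><li class='item'>First</li><li>Second</li><li class='item'>Third</li></ul>";

        // Load the document from a string, using the current directory as the base URI
        using (var document = new HTMLDocument(html, "."))
        {
            // Select every <li> element whose class attribute equals "item"
            var result = document.Evaluate("//li[@class='item']", document, null, XPathResultType.Any, null);

            // Iterate over the matched nodes and print their text content
            for (Node node; (node = result.IterateNext()) != null;)
            {
                Console.WriteLine(node.TextContent);
            }
        }
    }
}
```

XPath is handy when the selection depends on document structure or attribute values that are awkward to express as a CSS selector.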
 

Support and Learning Resources

 
 

Aspose.HTML offers individual HTML processing APIs for other popular development environments as listed below: