Aspose.HTML for .NET

C# API to Parse HTML Files

Class library for working with real-world HTML. Create, edit, extract data from, merge and convert HTML pages to PDF, DOCX, XPS, images and other formats.


Aspose.HTML for .NET is an advanced HTML processing API for a wide range of management and manipulation tasks within cross-platform applications. The API is designed to create, modify, extract data from, convert and render HTML documents without any external software. It also supports popular file formats such as EPUB, MHTML, XML, SVG and Markdown, and renders to PDF, DOCX, XPS and image file formats. Aspose.HTML for .NET is written entirely in C# and can be used to build any type of 32-bit or 64-bit .NET application, including ASP.NET, WCF, WinForms and .NET Core.

Moreover, the HTML Document Object Model is integrated out of the box with embedded formats and specifications such as CSS, HTML Canvas, SVG, XPath and JavaScript, which extend the manipulation functionality and rendering quality. You can see the full list of Aspose.HTML features in our documentation. Using the Aspose.HTML C# library in your project allows you to perform the following tasks:

Advanced .NET HTML API Features

Create HTML pages from scratch

Load existing HTML from file, stream or URL

Implement W3C specifications

Implement templates using template merger

Fill the template with various data sources

Render HTML Canvas 2D to PDF

Add, replace or remove nodes

Extract data from HTML documents

Load EPUB and MHTML file formats

Render HTML to raster image formats

Render multiple documents at once

Apply header and footer during HTML to PDF conversion

Navigate HTML using XPath Query or CSS Selector

Convert HTML to PDF, Image and Other Formats in C#

With just a few lines of code, the C# API lets you implement HTML to PDF, HTML to image or any other supported conversion in your .NET applications. The conversion takes three steps: load the document, create the save options, and call the converter.

Convert HTML to PDF - C#


     
using Aspose.Html;
using Aspose.Html.Saving;
using Aspose.Html.Converters;
...
    
    // Load an HTML file to be converted
    using var document = new HTMLDocument("input.html");
    
    // Create an instance of the PdfSaveOptions class
    var pdfSaveOptions = new PdfSaveOptions();
    
    // Convert HTML to PDF
    Converter.ConvertHTML(document, pdfSaveOptions, "output.pdf");
    


An online HTML Converter is also available for trying conversions in the browser.
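
The same call pattern should carry over to raster image output by swapping the save options. The following is a minimal sketch, not a definitive implementation: the file names input.html and output.png are placeholders, and the ImageSaveOptions and ImageFormat names are used as recalled from the library's Saving and Rendering.Image namespaces, so check them against the API reference for your version.

```csharp
using Aspose.Html;
using Aspose.Html.Converters;
using Aspose.Html.Rendering.Image;
using Aspose.Html.Saving;

// Load the HTML file to be converted (placeholder file name)
using var document = new HTMLDocument("input.html");

// Ask for PNG output; other ImageFormat values select other raster formats
var imageSaveOptions = new ImageSaveOptions(ImageFormat.Png);

// Convert HTML to an image (placeholder output name)
Converter.ConvertHTML(document, imageSaveOptions, "output.png");
```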

You can also convert HTML, XHTML, MHTML, Markdown, EPUB or SVG into many other file formats.

Editing HTML Documents

Aspose.HTML for .NET allows you to create and edit HTML documents using a Document Object Model (DOM). The DOM is a programming interface that represents an HTML document as a tree of nodes, where each node represents part of the document. Through this interface, the Aspose.HTML for .NET API lets you access a page and change its structure, style and content. You can modify the document by inserting new nodes and by removing or editing existing nodes.

The .NET HTML API helps developers read, modify, navigate and edit (X)HTML documents. Some of the editing functions that the Aspose.HTML for .NET API can perform are the following:
- navigate HTML documents by using various methods, such as element traversal, document traversal, XPath queries and CSS selector queries,
- remove and replace HTML nodes,
- extract and edit CSS from HTML,
- configure a document sandbox and more.
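
The DOM editing described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than a reference implementation: the element id, the text and the output file name edited.html are hypothetical, and the calls follow the standard W3C DOM shape that the library exposes.

```csharp
using Aspose.Html;

// Create an empty HTML document
using var document = new HTMLDocument();

// Create a paragraph node and set its attribute and text (hypothetical values)
var paragraph = document.CreateElement("p");
paragraph.SetAttribute("id", "intro");
paragraph.TextContent = "Added through the DOM.";

// Insert the new node into the document body
document.Body.AppendChild(paragraph);

// Find the node again with a CSS selector and edit its content
var found = document.QuerySelector("#intro");
found.TextContent = "Edited through the DOM.";

// Save the modified document (placeholder file name)
document.Save("edited.html");
```

Removing a node works the same way: calling RemoveChild on the parent node with the node to delete detaches it from the tree.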

Markdown Support

Markdown is a markup language with a plain-text formatting syntax. It is often used for documentation and readme files because it is easy to read and easy to write. Aspose.HTML provides a flexible Markdown Converter that works in both directions: from Markdown to HTML and from HTML to Markdown. Moreover, the converter API has a set of predefined rules, so you can convert HTML to Markdown using the original Markdown syntax or the GitLab Flavored Markdown variant, or configure the rules for your own needs.

Convert HTML to Markdown - C#


     
using Aspose.Html;
using Aspose.Html.Saving;
...
    
    // Load an HTML file
    using var document = new HTMLDocument("document.html");

    // Convert HTML to Markdown using a set of features supported by GitLab Flavored Markdown
    document.Save("output.md", MarkdownSaveOptions.Git);


The reverse conversion is just as simple. Using the Aspose.HTML class library in your C# application, you can convert Markdown into an HTML file with a single line of code.

Convert Markdown to HTML - C#


     
using Aspose.Html.Converters;
...	

	// Convert Markdown to HTML
	Converter.ConvertMarkdown("document.md", "output.html");



An online Markdown Converter is also available. It converts Markdown to PDF, XPS, DOCX, JPG, PNG, BMP, TIFF, GIF and MHTML. Upload your documents, transform them and get results in a few seconds, with no additional software required.

Electronic Books and Web Archives

Aspose.HTML for .NET can load EPUB and MHTML files and perform various operations on them, including conversion to fixed-layout and raster image formats.

Convert EPUB to PDF - C#


     
using System.IO;
using Aspose.Html.Converters;
using Aspose.Html.Saving;
...
    
    // Open an existing EPUB file for reading
    using var stream = File.OpenRead("input.epub");
    
    // Create an instance of PdfSaveOptions
    var options = new PdfSaveOptions();
    
    // Call the ConvertEPUB method to convert EPUB to PDF
    Converter.ConvertEPUB(stream, options, "output.pdf");


Convert MHTML to PDF - C#


     
using System.IO;
using Aspose.Html.Converters;
using Aspose.Html.Saving;
...
    
    // Open an existing MHTML file for reading
    using var stream = File.OpenRead("input.mht");
    
    // Create an instance of PdfSaveOptions
    var options = new PdfSaveOptions();
    
    // Call the ConvertMHTML method to convert MHTML to PDF
    Converter.ConvertMHTML(stream, options, "output.pdf");



You can try the online MHTML Converter and the online EPUB Converter. Our browser-based conversion tools work on all platforms, including Windows, Linux, macOS, Android and iOS, and are compatible with PCs, smartphones and tablets.

Web Scraping

Web scraping, also known as web harvesting, web data extraction or web crawling, is a technique for extracting data from a website. Aspose.HTML does not include a dedicated web scraping module out of the box. However, because the Aspose.HTML API is based entirely on W3C specifications and supports XPath and CSS selector queries, you can inspect the content of any HTML document and build your own web scraping solution.

Simple Web Data Extraction - C#


     
using System.Linq;
using Aspose.Html;
...

    // Create an instance of the HTML document with a website as a parameter
    using var document = new Aspose.Html.HTMLDocument("https://en.wikipedia.org/wiki/Aspose_API");

    // Get all anchor-elements
    var elements = document.QuerySelectorAll("a");

    // Dump the anchor-element data to the console
    elements.Cast<HTMLAnchorElement>().ToList().ForEach(x =>
        {
            System.Console.WriteLine("[Href]: " + x.Href);
            System.Console.WriteLine("[Content]: " + x.TextContent);
        });
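
The same kind of extraction can also be expressed as an XPath query instead of a CSS selector. The sketch below is illustrative only: it loads a small inline HTML string (a made-up sample, with "." as an assumed base URI) so that it does not depend on a live website, and the expression //a[@href] is just one possible query.

```csharp
using System;
using Aspose.Html;
using Aspose.Html.Dom;
using Aspose.Html.Dom.XPath;

// Hypothetical sample markup; a file path or URL could be loaded instead
var html = "<p><a href='https://example.com/a'>First</a></p>" +
           "<div><a href='https://example.com/b'>Second</a><a>No link</a></div>";

// Create a document from the inline content; "." is an assumed base URI
using var document = new HTMLDocument(html, ".");

// Evaluate an XPath expression that selects anchors having an href attribute
var result = document.Evaluate("//a[@href]", document, null, XPathResultType.Any, null);

// Iterate over the matched nodes and print their text
for (Node node; (node = result.IterateNext()) != null;)
{
    Console.WriteLine("[Content]: " + node.TextContent);
}
```

XPath is convenient when the selection depends on attributes or document structure that is awkward to express as a CSS selector.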



Aspose.HTML also offers free online Data Scraper apps for getting data from websites. The apps are safe, work on any platform and do not require any software installation. Data Scrapers can be used for extracting images, getting keywords from a web page and similar tasks. They are easy to use, yet powerful and reliable.


Aspose.HTML also offers individual HTML processing APIs for other popular development environments.