Redact HTML Formats in Java
Redact sensitive information in HTML documents natively and with high performance using Java APIs, without the use of any third-party software such as Microsoft or Adobe products.
How to Redact HTML File Using Java
In order to redact an HTML file, we’ll use the Aspose.Words for Java API, which is a feature-rich, powerful and easy-to-use redaction API for the Java platform. You can download its latest version directly from Maven and install it within your Maven-based project by adding the following configurations to the pom.xml.
Repository
<repository>
<id>AsposeJavaAPI</id>
<name>Aspose Java API</name>
<url>https://repository.aspose.com/repo/</url>
</repository>
Dependency
<dependency>
<groupId>com.aspose</groupId>
<artifactId>aspose-words</artifactId>
<version>version of aspose-words API</version>
<classifier>jdk17</classifier>
</dependency>
Steps to Redact HTML Files in Java
Sensitive information in a document's contents, comments or metadata can be redacted with a basic search and replace in just a few lines of code.
- Instantiate the Document class.
- Create a FindReplaceOptions object.
- Set the pattern to search for.
- Call the replace method with the relevant options.
- Save the document.
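The pattern and replace steps can be tried out without Aspose.Words, because the pattern is a standard java.util.regex.Pattern. The following is a minimal sketch on a plain string, not the Aspose.Words API itself: the class name RedactionSketch, the simplified e-mail pattern and the "[REDACTED]" mask are all assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RedactionSketch {
    // Simplified e-mail pattern, chosen for illustration only.
    static final Pattern EMAIL = Pattern.compile("[\\w.+-]+@[\\w-]+(\\.[\\w-]+)+");

    // Replaces every match of the pattern with a fixed mask.
    // quoteReplacement keeps '$' and '\' in the mask from being treated specially.
    static String redact(String text, Pattern pattern, String mask) {
        return pattern.matcher(text).replaceAll(Matcher.quoteReplacement(mask));
    }

    public static void main(String[] args) {
        String html = "<p>Contact jane.doe@example.com or bob@example.org.</p>";
        System.out.println(redact(html, EMAIL, "[REDACTED]"));
    }
}
```

Note that this sketch works on raw markup as a string, whereas the steps above apply the pattern to the text range of a loaded document, so tags and attributes are handled by the library rather than by the regular expression.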
System Requirements
Before integrating the code, make sure that you have the following prerequisites.
- Microsoft Windows or a compatible OS with a Java Runtime Environment, for JSP/JSF applications and desktop applications.
- The latest version of Aspose.Words for Java, available directly from Maven.
Redact HTML Files - Java
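import com.aspose.words.Document;
import com.aspose.words.DocumentBuilder;
import com.aspose.words.FindReplaceOptions;
import java.util.regex.Pattern;

// Build a small document to redact.
Document html = new Document();
DocumentBuilder builder = new DocumentBuilder(html);
builder.writeln("sad mad bad");
// Compare string contents with equals(); == only compares references.
if ("sad mad bad".equals(html.getText().trim())) {
    System.out.println("Strings are equal!");
}
// Replace all occurrences of the words "sad" or "mad" with "bad".
FindReplaceOptions options = new FindReplaceOptions();
html.getRange().replace(Pattern.compile("[sm]ad"), "bad", options);
// Save the HTML document; dataDir is assumed to hold the output folder path.
html.save(dataDir + "output.html");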
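The pattern passed to replace is a standard java.util.regex.Pattern, so its matching behavior can be checked on a plain string before running it against a document. The sketch below uses hypothetical names (PatternCheck and replaceWithBad are not part of Aspose.Words). It shows that the character class [sm] matches a single s or m, and that a | placed inside the brackets is treated as a literal character rather than as alternation.

```java
import java.util.regex.Pattern;

public class PatternCheck {
    // Applies a regex to the given text and replaces each match with "bad".
    static String replaceWithBad(String regex, String text) {
        return Pattern.compile(regex).matcher(text).replaceAll("bad");
    }

    public static void main(String[] args) {
        // [sm]ad matches "sad" and "mad" only.
        System.out.println(replaceWithBad("[sm]ad", "sad mad bad")); // bad bad bad
        // Inside brackets '|' is a literal character, so "|ad" matches too.
        System.out.println(replaceWithBad("[s|m]ad", "sad |ad"));    // bad bad
        // Without the '|', the text "|ad" is left untouched.
        System.out.println(replaceWithBad("[sm]ad", "sad |ad"));     // bad |ad
    }
}
```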
What is HTML File Format
HTML (Hyper Text Markup Language) is the format of web pages created for display in browsers. Known as the language of the web, HTML has evolved to meet new requirements for the information displayed in web pages. The latest variant is known as HTML 5, which gives a lot of flexibility for working with the language. HTML pages are either received from the server where they are hosted or loaded from the local system. Each HTML page is made up of HTML elements such as forms, text, images, animations, links, etc. These elements are represented by tags such as img, a, p and several others, most of which have a start and an end tag. A page can also embed scripts written in languages such as JavaScript, and style sheets (CSS) that control the overall layout.
Other Supported Redaction Documents
Using Java, one can easily redact many other document formats as well.