Search MHTML Formats in Java
Native, high-performance MHTML file search using Java APIs, without installing any additional software such as Microsoft Word or Adobe Acrobat.
How to Search MHTML File Using Java
To search an MHTML file, we'll use the Aspose.Words for Java API, a feature-rich, powerful, and easy-to-use search API for the Java platform. You can download its latest version directly from Maven and install it within your Maven-based project by adding the following configurations to the pom.xml.
Repository
<repository>
<id>AsposeJavaAPI</id>
<name>Aspose Java API</name>
<url>https://repository.aspose.com/repo/</url>
</repository>
Dependency
<dependency>
<groupId>com.aspose</groupId>
<artifactId>aspose-words</artifactId>
<version>version of aspose-words API</version>
<classifier>jdk17</classifier>
</dependency>
Steps to Search MHTML Files in Java
Developers can integrate MHTML search into their applications with just a few lines of code, following the steps listed here.
- Load the MHTML file by instantiating a Document class object.
- Instantiate FindReplaceOptions.
- Use the Pattern.compile() method to define a regex pattern.
- Use the getRange().replace() method to find and replace the matches.
- Save the MHTML file.
System Requirements
Before integrating the code, make sure that you meet the following prerequisites.
- Microsoft Windows or a compatible OS with a Java Runtime Environment for JSP/JSF applications and desktop applications.
- Get the latest version of Aspose.Words for Java directly from Maven.
Search MHTML Files - Java
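import com.aspose.words.Document;
import com.aspose.words.FindReplaceOptions;
import java.util.regex.Pattern;

// Load the MHTML file
Document mhtml = new Document("sourceFile.mhtml");
// Find and replace words matching a similar pattern ("Bad", "Sad", "Mad").
// Inside a character class "|" is a literal, so [BSM] is used instead of [B|S|M].
FindReplaceOptions options = new FindReplaceOptions();
mhtml.getRange().replace(Pattern.compile("[BSM]ad"), "[replaced]", options);
// Save the MHTML file
mhtml.save("output.mhtml");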
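Since the replace call rewrites every match, it can be useful to preview which words a pattern would hit before touching the document. The following is a minimal sketch that uses only java.util.regex on plain text; the class and method names (PatternPreview, findMatches) are hypothetical and not part of Aspose.Words, and the sample sentence stands in for text that would come from the document. It also shows why a character class such as [BSM]ad is usually what is wanted: inside square brackets, | is a literal character, so [B|S|M]ad would additionally match "|ad".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PatternPreview {
    // Returns every substring of text that matches the pattern, in order of appearance.
    static List<String> findMatches(Pattern pattern, String text) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = pattern.matcher(text);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // Sample text is an assumption; real text would be extracted from the document.
        String sample = "Bad news made the Sad and the Mad glad.";
        // Matching is case-sensitive, so "made" and "glad" are not hits.
        System.out.println(findMatches(Pattern.compile("[BSM]ad"), sample));
        // prints [Bad, Sad, Mad]
    }
}
```

Running the preview first keeps the search step separate from the replace step, so the pattern can be adjusted before any file is saved.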
What is MHTML File Format
Files with the MHTML extension represent a web page archive format that can be created by a number of different applications. It is known as an archive format because it saves the web page's HTML code and its associated resources in a single file. These resources include anything linked to the web page, such as images, applets, animations, and audio files. MHTML files can be opened in a variety of applications, such as Internet Explorer and Microsoft Word. Microsoft Windows also uses the MHTML file format to record the steps that lead to a problem observed while using an application. The format encodes the page contents in a manner similar to message/rfc822, the plain-text email message format. The format itself is specified in RFC 2557.
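To make the single-file structure concrete, here is a minimal sketch that lists the Content-Type of each part inside an MHTML document using only the Java standard library. It assumes the usual multipart/related layout with a boundary parameter in the top-level headers; the class name MhtmlParts, its methods, and the sample document are all hypothetical. It does not decode base64 or quoted-printable bodies and ignores folded headers, so it is an illustration of the layout rather than a full parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MhtmlParts {
    // Extracts the multipart boundary declared in the top-level headers, or null if absent.
    static String findBoundary(String mhtml) {
        Matcher m = Pattern.compile("boundary=\"?([^\";\\r\\n]+)\"?").matcher(mhtml);
        return m.find() ? m.group(1) : null;
    }

    // Lists the Content-Type of each body part delimited by the boundary.
    static List<String> partContentTypes(String mhtml) {
        List<String> types = new ArrayList<>();
        String boundary = findBoundary(mhtml);
        if (boundary == null) {
            return types;
        }
        // Each part starts with "--" followed by the boundary string.
        String[] chunks = mhtml.split(Pattern.quote("--" + boundary));
        Pattern contentType = Pattern.compile("(?im)^Content-Type:\\s*([^;\\r\\n]+)");
        // chunks[0] holds the top-level headers, so parts start at index 1.
        for (int i = 1; i < chunks.length; i++) {
            Matcher m = contentType.matcher(chunks[i]);
            if (m.find()) {
                types.add(m.group(1).trim());
            }
        }
        return types;
    }

    // A tiny hand-written MHTML document: one HTML page plus one linked image.
    static String sampleDocument() {
        return String.join("\r\n",
            "MIME-Version: 1.0",
            "Content-Type: multipart/related; boundary=\"----=_Part_0\"; type=\"text/html\"",
            "",
            "------=_Part_0",
            "Content-Type: text/html; charset=\"utf-8\"",
            "Content-Location: http://example.com/",
            "",
            "<html><body><img src=\"logo.png\"></body></html>",
            "------=_Part_0",
            "Content-Type: image/png",
            "Content-Transfer-Encoding: base64",
            "Content-Location: http://example.com/logo.png",
            "",
            "iVBORw0KGgo=",
            "------=_Part_0--",
            "");
    }

    public static void main(String[] args) {
        System.out.println(partContentTypes(sampleDocument()));
        // prints [text/html, image/png]
    }
}
```

The sample shows the page and its image travelling together in one file, each part carrying its own headers in the same style as an email message.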
Other Supported Search Documents
Using Java, one can also search files in other formats.