Search DOC Formats in Java
Native, high-performance search of Microsoft Word DOC files using Java APIs, without requiring Microsoft Word or any other third-party software.
How to Search DOC File Using Java
To search a Microsoft Word DOC file, we'll use the Aspose.Words for Java API, a feature-rich, powerful and easy-to-use search API for the Java platform. You can download its latest version directly from Maven and install it within your Maven-based project by adding the following configuration to the pom.xml.
Repository
<repository>
<id>AsposeJavaAPI</id>
<name>Aspose Java API</name>
<url>https://repository.aspose.com/repo/</url>
</repository>
Dependency
<dependency>
<groupId>com.aspose</groupId>
<artifactId>aspose-words</artifactId>
<version>version of aspose-words API</version>
<classifier>jdk17</classifier>
</dependency>
Steps to Search DOC Files in Java
Developers can integrate DOC search into their applications with just a few lines of code, following these steps.
- Load the DOC file by instantiating a Document object.
- Instantiate FindReplaceOptions.
- Use the Pattern.compile() method to define a regex pattern.
- Use the getRange().replace() method to find and replace the matches.
- Save the DOC file.
System Requirements
Before integrating the code, make sure that you have the following prerequisites.
- Microsoft Windows or a compatible OS with a Java Runtime Environment for JSP/JSF and desktop applications.
- Get the latest version of Aspose.Words for Java directly from Maven.
Search DOC Files - Java
import com.aspose.words.Document;
import com.aspose.words.FindReplaceOptions;
import java.util.regex.Pattern;

// Load the DOC file
Document doc = new Document("sourceFile.doc");
// Find every word matching the pattern ("Bad", "Sad" or "Mad") and replace it
FindReplaceOptions options = new FindReplaceOptions();
doc.getRange().replace(Pattern.compile("[BSM]ad"), "[replaced]", options);
// Save the DOC file
doc.save("output.doc");
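Because the search is driven by a standard java.util.regex pattern, the expression can be tried out on plain strings before it is run against a document. The following is a minimal sketch under that assumption. The class RegexSearchSketch, its findMatches helper and the sample text are hypothetical names introduced for illustration and are not part of Aspose.Words. It also shows that inside a character class the | character is matched literally rather than acting as alternation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for trying a search pattern on plain text.
// It uses only java.util.regex and does not touch any DOC file.
final class RegexSearchSketch {

    // Returns every substring of text that matches the pattern, in order.
    static List<String> findMatches(Pattern pattern, String text) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = pattern.matcher(text);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String sample = "Bad Sad Mad Dad |ad"; // assumed sample text

        // A character class lists the allowed first letters directly.
        System.out.println(findMatches(Pattern.compile("[BSM]ad"), sample));
        // prints [Bad, Sad, Mad]

        // With '|' inside the brackets, a literal '|' is accepted as well.
        System.out.println(findMatches(Pattern.compile("[B|S|M]ad"), sample));
        // prints [Bad, Sad, Mad, |ad]

        // Plain-string equivalent of replacing each match.
        System.out.println(Pattern.compile("[BSM]ad").matcher(sample).replaceAll("[replaced]"));
        // prints [replaced] [replaced] [replaced] Dad |ad
    }
}
```

Once the pattern returns the expected matches on sample text, the same Pattern object can be passed to the document's replace call.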
What is DOC File Format
Files with the .doc extension are documents generated by Microsoft Word or other word processing applications in a binary file format. The extension was initially used for plain text documentation on several different operating systems. A DOC file can contain many different types of data, such as images, formatted and plain text, graphs, charts, embedded objects, links, pages, page formatting, print settings and many others. The format was popular for all sorts of documentation because of the variety of options it offers for writing manuals, proposals, specifications, resumes, articles and similar documents. The updated version of DOC is DOCX, which is based on Office Open XML, whose specifications are openly available.
Other Supported Search Documents
Using Java, one can also search other files including.