
Redact DOC Formats in Java

Redact sensitive information in Microsoft Word DOC documents natively and with high performance using Java APIs, without relying on other software such as Microsoft Office or Adobe Acrobat.

How to Redact DOC File Using Java

To redact a Microsoft Word DOC file, we'll use the Aspose.Words for Java API, a feature-rich, powerful, and easy-to-use document processing and redaction API for the Java platform. You can download its latest version directly from Maven and install it within your Maven-based project by adding the following configuration to the pom.xml.

Repository


<repository>
    <id>AsposeJavaAPI</id>
    <name>Aspose Java API</name>
    <url>https://repository.aspose.com/repo/</url>
</repository>

Dependency

<dependency>
    <groupId>com.aspose</groupId>
    <artifactId>aspose-words</artifactId>
    <version>version of aspose-words API</version>
    <classifier>jdk17</classifier>
</dependency>

Steps to Redact DOC Files in Java

A basic search and replace of text in a Word document's contents, comments, or metadata takes just a few lines of code. Redacting sensitive information works the same way: find the text and replace it.

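  • Instantiate the Document class.
  • Create a FindReplaceOptions object.
  • Define the search pattern as a java.util.regex.Pattern.
  • Call the replace method on the document's range with the pattern, the replacement text, and the options.
  • Save the document.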
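
The pattern step can be prototyped on plain strings before it is passed to the document API. The following is a minimal sketch that uses only java.util.regex and does not call Aspose.Words. The class name, the method name, the "[REDACTED]" marker, and the deliberately simple e-mail pattern are all illustrative assumptions, not part of the library.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for trying out a redaction pattern on plain strings.
class RedactionPatternSketch {

    // Illustrative, deliberately simple e-mail pattern; not a complete address matcher.
    static final Pattern EMAIL = Pattern.compile("[\\w.+-]+@[\\w-]+(\\.[\\w-]+)+");

    // Replaces every match of the pattern with a fixed marker.
    // quoteReplacement keeps '$' and '\' in the marker from being treated as group references.
    static String redact(String text, Pattern pattern, String marker) {
        return pattern.matcher(text).replaceAll(Matcher.quoteReplacement(marker));
    }

    public static void main(String[] args) {
        String sample = "Contact jane.doe@example.com or bob@test.org for details.";
        System.out.println(redact(sample, EMAIL, "[REDACTED]"));
    }
}
```

Once a pattern behaves as intended on sample strings, the same Pattern object can be handed to the replace call described in the steps above.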

System Requirements

Before integrating the code, make sure that you have the following prerequisites.

  • Microsoft Windows or any other OS with a Java Runtime Environment, for JSP/JSF and desktop applications.
  • The latest version of Aspose.Words for Java, available directly from Maven.
 

Redact DOC Files - Java

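import com.aspose.words.Document;
import com.aspose.words.DocumentBuilder;
import com.aspose.words.FindReplaceOptions;
import java.util.regex.Pattern;

// Build a sample document containing the text to redact.
Document doc = new Document();
DocumentBuilder builder = new DocumentBuilder(doc);
builder.writeln("sad mad bad");

// Compare string contents with equals(); == only compares object references.
if ("sad mad bad".equals(doc.getText().trim())) {
	System.out.println("Strings are equal!");
}

// Replace all occurrences of the words "sad" or "mad" with "bad".
// Inside a character class '|' is a literal, so the class is written [sm], not [s|m].
FindReplaceOptions options = new FindReplaceOptions();
doc.getRange().replace(Pattern.compile("[sm]ad"), "bad", options);

// Save the DOC document. dataDir is the output folder path, defined elsewhere.
doc.save(dataDir + "output.doc");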
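
The regular expression used for the replacement can be checked on a plain string with java.util.regex alone. The sketch below does not use Aspose.Words, and its class and method names are hypothetical. It applies the pattern to the sample text from the snippet above and also shows that `|` inside a character class is a literal character rather than an "or", which is why `[sm]` is the tighter way to write "s or m".

```java
import java.util.regex.Pattern;

// Hypothetical helper that mirrors the find-and-replace step on plain strings.
class CharacterClassSketch {

    // Compiles the regex and replaces every match in the text.
    static String replaceAll(String regex, String text, String replacement) {
        return Pattern.compile(regex).matcher(text).replaceAll(replacement);
    }

    public static void main(String[] args) {
        // "[sm]ad" matches "sad" and "mad"; "bad" is left unchanged.
        System.out.println(replaceAll("[sm]ad", "sad mad bad", "bad"));
        // Inside a character class '|' is a literal, so "[s|m]ad" also matches "|ad".
        System.out.println(replaceAll("[s|m]ad", "sad |ad", "bad"));
    }
}
```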
 
  • Aspose.Words for Java can load, view, and convert Microsoft Word and OpenDocument formats such as DOC, DOCX, and ODT to PDF, XPS, HTML, and various other formats. You can also create new documents from scratch and save them in any supported format. It is a standalone API suited to server-side and backend systems where high performance is required, and it does not depend on other software such as Microsoft Office or OpenOffice.


    What is DOC File Format

    Files with the .doc extension are documents generated by Microsoft Word or other word processing applications in a binary file format. The extension was initially used for plain text documentation on several different operating systems. A DOC file can contain many different types of data, such as images, formatted and plain text, graphs, charts, embedded objects, links, pages, page formatting, print settings, and much more. The format was popular for all sorts of documentation because of the variety of options it offers for writing manuals, proposals, specifications, resumes, articles, and similar documents. The successor to DOC is DOCX, which is based on Office Open XML, whose specifications are openly available.
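
As a rough illustration of the difference between binary DOC and DOCX, the two can usually be told apart by their first bytes. This sketch rests on facts that are not stated above: binary DOC files are stored in an OLE2 compound file container whose signature is D0 CF 11 E0 A1 B1 1A E1, and DOCX packages are ZIP archives that start with 50 4B 03 04. The class and method names are hypothetical, and a matching signature only identifies the container, not the exact document type.

```java
import java.util.Arrays;

// Hypothetical helper that guesses the container format from a file's leading bytes.
class DocSignatureSketch {

    // Signature of the OLE2 compound file container used by binary .doc files.
    static final byte[] OLE2 = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // ZIP local file header signature, which .docx packages start with.
    static final byte[] ZIP = {0x50, 0x4B, 0x03, 0x04};

    // Returns true when data begins with the given prefix.
    static boolean startsWith(byte[] data, byte[] prefix) {
        return data.length >= prefix.length
                && Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }

    // Classifies the container; other legacy Office or ZIP-based files share these signatures.
    static String guessContainer(byte[] header) {
        if (startsWith(header, OLE2)) return "OLE2 (binary DOC or other legacy Office file)";
        if (startsWith(header, ZIP)) return "ZIP (DOCX or other ZIP-based package)";
        return "unknown";
    }

    public static void main(String[] args) {
        System.out.println(guessContainer(OLE2));
        System.out.println(guessContainer(ZIP));
    }
}
```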


    Other Supported Redaction Documents

    Using Java, one can easily redact other formats as well, including:

    DOCX (Office 2007+ Word Document)
    DOT (Microsoft Word Template Files)
    DOTX (Microsoft Word Template File)
    HTML (Hyper Text Markup Language)
    MD (Markdown Language)
    MHTML (Web Page Archive Format)
    ODT (OpenDocument Text File Format)
    OTT (OpenDocument Text Template)
    RTF (Rich Text Format)
    TXT (Text Document)
    XHTML (XML Text Based Markup)