When a Word, Excel, or PowerPoint file is protected using a Microsoft Purview Sensitivity Label, the file is encrypted using Microsoft Information Protection (MIP).
Here is ChatGPT's advice on how to deal with that. (I haven't tested this, so please let us know how you go.)
Here’s how to decrypt a Microsoft 365 Sensitivity Label–protected `.docx` file with the **Microsoft Information Protection (MIP) SDK**, then load it into **docx4j**.
In your Maven `pom.xml`:

    <!-- Microsoft Information Protection SDK -->
    <dependency>
        <groupId>com.microsoft.informationprotection</groupId>
        <artifactId>mip-sdk</artifactId>
        <version>1.11.79</version> <!-- example version -->
    </dependency>

The MIP SDK Java bindings require Microsoft’s native runtime libraries. You can get them from
https://github.com/AzureAD/microsoft-in ... ection-sdk
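If the native libraries cannot be found at runtime, the JVM typically reports an `UnsatisfiedLinkError`. As a rough diagnostic sketch (not part of the SDK), the helper below looks for a native library in a list of directories such as `java.library.path`. The class name is made up for illustration, and the library base name you pass in is whatever the downloaded package actually contains; none is assumed here.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical helper for illustration; not part of the MIP SDK.
final class NativeLibraryLocator {

    private NativeLibraryLocator() {}

    /**
     * Searches each directory in a path-separator-delimited list for the
     * platform-specific file name of the given library.
     */
    static Optional<Path> find(String libraryBaseName, String searchPath) {
        // Maps e.g. "foo" to "foo.dll" on Windows or "libfoo.so" on Linux.
        String fileName = System.mapLibraryName(libraryBaseName);
        if (searchPath == null || searchPath.isEmpty()) {
            return Optional.empty();
        }
        for (String dir : searchPath.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue;
            }
            Path candidate = Paths.get(dir, fileName);
            if (Files.isRegularFile(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    /** Convenience overload that searches the JVM's java.library.path. */
    static Optional<Path> find(String libraryBaseName) {
        return find(libraryBaseName, System.getProperty("java.library.path"));
    }
}
```

Calling `NativeLibraryLocator.find(name)` before initializing the SDK gives a clearer failure message than a link error deep inside the first native call.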

    import com.microsoft.informationprotection.*;
    import com.microsoft.informationprotection.file.*;
    import org.docx4j.openpackaging.packages.WordprocessingMLPackage;

    import java.io.*;

    public class DecryptAndReadDocx4j {

        public static void main(String[] args) throws Exception {
            // Initialize MIP SDK
            MIP.Initialize(MipComponent.FILE);

            // Application info (from Azure AD app registration)
            ApplicationInfo appInfo = new ApplicationInfo("YOUR_CLIENT_ID", "MyJavaApp", "1.0.0");

            // Create AuthDelegate (you implement this to get OAuth tokens)
            AuthDelegateImpl authDelegate = new AuthDelegateImpl(
                    "YOUR_TENANT_ID",
                    "YOUR_CLIENT_ID",
                    "YOUR_CLIENT_SECRET"
            );

            // Create FileProfile
            FileProfileSettings profileSettings = new FileProfileSettings.Builder("MyProfile", "logs/")
                    .setAuthDelegate(authDelegate)
                    .build();
            FileProfile profile = MIP.loadFileProfile(profileSettings);

            // Create FileEngine for this user/context
            FileEngineSettings engineSettings = new FileEngineSettings(
                    "user@yourdomain.com", // must be a licensed M365 user
                    "en-US",
                    "MyEngine",
                    false
            );
            FileEngine engine = profile.addEngine(engineSettings);

            // Load the protected DOCX file
            String protectedFilePath = "sensitive.docx";
            FileHandler fileHandler = engine.createFileHandler(protectedFilePath, new FileHandlerSettings());

            // Decrypt to memory
            ByteArrayOutputStream decryptedData = new ByteArrayOutputStream();
            fileHandler.saveAs(decryptedData, FileFormat.OpenXML);

            // Use docx4j to load the decrypted document
            try (InputStream in = new ByteArrayInputStream(decryptedData.toByteArray())) {
                WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.load(in);
                // etc
            }

            // Clean up
            profile.unload();
            MIP.uninitialize();
        }
    }

How it Works:
1. **MIP SDK** handles decryption, policy, and user authorization.
2. The file’s decrypted bytes are stored in memory (`ByteArrayOutputStream`).
3. **docx4j** loads from that stream as if it were a normal `.docx`.
4. No unencrypted copy is written to disk by this code, so the plaintext only ever lives in memory.
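One way to sanity-check step 3 before calling `WordprocessingMLPackage.load` is to look at the first few bytes of the buffer. A plain `.docx` is a ZIP archive (signature `50 4B 03 04`), whereas an encrypted Office file is normally wrapped in an OLE compound file (signature `D0 CF 11 E0 A1 B1 1A E1`). The sketch below assumes that distinction holds for your labelled files; the class and method names are illustrative only.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; not part of docx4j or the MIP SDK.
final class OfficeContainerSniffer {

    enum Container { ZIP, OLE_COMPOUND, UNKNOWN }

    // Local file header signature of a ZIP archive ("PK\003\004").
    private static final byte[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    // Header signature of an OLE compound file.
    private static final byte[] OLE_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    private OfficeContainerSniffer() {}

    /** Classifies a byte buffer by its leading signature bytes. */
    static Container sniff(byte[] data) {
        if (startsWith(data, ZIP_MAGIC)) {
            return Container.ZIP;
        }
        if (startsWith(data, OLE_MAGIC)) {
            return Container.OLE_COMPOUND;
        }
        return Container.UNKNOWN;
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data == null || data.length < prefix.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }
}
```

In the example above, `OfficeContainerSniffer.sniff(decryptedData.toByteArray())` returning `OLE_COMPOUND` would suggest the bytes are still protected, which is easier to diagnose than an exception from the package loader.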
You’ll need to implement `AuthDelegateImpl`, which retrieves OAuth2 tokens for the MIP SDK.
Here’s a minimal example that uses **client credentials flow** (service principal):

    import com.microsoft.aad.msal4j.ClientCredentialFactory;
    import com.microsoft.aad.msal4j.ClientCredentialParameters;
    import com.microsoft.aad.msal4j.ConfidentialClientApplication;
    import com.microsoft.informationprotection.AuthDelegate;
    import com.microsoft.informationprotection.AuthDelegateLogonType;

    import java.util.Collections;
    import java.util.concurrent.CompletableFuture;

    public class AuthDelegateImpl implements AuthDelegate {

        private final String tenantId;
        private final String clientId;
        private final String clientSecret;

        public AuthDelegateImpl(String tenantId, String clientId, String clientSecret) {
            this.tenantId = tenantId;
            this.clientId = clientId;
            this.clientSecret = clientSecret;
        }

        @Override
        public CompletableFuture<String> acquireToken(String authority, String resource, String claim) {
            return CompletableFuture.supplyAsync(() -> {
                try {
                    // Simplest possible example using MSAL4J
                    ConfidentialClientApplication app = ConfidentialClientApplication
                            .builder(clientId, ClientCredentialFactory.createFromSecret(clientSecret))
                            .authority("https://login.microsoftonline.com/" + tenantId)
                            .build();

                    // Request the resource's default scope for the service principal
                    ClientCredentialParameters params = ClientCredentialParameters
                            .builder(Collections.singleton(resource + "/.default"))
                            .build();

                    return app.acquireToken(params).join().accessToken();
                } catch (Exception e) {
                    throw new RuntimeException("Token acquisition failed", e);
                }
            });
        }

        @Override
        public AuthDelegateLogonType getLogonType() {
            return AuthDelegateLogonType.CLIENT_CREDENTIALS;
        }
    }

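For context, the client credentials flow that MSAL4J performs here amounts to a form-encoded POST to the tenant's token endpoint. The sketch below only builds the endpoint URL and the request body (it sends nothing), to show where the tenant ID, client ID, secret, and `resource + "/.default"` scope end up. It assumes the Microsoft identity platform v2.0 token endpoint; the class name is illustrative, and any resource URL passed in is up to the caller.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper for illustration; MSAL4J does this work for you.
final class ClientCredentialsRequest {

    private ClientCredentialsRequest() {}

    /** Token endpoint for a tenant (assumes the v2.0 endpoint). */
    static String tokenEndpoint(String tenantId) {
        return "https://login.microsoftonline.com/" + tenantId + "/oauth2/v2.0/token";
    }

    /** Builds the application/x-www-form-urlencoded request body. */
    static String formBody(String clientId, String clientSecret, String resource) {
        // LinkedHashMap keeps the fields in insertion order.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("grant_type", "client_credentials");
        fields.put("client_id", clientId);
        fields.put("client_secret", clientSecret);
        fields.put("scope", resource + "/.default");
        return fields.entrySet().stream()
                .map(e -> encode(e.getKey()) + "=" + encode(e.getValue()))
                .collect(Collectors.joining("&"));
    }

    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }
}
```

Because the body contains the client secret, it should never be logged; in real code it is safer to leave the whole exchange to MSAL4J, as `AuthDelegateImpl` does.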