OpenDoPE and XPath 2.0/3.0
January 3rd, 2019 by Jason
Docx4j generally uses Apache XPath (org.apache.xpath), from the Xalan 2.7.2 jar. (docx4j uses Xalan, plus Xalan-specific extension functions, for XSLT in various places, including HTML export and OpenDoPE processing.)
There are 2 main places where docx4j uses XPath:
- JaxbXmlPartXPathAware contains method getJAXBNodesViaXPath, which – thanks to JAXB’s concept of a binder – you can use to select objects (say P objects) in your MainDocumentPart
- OpenDoPE content control data binding: XPath is central to content control data binding (binding document content to XML data via XPath).
XPath 2.0 became a W3C Rec in 2007; XPath 3.0 became a W3C Rec in 2014.
Sadly, Apache XPath has languished at XPath 1.0 level: https://intellectualcramps.wordpress.com/2009/01/12/xerces-getting-xpath-2-0-support/ and http://apache-xml-project.6118.n7.nabble.com/XSLT-2-0-td20898.html
Saxon, in contrast, has supported XPath 2.0 for ages, and also supports 3.1.
In docx4j 6.1.0 we made it easy for you to try Saxon for case 1 (JaxbXmlPartXPathAware getJAXBNodesViaXPath):
Step 1: add Saxon to your classpath, for example (Maven):
<dependency>
  <groupId>net.sf.saxon</groupId>
  <artifactId>Saxon-HE</artifactId>
  <version>9.9.0-2</version>
</dependency>
Step 2: add the following early in your code:
XPathFactoryUtil.setxPathFactory(new net.sf.saxon.xpath.XPathFactoryImpl());
In docx4j 6.1.0, this only affects case 1. OpenDoPE content control data binding would still use Apache XPath.
In docx4j 8.0.0, Saxon would also be used for OpenDoPE content control data binding.
An example: date comparison
You can add an OpenDoPE conditional content control, in which the content is inserted only if the XPath "xs:date(/invoice/date) > xs:date('2018-12-31')" is true. (Date comparison is harder in XPath 1.0: https://stackoverflow.com/questions/4347320/xpath-dates-comparison )
For this to work, you need the prefix mapping xmlns:xs='http://www.w3.org/2001/XMLSchema', so your XPath in the OpenDoPE XPaths part would look something like:
<xpath id="dateGt">
  <dataBinding xpath="xs:date(/invoice/date) &gt; xs:date('2018-12-31')"
      prefixMappings="xmlns:xs='http://www.w3.org/2001/XMLSchema'"
      storeItemID="{8B049945-9DFE-4726-9DE9-CF5691E53858}"/>
</xpath>
(for now, you need to manually edit the zipped docx to add that; I’ll update the authoring tools to do it in due course)
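If you're wondering what a prefixMappings value does at evaluation time: each xmlns:prefix='uri' declaration has to end up in the namespace context the XPath engine consults. Here is a minimal sketch of that idea using only the JDK (so XPath 1.0). It is not docx4j's actual implementation; the class PrefixMappings, its methods, the inv prefix and the urn:example:invoice namespace are all made up for illustration.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, for illustration only (not part of docx4j).
class PrefixMappings {

    // Matches declarations such as xmlns:xs='http://www.w3.org/2001/XMLSchema'
    private static final Pattern DECL =
            Pattern.compile("xmlns:([\\w.-]+)\\s*=\\s*(['\"])(.*?)\\2");

    /** Turns a prefixMappings string into a prefix -> namespace URI map. */
    static Map<String, String> parse(String prefixMappings) {
        Map<String, String> map = new HashMap<>();
        Matcher m = DECL.matcher(prefixMappings);
        while (m.find()) {
            map.put(m.group(1), m.group(3));
        }
        return map;
    }

    /** Wraps the map so a JAXP XPath object can resolve prefixes with it. */
    static NamespaceContext toContext(final Map<String, String> map) {
        return new NamespaceContext() {
            @Override
            public String getNamespaceURI(String prefix) {
                String uri = map.get(prefix);
                return uri != null ? uri : XMLConstants.NULL_NS_URI;
            }
            @Override
            public String getPrefix(String uri) { return null; }      // not needed here
            @Override
            public Iterator<String> getPrefixes(String uri) { return null; }
        };
    }

    /** Evaluates expr against xml, resolving prefixes from prefixMappings. */
    static String evaluate(String xml, String prefixMappings, String expr) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true); // required for prefixed steps to match
        Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(toContext(parse(prefixMappings)));
        return xpath.evaluate(expr, doc);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<invoice xmlns='urn:example:invoice'><date>2019-01-15</date></invoice>";
        // prints 2019-01-15
        System.out.println(evaluate(xml, "xmlns:inv='urn:example:invoice'", "/inv:invoice/inv:date"));
    }
}
```

The xs mapping in the date example plays the same role: without it, the engine has no way to tell which namespace the xs: in xs:date refers to.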
You can try the date comparison example right away:
- get a docx4j 8.0 nightly: https://docx4java.org/docx4j/docx4j-8.0.0-SNAPSHOT-20190102.jar
- add Saxon (as above)
- bind invoice_Saxon_XPath2.docx from https://github.com/plutext/docx4j/blob/VERSION_8_0_0/sample-docs/word/databinding/invoice_Saxon_XPath2.docx using ContentControlBindingExtensions.java
Try changing the date in invoice-data.xml to, say, 2018-01-15, then observe the effect on the output docx.
Just to reiterate, you need Saxon for this to work. Xalan's XPath will cause an exception.
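For comparison, if you are stuck on an XPath 1.0 engine, the usual workaround is to strip the hyphens from an ISO date and compare the result as a number. A rough sketch using only the JDK's built-in javax.xml.xpath (which is XPath 1.0); the class and method names are made up, and the trick assumes the dates are always in yyyy-MM-dd form:

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical example class, for illustration only.
class XPath1DateCompare {

    // XPath 1.0 has no date type, so turn 2019-01-15 into the number 20190115
    // and compare that with 20181231.
    static final String AFTER_2018 =
            "number(translate(/invoice/date, '-', '')) > 20181231";

    static boolean isAfter2018(String xml) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return (Boolean) xpath.evaluate(AFTER_2018,
                new InputSource(new StringReader(xml)), XPathConstants.BOOLEAN);
    }

    public static void main(String[] args) throws Exception {
        // prints true
        System.out.println(isAfter2018("<invoice><date>2019-01-15</date></invoice>"));
        // prints false
        System.out.println(isAfter2018("<invoice><date>2018-01-15</date></invoice>"));
    }
}
```

It works, but the intent is much less obvious than xs:date(/invoice/date) > xs:date('2018-12-31'), and it breaks as soon as the date format varies.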
org.eclipse.wst.xml.xpath2.processor is an interesting possible alternative, but it is not in Maven Central, not as well-known as Saxon, and possibly not so easy to get support?