I assume you're importing the HTML using a docx4j nightly, the package org.docx4j.convert.in.xhtml, and something like:
    WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.createPackage();
    wordMLPackage.getMainDocumentPart().getContent().addAll(convert(f, wordMLPackage));
but that you'd instead like to attach the content at position n in some wordMLPackage2. You'd do:
    wordMLPackage2.getMainDocumentPart().getContent().addAll(n, convert(f, wordMLPackage));
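The two-argument addAll used above is the ordinary java.util.List method, so its behaviour can be tried without docx4j at all. A minimal sketch, assuming plain strings stand in for the block-level objects (the class and method names here are made up for illustration):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class InsertAtPosition {
    /** Returns a copy of content with all of imported inserted at 0-based index n. */
    static List<Object> insertAt(List<Object> content, int n, List<Object> imported) {
        List<Object> result = new ArrayList<>(content);
        // Elements previously at index n and beyond are shifted right.
        result.addAll(n, imported);
        return result;
    }

    public static void main(String[] args) {
        List<Object> existing = Arrays.asList("P0", "P1", "P2"); // stand-ins for existing blocks
        List<Object> imported = Arrays.asList("H0", "H1");       // stand-ins for converted HTML
        System.out.println(insertAt(existing, 1, imported));     // prints [P0, H0, H1, P1, P2]
    }
}
```

So n is the index the first imported block ends up at; n equal to the list size appends, and anything larger throws IndexOutOfBoundsException.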
Now how you determine your position n is another question. Cut/pasted from elsewhere:
There are three approaches for finding the relevant block:
• manually
• via XPath
• via TraversalUtil
TraversalUtil is the recommended approach, mainly because using XPath in JAXB has a limitation (described under the XPath approach below).
Explanations of the three approaches follow.
Common to all of them, however, is the question of how to identify what you are looking for.
• Paragraphs don't have IDs, so you might search for a particular string.
• Or you might search for the first paragraph following a section break.
• A good approach is to use content controls (which can have IDs), and to search for your content control by ID, title or tag.
Manual approach
The manual approach is to iterate through the block-level elements in the document yourself, looking for the paragraph, table or content control which matches your criteria. To do this, you'd use this method of org.docx4j.wml.Body:
    public List<Object> getEGBlockLevelElts()
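The loop itself might look something like the sketch below. It is only an illustration: Para is a made-up stand-in for a paragraph object (not a docx4j class), and the list is a plain List<Object> of the kind the method above returns.

```java
import java.util.Arrays;
import java.util.List;

final class ManualBlockSearch {
    /** Hypothetical stand-in for a paragraph; not a docx4j class. */
    static final class Para {
        final String text;
        Para(String text) { this.text = text; }
    }

    /** Returns the index of the first Para whose text contains needle, or -1 if none does. */
    static int indexOfParagraphContaining(List<Object> blocks, String needle) {
        for (int i = 0; i < blocks.size(); i++) {
            Object block = blocks.get(i);
            // The list is heterogeneous (paragraphs, tables, ...), so check the type first.
            if (block instanceof Para && ((Para) block).text.contains(needle)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<Object> blocks = Arrays.asList(
                new Para("Introduction"), "a table stand-in", new Para("INSERT HERE"));
        System.out.println(indexOfParagraphContaining(blocks, "INSERT HERE")); // prints 2
    }
}
```

The returned index is the n you would pass to addAll; add one if the new content should go after the matching block rather than before it.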
XPath approach
Underlying this approach is the use of XPath to select JAXB nodes:
    MainDocumentPart documentPart = wordMLPackage.getMainDocumentPart();
    String xpath = "//w:p";
    List<Object> list = documentPart.getJAXBNodesViaXPath(xpath, false);
You then find the index of the returned node in EGBlockLevelElts.
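That index lookup could be sketched as below, assuming the block list holds the very same object instances the query returned. The helper name is made up, and plain objects stand in for the JAXB nodes. Comparing by reference rather than equals() avoids matching a different node that merely compares equal.

```java
import java.util.ArrayList;
import java.util.List;

final class IdentityIndex {
    /** Index of the exact node instance in blocks, or -1 if it is not a direct member. */
    static int indexOfSame(List<Object> blocks, Object node) {
        for (int i = 0; i < blocks.size(); i++) {
            if (blocks.get(i) == node) { // reference comparison, not equals()
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<Object> blocks = new ArrayList<>();
        Object first = new Object();
        Object target = new Object();
        blocks.add(first);
        blocks.add(target);
        System.out.println(indexOfSame(blocks, target)); // prints 1
    }
}
```

Note that "//w:p" also matches paragraphs nested inside tables or content controls; such a node is not a direct member of the block list, so a result of -1 needs handling.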
Beware, there is a limitation to using XPath in JAXB: the XPath expressions are evaluated against the XML document as it was when first opened in docx4j. You can update the associated XML document once only, by passing true into getJAXBNodesViaXPath. Updating it again (with current JAXB 2.1.x or 2.2.x) will cause an error. So you need to be a bit careful!
TraversalUtil approach
TraversalUtil is a general approach for traversing the JAXB object tree in the main document part. TraversalUtil has an interface Callback, which you use to specify how you want to traverse the nodes, and what you want to do to them.
TraversalUtil can be used to find a node; you then get the index of the returned node in EGBlockLevelElts.
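To give a feel for the callback style, here is a self-contained sketch of the idea. It does not use docx4j: Node, Callback, walk and findByTag are all simplified, made-up stand-ins, and docx4j's own Callback interface is not identical to this one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class TraversalSketch {
    /** Hypothetical tree node standing in for a JAXB object. */
    static final class Node {
        final String tag;
        final List<Node> children;
        Node(String tag, Node... children) {
            this.tag = tag;
            this.children = Arrays.asList(children);
        }
    }

    /** Simplified callback; an assumption for illustration, not docx4j's interface. */
    interface Callback {
        void apply(Node node);             // what to do at each node
        boolean shouldTraverse(Node node); // whether to descend into its children
    }

    /** Depth-first walk that hands every visited node to the callback. */
    static void walk(Node node, Callback callback) {
        callback.apply(node);
        if (callback.shouldTraverse(node)) {
            for (Node child : node.children) {
                walk(child, callback);
            }
        }
    }

    /** Collects every node with the given tag, in document order. */
    static List<Node> findByTag(Node root, String tag) {
        List<Node> found = new ArrayList<>();
        walk(root, new Callback() {
            @Override public void apply(Node node) {
                if (node.tag.equals(tag)) found.add(node);
            }
            @Override public boolean shouldTraverse(Node node) {
                return true; // visit the whole tree
            }
        });
        return found;
    }

    public static void main(String[] args) {
        Node sdt = new Node("sdt", new Node("p")); // content control stand-in
        Node body = new Node("body", new Node("p"), new Node("tbl"), sdt);
        Node hit = findByTag(body, "sdt").get(0);
        System.out.println(body.children.indexOf(hit)); // prints 2
    }
}
```

The last two lines of main mirror the workflow described above: the traversal finds the node, and the index is then looked up in the parent's list of block-level children.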