${substitution} with HTML

by **daceilo** » Sun Jan 15, 2012 9:03 am

OK, hoping someone can help me out because I'm feeling brain dead. I need to take HTML (from a complex TextArea web form) and insert it into a docx document. I can do the following already:

1. Convert the HTML into a part and append that part to the end of a new document using docx4j
2. Replace the ${tags} with plain strings in a docx document

But I'm unsure how to insert a part in the correct place... Sorry if this is a FAQ; I've searched all over the forums for examples of this.

by **jason** » Sun Jan 15, 2012 12:00 pm

I assume you're importing the HTML using a docx4j nightly, the package org.docx4j.convert.in.xhtml, and something like:

    WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.createPackage();
    wordMLPackage.getMainDocumentPart().getContent().addAll(
            convert(f, wordMLPackage));

but that you'd instead like to attach the content at position n in some wordMLPackage2. You'd do:

    wordMLPackage2.getMainDocumentPart().getContent().addAll(
            n, convert(f, wordMLPackage));
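
To illustrate the positional `addAll` call: the content of the main document part is an ordinary `java.util.List`, so inserting at position n pushes the element currently at n, and everything after it, further down. Here is a minimal, library-free sketch of that behaviour. The class and method names are made up for illustration, and plain strings stand in for docx4j objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class InsertAtPositionSketch {

    /** Returns a copy of target with all of imported spliced in before position n. */
    static List<Object> insertAt(List<Object> target, int n, List<Object> imported) {
        List<Object> result = new ArrayList<>(target);
        // List.addAll(int, Collection) shifts the element at n, and everything after it, to the right.
        result.addAll(n, imported);
        return result;
    }

    public static void main(String[] args) {
        // Strings stand in for the block-level objects of a real document.
        List<Object> body = Arrays.asList("heading", "${placeholder}", "footer");
        List<Object> html = Arrays.asList("html-p1", "html-p2");
        System.out.println(insertAt(body, 1, html));
    }
}
```

Note that the placeholder block is still in the list afterwards (now at position n plus the number of inserted blocks), so it has to be removed separately if it should disappear.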

Now how you determine your position n is another question. Cut/pasted from elsewhere:

There are three approaches for finding the relevant block:
• manually
• via XPath
• via TraversalUtil

TraversalUtil is the recommended approach, mainly because of a limitation in using XPath with JAXB (described below).

Explanations of the three approaches follow.

Common to all of them however, is the question of how to identify what you are looking for.
• Paragraphs don't have IDs, so you might search for a particular string.
• Or you might search for the first paragraph following a section break.
• A good approach is to use content controls (which can have IDs), and to search for your content control by ID, title or tag.

Manual approach

The manual approach is to iterate through the block-level elements in the document yourself, looking for the paragraph, table or content control that matches your criteria. To do this, you'd use this method of the org.docx4j.wml.Body element:

    public List<Object> getEGBlockLevelElts()
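
The manual search boils down to a loop over the block list that returns the first matching position. The following is only a sketch under stated assumptions: `indexOfFirst` is a hypothetical helper, and strings stand in for the paragraph, table and content control objects a real document would contain.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

final class ManualSearchSketch {

    /** Returns the index of the first top-level block accepted by matches, or -1 if none is. */
    static int indexOfFirst(List<Object> blocks, Predicate<Object> matches) {
        for (int i = 0; i < blocks.size(); i++) {
            if (matches.test(blocks.get(i))) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Strings stand in for paragraphs; the marker text is made up for this example.
        List<Object> blocks = Arrays.asList("Title", "Dear ${name},", "${HTML_BODY}", "Regards");
        int n = indexOfFirst(blocks, b -> b.toString().contains("${HTML_BODY}"));
        System.out.println("insert position n = " + n);
    }
}
```

With real docx4j objects the predicate would inspect the block's type and text instead of calling `toString()`.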

XPath approach

Underlying this approach is the use of XPath to select JAXB nodes:

    MainDocumentPart documentPart = wordMLPackage.getMainDocumentPart();
    String xpath = "//w:p";
    List<Object> list = documentPart.getJAXBNodesViaXPath(xpath, false);

You then find the index of the returned node in EGBlockLevelElts.

Beware, there is a limitation to using XPath in JAXB: the XPath expressions are evaluated against the XML document as it was when first opened in docx4j. You can update the associated XML document once only, by passing true into getJAXBNodesViaXPath. Updating it again (with current JAXB 2.1.x or 2.2.x) will cause an error, so you need to be a bit careful!

TraversalUtil approach

TraversalUtil is a general approach for traversing the JAXB object tree in the main document part. TraversalUtil has an interface Callback, which you use to specify how you want to traverse the nodes, and what you want to do to them.

TraversalUtil can be used to find a node; you then get the index of the returned node in EGBlockLevelElts.
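
The idea of "traverse with a callback, then take the index of the enclosing block" can be modelled without the library. The sketch below is not docx4j's TraversalUtil API: `Visitor`, `walk` and `blockIndexContaining` are hypothetical names, and nested lists stand in for container elements such as paragraphs and runs.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

final class TraversalSketch {

    /** Hypothetical callback: called once per visited node, returns true to stop the walk. */
    interface Visitor {
        boolean visit(Object node);
    }

    /** Depth-first walk; nested lists play the role of container elements. */
    static boolean walk(Object node, Visitor visitor) {
        if (visitor.visit(node)) {
            return true;
        }
        if (node instanceof List) {
            for (Object child : (List<?>) node) {
                if (walk(child, visitor)) {
                    return true;
                }
            }
        }
        return false;
    }

    /** Index of the first top-level block that is, or contains, a matching node; -1 if none. */
    static int blockIndexContaining(List<Object> body, Predicate<Object> matches) {
        for (int i = 0; i < body.size(); i++) {
            if (walk(body.get(i), matches::test)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<Object> paragraph = Arrays.asList("run text", "$HTML1$");
        List<Object> body = Arrays.asList("intro", paragraph, "end");
        System.out.println(blockIndexContaining(body, o -> "$HTML1$".equals(o)));
    }
}
```

The point of the design is that the match may sit several levels deep, while the index that matters for insertion is always that of the top-level block.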

by **daceilo** » Tue Jan 31, 2012 5:52 pm

Thanks! I definitely overcomplicated it... As you pointed out, it was extremely easy as soon as I stopped thinking too hard.

Thanks again.

by **Empirica** » Fri Mar 23, 2012 3:55 am

Sorry, I'm having the same problem here, and I don't understand what you mean by

jason wrote: You then find the index of the returned node in EGBlockLevelElts.

The piece of code in question is below. Replacing a placeholder with text works fine.

    private void replacePlaceholders() throws JAXBException {
        Properties templateProperties = new Properties();

        try {
            templateProperties.load(getClass().getResourceAsStream("template.properties"));
        } catch (Exception e) {
            e.printStackTrace();
        }

        // givenPackage is a WordprocessingMLPackage
        List<Object> texts = givenPackage.getMainDocumentPart()
                .getJAXBNodesViaXPath(XPATH_TO_SELECT_TEXT_NODES, true);

        for (Object obj : texts) {
            Text text = (Text) ((JAXBElement) obj).getValue();

            if (isHTMLPlaceholder(text.getValue())) {
                // replace element with HTML content
            } else {
                // replace Node with normal text
                String textValue = replacePlaceholdersByValue(text.getValue(), templateProperties);
                text.setValue(textValue);
            }
        }
    }

I guess I then have to do something like this:

    private void replaceHTML() {
        givenPackage.getMainDocumentPart().getContent().addAll(INDEX, HTML_CHUNK);
    }

But I still don't know how to get the correct index, or how to correctly replace the placeholder with HTML content at the specified position.

Please help!

by **jason** » Mon Mar 26, 2012 9:13 pm

If you have an object o, and you want to find its position in the contents, you can do:

    givenPackage.getMainDocumentPart().getContent().indexOf(o)
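
One caveat worth spelling out: `indexOf(o)` only finds objects that sit directly in the content list. A text node nested inside a run inside a paragraph is not a direct member, so the lookup has to start from its top-level ancestor. The sketch below models this with a hypothetical `Node` class that keeps a parent reference. docx4j's generated classes expose a `getParent()` method that should allow the same walk, but treat that as an assumption to check against your docx4j version.

```java
import java.util.ArrayList;
import java.util.List;

final class BlockIndexSketch {

    /** Hypothetical stand-in for a document node that knows its parent. */
    static final class Node {
        final String name;
        Node parent;
        final List<Node> children = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }

        /** Appends child, records this node as its parent, and returns this for chaining. */
        Node add(Node child) {
            child.parent = this;
            children.add(child);
            return this;
        }
    }

    /** Walks up from nested to the ancestor that sits directly in body; returns its index or -1. */
    static int blockIndexOf(Node body, Node nested) {
        Node current = nested;
        while (current != null && current.parent != body) {
            current = current.parent;
        }
        return current == null ? -1 : body.children.indexOf(current);
    }

    public static void main(String[] args) {
        Node body = new Node("body");
        Node paragraph = new Node("p");
        Node run = new Node("r");
        Node text = new Node("$HTML1$");
        body.add(new Node("first p")).add(paragraph);
        paragraph.add(run);
        run.add(text);

        System.out.println(body.children.indexOf(text)); // nested node is not a direct member
        System.out.println(blockIndexOf(body, text));    // index of the enclosing paragraph
    }
}
```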

by **Empirica** » Thu Mar 29, 2012 10:05 pm

Thank you. For anyone else searching for a solution, here is the complete code for text AND HTML replacement using docx4j in Word documents:

    // Traces the whole document tree for each word
    private void replacePlaceholders() throws JAXBException {
        Properties templateProperties = new Properties();
        int index;

        try {
            templateProperties.load(getClass().getResourceAsStream("template.properties"));
        } catch (Exception e) {
            e.printStackTrace();
        }

        List<Object> texts = givenPackage.getMainDocumentPart()
                .getJAXBNodesViaXPath(XPATH_TO_SELECT_TEXT_NODES, true);

        for (Object obj : texts) {
            Text text = (Text) ((JAXBElement) obj).getValue();

            if (isHTMLPlaceholder(text.getValue())) {
                // replace element with HTML content
                logger.info("New HTML Placeholder found");

                // position of this node in texts (the XPath result list)
                index = texts.indexOf(obj);
                replaceHTML(index, text.getValue(), templateProperties);

                // remove superfluous placeholder
                givenPackage.getMainDocumentPart().getContent().remove(index);
            } else {
                // replace Node with normal text
                if (text.getValue().startsWith("$") && text.getValue().endsWith("$")) {
                    logger.info("New TEXT Placeholder found");
                    String textValue = replacePlaceholdersByValue(text.getValue(), templateProperties);
                    text.setValue(textValue);
                }
            }
        }
    }

    private boolean isHTMLPlaceholder(String placeholderValue) {
        if (placeholderValue.startsWith("$") && placeholderValue.endsWith("$")) {
            if (placeholderValue.contains("$HTML")) return true;
        }
        return false;
    }

    // Replace Placeholders with HTML
    private void replaceHTML(int index, String placeholderValue, Properties templateProperties) {
        String methodName = "";

        try {
            if (templateProperties.containsKey(placeholderValue)) {
                // the property value is the name of a no-argument getter on givenData
                methodName = (String) templateProperties.get(placeholderValue);
                logger.debug("Placeholder found: " + methodName);
                placeholderValue = (String) givenData.getClass().getMethod(methodName).invoke(givenData);
            }
        } catch (Exception e) {
            logger.error(e.getMessage());
            e.printStackTrace();
        }

        String html = "<html>" + placeholderValue + "</html>";
        AlternativeFormatInputPart afiPart = null;

        // Create HTML
        try {
            logger.info("Trying to create an html part.");
            // CAUTION: each html part needs a new name!!
            afiPart = new AlternativeFormatInputPart(new PartName("/hw" + index + ".html"));
        } catch (InvalidFormatException e) {
            e.printStackTrace();
        }

        // Parse Content
        logger.info("Get the Bytes and set the Content type of the html part.");
        afiPart.setBinaryData(html.getBytes());
        afiPart.setContentType(new ContentType("text/html"));

        Relationship altChunkRel = null;

        try {
            logger.info("adding the Target Path...");
            altChunkRel = givenPackage.getMainDocumentPart().addTargetPart(afiPart);
        } catch (InvalidFormatException e) {
            e.printStackTrace();
        }

        // Add HTML to document
        logger.info("Adding HTML to the document..");
        CTAltChunk ac = Context.getWmlObjectFactory().createCTAltChunk();
        ac.setId(altChunkRel.getId());
        givenPackage.getMainDocumentPart().getContent().add(index - 1, ac);
    }

    // Replace Placeholders with text
    private String replacePlaceholdersByValue(String placeholderValue, Properties templateProperties) {
        String methodName = "";

        try {
            if (templateProperties.containsKey(placeholderValue)) {
                methodName = (String) templateProperties.get(placeholderValue);
                logger.debug("Placeholder found: " + methodName);
                placeholderValue = (String) givenData.getClass().getMethod(methodName).invoke(givenData);
            }
        } catch (Exception e) {
            logger.error(e.getMessage());
            e.printStackTrace();
        }

        return placeholderValue;
    }

Hope it helps someone.

Cheers
- Empirica

by **subhrajlahiri** » Mon Apr 06, 2020 9:58 pm

Hi,

I have been trying this code, but I end up getting an error on the following line:

    givenPackage.getMainDocumentPart().getContent().add(index-1,ac);

The error is:

    Exception in thread "main" java.lang.IndexOutOfBoundsException: Index: 80, Size: 11
        at java.util.ArrayList.rangeCheckForAdd(ArrayList.java:665)
        at java.util.ArrayList.add(ArrayList.java:477)
        at per.subhra.docxtemplate.App.replaceHTML(App.java:418)
        at per.subhra.docxtemplate.App.replaceHtmlPlaceHolders(App.java:136)
        at per.subhra.docxtemplate.App.main(App.java:117)

Any help is appreciated. Thanks.
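
One plausible reading of this exception, going only by the code posted earlier in the thread: `index` there comes from `texts.indexOf(obj)`, a position in the flat list of all text nodes returned by the XPath query, while `getContent()` holds only the top-level blocks. A document with 11 blocks can easily contain more than 80 text nodes, so the position is out of range for the content list. The sketch below reproduces that mismatch with plain lists and shows an in-place replacement that checks the range first. `replaceBlock` is a hypothetical helper, and the block index it receives is assumed to have been determined against the content list itself.

```java
import java.util.ArrayList;
import java.util.List;

final class ReplaceBlockSketch {

    /** Replaces the block at blockIndex in place; returns false when the index is outside the list. */
    static boolean replaceBlock(List<Object> content, int blockIndex, Object replacement) {
        if (blockIndex < 0 || blockIndex >= content.size()) {
            return false; // e.g. an index taken from a different, longer list
        }
        content.set(blockIndex, replacement); // one step instead of a separate add and remove
        return true;
    }

    public static void main(String[] args) {
        List<Object> content = new ArrayList<>();
        for (int i = 0; i < 11; i++) {
            content.add("block" + i);
        }
        try {
            content.add(80, "altChunk"); // position taken from the flat text-node list
        } catch (IndexOutOfBoundsException e) {
            System.out.println("out of range: " + e.getMessage());
        }
        System.out.println(replaceBlock(content, 3, "altChunk") + " " + content.get(3));
    }
}
```

Using `set` also avoids the `index - 1` insert followed by `remove(index)`, which only lines up when the two lists happen to have matching positions.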
