XHTMLImporterImpl and input Tag

by **willi.firulais** » Tue Mar 18, 2014 12:16 am

Hello,

I haven't seen any way to get html:input tags converted using XHTMLImporterImpl, so I wanted to enhance the mapping. But looking at the XHTMLImporterImpl class, I haven't found any way to extend it for custom needs.

Some thoughts that came to mind while playing around with XHTMLImporterImpl: wouldn't it be great if the if/else chain in XHTMLImporterImpl were extensible by the developer who is using this really cool library?

What about moving the code inside each if/else block of the traverse method into its own method and, of course, making some state attributes available to the inheriting class? E.g.

Code:

    else {
        org.docx4j.wml.P currentP = this.getCurrentParagraph(true);
        currentP.setPPr(this.getPPr(blockBox, cssMap));
    }

changed to

Code:

    else {
        doDefaultConvert(box);
    }

This way it would be possible to do some customization without changing your code, by inheriting from XHTMLImporterImpl and overriding doDefaultConvert(). Maybe this first step can be done without major refactoring.
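To make the idea concrete, here is a minimal, self-contained sketch of the hook approach. It does not use docx4j at all: `SketchImporter`, `InputAwareImporter`, and the string-based "elements" and "output" are hypothetical stand-ins for XHTMLImporterImpl, the Box tree, and the generated WordML. Only the name `doDefaultConvert` comes from the proposal above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the importer; NOT the real XHTMLImporterImpl.
class SketchImporter {
    // State exposed to subclasses, as suggested in the post.
    protected final List<String> output = new ArrayList<>();

    // Simplified traverse: known elements are handled inline,
    // anything else is delegated to the overridable hook.
    public void traverse(String elementName) {
        if ("p".equals(elementName)) {
            output.add("paragraph");
        } else if ("table".equals(elementName)) {
            output.add("table");
        } else {
            doDefaultConvert(elementName);
        }
    }

    // Default behaviour: treat an unknown element as a plain paragraph.
    protected void doDefaultConvert(String elementName) {
        output.add("paragraph");
    }

    public List<String> getOutput() {
        return output;
    }
}

// Subclass that adds handling for <input> without touching the base class.
class InputAwareImporter extends SketchImporter {
    @Override
    protected void doDefaultConvert(String elementName) {
        if ("input".equals(elementName)) {
            output.add("form-field");
        } else {
            super.doDefaultConvert(elementName);
        }
    }
}

class HookDemo {
    public static void main(String[] args) {
        SketchImporter importer = new InputAwareImporter();
        importer.traverse("p");
        importer.traverse("input");
        importer.traverse("span");
        System.out.println(importer.getOutput());
    }
}
```

The base class keeps its existing behaviour for everything it already knows, and a subclass only has to deal with the elements it cares about, falling back to `super.doDefaultConvert` for the rest.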

Going a step further, a refactoring of XHTMLImporterImpl would be great, using one of:
- Listener Patterns (Callback Functions), or
- "SAX"ish Patterns (Handlers), or
- Rule Design Patterns (Conditions/Facts and Actions).

Code:

    // Proposed API only; none of these classes or methods exist in docx4j today.
    xHTMLImporterImpl.getRules().add(new GenericMatcher("ol", box), new DefaultListHandler(contentAccessor));
    xHTMLImporterImpl.getRules().add(new TableBoxMatcher(box), new DefaultTableHandler(contentAccessor));
    xHTMLImporterImpl.getRules().add(new InputItemMatcher(box), new DefaultItemHandler(contentAccessor));

Code:

    private void traverse(Box box, Box parent) {
        ...
        // Let every registered rule that matches this box handle it.
        for (Rule rule : this.rules) {
            if (rule.matcher.matches(box)) {
                rule.handler.handle(box);
            }
        }
        ...
    }
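For discussion, the rule idea can be sketched as a small self-contained program. Everything here is hypothetical: `Rule`, `RuleBasedImporter`, and the string-based elements are stand-ins rather than docx4j classes, and two behaviours are my own assumptions and not part of the proposal above: the first matching rule wins, and unmatched elements fall back to a plain paragraph.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical rule: a condition (matcher) paired with an action (handler).
final class Rule<T> {
    final Predicate<T> matcher;
    final Consumer<T> handler;

    Rule(Predicate<T> matcher, Consumer<T> handler) {
        this.matcher = matcher;
        this.handler = handler;
    }
}

// Hypothetical importer stand-in that dispatches on element names.
class RuleBasedImporter {
    private final List<Rule<String>> rules = new ArrayList<>();
    private final List<String> output = new ArrayList<>();

    public List<Rule<String>> getRules() {
        return rules;
    }

    public List<String> getOutput() {
        return output;
    }

    public void traverse(String elementName) {
        for (Rule<String> rule : rules) {
            if (rule.matcher.test(elementName)) {
                rule.handler.accept(elementName);
                return; // assumption: first matching rule wins
            }
        }
        output.add("paragraph"); // assumption: fallback for unmatched elements
    }
}

class RuleDemo {
    public static void main(String[] args) {
        RuleBasedImporter importer = new RuleBasedImporter();
        importer.getRules().add(new Rule<String>("ol"::equals,
                name -> importer.getOutput().add("list")));
        importer.getRules().add(new Rule<String>("input"::equals,
                name -> importer.getOutput().add("form-field")));

        importer.traverse("ol");
        importer.traverse("input");
        importer.traverse("div");
        System.out.println(importer.getOutput());
    }
}
```

Compared with the subclassing approach, rules can be added or reordered at runtime without inheriting from the importer, at the cost of having to define how conflicts between rules are resolved.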

Currently I use XHTMLImport to turn emphasised markup into a nice-looking Word document that is based on a Word template.

Emphasised markup (semantic HTML) means using only h1 to h6, p, strong, em, etc., with no div, no style, no JavaScript, and so on.

The following workflow has helped me prepare the Word document:
- creating the XHTML markup with CKEditor
- cleaning and preparing the markup for conversion with HtmlCleaner
- loading the Word template with docx4j
- using XHTMLImport to generate the WordML and attach it to the loaded template
- postprocessing the WordML with docx4j

Thx, Willia

by **jason** » Wed Mar 19, 2014 8:30 am

willi.firulais wrote: This way it would be possible to do some customization without changing your code, by inheriting from XHTMLImporterImpl and overriding doDefaultConvert(). Maybe this first step can be done without major refactoring.

Going a step further, a refactoring of XHTMLImporterImpl would be great, using one of:
- Listener Patterns (Callback Functions), or
- "SAX"ish Patterns (Handlers), or
- Rule Design Patterns (Conditions/Facts and Actions).

Happy to accept contributions (under ASLv2) that add this kind of flexibility.

It'd be good to discuss any proposal a bit here first. I'll rename this thread to "refactor for easier customization".
