This thread is intended to kick off discussion on what should/will be in docx4j v3.
If there is anything you'd especially like to see (whether listed or not), please reply. If you can contribute something, all the better.
New features:
Word 2010 support: support for the new XML elements/schemas introduced with Word 2010, and for the compatibility mechanism. This is the main justification for the 3.0 label.
What else? I will add to this list over time...
Cleanup/Rationalisation:
HTML exporters: get rid of old ones; standardise on NG2. The idea is to remove any 'which should I use' confusion, and focus effort/know-how.
PDF exporters: standardise on viaXSLFO, and get rid of viaIText and viaHTML. As with HTML, the idea is to remove any 'which should I use' confusion, and focus effort/know-how. docx4j could produce XSL-FO only, and rely on the user to have FOP or an equivalent to actually produce the PDF. This would reduce dependencies, making docx4j lighter. The goal would be to remove the fop jar (2.8 MB), PDF renderer jar (1.6 MB), iText jar (1.1 MB), and core-renderer (1 MB).
Font handling: I'm thinking of removing the PANOSE handling, so we don't need a customised FOP jar.
dom4j: remove all dependencies on it (already marked deprecated). (Remove jdom as well?)
Noted as being of interest in the forum, but not scheduled as part of v3:
Layout model: docx4j contains a DocumentModel, which sets us up to work out what page a particular paragraph will fall on (though at first, we'd only aim to do that if the document contains only simple text and tables, and perhaps images). The problem is that whatever page numbers we come up with, whatever actually renders the output (e.g. Word, a web browser, or something like FOP) is outside our control and will lay out the page differently (to a greater or lesser extent).
Inserting OLE objects: I'd like to work on this, but would only be looking at it for fun...