by jason » Sat Apr 17, 2010 12:51 am
Hello Anbu Chezhian.S
I've had a look at the documents you supplied, thanks.
I've made some quick fixes in svn.
Several issues remain:
- underline (should be easy to fix)
- wmf/emf images (see recent posts)
- field code handling
- improvements to hanging indentation, numbering
- a JAXB error
Some of these I may look at soon, others not. So you can either try fixing them yourself, or avail yourself of Plutext professional services.
Out of interest, how many documents were there in total in the corpus these came from?
And did you resolve your MathML issue? If so, please consider posting/contributing your solution. Thanks!
Details:-
coming_content_loss.docx
------------------------
style missing basedOn - FIXED
content_los1.docx
-----------------
ignore w:smartTagPr - FIXED
content_missing.docx
--------------------
Image missing (tc containing wp:inline)
Caused by: java.lang.ClassCastException:
org.docx4j.openpackaging.parts.WordprocessingML.MetafileEmfPart
cannot be cast to org.docx4j.openpackaging.parts.WordprocessingML.BinaryPartAbstractImage
at org.docx4j.model.images.WordXmlPictureE20.handleImageRel(WordXmlPictureE20.java:416)
at org.docx4j.model.images.WordXmlPictureE20.createWordXmlPictureFromE20(WordXmlPictureE20.java:239)
at org.docx4j.model.images.WordXmlPictureE20.createHtmlImgE20(WordXmlPictureE20.java:293)
Table - no cell borders
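The ClassCastException above suggests the image handling code casts every image part to the raster image type without checking first. A minimal sketch of a guarded cast follows. The classes are stand-ins I made up for illustration (they are not the real docx4j types), and the placeholder output is just one possible fallback:

```java
// Stand-in types for illustration only; these are NOT the real docx4j classes.
abstract class BinaryPartStub {
    final String name;

    BinaryPartStub(String name) {
        this.name = name;
    }
}

// Plays the role of a raster image part (png, jpeg, ...).
class RasterImagePartStub extends BinaryPartStub {
    RasterImagePartStub(String name) {
        super(name);
    }
}

// Plays the role of an emf/wmf metafile part.
class MetafilePartStub extends BinaryPartStub {
    MetafilePartStub(String name) {
        super(name);
    }
}

final class ImageRelGuard {
    /** Returns an HTML fragment for the part, without assuming every image is raster. */
    static String describe(BinaryPartStub part) {
        if (part instanceof RasterImagePartStub) {
            RasterImagePartStub image = (RasterImagePartStub) part;
            return "<img src=\"" + image.name + "\"/>";
        }
        // Metafiles and anything else get a placeholder instead of an exception.
        return "<!-- unsupported image part: " + part.name + " -->";
    }

    public static void main(String[] args) {
        System.out.println(describe(new RasterImagePartStub("image1.png")));
        System.out.println(describe(new MetafilePartStub("image2.emf")));
    }
}
```

With a check like this the rest of the document would still convert, and the missing metafile would show up as a comment in the output rather than aborting the image.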
erroe_UC-006_Ppmts_Manage Encumbrances.docx
-------------------------------------------
Handled java.lang.NullPointerException
at org.docx4j.model.listnumbering.Emulator.getNumber(Emulator.java:179)
Image missing (presume emf/wmf). Is that the main problem?
This doc is 18 pages - I only looked at it briefly.
error.docx
----------
Seems ok, maybe because of fixes above?
error_Use Case.docx
-------------------
Seems ok
error_vita_dfp2.docx
--------------------
NOT IMPLEMENTED: support for fldChar
NOT IMPLEMENTED: support for instrText
I think one of the other community members is working on support for field codes.
Also, question marks are generated somewhere from empty paragraphs which look like:
<w:p>
<w:pPr>
<w:pStyle w:val="Achievement"/>
<w:numPr>
<w:ilvl w:val="0"/>
<w:numId w:val="0"/>
</w:numPr>
<w:ind w:left="245"/>
<w:jc w:val="left"/>
</w:pPr>
</w:p>
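Note the w:numId of 0 in that paragraph. As far as I know, WordprocessingML uses numId 0 to mean "numbering switched off" rather than to point at a real numbering definition, so a lookup that treats it as an ordinary id could plausibly produce the stray characters. I haven't confirmed that this is the cause. A rough sketch of the kind of guard I have in mind, where labelFor and the definitions map are hypothetical names and not docx4j API:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

final class NumberingGuard {
    /**
     * Returns the list label for a paragraph, or empty if it should not be numbered.
     * Hypothetical helper: the map stands in for whatever holds the numbering definitions.
     */
    static Optional<String> labelFor(Integer numId, Map<Integer, String> definitions) {
        // No numPr at all, or numId 0: treat the paragraph as unnumbered.
        if (numId == null || numId == 0) {
            return Optional.empty();
        }
        // Unknown ids also yield empty rather than a null or a placeholder character.
        return Optional.ofNullable(definitions.get(numId));
    }

    public static void main(String[] args) {
        Map<Integer, String> definitions = new HashMap<>();
        definitions.put(1, "1.");
        System.out.println(labelFor(1, definitions).orElse("(none)"));
        System.out.println(labelFor(0, definitions).orElse("(none)"));
    }
}
```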
exception_not_converted.docx
----------------------------
JAXB error
16.04.2010 22:36:21 *INFO * Part: Constructing /word/document.xml (Part.java, line 132)
java.lang.NumberFormatException: For input string: "62259f"
at java.lang.NumberFormatException.forInputString(Unknown Source)
at java.lang.Integer.parseInt(Unknown Source)
at java.math.BigInteger.<init>(Unknown Source)
at java.math.BigInteger.<init>(Unknown Source)
at com.sun.xml.bind.DatatypeConverterImpl._parseInteger(DatatypeConverterImpl.java:72)
at com.sun.xml.bind.v2.model.impl.RuntimeBuiltinLeafInfoImpl$21.parse(RuntimeBuiltinLeafInfoImpl.java:674)
Needs to be looked at.
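For anyone who wants to dig into this: the value "62259f" looks like hexadecimal, while the trace shows JAXB parsing the attribute as a plain decimal integer. That the value is meant as hex is my assumption; I haven't checked which attribute carries it. The following standalone sketch reproduces the failure with the JDK alone and shows one possible lenient fallback (parseLenient is a made-up helper, not something in docx4j or JAXB):

```java
import java.math.BigInteger;

final class HexIntegerParsing {
    /** Parses a decimal integer, falling back to base 16 if decimal parsing fails. */
    static BigInteger parseLenient(String text) {
        try {
            return new BigInteger(text);
        } catch (NumberFormatException e) {
            // Assumption: a value that is not valid decimal was written as hex.
            return new BigInteger(text, 16);
        }
    }

    public static void main(String[] args) {
        try {
            new BigInteger("62259f");
        } catch (NumberFormatException e) {
            System.out.println("decimal parse failed: " + e.getMessage());
        }
        System.out.println("lenient parse: " + parseLenient("62259f"));
    }
}
```

A fallback like this is ambiguous for strings made only of digits (a hex value such as "1234" would silently be read as decimal), so a proper fix probably belongs in the type mapping for the attribute concerned.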
fine_listing_not_coming.docx
----------------------------
Small caps come out as normal, that's all.
halfdone_exception1.docx
-------------------------
Underline missing
Bullet not aligned with others
Bulleted hanging indents aren't hanging
notopening.docx
----------------
Added <xsl:template match="w:tblPrEx"/> -- seems ok
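In case the effect of that empty template isn't obvious: a template with no body matches the element and emits nothing, so the element and everything under it drop out of the output. Here is a small standalone demonstration using the JDK's built-in XSLT processor. The stylesheet and input are cut-down illustrations, not the real docx4j XSLT, and the w:note child with text in it is contrived so that there is something visible to suppress:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class EmptyTemplateDemo {
    static final String W_NS =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Builds a text-output stylesheet; extraTemplates is spliced in verbatim. */
    static String stylesheet(String extraTemplates) {
        return "<xsl:stylesheet version='1.0'"
            + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
            + " xmlns:w='" + W_NS + "'>"
            + "<xsl:output method='text'/>"
            + extraTemplates
            + "</xsl:stylesheet>";
    }

    /** Applies the stylesheet to the input and returns the result as a string. */
    static String transform(String xslt, String xml) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    // Contrived input: a row with a tblPrEx holding text, then a cell with real content.
    static final String INPUT =
        "<w:tr xmlns:w='" + W_NS + "'>"
        + "<w:tblPrEx><w:note>hidden</w:note></w:tblPrEx>"
        + "<w:tc><w:p><w:r><w:t>cell</w:t></w:r></w:p></w:tc>"
        + "</w:tr>";

    public static void main(String[] args) throws TransformerException {
        // Built-in rules only: all text is copied through.
        System.out.println(transform(stylesheet(""), INPUT));
        // With the empty template, the tblPrEx subtree is skipped.
        System.out.println(transform(stylesheet("<xsl:template match='w:tblPrEx'/>"), INPUT));
    }
}
```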
underline_missing.docx
----------------------
Underline is indeed missing (should be easily fixed), as is an image (presumed to be emf/wmf).