Background: The client is requesting a report in multiple formats, one of which is docx. We decided the easiest route would be to populate the report data into an HTML template, convert the resulting string to XHTML, and then convert that to the final format. The report contains details about a given record, and there is one table in the report for a type of association. We recently integrated with a dataset whose records have over 100k of that type of association, and we've been hitting timeouts on the docx conversion when the table holds 100k+ rows. We can't drop the table, and we can't extend the timeout.
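For reference, the conversion step looks roughly like this (a simplified sketch; our templating code is omitted and the class, method, and variable names here are just illustrative):

```java
import java.io.File;

import org.docx4j.convert.in.xhtml.XHTMLImporterImpl;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;

public class ReportToDocx {

    // xhtml is the populated HTML template after it has been tidied into XHTML
    public static void writeDocx(String xhtml, File out) throws Exception {
        WordprocessingMLPackage pkg = WordprocessingMLPackage.createPackage();

        // Convert the XHTML string and append the result to the main document part
        XHTMLImporterImpl importer = new XHTMLImporterImpl(pkg);
        pkg.getMainDocumentPart().getContent().addAll(importer.convert(xhtml, null));

        pkg.save(out);
    }
}
```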
Question: Is it possible to optimize the XHTML conversion process, or should I look into handling this as a special case and not populate the table until after conversion? I noticed the comment on XHTMLImporterImpl about the FSEntityResolver issue, but it sounds like that is something that needs to be fixed in FlyingSaucer; please correct me if I am misinterpreting that. A rough sketch of the special-case idea is below.
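To make the special case concrete, this is roughly what I have in mind, assuming most of the time is going into Flying Saucer's layout pass over the huge table: convert the report without the big table, then build the rows directly with docx4j's ObjectFactory and append the Tbl to the document. Sketch only; placeholder lookup, styling, and our real row model are omitted, and the names are illustrative:

```java
import java.util.List;

import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
import org.docx4j.wml.ObjectFactory;
import org.docx4j.wml.P;
import org.docx4j.wml.R;
import org.docx4j.wml.Tbl;
import org.docx4j.wml.Tc;
import org.docx4j.wml.Text;
import org.docx4j.wml.Tr;

public class AssociationTableBuilder {

    private static final ObjectFactory WML = new ObjectFactory();

    // rows: the association data; List<String[]> is just a stand-in for our real model
    public static void appendTable(WordprocessingMLPackage pkg, List<String[]> rows) {
        Tbl tbl = WML.createTbl();
        for (String[] row : rows) {
            Tr tr = WML.createTr();
            for (String cellValue : row) {
                // One paragraph-of-text per cell
                Text text = WML.createText();
                text.setValue(cellValue);
                R run = WML.createR();
                run.getContent().add(text);
                P p = WML.createP();
                p.getContent().add(run);
                Tc tc = WML.createTc();
                tc.getContent().add(p);
                tr.getContent().add(tc);
            }
            tbl.getContent().add(tr);
        }
        // Appending at the end for simplicity; in practice we'd insert at a placeholder
        pkg.getMainDocumentPart().getContent().add(tbl);
    }
}
```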
As an aside, we have advised the client that a report of that size isn't ideal for docx, as the format generally becomes unstable with files past a certain size threshold. Thank you for your time.