I am using the library for reading Word Docs. I do the following to get the text out:
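static String getParagraphText(Object obj) {
    // Anything that is not a paragraph yields no text.
    if (!(obj instanceof P)) {
        return "";
    }
    try {
        P paragraph = (P) obj;
        // wordStringWriter is a shared StringWriter; clear it before reuse.
        wordStringWriter.getBuffer().setLength(0);
        TextUtils.extractText(paragraph, wordStringWriter);
        return wordStringWriter.toString();
    } catch (Exception exception) {
        return "";
    }
}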
However, I always get stuff like ".....REF _Ref472256234 \w \h \* MERGEFORMAT ...." in the text. I thought of regex replacing them, but ...
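For reference, a minimal sketch of what the regex replacement could look like, using only java.util.regex. The class name FieldCodeStripper and the method stripRefFieldCodes are made up for illustration, and the pattern is an assumption based solely on the sample above: it only covers REF instructions of the form "REF _Ref<digits>" followed by backslash switches, so other field types would slip through.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
class FieldCodeStripper {

    // Matches "REF _Ref<digits>" plus any trailing switches such as
    // \w, \h or \* MERGEFORMAT (assumed shape, taken from the sample text).
    private static final Pattern REF_FIELD = Pattern.compile(
            "\\bREF\\s+_Ref\\d+(?:\\s+\\\\\\S+(?:\\s+MERGEFORMAT)?)*");

    static String stripRefFieldCodes(String text) {
        // Drop the field instruction, then tidy up the whitespace it leaves behind.
        String withoutFields = REF_FIELD.matcher(text).replaceAll("");
        return withoutFields.replaceAll("\\s{2,}", " ").trim();
    }

    public static void main(String[] args) {
        String raw = "see REF _Ref472256234 \\w \\h \\* MERGEFORMAT 3.2 for details";
        System.out.println(stripRefFieldCodes(raw));
    }
}
```

It would be called on the string returned by getParagraphText. This is a heuristic clean-up after the fact; it does not stop the field instructions from being extracted in the first place.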