Feb 23 2010

docx4j v2.3.0 released

I’m pleased to announce the release of docx4j v2.3.0.

docx4j is an open source (Apache license) project which facilitates the manipulation of Microsoft OpenXML docx (and now pptx) documents in Java, using JAXB.

The main features of this release are support for pptx files, and improvements to both HTML export (via NG2) and PDF export (via XSL FO).

For further details, please see the release announcement.

Feb 09 2010

Importing Word documents into Google Wave

Plutext has released a robot for Google Wave which you can use to convert Microsoft Office Word documents into Wave content.

The robot is at docxwave@appspot.com

This is especially useful if your Word document contains tables or images, because copy/pasting from Word leaves them out. Adding the document as an attachment in Wave wouldn’t be the answer either, because that doesn’t bring the power of Wave to bear on the doc at all.

This wave was the announcement and is for support (Wave account required):

[Embedded wave: googlewave.com!w+Kb5sDrZkA]

Sep 16 2009

Microsoft Word and the (i4i) patent madness

By now you’ve probably heard how hitherto largely unknown i4i teamed up with some bottom-feeding lowlife and successfully sued Microsoft for patent infringement to the tune of USD 240 million.

If you read the patent in question, US patent number 5787449, you’ll see that the so-called invention was entirely obvious, consisting only of annotations pointing to positions in a character stream (as distinct from embedded within it).

Microsoft suggested some prior art, but nothing which knocked out the patent.

What i4i chose to call a ‘metacode’ sounds a lot like the markers described in 1986 in Data Structures in the Andrew Text Editor:

A marker is a data structure that refers to a portion of the text of a document; the portion starting at some character and extending for some length.
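To make the idea concrete, here is a minimal sketch of out-of-line markers in Java. It is only an illustration of the general technique described in the quote, not the Andrew editor's code and not what the i4i patent claims; the names Marker, AnnotatedText, mark and spansLabelled, and the label field, are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** A marker as in the quote: a start position plus a length (the label is an assumption). */
final class Marker {
    final int start;    // index of the first character covered
    final int length;   // number of characters covered
    final String label; // hypothetical annotation name

    Marker(int start, int length, String label) {
        if (start < 0 || length < 0) {
            throw new IllegalArgumentException("start and length must be non-negative");
        }
        this.start = start;
        this.length = length;
        this.label = label;
    }

    /** Returns the portion of the text this marker refers to. */
    String resolve(String text) {
        return text.substring(start, start + length);
    }
}

/** Plain text plus a separate list of markers; nothing is embedded in the text itself. */
final class AnnotatedText {
    private final String text;
    private final List<Marker> markers = new ArrayList<>();

    AnnotatedText(String text) {
        this.text = text;
    }

    void mark(int start, int length, String label) {
        if (start + length > text.length()) {
            throw new IllegalArgumentException("marker runs past the end of the text");
        }
        markers.add(new Marker(start, length, label));
    }

    String text() {
        return text;
    }

    /** Collects the text covered by every marker carrying the given label. */
    List<String> spansLabelled(String label) {
        List<String> spans = new ArrayList<>();
        for (Marker m : markers) {
            if (m.label.equals(label)) {
                spans.add(m.resolve(text));
            }
        }
        return spans;
    }
}

class StandoffDemo {
    public static void main(String[] args) {
        AnnotatedText doc = new AnnotatedText("The quick brown fox");
        doc.mark(4, 11, "phrase");      // covers "quick brown"
        doc.mark(10, 9, "noun-phrase"); // covers "brown fox", overlapping the first marker
        System.out.println(doc.spansLabelled("phrase"));
        System.out.println(doc.spansLabelled("noun-phrase"));
        System.out.println(doc.text()); // the character stream is unchanged
    }
}
```

Note that the two markers overlap without nesting, which embedded tags cannot express directly; that is why this approach keeps turning up in discussions of overlapping markup.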

The people involved in the Text Encoding Initiative guidelines leading up to publication of their proposal 3 in 1994 (which is before the patent was filed) probably discussed and published relevant stuff as well – did anyone ask them? [Edit 18/9/09] Hmmm, Markup Reconsidered (presented in 1992) says:

A word should be said about so-called out-of-line markup, non-embedded structure that conforms to the syntactic requirements of a given markup standard

and references a 1992 document: David Barnard, Lou Burnard, Jean-Pierre Gaspart, Lynne Price, C.M. Sperberg-McQueen, and Nino Varile, “Notes on SGML Solutions to Markup Problems”, TEI MLW18, which I haven’t looked at. And “Refining our Notion of What Text Really Is: The Problem of Overlapping Hierarchies” contains a wealth of historical references which probably contain relevant material.

[Added 18/9/09] Most interestingly, Ted Nelson states in “Embedded Markup Considered Harmful” that Xanadu has used parallel markup since the 1960s! Unfortunately, he didn’t publish much about it (which limits its value as prior art), though Rick points out that Nelson disclosed some stuff in his 1992 book.

So the i4i patent should have been knocked out:

  • because it was obvious, and
  • because the USPTO and Microsoft should have found prior art.

In any case, the patent system should be reformed so that, if you must have a patent system at all, and one in which software and algorithms are effectively patentable, such patents last for no more than, say, 4 years from the priority date. Their patent would have expired in 1998 (the same year it was granted!).

In this case, what i4i patented has been independently thought of (ie invented) as part of one approach to the related problem of overlapping markup. Independent invention should also be a defense (although that might not have helped Microsoft in this case).

A further thought: what if patentees were liable every time they described something as new if it could be shown that it wasn’t? That might give them pause for thought.

Mar 03 2009

How to try Plutext for yourself

Here is a screencast which walks you through sharing your own document, and trying our collaboration features:

[Embedded screencast; requires Flash Player]

Of course, you can just play with one of the pre-existing shared documents.

The video width is 1280 pixels, so if you are browsing in a narrow window, you’ll need to expand your browser window to see it properly. (Everybody has screens that wide these days, don’t they, unless they are on mobile?)

For completeness:

Mar 02 2009

Plutext collaboration for Word: new features

We’ve just published a new build of the Word Add-In which, among other things, supports replication of images and comments between users.

For a good while now, with Plutext you’ve been able to be in a Word document at the same time as your co-workers – provided all you were doing was working on tables and paragraphs (editing them, inserting, deleting or moving them around).

With this latest release, you can add images and Word comments, and have them replicate properly between Word 2007 users.

Here is a screencast of this in action:

[Embedded screencast; requires Flash Player]

If you want to play with this yourself, you can download our Word Add-In and give it a shot!

For username & password, please see here. The password is “tester”.

For detailed instructions, see this PDF, or this earlier screencast.

If you’d like to chat about your own Plutext installation, please contact us using this form.

Dec 09 2008

Unifying the web browser and the desktop?

You can launch docx4all as a desktop application, or as an applet in your web browser (requires Java 6).

If you choose to do the latter, you can (provided you are running the new Java 6 Update 10) drag the applet to your desktop, where it will keep running even if you close the web browser.

See this video:

[Embedded screencast; requires Flash Player]

Is this a gimmick, or is it truly useful?

It is useful if you want to close your browser, but not docx4all.

It can also be thought of as a way to preview before you install (dragging it to the desktop installs the desktop shortcut / Start menu option).

It would be nice if, having dragged the applet outside of the browser, you could resize it (as you can with a normal desktop application) – but you can’t (at least without some extra coding on our part).

So, although it’s cool, it is not really a major feature.

Nov 16 2008

Collaborate on a Word doc with docx4all

docx4all has now reached the point where you can collaborate happily with a Word user, both working on the document at the same time.

This screencast shows a docx4all user and a Word user doing that:

[Embedded screencast; requires Flash Player]

docx4all will work on any platform that has Java 6 installed, including Windows, OS X, and Linux.

You can try collaborating now, in your web browser, by clicking here (warning: ~10 MB). The download is of course one-time; next time, it will start more quickly.

That link takes you to the docx4all applet, which does collaboration in your web browser.

You can also run docx4all as a desktop application – the functionality is identical.

The nice thing about the docx4all experience is that with just one click you can be collaborating. Ok, a couple of clicks – one to start docx4all, and another to do File > Open.

Because all changes are versioned, from the Plutext menu you can see:

  • a history of all the changes which have been made to a given content control
  • a version of the document showing the most recent change to each paragraph

Nov 11 2008

docx4j v2.1.0 released

We’re pleased to announce that we’ve released v2.1.0 of docx4j.  Get it from our downloads page.

docx4j is an open source Java library for manipulating OpenXML WordprocessingML documents, released under the Apache software licence. docx is the default file format in Word 2007 in Microsoft Office 2007, and part of an ISO standard (more or less unchanged).

v2.1.0 is mainly a maintenance release.

Attention has been paid to ease of use of hyperlinks, images, and headers/footers.

The HTML output has been redone to use the XSLT from the OpenXMLViewer project; it can be configured to save images as files, and automatic list numbers are handled.

This release should also work under Java 1.5, now that I have re-built fop-fonts. I had contributed TTC (TrueType Collection) handling code to FOP, and it was accepted, so fop-fonts now uses that (ie the patch which makes fop-fonts is that much smaller).

Oct 28 2008

Microsoft’s “Office Web” announcement.

Well, the announcement happened, and it’s vaporware.

Microsoft’s announcement is that you will be able to “create, edit and collaborate” on Office documents using your web browser (IE, Firefox, or Safari), but not until Office 14.

Office 14 is expected late 2009 or 2010. So if you wait for Microsoft to deliver Office 14 – and your IT department to roll it out – before you start collaborating in Word, count on waiting until 2011. They didn’t tell you you can get started now, using Plutext and Word 2007 :-)

That’s the only real surprise.

There were no surprises re:

  • Technology – Office Web uses Silverlight (or AJAX)
  • Delivery model – you need Sharepoint or Office Live Workspace to host the service
  • Pricing – it is available as a hosted subscription service or through existing volume licensing agreements

It is interesting to see that their collaboration stuff seems to work on a synch-every-few-seconds model (like Google Docs) in OneNote, but in Word the user has to explicitly synch. I’ll blog in another post why this is the correct design decision.

What happens if you go offline? This probably depends on underlying support for offline in Silverlight.

Oct 26 2008

Microsoft’s collaboration stuff any day now?

It’s Monday morning on October 27th as I write here in Australia.

Steve Ballmer gave hints in two separate reports at the beginning of the month that they’ll be announcing their in-Office collaboration stuff this week.

The first report was in www.cio.co.uk

Ballmer: So we are embracing Software + Services, Cloud Computing as hard as anybody. By the time we finish our Professional Developers Conference this month, I think you’ll have to say that there is nobody out there with as wide a range of Cloud Computing services as Microsoft, including, dare I say it, Google …

CIO: Steve, I guess the $64,000 question from a lot of people’s point of view is, is there going to be an Office for the Web, something that really competes head on with Google Docs, Google Apps?

Ballmer: .. I think what people want is something as rich as Microsoft Office, something that you can ‘click and run’, if you are not at your own desk. Something that is compatible, document-wise with Microsoft Office and something that offers the kind of joint editing capabilities that is nice in Google Docs and Spreadsheets. Will Microsoft Office offer that? Yes! Standby for details in the next month.

CIO: So, in the backend of Microsoft R&D, are there people beavering away at versions of Word, PowerPoint, Excel, etc, that are purely web based? Or, is it always going to be this hybrid?

Ballmer: What does it mean to be purely Web based? Do we want them to be as only as powerful as ‘runs in a browser’? No. We want software that is more powerful than runs in a browser. Does that mean we will not have some neat stuff that does run in the browser? No.

We think you’ll actually want the full power of Word, Excel and PowerPoint – and you’ll want to be able to get that simply. But, if you just happen to be in an Internet cafe kiosk and you want to do some light editing, perhaps we need to have a way to support you in that as well, inside the browser. ..

In another, in response to a question about Office Live, he said:

“Office Live has a few things left it needs to do. Number one, and probably most important, is to make sure that people using Office have greater ability to collaborate with one another. We have some of that today with [Office Live] Workspaces, as well as that we’ve got SharePoint; we can do more and some of those things will be better than the other alternatives.

Number two, is when we do Office Live, it has to be true to Office; you’ll need to be able to have full Office documents and programs and share them.

Number three, we have to make it so that – most people use Office most of the time from a single machine. But if you’re away from your desk, at a cafe, a kiosk or your school library, and you don’t have Office, you’ll want to be able to do something quickly; we have to make sure you can get it easily, stream it down, put it in a browser, something like that there… details coming in a few weeks.”

I’m not going to write here what I think they are likely to announce. It is more sensible to wait a little longer. It will be interesting, though, to see what is available immediately, and how much is just vaporware.