Oct 20 2008

Plutext walkthrough

PlutextWalkthrough (PDF) is a step by step guide to collaborating on a Word document using Plutext.

It contains more or less the same information as my last blog post, but in a format which allows you to avoid the videos.

Oct 01 2008

Collaboration in Word – ready for alpha testing

Plutext enables everyone on your team to make changes in Word, at the same time (ie it lets you collaborate just as you can in Google Docs, but in your familiar Word environment, with formatting, change tracking etc).

Here is a short screencast of the gist of it:

[Embedded screencast]

If you are working on legal documents, government reports, or other formal deliverables you’ll probably want to make the process more structured.  Here is an excerpt from an old screencast showing our features for lawyers and others requiring accountability:

[Embedded screencast]

If you want to give it a try, the easiest way is to download our Word 2007 add-in, then fire up Word and log in to the “public0902” group with these “tester” settings (on Word’s Review ribbon, click our “File” button, then Settings), using password “tester”:


[Screenshot: the “tester” settings]

then open an existing document (from the Plutext “File” button on the Review ribbon).

You can get a colleague to work with you on a document. Or you can simulate collaboration simply by opening the document twice on your PC (which is what I’ve done in the screencasts above).

Right now, you need Word 2007.  Next week, we’ll release an updated build of our cross-platform client which you can try.

This video shows you how to add your own document to the public space (or your private space):

[Embedded screencast]

But be careful: anyone else can see the documents if you just use the “public” group.

If you’d like a little privacy, you can set up a space of your own on our test server.

We’d love to know what you think, either in the comments, or our forums, or privately (jason@plutext.org).

Please report problems with the Word add-in here, and server problems here.  Thanks.

Naturally, there are a few limitations in this alpha, including:

  • the Audit function doesn’t like bookmarks
  • adding an image won’t work

Finally, if you want to uninstall the Word add-in, you can do this from Windows’ Add/Remove Programs in the usual way.

Jul 22 2008

docx4j v2.0 released

We’re pleased to announce that we’ve released v2.0 of docx4j.

docx4j is an open source Java library for manipulating OpenXML WordprocessingML documents, released under the Apache software licence. docx is the default file format of Word 2007, part of Microsoft Office 2007.

docx4j supports the following:

  • Open existing docx (from filesystem, SMB/CIFS, WebDAV using VFS)
  • Create new docx (just one line of code)
  • Programmatically manipulate the docx document (of course), including tables, images
  • Import a binary doc (proof of concept)
  • Import/export Word 2007’s xmlPackage (pkg) format
  • Save docx to filesystem as a docx (ie zipped), or to JCR (unzipped)
  • Apply transforms, including common filters
  • Export as HTML or PDF
  • Diff/compare paragraphs or sdt (content controls), outputting OpenXML with changes marked up
  • Font support (font substitution, and use of any fonts embedded in the document)
  • Use the power of JAXB to do other cool stuff

Get it from here.

What is it about this release that warrants being labeled v2.0?

The new features include image support, diff, and xmlPackage.  Another factor is the version numbering convention Microsoft has chosen for its Open XML SDK: it is v2.0 of that SDK which will be the first to contain an API for WordprocessingML.

So think of a “level 1” API as one which handles the Open Packaging conventions (basically, the unzipping step), but leaves you to handle the document (part) content using low level XML (DOM, SAX, etc).
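To make the “level 1” idea concrete, here is a minimal sketch using only the JDK (no docx4j). It assumes the main document part lives at word/document.xml, which is the conventional location; a real package records that location in its relationships part. The class and method names are made up for illustration, and the sample package contains just that one part.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical illustration of a "level 1" workflow: unzip, then raw DOM.
class Level1Sketch {
    static final String W_NS =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Builds a tiny zip holding only word/document.xml (a real docx has more parts).
    static byte[] buildSamplePackage() {
        String xml = "<w:document xmlns:w=\"" + W_NS + "\"><w:body>"
            + "<w:p><w:r><w:t>Hello</w:t></w:r></w:p>"
            + "<w:p><w:r><w:t>World</w:t></w:r></w:p>"
            + "</w:body></w:document>";
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
                zip.putNextEntry(new ZipEntry("word/document.xml"));
                zip.write(xml.getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
            return bytes.toByteArray();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // The "unzipping step": locate the main part and hand back a plain DOM tree.
    static Document readMainPart(byte[] pkg) {
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(pkg))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.getName().equals("word/document.xml")) {
                    byte[] part = zip.readAllBytes(); // reads to the end of this entry
                    DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
                    f.setNamespaceAware(true);
                    return f.newDocumentBuilder().parse(new ByteArrayInputStream(part));
                }
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        throw new IllegalStateException("word/document.xml not found");
    }

    public static void main(String[] args) {
        Document doc = readMainPart(buildSamplePackage());
        // From here on, everything is low level XML handling.
        int paragraphs = doc.getElementsByTagNameNS(W_NS, "p").getLength();
        System.out.println("paragraphs: " + paragraphs);
    }
}
```

Everything after readMainPart is up to you: the API knows nothing about paragraphs or styles, only about elements and namespaces.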

A “level 2” API is one which gives you a higher level API to manipulate the part content.  At the very least, this would include objects to represent paragraphs, tables, styles etc.  But you’d also expect it to be easy, for example, to add a paragraph using a specified style (maybe this is “level 3”?  In any case, docx4j can do it).
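For a sense of what a higher level API saves you, here is a rough sketch of adding a styled paragraph by hand with plain DOM. This is not docx4j’s API (docx4j works with JAXB-generated classes); the helper name addStyledParagraph and the style id “Heading1” are assumptions chosen for the example. It simply builds the w:p / w:pPr / w:pStyle / w:r / w:t structure that WordprocessingML expects.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical illustration: the markup a "level 2" call would generate for you.
class StyledParagraphSketch {
    static final String W_NS =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Creates a bare w:document containing an empty w:body.
    static Document newEmptyDocument() {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true);
            Document doc = f.newDocumentBuilder().newDocument();
            Element root = doc.createElementNS(W_NS, "w:document");
            root.appendChild(doc.createElementNS(W_NS, "w:body"));
            doc.appendChild(root);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Appends <w:p><w:pPr><w:pStyle w:val="..."/></w:pPr><w:r><w:t>...</w:t></w:r></w:p>.
    static Element addStyledParagraph(Document doc, String styleId, String text) {
        Element body = (Element) doc.getElementsByTagNameNS(W_NS, "body").item(0);

        Element pStyle = doc.createElementNS(W_NS, "w:pStyle");
        pStyle.setAttributeNS(W_NS, "w:val", styleId);
        Element pPr = doc.createElementNS(W_NS, "w:pPr");
        pPr.appendChild(pStyle);

        Element t = doc.createElementNS(W_NS, "w:t");
        t.setTextContent(text);
        Element r = doc.createElementNS(W_NS, "w:r");
        r.appendChild(t);

        Element p = doc.createElementNS(W_NS, "w:p");
        p.appendChild(pPr); // paragraph properties come before the runs
        p.appendChild(r);
        body.appendChild(p);
        return p;
    }

    public static void main(String[] args) {
        Document doc = newEmptyDocument();
        Element p = addStyledParagraph(doc, "Heading1", "Introduction");
        System.out.println(p.getTextContent());
    }
}
```

With a “level 2” API, all of addStyledParagraph collapses into a single call on a document object, and you never touch element names or namespaces.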

Given that docx4j brought a “level 2” WordML API to the Java world 6 months ago, it is appropriate that it be labelled version 2.0.

Jul 14 2008

“Document locked” – never again!

Last Thursday I demo’d our Plutext collaboration system to an audience of lawyers and legal technologists and some old friends at the Victorian Society for Computers & the Law’s Legal Technology Conference 2008.

The accompanying presentation is here (pdf).

Our approach to collaboration means you will never be told your document is locked or checked out by someone else.

This in itself is a great step forward for many long-suffering users of traditional document management systems.

I’m collecting screenshots of locked / checked-out messages from different document management systems.  So next time this happens to you, please email it to me.  I’m jason, that’s at plutext.org.  Thanks.

May 03 2008

Click to try docx4all v0.2

Jo and I are pleased to have just uploaded a new version of docx4all for you to try.

We’ve added quite a few features since I last blogged about docx4all (21 Feb).

New features include:

The VFS file chooser allows docx4all to open documents not just from the local file system, but also from a WebDAV server (such as Alfresco), and potentially, CIFS etc.  To do this, docx4all uses VFSJFileChooser, and webdavclient4j (a project we’ve started to address the gap left when Apache retired Slide, including its WebDAV client).

The incoming document filter is used to convert certain features of WordprocessingML which docx4all can’t yet handle, into something it can.   Examples include proofErr, hyperlink, and lastRenderedPageBreak.  This behaviour relies on a feature of docx4j, which makes it easy to apply a transform to a docx package (by converting it to pkg:package format).
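The general shape of such a filter can be sketched with a small XSLT identity transform run from the JDK. This is not docx4all’s actual filter, and it works on the XML of a single part rather than on a pkg:package document. What it does with each element (dropping proofErr and lastRenderedPageBreak, and unwrapping hyperlink so its runs survive) is an assumption made for the example, as are the class and method names.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical illustration of an "incoming filter" expressed as an XSLT transform.
class IncomingFilterSketch {
    static final String W_NS =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    static final String XSLT =
        "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\""
        + " xmlns:w=\"" + W_NS + "\">"
        // Identity rule: copy everything that no other rule claims.
        + "<xsl:template match=\"@*|node()\">"
        + "<xsl:copy><xsl:apply-templates select=\"@*|node()\"/></xsl:copy>"
        + "</xsl:template>"
        // Drop elements the editor cannot handle.
        + "<xsl:template match=\"w:proofErr|w:lastRenderedPageBreak\"/>"
        // Unwrap hyperlinks: keep the runs inside, discard the wrapper.
        + "<xsl:template match=\"w:hyperlink\">"
        + "<xsl:apply-templates select=\"node()\"/>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    static String filter(String partXml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(partXml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String sample = "<w:p xmlns:w=\"" + W_NS + "\">"
            + "<w:proofErr w:type=\"spellStart\"/><w:r><w:t>teh</w:t></w:r>"
            + "<w:proofErr w:type=\"spellEnd\"/>"
            + "<w:hyperlink><w:r><w:lastRenderedPageBreak/><w:t>link text</w:t></w:r></w:hyperlink>"
            + "</w:p>";
        System.out.println(filter(sample));
    }
}
```

The identity rule has a lower default priority than the element-specific rules, so anything not explicitly filtered passes through unchanged.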

Docx4all can’t yet render tables (let alone edit them), but we’re working on changing that.

Apr 30 2008

modified Office Open XML schema now in Subversion

We’ve been tweaking the schemas – especially wml.xsd – to make the Java classes generated by JAXB’s xjc more user-friendly.

I’m satisfied that this is permitted by ECMA, so I’ve put the modified schemas into Subversion.

For anyone interested in the reasoning, the Ecma website says:

“Ecma Standards and Technical Reports are made available to all interested persons or organizations, free of charge and copyright, in printed form and, as files in Acrobat (R) PDF format.”

For this to apply, the document needs to be an “Ecma Standard or Technical Report”.

That page says “A Standard or a Technical Report is a formal document prepared by an Ecma Technical Committee and approved by the Ecma General Assembly.”

Office Open XML was so approved.

So the only possible glitch would be words to the effect that the schemas aren’t part of the official standard.

I’ve checked the language in parts 2 and 4 (of the Ecma TC45 Final Draft) which says “This Office Open XML specification includes a family of schemas … The normative definition of these schemas reside in an accompanying file named … which is distributed in electronic form only.”

Which makes it clear the schemas are part of the Standard :)

So the ECMA standard’s XSDs are “free of copyright” – an explicit waiver of copyright. So no problemo in creating derivative works.

Apr 10 2008

docx4j now released under Apache License

We’re pleased to announce that docx4j is now available under the Apache License (v2).

This is a response to feedback on an earlier post.  This is also the last license change we’ll be making to docx4j. Word documents are mostly manipulated in corporate environments.  This change removes barriers to adoption of docx4j by business and institutions.

docx4j uses org.merlin.io to efficiently turn streams inside out. That package had been available under the GPL.  Its author, Merlin Hughes, today kindly released it under v2 of the Apache License, so we now use it under that license.

There’s a new nightly build of docx4j available from the downloads page if you want to grab it.  This build can load/save to/from a WebDAV server – more on that in another post.

Mar 18 2008

Sun’s bug votes on steroids

I like programming in Java.  It is still a great way to write cross-platform code.  I’ve bet my business on it.

But sometimes, Sun is just too slow to fix bugs (or make the fixes available). And this is still their role, even when a user has a fix to contribute.

Take the following 2 which have bitten me this week:

  1. Preferences broken if you use org.apache.xalan.processor.TransformerFactoryImpl
  2. Printing on Ubuntu 7.10

Fixes haven’t become available for either of these yet on Java 6 (though the first has been closed here and here).

Sun really needs to invest more in Java, to get all the outstanding bugs fixed, and the fixes out quickly.  (Yes, people who write and use open source expect fixes more quickly than most vendors can deliver them.  Life is much quicker in the open source world)

But as we know, Sun doesn’t make much money from it – directly at least.  And even though Sun is quite clear in their strategy to use Java to drive sales of their hardware, this lack of revenue shows – shows up as a lack of support.

So what about a logo that people using Java can put on their websites, which communicates “I bought some Sun hardware to support Sun’s investment in Java” to other people who use Java?  This may make them consider buying some Sun gear as well, and proliferation of the logo would remind Sun that Java really is what butters their bread.

Maybe that needs to be “to support Sun’s investment in Java on Linux” (or even on Linux x86_64), since it’s not Windows that these bugs occur on.

Or how about a way for Sun to earn credits towards a Sun hardware purchase: “If Sun fixes this bug, it will earn them a notional half a purchase”.  Fix this one as well, and I’ll buy something.  A great little site for someone to write.

Yes, I know you can vote for a bug (the printing bug has 45 votes since 26 November 2007 – that’s a lot of votes, comparatively speaking – but still there is no indication of when a fix will be available).

But Sun is wildly optimistic in only giving people three votes, no matter how many bugs are causing them grief.

I’ve bought 3 servers, 2 workstations, and a laptop in the last 6 months or so, and none of these are from Sun.  But I would change my purchasing policies for some tangible indication that doing so would result in quicker bug fixes.  So my third idea in this little brainstorm: what about allocating special higher priority bug votes when people buy Sun gear?

Mar 12 2008

Office Online – not yet after all

Well, there were a few interesting announcements from Microsoft last week, but they didn’t include OaaS (Office as a Service), nor improved collaboration.

The three announcements:

  1. Office Live Workspace Beta is publicly available
  2. Sharepoint Online has been available to businesses with over 5000 employees; now it is available in beta to businesses with under 5000 (provided you are based in the US)
  3. Silverlight 2 Beta

Office Live Workspace doesn’t have real collaboration, yet. As ReadWriteWeb puts it:

Although Office Live Workspace allows for collaboration, it’s not real-time, online collaboration. Instead, if one user is editing a file, another will be informed the file is “checked out.” When they finish editing and save their changes the document is checked back in for other users to access.

The situation is similar in Sharepoint. As Bill Gates put it:

I have a Word document that if I open it up, you can see that I’ve been force [sic] versioning, check in/check out on my documents, so I could check out the document, make a change, and then come down and save those changes

Mr Gates explained that between these 2 products Microsoft intends to cover the whole market:

We want to scale [Sharepoint] all the way down, so that literally you don’t have to have an IT capability, and that’s where we get into what we’ve branded Live. So we’re working that one up through small customers. We want to work [Sharepoint] down and make sure there’s no gap in-between.

When Microsoft eventually gets around to offering real collaboration, there is no reason for either of those 2 products to do it differently (unless they wish to upsell people to Sharepoint).  So it’s more a question of which one gets real collaboration first; Sharepoint customers are probably more deserving, but Office Live Workspace customers might make good guinea pigs.

“Microsoft Office Live Workspace is being offered free of charge. .. The company expects to release the final public version of Office Live Workspace later in the year.”

That’s not to say that real collaboration will necessarily be free, though it might be.

For hosted Sharepoint (Microsoft Online Services), the licensing model is:

New customers and customers without Microsoft Software Assurance can purchase Microsoft Online Services as a per-user subscription. Existing customers with Software Assurance on their Microsoft Client Access Licenses can purchase a user subscription at a discount, enabling them to maximize their existing Microsoft software investments. Customers with a subscription have rights to both Microsoft Online Services and to access on-premises server software, giving them the ability to blend Web-based services with on-premises software.

So when will the collaboration offering happen?

Venture Beat says that “with Microsoft still raking in so much money from traditional software, [full-on war with Google Apps is] still at least a couple years away”.  Mary Jo tells us Microsoft will fill in the blanks around its Live services strategy at its Professional Developers Conference in October.

Which brings me to the Silverlight 2 beta.  I’m inclined to think Microsoft will offer real collaboration as soon as they’ve got a suitable client (ie not before Silverlight 2 has been through its beta cycle).  The TextGlow docx viewer sets high expectations as to how this might perform.

Mar 03 2008

Microsoft Office Online .. soon?

Nick Carr has sparked speculation that Microsoft will soon unveil its strategy for bringing its Office suite online – which to me means a way of working with Office documents on any computer which has an internet connection.  If you are connected, I’d expect you to be able to collaborate with others in real time; if you are not connected, I’d expect the software to work in offline mode.

When I say “any computer”, I don’t mean to restrict that to any particular operating system (and indeed, Silverlight runs on the Mac, and Microsoft has announced it is working with Novell on a Linux implementation).  What good is collaboration software if some of the people you need to collaborate with can’t play?

I thought I’d make some predictions about the business model.

There seem to be 2 key questions:

  • does each end user pay, or does a collaboration originator pay for the right to invite a certain number of collaborators?
  • what support will there be for Mac and Linux users, and when?

Whether each individual user is required to pay, or the originator pays, will reveal much about how Microsoft regards its online offering.  The latter model, that the person who originates a collaboration session pays for a certain number of people to be able to collaborate (ie whatever their platform), would show that their focus is firmly on collaboration.  This is the model we would use for any plutext SAAS offering (available to people who don’t want to install plutext server internally, for free or a fee). 

Here are my predictions:

  1. Enterprise version (ie behind the firewall).  There will be a version an enterprise can install on its Sharepoint server, for those businesses which are not comfortable with their documents being hosted externally.  I’m sure Microsoft can work out how to let people give access to people outside the firewall as necessary.  An enterprise licensee will be able to invite people outside the enterprise without charge.
  2. Cloud version. I expect there will be a cloud version for SMBs.  I think you will be able to use this for free, provided you have a license for the traditional Office product.  You will definitely need this (2007 version) to originate collaboration around a document (ie invite other users) – unless you are prepared to pay a full price for the online offering.  Maybe anyone will be able to accept a collaboration invitation (ie whether or not they are licensed to use Office), making the “who pays” question moot.  To create a new document (or print it?), I expect you will need to have a licence for the traditional Office product, or pay for the SAAS offering.
  3. Mac and Linux support.  I think Microsoft will offer Mac support sooner or later, but delay any hint of support for Linux for as long as possible.  This is because Linux is much more of a threat than OSX (two reasons: (1) Linux is free, and (2) it is very easy to install it on your existing Windows PC).  That said, they might have it “only on Windows” to try to keep people there – until some critical tipping point is reached.  I would say that even now, the only things stopping Microsoft from seeking revenues from Linux users are the inevitable press headlines along the lines of “Microsoft admits defeat” that would come with this.  The cost of this in terms of perception would surely outweigh any incremental revenues in the short term.  Mac users may be able to use it for free – provided they have an Office license they are able to associate with their online user ID.
  4. docx only. The documents which come out of this online service will be docx documents, not binary or RTF.  This will help to make the new format ubiquitous.

I wonder whether the collaboration protocols will be published under the recent interoperability initiative?  If they are, the way would be open for a rich world, in which docx4all could potentially play…  I’d be pleasantly surprised if they were, and there was nothing stopping someone from making a client or server of their own.  If anyone else could create a server, then why not get rid of it altogether and go peer-to-peer?  Maybe, just maybe, the thinking is that it would take forever for someone other than Microsoft to create a fully featured server, so third party implementations are to be encouraged (as is presently the case for OpenXML), since Microsoft’s offering will always be the Rolls-Royce implementation which attracts the most usage, with the other implementations adding value to the ecosystem.

The announcement, if/when it comes, will be fascinating!