OOXML strikes back… continued

Friday, June 26th, 2009

A year or so after the dispute between ODF and OOXML seemed to be settled, the situation is getting more and more confused.

OOXML was fast-tracked as an ISO standard in April 2008 through a questionable procedure, but a few months later, almost a year ago now, Microsoft admitted that “ODF had clearly won” the document format standard war. However, Microsoft then increased its participation in ODF standards bodies, to the point that some open source advocates feared it might take control of the ODF standards.

After a number of other episodes, the strategy is paying off: Microsoft’s OOXML standard is back on the list of recommended formats for government organizations, including France’s RGI (Référentiel général d’interopérabilité). Open source advocates argue that a second document format is useless since ODF has been around for a few years now, and that even Microsoft’s own products do not support the ISO version of OOXML yet (support in Microsoft Office is slated for next year). Microsoft counters that OOXML is not theirs anymore, since anyone can implement OOXML compliance (after reading the 6,000-page spec, of course).

And we, the document users, are stuck in the middle, worried that we might not get a single, universal open document format after all, which was the whole point of that ODF-OOXML decision…

Could Future Document Formats prevent the next financial crisis?

Friday, June 19th, 2009

This interesting article points out that XML-structured document formats such as XBRL (eXtensible Business Reporting Language) could ensure much tighter reporting and control over financial institutions and companies, and maybe help avoid the next financial crisis.

XBRL started in the late 1990s and defines an XML schema for the exchange of financial information between companies, accountants and the SEC, including “semantic” information, which can be extracted very easily.

Starting this year, larger companies will have to submit their reports in this document format, which can be programmatically analyzed and validated. This will be a dramatic change from the current submissions in HTML, PDF, ASCII or anything else, which SEC analysts have had to parse and analyze manually. With XBRL, they will be able to get back to filers much more quickly in case of an error. Plus, since this is public information, it will be available to anyone else for analysis and other uses.
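To give a rough idea of what “programmatically analyzed and validated” can mean, here is a minimal Python sketch. It is only an illustration: the sample document is a heavily simplified, XBRL-like fragment, and the `ex` namespace, the element names (`Assets`, `Liabilities`, `Equity`) and the helper functions are all hypothetical. Real filings rely on official taxonomies, contexts and units that are left out here.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified XBRL-like instance (not a real taxonomy).
SAMPLE = """<xbrl xmlns:ex="http://example.com/taxonomy">
  <ex:Assets unitRef="USD">1500</ex:Assets>
  <ex:Liabilities unitRef="USD">900</ex:Liabilities>
  <ex:Equity unitRef="USD">600</ex:Equity>
</xbrl>"""

# Prefix-to-URI mapping used when searching for namespaced elements.
NS = {"ex": "http://example.com/taxonomy"}


def read_fact(root, name):
    """Return the integer value of the fact called `name`."""
    element = root.find(f"ex:{name}", NS)
    if element is None:
        raise ValueError(f"missing fact: {name}")
    return int(element.text)


def check_balance(xml_text):
    """Check the basic accounting identity: assets = liabilities + equity."""
    root = ET.fromstring(xml_text)
    assets = read_fact(root, "Assets")
    liabilities = read_fact(root, "Liabilities")
    equity = read_fact(root, "Equity")
    return assets == liabilities + equity


if __name__ == "__main__":
    print("balanced" if check_balance(SAMPLE) else "inconsistent filing")
```

A check like this takes a few lines once the numbers are tagged, whereas the same figures buried in an HTML or PDF report would first have to be located and transcribed by hand.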

“Semantic” XML-based vertical document formats will be the next wave for document formats. HL7 and other formats in the health care domain will help dramatically increase throughput and reduce errors in health management. There are quite a few such formats out there already, and the future of documents will be made of many of these vertical schemas, which will play a major role in improving document-intensive business processes.

The Future of Documents: XML

Thursday, February 19th, 2009

If you’ve read my blog in the past or attended one of my presentations, you may remember that I strongly believe in XML as the future of documents, bringing structure, interoperability, and openness.

This is essential to allow the dissemination of documents and their content. Since late 2007, we have one open XML-based format to represent the layout of documents: Open Document Format (ODF). But ODF only represents layout and “logical” information. The next frontier is the markup of the document “content”, or semantic information, in those documents for specific verticals. A number of formats are appearing, including XBRL (Extensible Business Reporting Language) and Health Level 7 (HL7).

On that topic, this great blog post by Kurt Cagle on the O’Reilly Community uses the SEC announcement that companies over $5 billion in assets would be required to start reporting their earnings using XBRL to document the need for such XML-based standards. XBRL, like other similar formats, turns your documents from plain, unstructured containers of information into highly structured, queryable containers of data, which greatly facilitates the extraction of their content.
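As a small illustration of what “queryable” means in practice, the Python sketch below pulls one kind of fact out of a filing and groups it by reporting period. The sample filing, the `ex` namespace, the fact names and the `facts_by_period` helper are hypothetical and far simpler than a real XBRL instance. Only the `contextRef` attribute follows the actual XBRL convention of tying a fact to a reporting context.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified filing with two reporting periods.
FILING = """<xbrl xmlns:ex="http://example.com/taxonomy">
  <ex:Revenue contextRef="FY2007">1200</ex:Revenue>
  <ex:Revenue contextRef="FY2008">950</ex:Revenue>
  <ex:NetIncome contextRef="FY2007">180</ex:NetIncome>
  <ex:NetIncome contextRef="FY2008">-40</ex:NetIncome>
</xbrl>"""

NS = {"ex": "http://example.com/taxonomy"}


def facts_by_period(xml_text, name):
    """Map each reporting context to the value of the fact called `name`."""
    root = ET.fromstring(xml_text)
    return {
        element.get("contextRef"): int(element.text)
        for element in root.findall(f"ex:{name}", NS)
    }


if __name__ == "__main__":
    for period, value in facts_by_period(FILING, "NetIncome").items():
        print(period, value)
```

The same question asked of an unstructured report would mean reading the document. Here it is a lookup by element name.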

Although the author notes that achieving transparency in the financial domain will be harder than just imposing XBRL as a standard, he states:

“it is very likely that 2009 will be a banner year for XML technologies in general, as two of the key issues that are highly visible this year – financial transparency within corporations and the streamlining of health care, both involve rich XML standards – XBRL for financial reporting, HL7 v3 for electronic health records.”

I couldn’t agree more - let’s hope XBRL and HL7 pave the way to a wide spectrum of XML-based formats for the semantic representation of the data which is required for any business process.