Semantic Technologies
New ODF 1.2 format
Submitted by Francois Ragnet, October 14th, 2011
The latest Open Document Format, ODF 1.2, was finally approved by the OASIS committee. It is a major milestone, as the previous version (1.1) dates back to 2006. ODF is also a standard under the ISO committee, but in its even older 1.0 instantiation.
The new version adds a number of major features, such as OpenFormula (for spreadsheet calculations) but also support for semantic web standards such as RDF (Resource Description Framework) and digital signatures.
LibreOffice and OpenOffice, as well as office applications from Zoho, IBM, Google Docs, and many others, are already up to date and can benefit from these latest advances. ODF 1.2 should be supported in Microsoft Office next year.
Thanks to ODF 1.2, documents are really taking a major step forward towards being structured, secure and interchangeable.
Real Applications for Semantics
Submitted by Francois Ragnet, September 30th, 2011
Not so fresh news, but it’s good to see Semantics applied to real world applications. After having fun playing Jeopardy, Watson is being put to work, mining WellPoint patient information, to help diagnose and suggest treatments for patients.
The WellPoint application will combine data from the patient’s chart and electronic records, the insurance company’s history of medicines and treatments, and Watson’s huge library of textbooks and medical information.
Pretty scary – but at least initially this is only going to be used as an assistance to the doctor. Still, other Watson applications had been envisaged before – like answering customer call centre requests – which could have been less “risky”.
When will the application actually become a reality? The press release mentions early 2012 for the first pilots.
But one should not forget that, in fact, Semantics – and Xerox technology – is already being used in Healthcare (although not in a commercial application). The ALADIN project, based on Xerox FactSpotter technology, is helping mine patient records for signs of Hospital-Acquired Infections, and is making great progress. And I prefer having the computer improve my patient experience after the fact, rather than give answers that might be life-threatening…
IBM on a shopping spree for Business Analytics acquisitions
Submitted by Francois Ragnet, September 1st, 2011
IBM announced its second acquisition related to Data Analytics in two days.
Yesterday, Big Blue announced their intention to acquire i2, a company providing business and intelligence analytics for crime and “smarter cities”, for an undisclosed amount.
Today, it was the turn of Algorithmics, a Toronto-based risk analytics firm for financial services, for $387 million.
Those two companies are only two on a long list of acquisitions in the business analytics space – including in recent years AptSoft, Cognos, SPSS and many others. But this acquisition rate seems to be accelerating.
Although not directly related to the Future of Documents, Analytics and documents have a lot in common: document structures and semantics are all about bringing the relevant information, or should I say knowledge, to the user, turning a “flat” or unstructured data representation into “actionable” business information.
HP gets serious about Content Mining business, and gets further away from hardware and mobile
Submitted by Francois Ragnet, August 23rd, 2011
HP announced a number of drastic decisions during their latest earnings call. The first one was their decision to acquire Autonomy for $10.3 billion. By acquiring one of the leaders in Meaning Based Computing and Enterprise Search, HP shows a “dramatic entry into Information Management”.
At the same time, HP confirms that it is exploring spinning off its Personal Computing business. One immediate – and rather sad – side-effect of this move away from hardware is shown by HP abandoning the WebOS platform for mobile devices, which was acquired from Palm just over a year ago.
HP is making a drastic shift into Software and Services, similar to the trajectory that Xerox is following. One difference, though, is that Xerox has a more balanced strategy that relies on external acquisitions (e.g. ACS) but also on internal innovation such as Smarter Document Technologies, FactSpotter, or Docushare.
Autonomy acquires Iron Mountain Digital Storage Businesses
Submitted by Francois Ragnet, May 19th, 2011
Interesting acquisition, where UK-based Autonomy acquires Iron Mountain’s digital archiving, e-discovery and online backup businesses.
Iron Mountain, whose traditional business was in “physical” document storage (initially in caves), ventured into digital storage in 2001, came (too) late into cloud storage (in 2009), and recently had to take the difficult decision to step away from that business. It is tough, in a world going away from paper, to see a company give up its digital business.
Autonomy is slowly becoming a huge player in digital content management. Traditionally a leader in Semantics (Meaning-Based Computing), it acquired big companies like Verity for its indexing engines and Interwoven for its Enterprise Content Management Systems, and now provides solutions and services for eDiscovery and multichannel Marketing.
“Real” applications of Natural Language Processing
Submitted by Francois Ragnet, March 7th, 2011
Sure, IBM’s Watson Jeopardy demonstration was very impressive. But its biggest value was to bring Natural Language Processing to everyone’s awareness – and prove that these technologies are real, and ready for real business.
My esteemed colleague Frederique Segond posted a very interesting write-up on some of the applications of Natural Language Processing to “real life” problems. In particular, she talks about some of the applications I have mentioned a few times already on my blog, on the SIIA Semantics panel, or anywhere else I can mention them. For example, NLP can help mine and make sense out of unstructured medical folders, as in the ALADIN project or others. Or, in a litigation case, NLP can filter out a large volume of irrelevant documents, but also extract significant information that can then be stored in a knowledge base to support the case.
Semantics embedded in Xerox machines?
Submitted by Francois Ragnet, February 4th, 2011
Semantics can have multiple applications, including making search more effective and relevant. Some of them may seem pretty far-fetched, such as making Xerox machines more reliable (or, more precisely, easier to fix), as I mentioned on the Semantics panel last week. So no, Semantics is not embedded in the machines, but it is directly accessible from there.
Indeed, the latest WorkCentre 7500 series includes access to the Online Knowledge Base right from the device’s EIP (Extensible Interface Platform) screen. But it is not a standard search, based on proximity between the keywords that were entered. The “search” is aided by Natural Language Processing – the query is analyzed grammatically, but also “translated” between the user’s “layman” terms and the technical jargon used in the Base. This provides more coverage, but also more accurate searches, making Xerox customers better able to fix problems on their machines, and maximizing the machine’s uptime while minimizing the maintenance cost for both the customer and Xerox.
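To make the "translation" idea concrete, here is a minimal sketch in Python of how layman terms could be mapped onto technical jargon before searching. It is only an illustration, not the actual Xerox implementation: the vocabulary table, the function names and the sample article titles are all invented for the example, and it covers only the vocabulary mapping, not the grammatical analysis of the query.

```python
import re

# Hypothetical mapping from everyday words to knowledge-base jargon.
# These entries are invented for illustration only.
LAYMAN_TO_JARGON = {
    "stuck": ["jam", "misfeed"],
    "paper": ["media"],
    "streaks": ["banding"],
    "ink": ["toner"],
}


def expand_query(query):
    """Return the query words plus their assumed technical equivalents."""
    terms = re.findall(r"[a-z]+", query.lower())
    expanded = []
    for term in terms:
        for candidate in [term] + LAYMAN_TO_JARGON.get(term, []):
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


def score(article, terms):
    """Count how many of the search terms occur in the article text."""
    words = set(re.findall(r"[a-z]+", article.lower()))
    return sum(1 for term in terms if term in words)


def search(query, articles):
    """Rank articles by overlap with the expanded query, dropping non-matches."""
    terms = expand_query(query)
    ranked = sorted(articles, key=lambda a: score(a, terms), reverse=True)
    return [a for a in ranked if score(a, terms) > 0]


if __name__ == "__main__":
    kb = ["Clearing a media jam in tray 2", "Replacing the toner cartridge"]
    print(search("paper stuck", kb))
```

With this toy table, the query "paper stuck" shares no keyword with either article title, so a plain keyword match would return nothing; after expansion it matches the "media jam" article. That is the extra coverage the paragraph above describes.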
Semantics were already used in printers before, in applications such as Natural Language Color Editing, which lets users tune their color settings with (restricted) natural language queries.