The Future of Documents
Open Xerox Innovation
Saturday, November 7th, 2009
Nice initiative: you can now test drive some of the latest Xerox innovations. Open Xerox is an Open Innovation space to explore Xerox technologies, interact with Xerox scientists, or even establish innovation partnerships.
Currently only one demo is available online: Natural Language Color Editing. This technology lets users change and improve color images with common words and phrases, with no need for advanced photo editing skills and tools. It is a nice example of making a very complex technology much simpler to use and more accessible; a simpler version is part of the Xerox 7500 color printer driver.
Similar Image Search becomes mainstream
Thursday, October 29th, 2009
The ways of “cutting through the clutter” and surviving information overload have changed: after trying to impose a structure on data by adding metadata, we quickly realized that this was hopeless and would not scale in the face of the explosion of the Digital Universe. Instead, we decided to (barely) keep our heads above water by using whatever techniques were available to search for the content we’re looking for in that sea of documents.
This has become relatively easy for text documents, even scanned ones. With keyword search, OCR and the latest indexing technologies do a fairly good job of finding the information we need, provided we phrase the search the right way.
The next frontier lies in more complex, richer documents, such as pictures or videos. There too, searching by content is finally becoming (relatively) mainstream.
I recently blogged about a research project in which images can be searched by similar content, but this capability is now coming out of the labs. Gazopa, for example, is entering its open beta stage. This plug-in lets you search for pictures similar to one that has been indexed, or even create a drawing and find images similar to it. Google Similar Images allows you to do the same, although the original image you search from needs to be indexed ahead of time. Even more interesting, Google’s latest Picasa version lets you index individual elements of a picture, such as faces, and then search for other occurrences of that face throughout your collection.
Does it really work? I’ll let you test it for yourself; my personal experience has been relatively mixed. However, I have no doubt these tools will improve significantly before long, causing a paradigm shift in the way we deal with information overload.
Back from Optimizing Innovation
Monday, October 26th, 2009
The conference was really interesting. I was honored, thrilled (and intimidated) to be presenting alongside Chief Innovation Officers, Strategy VPs, and other such great presenters.
The focus was on understanding how to drive innovation into any organization, either using a top-down strategy or, as in my case, in a bottom-up fashion. Although my presentation was outside the “core”, I felt it was successful and very well received. Indeed, the world of “Services” (as in Professional Services) has different challenges from more traditional industries, such as repeatability and shorter cycle times. That made my participation even more relevant and thought-provoking, and the use case of our Smarter Document Management Technologies, including Hybrid Categorizer, in which we have been able to accelerate and reuse a number of world-class innovations from Xerox research, really struck a chord.
It was a great opportunity for learning, too, and some of the approaches or initiatives that were presented (e.g. “Innovation Bootcamps”) were very interesting. My take on it? There is no magic recipe: innovation requires a mix of culture, top-down processes and ad hoc, bottom-up initiatives.
On my way to Optimizing Innovation ’09
Tuesday, October 20th, 2009
Getting ready for my presentation at Optimizing Innovation ’09, Thursday morning, in NYC. Should be an interesting story on how the “Less Paper Office” meets Business Process Improvement.
Promises to be an interesting conference: “Optimizing Innovation 2009 will give you the opportunity to hear ideas and experiences from top speakers from the most innovative companies, on the most current and exciting topics in the form of social networking activities, brainstorming sessions, talking circles, keynote case studies and insightful presentations”.
If you happen to be in the Big Apple and want to drop by, or will be at the conference itself, don’t hesitate to come meet me in person!
Automatic Text Categorization for e-Discovery
Sunday, October 18th, 2009
Technology can help in many document-intensive processes, sometimes even in the most difficult cases, where human review was until now considered the only option. For example, Xerox Litigation Services is now leveraging Categorix, the Text Categorizer, to expedite the review of documents in a litigation case.
Categorix is a technology developed at the Xerox Research Centre Europe, which uses the textual information in a document. It is based on machine learning: from a number of samples, it learns the vocabulary that is representative of each class it has to deal with (here “responsive” vs “non-responsive”). Once trained, it can identify “responsive” documents with an accuracy actually higher than that of human reviewers, and automate the typical review process by automatically tagging the documents where it is confident enough, leaving the more uncertain ones for human confirmation.
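To make the “tag when confident, otherwise hand over to a human” idea concrete, here is a minimal sketch in Python. It is not how Categorix itself works: it assumes a tiny Naive Bayes word-count classifier as a stand-in, and the class name, the `triage` helper, the 0.9 threshold and the four training samples are all hypothetical, made up for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Deliberately naive: lowercase and split on whitespace.
    return text.lower().split()


class TinyNaiveBayes:
    """Toy multinomial Naive Bayes (a stand-in, not the Categorix algorithm)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of samples
        self.vocab = set()

    def train(self, samples):
        # Learn which vocabulary is representative of each class.
        for text, label in samples:
            self.doc_counts[label] += 1
            for word in tokenize(text):
                self.word_counts[label][word] += 1
                self.vocab.add(word)

    def probabilities(self, text):
        # Return a normalized probability per class for the given text.
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for label in self.doc_counts:
            total_words = sum(self.word_counts[label].values())
            score = math.log(self.doc_counts[label] / total_docs)
            for word in tokenize(text):
                if word not in self.vocab:
                    continue  # ignore words never seen during training
                count = self.word_counts[label][word]
                # Add-one (Laplace) smoothing avoids zero probabilities.
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            log_scores[label] = score
        top = max(log_scores.values())
        exps = {label: math.exp(s - top) for label, s in log_scores.items()}
        norm = sum(exps.values())
        return {label: value / norm for label, value in exps.items()}


def triage(model, text, threshold=0.9):
    # Auto-tag only when the model is confident enough (hypothetical threshold).
    probs = model.probabilities(text)
    label, confidence = max(probs.items(), key=lambda item: item[1])
    if confidence >= threshold:
        return label, confidence
    return "needs human review", confidence


# Hypothetical training samples, invented for this sketch.
samples = [
    ("merger pricing agreement with supplier", "responsive"),
    ("contract pricing terms for the merger", "responsive"),
    ("lunch menu for friday team outing", "non-responsive"),
    ("office party parking reminder", "non-responsive"),
]
model = TinyNaiveBayes()
model.train(samples)

for doc in ("merger pricing contract", "friday lunch for the supplier"):
    print(doc, "->", triage(model, doc))
```

With these toy samples, the first document is tagged automatically, while the second mixes vocabulary from both classes and is left for a human reviewer. A real system would be trained on far more samples, with the threshold tuned on documents already reviewed.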
Considering that a typical litigation case involves a million documents, with average review costs of around $1 per document, such a technology can yield major savings, not to mention speed and consistency, of course.
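As a back-of-the-envelope illustration of those savings, the short Python sketch below uses the figures above (a million documents, about $1 of review cost each). The share of documents tagged automatically is not a published number: the fractions are hypothetical, and the cost of running the technology itself is ignored.

```python
def review_cost(num_docs, cost_per_doc, auto_tagged_fraction):
    """Human review cost when a fraction of documents is tagged automatically."""
    human_reviewed = num_docs * (1 - auto_tagged_fraction)
    return human_reviewed * cost_per_doc


NUM_DOCS = 1_000_000   # "a million documents"
COST_PER_DOC = 1.0     # "around $1", assumed here to be per document

baseline = review_cost(NUM_DOCS, COST_PER_DOC, 0.0)
for fraction in (0.25, 0.5, 0.75):  # hypothetical automation rates
    cost = review_cost(NUM_DOCS, COST_PER_DOC, fraction)
    saved = baseline - cost
    print(f"{fraction:.0%} auto-tagged: ${cost:,.0f} to review, ${saved:,.0f} saved")
```

In this simple model the savings grow linearly with the share of documents the categorizer is confident enough to tag on its own.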
Getting ready for Optimizing Innovation ’09
Wednesday, October 14th, 2009
I’ll be presenting at the Optimizing Innovation conference next week in NYC.
Under the theme of “Succeeding through Service Innovation”, I’ll be talking about how we are making the “Less Paper Office” a reality for our customers: reducing cost, improving productivity and quality, and driving sustainability into our clients’ paper-intensive document processes. I will be sharing some insights on Smarter Document Technologies, and on how to leverage best-of-breed innovation from world-renowned labs (Xerox, of course!) and focus research creativity to come up with new, repeatable but disruptive service offerings, while making sure this innovation corresponds to what the customer needs, of course.
Feel free to join me or catch up with me if you are around! The full program can be found there, but is not fully up to date: please note my talk is now scheduled for 9:30am on October 22nd.
Ad-sponsored Health Records?
Monday, October 12th, 2009
The Stimulus package is definitely a great incentive for getting small practices and large hospitals to move towards Electronic Medical Records, despite a pretty high upfront cost of around $44,000 per physician to install a new electronic health-record system. Daily Finance has an interesting article and interview on a new trend: ad-sponsored online health records.
Practice Fusion, a small start-up, has taken the interesting approach of making that service free but ad-sponsored. Their software is web- and cloud-based (the company is a Salesforce.com partner), meaning doctors don’t have to worry about setting up the software. Even better, it can be free, provided doctors agree to have ads appear on their record system. Practice Fusion provides interesting capabilities, like automatic charting, patient management, ePrescription, scheduling and billing.
One thing that leaves me a bit uncomfortable with many of these proposals is the gap between past and present (paper) and future (fully digital), and the deliberate avoidance of the hardest problem: making legacy paper records accessible in the new system. Sure, new paper documents can be scanned and imported as images, but what about the legacy volume of documents still sitting in folders? How can you extract and inject them into an Electronic Medical Record system, while making sure this information can be searched, accessed and retrieved easily?
I recently visited one of our customers, who has a huge warehouse of over a million Medical Record folders. It was a scary experience, especially when thinking that my life might depend, one day, on speedy access to the right information contained in that 200-page folder, sitting among a million others …
So how do you intelligently scan the legacy medical record and recreate an intelligent, electronic version that is navigable, searchable, and brings as much information to the doctor (and hopefully more) as the physical paper record? That is, to me, the toughest problem. I’ll be touching on some of these aspects in the future.
