Going green: What’s good for the Environment is Good for Business

May 16th, 2008

An excellent webcast from Gartner on how the electronic workplace, together with a move to electronic content management and a “basic content services” strategy, can help you get greener… while improving your business efficiency.

Mark Gilbert, Research VP at Gartner, and Patricia Calkins take us through a very interesting presentation. The key takeaway is to leverage scanning, eForms or electronic documents to reduce paper usage… Sounds familiar?

Innovation for the Future of Documents - and beyond

May 2nd, 2008

Earlier this week, a great event took place at Xerox PARC: an open house where many journalists were invited for a sneak preview of some of the Xerox innovations in support of the Future of Documents, as well as green technologies.

Here are two excellent videos on the topic:

For those of you who prefer reading, here are a few sources:

It’s always good to see great innovation on very focused topics (whether 3D visualisation of documents or reusable paper) getting such excellent coverage.

Long-term image document preservation

April 30th, 2008

While browsing the Digital Preservation Coalition site, I found another interesting recent report on long-term preservation, written by my Xerox colleague Rob Buckley for the DPC.

Indeed, records management and long-term archiving need to support documents in various formats - good old electronic or paper “office” documents, but also images or multimedia.

As suggested in “JPEG 2000: A Practical Digital Preservation Standard?”, JPEG 2000 is a very good option, as it combines very high quality, high compression, and openness. It is therefore a good choice for long-term preservation of image documents.

And of course, I’m always glad to see colleagues involved in shaping the Future of Documents.

Using PDF for long-term document preservation

April 28th, 2008

The Digital Preservation Coalition (DPC) has just published a report on digital preservation, which states that “PDF should be used to preserve information for the future”. This is an important step for the Future of Documents: whether for records management, long-term archival, or other forms of preservation, it is important to choose a format that will keep today’s archived documents readable and accessible a few decades from now (or even later).

 The Digital Preservation Coalition (DPC) was established in 2001 to foster joint action to address the urgent challenges of securing the preservation of digital resources in the UK and to work with others internationally to secure our global digital memory and knowledge base.

If you remember my blog post on XML or PDF/A for archiving, this is fully in line with what I have been advocating: you need to make sure that whatever digital format you choose will still be accessible or queryable in a few decades, and there’s nothing like a stable, standard format with a long-term support commitment.

It is true that PDF is just a generic container, and will not contain the “semantic” information that a specialized XML format would carry, and which would make your document “queryable” in the future (be aware, though, that general-purpose XML “standards” such as OOXML or ODF probably won’t carry much more semantic information than PDF…). On the other hand, the schemas used to query your documents in the future will be vastly different from today’s, if today’s are supported at all. So, for the time being, I would agree that PDF/A is your safest bet.

I highly recommend reading the actual DPC report (“Preserving the Data Explosion: Using PDF”), which provides a detailed history, concrete tips and useful resources and links (including for specific verticals). My only additions would be common sense: embed as much information as possible in your PDF while complying with the standard (e.g. the original hi-res image with text for a paper scan, full-text information for native electronic documents, as much metadata as possible, etc.) and put a plan in place for preserving your files…

Lifelines for home offices drowning in paper

April 20th, 2008

The “less paper office” is becoming a popular topic, even beyond the corporate world - see CNN’s coverage of the topic here. Although quite simplified, this view is very similar to the one presented in the Less Paper Office White Paper.

Automated Intelligent Document Classification, Data Extraction and Search Tools for the Legal market

April 17th, 2008

An interesting article from ContentWrangler describes how Document Classification techniques, such as those that are regularly discussed on this blog, can be applied to add intelligence and structure to Legal documents.

Although a bit “salesy”, this article (originally written by A2iA, a leading Advanced Document Recognition company) shows how technologies such as Document Classification and Intelligent Data Extraction (including handwriting) can help the Legal market and, more generally, any paper-intensive content management system.

Digital Footprint Calculator

April 8th, 2008

It has been a few years since mankind started to realize how scarce and precious natural resources are, and what impact our activities, however insignificant, can have on the environment. This is especially true of document activities, with recent examples such as estimates of the “Paper Universe” or Xerox’s Carbon Footprint Calculator, made available to estimate your print fleet’s impact on the environment.

But mankind is now starting to realize that even digital, “virtual” information has a cost and an impact, as we see the first attempts to evaluate its footprint. IDC and EMC are at the forefront of these activities, with the Digital Footprint Calculator.

Those of you who are familiar with my blog might remember my first post on the topic and the humongous estimate of the Digital Universe. The report and its estimate have been revised, and the figure is actually higher than expected: 281 exabytes in 2007, due to grow to 1800 exabytes.

In fact, less than half of that information is created by human activities (documents, pictures, phone calls, …). The rest constitutes a “digital shadow”: surveillance photos, logs, journals, backups, etc.

The report is available online at this URL, and more interesting resources such as videos and a copy of the Digital Footprint Calculator can be found here.

Scary, huh?

How PARC sees printers boosting clean tech

April 4th, 2008

See this interesting article on News.com on how technology derived from printer research can be applied to clean tech and sustainability, e.g. to purify water or to create extremely efficient solar panels. Worth reading!

ISO and IEC approve Office Open XML format

April 2nd, 2008

Microsoft confirmed yesterday in a press release that OOXML was ratified as an ISO standard, promptly followed by an ECMA announcement.

This should be a step forward for long-term preservation of documents and records management, as these quotes illustrate:

“Just as we have worked to establish and steward our print collections, the British Library is committed to preserving and providing access to the U.K.’s digital heritage,” said Adam Farquhar, head of Digital Library Technology at the British Library, and vice-chair of Ecma TC45. “Establishing Office Open XML as an open standard substantially enhances our ability to achieve this. It’s an important step forward for digital preservation and will help us fulfill the British Library’s core responsibility of making our digital collections accessible for generations to come.”

“The U.S. Library of Congress believes that the preservation of digital content for future generations will be much easier if widely used software applications use formats with full public specifications that will be maintained by the global community going forward,” said Martha Anderson, Director of Program Management, National Digital Information Infrastructure and Preservation Program. “The approval of Office Open XML as an international standard has important benefits for libraries and other archival institutions for generations to come.”

A number of observers are, however, a bit more dubious about the real impact of this standardization, and some even recommend changes to the standardization process.

OpenXML approved as ISO standard?

March 31st, 2008

That’s at least what early reports indicated yesterday. OOXML has apparently collected enough votes to be approved as ISO standard DIS 29500. The news should be confirmed today.

Despite predictions over the last few days that the standard might be rejected for the second time (see here or there), the latest results seem to indicate that Microsoft will win this second battle.

This second vote, which I am still hopeful is good news for the Future of Documents and open document standards, has unfortunately been overshadowed by political maneuvering, even more so than the first round - even CNET News.com ran “Open XML vote: Politics, intrigue, and, oh, some tech” among its top headlines…