Managing Information Overload and Photographs
Wednesday, July 1st, 2009
Documents are not only textual. Pictures, photographs, music and videos are taking up an increasing amount of space on our hard disks, and constitute a new breed of documents growing at an incredible rate. This is due to the democratization of digital capture devices such as cameras, camcorders, and smartphones, and to the increasing storage space we have at our disposal, whether offline or online.
It has become really difficult to manage all of these documents. How will I find a picture of my kid in front of the Eiffel Tower 10 years from now? One way is to be disciplined in the way we organize them and add metadata. But how can pictures be handled when no metadata is available? Not to mention that this metadata is often subjective (e.g. for the content of a photograph, should I use Paris, Eiffel, Tower, or Trocadero?).
What if the actual content of pictures and photographs could be used as the source of your search…
Computer vision is progressing, and a number of technologies are under development to automatically analyze the content of a picture and find other photographs with similar content. There are many technologies for managing large image repositories, at Xerox and elsewhere. I particularly like this example application, though, because it's live and works off a real database: it lets you search a database of 10 million images for images similar to the one you have selected or uploaded. Try it with your holiday pictures. It's quite impressive, especially in the accuracy of the results.
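To give a feel for what "searching by content" can mean, here is a minimal sketch in Python. It is not how the application above works (its method is not described here); it assumes the simplest possible approach, where each image is summarized by a color histogram and images are ranked by cosine similarity to the query. The function names (`color_histogram`, `cosine_similarity`, `most_similar`) and the toy "images" (plain lists of RGB tuples) are made up for illustration.

```python
from math import sqrt


def color_histogram(pixels, bins=4):
    """Summarize an image as a normalized joint RGB histogram (bins**3 cells).

    `pixels` is a list of (r, g, b) tuples with values in 0..255.
    """
    hist = [0.0] * (bins ** 3)
    step = 256 / bins
    for r, g, b in pixels:
        # Map each channel to a bin index, clamped to the last bin.
        ri = min(int(r / step), bins - 1)
        gi = min(int(g / step), bins - 1)
        bi = min(int(b / step), bins - 1)
        hist[ri * bins * bins + gi * bins + bi] += 1.0
    total = len(pixels)
    return [h / total for h in hist] if total else hist


def cosine_similarity(a, b):
    """Cosine of the angle between two histograms (0.0 if either is empty)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(query_pixels, database, k=3):
    """Return the names of the k database images closest to the query."""
    query_hist = color_histogram(query_pixels)
    scored = [
        (cosine_similarity(query_hist, color_histogram(pixels)), name)
        for name, pixels in database.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]


# Toy data: each "image" is just 100 pixels (hypothetical, for illustration).
database = {
    "sunset": [(250, 120, 30)] * 90 + [(20, 20, 60)] * 10,
    "forest": [(30, 140, 40)] * 100,
    "beach": [(240, 120, 40)] * 70 + [(40, 120, 220)] * 30,
}
query = [(245, 125, 35)] * 80 + [(30, 30, 50)] * 20

print(most_similar(query, database, k=2))
```

Color histograms ignore shapes and objects entirely, so a real system would rely on much richer image descriptors and on an index built to scale to millions of images. The overall pattern is the same, though: turn each picture into a vector, then look for the nearest vectors.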

