Google is still working to digitize the entire book collections of the New York Public Library and the Harvard University libraries, and we have yet to hear about the outcome. While we wait, let's look at another breakthrough project: the Million Book Project at Carnegie Mellon University in Pittsburgh, which has been running for the last seven years.



Google aims to put all offline books online. The Carnegie Mellon University (CMU) project has a similar goal and faces the same challenges.



The problems faced:



1. Physically scanning millions of pages

2. Different languages

3. Different fonts

4. Linking the scanned output to text searches

5. And above all, giving the same library browsing experience
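
To make problem 4 concrete, here is a minimal sketch of how recognized page text could be linked to text search with a simple inverted index. It is only an illustration, not how Google or the CMU project actually does it. The names build_index and search, and the (book, page) keys, are made up for this example, and the word splitting assumes languages that separate words with spaces.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each word to the set of (book_id, page_number) locations containing it.

    pages: dict mapping (book_id, page_number) -> recognized page text.
    """
    index = defaultdict(set)
    for location, text in pages.items():
        # Lowercase so that "Books" and "books" match the same entry.
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(location)
    return index


def search(index, query):
    """Return the sorted locations that contain every word in the query."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return sorted(hits)


# Hypothetical sample data standing in for scanned pages.
pages = {
    ("book-a", 1): "Digital libraries scan books",
    ("book-a", 2): "Books in many languages",
    ("book-b", 1): "Scan every page",
}
index = build_index(pages)
print(search(index, "scan books"))
```

A search result here is a book and page number, so a reader can be sent straight to the scanned page image that contains the words.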



The CMU project uses Minolta PS 7000 book scanners at 40 scanning stations in India and China, where operators turn the pages by hand. The project is moving quickly toward its goal, with around 100,000 pages scanned, and is hoped to be complete in about five years.



Google, on the other hand, is using its own scanning technology and is not letting the cat out of the bag. Meanwhile, researchers on the CMU project are using algorithms to identify sentence length, structure, and punctuation.
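
The CMU algorithms themselves are not described here, so the following is only a rough sketch of what measuring sentence length and punctuation might look like on recognized text. The function name sentence_features and the naive rule for where a sentence ends are assumptions made for this example.

```python
import re
import string


def sentence_features(text):
    """Return one dict per sentence with its word count and punctuation count."""
    # Naive split: a sentence ends at '.', '!' or '?' followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", text)
    sentences = [part.strip() for part in parts if part.strip()]
    features = []
    for sentence in sentences:
        punctuation = [ch for ch in sentence if ch in string.punctuation]
        features.append({
            "sentence": sentence,
            "word_count": len(sentence.split()),
            "punctuation_count": len(punctuation),
            "ends_with": sentence[-1],
        })
    return features


for row in sentence_features("The page was scanned. Was it clean? Yes, mostly!"):
    print(row["word_count"], row["punctuation_count"], row["ends_with"])
```

Note that string.punctuation only covers ASCII marks, so text in other scripts would need its own punctuation list.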



So, can you do it for your own library? You just need scanning software that handles different languages and an organized system.



Check out the details here.