Friday 25 May 2007

Distorted letter recognition helps improve book digitization

Those clever folks at CMU have done it again. A CMU professor invented "Captcha" technology, which asks users to recognize distorted letter/number sequences before they can log into blogs or websites, eliminating some spam. Now a team of CMU students has launched a service that uses the same technology to help digitize books that can't easily be digitized using conventional OCR (optical character recognition) technology. The service, launched Tuesday May 21, is called "ReCaptcha" and is already being used on 150 websites. The CMU project is using the technology to digitize books in the Internet Archive, a project building a digital library of cultural materials.
