Tuesday, June 30, 2009

Prof. Luis von Ahn

I just finished watching the PBS show Nova Science Now (http://www.pbs.org/wgbh/nova/sciencenow/) and saw an amazing profile of Professor Luis von Ahn (http://www.pbs.org/wgbh/nova/sciencenow/0401/04.html). He helped invent CAPTCHA, the box of distorted words that you have to decipher when using Facebook and other websites or when setting up a new email account. It is a security feature meant to stop automated spam programs: humans can decipher the letters but computers cannot, so it basically determines whether the thing accessing a website is a person or a computer. (If you want to see an example, check out the Wikipedia article at http://en.wikipedia.org/wiki/CAPTCHA.)

At one point, von Ahn realized that people were spending a lot of time typing those letters into boxes, so he decided to take his creation one step further and put that effort to use on something useful.

So he came up with reCAPTCHA (http://recaptcha.net/). reCAPTCHA puts the effort people spend solving CAPTCHAs to work digitizing books, newspapers and old-time radio shows. When old books are scanned with OCR (optical character recognition) software, some words cannot be read by the scanner for various reasons. According to the reCAPTCHA website, each word that cannot be read correctly by OCR is placed on an image and sent to the Web as a CAPTCHA for humans to decipher. This is possible because most OCR programs alert you when a word cannot be read correctly. Currently this re-imagined technology is being used with books digitized for the Internet Archive and old issues of The New York Times.
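To make the idea concrete, here is a rough sketch in Python of how answers for an unreadable word might be collected. It is only an illustration, not reCAPTCHA's actual code. It assumes each challenge pairs the unreadable word with a "control" word whose answer is already known, and that a transcription is accepted once enough people agree on it. The names WordQueue, submit and VOTES_NEEDED, and the threshold value, are all made up for this example.

```python
from collections import Counter, defaultdict

# Hypothetical threshold: how many matching human answers are required
# before a transcription is accepted. Not a value from the reCAPTCHA site.
VOTES_NEEDED = 3


class WordQueue:
    """Collects human guesses for words the OCR software flagged as unreadable."""

    def __init__(self, votes_needed=VOTES_NEEDED):
        self.votes_needed = votes_needed
        self.guesses = defaultdict(Counter)  # word image id -> tally of guesses
        self.resolved = {}                   # word image id -> accepted text

    def submit(self, control_answer, control_guess, unknown_id, unknown_guess):
        """Handle one solved challenge. Returns True if the user passes."""
        # The control word has a known answer, so it alone decides pass/fail.
        if control_guess.strip().lower() != control_answer.lower():
            return False
        # A user who got the control word right is trusted enough to cast
        # one vote on the word the OCR could not read.
        if unknown_id not in self.resolved:
            guess = unknown_guess.strip().lower()
            self.guesses[unknown_id][guess] += 1
            if self.guesses[unknown_id][guess] >= self.votes_needed:
                self.resolved[unknown_id] = guess
        return True


queue = WordQueue(votes_needed=2)
queue.submit("morning", "morning", "scan-17", "upon")   # passes, one vote
queue.submit("morning", "mourning", "scan-17", "open")  # fails, vote ignored
queue.submit("river", "River", "scan-17", "Upon")       # passes, second vote
print(queue.resolved)
```

The point of the control word in this sketch is that the computer cannot check the unreadable word itself, so it needs some other way to decide whether to trust the person typing.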

You can help in this effort by using reCAPTCHA on your website or anywhere you post your email address. For more information consult the website at http://recaptcha.net/learnmore.html.
