Tag Archives: Optical character recognition


HS Talkoot: Microtask to the rescue of Finnish media history

Posted by Tommaso De Benetti

… (December 1889 – January 1890). All pages have been digitized in advance through automatic text recognition. Unfortunately, OCR software makes lots of mistakes when reading text that has deteriorated or is printed in old fonts. This is where human intelligence (especially our innate ability to understand signs) is needed to set the record straight.

Gamifying media

If this sounds noble but boring, don’t panic. That’s not how we run things. HS Talkoot is structured as an online game. …

Tags: Aamulehti crowd crowdsourcing Facebook Finland Finnish language Helsingin Sanomat microwork National Library of Finland Optical character recognition Päivälehti


Lost in the Virtual Economy? Here’s a map

Posted by Tommaso De Benetti

… Digitalkoot, the first Microtask-powered service, combines advanced OCR techniques, human recognition skills and game mechanics to seamlessly distribute work to end users. What we are really proud of is that the crowd does the work by playing games. (Let’s call it “gamesourcing”, just to confuse everyone even more.)

Thankfully, the report also considers what some of these new terms actually mean. Usually the word “crowdsourcing” is used interchangeably with …

Tags: Amazon Mechanical Turk crowd crowdsourcing gamesourcing microtask microwork National Library of Finland Optical character recognition University of Tokyo Ville Miettinen virtual economies virtual economy virtual item work


Summer Blockbuster, in cinemas now: The Document Processing Knight Rises

Posted by Tommaso De Benetti

As regular readers of this blog know, there is nothing we like more than discussing strange and new types of crowdsourcing. From weird music-related experiments to the incidence of expressions such as “I need to” during the Mad Men era, we try to keep you informed about what is going on across our industry.

Every now and then, however, we use this forum to talk …

Tags: crowdsourcing Don Draper Downton Abbey google History Mad Men microtask National Library of Finland Ngram Optical character recognition


Search me: what Mad Men and brave moles can do for historical records

Posted by Ville Miettinen

Ever since we began helping the National Library of Finland correct mistakes in its old newspaper archive, I have noticed myself developing a slightly antisocial interest in historical texts. I say ‘antisocial’ because of its effect on conversation: I have found that while most people claim to be interested in history, the best way to get unwanted guests to leave …

Tags: crowdsourcing Don Draper Downton Abbey google History Mad Men microtask National Library of Finland Ngram Optical character recognition


The secrets of Digitalkoot: Lessons learned crowdsourcing data entry to 50,000 people (for free)

Posted by Tommaso De Benetti

… making them searchable over the internet. It uses crowdsourced volunteers to input data that Optical Character Recognition (OCR) software struggles with (for example documents that are handwritten or printed in old fonts, such as very old copies of the newspaper Aamulehti).

Digitalkoot relies on machines, humans and a gaming twist to make it all fun. This is how it works in practice: old newspaper text is scanned by OCR software and then cut up into individual words. These words are …
