How can I digitize a large number of documents without transcription errors?

Episode 900 (40:15)

Robert from Carlsbad, CA

There are two ways to do it. He could do it manually, but with up to 300,000 pages, that would be too daunting. The other option is to do it mechanically using a sheet-fed scanner, which lets him put in hundreds of pages at a time. That gives him an image of each page, not editable text. To convert the images to text documents, he'll need optical character recognition (OCR) software, which comes with most scanners. It will go through each page and do the best it can to transcribe it. There is OCR software made specifically to handle medical documents, so he might want to look for that.

He should first scan all the documents in as image files, so he'll at least have them all digitally. Then he can try running those through the OCR software. He might still have to spot-check and correct the transcription errors, though. Fifo in the chatroom says a company called MicroFacs specializes in medical-based, searchable OCR.
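
For anyone who wants to script the scan-first, OCR-second workflow, here is a minimal Python sketch. It is only an illustration: the function name `ocr_archive`, the `ocr` callable, the folder layout, and the 1% spot-check rate are all assumptions, not anything mentioned on the show. The OCR engine is passed in as a function, so whatever software ships with the scanner (or a medical-specific product) could be wrapped and plugged in.

```python
import random
from pathlib import Path
from typing import Callable, List

# File extensions treated as scanned pages (an assumption for this sketch).
IMAGE_SUFFIXES = {".png", ".tif", ".tiff", ".jpg", ".jpeg"}


def ocr_archive(
    image_dir: Path,
    text_dir: Path,
    ocr: Callable[[Path], str],
    sample_rate: float = 0.01,
    seed: int = 0,
) -> List[Path]:
    """Transcribe every scanned image and pick pages for manual review.

    The original images are never modified, so the OCR pass can be
    repeated later with better software. `ocr` is a hypothetical
    callable that takes an image path and returns its text.
    Returns the image paths selected for spot-checking.
    """
    text_dir.mkdir(parents=True, exist_ok=True)
    images = sorted(
        p for p in image_dir.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES
    )

    for image in images:
        out = text_dir / (image.stem + ".txt")
        if out.exists():
            # Already transcribed; skipping lets a very large job resume.
            continue
        out.write_text(ocr(image), encoding="utf-8")

    if not images:
        return []

    # Review a fixed fraction of pages, but always at least one.
    k = min(len(images), max(1, round(len(images) * sample_rate)))
    return sorted(random.Random(seed).sample(images, k))
```

Calling `ocr_archive(Path("scans"), Path("text"), my_ocr)` would write one `.txt` file per image into `text` and return the pages to proofread against the originals. Keeping the images separate from the text output is the point of scanning first: if the transcription turns out to be poor, nothing has to be rescanned.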
