preprocessing steps to do ocr?

anthony
Location: Brisbane, Australia

Re: preprocessing steps to do ocr?

Post by anthony »

Try a median filter, or perhaps LAT (-lat) with a large window to equalise the overall brightness, or even divide the image by a strongly blurred version of itself. Once the overall brightness is equalised, you can try other filters to improve the character definition.
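
To show what the median and "divide by a blurred copy" steps do to the pixel values, here is a minimal pure-Python sketch. It is an illustration only, not how ImageMagick implements these operations: the image is a small list of rows of greyscale values in 0..1, a plain box blur stands in for a strong Gaussian blur, and all function names and the toy "page" are made up for the example.

```python
from statistics import mean, median

def window(img, y, x, r):
    """Values in the (2r+1)-square window around (y, x), clipped at the edges."""
    h, w = len(img), len(img[0])
    return [img[j][i]
            for j in range(max(0, y - r), min(h, y + r + 1))
            for i in range(max(0, x - r), min(w, x + r + 1))]

def filter_image(img, r, stat):
    """Replace every pixel by stat() of its surrounding window."""
    return [[stat(window(img, y, x, r)) for x in range(len(img[0]))]
            for y in range(len(img))]

def median_filter(img, r=1):
    """Remove isolated specks while keeping edges fairly sharp."""
    return filter_image(img, r, median)

def box_blur(img, r):
    """Crude stand-in for a strong blur: estimates the local paper brightness."""
    return filter_image(img, r, mean)

def divide_by_blur(img, r):
    """Divide each pixel by the local background estimate, clipped to 1.0 (white)."""
    blurred = box_blur(img, r)
    return [[min(1.0, p / b) if b > 0 else 1.0
             for p, b in zip(row, brow)]
            for row, brow in zip(img, blurred)]

# Toy page: paper brightness ramps from 0.4 (left) to 1.0 (right), as with
# uneven lighting; two "ink" pixels are five times darker than the paper
# around them.
def paper(x):
    return 0.4 + 0.04 * x

TEXT = {(4, 3), (4, 12)}
page = [[paper(x) * (0.2 if (y, x) in TEXT else 1.0) for x in range(16)]
        for y in range(9)]

# After the division the paper is close to white everywhere, while both
# ink pixels keep roughly the same darkness relative to the paper.
flat = divide_by_blur(page, 3)
```

The blur radius has to be much larger than the stroke width of the characters, otherwise the text itself ends up in the background estimate and is divided away.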

Oh, and try to keep the black border so you can -deskew your document to remove any rotation from the scan.

Above all, use a high resolution when scanning. OCR seems to assume a scan density of at least 600 dpi.

How about providing links to small, reduced test images, along with the results and the methods you found worked best? Very few people have reported their findings on OCR improvements. With some test images, others may also be able to give hints.
Anthony Thyssen -- Webmaster for ImageMagick Example Pages
https://imagemagick.org/Usage/