Reducing Noise in scanned documents

olaeblue
Posts: 2
Joined: 2011-02-25T02:03:07-07:00

Reducing Noise in scanned documents

Post by olaeblue »

I am scanning some old documents and need to pre-process them before OCR. Paint.NET has a function called Reduce Noise which, with a radius of 200 and a strength of 1 (strength ranges from 0 to 1), takes me from image 1 to image 2, basically cleaning up the grey background to white. (These are low-res versions of the large TIFF files I actually have, but they give the general impression.)

Image 1: http://www.yrc.org.uk/data/files/downloads/orig.jpg

Image 2: http://www.yrc.org.uk/data/files/downloads/clean.jpg

I have about 600 of these to do! What is the equivalent function in ImageMagick? -despeckle doesn't seem to do it, and I think -blur might be right, but I can't get the right effect.

Any help gratefully received.
el_supremo
Posts: 1015
Joined: 2005-03-21T21:16:57-07:00

Re: Reducing Noise in scanned documents

Post by el_supremo »

To clean up grayscale scans of documents, I've been using this command - although I wasn't then running them through an OCR program:

convert Scan10007.bmp -threshold 65% -deskew 40% scan_7.png

You would have to play with the threshold value to suit your images; it converts the image to pure black and white. The output file format should not be JPG, because JPEG compression will introduce artifacts that will probably confuse the OCR engine.
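
For a batch of several hundred scans, the same command can be wrapped in a loop. The following is only a minimal POSIX shell sketch, assuming the scans are *.tif files in one directory and that PNG output in a second directory is acceptable. The function name clean_scans and the IM_CONVERT variable are made up for illustration and are not part of ImageMagick; the 65% default is simply the value from the command above and would need tuning.

```shell
#!/bin/sh
# clean_scans SRC_DIR OUT_DIR [THRESHOLD]
# Hypothetical helper: runs the threshold + deskew command on every *.tif
# in SRC_DIR and writes the results as PNG files into OUT_DIR.
# IM_CONVERT is an assumed override for the ImageMagick binary name
# (for example IM_CONVERT=magick on ImageMagick 7); it defaults to convert.
clean_scans() {
    src_dir=$1
    out_dir=$2
    threshold=${3:-65%}          # starting point only; tune per document set

    mkdir -p "$out_dir" || return 1

    for f in "$src_dir"/*.tif; do
        [ -e "$f" ] || continue  # glob matched nothing, skip the literal pattern
        name=$(basename "$f" .tif)
        ${IM_CONVERT:-convert} "$f" -threshold "$threshold" -deskew 40% \
            "$out_dir/$name.png" || echo "failed: $f" >&2
    done
}
```

It could be called as, say, clean_scans ./scans ./cleaned 70% after trying a few threshold values on a single page first.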

Pete
fmw42
Posts: 25562
Joined: 2007-07-02T17:14:51-07:00
Location: Sunnyvale, California, USA

Re: Reducing Noise in scanned documents

Post by fmw42 »

You might take a look at my bash script, textcleaner, at the link below.
olaeblue
Posts: 2
Joined: 2011-02-25T02:03:07-07:00

Re: Reducing Noise in scanned documents

Post by olaeblue »

Thanks. I couldn't use the bash script as I'm a Windows person, but the explanation of what it does allowed me to build a suitable command line. :D
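
The command that was finally used is not given in the thread. As one possible approach for an uneven grey background, where a single global -threshold struggles, ImageMagick's local adaptive threshold (-lat) can be tried. The sketch below is an assumption for illustration, not the command olaeblue built: lat_clean and IM_CONVERT are made-up names, and the 25-pixel window and 10% offset are guessed starting values.

```shell
#!/bin/sh
# lat_clean IN_FILE OUT_FILE [WINDOW] [OFFSET]
# Hypothetical helper: whitens the background with a local adaptive threshold.
# The image is negated first so the text is the bright part; -lat then keeps
# only pixels brighter than their local average plus OFFSET percent, and the
# second -negate restores black text on a white background.
lat_clean() {
    in_file=$1
    out_file=$2
    window=${3:-25}   # neighbourhood size in pixels (assumed starting value)
    offset=${4:-10}   # percent above the local mean (assumed starting value)

    ${IM_CONVERT:-convert} "$in_file" -colorspace Gray -negate \
        -lat "${window}x${window}+${offset}%" -negate "$out_file"
}
```

For example, lat_clean page.tif page.png 35 12 would try a larger window and a stricter offset. The -lat arguments are plain options, so the same ones can be used in a Windows command line.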