Cleaning up noise around text
I've tried -noise radius and -noise geometry and they don't seem to do what I want at all. I have some b&w images (TIFF G4 Fax compression) with lots of noise around the characters. This noise takes the form of pixel blobs that are 1 pixel wide in most cases.
My desire is to do the following 3 steps (in this order):
Whiteout all black pixels that are 1 pixel wide
Whiteout all black pixels that are 1 pixel tall
Whiteout all black pixels that are 1 pixel wide
So the question is, do I have to crack out my C++ skills, or can I do this with imagemagick?
- fmw42
- Posts: 25562
- Joined: 2007-07-02T17:14:51-07:00
- Authentication code: 1152
- Location: Sunnyvale, California, USA
Re: Cleaning up noise around text
see -morphology close (it would be open if your image was white letters on black, but you need to use close for black letters on white background)
http://www.imagemagick.org/Usage/morphology/#basic
You will have to pick the shape and size of the kernel to match the noise you want to remove. For tall, thin noise use a short, wide kernel, and vice versa.
Can you post a link to your image? It would help to have that to know if this is a viable approach.
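To make the effect of the kernel shape concrete, here is a minimal plain-Python sketch of a morphological close on a tiny bilevel grid. It is only an illustration under stated assumptions, not ImageMagick's implementation: 1 is white, 0 is black, edge pixels are replicated, and the names SQUARE, DIAMOND, close and page are made up for this example. The two kernels are meant to mimic a 3x3 square and a radius-1 diamond (a plus shape).

```python
# Toy morphological close on a bilevel grid (1 = white, 0 = black).
# All names here are hypothetical; this is not ImageMagick code.

SQUARE = [(dx, dy) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
DIAMOND = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]


def _apply(img, kernel, pick):
    """Replace each pixel by pick() over its kernel neighbourhood."""
    h, w = len(img), len(img[0])

    def at(x, y):
        # Assumption: out-of-range coordinates are clamped (edge replicate).
        return img[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    return [
        [pick(at(x + dx, y + dy) for dx, dy in kernel) for x in range(w)]
        for y in range(h)
    ]


def close(img, kernel):
    """Dilate the white area (max), then erode it again (min)."""
    return _apply(_apply(img, kernel, max), kernel, min)


# A 7x7 white page with a 3x3 black "letter" and a single black speck.
page = [[1] * 7 for _ in range(7)]
for y in range(1, 4):
    for x in range(1, 4):
        page[y][x] = 0
page[5][5] = 0

closed_square = close(page, SQUARE)
closed_diamond = close(page, DIAMOND)
```

In this toy case both kernels remove the isolated speck. The square kernel leaves the 3x3 block intact, while the plus-shaped kernel turns its four corner pixels white, so the two shapes do treat letter edges differently.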
Last edited by fmw42 on 2011-05-09T16:13:22-07:00, edited 1 time in total.
Re: Cleaning up noise around text
Here is a snippet from the image. I've read quite a bit about morphology, but still haven't managed to come up with something that helps with cleanup without doing more damage than it fixes.
http://www.imagehawk.com/images/cleanup.tif
- fmw42
Re: Cleaning up noise around text
I don't think anything is going to help as the noise is nearly as big as the thickness of the text characters and the noise is too close to the characters. If they were further away, then perhaps something might be done.
This is not too bad using morphology close with a square shape. But you can try other shapes.
convert cleanup.tif -morphology close square:1 cleanup_close1.gif
- anthony
- Posts: 8883
- Joined: 2004-05-31T19:27:03-07:00
- Authentication code: 8675308
- Location: Brisbane, Australia
Re: Cleaning up noise around text
Most of the noise is cleaned up using
Code: Select all
convert cleanup.tif -morphology close diamond show:
However, fred is right: it is very hard when the noise is so close to the original text.
You did say specifically what you want to do, though, and adding specific pixels (making them white) can be done using a Thicken morphology operation.
For example, remove black pixels that are one pixel wide
Code: Select all
convert cleanup.tif -morphology thicken '3x1:1,0,1' show:
remove black pixels that are one pixel high
Code: Select all
convert cleanup.tif -morphology thicken '1x3:1,0,1' show:
Or do both, one following the other (two rotated kernels)
Code: Select all
convert cleanup.tif -morphology thicken '1x3>:1,0,1' show:
The real problem, however, is your source image. It looks like the text was a JPEG that has been thresholded.
It looks like the threshold level was wrong, however, leaving ringing artefacts in the resulting image.
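As a rough illustration of what a thicken kernel such as '3x1:1,0,1' matches, here is a plain-Python sketch on a small bilevel grid. It is not ImageMagick's code: 1 is white, 0 is black, the helper whiteout_thin and the sample grid are hypothetical, and pixels whose neighbour would fall outside the grid are simply left alone.

```python
# Toy version of "white out black pixels that are one pixel wide/tall".
# 1 = white, 0 = black. Names are made up for this sketch.

def whiteout_thin(img, dx, dy):
    """Turn a black pixel white when both neighbours at (-dx,-dy) and
    (+dx,+dy) are white. dx=1, dy=0 mimics the pattern 1,0,1 laid out
    horizontally; dx=0, dy=1 mimics the same pattern rotated."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            ax, ay = x - dx, y - dy
            bx, by = x + dx, y + dy
            # Assumption: skip pixels whose neighbours lie outside the grid.
            if not (0 <= ax < w and 0 <= bx < w and 0 <= ay < h and 0 <= by < h):
                continue
            # All tests read the input grid, so one pass is order-independent.
            if img[y][x] == 0 and img[ay][ax] == 1 and img[by][bx] == 1:
                out[y][x] = 1
    return out


noisy = [
    [1, 1, 1, 1, 1, 1],
    [1, 0, 1, 0, 0, 1],   # lone speck at column 1, 2x2 blob at columns 3-4
    [1, 1, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1],
]

step1 = whiteout_thin(noisy, 1, 0)   # one pixel wide
step2 = whiteout_thin(step1, 0, 1)   # then one pixel tall
```

Here the lone speck disappears in the first pass, while the 2x2 blob survives both passes because none of its pixels has white on both sides.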
Anthony Thyssen -- Webmaster for ImageMagick Example Pages
https://imagemagick.org/Usage/
Re: Cleaning up noise around text
The morphology really does clean up the image for human readability, but when I zoom in, I think square is possibly going to hurt the OCR.
However, diamond may actually help quite a bit.
I get an invalid argument for -morphology when I use this command:
Code: Select all
convert cleanup.tif -morphology thicken '3x1:1,0,1'
so I'll try to update tonight and see if that will do the trick.
Version: ImageMagick 6.6.9-4 2011-04-01 Q16 http://www.imagemagick.org
Copyright: Copyright (C) 1999-2011 ImageMagick Studio LLC
Features: OpenMP
These images were made on a really expensive ($250,000) scanner. I'm guessing they didn't know how to use it properly. We are working with them to do a better job on future scans (including 300 dpi).
Thanks for the help.
- fmw42
Re: Cleaning up noise around text
You need to specify an output image!
convert cleanup.tif -morphology thicken '3x1:1,0,1' result.gif
The following as Anthony suggested with diamond rather than square works well.
convert cleanup.tif -morphology close diamond:1 cleanup_close1.gif
Re: Cleaning up noise around text
Due to the nature of the ringing noise, all black noise specks are separated by at least 1 pixel from the letters.
One good approach to remove this noise would be to dilate the image so that at least one "seed" part of each letter remains, then erode these seeds while using the original image as a mask; in effect a flood-fill for each letter.
This way the shape of the letters and other large blobs is preserved perfectly, and smaller blobs disappear.
The biggest dilate that still leaves a part of each letter shape seems to be a 3x4 rectangle for the example data; perhaps use something smaller to be on the safe side.
This command first dilates with that 3x4 rectangle, and then erodes until the letters are all whole again:
Code: Select all
convert cleanup.tif -write MPR:source ^
-morphology close rectangle:3x4 ^
-morphology erode square MPR:source -compose Lighten -composite ^
-morphology erode square MPR:source -composite ^
-morphology erode square MPR:source -composite ^
-morphology erode square MPR:source -composite ^
-morphology erode square MPR:source -composite ^
-morphology erode square MPR:source -composite ^
-morphology erode square MPR:source -composite ^
-morphology erode square MPR:source -composite ^
-morphology erode square MPR:source -composite ^
cleaned.png
- anthony
Re: Cleaning up noise around text
HugoRune wrote:
Due to the nature of the ringing noise, all black noise specks are separated by at least 1 pixel from the letters.
One good approach to remove this noise would be to dilate the image so that at least one "seed" part of each letter remains, then erode these seeds while using the original image as a mask; in effect a flood-fill for each letter.
This is basically known as "conditional dilation" (or, for a negated image, "conditional erode"), and while I have not explored this enough to generate examples, it should actually be available RIGHT NOW!
The trick is to use a 'write mask' (the original image) on the 'seed image' and then dilate to infinity.
At this time I only have quick notes on using image write masks in
http://www.imagemagick.org/Usage/maskin ... ping_masks
For morphology I would make sure the write mask is boolean by specifying it with -clip-mask.
The clip mask should be white where you do not want the image to be updated.
Hmmm... This is my first attempt at conditional morphology, exactly as I envisaged!
Code: Select all
convert cleanup.tif -write MPR:source \
-morphology close rectangle:3x4 \
-clip-mask MPR:source \
-morphology erode:8 square \
+clip-mask cleaned.png
This is the equivalent of HugoRune's conditional erode and gets the same result.
NOTE: do not use an infinite erode (iteration count = -1), as it will not end (at least not for a long time). Morphology does not actually understand write masks, so it sees pixel changes even though they are never written, and as such it never sees a final 'static' image. In IMv7 (yet to fork) use of infinite iterations to 'seed flood fill' may be possible.
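The seed-and-regrow idea can also be sketched in plain Python. This is only an illustration under stated assumptions, not what ImageMagick does internally: black pixels are stored as a set of (x, y) coordinates, growth uses the 8-neighbourhood (similar to a 3x3 square kernel), the seed is supplied by hand rather than produced by a morphology step, and the names reconstruct, letter, speck and seed are hypothetical.

```python
# Toy "conditional" growth: regrow seeds, but only inside the original shapes.
# Pixels are (x, y) tuples of BLACK pixels. Names are made up for this sketch.

def reconstruct(seed, mask):
    """Grow `seed` one pixel per pass (8-neighbourhood), keeping only
    pixels that are also black in `mask`, until nothing changes."""
    current = seed & mask
    while True:
        grown = {
            (x + dx, y + dy)
            for (x, y) in current
            for dx in (-1, 0, 1)
            for dy in (-1, 0, 1)
        }
        grown &= mask            # the original image limits the growth
        if grown == current:     # stable: every seeded shape is whole again
            return current
        current = grown


# Original image: a 3x4 black "letter" plus a speck one white pixel away.
letter = {(x, y) for x in range(3) for y in range(4)}
speck = {(4, 1)}
mask = letter | speck

# Assumed seed: what is left of the letter after shrinking it.
seed = {(1, 1), (1, 2)}

cleaned = reconstruct(seed, mask)
```

In this sketch the loop stops by itself once a pass changes nothing, which is the convergence test that the note above says the -clip-mask approach cannot rely on. The speck is never reached because it has no seed and is not adjacent to the letter.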
- anthony
Re: Cleaning up noise around text
fmw42 wrote:
You need to specify an output image!
convert cleanup.tif -morphology thicken '3x1:1,0,1' result.gif
The following as Anthony suggested with diamond rather than square works well.
convert cleanup.tif -morphology close diamond:1 cleanup_close1.gif
Both of you missed the '>' in my example to remove 1 pixel width and height pixels.
And that is not quite the same as a 'diamond'.
As for the use of the scanner: yes, I'd say they should scan a sample image in a number of ways so that you can figure out what is best. Either that, or have them deliver a raw grayscale (color?) scan so you can adjust thresholding and other parameters yourself.
- fmw42
Re: Cleaning up noise around text
I did not miss it -- just finished your first example to replace show: with an image.
- anthony
Re: Cleaning up noise around text
Doesn't show: work on a Mac?
- fmw42
Re: Cleaning up noise around text
anthony wrote:
Doesn't show: work on a Mac?
Yes, it does (and your commands show just fine), but the user left off both show: and an output and was complaining of getting errors:
I get an invalid argument for -morphology when I use this command:
convert cleanup.tif -morphology thicken '3x1:1,0,1'
So all I was trying to do was remind him of the need for an output image.
- anthony
Re: Cleaning up noise around text
Fair enough.... Back to the problem at hand.
mark0978... Are you satisfied with the solutions provided?