Hi, I'm trying to clean up this image for OCR.
https://ibb.co/bRK8ia
First I used -lat to clean up most of the noise. I also tried -canny, which works very well, but I do not know how to fill the hollow letter outlines it leaves (I tried EdgeIn/EdgeOut with no luck).
Now I need to select the large dots forming the horizontal lines, which is my main problem. I tried a 6x6 kernel and it works:
convert temp-inv.png \( +clone \
-morphology HMT "6: 0,0,0,0,0,0 0,-,-,-,-,0 0,-,1,1,-,0 0,-,1,1,-,0 0,-,-,-,-,0 0,0,0,0,0,0" \
-morphology Dilate Ring \
-background red -alpha shape \
\) -composite x:
but now I do not know how to remove the selected dots. I tried to use this:
convert temp-inv.png -morphology HMT "6>: 0,0,0,0,0,0 0,-,-,-,-,0 0,-,1,1,-,0 0,-,1,1,-,0 0,-,-,-,-,0 0,0,0,0,0,0" lines-thin.png
to extract only the lines and later subtract them, but the dots I get are very thin and I was not able to make them larger with Thicken (with Unity, Square, ConvexHull, ...). I also do not like the idea of thickening the dots; I'd prefer just to subtract the original pixels that were matched in the first stage.
So I'd like to use something different from HMT to extract the pixels that match the kernel. Is there something like that?
A few dots are also missed because they are smaller than 6x6, and I do not know whether the right route is to add a few more kernels (6x5, 5x4, 6x4, etc.) or whether there is a better way to remove small dust (I tried median and it works quite well). I tried the Gaussian kernel, expecting to be able to select a single round "blob", but the whole image is always selected.
I also tried Open/Close to remove small dots, with worse results.
Thanks for any help/suggestions.
A few problems removing large dots
Re: A few problems removing large dots
It seems the image you show isn't the one you would use as input to your command.
If the morphology makes an image that has black and white pixels, and you want the image to become white where the morphology result is white, use "-compose Lighten -composite".
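A minimal sketch of that idea, assuming the input is the inverted image (white marks on a black background, as the name temp-inv.png suggests). HMT returns only a few pixels per matched dot, so they are dilated with Square:2 (a 5x5 square) to cover the 4x4 interior of the 6x6 kernel, and the result is used as a mask. Because the marks are white in this case, the mask is negated and combined with Darken; on a non-inverted image the un-negated mask with Lighten should do the same job. demo-inv.png and no-dots.png are placeholder names, and the first command only builds a small synthetic stand-in for the real scan.

```shell
# Synthetic stand-in for the real input: a 3x3 dot and a long bar, white on black.
convert -size 24x24 xc:black -fill white +antialias \
  -draw "rectangle 4,4 6,6" -draw "rectangle 12,2 13,20" demo-inv.png

# Same 6x6 kernel as in the first post.
kernel="6: 0,0,0,0,0,0 0,-,-,-,-,0 0,-,1,1,-,0 0,-,1,1,-,0 0,-,-,-,-,0 0,0,0,0,0,0"

# HMT marks each matched dot; Dilate Square:2 grows the marks over the dot;
# -negate turns the mask black on the dots; Darken keeps the darker pixel of
# the two images, so only the matched dots are blanked.
convert demo-inv.png \
  \( +clone -morphology HMT "$kernel" -morphology Dilate Square:2 -negate \) \
  -compose Darken -composite no-dots.png
```

The dilation cannot reach past the ring of zeros that the kernel requires around each match, so pixels outside a matched dot should stay as they were and no separate Thicken or blur pass is needed.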
snibgo's IM pages: im.snibgo.com
Re: A few problems removing large dots
Hi snibgo, thanks for your answer. The image I'm using is black and white; I used -lat to clean it. The image I'm running morphology on looks like this:
and this is the original file: http://www.tgischia.it/wordpress/wp-con ... 3%A0-1.jpg
I'd like to get something like this (I used a couple of fine tuned gimp filters to get this):
And this is the output from the morphology command:
so it matches most of the dots, but I do not know how to delete them.
Using the above HMT command I get this:
and after subtraction this:
As you can see, the subtracted dots are too small and leave a lot of donuts. Of course I can blur/enlarge them, and that works quite well, but then the dots are slightly different from the ones I matched with the morphology kernel, and I need another "clean" pass to fix the remaining dust. Maybe there is a way I missed to simply zero the matched dots.
The second problem is that to match/clean the remaining dots I need to repeat all of this with 3 or 4 more kernels (vertical dots, horizontal, etc.), and I'm wondering whether there is a better approach overall.
P.S. I'm using IM 6.8.9-9 on Linux
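One possible way to cut down the number of kernels, offered as a sketch rather than a recipe tested on this scan: keep the 6x6 ring of zeros but require only a single white pixel inside it, leaving the rest of the 4x4 interior as don't-care. That one kernel should then match 1x1, 2x2, 4x1 and similar blobs alike. The kernel below, the file names demo-dust.png and no-dust.png, and the synthetic test image are all assumptions made for illustration.

```shell
# Synthetic stand-in: a 1x1 dot, a 2x2 dot, a 4x1 dot and one long bar, white on black.
convert -size 30x30 xc:black -fill white +antialias \
  -draw "point 4,4" -draw "rectangle 10,4 11,5" \
  -draw "rectangle 16,4 19,4" -draw "rectangle 4,14 25,16" demo-dust.png

# Ring of zeros, don't-care interior, and a single required white pixel.
loose="6: 0,0,0,0,0,0 0,-,-,-,-,0 0,-,1,-,-,0 0,-,-,-,-,0 0,-,-,-,-,0 0,0,0,0,0,0"

# Match the isolated blobs, grow the matches over the 4x4 interior, negate the
# mask and blank the blobs with Darken, leaving larger shapes untouched.
convert demo-dust.png \
  \( +clone -morphology HMT "$loose" -morphology Dilate Square:2 -negate \) \
  -compose Darken -composite no-dust.png
```

Some shapes, such as a thin diagonal, have no white pixel in the position this kernel expects; the ">" rotation flag already used in the first post may pick those up. Note that anything isolated enough to fit inside the 4x4 interior is removed, which can include full stops and the dots on the letter i, so the kernel size would need checking against the text size before OCR.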