Text in gray area remove gray area

Posted: 2013-09-27T23:27:13-07:00
by benito2313
Hello,

I have an image, part of which is a gray area with text in it.
I want to remove the gray area but keep the text in it for OCR.

How can I do this?

I'm working on Windows and using the command line.

Thanks in advance.

Regards,

Benito2313

Re: Text in gray area remove gray area

Posted: 2013-09-28T07:36:41-07:00
by snibgo
I don't know what you mean by "remove the gray area". Can you provide an example image? Put it somewhere like Dropbox and provide a link here.

Re: Text in gray area remove gray area

Posted: 2013-09-29T09:47:45-07:00
by RitterRunkel
I think he wants only the text to be left in the image. It's a kind of extraction/background-foreground/segmentation problem. The question is whether his images are artificial, with a background homogeneously filled with one color that could simply be set to white or transparent, or whether they are photographs with many shades of grey in the background, in which case he will have to find a threshold to make the image more or less binary black and white.

Fred provides many good scripts that calculate the threshold in different ways, for instance local adaptive: http://www.fmwconcepts.com/imagemagick/ ... /index.php That would probably be the cleanest way. To make my photos ready to print I normalize, brighten, lower the gamma and reduce the number of colors. Maybe that would be sufficient for OCR too, and a bit shorter/easier:

Code:

mogrify -normalize -modulate 120 -gamma 0.85,0.85,0.85 +dither -posterize 32 *.png

(mogrify overwrites the input files in place, so work on copies.)

Re: Text in gray area remove gray area

Posted: 2013-09-29T11:12:22-07:00
by fmw42
I would need to see an example, but perhaps my script textcleaner would work. But the OP is on Windows, so unless Cygwin is installed it won't help. However, my script is built around -lat. So the OP could explore using that to create a mask and composite the image with the mask using white.
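For readers who want to see the idea rather than the ImageMagick option itself, here is a minimal sketch in plain Python of local adaptive thresholding used as a mask, with everything outside the mask set to white. It is not the textcleaner script and does not claim to reproduce -lat exactly; the function names, the window radius and the margin of 40 are all assumptions made for illustration, and the image is just a small list of grayscale rows (0 = black, 255 = white).

```python
def local_mean(img, x, y, radius):
    """Mean of the pixels in a square window around (x, y), cropped at the borders."""
    h, w = len(img), len(img[0])
    total = 0
    count = 0
    for j in range(max(0, y - radius), min(h, y + radius + 1)):
        for i in range(max(0, x - radius), min(w, x + radius + 1)):
            total += img[j][i]
            count += 1
    return total / count


def whiten_background(img, radius=1, margin=40, white=255):
    """Keep pixels clearly darker than their neighbourhood; set the rest to white.

    radius and margin are illustrative values, not taken from the thread.
    """
    out = []
    for y, row in enumerate(img):
        out_row = []
        for x, value in enumerate(row):
            is_text = value < local_mean(img, x, y, radius) - margin
            out_row.append(value if is_text else white)
        out.append(out_row)
    return out


# White page on the left, a gray area on the right with one dark "text" pixel.
sample = [
    [255, 255, 128, 128, 128],
    [255, 255, 128,   0, 128],
    [255, 255, 128, 128, 128],
]
cleaned = whiten_background(sample)
```

In this toy case the gray pixels are close to their local mean and get whitened, while the dark pixel is well below it and survives. On a real scan the window would need to be larger than the stroke width of the text.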

Re: Text in gray area remove gray area

Posted: 2013-09-30T02:41:37-07:00
by benito2313
Hello everybody,

Here's an example of what I'm dealing with.

https://www.wetransfer.com/downloads/04 ... 431/e2fab0
If you click on download you can download the PNG file.

It's an artificial document; it's always a document, never a photograph.
It could be scanned, and in the future it might be a photograph, but not for now.

Regards,

Benito2313

Re: Text in gray area remove gray area

Posted: 2013-09-30T10:17:16-07:00
by fmw42
Try:

convert naamloos.PNG -threshold 0.1% -statistic median 3x3 result.png
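To illustrate what those two steps do, here is a small sketch in plain Python on a list of grayscale rows (0 = black, 255 = white). It only mimics the idea: a very low threshold turns everything that is not almost pure black into white, and a 3x3 median then removes isolated specks. The function names are made up for this example, and replicating edge pixels at the border is an assumption, not a statement about how ImageMagick handles edges.

```python
from statistics import median


def threshold(img, percent, max_value=255):
    """Pixels brighter than percent of max_value become white, the rest black."""
    cutoff = max_value * percent / 100.0
    return [[max_value if v > cutoff else 0 for v in row] for row in img]


def median3x3(img):
    """3x3 median filter; border pixels are replicated (an assumption)."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [
                img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ]
            row.append(median(window))
        out.append(row)
    return out


# Gray background (128) with a solid 3x3 black block and one isolated black speck.
sample = [
    [128, 128, 128, 128, 128, 128, 128, 128],
    [128,   0,   0,   0, 128, 128, 128, 128],
    [128,   0,   0,   0, 128, 128,   0, 128],
    [128,   0,   0,   0, 128, 128, 128, 128],
    [128, 128, 128, 128, 128, 128, 128, 128],
]
binary = threshold(sample, 0.1)
cleaned = median3x3(binary)
```

The speck disappears and the gray turns white, but the corners of the solid block are eaten away as well, because fewer than five of their nine neighbours are black. Thin or dotted strokes suffer from the same effect.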

Re: Text in gray area remove gray area

Posted: 2013-10-01T01:31:17-07:00
by benito2313
That command is getting closer, but now the digits are broken up into unconnected dots. Is there a way to connect dots that lie within a certain range of each other?

Re: Text in gray area remove gray area

Posted: 2013-10-01T01:57:58-07:00
by RitterRunkel
I think the problem is that the background of the document is not so much grey as a pattern of black dots/dashes. So any kernel size and threshold will eliminate some of the wanted dots as well. If there's no possibility of scanning differently (less contrast, color, ...), it could become difficult. Since I'm not that experienced compared to other people here, I'm also curious what the best way to eliminate those lines in the background would be. Maybe by pattern and FFT (http://www.fmwconcepts.com/imagemagick/ ... se_removal)?

Have you tried OCR directly? Is it confused by the background?
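On the question of connecting dots within a range: one common technique, not proposed elsewhere in this thread, is a morphological close of the dark pixels, meaning grow the ink by a given radius and then shrink it back by the same radius, so that gaps narrower than about twice the radius get bridged. ImageMagick has a -morphology operator for this kind of operation, but as far as I know it treats white as the foreground, so for black text on white the image may need to be negated first; that part is an assumption and untested here. The sketch below is plain Python on a list of rows (0 = ink, 255 = background) with made-up function names, just to show the principle.

```python
def _filter(img, radius, pick):
    """Apply pick (min or max) over a square window, cropped at the borders."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [
                img[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
            ]
            row.append(pick(window))
        out.append(row)
    return out


def close_ink(img, radius=1):
    """Grow dark pixels by radius, then shrink them back by the same radius."""
    grown = _filter(img, radius, min)   # min spreads dark ink outwards
    return _filter(grown, radius, max)  # max pulls the ink back in


# Three black dots separated by one-pixel gaps on a white background.
dots = [
    [255, 255, 255, 255, 255, 255, 255, 255, 255],
    [255, 255, 255, 255, 255, 255, 255, 255, 255],
    [255, 255,   0, 255,   0, 255,   0, 255, 255],
    [255, 255, 255, 255, 255, 255, 255, 255, 255],
    [255, 255, 255, 255, 255, 255, 255, 255, 255],
]
closed = close_ink(dots, radius=1)
```

With radius 1 the one-pixel gaps are filled and the three dots become a single stroke, while the stroke does not get any thicker or longer. The same operation would also join background dots that are close together, so it probably works best after the background pattern has been removed.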

Re: Text in gray area remove gray area

Posted: 2013-10-01T06:29:04-07:00
by snibgo
For removing horizontal lines, see viewtopic.php?f=1&t=24116