Remove horizontal line
Hello,
I'm looking for help removing horizontal lines from images so that I can OCR them. Here's an example:
https://docs.google.com/open?id=0B2mMRo ... XZYWjlEVVk
I would like to remove the line under the word "Billy" so that the OCR engine doesn't get confused by it.
Version: ImageMagick 6.7.7-6 2012-07-31 Q16 on OSX
Thanks in advance,
Hank
- fmw42
- Posts: 25562
- Joined: 2007-07-02T17:14:51-07:00
- Authentication code: 1152
- Location: Sunnyvale, California, USA
Re: Remove horizontal line
try
convert billy.png -morphology close:2 "1x4: 0,1,1,0" result.png
Re: Remove horizontal line
Wow, awesome. Thanks!
Since I like to understand what I'm using instead of blindly executing it, I have a few questions:
Am I reading the kernel correctly: does it match a vertical pattern of pixels that is white, black, black, white? Will it only work on lines that are exactly 2 pixels thick? Sorry if I don't completely understand kernels.
I ran this with only 1 iteration of "close" and it didn't remove the lines, so I'm wondering what the 2nd iteration does that the 1st didn't accomplish. Does the 1st pass thin the lines so that the 2nd pass removes them completely?
I ask because I need to tune this further to handle images that aren't as clear as my example, and I'd rather solve that myself than post each issue I run into.
Thanks again.
- fmw42
Re: Remove horizontal line
It was a quick solution that I did not try to refine. It is a morphological close (dilate followed by erode) that attempts to remove two-pixel-tall horizontal black lines that have white above and below them. I thought one iteration would work and tried that first, but it left a thinner line. So I ran two iterations and that worked. Each iteration removes more and more.
It may be possible, though I have not tested it, to create a 3 or 4 pixel tall black kernel with 1 white pixel above and below and then run only one iteration. That would be 1x5: 0,1,1,1,0, etc. It may also be possible, but again untested, to use 1x3: 0,1,0 and just run it with multiple iterations. One of us would have to test how many iterations are needed for a given thickness of horizontal line and which approach works best.
I will leave it to you to test further, but it would be appreciated if you would let us know what you find works best.
If you still have trouble, then provide another example and we can see what can be done further.
For more information about morphologic operators see:
http://www.imagemagick.org/Usage/morphology/
- fmw42
Re: Remove horizontal line
I tested multiple iterations of 1x3: 0,1,0, but that does not work, and I should have known better. The filter needs at least as many ones as the line is thick. So this works in one iteration:
convert billy.png -morphology close:1 "1x5: 0,1,1,1,0" show:
I don't see any visual difference between:
convert billy.png -morphology close:1 "1x5: 0,1,1,1,0" billy_1x5x1.gif
convert billy.png -morphology close:2 "1x4: 0,1,1,0" billy_1x4x2.gif
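To make the kernel discussion above concrete, here is a minimal Python sketch (not ImageMagick itself) that models a single pixel column as 1 = white, 0 = black and applies a close as N dilations followed by N erosions, using a flat vertical element made of the kernel's ones. The name `close_column`, the edge replication at the borders, and the assumption that `close:N` iterates each primitive N times are mine, and the model ignores antialiased grey pixels, so treat it as an illustration rather than a description of ImageMagick's internals.

```python
def _offsets(ones):
    # Vertical offsets covered by a run of `ones` ones, e.g. 2 -> [0, 1], 3 -> [-1, 0, 1].
    return [k - (ones - 1) // 2 for k in range(ones)]


def _dilate(col, offsets):
    # White (1) grows: each pixel takes the maximum over its neighbourhood.
    n = len(col)
    return [max(col[min(max(i + d, 0), n - 1)] for d in offsets) for i in range(n)]


def _erode(col, offsets):
    # White (1) shrinks: each pixel takes the minimum over the mirrored neighbourhood.
    n = len(col)
    return [min(col[min(max(i - d, 0), n - 1)] for d in offsets) for i in range(n)]


def close_column(col, ones, iterations):
    """Close a pixel column: dilate `iterations` times, then erode `iterations` times."""
    offsets = _offsets(ones)
    out = list(col)
    for _ in range(iterations):
        out = _dilate(out, offsets)
    for _ in range(iterations):
        out = _erode(out, offsets)
    return out


if __name__ == "__main__":
    line = [1, 1, 1, 0, 0, 1, 1, 1]  # a 2-pixel-thick black line on white
    for ones, iterations in [(2, 1), (2, 2), (3, 1), (1, 5)]:
        print(ones, iterations, close_column(line, ones, iterations))
```

In this idealized binary model, two ones leave a 2-pixel line untouched after one iteration and remove it after two, three ones remove it in one pass, and a single one never changes anything, which is consistent with the results reported above. A black stroke taller than the element comes back unchanged.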
Re: Remove horizontal line
This is excellent feedback. I have two questions: 1) How would I use a similar script only on lines that are longer than a certain number of pixels? 2) How would I remove the entire line even where it intersects a letter (in this case the Y in Billy)?
Perhaps we would want to identify lines (of a minimum pixel height) as being those with white above and below for a predetermined minimum horizontal length (anywhere within the line) and then beyond this length remove all vertical pixels the whole length of the line (but not beyond the maximum height discovered between the white space)?
I'd like to see us be able to remove the line below the Y in Billy without removing pixels from the Y itself.
Thank you kindly!
- fmw42
Re: Remove horizontal line
1) See http://magick.imagemagick.org/script/co ... onents.php and filter on areas larger than your line length x thickness, or by individual id.
2) You cannot as far as I know. It won't get that part of the line that connects with the bottom of the Y.
Re: Remove horizontal line
Here is a link to an example file I am working on:
https://www.dropbox.com/s/2luszvqka7wc2 ... e.png?dl=0
- fmw42
Re: Remove horizontal line
Connected components won't remove the lines if they are connected to some other part of your text. Sorry, I do not know how to deal with that. If I get any ideas, I will post back here.
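The limitation described above can be seen in a small Python sketch. It labels black pixels with 4-connectivity and whitens any component whose bounding box is wide and short. The grid encoding (`#` = black, `.` = white), the function names, and the `min_width` / `max_height` thresholds are all hypothetical choices for illustration; this is not how ImageMagick's connected-components option is implemented.

```python
from collections import deque


def components(grid):
    """Return the 4-connected black components as lists of (row, col)."""
    rows, cols = len(grid), len(grid[0])
    seen = set()
    found = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != "#" or (r, c) in seen:
                continue
            queue = deque([(r, c)])
            seen.add((r, c))
            comp = []
            while queue:
                y, x = queue.popleft()
                comp.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and grid[ny][nx] == "#" and (ny, nx) not in seen:
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            found.append(comp)
    return found


def remove_line_like(grid, min_width, max_height):
    """Whiten components whose bounding box is at least min_width wide and at most max_height tall."""
    out = [list(row) for row in grid]
    for comp in components(grid):
        ys = [y for y, _ in comp]
        xs = [x for _, x in comp]
        width = max(xs) - min(xs) + 1
        height = max(ys) - min(ys) + 1
        if width >= min_width and height <= max_height:
            for y, x in comp:
                out[y][x] = "."
    return ["".join(row) for row in out]
```

A free-standing underline forms its own wide, short component and is removed. Once a letter stroke touches the underline, the two merge into a single component whose bounding box is as tall as the letter, so a size filter leaves it alone.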
- snibgo
- Posts: 12159
- Joined: 2010-01-23T23:01:33-07:00
- Authentication code: 1151
- Location: England, UK
Re: Remove horizontal line
This works for the "for sale" example. It turns every black line that is at least 50 pixels wide white. (The widest character is narrower than this.) Then it removes the remaining noise.
Windows BAT syntax.
convert ^
Test2image.png ^
-strip ^
-write mpr:ORG ^
( +clone ^
-negate ^
-morphology Erode rectangle:50x1 ^
-mask mpr:ORG -morphology Dilate rectangle:50x1 ^
+mask ^
) ^
-compose Lighten -composite ^
( +clone ^
-morphology HMT "1x4:1,0,0,1" ^
) ^
-compose Lighten -composite ^
( +clone ^
-morphology HMT "1x3:1,0,1" ^
) ^
-compose Lighten -composite ^
( +clone ^
-morphology HMT "3x1:1,0,1" ^
) ^
-compose Lighten -composite ^
out.png
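The first step of the script above (erode, then dilate, with a 50x1 rectangle on the negated image) can be pictured with a short Python sketch. On a binary image, eroding and then dilating with a horizontal segment keeps exactly the horizontal runs that are at least as long as the segment, so the sketch simply scans one row for long black runs and whitens them. The row encoding (`#` = black, `.` = white) and the function name are hypothetical, and the `-mask` step and the later HMT noise cleanup are not modeled here.

```python
def whiten_long_runs(row, min_len):
    """Whiten every horizontal run of black pixels that is at least min_len long."""
    out = list(row)
    start = None
    # The trailing "." is a sentinel so a run touching the right edge is closed too.
    for i, px in enumerate(row + "."):
        if px == "#":
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_len:
                out[start:i] = ["."] * (i - start)
            start = None
    return "".join(out)


if __name__ == "__main__":
    print(whiten_long_runs("##..######.#", 5))
```

Because the test is purely "is this run long enough", any wide horizontal stroke in large type also qualifies, so the threshold has to sit between the widest character stroke and the shortest underline.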
snibgo's IM pages: im.snibgo.com
Re: Remove horizontal line
Thank you snibgo! This worked very well on the sample image I provided. I noticed that other areas of the document with large font were affected, because those letters contain straight strokes longer than 50 pixels. I tried increasing the rectangle height, but this left some odd artifacts. Here is a link to a file that the script struggles with: https://www.dropbox.com/s/uxj92oteykx5o ... e.png?dl=0 (Test3image.png)
Any ideas on how to remove the lines from the initial test image (Test2image.png) without affecting text with large font in the second image (Test3image.png)? I believe the lines we want to remove all have one border which is almost completely 0 pixels. This may be a clue however the large E and letters like T have similar characteristics. Thank you kindly !
- snibgo
Re: Remove horizontal line
test3image.png has a different problem, so needs a different solution, eg:
convert test3image.png -blur 0x3 -level 30%,70% b.png
If both types of problem occur in the same image, then your difficulties are large. The problem then is of determining which area suffers from which problem. Then you can chop the image into pieces, apply the appropriate solution to each, and reassemble.
That problem is more difficult. You could start by taking your entire image (or a sample of images), and chop them manually. Find the solution for each piece. Find what identifies each area, so you can automatically chop them, and automatically solve each one.
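As a rough illustration of what a blur followed by `-level 30%,70%` does to thin versus thick dark features in general, here is a Python sketch on a single column of grey values (0.0 = black, 1.0 = white). The Gaussian kernel radius of three sigma, the edge replication, and the function names are my assumptions; ImageMagick picks its own radius for `-blur 0x3`, so the numbers will not match exactly. The level step maps values at or below 30% to black, values at or above 70% to white, and stretches linearly in between.

```python
import math


def gaussian_blur(values, sigma):
    """1-D Gaussian blur with edge replication; the radius of 3*sigma is an assumption."""
    radius = int(math.ceil(3 * sigma))
    offsets = range(-radius, radius + 1)
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma)) for k in offsets]
    total = sum(weights)
    n = len(values)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in zip(offsets, weights):
            j = min(max(i + k, 0), n - 1)  # clamp to the column
            acc += w * values[j]
        out.append(acc / total)
    return out


def level(values, black, white):
    """Linear stretch: <= black becomes 0.0, >= white becomes 1.0."""
    return [min(1.0, max(0.0, (v - black) / (white - black))) for v in values]


if __name__ == "__main__":
    thin = [1.0] * 14 + [0.0] * 2 + [1.0] * 14    # 2-pixel line
    thick = [1.0] * 14 + [0.0] * 12 + [1.0] * 14  # 12-pixel stroke
    print(level(gaussian_blur(thin, 3), 0.3, 0.7))
    print(level(gaussian_blur(thick, 3), 0.3, 0.7))
```

In this model the 2-pixel line is averaged with the surrounding white until it sits above the 70% cutoff and vanishes, while the middle of the 12-pixel stroke stays below 30% and remains black. The edges of thick strokes are softened by the same process.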
Re: Remove horizontal line
snibgo, thank you again! This is very helpful. Here is a link to a new image that combines the previous examples into one. The ultimate goal is to write one script that removes the underlines without affecting any text, whether in a smaller or a larger font. I believe it can be done; I just don't know how...
https://www.dropbox.com/s/7cupzdzpneiv8 ... e.png?dl=0
Thank you kindly -
Re: Remove horizontal line
snibgo- One other thing, I ran your original script against the composite image of both test images. Here is the result:
https://www.dropbox.com/s/ytjgpnq4tta9o ... t.png?dl=0
Have a great day!