Can ImageMagick do first part of OCR?
Posted: 2011-08-29T09:49:33-07:00
by yong321
I'm looking for a program to compare simple line art-style images. For example, the program should tell me
http://www.zdic.net/pic/zy/xz2/4E5D.gif
and
http://yong321.freeshell.org/temp/4E5D_yong.PNG
are very similar (regardless of image size), and both are very different from
http://www.zdic.net/pic/zy/xz2/4E5C.gif
These are ancient Chinese characters. Essentially, this image comparison is like the first stage of OCR, stopping short of identifying the character as text. I read the article on the compare program at
http://imagemagick.org/Usage/compare/
and tried a couple of compare options, with no success. Can compare do this? If not, is there another tool that can? Thank you.
Re: Can ImageMagick do first part of OCR?
Posted: 2011-08-29T10:04:06-07:00
by fmw42
For compare to do what you want, the two images must be the same size.
Re: Can ImageMagick do first part of OCR?
Posted: 2011-08-30T13:11:59-07:00
by yong321
I resized one to be the same as the other. "compare" generates a new image that pretty much looks like one of the two. What I would like the result to be is something like a report "The outlines in the two images are xxx percent similar", instead of (apparently) an image as a result of pixel-by-pixel color subtraction.
Re: Can ImageMagick do first part of OCR?
Posted: 2011-08-30T13:55:51-07:00
by fmw42
If the two images are the same size, then
compare -metric rmse image1 image2 null:
will give you the RMSE (root-mean-square error) difference. You can use other metrics; see
http://www.imagemagick.org/Usage/compare/#statistics
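For readers who want to see what an rmse figure measures, here is a minimal Python sketch. It is not ImageMagick's implementation; it assumes two same-size grayscale images given as nested lists of pixel values, and the function name `rmse` and the sample pixels are made up for illustration.

```python
import math

def rmse(img_a, img_b):
    """Root-mean-square error between two same-size grayscale images.

    Each image is a list of rows; each row is a list of pixel values.
    """
    same_shape = len(img_a) == len(img_b) and all(
        len(ra) == len(rb) for ra, rb in zip(img_a, img_b)
    )
    if not same_shape:
        raise ValueError("images must be the same size")
    total = 0.0
    count = 0
    for row_a, row_b in zip(img_a, img_b):
        for pa, pb in zip(row_a, row_b):
            total += (pa - pb) ** 2  # squared per-pixel difference
            count += 1
    return math.sqrt(total / count)

# Hypothetical 2x2 images (0 = black, 255 = white) that differ in one pixel.
a = [[255, 255], [0, 255]]
b = [[255, 0], [0, 255]]
print(rmse(a, b))
```

Identical images give 0, and the value grows with the number and size of the per-pixel differences, which is why both images must have the same dimensions.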
Re: Can ImageMagick do first part of OCR?
Posted: 2011-08-30T17:08:12-07:00
by anthony
IM compare is really not designed for OCR work. It can do some things, such as direct image compare, correlation compare (including FFT, and scale-rotation independent matching), and morphology hit-n-miss, which are all techniques used in OCR, but they are not always ideal.
You are really opening a can of worms.
Re: Can ImageMagick do first part of OCR?
Posted: 2011-08-30T21:42:05-07:00
by yong321
I tested fmw42's advice and used rmse metric (4E5D_yong3.gif is the same as 4E5D_yong.png except it was resized):
C:\temp>compare -metric rmse 4E5D_yong3.gif 4E5D.gif null:
58.5 (0.229412)
C:\temp>compare -metric rmse 4E5D_yong3.gif 4E5C.gif null:
60.5834 (0.237582)
Ideally, I would like to see very different numbers because 4E5C is a completely different character than 4E5D. But the results here are too close.
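A note on reading these numbers: the value in parentheses is consistent with the first value divided by the maximum pixel value. The sketch below assumes an 8-bit (Q8) build, where that maximum is 255; the build is an assumption, but the arithmetic reproduces both outputs above.

```python
# Assumption: a Q8 ImageMagick build, so the maximum pixel value is 255.
QUANTUM_RANGE = 255

# Raw rmse values copied from the two compare runs above.
same_char = 58.5       # 4E5D_yong3.gif vs 4E5D.gif
other_char = 60.5834   # 4E5D_yong3.gif vs 4E5C.gif

norm_same = same_char / QUANTUM_RANGE
norm_other = other_char / QUANTUM_RANGE

print(f"{norm_same:.6f}")   # 0.229412
print(f"{norm_other:.6f}")  # 0.237582
print(f"gap: {norm_other - norm_same:.6f}")
```

On that reading, the two comparisons differ by less than one percent of full scale, which matches the complaint that the results are too close to separate the two characters.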
I'm sure anthony is right. But before I give up on compare, could you give an example command I can try? If compare is not the way to go, is there a tool you recommend? There are free and commercial OCR tools, but I don't want OCR per se. I want line-art images, possibly hand-drawn, to be compared with a standard image.
Thank you both.
Re: Can ImageMagick do first part of OCR?
Posted: 2011-08-30T21:51:44-07:00
by fmw42
Can you post links to the exact two files you just tested? If they are the same character (as it looks to me from your uploaded images) and all you did was resize, then they should be very close. Perhaps I misunderstand your issue, but I need to see and try the exact two images you just ran compare on. Be sure that transparency is not enabled in your files, as that may change the compare results depending upon the metric.
The issue, as far as I can tell, is that your images are binary (white and black) and mostly white, so what little black there is won't matter much in the compare. Whether the few black pixels match or not, the metric reports similar results, because the large area of matching white dominates the comparison. So I don't think compare on line art is going to be much help.
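The effect of the white background can be reproduced with synthetic data. The Python sketch below is an illustration only: the 100x100 images, the helper names, and the `ink_only` option (which skips pixels that are white in both images) are all assumptions made for the demo, not something compare does.

```python
import math

WHITE, BLACK = 1.0, 0.0
SIZE = 100

def blank():
    return [[WHITE] * SIZE for _ in range(SIZE)]

def vertical_line(col):
    img = blank()
    for row in img:
        row[col] = BLACK
    return img

def horizontal_line(r):
    img = blank()
    img[r] = [BLACK] * SIZE
    return img

def rmse(a, b, ink_only=False):
    """Normalized RMSE; with ink_only, skip pixels that are white in both images."""
    total, count = 0.0, 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            if ink_only and pa == WHITE and pb == WHITE:
                continue
            total += (pa - pb) ** 2
            count += 1
    return math.sqrt(total / count) if count else 0.0

# Two completely different "characters": a vertical and a horizontal stroke.
a = vertical_line(50)
b = horizontal_line(50)
whole = rmse(a, b)
ink = rmse(a, b, ink_only=True)
print(f"all pixels: {whole:.3f}, ink only: {ink:.3f}")
```

Over all pixels the two unrelated strokes score about 0.14, which looks like a close match, while restricting the comparison to pixels that carry ink in either image pushes the score close to 1.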
IM does not do this, but my best suggestion for shape matching in such images (invariant to scale and rotation) is Fourier Descriptors. You can do a Google search and find many articles about it.
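As a rough idea of what Fourier descriptors involve, here is a minimal pure-Python sketch. It assumes the outline has already been traced into an ordered list of boundary points (that tracing step is not shown), that both shapes are traversed counter-clockwise, and that a plain O(n^2) transform is fast enough. The function names and test shapes are hypothetical.

```python
import cmath
import math

def fourier_descriptors(points, n_coeffs=8):
    """Descriptors of a closed contour, invariant to translation, scale, rotation.

    points: (x, y) tuples in counter-clockwise order around the boundary.
    """
    z = [complex(x, y) for x, y in points]
    n = len(z)
    # Discrete Fourier transform of the boundary treated as a complex signal.
    coeffs = [
        sum(z[k] * cmath.exp(-2j * math.pi * u * k / n) for k in range(n)) / n
        for u in range(n)
    ]
    # coeffs[0] is the centroid: dropping it removes translation.
    # Magnitudes discard phase: that removes rotation and start point.
    # Dividing by |coeffs[1]| removes overall scale.
    scale = abs(coeffs[1])
    if scale == 0:
        raise ValueError("degenerate contour: first harmonic is zero")
    return [abs(c) / scale for c in coeffs[2:2 + n_coeffs]]

def square(n_per_side=4):
    """Unit square sampled counter-clockwise, n_per_side points per edge."""
    steps = [i / n_per_side for i in range(n_per_side)]
    return ([(t, 0.0) for t in steps] + [(1.0, t) for t in steps]
            + [(1.0 - t, 1.0) for t in steps] + [(0.0, 1.0 - t) for t in steps])

def transform(points, scale, angle, dx, dy):
    """Scale, rotate (angle in radians), then translate a list of points."""
    rot = scale * cmath.exp(1j * angle)
    moved = [complex(x, y) * rot + complex(dx, dy) for x, y in points]
    return [(p.real, p.imag) for p in moved]

base = fourier_descriptors(square())
moved = fourier_descriptors(transform(square(), 3.0, math.radians(30), 10.0, -5.0))
rect = fourier_descriptors([(3 * x, y) for x, y in square()])
same_gap = max(abs(p - q) for p, q in zip(base, moved))
diff_gap = max(abs(p - q) for p, q in zip(base, rect))
print(same_gap, diff_gap)
```

The scaled, rotated, and shifted square yields essentially the same descriptors as the original, while the 3:1 rectangle does not. Multi-stroke characters have several separate outlines, so applying this to real glyphs would need further decisions that this sketch does not address.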
Re: Can ImageMagick do first part of OCR?
Posted: 2011-08-31T07:08:16-07:00
by yong321
I posted the image at
http://yong321.freeshell.org/temp/4E5D_yong3.gif
It's a resize but it also squeezes the image thinner. A human can identify it with 4E5D.gif and I hope some software can too.
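One common way to make a comparison tolerant of size and squeezing is to crop each image to the bounding box of its ink and resample that box to a fixed grid before comparing. The Python sketch below is an assumption about how that could look, using nearest-neighbor sampling on binary images (1 = ink, 0 = background); `normalize_glyph` and the sample shapes are hypothetical.

```python
def normalize_glyph(img, size=16):
    """Crop a binary image to its ink bounding box, then resample to size x size."""
    rows = [r for r, row in enumerate(img) if any(row)]
    cols = [c for c in range(len(img[0])) if any(row[c] for row in img)]
    if not rows:
        raise ValueError("image contains no ink")
    top, left = rows[0], cols[0]
    height = rows[-1] - top + 1
    width = cols[-1] - left + 1
    # Nearest-neighbor sampling: map each output cell back into the cropped box.
    return [
        [img[top + (r * height) // size][left + (c * width) // size]
         for c in range(size)]
        for r in range(size)
    ]

# An "L" shape on a small canvas.
small = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
# The same shape stretched to twice the width and placed at a different offset.
wide = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0],
]
print(normalize_glyph(small, 4) == normalize_glyph(wide, 4))
```

After normalization the two versions of the shape become identical grids, so a pixel-based metric no longer penalizes the difference in size or aspect ratio. Hand-drawn strokes will still differ in thickness and position, which this step does not correct.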
I'll do some reading on Fourier Descriptors. I found your reply to a similar request at
viewtopic.php?f=1&t=16820
I'm not good at the math involved in image comparison. But I'll see what I can do. Thanks again.
Re: Can ImageMagick do first part of OCR?
Posted: 2011-08-31T15:56:02-07:00
by fmw42
I have never tried this before, so I am not sure I am interpreting it correctly. But it appears that if both images have transparency, then compare only looks at pixels that are not (fully?) transparent.
For example with both images having transparency:
compare -metric rmse 4E5D.gif 4E5D_yong3t.gif null:
55595.1 (0.848328)
So 85% different
Whereas when I remove the alpha channel, so both images are black and white with no transparency, the white is included and dominates the comparison, since most of the white matches.
compare -metric rmse 4E5D_aoff.gif 4E5D_yong3.gif null:
15034.5 (0.229412)
So only 23% different
Re: Can ImageMagick do first part of OCR?
Posted: 2011-08-31T19:20:57-07:00
by fmw42
I suspect my interpretation above is incorrect. Other tests I have run do not seem to support that assumption.