Advanced image comparison for screenshots taken from webpage
Re: Advanced image comparison for screenshots taken from webpage
Anthony, the algorithm is encoded in magick/transform.c/ExtractSubimageFromImage() and it's quite compact. Feel free to review and improve as you see fit.
- fmw42
- Posts: 25562
- Joined: 2007-07-02T17:14:51-07:00
- Authentication code: 1152
- Location: Sunnyvale, California, USA
Re: Advanced image comparison for screenshots taken from webpage
magick wrote: Anthony, the algorithm is encoded in magick/transform.c/ExtractSubimageFromImage() and it's quite compact. Feel free to review and improve as you see fit.
I am not the best reader of code, but if I have not misinterpreted it, it looks like it does the following:
1) For each shift position of the small image relative to the larger, get the subsection of the larger image. Initialize "similarity" to zero.
2) Loop over columns and rows and for each pixel:
3) Compute the sum of squared differences between the channel-normalized (range 0 to 1) values for a given pixel, and call it "similarity".
4) Accumulate "similarity" across a given row.
5) At the end of a row, take the square root and normalize by the number of pixels processed over all rows so far; call this "normalized_similarity".
6) Compare "normalized_similarity" to "similarity_threshold" (which is initialized to fuzz/100). If normalized_similarity > similarity_threshold, stop processing this subsection.
7) If not, continue accumulating with the next row.
8) If the end of the subsection is reached, keep "normalized_similarity", which is just the RMSE for that subsection.
9) Test the returned "normalized_similarity" against the current "similarity_threshold". If it is smaller, save it as the new "similarity_threshold", along with the X,Y coordinates of the subsection's shift position relative to the top left corner of the larger image.
10) Continue with the next subsection.
11) At the end, we will have the first occurrence of the smallest RMSE value.
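For illustration, the steps above can be sketched in Python/NumPy. This is not the actual C code in ExtractSubimageFromImage(); the function name, the per-channel normalization, and the exact threshold handling are my assumptions about the algorithm as described:

```python
import numpy as np

def subimage_search(large, small, fuzz=1.0):
    """Brute-force subimage search, following steps 1-11 above.

    large, small: float arrays of shape (rows, cols, channels) with
    channel values already normalized to the range 0 to 1.
    fuzz: percentage; "similarity_threshold" starts at fuzz/100.
    """
    H, W, C = large.shape
    h, w, _ = small.shape
    best = None                    # (x, y) offset of the best match so far
    threshold = fuzz / 100.0       # steps 6 and 9: current threshold
    for y in range(H - h + 1):
        for x in range(W - w + 1):
            sub = large[y:y + h, x:x + w]   # step 1: subsection at this shift
            similarity = 0.0
            aborted = False
            for row in range(h):
                # steps 3-4: accumulate squared channel differences over the row
                diff = sub[row] - small[row]
                similarity += float(np.sum(diff * diff))
                # step 5: RMSE over all pixels processed so far
                pixels = (row + 1) * w
                normalized = np.sqrt(similarity / (pixels * C))
                # step 6: abandon this subsection once it cannot beat the threshold
                if normalized > threshold:
                    aborted = True
                    break
            # step 9: keep the smallest RMSE and its offset
            if not aborted and normalized < threshold:
                threshold = normalized
                best = (x, y)
    return best, threshold
```

Note that once a good match lowers the threshold, later subsections abort after only a row or two, which is where the early-exit speedup comes from.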
Looks like no matter what -metric you supply to compare, it will always compute the RMSE metric and return that!
Anthony or Magick, let me know if this is more or less correct or if I have misunderstood.
Fred
Re: Advanced image comparison for screenshots taken from webpage
We use RMSE to compute similarity, however, once the subimage is extracted it reports the metric you specify on the command line.
- fmw42
Re: Advanced image comparison for screenshots taken from webpage
magick wrote: We use RMSE to compute similarity; however, once the subimage is extracted, it reports the metric you specify on the command line.
Note: this does not work if one image, zelda3.png, is truecolor and the other, zelda3g_32_27.png, is grayscale. (I accidentally specified the wrong pair of images.)
compare -fuzz 1% -metric rmse zelda3.png zelda3g_32_27.png null:
compare: image size differs `zelda3.png' @ compare.c/CompareImageChannels/153.
However, setting -fuzz to 100% allows it to work, and it does find the right answer.
compare -fuzz 100% -metric rmse zelda3.png zelda3g_32_27.png null:
8647.91 (0.131959) @ 32,27
I guess that makes sense.
It seems like the smallest -fuzz should just make things work faster, once you get a fuzz value that is large enough to allow the images to be processed. Is that correct?
So this is interesting as it appears that -fuzz makes very little difference in speed.
convert logo.png -crop 150x100+340+160 -blur 0x1 +repage wizard2.jpg
time compare -fuzz 1% -metric rmse logo.png wizard2.jpg null:
compare: image size differs `logo.png' @ compare.c/CompareImageChannels/153.
real 1m25.910s
user 1m17.251s
sys 0m0.491s
time compare -fuzz 5% -metric rmse logo.png wizard2.jpg null:
compare: image size differs `logo.png' @ compare.c/CompareImageChannels/153.
real 1m29.016s
user 1m17.667s
sys 0m0.530s
time compare -fuzz 7% -metric rmse logo.png wizard2.jpg null:
7420.62 (0.113231) @ 340,160
real 1m35.781s
user 1m18.438s
sys 0m0.580s
time compare -fuzz 10% -metric rmse logo.png wizard2.jpg null:
7420.62 (0.113231) @ 340,160
real 1m30.205s
user 1m17.913s
sys 0m0.559s
time compare -fuzz 100% -metric rmse logo.png wizard2.jpg null:
7420.62 (0.113231) @ 340,160
real 1m36.157s
user 1m18.052s
sys 0m0.619s
Perhaps I am misinterpreting the "break" statement at the end of the row calculation, and it is not stopping further row calculations?
Last edited by fmw42 on 2009-03-30T17:42:47-07:00, edited 2 times in total.
Re: Advanced image comparison for screenshots taken from webpage
The only way to make things work faster is to add more cores to your computer. For each core, a subimage is inspected in parallel. We do check the minimum similarity, and when a subimage exceeds the minimum, the process aborts for that subimage. The best solution to this problem is likely the FFT, but as you know that solution is not quite ready for prime time yet.
- fmw42
Re: Advanced image comparison for screenshots taken from webpage
magick wrote: The only way to make things work faster is to add more cores to your computer. For each core, a subimage is inspected in parallel. We do check the minimum similarity, and when a subimage exceeds the minimum, the process aborts for that subimage. The best solution to this problem is likely the FFT, but as you know that solution is not quite ready for prime time yet.
Right about FFT, but am I misinterpreting the "break" statement at the end of the row calculation, such that it is not stopping further row calculations, or is it just going to be very sensitive to -fuzz to catch something that cuts off in the early rows? Otherwise, it does seem to be working as I expect from my analysis of the code, and the approach makes sense.
Now I need to see what happens with an output image.
Here is a test:
compare -fuzz 10% -metric rmse logo.png wizard2.jpg tmp.png
7420.62 (0.113231) @ 340,160
The result is the size of wizard2.jpg and seems to be the metric values at each pixel of the best match subsection.
However, I think a better returned image would be the metric values returned from each offset position, so that the returned image would be the size of the width and height difference between the big and little images. This would allow one to see how unique the match was or if there are other close matches. This is something similar to what I have produced from my FFT NCC. (See my NCC examples above).
Is that easily done? Comments welcome.
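The suggestion above could be sketched like this (Python/NumPy, purely illustrative; compare itself would of course do this internally in C, and the function name is my own):

```python
import numpy as np

def rmse_surface(large, small):
    """RMSE at every placement of the small image inside the large one.

    Returns an array of shape (H-h+1, W-w+1); element [y, x] is the RMSE
    between the small image and the subsection of the large image whose
    top-left corner sits at offset (x, y).
    """
    H, W, _ = large.shape
    h, w, _ = small.shape
    surface = np.empty((H - h + 1, W - w + 1))
    for y in range(H - h + 1):
        for x in range(W - w + 1):
            diff = large[y:y + h, x:x + w] - small
            surface[y, x] = np.sqrt(np.mean(diff * diff))
    return surface
```

Written out as an image (after inverting so that bright means similar), the global minimum of this surface is the best-match offset, and any other bright spots reveal competing near-matches.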
Last edited by fmw42 on 2009-03-30T18:08:25-07:00, edited 2 times in total.
Re: Advanced image comparison for screenshots taken from webpage
Fuzz does not affect the similarity measurement; only the minimum similarity does. Fuzz is evaluated only after the minimum similarity is computed for the subimage across all pixels of the image.
- fmw42
Re: Advanced image comparison for screenshots taken from webpage
magick wrote: Fuzz does not affect the similarity measurement; only the minimum similarity does. Fuzz is evaluated only after the minimum similarity is computed for the subimage across all pixels of the image.
See my edited comment in blue above about the output image from the compare. Comments welcome about my suggested change.
In any case, nice work and good to have any kind of image matching technique in IM.
Re: Advanced image comparison for screenshots taken from webpage
fmw42 wrote: However, I think a better returned image would be the metric values returned from each offset position, so that the returned image would be the size of the width and height difference between the big and little images. This would allow one to see how unique the match was or if there are other close matches. This is something similar to what I have produced from my FFT NCC. (See my NCC examples above.)
Can you create a mock-up that we can review? Post an image, a subimage, and the expected difference image so we can better understand what you are recommending.
- fmw42
Re: Advanced image comparison for screenshots taken from webpage
magick wrote: Can you create a mock-up that we can review? Post an image, a subimage, and the expected difference image so we can better understand what you are recommending.
OK. Try this.
Suppose you have the following two images to match:
large image (128x128):
small image (64x64 subsection at 32,27)
Here is a simulated metric surface, which is just the returned metric value for each subsection shift position relative to the larger image. Its size is the difference in size between the two images, which in this case is 64x64 (128-64=64; sorry, it happens to be the same size as the subsection, but that is just because it came from my existing NCC example). The smaller the subsection, the larger the output would be.
Now this nominally would be grayscale, as the metric is the RMSE accumulated over all channels, i.e. there is not an RMSE for each channel. So it would look like this, since your output "difference" image is nominally white where there is a close match and dark where there is not (opposite from the metric itself). The white area is where the best match occurs, at position 32,27.
But as you nominally color code where the difference is really bad, it could look like this if we color code red where the value is above the fuzz threshold (though nominally I would not expect quite so much red).
The brighter the area, the better the match; where red, the match is beyond the fuzz threshold. But as you can see, the white area is rather broad, so it shows that close matches are nearby. In some cases, depending upon the image, one might get other close matches (bright areas) at much different locations.
This is just a suggestion and I welcome comments from any one.
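The color coding described above might be sketched as follows (Python/NumPy; the colors, the normalization, and the fuzz cutoff are all assumptions chosen for illustration):

```python
import numpy as np

def colorize_surface(surface, fuzz=0.1):
    """Map an RMSE metric surface to an RGB display image.

    Low RMSE (good match) maps toward white, high RMSE toward royal blue,
    and anything above the fuzz threshold is painted solid red.
    Assumes the surface has at least one nonzero value.
    """
    # Invert and normalize so 1.0 = best match, 0.0 = worst
    sim = 1.0 - np.clip(surface / surface.max(), 0.0, 1.0)
    royalblue = np.array([65, 105, 225]) / 255.0
    white = np.ones(3)
    # Linear blend between royal blue (dissimilar) and white (similar)
    rgb = sim[..., None] * white + (1.0 - sim[..., None]) * royalblue
    rgb[surface > fuzz] = [1.0, 0.0, 0.0]  # red where beyond the fuzz threshold
    return rgb
```

This mirrors the blue-to-white-plus-red scheme suggested later in the thread, just computed directly rather than via convert.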
- anthony
- Posts: 8883
- Joined: 2004-05-31T19:27:03-07:00
- Authentication code: 8675308
- Location: Brisbane, Australia
Re: Advanced image comparison for screenshots taken from webpage
NOTE: the size of such a sub-image-fit image is the dimensions of the large image minus the dimensions of the small image; that is the range of the possible placement coordinates.
Anthony Thyssen -- Webmaster for ImageMagick Example Pages
https://imagemagick.org/Usage/
Re: Advanced image comparison for screenshots taken from webpage
It seems that each coordinate of the compare image is the measure of similarity at that coordinate. Easy enough to do but the result is not a difference image. Instead we may need another option to return a similarity image. We also need to figure out the best color scheme. Do you have a grayscale remapping colormap? Usually they show bluish for background (dissimilar) and whitish for foreground (similar). Some similarity algorithms list all the coordinates that are less than a user defined threshold. Do you need that as well or will the similarity image be sufficient? If we are showing a similarity image should similarity be shown as RMSE only or other metrics as well (e.g. AE)? For measuring similarity we suspect that RMSE is sufficient and other metrics would not provide any additional information other than scaling the similarity image color range.
- fmw42
- Posts: 25562
- Joined: 2007-07-02T17:14:51-07:00
- Authentication code: 1152
- Location: Sunnyvale, California, USA
Re: Advanced image comparison for screenshots taken from webpage
magick wrote:It seems that each coordinate of the compare image is the measure of similarity at that coordinate. Easy enough to do but the result is not a difference image. Instead we may need another option to return a similarity image. We also need to figure out the best color scheme. Do you have a grayscale remapping colormap? Usually they show bluish for background (dissimilar) and whitish for foreground (similar). Some similarity algorithms list all the coordinates that are less than a user defined threshold. Do you need that as well or will the similarity image be sufficient? If we are showing a similarity image should similarity be shown as RMSE only or other metrics as well (e.g. AE)? For measuring similarity we suspect that RMSE is sufficient and other metrics would not provide any additional information other than scaling the similarity image color range.
In the NCC case, the metric is bright where similar and dark where not. But when using your color matching, RMSE is dark where similar and bright where not; however, it can be negated to produce the opposite effect, so that bright is similar. I can generate a color mapping if needed, but would like to know if you want it to have shades other than blue to white. I have several schemes for generating a pseudocolor lut or table. See my scripts, pseudocolor and mapcolor, at http://www.fmwconcepts.com/imagemagick/index.html. However, both right now show a cyclic color mapping using HSL, so reddish colors appear at both low and high values. But on my tidbits page, I show some other "rainbow" luts. See http://www.fmwconcepts.com/imagemagick/ ... hp#rainbow. Another approach is from my script halo. It uses a technique that allows one to adjust the "rainbow" from a nominal rgb spectrum to one that varies to white. Nevertheless, almost any rainbow scheme is easy to generate using Anthony's technique of linearly interpolating a few colors, which I used in the one rainbow on my tidbits page. I just need to know what colors are desired.
Perhaps a blue to white gradient that has red (or any other color) where the fuzz value is exceeded. What do you think of this (example below)?
fuzzval=1
convert zelda3_metric.png -black-threshold $fuzzval% \
+level-colors royalblue,white -fill red -opaque royalblue \
zelda3_metric_colored.png
Listing more values probably won't be useful unless you can locate all the local maxima; otherwise, you will just get all the locations near the global maximum.
I have no problem if you want to have two output modes, your current one and this newer idea.
RMSE is probably adequate and the most useful measure (in my opinion), but I have no problem if you want the metric image to show whatever metric the user requests. I mostly use RMSE anyway (less often PSNR, and rarely any of the others).
Other people's suggestions and comments are welcome.