Advanced image comparison for screenshots taken from webpage

fmw42
Posts: 25562
Joined: 2007-07-02T17:14:51-07:00
Location: Sunnyvale, California, USA

Re: Advanced image comparison for screenshots taken from webpage

Post by fmw42 »

magick wrote:The code is templated but it needs more work. Perhaps by tomorrow evening.

OK. No problem and no hurry. If you can, give me a heads up when there is something firmed up to test.

Fred
anthony
Posts: 8883
Joined: 2004-05-31T19:27:03-07:00
Location: Brisbane, Australia

Re: Advanced image comparison for screenshots taken from webpage

Post by anthony »

Fred, in your cross-correlation matching of images, the resulting image spikes to mark the top-left corner position for placement of the smaller image.

What if the two images being compared are both cropped from a larger image, with partial overlap? The best match may be such that the top-left corner of the smaller 'search image' is at a negative location! The cross-correlation you have would not find such a match unless both images are rotated 180 degrees!

This is probably a common problem when trying to stitch photos together, where you are trying to find the best overlapping location, without using image registration methods.

Any thought on this?

Also, do you have any information on finding correspondences between sets of image registration coordinates? This may be a much faster way of doing sub-image searches, as it vastly reduces the data to correlate, and may also handle scaling and rotational matching of images.
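To make the negative-offset case concrete, here is a minimal spatial-domain sketch in Python. It is not ImageMagick code and not Fred's FFT method; the function name `best_overlap_offset`, the `min_overlap` parameter, and the use of plain grayscale lists are all assumptions made for illustration. It tries every shift of image `b` against image `a`, including negative ones, and scores each shift by the mean squared difference over the overlapping pixels only.

```python
def best_overlap_offset(a, b, min_overlap=4):
    """Return (dx, dy, score) for the shift of b's top-left corner, relative
    to a's, that minimises the mean squared difference over the overlap.
    a and b are lists of rows of grayscale values (hypothetical input format).
    """
    ah, aw = len(a), len(a[0])
    bh, bw = len(b), len(b[0])
    best = None
    # Negative shifts place b's corner above or to the left of a's corner.
    for dy in range(-(bh - 1), ah):
        for dx in range(-(bw - 1), aw):
            x0, x1 = max(0, dx), min(aw, dx + bw)
            y0, y1 = max(0, dy), min(ah, dy + bh)
            n = (x1 - x0) * (y1 - y0)
            if n < min_overlap:
                continue  # tiny overlaps match by chance, so skip them
            sse = 0
            for y in range(y0, y1):
                for x in range(x0, x1):
                    d = a[y][x] - b[y - dy][x - dx]
                    sse += d * d
            score = sse / n  # normalise by overlap area
            if best is None or score < best[2]:
                best = (dx, dy, score)
    return best


# Two overlapping crops of one synthetic 8x8 image.
src = [[x + 10 * y for x in range(8)] for y in range(8)]
a = [row[3:8] for row in src[2:7]]   # crop whose corner is at (3, 2)
b = [row[0:6] for row in src[0:5]]   # crop whose corner is at (0, 0)
print(best_overlap_offset(a, b))     # b sits at (-3, -2) relative to a
```

Dividing by the overlap area matters here: without it, shifts with very small overlaps would always win simply because they sum fewer terms.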
Anthony Thyssen -- Webmaster for ImageMagick Example Pages
https://imagemagick.org/Usage/
fmw42
Posts: 25562
Joined: 2007-07-02T17:14:51-07:00
Location: Sunnyvale, California, USA

Re: Advanced image comparison for screenshots taken from webpage

Post by fmw42 »

anthony wrote: Fred, in your cross-correlation matching of images, the resulting image spikes to mark the top-left corner position for placement of the smaller image.
Yes, the match coordinates (or pixels in the output) reference where the top-left corner pixel in the smaller image is located in the larger image.
anthony wrote: What if the two images being compared are both cropped from a larger image, with partial overlap? The best match may be such that the top-left corner of the smaller 'search image' is at a negative location! The cross-correlation you have would not find such a match unless both images are rotated 180 degrees!
I don't understand about the 180 rotate. NCC is very sensitive to rotation and the matching will fall apart if the images are not reasonably aligned.


No pixel matching algorithm that I know will work well if the smaller image is not fully located in the larger image.
anthony wrote: This is probably a common problem when trying to stitch photos together, where you are trying to find the best overlapping location, without using image registration methods.

Any thought on this?
I don't know for sure. I have never worked on that, but it was a very familiar problem when I was in the virtual tour business, where we used commercial software for it. I believe that such software finds unique features in the two images in the overlap area and matches very small subsections from one to the other within a limited distance. I suspect they do not use FFT, but direct correlation matching in the spatial domain, as they have small subsections and know about where to look in the other image, so the search area is also small. This is often due to manual initial coarse alignment. But it has been years since I did any of that.
anthony wrote: Also, do you have any information on finding correspondences between sets of image registration coordinates? This may be a much faster way of doing sub-image searches, as it vastly reduces the data to correlate, and may also handle scaling and rotational matching of images.
No, unfortunately, I have not seen that. But that may also be included in the process mentioned above after finding match locations for however many feature subsections as they might do. Perhaps it is just a least squares fit of the subsection match locations.
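As a rough illustration of what a least-squares fit of match locations could look like, the sketch below fits a similarity transform (uniform scale, rotation, translation) to pairs of matched points. This is only a guess at the kind of fit meant, not the method of any particular stitching package; the name `fit_similarity` and the point-list format are assumptions. Treating each point as a complex number gives a closed-form solution of q ≈ a·p + b.

```python
import cmath
import math


def fit_similarity(src_pts, dst_pts):
    """Least-squares fit of dst ~= a * src + b, with points as complex numbers.
    abs(a) is the scale, cmath.phase(a) the rotation, b the translation.
    """
    p = [complex(x, y) for x, y in src_pts]
    q = [complex(x, y) for x, y in dst_pts]
    n = len(p)
    pm = sum(p) / n                      # centroid of source points
    qm = sum(q) / n                      # centroid of destination points
    num = sum((pi - pm).conjugate() * (qi - qm) for pi, qi in zip(p, q))
    den = sum(abs(pi - pm) ** 2 for pi in p)
    a = num / den
    b = qm - a * pm
    return a, b


# Four feature locations, and the same features found in the second image
# after scaling by 2, rotating by 90 degrees, and shifting by (10, 5).
src = [(0, 0), (1, 0), (0, 1), (1, 1)]
dst = [(10, 5), (10, 7), (8, 5), (8, 7)]
a, b = fit_similarity(src, dst)
print(abs(a), math.degrees(cmath.phase(a)), b)
```

With only a handful of feature matches, this is far less data to process than correlating every pixel, which is the speed advantage raised in the question.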
magick
Site Admin
Posts: 11064
Joined: 2003-05-31T11:32:55-07:00

Re: Advanced image comparison for screenshots taken from webpage

Post by magick »

The current implementation of subimage search is extremely naive. We simply scan across the image and look for zero distortion as defined by the -fuzz option. The only speed-up is that as soon as the search at the current position exceeds the current distortion level, it is aborted and we try again at the next (x,y). Feel free to replace the current ExtractSubImageFromImage() method with whatever algorithm might improve the subimage search process.
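For readers following along, here is a minimal Python sketch of a scan with that kind of early abort. It is one reading of the description above, not the actual ImageMagick source: the function name `find_subimage`, the grayscale list-of-rows input, and the choice to check the running error once per row are all assumptions, and the role of -fuzz is left out entirely.

```python
def find_subimage(large, small):
    """Scan every (x, y) and return ((x, y), sse) for the lowest sum of
    squared differences. A position is abandoned as soon as its running
    error reaches the best error found so far.
    """
    H, W = len(large), len(large[0])
    h, w = len(small), len(small[0])
    best_sse = float("inf")
    best_xy = None
    for y in range(H - h + 1):
        for x in range(W - w + 1):
            sse = 0
            aborted = False
            for j in range(h):
                for i in range(w):
                    d = large[y + j][x + i] - small[j][i]
                    sse += d * d
                if sse >= best_sse:      # cannot beat the current best
                    aborted = True
                    break
            if not aborted:
                best_sse, best_xy = sse, (x, y)
    return best_xy, best_sse


# A 6x5 synthetic image and a 3x2 crop taken at (2, 1).
large = [[x + 10 * y for x in range(6)] for y in range(5)]
small = [row[2:5] for row in large[1:3]]
print(find_subimage(large, small))
```

The abort only pays off once a good match has been seen, so a scan that meets the true match late still does nearly the full amount of work.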
fmw42
Posts: 25562
Joined: 2007-07-02T17:14:51-07:00
Location: Sunnyvale, California, USA

Re: Advanced image comparison for screenshots taken from webpage

Post by fmw42 »

magick wrote: The current implementation of subimage search is extremely naive. We simply scan across the image and look for zero distortion as defined by the -fuzz option. The only speed-up is that as soon as the search at the current position exceeds the current distortion level, it is aborted and we try again at the next (x,y). Feel free to replace the current ExtractSubImageFromImage() method with whatever algorithm might improve the subimage search process.

Thanks. I just tried in the released version of IM 6.5.0-9, but it still reports images are different sizes. So I presume it was not fully implemented there and will look in IM 6.5.0-10 when available.

What do you mean by "current distortion level"? How is that determined? Are you keeping a running value and just skipping to the next pixel if the current rgb color difference value is larger than the running value? So I take it you are not saving every value and making an output image, but just looking for the best match (in this implementation).

Will this then be the syntax?

compare -fuzz XX% largerimage smallerimage null:

Will the image order matter?

Fred
magick
Site Admin
Posts: 11064
Joined: 2003-05-31T11:32:55-07:00

Re: Advanced image comparison for screenshots taken from webpage

Post by magick »

Let's get a starting point. Try this command sequence. Once it works we will go from there:
    convert logo: logo.png
    convert logo.png -crop 150x100+340+160 wizard.png
    compare -metric rmse logo.png wizard.png null:
    0 (0) @ 340,160
Here the compare program successfully found the subimage at (340,160).
fmw42
Posts: 25562
Joined: 2007-07-02T17:14:51-07:00
Location: Sunnyvale, California, USA

Re: Advanced image comparison for screenshots taken from webpage

Post by fmw42 »

magick wrote:Let's get a starting point. Try this command sequence. Once it works we will go from there:
    convert logo: logo.png
    convert logo.png -crop 150x100+340+160 wizard.png
    compare -metric rmse logo.png wizard.png null:
    0 (0) @ 340,160
Here the compare program successfully found the subimage at (340,160).

Great!

On my Mac Mini G4 1.4GHz in IM 6.5.0-9:

time compare -metric rmse logo.png wizard.png null:
0 (0) @ 340,160

real 5m33.708s
user 5m0.356s
sys 0m1.944s

So it does work, but it is rather slow, which is to be expected for that image size and spatial matching. If I ever get any help with the FFT stuff, the NCC will be much faster.
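A quick count shows why a spatial search is slow at this size. The built-in logo: image is 640x480 and the crop is 150x100; the sketch below (the helper name `brute_force_cost` is made up for illustration) counts the shift positions and the pixel comparisons needed if every position is evaluated in full, with no early abort.

```python
def brute_force_cost(W, H, w, h):
    """Number of shift positions, and of pixel comparisons, for an
    exhaustive spatial search of a w x h image inside a W x H image."""
    positions = (W - w + 1) * (H - h + 1)
    return positions, positions * w * h


positions, comparisons = brute_force_cost(640, 480, 150, 100)
print(positions)     # 187071 shift positions
print(comparisons)   # 2806065000 pixel comparisons, per channel
```

That is about 2.8 billion comparisons per channel as an upper bound, which is consistent with a run time of minutes on the hardware above.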

I also tried adding -fuzz. It made no apparent difference to the result, but it was slower. Is it supposed to play a part?

time compare -fuzz 10% -metric rmse logo.png wizard.png null:
0 (0) @ 340,160

real 6m30.422s
user 5m25.875s
sys 0m2.241s

Is -metric rmse fuzz sensitive now? In Anthony's notes I see:

AE ...... Absolute Error count of the number of different pixels (0=equal)
...
This is the ONLY metric which is 'fuzz' affected.


I also tried leaving the metric off:

time compare logo.png wizard.png null:
@ 340,160

real 6m1.895s
user 5m3.648s
sys 0m2.207s

It finished with no match score reported, but did find the subsection. So I presume some metric must be provided. Is there no default metric?

I also tried -metric AE:

time compare -metric AE logo.png wizard.png null:
0 @ 340,160

real 6m3.346s
user 5m3.983s
sys 0m2.206s



What metrics are relevant (all of them?) and which ones are fuzz sensitive for this image matching?



Thanks.

Fred
magick
Site Admin
Posts: 11064
Joined: 2003-05-31T11:32:55-07:00

Re: Advanced image comparison for screenshots taken from webpage

Post by magick »

Next up, use the same compare command but first change on pixel in the wizard.png image. If you do not specify the fuzz option, the comparison will fail. However, if you specify a -fuzz 1% option, you will see how the metric is now useful. The compare program finds the subimage and measures the distortion between the original and its reconstruction. You can also visualize which pixel was changed with the difference image. All very cool, but very slow (for now, until we improve this algorithm).

You might have noticed that what we just said does not in fact work. We have a patch for the problem. Look for an updated beta with the patch later tonight or tomorrow.
fmw42
Posts: 25562
Joined: 2007-07-02T17:14:51-07:00
Location: Sunnyvale, California, USA

Re: Advanced image comparison for screenshots taken from webpage

Post by fmw42 »

magick wrote: Next up, use the same compare command but first change on pixel in the wizard.png image. If you do not specify the fuzz option, the comparison will fail. However, if you specify a -fuzz 1% option, you will see how the metric is now useful. The compare program finds the subimage and measures the distortion between the original and its reconstruction. You can also visualize which pixel was changed with the difference image. All very cool, but very slow (for now, until we improve this algorithm).

You might have noticed that what we just said does not in fact work. We have a patch for the problem. Look for an updated beta with the patch later tonight or tomorrow.

OK. I have not yet tried what you suggest. Not sure what you are suggesting?

"...use the same compare command but first change on pixel in the wizard.png image"

Do you mean change ONE pixel (not ON pixel)?

Why is -fuzz required? It worked in my tests above. Does it have to do with changing the ONE pixel, and if so, why do I need a fuzz factor just for that? There will still be an rmse value for each shift of the subsection, whether you have a fuzz value or not.

I guess I don't understand why you are not just using the metric itself and need to include a fuzz factor. Can you not just calculate the metric for each shift of the subimage and locate the subsection with the minimum metric?

Sorry for being so naive about what you are doing.

With regard to the difference image, I am not sure what you are putting in that image, as the two images are not the same size. Are you displaying the metric value at each shift position as an image?

I look forward to the next beta to see what is going on, but no rush on my part.

Thanks.

Fred
anthony
Posts: 8883
Joined: 2004-05-31T19:27:03-07:00
Location: Brisbane, Australia

Re: Advanced image comparison for screenshots taken from webpage

Post by anthony »

Here I repeat the comparison using some smaller images for faster testing:

   convert logo: -resize x180 -gravity center -crop 180x180+0+0! logo.png
   convert logo.png  -crop 25x25+130+31  star.png
   compare -metric rmse logo.png star.png null:
results in
0 (0) @ 130,31
However, if I use JPEG images, so that the pixels differ slightly, I get "image size differs" errors!

   convert logo: -resize x180 -gravity center -crop 180x180+0+0! logo.jpg
   convert logo.jpg  -crop 25x25+130+31  star.jpg
   compare -fuzz 10% -metric rmse logo.jpg star.jpg null:
Any use of JPG seems to cause problems! Even converting the PNG back to PNG still produces an "image size differs" error!

NOTE: comparing the PNG logo and the JPG logo from the above produces
1859.64 (0.0283763) @ 0,0
showing an approximate maximum fuzz difference of about 2.8% over the whole image (not exact). Changing fuzz does not help in any case.
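The two numbers in that output appear to be the same RMSE in two scales: the raw value in quantum units and, in parentheses, the value normalized to the range 0 to 1. The arithmetic below assumes a Q16 build, where the quantum range is 65535; that assumption is inferred from the figures themselves rather than stated in the thread.

```python
QUANTUM_RANGE = 65535  # assumption: 16-bit (Q16) build of ImageMagick


def normalized(raw, quantum_range=QUANTUM_RANGE):
    """Convert a raw metric in quantum units to the 0..1 scale."""
    return raw / quantum_range


print(round(normalized(1859.64), 7))          # compare with the (0.0283763) reported
print(round(normalized(1859.64) * 100, 1))    # the same value as a percentage
```

Expressed as a percentage, this is the scale on which a -fuzz value such as 10% would be compared.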
magick
Site Admin
Posts: 11064
Joined: 2003-05-31T11:32:55-07:00

Re: Advanced image comparison for screenshots taken from webpage

Post by magick »

We adjusted the algorithm sensitivity and your example now works. Thanks.
fmw42
Posts: 25562
Joined: 2007-07-02T17:14:51-07:00
Location: Sunnyvale, California, USA

Re: Advanced image comparison for screenshots taken from webpage

Post by fmw42 »

Sorry, I still don't understand why -fuzz is needed?
anthony
Posts: 8883
Joined: 2004-05-31T19:27:03-07:00
Location: Brisbane, Australia

Re: Advanced image comparison for screenshots taken from webpage

Post by anthony »

Because the pixels do not match exactly!!!!!

I am not certain of the exact method used, but there are two that could be in use.

One is to keep comparing at the current position until a pixel falls outside the fuzz distance; as soon as this happens, reject the compare and move on. I believe this is what it does. In other words, there is no chance of getting a map of how closely an image compares at every point (as you get with FFT), and it is generally a lot faster.

Another way is to add up differences between pixels until the metric goes beyond the fuzz percentage, and then abort. That is, the meaning of fuzz depends on the metric: AE for a count of matching pixels (though for that we need a -fuzz for pixel matching and a fuzz for comparison counts), PAE for peak color difference, or the average count going above the fuzz threshold.

The former is probably what is being used, while the latter would be slower and harder to implement, but can be more useful. However, a different option from -fuzz should be used as a 'comparison limit factor'.

Note that the above does not mention anything about WHAT exactly is returned with regard to the best match. Is it the best metric fit? Or just the first match found? Or a list of the best matching positions? All of these could be useful!
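The difference between the two candidate methods can be shown with a small Python sketch. Neither function is taken from ImageMagick; the names `matches_per_pixel` and `matches_cumulative`, the grayscale inputs, and the use of RMSE as the accumulated metric are assumptions chosen to illustrate the idea.

```python
def matches_per_pixel(large, small, x, y, fuzz):
    """Method 1: reject the position as soon as any single pixel differs
    from its counterpart by more than fuzz."""
    for j, row in enumerate(small):
        for i, v in enumerate(row):
            if abs(large[y + j][x + i] - v) > fuzz:
                return False
    return True


def matches_cumulative(large, small, x, y, limit):
    """Method 2: accumulate squared error and reject once the final RMSE
    is guaranteed to exceed limit."""
    n = len(small) * len(small[0])
    budget = limit * limit * n           # largest total squared error allowed
    sse = 0
    for j, row in enumerate(small):
        for i, v in enumerate(row):
            d = large[y + j][x + i] - v
            sse += d * d
            if sse > budget:
                return False
    return True


small = [[10, 10], [10, 10]]
one_outlier = [[10, 10], [10, 18]]       # a single pixel off by 8
all_shifted = [[16, 16], [16, 16]]       # every pixel off by 6

print(matches_per_pixel(one_outlier, small, 0, 0, fuzz=5),
      matches_cumulative(one_outlier, small, 0, 0, limit=5))
print(matches_per_pixel(all_shifted, small, 0, 0, fuzz=6),
      matches_cumulative(all_shifted, small, 0, 0, limit=5))
```

The two tests disagree in both directions: one bad pixel fails the per-pixel test but passes on average, while a uniform shift can pass per pixel yet fail the accumulated limit. That is one reason a separate 'comparison limit' option could be worth having.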
anthony
Posts: 8883
Joined: 2004-05-31T19:27:03-07:00
Location: Brisbane, Australia

Re: Advanced image comparison for screenshots taken from webpage

Post by anthony »

magick wrote:We adjusted the algorithm sensitivity and your example now works.
Yep, it does now work for an inexact JPG image:

   convert logo: -resize x180 -gravity center -crop 180x180+0+0! logo.jpg
   convert logo.jpg  -crop 25x25+130+31  star.jpg
   compare -fuzz 10% -metric rmse logo.jpg star.jpg null:
returned
2154.36 (0.0328734) @ 130,31
Removing the fuzz results in an immediate "image size differs" error, which I presume indicates that the sub-image very quickly failed to match at every location.

Using a -fuzz of 200% still returns the correct location so I presume the best match is the one returned.

Also changing the null: to show: nicely returns an image the size of the sub-image, but I am not certain of its meaning, as it shows pixels that don't match according to the fuzz factor.

Perhaps we really need to separate -fuzz, as used for pixel mask, and the 'comparison match limit' as used for sub-image match determination.

More info on EXACTLY what the algorithm does for 'fuzzy matching' may be useful at this point.
fmw42
Posts: 25562
Joined: 2007-07-02T17:14:51-07:00
Location: Sunnyvale, California, USA

Re: Advanced image comparison for screenshots taken from webpage

Post by fmw42 »

anthony wrote:Because the pixels do not match exactly!!!!!

I am not certain of the exact method used, but there are two that could be in use.

One is to keep comparing at the current position until a pixel falls outside the fuzz distance; as soon as this happens, reject the compare and move on. I believe this is what it does. In other words, there is no chance of getting a map of how closely an image compares at every point (as you get with FFT), and it is generally a lot faster.

Another way is to add up differences between pixels until the metric goes beyond the fuzz percentage, and then abort. That is, the meaning of fuzz depends on the metric: AE for a count of matching pixels (though for that we need a -fuzz for pixel matching and a fuzz for comparison counts), PAE for peak color difference, or the average count going above the fuzz threshold.

The former is probably what is being used, while the latter would be slower and harder to implement, but can be more useful. However, a different option from -fuzz should be used as a 'comparison limit factor'.

Note that the above does not mention anything about WHAT exactly is returned with regard to the best match. Is it the best metric fit? Or just the first match found? Or a list of the best matching positions? All of these could be useful!
OK. I can understand using -fuzz as a threshold limit or cut-off on the metric to stop the processing at any given smaller image shift relative to the larger one, since once the fuzz value is reached by the metric, there is no point in evaluating any further pixels in the metric for that shift value. That makes more sense to me than if any pixel exceeds the fuzz threshold.

I will have to try looking at the code when the next version is available. But as I don't read code that well, I may not be able to follow exactly what is done. Thus a better explanation from Magick would be welcome.

However, as one must process every shift value, one could do the following:

1) For every shift value, check the metric against the stored minimum metric and, if it is less than or equal, update the stored value. This makes the returned best fit the last shift value encountered with that minimum. Alternatively, one could use a less-than-only test, which would result in the first shift value with the lowest metric being returned as the best fit.

2) If a "difference" image is requested and returned, it would be the size of the larger image dimension less the smaller image dimension, i.e., the size determined by every possible shift value of the smaller image inside the larger image. Since a fuzz value is used to cut off the processing at any shift position once the threshold metric is reached, one would just store "white" (or some other color) at that position to specify that the fuzz threshold was exceeded.
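Both suggestions can be sketched together in Python. This is an illustration of the proposal, not of what compare actually does; the names `sse_or_none` and `metric_map`, the sum-of-squares metric, and the use of `None` as the "threshold exceeded" marker (standing in for white) are all assumptions. Note that the map has one entry per shift, which works out to (W - w + 1) by (H - h + 1).

```python
def sse_or_none(large, small, x, y, limit):
    """Sum of squared differences at shift (x, y), or None as soon as the
    running sum exceeds limit."""
    sse = 0
    for j, row in enumerate(small):
        for i, v in enumerate(row):
            d = large[y + j][x + i] - v
            sse += d * d
            if sse > limit:
                return None
    return sse


def metric_map(large, small, limit):
    """Return (grid, first, last): the metric at every shift, plus the
    first and the last shift that achieve the minimum metric."""
    H, W = len(large), len(large[0])
    h, w = len(small), len(small[0])
    grid, best, first, last = [], None, None, None
    for y in range(H - h + 1):
        row = []
        for x in range(W - w + 1):
            s = sse_or_none(large, small, x, y, limit)
            row.append(s)
            if s is not None:
                if best is None or s < best:      # strict test keeps the first
                    best, first, last = s, (x, y), (x, y)
                elif s == best:                   # equality moves the last
                    last = (x, y)
        grid.append(row)
    return grid, first, last


# A repeating pattern, so the small image matches at two shifts.
large = [[1, 2, 1, 2, 1]]
small = [[1, 2]]
grid, first, last = metric_map(large, small, limit=1)
print(grid)
print(first, last)
```

With a repeating pattern like this one, the choice between the two tests changes the reported location, which is why it matters to know which one the implementation uses.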