Get perceptual hash value for image using command line
How can I get the perceptual hash value of an image using the command line? Is this possible? Does it even make sense? If it does, what's the best way to do this?
I'm a novice ImageMagick user & total ignoramus when it comes to perceptual hashing. I've been trying to find a way to generate a perceptual hash for a given image and store it in a database so it can be compared with other P-Hashes in the future. Experimenting with NodeJS & NPM packages, they all seem to return different values in different formats (hex or binary or varying wildly in length). I think I may be missing some critical knowledge about the entire concept of perceptual hashing. That, and/or a simple & very obvious flag I can pass to imagemagick's 'identify' program...
Apologies if this has already been covered - I just can't seem to find an answer. If anybody could give me a kick in the right direction it'd be greatly appreciated.
- snibgo
Re: Get perceptual hash value for image using command line
Code:
identify -verbose -define identify:moments x.png
Code:
Channel perceptual hash:
Red, Hue:
PH1: 0.556327, 11
PH2: 2.5103, 11
PH3: 3.35776, 11
PH4: 3.35776, 11
PH5: 6.71553, 11
PH6: 4.61291, 11
PH7: 11, 11
Green, Chroma:
PH1: 0.556327, 11
PH2: 2.5103, 11
PH3: 3.35776, 11
PH4: 3.35776, 11
PH5: 6.71553, 11
PH6: 4.61291, 11
PH7: 11, 11
Blue, Luma:
PH1: 0.556327, 0.556327
PH2: 2.5103, 2.5103
PH3: 3.35776, 3.35776
PH4: 3.35776, 3.35776
PH5: 6.71553, 6.71553
PH6: 4.61291, 4.61291
PH7: 11, 11
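To print only the perceptual-hash section from the full verbose output, a convenient filter is the following (a shell pipeline, not an ImageMagick option; it assumes the 25-line section layout shown above):
Code:
identify -verbose -define identify:moments x.png | grep -A 24 "Channel perceptual hash:"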
snibgo's IM pages: im.snibgo.com
- fmw42
- Posts: 25562
- Joined: 2007-07-02T17:14:51-07:00
- Authentication code: 1152
- Location: Sunnyvale, California, USA
Re: Get perceptual hash value for image using command line
Blair: See viewtopic.php?f=4&t=24906
Snibgo: I have been checking out the hue issue and do agree. So I have been testing the code using YCbCr and LAB in place of HCLp. They seem to work reasonably well. I am working with Magick on this.
In the meantime, if you want to compile the code using some other colorspace, it is easy to do. Change line 2095 in statistic.c to use some other colorspace. It won't change the titling in identify -verbose -moments, but you will be using the new colorspace.
From:
status=TransformImageColorspace(hash_image,HCLpColorspace);
To:
status=TransformImageColorspace(hash_image,LabColorspace);
or
status=TransformImageColorspace(hash_image,YCbCrColorspace);
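If you do recompile, a typical from-source rebuild looks like this (a sketch only; configure options vary by platform, and the file is magick/statistic.c in the ImageMagick 6 tree):
Code:
# from the top of the ImageMagick source tree, after editing magick/statistic.c
./configure
make
sudo make install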
Re: Get perceptual hash value for image using command line
Thanks guys.
Fred: I had found the link you posted, but unfortunately, it left me with as many questions as when I started.
Snibgo: I used the command you suggested, and it gives me the image moments as I have seen them when they're generated with something like
identify -verbose -features 1 -moments -unique x.jpg
Using the command you provided, and its output as an example:
Code:
Image: scott.jpg
Format: JPEG (Joint Photographic Experts Group JFIF format)
Mime type: image/jpeg
Class: DirectClass
Geometry: 831x1108+0+0
Units: Undefined
Type: TrueColor
Endianess: Undefined
Colorspace: sRGB
Depth: 8-bit
Channel depth:
red: 8-bit
green: 8-bit
blue: 8-bit
Channel statistics:
Pixels: 920748
Red:
min: 19 (0.0745098)
max: 255 (1)
mean: 103.957 (0.407673)
standard deviation: 53.8262 (0.211083)
kurtosis: 0.7424
skewness: 1.19267
entropy: 0.916575
Green:
min: 0 (0)
max: 255 (1)
mean: 48.6739 (0.190878)
standard deviation: 33.935 (0.133078)
kurtosis: 7.07903
skewness: 1.9035
entropy: 0.843458
Blue:
min: 0 (0)
max: 255 (1)
mean: 41.3478 (0.162148)
standard deviation: 37.4749 (0.14696)
kurtosis: 13.8376
skewness: 3.31578
entropy: 0.808608
Image statistics:
Overall:
min: 0 (0)
max: 255 (1)
mean: 64.6594 (0.253566)
standard deviation: 42.6349 (0.167196)
kurtosis: 9.32091
skewness: 2.86859
entropy: 0.856213
Channel moments:
Red:
Centroid: 484.098,465.684
Ellipse Semi-Major/Minor axis: 633.899,468.061
Ellipse angle: 7.73379
Ellipse eccentricity: 0.511483
Ellipse intensity: 102.688 (0.402698)
I1: 0.00162172 (0.413539)
I2: 2.27821e-07 (0.014814)
I3: 8.82495e-11 (0.0014633)
I4: 2.09412e-10 (0.00347234)
I5: -9.82006e-22 (-2.69995e-07)
I6: 8.55174e-14 (0.000361589)
I7: -2.84511e-20 (-7.82241e-06)
I8: -2.58722e-14 (-0.000109394)
Green:
Centroid: 486.112,498.321
Ellipse Semi-Major/Minor axis: 688.378,427.294
Ellipse angle: 11.1463
Ellipse eccentricity: 0.615853
Ellipse intensity: 48.4991 (0.190192)
I1: 0.00366186 (0.933773)
I2: 2.64023e-06 (0.171681)
I3: 3.71273e-10 (0.00615622)
I4: 9.74618e-10 (0.0161605)
I5: -5.80934e-19 (-0.000159723)
I6: 1.54173e-12 (0.00651884)
I7: -7.89243e-20 (-2.16996e-05)
I8: -1.80941e-13 (-0.000765066)
Blue:
Centroid: 509.679,462.952
Ellipse Semi-Major/Minor axis: 700.409,429.122
Ellipse angle: 18.9012
Ellipse eccentricity: 0.622356
Ellipse intensity: 40.3191 (0.158114)
I1: 0.00443067 (1.12982)
I2: 4.049e-06 (0.263287)
I3: 2.20332e-10 (0.0036534)
I4: 4.68073e-09 (0.0776129)
I5: -1.88331e-18 (-0.000517802)
I6: 9.08329e-12 (0.0384064)
I7: -4.36445e-18 (-0.00119997)
I8: -1.24544e-12 (-0.00526601)
Image moments:
Overall:
Centroid: 415.5,554
Ellipse Semi-Major/Minor axis: 661.302,451.636
Ellipse angle: 11.8618
Ellipse eccentricity: 0.563072
Ellipse intensity: 63.4503 (0.248825)
I1: 0.00269294 (0.686699)
I2: 9.60135e-07 (0.0624328)
I3: 1.43378e-10 (0.00237741)
I4: 8.03585e-10 (0.0133245)
I5: -8.57493e-20 (-2.35761e-05)
I6: 7.26833e-13 (0.00307323)
I7: -2.58937e-19 (-7.11925e-05)
I8: -1.51427e-13 (-0.00064027)
Channel perceptual hash:
Red, Hue:
PH1: 0.383483, 0.2179
PH2: 1.82932, 1.53278
PH3: 2.83467, 1.88231
PH4: 2.45938, 2.47389
PH5: 6.56882, 4.69728
PH6: 3.44179, 4.25623
PH7: 5.10667, 5.01462
Green, Chroma:
PH1: 0.0297579, 0.21087
PH2: 0.765282, 1.72529
PH3: 2.21072, 1.97494
PH4: 1.79154, 1.82264
PH5: 3.79665, 4.85014
PH6: 2.18583, 2.88466
PH7: 4.66359, 3.72263
Blue, Luma:
PH1: -0.0530096, 0.160801
PH2: 0.579577, 1.17793
PH3: 2.4373, 2.66479
PH4: 1.11007, 1.93322
PH5: 3.28589, 4.67092
PH6: 1.4156, 2.55493
PH7: 2.92082, 4.26312
Rendering intent: Perceptual
Gamma: 0.454545
Chromaticity:
red primary: (0.64,0.33)
green primary: (0.3,0.6)
blue primary: (0.15,0.06)
white point: (0.3127,0.329)
Background color: white
Border color: srgb(223,223,223)
Matte color: grey74
Transparent color: black
Interlace: None
Intensity: Undefined
Compose: Over
Page geometry: 831x1108+0+0
Dispose: Undefined
Iterations: 0
Compression: JPEG
Quality: 93
Orientation: Undefined
Properties:
date:create: 2016-02-21T11:01:57-05:00
date:modify: 2016-02-21T11:01:51-05:00
exif:Software: Google
jpeg:colorspace: 2
jpeg:sampling-factor: 2x2,1x1,1x1
signature: 1a15bfa8c718dc31195948695ce67778e7ac54e707b44d425e0115234d2863c3
Profiles:
Profile-exif: 40 bytes
Artifacts:
filename: scott.jpg
identify:moments:
verbose: true
Tainted: False
Filesize: 69.4KB
Number pixels: 921K
Pixels per second: 460.37GB
User time: 0.000u
Elapsed time: 0:01.000
Version: ImageMagick 6.9.3-0 Q16 x86_64 2016-01-08 http://www.imagemagick.org
What I think I don't understand is this: how to turn those "42 numbers" into a single, unique (and therefore relational-database-friendly) hash value representative of a given image, such as what this Node module does: https://www.npmjs.com/package/imghash
It promises to return a single hex hash value, which you can then use to compare to another hash to determine image similarity.
It does indeed appear to do this, as I have determined by experimenting with it.
But what I don't get is how this is being done, or if it is even an effective or wise way to produce a perceptual hash.
If it is indeed a fair and reliable way to represent a perceptual hash (with a single, unique hex value, for example) - then is it possible for imagemagick to produce this hash value from the command line?
Thanks again for your patience and help.
- fmw42
Re: Get perceptual hash value for image using command line
The 42 floating point values from two images are compared using a square-distance measure. It is not a simple binary hash that can be compared using the Hamming distance.
If you are on Unix (Linux, Mac OSX, Windows with Cygwin or Windows 10), then I have two scripts, phashconvert and phashcompare, at the link below. The first converts the 42 floats to a string of digits (not binary) that can be stored. The second takes two strings of digits, converts back to floating point values and does the rms difference.
EDIT: The metric is just the Sum of Squared Differences between the 42 float values.
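To illustrate the idea, here is a simplified shell sketch (this is not Fred's phashconvert/phashcompare; it stores the 42 values as a plain space-separated string rather than packed digits):
Code:
# Extract the 42 perceptual-hash floats (both colorspace columns of all
# 21 PH lines) into one space-separated string for storage.
phash_string() {
  identify -verbose -define identify:moments "$1" |
  awk '/PH[1-7]:/ { gsub(/,/, "", $2); printf "%s %s ", $2, $3 }'
}

# Sum of squared differences between two stored 42-value strings.
phash_ssd() {
  echo "$1 $2" | awk -v n=42 '{
    ssd = 0
    for (i = 1; i <= n; i++) { d = $i - $(i + n); ssd += d * d }
    print ssd
  }'
}

h1=$(phash_string image1.png)
h2=$(phash_string image2.png)
phash_ssd "$h1" "$h2"   # smaller means more similar; 0 means identical hashes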
Re: Get perceptual hash value for image using command line
Thanks Fred! I will experiment with those scripts.
- fmw42
Re: Get perceptual hash value for image using command line
blair wrote: Thanks Fred! I will experiment with those scripts.
Note the comments above about issues using the Hue channel that may cause poor results for reddish images.
- snibgo
Re: Get perceptual hash value for image using command line
@Fred: If I had time to experiment, I'd set the colorspace from an attribute so I could use a define, rather than re-compiling for different colorspaces.
I suspect the following would also work, to get Lab numbers in the left-hand column:
Code:
convert in.tiff -colorspace Lab -set colorspace sRGB -verbose -define identify:moments info:
@Blair: The numbers can be stored as 42 numeric fields in a database or, as Fred says, packed together into a single text field, then unpacked when you need them. When comparing two images, calculate the RMS of the differences. That is: subtract numbers of one image from corresponding numbers of the other image. Square these 42 differences. Add the squares together, and divide by 42. Take the square root. The single resulting number is the "distance" between the images. Zero means they match exactly.
IM's "-metric phash" does something like this.
For a job I did, I ignored the Hue numbers, so I used only 35.
Questions in my mind (which, sadly, I don't have time to investigate):
1. Do we gain anything by using two colorspaces instead of one? [EDIT: Would three colorspaces be even better?]
2. I would expect that a perceptually uniform colorspace would give better results.
3. How much precision do we need in the numbers? The default is 6. I think Fred's scheme uses 4. Is 4 sufficient? Is 6 better? Is 10 better?
4. The usual RMS scheme gives equal weighting to all the numbers. But I notice that patterns like this are common:
Code:
Red.Hue
PH1: 0.414409, 0.423139
PH2: 1.50504, 1.62729
PH3: 3.66349, 4.2049
PH4: 5.41613, 3.9596
PH5: 10.3857, 8.33355
PH6: 6.48774, 4.93924
PH7: 9.98823, 8.10752
PH7 is usually (always?) much greater than PH1. I expect the difference between two images is likewise. So, perhaps we should take the proportional difference instead of the absolute difference (see the sketch after this list). That is, instead of:
Code:
diff.Red.PH1 = image1.Red.PH1 - image2.Red.PH1
... perhaps we should use:
Code:
                 image1.Red.PH1 - image2.Red.PH1
diff.Red.PH1 = -----------------------------------
               (image1.Red.PH1 + image2.Red.PH1)/2
5. There is probably a standard database of images somewhere that we can test against, and compare IM's methods with those of other systems.
As I say, sadly I don't have time for this right now.
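As a concrete sketch of the proportional scheme in question 4 (illustrative only; this is not what IM's -metric phash computes), given two 42-value hash strings such as those extracted above:
Code:
# RMS of proportional differences between two 42-value hash strings
# (space-separated floats). The guard avoids dividing by zero when a
# pair of values averages to zero.
phash_rms_prop() {
  echo "$1 $2" | awk -v n=42 '{
    sum = 0
    for (i = 1; i <= n; i++) {
      a = $i; b = $(i + n); m = (a + b) / 2
      d = (m != 0) ? (a - b) / m : 0
      sum += d * d
    }
    print sqrt(sum / n)
  }'
}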
snibgo's IM pages: im.snibgo.com
- snibgo
Re: Get perceptual hash value for image using command line
Another question occurs to me:
6. If the colorspace (or one of the colorspaces) used has a lightness channel, such as YCbCr or Lab, is there benefit to giving more weight to the lightness hashes?
I'm afraid I have more questions than answers. Sorry about that.
snibgo's IM pages: im.snibgo.com
- fmw42
Re: Get perceptual hash value for image using command line
snibgo wrote: @Fred: If I had time to experiment, I'd set the colorspace from an attribute so I could use a define, rather than re-compiling for different colorspaces.
That is what I am discussing with Magick.
snibgo wrote: PH7 is usually (always?) much greater than PH1. I expect the difference between two images is likewise. So, perhaps we should take the proportional difference instead of the absolute difference.
Interesting point.
snibgo wrote: For a job I did, I ignored the Hue numbers, so I used only 35.
How well did that work? Would it make sense to use three colorspaces with 8 channels: RGB, CLp, YCbCr? Perhaps that gives more weight to "intensity-like" channels (Lp and Y).
snibgo wrote: I would expect that a perceptually uniform colorspace would give better results.
Would that be like converting your image to linear RGB first, before the phash compare?
Re: Get perceptual hash value for image using command line
No problem snibgo, all of this information is a handy start - very much appreciated.
- snibgo
Re: Get perceptual hash value for image using command line
snibgo wrote: For a job I did, I ignored the Hue numbers, so I used only 35.
fmw42 wrote: How well did that work?
It worked fine, successfully finding (for example) ten images that were close to a given image. When Hue was included, it missed images that should have been close.
Speed is always an issue. Calculating PH values takes a long time, and for two (or more) colorspaces takes twice (or more) as long.
If two (or more) colorspaces give better results, that's fair enough. Perhaps one colorspace with a lightness channel, and no hue, is sufficient. I don't know.
Would linear RGB be better? I doubt it, as linear RGB is even less perceptually uniform than sRGB. But it is worth testing.
If it is found that more colorspaces give better results, then perhaps this could be an option. Currently IM always calculates two. Perhaps we could give IM a list of colorspaces to calculate. That's not important, as we can do the job in a command, finding as many as we want (but currently wasting half the effort):
Code:
convert ^
r.png ^
-verbose ^
-define identify:moments ^
( +clone -colorspace Lab -set colorspace sRGB +write info: +delete ) ^
( +clone -colorspace YCbCr -set colorspace sRGB +write info: ) ^
NULL:
snibgo's IM pages: im.snibgo.com
- snibgo
Re: Get perceptual hash value for image using command line
I've skimmed through the paper http://www.naturalspublishing.com/files ... g3omq1.pdf "Perceptual Hashing for Color Images Using Invariant Moments", Zhenjun Tang, Yumin Dai and Xianquan Zhang, 2011.
IM's method appears to be based on this. Tang et al. use YCbCr and HSI colorspaces. They take a load of standard images (Lena, Baboon, etc.) and tweak ("attack") each one, then calculate the RMS phash distance between all pairs of images. Where this is below a certain threshold, the images are considered the same; otherwise they are not.
What tweaking do they do? Brightness adjustment, contrast adjustment, gamma correction, 3x3 Gaussian low-pass filtering, JPEG compression, watermark embedding, scaling, and rotation.
That list has no operation that changes hues. This is why including Hue as one of the channels didn't cause problems in their testing.
At http://www.fmwconcepts.com/misc_tests/p ... index.html , Fred includes other tweaking operations, such as translation and various distortions, but again with no operations that change hues.
Including Hue as one of the channels may well aid discrimination, but because hue wraps around, a tweaking operation that changes hue only slightly (e.g. from 99% to 1%) falsely inflates the score, so it harms the robustness to this tweaking.
Changing hue is a common operation on photography and video, perhaps most often for colour balancing. So, for my purposes, Hue should not be used.
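To see the wraparound concretely, compare the hue channel of pure red before and after a small -modulate hue rotation (a sketch; the exact values depend on the ImageMagick version):
Code:
# Red sits at the 0/1 boundary of the hue channel, so a rotation of only
# a couple of degrees wraps its value from ~0 to ~1, which upends the
# channel statistics and hence the hue moments.
convert xc:red -colorspace HCL -format "%[fx:u.r]\n" info:
convert xc:red -modulate 100,100,99 -colorspace HCL -format "%[fx:u.r]\n" info: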
snibgo's IM pages: im.snibgo.com
- fmw42
Re: Get perceptual hash value for image using command line
snibgo wrote: That list has no operation that changes hues. This is why including Hue as one of the channels didn't cause problems in their testing.
Indeed, that was an oversight in the paper from which I created our Phash, and an oversight in my "attacks".
I am going to suggest that we keep the current method without the hue channel, if that can be done without too much effort, but also add a -define or argument so that other colorspaces can be used, such as Lab or YCbCr (or others such as YUV or YIQ), so that others can test for themselves.
snibgo: do you have any other suggestions?
I am not sure yet about the normalization, but I can see if I can get a test version with it implemented. Do you think we should have a -define for the normalization?
- fmw42
Re: Get perceptual hash value for image using command line
Perhaps we should have a define to allow the use of Hue (or disallow it). That would allow backward compatibility and also allow one to decide whether it is needed for the type of images.