Get perceptual hash value for image using command line
How can I get the perceptual hash value of an image using the command line? Is this possible? Does it even make sense? If it does, what's the best way to do this?
I'm a novice ImageMagick user & total ignoramus when it comes to perceptual hashing. I've been trying to find a way to generate a perceptual hash for a given image and store it in a database so it can be compared with other P-Hashes in the future. Experimenting with NodeJS & NPM packages, they all seem to return different values in different formats (hex or binary or varying wildly in length). I think I may be missing some critical knowledge about the entire concept of perceptual hashing. That, and/or a simple & very obvious flag I can pass to imagemagick's 'identify' program...
Apologies if this has already been covered - I just can't seem to find an answer. If anybody could give me a kick in the right direction it'd be greatly appreciated.
- snibgo
Re: Get perceptual hash value for image using command line
Code:
identify -verbose -define identify:moments x.png
Code:
Channel perceptual hash:
Red, Hue:
PH1: 0.556327, 11
PH2: 2.5103, 11
PH3: 3.35776, 11
PH4: 3.35776, 11
PH5: 6.71553, 11
PH6: 4.61291, 11
PH7: 11, 11
Green, Chroma:
PH1: 0.556327, 11
PH2: 2.5103, 11
PH3: 3.35776, 11
PH4: 3.35776, 11
PH5: 6.71553, 11
PH6: 4.61291, 11
PH7: 11, 11
Blue, Luma:
PH1: 0.556327, 0.556327
PH2: 2.5103, 2.5103
PH3: 3.35776, 3.35776
PH4: 3.35776, 3.35776
PH5: 6.71553, 6.71553
PH6: 4.61291, 4.61291
PH7: 11, 11
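To print only the perceptual-hash section from the full verbose output, a convenient filter is the following (a shell pipeline, not an ImageMagick option; it assumes the 25-line section layout shown above):
Code:
identify -verbose -define identify:moments x.png | grep -A 24 "Channel perceptual hash:"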
snibgo's IM pages: im.snibgo.com
- fmw42
- Posts: 25562
- Joined: 2007-07-02T17:14:51-07:00
- Authentication code: 1152
- Location: Sunnyvale, California, USA
Re: Get perceptual hash value for image using command line
Blair: See viewtopic.php?f=4&t=24906
Snibgo: I have been checking out the hue issue and do agree. So I have been testing the code using YCbCr and LAB in place of HCLp. They seem to work reasonably well. I am working with Magick on this.
In the meantime, if you want to compile the code using some other colorspace, it is easy to do. Change line 2095 in statistic.c to use some other colorspace. It won't change the titling in identify -verbose -moments, but you will be using the new colorspace.
From:
status=TransformImageColorspace(hash_image,HCLpColorspace);
To:
status=TransformImageColorspace(hash_image,LabColorspace);
or
status=TransformImageColorspace(hash_image,YCbCrColorspace);
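If you do recompile, a typical from-source rebuild looks like this (a sketch only; configure options vary by platform, and the file is magick/statistic.c in the ImageMagick 6 tree):
Code:
# from the top of the ImageMagick source tree, after editing magick/statistic.c
./configure
make
sudo make install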
Re: Get perceptual hash value for image using command line
Thanks guys.
Fred: I had found the link you posted, but unfortunately, it left me with as many questions as when I started.
Snibgo: I used the command you suggested, and it gives me the image moments as I have seen them when they're generated with something like
identify -verbose -features 1 -moments -unique x.jpg
Using the command you provided, and its output as an example:
Code:
Image: scott.jpg
Format: JPEG (Joint Photographic Experts Group JFIF format)
Mime type: image/jpeg
Class: DirectClass
Geometry: 831x1108+0+0
Units: Undefined
Type: TrueColor
Endianess: Undefined
Colorspace: sRGB
Depth: 8-bit
Channel depth:
red: 8-bit
green: 8-bit
blue: 8-bit
Channel statistics:
Pixels: 920748
Red:
min: 19 (0.0745098)
max: 255 (1)
mean: 103.957 (0.407673)
standard deviation: 53.8262 (0.211083)
kurtosis: 0.7424
skewness: 1.19267
entropy: 0.916575
Green:
min: 0 (0)
max: 255 (1)
mean: 48.6739 (0.190878)
standard deviation: 33.935 (0.133078)
kurtosis: 7.07903
skewness: 1.9035
entropy: 0.843458
Blue:
min: 0 (0)
max: 255 (1)
mean: 41.3478 (0.162148)
standard deviation: 37.4749 (0.14696)
kurtosis: 13.8376
skewness: 3.31578
entropy: 0.808608
Image statistics:
Overall:
min: 0 (0)
max: 255 (1)
mean: 64.6594 (0.253566)
standard deviation: 42.6349 (0.167196)
kurtosis: 9.32091
skewness: 2.86859
entropy: 0.856213
Channel moments:
Red:
Centroid: 484.098,465.684
Ellipse Semi-Major/Minor axis: 633.899,468.061
Ellipse angle: 7.73379
Ellipse eccentricity: 0.511483
Ellipse intensity: 102.688 (0.402698)
I1: 0.00162172 (0.413539)
I2: 2.27821e-07 (0.014814)
I3: 8.82495e-11 (0.0014633)
I4: 2.09412e-10 (0.00347234)
I5: -9.82006e-22 (-2.69995e-07)
I6: 8.55174e-14 (0.000361589)
I7: -2.84511e-20 (-7.82241e-06)
I8: -2.58722e-14 (-0.000109394)
Green:
Centroid: 486.112,498.321
Ellipse Semi-Major/Minor axis: 688.378,427.294
Ellipse angle: 11.1463
Ellipse eccentricity: 0.615853
Ellipse intensity: 48.4991 (0.190192)
I1: 0.00366186 (0.933773)
I2: 2.64023e-06 (0.171681)
I3: 3.71273e-10 (0.00615622)
I4: 9.74618e-10 (0.0161605)
I5: -5.80934e-19 (-0.000159723)
I6: 1.54173e-12 (0.00651884)
I7: -7.89243e-20 (-2.16996e-05)
I8: -1.80941e-13 (-0.000765066)
Blue:
Centroid: 509.679,462.952
Ellipse Semi-Major/Minor axis: 700.409,429.122
Ellipse angle: 18.9012
Ellipse eccentricity: 0.622356
Ellipse intensity: 40.3191 (0.158114)
I1: 0.00443067 (1.12982)
I2: 4.049e-06 (0.263287)
I3: 2.20332e-10 (0.0036534)
I4: 4.68073e-09 (0.0776129)
I5: -1.88331e-18 (-0.000517802)
I6: 9.08329e-12 (0.0384064)
I7: -4.36445e-18 (-0.00119997)
I8: -1.24544e-12 (-0.00526601)
Image moments:
Overall:
Centroid: 415.5,554
Ellipse Semi-Major/Minor axis: 661.302,451.636
Ellipse angle: 11.8618
Ellipse eccentricity: 0.563072
Ellipse intensity: 63.4503 (0.248825)
I1: 0.00269294 (0.686699)
I2: 9.60135e-07 (0.0624328)
I3: 1.43378e-10 (0.00237741)
I4: 8.03585e-10 (0.0133245)
I5: -8.57493e-20 (-2.35761e-05)
I6: 7.26833e-13 (0.00307323)
I7: -2.58937e-19 (-7.11925e-05)
I8: -1.51427e-13 (-0.00064027)
Channel perceptual hash:
Red, Hue:
PH1: 0.383483, 0.2179
PH2: 1.82932, 1.53278
PH3: 2.83467, 1.88231
PH4: 2.45938, 2.47389
PH5: 6.56882, 4.69728
PH6: 3.44179, 4.25623
PH7: 5.10667, 5.01462
Green, Chroma:
PH1: 0.0297579, 0.21087
PH2: 0.765282, 1.72529
PH3: 2.21072, 1.97494
PH4: 1.79154, 1.82264
PH5: 3.79665, 4.85014
PH6: 2.18583, 2.88466
PH7: 4.66359, 3.72263
Blue, Luma:
PH1: -0.0530096, 0.160801
PH2: 0.579577, 1.17793
PH3: 2.4373, 2.66479
PH4: 1.11007, 1.93322
PH5: 3.28589, 4.67092
PH6: 1.4156, 2.55493
PH7: 2.92082, 4.26312
Rendering intent: Perceptual
Gamma: 0.454545
Chromaticity:
red primary: (0.64,0.33)
green primary: (0.3,0.6)
blue primary: (0.15,0.06)
white point: (0.3127,0.329)
Background color: white
Border color: srgb(223,223,223)
Matte color: grey74
Transparent color: black
Interlace: None
Intensity: Undefined
Compose: Over
Page geometry: 831x1108+0+0
Dispose: Undefined
Iterations: 0
Compression: JPEG
Quality: 93
Orientation: Undefined
Properties:
date:create: 2016-02-21T11:01:57-05:00
date:modify: 2016-02-21T11:01:51-05:00
exif:Software: Google
jpeg:colorspace: 2
jpeg:sampling-factor: 2x2,1x1,1x1
signature: 1a15bfa8c718dc31195948695ce67778e7ac54e707b44d425e0115234d2863c3
Profiles:
Profile-exif: 40 bytes
Artifacts:
filename: scott.jpg
identify:moments:
verbose: true
Tainted: False
Filesize: 69.4KB
Number pixels: 921K
Pixels per second: 460.37GB
User time: 0.000u
Elapsed time: 0:01.000
Version: ImageMagick 6.9.3-0 Q16 x86_64 2016-01-08 http://www.imagemagick.org
What I think I don't understand is this: how to turn those "42 numbers" into a single, unique (and therefore relational-database-friendly) hash value representative of a given image, such as what this Node module does: https://www.npmjs.com/package/imghash
It promises to return a single hex hash value, which you can then use to compare to another hash to determine image similarity.
It does indeed appear to do this, as I have determined by experimenting with it.
But what I don't get is how this is being done, or if it is even an effective or wise way to produce a perceptual hash.
If it is indeed a fair and reliable way to represent a perceptual hash (with a single, unique hex value, for example) - then is it possible for imagemagick to produce this hash value from the command line?
Thanks again for your patience and help.
- fmw42
Re: Get perceptual hash value for image using command line
The 42 floating point values from two images are compared using a square-distance measure. It is not a simple binary hash that can be compared using the Hamming distance.
If you are on Unix (Linux, Mac OSX, Windows with Cygwin or Windows 10), then I have two scripts, phashconvert and phashcompare, at the link below. The first converts the 42 floats to a string of digits (not binary) that can be stored. The second takes two strings of digits, converts back to floating point values and does the rms difference.
EDIT: The metric is just the Sum of Squared Differences between the 42 float values.
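To illustrate the idea, here is a simplified shell sketch (this is not Fred's phashconvert/phashcompare; it stores the 42 values as a plain space-separated string rather than packed digits):
Code:
# Extract the 42 perceptual-hash floats (both colorspace columns of all
# 21 PH lines) into one space-separated string for storage.
phash_string() {
  identify -verbose -define identify:moments "$1" |
  awk '/PH[1-7]:/ { gsub(/,/, "", $2); printf "%s %s ", $2, $3 }'
}

# Sum of squared differences between two stored 42-value strings.
phash_ssd() {
  echo "$1 $2" | awk -v n=42 '{
    ssd = 0
    for (i = 1; i <= n; i++) { d = $i - $(i + n); ssd += d * d }
    print ssd
  }'
}

h1=$(phash_string image1.png)
h2=$(phash_string image2.png)
phash_ssd "$h1" "$h2"   # smaller means more similar; 0 means identical hashes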
Re: Get perceptual hash value for image using command line
Thanks Fred! I will experiment with those scripts.
- fmw42
Re: Get perceptual hash value for image using command line
blair wrote: Thanks Fred! I will experiment with those scripts.
Note the comments above about issues using the Hue channel that may cause poor results for reddish images.
- snibgo
Re: Get perceptual hash value for image using command line
@Fred: If I had time to experiment, I'd set the colorspace from an attribute so I could use a define, rather than re-compiling for different colorspaces.
I suspect the following would also work, to get Lab numbers in the left-hand column:
Code:
convert in.tiff -colorspace Lab -set colorspace sRGB -verbose -define identify:moments info:
@Blair: The numbers can be stored as 42 numeric fields in a database or, as Fred says, packed together into a single text field, then unpacked when you need them. When comparing two images, calculate the RMS of the differences. That is: subtract numbers of one image from corresponding numbers of the other image. Square these 42 differences. Add the squares together, and divide by 42. Take the square root. The single resulting number is the "distance" between the images. Zero means they match exactly.
IM's "-metric phash" does something like this.
For a job I did, I ignored the Hue numbers, so I used only 35.
Questions in my mind (which, sadly, I don't have time to investigate):
1. Do we gain anything by using two colorspaces instead of one? [EDIT: Would three colorspaces be even better?]
2. I would expect that a perceptually uniform colorspace would give better results.
3. How much precision do we need in the numbers? The default is 6. I think Fred's scheme uses 4. Is 4 sufficient? Is 6 better? Is 10 better?
4. The usual RMS scheme gives equal weighting to all the numbers. But I notice that patterns like this are common:
Code:
Red.Hue
PH1: 0.414409, 0.423139
PH2: 1.50504, 1.62729
PH3: 3.66349, 4.2049
PH4: 5.41613, 3.9596
PH5: 10.3857, 8.33355
PH6: 6.48774, 4.93924
PH7: 9.98823, 8.10752
PH7 is usually (always?) much greater than PH1. I expect the difference between two images is likewise. So, perhaps we should take the proportional difference instead of the absolute difference (see the sketch after this list). That is, instead of:
Code:
diff.Red.PH1 = image1.Red.PH1 - image2.Red.PH1
... perhaps we should use:
Code:
                 image1.Red.PH1 - image2.Red.PH1
diff.Red.PH1 = -----------------------------------
               (image1.Red.PH1 + image2.Red.PH1)/2
5. There is probably a standard database of images somewhere that we can test against, and compare IM's methods with those of other systems.
As I say, sadly I don't have time for this right now.
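As a concrete sketch of the proportional scheme in question 4 (illustrative only; this is not what IM's -metric phash computes), given two 42-value hash strings such as those extracted above:
Code:
# RMS of proportional differences between two 42-value hash strings
# (space-separated floats). The guard avoids dividing by zero when a
# pair of values averages to zero.
phash_rms_prop() {
  echo "$1 $2" | awk -v n=42 '{
    sum = 0
    for (i = 1; i <= n; i++) {
      a = $i; b = $(i + n); m = (a + b) / 2
      d = (m != 0) ? (a - b) / m : 0
      sum += d * d
    }
    print sqrt(sum / n)
  }'
}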
snibgo's IM pages: im.snibgo.com
- snibgo
Re: Get perceptual hash value for image using command line
Another question occurs to me:
6. If the colorspace (or one of the colorspaces) used has a lightness channel, such as YCbCr or Lab, is there benefit to giving more weight to the lightness hashes?
I'm afraid I have more questions than answers. Sorry about that.
snibgo's IM pages: im.snibgo.com
- fmw42
Re: Get perceptual hash value for image using command line
snibgo wrote: @Fred: If I had time to experiment, I'd set the colorspace from an attribute so I could use a define, rather than re-compiling for different colorspaces.
That is what I am discussing with Magick.
snibgo wrote: PH7 is usually (always?) much greater than PH1. I expect the difference between two images is likewise. So, perhaps we should take the proportional difference instead of the absolute difference.
Interesting point.
snibgo wrote: For a job I did, I ignored the Hue numbers, so I used only 35.
How well did that work? Would it make sense to use three colorspaces with 8 channels: RGB, CLp, YCbCr? Perhaps that gives more weight to "intensity-like" channels (Lp and Y).
snibgo wrote: I would expect that a perceptually uniform colorspace would give better results.
Would that be like converting your image to linear RGB first, before the phash compare?
Re: Get perceptual hash value for image using command line
No problem snibgo, all of this information is a handy start - very much appreciated.
- snibgo
Re: Get perceptual hash value for image using command line
snibgo wrote: For a job I did, I ignored the Hue numbers, so I used only 35.
fmw42 wrote: How well did that work?
It worked fine, successfully finding (for example) ten images that were close to a given image. When Hue was included, it missed images that should have been close.
Speed is always an issue. Calculating PH values takes a long time, and for two (or more) colorspaces takes twice (or more) as long.
If two (or more) colorspaces give better results, that's fair enough. Perhaps one colorspace with a lightness channel, and no hue, is sufficient. I don't know.
Would linear RGB be better? I doubt it, as linear RGB is even less perceptually uniform than sRGB. But it is worth testing.
If it is found that more colorspaces give better results, then perhaps this could be an option. Currently IM always calculates two. Perhaps we could give IM a list of colorspaces to calculate. That's not important, as we can do the job in a command, finding as many as we want (but currently wasting half the effort):
Code:
convert ^
r.png ^
-verbose ^
-define identify:moments ^
( +clone -colorspace Lab -set colorspace sRGB +write info: +delete ) ^
( +clone -colorspace YCbCr -set colorspace sRGB +write info: ) ^
NULL:
snibgo's IM pages: im.snibgo.com
- snibgo
Re: Get perceptual hash value for image using command line
I've skimmed through the paper http://www.naturalspublishing.com/files ... g3omq1.pdf "Perceptual Hashing for Color Images Using Invariant Moments", Zhenjun Tang, Yumin Dai and Xianquan Zhang, 2011.
IM's method appears to be based on this. Tang et al. use YCbCr and HSI colorspaces. They take a load of standard images (Lena, Baboon, etc.) and tweak ("attack") each one, then calculate the RMS phash distance between all pairs of images. Where this is below a certain threshold, the images are considered the same; otherwise they are not.
What tweaking do they do? Brightness adjustment, contrast adjustment, gamma correction, 3x3 Gaussian low-pass filtering, JPEG compression, watermark embedding, scaling, and rotation.
That list has no operation that changes hues. This is why including Hue as one of the channels didn't cause problems in their testing.
At http://www.fmwconcepts.com/misc_tests/p ... index.html , Fred includes other tweaking operations, such as translation and various distortions, but again with no operations that change hues.
Including Hue as one of the channels may well aid discrimination, but because hue wraps around, a tweaking operation that changes hue only slightly (e.g. from 99% to 1%) falsely inflates the score, so it harms the robustness to this tweaking.
Changing hue is a common operation on photography and video, perhaps most often for colour balancing. So, for my purposes, Hue should not be used.
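To see the wraparound concretely, compare the hue channel of pure red before and after a small -modulate hue rotation (a sketch; the exact values depend on the ImageMagick version):
Code:
# Red sits at the 0/1 boundary of the hue channel, so a rotation of only
# a couple of degrees wraps its value from ~0 to ~1, which upends the
# channel statistics and hence the hue moments.
convert xc:red -colorspace HCL -format "%[fx:u.r]\n" info:
convert xc:red -modulate 100,100,99 -colorspace HCL -format "%[fx:u.r]\n" info: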
snibgo's IM pages: im.snibgo.com
- fmw42
Re: Get perceptual hash value for image using command line
snibgo wrote: That list has no operation that changes hues. This is why including Hue as one of the channels didn't cause problems in their testing.
Indeed, that was an oversight in the paper from which I created our Phash, and an oversight in my "attacks".
I am going to suggest that we keep the current method without the hue channel, if that can be done without too much effort, but also add a -define or argument so that other colorspaces can be used, such as Lab or YCbCr (or others such as YUV or YIQ), so that others can test for themselves.
snibgo: do you have any other suggestions?
I am not sure yet about the normalization, but I can see if I can get a test version with it implemented. Do you think we should have a -define for the normalization?
- fmw42
Re: Get perceptual hash value for image using command line
Perhaps we should have a define to allow the use of Hue (or disallow it). That would allow backward compatibility and also allow one to decide whether it is needed for the type of images.