Large Image Catalog - Color Sorting

Post by CybrMike »

I have a large image catalog, 100,000+ images, that needs to be sorted by major color (red, green, blue, purple, etc.).

Say about 16 major colors. How would I go about assigning the images to these groups based on pixel count? I'd need to group all the blue images together, with the images having the highest concentration of blue near the top.

I can compress the image down to its main 16 colors doing something like this:

$img = new Imagick($image_fname);
$img->quantizeImage(16,Imagick::COLORSPACE_RGB,1,false,false);
Then, grab the histogram:

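$histogram = $img->getImageHistogram();

// Each histogram entry is an ImagickPixel for one of the quantized colors.
foreach ($histogram as $h) {
    //echo $h->getColorAsString() . "\n";
    $color = $h->getColor(); // array with keys 'r', 'g', 'b', 'a'
    $data = array(
        "productid" => $pid,
        "red"       => $color['r'],
        "green"     => $color['g'],
        "blue"      => $color['b']
    );
    // $data is rebuilt on every pass; store it before the next iteration.
}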
But this doesn't sort the colors into my 16 main colors. The RGB values it finds are, of course, all over the map. How do I decide whether an RGB value is close enough to blue, for instance? It also doesn't account for the percentage of blue vs. red, which I need in order to rank the images by the amount of each color.

Any help here would be appreciated and would be rewarded with Bitcoin.

Re: Large Image Catalog - Color Sorting

Post by fmw42 »

One possibility is to average your image down to 1 pixel, then use compare -metric rmse to compare that pixel against a 1-pixel image of each of your basic colors. That will tell you how close your overall image color is to each of your primary colors.
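
As a rough illustration of the arithmetic behind that first approach, here is a minimal sketch in plain Python (no ImageMagick calls). It assumes the pixels are already available as (r, g, b) tuples in the 0-255 range and, as a stand-in for "your basic colors", uses the 16 basic CSS color keywords. The palette and the function names are assumptions for illustration, not something from the original post.

```python
import math

# Assumed palette: the 16 basic CSS color keywords (not from the original post).
PALETTE = {
    "black": (0, 0, 0), "silver": (192, 192, 192), "gray": (128, 128, 128),
    "white": (255, 255, 255), "maroon": (128, 0, 0), "red": (255, 0, 0),
    "purple": (128, 0, 128), "fuchsia": (255, 0, 255), "green": (0, 128, 0),
    "lime": (0, 255, 0), "olive": (128, 128, 0), "yellow": (255, 255, 0),
    "navy": (0, 0, 128), "blue": (0, 0, 255), "teal": (0, 128, 128),
    "aqua": (0, 255, 255),
}

def mean_color(pixels):
    """Average a non-empty list of (r, g, b) tuples, like scaling to 1 pixel."""
    n = len(pixels)
    return tuple(sum(p[i] for p in pixels) / n for i in range(3))

def rmse(a, b):
    """Root-mean-square error between two RGB triples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / 3)

def nearest_palette_color(rgb, palette=PALETTE):
    """Return the name of the palette color with the smallest RMSE to rgb."""
    return min(palette, key=lambda name: rmse(rgb, palette[name]))

# Three bluish sample pixels standing in for a whole image.
pixels = [(10, 20, 200), (30, 40, 250), (0, 0, 180)]
print(nearest_palette_color(mean_color(pixels)))
```

Note that a single average can be misleading: an image that is half red and half blue averages to a purple, so this only works well for images dominated by one color.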

Another way is to do

convert yourimage samesizedcolorimage -compose difference -composite -format "%[mean]" info:

That will get you the average difference between your image and the same-sized color image. However, this will be much slower than the above method.


Another way is to do as you have and then use -fx to compute the rmse value between each histogram color and your primary colors. Or use compare on 1-pixel images between each primary color and your histogram colors.
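
The histogram-based idea can be sketched as follows, again in plain Python rather than -fx. It assumes the histogram has been exported as a list of ((r, g, b), pixel_count) pairs, and the five-color palette is a hypothetical, shortened stand-in for the 16 target colors. Minimizing the squared distance picks the same color as minimizing rmse, so the square root is skipped.

```python
from collections import defaultdict

# Hypothetical reduced palette; substitute your own 16 target colors.
PALETTE = {
    "red": (255, 0, 0), "green": (0, 128, 0), "blue": (0, 0, 255),
    "white": (255, 255, 255), "black": (0, 0, 0),
}

def squared_distance(a, b):
    """Squared Euclidean distance between two RGB triples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def color_shares(histogram, palette=PALETTE):
    """Map histogram entries to their nearest palette color.

    histogram: list of ((r, g, b), pixel_count) pairs.
    Returns {palette_name: fraction_of_all_pixels}.
    """
    total = sum(count for _, count in histogram)
    shares = defaultdict(float)
    for rgb, count in histogram:
        name = min(palette, key=lambda n: squared_distance(rgb, palette[n]))
        shares[name] += count / total
    return dict(shares)

# A made-up 3-entry histogram: mostly blue, some red, a little white.
hist = [((20, 30, 220), 600), ((240, 10, 10), 300), ((250, 250, 250), 100)]
shares = color_shares(hist)
print(shares)
```

Because the shares are weighted by pixel count, this also gives the percentage of blue vs. red that the original question asks for.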

Another way is to use -remap on your image with a colormap image made up of only your colors, then generate the histogram, which will contain only your colors. The remap converts your image pixels into the closest values from the colormap image. See http://www.imagemagick.org/Usage/quantize/#remap, and turn dithering off with +dither.
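
Once each image's remapped histogram has been turned into per-color fractions, grouping and ranking the catalog is a small sorting step. The sketch below assumes a hypothetical dictionary of file name to {color name: fraction}; the file names, the data layout, and the choice to file each image under its single dominant color are all assumptions made for illustration.

```python
def group_and_rank(catalog):
    """Group images by dominant color, highest concentration first.

    catalog: {image_name: {color_name: fraction_of_pixels}}.
    Returns {color_name: [(image_name, fraction), ...]} sorted descending.
    """
    groups = {}
    for image, shares in catalog.items():
        dominant = max(shares, key=shares.get)
        groups.setdefault(dominant, []).append((image, shares[dominant]))
    for members in groups.values():
        members.sort(key=lambda item: item[1], reverse=True)
    return groups

# Hypothetical per-image color fractions.
catalog = {
    "a.jpg": {"blue": 0.6, "red": 0.4},
    "b.jpg": {"blue": 0.9, "white": 0.1},
    "c.jpg": {"red": 0.7, "blue": 0.3},
}
groups = group_and_rank(catalog)
print(groups["blue"])
```

An alternative design would list an image under every color whose share exceeds some threshold, so that a red-and-blue image shows up in both groups.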

Re: Large Image Catalog - Color Sorting

Post by anthony »

Basically you are wanting to get a 'metric' that describes your image...

I would first separate images by basic overall type.
http://www.imagemagick.org/Usage/compare/#type_general

Then use some type of metric for that type
http://www.imagemagick.org/Usage/compare/#metrics

Overall average color is one metric, (scale image to a single pixel)
another is average foreground (central) background (edges) color,
one of the best metrics I have seen is a 3x3 array of the average colors (scale image to 3x3 pixels)

This last can be expanded further into an array of differences between areas of an image, which is what is used by one paper I recently saw.
See Image Signature for Any Kind of Image
http://www.cs.cmu.edu/~hcwong/Pdfs/icip02.ps
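
To make the 3x3 metric concrete, here is a minimal Python sketch that averages the pixels in each cell of a 3x3 grid. It assumes the image is a list of rows of (r, g, b) tuples and is at least 3 pixels in each dimension. It only approximates what scaling to 3x3 pixels in ImageMagick would produce, and it is not the algorithm from the linked paper.

```python
def grid_signature(image, rows=3, cols=3):
    """Return a rows x cols grid of mean (r, g, b) colors.

    image: list of pixel rows, each a list of (r, g, b) tuples.
    Assumes the image has at least `rows` rows and `cols` columns,
    so that every grid cell contains at least one pixel.
    """
    height, width = len(image), len(image[0])
    signature = []
    for gy in range(rows):
        y0, y1 = gy * height // rows, (gy + 1) * height // rows
        sig_row = []
        for gx in range(cols):
            x0, x1 = gx * width // cols, (gx + 1) * width // cols
            cell = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            sig_row.append(
                tuple(sum(p[i] for p in cell) / len(cell) for i in range(3))
            )
        signature.append(sig_row)
    return signature

# A 6x6 test image: left half red, right half blue.
RED, BLUE = (255, 0, 0), (0, 0, 255)
image = [[RED] * 3 + [BLUE] * 3 for _ in range(6)]
sig = grid_signature(image)
print(sig[0])
```

Two images could then be compared by summing the per-cell color distances between their signatures, which keeps some information about where each color sits in the frame.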

This may not be the best for an image database, but then no one has determined what the best metric for image sorting and searching is. So far all that seems to be happening is the generation of more and more complex image metrics, each of which works with one set of images but fails for other sets.
Anthony Thyssen -- Webmaster for ImageMagick Example Pages
https://imagemagick.org/Usage/