Legacy ImageMagick Discussions Archive

I posted this to the IM mailing list, and was advised to post it here ...

I need to crop a bunch of photos to square while creating thumbnails. I am currently using ImageMagick to crop the largest square from the center of the photo ... works fine.

I am wondering if there are algorithms that could do a better job of identifying the more interesting parts of a photo (like simple face detection) or eliminating relatively uninteresting parts of a photo (like blue sky or white walls).

Doesn't have to be perfect. I am just looking for a simple improvement over grabbing the square from the center of the photo.

Any feedback / advice / experiences would be appreciated.

Here is one idea, though it did not turn out too well. Convert to grayscale, get edge image, compute mean over every 64x64 subsection, stretch the result and look for the brightest regions. Crop a 64x64 area around the brightest (or some very bright) region.

Original:

convert zelda3.png -colorspace gray -edge 1 zelda3_edge1.png

convert zelda3_edge1.png -virtual-pixel black -statistic mean 64x64 -auto-level zelda3_edg1_mean64_al.png

Find offset coordinates of 64x64 region around the brightest region (done visually in a GUI app such as PS or GIMP) -- +32+16

convert zelda3.png[64x64+32+16] zelda3_crop.png

Using IM to locate the brightest value:
coords=`convert zelda3_edg1_mean64_al.png txt: | grep "white" | head -n 1 | sed -n 's/^\(.*,.*\):.*$/\1/p'`
echo $coords
69,49

xcoord=`echo $coords | cut -d, -f1`
ycoord=`echo $coords | cut -d, -f2`
echo "xcoord=$xcoord; ycoord=$ycoord"
xcoord=69; ycoord=49

xoff=`convert xc: -format "%[fx:$xcoord-(64/2)]" info:`
yoff=`convert xc: -format "%[fx:$ycoord-(64/2)]" info:`
echo "xoff=$xoff; yoff=$yoff"
xoff=37; yoff=17

convert zelda3.png[64x64+37+17] zelda3_crop2.png
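The lookup and offset arithmetic above could be folded into two small helpers. This is only a sketch: the names `first_white_coords` and `clamp_offset` are made up for illustration, it assumes each `txt:` pixel line starts with `x,y:` and names the peak pixel `white` (some IM versions may label it differently), and the clamping to the image bounds is an addition that the steps above do not perform. `$width` and `$height` in the commented usage are placeholders for the input image size.

```shell
#!/bin/sh
# Hypothetical helpers; the names and the clamping step are illustrative assumptions.

# Read "txt:" pixel lines on stdin and print the x,y of the first white pixel.
first_white_coords() {
    grep -i "white" | head -n 1 | sed -n 's/^\([0-9]*,[0-9]*\):.*$/\1/p'
}

# clamp_offset COORD WINDOW SIZE
# Center a WINDOW-wide crop on COORD, then keep it inside 0..SIZE-WINDOW.
clamp_offset() {
    off=$(( $1 - $2 / 2 ))
    max=$(( $3 - $2 ))
    if [ "$max" -lt 0 ]; then max=0; fi
    if [ "$off" -lt 0 ]; then off=0; fi
    if [ "$off" -gt "$max" ]; then off=$max; fi
    echo "$off"
}

# Example use (needs ImageMagick, so it is left commented out):
#   coords=$(convert zelda3_edg1_mean64_al.png txt: | first_white_coords)
#   xoff=$(clamp_offset "${coords%,*}" 64 "$width")
#   yoff=$(clamp_offset "${coords#*,}" 64 "$height")
#   convert "zelda3.png[64x64+${xoff}+${yoff}]" zelda3_crop2.png
```

Without the clamp, a peak near the image border would give a negative offset or a crop window that runs off the edge.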

Note that the above will be very slow, because a 64x64 mean is computed at every pixel. One could consider doing something similar with some offset between windows, but I have not looked into how to do this other than by reducing the image size and the window size.

PS.

You could also threshold the mean image to remove uninteresting areas. Then use -trim to get the remaining area. Then use -verbose info or string formats to get the size and offsets if you do not use +repage after the -trim and keep the image in png format. Then you could use those size and offsets to crop your input image.
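The threshold-and-trim variant could be scripted along these lines. A sketch only: `parse_geometry` is a hypothetical helper, the 50% threshold and the output filename are arbitrary placeholders, and the commented command assumes that `%w`, `%h` and `%O` report the trimmed size and the retained page offset when `+repage` is omitted.

```shell
#!/bin/sh
# parse_geometry WxH+X+Y  ->  prints "W H X Y" (hypothetical helper name)
parse_geometry() {
    echo "$1" | sed -n 's/^\([0-9]*\)x\([0-9]*\)+\([0-9]*\)+\([0-9]*\)$/\1 \2 \3 \4/p'
}

# Example use (needs ImageMagick, so it is left commented out):
#   geom=$(convert zelda3_edg1_mean64_al.png -threshold 50% -trim \
#            -format "%wx%h%O" info:)
#   set -- $(parse_geometry "$geom")
#   convert zelda3.png -crop "${1}x${2}+${3}+${4}" +repage zelda3_trimcrop.png
```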

Forget about this idea -- it failed at the time of writing... Left as a reference only...

fmw42 wrote:Here is one idea, though it did not turn out too well. Convert to grayscale, get edge image, compute mean over every 64x64 subsection, stretch the result and look for the brightest regions. Crop a 64x64 area around the brightest (or some very bright) region.
...
convert zelda3_edge1.png -virtual-pixel black -statistic mean 64x64 -auto-level zelda3_edg1_mean64_al.png
Find offset coordinates of 64x64 region around the brightest region (done visually in a GUI app such as PS or GIMP) -- +32+16

Rather than computing an offset to get the mean you could use a convolution using a 64x64 rectangle but with the 'center' of the convolution at the top left corner (0,0). However, you are really doing image comparison, so a 'correlation' is needed (a convolution with the kernel rotated 180 degrees).
http://www.imagemagick.org/Usage/morphology/#rectangle
http://www.imagemagick.org/Usage/convolve/#mean
http://www.imagemagick.org/Usage/convolve/#correlate

convert zelda3_edge1.png -virtual-pixel black -define convolve:scale=\! -morphology Correlate Rectangle:64x64+0+0 -auto-level zelda_edge_density.png

Strangely, the result of the above is the same as for convolution: each pixel contains the result for the bottom-right +63+63 offset (the convolution result), whereas I was trying to get the actual top-left corner of the 64x64 area.

Something is wrong with the 180-degree kernel rotation needed for correlate -- looks like I have found a bug!

Actually one could use -blur 32x65000 to compute the mean and then the standard deviation within the window (65x65 pixels) for each pixel. This would achieve similar results to the mean of the edge image. Both are measures of the amount of variation within the window.

Note that 65 = 32*2+1, since 32 is the radius; using a very large sigma makes it a uniform mean rather than a Gaussian-weighted one.

std = sqrt( mean(x^2) - (mean(x))^2 ), where x is the image and mean comes from -blur on the image. This can easily be coded in a few lines of IM commands.
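As a quick numeric check of that identity, independent of ImageMagick, here is a small sketch that applies it to a single window of values. The helper name `window_std` and the sample values are illustrative only; in the image case the two means would come from `-blur` as described.

```shell
#!/bin/sh
# window_std: read whitespace-separated numbers on stdin and print
# sqrt( mean(x^2) - mean(x)^2 ) for that one window of values.
window_std() {
    awk '{ for (i = 1; i <= NF; i++) { n++; s += $i; ss += $i * $i } }
         END {
             if (n == 0) exit 1
             m = s / n                 # mean(x)
             v = ss / n - m * m        # mean(x^2) - mean(x)^2
             if (v < 0) v = 0          # guard against rounding below zero
             printf "%g\n", sqrt(v)
         }'
}

# mean = 5, mean of squares = 29, so std = sqrt(29 - 25) = 2
echo "2 4 4 4 5 5 7 9" | window_std
```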

P.S. It would be nice if -statistic had an std option.

P.S. Again
In testing, the above does not work as well as the edge method.

The bug has been fixed in the SVN version. The problem only affected kernels defined using "Rectangle": a silly "if this - no need to do" statement that was wrong!

As I was saying before I found the bug, using a correlation...

convert zelda3_edge1.png -virtual-pixel black -define convolve:scale=\! \
        -morphology Correlate Rectangle:64x64+0+0 -auto-level zelda_edge_density.png

would (well, should have, and now does) result in an image where the maximum value (white) marks the actual offset (the top-left corner) of the 64x64 area of maximum edges (typically regarded as 'most interesting'). No need to adjust the offset found; that pixel is the top-left corner.

However, as Fred mentioned, using a large kernel (unless you use Fast Fourier Transforms) typically takes a long time to work. You can, however, typically get a speed-up by using smaller kernels multiple times.

Using a single large square...

convert zelda3_edge1.png -virtual-pixel black -define convolve:scale=\! \
            -morphology Convolve Square:32  -auto-level zelda_edge_density.png

time: 0m0.390s

and for comparison Fred's linear blur (which uses a 2-pass convolution).

convert zelda3_edge1.png -virtual-pixel black \
            -blur 32x65000 -auto-level zelda_edge_density.png

time: 0m0.039s

As is often the case, a 2-pass 1-dimensional blur beats any form of 2-dimensional convolution/correlate by a very wide margin! It also probably used my computer's GPU rather than the CPU.
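A rough operation count shows why the two-pass blur wins. This is back-of-the-envelope arithmetic rather than a measurement, and the helper names are made up: a full 2-D kernel of width N touches N*N pixels per output pixel, while two 1-D passes touch 2*N. The count ignores I/O, setup cost and any GPU use, so it will not match the timings exactly.

```shell
#!/bin/sh
# Hypothetical helpers: per-pixel sample counts for a window of width N.
ops_2d()       { echo $(( $1 * $1 )); }   # one N x N kernel
ops_two_pass() { echo $(( 2 * $1 )); }    # one horizontal + one vertical pass

n=65   # radius 32 gives a 65-pixel-wide window
echo "2-D: $(ops_2d $n)  two-pass: $(ops_two_pass $n)"
```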

Here I force the use of 2-pass 1-D blur using CPU... (morphology has no GPU code - YET)

convert zelda3_edge1.png -virtual-pixel black -define convolve:scale=\! \
              -morphology Convolve Blur:32x65000\>  -auto-level zelda_edge_density.png

time: 0m0.047s
A little slower but not by a huge amount.

And finally the original statistical mean solution

convert zelda3_edge1.png -virtual-pixel black \
          -statistic mean 64x64 -auto-level zelda_edge_density.png

time: 0m21.844s

Now that is slow!

I'd like to apologize to michaelUFL; we seem to have gotten a little too deep...

Does anyone else have any suggestions for michaelUFL's problem?

The method proposed by fmw42 fails miserably with portraits containing textured backgrounds in focus... but I can't come up with another simple one. Apart from that it seems pretty damn good.

For relatively dark photos (low-key style), you can simply choose the brightest spots as the center.

An intermediate method would be to calculate the "center of gravity", that is, the brightness center, and the center of edges, like fmw42 proposed. Then use the average of both centers.
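The brightness "center of gravity" could be computed from IM's `txt:` pixel dump with a little awk. This is a sketch under assumptions: `brightness_center` and `average_centers` are hypothetical names, the input is assumed to be a grayscale image whose `txt:` lines look like `x,y: (value...)`, and the exact `txt:` layout differs between IM versions. `photo.png` in the commented usage is a placeholder.

```shell
#!/bin/sh
# brightness_center: read "txt:" pixel lines on stdin and print the
# intensity-weighted mean position "cx cy" (first channel used as weight).
brightness_center() {
    awk -F'[,:() ]+' '
        /^#/ { next }                       # skip the header line
        { v = $3 + 0; sx += $1 * v; sy += $2 * v; sv += v }
        END {
            if (sv == 0) exit 1             # all-black image: no center
            printf "%.1f %.1f\n", sx / sv, sy / sv
        }'
}

# average_centers X1 Y1 X2 Y2 -> midpoint of two centers
average_centers() {
    awk -v x1="$1" -v y1="$2" -v x2="$3" -v y2="$4" \
        'BEGIN { printf "%.1f %.1f\n", (x1 + x2) / 2, (y1 + y2) / 2 }'
}

# Example use (needs ImageMagick, so it is left commented out):
#   convert photo.png -colorspace gray txt: | brightness_center
#   convert photo.png -colorspace gray -edge 1 txt: | brightness_center
```

Running the same helper on the edge image gives the "center of edges", and `average_centers` then combines the two points.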

Other ideas occurring to me include the use of artificial intelligence technology. Not the kind of stuff you can do in ImageMagick.

Likewise for high key images one can use the "center of blacks".

Now, one question: is there a simple way of sorting images into 3 categories?

- low key - use center of whites
- high key - use center of blacks
- without a key / normal - use center of edges
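One crude way to sort images into these three groups might be to threshold the overall mean brightness. This is purely a sketch and not something tested in this thread: `classify_key` is a hypothetical helper, the 0.3 and 0.7 cut-offs are arbitrary guesses that would need tuning, and mean brightness alone will misclassify plenty of photos. `photo.jpg` in the commented usage is a placeholder.

```shell
#!/bin/sh
# classify_key MEAN [LOW] [HIGH]
# MEAN is the normalized mean brightness (0..1); prints low, high or normal.
# The default cut-offs 0.3 / 0.7 are assumptions, not recommended values.
classify_key() {
    awk -v m="$1" -v lo="${2:-0.3}" -v hi="${3:-0.7}" 'BEGIN {
        m += 0; lo += 0; hi += 0            # force numeric comparison
        if (m < lo)      print "low"
        else if (m > hi) print "high"
        else             print "normal"
    }'
}

# Example use (needs ImageMagick, so it is left commented out):
#   mean=$(convert photo.jpg -colorspace gray -format "%[fx:mean]" info:)
#   classify_key "$mean"
```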

Extra ideas?...

Anthony,

How about showing my example by your method as it will take some days before your fixes get into the release. The reason I ask is that my std method using -blur did not produce a good result.

Fred

rnbc wrote:The method proposed by fmw42 fails miserably with portraits containing textured backgrounds in focus... but I can't come up with another simple one. Apart from that it seems pretty damn good.

I would probably adjust the edge detection to ignore small edges, such as those caused by patterns. Using a LoG (Laplacian of a Gaussian) edge detector, for example, will let you specify a 'sigma' radius that does exactly that. This will remove a lot of noise and look for the more important boundaries.

See Convolution, Log Edge Detection...
http://www.imagemagick.org/Usage/convolve/#log
as well as the other filters around it.

NOTE: the examples in the above use a grey center. You may want to solarize this so that 'no change' becomes zero. Also, while I have not gone into it, many other discussions in the forum have looked at 'zero crossing' edge detection with the LoG filter. Search for them... or ask Fred!

Yes, a LoG edge detector would likely be better than a simple edge detector. As Anthony said, it has the ability to filter out the very high frequency noise edges.

fmw42 wrote:Anthony,

How about showing my example by your method as it will take some days before your fixes get into the release. The reason I ask is that my std method using -blur did not produce a good result.

Fred

I have edited my 'correlate with offset' example to include the resulting image. The only difference between this and Fred's result is that the image is shifted by the 'offset', so that the maximum occurs at the position of the top-left corner rather than in the center.

If Convolve had been used instead, the result would have been a maximum at the bottom-right corner, due to the way that convolution uses a rotated kernel (to make its mathematics work better).

Until you get my patched update for Correlate with a Rectangle Kernel (so you can do the correct operation of correlate instead of convolve) you can get the same result by rotating the kernel yourself and using convolve. For example.

convert zelda3_edge1.png -virtual-pixel black -define convolve:scale=\! \
      -morphology Convolve Rectangle:64x64+63+63 -auto-level zelda_edge_density.png

Note that the offset was rotated. Yes, it seems very weird, and I myself think it is weird, but that is the way convolution works. A kernel rotated 180 degrees is the only difference between Convolve and Correlate.
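The rotated offset is just the kernel origin reflected through the kernel: for a W x H kernel with origin (x, y), a 180-degree rotation puts the origin at (W-1-x, H-1-y). A tiny sketch of that arithmetic, with a made-up helper name:

```shell
#!/bin/sh
# rotate_origin W H X Y -> origin of the same kernel after a 180-degree rotation
# (hypothetical helper, shown only to illustrate the arithmetic)
rotate_origin() {
    echo "+$(( $1 - 1 - $3 ))+$(( $2 - 1 - $4 ))"
}

rotate_origin 64 64 0 0    # the Rectangle:64x64+0+0 case, prints +63+63
```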

Also note that correlate in IM works fine for non-rectangular kernels, but then almost every other 'named kernel' built into IM is symmetrical, meaning that convolve and correlate will produce the same result anyway. This is why people often say and use convolve (a frequency multiplier) when what they really mean is correlate (for comparisons).

simple auto-cropping of photos?
