
get rid of duplicates in a directory

Posted: 2013-07-31T23:59:17-07:00
by Senmeis
Hello,

Many frames are extracted from a video file, which includes a PPT presentation. Since many frames are the same, I want to get rid of all the duplicates with “compare”. I know two graphics can be compared in this way:

compare -verbose graphic1.jpg graphic2.jpg /dev/null

But how can I compare all the images one after another and delete the duplicates?

Thanks
Owen

Re: get rid of duplicates in a directory

Posted: 2013-08-01T09:18:39-07:00
by fmw42
You will have to script a loop over each pair of images in your directory (one image vs. all the rest). Use compare and put the result into a variable, then test the variable against some threshold. If the difference is small enough, use rm to delete one of the files. Then repeat for the next image in the directory.

There is no IM-only solution, so this is really more of a scripting issue than one about IM.
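
The loop described above could look roughly like the following sketch. The function names `remove_duplicates` and `is_duplicate` are made up for illustration, and the `is_duplicate` shown here is only a stand-in that treats byte-identical files as duplicates (via `cmp`); for real video frames it would need to be replaced by a test built on compare and a threshold. Note that it deletes files, so try it on a copy of the directory first.

```shell
#!/bin/sh
# Stand-in test (an assumption, not an IM feature): byte-identical files only.
# Replace the body with an image-based comparison for real frames.
is_duplicate() {
    cmp -s "$1" "$2"
}

# Compare each remaining file in directory $1 against every file that
# sorts after it, and delete the later file of each duplicate pair.
remove_duplicates() {
    for a in "$1"/*; do
        [ -f "$a" ] || continue            # already deleted, or not a file
        past_a=no
        for b in "$1"/*; do
            # Skip everything up to and including $a so each pair is tested once.
            if [ "$b" = "$a" ]; then past_a=yes; continue; fi
            [ "$past_a" = yes ] || continue
            [ -f "$b" ] || continue
            if is_duplicate "$a" "$b"; then
                echo "removing $b (duplicate of $a)"
                rm -- "$b"
            fi
        done
    done
}
```

Calling `remove_duplicates frames` would then keep the first image of each group of duplicates in a (hypothetical) directory named `frames`. Comparing every pair gets slow for many frames; if duplicates only ever occur in consecutive frames, comparing each frame with the last kept one would be enough.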

Re: get rid of duplicates in a directory

Posted: 2013-08-02T08:06:44-07:00
by snibgo
As fmw42 says, IM can readily find a difference number, eg:

compare -metric RMSE frame_000045.tiff frame_000046.tiff NULL: 2>diff_000045_000046.txt
What you do with these numbers depends on what exactly you want to do.

Re: get rid of duplicates in a directory

Posted: 2013-08-02T10:38:10-07:00
by fmw42
snibgo wrote: As fmw42 says, IM can readily find a difference number, eg:

compare -metric RMSE frame_000045.tiff frame_000046.tiff NULL: 2>diff_000045_000046.txt
What you do with these numbers depends on what exactly you want to do.

If you want to put it into a variable then use

var=`compare -metric RMSE frame_000045.tiff frame_000046.tiff NULL: 2>&1`

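You can add a pipe to sed (or to tr and cut) to extract one of the two values returned. With -metric RMSE, compare prints output of the form 1716.08 (0.0261857): the first number is in raw quantum units, so it depends on the quantum depth IM was compiled with, while the second, in parentheses, is normalized to the range 0 to 1. I usually go with the second, since it is not IM compile dependent.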

var=`compare -metric RMSE frame_000045.tiff frame_000046.tiff NULL: 2>&1 | tr -cs ".0-9" " " | cut -d' ' -f2`

Then you can test $var against some fixed threshold value and decide to rm the file or not and continue the loop.
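
As a rough sketch of that threshold test: the helper names `normalized_rmse` and `is_duplicate` and the default threshold of 0.01 are assumptions for illustration, not anything IM provides. This version uses sed to pick out the value in parentheses and awk for the comparison, because the shell cannot compare floating-point numbers itself. It also relies on compare exiting with status 2 on an error (0 or 1 otherwise), so that an error message is never mistaken for a number.

```shell
#!/bin/sh
# Print the normalized RMSE (the value in parentheses, 0 to 1) for two images.
normalized_rmse() {
    # compare writes the metric to stderr; exit status 2 signals an error,
    # while 0 or 1 only says whether the images are similar or dissimilar.
    out=$(compare -metric RMSE "$1" "$2" NULL: 2>&1)
    [ $? -le 1 ] || return 2
    printf '%s\n' "$out" | sed -n 's/.*(\(.*\)).*/\1/p'
}

# Succeed when the difference is at or below the threshold.
# $3 is the threshold; 0.01 is an arbitrary default that needs tuning.
is_duplicate() {
    value=$(normalized_rmse "$1" "$2") || return 1
    [ -n "$value" ] || return 1
    awk -v v="$value" -v t="${3:-0.01}" 'BEGIN { exit !(v + 0 <= t + 0) }'
}
```

Inside the loop this could be used as `if is_duplicate "$a" "$b" 0.02; then rm -- "$b"; fi`. A suitable threshold depends on how noisy the frames are, so it is worth printing the values for a few known pairs before deleting anything.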

Re: get rid of duplicates in a directory

Posted: 2013-08-04T23:28:40-07:00
by anthony
If you can read the frames into memory, you can also use some of the GIF animation tests with -fuzz to see what has changed.
However, videos are notoriously noisy and of lower quality, though that may not be noticeable when actually playing.