The JPEGs are scans of a book. They are only 1200 px high and saved at quality 60, so ideally I want them embedded in the PDF untouched. Since the process may alter some metadata, I can’t use md5sum to compare the original image with the one extracted from the PDF. What I need is for them to be visually identical, so I decided to run an SSIM comparison.
Code:
$ convert 00000001.jpg -compress JPEG 00000001.pdf
$ pdfimages -f 1 -l 1 -j 00000001.pdf testpage
$ ssim.sh 00000001.jpg testpage-000.jpg
Fx/Image//tmp/SSIM.17975[tmpI1.mpc]: 1199 of 1200, 100% complete
Mogrify/Image//tmp/SSIM.17975[tmpM1.mpc]: 1 of 2, 100% complete
Fx/Image//tmp/SSIM.17975[tmpI1.mpc]: 1199 of 1200, 100% complete
Fx/Image//tmp/SSIM.17975[tmpI2.mpc]: 1199 of 1200, 100% complete
Mogrify/Image//tmp/SSIM.17975[tmpM2.mpc]: 1 of 2, 100% complete
Fx/Image//tmp/SSIM.17975[tmpI2.mpc]: 1199 of 1200, 100% complete
Mogrify/Image//tmp/SSIM.17975[tmpI2.mpc]: 1 of 2, 100% complete
Fx/Image//tmp/SSIM.17975[tmpI1.mpc]: 1199 of 1200, 100% complete
Mogrify/Image//tmp/SSIM.17975[tmpM2.mpc]: 2 of 3, 100% complete
Fx/Image//tmp/SSIM.17975[tmpI1.mpc]: 1199 of 1200, 100% complete
Mogrify/Image//tmp/SSIM.17975[tmpC12.mpc]: 4 of 5, 100% complete
Fx/Image//tmp/SSIM.17975[tmpM1.mpc]: 1199 of 1200, 100% complete
ssim=0.998 dssim=0.002
I took another image, and SSIM returned 1. For every image except this one, what goes into the PDF is visually identical to the original. Then I remembered that 00000001.jpg was edited in GIMP: it is a cover with fabric that was originally scanned too dark, so I adjusted the levels. As an experiment, I took the source image as it was in the library, named it 00000001.orig.jpg, and reran the test.
Code:
$ convert 00000001.orig.jpg -compress JPEG "00000001.orig.pdf"
$ pdfimages -f 1 -l 1 -j 00000001.orig.pdf testpage.orig
$ ssim.sh 00000001.orig.jpg testpage.orig-000.jpg
Fx/Image//tmp/SSIM.21724[tmpI1.mpc]: 1199 of 1200, 100% complete
Mogrify/Image//tmp/SSIM.21724[tmpM1.mpc]: 1 of 2, 100% complete
Fx/Image//tmp/SSIM.21724[tmpI1.mpc]: 1199 of 1200, 100% complete
Fx/Image//tmp/SSIM.21724[tmpI2.mpc]: 1199 of 1200, 100% complete
Mogrify/Image//tmp/SSIM.21724[tmpM2.mpc]: 1 of 2, 100% complete
Fx/Image//tmp/SSIM.21724[tmpI2.mpc]: 1199 of 1200, 100% complete
Mogrify/Image//tmp/SSIM.21724[tmpI2.mpc]: 1 of 2, 100% complete
Fx/Image//tmp/SSIM.21724[tmpI1.mpc]: 1199 of 1200, 100% complete
Mogrify/Image//tmp/SSIM.21724[tmpM2.mpc]: 2 of 3, 100% complete
Fx/Image//tmp/SSIM.21724[tmpI1.mpc]: 1199 of 1200, 100% complete
Mogrify/Image//tmp/SSIM.21724[tmpC12.mpc]: 4 of 5, 100% complete
Fx/Image//tmp/SSIM.21724[tmpM1.mpc]: 1199 of 1200, 100% complete
ssim=1 dssim=0
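Since md5sum is thrown off by metadata, a quicker yes/no check is to compare ImageMagick's pixel signature (the `%#` format escape), which hashes the decoded pixel data rather than the file bytes. This is only a sketch: `same_pixels` is a helper name I made up, and a mismatch only says that the decoded data differ, not by how much, so it complements SSIM rather than replaces it.

```shell
# same_pixels FILE1 FILE2
# Hypothetical helper (not part of ImageMagick): succeeds when both files
# decode to the same pixel signature, ignoring metadata differences.
same_pixels() {
    sig1=$(identify -quiet -format '%#' "$1") || return 2
    sig2=$(identify -quiet -format '%#' "$2") || return 2
    # Treat an empty signature as an error, not as a match.
    [ -n "$sig1" ] && [ "$sig1" = "$sig2" ]
}

# Example call, using the file names from the test above:
#   if same_pixels 00000001.jpg testpage-000.jpg; then echo same; else echo differ; fi
```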
00000001.jpg
00000001.orig.jpg
OS: Gentoo, x86_64
Code:
$ convert -version | head -n 1
Version: ImageMagick 7.0.7-35 Q16 x86_64 2018-06-04 https://www.imagemagick.org
Code:
$ pdfimages --version |& head -n1
pdfimages version 0.65.0
***
I know that “Zip” compression is recommended when combining images into a PDF, but it increases the size 5–7 times, and the JPEGs that make up the book already take 100 MiB.
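For anyone who wants to reproduce that size comparison on a single page, here is a minimal sketch. `size_ratio` is a helper name I made up, and the PDF file names in the commented usage are placeholders.

```shell
# size_ratio FILE1 FILE2
# Hypothetical helper: prints size(FILE2) / size(FILE1) with one decimal place.
size_ratio() {
    bytes1=$(wc -c < "$1") || return 1
    bytes2=$(wc -c < "$2") || return 1
    # Avoid dividing by zero when the first file is empty.
    [ "$bytes1" -gt 0 ] || return 1
    awk -v a="$bytes1" -v b="$bytes2" 'BEGIN { printf "%.1f\n", b / a }'
}

# Example usage (placeholder output names):
#   convert 00000001.jpg -compress JPEG page.jpeg.pdf
#   convert 00000001.jpg -compress Zip  page.zip.pdf
#   size_ratio page.jpeg.pdf page.zip.pdf
```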