compare TIFF and PDF files
Posted: 2009-07-13T13:47:07-07:00
Colleagues
I am currently doing a mass TIFF to PDF conversion project for our PLM system, about 150k files. All files are contained in the database. I am using a batch processing macro in Acrobat 9 ProX to perform the conversion. While the conversions appear to go OK, when I review the document properties with IDENTIFY I see that the original TIFF image may be 300x300 DPI while the resultant PDF is reported as 72x72, which is the default PDF resolution (PPI).
I need to create the resultant conversion at the same resolution as the original, create it in PDF/A compliant format (ISO 19005-1), and validate the image between the original file (TIFF) and the converted file (PDF).
I have used COMPARE to validate differences between TIFF files and it works great. However, when attempting to do the same between the original TIFF file and the PDF, I get the following error:
# compare -fuzz 10% 1460D0100_SH1.tif 1460D0100_SH1.pdf cmpr.gif
compare: image size differs `1460D0100_SH1.tif' @ compare.c/CompareImageChannels/153.
This of course makes sense if the file resolution is being changed. We are still determining whether the change in resolution will affect the downstream readability of the file, yet intuitively one would want to maintain the same resolution.
One thought was that the user coordinate system in the PDF was obscuring the actual raster image resolution. However, that cannot be validated, at least with the options I have tried with VALIDATE.
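To make the user coordinate system idea concrete: PDF page dimensions are expressed in default user space units of 1/72 inch, so the pixel size of a rasterized page depends entirely on the density chosen at rendering time, and ImageMagick uses 72 dpi for PDF input unless told otherwise. The following is only a sketch of that arithmetic. The function name `px_at_density` and the 11 inch (792 point) sheet edge are hypothetical, not taken from the actual drawings.

```shell
# Sketch: approximate pixel extent of one PDF page dimension at a given density.
# PDF default user space is 1/72 inch per unit. Sheet size below is hypothetical.
px_at_density() {
    # $1 = page dimension in points (integer), $2 = rendering density in dpi
    echo $(( ($1 * $2 + 36) / 72 ))   # +36 rounds to the nearest pixel
}

px_at_density 792 72     # 11 in edge at the 72 dpi default -> 792
px_at_density 792 300    # same edge at 300 dpi             -> 3300
```

If that is what is happening here, the "image size differs" error would follow directly: the same sheet edge is about 3300 pixels in the 300 dpi TIFF but only 792 pixels in the PDF rendered at the default density.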
My attempted course of action was to convert both the TIFF and the PDF to a common image format, and then compare them. I tried this, but the PDF to GIF conversion produced a very poor image, and the comparison of course failed (note the file sizes):
07/07/2009 01:26 PM 150,745 1460D0100_SH1.pdf
06/02/2009 04:26 PM 148,590 1460D0100_SH1.tif
07/13/2009 04:23 PM 23,245 cmpr_pdf.gif
07/13/2009 04:23 PM 360,890 cmpr_tiff.gif
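One possible variation on the common-format approach is to rasterize the PDF at the TIFF's density before comparing, so that both rasters have matching pixel dimensions. This is a minimal sketch, not a tested procedure. It assumes the ImageMagick 6 command names `convert` and `compare`, Ghostscript available for reading PDFs, a single-page PDF (only page 0 is rendered), and a 300 dpi default. The function name `compare_tiff_pdf` and the intermediate file name are hypothetical.

```shell
# Sketch: render the PDF at a chosen density, then compare it with the TIFF.
# Assumes ImageMagick 6 (convert, compare) plus Ghostscript for PDF input.
compare_tiff_pdf() {
    tiff=$1
    pdf=$2
    density=${3:-300}      # assumed default; use the TIFF's real resolution
    diff=${4:-cmpr.png}    # difference image written by compare
    tmp="${pdf%.pdf}_${density}dpi.png"

    # -density must come before the PDF so it applies while the page is rasterized
    convert -density "$density" "${pdf}[0]" "$tmp" || return 2

    # AE = number of pixels that differ beyond the fuzz tolerance (printed on stderr)
    compare -metric AE -fuzz 10% "$tiff" "$tmp" "$diff"
}
```

It could be called as `compare_tiff_pdf 1460D0100_SH1.tif 1460D0100_SH1.pdf 300 cmpr.png`. PNG is used for the intermediate file because it is lossless. Rounding may still leave the two rasters a pixel apart in size, in which case compare would report the same size error again.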
I had also thought that if I could extract a raw uncompressed raster image from each file and then compute an MD5 checksum, that might offer another way of validating the images. However, I am not familiar enough with the tools to do that for either the TIFF or the PDF.
Hence, I am at a loss for a course of action, both for making the PDF conversion yield the proper resolution and PDF/A compliance, and for a mechanism to provide quality assurance that the TIFF and the resulting PDF images are the same, within a reasonable tolerance.
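A rough sketch of the checksum idea, under several assumptions: ImageMagick 6's `convert` is on the path, `md5sum` is available (it is not part of a stock Windows install), and 8-bit grayscale is an adequate common representation for these drawings. The function name `raster_md5` is hypothetical.

```shell
# Sketch: MD5 of the raw, uncompressed 8-bit grayscale samples of an image.
# Assumes ImageMagick 6 (convert) and md5sum; the 300 dpi default is an assumption.
raster_md5() {
    # $1 = input file (TIFF or PDF), $2 = density used when rasterizing a PDF
    # gray:- writes headerless raw samples to stdout, so only pixel data is hashed
    convert -density "${2:-300}" "$1" -depth 8 gray:- | md5sum | cut -d' ' -f1
}
```

The two digests match only if every sample is identical, so any resampling, anti-aliasing, or recompression during the PDF conversion or rendering would make them differ even when the drawings look the same. That makes this a strict equality test rather than a tolerance-based one. ImageMagick's `identify -format '%#'` also reports a signature computed from the pixel data and might serve the same purpose, and a tool such as pdfimages from Xpdf could be used to pull the embedded image out of the PDF directly instead of re-rendering the page.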
Thank you in advance for your assistance.
Regards
John Lopez
PDM Architect
Goodrich Corp.