Thanks snibgo, you've cleared some things up. I'm happy to hear about "pdfimages -list"; that's quite useful.
I now have a working solution that's verifiable. But I will mention some annoyances to warn others, and perhaps to serve as a note to the developers:
* ImageMagick alters the resolution when it's not told to do so. This violates the principle of least astonishment. E.g.
Code:
$ convert source_600dpi.pbm target_72dpi.pdf
$ pdfimages -list target_72dpi.pdf
page   num  type   width height color comp bpc  enc interp  object ID x-ppi y-ppi size ratio
--------------------------------------------------------------------------------------------
   1     0 image    5100  6601  gray    1   8  image  no         8  0    72    72  690K 2.1%
That's not good. I didn't tell it to change 600dpi to 72dpi. But I'm glad I can at least force convert to do the right thing using -density.
* The "identify -verbose" command often gives no resolution for raster images, which must have a resolution. And for PDFs, you say it gives the "overall" resolution, but when the PDF contains nothing but a single raster image, I expect the overall resolution to match that of the embedded image. Since it always shows 72dpi, I suspect the PDF may carry a separate resolution property for rendering/display. No big deal, but a PDF is perhaps the one case where it would actually be sensible for the identify command to omit the resolution, and instead it reports a value that doesn't match the objects inside.
* Regarding pdfimages, my version supports the "-all" parameter, which extracts images without conversion. When I convert a pbm to a pdf, ImageMagick apparently converts the pbm to a png file before wrapping it in a PDF even if I supply "-compress none". Unless perhaps there is some inherent problem with embedding pbm files in a PDF, this is unexpected.
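For anyone who lands here later, here is a rough sketch of why the 72 in the first point matters. It assumes the pixel counts in the pdfimages listing are those of the 600dpi source, and the helper name "inches" is made up for illustration (it is not part of ImageMagick or poppler). The commented-out convert line is my guess at a typical -density invocation with a hypothetical output name, not necessarily the exact command I ended up with.

```shell
#!/bin/sh
# Hypothetical helper: physical length in inches = pixels / pixels-per-inch.
inches() {
    awk -v px="$1" -v ppi="$2" 'BEGIN { printf "%.2f\n", px / ppi }'
}

inches 5100 600   # width at the intended 600 ppi -> 8.50
inches 5100 72    # width at the 72 ppi recorded in the PDF -> 70.83

# Assumed workaround (option placement and output filename are illustrative):
# convert -units PixelsPerInch -density 600 source_600dpi.pbm target_600dpi.pdf
```

So the same 5100 pixels describe a letter-width page at 600 ppi but a page nearly six feet wide at 72 ppi, which is why the recorded resolution needs checking with "pdfimages -list" after converting.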
Anyway, I can live with these things. Thanks for the help.