trouble with pdf pages of different sizes.
Hi.
We are having trouble extending our use of ImageMagick (version 6.0.7) to converting PDF pages into JPEG images.
Currently we are doing this and it works fine. It generates a JPEG for each page:
mogrify -rotate "90<" -resize 800 -format jpeg document.pdf
However, we now have PDF documents whose first page is a different size (larger) than the subsequent pages. ImageMagick seems to use the geometry of that first page no matter what we do. For example, this does not work:
mogrify -rotate "90<" -crop 567x439+0+0 -resize 800 -format jpeg document.pdf
The cropped image is correct, but it is not getting resized up to 800. Any suggestions would be great. Thanks.
- anthony
- Posts: 8883
- Joined: 2004-05-31T19:27:03-07:00
- Authentication code: 8675308
- Location: Brisbane, Australia
Re: trouble with pdf pages of different sizes.
Yes, don't use mogrify. mogrify is fine for simple operations that involve simple single images.
I suggest you use a looped convert instead. Just make sure you read the image in before
you modify it with operation options.
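A minimal sketch of such a looped convert, assuming a POSIX shell. The function name `pdf_pages_to_jpeg`, the page-count argument, and the output file names are assumptions for illustration and do not come from the thread. The point it shows is the ordering: each page is read (`document.pdf[N]`) before any operation is applied to it.

```shell
# Sketch: convert each PDF page separately, reading the page BEFORE the operations.
# pdf_pages_to_jpeg, the page-count argument, and the output names are hypothetical.
pdf_pages_to_jpeg() {
    pdf=$1      # e.g. document.pdf
    count=$2    # number of pages, supplied by the caller (assumption)
    i=0
    while [ "$i" -lt "$count" ]; do
        # "file.pdf[N]" selects page N (zero-based); the operations follow the input.
        convert "${pdf}[$i]" -rotate "90<" -resize 800 "${pdf%.pdf}-$i.jpg"
        i=$((i + 1))
    done
}
```

Calling `pdf_pages_to_jpeg document.pdf 3` would then run convert once per page and write document-0.jpg through document-2.jpg.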
Anthony Thyssen -- Webmaster for ImageMagick Example Pages
https://imagemagick.org/Usage/
Re: trouble with pdf pages of different sizes.
I am trying the convert utility now, but still having trouble. Adding a resize after the crop seems to stop the crop from working. For example:
convert -rotate "90<" -crop 819x646+0+0 -format jpeg document.pdf[2] x.jpg
crops the correct image area, however, after adding -resize option, the crop seems to no longer have any effect and the entire image is resized:
convert -rotate "90<" -crop 819x646+0+0 -resize 800 -format jpeg document.pdf[2] x.jpg
If I place the resize before the crop they both work, but this is not a convenient way of working:
convert -rotate "90<" -resize 2237 -crop 800x620+0+0 -format jpeg document.pdf[2] x.jpg
Thanks for your help.
- anthony
Re: trouble with pdf pages of different sizes.
It may be an interaction with 'virtual canvas' information crop leaves behind.
Try adding a +repage after the crop.
See Removing Canvas/Page Geometry
http://www.imagemagick.org/Usage/crop/#crop_repage
After the resize you can also try -set page A4
to set the image's 'virtual canvas' or 'page' for the PDF.
Please let me know how it goes.
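As a rough sketch, the suggestion would look like this when applied to the earlier command, with the page read first and `+repage` placed directly after the crop. The wrapper name `crop_repage_resize` and its two arguments are hypothetical; the crop geometry, the rotate setting and `-resize 800` are taken from the commands posted earlier in the thread.

```shell
# Sketch: crop, drop the leftover virtual-canvas information, then resize.
# crop_repage_resize is a hypothetical wrapper, not an ImageMagick command.
crop_repage_resize() {
    input=$1    # e.g. 'document.pdf[2]'
    output=$2   # e.g. x.jpg
    # +repage resets the page geometry that -crop leaves behind, so the
    # following -resize works on the cropped area only.
    convert "$input" -rotate "90<" -crop 819x646+0+0 +repage -resize 800 "$output"
}
```

For example: `crop_repage_resize 'document.pdf[2]' x.jpg`.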
Anthony Thyssen -- Webmaster for ImageMagick Example Pages
https://imagemagick.org/Usage/
Re: trouble with pdf pages of different sizes.
Thank you Anthony for your posts.
The +repage doesn't have any effect.
I think the real problem is that the page geometry is being set by the first page in the PDF, which is unusually wide. The code we have working for the subsequent pages now needs to be modified to treat those pages as though they were as wide as the first page: a complex exercise in measurement transformations.
Thanks again for your help.
- anthony
Re: trouble with pdf pages of different sizes.
Sorry to hear it didn't work. Check with -identify before your save to PDF and see
what IM has to work with at that point.
In any case, let us know what you come up with. Don't just leave us hanging, as others may have similar problems.
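One way to do that check, sketched under assumptions: `-identify` is placed in the operation sequence so that convert prints a one-line description of the image at that point, here once after the crop and once after the resize. The wrapper name `inspect_before_save` is hypothetical, and the operations are simply the ones from the commands earlier in the thread.

```shell
# Sketch: print what IM holds after the crop and again just before saving.
# inspect_before_save is a hypothetical wrapper name.
inspect_before_save() {
    input=$1    # e.g. 'document.pdf[2]'
    output=$2   # e.g. x.jpg
    # Each -identify reports size and page geometry at that step of the sequence.
    convert "$input" -rotate "90<" -crop 819x646+0+0 +repage -identify \
        -resize 800 -identify "$output"
}
```

Comparing the two printed lines should show whether the crop result or the original page geometry is what reaches the resize.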
Anthony Thyssen -- Webmaster for ImageMagick Example Pages
https://imagemagick.org/Usage/
Re: trouble with pdf pages of different sizes.
I think I am experiencing something similar.
Basically, I have a bunch of PDFs that I want to create thumbnails for. Some are docs, but many more are PowerPoints or Excel charts that may have odd sizes. It appears that IM is taking the PDF, rasterizing it, and then fitting it to an A4 sheet of paper.
What I'd like: [image]
What I'm getting: [image]
...or in the case of a document:
What I'd like: [image]
What I'm getting: [image]
For the document, it seems to just be adding height to the page (width seems OK), and then centering the content. For the PowerPoint PDF, it adds white space at the top.
The PDFs look like the "What I'd Like" thumbs. Any further suggestions?
Nate
Re: trouble with pdf pages of different sizes.
I've tried a number of different options with "+repage" and "-size" to the same result.
A sample "identify" from a PDF is listed below:
Image: 1090.pdf
Format: PDF (Portable Document Format)
Geometry: 612x842
Class: DirectClass
Type: TrueColor
Endianess: Undefined
Colorspace: RGB
Channel depth:
Red: 8-bits
Green: 8-bits
Blue: 8-bits
Channel statistics:
Red:
Min: 0 (0)
Max: 255 (1)
Mean: 242.061 (0.949257)
Standard deviation: 44.952 (0.176282)
Green:
Min: 0 (0)
Max: 255 (1)
Mean: 242.593 (0.951345)
Standard deviation: 42.9759 (0.168533)
Blue:
Min: 0 (0)
Max: 255 (1)
Mean: 245.41 (0.96239)
Standard deviation: 35.6866 (0.139947)
Colors: 1605
Rendering-intent: Undefined
Resolution: 72x72
Units: Undefined
Filesize: 1.5mb
Interlace: None
Background Color: white
Border Color: #DFDFDF
Matte Color: grey74
Page geometry: 612x842+0+0
Dispose: Undefined
Iterations: 0
Compression: Undefined
Orientation: Undefined
Comment: Image generated by ESP Ghostscript (device=pnmraw)
Signature: 6475bcba2a703dd4e34d24ef04b07ebd3ad26be2f6486b1b6305d7bd1017da16
Tainted: False
Version: ImageMagick 6.2.4 02/16/07 Q16 http://www.imagemagick.org
Image: 1090.pdf
Format: PDF (Portable Document Format)
Geometry: 612x842
Class: PseudoClass
Type: Bilevel
Endianess: Undefined
Colorspace: Gray
Channel depth:
Gray: 1-bits
Channel statistics:
Gray:
Min: 1 (1)
Max: 1 (1)
Mean: 1 (1)
Standard deviation: 0 (0)
Colors: 2
Histogram:
515304: (255,255,255) white
Rendering-intent: Undefined
Resolution: 72x72
Units: Undefined
Filesize: 1.5mb
Interlace: None
Background Color: white
Border Color: #DFDFDF
Matte Color: grey74
Page geometry: 612x842+0+0
Dispose: Undefined
Iterations: 0
Scene: 1
Compression: Undefined
Orientation: Undefined
Comment: Image generated by ESP Ghostscript (device=pnmraw)
Signature: 5bb6259c6b6bc718959ca769757433d97c00d3da7e974fbdf0f8e45f84823597
Tainted: False
User Time: 0.830u
Elapsed Time: 0:02
Pixels per second: 503kb
Version: ImageMagick 6.2.4 02/16/07 Q16 http://www.imagemagick.org
- anthony
Re: trouble with pdf pages of different sizes.
It may be that IM always uses a default page size for PDF and PostScript conversion. That is, it is not sizing the page to the size given somewhere, somehow, in the PDF document. The image size above is roughly that of an A4 page at 72 dpi resolution.
It is probably caused by the way IM uses Ghostscript to do the image conversion.
This could be classed as a bug or a feature, though in your case it is probably a bug.
The only solution I can see is to try to set the page size and density correctly based on information in the PDF file. How IM can do this, or how you can do this, is the question.
If you would like to see this fixed, all I can suggest is to work out how to get the page info, and put in a 'bug report' with the solution to be incorporated into IM. Without a solution or method to fix it, any report is likely to take time, as it will go onto a to-do list.
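For reference, here is a quick calculation of what an A4 page (210 mm x 297 mm) comes to at 72 dpi, using shell integer arithmetic. The helper name `mm_to_px` is hypothetical. The result is 595x842, so the 612x842 geometry reported above matches A4 in height, while 612 is the width of US Letter (8.5 in x 72). That is only an observation about the numbers; the thread does not confirm where the mix comes from.

```shell
# Sketch: convert a length in millimetres to pixels at a given dpi, rounded.
# mm_to_px is a hypothetical helper; 25.4 mm = 1 inch.
mm_to_px() {
    mm=$1
    dpi=$2
    # mm * dpi / 25.4, scaled by 10 to stay in integers, rounded to nearest.
    echo $(( (mm * dpi * 10 + 127) / 254 ))
}

a4_w=$(mm_to_px 210 72)   # A4 width, 210 mm
a4_h=$(mm_to_px 297 72)   # A4 height, 297 mm
echo "A4 at 72 dpi: ${a4_w}x${a4_h}"   # prints: A4 at 72 dpi: 595x842
```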
Anthony Thyssen -- Webmaster for ImageMagick Example Pages
https://imagemagick.org/Usage/
Re: trouble with pdf pages of different sizes.
I've kept plugging away at this without resolution until today. A very, very small step forward, but a step forward nonetheless. On a whim I tried using the ImageMagick Studio online to see what results it gave...and it works perfectly.
Example: 1373.pdf is a PowerPoint converted to PDF with OpenOffice.org 2...
On my local system, typing "identify 1373.pdf" provides:
1373.pdf[0] PDF 720x842 720x842+0+0 DirectClass 26.0mb 0.660u 0:02
1373.pdf[1] PDF 720x842 720x842+0+0 DirectClass 26.0mb 0.620u 0:02
1373.pdf[2] PDF 720x842 720x842+0+0 DirectClass 26.0mb 0.580u 0:02
1373.pdf[3] PDF 720x842 720x842+0+0 DirectClass 26.0mb 0.540u 0:02
1373.pdf[4] PDF 720x842 720x842+0+0 DirectClass 26.0mb 0.490u 0:02
1373.pdf[5] PDF 720x842 720x842+0+0 DirectClass 26.0mb 0.450u 0:02
1373.pdf[6] PDF 720x842 720x842+0+0 DirectClass 26.0mb 0.400u 0:02
1373.pdf[7] PDF 720x842 720x842+0+0 DirectClass 26.0mb 0.360u 0:02
1373.pdf[8] PDF 720x842 720x842+0+0 DirectClass 26.0mb 0.340u 0:02
1373.pdf[9] PDF 720x842 720x842+0+0 DirectClass 26.0mb 0.290u 0:02
1373.pdf[10] PDF 720x842 720x842+0+0 DirectClass 26.0mb
1373.pdf[11] PDF 720x842 720x842+0+0 DirectClass 26.0mb
1373.pdf[12] PDF 720x842 720x842+0+0 DirectClass 26.0mb
1373.pdf[13] PDF 720x842 720x842+0+0 DirectClass 26.0mb
1373.pdf[14] PDF 720x842 720x842+0+0 DirectClass 26.0mb
...the important part is the 720x842. This particular PowerPoint file is wider than it is tall, and opening it in any PDF viewer seems to work (i.e. it looks as it should).
Using the ImageMagick Studio identify command, I get:
Image: 1373.pdf
Base filename: MagickStudio.mpc
Format: pdf (Portable Document Format)
Class: DirectClass
Geometry: 720x540+0+0
Type: Palette
Endianess: Undefined...
...or the correct size (at least in ratio). The conversion process, then, also produces a correctly sized image.
So, the question is, where is my system out of date in comparison to IMStudio? I am running IM 6.2.4 and GhostScript 815.04, both of which seem to be the most recent stable builds. Is it possible that IMStudio isn't using GS? What would I substitute?
Hopefully this will shed some light on the matter.
Thanks!
Nate
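When comparing listings like the one above, it can help to reduce the one-line-per-page identify output to a count of distinct page sizes. The sketch below assumes the field layout seen in this thread (name[index], format, WIDTHxHEIGHT, ...); `distinct_page_sizes` is a hypothetical helper name.

```shell
# Sketch: summarise one-line-per-page identify output as "SIZE COUNT" lines.
# distinct_page_sizes is hypothetical; field 3 is assumed to be WIDTHxHEIGHT.
distinct_page_sizes() {
    awk '{ n[$3]++ } END { for (s in n) print s, n[s] }' | sort
}
```

Used as `identify 1373.pdf | distinct_page_sizes`, a single output line means every page was rasterized to the same geometry.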
Re: trouble with pdf pages of different sizes.
...continuing...
My box uses ESP Ghostscript and IM Studio uses GNU Ghostscript (although what the difference is, I do not know). I would guess the problem lies in the ESP Ghostscript converter.
Any ideas on how to install the GNU Ghostscript converter in Ubuntu and have IM use it instead of the ESP converter?
Nate
Re: trouble with pdf pages of different sizes.
... Removing gs-esp from the machine and installing gs-gpl (apparently the new name of GNU GS) has changed the "Image Generated By..." line to
Comment: Image generated by GPL Ghostscript (device=pnmraw)
...installing the AFPL Ghostscript is the same yet again.
Comment: Image generated by AFPL Ghostscript (device=pnmraw)
In all cases, the information stays the same...that is to say, wrong. IMStudio translates the files great though, so I must be close-ish.
Nate