How do I determine if a pdf is raster or vector

Post by yazinsai »

Hello,

I'd like to know if there's a way to determine if a PDF file is raster or vector, using ImageMagick. I've spent the better part of half an hour looking for this and could not find the solution.

To clarify, I mean how do I determine if there are any raster elements in a PDF file.

Thanks!
When you learn, teach. When you get, give.
- Maya Angelou

Re: How do I determine if a pdf is raster or vector

Post by whugemann »

I am pretty sure that there is no way of doing that with ImageMagick, because IM is essentially a raster image processor and delegates PDF handling to Ghostscript, which rasterizes each page before IM ever sees it.

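Basically, you could search the PDF for the markers that identify raster objects. A PDF is not pure ASCII text (stream data is usually compressed or binary), but its object dictionaries are often stored as readable text, so a simple text processor such as sed, or even the operating system's text filter commands, may be enough to perform this job.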

You could also use pdfimages from Xpdf and try to extract images from the file. If that produces no output files, the PDF is definitely purely vector.
Wolfgang Hugemann
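The pdfimages suggestion above could be sketched roughly as follows. This is only a sketch under assumptions: it assumes the `pdfimages` tool (shipped with Xpdf, and also with Poppler) and `mktemp -d` are available, and the function name `pdf_has_raster_images` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: exit 0 if pdfimages extracts at least one image from
# the given PDF, exit 1 if it extracts none, exit 2 if the check cannot run.
# Assumes pdfimages (Xpdf or Poppler) is installed and on the PATH.
pdf_has_raster_images() {
    pdf=$1
    command -v pdfimages >/dev/null 2>&1 || return 2
    tmpdir=$(mktemp -d) || return 2
    # pdfimages writes one file per extracted image, named <root>-NNN.<ext>.
    pdfimages "$pdf" "$tmpdir/img" >/dev/null 2>&1
    # Count the files that were produced, then clean up.
    count=$(find "$tmpdir" -type f | wc -l | tr -d ' ')
    rm -rf "$tmpdir"
    [ "$count" -gt 0 ]
}
```

A caller might write `if pdf_has_raster_images upload.pdf; then echo "raster content found"; fi`, where `upload.pdf` is a placeholder name. Note that a plain `if` treats exit status 2 (tool missing) the same as "no images", so a stricter caller would inspect `$?` explicitly.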

Re: How do I determine if a pdf is raster or vector

Post by yazinsai »

Thanks - unfortunately, I'm running on a shared host and there's no way they are going to allow me to install Xpdf.
I'm sure the pros have some way of figuring this out.

===
To give you some context on the application: I'm running a website where users upload files that we print as large posters. I want to check whether an uploaded PDF is purely vector, in which case there is no limit on the print size, or contains raster images, in which case the print size is limited by the resolution of those images.

Re: How do I determine if a pdf is raster or vector

Post by yazinsai »

whugemann wrote: you could use a simple text processor such as sed or even the operating system's text filter commands to perform this job
Could you please suggest how this might work? I'm not so good with sed and I'm not sure what to search for.

Thanks a ton!
Yazin

Re: How do I determine if a pdf is raster or vector

Post by whugemann »

yazinsai wrote: Could you please suggest how this might work?
Remember, this is a forum on ImageMagick, not on PDF treatment. But I think you will quickly find out if you just open several PDFs that contain raster graphics in a text editor. A quick check of my own turned up "image/width" as a possible string to search for. If this is not contained in the PDF file, it's probably a pure vector PDF. But I don't know how foolproof this simple check is. You will possibly have to check for alternative methods of embedding raster graphics in the PDF, i.e. other clue words in the PDF.

If you take this approach, please let us know how successful it is.
Wolfgang Hugemann
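The survey suggested above, comparing several sample PDFs for clue words, could be sketched as a small shell function. The marker list is an assumption chosen for illustration, not an exhaustive set; `count_pdf_markers` is a made-up name; and the `-a` option (treat binary data as text) is assumed to be supported, as it is in GNU and BSD grep.

```shell
#!/bin/sh
# Hypothetical survey helper: for one PDF, print how many lines match each
# candidate marker, so that different sample files can be compared.
# Assumes a grep that supports -a (treat binary data as text).
count_pdf_markers() {
    for pattern in '/Subtype[[:space:]]*/Image' '/Image' '/Width'; do
        # grep -c prints the number of matching lines (0 if there are none).
        n=$(LC_ALL=C grep -a -c -E "$pattern" "$1")
        printf '%s\t%s\n' "$pattern" "$n"
    done
}
```

Running `count_pdf_markers sample.pdf` (with `sample.pdf` as a placeholder) on files known to be pure vector and on files known to contain images should show which markers actually separate the two groups. Markers that sit inside compressed streams will not be visible to grep at all.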

Re: How do I determine if a pdf is raster or vector

Post by yazinsai »

whugemann wrote: A quick check by myself offered "image/width" as a possible string to search for.
That's brilliant! Makes for a lovely hack. I'll try with a few pdf files and see what works best; thanks for the lead!! I'll update this when I find something out.

Re: How do I determine if a pdf is raster or vector

Post by yazinsai »

This worked perfectly. I used the following grep command to determine if a raster object existed:

grep -c -i "/image" thisfile.pdf

whugemann, you ROCK!
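One caveat with the loose pattern: a case-insensitive search for "/image" also matches names such as /ImageB, /ImageC and /ImageI, which many PDF writers list in a /ProcSet array even on pages that contain no images, so it can flag pure vector files. A stricter variant, sketched below, looks only for the "/Subtype /Image" entry that image XObject dictionaries carry. The function name `pdf_mentions_image_xobject` is made up for illustration, and a grep that supports `-a` is assumed. The check can still miss images whose dictionaries sit inside compressed object streams (PDF 1.5 and later), as well as inline images stored in compressed content streams.

```shell
#!/bin/sh
# Hypothetical helper: exit 0 if the file appears to contain an image XObject,
# exit 1 if no marker is found, exit 2 if the file cannot be read.
# Assumes a grep that supports -a (treat binary data as text).
pdf_mentions_image_xobject() {
    [ -r "$1" ] || return 2
    # Image XObjects carry "/Subtype /Image" in their dictionary; the space
    # between the two names is optional. PDF names are case-sensitive, so the
    # match is case-sensitive too.
    LC_ALL=C grep -a -q -E '/Subtype[[:space:]]*/Image' "$1"
}
```

In an upload handler this might be used as `if pdf_mentions_image_xobject thisfile.pdf; then echo "raster content likely"; fi`. A negative result is best read as "no marker found" and not as proof that the file is pure vector.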