how to detect corrupt eps

kriks
Posts: 114
Joined: 2008-01-04T05:52:03-07:00

how to detect corrupt eps

Post by kriks »

Hello,

I'm trying to detect corrupt EPS files.

For a JPEG, I run:

Code: Select all

identify -verbose "$path" 2>&1 | grep -c Corrupt
But for an EPS, ImageMagick does not see the corruption. I suppose that is because Ghostscript does the first conversion.

Does anyone have a good way to detect corruption in an EPS?
anthony
Posts: 8883
Joined: 2004-05-31T19:27:03-07:00
Location: Brisbane, Australia

Re: how to detect corrupt eps

Post by anthony »

Ghostscript is what IM uses under the hood to read EPS. For detecting corruption in an EPS, it is probably the better tool.
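As a rough sketch of what asking Ghostscript directly could look like: the function names below are hypothetical, and the check assumes that Ghostscript exits with a non-zero status on the kind of damage you have. It may recover silently from some errors, so test it on a known-bad file first.

```shell
# Hypothetical helper: succeeds if Ghostscript interprets the file without error.
# The nullpage device discards all rendered output, so only the exit status matters.
eps_renders_cleanly() {
    gs -q -dSAFER -dBATCH -dNOPAUSE -sDEVICE=nullpage "$1" >/dev/null 2>&1
}

# Hypothetical wrapper: prints a one-line verdict for the given file.
check_eps() {
    if eps_renders_cleanly "$1"; then
        echo "OK $1"
    else
        echo "CORRUPT $1"
    fi
}
```

Usage would be something like `check_eps 152169.eps`, or a loop over `*.eps`.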
Anthony Thyssen -- Webmaster for ImageMagick Example Pages
https://imagemagick.org/Usage/

Re: how to detect corrupt eps

Post by kriks »

Yes, I'm trying to use GS to extract the raw image data (there is a "bit" DEVICE), but without success.

I found that in an EPS, the image data sits between the strings "%%BeginBinary:[ 0-9]*^Mbeginimage" and "~>^M%%EndBinary" (^M being a carriage return).
So I can tell whether the file is truncated by counting each of them.

Example:

Code: Select all

# grep -c "%%BeginBinary:[ 0-9]*^Mbeginimage" 152168.eps
1
# grep -c "~>^M%%EndBinary" 152168.eps
1
# grep -c "%%BeginBinary:[ 0-9]*^Mbeginimage" 152169.eps
1
# grep -c "~>^M%%EndBinary" 152169.eps
0
152168.eps is OK; 152169.eps is truncated.
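The two counts can be wrapped into a single test. This is only a sketch: the function name is hypothetical, and instead of matching a literal ^M it first turns carriage returns into line feeds with tr, so that each marker starts its own line. It only compares the counts, so a file cut off before any %%BeginBinary would still pass.

```shell
# Hypothetical helper: succeeds when the number of %%BeginBinary lines
# equals the number of %%EndBinary lines.
eps_markers_balanced() {
    # Convert CR to LF so grep sees one marker per line.
    begins=$(tr '\r' '\n' < "$1" | grep -c '^%%BeginBinary:')
    ends=$(tr '\r' '\n' < "$1" | grep -c '^%%EndBinary')
    [ "$begins" -eq "$ends" ]
}
```

For instance, `eps_markers_balanced 152169.eps || echo "152169.eps looks truncated"`.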

But I would prefer to extract the image's binary data and run identify on it, which would probably be safer.
Maybe you have an idea?

Re: how to detect corrupt eps

Post by anthony »

The EPS is probably using return characters instead of line feeds as line endings (those are the ^M chars). That is the old Mac text format; DOS or Windows text format uses a return followed by a line feed.

This means that "grep" probably considers the WHOLE FILE as a single line!!!! Or at least as just a few very long lines.
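A small demonstration of the effect, using made-up sample text rather than a real EPS (the helper name is hypothetical):

```shell
# Hypothetical helper: counts the lines grep sees on standard input.
count_lines() { grep -c ''; }

# Three "lines" separated by carriage returns only: grep sees a single line.
printf 'one\rtwo\rthree\r' | count_lines                  # prints 1

# After converting CR to LF, grep sees three lines.
printf 'one\rtwo\rthree\r' | tr '\r' '\n' | count_lines   # prints 3
```

So "grep -c" on such a file can never report more than 1 for a pattern, however many times it occurs.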

The best idea would probably be to convert the file to UNIX text format, then extract a block of lines, before converting back (or not: EPS allows UNIX text format too!)

For example:

Code: Select all

   cat input_file | tr '\015' '\012' | sed -n '/^%%BeginBinary:/,$p; /^%%EndBinary/q' > extracted_text
This kind of conversion is also typically available as a "dos2unix" command (or "mac2unix" for files that use returns only).

That is only ONE way of extracting a range of lines from a file. I know many more, which I listed in
http://www.ict.griffith.edu.au/anthony/ ... file.hints
(search for "range of lines").
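One such alternative, as a sketch only: the two-address range form of sed. The function name is hypothetical. Unlike a version that quits at the first %%EndBinary, this one prints every %%BeginBinary ... %%EndBinary section in the file.

```shell
# Hypothetical helper: prints each block from a %%BeginBinary: line
# through the next %%EndBinary line, after converting CR to LF.
extract_binary_section() {
    tr '\r' '\n' < "$1" | sed -n '/^%%BeginBinary:/,/^%%EndBinary/p'
}
```

For example, `extract_binary_section input_file > extracted_text`. If the file is truncated and has no %%EndBinary, the range simply runs to the end of the file.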


However, none of this is ImageMagick-specific; it is shell scripting.

Re: how to detect corrupt eps

Post by kriks »

Thank you, that's interesting.

The grep command is fast enough for my needs, but I'm still not sure it covers every case.

I would rather trust GS for the extraction, if possible.