Disable transparent background

Post by **Flokker** » 2019-07-19T07:51:00-07:00

Hi,

I have to extract some pages from a PDF file to TIFF for the company I work for. The command

convert -compress zip -density 300x300 +adjoin file.pdf[1] output1.tif

results in a TIFF file with a transparent frame around the rest of the image. I think this is because of the OCR Acrobat did. The source was a scanned image that was imported into Acrobat, which ran OCR to make the PDF searchable. Acrobat also straightened the page.

How can I prevent that every time I convert a PDF to TIFF?

I'm on Ubuntu and I use ImageMagick 6.9.7.

Post by **snibgo** » 2019-07-19T08:01:38-07:00

You have "a frame of transparent around the rest of the image". What do you want instead? You might "-trim" to remove it. Or flatten against a white background (or any colour you want): "-background white -layers flatten".
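The flatten suggestion could be applied to the original command roughly as follows. This is a minimal sketch: the function name `pdf_page_flattened` and its argument order are made up for illustration, and the density and compression settings are simply carried over from the first post.

```shell
# Hypothetical helper (name and arguments are illustrative, not part of ImageMagick).
# $1 = input PDF, $2 = zero-based page index, $3 = output TIFF
pdf_page_flattened() {
    # -density must come before the input so Ghostscript rasterizes at 300 dpi;
    # -layers flatten composites the page onto the white background colour.
    convert -density 300x300 "${1}[${2}]" \
        -background white -layers flatten \
        -compress zip "$3"
}
```

Calling `pdf_page_flattened file.pdf 1 output1.tif` would then cover the same page as the command in the first post. Replacing the flatten step with `-trim` should remove the border instead of filling it, at the cost of changing the image dimensions.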

Post by **Flokker** » 2019-07-19T08:06:51-07:00

When I open the PDF there is no such frame. The background is white like the rest of the page (except the black text). What I want is to output the page as I see it in the PDF.

Post by **snibgo** » 2019-07-19T08:41:49-07:00

You might try one of the pdf defines, eg use-trimbox. See http://www.imagemagick.org/script/comma ... php#define

Post by **Flokker** » 2019-07-19T13:49:37-07:00

What works is pdf:use-cropbox=true, but I don't understand why. There isn't anything cropped.
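For reference, the define has to be given before the input file so that it is in effect when the PDF is read. A sketch of the full command, wrapped in a hypothetical function (`pdf_page_cropbox` is an illustrative name), might look like this:

```shell
# Hypothetical helper (illustrative name).
# $1 = input PDF, $2 = zero-based page index, $3 = output TIFF
pdf_page_cropbox() {
    # pdf:use-cropbox=true asks the PDF reader to render the page's CropBox
    # rather than the full page area; it must precede the input filename.
    convert -define pdf:use-cropbox=true -density 300x300 "${1}[${2}]" \
        -compress zip "$3"
}
```

A likely explanation, assuming the PDF does carry a crop box smaller than the full page, is that a viewer shows only the crop box while the default conversion renders the larger area, leaving the extra margin transparent.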

Post by **snibgo** » 2019-07-19T14:40:32-07:00

Flokker wrote: There isn't anything cropped.

I suspect there is, and that if you read the PDF with a text editor you will see a "/CropBox" specification.

Post by **Flokker** » 2019-07-20T00:05:26-07:00

I cannot open the PDF with a text editor. It's a PDF, not a text file.

Is there no other way to simply extract the pdf "as they are"?

Post by **snibgo** » 2019-07-20T01:37:02-07:00

In Windows, PDF files can be opened with Microsoft Wordpad to view the file as raw text. I expect Unix has similar tools.
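On Ubuntu, one way to look for the entry without opening an editor is to search the raw bytes with `grep`. The sketch below uses a hypothetical function name, `has_cropbox`, and it only works when the page dictionary is stored uncompressed; PDFs that pack their objects into compressed streams would need a PDF-aware tool such as `pdfinfo -box` from poppler-utils instead.

```shell
# Hypothetical helper (illustrative name): succeeds if the literal string
# "/CropBox" appears anywhere in the file.
# -a treats the binary PDF as text, -F matches a fixed string, -q prints nothing.
has_cropbox() {
    grep -a -q -F '/CropBox' "$1"
}
```

It could be used as `has_cropbox file.pdf && echo "CropBox found"`.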

I don't know what "as they are" means. If the PDF has a cropbox, but also has content outside the cropbox, which version is the "real" one? The content might be registration marks that would be cut off a printed paper version. IM gives you the choice: use a cropbox (if the PDF has one) or don't.

And, of course, PDF files are vector. There is no definitive raster version.

If you need to convert PDF files, I suggest you read up Ghostscript documentation. You might decide to use Ghostscript directly.
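As a rough idea of what calling Ghostscript directly could look like, here is a sketch that renders one page to a 24-bit RGB TIFF. The function name `pdf_page_gs` is invented for illustration, and the device (`tiff24nc`) and LZW compression are assumptions rather than anything recommended in this thread. Ghostscript page numbers start at 1, while the `[1]` index in the ImageMagick command is zero-based, hence the adjustment.

```shell
# Hypothetical helper (illustrative name).
# $1 = input PDF, $2 = zero-based page index (as in file.pdf[1]), $3 = output TIFF
pdf_page_gs() {
    local page=$(( $2 + 1 ))   # Ghostscript counts pages from 1
    # tiff24nc has no alpha channel, so the page is rendered on an opaque background;
    # -dUseCropBox renders the CropBox instead of the MediaBox.
    gs -q -dNOPAUSE -dBATCH -sDEVICE=tiff24nc -sCompression=lzw -r300 \
        -dUseCropBox -dFirstPage="$page" -dLastPage="$page" \
        -sOutputFile="$3" "$1"
}
```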

Post by **Flokker** » 2019-07-20T01:51:16-07:00

Works with -alpha remove
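The full command is not shown in the post; a plausible version, assuming a white background is wanted and keeping the settings from the first post, is sketched below. The function name `pdf_page_opaque` is illustrative only.

```shell
# Hypothetical helper (illustrative name).
# $1 = input PDF, $2 = zero-based page index, $3 = output TIFF
pdf_page_opaque() {
    # -alpha remove blends transparent pixels with the -background colour;
    # -alpha off then drops the (now fully opaque) alpha channel from the output.
    convert -density 300x300 "${1}[${2}]" \
        -background white -alpha remove -alpha off \
        -compress zip "$3"
}
```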

What I mean is that I want to extract every page as a single image, so that the image looks like the page when I open it in a PDF viewer, like taking a screenshot of the page.

Post by **fmw42** » 2019-07-20T08:49:09-07:00

Flokker wrote: Is there no other way to simply extract the pdf "as they are"?

What are you asking? What do you mean by "as they are"?

Some PDF files are entirely vector. Some are raster images embedded in a vector PDF shell. ImageMagick is a raster-only processor. It uses Ghostscript to rasterize any PDF, so no vectors remain, only pixels.

You can extract every page of a PDF into individual images.

convert image.pdf +adjoin image.suffix

where suffix can be JPG or PNG, etc.
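Combined with the transparency fix reported earlier in the thread, a per-page export might be sketched like this. The function name `pdf_pages_to_png`, the 300 dpi density, and the `%03d` numbering pattern are choices made for the example, not requirements.

```shell
# Hypothetical helper (illustrative name).
# $1 = input PDF, $2 = output filename prefix
pdf_pages_to_png() {
    # +adjoin writes one file per page; %03d in the output name is replaced
    # by the page index (000, 001, ...). The alpha options make each page opaque on white.
    convert -density 300 "$1" \
        -background white -alpha remove -alpha off \
        +adjoin "${2}-%03d.png"
}
```

For example, `pdf_pages_to_png image.pdf page` would be expected to produce `page-000.png`, `page-001.png`, and so on.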

If you want raw editable text, then use some other tool.
