Conditional -negate pdf file

Posted: 2018-09-26T16:09:48-07:00
by Squashed
I'm trying to convert a pdf where some pages have white text on dark background. I can successfully convert single pages using -negate, but how can I conditionally process all pages, only converting those with a dark background? IM 7.0.7-Q16 on Win 10 64bit. Many thanks.

Re: Conditional -negate pdf file

Posted: 2018-09-26T16:18:34-07:00
by fmw42
I think you will have to write a script that loops over each page. First test the background color, for example the color of the top-left pixel (0,0). If it is black, run the magick command with -negate; if it is white, run the magick command without -negate.

magick image.pdf[0] -format "%[pixel:u.p{0,0}]" info:
This returns the color of the top-left pixel. Note: depending on the image type and on whether the image is binary, you might get white/black, rgb(255,255,255)/rgb(0,0,0), or gray(255)/gray(0), so your condition should check for all three forms. If your image is not binary, you will have to test whether the value is lighter or darker than gray(50%).

You will have to put the result into a variable and use the variable in a conditional statement to do the testing.

Pseudocode:

if variable = black
    convert image.pdf -negate ... image.png
else if variable = white
    convert image.pdf ... image.png
end if
Sorry I am a Unix person, so one of the Windows users will have to help with the scripting.
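
For anyone who would rather script this in Python than in a BAT file, here is a minimal sketch of the test described above. The function names (is_dark, top_left_color) are made up for illustration, the parser only covers the color forms mentioned in this post (anything else raises an error), plain numbers are assumed to be 8-bit, and "magick" is assumed to be on the PATH.

```python
import re
import subprocess


def is_dark(color):
    """Return True if an ImageMagick color string is darker than mid-gray."""
    text = color.strip().lower()
    if text in ("black", "white"):
        return text == "black"
    match = re.fullmatch(r"(s?rgb|gray)a?\(([^)]*)\)", text)
    if match is None:
        raise ValueError("unrecognised color string: %r" % color)
    parts = [p.strip() for p in match.group(2).split(",")]
    # One channel for gray, three for rgb/srgb; any alpha value is ignored.
    channels = parts[:1] if match.group(1) == "gray" else parts[:3]
    values = []
    for part in channels:
        if part.endswith("%"):
            values.append(float(part[:-1]) / 100.0)
        else:
            values.append(float(part) / 255.0)  # assumption: 8-bit values
    return sum(values) / len(values) < 0.5


def top_left_color(pdf, page):
    """Run the magick command from this post for one page of the PDF."""
    result = subprocess.run(
        ["magick", "%s[%d]" % (pdf, page),
         "-format", "%[pixel:u.p{0,0}]", "info:"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

A loop would then call is_dark(top_left_color("image.pdf", n)) for each page number n and add -negate only when it returns True.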

Re: Conditional -negate pdf file

Posted: 2018-09-26T16:23:07-07:00
by fmw42
CONTINUED:

What I often do is set a new variable, say "negating", from the conditional test. So

Pseudocode:

if variable = black
    negating = "-negate"
else if variable = white
    negating = ""
end if
Then make one command line

convert image negating ... image
This way, when negating = "", it does nothing, and when negating = "-negate", it does the -negate. Either way, it is the conditional test that decides.
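
The same idea as a small Python sketch, in case it helps. Here the optional part is an empty list rather than an empty string, so no stray empty argument is passed when no negation is wanted. build_command is a hypothetical helper name and the file names are placeholders.

```python
def build_command(source, target, background_is_dark):
    """Build one magick command; -negate is included only for dark pages."""
    negating = ["-negate"] if background_is_dark else []
    return ["magick", source, *negating, target]
```

The returned list can be handed to subprocess.run(..., check=True). The conditional test still decides; the command itself is written only once.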

Re: Conditional -negate pdf file

Posted: 2018-09-26T16:43:57-07:00
by snibgo
A slightly different approach: convert the PDF to files in your required format. Then loop through the created images, testing each to see if it needs negating. The test might use the average intensity, assuming that the text occupies fewer pixels than the background. If it doesn't need negating, there is no need to do anything.
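
A rough Python sketch of that loop, assuming the pages have already been written out as single-image files and that "magick" is on the PATH. The helper names and the 0.5 threshold are illustrative choices, not anything ImageMagick prescribes.

```python
import subprocess


def mean_intensity(path):
    """Ask ImageMagick for the mean of one image, normalised to 0..1."""
    result = subprocess.run(
        ["magick", path, "-format", "%[fx:mean]", "info:"],
        capture_output=True, text=True, check=True,
    )
    return float(result.stdout.strip())


def files_to_negate(paths, measure=mean_intensity, threshold=0.5):
    """Return the files whose mean is below the threshold (dark pages)."""
    return [p for p in paths if measure(p) < threshold]


def negate_in_place(path):
    """Overwrite a file with its negative."""
    subprocess.run(["magick", path, "-negate", path], check=True)
```

Files that are not returned by files_to_negate are left untouched. The measure argument exists only so the selection logic can be tried without ImageMagick installed.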

Depending on how you convert the PDF, you could test each image at that stage.

If you want further advice, please post the command you use, and a link to a sample input PDF.

Re: Conditional -negate pdf file

Posted: 2018-09-26T17:34:56-07:00
by Squashed
In a batch file I'm using:

magick convert %1 -alpha deactivate -negate -white-threshold 85%% "%~dpn1_negated%~x1"

which inverts all the pages of the input.pdf to the output.pdf

Re: Conditional -negate pdf file

Posted: 2018-09-26T18:24:41-07:00
by GeeMack
Squashed wrote: 2018-09-26T17:34:56-07:00
In a batch file I'm using:

magick convert %1 -alpha deactivate -negate -white-threshold 85%% "%~dpn1_negated%~x1"

which inverts all the pages of the input.pdf to the output.pdf
IM7 offers various approaches, but you should use just "magick" rather than "magick convert" to get its full set of features.

You can read in the PDF, then use an FX expression to check for the mean value of each page and negate any page with a mean value less than 50% without affecting pages with a mean value over 50%. Here's a sample command to try...

magick input.pdf -set filename:f "%[t]_checked" ^
   -channel RGB -level "%[fx:mean<0.5?100:0],%[fx:mean<0.5?0:100]%" "%[filename:f].pdf"
That uses "-level" to conditionally negate pages which are predominately black or dark. See a description of using the "-level" operator to negate an image under "Direct Level Adjustments" at THIS LINK.

To use that in a BAT script you'll need to double each percent sign, so "%" becomes "%%".
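
To see why swapping the two level points negates a page, here is a simplified Python model of the arithmetic. It is only a sketch: it works on a single value in the range 0..1, assumes gamma 1, and is not ImageMagick's actual code. The function names are made up for illustration.

```python
def level(value, black, white):
    """Simplified -level: map black to 0 and white to 1, then clamp."""
    stretched = (value - black) / (white - black)
    return min(1.0, max(0.0, stretched))


def level_points(mean):
    """Black and white points (as fractions) picked by the FX test above."""
    return (1.0, 0.0) if mean < 0.5 else (0.0, 1.0)
```

With the points (1.0, 0.0) chosen for a dark page, a value v comes out as 1 - v, which is a negate. With (0.0, 1.0) the value comes out unchanged, so light pages pass through as they are.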

Re: Conditional -negate pdf file

Posted: 2018-09-26T19:50:59-07:00
by fmw42
Very clever, Greg!

Re: Conditional -negate pdf file

Posted: 2018-09-26T20:12:21-07:00
by Squashed
I'm getting an error:

convert: invalid argument for option '-level': %[fx:mean<0.5?100:0]% @ error/convert.c/ConvertImageCommand/2003.

Can fx be used to supply an argument for level? Or am I not formatting it correctly? Trying it manually on command line for now.

Re: Conditional -negate pdf file

Posted: 2018-09-26T20:21:12-07:00
by GeeMack
Squashed wrote: 2018-09-26T20:12:21-07:00
I'm getting an error:

convert: invalid argument for option '-level': %[fx:mean<0.5?100:0]% @ error/convert.c/ConvertImageCommand/2003.
Try using just "magick" instead of "magick convert". When you add "convert" to the command it will take on IM6 behavior. You need it to act like IM7 to make those inline FX expressions work.

Re: Conditional -negate pdf file

Posted: 2018-09-27T06:33:43-07:00
by Squashed
Thank you GeeMack, I'm getting close now. I have:

magick input.pdf -channel RGB -level "%%[fx:mean<0.5?100:0]%%" output.pdf

which negates only the dark pages. Only one parameter is needed for -level, since the white point defaults to the opposite of the black point.

However, I also wanted to add -white-threshold 85%, but only for those pages that were negated. So I tried:

magick input.pdf -channel RGB -level "%%[fx:mean<0.5?100:0]%%" -white-threshold "%%[fx:mean<0.5?85:100]%%" output.pdf

But this didn't work, I assume because the mean calculated at white-threshold is from the negated image, which is above 0.5?

I wondered if I could store the mean at the start somehow, I tried:

magick input.pdf -channel RGB -set option:mw "%%[mean]" -level "%%[fx:mw<0.5?100:0]%%" -white-threshold "%%[fx:mw<0.5?85:100]%%" output.pdf

but I get error: magick: unable to parse expression `mw' @ error/fx.c/FxGetSymbol/1828.

Any ideas?

Re: Conditional -negate pdf file

Posted: 2018-09-27T07:28:26-07:00
by GeeMack
Squashed wrote: 2018-09-27T06:33:43-07:00
I wondered if I could store the mean at the start somehow, I tried:

magick input.pdf -channel RGB -set option:mw "%%[mean]" -level "%%[fx:mw<0.5?100:0]%%" -white-threshold "%%[fx:mw<0.5?85:100]%%" output.pdf
IM won't let you set a variable like that and use it inside another FX expression, but you can set a variable with the results of an entire expression, then use that variable by itself further along in the command. Try something more like this...

magick input.pdf -channel RGB -set option:var1 "%%[fx:mean<0.5?100:0]" ^
   -set option:var2 "%%[fx:mean<0.5?85:100]" -level "%%[var1]%%" -white-threshold "%%[var2]%%" output.pdf
I didn't test that, but it should give you the idea.

Re: Conditional -negate pdf file

Posted: 2018-09-27T08:44:49-07:00
by Squashed
Ok that works now, thanks.

One last issue - the level command seems to cause a problem with the anti-aliasing. I noticed the output pdf has reduced image dimensions and file size. Not sure why?

Re: Conditional -negate pdf file

Posted: 2018-09-27T10:46:03-07:00
by GeeMack
Squashed wrote: 2018-09-27T08:44:49-07:00
One last issue - the level command seems to cause a problem with the anti-aliasing. I noticed the output pdf has reduced image dimensions and file size. Not sure why?
When IM reads in a PDF it defaults to a density of 72. Try setting the density before reading the input file, maybe 150 or 300.

magick -density 300 input.pdf ... ... ... output.pdf
See if that improves the quality of the output. If it's something else, you should post a couple example input/output images on a public hosting site and link to them here. If we can see the actual issue, there might be a better solution. Also keep in mind with IM, everything in the input PDF gets converted to a raster image for processing and output.
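
As a rough guide to what the density setting does to the pixel dimensions, here is a small Python sketch of the arithmetic. PDF page sizes are given in points (1/72 inch), so the raster size scales with density/72. The function name is made up, and the exact rounding done by the PDF renderer may differ by a pixel.

```python
def raster_size(width_pt, height_pt, density):
    """Approximate pixel size of a page given in points at a density in dpi."""
    return (round(width_pt * density / 72),
            round(height_pt * density / 72))
```

For a US Letter page (612 x 792 points), the default density of 72 gives about 612 x 792 pixels, while -density 300 gives about 2550 x 3300 pixels. That difference is why output read at the default density looks small and soft.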

Re: Conditional -negate pdf file

Posted: 2018-09-27T11:04:06-07:00
by Squashed
That does it. Thanks very much for all your help.

Re: Conditional -negate pdf file

Posted: 2018-09-27T19:15:23-07:00
by GeeMack
Squashed wrote: 2018-09-27T11:04:06-07:00
That does it. Thanks very much for all your help.
Keep in mind when using FX expressions directly within an operation like...

... -level "%[fx:mean<0.5?100:0]%" ...
... it will run that expression separately on every image in the list, or on each page of the PDF. But if you set that variable before the actual operation like...

... -set option:var1 "%%[fx:mean<0.5?100:0]" -level "%%[var1]%%" ...
... it will set that variable once to the value of the current image, usually the first image in the list, then use that one same value in the "-level" operation for every image in the list. In many instances that may not be what you want.
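
To make the difference concrete, here is a tiny Python simulation of the two behaviours as described in this post. It does not call ImageMagick; it only models "evaluate per image" against "evaluate once, then reuse" for a list of hypothetical page means. Both function names are invented for illustration.

```python
def black_point_per_image(means):
    """Inline FX: the expression is evaluated separately for each page."""
    return [100 if mean < 0.5 else 0 for mean in means]


def black_point_set_once(means):
    """-set option:var1: evaluated once (first page here), then reused."""
    value = 100 if means[0] < 0.5 else 0
    return [value for _ in means]
```

For page means of 0.9, 0.1 and 0.8, the per-image version gives black points 0, 100, 0, so only the dark middle page is negated. The set-once version gives 0, 0, 0, so the dark page would be left as it is.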