Processing of scanned book pages

Posted: 2012-10-22T05:12:32-07:00
by sergiokapone
Hi all!
I would like to know how to process scanned book pages with IM: that is, bring all the pages to the same size without distorting the image, and add margins around the text.
Thanks!

Re: Processing of scanned book pages

Posted: 2012-10-22T10:00:08-07:00
by fmw42
Please post links to some example images and use them to explain what the problem is. Also, please identify your version of IM and your platform.

Re: Processing of scanned book pages

Posted: 2012-10-22T12:02:20-07:00
by sergiokapone
fmw42 wrote:Please post links to some example images and explain with those what the problem is. Also please identify your version of IM and platform
Platform: Windows. IM version: ImageMagick-6.7.4-9-Q16-windows (portable Win32 static at 16 bits-per-pixel).

I have, for example, two (or more) scanned images with slightly different sizes:

Image 3400x5058 Image 3365x4951
I need:
1. to trim the white border around the text area
Image Image

2. to make the canvas size the same, with equal borders (150x150) around the text area (which is aligned top-center)
Image Image


Now the images have the same borders and the same size (3648x5376).

Re: Processing of scanned book pages

Posted: 2012-10-22T14:29:22-07:00
by fmw42
Try something like the following on each image. You need to know the desired output size, and it must be bigger than either image, say 3550x5208 (which is 150 larger than the largest dimension of either).

convert image1 -fuzz XX% -trim +repage -gravity center -background white -extent 3550x5208 result

This seems to work for me:

convert image-1.png -fuzz 10% -trim +repage -gravity center -background white -extent 3550x5208 image-1_trim.png
convert image.png -fuzz 10% -trim +repage -gravity center -background white -extent 3550x5208 image_trim.png



You cannot trim and expect every trimmed image to come out the same size if the text is placed differently or the scanning dpi was different. So you cannot add an equal-size border around both, because the results would then not end up the same size. Instead, trim and then pad to a given size rather than adding a fixed border; that is what -extent does. The -fuzz is there to handle a background color that is not constant or is only near white.
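The sizing rule above (pad every trimmed page to one canvas that is larger than the largest page) can be sketched in Python. This is only an illustration: the names `common_extent`, `extent_argument` and `pad` are made up for this sketch and are not part of ImageMagick. The sizes are the two scans from this thread.

```python
def common_extent(sizes, pad):
    """Return a (width, height) canvas that is `pad` pixels larger than
    the largest width and the largest height among `sizes`."""
    max_width = max(w for w, _ in sizes)
    max_height = max(h for _, h in sizes)
    return (max_width + pad, max_height + pad)


def extent_argument(sizes, pad):
    """Format the canvas as the WxH string that -extent expects."""
    width, height = common_extent(sizes, pad)
    return f"{width}x{height}"


# The two scans from this thread, with 150 pixels added per dimension.
pages = [(3400, 5058), (3365, 4951)]
print(extent_argument(pages, 150))
```

With -gravity center the padding is split between the two sides, so if you want at least 150 pixels on every side you would presumably add 300 to the largest trimmed size instead.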

If the trim does not work, it is because you have dark noise in the outside area around the text. In that case you need to remove the noise first, for example with -morphology close.

see
http://www.imagemagick.org/Usage/morphology/
http://www.imagemagick.org/Usage/crop/#extent
http://www.imagemagick.org/script/comma ... s.php#fuzz

Re: Processing of scanned book pages

Posted: 2012-10-22T22:42:14-07:00
by sergiokapone
fmw42, thank you! It works!

I just noticed that it works on the command line but not in a batch file. The files are processed, but the output files are unchanged (only the date changes).

Code:

@echo off
title Trimmer

setlocal
set LIBRE=%BookShop%\DJVULIBRE\
set pagenumber=1
if exist %~dpn1 del %~dpn1\*.* /Q
if not exist %~dpn1  md %~dpn1
for /F "usebackq tokens=1*" %%i in (`"%Libre%\djvused.exe" -e n %1`) do  set /a totalpagenumber=%%i
set pn=1
set trimmer=%BookShop%\Imagemagick\

::---another program puts the TIFF files in the %~dpn1 folder----
::---then the following code trims those files---


echo Trimming files...
echo ----------------------------

:loop2
set pagenumber1=%pn%
if %pn% lss 10 set pagenumber1=0%pagenumber1%
if %pn% lss 100 set pagenumber1=0%pagenumber1%
if %pn% lss 1000 set pagenumber1=0%pagenumber1%
if %pn% GTR %totalpagenumber% goto Further2
"%trimmer%convert.exe" "%~dpn1\%pagenumber1%.tif" -fuzz 10 % -trim +repage -gravity center -background white -extent 3550x5208 "%%~dpn1\%pagenumber1%.tif" 
set /a pn=%pn%+1 
echo Trimming page number %pagenumber1%...&goto loop2

:Further2
pause

Re: Processing of scanned book pages

Posted: 2012-10-23T10:26:14-07:00
by fmw42
"%trimmer%convert.exe" "%~dpn1\%pagenumber1%.tif" -fuzz 10 % -trim +repage -gravity center -background white -extent 3550x5208 "%%~dpn1\%pagenumber1%.tif"
There must be no space between 10 and %, and in a Windows batch file you must escape the % as %%. So use -fuzz 10%%
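As an aside, the %% doubling is only a batch-file escape. A minimal Python sketch of the same per-page command, assuming you drive convert from a script instead of a .bat file, passes a single % in the argument list. The helper names `page_filename` and `trim_command` and the folder name "book" are hypothetical; the four-digit zero padding mirrors what the batch loop above produces.

```python
import os


def page_filename(page_number, digits=4):
    """Zero-pad the page number the way the batch loop does (1 -> 0001.tif)."""
    return f"{page_number:0{digits}d}.tif"


def trim_command(folder, page_number, extent="3550x5208", fuzz_percent=10):
    """Build the convert argument list for one page (output overwrites input)."""
    path = os.path.join(folder, page_filename(page_number))
    return [
        "convert", path,
        "-fuzz", f"{fuzz_percent}%",  # single %: no batch parser is involved here
        "-trim", "+repage",
        "-gravity", "center",
        "-background", "white",
        "-extent", extent,
        path,
    ]


# "book" is a placeholder folder; this only prints the argument list.
print(trim_command("book", 7))
```

The list could then be handed to `subprocess.run(cmd, check=True)`. On Windows you would probably replace "convert" with the full path to ImageMagick's convert.exe, as the batch file does.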

Re: Processing of scanned book pages

Posted: 2012-10-23T11:38:18-07:00
by sergiokapone
fmw42 wrote:No spaces between 10 and % and in Windows you must escape the % to %%. So -fuzz 10%%
Thank you very much.

How can I determine just the page size of an image, in pixels and dpi?

Re: Processing of scanned book pages

Posted: 2012-10-23T14:54:13-07:00
by fmw42
sergiokapone wrote:
fmw42 wrote:No spaces between 10 and % and in Windows you must escape the % to %%. So -fuzz 10%%
Thank you very much.

How can I determine just the page size of an image, in pixels and dpi?

Sorry, I am not sure I understand.

The number of pixels and the dpi resolution are in the IM verbose output:

identify -verbose image

or they can be extracted more directly with the format escapes described at http://www.imagemagick.org/script/escape.php
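For example, the %w, %h, %x and %y escapes from that page give the width, height and x/y resolution. Here is a rough Python sketch of calling identify with them and parsing the result. The function names, the "|" separator and the sample output string are assumptions made for illustration, not real output from your files.

```python
def identify_command(image_path):
    """Build an identify call that prints width, height and x/y resolution."""
    # %w, %h, %x and %y are ImageMagick format escapes; "|" is just a separator.
    return ["identify", "-format", "%w|%h|%x|%y", image_path]


def parse_identify(output):
    """Parse 'W|H|X|Y' text into numbers.

    Some versions append the units to the resolution (for example
    '300 PixelsPerInch'), so only the leading number of each field is used.
    """
    width, height, x_res, y_res = output.strip().split("|")
    return {
        "width": int(width),
        "height": int(height),
        "x_resolution": float(x_res.split()[0]),
        "y_resolution": float(y_res.split()[0]),
    }


# Hypothetical output for a 3400x5058 scan at 300 dpi.
info = parse_identify("3400|5058|300 PixelsPerInch|300 PixelsPerInch")
print(info)
```

Dividing the width and height by the resolution would then give the physical page size in the resolution's units.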

Re: Processing of scanned book pages

Posted: 2012-10-23T23:50:25-07:00
by sergiokapone
fmw42 wrote:Sorry, I am not sure I understand.

The number of pixels and the dpi resolution are in the IM verbose output:

identify -verbose image

or they can be extracted more directly with the format escapes described at http://www.imagemagick.org/script/escape.php
You understood me correctly.
And again, you helped me. Thank you.