Slow convert on TIFF files

Posted: 2009-10-01T07:24:22-07:00
by djdifulvio
I am rather new to this software, but the "convert" command seems to take a very long time to do anything.

I am running the following:

convert -depth 8 -monochrome oldfile.tif newfile.tif
The server is a 3GHz+ Intel dual-core with 4GB of 667MHz DDR2 RAM and a single 120GB 7.2K RPM SATA2 drive (dev/test server). I have never had any problems running large processing- and memory-intensive code or applications, including graphics converters; however, this command locks up my system while it runs.

The TIFF files are exported from a German DMS system and are 2550 x 3300 at 300dpi, 1 bit/sample, CCITT Group 4 compressed. When the command did complete, I noticed several "unknown field with tag #####" errors, which I assume are just junk fields added by the DMS software. On a side note, when that same 16-page document was next run through Tesseract for OCR, it listed 62 pages processed, while it only has 16.

Any idea how to make this work, other than using another tool?

Re: Slow convert on TIFF files

Posted: 2009-10-01T09:44:37-07:00
by magick
You can determine the characteristics of a TIFF image with the tiffinfo program.
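A minimal sketch of that check, assuming libtiff's tiffinfo is installed and that it reports each page with a "TIFF Directory at offset" header line. The helper name count_tiff_dirs is hypothetical; it counts the directories in tiffinfo's output, which may help show whether a file really holds the number of pages you expect:

```shell
# Count the TIFF directories (pages) in tiffinfo output read from stdin.
# Hypothetical helper; it only matches the header line that starts each directory.
count_tiff_dirs() {
  grep -c '^TIFF Directory at offset'
}

# Usage (warnings are assumed to go to stderr and are discarded):
#   tiffinfo oldfile.tif 2>/dev/null | count_tiff_dirs
```

If the count is higher than the number of visible pages, the file probably carries extra directories that other tools will also try to read.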

ImageMagick is probably slow because it is caching the pixels to disk since you have a large image. See http://www.imagemagick.org/script/architecture.php.
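For a rough sense of scale, the sketch below estimates the uncompressed pixel cache size. It assumes four channels per pixel stored at the build's quantum depth (8 bytes per pixel on a Q16 build) and ignores any extra index channel. The function name pixel_cache_bytes is hypothetical, and the 2550 x 3300 dimensions and the page counts are taken from the first post:

```shell
# Estimate the uncompressed pixel cache size in bytes.
# Arguments: width height pages quantum_depth (8 or 16).
# Assumption: 4 channels per pixel, each stored at the quantum depth.
pixel_cache_bytes() {
  width=$1
  height=$2
  pages=$3
  depth=$4
  echo $(( $width * $height * $pages * 4 * $depth / 8 ))
}

pixel_cache_bytes 2550 3300 16 16   # 16 pages on a Q16 build: 1077120000
pixel_cache_bytes 2550 3300 62 16   # 62 pages on a Q16 build: 4173840000
```

Under these assumptions a 16-page file needs roughly 1 GB uncompressed, and 62 pages would need about 4.2 GB, which is more than the 4GB of RAM in the server described in the first post.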

Re: Slow convert on TIFF files

Posted: 2009-10-01T09:50:15-07:00
by fmw42
I am not an expert on this, but some things that have been noted:

1) Proper IM syntax is

convert input options output.

This probably will not help, but see http://www.imagemagick.org/Usage/basics/#cmdline

2) I have seen topics where things were slow and got faster after support for OpenMP was disabled. Recompile with --disable-openmp

3) What version of IM are you using? If it is old, then also try upgrading.
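As an illustration of point 1, here is the original command reordered so that the input comes first, then the options, then the output. This is only a sketch: the wrapper name convert_mono and the DRY_RUN switch are hypothetical, and the options are the same ones used in the first post.

```shell
# Run convert with the input file first, then the options, then the output.
# Set DRY_RUN=1 to print the command instead of executing it.
convert_mono() {
  in=$1
  out=$2
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "convert $in -monochrome -depth 8 $out"
  else
    convert "$in" -monochrome -depth 8 "$out"
  fi
}

# Usage:
#   DRY_RUN=1 convert_mono oldfile.tif newfile.tif   # show the command only
#   convert_mono oldfile.tif newfile.tif             # actually convert
```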


Magick is the expert, so do whatever he suggests.

Re: Slow convert on TIFF files

Posted: 2009-10-02T11:32:45-07:00
by djdifulvio
tiffinfo works fine; it returns very quickly and tells me:

TIFF Directory at offset 0x7c (124)
  Subfile Type: multi-page document (2 = 0x2)
  Image Width: 2550 Image Length: 3300
  Resolution: 300, 300 pixels/inch
  Bits/Sample: 1
  Compression Scheme: CCITT Group 4
  Photometric Interpretation: min-is-white
  Thresholding: bilevel art scan
  FillOrder: lsb-to-msb
  Orientation: row 0 top, col 0 lhs
  Samples/Pixel: 1
  Rows/Strip: 16
  Planar Configuration: single image plane
  ImageDescription: 
  Make: DOCUNET Germany
  Model: SCA
  Software: DACS Toolkit II
  DateTime: 2009:09:21 16:28:03
  Tag 33948: 0,0,0,0,64,0,0,0,136,0,0,0,136,0,0,0,12,0,0,0,64,0,0,0,60,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,72,0,0,0,72,0,0,0,64,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,72,0,0,0,2,0,0,0,150,207,62,19,11,6,216,7,239,4,2,10,8,78,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
  Tag 33949: 0
  Group 4 Options: (0 = 0x0)
That is of course the info for one page of this test file; however, I also receive many of these on my console:

TIFFReadDirectory: Warning, filename.tif: unknown field with tag 33949 (0x849d) encountered.
Perhaps I am off on the speed of I/O between a disk drive and other parts of the system, but I am very sure that a SATA2 drive and a dual-core processor running a 64-bit distribution should be able to process a 16-page document a little faster than hours.

As for the command syntax used, there is a method to my madness: it is the command line used by OpenKM when uploading a TIFF document, just before it is submitted to Tesseract-OCR for processing. I assume that the point of the syntax is to make sure the document is optimized for OCR; however, it is not working, perhaps due to the "unknown field with tag" errors or the very long processing time.

I am using the current stable build from the website, 6.5.6-5, and I also tried the version that can be installed with apt-get. Both behaved the same way; however, I might retest the older version next.

Re: Slow convert on TIFF files

Posted: 2009-10-02T13:55:21-07:00
by fmw42
You do have a very large image and perhaps not enough memory, so ImageMagick ends up moving pixels back and forth between disk and memory.

Magick will need to advise further on how to handle this if that is the case.