Size limit on textcleaner

Posted: 2015-02-12T14:03:42-07:00
by jarco
Hello,

Is there a size limit on the textcleaner script?
The script works fine on smaller images; a file of around 1.9 MB still works. But with files of around 3.4 MB it doesn't seem to work.
On another server the larger images are processed just fine, so perhaps there is a setting I need to change on this server to be able to process larger files?

On an image of 1.9 MB it simply generates the output file (test.jpg):

 /var/www/CENCORED :)/web/textcleaner -s 3 app_images/ticket_1423476443139.jpg ./app_images/gens/test.jpg

On an image with exactly the same rights and permissions but with a larger size:

root@CENCORED:/var/www/CENCORED/web# /var/www/CENCORED/web/textcleaner -s 3 app_images/ticket_1423765951755.jpg ./app_images/gens/testjimmy.jpg
convert: magick/blob.c:4020: WriteBlob: Assertion `data != (const unsigned char *) ((void *)0)' failed.
/var/www/CENCORED/web/textcleaner: line 397: 10567 Aborted                 (core dumped) convert -quiet "$infile" +repage "$tmpA1"

--- FILE app_images/ticket_1423765951755.jpg NOT READABLE OR HAS ZERO SIZE --- 


textcleaner:

USAGE: textcleaner [-r rotate] [-l layout] [-c cropoff] [-g] [-e enhance] [-f filtersize] [-o offset] [-u]  [-t threshold] [-s sharpamt] [-S saturation] [-a adaptblur] [-T] [-p padamt] [-b bgcolor] infile outfile
USAGE: textcleaner [-help]

OPTIONS:

-r      rotate            rotate image 90 degrees in direction specified if 
                          aspect ratio does not match layout; options are cw 
                          (or clockwise), ccw (or counterclockwise) and n 
                          (or none); default=none or no rotation
-l      layout            desired layout; options are p (or portrait) or 
                          l (or landscape); default=portrait
-c      cropoff           image cropping offsets after potential rotate 90; 
                          choices: one, two or four non-negative integer comma 
                          separated values; one value will crop all around; 
                          two values will crop at left/right,top/bottom; 
                          four values will crop left,top,right,bottom
-g                        convert document to grayscale before enhancing
-e      enhance           enhance image brightness before cleaning;
                          choices are: none, stretch or normalize; 
                          default=stretch
-f      filtersize        size of filter used to clean background;
                          integer>0; default=15
-o      offset            offset of filter in percent used to reduce noise;
                          integer>=0; default=5
-u                        unrotate image; cannot unrotate more than 
                          about 5 degrees
-t      threshold         text smoothing threshold; 0<=threshold<=100; 
                          nominal value is about 50; default is no smoothing
-s      sharpamt          sharpening amount in pixels; float>=0; 
                          nominal about 1; default=0
-S      saturation        color saturation expressed as percent; integer>=0; 
                          only applicable if -g not set; default=100 (no change)
-a      adaptblur         alternate text smoothing using adaptive blur; 
                          floats>=0; default=0 (no smoothing)
-T                        trim background around outer part of image 
-p      padamt            border pad amount around outer part of image;
                          integer>=0; default=0
-b      bgcolor           desired color for background; default=white

I am running this on an Ubuntu 14.04 server.

# convert -version
Version: ImageMagick 6.7.7-10 2014-03-06 Q16 http://www.imagemagick.org
Copyright: Copyright (C) 1999-2012 ImageMagick Studio LLC
Features: OpenMP

On our old CentOS 5 server this script seems to work fine with larger files. I have built tesseract from source on that server but cannot find any configuration or limit setting that could be causing the error on larger files.
Also the error is thrown instantaneously so I assume this is not due to running out of memory or something like that.

What could be causing this? Any thoughts?

Re: Size limit on textcleaner

Posted: 2015-02-12T16:11:51-07:00
by fmw42
No script limit. Only IM and system limits. You may be running out of memory with the needed tmp files used by the script or your tmp directory is full. See viewtopic.php?f=4&t=26801
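
A minimal sketch of how the tmp directory and the IM limits could be checked from a shell. The helper name tmp_free_kib is made up for illustration, and the directory lookup assumes MAGICK_TMPDIR (ImageMagick's own temp location, if set), then TMPDIR, then /tmp. The script's internal temp directory setting is separate and is not inspected here.

```shell
#!/bin/sh
# Hypothetical helper: print the free space, in KiB, of the file system
# that holds the given directory (default: MAGICK_TMPDIR, TMPDIR, or /tmp).
tmp_free_kib() {
    dir=${1:-${MAGICK_TMPDIR:-${TMPDIR:-/tmp}}}
    # -P forces the portable one-line-per-file-system format,
    # -k reports 1024-byte blocks; column 4 is "Available".
    df -Pk "$dir" | awk 'NR == 2 { print $4 }'
}

echo "Free KiB in temp dir: $(tmp_free_kib)"

# Print ImageMagick's resource limits, but only if identify is installed.
if command -v identify >/dev/null 2>&1; then
    identify -list resource
fi
```

If the free space is tiny compared with the estimated image size, a full tmp directory becomes a plausible suspect; otherwise it can probably be ruled out.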

Re: Size limit on textcleaner

Posted: 2015-02-17T08:58:37-07:00
by jarco
I tried some more options and things.
I have changed the tmp folder to a subfolder named tmp and did a chmod 777 on it.
This is the result of identify -list resource:

  File         Area       Memory          Map            Disk    Thread         Time
------------------------------------------------------------------------------------
200899          1GB         2GiB         4GiB    13.877788EiB         2         3600

When I execute the command on a file smaller than 2 MB it works fine. When I do it on a larger file it throws an error right away. I see no increase in memory or disk activity.
The temp folder has 30 GB available, so I don't think that is an issue when running the script on an image of 3.4 MB.
I also doubled the memory in the machine to 2 GB to see if that was the problem.

At this point I really have no clue where the problem could be.

Re: Size limit on textcleaner

Posted: 2015-02-17T09:52:17-07:00
by snibgo
... when running the script on an image of 3.4 MB
Is that the file size? That isn't relevant.

How many pixels does the image have?

Re: Size limit on textcleaner

Posted: 2015-02-17T10:06:46-07:00
by jarco
This image is 2,448px × 3,264px.

Re: Size limit on textcleaner

Posted: 2015-02-17T10:24:27-07:00
by magick
Try the latest release of ImageMagick. New releases always have plenty of bug fixes. If it still fails, try posting a stack trace so we can track the problem.

Re: Size limit on textcleaner

Posted: 2015-02-17T10:34:45-07:00
by snibgo
If you are using Q16, then each image needs 2448*3264*8 / 1e6 = 64 MB of memory. The command might need three copies in memory, so 192 MB. This isn't much, but if your machine has only 2 GB, you might run out. Try it when nothing else (especially browsers) is running.

"-debug all" might say what is going wrong.

"-limit memory 32MiB -limit map 64MiB" or something might solve the problem.
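
The arithmetic above can be sketched in plain shell. The function name estimate_image_bytes is hypothetical, and the default of 8 bytes per pixel is the Q16 figure quoted above (assumed here to mean 4 channels of 2 bytes each).

```shell
#!/bin/sh
# Hypothetical helper: estimate the in-memory size of one image in bytes
# as width * height * bytes per pixel (default 8, the Q16 figure above).
estimate_image_bytes() {
    width=$1 height=$2 bytes_per_pixel=${3:-8}
    echo $(( width * height * bytes_per_pixel ))
}

one=$(estimate_image_bytes 2448 3264)

# Round to the nearest MB (1e6 bytes) instead of truncating.
echo "one image:    $(( (one + 500000) / 1000000 )) MB"
echo "three copies: $(( (3 * one + 500000) / 1000000 )) MB"
```

Rounded to the nearest MB this reproduces the 64 MB and 192 MB figures. The width and height could come from something like identify -format '%w %h' on the input file. Since textcleaner calls convert internally, one untested idea for trying the suggested limits without editing the script is to export ImageMagick's documented MAGICK_MEMORY_LIMIT and MAGICK_MAP_LIMIT environment variables before running it.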

Re: Size limit on textcleaner

Posted: 2015-02-17T11:04:46-07:00
by fmw42
The script has one input, makes one clone and composites them. So it would seem that it needs 3 times the memory of the one image of 64 MB that snibgo estimated (as he suggested).

Can you provide an example input file that fails so that I can try to test the script with it? If you want to send it to me privately, use my email address fmw at alink dot net

convert: magick/blob.c:4020: WriteBlob: Assertion `data != (const unsigned char *) ((void *)0)' failed.
/var/www/CENCORED/web/textcleaner: line 397: 10567 Aborted (core dumped) convert -quiet "$infile" +repage "$tmpA1"

--- FILE app_images/ticket_1423765951755.jpg NOT READABLE OR HAS ZERO SIZE ---
This message would seem to imply it cannot read your input image. Perhaps it is corrupt.
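
One way to separate a genuine file problem from a convert crash is to test the file directly before running the script. This is only a sketch: check_input and can_decode are hypothetical names, not part of textcleaner, and the second check simply asks identify to parse the file.

```shell
#!/bin/sh
# Hypothetical pre-check: succeed only if the path is a regular,
# readable, non-empty file; otherwise say which test failed.
check_input() {
    if [ ! -f "$1" ]; then echo "not a regular file: $1"; return 1; fi
    if [ ! -r "$1" ]; then echo "not readable: $1"; return 1; fi
    if [ ! -s "$1" ]; then echo "zero size: $1"; return 1; fi
    echo "ok: $1"
}

# Hypothetical decode check: returns identify's exit status,
# or 2 if ImageMagick's identify is not installed.
can_decode() {
    command -v identify >/dev/null 2>&1 || { echo "identify not found"; return 2; }
    identify "$1" >/dev/null 2>&1
}
```

For example, check_input app_images/ticket_1423765951755.jpg && can_decode app_images/ticket_1423765951755.jpg. If both pass while the script still aborts, the file itself is less likely to be the cause.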

Re: Size limit on textcleaner

Posted: 2015-07-28T07:56:47-07:00
by jarco
I am just reporting that building the latest version from source has fixed this problem. The version in the Ubuntu repos is ancient...
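
After a source build it can be worth confirming which convert actually runs, since a packaged binary and a locally built one can coexist on the same machine. A small sketch, with im_version as a made-up helper name, that pulls the version number out of the "Version:" line:

```shell
#!/bin/sh
# Hypothetical helper: read `convert -version` style output on stdin
# and print only the version number from the "Version:" line.
im_version() {
    sed -n 's/^Version: ImageMagick \([0-9][0-9.]*-[0-9]*\).*/\1/p'
}

# Sample line copied from the first post; a live check would be
#   convert -version | im_version
echo 'Version: ImageMagick 6.7.7-10 2014-03-06 Q16 http://www.imagemagick.org' | im_version

# Show which convert the shell would run, if any.
command -v convert || echo "convert not found on PATH"
```

The sample line prints 6.7.7-10. If the live check still reports the old version after building, the packaged binary is probably earlier on PATH than the new one.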