
slowness when opening multi page djvu documents

Posted: 2010-06-24T02:31:15-07:00
by jave
I'm working on a branch of Emacs which supports loading images using the ImageMagick MagickWand API.

It works well except for some problems with loading multi-page image bundles, such as djvu, pdf, etc.
It appears the MagickWand API loads all the pages in a bundle and renders them in memory before I get a chance to handle the images.
I only want to find out how many pages there are first, then show one particular image. Also, RGB seems to be used by default, and I would like
to use 1 bitplane if the image is BW.

Please note that the loading never finishes as it is. A 100-page djvu document hangs the machine, which is a fairly recent Core i7.

So, I would like more control over the loading of an image bundle. It ought to be solvable, since the "display" command can handle big djvu documents fine.


The Emacs branch is at: http://bzr.savannah.gnu.org/r/emacs/imagemagick/

Re: slowness when opening multi page djvu documents

Posted: 2010-06-24T19:26:06-07:00
by magick
If all you want is the number of pages in a PDF or DJVU document, set the density to something low like 2 and use MagickPingImage(). However, even that took 27 seconds on our 3 GHz Fedora host. You may need to use another method to determine the number of pages in a PDF if you want that information lightweight and fast. For example, pdftk only takes 2 seconds since it's operating directly on the PDF source rather than rasterizing it like ImageMagick does. We found this method with Google:
  • -> gs -q -sPDFname=manual.pdf pdfpagecount.ps
    %%Pages: 96
Which returned in 2 seconds, where pdfpagecount.ps is:

Code:

/PDFfile PDFname (r) file def

/PageCountString 255 string def
systemdict /.setsafe known { .setsafe } if
/.show.stdout { (%stdout) (w) file } bind def
/puts { .show.stdout exch writestring } bind def
GS_PDF_ProcSet begin
pdfdict begin
PDFfile
pdfopen begin
/FirstPage where { pop } { /FirstPage 1 def } ifelse
/LastPage where { pop } { /LastPage pdfpagecount def } ifelse
(%%Pages: ) puts
LastPage FirstPage sub 1 add PageCountString cvs puts

quit
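
If the page count is consumed from C, the "%%Pages: 96" line printed by the script has to be parsed. A minimal sketch, assuming the output has already been read into a string (for example from a pipe); parse_page_count is a hypothetical helper name, not part of ImageMagick or Ghostscript:

```c
#include <stdio.h>

/* Hypothetical helper: parse a line such as "%%Pages: 96", the format
   printed by pdfpagecount.ps, and return the page count.
   Returns -1 if the line is NULL or does not match that format. */
static long parse_page_count(const char *line)
{
    long pages = 0;

    if (line == NULL)
        return -1;
    /* "%%%%" in a scanf format matches the two literal '%' characters. */
    if (sscanf(line, "%%%%Pages: %ld", &pages) != 1 || pages < 0)
        return -1;
    return pages;
}
```

A caller would pass each line of the gs output to parse_page_count() and use the first result that is not -1.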

Re: slowness when opening multi page djvu documents

Posted: 2010-06-25T01:07:47-07:00
by jave
Thanks for the reply!

So, if I use your method I can speed up finding out the number of pages.

But how do I then speed up showing the actual page I want to show?
My code as it is seems to load and render the entire document. Note that I'm talking about djvu files here, whose pages are distinct, discrete entries in a bundle,
so it is unnecessary to render all pages in memory.

To clarify, my computer hangs already in this call:

if (filename != NULL)
  status = MagickReadImage(image_wand, filename);

I've looked at the djvu delegate code in ImageMagick and it does implement a ReadOnePage() function, but it doesn't seem to be public.

Re: slowness when opening multi page djvu documents

Posted: 2010-06-27T10:16:01-07:00
by magick
We added a patch to ImageMagick 6.6.2-8 Beta (available sometime tomorrow) to properly support reading one page. For example,
  • convert 'pages.djvu[2]' page.png

Re: slowness when opening multi page djvu documents

Posted: 2010-06-27T12:55:28-07:00
by jave
Is the same syntax available from C code? Or does the interface look different?
A file could presumably have an actual name of "x.djvu[2]", so a separate call to identify the page would seem appropriate, from C at least.

Re: slowness when opening multi page djvu documents

Posted: 2010-06-27T13:54:10-07:00
by magick
You can use a page designation with the image filename (e.g. image.djvu[2]) or set the ImageInfo members number_scenes to 1 and scene to the page number (image_info->number_scenes=1; image_info->scene=2).
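
For the filename form, the page designation can be assembled in C before the name is handed to a read call. A minimal sketch using only the C standard library; build_page_spec is a hypothetical helper name, and it assumes the bracketed index is zero-based, which matches the later example in this thread where 'image.djvu[1]' returns the second page:

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: write "<filename>[<scene>]" into buf.
   page is 1-based (as a user would count pages); the bracketed scene
   index is assumed to be zero-based, so page 2 becomes "[1]".
   Returns 0 on success, -1 on invalid arguments or if buf is too small. */
static int build_page_spec(char *buf, size_t size,
                           const char *filename, unsigned long page)
{
    int n;

    if (buf == NULL || size == 0 || filename == NULL || page == 0)
        return -1;
    n = snprintf(buf, size, "%s[%lu]", filename, page - 1);
    if (n < 0 || (size_t) n >= size)
        return -1;  /* encoding error or truncated output */
    return 0;
}
```

The resulting string can then be passed wherever the plain filename was used, for example as the filename argument of MagickReadImage().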

Re: slowness when opening multi page djvu documents

Posted: 2010-07-01T04:31:44-07:00
by jave
I tried
image_info -> number_scenes = 1;
image_info -> scene = 2;

But ImageMagick still seems to try to render the entire djvu in RAM, crashing the program.

However, this seems to work:
image_info -> number_scenes = 1;

that is, not setting the "scene" member. Then I get the 1st image very quickly.

I tried it with the latest ImageMagick beta. Is this still an ImageMagick bug?

Re: slowness when opening multi page djvu documents

Posted: 2010-07-01T05:17:38-07:00
by magick
We're using ImageMagick 6.6.9-10 and both these commands return the second page of a 4 page DJVU document:
  • core 'image.djvu[1]' x:
and
  • core image.djvu x:
with this code:

Code:

  MagickCoreGenesis(*argv,MagickTrue);
  exception=AcquireExceptionInfo();
  image_info=CloneImageInfo((ImageInfo *) NULL);
  (void) strcpy(image_info->filename,argv[1]);
  image_info->number_scenes=1;
  image_info->scene=1;
  image=ReadImage(image_info,exception);
  ...

Re: slowness when opening multi page djvu documents

Posted: 2010-07-01T05:36:05-07:00
by jave
I'm sorry, I mistakenly built 6.6.2-8. Now I'm using ImageMagick SVN and it finally works! Thanks!

Re: slowness when opening multi page djvu documents

Posted: 2010-07-01T06:41:26-07:00
by jave
Ok, next related issue.

In order to figure out how many pages there were in the djvu bundle I did:

Code:

  ping_wand = NewMagickWand();
  if (filename != NULL)
    {
      status = MagickPingImage(ping_wand, filename);
    }

This segfaults for my djvu bundle! It does work for single-page djvu files, and for PDFs, even if it's a bit slow for PDFs.

Re: slowness when opening multi page djvu documents

Posted: 2010-07-01T11:30:41-07:00
by magick
We can reproduce the problem you posted and have a patch in ImageMagick 6.6.3-0 Beta available now. Thanks.