Multiple Page Text, Need Page breaker option

Posted: 2012-04-12T11:48:25-07:00
by ynguyen
I have a text file that is 5 pages long.
Each page of the raw text is separated by a ^L (Control-L) to indicate the end of a page.

Each page also ends with the line *** CONTINUE ***

Is there an option I can turn on so that convert detects a page break such as ^L,
or, if it sees the line *** CONTINUE ***, starts a new page?


This is what I am currently using.
cat mp.txt | convert -font Courier-bold -geometry 1728x2156+0+0 -density 204x196 -units PixelsPerInch -type Bilevel -depth 1 text:- ytext.tif


Using: Version: ImageMagick 6.5.4-7 2010-02-26
OS: Linux

Thank you.

ynguyen

Re: Multiple Page Text, Need Page breaker option

Posted: 2012-04-12T18:17:17-07:00
by fmw42
I don't think there is anything automatic built into IM. But you can create an image that looks exactly the way *** CONTINUE *** appears in your generated image. Use that as a template and use compare to search for occurrences of the small image in your larger image. Then extract those match points and use the Y values to crop your image into the appropriate parts. You can take the second of the output images from compare and search for multiple match points by thresholding or extracting the brightest locations. Or see my bash Unix script, maxima, at http://www.fmwconcepts.com/imagemagick/maxima/index.php

See
http://www.imagemagick.org/script/compare.php
http://www.imagemagick.org/Usage/compare/

Re: Multiple Page Text, Need Page breaker option

Posted: 2012-04-12T22:26:32-07:00
by anthony
No, text: does not understand special page breaks.

The best idea is to use a shell script (or Perl script) to break up the text and generate each page separately. But that is not really an ImageMagick task.

Hmm, Perl can do this quite well, separating the text on form feed (character code 014 octal).
It can then feed each 'record' (page) to convert to create each page, and pipe the result to another convert to merge the stream of MIFF images into a single TIFF file.
See "MIFF Image Streaming"
http://www.imagemagick.org/Usage/files/#miff

NOTE: The following example is very deep, and while small, it involves advanced Perl and ImageMagick techniques...

cat mp.txt |
  perl -0014 -ne 'chomp; open(PAGE,"|convert text:- miff:-"); print PAGE; close PAGE;' - |
    convert miff:- -type Bilevel -depth 1 ytext.tif
For testing you may like to replace the last convert command with...

  convert miff:- -trim -border 5 -bordercolor red -border 1 -append show:
Which is what I used for my testing.

Add your extra font options to the convert inside the one-line perl script.

And YES, it is possible using shell scripts, just not as compact :-)
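
A shell-only version of the same idea might look like the sketch below. It assumes the pages are separated by a form feed, as in mp.txt above. The function names split_pages and render_pages and the page-NNN.txt file names are made up for illustration. awk does the splitting; each page is then rendered separately and streamed as MIFF into one final convert, as in the Perl one-liner.

```shell
#!/bin/sh
# Sketch only: function and file names are illustrative, not part of ImageMagick.

# Split INPUT ($1) on form feed (^L, octal 014) into PREFIX-001.txt,
# PREFIX-002.txt, ... where PREFIX is $2.
# Pages containing only whitespace (e.g. after a trailing ^L) are skipped.
# Prints the number of pages written.
split_pages() {
    awk -v prefix="$2" '
        BEGIN { RS = "\f" }                       # one record per page
        $0 ~ /^[ \t\r\n]*$/ { next }              # ignore empty pages
        {
            n++
            out = sprintf("%s-%03d.txt", prefix, n)
            printf "%s", $0 > out
            close(out)
        }
        END { print n + 0 }
    ' "$1"
}

# Render every PREFIX-*.txt page ($1 is PREFIX) and merge the MIFF stream
# into one TIFF ($2). Text options go on the inner convert, output options
# on the outer one.
render_pages() {
    for page in "$1"-*.txt; do
        convert -font Courier-bold -geometry 1728x2156+0+0 \
            -density 204x196 -units PixelsPerInch text:"$page" miff:-
    done | convert miff:- -type Bilevel -depth 1 "$2"
}

# Usage (render_pages requires ImageMagick):
#   split_pages mp.txt page && render_pages page ytext.tif
```

Writing the pages to files first makes it easy to inspect a page that renders badly, at the cost of some temporary files.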

Re: Multiple Page Text, Need Page breaker option

Posted: 2012-04-12T23:10:31-07:00
by ynguyen
Anthony,

You are awesome :D . By default, does Perl understand the ^L as a page break to end and start a new page?
I do not understand all the syntax at the moment, but your script performs exactly what I am looking for. Thank you very much.

I also need to include the following options. In which order should they go?
-font Courier-bold -geometry 1728x2156+0+0 -density 204x196 -units PixelsPerInch -type Bilevel -depth 1


cat mp.txt |
perl -0014 -ne 'chomp; open(PAGE,"|convert text:- miff:-"); print PAGE; close PAGE;' - |
convert miff:- -font Courier-bold -geometry 1728x2156+0+0 -density 204x196 -units PixelsPerInch -type Bilevel -depth 1 text:- ytext.tif


ynguyen

Re: Multiple Page Text, Need Page breaker option

Posted: 2012-04-13T00:21:40-07:00
by anthony
ynguyen wrote: Anthony,
You are awesome :D .
No, just experienced. More than 25 years of scripting will do that :-)

ynguyen wrote: By default, does Perl understand the ^L as a page break to end and start a new page?
No. The -0 option defines a single-character record separator, and 014 is octal for ^L.

A full Perl script can use a multi-character string for this, such as "*** CONTINUE ***".
The separator is removed from the end of each record by the "chomp" command.

ynguyen wrote: I also need to include the following options. In which order should they go?
-font Courier-bold -geometry 1728x2156+0+0 -density 204x196 -units PixelsPerInch -type Bilevel -depth 1
The -type and -depth options are for the outer TIFF image creation. The rest are for the text, and should go inside the Perl script's "convert" command, before the text: image reader/creator.

cat mp.txt |
  perl -0014 -ne 'chomp; open(PAGE,"|convert -font Courier-bold -geometry 1728x2156+0+0 -density 204x196 -units PixelsPerInch text:- miff:-"); print PAGE; close PAGE;' - |
    convert miff:- -type Bilevel -depth 1 ytext.tif
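
If the file had no form feeds and only the *** CONTINUE *** line marked the end of a page, the same split could be done on that line. The sketch below works line by line in awk rather than with Perl's multi-character record separator, purely to stay in plain shell. split_on_marker and the page file names are hypothetical. Unlike chomp, it keeps the marker line as the last line of each page.

```shell
#!/bin/sh
# Sketch only: split_on_marker is an illustrative name.
# Start a new page file after every line of INPUT ($1) containing MARKER ($3).
# Writes PREFIX-001.txt, PREFIX-002.txt, ... (PREFIX is $2) and prints the
# number of pages written.
split_on_marker() {
    awk -v prefix="$2" -v marker="$3" '
        {
            if (out == "") {                      # first line of a new page
                n++
                out = sprintf("%s-%03d.txt", prefix, n)
            }
            print > out
            if (index($0, marker) > 0) {          # literal match, not a regex
                close(out)
                out = ""
            }
        }
        END { print n + 0 }
    ' "$1"
}

# Usage:
#   split_on_marker mp.txt page '*** CONTINUE ***'
```

The resulting page files can then be fed to convert one at a time, exactly as with the form feed split.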

Re: Multiple Page Text, Need Page breaker option

Posted: 2012-04-13T10:15:23-07:00
by ynguyen
Anthony and fmw42

Thank you very much for the help, much appreciated.
Are you interested in some small freelance image and detection project/coding work? Please email me at yoom@misoccer.us


One more question:
How can I get the Roxbury font for IM? I cannot find it with the following command: identify -list font
Where can I find it, and how do I add it to IM?

I want to make some adjustments to the space before and after each paragraph.

I am thinking a workaround could be the -interline-spacing option, if the Roxbury font cannot be used with IM.
Would this work?
cat mp.txt | convert -font Courier-bold -interline-spacing 1 ytext.tif


ynguyen

Re: Multiple Page Text, Need Page breaker option

Posted: 2012-04-13T10:31:03-07:00
by fmw42
You will need to search the Internet for the Roxbury.ttf font. Then download it and put it into your font directory. Then run Anthony's script, imagick_type_gen.pl; see http://www.imagemagick.org/Usage/scripts/ for it. That way the font will show up in your list of fonts, and you can refer to it by name instead of using the full path.

Re: Multiple Page Text, Need Page breaker option

Posted: 2012-04-16T10:44:26-07:00
by ynguyen
fmw42,

Thank you for the information.
1. I could not find the font that I am looking for.

2. Workaround
The -interline-spacing option will be a better tool for what I am trying to accomplish. However, the IM version in my current Red Hat distribution does not
support the -interline-spacing option.

Currently, I am trying to build a new IM version for Red Hat ES 6.1 64-bit. I am having a bit of trouble at the moment: the build doesn't seem to find any of the Red Hat distribution libraries.
I am going to download all the delegate packages and use these for the compile instead of the Red Hat distribution's. I hope this works.

I will report back with details if the build is successful.

Thank you

ynguyen

Re: Multiple Page Text, Need Page breaker option

Posted: 2012-04-16T11:01:33-07:00
by fmw42
I searched for your font also, but could not find it. Sorry.

Re: Multiple Page Text, Need Page breaker option

Posted: 2012-04-16T17:05:14-07:00
by anthony
You need all the development packages.

For details see my RPM build notes
http://www.imagemagick.org/Usage/api/#rpms

Re: Multiple Page Text, Need Page breaker option

Posted: 2012-04-17T14:23:10-07:00
by ynguyen
Any suggestions on these errors? This is the end of the output from the make command:


CCLD wand/libMagickWand.la
CXX Magick++/lib/Blob.lo
CXX Magick++/lib/BlobRef.lo
CXX Magick++/lib/CoderInfo.lo
CXX Magick++/lib/Color.lo
CXX Magick++/lib/Drawable.lo
CXX Magick++/lib/Exception.lo
CXX Magick++/lib/Functions.lo
CXX Magick++/lib/Geometry.lo
CXX Magick++/lib/Image.lo
CXX Magick++/lib/ImageRef.lo
CXX Magick++/lib/Montage.lo
CXX Magick++/lib/Options.lo
CXX Magick++/lib/Pixels.lo
CXX Magick++/lib/STL.lo
CXX Magick++/lib/Thread.lo
CXX Magick++/lib/TypeMetric.lo
CXXLD Magick++/lib/libMagick++.la
CC utilities/animate.o
CCLD utilities/animate
/usr/local/lib/libfpx.so: undefined reference to `operator delete[](void*)'
/usr/local/lib/libfpx.so: undefined reference to `operator new(unsigned long)'
/usr/local/lib/libfpx.so: undefined reference to `operator delete(void*)'
/usr/local/lib/libfpx.so: undefined reference to `operator new[](unsigned long)'
/usr/local/lib/libfpx.so: undefined reference to `__cxa_pure_virtual'
/usr/local/lib/libfpx.so: undefined reference to `__gxx_personality_v0'
/usr/local/lib/libfpx.so: undefined reference to `std::ios_base::Init::~Init()'
/usr/local/lib/libfpx.so: undefined reference to `vtable for __cxxabiv1::__vmi_class_type_info'
/usr/local/lib/libfpx.so: undefined reference to `vtable for __cxxabiv1::__class_type_info'
/usr/local/lib/libfpx.so: undefined reference to `std::ios_base::Init::Init()'
/usr/local/lib/libfpx.so: undefined reference to `vtable for __cxxabiv1::__si_class_type_info'
collect2: ld returned 1 exit status
make[1]: *** [utilities/animate] Error 1
make[1]: Leaving directory `/mnt/install/working/hylafax/working/ImageMagick6.7.6-5.7/ImageMagick-6.7.6-5'
make: *** [all] Error 2



[root@app1 ImageMagick-6.7.6-5]# make install
make install-am
make[1]: Entering directory `/mnt/install/working/hylafax/working/ImageMagick6.7.6-5.7/ImageMagick-6.7.6-5'
CCLD utilities/animate
/usr/local/lib/libfpx.so: undefined reference to `operator delete[](void*)'
/usr/local/lib/libfpx.so: undefined reference to `operator new(unsigned long)'
/usr/local/lib/libfpx.so: undefined reference to `operator delete(void*)'
/usr/local/lib/libfpx.so: undefined reference to `operator new[](unsigned long)'
/usr/local/lib/libfpx.so: undefined reference to `__cxa_pure_virtual'
/usr/local/lib/libfpx.so: undefined reference to `__gxx_personality_v0'
/usr/local/lib/libfpx.so: undefined reference to `std::ios_base::Init::~Init()'
/usr/local/lib/libfpx.so: undefined reference to `vtable for __cxxabiv1::__vmi_class_type_info'
/usr/local/lib/libfpx.so: undefined reference to `vtable for __cxxabiv1::__class_type_info'
/usr/local/lib/libfpx.so: undefined reference to `std::ios_base::Init::Init()'
/usr/local/lib/libfpx.so: undefined reference to `vtable for __cxxabiv1::__si_class_type_info'
collect2: ld returned 1 exit status
make[1]: *** [utilities/animate] Error 1
make[1]: Leaving directory `/mnt/install/working/hylafax/working/ImageMagick6.7.6-5.7/ImageMagick-6.7.6-5'
make: *** [install] Error 2
[root@app1 ImageMagick-6.7.6-5]#



Thank you.

ynguyen
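
All of the undefined references in the output above point at one library, and the missing symbols (operator new, operator delete, __gxx_personality_v0 and so on) belong to the C++ runtime. A quick way to confirm that from a saved log is sketched below; build.log and the function name are assumptions for illustration. If a single delegate library turns out to be responsible, one thing to try is configuring ImageMagick without that delegate (check ./configure --help for the matching --without-... switch).

```shell
#!/bin/sh
# Sketch only: count "undefined reference" linker errors per object or library.
# Reads a saved build log ($1; the file name is an assumption) and prints
# "<count> <file>" lines, most frequent first.
undefined_by_library() {
    grep 'undefined reference to' "$1" |
        cut -d: -f1 |
        sort | uniq -c | sort -rn |
        awk '{ print $1, $2 }'
}

# Usage:
#   make 2>&1 | tee build.log
#   undefined_by_library build.log
```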

Re: Multiple Page Text, Need Page breaker option

Posted: 2012-04-17T20:26:01-07:00
by ynguyen
Anthony,


Any suggestions on the errors below?


Per your recommendation I downloaded this:
ImageMagick-6.7.6-6.src.rpm

I ran this command:
nice rpmbuild --nodeps --rebuild ImageMagick*.src.rpm



Got these errors:
checking if argz actually works... yes
checking whether libtool supports -dlopen/-dlpreopen... yes
checking for ltdl.h... no
configure: error: invalid ltdl include directory: `/usr/include'
error: Bad exit status from /var/tmp/rpm-tmp.q8iqlc (%build)


RPM build errors:
Bad exit status from /var/tmp/rpm-tmp.q8iqlc (%build)




ynguyen
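
The configure failure above is about a missing header: ltdl.h was not found under /usr/include. Because rpmbuild was run with --nodeps, missing build dependencies were not reported up front. A small check like the following, run before rpmbuild, can list which headers are absent. The function name and the header list are illustrative; on Red Hat style systems ltdl.h normally comes from the libtool-ltdl-devel package, but confirm that with your package manager.

```shell
#!/bin/sh
# Sketch only: report which of the given header files are missing from a
# directory. $1 is the include directory, the remaining arguments are header
# names. Prints one missing header per line; returns non-zero if any is missing.
missing_headers() {
    dir=$1
    shift
    status=0
    for header in "$@"; do
        if [ ! -f "$dir/$header" ]; then
            echo "$header"
            status=1
        fi
    done
    return $status
}

# Usage:
#   missing_headers /usr/include ltdl.h
```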