Search found 14 matches

by johnbent
2015-01-22T18:55:02-07:00
Forum: Users
Topic: crop columns out of dictionary page
Replies: 23
Views: 15460

Re: crop columns out of dictionary page

Maybe the skew is too severe?
by johnbent
2015-01-22T18:54:14-07:00
Forum: Users
Topic: crop columns out of dictionary page
Replies: 23
Views: 15460

Re: crop columns out of dictionary page

Thanks again so very much! I tried this and it works great. There are 500 pages and it seems to work correctly for about 90% of them!! I'm happy to do the remaining fifty manually; actually it will probably only be about 20 to do manually since 20 of the 50 are the weird "chapter" pages and probably ...
by johnbent
2015-01-22T18:07:49-07:00
Forum: Users
Topic: crop columns out of dictionary page
Replies: 23
Views: 15460

Re: crop columns out of dictionary page

Sorry for the confusion; columnize4.sh is indeed your code, copied out of your previous post. It is unedited except that I changed 'infile="page-004.png"' to 'infile=$1' so I could run it with bash and pass the filename as a command-line argument.
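
For readers following along, here is a minimal sketch of that one-line change, assuming nothing about the rest of columnize4.sh (which is not shown here). The function name columnize_args, the usage message, and the extension-stripping step are hypothetical and only illustrate reading the input file from an argument instead of hard-coding it.

```shell
#!/bin/bash
# Hypothetical sketch: read the input image from an argument rather than
# hard-coding it. In the real script this would simply be: infile=$1
columnize_args() {
    infile="$1"                 # quoted so filenames with spaces survive
    if [ -z "$infile" ]; then
        echo "usage: columnize4.sh IMAGE" >&2
        return 1
    fi
    base="${infile%.*}"         # strip the extension: page-004.png -> page-004
    echo "$infile $base"
}

# Demonstration call with the filename from the post.
columnize_args "page-004.png"   # prints: page-004.png page-004
```

Quoting "$1" is optional for names like page-004.png but avoids surprises if a scan is ever named with spaces.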
by johnbent
2015-01-22T15:25:57-07:00
Forum: Users
Topic: crop columns out of dictionary page
Replies: 23
Views: 15460

Re: crop columns out of dictionary page

:(

> /tmp/columnize4.sh page-004.png
cut: bad delimiter
convert: geometry does not contain image `tmp.png' @ warning/attribute.c/GetImageBoundingBox/247.

And the output "columns" are wrong: the left one is empty and the right one is the full original image.
by johnbent
2015-01-22T13:57:25-07:00
Forum: Users
Topic: crop columns out of dictionary page
Replies: 23
Views: 15460

Re: crop columns out of dictionary page

Thanks again, Fred! The second one works much better! It works about 75% of the time. http://tekinged.com/books/kerresel/images/columns/page-010.png is one that doesn't work, for example. The third script doesn't work at all . . . I think maybe my imagemagick installation is missing something since I ...
by johnbent
2015-01-22T12:21:28-07:00
Forum: Users
Topic: crop columns out of dictionary page
Replies: 23
Views: 15460

Re: crop columns out of dictionary page

Thank you; you're amazing and I'm super appreciative! It's awesome and it works! :) But only about a third of the time. :( http://tekinged.com/books/kerresel/images/columns/ By looking at the file sizes you can see for which pages it works and for which it doesn't. For example, it works great on ...
by johnbent
2015-01-21T10:58:43-07:00
Forum: Users
Topic: crop columns out of dictionary page
Replies: 23
Views: 15460

Re: crop columns out of dictionary page

Shoot! Sorry; I'm an idiot! Here's the image:

[Image]
by johnbent
2015-01-21T09:25:27-07:00
Forum: Users
Topic: crop columns out of dictionary page
Replies: 23
Views: 15460

Re: crop columns out of dictionary page

Fred, thank you so very much for all your help. I've now found another similar dictionary on which the OCR works much better even though the text is entirely Palauan without any English! Unfortunately, however, your awesome columnize script doesn't work for these images and I can't figure out why. If ...
by johnbent
2015-01-04T20:53:25-07:00
Forum: Users
Topic: crop columns out of dictionary page
Replies: 23
Views: 15460

Re: crop columns out of dictionary page

Wow. You are really really good. I'm very grateful. I've made you a contributor to the project: http://tekinged.com/about.php (Your name is in the box on the right) Please let me know if you would prefer I not list you in the contributors. I'm going to now start seeing if I can train tesseract for ...
by johnbent
2014-12-26T17:21:51-07:00
Forum: Users
Topic: crop columns out of dictionary page
Replies: 23
Views: 15460

Re: crop columns out of dictionary page

That is so awesome and I'm so very appreciative. It works awesome and straight out of the box on my mac using imagemagick (imagemagick-6.8.9-8.mavericks.bottle.tar.gz). If you would be so very kind, would it be possible to just modify it a tiny bit to remove the page number from the bottom as well ...
by johnbent
2014-12-26T15:45:52-07:00
Forum: Users
Topic: crop columns out of dictionary page
Replies: 23
Views: 15460

crop columns out of dictionary page

I'm a hobbyist trying to preserve an endangered Pacific Island language by creating an online dictionary. I have a copy of the 1990 print dictionary and am trying to use tesseract to extract the text. I have scanned pages like the below and am looking for help with command line arguments to create ...
by johnbent
2014-12-16T10:48:35-07:00
Forum: Users
Topic: Crop image on whitespace: Preserving an endangered Pacific Island language
Replies: 0
Views: 4293

Crop image on whitespace: Preserving an endangered Pacific Island language

Hello all. I'm a hobbyist trying to create a digitized dictionary for an obscure Pacific Island language, Palauan. I do have copyright holder permission. I have 350+ pages that look like this: http://tekinged.com/misc/images/dict-380.png I do not need the accents transcribed. Unfortunately OCR thus ...
by johnbent
2014-12-16T10:37:28-07:00
Forum: Users
Topic: Split images by white space
Replies: 21
Views: 46004

Re: Split images by white space

That's a great suggestion! Thanks very much. I'm a total newbie to imagemagick however. I'm willing to work to figure out how to do all of the above but if you know any of the command lines to perform each of those above steps automatically for each of the 350+ pages, that'd be a much appreciated ...
by johnbent
2014-12-16T10:13:12-07:00
Forum: Users
Topic: Split images by white space
Replies: 21
Views: 46004

Re: Split images by white space

Anyone still monitoring this really old thread? I have 350+ images that I'd love to split along "large" regions of whitespace. Can multicrop handle this? I couldn't figure out the arguments to use. Basically what I have is 350+ scanned pages of a dictionary and I'd like to convert them to text ...