How to find the text?
Posted: 2011-05-05T01:22:04-07:00
by ghostmansd
Dear users, is it possible to find a piece of text in an image? What I need: a script scans the image in areas of 50x50 pixels. If an area contains text, the script remembers the coordinates ($POSX and $POSY) of the first background pixel in that area and stops. An example is below.
In other words, the script must select a pixel that matches the background of the text and pass it to Fred's Magic Wand script. Magic Wand will then color the background white.
Re: How to find the text?
Posted: 2011-05-05T09:18:04-07:00
by fmw42
IM is a pixel processor and does not know about text. So I doubt IM can do what you want. I know of no way to handle that. But perhaps Anthony or someone else might know otherwise.
Re: How to find the text?
Posted: 2011-05-05T10:31:44-07:00
by ghostmansd
Hm, maybe it's possible to do it this way:
1) IM converts the image to a monochrome PBM; every PBM is a sequence of 1s and 0s, where 1 is black and 0 is white;
2) IM scans the PBM and finds where the combination of 1s and 0s looks like text;
3) IM remembers the square at the first place that looks like text;
4) IM takes the coordinates of the first white pixel and passes them to the script.
Re: How to find the text?
Posted: 2011-05-05T10:44:04-07:00
by fmw42
define "where combination of 1 and 0 looks like text;"
IM has no knowledge of the shapes of text and cannot distinguish 1 and 0 combinations in images from those of text. But again Anthony may have a better idea how to proceed.
Re: How to find the text?
Posted: 2011-05-05T11:23:13-07:00
by ghostmansd
Yeah, that was really foolish.
I will wait for Anthony: it seems he knows something about this, but he deleted his old post in my previous topic. Anyway, many thanks again!
Re: How to find the text?
Posted: 2011-05-05T12:27:19-07:00
by fmw42
If all you want is to colorize the background, then just use the color of pixel 0,0
color=`convert image -format "%[pixel:u.p{0,0}]" info:`
convert image -fuzz XX% -fill newcolor -opaque $color resultimage
where -fuzz XX% allows you some flexibility to match colors close to $color and recolor them all
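To see what this recipe does pixel by pixel, here is a minimal Python sketch of the same idea on a plain list of RGB tuples. The function name `recolor_like_corner`, the `fuzz_percent` parameter, and the Euclidean distance measure are assumptions made for illustration; IM's own -fuzz metric may differ in detail.

```python
import math

def recolor_like_corner(pixels, new_color, fuzz_percent):
    """Return a copy of `pixels` (rows of (r, g, b) tuples, 0-255) in which
    every pixel close to the colour at (0, 0) is replaced by `new_color`.

    Closeness is Euclidean RGB distance, expressed as a percentage of the
    largest possible distance (black to white). This is an illustrative
    stand-in for -fuzz, not a reimplementation of it.
    """
    target = pixels[0][0]
    max_dist = math.sqrt(3 * 255 ** 2)
    limit = max_dist * fuzz_percent / 100.0
    out = []
    for row in pixels:
        new_row = []
        for px in row:
            dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(px, target)))
            new_row.append(new_color if dist <= limit else px)
        out.append(new_row)
    return out

# A tiny off-white "page" with two dark text pixels.
page = [
    [(250, 250, 240), (248, 249, 241), (10, 10, 10)],
    [(251, 250, 239), (10, 10, 10), (249, 251, 240)],
]
result = recolor_like_corner(page, (255, 255, 255), fuzz_percent=5)
```

Every near-background pixel is recolored wherever it sits in the image, which is why this approach also touches any picture region that happens to share the background color.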
Re: How to find the text?
Posted: 2011-05-05T12:45:06-07:00
by ghostmansd
That also affects the picture in the corner. In TIFF files, at least.
Re: How to find the text?
Posted: 2011-05-05T13:55:41-07:00
by fmw42
ghostmansd wrote: That also affects the picture in the corner. In TIFF files, at least.
You could do a floodfill, but then the background inside any text that has holes in it, such as the letter O, will not get recolored. I am afraid there is not likely to be an optimum solution that works as you would like. But let's see what Anthony suggests.
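The limitation with enclosed holes can be shown with a small Python sketch. This is a generic 4-connected flood fill on a character grid, not IM's floodfill code; the function name `flood_fill` and the letters B (background), T (text stroke) and W (white) are made up for the example.

```python
from collections import deque

def flood_fill(grid, start, new_value):
    """Fill the 4-connected region containing `start` (row, col) in place."""
    rows, cols = len(grid), len(grid[0])
    r0, c0 = start
    old_value = grid[r0][c0]
    if old_value == new_value:
        return grid
    queue = deque([start])
    grid[r0][c0] = new_value
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == old_value:
                grid[nr][nc] = new_value
                queue.append((nr, nc))
    return grid

# The ring of T pixels is a crude letter "O" on a background of B pixels.
letter_o = [list(row) for row in [
    "BBBBB",
    "BTTTB",
    "BTBTB",
    "BTTTB",
    "BBBBB",
]]
flood_fill(letter_o, (0, 0), "W")
filled = ["".join(row) for row in letter_o]
```

After the fill starting at pixel 0,0, the outer background becomes W, but the single B pixel enclosed by the ring keeps its old value because no path of background pixels connects it to the starting point.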
Re: How to find the text?
Posted: 2011-05-06T02:58:41-07:00
by ghostmansd
Here is an example in Python (with the Python Imaging Library).
Example (pavian's photo with text)
Code:
from PIL import Image

# Load as 8-bit grayscale so that every pixel is a single intensity value.
im = Image.open('D:/pavian.png').convert('L')
w, h = im.size
a = [[0] * w for i in range(h)]  # pixel intensities
b = [[0] * w for i in range(h)]  # score of the block each pixel belongs to
for i in range(h):
    for j in range(w):
        a[i][j] = im.getpixel((j, i))

# Row and column offsets of the 8 neighbours of a pixel.
d = [[-1, -1], [-1, 0], [-1, 1], [0, 1], [1, 1], [1, 0], [1, -1], [0, -1]]
c = 40  # threshold for the intensity difference between two neighbours
s = 20  # side length of the square blocks


def foo(p, q):
    # Count the sharp intensity jumps between neighbours inside the block
    # whose top-left corner is at row p, column q.
    cnt = 0
    for i in range(p + 1, p + s - 1):
        for j in range(q + 1, q + s - 1):
            for k in range(8):
                if abs(a[i][j] - a[i + d[k][0]][j + d[k][1]]) > c:
                    cnt += 1
    return cnt


z = 0  # highest block score seen so far
for i in range(0, h - s, s):
    for j in range(0, w - s, s):
        p = foo(i, j)
        for k in range(s):
            for l in range(s):
                b[i + k][j + l] = p
        z = max(z, p)

# Blocks scoring above half the maximum become white, all others black.
for i in range(h):
    for j in range(w):
        if b[i][j] > z / 2:
            v = 255
        else:
            v = 0
        im.putpixel((j, i), v)

im.save('D:/pavian_out.png')
Is it possible to implement something like this using IM?
Re: How to find the text?
Posted: 2011-05-09T21:51:04-07:00
by anthony
As Fred said it all comes down to...
define "where combination of 1 and 0 looks like text;"
The only thing I can think of is to use a combination of morphology and segmentation to locate rows of small segments, which generally make up characters and words, and thus indicate 'text'.
For example, in your thumbnail image above, a morphology search for long thin horizontal lines could indicate 'text'.
On the other hand, defining "what is an image" within the page image may in fact be a lot easier!
Again, you would use morphology and segmentation to learn what makes up the page, but in this case an 'image' would be any segment larger than, say, 2 or 3 typical text rows.
I have done this using Fred's "Multi Crop" to locate the images on a page. (My own need was for the images, not the text.)
My own modified version is at:
http://www.imagemagick.org/Usage/scripts/multi_crop
This does a sparse grid search for any large segments (defined as NOT the background color). If the segment is too small it gets ignored (small character). A small change will let it output a list of rectangles that it thinks are areas of 'non-text'.
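As a rough illustration of the size rule only (this is not how multi_crop works internally; it scans every pixel instead of a sparse grid), here is a Python sketch on a binary grid. The names `find_segments` and `classify`, and the parameters `row_height` and `factor`, are hypothetical.

```python
from collections import deque

def find_segments(grid, background=0):
    """Return bounding boxes (top, left, bottom, right) of the 8-connected
    non-background regions, in scan order."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == background or seen[r][c]:
                continue
            top, left, bottom, right = r, c, r, c
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and not seen[ny][nx]
                                and grid[ny][nx] != background):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes

def classify(boxes, row_height, factor=2):
    """Label a box 'image' if it is taller than `factor` text rows, else 'text'."""
    labelled = []
    for top, left, bottom, right in boxes:
        height = bottom - top + 1
        kind = "image" if height > factor * row_height else "text"
        labelled.append((kind, (top, left, bottom, right)))
    return labelled

# Two one-pixel "characters" in the top row and one 4x3 "picture" below them.
page = [
    [1, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 1, 1, 1, 0],
]
labels = classify(find_segments(page), row_height=1)
```

On a real page the grid would come from thresholding against the background color, and `row_height` would have to be estimated from the document.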
WARNING: whatever you do you will need to do a fast 'preview' to check that it worked fine. In my own use I came across pages with overlapping images, extra lines and boxes, or slight image rotations, or text inserts in larger images, that needed some extra work on those specific pages to deal with. But in general it worked and saved me a LOT of work in manually processing each and every page (about a thousand pages).
Re: How to find the text?
Posted: 2011-05-23T11:42:14-07:00
by ghostmansd
Anthony, many thanks! That's an amazingly useful tool! Now all I have to do is:
1. Make a black-and-white copy of the image.
2. Cut the images out of the page image.
3. Insert the images back into their correct positions.
The last step is the most difficult. The script must remember the BEGINNING(x,y) and ENDING(x,y) coordinates of each image (assuming your script treats each image as a rectangle). I think the best way is to write the name of each image to a text file. For example:
Code:
/tmp/image-1.png,20,71,95,100
/tmp/image-2.png,300,150,374,200
/tmp/image-3.png,700,503,729,625
The first column is the filename, the second and third are the x- and y-coordinates of the beginning, and the fourth and fifth are the x- and y-coordinates of the ending. Is this possible to implement?
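A list in this layout is easy to read back. The Python sketch below parses it and turns each record into a WxH+X+Y geometry string. The names `Placement`, `parse_placements` and `geometry` are hypothetical, and treating the ending coordinates as inclusive is an assumption, since the post does not say.

```python
import csv
import io
from typing import NamedTuple

class Placement(NamedTuple):
    filename: str
    x0: int
    y0: int
    x1: int
    y1: int

    @property
    def width(self):
        # Assumption: the ending coordinate is inclusive.
        return self.x1 - self.x0 + 1

    @property
    def height(self):
        return self.y1 - self.y0 + 1

def parse_placements(text):
    """Parse 'filename,x0,y0,x1,y1' lines, skipping blank ones."""
    rows = csv.reader(io.StringIO(text))
    return [Placement(r[0], int(r[1]), int(r[2]), int(r[3]), int(r[4]))
            for r in rows if r]

def geometry(p):
    """Format a placement as a WxH+X+Y geometry string."""
    return f"{p.width}x{p.height}+{p.x0}+{p.y0}"

listing = """/tmp/image-1.png,20,71,95,100
/tmp/image-2.png,300,150,374,200
/tmp/image-3.png,700,503,729,625
"""
placements = parse_placements(listing)
geometries = [geometry(p) for p in placements]
```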
Re: How to find the text?
Posted: 2011-05-23T12:29:07-07:00
by fmw42
You don't have to remember the coordinates. If you use -crop without adding +repage and save in PNG format, which preserves the virtual canvas, then you can flatten the cropped images back into their original places, because -flatten will use the virtual canvas information stored in the file.
see
http://www.imagemagick.org/Usage/crop/#crop
http://www.imagemagick.org/Usage/layers/#flatten
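The idea behind this can be modelled in a few lines of Python. This is only a conceptual sketch on lists of numbers, not IM's implementation or the PNG storage format; the dictionary keys `pixels`, `offset` and `page` and both function names are made up for illustration.

```python
def crop_keep_offset(canvas, x, y, w, h):
    """Cut a w-by-h tile at (x, y) and remember where it came from,
    in the way a crop without +repage keeps its virtual canvas."""
    tile = [row[x:x + w] for row in canvas[y:y + h]]
    page = (len(canvas[0]), len(canvas))  # (width, height) of the full canvas
    return {"pixels": tile, "offset": (x, y), "page": page}

def flatten(layers, background):
    """Paste every layer at its stored offset onto a background-filled canvas."""
    width, height = layers[0]["page"]
    out = [[background] * width for _ in range(height)]
    for layer in layers:
        ox, oy = layer["offset"]
        for dy, row in enumerate(layer["pixels"]):
            for dx, value in enumerate(row):
                out[oy + dy][ox + dx] = value
    return out

original = [
    [1, 2, 0, 0],
    [3, 4, 0, 0],
    [0, 0, 5, 6],
    [0, 0, 7, 8],
]
tiles = [
    crop_keep_offset(original, 0, 0, 2, 2),
    crop_keep_offset(original, 2, 2, 2, 2),
]
restored = flatten(tiles, background=0)
```

Because each tile carries its own offset and page size, no separate coordinate file is needed to put the pieces back.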
Re: How to find the text?
Posted: 2011-05-23T22:04:55-07:00
by anthony
PNG images saved by IM also include an IM-specific profile (very tiny) that stores the original page size too.
Just read all the images in, set the background color and flatten.
I have done something similar in IM Examples, in restoring tile cropping (where the offsets and page size were preserved).
http://www.imagemagick.org/Usage/crop/#crop_tile
Just be careful about PNG images with virtual offsets in web browsers. Some browsers go really screwy!