Crop all areas with a certain color (get text out of colored areas)

Questions and postings pertaining to the usage of ImageMagick regardless of the interface. This includes the command-line utilities, as well as the C and C++ APIs. Usage questions are like "How do I use ImageMagick to create drop shadows?".
ederhj
Posts: 11
Joined: 2015-08-30T02:57:13-07:00
Authentication code: 1151

Post by ederhj »

Hello,

I find your scripts very useful and great.

So I would like to ask for your help with an issue that I think could be solved with your scripts, but I don't know how.

I have a scanned document as a JPG in which many areas are marked (with a highlighter pen) in 3 different colors.
This scan is read by OCR software (tesseract).
Now I have the idea to tag and rename the file according to the colors in the file.
That means, for example:
- the yellow-marked text should be used for the filename
- the green-marked text for tagging the file
- and so on.

My idea is to crop all areas with a certain color and then run OCR on them.
But how do I get a separate file for each colored area out of the source file?

Or is there another way to solve this problem?

It would be very nice if you could help me.

Thank U and best regards
ederhj

Post by ederhj »

I hope it is clear what I want to do.
If not, please ask.

I think everyone who wants a paperless office has this need.

Thank U
snibgo
Posts: 12159
Joined: 2010-01-23T23:01:33-07:00
Authentication code: 1151
Location: England, UK

Post by snibgo »

Put up a sample image, and your expected results.
snibgo's IM pages: im.snibgo.com

Post by ederhj »

Hello,

The file is here: https://www.hightail.com/download/bXBaR ... V3lHR3NUQw

My expectation is that I get one image for each marked area, which I can then run OCR on.

Thank U

Post by snibgo »

I hope you have a better source than JPG.

The colours are more saturated than the background, so we can distinguish them by saturation, and use that to mask out the rest of the image.

Code: Select all

%IM%convert scan.jpg -colorspace HSL -channel G -separate +channel -threshold 30%% s.png

%IM%convert scan.jpg s.png -compose CopyOpacity -composite s2.png

Post by ederhj »

Almost perfect.

There is only one more thing I need:
I need this extraction for each color, which in my example means 4 export files for each color.

How would that work?

And: THANK U for your support.
fmw42
Posts: 25562
Joined: 2007-07-02T17:14:51-07:00
Authentication code: 1152
Location: Sunnyvale, California, USA

Post by fmw42 »

You can use the following to list the crop areas corresponding to the largest white areas in the s.png image. You will get too many, due to the large white region at the bottom right. But you can then throw out the ones you do not need, or examine the bounding box coordinates and keep only shapes that are longer in x than in y. Skip the first one, which has color gray(0) and is the black background. Then take the following ones that are gray(255) until you reach one that is gray(0) again. See http://magick.imagemagick.org/script/co ... onents.php

Code: Select all

convert s.png -define connected-components:verbose=true -connected-components 4 null:
Objects (id: bounding-box centroid area mean-color):
0: 885x1090+0+0 437.8,544.0 927407 gray(0)
286: 101x161+784+929 847.8,1024.5 9443 gray(255) <-- not likely what you want, not the right w/h aspect ratio
215: 176x32+278+359 365.2,373.9 4914 gray(255) <-- probably a good one
217: 193x29+524+384 618.3,398.1 4525 gray(255) <-- probably a good one
223: 130x35+361+544 425.3,561.1 3839 gray(255) <-- probably a good one
72: 207x28+469+170 575.2,182.4 3504 gray(255) <-- probably a good one
52: 157x25+239+115 316.9,126.7 3238 gray(255) <-- probably a good one
228: 83x26+347+636 387.0,648.5 1821 gray(255) <-- probably a good one
219: 84x25+136+421 177.2,432.4 1804 gray(255) <-- probably a good one
211: 80x26+139+228 177.3,240.4 1796 gray(255) <-- probably a good one
327: 17x21+836+939 843.5,950.1 168 gray(255) <-- probably too small and not the right w/h aspect ratio
505: 12x11+801+1045 807.2,1050.5 85 gray(255) <-- probably too small and not the right w/h aspect ratio
573: 26x8+774+1082 786.3,1086.6 75 gray(255) <-- probably too small and not the right w/h aspect ratio

96: 11x12+633+178 638.6,183.5 63 gray(0)
442: 12x13+793+1027 798.8,1033.4 63 gray(255)
377: 14x8+771+1000 777.9,1004.4 62 gray(255)
379: 9x7+804+1001 807.5,1004.3 55 gray(0)
...
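
The screening by area and aspect ratio described above could be automated rather than done by eye. The following is only a minimal sketch, assuming the listing has exactly the five columns shown (id, bounding box, centroid, area, mean color). The function name filter_components and both thresholds are made up for illustration and would need tuning for a real scan.

```shell
#!/bin/sh
# filter_components MIN_AREA MIN_ASPECT   (hypothetical helper, not part of IM)
# Reads a -connected-components verbose listing on stdin and prints the
# bounding box (WxH+X+Y) of every gray(255) object whose area is at least
# MIN_AREA pixels and whose width/height ratio is at least MIN_ASPECT.
filter_components() {
    awk -v min_area="$1" -v min_aspect="$2" '
        # Data rows look like: 215: 176x32+278+359 365.2,373.9 4914 gray(255)
        $1 ~ /^[0-9]+:$/ && $5 == "gray(255)" {
            split($2, g, "[x+]")      # g[1]=W  g[2]=H  g[3]=X  g[4]=Y
            if ($4 >= min_area && g[1] / g[2] >= min_aspect)
                print $2
        }'
}

# Possible use (assumes s.png exists and ImageMagick is installed):
#   convert s.png -define connected-components:verbose=true \
#       -connected-components 4 null: | filter_components 1000 2
```

Applied to the rows listed above, thresholds of 1000 pixels and a ratio of 2 keep the eight boxes marked "probably a good one" and drop the tall region at the bottom right as well as the small specks.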

Post by fmw42 »

P.S. If you take all the top white areas and crop your image, you can then filter further by the average color of the cropped regions. Throw out any that are near gray, i.e. keep only those whose average color is strongly saturated.
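
As a rough illustration of that color test, the sketch below takes the mean R, G, B values of one cropped region (0 to 255 each; how they are obtained is left open here) and reports either "gray" or a coarse hue name. The function name classify_rgb and the 60-degree hue bins are assumptions made for this example. The 30% saturation cut-off simply reuses the threshold from the masking command earlier in the thread.

```shell
#!/bin/sh
# classify_rgb R G B   (hypothetical helper; values in the range 0..255)
# Prints "gray" when the HSL saturation is below 30%, otherwise a coarse
# hue name. The hue bins are illustrative, not taken from ImageMagick.
classify_rgb() {
    awk -v r="$1" -v g="$2" -v b="$3" 'BEGIN {
        r /= 255; g /= 255; b /= 255
        max = r; if (g > max) max = g; if (b > max) max = b
        min = r; if (g < min) min = g; if (b < min) min = b
        d = max - min
        l = (max + min) / 2
        a = 2 * l - 1; if (a < 0) a = -a
        s = (d == 0) ? 0 : d / (1 - a)          # HSL saturation, 0..1
        if (s < 0.30) { print "gray"; exit }
        if (max == r)      h = 60 * ((g - b) / d)
        else if (max == g) h = 60 * ((b - r) / d + 2)
        else               h = 60 * ((r - g) / d + 4)
        if (h < 0) h += 360                      # hue in degrees, 0..360
        if (h < 30 || h >= 330) print "red"
        else if (h < 90)        print "yellow"
        else if (h < 150)       print "green"
        else if (h < 210)       print "cyan"
        else if (h < 270)       print "blue"
        else                    print "magenta"
    }'
}

classify_rgb 250 240 120    # a pale highlighter yellow
classify_rgb 200 200 200    # plain paper gray
```

Sorting the crops by the printed name would give one group of images per marker color, which is one possible route to separate output files for each color.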

Post by ederhj »

Hello,

I'll try.

But is it not simply possible to crop out these areas and save them in new files?

Thank U
Hans-Jürgen

Post by fmw42 »

It could be built into -connected-components, but it is not there now.

You will need to write a script to loop over the textual data and find the top group of gray(255) before the next gray(0), then separate the crop values for each and crop at those coordinates in your input image.

What platform are you on and what version of IM?

If on Unix, then the following will crop out the first group of gray(255) subsections in the list. If you want to be more selective, for example by requiring some maximum (and minimum) area, you can filter further on area. Or, if you want, you can extract the W and H, compute the aspect ratio, and set limits on that.

Code: Select all

list=$(convert s.png \
  -define connected-components:verbose=true -connected-components 4 null: |\
  tail -n +3 | sed -n 's/^ *//p')
i=0
OLDIFS=$IFS
IFS=$'\n'
for row in $list; do
  color=$(echo "$row" | cut -d' ' -f5)
  cropvals=$(echo "$row" | cut -d' ' -f2)
  IFS=$OLDIFS
  if [ "$color" = "gray(255)" ]; then
    convert "scan.jpg[$cropvals]" "scan_$i.jpg"
    i=$((i+1))
  else
    break
  fi
done

Post by ederhj »

Hi,

Well, that would be my preferred solution.

I'm working on Windows with ImageMagick 6.9.2-0 Q16 x86 2015-08-15.

If you could help me here again I would be very pleased.

Thank U

Post by fmw42 »

Sorry, I do not know Windows scripting.

Post by ederhj »

Ah.

No problem. But can you tell me what steps your script performs, so that I can port it to my scripting language?

Thank U

Post by fmw42 »

Code: Select all

# line 1 -- read the input image (the mask s.png)
# line 2 -- process with -connected-components to get the list and pipe it to the next line
# line 3 -- remove the top two rows (the header and the background object) and remove the leading spaces of all rows
# line 4 -- set output image index to 0
# line 5 -- save the existing internal field separator (by default space, tab and newline)
# line 6 -- set the internal field separator to a newline so that each item in the list is a row
# line 7 -- loop over each row in the list
# line 8 -- extract the color field (5th field) of the row
# line 9 -- extract the crop values field (2nd field, the bounding box WxH+X+Y) of the row
# line 10 - restore the saved internal field separator
# line 11 - test if the color variable is gray(255)
# line 12 - if the test passes, crop the input image with the corresponding crop values
# line 13 - increment the output index by 1
# line 14 - else statement of the test
# line 15 - if the test fails, the current color is not gray(255), so break out of the loop and quit
# line 16 - fi is the end of the if test
# line 17 - done is the end of the for loop

Code: Select all

list=$(convert s.png \
  -define connected-components:verbose=true -connected-components 4 null: |\
  tail -n +3 | sed -n 's/^ *//p')
i=0
OLDIFS=$IFS
IFS=$'\n'
for row in $list; do
  color=$(echo "$row" | cut -d' ' -f5)
  cropvals=$(echo "$row" | cut -d' ' -f2)
  IFS=$OLDIFS
  if [ "$color" = "gray(255)" ]; then
    convert "scan.jpg[$cropvals]" "scan_$i.jpg"
    i=$((i+1))
  else
    break
  fi
done
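
To make lines 8 and 9 concrete, here is a small worked example on one row copied from the listing earlier in the thread. It only shows what the two cut calls return and how the bounding box could be split into its numbers, which may help when porting to another scripting language. The variable names w, h, x and y are chosen for this example and do not appear in the script.

```shell
#!/bin/sh
# One data row as printed by -connected-components (leading spaces removed).
row='215: 176x32+278+359 365.2,373.9 4914 gray(255)'

# Fields are separated by single spaces: id, bounding box, centroid, area, color.
color=$(echo "$row" | cut -d' ' -f5)
cropvals=$(echo "$row" | cut -d' ' -f2)
echo "color=$color cropvals=$cropvals"
# prints: color=gray(255) cropvals=176x32+278+359

# Split the geometry WxH+X+Y on the characters "x" and "+".
IFS='x+' read -r w h x y <<EOF
$cropvals
EOF
echo "width=$w height=$h x=$x y=$y"
# prints: width=176 height=32 x=278 y=359
```

So each crop value describes a whole rectangle: here a region 176 pixels wide and 32 pixels high whose top-left corner is at x=278, y=359 in the scan.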

Post by ederhj »

I don't know if I understand it exactly.
You crop each pixel, is that right?

I want to crop a certain area that has a certain color, with a little bit of variance.

The $cropvals is always one pixel, so what do I do with that?

Thank U