				How to slice up image of text
				Posted: 2017-03-24T12:08:46-07:00
				by damonm
				I have an image of text that I am trying to slice up automatically into individual images.  Can someone help me determine the best way to cut blocks of text that are surrounded by blank space out of an image?
To be more specific, I have multiple JPEGs (hundreds of pages) that have patient data grouped together in columns over multiple lines, and each patient entry is separated by one blank line.  I am trying to create a batch job that creates a new image for each patient's information so that I can send that information to the patient.  It is important to note that, although it doesn't show in my example below, there are large spaces (columns) across the page that would need to remain, so I'm hoping there's a way to recognize the blank line between records and crop at that point.  Each page may or may not contain the same number of records (it's random). The data is currently formatted as below:
File Number    Name                Address                    Patient Data                       Amount Due 
Date                                       Address 2                  More Patient Data                Other info
                                              City, State Zip          Yet More Patient Data
------------------------------------------------------> This line would be a blank space <----------------------
File Number    Name              Address                       Patient Data                       Amount Due
Date                                     Address 2                     More Patient Data               Other info
                                            City, State Zip             Yet More Patient Data
-----------------------------------------------------> This line would be a blank space <------------------------
I greatly appreciate any help you may have to automate this process!
			 
			
					
				Re: How to slice up image of text
				Posted: 2017-03-24T12:37:25-07:00
				by snibgo
				Please state your platform, and version number of IM.
My "Subimage rectangles" page shows some methods for chopping images into rectangles. Fred has some bash scripts for similar work.
For specific guidance, you need to provide a sample image, but do not include real patient information.
			 
			
					
				Re: How to slice up image of text
				Posted: 2017-03-24T15:05:55-07:00
				by damonm
				I did some pretty major blur on the image before posting. You can see a sample of what I'm talking about here:
http://i1204.photobucket.com/albums/bb4 ... equeaz.jpg 
			
					
				Re: How to slice up image of text
				Posted: 2017-03-24T15:19:39-07:00
				by fmw42
				So what exactly are you trying to extract? If you average (-scale) the image down to one column and save it in txt format, it will show you where the blank lines exist (pure white). You will have sections of dark separated by a few pixels of white. Use the white areas as the y locations to crop your image. You will have to write a script to extract those coordinates and compute the cropping from them. You want each span of y values between successive white regions.
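The last step described above (turning the white/dark pattern of the one-column image into crop coordinates) can be sketched in Python. This is only an illustration under assumptions: `row_means` is assumed to be a list of per-row gray values (0 to 255) already read from the txt output, and the function names, the `tolerance` parameter, and the page width are hypothetical. Parsing the txt file itself is not shown.

```python
def find_text_bands(row_means, white=255, tolerance=2):
    """Return (top, bottom) pairs for each run of non-white rows.

    bottom is exclusive, so the band height is bottom - top.
    A row counts as blank when its value is within `tolerance` of white.
    """
    bands = []
    start = None
    for y, value in enumerate(row_means):
        is_blank = value >= white - tolerance
        if not is_blank and start is None:
            start = y                      # a dark band begins here
        elif is_blank and start is not None:
            bands.append((start, y))       # the band ended on the previous row
            start = None
    if start is not None:                  # band runs to the bottom edge
        bands.append((start, len(row_means)))
    return bands


def crop_geometries(bands, width):
    """Format each band as a WxH+X+Y geometry string for -crop."""
    return [f"{width}x{bottom - top}+0+{top}" for top, bottom in bands]


# Made-up column of row averages: two dark bands separated by white rows.
example = [255, 255, 120, 130, 255, 255, 90, 100, 110, 255]
for geometry in crop_geometries(find_text_bands(example), 800):
    print(geometry)
```

Each printed geometry could then be passed to a separate crop command, one per strip.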
			 
			
					
				Re: How to slice up image of text
				Posted: 2017-03-24T15:34:27-07:00
				by damonm
				If I were to describe doing this exercise with paper and scissors, I would cut the page into (horizontal) strips with one person's info on each strip.  In the example I posted, ignoring the header, I would end up with 11 strips.  I want 11 JPEGs or GIFs, but I'm not sure how to batch this.
I didn't post in that area, but I can. Depending on the rate, I'd be willing to pay someone to write the script to do it.
Here's what I'd like to end up with:  
http://i1204.photobucket.com/albums/bb4 ... yyxbcp.jpg 
			
					
				Re: How to slice up image of text
				Posted: 2017-03-24T15:53:44-07:00
				by snibgo
				This does what I think you want:
Code:
call %PICTBAT%rectSubimages page1_zpsylequeaz.jpg subs.tiff White 2 0.1c
The script is shown on the page I referenced. It makes 41 outputs, such as:
 
In essence, it uses "-connected-components" to find the areas that are at least 2% darker than white, and that have an area (in pixels) of at least 0.1% of the total image.
It is a Windows BAT script, but readily translated to bash.
It works fine on the blurred image. It would need modifying to get the cropping areas from the blurred image, but then to actually crop the unblurred image.
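To make the idea concrete, here is a rough Python sketch of region finding by thresholding and flood fill. It is not the rectSubimages script and not ImageMagick's "-connected-components" implementation. It assumes the image is already available as a list of rows of gray values (0 to 255), uses 4-connectivity, and the function and parameter names are made up for illustration.

```python
from collections import deque


def dark_regions(pixels, darker_pct=2.0, min_area_pct=0.1):
    """Return bounding boxes (x, y, width, height) of dark regions.

    A pixel is "dark" when it is at least darker_pct percent darker
    than white. Regions smaller than min_area_pct percent of the
    whole image are discarded.
    """
    height, width = len(pixels), len(pixels[0])
    threshold = 255 * (1 - darker_pct / 100.0)
    min_area = height * width * min_area_pct / 100.0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y0 in range(height):
        for x0 in range(width):
            if seen[y0][x0] or pixels[y0][x0] > threshold:
                continue
            # Flood-fill one region, tracking its area and extent.
            queue = deque([(x0, y0)])
            seen[y0][x0] = True
            area = 0
            left, right, top, bottom = x0, x0, y0, y0
            while queue:
                x, y = queue.popleft()
                area += 1
                left, right = min(left, x), max(right, x)
                top, bottom = min(top, y), max(bottom, y)
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and not seen[ny][nx]
                            and pixels[ny][nx] <= threshold):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            if area >= min_area:
                boxes.append((left, top, right - left + 1, bottom - top + 1))
    return boxes


# Tiny made-up page: a 2x2 dark block and a single dark pixel.
page = [
    [255, 255, 255, 255, 255, 255],
    [255, 0, 0, 255, 255, 255],
    [255, 0, 0, 255, 255, 255],
    [255, 255, 255, 255, 255, 255],
    [255, 255, 255, 255, 10, 255],
]
print(dark_regions(page))                     # keeps both regions
print(dark_regions(page, min_area_pct=10.0))  # drops the single pixel
```

Because every separate dark blob becomes its own region, a page of text yields many small rectangles rather than one strip per record.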
 
			
					
				Re: How to slice up image of text
				Posted: 2017-03-24T15:55:11-07:00
				by damonm
				I just modified the previous post, but just so everyone reading sees it, this is ideally what I'd like to get out of a script: 
http://i1204.photobucket.com/albums/bb4 ... yyxbcp.jpg 
			
					
				Re: How to slice up image of text
				Posted: 2017-03-24T16:03:20-07:00
				by damonm
				That's interesting @snibgo...  I assume that's one "strip" all smashed together?
			 
			
					
				Re: How to slice up image of text
				Posted: 2017-03-24T16:05:44-07:00
				by snibgo
				damonm wrote:...I would cut the page into (horizontal) strips with one person's info on each strip.
Oh, okay, I misunderstood.
For just horizontal strips: first trim off the black mark down the right side. Then use:
Code:
call %PICTBAT%guillotine page1_zpsylequeaz.jpg subs_XX.png White 2 . 1
You still haven't said what version IM you use, on what platform.
 
			
					
				Re: How to slice up image of text
				Posted: 2017-03-24T16:21:52-07:00
				by damonm
				Oh, I apologize. I'm using ImageMagick 7.0.5-Q16 on Windows.
So I think the function you gave me will change as a result, right?
			 
			
					
				Re: How to slice up image of text
				Posted: 2017-03-24T16:36:52-07:00
				by snibgo
				Yes, forget about my post above that calls rectSubimages. Instead:
Code:
%IM%convert page1_zpsylequeaz.jpg -crop 95x100%%+0+0 pat_crp.png
call %PICTBAT%guillotine pat_crp.png subs_XX.png White 2 . 1
This crops off the junk at the right-hand side. You could also crop off the headings etc. Then it calls guillotine.bat, which writes png_0.png to png_13.png.
png_12.png is:
 
If you look at the script, guillotine.bat, it calls guilFind to find the crop parameters from %1. It then uses those parameters to call guilChop that chops image %1. If you change that second %1 to %7, you can call guillotine.bat with seven parameters. The seventh will be the actual file of text that you want cropped.
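The find-on-one-image, chop-another idea can be sketched in Python. This is not a translation of guillotine.bat, guilFind, or guilChop. It assumes both images are already loaded as lists of rows of the same height, and the function name and the `tolerance` value are hypothetical.

```python
def guillotine(find_rows, chop_rows, white=255, tolerance=5):
    """Split chop_rows into horizontal strips using blank rows of find_rows.

    find_rows: rows of gray values (0 to 255) used to locate blank lines,
               for example the blurred page.
    chop_rows: rows of the image to actually cut, for example the
               unblurred page. Must have the same number of rows.
    Returns a list of strips, each a list of rows taken from chop_rows.
    """
    if len(find_rows) != len(chop_rows):
        raise ValueError("both images must have the same height")
    strips = []
    current = None
    for find_row, chop_row in zip(find_rows, chop_rows):
        is_blank = all(value >= white - tolerance for value in find_row)
        if is_blank:
            current = None            # a blank row closes the open strip
        else:
            if current is None:       # first dark row opens a new strip
                current = []
                strips.append(current)
            current.append(chop_row)
    return strips


# Made-up data: crop positions come from `blurred`, pixels from `original`.
blurred = [[255, 255], [100, 255], [255, 255], [90, 80], [200, 255]]
original = [[1, 1], [2, 2], [3, 3], [4, 4], [5, 5]]
print(guillotine(blurred, original))
```

Keeping the two inputs separate means the crop positions can come from the version posted publicly while the pixels come from the private original.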
 
			
					
				Re: How to slice up image of text
				Posted: 2017-03-24T16:46:15-07:00
				by damonm
				Alright, so now let me reveal my true ImageMagick ignorance: where do I put these commands?  When I was messing with this earlier, I was doing it in PowerShell using things like "magick ScannedImage.jpg -shave 60x40+0+40 Cropped.jpg", but the ones you gave me don't work.
			 
			
					
				Re: How to slice up image of text
				Posted: 2017-03-24T17:06:32-07:00
				by snibgo
				I don't use PowerShell. I don't know how to call BAT scripts from within PowerShell.
My scripts were written for IM v6. As you are using v7, you should prefix each IM command with "magick".
			 
			
					
				Re: How to slice up image of text
				Posted: 2017-03-24T17:08:48-07:00
				by damonm
				ok great, let me give it a try!
			 
			
					
				Re: How to slice up image of text
				Posted: 2017-03-24T17:24:10-07:00
				by damonm
				I tried to do this, but I don't have any batch files in the ImageMagick directory...  Where do I find the guillotine.bat file?