How to slice up image of text
I have an image of text that I want to slice into individual images automatically. Can someone help me determine the best way to cut out blocks of text that are surrounded by blank space?
To be more specific, I have multiple JPGs (hundreds of pages) that have patient data grouped together in columns over multiple lines, and each patient entry is separated by one blank line. I am trying to create a batch job that creates a new image for each patient's information so that I can send that information to the patient. It is important to note that, while it doesn't show in my example below, there are large spaces (columns) across the page that would need to remain. I'm hoping there's a way to recognize the blank line between records and crop at that point. The number of records varies from page to page. The data is currently formatted as below:
File Number Name Address Patient Data Amount Due
Date Address 2 More Patient Data Other info
City, State Zip Yet More Patient Data
------------------------------------------------------> This line would be a blank space <----------------------
File Number Name Address Patient Data Amount Due
Date Address 2 More Patient Data Other info
City, State Zip Yet More Patient Data
-----------------------------------------------------> This line would be a blank space <------------------------
I greatly appreciate any help you may have to automate this process!
- snibgo
- Posts: 12159
- Joined: 2010-01-23T23:01:33-07:00
- Authentication code: 1151
- Location: England, UK
Re: How to slice up image of text
Please state your platform, and version number of IM.
My "Subimage rectangles" page shows some methods for chopping images into rectangles. Fred has some bash scripts for similar work.
For specific guidance, you need to provide a sample image, but do not include real patient information.
snibgo's IM pages: im.snibgo.com
Re: How to slice up image of text
I applied a pretty heavy blur to the image before posting. You can see a sample of what I'm talking about here: http://i1204.photobucket.com/albums/bb4 ... equeaz.jpg
- fmw42
- Posts: 25562
- Joined: 2007-07-02T17:14:51-07:00
- Authentication code: 1152
- Location: Sunnyvale, California, USA
Re: How to slice up image of text
So what exactly are you trying to extract? If you average (-scale) the image down to one column and save it in txt format, the output will show you where the blank lines are (pure white). You will have sections of dark separated by a few pixels of white. Use the white areas as the y locations at which to crop your image. You will have to write a script to extract those coordinates and compute the crops from them. You want the span of y values between each pair of successive white regions.
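The idea above (average each pixel row, find the all-white runs, cut between them) can be sketched in Python. This is only an illustrative sketch, not fmw42's script: the function names find_strips, row_means_from_image and save_strips, the 2% whiteness tolerance and the min_gap parameter are all assumptions made for the example, and the two image helpers assume the third-party Pillow library is installed.

```python
from typing import List, Tuple


def find_strips(row_means: List[float], white: float = 255.0,
                tolerance: float = 0.02, min_gap: int = 1) -> List[Tuple[int, int]]:
    """Return (top, bottom) row ranges, bottom exclusive, one per block of text.

    row_means: average grey level of each pixel row (0 = black, `white` = white).
    tolerance: a row counts as blank if it is within this fraction of white.
    min_gap:   blank runs shorter than this many rows do not split a block
               (assumed knob, so ordinary line spacing inside a record is ignored).
    """
    cutoff = white * (1.0 - tolerance)
    strips = []
    start = None      # first inked row of the block being built
    last_dark = None  # most recent inked row seen
    for y, mean in enumerate(row_means):
        if mean >= cutoff:
            continue  # blank row
        if start is None:
            start = y
        elif y - last_dark - 1 >= min_gap:
            # The blank run just passed is long enough: close the block.
            strips.append((start, last_dark + 1))
            start = y
        last_dark = y
    if start is not None:
        strips.append((start, last_dark + 1))
    return strips


def row_means_from_image(path: str) -> List[float]:
    """Average each pixel row of a greyscale copy of the image (needs Pillow)."""
    from PIL import Image  # third-party: pip install Pillow
    with Image.open(path) as img:
        grey = img.convert("L")
    width, height = grey.size
    data = grey.tobytes()  # one byte per pixel, row by row
    return [sum(data[y * width:(y + 1) * width]) / width for y in range(height)]


def save_strips(path: str, strips: List[Tuple[int, int]], prefix: str = "strip") -> None:
    """Crop each full-width strip out of the image and save it as a PNG."""
    from PIL import Image  # third-party: pip install Pillow
    with Image.open(path) as img:
        for i, (top, bottom) in enumerate(strips):
            img.crop((0, top, img.width, bottom)).save(f"{prefix}_{i}.png")
```

A call such as save_strips("page.jpg", find_strips(row_means_from_image("page.jpg"), min_gap=10)) would write one PNG per record; "page.jpg" and min_gap=10 are placeholders that would need tuning against a real scan. Because finding and saving are separate steps, the strips could also be found on one image and applied to another of the same size.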
Re: How to slice up image of text
If I were to describe doing this exercise with paper and scissors, I would cut the page into (horizontal) strips with one person's info on each strip. In the example I posted, ignoring the header, I would end up with 11 strips. I want 11 JPEGs or GIFs, but I'm not sure how to batch this.
I didn't post in that area, but I can. Depending on the rate, I'd be willing to pay someone to write the script to do it.
Here's what I'd like to end up with: http://i1204.photobucket.com/albums/bb4 ... yyxbcp.jpg
Re: How to slice up image of text
This does what I think you want:
Code: Select all
call %PICTBAT%rectSubimages page1_zpsylequeaz.jpg subs.tiff White 2 0.1c
The script is shown on the page I referenced. It makes 41 outputs.
In essence, it uses "-connected-components" to find the areas that are at least 2% darker than white, and that have an area (in pixels) of at least 0.1% of the total image.
It is a Windows BAT script, but readily translated to bash.
It works fine on the blurred image. It would need modifying to get the cropping areas from the blurred image, but then to actually crop the unblurred image.
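To illustrate what "areas at least 2% darker than white, with at least 0.1% of the image area" means, here is a minimal plain-Python sketch of connected-component labelling with those two thresholds. It is not the rectSubimages script and not ImageMagick's implementation: the function name dark_regions, the list-of-rows input and the 4-neighbour connectivity are assumptions made for the example. On unblurred text each letter would likely come out as its own region.

```python
from collections import deque
from typing import List, Tuple


def dark_regions(grey: List[List[int]], white: int = 255,
                 pct_darker: float = 2.0,
                 min_area_pct: float = 0.1) -> List[Tuple[int, int, int, int]]:
    """Return bounding boxes (left, top, right, bottom), right/bottom exclusive,
    of connected dark regions that are large enough to keep."""
    height, width = len(grey), len(grey[0])
    cutoff = white * (1.0 - pct_darker / 100.0)       # darker than this = "ink"
    min_area = width * height * min_area_pct / 100.0  # smallest region to keep
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y0 in range(height):
        for x0 in range(width):
            if seen[y0][x0] or grey[y0][x0] >= cutoff:
                continue
            # Breadth-first flood fill over 4-connected dark pixels.
            queue = deque([(x0, y0)])
            seen[y0][x0] = True
            left = right = x0
            top = bottom = y0
            area = 0
            while queue:
                x, y = queue.popleft()
                area += 1
                left, right = min(left, x), max(right, x)
                top, bottom = min(top, y), max(bottom, y)
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and not seen[ny][nx] and grey[ny][nx] < cutoff):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            if area >= min_area:
                boxes.append((left, top, right + 1, bottom + 1))
    return boxes
```

Each returned box could then be used as a crop rectangle; regions below the area threshold (specks and scanner noise) are simply dropped.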
snibgo's IM pages: im.snibgo.com
Re: How to slice up image of text
I just modified the previous post, but just so everyone reading sees it, this is ideally what I'd like to get out of a script: http://i1204.photobucket.com/albums/bb4 ... yyxbcp.jpg
Re: How to slice up image of text
That's interesting @snibgo... I assume that's one "strip" all smashed together?
Re: How to slice up image of text
damonm wrote: ...I would cut the page into (horizontal) strips with one person's info on each strip.
Oh, okay, I misunderstood.
For just horizontal strips: first trim off the black mark down the right side. Then use:
Code: Select all
call %PICTBAT%guillotine page1_zpsylequeaz.jpg subs_XX.png White 2 . 1
snibgo's IM pages: im.snibgo.com
Re: How to slice up image of text
Oh, I apologize. I'm on Windows, using ImageMagick 7.0.5 Q16.
So I think the function you gave me will change as a result, right?
Re: How to slice up image of text
Yes, forget about my post above that calls rectSubimages. Instead:
Code: Select all
%IM%convert page1_zpsylequeaz.jpg -crop 95x100%%+0+0 pat_crp.png
call %PICTBAT%guillotine pat_crp.png subs_XX.png White 2 . 1
This crops off the junk at the right-hand side. You could also crop off the headings etc. Then it calls guillotine.bat, which writes png_0.png to png_13.png.
If you look at the script, guillotine.bat, it calls guilFind to find the crop parameters from %1. It then uses those parameters to call guilChop that chops image %1. If you change that second %1 to %7, you can call guillotine.bat with seven parameters. The seventh will be the actual file of text that you want cropped.
snibgo's IM pages: im.snibgo.com
Re: How to slice up image of text
Alright, so now let me reveal my true ImageMagick ignorance: where do I put these commands? When I was messing with this earlier, I was doing it in PowerShell using things like "magick ScannedImage.jpg -shave 60x40+0+40 Cropped.jpg", but the commands you gave me don't work.
Re: How to slice up image of text
I don't use Powershell. I don't know how to call BAT scripts from within Powershell.
My scripts were written for IM v6. As you are using v7, you should prefix each IM command with "magick".
snibgo's IM pages: im.snibgo.com
Re: How to slice up image of text
ok great, let me give it a try!
Re: How to slice up image of text
I tried to do this, but I don't have any batch files in the ImageMagick directory. Where do I find the guillotine.bat file?