Trying to manipulate captcha for pre-OCR

wakamura
Posts: 2
Joined: 2016-07-28T08:09:32-07:00

Trying to manipulate captcha for pre-OCR

Post by wakamura »

I have a captcha that I'm trying to read with Tesseract OCR.
The problem is that there are dots and lines across the letters. I have been experimenting with ImageMagick to remove them so that the OCR can read the letters, but so far without success.

(Original image)
http://imgur.com/a/b3osy

(After image)
http://imgur.com/a/1sedw

The process I used to get to the "after" image:

convert captcha1.png -level 20000,0,20000 captcha1.png
convert captcha1.png captcha1.pgm
convert captcha1.pgm -black-threshold 65000 captcha1.tif
convert captcha1.tif -negate captcha1.tif
convert captcha1.tif -threshold 90% captcha1.tif
convert captcha1.tif -morphology Convolve "3x3: 0.1,0.0,05 0.0,0.5,0.5 0.1,0.1,0.1" captcha1.tif
convert captcha1.tif -blur 1 captcha1.tif
convert captcha1.tif -threshold 80% captcha1.tif
convert captcha1.tif -morphology Convolve "3x3: 0.1,0.0,05 0.0,0.5,0.5 0.1,0.1,0.1" captcha1.tif
convert captcha1.tif -morphology Convolve "3x3: 0.1,0.0,05 0.0,0.5,0.5 0.1,0.1,0.1" captcha1.tif
convert captcha1.tif -threshold 80% captcha1.tif
convert captcha1.tif -morphology Convolve "3x3: 0.1,0.0,05 0.0,0.5,0.5 0.1,0.1,0.1" captcha1.tif
convert captcha1.tif -blur 1 captcha1.tif
convert captcha1.tif -threshold 90% captcha1.tif
convert captcha1.png -level 20000,0,20000 captcha1.png
convert captcha1.png captcha1.pgm
convert captcha1.pgm -black-threshold 65000 captcha1.tif
convert captcha1.tif -negate captcha1.tif
convert captcha1.tif -threshold 90% captcha1.tif
convert captcha1.tif -morphology Dilate rectangle:3x3 captcha1.tif
convert captcha1.tif -morphology Erode rectangle:5x1 captcha1.tif
As you can see, I am very new to this. For the first part I followed a guide I found online, then spent a few days on trial and error, still to no avail.
For now I am running everything from the Command Prompt to try to get the OCR to read the letters correctly.
Can anyone point me in the right direction for removing the leftover lines, or suggest a better method?

EDIT: Images weren't showing.
snibgo
Posts: 12159
Joined: 2010-01-23T23:01:33-07:00
Location: England, UK

Re: Trying to manipulate captcha for pre-OCR

Post by snibgo »

I won't help attempts to defeat captcha.
snibgo's IM pages: im.snibgo.com
wakamura
Posts: 2
Joined: 2016-07-28T08:09:32-07:00

Re: Trying to manipulate captcha for pre-OCR

Post by wakamura »

snibgo wrote: I won't help attempts to defeat captcha.
Okay.