The problem is that there are dots and lines across the letters. I have been experimenting with ImageMagick to remove these lines so that the OCR can actually read the letters, but so far without success.
(Original image)
http://imgur.com/a/b3osy
(After image)
http://imgur.com/a/1sedw
The process I used to get to the after image:
convert captcha1.png -level 20000,0,20000 captcha1.png
convert captcha1.png captcha1.pgm
convert captcha1.pgm -black-threshold 65000 captcha1.tif
convert captcha1.tif -negate captcha1.tif
convert captcha1.tif -threshold 90% captcha1.tif
convert captcha1.tif -morphology Convolve "3x3: 0.1,0.0,05 0.0,0.5,0.5 0.1,0.1,0.1" captcha1.tif
convert captcha1.tif -blur 1 captcha1.tif
convert captcha1.tif -threshold 80% captcha1.tif
convert captcha1.tif -morphology Convolve "3x3: 0.1,0.0,05 0.0,0.5,0.5 0.1,0.1,0.1" captcha1.tif
convert captcha1.tif -morphology Convolve "3x3: 0.1,0.0,05 0.0,0.5,0.5 0.1,0.1,0.1" captcha1.tif
convert captcha1.tif -threshold 80% captcha1.tif
convert captcha1.tif -morphology Convolve "3x3: 0.1,0.0,05 0.0,0.5,0.5 0.1,0.1,0.1" captcha1.tif
convert captcha1.tif -blur 1 captcha1.tif
convert captcha1.tif -threshold 90% captcha1.tif
convert captcha1.png -level 20000,0,20000 captcha1.png
convert captcha1.png captcha1.pgm
convert captcha1.pgm -black-threshold 65000 captcha1.tif
convert captcha1.tif -negate captcha1.tif
convert captcha1.tif -threshold 90% captcha1.tif
convert captcha1.tif -morphology Dilate rectangle:3x3 captcha1.tif
convert captcha1.tif -morphology Erode rectangle:5x1 captcha1.tif
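For reference, the shorter sequence (the one ending in Dilate and Erode) can probably be chained into a single call. This is only a sketch, assuming ImageMagick 6's convert is on the PATH (on ImageMagick 7 the command would be magick). The function name clean_captcha and the output name captcha1_clean.tif are made up for illustration, and -colorspace Gray is my stand-in for the PNG to PGM round trip, so the result may not match the step-by-step version exactly.

```shell
#!/bin/sh
# Hypothetical helper: the same operations as the shorter sequence above,
# applied in the same order but in one convert call.
clean_captcha() {
    in=$1
    out=$2
    # -colorspace Gray is an assumed replacement for writing the .pgm file.
    convert "$in" \
        -level 20000,0,20000 \
        -colorspace Gray \
        -black-threshold 65000 \
        -negate \
        -threshold 90% \
        -morphology Dilate rectangle:3x3 \
        -morphology Erode rectangle:5x1 \
        "$out"
}

# Run only when the input image and ImageMagick are both available.
if [ -f captcha1.png ] && command -v convert >/dev/null 2>&1; then
    clean_captcha captcha1.png captcha1_clean.tif
fi
```

Writing to a separate output file keeps the original captcha1.png untouched, so different thresholds and kernels can be tried against the same input.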
I have been running these commands from the Command Prompt first, to try to get the OCR to read the letters correctly.
Can anyone point me in the right direction for removing the leftover lines, or suggest a better method?
EDIT: Images weren't showing