Cleaning up noise around text

g7Bond
Posts: 1
Joined: 2015-01-28T17:43:10-07:00

Post by g7Bond »

Hi!

I've been trying to clean up the background of the kind of image you can see below.

Image
Image
Image

My current process is to first run a simple hand-made filter that removes some of the noise by keeping only black pixels that are surrounded by 8 other black pixels: https://github.com/vkruoso/receita-tool ... aFilter.py - After that I just run tesseract and hope the result is good.

I'm providing a free web service that gets information from a government site to make that information easier to access (this really should be provided by the government). With this process I've managed to successfully decode the text 25% of the time, but that's not good enough to provide a good service.

I have very little background in image processing, so I'm hoping someone here can give me some hints on how to approach this particular kind of image.

--
Thanks a lot.
fmw42
Posts: 25562
Joined: 2007-07-02T17:14:51-07:00
Location: Sunnyvale, California, USA

Re: Cleaning up noise around text

Post by fmw42 »

Most people here will not help to break captchas.