Remove unknown background noise
Posted: 2018-03-06T16:41:45-07:00
Hi,
First of all, I appreciate your hard work in building such a fantastic library for image processing. It is really impressive stuff!
Problem:
I need to clean up arbitrary images before OCR (Tesseract) to get the best recognition results. Here is what I want to achieve:
1. Reduce the image resolution to 300*300
2. Convert the image to PNG format (after some googling I found that PNG works well for B&W images)
3. Remove any background noise (if present)
4. Convert the image to black and white
5. Remove any black border (edges) - (if present)
To achieve this I have come up with the following script, and I am not sure whether it can be simplified further. The problems I am stuck on are:
1. An input image may have varying levels of background noise, or none at all.
2. An input image may or may not have black borders.
So when I pass all these random images through the same script, I do not get consistent output (a clean image). I found that the problem is the threshold values I have hard-coded in STEP 2 and STEP 3 of the script below.
The script as written works for images with heavy background noise (sample image linked below):
STEP 2: 25% for -white-threshold and for -fuzz (next line), and 25% again when converting to B&W in STEP 3
But the same thresholds do not work for images with less background noise (I have to raise the threshold values to 60% to get the desired output).
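To make the threshold problem concrete, here is a small Python sketch. It is not part of my script: the pixel values are made up for illustration, `binarize` is just a helper name I picked, and it only mimics how I understand a global `-threshold` to behave (values above the cutoff become white, the rest black). It is my guess at why one fixed percentage cannot suit both kinds of image, not a measurement from my samples.

```python
def binarize(pixels, threshold_percent):
    """Mimic a global threshold: values above the cutoff become white (255), the rest black (0)."""
    cutoff = 255 * threshold_percent / 100
    return [255 if p > cutoff else 0 for p in pixels]


# Made-up grayscale samples (0 = black, 255 = white), in the order [text, noise, paper].
heavy_noise = [20, 100, 230]   # dark text, fairly dark background noise
light_noise = [115, 205, 240]  # fainter text, faint background noise

for name, pixels in (("heavy", heavy_noise), ("light", light_noise)):
    for percent in (25, 60):
        print(name, f"{percent}%", binarize(pixels, percent))
```

With these assumed values, 25% keeps the text and drops the noise in the "heavy" sample but turns the fainter text white in the "light" sample, while 60% does the opposite: it recovers the faint text but also turns the heavy noise black. So the cutoff would have to be chosen per image rather than hard-coded.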
SO IN SUMMARY: HOW CAN I HAVE ONE STANDARD SCRIPT THAT REMOVES NOISE FROM ANY RANDOM IMAGE?
ANY HELP WOULD BE MUCH APPRECIATED. THANKS!
WINDOWS BATCH FILE
====================
echo STEP 1. Convert to png
magick input.jpg result.png
echo STEP 2. Remove Background noise
magick result.png -white-threshold 25%% -transparent white result-no-noise-temp.png
magick result-no-noise-temp.png -fuzz 25%% -transparent white result-no-noise.png
echo STEP 3. Convert to Black n White
magick result-no-noise.png -threshold 25%% result-bw.png
echo STEP 4. Crop black border
magick result-bw.png -bordercolor black -border 1 -fuzz 25%% -fill white -draw "color 0,0 floodfill" -alpha off -shave 1x1 final.png
SAMPLE IMAGES
https://www.dropbox.com/s/0f48wr6gpkdue ... e.jpg?dl=0
https://www.dropbox.com/s/gi0d14d8pei0p ... t.png?dl=0