PDF to image causes 1st page margin
Re: PDF to image causes 1st page margin
I've found that the height of the document determines the angle of the watermark, as well as its size. I wonder if it's possible to pattern match the watermark and then remove it? It looks like that is what you were talking about with morphology. I don't understand it yet, but I'm going to go through http://www.imagemagick.org/Usage/morphology/ and try to figure it out.
Re: PDF to image causes 1st page margin
Removing the majority of it seems reasonable actually. By looking at convert subimage-search and possibly morphology, it seems doable. I've toyed around with it but I can't get it to work.
I'd pay you (or anyone) who could figure out how to remove most of it so that OCR would work well. I attached the pattern and several documents. The 11-0.png document has an exact match, while the others might be slightly different, which is the biggest challenge.
- Attachments
- 57-0.png (280.06 KiB)
- 11-1.png (282.47 KiB)
- 11-0.png (378.05 KiB)
- Pattern.png (232.99 KiB)
- fmw42
Re: PDF to image causes 1st page margin
You might be able to match the watermark-only image to the image with text and watermark by using compare -subimage-search. Then you need to use -compose subtract or -compose divide to remove the watermark. Morphology open or close will only try to remove small dots, and that did not work well for me when I tested it.
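A minimal sketch of the divide step, under some assumptions: the watermark-only pattern is a light tint over a white background, the match offset is already known from the subimage search, and Divide_Src is used as the variant of -compose divide that divides the first image by the second. The function name, file names, and the 0,0 offset are placeholders, not anything from this thread.

```shell
# Hypothetical helper: divide the page by the watermark-only pattern at the
# offset where the pattern was found.
remove_watermark() {
  # $1 = page image, $2 = watermark-only pattern,
  # $3 = "x,y" offset of the match, $4 = output file
  x=${3%,*}   # text before the comma
  y=${3#*,}   # text after the comma
  convert "$1" "$2" -geometry "+${x}+${y}" \
    -compose Divide_Src -composite "$4"
}

# Placeholder names and offset; runs only if ImageMagick and the files exist.
if command -v convert >/dev/null 2>&1 && [ -f page.png ] && [ -f pattern.png ]; then
  remove_watermark page.png pattern.png 0,0 cleaned.png
fi
```

If the pattern only approximately matches the page, expect a faint residue of the watermark rather than a clean removal.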
Re: PDF to image causes 1st page margin
I tried Imagick compare, but got an error:
Code:
$Page = new Imagick('Result-0.png');
$Page2 = new Imagick('SearchPatternPNG.png');
$Result = $Page2 -> compareImages($Page, Imagick::COMPOSITE_SATURATE);
$Result[0] -> setImageFormat('jpeg');
echo $Result[0];
Error:
Fatal error: Uncaught exception 'ImagickException' with message 'Compare images failed' in /home/pitmanco/public_html/la/ndrin/search.php:9 Stack trace: #0 /home/pitmanco/public_html/la/ndrin/search.php(9): Imagick->compareimages(Object(Imagick), 44) #1 {main} thrown in /home/pitmanco/public_html/la/ndrin/search.php on line 9
I printed the images and they both display, but the comparison fails. The error isn't very helpful, though. I also tried the command line, and I didn't get any response or any files created when I ran:
compare -subimage-search /fullpath/Result-0.png /fullpath/SearchPatternPNG.png /fullpath/ZZZ.png
Edit: I did run "compare -subimage-search /path/Result-0.png /path/SearchPatternPNG.png /path/ZR-%d.png", which did execute and used a great deal of server resources, but then returned no image.
- fmw42
Re: PDF to image causes 1st page margin
Try setting -metric rmse (or some other metric). Also note that for -subimage-search, the two images must be different sizes, with the larger one first:
compare -metric rmse -subimage-search largeimage smallimage resultimages
If you are running it via PHP exec(), you will likely need to redirect the result from stderr to stdout:
compare -metric rmse -subimage-search largeimage smallimage resultimages 2>&1
see
http://www.imagemagick.org/Usage/compare/
http://www.imagemagick.org/Usage/compare/#statistics
http://www.imagemagick.org/script/compare.php
I do not know much about doing compare in Imagick, but it does work on the command line.
See the following old example, but note that it now needs the addition of -subimage-search:
viewtopic.php?f=1&t=14613&p=51076&hilit ... ric#p51076
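A rough sketch of capturing the metric output from a script. It assumes compare reports the best match on stderr as a line of the form "value (normalized) @ x,y"; that format, the helper name parse_offset, and the file names large.png and small.png are assumptions for illustration, not something confirmed in this thread.

```shell
# Hypothetical helper: pull "x,y" out of a line such as
# "1868.14 (0.0285061) @ 360,140"; prints nothing if there is no offset.
parse_offset() {
  printf '%s\n' "$1" | sed -n 's/.*@ *\([0-9][0-9]*,[0-9][0-9]*\).*/\1/p'
}

# Run only when ImageMagick and the placeholder files are actually present.
if command -v compare >/dev/null 2>&1 && [ -f large.png ] && [ -f small.png ]; then
  # compare writes the metric to stderr, so fold stderr into stdout to capture it.
  metric_line=$(compare -metric rmse -subimage-search large.png small.png result.png 2>&1)
  echo "best match at: $(parse_offset "$metric_line")"
fi
```

The same 2>&1 redirection is what lets PHP exec() see the metric line, since exec() only collects stdout.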
Re: PDF to image causes 1st page margin
I tried it with an example photo, which worked. For some reason, the search pattern and search image I attached (SearchImage.jpg / SearchPattern.png) run until the server kills the process. I also tried a small version (attached as SearchImageZ / SearchPatternZ), which returned the error: images too dissimilar `/SearchImageZ.jpg' @ error/compare.c/CompareImageCommand/953.
I'll be trying different patterns to see if something works. Any thoughts on what I'm doing wrong?
- Attachments
- SearchImageZ.jpg (31.4 KiB)
- SearchPatternZ.jpg (18.08 KiB)
- SearchImage.jpg (430.65 KiB)
- SearchPattern.png (143.54 KiB)
- fmw42
Re: PDF to image causes 1st page margin
IM compare is set up for normal images and will stop if the images are too dissimilar. So add -dissimilarity-threshold 100% to the command. That should keep it from stopping too early. If you want to speed it up and you use -metric rmse, you can also add -similarity-threshold somesmallvalue. It will then stop as soon as it reaches a match whose metric value is smaller than or equal to somesmallvalue. The value can be absolute (in the quantum range, which is 0 to 65535 for a Q16 build or 0 to 255 for a Q8 build) or a percentage (0% to 100%). If you know you have a perfect match, you can use 0 or 0%. If you do not believe the match will be perfect, then raise the value to something bigger than 0 but still small, or it will stop at a close but not optimal match. Otherwise, leave off -similarity-threshold and just wait for it to finish.
see
http://www.imagemagick.org/script/comma ... -threshold
http://www.imagemagick.org/script/comma ... -threshold
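Putting the two thresholds together might look like the sketch below. The function name search_watermark, the file names, and the default threshold of 0 are placeholders chosen for illustration; only the options themselves come from the advice above.

```shell
# Hypothetical wrapper around the subimage search with both thresholds set.
search_watermark() {
  # $1 = larger page image, $2 = smaller watermark pattern,
  # $3 = output image name, $4 = optional similarity threshold (default 0)
  compare -metric rmse \
    -dissimilarity-threshold 100% \
    -similarity-threshold "${4:-0}" \
    -subimage-search "$1" "$2" "$3" 2>&1
}

# Placeholder names; runs only if ImageMagick and the files exist.
if command -v compare >/dev/null 2>&1 && [ -f page.png ] && [ -f pattern.png ]; then
  search_watermark page.png pattern.png match.png 0
fi
```

With a threshold of 0 the search stops early only on an exact match, so for pages where the watermark differs slightly a small nonzero value is the one to experiment with.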