Image signature changes if saved to a file then read back
Posted: 2012-02-08T06:43:46-07:00
by bon-bon
Hi,
Code: Select all
img.excerpt!(x, y, width, height)
img.write(f)
new_img = Magick::Image.read(f).first
img.signature != new_img.signature # why?
How can I get the value of new_img.signature from img without saving/reading it?
Re: Image signature changes if saved to a file then read bac
Posted: 2012-02-09T00:06:03-07:00
by anthony
What image file format?
PNG saves a 'timestamp'; see the -set date:modify option in
http://www.imagemagick.org/Usage/formats/#png_write
JPG and GIF are likely to always change, as they are lossy or have a color table that can be re-ordered without affecting the result.
Also see IM examples, Comparing, Finding duplicate images, which starts with minor file differences
http://www.imagemagick.org/Usage/compare/#doubles
It goes into greater and greater depth in its determination of how identical images are.
Re: Image signature changes if saved to a file then read bac
Posted: 2012-02-11T16:31:03-07:00
by bon-bon
The file format is TIFF
Code: Select all
img.excerpt!(x, y, width, height)
puts img.inspect
img.write(f)
new_img = Magick::Image.read(f).first
puts new_img.inspect
img.signature != new_img.signature # why?
./samples/0211.tif TIFF 104x16=>4x10 4x10+0+0 DirectClass 8-bit 325b
tmp/0211.tif TIFF 4x10 4x10+0+0 PseudoClass 2c 8-bit 325b
They differ?!
Re: Image signature changes if saved to a file then read bac
Posted: 2012-02-15T05:05:34-07:00
by bon-bon
I made a mistake. The signature value changes after write/read only if I do image processing first.
Here is a test script with a sample image.
Code: Select all
require 'rubygems'
require 'rmagick'
puts "processing file %s"%[file = ARGV.shift]
puts " and you've choosen %sto process thresholds"%[(process = /process/i =~ ARGV.shift ) ? "" : "NOT "]
img = Magick::ImageList.new.read(file).first
img = img.black_threshold(200).white_threshold(20) if process
bb = img.bounding_box
excerpt = img.excerpt(bb.x, bb.y, bb.width, bb.height)
signature = excerpt.signature
# saving to a file and reading back
f = "excerpt-" + file
img.write(f)
new_img = Magick::Image.read(f).first
puts " RESULT: img.signature %s new_img.signature"%( img.signature == new_img.signature ? "==" : "!=" )
Save that script as test_signature.rb and the sample image
http://www.mediafire.com/i/?s9adm0ubx5tmxj2 as test_signature.tif, cd there and run:
Code: Select all
ruby test_signature.rb test_signature.tif
then
Code: Select all
ruby test_signature.rb test_signature.tif process
The result depends on whether the image was processed or not. Why?
How to avoid file writing/reading but to get the second signature value?
Re: Image signature changes if saved to a file then read bac
Posted: 2012-02-15T07:21:47-07:00
by Drarakel
Some suggestions: Perhaps you should tell at the very start which programming language/API you're using. (Perl? Ruby? Something?) And what you do in those scripts. (Seems now that you were cropping the image in the first script. And black-/white-thresholding the image in the second script. In order to do some 'image processing'.) Or - if possible - try to find a regular command at the commandline that shows the problem. And what do you want to achieve at the end?
We don't have a crystal ball that tells us these things.
After analyzing your posts, I think, you're worried about the different (verbose) information that IM returns when writing a file and when again reading it. An example at the commandline:
Code: Select all
convert -verbose signature_test.tif -black-threshold 200 -white-threshold 20 signature_test2.tif
signature_test.tif TIFF 104x16 104x16+0+0 8-bit TrueColor DirectClass 5.32KB 0.000u 0:00.031
signature_test.tif=>
signature_test2.tif TIFF 104x16 104x16+0+0 8-bit Bilevel DirectClass 621B 0.016u 0:00.014
signature_test2.tif TIFF 104x16 104x16+0+0 1-bit Bilevel DirectClass 621B 0.000u 0:00.016
But even if these lines were identical, this information does not necessarily reflect the exact file properties in every situation; it is rather an internal representation of the image, I would say.
Again: What do you want to achieve with these 'signatures'?
Re: Image signature changes if saved to a file then read bac
Posted: 2012-02-15T09:01:12-07:00
by bon-bon
Yes, I should have clearly stated at the beginning of my post that I use Rmagick 2.6.0, which is a Ruby API to ImageMagick.
I have a dozen B/W images that I consider symbols: the digits 0..9, dot and comma.
And I need to process a lot of "raw images" consisting of those symbols. As you can see in test_signature.tif, the raw images contain numbers that I want to OCR from image to text form.
I compare regions of raw images to the images of symbols by comparing their signatures. It works well indeed, but I need to write/read the images to get proper signatures. Besides confusing me, that also slows down image processing.
I found that I get different signatures for an image and for the same image after it has been written to and read from a file, but only when I apply thresholds to the image. I guess the signature somehow depends on the internal image representation, which for an unknown reason changes after the image has been written to a file. So I'm looking for a way to speed up image processing.
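The signature-matching approach described above can be sketched without ImageMagick at all. This is only an illustration under stated assumptions: region_digest, build_table and recognise are hypothetical helpers, the 3x3 glyphs are made-up data, and the digest is a plain SHA-256 over the raw pixel bytes rather than Rmagick's img.signature.

```ruby
require 'digest'

# Hypothetical stand-in for img.signature: SHA-256 over the region's pixels.
# Rows are arrays of 0/1 values; image dimensions are not hashed in this toy.
def region_digest(rows)
  Digest::SHA256.hexdigest(rows.map { |row| row.pack('C*') }.join)
end

# Build the lookup table once: digest => symbol.
def build_table(glyphs)
  glyphs.each_with_object({}) do |(symbol, rows), table|
    table[region_digest(rows)] = symbol
  end
end

# Recognise a region with one hash lookup; nil means "no exact match".
def recognise(table, rows)
  table[region_digest(rows)]
end

# Made-up 3x3 glyphs, for illustration only.
glyphs = {
  '1' => [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
  '.' => [[0, 0, 0], [0, 0, 0], [0, 1, 0]]
}
table = build_table(glyphs)

p recognise(table, [[0, 1, 0], [0, 1, 0], [0, 1, 0]])  # exact copy of '1'
p recognise(table, [[0, 1, 0], [0, 1, 0], [0, 1, 1]])  # one pixel differs
```

A digest only matches when every hashed byte is identical, so the symbol images and the excerpts must be hashed in exactly the same representation. That is why a change of internal image type between the two sides breaks the comparison.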
I had the script output:
Code: Select all
>ruby signature_test.rb signature_test.tif
processing file signature_test.tif
and you've choosen NOT to process thresholds
RESULT: img.signature == new_img.signature
>ruby signature_test.rb signature_test.tif process
processing file signature_test.tif
and you've choosen to process thresholds
RESULT: img.signature != new_img.signature
Re: Image signature changes if saved to a file then read bac
Posted: 2012-02-15T10:30:52-07:00
by Drarakel
I'm still not sure what you mean with "signatures" here. As long as no other Rmagick users show up, we can't know what Rmagick saves as "image.signature".
You only wrote about such return lines so far:
signature_test2.tif TIFF 104x16 104x16+0+0 8-bit Bilevel DirectClass 621B
But that tells almost nothing about the content of the image. (I guess there is a certain possibility that the file properties are the same when compared to your images with symbols. But you will need luck for that. And the content can still differ greatly, of course.)
Or did you mean the 'hash' signatures?
Perhaps you could use some workarounds with your method. For example: If you want to avoid some differing results, then you could try to 'force' some properties. By adding "-type TrueColor -depth 8" (at the command line; I don't know what the equivalent is in Rmagick), ImageMagick won't use PseudoClass/Palette or bit depth reductions for the output files. And then IM perhaps won't show different results after writing and reading again.
Perhaps you should further describe the processing/comparing of your files.
And did you read the chapter about comparing in IM Examples (see Anthony's link)?
Re: Image signature changes if saved to a file then read bac
Posted: 2012-02-15T11:39:43-07:00
by bon-bon
I supposed that "signature" is ImageMagick's term rather than Rmagick's. Here is the signature description from the Rmagick docs.
img.signature -> string
Computes a message digest from an image pixel stream with an implementation of the NIST SHA-256 Message Digest algorithm. This signature uniquely identifies the image and is convenient for determining if an image has been modified or whether two images are identical.
ImageMagick adds the computed signature to the image's properties.
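To illustrate why a digest over a pixel stream can change when the pixels look the same, here is a toy model. It does not reproduce ImageMagick's actual signature code; gray_stream and rgb_stream are hypothetical helpers that serialise the same logical 1x4 image in two different in-memory layouts.

```ruby
require 'digest'

# Toy model only: this is NOT ImageMagick's signature implementation.

# Representation A: one 8-bit gray sample per pixel.
def gray_stream(pixels)
  pixels.pack('C*')
end

# Representation B: three 8-bit samples (R, G, B) per pixel.
def rgb_stream(pixels)
  pixels.flat_map { |v| [v, v, v] }.pack('C*')
end

# A 1x4 black-and-white image, written as logical pixel values.
pixels = [0, 255, 255, 0]

gray_digest = Digest::SHA256.hexdigest(gray_stream(pixels))
rgb_digest  = Digest::SHA256.hexdigest(rgb_stream(pixels))

puts "gray: #{gray_digest}"
puts "rgb:  #{rgb_digest}"
puts "equal? #{gray_digest == rgb_digest}"
```

The two byte streams differ in length and content, so the digests differ even though both describe the same picture. Under that assumption, forcing one fixed representation before hashing is what makes the values comparable.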
I added the image signatures to the script output.
Code: Select all
>ruby signature_test.rb signature_test.tif process
processing file signature_test.tif
and you've choosen to process thresholds
RESULT: img.signature != new_img.signature
img.signature = e9bc545d337ad96e7a149528e37bc5a2383fb9d2ac030dfac6ee96be4baa70ba
new_img.signature = 1fd61a6b86710183cd38f6e979963a9dd319862dfb542aac890db04f38d742b3
Yes, I've read Anthony's link. It is about the IM image signatures I was writing about:
Code: Select all
>identify -quiet -format "%#" signature_test.tif
134dce294ff4e02507af8a986e1040417df6bb7cb722ea292b6d6d0a17bc653f
I guess a signature depends on color depth or palette
Re: Image signature changes if saved to a file then read bac
Posted: 2012-02-15T17:22:37-07:00
by Drarakel
Ah, ok. So, it's the 'hash' signature.
Did you try to specify a fixed TIFF output format (with the Rmagick equivalent of "-type" and "-depth")?
I think you should be able to get identical strings (at least in such a test script).
Code: Select all
convert signature_test.tif -black-threshold 78% -format "%#" -write info:- signature_test2.tif
e9bc545d337ad96e7a149528e37bc5a2383fb9d2ac030dfac6ee96be4baa70ba
Code: Select all
identify -format "%#" signature_test2.tif
1fd61a6b86710183cd38f6e979963a9dd319862dfb542aac890db04f38d742b3
But now:
Code: Select all
convert signature_test.tif -black-threshold 78% -type TrueColor -depth 8 -format "%#" -write info:- signature_test2.tif
e9bc545d337ad96e7a149528e37bc5a2383fb9d2ac030dfac6ee96be4baa70ba
Code: Select all
identify -format "%#" signature_test2.tif
e9bc545d337ad96e7a149528e37bc5a2383fb9d2ac030dfac6ee96be4baa70ba
Re: Image signature changes if saved to a file then read bac
Posted: 2012-02-15T21:23:39-07:00
by anthony
bon-bon wrote: The result depends on whether the image was processed or not. Why?
ImageMagick has a flag known as 'taint', which becomes true if some pixels or metadata were modified.
If the image is not modified, IM will simply copy the file. That is part of its 'delegate' handling for external filter programs.
See Delegates
http://www.imagemagick.org/Usage/files/#delegates
Specifically...
http://www.imagemagick.org/Usage/files/#delegate_direct
You can force an image to become tainted by using -taint.
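The taint idea can be sketched as a simple "dirty flag". The ToyImage class below is hypothetical and only mimics the behaviour described in this post; it is not how ImageMagick is implemented, and the "ENCODED:" output is a stand-in for a real encoder.

```ruby
# Toy model of a taint (dirty) flag; ToyImage is a hypothetical class.
class ToyImage
  attr_reader :pixels

  def initialize(original_bytes, pixels)
    @original_bytes = original_bytes  # bytes exactly as read from disk
    @pixels = pixels                  # decoded pixel values
    @tainted = false
  end

  def tainted?
    @tainted
  end

  # Any modification marks the image as tainted.
  def threshold!(level)
    @pixels = @pixels.map { |v| v < level ? 0 : 255 }
    @tainted = true
    self
  end

  # Untainted: hand back the original bytes unchanged.
  # Tainted: re-encode from the pixels (stand-in encoding).
  def write
    tainted? ? 'ENCODED:' + @pixels.join(',') : @original_bytes
  end
end

img = ToyImage.new('ORIGINAL-FILE-BYTES', [10, 120, 250])
puts img.write                   # untouched image: original bytes are copied
puts img.threshold!(128).write   # modified image: freshly encoded output
```

In this model an unprocessed image round-trips byte for byte, while a processed one goes through the encoder, which may pick a different output representation.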
Re: Image signature changes if saved to a file then read bac
Posted: 2012-02-16T13:00:33-07:00
by bon-bon
I applied img.image_type = Magick::TrueColorType. It works! Amazing!
Drarakel, thank you soo much for your help!
Anthony, thank you for your guidance!
Keep well, guys )
Code: Select all
require 'rubygems'
require 'rmagick'
puts "processing file %s"%[file = ARGV.shift]
puts " and you've choosen %sto process thresholds"%[(process = /process/i =~ ARGV.shift ) ? "" : "NOT "]
img = Magick::ImageList.new.read(file).first
img = img.black_threshold(200).white_threshold(20) if process
img.image_type = Magick::TrueColorType
# saving to a file and reading back
f = "excerpt-" + file
img.write(f)
new_img = Magick::Image.read(f).first
puts " RESULT: img.signature %s new_img.signature"%( img.signature == new_img.signature ? "==" : "!=" )
puts " img.signature = %s"%img.signature
puts " new_img.signature = %s"%new_img.signature
Code: Select all
>ruby signature_test.rb signature_test.tif process
processing file signature_test.tif
and you've choosen to process thresholds
RESULT: img.signature == new_img.signature
img.signature = e9bc545d337ad96e7a149528e37bc5a2383fb9d2ac030dfac6ee96be4baa70ba
new_img.signature = e9bc545d337ad96e7a149528e37bc5a2383fb9d2ac030dfac6ee96be4baa70ba
>ruby signature_test.rb signature_test.tif
processing file signature_test.tif
and you've choosen NOT to process thresholds
RESULT: img.signature == new_img.signature
img.signature = 134dce294ff4e02507af8a986e1040417df6bb7cb722ea292b6d6d0a17bc653f
new_img.signature = 134dce294ff4e02507af8a986e1040417df6bb7cb722ea292b6d6d0a17bc653f