Suppose we have someimage.jpg with an embedded comment that contains some non-Latin characters.
I am trying to use the following command:
magick.exe convert someimage.jpg someimage.json
The resulting JSON does contain the comment, but it is written in the current Windows ANSI codepage (1251 in my locale).
I believe the correct encoding for JSON output should be something more universal, like UTF-8.
Can that be configured via the CLI, or is it a bug?
Codepage for JSON output
- fmw42
Re: Codepage for JSON output
I cannot answer the question about JSON output. But in ImageMagick 7, one uses magick, not convert and not magick convert.
- snibgo
Re: Codepage for JSON output
@AlexRozen: please link to a sample image file that contains a comment with non-Latin characters.
Please also say what version of IM you use, on what platform (I guess Windows).
snibgo's IM pages: im.snibgo.com
Re: Codepage for JSON output
I have checked it with ImageMagick-7.0.8-5-portable-Q16-x64
Sample image is here https://drive.google.com/file/d/1R-bRWZ ... sp=sharing
Use the command:
magick.exe convert IMG_3010.JPG IMG_3010.JSON
The resulting JSON file contains "comment": "Надежда"
It's pure Cyrillic, so it can be stored in cp1251 correctly. But I can't be sure what happens on other platforms and/or distributions.
P.S. I have checked the binary contents of this JPEG file and of another DICOM image file. Both contain Cyrillic strings stored as raw cp1251 bytes.
So it seems they were originally stored without Unicode, and ImageMagick has no way to determine their true codepage.
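If the codepage is known in advance (cp1251 in this case), the JSON file can be re-encoded to UTF-8 after ImageMagick has written it. Below is a minimal Python sketch of that workaround, not an ImageMagick feature. The function name transcode_to_utf8 and the default codepage are assumptions made for illustration, and it assumes the whole file uses a single codepage.

```python
from pathlib import Path


def transcode_to_utf8(src: Path, dst: Path, source_codepage: str = "cp1251") -> None:
    """Re-encode a text file from a known legacy codepage to UTF-8.

    Hypothetical helper: the codepage must be supplied by the caller,
    because the bytes themselves do not say which codepage they use.
    """
    raw = src.read_bytes()
    # Strict decoding: raises UnicodeDecodeError if a byte is undefined
    # in the assumed codepage, instead of silently producing garbage.
    text = raw.decode(source_codepage)
    dst.write_bytes(text.encode("utf-8"))
```

For example, transcode_to_utf8(Path("IMG_3010.JSON"), Path("IMG_3010.utf8.json")) would write a UTF-8 copy next to the original; the output file name here is only a placeholder.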
- snibgo
Re: Codepage for JSON output
AlexRozen wrote: So, it seems that they are originally stored without unicode and ImageMagick have no chances to determine their true codepage
Yes, as you say, the text is encoded as CP 1251, not UTF. IM can't guess which codepage is needed.
snibgo's IM pages: im.snibgo.com
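To illustrate why the codepage cannot be guessed from the bytes alone, here is a small Python sketch. It takes the comment from this thread, encodes it as cp1251, and decodes the same bytes under several single-byte codepages. The helper names decode_with and is_valid_utf8 and the list of candidate codepages are chosen only for this example.

```python
def decode_with(raw: bytes, codepages: list[str]) -> dict[str, str]:
    """Decode the same bytes under each candidate codepage."""
    return {cp: raw.decode(cp, errors="replace") for cp in codepages}


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the bytes form a well-formed UTF-8 sequence."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# The comment as it would sit in the file: seven bytes, one per letter.
raw = "Надежда".encode("cp1251")

readings = decode_with(raw, ["cp1251", "cp1252", "koi8_r"])
for codepage, text in readings.items():
    # Every candidate decodes without error; only one reading is the intended text.
    print(f"{codepage}: {text}")

# The bytes are not valid UTF-8, which is the one thing a reader can detect reliably.
print("valid UTF-8:", is_valid_utf8(raw))
```

Each single-byte codepage yields some string without raising an error, so nothing in the data marks cp1251 as the right choice. A tool can tell that the bytes are not UTF-8, but picking among legacy codepages needs outside knowledge.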