Generating UTF-8 output with Convert Command

Posted: 2011-03-18T01:44:11-07:00
by chef-gonzo
Hello,

we are currently trying to generate labels with UTF-8 content. The input contains special signs (boxed numbers) as well as Cyrillic strings. The tests were done in a MeeGo 1.0 environment, first with ImageMagick 6.5.3.7 and freetype 2.3.5, later with the latest versions (ImageMagick 6.6.4.8 and freetype 2.4.4).
The font is Bitstream Vera Sans. The shell window is the XFCE terminal Version 4.6, the env variable LANG = de_DE.UTF-8. We never had problems with UTF-8 inside the shell.
We tested like this:
convert -font Bitstream-Vera-Sans caption:"Bla Bla Bla".
In the output, the special characters appeared as question marks.
We also tried to put the UTF-8 data into a file and load it with caption:@file, no luck. Also with label:, no luck.

Then we tried ftview: no luck, in the output the special characters appeared as boxes.

But with pango-view, the output is correct, even if we call it with --backend=ft2 to use freetype as renderer.

Any hints what could be wrong?

Regards
Florian

Re: Generating UTF-8 output with Convert Command

Posted: 2011-03-18T05:30:11-07:00
by anthony
I have had no problems with UTF fonts. But it is UTF-8 not 16.
Also I find it best to read the UTF using "@filename" syntax, or "@-" to read from stdin.

See IM Examples, Text to Image, Unicode.
http://www.imagemagick.org/Usage/text/#unicode

More commonly the problem is ensuring that IM is actually finding the right font. If it fails, it could substitute an Arial or Times font instead.

Check the font is loading using -debug annotate
See http://www.imagemagick.org/Usage/text/#font_info
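
A minimal sketch of that check, for anyone who wants only the font path out of the debug log. It assumes the log contains lines of the form "Font <path>; font-encoding ..." (as printed by IM 6.5.x) and that the log may arrive on stderr; the helper name font_used is made up for this example.

```shell
#!/bin/sh
# font_used: read "-debug annotate" output on stdin and print each
# distinct font file that was actually loaded.
# Assumption: log lines look like "  Font /path/to/font.ttf; font-encoding ...".
font_used() {
  sed -n 's/^ *Font \([^;]*\);.*/\1/p' | sort -u
}

# Only runs if ImageMagick's convert is installed. Output goes to "null:"
# so no image file is written. 2>&1 merges the debug log into the pipe.
if command -v convert >/dev/null 2>&1; then
  printf 'test' |
    convert -font Bitstream-Vera-Sans -debug annotate label:@- null: 2>&1 |
    font_used
fi
```

If the printed path is not the font you asked for, IM has substituted another one.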

Also what version of IM are you using? And has the font been set into a "type.xml" file correctly, say using the "imagick_type_gen" script.
See http://www.imagemagick.org/Usage/#font
(This really needs to move into a sub-page!)

Re: Generating UTF-8 output with Convert Command

Posted: 2011-03-23T09:04:12-07:00
by chef-gonzo
Hello,

I tried with the -debug annotate option. I get the following output:

[gonzo123@localhost ~]$ echo -e " \xe2\x91\xa0 \xe2\x91\xa1 \xe2\x91\xa2 "
 ① ② ③ 
[gonzo123@localhost ~]$ echo -e " \xe2\x91\xa0 \xe2\x91\xa1 \xe2\x91\xa2 " | convert -font Bitstream-Vera-Sans-Roman -debug annotate label:@- test.png
2011-03-23T16:59:29+01:00 0:01 0.030u 6.5.3 Annotate convert[31281]: annotate.c/RenderFreetype/1120/Annotate
  Font /usr/share/fonts/bitstream-vera/Vera.ttf; font-encoding none; text-encoding none; pointsize 12
2011-03-23T16:59:29+01:00 0:01 0.030u 6.5.3 Annotate convert[31281]: annotate.c/GetTypeMetrics/713/Annotate
  Metrics: text:  ① ② ③  ; width: 37; height: 14; ascent: 12; descent: -3; max advance: 16; bounds: 0,0  4.96875,9; origin: 38,0; pixels per em: 12,12; underline position: -3.32812; underline thickness: 2.23438
2011-03-23T16:59:29+01:00 0:01 0.030u 6.5.3 Annotate convert[31281]: annotate.c/GetTypeMetrics/713/Annotate
  Metrics: text: ; width: 0; height: 14; ascent: 12; descent: -3; max advance: 16; bounds: 0,-3  9,9; origin: 0,0; pixels per em: 12,12; underline position: -3.32812; underline thickness: 2.23438
2011-03-23T16:59:29+01:00 0:01 0.030u 6.5.3 Annotate convert[31281]: annotate.c/RenderFreetype/1120/Annotate
  Font /usr/share/fonts/bitstream-vera/Vera.ttf; font-encoding none; text-encoding none; pointsize 12
2011-03-23T16:59:29+01:00 0:01 0.030u 6.5.3 Annotate convert[31281]: annotate.c/GetTypeMetrics/713/Annotate
  Metrics: text:  ① ② ③  ; width: 37; height: 14; ascent: 12; descent: -3; max advance: 16; bounds: 0,0  4.96875,9; origin: 38,0; pixels per em: 12,12; underline position: -3.32812; underline thickness: 2.23438
2011-03-23T16:59:29+01:00 0:01 0.030u 6.5.3 Annotate convert[31281]: annotate.c/RenderFreetype/1120/Annotate
  Font /usr/share/fonts/bitstream-vera/Vera.ttf; font-encoding none; text-encoding none; pointsize 12
2011-03-23T16:59:29+01:00 0:01 0.030u 6.5.3 Annotate convert[31281]: annotate.c/GetTypeMetrics/713/Annotate
  Metrics: text: ; width: 0; height: 14; ascent: 12; descent: -3; max advance: 16; bounds: 0,-3  9,9; origin: 0,0; pixels per em: 12,12; underline position: -3.32812; underline thickness: 2.23438
[gonzo123@localhost ~]$ 
The image contains 3 question marks. I also tried with recode and converted the input of convert to UTF-16: the result was an empty image. The font definitely exists on my system. It was copy-pasted from a convert -list font output.
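
To rule out the input itself, the bytes can be decoded to Unicode code points before they reach convert. This is only a sketch: it assumes iconv and GNU od (for the -w option) are available, and the helper name codepoints is made up for this example.

```shell
#!/bin/sh
# codepoints: read UTF-8 on stdin, print one "U+XXXX" line per character.
# Assumption: iconv and GNU od (-w option) are installed.
codepoints() {
  iconv -f UTF-8 -t UTF-32BE |
    od -An -v -tx1 -w4 |
    while read -r b1 b2 b3 b4; do
      # Each od line holds the four big-endian bytes of one code point.
      printf 'U+%04X\n' "$((0x$b1$b2$b3$b4))"
    done
}

# Same bytes as \xe2\x91\xa0 \xe2\x91\xa1 \xe2\x91\xa2 above, written in
# octal because \x escapes are not portable in printf.
printf '\342\221\240 \342\221\241 \342\221\242' | codepoints
```

The first character should come out as U+2460 (CIRCLED DIGIT ONE). If the code points are the intended ones, the input is valid UTF-8 and the problem lies further down, in the font or the renderer.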

Regards
Florian

Re: Generating UTF-8 output with Convert Command

Posted: 2011-03-23T20:23:30-07:00
by el_supremo
chef-gonzo wrote: "The font definitely exists on my system. It was copy-pasted from a convert -list font output."
I notice that you use the name Bitstream-Vera-Sans in the example in your first post but you use Bitstream-Vera-Sans-Roman in your most recent example. On my system the font's name is Bitstream-Vera-Sans and the other doesn't exist.

I've downloaded and installed a version of Bitstream-Vera-Sans from the web on my Windows 7 system. I also happened to have another bitstream font installed - Bitstream-CyberCJK. When I use the Windows character map program to look up those three characters in the Vera-Sans font, it does not have them, in fact it only has a few hundred characters in total. On the other hand, CyberCJK has thousands of characters in it including the three characters that you use in your example. If I use CyberCJK instead of Vera-Sans, the command works.
Can you try your command with Bitstream-CyberCJK to see if that also works for you?

Pete

Re: Generating UTF-8 output with Convert Command

Posted: 2011-03-24T21:11:42-07:00
by anthony
It works for me with the Microsoft Windows Mincho TTF font.

Question marks are typically an indication that a font does not define those characters. You will need to use a different font.
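
One way to check this without opening a character map is to compare the code point against the font's coverage. The sketch below assumes fontconfig's fc-query is installed and that its %{charset} format prints space-separated hex ranges such as "20-7e a0-17f"; the helper name in_charset and the sample range strings in the comments are made up for illustration and are not the real coverage of any font.

```shell
#!/bin/sh
# in_charset HEX RANGES: succeed if code point HEX (e.g. 2460) lies inside
# one of the space-separated hex ranges in RANGES (e.g. "20-7e a0-17f 131").
in_charset() {
  want=$((0x$1))
  for r in $2; do
    lo=${r%-*}   # part before the dash (whole token if there is no dash)
    hi=${r#*-}   # part after the dash (whole token if there is no dash)
    [ "$want" -ge "$((0x$lo))" ] && [ "$want" -le "$((0x$hi))" ] && return 0
  done
  return 1
}

# Illustrative call with a made-up range list:
#   in_charset 2460 "20-7e a0-17f" || echo "U+2460 missing"

# Real check, only if fc-query and the font file from the log above exist.
font=/usr/share/fonts/bitstream-vera/Vera.ttf
if command -v fc-query >/dev/null 2>&1 && [ -f "$font" ]; then
  charset=$(fc-query --format='%{charset}\n' "$font")
  for code in 2460 2461 2462; do
    if in_charset "$code" "$charset"; then
      echo "U+$code: present"
    else
      echo "U+$code: missing"
    fi
  done
fi
```

Going the other way, fc-list ':charset=2460' should list the installed fonts that do cover U+2460, which gives candidates to pass to -font.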

I have yet to find a font that implements EVERY Unicode character in every Unicode set. Even Mincho fails with some Chinese characters, such as the Chinese string I use as an example in IM Examples, Unicode
http://www.imagemagick.org/Usage/text/#unicode

If you look a little further down, there is a 'Mincho' font example of Unicode Dingbat Characters. One of these is not defined, and so a '?' was printed instead. That specific character is in Unicode, but you are meant to use a different Unicode number than the 'dingbat' equivalent.