preparing a document for a FAX service (e.g. j2)

Posted: 2018-02-10T06:21:59-07:00
by atariZen
Fax services like j2 accept documents in many formats and at any quality, then convert them to fax quality before sending. The problem is that the sender has no idea what the document looks like on the receiving end. It's better for the sender to convert the doc to fax quality first, both to see what the recipient will see and for infosec reasons (when faxing a driver's license, passport, or signature, the fax service doesn't need a high-quality version).

The fax service says it will accept "binary group 3 fax" and "TIFF CCITT group 3 & 4" (among many others). Group 4 fax is apparently for ISDN lines, and Group 3 is for POTS lines. Since most ppl have POTS, I'm thinking group 3 is what I should target. It's confusing to even know what file to produce, because there are these -compress options:
  • Fax (is this group 3 compression?)
  • Group4
and these output formats are available:
  • FAX (group 3 FAX)
  • G3 (group 3 FAX)
  • G4 (group 4 FAX)
  • GROUP4 (TIFF Raw CCITT Group4)
  • PDF (works with Fax/Group4 compression)
  • TIFF
Is "FAX" a synonym for "G3"?

Is "GROUP4" a synonym for "G4"? Is the "GROUP4" format also a synonym for -compress Group4 outputfile.tiff?

I need to produce something that has multiple pages and is as close as possible to the best quality that fax allows. Ideally all the conversion loss happens on my end, so the fax service has little effect on the quality and I know what will arrive. This is my (broken) approach:

Code:

$ pdf2djvu -o vector.djvu vector.pdf
$ ddjvu -format=pdf -quality=80 vector.djvu raster.pdf
$ convert raster.pdf -colorspace gray +dither -colors 2 -normalize -monochrome -threshold 70% -despeckle -resample 25x38 -units PixelsPerInch -compress Fax G3:fax.tiff
$ display fax.tiff
display-im6.q16: Not a TIFF or MDI file, bad magic number 65535 (0xffff). `fax.tiff' @ error/tiff.c/TIFFErrors/564.
That apparently failed because, instead of producing a CCITT Group3 TIFF (which is what I think I need), it produced a raw group3 binary with a .tiff file extension (and apparently a group3 TIFF differs from a raw group3 file). I'm not sure how to get a "CCITT Group3 TIFF", so I tried to see whether I could work with the raw group3 file:

Code:

$ convert raster.pdf -colorspace gray +dither -colors 2 -normalize -monochrome -threshold 70% -despeckle -resample 25x38 -units PixelsPerInch -compress Fax fax.g3
$ display fax.g3
The result was blurry and extremely low res:

Code:

Image: fax.g3
  Format: G3 (Group 3 FAX)
  Class: PseudoClass
  Geometry: 2592x418+0+0
  Resolution: 204x196
  Print size: 12.7059x2.13265
  Units: PixelsPerInch
  Type: Bilevel
When I zoom in the text is not legible. The source doc was a letter-sized page, and it got squashed vertically.
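
One possible reading of the numbers above: 418 rows is exactly what an 11-inch page yields at 38 dpi, which suggests `-resample 25x38` was applied as a target resolution in dpi (which is how the ImageMagick documentation describes that option) rather than as some kind of scale factor. The sketch below only does that arithmetic. The helper name `effective_dpi` is made up for illustration, and the 11-inch height assumes the letter-sized source mentioned above.

```shell
# Hypothetical helper: resolution implied by a pixel count spread over
# a physical length in inches.
effective_dpi() {
  awk -v px="$1" -v inches="$2" 'BEGIN { printf "%g\n", px / inches }'
}

effective_dpi 418 11    # rows reported by identify over an 11-inch page -> 38
effective_dpi 2156 11   # rows needed for 196 dpi over the same page     -> 196
```

If that reading is right, the low resolution and the vertical squash both come from the resample value, and something like `-resample 204x196` would be closer to the fine fax resolution that identify reports. That is an assumption and has not been tested against this file.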

(edit)
I've found that there is a "TIFF-F (Class F)", a TIFF format designed specifically for fax. No class F TIFF appears in the output of convert -list format. So it seems ImageMagick is useful for despeckling, bi-leveling, and changing the resolution, and to finalize the result into a class-F TIFF I'll need fax2tiff (part of the libtiff-tools pkg).
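
Since the "bad magic number" error came down to a raw Group 3 stream sitting behind a .tiff extension, a quick check of the first bytes can tell the two apart before handing a file to the fax service. A TIFF container starts with `II*\0` (little-endian) or `MM\0*` (big-endian), while raw Group 3 data has no such header. This is a minimal sketch, assuming `head -c` and `od` are available. `is_tiff` is a made-up helper name, and it only checks the container. It does not confirm Class F conformance.

```shell
# Hypothetical helper: succeed if the file starts with a TIFF magic number.
is_tiff() {
  # First four bytes as a plain hex string, e.g. "49492a00".
  magic=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
  case "$magic" in
    49492a00|4d4d002a) return 0 ;;   # "II*\0" or "MM\0*"
    *) return 1 ;;
  esac
}

# Example use (file name taken from the commands in this thread):
#   is_tiff fax.tiff && echo "TIFF container" || echo "not a TIFF"
```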

Re: preparing a document for a FAX service (e.g. j2)

Posted: 2018-02-10T09:27:39-07:00
by atariZen
For the record, apparently this is the answer, assuming the source is a vector PDF, and we're targeting the medium (fine) fax standard:

Code:

$ pdf2djvu -o vector.djvu vector.pdf
$ ddjvu -mode=black -format=tiff -aspect=no -size=1734x2156 vector.djvu raster.tiff
$ convert raster.tiff raster.fax
$ fax2tiff -M -o raster_classf.tiff raster.fax
Of course I'm open to improvements if anyone has suggestions. One minor flaw is that ddjvu needs a hard-coded size, which means a manual calculation, and we have to give up the aspect ratio rather than resample, if I understand correctly.
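
The manual calculation behind the -size value can be scripted. The 1734x2156 figure used above is a letter page (8.5 x 11 in) at 204x196 dpi, so the size is just page dimensions times resolution. A minimal sketch follows. `fax_size` is a made-up helper name, and the page sizes and resolutions passed in are the caller's assumptions.

```shell
# Hypothetical helper: pixel dimensions for a page at a given fax resolution.
# Arguments: width_in height_in x_dpi y_dpi
fax_size() {
  awk -v w="$1" -v h="$2" -v x="$3" -v y="$4" \
    'BEGIN { printf "%dx%d\n", w * x + 0.5, h * y + 0.5 }'   # round to nearest pixel
}

fax_size 8.5 11 204 196   # letter, fine resolution     -> 1734x2156
fax_size 8.5 11 204 98    # letter, standard resolution -> 1734x1078

# Possible use with the ddjvu step above:
#   ddjvu -mode=black -format=tiff -aspect=no -size="$(fax_size 8.5 11 204 196)" vector.djvu raster.tiff
```

This reproduces the figures used in the post. As far as I know, Group 3 equipment commonly works with a 1728-pixel scan line, so the service may still rescale the width slightly.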

An approach for cases starting with a raster image of arbitrary size would be useful as well.

(edit)
The ddjvu step drops raster images that are in the input file. Bug reported => https://sourceforge.net/p/djvu/bugs/287/

So that's unacceptable. The temptation is to have ddjvu produce a color doc and have ImageMagick bi-level it, but I don't think it'll go well if color pixels are assigned by ddjvu.