A More Formal Description of the ImageMagick Machine
Posted: 2013-02-02T17:48:11-07:00
I'm new to ImageMagick, having started using it just a few days ago. In reading the various pages here and some tutorials linked to from here, I was frustrated that I could not find an abstract description of how it works, only lots of examples.
The page "ImageMagick Command-Line Processing" (http://www.imagemagick.org/script/comma ... essing.php) tries harder than any other reference I looked at, but even it falls short. For example, for "Image Operator" it says "An image operator differs from a setting in that it affects the image immediately as it appears on the command line," and for "Image Sequence Operator" it says "An image sequence operator differs from a setting in that it affects an image sequence immediately as it appears on the command line," implying that the former affects an image and the latter affects an image sequence, but this isn't really right, near as I can tell. They BOTH act on the entire preceding image sequence. The difference is that an image operator leaves each processed image where it is in the sequence, so the result is an altered sequence, whereas an image sequence operator eats up the entire sequence and deposits an image in its place. (The names might make more sense if they were reversed: an image operator leaves an image, and an image sequence operator leaves a sequence.)
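The distinction between the two kinds of operator can be sketched in a few lines of Python. This is only a toy model of the behavior described above, not ImageMagick code: images are plain strings, and the function names and the `rotate` and `append` helpers are made up for illustration.

```python
def apply_image_operator(sequence, op):
    """Image operator: transform every image, keeping each in its position."""
    return [op(image) for image in sequence]


def apply_sequence_operator(sequence, op):
    """Image sequence operator: consume the whole sequence, leave one image."""
    return [op(sequence)]


# Hypothetical stand-ins for real operators; they only build descriptive labels.
def rotate(image):
    return f"rotate({image})"


def append(images):
    return "append(" + ", ".join(images) + ")"


sequence = ["xc:red", "xc:blue"]
print(apply_image_operator(sequence, rotate))     # two images, each rotated
print(apply_sequence_operator(sequence, append))  # one combined image
```

Both functions receive the entire sequence; they differ only in the shape of what they leave behind.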
The operator terminology causes trouble in the explanation of an image stack, in which the page says "Image operators only affect images in the current stack." I'm pretty sure this is true of both image operators and image sequence operators. Or, perhaps there the term "image operator" is meant to apply to both.
Another thing that confused me: The terms "sequence" and "stack" are both used. I guess a stack is a sequence with an operator in parentheses. Or, maybe not, since the operators discussed in the stack section are said to operate on the images in the stack, so perhaps a stack doesn't include the operator. Anyway, I think an improvement is to dispense with the extra term and just talk about a sequence. The so-called stack operators operate on part of the sequence, unlike any others, but not as a stack, since index 0 refers not to the top of the stack, but to the first element in the sequence. Again, as a computer scientist, I would have found it less confusing if the word "stack" had not been used to describe something that doesn't operate as a stack.
Anyway, after reading this page several times, reading some other material, and running many experiments to test out various theories of how the ImageMagick machine works, I think I now understand it. I thought I'd post my understanding here, for two reasons: (1) to elicit corrections to the stuff I've gotten wrong, either stated wrong or genuinely misunderstood, and (2) to help anyone else who's a computer scientist, as I am, and would like a semi-formal, precise, and succinct explanation of how this magnificent machine works.
Actually, everything below is only about the convert command, which is the only one I use and the only one I've worked to figure out so far. So this isn't really about the ImageMagick machine; it's about the convert machine.
In a nutshell, I'd say it like this: The convert machine operates on a postfix expression in which operators operate on the entire preceding sequence of operands and replace the sequence with the result, which may be a sequence or image. The preceding sequence extends leftward only to the matching left parenthesis. At the end of processing, all of the images remaining in the sequence are output to one or more files whose names are based on the specified output file name.
(I didn't find it documented, but if you execute:
convert xc:red xc:blue output.jpg
you get the files output-0.jpg and output-1.jpg. So, it appears that convert tries very hard to get all of the image sequence out of the box.)
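The naming I observed can be written down as a small Python function. This is a sketch of the observed behavior only, under the assumption that the output format holds one image per file (as JPEG does); formats that can store several images in one file are not modeled, and the function name `output_names` is made up.

```python
from pathlib import PurePath


def output_names(output_name, image_count):
    """Return the file names the observed behavior would produce.

    Assumption: a one-image-per-file format such as JPEG. With a single
    image the name is used as given; with several, '-<index>' is inserted
    before the extension, counting from 0.
    """
    if image_count <= 1:
        return [output_name]
    path = PurePath(output_name)
    return [
        str(path.with_name(f"{path.stem}-{index}{path.suffix}"))
        for index in range(image_count)
    ]


print(output_names("output.jpg", 2))  # ['output-0.jpg', 'output-1.jpg']
print(output_names("image.jpg", 1))   # ['image.jpg']
```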
Breaking the abstract description down, using loose BNF notation, we start with:
command-line ::= 'convert' image-sequence output-image
The interesting part is all in the image-sequence:
image-sequence ::= ( [settings] ( image-expression | element )) ...
The parsing is greedy; the longest possible image sequence is formed and becomes the images acted on by the closest following operator (see below).
Settings means any of the numerous settings arguments (e.g., -size), which can appear prior to any image-expression or element (see below for what they are). They are completely unaffected by parentheses; a setting applies until another setting changes it.
Now for an image-expression:
image-expression ::= image-sequence operator
An operator is any of the image operators or image sequence operators.
So, all of the following are image-sequences:
rose:
xc:red photo.jpg
xc:red photo.jpg rose: +append
-size 100x100 xc:red xc:teal rose: +append xc:blue -rotate 120 +append
It's important to note that an image-sequence may or may not have operators in it. This is what allows everyone's favorite simplest example to work:
convert image.png image.jpg
where the image-sequence is 'image.png'.
Continuing from the BNF for image-sequence a few paragraphs above:
element ::= input-image | '(' image-expression ')'
Here is where the parentheses come in: They make an image-expression, which has to end in an operator, into an element. In addition, as an operator gobbles up the preceding sequence, the grammar indicates that the gobbling stops when the left parenthesis is encountered.
This concludes the interesting part, and the part that had me confused for days until I understood that this was a postfix machine where all the operators operate on the entire image sequence.
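To make the postfix model concrete, here is a minimal Python sketch of the machine as I understand it. It is a toy, not how convert is implemented: images are plain strings, the three tables list only a handful of arguments (with argument counts I filled in for the sketch), anything not in a table is treated as an input image, and parentheses are assumed to be balanced.

```python
# Toy model of the "convert machine". No real image processing happens;
# each operator just wraps the image labels it acts on.

# Hypothetical tables for this sketch: argument name -> number of parameters.
IMAGE_OPERATORS = {"-rotate": 1, "-negate": 0}     # applied to each image
SEQUENCE_OPERATORS = {"+append": 0, "-append": 0}  # collapse the sequence
SETTINGS = {"-size": 1}                            # remembered, not applied


def run_convert(args):
    """Evaluate a convert-style argument list; return (images, output name)."""
    *tokens, output_name = args
    settings = {}   # one global table: parentheses do not affect settings
    levels = [[]]   # levels[-1] is the sequence an operator can see
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        i += 1
        if tok == "(":
            levels.append([])            # operators now stop at this point
        elif tok == ")":
            finished = levels.pop()
            levels[-1].extend(finished)  # result becomes part of the outer sequence
        elif tok in SETTINGS:
            count = SETTINGS[tok]
            settings[tok] = tokens[i:i + count]
            i += count
        elif tok in IMAGE_OPERATORS:
            count = IMAGE_OPERATORS[tok]
            label = " ".join([tok, *tokens[i:i + count]])
            i += count
            levels[-1][:] = [f"{label}({image})" for image in levels[-1]]
        elif tok in SEQUENCE_OPERATORS:
            levels[-1][:] = [f"{tok}({', '.join(levels[-1])})"]
        else:
            levels[-1].append(tok)       # anything else is an input image
    return levels[0], output_name


images, name = run_convert(
    ["xc:red", "(", "xc:blue", "rose:", "+append", ")", "-negate", "out.png"]
)
print(images)  # ['-negate(xc:red)', '-negate(+append(xc:blue, rose:))']
```

In the example, +append sees only the two images inside the parentheses, while -negate, being outside them, acts on every image in the outer sequence.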
Here's the rest of the grammar I worked up, in case you're interested. Nothing important is going on below; it's just an attempt to account for the various forms of input and output images.
input-image ::= [format :] basic-image [selector]
format ::= /* one of the formats (e.g., rgb) */
basic-image ::= file-ref | stream-ref | built-in-ref
file-ref ::= /* file name, perhaps with globbing */
stream-ref ::= '-' | 'fd:' N /* N is a non-negative integer */
built-in-ref ::= built-in-image | built-in-pattern
built-in-image ::= 'granite' | 'logo' | 'netscape' | 'rose' | 'wizard'
built-in-pattern ::= 'bricks' | 'checkerboard' | /* other patterns */
output-image ::= [format :] basic-output-image
basic-output-image ::= file-ref | stream-ref
If you've read this far, thanks for your attention, and please do leave comments. If there's interest, I'll work some more on this approach to explaining the ImageMagick machine and possibly work it up into a very long article or very short book.
Marc Rochkind