
parsing geometry option with regexp

Posted: 2010-12-15T08:26:48-07:00
by cedk
Hi,

I found a way to parse the -geometry option with regular expressions. This may be useful.
I wrote it in PHP:


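<?php
// Parse an ImageMagick geometry string such as '400x300+15+25'.
// Returns the captured pieces, indexed from 1:
// width, width flag, height, height flag, x sign, x value, y sign, y value.
function im_geom($geom)
{
	// Optional flag character. Inside a character class '|' is a literal,
	// so the flags are listed without separators.
	$spe = "[%^!><@]?";
	$number_type = "(\d*)($spe)";
	$size = "(?:$number_type)?(?:x$number_type)?";

	$sign = "[+-]+";
	$sign_number = "($sign)(\d*)";
	$offset = "(?:$sign_number)?(?:$sign_number)?";

	// Anchor the pattern so that the whole string has to be a geometry.
	$regexp = "^" . $size . $offset . "$";

	preg_match("#" . $regexp . "#", $geom, $res);

	// Drop the full match and keep only the captured groups.
	unset($res[0]);
	return $res;
}
?>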
This function returns an array. For example:


im_geom('400x300+15+25') returns array [400][][300][][+][15][+][25]
OR
im_geom('40%x50%+30-30') returns array [40][%][50][%][+][30][-][30]
Hope this can be helpful...
Any suggestions and/or improvements are welcome.

Cedk

Re: parsing geometry option with regexp

Posted: 2010-12-15T23:17:57-07:00
by anthony
Output should be up to four numbers, which may include negatives.

WARNING: the parsed result is different for
300x400
versus
+300+400

The first is a width and height, the second just an X and Y offset.
If you can guarantee four numbers will always be present then

However IM also parses geometry expressions of the form...
300,-400,30,-40
5/6/7
Also numbers parsed may be floating point, though most geometry only needs integers.
It also reports which of the flags % ! ^ @ > < are present (anywhere) in the argument.

From this you can probably see why IM has a very complex sub-routine to simply parse geometry.
Comma-separated lists longer than five floating point numbers (such as for distort) are handled by a separate parser, which counts the numbers, allocates an array, then fills the values into that array.
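
To make the size-versus-offset warning above concrete, here is a minimal PHP sketch. It is not IM's actual parser: the function name parse_geometry_numbers and the returned keys are made up for illustration, and it assumes the flags and the comma or slash forms are not present in the input.

```php
<?php
// Hypothetical helper, not ImageMagick's own parser.
// Returns width/height/x/y as floats, with null for parts that are absent,
// or null when the string is not a plain WxH+X+Y geometry.
function parse_geometry_numbers(string $geom): ?array
{
    $n = '\d+(?:\.\d+)?';                         // unsigned integer or real
    $re = '/^(' . $n . ')?(?:x(' . $n . ')?)?'    // optional width, optional x + height
        . '([+-]' . $n . ')?([+-]' . $n . ')?$/'; // optional signed offsets
    if (!preg_match($re, $geom, $m)) {
        return null;
    }
    // Unmatched groups are either missing from $m or empty strings.
    $num = function (int $i) use ($m): ?float {
        return (isset($m[$i]) && $m[$i] !== '') ? (float) $m[$i] : null;
    };
    return ['width' => $num(1), 'height' => $num(2), 'x' => $num(3), 'y' => $num(4)];
}

var_dump(parse_geometry_numbers('300x400'));   // width and height set, x and y null
var_dump(parse_geometry_numbers('+300+400'));  // width and height null, x and y set
```

Returning null for the missing parts keeps "no width given" distinct from "width 0", which is the distinction the warning is about.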

Re: parsing geometry option with regexp

Posted: 2010-12-16T07:21:00-07:00
by cedk
Hi Anthony,

I think my regexp does parse everything you mention. Returning only 4 numbers doesn't seem appropriate to me; I prefer getting the pieces separately and then, if needed, concatenating the necessary values.
However, I changed my regexp to capture each number directly (with the sign if present; real numbers are matched too).
I made a second change: the {size} and the {offset} are now available directly in the output array.


function im_geom($geom)
{
	// the regex
	$signe = "[+-]";
	$number = "\d*(?:\.\d+)?"; // an unsigned real (4, 3.2, .5 MATCH), (4. NOT MATCH)
	$type = "[%^!><@]"; // the flag characters: % ^ ! > < @

	// $number can already match the empty string, so it needs no trailing '?'
	// (a '?' there would only make its decimal part lazy).
	$size = "(?:($number)($type?))?(?:x($number)($type?))?";
	$offset = "($signe$number)?($signe$number)?";

	// anchored so that the whole string has to be a geometry
	$regexp = "^($size)($offset)$";

	preg_match("#$regexp#", $geom, $res);
	return $res;
}
For 400x300+35-10 this function returns:
Array
(
[0] => 400x300+35-10
[1] => 400x300
[2] => 400
[3] =>
[4] => 300
[5] =>
[6] => +35-10
[7] => +35
[8] => -10
)
array[0] = the full match
array[1] = the {size} part
array[2] = width
array[3] = special char for width (% ^ ! > < @, if present)
array[4] = height
array[5] = special char for height (% ^ ! > < @, if present)
array[6] = the {offset} part
array[7] = "x" offset
array[8] = "y" offset
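
As a possible variation on the index table above, the same pieces can be captured with named groups so the caller does not need to remember the positions. This is only a sketch under my own assumptions: the name im_geom_named and the keys width, wflag, height, hflag, x and y are illustrative, and the pattern is anchored so that the whole string has to match.

```php
<?php
// Sketch only: same idea as im_geom(), but with named capture groups.
// Returns null when the string does not match the pattern at all.
function im_geom_named(string $geom): ?array
{
    $n = '\d*(?:\.\d+)?';  // unsigned integer or real, possibly empty
    $t = '[%^!><@]?';      // optional flag character
    $re = "#^(?<width>$n)(?<wflag>$t)(?:x(?<height>$n)(?<hflag>$t))?"
        . "(?<x>[+-]$n)?(?<y>[+-]$n)?$#";
    if (!preg_match($re, $geom, $m)) {
        return null;
    }
    // preg_match() returns named and numeric keys; keep the named ones only.
    $out = [];
    foreach (['width', 'wflag', 'height', 'hflag', 'x', 'y'] as $key) {
        $out[$key] = $m[$key] ?? '';
    }
    return $out;
}

print_r(im_geom_named('400x300+35-10'));
```

Parts that are absent from the input come back as empty strings, as they do in the numeric version.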

You can have a look at this page: http://cedk06.free.fr/forumIM/test.php5 . It has a form to test the regexp, and a loop that shows many (all?) of the possibilities.

CedK

Re: parsing geometry option with regexp

Posted: 2010-12-16T17:56:34-07:00
by anthony
Very good. However, you can have multiple special characters, and they can appear anywhere that is
not a number! E.g.:
%400x300
400%x300
400x%300
400x300%

Are all valid and all mean exactly the same thing to IM -- a size with a percent flag.

The special characters could be tested for and removed in a separate step, to simplify the parsing.
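
A minimal PHP sketch of that two-step idea, assuming the flag set is exactly % ! ^ @ < > as listed earlier in the thread. The helper name split_geometry_flags is hypothetical, and this is not how IM itself does it:

```php
<?php
// Hypothetical helper (not part of ImageMagick): split a geometry string
// into the flag characters it contains and a flag-free remainder.
function split_geometry_flags(string $geom): array
{
    $flagChars = str_split('%!^@<>');
    $flags = [];
    // Step 1: record each flag that appears anywhere in the string.
    foreach ($flagChars as $c) {
        if (strpos($geom, $c) !== false) {
            $flags[] = $c;
        }
    }
    // Step 2: remove the flags so that only numbers, 'x' and signs remain.
    $clean = str_replace($flagChars, '', $geom);
    return ['flags' => $flags, 'clean' => $clean];
}

// The four spellings from the post all reduce to the same result.
foreach (['%400x300', '400%x300', '400x%300', '400x300%'] as $g) {
    $r = split_geometry_flags($g);
    echo $g, ' => ', $r['clean'], ' flags: ', implode('', $r['flags']), "\n";
}
```

The cleaned string can then be handed to a much simpler pattern that only has to deal with numbers, the x separator and the signs.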

However, as I said, if you don't want to parse all the geometry forms IM can parse, or you know or can control the format (say using -format), then you can simplify things. For example, actual image size, virtual image size and offset (the offset output always includes the sign)...
convert rose: -crop 30x30+5+10 -format '%w %h %W %H %X %Y' info:
30 30 70 46 +5 +10
Or limit yourself to the geometry formats that IM outputs, which will never include those special flags. For example, the identify output will not contain special characters, and will list either just the image size (2 numbers), or the image size, virtual size and offset (6 numbers):
convert rose: info:
rose: ROSE 70x46 70x46+0+0 8-bit DirectClass 9.67KB 0.000u 0:00.000
convert rose: +repage info:
rose: ROSE 70x46 8-bit DirectClass 9.67KB 0.000u 0:00.000
convert rose: -crop 30x30+5+10 info:
rose: ROSE 30x30 70x46+5+10 8-bit DirectClass 0.000u 0:00.000
convert rose: -repage +30+40 info:
rose: ROSE 70x46 70x46+30+40 8-bit DirectClass 9.67KB 0.000u 0:00.000
convert rose: -repage 300x400 info:
rose: ROSE 70x46 300x400+0+0 8-bit DirectClass 9.67KB 0.000u 0:00.000

Re: parsing geometry option with regexp

Posted: 2010-12-17T03:27:30-07:00
by cedk
OK, I didn't know %400x300 was a valid format.
The aim of that regex was to get 2 things: the size and the offset.
That way a PHP function (actually a method, in my case) can be called with an argument ($geom) written the same way as the IM geometry option, which keeps the power of that notation. So in my class I can have methods like "setPosition" or "setSize", or anything else using {size} or {offset}.

Secondly, considering the difficulty of writing regex, I'm quite proud of this parsing method !!! 8) :wink: :-D :oops:

Re: parsing geometry option with regexp

Posted: 2010-12-17T10:41:51-07:00
by fmw42
You can get each of those from string formats w, h, W, H, X, Y

see http://www.imagemagick.org/script/escape.php

xoffset=`convert image -format "%X" info:`

For Windows users, see http://www.imagemagick.org/Usage/windows/

Re: parsing geometry option with regexp

Posted: 2010-12-17T14:28:47-07:00
by cedk
Thank you Fred!
Now I know, for sure. However, one point: I wanted to keep shell calls to a minimum. That's the reason for parsing the string first, before anything else.

Thanks anyway for everything!
CedK

Re: parsing geometry option with regexp

Posted: 2010-12-17T14:45:28-07:00
by fmw42
You can also get them all at once and parse it further yourself:

geometry=`convert image -format '%w %h %W %H %X %Y' info:`

Much easier than getting all the verbose info and then parsing the items you want out of it.
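
On the PHP side, a line in that six-field format could be split roughly like this. The helper name parse_format_output and the array keys are my own assumptions; the function only handles the text that IM printed and does not run IM itself.

```php
<?php
// Hypothetical helper: turn the output of
//   -format '%w %h %W %H %X %Y'
// into a named array. $line is the text ImageMagick printed.
function parse_format_output(string $line): ?array
{
    $parts = preg_split('/\s+/', trim($line));
    if (count($parts) !== 6) {
        return null; // not the six fields that were asked for
    }
    $keys = ['width', 'height', 'page_width', 'page_height', 'x', 'y'];
    // intval() accepts the leading sign that %X and %Y always carry.
    return array_combine($keys, array_map('intval', $parts));
}

// Sample line taken from the crop example earlier in the thread.
print_r(parse_format_output('30 30 70 46 +5 +10'));
```

In a real script $line would come from whatever mechanism runs the convert command, for example PHP's shell_exec(), so a single shell call yields all six values.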

Fred