Batch analysis of images (?)
Posted: 2017-01-17T02:38:16-07:00
I'm not quite sure what to call this - hence the strange subject - but this is what it is:
I have a large number of images - strictly black on white - that I want to analyse like this:
1) I want to split each image into its components: if I have, say, text written in black on white paper, I want to cut it up into the individual letters, and in fact I want to cut out the dots over the 'i's and put them in a separate bag too. A little bit like OCR, but without trying to identify what each bit is.
2) After having generated my component images, I want to categorise them automatically, so that all dots are bundled together, all vertical strokes end up in the same category (but perhaps split into 'short', like the one in the letter 'i', and 'long', like 'l'), etc. These categories are then my "standard components".
3) Having done this, I want to analyse the original images to see which of my standard components are part of them, and then use this to build an index.
Is there any open source tool or toolset that can do one or more of these steps? Or even almost? Or perhaps just a hint about where I might start looking?
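To make the three steps a bit more concrete, here is a rough sketch of what I have in mind, in plain Python with no libraries. It is only a toy under several assumptions: the images are tiny lists of strings where '#' means black, pieces count as one component if their pixels touch (including diagonally), and "same category" simply means "exactly the same pixel shape". Real scans would obviously need some tolerance when comparing shapes, so the exact matching here is just a stand-in for proper categorisation. The names `components` and `build_index` are made up for this example.

```python
from collections import defaultdict, deque


def components(image):
    """Step 1: split a binary image (list of strings, '#' = black) into
    8-connected components.

    Each component is returned as a frozenset of (row, col) pixels, shifted
    so that the top-left corner of its bounding box is (0, 0).
    """
    rows = len(image)
    seen = set()
    found = []
    for r in range(rows):
        for c in range(len(image[r])):
            if image[r][c] != '#' or (r, c) in seen:
                continue
            # Breadth-first flood fill from this unvisited black pixel.
            queue = deque([(r, c)])
            seen.add((r, c))
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < len(image[ny])
                                and image[ny][nx] == '#'
                                and (ny, nx) not in seen):
                            seen.add((ny, nx))
                            queue.append((ny, nx))
            top = min(y for y, _ in pixels)
            left = min(x for _, x in pixels)
            found.append(frozenset((y - top, x - left) for y, x in pixels))
    return found


def build_index(images):
    """Steps 2 and 3: treat each distinct shape as a category (assumption:
    exact match only) and map it to the names of the images containing it."""
    index = defaultdict(set)
    for name, image in images.items():
        for shape in components(image):
            index[shape].add(name)
    return index


# Three toy "images", one pixel wide: an 'i', an 'l' and a colon.
letter_i = ["#", ".", "#", "#", "#"]
letter_l = ["#", "#", "#", "#", "#"]
colon = ["#", ".", "#"]

index = build_index({"i": letter_i, "l": letter_l, "colon": colon})
```

With these toy inputs the 'i' splits into a dot and a short stroke, the 'l' is one long stroke, and the single-pixel dot category is shared by the 'i' and the colon. That shared entry is the kind of index I am after, just at a much larger scale and with fuzzier matching.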