As an example, part of the histogram might have counts "25, 25, 25, 25" where another has values "0, 0, 0, 100". A naive search for peaks will find a peak at "100" although it probably isn't a real peak.
To counteract this problem, I identify a stream of (n) (where n > 0) entries that have zero counts, followed by a count of at least (n+1). I redistribute the entire count over the (n+1) entries. So "0, 0, 0, 100" would be redistributed as "25, 25, 25, 25".
(If the histogram were an image, some form of blurring would be equivalent.)
In the code, ph->Values[x].Count is the histogram count for value x where 0 <= x < 65536.
Code: Select all
int nFills = 0;
for (i = 0; i < NUM_VALUES-1; i++) {
ValueT * pvi = &ph->Values[i];
if (!pvi->Count) {
// Find next non-zero and test if its count is more than distance.
for (j = i+1; j < NUM_VALUES; j++) {
ValueT * pvj = &ph->Values[j];
if (pvj->Count) {
int n = j - i + 1;
if (pvj->Count >= n) {
nFills++;
int count = pvj->Count / n;
int rem = pvj->Count - (n * count);
//printf ("Back filling %i * %i from %i to %i, remainder %i\n", n, count, j, i, rem);
// We also spread remainder.
int kLim = j - rem + 1;
for (k = i; k <= j; k++) {
ValueT * pvk = &ph->Values[k];
if (k < j && pvk->Count) printf ("FillNullCounts: ** %g\n", pvk->Count);
pvk->Count = count;
if (k >= kLim) pvk->Count++;
//printf ("= %i\n", pvk->Count);
}
i = j;
}
break; // out of j loop
}
}
}
}
1. The code is grossly slow when I have large blocks of zero counts.
2. I should probably redistribute even where count < n+1.
3. My code redistributes remainders, so "0, 0, 0, 106" will become "26, 26, 27, 27". However, my count array (ph->Values[x].Count) is now floating-point (to cater for alpha masking), so this should change to give "26.5, 26.5, 26.5, 26.5".
4. I always redistribute downwards. So a stream of "0, 0, 0, 100, 0, 0, 0, 4" will become "25, 25, 25, 25, 1, 1, 1, 1", which is probably not ideal.
5. A consequence of issue 4 is that a count at histogram[0] is never redistributed, so there is often an incorrect peak there. If the first entries are "78, 0, 0, 0, 0, 100", they will become "78, 25, 25, 25".