Ah gotcha, yeah should have been clearer.
The whole process goes like this (for the 4-sensor array, but it's similar for the narrower 3-sensor array):
- independent onset detection for each channel
- lockout/timing section to determine which onset arrives first and block errant/double hits
- cross correlating each adjacent pair of sensors
- send that to “the lag maps”*
- (find centroid of overlapping area)
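For the cross correlation step, here's a rough Python sketch of the general idea (this is not the actual Max patch, just an illustration). The function name `estimate_lag` and the assumption that both channels arrive as equal-length windows around the onset are mine.

```python
import numpy as np

def estimate_lag(a, b):
    """Estimate how many samples channel b trails channel a.

    Positive result: the onset reached sensor a first.
    Negative result: the onset reached sensor b first.
    Assumes a and b are 1-D windows of the same length.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Full cross correlation; output index 0 corresponds to lag -(len(a) - 1)
    corr = np.correlate(b, a, mode="full")
    return int(np.argmax(corr)) - (len(a) - 1)
```

This only gives integer-sample lags; getting sub-sample resolution (e.g. by interpolating around the peak) would be a separate step.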
The lag map approach is something @timlod tested and I then implemented in Max (with a bit of dataset transformation help from @jamesbradbury).
Basically, each lag map is pre-computed on a 1mm grid and stores what the lag should theoretically be at each point for a given sensor pair. In Jitter it ends up looking like this:
On the left is the NW pair and on the right is the NE pair.
These were computed in a similar way to what you've suggested, i.e. what the lags should theoretically be based on drum size, speed of sound, tuning, etc. So these maps are drum/tuning/setup-specific.
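To make the "theoretical lag per grid point" idea concrete, here's a minimal sketch assuming a simple straight-line, constant-speed model of the wave travelling across the head. The function name, the sensor coordinates and the speed/sample-rate parameters are all hypothetical; the real maps may well have been built differently.

```python
import numpy as np

def lag_map(sensor_a, sensor_b, radius_mm, speed_mm_per_s, sample_rate):
    """Theoretical lag (in samples) between two sensors on a 1 mm grid.

    sensor_a, sensor_b: (x, y) positions in mm, drum centre at (0, 0).
    Positive values mean the wave reaches sensor_a before sensor_b.
    Grid points outside the drum head are set to NaN.
    """
    coords = np.arange(-radius_mm, radius_mm + 1.0, 1.0)  # 1 mm spacing
    x, y = np.meshgrid(coords, coords)  # rows index y, columns index x
    dist_a = np.hypot(x - sensor_a[0], y - sensor_a[1])
    dist_b = np.hypot(x - sensor_b[0], y - sensor_b[1])
    # Path-length difference -> time difference -> samples
    lags = (dist_b - dist_a) / speed_mm_per_s * sample_rate
    lags[np.hypot(x, y) > radius_mm] = np.nan
    return coords, lags
```

You'd call this once per sensor pair and store the results. The wave speed is the part that depends on the drum and its tuning, which is why the maps end up being setup-specific.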
Once the cross correlation values are computed, rather than sending them to all four lag maps, it uses the index of whichever sensor picked up the onset first (step 2 above) to choose which maps to use, with the thinking being that those TDoAs would be the most accurate/correct. It then does some binary Jitter stuff to get the overlapping area:
Then some more Jitter stuff to find the centroid of the little overlapping area.
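The overlap-and-centroid step could be sketched like this in Python. It's a stand-in for the Jitter processing, not a port of it: the function name `locate` and the tolerance `tol` (how far a grid cell's theoretical lag may be from the measured one and still count) are assumptions of mine.

```python
import numpy as np

def locate(map_1, map_2, lag_1, lag_2, coords, tol=0.5):
    """Estimate the strike position from two lag maps and two measured lags.

    map_1, map_2: 2-D lag maps on the same grid (rows = y, columns = x).
    lag_1, lag_2: measured lags for the corresponding sensor pairs.
    coords: 1-D array of grid coordinates in mm, shared by both axes.
    Returns (x, y) in mm, or None if the two bands never overlap.
    """
    # Binary mask per map: cells whose theoretical lag matches the measurement
    band_1 = np.abs(map_1 - lag_1) <= tol
    band_2 = np.abs(map_2 - lag_2) <= tol
    overlap = band_1 & band_2
    if not overlap.any():
        return None
    rows, cols = np.nonzero(overlap)
    # Centroid of the overlapping cells, converted back to mm
    return float(coords[cols].mean()), float(coords[rows].mean())
```

Each mask is a thin curved band across the drum, and the two bands cross in a small patch whose centroid is taken as the hit position.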
///////////////////////////////////////////////////////////////////////////////////////////////
So the overall idea is to ignore the furthest reading, since the cross correlation isn't as accurate there (and neither is the onset detection).
The MLP's role is to correct for the non-linearities in how this approach behaves close to the edge of the drum. The lag map approach is super accurate near the middle and holds up quite well a long way out; it's only the furthest hits that get a bit more jumpy/erratic. I(/we) suspect there's some physics stuff at play near the edge, since the tension of the drum changes and the energy bounces/behaves differently given the circular shape, etc. So the physical model, the lag maps, and probably the quadratic version can't really account for that, at least not at the level of complexity they are generally implemented at. Nearer the center of the drum it's effectively just an infinite plane of vibrating membrane, which seems to behave quite predictably.
There's also the side perk that you wouldn't need an accurate physical model/lag map to work from, as you could strike the drum at known locations and have the NN figure out the specifics.
Yeah I misspoke. A single layer with 10 neurons.
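For reference, a network of that size is tiny. Here's a bare-bones NumPy sketch of a single-hidden-layer regressor with 10 tanh units, trained with plain gradient descent. The activation, learning rate, epoch count and the choice of inputs/targets (e.g. measured lags in, known strike positions out) are all assumptions for illustration, not what the actual patch does.

```python
import numpy as np

def train_mlp(inputs, targets, hidden=10, epochs=2000, lr=0.05, seed=0):
    """Fit a one-hidden-layer MLP (tanh units, linear output) by gradient descent.

    inputs: array of shape (n_hits, n_features), ideally scaled to roughly [-1, 1].
    targets: array of shape (n_hits, n_outputs), e.g. known strike positions.
    Returns a predict(x) function using the trained weights.
    """
    rng = np.random.default_rng(seed)
    n, n_in = inputs.shape
    n_out = targets.shape[1]
    w1 = rng.normal(0.0, 0.5, (n_in, hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(0.0, 0.5, (hidden, n_out))
    b2 = np.zeros(n_out)
    for _ in range(epochs):
        h = np.tanh(inputs @ w1 + b1)          # hidden activations
        err = h @ w2 + b2 - targets            # output error per hit
        dh = (err @ w2.T) * (1.0 - h ** 2)     # backprop through tanh
        # Gradients of the mean squared error (up to a constant factor)
        w2 -= lr * (h.T @ err) / n
        b2 -= lr * err.mean(axis=0)
        w1 -= lr * (inputs.T @ dh) / n
        b1 -= lr * dh.mean(axis=0)

    def predict(x):
        return np.tanh(x @ w1 + b1) @ w2 + b2

    return predict
```

In practice you'd hold some of the known-location hits back to check that the edge behaviour actually improves rather than just fitting the training hits.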
I’ll send you the data test patches offline (will tidy the patch up a bit first).