Spectral "Compensation"?

Now that I’m making strides with the “using a KDTree to predict the future” idea from this thread, I’m revisiting the idea of compensation, but in the context of MFCCs.

From what @tremblap (and @weefuzzy) have mentioned on the forum (and elsewhere), the 0th coefficient closely correlates with loudness. I even remember @tremblap posting a comparison example, but I can’t seem to find it now.

That led me to wonder if the 0th coefficient would work as a vector for loudness compensation (à la the approach mentioned much earlier in this thread).

At the moment, when I do this with vanilla loudness, I have an incoming target, I query for the nearest match, I take the difference in loudness between the two, and then I apply that as an amplitude offset to the match after the fact. This works really well, and is the basis for the spectral compensation discussed in this thread.
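For anyone following along, here is a rough sketch of that loudness compensation step in Python. It is not my actual patch: the brute-force search stands in for the KDTree query, the descriptor values are made up, and the names `nearest_index` and `compensation_gain` are just for illustration. It also assumes loudness is measured in dB, so the difference maps onto a linear gain via 10^(dB/20).

```python
import math

def nearest_index(target, corpus):
    """Brute-force nearest neighbour (a stand-in for the KDTree query)."""
    return min(range(len(corpus)), key=lambda i: math.dist(target, corpus[i]))

def compensation_gain(target_db, match_db):
    """Linear gain that moves the match's loudness onto the target's."""
    return 10.0 ** ((target_db - match_db) / 20.0)

# Hypothetical descriptors per entry: [loudness in dB, spectral centroid in kHz].
# A real setup would normally scale the dimensions before measuring distance.
corpus = [[-30.0, 1.2], [-12.0, 3.5], [-20.0, 0.8]]
target = [-18.0, 1.0]

idx = nearest_index(target, corpus)                  # closest corpus entry
gain = compensation_gain(target[0], corpus[idx][0])  # amplitude offset to apply
print(idx, round(gain, 4))
```

The matched sample would then be played back multiplied by `gain`, which is the "after the fact" amplitude offset.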

So I’m wondering now if the 0th coefficient would play nice with a similar approach, either by taking the difference between the incoming sample’s 0th coefficient and the corpus one and manually adjusting the amplitude of the matched sample, or, in the use case from the predictive querying thread, by manually offsetting the 0th coefficient when querying in the second part of the process.

Perhaps (obviously?) using a standalone loudness descriptor would be better. Or perhaps leaving “loudness” out altogether would be better for the predictive part of the analysis, since I’m more interested in spectral morphology than in loudness, and I can take more loudness info from the real-time “real” input.
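To make the options concrete, here is a minimal Python sketch of the three variants: matching on the full MFCC vector and reading off the 0th coefficient difference, offsetting the 0th coefficient before querying, and leaving it out of the distance altogether. Everything here is hypothetical (the MFCC values, the offset of 30.0, and the helper names), and the brute-force search again stands in for the KDTree. Note that turning the 0th coefficient difference into an actual gain depends on how the MFCC implementation scales that coefficient, so the sketch stops short of that conversion.

```python
import math

def nearest_index(query, corpus, skip_c0=False):
    """Brute-force nearest neighbour; optionally ignore the 0th coefficient."""
    start = 1 if skip_c0 else 0
    return min(
        range(len(corpus)),
        key=lambda i: math.dist(query[start:], corpus[i][start:]),
    )

def offset_c0(mfcc, offset):
    """Return a copy of the MFCC vector with the 0th coefficient shifted."""
    return [mfcc[0] + offset] + list(mfcc[1:])

# Hypothetical 3-coefficient MFCC frames: [c0, c1, c2]
corpus = [[10.0, 1.0, 0.5], [40.0, 1.1, 0.4], [42.0, -2.0, 3.0]]
incoming = [12.0, 1.1, 0.4]

# Variant 1: match on everything, then read off the c0 difference.
plain = nearest_index(incoming, corpus)
c0_diff = incoming[0] - corpus[plain][0]

# Variant 2: shift c0 in the query before searching.
shifted = nearest_index(offset_c0(incoming, 30.0), corpus)

# Variant 3: leave c0 out of the distance entirely.
shape_only = nearest_index(incoming, corpus, skip_c0=True)

print(plain, c0_diff, shifted, shape_only)
```

In this toy data the 0th coefficient dominates the plain query, whereas the other two variants land on the entry whose remaining coefficients match best, which is the behaviour I would want if spectral morphology matters more than loudness.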
