@tremblap and I have been bouncing around some ideas about corpus exploration and applications that focus on understanding large personal collections. I've been searching for existing applications that attempt this and came across Sononym. Seems really cool, so I'm about to try it now.
I'll propose a question to everyone here to try to start a discussion around the principles implied by this kind of corpus program:
What kinds of long-term analytical procedures would be interesting or useful to you? Imagine a program that sifts and sieves your sound collection(s) in the background and produces a set of insights and metadata about them. What might those insights look like, roughly?

One angle I'm interested in is being able to start at an incredibly broad entry point, as an exercise in maintaining naivety (at least initially). The metadata and analysis would let you start at this 'raw' entry point and burrow into the analysis layer by layer, unpicking ever finer details about your corpus.

Imagine starting with long field recordings that are segmented at different time scales, with some sort of network created between the time points. The connections between points would be grounded in an analysis specified by the user: maybe you have a big drum corpus and want to look at amplitude-based descriptors, or you might have only short violin melodies, where pitch and tone colour are more important. Further relationships could then be discovered, such as similarity and antagonism, as well as groups of time points that might belong together, à la clustering.
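To make the idea a bit more concrete, here is a minimal sketch of the "network between time points" notion, assuming Python/NumPy. The function names (`describe`, `similarity_graph`, `components`) and the descriptor choices (RMS and spectral centroid as stand-ins for "amplitude" and "tone colour") are all hypothetical, just one way the user-specified analysis could be wired up:

```python
import numpy as np

def describe(segment, sr=44100):
    """Hypothetical per-segment descriptor: RMS (amplitude) + spectral centroid (tone colour)."""
    rms = np.sqrt(np.mean(segment ** 2))
    spec = np.abs(np.fft.rfft(segment))
    freqs = np.fft.rfftfreq(len(segment), 1 / sr)
    centroid = np.sum(freqs * spec) / (np.sum(spec) + 1e-12)
    return np.array([rms, centroid])

def similarity_graph(descriptors, threshold=0.9):
    """Connect time points whose (normalised) descriptors are close enough."""
    d = np.array(descriptors, dtype=float)
    # min-max normalise each feature so RMS and centroid are comparable
    d = (d - d.min(axis=0)) / (np.ptp(d, axis=0) + 1e-12)
    n = len(d)
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # similarity in [0, 1]: 1 = identical descriptors
            sim = 1.0 - np.linalg.norm(d[i] - d[j]) / np.sqrt(d.shape[1])
            if sim >= threshold:
                adj[i].append(j)
                adj[j].append(i)
    return adj

def components(adj):
    """Group connected time points: a crude stand-in for real clustering."""
    seen, groups = set(), []
    for start in range(len(adj)):
        if start in seen:
            continue
        stack, group = [start], []
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            group.append(node)
            stack.extend(adj[node])
        groups.append(sorted(group))
    return groups
```

With two near-identical quiet 440 Hz segments and one loud 8 kHz segment, the graph links the first two and isolates the third, which is the layer-by-layer burrowing idea in miniature: swap `describe` for whatever descriptors suit the corpus at hand.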
Anyway, these are some of the thoughts I've been having, so it probably reads more like a stream of consciousness than clearly articulated ideas, but I'm interested in what others might be thinking about in this ballpark of corpus manipulation.