Computer Assisted Orchestration

So I’ve just come back from two days of a reality check with the amazing Wet Ink on some of my attempts at Computer Assisted Orchestration through Concatenation (CAO). Here is a long post on what I wanted to do, what I used (in and out of FluCoMa’s toolset), and what I want to do in the near future. Any input is welcome, both from the experienced (paging @danieleghisi and @tutschku) and from observers.

What I wanted to do

The CAO tools available are interesting but not very good with gritty, noisy modular synths and complex textures like the ones I like to use. So my idea was to:

  1. try what is available out there to see how it reacts to my sounds:
    1.1. Orchidée
    1.2. AudioGuide
    1.3. Araknides

  2. split my sounds with FluCoMa (NMF was the plan, to split and recombine layers) and run the software mentioned in #1 above.
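For anyone unfamiliar with the NMF idea in step 2, here is a toy numpy sketch using the classic multiplicative-update rule on a synthetic magnitude spectrogram. This is only an illustration of the principle, not FluCoMa’s implementation; all sizes and the iteration count are arbitrary:

```python
import numpy as np

# Toy magnitude spectrogram built from two known "layers"
rng = np.random.default_rng(0)
freqs, frames, rank = 64, 100, 2
layer_a = np.outer(rng.random(freqs), rng.random(frames))
layer_b = np.outer(rng.random(freqs), rng.random(frames))
V = layer_a + layer_b

# Rank-2 NMF via multiplicative updates: V ~= W @ H
W = rng.random((freqs, rank)) + 1e-6   # spectral bases
H = rng.random((rank, frames)) + 1e-6  # activations over time
for _ in range(300):
    H *= (W.T @ V) / (W.T @ W @ H + 1e-9)
    W *= (V @ H.T) / (W @ H @ H.T + 1e-9)

# Each component is one candidate layer; summing them
# recombines the texture (approximately)
layers = [np.outer(W[:, k], H[k]) for k in range(rank)]
recombined = sum(layers)
```

In practice you would run this on an STFT magnitude and resynthesise each component with the mixture phase, which is the kind of plumbing the FluCoMa tools are meant to handle for you.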

What I tried

Orchidée is on the IRCAM forum, which I’m lucky my institution has a subscription to. The good side: it came with an extensive (yet problematic) database. The bad sides: first, it did not work on any recent OS (and I’m not one who updates compulsively, I’m still on 10.12, but it does not work beyond 10.10, typical!). The second problem is the database itself: some clever person decided to normalise the sounds instead of keeping them at their relative loudness. I would have thought the latter is the only usable approach when reconstructing sounds for real-life use…

Araknides was fun but again relied on a strange database (the same one, but older), so I got strange loudness mismatches in my reconstructions.

AudioGuide was fun. I found a few bugs, but the dev/composer (Ben) was super quick to reply and fix them. It opened up a huge (interesting) question of querying and potent descriptors, as it allows users to define the search in a flexible, creative manner. All of this is digested and will inform some design for the 2nd toolbox, but I’m open to suggestions. AudioGuide is good at segmenting too: it has a very interesting interface for defining thresholds for amplitude-based segmentation.
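To illustrate what I mean by threshold-driven amplitude segmentation, here is a minimal sketch. This is the general shape of the idea, not AudioGuide’s actual algorithm; the frame size and dB thresholds are invented:

```python
import numpy as np

def amp_segments(signal, frame=512, on_db=-30.0, off_db=-40.0):
    """Naive amplitude-based segmenter with hysteresis: a segment
    opens when frame RMS rises above on_db and closes when it
    falls below off_db (both in dBFS). Returns (start, end) in samples."""
    n = len(signal) // frame
    rms = np.sqrt(np.mean(signal[:n * frame].reshape(n, frame) ** 2, axis=1))
    db = 20 * np.log10(np.maximum(rms, 1e-10))
    segments, start = [], None
    for i, level in enumerate(db):
        if start is None and level > on_db:
            start = i * frame
        elif start is not None and level < off_db:
            segments.append((start, i * frame))
            start = None
    if start is not None:                # close a segment running to the end
        segments.append((start, n * frame))
    return segments

# Quiet-loud-quiet test signal: expect exactly one segment
sig = np.concatenate([np.zeros(2048), 0.5 * np.ones(4096), np.zeros(2048)])
print(amp_segments(sig))  # [(2048, 6144)]
```

The separate on/off thresholds give simple hysteresis, so a quiet tail does not chop one gesture into fragments; that two-threshold behaviour is part of what makes this family of segmenters pleasant to tune.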

What I was missing

  1. the ability to take a buffer in Max/bach (from NMF, for instance), a small object, and ask my database for the nearest match without bouncing. Even with bouncing, it was always long and complicated. I wished I had something a little more like our FluidSound browser, and this will soon be a first contribution to the 2nd toolbox (as a prototype first), as part of the project’s original goals of taxonomy/browsing; it is good that this confirms the need for such a thing. My attempt at a distance visualiser has also raised questions about which descriptors to use (and how) in matching, and confirmed how non-intuitive they are beyond the CataRT model I have used for the past 10 years (with the Sandbox#3 paradigm, although there was already some reflection in there on descriptor usefulness and relative weight).
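The kind of nearest-match query I have in mind can be sketched like this. The file names and descriptor values are invented for the example; the point is the normalised distance in descriptor space:

```python
import numpy as np

# Hypothetical corpus: one descriptor vector per sample,
# here [spectral centroid in Hz, loudness in dB, pitch confidence]
corpus = {
    "bell.wav":  np.array([3200.0, -18.0, 0.90]),
    "noise.wav": np.array([6500.0, -12.0, 0.10]),
    "cello.wav": np.array([ 400.0, -24.0, 0.95]),
}

def nearest(target, corpus):
    """Return the corpus entry closest to `target` by Euclidean
    distance in z-scored descriptor space."""
    names = list(corpus)
    mat = np.stack([corpus[n] for n in names])
    mu, sigma = mat.mean(0), mat.std(0) + 1e-9  # per-descriptor normalisation
    dists = np.linalg.norm((mat - mu) / sigma - (target - mu) / sigma, axis=1)
    return names[int(np.argmin(dists))]

print(nearest(np.array([3000.0, -20.0, 0.8]), corpus))  # bell.wav
```

Without the per-descriptor normalisation, the centroid in Hz would dwarf the loudness and pitch-confidence dimensions, which is exactly the weighting question raised above.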

What I ended up doing

  1. with Araknides I heard something quite different from the target, but exciting and inspiring, so I reconstructed the ‘solution’ by hand in Reaper with the sounds from the database, with manual correction of the relative loudnesses. I had to score all this new editing manually, but that yielded the best results. In the workshop, the result was similar enough to the maquette for me to be happy.

  2. with AudioGuide I wrote a parser to Max/bach so I could load/see/play the ‘solution’ before committing it to score and/or audio. It was fun, yet again quite far from the target. The bach bridge was useful for soloing tracks and sieving through the solution.

  3. surprisingly, I was not able to segment by noisiness or pitch confidence, which led @weefuzzy, @groma and myself to discuss the prominent features we use to segregate, which are often multi-modal. So we are thinking in terms of research here (under the first toolbox header) on how to make better-informed segmentations…

I hope some of these points will trigger some discussions. I’m noticing how my experience of musaiking is quite grain-based and immediate, and how these CAO tools offer different constraints, but I’m interested in any thoughts!


Interesting to read this stuff, and to see how you’re going about it.

On a semi-aesthetic and conceptual note, I’m wondering/curious about what CAO offers you, in a chamber context, that “classical” spectralist methodologies don’t. This is especially the case for contemporary playing techniques that are broader than what would be available in a CAO database (I can only assume). Like, would you get a broader range of sounds and orchestrations from a richer sound source being filtered through a highly intelligent but limited-in-scope database, vs. a human orchestrator?

Secondly, I’m wondering how you are factoring in notation here, as I would think that’s as significant a “translation” layer as going from synth->orchestration is. There is so much room for interpretation in that translation/quantization/intonation/etc. process that I’m curious whether you are revising your general “as simple as possible to communicate the idea” notation pragmatism. It might be worth investigating (and testing) how notation impacts this workflow.

On less of a question-y note, I’d be curious to hear some of your examples/tests/maquettes, particularly as a source->CAO->ensemble comparison. Don’t know if that’s too “public” for your working process here.

The segmentation (from AudioGuide specifically) is awesome, so it would be great to see what you guys come up with there.

Lastly, I’m not sure what the sentence means, which is important to that bullet point. Is “a small object” like a musical object?


Thanks for this - and for your power posts on the forum: I’ve just discovered I can select code above and use it as quote, I love discourse :wink:

The answer is in your following (wrong) assumption:

The IRCAM database has a lot of contemporary playing techniques (on the instruments it covers). So the advantages of the method are multiple:

  1. it provides me with ideas I would not think of, because I don’t have all that knowledge of orchestration in my aural brain;
  2. the number of combinations is incredibly high, so a computer can run through them quickly for me;
  3. it challenges my reflexes/assumptions/clichés by proposing material in new ways.

All in, stimulating!

Definitely. But at the moment, I’m trying to stay aural and work with the musicians to make the notation as efficient as possible…

Wait, it just helped me confirm my desired list of features for the amplitude-based segmenter that I keep pestering @weefuzzy and @groma for :wink: They are both telling me to be patient (after the hardcore refactor they are doing, and with a few more ideas I hope)

In our taxonomy, it is a spectromorphology, so a bit of the spectrum in time. Imagine having an over-ranked NMF where you group slices of time and spectrum together to make a composite ‘object’ that you then try to match… this is what I’m trying to do with the complex synth textures.
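To make the composite idea a bit more concrete, here is a toy numpy sketch: pretend we already have an over-ranked (rank-8) NMF of a texture, group components whose activation curves correlate, and sum each group into one composite ‘object’. The shapes, the random stand-in data and the 0.5 correlation threshold are all arbitrary:

```python
import numpy as np

# Stand-ins for an "over-ranked" NMF result of a texture:
# W holds 8 spectral bases, H their activations over 100 frames
rng = np.random.default_rng(1)
W, H = rng.random((64, 8)), rng.random((8, 100))

# Greedily group components whose activation curves correlate
corr = np.corrcoef(H)                 # 8 x 8 activation similarity
groups, assigned = [], set()
for i in range(8):
    if i in assigned:
        continue
    group = [j for j in range(8) if j not in assigned and corr[i, j] > 0.5]
    assigned.update(group)
    groups.append(group)

# Sum each group into one composite spectrogram "object" to match
composites = [W[:, g] @ H[g] for g in groups]
```

On real material the grouping criterion would need to be smarter than raw correlation (onset synchrony, spectral overlap, etc.), but the shape of the idea is the same: over-decompose, then re-aggregate into matchable objects.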

That’s good. Was just curious.
I look forward to hearing what comes from it.

Yeah that makes sense.
I guess I just meant that there’s no “staying aural” when notation is in the mix, even with aural/verbal communication. It will color the interpretation no matter what, even if it’s not a consideration.

I gotcha. I just wasn’t sure if you were talking about a Max object, or C++, etc. It was unclear from the sentence.

Actually, with workshops with amazing musicians like Wet Ink, we are not far from aural intention at all :wink: But I know what you mean; I think Schaeffer was quite clear on that…

+1 for AudioGuide segmentation! I’ve been playing with it recently, comparing the results with REAPER and MuBu, and it is superior in its flexibility. Where it does fall short is that the defaults are sometimes really bad, but that’s the obvious trade-off between low-level control and how well it works out of the box.

Thanks for the feedback. My list of things for our amplitude-based-segmenter-in-progress is quite influenced by it (although I think I prefer mine :wink:), and also by Reaper’s and the beautiful simplicity of Audition’s. Keep sending comments and it will get even better. I think your point about ‘out of the box’ is quite important, so I will try to devise progressive empowerment in it…

stay tuned

Hi @tremblap, not sure if this can be of any help, but:
Carmine Emanuele Cella (an Italian composer) has taken up the job of rewriting Orchis/Orchids/Orchidee/etc.
At the current stage it’s a command-line tool and a Max package (I helped him a little with the Max binding, but only with that). AFAICT it’s still in a super-alpha phase, but it’s simpler than Orchidée, and it’s going to be open source.

You can find it here

To run some examples you need bach >= 0.8 and dada, but I guess you already have both.

I’ll check that with eager interest indeed.

Reviving this topic: what @danieleghisi talked about, namely a recoded, more user-friendly version of Orchidée, is now live at this website, and there are even tutorials slowly arriving on YouTube.

It might interest people that the tutorials look quite a bit better than before; @a.cassidy might be interested to start there (they smell of the amazing quality control of the Bach Project Doc :wink:).


Hello all

I’ve checked this, and the video explains Orchidea’s approach and the general problem quite well in the first part, so it might be of interest.