Hi all, I’m very new to a lot of the concepts here and am interested in interacting with these tools via the command line.
Is there a place that describes how to use the slice indices output by some of the CLI commands?
For instance, I run
fluid-noveltyslice -source a.wav -indices indices.wav -fftsettings 2048 -1 -1 -threshold 0.2 -filtersize 1 -feature 1
on a 30-second file, and the resulting indices.wav, when opened in an audio player, is a shorter segment of blips. I assume that the output needs to be processed further to be made useful, but I couldn’t find any examples of doing so.
I found some threads talking a bit about buffers but couldn’t contextualize the info enough to get any traction. Maybe someone has some nice examples of the CLI in action or a link to an explanation of the slice output format?
Hello @philomates and welcome!
The result you get is a wav or a csv file. The numbers in there are the frame numbers where slices were detected.
I’ll make a little example below, but the ultimate example is in @jamesbradbury’s ReaCoMa project, if you can read Lua.
More very soon.
The simplest example is this:
fluid-noveltyslice -source a.wav -indices indices.csv
If you use that in a Python script, for instance, you can then retrieve the values from indices.csv and iterate over them pair-wise to do further processing or description of each segment.
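To make the pair-wise idea concrete, here is a minimal sketch of reading the indices and walking over them as (start, end) pairs. It assumes the CSV holds plain comma-separated frame numbers (on one row or several); the function names `read_indices` and `pairwise_segments` and the example values are made up for illustration and are not part of the FluCoMa tools.

```python
import csv
import io

def read_indices(csv_text):
    """Parse comma-separated slice indices (frame numbers) into a flat list of ints."""
    indices = []
    for row in csv.reader(io.StringIO(csv_text)):
        # int(float(...)) accepts both "22050" and "22050.0" style values
        indices.extend(int(float(cell)) for cell in row if cell.strip())
    return indices

def pairwise_segments(indices):
    """Turn [a, b, c] into [(a, b), (b, c)]: one (start, end) pair per segment."""
    return list(zip(indices, indices[1:]))

# Hypothetical contents of indices.csv, for illustration only
example = "0,22050,51200,88200\n"
for start, end in pairwise_segments(read_indices(example)):
    print("segment from frame", start, "to frame", end, "length", end - start)
```

With a real file you would pass in `open("indices.csv").read()` instead of the example string.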
I hope this helps? The CLI is not a full scripting language; it is literally a set of super-efficient tools for getting at elements. @jamesbradbury, again, designed the amazing ftis, which wraps some of our things with some of other people’s things. He presented that here:
Thanks for the props @tremblap, although the CSV reading in ReaCoMa is very terse and imo not hugely helpful if you’re trying to wrap your head around just that aspect.
Perhaps the best example for getting immediate feedback would be to chain cat after your call? If you change the extension of the output to .csv then the output will be a comma-separated value file instead.
fluid-noveltyslice -source a.wav -indices indices.csv && cat indices.csv
for example gives me this:
This might not be ideal though, and the kind of workflow that you envisage or want to achieve will change the answer to your question a lot. Do you want to just get a quick understanding of how the algos are slicing certain sounds or perhaps do you want to chop up the sound file into new ones without leaving the command line?
You could in theory use something like awk to facilitate more complex distributing of the data, but then I’d encourage you to go elsewhere - perhaps Python to orchestrate that sorta stuff from the CLI.
P.S
If you do speak Python, then I wrapped all the command line algorithms and you can use the data directly in Python as numpy buffers or lists (your choice of output). A library like pydub combined with this is how I do a lot of batch processing for segmented sound → many files.
Thanks for the tips and references!
ftis looks very cool and potentially just what I’m looking for. And yes, Python tends to be easier than piping unix commands together for me.
I was able to get what I initially wanted, that is use the CLI to turn 1 input wav into X sliced wavs, using the following python3 + flucoma cli + sox glue:
#!/usr/bin/env python3
import os
import subprocess

input_file = "b.wav"
indices_file = "indices.csv"
slice_file_base = "slice"

# Run the slicer; the slice points (in frames) are written to indices_file as CSV
os.system("fluid-noveltyslice -source {0} -indices {1} -fftsettings 2048 1024 2048 -threshold 0.5 -filtersize 1 -feature 1".format(input_file, indices_file))

# Ask sox for the sample rate so frame numbers can be converted to seconds
sample_rate = float(subprocess.check_output("sox --i -r {0}".format(input_file), shell=True))

with open(indices_file, "r") as f:
    indices = [int(e) for e in f.read().rstrip().split(",")]

# Each pair of consecutive indices marks the start and end of one slice
for i in range(len(indices) - 1):
    x, y = indices[i], indices[i + 1]
    slice_file = "{0}_{1}.wav".format(slice_file_base, i)
    offset = x / sample_rate
    length = (y - x) / sample_rate
    # Six decimals so slice boundaries are not rounded to the nearest 10 ms
    cmd = "sox {0} {1} trim {2:.6f} {3:.6f}".format(input_file, slice_file, offset, length)
    print(cmd)
    os.system(cmd)
Being new to this (and pretty novice with sample rates etc.), the thing that tripped me up for a second was that the indices actually represent frames, which need to be converted into seconds (by dividing by the sample rate) in order to use them with sox.
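For anyone else who trips over the same thing, the conversion is just a division by the sample rate. The sketch below shows it, plus an alternative that skips the conversion entirely: sox accepts positions given as a sample count with an `s` suffix, so the frame numbers can be passed to `trim` directly. The helper names `frames_to_seconds` and `sox_trim_args` and the 44100 Hz rate are assumptions for illustration only.

```python
def frames_to_seconds(frame, sample_rate):
    """Convert a frame (sample) position to a time in seconds."""
    return frame / sample_rate

def sox_trim_args(start_frame, end_frame):
    """Build sox 'trim' arguments in samples: start position, then length."""
    length = end_frame - start_frame
    return ["trim", "{0}s".format(start_frame), "{0}s".format(length)]

sample_rate = 44100  # assumed here; query the real rate with `sox --i -r file.wav`
for frame in (0, 22050, 66150):
    print(frame, "frames =", frames_to_seconds(frame, sample_rate), "seconds")

# Arguments that could be appended to a `sox input.wav output.wav` call
print(" ".join(sox_trim_args(22050, 66150)))
```

Working in samples avoids any rounding from formatting seconds as decimals, at the cost of the command being a bit less readable.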
Anyways, thanks again and excited to now have enough context to start playing around!
Nice one. I like that this is entirely the way I wouldn’t have thought to do it. I love SoX - so useful, and this just proves it. There’s probably room in the world for a general-purpose utility in the FluCoMa-verse that turns segments into sound files. It seems to be a very common workflow based on my sample size of 3, in which 2 people do it (me and you) and 1 person hates it because it is a sin against their disk space.
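As a rough idea of what such a segments-to-files utility could look like, here is a sketch that uses only Python’s standard `wave` module, so it needs neither sox nor pydub. It is not an existing FluCoMa tool: the name `split_wav` and the output naming pattern are invented for illustration, it assumes the indices are sorted frame numbers, and the `wave` module only handles uncompressed PCM WAV files. Unlike the sox script above, it also writes the final segment running from the last index to the end of the file.

```python
import wave

def split_wav(source_path, indices, out_pattern="slice_{0}.wav"):
    """Write the audio between consecutive frame indices to separate WAV files.

    Returns the list of file paths that were written.
    """
    written = []
    with wave.open(source_path, "rb") as src:
        params = src.getparams()
        total = src.getnframes()
        # Drop out-of-range indices and use the end of the file as the last boundary
        bounds = [i for i in indices if 0 <= i < total] + [total]
        for n, (start, end) in enumerate(zip(bounds, bounds[1:])):
            if end <= start:
                continue  # skip empty or out-of-order segments
            src.setpos(start)
            frames = src.readframes(end - start)
            out_path = out_pattern.format(n)
            with wave.open(out_path, "wb") as dst:
                # Copy channels, sample width and rate; the frame count in
                # the header is corrected automatically when the file closes
                dst.setparams(params)
                dst.writeframes(frames)
            written.append(out_path)
    return written
```

A call such as `split_wav("b.wav", indices)` with the list read from indices.csv would then write slice_0.wav, slice_1.wav and so on into the current directory.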
Cool! Some FluCoMa in the wild then. I like their approach to just gathering up a load of material and letting the computer figure it out for them