How to use CLI slice indices output?

philomates · May 14, 2021, 2:41pm

Hi all, I’m very new to a lot of the concepts here and am interested in interacting with these tools via the command line.

Is there a place that describes how to use the slice indices output of some of the CLI commands?

For instance, I run

fluid-noveltyslice -source a.wav -indices indices.wav -fftsettings 2048 -1 -1 -threshold 0.2 -filtersize 1 -feature 1

on a 30-second file and the resulting indices.wav, when opened in an audio player, is a shorter segment of blips. I assume that the output needs to be handled further to be made useful, but I couldn’t find any examples of doing so.

I found some threads talking a bit about buffers but couldn’t contextualize the info enough to get any traction. Maybe someone has some nice examples of the CLI in action or a link to an explanation of the slice output format?

tremblap · May 14, 2021, 3:25pm

Hello @philomates and welcome!

The result you get is a wav or a csv file. The numbers in there are the frame numbers where slices were detected.

I’ll make a little example below, but the ultimate example is in @jamesbradbury’s ReaCoMa project, if you can read Lua.

More very soon.

tremblap · May 14, 2021, 3:42pm

The simplest example is this:

fluid-noveltyslice -source a.wav -indices indices.csv

If you use that in a Python script, for instance, you can then retrieve the values from indices.csv and iterate over them pair-wise to do further processing or description.
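
A minimal sketch of that pair-wise iteration, assuming indices.csv holds comma-separated frame numbers (which is how the script later in this thread reads it). The helper names `parse_indices` and `pairwise` are made up for illustration, and the values are examples, not real output.

```python
def parse_indices(text):
    """Turn comma-separated text such as '0,512,2048' into a list of ints."""
    # Going through float() also accepts values written as '512.0' (an assumption
    # about the file; plain integers work either way).
    tokens = text.replace("\n", ",").split(",")
    return [int(float(tok)) for tok in tokens if tok.strip()]


def pairwise(indices):
    """Return (start, end) pairs for consecutive slice points."""
    return list(zip(indices, indices[1:]))


if __name__ == "__main__":
    # Example values only; in practice read the text from indices.csv.
    for start, end in pairwise(parse_indices("0,512,2048")):
        print(start, end, end - start)
```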

I hope this helps? The CLI is not a full scripting language, just a set of very efficient tools for extracting these elements. @jamesbradbury also designed the amazing ftis, which wraps some of our tools together with some of other people’s. He presented that here:

jamesbradbury · May 14, 2021, 3:42pm

Thanks for the props @tremblap, although the CSV reading in ReaCoMa is very terse and imo not hugely helpful if you’re trying to wrap your head around just that aspect.

Perhaps the best example for getting immediate feedback would be to chain cat after your call? If you change the extension of the output to .csv, then the output will be a comma-separated values file instead.

fluid-noveltyslice -source a.wav -indices indices.csv && cat indices.csv for example gives me this:

This might not be ideal though, and the kind of workflow that you envisage or want to achieve will change the answer to your question a lot. Do you want to just get a quick understanding of how the algos are slicing certain sounds or perhaps do you want to chop up the sound file into new ones without leaving the command line?

You could in theory use something like awk for more complex handling of the data, but then I’d encourage you to go elsewhere - perhaps Python, to orchestrate that sorta stuff around the CLI.

P.S

If you do speak Python, then I wrapped all the command line algorithms and you can use the data directly in Python as numpy buffers or lists (your choice of output). A library like pydub combined with this is how I do a lot of batch processing for segmented sound → many files.

philomates · May 15, 2021, 6:23pm

thanks for the tips and references!
ftis looks very cool and potentially just what I’m looking for. And yes, python tends to be easier than piping unix commands together for me

I was able to get what I initially wanted, that is, to use the CLI to turn 1 input wav into X sliced wavs, using the following python3 + flucoma cli + sox glue:

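#!/usr/bin/env python3
# Slice one wav into many: fluid-noveltyslice finds the slice points, sox cuts the file.

import subprocess

input_file = "b.wav"
indices_file = "indices.csv"
slice_file_base = "slice"

# Write the detected slice points (in sample frames) to a CSV file.
subprocess.run(["fluid-noveltyslice", "-source", input_file, "-indices", indices_file,
                "-fftsettings", "2048", "1024", "2048",
                "-threshold", "0.5", "-filtersize", "1", "-feature", "1"], check=True)

# Ask sox for the sample rate so frames can be converted to seconds.
sample_rate = float(subprocess.check_output(["sox", "--i", "-r", input_file]))

with open(indices_file, "r") as f:
    indices = [int(e) for e in f.read().rstrip().split(",")]

# Each pair of consecutive indices marks the start and end of one slice.
for i in range(len(indices) - 1):
    x, y = indices[i], indices[i + 1]
    slice_file = "{0}_{1}.wav".format(slice_file_base, i)
    offset = x / sample_rate
    length = (y - x) / sample_rate
    # Six decimals instead of two, so slice boundaries are not rounded to 10 ms.
    cmd = ["sox", input_file, slice_file, "trim", "{0:.6f}".format(offset), "{0:.6f}".format(length)]
    print(" ".join(cmd))
    subprocess.run(cmd, check=True)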

Being new to this (and pretty novice with sample rates etc.), the thing that tripped me up for a second was that the indices actually represent frames, which need to be converted into seconds in order to be used with sox.
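
The conversion itself is just a division by the sample rate. A small sketch, with made-up function names and example numbers rather than values from a real file:

```python
def frames_to_seconds(frame, sample_rate):
    """Convert a position in sample frames to seconds."""
    return frame / sample_rate


def sox_trim_args(start, end, sample_rate):
    """Build the offset and length arguments for `sox ... trim`, in seconds."""
    offset = frames_to_seconds(start, sample_rate)
    length = frames_to_seconds(end - start, sample_rate)
    # Microsecond resolution is finer than one sample at common sample rates.
    return ["trim", "{:.6f}".format(offset), "{:.6f}".format(length)]


if __name__ == "__main__":
    # At 44100 Hz, frame 44100 is exactly one second into the file.
    print(frames_to_seconds(44100, 44100))
    print(sox_trim_args(22050, 66150, 44100))
```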

Anyways, thanks again and excited to now have enough context to start playing around!

jamesbradbury · May 15, 2021, 7:31pm

Nice one. I like that this is entirely the way I wouldn’t have thought to do it. I love SoX - so useful, and this just proves it. There’s probably room in the world for a general-purpose utility in the FluCoMa-verse that turns segments into sound files. It seems to be a very common workflow based on my sample size of 3, in which 2 people do it (me and you) and 1 person hates it because it is a sin against their disk space.
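
As a rough idea of what such a utility could look like, here is a sketch that uses only Python’s standard `wave` module instead of sox. It is not part of FluCoMa; the function name and the output naming scheme are invented, and it assumes a PCM wav file, which is all the `wave` module reads.

```python
import wave


def write_slices(source_path, indices, out_base="slice"):
    """Write one wav file per pair of consecutive frame indices; return the paths."""
    paths = []
    with wave.open(source_path, "rb") as src:
        params = src.getparams()
        for i, (start, end) in enumerate(zip(indices, indices[1:])):
            src.setpos(start)                     # seek to the slice start, in frames
            frames = src.readframes(end - start)  # read exactly one slice
            path = "{0}_{1}.wav".format(out_base, i)
            with wave.open(path, "wb") as dst:
                dst.setparams(params)             # same channels, sample width and rate
                dst.writeframes(frames)           # frame count in the header is fixed up here
            paths.append(path)
    return paths
```

Calling `write_slices("b.wav", [0, 44100, 88200])` would write slice_0.wav and slice_1.wav, each one second long for a 44.1 kHz file. Working in frames throughout avoids the conversion to seconds altogether.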

philomates · May 15, 2021, 7:48pm

I actually picked up the idea from "Barriers 1: Grenzmauer '75 - I scraped a contact microphone over the Berlin Wall" (Releases - lines), so there is yet another user.

jamesbradbury · May 15, 2021, 8:12pm

Cool! Some FluCoMa in the wild then. I like their approach to just gathering up a load of material and letting the computer figure it out for them