Experimental Creative Writing with the Vectorized Word

rodrigo.constanzo · January 15, 2019, 8:21pm

Interesting video, and approach.

It’s a bit of a bummer that the reverse of these metaphors doesn’t really work, or rather, that the DSP equivalents of what she’s doing with words are already the basis for most resynthesis/convolution.

Still interesting to watch.

jamesbradbury · January 16, 2019, 8:39am

This is damn cool. I was unconvinced when she was demonstrating the Python code with output, but the poetry toward the end was definitely intriguing. I particularly enjoyed the random walk through phonetic similarity space. I see this as something I’d have been really interested in about a year ago, writing a piece with sounds primarily from vocal sources.

rodrigo.constanzo · January 16, 2019, 8:55am

Yeah that was my favorite bit. It’s a shame she didn’t have the text on screen. I would have liked to read it (plus I’m not much for this kind of bland poetry reading).

“Her breasts, the hands on her breasts, her hands, her breasts, the breasts, her hands, her breasts on the hands on her breasts…”.

Edit: actually thinking about it now, I wonder if the dimensionality reduction techniques they talk about could be useful to draw on: taking a vector space of 800+ dimensions, in some cases, and reducing that down to 50ish dimensions that encompass the “meaning” or “sound” of a word.

I could see easy parallels in having a really complex descriptor/feature space analyzed across a wide range of material, and then reducing that down to something more useful and/or comparable (though probably “illegible”, like MFCCs).
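
A minimal sketch of that kind of reduction, assuming plain PCA computed with NumPy's SVD (the video may well use a different method). The 800 and 50 are the rough figures mentioned above, and the random matrix is stand-in data, not real word vectors or audio descriptors.

```python
import numpy as np

def pca_reduce(features, n_components):
    """Project the rows of `features` onto their top principal components."""
    centered = features - features.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

rng = np.random.default_rng(0)
# Hypothetical stand-in: 1000 items (words, or audio slices) with 800 descriptors each.
features = rng.normal(size=(1000, 800))
reduced = pca_reduce(features, 50)
print(reduced.shape)
```

The reduced columns are linear mixtures of the original descriptors, which is where the “illegible” part comes from: they are comparable across items but no longer map onto a single named feature.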

jamesbradbury · January 17, 2019, 7:45am

@a.harker gave a CCL talk on descriptors last year and showed an interesting bit of work from Tom Grill that featured dimensionality reduction.

https://grrrr.org/data/research/texmap/

tremblap · January 17, 2019, 3:41pm

@groma is working on this right now

pasquetje · March 5, 2019, 1:55pm

Hello !

I was away for a while… Too much work here. I never forgot Flucoma though and maybe … you !!!

A little messy thought:

That thread makes me think of something I have been exploring for a while. OpenMusic people may know the ZL library, which uses Lempel–Ziv (the ZIP algorithm, in short). It is used in OM for both notes and text, and I love its ease of control.

Nice @rodrigo.constanzo! I also believe sequencing can sometimes be missing for time-based stuff. This is why a markovian approach is sometimes easier for me; but not always.

I love the fact LZW separates analyzed sequences into a dictionary (would be a cluster for a SOM) on one side and a sequence on the other. Super cool: their size can independently and dynamically change. So you can ask for large bits of sequence generated from short segments etc.
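
To make the dictionary/sequence split concrete, here is a rough sketch of textbook LZW parsing in Python. It is not the OpenMusic library's code, and the function names are made up for illustration. The dictionary holds the phrases learned so far, and the sequence is the list of indices into it.

```python
def lzw_parse(symbols):
    """Split a sequence into an LZW phrase dictionary and a list of codes."""
    # Start with one entry per distinct symbol (letters, notes, cluster ids...).
    dictionary = {(s,): i for i, s in enumerate(sorted(set(symbols)))}
    codes, current = [], ()
    for s in symbols:
        candidate = current + (s,)
        if candidate in dictionary:
            current = candidate  # keep extending the known phrase
        else:
            codes.append(dictionary[current])  # emit the longest known phrase
            dictionary[candidate] = len(dictionary)  # learn the new, longer one
            current = (s,)
    if current:
        codes.append(dictionary[current])
    return dictionary, codes

def render(dictionary, codes):
    """Turn a list of codes back into a flat list of symbols."""
    phrases = {i: phrase for phrase, i in dictionary.items()}
    return [s for code in codes for s in phrases[code]]

text = "abracadabra abracadabra"
dictionary, codes = lzw_parse(text)
# Replaying the codes in their original order gives back the input...
assert "".join(render(dictionary, codes)) == text
# ...while any other ordering is a new sequence built from the learned phrases.
print("".join(render(dictionary, reversed(codes))))
```

Because the two parts are separate, a generator can draw long runs of codes from a small dictionary, or short runs from a large one, which is the independence described above.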

The fact that sequencing is added to the clustering provides something better than just random walks. Better means more control; maybe not perceptually…
Using an RNN works fine too (the TensorFlow demos), but I never had the chance to go really far with that. I may be wrong, but controlling it may not be easy.

So why not add dimensions to all that and play with horizontal and vertical sequences of MFCCs? 800+1 dimensions, the extra one being time? Or 800+2 if one has two representations of time…

I hope I will have that extra time to dig further into something like the “continuator”, “OMax” or “PyOracle”.
To be continued. ;9

tremblap · May 26, 2020, 4:25pm

I like the self-aware ‘bad digital humanities’ bit too!

jamesbradbury · May 28, 2020, 10:56am

Didn’t know about this! Thanks

tutschku · May 28, 2020, 4:18pm

very interesting indeed, not sure yet how to bend the concepts towards my needs, but I’ll keep it on my study list

tremblap · May 28, 2020, 5:18pm

What I was thinking about, for my part, was the chaining of higher-level ‘phonemes’ to build a dictionary… It is not clear yet, but imagine if larger chunks (‘words’) were thought of as classes (noun, verb, etc.)… or something more ‘machine learny’, like they do, with a sort of vector of transitions between each class, or even each ‘word’… Not clear yet, as you can see, but once I get chunks of valid gestalt for me, I want to be able to think about their chaining. I know @weefuzzy has done some cool stuff in that direction already, and @groma also had many ideas…
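
One very rough way to picture the “transitions between each class” idea is a first-order Markov chain over class labels. The sketch below only makes that assumption concrete: the labels, the training sequence and the function names are invented for illustration, and this is not a description of anyone's actual work mentioned here.

```python
import random
from collections import Counter, defaultdict

def transition_counts(labels):
    """Count how often each class is directly followed by each other class."""
    counts = defaultdict(Counter)
    for here, there in zip(labels, labels[1:]):
        counts[here][there] += 1
    return counts

def chain(counts, start, length, rng):
    """Walk the table, picking each successor in proportion to its count."""
    out = [start]
    # Stop early if the current class was never seen with a successor.
    while len(out) < length and counts[out[-1]]:
        successors = counts[out[-1]]
        out.append(rng.choices(list(successors), weights=list(successors.values()))[0])
    return out

# Hypothetical class labels for a series of analysed chunks.
labels = ["noun", "verb", "noun", "adverb", "verb", "noun", "verb", "adverb"]
counts = transition_counts(labels)
print(chain(counts, "noun", 8, random.Random(1)))
```

Each generated label would then be filled in with an actual chunk from that class, so the chain decides the order and the classes decide what is interchangeable.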

jamesbradbury · May 29, 2020, 2:33pm

One might also take influence from strategies for creating conlangs, as they have lots of similar requirements, such as constructing a grammar, designing a phonology, etc. I had a friend who made his own radio show in his conlang, and it was weirdly convincing as a language people used day to day. If I can find the link I will.

pasquetje · July 4, 2020, 11:25am

You could hack a speech recognition engine and “reverse” its use. I used this for my own stuff and for Georges Aperghis.

For instance, a long time ago, using the now dead Sphinx library, I made an mxj (Java within Max) that retrieved the position of phonemes recorded into a buffer~. The engine used an n-gram file (often 3-gram) together with a language-specific phoneme dictionary. Sphinx provided several languages.
It was possible either to use the whole dictionary (not great) or to narrow the search using a JSpeech Grammar Format text file containing a probability tree of the text to be recognized (super great).
The external gave lists of the recognized text and the positions of words and phonemes (!!!) in the buffer~. You could then concatenate them à la Aperghis, etc.
I do not know if that still works. There are much better engines now that would be worth implementing in Max.
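
The last step, cutting the buffer at the recognised positions and gluing the pieces back together in a new order, can be sketched roughly like this. Everything here is hypothetical: the segment list stands in for whatever a recogniser returns (it is not the Sphinx or mxj API), positions are assumed to be in samples, and a NumPy array stands in for the buffer~.

```python
import numpy as np

def concatenate_segments(buffer, segments, order):
    """Glue slices of `buffer` together; `order` holds indices into `segments`."""
    return np.concatenate([buffer[segments[i][1]:segments[i][2]] for i in order])

sample_rate = 44100
# One second of noise as stand-in audio.
buffer = np.random.default_rng(0).uniform(-1.0, 1.0, sample_rate)
# Hypothetical recogniser output: (phoneme label, start sample, end sample).
segments = [("h", 0, 4000), ("e", 4000, 12000), ("l", 12000, 18000), ("o", 18000, 30000)]
# Play the phonemes back to front, then repeat the last one.
resequenced = concatenate_segments(buffer, segments, [3, 2, 1, 0, 3])
print(len(resequenced) / sample_rate, "seconds")
```

Indexing by position in the list rather than by label keeps repeated phonemes distinct, which matters as soon as the same sound occurs more than once in the recording.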

A dinosaur from the permafrost I would REALLY LOVE to wake and implement into Flucoma… In another life.

rodrigo.constanzo · July 4, 2020, 2:24pm

I think I remember one of your presentations at Hudds with this stuff. It sounded amazing!

I still have a strong memory of when you showed the “leftovers” of the process, which were all the things that weren’t phonemes.