Suggestions for programming/teaching SC to improvise?

Heyhey,

something I’ve wanted to do for quite a while - it was actually a motivating factor in starting SC to begin with, although a lot of other things got “in the way” - is to get to a point with SC where I can play live gigs as a “duo” with my laptop. As in, I play, my laptop plays with me - kinda like this, although I have pretty different aesthetic ideals than George Lewis I guess.

My first attempts at doing this were by writing a bunch of patches (granular sampler, reverb, drum machine) and then randomising which patch was cued and for how long - aside from the drum machine, the synths recorded live input from my guitar and then manipulated it, again randomly. I did a gig with this setup last Tuesday and it wasn’t too bad - I thought the music we made together sounded more or less ok. However, it became super clear that the patches were not really capable of listening or interaction - I felt like I was playing with a quite talented teenager, who had great ideas, but wasn’t really much of a listener or particularly tasteful.

Does anybody have any experience making this happen in SC? Or any ideas how to “train” SC to listen and respond in real time? I don’t have any experience (yet!) with neural networks or machine learning, but would be super happy to hear about any resources or ideas people have. This project is probably one I’ll work on for quite a while, so it doesn’t bother me if things are outside my skill/understanding level right now.

Long post, hope a few people are interested anyway.

Cheers,
Jordan

I haven’t built this specific kind of project, but I have developed ambitious SC instruments with generative systems in them, so I can offer general advice in that space.

Always start small and work incrementally rather than trying to overcomplicate and create the one big patch that does everything. For example, you can start off by using an onset detector (or just Amplitude.ar) to trigger new patches. Then you can use a monophonic pitch detector or band splitter to get a broad idea of whether you’re playing in the high or low register, and the patches you trigger can respond by either mimicking what you do or contrasting with what you do. Have patches that can only trigger once per tune to create unique moments and avoid repetitiveness. Also think about how your instrument will respond to silence — produce more silence, or take a solo?
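
As a deliberately minimal sketch of that first step, assuming the guitar comes in on input channel 0 (the SynthDef name, OSC path and thresholds here are just placeholders):

(
SynthDef(\listener, {
	var in = SoundIn.ar(0);
	var amp = Amplitude.kr(in, 0.01, 0.2);
	var freq, hasFreq;
	# freq, hasFreq = Pitch.kr(in);
	// report amplitude and pitch to the language ten times per second
	SendReply.kr(Impulse.kr(10), '/listen', [amp, freq, hasFreq]);
}).add;

OSCdef(\respond, { |msg|
	var amp = msg[3], freq = msg[4], hasFreq = msg[5];
	if(hasFreq > 0.5 and: { amp > 0.02 }, {
		// crude register check: decide whether to mimic or contrast
		if(freq > 440, { "guitar is high, respond accordingly".postln }, { "guitar is low".postln });
	});
}, '/listen');

Synth(\listener);
)

From there, the OSCdef body could cue one of your existing patches (weighted by register, silence and so on) instead of just posting.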

You may be surprised at how good an improviser you can make with simple algorithms and a lot of careful, musically informed tuning of probabilities, and of course good patches.

I wouldn’t bust out machine learning unless you’ve found a specific need for it. Neural networks in particular have many drawbacks, and a big one is that they usually require mountains of training data to produce interesting or useful results.


I feel like I need to repeat this for myself like a mantra forever! I’ve had so many projects stagnate as I tried to generalize them more and more… On the other hand, simpler self-contained instruments can be made to talk to each other later on (like guitar pedals, in a way… you can chain many of them, but you can also already play with only one).

Another hopefully inspiring thing I want to share, even if it’s probably not what you want to do, is that processes don’t necessarily need to interact to be in relation with each other. It sounds obvious, but when it happened to me it was not. Recently I played “noise saxophone” while a laptop was playing noise from binary files… no interaction, from either side, but IMO a strong relationship in what both me and the laptop were doing.


This answer won’t help immediately, sorry, but I have been thinking about this topic for quite a long time.

In order to play like a musician and not just like a talented teenager when improvising, I think the musician (whether human or computer) needs to represent music to itself through language (whether consciously or not).

As a simple example, when I’m jamming with friends, we start playing that standard mixolydian funk that goes on and on. At some point, I’m feeling bored. Here, the cool part is the fourth, which I raise by a half tone to get the distinctive lydian b7. Now the funk gets taken to another level. Usually the overall colour change raises the tempo a bit (a beat?) and the musical phrases become less structured. It then becomes easy to transition to asymmetric measures, chromatic or other weird scales, etc.

I don’t think the same progression of ideas can be achieved starting from the minor scale, at least not with the augmented fourth (I mean it can but the context would be highly different). It would rather be a switch between the minor and major 6th, for the first step. And probably at a slower tempo.

This is a simple example with tonal ideas but you can easily find similar examples with timbre modulation, chord progressions, spectrum occupation, etc…

I think the problem is that even if you can formalize what happens here in musicological terms, it’s not a deductive science at this point but an inductive one, built on insights about the music. It’s a mix of pure music theory, cultural knowledge and the musician’s experience.

To summarize, how can one code a program such that, playing live, it will run a function like:
“Hey, I’m culturally supposed to go heavy on the next bar, but hey, it’s 2022 and it’s time to change, let’s have a mild break! Now, how do I suggest to the other musicians that that’s where we’re going…”

I don’t have the answer to this question (yet).

Any computer program does what it has been programmed to do, the way it has been programmed to do it.
If we link your “culturally supposed to (do)” to what you, the code-writer, have written in your code (i.e. what you have decided the program should and should not do at any point in time), then what you are trying to achieve is to have the computer overrule this. Meaning taking its own decisions.

But again, you have to tell the computer what the realm of acceptable decisions is, and then train it to detect whether a particular decision is suitable or not.

And this is Artificial Intelligence.

So for me, what you are trying to achieve (in SuperCollider) is an Artificial Intelligence musician. The idea is appealing. Is it feasible?

I lack the technical knowledge to answer properly. I haven’t studied neural networks, and I don’t think I really understood what Gödel’s theorem might imply.

In Gödel, Escher, Bach (which I know is highly criticized), Hofstadter takes the example of someone thinking that 2+2=5. He says that this person is making a mistake, yet the underlying neural processes of this thought are totally correct, meaning that every electron in the brain acts in perfect accordance with the laws of physics.

He then says that a purely deterministic system can still lead to an imperfect system in another ‘layer of representation’ (a perfect brain with an imperfect thought).

From this I supposed (and again I might be really wrong) that there’s not much difference between what my brain is allowed to do and what a computer is allowed to do (or a super huge and complex hydraulic system).

Now what I see, when listening to interesting musicians, is that they somehow break the conceptual musical frame of their time just enough to make it shift. This is my own vision of how to make the musician ‘interesting’ (but there are other concepts, like aesthetics, that I don’t want to go into). I don’t listen to mainstream radio much because I can now deduce what note they’ll play most of the time. They’re applying a recipe.

I don’t think Bach, for example, was applying a recipe. He was applying the recipe just enough to say “look, if I do this, which I shouldn’t but I will, then the way you think about the recipe completely shifts”.

And I think, again, that if Bach’s brain was able to do so, a computer might be able to do so too. I’m interested in understanding how one might achieve this, but I think it requires skills that I don’t have.

I think this is all super interesting. The responses from @nathan and @elgiano have sort of encouraged me to keep going in the direction I’ve been going, though - programming live-electronic FX patches and selecting between them randomly (with weighted probability) for now, and seeing where I end up - and probably checking out onset detection/amplitude tracking soon.

To clarify: personally, I’m not really interested in programming my laptop to think and play like a jazz musician or whatever - more interested in getting my laptop to improvise music that nobody else is going to, but within certain parameters that I unconsciously find appealing.

But yeah, no idea how to achieve what @Dindoleon and @lgvr123 are talking about!

Yup, sorry, I went a little too far on this one.

As you stated, the computer isn’t a good listener. I find it a bit frustrating: you’re playing techno, the crowd is bouncing all around, all you need is a big bass drop… and the computer can’t figure this out and keeps playing its thing in its corner.

AI is supposed to be able to solve this kind of thing, but it’s beyond my scope…

Maybe two directions to improve your setup:

Have visual feedback showing what the laptop is going to play. This doesn’t improve the computer interaction itself, but it helps you stay in tune with it if needed.

What I think might be a way to improve the interaction is to think about how the weighting is done. Maybe, when not in concert, you could rehearse with your laptop, like you would with a musician. Every time it plays something of its own in reaction to your musical input, you have the ability, via a foot pedal or the keyboard, to tell it “This was good” or “I didn’t like this”. Then it recalculates its weightings according to what you chose.

I think this can have different results starting with different weights, having several weight files, comparing weight files, comparing weight files from different songs, etc.

This way, you’re not designing the weights yourself directly, but you shape them from random values, which adds kind of a personality to the laptop. But I’m not sure if it’s feasible.
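
A rough sketch of what that feedback loop could look like, assuming the patches are simply picked by weighted probability (the patch names and the 20% nudge are arbitrary):

(
~patches = [\granular, \reverb, \drums];
~weights = [0.4, 0.4, 0.2];
~lastIndex = nil;

~playNext = {
	~lastIndex = (0..~patches.size - 1).wchoose(~weights.normalizeSum);
	("now cueing: " ++ ~patches[~lastIndex]).postln;
};

// bind this to a pedal or key: true = "this was good", false = "I didn't like this"
~feedback = { |liked = true|
	if(~lastIndex.notNil, {
		~weights[~lastIndex] = ~weights[~lastIndex] * if(liked, 1.2, 0.8);
		~weights = ~weights.normalizeSum;
	});
};
)

Writing ~weights to a file per song would give you the several weight files mentioned above.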

Actually, you’ll need a big 2D matrix.
Horizontally, all the possible actions/decisions/phrases that you’ll allow your program to choose from.
Vertically, all the different decision factors/pieces of information that you will give to your program so that it can take decisions. This information can come from sensors (volume sensors), timers (e.g. elapsed time since the last decision was taken), pitch analysers, the number of other instruments playing, …

Based on that, you can define a stochastic process, i.e. fill in your matrix: which decisions are allowed under which circumstances, possibly with weights. In those cases, I like a Markov process:

States\Phrases:                   |   A   |   B   |   C   |   D   |
----------------------------------+-------+-------+-------+-------+
Start                             |  70%  |  25%  |   5%  |       |
State 1, current phrase = A       |  50%  |  50%  |       |       |
State 1, current phrase = B       |  80%  |  20%  |       |       |
State 2, current phrase = A       |       |  30%  |  60%  |  10%  |
State 2, current phrase = B       |       |       |  10%  |  90%  |
Else                              |  45%  |  45%  |   5%  |       |

That’s a really cool way of thinking about it. Have you implemented this in code? I’d love to see what a simple implementation of this looks like in sclang.

Something I think people often overlook is that improvisers don’t simply respond to the last thing that happened - they know where they are in the current session.

Simple example: imagine the computer is just responding to your pitch. It should respond differently if you change register after staying in the same place a long time than if you change after a couple of seconds. If you change from low to high after 2s, the computer could change its behavior 2s later, for example, while if you change after 30s it might change right away. That way the two of you could get something going on a 2s rhythm in the first case, and seem to be switching sections together in the second…

Or compy might look for a longer pattern in your behavior and respond to that…

All of which is to say that context sensitivity in the time dimension at longer scales seems like the piece that is most often missing in these kinds of exercises.
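
As a small sketch of that idea (the helper name ~registerChanged and the 5-second threshold are arbitrary choices): remember when the register last changed, and make the response delay depend on how long the previous register was held.

(
~register = \low;
~lastSwitch = Main.elapsedTime;

~registerChanged = { |newRegister|
	var now = Main.elapsedTime;
	var held = now - ~lastSwitch;  // how long the previous register was held
	// quick changes get answered in rhythm; long stays get answered right away
	var delay = if(held < 5, { held }, { 0 });
	~register = newRegister;
	~lastSwitch = now;
	SystemClock.sched(delay, {
		("switching behaviour to match the " ++ newRegister ++ " register").postln;
		nil;
	});
};
)

Your pitch-tracking code could call ~registerChanged whenever the detected pitch crosses a register boundary.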


Yes, sort of: the matrix is not explicit, but this is the principle.
This is a Pbind that improvises over a chord progression.
Here the different states are limited to what the Pbind did at the previous iteration.

~solo=Pbind(
	(...)
	\degree,	Pfunc({|current|
		var pd, d;
		// States based on the previous note
		#pd,d=((~pE?(\degree: Rest()))[\degree].isRest).if({
			// State 1 : the previous note was a rest
			// Available decisions: take a note in the pentatonic scale (or have a Rest)
			var scale, gt, d, n;
			scale=~pg[current['measure']]['scale'];
			gt=~pg[current['measure']]['gt'];
			current['scale']=scale;
			current['gtranspose']=gt;
			// Action: select a degree of the pentatonic scale
			// all the degrees have an even probability.
			d=scale.degrees.choose ;
			pd="Rest";
			["rest",d];
		},{
			// State 2 : the previous note was a note
			// Available decisions: move from last played note with a gaussian probability (or have a Rest)
			var d,pd;
			var step;
			var out;
			// identify the degree of the last played note in current scale
			var pF=~pE[\freq].value;
			pd=~freqToScale.value(current,pF);

			// Action: select a distance from the previous *degree* with a gaussian distribution
			step=0.gauss(1);

			// .. smooth it and make it a valid degree
			if(((pd+step)<(-5)).and({step<0}),{
				d=pd-step;
			},{
				if(((pd+step)>10).and({step>0}),{
					d=pd-step;
				},{
					d=pd+step;
				})

			});
			d=d.round;
			[pd,d];

		});

		// States depending on the current degree
		// State A: note duration vs. out-of-scale: the shorter the note, the higher the probability of playing out of the scale.
		current['dur'].lincurve(0,2,current.detuneProb,current.detuneProb/2,1).coin.if({d=d+[-1,1].choose*0.5});


		// State B: note duration vs. Rest: the longer the note, the higher the probability of playing a rest.
		current['dur'].lincurve(0,2,1,0.1,1).coin.if({d},{Rest()});

	}),
	\callback, 	{|evt| ~pE=evt;
		~pE[\note]=~pE.use(~pE[\note]); ~pE[\midinote]=~pE.use(~pE[\midinote]);
	}.inEnvir,
);

(full code here)


Hofstadter’s point does not quite lead to that conclusion. His example is an updated version of a standard argument logicians make when they argue that you cannot reduce the rules of logic and math to physics or psychology. One of the best known versions is by the German philosopher Edmund Husserl, who presented it in his 1900 book “Logical Investigations,” where he argued that you can build a calculator that gives you wrong results using exactly the same physical laws that a correct one does. His point was that the physical laws the engineers use do not explain how logic works. The same argument applies to psychology (which is what Husserl was worried about) or, in more recent versions, to neuroscience: even if you knew where every single synaptic connection in the human brain goes and the exact nature of every single physico-chemical interaction between any pair of them (quite a tall order!), you still would not be able to determine why 2+2=4 is right and 2+2=5 is wrong. That’s what Hofstadter’s “layers of representation” refers to: physics and logic (or math) are logically different levels that cannot be reduced to each other (philosophers prefer to call them “levels of description,” but Hofstadter is an old-school AI-nik, for whom “description” and “representation” are synonyms).

The distinction between what a computer and a human can do is usually argued along different lines. A standard argument is that computers are syntactic machines: they only know rules, namely those the programmers have written down in their programs. Humans do not always act on the basis of rules, the argument goes, and in fact it would be impossible to translate into rules the rules you follow when you write a rule (those would be a computer language’s instructions), and then the rules you follow when you write the rules you follow when you write the rules (those would be the compiler’s instructions)… You see where this is going. It never stops. Conclusion: humans do not always follow rules, computers do, hence humans can do something computers can’t. Here the standard example is trying to teach a computer how to use a hammer: try to reduce that knowledge to rules and you soon end up having to describe the structure of society at large to explain how a hammer works.

If you think of making music as an activity very similar to a carpenter’s craft, then computers can at best imitate it, like the apprentice in the workshop who watches and copies but will never go beyond what has been done before. So your description of your experiments’ results was quite apt, I think!

…except that in our case the things the computer is instructed to do are made by a musician… so they are not an imitation of music, they are music…

Some thoughts:

One approach would be to let the laptop improvise something and you react to it (kind of the opposite of what you set out to do, I guess). E.g. here’s one possibility for generating eternal chord progressions within constraints you set up yourself: Rise and shine! (meditation/background music with supercollider) - YouTube, with a link to the code in the video description.

Another approach that I’ve been experimenting with is where the laptop listens to MIDI and reacts to it, e.g. by translating every note you play into multiple notes (conceptually a bit like an arpeggiator, but of course you can do much wilder things). Pianist Dan Tepfer got kind of famous with it, and here’s Marc Evenstein doing something similar: Untitled Lunacy - YouTube. I did make some code for such things already, but it’s not online yet. I’ve used it e.g. to play “black MIDI” live in a small performance. A video of what I did can be seen here: Ode to Joy - YouTube (this was recorded very shortly before covid hit the world - kind of remarkable to think about if you understand the Dutch text :slight_smile: )
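
In case it helps, here’s a very minimal sketch of the note-multiplying idea in sclang (not my actual code - the intervals, timing and default-instrument playback are just placeholder choices):

(
MIDIClient.init;
MIDIIn.connectAll;

MIDIdef.noteOn(\multiply, { |vel, num|
	// answer each incoming note with a little broken chord above it
	fork {
		[0, 4, 7, 12].do { |interval|
			(midinote: num + interval, amp: vel / 127 * 0.5, dur: 0.2).play;
			0.15.wait;
		};
	};
});
)

Replacing the fixed interval list with something derived from an analysis of what was just played is where it starts to feel like the Tepfer-style pieces.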

Well, that again depends on what you take music to be. What I said applies to any field, creative or not, and regardless of the profession of the programmer. Traditional AI programs were coded with the assistance of so-called domain experts, who often were the programmers themselves wearing a different hat. Those collaborations, unfortunately, did not make the programs themselves any more successful. But I’m digressing.
I guess the real question is this: take a piece of music that a program like, say, Band-in-a-Box has “composed”, with solos and everything. Is that music?
I think it is an extremely hard question to answer, to be frank. In fact, it would take a book length essay to give it proper treatment.

I think the answer to your hypothetical is pretty straightforward: if you played the Band-in-a-Box “composition” to anyone in the world and asked them whether they were hearing music, I think they would say yes! It may be very bad, terrible music, but it’s surely music…

I’d like you to meet my music teacher, for whom there is “real music” and “just midi”. Guess where she puts Band-in-a-Box pieces. I have the feeling she is not the only person in the world who thinks that way…

Anyways, this thread has gone off topic enough. I’ll shut up now.

I have been working on a somewhat similar system for a couple of years now, mostly using the weighted-probabilities idea. Part of my goal was to have the system be able to play by itself as well as with input, and I think that helped it gain a kind of identity, so it feels like improvising with a person who has their own musical ideas (which I generally like, because I programmed it, but which also surprise me).

Since you seem very interested in input from your guitar playing: if you haven’t tried recording a few short improvisations to run through your system, that could be a good strategy. Then you’d get to concentrate fully on the SC program and not the guitar.