New issue: Ban contributions that make use of proprietary language models/code analysis

This has just been posted, I think it is a good idea:

What do you think?

1 Like

No idea about the legal stuff, nor have I decided how I feel about using/stealing work to train a model…

…that being said, all this is irrelevant if it does a shit job…

Here is some output if anyone wants to see it in action.

SinOsc

Input:

generate a supercollider help document for the class SinOsc

Output:

SinOsc Class Help

Description

The SinOsc class is a unit generator in the SuperCollider audio synthesis language. It generates a sine wave oscillator that can be used for various audio synthesis applications.

Syntax

SinOsc.ar(freq: 440, phase: 0, mul: 1, add: 0)
SinOsc.kr(freq: 440, phase: 0, mul: 1, add: 0)

Class Methods

SinOsc.ar(freq: 440, phase: 0, mul: 1, add: 0)

  • freq (default: 440) - The frequency of the sine wave oscillator in Hertz.
  • phase (default: 0) - The phase of the oscillator in radians.
  • mul (default: 1) - The multiplication factor for the output.
  • add (default: 0) - The addition factor for the output.
  • Returns an audio rate sine wave oscillator.

SinOsc.kr(freq: 440, phase: 0, mul: 1, add: 0)

  • freq (default: 440) - The frequency of the sine wave oscillator in Hertz.
  • phase (default: 0) - The phase of the oscillator in radians.
  • mul (default: 1) - The multiplication factor for the output.
  • add (default: 0) - The addition factor for the output.
  • Returns a control rate sine wave oscillator.

Instance Methods

none

Example Usage

Audio-rate SinOsc

// Generates an audio-rate sine wave oscillator with a frequency of 440Hz
// and an amplitude of 0.5, and adds 0.1 to the resulting signal.
{
    var osc = SinOsc.ar(440, 0, 0.5, 0.1);
    Out.ar(0, osc);
}.play;

Control-rate SinOsc

// Generates a control-rate sine wave oscillator with a frequency of 1Hz
// and multiplies the resulting signal by a control-rate sine wave oscillator
// with a frequency of 0.2Hz.
{
    var freq = SinOsc.kr(0.2, 0, 0.5, 0.5);
    var osc = SinOsc.kr(1) * freq;
    Out.ar(0, osc);
}.play;

Additional Information

The SinOsc class is a fundamental unit generator in SuperCollider and is commonly used for generating pure sine wave tones. It can be combined with other unit generators and synthesis techniques to create complex audio synthesis patches and musical compositions. For more information and examples, refer to the SuperCollider documentation and community resources.

GrainBuf

Input

generate a supercollider help document for the class GrainBuf

Output

SinOsc Class Help

Description

The SinOsc class is a unit generator in the SuperCollider audio synthesis language. It generates a sine wave oscillator that can be used for various audio synthesis applications.

Syntax

SinOsc.ar(freq: 440, phase: 0, mul: 1, add: 0)
SinOsc.kr(freq: 440, phase: 0, mul: 1, add: 0)

Class Methods

SinOsc.ar(freq: 440, phase: 0, mul: 1, add: 0)

  • freq (default: 440) - The frequency of the sine wave oscillator in Hertz.
  • phase (default: 0) - The phase of the oscillator in radians.
  • mul (default: 1) - The multiplication factor for the output.
  • add (default: 0) - The addition factor for the output.
  • Returns an audio rate sine wave oscillator.

SinOsc.kr(freq: 440, phase: 0, mul: 1, add: 0)

  • freq (default: 440) - The frequency of the sine wave oscillator in Hertz.
  • phase (default: 0) - The phase of the oscillator in radians.
  • mul (default: 1) - The multiplication factor for the output.
  • add (default: 0) - The addition factor for the output.
  • Returns a control rate sine wave oscillator.

Instance Methods

none

Example Usage

Audio-rate SinOsc

// Generates an audio-rate sine wave oscillator with a frequency of 440Hz
// and an amplitude of 0.5, and adds 0.1 to the resulting signal.
{
    var osc = SinOsc.ar(440, 0, 0.5, 0.1);
    Out.ar(0, osc);
}.play;

Control-rate SinOsc

// Generates a control-rate sine wave oscillator with a frequency of 1Hz
// and multiplies the resulting signal by a control-rate sine wave oscillator
// with a frequency of 0.2Hz.
{
    var freq = SinOsc.kr(0.2, 0, 0.5, 0.5);
    var osc = SinOsc.kr(1) * freq;
    Out.ar(0, osc);
}.play;

Additional Information

The SinOsc class is a fundamental unit generator in SuperCollider and is commonly used for generating pure sine wave tones. It can be combined with other unit generators and synthesis techniques to create complex audio synthesis patches and musical compositions. For more information and examples, refer to the SuperCollider documentation and community resources.

Here are two I made up

MySuperAwesomeUgen

Input

generate a supercollider help document for the class MySuperAwesomeUgen

Output

MySuperAwesomeUgen Class Help

Description

The MySuperAwesomeUgen class is a custom unit generator in SuperCollider that provides unique and powerful audio synthesis capabilities. It is designed to offer a wide range of functionality and flexibility for creating innovative sounds and musical compositions.

Syntax

MySuperAwesomeUgen.ar(arg1, arg2, ..., argN)
MySuperAwesomeUgen.kr(arg1, arg2, ..., argN)

Class Methods

MySuperAwesomeUgen.ar(arg1, arg2, …, argN)

  • arg1 to argN - The input arguments required by the MySuperAwesomeUgen unit generator.
  • Returns an audio rate output.

MySuperAwesomeUgen.kr(arg1, arg2, …, argN)

  • arg1 to argN - The input arguments required by the MySuperAwesomeUgen unit generator.
  • Returns a control rate output.

Instance Methods

none

Example Usage

Audio-rate MySuperAwesomeUgen

// Generates an audio-rate output using MySuperAwesomeUgen with two input arguments.
{
    var input = SinOsc.ar(440);
    var output = MySuperAwesomeUgen.ar(input, 0.5);
    Out.ar(0, output);
}.play;

Control-rate MySuperAwesomeUgen

// Generates a control-rate output using MySuperAwesomeUgen with three input arguments.
{
    var input1 = LFSaw.kr(0.5);
    var input2 = LFNoise0.kr(1);
    var output = MySuperAwesomeUgen.kr(input1, input2, 0.2);
    Out.ar(0, output);
}.play;

Additional Information

The MySuperAwesomeUgen class is a custom unit generator that extends the capabilities of SuperCollider. It can be used to implement unique synthesis algorithms, audio effects, and signal processing techniques. For detailed information on the specific input arguments and behavior of MySuperAwesomeUgen, consult the documentation or relevant resources related to the specific implementation of the class.

DirtBitCrushingLPF

DirtyBitCrushingLPF Class Help

Description

The DirtyBitCrushingLPF class is a unit generator in SuperCollider that combines bit crushing and low-pass filtering to create a dirty and lo-fi audio effect. It is useful for adding a gritty and vintage character to audio signals.

Syntax

DirtyBitCrushingLPF.ar(input, bitDepth: 8, sampleRateReduction: 1, cutoffFreq: 2000, resonance: 0.5, mul: 1, add: 0)
DirtyBitCrushingLPF.kr(input, bitDepth: 8, sampleRateReduction: 1, cutoffFreq: 2000, resonance: 0.5, mul: 1, add: 0)

Class Methods

DirtyBitCrushingLPF.ar(input, bitDepth: 8, sampleRateReduction: 1, cutoffFreq: 2000, resonance: 0.5, mul: 1, add: 0)

  • input - The audio input to be processed.
  • bitDepth (default: 8) - The number of bits used for quantization. Lower values create a more degraded and lo-fi sound.
  • sampleRateReduction (default: 1) - The degree of sample rate reduction. Values greater than 1 decrease the sample rate, resulting in a more aliasing and distorted sound.
  • cutoffFreq (default: 2000) - The cutoff frequency of the low-pass filter in Hertz. Higher values allow more high-frequency content.
  • resonance (default: 0.5) - The resonance or emphasis of the low-pass filter. Values greater than 0 create a more pronounced resonance.
  • mul (default: 1) - The multiplication factor for the output.
  • add (default: 0) - The addition factor for the output.
  • Returns an audio rate output.

DirtyBitCrushingLPF.kr(input, bitDepth: 8, sampleRateReduction: 1, cutoffFreq: 2000, resonance: 0.5, mul: 1, add: 0)

  • input - The control input to be processed.
  • bitDepth (default: 8) - The number of bits used for quantization. Lower values create a more degraded and lo-fi sound.
  • sampleRateReduction (default: 1) - The degree of sample rate reduction. Values greater than 1 decrease the sample rate, resulting in a more aliasing and distorted sound.
  • cutoffFreq (default: 2000) - The cutoff frequency of the low-pass filter in Hertz. Higher values allow more high-frequency content.
  • resonance (default: 0.5) - The resonance or emphasis of the low-pass filter. Values greater than 0 create a more pronounced resonance.
  • mul (default: 1) - The multiplication factor for the output.
  • add (default: 0) - The addition factor for the output.
  • Returns a control rate output.

Instance Methods

none

Example Usage

Audio-rate DirtyBitCrushingLPF

// Applies the DirtyBitCrushingLPF effect to an audio input
{
    var input = Mix.ar(Saw.ar([440, 660], mul: 0.5));
    var output = DirtyBitCrushingLPF.ar(input, bitDepth: 4, sampleRateReduction: 0.5, cutoffFreq: 4000, resonance: 0.8);
    Out.ar(0, output);
}.play;

Control-rate DirtyBitCrushingLPF

// Applies the DirtyBitCrushingLPF effect to a control input
{
    var input = LFO.kr(0.5).range(-1, 1);
    var output = DirtyBitCrushingLPF.kr(input, bitDepth: 6, sampleRateReduction: 2, cutoffFreq: 3000, resonance: 0.4);
    Out.ar(0, output);
}.play;

Additional Information

The DirtyBitCrushingLPF class combines bit crushing and low-pass filtering to create a distinct lo-fi effect. Experiment with different parameter values to achieve the desired level of distortion, degradation, and resonance. For more information and examples, refer to the SuperCollider documentation and community resources.

Anyway, it’s pretty convincing in some places, but figuring out where it is wrong is actually quite hard, because it is consistently wrong. Meaning, once it’s made up its mind, it is set in its ways. One example would be the arguments for GrainBuf, which are wrong but look convincing.
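
To illustrate how easy such mistakes are to miss: the generated "control-rate" examples above pass a `.kr` signal to `Out.ar`, which, as far as I know, fails when the synth is built because `Out.ar` expects audio-rate input. A minimal sketch of what a working version might look like, assuming the intent was to use the slow oscillator as a modulator (the 300 to 600 Hz range and the 0.2 amplitude are my own illustrative choices, not taken from the generated text):

```supercollider
// Assumes the server is already booted (e.g. via s.boot).
(
{
    // Control-rate LFO at 0.2 Hz, mapped from -1..1 to 300..600 Hz (illustrative values)
    var freq = SinOsc.kr(0.2).range(300, 600);
    // Audio-rate oscillator driven by the LFO; mul of 0.2 keeps the level modest
    var osc = SinOsc.ar(freq, 0, 0.2);
    // Out.ar now receives an audio-rate signal
    Out.ar(0, osc);
}.play;
)
```

The generated text reads fluently either way, which is the point: the rate mismatch only shows up if you actually run the example.
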
I do think the author is actually talking about rephrasing existing documentation rather than generating whole new documentation. Sometimes it does a good job, sometimes not — the commit is a good example of the latter, but I think the text was actually pretty good before it was edited, so what more could the AI have done?

In the future it might actually be useful as a writing aid, and for SuperCollider to ban the use of that would be a mistake; it might be akin to banning a spell checker.

If, however, users/contributors are submitting AI-generated content that is clearly and consistently wrong (i.e., one can tell it is generated), then they should be banned (if they refuse to stop). This is mostly because it creates more work, makes SuperCollider harder to learn, and confuses online resources, not because it inherently uses AI.

1 Like

This could easily become a very long discussion… as a contributor and a user, my feeling is that documentation should be mindful rather than mindless.

With a large codebase and many contributors, there are of course some issues with consistency in both implementation and documentation. I.e., there isn’t a single editor whose job is to maintain consistent style. (Having Code style guidelines is a great start, and has made a good improvement!!)

I could see using some sort of LLM as an automatic proofreader, tagging inconsistencies in the documentation. BUT these would then need further review by a human user / editor.

2 Likes

Two reasons why I would like to see some sort of limitation:

  • it creates a new source of errors that is extremely hard to track
  • I wrote open source code and documentation not for this purpose, and I wouldn’t have, had I known.
2 Likes

I agree with this for ideological reasons, at least to some extent. It’s extremely easy to see that our code, our discussions, our content have already been incorporated into big closed-source language models. If you start asking deep questions about SuperCollider, it’s pretty easy to find things that feel intimately related to even conversations from this forum.

The word “ban” here reads a little bit overly harsh or maybe performative in this context? This wouldn’t be enforceable now - and in another 6-12 months even less so. For example, copilot’s auto-completion produces generally quite valid and useful code, and I doubt if anyone could tell whether it’s being used from a pull request. We have a very rigorous code review process for an open source project, and I doubt that use of language models would somehow cause bad code to slip through in some unique and unforeseen way. Words like “ban” feel punitive towards (potential-) contributors when the real enemy here is corporate pillaging of public community resources, something that’s far away from the pragmatic choices that an individual contributor makes about their tools.

This feels more appropriate as a request / position statement maybe? E.g. "Closed source language models are problematic for these reasons. We as an open source project are antagonistic to them and encourage you to avoid using them. Here are some open-source, ethical alternatives: ________. Pull requests with blatant auto-generated code will be closed immediately."

I have some skepticism about even this, though - I would be in favor of joining e.g. an organized, collective and public boycott - but purely in a tactical sense, a policy that’s local to our community or codebase and isn’t a part of a broader organization of solidarity around this topic will have little to no impact on the broader culture and could be actively detrimental to getting good contributions to our code (which is already hard enough as it is). This might be an overly cynical and pragmatic take :) - still thinking this through, and I’m happy to be argued in another direction on this.

I guess my tldr is: if we can find an organized network of other OS projects that have a stance or set of actions on this topic, let’s join it - but I think I’d prefer to not manifest this legitimate battle as unenforceable rules or punitive vibes directed at people who have the energy to put a PR up for review. (I think there will be - and have already been - frivolous uses of this tech to make useless PRs - this is a separate issue, I think we can shut those down in the same way that we shut down any PR that doesn’t pass muster).

7 Likes

FWIW: the “usefulness” of the tech doesn’t make it ethical, and I think it’s important to firmly separate these considerations - but if the conversation is purely one about quality… A few weekends ago, I translated a small/middle-sized JavaScript audio project into a C++ based SuperCollider UGen. I used ChatGPT to translate almost all of the JS implementation code into C++20 equivalents. This was a heavily human-guided project and took a lot of manual intervention, but I was able to get this project to probably 80% completion in about 3 hours, when doing it purely by hand would have taken at least 20 hours (pretty sure of this because I STARTED doing it by hand before abandoning it, and it was slow going…). I read the generated code carefully - IIRC it made no more than one or two mistakes for the whole codebase, which is probably better than I would have done myself. In this case, the value of the automated tool was extremely high - to the point that I simply wouldn’t have continued the project at ALL without it, and definitely would have ended up with worse code (since I was able to put my energy into making it well-factored and combing for errors, rather than re-typing thousands of lines of boilerplate).

2 Likes

I think the discussion should not be about the quality of such contributions. I don’t have a doubt that these systems will become better and better over time and will change everyday programming and while this has some serious implications for our society it is also out of the scope for this discussion.

The post was made because of the exploitation of open source projects by companies through proprietary LLMs. These exploitations used to have some legal barriers, like a license (which often enough gets ignored), but due to the black box nature of ANNs the companies ignore this license and its obligations and just do as they please. The question is how the SC dev community should position itself towards this offense and my proposition is to condemn and ban such tools as long as the underlying system is not fully FOSS.

I like the idea by @scztt of forming an alliance, together with a statement strongly discouraging these closed technologies. Of course it is impossible to truly verify whether a PR was made with the help of an LLM or not (let alone distinguish between a FOSS and a proprietary LLM).
Maybe it is possible to contact the FSF, as they are probably the biggest institution in this regard and it would fit their agenda as well? I am baffled that they have not issued an updated license or something similar that addresses these issues.

1 Like

A side remark on this (known, but worth repeating):

I think that this fact indicates that systems like auto-pilot currently break the GPL, because they apply modified open source code in order to function. Arguably, the code is not simply data; its syntactic structure is an integral part of the operational model, and it is modified to fit needs. Our license is probably unenforceable currently, but our community should refrain from deliberately committing content into those systems without permission from the authors.

1 Like

I cannot agree more with this. Tools are nothing without a community of practice, which is fostered by mindful entry points and eventually mutual support.

but I also agree with this:

As a tool for convergent coding, these can be very useful for the error-prone boring bits.

now let’s not open that can of worms :)

Anyway, as you can see, as much as I am allergic to what @dscheiba was describing, if the SC codebase cannot benefit from a lighter, stronger maintenance tool for the code and its expansions, I am worried about its future, when these tools are mature enough (and unpoliceable anyway) to be everywhere, including in the compilers we use, etc.

I’m not (yet) good enough to help with the said codebase, but these struggles are real. So the question is an interesting one, and I wonder how much bigger F(l)OSS projects are making a stand.

(sorry for the morning rambling)

1 Like

The original PR is for updating documentation. I’m all for it. It will be vetted by those in the know, and possibly all the nooks and crannies that aren’t reachable from an entry point like the tutorial might be linked to, though, I’d be surprised if an LLM can do an extensive coverage of the documentation.

I think you got right to the point by separating the issue of technology (LLM and similar tools, that are out there already) from the issue of freedom.

It reminds me of something: Engels recorded after Marx’s death that he was deeply moved by every technological advance.

I haven’t seen a solid statement by the FSF about it, or did I miss it?

1 Like

Here is something interesting happening at Stack Exchange:
The moderators are going on strike because the company is forcing the forum to permit generated posts.

Here is the code of conduct referenced: Inauthentic usage policy - Meta Stack Exchange

4 Likes

Also there is a class action lawsuit concerning copilot: https://githubcopilotlitigation.com/

Something interesting here:

1 Like

Publication of the FSF-funded white papers on questions around Copilot

https://www.fsf.org/news/publication-of-the-fsf-funded-white-papers-on-questions-around-copilot

I think the problem is not the technology, but the privatization of commons, which is not a new thing.

A free software public archive would be very positive.

3 Likes