One more thing I just noticed: there are many classes which implement choose but where weights wouldn’t make any sense, e.g. Dictionary and Set. While we have such variance in arguments in a few places (… plot!), it is at least nicer to have a uniform interface.
For reasons like those, I have the feeling it would make sense not to change choose. I don’t have a strong opinion on syntax change, but consistency seems important.
Python was mentioned as an example, but numpy offers a preprocessed option (the weights must sum to 1.0), and that is the one to use for performance-critical tasks and large lists.
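To make the Python comparison concrete, here is a small stdlib-only sketch (numpy itself is only described in a comment, so the snippet has no external dependency) contrasting the two approaches:

```python
import random

items = ["a", "b", "c"]
weights = [2, 1, 1]  # arbitrary non-negative weights, sum = 4

# The standard library's random.choices normalizes internally,
# so no preprocessing is needed:
pick = random.choices(items, weights=weights, k=1)[0]

# numpy.random.choice, by contrast, expects its p argument to sum
# to 1, so the weights are normalized once up front (the
# "preprocessed option" mentioned above):
total = sum(weights)
p = [w / total for w in weights]
```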
This is a case for polymorphism! The IDE will not offer us weights where they are not needed…
As a sidenote, I do wish that Prand were called Pchoose and Pwhite were called Prand (though that would be a breaking change, sadly!). In any case, I think it might not be a bad idea to do the same thing in these cases as well. There are far too many pattern variants with acronyms that are really impossible to remember.
Pros
- addresses the primary issue raised by the PR author (a way to normalize weights)
- relatedly: choose, perhaps the more commonly used/remembered, defaults to the “expected” behavior of normalizing by default
- unifies the interface between choose, wchoose, and TWChoose (and the t/wchoose defaults match)
- both methods allow for optimization by the user
- backwards compatible: nothing breaks, including old habits (wchoose)
- fits the quick iterative workflow outlined by @semiquaver (common use)
- leaves room for future optimization without disrupting the interface (as alluded to by @semiquaver and @smoge’s ideating)

Cons
- the curious user wonders why the redundant functionality and mismatched defaults (so the docs explain the legacy behavior and the logic of pre-normalization, and accompanying examples show a simple illustrative benchmark)
Missing anything?
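The behavior described in the pros above can be sketched in Python (the function name and signature are hypothetical, mirroring the proposed SC interface, not an existing API):

```python
import random

def wchoose(items, weights, normalize=True):
    """Hypothetical sketch of the proposed interface: weighted choice
    that normalizes by default, with an opt-out for pre-normalized
    weights in performance-critical code."""
    if normalize:
        total = sum(weights)
        weights = [w / total for w in weights]
    # inverse-CDF sampling over the (now normalized) weights
    r = random.random()
    acc = 0.0
    for item, w in zip(items, weights):
        acc += w
        if r < acc:
            return item
    return items[-1]  # guard against floating-point round-off
```

A quick call like `wchoose(["a", "b", "c"], [2, 1, 1])` then "just works", while `normalize=False` lets a time-critical loop skip the summation pass.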
A separate initiative/PR could introduce this new interface where it makes sense. A weighted distribution is very much “at home” in collections, so it seems unlikely there would be objections to the new feature being introduced here and not elsewhere (yet).
I see no way to implement a weighted choice on a collection that offers no way to access a particular item with at. In a Dictionary it would be possible, however, by passing a dictionary of weights.
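The Dictionary idea could look something like this Python sketch (the helper name `dict_wchoose` and the parallel-weights convention are illustrative assumptions, not an existing API):

```python
import random

def dict_wchoose(d, weights):
    """Hypothetical sketch: weighted choice over a dictionary's values,
    with the weights supplied as a second dictionary keyed the same way."""
    keys = list(d.keys())
    ws = [weights[k] for k in keys]  # align weights to the key order
    k = random.choices(keys, weights=ws, k=1)[0]
    return d[k]
```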
Your pros and cons make me think even more that it would be better to do the following:
wchoose(weights)
wchoosen(weights)
No flag; just add a single letter if you come from wchoose.
If you come from choose, you add one letter at the beginning and you’ll be prompted with two possibilities from which you can quickly choose. Not much to explain, easy to remember.
It smoothly integrates into habits, separates concerns, is idiomatic, and breaks nothing.
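For comparison with the flag-based sketch, the two-method split could look like this in Python (both names mirror the proposal and are hypothetical, not an existing API):

```python
import random

def wchoose(items, weights):
    """Normalizing variant: accepts any non-negative weights."""
    return random.choices(items, weights=weights, k=1)[0]

def wchoosen(items, normalized_weights):
    """Hypothetical pre-normalized variant: assumes the weights already
    sum to 1, skipping the normalization pass in time-critical loops."""
    r = random.random()
    acc = 0.0
    for item, w in zip(items, normalized_weights):
        acc += w
        if r < acc:
            return item
    return items[-1]  # guard against floating-point round-off
```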
I agree this solution shares many of the advantages. At this point it seems to come down to a matter of precedent and personal preference. The mirrored-interface option you proposed was appealing for consolidating rather than proliferating methods/functionality. That said, wchoosen has been my “runner up” choice, so I’d personally be ok with it.
Are we at a point where it makes sense to take “votes”?
The interface for TWChoose is a bit strange though, it takes an integer for normalize where it should really take a boolean value… I also think that the default should be true, for the reason @semiquaver mentioned of rapid iteration. Because of that, I also think that it should be, as @mike wrote:
A mirrored wchoose like wchoose(weights, normalize=false) would make sense I think… I am indifferent to that.
And with the previous point in mind, I actually think TChoose could be extended to allow for weights, like this: TChoose.ar(trig, array, weights=nil, normalize=true).
Why a single method instead of two? For me, it’s about conciseness: a single method like this requires less explanation than two separate methods. I am writing notes while learning SuperCollider and I really value these kinds of things. Also, the n is not that apparent: is that an n for normalize (as in: perform the operation) or for normalized (as in: the array is already normalized, so no normalization is needed)?
strongly prefer the mirrored choice over wchoosen!
We need to keep in mind discoverability for new users! Once a user has seen choose they will see an option to add weights, and the flag as well; perhaps they will use these in the future. They will only learn about the other options, wchoose and wchoosen, by slogging through the already overlong list of methods for Array and SequenceableCollection.
Then there is cognitive load: there are so many methods/objects differing by a letter or two (Pbindf vs Pbindef: I couldn’t tell you now which of these does which) that drive us to consult the docs…
I think when it comes to this kind of situation re: syntax, there will never be consensus.
That’s why the Haskell committee, in the early days, appointed one person per year to have the final word on syntax impasses. Otherwise, they would never agree on all the small details.
This is just because you are not yet used to the way things are on the UGen side. There are no boolean values for UGens, which is why audio signals and numbers are interpreted as flags. This is the same everywhere.
As far as I am concerned, I am not so much in favour of mixing unweighted and weighted choice, because they are conceptually different: for choice you don’t have to be able to index the elements.
I think everyone on the thread has been willing at one point or another to accept @julian’s suggestion of the mirrored option. Some (myself and OP) would prefer fewer methods, some more (@julian), and some are happiest with this option (@mike). Perhaps this is a consensus?
Regarding mixing weights and choice, as Eric pointed out this is the interface of TWChoose.
choose
wchoose(weights) // accepts any list (or maybe positive numbers to avoid ugly errors)
wchooseN(normalizedWeights) // only normalizedSum arrays
Because you don’t need wchooseN unless you’re engaged in a performance-critical operation and need to preprocess all arrays beforehand. So one more letter is not that bad compared to normalizing the arrays ahead of the time-critical moment.
The time to perform .normalizeSum once on a small, human-readable array (less than a thousand numbers, for example) is not perceptible; on a modern machine it would be on the order of microseconds in the worst scenario.
To degrade performance, one would need to use non-human-writable arrays (not a list you would type at the keyboard) and use them in a loop, in a granular-synthesis-scheduling kind of situation. Those are not common cases.
That’s the use case of numpy, for example (huge lists, heavy operations, etc.). That’s why numpy requires a preprocessed, normalized list, but the standard Python library doesn’t, since there it does not make a difference. It’s just not a significant difference.
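The microsecond claim is easy to check rather than guess at; here is a small Python sketch that times one normalization pass over a human-typed list (the `normalize` helper stands in for .normalizeSum, and actual numbers will vary by machine, so none are asserted):

```python
import timeit

weights = [3, 1, 4, 1, 5, 9, 2, 6]  # a small, human-typed list

def normalize(ws):
    """Stand-in for .normalizeSum: scale so the weights sum to 1."""
    total = sum(ws)
    return [w / total for w in ws]

# Average one normalization pass over many repetitions:
per_call = timeit.timeit(lambda: normalize(weights), number=100_000) / 100_000
print(f"~{per_call * 1e6:.2f} microseconds per normalization pass")
```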