Choosing how to choose with self-normalising weights

@rdd’s point about normalizing outside the loop for efficiency gave me this idea:

what if we have

choose(weights:nil, normalize:true)
and
chooseN(weights) // normalizes the weights, then chooses N elements

(and please let’s not use a wchoosen method - using abbreviations both before and after a word is unlovely!)
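For reference, normalizing once outside the loop can already be written by hand with the existing wchoose and normalizeSum methods. This is only a sketch of what a chooseN might do internally, not a proposed implementation; the item and weight values are arbitrary examples.

```supercollider
(
// Sketch only: normalize the weights once, then reuse them for every draw.
var items = [10, 100, 40, 20];
var rawWeights = [1, 4, 2, 1];                  // example weights, need not sum to 1
var weights = rawWeights.normalizeSum;          // done once, outside the loop
var n = 5;
var picks = { items.wchoose(weights) }.dup(n);  // n independent weighted draws
picks.postln;
)
```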

You are right that double abbreviations are a little unlovely (we do have them a lot though; I was following the pattern of LFNoise0).

I would refrain from using a choose method to generate data structures, because this steepens the path to variants. E.g. the step from {[1, 2, 3].choose}.dup(5) to { rrand(1, 3) }.dup(5) etc. is short and logical, while from [1, 2, 3].chooseN(n:5) to { rrand(1, 3) }.dup(5) requires knowledge of the implementation.

With regard to the change in choose (which I still resist a bit): if we came up with a way to specify many different ways of choosing, then it could be done like this:

[1, 2, 3].choose(how); // where how could be weights, a symbol, a function.

This would be done by dispatching to the “how” argument in a separate method:

choose { |how|
    // chooseWith(how, this) is the same call as how.chooseWith(this)
    if(how.notNil) { ^chooseWith(how, this) };
    ^this.at(this.size.rand)
}

But this would require a full chooseWith interface and I am not sure if this makes more sense than just writing the code you intend.
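To give a feel for what such an interface would involve, here is a minimal sketch of two cases. The chooseWith method is hypothetical and does not exist in the class library. The sketch also assumes that the receiver of choose is a SequenceableCollection and that a function passed as "how" returns the chosen element directly. Class extensions like these would have to live in a .sc file in the Extensions folder and cannot be evaluated interactively.

```supercollider
// Hypothetical class extensions, for illustration only.

+ SequenceableCollection {
    // "how" is a list of weights: normalize, then do a weighted choice
    chooseWith { |coll|
        ^coll.at(this.normalizeSum.windex)
    }
}

+ Function {
    // "how" is a function: let it decide, given the collection
    chooseWith { |coll|
        ^this.value(coll)
    }
}
```

The symbol case would need a third extension, and every further kind of "how" another one, which is the cost mentioned above.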

FWIW, my real preferred option is to keep choose as is and change wchoose so that the normalization happens by default (same with twchoose).
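As a reminder of what that would change: wchoose currently expects weights that already sum to 1, so the caller normalizes explicitly. A small sketch, with arbitrary example values:

```supercollider
(
var items = [1, 2, 3];
var rawWeights = [1, 1, 4];   // example weights that do not sum to 1

// Today: the caller normalizes explicitly.
items.wchoose(rawWeights.normalizeSum).postln;

// Under this option the explicit step would presumably become unnecessary:
// items.wchoose(rawWeights)
)
```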

I would like to take the role of deciding the matter, if that is ok. So much thought has gone into this by all of you, and the fact that there is no universally best way is just part of the beauty of the nature of programming: there is no best way. (I won’t have occasion to discuss this much, because I’ll be mostly offline for a month or so.)

I would find it good if we could have the following in SequenceableCollection:

wchoosen { |weights|
    ^if(weights.isNil) {
        this.choose
    } {
        weights = weights.value(this).keep(this.size).collect { |x| x.value(this) };
        this.at(weights.normalizeSum.windex)
    }
}

// you can do things like:
a = { |i| (i+1).rand };
f = { |c| a.dup(c.size) };
[10, 100, 40, 20].wchoosen(f);
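Without the class extension installed, the same steps can be tried out by hand. This is a sketch that mirrors the method body above; the variable names coll and weights are only for illustration.

```supercollider
(
var coll = [10, 100, 40, 20];
var a = { |i| (i + 1).rand };    // as above: random integer from 0 to i
var f = { |c| a.dup(c.size) };   // as above: one weight per element
var weights = f.value(coll);     // e.g. [0, 1, 0, 3]; differs on each call
weights.postln;

// Same selection step as in the method body. If every weight happens to
// be 0, the sum is 0 and the normalization is not meaningful.
coll.at(weights.normalizeSum.windex).postln;
)
```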

The “n” at the end is for discoverability and autocompletion. The “w” makes clear what the closest relative is. The edit distance is still reasonably small to encourage the use.

Now if someone gives good examples of how one would make sense of such a weight function in all the collections that are not SequenceableCollections, and how that sense could be understood as a weighting, then I would say it could be done for choose. I am not convinced right now, for several reasons, one of them being that for each new choose someone implements, they need to think about how to make the weight function work. So I would prefer to keep the concerns separate and instead pack the new variant with interesting features.
