Choosing how to choose with self-normalising weights

I think the interface in this PR could be improved. Currently, the PR implements the following:

  • If I just want to get a random element, I can use the choose method.
  • If I want to use a custom distribution, I can use wchoose, short for weighted(?) choose. If the array passed is not a valid distribution (all elements >= 0, sum equal to 1.0), the function proceeds without any warning and its result is no longer well defined.
  • If I have an array of values that may not be a probability distribution, but can be transformed into such a distribution, I have to use choose, but now with arguments.

So, depending on what kind of “distribution” I have, I need to pick the right one of two functions and the right arguments, which is something I would have to look up every time.

Let's take a look at the equivalent Python interface, random.choices, which was already mentioned:

random.choices(population, weights=None, *, cum_weights=None, k=1)

This is a single, unified function in which the given weights are relative: they are normalized for you and do not need to sum to 1. Instead of providing several functions whose names you need to know, we can use arguments that the IDE can display, so there is less to memorize. I think that is more user-friendly, especially for beginners.
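To make the "weights are normalized" point concrete, here is a small sketch using only the standard library. The population and the weight values are made up for illustration; the only claim is that random.choices treats weights as relative, so they do not have to sum to 1.

```python
import random
from collections import Counter

population = ["a", "b", "c"]   # illustrative values, not from the PR
raw_weights = [1, 2, 7]        # sums to 10, not to 1.0

# A seeded generator keeps the sketch reproducible.
rng = random.Random(42)
draws = rng.choices(population, weights=raw_weights, k=10_000)

# Relative frequency of each element in the sample.
counts = Counter(draws)
frequencies = {item: counts[item] / len(draws) for item in population}
print(frequencies)
```

The frequencies should come out close to 0.1, 0.2 and 0.7, the same ratios as the raw weights, without the caller having to rescale anything.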

So, why not try to be a bit more radical here with the PR?

  • make choose accept arguments
    // inside SequenceableCollection
    choose { |weights, normalize = true|
      // no weights given: uniform choice, as before
      if(weights.isNil, { ^this.at(this.size.rand) });
      // rescale the weights so that they sum to 1.0
      if(normalize, { weights = weights.normalizeSum });
      ^this.at(weights.windex)
    }
    
    normalize defaults to true because turning it off only makes sense in the edge case of large arrays whose weights are already scaled to sum to 1.0. That is advanced usage for people who know what they are doing with regard to performance.
  • deprecate wchoose, which will simply forward the call to .choose(weights, normalize: false); normalize is false in order to mimic the old behavior.
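The two steps above can be modelled in a few lines of Python, just to pin down the intended semantics. This is a sketch and not SuperCollider code: the function names mirror the proposal, the rng parameter is a hypothetical addition for reproducibility, and the inverse-CDF lookup is only a stand-in for windex, so its handling of weights that do not sum to 1.0 is an assumption of this model.

```python
import random
import warnings
from bisect import bisect
from itertools import accumulate

def choose(items, weights=None, normalize=True, rng=random):
    """Python model of the proposed unified choose (hypothetical)."""
    if weights is None:
        # No weights: plain uniform choice.
        return items[rng.randrange(len(items))]
    if normalize:
        # Rescale so that the weights sum to 1.0.
        total = sum(weights)
        weights = [w / total for w in weights]
    # Stand-in for windex: lookup that assumes the weights sum to 1.0.
    cumulative = list(accumulate(weights))
    index = bisect(cumulative, rng.random())
    # Clamp in case rounding or a too-small sum pushes the index past the end.
    return items[min(index, len(items) - 1)]

def wchoose(items, weights, rng=random):
    """Deprecated wrapper: forwards to choose without normalizing."""
    warnings.warn(
        "wchoose is deprecated; use choose(weights, normalize=False)",
        DeprecationWarning,
        stacklevel=2,
    )
    return choose(items, weights, normalize=False, rng=rng)
```

In this model, calling choose(list(range(11)), weights=list(range(11)), normalize=False) always returns 1, because the first non-zero weight already covers the whole [0, 1) range. The actual behavior of windex with such input may differ, but it shows why normalize: true is the safer default.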

IMO this makes the code much more readable, and one does not need to know the differences between choose/wchoose/wchooseN:

(0..10).choose;

// looks "wrong", but IDE displays normalize: true as default argument
(0..10).choose(weights: (0..10));

// this looks strange in written code - could be a bug?
(0..10).choose(weights: (0..10), normalize: false);

// "advanced" usage - we are sure that the weights are a distribution
(0..10).choose(weights: (0..10)/(0..10).sum, normalize: false);