Wiki

Clone wiki

Weighted Item Randomizer for C# / Getting Started

Adding the Weighted Item Randomizer to your project

  1. Add Weighted Randomizer/Weighted Randomizer.csproj to your solution.

  2. Reference the Weighted Randomizer project from any projects you want to use it from.

  3. Add using Weighted_Randomizer to the top of any files you want to use it from.

Using the Weighted Item Randomizer

The main interface is IWeightedRandomizer. This is what it looks like (without the documentation comments):

public interface IWeightedRandomizer<TKey> : ICollection<TKey>
{
    long TotalWeight { get; }
    TKey NextWithReplacement();
    TKey NextWithRemoval();
    void Add(TKey key, int weight);
    int this[TKey key] { get; set; }
    int GetWeight(TKey key);
    void SetWeight(TKey key, int weight);
}

The primary methods of importance are Add(TKey key, int weight), NextWithReplacement() and NextWithRemoval().

  • Add(TKey key, int weight) adds an item to the list with the given weight. weight must be a positive integer. Larger weights are more likely to be chosen. An item cannot be added to the list twice.
  • NextWithReplacement() selects a random item from the list, and puts it back so that it can be chosen again.
  • NextWithRemoval() selects a random item from the list and removes it, so it cannot be chosen again.

Here's a short example

IWeightedRandomizer<string> randomizer = new DynamicWeightedRandomizer<string>();
randomizer.Add("Joe", 1);
randomizer.Add("Ryan", 2);
randomizer.Add("Jason", 2);

string randomPerson = randomizer.NextWithReplacement();
//At this point, randomPerson has a 1/5 = 20% chance of being "Joe", 2/5 = 40% chance of being "Ryan",
//and 2/5 = 40% chance of being "Jason".
//Whichever one is chosen is still in the list, so if NextWithReplacement() is called again, the odds are
//still the same of getting any one of the three names

You may use [] as a syntax-shortcut to get/add/update weights.

randomizer["Joe"] = 1;

DynamicWeightedRandomizer vs. StaticWeightedRandomizer

This library includes two implementations of IWeightedRandomizer: DynamicWeightedRandomizer and StaticWeightedRandomizer.

  • DynamicWeightedRandomizer is the more generally useful of the two. It is based on a tree-structure, so all operations - Add(), Contains(), Remove(), NextWithRemoval(), NextWithReplacement() etc. - are fairly fast (O(log n)). If you are unsure which implementation to use, DynamicWeightedRandomizer is the safer choice.
    Because it is a tree-structure, DynamicWeightedRandomizer requires its generic type to implement IComparable<T>.
  • StaticWeightedRandomizer constructs a probability table (using the walker alias method) that allows for extremely fast (O(1)) calls to NextWithReplacement(). However, anytime NextWithReplacement() is called after the probabilities have changed (including adding/removing items, or calling NextWithRemoval()), it must reconstruct the table. Thus, it is only useful in the specific case that your probabilities do not change once you've added all the items to the list.

Here is a speed comparison of the two implementations on my machine. Add()x10000 + NextWithReplacement()x10 means that Add() is called 10000 times, followed by NextWithReplacement() called 10 times.

         WeightedRandomizer Benchmarks                  |  Dynamic   |   Static
-----------------------------------------------------------------------------------
Add()x10000 + NextWithReplacement()x10:                 |    4 ms    |      2 ms
Add()x10000 + NextWithReplacement()x10000:              |    7 ms    |      4 ms
Add()x10000 + NextWithReplacement()x100000:             |   35 ms    |     28 ms
( Add() + NextWithReplacement() )x10000 (interleaved)   |    8 ms    |   5403 ms
Add()x10000 + NextWithRemoval()x10000:                  |   10 ms    |   5948 ms

So for the special case of static (non-changing) probabilities, StaticWeightedRandomizer is about 50-100% faster. But in the more dynamic cases, DynamicWeightedRandomizer is several orders of magnitude faster.


Is WeightedRandomizer thread-safe?

No. But, there shouldn't be a problem if you serialize access to it.

Are floating-point weights supported?

No. This was a conscious decision to avoid aggregated floating-point errors. Fortunately, probabilities are usually rational, so you can simply multiply by the least-common-denominator to get exact integer weights. In the rarer case that your probabilities are arbitrary floating-points in the range [0,1], you can still get a very close estimate by multiplying them by Int32.MaxValue.

Updated