Defining a Probability Distribution

CMPy provides several types of probability distributions, depending on your needs.

Standard Distributions

This is the standard probability distribution. It can represent a single random variable:

>>> dist = Distribution({'0': 0.5, '1': 0.5})

Or it can be a joint distribution:

>>> dist = Distribution({'000': 0.25, '011': 0.25, '101': 0.25, '110': 0.25})

where the event '000' corresponds to {x_1 = 0, x_2 = 0, x_3 = 0}.

Warning

When creating a joint distribution such as this, each joint event must have the same length.
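Because every joint event has the same length, a marginal over any single position is well defined: sum the probabilities of all events that agree at that position. CMPy presumably has its own method for this; the following is a plain-Python sketch (not the CMPy API) of what marginalization means for the joint distribution above:

```python
from collections import defaultdict

# Joint distribution over (x1, x2, x3), as in the example above.
joint = {'000': 0.25, '011': 0.25, '101': 0.25, '110': 0.25}

def marginal(dist, position):
    """Sum out all variables except the one at `position`."""
    result = defaultdict(float)
    for event, prob in dist.items():
        result[event[position]] += prob
    return dict(result)

print(marginal(joint, 0))  # {'0': 0.5, '1': 0.5}
```

Each of the three marginals of this particular distribution is uniform over {0, 1}, even though the joint distribution is far from uniform over the eight possible triples.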

If you wish to create a distribution where individual events are compound objects (e.g. a string rather than a character) you must specify that the distribution is not a joint distribution by supplying the joint keyword:

>>> dist = Distribution({'alpha': 0.5, 'beta': 0.5}, joint=False)

You may also create distributions with arbitrary event labels if all you care about are the probabilities:

>>> die = Distribution([1/6]*6)
>>> die
Distribution:
{0: 0.16666666666666666, 1: 0.16666666666666666, 2: 0.16666666666666666,
 3: 0.16666666666666666, 4: 0.16666666666666666, 5: 0.16666666666666666}

Log Distributions

A log distribution is useful when your distribution is likely to contain very small probabilities, so small that floating-point precision may be inadequate. In that case it is beneficial to store the log of the probability rather than the probability itself. As a contrived example:

>>> dist = LogDistribution({'A': -1, 'B': -2, 'C': -3, 'D': -3})
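The entries above can be read as base-2 log probabilities, since 2^-1 + 2^-2 + 2^-3 + 2^-3 = 1. To see why log-space storage helps, multiply many small probabilities directly: the product underflows to zero, while the equivalent sum of logs stays representable. A plain-Python illustration (not using CMPy):

```python
import math

# A probability small enough that repeated multiplication underflows.
p = 1e-5
n = 100

direct = p ** n            # 1e-500 is below the smallest positive float
log_sum = n * math.log(p)  # the same quantity, stored in log space

print(direct)    # 0.0
print(log_sum)   # about -1151.29
```

Once a probability has underflowed to 0.0, no amount of later arithmetic can recover it; the log-space value loses nothing.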

Symbolic Distributions

If you have SymPy installed, you can create symbolic distributions representing families or classes of distributions. For example, the family of biased coins is simply:

>>> p = Symbol('p')
>>> dist = SymbolicDistribution({'H': p, 'T': 1-p})
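A symbolic distribution lets you manipulate quantities like entropy as expressions in p rather than as numbers. The following sketch works the binary entropy of the biased coin out in plain SymPy (this is not a CMPy method, just an illustration of what symbolic probabilities buy you):

```python
import sympy as sp

# Symbolic entropy of the biased coin {'H': p, 'T': 1 - p}.
p = sp.Symbol('p', positive=True)
H = -p * sp.log(p, 2) - (1 - p) * sp.log(1 - p, 2)

# Evaluate the whole family at a particular bias.
fair = H.subs(p, sp.Rational(1, 2))
print(sp.simplify(fair))  # 1
```

A fair coin attains the maximum of one bit; substituting any other value of p into the same expression gives the entropy of that member of the family.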

Generalized Distributions

Sometimes you may want a distribution where event probabilities can be outside [0, 1] – for that there are generalized distributions:

>>> d = GeneralizedDistribution({'A': -0.5, 'B': 1.5})

Note

Since there is no way to define the entropy of a generalized distribution, none of the information-theoretic methods are available.
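To make the note concrete: entropy needs log p for every event, and the log of a negative weight is not a real number. Normalization and expectations, by contrast, still make sense. A plain-Python sketch (not the CMPy API) using the weights from the example above:

```python
# A signed "quasi-distribution": weights may lie outside [0, 1],
# but here they still sum to 1, so expectations remain well defined.
weights = {'A': -0.5, 'B': 1.5}

total = sum(weights.values())
print(total)  # 1.0

# Expectation of an indicator function f with f(A) = 0, f(B) = 1.
f = {'A': 0.0, 'B': 1.0}
expectation = sum(weights[e] * f[e] for e in weights)
print(expectation)  # 1.5
```

Note that the "expectation" of an indicator exceeds 1, something impossible for an ordinary distribution, and attempting -sum(w * log(w)) would require the log of -0.5. This is exactly why the information-theoretic methods are unavailable.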