# Defining a Probability Distribution¶

CMPy provides several different types of probability distributions, depending on your needs.

## Standard Distributions¶

This is an ordinary probability distribution. It can be over a single random variable:

```
>>> dist = Distribution({'0': 0.5, '1': 0.5})
```

Or it can be a joint distribution:

```
>>> dist = Distribution({'000': 0.25, '011': 0.25, '101': 0.25, '110': 0.25})
```

where the event ‘000’ corresponds to the joint outcome in which each of the three random variables takes the value 0.

Warning

When creating a joint distribution such as this, each joint event must have the same length.
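To illustrate what a joint event encodes, here is a minimal plain-Python sketch (using dicts directly, not CMPy's API) that computes the marginal distribution of the first variable from the joint distribution above:

```python
# Marginalize out all but the first variable of a joint distribution.
# This is only a sketch of the idea; it is not CMPy's marginalization API.
joint = {'000': 0.25, '011': 0.25, '101': 0.25, '110': 0.25}

marginal = {}
for event, prob in joint.items():
    first = event[0]  # symbol of the first random variable
    marginal[first] = marginal.get(first, 0.0) + prob

print(marginal)  # {'0': 0.5, '1': 0.5}
```

Because every joint event has the same length, slicing positions out of each event string is always well defined, which is exactly why the warning above applies.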

If you wish to create a distribution where individual events are compound
objects (e.g. a string rather than a character) you must specify that the
distribution is not a joint distribution by supplying the *joint* keyword:

```
>>> dist = Distribution({'alpha': 0.5, 'beta': 0.5}, joint=False)
```

You may also create distributions with arbitrary event labels if all you care about are the probabilities:

```
>>> die = Distribution([1/6]*6)
>>> die
Distribution:
{0: 0.16666666666666666, 1: 0.16666666666666666, 2: 0.16666666666666666,
3: 0.16666666666666666, 4: 0.16666666666666666, 5: 0.16666666666666666}
```
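Since the labels are arbitrary here, only the probability vector matters. For instance, the Shannon entropy of this fair die is log2(6) bits, which can be checked in plain Python (a standalone sketch, not CMPy's entropy method):

```python
import math

# Fair six-sided die: six equally likely outcomes.
probs = [1 / 6] * 6

# Shannon entropy in bits: H = -sum(p * log2(p)).
entropy = -sum(p * math.log2(p) for p in probs)

print(entropy)  # log2(6), roughly 2.585 bits
```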

## Log Distributions¶

A log distribution is useful when your distribution is likely to contain probabilities so small that floating-point precision becomes inadequate. In that case it is beneficial to store the logarithm of each probability rather than the probability itself. As a contrived example:

```
>>> dist = LogDistribution({'A': -1, 'B': -2, 'C': -3, 'D': -3})
```
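To see why storing logs helps, consider multiplying many tiny probabilities: the direct product underflows to zero in floating point, while the equivalent sum of logs remains an ordinary, perfectly representable float. A small standalone demonstration (plain Python, not CMPy):

```python
import math

p = 1e-5  # a tiny event probability
n = 1000  # number of independent events

# Direct product underflows: 1e-5 ** 1000 = 1e-5000, far below the
# smallest positive double (about 5e-324), so it collapses to 0.0.
product = p ** n

# The same quantity in log space is a small, exact computation.
log_product = n * math.log(p)

print(product)      # 0.0 (underflow)
print(log_product)  # about -11512.9
```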

## Symbolic Distributions¶

If you have sympy installed, you can create symbolic distributions representing families or classes of distributions. For example, the family of biased coins is simply:

```
>>> from sympy import Symbol
>>> p = Symbol('p')
>>> dist = SymbolicDistribution({'H': p, 'T': 1-p})
```
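The same idea of a parametrized family can be sketched without sympy as an ordinary function that maps a parameter value to a concrete distribution (a plain-Python illustration, not CMPy's SymbolicDistribution):

```python
def biased_coin(p):
    """Return the concrete biased-coin distribution for bias p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability in [0, 1]")
    return {'H': p, 'T': 1 - p}

print(biased_coin(0.25))  # {'H': 0.25, 'T': 0.75}
```

The symbolic version is more powerful because `p` stays abstract, so quantities such as entropy can be derived as formulas in `p` rather than recomputed for each value.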

## Generalized Distributions¶

Sometimes you may want a distribution where event probabilities can be outside [0, 1] – for that there are generalized distributions:

```
>>> d = GeneralizedDistribution({'A': -0.5, 'B': 1.5})
```

Note

Since there is no way to define the entropy of a generalized distribution, none of the information-theoretic methods are available.
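This restriction follows from the entropy formula itself, which involves log p and is therefore undefined for negative weights. A plain-Python sketch of the failure (using a bare dict, not CMPy code):

```python
import math

# A generalized "distribution" with a negative weight, as a plain dict.
d = {'A': -0.5, 'B': 1.5}

# Attempting Shannon entropy, H = -sum(p * log2(p)), fails because
# log2 is undefined for p < 0.
try:
    entropy = -sum(p * math.log2(p) for p in d.values())
except ValueError as err:
    print("entropy undefined:", err)  # math domain error
```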