# Math Documentation

## The `logarithms` Module

Define log-related functions.

NumPy's `log2` and `exp2` are vectorized functions. When working exclusively with scalars, scalar versions of these functions might perform better, at the cost of additional API complexity.

The vector functions are recommended. Scalar functions are also provided, but they should not be considered as robust.

Note that NumPy added `log2`, `exp2`, and `logaddexp2` at NumPy rev 6090. The first release to incorporate them was NumPy 1.3.0 (2009-04-06). That release had a significant bug: http://projects.scipy.org/numpy/ticket/1096. The fix for #1096 landed in r7059, which is after 1.3.0 (r6844), and will be included in NumPy 1.4.0.

`cmpy.math` uses the NumPy functions when they are available and bug-free. Otherwise, it provides a (probably) slower implementation.

Scalar versions of `logsum` and `logprod` were found in “Numerically Stable Hidden Markov Model Implementation” by Tobias P. Mann.
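As a rough illustration of what scalar log-space addition and multiplication look like, here is a minimal base-e sketch. The names `logsum_scalar` and `logprod_scalar` are hypothetical, and this is not necessarily how cmpy or the cited paper implements them.

```python
import math

def logsum_scalar(log_x, log_y):
    """Return log(exp(log_x) + exp(log_y)) without leaving log space."""
    # A log probability of -inf represents probability zero.
    if log_x == -math.inf:
        return log_y
    if log_y == -math.inf:
        return log_x
    hi, lo = max(log_x, log_y), min(log_x, log_y)
    # Factoring out the larger term keeps exp() from overflowing or
    # underflowing to zero for both arguments at once.
    return hi + math.log1p(math.exp(lo - hi))

def logprod_scalar(log_x, log_y):
    """Return log(exp(log_x) * exp(log_y)), which is a plain sum."""
    return log_x + log_y
```

With this sketch, `logsum_scalar(-1000.0, -1000.0)` stays finite, whereas computing `math.log(math.exp(-1000.0) * 2)` directly would fail because `exp(-1000.0)` underflows to zero.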

`cmpy.math.logarithms.logsum(*args, **kwargs)`

`cmpy.math.logarithms.logsum2(*args, **kwargs)`

`cmpy.math.logarithms.logdotexp(x, y, out=None)`

Returns the dot product of base-e logarithm arrays.

If x or y is a zero-dimensional array (or scalar), then the log product is returned.

If x and y are both one-dimensional arrays, then a scalar is returned.

If y is a one-dimensional array, then it is treated like a column vector, and the return value has shape x.shape[:-1].

Parameters:

- `x`, `y` : array-like. The base-e logarithm arrays.
- `out` : array-like, None. An array to store the output.

Raises:

- `ValueError` : If the last dimension of x is not the same size as the second-to-last dimension of y.
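For the case where both arguments are one-dimensional, the log-space dot product reduces to a log-sum-exp over the elementwise sums. The following is a minimal pure-Python sketch of that case only; `logdotexp_1d` is a hypothetical name, and the sketch does not cover the `out` argument or higher-dimensional inputs.

```python
import math

def logdotexp_1d(log_x, log_y):
    """Return log(sum_i exp(log_x[i]) * exp(log_y[i])) for 1-D sequences."""
    if len(log_x) != len(log_y):
        raise ValueError("x and y must have the same length")
    # Multiplying probabilities is adding their logarithms.
    terms = [a + b for a, b in zip(log_x, log_y)]
    peak = max(terms)
    if peak == -math.inf:
        # Every product is zero, so the dot product is zero.
        return -math.inf
    # Shift by the largest term before exponentiating for stability.
    return peak + math.log(sum(math.exp(t - peak) for t in terms))
```

For example, passing the logarithms of `(0.1, 0.2, 0.7)` and `(0.5, 0.25, 0.25)` returns the logarithm of their ordinary dot product.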

`cmpy.math.logarithms.logdotexp2(x, y, out=None)`

Returns the dot product of base-2 logarithm arrays.

If x or y is a zero-dimensional array (or scalar), then the log product is returned.

If x and y are both one-dimensional arrays, then a scalar is returned.

If y is a one-dimensional array, then it is treated like a column vector, and the return value has shape x.shape[:-1].

Parameters:

- `x`, `y` : array-like. The base-2 logarithm arrays.
- `out` : array-like, None. An array to store the output.

Raises:

- `ValueError` : If the last dimension of x is not the same size as the second-to-last dimension of y.

## The `misc` Module

### Miscellaneous Mathematical Tools

`cmpy.math.misc.is_forbidden(p, logs, **kwds)`

Returns True if the probability is zero (or -inf for log probabilities).

Parameters:

- `p` : float. The probability to test.
- `logs` : boolean. If True, then p is assumed to be a log probability. If False, then p is assumed to be a standard probability.

Returns:

- `b` : boolean. True if the probability is zero (or -inf).

Notes

All other keywords are passed to cmpy.math.close().
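The check described above can be sketched as follows. This is only an illustration: `is_forbidden_sketch` and its `atol` default are hypothetical, and `math.isclose` stands in for `cmpy.math.close()`, whose exact behavior is not shown here.

```python
import math

def is_forbidden_sketch(p, logs, atol=1e-12):
    """Return True if p represents probability zero."""
    if logs:
        # In log space, probability zero is represented by -inf.
        return p == -math.inf
    # For standard probabilities, compare against zero with an
    # absolute tolerance (a relative tolerance is useless at zero).
    return math.isclose(p, 0.0, rel_tol=0.0, abs_tol=atol)
```

Note that a log probability of `0.0` means probability one, so `is_forbidden_sketch(0.0, logs=True)` is False while `is_forbidden_sketch(0.0, logs=False)` is True.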

`cmpy.math.misc.stationary_distribution(ntm, logs, left=True)`

Returns the stationary distribution of a stochastic matrix.

Parameters:

- `ntm` : NumPy array. The node transition matrix, which is assumed to be properly normalized.
- `logs` : bool. If True, then a log distribution is returned.
- `left` : bool. When True, the stationary distribution is computed as the left eigenvector of the node transition matrix. When False, we transpose the matrix and then compute the left eigenvector.

Returns:

- `pi` : NumPy array. A one-dimensional NumPy array representing the stationary distribution.

Raises:

- `MultipleRecurrentComponents` : Raised when the matrix has multiple recurrent components.
- `CMPyException` : Raised when scipy is not available.
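To make the "left eigenvector" condition concrete, the stationary distribution pi satisfies pi · ntm = pi. The sketch below finds it by repeated averaging rather than by an eigenvector solver, so it is a swapped-in technique and not the method described above. It assumes a row-stochastic matrix with a single recurrent component, given as nested lists, and the name `stationary_sketch` is hypothetical.

```python
def stationary_sketch(ntm, tol=1e-13, max_iter=100000):
    """Approximate pi with pi . ntm = pi for a row-stochastic matrix."""
    n = len(ntm)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        # One step of the chain: (pi . ntm)[j] = sum_i pi[i] * ntm[i][j].
        stepped = [sum(pi[i] * ntm[i][j] for i in range(n)) for j in range(n)]
        # Averaging with the previous iterate ("lazy" chain) keeps the
        # same fixed point and avoids oscillation on periodic chains.
        new_pi = [0.5 * (a + b) for a, b in zip(pi, stepped)]
        if max(abs(a - b) for a, b in zip(pi, new_pi)) < tol:
            return new_pi
        pi = new_pi
    raise RuntimeError("iteration did not converge")
```

For the two-state matrix `[[0.9, 0.1], [0.5, 0.5]]`, the balance condition `pi[0] * 0.1 == pi[1] * 0.5` gives the stationary distribution (5/6, 1/6), which the sketch approximates.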

`cmpy.math.misc.sum_probabilities(probs, logs=None)`

Returns the sum of probabilities.

Parameters:

- `probs` : dict, NumPy array, Distribution, LogDistribution. The probabilities to sum. If a dictionary is passed in, then the values are used in the summation.
- `logs` : bool. If True, then the probabilities are treated as log probabilities.

Returns:

- `p` : float. The summed probabilities.

## The `sampling` Module
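A minimal sketch of this behavior for dictionaries and plain sequences is shown below. It assumes base-e log probabilities and that the log-space result is itself a log probability; the name `sum_probabilities_sketch` and the `logs=False` default are hypothetical (the documented default is `None`, and how that is resolved is not stated here).

```python
import math

def sum_probabilities_sketch(probs, logs=False):
    """Sum probabilities given as a dict (values are used) or a sequence."""
    values = list(probs.values()) if isinstance(probs, dict) else list(probs)
    if not logs:
        return math.fsum(values)
    # Log-space sum: shift by the largest value before exponentiating.
    peak = max(values)
    if peak == -math.inf:
        return -math.inf
    return peak + math.log(math.fsum(math.exp(v - peak) for v in values))
```

Under these assumptions, summing the logarithms of a normalized distribution with `logs=True` returns a value close to `0.0`, the logarithm of one.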

Functions related to sampling from distributions.

`cmpy.math.sampling.random_unitsum(dimension, scale=1, prng=None)`

Returns a tuple of numbers which sum to 1.

Parameters:

- `dimension` : int. The dimensionality of the unitsum vector.
- `scale` : float. By default, we sample from the standard simplex only. If scale is greater than one, we are sampling from the extended simplex and there is no longer a nonnegativity constraint on the elements of the vector. For scales less than 1, we are sampling from a restricted simplex. The scale sets the radius of the simplex as measured from the uniform distribution.
- `prng` : random number generator. A random number generator which returns `k` random numbers when `prng.rand(k)` is called. If unspecified, then we use the random number generator at `cmpy.math.prng`.

Returns:

- `v` : NumPy array. An array whose elements sum to 1.

`cmpy.math.sampling.random_zerosum(dimension, scale=1, prng=None)`

Returns a tuple of numbers which sum to 0.

Parameters:

- `dimension` : int. The dimensionality of the zerosum vector.
- `scale` : float. By default, we sample from the standard simplex only. If scale is greater than one, we are sampling from the extended simplex and there is no longer a nonnegativity constraint on the elements of the vector. For scales less than 1, we are sampling from a restricted simplex. The scale sets the radius of the simplex as measured from the uniform distribution.
- `prng` : random number generator. A random number generator which returns `k` random numbers when `prng.rand(k)` is called. If unspecified, then we use the random number generator at `cmpy.math.prng`.

Returns:

- `v` : NumPy array. An array whose elements sum to 0.
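One plausible reading of the `scale` parameter is that a point is drawn from the standard simplex and then moved toward or away from the uniform distribution by that factor. The sketch below implements that reading only as an assumption; it may differ from what cmpy actually does. The function names are hypothetical, and Python's `random.Random` stands in for the `prng` object.

```python
import random

def random_unitsum_sketch(dimension, scale=1.0, rng=None):
    """Return a list of floats summing to 1 (up to rounding)."""
    if rng is None:
        rng = random.Random()
    # Normalized exponential draws are uniform on the standard simplex.
    draws = [rng.expovariate(1.0) for _ in range(dimension)]
    total = sum(draws)
    point = [d / total for d in draws]
    # Rescale the offset from the uniform distribution. With scale > 1
    # entries may become negative; with scale < 1 they stay near uniform.
    center = 1.0 / dimension
    return [center + scale * (p - center) for p in point]

def random_zerosum_sketch(dimension, scale=1.0, rng=None):
    """Return a list of floats summing to 0 (up to rounding)."""
    center = 1.0 / dimension
    # Subtracting the uniform distribution removes the unit sum.
    return [v - center for v in random_unitsum_sketch(dimension, scale, rng)]
```

Because the rescaling is applied around the uniform distribution, the sum is unchanged by `scale`: the offsets from the center always sum to zero.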

`cmpy.math.sampling.sample_discrete(dist, logs, prng=None, rand=None)`

Returns a sample from a discrete distribution.

Parameters:

- `dist` : dict, NumPy array. The distribution from which the sample is drawn. If a dict, then the keys are what are being drawn from and the values indicate the probability of drawing each key. If a NumPy array, the elements specify the probability of drawing the index of the element.
- `logs` : bool. If True, then the distribution is treated as a log distribution. If False, the distribution is treated as a standard distribution.
- `prng` : random number generator. A random number generator with a `rand` method that returns a random number between 0 and 1 when called with no arguments. If unspecified, then we use the random number generator at `cmpy.math.prng`.
- `rand` : float. Instead of drawing a random number, you can provide a random number, and a sample will be drawn using that number. If rand is specified, then prng must be None (and vice versa).

Returns:

- `s` : sample. The sample drawn from the distribution. If `dist` is a NumPy array, then we return the index of the sampled element.

## The `equal` Module
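Drawing from a discrete distribution with a single uniform number is commonly done by walking the cumulative distribution until it exceeds that number. The sketch below shows this inverse-CDF approach, including the way a caller-supplied `rand` replaces the random draw. It is an illustration under assumptions: the name `sample_discrete_sketch` is hypothetical, log probabilities are taken to be base-e, and Python's `random` module stands in for the `prng` object.

```python
import math
import random

def sample_discrete_sketch(dist, logs=False, rng=None, rand=None):
    """Draw one outcome from a dict (keys) or sequence (indices)."""
    if rand is not None and rng is not None:
        raise ValueError("specify at most one of rng and rand")
    if rand is None:
        rand = (rng if rng is not None else random).random()
    if isinstance(dist, dict):
        items = list(dist.items())
    else:
        items = list(enumerate(dist))
    cumulative = 0.0
    for outcome, p in items:
        cumulative += math.exp(p) if logs else p
        if rand < cumulative:
            return outcome
    # Guard against rounding when the probabilities sum to just under 1.
    return items[-1][0]
```

Supplying `rand` makes the draw deterministic, which is convenient for testing: with `{'a': 0.2, 'b': 0.8}`, a value of `rand=0.1` selects `'a'` and `rand=0.5` selects `'b'`.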

`cmpy.math.equal.close(x, y, rtol=None, atol=None)`

Returns True if the scalars x and y are close.

The relative error rtol must be positive and << 1.0. The absolute error atol usually comes into play when y is very small or zero; it says how small x must also be.

If rtol or atol are unspecified, they are taken from `cmpyParams['rtol']` and `cmpyParams['atol']`.

Note: This version is cythonified.

`cmpy.math.equal.allclose(x, y, rtol=None, atol=None)`

Returns True if all components of x and y are close.

The relative error rtol must be positive and << 1.0. The absolute error atol usually comes into play for those elements of y that are very small or zero; it says how small x must also be.

If rtol or atol are unspecified, they are taken from `cmpyParams['rtol']` and `cmpyParams['atol']`.
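The roles of rtol and atol described for `close` and `allclose` match the convention used by NumPy's `allclose`, where the tolerance is measured relative to `y`. The sketch below assumes that convention; the function names and the default tolerances are hypothetical, since the real defaults come from `cmpyParams`.

```python
def close_sketch(x, y, rtol=1e-9, atol=1e-12):
    """Return True if |x - y| <= atol + rtol * |y|."""
    # When y is zero the relative term vanishes, so atol alone
    # decides how small x must be.
    return abs(x - y) <= atol + rtol * abs(y)

def allclose_sketch(xs, ys, rtol=1e-9, atol=1e-12):
    """Return True if every pair of components is close."""
    return len(xs) == len(ys) and all(
        close_sketch(x, y, rtol, atol) for x, y in zip(xs, ys)
    )
```

Note that this test is not symmetric in `x` and `y`, because only `|y|` scales the relative tolerance.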

`cmpy.math.equal.cmpy_equal(*args, **kwargs)`

Machines can output strings, classes, numbers, etc. The purpose of this function is to provide a unified approach to equality comparisons. If the objects are strings, we should compare with string equality. If the objects are floats, we should compare with float equality.

Warning: x and y are converted into scipy arrays before comparison.

This means that all elements in x or y will be converted into the same type.

So, `x = ('a', 3)` will become `x = array(['a', '3'], dtype='|S1')`.

This has bad implications if there is also an output symbol `'3'` which is meant to be different from the output symbol `3`.

Rather than worry about this at all, this function is designed to work with scalars only. A ValueError exception will be raised for any other usage.

Required arguments:

- `x` : This value will be compared to y.
- `y` : This value will be compared to x.

Optional keywords:

- `rtol` : This is the relative tolerance. The default value is `cmpyParams['rtol']`.
- `atol` : This is the absolute tolerance. The default value is `cmpyParams['atol']`.

cmpy_equal() is deprecated. Use ==, close(), or allclose() instead.

## The `sigmaalgebra` Module

Functions for generating sigma algebras.

http://users.ices.utexas.edu/~chetan/talks/tech_report_sigma_algebra.pdf

`cmpy.math.sigmaalgebra.is_sigma_algebra(F, X=None)`

Returns True if F is a sigma algebra on X.

Parameters:

- `F` : set of frozensets. The candidate sigma algebra.
- `X` : frozenset, None. The universal set. If None, then X is taken to be the union of the sets in F.

Returns:

- `issa` : bool. True if F is a sigma algebra and False if not.

Notes

The time complexity of this algorithm is O(len(F) * len(X)).
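For a finite X, the defining properties can be checked directly: F must contain X and be closed under complements and unions. The sketch below is a naive check written for clarity. It is quadratic in `len(F)`, so it is not the algorithm whose complexity is quoted above, and the name `is_sigma_algebra_sketch` is hypothetical.

```python
def is_sigma_algebra_sketch(F, X=None):
    """Naively test whether F is a sigma algebra on the finite set X."""
    if X is None:
        X = frozenset().union(*F)
    X = frozenset(X)
    if X not in F:
        return False
    for A in F:
        # Every member must be a subset of X with its complement in F.
        if not A <= X or (X - A) not in F:
            return False
        # On a finite set, closure under pairwise unions is enough.
        for B in F:
            if (A | B) not in F:
                return False
    return True
```

The empty set does not need a separate check, because it is the complement of X.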

`cmpy.math.sigmaalgebra.sigma_algebra(C, X=None)`

Returns the sigma algebra generated by the subsets in C.

Let X be a set and C be a collection of subsets of X. The sigma algebra generated by the subsets in C is the smallest sigma-algebra which contains every subset in C.

Parameters:

- `C` : set of frozensets. The set of subsets of X.
- `X` : frozenset, None. The underlying set. If None, then X is taken to be the union of the sets in C.

Returns:

- `sC` : frozenset of frozensets. The sigma-algebra generated by C.

Notes

The algorithm run time is generally exponential in |X|, the size of X.
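One straightforward way to build the generated sigma algebra on a finite set is to keep adding complements and unions until nothing new appears. The sketch below takes that fixed-point approach, which may differ from the algorithm cmpy uses; the name `sigma_algebra_sketch` is hypothetical.

```python
def sigma_algebra_sketch(C, X=None):
    """Return the smallest sigma algebra on finite X containing C."""
    if X is None:
        X = frozenset().union(*C)
    X = frozenset(X)
    sC = {frozenset(), X} | {frozenset(c) for c in C}
    changed = True
    while changed:
        changed = False
        current = list(sC)
        for A in current:
            # Candidates: the complement of A and its unions with others.
            new = {X - A} | {A | B for B in current}
            if not new <= sC:
                sC |= new
                changed = True
    return frozenset(sC)
```

For example, with `X = frozenset({1, 2, 3})` and `C = {frozenset({1})}`, the result contains four sets: the empty set, `{1}`, `{2, 3}`, and `X` itself.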