Math Documentation

This page contains the Math Package documentation.

The logarithms Module

Define log-related functions.

NumPy's log2 and exp2 are vector functions. When working exclusively with scalars, scalar versions of these functions might offer a performance benefit, at the cost of additional API complexity.

The vector functions are recommended. Scalar functions are also provided, but they should not be considered as robust.

Note that NumPy added log2, exp2, and logaddexp2 in revision r6090. The first release to incorporate them was NumPy 1.3.0 (2009-04-06). There was a significant bug (http://projects.scipy.org/numpy/ticket/1096). The fix for #1096 landed in r7059, which is after 1.3.0 (r6844), and will be included in NumPy 1.4.0.

cmpy.math uses the NumPy functions when a bug-free version is available. Otherwise, it provides a (probably) slower implementation.

Scalar versions of logsum and logprod were found in “Numerically Stable Hidden Markov Model Implementation” by Tobias P. Mann.
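To illustrate the scalar approach, here is a minimal sketch of a numerically stable scalar log-sum in base 2. The name logsum2_scalar is hypothetical and this is not the cmpy implementation; it only shows the common trick of factoring out the larger argument so that the exponent never overflows.

```python
import math

def logsum2_scalar(x, y):
    """Return log2(2**x + 2**y) for scalar base-2 logs x and y (hypothetical helper)."""
    # A log probability of -inf represents probability zero.
    if x == -math.inf:
        return y
    if y == -math.inf:
        return x
    hi, lo = (x, y) if x >= y else (y, x)
    # 2**hi * (1 + 2**(lo - hi)); the exponent lo - hi is <= 0, so it cannot overflow.
    return hi + math.log2(1.0 + 2.0 ** (lo - hi))
```

For example, logsum2_scalar(math.log2(0.25), math.log2(0.5)) is approximately math.log2(0.75).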

cmpy.math.logarithms.logsum(*args, **kwargs)

Vectorized _logsum() is deprecated. Use logaddexp() instead.

cmpy.math.logarithms.logsum2(*args, **kwargs)

Vectorized _logsum2() is deprecated. Use logaddexp2() instead.
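As a sketch of the recommended replacements, the NumPy functions logaddexp and logaddexp2 can be called directly on log arrays or scalars. The probability values below are made up for illustration.

```python
import numpy as np

# Natural-log probabilities (illustrative values).
log_p = np.log(np.array([0.25, 0.10]))
log_q = np.log(np.array([0.50, 0.20]))

# Elementwise log(exp(log_p) + exp(log_q)), computed stably.
log_sum = np.logaddexp(log_p, log_q)

# Base-2 counterpart: log2(2**a + 2**b).
log2_sum = np.logaddexp2(np.log2(0.25), np.log2(0.5))
```

Exponentiating log_sum recovers the elementwise sums 0.75 and 0.30, up to floating-point error.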

cmpy.math.logarithms.logdotexp(x, y, out=None)

Returns the dot product of base-e logarithm arrays.

If x or y is a zero-dimensional array (or scalar), then the log product is returned.

If x and y are both one-dimensional arrays, then a scalar is returned.

If y is a one-dimensional array, then it is treated like a column vector, and the return value has shape x.shape[:-1].

Parameters:

x, y : array-like

The base-e logarithm arrays.

out : array-like, None

An array to store the output.

Raises:

ValueError :

If the last dimension of x is not the same size as the second-to-last dimension of y.
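The following is a minimal sketch of a log-space dot product for the two-dimensional case only, written with NumPy. The name logdotexp_sketch is hypothetical, the out parameter and the scalar and one-dimensional cases are not handled, and the real cmpy implementation may differ.

```python
import numpy as np

def logdotexp_sketch(x, y):
    """Log-space matrix product of base-e log arrays x (n, k) and y (k, m)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim != 2 or y.ndim != 2:
        raise ValueError("this sketch handles 2-D arrays only")
    if x.shape[-1] != y.shape[-2]:
        raise ValueError("last dimension of x must match second-to-last dimension of y")
    # terms[i, k, j] = x[i, k] + y[k, j] is the log of one product term.
    terms = x[:, :, np.newaxis] + y[np.newaxis, :, :]
    # Sum the terms over k in log space.
    return np.logaddexp.reduce(terms, axis=1)

a = np.array([[0.2, 0.8], [0.5, 0.5]])
b = np.array([[0.1, 0.9], [0.6, 0.4]])
result = logdotexp_sketch(np.log(a), np.log(b))
```

Here np.exp(result) should agree with the ordinary matrix product of a and b. A base-2 variant would use np.logaddexp2 in place of np.logaddexp.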

cmpy.math.logarithms.logdotexp2(x, y, out=None)

Returns the dot product of base-2 logarithm arrays.

If x or y is a zero-dimensional array (or scalar), then the log product is returned.

If x and y are both one-dimensional arrays, then a scalar is returned.

If y is a one-dimensional array, then it is treated like a column vector, and the return value has shape x.shape[:-1].

Parameters:

x, y : array-like

The base-2 logarithm arrays.

out : array-like, None

An array to store the output.

Raises:

ValueError :

If the last dimension of x is not the same size as the second-to-last dimension of y.

The misc Module

Miscellaneous Mathematical Tools

cmpy.math.misc.is_forbidden(p, logs, **kwds)

Returns True if the probability is zero (or -inf for log probabilities).

Parameters:

p : float

The probability to test.

logs : boolean

If True, then p is assumed to be a log probability. If False, then p is assumed to be a standard probability.

Returns:

b : boolean

True if the probability is zero (or -inf).

Notes

All other keywords are passed to cmpy.math.close().
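A rough sketch of the test described above, using only the standard library. The name is_forbidden_sketch and the atol default are assumptions; the absolute tolerance stands in for the keywords that the real function forwards to cmpy.math.close(), and the log case is assumed here to be an exact comparison with -inf.

```python
import math

def is_forbidden_sketch(p, logs, atol=1e-9):
    """Return True if p represents probability zero (hypothetical helper)."""
    if logs:
        # Assumption: a log probability is forbidden only when it is exactly -inf.
        return p == -math.inf
    # Standard probability: forbidden when within atol of zero.
    return math.isclose(p, 0.0, abs_tol=atol)
```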

cmpy.math.misc.stationary_distribution(ntm, logs, left=True)

Returns the stationary distribution of a stochastic matrix.

Parameters:

ntm : NumPy array

The node transition matrix, which is assumed to be properly normalized.

logs : bool

If True, then a log distribution is returned.

left : bool

When True, the stationary distribution is computed as the left eigenvector of the node transition matrix. When False, we transpose the matrix and then compute the left eigenvector.

Returns:

pi : NumPy array

A one-dimensional NumPy array representing the stationary distribution.

Raises:

MultipleRecurrentComponents :

Raised when the matrix has multiple recurrent components.

CMPyException :

Raised when scipy is not available.
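As an illustration of the left-eigenvector computation, here is a minimal sketch using numpy.linalg.eig. The name stationary_sketch is hypothetical, base-2 logs are an assumption, and the sketch does not detect multiple recurrent components, so it is only meaningful for a matrix with a single recurrent component.

```python
import numpy as np

def stationary_sketch(ntm, logs=False):
    """Stationary distribution pi satisfying pi @ ntm == pi (hypothetical helper)."""
    ntm = np.asarray(ntm, dtype=float)
    # Left eigenvectors of ntm are right eigenvectors of its transpose.
    eigenvalues, eigenvectors = np.linalg.eig(ntm.T)
    # Pick the eigenvector whose eigenvalue is closest to 1.
    index = np.argmin(np.abs(eigenvalues - 1.0))
    pi = np.real(eigenvectors[:, index])
    # Normalize so the entries sum to 1; this also fixes the overall sign.
    pi = pi / pi.sum()
    # Assumption: log distributions use base 2.
    return np.log2(pi) if logs else pi

P = np.array([[0.9, 0.1],
              [0.5, 0.5]])
pi = stationary_sketch(P, logs=False)
```

For this two-state matrix, pi is approximately [5/6, 1/6], and pi @ P reproduces pi.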

cmpy.math.misc.sum_probabilities(probs, logs=None)

Returns the sum of probabilities.

Parameters:

probs : dict, NumPy array, Distribution, LogDistribution

The probabilities to sum. If a dictionary is passed in, then the values are used in the summation.

logs : bool

If True, then the probabilities are treated as log probabilities.

Returns:

p : float

The summed probabilities.
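A minimal sketch of the summation for dictionaries and plain sequences, using only the standard library. The name sum_probabilities_sketch is hypothetical, Distribution and LogDistribution objects are not handled, and base-2 logs are an assumption.

```python
import math

def sum_probabilities_sketch(probs, logs=False):
    """Sum probabilities; with logs=True, sum base-2 log probabilities in log space."""
    # For a dictionary, only the values take part in the sum.
    values = list(probs.values()) if isinstance(probs, dict) else list(probs)
    if not logs:
        return math.fsum(values)
    hi = max(values, default=-math.inf)
    if hi == -math.inf:
        # Every probability is zero (or there is nothing to sum).
        return -math.inf
    # Factor out the largest term so the exponents are all <= 0.
    return hi + math.log2(sum(2.0 ** (v - hi) for v in values))
```

For example, summing the log probabilities [-2.0, -1.0] with logs=True gives approximately math.log2(0.75).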

The sampling Module

Functions related to sampling from distributions.

cmpy.math.sampling.random_unitsum(dimension, scale=1, prng=None)

Returns a vector of numbers which sum to 1.

Parameters:

dimension : int

The dimensionality of the unitsum vector.

scale : float

By default, we sample from the standard simplex only. If scale is greater than one, we are sampling from the extended simplex and there is no longer a nonnegativity constraint on the elements of the vector. For scales less than 1, we are sampling from a restricted simplex. The scale sets the radius of the simplex as measured from the uniform distribution.

prng : random number generator

A random number generator which returns 'k' random numbers when 'prng.rand(k)' is called. If unspecified, then we use the random number generator at cmpy.math.prng.

Returns:

v : NumPy array

An array whose elements sum to 1.
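For the default case of the standard simplex (scale equal to 1), one common way to draw a unit-sum vector is to normalize independent exponential variates. The sketch below assumes that approach; it is not necessarily how cmpy samples, it ignores the scale parameter, and it uses a standard-library random.Random instance rather than the prng.rand(k) interface described above.

```python
import random

def random_unitsum_sketch(dimension, prng=None):
    """Draw a nonnegative vector of the given length whose entries sum to 1."""
    prng = prng or random.Random()
    # Normalized exponential variates are uniformly distributed on the standard simplex.
    draws = [prng.expovariate(1.0) for _ in range(dimension)]
    total = sum(draws)
    return [d / total for d in draws]

v = random_unitsum_sketch(4, prng=random.Random(0))
```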

cmpy.math.sampling.random_zerosum(dimension, scale=1, prng=None)

Returns a vector of numbers which sum to 0.

Parameters:

dimension : int

The dimensionality of the zerosum vector.

scale : float

By default, we sample from the standard simplex only. If scale is greater than one, we are sampling from the extended simplex and there is no longer a nonnegativity constraint on the elements of the vector. For scales less than 1, we are sampling from a restricted simplex. The scale sets the radius of the simplex as measured from the uniform distribution.

prng : random number generator

A random number generator which returns 'k' random numbers when 'prng.rand(k)' is called. If unspecified, then we use the random number generator at cmpy.math.prng.

Returns:

v : NumPy array

An array whose elements sum to 0.

cmpy.math.sampling.sample_discrete(dist, logs, prng=None, rand=None)

Returns a sample from a discrete distribution.

Parameters:

dist : dict, NumPy array

The distribution from which the sample is drawn. If a dict, then the keys are the possible outcomes and the values give the probability of drawing each key. If a NumPy array, the elements specify the probability of drawing the index of the element.

logs : bool

If True, then the distribution is treated as a log distribution. If False, the distribution is treated as a standard distribution.

prng : random number generator

A random number generator with a 'rand' method that returns a random number between 0 and 1 when called with no arguments. If unspecified, then we use the random number generator at cmpy.math.prng.

rand : float

Instead of drawing a random number, you can provide a random number, and a sample will be drawn using that number. If rand is specified, then prng must be None (and vice versa).

Returns:

s : sample

The sample drawn from the distribution. If 'dist' is a NumPy array then we return the index of the sampled element.
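The behavior described above can be sketched with inverse-CDF sampling: accumulate probabilities until the running total exceeds the random number. The name sample_discrete_sketch is hypothetical, base-2 logs are an assumption, and the fallback generator here is the standard-library random module (whose method is random(), not rand()).

```python
import random

def sample_discrete_sketch(dist, logs=False, prng=None, rand=None):
    """Draw one outcome from a dict or sequence of (log) probabilities."""
    if prng is not None and rand is not None:
        raise ValueError("specify at most one of prng and rand")
    if rand is None:
        # Assumption: the generator exposes random(), as in the stdlib.
        rand = prng.random() if prng is not None else random.random()
    # Dict: iterate over (key, probability); sequence: (index, probability).
    items = dist.items() if isinstance(dist, dict) else enumerate(dist)
    total = 0.0
    outcome = None
    for outcome, p in items:
        # Assumption: log probabilities are base 2.
        total += 2.0 ** p if logs else p
        if rand < total:
            return outcome
    # Guard against round-off when rand is very close to 1.
    return outcome
```

Passing rand explicitly makes the draw deterministic, which is convenient for testing: with dist = {'a': 0.5, 'b': 0.5}, rand=0.1 selects 'a' and rand=0.7 selects 'b'.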

The equal Module

cmpy.math.equal.close(x, y, rtol=None, atol=None)

Returns True if the scalars x and y are close.

The relative error rtol must be positive and << 1.0. The absolute error atol usually comes into play when y is very small or zero; it says how small x must be also.

If rtol or atol are unspecified, they are taken from cmpyParams['rtol'] and cmpyParams['atol'].

Note: This version is cythonified.
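One plausible reading of the description above is the asymmetric test used by numpy.allclose, |x - y| <= atol + rtol * |y|. The sketch below assumes that criterion; the name close_sketch and the default tolerances are assumptions, since the actual defaults come from cmpyParams.

```python
def close_sketch(x, y, rtol=1e-5, atol=1e-8):
    """Return True if scalars x and y are close (assumed criterion, hypothetical defaults)."""
    # rtol scales with the size of y; atol takes over when y is near zero.
    return abs(x - y) <= atol + rtol * abs(y)
```

With these defaults, close_sketch(1e-9, 0.0) is True because the difference is below atol, while close_sketch(1e-3, 0.0) is False.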

cmpy.math.equal.allclose(x, y, rtol=None, atol=None)

Returns True if all components of x and y are close.

The relative error rtol must be positive and << 1.0. The absolute error atol usually comes into play for those elements of y that are very small or zero; it says how small x must be also.

If rtol or atol are unspecified, they are taken from cmpyParams['rtol'] and cmpyParams['atol'].

cmpy.math.equal.cmpy_equal(*args, **kwargs)

Machines can output strings, classes, numbers, etc. The purpose of this function is to provide a unified approach to equality comparisons. If the objects are strings, we should compare with string equality. If the objects are floats, we should compare with float equality.

Warning: x and y are converted into scipy arrays before comparison.

This means that all elements in x or y will be converted into the same type.

So, x=('a', 3) will become x=array(['a', '3'], dtype='|S1').

This has bad implications if there is also an output symbol '3' which is meant to be different from the output symbol 3.

Rather than worry about this at all, this function is designed to work with scalars only. A ValueError exception will be raised for any other usage.
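The coercion described in the warning can be reproduced with NumPy directly. This is only an illustration of the array conversion, not of cmpy_equal() itself; the exact string dtype depends on the Python and NumPy versions in use.

```python
import numpy as np

# A mixed tuple of a string and an integer is coerced to a single string dtype.
x = np.array(("a", 3))

# The integer 3 has become the string "3", so it can no longer be told
# apart from an output symbol that was the string "3" to begin with.
collides = bool(x[1] == "3")
```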

Required arguments:
x
This value will be compared to y.
y
This value will be compared to x.
Optional keywords:
rtol
This is the relative tolerance. The default value is cmpyParams['rtol'].
atol
This is the absolute tolerance. The default value is cmpyParams['atol'].

cmpy_equal() is deprecated. Use ==, close(), or allclose() instead.

The sigmaalgebra Module

Functions for generating sigma algebras.

http://users.ices.utexas.edu/~chetan/talks/tech_report_sigma_algebra.pdf

cmpy.math.sigmaalgebra.is_sigma_algebra(F, X=None)

Returns True if F is a sigma algebra on X.

Parameters:

F : set of frozensets

The candidate sigma algebra.

X : frozenset, None

The universal set. If None, then X is taken to be the union of the sets in F.

Returns:

issa : bool

True if F is a sigma algebra and False if not.

Notes

The time complexity of this algorithm is O(len(F) * len(X)).
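For a finite collection, the definition can be checked directly: F must contain X and be closed under complements and unions. The sketch below is a naive check that is quadratic in len(F); it is not the O(len(F) * len(X)) algorithm mentioned above, and the name is_sigma_algebra_sketch is hypothetical.

```python
def is_sigma_algebra_sketch(F, X=None):
    """Naive check that the finite collection F of frozensets is a sigma algebra on X."""
    F = set(F)
    # By default, X is the union of the sets in F.
    X = frozenset().union(*F) if X is None else frozenset(X)
    if X not in F:
        return False
    # Closed under complement.
    if any((X - A) not in F for A in F):
        return False
    # Closed under pairwise union, which suffices for a finite collection.
    return all((A | B) in F for A in F for B in F)

F = {frozenset(), frozenset({1}), frozenset({2, 3}), frozenset({1, 2, 3})}
G = {frozenset(), frozenset({1}), frozenset({1, 2, 3})}
```

Here F is a sigma algebra on {1, 2, 3}, while G is not because the complement of {1} is missing.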

cmpy.math.sigmaalgebra.sigma_algebra(C, X=None)

Returns the sigma algebra generated by the subsets in C.

Let X be a set and C be a collection of subsets of X. The sigma algebra generated by the subsets in C is the smallest sigma-algebra which contains every subset in C.

Parameters:

C : set of frozensets

The set of subsets of X.

X : frozenset, None

The underlying set. If None, then X is taken to be the union of the sets in C.

Returns:

sC : frozenset of frozensets

The sigma-algebra generated by C.

Notes

The algorithm run time is generally exponential in |X|, the size of X.
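One way to generate the sigma algebra on a finite set is to find its atoms (elements grouped by which sets of C contain them) and then take every union of atoms. The sketch below assumes that approach and uses a hypothetical name; it may not match the cmpy implementation. Its output size is 2 to the power of the number of atoms, which is consistent with the exponential run time noted above.

```python
from itertools import combinations

def sigma_algebra_sketch(C, X=None):
    """Generate the sigma algebra on finite X from the collection C of subsets."""
    C = [frozenset(A) for A in C]
    # By default, X is the union of the sets in C.
    X = frozenset().union(*C) if X is None else frozenset(X)
    # Elements with the same membership pattern across C form one atom.
    groups = {}
    for element in X:
        signature = tuple(element in A for A in C)
        groups.setdefault(signature, set()).add(element)
    atoms = [frozenset(group) for group in groups.values()]
    # Every member of the sigma algebra is a union of atoms (the empty union is the empty set).
    generated = set()
    for r in range(len(atoms) + 1):
        for chosen in combinations(atoms, r):
            generated.add(frozenset().union(*chosen))
    return frozenset(generated)

sC = sigma_algebra_sketch({frozenset({1})}, X=frozenset({1, 2, 3}))
```

For C = {{1}} and X = {1, 2, 3}, the result contains the empty set, {1}, {2, 3}, and {1, 2, 3}.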