Infotheory Documentation

This page contains the Infotheory Package documentation.

The quantities Module

Fun little module which shows various relationships between information theory quantities over a set of variables.

cmpy.infotheory.quantities.print_atoms(variables)

Prints the possible information theoretic atoms over the variables.

Parameters:

variables : list

A list of random variable names.

cmpy.infotheory.quantities.print_entropies(variables)

Prints the possible entropies over specified variables.

Parameters:

variables : list

A list of random variable names.

cmpy.infotheory.quantities.print_mutual_informations(variables, sep=', ')

Prints the possible mutual informations over specified variables.

Parameters:

variables : list

A list of random variable names.

cmpy.infotheory.quantities.num_entropies(n)

Returns the number of entropies over n variables.

cmpy.infotheory.quantities.num_atoms(n)

Returns the number of atoms over n variables.
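
For reference, both counts grow as 2^n - 1: there is one joint entropy H[S] for each nonempty subset S of the variables, and one atom for each nonempty region of the information diagram. A minimal sketch of the counting (illustrative only; it does not reproduce the library's exact output format):

from itertools import combinations

def nonempty_subsets(variables):
    # Yield each nonempty subset of the variable names.
    for r in range(1, len(variables) + 1):
        for subset in combinations(variables, r):
            yield subset

variables = ['X', 'Y', 'Z']
entropies = ['H[' + ', '.join(s) + ']' for s in nonempty_subsets(variables)]
print(len(entropies))  # 7 == 2**3 - 1, and likewise 7 atoms
print(entropies[:4])   # ['H[X]', 'H[Y]', 'H[Z]', 'H[X, Y]']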

The conditional_distributions Module

Conditional Probability Distributions

Various conditional distributions.

class cmpy.infotheory.conditional_distributions.ConditionalDistribution(dist=None, marginal=None, wildcard='*', event_type=<class 'cmpy.infotheory.events.JointEvent'>)

Bases: cmpy.infotheory.conditional_distributions.ConditionalEntropicDistribution

ConditionalDistribution

Conditional distribution with standard probability values.

Methods

array([order_p, order_q]) Return a stochastic matrix representation of this conditional distribution.
bayesian_inverse([indices, length]) Returns the inverted conditional distribution.
channel_capacity([rtol]) Computes the marginal which achieves the channel capacity, and returns it along with the channel capacity.
clean_events() Removes wildcards from events.
conditional_entropy([moment]) Return the conditional entropy of this conditional distribution and its associated marginal.
events([inner, raw]) Return the events of the conditional distribution.
is_commensurate() Returns True if inner and outer events are commensurate.
iter_eventdists([raw]) Returns an iterator over (event, distribution) pairs.
iter_events([raw]) Returns an iterator over outer events.
marginal([dist]) Returns (or sets) the associated marginal distribution.
marginal_other([indices, length]) Return the other marginal distribution.
normalize() Normalizes each distribution and the marginal distribution as well.
reduced() Reduces all conditional distributions.
to_event(event) Converts event into a formal event instance.
to_joint([indices, length]) Returns a joint distribution from this conditional distribution.
to_logcdist() Returns a base-2 log-distributed conditional distribution.
trim([descend]) Remove distributions which do not contain any events.
type() Returns the type of the inner distributions.
wildcard_events() Adds wildcards to events, if they have been removed.
to_logcdist()

Returns a base-2 log-distributed conditional distribution.

Returns:

cdist : ConditionalLogDistribution

A base-2 log-distributed conditional probability distribution.

class cmpy.infotheory.conditional_distributions.ConditionalLogDistribution(dist=None, marginal=None, wildcard='*', event_type=<class 'cmpy.infotheory.events.JointEvent'>)

Bases: cmpy.infotheory.conditional_distributions.ConditionalEntropicDistribution

ConditionalLogDistribution

Conditional distribution with log-probabilities.

Methods

array([order_p, order_q]) Return a stochastic matrix representation of this conditional distribution.
bayesian_inverse([indices, length]) Returns the inverted conditional distribution.
channel_capacity([rtol]) Computes the marginal which achieves the channel capacity, and returns it along with the channel capacity.
clean_events() Removes wildcards from events.
conditional_entropy([moment]) Return the conditional entropy of this conditional distribution and its associated marginal.
events([inner, raw]) Return the events of the conditional distribution.
is_commensurate() Returns True if inner and outer events are commensurate.
iter_eventdists([raw]) Returns an iterator over (event, distribution) pairs.
iter_events([raw]) Returns an iterator over outer events.
marginal([dist]) Returns (or sets) the associated marginal distribution.
marginal_other([indices, length]) Return the other marginal distribution.
normalize() Normalizes each distribution and the marginal distribution as well.
reduced() Reduces all conditional distributions.
to_cdist() Returns a linear-distributed conditional distribution.
to_event(event) Converts event into a formal event instance.
to_joint([indices, length]) Returns a joint distribution from this conditional distribution.
trim([descend]) Remove distributions which do not contain any events.
type() Returns the type of the inner distributions.
wildcard_events() Adds wildcards to events, if they have been removed.
to_cdist()

Returns a linear-distributed conditional distribution.

Returns:

cdist : ConditionalDistribution

A linear-distributed conditional probability distribution.

class cmpy.infotheory.conditional_distributions.ConditionalSymbolicDistribution(dist=None, marginal=None, wildcard='*', event_type=<class 'cmpy.infotheory.events.JointEvent'>)

Bases: cmpy.infotheory.conditional_distributions.ConditionalEntropicDistribution

ConditionalSymbolicDistribution

Conditional distribution with symbolic probabilities. Requires sympy.

Methods

array([order_p, order_q]) Return a stochastic matrix representation of this conditional distribution.
bayesian_inverse([indices, length]) Returns the inverted conditional distribution.
channel_capacity([rtol]) Computes the marginal which achieves the channel capacity, and returns it along with the channel capacity.
clean_events() Removes wildcards from events.
conditional_entropy([moment]) Return the conditional entropy of this conditional distribution and its associated marginal.
events([inner, raw]) Return the events of the conditional distribution.
is_commensurate() Returns True if inner and outer events are commensurate.
iter_eventdists([raw]) Returns an iterator over (event, distribution) pairs.
iter_events([raw]) Returns an iterator over outer events.
marginal([dist]) Returns (or sets) the associated marginal distribution.
marginal_other([indices, length]) Return the other marginal distribution.
normalize() Normalizes each distribution and the marginal distribution as well.
reduced() Reduces all conditional distributions.
to_event(event) Converts event into a formal event instance.
to_joint([indices, length]) Returns a joint distribution from this conditional distribution.
trim([descend]) Remove distributions which do not contain any events.
type() Returns the type of the inner distributions.
wildcard_events() Adds wildcards to events, if they have been removed.
array(order_p=None, order_q=None)

Return a stochastic matrix representation of this conditional distribution.

Parameters:

order_p : list of events

The event order along the first axis.

order_q : list of events

The event order along the second axis.

Returns:

stochastic : Matrix

Sympy matrix of probabilities.
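
For instance, the kind of symbolic stochastic matrix this returns can be built directly in SymPy (a hypothetical binary symmetric channel; the row and column ordering here is illustrative, not the library's):

import sympy

e = sympy.Symbol('e')
# Rows are the conditioned-on events, columns are the inner events;
# each row of a stochastic matrix sums to 1.
stochastic = sympy.Matrix([[1 - e, e],
                           [e, 1 - e]])
assert all(sum(stochastic.row(i)) == 1 for i in range(stochastic.rows))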

cmpy.infotheory.conditional_distributions.tunnel(YgX, ZgY, commensurate=False)

Given Pr(Y|X) and Pr(Z|Y), return Pr(Z|X).

Let X -> Y -> Z be a channel (and so, a Markov chain).

Given the conditional distributions for each channel (X to Y and Y to Z), we return the conditional distribution connecting X to Z.

Parameters:

YgX : conditional distribution

The channel from X to Y.

ZgY : conditional distribution

The channel from Y to Z.

commensurate : bool

A boolean controlling how the X events are related to the Z events. Both YgX and ZgY have a _commensurate attribute. If this attribute is True for either conditional distribution, then the value of commensurate is ignored. If both attributes are False, and commensurate is True, then the events for X and Z are assumed to be commensurate with each other, and the _commensurate attribute for ZgX will be set to True (and False otherwise).

Returns:

ZgX : conditional distribution

The channel from X to Z.

Raises:

InfoTheoryException :

Raised if YgX and ZgY are not of the same type.
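
In matrix terms, the composition tunnel() performs is an ordinary product of stochastic matrices: Pr(Z|X=x) = sum_y Pr(Z|Y=y) Pr(Y=y|X=x). A minimal NumPy sketch (illustrative arrays, not the library's internal representation):

import numpy as np

YgX = np.array([[0.9, 0.1],   # row x=0: Pr(Y|X=0)
                [0.2, 0.8]])  # row x=1: Pr(Y|X=1)
ZgY = np.array([[0.7, 0.3],
                [0.4, 0.6]])

ZgX = YgX @ ZgY               # Pr(Z|X) = sum_y Pr(Y|X) Pr(Z|Y)
assert np.allclose(ZgX.sum(axis=1), 1.0)  # rows still sum to 1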

cmpy.infotheory.conditional_distributions.channel_YZ(YgX, ZgX, commensurate=False)

Given Pr(Y|X) and Pr(Z|X), return Pr(Z|Y).

Suppose we have:

X -> Y and X -> Z.

We invert the X -> Y channel to give Y -> X and then we have:

Y -> X -> Z

which forms a Markov chain. This function ‘tunnels’ Y to Z to yield Pr(Z|Y).

Parameters:

YgX : conditional distribution

The channel from X to Y. The marginal distribution must be defined.

ZgX : conditional distribution

The channel from X to Z.

commensurate : bool

A boolean controlling how the Y events are related to the Z events. Both YgX and ZgX have a _commensurate attribute. If this attribute is True for either conditional distribution, then the value of commensurate is ignored. If both attributes are False, and commensurate is True, then the events for Y and Z are assumed to be commensurate with each other, and the _commensurate attribute for ZgY will be set to True (and False otherwise).

Returns:

ZgY : conditional distribution

The channel from Y to Z.

Raises:

InfoTheoryException :

Raised if YgX and ZgX are not of the same type. Raised if YgX does not have a marginal defined.
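
In matrix terms, this is a Bayesian inversion followed by the same product that tunnel() performs. A sketch of the logic with illustrative NumPy arrays (not the library's internals):

import numpy as np

pX = np.array([0.5, 0.5])          # marginal over X (must be defined)
YgX = np.array([[0.9, 0.1],
                [0.2, 0.8]])       # rows: Pr(Y|X=x)
ZgX = np.array([[0.7, 0.3],
                [0.4, 0.6]])       # rows: Pr(Z|X=x)

# Invert X -> Y via Bayes: Pr(X=x|Y=y) = Pr(Y=y|X=x) Pr(X=x) / Pr(Y=y).
pY = pX @ YgX
XgY = (YgX * pX[:, None]).T / pY[:, None]

# Tunnel Y -> X -> Z: Pr(Z|Y) = sum_x Pr(Z|X=x) Pr(X=x|Y).
ZgY = XgY @ ZgX
assert np.allclose(ZgY.sum(axis=1), 1.0)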

The plotting Module

The infotheory Module

Information Theory

Information theory is a study concerning the quantification of information.

Various measures are typically defined, of which as many as could possibly be useful are defined here.

cmpy.infotheory.infotheory.channel_capacity(cond_dist, rtol)
cmpy.infotheory.infotheory.conditional_entropy(joint, marginal, moment=1)
cmpy.infotheory.infotheory.conditional_entropy_log(joint, marginal, moment=1)
cmpy.infotheory.infotheory.conditional_entropy_symbolic(joint, marginal, moment=1)
cmpy.infotheory.infotheory.conditional_entropy_mc(marginal, conditional, moment=1)
cmpy.infotheory.infotheory.conditional_entropy_mc_log(marginal, conditional, moment=1)
cmpy.infotheory.infotheory.conditional_entropy_mc_symbolic(marginal, conditional, moment=1)
cmpy.infotheory.infotheory.cross_entropy(dist1, dist2)
cmpy.infotheory.infotheory.cross_entropy_log(dist1, dist2)
cmpy.infotheory.infotheory.cross_entropy_symbolic(dist1, dist2)
cmpy.infotheory.infotheory.entropy(dist, moment=1)
cmpy.infotheory.infotheory.entropy_log(logdist, moment=1)
cmpy.infotheory.infotheory.entropy_symbolic(dist, moment=1)
cmpy.infotheory.infotheory.mutual_information(joint, marginal1, marginal2)
cmpy.infotheory.infotheory.mutual_information_log(joint, marginal1, marginal2)
cmpy.infotheory.infotheory.mutual_information_symbolic(joint, marginal1, marginal2)
cmpy.infotheory.infotheory.perplexity(dist)
cmpy.infotheory.infotheory.perplexity_log(dist)
cmpy.infotheory.infotheory.perplexity_symbolic(dist)
cmpy.infotheory.infotheory.relative_entropy(dist1, dist2)
cmpy.infotheory.infotheory.relative_entropy_log(dist1, dist2)
cmpy.infotheory.infotheory.relative_entropy_symbolic(dist1, dist2)
cmpy.infotheory.infotheory.renyi_entropy(dist, alpha)
cmpy.infotheory.infotheory.renyi_entropy_log(dist, alpha)
cmpy.infotheory.infotheory.renyi_entropy_symbolic(dist, alpha)
cmpy.infotheory.infotheory.renyi_relative_entropy(dist1, dist2, alpha)
cmpy.infotheory.infotheory.renyi_relative_entropy_log(dist1, dist2, alpha)
cmpy.infotheory.infotheory.renyi_relative_entropy_symbolic(dist1, dist2, alpha)
cmpy.infotheory.infotheory.tsallis_entropy(dist, q)
cmpy.infotheory.infotheory.tsallis_entropy_log(dist, q)
cmpy.infotheory.infotheory.tsallis_entropy_symbolic(dist, q)
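
The bare signatures above are linear, log, and symbolic flavors of the same underlying definitions. A sketch of the core linear-probability formulas, using base-2 logarithms as elsewhere in the package (the actual functions take distribution objects, not bare arrays):

import numpy as np

def entropy(p, moment=1):
    # \sum(p_i log^k(1/p_i)), skipping zero-probability events
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return np.sum(p * (-np.log2(p)) ** moment)

def relative_entropy(p, q):
    # D(p || q) = \sum(p log(p/q))
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0
    return np.sum(p[mask] * np.log2(p[mask] / q[mask]))

def mutual_information(joint, marginal1, marginal2):
    # I[X;Y] = H[X] + H[Y] - H[X,Y]
    return (entropy(marginal1) + entropy(marginal2)
            - entropy(np.ravel(joint)))

pXY = np.array([[0.25, 0.25],
                [0.25, 0.25]])     # two independent fair bits
print(mutual_information(pXY, pXY.sum(1), pXY.sum(0)))  # 0.0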
cmpy.infotheory.infotheory.H(dist, logs=False)

A function that will try its best to return the entropy of a distribution, no matter what format that distribution is in.

Parameters:

dist : object

Something that can be turned into an array either via an array() or values() method, or by passing it as an argument to numpy.array().

logs : bool, optional

If True, then dist is assumed to be log-probabilities.

Returns:

H : float

dist's entropy.

Raises:

TypeError :

Raised if dist ends up being a scalar when passed to numpy.array().
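
A hypothetical session (the entropies shown assume the package's base-2 logarithms):

>>> import numpy as np
>>> H([0.5, 0.5])                      # a plain sequence
1.0
>>> H({'H': 0.25, 'T': 0.75})          # anything with a values() method
0.8112781244591328
>>> H(np.log2([0.25, 0.25, 0.25, 0.25]), logs=True)
2.0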

The exceptions Module

Information Theory Exceptions

Exceptions related to information theory.

exception cmpy.infotheory.exceptions.InfoTheoryException(*args)

Bases: cmpy.exceptions.CMPyException

Generic information theory exception.

exception cmpy.infotheory.exceptions.InvalidDistribution(*args)

Bases: cmpy.infotheory.exceptions.InfoTheoryException

Exception thrown when a distribution is not normalized.

exception cmpy.infotheory.exceptions.InvalidProbability(*args)

Bases: cmpy.exceptions.CMPyException

Exception thrown when a probability is not in [0,1].

The memory Module

Implementation of ideas in:
“On the Generative Nature of Prediction” Wolfgang Lohr & Nihat Ay

But modifying their notion of “predictive” to match our “prescient”.

class cmpy.infotheory.memory.Memory(memmap, process)

Bases: object

An implementation of memory.

A Memory object is a mapping from histories in a distribution to memory states.

Methods

predictive_information() Returns the predictive information.
predictive_state_information() Returns the predictive state information.
predictive_information()

Returns the predictive information.

The predictive information is the mutual information between the past and the future, also known as the excess entropy.

Returns:

pi : float

The predictive information.

predictive_state_information()

Returns the predictive state information.

The predictive state information is the mutual information between the states and the true future, rather than the generated future. This quantity can be computed from the memory mapping alone.

Returns:

psi : float

The predictive state information.
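
As a toy illustration of the quantity (not the Memory internals), the predictive information of a joint past/future distribution is just a mutual information:

import numpy as np

def h(p):
    p = p[p > 0]
    return -(p * np.log2(p)).sum()

# Hypothetical joint distribution over (past word, future word).
joint = np.array([[0.4, 0.1],
                  [0.1, 0.4]])
past, future = joint.sum(axis=1), joint.sum(axis=0)

pi = h(past) + h(future) - h(joint.ravel())   # I[Past; Future]
print(pi)   # about 0.278 bits of excess entropy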

class cmpy.infotheory.memory.Model(memory, gen)

Bases: object

A model consists of a memory and a generative process.

Note, a model is a channel.

Methods

generated_information() Returns the generated information, I[Past; Generated Future].
generated_state_information() Returns the generated state information, I[State; Generated Future].
is_prescient() Returns True if the model is prescient with respect to the process.
generated_information()

Returns the generated information, I[Past; Generated Future].

generated_state_information()

Returns the generated state information, I[State; Generated Future].

is_prescient()

Returns True if the model is prescient with respect to the process.

A model is prescient if I[Past; Generated Future] = I[Past; Future].

Note: This property was called “predictive” in the paper. To be more precise, if the probability functions were the same, then the model was deemed “predictive”. The information-theoretic quantities being equal is a consequence of the probability functions being the same.

Generally, though, when we say a presentation is prescient we mean that it generates the same process language. So prescience is a strict requirement on the probability distributions, not merely an information-theoretic quantity.

cmpy.infotheory.memory.example_1p1()

Example 1.1 from “On the Generative Nature of Prediction”.

The distributions Module

Probability Distributions

Distributions defines a number of different probability distribution classes for use in CMPy.

class cmpy.infotheory.distributions.Distribution(dist=None, wildcard='*', event_type=<class 'cmpy.infotheory.events.JointEvent'>)

Bases: cmpy.infotheory.distributions.EntropicDistribution

Distribution

Represents a standard probability distribution.

Examples

>>> coin = Distribution({0: 0.5, 1: 0.5})
>>> bias_coin = Distribution({0: 0.1, 1: 0.9})
>>> rrx = Distribution({'000': 0.25, '011': 0.25, '101': 0.25, '110': 0.25})
>>> die = Distribution([1/6]*6)
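
Continuing these examples (a hypothetical session; entropies assume base-2 logarithms):

>>> coin.entropy()
1.0
>>> rrx.entropy()
2.0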

Methods

add_probs(probs) Returns the appropriate summation of probs for the distribution.
approx_equal(other[, atol, rtol])
array([order]) Return a NumPy array of the probabilities of this distribution.
binding_information(*events) Return the binding information [abdallah2010].
clean_events() Remove wildcards from events.
coalesce(indices[, simple]) Coalesce random variables to form a new joint distribution.
condition_on(indices[, target, clean, trim]) Condition the distribution on the events as indexed by indices.
conditional_entropy(indices[, moment]) Computes the conditional entropy H[Y|X], where this distribution is the joint (Y, X) and indices specifies the marginal X.
conditional_mutual_information(*events) Return the multivariate conditional mutual information.
cross_entropy(other) Return the cross entropy: \sum(p log(1/q)).
entropy([moment]) Compute the entropy of the distribution: \sum(p_i log^k(1/p_i))
events([raw]) Return the events of the distribution.
interaction_information(*events) Return the interaction information as defined in [jakulin2004].
invert_prob(prob) Returns the probability appropriately inverted for use in mult_probs().
iter_eventprobs([raw]) Returns an iterator over (event, probability) pairs.
iter_events([raw]) Returns an iterator over events.
iter_probs() Returns an iterator over probabilities.
marginal(indices[, clean]) Return a marginal distribution.
marginal_from_event(event[, clean]) Returns a marginal distribution, inferred from a wildcarded event.
marginalize(indices[, clean]) Marginalize a joint distribution.
mult_probs(probs) Returns the appropriate product of probs for the distribution.
mutual_information(*events) Return the multivariate mutual information as defined by [yeung1991].
normalize() Normalize the distribution.
perplexity() Returns the perplexity of a distribution: \prod(p_i^{-p_i})
probs([order]) Return a NumPy array of the probabilities of this distribution.
reduced() Returns a distribution with all masked indices removed.
reindexed(indices[, length]) Returns a new distribution by reindexing the unmasked indices.
relative_entropy(other) Returns the relative entropy between self and other: \sum(p log(p/q)).
renyi_entropy([alpha]) Compute the Renyi entropy of the distribution.
renyi_relative_entropy(other[, alpha]) Returns the Renyi relative entropy between self and other.
residual_entropy(*events) Return the residual entropy [abdallah2010].
sample() Draw an event from this distribution.
samples(n) Draw events from this distribution.
sum() Return the sum of the distribution’s probabilities.
to_dict([single]) Returns a dictionary of the distribution with raw, reduced events.
to_event(event) Converts event into a formal event instance.
to_logdist() Returns a log-distributed distribution equivalent to this one.
total_correlation(*events) Return the total correlation as defined in [watanabe1960].
trim() Remove events which occur with _null_prob.
tsallis_entropy([q]) Returns the Tsallis entropy of the distribution.
validate() Verify that the probabilities of this distribution sum to unity, and that each probability is in [0, 1].
wildcard_events() Adds wildcards to events, if they have been removed.
normalize()

Normalize the distribution.

Returns:

total : float

The normalization constant used to normalize the distribution.

Notes

This is an in-place operation.

Examples

>>> d = Distribution({'0': 1, '1': 2})
>>> d.normalize()
3.0
>>> d
Distribution:
{'0': 0.3333333333333333, '1': 0.6666666666666666}
to_logdist()

Returns a log-distributed distribution equivalent to this one.

Returns:

log_dist : LogDistribution

A log-distributed probability distribution.

Examples

>>> d = Distribution({'000': 0.25,
...                   '011': 0.25,
...                   '101': 0.25,
...                   '110': 0.25})
>>> d.to_logdist()
LogDistribution:
{'011': -2.0, '000': -2.0, '110': -2.0, '101': -2.0}
validate()

Verify that the probabilities of this distribution sum to unity, and that each probability is in [0, 1].

Returns:

valid : bool

True if the distribution is valid.

Raises:

InvalidDistribution :

Raised in the event that the probabilities do not sum to unity.

InvalidProbability :

Raised in the event that a probability is outside of [0, 1].

Notes

The raising of InvalidDistribution is likely a sign that the distribution is not normalized – try running dist.normalize().
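
For example, continuing the normalize() example above (hypothetical session; the exact exception message may differ):

>>> d = Distribution({'0': 0.2, '1': 0.2})
>>> d.validate()                        # probabilities sum to 0.4
Traceback (most recent call last):
    ...
InvalidDistribution: ...
>>> d.normalize()
0.4
>>> d.validate()
True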

class cmpy.infotheory.distributions.LogDistribution(dist=None, wildcard='*', event_type=<class 'cmpy.infotheory.events.JointEvent'>)

Bases: cmpy.infotheory.distributions.EntropicDistribution

LogDistribution

This class represents a distribution which stores log(p) instead of p itself in order to retain numerical accuracy for very small p.

Examples

>>> coin = LogDistribution([-1, -1])
>>> bias_coin = LogDistribution({'H': -0.15, 'T': -3.34})

Methods

add_probs(probs) Returns the appropriate summation of probs for the distribution.
approx_equal(other[, atol, rtol])
array([order]) Return a NumPy array of the probabilities of this distribution.
binding_information(*events) Return the binding information [abdallah2010].
clean_events() Remove wildcards from events.
coalesce(indices[, simple]) Coalesce random variables to form a new joint distribution.
condition_on(indices[, target, clean, trim]) Condition the distribution on the events as indexed by indices.
conditional_entropy(indices[, moment]) Computes the conditional entropy H[Y|X], where this distribution is the joint (Y, X) and indices specifies the marginal X.
conditional_mutual_information(*events) Return the multivariate conditional mutual information.
cross_entropy(other) Return the cross entropy: \sum(p log(1/q)).
entropy([moment]) Compute the entropy of the distribution: \sum(p_i log^k(1/p_i))
events([raw]) Return the events of the distribution.
interaction_information(*events) Return the interaction information as defined in [jakulin2004].
invert_prob(prob) Returns the probability appropriately inverted for use in mult_probs().
iter_eventprobs([raw]) Returns an iterator over (event, probability) pairs.
iter_events([raw]) Returns an iterator over events.
iter_probs() Returns an iterator over probabilities.
marginal(indices[, clean]) Return a marginal distribution.
marginal_from_event(event[, clean]) Returns a marginal distribution, inferred from a wildcarded event.
marginalize(indices[, clean]) Marginalize a joint distribution.
mult_probs(probs) Returns the appropriate product of probs for the distribution.
mutual_information(*events) Return the multivariate mutual information as defined by [yeung1991].
normalize() Normalize the distribution.
perplexity() Returns the perplexity of a distribution: \prod(p_i^{-p_i})
probs([order]) Return a NumPy array of the probabilities of this distribution.
reduced() Returns a distribution with all masked indices removed.
reindexed(indices[, length]) Returns a new distribution by reindexing the unmasked indices.
relative_entropy(other) Returns the relative entropy between self and other: \sum(p log(p/q)).
renyi_entropy([alpha]) Compute the Renyi entropy of the distribution.
renyi_relative_entropy(other[, alpha]) Returns the Renyi relative entropy between self and other.
residual_entropy(*events) Return the residual entropy [abdallah2010].
sample() Draw an event from this distribution.
samples(n) Draw events from this distribution.
sum() Return the sum of the distribution’s probabilities.
to_dict([single]) Returns a dictionary of the distribution with raw, reduced events.
to_dist() Returns a linear-distributed distribution equivalent to this one.
to_event(event) Converts event into a formal event instance.
total_correlation(*events) Return the total correlation as defined in [watanabe1960].
trim() Remove events which occur with _null_prob.
tsallis_entropy([q]) Returns the Tsallis entropy of the distribution.
validate() Verify that the probabilities of this distribution sum to unity after being exponentiated, and that each probability is in [-inf, 0].
wildcard_events() Adds wildcards to events, if they have been removed.
static add_probs(probs)

Returns the appropriate summation of probs for the distribution.

Parameters:

probs : array of floats

Variable number of log probabilities to add.

Returns:

combined : float

The passed in probabilities, added appropriately. In this case using logaddexp2().
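
For example, with the base-2 log-probabilities used here (assuming NumPy's logaddexp2, which the summary above names):

>>> import numpy as np
>>> print(np.logaddexp2(-2.0, -2.0))    # log2(2**-2 + 2**-2)
-1.0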

static invert_prob(prob)

Returns the probability appropriately inverted for use in mult_probs().

Parameters:

prob : float

The probability to invert.

Returns:

iprob : float

The inverted probability. In this case, -prob.

static mult_probs(probs)

Returns the appropriate product of probs for the distribution.

Parameters:

probs : array of floats

Variable number of probabilities to multiply.

Returns:

combined : float

The passed in probabilities, multiplied appropriately. In this case using sum().

normalize()

Normalize the distribution.

Returns:

total : float

The normalization log-constant used to normalize the distribution.

Notes

This is an in-place operation.

Examples

>>> d = LogDistribution({'0': -4, '1': -4})
>>> d.normalize()
-3.0
>>> d
LogDistribution:
{'0': -1.0, '1': -1.0}
to_dist()

Returns a linear-distributed distribution equivalent to this one.

Returns:

dist : Distribution

A linear-distributed probability distribution.

Examples

>>> ld = LogDistribution({0: -1, 1: -1})
>>> ld.to_dist()
Distribution:
{(0,): 0.5, (1,): 0.5}
validate()

Verify that the probabilities of this distribution sum to unity after being exponentiated, and that each probability is in [-inf, 0].

Returns:

valid : bool

True if the distribution is valid.

Raises:

InvalidDistribution :

Raised in the event that the probabilities do not sum to unity after being exponentiated.

InvalidProbability :

Raised in the event that a probability is outside of [-inf, 0].

Notes

The raising of InvalidDistribution is likely a sign that the distribution is not normalized – try running dist.normalize().

class cmpy.infotheory.distributions.SymbolicDistribution(dist=None, wildcard='*', event_type=<class 'cmpy.infotheory.events.JointEvent'>)

Bases: cmpy.infotheory.distributions.EntropicDistribution

SymbolicDistribution

This distribution relies upon SymPy, so you’d best have that. It allows one to define symbolic probabilities, such as a biased coin being {‘H’: p, ‘T’: 1-p}.

Examples

>>> p = sympy.Symbol('p')
>>> bias_coin = SymbolicDistribution({'H': p, 'T': 1-p})

Methods

add_probs(probs) Returns the appropriate summation of probs for the distribution.
approx_equal(other[, atol, rtol])
array([order]) Return a NumPy array of the probabilities of this distribution.
binding_information(*events) Return the binding information [abdallah2010].
clean_events() Remove wildcards from events.
coalesce(indices[, simple]) Coalesce random variables to form a new joint distribution.
condition_on(indices[, target, clean, trim]) Condition the distribution on the events as indexed by indices.
conditional_entropy(indices[, moment]) Computes the conditional entropy H[Y|X], where this distribution is the joint (Y, X) and indices specifies the marginal X.
conditional_mutual_information(*events) Return the multivariate conditional mutual information.
cross_entropy(other) Return the cross entropy: \sum(p log(1/q)).
entropy([moment]) Compute the entropy of the distribution: \sum(p_i log^k(1/p_i))
events([raw]) Return the events of the distribution.
interaction_information(*events) Return the interaction information as defined in [jakulin2004].
invert_prob(prob) Returns the probability appropriately inverted for use in mult_probs().
iter_eventprobs([raw]) Returns an iterator over (event, probability) pairs.
iter_events([raw]) Returns an iterator over events.
iter_probs() Returns an iterator over probabilities.
marginal(indices[, clean]) Return a marginal distribution.
marginal_from_event(event[, clean]) Returns a marginal distribution, inferred from a wildcarded event.
marginalize(indices[, clean]) Marginalize a joint distribution.
mult_probs(probs) Returns the appropriate product of probs for the distribution.
mutual_information(*events) Return the multivariate mutual information as defined by [yeung1991].
normalize() Normalize this distribution.
perplexity() Returns the perplexity of a distribution: \prod(p_i^{-p_i})
probs([order]) Return a NumPy array of the probabilities of this distribution.
reduced() Returns a distribution with all masked indices removed.
reindexed(indices[, length]) Returns a new distribution by reindexing the unmasked indices.
relative_entropy(other) Returns the relative entropy between self and other: \sum(p log(p/q)).
renyi_entropy([alpha]) Compute the Renyi entropy of the distribution.
renyi_relative_entropy(other[, alpha]) Returns the Renyi relative entropy between self and other.
residual_entropy(*events) Return the residual entropy [abdallah2010].
sample() Draw an event from this distribution.
samples(n) Draw events from this distribution.
sum() Return the sum of the distribution’s probabilities.
to_dict([single]) Returns a dictionary of the distribution with raw, reduced events.
to_dist(subs_dict) Convert this symbolic distribution to a linear distribution by substituting floats for the symbols in the probabilities using subs_dict.
to_event(event) Converts event into a formal event instance.
to_logdist(subs_dict) Convert this symbolic distribution to a log-distribution by substituting floats for the symbols in the probabilities using subs_dict.
total_correlation(*events) Return the total correlation as defined in [watanabe1960].
trim() Remove events which occur with _null_prob.
tsallis_entropy([q]) Returns the Tsallis entropy of the distribution.
validate() Verify that assuming 0 <= p <= 1 (and so on for each symbol), the sum of each probability is unity.
wildcard_events() Adds wildcards to events, if they have been removed.
static add_probs(probs)

Returns the appropriate summation of probs for the distribution.

Parameters:

probs : list of Symbol instances

Variable number of probabilities to add.

Returns:

combined : Symbol

The passed in probabilities, added appropriately. In this case using sum.

static mult_probs(probs)

Returns the appropriate product of probs for the distribution.

Parameters:

probs : list of Symbol instances

Variable number of probabilities to multiply.

Returns:

combined : Symbol

The passed in probabilities, multiplied appropriately. In this case using operator.mul.

normalize()

Normalize this distribution.

to_dist(subs_dict)

Convert this symbolic distribution to a linear distribution by substituting floats for the symbols in the probabilities using subs_dict.

Parameters:

subs_dict : dict

A dictionary containing (symbol, float) pairs. Each symbol will be replaced by a float according to this dictionary.

Returns:

dist : Distribution

A distribution with values taken from subs_dict.

to_logdist(subs_dict)

Convert this symbolic distribution to a log-distribution by substituting floats for the symbols in the probabilities using subs_dict.

Parameters:

subs_dict : dict

A dictionary containing (symbol, float) pairs. Each symbol will be replaced by a float according to this dictionary. The probabilities in the dictionary should be linearly distributed: they will substitute in as log(p) and then converted into a float.

Returns:

dist : LogDistribution

A log distribution with values taken from subs_dict.

trim()

Remove events which occur with _null_prob.

validate()

Verify that assuming 0 <= p <= 1 (and so on for each symbol), the sum of each probability is unity.
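
A sketch of the kind of check this performs (assuming SymPy, as above; not the actual source):

import sympy

p = sympy.Symbol('p')
probs = [p, 1 - p]
# Under the assumption 0 <= p <= 1 each entry is a valid probability,
# and the total simplifies to unity regardless of p.
assert sympy.simplify(sum(probs)) == 1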

class cmpy.infotheory.distributions.GeneralizedDistribution(dist=None, wildcard='*', event_type=<class 'cmpy.infotheory.events.JointEvent'>)

Bases: cmpy.infotheory.distributions.BaseDistribution

GeneralizedDistribution

This distribution demands that the probabilities sum to unity, but allows individual probabilities to lie outside of [0, 1].

Examples

>>> d = GeneralizedDistribution({'A': -0.5, 'B': 1.5})

Methods

add_probs(probs) Returns the appropriate summation of probs for the distribution.
approx_equal(other[, atol, rtol])
array([order]) Return a NumPy array of the probabilities of this distribution.
clean_events() Remove wildcards from events.
coalesce(indices[, simple]) Coalesce random variables to form a new joint distribution.
condition_on(indices[, target, clean, trim]) Condition the distribution on the events as indexed by indices.
events([raw]) Return the events of the distribution.
invert_prob(prob) Returns the probability appropriately inverted for use in mult_probs().
iter_eventprobs([raw]) Returns an iterator over (event, probability) pairs.
iter_events([raw]) Returns an iterator over events.
iter_probs() Returns an iterator over probabilities.
marginal(indices[, clean]) Return a marginal distribution.
marginal_from_event(event[, clean]) Returns a marginal distribution, inferred from a wildcarded event.
marginalize(indices[, clean]) Marginalize a joint distribution.
mult_probs(probs) Returns the appropriate product of probs for the distribution.
normalize() Normalize the distribution.
probs([order]) Return a NumPy array of the probabilities of this distribution.
reduced() Returns a distribution with all masked indices removed.
reindexed(indices[, length]) Returns a new distribution by reindexing the unmasked indices.
sample() Draw an event from this distribution.
samples(n) Draw events from this distribution.
sum() Return the sum of the distribution’s probabilities.
to_dict([single]) Returns a dictionary of the distribution with raw, reduced events.
to_event(event) Converts event into a formal event instance.
trim() Remove events which occur with _null_prob.
validate() Verify that the probabilities of this distribution sum to unity.
wildcard_events() Adds wildcards to events, if they have been removed.
normalize()

Normalize the distribution.

validate()

Verify that the probabilities of this distribution sum to unity.

Returns:

valid : bool

True if the distribution is valid.

Raises:

InvalidDistribution :

Raised in the event that the probabilities do not sum to unity.

Notes

The raising of InvalidDistribution is likely a sign that the distribution is not normalized – try running dist.normalize().

The events Module

Formalized event objects.

Though not strictly required, it is a nice convenience to end users if the displayed representation has the same hash value and would be equal to the true event. This allows users to type in whatever is displayed, and it will transparently work with its associated distribution.
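
A minimal sketch of that convenience (hypothetical class, not the package's implementation): an event that hashes and compares like its displayed form can be looked up with the plain string a user types.

class TinyEvent(str):
    # Hypothetical: by subclassing str, the event is equal to (and
    # hashes like) its displayed representation.
    def pretty(self):
        return ', '.join(self)

dist = {TinyEvent('011'): 0.25}
print(dist['011'])   # 0.25 -- the typed-in string finds the event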

class cmpy.infotheory.events.Event(event, mask=None, cleaned=None, wildcard=None)

Bases: cmpy.infotheory.events.BaseEvent

Standard event object for distributions.

Methods

clean() Hides any masked indices when iterating over the event.
copy()
equiv(other) Returns True if other is an equivalent event.
invert_mask() Inverts the masked indices, in-place.
iter([cleaned]) Returns an iterator over the event.
mask(indices) Masks indices, in-place.
pretty() Returns a ‘pretty’ representation of the event.
raw() Returns the original event, unmodified.
reduced([raw]) Returns a new event with all masked indices removed.
unmask([indices]) Unmasks indices, in-place.
wildcard() Shows any masked indices when iterating over the event.
pretty()

Returns a ‘pretty’ representation of the event.

class cmpy.infotheory.events.JointEvent(event, mask=None, cleaned=None, wildcard=None)

Bases: cmpy.infotheory.events.BaseEvent

Standard joint event object for distributions.

Methods

clean() Hides any masked indices when iterating over the event.
copy()
equiv(other) Returns True if other is an equivalent event.
invert_mask() Inverts the masked indices, in-place.
iter([cleaned]) Returns an iterator over the event.
mask(indices) Masks indices, in-place.
pretty([simple])
raw() Returns the original event, unmodified.
reduced([raw]) Returns a new event with all masked indices removed.
reindexed(indices[, length]) Returns a new event by reindexing the unmasked indices.
unmask([indices]) Unmasks indices, in-place.
wildcard() Shows any masked indices when iterating over the event.
pretty(simple=None)
reindexed(indices, length=None)

Returns a new event by reindexing the unmasked indices.

Parameters:

indices : list

A list of integers specifying how the old event’s unmasked indices, as read from left to right, should appear in the new event.

length : { int | None }

The true length of the new event. This should be larger than the largest index in indices, and is useful when one wants to postpad the new event with wildcards.

Returns:

event : event

A joint event with the indices reindexed.

Raises:

InfoTheoryException :

When not enough indices are specified.

Notes

The meta information of the new event will be identical (in the Python “is” sense) to the original meta information.
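
A pure-Python sketch of the reindexing rule (hypothetical helper, not the library code):

def reindex(values, indices, length=None, wildcard='*'):
    # Place the old unmasked values, read left to right, at the
    # positions named by indices; postpad to length with wildcards.
    length = length if length is not None else max(indices) + 1
    out = [wildcard] * length
    for value, index in zip(values, indices):
        out[index] = value
    return tuple(out)

print(reindex(('a', 'b'), [1, 0]))       # ('b', 'a')
print(reindex(('a', 'b'), [0, 1], 4))    # ('a', 'b', '*', '*')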

The convert Module

Special case helper functions.

cmpy.infotheory.convert.validate_and_convert(dist, logs)

Validates and converts a distribution to the type specified by logs.

Parameters:

dist : Distribution, LogDistribution

The distribution to be validated and converted, if necessary.

logs : bool

The desired type of distribution. If True, then we convert to a log distribution, if necessary; if False, we convert to a linear distribution.

Returns:

converted_dist : {Distribution, LogDistribution}

The converted distribution.
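
A sketch of the dispatch logic (using the classes and methods documented above; not the actual source):

from cmpy.infotheory.distributions import Distribution, LogDistribution

def validate_and_convert_sketch(dist, logs):
    dist.validate()                  # may raise InvalidDistribution, etc.
    if logs and isinstance(dist, Distribution):
        return dist.to_logdist()     # linear -> log
    if not logs and isinstance(dist, LogDistribution):
        return dist.to_dist()        # log -> linear
    return dist                      # already the requested type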