Algorithms Documentation¶
This page contains the Algorithms Package documentation.
Subpackages¶
The emmo
Module¶
The mealybe
Module¶
Block entropies from MealyHMMs that do not have a start node.

class cmpy.machines.algorithms.mealybe.DynamicBlockEntropies(machine)¶
Bases: object
Block entropies from MealyHMMs that do not have a start node.
Thus, we build the mixed state representation of the machine and use it to calculate block entropies.
Methods

block_entropies([lengths, ndist, logs]): Generator of block entropies from the machine.
condblock_entropies([lengths, ndist, logs]): Generator of conditional block entropies from the machine.
current_mixedstate_entropy(): Returns the current mixed state entropy.
block_entropies(lengths=None, ndist=None, logs=None)¶
Generator of block entropies from the machine.
Parameters: lengths : list, None
A list of integers, specifying the desired block entropies. We yield only at the word lengths appearing in lengths. Note, the list must be sorted and not contain duplicates. If no list is specified, then we generate block entropies at every length, indefinitely.
ndist : Distribution, LogDistribution
The initial node distribution. If None, then one is obtained from the machine if possible.
logs : bool
Specifies if logs are used in the calculation of block entropies.
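As a sketch of what this generator computes, the block entropy H(L) = -sum_w Pr(w) log2 Pr(w) can be evaluated directly from a machine's labeled transition matrices. The two-state Golden Mean machine below is a hypothetical example in plain NumPy, not the cmpy API:

```python
import itertools

import numpy as np

# Golden Mean Process as a two-node Mealy HMM (hypothetical example).
# T[k][i, j] = Pr(j, k | i): emit symbol k while moving from node i to node j.
T = {0: np.array([[0.5, 0.0],
                  [1.0, 0.0]]),
     1: np.array([[0.0, 0.5],
                  [0.0, 0.0]])}
pi = np.array([2/3, 1/3])  # stationary node distribution

def word_prob(word):
    """Pr(word) = pi . T[w1] ... T[wL] . 1"""
    v = pi.copy()
    for k in word:
        v = v @ T[k]
    return v.sum()

def block_entropy(L):
    """H(L) = -sum over length-L words of Pr(w) log2 Pr(w)."""
    H = 0.0
    for word in itertools.product([0, 1], repeat=L):
        p = word_prob(word)
        if p > 0:
            H -= p * np.log2(p)
    return H

for L in range(1, 4):
    print(L, block_entropy(L))
```

For this machine H(2) equals log2(3), and the increments H(L+1) - H(L) settle to the entropy rate 2/3 immediately, since the process is Markov of order 1.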

condblock_entropies(lengths=None, ndist=None, logs=None)¶
Generator of conditional block entropies from the machine.
Parameters: lengths : list, None
A list of integers, specifying the desired block entropies. We yield only at the word lengths appearing in lengths. Note, the list must be sorted and not contain duplicates. If no list is specified, then we generate block entropies at every length, indefinitely.
ndist : Distribution, LogDistribution
The initial node distribution. If None, then we use the stationary distribution of the process.
logs : bool
Specifies if logs are to be used in the calculation of block entropies.

current_mixedstate_entropy()¶
Returns the current mixed state entropy.
This quantity is:
H[R_N | R_0 = d]
where d is the initial distribution of the original machine.


cmpy.machines.algorithms.mealybe.block_entropies(machine, lengths, mse=False, ndist=None, logs=None)¶
Returns a NumPy array of block entropies from the machine.
Parameters: machine : MealyHMM
The machine used to compute block entropies.
lengths : list
A list of integers, specifying the desired block entropies. The list is assumed to be sorted and not contain duplicates.
mse : bool
If True, then the mixed state entropies are included in the output.
ndist : Distribution, LogDistribution
The initial node distribution. If None, then we use the stationary distribution of the process.
logs : bool
Specifies if logs are being used to calculate the block entropies.
Returns: be : NumPy array, shape (2 or 3, len(lengths))
An array of floats representing the block entropies at the requested lengths. The columns correspond to the entries of lengths. The first row contains block entropies of the words alone, and the second row contains block entropies of the words jointly with a node in machine. If mse is True, then the third row contains the analogous entropies computed over the mixed states R_L rather than the nodes in the original machine.
Notes
One can obtain the conditional entropy via: be[1] - be[0].

cmpy.machines.algorithms.mealybe.block_entropies_markov(machine, max_length, ndist=None, logs=None)¶
Returns a NumPy array of block entropies for 0 <= L <= max_length.
During the calculation, we check if the process is Markov. If so, then we stop updating distributions and quickly return the rest of the block entropies. This provides a performance benefit especially when the mixed state presentation has a large number of recurrent states.
Parameters: machine : MealyHMM
The machine used to compute block entropies. It must be a unifilar presentation, but the algorithm does not check that this is so.
max_length : int
An integer specifying the largest block entropy that will be computed.
ndist : Distribution, LogDistribution
The initial node distribution. If None, then we use the stationary distribution of the process.
logs : bool
Specifies if logs are being used to calculate the block entropies.
Returns: be : NumPy array, shape (2, max_length+1)
An array of floats representing the block entropies at each length 0 <= L <= max_length. The first row contains block entropies of the words alone, and the second row contains block entropies of the words jointly with a node in machine.
order : int, None
An integer representing the Markov order if the process was discovered to be Markov. Otherwise, we return None to signify that the Markov order was not determined to be finite up to max_length.

cmpy.machines.algorithms.mealybe.condblock_entropies(machine, lengths, ndist=None, logs=None)¶
Returns a NumPy array of conditional block entropies from the machine.
Parameters: machine : MealyHMM
The machine used to compute block entropies.
lengths : list
A list of integers, specifying the desired block entropies. The list is assumed to be sorted and not contain duplicates.
ndist : Distribution, LogDistribution
The initial node distribution. If None, then we use the stationary distribution of the process.
logs : bool
Specifies if logs are being used to calculate the block entropies.
Returns: be : NumPy array, shape (2, len(lengths))
An array of floats representing the conditional block entropies at the requested lengths. The columns correspond to the entries of lengths. The first and second rows are the two conditional block entropy quantities, each conditioned on random variables over nodes in machine.
Notes
One can obtain their difference via: be[1] - be[0].
The mealyms
Module¶
Algorithms for constructing mixed state presentations.
This implementation assumes we are working with MealyHMMs.

class cmpy.machines.algorithms.mealyms.BaseMixedState(msr, word=None)¶
Bases: object
A representation of a mixed state.
Methods

copy(): Returns a copy of the mixed state.
covering(): Returns the basis states represented by the mixed state.
is_log()
copy()¶
Returns a copy of the mixed state.
Only the mixed state representation is copied deeply.

covering()¶
Returns the basis states represented by the mixed state.

is_log()¶


class cmpy.machines.algorithms.mealyms.BidirectionalMachine(data=None, name='', **attr)¶
Bases: cmpy.machines.mealyhmm.MealyHMM
Methods

labeled_transition_matrices([node_ordering, ...]): Returns the labeled transition matrices.
stationary_distribution([logs, sparse, ...]): Returns the stationary distribution.
labeled_transition_matrices(node_ordering=None, logs=False, cdist=False)¶
Returns the labeled transition matrices.
Each element in the matrix is given by:
T_{ij}^k = Pr(j, k | i)
where i and j are nodes and k is an output symbol. The value specifies the probability of seeing symbol k and transitioning to node j, given that the current state is node i.
There is one matrix for each symbol in the alphabet.
Parameters: node_ordering : list
An ordering of nodes in the machine.
logs : bool
If True, then the base-2 log of the labeled transition matrices is returned.
cdist : bool
If True, then return the labeled transition matrices as a formal conditional distribution from cmpy.infotheory.
Returns: ltm : dict
A dictionary of NumPy arrays keyed by output symbols. The values are the labeled transition matrices.
norder : list
List of nodes, specifying the order of the rows/cols of the labeled transition matrices. This is returned only if node_ordering is None.
Raises: CMPyException :
If a node in node_ordering does not exist in the machine.
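A minimal sketch of the returned structure, using a hypothetical two-node, two-symbol machine in plain NumPy rather than the cmpy API: summing the labeled matrices over symbols recovers the row-stochastic node-to-node transition matrix.

```python
import numpy as np

# Hypothetical labeled transition matrices for a two-node, two-symbol machine:
# T[k][i, j] = Pr(j, k | i), the probability of emitting symbol k while
# transitioning from node i to node j.
T = {0: np.array([[0.5, 0.0],
                  [1.0, 0.0]]),
     1: np.array([[0.0, 0.5],
                  [0.0, 0.0]])}

# Summing over symbols recovers the node transition matrix,
# which must be row stochastic.
ntm = sum(T.values())
print(ntm.sum(axis=1))  # each row sums to 1
```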

stationary_distribution(logs=None, sparse=None, node_ordering=None, meta=False, event_type=None)¶
Returns the stationary distribution.
The stationary distribution is defined to be the distribution which satisfies d.T = d, where d is a distribution. This distribution is guaranteed to be unique if the machine is irreducible.
The stationary distribution is also unique for reducible machines which have only a single attracting component. However, there can now be other nonstationary distributions which generate stationary process languages. To see this, consider any eM. One choice of a distribution which generates a stationary process is to place all probability in the start node. Another choice is to appropriately place all probability in the recurrent causal states. Although both distributions generate a stationary process, only the latter corresponds to a stationary distribution in the eigenvector sense.
When there are multiple attracting components, the stationary distribution is not unique. In fact, the stationary distribution of any attracting component, extended to the entire machine, qualifies as a stationary distribution for the nonergodic machine. Indeed, any mixture of the stationary distributions from each ergodic component is itself a stationary distribution. Given an initial node distribution, one can calculate a related distribution using averaged_distribution().
Note: the stationary distribution does not depend on the initial node distribution. Thus, periodic structures will have a stationary distribution, even though it is possible to choose an initial node distribution such that the generated process is not stationary. The stationary distribution simply tells us that if the initial node distribution is set equal to the stationary distribution, then the generated process will be stationary.
Parameters: logs : bool, None
Specifies whether the returned distribution should be a log or standard distribution. If logs is None, then the value is taken from cmpyParams[‘logs’].
sparse : bool, None
Specifies whether nodes with zero probability should be included in the stationary distribution. If unspecified, then the value is taken from cmpyParams[‘sparse’]. Note, this parameter is only relevant if node_ordering is None.
node_ordering : list, None
If specified, then the distribution will be returned as a NumPy array. The list must include every node in the graph.
meta : bool
When True, the node transition matrix and node ordering that were used in the computation are stored in a dictionary, which is returned, under the keys ‘ntm’ and ‘node_ordering’.
Returns: pi : Distribution, LogDistribution, NumPy array
The stationary distribution. Unless a node ordering is specified, the return value will be a distribution object. When a node ordering is specified, the return value will be a NumPy array.
meta : dict
Dictionary of useful quantities used during the calculation. This is returned only if meta is True.
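The eigenvector definition d.T = d can be sketched in plain NumPy; the two-node transition matrix here is a hypothetical example, not the cmpy API.

```python
import numpy as np

# Row-stochastic node transition matrix for a hypothetical two-node machine.
ntm = np.array([[0.5, 0.5],
                [1.0, 0.0]])

# The stationary distribution d satisfies d @ ntm = d, so it is the left
# eigenvector of ntm with eigenvalue 1, normalized to sum to 1.
vals, vecs = np.linalg.eig(ntm.T)
k = np.argmin(np.abs(vals - 1.0))
d = np.real(vecs[:, k])
d = d / d.sum()

assert np.allclose(d @ ntm, d)
print(d)  # [2/3, 1/3] for this machine
```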


class cmpy.machines.algorithms.mealyms.LogMixedState(msr, word=None)¶
Bases: cmpy.machines.algorithms.mealyms.BaseMixedState
Methods

copy(): Returns a copy of the mixed state.
covering(): Returns the basis states represented by the mixed state.
is_log()
to_mixedstate(): Converts the LogMixedState to a MixedState.

to_mixedstate()¶
Converts the LogMixedState to a MixedState.
Only the mixed state representation is copied deeply.


cmpy.machines.algorithms.mealyms.MSPBuilder¶
alias of MSPBuilderDict

class cmpy.machines.algorithms.mealyms.MSPBuilderBase(machine, atol=None, rtol=None)¶
Bases: object
A class that builds recurrent mixed state presentations.
The attributes listed below might also be None until they are specified or obtained from the machine when build_msp is called.
Attributes
Methods

build_msp([max_length, forward, transients, ...]): Builds the mixed state presentation.
build_msp_iter([max_length, forward, row, ...]): Builds the mixed state presentation iteratively.
label_nodes([digits, propagate_ms, transients]): Labels the nodes according to their mixed state distributions.
reverse_words(): Reverses the words associated with each MixedState.
build_msp(max_length=30, forward=True, transients=False, row=True, scanLR=True, logs=None, ndist=None, norder=None, create_using=None, propagate_ms=False, label_nodes=False, partial=False)¶
Builds the mixed state presentation.
Parameters: max_length : integer
Some processes have an infinite number of recurrent mixed states. At each L, we construct the mixed states for words of length L, and then we increment L. When L > max_length, we will raise an exception. The exception will contain the current values of the return variables as attributes.
forward : bool
Specifies if the mixed state presentation should be constructed for forward process or the reverse process. By convention, the forward process is defined as the process generated by the input machine. When True, the mixed state presentation is constructed for the forward process.
transients : bool
Specifies if we should return the transient mixed states, in addition to the recurrent mixed states. If False and the process is nonergodic, then shortcuts are attempted to find the recurrent portion even in the case when there are infinite transients.
row : bool
Specifies whether the mixed states should be computed as stochastic row vectors or nonstochastic column vectors. If True, then stochastic row vectors are generated. If False, then nonstochastic column vectors are generated.
scanLR : bool
Specifies the scan direction to be used when interpreting the inducing words of each mixed state. When True, the time order of the word, as seen by the process generated by the mixed state presentation, occurs when scanning from left to right. When False, it occurs when scanning from right to left. When scanLR is False, the algorithm proceeds as if scanLR is True and changes the order of the words only when the algorithm finishes.
logs : bool
If True, then we construct the mixed state presentation by using logarithms in the computations of the mixed states. The default value is taken from cmpyParams[‘logs’].
ndist : Distribution, LogDistribution
The initial node distribution. If None, then we use the stationary distribution of the process.
norder : list
Specifies the order of the elements in the mixed state NumPy arrays. The node ordering must include every node in the machine. If None, then the nodes will be used in sorted order. In either case, the ordering is stored in the norder attribute of the mixed states.
propagate_ms : bool
If True then each node in the original machine is assumed to have a ‘ms’ attribute, which informs us about the nodes in the input machine in terms of the nodes in some ‘original’ machine. These attributes are used to represent the nodes in the output MSP in terms of the nodes in the ‘original’ machine.
label_nodes : bool | positive int
When True, the nodes will be labeled according to the mixed state distributions. When False, the nodes will be integers. If an integer, it specifies the number of decimal digits that should appear in the label. Also, each node will have an ‘ms’ attribute containing its mixed state.
partial : bool
When True, if the maximum word length is reached, then the partially constructed mixed state presentation along with the MSPBuilder instance is returned and no exception is raised. When False, if the maximum word length is reached, then the CMPyException is allowed to raise.
For a nonergodic process, no shortcut attempt is made to find the recurrent portion. So if partial is False then the value of transients is effectively always True for nonergodic processes.
Returns: msp : MealyHMM
The mixed state presentation.
mspb : MSPBuilder
The algorithm instance used to construct the mixed state presentation. This is returned only if partial is True.
Raises: CMPyException :
Raised when all of the following hold:
- partial is False
- L becomes greater than max_length
- there are still mixed states to augment
- the HMM is exactly synchronizing
So long as the process is exactly synchronizable, increasing the value of max_length should eventually work.
UnexpectedHMM :
Raised when all of the following hold:
- partial is False
- L becomes greater than max_length
- there are still mixed states to augment
- the HMM is not exactly synchronizing
If the HMM is not exactly synchronizing, then no value of max_length will work. You should consider setting partial to True and looking at the incomplete mixed state presentation.

build_msp_iter(max_length=None, forward=True, row=True, logs=None, ndist=None, norder=None, create_using=None)¶
Builds the mixed state presentation iteratively.
Parameters: max_length : integer, None
At each L, we construct the mixed states for words of length L, and then we increment L. When L > max_length, iteration will stop. If max_length is None, then we iterate indefinitely. If the mixed state presentation completes before reaching max_length, then we will continue to yield the complete mixed state presentation until the iteration reaches max_length.
forward : bool
Specifies if the mixed state presentation should be constructed for forward process or the reverse process. By convention, the forward process is defined as the process generated by the input machine. When True, the mixed state presentation is constructed for the forward process.
row : bool
Specifies whether the mixed states should be computed as stochastic row vectors or nonstochastic column vectors. If True, then stochastic row vectors are generated. If False, then nonstochastic column vectors are generated.
logs : bool
If True, then we construct the mixed state presentation by using logarithms in the computations of the mixed states. The default value is taken from cmpyParams[‘logs’].
ndist : Distribution, LogDistribution
The initial node distribution. If None, then we use the stationary distribution of the process.
norder : list
Specifies the order of the elements in the mixed state NumPy arrays. The node ordering must include every node in the machine. If None, then the nodes will be used in sorted order. In either case, the ordering is stored in the norder attribute of the mixed states.
create_using : callable
A callable which returns a subclass of MealyHMM. The returned value will be populated with nodes and edges.
Notes
The scan direction of the inducing words, as seen by the process which generates the mixed state presentation, will always be from left to right. This is true regardless of the value of forward. The order of the words can be changed with self.reverse_words(), and should be re-reversed before the next iteration (though the class knows if it should re-reverse the words before updating the mixed state presentation, and it will do so if necessary).

label_nodes(digits=None, propagate_ms=False, transients=True)¶
Labels the nodes according to their mixed state distributions.

reverse_words()¶
Reverses the words associated with each MixedState.


class cmpy.machines.algorithms.mealyms.MSPBuilderDict(machine, atol=None, rtol=None, hashdigits=None)¶
Bases: cmpy.machines.algorithms.mealyms.MSPBuilderBase
Methods

build_msp([max_length, forward, transients, ...]): Builds the mixed state presentation.
build_msp_iter([max_length, forward, row, ...]): Builds the mixed state presentation iteratively.
label_nodes([digits, propagate_ms, transients]): Labels the nodes according to their mixed state distributions.
reverse_words(): Reverses the words associated with each MixedState.

class cmpy.machines.algorithms.mealyms.MixedState(msr, word=None)¶
Bases: cmpy.machines.algorithms.mealyms.BaseMixedState
Methods

copy(): Returns a copy of the mixed state.
covering(): Returns the basis states represented by the mixed state.
is_log()
to_logmixedstate(): Converts the MixedState to a LogMixedState.

to_logmixedstate()¶
Converts the MixedState to a LogMixedState.
Only the mixed state representation is copied deeply.


exception cmpy.machines.algorithms.mealyms.RecurrentFoundException¶
Bases: exceptions.Exception

cmpy.machines.algorithms.mealyms.build_bimachine(machine, max_length=None, forward=True)¶
Returns the bidirectional machine presentation of the forward process.
Parameters: machine : MealyHMM
The machine generating the forward process language.
max_length : int
The maximum length used to construct the forward and reverse eMs. The default value is 30.
forward : bool
If True, then the bidirectional presentation of the forward process is constructed. If False, then the bidirectional presentation of the reverse process is constructed. The two presentations are related by the time-reversing operator.
Returns: m : MealyHMM
The bidirectional machine generating the forward process if forward is True and generating the reverse process otherwise.
feM : RecurrentEpsilonMachine
The forward eM (that which generates the same process as machine).
reM : RecurrentEpsilonMachine
The reverse eM.

cmpy.machines.algorithms.mealyms.build_msp(machine, max_length=30, forward=True, transients=False, row=True, scanLR=True, logs=None, ndist=None, norder=None, create_using=None, propagate_ms=False, label_nodes=False, partial=False)¶
Builds the mixed state presentation from the machine.
Parameters: machine : MealyHMM
The machine used to construct the mixed state presentation.
max_length : integer, optional
It can happen that the mixed states will form an infinite semigroup. In this situation, the construction will never complete. At each L, we construct the mixed state representations. If L > max_length, then we will raise an exception. The exception will contain the current values of the return variables as attributes.
forward : bool, optional
Specifies if the mixed state presentation should be constructed for forward process or the reverse process. By convention, the forward process is defined as the process generated by the input machine. When True, the mixed state presentation is constructed for the forward process. Note that this argument is closely tied to scanLR. The default value is True.
transients : bool, optional
Specifies if the transient states should be included in the mixed state presentation. Note, it is often the case that there will be an infinite number of transient mixed states. In such situations, the algorithm is not able to construct the complete mixed state representation and an exception will be raised.
row : bool, optional
Specifies whether the mixed states should be computed as stochastic row vectors or nonstochastic column vectors. If True, then stochastic row vectors are generated. If False, then nonstochastic column vectors are generated. The default value is True.
scanLR : bool, optional
Specifies the scan direction to be used when interpreting the inducing words of each mixed state. When True, the time order of the word, as seen by the process, occurs when scanning from left to right. When False, it occurs when scanning from right to left. The default value is True.
logs : bool, optional
If True, then we construct the mixed state presentation by using logarithms in the computations of the mixed states. The default value is taken from cmpyParams[‘logs’].
ndist : Distribution, LogDistribution
The initial node distribution. If None, then we use the stationary distribution of the process.
norder : list, optional
Specifies the order of the elements in the mixed state NumPy arrays. The node ordering must include every node in the machine. If None, then a random ordering is chosen and can be obtained from the mixed states.
create_using : callable
A callable which returns a subclass of MealyHMM. The returned value will be populated with nodes and edges.
propagate_ms : bool
If True then each node in the original machine is assumed to have a ‘ms’ attribute, which informs us about the nodes in the input machine in terms of the nodes in some ‘original’ machine. These attributes are used to represent the nodes in the output MSP in terms of the nodes in the ‘original’ machine.
label_nodes : bool | positive int
When True, the nodes will be labeled according to the mixed state distributions. When False, the nodes will be integers. If an integer, it specifies the number of decimal digits that should appear in the label. Also, each node will have an ‘ms’ attribute containing its mixed state.
partial : bool
When True, if the maximum word length is reached, then the partially constructed mixed state presentation along with the MSPBuilder instance is returned and no exception is raised. When False, if the maximum word length is reached, then the CMPyException is allowed to raise.
Returns: msp : MealyHMM
The mixed state presentation.
mspb : MSPBuilder
The algorithm instance used to construct the mixed state presentation. This is returned only if partial is True.
Raises: CMPyException :
Raised when L becomes greater than max_length and there are still mixed states to augment. So long as the process is exactly synchronizable, make sure transients is False and try increasing the value of max_length; it will succeed eventually. If the process is not exactly synchronizable, then no value of max_length will work.

cmpy.machines.algorithms.mealyms.custom_close(n, rtol=None, atol=None)¶
Returns a custom close function when Cythonified versions are present.

cmpy.machines.algorithms.mealyms.propagate_mixedstates(B, C_mixedstates, logs)¶
Propagates the mixed states of B to A.
The mixed states of B are distributions over nodes in A. The mixed states of C are distributions over nodes in B.
This function writes the mixed states of C in terms of the mixed states of A, and returns a dictionary relating nodes of C to its new mixed states.
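When the mixed states are stacked as rows of a matrix, this propagation amounts to a matrix product. A sketch with hypothetical shapes, in plain NumPy rather than the cmpy API:

```python
import numpy as np

# Mixed states as row-stochastic matrices (hypothetical sizes):
# M_BA[b, a] = Pr(node a of A | mixed state b of B)
# M_CB[c, b] = Pr(node b of B | mixed state c of C)
M_BA = np.array([[1.00, 0.00],
                 [0.25, 0.75]])
M_CB = np.array([[0.5, 0.5]])

# Writing the mixed states of C over the nodes of A is a matrix product;
# the result is again row stochastic.
M_CA = M_CB @ M_BA
print(M_CA)  # [[0.625, 0.375]]
```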
The entropyfp
Module¶
Functions for computing entropy flow and production.
Note: These functions operate on the internal Markov chain only.

cmpy.machines.algorithms.entropyfp.entropyfp_iter(machine, n=None)¶
Calculates entropy flow and production iteratively.
Parameters: machine : MealyHMM
The machine used in calculations.
n : int, None
The number of iterations. If None, iterate indefinitely.
The typeclass
Module¶
Standardize the type class representation in CMPy.

class cmpy.machines.algorithms.typeclass.TypeClass(probability, members)¶
Bases: object
A representation of a type class.
Methods

cmp_macroprob1(x, y): Comparison function on the macroprobability of the type class.
cmp_macroprob2(x, y): Comparison function on the macroprobability of the type class.
cmp_microprob(x, y): Comparison function on the microprobability of the type class.
cmp_multiplicity(x, y): Comparison function on the number of elements in the type class.
static cmp_macroprob1(x, y)¶
Comparison function on the macroprobability of the type class.
Assumption: The typeclass probability is a log probability. The typeclass length is a log length.

static cmp_macroprob2(x, y)¶
Comparison function on the macroprobability of the type class.
Assumption: The typeclass probability is a log probability. The typeclass length is a standard length.

static cmp_microprob(x, y)¶
Comparison function on the microprobability of the type class.

static cmp_multiplicity(x, y)¶
Comparison function on the number of elements in the type class.

length¶
The number of sequences belonging to this type class.
The length can be the base-2 logarithm of the number of sequences.


cmpy.machines.algorithms.typeclass.block_entropy_from_types(types, logcounts=True, logprobs=True)¶
Returns the block entropy H(L) from a list of type classes at L.
The block entropies are computed via the method of types.
Parameters: types : list[TypeClass]
A list of TypeClass instances.
logcounts : bool
Specifies if the type class counts are log counts.
logprobs : bool
Specifies if the type class probabilities are log probabilities.
Returns: be : float
The block entropy corresponding to the type classes.
Raises: OverflowError :
Usually raised only when logcounts is False and the multiplicity for a type class is too large to convert into a float. For example, it is not possible to convert the counts to a 64-bit float for type classes of the Even process at L=1481.
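A sketch of the method of types for a case where the type classes are easy to enumerate: an IID biased coin, where a type class at length L is a head count k with multiplicity C(L, k). This illustration uses plain stdlib Python, not the TypeClass API.

```python
from math import comb, log2

# Method of types for an IID biased coin (illustrative sketch):
# at length L, a type class is a head count k, with multiplicity C(L, k)
# and per-word (micro) probability p**k * (1 - p)**(L - k).
def block_entropy_via_types(L, p):
    H = 0.0
    for k in range(L + 1):
        micro = p**k * (1 - p)**(L - k)
        H -= comb(L, k) * micro * log2(micro)
    return H

p = 0.25
h1 = -(p * log2(p) + (1 - p) * log2(1 - p))  # per-symbol entropy
print(block_entropy_via_types(10, p))        # equals 10 * h1 for an IID source
```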
The mealycompose
Module¶
Functions for Mealy transducer compositions.

class cmpy.machines.algorithms.mealycompose.VectorMealyHMM(data=None, name='', **attr)¶
Bases: cmpy.machines.mealyhmm.MealyHMM
An HMM where the nodes and symbols can be tuples.
All symbols should have the same length. All nodes should have the same length. These lengths need not be equal, but some methods might require them to be.
Methods

marginal_machine(indexes[, ndist, ...]): Returns a marginal of the VectorHMM.
marginal_machine(indexes, ndist=None, create_using=None, scalar=False)¶
Returns a marginal of the VectorHMM.
Parameters: indexes : list
A list of integers specifying the symbol indexes to keep; these define the marginal machine. All other symbol indexes are marginalized.
ndist : distribution | list | None
A distribution over nodes in the machine. If provided, then the marginal machine will contain only nodes reachable from the nodes in ndist. Otherwise, all nodes are kept.
Returns: m : VectorHMM
The marginal machine.
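A sketch of the marginalization, assuming a hypothetical one-node machine with pair-valued symbols (plain NumPy, not the cmpy API): the labeled transition matrices sharing the kept symbol component are summed.

```python
import numpy as np

# Hypothetical one-node machine over symbol pairs (x, y): x is a fair coin
# and y copies x with probability 0.9.
T = {(0, 0): np.array([[0.45]]), (0, 1): np.array([[0.05]]),
     (1, 0): np.array([[0.05]]), (1, 1): np.array([[0.45]])}

# Keeping symbol index 0 marginalizes over y: labeled matrices with the
# same x component are summed.
Tx = {}
for (x, y), M in T.items():
    Tx[x] = Tx.get(x, 0) + M

print(Tx[0], Tx[1])  # each is [[0.5]]: the marginal over x is a fair coin
```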


class cmpy.machines.algorithms.mealycompose.VectorMealyTransducer(data=None, name='', **attr)¶
Bases: cmpy.machines.mealytransducer.MealyTransducer
Methods

cmpy.machines.algorithms.mealycompose.cartesian_product_gg(machines, create_using=None)¶
Returns the Cartesian product of two generators.
Parameters: machines : (MealyHMM,)
A tuple of MealyHMM instances used in the Cartesian product.
create_using : callable
A callable used to create the instance to be populated. If None, then we use the VectorMealyHMM class.
Returns: m : VectorMealyHMM
The machine generating the Cartesian product of the processes of machines.
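For generators described by labeled transition matrices, the Cartesian product can be sketched with Kronecker products; this is a standard construction, and the single-node machines below are hypothetical examples in plain NumPy, not the cmpy API.

```python
import numpy as np

# Two hypothetical single-node IID generators: a fair coin and a biased coin.
TA = {0: np.array([[0.5]]), 1: np.array([[0.5]])}
TB = {0: np.array([[0.75]]), 1: np.array([[0.25]])}

# The product machine has tuple symbols; each of its labeled transition
# matrices is the Kronecker product of the factors' labeled matrices.
Tprod = {(a, b): np.kron(TA[a], TB[b]) for a in TA for b in TB}

total = sum(M.sum() for M in Tprod.values())
print(total)  # 1.0: the one-step probabilities still sum to one
```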

cmpy.machines.algorithms.mealycompose.cartesian_product_tt(transducers, create_using=None)¶
Cartesian product of transducers.
The result is another transducer. Its transition matrices are built from the transition matrices of the factor transducers.
Parameters: machines : (MealyTransducer,)
A tuple of MealyTransducer instances used in the Cartesian product.
create_using : callable
A callable used to create the instance to be populated. If None, then we use the VectorMealyTransducer class.
Returns: m : VectorMealyTransducer
The output transducer.

cmpy.machines.algorithms.mealycompose.compose_tg(transducer, generator, create_using=None)¶
Composes the transducer with the generator.
The result is another generator. The transition matrices of the joint process (over input and output symbols) are built from the transition matrices of the transducer and the generator.
The nodes of the composed machine will be (generator, transducer) nodes.
Parameters: transducer : MealyTransducer
The transducer used during composition.
generator : MealyHMM
The generator used during composition.
Returns: m : VectorMealyHMM
The output vector process.

cmpy.machines.algorithms.mealycompose.compose_tt(transducers, complete=True, create_using=None)¶
Composes the transducers.
The result is another transducer. For example, composing transducers T_0 through T_3 yields a transducer T_4 that is the result of passing the output of T_0 into T_1, then into T_2, then into T_3, and so on. Notationally, the states of T_4 will be ordered tuples of states (T_0, T_1, T_2, T_3).
Parameters: transducers : (MealyTransducer,)
A tuple of transducers used during composition.
Returns: m : MealyTransducer
The composed output transducer.
The attrmatrix
Module¶
Functions for constructing matrix-like objects from graph attributes.

cmpy.machines.algorithms.attrmatrix.attr_matrix(G, edge_attr=None, node_attr=None, normalize=False, rc_order=None, dtype=None, order=None)¶
Returns a NumPy matrix using attributes from G.
If only G is passed in, then the adjacency matrix is constructed.
Let A be a discrete set of values for the node attribute node_attr. Then the elements of A represent the rows and columns of the constructed matrix. Now, iterate through every edge e=(u,v) in G and consider the value of the edge attribute edge_attr. If ua and va are the values of the node attribute node_attr for u and v, respectively, then the value of the edge attribute is added to the matrix element at (ua, va).
Parameters: G : graph
The NetworkX graph used to construct the NumPy matrix.
edge_attr : str, optional
Each element of the matrix represents a running total of the specified edge attribute for edges whose node attributes correspond to the rows/cols of the matrix. The attribute must be present for all edges in the graph. If no attribute is specified, then we just count the number of edges whose node attributes correspond to the matrix element.
node_attr : str, optional
Each row and column in the matrix represents a particular value of the node attribute. The attribute must be present for all nodes in the graph. Note, the values of this attribute should be reliably hashable. So, float values are not recommended. If no attribute is specified, then the rows and columns will be the nodes of the graph.
normalize : bool, optional
If True, then each row is normalized by the summation of its values.
rc_order : list, optional
A list of the node attribute values. This list specifies the ordering of rows and columns of the array. If no ordering is provided, then the ordering will be random (and also, a return value).
Returns: M : NumPy matrix
The attribute matrix.
ordering : list
If rc_order was specified, then only the matrix is returned. However, if rc_order was None, then the ordering used to construct the matrix is returned as well.
Examples
Construct an adjacency matrix:
>>> G = nx.Graph()
>>> G.add_edge(0, 1, thickness=1, weight=3)
>>> G.add_edge(0, 2, thickness=2)
>>> G.add_edge(1, 2, thickness=3)
>>> nx.attr_matrix(G, rc_order=[0, 1, 2])
matrix([[ 0.,  1.,  1.],
        [ 1.,  0.,  1.],
        [ 1.,  1.,  0.]])
Alternatively, we can obtain the matrix describing edge thickness.
>>> nx.attr_matrix(G, edge_attr='thickness', rc_order=[0, 1, 2])
matrix([[ 0.,  1.,  2.],
        [ 1.,  0.,  3.],
        [ 2.,  3.,  0.]])
We can also color the nodes and ask for the probability distribution over all edges (u,v) describing:
Pr(v has color Y | u has color X)
>>> G.node[0]['color'] = 'red'
>>> G.node[1]['color'] = 'red'
>>> G.node[2]['color'] = 'blue'
>>> rc = ['red', 'blue']
>>> nx.attr_matrix(G, node_attr='color', normalize=True, rc_order=rc)
matrix([[ 0.33333333,  0.66666667],
        [ 1.        ,  0.        ]])
For example, the above tells us that for all edges (u,v):
Pr( v is red | u is red )  = 1/3    Pr( v is blue | u is red )  = 2/3
Pr( v is red | u is blue ) = 1      Pr( v is blue | u is blue ) = 0
Finally, we can obtain the total weights grouped by the node colors.
>>> nx.attr_matrix(G, edge_attr='weight', node_attr='color', rc_order=rc)
matrix([[ 3.,  2.],
        [ 2.,  0.]])
Thus, the total weight over all edges (u,v) with u and v having colors:
(red, red)   is 3  # the sole contribution is from edge (0,1)
(red, blue)  is 2  # contributions from edges (0,2) and (1,2)
(blue, red)  is 2  # same as (red, blue) since graph is undirected
(blue, blue) is 0  # there are no edges with blue endpoints
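The running-total construction described above can be sketched without NetworkX. The function `attr_totals` and its plain-dict graph encoding are hypothetical, shown only to make the accumulation over (row, column) attribute cells explicit for an undirected graph.

```python
from collections import defaultdict

def attr_totals(edges, colors, weights=None, normalize=False):
    """Accumulate a running total per (row_attr, col_attr) cell, mirroring
    the matrix construction described above for an undirected graph."""
    M = defaultdict(float)
    for u, v in edges:
        ua, va = colors[u], colors[v]
        w = 1 if weights is None else weights[(u, v)]
        M[(ua, va)] += w
        if ua != va:   # keep the matrix symmetric; diagonal cells counted once
            M[(va, ua)] += w
    if normalize:
        rows = defaultdict(float)
        for (ua, _), w in M.items():
            rows[ua] += w
        M = {cell: w / rows[cell[0]] for cell, w in M.items()}
    return dict(M)
```

With the three-node red/red/blue example above, normalizing the red row reproduces the 1/3 and 2/3 entries.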

cmpy.machines.algorithms.attrmatrix.
attr_sparse_matrix
(G, edge_attr=None, node_attr=None, normalize=False, rc_order=None, dtype=None)¶ Returns a SciPy sparse matrix using attributes from G.
If only G is passed in, then the adjacency matrix is constructed.
Let A be a discrete set of values for the node attribute node_attr. Then the elements of A represent the rows and columns of the constructed matrix. Now, iterate through every edge e=(u,v) in G and consider the value of the edge attribute edge_attr. If ua and va are the values of the node attribute node_attr for u and v, respectively, then the value of the edge attribute is added to the matrix element at (ua, va).
Parameters: G : graph
The NetworkX graph used to construct the NumPy matrix.
edge_attr : str, optional
Each element of the matrix represents a running total of the specified edge attribute for edges whose node attributes correspond to the rows/cols of the matrix. The attribute must be present for all edges in the graph. If no attribute is specified, then we just count the number of edges whose node attributes correspond to the matrix element.
node_attr : str, optional
Each row and column in the matrix represents a particular value of the node attribute. The attribute must be present for all nodes in the graph. Note, the values of this attribute should be reliably hashable. So, float values are not recommended. If no attribute is specified, then the rows and columns will be the nodes of the graph.
normalize : bool, optional
If True, then each row is normalized by the summation of its values.
rc_order : list, optional
A list of the node attribute values. This list specifies the ordering of rows and columns of the array. If no ordering is provided, then the ordering will be random (and also, a return value).
Returns: M : SciPy sparse matrix
The attribute matrix.
ordering : list
If rc_order was specified, then only the matrix is returned. However, if rc_order was None, then the ordering used to construct the matrix is returned as well.
Examples
Construct an adjacency matrix:
>>> G = nx.Graph()
>>> G.add_edge(0, 1, thickness=1, weight=3)
>>> G.add_edge(0, 2, thickness=2)
>>> G.add_edge(1, 2, thickness=3)
>>> M = nx.attr_sparse_matrix(G, rc_order=[0, 1, 2])
>>> M.todense()
matrix([[ 0.,  1.,  1.],
        [ 1.,  0.,  1.],
        [ 1.,  1.,  0.]])
Alternatively, we can obtain the matrix describing edge thickness.
>>> M = nx.attr_sparse_matrix(G, edge_attr='thickness', rc_order=[0, 1, 2])
>>> M.todense()
matrix([[ 0.,  1.,  2.],
        [ 1.,  0.,  3.],
        [ 2.,  3.,  0.]])
We can also color the nodes and ask for the probability distribution over all edges (u,v) describing:
Pr(v has color Y | u has color X)
>>> G.node[0]['color'] = 'red'
>>> G.node[1]['color'] = 'red'
>>> G.node[2]['color'] = 'blue'
>>> rc = ['red', 'blue']
>>> M = nx.attr_sparse_matrix(G, node_attr='color', normalize=True, rc_order=rc)
>>> M.todense()
matrix([[ 0.33333333,  0.66666667],
        [ 1.        ,  0.        ]])
For example, the above tells us that for all edges (u,v):
Pr( v is red | u is red )  = 1/3    Pr( v is blue | u is red )  = 2/3
Pr( v is red | u is blue ) = 1      Pr( v is blue | u is blue ) = 0
Finally, we can obtain the total weights grouped by the node colors.
>>> M = nx.attr_sparse_matrix(G, edge_attr='weight', node_attr='color', rc_order=rc)
>>> M.todense()
matrix([[ 3.,  2.],
        [ 2.,  0.]])
Thus, the total weight over all edges (u,v) with u and v having colors:
(red, red)   is 3  # the sole contribution is from edge (0,1)
(red, blue)  is 2  # contributions from edges (0,2) and (1,2)
(blue, red)  is 2  # same as (red, blue) since graph is undirected
(blue, blue) is 0  # there are no edges with blue endpoints
The mealysr
Module¶

cmpy.machines.algorithms.mealysr.
add_level
(M, MM)¶

cmpy.machines.algorithms.mealysr.
compute_alpha
(M)¶

cmpy.machines.algorithms.mealysr.
compute_sync_rate
(M)¶

cmpy.machines.algorithms.mealysr.
construct_possibility_machine
(M)¶

cmpy.machines.algorithms.mealysr.
get_index
(MM, s, x)¶

cmpy.machines.algorithms.mealysr.
get_next_state
(M, s, x)¶
The mealysw
Module¶
Synchronizing Words¶
Iterator to produce all synchronizing words of a process along with the causal state they synchronize to.

cmpy.machines.algorithms.mealysw.
build_pa
(machine, co=False)¶ Constructs the power automaton via the subset construction.
This can be used to determinize an NFA (when co=False).
The power automaton begins in the topological ‘allstate’, a state which represents every state in the machine. Then, on each symbol, the set of states is updated to include only those states which could see that symbol on an incoming edge; that is, it must be possible to transition from a state in the current state set, on the observed symbol, to a state in the next state set. In this sense, one can think of the power automaton as representing histories: it keeps track of which states are possible given a path down the parse tree (by following successors).
Similarly, one can construct a power automaton that represents futures. It keeps track of which states might generate a path down the parse tree (by following predecessors). So once again, we start in the topological ‘allstate’. Then, on each symbol, the set of states is updated to include only those states which could see that symbol on an outgoing edge. This is the copower automaton.
Parameters: machine : MealyHMM
The machine on which the subset construction will be run.
co : bool
When True, builds the copower automata. Otherwise, the standard power automaton is built.
Returns: powerset : DFA
A DFA representing the determinized process with start state.
Notes
In its current form, this algorithm does not follow empty transitions.
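The successor-following construction described above can be sketched in a few lines, assuming the nondeterministic topology is given as a dict mapping (state, symbol) to a set of successor states. The name `subset_construct` and this encoding are illustrative, not the cmpy implementation.

```python
def subset_construct(delta, states, alphabet):
    """Subset construction sketch.  Starts from the 'allstate' containing
    every state and follows successors on each symbol."""
    allstate = frozenset(states)
    dfa = {}
    stack = [allstate]
    seen = {allstate}
    while stack:
        current = stack.pop()
        for x in alphabet:
            nxt = frozenset(s2 for s in current for s2 in delta.get((s, x), ()))
            if not nxt:
                continue  # symbol x is forbidden from this state set
            dfa[(current, x)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return dfa, allstate
```

For the Golden Mean topology (A on ‘0’ to A, A on ‘1’ to B, B on ‘0’ to A), a single ‘1’ from the allstate synchronizes to {B}, and ‘1’ is forbidden from {B}.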

cmpy.machines.algorithms.mealysw.
exactly_sync
(m, pa=None)¶ Tests if a machine is exactly synchronizable or not.
Parameters: m : MealyHMM
The machine to test.
Returns: exact : bool
True if the machine is exactly synchronizable, False otherwise.

cmpy.machines.algorithms.mealysw.
minsync_words_iter
(machine)¶ Returns an iterator over minimal synchronizing words and the induced state.
Parameters: machine : MealyHMM
The machine for which the synchronizing words are wanted.
The dfaminimize
Module¶
Algorithms for DFA minimization.

cmpy.machines.algorithms.dfaminimize.
remove_unreachable
(m, successors=None)¶ Removes all states of the DFA m that are unreachable from its start state.
We begin in the start state and follow all transitions. We continue doing so until we no longer reach new states. Only the visited states are reachable; all others are unreachable.
Parameters: m : DFA
The DFA to be modified in place.
successors : dict, None
A dictionary representing the successor topology of the DFA. It should be keyed by nodes and the values should be sets of nodes. If None, then the topology is constructed.
Returns: m : DFA
The original DFA modified in place.
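The reachability sweep described above amounts to a breadth-first search from the start state; a sketch, assuming the successor topology is a plain dict of node sets as described for the successors parameter.

```python
from collections import deque

def reachable(start, successors):
    """Breadth-first sweep from `start`; successors maps node -> set of nodes.
    Anything not visited is unreachable and can be removed."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in successors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen
```

Removing unreachable states then reduces to deleting every node not in the returned set.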

cmpy.machines.algorithms.dfaminimize.
partition_hopcroft
(m, predecessors=None, alphabet=None)¶ Partitions the nodes of m into equivalence classes over follower sets.
Parameters: m : DFA
The DFA whose nodes will be partitioned.
predecessors : dict, None
A dictionary keyed by (v,s) pairs. The keys represent the to_node and symbol. The values are sets representing the set of from_nodes for the key. If None, then the dictionary will be constructed.
alphabet : list, None
The alphabet of the machine. If None, then it is obtained from the machine.
Returns: partition : list
A list of sets. Each set contains equivalent nodes in m.

cmpy.machines.algorithms.dfaminimize.
minimize_hopcroft
(m)¶ Minimizes the DFA m.
The minimization method follows Hopcroft's algorithm [hopcroft1971] as described in [gries1973]. Worst-case running time is O(k n log n), where k is the alphabet size and n is the number of nodes in the DFA.
If the machine is not a DFA, then minimization is not guaranteed since only a subset of equivalent nodes will be merged.
Parameters: m : DFA
The DFA that will be minimized.
Returns: mm : DFA
The minimized DFA.
mapping : dict
The mapping from nodes in m to nodes in the minimized DFA.
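For intuition, the equivalence classes that Hopcroft's algorithm computes can also be obtained by a simpler Moore-style refinement. The sketch below is not Hopcroft's algorithm (it runs in O(n^2) rather than O(k n log n)) but yields the same partition; the dict-based `delta` encoding is illustrative.

```python
def refine_partition(states, alphabet, delta, accepting):
    """Moore-style partition refinement: split the accepting/rejecting
    partition until every block is closed under one-symbol lookahead."""
    blocks = [set(accepting), set(states) - set(accepting)]
    blocks = [b for b in blocks if b]
    changed = True
    while changed:
        changed = False
        def block_of(s):
            for i, b in enumerate(blocks):
                if s in b:
                    return i
            return None  # missing transition
        new_blocks = []
        for b in blocks:
            # group states by which block each symbol leads to
            groups = {}
            for s in b:
                sig = tuple(block_of(delta.get((s, x))) for x in alphabet)
                groups.setdefault(sig, set()).add(s)
            if len(groups) > 1:
                changed = True
            new_blocks.extend(groups.values())
        blocks = new_blocks
    return blocks
```

States whose transitions always land in the same blocks end up merged into one equivalence class.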

cmpy.machines.algorithms.dfaminimize.
is_minimal
(m, _topology=None)¶ Returns True if m is a minimal DFA, and False otherwise.
Parameters: m : DFA
The DFA to be tested for minimality.
_topology : tuple
The tuple of topologies that func:topology returns. If None, it will be constructed.
Returns: ismin : bool
Boolean specifying if m is a minimal DFA.
The mealywords
Module¶
Functions for enumerating the language from a node distribution.

cmpy.machines.algorithms.mealywords.
language_iter
(machine, nodes, sort)¶ A generator over the fixed-length languages from a set of nodes.
Parameters: machine : MealyHMM
The machine used to generate the language.
nodes : set
A set of nodes in machine which specifies that the yielded languages should be consistent with the machine having been in any of the nodes in nodes. Nodes in nodes which are not in machine are silently ignored.
sort : bool
If True, then the lists are sorted before returning.
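The fixed-length language enumeration can be sketched as a recursive walk that tracks the set of currently possible nodes; the name `words_from` and the dict-based `delta` encoding (mapping (node, symbol) to the next node) are illustrative, not the cmpy API.

```python
def words_from(delta, nodes, length):
    """Yield every word of the given length that the machine can emit
    starting from any node in `nodes`."""
    def extend(frontier, prefix, remaining):
        if remaining == 0:
            yield prefix
            return
        symbols = {x for (n, x) in delta if n in frontier}
        for x in sorted(symbols):
            nxt = {delta[(n, x)] for n in frontier if (n, x) in delta}
            yield from extend(nxt, prefix + (x,), remaining - 1)
    yield from extend(set(nodes), (), length)
```

For the Golden Mean topology, the length-2 language from node A excludes the forbidden word ‘11’.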
The mealypm
Module¶
Mealy Possibility Machine

cmpy.machines.algorithms.mealypm.
add_transitions
(pm, deque, alphabet, delta, probs, processed)¶

cmpy.machines.algorithms.mealypm.
build_MealyPM
(machine, simple_labels=True, unreachable=False)¶ Returns the possibility machine for a MealyHMM.
Parameters: machine : MealyHMM
The machine whose possibility machine is constructed.
simple_labels : bool
If True, then the labels for the possibility machine are displayed in a simple form.
unreachable : bool
If True, then nodes unreachable from the allstate belief are included.
Returns: pm : MealyHMM
The possibility machine.

cmpy.machines.algorithms.mealypm.
powerset
([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)¶
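The signature above is the standard itertools powerset recipe; a self-contained version:

```python
from itertools import chain, combinations

def powerset(iterable):
    """powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"""
    s = list(iterable)
    # chain subsets of every size, from the empty set up to the full set
    return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))
```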
The emco
Module¶
Cryptic Order¶
Compute the cryptic order utilizing the technique outlined in Ryan James’ orals talk. The cryptic order is defined in [mahoney2009a].

cmpy.machines.algorithms.emco.
cryptic_order
(machine)¶ Compute the cryptic order of machine.
Parameters: machine : EpsilonMachine
The machine whose cryptic order is computed.
Returns: k : int
The cryptic order of machine.
The mealyee
Module¶
Excess entropy calculations for MealyHMMs.
These functions are generally for the developers. They are exposed as
methods of the MealyHMM
class.

cmpy.machines.algorithms.mealyee.
bicausal_distribution
(machine, max_length, logs=None)¶ Returns the bicausal joint distribution: Pr(S+, S-).
Given the recurrent portion of an epsilon machine, we construct the recurrent component of the epsilon machine for the reverse process. The construction is done via the mixed state algorithm, and in the process we are able to build the joint distribution over future and reverse causal states.
Note, construction is not always possible. It can happen that the reverse process is not exactly synchronizable. In these situations, it is not possible to construct the recurrent portion of the epsilon machine using the mixed state algorithm. In these cases, an exception is raised.
Parameters: machine : MealyHMM
The recurrent portion of the forward epsilon machine.
max_length : int
The maximum word length used in attempts to construct the recurrent portion of the reverse epsilon machine.
logs : bool
Specifies if the bicausal distribution should be a log distribution.
Returns: dist : distribution
The bicausal distribution Pr(S+, S-).
rmachine : MealyHMM
The recurrent portion of the reverse epsilon machine.
Raises: CMPyException :
When the recurrent portion of the reverse epsilon machine was not found after reaching L=`max_length`. As long as the reverse process is exactly synchronizable, increasing the value of max_length will eventually work. If the process is not exactly synchronizable, then no value of max_length will work.

cmpy.machines.algorithms.mealyee.
excess_entropy
(machine, max_length, logs=None)¶ Returns the excess entropy of the machine.
The calculated value is correct only if the machine is the recurrent portion of the epsilon machine.
Parameters: machine : MealyHMM
The machine used to calculate the excess entropy of the process.
max_length : int
The maximum word length used in attempts to construct the recurrent portion of the reverse epsilon machine.
logs : bool
Specifies if the bicausal distribution should be a log distribution.
Returns: ee : float
The excess entropy of the process.
Raises: CMPyException :
When the recurrent portion of the reverse epsilon machine was not found after reaching L=`max_length`. As long as the reverse process is exactly synchronizable, increasing the value of max_length will eventually work. If the process is not exactly synchronizable, then no value of max_length will work.
The mealybprob
Module¶
Functions for calculating backward probabilities of MealyHMMs.
Eventually, someone should implement csc_matrix versions from scipy.sparse. Handling log2 probabilities might be an issue.
Despite the similarity of the underlying algorithms, we provide distinct log and nonlog implementations so that we can minimize the number of checks against the ‘logs’ parameter.
All these functions define the backward probabilities at time t as:
$$
\Pr( X_t^{L-t} = w_t^{L-t} \mid S_t )
$$
For a word of length L, there will be L+1 backward probabilities from t=0 to t=L.
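The definition above yields a simple right-to-left recursion, b_t = T[w_t] b_{t+1}, with b_L a vector of ones. A dense NumPy sketch (illustrative, not the package's implementation), assuming `ltm` maps each symbol to its labeled transition matrix:

```python
import numpy as np

def backward_probabilities(ltm, word):
    """Compute all L+1 backward probability vectors for `word`, via
    b_t = T[w_t] @ b_{t+1}; the terminal vector b_L is all ones."""
    n = next(iter(ltm.values())).shape[0]
    bp = [np.ones(n)]
    for x in reversed(word):
        bp.append(ltm[x] @ bp[-1])
    bp.reverse()   # bp[t] is now the backward probability at time t
    return bp
```

For the Golden Mean machine, b_0 for the word ‘1’ gives probability 1/2 from state A and 0 from state B.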

cmpy.machines.algorithms.mealybprob.
backward_logprobability__dense
(ltm, word, t, pbp)¶ Implementation of backward log probabilities using NumPy arrays.
This assumes 0 <= t <= L-1.
Parameters: ltm : dict
A dictionary of labeled transition matrices as NumPy arrays, keyed by symbols. The arrays must already be log2 arrays.
word : iterable
The word for which we are computing backward probabilities.
t : int
The time specifying which backward probability is being computed.
pbp : arraylike
The previous backward log probability.
Returns: cbp : arraylike
The current backward log probability.

cmpy.machines.algorithms.mealybprob.
backward_logprobability__machine
(machine, word, t, pbp, sparse)¶ Core implementation of backward log probabilities using the machine.
This assumes 0 <= t <= L-1.
Parameters: machine : EdgeHMM instance
The machine representing the process.
word : iterable
The word for which we are computing backward probabilities.
t : int
The time specifying which backward probability is being computed.
pbp : dict
The previous backward log probability.
sparse : bool
If True, forbidden probabilities are not included in the return value.
Returns: cbp : dict
The current backward log probability.

cmpy.machines.algorithms.mealybprob.
backward_probabilities_iter
(machine, word, logs=None, ndist=None, sparse=None, method=None, norder=None)¶ Returns an iterator of backward probabilities.
Let $w = w_0 w_1 \cdots w_{L-1}$ be a word of length L. The backward probability at time $t$ for word $w$ is a distribution:
$$
\Pr( X_t^{L-t} = w_t^{L-t} \mid S_t )
$$
There will be L+1 yields.
Parameters: machine : MealyHMM
The machine used to calculate the backward probabilities.
word : iterable
This is the word for which we will compute the probabilities.
logs : bool
When True, we compute log backward probabilities. When None, we use the value from cmpyParams[‘logs’].
ndist : {Distribution, LogDistribution}, optional
Probabilities are calculated from a node distribution. Typically, this is a vector of all ones, signifying that every node is a terminal node in the machine. So when ndist is None, we use a vector of ones.
sparse : bool, optional
When True, only realizations with nonzero probability are included. When False, every realization will be included. When None, we use the value from cmpyParams[‘sparse’].
method : str, optional
Specifies how the backward probabilities should be calculated. Valid options are:
‘machine’ ‘array’
If ‘machine’, then matrix multiplication is done via for loops over the incoming edges of each node. If ‘array’, then matrix multiplication is done via NumPy arrays. The default value is taken from cmpyParams[‘probability_method’].
norder : list, optional
This parameter is used and required only when method is equal to ‘array’. It specifies the order of the nodes in the array. Every node in the machine must be present in the list.

cmpy.machines.algorithms.mealybprob.
backward_probability__dense
(ltm, word, t, pbp)¶ Implementation of backward probabilities using NumPy arrays.
This assumes 0 <= t <= L-1.
Parameters: ltm : dict
A dictionary of labeled transition matrices as NumPy arrays, keyed by symbols.
word : iterable
The word for which we are computing backward probabilities.
t : int
The time specifying which backward probability is being computed.
pbp : arraylike
The previous backward probability.
Returns: cbp : arraylike
The current backward probability.

cmpy.machines.algorithms.mealybprob.
backward_probability__machine
(machine, word, t, pbp, sparse)¶ Core implementation of backward probabilities using the machine.
This assumes 0 <= t <= L-1.
Parameters: machine : EdgeHMM instance
The machine representing the process.
word : iterable
The word for which we are computing backward probabilities.
t : int
The time specifying which backward probability is being computed.
pbp : dict
The previous backward probability.
sparse : bool
If True, forbidden probabilities are not included in the return value.
Returns: cbp : dict
The current backward probability.

cmpy.machines.algorithms.mealybprob.
get_bprob_func
(method, logs)¶ Returns the appropriate backward probability function.
Parameters: method : str
Specification of how backward probabilities are computed. Valid values are:
‘array’ ‘machine’
logs : bool
Specifies whether standard or log probabilities are to be computed.
Returns: func : function
Function used to compute backward probabilities.

cmpy.machines.algorithms.mealybprob.
prepare_bprob_array
(machine, ndist, norder, logs)¶ Returns the labeled transition matrices and initial backward probability.
The backward probability functions which use NumPy require that the node distribution be a dense array.
Parameters: machine : MealyHMM
The machine used to compute the backward probabilities.
ndist : dict
The node distribution used to compute backward probabilities.
norder : list
The ordering of the nodes as elements in the array.
logs : bool
Specifies whether we compute log or standard backward probabilities.
Returns: ltm : dict
The labeled transitions matrices.
narray : NumPy array
The initial backward probability.
The mealyeM
Module¶
Meta algorithm to build the eM from MealyHMMs.

cmpy.machines.algorithms.mealyeM.
build_eM
(machine, max_length=30, forward=True, logs=None, ndist=None, norder=None, transients=True, label_nodes=False)¶ Returns the epsilon machine from the machine.
We build the mixed state presentation and then merge all causally equivalent nodes.
Parameters: machine : MealyHMM
The machine used to construct the mixed state presentation.
max_length : integer, optional
It can happen that the mixed states will form an infinite semigroup. In this situation, the construction will never complete. At each L, we construct the mixed state representations. If L > max_length, then we will raise an exception. The exception will contain the current values of the return variables as attributes.
forward : bool, optional
Specifies if the mixed state presentation should be constructed for forward process or the reverse process. By convention, the forward process is defined as the process generated by the input machine. When True, the mixed state presentation is constructed for the forward process.
logs : bool, optional
If True, then we construct the mixed state presentation by using logarithms in the computations of the mixed states. The default value is taken from cmpyParams[‘logs’].
ndist : Distribution, LogDistribution
The initial node distribution; that is, Pr(S_0). If None, then we attempt to obtain one from the machine.
norder : list, optional
Specifies the order of the elements in the mixed state NumPy arrays. The node ordering must include every node in the machine. If None, then a random ordering is chosen and can be obtained from the mixed states.
transients : bool
If True, then the transient causal states are included in the eM. If the eM is of a nonstationary process, then the transient causal states are required if the generated process language is to correspond to the nonstationary process. If transients is False, then the stationary process that is tended to by the nonstationary process is what will be generated (by the recurrent states of the eM).
label_nodes : bool | positive int
When True, the nodes will be labeled according to the mixed state distributions. When False, the nodes will be integers. If an integer, it specifies the number of decimal digits that should appear in the label. Also, each node will have an ‘ms’ attribute containing its mixed state.
Returns: msp : MealyHMM
The mixed state presentation.
Raises: CMPyException :
Raised when L becomes greater than max_length and there are still mixed states to augment.
UnexpectedHMM :
Raised if the generated process is not exactly synchronizing. In these cases, the epsilon machine has an infinite number of transient states and there is no synchronizing word which allows us to shortcut to the recurrent portion. So even if transients is False, construction of the recurrent eM cannot complete.
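The core mixed-state update underlying the construction is Bayesian conditioning of the node distribution on each observed symbol. A one-step NumPy sketch (illustrative, not the cmpy API), assuming `ltm` maps symbols to labeled transition matrices:

```python
import numpy as np

def mixed_state_step(mu, ltm, x):
    """One step of the mixed-state construction: condition the node
    distribution `mu` on observing symbol `x`.  Returns the probability of
    the symbol and the renormalized mixed state (None if x is forbidden)."""
    v = mu @ ltm[x]
    p = v.sum()
    if p == 0:
        return 0.0, None
    return p, v / p
```

Starting from the Golden Mean stationary distribution (2/3, 1/3), observing a ‘1’ synchronizes the mixed state to node B with probability 1/3.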

cmpy.machines.algorithms.mealyeM.
merge_mixedstates
(B, C, logs)¶ Merges the mixed states of B into C.
The mixed states of B are distributions over nodes in A. The nodes of C are tuples of the nodes in B. Some nodes in B have been deemed identical and are being merged in C.
This function writes the mixed states of C in terms of the mixed states of A, and returns a dictionary relating nodes of C to its new mixed states.
The mealycmech
Module¶
Best-practice algorithms for calculating Computational Mechanics quantities from block entropies.

class
cmpy.machines.algorithms.mealycmech.
CMechQuantities
(machine)¶ Bases:
object
Class implementing the calculation of computational mechanics quantities.
Attributes
machine : MealyHMM
The machine used to calculate quantities.
quantities : [str]
A list of strings specifying the quantities of interest.
Methods
block_entropy_asymptote()
block_entropy_integral(n)
blockstate_entropy()
blockstate_entropy_asymptote()
blockstate_entropy_integral(n)
calculate(quantities, maxL[, ndist]) Returns an array of quantities from computational mechanics.
crypticity()
entropy_gain()
entropy_rate()
excess_entropy()
excess_entropy_lower()
excess_entropy_upper()
gauge_information()
hmuL()
mixedstate_entropy()
oracular_information()
predictability_gain()
reset()
state_entropy()
stateblock_entropy()
stateblock_entropy_asymptote()
stateblock_entropy_integral(n)
synchronizationL()
synchronization_information()
total_predictability()
transient_information()
block_entropy_asymptote
()¶

block_entropy_integral
(n)¶

blockstate_entropy
()¶

blockstate_entropy_asymptote
()¶

blockstate_entropy_integral
(n)¶

calculate
(quantities, maxL, ndist=None)¶ Returns an array of quantities from computational mechanics.
When a quantity is a function of the word length (e.g. block entropy), then its values at each L will be returned. When a quantity is a scalar value, then its estimates as a function of L will be returned.
Parameters: quantities : [str]
A list of strings from the following list, specifying the desired quantities that will be returned at each iteration. In the list below, the states S are to be interpreted as states in the initial node distribution. Usually, this means that they are recurrent states (but not necessarily recurrent causal states).
‘HX0L’ : block entropy
‘HX0LSL’ : blockstate entropy
‘HS0X0L’ : stateblock entropy
‘HX0LgS0’ : block entropy conditioned on state
‘HX0LSLgS0’ : blockstate entropy conditioned on state
‘MSE’ : asymptotic mixed state entropy
‘El’ : lower bound estimate of excess entropy
‘Eu’ : upper bound estimate of excess entropy
‘T’ : transient information estimate
‘G’ : total predictability estimate
‘S’ : synchronization information estimate
‘hmu’ : entropy rate estimate
‘pg’ : predictability gain
‘sL’ : synchronization at L
‘In’ : nth block entropy convergence integral. For example: ‘I0’, ‘I5’
‘Jn’ : nth blockstate entropy convergence integral. For example: ‘J0’, ‘J5’
‘Kn’ : nth stateblock entropy convergence integral. For example: ‘K0’, ‘K5’
‘phi’ : gauge information estimate
‘zeta’ : oracular information estimate
‘chi’ : crypticity estimate
‘HX0L~’ : block entropy asymptotic behavior
‘HX0LSL~’ : blockstate entropy asymptotic behavior
‘HS0X0L~’ : stateblock entropy asymptotic behavior
‘hmuL’ : entropy rate line
maxL : int
The maximum L value used to compute the quantities.
Returns: estimates : NumPy masked array, shape (len(quantities), maxL+1)
The estimates of the requested quantities at the current value of L. For quantities which are discrete derivatives, the first few values may be masked.

crypticity
()¶

entropy_gain
()¶

entropy_rate
()¶

excess_entropy
()¶

excess_entropy_lower
()¶

excess_entropy_upper
()¶

gauge_information
()¶

hmuL
()¶

mixedstate_entropy
()¶

oracular_information
()¶

predictability_gain
()¶

reset
()¶

state_entropy
()¶

stateblock_entropy
()¶

stateblock_entropy_asymptote
()¶

stateblock_entropy_integral
(n)¶

synchronizationL
()¶

synchronization_information
()¶

total_predictability
()¶

transient_information
()¶

The mealyiso
Module¶
Isomorphism algorithms for MealyHMMs.

cmpy.machines.algorithms.mealyiso.
is_isomorphic
(m1, m2, style='node_with_symbol')¶ Returns True if m1 and m2 are isomorphic.
Parameters: m1, m2 : MealyHMM
The machines being tested for an isomorphism.
style : str
The type of isomorphism to search for. Acceptable options are below.
 ‘node’:
Checks for a node isomorphism and ignores the symbols appearing on each edge. So the Golden Mean process is isomorphic to the Even process with this type of check.
 ‘node_with_symbol’ :
Checks for a node isomorphism but also demands that the output symbols on each edge match. So the Golden Mean process is not isomorphic to the Even process with this type of check.
 ‘nodesymbol’ :
Checks for a simultaneous node and symbol isomorphism.
Returns: iso : bool
True if m1 and m2 are isomorphic and False otherwise.
The sofic
Module¶
Functions/classes related to soficity.

cmpy.machines.algorithms.sofic.
is_strictly_sofic
(machine)¶ Returns True if the support of the machine is strictly sofic.
The convert
Module¶
Convert CMPy graphs to and from other formats.

cmpy.machines.algorithms.convert.
MealyHMM_from_MooreHMM
(machine)¶ Returns a MealyHMM converted from a MooreHMM.
Parameters: machine : MooreHMM
The machine that will be converted into a MealyHMM.

cmpy.machines.algorithms.convert.
MooreHMM_from_MealyHMM
(machine)¶ Returns a MooreHMM converted from a MealyHMM.
Parameters: machine : MealyHMM
The machine that will be converted into a MooreHMM.

cmpy.machines.algorithms.convert.
DFA_from_MealyHMM
(machine, probability=False, eqrelation=None, create_using=None, topology=False)¶ Returns a DFA converted from a MealyHMM.
Parameters: probability : bool
When True, the symbols of the DFA will be tuples (p, x) where p is the probability of the edge and x is the output symbol of the edge.
eqrelation : callable | None
A callable accepting two symbols as arguments. It should return True if the symbols are equivalent. If None, then exact equality is used. If probability is True, the symbols are of the form (probability, output).
create_using : callable
A callable that is called without arguments. It should return a machine instance to be populated with the node/edge data. If None, then we use DFA.
topology : bool
If True, then 3 topologies, useful for DFAs, are constructed and returned. See the documentation in dfaminimize.py for more details.
Returns: m : DFA
The machine converted to a DFA.
t : tuple
A tuple of the constructed topologies. This is returned only when topology is True.

cmpy.machines.algorithms.convert.
MealyHMM_from_DFA
(machine, probability=False, create_using=None)¶ Returns a MealyHMM converted from a DFA.
Parameters: probability : bool
When True, the symbols of the DFA are assumed to be of the form (p, x) where p is the probability of the edge and x is the output symbol of the edge. When False, all edges are given equal weight and the machine is renormalized.
create_using : callable
A callable that is called without arguments. It should return a machine instance to be populated with the node/edge data. If None, then we use MealyHMM.
Returns: m : MealyHMM
The converted machine.
The mealymsbe
Module¶
Block entropies from mixed state presentations.
More generally, we compute block entropies from any static MealyHMM.

cmpy.machines.algorithms.mealymsbe.
block_entropies
(machine, lengths, msmap=None, logs=None)¶ Returns a NumPy array of block entropies from the machine.
If the machine is not an epsilon machine, then the computed block entropies will be incorrect.
Parameters: machine : MealyHMM
The machine used to compute block entropies. This function requires that the machine have a start node defined.
lengths : list
A list of integers, specifying the desired block entropies. The list is assumed to be sorted and not contain duplicates.
msmap : dict, None
A mapping from node names to MixedState instances. This is used to calculate the blockstate entropy. If None, then this quantity cannot be calculated.
logs : bool
Specifies if logs are being used to calculate the block entropies.
Returns: be : NumPy array
The array represents the block entropies at the requested lengths. The columns represent the block entropies at each length in lengths. The first row is the block entropy and the second row is the blockstate entropy. If msmap is None, then the shape of the array is (1,L) where L = len(lengths). If msmap is not None, then the shape of the array is (2,L).
Notes
One can obtain the state uncertainty conditioned on the block via: be[1] - be[0].
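For very small machines, block entropies can be checked by brute-force enumeration of all words, with Pr(w) = ndist T[w_0] ... T[w_{L-1}] 1. A sketch (exponential in L, unlike the mixed-state methods documented here; the encoding is illustrative):

```python
import numpy as np
from itertools import product

def block_entropy(ltm, ndist, L):
    """Brute-force H(L) = -sum_w Pr(w) log2 Pr(w), where `ltm` maps each
    symbol to its labeled transition matrix and `ndist` is the initial
    node distribution."""
    ones = np.ones(len(ndist))
    H = 0.0
    for w in product(ltm, repeat=L):     # every word over the alphabet
        v = np.asarray(ndist, dtype=float)
        for x in w:
            v = v @ ltm[x]
        p = v @ ones
        if p > 0:
            H -= p * np.log2(p)
    return H
```

A fair coin, for instance, has H(L) = L exactly.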
The mealyfprob
Module¶
Functions for calculating probabilities of MealyHMMs.
Eventually, someone should implement csc_matrix versions from scipy.sparse. Handling log2 probabilities might be an issue.
Despite the underlying algorithm similarity, we provide distinct log and non-log implementations so that we can minimize the number of checks against the ‘logs’ parameter.
All these functions define the forward probabilities at time t as:
 $$
 \Pr( X_0^t = w_0^t, S_t)
$$
For a word of length L, there will be L+1 forward probabilities, from t=0 to t=L.
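The recursion behind these functions can be sketched with plain NumPy, independently of cmpy: the t=0 forward vector is the node distribution, and each subsequent vector is the previous one multiplied by the transition matrix labeled with the current symbol. The generator name and the two-state machine below are hypothetical (a golden-mean-style process in which symbol 1 must be followed by 0):

```python
import numpy as np

def forward_probabilities(ndist, ltm, word):
    """Yield the L+1 forward probability vectors Pr(X_0^t = w_0^t, S_t)."""
    f = np.asarray(ndist, dtype=float)
    yield f                      # t = 0: the node distribution itself
    for x in word:
        f = f @ ltm[x]           # one step of the recursion
        yield f

# Hypothetical two-state machine: from state B only symbol 0 is allowed.
T0 = np.array([[0.5, 0.0], [1.0, 0.0]])
T1 = np.array([[0.0, 0.5], [0.0, 0.0]])
ndist = np.array([2/3, 1/3])     # its stationary node distribution

fps = list(forward_probabilities(ndist, {0: T0, 1: T1}, [1, 0]))
# 3 vectors for the length-2 word; Pr(10) is the mass of the last one
```

Summing the final vector over states marginalizes out $S_t$, recovering the plain word probability.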

cmpy.machines.algorithms.mealyfprob.
forward_logprobability__dense
(ltm, word, t, pfp)¶ Implementation of forward log probabilities using NumPy arrays.
This assumes 1 <= t <= L.
Parameters: ltm : dict
A dictionary of labeled transition matrices as NumPy arrays, keyed by symbols. The arrays must already be log2 arrays.
word : iterable
The word for which we are computing forward probabilities.
t : int
The time specifying which forward probability is being computed.
pfp : arraylike
The previous forward log probability.
Returns: cfp : arraylike
The current forward probability.
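A minimal sketch of such a dense log-space step, assuming log2 transition matrices and using a log-sum-exp over source states (the function name is hypothetical and this is not cmpy's code):

```python
import numpy as np

def forward_logprob_dense(log_ltm, word, t, pfp):
    """One log-space step: cfp_j = log2 sum_i 2**(pfp_i + logT_ij).

    log_ltm : dict of log2 transition matrices, keyed by symbol
    pfp     : previous forward log probability vector
    """
    logT = log_ltm[word[t - 1]]   # log2 matrix for the symbol at time t
    # Broadcast pfp down the rows, then log-sum-exp over source states;
    # -inf entries (forbidden transitions) drop out automatically.
    return np.logaddexp2.reduce(pfp[:, None] + logT, axis=0)

# One-state fair-coin check: Pr(0) = 0.5, so the log probability is -1.
log_ltm = {0: np.log2(np.array([[0.5]]))}
cfp = forward_logprob_dense(log_ltm, [0], 1, np.array([0.0]))
# cfp ~ [-1.0]
```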

cmpy.machines.algorithms.mealyfprob.
forward_logprobability__machine
(machine, word, t, pfp, sparse)¶ Core implementation of forward log probabilities using the machine.
This assumes 1 <= t <= L.
Parameters: machine : EdgeHMM instance
The machine representing the process.
word : iterable
The word for which we are computing forward probabilities.
t : int
The time specifying which forward probability is being computed.
pfp : dict
The previous forward log probability.
sparse : bool
If True, forbidden probabilities are not included in the return value.
Returns: cfp : dict
The current forward log probability.

cmpy.machines.algorithms.mealyfprob.
forward_probabilities_iter
(machine, word, logs=None, ndist=None, sparse=None, method=None, norder=None)¶ Returns an iterator of forward probabilities.
Let $w = w_0 w_1 \cdots w_{L-1}$ be a word of length L. The forward probability at time $t$ for word $w$ is a distribution:
 $$
 \Pr( X_0^t = w_0^t, S_t)
$$
There will be L+1 yields.
Parameters: machine : MealyHMM
The machine used to calculate the forward probabilities.
word : iterable
This is the word for which we will compute the probabilities.
logs : bool
When True, we compute log forward probabilities. When None, we use the value from cmpyParams[‘logs’].
ndist : {Distribution, LogDistribution}, optional
Probabilities are calculated from a node distribution. Typically, this is the stationary distribution, so when ndist is None, we calculate the stationary probabilities. Optionally, one may specify a different distribution.
sparse : bool, optional
When True, only realizations with nonzero probability are included. When False, every realization will be included. When None, we use the value from cmpyParams[‘sparse’].
method : str, optional
Specifies how the forward probabilities should be calculated. Valid options are:
‘machine’ ‘array’
If ‘machine’, then matrix multiplication is done via for loops over the incoming edges of each node. If ‘array’, then matrix multiplication is done via NumPy arrays. The default value is taken from cmpyParams[‘probability_method’].
norder : list, optional
This parameter is used and required only when method is equal to ‘machine’. It specifies the order of the nodes in the array. Every node in the machine must be present in the list.

cmpy.machines.algorithms.mealyfprob.
forward_probability__dense
(ltm, word, t, pfp)¶ Implementation of forward probabilities using NumPy arrays.
This assumes 1 <= t <= L.
Parameters: ltm : dict
A dictionary of labeled transition matrices as NumPy arrays, keyed by symbols.
word : iterable
The word for which we are computing forward probabilities.
t : int
The time specifying which forward probability is being computed.
pfp : arraylike
The previous forward probability.
Returns: cfp : arraylike
The current forward probability.

cmpy.machines.algorithms.mealyfprob.
forward_probability__machine
(machine, word, t, pfp, sparse)¶ Core implementation of forward probabilities using the machine.
This assumes 1 <= t <= L.
Parameters: machine : EdgeHMM instance
The machine representing the process.
word : iterable
The word for which we are computing forward probabilities.
t : int
The time specifying which forward probability is being computed.
pfp : dict
The previous forward probability.
sparse : bool
If True, forbidden probabilities are not included in the return value.
Returns: cfp : dict
The current forward probability.
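The machine-based variant can be pictured as a loop over edges rather than a matrix product: each edge labeled with the current symbol moves probability mass from its source node to its target node. The sketch below stands in for cmpy's edge iteration with a plain list of (source, target, symbol, probability) tuples; all names are hypothetical:

```python
def forward_prob_machine(edges, word, t, pfp, sparse=True):
    """Dict-based step: accumulate mass over edges labeled word[t-1].

    edges : list of (u, v, symbol, p) tuples standing in for machine edges
    pfp   : dict mapping node -> previous forward probability
    """
    x = word[t - 1]
    cfp = {}
    for u, v, sym, p in edges:
        if sym == x and u in pfp:
            cfp[v] = cfp.get(v, 0.0) + pfp[u] * p
    if not sparse:
        # Dense output: include forbidden (zero-probability) nodes too.
        nodes = {u for u, _, _, _ in edges} | {v for _, v, _, _ in edges}
        for n in nodes:
            cfp.setdefault(n, 0.0)
    return cfp

# Golden-mean-style machine: from state B only symbol 0 is allowed.
edges = [('A', 'A', 0, 0.5), ('A', 'B', 1, 0.5), ('B', 'A', 0, 1.0)]
cfp = forward_prob_machine(edges, [1], 1, {'A': 2/3, 'B': 1/3})
# cfp == {'B': 1/3} : only B is reachable after seeing a 1
```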

cmpy.machines.algorithms.mealyfprob.
get_fprob_func
(method, logs)¶ Returns the appropriate forward probability function.
Parameters: method : str
Specification of how forward probabilities are computed. Valid values are:
‘array’ ‘machine’
logs : bool
Specifies whether standard or log probabilities are to be computed.
Returns: func : function
Function used to compute forward probabilities.

cmpy.machines.algorithms.mealyfprob.
init_distltm
(machine, start=None, logs=None)¶ Returns the initial distribution and labeled transition matrices.
Parameters: machine : MealyHMM
The machine used to prepare the arrays.
start : node, distribution, None
If None, then the stationary distribution is computed. If start is a distribution, then it will be used as the starting node distribution. If start is a node in the machine, then the distribution will be peaked at the node.
logs : bool
Specifies if the arrays should be prepared for computing log probabilities or standard probabilities.
Returns: ndist : distribution
The starting node distribution.
ltm : dict
The labeled transition matrices, keyed by symbols in the machine.
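The two returned objects can be built from the edge list alone: the labeled matrices collect edge probabilities per symbol, and the stationary distribution is the left eigenvector of their sum with eigenvalue 1. A hypothetical standalone sketch (not cmpy's implementation, and omitting the log2 preparation):

```python
import numpy as np

def init_dist_ltm(edges, nodes):
    """Sketch: stationary node distribution and labeled transition matrices.

    edges : list of (u, v, symbol, p) tuples
    nodes : list fixing the ordering of nodes in the arrays
    """
    idx = {n: i for i, n in enumerate(nodes)}
    symbols = {x for _, _, x, _ in edges}
    ltm = {x: np.zeros((len(nodes), len(nodes))) for x in symbols}
    for u, v, x, p in edges:
        ltm[x][idx[u], idx[v]] = p
    T = sum(ltm.values())                 # full transition matrix
    # Stationary distribution: left eigenvector of T for eigenvalue 1.
    vals, vecs = np.linalg.eig(T.T)
    pi = np.real(vecs[:, np.argmin(np.abs(vals - 1))])
    pi /= pi.sum()                        # normalize (also fixes the sign)
    return pi, ltm

edges = [('A', 'A', 0, 0.5), ('A', 'B', 1, 0.5), ('B', 'A', 0, 1.0)]
ndist, ltm = init_dist_ltm(edges, ['A', 'B'])
# ndist ~ [2/3, 1/3]
```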

cmpy.machines.algorithms.mealyfprob.
logprobability__numpy
(machine, word, ndist)¶ Calculates the log probability of word from ndist.

cmpy.machines.algorithms.mealyfprob.
logprobability__numpy2
(word, dist, ltm)¶ Raw function to calculate word log probabilities.
Parameters: word : iterable
The word whose probability is to be computed.
dist : NumPy array
An array representing the starting node distribution.
ltm : dict
A dictionary keyed by symbols in word with values representing the labeled transition matrices.

cmpy.machines.algorithms.mealyfprob.
prepare_fprob_array
(machine, ndist, norder, logs)¶ Returns the labeled transition matrices and initial forward probability.
The forward probability functions which use NumPy require that the node distribution be a dense array.
Parameters: machine : MealyHMM
The machine used to compute the forward probabilities.
ndist : dict
The node distribution used to compute forward probabilities.
norder : list
The ordering of the nodes as elements in the array.
logs : bool
Specifies if we are computing log or standard forward probabilities.
Returns: narray : NumPy array
The initial forward probability.
ltm : dict
The labeled transition matrices, keyed by symbols in the machine.

cmpy.machines.algorithms.mealyfprob.
probability__numpy
(machine, word, ndist)¶ Calculates the probability of word from ndist.

cmpy.machines.algorithms.mealyfprob.
probability__numpy2
(word, dist, ltm)¶ Raw function to calculate word probabilities.
Parameters: word : iterable
The word whose probability is to be computed.
dist : NumPy array
An array representing the starting node distribution.
ltm : dict
A dictionary keyed by symbols in word with values representing the labeled transition matrices.
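The raw computation these functions perform is a chained matrix product: start from the node distribution, multiply by the labeled matrix of each symbol in turn, and sum over final states. A hypothetical standalone version, reusing the golden-mean-style machine from earlier sketches:

```python
import numpy as np

def word_probability(word, dist, ltm):
    """Pr(word) = dist . T(w_0) . T(w_1) ... summed over final states."""
    f = np.asarray(dist, dtype=float)
    for x in word:
        f = f @ ltm[x]           # advance one symbol
    return f.sum()               # marginalize out the final state

# Hypothetical machine: symbol 1 must be followed by 0.
T0 = np.array([[0.5, 0.0], [1.0, 0.0]])
T1 = np.array([[0.0, 0.5], [0.0, 0.0]])
p = word_probability([1, 1], np.array([2/3, 1/3]), {0: T0, 1: T1})
# p == 0.0 : the word '11' is forbidden in this machine
```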
The mealytfa
Module¶
Implementations of various tablefilling type algorithms for MealyHMMs.
Syntax used:
States will always be {0, 1, ..., N-1}. Symbols will always be {0, 1, ..., A-1}.
 HMM_MatrixForm = list of symbol-labeled transition matrices
 [T(0), T(1), ..., T(A-1)]
HMM_ListForm = list [transitions, probabilities]
transitions is a list [[t00, t01, ..., t0(A-1)], ..., [t(N-1)0, ..., t(N-1)(A-1)]] where tkj = the state to which state k goes on symbol j (‘X’ if no such state)
probabilities is a list [[p00, p01, ..., p0(A-1)], ..., [p(N-1)0, ..., p(N-1)(A-1)]] where pkj = probability of symbol j given state k
Note: All HMMs are assumed to be unifilar, but not necessarily epsilon machines.
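The conversion between the two forms follows directly from these conventions: by unifilarity, row k of T(j) has at most one nonzero entry, whose column is the target state and whose value is the symbol probability. A sketch of the matrix-to-list direction (the function name is hypothetical; cmpy's own converter is Convert_HMMMatrixForm_to_HMMListForm below):

```python
def matrix_to_list_form(matrices):
    """Sketch of HMM_MatrixForm -> HMM_ListForm under the conventions above.

    matrices : list [T(0), ..., T(A-1)] of N x N nested lists.
    """
    N = len(matrices[0])
    A = len(matrices)
    transitions, probabilities = [], []
    for k in range(N):
        trow, prow = [], []
        for j in range(A):
            # Unifilarity: at most one nonzero entry in row k of T(j).
            targets = [(s, p) for s, p in enumerate(matrices[j][k]) if p > 0]
            if targets:
                trow.append(targets[0][0])
                prow.append(targets[0][1])
            else:
                trow.append('X')        # symbol j forbidden from state k
                prow.append(0.0)
        transitions.append(trow)
        probabilities.append(prow)
    return [transitions, probabilities]

# Golden-mean machine in matrix form: state 1 cannot emit symbol 1.
T0 = [[0.5, 0.0], [1.0, 0.0]]
T1 = [[0.0, 0.5], [0.0, 0.0]]
lf = matrix_to_list_form([T0, T1])
# lf == [[[0, 1], [0, 'X']], [[0.5, 0.5], [1.0, 0.0]]]
```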

cmpy.machines.algorithms.mealytfa.
Convert_HMMMatrixForm_to_HMMListForm
(HMM_Matrix_Form)¶

cmpy.machines.algorithms.mealytfa.
TF_induction
(transitions, base_matrix)¶

cmpy.machines.algorithms.mealytfa.
TF_initialization_PC
(HMM_Matrix_Form)¶

cmpy.machines.algorithms.mealytfa.
TF_initialization_TD
(HMM_Matrix_Form)¶

cmpy.machines.algorithms.mealytfa.
TF_initialization_prob
(HMM_Matrix_Form)¶

cmpy.machines.algorithms.mealytfa.
get_transition
(matrices, n, a)¶

cmpy.machines.algorithms.mealytfa.
is_asymptotic_sync
(M)¶

cmpy.machines.algorithms.mealytfa.
is_exact_sync
(M)¶

cmpy.machines.algorithms.mealytfa.
is_minimal
(M)¶

cmpy.machines.algorithms.mealytfa.
randint
(low, high=None, size=None)¶ Return random integers from low (inclusive) to high (exclusive).
Return random integers from the “discrete uniform” distribution in the “half-open” interval [low, high). If high is None (the default), then results are from [0, low).
Parameters: low : int
Lowest (signed) integer to be drawn from the distribution (unless
high=None
, in which case this parameter is the highest such integer).high : int, optional
If provided, one above the largest (signed) integer to be drawn from the distribution (see above for behavior if
high=None
).size : int or tuple of ints, optional
Output shape. Default is None, in which case a single int is returned.
Returns: out : int or ndarray of ints
size-shaped array of random integers from the appropriate distribution, or a single such random int if size is not provided.
See also
random.random_integers
 similar to randint, only for the closed interval [low, high], and 1 is the lowest value if high is omitted. In particular, this other one is the one to use to generate uniformly distributed discrete non-integers.
Examples
>>> np.random.randint(2, size=10)
array([1, 0, 0, 0, 1, 1, 0, 0, 1, 0])
>>> np.random.randint(1, size=10)
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
Generate a 2 x 4 array of ints between 0 and 4, inclusive:
>>> np.random.randint(5, size=(2, 4))
array([[4, 0, 2, 1],
       [3, 2, 2, 0]])