Machines Documentation¶
This page contains the Machines Package documentation.
Subpackages¶
- Algorithms Documentation
  - Subpackages
    - The emmo Module
    - The mealybe Module
    - The mealyms Module
    - The entropyfp Module
    - The typeclass Module
    - The mealycompose Module
    - The attrmatrix Module
    - The mealysr Module
    - The mealysw Module
    - The dfaminimize Module
    - The mealywords Module
    - The mealypm Module
    - The emco Module
    - The mealyee Module
    - The mealybprob Module
    - The mealyeM Module
    - The mealycmech Module
    - The mealyiso Module
    - The sofic Module
    - The convert Module
    - The mealymsbe Module
    - The mealyfprob Module
    - The mealytfa Module
- Libraries Documentation
- Tests Documentation
- Processes Documentation
  - Subpackages
    - The period Module
    - The noisyp2 Module
    - The coin Module
    - The even Module
    - The odd Module
    - The rn1n Module
    - The sns Module
    - The irrev Module
    - The rip Module
    - The alternatingbiasedcoins Module
    - The rrx Module
    - The flower Module
    - The bandmerging Module
    - The cyclic Module
    - The goldenmean Module
    - The butterfly Module
    - The transducers Module
    - The nrps Module
    - The analyticeven Module
    - The cantor Module
    - The nemo Module
    - The multiple Module
    - The misiurewicz Module
    - The threehundred Module
    - The beforeafter Module
    - The beadsonnecklace Module
    - The rn1c Module
    - The rrxro Module
    - The rotation Module
    - The logic Module
    - The random Module
The dfa Module¶
Simple DFA class.

class cmpy.machines.dfa.DFA(data=None, name='', **attr)¶ Bases: cmpy.machines.base.BaseMachine
Machine where symbols are output on the edge transitions.
Methods
- add_accept_node(n): Adds n to the set of accept nodes.
- alphabet([u, v]): Returns the alphabet of the machine.
- alphabet_by_node([succ]): Returns the alphabet for each node.
- complement(): Returns a DFA accepting the complement language of the DFA.
- complete([reject, copy]): Returns the DFA as a complete DFA.
- delta(ns, word): Returns the final state after seeing word from nodes ns.
- get_accept_nodes(): Returns the set of accept nodes.
- intersection(other[, minimize, relabel_nodes]): Returns a DFA accepting the intersection of the languages.
- is_deterministic(): Returns True if the DFA is deterministic.
- is_empty(): Returns True if the DFA’s language is empty and False if not.
- is_equivalent(other[, minimize]): Returns True if the languages of self and other are equal.
- is_isomorphic(other[, style]): Returns True if the machine is isomorphic to other.
- is_minimal(): Returns True if the DFA is minimal.
- is_subset(other): Returns True if the DFA language is a subset of the language of other.
- is_unifilar(): Returns True if the DFA is unifilar.
- minimize([mapping]): Returns a minimized version of the DFA.
- reduce([copy]): Reduces the machine so that it contains only the recurrent nodes.
- relabel_symbols(mapping): Changes the alphabet of the machine, in place.
- remove_accept_node(n): Removes n from the set of accept nodes.
- reverse([copy]): Reverses the machine so that it generates the reverse language.
- set_accept_nodes(nbunch): Sets the set of accept nodes to the nodes in nbunch.
- set_start_node(n): Sets the start node.
add_accept_node
(n)¶ Adds n to the set of accept nodes.
Parameters: n : node
The node that will be added to the set of accept nodes.
Raises: CMPyException :
When n is not in the machine.

alphabet
(u=None, v=None)¶ Returns the alphabet of the machine.
The alphabet between the from node u and the to node v is dynamically constructed by examining every edge between those nodes. In the default call, the alphabet of the entire machine is returned. Note that this is not the same as the alphabet from all nodes accessible from the start node.
Parameters: u : node
The “from” node. If u is None, then all nodes are considered as “from” nodes. The result is that the alphabet over all incoming edges to v is constructed. Thus, specifying u and leaving v as None, will give the outgoing alphabet from node u.
v : node
The “to” node. If v is None, then all nodes are considered as “to” nodes. The result is that the alphabet over all outgoing edges to u is constructed. Thus, specifying v and leaving u as None, will give the incoming alphabet to node v.
Returns: a : set
The alphabet of the machine.
Raises: AlphabetException :
When the alphabet cannot be obtained from the machine.

alphabet_by_node
(succ=True)¶ Returns the alphabet for each node.
Parameters: succ : bool
If True, then we return the outgoing alphabet for each node. If False, then we return the incoming alphabet for each node.
Returns: a : dict
A dictionary keys by nodes in the machine. The values are sets containing all symbols that can be seen going out from (or into) the node.

complement
()¶ Returns a DFA accepting the complement language of the DFA.
Returns: m : DFA
The complement DFA.

complete
(reject='REJECT', copy=True)¶ Returns the DFA as a complete DFA.
Parameters: reject : str
The name given to the reject node.
copy : bool
If True, then a copy of the machine is made and returned. If False, the machine is updated inplace.
Returns: m : DFA
The complete DFA.

delta
(ns, word)¶ Returns the final state after seeing word from nodes ns.
Parameters: ns : nodes
The nodes from which an attempt is made to recognize the word.
word : word
An iterable of words from the alphabet, used to advance the from nodes ns.
Returns: nns : {nodes}
The nodes the DFA is in after seeing word from nodes ns. If the word is not recognized, then an empty set is returned.
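The transition behavior above can be sketched over a plain edge table; the machine, node names, and representation below are hypothetical, not CMPy's internals:

```python
def delta(edges, ns, word):
    """Advance a set of DFA nodes through a word.

    edges maps (node, symbol) -> next node.  Nodes with no matching
    edge are dropped; an empty set means the word is not recognized.
    """
    current = set(ns)
    for symbol in word:
        current = {edges[(n, symbol)] for n in current if (n, symbol) in edges}
        if not current:
            return set()
    return current

# Toy machine for the Even process: node 'A' has seen an even number
# of 1s, node 'B' an odd number; symbol 0 is forbidden from 'B'.
edges = {('A', 0): 'A', ('A', 1): 'B', ('B', 1): 'A'}
print(delta(edges, {'A'}, [0, 1, 1]))  # {'A'}
print(delta(edges, {'A'}, [1, 0]))     # set(): word is not recognized
```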

edge_attr_aliases
= {'i': 'output', 'input': 'output', 'o': 'output'}¶

edge_attr_required
= set(['output'])¶

get_accept_nodes
()¶ Returns the set of accept nodes.

graph_attr_aliases
= {}¶

graph_attr_required
= set([])¶

intersection
(other, minimize=True, relabel_nodes=True)¶ Returns a DFA accepting the intersection of the languages.
Parameters: other : DFA
The DFAs whose language will be intersected with.
minimize : bool
If True, then we return the minimal DFA recognizing the intersection of the languages recognized by the two DFAs.
relabel_nodes : bool
If True, then the nodes are relabeled to integers.
Returns: m : DFA
The DFA which recognizes the intersection language.
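The standard way to build such a machine is the product construction; here is a minimal standalone sketch (edge tables and state names are hypothetical, not CMPy's representation):

```python
def accepts(edges, start, accept, word):
    """Run a DFA given as an edge table (node, symbol) -> node."""
    node = start
    for symbol in word:
        if (node, symbol) not in edges:
            return False
        node = edges[(node, symbol)]
    return node in accept

def intersection(d1, s1, a1, d2, s2, a2, alphabet):
    """Product construction: states of the intersection DFA are pairs
    (p, q); a pair is accepting iff both components are accepting."""
    start = (s1, s2)
    edges, accept = {}, set()
    todo, seen = [start], {start}
    while todo:
        p, q = todo.pop()
        if p in a1 and q in a2:
            accept.add((p, q))
        for x in alphabet:
            if (p, x) in d1 and (q, x) in d2:
                nxt = (d1[(p, x)], d2[(q, x)])
                edges[((p, q), x)] = nxt
                if nxt not in seen:
                    seen.add(nxt)
                    todo.append(nxt)
    return edges, start, accept

# d1: no two consecutive 1s (Golden Mean support); d2: even word length.
d1 = {('A', 0): 'A', ('A', 1): 'B', ('B', 0): 'A'}
d2 = {('E', 0): 'O', ('E', 1): 'O', ('O', 0): 'E', ('O', 1): 'E'}
edges, start, accept = intersection(d1, 'A', {'A', 'B'}, d2, 'E', {'E'}, (0, 1))
```

The product machine accepts exactly the even-length words with no `11` block.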

is_deterministic
()¶ Returns True if the DFA is deterministic.
A DFA is deterministic if, given the current state, there is only one next state associated with each next symbol.
Notes
Determinism is also known as ‘unifilarity’ in the information theory literature and as ‘right-resolving’ in the symbolic dynamics literature.

is_empty
()¶ Returns True if the DFA’s language is empty and False if not.

is_equivalent
(other, minimize=True)¶ Returns True if the languages of self and other are equal.
Parameters: other : DFA
The machine to be tested against for equivalence.
minimize : bool
We test if the minimal DFA for each machine are isomorphic. When minimize is True, then the input machines are not assumed to be minimal and are minimized before the isomorphism check.
Returns: equiv : bool
True if the DFAs are equivalent and False otherwise.

is_isomorphic
(other, style='node_with_symbol')¶ Returns True if the machine is isomorphic to other.
Parameters: other : DFA
The machine to be tested against for an isomorphism.
style : str
The type of isomorphism to search for. Acceptable options are below.
 ‘node’:
Checks for a node isomorphism and ignores the symbols appearing on each edge. So the Golden Mean process is isomorphic to the Even process with this type of check.
 ‘node_with_symbol’ :
Checks for a node isomorphism but also demands that the output symbols on each edge match. So the Golden Mean process is not isomorphic to the Even process with this type of check.
 ‘nodesymbol’ :
Checks for a simultaneous node and symbol isomorphism.
Returns: iso : bool
True if the machine is isomorphic to other and False otherwise.

is_minimal
()¶ Returns True if the DFA is minimal.
Returns: minimal : bool
If the DFA is minimal, then minimal is True. Otherwise, False.

is_subset
(other)¶ Returns True if the DFA language is a subset of the language of other.

is_unifilar
()¶ Returns True if the DFA is unifilar.
Since unifilar is another name for ‘deterministic’, this is True for all DFAs.

minimize
(mapping=False)¶ Returns a minimized version of the DFA.
The minimization method follows the Hopcroft algorithm. Worst-case running time is O(kn log n) where k is the alphabet size and n is the number of nodes in the DFA.
Parameters: m : DFA
The DFA which will be minimized.
mapping : bool
If True, then the mapping from nodes in the original machine to nodes in the minimized machine is returned.
Returns: mm : DFA
The minimized DFA.
mapp : dict
The mapping from nodes in m to nodes in the minimized DFA. This is returned only if mapping is True.
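The docstring cites Hopcroft's O(kn log n) algorithm; as an illustrative stand-in, here is the simpler Moore-style partition refinement, which computes the same state equivalence (the toy complete DFA and all names below are hypothetical):

```python
def minimize_partition(edges, states, accept, alphabet):
    """Moore-style partition refinement on a complete DFA.

    Starts from the accepting / non-accepting split and repeatedly
    splits blocks whose states disagree on which block each symbol
    leads to.  Returns the partition into equivalence classes."""
    blocks = [b for b in (set(accept), set(states) - set(accept)) if b]
    while True:
        def block_of(s):
            return next(i for i, b in enumerate(blocks) if s in b)
        refined = []
        for block in blocks:
            groups = {}
            for s in block:
                sig = tuple(block_of(edges[(s, x)]) for x in alphabet)
                groups.setdefault(sig, set()).add(s)
            refined.extend(groups.values())
        if len(refined) == len(blocks):
            return refined
        blocks = refined

# States 1 and 2 behave identically, so they merge into one class.
edges = {(0, 'a'): 1, (0, 'b'): 2,
         (1, 'a'): 1, (1, 'b'): 2,
         (2, 'a'): 1, (2, 'b'): 2}
classes = minimize_partition(edges, {0, 1, 2}, {1, 2}, ('a', 'b'))
```

Hopcroft's algorithm refines the same partition but processes only the smaller half of each split, which is where the log factor comes from.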

node_attr_aliases
= {}¶

node_attr_required
= set([])¶

reduce
(copy=True)¶ Reduces the machine so that it contains only the recurrent nodes.
Parameters: copy : bool
When True, we return a new machine. When False, we modify the machine in place, removing every node except the recurrent nodes.
Returns: m : machine
The irreducible (or recurrent) portion of the original machine.

relabel_symbols
(mapping)¶ Changes the alphabet of the machine, in place.
Parameters: mapping : dict
A dictionary mapping old symbols to new symbols.

remove_accept_node
(n)¶ Removes n from the set of accept nodes.
No exception is raised if n is not in the set of accept nodes.
Parameters: n : node
The node that will be removed from the set of accept nodes.

reverse
(copy=True)¶ Reverses the machine so that it generates the reverse language.
Parameters: copy : bool
When True, we return a new machine. When False, we modify the machine inplace.
Returns: m : machine
The reversed machine.
TODO: Make start and accept states copy over. This likely requires adjusting the base to allow for multiple start nodes, and should be in an NFA base class, since NFAs allow for de facto multiple start states via empty transitions.

set_accept_nodes
(nbunch)¶ Sets the set of accept nodes to the nodes in nbunch.
Parameters: nbunch : nodes
An iterable of nodes that will become the new set of accept nodes.
Raises: CMPyException :
When a node n is in nbunch but not in the machine.

set_start_node
(n)¶ Sets the start node.
Parameters: n : node
The node that will be set as the start node.

The mealyhmm Module¶
The epsilonmachine Module¶
Epsilon Machine¶

class cmpy.machines.epsilonmachine.BaseEpsilonMachine¶ Bases: object
Methods
- cryptic_order(): Returns the cryptic order of the process.
- crypticity([max_length]): Returns the crypticity of the process.
- excess_entropy([max_length, tolerance]): Returns the excess entropy of the process.
- excess_entropy_bounds(lengths[, return_be]): Returns lower and upper bounds for the excess entropy.
- excess_entropy_exact([max_length]): Returns the excess entropy of the process.
- is_markov(): Returns True if the Markov order of the process is finite.
- markov_order(): Returns the Markov order of the process.
- predictive_information_rate([N]): Returns the predictive information rate, b_mu, of the process [abdallah2010].
- predictive_information_rate_exact(): Returns the predictive information rate, b_mu, of the process [abdallah2010].
- residual_entropy_rate([N]): Returns the residual entropy rate, r_mu, of the process [abdallah2010].
- residual_entropy_rate_exact(): Returns the residual entropy rate, r_mu, of the process [abdallah2010].
- reverse([copy]): Reverses the machine so that it generates the ‘reverse’ process.
- setcomplexity(): Returns the set complexity of the process language.
- statistical_complexity(): Returns the statistical complexity of the process.
- to_mealyhmm([copy]): Return this same presentation, but as a MealyHMM.
cryptic_order
()¶ Returns the cryptic order of the process.
Returns: k : int, inf
The process’s cryptic order.
Notes
The machine must be able to be put into a finite, standard epsilon machine form.

crypticity
(max_length=30)¶ Returns the crypticity of the process.
The crypticity of the process is defined as H[S | Future], the entropy of the causal state conditioned on the future. The calculation of crypticity depends on the calculation of excess entropy. See the docstring for excess_entropy for more details.
Parameters: max_length : int
The maximum word length used in attempts to construct the recurrent portion of the reverse epsilon machine.
Returns: chi : float
The crypticity of the process.
Raises: CMPyException :
Raised when the recurrent portion of the reverse epsilon machine was not found after reaching L = max_length. As long as the reverse process is exactly synchronizable, increasing the value of max_length will eventually work. If the process is not exactly synchronizable, then no value of max_length will work.
NonunifilarException :
Raised when the machine is not unifilar, as required.

excess_entropy
(max_length=30, tolerance=1e-05)¶ Returns the excess entropy of the process.
Parameters: max_length : int
The maximum length to use when computing the excess entropy, either via the exact method or via block entropies.

excess_entropy_bounds
(lengths, return_be=False)¶ Returns lower and upper bounds for the excess entropy.
The lower and upper bounds on the excess entropy are computed from block entropies, conditioning on the causal state at time L, whose probabilities are given by the stationary distribution.
When lengths is a list of integers, we compute the lower and upper bounds for every length in lengths. If lengths is a float, it specifies the desired stopping difference: when the difference between the bounds is less than the specified difference, we stop at that particular L and return the array of bounds.
Note: If the two curves are equal, it doesn’t necessarily mean that we have converged to the value of the excess entropy. It simply means that the two curves give the same estimate.
Parameters: lengths : [int], float
Either a list of integers or a float.
return_be : bool
If True, then we also return the computed block entropies.
Returns: ee : NumPy array, shape (2,L)
The array of lower and upper bounds on the excess entropy. L is equal to len(lengths) if lengths was a list. If lengths was a float, then L is the length at which the difference between the upper and lower bounds became less than the specified float. The i-th column corresponds to the bounds at lengths[i]. The rows (0, 1) correspond to (lower bound, upper bound).
be : NumPy array, shape (2,L)
The block entropies that were computed to obtain the bounds on the excess entropy; each row holds one of the two block-entropy curves used to form the bounds.
Raises: TypeError :
Raised if lengths could not be converted to either a list of lengths or a float.
NonunifilarException :
Raised when the machine is not unifilar, as required.
See also

excess_entropy_exact
(max_length=30)¶ Returns the excess entropy of the process.
The excess entropy is calculated as the mutual information E = I[S+ ; S-], where S+ represents the forward causal states and S- represents the reverse causal states. The calculation proceeds via the mixed state algorithm. Presently, the method is guaranteed to succeed only if:
 the presentation is unifilar
 the presentation is irreducible
 the reverse process is exactly synchronizable
Also, there is (currently) no way of knowing if the reverse process is exactly synchronizable. This is due to the non-unifilarity of the time-reversed presentation. When the reverse process is not exactly synchronizable, it is still possible to get good estimates of the excess entropy. See excess_entropy_bounds().
Parameters: max_length : int
The maximum word length used in attempts to construct the recurrent portion of the reverse epsilon machine.
Returns: ee : float
The excess entropy of the process.
Raises: CMPyException :
When the recurrent portion of the reverse epsilon machine was not found after reaching L = max_length. As long as the reverse process is exactly synchronizable, increasing the value of max_length will eventually work. If the process is not exactly synchronizable, then no value of max_length will work.
NonunifilarException :
Raised when the machine is not unifilar, as required.
See also

is_markov
()¶ Returns True if the Markov order of the process is finite.

markov_order
()¶ Returns the Markov order of the process.
Returns: R : int, inf
The Markov order of the process.
Notes
The machine must have a finite number of recurrent states for this method to work.

predictive_information_rate
(N=10)¶ Returns the predictive information rate, b_mu, of the process [abdallah2010].
b_mu = I[ X_0 : X_{1:infty} | X_{-infty:0} ]
Returns: bmu : float
The predictive information rate of the process.

predictive_information_rate_exact
()¶ Returns the predictive information rate, b_mu, of the process [abdallah2010].
b_mu = I[ X_0 : X_{1:infty} | X_{-infty:0} ]
Returns: bmu : float
The predictive information rate of the process.
Raises: CMPyException : :
If either the forward or reverse process does not have a finite epsilon machine.

residual_entropy_rate
(N=10)¶ Returns the residual entropy rate, r_mu, of the process [abdallah2010].
r_mu = H[ X_0 | X_{-infty:0}, X_{1:infty} ]
Returns: rmu : float
The residual entropy rate of the process.

residual_entropy_rate_exact
()¶ Returns the residual entropy rate, r_mu, of the process [abdallah2010].
r_mu = H[ X_0 | X_{-infty:0}, X_{1:infty} ]
Returns: rmu : float
The residual entropy rate of the process.
Raises: CMPyException : :
If either the forward or reverse process does not have a finite epsilon machine.
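As a consistency check on the two rates above, the chain rule for conditional entropy splits the entropy rate into its residual and predictive parts; this decomposition is from [abdallah2010], with notation matching the docstrings:

```latex
h_\mu = H[X_0 \mid X_{-\infty:0}]
      = H[X_0 \mid X_{-\infty:0},\, X_{1:\infty}]
        + I[X_0 : X_{1:\infty} \mid X_{-\infty:0}]
      = r_\mu + b_\mu
```

So the estimates returned by residual_entropy_rate and predictive_information_rate should sum (up to estimation error) to the entropy rate.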

reverse
(copy=True)¶ Reverses the machine so that it generates the ‘reverse’ process.
Parameters: copy : bool
When True, we return a new machine. When False, we modify the machine inplace.
Returns: m : MealyHMM
The reversed machine.
Notes
The reverse machine is not, generally, the recurrent part of an epsilon machine. Thus, m is returned as a MealyHMM.

setcomplexity
()¶ Returns the set complexity of the process language.
Let w = xs be a semi-infinite word ending in symbol s. The set complexity is the average Shannon entropy stored about x needed to determine whether s results in w being a valid word in the language.
The set complexity was defined by Grassberger in ...

statistical_complexity
()¶ Returns the statistical complexity of the process.
Returns: sc : float
The statistical complexity of the process.
Notes
Because the machine is an EpsilonMachine instance, we assume that the machine is the minimal unifilar presentation of the process.

to_mealyhmm
(copy=True)¶ Return this same presentation, but as a MealyHMM.
Parameters: copy : bool
If True, then a copy of the machine is made. Otherwise, the class is dynamically changed to MealyHMM.
Returns: m : MealyHMM
A machine generating the same process.


class cmpy.machines.epsilonmachine.EpsilonMachine(data=None, name='', **attr)¶ Bases: cmpy.machines.epsilonmachine.BaseEpsilonMachine, cmpy.machines.mealyhmm.MealyHMM
EpsilonMachine
Reducible, merged presentations of a process.
This presentation contains only causal states, both recurrent and transient.
Methods
- causal_state(word): Returns the causal state induced by the specified word.
- cryptic_order(): Returns the cryptic order of the process.
- crypticity([max_length]): Returns the crypticity of the process.
- excess_entropy([max_length, tolerance]): Returns the excess entropy of the process.
- excess_entropy_bounds(lengths[, return_be]): Returns lower and upper bounds for the excess entropy.
- excess_entropy_exact([max_length]): Returns the excess entropy of the process.
- is_markov(): Returns True if the Markov order of the process is finite.
- markov_order(): Returns the Markov order of the process.
- predictive_information_rate([N]): Returns the predictive information rate, b_mu, of the process [abdallah2010].
- predictive_information_rate_exact(): Returns the predictive information rate, b_mu, of the process [abdallah2010].
- reduce([copy]): Returns a new machine containing only the recurrent nodes.
- residual_entropy_rate([N]): Returns the residual entropy rate, r_mu, of the process [abdallah2010].
- residual_entropy_rate_exact(): Returns the residual entropy rate, r_mu, of the process [abdallah2010].
- reverse([copy]): Reverses the machine so that it generates the ‘reverse’ process.
- setcomplexity(): Returns the set complexity of the process language.
- statistical_complexity(): Returns the statistical complexity of the process.
- to_mealyhmm([copy]): Return this same presentation, but as a MealyHMM.
causal_state
(word)¶ Returns the causal state induced by the specified word.
Parameters: word : iterable
The word used to induce the causal state.
Returns: cs : node
The causal state induced by the specified word.
Raises: ForbiddenWord :
Raised if the word is forbidden.

reduce
(copy=True)¶ Returns a new machine containing only the recurrent nodes.
Parameters: copy : bool
If True, then a deepcopy of the machine is made. If False, then the transient nodes are removed and the class is dynamically changed to RecurrentEpsilonMachine.
Returns: m : machine
The recurrent portion of the machine.
Notes
If the number of transient nodes is much larger than the number of recurrent nodes, then a copy is generally quicker. If the number of recurrent nodes is much larger than the number of transient nodes, then it is generally quicker to remove the transient nodes in place (so set copy to False).


class cmpy.machines.epsilonmachine.RecurrentEpsilonMachine(data=None, name='', **attr)¶ Bases: cmpy.machines.epsilonmachine.BaseEpsilonMachine, cmpy.machines.mealyhmm.MealyHMM
Recurrent Epsilon Machine
This class represents the recurrent portion of an epsilon machine. This is particularly important since many epsilon machines have an infinite number of transient states. Note, this includes eMs with multiple recurrent components.
Each state in a Recurrent Epsilon Machine represents an equivalence class over infinite pasts.
Methods
- cryptic_order(): Returns the cryptic order of the process.
- crypticity([max_length]): Returns the crypticity of the process.
- excess_entropy([max_length, tolerance]): Returns the excess entropy of the process.
- excess_entropy_bounds(lengths[, return_be]): Returns lower and upper bounds for the excess entropy.
- excess_entropy_exact([max_length]): Returns the excess entropy of the process.
- is_markov(): Returns True if the Markov order of the process is finite.
- is_minimal_unifilar(): Returns True if the machine is the minimal unifilar presentation.
- markov_order(): Returns the Markov order of the process.
- predictive_information_rate([N]): Returns the predictive information rate, b_mu, of the process [abdallah2010].
- predictive_information_rate_exact(): Returns the predictive information rate, b_mu, of the process [abdallah2010].
- reduce([copy]): Returns a new copy of self.
- residual_entropy_rate([N]): Returns the residual entropy rate, r_mu, of the process [abdallah2010].
- residual_entropy_rate_exact(): Returns the residual entropy rate, r_mu, of the process [abdallah2010].
- reverse([copy]): Reverses the machine so that it generates the ‘reverse’ process.
- setcomplexity(): Returns the set complexity of the process language.
- statistical_complexity(): Returns the statistical complexity of the process.
- to_mealyhmm([copy]): Return this same presentation, but as a MealyHMM.
is_minimal_unifilar
()¶ Returns True if the machine is the minimal unifilar presentation.
Note that an eM with transient states is not the smallest unifilar presentation of the process.

reduce
(copy=True)¶ Returns a new copy of self.
Returns: m : RecurrentEpsilonMachine
The recurrent portion of the machine.

The hmm Module¶
Base class for hidden Markov models.

class cmpy.machines.hmm.HMM(data=None, name='', **attr)¶ Bases: cmpy.machines.stochasticmachine.StochasticMachine
Base class for hidden Markov models.
Methods
- conditional_probabilities(lengths[, logs, ...]): Returns a list of dictionaries representing conditional distributions.
- conditional_probability(word, condition[, ...]): Returns Pr(word | condition).
- entropy_rate_estimates(lengths[, method, logs]): Returns estimates of the entropy rate.
- info([n]): Prints out summary information about the machine.
- probabilities(lengths[, logs, ndist, ...]): Returns the probability of every word, for all lengths in lengths.
- reverse([copy]): Reverses the machine so that it generates the ‘reverse’ process.
- symbol(): Returns a generated symbol.
- symbols(n): Returns n symbols by evolving the HMM.
alphabet
(*args, **kwargs)¶ Returns the alphabet of the HMM.

conditional_probabilities
(lengths, logs=None, ndist=None, sparse=None)¶ Returns list of dictionaries, representing conditional distributions.
Each conditional distribution has the following semantics: Pr(future_word | history_word)
Parameters: lengths : list
A list of 2tuples. Each element in the list specifies a desired conditional distribution. Each element should be a 2tuple, where the first element specifies the history word length and the second element specifies the future word length. For example, (1,2) means that the conditional distribution will be over future words of length two conditioned on all history words of length one.
logs : bool
When True, we compute the log probabilities of every word, and when False, we compute standard probabilities. When None, we use the value from cmpyParams[‘logs’].
ndist : Distribution, LogDistribution
Probabilities are calculated from a node distribution. Typically, this is the stationary distribution. So when ndist is None, we calculate the stationary probabilities. Optionally, one may specify a different distribution.
sparse : bool
When True, only words with probabilities greater than 0 (or greater than -inf, for log probabilities) are included in the return value. When False, every word will be included. When None, we use the value from cmpyParams[‘sparse’].
Returns: cdists : list
The calculated conditional distributions. The distributions will be in the order of the length tuples specified in lengths. Each distribution is a dictionary keyed by history words. The values are dictionaries keyed by future words and the values are probabilities.

conditional_probability
(word, condition, logs=None, ndist=None)¶ Returns Pr(word | condition).
Parameters: word : iterable
The word in Pr(word | condition).
condition : iterable
The condition in Pr(word | condition).
logs : bool, None
Specifies if logarithms should be used in the calculations of the conditional probability. If None, then the default value is taken from cmpyParams[‘logs’].
ndist : Distribution, LogDistribution
Probabilities are calculated from a node distribution. Typically, this is the stationary distribution. So when ndist is None, we calculate from the stationary probabilities. Optionally, one may specify a different distribution.
Returns: cp : float
The conditional probability.
Raises: ZeroDivisionError :
If Pr(condition) = 0.
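The ratio behind this method can be sketched over a plain word distribution; the gm2 table below hard-codes the length-2 word probabilities of the Golden Mean process as a hypothetical example:

```python
def conditional_probability(word_probs, word, condition):
    """Pr(word | condition) = Pr(condition followed by word) / Pr(condition).

    word_probs maps words (tuples) of length len(condition) + len(word)
    to probabilities."""
    condition, word = tuple(condition), tuple(word)
    joint = word_probs.get(condition + word, 0.0)
    marginal = sum(p for w, p in word_probs.items()
                   if w[:len(condition)] == condition)
    if marginal == 0:
        raise ZeroDivisionError("Pr(condition) = 0")
    return joint / marginal

# Length-2 word probabilities of the Golden Mean process: '11' is forbidden.
gm2 = {(0, 0): 1/3, (0, 1): 1/3, (1, 0): 1/3, (1, 1): 0.0}
print(conditional_probability(gm2, [0], [1]))  # 1.0: a 1 is always followed by a 0
print(conditional_probability(gm2, [1], [1]))  # 0.0: '11' is forbidden
```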

entropy_rate
(*args, **kwargs)¶ Returns the entropy rate.

entropy_rate_estimates
(lengths, method='diff', logs=None)¶ Returns estimates of the entropy rate.
Parameters: lengths : list
A list of word lengths. For each word length, we calculate the entropy rate estimate at that word length.
method : str
A specification of the method used to compute the estimates. Valid methods are:
‘diff’ : h(L) = H(L) - H(L-1)
‘ratio’ : h’(L) = H(L) / L
Note that the difference method is more accurate:
h <= h(L) <= h’(L)
where h is the true entropy rate.
logs : bool
If True, then logarithms are used in the calculation of the block entropies. If None, then we use cmpyParams[‘logs’].
Returns: estimates : NumPy array
An array of entropy rate estimates. The order of the estimates will match the order of the specified lengths in lengths.
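Both estimators can be sketched from first principles over an i.i.d. coin, for which the block entropies are available in closed form (this standalone example does not use CMPy's API):

```python
from itertools import product
from math import log2

def block_entropy(word_probs):
    """H(L) = -sum over words w of Pr(w) * log2 Pr(w)."""
    return -sum(p * log2(p) for p in word_probs.values() if p > 0)

def coin_words(p_heads, L):
    """Word distribution of an i.i.d. biased coin at length L."""
    probs = {}
    for w in product((0, 1), repeat=L):
        pr = 1.0
        for s in w:
            pr *= p_heads if s else 1 - p_heads
        probs[w] = pr
    return probs

H = {L: block_entropy(coin_words(0.5, L)) for L in (0, 1, 2, 3)}
h_diff = {L: H[L] - H[L - 1] for L in (1, 2, 3)}   # 'diff' estimator
h_ratio = {L: H[L] / L for L in (1, 2, 3)}         # 'ratio' estimator
```

For a fair coin both estimators equal the true rate of 1 bit per symbol at every L; for correlated processes the ‘diff’ estimator converges faster, which is why the docstring calls it more accurate.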

excess_entropy
(*args, **kwargs)¶ Returns the excess entropy.

info
(n=None, **kwargs)¶ Prints out summary information about the machine.
Parameters: n : node
When not None, we print out specific information about a particular node.
logs : bool
When True, probabilities in the summary information are displayed as log probabilities. The default value is determined by cmpyParams[‘logs’].

probabilities
(lengths, logs=None, ndist=None, sparse=None, method=None, norder=None, sort=False)¶ Returns the probability of every word, for all lengths in lengths.
Using a depthfirst traversal, we efficiently calculate the probability of every word of every length less than some maximum length. By pruning the parse tree, we can achieve nice speedups.
Parameters: lengths : list
A list of integers, specifying the desired word lengths. For each word length, we will calculate the probability of every word.
logs : bool
When True, we compute the log probabilities of every word, and when False, we compute standard probabilities. When None, we use the value from cmpyParams[‘logs’].
Returns: probs : dict
The calculated probabilities.
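The depth-first traversal with pruning can be sketched over a hand-written edge table for the Golden Mean HMM; the representation below is hypothetical, not CMPy's:

```python
def word_probabilities(hmm, start_dist, max_length):
    """Depth-first enumeration of word probabilities up to max_length.

    hmm maps node -> list of (symbol, next_node, probability) edges;
    start_dist maps node -> probability.  Zero-probability branches are
    pruned, so forbidden words never enter the parse tree.
    """
    probs = {L: {} for L in range(1, max_length + 1)}

    def visit(word, dist):
        # dist holds the joint weight Pr(word, current node = n).
        if word:
            probs[len(word)][word] = sum(dist.values())
        if len(word) == max_length:
            return
        by_symbol = {}
        for node, weight in dist.items():
            for symbol, nxt, p in hmm[node]:
                d = by_symbol.setdefault(symbol, {})
                d[nxt] = d.get(nxt, 0.0) + weight * p
        for symbol, ndist in by_symbol.items():
            if sum(ndist.values()) > 0:  # prune this subtree
                visit(word + (symbol,), ndist)

    visit((), dict(start_dist))
    return probs

# Golden Mean HMM with its stationary start distribution (2/3, 1/3).
hmm = {'A': [(0, 'A', 0.5), (1, 'B', 0.5)], 'B': [(0, 'A', 1.0)]}
probs = word_probabilities(hmm, {'A': 2/3, 'B': 1/3}, 2)
```

Pruning is what gives the speedup the docstring mentions: the forbidden word (1, 1) is never expanded, so the tree stays strictly smaller than the full 2^L enumeration.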

probabilities_iter
(*args, **kwargs)¶ Iterator of probabilities for all words of various lengths.

probability
(*args, **kwargs)¶ Returns the probability of a word.

reverse
(copy=True)¶ Reverses the machine so that it generates the ‘reverse’ process.
Non-recurrent nodes will be removed. The initial distribution of the reversed machine will be set equal to the averaged distribution of the original machine.
Parameters: copy : bool
When True, we return a new machine. When False, we modify the machine inplace.
Returns: m : machine
The reversed machine.

symbol
()¶ Returns a generated symbol.
Note: This is not a read-only operation. The current node of the machine will be different after this call.

symbols
(n)¶ Returns n symbols by evolving the HMM.
If the machine has no current node, then one is chosen randomly according to the stationary distribution.
Note: This implementation is read-only. The final node of the machine after the method is called will equal the initial node of the machine when this method was first called.
Parameters: n : int
The number of symbols to generate.
Returns: s : list
The list of generated symbols.
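Symbol generation reduces to a weighted random walk over the edge table; here is a standalone sketch with a seeded RNG (the HMM representation is hypothetical, not CMPy's):

```python
import random

def symbols(hmm, node, n, rng):
    """Generate n symbols by evolving the HMM from `node`.

    hmm maps node -> list of (symbol, next_node, probability) edges.
    Returns the generated symbols and the final node; the caller decides
    whether to keep or discard the new current node.
    """
    out = []
    for _ in range(n):
        r, acc = rng.random(), 0.0
        for symbol, nxt, p in hmm[node]:
            acc += p
            if r < acc:
                out.append(symbol)
                node = nxt
                break
    return out, node

# Golden Mean HMM: after emitting a 1 the machine must emit a 0.
hmm = {'A': [(0, 'A', 0.5), (1, 'B', 0.5)], 'B': [(0, 'A', 1.0)]}
seq, final = symbols(hmm, 'A', 100, random.Random(0))
```

Every sample path obeys the process's support: the sequence never contains two consecutive 1s.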

symbols_iter
(*args, **kwargs)¶ Yields n symbols by evolving the machine’s current node.

The transducer Module¶
Base class for transducers.

class cmpy.machines.transducer.Transducer(data=None, name='', **attr)¶ Bases: cmpy.machines.stochasticmachine.StochasticMachine
Base class for transducers.
Methods
- input_alphabet(): Returns the input alphabet of the transducer.
- output_alphabet(): Returns the output alphabet of the transducer.
edge_attr_aliases
= {'i': 'input', 'p': 'probability', 'value': 'probability', 'o': 'output'}¶

edge_attr_required
= set(['input', 'probability', 'output'])¶

graph_attr_aliases
= {}¶

graph_attr_required
= set([])¶

input_alphabet
()¶ Returns the input alphabet of the transducer.

input_machine
(*args, **kwargs)¶ Returns the topological eM describing the input language.

node_attr_aliases
= {}¶

node_attr_required
= set([])¶

output_alphabet
()¶ Returns the output alphabet of the transducer.

output_machine
(*args, **kwargs)¶ Returns the topological eM describing the output language.

The stochasticmachine Module¶
The base stochastic machine class.

class cmpy.machines.stochasticmachine.StochasticMachine(data=None, name='', **attr)¶ Bases: cmpy.machines.base.BaseMachine
The base stochastic machine class for CMPy.
Attributes
edge_attr_required
graph_attr_required
node_attr_required
Methods
distribution
()Updates and returns the current distribution. distributions
(n)Returns n distributions by evolving the current distribution. distributions_iter
([n])Evolves and yields the current distribution n times. get_current_distribution
([logs])Returns the current distribution. get_current_node
()Returns the current node. get_start_distribution
([logs])Returns the start distribution. get_sync_mode
()Returns the sync mode. is_irreducible
()Returns True if the machine is irreducible and False if not. node_transition_matrix
([node_ordering, validate])Returns the node transition matrix as a NumPy array. randomize_current_distribution
([logs, ...])Randomizes the current distribution. randomize_current_node
([dist, validate])Randomizes the current node. recurrent_components
([ndist, logs])Returns the recurrent components as a distribution. recurrent_nodes
([ndist])Returns the recurrent nodes of the machine. reduce
([recurrent, copy, create_using])Reduces the machine by removing transient nodes. relabel_nodes
(mapping)Returns a copy of the machine with nodes relabeled. renormalize_edges
([style])Renormalizes all the outgoing edge probabilities for each node. set_current_distribution
(dist[, logs])Sets the current distribution. set_current_node
(n)Sets the current node. set_start_distribution
(d)Sets the start distribution. set_sync_mode
(sync)Sets the sync mode. stationary_distribution
([logs, sparse, ...])Returns the stationary distribution. stationary_entropy
()Returns the entropy of the stationary distribution. subgraph
(nbunch[, copy, create_using])Returns a node-induced subgraph of the graph. synchronize
([logs])Synchronizes the current distribution to the current node. topological_entropy_rate
()Returns the topological entropy rate of the underlying Markov chain. 
distribution
()¶ Updates and returns the current distribution.
See also

distributions
(n)¶ Returns n distributions by evolving the current distribution.
Parameters: n : int
The number of distributions to generate.
Returns: d : list
The list of generated distributions.
See also

distributions_iter
(n=None)¶ Evolves and yields the current distribution n times.
If the machine has no current distribution, one is chosen randomly. The current distribution is evolved by multiplying the current distribution by the node transition matrix.
Parameters: n : int, None
The number of times to evolve the current distribution. If n is None, then the current distribution will be evolved indefinitely.
Notes
The current distribution is returned. If you want to save it, you should make a copy of it, since it will be updated when the current distribution is evolved again.
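The evolution step described above (repeated multiplication of a row-vector distribution by the node transition matrix) can be sketched with NumPy. This is a minimal illustration, not CMPy's implementation; the function name and the example matrix are hypothetical:

```python
import numpy as np

def evolve_distribution(dist, ntm, steps=1):
    """Evolve a row-vector node distribution by repeated multiplication
    with the node transition matrix (each row of ntm sums to 1)."""
    d = np.asarray(dist, dtype=float)
    for _ in range(steps):
        d = d @ ntm
    return d

# Two-node chain: from node 0 we stay with prob 0.5 or move with 0.5;
# node 1 always returns to node 0.
ntm = np.array([[0.5, 0.5],
                [1.0, 0.0]])
d1 = evolve_distribution([1.0, 0.0], ntm)   # one step: [0.5, 0.5]
```

Repeated evolution of this chain converges to its stationary distribution, mirroring how distributions_iter yields successive distributions.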

get_current_distribution
(logs=None)¶ Returns the current distribution.
Parameters: logs : bool, None
If True, then the current distribution is returned as a log distribution. If False, then the current distribution is returned as a standard distribution. If None, then the current distribution is returned as it is currently stored.
Returns: d : distribution
The current distribution.

get_current_node
()¶ Returns the current node.
Returns: n : node
The current node.
Notes
The current node is None only if it has never been set.

get_start_distribution
(logs=None)¶ Returns the start distribution.
Parameters: logs : bool, None
If True, then the start distribution is returned as a log distribution. If False, then the start distribution is returned as a standard distribution. If None, then the start distribution is returned as it is currently stored.
Returns: d : distribution
The start distribution.

get_sync_mode
()¶ Returns the sync mode.
If the sync mode is True (active), then the current distribution is synchronized to the current node.
Returns: s : bool
The sync mode.

is_irreducible
()¶ Returns True if the machine is irreducible and False if not.
A machine is irreducible if every node can be reached from every other node. That is, the machine must consist of a single strongly connected component. This is strictly a graph property, and any probability flow from the initial distribution is irrelevant.
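Since irreducibility is strictly a graph property, it amounts to checking for a single strongly connected component. A minimal sketch with networkx (which CMPy machines are built on); the function name is hypothetical and this is not CMPy's actual implementation:

```python
import networkx as nx

def is_irreducible(g):
    """A machine is irreducible iff its underlying directed graph
    consists of a single strongly connected component."""
    return nx.number_strongly_connected_components(g) == 1

g = nx.DiGraph([('A', 'B'), ('B', 'A')])
assert is_irreducible(g)        # A and B reach each other
g.add_edge('B', 'C')            # C has no path back to A or B
assert not is_irreducible(g)
```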

node_transition_matrix
(node_ordering=None, validate=False)¶ Returns the node transition matrix as a NumPy array.
Parameters: node_ordering : list, None
An ordering of the nodes in the graph. The ordering determines the order of the rows and columns in the array. If None, then self.nodes() is used.
validate : bool
If True, then we verify that each row sums to 1. Note, the elements in the node transition matrix need not be between 0 and 1.
Returns: m : NumPy array
The matrix whose elements describe the probability of moving between nodes in the HMM.
ordering : list
The ordering of the rows/columns in the matrix. This is returned only if node_ordering was None.
Raises: InvalidProbabilitySum :
Raised when a node’s outgoing transitions do not sum to 1.

randomize_current_distribution
(logs=None, event_type=<class 'cmpy.infotheory.events.Event'>)¶ Randomizes the current distribution.
The distribution is drawn uniformly from the space of all distributions.
Parameters: logs : bool
Specifies if the current distribution should be a log or standard distribution. If None, then the distribution will match whatever type is already set in the HMM. If the HMM has never had a distribution, then the value of logs is obtained from cmpyParams[‘logs’].
Returns: d : Distribution, LogDistribution
The randomly drawn distribution.

randomize_current_node
(dist=None, validate=True)¶ Randomizes the current node.
Parameters: dist : Distribution, LogDistribution
A distribution specifying how to draw the current node. If None, then we draw from the current distribution, provided it is not None. If the current distribution is None, then we draw from the stationary distribution.
validate : bool
Specifies whether the provided distribution should be validated before drawing from it.
Returns: n : node
The updated current node.

recurrent_components
(ndist=None, logs=None)¶ Returns the recurrent components as a distribution.
Parameters: ndist : Distribution  LogDistribution  node  None
A distribution, a node in the machine, or None. If None, then an attempt is made to grab one from the machine.
logs : bool
Whether to return a Distribution or LogDistribution.
Returns: dist : distribution
The distribution of the recurrent components.

recurrent_nodes
(ndist=None)¶ Returns the recurrent nodes of the machine.
If the machine consists of a single attracting component, then the recurrent nodes will consist of all the nodes in that attracting component.
If the machine consists of multiple attracting components, then the recurrent nodes will consist of those nodes with positive probability in the averaged distribution.
Parameters: ndist : Distribution  LogDistribution  node  None
A distribution, a node in the machine, or None. If None, then an attempt is made to grab one from the machine.
Returns: nodes : list
The recurrent nodes, subject to the initial distribution.

reduce
(recurrent=True, copy=True, create_using=None)¶ Reduces the machine by removing transient nodes.
If the machine consists of a single attracting component, then the resultant machine will contain only the nodes in that attracting component. If the machine consists of multiple attracting components, then there is a choice:
 Return the machine containing only those nodes with positive probability in the averaged distribution.
 Return the machine containing all attracting nodes.
If you reduce a nonergodic machine, then the start distribution will be set to the averaged distribution of the original machine.
Parameters: recurrent : bool
If True, then option 1 is followed whenever there are multiple attracting components. If False, then option 2 is followed.
copy : bool
When True, we return a new machine. When False, we modify the machine in place, removing every node except the recurrent nodes.
create_using : callable
A callable which returns an instantiation of a machine class that is populated when copy is True.
Returns: m : machine
The recurrent portion of the original machine.

relabel_nodes
(mapping)¶ Returns a copy of the machine with nodes relabeled.
Parameters: mapping : dict
A dictionary mapping old nodes to new nodes.

renormalize_edges
(style='as_is')¶ Renormalizes all the outgoing edge probabilities for each node.
Parameters: style : str
 The type of renormalization to perform. Valid options are:
 ‘as_is’
Use the current probabilities and simply renormalize them.
 ‘uniform’:
Probability is uniformly distributed per node.
Examples
>>> from cmpy import machines
>>> m = machines.Even()
Change the probabilities so that they are not normalized. Then renormalize.
>>> m['A']['A'][0]['p'] = 1
>>> m.renormalize_edges()
Now make the machine a topological machine.
>>> m.renormalize_edges(style='uniform')

set_current_distribution
(dist, logs=None)¶ Sets the current distribution.
Parameters: dist : Distribution, LogDistribution
A cmpy.infotheory distribution keyed by nodes in the HMM.
logs : bool
Specifies how the distribution should be stored. If True, then we store the distribution as a log distribution. If False, we store as a linear distribution. If None, we store it as is.
Returns: d : Distribution, LogDistribution
The current distribution.
Raises: CMPyException :
Raised when attempting to set an invalid distribution.
Notes
The current distribution is None only if it has never been set.

set_current_node
(n)¶ Sets the current node.
Parameters: n : node
The node to declare as the current node.
Raises: CMPyException :
Raised when attempting to set the current node to a node which does not exist in the HMM.
Notes
The current node is None only if it has never been set.

set_start_distribution
(d)¶ Sets the start distribution.
Parameters: d : distribution
The distribution that will be set as the initial node distribution.

set_sync_mode
(sync)¶ Sets the sync mode.
If the sync mode is True (active), then the current distribution is synchronized to the current node.

stationary_distribution
(logs=None, sparse=None, node_ordering=None, meta=False, event_type=<class 'cmpy.infotheory.events.Event'>)¶ Returns the stationary distribution.
The stationary distribution is defined to be the distribution which satisfies d.T = d, where d is a distribution. This distribution is guaranteed to be unique if the machine is irreducible.
The stationary distribution is also unique for reducible machines which have only a single attracting component. However, now there can be other nonstationary distributions which generate stationary process languages. To see this, consider any eM. One choice of a distribution which generates a stationary process is to place all probability in the start node. Another choice is to appropriately place all probability in the recurrent causal states. Although both distributions generate a stationary process, only the latter corresponds to a stationary distribution in the eigenvector sense.
When there are multiple attracting components, then the stationary distribution is not unique. In fact, the stationary distribution of any attracting component, extended to the entire machine, would qualify as a stationary distribution for the nonergodic machine. Indeed, any mixture of the stationary distributions from each ergodic component would itself be a stationary distribution. Given an initial node distribution, one can calculate a related distribution using
averaged_distribution().
Note: the stationary distribution does not depend on the initial node distribution. Thus, periodic structures will have a stationary distribution, even though it is possible to choose an initial node distribution such that the generated process is not stationary. The stationary distribution simply tells us that if the initial node distribution is set equal to the stationary distribution, then the generated process will be stationary.
Parameters: logs : bool, None
Specifies whether the returned distribution should be a log or standard distribution. If logs is None, then the value is taken from cmpyParams[‘logs’].
sparse : bool, None
Specifies whether nodes with zero probability should be included in the stationary distribution. If unspecified, then the value is taken from cmpyParams[‘sparse’]. Note, this parameter is only relevant if node_ordering is None.
node_ordering : list, None
If specified, then the distribution will be returned as a NumPy array. The list must include every node in the graph.
meta : bool
When True, the node transition matrix and node ordering that were used in the computation are stored in a dictionary, which is returned, under the keys ‘ntm’ and ‘node_ordering’.
Returns: pi : Distribution, LogDistribution, NumPy array
The stationary distribution. Unless a node ordering is specified, the return value will be a distribution object. When a node ordering is specified, the return value will be a NumPy array.
meta : dict
Dictionary of useful quantities used during the calculation. This is returned only if meta is True.
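The eigenvector characterization above (a distribution d satisfying d.T = d) can be sketched with NumPy. This is a minimal illustration under a hypothetical two-node transition matrix, not CMPy's implementation:

```python
import numpy as np

def stationary_distribution(ntm):
    """Left eigenvector of the node transition matrix for eigenvalue 1,
    normalized to sum to 1. Unique when the machine is irreducible."""
    vals, vecs = np.linalg.eig(ntm.T)
    # pick the eigenvector whose eigenvalue is closest to 1
    pi = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
    return pi / pi.sum()

# Hypothetical chain: node 0 -> 0 (0.5), 0 -> 1 (0.5), 1 -> 0 (1.0)
ntm = np.array([[0.5, 0.5],
                [1.0, 0.0]])
pi = stationary_distribution(ntm)   # approx [2/3, 1/3]
```

One can verify that pi @ ntm equals pi, which is exactly the d.T = d condition stated above.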

stationary_entropy
()¶ Returns the entropy of the stationary distribution.
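The quantity returned here is the Shannon entropy of the stationary distribution, which can be sketched as follows (a minimal illustration; the function name is hypothetical):

```python
import numpy as np

def stationary_entropy(pi):
    """Shannon entropy (in bits) of a stationary node distribution."""
    p = np.asarray(pi, dtype=float)
    p = p[p > 0]                      # 0 * log 0 is taken as 0
    return float(-np.sum(p * np.log2(p)))

assert stationary_entropy([0.5, 0.5]) == 1.0   # fair two-node split
```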

subgraph
(nbunch, copy=True, create_using=None)¶ Returns a node-induced subgraph of the graph.
Parameters: nbunch : list
A list of nodes to include in the subgraph.
copy : bool
If True, then a new machine is created by making a deepcopy of all node/edge/graph attributes. Otherwise, every node except the nodes in nbunch is deleted.
create_using : [ callable  None ]
A callable which returns a machine instance that will be populated. If None then we use the class of the original machine.
Returns: m : machine
The node-induced subgraph of the machine.

synchronize
(logs=None)¶ Synchronizes the current distribution to the current node.
Parameters: logs : bool
Specifies if the current distribution should be a log or standard distribution. If None, then the distribution will match whatever type is already set in the HMM. If the HMM has never had a distribution, then the value of logs is obtained from cmpyParams[‘logs’].

topological_entropy_rate
()¶ Returns the topological entropy rate of the underlying Markov chain.
The topological entropy rate is defined as:
h = lim_{L → ∞} (1/L) log_2 N(L)
where N(L) is the number of allowable words of length L. It is a well-known result that the topological entropy rate equals the logarithm of the largest eigenvalue of the adjacency matrix. For a proof, see:
D. Lind and B. Marcus. An Introduction to Symbolic Dynamics and Coding. Cambridge University Press, 1995.
Notes
This does not compute the topological entropy rate of the HMM.
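The eigenvalue characterization above can be sketched directly with NumPy. This is a minimal illustration (the function name and example adjacency matrix are hypothetical), using the Golden Mean topology, whose entropy rate is the log of the golden ratio:

```python
import numpy as np

def topological_entropy_rate(adj):
    """log2 of the spectral radius (largest-magnitude eigenvalue)
    of the adjacency matrix."""
    eigenvalues = np.linalg.eigvals(np.asarray(adj, dtype=float))
    return float(np.log2(np.max(np.abs(eigenvalues))))

# Golden Mean topology: A -> A, A -> B, B -> A
adj = [[1, 1],
       [1, 0]]
h = topological_entropy_rate(adj)   # log2 of the golden ratio
```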

The moorehmm
Module¶
Moore HMM: symbols are output while in the state.

class
cmpy.machines.moorehmm.
MooreHMM
(data=None, name='', **attr)¶ Bases:
cmpy.machines.hmm.HMM
Machine where symbols are output while in the state.
Methods
alphabet
([u])Returns the alphabet of the machine. distributions_iter
([n, sdist])Evolves and yields the current distribution n times. 
alphabet
(u=None)¶ Returns the alphabet of the machine.
Parameters: u : node, None
If not None, then we return the alphabet as seen from node u. If None, then we return the union of the alphabets as seen from each node in the machine.
Returns: a : set
The alphabet of the machine.
Raises: AlphabetException :
When the alphabet cannot be obtained from the machine.

distributions_iter
(n=None, sdist=None)¶ Evolves and yields the current distribution n times.
If the machine has no current distribution, one is chosen randomly. The distribution is evolved using a symbol distribution which describes the probability of observing a particular symbol.
Parameters: n : int, None
The number of times to evolve the current distribution by the node transition matrix. If n is None, then the current distribution will be evolved indefinitely by the node transition matrix.
Notes
The current distribution is returned. If you want to save it, you should make a copy of it, since it will be updated when the current distribution is evolved again.

edge_attr_aliases
= {'p': 'probability', 'value': 'probability'}¶

edge_attr_required
= set(['probability'])¶

graph_attr_aliases
= {}¶

graph_attr_required
= set([])¶

node_attr_aliases
= {'d': 'distribution'}¶

node_attr_required
= set(['distribution'])¶

The mealytransducer
Module¶
Mealy Transducers¶
Mealy transducers receive input and output symbols on the edges.
When querying an input that is not defined for a particular node, the corresponding row of the labeled transition matrix is all zeros. Probabilities can then be computed in the same way as when working with a generator; the results are subnormalized, and the probability of error is the difference of the subnormalized probabilities from 1.
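This error-probability computation can be sketched with NumPy. The matrix below is a hypothetical labeled transition matrix for a single input symbol, not one produced by CMPy:

```python
import numpy as np

# Hypothetical labeled transition matrix for input '0' over nodes (A, B);
# input '0' is undefined from node B, so that row is all zeros.
T0 = np.array([[0.0, 1.0],    # from A on input '0': go to B with prob 1
               [0.0, 0.0]])   # input '0' undefined from B

dist = np.array([0.5, 0.5])   # current distribution over (A, B)
evolved = dist @ T0           # subnormalized: only mass allowing input '0'
p_error = 1.0 - evolved.sum() # probability the input was not allowed
```

Here half the probability mass sits on node B, where input '0' is undefined, so the error probability is 0.5.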

class
cmpy.machines.mealytransducer.
MealyTransducer
(data=None, name='', **attr)¶ Bases:
cmpy.machines.transducer.Transducer
Class for Mealy transducers.
Methods
alphabets
([node])Returns the input and output alphabets. complete
([ialphabet, reject, copy])Returns the MealyTransducer as a complete MealyTransducer. input_alphabet
()Returns the input alphabet of the transducer. input_machine
([build, create_using])Returns the eM describing the input language. invertibility
()Determines if a transducer is invertible with delay N. invertibility_test_graph
()Constructs the invertibility test graph for a transducer. is_complete
([ialphabet])Determines if a transducer is complete or not. joint_machine
(generator)Returns the joint generator of the transducer composed with generator. labeled_transition_matrices
([node_ordering, ...])Returns the labeled transition matrices. output_alphabet
()Returns the output alphabet of the transducer. output_machine
([build, create_using])Returns the eM describing the output language. renormalize_edges
([style])Renormalizes all the outgoing edge probabilities for each node. stationary_distribution
(generator[, gndist, ...])Returns the stationary distribution induced by the generator. stationary_entropy
(generator[, gndist, ...])Returns the entropy of stationary distribution induced by the generator. transduce_generator
(generator[, simple, ...])Transduces the generator. transduce_symbols_iter
([ndist, error])Returns a memoryful iterator for transducing symbols. transduce_word
(iword[, ndist, error, ...])Transduces iword into an output word. 
alphabets
(node=False)¶ Returns the input and output alphabets.
Parameters: node : bool
If True then the return values are dictionaries keyed by nodes. For each node, the corresponding alphabet is returned.
Returns: ialphabet : set  dict
The input alphabet. If node is True, then this is a dictionary keyed by nodes in the transducer. The value for each key is the input alphabet from that node.
oalphabet : set  dict
The output alphabet. If node is True, then this is a dictionary keyed by nodes in the transducer. The value for each key is the output alphabet from that node.

complete
(ialphabet=None, reject=None, copy=True)¶ Returns the MealyTransducer as a complete MealyTransducer.
That is, the transducer will be complete over the input symbols.
Parameters: ialphabet : set
Additional input symbols used to make the transducer complete.
reject : hashable
The name given to the reject node.
copy : bool
If True, then a copy of the machine is made and returned. If False, the machine is updated inplace.
Returns: m : MealyTransducer
The complete transducer.

input_alphabet
()¶ Returns the input alphabet of the transducer.

input_machine
(build=True, create_using=None, **kwargs)¶ Returns the eM describing the input language.
The transducer’s edges are added to a generator using only the input symbols. Each edge is given the same probability and the edges are renormalized after construction. Then the machine is transformed into its eM. Note that the machine will not necessarily be a topological eM.
All other parameters are passed to cmpy.machines.build_msr().
Parameters: build : bool
If True, then the eM of the topological machine is constructed. If False, then the machine’s topology will resemble the transducer’s.
create_using : { callable  None }
A callable which returns a generator that is populated with the input machine.
Returns: teM : RecurrentEpsilonMachine
The topological eM generating the input language.

invertibility
()¶ Determines if a transducer is invertible with delay N.
Returns: invertible : bool
True when the transducer is invertible with delay N, False if not invertible. Partial transducers result in an exception.
delay : int
When invertible = True, this is the inverse delay of the transducer. Otherwise, this is ‘Undefined’

invertibility_test_graph
()¶ Constructs the invertibility test graph for a transducer.
Returns: G : networkx.DiGraph()
The invertibility test graph for the transducer.
Examples
>>> T = cmpy.machines.Delay(3)
>>> G = T.invertibility_test_graph()
>>> import cmpy.drawing
>>> cmpy.drawing.draw_graphviz(G)

is_complete
(ialphabet=None)¶ Determines if a transducer is complete or not.
Parameters: ialphabet : set
Additional input symbols used to make the transducer complete.
Returns: complete : bool
True if the transducer is complete, otherwise False.

joint_machine
(generator)¶ Returns the joint generator of the transducer composed with generator.
Parameters: generator : machine
The generator to be composed with the transducer.
Returns: jm : machine
The joint generator.
Notes
This is a raw operation and does not account for start node distributions at all. This must be done after the fact.

labeled_transition_matrices
(node_ordering=None, logs=False, mode='nan')¶ Returns the labeled transition matrices.
Each element in the matrix is given by:
T^{(x,y)}_{uv} = Pr(y, v | u, x)
where u is the “from node”, v is the “to node”, x is the input symbol, and y is the output symbol. The value specifies the probability of seeing symbol y and transitioning to node v, given the current node u and input symbol x.
There is one matrix for each (input, output) symbol.
Parameters: node_ordering : list
An ordering of nodes in the machine.
logs : bool  None
If True, then the log of the labeled transition matrices are returned.
mode : ‘nan’  ‘zero’
When a node does not specify an input symbol, there are a number of ways one might choose to represent this fact in the labeled transition matrices. A natural choice is to use NaNs for the entire row. Choose ‘nan’ in this case. One can also choose to assign zero to the entire row. Choose ‘zero’ in this case.
Returns: ltm : defaultdict
A dictionary of NumPy arrays keyed by (input, output) symbol pairs. The values are the labeled transition matrices. If an invalid key is specified, an array of zeros is returned.
norder : list
List of nodes, specifying the order of the rows/cols of the labeled transition matrices. This is returned only if node_ordering is None.
Raises: CMPyException :
If a node in node_ordering does not exist in the machine.

output_alphabet
()¶ Returns the output alphabet of the transducer.

output_machine
(build=True, create_using=None, **kwargs)¶ Returns the eM describing the output language.
The transducer’s edges are added to a generator using only the output symbols. Each edge is given the same probability and the edges are renormalized after construction. Then the machine is transformed into its eM. Note that the machine will not necessarily be a topological eM.
All other parameters are passed to cmpy.machines.build_msr().
Parameters: build : bool
If True, then the eM of the topological machine is constructed. If False, then the machine’s topology will resemble the transducer’s.
create_using : { callable  None }
A callable which returns a generator that is populated with the input machine.
Returns: teM : RecurrentEpsilonMachine
The topological eM generating the output language.

renormalize_edges
(style='as_is')¶ Renormalizes all the outgoing edge probabilities for each node.
Normalization is done from each node and for each input symbol.
Parameters: style : str
 The type of renormalization to perform. Valid options are:
 ‘as_is’
Use the current probabilities and simply renormalize them.
 ‘uniform’:
Probability is uniformly distributed per node.
Examples
>>> from cmpy import machines
>>> m = machines.FlipEveryOther()
Change the probabilities so that they are not normalized. Then renormalize.
>>> m['A']['B'][0]['p'] = 1
>>> m.renormalize_edges()
Now make the machine a topological machine.
>>> m.renormalize_edges(style='uniform')

stationary_distribution
(generator, gndist=None, ndist=None, logs=None)¶ Returns the stationary distribution induced by the generator.
If the joint generator has a single attracting component, then it is not necessary to specify the generator and transducer start node distributions.
Parameters: generator : machine
The generator that drives the transducer.
gndist : distribution
The start node distribution for the generator. If None, then one will be obtained from the machine, if possible.
ndist : distribution
The start node distribution for the transducer. If None, then one will be obtained from the machine, if possible.
logs : bool
If True, then the distribution will be a log distribution.
Returns: dist : Distribution  LogDistribution
The induced stationary distribution over transducer nodes.
Notes
Unlike with generators, the same algorithm can be used to calculate the stationary distribution (i.e., one attracting component) and the averaged distribution (i.e., multiple attracting components).

stationary_entropy
(generator, gndist=None, ndist=None, logs=None)¶ Returns the entropy of the stationary distribution induced by the generator.
Parameters: generator : machine
The generator that drives the transducer.
gndist : distribution
The initial node distribution for the generator. If None, then one will be obtained from the machine, if possible.
ndist : distribution
The initial node distribution for the transducer. If None, then one will be obtained from the machine, if possible.
logs : bool
If True, then the distribution will be a log distribution.
Returns: ent : float
The stationary entropy.

transduce_generator
(generator, simple=False, gndist=None, ndist=None, error=None, logs=None)¶ Transduces the generator.
Parameters: generator : machine
The generator which provides input symbols to the transducer.
simple : bool
If True, then the generator is transduced and no attention is given to the other parameters. That is, the simplest, most generic transduction is performed without any error analysis.
gndist : distribution
The initial node distribution for the generator. If None, then one will be obtained from the machine, if possible.
ndist : distribution
The initial node distribution for the transducer. If None, then one will be obtained from the machine, if possible.
error : int
If 0, then an exception will be raised if the output generator cannot be used without eventually emitting an error. If 1 or 2, then the output generator is constructed and output. If the value is None, then the value is taken from cmpyParams[‘transducers.error’].
logs : bool
If True, then the distribution over the output generator nodes will be a log distribution.
Returns: m : MealyHMM
The transduced generator representing the output.
jd : distribution
The initial node distribution for the output generator. If simple is True, this value is not returned.
ed : distribution
The averaged node distribution for the output generator. If simple is True, this value is not returned.
noerror : bool
If True, then the output generator will never output an error. If simple is True, this value is not returned.
Raises: CMPyException :
If simple is False and error equals 0, then this exception is raised if the output generator eventually outputs an error.
If the alphabet of the generator has no intersection with the input alphabet of the transducer.

transduce_symbols_iter
(ndist=None, error=None)¶ Returns a memoryful iterator for transducing symbols.
Calling this method returns a custom object which has a transduce method that converts a single input symbol into an output symbol and the next state (of the transducer). The custom iterator uses a cached copy of the transducer’s labeled transition matrices. If the transducer is modified in any way, then a new memoryful iterator should be constructed.
If the input symbol is not allowed from the current state of the transducer, then a ‘?’ will be returned and the current state of the transducer will be updated to ‘?’. Any future transductions will result in ‘?’ as outputs again. In other words, you need to reset the current node of the transducer.
If the current state is None, then one is chosen according to self._get_ndist().
Parameters: symbol : symbol
The input symbol to pass to the transducer.
ndist : distribution
The initial node distribution or the initial node for the transducer. If None, then one will be obtained from the machine, if possible.
error : { int  None}
If 0, then an exception is raised whenever the input is not allowed from the current state of the transducer. If 1 or 2, then an invalid input will return (‘?’, ‘?’) representing an error output symbol and an error state. All future calls to transduce will result in the same output. If the value of error is None, then the value is taken from cmpyParams[‘transducers.error’].
Returns: it : object
A customized iterator for transducing symbols.
Examples
>>> from cmpy import machines
>>> t = machines.GMtoEven()
>>> t.set_start_node('A')
>>> it = t.transduce_symbol_iter(error=1)
>>> it.transduce('0')
('1', 'B')
>>> it.transduce('1')
('1', 'A')
>>> it.transduce('0')
('1', 'B')
>>> t.transduce_word('010') # compare to transducing all at once
('1', '1', '1')
>>> it.transduce('0') # continuing but with bad input
TransducerError: Transducer errored on input ('0',)

transduce_word
(iword, ndist=None, error=None, return_dist=False, logs=None)¶ Transduces iword into an output word.
Parameters: ndist : { Distribution  LogDistribution }
The initial node distribution of the transducer. If None, then the stationary distribution induced by the minimal topological eM of the input language will be used.
error : { int  None}
If 0, then the conditional distribution will only include nonerror outputs. This means that the probabilities will be renormalized, so as to remove the effect of the error output. If 1, then the total probability of error is included in the output distribution. If 2, then there will be multiple error events in the distribution and errors will be shown as they occur. If the value is None, then the value is taken from cmpyParams[‘transducers.error’].
return_dist : bool
If True, we return the distribution over output words given the input word iword.
logs : bool
If True and if return_dist is True, then the returned distribution is a log distribution; otherwise, it’s a standard distribution. If logs is None, then we use the value from cmpyParams[‘logs’].
Returns: oword : tuple
The output word, transduced from the input word iword.
dist : distribution
The distribution over output words given iword. This is returned only if return_dist is True.

The misc
Module¶
Miscellaneous machine related functions.

cmpy.machines.misc.
validate_node_distribution
(dist, machine, logs, test_prob=False)¶ Validates a node distribution.
A valid node distribution is a dictionary where every key is a node in the graph and the values of the dictionary must sum to 1. Note, the dictionary need not contain every node in the graph.
Parameters: dist : dict
This is the distribution to be validated.
machine : CMPy machine
The machine we are checking against.
logs : bool
This tells us if the distribution is a log distribution.
test_prob : bool
This tells us to make sure each value is between 0 and 1.
Returns: b : bool
True if the distribution is valid. Otherwise, an exception is raised.
Raises: InvalidDistribution :
Raised when the distribution is not properly normalized.
CMPyException :
Raised when the distribution contains a node not in the HMM.
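The validation rules above can be sketched in plain Python. This is a minimal illustration of the checks described (keys are nodes, values sum to 1, optionally bounded in [0, 1]), not CMPy's implementation; stdlib exceptions stand in for InvalidDistribution and CMPyException:

```python
import math

def validate_node_distribution(dist, nodes, test_prob=False):
    """Check that every key is a node and the values sum to 1.
    The dict need not contain every node in the machine."""
    if any(n not in nodes for n in dist):
        raise KeyError("distribution contains a node not in the machine")
    if not math.isclose(sum(dist.values()), 1.0):
        raise ValueError("distribution is not properly normalized")
    if test_prob and any(not 0 <= p <= 1 for p in dist.values()):
        raise ValueError("probabilities must lie in [0, 1]")
    return True

# A sparse distribution over a subset of nodes is valid.
validate_node_distribution({'A': 0.25, 'B': 0.75}, {'A', 'B', 'C'})
```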

cmpy.machines.misc.
validate_node_ordering
(ordering, machine)¶ Validates a node ordering.
A valid node ordering includes each node in the machine exactly once.
Parameters: ordering : list
The node ordering that will be checked for validity.
machine : CMPy machine
The machine used to check the validity of the ordering.
Returns: b : bool
True if the ordering is valid. Otherwise, an exception is raised.
Raises: InvalidOrdering :
Raised when the ordering is invalid.
The transform
Module¶
Some basic transformations on machines.

cmpy.machines.transform.
reassign_edge_probabilities
(machine, style='uniform')¶ Reassigns edge probabilities for every edge in machine.
Parameters: machine : MealyHMM
The machine whose edge probabilities will be reassigned.
style : str
If ‘uniform’, then edge probabilities are uniformly distributed per node. If ‘prime’, then each node is assigned a unique prime number to act as the starting value. This should make all edge distributions distinct.

cmpy.machines.transform.
internal_markov_chain
(machine, create_using=None)¶ Returns the internal Markov chain of the machine.
Parameters: machine : MealyHMM
A machine.
create_using : callable
A callable used to create the instance to be populated. If None, then we use the
MealyHMM
class.Returns: mc : MealyHMM
The internal Markov chain of machine.

cmpy.machines.transform.
edge_machine
(machine, iterations=1, style=0, create_using=None)¶ Returns a MealyHMM corresponding to the edges of machine.
This is also known as a ‘line graph’.
Regardless of the requested style, we construct a PULL machine for the requested number of iterations. At the very end, we relabel the symbols if a PUSH machine is requested. Note that the intermediate stages are the same. That is, constructing a PULL machine from a PUSH’ed machine is the same as constructing a PULL machine from a PULL’ed machine.
Parameters: machine : MealyHMM
A MealyHMM instance.
iterations : positive int, 1
It is possible to create an edge machine from the resulting edge machine. This parameter specifies how many times to compose the operation.
style : int, {0,1}
To create an EdgeHMM, we must still output symbols on the edges. There are at least two ways to do this.
 style=0
After creating all nodes, we PULL the symbol from the from_node.
 style=1
After creating all nodes, we PUSH the symbol from the to_node.
Returns: m : MealyHMM
The edge machine.
The label
Module¶
Functions for constructing node/edge labels.

cmpy.machines.label.
edge_label
(attrs, input=True, value=True, output=True, supertime=False, join=True)¶ Returns a string to use when drawing edge labels.
Generally, an edge label is drawn in the following manner:
input | value | output
This format degrades gracefully whenever any of the quantities is omitted. For example, the edge label for a MealyHMM appears as:
value | output
where the value used is the edge probability.
Parameters: attrs : aliasdict1
The attributes of the edge from a CMPy machine.
The convert
Module¶
Convert CMPy graphs to and from other formats.

cmpy.machines.convert.
from_string
(s, name=None, style=0, renormalize=True)¶ Returns a CMPy machine from the string s.
Parameters: s : str
A string representing the machine.
name : {str, None}
The name to give the presentation.
style : int | string
An integer describing the type of machine to create. Valid values are:
 0, ‘mealyhmm’ : MealyHMM
Syntax for s is that edges are separated by semicolons and each edge consists of the from-node, to-node, symbol, and transition probability. Each element of the edge should be separated by spaces. For example: “A A 1 .5; A A 0 .5”. If the probability is omitted, then uniform transition probabilities are applied to each edge. For example: “A A 1; A A 0” gives the fair coin.
 1, ‘reM’ : RecurrentEpsilonMachine
Same syntax as 0. User must ensure that the machine described is both minimal and unifilar.
 2, ‘mealytransducer’ : MealyTransducer
Syntax for s is that edges are separated by semicolons and each edge consists of the from-node, to-node, input-symbol, output-symbol, and transition probability. Each element of the edge should be separated by spaces. For example: “A A 1 0 .5; A A 1 1 .5; A A 2 1 1”.
 3, ‘dfa’ : DFA
Syntax for s is that edges are separated by semicolons and each edge consists of the from-node, to-node, and symbol. Each element of the edge should be separated by spaces. For example: “A A 1; A A 0” gives the fair coin.
 4, ‘markovchain’ : MarkovChain
Syntax for s is that edges are separated by semicolons and each edge consists of the from-node, to-node, and probability. Each element of the edge should be separated by spaces. For example: “A A .6; A B .4; B A 1”. If the probability is omitted, then uniform transitions are applied to each edge. So, “A A; A B; B A” gives the standard golden mean process.
renormalize : bool
If True, then the edge probabilities are renormalized before returning the machine.
Examples
To create the golden mean process:
>>> m = from_string('A A 1 .5; A B 0 .5; B A 1 1')
To create the golden mean process as a Markov chain:
>>> m = from_string('A A; A B; B A', style='markovchain')
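The style=0 syntax above is simple enough to parse directly: split on semicolons, split each edge on spaces, and fill in uniform probabilities per from-node when they are omitted. A standalone sketch of that parsing (a hypothetical helper, not CMPy's parser):

```python
from collections import Counter

def parse_mealy_string(s):
    """Sketch of the style=0 syntax: edges separated by semicolons, each
    edge 'from to symbol [probability]'.  Omitted probabilities become
    uniform over each from-node's outgoing edges."""
    edges = []
    for part in s.split(';'):
        fields = part.split()
        u, v, sym = fields[0], fields[1], fields[2]
        p = float(fields[3]) if len(fields) > 3 else None
        edges.append([u, v, sym, p])
    # Apply uniform probabilities where they were omitted.
    out_degree = Counter(e[0] for e in edges)
    for e in edges:
        if e[3] is None:
            e[3] = 1.0 / out_degree[e[0]]
    return [tuple(e) for e in edges]
```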

cmpy.machines.convert.
from_matrices
(matrices, name=None, nodes=None, symbols=None, style=0)¶ Returns a CMPy machine from matrices.
Parameters: matrices : {list, dict}
A list of matrices. The exact format depends on the value of style.
name : {str, None}
The name to give the presentation.
nodes : {list, None}
The names to give the nodes, the order corresponds to the order of the rows/cols in the matrices. If None, the nodes will be labeled as integers beginning at 0.
symbols : {list, None}
The names given to the symbols (used only when style == 3). If None, the symbols will be integers beginning at 0.
style : int
An integer describing the type of machine to create. Valid values are:
 0, ‘mealyhmm’ : MealyHMM
matrices should be a list of 2-tuples. Each 2-tuple should be of the form (symbol, matrix). matrices can also be a dict keyed by symbols with the labeled transition matrices as values.
 1, ‘dfa’ : DFA
matrices should be a list of 2-tuples. Each 2-tuple should be of the form (symbol, matrix). matrices can also be a dict keyed by symbols with the labeled transition matrices as values.
 2, ‘dfa_auto’: DFA
matrices should be a list of matrices. Symbols will be integers from 0.
 3, ‘moorehmm’ : MooreHMM
matrices should be a 2-tuple consisting of the node transition matrix and the emission matrix.
 4, ‘markovchain’ : MarkovChain
matrices should be a single NumPy 2D array that describes the state transition matrix of the Markov chain.
Examples
To create the golden mean process:
>>> T0 = (0, [[0,.5],[0,0]])
>>> T1 = (1, [[.5,0],[1,0]])
>>> m = from_matrices( (T0,T1) )
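The correspondence between labeled matrices and edges can be sketched directly: each (symbol, matrix) pair contributes one edge per nonzero entry. This standalone illustration (a hypothetical helper, not CMPy's converter) shows what style=0 input describes:

```python
def edges_from_matrices(matrices, nodes=None):
    """Sketch of style=0 input: each (symbol, matrix) pair yields an
    edge (u, v, symbol, p) wherever matrix[i][j] is nonzero."""
    edges = []
    for symbol, matrix in matrices:
        n = len(matrix)
        names = nodes if nodes is not None else list(range(n))
        for i in range(n):
            for j in range(n):
                if matrix[i][j]:
                    edges.append((names[i], names[j], symbol, matrix[i][j]))
    return edges
```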

cmpy.machines.convert.
convert_symbols_to_integers
(m, as_strings=False)¶ Converts all output symbols to integers.

cmpy.machines.convert.
from_distribution
(d)¶ Builds a Markov process from a joint distribution.
The simplemoorehmm
Module¶
Moore HMM: symbols are output while in the state.

class
cmpy.machines.simplemoorehmm.
MooreHMM
(data=None, name='', **attr)¶ Bases:
cmpy.machines.hmm.HMM
Machine where symbols are output while in the state.
Methods

alphabet
(u=None)¶ Returns the alphabet of the machine.
Parameters: u : node, None
If not None, then we return the alphabet as seen from node u. If None, then we return the union of the alphabets as seen from each node in the machine.
Returns: a : set
The alphabet of the machine.
Raises: AlphabetException :
When the alphabet cannot be obtained from the machine.

distributions_iter
(n=None, sdist=None)¶ Evolves and yields the current distribution n times.
If the machine has no current distribution, one is chosen randomly. The distribution is evolved using a symbol distribution which describes the probability of observing a particular symbol.
Parameters: n : int, None
The number of times to evolve the current distribution by the node transition matrix. If n is None, then the current distribution will be evolved indefinitely by the node transition matrix.
Notes
The current distribution is returned. If you want to save it, you should make a copy of it, since it will be updated when the current distribution is evolved again.
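The evolution step itself is a row-vector/matrix product, d_{t+1} = d_t T. A standalone sketch of such a generator (hypothetical name; unlike the method above, it yields a fresh list each step, so no copying is needed):

```python
def evolve_distributions(dist, transition, n):
    """Sketch: evolve a row-vector distribution by the node transition
    matrix n times, yielding each intermediate distribution."""
    size = len(dist)
    for _ in range(n):
        # d_{t+1}[j] = sum_i d_t[i] * T[i][j]
        dist = [sum(dist[i] * transition[i][j] for i in range(size))
                for j in range(size)]
        yield dist
```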

edge_attr_aliases
= {'p': 'probability', 'value': 'probability'}¶

edge_attr_required
= set(['probability'])¶

graph_attr_aliases
= {}¶

graph_attr_required
= set([])¶

node_attr_aliases
= {'d': 'distribution'}¶

node_attr_required
= set(['distribution'])¶

The base
Module¶
The base machine class.

class
cmpy.machines.base.
BaseMachine
(data=None, name='', **attr)¶ Bases:
networkx.classes.multidigraph.MultiDiGraph
The base machine class for CMPy.
Attributes
edge_attr_required
graph_attr_required
node_attr_required
Methods
add_edge
(u, v[, key, attr_dict])Add an edge between u and v. add_edges_from
(ebunch[, attr_dict])Add all the edges in ebunch. add_node
(n[, attr_dict])Add a single node to the graph and update its attributes. add_nodes_from
(nodes, **attr)Adds multiple nodes to the graph. adjacency_matrix
([node_ordering])Returns the adjacency matrix as a NumPy integer array. attracting_components
()Returns the attracting components of the graph. attracting_nodes
()Returns the attracting nodes of the machine. draw
([filename, format, prefix, suffix, ...])Draws the machine using the specified layout algorithm. get_name
()Returns the name of the machine. get_start_node
()Returns the start node. is_irreducible
()Returns True if the machine is irreducible and False if not. set_name
(name)Sets the name of the machine. set_start_node
(n)Sets the start node. to_dot
([filename, labelstyle])Writes the machine to filename in the Graphviz DOT format. 
add_edge
(u, v, key=None, attr_dict=None, **attr)¶ Add an edge between u and v.
Since multiple edges between two nodes are allowed, each edge must be assigned a key. If a key is not specified, then one will be provided.
Parameters: u,v : nodes
Nodes can be, for example, strings or numbers. Nodes must be hashable (and not None) Python objects.
key : hashable identifier, optional (default=lowest unused integer)
Used to distinguish multiedges between a pair of nodes.
attr_dict : dictionary, optional (default= no attributes)
Dictionary of edge attributes. Key/value pairs will update existing data associated with the edge.
attr : keyword arguments, optional
Edge data (or labels or objects) can be assigned using keyword arguments.
Returns: key : The key uniquely identifying the added edge.
See also
Notes
To replace/update edge data, use the optional key argument to identify a unique edge. Otherwise a new edge will be created.

add_edges_from
(ebunch, attr_dict=None, **attr)¶ Add all the edges in ebunch.
Parameters: ebunch : container of edges
Each edge given in the container will be added to the graph. The edges can be:
 2-tuples (u,v) or
 3-tuples (u,v,d) for an edge attribute dict d, or
 4-tuples (u,v,k,d) for an edge identified by key k
attr_dict : dictionary, optional (default= no attributes)
Dictionary of edge attributes. Key/value pairs will update existing data associated with each edge.
attr : keyword arguments, optional
Edge data (or labels or objects) can be assigned using keyword arguments.
Returns: keys : The keys uniquely identifying the added edges.
See also
add_edge
 add a single edge
add_weighted_edges_from
 convenient way to add weighted edges
Notes
Adding the same edge twice has no effect but any edge data will be updated when each duplicate edge is added.

add_node
(n, attr_dict=None, **attr)¶ Add a single node to the graph and update its attributes.
If the node already exists, then its attributes are updated.
Parameters: n : node
Any hashable object except None.
attr_dict : dict (optional)
Dictionary of node attributes. If None, then no attributes are set.
attr : keyword arguments (optional)
Additional keywords that update attr_dict, before setting the node attributes.
Raises: FrozenMachine :
If the machine is frozen.
See also

add_nodes_from
(nodes, **attr)¶ Adds multiple nodes to the graph.
Parameters: nodes : list
A list of nodes to add to the graph.
attr : keyword arguments (optional)
Additional keyword arguments passed to add_node().

adjacency_matrix
(node_ordering=None)¶ Returns the adjacency matrix as a NumPy integer array.
Parameters: node_ordering : list, None
An ordering of the nodes in the graph. The ordering determines the order of the rows and columns in the array. If None, then ‘self.nodes()’ is used.
Returns: m : NumPy array
The adjacency matrix whose elements describe the number of edges between two nodes in the graph.
ordering : list
The ordering used to construct the array. This is returned only if node_ordering was None.
Notes
The return value is not necessarily a 0-1 matrix. The elements in the matrix will equal the number of edges between nodes.
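The counting behavior can be sketched from an edge list: entry (i, j) is the number of edges from `ordering[i]` to `ordering[j]`, so multi-edges push entries above 1. A standalone illustration (hypothetical helper, not the method itself):

```python
def adjacency_counts(edges, ordering):
    """Sketch: entry (i, j) counts the edges from ordering[i] to
    ordering[j]; parallel edges make the count exceed 1."""
    index = {node: i for i, node in enumerate(ordering)}
    n = len(ordering)
    m = [[0] * n for _ in range(n)]
    for u, v in edges:
        m[index[u]][index[v]] += 1
    return m
```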

attracting_components
()¶ Returns the attracting components of the graph.
This does not construct machines of the attracting components.
Returns: acc : list
A list of attracting components. Each attracting component is represented by a tuple of nodes in the attracting component.

attracting_nodes
()¶ Returns the attracting nodes of the machine.
An attracting node belongs to some attracting component. An attracting component is a group of nodes which accumulates probability flow.
Note that a node can be attracting without being recurrent. That is, the attracting property of a node is independent of whether or not probability actually flows through it.
Returns: nodes : [nodes]
Returns a list of all the attracting nodes of the machine, even if they are in different strongly connected components.

draw
(filename=None, format=None, prefix=None, suffix=None, layout='dot', args='', labelstyle=1, show=True, env=None)¶ Draws the machine using the specified layout algorithm.
Parameters: filename : str, None
The name of the file to save the image to. If None, save to a temporary file. File formats are inferred from the extension of the filename. If the format parameter is provided, it overwrites any inferred value.
format : str
An output format. Note that not all may be available on every system depending on how Graphviz was built. If no filename is provided and no format is specified, then we write a ‘png’ image. Other values for format are:
‘canon’, ‘cmap’, ‘cmapx’, ‘cmapx_np’, ‘dia’, ‘dot’, ‘fig’, ‘gd’, ‘gd2’, ‘gif’, ‘hpgl’, ‘imap’, ‘imap_np’, ‘ismap’, ‘jpe’, ‘jpeg’, ‘jpg’, ‘mif’, ‘mp’, ‘pcl’, ‘pdf’, ‘pic’, ‘plain’, ‘plainext’, ‘png’, ‘ps’, ‘ps2’, ‘svg’, ‘svgz’, ‘vml’, ‘vmlz’, ‘vrml’, ‘vtx’, ‘wbmp’, ‘xdot’, ‘xlib’
prefix : str | None
If filename is None, we save to a temporary file. The value of prefix will appear after ‘cmpy_‘ but before the random string and file extension. If None, then the machine’s name is used.
suffix : str | None
If filename is None, we save to a temporary file. The value of suffix will appear after the prefix and random string but before the file extension.
tempname : str, None
If filename is None, we save to a temporary file. The value of tempname will appear at the tail end of the temporary filename. If tempname is None, then we use the name of the machine.
layout : str
The graphviz layout algorithm. Options are the same as in pygraphviz.AGraph.draw().
args : str
Additional arguments to pass to the Graphviz layout program.
labelstyle : int, None
A specification of the label style. Default option is 1. Valid options:
 0 No labels are drawn.
 1 Full CMPy label in “input | probability | output” format. If input or output is None, then we adjust accordingly.
 2 Probability is removed from the label.
show : bool
If True, then the image is displayed using the default viewer after drawing it.
Notes
If this function is called in succession too quickly, it doesn’t seem to work. So you might want to sleep for a second between calls.

edge_attr_aliases
= {}¶

edge_attr_required
= set([])¶

get_name
()¶ Returns the name of the machine.

get_start_node
()¶ Returns the start node.

graph_attr_aliases
= {}¶

graph_attr_required
= set([])¶

is_irreducible
()¶ Returns True if the machine is irreducible and False if not.
A machine is irreducible if every node can be reached from every other node. That is, the machine must consist of a single strongly connected component. Probabilistically, this is equivalent to asking if the machine consists only of its recurrent component.
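Strong connectivity can be checked by verifying that a single node reaches every other node both forwards and backwards. A standalone sketch over a plain edge list (hypothetical helper, not the method itself):

```python
def strongly_connected(nodes, edges):
    """Sketch: a machine is irreducible iff its graph is strongly
    connected.  One node must reach all others along forward edges and
    along reversed edges."""
    def reachable(start, succ):
        seen, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for v in succ.get(u, ()):
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        return seen

    forward, backward = {}, {}
    for u, v in edges:
        forward.setdefault(u, []).append(v)
        backward.setdefault(v, []).append(u)
    start = next(iter(nodes))
    all_nodes = set(nodes)
    return (reachable(start, forward) == all_nodes
            and reachable(start, backward) == all_nodes)
```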

node_attr_aliases
= {}¶

node_attr_required
= set([])¶

set_name
(name)¶ Sets the name of the machine.

set_start_node
(n)¶ Sets the start node.
Parameters: n : node
The node that will be set as the start node.

to_dot
(filename=None, labelstyle=1)¶ Writes the machine to filename in the Graphviz DOT format.
No layout program is run on the machine to determine node positions. End-of-line characters are Unix-style: \n
Parameters: filename : str
The name of the file to save the image to. No matter the extension of filename, the format will always be the DOT format.
labelstyle : int
A specification of the label style. Default option is 1. Valid options:
 0 No labels are drawn.
 1 Full CMPy label in “input | probability | output” format. If input or output is None, then we adjust accordingly.
 2 Probability is removed from the label.
