kaldi.fstext

PyKaldi has built-in support for common FST types (including Kaldi lattices and the KWS index) and operations. The API for the user-facing PyKaldi FST types and operations is mostly defined in Python and, to a large extent, mimics the API exposed by pywrapfst, OpenFst’s official Python wrapper. This includes integrations with Graphviz and IPython for interactive visualization of FSTs.

There are two major differences between the PyKaldi FST package and pywrapfst:

  1. PyKaldi bindings are generated with CLIF while pywrapfst bindings are generated with Cython. This allows PyKaldi FST types to work seamlessly with the rest of the PyKaldi package.
  2. In contrast to pywrapfst, PyKaldi does not wrap the OpenFst scripting API, which uses virtual dispatch, function registration, and dynamic loading of shared objects to provide a common interface shared by FSTs of different semirings. While this choice requires wrapping each semiring specialization separately in PyKaldi, it lets users pass FST objects directly to the many PyKaldi functions that accept FST arguments.

Operations that construct new FSTs are implemented as traditional functions, as are two-argument boolean functions like equal and equivalent. The convert operation is not implemented as a separate function since FSTs already support construction from other FST types, e.g. vector FSTs can be constructed from constant FSTs and vice versa. Destructive operations (those that mutate an FST in place) are instance methods, as is write.

The following example, based on Mohri et al. 2002, shows the construction of an ASR graph given a pronunciation lexicon L, grammar G, a transducer from context-dependent phones to context-independent phones C, and an HMM set H:

import kaldi.fstext as fst

L = fst.StdVectorFst.read("L.fst")
G = fst.StdVectorFst.read("G.fst")
C = fst.StdVectorFst.read("C.fst")
H = fst.StdVectorFst.read("H.fst")
LG = fst.determinize(fst.compose(L, G))
CLG = fst.determinize(fst.compose(C, LG))
HCLG = fst.determinize(fst.compose(H, CLG))
HCLG.minimize()                                      # NB: works in-place.

kaldi.fstext.NO_STATE_ID = -1
kaldi.fstext.NO_LABEL = -1
kaldi.fstext.ENCODE_FLAGS = 3
kaldi.fstext.ENCODE_LABELS = 1
kaldi.fstext.ENCODE_WEIGHTS = 2

Functions

arcmap Constructively applies a transform to all arcs and final states.
compat_symbols Returns true if the two symbol tables have equal checksums.
compose Constructively composes two FSTs.
deserialize_symbol_table Deserializes a symbol table.
determinize Constructively determinizes a weighted FST.
difference Constructively computes the difference of two FSTs.
disambiguate Constructively disambiguates a weighted transducer.
epsnormalize Constructively epsilon-normalizes an FST.
equal Are two FSTs equal?
equivalent Are the two acceptors equivalent?
indices_to_symbols Converts indices to symbols by looking them up in the symbol table.
intersect Constructively intersects two FSTs.
isomorphic Are the two acceptors isomorphic?
prune Constructively removes paths with weights below a certain threshold.
push Constructively pushes weights/labels towards initial or final states.
randequivalent Are two acceptors stochastically equivalent?
randgen Randomly generates successful paths in an FST.
read_fst_kaldi Reads FST using Kaldi I/O mechanisms.
relabel_symbol_table Relabels a symbol table as specified by the input list of pairs.
replace Recursively replaces arcs in the root FST with other FST(s).
reverse Constructively reverses an FST’s transduction.
rmepsilon Constructively removes epsilon transitions from an FST.
serialize_symbol_table Serializes a symbol table.
shortestdistance Computes the shortest distance from the initial or final state.
shortestpath Constructs an FST containing the shortest path(s) in the input FST.
statemap Constructively applies a transform to all states.
symbols_to_indices Converts symbols to indices by looking them up in the symbol table.
synchronize Constructively synchronizes an FST.
write_fst_kaldi Writes FST using Kaldi I/O mechanisms.

Classes

CompactLatticeArc FST arc with compact lattice weight.
CompactLatticeConstFst Constant FST over the compact lattice semiring.
CompactLatticeConstFstArcIterator Arc iterator for a constant FST over the compact lattice semiring.
CompactLatticeConstFstStateIterator State iterator for a constant FST over the compact lattice semiring.
CompactLatticeEncodeMapper Arc encoder for an FST over the compact lattice semiring.
CompactLatticeEncodeTable Encode table for CompactLatticeArc.
CompactLatticeFstCompiler Compiler for FSTs over the compact lattice semiring.
CompactLatticeVectorFst Vector FST over the compact lattice semiring.
CompactLatticeVectorFstArcIterator Arc iterator for a vector FST over the compact lattice semiring.
CompactLatticeVectorFstMutableArcIterator Mutable arc iterator for a vector FST over the compact lattice semiring.
CompactLatticeVectorFstStateIterator State iterator for a vector FST over the compact lattice semiring.
CompactLatticeWeight Compact lattice weight factory.
FstHeader FST file header.
FstReadOptions FST reading options.
FstWriteOptions FST writing options.
KwsIndexArc FST arc with KWS index weight.
KwsIndexConstFst Constant FST over the KWS index semiring.
KwsIndexConstFstArcIterator Arc iterator for a constant FST over the KWS index semiring.
KwsIndexConstFstStateIterator State iterator for a constant FST over the KWS index semiring.
KwsIndexEncodeMapper Arc encoder for an FST over the KWS index semiring.
KwsIndexEncodeTable Encode table for KwsIndexArc.
KwsIndexFstCompiler Compiler for FSTs over the KWS index semiring.
KwsIndexVectorFst Vector FST over the KWS index semiring.
KwsIndexVectorFstArcIterator Arc iterator for a vector FST over the KWS index semiring.
KwsIndexVectorFstMutableArcIterator Mutable arc iterator for a vector FST over the KWS index semiring.
KwsIndexVectorFstStateIterator State iterator for a vector FST over the KWS index semiring.
KwsIndexWeight KWS index weight factory.
KwsTimeWeight KWS time weight factory.
LatticeArc FST arc with lattice weight.
LatticeConstFst Constant FST over the lattice semiring.
LatticeConstFstArcIterator Arc iterator for a constant FST over the lattice semiring.
LatticeConstFstStateIterator State iterator for a constant FST over the lattice semiring.
LatticeEncodeMapper Arc encoder for an FST over the lattice semiring.
LatticeEncodeTable Encode table for LatticeArc.
LatticeFstCompiler Compiler for FSTs over the lattice semiring.
LatticeVectorFst Vector FST over the lattice semiring.
LatticeVectorFstArcIterator Arc iterator for a vector FST over the lattice semiring.
LatticeVectorFstMutableArcIterator Mutable arc iterator for a vector FST over the lattice semiring.
LatticeVectorFstStateIterator State iterator for a vector FST over the lattice semiring.
LatticeWeight Lattice weight factory.
LogArc FST arc with log weight.
LogConstFst Constant FST over the log semiring.
LogConstFstArcIterator Arc iterator for a constant FST over the log semiring.
LogConstFstStateIterator State iterator for a constant FST over the log semiring.
LogEncodeMapper Arc encoder for an FST over the log semiring.
LogEncodeTable Encode table for LogArc.
LogFstCompiler Compiler for FSTs over the log semiring.
LogVectorFst Vector FST over the log semiring.
LogVectorFstArcIterator Arc iterator for a vector FST over the log semiring.
LogVectorFstMutableArcIterator Mutable arc iterator for a vector FST over the log semiring.
LogVectorFstStateIterator State iterator for a vector FST over the log semiring.
LogWeight Log weight factory.
StdArc FST arc with tropical weight.
StdConstFst Constant FST over the tropical semiring.
StdConstFstArcIterator Arc iterator for a constant FST over the tropical semiring.
StdConstFstStateIterator State iterator for a constant FST over the tropical semiring.
StdEncodeMapper Arc encoder for an FST over the tropical semiring.
StdEncodeTable Encode table for StdArc.
StdFstCompiler Compiler for FSTs over the tropical semiring.
StdVectorFst Vector FST over the tropical semiring.
StdVectorFstArcIterator Arc iterator for a vector FST over the tropical semiring.
StdVectorFstMutableArcIterator Mutable arc iterator for a vector FST over the tropical semiring.
StdVectorFstStateIterator State iterator for a vector FST over the tropical semiring.
SymbolTable Symbol table.
SymbolTableIterator Symbol table iterator.
SymbolTableTextOptions Options for reading symbol table from text file.
TropicalWeight Tropical weight factory.
class kaldi.fstext.CompactLatticeArc

FST arc with compact lattice weight.

CompactLatticeArc():
Creates an uninitialized CompactLatticeArc instance.
CompactLatticeArc(ilabel, olabel, weight, nextstate):
Creates a new CompactLatticeArc instance initialized with the given arguments.
Parameters:
  • ilabel (int) – The input label.
  • olabel (int) – The output label.
  • weight (CompactLatticeWeight) – The arc weight.
  • nextstate (int) – The destination state for the arc.
from_attrs(ilabel:int, olabel:int, weight:CompactLatticeWeight, nextstate:int) → CompactLatticeArc

Creates a new arc with the given attributes.

Parameters:
  • ilabel (int) – The input label.
  • olabel (int) – The output label.
  • weight (CompactLatticeWeight) – The arc weight.
  • nextstate (int) – The destination state for the arc.
ilabel

int – The input label.

nextstate

int – The destination state for the arc.

olabel

int – The output label.

type() → str

Returns arc type.

weight

CompactLatticeWeight – The arc weight.

class kaldi.fstext.CompactLatticeConstFst(fst=None)

Constant FST over the compact lattice semiring.

Parameters:fst (CompactLatticeFst) – The input FST over the compact lattice semiring. If provided, its contents are used for initializing the new FST. Defaults to None.
arcs(state)

Returns an iterator over arcs leaving the specified state.

Parameters:state – The source state index.
Returns:An ArcIterator.

See also: mutable_arcs, states.

copy()

Makes a copy of the FST.

Returns:A copy of the FST.
draw(filename, isymbols=None, osymbols=None, ssymbols=None, acceptor=False, title='', width=8.5, height=11, portrait=False, vertical=False, ranksep=0.4, nodesep=0.25, fontsize=14, precision=5, float_format='g', show_weight_one=False)

Writes out the FST in Graphviz text format.

This method writes out the FST in the dot graph description language. The graph can be rendered using the dot binary provided by Graphviz.

Parameters:
  • filename (str) – The string location of the output dot/Graphviz file.
  • isymbols – An optional symbol table used to label input symbols.
  • osymbols – An optional symbol table used to label output symbols.
  • ssymbols – An optional symbol table used to label states.
  • acceptor (bool) – Should the figure be rendered in acceptor format if possible? Defaults to False.
  • title (str) – An optional string indicating the figure title. Defaults to the empty string.
  • width (float) – The figure width, in inches. Defaults to 8.5.
  • height (float) – The figure height, in inches. Defaults to 11.
  • portrait (bool) – Should the figure be rendered in portrait rather than landscape? Defaults to False.
  • vertical (bool) – Should the figure be rendered bottom-to-top rather than left-to-right? Defaults to False.
  • ranksep (float) – The minimum separation between ranks, in inches. Defaults to 0.4.
  • nodesep (float) – The minimum separation between nodes, in inches. Defaults to 0.25.
  • fontsize (int) – Font size, in points. Defaults to 14.
  • precision (int) – Numeric precision for floats, in number of chars. Defaults to 5.
  • float_format (str) – One of 'e', 'f' or 'g'. Defaults to 'g'.
  • show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults to False.

For more information about the rendering options, see man dot.

See also: text.

final(state)

Returns the final weight of a state.

Parameters:state – The integer index of a state.
Returns:The final Weight of that state.
Raises:IndexError – State index out of range.
from_bytes(s)

Returns the FST represented by the bytes object.

Parameters:s (bytes) – The bytes object representing the FST.
Returns:An FST object.
input_symbols()

Returns the input symbol table.

Returns:The input symbol table.

See also: output_symbols.

num_arcs(state=None)

Returns the number of arcs, counting them if necessary.

If state is None, returns the number of arcs in the FST. Otherwise, returns the number of arcs leaving that state.

Parameters:state – The integer index of a state. Defaults to None.
Returns:The number of arcs leaving a state or the number of arcs in the FST.

Note: This method counts the number of arcs in the FST by iterating over the states and summing up the number of arcs leaving each state.

Raises:IndexError – State index out of range.

See also: num_states.
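
The counting behaviour described for num_arcs can be illustrated without PyKaldi. The sketch below is a pure-Python stand-in: ToyFst and its internal list of arc lists are hypothetical names invented for this illustration and say nothing about how the bindings store arcs.

```python
class ToyFst:
    """Hypothetical stand-in for an FST: one list of arcs per state."""

    def __init__(self, arcs_per_state):
        # arcs_per_state[i] holds the arcs leaving state i.
        self._arcs = [list(arcs) for arcs in arcs_per_state]

    def num_states(self):
        return len(self._arcs)

    def num_arcs(self, state=None):
        if state is None:
            # Whole-FST count: visit every state and sum its out-degree.
            return sum(len(arcs) for arcs in self._arcs)
        if not 0 <= state < len(self._arcs):
            raise IndexError("State index out of range")
        return len(self._arcs[state])


toy = ToyFst([["a", "b"], ["c"], []])
total_arcs = toy.num_arcs()      # arcs in the whole machine
arcs_from_0 = toy.num_arcs(0)    # arcs leaving state 0
```

Because the whole-FST count walks every state, its cost grows with the number of states, whereas the per-state count is a single lookup in this sketch.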

num_input_epsilons(state)

Returns the number of arcs with epsilon input labels leaving a state.

Parameters:state – The integer index of a state.
Returns:The number of epsilon-input-labeled arcs leaving that state.
Raises:IndexError – State index out of range.

See also: num_output_epsilons.

num_output_epsilons(state)

Returns the number of arcs with epsilon output labels leaving a state.

Parameters:state – The integer index of a state.
Returns:The number of epsilon-output-labeled arcs leaving that state.
Raises:IndexError – State index out of range.

See also: num_input_epsilons.

num_states()

Returns the number of states, counting them if necessary.

Returns:The number of states.

See also: num_arcs.

output_symbols()

Returns the output symbol table.

Returns:The output symbol table.

See also: input_symbols.

properties(mask, test)

Provides property bits.

This method provides user access to the properties attributes for the FST. The resulting value is a long integer, but when it is cast to a boolean, it represents whether or not the FST has the mask property.

Parameters:
  • mask – The property mask to be compared to the FST’s properties.
  • test – Should any unknown values be computed before comparing against the mask?
Returns:

A 64-bit bitmask representing the requested properties.
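
The relationship between the mask and the returned bits can be sketched in plain Python. The bit values below are hypothetical and chosen only for illustration; the real property constants are defined by OpenFst and are not reproduced here. The sketch also leaves out the test argument.

```python
# Hypothetical property bits, for illustration only.
ACCEPTOR = 1 << 0
I_DETERMINISTIC = 1 << 1
NO_EPSILONS = 1 << 2


def properties(stored_props, mask):
    """Returns the stored property bits selected by the mask."""
    return stored_props & mask


stored = ACCEPTOR | NO_EPSILONS

# A single-bit mask gives a clean yes/no answer when cast to bool.
has_acceptor = bool(properties(stored, ACCEPTOR))
has_ideterministic = bool(properties(stored, I_DETERMINISTIC))

# With a multi-bit mask, compare against the mask to require every bit.
wanted = ACCEPTOR | I_DETERMINISTIC
has_any = bool(properties(stored, wanted))
has_all = properties(stored, wanted) == wanted
```

Note that a truthy result for a multi-bit mask only says that at least one requested bit is set.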

read(filename)

Reads an FST from a file.

Parameters:filename (str) – The location of the input file.
Returns:An FST object.
Raises:RuntimeError – Read failed.
read_from_stream(strm, ropts)

Reads an FST from an input stream.

Parameters:
  • strm – The input stream to read from.
  • ropts – The FST reading options.
Returns:

An FST object.

Raises:

RuntimeError – Read failed.

start()

Returns the start state.

Returns:The start state if start state is set, -1 otherwise.
states()

Returns an iterator over all states in the FST.

Returns:A StateIterator object for the FST.

See also: arcs, mutable_arcs.

text(isymbols=None, osymbols=None, ssymbols=None, acceptor=False, show_weight_one=False, missing_symbol='')

Produces a human-readable string representation of the FST.

This method generates a human-readable string representation of the FST. The caller may optionally specify SymbolTables used to label input labels, output labels, or state labels, respectively.

Parameters:
  • isymbols – An optional symbol table used to label input symbols.
  • osymbols – An optional symbol table used to label output symbols.
  • ssymbols – An optional symbol table used to label states.
  • acceptor (bool) – Should the FST be rendered in acceptor format if possible? Defaults False.
  • show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults False.
  • missing_symbol – The string to be printed when symbol table lookup fails.
Returns:

A formatted string representing the FST.

to_bytes()

Returns a bytes object representing the FST.

Returns:A bytes object.
type()

Returns the FST type.

Returns:The FST type.
verify()

Verifies that an FST’s contents are sane.

Returns:True if the contents are sane, False otherwise.
write(filename)

Serializes FST to a file.

This method writes the FST to a file in a binary format.

Parameters:filename (str) – The location of the output file.
Raises:IOError – Write failed.
write_to_stream(strm, wopts)

Serializes FST to an output stream.

Parameters:
  • strm – The output stream to write to.
  • wopts – The FST writing options.
Returns:

True if write was successful, False otherwise.

Raises:

RuntimeError – Write failed.

class kaldi.fstext.CompactLatticeConstFstArcIterator(fst, state)

Arc iterator for a constant FST over the compact lattice semiring.

This class is used for iterating over the arcs leaving some state. In addition to the full C++ API, it also supports the iterator protocol. Most users should call the arcs method of an FST object rather than constructing this iterator directly, and should prefer the Pythonic API.

Creates a new arc iterator.

Parameters:
  • fst – The fst.
  • state – The state index.
Raises:

IndexError – State index out of range.

done()

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:True if the iterator is exhausted, False otherwise.
flags()

Returns the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:The current iterator behavioral flags as an integer.
next()

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

position()

Returns the position of the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:The iterator’s position, expressed as an integer.
reset()

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

seek(a)

Advances the iterator to a new position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:a (int) – The position to seek to.
set_flags(flags, mask)

Sets the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:
  • flags (int) – The properties to be set.
  • mask (int) – A mask to be applied to the flags argument before setting them.
value()

Returns the current arc.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
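
The difference between the C++-style methods and the iterator protocol can be seen in a small stand-in. ToyArcIterator below is a hypothetical pure-Python class written for this illustration; it mirrors the method names documented here but is not the PyKaldi implementation.

```python
class ToyArcIterator:
    """Hypothetical iterator exposing both styles of access."""

    def __init__(self, arcs):
        self._arcs = list(arcs)
        self._pos = 0

    # C++-style API.
    def done(self):
        return self._pos >= len(self._arcs)

    def value(self):
        return self._arcs[self._pos]

    def next(self):
        self._pos += 1

    def reset(self):
        self._pos = 0

    def position(self):
        return self._pos

    def seek(self, a):
        self._pos = a

    # Python iterator protocol, built on top of the methods above.
    def __iter__(self):
        return self

    def __next__(self):
        if self.done():
            raise StopIteration
        arc = self.value()
        self.next()
        return arc


arcs = [(1, 1, 0.5, 1), (2, 2, 1.5, 2)]

# C++ style: explicit done/value/next loop.
it = ToyArcIterator(arcs)
cpp_style = []
while not it.done():
    cpp_style.append(it.value())
    it.next()

# Pythonic style: a plain comprehension over the iterator.
pythonic = [arc for arc in ToyArcIterator(arcs)]
```

Both loops visit the same arcs in the same order; the Pythonic form simply hides the bookkeeping.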

class kaldi.fstext.CompactLatticeConstFstStateIterator(fst)

State iterator for a constant FST over the compact lattice semiring.

This class is used for iterating over the states. In addition to the full C++ API, it also supports the iterator protocol. Most users should call the states method of an FST object rather than constructing this iterator directly, and should prefer the Pythonic API.

Creates a new state iterator.

Parameters:fst – The fst.
done()

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:True if the iterator is exhausted, False otherwise.
next()

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

reset()

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

value()

Returns the current state index.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

class kaldi.fstext.CompactLatticeEncodeMapper(encode_labels=False, encode_weights=False, encode=True)

Arc encoder for an FST over the compact lattice semiring.

This class provides an object which can be used to encode or decode FST arcs. This is most useful for converting an FST to an unweighted acceptor, on which some FST operations are more efficient, and then decoding the FST afterwards.

To use an instance of this class to encode or decode a mutable FST, pass it as the first argument to the FST instance methods encode and decode. Alternatively, an instance of this class can be used as a callable to encode/decode arcs.

Parameters:
  • encode_labels (bool) – Should labels be encoded?
  • encode_weights (bool) – Should weights be encoded?
  • encode (bool) – Encode or decode?
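
One plausible reading of how the two boolean arguments relate to the module-level constants ENCODE_LABELS, ENCODE_WEIGHTS and ENCODE_FLAGS listed near the top of this page is a bitwise OR. The sketch below assumes that reading; encoder_flags is a hypothetical helper, not a PyKaldi function.

```python
# Values copied from the module-level constants listed on this page.
ENCODE_LABELS = 1
ENCODE_WEIGHTS = 2
ENCODE_FLAGS = 3


def encoder_flags(encode_labels=False, encode_weights=False):
    """Hypothetical helper: combines the two booleans into one bitmask."""
    flags = 0
    if encode_labels:
        flags |= ENCODE_LABELS
    if encode_weights:
        flags |= ENCODE_WEIGHTS
    return flags


labels_only = encoder_flags(encode_labels=True)
weights_only = encoder_flags(encode_weights=True)
both = encoder_flags(encode_labels=True, encode_weights=True)
```

Under this assumption, ENCODE_FLAGS is simply the mask that covers both encoding bits.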
flags() → int

Returns encoder flags.

from_other(mapper:CompactLatticeEncodeMapper) → CompactLatticeEncodeMapper

Creates a new encoder with the contents of another.

from_other_with_type(mapper:CompactLatticeEncodeMapper, type:EncodeType) → CompactLatticeEncodeMapper

Creates a new encoder with the contents of another and given type.

input_symbols() → SymbolTable

Returns input symbol table.

output_symbols() → SymbolTable

Returns output symbol table.

properties(inprops:int) → int

Provides property bits.

This method provides user access to the properties attributes for the encoder. The resulting value is a long integer, but when it is cast to a boolean, it represents whether or not the FST has the mask property.

Parameters:mask – The property mask to be compared to the encoder’s properties.
Returns:A 64-bit bitmask representing the requested properties.
read(filename:str, type:EncodeType=default) → CompactLatticeEncodeMapper

Reads encoder from file.

set_input_symbols(syms:SymbolTable)

Sets the input symbol table.

Parameters:syms – A SymbolTable.

See also: set_output_symbols.

set_output_symbols(syms:SymbolTable)

Sets the output symbol table.

Parameters:syms – A SymbolTable.

See also: set_input_symbols.

type() → EncodeType

Returns encoder type.

write(filename:str) → bool

Writes encoder to file.

Returns:True if write was successful, False otherwise.
class kaldi.fstext.CompactLatticeEncodeTable

Encode table for CompactLatticeArc.

CompactLatticeEncodeTable(flags):
Creates a new encode table with the given flags.
class Tuple

CompactLatticeArc encoding tuple.

ilabel

Input label.

olabel

Output label.

weight

Weight.

decode(key:int) → Tuple

Decodes an encoded arc label back to labels and cost.

encode(arc:CompactLatticeArc) → int

Encodes the given arc (either labels or weights or both).

flags() → int

Returns encoding flags.

get_label(arc:CompactLatticeArc) → int

Looks up the encoded label for the given arc.

Returns -1 if arc is not found.

input_symbols() → SymbolTable

Returns input symbols.

output_symbols() → SymbolTable

Returns output symbols.

read(strm:istream, source:str) → CompactLatticeEncodeTable

Reads encode table from input stream.

set_input_symbols(syms:SymbolTable)

Sets input symbols.

set_output_symbols(syms:SymbolTable)

Sets output symbols.

size() → int

Returns the size of the table.

write(strm:ostream, source:str) → bool

Writes table to output stream.
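
The idea behind an encode table, mapping each distinct (input label, output label, weight) tuple to a single integer label and back, can be sketched in pure Python. ToyEncodeTable is a hypothetical class written for this illustration. Numbering labels from 1 is an assumption made here so that label 0, which OpenFst reserves for epsilon, stays free.

```python
class ToyEncodeTable:
    """Hypothetical table mapping arc tuples to integer labels."""

    def __init__(self):
        self._tuples = []   # position label - 1 holds the tuple
        self._labels = {}   # tuple -> label

    def encode(self, ilabel, olabel, weight):
        key = (ilabel, olabel, weight)
        if key not in self._labels:
            self._tuples.append(key)
            # Assumption: labels start at 1, leaving 0 for epsilon.
            self._labels[key] = len(self._tuples)
        return self._labels[key]

    def get_label(self, ilabel, olabel, weight):
        # Lookup only; -1 signals that the tuple was never encoded.
        return self._labels.get((ilabel, olabel, weight), -1)

    def decode(self, label):
        if not 1 <= label <= len(self._tuples):
            raise IndexError("Unknown encoded label")
        return self._tuples[label - 1]

    def size(self):
        return len(self._tuples)


table = ToyEncodeTable()
first = table.encode(1, 2, 0.5)
again = table.encode(1, 2, 0.5)    # same tuple, same label
second = table.encode(3, 4, 1.0)
missing = table.get_label(9, 9, 9.0)
```

Encoding the same tuple twice yields the same label, which is what makes decoding unambiguous.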

class kaldi.fstext.CompactLatticeFstCompiler(isymbols=None, osymbols=None, ssymbols=None, acceptor=False, keep_isymbols=False, keep_osymbols=False, keep_state_numbering=False, allow_negative_labels=False)

Compiler for FSTs over the compact lattice semiring.

This class is used to compile FSTs specified using the AT&T FSM library format described here:

http://web.eecs.umich.edu/~radev/NLP-fall2015/resources/fsm_archive/fsm.5.html

This is the same format used by the fstcompile executable.

FstCompiler options (symbol tables, etc.) are set at construction time:

compiler = FstCompiler(isymbols=ascii_syms, osymbols=ascii_syms)

Once constructed, FstCompiler instances behave like a file handle opened for writing:

# /ba+/
print("0 1 50 50", file=compiler)
print("1 2 49 49", file=compiler)
print("2 2 49 49", file=compiler)
print("2", file=compiler)

The compile method returns an actual FST instance:

sheep_machine = compiler.compile()

Compilation flushes the internal buffer, so the compiler instance can be reused to compile new machines with the same symbol tables, etc.
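
The file-handle behaviour and the buffer flush can be mimicked by a small pure-Python stand-in. ToyFstCompiler below is hypothetical: it only understands transducer-style lines (four or five fields for an arc, one or two for a final state), returns plain tuples rather than an FST, and uses 0.0 as the default weight as an assumption.

```python
class ToyFstCompiler:
    """Hypothetical compiler that buffers AT&T-format lines."""

    def __init__(self):
        self._buffer = []

    def write(self, expression):
        # print(..., file=self) calls write() for the text and the newline.
        self._buffer.append(expression)

    def compile(self):
        arcs, finals = [], {}
        for line in "".join(self._buffer).splitlines():
            fields = line.split()
            if not fields:
                continue
            if len(fields) in (4, 5):
                # Arc line: source, destination, input label, output label,
                # and an optional weight.
                src, dst, ilabel, olabel = (int(f) for f in fields[:4])
                weight = float(fields[4]) if len(fields) == 5 else 0.0
                arcs.append((src, dst, ilabel, olabel, weight))
            elif len(fields) in (1, 2):
                # Final-state line: state and an optional final weight.
                weight = float(fields[1]) if len(fields) == 2 else 0.0
                finals[int(fields[0])] = weight
            else:
                raise RuntimeError("Compilation failed: %r" % line)
        self._buffer = []  # flush so the compiler can be reused
        return arcs, finals


compiler = ToyFstCompiler()
print("0 1 50 50", file=compiler)
print("1 2 49 49", file=compiler)
print("2 2 49 49", file=compiler)
print("2", file=compiler)
sheep_arcs, sheep_finals = compiler.compile()
after_flush = compiler.compile()   # nothing left in the buffer
```

Any object with a write method can be the target of print, which is why the compiler can be used like a file opened for writing.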

Parameters:
  • isymbols – An optional SymbolTable used to label input symbols.
  • osymbols – An optional SymbolTable used to label output symbols.
  • ssymbols – An optional SymbolTable used to label states.
  • acceptor – Should the FST be rendered in acceptor format if possible?
  • keep_isymbols – Should the input symbol table be stored in the FST?
  • keep_osymbols – Should the output symbol table be stored in the FST?
  • keep_state_numbering – Should the state numbering be preserved?
  • allow_negative_labels – Should negative labels be allowed? (Not recommended; may cause conflicts).
compile()

Compiles the FST in the string buffer.

This method compiles the FST and returns the resulting machine.

Returns:The FST described by the string buffer.
Raises:RuntimeError – Compilation failed.
write(expression)

Writes a string into the compiler string buffer.

This method adds a line to the compiler string buffer. It can also be invoked with a print call, like so:

compiler = FstCompiler()
print("0 0 49 49", file=compiler)
print("0", file=compiler)
Parameters:expression – A string expression to add to compiler string buffer.
class kaldi.fstext.CompactLatticeVectorFst(fst=None)

Vector FST over the compact lattice semiring.

Parameters:fst (CompactLatticeFst) – The input FST over the compact lattice semiring. If provided, its contents are used for initializing the new FST. Defaults to None.
add_arc(state, arc)

Adds a new arc to the FST and returns self.

Parameters:
  • state – The integer index of the source state.
  • arc – The arc to add.
Returns:

self.

Raises:

IndexError – State index out of range.

See also: add_state.

add_state()

Adds a new state to the FST and returns the state ID.

Returns:The integer index of the new state.

See also: add_arc, set_start, set_final.

arcs(state)

Returns an iterator over arcs leaving the specified state.

Parameters:state – The source state index.
Returns:An ArcIterator.

See also: mutable_arcs, states.

arcsort(sort_type='ilabel')

Sorts arcs leaving each state of the FST.

This operation destructively sorts arcs leaving each state using either input or output labels.

Parameters:sort_type – Either “ilabel” (sort arcs according to input labels) or “olabel” (sort arcs according to output labels).
Returns:self.
Raises:ValueError – Unknown sort type.

See also: topsort.
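
The effect of the two sort types can be shown on plain lists. In the sketch below, arcs are hypothetical (ilabel, olabel) pairs and arcsort is a free function written for this illustration, not the PyKaldi method.

```python
def arcsort(arcs_per_state, sort_type="ilabel"):
    """Sorts each state's arc list in place; arcs are (ilabel, olabel) pairs."""
    if sort_type == "ilabel":
        index = 0
    elif sort_type == "olabel":
        index = 1
    else:
        raise ValueError("Unknown sort type: %r" % sort_type)
    for arcs in arcs_per_state:
        arcs.sort(key=lambda arc: arc[index])
    return arcs_per_state


by_input = arcsort([[(3, 1), (1, 2), (2, 3)]], "ilabel")
by_output = arcsort([[(3, 1), (1, 2), (2, 3)]], "olabel")
```

Only the order of arcs within each state changes; no arc moves between states.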

closure(closure_plus=False)

Computes concatenative closure.

This operation destructively converts the FST to its concatenative closure. If A transduces string x to y with weight a, then the closure transduces x to y with weight a, xx to yy with weight a ⊗ a, xxx to yyy with weight a ⊗ a ⊗ a, and so on. The empty string is also transduced to itself with semiring One if closure_plus is False.

Parameters:closure_plus – If True, do not accept the empty string.
Returns:self.
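
As a worked illustration of the closure weights, the sketch below enumerates the first few repetitions for a single string pair. It assumes the tropical semiring, where the semiring times operation is ordinary addition and semiring One is 0.0. closure_weights is a hypothetical helper, not part of PyKaldi.

```python
def closure_weights(x, y, a, max_repeats, closure_plus=False):
    """Lists (input, output, weight) for up to max_repeats repetitions.

    Assumes the tropical semiring: times is addition and One is 0.0,
    so n repetitions of weight a cost n * a.
    """
    start = 1 if closure_plus else 0
    return [(x * n, y * n, a * n) for n in range(start, max_repeats + 1)]


star = closure_weights("x", "y", 0.5, 2)
plus = closure_weights("x", "y", 0.5, 2, closure_plus=True)
```

With closure_plus=True the entry for the empty string disappears, which matches the parameter description above.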
concat(ifst)

Computes the concatenation (product) of two FSTs.

This operation destructively concatenates the FST with a second FST. If A transduces string x to y with weight a and B transduces string w to v with weight b, then their concatenation transduces string xw to yv with weight a ⊗ b.

Parameters:ifst – The second input FST.
Returns:self.
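
The concatenation rule can be checked on finite weighted relations written as dictionaries. The sketch assumes the tropical semiring (times is addition, plus is min). concat_relations is a hypothetical helper that works on dicts, not on FST objects.

```python
def concat_relations(a, b):
    """Concatenates two finite weighted relations given as dicts.

    Keys are (input, output) string pairs; values are tropical weights.
    """
    result = {}
    for (x, y), weight_a in a.items():
        for (w, v), weight_b in b.items():
            key = (x + w, y + v)
            weight = weight_a + weight_b          # semiring times
            # Paths with identical strings combine with semiring plus (min).
            result[key] = min(result.get(key, float("inf")), weight)
    return result


A = {("x", "y"): 1.0}
B = {("w", "v"): 2.5, ("", ""): 0.5}
AB = concat_relations(A, B)
```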
connect()

Removes unsuccessful paths.

This operation destructively trims the FST, removing states and arcs that are not part of any successful path.

Returns:self.
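
Trimming amounts to keeping the states that are both reachable from the start state and able to reach a final state. The sketch below shows that idea with two breadth-first searches over plain (source, destination) pairs. The function names are hypothetical and, unlike a real trim, the sketch does not renumber states.

```python
from collections import deque


def reachable(start_states, edges):
    """Breadth-first search over a dict mapping state -> set of neighbours."""
    seen = set(start_states)
    queue = deque(start_states)
    while queue:
        state = queue.popleft()
        for nxt in edges.get(state, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def trim(start, finals, arcs):
    """Keeps only arcs (src, dst) that lie on some start-to-final path."""
    forward, backward = {}, {}
    for src, dst in arcs:
        forward.setdefault(src, set()).add(dst)
        backward.setdefault(dst, set()).add(src)
    accessible = reachable([start], forward)
    coaccessible = reachable(list(finals), backward)
    useful = accessible & coaccessible
    kept = [(s, d) for s, d in arcs if s in useful and d in useful]
    return useful, kept


# State 3 is a dead end and state 4 cannot be reached from the start.
useful_states, kept_arcs = trim(0, {2}, [(0, 1), (1, 2), (1, 3), (4, 2)])
```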
copy()

Makes a copy of the FST.

Returns:A copy of the FST.
decode(encoder)

Decodes encoded labels and/or weights.

This operation reverses the encoding performed by encode.

Parameters:encoder – An EncodeMapper object used to encode the FST.
Returns:self.

See also: encode.

delete_arcs(state, n=None)

Deletes arcs leaving a particular state.

Parameters:
  • state – The integer index of a state.
  • n – An optional argument indicating how many arcs to be deleted. If this argument is None, all arcs from this state are deleted.
Returns:

self.

Raises:

IndexError – State index out of range.

See also: delete_states.

delete_states(states=None)

Deletes states.

Parameters:states – An optional iterable of integer indices of the states to be deleted. If this argument is omitted, all states are deleted.
Returns:self.
Raises:IndexError – State index out of range.

See also: delete_arcs.

draw(filename, isymbols=None, osymbols=None, ssymbols=None, acceptor=False, title='', width=8.5, height=11, portrait=False, vertical=False, ranksep=0.4, nodesep=0.25, fontsize=14, precision=5, float_format='g', show_weight_one=False)

Writes out the FST in Graphviz text format.

This method writes out the FST in the dot graph description language. The graph can be rendered using the dot binary provided by Graphviz.

Parameters:
  • filename (str) – The string location of the output dot/Graphviz file.
  • isymbols – An optional symbol table used to label input symbols.
  • osymbols – An optional symbol table used to label output symbols.
  • ssymbols – An optional symbol table used to label states.
  • acceptor (bool) – Should the figure be rendered in acceptor format if possible? Defaults to False.
  • title (str) – An optional string indicating the figure title. Defaults to the empty string.
  • width (float) – The figure width, in inches. Defaults to 8.5.
  • height (float) – The figure height, in inches. Defaults to 11.
  • portrait (bool) – Should the figure be rendered in portrait rather than landscape? Defaults to False.
  • vertical (bool) – Should the figure be rendered bottom-to-top rather than left-to-right? Defaults to False.
  • ranksep (float) – The minimum separation between ranks, in inches. Defaults to 0.4.
  • nodesep (float) – The minimum separation between nodes, in inches. Defaults to 0.25.
  • fontsize (int) – Font size, in points. Defaults to 14.
  • precision (int) – Numeric precision for floats, in number of chars. Defaults to 5.
  • float_format (str) – One of 'e', 'f' or 'g'. Defaults to 'g'.
  • show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults to False.

For more information about the rendering options, see man dot.

See also: text.

encode(encoder)

Encodes labels and/or weights.

This operation allows for the representation of a weighted transducer as a weighted acceptor, an unweighted transducer, or an unweighted acceptor by considering the pair (input label, output label), the pair (input label, weight), or the triple (input label, output label, weight) as a single label. Applying this operation mutates the EncodeMapper argument, which can then be used to decode.

Parameters:encoder – An EncodeMapper object used to encode the FST.
Returns:self.

See also: decode.

final(state)

Returns the final weight of a state.

Parameters:state – The integer index of a state.
Returns:The final Weight of that state.
Raises:IndexError – State index out of range.
from_bytes(s)

Returns the FST represented by the bytes object.

Parameters:s (bytes) – The bytes object representing the FST.
Returns:An FST object.
input_symbols()

Returns the input symbol table.

Returns:The input symbol table.

See also: output_symbols.

invert()

Inverts the FST’s transduction.

This operation destructively inverts the FST’s transduction by exchanging input and output labels.

Returns:self.
minimize(delta=0.0009765625, allow_nondet=False)

Minimizes the FST.

This operation destructively minimizes deterministic weighted automata and transducers. If the input FST A is an acceptor, this operation produces the minimal acceptor B equivalent to A, i.e. the acceptor with the fewest states that is equivalent to A. If the input FST A is a transducer, this operation internally builds an equivalent transducer with a minimal number of states. However, this minimality is obtained by allowing transitions to have strings of symbols as output labels; this is known in the literature as a real-time transducer. Such transducers are not directly supported by the library, so this function converts them by expanding each string-labeled transition into a sequence of transitions. This creates new states, so the result is no longer guaranteed to be minimal.

Parameters:
  • delta – Comparison/quantization delta (default: 0.0009765625).
  • allow_nondet – Attempt minimization of non-deterministic FST?
Returns:

self.

mutable_arcs(state)

Returns a mutable iterator over arcs leaving the specified state.

Parameters:state – The source state index.
Returns:A MutableArcIterator.

See also: arcs, states.

num_arcs(state=None)

Returns the number of arcs, counting them if necessary.

If state is None, returns the number of arcs in the FST. Otherwise, returns the number of arcs leaving that state.

Parameters:state – The integer index of a state. Defaults to None.
Returns:The number of arcs leaving a state or the number of arcs in the FST.

Note: This method counts the number of arcs in the FST by iterating over the states and summing up the number of arcs leaving each state.

Raises:IndexError – State index out of range.

See also: num_states.
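The counting behaviour described in the note can be sketched in plain Python. This is an illustration only, not PyKaldi code: the FST is assumed to be a list whose i-th entry holds the (ilabel, olabel) pairs of the arcs leaving state i, and label 0 is assumed to denote epsilon.

```python
def num_arcs(fst, state=None):
    """Count arcs in `fst`, a list whose i-th entry is the arc list of state i."""
    if state is None:
        # No FST-wide counter is assumed: sum the arcs leaving every state.
        return sum(len(arcs) for arcs in fst)
    if not 0 <= state < len(fst):
        raise IndexError("State index out of range")
    return len(fst[state])


def num_input_epsilons(fst, state):
    """Count arcs leaving `state` whose input label is epsilon (label 0)."""
    return sum(1 for ilabel, _olabel in fst[state] if ilabel == 0)


toy_fst = [[(1, 1), (0, 2)], [(0, 0)], []]
total = num_arcs(toy_fst)           # arcs in the whole FST
leaving_0 = num_arcs(toy_fst, 0)    # arcs leaving state 0
eps_0 = num_input_epsilons(toy_fst, 0)
```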

num_input_epsilons(state)

Returns the number of arcs with epsilon input labels leaving a state.

Parameters:state – The integer index of a state.
Returns:The number of epsilon-input-labeled arcs leaving that state.
Raises:IndexError – State index out of range.

See also: num_output_epsilons.

num_output_epsilons(state)

Returns the number of arcs with epsilon output labels leaving a state.

Parameters:state – The integer index of a state.
Returns:The number of epsilon-output-labeled arcs leaving that state.
Raises:IndexError – State index out of range.

See also: num_input_epsilons.

num_states()

Returns the number of states, counting them if necessary.

Returns:The number of states.

See also: num_arcs.

output_symbols()

Returns the output symbol table.

Returns:The output symbol table.

See also: input_symbols().

project(project_output=False)

Converts the FST to an acceptor using input or output labels.

This operation destructively projects an FST onto its domain or range by either copying each arc’s input label to its output label (the default) or vice versa.

Parameters:project_output – Project onto output labels?
Returns:self.

See also: decode, encode, relabel, relabel_tables.
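As a rough illustration of projection (and not the PyKaldi implementation), the sketch below treats arcs as (ilabel, olabel, weight, nextstate) tuples and copies one label onto the other. The tuple layout and the free function `project` are assumptions made for the example.

```python
def project(arcs, project_output=False):
    """Copy one label onto the other so every arc has ilabel == olabel."""
    projected = []
    for ilabel, olabel, weight, nextstate in arcs:
        label = olabel if project_output else ilabel
        projected.append((label, label, weight, nextstate))
    return projected


arcs = [(1, 7, 0.5, 1), (2, 8, 1.5, 2)]
on_input = project(arcs)                        # keeps the input labels
on_output = project(arcs, project_output=True)  # keeps the output labels
```

Either way the result is an acceptor: weights and destination states are untouched, only the labels change.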

properties(mask, test)

Provides property bits.

This method provides user access to the properties attributes for the FST. The resulting value is a long integer, but when it is cast to a boolean, it represents whether or not the FST has the mask property.

Parameters:
  • mask – The property mask to be compared to the FST’s properties.
  • test – Should any unknown values be computed before comparing against the mask?
Returns:

A 64-bit bitmask representing the requested properties.
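The relationship between the returned bitmask and its boolean interpretation can be shown with ordinary integers. The bit values below are invented for illustration; the real property constants are defined by OpenFst, and `has_properties` is a hypothetical helper rather than a PyKaldi function.

```python
# Illustrative property bits only; the real constants are defined by OpenFst.
ACCEPTOR = 1 << 0
I_DETERMINISTIC = 1 << 1
ACYCLIC = 1 << 2


def has_properties(props, mask):
    """Return the bits of `mask` that are set in `props` (truthy if any are)."""
    return props & mask


fst_props = ACCEPTOR | ACYCLIC
is_acceptor = bool(has_properties(fst_props, ACCEPTOR))
is_deterministic = bool(has_properties(fst_props, I_DETERMINISTIC))
# To require every bit of a multi-bit mask, compare against the mask itself.
both = has_properties(fst_props, ACCEPTOR | ACYCLIC) == (ACCEPTOR | ACYCLIC)
```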

prune(weight=None, nstate=-1, delta=0.0009765625)

Removes paths with weights below a certain threshold.

This operation deletes states and arcs in the input FST that do not belong to a successful path whose weight is no more (with respect to the natural semiring order) than the threshold ⊗ the weight of the shortest path in the input FST, where ⊗ is semiring multiplication. Weights must be commutative and have the path property.

Parameters:
  • weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired weight threshold below which paths are pruned; if None, no paths are pruned.
  • nstate – State number threshold (default: -1).
  • delta – Comparison/quantization delta (default: 0.0009765625).
Returns:

self.

See also: The constructive variant.
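The pruning criterion is easiest to see in the tropical semiring, where semiring multiplication is ordinary addition and a smaller weight is better. The sketch below is a simplification under that assumption: it works on an explicit dictionary of complete paths rather than on states and arcs, and `prune_paths` is a hypothetical name.

```python
def prune_paths(paths, threshold):
    """Keep paths whose weight is within `threshold` of the best path.

    Tropical semiring assumed: "times" is +, and lower weight is better.
    """
    best = min(paths.values())
    return {path: w for path, w in paths.items() if w <= best + threshold}


paths = {"a b": 1.0, "a c": 2.5, "d": 4.0}
kept = prune_paths(paths, threshold=2.0)
```

Here the best path weighs 1.0, so the cutoff is 3.0 and only the path "d" is discarded.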

push(to_final=False, delta=0.0009765625, remove_total_weight=False)

Pushes weights towards the initial or final states.

This operation destructively produces an equivalent transducer by pushing the weights towards the initial state or toward the final states. When pushing weights towards the initial state, the sum of the weight of the outgoing transitions and final weight at any non-initial state is equal to one in the resulting machine. When pushing weights towards the final states, the sum of the weight of the incoming transitions at any state is equal to one. Weights need to be left distributive when pushing towards the initial state and right distributive when pushing towards the final states.

Parameters:
  • to_final – Push towards final states?
  • delta – Comparison/quantization delta (default: 0.0009765625).
  • remove_total_weight – If pushing weights, should the total weight be removed?
Returns:

self.

See also: The constructive variant, which also supports label pushing.

read(filename)

Reads an FST from a file.

Parameters:filename (str) – The location of the input file.
Returns:An FST object.
Raises:RuntimeError – Read failed.
read_from_stream(strm, ropts)

Reads an FST from an input stream.

Parameters:
  • strm – The input stream.
  • ropts – The FST reading options (an FstReadOptions object).
Returns:

An FST object.

Raises:

RuntimeError – Read failed.

relabel(ipairs=None, opairs=None)

Replaces input and/or output labels using pairs of labels.

This operation destructively relabels the input and/or output labels of the FST using pairs of the form (old_ID, new_ID); omitted indices are identity-mapped.

Parameters:
  • ipairs – An iterable containing (old index, new index) integer pairs.
  • opairs – An iterable containing (old index, new index) integer pairs.
Returns:

self.

Raises:

ValueError – No relabeling pairs specified.

See also: decode, encode, project, relabel_tables.
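A minimal pure-Python sketch of pair-based relabeling follows. It is not the PyKaldi implementation: arcs are assumed to be (ilabel, olabel, weight, nextstate) tuples and a new list is returned rather than mutating an FST in place. Labels missing from the pair lists are identity-mapped, as described above.

```python
def relabel(arcs, ipairs=None, opairs=None):
    """Relabel arcs using (old, new) pairs; unlisted labels map to themselves."""
    if not ipairs and not opairs:
        raise ValueError("No relabeling pairs specified.")
    imap = dict(ipairs or ())
    omap = dict(opairs or ())
    return [(imap.get(i, i), omap.get(o, o), w, n) for i, o, w, n in arcs]


arcs = [(1, 1, 0.5, 1), (2, 3, 1.0, 2)]
relabeled = relabel(arcs, ipairs=[(1, 10)], opairs=[(3, 30)])
```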

relabel_tables(old_isymbols=None, new_isymbols=None, unknown_isymbol='', attach_new_isymbols=True, old_osymbols=None, new_osymbols=None, unknown_osymbol='', attach_new_osymbols=True)

Replaces input and/or output labels using SymbolTables.

This operation destructively relabels the input and/or output labels of the FST using user-specified symbol tables; omitted symbols are identity-mapped.

Parameters:
  • old_isymbols – The old SymbolTable for input labels, defaulting to the FST’s input symbol table.
  • new_isymbols – A SymbolTable used to relabel the input labels.
  • unknown_isymbol – Input symbol to use to relabel OOVs (if empty, OOVs raise an exception).
  • attach_new_isymbols – Should new_isymbols be made the FST’s input symbol table?
  • old_osymbols – The old SymbolTable for output labels, defaulting to the FST’s output symbol table.
  • new_osymbols – A SymbolTable used to relabel the output labels.
  • unknown_osymbol – Output symbol to use to relabel OOVs (if empty, OOVs raise an exception).
  • attach_new_osymbols – Should new_osymbols be made the FST’s output symbol table?
Returns:

self.

Raises:

ValueError – No SymbolTable specified.

See also: decode, encode, project, relabel.

reserve_arcs(state, n)

Reserve n arcs at a particular state (best effort).

Parameters:
  • state – The integer index of a state.
  • n – The number of arcs to reserve.
Returns:

self.

Raises:

IndexError – State index out of range.

See also: reserve_states.

reserve_states(n)

Reserve n states (best effort).

Parameters:n – The number of states to reserve.
Returns:self.

See also: reserve_arcs.

reweight(potentials, to_final=False)

Reweights an FST using an iterable of potentials.

This operation destructively reweights an FST according to the potentials and in the direction specified by the user. An arc of weight w, with an origin state of potential p and destination state of potential q, is reweighted by p^{-1} ⊗ (w ⊗ q) when reweighting towards the initial state, and by (p ⊗ w) ⊗ q^{-1} when reweighting towards the final states, where ⊗ is semiring multiplication. The weights must be left distributive when reweighting towards the initial state and right distributive when reweighting towards the final states (e.g., TropicalWeight and LogWeight).

Parameters:
  • potentials – An iterable of TropicalWeights.
  • to_final – Push towards final states?
Returns:

self.
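In the tropical semiring the reweighting formula becomes plain arithmetic: semiring multiplication is addition and the inverse is negation, so an arc weight w turns into w + q - p when reweighting towards the initial state. The sketch below assumes that semiring, represents arcs as (source, destination, weight) triples, and uses the hypothetical name `reweight_to_initial`; it is an illustration, not PyKaldi code. Final weights are assumed to be divided by the state's potential in the same way.

```python
def reweight_to_initial(arcs, finals, potentials):
    """Tropical-semiring reweighting towards the initial state.

    arcs: list of (src, dst, weight); finals: {state: final weight};
    potentials: list indexed by state.
    """
    new_arcs = [(src, dst, w + potentials[dst] - potentials[src])
                for src, dst, w in arcs]
    new_finals = {s: w - potentials[s] for s, w in finals.items()}
    return new_arcs, new_finals


arcs = [(0, 1, 1.0), (1, 2, 2.0)]
finals = {2: 0.5}
potentials = [0.0, 2.5, 0.5]  # start state keeps potential 0 (semiring One)
new_arcs, new_finals = reweight_to_initial(arcs, finals, potentials)

old_path = sum(w for _, _, w in arcs) + finals[2]
new_path = sum(w for _, _, w in new_arcs) + new_finals[2]
```

Because the start state's potential is semiring One in this example, the total weight of the path is unchanged; the weight has merely moved onto the first arc.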

rmepsilon(connect=True, weight=None, nstate=-1, delta=0.0009765625)

Removes epsilon transitions.

This operation destructively removes epsilon transitions (i.e., those where both input and output labels are epsilon) from an FST.

Parameters:
  • connect – Should output be trimmed?
  • weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired weight threshold below which paths are pruned; if None, no paths are pruned.
  • nstate – State number threshold (default: -1).
  • delta – Comparison/quantization delta (default: 0.0009765625).
Returns:

self.

See also: The constructive variant, which also supports epsilon removal in reverse (and which may be more efficient).
set_final(state, weight=None)

Sets the final weight for a state.

Parameters:
  • state – The integer index of a state.
  • weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired final weight; if omitted, it is set to semiring One.
Raises:

IndexError – State index out of range.

See also: set_start.

set_input_symbols(syms)

Sets the input symbol table.

Passing None as a value will delete the input symbol table.

Parameters:syms – A SymbolTable.
Returns:self.

See also: set_output_symbols.

set_output_symbols(syms)

Sets the output symbol table.

Passing None as a value will delete the output symbol table.

Parameters:syms – A SymbolTable.
Returns:self.

See also: set_input_symbols.

set_properties(props, mask)

Sets the properties bits.

Parameters:
  • props (int) – The properties to be set.
  • mask (int) – A mask to be applied to the props argument before setting the FST’s properties.
Returns:

self.

set_start(state)

Sets the initial state.

Parameters:state – The integer index of a state.
Returns:self.
Raises:IndexError – State index out of range.

See also: set_final.

start()

Returns the start state.

Returns:The start state if start state is set, -1 otherwise.
states()

Returns an iterator over all states in the FST.

Returns:A StateIterator object for the FST.

See also: arcs, mutable_arcs.

text(isymbols=None, osymbols=None, ssymbols=None, acceptor=False, show_weight_one=False, missing_symbol='')

Produces a human-readable string representation of the FST.

This method generates a human-readable string representation of the FST. The caller may optionally specify SymbolTables used to label input labels, output labels, or state labels, respectively.

Parameters:
  • isymbols – An optional symbol table used to label input symbols.
  • osymbols – An optional symbol table used to label output symbols.
  • ssymbols – An optional symbol table used to label states.
  • acceptor (bool) – Should the FST be rendered in acceptor format if possible? Defaults False.
  • show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults False.
  • missing_symbol – The string to be printed when symbol table lookup fails.
Returns:

A formatted string representing the FST.
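To give a feel for the kind of output involved, the sketch below prints arcs in a tab-separated, AT&T-style layout (source state, destination state, input label, output label, optional weight) followed by final states. The exact layout produced by the library is not reproduced here; the column order, the tropical semiring (where One is 0.0), and the function name `fst_text` are all assumptions for illustration.

```python
def fst_text(arcs, finals, show_weight_one=False):
    """Render arcs and final states as tab-separated lines.

    arcs: list of (src, dst, ilabel, olabel, weight); finals: {state: weight}.
    Weights equal to tropical One (0.0) are omitted unless requested.
    """
    lines = []
    for src, dst, ilabel, olabel, weight in arcs:
        fields = [src, dst, ilabel, olabel]
        if show_weight_one or weight != 0.0:
            fields.append(weight)
        lines.append("\t".join(str(f) for f in fields))
    for state, weight in sorted(finals.items()):
        fields = [state]
        if show_weight_one or weight != 0.0:
            fields.append(weight)
        lines.append("\t".join(str(f) for f in fields))
    return "\n".join(lines) + "\n"


arcs = [(0, 1, 1, 2, 0.5), (1, 2, 3, 3, 0.0)]
finals = {2: 0.0}
compact = fst_text(arcs, finals)
verbose = fst_text(arcs, finals, show_weight_one=True)
```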

to_bytes()

Returns a bytes object representing the FST.

Returns:A bytes object.
topsort()

Sorts transitions by state IDs.

This operation destructively topologically sorts the FST if it is acyclic; otherwise it remains unchanged. Once sorted, all transitions are from lower state IDs to higher state IDs.

Returns:self.

See also: arcsort.
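The effect of topological sorting on state numbering can be sketched with Kahn's algorithm. This is a generic illustration in plain Python, not the algorithm PyKaldi or OpenFst necessarily uses: arcs are reduced to (source, destination) pairs and `topsort_order` is a hypothetical helper that returns the old-to-new state mapping, or None when the graph is cyclic.

```python
from collections import deque


def topsort_order(num_states, arcs):
    """Return a list mapping old state ID -> new state ID, or None if cyclic."""
    indegree = [0] * num_states
    for _src, dst in arcs:
        indegree[dst] += 1
    queue = deque(s for s in range(num_states) if indegree[s] == 0)
    order = []
    while queue:
        s = queue.popleft()
        order.append(s)
        for src, dst in arcs:
            if src == s:
                indegree[dst] -= 1
                if indegree[dst] == 0:
                    queue.append(dst)
    if len(order) != num_states:
        return None  # a cycle prevented some states from being ordered
    new_id = [0] * num_states
    for new, old in enumerate(order):
        new_id[old] = new
    return new_id


arcs = [(0, 2), (2, 1), (0, 1)]
new_id = topsort_order(3, arcs)
sorted_arcs = [(new_id[s], new_id[d]) for s, d in arcs]
cyclic = topsort_order(2, [(0, 1), (1, 0)])
```

After renumbering, every arc goes from a lower state ID to a higher one; the cyclic input yields None, mirroring the "remains unchanged" case.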

type()

Returns the FST type.

Returns:The FST type.
union(ifst)

Computes the union (sum) of two FSTs.

This operation computes the union (sum) of two FSTs. If A transduces string x to y with weight a and B transduces string w to v with weight b, then their union transduces x to y with weight a and w to v with weight b.

Parameters:ifst – The second input FST.
Returns:self.
verify()

Verifies that an FST’s contents are sane.

Returns:True if the contents are sane, False otherwise.
write(filename)

Serializes FST to a file.

This method writes the FST to a file in a binary format.

Parameters:filename (str) – The location of the output file.
Raises:IOError – Write failed.
write_to_stream(strm, wopts)

Serializes FST to an output stream.

Parameters:
  • strm – The output stream.
  • wopts – The FST writing options (an FstWriteOptions object).
Returns:

True if write was successful, False otherwise.

Raises:

RuntimeError – Write failed.

class kaldi.fstext.CompactLatticeVectorFstArcIterator(fst, state)

Arc iterator for a vector FST over the compact lattice semiring.

This class is used for iterating over the arcs leaving some state. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the arcs method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.

Creates a new arc iterator.

Parameters:
  • fst – The fst.
  • state – The state index.
Raises:

IndexError – State index out of range.

done()

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:True if the iterator is exhausted, False otherwise.
flags()

Returns the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:The current iterator behavioral flags as an integer.
next()

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

position()

Returns the position of the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:The iterator’s position, expressed as an integer.
reset()

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

seek(a)

Advance the iterator to a new position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:a (int) – The position to seek to.
set_flags(flags, mask)

Sets the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:
  • flags (int) – The properties to be set.
  • mask (int) – A mask to be applied to the flags argument before setting them.
value()

Returns the current arc.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
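The difference between the C++-style methods and the iterator protocol can be illustrated with a small stand-in class. `ArcCursor` below is hypothetical and written in plain Python; it only mimics the method names documented above over an ordinary list and is not the PyKaldi class.

```python
class ArcCursor:
    """Stand-in exposing both C++-style methods and the iterator protocol."""

    def __init__(self, arcs):
        self._arcs = list(arcs)
        self._pos = 0

    def done(self):
        return self._pos >= len(self._arcs)

    def value(self):
        return self._arcs[self._pos]

    def next(self):
        self._pos += 1

    def reset(self):
        self._pos = 0

    def __iter__(self):
        # Pythonic API: built on top of the C++-style primitives.
        self.reset()
        while not self.done():
            yield self.value()
            self.next()


cursor = ArcCursor(["arc0", "arc1", "arc2"])

# C++-style traversal.
cpp_style = []
while not cursor.done():
    cpp_style.append(cursor.value())
    cursor.next()

# Pythonic traversal of the same arcs.
pythonic = list(cursor)
```

Both loops visit the same arcs in the same order, which is why the Pythonic form is usually preferable.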

class kaldi.fstext.CompactLatticeVectorFstMutableArcIterator(fst, state)

Mutable arc iterator for a vector FST over the compact lattice semiring.

This class is used for iterating over the arcs leaving some state and optionally replacing them with new ones. In addition to the full C++ API, it also supports the iterator protocol. Calling the __iter__ method of a mutable arc iterator object returns an iterator over (arc, setter) pairs. The setter is a bound method of the mutable arc iterator object that can be used to replace the current arc with a new one. Most users should just call the mutable_arcs method of a vector FST object instead of directly constructing this iterator and take advantage of the Pythonic API, e.g.

for arc, setter in lattice.mutable_arcs(0):
    setter(CompactLatticeArc(arc.ilabel, 0, arc.weight, arc.nextstate))
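How the (arc, setter) pairs behave can be mimicked in plain Python. The class below is a hypothetical stand-in over a list of tuples, not the PyKaldi iterator: in particular, each setter here is a small closure bound to a list position, whereas the text above describes the real setter as a bound method of the iterator object.

```python
class MutableArcs:
    """Stand-in for a mutable arc iterator over one state's arc list."""

    def __init__(self, arcs):
        self._arcs = arcs

    def __iter__(self):
        for position in range(len(self._arcs)):
            # Bind the position so each setter replaces the arc it came with.
            def setter(new_arc, position=position):
                self._arcs[position] = new_arc
            yield self._arcs[position], setter


# Arcs as (ilabel, olabel, weight, nextstate) tuples (an assumed layout).
state_arcs = [(1, 5, 0.5, 1), (2, 6, 1.0, 2)]
for (ilabel, olabel, weight, nextstate), setter in MutableArcs(state_arcs):
    # Replace every output label with epsilon (0), keeping everything else.
    setter((ilabel, 0, weight, nextstate))
```

After the loop the underlying list has been modified in place, which is the point of pairing each arc with a setter.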

Creates a new arc iterator.

Parameters:
  • fst – The fst.
  • state – The state index.
Raises:

IndexError – State index out of range.

done()

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:True if the iterator is exhausted, False otherwise.
flags()

Returns the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:The current iterator behavioral flags as an integer.
next()

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

position()

Returns the position of the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:The iterator’s position, expressed as an integer.
reset()

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

seek(a)

Advance the iterator to a new position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:a (int) – The position to seek to.
set_flags(flags, mask)

Sets the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:
  • flags (int) – The properties to be set.
  • mask (int) – A mask to be applied to the flags argument before setting them.
set_value(arc)

Replace the current arc with a new arc.

Parameters:arc – The arc to replace the current arc with.
value()

Returns the current arc.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

class kaldi.fstext.CompactLatticeVectorFstStateIterator(fst)

State iterator for a vector FST over the compact lattice semiring.

This class is used for iterating over the states. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the states method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.

Creates a new state iterator.

Parameters:fst – The fst.
done()

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:True if the iterator is exhausted, False otherwise.
next()

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

reset()

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

value()

Returns the current state index.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

class kaldi.fstext.CompactLatticeWeight

Compact lattice weight factory.

This class is used for creating new CompactLatticeWeight instances.

CompactLatticeWeight():
Creates an uninitialized CompactLatticeWeight instance.
CompactLatticeWeight(weight):
Creates a new CompactLatticeWeight instance initialized with the weight.
Parameters:weight (Tuple[Tuple[float, float], List[int]] or Tuple[LatticeWeight, List[int]] or CompactLatticeWeight) – A pair of weight values or another CompactLatticeWeight instance.
CompactLatticeWeight(weight, string):
Creates a new CompactLatticeWeight instance initialized with the (weight, string) pair.
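Parameters:
  • weight (Tuple[float, float] or LatticeWeight) – The weight.
  • string (List[int]) – The string as a list of integers.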
from_other(other:CompactLatticeWeight) → CompactLatticeWeight

Create a new compact lattice weight from another.

from_pair(w:LatticeWeight, s:list<int>) → CompactLatticeWeight

Create a new compact lattice weight from a weight string pair.

get_int_size_string() → str

Returns int size string.

hash() → int

Returns the hash for the weight.

member() → bool

Checks if weight is a member of the compact lattice semiring.

no_weight() → CompactLatticeWeight

No weight in compact lattice semiring.

one() → CompactLatticeWeight

One in compact lattice semiring.

properties() → int

Returns weight properties.

quantize(delta:float=default) → CompactLatticeWeight

Quantizes the weight.

reverse() → CompactLatticeWeight

Reverses the weight.

string

The string as a list of integers.

type() → str

Returns weight type.

weight

The weight.

zero() → CompactLatticeWeight

Zero in compact lattice semiring.
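As a rough mental model of this weight type, a compact lattice weight can be viewed as a pair consisting of a two-component lattice weight and a list of integers. The sketch below models only semiring multiplication, under the assumption that it adds the lattice weight components and concatenates the strings; it uses plain tuples and lists rather than PyKaldi objects, and `times` and `ONE` are illustrative names.

```python
def times(a, b):
    """Multiply two ((value1, value2), string) weights.

    Assumed behaviour: lattice weight components add, strings concatenate.
    """
    (a1, a2), a_str = a
    (b1, b2), b_str = b
    return ((a1 + b1, a2 + b2), a_str + b_str)


# Assumed multiplicative identity: zero costs and an empty string.
ONE = ((0.0, 0.0), [])

w = times(((1.0, 2.0), [3, 4]), ((0.5, 0.25), [5]))
unchanged = times(w, ONE)
```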

class kaldi.fstext.FstHeader

FST file header.

arc_type() → str

Returns arc type.

debug_string() → str

Outputs a debug string for the FstHeader object.

fst_type() → str

Returns FST type.

get_flags() → int

Returns flags.

num_arcs() → int

Returns number of arcs.

num_states() → int

Returns number of states.

properties() → int

Returns FST properties.

read(strm:istream, source:str, rewind:bool=default) → bool

Reads header from stream.

set_arc_type(type:str)

Sets arc type.

set_flags(flags:int)

Sets flags.

set_fst_type(type:str)

Sets FST type.

set_num_arcs(numarcs:int)

Sets number of arcs.

set_num_states(numstates:int)

Sets number of states.

set_properties(properties:int)

Sets FST properties.

set_start(start:int)

Sets start state.

set_version(version:int)

Sets version.

start() → int

Returns start state.

version() → int

Returns version.

write(strm:ostream, source:str) → bool

Writes header to stream.

class kaldi.fstext.FstReadOptions

FST reading options.

FileReadMode

alias of FstReadOptions.FileReadMode

debug_string() → str

Outputs a debug string for the FstReadOptions object.

mode

Read or map files (advisory, if possible)

read_isymbols

Read input symbols, if any (default – true).

read_mode(mode:str) → FileReadMode

Converts mode strings into FileReadMode enum values.

read_osymbols

Read output symbols, if any (default – true).

source

Where you’re reading from.

class kaldi.fstext.FstWriteOptions

FST writing options.

align

Write data aligned (may fail on pipes)?

source

Where you’re writing to.

stream_write

Avoid seek operations in writing.

write_header

Write the header?

write_isymbols

Write input symbols?

write_osymbols

Write output symbols?

class kaldi.fstext.KwsIndexArc

FST arc with KWS index weight.

KwsIndexArc():
Creates an uninitialized KwsIndexArc instance.
KwsIndexArc(ilabel, olabel, weight, nextstate):
Creates a new KwsIndexArc instance initialized with given arguments.
Parameters:
  • ilabel (int) – The input label.
  • olabel (int) – The output label.
  • weight (KwsIndexWeight) – The arc weight.
  • nextstate (int) – The destination state for the arc.
from_attrs(ilabel:int, olabel:int, weight:KwsIndexWeight, nextstate:int) → KwsIndexArc

Creates a new arc with the given attributes.

Parameters:
  • ilabel (int) – The input label.
  • olabel (int) – The output label.
  • weight (KwsIndexWeight) – The arc weight.
  • nextstate (int) – The destination state for the arc.
ilabel

int – The input label.

nextstate

int – The destination state for the arc.

olabel

int – The output label.

type() → str

Returns arc type.

weight

KwsIndexWeight – The arc weight.

class kaldi.fstext.KwsIndexConstFst(fst=None)

Constant FST over the KWS index semiring.

Parameters:fst (KwsIndexFst) – The input FST over the KWS index semiring. If provided, its contents are used for initializing the new FST. Defaults to None.
arcs(state)

Returns an iterator over arcs leaving the specified state.

Parameters:state – The source state index.
Returns:An ArcIterator.

See also: mutable_arcs, states.

copy()

Makes a copy of the FST.

Returns:A copy of the FST.
draw(filename, isymbols=None, osymbols=None, ssymbols=None, acceptor=False, title='', width=8.5, height=11, portrait=False, vertical=False, ranksep=0.4, nodesep=0.25, fontsize=14, precision=5, float_format='g', show_weight_one=False)

Writes out the FST in Graphviz text format.

This method writes out the FST in the dot graph description language. The graph can be rendered using the dot binary provided by Graphviz.

Parameters:
  • filename (str) – The string location of the output dot/Graphviz file.
  • isymbols – An optional symbol table used to label input symbols.
  • osymbols – An optional symbol table used to label output symbols.
  • ssymbols – An optional symbol table used to label states.
  • acceptor (bool) – Should the figure be rendered in acceptor format if possible? Defaults False.
  • title (str) – An optional string indicating the figure title. Defaults to empty string.
  • width (float) – The figure width, in inches. Defaults to 8.5.
  • height (float) – The figure height, in inches. Defaults to 11.
  • portrait (bool) – Should the figure be rendered in portrait rather than landscape? Defaults False.
  • vertical (bool) – Should the figure be rendered bottom-to-top rather than left-to-right?
  • ranksep (float) – The minimum separation between ranks, in inches. Defaults to 0.4.
  • nodesep (float) – The minimum separation between nodes, in inches. Defaults to 0.25.
  • fontsize (int) – Font size, in points. Defaults to 14.
  • precision (int) – Numeric precision for floats, in number of chars. Defaults to 5.
  • float_format ('e', 'f' or 'g') – One of: 'e', 'f' or 'g'. Defaults to 'g'.
  • show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults False.

For more information about the rendering options, see man dot.

See also: text.

final(state)

Returns the final weight of a state.

Parameters:state – The integer index of a state.
Returns:The final Weight of that state.
Raises:IndexError – State index out of range.
from_bytes(s)

Returns the FST represented by the bytes object.

Parameters:s (bytes) – The bytes object representing the FST.
Returns:An FST object.
input_symbols()

Returns the input symbol table.

Returns:The input symbol table.

See also: output_symbols().

num_arcs(state=None)

Returns the number of arcs, counting them if necessary.

If state is None, returns the number of arcs in the FST. Otherwise, returns the number of arcs leaving that state.

Parameters:state – The integer index of a state. Defaults to None.
Returns:The number of arcs leaving a state or the number of arcs in the FST.

Note: This method counts the number of arcs in the FST by iterating over the states and summing up the number of arcs leaving each state.

Raises:IndexError – State index out of range.

See also: num_states.

num_input_epsilons(state)

Returns the number of arcs with epsilon input labels leaving a state.

Parameters:state – The integer index of a state.
Returns:The number of epsilon-input-labeled arcs leaving that state.
Raises:IndexError – State index out of range.

See also: num_output_epsilons.

num_output_epsilons(state)

Returns the number of arcs with epsilon output labels leaving a state.

Parameters:state – The integer index of a state.
Returns:The number of epsilon-output-labeled arcs leaving that state.
Raises:IndexError – State index out of range.

See also: num_input_epsilons.

num_states()

Returns the number of states, counting them if necessary.

Returns:The number of states.

See also: num_arcs.

output_symbols()

Returns the output symbol table.

Returns:The output symbol table.

See also: input_symbols().

properties(mask, test)

Provides property bits.

This method provides user access to the properties attributes for the FST. The resulting value is a long integer, but when it is cast to a boolean, it represents whether or not the FST has the mask property.

Parameters:
  • mask – The property mask to be compared to the FST’s properties.
  • test – Should any unknown values be computed before comparing against the mask?
Returns:

A 64-bit bitmask representing the requested properties.

read(filename)

Reads an FST from a file.

Parameters:filename (str) – The location of the input file.
Returns:An FST object.
Raises:RuntimeError – Read failed.
read_from_stream(strm, ropts)

Reads an FST from an input stream.

Parameters:
  • strm – The input stream.
  • ropts – The FST reading options (an FstReadOptions object).
Returns:

An FST object.

Raises:

RuntimeError – Read failed.

start()

Returns the start state.

Returns:The start state if start state is set, -1 otherwise.
states()

Returns an iterator over all states in the FST.

Returns:A StateIterator object for the FST.

See also: arcs, mutable_arcs.

text(isymbols=None, osymbols=None, ssymbols=None, acceptor=False, show_weight_one=False, missing_symbol='')

Produces a human-readable string representation of the FST.

This method generates a human-readable string representation of the FST. The caller may optionally specify SymbolTables used to label input labels, output labels, or state labels, respectively.

Parameters:
  • isymbols – An optional symbol table used to label input symbols.
  • osymbols – An optional symbol table used to label output symbols.
  • ssymbols – An optional symbol table used to label states.
  • acceptor (bool) – Should the FST be rendered in acceptor format if possible? Defaults False.
  • show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults False.
  • missing_symbol – The string to be printed when symbol table lookup fails.
Returns:

A formatted string representing the FST.

to_bytes()

Returns a bytes object representing the FST.

Returns:A bytes object.
type()

Returns the FST type.

Returns:The FST type.
verify()

Verifies that an FST’s contents are sane.

Returns:True if the contents are sane, False otherwise.
write(filename)

Serializes FST to a file.

This method writes the FST to a file in a binary format.

Parameters:filename (str) – The location of the output file.
Raises:IOError – Write failed.
write_to_stream(strm, wopts)

Serializes FST to an output stream.

Parameters:
  • strm – The output stream.
  • wopts – The FST writing options (an FstWriteOptions object).
Returns:

True if write was successful, False otherwise.

Raises:

RuntimeError – Write failed.

class kaldi.fstext.KwsIndexConstFstArcIterator(fst, state)

Arc iterator for a constant FST over the KWS index semiring.

This class is used for iterating over the arcs leaving some state. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the arcs method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.

Creates a new arc iterator.

Parameters:
  • fst – The fst.
  • state – The state index.
Raises:

IndexError – State index out of range.

done()

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:True if the iterator is exhausted, False otherwise.
flags()

Returns the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:The current iterator behavioral flags as an integer.
next()

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

position()

Returns the position of the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:The iterator’s position, expressed as an integer.
reset()

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

seek(a)

Advance the iterator to a new position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:a (int) – The position to seek to.
set_flags(flags, mask)

Sets the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:
  • flags (int) – The properties to be set.
  • mask (int) – A mask to be applied to the flags argument before setting them.
value()

Returns the current arc.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

class kaldi.fstext.KwsIndexConstFstStateIterator(fst)

State iterator for a constant FST over the KWS index semiring.

This class is used for iterating over the states. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the states method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.

Creates a new state iterator.

Parameters:fst – The fst.
done()

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:True if the iterator is exhausted, False otherwise.
next()

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

reset()

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

value()

Returns the current state index.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
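
The low-level methods above mirror the C++ iterator interface. The sketch below uses a hypothetical stand-in class (ToyStateIterator, which is not a PyKaldi type) to show how a done/value/next loop relates to the Python iterator protocol. With a real FST, the Pythonic form is simply a for loop over the result of the states method.

```python
class ToyStateIterator:
    """Hypothetical stand-in mimicking the done/next/reset/value methods."""

    def __init__(self, num_states):
        self._num_states = num_states
        self._pos = 0

    def done(self):
        return self._pos >= self._num_states

    def next(self):
        self._pos += 1

    def reset(self):
        self._pos = 0

    def value(self):
        return self._pos

    def __iter__(self):
        # Pythonic API: yield each state index until the iterator is exhausted.
        while not self.done():
            yield self.value()
            self.next()


it = ToyStateIterator(3)

# C++-style traversal using the compatibility methods.
cpp_style = []
while not it.done():
    cpp_style.append(it.value())
    it.next()

# Pythonic traversal after rewinding the iterator.
it.reset()
pythonic = list(it)
```

Both traversals visit the same state indices; the second form is what the iterator protocol support buys you.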

class kaldi.fstext.KwsIndexEncodeMapper(encode_labels=False, encode_weights=False, encode=True)[source]

Arc encoder for an FST over the KWS index semiring.

This class provides an object which can be used to encode or decode FST arcs. It is most useful for converting an FST to an unweighted acceptor, on which some FST operations are more efficient, and then decoding the result afterwards.

To use an instance of this class to encode or decode a mutable FST, pass it as the first argument to the FST instance methods encode and decode. Alternatively, an instance of this class can be used as a callable to encode/decode arcs.

Parameters:
  • encode_labels (bool) – Should labels be encoded?
  • encode_weights (bool) – Should weights be encoded?
  • encode (bool) – Encode or decode?
flags() → int

Returns encoder flags.

from_other(mapper:KwsIndexEncodeMapper) → KwsIndexEncodeMapper

Creates a new encoder with the contents of another.

from_other_with_type(mapper:KwsIndexEncodeMapper, type:EncodeType) → KwsIndexEncodeMapper

Creates a new encoder with the contents of another and given type.

input_symbols() → SymbolTable

Returns input symbol table.

output_symbols() → SymbolTable

Returns output symbol table.

properties(inprops:int) → int

Provides property bits.

This method provides user access to the properties attributes for the encoder. The resulting value is a long integer, but when it is cast to a boolean, it represents whether or not the FST has the mask property.

Parameters:mask – The property mask to be compared to the encoder’s properties.
Returns:A 64-bit bitmask representing the requested properties.
read(filename:str, type:EncodeType=default) → KwsIndexEncodeMapper

Reads encoder from file.

set_input_symbols(syms:SymbolTable)

Sets the input symbol table.

Parameters:syms – A SymbolTable.

See also: set_output_symbols.

set_output_symbols(syms:SymbolTable)

Sets the output symbol table.

Parameters:syms – A SymbolTable.

See also: set_input_symbols.

type() → EncodeType

Returns encoder type.

write(filename:str) → bool

Writes encoder to file.

Returns:True if write was successful, False otherwise.
class kaldi.fstext.KwsIndexEncodeTable

Encode table for KwsIndexArc.

KwsIndexEncodeTable(flags):
Creates a new encode table with the given flags.
class Tuple

KwsIndexArc encoding tuple.

ilabel

Input label.

olabel

Output label.

weight

Weight.

decode(key:int) → Tuple

Decodes an encoded arc label back to labels and cost.

encode(arc:KwsIndexArc) → int

Encodes the given arc (either labels or weights or both).

flags() → int

Returns encoding flags.

get_label(arc:KwsIndexArc) → int

Looks up the encoded label for the given arc.

Returns -1 if arc is not found.

input_symbols() → SymbolTable

Returns input symbols.

output_symbols() → SymbolTable

Returns output symbols.

read(strm:istream, source:str) → KwsIndexEncodeTable

Reads encode table from input stream.

set_input_symbols(syms:SymbolTable)

Sets input symbols.

set_output_symbols(syms:SymbolTable)

Sets output symbols.

size() → int

Returns the size of the table.

write(strm:ostream, source:str) → bool

Writes table to output stream.
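
To make the encode, decode and get_label semantics concrete, here is a minimal pure-Python sketch of an encode table. Everything in it is an assumption made for illustration: the ToyEncodeTable, ToyArc and ToyTuple names, the flag values, the use of plain floats as weights (with 0.0 standing in for semiring One), and the choice to start keys at 1. It is not the PyKaldi implementation.

```python
from collections import namedtuple

# Hypothetical flag values, used by this sketch only.
ENCODE_LABELS = 1
ENCODE_WEIGHTS = 2

ToyArc = namedtuple("ToyArc", "ilabel olabel weight nextstate")
ToyTuple = namedtuple("ToyTuple", "ilabel olabel weight")


class ToyEncodeTable:
    def __init__(self, flags):
        self._flags = flags
        self._key_to_tuple = []   # position key - 1 holds the tuple for key
        self._tuple_to_key = {}

    def _tuple_for(self, arc):
        # Parts that are not being encoded are replaced by neutral values.
        olabel = arc.olabel if self._flags & ENCODE_LABELS else 0
        weight = arc.weight if self._flags & ENCODE_WEIGHTS else 0.0
        return ToyTuple(arc.ilabel, olabel, weight)

    def encode(self, arc):
        # Assigns a fresh key the first time a tuple is seen.
        t = self._tuple_for(arc)
        if t not in self._tuple_to_key:
            self._key_to_tuple.append(t)
            self._tuple_to_key[t] = len(self._key_to_tuple)
        return self._tuple_to_key[t]

    def get_label(self, arc):
        # Pure lookup: never inserts, returns -1 for unknown arcs.
        return self._tuple_to_key.get(self._tuple_for(arc), -1)

    def decode(self, key):
        return self._key_to_tuple[key - 1]

    def size(self):
        return len(self._key_to_tuple)


table = ToyEncodeTable(ENCODE_LABELS | ENCODE_WEIGHTS)
arc = ToyArc(ilabel=3, olabel=7, weight=0.5, nextstate=1)
key = table.encode(arc)
```

The difference between encode and get_label in this sketch is that only encode grows the table.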

class kaldi.fstext.KwsIndexFstCompiler(isymbols=None, osymbols=None, ssymbols=None, acceptor=False, keep_isymbols=False, keep_osymbols=False, keep_state_numbering=False, allow_negative_labels=False)[source]

Compiler for FSTs over the KWS index semiring.

This class is used to compile FSTs specified using the AT&T FSM library format described here:

http://web.eecs.umich.edu/~radev/NLP-fall2015/resources/fsm_archive/fsm.5.html

This is the same format used by the fstcompile executable.

FstCompiler options (symbol tables, etc.) are set at construction time:

compiler = FstCompiler(isymbols=ascii_syms, osymbols=ascii_syms)

Once constructed, FstCompiler instances behave like a file handle opened for writing:

# /ba+/
print("0 1 50 50", file=compiler)
print("1 2 49 49", file=compiler)
print("2 2 49 49", file=compiler)
print("2", file=compiler)

The compile method returns an actual FST instance:

sheep_machine = compiler.compile()

Compilation flushes the internal buffer, so the compiler instance can be reused to compile new machines with the same symbol tables, etc.

Parameters:
  • isymbols – An optional SymbolTable used to label input symbols.
  • osymbols – An optional SymbolTable used to label output symbols.
  • ssymbols – An optional SymbolTable used to label states.
  • acceptor – Should the FST be rendered in acceptor format if possible?
  • keep_isymbols – Should the input symbol table be stored in the FST?
  • keep_osymbols – Should the output symbol table be stored in the FST?
  • keep_state_numbering – Should the state numbering be preserved?
  • allow_negative_labels – Should negative labels be allowed? (Not recommended; may cause conflicts).
compile()

Compiles the FST in the string buffer.

This method compiles the FST and returns the resulting machine.

Returns:The FST described by the string buffer.
Raises:RuntimeError – Compilation failed.
write(expression)

Writes a string into the compiler string buffer.

This method adds a line to the compiler string buffer. It can also be invoked with a print call, like so:

compiler = FstCompiler()
print("0 0 49 49", file=compiler)
print("0", file=compiler)
Parameters:expression – A string expression to add to compiler string buffer.
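
The file-handle behaviour works because print only requires its file argument to have a write method. The sketch below is a hypothetical stand-in (ToyCompiler, not the PyKaldi class) that buffers the written text and parses it on compile. It assumes transducer lines with explicit input and output labels, ignores weights, and returns plain lists rather than an FST.

```python
class ToyCompiler:
    """Hypothetical stand-in showing the write/compile buffering pattern."""

    def __init__(self):
        self._buffer = []

    def write(self, expression):
        # print(..., file=self) calls this for the text and for the newline.
        self._buffer.append(expression)

    def compile(self):
        text = "".join(self._buffer)
        self._buffer = []  # flush so the instance can be reused
        arcs, finals = [], []
        for line in text.splitlines():
            fields = line.split()
            if not fields:
                continue
            if len(fields) <= 2:
                # "state [weight]" marks a final state.
                finals.append(int(fields[0]))
            else:
                # "src dst ilabel olabel [weight]" describes an arc.
                src, dst, ilabel, olabel = (int(f) for f in fields[:4])
                arcs.append((src, dst, ilabel, olabel))
        return arcs, finals


compiler = ToyCompiler()
print("0 1 50 50", file=compiler)
print("1 2 49 49", file=compiler)
print("2 2 49 49", file=compiler)
print("2", file=compiler)
arcs, finals = compiler.compile()

# The buffer was flushed, so compiling again yields an empty machine.
leftover = compiler.compile()
```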
class kaldi.fstext.KwsIndexVectorFst(fst=None)[source]

Vector FST over the KWS index semiring.

Parameters:fst (KwsIndexFst) – The input FST over the KWS index semiring. If provided, its contents are used for initializing the new FST. Defaults to None.
add_arc(state, arc)

Adds a new arc to the FST and returns self.

Parameters:
  • state – The integer index of the source state.
  • arc – The arc to add.
Returns:

self.

Raises:

IndexError – State index out of range.

See also: add_state.

add_state()

Adds a new state to the FST and returns the state ID.

Returns:The integer index of the new state.

See also: add_arc, set_start, set_final.

arcs(state)

Returns an iterator over arcs leaving the specified state.

Parameters:state – The source state index.
Returns:An ArcIterator.

See also: mutable_arcs, states.

arcsort(sort_type='ilabel')

Sorts arcs leaving each state of the FST.

This operation destructively sorts arcs leaving each state using either input or output labels.

Parameters:sort_type – Either “ilabel” (sort arcs according to input labels) or “olabel” (sort arcs according to output labels).
Returns:self.
Raises:ValueError – Unknown sort type.

See also: topsort.

closure(closure_plus=False)

Computes concatenative closure.

This operation destructively converts the FST to its concatenative closure. If A transduces string x to y with weight a, then the closure transduces x to y with weight a, xx to yy with weight a ⊗ a, xxx to yyy with weight a ⊗ a ⊗ a, and so on. The empty string is also transduced to itself with semiring One if closure_plus is False.

Parameters:closure_plus – If True, do not accept the empty string.
Returns:self.
concat(ifst)

Computes the concatenation (product) of two FSTs.

This operation destructively concatenates the FST with a second FST. If A transduces string x to y with weight a and B transduces string w to v with weight b, then their concatenation transduces string xw to yv with weight a ⊗ b.

Parameters:ifst – The second input FST.
Returns:self.
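
The closure and concat semantics can be illustrated on weighted relations instead of FSTs. The sketch below is an assumption-laden model, not PyKaldi code: a relation is a dict from (input string, output string) to a float, and the tropical semiring is used as a stand-in (times is +, plus is min, One is 0.0). The closure is truncated at max_repeats because the true closure is infinite.

```python
def concat(rel_a, rel_b):
    """Concatenates two weighted relations (tropical semiring assumed)."""
    out = {}
    for (x, y), a in rel_a.items():
        for (w, v), b in rel_b.items():
            key = (x + w, y + v)
            weight = a + b  # tropical "times"
            out[key] = min(out.get(key, float("inf")), weight)  # tropical "plus"
    return out


def closure(rel, max_repeats, closure_plus=False):
    """Truncated closure: unions the first max_repeats powers of rel."""
    # The empty string maps to itself with One unless closure_plus is set.
    result = {} if closure_plus else {("", ""): 0.0}
    power = {("", ""): 0.0}
    for _ in range(max_repeats):
        power = concat(power, rel)
        for key, weight in power.items():
            result[key] = min(result.get(key, float("inf")), weight)
    return result


A = {("x", "y"): 1.5}
B = {("w", "v"): 2.0}

AB = concat(A, B)
A_star = closure(A, max_repeats=3)
A_plus = closure(A, max_repeats=3, closure_plus=True)
```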
connect()

Removes unsuccessful paths.

This operation destructively trims the FST, removing states and arcs that are not part of any successful path.

Returns:self.
copy()

Makes a copy of the FST.

Returns:A copy of the FST.
decode(encoder)

Decodes encoded labels and/or weights.

This operation reverses the encoding performed by encode.

Parameters:encoder – An EncodeMapper object used to encode the FST.
Returns:self.

See also: encode.

delete_arcs(state, n=None)

Deletes arcs leaving a particular state.

Parameters:
  • state – The integer index of a state.
  • n – An optional argument indicating how many arcs to be deleted. If this argument is None, all arcs from this state are deleted.
Returns:

self.

Raises:

IndexError – State index out of range.

See also: delete_states.

delete_states(states=None)

Deletes states.

Parameters:states – An optional iterable of integer indices of the states to be deleted. If this argument is omitted, all states are deleted.
Returns:self.
Raises:IndexError – State index out of range.

See also: delete_arcs.

draw(filename, isymbols=None, osymbols=None, ssymbols=None, acceptor=False, title='', width=8.5, height=11, portrait=False, vertical=False, ranksep=0.4, nodesep=0.25, fontsize=14, precision=5, float_format='g', show_weight_one=False)

Writes out the FST in Graphviz text format.

This method writes out the FST in the dot graph description language. The graph can be rendered using the dot binary provided by Graphviz.

Parameters:
  • filename (str) – The string location of the output dot/Graphviz file.
  • isymbols – An optional symbol table used to label input symbols.
  • osymbols – An optional symbol table used to label output symbols.
  • ssymbols – An optional symbol table used to label states.
  • acceptor (bool) – Should the figure be rendered in acceptor format if possible? Defaults to False.
  • title (str) – An optional string indicating the figure title. Defaults to the empty string.
  • width (float) – The figure width, in inches. Defaults to 8.5".
  • height (float) – The figure height, in inches. Defaults to 11".
  • portrait (bool) – Should the figure be rendered in portrait rather than landscape? Defaults to False.
  • vertical (bool) – Should the figure be rendered bottom-to-top rather than left-to-right?
  • ranksep (float) – The minimum separation between ranks, in inches. Defaults to 0.4".
  • nodesep (float) – The minimum separation between nodes, in inches. Defaults to 0.25".
  • fontsize (int) – Font size, in points. Defaults to 14pt.
  • precision (int) – Numeric precision for floats, in number of chars. Defaults to 5.
  • float_format ('e', 'f' or 'g') – One of: ‘e’, ‘f’ or ‘g’. Defaults to ‘g’.
  • show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults to False.

For more information about the rendering options, see man dot.

See also: text.

encode(encoder)

Encodes labels and/or weights.

This operation allows for the representation of a weighted transducer as a weighted acceptor, an unweighted transducer, or an unweighted acceptor by considering the pair (input label, output label), the pair (input label, weight), or the triple (input label, output label, weight) as a single label. Applying this operation mutates the EncodeMapper argument, which can then be used to decode.

Parameters:encoder – An EncodeMapper object used to encode the FST.
Returns:self.

See also: decode.

final(state)

Returns the final weight of a state.

Parameters:state – The integer index of a state.
Returns:The final Weight of that state.
Raises:IndexError – State index out of range.
from_bytes(s)

Returns the FST represented by the bytes object.

Parameters:s (bytes) – The bytes object representing the FST.
Returns:An FST object.
input_symbols()

Returns the input symbol table.

Returns:The input symbol table.

See also: output_symbols.

invert()

Inverts the FST’s transduction.

This operation destructively inverts the FST’s transduction by exchanging input and output labels.

Returns:self.
minimize(delta=0.0009765625, allow_nondet=False)

Minimizes the FST.

This operation destructively performs the minimization of deterministic weighted automata and transducers. If the input FST A is an acceptor, this operation produces the minimal acceptor B equivalent to A, i.e. the acceptor with a minimal number of states that is equivalent to A. If the input FST A is a transducer, this operation internally builds an equivalent transducer with a minimal number of states. However, this minimality is obtained by allowing transitions to have strings of symbols as output labels; this is known in the literature as a real-time transducer. Such transducers are not directly supported by the library, so this function converts them by expanding each string-labeled transition into a sequence of transitions. This creates new states and hence loses the minimality property.

Parameters:
  • delta – Comparison/quantization delta (default: 0.0009765625).
  • allow_nondet – Attempt minimization of non-deterministic FST?
Returns:

self.

mutable_arcs(state)

Returns a mutable iterator over arcs leaving the specified state.

Parameters:state – The source state index.
Returns:A MutableArcIterator.

See also: arcs, states.

num_arcs(state=None)

Returns the number of arcs, counting them if necessary.

If state is None, returns the number of arcs in the FST. Otherwise, returns the number of arcs leaving that state.

Parameters:state – The integer index of a state. Defaults to None.
Returns:The number of arcs leaving a state or the number of arcs in the FST.

Note: This method counts the number of arcs in the FST by iterating over the states and summing up the number of arcs leaving each state.

Raises:IndexError – State index out of range.

See also: num_states.

num_input_epsilons(state)

Returns the number of arcs with epsilon input labels leaving a state.

Parameters:state – The integer index of a state.
Returns:The number of epsilon-input-labeled arcs leaving that state.
Raises:IndexError – State index out of range.

See also: num_output_epsilons.

num_output_epsilons(state)

Returns the number of arcs with epsilon output labels leaving a state.

Parameters:state – The integer index of a state.
Returns:The number of epsilon-output-labeled arcs leaving that state.
Raises:IndexError – State index out of range.

See also: num_input_epsilons.

num_states()

Returns the number of states, counting them if necessary.

Returns:The number of states.

See also: num_arcs.

output_symbols()

Returns the output symbol table.

Returns:The output symbol table.

See also: input_symbols.

project(project_output=False)

Converts the FST to an acceptor using input or output labels.

This operation destructively projects an FST onto its domain or range by either copying each arc’s input label to its output label (the default) or vice versa.

Parameters:project_output – Project onto output labels?
Returns:self.

See also: decode, encode, relabel, relabel_tables.
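
As an illustration of what projection does to each arc, the following sketch applies it to a plain list of named tuples. ToyArc and the project function are hypothetical stand-ins; unlike the method documented above, the sketch returns a new list instead of mutating an FST in place.

```python
from collections import namedtuple

# Hypothetical arc record, for illustration only.
ToyArc = namedtuple("ToyArc", "ilabel olabel weight nextstate")


def project(arcs, project_output=False):
    """Returns arcs whose two labels agree, which makes them acceptor arcs."""
    if project_output:
        # Project onto the range: copy each output label to the input side.
        return [a._replace(ilabel=a.olabel) for a in arcs]
    # Default: project onto the domain by copying input labels to outputs.
    return [a._replace(olabel=a.ilabel) for a in arcs]


arcs = [ToyArc(1, 5, 0.0, 1), ToyArc(2, 6, 0.0, 2)]
on_input = project(arcs)
on_output = project(arcs, project_output=True)
```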

properties(mask, test)

Provides property bits.

This method provides user access to the properties attributes for the FST. The resulting value is a long integer, but when it is cast to a boolean, it represents whether or not the FST has the mask property.

Parameters:
  • mask – The property mask to be compared to the FST’s properties.
  • test – Should any unknown values be computed before comparing against the mask?
Returns:

A 64-bit bitmask representing the requested properties.
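
The remark about casting the result to a boolean is easiest to see with plain integers. In the sketch below the bit values and names (TOY_ACCEPTOR and so on) are placeholders invented for illustration; the real property constants are defined by the library and have different values.

```python
# Hypothetical property bits; not the library's actual constants.
TOY_ACCEPTOR = 0x1
TOY_I_DETERMINISTIC = 0x2
TOY_ACYCLIC = 0x4


def toy_properties(stored_props, mask):
    """Returns the stored property bits selected by the mask."""
    return stored_props & mask


stored = TOY_ACCEPTOR | TOY_ACYCLIC

# Single-bit masks: the truth value answers "does it have this property?".
is_acceptor = bool(toy_properties(stored, TOY_ACCEPTOR))
is_idet = bool(toy_properties(stored, TOY_I_DETERMINISTIC))

# Multi-bit masks: the truth value only says that at least one bit is set,
# so compare against the mask to require all of them.
wanted = TOY_ACCEPTOR | TOY_I_DETERMINISTIC
has_any = bool(toy_properties(stored, wanted))
has_all = toy_properties(stored, wanted) == wanted
```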

prune(weight=None, nstate=-1, delta=0.0009765625)

Removes paths with weights below a certain threshold.

This operation deletes states and arcs in the input FST that do not belong to a successful path whose weight is no more (w.r.t. the natural semiring order) than the threshold ⊗ the weight of the shortest path in the input FST. Weights must be commutative and have the path property.

Parameters:
  • weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired weight threshold below which paths are pruned; if None, no paths are pruned.
  • nstate – State number threshold (default: -1).
  • delta – Comparison/quantization delta (default: 0.0009765625).
Returns:

self.

See also: The constructive variant.
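
The pruning criterion can be shown on an explicit set of paths. This is a deliberately simplified sketch under stated assumptions: it uses the tropical semiring as a stand-in (times is +, natural order is <=), and it enumerates paths in a dict, whereas the actual operation works on states and arcs. The prune_paths name is hypothetical.

```python
def prune_paths(paths, threshold):
    """Keeps paths whose weight is at most threshold "times" the best weight."""
    best = min(paths.values())   # weight of the shortest path
    limit = best + threshold     # tropical "times" is ordinary addition
    return {p: w for p, w in paths.items() if w <= limit}


paths = {"a b": 1.0, "a c": 2.5, "d": 4.0}
kept = prune_paths(paths, threshold=2.0)
```

With a shortest path of weight 1.0 and a threshold of 2.0, only paths weighing at most 3.0 survive.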

push(to_final=False, delta=0.0009765625, remove_total_weight=False)

Pushes weights towards the initial or final states.

This operation destructively produces an equivalent transducer by pushing the weights towards the initial state or toward the final states. When pushing weights towards the initial state, the sum of the weight of the outgoing transitions and final weight at any non-initial state is equal to one in the resulting machine. When pushing weights towards the final states, the sum of the weight of the incoming transitions at any state is equal to one. Weights need to be left distributive when pushing towards the initial state and right distributive when pushing towards the final states.

Parameters:
  • to_final – Push towards final states?
  • delta – Comparison/quantization delta (default: 0.0009765625).
  • remove_total_weight – If pushing weights, should the total weight be removed?
Returns:

self.

See also: The constructive variant, which also supports label pushing.

read(filename)

Reads an FST from a file.

Parameters:filename (str) – The location of the input file.
Returns:An FST object.
Raises:RuntimeError – Read failed.
read_from_stream(strm, ropts)

Reads an FST from an input stream.

Parameters:
Returns:

An FST object.

Raises:

RuntimeError – Read failed.

relabel(ipairs=None, opairs=None)

Replaces input and/or output labels using pairs of labels.

This operation destructively relabels the input and/or output labels of the FST using pairs of the form (old_ID, new_ID); omitted indices are identity-mapped.

Parameters:
  • ipairs – An iterable containing (old index, new index) integer pairs.
  • opairs – An iterable containing (old index, new index) integer pairs.
Returns:

self.

Raises:

ValueError – No relabeling pairs specified.

See also: decode, encode, project, relabel_tables.
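
The pair-based relabeling rule, including the identity mapping of omitted labels and the error raised when no pairs are given, can be sketched on plain tuples. The relabel function below is a hypothetical model operating on (ilabel, olabel, weight, nextstate) tuples and returning a new list; it is not the method documented above.

```python
def relabel(arcs, ipairs=None, opairs=None):
    """Relabels arcs using (old, new) pairs; unlisted labels are unchanged."""
    if not ipairs and not opairs:
        raise ValueError("No relabeling pairs specified.")
    imap = dict(ipairs or [])
    omap = dict(opairs or [])
    return [
        (imap.get(i, i), omap.get(o, o), w, n)
        for (i, o, w, n) in arcs
    ]


arcs = [(1, 10, 0.0, 1), (2, 20, 0.0, 2)]

# Input label 1 becomes 7; label 2 has no pair and is identity-mapped.
relabeled = relabel(arcs, ipairs=[(1, 7)])
both_sides = relabel(arcs, ipairs=[(2, 8)], opairs=[(10, 11)])
```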

relabel_tables(old_isymbols=None, new_isymbols=None, unknown_isymbol='', attach_new_isymbols=True, old_osymbols=None, new_osymbols=None, unknown_osymbol='', attach_new_osymbols=True)

Replaces input and/or output labels using SymbolTables.

This operation destructively relabels the input and/or output labels of the FST using user-specified symbol tables; omitted symbols are identity-mapped.

Parameters:
  • old_isymbols – The old SymbolTable for input labels, defaulting to the FST’s input symbol table.
  • new_isymbols – A SymbolTable used to relabel the input labels.
  • unknown_isymbol – Input symbol to use to relabel OOVs (if empty, OOVs raise an exception).
  • attach_new_isymbols – Should new_isymbols be made the FST’s input symbol table?
  • old_osymbols – The old SymbolTable for output labels, defaulting to the FST’s output symbol table.
  • new_osymbols – A SymbolTable used to relabel the output labels.
  • unknown_osymbol – Output symbol to use to relabel OOVs (if empty, OOVs raise an exception).
  • attach_new_osymbols – Should new_osymbols be made the FST’s output symbol table?
Returns:

self.

Raises:

ValueError – No SymbolTable specified.

See also: decode, encode, project, relabel.

reserve_arcs(state, n)

Reserve n arcs at a particular state (best effort).

Parameters:
  • state – The integer index of a state.
  • n – The number of arcs to reserve.
Returns:

self.

Raises:

IndexError – State index out of range.

See also: reserve_states.

reserve_states(n)

Reserve n states (best effort).

Parameters:n – The number of states to reserve.
Returns:self.

See also: reserve_arcs.

reweight(potentials, to_final=False)

Reweights an FST using an iterable of potentials.

This operation destructively reweights an FST according to the potentials and in the direction specified by the user. An arc of weight w, with an origin state of potential p and destination state of potential q, is reweighted by p^{-1} ⊗ (w ⊗ q) when reweighting towards the initial state, and by (p ⊗ w) ⊗ q^{-1} when reweighting towards the final states. The weights must be left distributive when reweighting towards the initial state and right distributive when reweighting towards the final states (e.g., TropicalWeight and LogWeight).

Parameters:
  • potentials – An iterable of TropicalWeights.
  • to_final – Push towards final states?
Returns:

self.
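
The two arc formulas can be worked through numerically. The sketch below assumes the tropical semiring as a stand-in, where times is addition and the inverse is negation, and it only models arc weights along a single path; final weights and any adjustment at the start state are left out. The reweight_arc name is hypothetical.

```python
def reweight_arc(w, p, q, to_final=False):
    """Applies the arc formula with tropical "times" (+) and inverse (-)."""
    if to_final:
        return (p + w) - q   # (p "times" w) "times" inverse of q
    return -p + (w + q)      # inverse of p "times" (w "times" q)


# One potential per state, and a path 0 -> 1 -> 2 as (src, dst, weight).
potentials = [0.5, 2.0, 1.0]
path = [(0, 1, 3.0), (1, 2, 4.0)]

to_initial = [
    reweight_arc(w, potentials[s], potentials[d]) for s, d, w in path
]
to_final = [
    reweight_arc(w, potentials[s], potentials[d], to_final=True)
    for s, d, w in path
]

original_total = sum(w for _, _, w in path)
```

Along a path the intermediate potentials cancel, so the total arc weight changes only by the potentials of the first and last states.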

rmepsilon(connect=True, weight=None, nstate=-1, delta=0.0009765625)

Removes epsilon transitions.

This operation destructively removes epsilon transitions (i.e., those where both input and output labels are epsilon) from an FST.

Parameters:
  • connect – Should output be trimmed?
  • weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired weight threshold below which paths are pruned; if None, no paths are pruned.
  • nstate – State number threshold (default: -1).
  • delta – Comparison/quantization delta (default: 0.0009765625).
Returns:

self.

See also: The constructive variant, which also supports epsilon removal in reverse (and which may be more efficient).
set_final(state, weight=None)

Sets the final weight for a state.

Parameters:
  • state – The integer index of a state.
  • weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired final weight; if omitted, it is set to semiring One.
Raises:

IndexError – State index out of range.

See also: set_start.

set_input_symbols(syms)

Sets the input symbol table.

Passing None as a value will delete the input symbol table.

Parameters:syms – A SymbolTable.
Returns:self.

See also: set_output_symbols.

set_output_symbols(syms)

Sets the output symbol table.

Passing None as a value will delete the output symbol table.

Parameters:syms – A SymbolTable.
Returns:self.

See also: set_input_symbols.

set_properties(props, mask)

Sets the properties bits.

Parameters:
  • props (int) – The properties to be set.
  • mask (int) – A mask to be applied to the props argument before setting the FST’s properties.
Returns:

self.

set_start(state)

Sets the initial state.

Parameters:state – The integer index of a state.
Returns:self.
Raises:IndexError – State index out of range.

See also: set_final.

start()

Returns the start state.

Returns:The start state if start state is set, -1 otherwise.
states()

Returns an iterator over all states in the FST.

Returns:A StateIterator object for the FST.

See also: arcs, mutable_arcs.

text(isymbols=None, osymbols=None, ssymbols=None, acceptor=False, show_weight_one=False, missing_symbol='')

Produces a human-readable string representation of the FST.

This method generates a human-readable string representation of the FST. The caller may optionally specify SymbolTables used to label input labels, output labels, or state labels, respectively.

Parameters:
  • isymbols – An optional symbol table used to label input symbols.
  • osymbols – An optional symbol table used to label output symbols.
  • ssymbols – An optional symbol table used to label states.
  • acceptor (bool) – Should the FST be rendered in acceptor format if possible? Defaults False.
  • show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults False.
  • missing_symbol – The string to be printed when symbol table lookup fails.
Returns:

A formatted string representing the FST.

to_bytes()

Returns a bytes object representing the FST.

Returns:A bytes object.
topsort()

Sorts transitions by state IDs.

This operation destructively topologically sorts the FST if it is acyclic; otherwise it remains unchanged. Once sorted, all transitions are from lower state IDs to higher state IDs.

Returns:self.

See also: arcsort.
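
The postcondition (every transition goes from a lower state ID to a higher one) can be demonstrated with a small standalone sketch. It uses Kahn's algorithm on a list of (source, destination) pairs, which is a choice made for this illustration and not necessarily how the library computes the order; topsort_order is a hypothetical name.

```python
from collections import deque


def topsort_order(num_states, arcs):
    """Returns a list mapping old state ID to new ID, or None if cyclic."""
    indegree = [0] * num_states
    successors = [[] for _ in range(num_states)]
    for src, dst in arcs:
        successors[src].append(dst)
        indegree[dst] += 1

    # Kahn's algorithm: repeatedly emit states with no unprocessed inputs.
    queue = deque(s for s in range(num_states) if indegree[s] == 0)
    order = []
    while queue:
        s = queue.popleft()
        order.append(s)
        for d in successors[s]:
            indegree[d] -= 1
            if indegree[d] == 0:
                queue.append(d)

    if len(order) != num_states:
        return None  # a cycle prevents a topological order

    new_id = [0] * num_states
    for rank, s in enumerate(order):
        new_id[s] = rank
    return new_id


arcs = [(2, 0), (0, 1), (2, 1)]
new_id = topsort_order(3, arcs)
renumbered = [(new_id[s], new_id[d]) for s, d in arcs]

# A two-state cycle has no topological order.
cyclic = topsort_order(2, [(0, 1), (1, 0)])
```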

type()

Returns the FST type.

Returns:The FST type.
union(ifst)

Computes the union (sum) of two FSTs.

This operation computes the union (sum) of two FSTs. If A transduces string x to y with weight a and B transduces string w to v with weight b, then their union transduces x to y with weight a and w to v with weight b.

Parameters:ifst – The second input FST.
Returns:self.
verify()

Verifies that an FST’s contents are sane.

Returns:True if the contents are sane, False otherwise.
write(filename)

Serializes FST to a file.

This method writes the FST to a file in a binary format.

Parameters:filename (str) – The location of the output file.
Raises:IOError – Write failed.
write_to_stream(strm, wopts)

Serializes FST to an output stream.

Parameters:
Returns:

True if write was successful, False otherwise.

Raises:

RuntimeError – Write failed.

class kaldi.fstext.KwsIndexVectorFstArcIterator(fst, state)[source]

Arc iterator for a vector FST over the KWS index semiring.

This class is used for iterating over the arcs leaving some state. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the arcs method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.

Creates a new arc iterator.

Parameters:
  • fst – The fst.
  • state – The state index.
Raises:

IndexError – State index out of range.

done()

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:True if the iterator is exhausted, False otherwise.
flags()

Returns the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:The current iterator behavioral flags as an integer.
next()

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

position()

Returns the position of the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:The iterator’s position, expressed as an integer.
reset()

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

seek(a)

Advance the iterator to a new position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:a (int) – The position to seek to.
set_flags(flags, mask)

Sets the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:
  • flags (int) – The properties to be set.
  • mask (int) – A mask to be applied to the flags argument before setting them.
value()

Returns the current arc.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

class kaldi.fstext.KwsIndexVectorFstMutableArcIterator(fst, state)[source]

Mutable arc iterator for a vector FST over the KWS index semiring.

This class is used for iterating over the arcs leaving some state and optionally replacing them with new ones. In addition to the full C++ API, it also supports the iterator protocol. Calling the __iter__ method of a mutable arc iterator object returns an iterator over (arc, setter) pairs. The setter is a bound method of the mutable arc iterator object that can be used to replace the current arc with a new one. Most users should just call the mutable_arcs method of a vector FST object instead of directly constructing this iterator and take advantage of the Pythonic API, e.g.

for arc, setter in fst.mutable_arcs(0):
    setter(KwsIndexArc(arc.ilabel, 0, arc.weight, arc.nextstate))

Creates a new arc iterator.

Parameters:
  • fst – The fst.
  • state – The state index.
Raises:

IndexError – State index out of range.
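
The (arc, setter) pairing described above can be modelled with a generator that yields a bound method alongside each arc. The classes below are hypothetical stand-ins (ToyArc, ToyMutableArcIterator) operating on a plain Python list; they only illustrate the pattern and are not PyKaldi types.

```python
from collections import namedtuple

# Hypothetical arc record, for illustration only.
ToyArc = namedtuple("ToyArc", "ilabel olabel weight nextstate")


class ToyMutableArcIterator:
    """Hypothetical stand-in yielding (arc, setter) pairs over a list."""

    def __init__(self, arcs):
        self._arcs = arcs  # mutated in place by set_value
        self._pos = 0

    def set_value(self, arc):
        # Replaces the arc at the current position.
        self._arcs[self._pos] = arc

    def __iter__(self):
        while self._pos < len(self._arcs):
            # The setter is a bound method, so it targets the current arc.
            yield self._arcs[self._pos], self.set_value
            self._pos += 1


arcs = [ToyArc(1, 5, 0.0, 1), ToyArc(2, 6, 0.0, 2)]

# Replace every output label with 0 (epsilon), as in the example above.
for arc, setter in ToyMutableArcIterator(arcs):
    setter(arc._replace(olabel=0))
```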

done()

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:True if the iterator is exhausted, False otherwise.
flags()

Returns the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:The current iterator behavioral flags as an integer.
next()

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

position()

Returns the position of the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:The iterator’s position, expressed as an integer.
reset()

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

seek(a)

Advance the iterator to a new position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:a (int) – The position to seek to.
set_flags(flags, mask)

Sets the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:
  • flags (int) – The properties to be set.
  • mask (int) – A mask to be applied to the flags argument before setting them.
set_value(arc)

Replace the current arc with a new arc.

Parameters:arc – The arc to replace the current arc with.
value()

Returns the current arc.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

class kaldi.fstext.KwsIndexVectorFstStateIterator(fst)[source]

State iterator for a vector FST over the KWS index semiring.

This class is used for iterating over the states. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the states method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.

Creates a new state iterator.

Parameters:fst – The fst.
done()

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:True if the iterator is exhausted, False otherwise.
next()

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

reset()

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

value()

Returns the current state index.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

class kaldi.fstext.KwsIndexWeight[source]

KWS index weight factory.

This class is used for creating new KwsIndexWeight instances.

KwsIndexWeight():
Creates an uninitialized KwsIndexWeight instance.
KwsIndexWeight(weight):
Creates a new KwsIndexWeight instance initialized with the weight.
Parameters:weight (Tuple[float, Tuple[float, float]] or Tuple[TropicalWeight, KwsTimeWeight] or KwsIndexWeight) – A pair of weight values or another KwsIndexWeight instance.
KwsIndexWeight(weight1, weight2):
Creates a new KwsIndexWeight instance initialized with the weights.
Parameters:
from_components(w1:TropicalWeight, w2:KwsTimeWeight) → KwsIndexWeight

Creates a new KWS index weight from component weights.

member() → bool

Checks if weight is a member of the KWS index semiring.

no_weight() → KwsIndexWeight

No weight in KWS index semiring.

one() → KwsIndexWeight

One in KWS index semiring.

properties() → int

Returns weight properties.

quantize(delta:float=default) → KwsIndexWeight

Quantizes the weight.

reverse() → KwsIndexWeight

Reverses the weight.

type() → str

Returns weight type.

value1

The first component weight.

value2

The second component weight.

zero() → KwsIndexWeight

Zero in KWS index semiring.

class kaldi.fstext.KwsTimeWeight[source]

KWS time weight factory.

This class is used for creating new KwsTimeWeight instances.

KwsTimeWeight():
Creates an uninitialized KwsTimeWeight instance.
KwsTimeWeight(weight):
Creates a new KwsTimeWeight instance initialized with the weight.
Parameters:
KwsTimeWeight(weight1, weight2):
Creates a new KwsTimeWeight instance initialized with the weights.
Parameters:
  • weight1 (float) – The first weight value.
  • weight2 (float) – The second weight value.
from_components(w1:TropicalWeight, w2:TropicalWeight) → KwsTimeWeight

Creates a new KWS time weight from component weights.

member() → bool

Checks if weight is a member of the KWS time semiring.

no_weight() → KwsTimeWeight

No weight in the KWS time semiring.

one() → KwsTimeWeight

One in the KWS time semiring.

properties() → int

Returns weight properties.

quantize(delta:float=default) → KwsTimeWeight

Quantizes the weight.

reverse() → KwsTimeWeight

Reverses the weight.

type() → str

Returns weight type.

value1

The first component weight.

value2

The second component weight.

zero() → KwsTimeWeight

Zero in the KWS time semiring.

class kaldi.fstext.LatticeArc[source]

FST arc with lattice weight.

LatticeArc():
Creates an uninitialized LatticeArc instance.
LatticeArc(ilabel, olabel, weight, nextstate):
Creates a new LatticeArc instance initialized with the given arguments.
Parameters:
  • ilabel (int) – The input label.
  • olabel (int) – The output label.
  • weight (LatticeWeight) – The arc weight.
  • nextstate (int) – The destination state for the arc.
from_attrs(ilabel:int, olabel:int, weight:LatticeWeight, nextstate:int) → LatticeArc

Creates a new arc with the given attributes.

Parameters:
  • ilabel (int) – The input label.
  • olabel (int) – The output label.
  • weight (LatticeWeight) – The arc weight.
  • nextstate (int) – The destination state for the arc.
ilabel

int – The input label.

nextstate

int – The destination state for the arc.

olabel

int – The output label.

type() → str

Returns arc type.

weight

LatticeWeight – The arc weight.

class kaldi.fstext.LatticeConstFst(fst=None)[source]

Constant FST over the lattice semiring.

Parameters:fst (LatticeFst) – The input FST over the lattice semiring. If provided, its contents are used for initializing the new FST. Defaults to None.
arcs(state)

Returns an iterator over arcs leaving the specified state.

Parameters:state – The source state index.
Returns:An ArcIterator.

See also: mutable_arcs, states.

copy()

Makes a copy of the FST.

Returns:A copy of the FST.
draw(filename, isymbols=None, osymbols=None, ssymbols=None, acceptor=False, title='', width=8.5, height=11, portrait=False, vertical=False, ranksep=0.4, nodesep=0.25, fontsize=14, precision=5, float_format='g', show_weight_one=False)

Writes out the FST in Graphviz text format.

This method writes out the FST in the dot graph description language. The graph can be rendered using the dot binary provided by Graphviz.

Parameters:
  • filename (str) – The string location of the output dot/Graphviz file.
  • isymbols – An optional symbol table used to label input symbols.
  • osymbols – An optional symbol table used to label output symbols.
  • ssymbols – An optional symbol table used to label states.
  • acceptor (bool) – Should the figure be rendered in acceptor format if possible? Defaults to False.
  • title (str) – An optional string indicating the figure title. Defaults to empty string.
  • width (float) – The figure width, in inches. Defaults to 8.5.
  • height (float) – The figure height, in inches. Defaults to 11.
  • portrait (bool) – Should the figure be rendered in portrait rather than landscape? Defaults to False.
  • vertical (bool) – Should the figure be rendered bottom-to-top rather than left-to-right? Defaults to False.
  • ranksep (float) – The minimum separation between ranks, in inches. Defaults to 0.4.
  • nodesep (float) – The minimum separation between nodes, in inches. Defaults to 0.25.
  • fontsize (int) – Font size, in points. Defaults to 14.
  • precision (int) – Numeric precision for floats, in number of chars. Defaults to 5.
  • float_format ('e', 'f' or 'g') – One of ‘e’, ‘f’ or ‘g’. Defaults to ‘g’.
  • show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults to False.

For more information about the rendering options, see man dot.

See also: text.

final(state)

Returns the final weight of a state.

Parameters:state – The integer index of a state.
Returns:The final Weight of that state.
Raises:IndexError – State index out of range.
from_bytes(s)

Returns the FST represented by the bytes object.

Parameters:s (bytes) – The bytes object representing the FST.
Returns:An FST object.
input_symbols()

Returns the input symbol table.

Returns:The input symbol table.

See also: output_symbols.

num_arcs(state=None)

Returns the number of arcs, counting them if necessary.

If state is None, returns the number of arcs in the FST. Otherwise, returns the number of arcs leaving that state.

Parameters:state – The integer index of a state. Defaults to None.
Returns:The number of arcs leaving a state or the number of arcs in the FST.

Note: This method counts the number of arcs in the FST by iterating over the states and summing up the number of arcs leaving each state.

Raises:IndexError – State index out of range.

See also: num_states.
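
The counting behaviour described in the note can be sketched in plain Python. This is only an illustration over a list of per-state arc lists; the function and variable names are made up for the example and are not part of PyKaldi.

```python
def num_arcs(arcs_by_state, state=None):
    """Count arcs in a toy FST stored as a list of per-state arc lists."""
    if state is None:
        # Total count: visit every state and sum its outgoing arcs.
        return sum(len(arcs) for arcs in arcs_by_state)
    if not 0 <= state < len(arcs_by_state):
        raise IndexError("State index out of range.")
    return len(arcs_by_state[state])


# Three states: state 0 has two arcs, state 1 has one, state 2 has none.
toy_fst = [["arc_a", "arc_b"], ["arc_c"], []]
total = num_arcs(toy_fst)
leaving_state_0 = num_arcs(toy_fst, 0)
```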

num_input_epsilons(state)

Returns the number of arcs with epsilon input labels leaving a state.

Parameters:state – The integer index of a state.
Returns:The number of epsilon-input-labeled arcs leaving that state.
Raises:IndexError – State index out of range.

See also: num_output_epsilons.

num_output_epsilons(state)

Returns the number of arcs with epsilon output labels leaving a state.

Parameters:state – The integer index of a state.
Returns:The number of epsilon-output-labeled arcs leaving that state.
Raises:IndexError – State index out of range.

See also: num_input_epsilons.

num_states()

Returns the number of states, counting them if necessary.

Returns:The number of states.

See also: num_arcs.

output_symbols()

Returns the output symbol table.

Returns:The output symbol table.

See also: input_symbols.

properties(mask, test)

Provides property bits.

This method provides user access to the properties attributes for the FST. The resulting value is a long integer, but when it is cast to a boolean, it represents whether or not the FST has the mask property.

Parameters:
  • mask – The property mask to be compared to the FST’s properties.
  • test – Should any unknown values be computed before comparing against the mask?
Returns:

A 64-bit bitmask representing the requested properties.
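
As a rough illustration of the boolean-cast behaviour described above, the following pure-Python sketch masks a property word. The bit values and names (ACCEPTOR, I_DETERMINISTIC, WEIGHTED) are placeholders chosen for the example; they are not the constants defined by OpenFst or PyKaldi.

```python
# Hypothetical property bits; the real constants are defined by the library.
ACCEPTOR = 1 << 0
I_DETERMINISTIC = 1 << 1
WEIGHTED = 1 << 2


def masked_properties(props, mask):
    """Return the bits of `props` that are selected by `mask`."""
    return props & mask


# A toy FST whose property word says "acceptor" and "weighted".
fst_props = ACCEPTOR | WEIGHTED

# Casting the masked result to bool answers "is this property set?".
has_acceptor = bool(masked_properties(fst_props, ACCEPTOR))
has_idet = bool(masked_properties(fst_props, I_DETERMINISTIC))
```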

read(filename)

Reads an FST from a file.

Parameters:filename (str) – The location of the input file.
Returns:An FST object.
Raises:RuntimeError – Read failed.
read_from_stream(strm, ropts)

Reads an FST from an input stream.

Parameters:
Returns:

An FST object.

Raises:

RuntimeError – Read failed.

start()

Returns the start state.

Returns:The start state if start state is set, -1 otherwise.
states()

Returns an iterator over all states in the FST.

Returns:A StateIterator object for the FST.

See also: arcs, mutable_arcs.

text(isymbols=None, osymbols=None, ssymbols=None, acceptor=False, show_weight_one=False, missing_symbol='')

Produces a human-readable string representation of the FST.

This method generates a human-readable string representation of the FST. The caller may optionally specify SymbolTables used to label input labels, output labels, or state labels, respectively.

Parameters:
  • isymbols – An optional symbol table used to label input symbols.
  • osymbols – An optional symbol table used to label output symbols.
  • ssymbols – An optional symbol table used to label states.
  • acceptor (bool) – Should the FST be rendered in acceptor format if possible? Defaults to False.
  • show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults to False.
  • missing_symbol – The string to be printed when symbol table lookup fails.
Returns:

A formatted string representing the FST.

to_bytes()

Returns a bytes object representing the FST.

Returns:A bytes object.
type()

Returns the FST type.

Returns:The FST type.
verify()

Verifies that an FST’s contents are sane.

Returns:True if the contents are sane, False otherwise.
write(filename)

Serializes FST to a file.

This method writes the FST to a file in a binary format.

Parameters:filename (str) – The location of the output file.
Raises:IOError – Write failed.
write_to_stream(strm, wopts)

Serializes FST to an output stream.

Parameters:
Returns:

True if write was successful, False otherwise.

Raises:

RuntimeError – Write failed.

class kaldi.fstext.LatticeConstFstArcIterator(fst, state)[source]

Arc iterator for a constant FST over the lattice semiring.

This class is used for iterating over the arcs leaving some state. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the arcs method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.

Creates a new arc iterator.

Parameters:
  • fst – The fst.
  • state – The state index.
Raises:

IndexError – State index out of range.

done()

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:True if the iterator is exhausted, False otherwise.
flags()

Returns the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:The current iterator behavioral flags as an integer.
next()

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

position()

Returns the position of the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:The iterator’s position, expressed as an integer.
reset()

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

seek(a)

Advances the iterator to a new position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:a (int) – The position to seek to.
set_flags(flags, mask)

Sets the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:
  • flags (int) – The properties to be set.
  • mask (int) – A mask to be applied to the flags argument before setting them.
value()

Returns the current arc.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
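
To make the difference between the C++-style methods and the iterator protocol concrete, here is a toy stand-in written in plain Python. SketchArcIterator is a hypothetical class for illustration only; how the real iterator implements the protocol internally (for example, whether iteration resets the position) is an assumption.

```python
class SketchArcIterator:
    """Toy stand-in for an arc iterator; not the PyKaldi class."""

    def __init__(self, arcs):
        self._arcs = list(arcs)
        self._pos = 0

    # C++-style interface.
    def done(self):
        return self._pos >= len(self._arcs)

    def next(self):
        self._pos += 1

    def value(self):
        return self._arcs[self._pos]

    def reset(self):
        self._pos = 0

    def position(self):
        return self._pos

    def seek(self, a):
        self._pos = a

    # Python iterator protocol, built on the methods above.
    def __iter__(self):
        self.reset()
        while not self.done():
            yield self.value()
            self.next()


# Arcs as (ilabel, olabel, weight, nextstate) tuples.
arcs = [(1, 1, 0.5, 1), (2, 2, 1.5, 2)]

# C++-style loop.
it = SketchArcIterator(arcs)
cpp_style = []
while not it.done():
    cpp_style.append(it.value())
    it.next()

# Pythonic loop over the same data.
pythonic = [arc for arc in SketchArcIterator(arcs)]
```

Both loops visit the same arcs in the same order; the second form is what calling the arcs method of an FST object is meant to enable.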

class kaldi.fstext.LatticeConstFstStateIterator(fst)[source]

State iterator for a constant FST over the lattice semiring.

This class is used for iterating over the states. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the states method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.

Creates a new state iterator.

Parameters:fst – The fst.
done()

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:True if the iterator is exhausted, False otherwise.
next()

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

reset()

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

value()

Returns the current state index.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

class kaldi.fstext.LatticeEncodeMapper(encode_labels=False, encode_weights=False, encode=True)[source]

Arc encoder for an FST over the lattice semiring.

This class provides an object which can be used to encode or decode FST arcs. This is most useful for converting an FST to an unweighted acceptor, on which some FST operations are more efficient, and then decoding the FST afterwards.

To use an instance of this class to encode or decode a mutable FST, pass it as the first argument to the FST instance methods encode and decode. Alternatively, an instance of this class can be used as a callable to encode/decode arcs.

Parameters:
  • encode_labels (bool) – Should labels be encoded?
  • encode_weights (bool) – Should weights be encoded?
  • encode (bool) – Encode or decode?
flags() → int

Returns encoder flags.

from_other(mapper:LatticeEncodeMapper) → LatticeEncodeMapper

Creates a new encoder with the contents of another.

from_other_with_type(mapper:LatticeEncodeMapper, type:EncodeType) → LatticeEncodeMapper

Creates a new encoder with the contents of another and given type.

input_symbols() → SymbolTable

Returns input symbol table.

output_symbols() → SymbolTable

Returns output symbol table.

properties(inprops:int) → int

Provides property bits.

This method provides user access to the properties attributes for the encoder. The resulting value is a long integer, but when it is cast to a boolean, it represents whether or not the encoder has the mask property.

Parameters:mask – The property mask to be compared to the encoder’s properties.
Returns:A 64-bit bitmask representing the requested properties.
read(filename:str, type:EncodeType=default) → LatticeEncodeMapper

Reads encoder from file.

set_input_symbols(syms:SymbolTable)

Sets the input symbol table.

Parameters:syms – A SymbolTable.

See also: set_output_symbols.

set_output_symbols(syms:SymbolTable)

Sets the output symbol table.

Parameters:syms – A SymbolTable.

See also: set_input_symbols.

type() → EncodeType

Returns encoder type.

write(filename:str) → bool

Writes encoder to file.

Returns:True if write was successful, False otherwise.
class kaldi.fstext.LatticeEncodeTable

Encode table for LatticeArc.

LatticeEncodeTable(flags):
Creates a new encode table with the given flags.
class Tuple

LatticeArc encoding tuple.

ilabel

Input label.

olabel

Output label.

weight

Weight.

decode(key:int) → Tuple

Decodes an encoded arc label back to labels and cost.

encode(arc:LatticeArc) → int

Encodes the given arc (either labels or weights or both).

flags() → int

Returns encoding flags.

get_label(arc:LatticeArc) → int

Looks up the encoded label for the given arc.

Returns -1 if arc is not found.
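
The encode, decode and get_label methods can be pictured as a two-way lookup between arc tuples and integer labels. The sketch below is a simplified pure-Python model, not the PyKaldi class: SketchEncodeTable, its flag arguments, the placeholder values, and the choice to number labels from 1 are all assumptions made for illustration.

```python
class SketchEncodeTable:
    """Toy table mapping (ilabel, olabel, weight) tuples to integer labels."""

    def __init__(self, encode_labels=True, encode_weights=True):
        self.encode_labels = encode_labels
        self.encode_weights = encode_weights
        self._tuples = []   # position (label - 1) -> tuple
        self._labels = {}   # tuple -> label

    def _key(self, ilabel, olabel, weight):
        # Components that are not being encoded are replaced by placeholders.
        return (ilabel,
                olabel if self.encode_labels else 0,
                weight if self.encode_weights else None)

    def encode(self, ilabel, olabel, weight):
        key = self._key(ilabel, olabel, weight)
        if key not in self._labels:
            self._tuples.append(key)
            # Labels start at 1 here so that 0 stays free for epsilon.
            self._labels[key] = len(self._tuples)
        return self._labels[key]

    def get_label(self, ilabel, olabel, weight):
        # Lookup only; never adds a new entry.
        return self._labels.get(self._key(ilabel, olabel, weight), -1)

    def decode(self, label):
        return self._tuples[label - 1]

    def size(self):
        return len(self._tuples)


table = SketchEncodeTable()
a = table.encode(3, 7, 0.5)
b = table.encode(3, 7, 0.5)   # same tuple, same label
c = table.encode(3, 8, 0.5)   # new tuple, new label
missing = table.get_label(9, 9, 0.0)
```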

input_symbols() → SymbolTable

Returns input symbols.

output_symbols() → SymbolTable

Returns output symbols.

read(strm:istream, source:str) → LatticeEncodeTable

Reads encode table from input stream.

set_input_symbols(syms:SymbolTable)

Sets input symbols.

set_output_symbols(syms:SymbolTable)

Sets output symbols.

size() → int

Returns the size of the table.

write(strm:ostream, source:str) → bool

Writes table to output stream.

class kaldi.fstext.LatticeFstCompiler(isymbols=None, osymbols=None, ssymbols=None, acceptor=False, keep_isymbols=False, keep_osymbols=False, keep_state_numbering=False, allow_negative_labels=False)[source]

Compiler for FSTs over the lattice semiring.

This class is used to compile FSTs specified using the AT&T FSM library format described here:

http://web.eecs.umich.edu/~radev/NLP-fall2015/resources/fsm_archive/fsm.5.html

This is the same format used by the fstcompile executable.

FstCompiler options (symbol tables, etc.) are set at construction time:

compiler = FstCompiler(isymbols=ascii_syms, osymbols=ascii_syms)

Once constructed, FstCompiler instances behave like a file handle opened for writing:

# /ba+/
print("0 1 50 50", file=compiler)
print("1 2 49 49", file=compiler)
print("2 2 49 49", file=compiler)
print("2", file=compiler)

The compile method returns an actual FST instance:

sheep_machine = compiler.compile()

Compilation flushes the internal buffer, so the compiler instance can be reused to compile new machines with the same symbol tables, etc.
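
For readers unfamiliar with the text format, the following pure-Python sketch shows how lines like the ones above can be split into arcs and final states. It is a simplified reader for the transducer variant with numeric labels only, written for illustration; parse_att_text is a hypothetical helper and does not reflect how the compiler is implemented.

```python
def parse_att_text(text):
    """Parse transducer-format AT&T FSM text into (start, arcs, finals)."""
    arcs = []     # (src, dst, ilabel, olabel, weight)
    finals = {}   # state -> weight
    start = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if start is None:
            # The first state mentioned in the file is the start state.
            start = int(fields[0])
        if len(fields) >= 4:
            # Arc line: src dst ilabel olabel [weight]
            src, dst, ilabel, olabel = (int(f) for f in fields[:4])
            weight = fields[4] if len(fields) > 4 else None
            arcs.append((src, dst, ilabel, olabel, weight))
        else:
            # Final-state line: state [weight]
            weight = fields[1] if len(fields) > 1 else None
            finals[int(fields[0])] = weight
    return start, arcs, finals


text = """0 1 50 50
1 2 49 49
2 2 49 49
2
"""
start, arcs, finals = parse_att_text(text)
```

Weights are kept as raw strings (or None when omitted) because their textual form depends on the semiring.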

Parameters:
  • isymbols – An optional SymbolTable used to label input symbols.
  • osymbols – An optional SymbolTable used to label output symbols.
  • ssymbols – An optional SymbolTable used to label states.
  • acceptor – Should the FST be rendered in acceptor format if possible?
  • keep_isymbols – Should the input symbol table be stored in the FST?
  • keep_osymbols – Should the output symbol table be stored in the FST?
  • keep_state_numbering – Should the state numbering be preserved?
  • allow_negative_labels – Should negative labels be allowed? (Not recommended; may cause conflicts).
compile()

Compiles the FST in the string buffer.

This method compiles the FST and returns the resulting machine.

Returns:The FST described by the string buffer.
Raises:RuntimeError – Compilation failed.
write(expression)

Writes a string into the compiler string buffer.

This method adds a line to the compiler string buffer. It can also be invoked with a print call, like so:

compiler = FstCompiler()
print("0 0 49 49", file=compiler)
print("0", file=compiler)
Parameters:expression – A string expression to add to compiler string buffer.
class kaldi.fstext.LatticeVectorFst(fst=None)[source]

Vector FST over the lattice semiring.

Parameters:fst (LatticeFst) – The input FST over the lattice semiring. If provided, its contents are used for initializing the new FST. Defaults to None.
add_arc(state, arc)

Adds a new arc to the FST and returns self.

Parameters:
  • state – The integer index of the source state.
  • arc – The arc to add.
Returns:

self.

Raises:

IndexError – State index out of range.

See also: add_state.

add_state()

Adds a new state to the FST and returns the state ID.

Returns:The integer index of the new state.

See also: add_arc, set_start, set_final.

arcs(state)

Returns an iterator over arcs leaving the specified state.

Parameters:state – The source state index.
Returns:An ArcIterator.

See also: mutable_arcs, states.

arcsort(sort_type='ilabel')

Sorts arcs leaving each state of the FST.

This operation destructively sorts arcs leaving each state using either input or output labels.

Parameters:sort_type – Either “ilabel” (sort arcs according to input labels) or “olabel” (sort arcs according to output labels).
Returns:self.
Raises:ValueError – Unknown sort type.

See also: topsort.

closure(closure_plus=False)

Computes concatenative closure.

This operation destructively converts the FST to its concatenative closure. If A transduces string x to y with weight a, then the closure transduces x to y with weight a, xx to yy with weight a ⊗ a, xxx to yyy with weight a ⊗ a ⊗ a, and so on. The empty string is also transduced to itself with semiring One if closure_plus is False.

Parameters:closure_plus – If True, do not accept the empty string.
Returns:self.
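
The weights produced by closure can be worked out by hand. The sketch below does so in plain Python for an FST with a single path x to y, assuming the semiring product adds (graph cost, acoustic cost) pairs componentwise, as in Kaldi's lattice semiring. The helper names are made up for the example.

```python
LATTICE_ONE = (0.0, 0.0)


def lattice_times(a, b):
    """Semiring product: componentwise addition of cost pairs."""
    return (a[0] + b[0], a[1] + b[1])


def closure_weight(a, n, closure_plus=False):
    """Weight given to x repeated n times when x has weight a.

    Returns None when the string is not accepted.
    """
    if n == 0:
        # The empty string is accepted only when closure_plus is False.
        return None if closure_plus else LATTICE_ONE
    weight = LATTICE_ONE
    for _ in range(n):
        weight = lattice_times(weight, a)
    return weight


a = (1.0, 0.5)
once = closure_weight(a, 1)
three_times = closure_weight(a, 3)
empty_star = closure_weight(a, 0)
empty_plus = closure_weight(a, 0, closure_plus=True)
```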
concat(ifst)

Computes the concatenation (product) of two FSTs.

This operation destructively concatenates the FST with a second FST. If A transduces string x to y with weight a and B transduces string w to v with weight b, then their concatenation transduces string xw to yv with weight a ⊗ b.

Parameters:ifst – The second input FST.
Returns:self.
connect()

Removes unsuccessful paths.

This operation destructively trims the FST, removing states and arcs that are not part of any successful path.

Returns:self.
copy()

Makes a copy of the FST.

Returns:A copy of the FST.
decode(encoder)

Decodes encoded labels and/or weights.

This operation reverses the encoding performed by encode.

Parameters:encoder – An EncodeMapper object used to encode the FST.
Returns:self.

See also: encode.

delete_arcs(state, n=None)

Deletes arcs leaving a particular state.

Parameters:
  • state – The integer index of a state.
  • n – An optional argument indicating how many arcs to be deleted. If this argument is None, all arcs from this state are deleted.
Returns:

self.

Raises:

IndexError – State index out of range.

See also: delete_states.

delete_states(states=None)

Deletes states.

Parameters:states – An optional iterable of integer indices of the states to be deleted. If this argument is omitted, all states are deleted.
Returns:self.
Raises:IndexError – State index out of range.

See also: delete_arcs.

draw(filename, isymbols=None, osymbols=None, ssymbols=None, acceptor=False, title='', width=8.5, height=11, portrait=False, vertical=False, ranksep=0.4, nodesep=0.25, fontsize=14, precision=5, float_format='g', show_weight_one=False)

Writes out the FST in Graphviz text format.

This method writes out the FST in the dot graph description language. The graph can be rendered using the dot binary provided by Graphviz.

Parameters:
  • filename (str) – The string location of the output dot/Graphviz file.
  • isymbols – An optional symbol table used to label input symbols.
  • osymbols – An optional symbol table used to label output symbols.
  • ssymbols – An optional symbol table used to label states.
  • acceptor (bool) – Should the figure be rendered in acceptor format if possible? Defaults to False.
  • title (str) – An optional string indicating the figure title. Defaults to empty string.
  • width (float) – The figure width, in inches. Defaults to 8.5.
  • height (float) – The figure height, in inches. Defaults to 11.
  • portrait (bool) – Should the figure be rendered in portrait rather than landscape? Defaults to False.
  • vertical (bool) – Should the figure be rendered bottom-to-top rather than left-to-right? Defaults to False.
  • ranksep (float) – The minimum separation between ranks, in inches. Defaults to 0.4.
  • nodesep (float) – The minimum separation between nodes, in inches. Defaults to 0.25.
  • fontsize (int) – Font size, in points. Defaults to 14.
  • precision (int) – Numeric precision for floats, in number of chars. Defaults to 5.
  • float_format ('e', 'f' or 'g') – One of ‘e’, ‘f’ or ‘g’. Defaults to ‘g’.
  • show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults to False.

For more information about the rendering options, see man dot.

See also: text.

encode(encoder)

Encodes labels and/or weights.

This operation allows for the representation of a weighted transducer as a weighted acceptor, an unweighted transducer, or an unweighted acceptor by considering the pair (input label, output label), the pair (input label, weight), or the triple (input label, output label, weight) as a single label. Applying this operation mutates the EncodeMapper argument, which can then be used to decode.

Parameters:encoder – An EncodeMapper object used to encode the FST.
Returns:self.

See also: decode.

final(state)

Returns the final weight of a state.

Parameters:state – The integer index of a state.
Returns:The final Weight of that state.
Raises:IndexError – State index out of range.
from_bytes(s)

Returns the FST represented by the bytes object.

Parameters:s (bytes) – The bytes object representing the FST.
Returns:An FST object.
input_symbols()

Returns the input symbol table.

Returns:The input symbol table.

See also: output_symbols.

invert()

Inverts the FST’s transduction.

This operation destructively inverts the FST’s transduction by exchanging input and output labels.

Returns:self.
minimize(delta=0.0009765625, allow_nondet=False)

Minimizes the FST.

This operation destructively performs the minimization of deterministic weighted automata and transducers. If the input FST A is an acceptor, this operation produces the minimal acceptor B equivalent to A, i.e. the acceptor with a minimal number of states that is equivalent to A. If the input FST A is a transducer, this operation internally builds an equivalent transducer with a minimal number of states. However, this minimality is obtained by allowing transitions to have strings of symbols as output labels; this is known in the literature as a real-time transducer. Such transducers are not directly supported by the library. This function will convert such transducers by expanding each string-labeled transition into a sequence of transitions. This will result in the creation of new states, hence losing the minimality property.

Parameters:
  • delta – Comparison/quantization delta (default: 0.0009765625).
  • allow_nondet – Attempt minimization of non-deterministic FST?
Returns:

self.

mutable_arcs(state)

Returns a mutable iterator over arcs leaving the specified state.

Parameters:state – The source state index.
Returns:A MutableArcIterator.

See also: arcs, states.

num_arcs(state=None)

Returns the number of arcs, counting them if necessary.

If state is None, returns the number of arcs in the FST. Otherwise, returns the number of arcs leaving that state.

Parameters:state – The integer index of a state. Defaults to None.
Returns:The number of arcs leaving a state or the number of arcs in the FST.

Note: This method counts the number of arcs in the FST by iterating over the states and summing up the number of arcs leaving each state.

Raises:IndexError – State index out of range.

See also: num_states.

num_input_epsilons(state)

Returns the number of arcs with epsilon input labels leaving a state.

Parameters:state – The integer index of a state.
Returns:The number of epsilon-input-labeled arcs leaving that state.
Raises:IndexError – State index out of range.

See also: num_output_epsilons.

num_output_epsilons(state)

Returns the number of arcs with epsilon output labels leaving a state.

Parameters:state – The integer index of a state.
Returns:The number of epsilon-output-labeled arcs leaving that state.
Raises:IndexError – State index out of range.

See also: num_input_epsilons.

num_states()

Returns the number of states, counting them if necessary.

Returns:The number of states.

See also: num_arcs.

output_symbols()

Returns the output symbol table.

Returns:The output symbol table.

See also: input_symbols.

project(project_output=False)

Converts the FST to an acceptor using input or output labels.

This operation destructively projects an FST onto its domain or range by either copying each arc’s input label to its output label (the default) or vice versa.

Parameters:project_output – Project onto output labels?
Returns:self.

See also: decode, encode, relabel, relabel_tables.
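
The effect of projection on individual arcs can be sketched in a few lines of plain Python. The function below works on (ilabel, olabel, weight, nextstate) tuples and returns a new list; it is an illustration only, whereas the method itself modifies the FST in place.

```python
def project(arcs, project_output=False):
    """Copy one label of each arc onto the other, yielding acceptor arcs."""
    projected = []
    for ilabel, olabel, weight, nextstate in arcs:
        # Keep the input label by default, or the output label on request.
        label = olabel if project_output else ilabel
        projected.append((label, label, weight, nextstate))
    return projected


arcs = [(1, 10, 0.5, 1), (2, 20, 1.0, 2)]
onto_input = project(arcs)
onto_output = project(arcs, project_output=True)
```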

properties(mask, test)

Provides property bits.

This method provides user access to the properties attributes for the FST. The resulting value is a long integer, but when it is cast to a boolean, it represents whether or not the FST has the mask property.

Parameters:
  • mask – The property mask to be compared to the FST’s properties.
  • test – Should any unknown values be computed before comparing against the mask?
Returns:

A 64-bit bitmask representing the requested properties.

prune(weight=None, nstate=-1, delta=0.0009765625)

Removes paths with weights below a certain threshold.

This operation deletes states and arcs in the input FST that do not belong to a successful path whose weight is no more (w.r.t. the natural semiring order) than the threshold ⊗ the weight of the shortest path in the input FST. Weights must be commutative and have the path property.

Parameters:
  • weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired weight threshold below which paths are pruned; if None, no paths are pruned.
  • nstate – State number threshold (default: -1).
  • delta – Comparison/quantization delta (default: 0.0009765625).
Returns:

self.

See also: The constructive variant.
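
The pruning criterion is easiest to see in the tropical semiring, where ⊗ is ordinary addition and a smaller weight is better. The sketch below applies the rule to an explicit table of path weights; this is a simplification for illustration, since the actual operation works on states and arcs rather than enumerated paths, and the names used are hypothetical.

```python
def prune_paths(path_weights, threshold):
    """Keep paths whose weight is within `threshold` of the best path.

    `path_weights` maps a path description to its tropical weight.
    """
    best = min(path_weights.values())
    # Threshold ⊗ shortest-path weight; ⊗ is addition in the tropical semiring.
    limit = best + threshold
    return {p: w for p, w in path_weights.items() if w <= limit}


paths = {"a b": 1.0, "a c": 2.5, "d": 4.0}
kept = prune_paths(paths, 2.0)
```

Here the best path weighs 1.0, so the limit is 3.0 and only the path weighing 4.0 is dropped.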

push(to_final=False, delta=0.0009765625, remove_total_weight=False)

Pushes weights towards the initial or final states.

This operation destructively produces an equivalent transducer by pushing the weights towards the initial state or toward the final states. When pushing weights towards the initial state, the sum of the weight of the outgoing transitions and final weight at any non-initial state is equal to one in the resulting machine. When pushing weights towards the final states, the sum of the weight of the incoming transitions at any state is equal to one. Weights need to be left distributive when pushing towards the initial state and right distributive when pushing towards the final states.

Parameters:
  • to_final – Push towards final states?
  • delta – Comparison/quantization delta (default: 0.0009765625).
  • remove_total_weight – If pushing weights, should the total weight be removed?
Returns:

self.

See also: The constructive variant, which also supports label pushing.

read(filename)

Reads an FST from a file.

Parameters:filename (str) – The location of the input file.
Returns:An FST object.
Raises:RuntimeError – Read failed.
read_from_stream(strm, ropts)

Reads an FST from an input stream.

Parameters:
Returns:

An FST object.

Raises:

RuntimeError – Read failed.

relabel(ipairs=None, opairs=None)

Replaces input and/or output labels using pairs of labels.

This operation destructively relabels the input and/or output labels of the FST using pairs of the form (old_ID, new_ID); omitted indices are identity-mapped.

Parameters:
  • ipairs – An iterable containing (old index, new index) integer pairs.
  • opairs – An iterable containing (old index, new index) integer pairs.
Returns:

self.

Raises:

ValueError – No relabeling pairs specified.

See also: decode, encode, project, relabel_tables.
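
A minimal pure-Python sketch of the pair-based relabeling, including the identity mapping for omitted labels, might look as follows. It operates on (ilabel, olabel, weight, nextstate) tuples and returns a new list, which is an illustrative simplification of the in-place method.

```python
def relabel(arcs, ipairs=None, opairs=None):
    """Relabel arcs using (old, new) label pairs."""
    if not ipairs and not opairs:
        raise ValueError("No relabeling pairs specified.")
    imap = dict(ipairs or [])
    omap = dict(opairs or [])
    # Labels missing from a map are left unchanged (identity mapping).
    return [(imap.get(i, i), omap.get(o, o), w, n) for i, o, w, n in arcs]


arcs = [(1, 10, 0.5, 1), (2, 20, 1.0, 2)]
relabeled = relabel(arcs, ipairs=[(1, 100)], opairs=[(20, 200)])
```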

relabel_tables(old_isymbols=None, new_isymbols=None, unknown_isymbol='', attach_new_isymbols=True, old_osymbols=None, new_osymbols=None, unknown_osymbol='', attach_new_osymbols=True)

Replaces input and/or output labels using SymbolTables.

This operation destructively relabels the input and/or output labels of the FST using user-specified symbol tables; omitted symbols are identity-mapped.

Parameters:
  • old_isymbols – The old SymbolTable for input labels, defaulting to the FST’s input symbol table.
  • new_isymbols – A SymbolTable used to relabel the input labels.
  • unknown_isymbol – Input symbol to use to relabel OOVs (if empty, OOVs raise an exception).
  • attach_new_isymbols – Should new_isymbols be made the FST’s input symbol table?
  • old_osymbols – The old SymbolTable for output labels, defaulting to the FST’s output symbol table.
  • new_osymbols – A SymbolTable used to relabel the output labels.
  • unknown_osymbol – Output symbol to use to relabel OOVs (if empty, OOVs raise an exception).
  • attach_new_osymbols – Should new_osymbols be made the FST’s output symbol table?
Returns:

self.

Raises:

ValueError – No SymbolTable specified.

See also: decode, encode, project, relabel.

reserve_arcs(state, n)

Reserves n arcs at a particular state (best effort).

Parameters:
  • state – The integer index of a state.
  • n – The number of arcs to reserve.
Returns:

self.

Raises:

IndexError – State index out of range.

See also: reserve_states.

reserve_states(n)

Reserves n states (best effort).

Parameters:n – The number of states to reserve.
Returns:self.

See also: reserve_arcs.

reweight(potentials, to_final=False)

Reweights an FST using an iterable of potentials.

This operation destructively reweights an FST according to the potentials and in the direction specified by the user. An arc of weight w, with an origin state of potential p and destination state of potential q, is reweighted by p^{-1} ⊗ (w ⊗ q) when reweighting towards the initial state, and by (p ⊗ w) ⊗ q^{-1} when reweighting towards the final states. The weights must be left distributive when reweighting towards the initial state and right distributive when reweighting towards the final states (e.g., TropicalWeight and LogWeight).

Parameters:
  • potentials – An iterable of TropicalWeights.
  • to_final – Push towards final states?
Returns:

self.
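
The arc formulas above become simple arithmetic in the tropical semiring, where ⊗ is addition and the inverse is negation. The following plain-Python sketch applies them to (src, dst, weight) triples. It deliberately ignores final weights and any adjustment at the start state, and reweight_arcs is a hypothetical helper rather than part of PyKaldi.

```python
def reweight_arcs(arcs, potentials, to_final=False):
    """Reweight tropical arcs given one potential per state."""
    out = []
    for src, dst, w in arcs:
        p, q = potentials[src], potentials[dst]
        if to_final:
            new_w = (p + w) - q   # (p ⊗ w) ⊗ q^{-1}
        else:
            new_w = (w + q) - p   # p^{-1} ⊗ (w ⊗ q)
        out.append((src, dst, new_w))
    return out


arcs = [(0, 1, 1.0), (1, 2, 2.0)]
potentials = [0.0, 0.5, 2.0]
towards_initial = reweight_arcs(arcs, potentials)
towards_final = reweight_arcs(arcs, potentials, to_final=True)
```

Along a path the intermediate potentials cancel, so only the potentials of the first and last states change the total path weight.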

rmepsilon(connect=True, weight=None, nstate=-1, delta=0.0009765625)

Removes epsilon transitions.

This operation destructively removes epsilon transitions (i.e., those where both input and output labels are epsilon) from an FST.

Parameters:
  • connect – Should output be trimmed?
  • weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired weight threshold below which paths are pruned; if None, no paths are pruned.
  • nstate – State number threshold (default: -1).
  • delta – Comparison/quantization delta (default: 0.0009765625).
Returns:

self.

See also: The constructive variant, which also supports epsilon removal in reverse (and which may be more efficient).

set_final(state, weight=None)

Sets the final weight for a state.

Parameters:
  • state – The integer index of a state.
  • weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired final weight; if omitted, it is set to semiring One.
Raises:

IndexError – State index out of range.

See also: set_start.

set_input_symbols(syms)

Sets the input symbol table.

Passing None as a value will delete the input symbol table.

Parameters:syms – A SymbolTable.
Returns:self.

See also: set_output_symbols.

set_output_symbols(syms)

Sets the output symbol table.

Passing None as a value will delete the output symbol table.

Parameters:syms – A SymbolTable.
Returns:self.

See also: set_input_symbols.

set_properties(props, mask)

Sets the properties bits.

Parameters:
  • props (int) – The properties to be set.
  • mask (int) – A mask to be applied to the props argument before setting the FST’s properties.
Returns:

self.
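The interaction between props and mask can be sketched with plain integers. This assumes that only the bits selected by mask are overwritten and all other bits are left as they were; the helper name and the bit values are hypothetical, and any special treatment of error bits in the underlying library is ignored.

```python
def apply_properties(current, props, mask):
    """Returns the property bits after setting props under mask (sketch)."""
    # Clear the masked bits, then copy in the masked part of props.
    return (current & ~mask) | (props & mask)


# Hypothetical bit patterns, not real FST property constants.
current = 0b1010
updated = apply_properties(current, props=0b0101, mask=0b0011)
```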

set_start(state)

Sets the initial state.

Parameters:state – The integer index of a state.
Returns:self.
Raises:IndexError – State index out of range.

See also: set_final.

start()

Returns the start state.

Returns:The start state if start state is set, -1 otherwise.
states()

Returns an iterator over all states in the FST.

Returns:A StateIterator object for the FST.

See also: arcs, mutable_arcs.

text(isymbols=None, osymbols=None, ssymbols=None, acceptor=False, show_weight_one=False, missing_symbol='')

Produces a human-readable string representation of the FST.

This method generates a human-readable string representation of the FST. The caller may optionally specify SymbolTables used to label input labels, output labels, or state labels, respectively.

Parameters:
  • isymbols – An optional symbol table used to label input symbols.
  • osymbols – An optional symbol table used to label output symbols.
  • ssymbols – An optional symbol table used to label states.
  • acceptor (bool) – Should the FST be rendered in acceptor format if possible? Defaults False.
  • show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults False.
  • missing_symbol – The string to be printed when symbol table lookup fails.
Returns:

A formatted string representing the FST.

to_bytes()

Returns a bytes object representing the FST.

Returns:A bytes object.
topsort()

Sorts transitions by state IDs.

This operation destructively topologically sorts the FST, if it is acyclic; otherwise it remains unchanged. Once sorted, all transitions are from lower state IDs to higher state IDs.

Returns:self.

See also: arcsort.
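The behavior described for topsort can be illustrated with a small pure-Python sketch based on Kahn's algorithm. The edge-list layout and both helper names are assumptions made for this example; this is not how PyKaldi or OpenFst implements the operation.

```python
from collections import deque


def topological_order(num_states, arcs):
    """Returns states in topological order, or None if the graph is cyclic.

    arcs: list of (source, destination) pairs (hypothetical layout).
    """
    indegree = [0] * num_states
    successors = [[] for _ in range(num_states)]
    for src, dst in arcs:
        successors[src].append(dst)
        indegree[dst] += 1
    queue = deque(s for s in range(num_states) if indegree[s] == 0)
    order = []
    while queue:
        s = queue.popleft()
        order.append(s)
        for t in successors[s]:
            indegree[t] -= 1
            if indegree[t] == 0:
                queue.append(t)
    return order if len(order) == num_states else None


def renumber_topologically(num_states, arcs):
    """Renumbers states so every arc goes from a lower to a higher ID."""
    order = topological_order(num_states, arcs)
    if order is None:
        return arcs  # cyclic input is returned unchanged
    new_id = {old: new for new, old in enumerate(order)}
    return [(new_id[src], new_id[dst]) for src, dst in arcs]


sorted_arcs = renumber_topologically(3, [(2, 0), (0, 1)])
cyclic_arcs = renumber_topologically(2, [(0, 1), (1, 0)])
```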

type()

Returns the FST type.

Returns:The FST type.
union(ifst)

Computes the union (sum) of two FSTs.

This operation computes the union (sum) of two FSTs. If A transduces string x to y with weight a and B transduces string w to v with weight b, then their union transduces x to y with weight a and w to v with weight b.

Parameters:ifst – The second input FST.
Returns:self.
verify()

Verifies that an FST’s contents are sane.

Returns:True if the contents are sane, False otherwise.
write(filename)

Serializes FST to a file.

This method writes the FST to a file in a binary format.

Parameters:filename (str) – The location of the output file.
Raises:IOError – Write failed.
write_to_stream(strm, wopts)

Serializes FST to an output stream.

Parameters:
  • strm – The output stream to write to.
  • wopts – The FST write options.
Returns:

True if write was successful, False otherwise.

Raises:

RuntimeError – Write failed.

class kaldi.fstext.LatticeVectorFstArcIterator(fst, state)[source]

Arc iterator for a vector FST over the lattice semiring.

This class is used for iterating over the arcs leaving some state. In addition to the full C++ API, it also supports the iterator protocol. Rather than constructing this iterator directly, most users should call the arcs method of an FST object and take advantage of the Pythonic API.

Creates a new arc iterator.

Parameters:
  • fst – The fst.
  • state – The state index.
Raises:

IndexError – State index out of range.
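The relationship between the C++-style calls (done, next, value, reset) and the iterator protocol can be sketched with a small stand-in class. `ArcIteratorSketch` is hypothetical and operates on a plain list; it only mirrors the method names documented here and makes no claim about how the real iterator is implemented.

```python
class ArcIteratorSketch:
    """Hypothetical stand-in for an arc iterator; not the PyKaldi class."""

    def __init__(self, arcs):
        self._arcs = list(arcs)
        self._pos = 0

    # C++-style API.
    def done(self):
        return self._pos >= len(self._arcs)

    def next(self):
        self._pos += 1

    def value(self):
        return self._arcs[self._pos]

    def reset(self):
        self._pos = 0

    def position(self):
        return self._pos

    def seek(self, a):
        self._pos = a

    # Iterator protocol layered on top of the C++-style calls.
    def __iter__(self):
        while not self.done():
            yield self.value()
            self.next()


it = ArcIteratorSketch(["a", "b", "c"])

# C++-style traversal.
cpp_style = []
while not it.done():
    cpp_style.append(it.value())
    it.next()

# Pythonic traversal after rewinding.
it.reset()
pythonic = list(it)
```

Both loops visit the same arcs in the same order; the second form is the one the text recommends for everyday use.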

done()

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:True if the iterator is exhausted, False otherwise.
flags()

Returns the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:The current iterator behavioral flags as an integer.
next()

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

position()

Returns the position of the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:The iterator’s position, expressed as an integer.
reset()

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

seek(a)

Advance the iterator to a new position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:a (int) – The position to seek to.
set_flags(flags, mask)

Sets the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:
  • flags (int) – The properties to be set.
  • mask (int) – A mask to be applied to the flags argument before setting them.
value()

Returns the current arc.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

class kaldi.fstext.LatticeVectorFstMutableArcIterator(fst, state)[source]

Mutable arc iterator for a vector FST over the lattice semiring.

This class is used for iterating over the arcs leaving some state and optionally replacing them with new ones. In addition to the full C++ API, it also supports the iterator protocol. Calling the __iter__ method of a mutable arc iterator object returns an iterator over (arc, setter) pairs. The setter is a bound method of the mutable arc iterator object that can be used to replace the current arc with a new one. Rather than constructing this iterator directly, most users should call the mutable_arcs method of a vector FST object and take advantage of the Pythonic API, e.g.

for arc, setter in lattice.mutable_arcs(0):
    setter(LatticeArc(arc.ilabel, 0, arc.weight, arc.nextstate))

Creates a new arc iterator.

Parameters:
  • fst – The fst.
  • state – The state index.
Raises:

IndexError – State index out of range.
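The (arc, setter) pairing can be sketched with a stand-in class over a plain list of tuples. `MutableArcIteratorSketch` and the tuple layout (ilabel, olabel, weight, nextstate) are assumptions made for illustration; the point is only that the setter must be called while its arc is still the current one.

```python
class MutableArcIteratorSketch:
    """Hypothetical stand-in for a mutable arc iterator."""

    def __init__(self, arcs):
        self._arcs = arcs  # the list is modified in place
        self._pos = 0

    def done(self):
        return self._pos >= len(self._arcs)

    def next(self):
        self._pos += 1

    def value(self):
        return self._arcs[self._pos]

    def set_value(self, arc):
        # Replaces the arc at the current position.
        self._arcs[self._pos] = arc

    def __iter__(self):
        while not self.done():
            # The bound method acts as the setter for the current arc.
            yield self.value(), self.set_value
            self.next()


# Arcs as (ilabel, olabel, weight, nextstate) tuples.
state_arcs = [(1, 7, 0.5, 1), (2, 8, 1.5, 2)]
for arc, setter in MutableArcIteratorSketch(state_arcs):
    ilabel, olabel, weight, nextstate = arc
    setter((ilabel, 0, weight, nextstate))  # set the output label to epsilon
```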

done()

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:True if the iterator is exhausted, False otherwise.
flags()

Returns the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:The current iterator behavioral flags as an integer.
next()

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

position()

Returns the position of the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:The iterator’s position, expressed as an integer.
reset()

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

seek(a)

Advance the iterator to a new position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:a (int) – The position to seek to.
set_flags(flags, mask)

Sets the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:
  • flags (int) – The properties to be set.
  • mask (int) – A mask to be applied to the flags argument before setting them.
set_value(arc)

Replace the current arc with a new arc.

Parameters:arc – The arc to replace the current arc with.
value()

Returns the current arc.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

class kaldi.fstext.LatticeVectorFstStateIterator(fst)[source]

State iterator for a vector FST over the lattice semiring.

This class is used for iterating over the states. In addition to the full C++ API, it also supports the iterator protocol. Rather than constructing this iterator directly, most users should call the states method of an FST object and take advantage of the Pythonic API.

Creates a new state iterator.

Parameters:fst – The fst.
done()

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:True if the iterator is exhausted, False otherwise.
next()

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

reset()

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

value()

Returns the current state index.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

class kaldi.fstext.LatticeWeight[source]

Lattice weight factory.

This class is used for creating new LatticeWeight instances.

LatticeWeight():
Creates an uninitialized LatticeWeight instance.
LatticeWeight(weight):
Creates a new LatticeWeight instance initialized with the weight.
Parameters:
LatticeWeight(weight1, weight2):
Creates a new LatticeWeight instance initialized with the weights.
Parameters:
  • weight1 (float) – The first weight value.
  • weight2 (float) – The second weight value.
from_other(other:LatticeWeight) → LatticeWeight

Create a new lattice weight from another.

from_pair(a:float, b:float) → LatticeWeight

Create a new lattice weight from a pair of floats.

hash() → int

Returns the hash for the weight.

member() → bool

Checks if weight is a member of the lattice semiring.

no_weight() → LatticeWeight

No weight in lattice semiring.

one() → LatticeWeight

One in lattice semiring, i.e. (0.0, 0.0).

properties() → int

Returns weight properties.

quantize(delta:float=default) → LatticeWeight

Quantizes the weight.

reverse() → LatticeWeight

Reverses the weight.

type() → str

Returns weight type.

value1

Float value of the first weight.

value2

Float value of the second weight.

zero() → LatticeWeight

Zero in lattice semiring, i.e. (+infinity, +infinity).
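The roles of One, (0.0, 0.0), and Zero, (+infinity, +infinity), can be checked with a small pure-Python model of a two-component weight. `LatticeWeightSketch`, `times`, and `plus` are hypothetical names. The sketch assumes that the product adds the components pairwise and that the sum keeps the operand with the smaller total cost, with ties broken by the smaller first component, which is how the Kaldi lattice semiring is usually described.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class LatticeWeightSketch:
    """Hypothetical two-component weight; not the PyKaldi class."""
    value1: float
    value2: float

    @staticmethod
    def one():
        return LatticeWeightSketch(0.0, 0.0)

    @staticmethod
    def zero():
        return LatticeWeightSketch(math.inf, math.inf)


def times(a, b):
    # Semiring product: add the components pairwise.
    return LatticeWeightSketch(a.value1 + b.value1, a.value2 + b.value2)


def plus(a, b):
    # Semiring sum (assumed rule): keep the weight with the smaller total
    # cost, breaking ties in favor of the smaller first component.
    key_a = (a.value1 + a.value2, a.value1)
    key_b = (b.value1 + b.value2, b.value1)
    return a if key_a <= key_b else b


w = LatticeWeightSketch(1.0, 2.0)
product_with_one = times(LatticeWeightSketch.one(), w)
sum_with_zero = plus(LatticeWeightSketch.zero(), w)
```

Under these assumptions One is the identity for the product and Zero is the identity for the sum, which matches the values listed for one() and zero().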

class kaldi.fstext.LogArc[source]

FST arc with log weight.

LogArc():
Creates an uninitialized LogArc instance.
LogArc(ilabel, olabel, weight, nextstate):
Creates a new LogArc instance initialized with given arguments.
Parameters:
  • ilabel (int) – The input label.
  • olabel (int) – The output label.
  • weight (LogWeight) – The arc weight.
  • nextstate (int) – The destination state for the arc.
from_attrs(ilabel:int, olabel:int, weight:LogWeight, nextstate:int) → LogArc

Creates a new arc with the given attributes.

Parameters:
  • ilabel (int) – The input label.
  • olabel (int) – The output label.
  • weight (LogWeight) – The arc weight.
  • nextstate (int) – The destination state for the arc.
ilabel

int – The input label.

nextstate

int – The destination state for the arc.

olabel

int – The output label.

type() → str

Returns arc type.

weight

LogWeight – The arc weight.

class kaldi.fstext.LogConstFst(fst=None)[source]

Constant FST over the log semiring.

Parameters:fst (LogFst) – The input FST over the log semiring. If provided, its contents are used for initializing the new FST. Defaults to None.
arcs(state)

Returns an iterator over arcs leaving the specified state.

Parameters:state – The source state index.
Returns:An ArcIterator.

See also: mutable_arcs, states.

copy()

Makes a copy of the FST.

Returns:A copy of the FST.
draw(filename, isymbols=None, osymbols=None, ssymbols=None, acceptor=False, title='', width=8.5, height=11, portrait=False, vertical=False, ranksep=0.4, nodesep=0.25, fontsize=14, precision=5, float_format='g', show_weight_one=False)

Writes out the FST in Graphviz text format.

This method writes out the FST in the dot graph description language. The graph can be rendered using the dot binary provided by Graphviz.

Parameters:
  • filename (str) – The string location of the output dot/Graphviz file.
  • isymbols – An optional symbol table used to label input symbols.
  • osymbols – An optional symbol table used to label output symbols.
  • ssymbols – An optional symbol table used to label states.
  • acceptor (bool) – Should the figure be rendered in acceptor format if possible? Defaults False.
  • title (str) – An optional string indicating the figure title. Defaults to empty string.
  • width (float) – The figure width, in inches. Defaults to 8.5.
  • height (float) – The figure height, in inches. Defaults to 11.
  • portrait (bool) – Should the figure be rendered in portrait rather than landscape? Defaults False.
  • vertical (bool) – Should the figure be rendered bottom-to-top rather than left-to-right? Defaults False.
  • ranksep (float) – The minimum separation between ranks, in inches. Defaults to 0.4.
  • nodesep (float) – The minimum separation between nodes, in inches. Defaults to 0.25.
  • fontsize (int) – Font size, in points. Defaults to 14.
  • precision (int) – Numeric precision for floats, in number of chars. Defaults to 5.
  • float_format ('e', 'f' or 'g') – One of: ‘e’, ‘f’ or ‘g’. Defaults to ‘g’.
  • show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults False.

For more information about the rendering options, see man dot.

See also: text.

final(state)

Returns the final weight of a state.

Parameters:state – The integer index of a state.
Returns:The final Weight of that state.
Raises:IndexError – State index out of range.
from_bytes(s)

Returns the FST represented by the bytes object.

Parameters:s (bytes) – The bytes object representing the FST.
Returns:An FST object.
input_symbols()

Returns the input symbol table.

Returns:The input symbol table.

See also: output_symbols.

num_arcs(state=None)

Returns the number of arcs, counting them if necessary.

If state is None, returns the number of arcs in the FST. Otherwise, returns the number of arcs leaving that state.

Parameters:state – The integer index of a state. Defaults to None.
Returns:The number of arcs leaving a state or the number of arcs in the FST.

Note: This method counts the number of arcs in the FST by iterating over the states and summing up the number of arcs leaving each state.

Raises:IndexError – State index out of range.

See also: num_states.
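The counting behavior described in the note can be sketched over a list of per-state arc lists. The `count_arcs` helper and the list-of-lists layout are assumptions made for this example and do not reflect the real storage.

```python
def count_arcs(arcs_by_state, state=None):
    """Counts arcs for one state, or for the whole FST if state is None."""
    if state is None:
        # Iterate over the states and sum the arcs leaving each one.
        return sum(len(out) for out in arcs_by_state)
    if not 0 <= state < len(arcs_by_state):
        raise IndexError("State index out of range.")
    return len(arcs_by_state[state])


# Three states: state 0 has two arcs, state 1 none, state 2 one.
arcs_by_state = [["a", "b"], [], ["c"]]
total = count_arcs(arcs_by_state)
from_state_zero = count_arcs(arcs_by_state, 0)
```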

num_input_epsilons(state)

Returns the number of arcs with epsilon input labels leaving a state.

Parameters:state – The integer index of a state.
Returns:The number of epsilon-input-labeled arcs leaving that state.
Raises:IndexError – State index out of range.

See also: num_output_epsilons.

num_output_epsilons(state)

Returns the number of arcs with epsilon output labels leaving a state.

Parameters:state – The integer index of a state.
Returns:The number of epsilon-output-labeled arcs leaving that state.
Raises:IndexError – State index out of range.

See also: num_input_epsilons.

num_states()

Returns the number of states, counting them if necessary.

Returns:The number of states.

See also: num_arcs.

output_symbols()

Returns the output symbol table.

Returns:The output symbol table.

See also: input_symbols.

properties(mask, test)

Provides property bits.

This method provides user access to the properties attributes for the FST. The resulting value is a long integer, but when it is cast to a boolean, it represents whether or not the FST has the mask property.

Parameters:
  • mask – The property mask to be compared to the FST’s properties.
  • test – Should any unknown values be computed before comparing against the mask?
Returns:

A 64-bit bitmask representing the requested properties.

read(filename)

Reads an FST from a file.

Parameters:filename (str) – The location of the input file.
Returns:An FST object.
Raises:RuntimeError – Read failed.
read_from_stream(strm, ropts)

Reads an FST from an input stream.

Parameters:
  • strm – The input stream to read from.
  • ropts – The FST read options.
Returns:

An FST object.

Raises:

RuntimeError – Read failed.

start()

Returns the start state.

Returns:The start state if start state is set, -1 otherwise.
states()

Returns an iterator over all states in the FST.

Returns:A StateIterator object for the FST.

See also: arcs, mutable_arcs.

text(isymbols=None, osymbols=None, ssymbols=None, acceptor=False, show_weight_one=False, missing_symbol='')

Produces a human-readable string representation of the FST.

This method generates a human-readable string representation of the FST. The caller may optionally specify SymbolTables used to label input labels, output labels, or state labels, respectively.

Parameters:
  • isymbols – An optional symbol table used to label input symbols.
  • osymbols – An optional symbol table used to label output symbols.
  • ssymbols – An optional symbol table used to label states.
  • acceptor (bool) – Should the FST be rendered in acceptor format if possible? Defaults False.
  • show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults False.
  • missing_symbol – The string to be printed when symbol table lookup fails.
Returns:

A formatted string representing the FST.

to_bytes()

Returns a bytes object representing the FST.

Returns:A bytes object.
type()

Returns the FST type.

Returns:The FST type.
verify()

Verifies that an FST’s contents are sane.

Returns:True if the contents are sane, False otherwise.
write(filename)

Serializes FST to a file.

This method writes the FST to a file in a binary format.

Parameters:filename (str) – The location of the output file.
Raises:IOError – Write failed.
write_to_stream(strm, wopts)

Serializes FST to an output stream.

Parameters:
  • strm – The output stream to write to.
  • wopts – The FST write options.
Returns:

True if write was successful, False otherwise.

Raises:

RuntimeError – Write failed.

class kaldi.fstext.LogConstFstArcIterator(fst, state)[source]

Arc iterator for a constant FST over the log semiring.

This class is used for iterating over the arcs leaving some state. In addition to the full C++ API, it also supports the iterator protocol. Rather than constructing this iterator directly, most users should call the arcs method of an FST object and take advantage of the Pythonic API.

Creates a new arc iterator.

Parameters:
  • fst – The fst.
  • state – The state index.
Raises:

IndexError – State index out of range.

done()

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:True if the iterator is exhausted, False otherwise.
flags()

Returns the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:The current iterator behavioral flags as an integer.
next()

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

position()

Returns the position of the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:The iterator’s position, expressed as an integer.
reset()

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

seek(a)

Advance the iterator to a new position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:a (int) – The position to seek to.
set_flags(flags, mask)

Sets the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:
  • flags (int) – The properties to be set.
  • mask (int) – A mask to be applied to the flags argument before setting them.
value()

Returns the current arc.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

class kaldi.fstext.LogConstFstStateIterator(fst)[source]

State iterator for a constant FST over the log semiring.

This class is used for iterating over the states. In addition to the full C++ API, it also supports the iterator protocol. Rather than constructing this iterator directly, most users should call the states method of an FST object and take advantage of the Pythonic API.

Creates a new state iterator.

Parameters:fst – The fst.
done()

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:True if the iterator is exhausted, False otherwise.
next()

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

reset()

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

value()

Returns the current state index.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

class kaldi.fstext.LogEncodeMapper(encode_labels=False, encode_weights=False, encode=True)[source]

Arc encoder for an FST over the log semiring.

This class provides an object which can be used to encode or decode FST arcs. This is most useful for converting an FST to an unweighted acceptor, on which some FST operations are more efficient, and then decoding the FST afterwards.

To use an instance of this class to encode or decode a mutable FST, pass it as the first argument to the FST instance methods encode and decode. Alternatively, an instance of this class can be used as a callable to encode/decode arcs.

Parameters:
  • encode_labels (bool) – Should labels be encoded?
  • encode_weights (bool) – Should weights be encoded?
  • encode (bool) – Encode or decode?
flags() → int

Returns encoder flags.

from_other(mapper:LogEncodeMapper) → LogEncodeMapper

Creates a new encoder with the contents of another.

from_other_with_type(mapper:LogEncodeMapper, type:EncodeType) → LogEncodeMapper

Creates a new encoder with the contents of another and given type.

input_symbols() → SymbolTable

Returns input symbol table.

output_symbols() → SymbolTable

Returns output symbol table.

properties(inprops:int) → int

Provides property bits.

This method provides user access to the properties attributes for the encoder. The resulting value is a long integer, but when it is cast to a boolean, it represents whether or not the FST has the mask property.

Parameters:inprops – The property mask to be compared to the encoder’s properties.
Returns:A 64-bit bitmask representing the requested properties.
read(filename:str, type:EncodeType=default) → LogEncodeMapper

Reads encoder from file.

set_input_symbols(syms:SymbolTable)

Sets the input symbol table.

Parameters:syms – A SymbolTable.

See also: set_output_symbols.

set_output_symbols(syms:SymbolTable)

Sets the output symbol table.

Parameters:syms – A SymbolTable.

See also: set_input_symbols.

type() → EncodeType

Returns encoder type.

write(filename:str) → bool

Writes encoder to file.

Returns:True if write was successful, False otherwise.
class kaldi.fstext.LogEncodeTable

Encode table for LogArc.

LogEncodeTable(flags):
Creates a new encode table with the given flags.
class Tuple

LogArc encoding tuple.

ilabel

Input label.

olabel

Output label.

weight

Weight.

decode(key:int) → Tuple

Decodes an encoded arc label back to labels and cost.

encode(arc:LogArc) → int

Encodes the given arc (either labels or weights or both).

flags() → int

Returns encoding flags.

get_label(arc:LogArc) → int

Looks up the encoded label for the given arc.

Returns -1 if arc is not found.

input_symbols() → SymbolTable

Returns input symbols.

output_symbols() → SymbolTable

Returns output symbols.

read(strm:istream, source:str) → LogEncodeTable

Reads encode table from input stream.

set_input_symbols(syms:SymbolTable)

Sets input symbols.

set_output_symbols(syms:SymbolTable)

Sets output symbols.

size() → int

Returns the size of the table.

write(strm:ostream, source:str) → bool

Writes table to output stream.

class kaldi.fstext.LogFstCompiler(isymbols=None, osymbols=None, ssymbols=None, acceptor=False, keep_isymbols=False, keep_osymbols=False, keep_state_numbering=False, allow_negative_labels=False)[source]

Compiler for FSTs over the log semiring.

This class is used to compile FSTs specified using the AT&T FSM library format described here:

http://web.eecs.umich.edu/~radev/NLP-fall2015/resources/fsm_archive/fsm.5.html

This is the same format used by the fstcompile executable.

FstCompiler options (symbol tables, etc.) are set at construction time:

compiler = FstCompiler(isymbols=ascii_syms, osymbols=ascii_syms)

Once constructed, FstCompiler instances behave like a file handle opened for writing:

# /ba+/
print("0 1 50 50", file=compiler)
print("1 2 49 49", file=compiler)
print("2 2 49 49", file=compiler)
print("2", file=compiler)

The compile method returns an actual FST instance:

sheep_machine = compiler.compile()

Compilation flushes the internal buffer, so the compiler instance can be reused to compile new machines with the same symbol tables, etc.
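The file-handle behavior described above can be sketched with a tiny text-format parser. `TextFstCompilerSketch` is hypothetical: it handles only integer labels, returns plain Python data rather than an FST, takes the start state from the first line, and assumes a default weight of 0.0 when none is given.

```python
class TextFstCompilerSketch:
    """Hypothetical stand-in that parses AT&T-style text lines."""

    def __init__(self):
        self._buffer = []

    def write(self, expression):
        # print(..., file=self) calls write for the text and the newline.
        self._buffer.append(expression)

    def compile(self):
        text = "".join(self._buffer)
        self._buffer = []  # flush so the instance can be reused
        start, arcs, finals = None, [], {}
        for line in text.splitlines():
            fields = line.split()
            if not fields:
                continue
            if start is None:
                start = int(fields[0])  # first line names the start state
            if len(fields) in (4, 5):
                src, dst, ilabel, olabel = map(int, fields[:4])
                weight = float(fields[4]) if len(fields) == 5 else 0.0
                arcs.append((src, dst, ilabel, olabel, weight))
            elif len(fields) in (1, 2):
                weight = float(fields[1]) if len(fields) == 2 else 0.0
                finals[int(fields[0])] = weight
            else:
                raise RuntimeError("Compilation failed: " + line)
        return start, arcs, finals


sketch = TextFstCompilerSketch()
print("0 1 50 50", file=sketch)
print("1 2 49 49", file=sketch)
print("2 2 49 49", file=sketch)
print("2", file=sketch)
start, arcs, finals = sketch.compile()
after_flush = sketch.compile()  # the buffer is empty after compilation
```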

Parameters:
  • isymbols – An optional SymbolTable used to label input symbols.
  • osymbols – An optional SymbolTable used to label output symbols.
  • ssymbols – An optional SymbolTable used to label states.
  • acceptor – Should the FST be rendered in acceptor format if possible?
  • keep_isymbols – Should the input symbol table be stored in the FST?
  • keep_osymbols – Should the output symbol table be stored in the FST?
  • keep_state_numbering – Should the state numbering be preserved?
  • allow_negative_labels – Should negative labels be allowed? (Not recommended; may cause conflicts).
compile()

Compiles the FST in the string buffer.

This method compiles the FST and returns the resulting machine.

Returns:The FST described by the string buffer.
Raises:RuntimeError – Compilation failed.
write(expression)

Writes a string into the compiler string buffer.

This method adds a line to the compiler string buffer. It can also be invoked with a print call, like so:

compiler = FstCompiler()
print("0 0 49 49", file=compiler)
print("0", file=compiler)
Parameters:expression – A string expression to add to compiler string buffer.
class kaldi.fstext.LogVectorFst(fst=None)[source]

Vector FST over the log semiring.

Parameters:fst (LogFst) – The input FST over the log semiring. If provided, its contents are used for initializing the new FST. Defaults to None.
add_arc(state, arc)

Adds a new arc to the FST and returns self.

Parameters:
  • state – The integer index of the source state.
  • arc – The arc to add.
Returns:

self.

Raises:

IndexError – State index out of range.

See also: add_state.

add_state()

Adds a new state to the FST and returns the state ID.

Returns:The integer index of the new state.

See also: add_arc, set_start, set_final.

arcs(state)

Returns an iterator over arcs leaving the specified state.

Parameters:state – The source state index.
Returns:An ArcIterator.

See also: mutable_arcs, states.

arcsort(sort_type='ilabel')

Sorts arcs leaving each state of the FST.

This operation destructively sorts arcs leaving each state using either input or output labels.

Parameters:sort_type – Either “ilabel” (sort arcs according to input labels) or “olabel” (sort arcs according to output labels).
Returns:self.
Raises:ValueError – Unknown sort type.

See also: topsort.
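The two sort types and the ValueError case can be sketched over per-state lists of (ilabel, olabel, weight, nextstate) tuples. The `sort_arcs` helper and the tuple layout are assumptions made for illustration; the real operation may break ties between equal labels differently.

```python
def sort_arcs(arcs_by_state, sort_type="ilabel"):
    """Sorts the arcs leaving each state in place and returns the input."""
    if sort_type == "ilabel":
        key = lambda arc: arc[0]
    elif sort_type == "olabel":
        key = lambda arc: arc[1]
    else:
        raise ValueError("Unknown sort type: {!r}".format(sort_type))
    for out in arcs_by_state:
        out.sort(key=key)  # destructive, like the documented method
    return arcs_by_state


arcs_by_state = [[(3, 1, 0.0, 1), (1, 2, 0.0, 1), (2, 3, 0.0, 1)]]
sort_arcs(arcs_by_state, "ilabel")
by_ilabel = [arc[0] for arc in arcs_by_state[0]]
sort_arcs(arcs_by_state, "olabel")
by_olabel = [arc[1] for arc in arcs_by_state[0]]
```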

closure(closure_plus=False)

Computes concatenative closure.

This operation destructively converts the FST to its concatenative closure. If A transduces string x to y with weight a, then the closure transduces x to y with weight a, xx to yy with weight a ⊗ a, xxx to yyy with weight a ⊗ a ⊗ a, and so on. The empty string is also transduced to itself with semiring One if closure_plus is False.

Parameters:closure_plus – If True, do not accept the empty string.
Returns:self.
concat(ifst)

Computes the concatenation (product) of two FSTs.

This operation destructively concatenates the FST with a second FST. If A transduces string x to y with weight a and B transduces string w to v with weight b, then their concatenation transduces string xw to yv with weight a ⊗ b.

Parameters:ifst – The second input FST.
Returns:self.
connect()

Removes unsuccessful paths.

This operation destructively trims the FST, removing states and arcs that are not part of any successful path.

Returns:self.
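Trimming can be sketched as keeping only the states that are both reachable from the start state and able to reach a final state. The `trim` helper and the edge-list layout are assumptions made for this example; unlike the real operation, the sketch does not renumber the surviving states.

```python
def trim(start, arcs, finals):
    """Returns the states and arcs that lie on some successful path.

    arcs: list of (source, destination) pairs; finals: set of states.
    """
    def reachable(sources, edges):
        adjacency = {}
        for a, b in edges:
            adjacency.setdefault(a, []).append(b)
        seen = set(sources)
        stack = list(sources)
        while stack:
            s = stack.pop()
            for t in adjacency.get(s, []):
                if t not in seen:
                    seen.add(t)
                    stack.append(t)
        return seen

    accessible = reachable([start], arcs)
    # Search the reversed graph to find states that can reach a final state.
    coaccessible = reachable(finals, [(dst, src) for src, dst in arcs])
    keep = accessible & coaccessible
    kept_arcs = [(s, d) for s, d in arcs if s in keep and d in keep]
    return keep, kept_arcs


# State 3 is a dead end and state 4 is unreachable from the start state.
kept_states, kept_arcs = trim(0, [(0, 1), (1, 2), (1, 3), (4, 2)], {2})
```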
copy()

Makes a copy of the FST.

Returns:A copy of the FST.
decode(encoder)

Decodes encoded labels and/or weights.

This operation reverses the encoding performed by encode.

Parameters:encoder – An EncodeMapper object used to encode the FST.
Returns:self.

See also: encode.

delete_arcs(state, n=None)

Deletes arcs leaving a particular state.

Parameters:
  • state – The integer index of a state.
  • n – An optional argument indicating how many arcs to be deleted. If this argument is None, all arcs from this state are deleted.
Returns:

self.

Raises:

IndexError – State index out of range.

See also: delete_states.

delete_states(states=None)

Deletes states.

Parameters:states – An optional iterable of integer indices of the states to be deleted. If this argument is omitted, all states are deleted.
Returns:self.
Raises:IndexError – State index out of range.

See also: delete_arcs.

draw(filename, isymbols=None, osymbols=None, ssymbols=None, acceptor=False, title='', width=8.5, height=11, portrait=False, vertical=False, ranksep=0.4, nodesep=0.25, fontsize=14, precision=5, float_format='g', show_weight_one=False)

Writes out the FST in Graphviz text format.

This method writes out the FST in the dot graph description language. The graph can be rendered using the dot binary provided by Graphviz.

Parameters:
  • filename (str) – The string location of the output dot/Graphviz file.
  • isymbols – An optional symbol table used to label input symbols.
  • osymbols – An optional symbol table used to label output symbols.
  • ssymbols – An optional symbol table used to label states.
  • acceptor (bool) – Should the figure be rendered in acceptor format if possible? Defaults False.
  • title (str) – An optional string indicating the figure title. Defaults to empty string.
  • width (float) – The figure width, in inches. Defaults to 8.5.
  • height (float) – The figure height, in inches. Defaults to 11.
  • portrait (bool) – Should the figure be rendered in portrait rather than landscape? Defaults False.
  • vertical (bool) – Should the figure be rendered bottom-to-top rather than left-to-right? Defaults False.
  • ranksep (float) – The minimum separation between ranks, in inches. Defaults to 0.4.
  • nodesep (float) – The minimum separation between nodes, in inches. Defaults to 0.25.
  • fontsize (int) – Font size, in points. Defaults to 14.
  • precision (int) – Numeric precision for floats, in number of chars. Defaults to 5.
  • float_format ('e', 'f' or 'g') – One of: ‘e’, ‘f’ or ‘g’. Defaults to ‘g’.
  • show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults False.

For more information about the rendering options, see man dot.

See also: text.

encode(encoder)

Encodes labels and/or weights.

This operation allows for the representation of a weighted transducer as a weighted acceptor, an unweighted transducer, or an unweighted acceptor by considering the pair (input label, output label), the pair (input label, weight), or the triple (input label, output label, weight) as a single label. Applying this operation mutates the EncodeMapper argument, which can then be used to decode.

Parameters:encoder – An EncodeMapper object used to encode the FST.
Returns:self.

See also: decode.
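
The idea of treating an (input label, output label) pair as a single label can be sketched in plain Python. This is a standalone illustration, not the PyKaldi or OpenFst implementation: the LabelEncoder class and the (ilabel, olabel, weight, nextstate) tuple layout are hypothetical, and only label encoding is shown.

```python
class LabelEncoder:
    """Toy encoder mapping (ilabel, olabel) pairs to single labels.

    Hypothetical helper for illustration; not part of kaldi.fstext.
    """

    def __init__(self):
        self._pair_to_id = {}
        self._id_to_pair = [None]  # slot 0 is left unused so 0 can stay epsilon

    def encode(self, ilabel, olabel):
        pair = (ilabel, olabel)
        if pair not in self._pair_to_id:
            self._pair_to_id[pair] = len(self._id_to_pair)
            self._id_to_pair.append(pair)
        return self._pair_to_id[pair]

    def decode(self, label):
        return self._id_to_pair[label]


# Arcs are (ilabel, olabel, weight, nextstate) tuples.
arcs = [(1, 2, 0.5, 1), (3, 4, 1.5, 2), (1, 2, 0.25, 2)]
encoder = LabelEncoder()

# Encoding turns transducer arcs into acceptor arcs and fills the table.
encoded = [(encoder.encode(i, o),) * 2 + (w, n) for i, o, w, n in arcs]
# Decoding with the same, now populated, encoder restores the original arcs.
decoded = [encoder.decode(i) + (w, n) for i, _, w, n in encoded]
```

As in the method described above, the encoder object is mutated while encoding, which is what makes the later decoding step possible.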

final(state)

Returns the final weight of a state.

Parameters:state – The integer index of a state.
Returns:The final Weight of that state.
Raises:IndexError – State index out of range.
from_bytes(s)

Returns the FST represented by the bytes object.

Parameters:s (bytes) – The bytes object representing the FST.
Returns:An FST object.
input_symbols()

Returns the input symbol table.

Returns:The input symbol table.

See also: output_symbols.

invert()

Inverts the FST’s transduction.

This operation destructively inverts the FST’s transduction by exchanging input and output labels.

Returns:self.
minimize(delta=0.0009765625, allow_nondet=False)

Minimizes the FST.

This operation destructively performs the minimization of deterministic weighted automata and transducers. If the input FST A is an acceptor, this operation produces the minimal acceptor B equivalent to A, i.e. the acceptor with a minimal number of states that is equivalent to A. If the input FST A is a transducer, this operation internally builds an equivalent transducer with a minimal number of states. However, this minimality is obtained by allowing transitions to have strings of symbols as output labels; this is known in the literature as a real-time transducer. Such transducers are not directly supported by the library, so this function converts them by expanding each string-labeled transition into a sequence of transitions. This creates new states and hence loses the minimality property.

Parameters:
  • delta – Comparison/quantization delta (default: 0.0009765625).
  • allow_nondet – Attempt minimization of non-deterministic FST?
Returns:

self.

mutable_arcs(state)

Returns a mutable iterator over arcs leaving the specified state.

Parameters:state – The source state index.
Returns:A MutableArcIterator.

See also: arcs, states.

num_arcs(state=None)

Returns the number of arcs, counting them if necessary.

If state is None, returns the number of arcs in the FST. Otherwise, returns the number of arcs leaving that state.

Parameters:state – The integer index of a state. Defaults to None.
Returns:The number of arcs leaving a state or the number of arcs in the FST.

Note: This method counts the number of arcs in the FST by iterating over the states and summing up the number of arcs leaving each state.

Raises:IndexError – State index out of range.

See also: num_states.

num_input_epsilons(state)

Returns the number of arcs with epsilon input labels leaving a state.

Parameters:state – The integer index of a state.
Returns:The number of epsilon-input-labeled arcs leaving that state.
Raises:IndexError – State index out of range.

See also: num_output_epsilons.

num_output_epsilons(state)

Returns the number of arcs with epsilon output labels leaving a state.

Parameters:state – The integer index of a state.
Returns:The number of epsilon-output-labeled arcs leaving that state.
Raises:IndexError – State index out of range.

See also: num_input_epsilons.

num_states()

Returns the number of states, counting them if necessary.

Returns:The number of states.

See also: num_arcs.

output_symbols()

Returns the output symbol table.

Returns:The output symbol table.

See also: input_symbols.

project(project_output=False)

Converts the FST to an acceptor using input or output labels.

This operation destructively projects an FST onto its domain or range by either copying each arc’s input label to its output label (the default) or vice versa.

Parameters:project_output – Project onto output labels?
Returns:self.

See also: decode, encode, relabel, relabel_tables.
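
The effect of projection on individual arcs can be sketched as follows. This is a standalone model that assumes arcs are plain (ilabel, olabel, weight, nextstate) tuples; unlike the method above, the hypothetical project function returns a new list instead of mutating an FST in place.

```python
def project(arcs, project_output=False):
    """Return arcs with one label copied over the other.

    Arcs are (ilabel, olabel, weight, nextstate) tuples; this layout is an
    assumption made for the sketch.
    """
    if project_output:
        # Project onto the range: copy the output label to the input label.
        return [(o, o, w, n) for _, o, w, n in arcs]
    # Project onto the domain: copy the input label to the output label.
    return [(i, i, w, n) for i, _, w, n in arcs]


arcs = [(1, 5, 0.5, 1), (2, 6, 1.0, 2)]
on_input = project(arcs)
on_output = project(arcs, project_output=True)
```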

properties(mask, test)

Provides property bits.

This method provides user access to the properties attributes for the FST. The resulting value is a long integer, but when it is cast to a boolean, it represents whether or not the FST has the mask property.

Parameters:
  • mask – The property mask to be compared to the FST’s properties.
  • test – Should any unknown values be computed before comparing against the mask?
Returns:

A 64-bit bitmask representing the requested properties.
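
The bitmask convention can be illustrated with a small standalone sketch. The constants below are hypothetical bit values chosen for the example (the real ones are defined by OpenFst), and the test argument, which controls whether unknown properties are computed, is left out.

```python
# Hypothetical property bits; the real values are defined by OpenFst.
ACCEPTOR = 0x1
NOT_ACCEPTOR = 0x2
ACYCLIC = 0x4
CYCLIC = 0x8


def properties(known_props, mask):
    """Return the subset of known property bits selected by mask."""
    return known_props & mask


known = ACCEPTOR | ACYCLIC

# Casting the result to bool answers "does the FST have this property?".
is_acceptor = bool(properties(known, ACCEPTOR))
is_cyclic = bool(properties(known, CYCLIC))

# A mask with several bits returns every requested bit that is set.
both = properties(known, ACCEPTOR | ACYCLIC | CYCLIC)
```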

prune(weight=None, nstate=-1, delta=0.0009765625)

Removes paths with weights below a certain threshold.

This operation deletes states and arcs in the input FST that do not belong to a successful path whose weight is no more (w.r.t. the natural semiring order) than the threshold ⊗ the weight of the shortest path in the input FST. Weights must be commutative and have the path property.

Parameters:
  • weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired weight threshold below which paths are pruned; if None, no paths are pruned.
  • nstate – State number threshold (default: -1).
  • delta – Comparison/quantization delta (default: 0.0009765625).
Returns:

self.

See also: The constructive variant.
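
The pruning criterion can be sketched for the tropical semiring, where ⊗ is addition and the natural order is the usual ≤ on floats. This is a standalone model and not the library's algorithm: arcs are hypothetical (src, dst, weight) triples, labels are omitted, and the weights are assumed to contain no negative cycles.

```python
import math


def shortest_distances(num_states, arcs, sources):
    """Bellman-Ford style relaxation in the tropical semiring (min, +).

    arcs is a list of (src, dst, weight); sources maps state -> initial cost.
    """
    dist = [math.inf] * num_states
    for state, cost in sources.items():
        dist[state] = cost
    for _ in range(num_states):
        for src, dst, weight in arcs:
            if dist[src] + weight < dist[dst]:
                dist[dst] = dist[src] + weight
    return dist


def prune(num_states, start, finals, arcs, threshold):
    """Keep arcs lying on a path within `threshold` of the best path."""
    forward = shortest_distances(num_states, arcs, {start: 0.0})
    reversed_arcs = [(dst, src, w) for src, dst, w in arcs]
    backward = shortest_distances(num_states, reversed_arcs, finals)
    limit = backward[start] + threshold  # threshold "times" the best weight
    return [(s, d, w) for s, d, w in arcs
            if forward[s] + w + backward[d] <= limit]


# Two paths from state 0 to final state 2: cost 1.0 + 1.0 and cost 5.0.
arcs = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 5.0)]
kept = prune(3, 0, {2: 0.0}, arcs, threshold=2.0)
kept_all = prune(3, 0, {2: 0.0}, arcs, threshold=3.0)
```

With a threshold of 2.0 the direct arc of cost 5.0 exceeds the limit of 4.0 and is dropped; with a threshold of 3.0 it is kept.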

push(to_final=False, delta=0.0009765625, remove_total_weight=False)

Pushes weights towards the initial or final states.

This operation destructively produces an equivalent transducer by pushing the weights towards the initial state or toward the final states. When pushing weights towards the initial state, the sum of the weight of the outgoing transitions and final weight at any non-initial state is equal to one in the resulting machine. When pushing weights towards the final states, the sum of the weight of the incoming transitions at any state is equal to one. Weights need to be left distributive when pushing towards the initial state and right distributive when pushing towards the final states.

Parameters:
  • to_final – Push towards final states?
  • delta – Comparison/quantization delta (default: 0.0009765625).
  • remove_total_weight – If pushing weights, should the total weight be removed?
Returns:

self.

See also: The constructive variant, which also supports label pushing.
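
Pushing towards the initial state can be sketched for the tropical semiring, where the semiring sum is min and One is 0.0. The code below is a standalone model under several assumptions: arcs are hypothetical (src, dst, weight) triples, every state can reach a final state, and there are no negative cycles. It also drops the potential of the initial state instead of re-applying it, so how it maps onto remove_total_weight is not covered here.

```python
import math


def push_to_initial(num_states, finals, arcs):
    """Push tropical weights towards the initial state.

    arcs is a list of (src, dst, weight); finals maps state -> final weight.
    The potential of a state is its shortest distance to a final state.
    """
    potential = [math.inf] * num_states
    for state, weight in finals.items():
        potential[state] = weight
    for _ in range(num_states):  # Bellman-Ford relaxation
        for src, dst, weight in arcs:
            potential[src] = min(potential[src], weight + potential[dst])
    new_arcs = [(s, d, w + potential[d] - potential[s]) for s, d, w in arcs]
    new_finals = {s: w - potential[s] for s, w in finals.items()}
    return new_arcs, new_finals


arcs = [(0, 1, 1.0), (0, 2, 4.0), (1, 3, 2.0), (2, 3, 1.0)]
new_arcs, new_finals = push_to_initial(4, {3: 0.5}, arcs)

# After pushing, the min over outgoing arc weights and the final weight
# is 0.0 (semiring One) at every state.
best_out = {}
for src, _, weight in new_arcs:
    best_out[src] = min(best_out.get(src, math.inf), weight)
for state, weight in new_finals.items():
    best_out[state] = min(best_out.get(state, math.inf), weight)
```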

read(filename)

Reads an FST from a file.

Parameters:filename (str) – The location of the input file.
Returns:An FST object.
Raises:RuntimeError – Read failed.
read_from_stream(strm, ropts)

Reads an FST from an input stream.

Parameters:
Returns:

An FST object.

Raises:

RuntimeError – Read failed.

relabel(ipairs=None, opairs=None)

Replaces input and/or output labels using pairs of labels.

This operation destructively relabels the input and/or output labels of the FST using pairs of the form (old_ID, new_ID); omitted indices are identity-mapped.

Parameters:
  • ipairs – An iterable containing (old index, new index) integer pairs.
  • opairs – An iterable containing (old index, new index) integer pairs.
Returns:

self.

Raises:

ValueError – No relabeling pairs specified.

See also: decode, encode, project, relabel_tables.
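
The identity-mapping behaviour for omitted labels can be sketched with dictionaries. This is a standalone model assuming arcs are (ilabel, olabel, weight, nextstate) tuples; the hypothetical relabel function returns a new list rather than mutating an FST.

```python
def relabel(arcs, ipairs=None, opairs=None):
    """Relabel arcs using (old, new) label pairs; other labels are kept."""
    if not ipairs and not opairs:
        raise ValueError("No relabeling pairs specified.")
    imap = dict(ipairs or [])
    omap = dict(opairs or [])
    return [(imap.get(i, i), omap.get(o, o), w, n) for i, o, w, n in arcs]


arcs = [(1, 1, 0.0, 1), (2, 3, 0.0, 2)]
# Only input label 1 is mapped; input label 2 and all output labels stay put.
relabeled = relabel(arcs, ipairs=[(1, 10)])
```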

relabel_tables(old_isymbols=None, new_isymbols=None, unknown_isymbol='', attach_new_isymbols=True, old_osymbols=None, new_osymbols=None, unknown_osymbol='', attach_new_osymbols=True)

Replaces input and/or output labels using SymbolTables.

This operation destructively relabels the input and/or output labels of the FST using user-specified symbol tables; omitted symbols are identity-mapped.

Parameters:
  • old_isymbols – The old SymbolTable for input labels, defaulting to the FST’s input symbol table.
  • new_isymbols – A SymbolTable used to relabel the input labels.
  • unknown_isymbol – Input symbol to use to relabel OOVs (if empty, OOVs raise an exception).
  • attach_new_isymbols – Should new_isymbols be made the FST’s input symbol table?
  • old_osymbols – The old SymbolTable for output labels, defaulting to the FST’s output symbol table.
  • new_osymbols – A SymbolTable used to relabel the output labels.
  • unknown_osymbol – Output symbol to use to relabel OOVs (if empty, OOVs raise an exception).
  • attach_new_osymbols – Should new_osymbols be made the FST’s output symbol table?
Returns:

self.

Raises:

ValueError – No SymbolTable specified.

See also: decode, encode, project, relabel.

reserve_arcs(state, n)

Reserve n arcs at a particular state (best effort).

Parameters:
  • state – The integer index of a state.
  • n – The number of arcs to reserve.
Returns:

self.

Raises:

IndexError – State index out of range.

See also: reserve_states.

reserve_states(n)

Reserve n states (best effort).

Parameters:n – The number of states to reserve.
Returns:self.

See also: reserve_arcs.

reweight(potentials, to_final=False)

Reweights an FST using an iterable of potentials.

This operation destructively reweights an FST according to the potentials and in the direction specified by the user. An arc of weight w, with an origin state of potential p and destination state of potential q, is reweighted by p^{-1} ⊗ (w ⊗ q) when reweighting towards the initial state, and by (p ⊗ w) ⊗ q^{-1} when reweighting towards the final states. The weights must be left distributive when reweighting towards the initial state and right distributive when reweighting towards the final states (e.g., TropicalWeight and LogWeight).

Parameters:
  • potentials – An iterable of TropicalWeights.
  • to_final – Reweight towards the final states?
Returns:

self.
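
As a worked instance of the reweighting formulas, consider the tropical and log semirings, where the semiring product is ordinary addition and the inverse is negation. Under that assumption the new arc weight w' reduces to the following (the symbols V and s_i are notation introduced for this sketch):

```latex
\begin{align*}
w' &= -p + (w + q) && \text{(towards the initial state)} \\
w' &= (p + w) - q && \text{(towards the final states)}
\end{align*}

% Along a path through states s_0, \dots, s_n with arc weights
% w_1, \dots, w_n and potentials V(s_i), reweighting towards the
% initial state telescopes:
\[
\sum_{i=1}^{n} \bigl( w_i + V(s_i) - V(s_{i-1}) \bigr)
  = \sum_{i=1}^{n} w_i + V(s_n) - V(s_0)
\]
```

The intermediate potentials cancel, so the accumulated arc weight of a path changes only by the potentials of its first and last states.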

rmepsilon(connect=True, weight=None, nstate=-1, delta=0.0009765625)

Removes epsilon transitions.

This operation destructively removes epsilon transitions (i.e., those where both input and output labels are epsilon) from an FST.

Parameters:
  • connect – Should output be trimmed?
  • weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired weight threshold below which paths are pruned; if None, no paths are pruned.
  • nstate – State number threshold (default: -1).
  • delta – Comparison/quantization delta (default: 0.0009765625).
Returns:

self.

See also: The constructive variant, which also supports epsilon removal in reverse (and which may be more efficient).
set_final(state, weight=None)

Sets the final weight for a state.

Parameters:
  • state – The integer index of a state.
  • weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired final weight; if omitted, it is set to semiring One.
Raises:

IndexError – State index out of range.

See also: set_start.

set_input_symbols(syms)

Sets the input symbol table.

Passing None as a value will delete the input symbol table.

Parameters:syms – A SymbolTable.
Returns:self.

See also: set_output_symbols.

set_output_symbols(syms)

Sets the output symbol table.

Passing None as a value will delete the output symbol table.

Parameters:syms – A SymbolTable.
Returns:self.

See also: set_input_symbols.

set_properties(props, mask)

Sets the properties bits.

Parameters:
  • props (int) – The properties to be set.
  • mask (int) – A mask to be applied to the props argument before setting the FST’s properties.
Returns:

self.

set_start(state)

Sets the initial state.

Parameters:state – The integer index of a state.
Returns:self.
Raises:IndexError – State index out of range.

See also: set_final.

start()

Returns the start state.

Returns:The start state if start state is set, -1 otherwise.
states()

Returns an iterator over all states in the FST.

Returns:A StateIterator object for the FST.

See also: arcs, mutable_arcs.

text(isymbols=None, osymbols=None, ssymbols=None, acceptor=False, show_weight_one=False, missing_symbol='')

Produces a human-readable string representation of the FST.

This method generates a human-readable string representation of the FST. The caller may optionally specify SymbolTables used to label input labels, output labels, or state labels, respectively.

Parameters:
  • isymbols – An optional symbol table used to label input symbols.
  • osymbols – An optional symbol table used to label output symbols.
  • ssymbols – An optional symbol table used to label states.
  • acceptor (bool) – Should the FST be rendered in acceptor format if possible? Defaults False.
  • show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults False.
  • missing_symbol – The string to be printed when symbol table lookup fails.
Returns:

A formatted string representing the FST.
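
A simplified version of such a textual rendering can be sketched in plain Python. The layout is modeled on the tab-separated OpenFst text format (one arc per line as source state, destination state, input label, output label, weight, and one line per final state), but the ordering of lines and the handling of weights equal to semiring One are simplified. The fst_text function and its argument layout are hypothetical.

```python
def fst_text(arcs, finals, isymbols=None, osymbols=None):
    """Format arcs and final states as tab-separated text lines.

    arcs: list of (src, dst, ilabel, olabel, weight) tuples.
    finals: dict mapping state -> final weight.
    isymbols / osymbols: optional dicts mapping label -> symbol string.
    """
    def label(table, value):
        # Fall back to the integer label when no symbol table is given.
        return table[value] if table else str(value)

    lines = []
    for src, dst, ilabel, olabel, weight in arcs:
        lines.append("\t".join([str(src), str(dst), label(isymbols, ilabel),
                                label(osymbols, olabel), str(weight)]))
    for state, weight in sorted(finals.items()):
        lines.append("\t".join([str(state), str(weight)]))
    return "\n".join(lines) + "\n"


with_symbols = fst_text([(0, 1, 1, 2, 0.5)], {1: 0.0},
                        isymbols={1: "a"}, osymbols={2: "b"})
without_symbols = fst_text([(0, 1, 1, 2, 0.5)], {1: 0.0})
```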

to_bytes()

Returns a bytes object representing the FST.

Returns:A bytes object.
topsort()

Sorts transitions by state IDs.

This operation destructively topologically sorts the FST, if it is acyclic; otherwise it remains unchanged. Once sorted, all transitions are from lower state IDs to higher state IDs.

Returns:self.

See also: arcsort.
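
The renumbering behind a topological sort can be sketched with Kahn's algorithm. This is a standalone illustration rather than the library's implementation; arcs are hypothetical (src, dst) pairs, and the topsort_order function returns None for cyclic input to mirror the "remains unchanged" behaviour described above.

```python
from collections import deque


def topsort_order(num_states, arcs):
    """Return a list mapping old state ID -> new state ID, or None if cyclic."""
    indegree = [0] * num_states
    successors = [[] for _ in range(num_states)]
    for src, dst in arcs:
        successors[src].append(dst)
        indegree[dst] += 1
    queue = deque(s for s in range(num_states) if indegree[s] == 0)
    order = []
    while queue:
        state = queue.popleft()
        order.append(state)
        for nxt in successors[state]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)
    if len(order) != num_states:
        return None  # a cycle exists, so no renumbering is possible
    new_id = [0] * num_states
    for position, state in enumerate(order):
        new_id[state] = position
    return new_id


arcs = [(2, 0), (0, 1), (2, 1)]
new_id = topsort_order(3, arcs)
sorted_arcs = [(new_id[s], new_id[d]) for s, d in arcs]
cyclic = topsort_order(2, [(0, 1), (1, 0)])
```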

type()

Returns the FST type.

Returns:The FST type.
union(ifst)

Computes the union (sum) of two FSTs.

This operation computes the union (sum) of two FSTs. If A transduces string x to y with weight a and B transduces string w to v with weight b, then their union transduces x to y with weight a and w to v with weight b.

Parameters:ifst – The second input FST.
Returns:self.
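
At the level of weighted relations, the union can be sketched as a dictionary merge. This standalone model assumes tropical weights, so a string pair accepted by both operands gets the semiring sum (min) of the two weights; the union function and the dictionary representation are hypothetical.

```python
def union(relation_a, relation_b):
    """Union of two weighted relations in the tropical semiring.

    Each relation maps (input string, output string) -> weight.
    """
    result = dict(relation_a)
    for pair, weight in relation_b.items():
        # Pairs present in both relations are combined with min.
        result[pair] = min(result.get(pair, float("inf")), weight)
    return result


a = {("x", "y"): 1.0}
b = {("w", "v"): 2.0, ("x", "y"): 0.5}
combined = union(a, b)
```
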
verify()

Verifies that an FST’s contents are sane.

Returns:True if the contents are sane, False otherwise.
write(filename)

Serializes FST to a file.

This method writes the FST to a file in a binary format.

Parameters:filename (str) – The location of the output file.
Raises:IOError – Write failed.
write_to_stream(strm, wopts)

Serializes FST to an output stream.

Parameters:
Returns:

True if write was successful, False otherwise.

Raises:

RuntimeError – Write failed.

class kaldi.fstext.LogVectorFstArcIterator(fst, state)[source]

Arc iterator for a vector FST over the log semiring.

This class is used for iterating over the arcs leaving some state. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the arcs method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.

Creates a new arc iterator.

Parameters:
  • fst – The fst.
  • state – The state index.
Raises:

IndexError – State index out of range.

done()

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:True if the iterator is exhausted, False otherwise.
flags()

Returns the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:The current iterator behavioral flags as an integer.
next()

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

position()

Returns the position of the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:The iterator’s position, expressed as an integer.
reset()

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

seek(a)

Advance the iterator to a new position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:a (int) – The position to seek to.
set_flags(flags, mask)

Sets the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:
  • flags (int) – The properties to be set.
  • mask (int) – A mask to be applied to the flags argument before setting them.
value()

Returns the current arc.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
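
The relationship between the C++-style methods and the iterator protocol can be sketched with a toy class. ListIterator is a hypothetical stand-in that wraps a plain list instead of the arcs of an FST state.

```python
class ListIterator:
    """Toy iterator exposing done/next/value/reset plus Python iteration."""

    def __init__(self, items):
        self._items = list(items)
        self._pos = 0

    def done(self):
        return self._pos >= len(self._items)

    def next(self):
        self._pos += 1

    def value(self):
        return self._items[self._pos]

    def reset(self):
        self._pos = 0

    def __iter__(self):
        # The Pythonic API is a thin loop over the C++-style methods.
        while not self.done():
            yield self.value()
            self.next()


it = ListIterator(["arc0", "arc1", "arc2"])

# C++-style traversal.
cpp_style = []
while not it.done():
    cpp_style.append(it.value())
    it.next()

# Pythonic traversal after rewinding.
it.reset()
pythonic = list(it)
```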

class kaldi.fstext.LogVectorFstMutableArcIterator(fst, state)[source]

Mutable arc iterator for a vector FST over the log semiring.

This class is used for iterating over the arcs leaving some state and optionally replacing them with new ones. In addition to the full C++ API, it also supports the iterator protocol. Calling the __iter__ method of a mutable arc iterator object returns an iterator over (arc, setter) pairs. The setter is a bound method of the mutable arc iterator object that can be used to replace the current arc with a new one. Most users should just call the mutable_arcs method of a vector FST object instead of directly constructing this iterator and take advantage of the Pythonic API, e.g.

# Replace the output label of every arc leaving state 0 with epsilon (label 0).
for arc, setter in logfst.mutable_arcs(0):
    setter(LogArc(arc.ilabel, 0, arc.weight, arc.nextstate))

Creates a new arc iterator.

Parameters:
  • fst – The fst.
  • state – The state index.
Raises:

IndexError – State index out of range.

done()

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:True if the iterator is exhausted, False otherwise.
flags()

Returns the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:The current iterator behavioral flags as an integer.
next()

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

position()

Returns the position of the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:The iterator’s position, expressed as an integer.
reset()

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

seek(a)

Advance the iterator to a new position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:a (int) – The position to seek to.
set_flags(flags, mask)

Sets the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:
  • flags (int) – The properties to be set.
  • mask (int) – A mask to be applied to the flags argument before setting them.
set_value(arc)

Replace the current arc with a new arc.

Parameters:arc – The arc to replace the current arc with.
value()

Returns the current arc.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
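
The (arc, setter) pairing can be sketched with a toy class in which the setter is simply the bound set_value method. MutableListIterator is a hypothetical stand-in operating on a plain list of (ilabel, olabel, weight, nextstate) tuples, not on an FST.

```python
class MutableListIterator:
    """Toy mutable iterator yielding (item, setter) pairs."""

    def __init__(self, items):
        self._items = items  # mutated in place by set_value
        self._pos = 0

    def done(self):
        return self._pos >= len(self._items)

    def next(self):
        self._pos += 1

    def value(self):
        return self._items[self._pos]

    def set_value(self, item):
        self._items[self._pos] = item

    def __iter__(self):
        while not self.done():
            # The setter writes to the position the iterator is paused at.
            yield self.value(), self.set_value
            self.next()


arcs = [(1, 7, 0.5, 1), (2, 8, 1.0, 2)]
for arc, setter in MutableListIterator(arcs):
    ilabel, _, weight, nextstate = arc
    setter((ilabel, 0, weight, nextstate))  # set the output label to 0
```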

class kaldi.fstext.LogVectorFstStateIterator(fst)[source]

State iterator for a vector FST over the log semiring.

This class is used for iterating over the states. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the states method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.

Creates a new state iterator.

Parameters:fst – The fst.
done()

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:True if the iterator is exhausted, False otherwise.
next()

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

reset()

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

value()

Returns the current state index.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

class kaldi.fstext.LogWeight[source]

Log weight factory.

This class is used for creating new LogWeight instances.

LogWeight():
Creates an uninitialized LogWeight instance.
LogWeight(weight):
Creates a new LogWeight instance initialized with the weight.
Parameters:weight (float or FloatWeight) – The weight value.
from_float(f:float) → LogWeight

Create a new log weight from a float.

from_other(weight:LogWeight) → LogWeight

Create a new log weight from another.

hash() → int

Returns the hash for the weight.

member() → bool

Checks if weight is a member of log semiring.

no_weight() → LogWeight

No weight in log semiring.

one() → LogWeight

One in log semiring, i.e. 0.0.

properties() → int

Returns weight properties.

quantize(delta:float=default) → LogWeight

Quantizes the weight.

reverse() → LogWeight

Reverses the weight.

type() → str

Returns weight type.

value

Float value of the weight.

zero() → LogWeight

Zero in log semiring, i.e. float +infinity.
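
The arithmetic of the log semiring can be sketched with plain floats, where a weight is a negative log probability. This is a standalone illustration and does not use the LogWeight class; the names log_plus, log_times, LOG_ZERO and LOG_ONE are hypothetical.

```python
import math

LOG_ZERO = math.inf  # additive identity (probability 0)
LOG_ONE = 0.0        # multiplicative identity (probability 1)


def log_plus(a, b):
    """Semiring sum: -log(exp(-a) + exp(-b)), computed stably."""
    if a == math.inf:
        return b
    if b == math.inf:
        return a
    low, high = min(a, b), max(a, b)
    return low - math.log1p(math.exp(low - high))


def log_times(a, b):
    """Semiring product: addition of negative log values."""
    return a + b


quarter = -math.log(0.25)
half = log_plus(quarter, quarter)        # 0.25 + 0.25 in probability space
sixteenth = log_times(quarter, quarter)  # 0.25 * 0.25 in probability space
```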

class kaldi.fstext.StdArc[source]

FST arc with tropical weight.

StdArc():
Creates an uninitialized StdArc instance.
StdArc(ilabel, olabel, weight, nextstate):
Creates a new StdArc instance initialized with given arguments.
Parameters:
  • ilabel (int) – The input label.
  • olabel (int) – The output label.
  • weight (TropicalWeight) – The arc weight.
  • nextstate (int) – The destination state for the arc.
from_attrs(ilabel:int, olabel:int, weight:TropicalWeight, nextstate:int) → StdArc

Creates a new arc with the given attributes.

Parameters:
  • ilabel (int) – The input label.
  • olabel (int) – The output label.
  • weight (TropicalWeight) – The arc weight.
  • nextstate (int) – The destination state for the arc.
ilabel

int – The input label.

nextstate

int – The destination state for the arc.

olabel

int – The output label.

type() → str

Returns arc type.

weight

TropicalWeight – The arc weight.
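
How tropical arc weights combine can be sketched with a namedtuple that mirrors the four arc attributes. This is a standalone model, not the StdArc class: weights are plain floats, the semiring product along a path is addition, and the semiring sum over alternative paths is min.

```python
from collections import namedtuple

# Hypothetical stand-in for an arc with a float tropical weight.
Arc = namedtuple("Arc", ["ilabel", "olabel", "weight", "nextstate"])


def path_weight(path):
    """Tropical product of the arc weights along a path."""
    return sum(arc.weight for arc in path)


path_a = [Arc(1, 1, 0.5, 1), Arc(2, 2, 1.5, 2)]
path_b = [Arc(3, 3, 2.5, 2)]

# Tropical sum over the two alternative paths.
best = min(path_weight(path_a), path_weight(path_b))
```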

class kaldi.fstext.StdConstFst(fst=None)[source]

Constant FST over the tropical semiring.

Parameters:fst (StdFst) – The input FST over the tropical semiring. If provided, its contents are used for initializing the new FST. Defaults to None.
arcs(state)

Returns an iterator over arcs leaving the specified state.

Parameters:state – The source state index.
Returns:An ArcIterator.

See also: mutable_arcs, states.

copy()

Makes a copy of the FST.

Returns:A copy of the FST.
draw(filename, isymbols=None, osymbols=None, ssymbols=None, acceptor=False, title='', width=8.5, height=11, portrait=False, vertical=False, ranksep=0.4, nodesep=0.25, fontsize=14, precision=5, float_format='g', show_weight_one=False)

Writes out the FST in Graphviz text format.

This method writes out the FST in the dot graph description language. The graph can be rendered using the dot binary provided by Graphviz.

Parameters:
  • filename (str) – The string location of the output dot/Graphviz file.
  • isymbols – An optional symbol table used to label input symbols.
  • osymbols – An optional symbol table used to label output symbols.
  • ssymbols – An optional symbol table used to label states.
  • acceptor (bool) – Should the figure be rendered in acceptor format if possible? Defaults to False.
  • title (str) – An optional string indicating the figure title. Defaults to the empty string.
  • width (float) – The figure width, in inches. Defaults to 8.5.
  • height (float) – The figure height, in inches. Defaults to 11.
  • portrait (bool) – Should the figure be rendered in portrait rather than landscape? Defaults to False.
  • vertical (bool) – Should the figure be rendered bottom-to-top rather than left-to-right? Defaults to False.
  • ranksep (float) – The minimum separation between ranks, in inches. Defaults to 0.4.
  • nodesep (float) – The minimum separation between nodes, in inches. Defaults to 0.25.
  • fontsize (int) – Font size, in points. Defaults to 14.
  • precision (int) – Numeric precision for floats, in number of characters. Defaults to 5.
  • float_format ('e', 'f' or 'g') – One of: ‘e’, ‘f’ or ‘g’. Defaults to ‘g’.
  • show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults to False.

For more information about the rendering options, see man dot.

See also: text.

final(state)

Returns the final weight of a state.

Parameters:state – The integer index of a state.
Returns:The final Weight of that state.
Raises:IndexError – State index out of range.
from_bytes(s)

Returns the FST represented by the bytes object.

Parameters:s (bytes) – The bytes object representing the FST.
Returns:An FST object.
input_symbols()

Returns the input symbol table.

Returns:The input symbol table.

See also: output_symbols.

num_arcs(state=None)

Returns the number of arcs, counting them if necessary.

If state is None, returns the number of arcs in the FST. Otherwise, returns the number of arcs leaving that state.

Parameters:state – The integer index of a state. Defaults to None.
Returns:The number of arcs leaving a state or the number of arcs in the FST.

Note: This method counts the number of arcs in the FST by iterating over the states and summing up the number of arcs leaving each state.

Raises:IndexError – State index out of range.

See also: num_states.

num_input_epsilons(state)

Returns the number of arcs with epsilon input labels leaving a state.

Parameters:state – The integer index of a state.
Returns:The number of epsilon-input-labeled arcs leaving that state.
Raises:IndexError – State index out of range.

See also: num_output_epsilons.

num_output_epsilons(state)

Returns the number of arcs with epsilon output labels leaving a state.

Parameters:state – The integer index of a state.
Returns:The number of epsilon-output-labeled arcs leaving that state.
Raises:IndexError – State index out of range.

See also: num_input_epsilons.

num_states()

Returns the number of states, counting them if necessary.

Returns:The number of states.

See also: num_arcs.

output_symbols()

Returns the output symbol table.

Returns:The output symbol table.

See also: input_symbols.

properties(mask, test)

Provides property bits.

This method provides user access to the properties attributes for the FST. The resulting value is a long integer, but when it is cast to a boolean, it represents whether or not the FST has the mask property.

Parameters:
  • mask – The property mask to be compared to the FST’s properties.
  • test – Should any unknown values be computed before comparing against the mask?
Returns:

A 64-bit bitmask representing the requested properties.

read(filename)

Reads an FST from a file.

Parameters:filename (str) – The location of the input file.
Returns:An FST object.
Raises:RuntimeError – Read failed.
read_from_stream(strm, ropts)

Reads an FST from an input stream.

Parameters:
Returns:

An FST object.

Raises:

RuntimeError – Read failed.

start()

Returns the start state.

Returns:The start state if start state is set, -1 otherwise.
states()

Returns an iterator over all states in the FST.

Returns:A StateIterator object for the FST.

See also: arcs, mutable_arcs.

text(isymbols=None, osymbols=None, ssymbols=None, acceptor=False, show_weight_one=False, missing_symbol='')

Produces a human-readable string representation of the FST.

This method generates a human-readable string representation of the FST. The caller may optionally specify SymbolTables used to label input labels, output labels, or state labels, respectively.

Parameters:
  • isymbols – An optional symbol table used to label input symbols.
  • osymbols – An optional symbol table used to label output symbols.
  • ssymbols – An optional symbol table used to label states.
  • acceptor (bool) – Should the FST be rendered in acceptor format if possible? Defaults False.
  • show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults False.
  • missing_symbol – The string to be printed when symbol table lookup fails.
Returns:

A formatted string representing the FST.

to_bytes()

Returns a bytes object representing the FST.

Returns:A bytes object.
type()

Returns the FST type.

Returns:The FST type.
verify()

Verifies that an FST’s contents are sane.

Returns:True if the contents are sane, False otherwise.
write(filename)

Serializes FST to a file.

This method writes the FST to a file in a binary format.

Parameters:filename (str) – The location of the output file.
Raises:IOError – Write failed.
write_to_stream(strm, wopts)

Serializes FST to an output stream.

Parameters:
Returns:

True if write was successful, False otherwise.

Raises:

RuntimeError – Write failed.

class kaldi.fstext.StdConstFstArcIterator(fst, state)[source]

Arc iterator for a constant FST over the tropical semiring.

This class is used for iterating over the arcs leaving some state. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the arcs method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.

Creates a new arc iterator.

Parameters:
  • fst – The fst.
  • state – The state index.
Raises:

IndexError – State index out of range.

done()

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:True if the iterator is exhausted, False otherwise.
flags()

Returns the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:The current iterator behavioral flags as an integer.
next()

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

position()

Returns the position of the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:The iterator’s position, expressed as an integer.
reset()

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

seek(a)

Advance the iterator to a new position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:a (int) – The position to seek to.
set_flags(flags, mask)

Sets the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:
  • flags (int) – The properties to be set.
  • mask (int) – A mask to be applied to the flags argument before setting them.
value()

Returns the current arc.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

class kaldi.fstext.StdConstFstStateIterator(fst)[source]

State iterator for a constant FST over the tropical semiring.

This class is used for iterating over the states. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the states method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.

Creates a new state iterator.

Parameters:fst – The fst.
done()

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:True if the iterator is exhausted, False otherwise.
next()

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

reset()

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

value()

Returns the current state index.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

class kaldi.fstext.StdEncodeMapper(encode_labels=False, encode_weights=False, encode=True)[source]

Arc encoder for an FST over the tropical semiring.

This class provides an object which can be used to encode or decode FST arcs. This is most useful for converting an FST to an unweighted acceptor, on which some FST operations are more efficient, and then decoding the FST afterwards.

To use an instance of this class to encode or decode a mutable FST, pass it as the first argument to the FST instance methods encode and decode. Alternatively, an instance of this class can be used as a callable to encode/decode arcs.

Parameters:
  • encode_labels (bool) – Should labels be encoded?
  • encode_weights (bool) – Should weights be encoded?
  • encode (bool) – Encode or decode?
flags() → int

Returns encoder flags.

from_other(mapper:StdEncodeMapper) → StdEncodeMapper

Creates a new encoder with the contents of another.

from_other_with_type(mapper:StdEncodeMapper, type:EncodeType) → StdEncodeMapper

Creates a new encoder with the contents of another and given type.

input_symbols() → SymbolTable

Returns input symbol table.

output_symbols() → SymbolTable

Returns output symbol table.

properties(inprops:int) → int

Provides property bits.

This method provides user access to the properties attributes for the encoder. The resulting value is a long integer, but when it is cast to a boolean, it represents whether or not the FST has the mask property.

Parameters:inprops – The property mask to be compared to the encoder’s properties.
Returns:A 64-bit bitmask representing the requested properties.
read(filename:str, type:EncodeType=default) → StdEncodeMapper

Reads encoder from file.

set_input_symbols(syms:SymbolTable)

Sets the input symbol table.

Parameters:syms – A SymbolTable.

See also: set_output_symbols.

set_output_symbols(syms:SymbolTable)

Sets the output symbol table.

Parameters:syms – A SymbolTable.

See also: set_input_symbols.

type() → EncodeType

Returns encoder type.

write(filename:str) → bool

Writes encoder to file.

Returns:True if write was successful, False otherwise.
class kaldi.fstext.StdEncodeTable

Encode table for StdArc.

StdEncodeTable(flags):
Creates a new encode table with the given flags.
class Tuple

StdArc encoding tuple.

ilabel

Input label.

olabel

Output label.

weight

Weight.

decode(key:int) → Tuple

Decodes an encoded arc label back to labels and cost.

encode(arc:StdArc) → int

Encodes the given arc (either labels or weights or both).

flags() → int

Returns encoding flags.

get_label(arc:StdArc) → int

Looks up the encoded label for the given arc.

Returns -1 if arc is not found.

input_symbols() → SymbolTable

Returns input symbols.

output_symbols() → SymbolTable

Returns output symbols.

read(strm:istream, source:str) → StdEncodeTable

Reads encode table from input stream.

set_input_symbols(syms:SymbolTable)

Sets input symbols.

set_output_symbols(syms:SymbolTable)

Sets output symbols.

size() → int

Returns the size of the table.

write(strm:ostream, source:str) → bool

Writes table to output stream.

class kaldi.fstext.StdFstCompiler(isymbols=None, osymbols=None, ssymbols=None, acceptor=False, keep_isymbols=False, keep_osymbols=False, keep_state_numbering=False, allow_negative_labels=False)[source]

Compiler for FSTs over the tropical semiring.

This class is used to compile FSTs specified using the AT&T FSM library format described here:

http://web.eecs.umich.edu/~radev/NLP-fall2015/resources/fsm_archive/fsm.5.html

This is the same format used by the fstcompile executable.

FstCompiler options (symbol tables, etc.) are set at construction time:

compiler = FstCompiler(isymbols=ascii_syms, osymbols=ascii_syms)

Once constructed, FstCompiler instances behave like a file handle opened for writing:

# /ba+/
print("0 1 50 50", file=compiler)
print("1 2 49 49", file=compiler)
print("2 2 49 49", file=compiler)
print("2", file=compiler)

The compile method returns an actual FST instance:

sheep_machine = compiler.compile()

Compilation flushes the internal buffer, so the compiler instance can be reused to compile new machines with the same symbol tables, etc.
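
The file-handle behaviour described above can be imitated in plain Python. The sketch below is an illustration only, not the PyKaldi implementation: SketchCompiler is a hypothetical class, it handles only the transducer form of the text format (arc lines with four or five fields, final-state lines with one or two), and it returns plain tuples rather than an FST object.

```python
class SketchCompiler:
    """Hypothetical stand-in: buffers text and parses it on compile()."""

    def __init__(self):
        self._buffer = []

    def write(self, expression):
        # print(..., file=self) calls write() with the text and then "\n".
        self._buffer.append(expression)

    def compile(self):
        text = "".join(self._buffer)
        self._buffer = []  # flush, so the instance can be reused
        arcs, finals = [], {}
        for line in text.splitlines():
            fields = line.split()
            if not fields:
                continue
            if len(fields) in (4, 5):
                # source destination ilabel olabel [weight]
                src, dst, ilabel, olabel = (int(f) for f in fields[:4])
                weight = float(fields[4]) if len(fields) == 5 else 0.0
                arcs.append((src, dst, ilabel, olabel, weight))
            elif len(fields) in (1, 2):
                # final_state [weight]
                weight = float(fields[1]) if len(fields) == 2 else 0.0
                finals[int(fields[0])] = weight
            else:
                raise RuntimeError("Compilation failed: " + line)
        return arcs, finals


compiler = SketchCompiler()
print("0 1 50 50", file=compiler)
print("1 2 49 49", file=compiler)
print("2 2 49 49", file=compiler)
print("2", file=compiler)
arcs, finals = compiler.compile()
after_flush = compiler.compile()  # the buffer is empty after compilation
```

Omitted weights are read as 0.0 here, which is semiring One for the tropical semiring.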

Parameters:
  • isymbols – An optional SymbolTable used to label input symbols.
  • osymbols – An optional SymbolTable used to label output symbols.
  • ssymbols – An optional SymbolTable used to label states.
  • acceptor – Should the FST be rendered in acceptor format if possible?
  • keep_isymbols – Should the input symbol table be stored in the FST?
  • keep_osymbols – Should the output symbol table be stored in the FST?
  • keep_state_numbering – Should the state numbering be preserved?
  • allow_negative_labels – Should negative labels be allowed? (Not recommended; may cause conflicts).
compile()

Compiles the FST in the string buffer.

This method compiles the FST and returns the resulting machine.

Returns:The FST described by the string buffer.
Raises:RuntimeError – Compilation failed.
write(expression)

Writes a string into the compiler string buffer.

This method adds a line to the compiler string buffer. It can also be invoked with a print call, like so:

compiler = FstCompiler()
print("0 0 49 49", file=compiler)
print("0", file=compiler)
Parameters:expression – A string expression to add to compiler string buffer.
class kaldi.fstext.StdVectorFst(fst=None)[source]

Vector FST over the tropical semiring.

Parameters:fst (StdFst) – The input FST over the tropical semiring. If provided, its contents are used for initializing the new FST. Defaults to None.
add_arc(state, arc)

Adds a new arc to the FST and returns self.

Parameters:
  • state – The integer index of the source state.
  • arc – The arc to add.
Returns:

self.

Raises:

IndexError – State index out of range.

See also: add_state.

add_state()

Adds a new state to the FST and returns the state ID.

Returns:The integer index of the new state.

See also: add_arc, set_start, set_final.

arcs(state)

Returns an iterator over arcs leaving the specified state.

Parameters:state – The source state index.
Returns:An ArcIterator.

See also: mutable_arcs, states.

arcsort(sort_type='ilabel')

Sorts arcs leaving each state of the FST.

This operation destructively sorts arcs leaving each state using either input or output labels.

Parameters:sort_type – Either “ilabel” (sort arcs according to input labels) or “olabel” (sort arcs according to output labels).
Returns:self.
Raises:ValueError – Unknown sort type.

See also: topsort.
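
As a rough picture of what sorting by label means, the following sketch sorts plain Python arc lists. It assumes arcs are (ilabel, olabel, weight, nextstate) tuples stored per state in a dict; arcsort_sketch is a hypothetical helper, not part of PyKaldi.

```python
def arcsort_sketch(arcs_by_state, sort_type="ilabel"):
    """Sorts each state's arc list in place by input or output label."""
    if sort_type == "ilabel":
        position = 0
    elif sort_type == "olabel":
        position = 1
    else:
        raise ValueError("Unknown sort type: {!r}".format(sort_type))
    for arcs in arcs_by_state.values():
        arcs.sort(key=lambda arc: arc[position])
    return arcs_by_state


fst_arcs = {0: [(3, 1, 0.0, 1), (1, 2, 0.0, 1)], 1: []}

arcsort_sketch(fst_arcs, "ilabel")
by_ilabel = list(fst_arcs[0])

arcsort_sketch(fst_arcs, "olabel")
by_olabel = list(fst_arcs[0])
```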

closure(closure_plus=False)

Computes concatenative closure.

This operation destructively converts the FST to its concatenative closure. If A transduces string x to y with weight a, then the closure transduces x to y with weight a, xx to yy with weight a otimes a, xxx to yyy with weight a otimes a otimes a, and so on. The empty string is also transduced to itself with semiring One if closure_plus is False.

Parameters:closure_plus – If True, do not accept the empty string.
Returns:self.
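
The weights described above can be worked out by hand for the tropical semiring, where otimes is ordinary addition and semiring One is 0.0. The sketch below is a plain Python illustration for a single pair (x, y) with weight a; closure_weights is a hypothetical helper that lists the first few members of the closure rather than building an FST.

```python
def closure_weights(x, y, a, max_repeats, closure_plus=False):
    """Lists (input, output, weight) triples in the closure of one pair."""
    one = 0.0  # tropical semiring One
    first = 1 if closure_plus else 0  # closure_plus drops the empty string
    triples = []
    for n in range(first, max_repeats + 1):
        weight = one
        for _ in range(n):
            weight = weight + a  # tropical otimes is addition
        triples.append((x * n, y * n, weight))
    return triples


star = closure_weights("x", "y", 0.5, 3)
plus = closure_weights("x", "y", 0.5, 3, closure_plus=True)
```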
concat(ifst)

Computes the concatenation (product) of two FSTs.

This operation destructively concatenates the FST with a second FST. If A transduces string x to y with weight a and B transduces string w to v with weight b, then their concatenation transduces string xw to yv with weight a otimes b.

Parameters:ifst – The second input FST.
Returns:self.
connect()

Removes unsuccessful paths.

This operation destructively trims the FST, removing states and arcs that are not part of any successful path.

Returns:self.
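
Trimming can be pictured as keeping only states that are both reachable from the start state and able to reach a final state. The sketch below shows that idea on a plain edge list; connect_sketch is a hypothetical helper, and unlike the real operation it does not renumber the surviving states.

```python
def connect_sketch(start, finals, arcs):
    """Returns the states and arcs that lie on some successful path.

    arcs is a list of (source, destination) pairs.
    """
    def reachable(sources, edges):
        seen, stack = set(sources), list(sources)
        while stack:
            state = stack.pop()
            for src, dst in edges:
                if src == state and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen

    accessible = reachable([start], arcs)
    reversed_arcs = [(dst, src) for src, dst in arcs]
    coaccessible = reachable(finals, reversed_arcs)
    keep = accessible & coaccessible
    kept_arcs = [(s, d) for s, d in arcs if s in keep and d in keep]
    return keep, kept_arcs


# State 3 is a dead end and state 4 cannot be reached from the start state.
keep, kept_arcs = connect_sketch(0, [2], [(0, 1), (1, 2), (1, 3), (4, 2)])
```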
copy()

Makes a copy of the FST.

Returns:A copy of the FST.
decode(encoder)

Decodes encoded labels and/or weights.

This operation reverses the encoding performed by encode.

Parameters:encoder – An EncodeMapper object used to encode the FST.
Returns:self.

See also: encode.

delete_arcs(state, n=None)

Deletes arcs leaving a particular state.

Parameters:
  • state – The integer index of a state.
  • n – An optional argument indicating how many arcs to be deleted. If this argument is None, all arcs from this state are deleted.
Returns:

self.

Raises:

IndexError – State index out of range.

See also: delete_states.

delete_states(states=None)

Deletes states.

Parameters:states – An optional iterable of integer indices of the states to be deleted. If this argument is omitted, all states are deleted.
Returns:self.
Raises:IndexError – State index out of range.

See also: delete_arcs.

draw(filename, isymbols=None, osymbols=None, ssymbols=None, acceptor=False, title='', width=8.5, height=11, portrait=False, vertical=False, ranksep=0.4, nodesep=0.25, fontsize=14, precision=5, float_format='g', show_weight_one=False)

Writes out the FST in Graphviz text format.

This method writes out the FST in the dot graph description language. The graph can be rendered using the dot binary provided by Graphviz.

Parameters:
  • filename (str) – The string location of the output dot/Graphviz file.
  • isymbols – An optional symbol table used to label input symbols.
  • osymbols – An optional symbol table used to label output symbols.
  • ssymbols – An optional symbol table used to label states.
  • acceptor (bool) – Should the figure be rendered in acceptor format if possible? Defaults to False.
  • title (str) – An optional string indicating the figure title. Defaults to empty string.
  • width (float) – The figure width, in inches. Defaults to 8.5.
  • height (float) – The figure height, in inches. Defaults to 11.
  • portrait (bool) – Should the figure be rendered in portrait rather than landscape? Defaults to False.
  • vertical (bool) – Should the figure be rendered bottom-to-top rather than left-to-right?
  • ranksep (float) – The minimum separation between ranks, in inches. Defaults to 0.4.
  • nodesep (float) – The minimum separation between nodes, in inches. Defaults to 0.25.
  • fontsize (int) – Font size, in points. Defaults to 14.
  • precision (int) – Numeric precision for floats, in number of chars. Defaults to 5.
  • float_format ('e', 'f' or 'g') – One of: ‘e’, ‘f’ or ‘g’. Defaults to ‘g’.
  • show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults to False.

For more information about the rendering options, see man dot.

See also: text.

encode(encoder)

Encodes labels and/or weights.

This operation allows for the representation of a weighted transducer as a weighted acceptor, an unweighted transducer, or an unweighted acceptor by considering the pair (input label, output label), the pair (input label, weight), or the triple (input label, output label, weight) as a single label. Applying this operation mutates the EncodeMapper argument, which can then be used to decode.

Parameters:encoder – An EncodeMapper object used to encode the FST.
Returns:self.

See also: decode.
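
The idea of treating a label/weight tuple as a single label can be sketched with a dictionary. The code below is a plain Python illustration under stated assumptions, not the library's encoder: arcs are (ilabel, olabel, weight, nextstate) tuples, the table is an ordinary dict that is filled in while encoding (mirroring how the encoder argument is mutated), keys start at 1 so that 0 stays free for epsilon, and 0.0 stands for tropical One.

```python
def encode_arcs(arcs, table, encode_labels=True, encode_weights=True):
    """Replaces the encoded parts of each arc with a key from the table."""
    encoded = []
    for ilabel, olabel, weight, nextstate in arcs:
        key_tuple = (
            ilabel,
            olabel if encode_labels else 0,
            weight if encode_weights else 0.0,
        )
        key = table.setdefault(key_tuple, len(table) + 1)
        new_olabel = key if encode_labels else olabel
        new_weight = 0.0 if encode_weights else weight
        encoded.append((key, new_olabel, new_weight, nextstate))
    return encoded


def decode_arcs(arcs, table, encode_labels=True, encode_weights=True):
    """Reverses encode_arcs using the table filled in during encoding."""
    inverse = {key: key_tuple for key_tuple, key in table.items()}
    decoded = []
    for key, olabel, weight, nextstate in arcs:
        ilabel, table_olabel, table_weight = inverse[key]
        decoded.append((
            ilabel,
            table_olabel if encode_labels else olabel,
            table_weight if encode_weights else weight,
            nextstate,
        ))
    return decoded


arcs = [(1, 2, 0.5, 1), (1, 3, 0.5, 1), (1, 2, 0.5, 2)]
table = {}
encoded = encode_arcs(arcs, table)   # an unweighted acceptor
restored = decode_arcs(encoded, table)
```

With both flags set, every encoded arc has equal input and output labels and weight One, which is the unweighted acceptor case mentioned above.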

final(state)

Returns the final weight of a state.

Parameters:state – The integer index of a state.
Returns:The final Weight of that state.
Raises:IndexError – State index out of range.
from_bytes(s)

Returns the FST represented by the bytes object.

Parameters:s (bytes) – The bytes object representing the FST.
Returns:An FST object.
input_symbols()

Returns the input symbol table.

Returns:The input symbol table.

See also: output_symbols.

invert()

Inverts the FST’s transduction.

This operation destructively inverts the FST’s transduction by exchanging input and output labels.

Returns:self.
minimize(delta=0.0009765625, allow_nondet=False)

Minimizes the FST.

This operation destructively performs the minimization of deterministic weighted automata and transducers. If the input FST A is an acceptor, this operation produces the minimal acceptor B equivalent to A, i.e. the acceptor with a minimal number of states that is equivalent to A. If the input FST A is a transducer, this operation internally builds an equivalent transducer with a minimal number of states. However, this minimality is obtained by allowing transitions to have strings of symbols as output labels; this is known in the literature as a real-time transducer. Such transducers are not directly supported by the library, so this function converts them by expanding each string-labeled transition into a sequence of transitions. This creates new states, so the minimality property is lost.

Parameters:
  • delta – Comparison/quantization delta (default: 0.0009765625).
  • allow_nondet – Attempt minimization of non-deterministic FST?
Returns:

self.
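
The default delta is exactly 1/1024. As a rough illustration of what a comparison delta does, the sketch below treats two tropical weights as equal when they differ by at most delta. This assumes the delta acts as an absolute tolerance; approx_equal is a hypothetical helper, not a PyKaldi function.

```python
DELTA = 0.0009765625  # the documented default, equal to 1 / 1024


def approx_equal(w1, w2, delta=DELTA):
    """Treats two weights as equal if they differ by at most delta."""
    return w1 <= w2 + delta and w2 <= w1 + delta


same = approx_equal(1.0, 1.0005)      # difference below delta
different = approx_equal(1.0, 1.002)  # difference above delta
```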

mutable_arcs(state)

Returns a mutable iterator over arcs leaving the specified state.

Parameters:state – The source state index.
Returns:A MutableArcIterator.

See also: arcs, states.

num_arcs(state=None)

Returns the number of arcs, counting them if necessary.

If state is None, returns the number of arcs in the FST. Otherwise, returns the number of arcs leaving that state.

Parameters:state – The integer index of a state. Defaults to None.
Returns:The number of arcs leaving a state or the number of arcs in the FST.

Note: This method counts the number of arcs in the FST by iterating over the states and summing up the number of arcs leaving each state.

Raises:IndexError – State index out of range.

See also: num_states.

num_input_epsilons(state)

Returns the number of arcs with epsilon input labels leaving a state.

Parameters:state – The integer index of a state.
Returns:The number of epsilon-input-labeled arcs leaving that state.
Raises:IndexError – State index out of range.

See also: num_output_epsilons.

num_output_epsilons(state)

Returns the number of arcs with epsilon output labels leaving a state.

Parameters:state – The integer index of a state.
Returns:The number of epsilon-output-labeled arcs leaving that state.
Raises:IndexError – State index out of range.

See also: num_input_epsilons.

num_states()

Returns the number of states, counting them if necessary.

Returns:The number of states.

See also: num_arcs.

output_symbols()

Returns the output symbol table.

Returns:The output symbol table.

See also: input_symbols.

project(project_output=False)

Converts the FST to an acceptor using input or output labels.

This operation destructively projects an FST onto its domain or range by either copying each arc’s input label to its output label (the default) or vice versa.

Parameters:project_output – Project onto output labels?
Returns:self.

See also: decode, encode, relabel, relabel_tables.
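
Projection can be pictured on a plain arc list. The sketch below assumes arcs are (ilabel, olabel, weight, nextstate) tuples; project_sketch is a hypothetical helper and returns a new list instead of mutating an FST.

```python
def project_sketch(arcs, project_output=False):
    """Copies one label of each arc over the other."""
    projected = []
    for ilabel, olabel, weight, nextstate in arcs:
        label = olabel if project_output else ilabel
        projected.append((label, label, weight, nextstate))
    return projected


arcs = [(1, 5, 0.5, 1), (2, 6, 1.0, 2)]
on_input = project_sketch(arcs)
on_output = project_sketch(arcs, project_output=True)
```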

properties(mask, test)

Provides property bits.

This method provides user access to the properties attributes for the FST. The resulting value is a long integer, but when it is cast to a boolean, it represents whether or not the FST has the mask property.

Parameters:
  • mask – The property mask to be compared to the FST’s properties.
  • test – Should any unknown values be computed before comparing against the mask?
Returns:

A 64-bit bitmask representing the requested properties.
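
The relationship between the returned bitmask and its boolean value can be shown with ordinary integers. The bit values below are hypothetical and chosen only for illustration; they are not the property constants defined by the library.

```python
# Hypothetical bit values, for illustration only.
ACCEPTOR = 1 << 0
I_LABEL_SORTED = 1 << 1
ACYCLIC = 1 << 2

stored_properties = ACCEPTOR | ACYCLIC


def properties_sketch(stored, mask):
    """Returns the stored property bits selected by the mask."""
    return stored & mask


is_acceptor = bool(properties_sketch(stored_properties, ACCEPTOR))
is_sorted = bool(properties_sketch(stored_properties, I_LABEL_SORTED))

# With a multi-bit mask, a nonzero result only means that at least one bit
# is set; compare against the mask to require all of them.
wanted = ACCEPTOR | I_LABEL_SORTED
has_any = bool(properties_sketch(stored_properties, wanted))
has_all = properties_sketch(stored_properties, wanted) == wanted
```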

prune(weight=None, nstate=-1, delta=0.0009765625)

Removes paths with weights below a certain threshold.

This operation deletes states and arcs in the input FST that do not belong to a successful path whose weight is no more (w.r.t the natural semiring order) than the threshold otimes the weight of the shortest path in the input FST. Weights must be commutative and have the path property.

Parameters:
  • weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired weight threshold below which paths are pruned; if None, no paths are pruned.
  • nstate – State number threshold (default: -1).
  • delta – Comparison/quantization delta (default: 0.0009765625).
Returns:

self.

See also: The constructive variant.
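
In the tropical semiring, otimes is addition and a smaller weight is better, so the criterion above keeps a path when its weight is at most the best path weight plus the threshold. The sketch below applies that rule to an explicit dict of path weights; prune_paths is a hypothetical helper that ignores the nstate and delta arguments.

```python
def prune_paths(path_weights, threshold):
    """Keeps paths within threshold of the best path (tropical semiring)."""
    best = min(path_weights.values())
    limit = best + threshold  # threshold otimes shortest-path weight
    return {path: w for path, w in path_weights.items() if w <= limit}


paths = {"a": 1.0, "b": 1.5, "c": 3.0}
kept = prune_paths(paths, 1.0)
```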

push(to_final=False, delta=0.0009765625, remove_total_weight=False)

Pushes weights towards the initial or final states.

This operation destructively produces an equivalent transducer by pushing the weights towards the initial state or toward the final states. When pushing weights towards the initial state, the sum of the weight of the outgoing transitions and final weight at any non-initial state is equal to one in the resulting machine. When pushing weights towards the final states, the sum of the weight of the incoming transitions at any state is equal to one. Weights need to be left distributive when pushing towards the initial state and right distributive when pushing towards the final states.

Parameters:
  • to_final – Push towards final states?
  • delta – Comparison/quantization delta (default: 0.0009765625).
  • remove_total_weight – If pushing weights, should the total weight be removed?
Returns:

self.

See also: The constructive variant, which also supports label pushing.

read(filename)

Reads an FST from a file.

Parameters:filename (str) – The location of the input file.
Returns:An FST object.
Raises:RuntimeError – Read failed.
read_from_stream(strm, ropts)

Reads an FST from an input stream.

Parameters:
Returns:

An FST object.

Raises:

RuntimeError – Read failed.

relabel(ipairs=None, opairs=None)

Replaces input and/or output labels using pairs of labels.

This operation destructively relabels the input and/or output labels of the FST using pairs of the form (old_ID, new_ID); omitted indices are identity-mapped.

Parameters:
  • ipairs – An iterable containing (old index, new index) integer pairs.
  • opairs – An iterable containing (old index, new index) integer pairs.
Returns:

self.

Raises:

ValueError – No relabeling pairs specified.

See also: decode, encode, project, relabel_tables.
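
The identity mapping of omitted labels can be seen in a small plain Python sketch. It assumes arcs are (ilabel, olabel, weight, nextstate) tuples; relabel_sketch is a hypothetical helper that returns a new list rather than mutating an FST.

```python
def relabel_sketch(arcs, ipairs=None, opairs=None):
    """Relabels arcs using (old, new) pairs; other labels are unchanged."""
    if not ipairs and not opairs:
        raise ValueError("No relabeling pairs specified.")
    imap = dict(ipairs or [])
    omap = dict(opairs or [])
    return [
        (imap.get(ilabel, ilabel), omap.get(olabel, olabel), weight, nextstate)
        for ilabel, olabel, weight, nextstate in arcs
    ]


arcs = [(1, 5, 0.5, 1), (2, 6, 1.0, 2)]
relabeled = relabel_sketch(arcs, ipairs=[(1, 10)], opairs=[(6, 60)])
```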

relabel_tables(old_isymbols=None, new_isymbols=None, unknown_isymbol='', attach_new_isymbols=True, old_osymbols=None, new_osymbols=None, unknown_osymbol='', attach_new_osymbols=True)

Replaces input and/or output labels using SymbolTables.

This operation destructively relabels the input and/or output labels of the FST using user-specified symbol tables; omitted symbols are identity-mapped.

Parameters:
  • old_isymbols – The old SymbolTable for input labels, defaulting to the FST’s input symbol table.
  • new_isymbols – A SymbolTable used to relabel the input labels.
  • unknown_isymbol – Input symbol to use to relabel OOVs (if empty, OOVs raise an exception).
  • attach_new_isymbols – Should new_isymbols be made the FST’s input symbol table?
  • old_osymbols – The old SymbolTable for output labels, defaulting to the FST’s output symbol table.
  • new_osymbols – A SymbolTable used to relabel the output labels.
  • unknown_osymbol – Output symbol to use to relabel OOVs (if empty, OOVs raise an exception).
  • attach_new_osymbols – Should new_osymbols be made the FST’s output symbol table?
Returns:

self.

Raises:

ValueError – No SymbolTable specified.

See also: decode, encode, project, relabel.

reserve_arcs(state, n)

Reserve n arcs at a particular state (best effort).

Parameters:
  • state – The integer index of a state.
  • n – The number of arcs to reserve.
Returns:

self.

Raises:

IndexError – State index out of range.

See also: reserve_states.

reserve_states(n)

Reserve n states (best effort).

Parameters:n – The number of states to reserve.
Returns:self.

See also: reserve_arcs.

reweight(potentials, to_final=False)

Reweights an FST using an iterable of potentials.

This operation destructively reweights an FST according to the potentials and in the direction specified by the user. An arc of weight w, with an origin state of potential p and destination state of potential q, is reweighted by p^{-1} otimes (w otimes q) when reweighting towards the initial state, and by (p otimes w) otimes q^{-1} when reweighting towards the final states. The weights must be left distributive when reweighting towards the initial state and right distributive when reweighting towards the final states (e.g., TropicalWeight and LogWeight).

Parameters:
  • potentials – An iterable of TropicalWeights.
  • to_final – Push towards final states?
Returns:

self.
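
For tropical weights, otimes is addition and the inverse is negation, so the two formulas above reduce to simple arithmetic. The sketch below covers arc weights only and leaves out the adjustment of final weights; reweight_arc is a hypothetical helper operating on plain floats.

```python
def reweight_arc(w, p, q, to_final=False):
    """Reweights one arc with source potential p and destination potential q."""
    if to_final:
        return (p + w) - q  # (p otimes w) otimes q^{-1}
    return -p + (w + q)     # p^{-1} otimes (w otimes q)


potentials = [0.0, 1.0, 3.0]
arcs = [(0, 1, 2.0), (1, 2, 0.5)]  # (source, destination, weight)

towards_initial = [
    reweight_arc(w, potentials[s], potentials[d]) for s, d, w in arcs
]

# Along a path the intermediate potentials cancel, so the path weight
# changes only by the potentials of its two end states.
original_total = sum(w for _, _, w in arcs)
new_total = sum(towards_initial)
shift = potentials[2] - potentials[0]
```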

rmepsilon(connect=True, weight=None, nstate=-1, delta=0.0009765625)

Removes epsilon transitions.

This operation destructively removes epsilon transitions (i.e., those where both input and output labels are epsilon) from an FST.

Parameters:
  • connect – Should output be trimmed?
  • weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired weight threshold below which paths are pruned; if None, no paths are pruned.
  • nstate – State number threshold (default: -1).
  • delta – Comparison/quantization delta (default: 0.0009765625).
Returns:

self.

See also: The constructive variant, which also supports epsilon removal in reverse (and which may be more efficient).
set_final(state, weight=None)

Sets the final weight for a state.

Parameters:
  • state – The integer index of a state.
  • weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired final weight; if omitted, it is set to semiring One.
Raises:

IndexError – State index out of range.

See also: set_start.

set_input_symbols(syms)

Sets the input symbol table.

Passing None as a value will delete the input symbol table.

Parameters:syms – A SymbolTable.
Returns:self.

See also: set_output_symbols.

set_output_symbols(syms)

Sets the output symbol table.

Passing None as a value will delete the output symbol table.

Parameters:syms – A SymbolTable.
Returns:self.

See also: set_input_symbols.

set_properties(props, mask)

Sets the properties bits.

Parameters:
  • props (int) – The properties to be set.
  • mask (int) – A mask to be applied to the props argument before setting the FST’s properties.
Returns:

self.

set_start(state)

Sets the initial state.

Parameters:state – The integer index of a state.
Returns:self.
Raises:IndexError – State index out of range.

See also: set_final.

start()

Returns the start state.

Returns:The start state if start state is set, -1 otherwise.
states()

Returns an iterator over all states in the FST.

Returns:A StateIterator object for the FST.

See also: arcs, mutable_arcs.

text(isymbols=None, osymbols=None, ssymbols=None, acceptor=False, show_weight_one=False, missing_symbol='')

Produces a human-readable string representation of the FST.

This method generates a human-readable string representation of the FST. The caller may optionally specify SymbolTables used to label input labels, output labels, or state labels, respectively.

Parameters:
  • isymbols – An optional symbol table used to label input symbols.
  • osymbols – An optional symbol table used to label output symbols.
  • ssymbols – An optional symbol table used to label states.
  • acceptor (bool) – Should the FST be rendered in acceptor format if possible? Defaults False.
  • show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults False.
  • missing_symbol – The string to be printed when symbol table lookup fails.
Returns:

A formatted string representing the FST.

to_bytes()

Returns a bytes object representing the FST.

Returns:A bytes object.
topsort()

Sorts transitions by state IDs.

This operation destructively topologically sorts the FST, if it is acyclic; otherwise it remains unchanged. Once sorted, all transitions are from lower state IDs to higher state IDs.

Returns:self.

See also: arcsort.
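
The effect of topological sorting on state IDs can be sketched with Kahn's algorithm on a plain edge list. This is an illustration only and not necessarily the ordering procedure the library uses; topsort_sketch is a hypothetical helper that returns a renumbering, or None when the graph is cyclic.

```python
from collections import deque


def topsort_sketch(num_states, arcs):
    """Returns a dict mapping old state IDs to new ones, or None if cyclic.

    arcs is a list of (source, destination) pairs.
    """
    indegree = [0] * num_states
    successors = [[] for _ in range(num_states)]
    for src, dst in arcs:
        successors[src].append(dst)
        indegree[dst] += 1
    queue = deque(s for s in range(num_states) if indegree[s] == 0)
    order = []
    while queue:
        state = queue.popleft()
        order.append(state)
        for dst in successors[state]:
            indegree[dst] -= 1
            if indegree[dst] == 0:
                queue.append(dst)
    if len(order) != num_states:
        return None  # a cycle prevents a full ordering
    return {old: new for new, old in enumerate(order)}


arcs = [(2, 0), (0, 1)]
new_id = topsort_sketch(3, arcs)
sorted_arcs = [(new_id[s], new_id[d]) for s, d in arcs]
cyclic = topsort_sketch(2, [(0, 1), (1, 0)])
```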

type()

Returns the FST type.

Returns:The FST type.
union(ifst)

Computes the union (sum) of two FSTs.

This operation destructively computes the union (sum) of the FST with a second FST. If A transduces string x to y with weight a and B transduces string w to v with weight b, then their union transduces x to y with weight a and w to v with weight b.

Parameters:ifst – The second input FST.
Returns:self.
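
Viewed as weighted relations, the union simply contains the pairs of both machines. The sketch below represents each machine as a dict from (input, output) string pairs to tropical weights; union_sketch is a hypothetical helper. When the same pair occurs in both inputs, the sketch keeps the minimum, which is the tropical semiring sum.

```python
def union_sketch(a_paths, b_paths):
    """Combines two weighted relations given as {(input, output): weight}."""
    result = dict(a_paths)
    for pair, weight in b_paths.items():
        # Tropical oplus is min; a missing pair has weight Zero (infinity).
        result[pair] = min(weight, result.get(pair, float("inf")))
    return result


a = {("x", "y"): 1.0}
b = {("w", "v"): 2.0, ("x", "y"): 0.5}
u = union_sketch(a, b)
```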
verify()

Verifies that an FST’s contents are sane.

Returns:True if the contents are sane, False otherwise.
write(filename)

Serializes FST to a file.

This method writes the FST to a file in a binary format.

Parameters:filename (str) – The location of the output file.
Raises:IOError – Write failed.
write_to_stream(strm, wopts)

Serializes FST to an output stream.

Parameters:
Returns:

True if write was successful, False otherwise.

Raises:

RuntimeError – Write failed.

class kaldi.fstext.StdVectorFstArcIterator(fst, state)[source]

Arc iterator for a vector FST over the tropical semiring.

This class is used for iterating over the arcs leaving some state. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the arcs method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.

Creates a new arc iterator.

Parameters:
  • fst – The fst.
  • state – The state index.
Raises:

IndexError – State index out of range.

done()

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:True if the iterator is exhausted, False otherwise.
flags()

Returns the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:The current iterator behavioral flags as an integer.
next()

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

position()

Returns the position of the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:The iterator’s position, expressed as an integer.
reset()

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

seek(a)

Advances the iterator to a new position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:a (int) – The position to seek to.
set_flags(flags, mask)

Sets the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:
  • flags (int) – The properties to be set.
  • mask (int) – A mask to be applied to the flags argument before setting them.
value()

Returns the current arc.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

class kaldi.fstext.StdVectorFstMutableArcIterator(fst, state)[source]

Mutable arc iterator for a vector FST over the tropical semiring.

This class is used for iterating over the arcs leaving some state and optionally replacing them with new ones. In addition to the full C++ API, it also supports the iterator protocol. Calling the __iter__ method of a mutable arc iterator object returns an iterator over (arc, setter) pairs. The setter is a bound method of the mutable arc iterator object that can be used to replace the current arc with a new one. Most users should just call the mutable_arcs method of a vector FST object instead of directly constructing this iterator and take advantage of the Pythonic API, e.g.

for arc, setter in fst.mutable_arcs(0):
    setter(StdArc(arc.ilabel, 0, arc.weight, arc.nextstate))
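
The (arc, setter) protocol can be imitated in plain Python to see how the setter stays tied to the current position. The sketch below is not PyKaldi code: SketchMutableArcs is a hypothetical class working on a list of (ilabel, olabel, weight, nextstate) tuples.

```python
class SketchMutableArcs:
    """Hypothetical stand-in yielding (arc, setter) pairs over a list."""

    def __init__(self, arcs):
        self._arcs = arcs
        self._pos = 0

    def set_value(self, arc):
        # Replaces the arc at the current position.
        self._arcs[self._pos] = arc

    def __iter__(self):
        self._pos = 0
        while self._pos < len(self._arcs):
            # The setter is a bound method of this iterator object.
            yield self._arcs[self._pos], self.set_value
            self._pos += 1


arcs = [(1, 5, 0.5, 1), (2, 6, 1.0, 2)]
for arc, setter in SketchMutableArcs(arcs):
    ilabel, olabel, weight, nextstate = arc
    setter((ilabel, 0, weight, nextstate))  # set the output label to epsilon
```

Because the setter writes to the iterator's current position, it should be called before the loop advances to the next arc.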

Creates a new arc iterator.

Parameters:
  • fst – The fst.
  • state – The state index.
Raises:

IndexError – State index out of range.

done()

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:True if the iterator is exhausted, False otherwise.
flags()

Returns the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:The current iterator behavioral flags as an integer.
next()

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

position()

Returns the position of the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:The iterator’s position, expressed as an integer.
reset()

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

seek(a)

Advances the iterator to a new position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:a (int) – The position to seek to.
set_flags(flags, mask)

Sets the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:
  • flags (int) – The properties to be set.
  • mask (int) – A mask to be applied to the flags argument before setting them.
set_value(arc)

Replaces the current arc with a new arc.

Parameters:arc – The arc to replace the current arc with.
value()

Returns the current arc.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

class kaldi.fstext.StdVectorFstStateIterator(fst)[source]

State iterator for a vector FST over the tropical semiring.

This class is used for iterating over the states. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the states method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.

Creates a new state iterator.

Parameters:fst – The fst.
done()

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:True if the iterator is exhausted, False otherwise.
next()

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

reset()

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

value()

Returns the current state index.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

class kaldi.fstext.SymbolTable

Symbol table.

SymbolTable():
Creates a new symbol table.

This class can be used to programmatically construct a SymbolTable in memory, e.g.

import string

table = SymbolTable()
table.set_name("alphabet")
table.add_symbol("<eps>")
for symbol in string.ascii_lowercase:
    table.add_symbol(symbol)
table.write_text("alphabet.syms")
add_pair(symbol:str, key:int) → int

Adds a symbol with given key to the table and returns the index.

This method adds a (symbol, key) pair to the table. If symbol is already in the table with a different key, then the return value will be the already existing key. Otherwise, return value will be the given key.

Parameters:
  • symbol – A symbol string.
  • key – A non-negative index for the symbol (-1 is reserved for “no symbol requested”).
Returns:

The integer index of the new symbol.

add_symbol(symbol:str) → int

Adds a symbol to the table and returns the index.

This method adds a symbol to the table. The associated value key is automatically assigned by the symbol table.

Parameters:symbol – A symbol string.
Returns:The integer index of the new symbol.
add_table(table:SymbolTable)

Adds another SymbolTable to this table.

This method merges another symbol table into the current table. All key values will be offset by the current available key.

Parameters:table – A SymbolTable to be merged with the current table.
available_key() → int

Returns the current available key (i.e. highest key + 1).

checksum() → str

Returns the label-agnostic MD5 checksum for the table.

copy() → SymbolTable

Returns a copy of the symbol table.

find_index(symbol:str) → int

Given a symbol, finds the associated index.

Parameters:symbol – A symbol string.
Returns:The index associated with the symbol, or -1 if the symbol is not found.
find_symbol(key:int) → str

Given an index, finds the associated symbol.

Parameters:key – An index.
Returns:The symbol associated with the index key. Empty string if index is not found.
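
The documented return conventions for lookups and insertions can be summarized with a dictionary-backed stand-in. The sketch below is not the PyKaldi class: SketchSymbolTable is hypothetical, reuses only the documented method names, and does not handle the case where a key is already taken by a different symbol.

```python
class SketchSymbolTable:
    """Hypothetical stand-in following the documented return conventions."""

    def __init__(self):
        self._by_symbol = {}
        self._by_key = {}
        self._available = 0

    def add_pair(self, symbol, key):
        if symbol in self._by_symbol:
            return self._by_symbol[symbol]  # the already existing key
        self._by_symbol[symbol] = key
        self._by_key[key] = symbol
        self._available = max(self._available, key + 1)
        return key

    def add_symbol(self, symbol):
        return self.add_pair(symbol, self._available)

    def available_key(self):
        return self._available  # highest key + 1

    def find_index(self, symbol):
        return self._by_symbol.get(symbol, -1)

    def find_symbol(self, key):
        return self._by_key.get(key, "")


table = SketchSymbolTable()
eps = table.add_symbol("<eps>")
a = table.add_symbol("a")
b = table.add_symbol("b")
existing = table.add_pair("a", 7)  # "a" is already present with key 1
missing_index = table.find_index("z")
missing_symbol = table.find_symbol(99)
```

Checking the lookup result against -1 or the empty string replaces a separate membership test followed by a second lookup.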
from_name(name:str) → SymbolTable

Creates a new SymbolTable with the given name.

get_nth_key(pos:int) → int

Retrieves the integer index of the n-th key in the table.

Parameters:pos – The n-th key to retrieve.
Returns:The integer index of the n-th key or -1 if index is not found.
labeled_checksum() → str

Returns the label-dependent MD5 checksum of the table.

member_index(key:int) → bool

Given an index, returns whether it is found in the table.

This method returns a boolean indicating whether the given index is present in the table. If a lookup will follow anyway, it is better to call the find_symbol method directly and check the return value.

Parameters:key – An index.
Returns:Whether or not the key is present in the table.
member_symbol(symbol:str) → bool

Given a symbol, returns whether it is found in the table.

This method returns a boolean indicating whether the given symbol is present in the table. If a lookup will follow anyway, it is better to call the find_index method directly and check the return value.

Parameters:symbol – A symbol.
Returns:Whether or not the symbol is present in the table.
name() → str

Returns the name of the table.

num_symbols() → int

Returns the number of symbols in the table.

read(filename:str) → SymbolTable

Reads symbol table from binary file.

This class method creates a new SymbolTable.

Parameters:filename – The string location of the input binary file.
Returns:A new SymbolTable instance.

See also: SymbolTable.read_text.

read_text(filename:str, opts:SymbolTableTextOptions=default) → SymbolTable

Reads symbol table from text file.

This class method creates a new SymbolTable.

Parameters:
  • filename – The string location of the input text file.
  • opts (SymbolTableTextOptions) – The symbol table reading options.
Returns:

A new SymbolTable instance.

See also: SymbolTable.read.

remove_symbol(key:int)

Removes the symbol with the given key.

set_name(new_name:str)

Sets the name of the table.

write(filename:str) → bool

Serializes symbol table to a file.

This method writes the SymbolTable to a file in binary format.

Parameters:filename – The string location of the output file.
Returns:True if write was successful, False otherwise.
write_text(filename:str) → bool

Writes symbol table to text file.

This method writes the SymbolTable to a file in human-readable format.

Parameters:filename – The string location of the output file.
Returns:True if write was successful, False otherwise.
class kaldi.fstext.SymbolTableIterator

Symbol table iterator.

This class is used for iterating over the (index, symbol) pairs in a symbol table. In addition to the full C++ API, it also supports the iterator protocol, e.g.

# Returns a symbol table containing only symbols referenced by fst.
def prune_symbol_table(fst, syms, inp=True):
    seen = set([0])
    for s in fst.states():
        for a in fst.arcs(s):
            seen.add(a.ilabel if inp else a.olabel)
    pruned = SymbolTable()
    for label, symbol in SymbolTableIterator(syms):
        if label in seen:
            pruned.add_pair(symbol, label)
    return pruned
Parameters:table – The symbol table.
done() → bool

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:True if the iterator is exhausted, False otherwise.
next()

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

reset()

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

symbol() → str

Returns the current symbol string.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:A symbol string.
value() → int

Returns the current integer index.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:An integer index.
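
To make the relationship between the C++-style methods and the Pythonic protocol concrete, here is a toy iterator over (index, symbol) pairs that supports both. ToySymbolIterator is a hypothetical class for illustration only; it does not wrap a real symbol table and makes no claim about how PyKaldi implements the protocol.

```python
class ToySymbolIterator:
    """Hypothetical iterator exposing both access styles."""

    def __init__(self, pairs):
        self._pairs = sorted(pairs)  # list of (index, symbol)
        self._pos = 0

    # C++-style API
    def done(self):
        return self._pos >= len(self._pairs)

    def next(self):
        self._pos += 1

    def reset(self):
        self._pos = 0

    def value(self):
        return self._pairs[self._pos][0]

    def symbol(self):
        return self._pairs[self._pos][1]

    # Pythonic API, built on top of the methods above
    def __iter__(self):
        self.reset()
        while not self.done():
            yield self.value(), self.symbol()
            self.next()


it = ToySymbolIterator([(0, "<eps>"), (1, "a"), (2, "b")])

# C++-style loop
cpp_style = []
it.reset()
while not it.done():
    cpp_style.append((it.value(), it.symbol()))
    it.next()

# Pythonic loop
pythonic = [(label, symbol) for label, symbol in it]
```

Both loops visit the same pairs in the same order; the second form is the one most Python code would use.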
class kaldi.fstext.SymbolTableTextOptions

Options for reading symbol table from text file.

SymbolTableTextOptions(allow_negative_labels=False):
Creates options for reading symbol table from text file.
Parameters:allow_negative_labels (bool) – Allow negative labels?
allow_negative_labels

Allow negative labels? (Not recommended; may cause conflicts).

fst_field_separator

Set of characters used as a separator between printed fields.
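
The effect of these two options can be sketched with a small parser for symbol table text. It assumes the usual OpenFst text layout of one "symbol index" pair per line; parse_symbol_lines and its arguments are hypothetical names for this example, and the code does not call PyKaldi.

```python
def parse_symbol_lines(lines, allow_negative_labels=False, separators="\t "):
    """Parse 'symbol index' lines into a dict (toy model, not PyKaldi code)."""
    table = {}
    for lineno, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        # Treat every character in `separators` as a field separator.
        for ch in separators:
            line = line.replace(ch, " ")
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 2:
            raise ValueError(f"line {lineno}: expected 'symbol index'")
        symbol, key = fields[0], int(fields[1])
        if key < 0 and not allow_negative_labels:
            raise ValueError(f"line {lineno}: negative label {key}")
        table[symbol] = key
    return table


parsed = parse_symbol_lines(["<eps>\t0\n", "a\t1\n", "b 2\n"])
relaxed = parse_symbol_lines(["x\t-2\n"], allow_negative_labels=True)
```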

class kaldi.fstext.TropicalWeight

Tropical weight factory.

This class is used for creating new TropicalWeight instances.

TropicalWeight():
Creates an uninitialized TropicalWeight instance.
TropicalWeight(weight):
Creates a new TropicalWeight instance initialized with the weight.
Parameters:weight (float or FloatWeight) – The weight value.
from_float(f:float) → TropicalWeight

Create a new tropical weight from a float.

from_other(weight:TropicalWeight) → TropicalWeight

Create a new tropical weight from another.

hash() → int

Returns the hash for the weight.

member() → bool

Checks if weight is a member of the tropical semiring.

no_weight() → TropicalWeight

No weight in tropical semiring.

one() → TropicalWeight

One in tropical semiring, i.e. 0.0.

properties() → int

Returns weight properties.

quantize(delta:float=default) → TropicalWeight

Quantizes the weight.

reverse() → TropicalWeight

Reverses the weight.

type() → str

Returns weight type.

value

Float value of the weight.

zero() → TropicalWeight

Zero in tropical semiring, i.e. float +infinity.
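
For readers new to the tropical semiring, the following plain-float sketch shows what One, Zero, and the two semiring operations mean numerically: plus is min and times is ordinary addition. The function names are chosen for this example and are not part of the PyKaldi API.

```python
import math

ZERO = math.inf  # semiring Zero: identity for plus, annihilator for times
ONE = 0.0        # semiring One: identity for times


def plus(a, b):
    """Tropical plus: keep the better (smaller) weight."""
    return min(a, b)


def times(a, b):
    """Tropical times: accumulate weights along a path."""
    return a + b


# Weight of a path is the times-product of its arc weights.
path1 = times(times(ONE, 1.0), 2.5)
path2 = times(times(ONE, 3.0), 1.0)

# Weight assigned to a string is the plus-sum over its paths.
best = plus(path1, path2)
```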

kaldi.fstext.arcmap(ifst, map_type='identity', delta=0.0009765625, weight=None)

Constructively applies a transform to all arcs and final states.

This operation transforms each arc and final state in the input FST using one of the following:

  • identity: maps to self.
  • input_epsilon: replaces all input labels with epsilon.
  • invert: reciprocates all non-Zero weights.
  • output_epsilon: replaces all output labels with epsilon.
  • plus: adds a constant to all weights.
  • quantize: quantizes weights.
  • rmweight: replaces all non-Zero weights with 1.
  • superfinal: redirects final states to a new superfinal state.
  • times: right-multiplies a constant to all weights.
Parameters:
  • ifst – The input FST.
  • map_type – A string matching a known mapping operation (see above).
  • delta – Comparison/quantization delta (ignored unless map_type is quantize, default: 0.0009765625).
  • weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring passed to the arc-mapper; this is ignored unless map_type is plus (in which case it defaults to semiring Zero) or times (in which case it defaults to semiring One).
Returns:

An FST with arcs and final states remapped.

Raises:

ValueError – Unknown map type.

See also: statemap.
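
The weight-related map types can be illustrated on bare floats in the tropical semiring. This is a sketch under stated assumptions: map_weight is a hypothetical helper, the label-rewriting map types are left out, and the quantization rule (round to the nearest multiple of delta) is assumed to follow OpenFst's float weights. Note that the default delta, 0.0009765625, is 1/1024.

```python
import math

ZERO, ONE = math.inf, 0.0  # tropical semiring Zero and One


def map_weight(w, map_type="identity", delta=1 / 1024, weight=None):
    """Apply one weight-related map type to a tropical weight (toy model)."""
    if map_type == "identity":
        return w
    if map_type == "invert":
        # Reciprocal w.r.t. times (addition) is negation; Zero is left alone.
        return w if w == ZERO else -w
    if map_type == "rmweight":
        return w if w == ZERO else ONE
    if map_type == "plus":
        # The constant defaults to semiring Zero, which changes nothing.
        return min(w, ZERO if weight is None else weight)
    if map_type == "times":
        # The constant defaults to semiring One, which changes nothing.
        return w + (ONE if weight is None else weight)
    if map_type == "quantize":
        return w if w == ZERO else math.floor(w / delta + 0.5) * delta
    raise ValueError(f"unknown map type: {map_type!r}")


inverted = map_weight(2.5, "invert")
unweighted = map_weight(3.0, "rmweight")
capped = map_weight(3.0, "plus", weight=1.0)
shifted = map_weight(3.0, "times", weight=1.0)
quantized = map_weight(0.3, "quantize")
```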

kaldi.fstext.compat_symbols(syms1:SymbolTable, syms2:SymbolTable, warning:bool=default) → bool

Returns true if the two symbol tables have equal checksums.

Passing in None for either table always returns true.

kaldi.fstext.compose(ifst1, ifst2, connect=True, compose_filter='auto')

Constructively composes two FSTs.

This operation computes the composition of two FSTs. If A transduces string x to y with weight a and B transduces y to z with weight b, then their composition transduces string x to z with weight a ⊗ b. The output labels of the first transducer or the input labels of the second transducer must be sorted (or otherwise support appropriate matchers).

Parameters:
  • ifst1 – The first input FST.
  • ifst2 – The second input FST.
  • connect – Should output be trimmed?
  • compose_filter – A string matching a known composition filter; one of: “alt_sequence”, “auto”, “match”, “null”, “sequence”, “trivial”.
Returns:

A composed FST.

See also: arcsort.
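
The definition of composition can be made concrete with a toy implementation for epsilon-free transducers over the tropical semiring, where ⊗ is addition. The dict-based FST layout and the toy_compose name are assumptions of this sketch; it does no trimming, no epsilon handling, and no label sorting, and it does not call PyKaldi.

```python
def toy_compose(a, b):
    """Compose two epsilon-free toy transducers (tropical weights).

    Each FST is a dict with keys "start", "final" ({state: weight}) and
    "arcs" ({state: [(ilabel, olabel, weight, nextstate)]}).
    """
    start = (a["start"], b["start"])
    arcs, final = {}, {}
    stack, seen = [start], {start}
    while stack:
        state = stack.pop()
        qa, qb = state
        arcs[state] = []
        # A pair state is final when both components are final.
        if qa in a["final"] and qb in b["final"]:
            final[state] = a["final"][qa] + b["final"][qb]
        for i1, o1, w1, n1 in a["arcs"].get(qa, []):
            for i2, o2, w2, n2 in b["arcs"].get(qb, []):
                if o1 == i2:  # output of A must match input of B
                    nxt = (n1, n2)
                    arcs[state].append((i1, o2, w1 + w2, nxt))
                    if nxt not in seen:
                        seen.add(nxt)
                        stack.append(nxt)
    return {"start": start, "final": final, "arcs": arcs}


A = {"start": 0, "final": {1: 0.0}, "arcs": {0: [("x", "y", 1.0, 1)]}}
B = {"start": 0, "final": {1: 0.5}, "arcs": {0: [("y", "z", 2.0, 1)]}}
AB = toy_compose(A, B)
```

Here A maps x to y with weight 1.0 and B maps y to z with weight 2.0, so the composed machine maps x to z with arc weight 3.0.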

kaldi.fstext.deserialize_symbol_table(str:bytes) → SymbolTable

Deserializes a symbol table.

kaldi.fstext.determinize(ifst, delta=0.0009765625, weight=None, nstate=-1, subsequential_label=0, det_type='functional', increment_subsequential_label=False)

Constructively determinizes a weighted FST.

This operation creates an equivalent FST that has the property that no state has two transitions with the same input label. For this algorithm, epsilon transitions are treated as regular symbols (cf. rmepsilon).

Parameters:
  • ifst – The input FST.
  • delta – Comparison/quantization delta (default: 0.0009765625).
  • weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired weight threshold below which paths are pruned; if None, no paths are pruned.
  • nstate – State number threshold (default: -1).
  • subsequential_label – Input label of arc corresponding to residual final output when producing a subsequential transducer.
  • det_type – Type of determinization; one of: “functional” (input transducer is functional), “nonfunctional” (input transducer is not functional) and “disambiguate” (input transducer is not functional but only keep the min of ambiguous outputs).
  • increment_subsequential_label – Increment subsequential when creating several arcs for the residual final output at a given state.
Returns:

An equivalent deterministic FST.

Raises:

ValueError – Unknown determinization type.

See also: disambiguate, rmepsilon.
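
As an illustration of what "no state has two transitions with the same input label" means for weighted machines, here is a toy weighted subset construction for acyclic acceptors over the tropical semiring. It is a textbook-style sketch, not PyKaldi's implementation: toy_determinize and the dict layout are assumptions, transducer outputs, pruning, and the delta comparison are all ignored.

```python
import math


def toy_determinize(start, arcs, final):
    """Determinize a toy acyclic weighted acceptor (tropical semiring).

    arcs: {state: [(label, weight, nextstate)]}; final: {state: weight}.
    Each new state is a frozenset of (old_state, residual_weight) pairs.
    """
    init = frozenset({(start, 0.0)})
    det_arcs, det_final = {}, {}
    stack, seen = [init], {init}
    while stack:
        subset = stack.pop()
        det_arcs[subset] = []
        finals = [r + final[q] for q, r in subset if q in final]
        if finals:
            det_final[subset] = min(finals)
        # Group all reachable (nextstate, accumulated weight) by label.
        by_label = {}
        for q, r in subset:
            for label, w, nxt in arcs.get(q, []):
                by_label.setdefault(label, []).append((nxt, r + w))
        for label, dests in sorted(by_label.items()):
            w_out = min(w for _, w in dests)  # weight emitted on the new arc
            residual = {}
            for nxt, w in dests:
                residual[nxt] = min(residual.get(nxt, math.inf), w - w_out)
            target = frozenset(residual.items())
            det_arcs[subset].append((label, w_out, target))
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return init, det_arcs, det_final


nfa_arcs = {0: [("a", 1.0, 1), ("a", 2.0, 2)],
            1: [("b", 3.0, 3)],
            2: [("c", 1.0, 3)]}
init, det_arcs, det_final = toy_determinize(0, nfa_arcs, {3: 0.0})
```

State 0 has two arcs labeled a; after determinization a single a arc remains, and the weight difference is carried as a residual so that "a b" still costs 4.0 and "a c" still costs 3.0.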

kaldi.fstext.difference(ifst1, ifst2, connect=True, compose_filter='auto')

Constructively computes the difference of two FSTs.

This operation computes the difference between two FSAs. Only strings that are in the first automaton but not in second are retained in the result. The first argument must be an acceptor; the second argument must be an unweighted, epsilon-free, deterministic acceptor. The output labels of the first transducer or the input labels of the second transducer must be sorted (or otherwise support appropriate matchers).

Parameters:
  • ifst1 – The first input FST.
  • ifst2 – The second input FST.
  • connect – Should the output FST be trimmed?
  • compose_filter – A string matching a known composition filter; one of: “alt_sequence”, “auto”, “match”, “null”, “sequence”, “trivial”.
Returns:

An FST representing the difference of the FSTs.

kaldi.fstext.disambiguate(ifst, delta=0.0009765625, weight=None, nstate=-1, subsequential_label=0)

Constructively disambiguates a weighted transducer.

This operation disambiguates a weighted transducer. The result will be an equivalent FST that has the property that no two successful paths have the same input labeling. For this algorithm, epsilon transitions are treated as regular symbols (cf. rmepsilon).

Parameters:
  • ifst – The input FST.
  • delta – Comparison/quantization delta (default: 0.0009765625).
  • weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired weight threshold below which paths are pruned; if None, no paths are pruned.
  • nstate – State number threshold.
  • subsequential_label – Input label of arc corresponding to residual final output when producing a subsequential transducer.
Returns:

An equivalent disambiguated FST.

See also: determinize, rmepsilon.

kaldi.fstext.epsnormalize(ifst, eps_norm_output=False)

Constructively epsilon-normalizes an FST.

This operation creates an equivalent FST that is epsilon-normalized. An acceptor is epsilon-normalized if it is epsilon-removed (cf. rmepsilon). A transducer is input epsilon-normalized if, in addition, along any path, all arcs with epsilon input labels follow all arcs with non-epsilon input labels. Output epsilon-normalized is defined similarly. The input FST must be functional.

Parameters:
  • ifst – The input FST.
  • eps_norm_output – Should the FST be output epsilon-normalized?
Returns:

An equivalent epsilon-normalized FST.

See also: rmepsilon.

kaldi.fstext.equal(ifst1, ifst2, delta=0.0009765625)

Are two FSTs equal?

This function tests whether two FSTs have the same states with the same numbering and the same transitions with the same labels and weights in the same order.

Parameters:
  • ifst1 – The first input FST.
  • ifst2 – The second input FST.
  • delta – Comparison/quantization delta (default: 0.0009765625).
Returns:

True if the FSTs satisfy the above condition, else False.

See also: equivalent, isomorphic, randequivalent.

kaldi.fstext.equivalent(ifst1, ifst2, delta=0.0009765625)

Are the two acceptors equivalent?

This operation tests whether two epsilon-free deterministic weighted acceptors are equivalent, that is if they accept the same strings with the same weights.

Parameters:
  • ifst1 – The first input FST.
  • ifst2 – The second input FST.
  • delta – Comparison/quantization delta (default: 0.0009765625).
Returns:

True if the FSTs satisfy the above condition, else False.

Raises:

RuntimeError – Equivalence test encountered error.

See also: equal, isomorphic, randequivalent.

kaldi.fstext.indices_to_symbols(symbol_table, indices)

Converts indices to symbols by looking them up in the symbol table.

Parameters:
  • symbol_table (SymbolTable) – The symbol table.
  • indices (List[int]) – The list of indices.
Returns:

The list of symbols corresponding to the given indices.

Return type:

List[str]

Raises:

KeyError – If an index is not found in the symbol table.

kaldi.fstext.intersect(ifst1, ifst2, connect=True, compose_filter='auto')

Constructively intersects two FSTs.

This operation computes the intersection (Hadamard product) of two FSTs. Only strings that are in both automata are retained in the result. The two arguments must be acceptors. One of the arguments must be label-sorted (or otherwise support appropriate matchers).

Parameters:
  • ifst1 – The first input FST.
  • ifst2 – The second input FST.
  • connect – Should output be trimmed?
  • compose_filter – A string matching a known composition filter; one of: “alt_sequence”, “auto”, “match”, “null”, “sequence”, “trivial”.
Returns:

An intersected FST.

kaldi.fstext.isomorphic(ifst1, ifst2, delta=0.0009765625)

Are the two acceptors isomorphic?

This operation determines if two transducers with a certain required determinism have the same states, irrespective of numbering, and the same transitions with the same labels and weights, irrespective of ordering. In other words, FSTs A, B are isomorphic if and only if the states of A can be renumbered and the transitions leaving each state reordered so the two are equal (according to the definition given in equal).

Parameters:
  • ifst1 – The first input FST.
  • ifst2 – The second input FST.
  • delta – Comparison/quantization delta (default: 0.0009765625).
Returns:

True if the two transducers satisfy the above condition, else False.

See also: equal, equivalent, randequivalent.

kaldi.fstext.prune(ifst, weight=None, nstate=-1, delta=0.0009765625)

Constructively removes paths with weights below a certain threshold.

This operation deletes states and arcs in the input FST that do not belong to a successful path whose weight is no more (w.r.t. the natural semiring order) than the threshold t ⊗ the weight of the shortest path in the input FST. Weights must be commutative and have the path property.

Parameters:
  • ifst – The input FST.
  • weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired weight threshold below which paths are pruned; if None, no paths are pruned.
  • nstate – State number threshold (default: -1).
  • delta – Comparison/quantization delta (default: 0.0009765625).
Returns:

A pruned FST.

See also: The destructive variant.
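
In the tropical semiring the pruning criterion reduces to simple arithmetic: a successful path survives when its weight is at most the threshold plus the weight of the best path. The sketch below applies that rule to an explicit list of path weights; surviving_paths is a hypothetical helper, and the real operation works on states and arcs rather than enumerated paths.

```python
def surviving_paths(path_weights, threshold):
    """Keep paths whose tropical weight is within `threshold` of the best.

    path_weights: {path_name: total_weight}. Toy model of the criterion only.
    """
    best = min(path_weights.values())
    limit = threshold + best  # threshold (x) shortest-path weight
    return {p: w for p, w in path_weights.items() if w <= limit}


paths = {"a b": 4.0, "a c": 3.0, "d": 9.5}
kept = surviving_paths(paths, threshold=2.0)
```

With a best path of weight 3.0 and a threshold of 2.0, everything heavier than 5.0 is dropped.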

kaldi.fstext.push(ifst, push_weights=False, push_labels=False, remove_common_affix=False, remove_total_weight=False, to_final=False, delta=0.0009765625)

Constructively pushes weights/labels towards initial or final states.

This operation produces an equivalent transducer by pushing the weights and/or the labels towards the initial state or toward the final states.

When pushing weights towards the initial state, the sum of the weight of the outgoing transitions and final weight at any non-initial state is equal to 1 in the resulting machine. When pushing weights towards the final states, the sum of the weight of the incoming transitions at any state is equal to 1. Weights need to be left distributive when pushing towards the initial state and right distributive when pushing towards the final states.

Pushing labels towards the initial state consists in minimizing at every state the length of the longest common prefix of the output labels of the outgoing paths. Pushing labels towards the final states consists in minimizing at every state the length of the longest common suffix of the output labels of the incoming paths.

Parameters:
  • ifst – The input FST.
  • push_weights – Should weights be pushed?
  • push_labels – Should labels be pushed?
  • remove_common_affix – If pushing labels, should common prefix/suffix be removed?
  • remove_total_weight – If pushing weights, should total weight be removed?
  • to_final – Push towards final states?
  • delta – Comparison/quantization delta (default: 0.0009765625).
Returns:

An equivalent pushed FST.

See also: The destructive variant.
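
Weight pushing towards the initial state can be sketched for an acyclic acceptor over the tropical semiring, where "sum" is min and semiring One is 0.0. The sketch assumes every state can reach a final state and that the caller supplies the states in topological order; toy_push_to_initial and its data layout are hypothetical and unrelated to PyKaldi's internals.

```python
import math


def toy_push_to_initial(start, arcs, final, order, remove_total_weight=False):
    """Push tropical weights towards the start state of an acyclic acceptor.

    arcs: {state: [(label, weight, nextstate)]}; final: {state: weight};
    order: all states in topological order.
    """
    # d[q] = shortest distance from q to the final states.
    d = {}
    for q in reversed(order):
        d[q] = min([final.get(q, math.inf)] +
                   [w + d[n] for _, w, n in arcs.get(q, [])])
    # Reweight so that the best continuation from each state costs 0.0.
    new_arcs = {q: [(l, w + d[n] - d[q], n) for l, w, n in arcs.get(q, [])]
                for q in order}
    new_final = {q: f - d[q] for q, f in final.items()}
    if not remove_total_weight:
        # Put the total weight back on whatever leaves the start state.
        new_arcs[start] = [(l, w + d[start], n) for l, w, n in new_arcs[start]]
        if start in new_final:
            new_final[start] += d[start]
    return new_arcs, new_final


arcs = {0: [("a", 1.0, 1), ("b", 4.0, 2)],
        1: [("c", 2.0, 3)],
        2: [("d", 0.0, 3)]}
pushed_arcs, pushed_final = toy_push_to_initial(0, arcs, {3: 1.0}, [0, 1, 2, 3])
```

Path weights are unchanged ("a c" still costs 4.0 and "b d" still costs 5.0), but all of the weight now sits on the arcs leaving the start state.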

kaldi.fstext.randequivalent(ifst1, ifst2, npath=1, delta=0.0009765625, seed=None, select='uniform', max_length=2147483647)

Are two acceptors stochastically equivalent?

This operation tests whether two FSTs are equivalent by randomly generating paths alternatively in each of the two FSTs. For each randomly generated path, the algorithm computes for each of the two FSTs the sum of the weights of all the successful paths sharing the same input and output labels as the randomly generated path and checks that these two values are within delta.

Parameters:
  • ifst1 – The first input FST.
  • ifst2 – The second input FST.
  • npath – The number of random paths to generate.
  • delta – Comparison/quantization delta.
  • seed – An optional seed value for random path generation; if None, the current time and process ID are used.
  • select – A string matching a known random arc selection type; one of: “uniform”, “log_prob”, “fast_log_prob”.
  • max_length – The maximum length of each random path.
Returns:

True if the two transducers satisfy the above condition, else False.

Raises:

RuntimeError – Random equivalence test encountered error.

See also: equal, equivalent, isomorphic, randgen.

kaldi.fstext.randgen(ifst, npath=1, seed=None, select='uniform', max_length=2147483647, weighted=False, remove_total_weight=False)

Randomly generate successful paths in an FST.

This operation randomly generates a set of successful paths in the input FST. This relies on a mechanism for selecting arcs, specified using the select argument. The default selector, “uniform”, randomly selects a transition using a uniform distribution. The “log_prob” selector randomly selects a transition w.r.t. the weights treated as negative log probabilities after normalizing for the total weight leaving the state. In all cases, finality is treated as a transition to a super-final state.

Parameters:
  • ifst – The input FST.
  • npath – The number of random paths to generate.
  • seed – An optional seed value for random path generation; if None, the current time and process ID are used.
  • select – A string matching a known random arc selection type; one of: “uniform”, “log_prob”, “fast_log_prob”.
  • max_length – The maximum length of each random path.
  • weighted – Should the output be weighted by path count?
  • remove_total_weight – Should the total weight be removed (ignored when weighted is False)?
Returns:

An FST containing one or more random paths.

See also: randequivalent.
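
The "uniform" selector, with finality treated as one more transition to a super-final state, can be sketched as follows. toy_random_path is a hypothetical function over a dict-based acceptor; it uses Python's random module rather than whatever generator the library uses, so it will not reproduce PyKaldi's paths for a given seed.

```python
import random


def toy_random_path(start, arcs, final, seed=None, max_length=100):
    """Sample one path with uniform arc selection (toy model).

    arcs: {state: [(label, nextstate)]}; final: set of final states.
    Returns the label sequence, or None if a dead end is reached.
    """
    rng = random.Random(seed)
    state, labels = start, []
    while len(labels) < max_length:
        choices = list(arcs.get(state, []))
        if state in final:
            choices.append(None)  # stopping counts as one more "arc"
        if not choices:
            return None           # non-final state without outgoing arcs
        pick = rng.choice(choices)
        if pick is None:
            return labels
        label, state = pick
        labels.append(label)
    return labels                 # truncated at max_length


linear = {0: [("a", 1)], 1: [("b", 2)]}
only_path = toy_random_path(0, linear, {2}, seed=7)

branching = {0: [("a", 1), ("b", 1)], 1: [("c", 0)]}
first = toy_random_path(0, branching, {1}, seed=3)
second = toy_random_path(0, branching, {1}, seed=3)
```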

kaldi.fstext.read_fst_kaldi(rxfilename)

Reads FST using Kaldi I/O mechanisms.

Does not support reading in text mode.

Parameters:

rxfilename (str) – Extended filename for reading the FST.

Returns:

An FST object.

Raises:
  • IOError – If reading fails.
  • TypeError – If FST type or arc type is not supported.
kaldi.fstext.relabel_symbol_table(table:SymbolTable, pairs:list<tuple<int, int>>) → SymbolTable

Relabels a symbol table as specified by the input list of pairs.

The new symbol table only retains symbols for which a relabeling is explicitly specified.

Parameters:
  • table – A symbol table.
  • pairs – A list of (old label, new label) pairs.
Returns:

A new symbol table.
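
The "only retains symbols for which a relabeling is explicitly specified" rule is easy to overlook, so here is a dict-based sketch of it. toy_relabel is a hypothetical helper operating on a plain {label: symbol} mapping, not on a SymbolTable.

```python
def toy_relabel(table, pairs):
    """Relabel a {label: symbol} mapping; unlisted labels are dropped."""
    return {new: table[old] for old, new in pairs if old in table}


symbols = {0: "<eps>", 1: "a", 2: "b"}
relabeled = toy_relabel(symbols, [(1, 10), (2, 20)])
```

Because label 0 has no pair, "<eps>" is absent from the result; include an explicit (0, 0) pair to keep it.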

kaldi.fstext.replace(pairs, root_label, call_arc_labeling='input', return_arc_labeling='neither', epsilon_on_replace=False, return_label=0)

Recursively replaces arcs in the root FST with other FST(s).

This operation performs the dynamic replacement of arcs in one FST with another FST, allowing the definition of FSTs analogous to RTNs. It takes as input a set of pairs formed by a non-terminal label and its corresponding FST, and a label identifying the root FST in that set. The resulting FST is obtained by taking the root FST and recursively replacing each arc having a nonterminal as output label by its corresponding FST. More precisely, an arc from state s to state d with (nonterminal) output label n in this FST is replaced by redirecting this “call” arc to the initial state of a copy F of the FST for n, and adding “return” arcs from each final state of F to d. Optional arguments control how the call and return arcs are labeled; by default, the only non-epsilon label is placed on the call arc.

Parameters:
  • pairs – An iterable of (nonterminal label, FST) pairs, where the former is an unsigned integer and the latter is an Fst instance.
  • root_label – Label identifying the root FST.
  • call_arc_labeling – A string indicating which call arc labels should be non-epsilon. One of: “input” (default), “output”, “both”, “neither”. This value is set to “neither” if epsilon_on_replace is True.
  • return_arc_labeling – A string indicating which return arc labels should be non-epsilon. One of: “input”, “output”, “both”, “neither” (default). This value is set to “neither” if epsilon_on_replace is True.
  • epsilon_on_replace – Should call and return arcs be epsilon arcs? If True, this effectively overrides call_arc_labeling and return_arc_labeling, setting both to “neither”.
  • return_label – The integer label for return arcs.
Returns:

An FST resulting from expanding the input RTN.

kaldi.fstext.reverse(ifst, require_superinitial=True)

Constructively reverses an FST’s transduction.

This operation reverses an FST. If A transduces string x to y with weight a, then the reverse of A transduces the reverse of x to the reverse of y with weight a.Reverse(). (Typically, a = a.Reverse() and Arc = RevArc, e.g., TropicalWeight and LogWeight.) In general, e.g., when the weights only form a left or right semiring, the output arc type must match the input arc type.

Parameters:
  • ifst – The input FST.
  • require_superinitial – Should a superinitial state be created?
Returns:

A reversed FST.
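
A toy version of reversal with a superinitial state, for the tropical semiring where a weight equals its own reverse, might look like this. toy_reverse, the dict layout, and the "super" state name are assumptions of the sketch; label 0 stands for epsilon.

```python
def toy_reverse(start, arcs, final, superinitial="super"):
    """Reverse a toy transducer, adding a superinitial state.

    arcs: {state: [(ilabel, olabel, weight, nextstate)]};
    final: {state: weight}. Returns (new_start, new_arcs, new_final).
    """
    # Epsilon arcs from the superinitial state carry the old final weights.
    rev = {superinitial: [(0, 0, w, q) for q, w in final.items()]}
    # Every arc q -> n becomes an arc n -> q with the same labels and weight.
    for q, out in arcs.items():
        for i, o, w, n in out:
            rev.setdefault(n, []).append((i, o, w, q))
    # The old start state becomes the only final state, with weight One.
    return superinitial, rev, {start: 0.0}


fwd = {0: [(1, 2, 0.5, 1)], 1: [(3, 4, 1.5, 2)]}
r_start, r_arcs, r_final = toy_reverse(0, fwd, {2: 2.0})
```

The forward machine maps input 1 3 to output 2 4 with total weight 4.0; the reversed one maps 3 1 to 4 2 with the same weight.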

kaldi.fstext.rmepsilon(ifst, connect=True, reverse=False, queue_type='auto', delta=0.0009765625, weight=None, nstate=-1)

Constructively removes epsilon transitions from an FST.

This operation removes epsilon transitions (those where both input and output labels are epsilon) from an FST.

Parameters:
  • ifst – The input FST.
  • connect – Should output be trimmed?
  • reverse – Should epsilon transitions be removed in reverse order?
  • queue_type – A string matching a known queue type; one of: “auto”, “fifo”, “lifo”, “shortest”, “state”, “top”.
  • delta – Comparison/quantization delta (default: 0.0009765625).
  • weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired weight threshold; paths with weights below this threshold will be pruned.
  • nstate – State number threshold (default: -1).
Returns:

An equivalent FST with no epsilon transitions.

kaldi.fstext.serialize_symbol_table(table:SymbolTable) → bytes

Serializes a symbol table.

kaldi.fstext.shortestdistance(ifst, reverse=False, source=-1, queue_type='auto', delta=0.0009765625)

Compute the shortest distance from the initial or final state.

This operation computes the shortest distance from the initial state to every state (when reverse is False) or from every state to the final states (when reverse is True). The shortest distance from p to q is the ⊕-sum of the weights of all the paths between p and q. The weights must be right (if reverse is False) or left (if reverse is True) distributive, and k-closed (i.e., 1 ⊕ x ⊕ x^2 ⊕ … ⊕ x^{k + 1} = 1 ⊕ x ⊕ x^2 ⊕ … ⊕ x^k; e.g., TropicalWeight).

Parameters:
  • ifst – The input FST.
  • reverse – Should the reverse distance (from each state to the final state) be computed?
  • source – Source state (this is ignored if reverse is True). If NO_STATE_ID (-1), use FST’s initial state.
  • queue_type – A string matching a known queue type; one of: “auto”, “fifo”, “lifo”, “shortest”, “state”, “top” (this is ignored if reverse is True).
  • delta – Comparison/quantization delta (default: 0.0009765625).
Returns:

A list of Weight objects representing the shortest distance for each state.
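
In the tropical semiring the shortest distance from the initial state is the familiar minimum path cost. The following queue-based relaxation is a sketch of that special case only; toy_shortest_distance is a hypothetical function, it assumes there are no negative-weight cycles, and it does not model queue types or delta.

```python
import math
from collections import deque


def toy_shortest_distance(num_states, arcs, start=0):
    """Single-source shortest distance with tropical weights (toy model).

    arcs: {state: [(label, weight, nextstate)]}. Unreachable states keep
    semiring Zero (+infinity).
    """
    dist = [math.inf] * num_states
    dist[start] = 0.0  # semiring One
    queue = deque([start])
    while queue:
        q = queue.popleft()
        for _, w, n in arcs.get(q, []):
            if dist[q] + w < dist[n]:  # relaxation: plus is min, times is +
                dist[n] = dist[q] + w
                queue.append(n)
    return dist


graph = {0: [("a", 1.0, 1), ("b", 5.0, 2)], 1: [("c", 1.0, 2)]}
distances = toy_shortest_distance(4, graph)
```

The result has one entry per state, mirroring the list of weights the operation returns; state 3 is unreachable and therefore stays at +infinity.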

kaldi.fstext.shortestpath(ifst, nshortest=1, unique=False, queue_type='auto', delta=0.0009765625, weight=None, nstate=-1)

Construct an FST containing the shortest path(s) in the input FST.

This operation produces an FST containing the n-shortest paths in the input FST. The n-shortest paths are the n-lowest weight paths w.r.t. the natural semiring order. The single path that can be read from the ith of at most n transitions leaving the initial state of the resulting FST is the ith shortest path. The weights need to be right distributive and have the path property. For n-shortest with n > 1, they must also be left distributive (e.g., TropicalWeight).

Parameters:
  • ifst – The input FST.
  • nshortest – The number of paths to return.
  • unique – Should the resulting FST only contain distinct paths? (Requires the input FST to be an acceptor; epsilons are treated as if they are regular symbols.)
  • queue_type – A string matching a known queue type; one of: “auto”, “fifo”, “lifo”, “shortest”, “state”, “top”.
  • delta – Comparison/quantization delta (default: 0.0009765625).
  • weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired weight threshold below which paths are pruned; if omitted, no paths are pruned.
  • nstate – State number threshold (default: -1).
Returns:

An FST containing the n-shortest paths.

kaldi.fstext.statemap(ifst, map_type)

Constructively applies a transform to all states.

This operation transforms each state according to the requested map type. Currently, only the two state-mapping operations listed below are supported.

Parameters:
  • ifst – The input FST.
  • map_type – A string matching a known mapping operation; one of: “arc_sum” (sum weights of identically-labeled multi-arcs), “arc_unique” (deletes non-unique identically-labeled multi-arcs).
Returns:

An FST with states remapped.

Raises:

ValueError – Unknown map type.

See also: arcmap.
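
The difference between the two map types is easiest to see on the arcs leaving a single state. The sketch below assumes tropical weights (so "sum" is min) and treats arcs as identically labeled when input label, output label, and destination all match; toy_arc_sum and toy_arc_unique are hypothetical helpers, not PyKaldi functions.

```python
import math


def toy_arc_sum(arcs):
    """Merge identically-labeled multi-arcs, summing (min) their weights."""
    merged = {}
    for i, o, w, n in arcs:
        key = (i, o, n)
        merged[key] = min(merged.get(key, math.inf), w)
    return [(i, o, w, n) for (i, o, n), w in merged.items()]


def toy_arc_unique(arcs):
    """Drop arcs that duplicate an earlier arc exactly (weight included)."""
    seen, out = set(), []
    for arc in arcs:
        if arc not in seen:
            seen.add(arc)
            out.append(arc)
    return out


leaving = [(1, 1, 0.5, 2), (1, 1, 0.5, 2), (1, 1, 0.2, 2), (3, 3, 1.0, 2)]
summed = toy_arc_sum(leaving)
unique = toy_arc_unique(leaving)
```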

kaldi.fstext.symbols_to_indices(symbol_table, symbols)

Converts symbols to indices by looking them up in the symbol table.

Parameters:
  • symbol_table (SymbolTable) – The symbol table.
  • symbols (List[str]) – The list of symbols.
Returns:

The list of indices corresponding to the given symbols.

Return type:

List[int]

Raises:

KeyError – If a symbol is not found in the symbol table.
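
The lookup-or-raise behavior of this function and of indices_to_symbols can be mimicked with plain dictionaries. The helper names below are hypothetical stand-ins that take dicts instead of a SymbolTable; they only illustrate the round trip and the KeyError on a miss.

```python
def toy_symbols_to_indices(index_of, symbols):
    """Map symbols to indices; a missing symbol raises KeyError."""
    return [index_of[s] for s in symbols]


def toy_indices_to_symbols(symbol_of, indices):
    """Map indices to symbols; a missing index raises KeyError."""
    return [symbol_of[i] for i in indices]


index_of = {"<eps>": 0, "hello": 1, "world": 2}
symbol_of = {i: s for s, i in index_of.items()}

ids = toy_symbols_to_indices(index_of, ["hello", "world"])
words = toy_indices_to_symbols(symbol_of, ids)
```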

kaldi.fstext.synchronize(ifst)

Constructively synchronizes an FST.

This operation synchronizes a transducer. The result will be an equivalent FST that has the property that during the traversal of a path, the delay is either zero or strictly increasing, where the delay is the difference between the number of non-epsilon output labels and input labels along the path. For the algorithm to terminate, the input transducer must have bounded delay, i.e., the delay of every cycle must be zero.

Parameters:ifst – The input FST.
Returns:An equivalent synchronized FST.
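
The notion of delay used above can be computed directly for a single path. In this sketch a path is a list of (input label, output label) pairs with 0 standing for epsilon; path_delays is a hypothetical helper and does not perform synchronization itself.

```python
def path_delays(path):
    """Running delay after each arc: non-epsilon outputs minus inputs."""
    delay, out = 0, []
    for ilabel, olabel in path:
        delay += (olabel != 0) - (ilabel != 0)
        out.append(delay)
    return out


# Two inputs are consumed before the two outputs are emitted.
lagging = path_delays([(1, 0), (2, 0), (0, 5), (0, 6)])
# Inputs and outputs advance together, so the delay stays at zero.
in_step = path_delays([(1, 5), (2, 6)])
```

A cycle whose arcs leave the running delay unchanged overall has zero delay, which is the bounded-delay condition required for termination.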
kaldi.fstext.write_fst_kaldi(fst, wxfilename)

Writes FST using Kaldi I/O mechanisms.

FST is written in binary mode without Kaldi binary mode header.

Parameters:
  • fst – The FST to write.
  • wxfilename (str) – Extended filename for writing the FST.
Raises:

IOError – If writing fails.

kaldi.fstext.enums

Functions

GetArcSortType Calls C++ function
GetClosureType Calls C++ function
GetComposeFilter Calls C++ function
GetDeterminizeType Calls C++ function
GetEncodeFlags Calls C++ function
GetEpsNormalizeType Calls C++ function
GetMapType Calls C++ function
GetProjectType Calls C++ function
GetPushFlags Calls C++ function
GetQueueType Calls C++ function
GetRandArcSelection Calls C++ function
GetReplaceLabelType Calls C++ function
GetReweightType Calls C++ function

Classes

ArcSortType An enumeration.
ClosureType An enumeration.
ComposeFilter An enumeration.
DeterminizeType An enumeration.
EncodeType An enumeration.
EpsNormalizeType An enumeration.
MapType An enumeration.
MatchType An enumeration.
ProjectType An enumeration.
QueueType An enumeration.
RandArcSelection An enumeration.
ReplaceLabelType An enumeration.
ReweightType An enumeration.
class kaldi.fstext.enums.ArcSortType

An enumeration.

ILABEL_SORT = 0
OLABEL_SORT = 1
class kaldi.fstext.enums.ClosureType

An enumeration.

CLOSURE_PLUS = 1
CLOSURE_STAR = 0
class kaldi.fstext.enums.ComposeFilter

An enumeration.

ALT_SEQUENCE_FILTER = 4
AUTO_FILTER = 0
MATCH_FILTER = 5
NULL_FILTER = 1
SEQUENCE_FILTER = 3
TRIVIAL_FILTER = 2
class kaldi.fstext.enums.DeterminizeType

An enumeration.

DETERMINIZE_DISAMBIGUATE = 2
DETERMINIZE_FUNCTIONAL = 0
DETERMINIZE_NONFUNCTIONAL = 1
class kaldi.fstext.enums.EncodeType

An enumeration.

DECODE = 2
ENCODE = 1
class kaldi.fstext.enums.EpsNormalizeType

An enumeration.

EPS_NORM_INPUT = 0
EPS_NORM_OUTPUT = 1
kaldi.fstext.enums.GetArcSortType(str:str) -> (success:bool, sort_type:ArcSortType)

Calls C++ function bool ::fst::script::GetArcSortType(::std::string, ::fst::script::ArcSortType*)

kaldi.fstext.enums.GetClosureType(closure_plus:bool) → ClosureType

Calls C++ function ::fst::ClosureType ::fst::script::GetClosureType(bool)

kaldi.fstext.enums.GetComposeFilter(str:str) -> (success:bool, compose_filter:ComposeFilter)

Calls C++ function bool ::fst::script::GetComposeFilter(::std::string, ::fst::ComposeFilter*)
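
Functions in this module that take a string return a (success, value) pair rather than raising. The stand-in below mirrors that shape for compose filters, using the enumeration values listed in this section and the filter names accepted by compose. It is a hypothetical re-implementation for illustration; in particular, returning None as the value on failure is an assumption of the sketch, not documented behavior.

```python
from enum import IntEnum


class ToyComposeFilter(IntEnum):
    """Mirror of the ComposeFilter values listed in this section."""
    AUTO_FILTER = 0
    NULL_FILTER = 1
    TRIVIAL_FILTER = 2
    SEQUENCE_FILTER = 3
    ALT_SEQUENCE_FILTER = 4
    MATCH_FILTER = 5


_FILTER_NAMES = {
    "auto": ToyComposeFilter.AUTO_FILTER,
    "null": ToyComposeFilter.NULL_FILTER,
    "trivial": ToyComposeFilter.TRIVIAL_FILTER,
    "sequence": ToyComposeFilter.SEQUENCE_FILTER,
    "alt_sequence": ToyComposeFilter.ALT_SEQUENCE_FILTER,
    "match": ToyComposeFilter.MATCH_FILTER,
}


def toy_get_compose_filter(name):
    """Return (success, filter) in the style of the Get* helpers."""
    value = _FILTER_NAMES.get(name)
    return value is not None, value


ok, compose_filter = toy_get_compose_filter("sequence")
bad, nothing = toy_get_compose_filter("no_such_filter")
```

Callers are expected to check the success flag before using the returned value.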

kaldi.fstext.enums.GetDeterminizeType(str:str) -> (success:bool, det_type:DeterminizeType)

Calls C++ function bool ::fst::script::GetDeterminizeType(::std::string, ::fst::DeterminizeType*)

kaldi.fstext.enums.GetEncodeFlags(encode_labels:bool, encode_weights:bool) → int

Calls C++ function unsigned int ::fst::script::GetEncodeFlags(bool, bool)

kaldi.fstext.enums.GetEpsNormalizeType(eps_norm_output:bool) → EpsNormalizeType

Calls C++ function ::fst::EpsNormalizeType ::fst::script::GetEpsNormalizeType(bool)

kaldi.fstext.enums.GetMapType(str:str) -> (success:bool, sort_type:MapType)

Calls C++ function bool ::fst::script::GetMapType(::std::string, ::fst::script::MapType*)

kaldi.fstext.enums.GetProjectType(project_output:bool) → ProjectType

Calls C++ function ::fst::ProjectType ::fst::script::GetProjectType(bool)

kaldi.fstext.enums.GetPushFlags(push_weights:bool, push_labels:bool, remove_total_weight:bool, remove_common_affix:bool) → int

Calls C++ function unsigned int ::fst::script::GetPushFlags(bool, bool, bool, bool)

kaldi.fstext.enums.GetQueueType(str:str) -> (success:bool, queue_type:QueueType)

Calls C++ function bool ::fst::script::GetQueueType(::std::string, ::fst::QueueType*)

kaldi.fstext.enums.GetRandArcSelection(str:str) -> (success:bool, ras:RandArcSelection)

Calls C++ function bool ::fst::script::GetRandArcSelection(::std::string, ::fst::script::RandArcSelection*)

kaldi.fstext.enums.GetReplaceLabelType(str:str, epsilon_on_replace:bool) -> (success:bool, rlt:ReplaceLabelType)

Calls C++ function bool ::fst::script::GetReplaceLabelType(::std::string, bool, ::fst::ReplaceLabelType*)

kaldi.fstext.enums.GetReweightType(to_final:bool) → ReweightType

Calls C++ function ::fst::ReweightType ::fst::script::GetReweightType(bool)

class kaldi.fstext.enums.MapType

An enumeration.

ARC_SUM_MAPPER = 0
ARC_UNIQUE_MAPPER = 1
IDENTITY_MAPPER = 2
INPUT_EPSILON_MAPPER = 3
INVERT_MAPPER = 4
OUTPUT_EPSILON_MAPPER = 5
PLUS_MAPPER = 6
POWER_MAPPER = 7
QUANTIZE_MAPPER = 8
RMWEIGHT_MAPPER = 9
SUPERFINAL_MAPPER = 10
TIMES_MAPPER = 11
TO_LOG64_MAPPER = 13
TO_LOG_MAPPER = 12
TO_STD_MAPPER = 14
class kaldi.fstext.enums.MatchType

An enumeration.

MATCH_BOTH = 3
MATCH_INPUT = 1
MATCH_NONE = 4
MATCH_OUTPUT = 2
MATCH_UNKNOWN = 5
class kaldi.fstext.enums.ProjectType

An enumeration.

PROJECT_INPUT = 1
PROJECT_OUTPUT = 2
class kaldi.fstext.enums.QueueType

An enumeration.

AUTO_QUEUE = 7
FIFO_QUEUE = 1
LIFO_QUEUE = 2
OTHER_QUEUE = 8
SCC_QUEUE = 6
SHORTEST_FIRST_QUEUE = 3
STATE_ORDER_QUEUE = 5
TOP_ORDER_QUEUE = 4
TRIVIAL_QUEUE = 0
class kaldi.fstext.enums.RandArcSelection

An enumeration.

FAST_LOG_PROB_ARC_SELECTOR = 2
LOG_PROB_ARC_SELECTOR = 1
UNIFORM_ARC_SELECTOR = 0
class kaldi.fstext.enums.ReplaceLabelType

An enumeration.

REPLACE_LABEL_BOTH = 4
REPLACE_LABEL_INPUT = 2
REPLACE_LABEL_NEITHER = 1
REPLACE_LABEL_OUTPUT = 3
class kaldi.fstext.enums.ReweightType

An enumeration.

REWEIGHT_TO_FINAL = 1
REWEIGHT_TO_INITIAL = 0

kaldi.fstext.properties

FST Properties.

kaldi.fstext.properties.EXPANDED = 1
kaldi.fstext.properties.MUTABLE = 2
kaldi.fstext.properties.ERROR = 4
kaldi.fstext.properties.ACCEPTOR = 65536
kaldi.fstext.properties.NOT_ACCEPTOR = 131072
kaldi.fstext.properties.I_DETERMINISTIC = 262144
kaldi.fstext.properties.NON_I_DETERMINISTIC = 524288
kaldi.fstext.properties.O_DETERMINISTIC = 1048576
kaldi.fstext.properties.NON_O_DETERMINISTIC = 2097152
kaldi.fstext.properties.EPSILONS = 4194304
kaldi.fstext.properties.NO_EPSILONS = 8388608
kaldi.fstext.properties.I_EPSILONS = 16777216
kaldi.fstext.properties.NO_I_EPSILONS = 33554432
kaldi.fstext.properties.O_EPSILONS = 67108864
kaldi.fstext.properties.NO_O_EPSILONS = 134217728
kaldi.fstext.properties.I_LABEL_SORTED = 268435456
kaldi.fstext.properties.NOT_I_LABEL_SORTED = 536870912
kaldi.fstext.properties.O_LABEL_SORTED = 1073741824
kaldi.fstext.properties.NOT_O_LABEL_SORTED = 2147483648
kaldi.fstext.properties.WEIGHTED = 4294967296
kaldi.fstext.properties.UNWEIGHTED = 8589934592
kaldi.fstext.properties.CYCLIC = 17179869184
kaldi.fstext.properties.ACYCLIC = 34359738368
kaldi.fstext.properties.INITIAL_CYCLIC = 68719476736
kaldi.fstext.properties.INITIAL_ACYCLIC = 137438953472
kaldi.fstext.properties.TOP_SORTED = 274877906944
kaldi.fstext.properties.NOT_TOP_SORTED = 549755813888
kaldi.fstext.properties.ACCESSIBLE = 1099511627776
kaldi.fstext.properties.NOT_ACCESSIBLE = 2199023255552
kaldi.fstext.properties.COACCESSIBLE = 4398046511104
kaldi.fstext.properties.NOT_COACCESSIBLE = 8796093022208
kaldi.fstext.properties.STRING = 17592186044416
kaldi.fstext.properties.NOT_STRING = 35184372088832
kaldi.fstext.properties.WEIGHTED_CYCLES = 70368744177664
kaldi.fstext.properties.UNWEIGHTED_CYCLES = 140737488355328
kaldi.fstext.properties.NULL_PROPERTIES = 164284018786304
kaldi.fstext.properties.COPY_PROPERTIES = 281474976645124
kaldi.fstext.properties.INTRINSIC_PROPERTIES = 281474976645123
kaldi.fstext.properties.EXTRINSIC_PROPERTIES = 4
kaldi.fstext.properties.SET_START_PROPERTIES = 225193725198343
kaldi.fstext.properties.SET_FINAL_PROPERTIES = 215491394076679
kaldi.fstext.properties.ADD_STATE_PROPERTIES = 258385232461831
kaldi.fstext.properties.ADD_ARC_PROPERTIES = 76509027631111
kaldi.fstext.properties.SET_ARC_PROPERTIES = 7
kaldi.fstext.properties.DELETE_STATE_PROPERTIES = 141194274603015
kaldi.fstext.properties.DELETE_ARC_PROPERTIES = 152189390880775
kaldi.fstext.properties.STATE_SORT_PROPERTIES = 227873784791047
kaldi.fstext.properties.ARC_SORT_PROPERTIES = 281470950113287
kaldi.fstext.properties.I_LABEL_INVARIANT_PROPERTIES = 281474107441159
kaldi.fstext.properties.O_LABEL_INVARIANT_PROPERTIES = 281471538167815
kaldi.fstext.properties.WEIGHT_INVARIANT_PROPERTIES = 70355859210247
kaldi.fstext.properties.ADD_SUPERFINAL_PROPERTIES = 262506881417223
kaldi.fstext.properties.RM_SUPERFINAL_PROPERTIES = 243539050430471
kaldi.fstext.properties.BINARY_PROPERTIES = 7
kaldi.fstext.properties.TRINARY_PROPERTIES = 281474976645120
kaldi.fstext.properties.POS_TRINARY_PROPERTIES = 93824992215040
kaldi.fstext.properties.NEG_TRINARY_PROPERTIES = 187649984430080
kaldi.fstext.properties.FST_PROPERTIES = 281474976645127
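
The constants above are bit masks. Binary properties occupy one bit each, while trinary properties come in positive/negative pairs so that a property can be known true, known false, or unknown. The sketch below copies a few values from the listing and checks how the composite masks relate to each other; describe_acceptor is a hypothetical helper introduced only for illustration.

```python
# Values copied from the kaldi.fstext.properties listing.
EXPANDED = 1
MUTABLE = 2
ERROR = 4
ACCEPTOR = 65536
NOT_ACCEPTOR = 131072
BINARY_PROPERTIES = 7
TRINARY_PROPERTIES = 281474976645120
POS_TRINARY_PROPERTIES = 93824992215040
NEG_TRINARY_PROPERTIES = 187649984430080
FST_PROPERTIES = 281474976645127
COPY_PROPERTIES = 281474976645124
INTRINSIC_PROPERTIES = 281474976645123


def describe_acceptor(props):
    """Classify a property mask as 'acceptor', 'not acceptor' or 'unknown'."""
    if props & ACCEPTOR:
        return "acceptor"
    if props & NOT_ACCEPTOR:
        return "not acceptor"
    return "unknown"
```

With these values, POS_TRINARY_PROPERTIES | NEG_TRINARY_PROPERTIES equals TRINARY_PROPERTIES, and BINARY_PROPERTIES | TRINARY_PROPERTIES equals FST_PROPERTIES.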

kaldi.fstext.special

Functions

add_subsequential_loop Adds a subsequential symbol loop to the input FST.
compose_context Creates a context FST and composes it on the left with input fst.
compose_context_left_biphone Creates a context FST and composes it on the left with input fst.
compose_deterministic_on_demand_fst Composes an FST with a deterministic on demand FST.
create_ilabel_info_symbol_table Creates a symbol table from the ilabel info and phones symbol table.
determinize_lattice Determinizes lattice.
determinize_star Implements a special determinization with epsilon removal.
determinize_star_in_log Performs determinize_star in place in log semiring.
get_encoding_multiple Returns the smallest multiple of 1000 > nonterm_phones_offset.
push_in_log Push weights/labels in log semiring.
push_special Pushes weights in log semiring in a special way.
read_ilabel_info Reads ilabel info from input stream.
remove_eps_local Removes epsilon arcs locally.
table_compose Performs table composition.
table_compose_cache Performs cached table composition.
table_compose_cache_lattice Performs cached table composition on lattices.
table_compose_lattice Performs table composition on lattices.
write_ilabel_info Writes ilabel info to output stream.

Classes

LatticeTableComposeCache Cache for table compose.
NonterminalValues An enumeration.
ScaleDeterministicOnDemandFst A DeterministicOnDemandFst scaling the weights of another.
StdBackoffDeterministicOnDemandFst Deterministic on demand backoff language model.
StdCacheDeterministicOnDemandFst A DeterministicOnDemandFst caching the arcs of another.
StdComposeDeterministicOnDemandFst A DeterministicOnDemandFst implementing the composition of others.
StdDeterministicOnDemandFst Base class for deterministic on demand FSTs over the tropical semiring.
StdInverseContextFst Inverse of the context FST “C” in “HCLG” over the tropical semiring.
StdInverseLeftBiphoneContextFst Inverse of the left-biphone context FST “C” over the tropical semiring.
StdTableComposeCache Cache for table compose.
StdUnweightedNgramFst A DeterministicOnDemandFst in which states encode an n-gram history.
TableComposeOptions Options for table composition.
TableMatcherOptions Options for table matcher.
class kaldi.fstext.special.LatticeTableComposeCache

Cache for table compose.

Used for doing multiple compositions while caching the same matcher.

This version is for composing FSTs over lattice semiring.

from_compose_opts(opts:TableComposeOptions=default) → LatticeTableComposeCache

Creates a new LatticeTableComposeCache instance.

opts

Table compose options.

class kaldi.fstext.special.NonterminalValues

An enumeration.

kNontermBegin = 1
kNontermBigNumber = 10000000
kNontermBos = 0
kNontermEnd = 2
kNontermMediumNumber = 1000
kNontermReenter = 3
kNontermUserDefined = 4
class kaldi.fstext.special.ScaleDeterministicOnDemandFst

A DeterministicOnDemandFst scaling the weights of another.

For instance, to subtract existing LM scores from a lattice you could use this with a negative weight; and to interpolate LMs you can also use this with weights less than one.

final(state:int) → TropicalWeight

Returns the final weight of the given state.

get_arc(s:int, ilabel:int) -> (success:bool, oarc:StdArc)

Creates an on demand arc and returns it.

Parameters:
  • s (int) – State index.
  • ilabel (int) – Arc label.
Returns:

The created arc.

start() → int

Returns the start state index.
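
To illustrate the idea behind ScaleDeterministicOnDemandFst, here is a pure-Python sketch of a wrapper that scales the costs returned by another on-demand FST. It does not use PyKaldi; ConstantCostLm and ScaledSketch are hypothetical classes, arcs are modeled as plain (ilabel, olabel, cost, nextstate) tuples, and the constructor argument order is an assumption.

```python
class ConstantCostLm:
    """Hypothetical toy on-demand FST: one state, every label costs `cost`."""

    def __init__(self, cost):
        self.cost = cost

    def start(self):
        return 0

    def final(self, state):
        return 0.0

    def get_arc(self, s, ilabel):
        # Returns (success, (ilabel, olabel, cost, nextstate)).
        return True, (ilabel, ilabel, self.cost, 0)


class ScaledSketch:
    """Wraps another on-demand FST and multiplies its costs by `scale`."""

    def __init__(self, scale, inner):
        self.scale = scale
        self.inner = inner

    def start(self):
        return self.inner.start()

    def final(self, state):
        return self.scale * self.inner.final(state)

    def get_arc(self, s, ilabel):
        ok, arc = self.inner.get_arc(s, ilabel)
        if not ok:
            return False, None
        il, ol, cost, nextstate = arc
        return True, (il, ol, self.scale * cost, nextstate)
```

A scale of -1.0 negates every cost, which is what makes subtracting existing LM scores possible when the wrapper is composed with a lattice.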

class kaldi.fstext.special.StdBackoffDeterministicOnDemandFst

Deterministic on demand backoff language model.

This class wraps a conventional Fst, representing a language model, with a “DeterministicOnDemandFst” interface. Backoff arcs in the language model should carry the epsilon label (label 0), and there should be no other epsilons in the language model. The backoff (i.e. epsilon) arcs are followed if a particular arc (or a final-prob) is not found at the current state.

Parameters:fst (StdFst) – Input language model FST.
final(state:int) → TropicalWeight

Returns the final weight of the given state.

get_arc(s:int, ilabel:int) -> (success:bool, oarc:StdArc)

Creates an on demand arc and returns it.

Parameters:
  • s (int) – State index.
  • ilabel (int) – Arc label.
Returns:

The created arc.

start() → int

Returns the start state index.
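
The backoff behavior described for StdBackoffDeterministicOnDemandFst can be sketched in plain Python as follows. This is a toy model rather than the wrapped class: BackoffLmSketch is hypothetical, the language model is a nested dict, costs are added as in the tropical semiring, and backoff arcs are assumed not to form a cycle.

```python
EPSILON = 0  # label used on backoff arcs


class BackoffLmSketch:
    """Toy backoff LM: arcs[state][ilabel] = (cost, nextstate)."""

    def __init__(self, arcs, finals, start=0):
        self.arcs = arcs
        self.finals = finals  # finals[state] = final cost
        self._start = start

    def start(self):
        return self._start

    def get_arc(self, s, ilabel):
        cost = 0.0
        while True:
            out = self.arcs.get(s, {})
            if ilabel in out:
                arc_cost, nextstate = out[ilabel]
                return True, (ilabel, ilabel, cost + arc_cost, nextstate)
            if EPSILON not in out:
                return False, None  # no match and nowhere to back off to
            backoff_cost, s = out[EPSILON]
            cost += backoff_cost  # accumulate the backoff penalty

    def final(self, state):
        cost = 0.0
        while True:
            if state in self.finals:
                return cost + self.finals[state]
            out = self.arcs.get(state, {})
            if EPSILON not in out:
                return float("inf")  # not final, even after backing off
            backoff_cost, state = out[EPSILON]
            cost += backoff_cost


# State 1 has no arc for label 1, so the lookup backs off to state 0.
toy_lm = BackoffLmSketch(
    arcs={0: {1: (1.0, 1)}, 1: {EPSILON: (0.5, 0)}},
    finals={0: 0.25},
)
```

Here toy_lm.get_arc(1, 1) follows the backoff arc out of state 1 and then matches label 1 at state 0, so the returned cost is the backoff cost plus the arc cost.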

class kaldi.fstext.special.StdCacheDeterministicOnDemandFst

A DeterministicOnDemandFst caching the arcs of another.

final(state:int) → TropicalWeight

Returns the final weight of the given state.

get_arc(s:int, ilabel:int) -> (success:bool, oarc:StdArc)

Creates an on demand arc and returns it.

Parameters:
  • s (int) – State index.
  • ilabel (int) – Arc label.
Returns:

The created arc.

start() → int

Returns the start state index.
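
The caching idea behind StdCacheDeterministicOnDemandFst can be illustrated with a small memoizing wrapper. This is a sketch under simplifying assumptions: it uses an unbounded dict, whereas the real class may bound its cache size, and both CountingLm and CachedSketch are hypothetical names.

```python
class CountingLm:
    """Hypothetical inner FST that counts how often get_arc is called."""

    def __init__(self):
        self.calls = 0

    def get_arc(self, s, ilabel):
        self.calls += 1
        return True, (ilabel, ilabel, 1.0, s)


class CachedSketch:
    """Remembers the result of each (state, ilabel) lookup."""

    def __init__(self, inner):
        self.inner = inner
        self.cache = {}

    def get_arc(self, s, ilabel):
        key = (s, ilabel)
        if key not in self.cache:
            # Only the first lookup reaches the wrapped FST.
            self.cache[key] = self.inner.get_arc(s, ilabel)
        return self.cache[key]
```

Repeated lookups of the same (state, label) pair are answered from the cache, which matters when the wrapped FST is expensive to query.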

class kaldi.fstext.special.StdComposeDeterministicOnDemandFst

A DeterministicOnDemandFst implementing the composition of others.

final(state:int) → TropicalWeight

Returns the final weight of the given state.

get_arc(s:int, ilabel:int) -> (success:bool, oarc:StdArc)

Creates an on demand arc and returns it.

Parameters:
  • s (int) – State index.
  • ilabel (int) – Arc label.
Returns:

The created arc.

start() → int

Returns the start state index.

class kaldi.fstext.special.StdDeterministicOnDemandFst

Base class for deterministic on demand FSTs over the tropical semiring.

final(state:int) → TropicalWeight

Returns the final weight of the given state.

get_arc(s:int, ilabel:int) -> (success:bool, oarc:StdArc)

Creates an on demand arc and returns it.

Parameters:
  • s (int) – State index.
  • ilabel (int) – Arc label.
Returns:

The created arc.

start() → int

Returns the start state index.

class kaldi.fstext.special.StdInverseContextFst

Inverse of the context FST “C” in “HCLG” over the tropical semiring.

InverseContextFst represents the inverse of the context FST “C” (the “C” in “HCLG”) which transduces from symbols representing phone context windows (e.g. “a, b, c”) to individual phones, e.g. “a”. So InverseContextFst transduces from phones to symbols representing phone context windows. The point is that the inverse is deterministic, so the DeterministicOnDemandFst interface is applicable, which turns out to be a convenient way to implement this.

This doesn’t implement the full Fst interface; it implements the DeterministicOnDemandFst interface, which is much simpler and is sufficient for what we need to do with this.

Search for “hbka.pdf” (“Speech Recognition with Weighted Finite State Transducers”) by M. Mohri, for more context.

Parameters:
  • subsequential_symbol (int) – Integer index of the subsequential symbol.
  • phones (List[int]) – Integer indices for the phones.
  • disambig_syms (List[int]) – Integer indices for disambiguation symbols.
  • context_width (int) – Size of context window.
  • central_position (int) – Position of central phone in context window, from 0..N-1.
final(state:int) → TropicalWeight

Returns the final weight of the given state.

get_arc(s:int, ilabel:int) -> (success:bool, oarc:StdArc)

Creates an on demand arc and returns it.

Parameters:
  • s (int) – State index.
  • ilabel (int) – Arc label.
Returns:

The created arc.

ilabel_info() → list<list<int>>

Returns input label info.

start() → int

Returns the start state index.
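
As a rough illustration of what "symbols representing phone context windows" means for StdInverseContextFst, the sketch below maps a phone sequence to its context windows. It is a deliberate simplification: it ignores disambiguation symbols and the subsequential symbol, pads positions outside the sequence with 0, and context_windows is a hypothetical helper rather than part of the API.

```python
def context_windows(phones, context_width, central_position):
    """Return one context window (a tuple) per phone in the sequence."""
    left = central_position                       # phones of left context
    right = context_width - central_position - 1  # phones of right context
    padded = [0] * left + list(phones) + [0] * right
    return [tuple(padded[i:i + context_width]) for i in range(len(phones))]


# Triphone context: context_width=3, central_position=1.
triphone_windows = context_windows([10, 20, 30], 3, 1)

# Left-biphone context: context_width=2, central_position=1.
biphone_windows = context_windows([10, 20], 2, 1)
```

In the real transducer each distinct window is assigned an integer label, and ilabel_info() returns the list that maps those labels back to their windows.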

class kaldi.fstext.special.StdInverseLeftBiphoneContextFst

Inverse of the left-biphone context FST “C” over the tropical semiring.

This does not take the arguments ‘context_width’ or ‘central_position’ because they are assumed to be (2, 1) meaning a system with left-biphone context; and there is no subsequential symbol because it is not needed in systems without right context.

Parameters:
  • nonterm_phones_offset (int) – Integer index of the first non-terminal symbol. Set to a large value, e.g. 1 million, if not using non-terminals.
  • phones (List[int]) – Integer indices for the phones.
  • disambig_syms (List[int]) – Integer indices for disambiguation symbols.
final(state:int) → TropicalWeight

Returns the final weight of the given state.

get_arc(s:int, ilabel:int) -> (success:bool, oarc:StdArc)

Creates an on demand arc and returns it.

Parameters:
  • s (int) – State index.
  • ilabel (int) – Arc label.
Returns:

The created arc.

ilabel_info() → list<list<int>>

Returns input label info.

start() → int

Returns the start state index.

class kaldi.fstext.special.StdTableComposeCache

Cache for table compose.

Used for doing multiple compositions while caching the same matcher.

from_compose_opts(opts:TableComposeOptions=default) → StdTableComposeCache

Creates a new StdTableComposeCache instance.

opts

Table compose options.

class kaldi.fstext.special.StdUnweightedNgramFst

A DeterministicOnDemandFst in which states encode an n-gram history.

Conceptually, for n-gram order n and k labels, the FST is an unweighted acceptor with about k^(n-1) states (ignoring end effects). However, the FST is created on demand and doesn’t need the label vocabulary; get_arc matches on any input label. This class is primarily used by compose_deterministic_on_demand_fst to expand the n-gram history of lattices.

Parameters:n (int) – N-gram order.
final(state:int) → TropicalWeight

Returns the final weight of the given state.

get_arc(s:int, ilabel:int) -> (success:bool, oarc:StdArc)

Creates an on demand arc and returns it.

Parameters:
  • s (int) – State index.
  • ilabel (int) – Arc label.
Returns:

The created arc.

start() → int

Returns the start state index.
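
The way StdUnweightedNgramFst lets states encode an n-gram history can be sketched in plain Python. NgramHistorySketch is a hypothetical toy class: it creates states lazily, accepts any input label, and keeps only the last n-1 labels as the history. The arc tuple layout (ilabel, olabel, cost, nextstate) is an assumption for illustration.

```python
class NgramHistorySketch:
    """Toy on-demand acceptor whose states are n-gram histories."""

    def __init__(self, n):
        self.n = n
        self.state_of = {(): 0}   # history tuple -> state id
        self.history_of = [()]    # state id -> history tuple

    def start(self):
        return 0

    def get_arc(self, s, ilabel):
        keep = self.n - 1
        history = self.history_of[s] + (ilabel,)
        # Truncate to the last n-1 labels (empty history for unigrams).
        history = history[-keep:] if keep > 0 else ()
        if history not in self.state_of:
            self.state_of[history] = len(self.history_of)
            self.history_of.append(history)
        # Unweighted: every arc has cost 0.0.
        return True, (ilabel, ilabel, 0.0, self.state_of[history])
```

Two paths that end in the same last n-1 labels reach the same state, which is why the number of states is bounded by roughly k^(n-1) for k labels.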

class kaldi.fstext.special.TableComposeOptions

Options for table composition.

connect

Connect output

filter_type

Which pre-defined filter to use.

from_matcher_opts(mo:TableMatcherOptions, connect:bool=default, filter_type:ComposeFilter=default, table_match_type:MatchType=default) → TableComposeOptions

Creates a new TableComposeOptions instance.

min_table_size

Minimum table size.

table_match_type

Type of table match.

table_ratio

Construct the table if it would be at least this full.

class kaldi.fstext.special.TableMatcherOptions

Options for table matcher.

Table matcher is a matcher specialized for the case where the output side of the left FST always has either all-epsilons coming out of a state, or a majority of the symbol table. Therefore we can either store nothing (for the all-epsilon case) or store a lookup table from labels to arc offsets. Since the table matcher has to iterate over all arcs in each left-hand state the first time it sees it, this matcher type is not efficient if you compose with something very small on the right – unless you do it multiple times and keep the matcher around.

Table matcher class is not exposed to Python code directly. Instances of TableMatcherOptions can be passed to table_compose() and TableComposeCache for controlling the table matcher behavior.

min_table_size

Minimum table size.

table_ratio

Construct the table if it would be at least this full.

kaldi.fstext.special.add_subsequential_loop(subseq_symbol:int, fst:StdMutableFst)

Adds a subsequential symbol loop to the input FST.

Modifies the FST so that it transduces the same paths, but the input side of the paths can all have the subsequential symbol ‘$’ appended to them any number of times.

Parameters:
  • subseq_symbol (int) – Integer index for the subsequential symbol.
  • fst (StdMutableFst) – Input FST, modified in place.
kaldi.fstext.special.compose_context(disambig_syms, N, P, ifst)[source]

Creates a context FST and composes it on the left with input fst.

Outputs the label information along with the composed FST. Input FST should be mutable since the algorithm adds the subsequential loop to it.

Parameters:
  • disambig_syms (List[int]) – Disambiguation symbols.
  • N (int) – Size of context window.
  • P (int) – Position of central phone in context window, from 0..N-1.
  • ifst (StdFst) – Input FST.
Returns:

Output fst, label information tuple.

Return type:

Tuple[StdVectorFst, List[List[int]]]

kaldi.fstext.special.compose_context_left_biphone(nonterm_phones_offset:int, disambig_syms:list<int>, ifst:StdVectorFst, ofst:StdVectorFst) → list<list<int>>

Creates a context FST and composes it on the left with input fst.

This is a variant of compose_context() that is to be used with the “grammar FST” framework. It does not take the ‘context_width’ and ‘central_position’ arguments because they are assumed to be 2 and 1 respectively (meaning left-biphone phonetic context).

Parameters:
  • nonterm_phones_offset (int) – The integer index of the first non-terminal symbol.
  • disambig_syms (List[int]) – Disambiguation symbols.
  • ifst (StdVectorFst) – Input FST.
  • ofst (StdVectorFst) – Output FST.
Returns:

Label information.

Return type:

List[List[int]]

kaldi.fstext.special.compose_deterministic_on_demand_fst(fst1, fst2, inverse=False)[source]

Composes an FST with a deterministic on demand FST.

If inverse is True, computes ofst = Compose(Inverse(fst2), fst1). Note that the arguments are reversed in this case.

This function does not trim its output.

Parameters:
  • fst1 (StdFst) – The input FST.
  • fst2 (StdDeterministicOnDemandFst) – The input deterministic on demand FST.
  • inverse (bool) – Deterministic FST on the left?
Returns:

A composed FST.

kaldi.fstext.special.create_ilabel_info_symbol_table(info:list<list<int>>, phones_symtab:SymbolTable, separator:str, disambig_prefix:str) → SymbolTable

Creates a symbol table from the ilabel info and phones symbol table.

This is mainly used for debugging.

kaldi.fstext.special.determinize_lattice(ifst, compact_output=True, delta=0.0009765625, max_mem=-1, max_loop=-1)[source]

Determinizes lattice.

Implements a special form of determinization with epsilon removal, optimized for a phase of lattice generation.

See kaldi/src/fstext/determinize-lattice.h for details.

Parameters:
  • ifst (LatticeFst) – Input lattice.
  • compact_output (bool) – Whether the output is a compact lattice.
  • delta (float) – Comparison/quantization delta.
  • max_mem (int) – If positive, determinization will fail when the algorithm’s (approximate) memory consumption crosses this threshold.
  • max_loop (int) – If positive, can be used to detect non-determinizable input (a case that wouldn’t be caught by max_mem).
Returns:

A determinized lattice.

Raises:

RuntimeError – If determinization fails.

kaldi.fstext.special.determinize_star(ifst, delta=0.0009765625, max_states=-1, allow_partial=False)[source]

Implements a special determinization with epsilon removal.

See kaldi/src/fstext/determinize-star.h for details.

Parameters:
  • ifst (StdFst) – Input fst over the tropical semiring.
  • delta (float) – Comparison/quantization delta.
  • max_states (int) – If positive, determinization will fail when max states is reached.
  • allow_partial (bool) – If True, the algorithm will output partial results when the specified max states is reached (when larger than zero), instead of raising an exception.
Returns:

A determinized FST.

Raises:

RuntimeError – If determinization fails.

kaldi.fstext.special.determinize_star_in_log(fst:StdVectorFst, delta:float=default, max_states:int=default)

Performs determinize_star in place in log semiring.

Parameters:
  • fst (StdVectorFst) – Input FST over the tropical semiring, determinized in place.
  • delta (float) – Comparison/quantization delta.
  • max_states (int) – If positive, determinization will fail when max states is reached.
Raises:

RuntimeError – If determinization fails.

See Also: determinize_star()

kaldi.fstext.special.get_encoding_multiple(nonterm_phones_offset:int) → int

Returns the smallest multiple of 1000 > nonterm_phones_offset.
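
Read literally, the description corresponds to the arithmetic below. This is a sketch of the documented statement only: encoding_multiple_sketch is a hypothetical name, the constant 1000 matches kNontermMediumNumber in NonterminalValues, and how the wrapped function treats an offset that is already an exact multiple of 1000 is not confirmed here.

```python
def encoding_multiple_sketch(nonterm_phones_offset, medium_number=1000):
    """Smallest multiple of medium_number strictly greater than the offset."""
    return (nonterm_phones_offset // medium_number + 1) * medium_number
```

For example, an offset of 347 gives 1000 and an offset of 1001 gives 2000.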

kaldi.fstext.special.push_in_log(ifst, push_weights=False, push_labels=False, remove_common_affix=False, remove_total_weight=False, to_final=False, delta=0.0009765625)[source]

Push weights/labels in log semiring.

Destructively pushes weights/labels towards initial or final states.

Parameters:
  • ifst (StdVectorFst) – Input FST over the tropical semiring, modified in place.
  • push_weights – Should weights be pushed?
  • push_labels – Should labels be pushed?
  • remove_common_affix – If pushing labels, should common prefix/suffix be removed?
  • remove_total_weight – If pushing weights, should total weight be removed?
  • to_final – Push towards final states?
  • delta – Comparison/quantization delta (default: 0.0009765625).
kaldi.fstext.special.push_special(fst:StdVectorFst, delta:float=default)

Pushes weights in log semiring in a special way.

Destructively pushes weights in the log semiring such that any leftover weight after pushing gets distributed evenly along the FST, and doesn’t end up either at the start or at the end. Basically it pushes the weights such that the total weight of each state (i.e. the sum of the arc probabilities plus the final-prob) is the same for all states.

Parameters:
  • fst (StdVectorFst) – Input FST over the tropical semiring, modified in place.
  • delta – Comparison/quantization delta (default: 0.0009765625).
kaldi.fstext.special.read_ilabel_info(is:istream, binary:bool) → list<list<int>>

Reads ilabel info from input stream.

kaldi.fstext.special.remove_eps_local(fst, special=False)[source]

Removes epsilon arcs locally.

Removes some (but not necessarily all) epsilons in an FST, using an algorithm that is guaranteed to never increase the number of arcs in the FST (and will also never increase the number of states).

See kaldi/src/fstext/remove-eps-local.h for details.

Parameters:
  • fst (StdVectorFst) – Input fst over the tropical semiring.
  • special (bool) – Preserve stochasticity when casting to log semiring.
kaldi.fstext.special.table_compose(ifst1:StdFst, ifst2:StdFst, ofst:StdMutableFst, opts:TableComposeOptions=default)

Performs table composition.

kaldi.fstext.special.table_compose_cache(ifst1:StdFst, ifst2:StdFst, ofst:StdMutableFst, cache:StdTableComposeCache)

Performs cached table composition.

kaldi.fstext.special.table_compose_cache_lattice(ifst1:LatticeFst, ifst2:LatticeFst, ofst:LatticeMutableFst, cache:LatticeTableComposeCache)

Performs cached table composition on lattices.

kaldi.fstext.special.table_compose_lattice(ifst1:LatticeFst, ifst2:LatticeFst, ofst:LatticeMutableFst, opts:TableComposeOptions=default)

Performs table composition on lattices.

kaldi.fstext.special.write_ilabel_info(os:ostream, binary:bool, info:list<list<int>>)

Writes ilabel info to output stream.

kaldi.fstext.utils

Functions

acoustic_lattice_scale Returns a 2x2 matrix for scaling acoustic cost in lattice weights.
apply_probability_scale Applies a probability scale to the FST.
cast_log_to_std Casts FST in log semiring to tropical semiring.
cast_std_to_log Casts FST in tropical semiring to log semiring.
clear_symbols Sets all input/output labels of the FST to zero.
compact_lattice_has_alignment Checks if compact lattice has state-level alignments.
convert_compact_lattice_to_lattice Converts compact lattice to lattice.
convert_lattice_to_compact_lattice Converts lattice to compact lattice.
convert_lattice_to_std Converts lattice to FST over tropical semiring.
convert_nbest_to_list Converts n-best FST to a list of FSTs.
convert_std_to_lattice Converts FST over tropical semiring to lattice.
default_lattice_scale Returns a default 2x2 matrix for scaling lattice weights.
equal_align Generates sequences from the input FST with exactly “length” symbols.
following_input_symbols_are_same Checks if all arcs exiting any state have the same input symbol.
get_input_symbols Gets input labels of the FST as a sorted unique list.
get_linear_symbol_sequence Extracts linear symbol sequences from the input FST.
get_output_symbols Gets output labels of the FST as a sorted unique list.
get_symbols Gets labels in the symbol table as a sorted unique list.
graph_lattice_scale Returns a 2x2 matrix for scaling graph cost in lattice weights.
highest_numbered_input_symbol Returns the highest numbered input label of the FST (zero if FST is empty).
highest_numbered_output_symbol Returns the highest numbered output label of the FST (zero if FST is empty).
is_stochastic_fst Checks if FST is stochastic.
is_stochastic_fst_in_log Checks if FST is stochastic in log semiring.
lattice_scale Returns a 2x2 matrix for scaling graph and acoustic costs in lattice weights.
make_following_input_symbols_same Ensures that all arcs exiting any state have the same input symbol.
make_linear_acceptor Creates an unweighted linear acceptor from the label sequence.
make_linear_acceptor_with_alternatives Creates an unweighted acceptor with a linear structure.
make_preceding_input_symbols_same Ensures that all arcs entering any state have the same input symbol.
map_input_symbols Maps input labels to labels given in the symbol map.
minimize_encoded_std_fst Minimizes FST in place after encoding labels and weights.
nbest_as_fsts Outputs (up to) n-best paths in the FST as a list of FSTs.
phi_compose Performs composition by handling phi (failure) transitions.
phi_compose_lattice Performs lattice composition by handling phi (failure) transitions.
preceding_input_symbols_are_same Checks if all arcs entering any state have the same input symbol.
propagate_final Propagates final-probs through “phi” transitions.
remove_alignments_from_compact_lattice Removes state-level alignments in a compact lattice.
remove_some_input_symbols Replaces given input labels with zeros.
remove_useless_arcs Removes arcs that are not on best paths for any input symbol sequence.
remove_weights Removes FST weights.
rho_compose Performs composition by handling rho transitions.
safe_determinize_minimize_wrapper Performs safe determinization and minimization.
safe_determinize_minimize_wrapper_in_log Performs safe determinization and minimization in log semiring.
safe_determinize_wrapper Performs safe determinization.
scale_compact_lattice Scales the compact lattice weights.
scale_lattice Scales the lattice weights.
kaldi.fstext.utils.acoustic_lattice_scale(acwt:float) → list<list<float>>

Returns a 2x2 matrix for scaling acoustic cost in lattice weights.

kaldi.fstext.utils.apply_probability_scale(scale:float, fst:StdMutableFst)

Applies a probability scale to the FST.

This is applicable to FSTs in the log or tropical semiring. It multiplies the arc and final weights by scale [this is not the multiplication operation of the semiring, it’s actual multiplication, which is equivalent to taking a power in the semiring].
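
The equivalence between multiplying a weight by a scale and raising the underlying probability to a power can be checked numerically. The sketch assumes weights are costs, i.e. negative log probabilities; scale_cost and cost_to_prob are hypothetical helpers and do not touch any FST.

```python
import math


def scale_cost(cost, scale):
    """Multiply a cost (negative log probability) by a plain real factor."""
    return scale * cost


def cost_to_prob(cost):
    """Convert a cost back to the probability it encodes."""
    return math.exp(-cost)


p = 0.25
cost = -math.log(p)
scaled_prob = cost_to_prob(scale_cost(cost, 2.0))
```

Scaling the cost by 2.0 yields the same probability as p ** 2.0, up to floating-point rounding.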

kaldi.fstext.utils.cast_log_to_std(ifst:LogVectorFst) → StdVectorFst

Casts FST in log semiring to tropical semiring.

kaldi.fstext.utils.cast_std_to_log(ifst:StdVectorFst) → LogVectorFst

Casts FST in tropical semiring to log semiring.

kaldi.fstext.utils.clear_symbols(clear_input:bool, clear_output:bool, fst:StdMutableFst)

Sets all input/output labels of the FST to zero.

Does not alter symbol tables.

kaldi.fstext.utils.compact_lattice_has_alignment(fst:CompactLatticeExpandedFst) → bool

Checks if compact lattice has state-level alignments.

kaldi.fstext.utils.convert_compact_lattice_to_lattice(ifst, invert=True)[source]

Converts compact lattice to lattice.

Parameters:
  • ifst (CompactLatticeFst) – The input compact lattice.
  • invert (bool) – Invert input and output labels.
Returns:

The output lattice.

Return type:

LatticeVectorFst

kaldi.fstext.utils.convert_lattice_to_compact_lattice(ifst, invert=True)[source]

Converts lattice to compact lattice.

Parameters:
  • ifst (LatticeFst) – The input lattice.
  • invert (bool) – Invert input and output labels.
Returns:

The output compact lattice.

Return type:

CompactLatticeVectorFst

kaldi.fstext.utils.convert_lattice_to_std(ifst)[source]

Converts lattice to FST over tropical semiring.

Parameters:ifst (LatticeFst) – The input lattice.
Returns:The output FST.
Return type:StdVectorFst
kaldi.fstext.utils.convert_nbest_to_list(fst:StdFst) → list<StdVectorFst>

Converts n-best FST to a list of FSTs.

kaldi.fstext.utils.convert_std_to_lattice(ifst)[source]

Converts FST over tropical semiring to lattice.

Parameters:ifst (StdFst) – The input FST.
Returns:The output lattice.
Return type:LatticeVectorFst
kaldi.fstext.utils.default_lattice_scale() → list<list<float>>

Returns a default 2x2 matrix for scaling lattice weights.

kaldi.fstext.utils.equal_align(ifst:StdFst, length:int, rand_seed:int, ofst:StdMutableFst, num_retries:int=default) → bool

Generates sequences from the input FST with exactly “length” symbols.

This is similar to randgen, but it generates a sequence with exactly “length” input symbols. It returns True on success, False on failure (failure is partly random but should never happen in practice for normal speech models). It generates a random path through the input FST, finds out which subset of the states it visits along the way have self-loops with input symbols on them, and outputs a path with exactly enough self-loops to have the requested number of input symbols. Note that EqualAlign does not use the probabilities on the FST. It just uses equal probabilities in the first stage of selection (since the output will not be a truly random sample from the FST anyway). The input FST “ifst” must be connected or this may enter an infinite loop.

kaldi.fstext.utils.following_input_symbols_are_same(end_is_epsilon:bool, fst:StdFst) → bool

Checks if all arcs exiting any state have the same input symbol.

Returns true if and only if the FST is such that the input symbols on arcs exiting any given state all have the same value. If end_is_epsilon == True, treats final-states as epsilon output arcs [i.e. ensures only epsilons can exit final-states].
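
The condition being checked can be sketched on a toy representation, where each state maps to the list of input labels on its outgoing arcs. The function name and data layout below are hypothetical and only model the description above.

```python
def following_input_symbols_are_same_sketch(arcs, finals, end_is_epsilon):
    """arcs: {state: [ilabel, ...]}, finals: set of final states."""
    for state in set(arcs) | set(finals):
        labels = set(arcs.get(state, []))
        if end_is_epsilon and state in finals:
            # Treat finality like an extra epsilon (label 0) arc.
            labels.add(0)
        if len(labels) > 1:
            return False
    return True


toy_arcs = {0: [3, 3], 1: [4]}
toy_finals = {1}
```

With end_is_epsilon set to False the toy FST passes the check; with True it fails, because state 1 is final and also has an outgoing arc labeled 4.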

kaldi.fstext.utils.get_input_symbols(fst:StdFst, include_eps:bool) → list<int>

Gets input labels of the FST as a sorted unique list.

kaldi.fstext.utils.get_linear_symbol_sequence(fst)[source]

Extracts linear symbol sequences from the input FST.

Parameters:fst – The input FST.
Returns:The tuple (isymbols, osymbols, total_weight).
kaldi.fstext.utils.get_output_symbols(fst:StdFst, include_eps:bool) → list<int>

Gets output labels of the FST as a sorted unique list.

kaldi.fstext.utils.get_symbols(symtab:SymbolTable, include_eps:bool) → list<int>

Gets labels in the symbol table as a sorted unique list.

kaldi.fstext.utils.graph_lattice_scale(lmwt:float) → list<list<float>>

Returns a 2x2 matrix for scaling graph cost in lattice weights.

kaldi.fstext.utils.highest_numbered_input_symbol(fst:StdFst) → int

Returns the highest numbered input label of the FST (zero if FST is empty).

kaldi.fstext.utils.highest_numbered_output_symbol(fst:StdFst) → int

Returns the highest numbered output label of the FST (zero if FST is empty).

kaldi.fstext.utils.is_stochastic_fst(fst:StdFst, delta:float=default, min_sum:TropicalWeight=default, max_sum:TropicalWeight=default) → bool

Checks if FST is stochastic.

This function returns true if, in the semiring of the FST, the sum (within the semiring) of all the arcs out of each state in the FST is one, to within delta.

Parameters:
  • fst – The FST that we are testing.
  • delta – The tolerance to within which we test equality to 1.
  • min_sum – If provided, it will be set to the minimum sum of weights.
  • max_sum – If provided, it will be set to the maximum sum of weights.
Returns:

True if the FST is stochastic, and False otherwise.

kaldi.fstext.utils.is_stochastic_fst_in_log(fst:StdFst, delta:float=default, min_sum:TropicalWeight=default, max_sum:TropicalWeight=default) → bool

Checks if FST is stochastic in log semiring.

This function returns true if, in the log semiring, the weights of all the arcs out of each state in the FST sum to one, to within delta.

Parameters:
  • fst – The FST that we are testing.
  • delta – The tolerance to within which we test equality to 1.
  • min_sum – If provided, it will be set to the minimum sum of weights.
  • max_sum – If provided, it will be set to the maximum sum of weights.
Returns:

True if the FST is stochastic, and False otherwise.
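In the log semiring, costs are negative log probabilities and the sum is a log-add, so a state is stochastic when its outgoing probabilities add up to one. The following pure-Python sketch shows the arithmetic only; it is not PyKaldi code, and the names log_plus and is_stochastic_log are hypothetical.

```python
import math


def log_plus(a, b):
    """Return -log(exp(-a) + exp(-b)), computed in a numerically stable way."""
    if a == math.inf:
        return b
    if b == math.inf:
        return a
    lo, hi = min(a, b), max(a, b)
    return lo - math.log1p(math.exp(lo - hi))


def is_stochastic_log(costs_by_state, delta=1.0 / 1024):
    """costs_by_state maps state -> list of outgoing costs (-log probs)."""
    for costs in costs_by_state.values():
        total = math.inf  # zero of the log semiring
        for cost in costs:
            total = log_plus(total, cost)
        if abs(total) > delta:  # one of the log semiring is cost 0.0
            return False
    return True
```

Two arcs with probability 0.5 each (cost log 2) pass the check, while a single such arc does not.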

kaldi.fstext.utils.lattice_scale(lmwt:float, acwt:float) → list<list<float>>

Returns a 2x2 matrix for scaling graph and acoustic costs in lattice weights.
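The shape of these scaling matrices can be sketched in plain Python. The diagonal layout below assumes the usual Kaldi convention that the graph cost is the first weight component and the acoustic cost is the second; the helpers are stand-ins written for illustration, not the PyKaldi functions themselves.

```python
def lattice_scale(lmwt, acwt):
    """2x2 diagonal matrix scaling graph cost by lmwt and acoustic cost by acwt."""
    return [[lmwt, 0.0], [0.0, acwt]]


def graph_lattice_scale(lmwt):
    """Same matrix with the acoustic cost left unscaled."""
    return lattice_scale(lmwt, 1.0)
```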

kaldi.fstext.utils.make_following_input_symbols_same(end_is_epsilon:bool, fst:StdMutableFst)

Ensures that all arcs exiting any state have the same input symbol.

Detects states that have differing input symbols going out, and inserts, for each of the following arcs with non-epsilon input symbol, a new dummy state that has an epsilon link from the fst state. The output symbol and weight stay on the link to the dummy state (in order to keep the FST output-deterministic and stochastic, if it already was). If end_is_epsilon == True, treats “being a final-state” like having an epsilon output link.

kaldi.fstext.utils.make_linear_acceptor(labels:list<int>, ofst:StdMutableFst)

Creates an unweighted linear acceptor from the label sequence.

kaldi.fstext.utils.make_linear_acceptor_with_alternatives(labels:list<list<int>>, ofst:StdMutableFst)

Creates an unweighted acceptor with a linear structure.

Each position in the input list is a list of labels. Each position must have at least one alternative. Epsilon/0 is treated like a normal symbol.
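The structure of such an acceptor can be sketched without PyKaldi by listing its arcs. In this hypothetical helper, each arc is a (source, ilabel, olabel, destination) tuple; an ordinary linear acceptor is the special case with exactly one label per position.

```python
def make_linear_acceptor_with_alternatives(alternatives):
    """Build a toy unweighted acceptor with one state per position boundary."""
    arcs = []
    for position, labels in enumerate(alternatives):
        if not labels:
            raise ValueError("each position needs at least one label")
        for label in labels:
            # An acceptor carries the same label on input and output.
            arcs.append((position, label, label, position + 1))
    return {"start": 0, "final": len(alternatives), "arcs": arcs}
```

Label 0 is added like any other label here, in line with the note that epsilon is treated as a normal symbol.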

kaldi.fstext.utils.make_preceding_input_symbols_same(start_is_epsilon:bool, fst:StdMutableFst)

Ensures that all arcs entering any state have the same input symbol.

Detects states that have differing input symbols going in, and inserts, for each of the preceding arcs with non-epsilon input symbol, a new dummy state that has an epsilon link to the fst state. If start_is_epsilon == True, ensures that start-state can have only epsilon-links into it.

kaldi.fstext.utils.map_input_symbols(symbol_map:list<int>, fst:StdMutableFst)

Maps input labels to labels given in the symbol map.

kaldi.fstext.utils.minimize_encoded_std_fst(fst:StdVectorFst, delta:float=default)

Minimizes FST in place after encoding labels and weights.

Similar to the minimize operation, except that it pushes neither the weights nor the labels.

Parameters:
  • fst (StdVectorFst) – Input FST.
  • delta (float) – Quantization delta (default=0.0009765625).

kaldi.fstext.utils.nbest_as_fsts(fst:StdFst, n:int) → list<StdVectorFst>

Outputs (up to) n-best paths in the FST as a list of FSTs.

kaldi.fstext.utils.phi_compose(fst1:StdFst, fst2:StdFst, phi_label:int, ofst:StdMutableFst)

Performs composition by handling phi (failure) transitions.

This is a version of composition where the right hand FST (fst2) is treated as a backoff language model, with the phi symbol (e.g. #0) treated as a “failure transition”, only taken when there is no match for the requested symbol.
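The failure-transition rule can be illustrated with a small pure-Python lookup: a requested label is matched directly when possible, and the phi arc is followed only when the current state has no match. This is a sketch of the matching rule, not of the full composition algorithm, and the names and data layout are hypothetical.

```python
def match_with_phi(arcs, state, label, phi_label):
    """Find the arc for label from state, backing off through phi arcs.

    arcs maps state -> {label: (cost, next_state)}. Returns (cost, next_state)
    with backoff costs added, or None if no match exists. Assumes the phi
    arcs do not form a cycle.
    """
    cost = 0.0
    while True:
        out = arcs.get(state, {})
        if label in out:
            arc_cost, next_state = out[label]
            return cost + arc_cost, next_state
        if phi_label not in out:
            return None
        phi_cost, state = out[phi_label]  # take the failure transition
        cost += phi_cost
```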

kaldi.fstext.utils.phi_compose_lattice(fst1:LatticeFst, fst2:LatticeFst, phi_label:int, ofst:LatticeMutableFst)

Performs lattice composition by handling phi (failure) transitions.

This is a version of composition where the right hand FST (fst2) is treated as a backoff language model, with the phi symbol (e.g. #0) treated as a “failure transition”, only taken when there is no match for the requested symbol.

kaldi.fstext.utils.preceding_input_symbols_are_same(start_is_epsilon:bool, fst:StdFst) → bool

Checks if all arcs entering any state have the same input symbol.

Returns true if and only if the FST is such that the input symbols on arcs entering any given state all have the same value. If start_is_epsilon == True, treats start-state as an epsilon input arc [i.e. ensures only epsilons can enter start-state].
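The condition being tested can be written down in a few lines of plain Python. The sketch below works on a list of (source, ilabel, destination) tuples rather than a PyKaldi FST, and the function is a hypothetical stand-in for illustration.

```python
def preceding_input_symbols_are_same(arcs, start, start_is_epsilon):
    """Return True if all arcs entering each state share one input label."""
    seen = {}
    if start_is_epsilon:
        seen[start] = 0  # behave as if an epsilon arc enters the start state
    for _, ilabel, dst in arcs:
        # setdefault records the first label seen for dst and returns it.
        if seen.setdefault(dst, ilabel) != ilabel:
            return False
    return True
```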

kaldi.fstext.utils.propagate_final(phi_label:int, fst:StdMutableFst)

Propagates final-probs through “phi” transitions.

Note that here, phi_label may be epsilon. If you have a backoff language model with special symbols (“phi”) on the backoff arcs instead of epsilon, you may use phi_compose() to compose with it, but this will not handle final probabilities correctly. To fix this, first call propagate_final() on the FST that contains the phi transitions (fst2 in phi_compose()).

If a state does not have a final-prob but has a phi transition, this function sets the state’s final-prob to (phi-prob * final-prob-of-dest-state). It does this recursively, i.e. it follows phi transitions on the destination state first. The result behaves as if there were a super-final state with a special symbol leading to it from each currently final state.

Note that this may not behave as desired if there are epsilons in your FST; it might be better to remove those before calling this function.
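The recursive rule for final probabilities can be sketched in pure Python using costs, where a product of probabilities becomes a sum of costs. This is an illustration under simplifying assumptions (at most one phi arc per state, no other arcs considered), and the function and its dictionary arguments are hypothetical rather than part of PyKaldi.

```python
import math


def propagate_final(finals, phi_arcs):
    """Give non-final states a final cost reachable through phi arcs.

    finals maps state -> final cost; phi_arcs maps state -> (phi cost, dest).
    Returns a new dictionary of final costs.
    """
    def final_cost(state, visiting):
        if state in finals:
            return finals[state]
        if state not in phi_arcs or state in visiting:
            return math.inf  # no final state reachable (or a phi cycle)
        phi_cost, dest = phi_arcs[state]
        return phi_cost + final_cost(dest, visiting | {state})

    result = dict(finals)
    for state in phi_arcs:
        cost = final_cost(state, frozenset())
        if cost != math.inf:
            result[state] = cost
    return result
```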

kaldi.fstext.utils.remove_alignments_from_compact_lattice(fst:CompactLatticeMutableFst)

Removes state-level alignments in a compact lattice.

kaldi.fstext.utils.remove_some_input_symbols(to_remove:list<int>, fst:StdMutableFst)

Replaces given input labels with zeros.

kaldi.fstext.utils.remove_useless_arcs(fst:StdMutableFst)

Removes arcs that are not on best paths for any input symbol sequence.

This removes arcs such that there is no input symbol sequence for which the best path through the FST would contain those arcs [for these purposes, epsilon is not treated as a real symbol]. This is mainly geared towards decoding-graph FSTs which may contain transitions that have less likely words on them that would never be taken. We do not claim that this algorithm removes all such arcs; it just does the best job it can. Only works for tropical (not log) semiring as it uses NaturalLess.

kaldi.fstext.utils.remove_weights(fst:StdMutableFst)

Removes FST weights.

kaldi.fstext.utils.rho_compose(fst1:StdFst, fst2:StdFst, rho_label:int, ofst:StdMutableFst)

Performs composition by handling rho transitions.

This is a version of composition where the right hand FST (fst2) has special “rho transitions” which are taken whenever no normal transition matches; these transitions will be rewritten with whatever symbol was on the first FST.

kaldi.fstext.utils.safe_determinize_minimize_wrapper(ifst:StdMutableFst, ofst:StdVectorFst, delta:float=default)

Performs safe determinization and minimization.

Like safe_determinize_wrapper() but also does encoded minimization, which is safe. This algorithm will destroy ifst.

kaldi.fstext.utils.safe_determinize_minimize_wrapper_in_log(ifst:StdVectorFst, ofst:StdVectorFst, delta:float=default)

Performs safe determinization and minimization in log semiring.

Like safe_determinize_minimize_wrapper() but first casts to the log semiring. This algorithm will destroy ifst.

kaldi.fstext.utils.safe_determinize_wrapper(ifst:StdMutableFst, ofst:StdMutableFst, delta:float=default)

Performs safe determinization.

This is a form of determinization that will never blow up. Note that ifst is non-const and can be destroyed by this operation. It does not do epsilon removal, so that it is safe to cast to the log semiring, determinize there, and still maintain equivalence in the tropical semiring.

kaldi.fstext.utils.scale_compact_lattice(scale:list<list<float>>, fst:CompactLatticeMutableFst)

Scales the compact lattice weights.

Scales the pair of weights in CompactLatticeWeight by viewing the pair (a, b) as a 2-vector and pre-multiplying by the 2x2 matrix in scale. E.g. typically scale would equal [[1, 0], [0, acwt]] if we want to scale the acoustics by acwt.

kaldi.fstext.utils.scale_lattice(scale:list<list<float>>, fst:LatticeMutableFst)

Scales the lattice weights.

Scales the pair of weights in LatticeWeight by viewing the pair (a, b) as a 2-vector and pre-multiplying by the 2x2 matrix in scale. E.g. typically scale would equal [[1, 0], [0, acwt]] if we want to scale the acoustics by acwt.
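The pre-multiplication described above is ordinary 2x2 matrix-vector arithmetic. A minimal pure-Python sketch, with a hypothetical helper name and the weight pair represented as a tuple:

```python
def scale_pair(scale, pair):
    """Pre-multiply the 2-vector (a, b) by the 2x2 matrix scale."""
    a, b = pair
    return (
        scale[0][0] * a + scale[0][1] * b,
        scale[1][0] * a + scale[1][1] * b,
    )
```

For example, with scale [[1, 0], [0, 0.5]] the pair (3.0, 8.0) becomes (3.0, 4.0): the first component is unchanged and the second is halved.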

kaldi.fstext.weight

PyKaldi has support for the following weight types:

  1. Tropical weight.
  2. Log weight.
  3. Lattice weight.
  4. Compact lattice weight.
  5. KWS time weight.
  6. KWS index weight.

kaldi.fstext.weight.DELTA = 0.0009765625
kaldi.fstext.weight.LEFT_SEMIRING = 1
kaldi.fstext.weight.RIGHT_SEMIRING = 2
kaldi.fstext.weight.SEMIRING = 3
kaldi.fstext.weight.COMMUTATIVE = 4
kaldi.fstext.weight.IDEMPOTENT = 8
kaldi.fstext.weight.PATH = 16
kaldi.fstext.weight.NUM_RANDOM_WEIGHTS = 5
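
The property constants above are bit flags, so they can be combined and tested with bitwise operators; note that SEMIRING is the union of LEFT_SEMIRING and RIGHT_SEMIRING. The sketch below redefines the constants locally so that it runs without PyKaldi. The helper names and the example value 7 are illustrative, not taken from the library.

```python
LEFT_SEMIRING = 1
RIGHT_SEMIRING = 2
SEMIRING = LEFT_SEMIRING | RIGHT_SEMIRING  # 3
COMMUTATIVE = 4
IDEMPOTENT = 8
PATH = 16


def has_properties(props, mask):
    """Return True if every bit in mask is set in props."""
    return (props & mask) == mask


def property_names(props):
    """List the names of the single-bit properties set in props."""
    flags = [
        (LEFT_SEMIRING, "LEFT_SEMIRING"),
        (RIGHT_SEMIRING, "RIGHT_SEMIRING"),
        (COMMUTATIVE, "COMMUTATIVE"),
        (IDEMPOTENT, "IDEMPOTENT"),
        (PATH, "PATH"),
    ]
    return [name for bit, name in flags if props & bit]
```

A value such as the one returned by a weight's properties() method could be inspected the same way.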

Functions

approx_equal_compact_lattice_weight Checks if given compact lattice weights are approximately equal.
approx_equal_float_weight Checks if given float weights are approximately equal.
approx_equal_lattice_weight Checks if given lattice weights are approximately equal.
compact_lattice_weight_to_cost Converts compact lattice weight to cost.
compare_compact_lattice_weight Compares input compact lattice weights.
compare_lattice_weight Compares input lattice weights.
divide_compact_lattice_weight \(\oslash\) operation in the compact lattice semiring.
divide_kws_index_weight \(\oslash\) operation in the KWS index semiring.
divide_lattice_weight \(\oslash\) operation in the lattice semiring.
divide_log_weight \(\oslash\) operation in the log semiring.
divide_tropical_lt_tropical_weight \(\oslash\) operation in the KWS time semiring.
divide_tropical_weight \(\oslash\) operation in the tropical semiring.
get_log_to_tropical_converter Returns a callable for converting log weight to tropical weight.
get_tropical_to_log_converter Returns a callable for converting tropical weight to log weight.
lattice_weight_to_cost Converts lattice weight to cost.
lattice_weight_to_tropical Converts lattice weight to tropical weight.
plus_compact_lattice_weight \(\oplus\) operation in the compact lattice semiring.
plus_kws_index_weight \(\oplus\) operation in the KWS index semiring.
plus_lattice_weight \(\oplus\) operation in the lattice semiring.
plus_log_weight \(\oplus\) operation in the log semiring.
plus_tropical_lt_tropical_weight \(\oplus\) operation in the KWS time semiring.
plus_tropical_weight \(\oplus\) operation in the tropical semiring.
power_log_weight Power operation in the log semiring.
power_tropical_weight Power operation in the tropical semiring.
scale_compact_lattice_weight Scales compact lattice weight.
scale_lattice_weight Scales lattice weight.
times_compact_lattice_weight \(\otimes\) operation in the compact lattice semiring.
times_kws_index_weight \(\otimes\) operation in the KWS index semiring.
times_lattice_weight \(\otimes\) operation in the lattice semiring.
times_log_weight \(\otimes\) operation in the log semiring.
times_tropical_lt_tropical_weight \(\otimes\) operation in the KWS time semiring.
times_tropical_weight \(\otimes\) operation in the tropical semiring.
tropical_weight_to_cost Converts tropical weight to cost.

Classes

CompactLatticeNaturalLess Comparison object in compact lattice semiring.
CompactLatticeWeight Compact lattice weight.
DivideType An enumeration.
FloatLimits Float limits.
FloatWeight Base class for float weight types.
KwsIndexWeight KWS index weight.
KwsTimeWeight KWS time weight.
LatticeNaturalLess Comparison object in lattice semiring.
LatticeWeight Lattice weight.
LogWeight Log weight.
TropicalWeight Tropical weight.

class kaldi.fstext.weight.CompactLatticeNaturalLess

Comparison object in compact lattice semiring.

class kaldi.fstext.weight.CompactLatticeWeight

Compact lattice weight.

from_other(other:CompactLatticeWeight) → CompactLatticeWeight

Create a new compact lattice weight from another.

from_pair(w:LatticeWeight, s:list<int>) → CompactLatticeWeight

Create a new compact lattice weight from a weight string pair.

get_int_size_string() → str

Returns int size string.

hash() → int

Returns the hash for the weight.

member() → bool

Checks if weight is a member of the compact lattice semiring.

no_weight() → CompactLatticeWeight

No weight in compact lattice semiring.

one() → CompactLatticeWeight

One in compact lattice semiring.

properties() → int

Returns weight properties.

quantize(delta:float=default) → CompactLatticeWeight

Quantizes the weight.

reverse() → CompactLatticeWeight

Reverses the weight.

string

The string as a list of integers.

type() → str

Returns weight type.

weight

The weight.

zero() → CompactLatticeWeight

Zero in compact lattice semiring.

class kaldi.fstext.weight.DivideType

An enumeration of the division types (left, right, or any) accepted by the divide functions.

DIVIDE_ANY = 2
DIVIDE_LEFT = 0
DIVIDE_RIGHT = 1

class kaldi.fstext.weight.FloatLimits

Float limits.

neg_infinity() → float

Returns float -infinity.

number_bad() → float

Returns the float value used to signal a bad number (NaN).

pos_infinity() → float

Returns float +infinity.

class kaldi.fstext.weight.FloatWeight

Base class for float weight types.

from_float(f:float) → FloatWeight

Create a new float weight from a float.

from_other(weight:FloatWeight) → FloatWeight

Create a new float weight from another.

hash() → int

Returns the hash for the weight.

value

Float value of the weight.

class kaldi.fstext.weight.KwsIndexWeight

KWS index weight.

A tropical weight triplet with lexicographic ordering.

from_components(w1:TropicalWeight, w2:KwsTimeWeight) → KwsIndexWeight

Creates a new KWS index weight from component weights.

member() → bool

Checks if weight is a member of the KWS index semiring.

no_weight() → KwsIndexWeight

No weight in KWS index semiring.

one() → KwsIndexWeight

One in KWS index semiring.

properties() → int

Returns weight properties.

quantize(delta:float=default) → KwsIndexWeight

Quantizes the weight.

reverse() → KwsIndexWeight

Reverses the weight.

type() → str

Returns weight type.

value1

The first component weight.

value2

The second component weight.

zero() → KwsIndexWeight

Zero in KWS index semiring.

class kaldi.fstext.weight.KwsTimeWeight

KWS time weight.

A tropical weight pair with lexicographic ordering.

from_components(w1:TropicalWeight, w2:TropicalWeight) → KwsTimeWeight

Creates a new KWS time weight from component weights.

member() → bool

Checks if weight is a member of the KWS time semiring.

no_weight() → KwsTimeWeight

No weight in the KWS time semiring.

one() → KwsTimeWeight

One in the KWS time semiring.

properties() → int

Returns weight properties.

quantize(delta:float=default) → KwsTimeWeight

Quantizes the weight.

reverse() → KwsTimeWeight

Reverses the weight.

type() → str

Returns weight type.

value1

The first component weight.

value2

The second component weight.

zero() → KwsTimeWeight

Zero in the KWS time semiring.

class kaldi.fstext.weight.LatticeNaturalLess

Comparison object in lattice semiring.

class kaldi.fstext.weight.LatticeWeight

Lattice weight.

from_other(other:LatticeWeight) → LatticeWeight

Create a new lattice weight from another.

from_pair(a:float, b:float) → LatticeWeight

Create a new lattice weight from a pair of floats.

hash() → int

Returns the hash for the weight.

member() → bool

Checks if weight is a member of the lattice semiring.

no_weight() → LatticeWeight

No weight in lattice semiring.

one() → LatticeWeight

One in lattice semiring, i.e. (0.0, 0.0).

properties() → int

Returns weight properties.

quantize(delta:float=default) → LatticeWeight

Quantizes the weight.

reverse() → LatticeWeight

Reverses the weight.

type() → str

Returns weight type.

value1

Float value of the first weight.

value2

Float value of the second weight.

zero() → LatticeWeight

Zero in lattice semiring, i.e. (+infinity, +infinity).

class kaldi.fstext.weight.LogWeight

Log weight.

from_float(f:float) → LogWeight

Create a new log weight from a float.

from_other(weight:LogWeight) → LogWeight

Create a new log weight from another.

hash() → int

Returns the hash for the weight.

member() → bool

Checks if weight is a member of log semiring.

no_weight() → LogWeight

No weight in log semiring.

one() → LogWeight

One in log semiring, i.e. 0.0.

properties() → int

Returns weight properties.

quantize(delta:float=default) → LogWeight

Quantizes the weight.

reverse() → LogWeight

Reverses the weight.

type() → str

Returns weight type.

value

Float value of the weight.

zero() → LogWeight

Zero in log semiring, i.e. float +infinity.

class kaldi.fstext.weight.TropicalWeight

Tropical weight.

from_float(f:float) → TropicalWeight

Create a new tropical weight from a float.

from_other(weight:TropicalWeight) → TropicalWeight

Create a new tropical weight from another.

hash() → int

Returns the hash for the weight.

member() → bool

Checks if weight is a member of the tropical semiring.

no_weight() → TropicalWeight

No weight in tropical semiring.

one() → TropicalWeight

One in tropical semiring, i.e. 0.0.

properties() → int

Returns weight properties.

quantize(delta:float=default) → TropicalWeight

Quantizes the weight.

reverse() → TropicalWeight

Reverses the weight.

type() → str

Returns weight type.

value

Float value of the weight.

zero() → TropicalWeight

Zero in tropical semiring, i.e. float +infinity.
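The roles of zero and one in the tropical semiring are easiest to see on plain floats, where the sum is a minimum and the product is addition. The sketch below uses bare Python floats in place of TropicalWeight objects, and its names are hypothetical.

```python
import math

TROPICAL_ZERO = math.inf  # identity for plus, annihilator for times
TROPICAL_ONE = 0.0        # identity for times


def tropical_plus(a, b):
    """Semiring sum: keep the lower cost."""
    return min(a, b)


def tropical_times(a, b):
    """Semiring product: add the costs."""
    return a + b
```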

kaldi.fstext.weight.approx_equal_compact_lattice_weight(w1:CompactLatticeWeight, w2:CompactLatticeWeight, delta:float=default) → bool

Checks if given compact lattice weights are approximately equal.

kaldi.fstext.weight.approx_equal_float_weight(w1:FloatWeight, w2:FloatWeight, delta:float=default) → bool

Checks if given float weights are approximately equal.
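For float weights, approximate equality and quantization both depend on a tolerance, which defaults to DELTA (0.0009765625, i.e. 1/1024). The pure-Python sketch below assumes the OpenFst conventions of a symmetric tolerance test and rounding to the nearest multiple of delta; treat both as assumptions rather than a statement of the wrapped behaviour.

```python
import math

DELTA = 1.0 / 1024  # same value as kaldi.fstext.weight.DELTA


def approx_equal(v1, v2, delta=DELTA):
    """Return True if the two float values differ by at most delta."""
    return v1 <= v2 + delta and v2 <= v1 + delta


def quantize(value, delta=DELTA):
    """Round a finite value to the nearest multiple of delta."""
    if math.isinf(value) or math.isnan(value):
        return value
    return math.floor(value / delta + 0.5) * delta
```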

kaldi.fstext.weight.approx_equal_lattice_weight(w1:LatticeWeight, w2:LatticeWeight, delta:float=default) → bool

Checks if given lattice weights are approximately equal.

kaldi.fstext.weight.compact_lattice_weight_to_cost(w:CompactLatticeWeight) → float

Converts compact lattice weight to cost.

kaldi.fstext.weight.compare_compact_lattice_weight(w1:CompactLatticeWeight, w2:CompactLatticeWeight) → int

Compares input compact lattice weights.

kaldi.fstext.weight.compare_lattice_weight(w1:LatticeWeight, w2:LatticeWeight) → int

Compares input lattice weights.
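The ordering of lattice weights can be sketched on plain (value1, value2) tuples. The rules below assume the convention of Kaldi's lattice weight: the weight with the lower total cost compares as greater, ties are broken in favour of the lower first component, and the semiring sum keeps the greater weight. The functions are pure-Python stand-ins, not the PyKaldi functions.

```python
def lattice_cost(w):
    """Total cost of a (value1, value2) pair."""
    return w[0] + w[1]


def compare_lattice_weight(w1, w2):
    """Return 1 if w1 is better (lower cost), -1 if worse, 0 if equal."""
    c1, c2 = lattice_cost(w1), lattice_cost(w2)
    if c1 != c2:
        return 1 if c1 < c2 else -1
    if w1[0] != w2[0]:
        return 1 if w1[0] < w2[0] else -1
    return 0


def plus_lattice_weight(w1, w2):
    """Semiring sum: keep whichever weight compares as greater."""
    return w1 if compare_lattice_weight(w1, w2) >= 0 else w2
```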

kaldi.fstext.weight.divide_compact_lattice_weight(w1:CompactLatticeWeight, w2:CompactLatticeWeight, typ:DivideType=default) → CompactLatticeWeight

\(\oslash\) operation in the compact lattice semiring.

kaldi.fstext.weight.divide_kws_index_weight(w1:KwsIndexWeight, w2:KwsIndexWeight, typ:DivideType=default) → KwsIndexWeight

\(\oslash\) operation in the KWS index semiring.

kaldi.fstext.weight.divide_lattice_weight(w1:LatticeWeight, w2:LatticeWeight, typ:DivideType=default) → LatticeWeight

\(\oslash\) operation in the lattice semiring.

kaldi.fstext.weight.divide_log_weight(w1:LogWeight, w2:LogWeight, typ:DivideType=default) → LogWeight

\(\oslash\) operation in the log semiring.

kaldi.fstext.weight.divide_tropical_lt_tropical_weight(w1:KwsTimeWeight, w2:KwsTimeWeight, typ:DivideType=default) → KwsTimeWeight

\(\oslash\) operation in the KWS time semiring.

kaldi.fstext.weight.divide_tropical_weight(w1:TropicalWeight, w2:TropicalWeight, typ:DivideType=default) → TropicalWeight

\(\oslash\) operation in the tropical semiring.

kaldi.fstext.weight.get_log_to_tropical_converter() -> (w:LogWeight) → TropicalWeight

Returns a callable for converting log weight to tropical weight.

kaldi.fstext.weight.get_tropical_to_log_converter() -> (w:TropicalWeight) → LogWeight

Returns a callable for converting tropical weight to log weight.

kaldi.fstext.weight.lattice_weight_to_cost(w:LatticeWeight) → float

Converts lattice weight to cost.

kaldi.fstext.weight.lattice_weight_to_tropical(w_in:LatticeWeight) → TropicalWeight

Converts lattice weight to tropical weight.

kaldi.fstext.weight.plus_compact_lattice_weight(w1:CompactLatticeWeight, w2:CompactLatticeWeight) → CompactLatticeWeight

\(\oplus\) operation in the compact lattice semiring.

kaldi.fstext.weight.plus_kws_index_weight(w1:KwsIndexWeight, w2:KwsIndexWeight) → KwsIndexWeight

\(\oplus\) operation in the KWS index semiring.

kaldi.fstext.weight.plus_lattice_weight(w1:LatticeWeight, w2:LatticeWeight) → LatticeWeight

\(\oplus\) operation in the lattice semiring.

kaldi.fstext.weight.plus_log_weight(w1:LogWeight, w2:LogWeight) → LogWeight

\(\oplus\) operation in the log semiring.

kaldi.fstext.weight.plus_tropical_lt_tropical_weight(w1:KwsTimeWeight, w2:KwsTimeWeight) → KwsTimeWeight

\(\oplus\) operation in the KWS time semiring.

kaldi.fstext.weight.plus_tropical_weight(w1:TropicalWeight, w2:TropicalWeight) → TropicalWeight

\(\oplus\) operation in the tropical semiring.

kaldi.fstext.weight.power_log_weight(weight:LogWeight, scalar:float) → LogWeight

Power operation in the log semiring.

kaldi.fstext.weight.power_tropical_weight(weight:TropicalWeight, scalar:float) → TropicalWeight

Power operation in the tropical semiring.
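Because the product in the tropical and log semirings is addition of costs, raising a weight to a power amounts to multiplying its cost by the scalar. A minimal sketch on plain floats, with a hypothetical function name; returning one (cost 0.0) for a zero exponent is an assumption made here to avoid computing infinity times zero.

```python
def tropical_power(cost, scalar):
    """Return the cost of a weight multiplied with itself scalar times."""
    if scalar == 0:
        return 0.0  # assumption: w to the power 0 is the semiring one
    return cost * scalar
```

For instance, the third power of cost 1.5 equals 1.5 + 1.5 + 1.5.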

kaldi.fstext.weight.scale_compact_lattice_weight(w:CompactLatticeWeight, scale:list<list<float>>) → CompactLatticeWeight

Scales compact lattice weight.

kaldi.fstext.weight.scale_lattice_weight(w:LatticeWeight, scale:list<list<float>>) → LatticeWeight

Scales lattice weight.

kaldi.fstext.weight.times_compact_lattice_weight(w1:CompactLatticeWeight, w2:CompactLatticeWeight) → CompactLatticeWeight

\(\otimes\) operation in the compact lattice semiring.

kaldi.fstext.weight.times_kws_index_weight(w1:KwsIndexWeight, w2:KwsIndexWeight) → KwsIndexWeight

\(\otimes\) operation in the KWS index semiring.

kaldi.fstext.weight.times_lattice_weight(w1:LatticeWeight, w2:LatticeWeight) → LatticeWeight

\(\otimes\) operation in the lattice semiring.

kaldi.fstext.weight.times_log_weight(w1:LogWeight, w2:LogWeight) → LogWeight

\(\otimes\) operation in the log semiring.

kaldi.fstext.weight.times_tropical_lt_tropical_weight(w1:KwsTimeWeight, w2:KwsTimeWeight) → KwsTimeWeight

\(\otimes\) operation in the KWS time semiring.

kaldi.fstext.weight.times_tropical_weight(w1:TropicalWeight, w2:TropicalWeight) → TropicalWeight

\(\otimes\) operation in the tropical semiring.

kaldi.fstext.weight.tropical_weight_to_cost(w:TropicalWeight) → float

Converts tropical weight to cost.