kaldi.fstext¶

PyKaldi has built-in support for common FST types (including Kaldi lattices and KWS index) and operations. The user-facing API for PyKaldi FST types and operations is mostly defined in Python and largely mimics the API exposed by pywrapfst, OpenFst’s official Python wrapper. This includes integrations with Graphviz and IPython for interactive visualization of FSTs.

There are two major differences between the PyKaldi FST package and pywrapfst:

PyKaldi bindings are generated with CLIF while pywrapfst bindings are generated with Cython. This allows PyKaldi FST types to work seamlessly with the rest of the PyKaldi package.
In contrast to pywrapfst, PyKaldi does not wrap the OpenFst scripting API, which uses virtual dispatch, function registration, and dynamic loading of shared objects to provide a common interface shared by FSTs of different semirings. While this choice requires wrapping each semiring specialization separately in PyKaldi, it lets users pass FST objects directly to the many PyKaldi functions accepting FST arguments.

Operations which construct new FSTs are implemented as traditional functions, as are two-argument boolean functions like equal and equivalent. The convert operation is not implemented as a separate function since FSTs already support construction from other FST types, e.g. vector FSTs can be constructed from constant FSTs and vice versa. Destructive operations (those that mutate an FST in place) are instance methods, as is write.

The following example, based on Mohri et al. 2002, shows the construction of an ASR graph given a pronunciation lexicon L, grammar G, a transducer from context-dependent phones to context-independent phones C, and an HMM set H:

import kaldi.fstext as fst

L = fst.StdVectorFst.read("L.fst")
G = fst.StdVectorFst.read("G.fst")
C = fst.StdVectorFst.read("C.fst")
H = fst.StdVectorFst.read("H.fst")
LG = fst.determinize(fst.compose(L, G))
CLG = fst.determinize(fst.compose(C, LG))
HCLG = fst.determinize(fst.compose(H, CLG))
HCLG.minimize()                                      # NB: works in-place.
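
To make the composition step in the example above more concrete, here is a minimal pure-Python sketch of composing two transducers over the tropical semiring. It is not PyKaldi's implementation: the dictionary layout and the function name `compose` are assumptions made for this illustration, the machines are assumed to be epsilon-free, and no composition filter or arc sorting is used.

```python
from collections import deque

def compose(a, b):
    """Compose two epsilon-free toy transducers over the tropical semiring.

    Each machine is a dict with keys 'start', 'finals' ({state: weight}) and
    'arcs' ({state: [(ilabel, olabel, weight, nextstate), ...]}).
    """
    start = (a["start"], b["start"])
    result = {"start": start, "finals": {}, "arcs": {}}
    queue = deque([start])
    seen = {start}
    while queue:
        qa, qb = state = queue.popleft()
        result["arcs"][state] = []
        # A pair state is final only if both components are final.
        if qa in a["finals"] and qb in b["finals"]:
            result["finals"][state] = a["finals"][qa] + b["finals"][qb]
        for i, mid, w1, na in a["arcs"].get(qa, []):
            for mid2, o, w2, nb in b["arcs"].get(qb, []):
                # Match the output label of `a` with the input label of `b`.
                if mid == mid2:
                    nxt = (na, nb)
                    # Tropical semiring: weights along a path are added.
                    result["arcs"][state].append((i, o, w1 + w2, nxt))
                    if nxt not in seen:
                        seen.add(nxt)
                        queue.append(nxt)
    return result

# A maps "a" to "b" with weight 1.0; B maps "b" to "c" with weight 2.0.
A = {"start": 0, "finals": {1: 0.0}, "arcs": {0: [("a", "b", 1.0, 1)]}}
B = {"start": 0, "finals": {1: 0.5}, "arcs": {0: [("b", "c", 2.0, 1)]}}
AB = compose(A, B)
print(AB["arcs"][AB["start"]])
```

The composed machine maps "a" directly to "c", which is the same idea that lets H, C, L and G be chained into a single decoding graph.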

kaldi.fstext.NO_STATE_ID = -1¶

kaldi.fstext.NO_LABEL = -1¶

kaldi.fstext.ENCODE_FLAGS = 3¶

kaldi.fstext.ENCODE_LABELS = 1¶

kaldi.fstext.ENCODE_WEIGHTS = 2¶
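
The values listed above suggest that the encode constants are bit flags, with ENCODE_FLAGS being the combination of the other two. A small sketch, with the constants redefined locally so that it runs without PyKaldi; the helper `describe_flags` is hypothetical:

```python
# Values copied from the listing above; redefined locally so this runs standalone.
ENCODE_LABELS = 1
ENCODE_WEIGHTS = 2
ENCODE_FLAGS = 3

def describe_flags(flags):
    """Return which parts of an arc a given flag value asks to encode."""
    parts = []
    if flags & ENCODE_LABELS:
        parts.append("labels")
    if flags & ENCODE_WEIGHTS:
        parts.append("weights")
    return parts

combined = ENCODE_LABELS | ENCODE_WEIGHTS
print(combined == ENCODE_FLAGS)
print(describe_flags(ENCODE_WEIGHTS))
```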

Functions

`arcmap`	Constructively applies a transform to all arcs and final states.
`compat_symbols`	Returns true if the two symbol tables have equal checksums.
`compose`	Constructively composes two FSTs.
`deserialize_symbol_table`	Deserializes a symbol table.
`determinize`	Constructively determinizes a weighted FST.
`difference`	Constructively computes the difference of two FSTs.
`disambiguate`	Constructively disambiguates a weighted transducer.
`epsnormalize`	Constructively epsilon-normalizes an FST.
`equal`	Are two FSTs equal?
`equivalent`	Are the two acceptors equivalent?
`indices_to_symbols`	Converts indices to symbols by looking them up in the symbol table.
`intersect`	Constructively intersects two FSTs.
`isomorphic`	Are the two acceptors isomorphic?
`prune`	Constructively removes paths with weights below a certain threshold.
`push`	Constructively pushes weights/labels towards initial or final states.
`randequivalent`	Are two acceptors stochastically equivalent?
`randgen`	Randomly generate successful paths in an FST.
`read_fst_kaldi`	Reads FST using Kaldi I/O mechanisms.
`relabel_symbol_table`	Relabels a symbol table as specified by the input list of pairs.
`replace`	Recursively replaces arcs in the root FST with other FST(s).
`reverse`	Constructively reverses an FST’s transduction.
`rmepsilon`	Constructively removes epsilon transitions from an FST.
`serialize_symbol_table`	Serializes a symbol table.
`shortestdistance`	Compute the shortest distance from the initial or final state.
`shortestpath`	Construct an FST containing the shortest path(s) in the input FST.
`statemap`	Constructively applies a transform to all states.
`symbols_to_indices`	Converts symbols to indices by looking them up in the symbol table.
`synchronize`	Constructively synchronizes an FST.
`write_fst_kaldi`	Writes FST using Kaldi I/O mechanisms.

Classes

`CompactLatticeArc`	FST arc with compact lattice weight.
`CompactLatticeConstFst`	Constant FST over the compact lattice semiring.
`CompactLatticeConstFstArcIterator`	Arc iterator for a constant FST over the compact lattice semiring.
`CompactLatticeConstFstStateIterator`	State iterator for a constant FST over the compact lattice semiring.
`CompactLatticeEncodeMapper`	Arc encoder for an FST over the compact lattice semiring.
`CompactLatticeEncodeTable`	Encode table for CompactLatticeArc.
`CompactLatticeFstCompiler`	Compiler for FSTs over the compact lattice semiring.
`CompactLatticeVectorFst`	Vector FST over the compact lattice semiring.
`CompactLatticeVectorFstArcIterator`	Arc iterator for a vector FST over the compact lattice semiring.
`CompactLatticeVectorFstMutableArcIterator`	Mutable arc iterator for a vector FST over the compact lattice semiring.
`CompactLatticeVectorFstStateIterator`	State iterator for a vector FST over the compact lattice semiring.
`CompactLatticeWeight`	Compact lattice weight factory.
`FstHeader`	FST file header.
`FstReadOptions`	FST reading options.
`FstWriteOptions`	FST writing options.
`KwsIndexArc`	FST arc with KWS index weight.
`KwsIndexConstFst`	Constant FST over the KWS index semiring.
`KwsIndexConstFstArcIterator`	Arc iterator for a constant FST over the KWS index semiring.
`KwsIndexConstFstStateIterator`	State iterator for a constant FST over the KWS index semiring.
`KwsIndexEncodeMapper`	Arc encoder for an FST over the KWS index semiring.
`KwsIndexEncodeTable`	Encode table for KwsIndexArc.
`KwsIndexFstCompiler`	Compiler for FSTs over the KWS index semiring.
`KwsIndexVectorFst`	Vector FST over the KWS index semiring.
`KwsIndexVectorFstArcIterator`	Arc iterator for a vector FST over the KWS index semiring.
`KwsIndexVectorFstMutableArcIterator`	Mutable arc iterator for a vector FST over the KWS index semiring.
`KwsIndexVectorFstStateIterator`	State iterator for a vector FST over the KWS index semiring.
`KwsIndexWeight`	KWS index weight factory.
`KwsTimeWeight`	KWS time weight factory.
`LatticeArc`	FST arc with lattice weight.
`LatticeConstFst`	Constant FST over the lattice semiring.
`LatticeConstFstArcIterator`	Arc iterator for a constant FST over the lattice semiring.
`LatticeConstFstStateIterator`	State iterator for a constant FST over the lattice semiring.
`LatticeEncodeMapper`	Arc encoder for an FST over the lattice semiring.
`LatticeEncodeTable`	Encode table for LatticeArc.
`LatticeFstCompiler`	Compiler for FSTs over the lattice semiring.
`LatticeVectorFst`	Vector FST over the lattice semiring.
`LatticeVectorFstArcIterator`	Arc iterator for a vector FST over the lattice semiring.
`LatticeVectorFstMutableArcIterator`	Mutable arc iterator for a vector FST over the lattice semiring.
`LatticeVectorFstStateIterator`	State iterator for a vector FST over the lattice semiring.
`LatticeWeight`	Lattice weight factory.
`LogArc`	FST arc with log weight.
`LogConstFst`	Constant FST over the log semiring.
`LogConstFstArcIterator`	Arc iterator for a constant FST over the log semiring.
`LogConstFstStateIterator`	State iterator for a constant FST over the log semiring.
`LogEncodeMapper`	Arc encoder for an FST over the log semiring.
`LogEncodeTable`	Encode table for LogArc.
`LogFstCompiler`	Compiler for FSTs over the log semiring.
`LogVectorFst`	Vector FST over the log semiring.
`LogVectorFstArcIterator`	Arc iterator for a vector FST over the log semiring.
`LogVectorFstMutableArcIterator`	Mutable arc iterator for a vector FST over the log semiring.
`LogVectorFstStateIterator`	State iterator for a vector FST over the log semiring.
`LogWeight`	Log weight factory.
`StdArc`	FST arc with tropical weight.
`StdConstFst`	Constant FST over the tropical semiring.
`StdConstFstArcIterator`	Arc iterator for a constant FST over the tropical semiring.
`StdConstFstStateIterator`	State iterator for a constant FST over the tropical semiring.
`StdEncodeMapper`	Arc encoder for an FST over the tropical semiring.
`StdEncodeTable`	Encode table for StdArc.
`StdFstCompiler`	Compiler for FSTs over the tropical semiring.
`StdVectorFst`	Vector FST over the tropical semiring.
`StdVectorFstArcIterator`	Arc iterator for a vector FST over the tropical semiring.
`StdVectorFstMutableArcIterator`	Mutable arc iterator for a vector FST over the tropical semiring.
`StdVectorFstStateIterator`	State iterator for a vector FST over the tropical semiring.
`SymbolTable`	Symbol table.
`SymbolTableIterator`	Symbol table iterator.
`SymbolTableTextOptions`	Options for reading symbol table from text file.
`TropicalWeight`	Tropical weight factory.

class kaldi.fstext.CompactLatticeArc[source]¶

FST arc with compact lattice weight.

CompactLatticeArc():: Creates an uninitialized CompactLatticeArc instance.
CompactLatticeArc(ilabel, olabel, weight, nextstate):: Creates a new CompactLatticeArc instance initialized with given arguments.

Parameters:	ilabel (int) – The input label. olabel (int) – The output label. weight (CompactLatticeWeight) – The arc weight. nextstate (int) – The destination state for the arc.

from_attrs(ilabel:int, olabel:int, weight:CompactLatticeWeight, nextstate:int) → CompactLatticeArc¶

Creates a new arc with the given attributes.

Parameters:	ilabel (int) – The input label. olabel (int) – The output label. weight (CompactLatticeWeight) – The arc weight. nextstate (int) – The destination state for the arc.

ilabel¶: int – The input label.

nextstate¶: int – The destination state for the arc.

olabel¶: int – The output label.

type() → str¶: Returns arc type.

weight¶: CompactLatticeWeight – The arc weight.

class kaldi.fstext.CompactLatticeConstFst(fst=None)[source]¶

Constant FST over the compact lattice semiring.

Parameters:	fst (CompactLatticeFst) – The input FST over the compact lattice semiring. If provided, its contents are used for initializing the new FST. Defaults to `None`.

arcs(state)¶

Returns an iterator over arcs leaving the specified state.

Parameters:	state – The source state index.
Returns:	An ArcIterator.

See also: mutable_arcs, states.

copy()¶

Makes a copy of the FST.

Returns:	A copy of the FST.

draw(filename, isymbols=None, osymbols=None, ssymbols=None, acceptor=False, title='', width=8.5, height=11, portrait=False, vertical=False, ranksep=0.4, nodesep=0.25, fontsize=14, precision=5, float_format='g', show_weight_one=False)¶

Writes out the FST in Graphviz text format.

This method writes out the FST in the dot graph description language. The graph can be rendered using the dot binary provided by Graphviz.

Parameters:

filename (str) – The string location of the output dot/Graphviz file.
isymbols – An optional symbol table used to label input symbols.
osymbols – An optional symbol table used to label output symbols.
ssymbols – An optional symbol table used to label states.
acceptor (bool) – Should the figure be rendered in acceptor format if possible? Defaults False.
title (str) – An optional string indicating the figure title. Defaults to empty string.
width (float) – The figure width, in inches. Defaults to 8.5".
height (float) – The figure height, in inches. Defaults to 11".
portrait (bool) – Should the figure be rendered in portrait rather than landscape? Defaults False.
vertical (bool) – Should the figure be rendered bottom-to-top rather than left-to-right? Defaults False.
ranksep (float) – The minimum separation between ranks, in inches. Defaults to 0.4".
nodesep (float) – The minimum separation between nodes, in inches. Defaults to 0.25".
fontsize (int) – Font size, in points. Defaults 14pt.
precision (int) – Numeric precision for floats, in number of chars. Defaults to 5.
float_format ('e', 'f' or 'g') – One of: ‘e’, ‘f’ or ‘g’. Defaults to ‘g’
show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults False.

For more information about the rendering options, see man dot.

See also: text.

final(state)¶

Returns the final weight of a state.

Parameters:	state – The integer index of a state.
Returns:	The final Weight of that state.
Raises:	`IndexError` – State index out of range.

from_bytes(s)¶

Returns the FST represented by the bytes object.

Parameters:	s (bytes) – The bytes object representing the FST.
Returns:	An FST object.

input_symbols()¶

Returns the input symbol table.

Returns:	The input symbol table.

See also: output_symbols.

num_arcs(state=None)¶

Returns the number of arcs, counting them if necessary.

If state is None, returns the number of arcs in the FST. Otherwise, returns the number of arcs leaving that state.

Parameters:	state – The integer index of a state. Defaults to `None`.
Returns:	The number of arcs leaving a state or the number of arcs in the FST.

Note: This method counts the number of arcs in the FST by iterating over the states and summing up the number of arcs leaving each state.

Raises:	`IndexError` – State index out of range.

See also: num_states.
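
The counting behavior described in the note can be sketched on a toy representation. This is an illustration only: the list-of-lists layout and the standalone function `num_arcs` are assumptions, not the PyKaldi data structures.

```python
def num_arcs(arcs_by_state, state=None):
    """Count arcs in a toy FST stored as a list of per-state arc lists."""
    if state is None:
        # Whole-FST count: sum the number of arcs leaving each state.
        return sum(len(arcs) for arcs in arcs_by_state)
    if not 0 <= state < len(arcs_by_state):
        raise IndexError("State index out of range")
    return len(arcs_by_state[state])

# Three states: state 0 has two outgoing arcs, state 1 has one, state 2 none.
arcs_by_state = [[("a", 1), ("b", 2)], [("c", 2)], []]
print(num_arcs(arcs_by_state))
print(num_arcs(arcs_by_state, state=0))
```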

num_input_epsilons(state)¶

Returns the number of arcs with epsilon input labels leaving a state.

Parameters:	state – The integer index of a state.
Returns:	The number of epsilon-input-labeled arcs leaving that state.
Raises:	`IndexError` – State index out of range.

See also: num_output_epsilons.

num_output_epsilons(state)¶

Returns the number of arcs with epsilon output labels leaving a state.

Parameters:	state – The integer index of a state.
Returns:	The number of epsilon-output-labeled arcs leaving that state.
Raises:	`IndexError` – State index out of range.

See also: num_input_epsilons.

num_states()¶

Returns the number of states, counting them if necessary.

Returns:	The number of states.

See also: num_arcs.

output_symbols()¶

Returns the output symbol table.

Returns:	The output symbol table.

See also: input_symbols.

properties(mask, test)¶

Provides property bits.

This method provides user access to the properties attributes for the FST. The resulting value is a long integer, but when it is cast to a boolean, it represents whether or not the FST has the mask property.

Parameters:	mask – The property mask to be compared to the FST’s properties. test – Should any unknown values be computed before comparing against the mask?
Returns:	A 64-bit bitmask representing the requested properties.
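
The mask semantics can be sketched with plain integers. The property names and bit values below are hypothetical and chosen only for this illustration; they are not the OpenFst property constants, and the toy function ignores the test argument.

```python
# Hypothetical property bits, chosen only for this illustration.
ACCEPTOR = 0x1
I_DETERMINISTIC = 0x2
ACYCLIC = 0x4

def properties(known_props, mask):
    """Return the subset of known property bits selected by the mask."""
    return known_props & mask

# A toy FST that is known to be an acyclic acceptor.
fst_props = ACCEPTOR | ACYCLIC

is_acceptor = bool(properties(fst_props, ACCEPTOR))
is_deterministic = bool(properties(fst_props, I_DETERMINISTIC))
print(is_acceptor, is_deterministic)
```

Casting the returned bitmask to a boolean answers the question of whether any of the masked properties is set.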

read(filename)¶

Reads an FST from a file.

Parameters:	filename (str) – The location of the input file.
Returns:	An FST object.
Raises:	`RuntimeError` – Read failed.

read_from_stream(strm, ropts)¶

Reads an FST from an input stream.

Parameters:	strm (istream) – The input stream to read from. ropts (FstReadOptions) – FST reading options.
Returns:	An FST object.
Raises:	`RuntimeError` – Read failed.

start()¶

Returns the start state.

Returns:	The start state if start state is set, -1 otherwise.

states()¶

Returns an iterator over all states in the FST.

Returns:	A StateIterator object for the FST.

See also: arcs, mutable_arcs.

text(isymbols=None, osymbols=None, ssymbols=None, acceptor=False, show_weight_one=False, missing_symbol='')¶

Produces a human-readable string representation of the FST.

This method generates a human-readable string representation of the FST. The caller may optionally specify SymbolTables used to label input labels, output labels, or state labels, respectively.

Parameters:

isymbols – An optional symbol table used to label input symbols.
osymbols – An optional symbol table used to label output symbols.
ssymbols – An optional symbol table used to label states.
acceptor (bool) – Should the FST be rendered in acceptor format if possible? Defaults False.
show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults False.
missing_symbol – The string to be printed when symbol table lookup fails.

Returns:

A formatted string representing the FST.

to_bytes()¶

Returns a bytes object representing the FST.

Returns:	A bytes object.

type()¶

Returns the FST type.

Returns:	The FST type.

verify()¶

Verifies that an FST’s contents are sane.

Returns:	True if the contents are sane, False otherwise.

write(filename)¶

Serializes FST to a file.

This method writes the FST to a file in a binary format.

Parameters:	filename (str) – The location of the output file.
Raises:	`IOError` – Write failed.

write_to_stream(strm, wopts)¶

Serializes FST to an output stream.

Parameters:	strm (ostream) – The output stream to write to. wopts (FstWriteOptions) – FST writing options.
Returns:	True if write was successful, False otherwise.
Raises:	`RuntimeError` – Write failed.

class kaldi.fstext.CompactLatticeConstFstArcIterator(fst, state)[source]¶

Arc iterator for a constant FST over the compact lattice semiring.

This class is used for iterating over the arcs leaving some state. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the arcs method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.

Creates a new arc iterator.

Parameters:	fst – The fst. state – The state index.
Raises:	`IndexError` – State index out of range.
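
The difference between the C++-style API and the iterator protocol can be sketched with a toy class. `ToyArcIterator` is a hypothetical stand-in over a plain list of arcs and does not reproduce the PyKaldi class; it only shows how the two styles relate.

```python
class ToyArcIterator:
    """Hypothetical stand-in offering both iteration styles over a list of arcs."""

    def __init__(self, arcs):
        self._arcs = list(arcs)
        self._pos = 0

    # C++-style API: done / value / next / reset.
    def done(self):
        return self._pos >= len(self._arcs)

    def value(self):
        return self._arcs[self._pos]

    def next(self):
        self._pos += 1

    def reset(self):
        self._pos = 0

    # Python iterator protocol, built on top of the C++-style methods.
    def __iter__(self):
        return self

    def __next__(self):
        if self.done():
            raise StopIteration
        arc = self.value()
        self.next()
        return arc

arcs = [(1, 1, 0.5, 1), (2, 2, 1.5, 2)]  # (ilabel, olabel, weight, nextstate)

# C++ style: explicit loop with done/value/next.
it = ToyArcIterator(arcs)
cpp_style = []
while not it.done():
    cpp_style.append(it.value())
    it.next()

# Pythonic style: the same traversal through the iterator protocol.
pythonic = list(ToyArcIterator(arcs))
```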

done()¶

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	True if the iterator is exhausted, False otherwise.

flags()¶

Returns the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	The current iterator behavioral flags as an integer.

next()¶

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

position()¶

Returns the position of the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	The iterator’s position, expressed as an integer.

reset()¶

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

seek(a)¶

Advances the iterator to a new position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:	a (int) – The position to seek to.

set_flags(flags, mask)¶

Sets the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:	flags (int) – The properties to be set. mask (int) – A mask to be applied to the `flags` argument before setting them.

value()¶

Returns the current arc.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

class kaldi.fstext.CompactLatticeConstFstStateIterator(fst)[source]¶

State iterator for a constant FST over the compact lattice semiring.

This class is used for iterating over the states. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the states method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.

Creates a new state iterator.

Parameters:	fst – The fst.

done()¶

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	True if the iterator is exhausted, False otherwise.

next()¶

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

reset()¶

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

value()¶

Returns the current state index.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

class kaldi.fstext.CompactLatticeEncodeMapper(encode_labels=False, encode_weights=False, encode=True)[source]¶

Arc encoder for an FST over the compact lattice semiring.

This class provides an object which can be used to encode or decode FST arcs. This is most useful to convert an FST to an unweighted acceptor, on which some FST operations are more efficient, and then decoding the FST afterwards.

To use an instance of this class to encode or decode a mutable FST, pass it as the first argument to the FST instance methods encode and decode. Alternatively, an instance of this class can be used as a callable to encode/decode arcs.

Parameters:	encode_labels (bool) – Should labels be encoded? encode_weights (bool) – Should weights be encoded? encode (bool) – Encode or decode?
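
The idea behind arc encoding can be sketched as a lookup table that maps each distinct (input label, output label, weight) triple to a single integer label and back. `ToyEncoder` is a hypothetical simplification that always encodes both labels and weights; it is not the PyKaldi encoder.

```python
class ToyEncoder:
    """Hypothetical table mapping arc triples to single labels and back."""

    def __init__(self):
        self._label_of = {}  # (ilabel, olabel, weight) -> encoded label
        self._triples = []   # encoded label - 1 -> (ilabel, olabel, weight)

    def encode(self, ilabel, olabel, weight):
        triple = (ilabel, olabel, weight)
        if triple not in self._label_of:
            self._triples.append(triple)
            # Encoded labels start at 1 so that 0 stays free for epsilon.
            self._label_of[triple] = len(self._triples)
        return self._label_of[triple]

    def decode(self, label):
        return self._triples[label - 1]

arcs = [(1, 2, 0.5), (3, 3, 1.0), (1, 2, 0.5)]
encoder = ToyEncoder()
encoded = [encoder.encode(*arc) for arc in arcs]
decoded = [encoder.decode(label) for label in encoded]
print(encoded)
```

After encoding, each arc carries one label and no weight, so the machine can be treated as an unweighted acceptor; decoding restores the original triples.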

flags() → int¶: Returns encoder flags.

from_other(mapper:CompactLatticeEncodeMapper) → CompactLatticeEncodeMapper¶: Creates a new encoder with the contents of another.

from_other_with_type(mapper:CompactLatticeEncodeMapper, type:EncodeType) → CompactLatticeEncodeMapper¶: Creates a new encoder with the contents of another and given type.

input_symbols() → SymbolTable¶: Returns input symbol table.

output_symbols() → SymbolTable¶: Returns output symbol table.

properties(inprops:int) → int¶

Provides property bits.

This method provides user access to the properties attributes for the encoder. The resulting value is a long integer, but when it is cast to a boolean, it represents whether or not the encoder has the mask property.

Parameters:	inprops – The property mask to be compared to the encoder’s properties.
Returns:	A 64-bit bitmask representing the requested properties.

read(filename:str, type:EncodeType=default) → CompactLatticeEncodeMapper¶: Reads encoder from file.

set_input_symbols(syms:SymbolTable)¶

Sets the input symbol table.

Parameters:	syms – A SymbolTable.

See also: set_output_symbols.

set_output_symbols(syms:SymbolTable)¶

Sets the output symbol table.

Parameters:	syms – A SymbolTable.

See also: set_input_symbols.

type() → EncodeType¶: Returns encoder type.

write(filename:str) → bool¶

Writes encoder to file.

Returns:	True if write was successful, False otherwise.

class kaldi.fstext.CompactLatticeEncodeTable¶

Encode table for CompactLatticeArc.

CompactLatticeEncodeTable(flags):: Creates a new encode table with the given flags.

class Tuple¶

CompactLatticeArc encoding tuple.

ilabel¶: Input label.

olabel¶: Output label.

weight¶: Weight.

decode(key:int) → Tuple¶: Decodes an encoded arc label back to labels and cost.

encode(arc:CompactLatticeArc) → int¶: Encodes the given arc (either labels or weights or both).

flags() → int¶: Returns encoding flags.

get_label(arc:CompactLatticeArc) → int¶

Looks up the encoded label for the given arc.

Returns -1 if arc is not found.

input_symbols() → SymbolTable¶: Returns input symbols.

output_symbols() → SymbolTable¶: Returns output symbols.

read(strm:istream, source:str) → CompactLatticeEncodeTable¶: Reads encode table from input stream.

set_input_symbols(syms:SymbolTable)¶: Sets input symbols.

set_output_symbols(syms:SymbolTable)¶: Sets output symbols.

size() → int¶: Returns the size of the table.

write(strm:ostream, source:str) → bool¶: Writes table to output stream.

class kaldi.fstext.CompactLatticeFstCompiler(isymbols=None, osymbols=None, ssymbols=None, acceptor=False, keep_isymbols=False, keep_osymbols=False, keep_state_numbering=False, allow_negative_labels=False)[source]¶

Compiler for FSTs over the compact lattice semiring.

This class is used to compile FSTs specified using the AT&T FSM library format described here:

http://web.eecs.umich.edu/~radev/NLP-fall2015/resources/fsm_archive/fsm.5.html

This is the same format used by the fstcompile executable.

FstCompiler options (symbol tables, etc.) are set at construction time:

compiler = FstCompiler(isymbols=ascii_syms, osymbols=ascii_syms)

Once constructed, FstCompiler instances behave like a file handle opened for writing:

# /ba+/
print("0 1 50 50", file=compiler)
print("1 2 49 49", file=compiler)
print("2 2 49 49", file=compiler)
print("2", file=compiler)

The compile method returns an actual FST instance:

sheep_machine = compiler.compile()

Compilation flushes the internal buffer, so the compiler instance can be reused to compile new machines with the same symbol tables, etc.
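
The file-handle-like behavior described above can be sketched with a toy class that buffers written text and parses it on compile. `ToyCompiler` is hypothetical: it assumes transducer-format lines only (source, destination, input label, output label, optional weight, or a final state with optional weight), uses integer labels without symbol tables, and returns plain Python containers instead of an FST.

```python
class ToyCompiler:
    """Hypothetical stand-in that buffers text lines and parses them on demand."""

    def __init__(self):
        self._buffer = []

    def write(self, expression):
        # print(..., file=self) calls write() for the text and for the newline.
        self._buffer.append(expression)

    def compile(self):
        arcs, finals = [], {}
        for line in "".join(self._buffer).splitlines():
            fields = line.split()
            if not fields:
                continue
            if len(fields) >= 4:
                # Arc line: src dst ilabel olabel [weight]
                src, dst, ilabel, olabel = (int(f) for f in fields[:4])
                weight = float(fields[4]) if len(fields) > 4 else 0.0
                arcs.append((src, ilabel, olabel, weight, dst))
            else:
                # Final state line: state [weight]
                finals[int(fields[0])] = float(fields[1]) if len(fields) > 1 else 0.0
        # Compilation flushes the buffer so the instance can be reused.
        self._buffer = []
        return arcs, finals

compiler = ToyCompiler()
print("0 1 50 50", file=compiler)
print("1 2 49 49", file=compiler)
print("2 2 49 49", file=compiler)
print("2", file=compiler)
arcs, finals = compiler.compile()
```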

Parameters:

isymbols – An optional SymbolTable used to label input symbols.
osymbols – An optional SymbolTable used to label output symbols.
ssymbols – An optional SymbolTable used to label states.
acceptor – Should the FST be rendered in acceptor format if possible?
keep_isymbols – Should the input symbol table be stored in the FST?
keep_osymbols – Should the output symbol table be stored in the FST?
keep_state_numbering – Should the state numbering be preserved?
allow_negative_labels – Should negative labels be allowed? (Not recommended; may cause conflicts).

compile()¶

Compiles the FST in the string buffer.

This method compiles the FST and returns the resulting machine.

Returns:	The FST described by the string buffer.
Raises:	`RuntimeError` – Compilation failed.

write(expression)¶

Writes a string into the compiler string buffer.

This method adds a line to the compiler string buffer. It can also be invoked with a print call, like so:

compiler = FstCompiler()
print("0 0 49 49", file=compiler)
print("0", file=compiler)

Parameters:	expression – A string expression to add to compiler string buffer.

class kaldi.fstext.CompactLatticeVectorFst(fst=None)[source]¶

Vector FST over the compact lattice semiring.

Parameters:	fst (CompactLatticeFst) – The input FST over the compact lattice semiring. If provided, its contents are used for initializing the new FST. Defaults to `None`.

add_arc(state, arc)¶

Adds a new arc to the FST and returns self.

Parameters:	state – The integer index of the source state. arc – The arc to add.
Returns:	self.
Raises:	`IndexError` – State index out of range.

See also: add_state.

add_state()¶

Adds a new state to the FST and returns the state ID.

Returns:	The integer index of the new state.

See also: add_arc, set_start, set_final.

arcs(state)¶

Returns an iterator over arcs leaving the specified state.

Parameters:	state – The source state index.
Returns:	An ArcIterator.

See also: mutable_arcs, states.

arcsort(sort_type='ilabel')¶

Sorts arcs leaving each state of the FST.

This operation destructively sorts arcs leaving each state using either input or output labels.

Parameters:	sort_type – Either “ilabel” (sort arcs according to input labels) or “olabel” (sort arcs according to output labels).
Returns:	self.
Raises:	`ValueError` – Unknown sort type.

See also: topsort.

closure(closure_plus=False)¶

Computes concatenative closure.

This operation destructively converts the FST to its concatenative closure. If A transduces string x to y with weight a, then the closure transduces x to y with weight a, xx to yy with weight a ⊗ a, xxx to yyy with weight a ⊗ a ⊗ a, and so on. The empty string is also transduced to itself with semiring One if closure_plus is False.

Parameters:	closure_plus – If True, do not accept the empty string.
Returns:	self.
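
The weights assigned by the closure can be worked through for a concrete semiring. The sketch below assumes the tropical semiring, where the semiring product is ordinary addition and One is 0.0; the compact lattice semiring of this class uses a richer weight type, and the function `closure_weight` is hypothetical.

```python
ONE = 0.0  # One of the tropical semiring

def closure_weight(a, n, closure_plus=False):
    """Weight given to n repetitions of a path with weight a (tropical semiring).

    Returns None when the string is not accepted, i.e. for the empty string
    (n == 0) when closure_plus is True.
    """
    if n == 0:
        return None if closure_plus else ONE
    weight = ONE
    for _ in range(n):
        weight = weight + a  # the tropical semiring product is addition
    return weight

print(closure_weight(1.5, 1))
print(closure_weight(1.5, 3))
print(closure_weight(1.5, 0, closure_plus=True))
```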

concat(ifst)¶

Computes the concatenation (product) of two FSTs.

This operation destructively concatenates the FST with a second FST. If A transduces string x to y with weight a and B transduces string w to v with weight b, then their concatenation transduces string xw to yv with weight a ⊗ b.

Parameters:	ifst – The second input FST.
Returns:	self.

connect()¶

Removes unsuccessful paths.

This operation destructively trims the FST, removing states and arcs that are not part of any successful path.

Returns:	self.
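
Trimming can be sketched as keeping only the states that are both reachable from the start state and able to reach a final state. This is a simplified illustration on a bare graph: the function `connect` and its arguments are assumptions, labels and weights are omitted, and the surviving states are not renumbered.

```python
def connect(start, finals, arcs):
    """Trim a toy graph given as a list of (source, destination) arcs."""

    def reachable(roots, edges):
        # Depth-first search collecting every state reachable from the roots.
        seen = set(roots)
        stack = list(roots)
        while stack:
            state = stack.pop()
            for nxt in edges.get(state, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    forward, backward = {}, {}
    for src, dst in arcs:
        forward.setdefault(src, []).append(dst)
        backward.setdefault(dst, []).append(src)

    # Accessible states (from the start) that are also coaccessible (to a final).
    keep = reachable([start], forward) & reachable(finals, backward)
    kept_arcs = [(s, d) for s, d in arcs if s in keep and d in keep]
    return keep, kept_arcs

# State 3 is a dead end and state 4 cannot be reached from the start state.
keep, kept_arcs = connect(0, [2], [(0, 1), (1, 2), (0, 3), (4, 2)])
print(sorted(keep), kept_arcs)
```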

copy()¶

Makes a copy of the FST.

Returns:	A copy of the FST.

decode(encoder)¶

Decodes encoded labels and/or weights.

This operation reverses the encoding performed by encode.

Parameters:	encoder – An EncodeMapper object used to encode the FST.
Returns:	self.

See also: encode.

delete_arcs(state, n=None)¶

Deletes arcs leaving a particular state.

Parameters:	state – The integer index of a state. n – An optional argument indicating how many arcs to be deleted. If this argument is None, all arcs from this state are deleted.
Returns:	self.
Raises:	`IndexError` – State index out of range.

See also: delete_states.

delete_states(states=None)¶

Deletes states.

Parameters:	states – An optional iterable of integer indices of the states to be deleted. If this argument is omitted, all states are deleted.
Returns:	self.
Raises:	`IndexError` – State index out of range.

See also: delete_arcs.

draw(filename, isymbols=None, osymbols=None, ssymbols=None, acceptor=False, title='', width=8.5, height=11, portrait=False, vertical=False, ranksep=0.4, nodesep=0.25, fontsize=14, precision=5, float_format='g', show_weight_one=False)¶

Writes out the FST in Graphviz text format.

This method writes out the FST in the dot graph description language. The graph can be rendered using the dot binary provided by Graphviz.

Parameters:

filename (str) – The string location of the output dot/Graphviz file.
isymbols – An optional symbol table used to label input symbols.
osymbols – An optional symbol table used to label output symbols.
ssymbols – An optional symbol table used to label states.
acceptor (bool) – Should the figure be rendered in acceptor format if possible? Defaults False.
title (str) – An optional string indicating the figure title. Defaults to empty string.
width (float) – The figure width, in inches. Defaults to 8.5".
height (float) – The figure height, in inches. Defaults to 11".
portrait (bool) – Should the figure be rendered in portrait rather than landscape? Defaults False.
vertical (bool) – Should the figure be rendered bottom-to-top rather than left-to-right? Defaults False.
ranksep (float) – The minimum separation between ranks, in inches. Defaults to 0.4".
nodesep (float) – The minimum separation between nodes, in inches. Defaults to 0.25".
fontsize (int) – Font size, in points. Defaults 14pt.
precision (int) – Numeric precision for floats, in number of chars. Defaults to 5.
float_format ('e', 'f' or 'g') – One of: ‘e’, ‘f’ or ‘g’. Defaults to ‘g’
show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults False.

For more information about the rendering options, see man dot.

See also: text.

encode(encoder)¶

Encodes labels and/or weights.

This operation allows for the representation of a weighted transducer as a weighted acceptor, an unweighted transducer, or an unweighted acceptor by considering the pair (input label, output label), the pair (input label, weight), or the triple (input label, output label, weight) as a single label. Applying this operation mutates the EncodeMapper argument, which can then be used to decode.

Parameters:	encoder – An EncodeMapper object used to encode the FST.
Returns:	self.

See also: decode.

final(state)¶

Returns the final weight of a state.

Parameters:	state – The integer index of a state.
Returns:	The final Weight of that state.
Raises:	`IndexError` – State index out of range.

from_bytes(s)¶

Returns the FST represented by the bytes object.

Parameters:	s (bytes) – The bytes object representing the FST.
Returns:	An FST object.

input_symbols()¶

Returns the input symbol table.

Returns:	The input symbol table.

See also: output_symbols.

invert()¶

Inverts the FST’s transduction.

This operation destructively inverts the FST’s transduction by exchanging input and output labels.

Returns:	self.
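
Because destructive methods return self, they can be chained. The toy class below sketches this convention for inversion; `ToyFst` is hypothetical and stores arcs as plain tuples rather than PyKaldi arc objects.

```python
class ToyFst:
    """Hypothetical container holding arcs as (ilabel, olabel, weight, nextstate)."""

    def __init__(self, arcs):
        self.arcs = list(arcs)

    def invert(self):
        # Swap input and output labels in place, then return self for chaining.
        self.arcs = [(o, i, w, n) for i, o, w, n in self.arcs]
        return self

toy = ToyFst([(1, 2, 0.5, 1)])
same = toy.invert()
print(same is toy, toy.arcs)
```

Inverting twice restores the original labels, and no copy of the machine is made at any point.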

minimize(delta=0.0009765625, allow_nondet=False)¶

Minimizes the FST.

This operation destructively performs the minimization of deterministic weighted automata and transducers. If the input FST A is an acceptor, this operation produces the minimal acceptor B equivalent to A, i.e. the acceptor with a minimal number of states that is equivalent to A. If the input FST A is a transducer, this operation internally builds an equivalent transducer with a minimal number of states. However, this minimality is obtained by allowing transitions to have strings of symbols as output labels; such a machine is known in the literature as a real-time transducer. Such transducers are not directly supported by the library, so this function converts them by expanding each string-labeled transition into a sequence of transitions. This creates new states and hence loses the minimality property.

Parameters:	delta – Comparison/quantization delta (default: 0.0009765625). allow_nondet – Attempt minimization of non-deterministic FST?
Returns:	self.
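The expansion of a string-labeled transition into a sequence of transitions can be illustrated with a small plain-Python sketch. The function name `expand_string_arc`, the tuple layout, and the use of 0 for epsilon and 0.0 for the tropical One are assumptions made for this example; the library's internal procedure may differ.

```python
def expand_string_arc(src, dst, ilabel, ostring, weight, next_state):
    """Expand one arc whose output is a string of labels into a chain of arcs.

    Returns (arcs, next_state), where arcs are
    (src, dst, ilabel, olabel, weight) tuples and next_state is the first
    unused state ID after the expansion.
    """
    if len(ostring) <= 1:
        olabel = ostring[0] if ostring else 0
        return [(src, dst, ilabel, olabel, weight)], next_state
    arcs = []
    current = src
    for i, olabel in enumerate(ostring):
        last = i == len(ostring) - 1
        target = dst if last else next_state
        if not last:
            next_state += 1  # a new intermediate state is created
        # Only the first arc carries the input label and the weight.
        arcs.append((current, target, ilabel if i == 0 else 0, olabel,
                     weight if i == 0 else 0.0))
        current = target
    return arcs, next_state


arcs, next_state = expand_string_arc(0, 1, ilabel=7, ostring=[4, 5, 6],
                                     weight=1.5, next_state=2)
```

Here one arc with a three-symbol output becomes three arcs and two new states (2 and 3), which shows why the expanded machine can have more states than the minimal one.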

mutable_arcs(state)¶

Returns a mutable iterator over arcs leaving the specified state.

Parameters:	state – The source state index.
Returns:	A MutableArcIterator.

See also: arcs, states.

num_arcs(state=None)¶

Returns the number of arcs, counting them if necessary.

If state is None, returns the number of arcs in the FST. Otherwise, returns the number of arcs leaving that state.

Parameters:	state – The integer index of a state. Defaults to `None`.
Returns:	The number of arcs leaving a state or the number of arcs in the FST.

Note: This method counts the number of arcs in the FST by iterating over the states and summing up the number of arcs leaving each state.

Raises:	`IndexError` – State index out of range.

See also: num_states.

num_input_epsilons(state)¶

Returns the number of arcs with epsilon input labels leaving a state.

Parameters:	state – The integer index of a state.
Returns:	The number of epsilon-input-labeled arcs leaving that state.
Raises:	`IndexError` – State index out of range.

See also: num_output_epsilons.

num_output_epsilons(state)¶

Returns the number of arcs with epsilon output labels leaving a state.

Parameters:	state – The integer index of a state.
Returns:	The number of epsilon-output-labeled arcs leaving that state.
Raises:	`IndexError` – State index out of range.

See also: num_input_epsilons.

num_states()¶

Returns the number of states, counting them if necessary.

Returns:	The number of states.

See also: num_arcs.

output_symbols()¶

Returns the output symbol table.

Returns:	The output symbol table.

See also: input_symbols.

project(project_output=False)¶

Converts the FST to an acceptor using input or output labels.

This operation destructively projects an FST onto its domain or range by either copying each arc’s input label to its output label (the default) or vice versa.

Parameters:	project_output – Project onto output labels?
Returns:	self.
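A minimal plain-Python sketch of projection, assuming arcs are stored as hypothetical (src, ilabel, olabel, weight, dst) tuples rather than PyKaldi arc objects:

```python
def project(arcs, project_output=False):
    """Copy one label of each arc onto the other, yielding an acceptor."""
    result = []
    for src, ilabel, olabel, weight, dst in arcs:
        label = olabel if project_output else ilabel
        result.append((src, label, label, weight, dst))
    return result


arcs = [(0, 1, 2, 0.5, 1), (1, 3, 4, 1.0, 2)]
on_input = project(arcs)
on_output = project(arcs, project_output=True)
```

After projection every arc has identical input and output labels, which is what makes the result an acceptor.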

properties(mask, test)¶

Provides property bits.

This method provides user access to the properties attributes for the FST. The resulting value is a long integer, but when it is cast to a boolean, it represents whether or not the FST has the mask property.

Parameters:	mask – The property mask to be compared to the FST’s properties. test – Should any unknown values be computed before comparing against the mask?
Returns:	A 64-bit bitmask representing the requested properties.
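The way a property bitmask is tested can be sketched as follows. The bit values and names below are hypothetical placeholders chosen for the example and are not the actual OpenFst property constants.

```python
# Hypothetical property bits, for illustration only.
ACCEPTOR = 0x1
I_DETERMINISTIC = 0x2
ACYCLIC = 0x4

# Properties of some imaginary FST: an acyclic acceptor.
fst_props = ACCEPTOR | ACYCLIC


def properties(mask):
    """Return the subset of the requested bits that are set."""
    return fst_props & mask


is_acceptor = bool(properties(ACCEPTOR))
is_deterministic = bool(properties(I_DETERMINISTIC))
both = properties(ACCEPTOR | ACYCLIC)
```

Casting the returned integer to a boolean answers whether at least one of the masked bits is set; comparing it with the mask itself checks that all of them are.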

prune(weight=None, nstate=-1, delta=0.0009765625)¶

Removes paths with weights below a certain threshold.

This operation deletes states and arcs in the input FST that do not belong to a successful path whose weight is no more (w.r.t. the natural semiring order) than the threshold ⊗ the weight of the shortest path in the input FST. Weights must be commutative and have the path property.

Parameters:

weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired weight threshold below which paths are pruned; if None, no paths are pruned.
nstate – State number threshold (default: -1).
delta – Comparison/quantization delta (default: 0.0009765625).

Returns:	self.

See also: The constructive variant.
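The pruning criterion can be sketched in plain Python for the tropical semiring, where ⊗ is addition and the natural order is the usual order on numbers. The graph representation with (src, weight, dst) tuples and the helper names are assumptions of this sketch, which also assumes non-negative weights so that Dijkstra's algorithm applies.

```python
import heapq


def shortest_distances(num_states, arcs, sources):
    """Dijkstra over non-negative weights; sources maps state -> start cost."""
    dist = [float("inf")] * num_states
    heap = []
    for state, cost in sources.items():
        dist[state] = cost
        heapq.heappush(heap, (cost, state))
    adjacency = [[] for _ in range(num_states)]
    for src, weight, dst in arcs:
        adjacency[src].append((weight, dst))
    while heap:
        cost, state = heapq.heappop(heap)
        if cost > dist[state]:
            continue  # stale heap entry
        for weight, dst in adjacency[state]:
            if cost + weight < dist[dst]:
                dist[dst] = cost + weight
                heapq.heappush(heap, (dist[dst], dst))
    return dist


def prune(num_states, arcs, start, finals, threshold):
    """Keep arcs lying on a path no worse than threshold + best path weight."""
    forward = shortest_distances(num_states, arcs, {start: 0.0})
    reversed_arcs = [(dst, weight, src) for src, weight, dst in arcs]
    backward = shortest_distances(num_states, reversed_arcs, finals)
    limit = threshold + backward[start]  # threshold (x) shortest path weight
    return [(src, weight, dst) for src, weight, dst in arcs
            if forward[src] + weight + backward[dst] <= limit]


arcs = [(0, 1.0, 1), (1, 1.0, 3), (0, 5.0, 2), (2, 1.0, 3)]
kept = prune(4, arcs, start=0, finals={3: 0.0}, threshold=2.0)
```

The best path costs 2.0, so with a threshold of 2.0 only arcs on paths costing at most 4.0 survive; the path through state 2 costs 6.0 and is removed.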

push(to_final=False, delta=0.0009765625, remove_total_weight=False)¶

Pushes weights towards the initial or final states.

This operation destructively produces an equivalent transducer by pushing the weights towards the initial state or toward the final states. When pushing weights towards the initial state, the sum of the weight of the outgoing transitions and final weight at any non-initial state is equal to one in the resulting machine. When pushing weights towards the final states, the sum of the weight of the incoming transitions at any state is equal to one. Weights need to be left distributive when pushing towards the initial state and right distributive when pushing towards the final states.

Parameters:	to_final – Push towards final states? delta – Comparison/quantization delta (default: 0.0009765625). remove_total_weight – If pushing weights, should the total weight be removed?
Returns:	self.

See also: The constructive variant, which also supports label pushing.
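Pushing weights towards the initial state can be sketched in the tropical semiring, where the semiring sum is `min`, the product is `+`, and One is 0.0. The function below is a hypothetical plain-Python illustration that assumes every state can reach a final state; it drops the total weight instead of reattaching it at the initial state.

```python
INF = float("inf")


def push_to_initial(num_states, arcs, finals):
    """Push tropical weights toward the initial state.

    arcs are (src, weight, dst) tuples; finals maps state -> final weight.
    """
    # Potential of a state: shortest distance from it to a final state.
    potential = [finals.get(state, INF) for state in range(num_states)]
    for _ in range(num_states):  # Bellman-Ford style relaxation
        for src, weight, dst in arcs:
            potential[src] = min(potential[src], weight + potential[dst])
    pushed_arcs = [(src, weight + potential[dst] - potential[src], dst)
                   for src, weight, dst in arcs]
    pushed_finals = {state: weight - potential[state]
                     for state, weight in finals.items()}
    return pushed_arcs, pushed_finals, potential


arcs = [(0, 1.0, 1), (0, 4.0, 2), (1, 2.0, 3), (2, 1.0, 3)]
pushed_arcs, pushed_finals, potential = push_to_initial(4, arcs, {3: 0.5})
```

After pushing, the minimum over the outgoing arc weights and the final weight is 0.0 (semiring One) at every state, and the removed total weight is `potential[0]`.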

read(filename)¶

Reads an FST from a file.

Parameters:	filename (str) – The location of the input file.
Returns:	An FST object.
Raises:	`RuntimeError` – Read failed.

read_from_stream(strm, ropts)¶

Reads an FST from an input stream.

Parameters:	strm (istream) – The input stream to read from. ropts (FstReadOptions) – FST reading options.
Returns:	An FST object.
Raises:	`RuntimeError` – Read failed.

relabel(ipairs=None, opairs=None)¶

Replaces input and/or output labels using pairs of labels.

This operation destructively relabels the input and/or output labels of the FST using pairs of the form (old_ID, new_ID); omitted indices are identity-mapped.

Parameters:	ipairs – An iterable containing (old index, new index) integer pairs. opairs – An iterable containing (old index, new index) integer pairs.
Returns:	self.
Raises:	`ValueError` – No relabeling pairs specified.
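The identity mapping of omitted labels can be sketched in plain Python. The arcs here are hypothetical (ilabel, olabel) pairs rather than PyKaldi arc objects.

```python
def relabel(arcs, ipairs=None, opairs=None):
    """Relabel (ilabel, olabel) pairs; labels without a pair stay unchanged."""
    if not ipairs and not opairs:
        raise ValueError("No relabeling pairs specified.")
    imap = dict(ipairs or [])
    omap = dict(opairs or [])
    return [(imap.get(ilabel, ilabel), omap.get(olabel, olabel))
            for ilabel, olabel in arcs]


arcs = [(1, 1), (2, 3), (4, 4)]
relabeled = relabel(arcs, ipairs=[(1, 10), (2, 20)], opairs=[(3, 30)])
```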

relabel_tables(old_isymbols=None, new_isymbols=None, unknown_isymbol='', attach_new_isymbols=True, old_osymbols=None, new_osymbols=None, unknown_osymbol='', attach_new_osymbols=True)¶

Replaces input and/or output labels using SymbolTables.

This operation destructively relabels the input and/or output labels of the FST using user-specified symbol tables; omitted symbols are identity-mapped.

Parameters:

old_isymbols – The old SymbolTable for input labels, defaulting to the FST’s input symbol table.
new_isymbols – A SymbolTable used to relabel the input labels.
unknown_isymbol – Input symbol to use to relabel OOVs (if empty, OOVs raise an exception).
attach_new_isymbols – Should new_isymbols be made the FST’s input symbol table?
old_osymbols – The old SymbolTable for output labels, defaulting to the FST’s output symbol table.
new_osymbols – A SymbolTable used to relabel the output labels.
unknown_osymbol – Output symbol to use to relabel OOVs (if empty, OOVs raise an exception).
attach_new_osymbols – Should new_osymbols be made the FST’s output symbol table?

Returns:	self.
Raises:	`ValueError` – No SymbolTable specified.

See also: decode, encode, project, relabel.

reserve_arcs(state, n)¶

Reserve n arcs at a particular state (best effort).

Parameters:	state – The integer index of a state. n – The number of arcs to reserve.
Returns:	self.
Raises:	`IndexError` – State index out of range.

See also: reserve_states.

reserve_states(n)¶

Reserve n states (best effort).

Parameters:	n – The number of states to reserve.
Returns:	self.

See also: reserve_arcs.

reweight(potentials, to_final=False)¶

Reweights an FST using an iterable of potentials.

This operation destructively reweights an FST according to the potentials and in the direction specified by the user. An arc of weight w, with an origin state of potential p and destination state of potential q, is reweighted by p^{-1} ⊗ (w ⊗ q) when reweighting towards the initial state, and by (p ⊗ w) ⊗ q^{-1} when reweighting towards the final states. The weights must be left distributive when reweighting towards the initial state and right distributive when reweighting towards the final states (e.g., TropicalWeight and LogWeight).

Parameters:	potentials – An iterable of TropicalWeights. to_final – Push towards final states?
Returns:	self.
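In the tropical semiring the product is ordinary addition and the inverse is negation, so the two reweighting formulas above reduce to simple arithmetic. The following plain-Python sketch uses hypothetical (src, weight, dst) tuples and only covers arc weights, not final weights.

```python
def reweight_arcs(arcs, potentials, to_final=False):
    """Reweight tropical arcs given one potential per state."""
    result = []
    for src, weight, dst in arcs:
        p, q = potentials[src], potentials[dst]
        if to_final:
            new_weight = (p + weight) - q   # (p (x) w) (x) q^{-1}
        else:
            new_weight = (weight + q) - p   # p^{-1} (x) (w (x) q)
        result.append((src, new_weight, dst))
    return result


arcs = [(0, 1.0, 1), (1, 2.0, 2)]
potentials = [3.0, 2.0, 0.0]
toward_initial = reweight_arcs(arcs, potentials)
toward_final = reweight_arcs(arcs, potentials, to_final=True)
```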

rmepsilon(connect=True, weight=None, nstate=-1, delta=0.0009765625)¶

Removes epsilon transitions.

This operation destructively removes epsilon transitions (i.e., those where both input and output labels are epsilon) from an FST.

Parameters:

connect – Should output be trimmed?
weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired weight threshold below which paths are pruned; if None, no paths are pruned.
nstate – State number threshold (default: -1).
delta – Comparison/quantization delta (default: 0.0009765625).

Returns:	self.

See also: The constructive variant, which also supports epsilon removal in reverse (and which may be more efficient).

set_final(state, weight=None)¶

Sets the final weight for a state.

Parameters:	state – The integer index of a state. weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired final weight; if omitted, it is set to semiring One.
Raises:	`IndexError` – State index out of range.

See also: set_start.

set_input_symbols(syms)¶

Sets the input symbol table.

Passing None as a value will delete the input symbol table.

Parameters:	syms – A SymbolTable.
Returns:	self.

See also: set_output_symbols.

set_output_symbols(syms)¶

Sets the output symbol table.

Passing None as a value will delete the output symbol table.

Parameters:	syms – A SymbolTable.
Returns:	self.

See also: set_input_symbols.

set_properties(props, mask)¶

Sets the properties bits.

Parameters:	props (int) – The properties to be set. mask (int) – A mask to be applied to the `props` argument before setting the FST’s properties.
Returns:	self.

set_start(state)¶

Sets the initial state.

Parameters:	state – The integer index of a state.
Returns:	self.
Raises:	`IndexError` – State index out of range.

See also: set_final.

start()¶

Returns the start state.

Returns:	The start state if start state is set, -1 otherwise.

states()¶

Returns an iterator over all states in the FST.

Returns:	A StateIterator object for the FST.

See also: arcs, mutable_arcs.

text(isymbols=None, osymbols=None, ssymbols=None, acceptor=False, show_weight_one=False, missing_symbol='')¶

Produces a human-readable string representation of the FST.

This method generates a human-readable string representation of the FST. The caller may optionally specify SymbolTables used to label input labels, output labels, or state labels, respectively.

Parameters:

isymbols – An optional symbol table used to label input symbols.
osymbols – An optional symbol table used to label output symbols.
ssymbols – An optional symbol table used to label states.
acceptor (bool) – Should the FST be rendered in acceptor format if possible? Defaults to False.
show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults to False.
missing_symbol – The string to be printed when symbol table lookup fails.

Returns:

A formatted string representing the FST.
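The general shape of such a text representation (tab-separated arc lines followed by final-state lines, with weights equal to semiring One omitted unless requested) can be sketched as below. This is a simplified plain-Python illustration with a hypothetical `to_text` helper over tuples; it ignores symbol tables and the exact line ordering used by the library, and it assumes the tropical One, 0.0.

```python
def to_text(arcs, finals, acceptor=False, show_weight_one=False):
    """Render (src, dst, ilabel, olabel, weight) arcs and (state, weight) finals."""
    lines = []
    for src, dst, ilabel, olabel, weight in arcs:
        fields = [src, dst, ilabel]
        if not acceptor:
            fields.append(olabel)
        if show_weight_one or weight != 0.0:
            fields.append(weight)
        lines.append("\t".join(str(field) for field in fields))
    for state, weight in finals:
        fields = [state]
        if show_weight_one or weight != 0.0:
            fields.append(weight)
        lines.append("\t".join(str(field) for field in fields))
    return "\n".join(lines) + "\n"


arcs = [(0, 1, 1, 2, 0.5), (1, 2, 3, 3, 0.0)]
text = to_text(arcs, finals=[(2, 0.0)])
acceptor_text = to_text(arcs, finals=[(2, 0.0)], acceptor=True)
```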

to_bytes()¶

Returns a bytes object representing the FST.

Returns:	A bytes object.

topsort()¶

Sorts transitions by state IDs.

This operation destructively topologically sorts the FST, if it is acyclic; otherwise it remains unchanged. Once sorted, all transitions are from lower state IDs to higher state IDs.

Returns:	self.

See also: arcsort.
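The renumbering performed by a topological sort can be sketched with Kahn's algorithm in plain Python. The function name and the (src, dst) arc representation are assumptions of this example; the library may use a different algorithm internally.

```python
from collections import deque


def topsort_order(num_states, arcs):
    """Return new state IDs such that new_id[src] < new_id[dst] for every arc.

    Returns None if the graph is cyclic, mirroring "remains unchanged".
    """
    indegree = [0] * num_states
    successors = [[] for _ in range(num_states)]
    for src, dst in arcs:
        successors[src].append(dst)
        indegree[dst] += 1
    queue = deque(s for s in range(num_states) if indegree[s] == 0)
    order = []
    while queue:
        state = queue.popleft()
        order.append(state)
        for dst in successors[state]:
            indegree[dst] -= 1
            if indegree[dst] == 0:
                queue.append(dst)
    if len(order) != num_states:
        return None  # a cycle prevents a topological order
    new_id = [0] * num_states
    for position, state in enumerate(order):
        new_id[state] = position
    return new_id


arcs = [(2, 0), (0, 1)]
new_id = topsort_order(3, arcs)
sorted_arcs = [(new_id[src], new_id[dst]) for src, dst in arcs]
cyclic = topsort_order(2, [(0, 1), (1, 0)])
```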

type()¶

Returns the FST type.

Returns:	The FST type.

union(ifst)¶

Computes the union (sum) of two FSTs.

This operation computes the union (sum) of two FSTs. If A transduces string x to y with weight a and B transduces string w to v with weight b, then their union transduces x to y with weight a and w to v with weight b.

Parameters:	ifst – The second input FST.
Returns:	self.

verify()¶

Verifies that an FST’s contents are sane.

Returns:	True if the contents are sane, False otherwise.

write(filename)¶

Serializes FST to a file.

This method writes the FST to a file in a binary format.

Parameters:	filename (str) – The location of the output file.
Raises:	`IOError` – Write failed.

write_to_stream(strm, wopts)¶

Serializes FST to an output stream.

Parameters:	strm (ostream) – The output stream to write to. wopts (FstWriteOptions) – FST writing options.
Returns:	True if write was successful, False otherwise.
Raises:	`RuntimeError` – Write failed.

class kaldi.fstext.CompactLatticeVectorFstArcIterator(fst, state)[source]¶

Arc iterator for a vector FST over the compact lattice semiring.

This class is used for iterating over the arcs leaving some state. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the arcs method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.

Creates a new arc iterator.

Parameters:	fst – The fst. state – The state index.
Raises:	`IndexError` – State index out of range.
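The difference between the C++-style API and the iterator protocol can be illustrated with a plain-Python stand-in. `ToyArcIterator` is hypothetical and only mimics the method names documented for this class; it does not wrap a real FST.

```python
class ToyArcIterator:
    """Stand-in exposing both C++-style methods and the iterator protocol."""

    def __init__(self, arcs):
        self._arcs = list(arcs)
        self._pos = 0

    def done(self):
        return self._pos >= len(self._arcs)

    def value(self):
        return self._arcs[self._pos]

    def next(self):
        self._pos += 1

    def reset(self):
        self._pos = 0

    def __iter__(self):
        while not self.done():
            yield self.value()
            self.next()


arcs = ["a", "b", "c"]
aiter = ToyArcIterator(arcs)

# C++-style loop: explicit done/value/next calls.
cpp_style = []
while not aiter.done():
    cpp_style.append(aiter.value())
    aiter.next()

# Pythonic loop over the same iterator after resetting it.
aiter.reset()
pythonic = [arc for arc in aiter]
```

Both loops visit the same arcs; the second form is what calling the arcs method of an FST and iterating over the result looks like in user code.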

done()¶

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	True if the iterator is exhausted, False otherwise.

flags()¶

Returns the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	The current iterator behavioral flags as an integer.

next()¶

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

position()¶

Returns the position of the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	The iterator’s position, expressed as an integer.

reset()¶

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

seek(a)¶

Advance the iterator to a new position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:	a (int) – The position to seek to.

set_flags(flags, mask)¶

Sets the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:	flags (int) – The properties to be set. mask (int) – A mask to be applied to the `flags` argument before setting them.

value()¶

Returns the current arc.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

class kaldi.fstext.CompactLatticeVectorFstMutableArcIterator(fst, state)[source]¶

Mutable arc iterator for a vector FST over the compact lattice semiring.

This class is used for iterating over the arcs leaving some state and optionally replacing them with new ones. In addition to the full C++ API, it also supports the iterator protocol. Calling the __iter__ method of a mutable arc iterator object returns an iterator over (arc, setter) pairs. The setter is a bound method of the mutable arc iterator object that can be used to replace the current arc with a new one. Most users should just call the mutable_arcs method of a vector FST object instead of directly constructing this iterator and take advantage of the Pythonic API, e.g.

for arc, setter in lattice.mutable_arcs(0):
    setter(LatticeArc(arc.ilabel, 0, arc.weight, arc.nextstate))

Creates a new arc iterator.

Parameters:	fst – The fst. state – The state index.
Raises:	`IndexError` – State index out of range.
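The (arc, setter) protocol described above can be mimicked in plain Python to show how the setter replaces the arc at the current position. `ToyMutableArcIterator` and the (ilabel, olabel, weight, nextstate) tuples are hypothetical stand-ins, not PyKaldi types.

```python
class ToyMutableArcIterator:
    """Stand-in yielding (arc, setter) pairs over a shared list of arcs."""

    def __init__(self, arcs):
        self._arcs = arcs  # shared with the caller so edits are visible
        self._pos = 0

    def done(self):
        return self._pos >= len(self._arcs)

    def value(self):
        return self._arcs[self._pos]

    def set_value(self, arc):
        self._arcs[self._pos] = arc

    def next(self):
        self._pos += 1

    def __iter__(self):
        while not self.done():
            # The bound method acts as the setter for the current position.
            yield self.value(), self.set_value
            self.next()


arcs = [(1, 2, 0.5, 1), (3, 4, 1.0, 2)]
for (ilabel, olabel, weight, nextstate), setter in ToyMutableArcIterator(arcs):
    setter((ilabel, 0, weight, nextstate))  # replace output label with epsilon
```

Because the iterator advances only after the loop body has run, the setter always writes to the arc that was just yielded.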

done()¶

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	True if the iterator is exhausted, False otherwise.

flags()¶

Returns the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	The current iterator behavioral flags as an integer.

next()¶

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

position()¶

Returns the position of the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	The iterator’s position, expressed as an integer.

reset()¶

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

seek(a)¶

Advance the iterator to a new position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:	a (int) – The position to seek to.

set_flags(flags, mask)¶

Sets the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:	flags (int) – The properties to be set. mask (int) – A mask to be applied to the `flags` argument before setting them.

set_value(arc)¶

Replace the current arc with a new arc.

Parameters:	arc – The arc to replace the current arc with.

value()¶

Returns the current arc.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

class kaldi.fstext.CompactLatticeVectorFstStateIterator(fst)[source]¶

State iterator for a vector FST over the compact lattice semiring.

This class is used for iterating over the states. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the states method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.

Creates a new state iterator.

Parameters:	fst – The fst.

done()¶

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	True if the iterator is exhausted, False otherwise.

next()¶

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

reset()¶

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

value()¶

Returns the current state index.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

class kaldi.fstext.CompactLatticeWeight[source]¶

Compact lattice weight factory.

This class is used for creating new CompactLatticeWeight instances.

CompactLatticeWeight():: Creates an uninitialized CompactLatticeWeight instance.
CompactLatticeWeight(weight):: Creates a new CompactLatticeWeight instance initialized with the weight.

Parameters:	weight (Tuple[Tuple[float, float], List[int]] or Tuple[LatticeWeight, List[int]] or CompactLatticeWeight) – A pair of weight values or another `CompactLatticeWeight` instance.

CompactLatticeWeight(weight, string):: Creates a new CompactLatticeWeight instance initialized with the (weight, string) pair.

Parameters:	weight (Tuple[float, float] or LatticeWeight) – The weight value. string (List[int]) – The string value given as a list of integers.

from_other(other:CompactLatticeWeight) → CompactLatticeWeight¶: Create a new compact lattice weight from another.

from_pair(w:LatticeWeight, s:list<int>) → CompactLatticeWeight¶: Create a new compact lattice weight from a weight string pair.

get_int_size_string() → str¶: Returns int size string.

hash() → int¶: Returns the hash for the weight.

member() → bool¶: Checks if weight is a member of the compact lattice semiring.

no_weight() → CompactLatticeWeight¶: No weight in compact lattice semiring.

one() → CompactLatticeWeight¶: One in compact lattice semiring.

properties() → int¶: Returns weight properties.

quantize(delta:float=default) → CompactLatticeWeight¶: Quantizes the weight.

reverse() → CompactLatticeWeight¶: Reverses the weight.

string¶: The string as a list of integers.

type() → str¶: Returns weight type.

weight¶: The weight.

zero() → CompactLatticeWeight¶: Zero in compact lattice semiring.

class kaldi.fstext.FstHeader¶

FST file header.

arc_type() → str¶: Returns arc type.

debug_string() → str¶: Outputs a debug string for the FstHeader object.

fst_type() → str¶: Returns FST type.

get_flags() → int¶: Returns flags.

num_arcs() → int¶: Returns number of arcs.

num_states() → int¶: Returns number of states.

properties() → int¶: Returns FST properties.

read(strm:istream, source:str, rewind:bool=default) → bool¶: Reads header from stream.

set_arc_type(type:str)¶: Sets arc type.

set_flags(flags:int)¶: Sets flags.

set_fst_type(type:str)¶: Sets FST type.

set_num_arcs(numarcs:int)¶: Sets number of arcs.

set_num_states(numstates:int)¶: Sets number of states.

set_properties(properties:int)¶: Sets FST properties.

set_start(start:int)¶: Sets start state.

set_version(version:int)¶: Sets version.

start() → int¶: Returns start state.

version() → int¶: Returns version.

write(strm:ostream, source:str) → bool¶: Writes header to stream.

class kaldi.fstext.FstReadOptions¶

FST reading options.

FileReadMode¶: alias of FstReadOptions.FileReadMode

debug_string() → str¶: Outputs a debug string for the FstReadOptions object.

mode¶: Read or map files (advisory, if possible)

read_isymbols¶: Read input symbols, if any (default – true).

read_mode(mode:str) → FileReadMode¶: Converts mode strings into FileReadMode enum values.

read_osymbols¶: Read output symbols, if any (default – true).

source¶: Where you’re reading from.

class kaldi.fstext.FstWriteOptions¶

FST writing options.

align¶: Write data aligned (may fail on pipes)?

source¶: Where you’re writing to.

stream_write¶: Avoid seek operations in writing.

write_header¶: Write the header?

write_isymbols¶: Write input symbols?

write_osymbols¶: Write output symbols?

class kaldi.fstext.KwsIndexArc[source]¶

FST arc with KWS index weight.

KwsIndexArc():: Creates an uninitialized KwsIndexArc instance.
KwsIndexArc(ilabel, olabel, weight, nextstate):: Creates a new KwsIndexArc instance initialized with given arguments.

Parameters:	ilabel (int) – The input label. olabel (int) – The output label. weight (KwsIndexWeight) – The arc weight. nextstate (int) – The destination state for the arc.

from_attrs(ilabel:int, olabel:int, weight:KwsIndexWeight, nextstate:int) → KwsIndexArc¶

Creates a new arc with the given attributes.

Parameters:	ilabel (int) – The input label. olabel (int) – The output label. weight (KwsIndexWeight) – The arc weight. nextstate (int) – The destination state for the arc.

ilabel¶: int – The input label.

nextstate¶: int – The destination state for the arc.

olabel¶: int – The output label.

type() → str¶: Returns arc type.

weight¶: KwsIndexWeight – The arc weight.

class kaldi.fstext.KwsIndexConstFst(fst=None)[source]¶

Constant FST over the KWS index semiring.

Parameters:	fst (KwsIndexFst) – The input FST over the KWS index semiring. If provided, its contents are used for initializing the new FST. Defaults to `None`.

arcs(state)¶

Returns an iterator over arcs leaving the specified state.

Parameters:	state – The source state index.
Returns:	An ArcIterator.

See also: mutable_arcs, states.

copy()¶

Makes a copy of the FST.

Returns:	A copy of the FST.

draw(filename, isymbols=None, osymbols=None, ssymbols=None, acceptor=False, title='', width=8.5, height=11, portrait=False, vertical=False, ranksep=0.4, nodesep=0.25, fontsize=14, precision=5, float_format='g', show_weight_one=False)¶

Writes out the FST in Graphviz text format.

This method writes out the FST in the dot graph description language. The graph can be rendered using the dot binary provided by Graphviz.

Parameters:

filename (str) – The string location of the output dot/Graphviz file.
isymbols – An optional symbol table used to label input symbols.
osymbols – An optional symbol table used to label output symbols.
ssymbols – An optional symbol table used to label states.
acceptor (bool) – Should the figure be rendered in acceptor format if possible? Defaults to False.
title (str) – An optional string indicating the figure title. Defaults to empty string.
width (float) – The figure width, in inches. Defaults to 8.5.
height (float) – The figure height, in inches. Defaults to 11.
portrait (bool) – Should the figure be rendered in portrait rather than landscape? Defaults to False.
vertical (bool) – Should the figure be rendered bottom-to-top rather than left-to-right? Defaults to False.
ranksep (float) – The minimum separation between ranks, in inches. Defaults to 0.4.
nodesep (float) – The minimum separation between nodes, in inches. Defaults to 0.25.
fontsize (int) – Font size, in points. Defaults to 14.
precision (int) – Numeric precision for floats, in number of chars. Defaults to 5.
float_format ('e', 'f' or 'g') – One of: ‘e’, ‘f’ or ‘g’. Defaults to ‘g’.
show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults to False.

For more information about the rendering options, see man dot.

See also: text.

final(state)¶

Returns the final weight of a state.

Parameters:	state – The integer index of a state.
Returns:	The final Weight of that state.
Raises:	`IndexError` – State index out of range.

from_bytes(s)¶

Returns the FST represented by the bytes object.

Parameters:	s (bytes) – The bytes object representing the FST.
Returns:	An FST object.

input_symbols()¶

Returns the input symbol table.

Returns:	The input symbol table.

See also: output_symbols.

num_arcs(state=None)¶

Returns the number of arcs, counting them if necessary.

If state is None, returns the number of arcs in the FST. Otherwise, returns the number of arcs leaving that state.

Parameters:	state – The integer index of a state. Defaults to `None`.
Returns:	The number of arcs leaving a state or the number of arcs in the FST.

Note: This method counts the number of arcs in the FST by iterating over the states and summing up the number of arcs leaving each state.

Raises:	`IndexError` – State index out of range.

See also: num_states.

num_input_epsilons(state)¶

Returns the number of arcs with epsilon input labels leaving a state.

Parameters:	state – The integer index of a state.
Returns:	The number of epsilon-input-labeled arcs leaving that state.
Raises:	`IndexError` – State index out of range.

See also: num_output_epsilons.

num_output_epsilons(state)¶

Returns the number of arcs with epsilon output labels leaving a state.

Parameters:	state – The integer index of a state.
Returns:	The number of epsilon-output-labeled arcs leaving that state.
Raises:	`IndexError` – State index out of range.

See also: num_input_epsilons.

num_states()¶

Returns the number of states, counting them if necessary.

Returns:	The number of states.

See also: num_arcs.

output_symbols()¶

Returns the output symbol table.

Returns:	The output symbol table.

See also: input_symbols.

properties(mask, test)¶

Provides property bits.

This method provides user access to the properties attributes for the FST. The resulting value is a long integer, but when it is cast to a boolean, it represents whether or not the FST has the mask property.

Parameters:	mask – The property mask to be compared to the FST’s properties. test – Should any unknown values be computed before comparing against the mask?
Returns:	A 64-bit bitmask representing the requested properties.

read(filename)¶

Reads an FST from a file.

Parameters:	filename (str) – The location of the input file.
Returns:	An FST object.
Raises:	`RuntimeError` – Read failed.

read_from_stream(strm, ropts)¶

Reads an FST from an input stream.

Parameters:	strm (istream) – The input stream to read from. ropts (FstReadOptions) – FST reading options.
Returns:	An FST object.
Raises:	`RuntimeError` – Read failed.

start()¶

Returns the start state.

Returns:	The start state if start state is set, -1 otherwise.

states()¶

Returns an iterator over all states in the FST.

Returns:	A StateIterator object for the FST.

See also: arcs, mutable_arcs.

text(isymbols=None, osymbols=None, ssymbols=None, acceptor=False, show_weight_one=False, missing_symbol='')¶

Produces a human-readable string representation of the FST.

This method generates a human-readable string representation of the FST. The caller may optionally specify SymbolTables used to label input labels, output labels, or state labels, respectively.

Parameters:

isymbols – An optional symbol table used to label input symbols.
osymbols – An optional symbol table used to label output symbols.
ssymbols – An optional symbol table used to label states.
acceptor (bool) – Should the FST be rendered in acceptor format if possible? Defaults to False.
show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults to False.
missing_symbol – The string to be printed when symbol table lookup fails.

Returns:

A formatted string representing the FST.

to_bytes()¶

Returns a bytes object representing the FST.

Returns:	A bytes object.

type()¶

Returns the FST type.

Returns:	The FST type.

verify()¶

Verifies that an FST’s contents are sane.

Returns:	True if the contents are sane, False otherwise.

write(filename)¶

Serializes FST to a file.

This method writes the FST to a file in a binary format.

Parameters:	filename (str) – The location of the output file.
Raises:	`IOError` – Write failed.

write_to_stream(strm, wopts)¶

Serializes FST to an output stream.

Parameters:	strm (ostream) – The output stream to write to. wopts (FstWriteOptions) – FST writing options.
Returns:	True if write was successful, False otherwise.
Raises:	`RuntimeError` – Write failed.

class kaldi.fstext.KwsIndexConstFstArcIterator(fst, state)[source]¶

Arc iterator for a constant FST over the KWS index semiring.

This class is used for iterating over the arcs leaving some state. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the arcs method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.

Creates a new arc iterator.

Parameters:	fst – The fst. state – The state index.
Raises:	`IndexError` – State index out of range.

done()¶

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	True if the iterator is exhausted, False otherwise.

flags()¶

Returns the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	The current iterator behavioral flags as an integer.

next()¶

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

position()¶

Returns the position of the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	The iterator’s position, expressed as an integer.

reset()¶

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

seek(a)¶

Advance the iterator to a new position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:	a (int) – The position to seek to.

set_flags(flags, mask)¶

Sets the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:	flags (int) – The properties to be set. mask (int) – A mask to be applied to the `flags` argument before setting them.

value()¶

Returns the current arc.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

class kaldi.fstext.KwsIndexConstFstStateIterator(fst)[source]¶

State iterator for a constant FST over the KWS index semiring.

This class is used for iterating over the states. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the states method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.

Creates a new state iterator.

Parameters:	fst – The fst.

done()¶

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	True if the iterator is exhausted, False otherwise.

next()¶

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

reset()¶

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

value()¶

Returns the current state index.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

class kaldi.fstext.KwsIndexEncodeMapper(encode_labels=False, encode_weights=False, encode=True)[source]¶

Arc encoder for an FST over the KWS index semiring.

This class provides an object which can be used to encode or decode FST arcs. This is most useful for converting an FST to an unweighted acceptor, on which some FST operations are more efficient, and then decoding the FST afterwards.

To use an instance of this class to encode or decode a mutable FST, pass it as the first argument to the FST instance methods encode and decode. Alternatively, an instance of this class can be used as a callable to encode/decode arcs.

Parameters:	encode_labels (bool) – Should labels be encoded? encode_weights (bool) – Should weights be encoded? encode (bool) – Encode or decode?

flags() → int¶: Returns encoder flags.

from_other(mapper:KwsIndexEncodeMapper) → KwsIndexEncodeMapper¶: Creates a new encoder with the contents of another.

from_other_with_type(mapper:KwsIndexEncodeMapper, type:EncodeType) → KwsIndexEncodeMapper¶: Creates a new encoder with the contents of another and given type.

input_symbols() → SymbolTable¶: Returns input symbol table.

output_symbols() → SymbolTable¶: Returns output symbol table.

properties(inprops:int) → int¶

Provides property bits.

This method provides user access to the properties attributes for the encoder. The resulting value is a long integer, but when it is cast to a boolean, it indicates whether or not any of the requested property bits are set.

Parameters:	inprops – The property mask to be compared to the encoder’s properties.
Returns:	A 64-bit bitmask representing the requested properties.

read(filename:str, type:EncodeType=default) → KwsIndexEncodeMapper¶: Reads encoder from file.

set_input_symbols(syms:SymbolTable)¶

Sets the input symbol table.

Parameters:	syms – A SymbolTable.

See also: set_output_symbols.

set_output_symbols(syms:SymbolTable)¶

Sets the output symbol table.

Parameters:	syms – A SymbolTable.

See also: set_input_symbols.

type() → EncodeType¶: Returns encoder type.

write(filename:str) → bool¶

Writes encoder to file.

Returns:	True if write was successful, False otherwise.

class kaldi.fstext.KwsIndexEncodeTable¶

Encode table for KwsIndexArc.

KwsIndexEncodeTable(flags):: Creates a new encode table with the given flags.

class Tuple¶

KwsIndexArc encoding tuple.

ilabel¶: Input label.

olabel¶: Output label.

weight¶: Weight.

decode(key:int) → Tuple¶: Decodes an encoded arc label back to labels and cost.

encode(arc:KwsIndexArc) → int¶: Encodes the given arc (either labels or weights or both).

flags() → int¶: Returns encoding flags.

get_label(arc:KwsIndexArc) → int¶

Looks up the encoded label for the given arc.

Returns -1 if arc is not found.

input_symbols() → SymbolTable¶: Returns input symbols.

output_symbols() → SymbolTable¶: Returns output symbols.

read(strm:istream, source:str) → KwsIndexEncodeTable¶: Reads encode table from input stream.

set_input_symbols(syms:SymbolTable)¶: Sets input symbols.

set_output_symbols(syms:SymbolTable)¶: Sets output symbols.

size() → int¶: Returns the size of the table.

write(strm:ostream, source:str) → bool¶: Writes table to output stream.
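The relationship between encode, get_label and decode can be illustrated with a dictionary-backed table. This is a pure-Python sketch, not the PyKaldi implementation: `SketchEncodeTable`, `Arc`, `Triple` and the two flag values are hypothetical names and values chosen for illustration, and weights are plain floats rather than KWS index weights.

```python
from collections import namedtuple

# Hypothetical stand-ins; these are not the PyKaldi classes.
Arc = namedtuple("Arc", "ilabel olabel weight nextstate")
Triple = namedtuple("Triple", "ilabel olabel weight")

ENCODE_LABELS = 0x1    # assumed flag values, for illustration only
ENCODE_WEIGHTS = 0x2


class SketchEncodeTable:
    def __init__(self, flags):
        self._flags = flags
        self._key_of = {}    # Triple -> encoded label
        self._triples = []   # position (encoded label - 1) -> Triple

    def _triple(self, arc):
        # Only the parts selected by the flags take part in the encoding.
        olabel = arc.olabel if self._flags & ENCODE_LABELS else 0
        weight = arc.weight if self._flags & ENCODE_WEIGHTS else None
        return Triple(arc.ilabel, olabel, weight)

    def encode(self, arc):
        t = self._triple(arc)
        if t not in self._key_of:
            self._triples.append(t)
            # Encoded labels start at 1 so that 0 stays free for epsilon.
            self._key_of[t] = len(self._triples)
        return self._key_of[t]

    def get_label(self, arc):
        # Lookup only; never adds a new entry.
        return self._key_of.get(self._triple(arc), -1)

    def decode(self, key):
        return self._triples[key - 1]

    def size(self):
        return len(self._triples)


table = SketchEncodeTable(ENCODE_LABELS | ENCODE_WEIGHTS)
arc = Arc(ilabel=3, olabel=7, weight=0.5, nextstate=1)
key = table.encode(arc)
```

Encoding the same arc twice yields the same key, while get_label on an arc that was never encoded returns -1 without growing the table.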

class kaldi.fstext.KwsIndexFstCompiler(isymbols=None, osymbols=None, ssymbols=None, acceptor=False, keep_isymbols=False, keep_osymbols=False, keep_state_numbering=False, allow_negative_labels=False)[source]¶

Compiler for FSTs over the KWS index semiring.

This class is used to compile FSTs specified using the AT&T FSM library format described here:

http://web.eecs.umich.edu/~radev/NLP-fall2015/resources/fsm_archive/fsm.5.html

This is the same format used by the fstcompile executable.

FstCompiler options (symbol tables, etc.) are set at construction time:

compiler = FstCompiler(isymbols=ascii_syms, osymbols=ascii_syms)

Once constructed, FstCompiler instances behave like a file handle opened for writing:

# /ba+/
print("0 1 50 50", file=compiler)
print("1 2 49 49", file=compiler)
print("2 2 49 49", file=compiler)
print("2", file=compiler)

The compile method returns an actual FST instance:

sheep_machine = compiler.compile()

Compilation flushes the internal buffer, so the compiler instance can be reused to compile new machines with the same symbol tables, etc.

Parameters:

isymbols – An optional SymbolTable used to label input symbols.
osymbols – An optional SymbolTable used to label output symbols.
ssymbols – An optional SymbolTable used to label states.
acceptor – Should the FST be rendered in acceptor format if possible?
keep_isymbols – Should the input symbol table be stored in the FST?
keep_osymbols – Should the output symbol table be stored in the FST?
keep_state_numbering – Should the state numbering be preserved?
allow_negative_labels – Should negative labels be allowed? (Not recommended; may cause conflicts).
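To make the text format concrete, the lines written to the compiler in the example above can be read by a few lines of plain Python. This is a sketch only: `parse_att` is a hypothetical helper, it handles just the transducer form with integer labels, and it treats weights as floats with 0.0 standing in for a missing weight (the tropical convention, which is an assumption here and not how KWS index weights are written).

```python
def parse_att(lines):
    """Parse AT&T FSM text lines into (start, arcs, finals).

    Arc lines:   src dst ilabel olabel [weight]
    Final lines: state [weight]
    The source state of the first line is taken as the start state.
    """
    start, arcs, finals = None, [], {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if start is None:
            start = int(fields[0])
        if len(fields) >= 4:
            src, dst, ilabel, olabel = map(int, fields[:4])
            weight = float(fields[4]) if len(fields) > 4 else 0.0
            arcs.append((src, dst, ilabel, olabel, weight))
        else:
            weight = float(fields[1]) if len(fields) > 1 else 0.0
            finals[int(fields[0])] = weight
    return start, arcs, finals


start, arcs, finals = parse_att(["0 1 50 50", "1 2 49 49", "2 2 49 49", "2"])
```

The four lines describe three arcs and one final state, which is the structure the compile method turns into an FST.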

compile()¶

Compiles the FST in the string buffer.

This method compiles the FST and returns the resulting machine.

Returns:	The FST described by the string buffer.
Raises:	`RuntimeError` – Compilation failed.

write(expression)¶

Writes a string into the compiler string buffer.

This method adds a line to the compiler string buffer. It can also be invoked with a print call, like so:

compiler = FstCompiler()
print("0 0 49 49", file=compiler)
print("0", file=compiler)

Parameters:	expression – A string expression to add to compiler string buffer.

class kaldi.fstext.KwsIndexVectorFst(fst=None)[source]¶

Vector FST over the KWS index semiring.

Parameters:	fst (KwsIndexFst) – The input FST over the KWS index semiring. If provided, its contents are used for initializing the new FST. Defaults to `None`.

add_arc(state, arc)¶

Adds a new arc to the FST and returns self.

Parameters:	state – The integer index of the source state. arc – The arc to add.
Returns:	self.
Raises:	`IndexError` – State index out of range.

See also: add_state.

add_state()¶

Adds a new state to the FST and returns the state ID.

Returns:	The integer index of the new state.

See also: add_arc, set_start, set_final.

arcs(state)¶

Returns an iterator over arcs leaving the specified state.

Parameters:	state – The source state index.
Returns:	An ArcIterator.

See also: mutable_arcs, states.

arcsort(sort_type='ilabel')¶

Sorts arcs leaving each state of the FST.

This operation destructively sorts arcs leaving each state using either input or output labels.

Parameters:	sort_type – Either “ilabel” (sort arcs according to input labels) or “olabel” (sort arcs according to output labels).
Returns:	self.
Raises:	`ValueError` – Unknown sort type.

See also: topsort.

closure(closure_plus=False)¶

Computes concatenative closure.

This operation destructively converts the FST to its concatenative closure. If A transduces string x to y with weight a, then the closure transduces x to y with weight a, xx to yy with weight a otimes a, xxx to yyy with weight a otimes a otimes a, and so on. The empty string is also transduced to itself with semiring One if closure_plus is False.

Parameters:	closure_plus – If True, do not accept the empty string.
Returns:	self.
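The weights assigned by the closure can be worked through numerically. The sketch below assumes a tropical weight, where otimes is ordinary addition and semiring One is 0.0, and an FST that accepts a single non-empty string x with weight a; `closure_weight` is a hypothetical helper, not part of PyKaldi.

```python
def closure_weight(a, n, closure_plus=False):
    """Weight the closure assigns to x repeated n times (tropical semiring).

    Returns None when the repetition is not accepted.
    """
    one = 0.0  # tropical One
    if n == 0:
        # The empty string is accepted with weight One unless closure_plus is set.
        return None if closure_plus else one
    weight = one
    for _ in range(n):
        weight = weight + a  # tropical otimes is addition
    return weight


three_copies = closure_weight(1.5, 3)
empty_star = closure_weight(1.5, 0)
empty_plus = closure_weight(1.5, 0, closure_plus=True)
```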

concat(ifst)¶

Computes the concatenation (product) of two FSTs.

This operation destructively concatenates the FST with a second FST. If A transduces string x to y with weight a and B transduces string w to v with weight b, then their concatenation transduces string xw to yv with weight a otimes b.

Parameters:	ifst – The second input FST.
Returns:	self.

connect()¶

Removes unsuccessful paths.

This operation destructively trims the FST, removing states and arcs that are not part of any successful path.

Returns:	self.
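Trimming amounts to keeping the states that are both reachable from the start state and able to reach a final state. The following pure-Python sketch shows that idea on a bare state graph; the function name and the (src, dst) arc representation are assumptions for illustration, and it does not reflect how the library implements the operation.

```python
def connect(start, finals, arcs):
    """Return the states lying on some path from start to a final state.

    arcs is an iterable of (src, dst) pairs.
    """
    forward, backward = {}, {}
    for src, dst in arcs:
        forward.setdefault(src, []).append(dst)
        backward.setdefault(dst, []).append(src)

    def reach(seeds, edges):
        # Depth-first search collecting every state reachable from the seeds.
        seen, stack = set(seeds), list(seeds)
        while stack:
            state = stack.pop()
            for nxt in edges.get(state, []):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    accessible = reach([start], forward)
    coaccessible = reach(finals, backward)
    return accessible & coaccessible


# State 3 is a dead end and state 4 cannot be reached from the start state.
kept = connect(0, [2], [(0, 1), (1, 2), (1, 3), (4, 2)])
```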

copy()¶

Makes a copy of the FST.

Returns:	A copy of the FST.

decode(encoder)¶

Decodes encoded labels and/or weights.

This operation reverses the encoding performed by encode.

Parameters:	encoder – An EncodeMapper object used to encode the FST.
Returns:	self.

See also: encode.

delete_arcs(state, n=None)¶

Deletes arcs leaving a particular state.

Parameters:	state – The integer index of a state. n – An optional argument indicating how many arcs to delete. If this argument is None, all arcs from this state are deleted.
Returns:	self.
Raises:	`IndexError` – State index out of range.

See also: delete_states.

delete_states(states=None)¶

Deletes states.

Parameters:	states – An optional iterable of integer indices of the states to be deleted. If this argument is omitted, all states are deleted.
Returns:	self.
Raises:	`IndexError` – State index out of range.

See also: delete_arcs.

draw(filename, isymbols=None, osymbols=None, ssymbols=None, acceptor=False, title='', width=8.5, height=11, portrait=False, vertical=False, ranksep=0.4, nodesep=0.25, fontsize=14, precision=5, float_format='g', show_weight_one=False)¶

Writes out the FST in Graphviz text format.

This method writes out the FST in the dot graph description language. The graph can be rendered using the dot binary provided by Graphviz.

Parameters:

filename (str) – The string location of the output dot/Graphviz file.
isymbols – An optional symbol table used to label input symbols.
osymbols – An optional symbol table used to label output symbols.
ssymbols – An optional symbol table used to label states.
acceptor (bool) – Should the figure be rendered in acceptor format if possible? Defaults to False.
title (str) – An optional string indicating the figure title. Defaults to empty string.
width (float) – The figure width, in inches. Defaults to 8.5".
height (float) – The figure height, in inches. Defaults to 11".
portrait (bool) – Should the figure be rendered in portrait rather than landscape? Defaults to False.
vertical (bool) – Should the figure be rendered bottom-to-top rather than left-to-right?
ranksep (float) – The minimum separation between ranks, in inches. Defaults to 0.4".
nodesep (float) – The minimum separation between nodes, in inches. Defaults to 0.25".
fontsize (int) – Font size, in points. Defaults to 14pt.
precision (int) – Numeric precision for floats, in number of chars. Defaults to 5.
float_format ('e', 'f' or 'g') – One of: ‘e’, ‘f’ or ‘g’. Defaults to ‘g’.
show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults to False.

For more information about the rendering options, see man dot.

See also: text.

encode(encoder)¶

Encodes labels and/or weights.

This operation allows for the representation of a weighted transducer as a weighted acceptor, an unweighted transducer, or an unweighted acceptor by considering the pair (input label, output label), the pair (input label, weight), or the triple (input label, output label, weight) as a single label. Applying this operation mutates the EncodeMapper argument, which can then be used to decode.

Parameters:	encoder – An EncodeMapper object used to encode the FST.
Returns:	self.

See also: decode.

final(state)¶

Returns the final weight of a state.

Parameters:	state – The integer index of a state.
Returns:	The final Weight of that state.
Raises:	`IndexError` – State index out of range.

from_bytes(s)¶

Returns the FST represented by the bytes object.

Parameters:	s (bytes) – The bytes object representing the FST.
Returns:	An FST object.

input_symbols()¶

Returns the input symbol table.

Returns:	The input symbol table.

See also: output_symbols.

invert()¶

Inverts the FST’s transduction.

This operation destructively inverts the FST’s transduction by exchanging input and output labels.

Returns:	self.

minimize(delta=0.0009765625, allow_nondet=False)¶

Minimizes the FST.

This operation destructively performs the minimization of deterministic weighted automata and transducers. If the input FST A is an acceptor, this operation produces the minimal acceptor B equivalent to A, i.e. the acceptor with a minimal number of states that is equivalent to A. If the input FST A is a transducer, this operation internally builds an equivalent transducer with a minimal number of states. However, this minimality is obtained by allowing transitions to have strings of symbols as output labels; this is known in the literature as a real-time transducer. Such transducers are not directly supported by the library. This function converts such transducers by expanding each string-labeled transition into a sequence of transitions. This results in the creation of new states, hence losing the minimality property.

Parameters:	delta – Comparison/quantization delta (default: 0.0009765625). allow_nondet – Attempt minimization of non-deterministic FST?
Returns:	self.

mutable_arcs(state)¶

Returns a mutable iterator over arcs leaving the specified state.

Parameters:	state – The source state index.
Returns:	A MutableArcIterator.

See also: arcs, states.

num_arcs(state=None)¶

Returns the number of arcs, counting them if necessary.

If state is None, returns the number of arcs in the FST. Otherwise, returns the number of arcs leaving that state.

Parameters:	state – The integer index of a state. Defaults to `None`.
Returns:	The number of arcs leaving a state or the number of arcs in the FST.

Note: This method counts the number of arcs in the FST by iterating over the states and summing up the number of arcs leaving each state.

Raises:	`IndexError` – State index out of range.

See also: num_states.

num_input_epsilons(state)¶

Returns the number of arcs with epsilon input labels leaving a state.

Parameters:	state – The integer index of a state.
Returns:	The number of epsilon-input-labeled arcs leaving that state.
Raises:	`IndexError` – State index out of range.

See also: num_output_epsilons.

num_output_epsilons(state)¶

Returns the number of arcs with epsilon output labels leaving a state.

Parameters:	state – The integer index of a state.
Returns:	The number of epsilon-output-labeled arcs leaving that state.
Raises:	`IndexError` – State index out of range.

See also: num_input_epsilons.

num_states()¶

Returns the number of states, counting them if necessary.

Returns:	The number of states.

See also: num_arcs.

output_symbols()¶

Returns the output symbol table.

Returns:	The output symbol table.

See also: input_symbols.

project(project_output=False)¶

Converts the FST to an acceptor using input or output labels.

This operation destructively projects an FST onto its domain or range by either copying each arc’s input label to its output label (the default) or vice versa.

Parameters:	project_output – Project onto output labels?
Returns:	self.
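What projection does to each arc can be shown on plain tuples. This is a minimal sketch: `Arc` is a hypothetical namedtuple standing in for the real arc type, and the function returns new arcs rather than mutating an FST in place.

```python
from collections import namedtuple

# Hypothetical stand-in for an FST arc.
Arc = namedtuple("Arc", "ilabel olabel weight nextstate")


def project(arcs, project_output=False):
    """Copy one label onto the other so every arc has ilabel == olabel."""
    if project_output:
        # Project onto the range: the output label overwrites the input label.
        return [arc._replace(ilabel=arc.olabel) for arc in arcs]
    # Project onto the domain (the default): input label overwrites output label.
    return [arc._replace(olabel=arc.ilabel) for arc in arcs]


arcs = [Arc(1, 4, 0.0, 1), Arc(2, 5, 0.5, 2)]
on_input = project(arcs)
on_output = project(arcs, project_output=True)
```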

properties(mask, test)¶

Provides property bits.

This method provides user access to the properties attributes for the FST. The resulting value is a long integer, but when it is cast to a boolean, it represents whether or not the FST has the mask property.

Parameters:	mask – The property mask to be compared to the FST’s properties. test – Should any unknown values be computed before comparing against the mask?
Returns:	A 64-bit bitmask representing the requested properties.
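The remark about casting the result to a boolean follows from ordinary bitmask arithmetic. In the sketch below the property bits and the helper are hypothetical; the real property constants and their values are not given here.

```python
# Hypothetical property bits, chosen for illustration only.
ACCEPTOR = 0x1
NOT_ACCEPTOR = 0x2
I_DETERMINISTIC = 0x4


def properties(stored, mask):
    """Return the stored property bits selected by mask."""
    return stored & mask


stored = ACCEPTOR | I_DETERMINISTIC

# A single-bit mask yields a value that is truthy exactly when the bit is set.
is_acceptor = bool(properties(stored, ACCEPTOR))
is_not_acceptor = bool(properties(stored, NOT_ACCEPTOR))

# A multi-bit mask returns every requested bit that is set.
both = properties(stored, ACCEPTOR | I_DETERMINISTIC)
```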

prune(weight=None, nstate=-1, delta=0.0009765625)¶

Removes paths with weights below a certain threshold.

This operation deletes states and arcs in the input FST that do not belong to a successful path whose weight is no more (w.r.t. the natural semiring order) than the threshold otimes the weight of the shortest path in the input FST. Weights must be commutative and have the path property.

Parameters:	weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired weight threshold below which paths are pruned; if None, no paths are pruned. nstate – State number threshold (default: -1). delta – Comparison/quantization delta (default: 0.0009765625).
Returns:	self.

See also: The constructive variant.
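The pruning criterion can be checked on a handful of path weights. This sketch assumes tropical weights, where otimes is addition and a smaller weight is better, and it works on an explicit dictionary of paths; the real operation works on states and arcs and never enumerates paths. The name `prune_paths` is hypothetical.

```python
def prune_paths(path_weights, threshold):
    """Keep paths whose weight is at most threshold otimes the best weight."""
    best = min(path_weights.values())
    limit = best + threshold  # tropical otimes is addition
    return {path: w for path, w in path_weights.items() if w <= limit}


paths = {"a b": 1.0, "a c": 2.5, "d": 4.0}
kept = prune_paths(paths, 2.0)
```

With a best weight of 1.0 and a threshold of 2.0 the limit is 3.0, so the path of weight 4.0 is dropped.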

push(to_final=False, delta=0.0009765625, remove_total_weight=False)¶

Pushes weights towards the initial or final states.

This operation destructively produces an equivalent transducer by pushing the weights towards the initial state or toward the final states. When pushing weights towards the initial state, the sum of the weight of the outgoing transitions and final weight at any non-initial state is equal to one in the resulting machine. When pushing weights towards the final states, the sum of the weight of the incoming transitions at any state is equal to one. Weights need to be left distributive when pushing towards the initial state and right distributive when pushing towards the final states.

Parameters:	to_final – Push towards final states? delta – Comparison/quantization delta (default: 0.0009765625). remove_total_weight – If pushing weights, should the total weight be removed?
Returns:	self.

See also: The constructive variant, which also supports label pushing.

read(filename)¶

Reads an FST from a file.

Parameters:	filename (str) – The location of the input file.
Returns:	An FST object.
Raises:	`RuntimeError` – Read failed.

read_from_stream(strm, ropts)¶

Reads an FST from an input stream.

Parameters:	strm (istream) – The input stream to read from. ropts (FstReadOptions) – FST reading options.
Returns:	An FST object.
Raises:	`RuntimeError` – Read failed.

relabel(ipairs=None, opairs=None)¶

Replaces input and/or output labels using pairs of labels.

This operation destructively relabels the input and/or output labels of the FST using pairs of the form (old_ID, new_ID); omitted indices are identity-mapped.

Parameters:	ipairs – An iterable containing (old index, new index) integer pairs. opairs – An iterable containing (old index, new index) integer pairs.
Returns:	self.
Raises:	`ValueError` – No relabeling pairs specified.
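The identity mapping of omitted labels is easiest to see on a plain list of labels. The sketch below is an illustration under that simplification; `relabel` here is a hypothetical free function that takes one list of pairs, not the FST method.

```python
def relabel(labels, pairs):
    """Apply (old, new) pairs to a list of labels; other labels are kept."""
    if not pairs:
        raise ValueError("No relabeling pairs specified.")
    mapping = dict(pairs)
    # Labels without a pair fall back to themselves (identity mapping).
    return [mapping.get(label, label) for label in labels]


relabeled = relabel([1, 2, 3, 2], [(2, 5)])
```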

relabel_tables(old_isymbols=None, new_isymbols=None, unknown_isymbol='', attach_new_isymbols=True, old_osymbols=None, new_osymbols=None, unknown_osymbol='', attach_new_osymbols=True)¶

Replaces input and/or output labels using SymbolTables.

This operation destructively relabels the input and/or output labels of the FST using user-specified symbol tables; omitted symbols are identity-mapped.

Parameters:	old_isymbols – The old SymbolTable for input labels, defaulting to the FST’s input symbol table. new_isymbols – A SymbolTable used to relabel the input labels. unknown_isymbol – Input symbol to use to relabel OOVs (if empty, OOVs raise an exception). attach_new_isymbols – Should new_isymbols be made the FST’s input symbol table? old_osymbols – The old SymbolTable for output labels, defaulting to the FST’s output symbol table. new_osymbols – A SymbolTable used to relabel the output labels. unknown_osymbol – Output symbol to use to relabel OOVs (if empty, OOVs raise an exception). attach_new_osymbols – Should new_osymbols be made the FST’s output symbol table?
Returns:	self.
Raises:	`ValueError` – No SymbolTable specified.

See also: decode, encode, project, relabel.

reserve_arcs(state, n)¶

Reserves n arcs at a particular state (best effort).

Parameters:	state – The integer index of a state. n – The number of arcs to reserve.
Returns:	self.
Raises:	`IndexError` – State index out of range.

See also: reserve_states.

reserve_states(n)¶

Reserves n states (best effort).

Parameters:	n – The number of states to reserve.
Returns:	self.

See also: reserve_arcs.

reweight(potentials, to_final=False)¶

Reweights an FST using an iterable of potentials.

This operation destructively reweights an FST according to the potentials and in the direction specified by the user. An arc of weight w, with an origin state of potential p and destination state of potential q, is reweighted by p^{-1} otimes (w otimes q) when reweighting towards the initial state, and by (p otimes w) otimes q^{-1} when reweighting towards the final states. The weights must be left distributive when reweighting towards the initial state and right distributive when reweighting towards the final states (e.g., TropicalWeight and LogWeight).

Parameters:	potentials – An iterable of TropicalWeights. to_final – Push towards final states?
Returns:	self.
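The two reweighting formulas become simple arithmetic in the tropical semiring, where otimes is addition and the inverse is negation. The sketch below applies them to a single arc; `reweight_arc` is a hypothetical helper and plain floats stand in for weights.

```python
def reweight_arc(w, p, q, to_final=False):
    """Reweight one arc of weight w from potential p to potential q.

    Tropical semiring: otimes is +, and the inverse of x is -x.
    """
    if to_final:
        return (p + w) - q   # (p otimes w) otimes q^{-1}
    return -p + (w + q)      # p^{-1} otimes (w otimes q)


towards_initial = reweight_arc(2.0, p=0.5, q=1.0)
towards_final = reweight_arc(2.0, p=0.5, q=1.0, to_final=True)
```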

rmepsilon(connect=True, weight=None, nstate=-1, delta=0.0009765625)¶

Removes epsilon transitions.

This operation destructively removes epsilon transitions (i.e., those where both input and output labels are epsilon) from an FST.

Parameters:	connect – Should output be trimmed? weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired weight threshold below which paths are pruned; if None, no paths are pruned. nstate – State number threshold (default: -1). delta – Comparison/quantization delta (default: 0.0009765625).
Returns:	self.

See also: The constructive variant, which also supports epsilon removal in reverse (and which may be more efficient).

set_final(state, weight=None)¶

Sets the final weight for a state.

Parameters:	state – The integer index of a state. weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired final weight; if omitted, it is set to semiring One.
Raises:	`IndexError` – State index out of range.

See also: set_start.

set_input_symbols(syms)¶

Sets the input symbol table.

Passing None as a value will delete the input symbol table.

Parameters:	syms – A SymbolTable.
Returns:	self.

See also: set_output_symbols.

set_output_symbols(syms)¶

Sets the output symbol table.

Passing None as a value will delete the output symbol table.

Parameters:	syms – A SymbolTable.
Returns:	self.

See also: set_input_symbols.

set_properties(props, mask)¶

Sets the properties bits.

Parameters:	props (int) – The properties to be set. mask (int) – A mask to be applied to the `props` argument before setting the FST’s properties.
Returns:	self.

set_start(state)¶

Sets the initial state.

Parameters:	state – The integer index of a state.
Returns:	self.
Raises:	`IndexError` – State index out of range.

See also: set_final.

start()¶

Returns the start state.

Returns:	The start state if start state is set, -1 otherwise.

states()¶

Returns an iterator over all states in the FST.

Returns:	A StateIterator object for the FST.

See also: arcs, mutable_arcs.

text(isymbols=None, osymbols=None, ssymbols=None, acceptor=False, show_weight_one=False, missing_symbol='')¶

Produces a human-readable string representation of the FST.

This method generates a human-readable string representation of the FST. The caller may optionally specify SymbolTables used to label input labels, output labels, or state labels, respectively.

Parameters:

isymbols – An optional symbol table used to label input symbols.
osymbols – An optional symbol table used to label output symbols.
ssymbols – An optional symbol table used to label states.
acceptor (bool) – Should the FST be rendered in acceptor format if possible? Defaults to False.
show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults to False.
missing_symbol – The string to be printed when symbol table lookup fails.

Returns:

A formatted string representing the FST.

to_bytes()¶

Returns a bytes object representing the FST.

Returns:	A bytes object.

topsort()¶

Sorts transitions by state IDs.

This operation destructively topologically sorts the FST if it is acyclic; otherwise it remains unchanged. Once sorted, all transitions are from lower state IDs to higher state IDs.

Returns:	self.

See also: arcsort.
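The effect of a topological sort on state numbering can be reproduced on a bare state graph. This sketch uses Kahn's algorithm, swapped in here for illustration and not necessarily what the library uses; the function name and the (src, dst) arc pairs are assumptions.

```python
from collections import deque


def topological_order(num_states, arcs):
    """Return states in topological order, or None if the graph is cyclic."""
    indegree = [0] * num_states
    successors = [[] for _ in range(num_states)]
    for src, dst in arcs:
        successors[src].append(dst)
        indegree[dst] += 1

    queue = deque(s for s in range(num_states) if indegree[s] == 0)
    order = []
    while queue:
        state = queue.popleft()
        order.append(state)
        for nxt in successors[state]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)
    # A cycle leaves some states with a positive in-degree, so they never get queued.
    return order if len(order) == num_states else None


arcs = [(0, 2), (2, 1), (1, 3)]
order = topological_order(4, arcs)

# Renumber states by their position in the order.
new_id = {old: new for new, old in enumerate(order)}
renumbered = [(new_id[src], new_id[dst]) for src, dst in arcs]
```

After renumbering, every arc goes from a lower state ID to a higher one; for a cyclic graph the function returns None, mirroring the FST being left unchanged.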

type()¶

Returns the FST type.

Returns:	The FST type.

union(ifst)¶

Computes the union (sum) of two FSTs.

This operation destructively computes the union (sum) of the FST with a second FST. If A transduces string x to y with weight a and B transduces string w to v with weight b, then their union transduces x to y with weight a and w to v with weight b.

Parameters:	ifst – The second input FST.
Returns:	self.

verify()¶

Verifies that an FST’s contents are sane.

Returns:	True if the contents are sane, False otherwise.

write(filename)¶

Serializes FST to a file.

This method writes the FST to a file in a binary format.

Parameters:	filename (str) – The location of the output file.
Raises:	`IOError` – Write failed.

write_to_stream(strm, wopts)¶

Serializes FST to an output stream.

Parameters:	strm (ostream) – The output stream to write to. wopts (FstWriteOptions) – FST writing options.
Returns:	True if write was successful, False otherwise.
Raises:	`RuntimeError` – Write failed.

class kaldi.fstext.KwsIndexVectorFstArcIterator(fst, state)[source]¶

Arc iterator for a vector FST over the KWS index semiring.

This class is used for iterating over the arcs leaving some state. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the arcs method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.

Creates a new arc iterator.

Parameters:	fst – The fst. state – The state index.
Raises:	`IndexError` – State index out of range.

done()¶

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	True if the iterator is exhausted, False otherwise.

flags()¶

Returns the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	The current iterator behavioral flags as an integer.

next()¶

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

position()¶

Returns the position of the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	The iterator’s position, expressed as an integer.

reset()¶

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

seek(a)¶

Advances the iterator to a new position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:	a (int) – The position to seek to.

set_flags(flags, mask)¶

Sets the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:	flags (int) – The properties to be set. mask (int) – A mask to be applied to the `flags` argument before setting them.

value()¶

Returns the current arc.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

class kaldi.fstext.KwsIndexVectorFstMutableArcIterator(fst, state)[source]¶

Mutable arc iterator for a vector FST over the KWS index semiring.

This class is used for iterating over the arcs leaving some state and optionally replacing them with new ones. In addition to the full C++ API, it also supports the iterator protocol. Calling the __iter__ method of a mutable arc iterator object returns an iterator over (arc, setter) pairs. The setter is a bound method of the mutable arc iterator object that can be used to replace the current arc with a new one. Most users should just call the mutable_arcs method of a vector FST object instead of directly constructing this iterator and take advantage of the Pythonic API, e.g.

for arc, setter in fst.mutable_arcs(0):
    setter(KwsIndexArc(arc.ilabel, 0, arc.weight, arc.nextstate))
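The (arc, setter) pairing can be imitated in a few lines to show why the setter always replaces the arc that was just yielded. This is a pure-Python sketch: `SketchMutableArcIterator` and `Arc` are hypothetical stand-ins, and the arcs live in an ordinary list rather than in an FST.

```python
from collections import namedtuple

# Hypothetical stand-in for an FST arc.
Arc = namedtuple("Arc", "ilabel olabel weight nextstate")


class SketchMutableArcIterator:
    def __init__(self, arcs):
        self._arcs = arcs  # list of arcs, modified in place
        self._pos = 0

    def set_value(self, arc):
        # Replaces the arc at the current position.
        self._arcs[self._pos] = arc

    def __iter__(self):
        self._pos = 0
        while self._pos < len(self._arcs):
            # The bound method sees the current position when it is called.
            yield self._arcs[self._pos], self.set_value
            self._pos += 1


arcs = [Arc(1, 4, 0.0, 1), Arc(2, 5, 0.0, 1)]
for arc, setter in SketchMutableArcIterator(arcs):
    setter(arc._replace(olabel=0))  # set every output label to epsilon
```

Because the position only advances after the loop body has run, calling the setter inside the loop writes to the arc that was just yielded.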

Creates a new arc iterator.

Parameters:	fst – The fst. state – The state index.
Raises:	`IndexError` – State index out of range.

done()¶

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	True if the iterator is exhausted, False otherwise.

flags()¶

Returns the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	The current iterator behavioral flags as an integer.

next()¶

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

position()¶

Returns the position of the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	The iterator’s position, expressed as an integer.

reset()¶

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

seek(a)¶

Advances the iterator to a new position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:	a (int) – The position to seek to.

set_flags(flags, mask)¶

Sets the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:	flags (int) – The properties to be set. mask (int) – A mask to be applied to the `flags` argument before setting them.

set_value(arc)¶

Replaces the current arc with a new arc.

Parameters:	arc – The arc to replace the current arc with.

value()¶

Returns the current arc.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

class kaldi.fstext.KwsIndexVectorFstStateIterator(fst)[source]¶

State iterator for a vector FST over the KWS index semiring.

This class is used for iterating over the states. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the states method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.

Creates a new state iterator.

Parameters:	fst – The fst.

done()¶

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	True if the iterator is exhausted, False otherwise.

next()¶

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

reset()¶

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

value()¶

Returns the current state index.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

class kaldi.fstext.KwsIndexWeight[source]¶

KWS index weight factory.

This class is used for creating new KwsIndexWeight instances.

KwsIndexWeight():: Creates an uninitialized KwsIndexWeight instance.
KwsIndexWeight(weight):: Creates a new KwsIndexWeight instance initialized with the weight.

Parameters:	weight (Tuple[float, Tuple[float, float]] or Tuple[TropicalWeight, KwsTimeWeight] or KwsIndexWeight) – A pair of weight values or another `KwsIndexWeight` instance.

KwsIndexWeight(weight1, weight2):: Creates a new KwsIndexWeight instance initialized with weights.

Parameters:	weight1 (float or TropicalWeight) – The first weight value. weight2 (Tuple[float, float] or KwsTimeWeight) – The second weight value.
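The nested shape of the constructor arguments, a float paired with a pair of floats, can be explored with plain tuples. The sketch below assumes the components combine as in a lexicographic semiring over tropical weights (component-wise otimes, and oplus keeping the lexicographically smaller weight); that behaviour is an assumption and is not stated in this reference. The helper names are hypothetical.

```python
INF = float("inf")

# Nested tuples mirror Tuple[float, Tuple[float, float]].
ONE = (0.0, (0.0, 0.0))    # assumed semiring One
ZERO = (INF, (INF, INF))   # assumed semiring Zero


def times(w1, w2):
    """Component-wise otimes over nested tuples of tropical weights."""
    if isinstance(w1, tuple):
        return tuple(times(a, b) for a, b in zip(w1, w2))
    return w1 + w2  # tropical otimes is addition


def plus(w1, w2):
    """Lexicographic oplus: keep the smaller weight, comparing left to right."""
    return min(w1, w2)


a = (1.0, (0.5, 2.0))
b = (1.0, (0.25, 3.0))

product = times(a, b)
best = plus(a, b)
```

The first components of a and b tie, so the comparison falls through to the second component, where 0.25 beats 0.5 and b is kept.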

from_components(w1:TropicalWeight, w2:KwsTimeWeight) → KwsIndexWeight¶: Creates a new KWS index weight from component weights.

member() → bool¶: Checks if weight is a member of the KWS index semiring.

no_weight() → KwsIndexWeight¶: No weight in KWS index semiring.

one() → KwsIndexWeight¶: One in KWS index semiring.

properties() → int¶: Returns weight properties.

quantize(delta:float=default) → KwsIndexWeight¶: Quantizes the weight.

reverse() → KwsIndexWeight¶: Reverses the weight.

type() → str¶: Returns weight type.

value1¶: The first component weight.

value2¶: The second component weight.

zero() → KwsIndexWeight¶: Zero in KWS index semiring.

class kaldi.fstext.KwsTimeWeight[source]¶

KWS time weight factory.

This class is used for creating new KwsTimeWeight instances.

KwsTimeWeight():: Creates an uninitialized KwsTimeWeight instance.
KwsTimeWeight(weight):: Creates a new KwsTimeWeight instance initialized with the weight.

Parameters:	weight (Tuple[float, float] or KwsTimeWeight) – A pair of weight values or another KwsTimeWeight instance.

KwsTimeWeight(weight1, weight2):: Creates a new KwsTimeWeight instance initialized with the weights.

Parameters:	weight1 (float) – The first weight value. weight2 (float) – The second weight value.

from_components(w1:TropicalWeight, w2:TropicalWeight) → KwsTimeWeight¶: Creates a new KWS time weight from component weights.

member() → bool¶: Checks if weight is a member of the KWS time semiring.

no_weight() → KwsTimeWeight¶: No weight in the KWS time semiring.

one() → KwsTimeWeight¶: One in the KWS time semiring.

properties() → int¶: Returns weight properties.

quantize(delta:float=default) → KwsTimeWeight¶: Quantizes the weight.

reverse() → KwsTimeWeight¶: Reverses the weight.

type() → str¶: Returns weight type.

value1¶: The first component weight.

value2¶: The second component weight.

zero() → KwsTimeWeight¶: Zero in the KWS time semiring.

class kaldi.fstext.LatticeArc[source]¶

FST arc with lattice weight.

LatticeArc():: Creates an uninitialized LatticeArc instance.
LatticeArc(ilabel, olabel, weight, nextstate):: Creates a new LatticeArc instance initialized with the given arguments.

Parameters:	ilabel (int) – The input label. olabel (int) – The output label. weight (LatticeWeight) – The arc weight. nextstate (int) – The destination state for the arc.

from_attrs(ilabel:int, olabel:int, weight:LatticeWeight, nextstate:int) → LatticeArc¶

Creates a new arc with the given attributes.

Parameters:	ilabel (int) – The input label. olabel (int) – The output label. weight (LatticeWeight) – The arc weight. nextstate (int) – The destination state for the arc.

ilabel¶: int – The input label.

nextstate¶: int – The destination state for the arc.

olabel¶: int – The output label.

type() → str¶: Returns arc type.

weight¶: LatticeWeight – The arc weight.

class kaldi.fstext.LatticeConstFst(fst=None)[source]¶

Constant FST over the lattice semiring.

Parameters:	fst (LatticeFst) – The input FST over the lattice semiring. If provided, its contents are used for initializing the new FST. Defaults to `None`.

arcs(state)¶

Returns an iterator over arcs leaving the specified state.

Parameters:	state – The source state index.
Returns:	An ArcIterator.

See also: mutable_arcs, states.

copy()¶

Makes a copy of the FST.

Returns:	A copy of the FST.

draw(filename, isymbols=None, osymbols=None, ssymbols=None, acceptor=False, title='', width=8.5, height=11, portrait=False, vertical=False, ranksep=0.4, nodesep=0.25, fontsize=14, precision=5, float_format='g', show_weight_one=False)¶

Writes out the FST in Graphviz text format.

This method writes out the FST in the dot graph description language. The graph can be rendered using the dot binary provided by Graphviz.

Parameters:

filename (str) – The string location of the output dot/Graphviz file.
isymbols – An optional symbol table used to label input symbols.
osymbols – An optional symbol table used to label output symbols.
ssymbols – An optional symbol table used to label states.
acceptor (bool) – Should the figure be rendered in acceptor format if possible? Defaults to False.
title (str) – An optional string indicating the figure title. Defaults to the empty string.
width (float) – The figure width, in inches. Defaults to 8.5.
height (float) – The figure height, in inches. Defaults to 11.
portrait (bool) – Should the figure be rendered in portrait rather than landscape? Defaults to False.
vertical (bool) – Should the figure be rendered bottom-to-top rather than left-to-right? Defaults to False.
ranksep (float) – The minimum separation between ranks, in inches. Defaults to 0.4.
nodesep (float) – The minimum separation between nodes, in inches. Defaults to 0.25.
fontsize (int) – Font size, in points. Defaults to 14.
precision (int) – Numeric precision for floats, in number of chars. Defaults to 5.
float_format ('e', 'f' or 'g') – One of: ‘e’, ‘f’ or ‘g’. Defaults to ‘g’.
show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults to False.

For more information about the rendering options, see man dot.

See also: text.

final(state)¶

Returns the final weight of a state.

Parameters:	state – The integer index of a state.
Returns:	The final Weight of that state.
Raises:	`IndexError` – State index out of range.

from_bytes(s)¶

Returns the FST represented by the bytes object.

Parameters:	s (bytes) – The bytes object representing the FST.
Returns:	An FST object.

input_symbols()¶

Returns the input symbol table.

Returns:	The input symbol table.

See also: output_symbols.

num_arcs(state=None)¶

Returns the number of arcs, counting them if necessary.

If state is None, returns the number of arcs in the FST. Otherwise, returns the number of arcs leaving that state.

Parameters:	state – The integer index of a state. Defaults to `None`.
Returns:	The number of arcs leaving a state or the number of arcs in the FST.

Note: This method counts the number of arcs in the FST by iterating over the states and summing up the number of arcs leaving each state.

Raises:	`IndexError` – State index out of range.

See also: num_states.
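The counting behaviour described in the note can be sketched in plain Python. The adjacency-list layout and the free function `num_arcs` below are assumptions made for illustration; they are not the PyKaldi implementation.

```python
def num_arcs(arcs_by_state, state=None):
    """Counts arcs in a toy FST stored as a list of per-state arc lists."""
    if state is None:
        # Whole-FST count: visit every state and sum its outgoing arcs.
        return sum(len(arcs) for arcs in arcs_by_state)
    if not 0 <= state < len(arcs_by_state):
        raise IndexError("State index out of range")
    return len(arcs_by_state[state])


# Toy FST with three states; each arc is (ilabel, olabel, nextstate).
toy = [[(1, 1, 1), (2, 2, 2)], [(3, 3, 2)], []]
total = num_arcs(toy)
leaving_state_0 = num_arcs(toy, 0)
```

Because the whole-FST count walks every state, its cost in this sketch grows with the number of states.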

num_input_epsilons(state)¶

Returns the number of arcs with epsilon input labels leaving a state.

Parameters:	state – The integer index of a state.
Returns:	The number of epsilon-input-labeled arcs leaving that state.
Raises:	`IndexError` – State index out of range.

See also: num_output_epsilons.

num_output_epsilons(state)¶

Returns the number of arcs with epsilon output labels leaving a state.

Parameters:	state – The integer index of a state.
Returns:	The number of epsilon-output-labeled arcs leaving that state.
Raises:	`IndexError` – State index out of range.

See also: num_input_epsilons.

num_states()¶

Returns the number of states, counting them if necessary.

Returns:	The number of states.

See also: num_arcs.

output_symbols()¶

Returns the output symbol table.

Returns:	The output symbol table.

See also: input_symbols.

properties(mask, test)¶

Provides property bits.

This method provides user access to the properties attributes for the FST. The resulting value is a long integer, but when it is cast to a boolean, it represents whether or not the FST has the mask property.

Parameters:	mask – The property mask to be compared to the FST’s properties. test – Should any unknown values be computed before comparing against the mask?
Returns:	A 64-bit bitmask representing the requested properties.
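How a returned bitmask behaves when cast to a boolean can be sketched as follows. The bit positions and the helper `properties` are purely illustrative assumptions; the real property constants come from the library, and the sketch does not model the `test` argument.

```python
# Illustrative bit positions only (hypothetical values).
ACCEPTOR = 1 << 0
I_DETERMINISTIC = 1 << 1
ACYCLIC = 1 << 2


def properties(known_props, mask):
    """Returns the subset of `mask` bits that are set in `known_props`."""
    return known_props & mask


fst_props = ACCEPTOR | ACYCLIC

is_acceptor = bool(properties(fst_props, ACCEPTOR))
is_deterministic = bool(properties(fst_props, I_DETERMINISTIC))
both = properties(fst_props, ACCEPTOR | I_DETERMINISTIC)
```

When the mask names several properties, the result keeps only the bits that hold, so comparing the result against the mask tells whether all of them hold.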

read(filename)¶

Reads an FST from a file.

Parameters:	filename (str) – The location of the input file.
Returns:	An FST object.
Raises:	`RuntimeError` – Read failed.

read_from_stream(strm, ropts)¶

Reads an FST from an input stream.

Parameters:	strm (istream) – The input stream to read from. ropts (FstReadOptions) – FST reading options.
Returns:	An FST object.
Raises:	`RuntimeError` – Read failed.

start()¶

Returns the start state.

Returns:	The start state if start state is set, -1 otherwise.

states()¶

Returns an iterator over all states in the FST.

Returns:	A StateIterator object for the FST.

See also: arcs, mutable_arcs.

text(isymbols=None, osymbols=None, ssymbols=None, acceptor=False, show_weight_one=False, missing_symbol='')¶

Produces a human-readable string representation of the FST.

This method generates a human-readable string representation of the FST. The caller may optionally specify SymbolTables used to label input labels, output labels, or state labels, respectively.

Parameters:

isymbols – An optional symbol table used to label input symbols.
osymbols – An optional symbol table used to label output symbols.
ssymbols – An optional symbol table used to label states.
acceptor (bool) – Should the FST be rendered in acceptor format if possible? Defaults to False.
show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults to False.
missing_symbol – The string to be printed when symbol table lookup fails.

Returns:

A formatted string representing the FST.

to_bytes()¶

Returns a bytes object representing the FST.

Returns:	A bytes object.

type()¶

Returns the FST type.

Returns:	The FST type.

verify()¶

Verifies that an FST’s contents are sane.

Returns:	True if the contents are sane, False otherwise.

write(filename)¶

Serializes FST to a file.

This method writes the FST to a file in a binary format.

Parameters:	filename (str) – The location of the output file.
Raises:	`IOError` – Write failed.

write_to_stream(strm, wopts)¶

Serializes FST to an output stream.

Parameters:	strm (ostream) – The output stream to write to. wopts (FstWriteOptions) – FST writing options.
Returns:	True if write was successful, False otherwise.
Raises:	`RuntimeError` – Write failed.

class kaldi.fstext.LatticeConstFstArcIterator(fst, state)[source]¶

Arc iterator for a constant FST over the lattice semiring.

This class is used for iterating over the arcs leaving some state. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the arcs method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.

Creates a new arc iterator.

Parameters:	fst – The fst. state – The state index.
Raises:	`IndexError` – State index out of range.

done()¶

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	True if the iterator is exhausted, False otherwise.

flags()¶

Returns the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	The current iterator behavioral flags as an integer.

next()¶

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

position()¶

Returns the position of the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	The iterator’s position, expressed as an integer.

reset()¶

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

seek(a)¶

Advances the iterator to a new position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:	a (int) – The position to seek to.

set_flags(flags, mask)¶

Sets the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:	flags (int) – The properties to be set. mask (int) – A mask to be applied to the `flags` argument before setting them.

value()¶

Returns the current arc.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

class kaldi.fstext.LatticeConstFstStateIterator(fst)[source]¶

State iterator for a constant FST over the lattice semiring.

This class is used for iterating over the states. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the states method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.

Creates a new state iterator.

Parameters:	fst – The fst.

done()¶

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	True if the iterator is exhausted, False otherwise.

next()¶

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

reset()¶

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

value()¶

Returns the current state index.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

class kaldi.fstext.LatticeEncodeMapper(encode_labels=False, encode_weights=False, encode=True)[source]¶

Arc encoder for an FST over the lattice semiring.

This class provides an object which can be used to encode or decode FST arcs. This is most useful for converting an FST to an unweighted acceptor, on which some FST operations are more efficient, and then decoding the FST afterwards.

To use an instance of this class to encode or decode a mutable FST, pass it as the first argument to the FST instance methods encode and decode. Alternatively, an instance of this class can be used as a callable to encode/decode arcs.

Parameters:	encode_labels (bool) – Should labels be encoded? encode_weights (bool) – Should weights be encoded? encode (bool) – Encode or decode?

flags() → int¶: Returns encoder flags.

from_other(mapper:LatticeEncodeMapper) → LatticeEncodeMapper¶: Creates a new encoder with the contents of another.

from_other_with_type(mapper:LatticeEncodeMapper, type:EncodeType) → LatticeEncodeMapper¶: Creates a new encoder with the contents of another and given type.

input_symbols() → SymbolTable¶: Returns input symbol table.

output_symbols() → SymbolTable¶: Returns output symbol table.

properties(inprops:int) → int¶

Provides property bits.

This method provides user access to the properties attributes for the encoder. The resulting value is a long integer, but when it is cast to a boolean, it represents whether or not the encoder has the mask property.

Parameters:	inprops – The property mask to be compared to the encoder’s properties.
Returns:	A 64-bit bitmask representing the requested properties.

read(filename:str, type:EncodeType=default) → LatticeEncodeMapper¶: Reads encoder from file.

set_input_symbols(syms:SymbolTable)¶

Sets the input symbol table.

Parameters:	syms – A SymbolTable.

See also: set_output_symbols.

set_output_symbols(syms:SymbolTable)¶

Sets the output symbol table.

Parameters:	syms – A SymbolTable.

See also: set_input_symbols.

type() → EncodeType¶: Returns encoder type.

write(filename:str) → bool¶

Writes encoder to file.

Returns:	True if write was successful, False otherwise.

class kaldi.fstext.LatticeEncodeTable¶

Encode table for LatticeArc.

LatticeEncodeTable(flags):: Creates a new encode table with the given flags.

class Tuple¶

LatticeArc encoding tuple.

ilabel¶: Input label.

olabel¶: Output label.

weight¶: Weight.

decode(key:int) → Tuple¶: Decodes an encoded arc label back to labels and cost.

encode(arc:LatticeArc) → int¶: Encodes the given arc (either labels or weights or both).

flags() → int¶: Returns encoding flags.

get_label(arc:LatticeArc) → int¶

Looks up the encoded label for the given arc.

Returns -1 if arc is not found.

input_symbols() → SymbolTable¶: Returns input symbols.

output_symbols() → SymbolTable¶: Returns output symbols.

read(strm:istream, source:str) → LatticeEncodeTable¶: Reads encode table from input stream.

set_input_symbols(syms:SymbolTable)¶: Sets input symbols.

set_output_symbols(syms:SymbolTable)¶: Sets output symbols.

size() → int¶: Returns the size of the table.

write(strm:ostream, source:str) → bool¶: Writes table to output stream.
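The encode, decode, and get_label methods above can be pictured as a two-way lookup between arc tuples and integer labels. The class below is a hypothetical pure-Python sketch, not the PyKaldi type: it always encodes the full (ilabel, olabel, weight) triple, represents weights as plain floats, and assumes that encoded labels start at 1 so that 0 stays free for epsilon.

```python
class EncodeTableSketch:
    """Hypothetical two-way map between arc tuples and integer labels."""

    def __init__(self):
        self._labels = {}  # (ilabel, olabel, weight) -> encoded label
        self._tuples = []  # index (encoded label - 1) -> tuple

    def encode(self, arc):
        # Reuse the label if this tuple was seen before; otherwise add it.
        if arc not in self._labels:
            self._tuples.append(arc)
            self._labels[arc] = len(self._tuples)
        return self._labels[arc]

    def decode(self, key):
        return self._tuples[key - 1]

    def get_label(self, arc):
        # Lookup only: -1 signals that the tuple has not been encoded.
        return self._labels.get(arc, -1)

    def size(self):
        return len(self._tuples)


table = EncodeTableSketch()
first = table.encode((1, 2, 0.5))
second = table.encode((1, 3, 0.5))
repeat = table.encode((1, 2, 0.5))
```

Encoding the same tuple twice yields the same label, which is what lets a decoder recover the original labels and weight afterwards.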

class kaldi.fstext.LatticeFstCompiler(isymbols=None, osymbols=None, ssymbols=None, acceptor=False, keep_isymbols=False, keep_osymbols=False, keep_state_numbering=False, allow_negative_labels=False)[source]¶

Compiler for FSTs over the lattice semiring.

This class is used to compile FSTs specified using the AT&T FSM library format described here:

http://web.eecs.umich.edu/~radev/NLP-fall2015/resources/fsm_archive/fsm.5.html

This is the same format used by the fstcompile executable.

FstCompiler options (symbol tables, etc.) are set at construction time:

compiler = FstCompiler(isymbols=ascii_syms, osymbols=ascii_syms)

Once constructed, FstCompiler instances behave like a file handle opened for writing:

# /ba+/
print("0 1 50 50", file=compiler)
print("1 2 49 49", file=compiler)
print("2 2 49 49", file=compiler)
print("2", file=compiler)

The compile method returns an actual FST instance:

sheep_machine = compiler.compile()

Compilation flushes the internal buffer, so the compiler instance can be reused to compile new machines with the same symbol tables, etc.
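The file-handle behaviour and the buffer flush on compilation can be illustrated with a small stand-in. `TextFstBufferSketch` is a hypothetical class, not FstCompiler: it assumes the transducer form of the text format (arc lines with four or five fields, final-state lines with one or two), treats a missing weight as 0.0, and returns plain Python containers instead of an FST.

```python
class TextFstBufferSketch:
    """Hypothetical file-like buffer for text-format FST lines."""

    def __init__(self):
        self._buffer = []

    def write(self, expression):
        # print(..., file=self) calls this for the text and the newline.
        self._buffer.append(expression)

    def compile(self):
        arcs, finals = [], {}
        for line in "".join(self._buffer).splitlines():
            fields = line.split()
            if not fields:
                continue
            if len(fields) >= 4:
                # Arc line: src dst ilabel olabel [weight]
                src, dst, ilabel, olabel = (int(f) for f in fields[:4])
                weight = float(fields[4]) if len(fields) > 4 else 0.0
                arcs.append((src, ilabel, olabel, weight, dst))
            else:
                # Final-state line: state [weight]
                state = int(fields[0])
                finals[state] = float(fields[1]) if len(fields) > 1 else 0.0
        self._buffer = []  # flush, so the object can be reused
        return arcs, finals


buf = TextFstBufferSketch()
print("0 1 50 50", file=buf)
print("1 2 49 49", file=buf)
print("2 2 49 49", file=buf)
print("2", file=buf)
arcs, finals = buf.compile()
```

A second call to `compile` on the same object starts from an empty buffer, mirroring the reuse described above.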

Parameters:

isymbols – An optional SymbolTable used to label input symbols.
osymbols – An optional SymbolTable used to label output symbols.
ssymbols – An optional SymbolTable used to label states.
acceptor – Should the FST be rendered in acceptor format if possible?
keep_isymbols – Should the input symbol table be stored in the FST?
keep_osymbols – Should the output symbol table be stored in the FST?
keep_state_numbering – Should the state numbering be preserved?
allow_negative_labels – Should negative labels be allowed? (Not recommended; may cause conflicts).

compile()¶

Compiles the FST in the string buffer.

This method compiles the FST and returns the resulting machine.

Returns:	The FST described by the string buffer.
Raises:	`RuntimeError` – Compilation failed.

write(expression)¶

Writes a string into the compiler string buffer.

This method adds a line to the compiler string buffer. It can also be invoked with a print call, like so:

compiler = FstCompiler()
print("0 0 49 49", file=compiler)
print("0", file=compiler)

Parameters:	expression – A string expression to add to compiler string buffer.

class kaldi.fstext.LatticeVectorFst(fst=None)[source]¶

Vector FST over the lattice semiring.

Parameters:	fst (LatticeFst) – The input FST over the lattice semiring. If provided, its contents are used for initializing the new FST. Defaults to `None`.

add_arc(state, arc)¶

Adds a new arc to the FST and returns self.

Parameters:	state – The integer index of the source state. arc – The arc to add.
Returns:	self.
Raises:	`IndexError` – State index out of range.

See also: add_state.

add_state()¶

Adds a new state to the FST and returns the state ID.

Returns:	The integer index of the new state.

See also: add_arc, set_start, set_final.

arcs(state)¶

Returns an iterator over arcs leaving the specified state.

Parameters:	state – The source state index.
Returns:	An ArcIterator.

See also: mutable_arcs, states.

arcsort(sort_type='ilabel')¶

Sorts arcs leaving each state of the FST.

This operation destructively sorts arcs leaving each state using either input or output labels.

Parameters:	sort_type – Either “ilabel” (sort arcs according to input labels) or “olabel” (sort arcs according to output labels).
Returns:	self.
Raises:	`ValueError` – Unknown sort type.

See also: topsort.
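The effect of the two sort types can be sketched on a toy adjacency list. The function below is an illustrative assumption, not the PyKaldi method: arcs are plain (ilabel, olabel, weight, nextstate) tuples, and ties keep their original order because Python's sort is stable.

```python
def arcsort(arcs_by_state, sort_type="ilabel"):
    """Sorts each state's arc list in place by input or output label."""
    if sort_type == "ilabel":
        key = lambda arc: arc[0]
    elif sort_type == "olabel":
        key = lambda arc: arc[1]
    else:
        raise ValueError("Unknown sort type: {!r}".format(sort_type))
    for arcs in arcs_by_state:
        arcs.sort(key=key)
    return arcs_by_state


toy = [[(3, 1, 0.0, 1), (1, 2, 0.0, 1), (2, 3, 0.0, 1)], []]
by_input = [list(arcs) for arcs in arcsort(toy, "ilabel")]
by_output = [list(arcs) for arcs in arcsort(toy, "olabel")]
```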

closure(closure_plus=False)¶

Computes concatenative closure.

This operation destructively converts the FST to its concatenative closure. If A transduces string x to y with weight a, then the closure transduces x to y with weight a, xx to yy with weight a ⊗ a, xxx to yyy with weight a ⊗ a ⊗ a, and so on. The empty string is also transduced to itself with semiring One if closure_plus is False.

Parameters:	closure_plus – If True, do not accept the empty string.
Returns:	self.
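The weights assigned by the closure can be tabulated with a short sketch. It assumes a tropical-style semiring in which ⊗ is ordinary addition and One is 0.0; the helper `closure_weights` is hypothetical and only computes weights, not an FST.

```python
def closure_weights(a, max_repeats, closure_plus=False):
    """Maps a repetition count n to the weight of x repeated n times."""
    weights = {}
    if not closure_plus:
        weights[0] = 0.0  # empty string accepted with semiring One
    for n in range(1, max_repeats + 1):
        weights[n] = n * a  # a (x) a (x) ... (x) a, n times, with (x) = +
    return weights


star = closure_weights(0.5, 3)
plus = closure_weights(0.5, 3, closure_plus=True)
```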

concat(ifst)¶

Computes the concatenation (product) of two FSTs.

This operation destructively concatenates the FST with a second FST. If A transduces string x to y with weight a and B transduces string w to v with weight b, then their concatenation transduces string xw to yv with weight a ⊗ b.

Parameters:	ifst – The second input FST.
Returns:	self.

connect()¶

Removes unsuccessful paths.

This operation destructively trims the FST, removing states and arcs that are not part of any successful path.

Returns:	self.
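Trimming can be sketched as keeping the states that are both reachable from the start state and able to reach a final state. The function below is an illustrative assumption working on bare (source, destination) pairs; unlike the real operation it does not renumber the surviving states.

```python
def connect(start, finals, arcs):
    """Returns the useful states and the arcs between them."""

    def reachable(sources, edges):
        seen, stack = set(sources), list(sources)
        while stack:
            state = stack.pop()
            for src, dst in edges:
                if src == state and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen

    accessible = reachable([start], arcs)
    # Walk the reversed arcs from the final states.
    coaccessible = reachable(finals, [(dst, src) for src, dst in arcs])
    keep = accessible & coaccessible
    kept_arcs = [(s, d) for s, d in arcs if s in keep and d in keep]
    return keep, kept_arcs


# State 3 is a dead end; state 4 cannot be reached from the start state.
states, trimmed = connect(0, {2}, [(0, 1), (1, 2), (1, 3), (4, 2)])
```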

copy()¶

Makes a copy of the FST.

Returns:	A copy of the FST.

decode(encoder)¶

Decodes encoded labels and/or weights.

This operation reverses the encoding performed by encode.

Parameters:	encoder – An EncodeMapper object used to encode the FST.
Returns:	self.

See also: encode.

delete_arcs(state, n=None)¶

Deletes arcs leaving a particular state.

Parameters:	state – The integer index of a state. n – An optional argument indicating how many arcs to be deleted. If this argument is None, all arcs from this state are deleted.
Returns:	self.
Raises:	`IndexError` – State index out of range.

See also: delete_states.

delete_states(states=None)¶

Deletes states.

Parameters:	states – An optional iterable of integer indices of the states to be deleted. If this argument is omitted, all states are deleted.
Returns:	self.
Raises:	`IndexError` – State index out of range.

See also: delete_arcs.

draw(filename, isymbols=None, osymbols=None, ssymbols=None, acceptor=False, title='', width=8.5, height=11, portrait=False, vertical=False, ranksep=0.4, nodesep=0.25, fontsize=14, precision=5, float_format='g', show_weight_one=False)¶

Writes out the FST in Graphviz text format.

This method writes out the FST in the dot graph description language. The graph can be rendered using the dot binary provided by Graphviz.

Parameters:

filename (str) – The string location of the output dot/Graphviz file.
isymbols – An optional symbol table used to label input symbols.
osymbols – An optional symbol table used to label output symbols.
ssymbols – An optional symbol table used to label states.
acceptor (bool) – Should the figure be rendered in acceptor format if possible? Defaults to False.
title (str) – An optional string indicating the figure title. Defaults to the empty string.
width (float) – The figure width, in inches. Defaults to 8.5.
height (float) – The figure height, in inches. Defaults to 11.
portrait (bool) – Should the figure be rendered in portrait rather than landscape? Defaults to False.
vertical (bool) – Should the figure be rendered bottom-to-top rather than left-to-right? Defaults to False.
ranksep (float) – The minimum separation between ranks, in inches. Defaults to 0.4.
nodesep (float) – The minimum separation between nodes, in inches. Defaults to 0.25.
fontsize (int) – Font size, in points. Defaults to 14.
precision (int) – Numeric precision for floats, in number of chars. Defaults to 5.
float_format ('e', 'f' or 'g') – One of: ‘e’, ‘f’ or ‘g’. Defaults to ‘g’.
show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults to False.

For more information about the rendering options, see man dot.

See also: text.

encode(encoder)¶

Encodes labels and/or weights.

This operation allows for the representation of a weighted transducer as a weighted acceptor, an unweighted transducer, or an unweighted acceptor by considering the pair (input label, output label), the pair (input label, weight), or the triple (input label, output label, weight) as a single label. Applying this operation mutates the EncodeMapper argument, which can then be used to decode.

Parameters:	encoder – An EncodeMapper object used to encode the FST.
Returns:	self.

See also: decode.

final(state)¶

Returns the final weight of a state.

Parameters:	state – The integer index of a state.
Returns:	The final Weight of that state.
Raises:	`IndexError` – State index out of range.

from_bytes(s)¶

Returns the FST represented by the bytes object.

Parameters:	s (bytes) – The bytes object representing the FST.
Returns:	An FST object.

input_symbols()¶

Returns the input symbol table.

Returns:	The input symbol table.

See also: output_symbols.

invert()¶

Inverts the FST’s transduction.

This operation destructively inverts the FST’s transduction by exchanging input and output labels.

Returns:	self.

minimize(delta=0.0009765625, allow_nondet=False)¶

Minimizes the FST.

This operation destructively performs the minimization of deterministic weighted automata and transducers. If the input FST A is an acceptor, this operation produces the minimal acceptor B equivalent to A, i.e. the acceptor with a minimal number of states that is equivalent to A. If the input FST A is a transducer, this operation internally builds an equivalent transducer with a minimal number of states. However, this minimality is obtained by allowing transitions to have strings of symbols as output labels; this is known in the literature as a real-time transducer. Such transducers are not directly supported by the library. This function converts such transducers by expanding each string-labeled transition into a sequence of transitions. This results in the creation of new states, hence losing the minimality property.

Parameters:	delta – Comparison/quantization delta (default: 0.0009765625). allow_nondet – Attempt minimization of non-deterministic FST?
Returns:	self.

mutable_arcs(state)¶

Returns a mutable iterator over arcs leaving the specified state.

Parameters:	state – The source state index.
Returns:	A MutableArcIterator.

See also: arcs, states.

num_arcs(state=None)¶

Returns the number of arcs, counting them if necessary.

If state is None, returns the number of arcs in the FST. Otherwise, returns the number of arcs leaving that state.

Parameters:	state – The integer index of a state. Defaults to `None`.
Returns:	The number of arcs leaving a state or the number of arcs in the FST.

Note: This method counts the number of arcs in the FST by iterating over the states and summing up the number of arcs leaving each state.

Raises:	`IndexError` – State index out of range.

See also: num_states.

num_input_epsilons(state)¶

Returns the number of arcs with epsilon input labels leaving a state.

Parameters:	state – The integer index of a state.
Returns:	The number of epsilon-input-labeled arcs leaving that state.
Raises:	`IndexError` – State index out of range.

See also: num_output_epsilons.

num_output_epsilons(state)¶

Returns the number of arcs with epsilon output labels leaving a state.

Parameters:	state – The integer index of a state.
Returns:	The number of epsilon-output-labeled arcs leaving that state.
Raises:	`IndexError` – State index out of range.

See also: num_input_epsilons.

num_states()¶

Returns the number of states, counting them if necessary.

Returns:	The number of states.

See also: num_arcs.

output_symbols()¶

Returns the output symbol table.

Returns:	The output symbol table.

See also: input_symbols.

project(project_output=False)¶

Converts the FST to an acceptor using input or output labels.

This operation destructively projects an FST onto its domain or range by either copying each arc’s input label to its output label (the default) or vice versa.

Parameters:	project_output – Project onto output labels?
Returns:	self.
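Projection amounts to copying one label of each arc onto the other. The sketch below is an assumption for illustration: arcs are (ilabel, olabel, weight, nextstate) tuples, and the function returns a new list rather than mutating in place as the method does.

```python
def project(arcs, project_output=False):
    """Returns arcs whose labels are both the input (or output) label."""
    if project_output:
        return [(o, o, w, n) for (i, o, w, n) in arcs]
    return [(i, i, w, n) for (i, o, w, n) in arcs]


arcs = [(1, 7, 0.5, 1), (2, 8, 1.0, 2)]
onto_input = project(arcs)
onto_output = project(arcs, project_output=True)
```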

properties(mask, test)¶

Provides property bits.

This method provides user access to the properties attributes for the FST. The resulting value is a long integer, but when it is cast to a boolean, it represents whether or not the FST has the mask property.

Parameters:	mask – The property mask to be compared to the FST’s properties. test – Should any unknown values be computed before comparing against the mask?
Returns:	A 64-bit bitmask representing the requested properties.

prune(weight=None, nstate=-1, delta=0.0009765625)¶

Removes paths with weights below a certain threshold.

This operation deletes states and arcs in the input FST that do not belong to a successful path whose weight is no more (w.r.t. the natural semiring order) than the threshold ⊗ the weight of the shortest path in the input FST. Weights must be commutative and have the path property.

Parameters:	weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired weight threshold below which paths are pruned; if None, no paths are pruned. nstate – State number threshold (default: -1). delta – Comparison/quantization delta (default: 0.0009765625).
Returns:	self.

See also: The constructive variant.
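The pruning criterion can be illustrated on already enumerated paths. This sketch assumes a tropical-style semiring (⊗ is addition, smaller weights are better) and uses a hypothetical helper `prune_paths`; the real operation works on states and arcs rather than on a table of paths.

```python
def prune_paths(path_weights, threshold):
    """Keeps paths whose weight is within `threshold` of the best path."""
    best = min(path_weights.values())
    limit = best + threshold  # threshold (x) shortest-path weight
    return {p: w for p, w in path_weights.items() if w <= limit}


paths = {"a": 1.0, "b": 2.5, "c": 4.0}
kept = prune_paths(paths, 2.0)
```

With a best path of weight 1.0 and a threshold of 2.0, only paths weighing at most 3.0 survive.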

push(to_final=False, delta=0.0009765625, remove_total_weight=False)¶

Pushes weights towards the initial or final states.

This operation destructively produces an equivalent transducer by pushing the weights towards the initial state or toward the final states. When pushing weights towards the initial state, the sum of the weight of the outgoing transitions and final weight at any non-initial state is equal to one in the resulting machine. When pushing weights towards the final states, the sum of the weight of the incoming transitions at any state is equal to one. Weights need to be left distributive when pushing towards the initial state and right distributive when pushing towards the final states.

Parameters:	to_final – Push towards final states? delta – Comparison/quantization delta (default: 0.0009765625). remove_total_weight – If pushing weights, should the total weight be removed?
Returns:	self.

See also: The constructive variant, which also supports label pushing.

read(filename)¶

Reads an FST from a file.

Parameters:	filename (str) – The location of the input file.
Returns:	An FST object.
Raises:	`RuntimeError` – Read failed.

read_from_stream(strm, ropts)¶

Reads an FST from an input stream.

Parameters:	strm (istream) – The input stream to read from. ropts (FstReadOptions) – FST reading options.
Returns:	An FST object.
Raises:	`RuntimeError` – Read failed.

relabel(ipairs=None, opairs=None)¶

Replaces input and/or output labels using pairs of labels.

This operation destructively relabels the input and/or output labels of the FST using pairs of the form (old_ID, new_ID); omitted indices are identity-mapped.

Parameters:	ipairs – An iterable containing (old index, new index) integer pairs. opairs – An iterable containing (old index, new index) integer pairs.
Returns:	self.
Raises:	`ValueError` – No relabeling pairs specified.
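The identity mapping of omitted labels can be sketched with dictionaries. The function below is illustrative only: arcs are (ilabel, olabel, weight, nextstate) tuples and a new list is returned instead of mutating the FST.

```python
def relabel(arcs, ipairs=None, opairs=None):
    """Relabels arcs using (old_ID, new_ID) pairs; others are unchanged."""
    if not ipairs and not opairs:
        raise ValueError("No relabeling pairs specified")
    imap = dict(ipairs or [])
    omap = dict(opairs or [])
    return [(imap.get(i, i), omap.get(o, o), w, n) for (i, o, w, n) in arcs]


arcs = [(1, 1, 0.0, 1), (2, 2, 0.0, 2)]
relabeled = relabel(arcs, ipairs=[(1, 10)], opairs=[(2, 20)])
```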

relabel_tables(old_isymbols=None, new_isymbols=None, unknown_isymbol='', attach_new_isymbols=True, old_osymbols=None, new_osymbols=None, unknown_osymbol='', attach_new_osymbols=True)¶

Replaces input and/or output labels using SymbolTables.

This operation destructively relabels the input and/or output labels of the FST using user-specified symbol tables; omitted symbols are identity-mapped.

Parameters:

old_isymbols – The old SymbolTable for input labels, defaulting to the FST’s input symbol table.
new_isymbols – A SymbolTable used to relabel the input labels.
unknown_isymbol – Input symbol to use to relabel OOVs (if empty, OOVs raise an exception).
attach_new_isymbols – Should new_isymbols be made the FST’s input symbol table?
old_osymbols – The old SymbolTable for output labels, defaulting to the FST’s output symbol table.
new_osymbols – A SymbolTable used to relabel the output labels.
unknown_osymbol – Output symbol to use to relabel OOVs (if empty, OOVs raise an exception).
attach_new_osymbols – Should new_osymbols be made the FST’s output symbol table?

Returns:	self.
Raises:	`ValueError` – No SymbolTable specified.

See also: decode, encode, project, relabel.

reserve_arcs(state, n)¶

Reserves n arcs at a particular state (best effort).

Parameters:	state – The integer index of a state. n – The number of arcs to reserve.
Returns:	self.
Raises:	`IndexError` – State index out of range.

See also: reserve_states.

reserve_states(n)¶

Reserves n states (best effort).

Parameters:	n – The number of states to reserve.
Returns:	self.

See also: reserve_arcs.

reweight(potentials, to_final=False)¶

Reweights an FST using an iterable of potentials.

This operation destructively reweights an FST according to the potentials and in the direction specified by the user. An arc of weight w, with an origin state of potential p and destination state of potential q, is reweighted by p^{-1} ⊗ (w ⊗ q) when reweighting towards the initial state, and by (p ⊗ w) ⊗ q^{-1} when reweighting towards the final states. The weights must be left distributive when reweighting towards the initial state and right distributive when reweighting towards the final states (e.g., TropicalWeight and LogWeight).

Parameters:	potentials – An iterable of TropicalWeights. to_final – Push towards final states?
Returns:	self.
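The arc formulas above can be illustrated with a minimal pure-Python sketch. It assumes the tropical semiring, where ⊗ is float addition and the inverse of p is -p. It is not PyKaldi code: `reweight_arcs` and the tuple-based arc representation are hypothetical, and the handling of final weights and of the start state is left out.

```python
# Pure-Python sketch of arc reweighting in the tropical semiring.
# Not PyKaldi code: arcs are plain (source, destination, weight) tuples.
# In the tropical semiring, "times" is float addition and the inverse of p is -p.

def reweight_arcs(arcs, potentials, to_final=False):
    """Return a reweighted copy of the arc list (final weights are ignored)."""
    reweighted = []
    for source, destination, w in arcs:
        p = potentials[source]
        q = potentials[destination]
        if to_final:
            new_w = (p + w) - q      # (p (x) w) (x) q^{-1}
        else:
            new_w = -p + (w + q)     # p^{-1} (x) (w (x) q)
        reweighted.append((source, destination, new_w))
    return reweighted


# A two-arc path 0 -> 1 -> 2 with a non-trivial potential on the middle state.
arcs = [(0, 1, 1.0), (1, 2, 2.0)]
potentials = [0.0, 0.5, 0.0]
toward_initial = reweight_arcs(arcs, potentials)
toward_final = reweight_arcs(arcs, potentials, to_final=True)
```

In both directions the potential of the middle state cancels along the path, so the total path weight is unchanged; only its distribution over the arcs moves.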

rmepsilon(connect=True, weight=None, nstate=-1, delta=0.0009765625)¶

Removes epsilon transitions.

This operation destructively removes epsilon transitions (i.e., those where both input and output labels are epsilon) from an FST.

Parameters:	connect – Should output be trimmed? weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired weight threshold below which paths are pruned; if None, no paths are pruned. nstate – State number threshold (default: -1). delta – Comparison/quantization delta (default: 0.0009765625).
Returns:	self.

See also: The constructive variant, which also supports epsilon removal in reverse (and which may be more efficient).

set_final(state, weight=None)¶

Sets the final weight for a state.

Parameters:	state – The integer index of a state. weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired final weight; if omitted, it is set to semiring One.
Raises:	`IndexError` – State index out of range.

See also: set_start.

set_input_symbols(syms)¶

Sets the input symbol table.

Passing None as a value will delete the input symbol table.

Parameters:	syms – A SymbolTable.
Returns:	self.

See also: set_output_symbols.

set_output_symbols(syms)¶

Sets the output symbol table.

Passing None as a value will delete the output symbol table.

Parameters:	syms – A SymbolTable.
Returns:	self.

See also: set_input_symbols.

set_properties(props, mask)¶

Sets the properties bits.

Parameters:	props (int) – The properties to be set. mask (int) – A mask to be applied to the `props` argument before setting the FST’s properties.
Returns:	self.
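The role of the mask can be sketched with plain integer bit operations. The property constants below are hypothetical placeholders, not the values OpenFst defines, and the sketch ignores any special handling the library may apply to particular bits.

```python
# Hypothetical property bits, for illustration only; the real values are
# defined by OpenFst and are not reproduced here.
ACCEPTOR = 0b001
I_DETERMINISTIC = 0b010
ACYCLIC = 0b100


def set_properties(current, props, mask):
    """Overwrite only the bits selected by mask with the bits from props."""
    return (current & ~mask) | (props & mask)


def query_properties(current, mask):
    """Return the stored bits restricted to mask; non-zero means 'set'."""
    return current & mask


current = ACCEPTOR
# Only ACYCLIC is inside the mask, so I_DETERMINISTIC in props is ignored.
updated = set_properties(current, ACYCLIC | I_DETERMINISTIC, ACYCLIC)
```

Bits outside the mask keep their previous value, which is why the `ACCEPTOR` bit survives the update.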

set_start(state)¶

Sets the initial state.

Parameters:	state – The integer index of a state.
Returns:	self.
Raises:	`IndexError` – State index out of range.

See also: set_final.

start()¶

Returns the start state.

Returns:	The start state if start state is set, -1 otherwise.

states()¶

Returns an iterator over all states in the FST.

Returns:	A StateIterator object for the FST.

See also: arcs, mutable_arcs.

text(isymbols=None, osymbols=None, ssymbols=None, acceptor=False, show_weight_one=False, missing_symbol='')¶

Produces a human-readable string representation of the FST.

This method generates a human-readable string representation of the FST. The caller may optionally specify SymbolTables used to label input labels, output labels, or state labels, respectively.

Parameters:

isymbols – An optional symbol table used to label input symbols.
osymbols – An optional symbol table used to label output symbols.
ssymbols – An optional symbol table used to label states.
acceptor (bool) – Should the FST be rendered in acceptor format if possible? Defaults False.
show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults False.
missing_symbol – The string to be printed when symbol table lookup fails.

Returns:

A formatted string representing the FST.

to_bytes()¶

Returns a bytes object representing the FST.

Returns:	A bytes object.

topsort()¶

Sorts transitions by state IDs.

This operation destructively topologically sorts the FST, if it is acyclic; otherwise it remains unchanged. Once sorted, all transitions are from lower state IDs to higher state IDs.

Returns:	self.

See also: arcsort.
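The effect of topological sorting on state IDs can be sketched in pure Python. This sketch uses Kahn's algorithm on bare (source, destination) pairs, which is an assumption made for brevity and not necessarily the traversal the library uses; `topological_order` is a hypothetical helper.

```python
from collections import deque


def topological_order(num_states, arcs):
    """Return a list mapping old state ID -> new state ID, or None if cyclic."""
    indegree = [0] * num_states
    successors = [[] for _ in range(num_states)]
    for source, destination in arcs:
        successors[source].append(destination)
        indegree[destination] += 1

    # States with no incoming arcs can be numbered first.
    queue = deque(s for s in range(num_states) if indegree[s] == 0)
    new_id = [None] * num_states
    next_id = 0
    while queue:
        state = queue.popleft()
        new_id[state] = next_id
        next_id += 1
        for target in successors[state]:
            indegree[target] -= 1
            if indegree[target] == 0:
                queue.append(target)

    # If some state was never numbered, the graph contains a cycle.
    return new_id if next_id == num_states else None


acyclic_arcs = [(2, 0), (0, 1)]
order = topological_order(3, acyclic_arcs)
renumbered = [(order[s], order[d]) for s, d in acyclic_arcs]
cyclic_order = topological_order(2, [(0, 1), (1, 0)])
```

For the acyclic input every renumbered arc goes from a lower ID to a higher one; for the cyclic input the helper returns `None`, mirroring the "remains unchanged" case.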

type()¶

Returns the FST type.

Returns:	The FST type.

union(ifst)¶

Computes the union (sum) of two FSTs.

This operation computes the union (sum) of two FSTs. If A transduces string x to y with weight a and B transduces string w to v with weight b, then their union transduces x to y with weight a and w to v with weight b.

Parameters:	ifst – The second input FST.
Returns:	self.

verify()¶

Verifies that an FST’s contents are sane.

Returns:	True if the contents are sane, False otherwise.

write(filename)¶

Serializes FST to a file.

This method writes the FST to a file in a binary format.

Parameters:	filename (str) – The location of the output file.
Raises:	`IOError` – Write failed.

write_to_stream(strm, wopts)¶

Serializes FST to an output stream.

Parameters:	strm (ostream) – The output stream to write to. wopts (FstWriteOptions) – FST writing options.
Returns:	True if write was successful, False otherwise.
Raises:	`RuntimeError` – Write failed.

class kaldi.fstext.LatticeVectorFstArcIterator(fst, state)[source]¶

Arc iterator for a vector FST over the lattice semiring.

This class is used for iterating over the arcs leaving some state. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the arcs method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.

Creates a new arc iterator.

Parameters:	fst – The fst. state – The state index.
Raises:	`IndexError` – State index out of range.

done()¶

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	True if the iterator is exhausted, False otherwise.

flags()¶

Returns the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	The current iterator behavioral flags as an integer.

next()¶

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

position()¶

Returns the position of the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	The iterator’s position, expressed as an integer.

reset()¶

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

seek(a)¶

Advances the iterator to a new position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:	a (int) – The position to seek to.

set_flags(flags, mask)¶

Sets the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:	flags (int) – The properties to be set. mask (int) – A mask to be applied to the `flags` argument before setting them.

value()¶

Returns the current arc.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
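The relationship between the C++-style methods and the iterator protocol can be sketched with a stand-in class over a plain list. `ArcIteratorSketch` is hypothetical and is not the PyKaldi implementation; it only mirrors the method names documented above.

```python
class ArcIteratorSketch:
    """Stand-in for an arc iterator, backed by a plain Python list."""

    def __init__(self, arcs):
        self._arcs = list(arcs)
        self._pos = 0

    def done(self):
        return self._pos >= len(self._arcs)

    def value(self):
        return self._arcs[self._pos]

    def next(self):
        self._pos += 1

    def position(self):
        return self._pos

    def reset(self):
        self._pos = 0

    def seek(self, a):
        self._pos = a

    def __iter__(self):
        # The iterator protocol wraps the done/value/next loop.
        while not self.done():
            yield self.value()
            self.next()


arcs = ["a", "b", "c"]

# C++-style traversal.
it = ArcIteratorSketch(arcs)
cxx_style = []
while not it.done():
    cxx_style.append(it.value())
    it.next()

# Pythonic traversal of the same data.
pythonic = list(ArcIteratorSketch(arcs))
```

Both loops visit the same arcs in the same order; the Pythonic form simply hides the bookkeeping.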

class kaldi.fstext.LatticeVectorFstMutableArcIterator(fst, state)[source]¶

Mutable arc iterator for a vector FST over the lattice semiring.

This class is used for iterating over the arcs leaving some state and optionally replacing them with new ones. In addition to the full C++ API, it also supports the iterator protocol. Calling the __iter__ method of a mutable arc iterator object returns an iterator over (arc, setter) pairs. The setter is a bound method of the mutable arc iterator object that can be used to replace the current arc with a new one. Most users should just call the mutable_arcs method of a vector FST object instead of directly constructing this iterator and take advantage of the Pythonic API, e.g.

for arc, setter in lattice.mutable_arcs(0):
    setter(LatticeArc(arc.ilabel, 0, arc.weight, arc.nextstate))
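How an iterator can hand out (arc, setter) pairs is sketched below with a stand-in class. `MutableArcIteratorSketch` and the tuple-based arcs are hypothetical; the sketch only shows why calling the setter inside the loop body replaces the arc currently being visited.

```python
class MutableArcIteratorSketch:
    """Stand-in for a mutable arc iterator over a list that is edited in place."""

    def __init__(self, arcs):
        self._arcs = arcs  # shared with the caller, not copied
        self._pos = 0

    def done(self):
        return self._pos >= len(self._arcs)

    def value(self):
        return self._arcs[self._pos]

    def next(self):
        self._pos += 1

    def set_value(self, arc):
        self._arcs[self._pos] = arc

    def __iter__(self):
        # The generator is suspended at the yield while the loop body runs,
        # so the bound set_value method still targets the current position.
        while not self.done():
            yield self.value(), self.set_value
            self.next()


# Arcs are (ilabel, olabel, weight, nextstate) tuples in this sketch.
arcs = [(1, 1, 0.5, 1), (2, 2, 1.5, 1)]
for arc, setter in MutableArcIteratorSketch(arcs):
    ilabel, _, weight, nextstate = arc
    setter((ilabel, 0, weight, nextstate))  # set the output label to epsilon
```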

Creates a new arc iterator.

Parameters:	fst – The fst. state – The state index.
Raises:	`IndexError` – State index out of range.

done()¶

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	True if the iterator is exhausted, False otherwise.

flags()¶

Returns the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	The current iterator behavioral flags as an integer.

next()¶

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

position()¶

Returns the position of the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	The iterator’s position, expressed as an integer.

reset()¶

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

seek(a)¶

Advances the iterator to a new position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:	a (int) – The position to seek to.

set_flags(flags, mask)¶

Sets the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:	flags (int) – The properties to be set. mask (int) – A mask to be applied to the `flags` argument before setting them.

set_value(arc)¶

Replace the current arc with a new arc.

Parameters:	arc – The arc to replace the current arc with.

value()¶

Returns the current arc.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

class kaldi.fstext.LatticeVectorFstStateIterator(fst)[source]¶

State iterator for a vector FST over the lattice semiring.

This class is used for iterating over the states. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the states method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.

Creates a new state iterator.

Parameters:	fst – The fst.

done()¶

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	True if the iterator is exhausted, False otherwise.

next()¶

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

reset()¶

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

value()¶

Returns the current state index.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

class kaldi.fstext.LatticeWeight[source]¶

Lattice weight factory.

This class is used for creating new LatticeWeight instances.

LatticeWeight():: Creates an uninitialized LatticeWeight instance.
LatticeWeight(weight):: Creates a new LatticeWeight instance initialized with the weight.

Parameters:	weight (Tuple[float, float] or LatticeWeight) – A pair of weight values or another LatticeWeight instance.

LatticeWeight(weight1, weight2):: Creates a new LatticeWeight instance initialized with the weights.

Parameters:	weight1 (float) – The first weight value. weight2 (float) – The second weight value.

from_other(other:LatticeWeight) → LatticeWeight¶: Create a new lattice weight from another.

from_pair(a:float, b:float) → LatticeWeight¶: Create a new lattice weight from a pair of floats.

hash() → int¶: Returns the hash for the weight.

member() → bool¶: Checks if weight is a member of the lattice semiring.

no_weight() → LatticeWeight¶: No weight in lattice semiring.

one() → LatticeWeight¶: One in lattice semiring, i.e. (0.0, 0.0).

properties() → int¶: Returns weight properties.

quantize(delta:float=default) → LatticeWeight¶: Quantizes the weight.

reverse() → LatticeWeight¶: Reverses the weight.

type() → str¶: Returns weight type.

value1¶: Float value of the first weight.

value2¶: Float value of the second weight.

zero() → LatticeWeight¶: Zero in lattice semiring, i.e. (+infinity, +infinity).
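The semiring operations behind these values can be sketched in pure Python. The One and Zero elements follow the descriptions above; the ⊕ and ⊗ rules reflect my understanding of Kaldi's lattice semiring (⊗ adds componentwise, ⊕ keeps the weight with the smaller total cost and breaks ties on the first value) and should be read as an assumption. `LatticeWeightSketch`, `plus` and `times` are hypothetical names.

```python
import math
from typing import NamedTuple


class LatticeWeightSketch(NamedTuple):
    value1: float
    value2: float


ONE = LatticeWeightSketch(0.0, 0.0)
ZERO = LatticeWeightSketch(math.inf, math.inf)


def times(a, b):
    """Semiring 'times': add the two components independently."""
    return LatticeWeightSketch(a.value1 + b.value1, a.value2 + b.value2)


def plus(a, b):
    """Semiring 'plus': keep the weight with the smaller total cost."""
    total_a = a.value1 + a.value2
    total_b = b.value1 + b.value2
    if total_a < total_b:
        return a
    if total_a > total_b:
        return b
    # Equal totals: prefer the smaller first component (assumed tie-break).
    return a if a.value1 <= b.value1 else b


w = LatticeWeightSketch(1.0, 2.0)
v = LatticeWeightSketch(2.0, 0.5)
```

With these definitions `ONE` is neutral for `times` and `ZERO` is neutral for `plus`, as expected of semiring identities.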

class kaldi.fstext.LogArc[source]¶

FST arc with log weight.

LogArc():: Creates an uninitialized LogArc instance.
LogArc(ilabel, olabel, weight, nextstate):: Creates a new LogArc instance initialized with given arguments.

Parameters:	ilabel (int) – The input label. olabel (int) – The output label. weight (LogWeight) – The arc weight. nextstate (int) – The destination state for the arc.

from_attrs(ilabel:int, olabel:int, weight:LogWeight, nextstate:int) → LogArc¶

Creates a new arc with the given attributes.

Parameters:	ilabel (int) – The input label. olabel (int) – The output label. weight (LogWeight) – The arc weight. nextstate (int) – The destination state for the arc.

ilabel¶: int – The input label.

nextstate¶: int – The destination state for the arc.

olabel¶: int – The output label.

type() → str¶: Returns arc type.

weight¶: LogWeight – The arc weight.
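For readers unfamiliar with the log semiring that these arc weights live in, the following pure-Python sketch shows its two operations on plain floats. It assumes the standard definition (weights are negative log probabilities, ⊗ is addition, ⊕ is -log(e^-a + e^-b), Zero is +infinity) and assumes inputs are never negative infinity; `log_plus` and `log_times` are hypothetical helpers, not PyKaldi functions.

```python
import math


def log_plus(a, b):
    """Semiring 'plus': -log(exp(-a) + exp(-b)), computed stably."""
    if math.isinf(a):  # +infinity is semiring Zero
        return b
    if math.isinf(b):
        return a
    lo, hi = min(a, b), max(a, b)
    return lo - math.log1p(math.exp(lo - hi))


def log_times(a, b):
    """Semiring 'times': add negative log probabilities."""
    return a + b


combined = log_plus(1.0, 2.0)
```

Converting `combined` back with `math.exp(-combined)` gives the sum of the two underlying probabilities, which is the point of ⊕ in this semiring.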

class kaldi.fstext.LogConstFst(fst=None)[source]¶

Constant FST over the log semiring.

Parameters:	fst (LogFst) – The input FST over the log semiring. If provided, its contents are used for initializing the new FST. Defaults to `None`.

arcs(state)¶

Returns an iterator over arcs leaving the specified state.

Parameters:	state – The source state index.
Returns:	An ArcIterator.

See also: mutable_arcs, states.

copy()¶

Makes a copy of the FST.

Returns:	A copy of the FST.

draw(filename, isymbols=None, osymbols=None, ssymbols=None, acceptor=False, title='', width=8.5, height=11, portrait=False, vertical=False, ranksep=0.4, nodesep=0.25, fontsize=14, precision=5, float_format='g', show_weight_one=False)¶

Writes out the FST in Graphviz text format.

This method writes out the FST in the dot graph description language. The graph can be rendered using the dot binary provided by Graphviz.

Parameters:

filename (str) – The string location of the output dot/Graphviz file.
isymbols – An optional symbol table used to label input symbols.
osymbols – An optional symbol table used to label output symbols.
ssymbols – An optional symbol table used to label states.
acceptor (bool) – Should the figure be rendered in acceptor format if possible? Defaults False.
title (str) – An optional string indicating the figure title. Defaults to empty string.
width (float) – The figure width, in inches. Defaults to 8.5".
height (float) – The figure height, in inches. Defaults to 11".
portrait (bool) – Should the figure be rendered in portrait rather than landscape? Defaults False.
vertical (bool) – Should the figure be rendered bottom-to-top rather than left-to-right?
ranksep (float) – The minimum separation between ranks, in inches. Defaults to 0.4".
nodesep (float) – The minimum separation between nodes, in inches. Defaults to 0.25".
fontsize (int) – Font size, in points. Defaults 14pt.
precision (int) – Numeric precision for floats, in number of chars. Defaults to 5.
float_format ('e', 'f' or 'g') – One of: ‘e’, ‘f’ or ‘g’. Defaults to ‘g’.
show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults False.

For more information about the rendering options, see man dot.

See also: text.

final(state)¶

Returns the final weight of a state.

Parameters:	state – The integer index of a state.
Returns:	The final Weight of that state.
Raises:	`IndexError` – State index out of range.

from_bytes(s)¶

Returns the FST represented by the bytes object.

Parameters:	s (bytes) – The bytes object representing the FST.
Returns:	An FST object.

input_symbols()¶

Returns the input symbol table.

Returns:	The input symbol table.

See also: output_symbols.

num_arcs(state=None)¶

Returns the number of arcs, counting them if necessary.

If state is None, returns the number of arcs in the FST. Otherwise, returns the number of arcs leaving that state.

Parameters:	state – The integer index of a state. Defaults to `None`.
Returns:	The number of arcs leaving a state or the number of arcs in the FST.

Note: This method counts the number of arcs in the FST by iterating over the states and summing up the number of arcs leaving each state.

Raises:	`IndexError` – State index out of range.

See also: num_states.

num_input_epsilons(state)¶

Returns the number of arcs with epsilon input labels leaving a state.

Parameters:	state – The integer index of a state.
Returns:	The number of epsilon-input-labeled arcs leaving that state.
Raises:	`IndexError` – State index out of range.

See also: num_output_epsilons.

num_output_epsilons(state)¶

Returns the number of arcs with epsilon output labels leaving a state.

Parameters:	state – The integer index of a state.
Returns:	The number of epsilon-output-labeled arcs leaving that state.
Raises:	`IndexError` – State index out of range.

See also: num_input_epsilons.

num_states()¶

Returns the number of states, counting them if necessary.

Returns:	The number of states.

See also: num_arcs.

output_symbols()¶

Returns the output symbol table.

Returns:	The output symbol table.

See also: input_symbols.

properties(mask, test)¶

Provides property bits.

This method provides user access to the properties attributes for the FST. The resulting value is a long integer, but when it is cast to a boolean, it represents whether or not the FST has the mask property.

Parameters:	mask – The property mask to be compared to the FST’s properties. test – Should any unknown values be computed before comparing against the mask?
Returns:	A 64-bit bitmask representing the requested properties.

read(filename)¶

Reads an FST from a file.

Parameters:	filename (str) – The location of the input file.
Returns:	An FST object.
Raises:	`RuntimeError` – Read failed.

read_from_stream(strm, ropts)¶

Reads an FST from an input stream.

Parameters:	strm (istream) – The input stream to read from. ropts (FstReadOptions) – FST reading options.
Returns:	An FST object.
Raises:	`RuntimeError` – Read failed.

start()¶

Returns the start state.

Returns:	The start state if start state is set, -1 otherwise.

states()¶

Returns an iterator over all states in the FST.

Returns:	A StateIterator object for the FST.

See also: arcs, mutable_arcs.

text(isymbols=None, osymbols=None, ssymbols=None, acceptor=False, show_weight_one=False, missing_symbol='')¶

Produces a human-readable string representation of the FST.

This method generates a human-readable string representation of the FST. The caller may optionally specify SymbolTables used to label input labels, output labels, or state labels, respectively.

Parameters:

isymbols – An optional symbol table used to label input symbols.
osymbols – An optional symbol table used to label output symbols.
ssymbols – An optional symbol table used to label states.
acceptor (bool) – Should the FST be rendered in acceptor format if possible? Defaults False.
show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults False.
missing_symbol – The string to be printed when symbol table lookup fails.

Returns:

A formatted string representing the FST.

to_bytes()¶

Returns a bytes object representing the FST.

Returns:	A bytes object.

type()¶

Returns the FST type.

Returns:	The FST type.

verify()¶

Verifies that an FST’s contents are sane.

Returns:	True if the contents are sane, False otherwise.

write(filename)¶

Serializes FST to a file.

This method writes the FST to a file in a binary format.

Parameters:	filename (str) – The location of the output file.
Raises:	`IOError` – Write failed.

write_to_stream(strm, wopts)¶

Serializes FST to an output stream.

Parameters:	strm (ostream) – The output stream to write to. wopts (FstWriteOptions) – FST writing options.
Returns:	True if write was successful, False otherwise.
Raises:	`RuntimeError` – Write failed.

class kaldi.fstext.LogConstFstArcIterator(fst, state)[source]¶

Arc iterator for a constant FST over the log semiring.

This class is used for iterating over the arcs leaving some state. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the arcs method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.

Creates a new arc iterator.

Parameters:	fst – The fst. state – The state index.
Raises:	`IndexError` – State index out of range.

done()¶

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	True if the iterator is exhausted, False otherwise.

flags()¶

Returns the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	The current iterator behavioral flags as an integer.

next()¶

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

position()¶

Returns the position of the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	The iterator’s position, expressed as an integer.

reset()¶

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

seek(a)¶

Advances the iterator to a new position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:	a (int) – The position to seek to.

set_flags(flags, mask)¶

Sets the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:	flags (int) – The properties to be set. mask (int) – A mask to be applied to the `flags` argument before setting them.

value()¶

Returns the current arc.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

class kaldi.fstext.LogConstFstStateIterator(fst)[source]¶

State iterator for a constant FST over the log semiring.

This class is used for iterating over the states. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the states method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.

Creates a new state iterator.

Parameters:	fst – The fst.

done()¶

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	True if the iterator is exhausted, False otherwise.

next()¶

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

reset()¶

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

value()¶

Returns the current state index.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

class kaldi.fstext.LogEncodeMapper(encode_labels=False, encode_weights=False, encode=True)[source]¶

Arc encoder for an FST over the log semiring.

This class provides an object which can be used to encode or decode FST arcs. This is most useful for converting an FST to an unweighted acceptor, on which some FST operations are more efficient, and then decoding the FST afterwards.

To use an instance of this class to encode or decode a mutable FST, pass it as the first argument to the FST instance methods encode and decode. Alternatively, an instance of this class can be used as a callable to encode/decode arcs.

Parameters:	encode_labels (bool) – Should labels be encoded? encode_weights (bool) – Should weights be encoded? encode (bool) – Encode or decode?

flags() → int¶: Returns encoder flags.

from_other(mapper:LogEncodeMapper) → LogEncodeMapper¶: Creates a new encoder with the contents of another.

from_other_with_type(mapper:LogEncodeMapper, type:EncodeType) → LogEncodeMapper¶: Creates a new encoder with the contents of another and given type.

input_symbols() → SymbolTable¶: Returns input symbol table.

output_symbols() → SymbolTable¶: Returns output symbol table.

properties(inprops:int) → int¶

Provides property bits.

This method provides user access to the properties attributes for the encoder. The resulting value is a long integer, but when it is cast to a boolean, it represents whether or not the FST has the mask property.

Parameters:	mask – The property mask to be compared to the encoder’s properties.
Returns:	A 64-bit bitmask representing the requested properties.

read(filename:str, type:EncodeType=default) → LogEncodeMapper¶: Reads encoder from file.

set_input_symbols(syms:SymbolTable)¶

Sets the input symbol table.

Parameters:	syms – A SymbolTable.

See also: set_output_symbols.

set_output_symbols(syms:SymbolTable)¶

Sets the output symbol table.

Parameters:	syms – A SymbolTable.

See also: set_input_symbols.

type() → EncodeType¶: Returns encoder type.

write(filename:str) → bool¶

Writes encoder to file.

Returns:	True if write was successful, False otherwise.

class kaldi.fstext.LogEncodeTable¶

Encode table for LogArc.

LogEncodeTable(flags):: Creates a new encode table with the given flags.

class Tuple¶

LogArc encoding tuple.

ilabel¶: Input label.

olabel¶: Output label.

weight¶: Weight.

decode(key:int) → Tuple¶: Decodes an encoded arc label back to labels and cost.

encode(arc:LogArc) → int¶: Encodes the given arc (either labels or weights or both).

flags() → int¶: Returns encoding flags.

get_label(arc:LogArc) → int¶

Looks up the encoded label for the given arc.

Returns -1 if arc is not found.

input_symbols() → SymbolTable¶: Returns input symbols.

output_symbols() → SymbolTable¶: Returns output symbols.

read(strm:istream, source:str) → LogEncodeTable¶: Reads encode table from input stream.

set_input_symbols(syms:SymbolTable)¶: Sets input symbols.

set_output_symbols(syms:SymbolTable)¶: Sets output symbols.

size() → int¶: Returns the size of the table.

write(strm:ostream, source:str) → bool¶: Writes table to output stream.

class kaldi.fstext.LogFstCompiler(isymbols=None, osymbols=None, ssymbols=None, acceptor=False, keep_isymbols=False, keep_osymbols=False, keep_state_numbering=False, allow_negative_labels=False)[source]¶

Compiler for FSTs over the log semiring.

This class is used to compile FSTs specified using the AT&T FSM library format described here:

http://web.eecs.umich.edu/~radev/NLP-fall2015/resources/fsm_archive/fsm.5.html

This is the same format used by the fstcompile executable.

FstCompiler options (symbol tables, etc.) are set at construction time:

compiler = FstCompiler(isymbols=ascii_syms, osymbols=ascii_syms)

Once constructed, FstCompiler instances behave like a file handle opened for writing:

# /ba+/
print("0 1 50 50", file=compiler)
print("1 2 49 49", file=compiler)
print("2 2 49 49", file=compiler)
print("2", file=compiler)

The compile method returns an actual FST instance:

sheep_machine = compiler.compile()

Compilation flushes the internal buffer, so the compiler instance can be reused to compile new machines with the same symbol tables, etc.

Parameters:

isymbols – An optional SymbolTable used to label input symbols.
osymbols – An optional SymbolTable used to label output symbols.
ssymbols – An optional SymbolTable used to label states.
acceptor – Should the FST be rendered in acceptor format if possible?
keep_isymbols – Should the input symbol table be stored in the FST?
keep_osymbols – Should the output symbol table be stored in the FST?
keep_state_numbering – Should the state numbering be preserved?
allow_negative_labels – Should negative labels be allowed? (Not recommended; may cause conflicts).
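The "file handle opened for writing" behaviour described above only requires a write method, which is what print(..., file=...) calls. The sketch below is a hypothetical stand-in, not the PyKaldi compiler: `CompilerSketch` handles only the transducer text format with integer labels, returns plain tuples instead of an FST, and uses 0.0 as the default weight.

```python
class CompilerSketch:
    """Collects text written to it, then parses it into arcs and final states."""

    def __init__(self):
        self._buffer = []

    def write(self, expression):
        # print(..., file=self) calls this once for the text and once for "\n".
        self._buffer.append(expression)

    def compile(self):
        text = "".join(self._buffer)
        self._buffer = []  # flush, so the instance can be reused
        arcs, finals = [], []
        for line in text.splitlines():
            fields = line.split()
            if not fields:
                continue
            if len(fields) >= 4:
                # source destination ilabel olabel [weight]
                source, destination, ilabel, olabel = map(int, fields[:4])
                weight = float(fields[4]) if len(fields) > 4 else 0.0
                arcs.append((source, destination, ilabel, olabel, weight))
            else:
                # final_state [weight]
                weight = float(fields[1]) if len(fields) > 1 else 0.0
                finals.append((int(fields[0]), weight))
        return arcs, finals


compiler = CompilerSketch()
print("0 1 50 50", file=compiler)
print("1 2 49 49", file=compiler)
print("2 2 49 49", file=compiler)
print("2", file=compiler)
arcs, finals = compiler.compile()
second = compiler.compile()  # buffer was flushed by the first call
```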

compile()¶

Compiles the FST in the string buffer.

This method compiles the FST and returns the resulting machine.

Returns:	The FST described by the string buffer.
Raises:	`RuntimeError` – Compilation failed.

write(expression)¶

Writes a string into the compiler string buffer.

This method adds a line to the compiler string buffer. It can also be invoked with a print call, like so:

compiler = FstCompiler()
print("0 0 49 49", file=compiler)
print("0", file=compiler)

Parameters:	expression – A string expression to add to compiler string buffer.

class kaldi.fstext.LogVectorFst(fst=None)[source]¶

Vector FST over the log semiring.

Parameters:	fst (LogFst) – The input FST over the log semiring. If provided, its contents are used for initializing the new FST. Defaults to `None`.

add_arc(state, arc)¶

Adds a new arc to the FST and returns self.

Parameters:	state – The integer index of the source state. arc – The arc to add.
Returns:	self.
Raises:	`IndexError` – State index out of range.

See also: add_state.

add_state()¶

Adds a new state to the FST and returns the state ID.

Returns:	The integer index of the new state.

See also: add_arc, set_start, set_final.

arcs(state)¶

Returns an iterator over arcs leaving the specified state.

Parameters:	state – The source state index.
Returns:	An ArcIterator.

See also: mutable_arcs, states.

arcsort(sort_type='ilabel')¶

Sorts arcs leaving each state of the FST.

This operation destructively sorts arcs leaving each state using either input or output labels.

Parameters:	sort_type – Either “ilabel” (sort arcs according to input labels) or “olabel” (sort arcs according to output labels).
Returns:	self.
Raises:	`ValueError` – Unknown sort type.

See also: topsort.
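A minimal pure-Python sketch of per-state arc sorting follows. It is not the PyKaldi implementation: `arcsort_sketch` is hypothetical, arcs are (ilabel, olabel, weight, nextstate) tuples, and sorting on the chosen label alone (with ties left in their original order) is an assumption.

```python
def arcsort_sketch(arcs_by_state, sort_type="ilabel"):
    """Sort each state's arc list in place by input or output label."""
    if sort_type == "ilabel":
        key = lambda arc: arc[0]
    elif sort_type == "olabel":
        key = lambda arc: arc[1]
    else:
        raise ValueError("Unknown sort type: {!r}".format(sort_type))
    for arcs in arcs_by_state:
        arcs.sort(key=key)  # list.sort is stable, so ties keep their order
    return arcs_by_state


# One list of arcs per state; state 1 has no outgoing arcs.
fst_arcs = [[(3, 1, 0.0, 1), (1, 2, 0.0, 1), (2, 3, 0.0, 1)], []]
arcsort_sketch(fst_arcs, "ilabel")
by_ilabel = [arc[0] for arc in fst_arcs[0]]
arcsort_sketch(fst_arcs, "olabel")
by_olabel = [arc[1] for arc in fst_arcs[0]]
```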

closure(closure_plus=False)¶

Computes concatenative closure.

This operation destructively converts the FST to its concatenative closure. If A transduces string x to y with weight a, then the closure transduces x to y with weight a, xx to yy with weight a ⊗ a, xxx to yyy with weight a ⊗ a ⊗ a, and so on. The empty string is also transduced to itself with semiring One if closure_plus is False.

Parameters:	closure_plus – If True, do not accept the empty string.
Returns:	self.
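The weights produced by closure can be checked on a toy weighted relation. The sketch below is not PyKaldi code: it represents a transducer as a dict from (input, output) string pairs to weights, enumerates only a bounded number of repetitions, takes ⊗ to be float addition (as in the log and tropical semirings), and combines duplicate pairs with min, which is the tropical ⊕ and an assumption here.

```python
def closure_upto(relation, max_reps, closure_plus=False):
    """Enumerate the closure of a weighted relation up to max_reps repetitions."""
    result = {}
    if not closure_plus:
        result[("", "")] = 0.0  # empty string with semiring One
    current = {("", ""): 0.0}
    for _ in range(max_reps):
        extended = {}
        for (x1, y1), w1 in current.items():
            for (x2, y2), w2 in relation.items():
                pair = (x1 + x2, y1 + y2)
                weight = w1 + w2  # a (x) a (x) ... as repeated addition
                extended[pair] = min(weight, extended.get(pair, float("inf")))
        for pair, weight in extended.items():
            result[pair] = min(weight, result.get(pair, float("inf")))
        current = extended
    return result


relation = {("x", "y"): 1.5}
star = closure_upto(relation, 3)
plus_only = closure_upto(relation, 3, closure_plus=True)
```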

concat(ifst)¶

Computes the concatenation (product) of two FSTs.

This operation destructively concatenates the FST with a second FST. If A transduces string x to y with weight a and B transduces string w to v with weight b, then their concatenation transduces string xw to yv with weight a ⊗ b.

Parameters:	ifst – The second input FST.
Returns:	self.

connect()¶

Removes unsuccessful paths.

This operation destructively trims the FST, removing states and arcs that are not part of any successful path.

Returns:	self.
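Trimming can be sketched as two reachability passes: a state is kept only if it is reachable from the start state and can itself reach a final state. The helper below is a hypothetical pure-Python illustration on bare (source, destination) pairs, not the PyKaldi implementation, and it does not renumber the surviving states.

```python
def connect_sketch(num_states, start, finals, arcs):
    """Return the states and arcs that lie on some successful path."""
    forward = [[] for _ in range(num_states)]
    backward = [[] for _ in range(num_states)]
    for source, destination in arcs:
        forward[source].append(destination)
        backward[destination].append(source)

    def reachable(sources, graph):
        seen = set(sources)
        stack = list(sources)
        while stack:
            state = stack.pop()
            for neighbor in graph[state]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        return seen

    accessible = reachable([start], forward)      # reachable from the start
    coaccessible = reachable(finals, backward)    # can reach a final state
    keep = accessible & coaccessible
    kept_arcs = [(s, d) for s, d in arcs if s in keep and d in keep]
    return keep, kept_arcs


# State 3 is a dead end; state 4 cannot be reached from the start state.
all_arcs = [(0, 1), (1, 2), (0, 3), (4, 2)]
kept_states, kept_arcs = connect_sketch(5, 0, [2], all_arcs)
```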

copy()¶

Makes a copy of the FST.

Returns:	A copy of the FST.

decode(encoder)¶

Decodes encoded labels and/or weights.

This operation reverses the encoding performed by encode.

Parameters:	encoder – An EncodeMapper object used to encode the FST.
Returns:	self.

See also: encode.

delete_arcs(state, n=None)¶

Deletes arcs leaving a particular state.

Parameters:	state – The integer index of a state. n – An optional argument indicating how many arcs to be deleted. If this argument is None, all arcs from this state are deleted.
Returns:	self.
Raises:	`IndexError` – State index out of range.

See also: delete_states.

delete_states(states=None)¶

Deletes states.

Parameters:	states – An optional iterable of integer indices of the states to be deleted. If this argument is omitted, all states are deleted.
Returns:	self.
Raises:	`IndexError` – State index out of range.

See also: delete_arcs.

draw(filename, isymbols=None, osymbols=None, ssymbols=None, acceptor=False, title='', width=8.5, height=11, portrait=False, vertical=False, ranksep=0.4, nodesep=0.25, fontsize=14, precision=5, float_format='g', show_weight_one=False)¶

Writes out the FST in Graphviz text format.

This method writes out the FST in the dot graph description language. The graph can be rendered using the dot binary provided by Graphviz.

Parameters:

filename (str) – The string location of the output dot/Graphviz file.
isymbols – An optional symbol table used to label input symbols.
osymbols – An optional symbol table used to label output symbols.
ssymbols – An optional symbol table used to label states.
acceptor (bool) – Should the figure be rendered in acceptor format if possible? Defaults to False.
title (str) – An optional string indicating the figure title. Defaults to the empty string.
width (float) – The figure width, in inches. Defaults to 8.5.
height (float) – The figure height, in inches. Defaults to 11.
portrait (bool) – Should the figure be rendered in portrait rather than landscape? Defaults to False.
vertical (bool) – Should the figure be rendered bottom-to-top rather than left-to-right? Defaults to False.
ranksep (float) – The minimum separation between ranks, in inches. Defaults to 0.4.
nodesep (float) – The minimum separation between nodes, in inches. Defaults to 0.25.
fontsize (int) – Font size, in points. Defaults to 14.
precision (int) – Numeric precision for floats, in number of characters. Defaults to 5.
float_format ('e', 'f' or 'g') – The float format; one of 'e', 'f' or 'g'. Defaults to 'g'.
show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults to False.

For more information about the rendering options, see man dot.

See also: text.

encode(encoder)¶

Encodes labels and/or weights.

This operation allows for the representation of a weighted transducer as a weighted acceptor, an unweighted transducer, or an unweighted acceptor by considering the pair (input label, output label), the pair (input label, weight), or the triple (input label, output label, weight) as a single label. Applying this operation mutates the EncodeMapper argument, which can then be used to decode.

Parameters:	encoder – An EncodeMapper object used to encode the FST.
Returns:	self.

See also: decode.
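The idea of treating an (input label, output label) pair as a single label can be sketched in plain Python. This is an illustration only, not the EncodeMapper implementation: the class `PairEncoder`, its methods, and the tuple layout of the arcs are assumptions made for the example. Note how encoding mutates the encoder by growing its table, which is what later makes decoding possible.

```python
class PairEncoder:
    """Toy encoder mapping (ilabel, olabel) pairs to single labels."""

    def __init__(self):
        self._to_id = {}
        self._from_id = [None]  # index 0 is kept free for epsilon

    def encode(self, ilabel, olabel):
        key = (ilabel, olabel)
        if key not in self._to_id:
            # First time this pair is seen: assign the next free label.
            self._to_id[key] = len(self._from_id)
            self._from_id.append(key)
        return self._to_id[key]

    def decode(self, label):
        return self._from_id[label]


# Arcs as (ilabel, olabel, weight, nextstate) tuples.
arcs = [(1, 2, 0.5, 1), (3, 4, 1.0, 2), (1, 2, 0.25, 2)]
encoder = PairEncoder()
# After encoding labels, every arc has ilabel == olabel (an acceptor).
encoded = [(encoder.encode(i, o),) * 2 + (w, n) for i, o, w, n in arcs]
# Decoding looks the original pair up in the table built during encoding.
decoded = [encoder.decode(i) + (w, n) for i, _, w, n in encoded]
```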

final(state)¶

Returns the final weight of a state.

Parameters:	state – The integer index of a state.
Returns:	The final Weight of that state.
Raises:	`IndexError` – State index out of range.

from_bytes(s)¶

Returns the FST represented by the bytes object.

Parameters:	s (bytes) – The bytes object representing the FST.
Returns:	An FST object.

input_symbols()¶

Returns the input symbol table.

Returns:	The input symbol table.

See also: output_symbols.

invert()¶

Inverts the FST’s transduction.

This operation destructively inverts the FST’s transduction by exchanging input and output labels.

Returns:	self.

minimize(delta=0.0009765625, allow_nondet=False)¶

Minimizes the FST.

This operation destructively performs the minimization of deterministic weighted automata and transducers. If the input FST A is an acceptor, this operation produces the minimal acceptor B equivalent to A, i.e. the acceptor with a minimal number of states that is equivalent to A. If the input FST A is a transducer, this operation internally builds an equivalent transducer with a minimal number of states. However, this minimality is obtained by allowing transitions to have strings of symbols as output labels; such a machine is known in the literature as a real-time transducer. Real-time transducers are not directly supported by the library, so this function converts them by expanding each string-labeled transition into a sequence of transitions. This creates new states, and the result is therefore no longer guaranteed to be minimal.

Parameters:	delta – Comparison/quantization delta (default: 0.0009765625). allow_nondet – Attempt minimization of non-deterministic FST?
Returns:	self.
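To see why expanding a string-labeled transition costs extra states, consider the following plain-Python sketch. It is not the library's implementation; the function name, the arc tuple layout (source, ilabel, olabel, weight, nextstate), and the choice to put the input label and weight on the first arc are all assumptions made for illustration. Label 0 stands for epsilon and 0.0 for the tropical One.

```python
def expand_string_arc(src, ilabel, olabels, weight, dst, next_free_state):
    """Expand one arc whose output is a label sequence into single-label arcs.

    Returns (arcs, next_free_state).
    """
    if not olabels:
        return [(src, ilabel, 0, weight, dst)], next_free_state
    arcs = []
    current = src
    for k, olabel in enumerate(olabels):
        last = k == len(olabels) - 1
        # Intermediate arcs need a fresh state; the last arc reaches dst.
        target = dst if last else next_free_state
        if not last:
            next_free_state += 1
        # Only the first arc carries the input label and the weight.
        arcs.append((current, ilabel if k == 0 else 0, olabel,
                     weight if k == 0 else 0.0, target))
        current = target
    return arcs, next_free_state


chain, next_state = expand_string_arc(0, 5, [7, 8, 9], 1.5, 1, next_free_state=2)
```

An output string of length three introduces two new intermediate states, so a machine that was minimal with string labels generally gains states after expansion.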

mutable_arcs(state)¶

Returns a mutable iterator over arcs leaving the specified state.

Parameters:	state – The source state index.
Returns:	A MutableArcIterator.

See also: arcs, states.

num_arcs(state=None)¶

Returns the number of arcs, counting them if necessary.

If state is None, returns the number of arcs in the FST. Otherwise, returns the number of arcs leaving that state.

Parameters:	state – The integer index of a state. Defaults to `None`.
Returns:	The number of arcs leaving a state or the number of arcs in the FST.

Note: This method counts the number of arcs in the FST by iterating over the states and summing up the number of arcs leaving each state.

Raises:	`IndexError` – State index out of range.

See also: num_states.
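The counting behavior described in the note can be sketched in plain Python. The helper `count_arcs` and the list-of-lists representation are hypothetical and only mirror the documented semantics.

```python
def count_arcs(arcs_by_state, state=None):
    """Count arcs for one state, or for the whole machine if state is None.

    arcs_by_state[s] is the list of arcs leaving state s.
    """
    if state is None:
        # Sum the number of arcs leaving each state.
        return sum(len(arcs) for arcs in arcs_by_state)
    if not 0 <= state < len(arcs_by_state):
        raise IndexError("State index out of range.")
    return len(arcs_by_state[state])


arcs_by_state = [["a", "b"], ["c"], []]
total = count_arcs(arcs_by_state)
leaving_first = count_arcs(arcs_by_state, 0)
```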

num_input_epsilons(state)¶

Returns the number of arcs with epsilon input labels leaving a state.

Parameters:	state – The integer index of a state.
Returns:	The number of epsilon-input-labeled arcs leaving that state.
Raises:	`IndexError` – State index out of range.

See also: num_output_epsilons.

num_output_epsilons(state)¶

Returns the number of arcs with epsilon output labels leaving a state.

Parameters:	state – The integer index of a state.
Returns:	The number of epsilon-output-labeled arcs leaving that state.
Raises:	`IndexError` – State index out of range.

See also: num_input_epsilons.

num_states()¶

Returns the number of states, counting them if necessary.

Returns:	The number of states.

See also: num_arcs.

output_symbols()¶

Returns the output symbol table.

Returns:	The output symbol table.

See also: input_symbols.

project(project_output=False)¶

Converts the FST to an acceptor using input or output labels.

This operation destructively projects an FST onto its domain or range by either copying each arc’s input label to its output label (the default) or vice versa.

Parameters:	project_output – Project onto output labels?
Returns:	self.
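Projection amounts to copying one label over the other on every arc. The sketch below shows this on plain tuples; the function `project_arcs` and the (ilabel, olabel, weight, nextstate) layout are assumptions for illustration, not PyKaldi API.

```python
def project_arcs(arcs, project_output=False):
    """Return arcs whose input and output labels are both the chosen label."""
    projected = []
    for ilabel, olabel, weight, nextstate in arcs:
        label = olabel if project_output else ilabel
        projected.append((label, label, weight, nextstate))
    return projected


arcs = [(1, 2, 0.5, 1), (3, 4, 1.0, 2)]
onto_input = project_arcs(arcs)
onto_output = project_arcs(arcs, project_output=True)
```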

properties(mask, test)¶

Provides property bits.

This method provides user access to the properties attributes for the FST. The resulting value is a long integer, but when it is cast to a boolean, it represents whether or not the FST has the mask property.

Parameters:	mask – The property mask to be compared to the FST’s properties. test – Should any unknown values be computed before comparing against the mask?
Returns:	A 64-bit bitmask representing the requested properties.

prune(weight=None, nstate=-1, delta=0.0009765625)¶

Removes paths with weights below a certain threshold.

This operation deletes states and arcs in the input FST that do not belong to a successful path whose weight is no more (w.r.t. the natural semiring order) than the threshold ⊗ the weight of the shortest path in the input FST. Weights must be commutative and have the path property.

Parameters:	weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired weight threshold below which paths are pruned; if None, no paths are pruned. nstate – State number threshold (default: -1). delta – Comparison/quantization delta (default: 0.0009765625).
Returns:	self.

See also: The constructive variant.
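The pruning criterion can be made concrete with a small plain-Python sketch. It assumes the tropical semiring, where the semiring product is addition and the natural order is the usual numeric order, and it works on an explicit table of complete paths rather than on states and arcs. The function name `prune_paths` is hypothetical.

```python
def prune_paths(path_weights, threshold):
    """Keep paths whose weight is within `threshold` of the best path."""
    best = min(path_weights.values())
    limit = threshold + best  # threshold times shortest-path weight (tropical)
    return {path: w for path, w in path_weights.items() if w <= limit}


paths = {"a b": 1.0, "a c": 2.5, "d": 4.0}
kept = prune_paths(paths, threshold=2.0)
```

With a best path of weight 1.0 and a threshold of 2.0, only paths of weight at most 3.0 survive.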

push(to_final=False, delta=0.0009765625, remove_total_weight=False)¶

Pushes weights towards the initial or final states.

This operation destructively produces an equivalent transducer by pushing the weights towards the initial state or toward the final states. When pushing weights towards the initial state, the sum of the weight of the outgoing transitions and final weight at any non-initial state is equal to one in the resulting machine. When pushing weights towards the final states, the sum of the weight of the incoming transitions at any state is equal to one. Weights need to be left distributive when pushing towards the initial state and right distributive when pushing towards the final states.

Parameters:	to_final – Push towards final states? delta – Comparison/quantization delta (default: 0.0009765625). remove_total_weight – If pushing weights, should the total weight be removed?
Returns:	self.

See also: The constructive variant, which also supports label pushing.

read(filename)¶

Reads an FST from a file.

Parameters:	filename (str) – The location of the input file.
Returns:	An FST object.
Raises:	`RuntimeError` – Read failed.

read_from_stream(strm, ropts)¶

Reads an FST from an input stream.

Parameters:	strm (istream) – The input stream to read from. ropts (FstReadOptions) – FST reading options.
Returns:	An FST object.
Raises:	`RuntimeError` – Read failed.

relabel(ipairs=None, opairs=None)¶

Replaces input and/or output labels using pairs of labels.

This operation destructively relabels the input and/or output labels of the FST using pairs of the form (old_ID, new_ID); omitted indices are identity-mapped.

Parameters:	ipairs – An iterable containing (old index, new index) integer pairs. opairs – An iterable containing (old index, new index) integer pairs.
Returns:	self.
Raises:	`ValueError` – No relabeling pairs specified.
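The identity-mapping behavior for omitted labels is easy to see in a plain-Python sketch. The function `relabel_arcs` and the arc tuple layout are assumptions made for illustration; they only mirror the documented semantics.

```python
def relabel_arcs(arcs, ipairs=None, opairs=None):
    """Relabel (ilabel, olabel, weight, nextstate) arcs using (old, new) pairs."""
    if not ipairs and not opairs:
        raise ValueError("No relabeling pairs specified.")
    imap = dict(ipairs or ())
    omap = dict(opairs or ())
    # Labels missing from a map are left unchanged (identity-mapped).
    return [(imap.get(i, i), omap.get(o, o), w, n) for i, o, w, n in arcs]


arcs = [(1, 2, 0.0, 1), (3, 4, 0.0, 2)]
relabeled = relabel_arcs(arcs, ipairs=[(1, 10)])
```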

relabel_tables(old_isymbols=None, new_isymbols=None, unknown_isymbol='', attach_new_isymbols=True, old_osymbols=None, new_osymbols=None, unknown_osymbol='', attach_new_osymbols=True)¶

Replaces input and/or output labels using SymbolTables.

This operation destructively relabels the input and/or output labels of the FST using user-specified symbol tables; omitted symbols are identity-mapped.

Parameters:	old_isymbols – The old SymbolTable for input labels, defaulting to the FST’s input symbol table. new_isymbols – A SymbolTable used to relabel the input labels. unknown_isymbol – Input symbol to use to relabel OOVs (if empty, OOVs raise an exception). attach_new_isymbols – Should new_isymbols be made the FST’s input symbol table? old_osymbols – The old SymbolTable for output labels, defaulting to the FST’s output symbol table. new_osymbols – A SymbolTable used to relabel the output labels. unknown_osymbol – Output symbol to use to relabel OOVs (if empty, OOVs raise an exception). attach_new_osymbols – Should new_osymbols be made the FST’s output symbol table?
Returns:	self.
Raises:	`ValueError` – No SymbolTable specified.

See also: decode, encode, project, relabel.

reserve_arcs(state, n)¶

Reserve n arcs at a particular state (best effort).

Parameters:	state – The integer index of a state. n – The number of arcs to reserve.
Returns:	self.
Raises:	`IndexError` – State index out of range.

See also: reserve_states.

reserve_states(n)¶

Reserve n states (best effort).

Parameters:	n – The number of states to reserve.
Returns:	self.

See also: reserve_arcs.

reweight(potentials, to_final=False)¶

Reweights an FST using an iterable of potentials.

This operation destructively reweights an FST according to the potentials and in the direction specified by the user. An arc of weight w, with an origin state of potential p and destination state of potential q, is reweighted by p⁻¹ ⊗ (w ⊗ q) when reweighting towards the initial state, and by (p ⊗ w) ⊗ q⁻¹ when reweighting towards the final states. The weights must be left distributive when reweighting towards the initial state and right distributive when reweighting towards the final states (e.g., TropicalWeight and LogWeight).

Parameters:	potentials – An iterable of TropicalWeights. to_final – Push towards final states?
Returns:	self.
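The two arc reweighting formulas can be worked through in plain Python. The sketch assumes the tropical semiring, where the semiring product is addition and the inverse is negation, and it covers arc weights only; a full reweighting would also adjust final weights, which is omitted here. The function `reweight_arcs` and the (source, ilabel, olabel, weight, nextstate) layout are hypothetical.

```python
def reweight_arcs(arcs, potentials, to_final=False):
    """Reweight arcs given one potential per state (tropical semiring)."""
    out = []
    for src, ilabel, olabel, w, dst in arcs:
        p, q = potentials[src], potentials[dst]
        if to_final:
            new_w = (p + w) - q   # (p times w) times inverse of q
        else:
            new_w = -p + (w + q)  # inverse of p times (w times q)
        out.append((src, ilabel, olabel, new_w, dst))
    return out


arcs = [(0, 1, 1, 2.0, 1), (1, 2, 2, 3.0, 2)]
potentials = [0.0, 1.0, 4.0]
toward_initial = reweight_arcs(arcs, potentials)
toward_final = reweight_arcs(arcs, potentials, to_final=True)
```

Along a path the intermediate potentials cancel, so the total path weight changes only by the potentials of its first and last states.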

rmepsilon(connect=True, weight=None, nstate=-1, delta=0.0009765625)¶

Removes epsilon transitions.

This operation destructively removes epsilon transitions (i.e., those where both input and output labels are epsilon) from an FST.

Parameters:	connect – Should output be trimmed? weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired weight threshold below which paths are pruned; if None, no paths are pruned. nstate – State number threshold (default: -1). delta – Comparison/quantization delta (default: 0.0009765625).
Returns:	self.

See also: The constructive variant, which also supports epsilon removal in reverse (and which may be more efficient).

set_final(state, weight=None)¶

Sets the final weight for a state.

Parameters:	state – The integer index of a state. weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired final weight; if omitted, it is set to semiring One.
Raises:	`IndexError` – State index out of range.

See also: set_start.

set_input_symbols(syms)¶

Sets the input symbol table.

Passing None as a value will delete the input symbol table.

Parameters:	syms – A SymbolTable.
Returns:	self.

See also: set_output_symbols.

set_output_symbols(syms)¶

Sets the output symbol table.

Passing None as a value will delete the output symbol table.

Parameters:	syms – A SymbolTable.
Returns:	self.

See also: set_input_symbols.

set_properties(props, mask)¶

Sets the properties bits.

Parameters:	props (int) – The properties to be set. mask (int) – A mask to be applied to the `props` argument before setting the FST’s properties.
Returns:	self.

set_start(state)¶

Sets the initial state.

Parameters:	state – The integer index of a state.
Returns:	self.
Raises:	`IndexError` – State index out of range.

See also: set_final.

start()¶

Returns the start state.

Returns:	The start state if start state is set, -1 otherwise.

states()¶

Returns an iterator over all states in the FST.

Returns:	A StateIterator object for the FST.

See also: arcs, mutable_arcs.

text(isymbols=None, osymbols=None, ssymbols=None, acceptor=False, show_weight_one=False, missing_symbol='')¶

Produces a human-readable string representation of the FST.

This method generates a human-readable string representation of the FST. The caller may optionally specify SymbolTables used to label input labels, output labels, or state labels, respectively.

Parameters:

isymbols – An optional symbol table used to label input symbols.
osymbols – An optional symbol table used to label output symbols.
ssymbols – An optional symbol table used to label states.
acceptor (bool) – Should the FST be rendered in acceptor format if possible? Defaults to False.
show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults to False.
missing_symbol – The string to be printed when symbol table lookup fails.

Returns:

A formatted string representing the FST.

to_bytes()¶

Returns a bytes object representing the FST.

Returns:	A bytes object.

topsort()¶

Sorts transitions by state IDs.

This operation destructively topologically sorts the FST if it is acyclic; otherwise the FST remains unchanged. Once sorted, all transitions are from lower state IDs to higher state IDs.

Returns:	self.

See also: arcsort.
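The effect of topological sorting on state IDs can be sketched in plain Python. This illustration uses Kahn's algorithm, which is not necessarily the algorithm the library uses; the function `topsort_states` and the (source, nextstate) arc pairs are assumptions for the example. It returns a table mapping old state IDs to new ones, or None when the graph is cyclic.

```python
from collections import deque


def topsort_states(num_states, arcs):
    """Return new_id[old_state] for a topological order, or None if cyclic."""
    indegree = [0] * num_states
    successors = [[] for _ in range(num_states)]
    for src, dst in arcs:
        successors[src].append(dst)
        indegree[dst] += 1
    # Start from states with no incoming arcs.
    queue = deque(s for s in range(num_states) if indegree[s] == 0)
    order = []
    while queue:
        s = queue.popleft()
        order.append(s)
        for t in successors[s]:
            indegree[t] -= 1
            if indegree[t] == 0:
                queue.append(t)
    if len(order) != num_states:
        return None  # a cycle prevents a topological order
    new_id = [0] * num_states
    for new, old in enumerate(order):
        new_id[old] = new
    return new_id


arcs = [(0, 2), (2, 1), (0, 1)]
new_id = topsort_states(3, arcs)
renumbered = [(new_id[s], new_id[t]) for s, t in arcs]
```

After renumbering, every arc goes from a lower state ID to a higher one.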

type()¶

Returns the FST type.

Returns:	The FST type.

union(ifst)¶

Computes the union (sum) of two FSTs.

This operation computes the union (sum) of two FSTs. If A transduces string x to y with weight a and B transduces string w to v with weight b, then their union transduces x to y with weight a and w to v with weight b.

Parameters:	ifst – The second input FST.
Returns:	self.
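As a plain-Python illustration of the union semantics, the sketch below models each FST as a dictionary from (input, output) string pairs to weights and assumes the tropical semiring, where the semiring sum of two weights for the same pair is their minimum. The function `union_relations` is made up for this example.

```python
def union_relations(a, b):
    """Union of two weighted string relations over the tropical semiring."""
    result = dict(a)
    for key, w in b.items():
        # A pair present in both relations keeps the smaller weight.
        result[key] = min(w, result.get(key, float("inf")))
    return result


A = {("x", "y"): 1.0}
B = {("w", "v"): 2.0, ("x", "y"): 0.5}
AB = union_relations(A, B)
```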

verify()¶

Verifies that an FST’s contents are sane.

Returns:	True if the contents are sane, False otherwise.

write(filename)¶

Serializes FST to a file.

This method writes the FST to a file in a binary format.

Parameters:	filename (str) – The location of the output file.
Raises:	`IOError` – Write failed.

write_to_stream(strm, wopts)¶

Serializes FST to an output stream.

Parameters:	strm (ostream) – The output stream to write to. wopts (FstWriteOptions) – FST writing options.
Returns:	True if write was successful, False otherwise.
Raises:	`RuntimeError` – Write failed.

class kaldi.fstext.LogVectorFstArcIterator(fst, state)[source]¶

Arc iterator for a vector FST over the log semiring.

This class is used for iterating over the arcs leaving some state. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the arcs method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.

Creates a new arc iterator.

Parameters:	fst – The fst. state – The state index.
Raises:	`IndexError` – State index out of range.
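The relationship between the C++-style methods and the iterator protocol can be sketched with a toy class. This is not the PyKaldi implementation; `ListArcIterator` is a hypothetical stand-in over a plain list, written only to show that a `for` loop is equivalent to driving done, value and next by hand.

```python
class ListArcIterator:
    """Toy iterator exposing a C++-style API and the iterator protocol."""

    def __init__(self, arcs):
        self._arcs = list(arcs)
        self._pos = 0

    # C++-style API
    def done(self):
        return self._pos >= len(self._arcs)

    def next(self):
        self._pos += 1

    def value(self):
        return self._arcs[self._pos]

    def reset(self):
        self._pos = 0

    # Iterator protocol, built on top of the methods above
    def __iter__(self):
        self.reset()
        while not self.done():
            yield self.value()
            self.next()


arcs = ["arc0", "arc1", "arc2"]

# C++-style traversal
it = ListArcIterator(arcs)
cpp_style = []
while not it.done():
    cpp_style.append(it.value())
    it.next()

# Pythonic traversal
pythonic = list(ListArcIterator(arcs))
```

Both traversals visit the same arcs in the same order; the Pythonic form is shorter and harder to get wrong.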

done()¶

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	True if the iterator is exhausted, False otherwise.

flags()¶

Returns the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	The current iterator behavioral flags as an integer.

next()¶

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

position()¶

Returns the position of the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	The iterator’s position, expressed as an integer.

reset()¶

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

seek(a)¶

Advance the iterator to a new position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:	a (int) – The position to seek to.

set_flags(flags, mask)¶

Sets the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:	flags (int) – The properties to be set. mask (int) – A mask to be applied to the `flags` argument before setting them.

value()¶

Returns the current arc.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

class kaldi.fstext.LogVectorFstMutableArcIterator(fst, state)[source]¶

Mutable arc iterator for a vector FST over the log semiring.

This class is used for iterating over the arcs leaving some state and optionally replacing them with new ones. In addition to the full C++ API, it also supports the iterator protocol. Calling the __iter__ method of a mutable arc iterator object returns an iterator over (arc, setter) pairs. The setter is a bound method of the mutable arc iterator object that can be used to replace the current arc with a new one. Most users should just call the mutable_arcs method of a vector FST object instead of directly constructing this iterator and take advantage of the Pythonic API, e.g.

for arc, setter in logfst.mutable_arcs(0):
    setter(LogArc(arc.ilabel, 0, arc.weight, arc.nextstate))

Creates a new arc iterator.

Parameters:	fst – The fst. state – The state index.
Raises:	`IndexError` – State index out of range.

done()¶

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	True if the iterator is exhausted, False otherwise.

flags()¶

Returns the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	The current iterator behavioral flags as an integer.

next()¶

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

position()¶

Returns the position of the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	The iterator’s position, expressed as an integer.

reset()¶

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

seek(a)¶

Advance the iterator to a new position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:	a (int) – The position to seek to.

set_flags(flags, mask)¶

Sets the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:	flags (int) – The properties to be set. mask (int) – A mask to be applied to the `flags` argument before setting them.

set_value(arc)¶

Replace the current arc with a new arc.

Parameters:	arc – The arc to replace the current arc with.

value()¶

Returns the current arc.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

class kaldi.fstext.LogVectorFstStateIterator(fst)[source]¶

State iterator for a vector FST over the log semiring.

This class is used for iterating over the states. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the states method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.

Creates a new state iterator.

Parameters:	fst – The fst.

done()¶

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	True if the iterator is exhausted, False otherwise.

next()¶

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

reset()¶

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

value()¶

Returns the current state index.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

class kaldi.fstext.LogWeight[source]¶

Log weight factory.

This class is used for creating new LogWeight instances.

LogWeight():: Creates an uninitialized LogWeight instance.
LogWeight(weight):: Creates a new LogWeight instance initialized with the weight.

Parameters:	weight (float or FloatWeight) – The weight value.

from_float(f:float) → LogWeight¶: Create a new log weight from a float.

from_other(weight:LogWeight) → LogWeight¶: Create a new log weight from another.

hash() → int¶: Returns the hash for the weight.

member() → bool¶: Checks if weight is a member of log semiring.

no_weight() → LogWeight¶: No weight in log semiring.

one() → LogWeight¶: One in log semiring, i.e. 0.0.

properties() → int¶: Returns weight properties.

quantize(delta:float=default) → LogWeight¶: Quantizes the weight.

reverse() → LogWeight¶: Reverses the weight.

type() → str¶: Returns weight type.

value¶: Float value of the weight.

zero() → LogWeight¶: Zero in log semiring, i.e. float +infinity.
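For readers unfamiliar with the log semiring, its arithmetic can be sketched with plain floats. This is not the LogWeight class: the names below are hypothetical. The semiring sum of a and b is -log(exp(-a) + exp(-b)), the semiring product is ordinary addition, One is 0.0, and Zero is +infinity.

```python
import math

LOG_ZERO = math.inf  # Zero in the log semiring
LOG_ONE = 0.0        # One in the log semiring


def log_plus(a, b):
    """Semiring sum: -log(exp(-a) + exp(-b)), computed stably."""
    if a == math.inf:
        return b
    if b == math.inf:
        return a
    lo, hi = min(a, b), max(a, b)
    return lo - math.log1p(math.exp(lo - hi))


def log_times(a, b):
    """Semiring product: ordinary addition."""
    return a + b


both = log_plus(0.0, 0.0)  # two paths of weight 0.0 combine to -log(2)
```

Zero is the identity for the sum and One is the identity for the product, which the definitions above respect.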

class kaldi.fstext.StdArc[source]¶

FST arc with tropical weight.

StdArc():: Creates an uninitialized StdArc instance.
StdArc(ilabel, olabel, weight, nextstate):: Creates a new StdArc instance initialized with given arguments.

Parameters:	ilabel (int) – The input label. olabel (int) – The output label. weight (TropicalWeight) – The arc weight. nextstate (int) – The destination state for the arc.

from_attrs(ilabel:int, olabel:int, weight:TropicalWeight, nextstate:int) → StdArc¶

Creates a new arc with the given attributes.

Parameters:	ilabel (int) – The input label. olabel (int) – The output label. weight (TropicalWeight) – The arc weight. nextstate (int) – The destination state for the arc.

ilabel¶: int – The input label.

nextstate¶: int – The destination state for the arc.

olabel¶: int – The output label.

type() → str¶: Returns arc type.

weight¶: TropicalWeight – The arc weight.

class kaldi.fstext.StdConstFst(fst=None)[source]¶

Constant FST over the tropical semiring.

Parameters:	fst (StdFst) – The input FST over the tropical semiring. If provided, its contents are used for initializing the new FST. Defaults to `None`.

arcs(state)¶

Returns an iterator over arcs leaving the specified state.

Parameters:	state – The source state index.
Returns:	An ArcIterator.

See also: mutable_arcs, states.

copy()¶

Makes a copy of the FST.

Returns:	A copy of the FST.

draw(filename, isymbols=None, osymbols=None, ssymbols=None, acceptor=False, title='', width=8.5, height=11, portrait=False, vertical=False, ranksep=0.4, nodesep=0.25, fontsize=14, precision=5, float_format='g', show_weight_one=False)¶

Writes out the FST in Graphviz text format.

This method writes out the FST in the dot graph description language. The graph can be rendered using the dot binary provided by Graphviz.

Parameters:

filename (str) – The string location of the output dot/Graphviz file.
isymbols – An optional symbol table used to label input symbols.
osymbols – An optional symbol table used to label output symbols.
ssymbols – An optional symbol table used to label states.
acceptor (bool) – Should the figure be rendered in acceptor format if possible? Defaults to False.
title (str) – An optional string indicating the figure title. Defaults to the empty string.
width (float) – The figure width, in inches. Defaults to 8.5.
height (float) – The figure height, in inches. Defaults to 11.
portrait (bool) – Should the figure be rendered in portrait rather than landscape? Defaults to False.
vertical (bool) – Should the figure be rendered bottom-to-top rather than left-to-right? Defaults to False.
ranksep (float) – The minimum separation between ranks, in inches. Defaults to 0.4.
nodesep (float) – The minimum separation between nodes, in inches. Defaults to 0.25.
fontsize (int) – Font size, in points. Defaults to 14.
precision (int) – Numeric precision for floats, in number of characters. Defaults to 5.
float_format ('e', 'f' or 'g') – The float format; one of 'e', 'f' or 'g'. Defaults to 'g'.
show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults to False.

For more information about the rendering options, see man dot.

See also: text.

final(state)¶

Returns the final weight of a state.

Parameters:	state – The integer index of a state.
Returns:	The final Weight of that state.
Raises:	`IndexError` – State index out of range.

from_bytes(s)¶

Returns the FST represented by the bytes object.

Parameters:	s (bytes) – The bytes object representing the FST.
Returns:	An FST object.

input_symbols()¶

Returns the input symbol table.

Returns:	The input symbol table.

See also: output_symbols.

num_arcs(state=None)¶

Returns the number of arcs, counting them if necessary.

If state is None, returns the number of arcs in the FST. Otherwise, returns the number of arcs leaving that state.

Parameters:	state – The integer index of a state. Defaults to `None`.
Returns:	The number of arcs leaving a state or the number of arcs in the FST.

Note: This method counts the number of arcs in the FST by iterating over the states and summing up the number of arcs leaving each state.

Raises:	`IndexError` – State index out of range.

See also: num_states.

num_input_epsilons(state)¶

Returns the number of arcs with epsilon input labels leaving a state.

Parameters:	state – The integer index of a state.
Returns:	The number of epsilon-input-labeled arcs leaving that state.
Raises:	`IndexError` – State index out of range.

See also: num_output_epsilons.

num_output_epsilons(state)¶

Returns the number of arcs with epsilon output labels leaving a state.

Parameters:	state – The integer index of a state.
Returns:	The number of epsilon-output-labeled arcs leaving that state.
Raises:	`IndexError` – State index out of range.

See also: num_input_epsilons.

num_states()¶

Returns the number of states, counting them if necessary.

Returns:	The number of states.

See also: num_arcs.

output_symbols()¶

Returns the output symbol table.

Returns:	The output symbol table.

See also: input_symbols.

properties(mask, test)¶

Provides property bits.

This method provides user access to the properties attributes for the FST. The resulting value is a long integer, but when it is cast to a boolean, it represents whether or not the FST has the mask property.

Parameters:	mask – The property mask to be compared to the FST’s properties. test – Should any unknown values be computed before comparing against the mask?
Returns:	A 64-bit bitmask representing the requested properties.

read(filename)¶

Reads an FST from a file.

Parameters:	filename (str) – The location of the input file.
Returns:	An FST object.
Raises:	`RuntimeError` – Read failed.

read_from_stream(strm, ropts)¶

Reads an FST from an input stream.

Parameters:	strm (istream) – The input stream to read from. ropts (FstReadOptions) – FST reading options.
Returns:	An FST object.
Raises:	`RuntimeError` – Read failed.

start()¶

Returns the start state.

Returns:	The start state if start state is set, -1 otherwise.

states()¶

Returns an iterator over all states in the FST.

Returns:	A StateIterator object for the FST.

See also: arcs, mutable_arcs.

text(isymbols=None, osymbols=None, ssymbols=None, acceptor=False, show_weight_one=False, missing_symbol='')¶

Produces a human-readable string representation of the FST.

This method generates a human-readable string representation of the FST. The caller may optionally specify SymbolTables used to label input labels, output labels, or state labels, respectively.

Parameters:

isymbols – An optional symbol table used to label input symbols.
osymbols – An optional symbol table used to label output symbols.
ssymbols – An optional symbol table used to label states.
acceptor (bool) – Should the FST be rendered in acceptor format if possible? Defaults to False.
show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults to False.
missing_symbol – The string to be printed when symbol table lookup fails.

Returns:

A formatted string representing the FST.

to_bytes()¶

Returns a bytes object representing the FST.

Returns:	A bytes object.

type()¶

Returns the FST type.

Returns:	The FST type.

verify()¶

Verifies that an FST’s contents are sane.

Returns:	True if the contents are sane, False otherwise.

write(filename)¶

Serializes FST to a file.

This method writes the FST to a file in a binary format.

Parameters:	filename (str) – The location of the output file.
Raises:	`IOError` – Write failed.

write_to_stream(strm, wopts)¶

Serializes FST to an output stream.

Parameters:	strm (ostream) – The output stream to write to. wopts (FstWriteOptions) – FST writing options.
Returns:	True if write was successful, False otherwise.
Raises:	`RuntimeError` – Write failed.

class kaldi.fstext.StdConstFstArcIterator(fst, state)[source]¶

Arc iterator for a constant FST over the tropical semiring.

This class is used for iterating over the arcs leaving some state. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the arcs method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.

Creates a new arc iterator.

Parameters:	fst – The fst. state – The state index.
Raises:	`IndexError` – State index out of range.

done()¶

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	True if the iterator is exhausted, False otherwise.

flags()¶

Returns the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	The current iterator behavioral flags as an integer.

next()¶

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

position()¶

Returns the position of the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	The iterator’s position, expressed as an integer.

reset()¶

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

seek(a)¶

Advance the iterator to a new position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:	a (int) – The position to seek to.

set_flags(flags, mask)¶

Sets the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:	flags (int) – The properties to be set. mask (int) – A mask to be applied to the `flags` argument before setting them.

value()¶

Returns the current arc.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

class kaldi.fstext.StdConstFstStateIterator(fst)[source]¶

State iterator for a constant FST over the tropical semiring.

This class is used for iterating over the states. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the states method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.

Creates a new state iterator.

Parameters:	fst – The fst.

done()¶

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	True if the iterator is exhausted, False otherwise.

next()¶

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

reset()¶

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

value()¶

Returns the current state index.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
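The relationship between the C++-style methods (done, next, value, reset) and the iterator protocol can be sketched in plain Python. The class below is a hypothetical stand-in that iterates over a range of state indices; it is not the PyKaldi implementation and only illustrates why the two loop styles visit the same states.

```python
class ListStateIterator:
    """Hypothetical stand-in exposing the done/next/value/reset methods."""

    def __init__(self, num_states):
        self._num_states = num_states
        self._pos = 0

    def done(self):
        return self._pos >= self._num_states

    def next(self):
        self._pos += 1

    def reset(self):
        self._pos = 0

    def value(self):
        return self._pos

    def __iter__(self):
        # Pythonic view: yield values from the current position onwards.
        while not self.done():
            yield self.value()
            self.next()


it = ListStateIterator(3)

# C++-style loop.
cpp_style = []
while not it.done():
    cpp_style.append(it.value())
    it.next()

# Equivalent Pythonic loop after rewinding the iterator.
it.reset()
pythonic = list(it)
```

Both loops collect the same state indices; the second form is what the states method of an FST object is meant to make convenient.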

class kaldi.fstext.StdEncodeMapper(encode_labels=False, encode_weights=False, encode=True)[source]¶

Arc encoder for an FST over the tropical semiring.

This class provides an object which can be used to encode or decode FST arcs. This is most useful for converting an FST to an unweighted acceptor, on which some FST operations are more efficient, and then decoding the FST afterwards.

To use an instance of this class to encode or decode a mutable FST, pass it as the first argument to the FST instance methods encode and decode. Alternatively, an instance of this class can be used as a callable to encode/decode arcs.

Parameters:	encode_labels (bool) – Should labels be encoded? encode_weights (bool) – Should weights be encoded? encode (bool) – Encode or decode?

flags() → int¶: Returns encoder flags.

from_other(mapper:StdEncodeMapper) → StdEncodeMapper¶: Creates a new encoder with the contents of another.

from_other_with_type(mapper:StdEncodeMapper, type:EncodeType) → StdEncodeMapper¶: Creates a new encoder with the contents of another and given type.

input_symbols() → SymbolTable¶: Returns input symbol table.

output_symbols() → SymbolTable¶: Returns output symbol table.

properties(inprops:int) → int¶

Provides property bits.

This method provides user access to the properties attributes for the encoder. The resulting value is a long integer, but when it is cast to a boolean, it represents whether or not the FST has the mask property.

Parameters:	inprops – The property mask to be compared to the encoder’s properties.
Returns:	A 64-bit bitmask representing the requested properties.

read(filename:str, type:EncodeType=default) → StdEncodeMapper¶: Reads encoder from file.

set_input_symbols(syms:SymbolTable)¶

Sets the input symbol table.

Parameters:	syms – A SymbolTable.

See also: set_output_symbols.

set_output_symbols(syms:SymbolTable)¶

Sets the output symbol table.

Parameters:	syms – A SymbolTable.

See also: set_input_symbols.

type() → EncodeType¶: Returns encoder type.

write(filename:str) → bool¶

Writes encoder to file.

Returns:	True if write was successful, False otherwise.

class kaldi.fstext.StdEncodeTable¶

Encode table for StdArc.

StdEncodeTable(flags): Creates a new encode table with the given flags.

class Tuple¶

StdArc encoding tuple.

ilabel¶: Input label.

olabel¶: Output label.

weight¶: Weight.

decode(key:int) → Tuple¶: Decodes an encoded arc label back to labels and cost.

encode(arc:StdArc) → int¶: Encodes the given arc (either labels or weights or both).

flags() → int¶: Returns encoding flags.

get_label(arc:StdArc) → int¶

Looks up the encoded label for the given arc.

Returns -1 if arc is not found.

input_symbols() → SymbolTable¶: Returns input symbols.

output_symbols() → SymbolTable¶: Returns output symbols.

read(strm:istream, source:str) → StdEncodeTable¶: Reads encode table from input stream.

set_input_symbols(syms:SymbolTable)¶: Sets input symbols.

set_output_symbols(syms:SymbolTable)¶: Sets output symbols.

size() → int¶: Returns the size of the table.

write(strm:ostream, source:str) → bool¶: Writes table to output stream.
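The idea behind an encode table, mapping each distinct (input label, output label, weight) tuple to a single integer label and back, can be sketched in plain Python. MiniEncodeTable is a hypothetical stand-in written for illustration; it always encodes both labels and weights, and its label numbering is an assumption rather than the behaviour of StdEncodeTable.

```python
class MiniEncodeTable:
    """Hypothetical stand-in that maps arc tuples to integer labels."""

    def __init__(self):
        self._tuples = []   # position i holds the tuple for label i + 1
        self._labels = {}   # tuple -> label

    def encode(self, ilabel, olabel, weight):
        key = (ilabel, olabel, weight)
        if key not in self._labels:
            self._tuples.append(key)
            # Assumption: labels start at 1 so that 0 stays free for epsilon.
            self._labels[key] = len(self._tuples)
        return self._labels[key]

    def get_label(self, ilabel, olabel, weight):
        # Lookup only: -1 signals an arc that was never encoded.
        return self._labels.get((ilabel, olabel, weight), -1)

    def decode(self, label):
        return self._tuples[label - 1]

    def size(self):
        return len(self._tuples)


table = MiniEncodeTable()
first = table.encode(3, 4, 0.5)
second = table.encode(5, 6, 1.0)
again = table.encode(3, 4, 0.5)   # same tuple, same label
```

Because identical tuples share a label, the encoded machine can be treated as an unweighted acceptor over the new labels, and decode recovers the original tuple for each label.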

class kaldi.fstext.StdFstCompiler(isymbols=None, osymbols=None, ssymbols=None, acceptor=False, keep_isymbols=False, keep_osymbols=False, keep_state_numbering=False, allow_negative_labels=False)[source]¶

Compiler for FSTs over the tropical semiring.

This class is used to compile FSTs specified using the AT&T FSM library format described here:

http://web.eecs.umich.edu/~radev/NLP-fall2015/resources/fsm_archive/fsm.5.html

This is the same format used by the fstcompile executable.

FstCompiler options (symbol tables, etc.) are set at construction time:

compiler = FstCompiler(isymbols=ascii_syms, osymbols=ascii_syms)

Once constructed, FstCompiler instances behave like a file handle opened for writing:

# /ba+/
print("0 1 50 50", file=compiler)
print("1 2 49 49", file=compiler)
print("2 2 49 49", file=compiler)
print("2", file=compiler)

The compile method returns an actual FST instance:

sheep_machine = compiler.compile()

Compilation flushes the internal buffer, so the compiler instance can be reused to compile new machines with the same symbol tables, etc.

Parameters:

isymbols – An optional SymbolTable used to label input symbols.
osymbols – An optional SymbolTable used to label output symbols.
ssymbols – An optional SymbolTable used to label states.
acceptor – Should the FST be rendered in acceptor format if possible?
keep_isymbols – Should the input symbol table be stored in the FST?
keep_osymbols – Should the output symbol table be stored in the FST?
keep_state_numbering – Should the state numbering be preserved?
allow_negative_labels – Should negative labels be allowed? (Not recommended; may cause conflicts).
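To make the text format concrete, the sketch below parses the lines from the example above into plain Python structures. It is an illustration only: parse_att_text is a hypothetical helper, it assumes the transducer form with integer labels ("src dst ilabel olabel [weight]" for arcs, "state [weight]" for final states), and it assumes an omitted weight means 0.0, the tropical semiring One.

```python
def parse_att_text(lines):
    """Parses AT&T-style lines into (arcs, finals) using plain Python types."""
    arcs = []      # (src, dst, ilabel, olabel, weight)
    finals = {}    # state -> final weight
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        if len(fields) <= 2:
            # "state [weight]" marks a final state.
            state = int(fields[0])
            finals[state] = float(fields[1]) if len(fields) == 2 else 0.0
        else:
            # "src dst ilabel olabel [weight]" describes an arc.
            src, dst, ilabel, olabel = (int(f) for f in fields[:4])
            weight = float(fields[4]) if len(fields) == 5 else 0.0
            arcs.append((src, dst, ilabel, olabel, weight))
    return arcs, finals


sheep_lines = ["0 1 50 50", "1 2 49 49", "2 2 49 49", "2"]
arcs, finals = parse_att_text(sheep_lines)
```

The compiler class performs this kind of parsing internally and additionally resolves symbols through the supplied symbol tables, which the sketch does not attempt.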

compile()¶

Compiles the FST in the string buffer.

This method compiles the FST and returns the resulting machine.

Returns:	The FST described by the string buffer.
Raises:	`RuntimeError` – Compilation failed.

write(expression)¶

Writes a string into the compiler string buffer.

This method adds a line to the compiler string buffer. It can also be invoked with a print call, like so:

compiler = FstCompiler()
print("0 0 49 49", file=compiler)
print("0", file=compiler)

Parameters:	expression – A string expression to add to compiler string buffer.

class kaldi.fstext.StdVectorFst(fst=None)[source]¶

Vector FST over the tropical semiring.

Parameters:	fst (StdFst) – The input FST over the tropical semiring. If provided, its contents are used for initializing the new FST. Defaults to `None`.

add_arc(state, arc)¶

Adds a new arc to the FST and returns self.

Parameters:	state – The integer index of the source state. arc – The arc to add.
Returns:	self.
Raises:	`IndexError` – State index out of range.

See also: add_state.

add_state()¶

Adds a new state to the FST and returns the state ID.

Returns:	The integer index of the new state.

See also: add_arc, set_start, set_final.

arcs(state)¶

Returns an iterator over arcs leaving the specified state.

Parameters:	state – The source state index.
Returns:	An ArcIterator.

See also: mutable_arcs, states.

arcsort(sort_type='ilabel')¶

Sorts arcs leaving each state of the FST.

This operation destructively sorts arcs leaving each state using either input or output labels.

Parameters:	sort_type – Either “ilabel” (sort arcs according to input labels) or “olabel” (sort arcs according to output labels).
Returns:	self.
Raises:	`ValueError` – Unknown sort type.

See also: topsort.

closure(closure_plus=False)¶

Computes concatenative closure.

This operation destructively converts the FST to its concatenative closure. If A transduces string x to y with weight a, then the closure transduces x to y with weight a, xx to yy with weight a ⊗ a, xxx to yyy with weight a ⊗ a ⊗ a, and so on. The empty string is also transduced to itself with semiring One if closure_plus is False.

Parameters:	closure_plus – If True, do not accept the empty string.
Returns:	self.
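For the tropical semiring, where ⊗ is ordinary addition, One is 0.0 and Zero is infinity, the weights described above can be worked out directly. The helper below is a hypothetical illustration that assumes x is non-empty and that a single path maps x to y; it does not call PyKaldi.

```python
def closure_weight(a, n, closure_plus=False):
    """Weight given to n repetitions of x when x maps to y with weight a."""
    if n == 0:
        # Empty string: weight One (0.0) for closure star, and
        # Zero (infinity, i.e. not accepted) for closure plus.
        return float("inf") if closure_plus else 0.0
    # Tropical ⊗ is addition, so a ⊗ ... ⊗ a (n times) is n * a.
    return n * a


weights = [closure_weight(1.5, n) for n in range(4)]
empty_plus = closure_weight(1.5, 0, closure_plus=True)
```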

concat(ifst)¶

Computes the concatenation (product) of two FSTs.

This operation destructively concatenates the FST with a second FST. If A transduces string x to y with weight a and B transduces string w to v with weight b, then their concatenation transduces string xw to yv with weight a ⊗ b.

Parameters:	ifst – The second input FST.
Returns:	self.

connect()¶

Removes unsuccessful paths.

This operation destructively trims the FST, removing states and arcs that are not part of any successful path.

Returns:	self.

copy()¶

Makes a copy of the FST.

Returns:	A copy of the FST.

decode(encoder)¶

Decodes encoded labels and/or weights.

This operation reverses the encoding performed by encode.

Parameters:	encoder – An EncodeMapper object used to encode the FST.
Returns:	self.

See also: encode.

delete_arcs(state, n=None)¶

Deletes arcs leaving a particular state.

Parameters:	state – The integer index of a state. n – An optional argument indicating how many arcs to be deleted. If this argument is None, all arcs from this state are deleted.
Returns:	self.
Raises:	`IndexError` – State index out of range.

See also: delete_states.

delete_states(states=None)¶

Deletes states.

Parameters:	states – An optional iterable of integer indices of the states to be deleted. If this argument is omitted, all states are deleted.
Returns:	self.
Raises:	`IndexError` – State index out of range.

See also: delete_arcs.

draw(filename, isymbols=None, osymbols=None, ssymbols=None, acceptor=False, title='', width=8.5, height=11, portrait=False, vertical=False, ranksep=0.4, nodesep=0.25, fontsize=14, precision=5, float_format='g', show_weight_one=False)¶

Writes out the FST in Graphviz text format.

This method writes out the FST in the dot graph description language. The graph can be rendered using the dot binary provided by Graphviz.

Parameters:

filename (str) – The string location of the output dot/Graphviz file.
isymbols – An optional symbol table used to label input symbols.
osymbols – An optional symbol table used to label output symbols.
ssymbols – An optional symbol table used to label states.
acceptor (bool) – Should the figure be rendered in acceptor format if possible? Defaults False.
title (str) – An optional string indicating the figure title. Defaults to empty string.
width (float) – The figure width, in inches. Defaults 8.5.
height (float) – The figure height, in inches. Defaults 11.
portrait (bool) – Should the figure be rendered in portrait rather than landscape? Defaults False.
vertical (bool) – Should the figure be rendered bottom-to-top rather than left-to-right? Defaults False.
ranksep (float) – The minimum separation between ranks, in inches. Defaults 0.4.
nodesep (float) – The minimum separation between nodes, in inches. Defaults 0.25.
fontsize (int) – Font size, in points. Defaults 14pt.
precision (int) – Numeric precision for floats, in number of chars. Defaults to 5.
float_format ('e', 'f' or 'g') – One of: ‘e’, ‘f’ or ‘g’. Defaults to ‘g’.
show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults False.

For more information about the rendering options, see man dot.

See also: text.

encode(encoder)¶

Encodes labels and/or weights.

This operation allows for the representation of a weighted transducer as a weighted acceptor, an unweighted transducer, or an unweighted acceptor by considering the pair (input label, output label), the pair (input label, weight), or the triple (input label, output label, weight) as a single label. Applying this operation mutates the EncodeMapper argument, which can then be used to decode.

Parameters:	encoder – An EncodeMapper object used to encode the FST.
Returns:	self.

See also: decode.

final(state)¶

Returns the final weight of a state.

Parameters:	state – The integer index of a state.
Returns:	The final Weight of that state.
Raises:	`IndexError` – State index out of range.

from_bytes(s)¶

Returns the FST represented by the bytes object.

Parameters:	s (bytes) – The bytes object representing the FST.
Returns:	An FST object.

input_symbols()¶

Returns the input symbol table.

Returns:	The input symbol table.

See also: output_symbols.

invert()¶

Inverts the FST’s transduction.

This operation destructively inverts the FST’s transduction by exchanging input and output labels.

Returns:	self.

minimize(delta=0.0009765625, allow_nondet=False)¶

Minimizes the FST.

This operation destructively performs the minimization of deterministic weighted automata and transducers. If the input FST A is an acceptor, this operation produces the minimal acceptor B equivalent to A, i.e. the acceptor with a minimal number of states that is equivalent to A. If the input FST A is a transducer, this operation internally builds an equivalent transducer with a minimal number of states. However, this minimality is obtained by allowing transitions to have strings of symbols as output labels; this is known in the literature as a real-time transducer. Such transducers are not directly supported by the library, so this function converts them by expanding each string-labeled transition into a sequence of transitions. This creates new states and hence loses the minimality property.

Parameters:	delta – Comparison/quantization delta (default: 0.0009765625). allow_nondet – Attempt minimization of non-deterministic FST?
Returns:	self.
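The default delta, 0.0009765625, is exactly 2 to the power -10. One plausible reading of a comparison delta is that two float weights are treated as equal when they differ by at most delta. The helper below is a hypothetical sketch of that test, not a call into the library.

```python
DELTA = 2.0 ** -10  # 0.0009765625, the default shown in the signature


def approx_equal(w1, w2, delta=DELTA):
    """True if two float weights differ by at most delta."""
    return w1 <= w2 + delta and w2 <= w1 + delta


close = approx_equal(1.0, 1.0005)
far = approx_equal(1.0, 1.002)
```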

mutable_arcs(state)¶

Returns a mutable iterator over arcs leaving the specified state.

Parameters:	state – The source state index.
Returns:	A MutableArcIterator.

See also: arcs, states.

num_arcs(state=None)¶

Returns the number of arcs, counting them if necessary.

If state is None, returns the number of arcs in the FST. Otherwise, returns the number of arcs leaving that state.

Parameters:	state – The integer index of a state. Defaults to `None`.
Returns:	The number of arcs leaving a state or the number of arcs in the FST.

Note: This method counts the number of arcs in the FST by iterating over the states and summing up the number of arcs leaving each state.

Raises:	`IndexError` – State index out of range.

See also: num_states.

num_input_epsilons(state)¶

Returns the number of arcs with epsilon input labels leaving a state.

Parameters:	state – The integer index of a state.
Returns:	The number of epsilon-input-labeled arcs leaving that state.
Raises:	`IndexError` – State index out of range.

See also: num_output_epsilons.

num_output_epsilons(state)¶

Returns the number of arcs with epsilon output labels leaving a state.

Parameters:	state – The integer index of a state.
Returns:	The number of epsilon-output-labeled arcs leaving that state.
Raises:	`IndexError` – State index out of range.

See also: num_input_epsilons.

num_states()¶

Returns the number of states, counting them if necessary.

Returns:	The number of states.

See also: num_arcs.

output_symbols()¶

Returns the output symbol table.

Returns:	The output symbol table.

See also: input_symbols.

project(project_output=False)¶

Converts the FST to an acceptor using input or output labels.

This operation destructively projects an FST onto its domain or range by either copying each arc’s input label to its output label (the default) or vice versa.

Parameters:	project_output – Project onto output labels?
Returns:	self.
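The effect of projection on arc labels can be sketched with plain (input label, output label) pairs. The function below is a hypothetical illustration; unlike the method, it returns a new list instead of mutating its argument.

```python
def project(arcs, project_output=False):
    """Returns (ilabel, olabel) pairs with one side copied onto the other."""
    if project_output:
        # Project onto the range: keep output labels on both sides.
        return [(olabel, olabel) for _, olabel in arcs]
    # Project onto the domain: keep input labels on both sides.
    return [(ilabel, ilabel) for ilabel, _ in arcs]


arcs = [(1, 10), (2, 20)]
on_input = project(arcs)
on_output = project(arcs, project_output=True)
```

After projection every arc has identical input and output labels, which is what makes the result an acceptor.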

properties(mask, test)¶

Provides property bits.

This method provides user access to the properties attributes for the FST. The resulting value is a long integer, but when it is cast to a boolean, it represents whether or not the FST has the mask property.

Parameters:	mask – The property mask to be compared to the FST’s properties. test – Should any unknown values be computed before comparing against the mask?
Returns:	A 64-bit bitmask representing the requested properties.
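The mask-and-cast idiom described above can be sketched with ordinary integers. The bit values below are hypothetical and chosen only for illustration; they are not the property constants defined by the library.

```python
# Hypothetical property bits, chosen for illustration only.
ACCEPTOR = 1 << 0
I_DETERMINISTIC = 1 << 1
ACYCLIC = 1 << 2


def properties(stored_props, mask):
    """Returns the subset of stored property bits selected by mask."""
    return stored_props & mask


stored = ACCEPTOR | ACYCLIC
is_acceptor = bool(properties(stored, ACCEPTOR))
is_deterministic = bool(properties(stored, I_DETERMINISTIC))
```

Casting the returned integer to a boolean answers whether any of the masked bits are set, which is the usual way to test for a single property.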

prune(weight=None, nstate=-1, delta=0.0009765625)¶

Removes paths with weights below a certain threshold.

This operation deletes states and arcs in the input FST that do not belong to a successful path whose weight is no more (with respect to the natural semiring order) than the threshold ⊗ the weight of the shortest path in the input FST. Weights must be commutative and have the path property.

Parameters:	weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired weight threshold below which paths are pruned; if None, no paths are pruned. nstate – State number threshold (default: -1). delta – Comparison/quantization delta (default: 0.0009765625).
Returns:	self.

See also: The constructive variant.
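In the tropical semiring, where ⊗ is addition and smaller weights are better, the pruning criterion reduces to a simple comparison. The sketch below applies it to a hypothetical dictionary of complete path weights rather than to an FST, so it illustrates only the threshold rule and not the state and arc deletion.

```python
def prune_paths(path_weights, threshold):
    """Keeps paths whose weight is at most threshold ⊗ shortest weight."""
    if threshold is None or not path_weights:
        return dict(path_weights)        # nothing to prune
    best = min(path_weights.values())    # tropical shortest path weight
    limit = threshold + best             # tropical ⊗ is addition
    return {p: w for p, w in path_weights.items() if w <= limit}


paths = {"a b": 1.0, "a c": 2.5, "d": 4.0}
kept = prune_paths(paths, 2.0)
unpruned = prune_paths(paths, None)
```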

push(to_final=False, delta=0.0009765625, remove_total_weight=False)¶

Pushes weights towards the initial or final states.

This operation destructively produces an equivalent transducer by pushing the weights towards the initial state or toward the final states. When pushing weights towards the initial state, the sum of the weight of the outgoing transitions and final weight at any non-initial state is equal to one in the resulting machine. When pushing weights towards the final states, the sum of the weight of the incoming transitions at any state is equal to one. Weights need to be left distributive when pushing towards the initial state and right distributive when pushing towards the final states.

Parameters:	to_final – Push towards final states? delta – Comparison/quantization delta (default: 0.0009765625). remove_total_weight – If pushing weights, should the total weight be removed?
Returns:	self.

See also: The constructive variant, which also supports label pushing.

read(filename)¶

Reads an FST from a file.

Parameters:	filename (str) – The location of the input file.
Returns:	An FST object.
Raises:	`RuntimeError` – Read failed.

read_from_stream(strm, ropts)¶

Reads an FST from an input stream.

Parameters:	strm (istream) – The input stream to read from. ropts (FstReadOptions) – FST reading options.
Returns:	An FST object.
Raises:	`RuntimeError` – Read failed.

relabel(ipairs=None, opairs=None)¶

Replaces input and/or output labels using pairs of labels.

This operation destructively relabels the input and/or output labels of the FST using pairs of the form (old_ID, new_ID); omitted indices are identity-mapped.

Parameters:	ipairs – An iterable containing (old index, new index) integer pairs. opairs – An iterable containing (old index, new index) integer pairs.
Returns:	self.
Raises:	`ValueError` – No relabeling pairs specified.
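The identity mapping of omitted labels can be sketched on a plain list of labels. The function below is a hypothetical illustration of the (old_ID, new_ID) semantics and does not operate on an FST.

```python
def relabel(labels, pairs):
    """Maps each label through (old, new) pairs; others stay unchanged."""
    mapping = dict(pairs)
    return [mapping.get(label, label) for label in labels]


relabeled = relabel([1, 2, 3, 2], [(2, 7), (3, 9)])
```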

relabel_tables(old_isymbols=None, new_isymbols=None, unknown_isymbol='', attach_new_isymbols=True, old_osymbols=None, new_osymbols=None, unknown_osymbol='', attach_new_osymbols=True)¶

Replaces input and/or output labels using SymbolTables.

This operation destructively relabels the input and/or output labels of the FST using user-specified symbol tables; omitted symbols are identity-mapped.

Parameters:	old_isymbols – The old SymbolTable for input labels, defaulting to the FST’s input symbol table. new_isymbols – A SymbolTable used to relabel the input labels. unknown_isymbol – Input symbol to use to relabel OOVs (if empty, OOVs raise an exception). attach_new_isymbols – Should new_isymbols be made the FST’s input symbol table? old_osymbols – The old SymbolTable for output labels, defaulting to the FST’s output symbol table. new_osymbols – A SymbolTable used to relabel the output labels. unknown_osymbol – Output symbol to use to relabel OOVs (if empty, OOVs raise an exception). attach_new_osymbols – Should new_osymbols be made the FST’s output symbol table?
Returns:	self.
Raises:	`ValueError` – No SymbolTable specified.

See also: decode, encode, project, relabel.

reserve_arcs(state, n)¶

Reserves n arcs at a particular state (best effort).

Parameters:	state – The integer index of a state. n – The number of arcs to reserve.
Returns:	self.
Raises:	`IndexError` – State index out of range.

See also: reserve_states.

reserve_states(n)¶

Reserves n states (best effort).

Parameters:	n – The number of states to reserve.
Returns:	self.

See also: reserve_arcs.

reweight(potentials, to_final=False)¶

Reweights an FST using an iterable of potentials.

This operation destructively reweights an FST according to the potentials and in the direction specified by the user. An arc of weight w, with an origin state of potential p and destination state of potential q, is reweighted by p⁻¹ ⊗ (w ⊗ q) when reweighting towards the initial state, and by (p ⊗ w) ⊗ q⁻¹ when reweighting towards the final states. The weights must be left distributive when reweighting towards the initial state and right distributive when reweighting towards the final states (e.g., TropicalWeight and LogWeight).

Parameters:	potentials – An iterable of TropicalWeights. to_final – Push towards final states?
Returns:	self.
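In the tropical semiring ⊗ is addition and the inverse is negation, so the two formulas become w + q - p and p + w - q. The sketch below applies them to the arcs of a single hypothetical path 0 → 1 → 2; it covers arc weights only and leaves out the adjustment of initial and final weights that a full reweighting also involves.

```python
def reweight_arc(w, p, q, to_final=False):
    """Reweights an arc of weight w going from potential p to potential q."""
    if to_final:
        return (p + w) - q   # (p ⊗ w) ⊗ q⁻¹ in the tropical semiring
    return (w + q) - p       # p⁻¹ ⊗ (w ⊗ q) in the tropical semiring


arc_weights = [1.0, 2.0]         # arcs 0 -> 1 and 1 -> 2
potentials = [0.0, 0.5, 3.0]     # one potential per state

to_initial = [
    reweight_arc(w, potentials[i], potentials[i + 1])
    for i, w in enumerate(arc_weights)
]
to_final = [
    reweight_arc(w, potentials[i], potentials[i + 1], to_final=True)
    for i, w in enumerate(arc_weights)
]
```

Along a path the intermediate potentials cancel, so the total changes only by the potentials of the first and last states.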

rmepsilon(connect=True, weight=None, nstate=-1, delta=0.0009765625)¶

Removes epsilon transitions.

This operation destructively removes epsilon transitions (i.e., those where both input and output labels are epsilon) from an FST.

Parameters:	connect – Should output be trimmed? weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired weight threshold below which paths are pruned; if None, no paths are pruned. nstate – State number threshold (default: -1). delta – Comparison/quantization delta (default: 0.0009765625).
Returns:	self.

See also: The constructive variant, which also supports epsilon removal in reverse (and which may be more efficient).

set_final(state, weight=None)¶

Sets the final weight for a state.

Parameters:	state – The integer index of a state. weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired final weight; if omitted, it is set to semiring One.
Raises:	`IndexError` – State index out of range.

See also: set_start.

set_input_symbols(syms)¶

Sets the input symbol table.

Passing None as a value will delete the input symbol table.

Parameters:	syms – A SymbolTable.
Returns:	self.

See also: set_output_symbols.

set_output_symbols(syms)¶

Sets the output symbol table.

Passing None as a value will delete the output symbol table.

Parameters:	syms – A SymbolTable.
Returns:	self.

See also: set_input_symbols.

set_properties(props, mask)¶

Sets the properties bits.

Parameters:	props (int) – The properties to be set. mask (int) – A mask to be applied to the `props` argument before setting the FST’s properties.
Returns:	self.

set_start(state)¶

Sets the initial state.

Parameters:	state – The integer index of a state.
Returns:	self.
Raises:	`IndexError` – State index out of range.

See also: set_final.

start()¶

Returns the start state.

Returns:	The start state if start state is set, -1 otherwise.

states()¶

Returns an iterator over all states in the FST.

Returns:	A StateIterator object for the FST.

See also: arcs, mutable_arcs.

text(isymbols=None, osymbols=None, ssymbols=None, acceptor=False, show_weight_one=False, missing_symbol='')¶

Produces a human-readable string representation of the FST.

This method generates a human-readable string representation of the FST. The caller may optionally specify SymbolTables used to label input labels, output labels, or state labels, respectively.

Parameters:

isymbols – An optional symbol table used to label input symbols.
osymbols – An optional symbol table used to label output symbols.
ssymbols – An optional symbol table used to label states.
acceptor (bool) – Should the FST be rendered in acceptor format if possible? Defaults False.
show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults False.
missing_symbol – The string to be printed when symbol table lookup fails.

Returns:

A formatted string representing the FST.

to_bytes()¶

Returns a bytes object representing the FST.

Returns:	A bytes object.

topsort()¶

Sorts transitions by state IDs.

This operation destructively topologically sorts the FST, if it is acyclic; otherwise it remains unchanged. Once sorted, all transitions are from lower state IDs to higher state IDs.

Returns:	self.

See also: arcsort.
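The renumbering that a topological sort produces can be sketched on a plain list of (source, destination) state pairs. The function below uses Kahn's algorithm, which is chosen here for brevity and is not necessarily the algorithm used by the library; topological_order is a hypothetical helper.

```python
from collections import deque


def topological_order(num_states, arcs):
    """Returns a list mapping old state ID -> new ID, or None if cyclic."""
    indegree = [0] * num_states
    successors = [[] for _ in range(num_states)]
    for src, dst in arcs:
        successors[src].append(dst)
        indegree[dst] += 1
    # Start from states with no incoming arcs.
    queue = deque(s for s in range(num_states) if indegree[s] == 0)
    new_id = [None] * num_states
    next_id = 0
    while queue:
        state = queue.popleft()
        new_id[state] = next_id
        next_id += 1
        for dst in successors[state]:
            indegree[dst] -= 1
            if indegree[dst] == 0:
                queue.append(dst)
    # If some state was never numbered, the graph contains a cycle.
    return new_id if next_id == num_states else None


arcs = [(2, 0), (0, 1), (2, 1)]
order = topological_order(3, arcs)
cyclic = topological_order(2, [(0, 1), (1, 0)])
```

With the returned numbering every arc goes from a lower new ID to a higher one; a cyclic input yields None, mirroring the case where the FST is left unchanged.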

type()¶

Returns the FST type.

Returns:	The FST type.

union(ifst)¶

Computes the union (sum) of two FSTs.

This operation computes the union (sum) of two FSTs. If A transduces string x to y with weight a and B transduces string w to v with weight b, then their union transduces x to y with weight a and w to v with weight b.

Parameters:	ifst – The second input FST.
Returns:	self.

verify()¶

Verifies that an FST’s contents are sane.

Returns:	True if the contents are sane, False otherwise.

write(filename)¶

Serializes FST to a file.

This method writes the FST to a file in a binary format.

Parameters:	filename (str) – The location of the output file.
Raises:	`IOError` – Write failed.

write_to_stream(strm, wopts)¶

Serializes FST to an output stream.

Parameters:	strm (ostream) – The output stream to write to. wopts (FstWriteOptions) – FST writing options.
Returns:	True if write was successful, False otherwise.
Raises:	`RuntimeError` – Write failed.

class kaldi.fstext.StdVectorFstArcIterator(fst, state)[source]¶

Arc iterator for a vector FST over the tropical semiring.

This class is used for iterating over the arcs leaving some state. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the arcs method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.

Creates a new arc iterator.

Parameters:	fst – The fst. state – The state index.
Raises:	`IndexError` – State index out of range.

done()¶

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	True if the iterator is exhausted, False otherwise.

flags()¶

Returns the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	The current iterator behavioral flags as an integer.

next()¶

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

position()¶

Returns the position of the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	The iterator’s position, expressed as an integer.

reset()¶

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

seek(a)¶

Advances the iterator to a new position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:	a (int) – The position to seek to.

set_flags(flags, mask)¶

Sets the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:	flags (int) – The properties to be set. mask (int) – A mask to be applied to the `flags` argument before setting them.

value()¶

Returns the current arc.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

class kaldi.fstext.StdVectorFstMutableArcIterator(fst, state)[source]¶

Mutable arc iterator for a vector FST over the tropical semiring.

This class is used for iterating over the arcs leaving some state and optionally replacing them with new ones. In addition to the full C++ API, it also supports the iterator protocol. Calling the __iter__ method of a mutable arc iterator object returns an iterator over (arc, setter) pairs. The setter is a bound method of the mutable arc iterator object that can be used to replace the current arc with a new one. Most users should just call the mutable_arcs method of a vector FST object instead of directly constructing this iterator and take advantage of the Pythonic API, e.g.

for arc, setter in fst.mutable_arcs(0):
    setter(StdArc(arc.ilabel, 0, arc.weight, arc.nextstate))
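The (arc, setter) pattern can be tried out without an FST using a plain list of arcs. The sketch below is a hypothetical stand-in: Arc is a namedtuple rather than StdArc, and the setter is a closure rather than a bound method of the iterator.

```python
from collections import namedtuple

# Hypothetical stand-in for an arc type with the same four fields.
Arc = namedtuple("Arc", "ilabel olabel weight nextstate")


def mutable_arcs(arc_list):
    """Yields (arc, setter) pairs; setter replaces the arc just yielded."""
    for position, arc in enumerate(arc_list):
        def setter(new_arc, position=position):
            arc_list[position] = new_arc
        yield arc, setter


arcs = [Arc(1, 5, 0.5, 1), Arc(2, 6, 1.0, 2)]
for arc, setter in mutable_arcs(arcs):
    # Replace every output label with 0 (epsilon), keeping the rest.
    setter(Arc(arc.ilabel, 0, arc.weight, arc.nextstate))
```

Pairing each arc with its own setter lets the loop body decide per arc whether to replace it, without tracking positions by hand.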

Creates a new arc iterator.

Parameters:	fst – The fst. state – The state index.
Raises:	`IndexError` – State index out of range.

done()¶

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	True if the iterator is exhausted, False otherwise.

flags()¶

Returns the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	The current iterator behavioral flags as an integer.

next()¶

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

position()¶

Returns the position of the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	The iterator’s position, expressed as an integer.

reset()¶

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

seek(a)¶

Advances the iterator to a new position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:	a (int) – The position to seek to.

set_flags(flags, mask)¶

Sets the current iterator behavioral flags.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Parameters:	flags (int) – The properties to be set. mask (int) – A mask to be applied to the `flags` argument before setting them.

set_value(arc)¶

Replaces the current arc with a new arc.

Parameters:	arc – The arc to replace the current arc with.

value()¶

Returns the current arc.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

class kaldi.fstext.StdVectorFstStateIterator(fst)[source]¶

State iterator for a vector FST over the tropical semiring.

This class is used for iterating over the states. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the states method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.

Creates a new state iterator.

Parameters:	fst – The fst.

done()¶

Indicates whether the iterator is exhausted or not.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

Returns:	True if the iterator is exhausted, False otherwise.

next()¶

Advances the iterator.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

reset()¶

Resets the iterator to the initial position.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

value()¶

Returns the current state index.

This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.

class kaldi.fstext.SymbolTable¶

Symbol table.

SymbolTable(): Creates a new symbol table.

This class can be used to programmatically construct a SymbolTable in memory, e.g.

import string

table = SymbolTable()
table.set_name("alphabet")
table.add_symbol("<eps>")
for symbol in string.ascii_lowercase:
    table.add_symbol(symbol)
table.write_text("alphabet.syms")

add_pair(symbol:str, key:int) → int¶

Adds a symbol with given key to the table and returns the index.

This method adds a (symbol, key) pair to the table. If symbol is already in the table with a different key, then the return value will be the already existing key. Otherwise, return value will be the given key.

Parameters:	symbol – A symbol string. key – A non-negative index for the symbol (-1 is reserved for “no symbol requested”).
Returns:	The integer index of the new symbol.

add_symbol(symbol:str) → int¶

Adds a symbol to the table and returns the index.

This method adds a symbol to the table. The associated value key is automatically assigned by the symbol table.

Parameters:	symbol – A symbol string.
Returns:	The integer index of the new symbol.

add_table(table:SymbolTable)¶

Adds another SymbolTable to this table.

This method merges another symbol table into the current table. All key values will be offset by the current available key.

Parameters:	table – A SymbolTable to be merged with the current table.

available_key() → int¶: Returns the current available key (i.e. highest key + 1).

checksum() → str¶: Returns the label-agnostic MD5 checksum for the table.

copy() → SymbolTable¶: Returns a copy of the symbol table.

find_index(symbol:str) → int¶

Given a symbol, finds the associated index.

Parameters:	symbol – A symbol string.
Returns:	The index associated with the symbol, or -1 if the symbol is not found.

find_symbol(key:int) → str¶

Given an index, finds the associated symbol.

Parameters:	key – An index.
Returns:	The symbol associated with the index key. Empty string if index is not found.

from_name(name:str) → SymbolTable¶: Creates a new SymbolTable with the given name.

get_nth_key(pos:int) → int¶

Retrieves the integer index of the n-th key in the table.

Parameters:	pos – The n-th key to retrieve.
Returns:	The integer index of the n-th key or -1 if index is not found.

labeled_checksum() → str¶: Returns the label-dependent MD5 checksum of the table.

member_index(key:int) → bool¶

Given an index, returns whether it is found in the table.

This method returns a boolean indicating whether the given index is present in the table. If one intends to perform a subsequent lookup, it is much better to simply call the find_symbol method and check the return value.

Parameters:	key – An index.
Returns:	Whether or not the key is present in the table.

member_symbol(symbol:str) → bool¶

Given a symbol, returns whether it is found in the table.

This method returns a boolean indicating whether the given symbol is present in the table. If one intends to perform a subsequent lookup, it is much better to simply call the find_index method and check the return value.

Parameters:	symbol – A symbol.
Returns:	Whether or not the symbol is present in the table.
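
A membership test followed by a lookup can be collapsed into one call by checking the documented sentinel values (-1 from find_index, an empty string from find_symbol). A minimal sketch relying only on those two methods; FakeTable is a hypothetical stand-in so that the snippet runs without PyKaldi, and the helper names are illustrative:

```python
class FakeTable:
    """Hypothetical stand-in exposing find_index/find_symbol with the documented sentinels."""

    def __init__(self, symbols):
        self._by_symbol = {s: i for i, s in enumerate(symbols)}
        self._by_index = {i: s for s, i in self._by_symbol.items()}

    def find_index(self, symbol):
        return self._by_symbol.get(symbol, -1)

    def find_symbol(self, key):
        return self._by_index.get(key, "")


def index_or_none(table, symbol):
    """Return the index for symbol, or None when find_index reports -1."""
    index = table.find_index(symbol)
    return None if index == -1 else index


def symbol_or_none(table, key):
    """Return the symbol for key, or None when find_symbol reports an empty string."""
    symbol = table.find_symbol(key)
    return None if symbol == "" else symbol


table = FakeTable(["<eps>", "a", "b"])
print(index_or_none(table, "a"), index_or_none(table, "z"))
print(symbol_or_none(table, 2), symbol_or_none(table, 7))
```

The helpers accept any object with the two documented methods, so they should work unchanged with a real SymbolTable.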

name() → str¶: Returns the name of the table.

num_symbols() → int¶: Returns the number of symbols in the table.

read(filename:str) → SymbolTable¶

Reads symbol table from binary file.

This class method creates a new SymbolTable.

Parameters:	filename – The string location of the input binary file.
Returns:	A new SymbolTable instance.

See also: arcsort.

kaldi.fstext.deserialize_symbol_table(str:bytes) → SymbolTable¶: Deserializes a symbol table.

kaldi.fstext.determinize(ifst, delta=0.0009765625, weight=None, nstate=-1, subsequential_label=0, det_type='functional', increment_subsequential_label=False)[source]¶

Constructively determinizes a weighted FST.

This operation creates an equivalent FST that has the property that no state has two transitions with the same input label. For this algorithm, epsilon transitions are treated as regular symbols (cf. rmepsilon).

Parameters:	ifst – The input FST. delta – Comparison/quantization delta (default: 0.0009765625). weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired weight threshold below which paths are pruned; if None, no paths are pruned. nstate – State number threshold (default: -1). subsequential_label – Input label of arc corresponding to residual final output when producing a subsequential transducer. det_type – Type of determinization; one of: “functional” (input transducer is functional), “nonfunctional” (input transducer is not functional) and “disambiguate” (input transducer is not functional but only keep the min of ambiguous outputs). increment_subsequential_label – Increment the subsequential label when creating several arcs for the residual final output at a given state.
Returns:	An equivalent deterministic FST.
Raises:	`ValueError` – Unknown determinization type.
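
The default delta equals 2^-10. A small sketch of what a comparison/quantization delta means for floating-point weight values, assuming the usual OpenFst conventions (two values compare equal when they differ by at most delta, and quantization rounds to the nearest multiple of delta); the helper names are illustrative and not part of PyKaldi:

```python
import math

DELTA = 0.0009765625  # default from the signature above, i.e. 2 ** -10


def approx_equal(w1, w2, delta=DELTA):
    """True when the two weight values differ by at most delta."""
    return w1 <= w2 + delta and w2 <= w1 + delta


def quantize(w, delta=DELTA):
    """Round a weight value to the nearest multiple of delta."""
    return math.floor(w / delta + 0.5) * delta


print(DELTA == 2 ** -10)
print(approx_equal(1.0, 1.0005), approx_equal(1.0, 1.002))
print(quantize(1.0004), quantize(1.0006))
```

A larger delta merges more nearly-equal weights during determinization, which can keep the output smaller at the cost of precision.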

kaldi.fstext.enums¶

Functions

`GetArcSortType`	Calls C++ function
`GetClosureType`	Calls C++ function
`GetComposeFilter`	Calls C++ function
`GetDeterminizeType`	Calls C++ function
`GetEncodeFlags`	Calls C++ function
`GetEpsNormalizeType`	Calls C++ function
`GetMapType`	Calls C++ function
`GetProjectType`	Calls C++ function
`GetPushFlags`	Calls C++ function
`GetQueueType`	Calls C++ function
`GetRandArcSelection`	Calls C++ function
`GetReplaceLabelType`	Calls C++ function
`GetReweightType`	Calls C++ function

Classes

`ArcSortType`	An enumeration.
`ClosureType`	An enumeration.
`ComposeFilter`	An enumeration.
`DeterminizeType`	An enumeration.
`EncodeType`	An enumeration.
`EpsNormalizeType`	An enumeration.
`MapType`	An enumeration.
`MatchType`	An enumeration.
`ProjectType`	An enumeration.
`QueueType`	An enumeration.
`RandArcSelection`	An enumeration.
`ReplaceLabelType`	An enumeration.
`ReweightType`	An enumeration.

class kaldi.fstext.enums.ArcSortType¶

An enumeration.

ILABEL_SORT = 0¶

OLABEL_SORT = 1¶

class kaldi.fstext.enums.ClosureType¶

An enumeration.

CLOSURE_PLUS = 1¶

CLOSURE_STAR = 0¶

class kaldi.fstext.enums.ComposeFilter¶

An enumeration.

ALT_SEQUENCE_FILTER = 4¶

AUTO_FILTER = 0¶

MATCH_FILTER = 5¶

NULL_FILTER = 1¶

SEQUENCE_FILTER = 3¶

TRIVIAL_FILTER = 2¶

class kaldi.fstext.enums.DeterminizeType¶

An enumeration.

DETERMINIZE_DISAMBIGUATE = 2¶

DETERMINIZE_FUNCTIONAL = 0¶

DETERMINIZE_NONFUNCTIONAL = 1¶

class kaldi.fstext.enums.EncodeType¶

An enumeration.

DECODE = 2¶

ENCODE = 1¶

class kaldi.fstext.enums.EpsNormalizeType¶

An enumeration.

EPS_NORM_INPUT = 0¶

EPS_NORM_OUTPUT = 1¶

kaldi.fstext.enums.GetArcSortType(str:str) -> (success:bool, sort_type:ArcSortType)¶: Calls C++ function bool ::fst::script::GetArcSortType(::std::string, ::fst::script::ArcSortType*)

kaldi.fstext.enums.GetClosureType(closure_plus:bool) → ClosureType¶: Calls C++ function ::fst::ClosureType ::fst::script::GetClosureType(bool)

kaldi.fstext.enums.GetComposeFilter(str:str) -> (success:bool, compose_filter:ComposeFilter)¶: Calls C++ function bool ::fst::script::GetComposeFilter(::std::string, ::fst::ComposeFilter*)

kaldi.fstext.enums.GetDeterminizeType(str:str) -> (success:bool, det_type:DeterminizeType)¶: Calls C++ function bool ::fst::script::GetDeterminizeType(::std::string, ::fst::DeterminizeType*)

kaldi.fstext.enums.GetEncodeFlags(encode_labels:bool, encode_weights:bool) → int¶: Calls C++ function unsigned int ::fst::script::GetEncodeFlags(bool, bool)

kaldi.fstext.enums.GetEpsNormalizeType(eps_norm_output:bool) → EpsNormalizeType¶: Calls C++ function ::fst::EpsNormalizeType ::fst::script::GetEpsNormalizeType(bool)

kaldi.fstext.enums.GetMapType(str:str) -> (success:bool, sort_type:MapType)¶: Calls C++ function bool ::fst::script::GetMapType(::std::string, ::fst::script::MapType*)

kaldi.fstext.enums.GetProjectType(project_output:bool) → ProjectType¶: Calls C++ function ::fst::ProjectType ::fst::script::GetProjectType(bool)

kaldi.fstext.enums.GetPushFlags(push_weights:bool, push_labels:bool, remove_total_weight:bool, remove_common_affix:bool) → int¶: Calls C++ function unsigned int ::fst::script::GetPushFlags(bool, bool, bool, bool)

kaldi.fstext.enums.GetQueueType(str:str) -> (success:bool, queue_type:QueueType)¶: Calls C++ function bool ::fst::script::GetQueueType(::std::string, ::fst::QueueType*)

kaldi.fstext.enums.GetRandArcSelection(str:str) -> (success:bool, ras:RandArcSelection)¶: Calls C++ function bool ::fst::script::GetRandArcSelection(::std::string, ::fst::script::RandArcSelection*)

kaldi.fstext.enums.GetReplaceLabelType(str:str, epsilon_on_replace:bool) -> (success:bool, rlt:ReplaceLabelType)¶: Calls C++ function bool ::fst::script::GetReplaceLabelType(::std::string, bool, ::fst::ReplaceLabelType*)

kaldi.fstext.enums.GetReweightType(to_final:bool) → ReweightType¶: Calls C++ function ::fst::ReweightType ::fst::script::GetReweightType(bool)
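
The string-based Get* functions above report success through the first element of the returned tuple. A pure-Python mirror of that convention for ArcSortType, using the enumeration values documented in this module; the accepted strings "ilabel" and "olabel" are an assumption based on OpenFst's arc sort options, and get_arc_sort_type is a stand-in, not the wrapped function:

```python
import enum


class ArcSortType(enum.IntEnum):
    """Mirror of the documented enumeration values."""
    ILABEL_SORT = 0
    OLABEL_SORT = 1


def get_arc_sort_type(name):
    """Stand-in returning (success, sort_type) like the documented signature."""
    mapping = {"ilabel": ArcSortType.ILABEL_SORT, "olabel": ArcSortType.OLABEL_SORT}
    if name in mapping:
        return True, mapping[name]
    return False, None  # assumption: the second element is meaningless on failure


success, sort_type = get_arc_sort_type("olabel")
if not success:
    raise ValueError("unknown arc sort type")
print(sort_type.name)
```

Callers should check the success flag before using the second element, since an unrecognized string does not necessarily raise.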

class kaldi.fstext.enums.MapType¶

An enumeration.

ARC_SUM_MAPPER = 0¶

ARC_UNIQUE_MAPPER = 1¶

IDENTITY_MAPPER = 2¶

INPUT_EPSILON_MAPPER = 3¶

INVERT_MAPPER = 4¶

OUTPUT_EPSILON_MAPPER = 5¶

PLUS_MAPPER = 6¶

POWER_MAPPER = 7¶

QUANTIZE_MAPPER = 8¶

RMWEIGHT_MAPPER = 9¶

SUPERFINAL_MAPPER = 10¶

TIMES_MAPPER = 11¶

TO_LOG64_MAPPER = 13¶

TO_LOG_MAPPER = 12¶

TO_STD_MAPPER = 14¶

class kaldi.fstext.enums.MatchType¶

An enumeration.

MATCH_BOTH = 3¶

MATCH_INPUT = 1¶

MATCH_NONE = 4¶

MATCH_OUTPUT = 2¶

MATCH_UNKNOWN = 5¶

class kaldi.fstext.enums.ProjectType¶

An enumeration.

PROJECT_INPUT = 1¶

PROJECT_OUTPUT = 2¶

class kaldi.fstext.enums.QueueType¶

An enumeration.

AUTO_QUEUE = 7¶

FIFO_QUEUE = 1¶

LIFO_QUEUE = 2¶

OTHER_QUEUE = 8¶

SCC_QUEUE = 6¶

SHORTEST_FIRST_QUEUE = 3¶

STATE_ORDER_QUEUE = 5¶

TOP_ORDER_QUEUE = 4¶

TRIVIAL_QUEUE = 0¶

class kaldi.fstext.enums.RandArcSelection¶

An enumeration.

FAST_LOG_PROB_ARC_SELECTOR = 2¶

LOG_PROB_ARC_SELECTOR = 1¶

UNIFORM_ARC_SELECTOR = 0¶

class kaldi.fstext.enums.ReplaceLabelType¶

An enumeration.

REPLACE_LABEL_BOTH = 4¶

REPLACE_LABEL_INPUT = 2¶

REPLACE_LABEL_NEITHER = 1¶

REPLACE_LABEL_OUTPUT = 3¶

class kaldi.fstext.enums.ReweightType¶

An enumeration.

REWEIGHT_TO_FINAL = 1¶

REWEIGHT_TO_INITIAL = 0¶

kaldi.fstext.properties¶

FST Properties.

kaldi.fstext.properties.EXPANDED = 1¶

kaldi.fstext.properties.MUTABLE = 2¶

kaldi.fstext.properties.ERROR = 4¶

kaldi.fstext.properties.ACCEPTOR = 65536¶

kaldi.fstext.properties.NOT_ACCEPTOR = 131072¶

kaldi.fstext.properties.I_DETERMINISTIC = 262144¶

kaldi.fstext.properties.NON_I_DETERMINISTIC = 524288¶

kaldi.fstext.properties.O_DETERMINISTIC = 1048576¶

kaldi.fstext.properties.NON_O_DETERMINISTIC = 2097152¶

kaldi.fstext.properties.EPSILONS = 4194304¶

kaldi.fstext.properties.NO_EPSILONS = 8388608¶

kaldi.fstext.properties.I_EPSILONS = 16777216¶

kaldi.fstext.properties.NO_I_EPSILONS = 33554432¶

kaldi.fstext.properties.O_EPSILONS = 67108864¶

kaldi.fstext.properties.NO_O_EPSILONS = 134217728¶

kaldi.fstext.properties.I_LABEL_SORTED = 268435456¶

kaldi.fstext.properties.NOT_I_LABEL_SORTED = 536870912¶

kaldi.fstext.properties.O_LABEL_SORTED = 1073741824¶

kaldi.fstext.properties.NOT_O_LABEL_SORTED = 2147483648¶

kaldi.fstext.properties.WEIGHTED = 4294967296¶

kaldi.fstext.properties.UNWEIGHTED = 8589934592¶

kaldi.fstext.properties.CYCLIC = 17179869184¶

kaldi.fstext.properties.ACYCLIC = 34359738368¶

kaldi.fstext.properties.INITIAL_CYCLIC = 68719476736¶

kaldi.fstext.properties.INITIAL_ACYCLIC = 137438953472¶

kaldi.fstext.properties.TOP_SORTED = 274877906944¶

kaldi.fstext.properties.NOT_TOP_SORTED = 549755813888¶

kaldi.fstext.properties.ACCESSIBLE = 1099511627776¶

kaldi.fstext.properties.NOT_ACCESSIBLE = 2199023255552¶

kaldi.fstext.properties.COACCESSIBLE = 4398046511104¶

kaldi.fstext.properties.NOT_COACCESSIBLE = 8796093022208¶

kaldi.fstext.properties.STRING = 17592186044416¶

kaldi.fstext.properties.NOT_STRING = 35184372088832¶

kaldi.fstext.properties.WEIGHTED_CYCLES = 70368744177664¶

kaldi.fstext.properties.UNWEIGHTED_CYCLES = 140737488355328¶

kaldi.fstext.properties.NULL_PROPERTIES = 164284018786304¶

kaldi.fstext.properties.COPY_PROPERTIES = 281474976645124¶

kaldi.fstext.properties.INTRINSIC_PROPERTIES = 281474976645123¶

kaldi.fstext.properties.EXTRINSIC_PROPERTIES = 4¶

kaldi.fstext.properties.SET_START_PROPERTIES = 225193725198343¶

kaldi.fstext.properties.SET_FINAL_PROPERTIES = 215491394076679¶

kaldi.fstext.properties.ADD_STATE_PROPERTIES = 258385232461831¶

kaldi.fstext.properties.ADD_ARC_PROPERTIES = 76509027631111¶

kaldi.fstext.properties.SET_ARC_PROPERTIES = 7¶

kaldi.fstext.properties.DELETE_STATE_PROPERTIES = 141194274603015¶

kaldi.fstext.properties.DELETE_ARC_PROPERTIES = 152189390880775¶

kaldi.fstext.properties.STATE_SORT_PROPERTIES = 227873784791047¶

kaldi.fstext.properties.ARC_SORT_PROPERTIES = 281470950113287¶

kaldi.fstext.properties.I_LABEL_INVARIANT_PROPERTIES = 281474107441159¶

kaldi.fstext.properties.O_LABEL_INVARIANT_PROPERTIES = 281471538167815¶

kaldi.fstext.properties.WEIGHT_INVARIANT_PROPERTIES = 70355859210247¶

kaldi.fstext.properties.ADD_SUPERFINAL_PROPERTIES = 262506881417223¶

kaldi.fstext.properties.RM_SUPERFINAL_PROPERTIES = 243539050430471¶

kaldi.fstext.properties.BINARY_PROPERTIES = 7¶

kaldi.fstext.properties.TRINARY_PROPERTIES = 281474976645120¶

kaldi.fstext.properties.POS_TRINARY_PROPERTIES = 93824992215040¶

kaldi.fstext.properties.NEG_TRINARY_PROPERTIES = 187649984430080¶

kaldi.fstext.properties.FST_PROPERTIES = 281474976645127¶
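
Each property is a single bit, or a union of bits, in an integer mask, so properties are tested with bitwise operators. A short sketch using integer values copied from the list above; has_property and the sample mask are illustrative, and in PyKaldi the constants would come from kaldi.fstext.properties instead:

```python
# Values copied from the listing above.
ACCEPTOR = 65536
NOT_ACCEPTOR = 131072
ACYCLIC = 34359738368
BINARY_PROPERTIES = 7
POS_TRINARY_PROPERTIES = 93824992215040
NEG_TRINARY_PROPERTIES = 187649984430080
TRINARY_PROPERTIES = 281474976645120
FST_PROPERTIES = 281474976645127


def has_property(props, mask):
    """True when every bit in mask is set in props."""
    return props & mask == mask


# The composite masks are unions of the simpler ones.
assert POS_TRINARY_PROPERTIES | NEG_TRINARY_PROPERTIES == TRINARY_PROPERTIES
assert BINARY_PROPERTIES | TRINARY_PROPERTIES == FST_PROPERTIES

# For example, NOT_ACCEPTOR is the bit directly above ACCEPTOR.
assert NOT_ACCEPTOR == ACCEPTOR << 1

sample = ACCEPTOR | ACYCLIC  # hypothetical property mask of some FST
print(has_property(sample, ACCEPTOR), has_property(sample, NOT_ACCEPTOR))
```

If neither the positive nor the negative bit of a pair is set, the property is simply unknown for that FST.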

kaldi.fstext.special¶

Functions

`add_subsequential_loop`	Adds a subsequential symbol loop to the input FST.
`compose_context`	Creates a context FST and composes it on the left with input fst.
`compose_context_left_biphone`	Creates a context FST and composes it on the left with input fst.
`compose_deterministic_on_demand_fst`	Composes an FST with a deterministic on demand FST.
`create_ilabel_info_symbol_table`	Creates a symbol table from the ilabel info and phones symbol table.
`determinize_lattice`	Determinizes lattice.
`determinize_star`	Implements a special determinization with epsilon removal.
`determinize_star_in_log`	Performs determinize_star in place in log semiring.
`get_encoding_multiple`	Returns the smallest multiple of 1000 > nonterm_phones_offset.
`push_in_log`	Push weights/labels in log semiring.
`push_special`	Pushes weights in log semiring in a special way.
`read_ilabel_info`	Reads ilabel info from input stream.
`remove_eps_local`	Removes epsilon arcs locally.
`table_compose`	Performs table composition.
`table_compose_cache`	Performs cached table composition.
`table_compose_cache_lattice`	Performs cached table composition on lattices.
`table_compose_lattice`	Performs table composition on lattices.
`write_ilabel_info`	Writes ilabel info to output stream.

Classes

`LatticeTableComposeCache`	Cache for table compose.
`NonterminalValues`	An enumeration.
`ScaleDeterministicOnDemandFst`	A DeterministicOnDemandFst scaling the weights of another.
`StdBackoffDeterministicOnDemandFst`	Deterministic on demand backoff language model.
`StdCacheDeterministicOnDemandFst`	A DeterministicOnDemandFst caching the arcs of another.
`StdComposeDeterministicOnDemandFst`	A DeterministicOnDemandFst implementing the composition of others.
`StdDeterministicOnDemandFst`	Base class for deterministic on demand FSTs over the tropical semiring.
`StdInverseContextFst`	Inverse of the context FST “C” in “HCLG” over the tropical semiring.
`StdInverseLeftBiphoneContextFst`	Inverse of the left-biphone context FST “C” over the tropical semiring.
`StdTableComposeCache`	Cache for table compose.
`StdUnweightedNgramFst`	A DeterministicOnDemandFst in which states encode an n-gram history.
`TableComposeOptions`	Options for table composition.
`TableMatcherOptions`	Options for table matcher.

class kaldi.fstext.special.LatticeTableComposeCache¶

Cache for table compose.

Used for doing multiple compositions while caching the same matcher.

This version is for composing FSTs over lattice semiring.

from_compose_opts(opts:TableComposeOptions=default) → LatticeTableComposeCache¶: Creates a new LatticeTableComposeCache instance.

opts¶: Table compose options.

class kaldi.fstext.special.NonterminalValues¶

An enumeration.

kNontermBegin = 1¶

kNontermBigNumber = 10000000¶

kNontermBos = 0¶

kNontermEnd = 2¶

kNontermMediumNumber = 1000¶

kNontermReenter = 3¶

kNontermUserDefined = 4¶
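
kNontermMediumNumber is presumably the multiple that get_encoding_multiple refers to, summarized above as the smallest multiple of 1000 greater than nonterm_phones_offset. A pure-Python sketch of that arithmetic as stated in the summary; encoding_multiple is an illustrative name, and the strict inequality is taken from the one-line description, so the wrapped function's behaviour at exact multiples of 1000 should be checked before relying on it:

```python
K_NONTERM_MEDIUM_NUMBER = 1000  # value listed above


def encoding_multiple(nonterm_phones_offset, step=K_NONTERM_MEDIUM_NUMBER):
    """Smallest multiple of step strictly greater than nonterm_phones_offset."""
    return (nonterm_phones_offset // step + 1) * step


print(encoding_multiple(347))
```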

class kaldi.fstext.special.ScaleDeterministicOnDemandFst¶

A DeterministicOnDemandFst scaling the weights of another.

For instance, to subtract existing LM scores from a lattice you could use this with a negative weight; and to interpolate LMs you can also use this with weights less than one.

Parameters:	scale (float) – The scaling factor. det_fst (StdDeterministicOnDemandFst) – The input deterministic on demand FST.

final(state:int) → TropicalWeight¶: Returns the final weight of the given state.

get_arc(s:int, ilabel:int) -> (success:bool, oarc:StdArc)¶

Creates an on demand arc and returns it.

Parameters:	s (int) – State index. ilabel (int) – Arc label.
Returns:	The created arc.

start() → int¶: Returns the start state index.

class kaldi.fstext.special.StdBackoffDeterministicOnDemandFst¶

Deterministic on demand backoff language model.

This class wraps a conventional Fst, representing a language model, with a “DeterministicOnDemandFst” interface. Backoff arcs in the language model should carry the epsilon label (label 0), and there should be no other epsilons in the language model. The backoff (i.e. epsilon) arcs are followed if a particular arc (or a final-prob) is not found at the current state.

Parameters:	fst (StdFst) – Input language model FST.
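
The backoff rule described above can be sketched in pure Python. This toy model stores arcs in a dict and adds costs along backoff arcs, as multiplication does in the tropical semiring; the data layout is an assumption made for illustration, and get_arc only mirrors the documented (success, arc) return shape rather than PyKaldi's implementation:

```python
EPSILON = 0  # backoff arcs carry the epsilon label

# Toy language model: state -> {ilabel: (cost, next_state)}.
arcs = {
    0: {1: (0.5, 1), EPSILON: (2.0, 2)},  # state 0 backs off to state 2
    1: {EPSILON: (1.0, 2)},
    2: {1: (3.0, 1), 2: (4.0, 2)},        # state 2 has no backoff arc
}


def get_arc(state, ilabel):
    """Return (success, (cost, next_state)), following backoff arcs as needed."""
    assert ilabel != EPSILON
    backoff_cost = 0.0  # the tropical semiring's One
    while True:
        out = arcs.get(state, {})
        if ilabel in out:
            cost, next_state = out[ilabel]
            return True, (backoff_cost + cost, next_state)
        if EPSILON not in out:
            return False, None
        cost, state = out[EPSILON]
        backoff_cost += cost


print(get_arc(0, 1))
print(get_arc(0, 2))
print(get_arc(2, 3))
```

Because at most one arc is returned per (state, label) pair, the wrapped model behaves deterministically even though the underlying FST contains epsilon arcs.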

final(state:int) → TropicalWeight¶: Returns the final weight of the given state.

get_arc(s:int, ilabel:int) -> (success:bool, oarc:StdArc)¶

Creates an on demand arc and returns it.

Parameters:	s (int) – State index. ilabel (int) – Arc label.
Returns:	The created arc.

start() → int¶: Returns the start state index.

class kaldi.fstext.special.StdCacheDeterministicOnDemandFst¶

A DeterministicOnDemandFst caching the arcs of another.

Parameters:	fst (StdDeterministicOnDemandFst) – The input deterministic on demand FST. num_cached_arcs (int) – Number of arcs to keep in the cache.

final(state:int) → TropicalWeight¶: Returns the final weight of the given state.

get_arc(s:int, ilabel:int) -> (success:bool, oarc:StdArc)¶

Creates an on demand arc and returns it.

Parameters:	s (int) – State index. ilabel (int) – Arc label.
Returns:	The created arc.

start() → int¶: Returns the start state index.

class kaldi.fstext.special.StdComposeDeterministicOnDemandFst¶

A DeterministicOnDemandFst implementing the composition of others.

Parameters:	fst1 (StdDeterministicOnDemandFst) – The first deterministic on demand FST. fst2 (StdDeterministicOnDemandFst) – The second deterministic on demand FST.

final(state:int) → TropicalWeight¶: Returns the final weight of the given state.

get_arc(s:int, ilabel:int) -> (success:bool, oarc:StdArc)¶

Creates an on demand arc and returns it.

Parameters:	s (int) – State index. ilabel (int) – Arc label.
Returns:	The created arc.

start() → int¶: Returns the start state index.

class kaldi.fstext.special.StdDeterministicOnDemandFst¶

Base class for deterministic on demand FSTs over the tropical semiring.

final(state:int) → TropicalWeight¶: Returns the final weight of the given state.

get_arc(s:int, ilabel:int) -> (success:bool, oarc:StdArc)¶

Creates an on demand arc and returns it.

Parameters:	s (int) – State index. ilabel (int) – Arc label.
Returns:	The created arc.

start() → int¶: Returns the start state index.

class kaldi.fstext.special.StdInverseContextFst¶

Inverse of the context FST “C” in “HCLG” over the tropical semiring.

InverseContextFst represents the inverse of the context FST “C” (the “C” in “HCLG”) which transduces from symbols representing phone context windows (e.g. “a, b, c”) to individual phones, e.g. “a”. So InverseContextFst transduces from phones to symbols representing phone context windows. The point is that the inverse is deterministic, so the DeterministicOnDemandFst interface is applicable, which turns out to be a convenient way to implement this.

This doesn’t implement the full Fst interface; it implements the DeterministicOnDemandFst interface, which is much simpler and sufficient for what we need to do with this.

Search for “hbka.pdf” (“Speech Recognition with Weighted Finite State Transducers”) by M. Mohri, for more context.

Parameters:	subsequential_symbol (int) – Integer index of the subsequential symbol. phones (List[int]) – Integer indices for the phones. disambig_syms (List[int]) – Integer indices for disambiguation symbols. context_width (int) – Size of context window. central_position (int) – Position of central phone in context window, from 0..N-1.

final(state:int) → TropicalWeight¶: Returns the final weight of the given state.

get_arc(s:int, ilabel:int) -> (success:bool, oarc:StdArc)¶

Creates an on demand arc and returns it.

Parameters:	s (int) – State index. ilabel (int) – Arc label.
Returns:	The created arc.

ilabel_info() → list<list<int>>¶: Returns input label info.

start() → int¶: Returns the start state index.

class kaldi.fstext.special.StdInverseLeftBiphoneContextFst¶

Inverse of the left-biphone context FST “C” over the tropical semiring.

This does not take the arguments ‘context_width’ or ‘central_position’ because they are assumed to be (2, 1) meaning a system with left-biphone context; and there is no subsequential symbol because it is not needed in systems without right context.

Parameters:	nonterm_phones_offset (int) – Integer index of the first non-terminal symbol. Set to a large value, e.g. 1 million, if not using non-terminals. phones (List[int]) – Integer indices for the phones. disambig_syms (List[int]) – Integer indices for disambiguation symbols.

final(state:int) → TropicalWeight¶: Returns the final weight of the given state.

get_arc(s:int, ilabel:int) -> (success:bool, oarc:StdArc)¶

Creates an on demand arc and returns it.

Parameters:	s (int) – State index. ilabel (int) – Arc label.
Returns:	The created arc.

ilabel_info() → list<list<int>>¶: Returns input label info.

start() → int¶: Returns the start state index.

class kaldi.fstext.special.StdTableComposeCache¶

Cache for table compose.

Used for doing multiple compositions while caching the same matcher.

from_compose_opts(opts:TableComposeOptions=default) → StdTableComposeCache¶: Creates a new StdTableComposeCache instance.

opts¶: Table compose options.

class kaldi.fstext.special.StdUnweightedNgramFst¶

A DeterministicOnDemandFst in which states encode an n-gram history.

Conceptually, for n-gram order n and k labels, the FST is an unweighted acceptor with about k^(n-1) states (ignoring end effects). However, the FST is created on demand and doesn’t need the label vocabulary; get_arc matches on any input label. This class is primarily used by compose_deterministic_on_demand_fst to expand the n-gram history of lattices.

Parameters:	n (int) – N-gram order.
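
How states can encode an n-gram history without knowing the vocabulary is easiest to see in a toy model. The sketch below assumes that state 0 is the empty history and that each arc leads to the state for the most recent n-1 labels; ToyUnweightedNgram and its arc tuple layout are hypothetical, not PyKaldi's class:

```python
class ToyUnweightedNgram:
    """Hypothetical pure-Python model: states are created on demand from label histories."""

    def __init__(self, n):
        self.n = n
        self._state_of = {(): 0}   # history tuple -> state id
        self._history_of = [()]    # state id -> history tuple

    def start(self):
        return 0

    def get_arc(self, state, ilabel):
        """Match any label; the next state encodes the last n-1 labels."""
        history = self._history_of[state] + (ilabel,)
        history = history[-(self.n - 1):] if self.n > 1 else ()
        if history not in self._state_of:
            self._state_of[history] = len(self._history_of)
            self._history_of.append(history)
        # Unweighted acceptor: same input and output label, zero cost.
        return True, (ilabel, ilabel, 0.0, self._state_of[history])


fst = ToyUnweightedNgram(3)
state = fst.start()
for label in (5, 7, 9):
    _, (_, _, _, state) = fst.get_arc(state, label)
print(state)
```

States are only allocated for histories that actually occur, which is why the k^(n-1) bound is rarely reached in practice.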

final(state:int) → TropicalWeight¶: Returns the final weight of the given state.

get_arc(s:int, ilabel:int) -> (success:bool, oarc:StdArc)¶

Creates an on demand arc and returns it.

Parameters:	s (int) – State index. ilabel (int) – Arc label.
Returns:	The created arc.

start() → int¶: Returns the start state index.

class kaldi.fstext.special.TableComposeOptions¶

Options for table composition.

connect¶: Connect output

filter_type¶: Which pre-defined filter to use.

from_matcher_opts(mo:TableMatcherOptions, connect:bool=default, filter_type:ComposeFilter=default, table_match_type:MatchType=default) → TableComposeOptions¶: Creates a new TableComposeOptions instance.

min_table_size¶: Minimum table size.

table_match_type¶: Type of table match.

table_ratio¶: Construct the table if it would be at least this full.

class kaldi.fstext.special.TableMatcherOptions¶

Options for table matcher.

Table matcher is a matcher specialized for the case where the output side of the left FST always has either all-epsilons coming out of a state, or a majority of the symbol table. Therefore we can either store nothing (for the all-epsilon case) or store a lookup table from labels to arc offsets. Since the table matcher has to iterate over all arcs in each left-hand state the first time it sees it, this matcher type is not efficient if you compose with something very small on the right – unless you do it multiple times and keep the matcher around.

Table matcher class is not exposed to Python code directly. Instances of TableMatcherOptions can be passed to table_compose() and TableComposeCache for controlling the table matcher behavior.

min_table_size¶: Minimum table size.

table_ratio¶: Construct the table if it would be at least this full.

kaldi.fstext.special.add_subsequential_loop(subseq_symbol:int, fst:StdMutableFst)¶

Adds a subsequential symbol loop to the input FST.

Modifies the FST so that it transduces the same paths, but the input side of the paths can all have the subsequential symbol ‘$’ appended to them any number of times.

Parameters:	subseq_symbol (int) – Integer index for the subsequential symbol. fst (StdMutableFst) – Input FST, modified in place.

kaldi.fstext.special.compose_context(disambig_syms, N, P, ifst)[source]¶

Creates a context FST and composes it on the left with input fst.

Outputs the label information along with the composed FST. Input FST should be mutable since the algorithm adds the subsequential loop to it.

Parameters:	disambig_syms (List[int]) – Disambiguation symbols. N (int) – Size of context window. P (int) – Position of central phone in context window, from 0..N-1. ifst (StdFst) – Input FST.
Returns:	Output fst, label information tuple.
Return type:	Tuple[StdVectorFst, List[List[int]]]
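
The roles of N and P are easiest to see on a short phone sequence. A pure-Python sketch that lists, for each phone, a window of width N with that phone at position P; padding the boundaries with 0 is an assumption of this sketch, and context_windows is an illustrative helper rather than part of PyKaldi:

```python
def context_windows(phones, N, P, pad=0):
    """Return one window of width N per phone, with the phone at position P."""
    if not 0 <= P < N:
        raise ValueError("P must lie in 0..N-1")
    padded = [pad] * P + list(phones) + [pad] * (N - 1 - P)
    return [tuple(padded[i:i + N]) for i in range(len(phones))]


# Triphone context: one phone of left context and one of right context.
print(context_windows([10, 20, 30], N=3, P=1))
# Left-biphone context: one phone of left context only.
print(context_windows([10, 20, 30], N=2, P=1))
```

Each entry of the returned label information plays a similar role: it records which phone context window an input label of the composed FST stands for.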

kaldi.fstext.special.compose_context_left_biphone(nonterm_phones_offset:int, disambig_syms:list<int>, ifst:StdVectorFst, ofst:StdVectorFst) → list<list<int>>¶

Creates a context FST and composes it on the left with input fst.

This is a variant of the function compose_context which is to be used with the “grammar FST” framework. This does not take the ‘context_width’ and ‘central_position’ arguments because they are assumed to be 2 and 1 respectively (meaning, left-biphone phonetic context).

Parameters:	nonterm_phones_offset (int) – The integer index of the first non-terminal symbol. disambig_syms (List[int]) – Disambiguation symbols. ifst (StdVectorFst) – Input FST. ofst (StdVectorFst) – Output FST.
Returns:	Label information.
Return type:	List[List[int]]

kaldi.fstext.special.compose_deterministic_on_demand_fst(fst1, fst2, inverse=False)[source]¶

Composes an FST with a deterministic on demand FST.

If inverse is True, computes ofst = Compose(Inverse(fst2), fst1). Note that the arguments are reversed in this case.

This function does not trim its output.

Parameters:	fst1 (StdFst) – The input FST. fst2 (StdDeterministicOnDemandFst) – The input deterministic on demand FST. inverse (bool) – If True, the inverse of the deterministic on demand FST is composed on the left of fst1.
Returns:	A composed FST.

kaldi.fstext.special.create_ilabel_info_symbol_table(info:list<list<int>>, phones_symtab:SymbolTable, separator:str, disambig_prefix:str) → SymbolTable¶

Creates a symbol table from the ilabel info and phones symbol table.

This is mainly used for debugging.

kaldi.fstext.special.determinize_lattice(ifst, compact_output=True, delta=0.0009765625, max_mem=-1, max_loop=-1)[source]¶

Determinizes lattice.

Implements a special form of determinization with epsilon removal, optimized for a phase of lattice generation.

See kaldi/src/fstext/determinize-lattice.h for details.

Parameters:	ifst (LatticeFst) – Input lattice. compact_output (bool) – Whether the output is a compact lattice. delta (float) – Comparison/quantization delta. max_mem (int) – If positive, determinization will fail when the algorithm’s (approximate) memory consumption crosses this threshold. max_loop (int) – If positive, can be used to detect non-determinizable input (a case that wouldn’t be caught by max_mem).
Returns:	A determinized lattice.
Raises:	`RuntimeError` – If determinization fails.

kaldi.fstext.special.determinize_star(ifst, delta=0.0009765625, max_states=-1, allow_partial=False)[source]¶

Implements a special determinization with epsilon removal.

See kaldi/src/fstext/determinize-star.h for details.

Parameters:	ifst (StdFst) – Input fst over the tropical semiring. delta (float) – Comparison/quantization delta. max_states (int) – If positive, determinization will fail when max states is reached. allow_partial (bool) – If True, the algorithm will output partial results when the specified max states is reached (when larger than zero), instead of raising an exception.
Returns:	A determinized FST.
Raises:	`RuntimeError` – If determinization fails.

kaldi.fstext.special.determinize_star_in_log(fst:StdVectorFst, delta:float=default, max_states:int=default)¶

Performs determinize_star in place in log semiring.

Parameters:	fst (StdVectorFst) – Input FST over the tropical semiring, determinized in place. delta (float) – Comparison/quantization delta. max_states (int) – If positive, determinization will fail when max states is reached.
Raises:	`RuntimeError` – If determinization fails.

kaldi.fstext.utils¶

Functions

`acoustic_lattice_scale`	Returns a 2x2 matrix for scaling acoustic cost in lattice weights.
`apply_probability_scale`	Applies a probability scale to the FST.
`cast_log_to_std`	Casts FST in log semiring to tropical semiring.
`cast_std_to_log`	Casts FST in tropical semiring to log semiring.
`clear_symbols`	Sets all input/output labels of the FST to zero.
`compact_lattice_has_alignment`	Checks if compact lattice has state-level alignments.
`convert_compact_lattice_to_lattice`	Converts compact lattice to lattice.
`convert_lattice_to_compact_lattice`	Converts lattice to compact lattice.
`convert_lattice_to_std`	Converts lattice to FST over tropical semiring.
`convert_nbest_to_list`	Converts n-best FST to a list of FSTs.
`convert_std_to_lattice`	Converts FST over tropical semiring to lattice.
`default_lattice_scale`	Returns a default 2x2 matrix for scaling lattice weights.
`equal_align`	Generates sequences from the input FST with exactly “length” symbols.
`following_input_symbols_are_same`	Checks if all arcs exiting any state have the same input symbol.
`get_input_symbols`	Gets input labels of the FST as a sorted unique list.
`get_linear_symbol_sequence`	Extracts linear symbol sequences from the input FST.
`get_output_symbols`	Gets output labels of the FST as a sorted unique list.
`get_symbols`	Gets labels in the symbol table as a sorted unique list.
`graph_lattice_scale`	Returns a 2x2 matrix for scaling graph cost in lattice weights.
`highest_numbered_input_symbol`	Returns the highest numbered input label of the FST (zero if FST is empty).
`highest_numbered_output_symbol`	Returns the highest numbered output label of the FST (zero if FST is empty).
`is_stochastic_fst`	Checks if FST is stochastic.
`is_stochastic_fst_in_log`	Checks if FST is stochastic in log semiring.
`lattice_scale`	Returns a 2x2 matrix for scaling graph and acoustic costs in lattice weights.
`make_following_input_symbols_same`	Ensures that all arcs exiting any state have the same input symbol.
`make_linear_acceptor`	Creates an unweighted linear acceptor from the label sequence.
`make_linear_acceptor_with_alternatives`	Creates an unweighted acceptor with a linear structure.
`make_preceding_input_symbols_same`	Ensures that all arcs entering any state have the same input symbol.
`map_input_symbols`	Maps input labels to labels given in the symbol map.
`minimize_encoded_std_fst`	Minimizes FST in place after encoding labels and weights.
`nbest_as_fsts`	Outputs (up to) n-best paths in the FST as a list of FSTs.
`phi_compose`	Performs composition by handling phi (failure) transitions.
`phi_compose_lattice`	Performs lattice composition by handling phi (failure) transitions.
`preceding_input_symbols_are_same`	Checks if all arcs entering any state have the same input symbol.
`propagate_final`	Propagates final-probs through “phi” transitions.
`remove_alignments_from_compact_lattice`	Removes state-level alignments in a compact lattice.
`remove_some_input_symbols`	Replaces given input labels with zeros.
`remove_useless_arcs`	Removes arcs that are not on best paths for any input symbol sequence.
`remove_weights`	Removes FST weights.
`rho_compose`	Performs composition by handling rho transitions.
`safe_determinize_minimize_wrapper`	Performs safe determinization and minimization.
`safe_determinize_minimize_wrapper_in_log`	Performs safe determinization and minimization in log semiring.
`safe_determinize_wrapper`	Performs safe determinization.
`scale_compact_lattice`	Scales the compact lattice weights.
`scale_lattice`	Scales the lattice weights.

kaldi.fstext.utils.acoustic_lattice_scale(acwt:float) → list<list<float>>¶: Returns a 2x2 matrix for scaling acoustic cost in lattice weights.

kaldi.fstext.utils.apply_probability_scale(scale:float, fst:StdMutableFst)¶

Applies a probability scale to the FST.

This is applicable to FSTs in the log or tropical semiring. It multiplies the arc and final weights by scale. This is ordinary multiplication of the weight values, not the multiplication operation of the semiring, and it is equivalent to taking a power in the semiring.
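The equivalence between scaling and taking a power can be checked numerically. The following is a minimal sketch in plain Python rather than PyKaldi code. It assumes weight values are costs, i.e. negative log probabilities, and the helper names cost_to_prob and scale_cost are hypothetical.

```python
import math


def cost_to_prob(cost):
    """Interpret a log/tropical weight value as a negative log probability."""
    return math.exp(-cost)


def scale_cost(cost, scale):
    """Ordinary multiplication of the stored cost by the scale."""
    return cost * scale


# Assumed example values, for illustration only.
cost = 2.0
scale = 0.5

scaled_cost = scale_cost(cost, scale)

# Multiplying the cost by `scale` raises the underlying probability
# to the power `scale`.
prob_from_scaled_cost = cost_to_prob(scaled_cost)
prob_raised_to_power = cost_to_prob(cost) ** scale
```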

kaldi.fstext.utils.cast_log_to_std(ifst:LogVectorFst) → StdVectorFst¶: Casts FST in log semiring to tropical semiring.

kaldi.fstext.utils.cast_std_to_log(ifst:StdVectorFst) → LogVectorFst¶: Casts FST in tropical semiring to log semiring.

kaldi.fstext.utils.clear_symbols(clear_input:bool, clear_output:bool, fst:StdMutableFst)¶

Sets all input/output labels of the FST to zero.

Does not alter symbol tables.

kaldi.fstext.utils.compact_lattice_has_alignment(fst:CompactLatticeExpandedFst) → bool¶: Checks if compact lattice has state-level alignments.

kaldi.fstext.utils.convert_compact_lattice_to_lattice(ifst, invert=True)[source]¶

Converts compact lattice to lattice.

Parameters:	ifst (CompactLatticeFst) – The input compact lattice. invert (bool) – Invert input and output labels.
Returns:	The output lattice.
Return type:	LatticeVectorFst

kaldi.fstext.utils.convert_lattice_to_compact_lattice(ifst, invert=True)[source]¶

Converts lattice to compact lattice.

Parameters:	ifst (LatticeFst) – The input lattice. invert (bool) – Invert input and output labels.
Returns:	The output compact lattice.
Return type:	CompactLatticeVectorFst

kaldi.fstext.utils.convert_lattice_to_std(ifst)[source]¶

Converts lattice to FST over tropical semiring.

Parameters:	ifst (LatticeFst) – The input lattice.
Returns:	The output FST.
Return type:	StdVectorFst

kaldi.fstext.utils.convert_nbest_to_list(fst:StdFst) → list<StdVectorFst>¶: Converts n-best FST to a list of FSTs.

kaldi.fstext.utils.convert_std_to_lattice(ifst)[source]¶

Converts FST over tropical semiring to lattice.

Parameters:	ifst (StdFst) – The input FST.
Returns:	The output lattice.
Return type:	LatticeVectorFst

kaldi.fstext.utils.default_lattice_scale() → list<list<float>>¶: Returns a default 2x2 matrix for scaling lattice weights.

kaldi.fstext.utils.equal_align(ifst:StdFst, length:int, rand_seed:int, ofst:StdMutableFst, num_retries:int=default) → bool¶

Generates sequences from the input FST with exactly “length” symbols.

This is similar to randgen, but it generates a sequence with exactly “length” input symbols. It returns True on success and False on failure (failure is partly random but should never happen in practice for normal speech models). It generates a random path through the input FST, finds out which subset of the states it visits along the way have self-loops with input symbols on them, and outputs a path with exactly enough self-loops to have the requested number of input symbols. Note that equal_align does not use the probabilities on the FST. It just uses equal probabilities in the first stage of selection (since the output would not be a truly random sample from the FST anyway). The input FST “ifst” must be connected or this may enter an infinite loop.

kaldi.fstext.utils.following_input_symbols_are_same(end_is_epsilon:bool, fst:StdFst) → bool¶

Checks if all arcs exiting any state have the same input symbol.

Returns true if and only if the FST is such that the input symbols on arcs exiting any given state all have the same value. If end_is_epsilon == True, treats final-states as epsilon output arcs [i.e. ensures only epsilons can exit final-states].
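The condition being checked can be sketched in plain Python. This is an illustration of the property described above, not the PyKaldi implementation; the dictionary-based FST representation and the function name are assumptions made for the example.

```python
def all_following_input_symbols_same(arcs, finals, end_is_epsilon):
    """Check that arcs leaving each state share one input label.

    arcs:   dict mapping state -> list of (ilabel, olabel, next_state)
    finals: set of final states
    """
    for state, out_arcs in arcs.items():
        labels = {ilabel for ilabel, _, _ in out_arcs}
        # Optionally treat "being final" as an extra epsilon (label 0) arc.
        if end_is_epsilon and state in finals:
            labels.add(0)
        if len(labels) > 1:
            return False
    return True


# Hypothetical three-state FST: both arcs leaving state 0 have input label 1.
example_arcs = {
    0: [(1, 10, 1), (1, 11, 2)],
    1: [(2, 12, 2)],
    2: [],
}
```

With state 1 marked final and end_is_epsilon set, the check fails for this FST, because state 1 then has both an epsilon exit and an arc with input label 2.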

kaldi.fstext.utils.get_input_symbols(fst:StdFst, include_eps:bool) → list<int>¶: Gets input labels of the FST as a sorted unique list.

kaldi.fstext.utils.get_linear_symbol_sequence(fst)[source]¶

Extracts linear symbol sequences from the input FST.

Parameters:	fst – The input FST.
Returns:	The tuple (isymbols, osymbols, total_weight).

kaldi.fstext.utils.get_output_symbols(fst:StdFst, include_eps:bool) → list<int>¶: Gets output labels of the FST as a sorted unique list.

kaldi.fstext.utils.get_symbols(symtab:SymbolTable, include_eps:bool) → list<int>¶: Gets labels in the symbol table as a sorted unique list.

kaldi.fstext.utils.graph_lattice_scale(lmwt:float) → list<list<float>>¶: Returns a 2x2 matrix for scaling graph cost in lattice weights.

kaldi.fstext.utils.highest_numbered_input_symbol(fst:StdFst) → int¶: Returns the highest numbered input label of the FST (zero if FST is empty).

kaldi.fstext.utils.highest_numbered_output_symbol(fst:StdFst) → int¶: Returns the highest numbered output label of the FST (zero if FST is empty).

kaldi.fstext.utils.is_stochastic_fst(fst:StdFst, delta:float=default, min_sum:TropicalWeight=default, max_sum:TropicalWeight=default) → bool¶

Checks if FST is stochastic.

This function returns true if, in the semiring of the FST, the sum (within the semiring) of all the arcs out of each state in the FST is one, to within delta.

Parameters:	fst – The FST that we are testing.
	delta – The tolerance to within which we test equality to 1.
	min_sum – If provided, it will be set to the minimum sum of weights.
	max_sum – If provided, it will be set to the maximum sum of weights.
Returns:	True if the FST is stochastic, and False otherwise.
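Because the sum is taken within the semiring, the same FST can be stochastic in the log semiring and not in the tropical semiring. The following plain-Python sketch illustrates this. It assumes weights are costs (negative log probabilities), that the final weight of a state counts toward its sum, and that delta defaults to 1/1024; the data layout and function names are hypothetical and this is not the PyKaldi implementation.

```python
import math


def log_plus(a, b):
    """Semiring sum on costs in the log semiring: -log(exp(-a) + exp(-b))."""
    if a == math.inf:
        return b
    if b == math.inf:
        return a
    return min(a, b) - math.log1p(math.exp(-abs(a - b)))


def tropical_plus(a, b):
    """Semiring sum on costs in the tropical semiring: the minimum."""
    return min(a, b)


def is_stochastic(states, plus, delta=1.0 / 1024):
    """states: list of (final_cost, [arc_cost, ...]); semiring One is cost 0.0."""
    for final_cost, arc_costs in states:
        total = final_cost
        for cost in arc_costs:
            total = plus(total, cost)
        # Stochastic means the semiring sum equals One (cost 0.0) within delta.
        if abs(total) > delta:
            return False
    return True


# One non-final state with two outgoing arcs of probability 0.5 each.
half = -math.log(0.5)
example_states = [(math.inf, [half, half])]

stochastic_in_log = is_stochastic(example_states, log_plus)
stochastic_in_tropical = is_stochastic(example_states, tropical_plus)
```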

kaldi.fstext.utils.is_stochastic_fst_in_log(fst:StdFst, delta:float=default, min_sum:TropicalWeight=default, max_sum:TropicalWeight=default) → bool¶

Checks if FST is stochastic in log semiring.

This function returns true if, in the log semiring, the sum of all the arcs out of each state in the FST is one, to within delta.

Parameters:	fst – The FST that we are testing.
	delta – The tolerance to within which we test equality to 1.
	min_sum – If provided, it will be set to the minimum sum of weights.
	max_sum – If provided, it will be set to the maximum sum of weights.
Returns:	True if the FST is stochastic, and False otherwise.

kaldi.fstext.utils.lattice_scale(lmwt:float, acwt:float) → list<list<float>>¶: Returns a 2x2 matrix for scaling graph and acoustic costs in lattice weights.

kaldi.fstext.utils.make_following_input_symbols_same(end_is_epsilon:bool, fst:StdMutableFst)¶

Ensures that all arcs exiting any state have the same input symbol.

Detects states that have differing input symbols going out, and inserts, for each of the following arcs with non-epsilon input symbol, a new dummy state that has an epsilon link from the fst state. The output symbol and weight stay on the link to the dummy state (in order to keep the FST output-deterministic and stochastic, if it already was). If end_is_epsilon == True, treats “being a final-state” like having an epsilon output link.

kaldi.fstext.utils.make_linear_acceptor(labels:list<int>, ofst:StdMutableFst)¶: Creates an unweighted linear acceptor from the label sequence.

kaldi.fstext.utils.make_linear_acceptor_with_alternatives(labels:list<list<int>>, ofst:StdMutableFst)¶

Creates an unweighted acceptor with a linear structure.

Each position in the input list is a list of labels. Each position must have at least one alternative. Epsilon/0 is treated like a normal symbol.
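The resulting structure can be sketched in plain Python as a list of arcs. This is only an illustration of the shape described above; the tuple layout and the function name are assumptions, and no PyKaldi objects are involved.

```python
def linear_acceptor_arcs(labels):
    """Return (arcs, start_state, final_state) for a list of label alternatives.

    Each arc is a tuple (source, ilabel, olabel, destination).
    """
    arcs = []
    for position, alternatives in enumerate(labels):
        if not alternatives:
            raise ValueError("each position must have at least one alternative")
        for label in alternatives:
            # Acceptor: input and output labels are identical.
            # Label 0 (epsilon) is handled like any other label.
            arcs.append((position, label, label, position + 1))
    return arcs, 0, len(labels)


# Three positions: two alternatives, then epsilon, then a single label.
example_arcs, example_start, example_final = linear_acceptor_arcs([[1, 2], [0], [3]])
```

Here the acceptor has four states (0 to 3), with state 0 as the start state and state 3 as the only final state.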

kaldi.fstext.utils.make_preceding_input_symbols_same(start_is_epsilon:bool, fst:StdMutableFst)¶

Ensures that all arcs entering any state have the same input symbol.

Detects states that have differing input symbols going in, and inserts, for each of the preceding arcs with non-epsilon input symbol, a new dummy state that has an epsilon link to the fst state. If start_is_epsilon == True, ensures that start-state can have only epsilon-links into it.

kaldi.fstext.utils.map_input_symbols(symbol_map:list<int>, fst:StdMutableFst)¶: Maps input labels to labels given in the symbol map.

kaldi.fstext.utils.minimize_encoded_std_fst(fst:StdVectorFst, delta:float=default)¶

Minimizes FST in place after encoding labels and weights.

Similar to the minimize operation, except that it does not push the weights or the labels.

Parameters:	fst (StdVectorFst) – Input FST. delta (float) – Quantization delta (default=0.0009765625).

kaldi.fstext.utils.nbest_as_fsts(fst:StdFst, n:int) → list<StdVectorFst>¶: Outputs (up to) n-best paths in the FST as a list of FSTs.

kaldi.fstext.utils.phi_compose(fst1:StdFst, fst2:StdFst, phi_label:int, ofst:StdMutableFst)¶

Performs composition by handling phi (failure) transitions.

This is a version of composition where the right hand FST (fst2) is treated as a backoff language model, with the phi symbol (e.g. #0) treated as a “failure transition”, only taken when there is no match for the requested symbol.
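The failure-transition behaviour can be sketched in plain Python as a label lookup in the backoff model. This shows only the matching rule, not full composition and not the PyKaldi implementation; the dictionary layout, the PHI label value and the function name are hypothetical, and phi arcs are assumed not to form a cycle.

```python
PHI = 999  # hypothetical label id standing in for the phi symbol


def match_with_phi(lm, state, label):
    """Find the arc for `label`, backing off through phi arcs on failure.

    lm: dict mapping state -> dict mapping label -> (cost, next_state)
    Returns (total_cost, next_state), or None if no match exists.
    """
    total = 0.0
    while True:
        arcs = lm[state]
        if label in arcs:
            cost, next_state = arcs[label]
            return total + cost, next_state
        if PHI not in arcs:
            return None
        # No match here: take the phi arc and retry in the backoff state.
        backoff_cost, state = arcs[PHI]
        total += backoff_cost


# State 1 is a history state that backs off to state 0.
example_lm = {
    0: {5: (3.0, 2), 6: (4.0, 3)},
    1: {5: (1.0, 2), PHI: (0.5, 0)},
    2: {},
    3: {},
}
```

Label 5 matches directly in state 1, so the phi arc is not taken; label 6 is found only after backing off, so its cost includes the backoff cost.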

kaldi.fstext.utils.phi_compose_lattice(fst1:LatticeFst, fst2:LatticeFst, phi_label:int, ofst:LatticeMutableFst)¶

Performs lattice composition by handling phi (failure) transitions.

This is a version of composition where the right hand FST (fst2) is treated as a backoff language model, with the phi symbol (e.g. #0) treated as a “failure transition”, only taken when there is no match for the requested symbol.

kaldi.fstext.utils.preceding_input_symbols_are_same(start_is_epsilon:bool, fst:StdFst) → bool¶

Checks if all arcs entering any state have the same input symbol.

Returns true if and only if the FST is such that the input symbols on arcs entering any given state all have the same value. If start_is_epsilon == True, treats start-state as an epsilon input arc [i.e. ensures only epsilons can enter start-state].

kaldi.fstext.utils.propagate_final(phi_label:int, fst:StdMutableFst)¶

Propagates final-probs through “phi” transitions.

Note that here, phi_label may be epsilon. If you have a backoff language model with special symbols (“phi”) on the backoff arcs instead of epsilon, you may use phi_compose() to compose with it, but this will not handle final probabilities correctly. To fix this, first call propagate_final() on the FST that contains the phi symbols (fst2 in phi_compose()). If a state does not have a final-prob but has a phi transition, this function sets the state’s final-prob to (phi-prob * final-prob-of-dest-state), and does this recursively, i.e. it follows phi transitions on the destination state first. It behaves as if there were a super-final state with a special symbol leading to it from each currently final state. Note that this may not behave as desired if there are epsilons in your FST; it might be better to remove those before calling this function.
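The recursion can be sketched in plain Python on costs, where multiplying probabilities corresponds to adding costs. This is a minimal illustration, not the PyKaldi implementation; the data layout and function name are assumptions, each state is assumed to have at most one phi arc, and phi arcs are assumed not to form a cycle.

```python
import math


def propagate_final_costs(num_states, finals, phi_arcs):
    """Return the final cost of every state after propagation.

    finals:   dict mapping state -> final cost (absent means not final)
    phi_arcs: dict mapping state -> (phi_cost, destination_state)
    """
    resolved = {}

    def final_cost(state):
        if state in resolved:
            return resolved[state]
        cost = finals.get(state, math.inf)
        if cost == math.inf and state in phi_arcs:
            phi_cost, dest = phi_arcs[state]
            # Resolve the destination first, then combine (addition on costs).
            cost = phi_cost + final_cost(dest)
        resolved[state] = cost
        return cost

    return [final_cost(state) for state in range(num_states)]


# State 0 is final; states 1 and 2 reach it only through phi arcs.
example_finals = propagate_final_costs(
    num_states=4,
    finals={0: 1.0},
    phi_arcs={2: (0.5, 1), 1: (0.25, 0)},
)
```

State 3 has neither a final cost nor a phi arc, so it stays non-final (infinite cost).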

kaldi.fstext.utils.remove_alignments_from_compact_lattice(fst:CompactLatticeMutableFst)¶: Removes state-level alignments in a compact lattice.

kaldi.fstext.utils.remove_some_input_symbols(to_remove:list<int>, fst:StdMutableFst)¶: Replaces given input labels with zeros.

kaldi.fstext.utils.remove_useless_arcs(fst:StdMutableFst)¶

Removes arcs that are not on best paths for any input symbol sequence.

This removes arcs such that there is no input symbol sequence for which the best path through the FST would contain those arcs [for these purposes, epsilon is not treated as a real symbol]. This is mainly geared towards decoding-graph FSTs which may contain transitions that have less likely words on them that would never be taken. We do not claim that this algorithm removes all such arcs; it just does the best job it can. Only works for tropical (not log) semiring as it uses NaturalLess.

kaldi.fstext.utils.remove_weights(fst:StdMutableFst)¶: Removes FST weights.

kaldi.fstext.utils.rho_compose(fst1:StdFst, fst2:StdFst, rho_label:int, ofst:StdMutableFst)¶

Performs composition by handling rho transitions.

This is a version of composition where the right hand FST (fst2) has special “rho transitions” which are taken whenever no normal transition matches; these transitions will be rewritten with whatever symbol was on the first FST.

kaldi.fstext.utils.safe_determinize_minimize_wrapper(ifst:StdMutableFst, ofst:StdVectorFst, delta:float=default)¶

Performs safe determinization and minimization.

Like safe_determinize_wrapper() but also does encoded minimization, which is safe. This algorithm will destroy ifst.

kaldi.fstext.utils.safe_determinize_minimize_wrapper_in_log(ifst:StdVectorFst, ofst:StdVectorFst, delta:float=default)¶

Performs safe determinization and minimization in log semiring.

Like safe_determinize_minimize_wrapper() but first casts to the log semiring. This algorithm will destroy ifst.

kaldi.fstext.utils.safe_determinize_wrapper(ifst:StdMutableFst, ofst:StdMutableFst, delta:float=default)¶

Performs safe determinization.

This is a form of determinization that will never blow up. Note that ifst is non-const and can be destroyed by this operation. Does not do epsilon removal, so that it is safe to cast to the log semiring, determinize there, and still maintain equivalence in the tropical semiring.

kaldi.fstext.utils.scale_compact_lattice(scale:list<list<float>>, fst:CompactLatticeMutableFst)¶

Scales the compact lattice weights.

Scales the pair of weights in CompactLatticeWeight by viewing the pair (a, b) as a 2-vector and pre-multiplying by the 2x2 matrix in scale. E.g. typically scale would equal [[1, 0], [0, acwt]] if we want to scale the acoustics by acwt.

kaldi.fstext.utils.scale_lattice(scale:list<list<float>>, fst:LatticeMutableFst)¶

Scales the lattice weights.

Scales the pair of weights in LatticeWeight by viewing the pair (a, b) as a 2-vector and pre-multiplying by the 2x2 matrix in scale. E.g. typically scale would equal [[1, 0], [0, acwt]] if we want to scale the acoustics by acwt.
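The matrix-vector product described above can be written out in plain Python. This is a sketch on tuples rather than on PyKaldi weight objects; the function name is hypothetical, and the special case for an infinite weight is an assumption added here to avoid computing 0 * infinity.

```python
import math


def scale_weight_pair(weight, scale):
    """Pre-multiply the 2-vector (a, b) by the 2x2 matrix `scale`."""
    a, b = weight
    # Assumed special case: leave the semiring zero (infinite cost) unchanged.
    if a == math.inf:
        return (math.inf, math.inf)
    return (
        scale[0][0] * a + scale[0][1] * b,
        scale[1][0] * a + scale[1][1] * b,
    )


# Scale only the second component (the acoustics) by an assumed acwt of 0.5.
acwt = 0.5
acoustic_scale = [[1.0, 0.0], [0.0, acwt]]
scaled_pair = scale_weight_pair((3.0, 20.0), acoustic_scale)
```

With the diagonal matrix above, the first component is left unchanged and the second is multiplied by acwt; off-diagonal entries would mix the two components.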

kaldi.fstext.weight¶

PyKaldi has support for the following weight types:

Tropical weight.
Log weight.
Lattice weight.
Compact lattice weight.
KWS time weight.
KWS index weight.

kaldi.fstext.weight.DELTA = 0.0009765625¶

kaldi.fstext.weight.LEFT_SEMIRING = 1¶

kaldi.fstext.weight.RIGHT_SEMIRING = 2¶

kaldi.fstext.weight.SEMIRING = 3¶

kaldi.fstext.weight.COMMUTATIVE = 4¶

kaldi.fstext.weight.IDEMPOTENT = 8¶

kaldi.fstext.weight.PATH = 16¶

kaldi.fstext.weight.NUM_RANDOM_WEIGHTS = 5¶
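The constants from LEFT_SEMIRING through PATH can be read as bit flags, with SEMIRING being the union of the two one-sided flags, and tested against the integer returned by properties(). A minimal sketch in plain Python follows, with the constant values copied from above. The helper name is hypothetical and the example value 31 is illustrative rather than taken from a particular weight type.

```python
LEFT_SEMIRING = 1
RIGHT_SEMIRING = 2
SEMIRING = 3
COMMUTATIVE = 4
IDEMPOTENT = 8
PATH = 16

SINGLE_BIT_FLAGS = [
    ("LEFT_SEMIRING", LEFT_SEMIRING),
    ("RIGHT_SEMIRING", RIGHT_SEMIRING),
    ("COMMUTATIVE", COMMUTATIVE),
    ("IDEMPOTENT", IDEMPOTENT),
    ("PATH", PATH),
]


def describe_properties(props):
    """List the names of the single-bit flags set in `props`."""
    return [name for name, bit in SINGLE_BIT_FLAGS if props & bit]


def is_semiring(props):
    """True if both the left and right semiring bits are set."""
    return props & SEMIRING == SEMIRING


# Illustrative value with all five single-bit flags set.
example_props = 31
example_names = describe_properties(example_props)
```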

Functions

`approx_equal_compact_lattice_weight`	Checks if given compact lattice weights are approximately equal.
`approx_equal_float_weight`	Checks if given float weights are approximately equal.
`approx_equal_lattice_weight`	Checks if given lattice weights are approximately equal.
`compact_lattice_weight_to_cost`	Converts compact lattice weight to cost.
`compare_compact_lattice_weight`	Compares input compact lattice weights.
`compare_lattice_weight`	Compares input lattice weights.
`divide_compact_lattice_weight`	$\oslash$ operation in the compact lattice semiring.
`divide_kws_index_weight`	$\oslash$ operation in the KWS index semiring.
`divide_lattice_weight`	$\oslash$ operation in the lattice semiring.
`divide_log_weight`	$\oslash$ operation in the log semiring.
`divide_tropical_lt_tropical_weight`	$\oslash$ operation in the KWS time semiring.
`divide_tropical_weight`	$\oslash$ operation in the tropical semiring.
`get_log_to_tropical_converter`	Returns a callable for converting log weight to tropical weight.
`get_tropical_to_log_converter`	Returns a callable for converting tropical weight to log weight.
`lattice_weight_to_cost`	Converts lattice weight to cost.
`lattice_weight_to_tropical`	Converts lattice weight to tropical weight.
`plus_compact_lattice_weight`	$\oplus$ operation in the compact lattice semiring.
`plus_kws_index_weight`	$\oplus$ operation in the KWS index semiring.
`plus_lattice_weight`	$\oplus$ operation in the lattice semiring.
`plus_log_weight`	$\oplus$ operation in the log semiring.
`plus_tropical_lt_tropical_weight`	$\oplus$ operation in the KWS time semiring.
`plus_tropical_weight`	$\oplus$ operation in the tropical semiring.
`power_log_weight`	Power operation in the log semiring.
`power_tropical_weight`	Power operation in the tropical semiring.
`scale_compact_lattice_weight`	Scales compact lattice weight.
`scale_lattice_weight`	Scales lattice weight.
`times_compact_lattice_weight`	$\otimes$ operation in the compact lattice semiring.
`times_kws_index_weight`	$\otimes$ operation in the KWS index semiring.
`times_lattice_weight`	$\otimes$ operation in the lattice semiring.
`times_log_weight`	$\otimes$ operation in the log semiring.
`times_tropical_lt_tropical_weight`	$\otimes$ operation in the KWS time semiring.
`times_tropical_weight`	$\otimes$ operation in the tropical semiring.
`tropical_weight_to_cost`	Converts tropical weight to cost.

Classes

`CompactLatticeNaturalLess`	Comparison object in compact lattice semiring.
`CompactLatticeWeight`	Compact lattice weight.
`DivideType`	An enumeration.
`FloatLimits`	Float limits.
`FloatWeight`	Base class for float weight types.
`KwsIndexWeight`	KWS index weight.
`KwsTimeWeight`	KWS time weight.
`LatticeNaturalLess`	Comparison object in lattice semiring.
`LatticeWeight`	Lattice weight.
`LogWeight`	Log weight.
`TropicalWeight`	Tropical weight.

class kaldi.fstext.weight.CompactLatticeNaturalLess¶: Comparison object in compact lattice semiring.

class kaldi.fstext.weight.CompactLatticeWeight¶

Compact lattice weight.

from_other(other:CompactLatticeWeight) → CompactLatticeWeight¶: Create a new compact lattice weight from another.

from_pair(w:LatticeWeight, s:list<int>) → CompactLatticeWeight¶: Create a new compact lattice weight from a weight string pair.

get_int_size_string() → str¶: Returns int size string.

hash() → int¶: Returns the hash for the weight.

member() → bool¶: Checks if weight is a member of the compact lattice semiring.

no_weight() → CompactLatticeWeight¶: No weight in compact lattice semiring.

one() → CompactLatticeWeight¶: One in compact lattice semiring.

properties() → int¶: Returns weight properties.

quantize(delta:float=default) → CompactLatticeWeight¶: Quantizes the weight.

reverse() → CompactLatticeWeight¶: Reverses the weight.

string¶: The string as a list of integers.

type() → str¶: Returns weight type.

weight¶: The weight.

zero() → CompactLatticeWeight¶: Zero in compact lattice semiring.

class kaldi.fstext.weight.DivideType¶

An enumeration.

DIVIDE_ANY = 2¶

DIVIDE_LEFT = 0¶

DIVIDE_RIGHT = 1¶

class kaldi.fstext.weight.FloatLimits¶

Float limits.

neg_infinity() → float¶: Returns float -infinity.

number_bad() → float¶: Returns float bad number.

pos_infinity() → float¶: Returns float +infinity.

class kaldi.fstext.weight.FloatWeight¶

Base class for float weight types.

from_float(f:float) → FloatWeight¶: Create a new float weight from a float.

from_other(weight:FloatWeight) → FloatWeight¶: Create a new float weight from another.

hash() → int¶: Returns the hash for the weight.

value¶: Float value of the weight.

class kaldi.fstext.weight.KwsIndexWeight¶

KWS index weight.

A tropical weight triplet with lexicographic ordering.

from_components(w1:TropicalWeight, w2:KwsTimeWeight) → KwsIndexWeight¶: Creates a new KWS index weight from component weights.

member() → bool¶: Checks if weight is a member of the KWS index semiring.

no_weight() → KwsIndexWeight¶: No weight in KWS index semiring.

one() → KwsIndexWeight¶: One in KWS index semiring.

properties() → int¶: Returns weight properties.

quantize(delta:float=default) → KwsIndexWeight¶: Quantizes the weight.

reverse() → KwsIndexWeight¶: Reverses the weight.

type() → str¶: Returns weight type.

value1¶: The first component weight.

value2¶: The second component weight.

zero() → KwsIndexWeight¶: Zero in KWS index semiring.

class kaldi.fstext.weight.KwsTimeWeight¶

KWS time weight.

A tropical weight pair with lexicographic ordering.

from_components(w1:TropicalWeight, w2:TropicalWeight) → KwsTimeWeight¶: Creates a new KWS time weight from component weights.

member() → bool¶: Checks if weight is a member of the KWS time semiring.

no_weight() → KwsTimeWeight¶: No weight in the KWS time semiring.

one() → KwsTimeWeight¶: One in the KWS time semiring.

properties() → int¶: Returns weight properties.

quantize(delta:float=default) → KwsTimeWeight¶: Quantizes the weight.

reverse() → KwsTimeWeight¶: Reverses the weight.

type() → str¶: Returns weight type.

value1¶: The first component weight.

value2¶: The second component weight.

zero() → KwsTimeWeight¶: Zero in the KWS time semiring.

class kaldi.fstext.weight.LatticeNaturalLess¶: Comparison object in lattice semiring.

class kaldi.fstext.weight.LatticeWeight¶

Lattice weight.

from_other(other:LatticeWeight) → LatticeWeight¶: Create a new lattice weight from another.

from_pair(a:float, b:float) → LatticeWeight¶: Create a new lattice weight from a pair of floats.

hash() → int¶: Returns the hash for the weight.

member() → bool¶: Checks if weight is a member of the lattice semiring.

no_weight() → LatticeWeight¶: No weight in lattice semiring.

one() → LatticeWeight¶: One in lattice semiring, i.e. (0.0, 0.0).

properties() → int¶: Returns weight properties.

quantize(delta:float=default) → LatticeWeight¶: Quantizes the weight.

reverse() → LatticeWeight¶: Reverses the weight.

type() → str¶: Returns weight type.

value1¶: Float value of the first weight.

value2¶: Float value of the second weight.

zero() → LatticeWeight¶: Zero in lattice semiring, i.e. (+infinity, +infinity).
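The values of one() and zero() given above fit a semiring in which weights are pairs of costs. The following plain-Python sketch shows how such a semiring can behave. It assumes the ordering used for Kaldi lattice weights, where the pair with the smaller total cost is preferred and ties are broken on the first component; the function names are hypothetical and tuples stand in for LatticeWeight objects.

```python
import math

ONE = (0.0, 0.0)
ZERO = (math.inf, math.inf)


def compare(w1, w2):
    """Return 1 if w1 is preferred (lower cost), -1 if w2 is, 0 if equal."""
    total1 = w1[0] + w1[1]
    total2 = w2[0] + w2[1]
    if total1 < total2:
        return 1
    if total1 > total2:
        return -1
    # Equal totals: assumed tie-break on the first component.
    if w1[0] < w2[0]:
        return 1
    if w1[0] > w2[0]:
        return -1
    return 0


def plus(w1, w2):
    """Semiring sum: keep the preferred of the two weights."""
    return w1 if compare(w1, w2) >= 0 else w2


def times(w1, w2):
    """Semiring product: add the costs component-wise."""
    return (w1[0] + w2[0], w1[1] + w2[1])


example_sum = plus((1.0, 5.0), (2.0, 3.0))
example_product = times((1.0, 5.0), (2.0, 3.0))
```

Under these assumptions ZERO is the identity for plus and ONE is the identity for times, which matches the values documented for zero() and one().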

class kaldi.fstext.weight.LogWeight¶

Log weight.

from_float(f:float) → LogWeight¶: Create a new log weight from a float.

from_other(weight:LogWeight) → LogWeight¶: Create a new log weight from another.

hash() → int¶: Returns the hash for the weight.

member() → bool¶: Checks if weight is a member of the log semiring.

no_weight() → LogWeight¶: No weight in log semiring.

one() → LogWeight¶: One in log semiring, i.e. 0.0.

properties() → int¶: Returns weight properties.

quantize(delta:float=default) → LogWeight¶: Quantizes the weight.

reverse() → LogWeight¶: Reverses the weight.

type() → str¶: Returns weight type.

value¶: Float value of the weight.

zero() → LogWeight¶: Zero in log semiring, i.e. float +infinity.

class kaldi.fstext.weight.TropicalWeight¶

Tropical weight.

from_float(f:float) → TropicalWeight¶: Create a new tropical weight from a float.

from_other(weight:TropicalWeight) → TropicalWeight¶: Create a new tropical weight from another.

hash() → int¶: Returns the hash for the weight.

member() → bool¶: Checks if weight is a member of the tropical semiring.

no_weight() → TropicalWeight¶: No weight in tropical semiring.

one() → TropicalWeight¶: One in tropical semiring, i.e. 0.0.

properties() → int¶: Returns weight properties.

quantize(delta:float=default) → TropicalWeight¶: Quantizes the weight.

reverse() → TropicalWeight¶: Reverses the weight.

type() → str¶: Returns weight type.

value¶: Float value of the weight.

zero() → TropicalWeight¶: Zero in tropical semiring, i.e. float +infinity.

kaldi.fstext.weight.approx_equal_compact_lattice_weight(w1:CompactLatticeWeight, w2:CompactLatticeWeight, delta:float=default) → bool¶: Checks if given compact lattice weights are approximately equal.

kaldi.fstext.weight.approx_equal_float_weight(w1:FloatWeight, w2:FloatWeight, delta:float=default) → bool¶: Checks if given float weights are approximately equal.

kaldi.fstext.weight.approx_equal_lattice_weight(w1:LatticeWeight, w2:LatticeWeight, delta:float=default) → bool¶: Checks if given lattice weights are approximately equal.

kaldi.fstext.weight.compact_lattice_weight_to_cost(w:CompactLatticeWeight) → float¶: Converts compact lattice weight to cost.

kaldi.fstext.weight.compare_compact_lattice_weight(w1:CompactLatticeWeight, w2:CompactLatticeWeight) → int¶: Compares input compact lattice weights.

kaldi.fstext.weight.compare_lattice_weight(w1:LatticeWeight, w2:LatticeWeight) → int¶: Compares input lattice weights.

kaldi.fstext.weight.divide_compact_lattice_weight(w1:CompactLatticeWeight, w2:CompactLatticeWeight, typ:DivideType=default) → CompactLatticeWeight¶: $\oslash$ operation in the compact lattice semiring.

kaldi.fstext.weight.divide_kws_index_weight(w1:KwsIndexWeight, w2:KwsIndexWeight, typ:DivideType=default) → KwsIndexWeight¶: $\oslash$ operation in the KWS index semiring.

kaldi.fstext.weight.divide_lattice_weight(w1:LatticeWeight, w2:LatticeWeight, typ:DivideType=default) → LatticeWeight¶: $\oslash$ operation in the lattice semiring.

kaldi.fstext.weight.divide_log_weight(w1:LogWeight, w2:LogWeight, typ:DivideType=default) → LogWeight¶: $\oslash$ operation in the log semiring.

kaldi.fstext.weight.divide_tropical_lt_tropical_weight(w1:KwsTimeWeight, w2:KwsTimeWeight, typ:DivideType=default) → KwsTimeWeight¶: $\oslash$ operation in the KWS time semiring.

kaldi.fstext.weight.divide_tropical_weight(w1:TropicalWeight, w2:TropicalWeight, typ:DivideType=default) → TropicalWeight¶: $\oslash$ operation in the tropical semiring.

kaldi.fstext.weight.get_log_to_tropical_converter() -> (w:LogWeight) → TropicalWeight¶: Returns a callable for converting log weight to tropical weight.

kaldi.fstext.weight.get_tropical_to_log_converter() -> (w:TropicalWeight) → LogWeight¶: Returns a callable for converting tropical weight to log weight.

kaldi.fstext.weight.lattice_weight_to_cost(w:LatticeWeight) → float¶: Converts lattice weight to cost.

kaldi.fstext.weight.lattice_weight_to_tropical(w_in:LatticeWeight) → TropicalWeight¶: Converts lattice weight to tropical weight.

kaldi.fstext.weight.plus_compact_lattice_weight(w1:CompactLatticeWeight, w2:CompactLatticeWeight) → CompactLatticeWeight¶: $\oplus$ operation in the compact lattice semiring.

kaldi.fstext.weight.plus_kws_index_weight(w1:KwsIndexWeight, w2:KwsIndexWeight) → KwsIndexWeight¶: $\oplus$ operation in the KWS index semiring.

kaldi.fstext.weight.plus_lattice_weight(w1:LatticeWeight, w2:LatticeWeight) → LatticeWeight¶: $\oplus$ operation in the lattice semiring.

kaldi.fstext.weight.plus_log_weight(w1:LogWeight, w2:LogWeight) → LogWeight¶: $\oplus$ operation in the log semiring.

kaldi.fstext.weight.plus_tropical_lt_tropical_weight(w1:KwsTimeWeight, w2:KwsTimeWeight) → KwsTimeWeight¶: $\oplus$ operation in the KWS time semiring.

kaldi.fstext.weight.plus_tropical_weight(w1:TropicalWeight, w2:TropicalWeight) → TropicalWeight¶: $\oplus$ operation in the tropical semiring.

kaldi.fstext.weight.power_log_weight(weight:LogWeight, scalar:float) → LogWeight¶: Power operation in the log semiring.

kaldi.fstext.weight.power_tropical_weight(weight:TropicalWeight, scalar:float) → TropicalWeight¶: Power operation in the tropical semiring.

kaldi.fstext.weight.scale_compact_lattice_weight(w:CompactLatticeWeight, scale:list<list<float>>) → CompactLatticeWeight¶: Scales compact lattice weight.

kaldi.fstext.weight.scale_lattice_weight(w:LatticeWeight, scale:list<list<float>>) → LatticeWeight¶: Scales lattice weight.

kaldi.fstext.weight.times_compact_lattice_weight(w1:CompactLatticeWeight, w2:CompactLatticeWeight) → CompactLatticeWeight¶: $\otimes$ operation in the compact lattice semiring.

kaldi.fstext.weight.times_kws_index_weight(w1:KwsIndexWeight, w2:KwsIndexWeight) → KwsIndexWeight¶: $\otimes$ operation in the KWS index semiring.

kaldi.fstext.weight.times_lattice_weight(w1:LatticeWeight, w2:LatticeWeight) → LatticeWeight¶: $\otimes$ operation in the lattice semiring.

kaldi.fstext.weight.times_log_weight(w1:LogWeight, w2:LogWeight) → LogWeight¶: $\otimes$ operation in the log semiring.

kaldi.fstext.weight.times_tropical_lt_tropical_weight(w1:KwsTimeWeight, w2:KwsTimeWeight) → KwsTimeWeight¶: $\otimes$ operation in the KWS time semiring.

kaldi.fstext.weight.times_tropical_weight(w1:TropicalWeight, w2:TropicalWeight) → TropicalWeight¶: $\otimes$ operation in the tropical semiring.

kaldi.fstext.weight.tropical_weight_to_cost(w:TropicalWeight) → float¶: Converts tropical weight to cost.