kaldi.fstext
PyKaldi has built-in support for common FST types (including Kaldi lattices and KWS index) and operations. The API for the user-facing PyKaldi FST types and operations is mostly defined in Python and largely mimics the API exposed by OpenFst’s official Python wrapper, pywrapfst. This includes integrations with Graphviz and IPython for interactive visualization of FSTs.
There are two major differences between the PyKaldi FST package and pywrapfst:
- PyKaldi bindings are generated with CLIF while pywrapfst bindings are generated with Cython. This allows PyKaldi FST types to work seamlessly with the rest of the PyKaldi package.
- In contrast to pywrapfst, PyKaldi does not wrap the OpenFst scripting API, which uses virtual dispatch, function registration, and dynamic loading of shared objects to provide a common interface shared by FSTs of different semirings. While this choice requires wrapping each semiring specialization separately in PyKaldi, it lets users pass FST objects directly to the many PyKaldi functions that accept FST arguments.
Operations that construct new FSTs are implemented as traditional functions, as are two-argument boolean functions like equal and equivalent. The convert operation is not implemented as a separate function, since FSTs already support construction from other FST types; for example, vector FSTs can be constructed from constant FSTs and vice versa. Destructive operations (those that mutate an FST in place) are instance methods, as is write.
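The split between constructive functions and destructive methods can be illustrated with a small plain-Python sketch. ToyFst and inverted are hypothetical names invented for this illustration; they are not part of PyKaldi, and the sketch only mirrors the calling convention described above (functions return a new object, methods mutate and return self).

```python
from dataclasses import dataclass, field


@dataclass
class ToyFst:
    """Hypothetical stand-in for an FST: a list of (ilabel, olabel) arcs."""
    arcs: list = field(default_factory=list)

    def invert(self):
        """Destructive: swaps labels in place and returns self."""
        self.arcs = [(o, i) for i, o in self.arcs]
        return self


def inverted(fst):
    """Constructive: leaves the argument untouched and returns a new object."""
    return ToyFst([(o, i) for i, o in fst.arcs])


a = ToyFst([(1, 2), (3, 4)])
b = inverted(a)   # a is unchanged, b is a new object
c = a.invert()    # a is mutated, c is the same object as a
```

Because the destructive method returns self, calls of this kind can be chained.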
The following example, based on Mohri et al. 2002, shows the construction of an ASR graph given a pronunciation lexicon L, grammar G, a transducer from context-dependent phones to context-independent phones C, and an HMM set H:
import kaldi.fstext as fst
L = fst.StdVectorFst.read("L.fst")
G = fst.StdVectorFst.read("G.fst")
C = fst.StdVectorFst.read("C.fst")
H = fst.StdVectorFst.read("H.fst")
LG = fst.determinize(fst.compose(L, G))
CLG = fst.determinize(fst.compose(C, LG))
HCLG = fst.determinize(fst.compose(H, CLG))
HCLG.minimize() # NB: works in-place.
kaldi.fstext.NO_STATE_ID = -1
kaldi.fstext.NO_LABEL = -1
kaldi.fstext.ENCODE_FLAGS = 3
kaldi.fstext.ENCODE_LABELS = 1
kaldi.fstext.ENCODE_WEIGHTS = 2
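The three ENCODE_* values look like bit flags, with ENCODE_FLAGS being the union of the other two. The sketch below copies the documented values into local variables rather than importing them, and describe is a hypothetical helper written only for this illustration.

```python
# Values copied from the constants documented above.
ENCODE_LABELS = 1
ENCODE_WEIGHTS = 2
ENCODE_FLAGS = 3


def describe(flags):
    """Lists which parts of an arc a given flag value selects."""
    parts = []
    if flags & ENCODE_LABELS:
        parts.append("labels")
    if flags & ENCODE_WEIGHTS:
        parts.append("weights")
    return parts


# Combining the two single-bit flags gives the full mask.
both = ENCODE_LABELS | ENCODE_WEIGHTS
labels_only = describe(ENCODE_LABELS)
everything = describe(both)
```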
Functions
arcmap | Constructively applies a transform to all arcs and final states.
compat_symbols | Returns true if the two symbol tables have equal checksums.
compose | Constructively composes two FSTs.
deserialize_symbol_table | Deserializes a symbol table.
determinize | Constructively determinizes a weighted FST.
difference | Constructively computes the difference of two FSTs.
disambiguate | Constructively disambiguates a weighted transducer.
epsnormalize | Constructively epsilon-normalizes an FST.
equal | Are two FSTs equal?
equivalent | Are the two acceptors equivalent?
indices_to_symbols | Converts indices to symbols by looking them up in the symbol table.
intersect | Constructively intersects two FSTs.
isomorphic | Are the two acceptors isomorphic?
prune | Constructively removes paths with weights below a certain threshold.
push | Constructively pushes weights/labels towards initial or final states.
randequivalent | Are two acceptors stochastically equivalent?
randgen | Randomly generate successful paths in an FST.
read_fst_kaldi | Reads FST using Kaldi I/O mechanisms.
relabel_symbol_table | Relabels a symbol table as specified by the input list of pairs.
replace | Recursively replaces arcs in the root FST with other FST(s).
reverse | Constructively reverses an FST’s transduction.
rmepsilon | Constructively removes epsilon transitions from an FST.
serialize_symbol_table | Serializes a symbol table.
shortestdistance | Compute the shortest distance from the initial or final state.
shortestpath | Construct an FST containing the shortest path(s) in the input FST.
statemap | Constructively applies a transform to all states.
symbols_to_indices | Converts symbols to indices by looking them up in the symbol table.
synchronize | Constructively synchronizes an FST.
write_fst_kaldi | Writes FST using Kaldi I/O mechanisms.
Classes
CompactLatticeArc | FST arc with compact lattice weight.
CompactLatticeConstFst | Constant FST over the compact lattice semiring.
CompactLatticeConstFstArcIterator | Arc iterator for a constant FST over the compact lattice semiring.
CompactLatticeConstFstStateIterator | State iterator for a constant FST over the compact lattice semiring.
CompactLatticeEncodeMapper | Arc encoder for an FST over the compact lattice semiring.
CompactLatticeEncodeTable | Encode table for CompactLatticeArc.
CompactLatticeFstCompiler | Compiler for FSTs over the compact lattice semiring.
CompactLatticeVectorFst | Vector FST over the compact lattice semiring.
CompactLatticeVectorFstArcIterator | Arc iterator for a vector FST over the compact lattice semiring.
CompactLatticeVectorFstMutableArcIterator | Mutable arc iterator for a vector FST over the compact lattice semiring.
CompactLatticeVectorFstStateIterator | State iterator for a vector FST over the compact lattice semiring.
CompactLatticeWeight | Compact lattice weight factory.
FstHeader | FST file header.
FstReadOptions | FST reading options.
FstWriteOptions | FST writing options.
KwsIndexArc | FST arc with KWS index weight.
KwsIndexConstFst | Constant FST over the KWS index semiring.
KwsIndexConstFstArcIterator | Arc iterator for a constant FST over the KWS index semiring.
KwsIndexConstFstStateIterator | State iterator for a constant FST over the KWS index semiring.
KwsIndexEncodeMapper | Arc encoder for an FST over the KWS index semiring.
KwsIndexEncodeTable | Encode table for KwsIndexArc.
KwsIndexFstCompiler | Compiler for FSTs over the KWS index semiring.
KwsIndexVectorFst | Vector FST over the KWS index semiring.
KwsIndexVectorFstArcIterator | Arc iterator for a vector FST over the KWS index semiring.
KwsIndexVectorFstMutableArcIterator | Mutable arc iterator for a vector FST over the KWS index semiring.
KwsIndexVectorFstStateIterator | State iterator for a vector FST over the KWS index semiring.
KwsIndexWeight | KWS index weight factory.
KwsTimeWeight | KWS time weight factory.
LatticeArc | FST arc with lattice weight.
LatticeConstFst | Constant FST over the lattice semiring.
LatticeConstFstArcIterator | Arc iterator for a constant FST over the lattice semiring.
LatticeConstFstStateIterator | State iterator for a constant FST over the lattice semiring.
LatticeEncodeMapper | Arc encoder for an FST over the lattice semiring.
LatticeEncodeTable | Encode table for LatticeArc.
LatticeFstCompiler | Compiler for FSTs over the lattice semiring.
LatticeVectorFst | Vector FST over the lattice semiring.
LatticeVectorFstArcIterator | Arc iterator for a vector FST over the lattice semiring.
LatticeVectorFstMutableArcIterator | Mutable arc iterator for a vector FST over the lattice semiring.
LatticeVectorFstStateIterator | State iterator for a vector FST over the lattice semiring.
LatticeWeight | Lattice weight factory.
LogArc | FST arc with log weight.
LogConstFst | Constant FST over the log semiring.
LogConstFstArcIterator | Arc iterator for a constant FST over the log semiring.
LogConstFstStateIterator | State iterator for a constant FST over the log semiring.
LogEncodeMapper | Arc encoder for an FST over the log semiring.
LogEncodeTable | Encode table for LogArc.
LogFstCompiler | Compiler for FSTs over the log semiring.
LogVectorFst | Vector FST over the log semiring.
LogVectorFstArcIterator | Arc iterator for a vector FST over the log semiring.
LogVectorFstMutableArcIterator | Mutable arc iterator for a vector FST over the log semiring.
LogVectorFstStateIterator | State iterator for a vector FST over the log semiring.
LogWeight | Log weight factory.
StdArc | FST arc with tropical weight.
StdConstFst | Constant FST over the tropical semiring.
StdConstFstArcIterator | Arc iterator for a constant FST over the tropical semiring.
StdConstFstStateIterator | State iterator for a constant FST over the tropical semiring.
StdEncodeMapper | Arc encoder for an FST over the tropical semiring.
StdEncodeTable | Encode table for StdArc.
StdFstCompiler | Compiler for FSTs over the tropical semiring.
StdVectorFst | Vector FST over the tropical semiring.
StdVectorFstArcIterator | Arc iterator for a vector FST over the tropical semiring.
StdVectorFstMutableArcIterator | Mutable arc iterator for a vector FST over the tropical semiring.
StdVectorFstStateIterator | State iterator for a vector FST over the tropical semiring.
SymbolTable | Symbol table.
SymbolTableIterator | Symbol table iterator.
SymbolTableTextOptions | Options for reading symbol table from text file.
TropicalWeight | Tropical weight factory.
-
class
kaldi.fstext.
CompactLatticeArc
[source]¶ FST arc with compact lattice weight.
- CompactLatticeArc():
- Creates an uninitialized CompactLatticeArc instance.
- CompactLatticeArc(ilabel, olabel, weight, nextstate):
- Creates a new CompactLatticeArc instance initialized with the given arguments.
Parameters: - ilabel (int) – The input label.
- olabel (int) – The output label.
- weight (CompactLatticeWeight) – The arc weight.
- nextstate (int) – The destination state for the arc.
-
from_attrs
(ilabel:int, olabel:int, weight:CompactLatticeWeight, nextstate:int) → CompactLatticeArc¶ Creates a new arc with the given attributes.
Parameters: - ilabel (int) – The input label.
- olabel (int) – The output label.
- weight (CompactLatticeWeight) – The arc weight.
- nextstate (int) – The destination state for the arc.
-
ilabel
¶ int – The input label.
-
nextstate
¶ int – The destination state for the arc.
-
olabel
¶ int – The output label.
-
type
() → str¶ Returns arc type.
-
weight
¶ CompactLatticeWeight – The arc weight.
-
class
kaldi.fstext.
CompactLatticeConstFst
(fst=None)[source]¶ Constant FST over the compact lattice semiring.
Parameters: fst (CompactLatticeFst) – The input FST over the compact lattice semiring. If provided, its contents are used for initializing the new FST. Defaults to None.
-
arcs
(state)¶ Returns an iterator over arcs leaving the specified state.
Parameters: state – The source state index. Returns: An ArcIterator. See also:
mutable_arcs
,states
.
-
copy
()¶ Makes a copy of the FST.
Returns: A copy of the FST.
-
draw
(filename, isymbols=None, osymbols=None, ssymbols=None, acceptor=False, title='', width=8.5, height=11, portrait=False, vertical=False, ranksep=0.4, nodesep=0.25, fontsize=14, precision=5, float_format='g', show_weight_one=False)¶ Writes out the FST in Graphviz text format.
This method writes out the FST in the dot graph description language. The graph can be rendered using the dot binary provided by Graphviz.
Parameters: - filename (str) – The string location of the output dot/Graphviz file.
- isymbols – An optional symbol table used to label input symbols.
- osymbols – An optional symbol table used to label output symbols.
- ssymbols – An optional symbol table used to label states.
- acceptor (bool) – Should the figure be rendered in acceptor format if possible? Defaults to False.
- title (str) – An optional string indicating the figure title. Defaults to the empty string.
- width (float) – The figure width, in inches. Defaults to 8.5.
- height (float) – The figure height, in inches. Defaults to 11.
- portrait (bool) – Should the figure be rendered in portrait rather than landscape? Defaults to False.
- vertical (bool) – Should the figure be rendered bottom-to-top rather than left-to-right? Defaults to False.
- ranksep (float) – The minimum separation between ranks, in inches. Defaults to 0.4.
- nodesep (float) – The minimum separation between nodes, in inches. Defaults to 0.25.
- fontsize (int) – Font size, in points. Defaults to 14.
- precision (int) – Numeric precision for floats, in number of chars. Defaults to 5.
- float_format ('e', 'f' or 'g') – One of ‘e’, ‘f’ or ‘g’. Defaults to ‘g’.
- show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults to False.
For more information about the rendering options, see man dot.
See also: text.
-
final
(state)¶ Returns the final weight of a state.
Parameters: state – The integer index of a state. Returns: The final Weight of that state. Raises: IndexError
– State index out of range.
-
from_bytes
(s)¶ Returns the FST represented by the bytes object.
Parameters: s (bytes) – The bytes object representing the FST. Returns: An FST object.
-
input_symbols
()¶ Returns the input symbol table.
Returns: The input symbol table. See Also:
output_symbols()
.
-
num_arcs
(state=None)¶ Returns the number of arcs, counting them if necessary.
If state is None, returns the number of arcs in the FST. Otherwise, returns the number of arcs leaving that state.
Parameters: state – The integer index of a state. Defaults to None.
Returns: The number of arcs leaving a state or the number of arcs in the FST.
Note: This method counts the number of arcs in the FST by iterating over the states and summing up the number of arcs leaving each state.
Raises: IndexError – State index out of range.
See also: num_states.
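The counting behavior described in the note can be sketched in plain Python. The function below is a hypothetical illustration operating on a list of per-state arc lists; it is not PyKaldi's implementation, only a model of the documented semantics (total when state is None, per-state count otherwise, IndexError for a bad index).

```python
def num_arcs(arcs_by_state, state=None):
    """Counts arcs in a toy FST stored as one list of arcs per state."""
    if state is None:
        # Whole-FST count: sum the arcs leaving each state.
        return sum(len(arcs) for arcs in arcs_by_state)
    if not 0 <= state < len(arcs_by_state):
        raise IndexError("State index out of range.")
    return len(arcs_by_state[state])


# Three states: two arcs leave state 0, none leave state 1, one leaves state 2.
toy = [[("a", 1), ("b", 2)], [], [("c", 0)]]
total = num_arcs(toy)
leaving_first = num_arcs(toy, 0)
```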
-
num_input_epsilons
(state)¶ Returns the number of arcs with epsilon input labels leaving a state.
Parameters: state – The integer index of a state. Returns: The number of epsilon-input-labeled arcs leaving that state. Raises: IndexError
– State index out of range.See also:
num_output_epsilons
.
-
num_output_epsilons
(state)¶ Returns the number of arcs with epsilon output labels leaving a state.
Parameters: state – The integer index of a state. Returns: The number of epsilon-output-labeled arcs leaving that state. Raises: IndexError
– State index out of range.See also:
num_input_epsilons
.
-
num_states
()¶ Returns the number of states, counting them if necessary.
Returns: The number of states. See also:
num_arcs
.
-
output_symbols
()¶ Returns the output symbol table.
Returns: The output symbol table. See Also:
input_symbols()
.
-
properties
(mask, test)¶ Provides property bits.
This method provides user access to the properties attributes for the FST. The resulting value is a long integer, but when it is cast to a boolean, it represents whether or not the FST has the mask property.
Parameters: - mask – The property mask to be compared to the FST’s properties.
- test – Should any unknown values be computed before comparing against the mask?
Returns: A 64-bit bitmask representing the requested properties.
-
read
(filename)¶ Reads an FST from a file.
Parameters: filename (str) – The location of the input file. Returns: An FST object. Raises: RuntimeError
– Read failed.
-
read_from_stream
(strm, ropts)¶ Reads an FST from an input stream.
Parameters: - strm (istream) – The input stream to read from.
- ropts (FstReadOptions) – FST reading options.
Returns: An FST object.
Raises: RuntimeError
– Read failed.
-
start
()¶ Returns the start state.
Returns: The start state if start state is set, -1 otherwise.
-
states
()¶ Returns an iterator over all states in the FST.
Returns: A StateIterator object for the FST. See also:
arcs
,mutable_arcs
.
-
text
(isymbols=None, osymbols=None, ssymbols=None, acceptor=False, show_weight_one=False, missing_symbol='')¶ Produces a human-readable string representation of the FST.
This method generates a human-readable string representation of the FST. The caller may optionally specify SymbolTables used to label input labels, output labels, or state labels, respectively.
Parameters: - isymbols – An optional symbol table used to label input symbols.
- osymbols – An optional symbol table used to label output symbols.
- ssymbols – An optional symbol table used to label states.
- acceptor (bool) – Should the FST be rendered in acceptor format if possible? Defaults False.
- show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults False.
- missing_symbol – The string to be printed when symbol table lookup fails.
Returns: A formatted string representing the FST.
-
to_bytes
()¶ Returns a bytes object representing the FST.
Returns: A bytes object.
-
type
()¶ Returns the FST type.
Returns: The FST type.
-
verify
()¶ Verifies that an FST’s contents are sane.
Returns: True if the contents are sane, False otherwise.
-
write
(filename)¶ Serializes FST to a file.
This method writes the FST to a file in a binary format.
Parameters: filename (str) – The location of the output file. Raises: IOError
– Write failed.
-
write_to_stream
(strm, wopts)¶ Serializes FST to an output stream.
Parameters: - strm (ostream) – The output stream to write to.
- wopts (FstWriteOptions) – FST writing options.
Returns: True if write was successful, False otherwise.
Raises: RuntimeError
– Write failed.
-
-
class
kaldi.fstext.
CompactLatticeConstFstArcIterator
(fst, state)[source]¶ Arc iterator for a constant FST over the compact lattice semiring.
This class is used for iterating over the arcs leaving some state. In addition to the full C++ API, it also supports the iterator protocol. Most users should call the arcs method of an FST object rather than construct this iterator directly, and so take advantage of the Pythonic API.
Creates a new arc iterator.
Parameters: - fst – The fst.
- state – The state index.
Raises: IndexError – State index out of range.
-
done
()¶ Indicates whether the iterator is exhausted or not.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: True if the iterator is exhausted, False otherwise.
-
flags
()¶ Returns the current iterator behavioral flags.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: The current iterator behavioral flags as an integer.
-
next
()¶ Advances the iterator.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
position
()¶ Returns the position of the iterator.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: The iterator’s position, expressed as an integer.
-
reset
()¶ Resets the iterator to the initial position.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
seek
(a)¶ Advance the iterator to a new position.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Parameters: a (int) – The position to seek to.
-
set_flags
(flags, mask)¶ Sets the current iterator behavioral flags.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Parameters:
-
value
()¶ Returns the current arc.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
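The relationship between the C++-style methods documented for this iterator and the Python iterator protocol can be sketched as follows. ToyArcIterator is a hypothetical class written for this illustration; it reuses the method names done, next, value and reset from the documentation but makes no claim about how PyKaldi implements them.

```python
class ToyArcIterator:
    """Hypothetical iterator exposing both a C++-style and a Python-style API."""

    def __init__(self, arcs):
        self._arcs = list(arcs)
        self._pos = 0

    # C++-style API, named after the documented methods.
    def done(self):
        return self._pos >= len(self._arcs)

    def next(self):
        self._pos += 1

    def value(self):
        return self._arcs[self._pos]

    def reset(self):
        self._pos = 0

    # Python iterator protocol, built on top of the C++-style calls.
    def __iter__(self):
        return self

    def __next__(self):
        if self.done():
            raise StopIteration
        arc = self.value()
        self.next()
        return arc


arcs = [("a", "x"), ("b", "y")]

# C++-style traversal.
it = ToyArcIterator(arcs)
cpp_style = []
while not it.done():
    cpp_style.append(it.value())
    it.next()

# Pythonic traversal of the same data.
pythonic = list(ToyArcIterator(arcs))
```

Both loops visit the same arcs in the same order; the Pythonic form is simply shorter.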
-
class
kaldi.fstext.
CompactLatticeConstFstStateIterator
(fst)[source]¶ State iterator for a constant FST over the compact lattice semiring.
This class is used for iterating over the states. In addition to the full C++ API, it also supports the iterator protocol. Most users should call the states method of an FST object rather than construct this iterator directly, and so take advantage of the Pythonic API.
Creates a new state iterator.
Parameters: fst – The fst.
-
done
()¶ Indicates whether the iterator is exhausted or not.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: True if the iterator is exhausted, False otherwise.
-
next
()¶ Advances the iterator.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
reset
()¶ Resets the iterator to the initial position.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
value
()¶ Returns the current state index.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
-
class
kaldi.fstext.
CompactLatticeEncodeMapper
(encode_labels=False, encode_weights=False, encode=True)[source]¶ Arc encoder for an FST over the compact lattice semiring.
This class provides an object which can be used to encode or decode FST arcs. This is most useful for converting an FST to an unweighted acceptor, on which some FST operations are more efficient, and then decoding the FST afterwards.
To use an instance of this class to encode or decode a mutable FST, pass it as the first argument to the FST instance methods encode and decode. Alternatively, an instance of this class can be used as a callable to encode/decode arcs.
Parameters:
-
flags
() → int¶ Returns encoder flags.
-
from_other
(mapper:CompactLatticeEncodeMapper) → CompactLatticeEncodeMapper¶ Creates a new encoder with the contents of another.
-
from_other_with_type
(mapper:CompactLatticeEncodeMapper, type:EncodeType) → CompactLatticeEncodeMapper¶ Creates a new encoder with the contents of another and given type.
-
input_symbols
() → SymbolTable¶ Returns input symbol table.
-
output_symbols
() → SymbolTable¶ Returns output symbol table.
-
properties
(inprops:int) → int¶ Provides property bits.
This method provides user access to the properties attributes for the encoder. The resulting value is a long integer, but when it is cast to a boolean, it represents whether or not the FST has the mask property.
Parameters: mask – The property mask to be compared to the encoder’s properties. Returns: A 64-bit bitmask representing the requested properties.
-
read
(filename:str, type:EncodeType=default) → CompactLatticeEncodeMapper¶ Reads encoder from file.
-
set_input_symbols
(syms:SymbolTable)¶ Sets the input symbol table.
Parameters: syms – A SymbolTable. See also:
set_output_symbols
.
-
set_output_symbols
(syms:SymbolTable)¶ Sets the output symbol table.
Parameters: syms – A SymbolTable. See also:
set_input_symbols
.
-
type
() → EncodeType¶ Returns encoder type.
-
write
(filename:str) → bool¶ Writes encoder to file.
Returns: True if write was successful, False otherwise.
-
-
class
kaldi.fstext.
CompactLatticeEncodeTable
¶ Encode table for CompactLatticeArc.
- CompactLatticeEncodeTable(flags):
- Creates a new encode table with the given flags.
-
class
Tuple
¶ CompactLatticeArc encoding tuple.
-
ilabel
¶ Input label.
-
olabel
¶ Output label.
-
weight
¶ Weight.
-
-
decode
(key:int) → Tuple¶ Decodes an encoded arc label back to labels and cost.
-
encode
(arc:CompactLatticeArc) → int¶ Encodes the given arc (either labels or weights or both).
-
flags
() → int¶ Returns encoding flags.
-
get_label
(arc:CompactLatticeArc) → int¶ Looks up the encoded label for the given arc.
Returns -1 if arc is not found.
-
input_symbols
() → SymbolTable¶ Returns input symbols.
-
output_symbols
() → SymbolTable¶ Returns output symbols.
-
read
(strm:istream, source:str) → CompactLatticeEncodeTable¶ Reads encode table from input stream.
-
set_input_symbols
(syms:SymbolTable)¶ Sets input symbols.
-
set_output_symbols
(syms:SymbolTable)¶ Sets output symbols.
-
size
() → int¶ Returns the size of the table.
-
write
(strm:ostream, source:str) → bool¶ Writes table to output stream.
-
class
kaldi.fstext.
CompactLatticeFstCompiler
(isymbols=None, osymbols=None, ssymbols=None, acceptor=False, keep_isymbols=False, keep_osymbols=False, keep_state_numbering=False, allow_negative_labels=False)[source]¶ Compiler for FSTs over the compact lattice semiring.
This class is used to compile FSTs specified using the AT&T FSM library format described here:
http://web.eecs.umich.edu/~radev/NLP-fall2015/resources/fsm_archive/fsm.5.html
This is the same format used by the fstcompile executable.
FstCompiler options (symbol tables, etc.) are set at construction time:
compiler = FstCompiler(isymbols=ascii_syms, osymbols=ascii_syms)
Once constructed, FstCompiler instances behave like a file handle opened for writing:
# /ba+/
print("0 1 50 50", file=compiler)
print("1 2 49 49", file=compiler)
print("2 2 49 49", file=compiler)
print("2", file=compiler)
The compile method returns an actual FST instance:
sheep_machine = compiler.compile()
Compilation flushes the internal buffer, so the compiler instance can be reused to compile new machines with the same symbol tables, etc.
Parameters: - isymbols – An optional SymbolTable used to label input symbols.
- osymbols – An optional SymbolTable used to label output symbols.
- ssymbols – An optional SymbolTable used to label states.
- acceptor – Should the FST be rendered in acceptor format if possible?
- keep_isymbols – Should the input symbol table be stored in the FST?
- keep_osymbols – Should the output symbol table be stored in the FST?
- keep_state_numbering – Should the state numbering be preserved?
- allow_negative_labels – Should negative labels be allowed? (Not recommended; may cause conflicts).
-
compile
()¶ Compiles the FST in the string buffer.
This method compiles the FST and returns the resulting machine.
Returns: The FST described by the string buffer. Raises: RuntimeError
– Compilation failed.
-
write
(expression)¶ Writes a string into the compiler string buffer.
This method adds a line to the compiler string buffer. It can also be invoked with a print call, like so:
compiler = FstCompiler()
print("0 0 49 49", file=compiler)
print("0", file=compiler)
Parameters: expression – A string expression to add to compiler string buffer.
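The reason a compiler object can be the target of print(..., file=...) is that print only needs an object with a write method. The sketch below shows that mechanism with a hypothetical ToyCompiler that buffers text and parses a simplified version of the AT&T format. It assumes unweighted or weighted transducer lines of the form "src dst ilabel olabel [weight]" and final-state lines starting with a state ID, it ignores weights, and it is not PyKaldi's compiler.

```python
class ToyCompiler:
    """Hypothetical buffer that accepts print(..., file=...) calls."""

    def __init__(self):
        self._buffer = []

    def write(self, expression):
        # print() calls write() with the text and again with the line ending.
        self._buffer.append(expression)

    def compile(self):
        """Parses the buffered text and then flushes the buffer."""
        arcs, finals = [], []
        for line in "".join(self._buffer).splitlines():
            fields = line.split()
            if len(fields) >= 4:
                # Arc line: source, destination, input label, output label.
                src, dst, ilabel, olabel = (int(f) for f in fields[:4])
                arcs.append((src, dst, ilabel, olabel))
            elif fields:
                # Final-state line: the first field is the state ID.
                finals.append(int(fields[0]))
        self._buffer = []
        return arcs, finals


compiler = ToyCompiler()
print("0 1 50 50", file=compiler)
print("1 2 49 49", file=compiler)
print("2 2 49 49", file=compiler)
print("2", file=compiler)
arcs, finals = compiler.compile()
```

Since compile empties the buffer, the same object can be reused for another machine, which matches the reuse behavior described for FstCompiler.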
-
class
kaldi.fstext.
CompactLatticeVectorFst
(fst=None)[source]¶ Vector FST over the compact lattice semiring.
Parameters: fst (CompactLatticeFst) – The input FST over the compact lattice semiring. If provided, its contents are used for initializing the new FST. Defaults to None.
-
add_arc
(state, arc)¶ Adds a new arc to the FST and returns self.
Parameters: - state – The integer index of the source state.
- arc – The arc to add.
Returns: self.
Raises: IndexError
– State index out of range.See also:
add_state
.
-
add_state
()¶ Adds a new state to the FST and returns the state ID.
Returns: The integer index of the new state.
-
arcs
(state)¶ Returns an iterator over arcs leaving the specified state.
Parameters: state – The source state index. Returns: An ArcIterator. See also:
mutable_arcs
,states
.
-
arcsort
(sort_type='ilabel')¶ Sorts arcs leaving each state of the FST.
This operation destructively sorts arcs leaving each state using either input or output labels.
Parameters: sort_type – Either “ilabel” (sort arcs according to input labels) or “olabel” (sort arcs according to output labels). Returns: self. Raises: ValueError – Unknown sort type.
See also: topsort.
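As a rough model of the documented behavior, the following plain-Python sketch sorts per-state arc lists in place and rejects unknown sort types. The function and its data layout (arcs as (ilabel, olabel, nextstate) tuples in a dict keyed by state) are assumptions made for this illustration, not PyKaldi code.

```python
def arcsort(arcs_by_state, sort_type="ilabel"):
    """Sorts each state's arcs in place by input or output label."""
    positions = {"ilabel": 0, "olabel": 1}
    if sort_type not in positions:
        raise ValueError(f"Unknown sort type: {sort_type!r}")
    index = positions[sort_type]
    for arcs in arcs_by_state.values():
        arcs.sort(key=lambda arc: arc[index])
    # Returning the argument mirrors the "Returns: self" convention.
    return arcs_by_state


toy = {0: [(2, 1, 1), (1, 2, 1)], 1: []}
by_input = [list(arcs) for arcs in arcsort(toy).values()]
by_output = [list(arcs) for arcs in arcsort(toy, "olabel").values()]
```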
-
closure
(closure_plus=False)¶ Computes concatenative closure.
This operation destructively converts the FST to its concatenative closure. If A transduces string x to y with weight a, then the closure transduces x to y with weight a, xx to yy with weight a ⊗ a, xxx to yyy with weight a ⊗ a ⊗ a, and so on. The empty string is also transduced to itself with semiring One if closure_plus is False.
Parameters: closure_plus – If True, do not accept the empty string. Returns: self.
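To make the weights concrete, the sketch below works the closure example in the tropical semiring, where ⊗ is ordinary addition, One is 0.0 and Zero is infinity. It assumes a hypothetical FST A with a single path of weight a that does not itself accept the empty string; closure_weight is an illustrative helper, not a PyKaldi function.

```python
import math

# Tropical semiring: "times" is ordinary addition, One is 0.0, Zero is infinity.
ONE = 0.0
ZERO = math.inf


def otimes(a, b):
    return a + b


def closure_weight(a, n, closure_plus=False):
    """Weight the closure assigns to n repetitions of a string of weight a."""
    if n == 0:
        # The empty string gets weight One unless closure_plus is set,
        # in which case it is rejected (weight Zero).
        return ZERO if closure_plus else ONE
    weight = ONE
    for _ in range(n):
        weight = otimes(weight, a)
    return weight


once = closure_weight(1.5, 1)
thrice = closure_weight(1.5, 3)
empty_star = closure_weight(1.5, 0)
empty_plus = closure_weight(1.5, 0, closure_plus=True)
```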
-
concat
(ifst)¶ Computes the concatenation (product) of two FSTs.
This operation destructively concatenates the FST with a second FST. If A transduces string x to y with weight a and B transduces string w to v with weight b, then their concatenation transduces string xw to yv with weight a ⊗ b.
Parameters: ifst – The second input FST. Returns: self.
-
connect
()¶ Removes unsuccessful paths.
This operation destructively trims the FST, removing states and arcs that are not part of any successful path.
Returns: self.
-
copy
()¶ Makes a copy of the FST.
Returns: A copy of the FST.
-
decode
(encoder)¶ Decodes encoded labels and/or weights.
This operation reverses the encoding performed by encode.
Parameters: encoder – An EncodeMapper object used to encode the FST. Returns: self. See also: encode.
-
delete_arcs
(state, n=None)¶ Deletes arcs leaving a particular state.
Parameters: - state – The integer index of a state.
- n – An optional argument indicating how many arcs to be deleted. If this argument is None, all arcs from this state are deleted.
Returns: self.
Raises: IndexError
– State index out of range.See also:
delete_states
.
-
delete_states
(states=None)¶ Deletes states.
Parameters: states – An optional iterable of integer indices of the states to be deleted. If this argument is omitted, all states are deleted. Returns: self. Raises: IndexError
– State index out of range.See also:
delete_arcs
.
-
draw
(filename, isymbols=None, osymbols=None, ssymbols=None, acceptor=False, title='', width=8.5, height=11, portrait=False, vertical=False, ranksep=0.4, nodesep=0.25, fontsize=14, precision=5, float_format='g', show_weight_one=False)¶ Writes out the FST in Graphviz text format.
This method writes out the FST in the dot graph description language. The graph can be rendered using the dot binary provided by Graphviz.
Parameters: - filename (str) – The string location of the output dot/Graphviz file.
- isymbols – An optional symbol table used to label input symbols.
- osymbols – An optional symbol table used to label output symbols.
- ssymbols – An optional symbol table used to label states.
- acceptor (bool) – Should the figure be rendered in acceptor format if possible? Defaults to False.
- title (str) – An optional string indicating the figure title. Defaults to the empty string.
- width (float) – The figure width, in inches. Defaults to 8.5.
- height (float) – The figure height, in inches. Defaults to 11.
- portrait (bool) – Should the figure be rendered in portrait rather than landscape? Defaults to False.
- vertical (bool) – Should the figure be rendered bottom-to-top rather than left-to-right? Defaults to False.
- ranksep (float) – The minimum separation between ranks, in inches. Defaults to 0.4.
- nodesep (float) – The minimum separation between nodes, in inches. Defaults to 0.25.
- fontsize (int) – Font size, in points. Defaults to 14.
- precision (int) – Numeric precision for floats, in number of chars. Defaults to 5.
- float_format ('e', 'f' or 'g') – One of ‘e’, ‘f’ or ‘g’. Defaults to ‘g’.
- show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults to False.
For more information about the rendering options, see man dot.
See also: text.
-
encode
(encoder)¶ Encodes labels and/or weights.
This operation allows for the representation of a weighted transducer as a weighted acceptor, an unweighted transducer, or an unweighted acceptor by considering the pair (input label, output label), the pair (input label, weight), or the triple (input label, output label, weight) as a single label. Applying this operation mutates the EncodeMapper argument, which can then be used to decode.
Parameters: encoder – An EncodeMapper object used to encode the FST. Returns: self. See also:
decode
.
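The idea of treating a tuple of arc attributes as a single label can be sketched with a small lookup table. ToyEncodeTable is a hypothetical class for illustration only; starting the encoded labels at 1 (so that 0 stays available for epsilon) is an assumption of this sketch rather than something stated above.

```python
class ToyEncodeTable:
    """Hypothetical table mapping arc tuples to single integer labels."""

    def __init__(self, encode_labels=True, encode_weights=True):
        self.encode_labels = encode_labels
        self.encode_weights = encode_weights
        self._keys = {}     # tuple -> encoded label
        self._tuples = []   # position (encoded label - 1) -> tuple

    def encode(self, ilabel, olabel, weight):
        """Returns the label standing for the selected parts of the arc."""
        key = (
            ilabel,
            olabel if self.encode_labels else None,
            weight if self.encode_weights else None,
        )
        if key not in self._keys:
            self._tuples.append(key)
            # Assumption: labels start at 1 so that 0 stays free for epsilon.
            self._keys[key] = len(self._tuples)
        return self._keys[key]

    def decode(self, label):
        """Returns the tuple that was stored for an encoded label."""
        return self._tuples[label - 1]


table = ToyEncodeTable()
first = table.encode(3, 7, 0.5)
again = table.encode(3, 7, 0.5)   # same triple, same label
other = table.encode(3, 7, 1.5)   # different weight, new label
```

The table fills up as arcs are encoded, which is why the same object is needed later to decode.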
-
final
(state)¶ Returns the final weight of a state.
Parameters: state – The integer index of a state. Returns: The final Weight of that state. Raises: IndexError
– State index out of range.
-
from_bytes
(s)¶ Returns the FST represented by the bytes object.
Parameters: s (bytes) – The bytes object representing the FST. Returns: An FST object.
-
input_symbols
()¶ Returns the input symbol table.
Returns: The input symbol table. See Also:
output_symbols()
.
-
invert
()¶ Inverts the FST’s transduction.
This operation destructively inverts the FST’s transduction by exchanging input and output labels.
Returns: self.
-
minimize
(delta=0.0009765625, allow_nondet=False)¶ Minimizes the FST.
This operation destructively performs the minimization of deterministic weighted automata and transducers. If the input FST A is an acceptor, this operation produces the minimal acceptor B equivalent to A, i.e. the acceptor with a minimal number of states that is equivalent to A. If the input FST A is a transducer, this operation internally builds an equivalent transducer with a minimal number of states. However, this minimality is obtained by allowing transitions to have strings of symbols as output labels; this is known in the literature as a real-time transducer. Such transducers are not directly supported by the library, so this function converts them by expanding each string-labeled transition into a sequence of transitions. This creates new states and hence loses the minimality property.
Parameters: - delta – Comparison/quantization delta (default: 0.0009765625).
- allow_nondet – Attempt minimization of non-deterministic FST?
Returns: self.
-
mutable_arcs
(state)¶ Returns a mutable iterator over arcs leaving the specified state.
Parameters: state – The source state index. Returns: A MutableArcIterator.
-
num_arcs
(state=None)¶ Returns the number of arcs, counting them if necessary.
If state is None, returns the number of arcs in the FST. Otherwise, returns the number of arcs leaving that state.
Parameters: state – The integer index of a state. Defaults to None.
Returns: The number of arcs leaving a state or the number of arcs in the FST.
Note: This method counts the number of arcs in the FST by iterating over the states and summing up the number of arcs leaving each state.
Raises: IndexError – State index out of range.
See also: num_states.
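The counting behaviour described in the note can be sketched in plain Python as follows. This is only an illustration on a hypothetical dictionary-of-lists FST representation; `count_arcs` is not a PyKaldi function.

```python
def count_arcs(fst, state=None):
    """Count arcs leaving `state`, or all arcs when `state` is None."""
    if state is None:
        # Whole-FST count: visit every state and sum its outgoing arcs.
        return sum(len(arcs) for arcs in fst.values())
    if state not in fst:
        raise IndexError("State index out of range")
    return len(fst[state])


# Hypothetical FST: state -> list of (ilabel, olabel, weight, nextstate).
fst = {
    0: [(1, 1, 0.5, 1), (2, 2, 1.0, 2)],
    1: [(3, 3, 0.0, 2)],
    2: [],
}
total = count_arcs(fst)
leaving_zero = count_arcs(fst, 0)
```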
-
num_input_epsilons
(state)¶ Returns the number of arcs with epsilon input labels leaving a state.
Parameters: state – The integer index of a state. Returns: The number of epsilon-input-labeled arcs leaving that state. Raises: IndexError – State index out of range. See also: num_output_epsilons.
-
num_output_epsilons
(state)¶ Returns the number of arcs with epsilon output labels leaving a state.
Parameters: state – The integer index of a state. Returns: The number of epsilon-output-labeled arcs leaving that state. Raises: IndexError – State index out of range. See also: num_input_epsilons.
-
num_states
()¶ Returns the number of states, counting them if necessary.
Returns: The number of states. See also:
num_arcs
.
-
output_symbols
()¶ Returns the output symbol table.
Returns: The output symbol table. See Also:
input_symbols()
.
-
project
(project_output=False)¶ Converts the FST to an acceptor using input or output labels.
This operation destructively projects an FST onto its domain or range by either copying each arc’s input label to its output label (the default) or vice versa.
Parameters: project_output – Project onto output labels? Returns: self. See also: decode, encode, relabel, relabel_tables.
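Projection as described above, copying one label of each arc onto the other, can be sketched in plain Python. The function `project_arcs` and the tuple arc layout are hypothetical and do not use PyKaldi.

```python
def project_arcs(arcs, project_output=False):
    """Copy the input label to the output label, or vice versa."""
    projected = []
    for ilabel, olabel, weight, nextstate in arcs:
        label = olabel if project_output else ilabel
        projected.append((label, label, weight, nextstate))
    return projected


arcs = [(1, 2, 0.5, 1), (3, 4, 1.0, 2)]
on_input = project_arcs(arcs)
on_output = project_arcs(arcs, project_output=True)
```

Either way, every arc ends up with identical input and output labels, which is what makes the result an acceptor.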
-
properties
(mask, test)¶ Provides property bits.
This method provides user access to the properties attributes for the FST. The resulting value is a long integer, but when it is cast to a boolean, it represents whether or not the FST has the mask property.
Parameters: - mask – The property mask to be compared to the FST’s properties.
- test – Should any unknown values be computed before comparing against the mask?
Returns: A 64-bit bitmask representing the requested properties.
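The "integer that is truthy when the property holds" convention can be illustrated with a small plain-Python sketch. The bit constants below are hypothetical placeholders chosen for this example and are not OpenFst's actual property values.

```python
# Hypothetical property bits for illustration; these are NOT OpenFst's constants.
ACCEPTOR = 1 << 0
ACYCLIC = 1 << 1
WEIGHTED = 1 << 2


def masked_properties(stored_bits, mask):
    """Return the subset of `mask` bits that are set in `stored_bits`."""
    return stored_bits & mask


fst_bits = ACCEPTOR | WEIGHTED
is_acceptor = bool(masked_properties(fst_bits, ACCEPTOR))
is_acyclic = bool(masked_properties(fst_bits, ACYCLIC))
```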
-
prune
(weight=None, nstate=-1, delta=0.0009765625)¶ Removes paths with weights below a certain threshold.
This operation deletes states and arcs in the input FST that do not belong to a successful path whose weight is no more (w.r.t. the natural semiring order) than the threshold ⊗ the weight of the shortest path in the input FST. Weights must be commutative and have the path property.
Parameters: - weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired weight threshold below which paths are pruned; if None, no paths are pruned.
- nstate – State number threshold (default: -1).
- delta – Comparison/quantization delta (default: 0.0009765625).
Returns: self.
See also: The constructive variant.
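The pruning criterion can be illustrated on an explicit list of paths, assuming the tropical semiring, where ⊗ is ordinary addition and the natural order is the usual numeric order. The function `prune_paths` and the (name, weight) path representation are hypothetical; a real implementation works on states and arcs rather than on enumerated paths.

```python
def prune_paths(paths, threshold):
    """Keep paths whose weight is at most threshold + best weight (tropical)."""
    best = min(weight for _, weight in paths)
    return [(name, weight) for name, weight in paths
            if weight <= threshold + best]


paths = [("a", 1.0), ("b", 2.5), ("c", 4.0)]
kept = prune_paths(paths, threshold=2.0)
```

Here the shortest path has weight 1.0, so with a threshold of 2.0 only paths weighing at most 3.0 survive.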
-
push
(to_final=False, delta=0.0009765625, remove_total_weight=False)¶ Pushes weights towards the initial or final states.
This operation destructively produces an equivalent transducer by pushing the weights towards the initial state or toward the final states. When pushing weights towards the initial state, the sum of the weight of the outgoing transitions and final weight at any non-initial state is equal to one in the resulting machine. When pushing weights towards the final states, the sum of the weight of the incoming transitions at any state is equal to one. Weights need to be left distributive when pushing towards the initial state and right distributive when pushing towards the final states.
Parameters: - to_final – Push towards final states?
- delta – Comparison/quantization delta (default: 0.0009765625).
- remove_total_weight – If pushing weights, should the total weight be removed?
Returns: self.
See also: The constructive variant, which also supports label pushing.
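The following plain-Python sketch illustrates pushing towards the initial state under several simplifying assumptions: the tropical semiring (⊕ is min, ⊗ is addition, One is 0.0), an acyclic FST whose arcs always lead to higher-numbered states, and every state able to reach a final state. The function `push_to_initial` and the data layout are hypothetical, and the total weight is kept on the arcs leaving the start state.

```python
INF = float("inf")


def push_to_initial(arcs, final, start=0):
    """arcs: {state: [(label, weight, nextstate)]}; final: {state: weight}."""
    states = sorted(arcs)
    # Shortest distance from each state to a final state, computed backwards.
    dist = {}
    for s in reversed(states):
        dist[s] = min([final.get(s, INF)] + [w + dist[n] for _, w, n in arcs[s]])
    pushed_arcs, pushed_final = {}, {}
    for s in states:
        # The start state keeps the total weight, so its potential is One.
        p = 0.0 if s == start else dist[s]
        pushed_arcs[s] = [(l, w + dist[n] - p, n) for l, w, n in arcs[s]]
        if s in final:
            pushed_final[s] = final[s] - p
    return pushed_arcs, pushed_final


arcs = {0: [("a", 1.0, 1), ("b", 5.0, 2)], 1: [("c", 3.0, 2)], 2: []}
final = {2: 2.0}
pushed_arcs, pushed_final = push_to_initial(arcs, final)
```

In the result, the minimum over outgoing arc weights and the final weight at each non-initial state is 0.0, the tropical One, while both complete paths keep their original weights of 6.0 and 7.0.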
-
read
(filename)¶ Reads an FST from a file.
Parameters: filename (str) – The location of the input file. Returns: An FST object. Raises: RuntimeError
– Read failed.
-
read_from_stream
(strm, ropts)¶ Reads an FST from an input stream.
Parameters: - strm (istream) – The input stream to read from.
- ropts (FstReadOptions) – FST reading options.
Returns: An FST object.
Raises: RuntimeError
– Read failed.
-
relabel
(ipairs=None, opairs=None)¶ Replaces input and/or output labels using pairs of labels.
This operation destructively relabels the input and/or output labels of the FST using pairs of the form (old_ID, new_ID); omitted indices are identity-mapped.
Parameters: - ipairs – An iterable containing (old index, new index) integer pairs.
- opairs – An iterable containing (old index, new index) integer pairs.
Returns: self.
Raises: ValueError – No relabeling pairs specified. See also: decode, encode, project, relabel_tables.
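The pair-based relabeling with identity mapping for omitted labels can be sketched in plain Python. The function `relabel_arcs` and the tuple arc layout are hypothetical stand-ins and do not call PyKaldi.

```python
def relabel_arcs(arcs, ipairs=None, opairs=None):
    """Relabel using (old, new) pairs; labels not listed stay unchanged."""
    if not ipairs and not opairs:
        raise ValueError("No relabeling pairs specified.")
    imap = dict(ipairs or ())
    omap = dict(opairs or ())
    return [(imap.get(i, i), omap.get(o, o), w, n) for i, o, w, n in arcs]


arcs = [(1, 10, 0.5, 1), (2, 20, 1.0, 2)]
relabeled = relabel_arcs(arcs, ipairs=[(1, 7)], opairs=[(20, 9)])
```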
-
relabel_tables
(old_isymbols=None, new_isymbols=None, unknown_isymbol='', attach_new_isymbols=True, old_osymbols=None, new_osymbols=None, unknown_osymbol='', attach_new_osymbols=True)¶ Replaces input and/or output labels using SymbolTables.
This operation destructively relabels the input and/or output labels of the FST using user-specified symbol tables; omitted symbols are identity-mapped.
Parameters: - old_isymbols – The old SymbolTable for input labels, defaulting to the FST’s input symbol table.
- new_isymbols – A SymbolTable used to relabel the input labels.
- unknown_isymbol – Input symbol to use to relabel OOVs (if empty, OOVs raise an exception).
- attach_new_isymbols – Should new_isymbols be made the FST’s input symbol table?
- old_osymbols – The old SymbolTable for output labels, defaulting to the FST’s output symbol table.
- new_osymbols – A SymbolTable used to relabel the output labels.
- unknown_osymbol – Output symbol to use to relabel OOVs (if empty, OOVs raise an exception).
- attach_new_osymbols – Should new_osymbols be made the FST’s output symbol table?
Returns: self.
Raises: ValueError
– No SymbolTable specified.
-
reserve_arcs
(state, n)¶ Reserve n arcs at a particular state (best effort).
Parameters: - state – The integer index of a state.
- n – The number of arcs to reserve.
Returns: self.
Raises: IndexError – State index out of range. See also: reserve_states.
-
reserve_states
(n)¶ Reserve n states (best effort).
Parameters: n – The number of states to reserve. Returns: self. See also:
reserve_arcs
.
-
reweight
(potentials, to_final=False)¶ Reweights an FST using an iterable of potentials.
This operation destructively reweights an FST according to the potentials and in the direction specified by the user. An arc of weight w, with an origin state of potential p and destination state of potential q, is reweighted by p^{-1} ⊗ (w ⊗ q) when reweighting towards the initial state, and by (p ⊗ w) ⊗ q^{-1} when reweighting towards the final states. The weights must be left distributive when reweighting towards the initial state and right distributive when reweighting towards the final states (e.g., TropicalWeight and LogWeight).
Parameters: - potentials – An iterable of TropicalWeights.
- to_final – Push towards final states?
Returns: self.
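The two reweighting formulas can be checked numerically in the tropical semiring, where the semiring product is ordinary addition and the inverse is negation. This plain-Python sketch covers a single arc only; `reweight_arc` is a hypothetical helper, not a PyKaldi function.

```python
def reweight_arc(weight, p, q, to_final=False):
    """Reweight one arc in the tropical semiring.

    p is the potential of the origin state, q that of the destination state.
    """
    if to_final:
        # (p times w) times q^{-1} becomes (p + w) - q.
        return (p + weight) - q
    # p^{-1} times (w times q) becomes (w + q) - p.
    return (weight + q) - p


potentials = [0.0, 1.0, 3.0]
# Arc from state 0 to state 1 with weight 2.0.
towards_initial = reweight_arc(2.0, potentials[0], potentials[1])
towards_final = reweight_arc(2.0, potentials[0], potentials[1], to_final=True)
```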
-
rmepsilon
(connect=True, weight=None, nstate=-1, delta=0.0009765625)¶ Removes epsilon transitions.
This operation destructively removes epsilon transitions (i.e., those where both input and output labels are epsilon) from an FST.
Parameters: - connect – Should output be trimmed?
- weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired weight threshold below which paths are pruned; if None, no paths are pruned.
- nstate – State number threshold (default: -1).
- delta – Comparison/quantization delta (default: 0.0009765625).
Returns: self.
See also: The constructive variant, which also supports epsilon removal in reverse (and which may be more efficient).
-
set_final
(state, weight=None)¶ Sets the final weight for a state.
Parameters: - state – The integer index of a state.
- weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired final weight; if omitted, it is set to semiring One.
Raises: IndexError – State index out of range. See also: set_start.
-
set_input_symbols
(syms)¶ Sets the input symbol table.
Passing None as a value will delete the input symbol table.
Parameters: syms – A SymbolTable. Returns: self. See also: set_output_symbols.
-
set_output_symbols
(syms)¶ Sets the output symbol table.
Passing None as a value will delete the output symbol table.
Parameters: syms – A SymbolTable. Returns: self. See also: set_input_symbols.
-
set_properties
(props, mask)¶ Sets the properties bits.
Parameters: Returns: self.
-
set_start
(state)¶ Sets the initial state.
Parameters: state – The integer index of a state. Returns: self. Raises: IndexError – State index out of range. See also: set_final.
-
start
()¶ Returns the start state.
Returns: The start state if start state is set, -1 otherwise.
-
states
()¶ Returns an iterator over all states in the FST.
Returns: A StateIterator object for the FST. See also:
arcs
,mutable_arcs
.
-
text
(isymbols=None, osymbols=None, ssymbols=None, acceptor=False, show_weight_one=False, missing_symbol='')¶ Produces a human-readable string representation of the FST.
This method generates a human-readable string representation of the FST. The caller may optionally specify SymbolTables used to label input labels, output labels, or state labels, respectively.
Parameters: - isymbols – An optional symbol table used to label input symbols.
- osymbols – An optional symbol table used to label output symbols.
- ssymbols – An optional symbol table used to label states.
- acceptor (bool) – Should the FST be rendered in acceptor format if possible? Defaults False.
- show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults False.
- missing_symbol – The string to be printed when symbol table lookup fails.
Returns: A formatted string representing the FST.
-
to_bytes
()¶ Returns a bytes object representing the FST.
Returns: A bytes object.
-
topsort
()¶ Sorts transitions by state IDs.
This operation destructively topologically sorts the FST, if it is acyclic; otherwise it remains unchanged. Once sorted, all transitions are from lower state IDs to higher state IDs.
Returns: self. See also:
arcsort
.
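A topological order of states, and the detection of the cyclic case in which nothing should change, can be sketched with Kahn's algorithm in plain Python. This is a generic illustration on a hypothetical adjacency dictionary, named here as a swapped-in technique; it is not a description of how the library implements the operation.

```python
from collections import deque


def topsort_order(arcs):
    """arcs: {state: [nextstate, ...]}. Returns an order, or None if cyclic."""
    indegree = {state: 0 for state in arcs}
    for nexts in arcs.values():
        for nextstate in nexts:
            indegree[nextstate] += 1
    queue = deque(state for state in arcs if indegree[state] == 0)
    order = []
    while queue:
        state = queue.popleft()
        order.append(state)
        for nextstate in arcs[state]:
            indegree[nextstate] -= 1
            if indegree[nextstate] == 0:
                queue.append(nextstate)
    # If some state was never released, the graph contains a cycle.
    return order if len(order) == len(arcs) else None


acyclic = {0: [2], 1: [], 2: [1]}
order = topsort_order(acyclic)
new_ids = {old: new for new, old in enumerate(order)}
cyclic_result = topsort_order({0: [1], 1: [0]})
```

After renumbering with `new_ids`, every transition in the acyclic example goes from a lower state ID to a higher one.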
-
type
()¶ Returns the FST type.
Returns: The FST type.
-
union
(ifst)¶ Computes the union (sum) of two FSTs.
This operation computes the union (sum) of two FSTs. If A transduces string x to y with weight a and B transduces string w to v with weight b, then their union transduces x to y with weight a and w to v with weight b.
Parameters: ifst – The second input FST. Returns: self.
-
verify
()¶ Verifies that an FST’s contents are sane.
Returns: True if the contents are sane, False otherwise.
-
write
(filename)¶ Serializes FST to a file.
This method writes the FST to a file in a binary format.
Parameters: filename (str) – The location of the output file. Raises: IOError
– Write failed.
-
write_to_stream
(strm, wopts)¶ Serializes FST to an output stream.
Parameters: - strm (ostream) – The output stream to write to.
- wopts (FstWriteOptions) – FST writing options.
Returns: True if write was successful, False otherwise.
Raises: RuntimeError
– Write failed.
-
-
class
kaldi.fstext.
CompactLatticeVectorFstArcIterator
(fst, state)[source]¶ Arc iterator for a vector FST over the compact lattice semiring.
This class is used for iterating over the arcs leaving some state. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the arcs method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.
Creates a new arc iterator.
Parameters: - fst – The fst.
- state – The state index.
Raises: IndexError – State index out of range.
-
done
()¶ Indicates whether the iterator is exhausted or not.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: True if the iterator is exhausted, False otherwise.
-
flags
()¶ Returns the current iterator behavioral flags.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: The current iterator behavioral flags as an integer.
-
next
()¶ Advances the iterator.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
position
()¶ Returns the position of the iterator.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: The iterator’s position, expressed as an integer.
-
reset
()¶ Resets the iterator to the initial position.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
seek
(a)¶ Advance the iterator to a new position.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Parameters: a (int) – The position to seek to.
-
set_flags
(flags, mask)¶ Sets the current iterator behavioral flags.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Parameters:
-
value
()¶ Returns the current arc.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
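The relationship between the C++-style methods listed above and the iterator protocol can be sketched with a small plain-Python class. `ArcIteratorSketch` is a hypothetical stand-in over a list of arcs and is not the PyKaldi class; it only mirrors the method names documented here.

```python
class ArcIteratorSketch:
    """Toy iterator exposing both a C++-style and a Pythonic interface."""

    def __init__(self, arcs):
        self._arcs = arcs
        self._pos = 0

    def done(self):
        return self._pos >= len(self._arcs)

    def next(self):
        self._pos += 1

    def value(self):
        return self._arcs[self._pos]

    def position(self):
        return self._pos

    def reset(self):
        self._pos = 0

    def seek(self, a):
        self._pos = a

    def __iter__(self):
        # The Pythonic API wraps the same done/value/next loop.
        self.reset()
        while not self.done():
            yield self.value()
            self.next()


arcs = [(1, 1, 0.5, 1), (2, 2, 1.0, 2)]
it = ArcIteratorSketch(arcs)

# C++-style traversal.
cpp_style = []
while not it.done():
    cpp_style.append(it.value())
    it.next()

# Pythonic traversal.
pythonic = list(it)
```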
-
class
kaldi.fstext.
CompactLatticeVectorFstMutableArcIterator
(fst, state)[source]¶ Mutable arc iterator for a vector FST over the compact lattice semiring.
This class is used for iterating over the arcs leaving some state and optionally replacing them with new ones. In addition to the full C++ API, it also supports the iterator protocol. Calling the __iter__ method of a mutable arc iterator object returns an iterator over (arc, setter) pairs. The setter is a bound method of the mutable arc iterator object that can be used to replace the current arc with a new one. Most users should just call the mutable_arcs method of a vector FST object instead of directly constructing this iterator and take advantage of the Pythonic API, e.g.
for arc, setter in lattice.mutable_arcs(0):
    setter(LatticeArc(arc.ilabel, 0, arc.weight, arc.nextstate))
Creates a new arc iterator.
Parameters: - fst – The fst.
- state – The state index.
Raises: IndexError – State index out of range.
-
done
()¶ Indicates whether the iterator is exhausted or not.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: True if the iterator is exhausted, False otherwise.
-
flags
()¶ Returns the current iterator behavioral flags.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: The current iterator behavioral flags as an integer.
-
next
()¶ Advances the iterator.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
position
()¶ Returns the position of the iterator.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: The iterator’s position, expressed as an integer.
-
reset
()¶ Resets the iterator to the initial position.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
seek
(a)¶ Advance the iterator to a new position.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Parameters: a (int) – The position to seek to.
-
set_flags
(flags, mask)¶ Sets the current iterator behavioral flags.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Parameters:
-
set_value
(arc)¶ Replace the current arc with a new arc.
Parameters: arc – The arc to replace the current arc with.
-
value
()¶ Returns the current arc.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
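The (arc, setter) iteration pattern of the mutable arc iterator can be imitated in plain Python to show why the setter always targets the arc currently being visited. `MutableArcIteratorSketch` and the tuple arcs are hypothetical and do not use PyKaldi types.

```python
class MutableArcIteratorSketch:
    """Toy mutable iterator over a list of arcs, yielding (arc, setter) pairs."""

    def __init__(self, arcs):
        self._arcs = arcs  # mutated in place by set_value
        self._pos = 0

    def set_value(self, arc):
        # Replaces the arc at the current position.
        self._arcs[self._pos] = arc

    def __iter__(self):
        self._pos = 0
        while self._pos < len(self._arcs):
            # The bound method acts on whatever position is current.
            yield self._arcs[self._pos], self.set_value
            self._pos += 1


arcs = [(1, 7, 0.5, 1), (2, 8, 1.5, 2)]
for (ilabel, olabel, weight, nextstate), setter in MutableArcIteratorSketch(arcs):
    # Replace every output label with 0 (epsilon by convention).
    setter((ilabel, 0, weight, nextstate))
```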
-
class
kaldi.fstext.
CompactLatticeVectorFstStateIterator
(fst)[source]¶ State iterator for a vector FST over the compact lattice semiring.
This class is used for iterating over the states. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the states method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.
Creates a new state iterator.
Parameters: fst – The fst.
-
done
()¶ Indicates whether the iterator is exhausted or not.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: True if the iterator is exhausted, False otherwise.
-
next
()¶ Advances the iterator.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
reset
()¶ Resets the iterator to the initial position.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
value
()¶ Returns the current state index.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
-
class
kaldi.fstext.
CompactLatticeWeight
[source]¶ Compact lattice weight factory.
This class is used for creating new CompactLatticeWeight instances.
- CompactLatticeWeight():
- Creates an uninitialized CompactLatticeWeight instance.
- CompactLatticeWeight(weight):
- Creates a new CompactLatticeWeight instance initialized with the weight.
Parameters: weight (Tuple[Tuple[float, float], List[int]] or Tuple[LatticeWeight, List[int]] or CompactLatticeWeight) – A pair of weight values or another CompactLatticeWeight instance.
- CompactLatticeWeight(weight, string):
- Creates a new CompactLatticeWeight instance initialized with the (weight, string) pair.
Parameters: - weight (Tuple[float, float] or LatticeWeight) – The weight value.
- string (List[int]) – The string value given as a list of integers.
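To illustrate the (weight, string) structure, the sketch below models a compact lattice weight as a plain tuple `((float, float), [int, ...])`. It assumes that the semiring product adds the two float components and concatenates the strings, which is my understanding of Kaldi's compact lattice semiring; the `times` helper and `ONE` constant are hypothetical, and special handling of the semiring Zero is left out.

```python
# One: zero cost in both components and an empty string (assumed convention).
ONE = ((0.0, 0.0), [])


def times(a, b):
    """Combine two ((v1, v2), string) weights along a path."""
    (a1, a2), a_string = a
    (b1, b2), b_string = b
    return ((a1 + b1, a2 + b2), a_string + b_string)


x = ((1.0, 2.0), [3, 4])
y = ((0.5, 0.25), [5])
product = times(x, y)
```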
-
from_other
(other:CompactLatticeWeight) → CompactLatticeWeight¶ Create a new compact lattice weight from another.
-
from_pair
(w:LatticeWeight, s:list<int>) → CompactLatticeWeight¶ Create a new compact lattice weight from a weight string pair.
-
get_int_size_string
() → str¶ Returns int size string.
-
hash
() → int¶ Returns the hash for the weight.
-
member
() → bool¶ Checks if weight is a member of the compact lattice semiring.
-
no_weight
() → CompactLatticeWeight¶ No weight in compact lattice semiring.
-
one
() → CompactLatticeWeight¶ One in compact lattice semiring.
-
properties
() → int¶ Returns weight properties.
-
quantize
(delta:float=default) → CompactLatticeWeight¶ Quantizes the weight.
-
reverse
() → CompactLatticeWeight¶ Reverses the weight.
-
string
¶ The string as a list of integers.
-
type
() → str¶ Returns weight type.
-
weight
¶ The weight.
-
zero
() → CompactLatticeWeight¶ Zero in compact lattice semiring.
-
class
kaldi.fstext.
FstHeader
¶ FST file header.
-
arc_type
() → str¶ Returns arc type.
-
debug_string
() → str¶ Outputs a debug string for the FstHeader object.
-
fst_type
() → str¶ Returns FST type.
-
get_flags
() → int¶ Returns flags.
-
num_arcs
() → int¶ Returns number of arcs.
-
num_states
() → int¶ Returns number of states.
-
properties
() → int¶ Returns FST properties.
-
read
(strm:istream, source:str, rewind:bool=default) → bool¶ Reads header from stream.
-
set_arc_type
(type:str)¶ Sets arc type.
-
set_flags
(flags:int)¶ Sets flags.
-
set_fst_type
(type:str)¶ Sets FST type.
-
set_num_arcs
(numarcs:int)¶ Sets number of arcs.
-
set_num_states
(numstates:int)¶ Sets number of states.
-
set_properties
(properties:int)¶ Sets FST properties.
-
set_start
(start:int)¶ Sets start state.
-
set_version
(version:int)¶ Sets version.
-
start
() → int¶ Returns start state.
-
version
() → int¶ Returns version.
-
write
(strm:ostream, source:str) → bool¶ Writes header to stream.
-
-
class
kaldi.fstext.
FstReadOptions
¶ FST reading options.
-
FileReadMode
¶ alias of
FstReadOptions.FileReadMode
-
debug_string
() → str¶ Outputs a debug string for the FstReadOptions object.
-
mode
¶ Read or map files (advisory, if possible)
-
read_isymbols
¶ Read input symbols, if any (default – true).
-
read_mode
(mode:str) → FileReadMode¶ Converts mode strings into FileReadMode enum values.
-
read_osymbols
¶ Read output symbols, if any (default – true).
-
source
¶ Where you’re reading from.
-
-
class
kaldi.fstext.
FstWriteOptions
¶ FST writing options.
-
align
¶ Write data aligned (may fail on pipes)?
-
source
¶ Where you’re writing to.
-
stream_write
¶ Avoid seek operations in writing.
-
write_header
¶ Write the header?
-
write_isymbols
¶ Write input symbols?
-
write_osymbols
¶ Write output symbols?
-
-
class
kaldi.fstext.
KwsIndexArc
[source]¶ FST arc with KWS index weight.
- KwsIndexArc():
- Creates an uninitialized KwsIndexArc instance.
- KwsIndexArc(ilabel, olabel, weight, nextstate):
- Creates a new KwsIndexArc instance initialized with given arguments.
Parameters: - ilabel (int) – The input label.
- olabel (int) – The output label.
- weight (KwsIndexWeight) – The arc weight.
- nextstate (int) – The destination state for the arc.
-
from_attrs
(ilabel:int, olabel:int, weight:KwsIndexWeight, nextstate:int) → KwsIndexArc¶ Creates a new arc with the given attributes.
Parameters: - ilabel (int) – The input label.
- olabel (int) – The output label.
- weight (KwsIndexWeight) – The arc weight.
- nextstate (int) – The destination state for the arc.
-
ilabel
¶ int – The input label.
-
nextstate
¶ int – The destination state for the arc.
-
olabel
¶ int – The output label.
-
type
() → str¶ Returns arc type.
-
weight
KwsIndexWeight – The arc weight.
-
class
kaldi.fstext.
KwsIndexConstFst
(fst=None)[source]¶ Constant FST over the KWS index semiring.
Parameters: fst (KwsIndexFst) – The input FST over the KWS index semiring. If provided, its contents are used for initializing the new FST. Defaults to None.
-
arcs
(state)¶ Returns an iterator over arcs leaving the specified state.
Parameters: state – The source state index. Returns: An ArcIterator. See also:
mutable_arcs
,states
.
-
copy
()¶ Makes a copy of the FST.
Returns: A copy of the FST.
-
draw
(filename, isymbols=None, osymbols=None, ssymbols=None, acceptor=False, title='', width=8.5, height=11, portrait=False, vertical=False, ranksep=0.4, nodesep=0.25, fontsize=14, precision=5, float_format='g', show_weight_one=False)¶ Writes out the FST in Graphviz text format.
This method writes out the FST in the dot graph description language. The graph can be rendered using the dot binary provided by Graphviz.
Parameters: - filename (str) – The string location of the output dot/Graphviz file.
- isymbols – An optional symbol table used to label input symbols.
- osymbols – An optional symbol table used to label output symbols.
- ssymbols – An optional symbol table used to label states.
- acceptor (bool) – Should the figure be rendered in acceptor format if possible? Defaults to False.
- title (str) – An optional string indicating the figure title. Defaults to empty string.
- width (float) – The figure width, in inches. Defaults to 8.5".
- height (float) – The figure height, in inches. Defaults to 11".
- portrait (bool) – Should the figure be rendered in portrait rather than landscape? Defaults to False.
- vertical (bool) – Should the figure be rendered bottom-to-top rather than left-to-right?
- ranksep (float) – The minimum separation between ranks, in inches. Defaults to 0.4".
- nodesep (float) – The minimum separation between nodes, in inches. Defaults to 0.25".
- fontsize (int) – Font size, in points. Defaults to 14pt.
- precision (int) – Numeric precision for floats, in number of chars. Defaults to 5.
- float_format ('e', 'f' or 'g') – One of: ‘e’, ‘f’ or ‘g’. Defaults to ‘g’.
- show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults to False.
For more information about the rendering options, see man dot.
See also: text.
-
final
(state)¶ Returns the final weight of a state.
Parameters: state – The integer index of a state. Returns: The final Weight of that state. Raises: IndexError
– State index out of range.
-
from_bytes
(s)¶ Returns the FST represented by the bytes object.
Parameters: s (bytes) – The bytes object representing the FST. Returns: An FST object.
-
input_symbols
()¶ Returns the input symbol table.
Returns: The input symbol table. See Also:
output_symbols()
.
-
num_arcs
(state=None)¶ Returns the number of arcs, counting them if necessary.
If state is None, returns the number of arcs in the FST. Otherwise, returns the number of arcs leaving that state.
Parameters: state – The integer index of a state. Defaults to None.
Returns: The number of arcs leaving a state or the number of arcs in the FST.
Note: This method counts the number of arcs in the FST by iterating over the states and summing up the number of arcs leaving each state.
Raises: IndexError – State index out of range.
See also: num_states.
-
num_input_epsilons
(state)¶ Returns the number of arcs with epsilon input labels leaving a state.
Parameters: state – The integer index of a state. Returns: The number of epsilon-input-labeled arcs leaving that state. Raises: IndexError – State index out of range. See also: num_output_epsilons.
-
num_output_epsilons
(state)¶ Returns the number of arcs with epsilon output labels leaving a state.
Parameters: state – The integer index of a state. Returns: The number of epsilon-output-labeled arcs leaving that state. Raises: IndexError – State index out of range. See also: num_input_epsilons.
-
num_states
()¶ Returns the number of states, counting them if necessary.
Returns: The number of states. See also:
num_arcs
.
-
output_symbols
()¶ Returns the output symbol table.
Returns: The output symbol table. See Also:
input_symbols()
.
-
properties
(mask, test)¶ Provides property bits.
This method provides user access to the properties attributes for the FST. The resulting value is a long integer, but when it is cast to a boolean, it represents whether or not the FST has the mask property.
Parameters: - mask – The property mask to be compared to the FST’s properties.
- test – Should any unknown values be computed before comparing against the mask?
Returns: A 64-bit bitmask representing the requested properties.
-
read
(filename)¶ Reads an FST from a file.
Parameters: filename (str) – The location of the input file. Returns: An FST object. Raises: RuntimeError
– Read failed.
-
read_from_stream
(strm, ropts)¶ Reads an FST from an input stream.
Parameters: - strm (istream) – The input stream to read from.
- ropts (FstReadOptions) – FST reading options.
Returns: An FST object.
Raises: RuntimeError
– Read failed.
-
start
()¶ Returns the start state.
Returns: The start state if start state is set, -1 otherwise.
-
states
()¶ Returns an iterator over all states in the FST.
Returns: A StateIterator object for the FST. See also:
arcs
,mutable_arcs
.
-
text
(isymbols=None, osymbols=None, ssymbols=None, acceptor=False, show_weight_one=False, missing_symbol='')¶ Produces a human-readable string representation of the FST.
This method generates a human-readable string representation of the FST. The caller may optionally specify SymbolTables used to label input labels, output labels, or state labels, respectively.
Parameters: - isymbols – An optional symbol table used to label input symbols.
- osymbols – An optional symbol table used to label output symbols.
- ssymbols – An optional symbol table used to label states.
- acceptor (bool) – Should the FST be rendered in acceptor format if possible? Defaults False.
- show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults False.
- missing_symbol – The string to be printed when symbol table lookup fails.
Returns: A formatted string representing the FST.
-
to_bytes
()¶ Returns a bytes object representing the FST.
Returns: A bytes object.
-
type
()¶ Returns the FST type.
Returns: The FST type.
-
verify
()¶ Verifies that an FST’s contents are sane.
Returns: True if the contents are sane, False otherwise.
-
write
(filename)¶ Serializes FST to a file.
This method writes the FST to a file in a binary format.
Parameters: filename (str) – The location of the output file. Raises: IOError
– Write failed.
-
write_to_stream
(strm, wopts)¶ Serializes FST to an output stream.
Parameters: - strm (ostream) – The output stream to write to.
- wopts (FstWriteOptions) – FST writing options.
Returns: True if write was successful, False otherwise.
Raises: RuntimeError
– Write failed.
-
-
class
kaldi.fstext.
KwsIndexConstFstArcIterator
(fst, state)[source]¶ Arc iterator for a constant FST over the KWS index semiring.
This class is used for iterating over the arcs leaving some state. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the arcs method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.
Creates a new arc iterator.
Parameters: - fst – The fst.
- state – The state index.
Raises: IndexError – State index out of range.
-
done
()¶ Indicates whether the iterator is exhausted or not.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: True if the iterator is exhausted, False otherwise.
-
flags
()¶ Returns the current iterator behavioral flags.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: The current iterator behavioral flags as an integer.
-
next
()¶ Advances the iterator.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
position
()¶ Returns the position of the iterator.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: The iterator’s position, expressed as an integer.
-
reset
()¶ Resets the iterator to the initial position.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
seek
(a)¶ Advance the iterator to a new position.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Parameters: a (int) – The position to seek to.
-
set_flags
(flags, mask)¶ Sets the current iterator behavioral flags.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Parameters:
-
value
()¶ Returns the current arc.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
class
kaldi.fstext.
KwsIndexConstFstStateIterator
(fst)[source]¶ State iterator for a constant FST over the KWS index semiring.
This class is used for iterating over the states. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the states method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.
Creates a new state iterator.
Parameters: fst – The fst.
-
done
()¶ Indicates whether the iterator is exhausted or not.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: True if the iterator is exhausted, False otherwise.
-
next
()¶ Advances the iterator.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
reset
()¶ Resets the iterator to the initial position.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
value
()¶ Returns the current state index.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
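The relationship between the C++-style methods (done, next, reset, value) and the Python iterator protocol can be sketched in pure Python. The class below is a hypothetical stand-in that simply counts state indices; it is not the PyKaldi implementation and does not touch a real FST.

```python
class CountingStateIterator:
    """Hypothetical state iterator over the indices 0..num_states-1."""

    def __init__(self, num_states):
        self._num_states = num_states
        self._state = 0

    # C++-style API.
    def done(self):
        return self._state >= self._num_states

    def next(self):
        self._state += 1

    def reset(self):
        self._state = 0

    def value(self):
        return self._state

    # Pythonic API, built on top of the C++-style methods.
    def __iter__(self):
        while not self.done():
            yield self.value()
            self.next()


it = CountingStateIterator(3)
visited = list(it)  # drives done/value/next behind the scenes
```

Once the loop finishes the iterator is exhausted; calling reset returns it to the first state.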
-
-
class
kaldi.fstext.
KwsIndexEncodeMapper
(encode_labels=False, encode_weights=False, encode=True)[source]¶ Arc encoder for an FST over the KWS index semiring.
This class provides an object which can be used to encode or decode FST arcs. This is most useful for converting an FST to an unweighted acceptor, on which some FST operations are more efficient, and then decoding the FST afterwards.
To use an instance of this class to encode or decode a mutable FST, pass it as the first argument to the FST instance methods
encode
and
decode
. Alternatively, an instance of this class can be used as a callable to encode/decode arcs.
Parameters: -
flags
() → int¶ Returns encoder flags.
-
from_other
(mapper:KwsIndexEncodeMapper) → KwsIndexEncodeMapper¶ Creates a new encoder with the contents of another.
-
from_other_with_type
(mapper:KwsIndexEncodeMapper, type:EncodeType) → KwsIndexEncodeMapper¶ Creates a new encoder with the contents of another and given type.
-
input_symbols
() → SymbolTable¶ Returns input symbol table.
-
output_symbols
() → SymbolTable¶ Returns output symbol table.
-
properties
(inprops:int) → int¶ Provides property bits.
This method provides user access to the properties attributes for the encoder. The resulting value is a long integer, but when it is cast to a boolean, it represents whether or not the FST has the
mask
property.
Parameters: mask – The property mask to be compared to the encoder’s properties. Returns: A 64-bit bitmask representing the requested properties.
-
read
(filename:str, type:EncodeType=default) → KwsIndexEncodeMapper¶ Reads encoder from file.
-
set_input_symbols
(syms:SymbolTable)¶ Sets the input symbol table.
Parameters: syms – A SymbolTable. See also:
set_output_symbols
.
-
set_output_symbols
(syms:SymbolTable)¶ Sets the output symbol table.
Parameters: syms – A SymbolTable. See also:
set_input_symbols
.
-
type
() → EncodeType¶ Returns encoder type.
-
write
(filename:str) → bool¶ Writes encoder to file.
Returns: True if write was successful, False otherwise.
-
-
class
kaldi.fstext.
KwsIndexEncodeTable
¶ Encode table for KwsIndexArc.
- KwsIndexEncodeTable(flags):
- Creates a new encode table with the given flags.
-
decode
(key:int) → Tuple¶ Decodes an encoded arc label back to labels and cost.
-
encode
(arc:KwsIndexArc) → int¶ Encodes the given arc (either labels or weights or both).
-
flags
() → int¶ Returns encoding flags.
-
get_label
(arc:KwsIndexArc) → int¶ Looks up the encoded label for the given arc.
Returns -1 if arc is not found.
-
input_symbols
() → SymbolTable¶ Returns input symbols.
-
output_symbols
() → SymbolTable¶ Returns output symbols.
-
read
(strm:istream, source:str) → KwsIndexEncodeTable¶ Reads encode table from input stream.
-
set_input_symbols
(syms:SymbolTable)¶ Sets input symbols.
-
set_output_symbols
(syms:SymbolTable)¶ Sets output symbols.
-
size
() → int¶ Returns the size of the table.
-
write
(strm:ostream, source:str) → bool¶ Writes table to output stream.
-
class
kaldi.fstext.
KwsIndexFstCompiler
(isymbols=None, osymbols=None, ssymbols=None, acceptor=False, keep_isymbols=False, keep_osymbols=False, keep_state_numbering=False, allow_negative_labels=False)[source]¶ Compiler for FSTs over the KWS index semiring.
This class is used to compile FSTs specified using the AT&T FSM library format described here:
http://web.eecs.umich.edu/~radev/NLP-fall2015/resources/fsm_archive/fsm.5.html
This is the same format used by the
fstcompile
executable.
FstCompiler options (symbol tables, etc.) are set at construction time:
compiler = FstCompiler(isymbols=ascii_syms, osymbols=ascii_syms)
Once constructed, FstCompiler instances behave like a file handle opened for writing:
# /ba+/
print("0 1 50 50", file=compiler)
print("1 2 49 49", file=compiler)
print("2 2 49 49", file=compiler)
print("2", file=compiler)
The
compile
method returns an actual FST instance:
sheep_machine = compiler.compile()
Compilation flushes the internal buffer, so the compiler instance can be reused to compile new machines with the same symbol tables, etc.
Parameters: - isymbols – An optional SymbolTable used to label input symbols.
- osymbols – An optional SymbolTable used to label output symbols.
- ssymbols – An optional SymbolTable used to label states.
- acceptor – Should the FST be rendered in acceptor format if possible?
- keep_isymbols – Should the input symbol table be stored in the FST?
- keep_osymbols – Should the output symbol table be stored in the FST?
- keep_state_numbering – Should the state numbering be preserved?
- allow_negative_labels – Should negative labels be allowed? (Not recommended; may cause conflicts).
-
compile
()¶ Compiles the FST in the string buffer.
This method compiles the FST and returns the resulting machine.
Returns: The FST described by the string buffer. Raises: RuntimeError
– Compilation failed.
-
write
(expression)¶ Writes a string into the compiler string buffer.
This method adds a line to the compiler string buffer. It can also be invoked with a print call, like so:
compiler = FstCompiler()
print("0 0 49 49", file=compiler)
print("0", file=compiler)
Parameters: expression – A string expression to add to compiler string buffer.
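The file-handle behaviour relies only on the Python convention that print(..., file=obj) calls obj.write. The pure-Python sketch below shows that mechanism with a buffer that is emptied when its text is retrieved, in the same spirit as compilation flushing the compiler buffer. The class LineBuffer and its flush_text method are hypothetical names used for illustration; they are not part of PyKaldi.

```python
class LineBuffer:
    """Hypothetical stand-in for a compiler-style string buffer."""

    def __init__(self):
        self._chunks = []

    def write(self, expression):
        # print(..., file=self) delivers both the text and the trailing
        # newline through this method.
        self._chunks.append(expression)

    def flush_text(self):
        # Return everything written so far and empty the buffer so the
        # same object can be reused.
        text = "".join(self._chunks)
        self._chunks = []
        return text


buf = LineBuffer()
# /ba+/ in AT&T FSM text format: "src dst ilabel olabel", then a final state.
print("0 1 50 50", file=buf)
print("1 2 49 49", file=buf)
print("2 2 49 49", file=buf)
print("2", file=buf)
text = buf.flush_text()
```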
-
class
kaldi.fstext.
KwsIndexVectorFst
(fst=None)[source]¶ Vector FST over the KWS index semiring.
Parameters: fst (KwsIndexFst) – The input FST over the KWS index semiring. If provided, its contents are used for initializing the new FST. Defaults to None
.
-
add_arc
(state, arc)¶ Adds a new arc to the FST and returns self.
Parameters: - state – The integer index of the source state.
- arc – The arc to add.
Returns: self.
Raises: IndexError
– State index out of range.
See also:
add_state
.
-
add_state
()¶ Adds a new state to the FST and returns the state ID.
Returns: The integer index of the new state.
-
arcs
(state)¶ Returns an iterator over arcs leaving the specified state.
Parameters: state – The source state index. Returns: An ArcIterator. See also:
mutable_arcs
,states
.
-
arcsort
(sort_type='ilabel')¶ Sorts arcs leaving each state of the FST.
This operation destructively sorts arcs leaving each state using either input or output labels.
Parameters: sort_type – Either “ilabel” (sort arcs according to input labels) or “olabel” (sort arcs according to output labels). Returns: self. Raises: ValueError
– Unknown sort type.
See also:
topsort
.
-
closure
(closure_plus=False)¶ Computes concatenative closure.
This operation destructively converts the FST to its concatenative closure. If A transduces string x to y with weight a, then the closure transduces x to y with weight a, xx to yy with weight a otimes a, xxx to yyy with weight a otimes a otimes a, and so on. The empty string is also transduced to itself with semiring One if
closure_plus
is False.
Parameters: closure_plus – If True, do not accept the empty string. Returns: self.
-
concat
(ifst)¶ Computes the concatenation (product) of two FSTs.
This operation destructively concatenates the FST with a second FST. If A transduces string x to y with weight a and B transduces string w to v with weight b, then their concatenation transduces string xw to yv with weight a otimes b.
Parameters: ifst – The second input FST. Returns: self.
-
connect
()¶ Removes unsuccessful paths.
This operation destructively trims the FST, removing states and arcs that are not part of any successful path.
Returns: self.
-
copy
()¶ Makes a copy of the FST.
Returns: A copy of the FST.
-
decode
(encoder)¶ Decodes encoded labels and/or weights.
This operation reverses the encoding performed by
encode
.
Parameters: encoder – An EncodeMapper object used to encode the FST. Returns: self. See also:
encode
.
-
delete_arcs
(state, n=None)¶ Deletes arcs leaving a particular state.
Parameters: - state – The integer index of a state.
- n – An optional argument indicating how many arcs to be deleted. If this argument is None, all arcs from this state are deleted.
Returns: self.
Raises: IndexError
– State index out of range.
See also:
delete_states
.
-
delete_states
(states=None)¶ Deletes states.
Parameters: states – An optional iterable of integer indices of the states to be deleted. If this argument is omitted, all states are deleted. Returns: self. Raises: IndexError
– State index out of range.
See also:
delete_arcs
.
-
draw
(filename, isymbols=None, osymbols=None, ssymbols=None, acceptor=False, title='', width=8.5, height=11, portrait=False, vertical=False, ranksep=0.4, nodesep=0.25, fontsize=14, precision=5, float_format='g', show_weight_one=False)¶ Writes out the FST in Graphviz text format.
This method writes out the FST in the dot graph description language. The graph can be rendered using the
dot
binary provided by Graphviz.
Parameters: - filename (str) – The string location of the output dot/Graphviz file.
- isymbols – An optional symbol table used to label input symbols.
- osymbols – An optional symbol table used to label output symbols.
- ssymbols – An optional symbol table used to label states.
- acceptor (bool) – Should the figure be rendered in acceptor format if possible? Defaults False.
- title (str) – An optional string indicating the figure title. Defaults to empty string.
- width (float) – The figure width, in inches. Defaults to 8.5.
- height (float) – The figure height, in inches. Defaults to 11.
- portrait (bool) – Should the figure be rendered in portrait rather than landscape? Defaults False.
- vertical (bool) – Should the figure be rendered bottom-to-top rather than left-to-right?
- ranksep (float) – The minimum separation between ranks, in inches. Defaults to 0.4.
- nodesep (float) – The minimum separation between nodes, in inches. Defaults to 0.25.
- fontsize (int) – Font size, in points. Defaults 14pt.
- precision (int) – Numeric precision for floats, in number of chars. Defaults to 5.
- float_format ('e', 'f' or 'g') – One of: ‘e’, ‘f’ or ‘g’. Defaults to ‘g’
- show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults False.
For more information about the rendering options, see
man dot
.
See also:
text
.
-
encode
(encoder)¶ Encodes labels and/or weights.
This operation allows for the representation of a weighted transducer as a weighted acceptor, an unweighted transducer, or an unweighted acceptor by considering the pair (input label, output label), the pair (input label, weight), or the triple (input label, output label, weight) as a single label. Applying this operation mutates the EncodeMapper argument, which can then be used to decode.
Parameters: encoder – An EncodeMapper object used to encode the FST. Returns: self. See also:
decode
.
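The idea of treating a pair of labels as a single label can be illustrated without PyKaldi. In the sketch below, arcs are plain (ilabel, olabel, weight, nextstate) tuples and the encoder is an ordinary dict; both representations and the function names are assumptions made for illustration, not the library's data structures. As with the real operation, encoding mutates the table, which is then used for decoding.

```python
def encode_labels(arcs, table):
    """Replace each (ilabel, olabel) pair by one label, growing `table`."""
    encoded = []
    for ilabel, olabel, weight, nextstate in arcs:
        pair = (ilabel, olabel)
        if pair not in table:
            # Start at 1 because label 0 conventionally denotes epsilon.
            table[pair] = len(table) + 1
        label = table[pair]
        # The result is an acceptor: input and output labels coincide.
        encoded.append((label, label, weight, nextstate))
    return encoded


def decode_labels(arcs, table):
    """Invert encode_labels using the table it filled in."""
    inverse = {label: pair for pair, label in table.items()}
    return [(*inverse[ilabel], weight, nextstate)
            for ilabel, _, weight, nextstate in arcs]


arcs = [(1, 2, 0.5, 1), (1, 2, 1.0, 2), (3, 3, 0.0, 2)]
table = {}
encoded = encode_labels(arcs, table)
decoded = decode_labels(encoded, table)
```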
-
final
(state)¶ Returns the final weight of a state.
Parameters: state – The integer index of a state. Returns: The final Weight of that state. Raises: IndexError
– State index out of range.
-
from_bytes
(s)¶ Returns the FST represented by the bytes object.
Parameters: s (bytes) – The bytes object representing the FST. Returns: An FST object.
-
input_symbols
()¶ Returns the input symbol table.
Returns: The input symbol table. See Also:
output_symbols()
.
-
invert
()¶ Inverts the FST’s transduction.
This operation destructively inverts the FST’s transduction by exchanging input and output labels.
Returns: self.
-
minimize
(delta=0.0009765625, allow_nondet=False)¶ Minimizes the FST.
This operation destructively performs the minimization of deterministic weighted automata and transducers. If the input FST A is an acceptor, this operation produces the minimal acceptor B equivalent to A, i.e. the acceptor with a minimal number of states that is equivalent to A. If the input FST A is a transducer, this operation internally builds an equivalent transducer with a minimal number of states. However, this minimality is obtained by allowing transitions to have strings of symbols as output labels; this is known in the literature as a real-time transducer. Such transducers are not directly supported by the library. This function will convert such transducers by expanding each string-labeled transition into a sequence of transitions. This will result in the creation of new states, hence losing the minimality property.
Parameters: - delta – Comparison/quantization delta (default: 0.0009765625).
- allow_nondet – Attempt minimization of non-deterministic FST?
Returns: self.
-
mutable_arcs
(state)¶ Returns a mutable iterator over arcs leaving the specified state.
Parameters: state – The source state index. Returns: A MutableArcIterator.
-
num_arcs
(state=None)¶ Returns the number of arcs, counting them if necessary.
If state is
None
, returns the number of arcs in the FST. Otherwise, returns the number of arcs leaving that state.
Parameters: state – The integer index of a state. Defaults to None
.
Returns: The number of arcs leaving a state or the number of arcs in the FST.
Note: This method counts the number of arcs in the FST by iterating over the states and summing up the number of arcs leaving each state.
Raises: IndexError
– State index out of range.
See also:
num_states
.
-
num_input_epsilons
(state)¶ Returns the number of arcs with epsilon input labels leaving a state.
Parameters: state – The integer index of a state. Returns: The number of epsilon-input-labeled arcs leaving that state. Raises: IndexError
– State index out of range.
See also:
num_output_epsilons
.
-
num_output_epsilons
(state)¶ Returns the number of arcs with epsilon output labels leaving a state.
Parameters: state – The integer index of a state. Returns: The number of epsilon-output-labeled arcs leaving that state. Raises: IndexError
– State index out of range.
See also:
num_input_epsilons
.
-
num_states
()¶ Returns the number of states, counting them if necessary.
Returns: The number of states. See also:
num_arcs
.
-
output_symbols
()¶ Returns the output symbol table.
Returns: The output symbol table. See Also:
input_symbols()
.
-
project
(project_output=False)¶ Converts the FST to an acceptor using input or output labels.
This operation destructively projects an FST onto its domain or range by either copying each arc’s input label to its output label (the default) or vice versa.
Parameters: project_output – Project onto output labels? Returns: self. See also:
decode
,encode
,relabel
,relabel_tables
.
-
properties
(mask, test)¶ Provides property bits.
This method provides user access to the properties attributes for the FST. The resulting value is a long integer, but when it is cast to a boolean, it represents whether or not the FST has the
mask
property.
Parameters: - mask – The property mask to be compared to the FST’s properties.
- test – Should any unknown values be computed before comparing against the mask?
Returns: A 64-bit bitmask representing the requested properties.
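The remark about casting the result to a boolean can be made concrete with a small bitmask sketch. The constants below are hypothetical bit positions chosen for illustration; the real property constants are defined by the library and have different values.

```python
# Hypothetical property bits; the real constants are defined by the library.
ACCEPTOR = 1 << 0
ACYCLIC = 1 << 1
WEIGHTED = 1 << 2


def query(props, mask):
    """Return only the bits of `props` selected by `mask`."""
    return props & mask


fst_props = ACCEPTOR | WEIGHTED  # pretend these bits are set on some FST
has_acceptor = bool(query(fst_props, ACCEPTOR))  # a set bit gives True
has_acyclic = bool(query(fst_props, ACYCLIC))    # an unset bit gives False
```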
-
prune
(weight=None, nstate=-1, delta=0.0009765625)¶ Removes paths with weights below a certain threshold.
This operation deletes states and arcs in the input FST that do not belong to a successful path whose weight is no more (w.r.t the natural semiring order) than the threshold otimes the weight of the shortest path in the input FST. Weights must be commutative and have the path property.
Parameters: - weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired weight threshold below which paths are pruned; if None, no paths are pruned.
- nstate – State number threshold (default: -1).
- delta – Comparison/quantization delta (default: 0.0009765625).
Returns: self.
See also: The constructive variant.
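The pruning criterion can be illustrated in the tropical semiring, where otimes is ordinary addition and the natural order is numeric less-than-or-equal. The sketch below applies the criterion to an explicit dictionary of path weights, which is a simplifying assumption: the actual operation works on states and arcs rather than enumerated paths, and the function name is hypothetical.

```python
def prune_paths(path_weights, threshold):
    """Keep paths whose weight is at most (shortest path weight + threshold).

    Tropical semiring: `otimes` is +, and smaller weights are better.
    """
    best = min(path_weights.values())
    return {path: w for path, w in path_weights.items()
            if w <= best + threshold}


path_weights = {"a": 1.0, "b": 2.5, "c": 4.0}
kept = prune_paths(path_weights, threshold=2.0)  # cutoff is 1.0 + 2.0 = 3.0
```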
-
push
(to_final=False, delta=0.0009765625, remove_total_weight=False)¶ Pushes weights towards the initial or final states.
This operation destructively produces an equivalent transducer by pushing the weights towards the initial state or toward the final states. When pushing weights towards the initial state, the sum of the weight of the outgoing transitions and final weight at any non-initial state is equal to one in the resulting machine. When pushing weights towards the final states, the sum of the weight of the incoming transitions at any state is equal to one. Weights need to be left distributive when pushing towards the initial state and right distributive when pushing towards the final states.
Parameters: - to_final – Push towards final states?
- delta – Comparison/quantization delta (default: 0.0009765625).
- remove_total_weight – If pushing weights, should the total weight be removed?
Returns: self.
See also: The constructive variant, which also supports label pushing.
-
read
(filename)¶ Reads an FST from a file.
Parameters: filename (str) – The location of the input file. Returns: An FST object. Raises: RuntimeError
– Read failed.
-
read_from_stream
(strm, ropts)¶ Reads an FST from an input stream.
Parameters: - strm (istream) – The input stream to read from.
- ropts (FstReadOptions) – FST reading options.
Returns: An FST object.
Raises: RuntimeError
– Read failed.
-
relabel
(ipairs=None, opairs=None)¶ Replaces input and/or output labels using pairs of labels.
This operation destructively relabels the input and/or output labels of the FST using pairs of the form (old_ID, new_ID); omitted indices are identity-mapped.
Parameters: - ipairs – An iterable containing (old index, new index) integer pairs.
- opairs – An iterable containing (old index, new index) integer pairs.
Returns: self.
Raises: ValueError
– No relabeling pairs specified.
See also:
decode
,encode
,project
,relabel_tables
.
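The identity mapping of omitted indices can be shown with a plain list of labels. This is a sketch of the mapping rule only; the function name is hypothetical and no FST is involved.

```python
def relabel_labels(labels, pairs):
    """Apply (old_ID, new_ID) pairs; labels without a pair are unchanged."""
    mapping = dict(pairs)
    return [mapping.get(label, label) for label in labels]


relabeled = relabel_labels([1, 2, 3, 2], pairs=[(2, 5)])
```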
-
relabel_tables
(old_isymbols=None, new_isymbols=None, unknown_isymbol='', attach_new_isymbols=True, old_osymbols=None, new_osymbols=None, unknown_osymbol='', attach_new_osymbols=True)¶ Replaces input and/or output labels using SymbolTables.
This operation destructively relabels the input and/or output labels of the FST using user-specified symbol tables; omitted symbols are identity-mapped.
Parameters: - old_isymbols – The old SymbolTable for input labels, defaulting to the FST’s input symbol table.
- new_isymbols – A SymbolTable used to relabel the input labels
- unknown_isymbol – Input symbol to use to relabel OOVs (if empty, OOVs raise an exception).
- attach_new_isymbols – Should new_isymbols be made the FST’s input symbol table?
- old_osymbols – The old SymbolTable for output labels, defaulting to the FST’s output symbol table.
- new_osymbols – A SymbolTable used to relabel the output labels.
- unknown_osymbol – Output symbol to use to relabel OOVs (if empty, OOVs raise an exception).
- attach_new_osymbols – Should new_osymbols be made the FST’s output symbol table?
Returns: self.
Raises: ValueError
– No SymbolTable specified.
-
reserve_arcs
(state, n)¶ Reserve n arcs at a particular state (best effort).
Parameters: - state – The integer index of a state.
- n – The number of arcs to reserve.
Returns: self.
Raises: IndexError
– State index out of range.
See also:
reserve_states
.
-
reserve_states
(n)¶ Reserve n states (best effort).
Parameters: n – The number of states to reserve. Returns: self. See also:
reserve_arcs
.
-
reweight
(potentials, to_final=False)¶ Reweights an FST using an iterable of potentials.
This operation destructively reweights an FST according to the potentials and in the direction specified by the user. An arc of weight w, with an origin state of potential p and destination state of potential q, is reweighted by p^{-1} otimes (w otimes q) when reweighting towards the initial state, and by (p otimes w) otimes q^{-1} when reweighting towards the final states. The weights must be left distributive when reweighting towards the initial state and right distributive when reweighting towards the final states (e.g., TropicalWeight and LogWeight).
Parameters: - potentials – An iterable of TropicalWeights.
- to_final – Push towards final states?
Returns: self.
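The arc formula for reweighting towards the initial state can be checked numerically in the tropical semiring, where otimes is addition and the inverse is negation. In the sketch below, arcs are (src, dst, weight) tuples; dividing each final weight by its state potential is an assumption not stated in the text above, and the start potential is chosen as semiring One (0.0) so that the total path weight stays the same. The function name is hypothetical.

```python
def reweight_to_initial(arcs, finals, potentials):
    """Tropical reweighting: w -> p^{-1} otimes (w otimes q) = -p + (w + q)."""
    new_arcs = [(src, dst, -potentials[src] + (w + potentials[dst]))
                for src, dst, w in arcs]
    # Assumption: final weights are divided by the potential of their state.
    new_finals = {s: -potentials[s] + w for s, w in finals.items()}
    return new_arcs, new_finals


arcs = [(0, 1, 1.0), (1, 2, 3.0)]   # a single path 0 -> 1 -> 2
finals = {2: 0.5}
potentials = [0.0, 2.0, 5.0]        # start potential is One (0.0)
new_arcs, new_finals = reweight_to_initial(arcs, finals, potentials)

old_total = sum(w for _, _, w in arcs) + finals[2]
new_total = sum(w for _, _, w in new_arcs) + new_finals[2]
```

The intermediate potentials cancel along the path, which is why the individual arc weights change while the path weight does not.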
-
rmepsilon
(connect=True, weight=None, nstate=-1, delta=0.0009765625)¶ Removes epsilon transitions.
This operation destructively removes epsilon transitions (i.e., those where both input and output labels are epsilon) from an FST.
Parameters: - connect – Should output be trimmed?
- weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired weight threshold below which paths are pruned; if None, no paths are pruned.
- nstate – State number threshold (default: -1).
- delta – Comparison/quantization delta (default: 0.0009765625).
Returns: self.
See also: The constructive variant, which also supports epsilon removal in reverse (and which may be more efficient).
-
set_final
(state, weight=None)¶ Sets the final weight for a state.
Parameters: - state – The integer index of a state.
- weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired final weight; if omitted, it is set to semiring One.
Raises: IndexError
– State index out of range.
See also:
set_start
.
-
set_input_symbols
(syms)¶ Sets the input symbol table.
Passing
None
as a value will delete the input symbol table.
Parameters: syms – A SymbolTable. Returns: self. See also:
set_output_symbols
.
-
set_output_symbols
(syms)¶ Sets the output symbol table.
Passing
None
as a value will delete the output symbol table.
Parameters: syms – A SymbolTable. Returns: self. See also:
set_input_symbols
.
-
set_properties
(props, mask)¶ Sets the properties bits.
Parameters: Returns: self.
-
set_start
(state)¶ Sets the initial state.
Parameters: state – The integer index of a state. Returns: self. Raises: IndexError
– State index out of range.
See also:
set_final
.
-
start
()¶ Returns the start state.
Returns: The start state if start state is set, -1 otherwise.
-
states
()¶ Returns an iterator over all states in the FST.
Returns: A StateIterator object for the FST. See also:
arcs
,mutable_arcs
.
-
text
(isymbols=None, osymbols=None, ssymbols=None, acceptor=False, show_weight_one=False, missing_symbol='')¶ Produces a human-readable string representation of the FST.
This method generates a human-readable string representation of the FST. The caller may optionally specify SymbolTables used to label input labels, output labels, or state labels, respectively.
Parameters: - isymbols – An optional symbol table used to label input symbols.
- osymbols – An optional symbol table used to label output symbols.
- ssymbols – An optional symbol table used to label states.
- acceptor (bool) – Should the FST be rendered in acceptor format if possible? Defaults False.
- show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults False.
- missing_symbol – The string to be printed when symbol table lookup fails.
Returns: A formatted string representing the FST.
-
to_bytes
()¶ Returns a bytes object representing the FST.
Returns: A bytes object.
-
topsort
()¶ Sorts transitions by state IDs.
This operation destructively topologically sorts the FST if it is acyclic; otherwise it remains unchanged. Once sorted, all transitions are from lower state IDs to higher state IDs.
Returns: self. See also:
arcsort
.
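The property that every transition goes from a lower to a higher state ID after sorting can be demonstrated on a bare list of (src, dst) arcs. The sketch uses Kahn's algorithm, which is swapped in here for illustration; the algorithm used by the library is not described in this section. Returning None for cyclic input stands in for leaving the FST unchanged, and the function name is hypothetical.

```python
from collections import deque


def topological_order(num_states, arcs):
    """Map old state IDs to new ones via Kahn's algorithm; None if cyclic."""
    indegree = [0] * num_states
    successors = [[] for _ in range(num_states)]
    for src, dst in arcs:
        successors[src].append(dst)
        indegree[dst] += 1

    queue = deque(s for s in range(num_states) if indegree[s] == 0)
    new_id = {}
    while queue:
        state = queue.popleft()
        new_id[state] = len(new_id)  # next free ID in topological order
        for dst in successors[state]:
            indegree[dst] -= 1
            if indegree[dst] == 0:
                queue.append(dst)

    # States on a cycle never reach indegree 0, so they stay unnumbered.
    return new_id if len(new_id) == num_states else None


arcs = [(2, 0), (0, 1), (2, 1)]
order = topological_order(3, arcs)
cyclic = topological_order(2, [(0, 1), (1, 0)])
```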
-
type
()¶ Returns the FST type.
Returns: The FST type.
-
union
(ifst)¶ Computes the union (sum) of two FSTs.
This operation computes the union (sum) of two FSTs. If A transduces string x to y with weight a and B transduces string w to v with weight b, then their union transduces x to y with weight a and w to v with weight b.
Parameters: ifst – The second input FST. Returns: self.
-
verify
()¶ Verifies that an FST’s contents are sane.
Returns: True if the contents are sane, False otherwise.
-
write
(filename)¶ Serializes FST to a file.
This method writes the FST to a file in a binary format.
Parameters: filename (str) – The location of the output file. Raises: IOError
– Write failed.
-
write_to_stream
(strm, wopts)¶ Serializes FST to an output stream.
Parameters: - strm (ostream) – The output stream to write to.
- wopts (FstWriteOptions) – FST writing options.
Returns: True if write was successful, False otherwise.
Raises: RuntimeError
– Write failed.
-
-
class
kaldi.fstext.
KwsIndexVectorFstArcIterator
(fst, state)[source]¶ Arc iterator for a vector FST over the KWS index semiring.
This class is used for iterating over the arcs leaving some state. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the
arcs
method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.
Creates a new arc iterator.
Parameters: - fst – The fst.
- state – The state index.
Raises: IndexError
– State index out of range.
-
done
()¶ Indicates whether the iterator is exhausted or not.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: True if the iterator is exhausted, False otherwise.
-
flags
()¶ Returns the current iterator behavioral flags.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: The current iterator behavioral flags as an integer.
-
next
()¶ Advances the iterator.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
position
()¶ Returns the position of the iterator.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: The iterator’s position, expressed as an integer.
-
reset
()¶ Resets the iterator to the initial position.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
seek
(a)¶ Advance the iterator to a new position.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Parameters: a (int) – The position to seek to.
-
set_flags
(flags, mask)¶ Sets the current iterator behavioral flags.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Parameters:
-
value
()¶ Returns the current arc.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
class
kaldi.fstext.
KwsIndexVectorFstMutableArcIterator
(fst, state)[source]¶ Mutable arc iterator for a vector FST over the KWS index semiring.
This class is used for iterating over the arcs leaving some state and optionally replacing them with new ones. In addition to the full C++ API, it also supports the iterator protocol. Calling the
__iter__
method of a mutable arc iterator object returns an iterator over
(arc, setter)
pairs. The
setter
is a bound method of the mutable arc iterator object that can be used to replace the current arc with a new one. Most users should just call the
mutable_arcs
method of a vector FST object instead of directly constructing this iterator and take advantage of the Pythonic API, e.g.
for arc, setter in fst.mutable_arcs(0):
    setter(KwsIndexArc(arc.ilabel, 0, arc.weight, arc.nextstate))
Creates a new arc iterator.
Parameters: - fst – The fst.
- state – The state index.
Raises: IndexError
– State index out of range.
-
done
()¶ Indicates whether the iterator is exhausted or not.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: True if the iterator is exhausted, False otherwise.
-
flags
()¶ Returns the current iterator behavioral flags.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: The current iterator behavioral flags as an integer.
-
next
()¶ Advances the iterator.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
position
()¶ Returns the position of the iterator.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: The iterator’s position, expressed as an integer.
-
reset
()¶ Resets the iterator to the initial position.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
seek
(a)¶ Advance the iterator to a new position.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Parameters: a (int) – The position to seek to.
-
set_flags
(flags, mask)¶ Sets the current iterator behavioral flags.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Parameters:
-
set_value
(arc)¶ Replace the current arc with a new arc.
Parameters: arc – The arc to replace the current arc with.
-
value
()¶ Returns the current arc.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
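The (arc, setter) iteration pattern described for the mutable arc iterator can be mimicked in pure Python. In the sketch below, the container MutableItems is hypothetical and arcs are plain (ilabel, olabel, weight, nextstate) tuples rather than arc objects; it only shows how a setter tied to the current position lets a loop replace items in place.

```python
class MutableItems:
    """Hypothetical container whose iterator yields (item, setter) pairs."""

    def __init__(self, items):
        self.items = list(items)

    def __iter__(self):
        for index, item in enumerate(self.items):
            # Bind the current index so the setter replaces this item only.
            def setter(new_item, _index=index):
                self.items[_index] = new_item
            yield item, setter


arcs = MutableItems([(1, 7, 0.5, 1), (2, 8, 1.5, 2)])
for arc, setter in arcs:
    # Replace each output label with 0, keeping everything else.
    setter((arc[0], 0, arc[2], arc[3]))
```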
-
class
kaldi.fstext.
KwsIndexVectorFstStateIterator
(fst)[source]¶ State iterator for a vector FST over the KWS index semiring.
This class is used for iterating over the states. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the
states
method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.
Creates a new state iterator.
Parameters: fst – The fst.
-
done
()¶ Indicates whether the iterator is exhausted or not.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: True if the iterator is exhausted, False otherwise.
-
next
()¶ Advances the iterator.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
reset
()¶ Resets the iterator to the initial position.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
value
()¶ Returns the current state index.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
-
class
kaldi.fstext.
KwsIndexWeight
[source]¶ KWS index weight factory.
This class is used for creating new
KwsIndexWeight
instances.
- KwsIndexWeight():
- Creates an uninitialized
KwsIndexWeight
instance.
- KwsIndexWeight(weight):
- Creates a new
KwsIndexWeight
instance initialized with the weight.
Parameters: weight (Tuple[float, Tuple[float, float]] or Tuple[TropicalWeight, KwsTimeWeight] or KwsIndexWeight) – A pair of weight values or another KwsIndexWeight
instance.
- KwsIndexWeight(weight1, weight2):
- Creates a new
KwsIndexWeight
instance initialized with the weights.
Parameters: - weight1 (float or TropicalWeight) – The first weight value.
- weight2 (Tuple[float, float] or KwsTimeWeight) – The second weight value.
-
from_components
(w1:TropicalWeight, w2:KwsTimeWeight) → KwsIndexWeight¶ Creates a new KWS index weight from component weights.
-
member
() → bool¶ Checks if weight is a member of the KWS index semiring.
-
no_weight
() → KwsIndexWeight¶ No weight in KWS index semiring.
-
one
() → KwsIndexWeight¶ One in KWS index semiring.
-
properties
() → int¶ Returns weight properties.
-
quantize
(delta:float=default) → KwsIndexWeight¶ Quantizes the weight.
-
reverse
() → KwsIndexWeight¶ Reverses the weight.
-
type
() → str¶ Returns weight type.
-
value1
¶ The first component weight.
-
value2
¶ The second component weight.
-
zero
() → KwsIndexWeight¶ Zero in KWS index semiring.
-
class
kaldi.fstext.
KwsTimeWeight
[source]¶ KWS time weight factory.
This class is used for creating new
KwsTimeWeight
instances.
- KwsTimeWeight():
- Creates an uninitialized
KwsTimeWeight
instance.
- KwsTimeWeight(weight):
- Creates a new
KwsTimeWeight
instance initialized with the weight.
Parameters: weight (Tuple[float, float] or KwsTimeWeight) – A pair of weight values or another KwsTimeWeight instance.
- KwsTimeWeight(weight1, weight2):
- Creates a new
KwsTimeWeight
instance initialized with the weights.
Parameters: -
from_components
(w1:TropicalWeight, w2:TropicalWeight) → KwsTimeWeight¶ Creates a new KWS time weight from component weights.
-
member
() → bool¶ Checks if weight is a member of the KWS time semiring.
-
no_weight
() → KwsTimeWeight¶ No weight in the KWS time semiring.
-
one
() → KwsTimeWeight¶ One in the KWS time semiring.
-
properties
() → int¶ Returns weight properties.
-
quantize
(delta:float=default) → KwsTimeWeight¶ Quantizes the weight.
-
reverse
() → KwsTimeWeight¶ Reverses the weight.
-
type
() → str¶ Returns weight type.
-
value1
¶ The first component weight.
-
value2
¶ The second component weight.
-
zero
() → KwsTimeWeight¶ Zero in the KWS time semiring.
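The quantize entries of the two KWS weight classes above round a weight onto a grid of spacing delta. The following is a minimal pure-Python sketch of that idea, not PyKaldi code. It assumes the behaviour follows OpenFst's floating-point weights (round to the nearest multiple of delta, applied component by component for pair weights) and assumes a default delta of 1/1024; the helper names are hypothetical.

```python
import math

DEFAULT_DELTA = 1.0 / 1024  # assumed default, i.e. 0.0009765625


def quantize_scalar(value, delta=DEFAULT_DELTA):
    """Round a float weight to the nearest multiple of delta."""
    if math.isinf(value) or math.isnan(value):
        return value  # infinities and NaN are returned unchanged
    return math.floor(value / delta + 0.5) * delta


def quantize_weight(weight, delta=DEFAULT_DELTA):
    """Quantize a float or a (possibly nested) tuple of floats."""
    if isinstance(weight, tuple):
        return tuple(quantize_weight(w, delta) for w in weight)
    return quantize_scalar(weight, delta)


# A weight shaped like the KWS index weight: (float, (float, float)).
coarse = quantize_weight((1.1, (2.6, float("inf"))), delta=0.25)
```

Quantizing before comparison is what lets two weights that differ only by floating-point noise be treated as equal.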
-
class
kaldi.fstext.
LatticeArc
[source]¶ FST arc with lattice weight.
- LatticeArc():
- Creates an uninitialized LatticeArc instance.
- LatticeArc(ilabel, olabel, weight, nextstate):
- Creates a new LatticeArc instance initialized with the given arguments.
Parameters: - ilabel (int) – The input label.
- olabel (int) – The output label.
- weight (LatticeWeight) – The arc weight.
- nextstate (int) – The destination state for the arc.
-
from_attrs
(ilabel:int, olabel:int, weight:LatticeWeight, nextstate:int) → LatticeArc¶ Creates a new arc with the given attributes.
Parameters: - ilabel (int) – The input label.
- olabel (int) – The output label.
- weight (LatticeWeight) – The arc weight.
- nextstate (int) – The destination state for the arc.
-
ilabel
¶ int – The input label.
-
nextstate
¶ int – The destination state for the arc.
-
olabel
¶ int – The output label.
-
type
() → str¶ Returns arc type.
-
weight
¶ LatticeWeight – The arc weight.
-
class
kaldi.fstext.
LatticeConstFst
(fst=None)[source]¶ Constant FST over the lattice semiring.
Parameters: fst (LatticeFst) – The input FST over the lattice semiring. If provided, its contents are used for initializing the new FST. Defaults to None.
-
arcs
(state)¶ Returns an iterator over arcs leaving the specified state.
Parameters: state – The source state index.
Returns: An ArcIterator.
See also: mutable_arcs, states.
-
copy
()¶ Makes a copy of the FST.
Returns: A copy of the FST.
-
draw
(filename, isymbols=None, osymbols=None, ssymbols=None, acceptor=False, title='', width=8.5, height=11, portrait=False, vertical=False, ranksep=0.4, nodesep=0.25, fontsize=14, precision=5, float_format='g', show_weight_one=False)¶ Writes out the FST in Graphviz text format.
This method writes out the FST in the dot graph description language. The graph can be rendered using the dot binary provided by Graphviz.
Parameters: - filename (str) – The location of the output dot/Graphviz file.
- isymbols – An optional symbol table used to label input symbols.
- osymbols – An optional symbol table used to label output symbols.
- ssymbols – An optional symbol table used to label states.
- acceptor (bool) – Should the figure be rendered in acceptor format if possible? Defaults to False.
- title (str) – An optional string indicating the figure title. Defaults to the empty string.
- width (float) – The figure width, in inches. Defaults to 8.5.
- height (float) – The figure height, in inches. Defaults to 11.
- portrait (bool) – Should the figure be rendered in portrait rather than landscape? Defaults to False.
- vertical (bool) – Should the figure be rendered bottom-to-top rather than left-to-right? Defaults to False.
- ranksep (float) – The minimum separation between ranks, in inches. Defaults to 0.4.
- nodesep (float) – The minimum separation between nodes, in inches. Defaults to 0.25.
- fontsize (int) – Font size, in points. Defaults to 14.
- precision (int) – Numeric precision for floats, in number of chars. Defaults to 5.
- float_format ('e', 'f' or 'g') – One of ‘e’, ‘f’ or ‘g’. Defaults to ‘g’.
- show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults to False.
For more information about the rendering options, see man dot.
See also: text.
-
final
(state)¶ Returns the final weight of a state.
Parameters: state – The integer index of a state. Returns: The final Weight of that state. Raises: IndexError
– State index out of range.
-
from_bytes
(s)¶ Returns the FST represented by the bytes object.
Parameters: s (bytes) – The bytes object representing the FST. Returns: An FST object.
-
input_symbols
()¶ Returns the input symbol table.
Returns: The input symbol table. See Also:
output_symbols()
.
-
num_arcs
(state=None)¶ Returns the number of arcs, counting them if necessary.
If state is None, returns the number of arcs in the FST. Otherwise, returns the number of arcs leaving that state.
Parameters: state – The integer index of a state. Defaults to None.
Returns: The number of arcs leaving a state or the number of arcs in the FST.
Note: This method counts the number of arcs in the FST by iterating over the states and summing up the number of arcs leaving each state.
Raises: IndexError – State index out of range.
See also: num_states.
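The counting strategy described in the note for num_arcs can be illustrated on a toy data structure. This is a sketch in plain Python, not PyKaldi code: the FST is modelled as a hypothetical dict from state index to a list of arc tuples.

```python
# Toy FST: state index -> list of (ilabel, olabel, weight, nextstate) arcs.
toy_fst = {
    0: [(1, 1, 0.5, 1), (2, 2, 1.5, 2)],
    1: [(3, 3, 0.0, 2)],
    2: [],
}


def num_arcs(fst, state=None):
    """Count arcs leaving one state, or all arcs when state is None."""
    if state is None:
        # Whole-FST count: visit every state and sum its outgoing arcs.
        return sum(len(arcs) for arcs in fst.values())
    if state not in fst:
        raise IndexError("State index out of range")
    return len(fst[state])


total = num_arcs(toy_fst)
from_start = num_arcs(toy_fst, 0)
```

Because the whole-FST count walks every state, its cost grows with the number of states, while the per-state query is a single lookup.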
-
num_input_epsilons
(state)¶ Returns the number of arcs with epsilon input labels leaving a state.
Parameters: state – The integer index of a state. Returns: The number of epsilon-input-labeled arcs leaving that state. Raises: IndexError
– State index out of range.See also:
num_output_epsilons
.
-
num_output_epsilons
(state)¶ Returns the number of arcs with epsilon output labels leaving a state.
Parameters: state – The integer index of a state. Returns: The number of epsilon-output-labeled arcs leaving that state. Raises: IndexError
– State index out of range.See also:
num_input_epsilons
.
-
num_states
()¶ Returns the number of states, counting them if necessary.
Returns: The number of states. See also:
num_arcs
.
-
output_symbols
()¶ Returns the output symbol table.
Returns: The output symbol table. See Also:
input_symbols()
.
-
properties
(mask, test)¶ Provides property bits.
This method provides user access to the properties attributes for the FST. The resulting value is a long integer, but when it is cast to a boolean, it represents whether or not the FST has the mask property.
Parameters: - mask – The property mask to be compared to the FST’s properties.
- test – Should any unknown values be computed before comparing against the mask?
Returns: A 64-bit bitmask representing the requested properties.
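The mask semantics of properties can be sketched with plain integers. The bit values and names below are hypothetical, chosen only for illustration; they are not the constants exported by PyKaldi or OpenFst.

```python
# Hypothetical property bits, used only in this illustration.
ACCEPTOR = 0x1
I_DETERMINISTIC = 0x2
ACYCLIC = 0x4

stored_props = ACCEPTOR | ACYCLIC  # what the toy FST knows about itself


def properties(mask):
    """Return the stored property bits selected by mask."""
    return stored_props & mask


# Casting to bool answers "is any bit of the mask set?".
is_acceptor = bool(properties(ACCEPTOR))
is_deterministic = bool(properties(I_DETERMINISTIC))

# For a multi-bit mask, compare against the mask to require every bit.
wanted = ACCEPTOR | ACYCLIC
has_both = properties(wanted) == wanted
```

The boolean cast is only an exact test for single-bit masks; with several bits it reports whether at least one of them is set.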
-
read
(filename)¶ Reads an FST from a file.
Parameters: filename (str) – The location of the input file. Returns: An FST object. Raises: RuntimeError
– Read failed.
-
read_from_stream
(strm, ropts)¶ Reads an FST from an input stream.
Parameters: - strm (istream) – The input stream to read from.
- ropts (FstReadOptions) – FST reading options.
Returns: An FST object.
Raises: RuntimeError
– Read failed.
-
start
()¶ Returns the start state.
Returns: The start state if start state is set, -1 otherwise.
-
states
()¶ Returns an iterator over all states in the FST.
Returns: A StateIterator object for the FST. See also:
arcs
,mutable_arcs
.
-
text
(isymbols=None, osymbols=None, ssymbols=None, acceptor=False, show_weight_one=False, missing_symbol='')¶ Produces a human-readable string representation of the FST.
This method generates a human-readable string representation of the FST. The caller may optionally specify SymbolTables used to label input labels, output labels, or state labels, respectively.
Parameters: - isymbols – An optional symbol table used to label input symbols.
- osymbols – An optional symbol table used to label output symbols.
- ssymbols – An optional symbol table used to label states.
- acceptor (bool) – Should the FST be rendered in acceptor format if possible? Defaults False.
- show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults False.
- missing_symbol – The string to be printed when symbol table lookup fails.
Returns: A formatted string representing the FST.
-
to_bytes
()¶ Returns a bytes object representing the FST.
Returns: A bytes object.
-
type
()¶ Returns the FST type.
Returns: The FST type.
-
verify
()¶ Verifies that an FST’s contents are sane.
Returns: True if the contents are sane, False otherwise.
-
write
(filename)¶ Serializes FST to a file.
This method writes the FST to a file in a binary format.
Parameters: filename (str) – The location of the output file. Raises: IOError
– Write failed.
-
write_to_stream
(strm, wopts)¶ Serializes FST to an output stream.
Parameters: - strm (ostream) – The output stream to write to.
- wopts (FstWriteOptions) – FST writing options.
Returns: True if write was successful, False otherwise.
Raises: RuntimeError
– Write failed.
-
-
class
kaldi.fstext.
LatticeConstFstArcIterator
(fst, state)[source]¶ Arc iterator for a constant FST over the lattice semiring.
This class is used for iterating over the arcs leaving some state. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the arcs method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.
Creates a new arc iterator.
Parameters: - fst – The fst.
- state – The state index.
Raises: IndexError – State index out of range.
-
done
()¶ Indicates whether the iterator is exhausted or not.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: True if the iterator is exhausted, False otherwise.
-
flags
()¶ Returns the current iterator behavioral flags.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: The current iterator behavioral flags as an integer.
-
next
()¶ Advances the iterator.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
position
()¶ Returns the position of the iterator.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: The iterator’s position, expressed as an integer.
-
reset
()¶ Resets the iterator to the initial position.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
seek
(a)¶ Advance the iterator to a new position.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Parameters: a (int) – The position to seek to.
-
set_flags
(flags, mask)¶ Sets the current iterator behavioral flags.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
value
()¶ Returns the current arc.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
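The difference between the C++-style methods (done, next, value, reset) and the Pythonic iterator protocol can be shown with a toy class. This is a sketch under the assumption that the Python protocol is layered on the C++-style methods; SketchArcIterator is a hypothetical name and does not reproduce PyKaldi's implementation.

```python
class SketchArcIterator:
    """Toy iterator offering both a C++-style and a Python-style API."""

    def __init__(self, arcs):
        self._arcs = list(arcs)
        self._pos = 0

    # C++-style API.
    def done(self):
        return self._pos >= len(self._arcs)

    def next(self):
        self._pos += 1

    def value(self):
        return self._arcs[self._pos]

    def reset(self):
        self._pos = 0

    # Python iterator protocol built on top of the methods above.
    def __iter__(self):
        self.reset()
        while not self.done():
            yield self.value()
            self.next()


arcs = ["a", "b", "c"]

# C++-style loop: explicit done/value/next calls.
it = SketchArcIterator(arcs)
manual = []
while not it.done():
    manual.append(it.value())
    it.next()

# Pythonic loop: the for-protocol does the bookkeeping.
pythonic = list(SketchArcIterator(arcs))
```

Both loops visit the same arcs in the same order; the Pythonic form simply removes the manual bookkeeping.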
-
class
kaldi.fstext.
LatticeConstFstStateIterator
(fst)[source]¶ State iterator for a constant FST over the lattice semiring.
This class is used for iterating over the states. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the states method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.
Creates a new state iterator.
Parameters: fst – The fst.
-
done
()¶ Indicates whether the iterator is exhausted or not.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: True if the iterator is exhausted, False otherwise.
-
next
()¶ Advances the iterator.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
reset
()¶ Resets the iterator to the initial position.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
value
()¶ Returns the current state index.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
-
class
kaldi.fstext.
LatticeEncodeMapper
(encode_labels=False, encode_weights=False, encode=True)[source]¶ Arc encoder for an FST over the lattice semiring.
This class provides an object which can be used to encode or decode FST arcs. This is most useful for converting an FST to an unweighted acceptor, on which some FST operations are more efficient, and then decoding the FST afterwards.
To use an instance of this class to encode or decode a mutable FST, pass it as the first argument to the FST instance methods encode and decode. Alternatively, an instance of this class can be used as a callable to encode/decode arcs.
-
flags
() → int¶ Returns encoder flags.
-
from_other
(mapper:LatticeEncodeMapper) → LatticeEncodeMapper¶ Creates a new encoder with the contents of another.
-
from_other_with_type
(mapper:LatticeEncodeMapper, type:EncodeType) → LatticeEncodeMapper¶ Creates a new encoder with the contents of another and given type.
-
input_symbols
() → SymbolTable¶ Returns input symbol table.
-
output_symbols
() → SymbolTable¶ Returns output symbol table.
-
properties
(inprops:int) → int¶ Provides property bits.
This method provides user access to the properties attributes for the encoder. The resulting value is a long integer, but when it is cast to a boolean, it represents whether or not the FST has the mask property.
Parameters: mask – The property mask to be compared to the encoder’s properties.
Returns: A 64-bit bitmask representing the requested properties.
-
read
(filename:str, type:EncodeType=default) → LatticeEncodeMapper¶ Reads encoder from file.
-
set_input_symbols
(syms:SymbolTable)¶ Sets the input symbol table.
Parameters: syms – A SymbolTable. See also:
set_output_symbols
.
-
set_output_symbols
(syms:SymbolTable)¶ Sets the output symbol table.
Parameters: syms – A SymbolTable. See also:
set_input_symbols
.
-
type
() → EncodeType¶ Returns encoder type.
-
write
(filename:str) → bool¶ Writes encoder to file.
Returns: True if write was successful, False otherwise.
-
-
class
kaldi.fstext.
LatticeEncodeTable
¶ Encode table for LatticeArc.
- LatticeEncodeTable(flags):
- Creates a new encode table with the given flags.
-
decode
(key:int) → Tuple¶ Decodes an encoded arc label back to labels and cost.
-
encode
(arc:LatticeArc) → int¶ Encodes the given arc (either labels or weights or both).
-
flags
() → int¶ Returns encoding flags.
-
get_label
(arc:LatticeArc) → int¶ Looks up the encoded label for the given arc.
Returns -1 if arc is not found.
-
input_symbols
() → SymbolTable¶ Returns input symbols.
-
output_symbols
() → SymbolTable¶ Returns output symbols.
-
read
(strm:istream, source:str) → LatticeEncodeTable¶ Reads encode table from input stream.
-
set_input_symbols
(syms:SymbolTable)¶ Sets input symbols.
-
set_output_symbols
(syms:SymbolTable)¶ Sets output symbols.
-
size
() → int¶ Returns the size of the table.
-
write
(strm:ostream, source:str) → bool¶ Writes table to output stream.
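The encode, decode, get_label and size entries above describe a table that maps arc contents to integer labels. A minimal pure-Python sketch of such a table follows. The flag values, the class name and the choice to start labels at 1 (so that 0 stays free for epsilon) are assumptions made for illustration, and weights are plain floats rather than lattice weights.

```python
ENCODE_LABELS = 0x1   # assumed flag values, for illustration only
ENCODE_WEIGHTS = 0x2


class SketchEncodeTable:
    """Toy table mapping (ilabel, olabel, weight) tuples to integer labels."""

    def __init__(self, flags):
        self._flags = flags
        self._key_to_label = {}
        self._keys = []

    def _key(self, arc):
        ilabel, olabel, weight, _nextstate = arc
        # Only the parts selected by the flags take part in the key.
        return (
            ilabel,
            olabel if self._flags & ENCODE_LABELS else 0,
            weight if self._flags & ENCODE_WEIGHTS else None,
        )

    def encode(self, arc):
        key = self._key(arc)
        if key not in self._key_to_label:
            self._keys.append(key)
            self._key_to_label[key] = len(self._keys)  # labels start at 1
        return self._key_to_label[key]

    def get_label(self, arc):
        return self._key_to_label.get(self._key(arc), -1)  # -1 if unseen

    def decode(self, label):
        return self._keys[label - 1]

    def size(self):
        return len(self._keys)


table = SketchEncodeTable(ENCODE_LABELS | ENCODE_WEIGHTS)
arc = (3, 4, 0.5, 1)
before = table.get_label(arc)
first = table.encode(arc)
again = table.encode(arc)
other = table.encode((3, 5, 0.5, 1))
```

Encoding the same arc contents twice yields the same label, which is what allows an encoded FST to be treated as an unweighted acceptor and decoded afterwards.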
-
class
kaldi.fstext.
LatticeFstCompiler
(isymbols=None, osymbols=None, ssymbols=None, acceptor=False, keep_isymbols=False, keep_osymbols=False, keep_state_numbering=False, allow_negative_labels=False)[source]¶ Compiler for FSTs over the lattice semiring.
This class is used to compile FSTs specified using the AT&T FSM library format described here:
http://web.eecs.umich.edu/~radev/NLP-fall2015/resources/fsm_archive/fsm.5.html
This is the same format used by the fstcompile executable.
FstCompiler options (symbol tables, etc.) are set at construction time:
compiler = FstCompiler(isymbols=ascii_syms, osymbols=ascii_syms)
Once constructed, FstCompiler instances behave like a file handle opened for writing:
# /ba+/
print("0 1 50 50", file=compiler)
print("1 2 49 49", file=compiler)
print("2 2 49 49", file=compiler)
print("2", file=compiler)
The compile method returns an actual FST instance:
sheep_machine = compiler.compile()
Compilation flushes the internal buffer, so the compiler instance can be reused to compile new machines with the same symbol tables, etc.
Parameters: - isymbols – An optional SymbolTable used to label input symbols.
- osymbols – An optional SymbolTable used to label output symbols.
- ssymbols – An optional SymbolTable used to label states.
- acceptor – Should the FST be rendered in acceptor format if possible?
- keep_isymbols – Should the input symbol table be stored in the FST?
- keep_osymbols – Should the output symbol table be stored in the FST?
- keep_state_numbering – Should the state numbering be preserved?
- allow_negative_labels – Should negative labels be allowed? (Not recommended; may cause conflicts).
-
compile
()¶ Compiles the FST in the string buffer.
This method compiles the FST and returns the resulting machine.
Returns: The FST described by the string buffer. Raises: RuntimeError
– Compilation failed.
-
write
(expression)¶ Writes a string into the compiler string buffer.
This method adds a line to the compiler string buffer. It can also be invoked with a print call, like so:
compiler = FstCompiler()
print("0 0 49 49", file=compiler)
print("0", file=compiler)
Parameters: expression – A string expression to add to compiler string buffer.
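The file-handle behaviour of the compiler (write into a buffer, then compile and flush) can be sketched in plain Python. SketchCompiler is a hypothetical stand-in, not the PyKaldi class: it only parses the transducer form of the text format, treats weights as single floats, and assumes a missing weight means 0.0.

```python
class SketchCompiler:
    """Toy compiler that buffers text lines and parses them on compile()."""

    def __init__(self):
        self._buffer = []

    def write(self, expression):
        # print(..., file=self) calls write() with the text and the newline.
        self._buffer.append(expression)

    def compile(self):
        text = "".join(self._buffer)
        self._buffer = []  # flushing lets the instance be reused
        arcs, finals = [], {}
        for line in text.splitlines():
            fields = line.split()
            if not fields:
                continue
            if len(fields) <= 2:
                # "state [weight]" marks a final state.
                weight = float(fields[1]) if len(fields) == 2 else 0.0
                finals[int(fields[0])] = weight
            else:
                # "src dst ilabel olabel [weight]" describes an arc.
                src, dst, ilabel, olabel = (int(f) for f in fields[:4])
                weight = float(fields[4]) if len(fields) == 5 else 0.0
                arcs.append((src, dst, ilabel, olabel, weight))
        return {"arcs": arcs, "finals": finals}


compiler = SketchCompiler()
print("0 1 50 50", file=compiler)
print("1 2 49 49", file=compiler)
print("2 2 49 49", file=compiler)
print("2", file=compiler)
machine = compiler.compile()
empty = compiler.compile()  # the buffer was flushed by the first call
```

Any object with a write method can be the target of print, which is why the compiler can be fed line by line without building the text by hand.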
-
class
kaldi.fstext.
LatticeVectorFst
(fst=None)[source]¶ Vector FST over the lattice semiring.
Parameters: fst (LatticeFst) – The input FST over the lattice semiring. If provided, its contents are used for initializing the new FST. Defaults to None.
-
add_arc
(state, arc)¶ Adds a new arc to the FST and returns self.
Parameters: - state – The integer index of the source state.
- arc – The arc to add.
Returns: self.
Raises: IndexError
– State index out of range.See also:
add_state
.
-
add_state
()¶ Adds a new state to the FST and returns the state ID.
Returns: The integer index of the new state.
-
arcs
(state)¶ Returns an iterator over arcs leaving the specified state.
Parameters: state – The source state index.
Returns: An ArcIterator.
See also: mutable_arcs, states.
-
arcsort
(sort_type='ilabel')¶ Sorts arcs leaving each state of the FST.
This operation destructively sorts arcs leaving each state using either input or output labels.
Parameters: sort_type – Either “ilabel” (sort arcs according to input labels) or “olabel” (sort arcs according to output labels).
Returns: self.
Raises: ValueError – Unknown sort type.
See also: topsort.
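The destructive, per-state sorting performed by arcsort can be sketched on a toy structure. This is plain Python, not PyKaldi code: the FST is a hypothetical dict from state index to a list of (ilabel, olabel, weight, nextstate) tuples.

```python
def arcsort(fst, sort_type="ilabel"):
    """Sort the arcs leaving each state in place and return the same object."""
    keys = {
        "ilabel": lambda arc: arc[0],
        "olabel": lambda arc: arc[1],
    }
    if sort_type not in keys:
        raise ValueError("Unknown sort type: %r" % sort_type)
    for arcs in fst.values():
        arcs.sort(key=keys[sort_type])  # mutates the list, hence destructive
    return fst


toy = {0: [(2, 1, 0.0, 1), (1, 2, 0.0, 1)], 1: []}

same_object = arcsort(toy) is toy
by_ilabel = list(toy[0])

arcsort(toy, "olabel")
by_olabel = list(toy[0])
```

Returning the mutated object itself is what makes chained calls possible, matching the "Returns: self" convention used by the destructive methods in this class.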
-
closure
(closure_plus=False)¶ Computes concatenative closure.
This operation destructively converts the FST to its concatenative closure. If A transduces string x to y with weight a, then the closure transduces x to y with weight a, xx to yy with weight a ⊗ a, xxx to yyy with weight a ⊗ a ⊗ a, and so on. The empty string is also transduced to itself with semiring One if closure_plus is False.
Parameters: closure_plus – If True, do not accept the empty string.
Returns: self.
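The weights assigned by the closure can be worked out numerically. The sketch below assumes the tropical semiring (times is addition, One is 0.0) and an FST A that accepts only the single string x with weight a; the function name is hypothetical and no PyKaldi objects are involved.

```python
ONE = 0.0  # One of the tropical semiring, assumed for this sketch


def times(w1, w2):
    """Semiring product in the tropical semiring."""
    return w1 + w2


def closure_weight(a, repetitions, closure_plus=False):
    """Weight given to x repeated `repetitions` times, or None if rejected."""
    if repetitions == 0:
        # The empty string is accepted with One unless closure_plus is set.
        return None if closure_plus else ONE
    weight = ONE
    for _ in range(repetitions):
        weight = times(weight, a)
    return weight


star_empty = closure_weight(1.5, 0)
plus_empty = closure_weight(1.5, 0, closure_plus=True)
three_times = closure_weight(1.5, 3)
```

With a = 1.5, the string xxx receives 1.5 + 1.5 + 1.5 = 4.5, and only the treatment of the empty string differs between the two closure variants.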
-
concat
(ifst)¶ Computes the concatenation (product) of two FSTs.
This operation destructively concatenates the FST with a second FST. If A transduces string x to y with weight a and B transduces string w to v with weight b, then their concatenation transduces string xw to yv with weight a ⊗ b.
Parameters: ifst – The second input FST. Returns: self.
-
connect
()¶ Removes unsuccessful paths.
This operation destructively trims the FST, removing states and arcs that are not part of any successful path.
Returns: self.
-
copy
()¶ Makes a copy of the FST.
Returns: A copy of the FST.
-
decode
(encoder)¶ Decodes encoded labels and/or weights.
This operation reverses the encoding performed by
encode
.Parameters: encoder – An EncodeMapper object used to encode the FST. Returns: self. See also:
encode
.
-
delete_arcs
(state, n=None)¶ Deletes arcs leaving a particular state.
Parameters: - state – The integer index of a state.
- n – An optional argument indicating how many arcs to be deleted. If this argument is None, all arcs from this state are deleted.
Returns: self.
Raises: IndexError
– State index out of range.See also:
delete_states
.
-
delete_states
(states=None)¶ Deletes states.
Parameters: states – An optional iterable of integer indices of the states to be deleted. If this argument is omitted, all states are deleted. Returns: self. Raises: IndexError
– State index out of range.See also:
delete_arcs
.
-
draw
(filename, isymbols=None, osymbols=None, ssymbols=None, acceptor=False, title='', width=8.5, height=11, portrait=False, vertical=False, ranksep=0.4, nodesep=0.25, fontsize=14, precision=5, float_format='g', show_weight_one=False)¶ Writes out the FST in Graphviz text format.
This method writes out the FST in the dot graph description language. The graph can be rendered using the dot binary provided by Graphviz.
Parameters: - filename (str) – The location of the output dot/Graphviz file.
- isymbols – An optional symbol table used to label input symbols.
- osymbols – An optional symbol table used to label output symbols.
- ssymbols – An optional symbol table used to label states.
- acceptor (bool) – Should the figure be rendered in acceptor format if possible? Defaults to False.
- title (str) – An optional string indicating the figure title. Defaults to the empty string.
- width (float) – The figure width, in inches. Defaults to 8.5.
- height (float) – The figure height, in inches. Defaults to 11.
- portrait (bool) – Should the figure be rendered in portrait rather than landscape? Defaults to False.
- vertical (bool) – Should the figure be rendered bottom-to-top rather than left-to-right? Defaults to False.
- ranksep (float) – The minimum separation between ranks, in inches. Defaults to 0.4.
- nodesep (float) – The minimum separation between nodes, in inches. Defaults to 0.25.
- fontsize (int) – Font size, in points. Defaults to 14.
- precision (int) – Numeric precision for floats, in number of chars. Defaults to 5.
- float_format ('e', 'f' or 'g') – One of ‘e’, ‘f’ or ‘g’. Defaults to ‘g’.
- show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults to False.
For more information about the rendering options, see man dot.
See also: text.
-
encode
(encoder)¶ Encodes labels and/or weights.
This operation allows for the representation of a weighted transducer as a weighted acceptor, an unweighted transducer, or an unweighted acceptor by considering the pair (input label, output label), the pair (input label, weight), or the triple (input label, output label, weight) as a single label. Applying this operation mutates the EncodeMapper argument, which can then be used to decode.
Parameters: encoder – An EncodeMapper object used to encode the FST. Returns: self. See also:
decode
.
-
final
(state)¶ Returns the final weight of a state.
Parameters: state – The integer index of a state. Returns: The final Weight of that state. Raises: IndexError
– State index out of range.
-
from_bytes
(s)¶ Returns the FST represented by the bytes object.
Parameters: s (bytes) – The bytes object representing the FST. Returns: An FST object.
-
input_symbols
()¶ Returns the input symbol table.
Returns: The input symbol table. See Also:
output_symbols()
.
-
invert
()¶ Inverts the FST’s transduction.
This operation destructively inverts the FST’s transduction by exchanging input and output labels.
Returns: self.
-
minimize
(delta=0.0009765625, allow_nondet=False)¶ Minimizes the FST.
This operation destructively performs the minimization of deterministic weighted automata and transducers. If the input FST A is an acceptor, this operation produces the minimal acceptor B equivalent to A, i.e. the acceptor with a minimal number of states that is equivalent to A. If the input FST A is a transducer, this operation internally builds an equivalent transducer with a minimal number of states. However, this minimality is obtained by allowing transitions to have strings of symbols as output labels, this is known in the literature as a real-time transducer. Such transducers are not directly supported by the library. This function will convert such transducers by expanding each string-labeled transition into a sequence of transitions. This will result in the creation of new states, hence losing the minimality property.
Parameters: - delta – Comparison/quantization delta (default: 0.0009765625).
- allow_nondet – Attempt minimization of non-deterministic FST?
Returns: self.
-
mutable_arcs
(state)¶ Returns a mutable iterator over arcs leaving the specified state.
Parameters: state – The source state index. Returns: A MutableArcIterator.
-
num_arcs
(state=None)¶ Returns the number of arcs, counting them if necessary.
If state is None, returns the number of arcs in the FST. Otherwise, returns the number of arcs leaving that state.
Parameters: state – The integer index of a state. Defaults to None.
Returns: The number of arcs leaving a state or the number of arcs in the FST.
Note: This method counts the number of arcs in the FST by iterating over the states and summing up the number of arcs leaving each state.
Raises: IndexError – State index out of range.
See also: num_states.
-
num_input_epsilons
(state)¶ Returns the number of arcs with epsilon input labels leaving a state.
Parameters: state – The integer index of a state. Returns: The number of epsilon-input-labeled arcs leaving that state. Raises: IndexError
– State index out of range.See also:
num_output_epsilons
.
-
num_output_epsilons
(state)¶ Returns the number of arcs with epsilon output labels leaving a state.
Parameters: state – The integer index of a state. Returns: The number of epsilon-output-labeled arcs leaving that state. Raises: IndexError
– State index out of range.See also:
num_input_epsilons
.
-
num_states
()¶ Returns the number of states, counting them if necessary.
Returns: The number of states. See also:
num_arcs
.
-
output_symbols
()¶ Returns the output symbol table.
Returns: The output symbol table. See Also:
input_symbols()
.
-
project
(project_output=False)¶ Converts the FST to an acceptor using input or output labels.
This operation destructively projects an FST onto its domain or range by either copying each arc’s input label to its output label (the default) or vice versa.
Parameters: project_output – Project onto output labels? Returns: self. See also:
decode
,encode
,relabel
,relabel_tables
.
-
properties
(mask, test)¶ Provides property bits.
This method provides user access to the properties attributes for the FST. The resulting value is a long integer, but when it is cast to a boolean, it represents whether or not the FST has the mask property.
Parameters: - mask – The property mask to be compared to the FST’s properties.
- test – Should any unknown values be computed before comparing against the mask?
Returns: A 64-bit bitmask representing the requested properties.
-
prune
(weight=None, nstate=-1, delta=0.0009765625)¶ Removes paths with weights below a certain threshold.
This operation deletes states and arcs in the input FST that do not belong to a successful path whose weight is no more (w.r.t. the natural semiring order) than the threshold ⊗ the weight of the shortest path in the input FST. Weights must be commutative and have the path property.
Parameters: - weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired weight threshold below which paths are pruned; if None, no paths are pruned.
- nstate – State number threshold (default: -1).
- delta – Comparison/quantization delta (default: 0.0009765625).
Returns: self.
See also: The constructive variant.
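The pruning criterion can be illustrated on an explicit set of path weights. This sketch assumes the tropical semiring, where the product of the threshold and the shortest-path weight is their sum and smaller weights are better. It works on a hypothetical dict of complete paths rather than on states and arcs, so it shows the criterion only, not the algorithm.

```python
def prune_paths(paths, threshold):
    """Keep the paths whose weight is within `threshold` of the best path."""
    if threshold is None:
        return dict(paths)  # no threshold: nothing is pruned
    best = min(paths.values())
    limit = best + threshold  # threshold times shortest-path weight
    return {name: w for name, w in paths.items() if w <= limit}


paths = {"a": 1.0, "b": 2.5, "c": 4.0}
kept = prune_paths(paths, 2.0)
unpruned = prune_paths(paths, None)
```

With a best path of weight 1.0 and a threshold of 2.0, the limit is 3.0, so the path of weight 4.0 is dropped.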
-
push
(to_final=False, delta=0.0009765625, remove_total_weight=False)¶ Pushes weights towards the initial or final states.
This operation destructively produces an equivalent transducer by pushing the weights towards the initial state or toward the final states. When pushing weights towards the initial state, the sum of the weight of the outgoing transitions and final weight at any non-initial state is equal to one in the resulting machine. When pushing weights towards the final states, the sum of the weight of the incoming transitions at any state is equal to one. Weights need to be left distributive when pushing towards the initial state and right distributive when pushing towards the final states.
Parameters: - to_final – Push towards final states?
- delta – Comparison/quantization delta (default: 0.0009765625).
- remove_total_weight – If pushing weights, should the total weight be removed?
Returns: self.
See also: The constructive variant, which also supports label pushing.
-
read
(filename)¶ Reads an FST from a file.
Parameters: filename (str) – The location of the input file. Returns: An FST object. Raises: RuntimeError
– Read failed.
-
read_from_stream
(strm, ropts)¶ Reads an FST from an input stream.
Parameters: - strm (istream) – The input stream to read from.
- ropts (FstReadOptions) – FST reading options.
Returns: An FST object.
Raises: RuntimeError
– Read failed.
-
relabel
(ipairs=None, opairs=None)¶ Replaces input and/or output labels using pairs of labels.
This operation destructively relabels the input and/or output labels of the FST using pairs of the form (old_ID, new_ID); omitted indices are identity-mapped.
Parameters: - ipairs – An iterable containing (old index, new index) integer pairs.
- opairs – An iterable containing (old index, new index) integer pairs.
Returns: self.
Raises: ValueError
– No relabeling pairs specified.See also:
decode
,encode
,project
,relabel_tables
.
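The pair-based relabeling, including the identity mapping of omitted labels, can be sketched on a list of arc tuples. This is plain Python with a hypothetical arc layout of (ilabel, olabel, weight, nextstate), not PyKaldi code.

```python
def relabel(arcs, ipairs=None, opairs=None):
    """Relabel arcs in place using (old, new) pairs; other labels are kept."""
    if not ipairs and not opairs:
        raise ValueError("No relabeling pairs specified")
    imap = dict(ipairs or ())
    omap = dict(opairs or ())
    for i, (ilabel, olabel, weight, nextstate) in enumerate(arcs):
        # dict.get(label, label) leaves unmapped labels unchanged.
        arcs[i] = (imap.get(ilabel, ilabel), omap.get(olabel, olabel),
                   weight, nextstate)
    return arcs


arcs = [(1, 1, 0.0, 1), (2, 3, 0.0, 2)]
relabel(arcs, ipairs=[(1, 10)], opairs=[(3, 30)])
```

Only input label 1 and output label 3 appear in the pairs, so every other label passes through unchanged.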
-
relabel_tables
(old_isymbols=None, new_isymbols=None, unknown_isymbol='', attach_new_isymbols=True, old_osymbols=None, new_osymbols=None, unknown_osymbol='', attach_new_osymbols=True)¶ Replaces input and/or output labels using SymbolTables.
This operation destructively relabels the input and/or output labels of the FST using user-specified symbol tables; omitted symbols are identity-mapped.
Parameters: - old_isymbols – The old SymbolTable for input labels, defaulting to the FST’s input symbol table.
- new_isymbols – A SymbolTable used to relabel the input labels.
- unknown_isymbol – Input symbol to use to relabel OOVs (if empty, OOVs raise an exception).
- attach_new_isymbols – Should new_isymbols be made the FST’s input symbol table?
- old_osymbols – The old SymbolTable for output labels, defaulting to the FST’s output symbol table.
- new_osymbols – A SymbolTable used to relabel the output labels.
- unknown_osymbol – Output symbol to use to relabel OOVs (if empty, OOVs raise an exception).
- attach_new_osymbols – Should new_osymbols be made the FST’s output symbol table?
Returns: self.
Raises: ValueError
– No SymbolTable specified.
-
reserve_arcs
(state, n)¶ Reserve n arcs at a particular state (best effort).
Parameters: - state – The integer index of a state.
- n – The number of arcs to reserve.
Returns: self.
Raises: IndexError
– State index out of range.See also:
reserve_states
.
-
reserve_states
(n)¶ Reserve n states (best effort).
Parameters: n – The number of states to reserve. Returns: self. See also:
reserve_arcs
.
-
reweight
(potentials, to_final=False)¶ Reweights an FST using an iterable of potentials.
This operation destructively reweights an FST according to the potentials and in the direction specified by the user. An arc of weight w, with an origin state of potential p and destination state of potential q, is reweighted by p^{-1} ⊗ (w ⊗ q) when reweighting towards the initial state, and by (p ⊗ w) ⊗ q^{-1} when reweighting towards the final states. The weights must be left distributive when reweighting towards the initial state and right distributive when reweighting towards the final states (e.g., TropicalWeight and LogWeight).
Parameters: - potentials – An iterable of TropicalWeights.
- to_final – Push towards final states?
Returns: self.
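The two reweighting formulas can be checked with a small numeric sketch. It assumes the tropical semiring, where the product is addition and the inverse is negation, and it models only arc weights. Start and final weights are left out, and the names are hypothetical.

```python
def reweight_arc(w, p, q, to_final=False):
    """Reweight one arc in the tropical semiring."""
    if to_final:
        return (p + w) - q   # (p times w) times inverse of q
    return -p + (w + q)      # inverse of p times (w times q)


# A two-arc path 0 -> 1 -> 2 with one potential per state.
potentials = [0.0, 1.0, 4.0]
path = [(0, 2.0, 1), (1, 3.0, 2)]  # (source state, weight, destination state)

new_weights = [reweight_arc(w, potentials[s], potentials[d]) for s, w, d in path]
old_total = sum(w for _, w, _ in path)
new_total = sum(new_weights)
```

The potentials of intermediate states cancel along a path, so the path total changes only by the potentials at its two endpoints (here 4.0 - 0.0).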
-
rmepsilon
(connect=True, weight=None, nstate=-1, delta=0.0009765625)¶ Removes epsilon transitions.
This operation destructively removes epsilon transitions (i.e., those where both input and output labels are epsilon) from an FST.
Parameters: - connect – Should output be trimmed?
- weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired weight threshold below which paths are pruned; if None, no paths are pruned.
- nstate – State number threshold (default: -1).
- delta – Comparison/quantization delta (default: 0.0009765625).
Returns: self.
See also: The constructive variant, which also supports epsilon removal in reverse (and which may be more efficient).
-
set_final
(state, weight=None)¶ Sets the final weight for a state.
Parameters: - state – The integer index of a state.
- weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired final weight; if omitted, it is set to semiring One.
Raises: IndexError
– State index out of range. See also:
set_start
.
-
set_input_symbols
(syms)¶ Sets the input symbol table.
Passing
None
as a value will delete the input symbol table. Parameters: syms – A SymbolTable. Returns: self. See also:
set_output_symbols
.
-
set_output_symbols
(syms)¶ Sets the output symbol table.
Passing
None
as a value will delete the output symbol table. Parameters: syms – A SymbolTable. Returns: self. See also:
set_input_symbols
.
-
set_properties
(props, mask)¶ Sets the properties bits.
Parameters: Returns: self.
-
set_start
(state)¶ Sets the initial state.
Parameters: state – The integer index of a state. Returns: self. Raises: IndexError
– State index out of range. See also:
set_final
.
-
start
()¶ Returns the start state.
Returns: The start state if start state is set, -1 otherwise.
-
states
()¶ Returns an iterator over all states in the FST.
Returns: A StateIterator object for the FST. See also:
arcs
,mutable_arcs
.
-
text
(isymbols=None, osymbols=None, ssymbols=None, acceptor=False, show_weight_one=False, missing_symbol='')¶ Produces a human-readable string representation of the FST.
This method generates a human-readable string representation of the FST. The caller may optionally specify SymbolTables used to label input labels, output labels, or state labels, respectively.
Parameters: - isymbols – An optional symbol table used to label input symbols.
- osymbols – An optional symbol table used to label output symbols.
- ssymbols – An optional symbol table used to label states.
- acceptor (bool) – Should the FST be rendered in acceptor format if possible? Defaults False.
- show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults False.
- missing_symbol – The string to be printed when symbol table lookup fails.
Returns: A formatted string representing the FST.
-
to_bytes
()¶ Returns a bytes object representing the FST.
Returns: A bytes object.
-
topsort
()¶ Sorts transitions by state IDs.
This operation destructively topologically sorts the FST if it is acyclic; otherwise the FST remains unchanged. Once sorted, all transitions are from lower state IDs to higher state IDs.
Returns: self. See also:
arcsort
.
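To make the acyclicity condition concrete, here is a small plain-Python sketch of a topological ordering of states. It uses Kahn's algorithm, which is a swapped-in technique chosen for brevity and not necessarily what the library does internally. The function name `topsort_order` and the arc-pair representation are assumptions of this sketch.

```python
from collections import deque


def topsort_order(num_states, arcs):
    """Return a topological order of states, or None if the graph is cyclic.

    arcs is an iterable of (origin, destination) state pairs.
    """
    indegree = [0] * num_states
    successors = [[] for _ in range(num_states)]
    for src, dst in arcs:
        successors[src].append(dst)
        indegree[dst] += 1

    queue = deque(s for s in range(num_states) if indegree[s] == 0)
    order = []
    while queue:
        state = queue.popleft()
        order.append(state)
        for nxt in successors[state]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)
    # A cycle leaves some states unvisited.
    return order if len(order) == num_states else None


acyclic = topsort_order(3, [(2, 0), (0, 1)])
cyclic = topsort_order(2, [(0, 1), (1, 0)])
```

Renumbering the states in the returned order yields arcs that always go from a lower state ID to a higher one. The `None` result corresponds to the case where the FST is left unchanged.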
-
type
()¶ Returns the FST type.
Returns: The FST type.
-
union
(ifst)¶ Computes the union (sum) of two FSTs.
This operation computes the union (sum) of two FSTs. If A transduces string x to y with weight a and B transduces string w to v with weight b, then their union transduces x to y with weight a and w to v with weight b.
Parameters: ifst – The second input FST. Returns: self.
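The effect of union on the transduced relation can be sketched with dictionaries instead of FSTs. This is an illustration only: the function `union`, the `{(input, output): weight}` representation, and the choice of `min` as the semiring sum (which corresponds to the tropical semiring) are all assumptions of this sketch. Unlike the method above, it returns a new object rather than mutating its first argument.

```python
def union(a, b, plus=min):
    """Union of two weighted relations given as {(input, output): weight}.

    plus is the semiring sum used when both relations contain the same pair;
    min corresponds to the tropical semiring (an assumption of this sketch).
    """
    result = dict(a)
    for pair, weight in b.items():
        result[pair] = plus(result[pair], weight) if pair in result else weight
    return result


A = {("x", "y"): 1.0}
B = {("w", "v"): 2.0, ("x", "y"): 0.5}
U = union(A, B)
```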
-
verify
()¶ Verifies that an FST’s contents are sane.
Returns: True if the contents are sane, False otherwise.
-
write
(filename)¶ Serializes FST to a file.
This method writes the FST to a file in a binary format.
Parameters: filename (str) – The location of the output file. Raises: IOError
– Write failed.
-
write_to_stream
(strm, wopts)¶ Serializes FST to an output stream.
Parameters: - strm (ostream) – The output stream to write to.
- wopts (FstWriteOptions) – FST writing options.
Returns: True if write was successful, False otherwise.
Raises: RuntimeError
– Write failed.
-
-
class
kaldi.fstext.
LatticeVectorFstArcIterator
(fst, state)[source]¶ Arc iterator for a vector FST over the lattice semiring.
This class is used for iterating over the arcs leaving some state. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the
arcs
method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API. Creates a new arc iterator.
Parameters: - fst – The fst.
- state – The state index.
Raises: IndexError
– State index out of range.-
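The relationship between the C++-style methods and the iterator protocol can be sketched with a stand-in class. `ArcIteratorSketch` below is hypothetical and only mimics the documented method names (`done`, `value`, `next`, `reset`) over a plain list. It is not the PyKaldi class.

```python
class ArcIteratorSketch:
    """Hypothetical stand-in that mimics the documented iterator methods."""

    def __init__(self, arcs):
        self._arcs = list(arcs)
        self._pos = 0

    # C++-style API.
    def done(self):
        return self._pos >= len(self._arcs)

    def value(self):
        return self._arcs[self._pos]

    def next(self):
        self._pos += 1

    def reset(self):
        self._pos = 0

    # Python iterator protocol built on top of the C++-style API.
    def __iter__(self):
        while not self.done():
            yield self.value()
            self.next()


it = ArcIteratorSketch(["arc0", "arc1", "arc2"])

# Explicit C++-style loop.
cpp_style = []
while not it.done():
    cpp_style.append(it.value())
    it.next()

# Equivalent Pythonic loop after rewinding the iterator.
it.reset()
pythonic = [arc for arc in it]
```

Both loops visit the same arcs in the same order. The Pythonic form is shorter and harder to get wrong.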
done
()¶ Indicates whether the iterator is exhausted or not.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: True if the iterator is exhausted, False otherwise.
-
flags
()¶ Returns the current iterator behavioral flags.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: The current iterator behavioral flags as an integer.
-
next
()¶ Advances the iterator.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
position
()¶ Returns the position of the iterator.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: The iterator’s position, expressed as an integer.
-
reset
()¶ Resets the iterator to the initial position.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
seek
(a)¶ Advance the iterator to a new position.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Parameters: a (int) – The position to seek to.
-
set_flags
(flags, mask)¶ Sets the current iterator behavioral flags.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Parameters:
-
value
()¶ Returns the current arc.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
class
kaldi.fstext.
LatticeVectorFstMutableArcIterator
(fst, state)[source]¶ Mutable arc iterator for a vector FST over the lattice semiring.
This class is used for iterating over the arcs leaving some state and optionally replacing them with new ones. In addition to the full C++ API, it also supports the iterator protocol. Calling the
__iter__
method of a mutable arc iterator object returns an iterator over
(arc, setter)
pairs. The
setter
is a bound method of the mutable arc iterator object that can be used to replace the current arc with a new one. Most users should just call the
mutable_arcs
method of a vector FST object instead of directly constructing this iterator and take advantage of the Pythonic API, e.g.
for arc, setter in lattice.mutable_arcs(0):
    setter(LatticeArc(arc.ilabel, 0, arc.weight, arc.nextstate))
Creates a new arc iterator.
Parameters: - fst – The fst.
- state – The state index.
Raises: IndexError
– State index out of range.-
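How a bound `setter` can replace the arc at the current position is sketched below with stand-in types. `Arc` and `MutableArcsSketch` are hypothetical names introduced for illustration. They imitate the `(arc, setter)` protocol over a Python list and are not the PyKaldi classes.

```python
from collections import namedtuple

# Hypothetical stand-in for an arc type; not the PyKaldi LatticeArc class.
Arc = namedtuple("Arc", ["ilabel", "olabel", "weight", "nextstate"])


class MutableArcsSketch:
    def __init__(self, arcs):
        self.arcs = list(arcs)
        self._pos = 0

    def set_value(self, arc):
        # Replace the arc at the current position.
        self.arcs[self._pos] = arc

    def __iter__(self):
        self._pos = 0
        while self._pos < len(self.arcs):
            # The setter is a bound method tied to the current position.
            yield self.arcs[self._pos], self.set_value
            self._pos += 1


arcs = MutableArcsSketch([Arc(1, 1, 0.5, 1), Arc(2, 3, 1.0, 2)])
for arc, setter in arcs:
    # Set every output label to 0 while keeping the other fields.
    setter(Arc(arc.ilabel, 0, arc.weight, arc.nextstate))
```

Because the setter writes to whatever position the iterator currently holds, it should be called inside the loop body and not stored for later use.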
done
()¶ Indicates whether the iterator is exhausted or not.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: True if the iterator is exhausted, False otherwise.
-
flags
()¶ Returns the current iterator behavioral flags.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: The current iterator behavioral flags as an integer.
-
next
()¶ Advances the iterator.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
position
()¶ Returns the position of the iterator.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: The iterator’s position, expressed as an integer.
-
reset
()¶ Resets the iterator to the initial position.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
seek
(a)¶ Advance the iterator to a new position.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Parameters: a (int) – The position to seek to.
-
set_flags
(flags, mask)¶ Sets the current iterator behavioral flags.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Parameters:
-
set_value
(arc)¶ Replace the current arc with a new arc.
Parameters: arc – The arc to replace the current arc with.
-
value
()¶ Returns the current arc.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
class
kaldi.fstext.
LatticeVectorFstStateIterator
(fst)[source]¶ State iterator for a vector FST over the lattice semiring.
This class is used for iterating over the states. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the
states
method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API. Creates a new state iterator.
Parameters: fst – The fst. -
done
()¶ Indicates whether the iterator is exhausted or not.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: True if the iterator is exhausted, False otherwise.
-
next
()¶ Advances the iterator.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
reset
()¶ Resets the iterator to the initial position.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
value
()¶ Returns the current state index.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
-
class
kaldi.fstext.
LatticeWeight
[source]¶ Lattice weight factory.
This class is used for creating new
LatticeWeight
instances.
- LatticeWeight():
- Creates an uninitialized
LatticeWeight
instance.
- LatticeWeight(weight):
- Creates a new
LatticeWeight
instance initialized with the weight.
Parameters: - weight (Tuple[float, float] or LatticeWeight) – A pair of weight values or another LatticeWeight instance.
- LatticeWeight(weight1, weight2):
- Creates a new
LatticeWeight
instance initialized with the weights.
Parameters: -
from_other
(other:LatticeWeight) → LatticeWeight¶ Create a new lattice weight from another.
-
from_pair
(a:float, b:float) → LatticeWeight¶ Create a new lattice weight from a pair of floats.
-
hash
() → int¶ Returns the hash for the weight.
-
member
() → bool¶ Checks if weight is a member of the lattice semiring.
-
no_weight
() → LatticeWeight¶ No weight in lattice semiring.
-
one
() → LatticeWeight¶ One in lattice semiring, i.e. (0.0, 0.0).
-
properties
() → int¶ Returns weight properties.
-
quantize
(delta:float=default) → LatticeWeight¶ Quantizes the weight.
-
reverse
() → LatticeWeight¶ Reverses the weight.
-
type
() → str¶ Returns weight type.
-
value1
¶ Float value of the first weight.
-
value2
¶ Float value of the second weight.
-
zero
() → LatticeWeight¶ Zero in lattice semiring, i.e. (+infinity, +infinity).
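The values of One and Zero make sense once the semiring operations are spelled out. The sketch below models a lattice weight as a plain `(value1, value2)` tuple. It assumes, following Kaldi's lattice weight definition as I understand it, that the product adds the two components separately and that the sum keeps the operand with the lower total cost, breaking ties on the first component. The function names `compare`, `plus`, and `times` are this sketch's own and say nothing about which functions PyKaldi exposes.

```python
import math


def compare(w1, w2):
    """Return 1 if w1 has the lower total cost, -1 if w2 does, 0 on a tie."""
    f1, f2 = w1[0] + w1[1], w2[0] + w2[1]
    if f1 < f2:
        return 1
    if f1 > f2:
        return -1
    # Tie on total cost: prefer the smaller first component.
    if w1[0] < w2[0]:
        return 1
    if w1[0] > w2[0]:
        return -1
    return 0


def plus(w1, w2):
    # Semiring sum: keep the better (lower-cost) weight.
    return w1 if compare(w1, w2) >= 0 else w2


def times(w1, w2):
    # Semiring product: add the components separately.
    return (w1[0] + w2[0], w1[1] + w2[1])


ONE = (0.0, 0.0)
ZERO = (math.inf, math.inf)

w = (1.0, 2.5)
v = (3.0, 0.0)
```

With these definitions `ONE` is the identity for `times` and `ZERO` is the identity for `plus`, which matches the values documented for `one` and `zero`.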
-
class
kaldi.fstext.
LogArc
[source]¶ FST arc with log weight.
- LogArc():
- Creates an uninitialized
LogArc
instance.
- LogArc(ilabel, olabel, weight, nextstate):
- Creates a new
LogArc
instance initialized with given arguments.
Parameters: -
from_attrs
(ilabel:int, olabel:int, weight:LogWeight, nextstate:int) → LogArc¶ Creates a new arc with the given attributes.
Parameters:
-
ilabel
¶ int – The input label.
-
nextstate
¶ int – The destination state for the arc.
-
olabel
¶ int – The output label.
-
type
() → str¶ Returns arc type.
-
weight
¶ LogWeight – The arc weight.
-
class
kaldi.fstext.
LogConstFst
(fst=None)[source]¶ Constant FST over the log semiring.
Parameters: fst (LogFst) – The input FST over the log semiring. If provided, its contents are used for initializing the new FST. Defaults to None
.-
arcs
(state)¶ Returns an iterator over arcs leaving the specified state.
Parameters: state – The source state index. Returns: An ArcIterator. See also:
mutable_arcs
,states
.
-
copy
()¶ Makes a copy of the FST.
Returns: A copy of the FST.
-
draw
(filename, isymbols=None, osymbols=None, ssymbols=None, acceptor=False, title='', width=8.5, height=11, portrait=False, vertical=False, ranksep=0.4, nodesep=0.25, fontsize=14, precision=5, float_format='g', show_weight_one=False)¶ Writes out the FST in Graphviz text format.
This method writes out the FST in the dot graph description language. The graph can be rendered using the
dot
binary provided by Graphviz. Parameters: - filename (str) – The string location of the output dot/Graphviz file.
- isymbols – An optional symbol table used to label input symbols.
- osymbols – An optional symbol table used to label output symbols.
- ssymbols – An optional symbol table used to label states.
- acceptor (bool) – Should the figure be rendered in acceptor format if possible? Defaults to False.
- title (str) – An optional string indicating the figure title. Defaults to empty string.
- width (float) – The figure width, in inches. Defaults to 8.5''.
- height (float) – The figure height, in inches. Defaults to 11''.
- portrait (bool) – Should the figure be rendered in portrait rather than landscape? Defaults to False.
- vertical (bool) – Should the figure be rendered bottom-to-top rather than left-to-right?
- ranksep (float) – The minimum separation between ranks, in inches. Defaults to 0.4''.
- nodesep (float) – The minimum separation between nodes, in inches. Defaults to 0.25''.
- fontsize (int) – Font size, in points. Defaults to 14pt.
- precision (int) – Numeric precision for floats, in number of chars. Defaults to 5.
- float_format ('e', 'f' or 'g') – One of: ‘e’, ‘f’ or ‘g’. Defaults to ‘g’.
- show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults to False.
For more information about the rendering options, see
man dot
. See also:
text
.
-
final
(state)¶ Returns the final weight of a state.
Parameters: state – The integer index of a state. Returns: The final Weight of that state. Raises: IndexError
– State index out of range.
-
from_bytes
(s)¶ Returns the FST represented by the bytes object.
Parameters: s (bytes) – The bytes object representing the FST. Returns: An FST object.
-
input_symbols
()¶ Returns the input symbol table.
Returns: The input symbol table. See Also:
output_symbols()
.
-
num_arcs
(state=None)¶ Returns the number of arcs, counting them if necessary.
If state is
None
, returns the number of arcs in the FST. Otherwise, returns the number of arcs leaving that state. Parameters: state – The integer index of a state. Defaults to None
. Returns: The number of arcs leaving a state or the number of arcs in the FST. Note: This method counts the number of arcs in the FST by iterating over the states and summing up the number of arcs leaving each state.
Raises: IndexError
– State index out of range. See also:
num_states
.
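The two modes of `num_arcs` can be sketched over a list-of-lists representation. The function below is a hypothetical plain-Python analogue, not the library method. It assumes the FST is given as one list of arcs per state.

```python
def num_arcs(arcs_by_state, state=None):
    """Count arcs in the whole FST, or only those leaving one state."""
    if state is None:
        # Iterate over the states and sum the arcs leaving each one.
        return sum(len(arcs) for arcs in arcs_by_state)
    if not 0 <= state < len(arcs_by_state):
        raise IndexError("State index out of range")
    return len(arcs_by_state[state])


# Three states: two arcs leave state 0, one leaves state 1, none leave state 2.
toy = [["a", "b"], ["c"], []]
total = num_arcs(toy)
from_zero = num_arcs(toy, 0)
```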
-
num_input_epsilons
(state)¶ Returns the number of arcs with epsilon input labels leaving a state.
Parameters: state – The integer index of a state. Returns: The number of epsilon-input-labeled arcs leaving that state. Raises: IndexError
– State index out of range. See also:
num_output_epsilons
.
-
num_output_epsilons
(state)¶ Returns the number of arcs with epsilon output labels leaving a state.
Parameters: state – The integer index of a state. Returns: The number of epsilon-output-labeled arcs leaving that state. Raises: IndexError
– State index out of range. See also:
num_input_epsilons
.
-
num_states
()¶ Returns the number of states, counting them if necessary.
Returns: The number of states. See also:
num_arcs
.
-
output_symbols
()¶ Returns the output symbol table.
Returns: The output symbol table. See Also:
input_symbols()
.
-
properties
(mask, test)¶ Provides property bits.
This method provides user access to the properties attributes for the FST. The resulting value is a long integer, but when it is cast to a boolean, it represents whether or not the FST has the
mask
property. Parameters: - mask – The property mask to be compared to the FST’s properties.
- test – Should any unknown values be computed before comparing against the mask?
Returns: A 64-bit bitmask representing the requested properties.
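The "cast to a boolean" behaviour follows from ordinary bitmask arithmetic. The sketch below uses made-up bit values (`ACCEPTOR`, `NOT_ACCEPTOR`, `I_LABEL_SORTED`) purely for illustration. Real code should use the property constants exported by the library, and this sketch does not model the `test` argument.

```python
# Hypothetical property bits for illustration only; real code should use the
# constants exported by the library.
ACCEPTOR = 0x1
NOT_ACCEPTOR = 0x2
I_LABEL_SORTED = 0x4


def properties(stored, mask):
    """Return the stored property bits selected by mask."""
    return stored & mask


stored = ACCEPTOR | I_LABEL_SORTED
is_acceptor = bool(properties(stored, ACCEPTOR))
is_not_acceptor = bool(properties(stored, NOT_ACCEPTOR))
both = properties(stored, ACCEPTOR | I_LABEL_SORTED)
```

A nonzero result is truthy, so a single-bit mask can be used directly in an `if` statement. For a multi-bit mask, compare the result against the mask to check that every requested bit is set.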
-
read
(filename)¶ Reads an FST from a file.
Parameters: filename (str) – The location of the input file. Returns: An FST object. Raises: RuntimeError
– Read failed.
-
read_from_stream
(strm, ropts)¶ Reads an FST from an input stream.
Parameters: - strm (istream) – The input stream to read from.
- ropts (FstReadOptions) – FST reading options.
Returns: An FST object.
Raises: RuntimeError
– Read failed.
-
start
()¶ Returns the start state.
Returns: The start state if start state is set, -1 otherwise.
-
states
()¶ Returns an iterator over all states in the FST.
Returns: A StateIterator object for the FST. See also:
arcs
,mutable_arcs
.
-
text
(isymbols=None, osymbols=None, ssymbols=None, acceptor=False, show_weight_one=False, missing_symbol='')¶ Produces a human-readable string representation of the FST.
This method generates a human-readable string representation of the FST. The caller may optionally specify SymbolTables used to label input labels, output labels, or state labels, respectively.
Parameters: - isymbols – An optional symbol table used to label input symbols.
- osymbols – An optional symbol table used to label output symbols.
- ssymbols – An optional symbol table used to label states.
- acceptor (bool) – Should the FST be rendered in acceptor format if possible? Defaults False.
- show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults False.
- missing_symbol – The string to be printed when symbol table lookup fails.
Returns: A formatted string representing the FST.
-
to_bytes
()¶ Returns a bytes object representing the FST.
Returns: A bytes object.
-
type
()¶ Returns the FST type.
Returns: The FST type.
-
verify
()¶ Verifies that an FST’s contents are sane.
Returns: True if the contents are sane, False otherwise.
-
write
(filename)¶ Serializes FST to a file.
This method writes the FST to a file in a binary format.
Parameters: filename (str) – The location of the output file. Raises: IOError
– Write failed.
-
write_to_stream
(strm, wopts)¶ Serializes FST to an output stream.
Parameters: - strm (ostream) – The output stream to write to.
- wopts (FstWriteOptions) – FST writing options.
Returns: True if write was successful, False otherwise.
Raises: RuntimeError
– Write failed.
-
-
class
kaldi.fstext.
LogConstFstArcIterator
(fst, state)[source]¶ Arc iterator for a constant FST over the log semiring.
This class is used for iterating over the arcs leaving some state. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the
arcs
method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API. Creates a new arc iterator.
Parameters: - fst – The fst.
- state – The state index.
Raises: IndexError
– State index out of range.-
done
()¶ Indicates whether the iterator is exhausted or not.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: True if the iterator is exhausted, False otherwise.
-
flags
()¶ Returns the current iterator behavioral flags.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: The current iterator behavioral flags as an integer.
-
next
()¶ Advances the iterator.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
position
()¶ Returns the position of the iterator.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: The iterator’s position, expressed as an integer.
-
reset
()¶ Resets the iterator to the initial position.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
seek
(a)¶ Advance the iterator to a new position.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Parameters: a (int) – The position to seek to.
-
set_flags
(flags, mask)¶ Sets the current iterator behavioral flags.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Parameters:
-
value
()¶ Returns the current arc.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
class
kaldi.fstext.
LogConstFstStateIterator
(fst)[source]¶ State iterator for a constant FST over the log semiring.
This class is used for iterating over the states. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the
states
method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API. Creates a new state iterator.
Parameters: fst – The fst. -
done
()¶ Indicates whether the iterator is exhausted or not.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: True if the iterator is exhausted, False otherwise.
-
next
()¶ Advances the iterator.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
reset
()¶ Resets the iterator to the initial position.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
value
()¶ Returns the current state index.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
-
class
kaldi.fstext.
LogEncodeMapper
(encode_labels=False, encode_weights=False, encode=True)[source]¶ Arc encoder for an FST over the log semiring.
This class provides an object which can be used to encode or decode FST arcs. This is most useful for converting an FST to an unweighted acceptor, on which some FST operations are more efficient, and then decoding the FST afterwards.
To use an instance of this class to encode or decode a mutable FST, pass it as the first argument to the FST instance methods
encode
and
decode
. Alternatively, an instance of this class can be used as a callable to encode/decode arcs. Parameters: -
flags
() → int¶ Returns encoder flags.
-
from_other
(mapper:LogEncodeMapper) → LogEncodeMapper¶ Creates a new encoder with the contents of another.
-
from_other_with_type
(mapper:LogEncodeMapper, type:EncodeType) → LogEncodeMapper¶ Creates a new encoder with the contents of another and given type.
-
input_symbols
() → SymbolTable¶ Returns input symbol table.
-
output_symbols
() → SymbolTable¶ Returns output symbol table.
-
properties
(inprops:int) → int¶ Provides property bits.
This method provides user access to the properties attributes for the encoder. The resulting value is a long integer, but when it is cast to a boolean, it represents whether or not the FST has the
mask
property. Parameters: mask – The property mask to be compared to the encoder’s properties. Returns: A 64-bit bitmask representing the requested properties.
-
read
(filename:str, type:EncodeType=default) → LogEncodeMapper¶ Reads encoder from file.
-
set_input_symbols
(syms:SymbolTable)¶ Sets the input symbol table.
Parameters: syms – A SymbolTable. See also:
set_output_symbols
.
-
set_output_symbols
(syms:SymbolTable)¶ Sets the output symbol table.
Parameters: syms – A SymbolTable. See also:
set_input_symbols
.
-
type
() → EncodeType¶ Returns encoder type.
-
write
(filename:str) → bool¶ Writes encoder to file.
Returns: True if write was successful, False otherwise.
-
-
class
kaldi.fstext.
LogEncodeTable
¶ Encode table for LogArc.
- LogEncodeTable(flags):
- Creates a new encode table with the given flags.
-
decode
(key:int) → Tuple¶ Decodes an encoded arc label back to labels and cost.
-
encode
(arc:LogArc) → int¶ Encodes the given arc (either labels or weights or both).
-
flags
() → int¶ Returns encoding flags.
-
get_label
(arc:LogArc) → int¶ Looks up the encoded label for the given arc.
Returns -1 if arc is not found.
-
input_symbols
() → SymbolTable¶ Returns input symbols.
-
output_symbols
() → SymbolTable¶ Returns output symbols.
-
read
(strm:istream, source:str) → LogEncodeTable¶ Reads encode table from input stream.
-
set_input_symbols
(syms:SymbolTable)¶ Sets input symbols.
-
set_output_symbols
(syms:SymbolTable)¶ Sets output symbols.
-
size
() → int¶ Returns the size of the table.
-
write
(strm:ostream, source:str) → bool¶ Writes table to output stream.
-
class
kaldi.fstext.
LogFstCompiler
(isymbols=None, osymbols=None, ssymbols=None, acceptor=False, keep_isymbols=False, keep_osymbols=False, keep_state_numbering=False, allow_negative_labels=False)[source]¶ Compiler for FSTs over the log semiring.
This class is used to compile FSTs specified using the AT&T FSM library format described here:
http://web.eecs.umich.edu/~radev/NLP-fall2015/resources/fsm_archive/fsm.5.html
This is the same format used by the
fstcompile
executable. FstCompiler options (symbol tables, etc.) are set at construction time:
compiler = FstCompiler(isymbols=ascii_syms, osymbols=ascii_syms)
Once constructed, FstCompiler instances behave like a file handle opened for writing:
# /ba+/
print("0 1 50 50", file=compiler)
print("1 2 49 49", file=compiler)
print("2 2 49 49", file=compiler)
print("2", file=compiler)
The
compile
method returns an actual FST instance:
sheep_machine = compiler.compile()
Compilation flushes the internal buffer, so the compiler instance can be reused to compile new machines with the same symbol tables, etc.
Parameters: - isymbols – An optional SymbolTable used to label input symbols.
- osymbols – An optional SymbolTable used to label output symbols.
- ssymbols – An optional SymbolTable used to label states.
- acceptor – Should the FST be rendered in acceptor format if possible?
- keep_isymbols – Should the input symbol table be stored in the FST?
- keep_osymbols – Should the output symbol table be stored in the FST?
- keep_state_numbering – Should the state numbering be preserved?
- allow_negative_labels – Should negative labels be allowed? (Not recommended; may cause conflicts).
-
compile
()¶ Compiles the FST in the string buffer.
This method compiles the FST and returns the resulting machine.
Returns: The FST described by the string buffer. Raises: RuntimeError
– Compilation failed.
-
write
(expression)¶ Writes a string into the compiler string buffer.
This method adds a line to the compiler string buffer. It can also be invoked with a print call, like so:
compiler = FstCompiler()
print("0 0 49 49", file=compiler)
print("0", file=compiler)
Parameters: expression – A string expression to add to compiler string buffer.
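The reason `print(..., file=compiler)` works is that `print` only needs an object with a `write` method. The class below is a hypothetical, much simplified stand-in that buffers text and parses it into arc tuples on `compile`. It assumes numeric labels in transducer format (four or five fields per arc line) and ignores weights, so it is not a substitute for the real compiler.

```python
class TextCompilerSketch:
    """Hypothetical buffer-and-parse object; not the PyKaldi compiler."""

    def __init__(self):
        self._chunks = []

    def write(self, expression):
        # print(..., file=self) calls write() for the text and for the newline.
        self._chunks.append(expression)

    def compile(self):
        text = "".join(self._chunks)
        self._chunks = []  # Flush the buffer so the object can be reused.
        arcs, finals = [], set()
        for line in text.splitlines():
            fields = line.split()
            if not fields:
                continue
            if len(fields) >= 4:
                # Arc line: source, destination, input label, output label.
                src, dst, ilabel, olabel = (int(f) for f in fields[:4])
                arcs.append((src, dst, ilabel, olabel))
            else:
                # Final-state line: the first field is the state index.
                finals.add(int(fields[0]))
        return arcs, finals


compiler = TextCompilerSketch()
print("0 1 50 50", file=compiler)
print("1 2 49 49", file=compiler)
print("2 2 49 49", file=compiler)
print("2", file=compiler)
arcs, finals = compiler.compile()
```

Calling `compile` a second time without writing anything returns an empty result, which mirrors the buffer flush described for the real compiler.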
-
class
kaldi.fstext.
LogVectorFst
(fst=None)[source]¶ Vector FST over the log semiring.
Parameters: fst (LogFst) – The input FST over the log semiring. If provided, its contents are used for initializing the new FST. Defaults to None
.-
add_arc
(state, arc)¶ Adds a new arc to the FST and returns self.
Parameters: - state – The integer index of the source state.
- arc – The arc to add.
Returns: self.
Raises: IndexError
– State index out of range. See also:
add_state
.
-
add_state
()¶ Adds a new state to the FST and returns the state ID.
Returns: The integer index of the new state.
-
arcs
(state)¶ Returns an iterator over arcs leaving the specified state.
Parameters: state – The source state index. Returns: An ArcIterator. See also:
mutable_arcs
,states
.
-
arcsort
(sort_type='ilabel')¶ Sorts arcs leaving each state of the FST.
This operation destructively sorts arcs leaving each state using either input or output labels.
Parameters: sort_type – Either “ilabel” (sort arcs according to input labels) or “olabel” (sort arcs according to output labels). Returns: self. Raises: ValueError
– Unknown sort type. See also:
topsort
.
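The sorting behaviour, including the `ValueError` for an unknown sort type, can be sketched in plain Python. The function below is hypothetical and operates on a list of per-state arc lists, where each arc is an `(ilabel, olabel, weight, nextstate)` tuple. Ties keep their original order here, which may differ from the library's comparison.

```python
def arcsort(arcs_by_state, sort_type="ilabel"):
    """Sort each state's arcs in place by input or output label."""
    if sort_type == "ilabel":
        key = lambda arc: arc[0]
    elif sort_type == "olabel":
        key = lambda arc: arc[1]
    else:
        raise ValueError("Unknown sort type: {!r}".format(sort_type))
    for arcs in arcs_by_state:
        arcs.sort(key=key)
    return arcs_by_state


# Arcs are (ilabel, olabel, weight, nextstate) tuples, one list per state.
fst_arcs = [[(3, 1, 0.0, 1), (1, 2, 0.0, 1)], [(2, 9, 0.0, 0), (2, 4, 0.0, 1)]]
arcsort(fst_arcs, "ilabel")
by_ilabel = [list(arcs) for arcs in fst_arcs]
arcsort(fst_arcs, "olabel")
by_olabel = [list(arcs) for arcs in fst_arcs]
```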
-
closure
(closure_plus=False)¶ Computes concatenative closure.
This operation destructively converts the FST to its concatenative closure. If A transduces string x to y with weight a, then the closure transduces x to y with weight a, xx to yy with weight a \otimes a, xxx to yyy with weight a \otimes a \otimes a, and so on. The empty string is also transduced to itself with semiring One if
closure_plus
is False. Parameters: closure_plus – If True, do not accept the empty string. Returns: self.
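The weights assigned by the closure can be tabulated for a toy case. The sketch below assumes that A accepts a single pair x to y with weight `a`, and that the semiring product is addition with One equal to 0.0, as in the log and tropical semirings. The function name `closure_weights` and the cut-off `max_repeats` are introduced only for this illustration.

```python
def closure_weights(a, max_repeats, closure_plus=False, one=0.0):
    """Map repeat count n to the weight of x^n -> y^n, up to max_repeats.

    Assumes a semiring whose product is addition and whose One is 0.0,
    as in the log and tropical semirings.
    """
    weights = {}
    if not closure_plus:
        weights[0] = one  # The empty string maps to itself with weight One.
    total = one
    for n in range(1, max_repeats + 1):
        total = total + a  # Product of n copies of a.
        weights[n] = total
    return weights


star = closure_weights(0.5, 3)
plus = closure_weights(0.5, 3, closure_plus=True)
```

The only difference between the two results is the entry for zero repetitions, which is present when `closure_plus` is False and absent otherwise.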
-
concat
(ifst)¶ Computes the concatenation (product) of two FSTs.
This operation destructively concatenates the FST with a second FST. If A transduces string x to y with weight a and B transduces string w to v with weight b, then their concatenation transduces string xw to yv with weight a \otimes b.
Parameters: ifst – The second input FST. Returns: self.
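The relation computed by concatenation can be sketched with dictionaries. This is a hypothetical illustration, not the library routine. It represents each FST as `{(input, output): weight}`, assumes the log semiring product (addition), and assumes that no two combinations produce the same concatenated pair, so no semiring sum is needed.

```python
def concat(a, b):
    """Concatenate two weighted relations given as {(input, output): weight}.

    Weights are combined with the semiring product, which is addition in the
    log semiring. Assumes no two combinations yield the same string pair.
    """
    result = {}
    for (x, y), wa in a.items():
        for (w, v), wb in b.items():
            result[(x + w, y + v)] = wa + wb
    return result


A = {("x", "y"): 1.0}
B = {("w", "v"): 0.25, ("u", "t"): 2.0}
AB = concat(A, B)
```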
-
connect
()¶ Removes unsuccessful paths.
This operation destructively trims the FST, removing states and arcs that are not part of any successful path.
Returns: self.
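A state lies on a successful path exactly when it is reachable from the start state and can itself reach a final state. The sketch below computes that set with two graph searches. The function name `useful_states` and the arc-pair representation are assumptions of this illustration, and the sketch only reports which states survive trimming without renumbering them.

```python
def useful_states(num_states, start, finals, arcs):
    """Return the set of states lying on some path from start to a final state."""
    forward = {s: [] for s in range(num_states)}
    backward = {s: [] for s in range(num_states)}
    for src, dst in arcs:
        forward[src].append(dst)
        backward[dst].append(src)

    def reachable(seeds, graph):
        seen, stack = set(seeds), list(seeds)
        while stack:
            for nxt in graph[stack.pop()]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    accessible = reachable([start], forward)    # Reachable from the start.
    coaccessible = reachable(finals, backward)  # Can reach a final state.
    return accessible & coaccessible


# State 3 is a dead end and state 4 cannot be reached from the start state.
kept = useful_states(5, 0, [2], [(0, 1), (1, 2), (1, 3), (4, 2)])
```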
-
copy
()¶ Makes a copy of the FST.
Returns: A copy of the FST.
-
decode
(encoder)¶ Decodes encoded labels and/or weights.
This operation reverses the encoding performed by
encode
. Parameters: encoder – An EncodeMapper object used to encode the FST. Returns: self. See also:
encode
.
-
delete_arcs
(state, n=None)¶ Deletes arcs leaving a particular state.
Parameters: - state – The integer index of a state.
- n – An optional argument indicating how many arcs to be deleted. If this argument is None, all arcs from this state are deleted.
Returns: self.
Raises: IndexError
– State index out of range. See also:
delete_states
.
-
delete_states
(states=None)¶ Deletes states.
Parameters: states – An optional iterable of integer indices of the states to be deleted. If this argument is omitted, all states are deleted. Returns: self. Raises: IndexError
– State index out of range. See also:
delete_arcs
.
-
draw
(filename, isymbols=None, osymbols=None, ssymbols=None, acceptor=False, title='', width=8.5, height=11, portrait=False, vertical=False, ranksep=0.4, nodesep=0.25, fontsize=14, precision=5, float_format='g', show_weight_one=False)¶ Writes out the FST in Graphviz text format.
This method writes out the FST in the dot graph description language. The graph can be rendered using the
dot
binary provided by Graphviz. Parameters: - filename (str) – The string location of the output dot/Graphviz file.
- isymbols – An optional symbol table used to label input symbols.
- osymbols – An optional symbol table used to label output symbols.
- ssymbols – An optional symbol table used to label states.
- acceptor (bool) – Should the figure be rendered in acceptor format if possible? Defaults to False.
- title (str) – An optional string indicating the figure title. Defaults to empty string.
- width (float) – The figure width, in inches. Defaults to 8.5''.
- height (float) – The figure height, in inches. Defaults to 11''.
- portrait (bool) – Should the figure be rendered in portrait rather than landscape? Defaults to False.
- vertical (bool) – Should the figure be rendered bottom-to-top rather than left-to-right?
- ranksep (float) – The minimum separation between ranks, in inches. Defaults to 0.4''.
- nodesep (float) – The minimum separation between nodes, in inches. Defaults to 0.25''.
- fontsize (int) – Font size, in points. Defaults to 14pt.
- precision (int) – Numeric precision for floats, in number of chars. Defaults to 5.
- float_format ('e', 'f' or 'g') – One of: ‘e’, ‘f’ or ‘g’. Defaults to ‘g’.
- show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults to False.
For more information about the rendering options, see
man dot
. See also:
text
.
-
encode
(encoder)¶ Encodes labels and/or weights.
This operation allows for the representation of a weighted transducer as a weighted acceptor, an unweighted transducer, or an unweighted acceptor by considering the pair (input label, output label), the pair (input label, weight), or the triple (input label, output label, weight) as a single label. Applying this operation mutates the EncodeMapper argument, which can then be used to decode.
Parameters: encoder – An EncodeMapper object used to encode the FST. Returns: self. See also:
decode
.
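The idea of treating a tuple of arc fields as a single label can be sketched with a small lookup table. `EncodeTableSketch` is a hypothetical class written for this illustration. It mirrors the `encode_labels` and `encode_weights` options in spirit only and does not reproduce the library's storage or numbering.

```python
class EncodeTableSketch:
    """Hypothetical table mapping arc tuples to single labels and back."""

    def __init__(self, encode_labels=True, encode_weights=True):
        self.encode_labels = encode_labels
        self.encode_weights = encode_weights
        self._key_to_label = {}
        self._label_to_key = []

    def encode(self, ilabel, olabel, weight):
        # Fields that are not being encoded are dropped from the key.
        key = (ilabel,
               olabel if self.encode_labels else None,
               weight if self.encode_weights else None)
        if key not in self._key_to_label:
            self._label_to_key.append(key)
            # Labels start at 1 because 0 is reserved for epsilon.
            self._key_to_label[key] = len(self._label_to_key)
        return self._key_to_label[key]

    def decode(self, label):
        return self._label_to_key[label - 1]


table = EncodeTableSketch()
first = table.encode(3, 7, 0.5)
second = table.encode(3, 8, 0.5)
again = table.encode(3, 7, 0.5)
```

Because the table grows as new tuples are seen, encoding mutates it, and the same table must be kept around to decode afterwards.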
-
final
(state)¶ Returns the final weight of a state.
Parameters: state – The integer index of a state. Returns: The final Weight of that state. Raises: IndexError
– State index out of range.
-
from_bytes
(s)¶ Returns the FST represented by the bytes object.
Parameters: s (bytes) – The bytes object representing the FST. Returns: An FST object.
-
input_symbols
()¶ Returns the input symbol table.
Returns: The input symbol table. See Also:
output_symbols()
.
-
invert
()¶ Inverts the FST’s transduction.
This operation destructively inverts the FST’s transduction by exchanging input and output labels.
Returns: self.
-
minimize
(delta=0.0009765625, allow_nondet=False)¶ Minimizes the FST.
This operation destructively minimizes deterministic weighted automata and transducers. If the input FST A is an acceptor, this operation produces the minimal acceptor B equivalent to A, i.e. the acceptor with a minimal number of states that is equivalent to A. If the input FST A is a transducer, this operation internally builds an equivalent transducer with a minimal number of states. However, this minimality is obtained by allowing transitions to have strings of symbols as output labels; this is known in the literature as a real-time transducer. Such transducers are not directly supported by the library, so this function converts them by expanding each string-labeled transition into a sequence of transitions. This creates new states, so the minimality property is lost.
Parameters: - delta – Comparison/quantization delta (default: 0.0009765625).
- allow_nondet – Attempt minimization of non-deterministic FST?
Returns: self.
-
mutable_arcs
(state)¶ Returns a mutable iterator over arcs leaving the specified state.
Parameters: state – The source state index. Returns: A MutableArcIterator.
-
num_arcs
(state=None)¶ Returns the number of arcs, counting them if necessary.
If state is None, returns the number of arcs in the FST. Otherwise, returns the number of arcs leaving that state.
Parameters: state – The integer index of a state. Defaults to None.
Returns: The number of arcs leaving a state or the number of arcs in the FST.
Note: This method counts the number of arcs in the FST by iterating over the states and summing up the number of arcs leaving each state.
Raises: IndexError – State index out of range.
See also: num_states.
-
num_input_epsilons
(state)¶ Returns the number of arcs with epsilon input labels leaving a state.
Parameters: state – The integer index of a state.
Returns: The number of epsilon-input-labeled arcs leaving that state.
Raises: IndexError – State index out of range.
See also: num_output_epsilons.
-
num_output_epsilons
(state)¶ Returns the number of arcs with epsilon output labels leaving a state.
Parameters: state – The integer index of a state.
Returns: The number of epsilon-output-labeled arcs leaving that state.
Raises: IndexError – State index out of range.
See also: num_input_epsilons.
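The counting behaviour described for num_arcs, num_input_epsilons and num_output_epsilons can be mimicked on a toy structure. The sketch below assumes a dictionary from state index to a list of (ilabel, olabel, weight, nextstate) tuples and the OpenFst convention that label 0 denotes epsilon; the helper functions are hypothetical and do not call PyKaldi.

```python
EPSILON = 0  # OpenFst convention: label 0 denotes epsilon

# state -> list of (ilabel, olabel, weight, nextstate); a toy model, not an FST object
toy_fst = {
    0: [(1, 1, 0.0, 1), (0, 2, 0.5, 2)],
    1: [(0, 0, 0.0, 2)],
    2: [],
}


def _check_state(fst, state):
    if state not in fst:
        raise IndexError("State index out of range")


def num_arcs(fst, state=None):
    # With no state given, sum the arcs leaving every state.
    if state is None:
        return sum(len(arcs) for arcs in fst.values())
    _check_state(fst, state)
    return len(fst[state])


def num_input_epsilons(fst, state):
    _check_state(fst, state)
    return sum(1 for arc in fst[state] if arc[0] == EPSILON)


def num_output_epsilons(fst, state):
    _check_state(fst, state)
    return sum(1 for arc in fst[state] if arc[1] == EPSILON)
```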
-
num_states
()¶ Returns the number of states, counting them if necessary.
Returns: The number of states. See also:
num_arcs
.
-
output_symbols
()¶ Returns the output symbol table.
Returns: The output symbol table. See Also:
input_symbols()
.
-
project
(project_output=False)¶ Converts the FST to an acceptor using input or output labels.
This operation destructively projects an FST onto its domain or range by either copying each arc’s input label to its output label (the default) or vice versa.
Parameters: project_output – Project onto output labels? Returns: self. See also:
decode
,encode
,relabel
,relabel_tables
.
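Projection as described above amounts to copying one label of each arc onto the other. A minimal sketch, assuming arcs are (ilabel, olabel, weight, nextstate) tuples; the function below is a hypothetical pure-Python illustration that returns a new list rather than mutating in place as the method does.

```python
def project(arcs, project_output=False):
    """Copy the input label over the output label, or the reverse."""
    projected = []
    for ilabel, olabel, weight, nextstate in arcs:
        label = olabel if project_output else ilabel
        projected.append((label, label, weight, nextstate))
    return projected


arcs = [(1, 5, 0.5, 1), (2, 6, 1.0, 2)]
on_input = project(arcs)
on_output = project(arcs, project_output=True)
```

Either way the result is an acceptor: every arc carries the same input and output label.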
-
properties
(mask, test)¶ Provides property bits.
This method provides user access to the properties attributes for the FST. The resulting value is a long integer, but when it is cast to a boolean, it represents whether or not the FST has the mask property.
Parameters: - mask – The property mask to be compared to the FST’s properties.
- test – Should any unknown values be computed before comparing against the mask?
Returns: A 64-bit bitmask representing the requested properties.
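The "integer that is truthy when the property holds" convention can be illustrated with ordinary bit masks. The constants below are hypothetical bit positions chosen for the example; the real OpenFst property constants have different values, and toy_properties is not a PyKaldi function.

```python
# Hypothetical property bits for illustration; the real constants differ.
TOY_ACCEPTOR = 1 << 0
TOY_ACYCLIC = 1 << 1
TOY_TOP_SORTED = 1 << 2


def toy_properties(known_bits, mask):
    """Return the subset of known_bits selected by mask."""
    return known_bits & mask


known = TOY_ACCEPTOR | TOY_ACYCLIC
is_acceptor = bool(toy_properties(known, TOY_ACCEPTOR))
is_sorted = bool(toy_properties(known, TOY_TOP_SORTED))
```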
-
prune
(weight=None, nstate=-1, delta=0.0009765625)¶ Removes paths with weights below a certain threshold.
This operation deletes states and arcs in the input FST that do not belong to a successful path whose weight is no more (w.r.t. the natural semiring order) than the threshold ⊗ the weight of the shortest path in the input FST. Weights must be commutative and have the path property.
Parameters: - weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired weight threshold below which paths are pruned; if None, no paths are pruned.
- nstate – State number threshold (default: -1).
- delta – Comparison/quantization delta (default: 0.0009765625).
Returns: self.
See also: The constructive variant.
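In the tropical semiring, ⊗ is ordinary addition and the natural order is the usual numeric order, so the pruning criterion reduces to a simple comparison. The sketch below works on an explicit, hypothetical table of complete paths and their total weights; the actual operation works on states and arcs rather than enumerated paths.

```python
def prune_paths(path_weights, threshold):
    """Keep paths whose weight is at most threshold (+) the best path weight.

    Assumes tropical weights: smaller is better and otimes is addition.
    """
    best = min(path_weights.values())
    limit = threshold + best
    return {path: w for path, w in path_weights.items() if w <= limit}


paths = {"a b": 1.0, "a c": 2.5, "d": 4.0}
kept = prune_paths(paths, 2.0)
```

With a threshold of 2.0 and a best path of weight 1.0, only paths of weight at most 3.0 survive.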
-
push
(to_final=False, delta=0.0009765625, remove_total_weight=False)¶ Pushes weights towards the initial or final states.
This operation destructively produces an equivalent transducer by pushing the weights towards the initial state or toward the final states. When pushing weights towards the initial state, the sum of the weight of the outgoing transitions and final weight at any non-initial state is equal to one in the resulting machine. When pushing weights towards the final states, the sum of the weight of the incoming transitions at any state is equal to one. Weights need to be left distributive when pushing towards the initial state and right distributive when pushing towards the final states.
Parameters: - to_final – Push towards final states?
- delta – Comparison/quantization delta (default: 0.0009765625).
- remove_total_weight – If pushing weights, should the total weight be removed?
Returns: self.
See also: The constructive variant, which also supports label pushing.
-
read
(filename)¶ Reads an FST from a file.
Parameters: filename (str) – The location of the input file. Returns: An FST object. Raises: RuntimeError
– Read failed.
-
read_from_stream
(strm, ropts)¶ Reads an FST from an input stream.
Parameters: - strm (istream) – The input stream to read from.
- ropts (FstReadOptions) – FST reading options.
Returns: An FST object.
Raises: RuntimeError
– Read failed.
-
relabel
(ipairs=None, opairs=None)¶ Replaces input and/or output labels using pairs of labels.
This operation destructively relabels the input and/or output labels of the FST using pairs of the form (old_ID, new_ID); omitted indices are identity-mapped.
Parameters: - ipairs – An iterable containing (old index, new index) integer pairs.
- opairs – An iterable containing (old index, new index) integer pairs.
Returns: self.
Raises: ValueError – No relabeling pairs specified.
See also: decode, encode, project, relabel_tables.
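The identity mapping of omitted labels can be sketched with dictionaries. This assumes arcs are (ilabel, olabel, weight, nextstate) tuples; the function is a hypothetical non-destructive illustration and not the PyKaldi method.

```python
def relabel(arcs, ipairs=None, opairs=None):
    """Relabel arcs using (old, new) pairs; labels without a pair are unchanged."""
    if not ipairs and not opairs:
        raise ValueError("No relabeling pairs specified")
    imap = dict(ipairs or [])
    omap = dict(opairs or [])
    return [
        (imap.get(ilabel, ilabel), omap.get(olabel, olabel), weight, nextstate)
        for ilabel, olabel, weight, nextstate in arcs
    ]


arcs = [(1, 1, 0.0, 1), (2, 3, 0.5, 2)]
relabeled = relabel(arcs, ipairs=[(1, 10)], opairs=[(3, 30)])
```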
-
relabel_tables
(old_isymbols=None, new_isymbols=None, unknown_isymbol='', attach_new_isymbols=True, old_osymbols=None, new_osymbols=None, unknown_osymbol='', attach_new_osymbols=True)¶ Replaces input and/or output labels using SymbolTables.
This operation destructively relabels the input and/or output labels of the FST using user-specified symbol tables; omitted symbols are identity-mapped.
Parameters: - old_isymbols – The old SymbolTable for input labels, defaulting to the FST’s input symbol table.
- new_isymbols – A SymbolTable used to relabel the input labels.
- unknown_isymbol – Input symbol to use to relabel OOVs (if empty, OOVs raise an exception).
- attach_new_isymbols – Should new_isymbols be made the FST’s input symbol table?
- old_osymbols – The old SymbolTable for output labels, defaulting to the FST’s output symbol table.
- new_osymbols – A SymbolTable used to relabel the output labels.
- unknown_osymbol – Output symbol to use to relabel OOVs (if empty, OOVs raise an exception).
- attach_new_osymbols – Should new_osymbols be made the FST’s output symbol table?
Returns: self.
Raises: ValueError
– No SymbolTable specified.
-
reserve_arcs
(state, n)¶ Reserve n arcs at a particular state (best effort).
Parameters: - state – The integer index of a state.
- n – The number of arcs to reserve.
Returns: self.
Raises: IndexError – State index out of range.
See also: reserve_states.
-
reserve_states
(n)¶ Reserve n states (best effort).
Parameters: n – The number of states to reserve. Returns: self. See also:
reserve_arcs
.
-
reweight
(potentials, to_final=False)¶ Reweights an FST using an iterable of potentials.
This operation destructively reweights an FST according to the potentials and in the direction specified by the user. An arc of weight w, with an origin state of potential p and destination state of potential q, is reweighted by p^{-1} ⊗ (w ⊗ q) when reweighting towards the initial state, and by (p ⊗ w) ⊗ q^{-1} when reweighting towards the final states. The weights must be left distributive when reweighting towards the initial state and right distributive when reweighting towards the final states (e.g., TropicalWeight and LogWeight).
Parameters: - potentials – An iterable of TropicalWeights.
- to_final – Push towards final states?
Returns: self.
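In the tropical semiring ⊗ is addition and the inverse is negation, so the formula for reweighting towards the initial state becomes w + q - p. The sketch below applies that formula to a toy FST. It assumes arcs stored as (ilabel, olabel, weight, nextstate) tuples per state, and it assumes final weights are treated like arcs into a notional destination of potential One, giving final - p; both the data layout and the function name are hypothetical.

```python
def reweight_to_initial(arcs, finals, potentials):
    """Apply p^{-1} (x) (w (x) q) with tropical weights, i.e. (w + q) - p."""
    new_arcs = {}
    for state, outgoing in arcs.items():
        p = potentials[state]
        new_arcs[state] = [
            (ilabel, olabel, (w + potentials[dst]) - p, dst)
            for ilabel, olabel, w, dst in outgoing
        ]
    # Assumption: a final weight behaves like an arc to a state of potential One (0.0).
    new_finals = {state: w - potentials[state] for state, w in finals.items()}
    return new_arcs, new_finals


arcs = {0: [(1, 1, 1.0, 1)], 1: [(2, 2, 2.0, 2)], 2: []}
finals = {2: 0.25}
potentials = [0.0, 0.5, 3.0]

new_arcs, new_finals = reweight_to_initial(arcs, finals, potentials)
old_path = arcs[0][0][2] + arcs[1][0][2] + finals[2]
new_path = new_arcs[0][0][2] + new_arcs[1][0][2] + new_finals[2]
```

Because the start state has potential One here, the intermediate potentials cancel along the path and the total path weight is unchanged.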
-
rmepsilon
(connect=True, weight=None, nstate=-1, delta=0.0009765625)¶ Removes epsilon transitions.
This operation destructively removes epsilon transitions (i.e., those where both input and output labels are epsilon) from an FST.
Parameters: - connect – Should output be trimmed?
- weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired weight threshold below which paths are pruned; if None, no paths are pruned.
- nstate – State number threshold (default: -1).
- delta – Comparison/quantization delta (default: 0.0009765625).
Returns: self.
See also: The constructive variant, which also supports epsilon removal in reverse (and which may be more efficient).
-
set_final
(state, weight=None)¶ Sets the final weight for a state.
Parameters: - state – The integer index of a state.
- weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired final weight; if omitted, it is set to semiring One.
Raises: IndexError – State index out of range.
See also: set_start.
-
set_input_symbols
(syms)¶ Sets the input symbol table.
Passing None as a value will delete the input symbol table.
Parameters: syms – A SymbolTable.
Returns: self.
See also: set_output_symbols.
-
set_output_symbols
(syms)¶ Sets the output symbol table.
Passing None as a value will delete the output symbol table.
Parameters: syms – A SymbolTable.
Returns: self.
See also: set_input_symbols.
-
set_properties
(props, mask)¶ Sets the properties bits.
Parameters: - props – The property bits to set.
- mask – A mask selecting which property bits are affected.
Returns: self.
-
set_start
(state)¶ Sets the initial state.
Parameters: state – The integer index of a state.
Returns: self.
Raises: IndexError – State index out of range.
See also: set_final.
-
start
()¶ Returns the start state.
Returns: The start state if start state is set, -1 otherwise.
-
states
()¶ Returns an iterator over all states in the FST.
Returns: A StateIterator object for the FST. See also:
arcs
,mutable_arcs
.
-
text
(isymbols=None, osymbols=None, ssymbols=None, acceptor=False, show_weight_one=False, missing_symbol='')¶ Produces a human-readable string representation of the FST.
This method generates a human-readable string representation of the FST. The caller may optionally specify SymbolTables used to label input labels, output labels, or state labels, respectively.
Parameters: - isymbols – An optional symbol table used to label input symbols.
- osymbols – An optional symbol table used to label output symbols.
- ssymbols – An optional symbol table used to label states.
- acceptor (bool) – Should the FST be rendered in acceptor format if possible? Defaults to False.
- show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults to False.
- missing_symbol – The string to be printed when symbol table lookup fails.
Returns: A formatted string representing the FST.
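A rough idea of such a textual listing, and of the effect of show_weight_one, is sketched below. It assumes the AT&T-style column order (source state, destination state, input label, output label, optional weight), integer labels without symbol tables, and tropical One equal to 0.0. The function to_text is hypothetical and its exact spacing and ordering are not guaranteed to match what the text method produces.

```python
def to_text(arcs, finals, show_weight_one=False, one=0.0):
    """Render a toy FST as tab-separated lines, one arc or final state per line."""
    lines = []
    for state in sorted(arcs):
        for ilabel, olabel, weight, nextstate in arcs[state]:
            fields = [state, nextstate, ilabel, olabel]
            # Weights equal to semiring One are omitted unless requested.
            if show_weight_one or weight != one:
                fields.append(weight)
            lines.append("\t".join(str(f) for f in fields))
        if state in finals:
            fields = [state]
            if show_weight_one or finals[state] != one:
                fields.append(finals[state])
            lines.append("\t".join(str(f) for f in fields))
    return "\n".join(lines)


toy_arcs = {0: [(1, 2, 0.5, 1)], 1: []}
toy_finals = {1: 0.0}
listing = to_text(toy_arcs, toy_finals)
verbose_listing = to_text(toy_arcs, toy_finals, show_weight_one=True)
```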
-
to_bytes
()¶ Returns a bytes object representing the FST.
Returns: A bytes object.
-
topsort
()¶ Sorts transitions by state IDs.
This operation destructively topologically sorts the FST, if it is acyclic; otherwise it remains unchanged. Once sorted, all transitions are from lower state IDs to higher state IDs.
Returns: self. See also:
arcsort
.
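The property "all transitions go from lower to higher state IDs" can be illustrated with a small topological ordering routine. The sketch uses Kahn's algorithm on a plain adjacency dictionary, which is a choice made for this example and not necessarily how the library implements topsort; it returns None for cyclic input to mirror the "remains unchanged" behaviour.

```python
from collections import deque


def topological_order(successors, num_states):
    """Return states in topological order, or None if the graph has a cycle."""
    indegree = [0] * num_states
    for src in range(num_states):
        for dst in successors.get(src, []):
            indegree[dst] += 1
    queue = deque(s for s in range(num_states) if indegree[s] == 0)
    order = []
    while queue:
        state = queue.popleft()
        order.append(state)
        for dst in successors.get(state, []):
            indegree[dst] -= 1
            if indegree[dst] == 0:
                queue.append(dst)
    return order if len(order) == num_states else None


successors = {0: [2], 2: [1], 1: []}
order = topological_order(successors, 3)
# Map each old state ID to its new, topologically sorted ID.
new_id = {old: new for new, old in enumerate(order)}
cyclic = topological_order({0: [1], 1: [0]}, 2)
```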
-
type
()¶ Returns the FST type.
Returns: The FST type.
-
union
(ifst)¶ Computes the union (sum) of two FSTs.
This operation computes the union (sum) of two FSTs. If A transduces string x to y with weight a and B transduces string w to v with weight b, then their union transduces x to y with weight a and w to v with weight b.
Parameters: ifst – The second input FST. Returns: self.
-
verify
()¶ Verifies that an FST’s contents are sane.
Returns: True if the contents are sane, False otherwise.
-
write
(filename)¶ Serializes FST to a file.
This method writes the FST to a file in a binary format.
Parameters: filename (str) – The location of the output file. Raises: IOError
– Write failed.
-
write_to_stream
(strm, wopts)¶ Serializes FST to an output stream.
Parameters: - strm (ostream) – The output stream to write to.
- wopts (FstWriteOptions) – FST writing options.
Returns: True if write was successful, False otherwise.
Raises: RuntimeError
– Write failed.
-
-
class
kaldi.fstext.
LogVectorFstArcIterator
(fst, state)[source]¶ Arc iterator for a vector FST over the log semiring.
This class is used for iterating over the arcs leaving some state. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the arcs method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.
Creates a new arc iterator.
Parameters: - fst – The fst.
- state – The state index.
Raises: IndexError – State index out of range.
-
done
()¶ Indicates whether the iterator is exhausted or not.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: True if the iterator is exhausted, False otherwise.
-
flags
()¶ Returns the current iterator behavioral flags.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: The current iterator behavioral flags as an integer.
-
next
()¶ Advances the iterator.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
position
()¶ Returns the position of the iterator.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: The iterator’s position, expressed as an integer.
-
reset
()¶ Resets the iterator to the initial position.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
seek
(a)¶ Advance the iterator to a new position.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Parameters: a (int) – The position to seek to.
-
set_flags
(flags, mask)¶ Sets the current iterator behavioral flags.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Parameters: - flags – The flag bits to set.
- mask – A mask selecting which flag bits are affected.
-
value
()¶ Returns the current arc.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
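The difference between the C++-style methods (done, next, value, reset) and the iterator protocol can be seen on a toy class. ToyArcIterator below is hypothetical: it walks a plain Python list and only imitates the method names listed here, so it says nothing about how the wrapped iterator is implemented.

```python
class ToyArcIterator:
    """Toy iterator exposing both a C++-style API and the iterator protocol."""

    def __init__(self, arcs):
        self._arcs = list(arcs)
        self._pos = 0

    def done(self):
        return self._pos >= len(self._arcs)

    def next(self):
        self._pos += 1

    def value(self):
        return self._arcs[self._pos]

    def position(self):
        return self._pos

    def reset(self):
        self._pos = 0

    def seek(self, a):
        self._pos = a

    def __iter__(self):
        self.reset()
        while not self.done():
            yield self.value()
            self.next()


arcs = [(1, 1, 0.0, 1), (2, 2, 0.5, 2)]

# C++-style loop
cpp_style = []
it = ToyArcIterator(arcs)
while not it.done():
    cpp_style.append(it.value())
    it.next()

# Pythonic loop
pythonic = [arc for arc in ToyArcIterator(arcs)]
```

Both loops visit the same arcs; the second form is what the arcs method of an FST object is meant to offer.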
-
class
kaldi.fstext.
LogVectorFstMutableArcIterator
(fst, state)[source]¶ Mutable arc iterator for a vector FST over the log semiring.
This class is used for iterating over the arcs leaving some state and optionally replacing them with new ones. In addition to the full C++ API, it also supports the iterator protocol. Calling the __iter__ method of a mutable arc iterator object returns an iterator over (arc, setter) pairs. The setter is a bound method of the mutable arc iterator object that can be used to replace the current arc with a new one. Most users should just call the mutable_arcs method of a vector FST object instead of directly constructing this iterator and take advantage of the Pythonic API, e.g.
for arc, setter in logfst.mutable_arcs(0):
    setter(LogArc(arc.ilabel, 0, arc.weight, arc.nextstate))
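The (arc, setter) pairing can be tried out without PyKaldi on a toy container. ToyMutableArcs below is a hypothetical class over a list of (ilabel, olabel, weight, nextstate) tuples; here the setter is a closure rather than a bound method, which is a simplification made for this sketch.

```python
class ToyMutableArcs:
    """Yield (arc, setter) pairs; calling setter replaces the current arc."""

    def __init__(self, arcs):
        self._arcs = arcs  # mutated in place

    def __iter__(self):
        for position in range(len(self._arcs)):
            # Bind the current position so each setter targets its own arc.
            def setter(new_arc, position=position):
                self._arcs[position] = new_arc

            yield self._arcs[position], setter


state_arcs = [(1, 7, 0.5, 1), (2, 8, 1.0, 2)]
for arc, setter in ToyMutableArcs(state_arcs):
    ilabel, olabel, weight, nextstate = arc
    # Replace the output label with epsilon (0), keeping everything else.
    setter((ilabel, 0, weight, nextstate))
```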
Creates a new arc iterator.
Parameters: - fst – The fst.
- state – The state index.
Raises: IndexError – State index out of range.
-
done
()¶ Indicates whether the iterator is exhausted or not.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: True if the iterator is exhausted, False otherwise.
-
flags
()¶ Returns the current iterator behavioral flags.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: The current iterator behavioral flags as an integer.
-
next
()¶ Advances the iterator.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
position
()¶ Returns the position of the iterator.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: The iterator’s position, expressed as an integer.
-
reset
()¶ Resets the iterator to the initial position.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
seek
(a)¶ Advance the iterator to a new position.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Parameters: a (int) – The position to seek to.
-
set_flags
(flags, mask)¶ Sets the current iterator behavioral flags.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Parameters: - flags – The flag bits to set.
- mask – A mask selecting which flag bits are affected.
-
set_value
(arc)¶ Replace the current arc with a new arc.
Parameters: arc – The arc to replace the current arc with.
-
value
()¶ Returns the current arc.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
class
kaldi.fstext.
LogVectorFstStateIterator
(fst)[source]¶ State iterator for a vector FST over the log semiring.
This class is used for iterating over the states. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the states method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.
Creates a new state iterator.
Parameters: fst – The fst.
-
done
()¶ Indicates whether the iterator is exhausted or not.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: True if the iterator is exhausted, False otherwise.
-
next
()¶ Advances the iterator.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
reset
()¶ Resets the iterator to the initial position.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
value
()¶ Returns the current state index.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
-
class
kaldi.fstext.
LogWeight
[source]¶ Log weight factory.
This class is used for creating new LogWeight instances.
- LogWeight(): Creates an uninitialized LogWeight instance.
- LogWeight(weight): Creates a new LogWeight instance initialized with the weight.
Parameters: weight (float or FloatWeight) – The weight value.
-
from_float
(f:float) → LogWeight¶ Create a new log weight from a float.
-
from_other
(weight:LogWeight) → LogWeight¶ Create a new log weight from another.
-
hash
() → int¶ Returns the hash for the weight.
-
member
() → bool¶ Checks if weight is a member of log semiring.
-
no_weight
() → LogWeight¶ No weight in log semiring.
-
one
() → LogWeight¶ One in log semiring, i.e. 0.0.
-
properties
() → int¶ Returns weight properties.
-
quantize
(delta:float=default) → LogWeight¶ Quantizes the weight.
-
reverse
() → LogWeight¶ Reverses the weight.
-
type
() → str¶ Returns weight type.
-
value
¶ Float value of the weight.
-
zero
() → LogWeight¶ Zero in log semiring, i.e. float +infinity.
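The values of one (0.0) and zero (+infinity) follow from the log semiring operations, where ⊗ is addition and ⊕ is -log(e^{-a} + e^{-b}). The sketch below implements them on plain floats; the function names are hypothetical and this does not use the LogWeight class.

```python
import math

LOG_ZERO = math.inf  # identity for log_plus, annihilator for log_times
LOG_ONE = 0.0        # identity for log_times


def log_plus(a, b):
    """Compute -log(exp(-a) + exp(-b)) in a numerically stable way."""
    if a == math.inf:
        return b
    if b == math.inf:
        return a
    return min(a, b) - math.log1p(math.exp(-abs(a - b)))


def log_times(a, b):
    """The semiring product is ordinary addition of negative log values."""
    return a + b


two_equal_paths = log_plus(0.0, 0.0)  # combines two paths of probability 1 each
```

Interpreting weights as negative log probabilities, log_plus sums probabilities and log_times multiplies them.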
-
class
kaldi.fstext.
StdArc
[source]¶ FST arc with tropical weight.
- StdArc(): Creates an uninitialized StdArc instance.
- StdArc(ilabel, olabel, weight, nextstate): Creates a new StdArc instance initialized with given arguments.
Parameters: - ilabel (int) – The input label.
- olabel (int) – The output label.
- weight (TropicalWeight) – The arc weight.
- nextstate (int) – The destination state for the arc.
-
from_attrs
(ilabel:int, olabel:int, weight:TropicalWeight, nextstate:int) → StdArc¶ Creates a new arc with the given attributes.
Parameters: - ilabel (int) – The input label.
- olabel (int) – The output label.
- weight (TropicalWeight) – The arc weight.
- nextstate (int) – The destination state for the arc.
-
ilabel
¶ int – The input label.
-
nextstate
¶ int – The destination state for the arc.
-
olabel
¶ int – The output label.
-
type
() → str¶ Returns arc type.
-
weight
¶ TropicalWeight – The arc weight.
-
class
kaldi.fstext.
StdConstFst
(fst=None)[source]¶ Constant FST over the tropical semiring.
Parameters: fst (StdFst) – The input FST over the tropical semiring. If provided, its contents are used for initializing the new FST. Defaults to None.
-
arcs
(state)¶ Returns an iterator over arcs leaving the specified state.
Parameters: state – The source state index. Returns: An ArcIterator. See also:
mutable_arcs
,states
.
-
copy
()¶ Makes a copy of the FST.
Returns: A copy of the FST.
-
draw
(filename, isymbols=None, osymbols=None, ssymbols=None, acceptor=False, title='', width=8.5, height=11, portrait=False, vertical=False, ranksep=0.4, nodesep=0.25, fontsize=14, precision=5, float_format='g', show_weight_one=False)¶ Writes out the FST in Graphviz text format.
This method writes out the FST in the dot graph description language. The graph can be rendered using the dot binary provided by Graphviz.
Parameters: - filename (str) – The string location of the output dot/Graphviz file.
- isymbols – An optional symbol table used to label input symbols.
- osymbols – An optional symbol table used to label output symbols.
- ssymbols – An optional symbol table used to label states.
- acceptor (bool) – Should the figure be rendered in acceptor format if possible? Defaults to False.
- title (str) – An optional string indicating the figure title. Defaults to empty string.
- width (float) – The figure width, in inches. Defaults to 8.5.
- height (float) – The figure height, in inches. Defaults to 11.
- portrait (bool) – Should the figure be rendered in portrait rather than landscape? Defaults to False.
- vertical (bool) – Should the figure be rendered bottom-to-top rather than left-to-right? Defaults to False.
- ranksep (float) – The minimum separation between ranks, in inches. Defaults to 0.4.
- nodesep (float) – The minimum separation between nodes, in inches. Defaults to 0.25.
- fontsize (int) – Font size, in points. Defaults to 14.
- precision (int) – Numeric precision for floats, in number of chars. Defaults to 5.
- float_format ('e', 'f' or 'g') – One of: ‘e’, ‘f’ or ‘g’. Defaults to ‘g’.
- show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults to False.
For more information about the rendering options, see man dot.
See also: text.
-
final
(state)¶ Returns the final weight of a state.
Parameters: state – The integer index of a state. Returns: The final Weight of that state. Raises: IndexError
– State index out of range.
-
from_bytes
(s)¶ Returns the FST represented by the bytes object.
Parameters: s (bytes) – The bytes object representing the FST. Returns: An FST object.
-
input_symbols
()¶ Returns the input symbol table.
Returns: The input symbol table. See Also:
output_symbols()
.
-
num_arcs
(state=None)¶ Returns the number of arcs, counting them if necessary.
If state is None, returns the number of arcs in the FST. Otherwise, returns the number of arcs leaving that state.
Parameters: state – The integer index of a state. Defaults to None.
Returns: The number of arcs leaving a state or the number of arcs in the FST.
Note: This method counts the number of arcs in the FST by iterating over the states and summing up the number of arcs leaving each state.
Raises: IndexError – State index out of range.
See also: num_states.
-
num_input_epsilons
(state)¶ Returns the number of arcs with epsilon input labels leaving a state.
Parameters: state – The integer index of a state.
Returns: The number of epsilon-input-labeled arcs leaving that state.
Raises: IndexError – State index out of range.
See also: num_output_epsilons.
-
num_output_epsilons
(state)¶ Returns the number of arcs with epsilon output labels leaving a state.
Parameters: state – The integer index of a state.
Returns: The number of epsilon-output-labeled arcs leaving that state.
Raises: IndexError – State index out of range.
See also: num_input_epsilons.
-
num_states
()¶ Returns the number of states, counting them if necessary.
Returns: The number of states. See also:
num_arcs
.
-
output_symbols
()¶ Returns the output symbol table.
Returns: The output symbol table. See Also:
input_symbols()
.
-
properties
(mask, test)¶ Provides property bits.
This method provides user access to the properties attributes for the FST. The resulting value is a long integer, but when it is cast to a boolean, it represents whether or not the FST has the mask property.
Parameters: - mask – The property mask to be compared to the FST’s properties.
- test – Should any unknown values be computed before comparing against the mask?
Returns: A 64-bit bitmask representing the requested properties.
-
read
(filename)¶ Reads an FST from a file.
Parameters: filename (str) – The location of the input file. Returns: An FST object. Raises: RuntimeError
– Read failed.
-
read_from_stream
(strm, ropts)¶ Reads an FST from an input stream.
Parameters: - strm (istream) – The input stream to read from.
- ropts (FstReadOptions) – FST reading options.
Returns: An FST object.
Raises: RuntimeError
– Read failed.
-
start
()¶ Returns the start state.
Returns: The start state if start state is set, -1 otherwise.
-
states
()¶ Returns an iterator over all states in the FST.
Returns: A StateIterator object for the FST. See also:
arcs
,mutable_arcs
.
-
text
(isymbols=None, osymbols=None, ssymbols=None, acceptor=False, show_weight_one=False, missing_symbol='')¶ Produces a human-readable string representation of the FST.
This method generates a human-readable string representation of the FST. The caller may optionally specify SymbolTables used to label input labels, output labels, or state labels, respectively.
Parameters: - isymbols – An optional symbol table used to label input symbols.
- osymbols – An optional symbol table used to label output symbols.
- ssymbols – An optional symbol table used to label states.
- acceptor (bool) – Should the FST be rendered in acceptor format if possible? Defaults to False.
- show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults to False.
- missing_symbol – The string to be printed when symbol table lookup fails.
Returns: A formatted string representing the FST.
-
to_bytes
()¶ Returns a bytes object representing the FST.
Returns: A bytes object.
-
type
()¶ Returns the FST type.
Returns: The FST type.
-
verify
()¶ Verifies that an FST’s contents are sane.
Returns: True if the contents are sane, False otherwise.
-
write
(filename)¶ Serializes FST to a file.
This method writes the FST to a file in a binary format.
Parameters: filename (str) – The location of the output file. Raises: IOError
– Write failed.
-
write_to_stream
(strm, wopts)¶ Serializes FST to an output stream.
Parameters: - strm (ostream) – The output stream to write to.
- wopts (FstWriteOptions) – FST writing options.
Returns: True if write was successful, False otherwise.
Raises: RuntimeError
– Write failed.
-
-
class
kaldi.fstext.
StdConstFstArcIterator
(fst, state)[source]¶ Arc iterator for a constant FST over the tropical semiring.
This class is used for iterating over the arcs leaving some state. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the arcs method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.
Creates a new arc iterator.
Parameters: - fst – The fst.
- state – The state index.
Raises: IndexError – State index out of range.
-
done
()¶ Indicates whether the iterator is exhausted or not.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: True if the iterator is exhausted, False otherwise.
-
flags
()¶ Returns the current iterator behavioral flags.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: The current iterator behavioral flags as an integer.
-
next
()¶ Advances the iterator.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
position
()¶ Returns the position of the iterator.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: The iterator’s position, expressed as an integer.
-
reset
()¶ Resets the iterator to the initial position.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
seek
(a)¶ Advance the iterator to a new position.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Parameters: a (int) – The position to seek to.
-
set_flags
(flags, mask)¶ Sets the current iterator behavioral flags.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Parameters: - flags – The flag bits to set.
- mask – A mask selecting which flag bits are affected.
-
value
()¶ Returns the current arc.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
class
kaldi.fstext.
StdConstFstStateIterator
(fst)[source]¶ State iterator for a constant FST over the tropical semiring.
This class is used for iterating over the states. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the states method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.
Creates a new state iterator.
Parameters: fst – The fst.
-
done
()¶ Indicates whether the iterator is exhausted or not.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: True if the iterator is exhausted, False otherwise.
-
next
()¶ Advances the iterator.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
reset
()¶ Resets the iterator to the initial position.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
value
()¶ Returns the current state index.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
-
class
kaldi.fstext.
StdEncodeMapper
(encode_labels=False, encode_weights=False, encode=True)[source]¶ Arc encoder for an FST over the tropical semiring.
This class provides an object which can be used to encode or decode FST arcs. This is most useful for converting an FST to an unweighted acceptor, on which some FST operations are more efficient, and then decoding the FST afterwards.
To use an instance of this class to encode or decode a mutable FST, pass it as the first argument to the FST instance methods encode and decode. Alternatively, an instance of this class can be used as a callable to encode/decode arcs.
Parameters:
-
flags
() → int¶ Returns encoder flags.
-
from_other
(mapper:StdEncodeMapper) → StdEncodeMapper¶ Creates a new encoder with the contents of another.
-
from_other_with_type
(mapper:StdEncodeMapper, type:EncodeType) → StdEncodeMapper¶ Creates a new encoder with the contents of another and given type.
-
input_symbols
() → SymbolTable¶ Returns input symbol table.
-
output_symbols
() → SymbolTable¶ Returns output symbol table.
-
properties
(inprops:int) → int¶ Provides property bits.
This method provides user access to the properties attributes for the encoder. The resulting value is a long integer, but when it is cast to a boolean, it represents whether or not the FST has the mask property.
Parameters: mask – The property mask to be compared to the encoder’s properties.
Returns: A 64-bit bitmask representing the requested properties.
-
read
(filename:str, type:EncodeType=default) → StdEncodeMapper¶ Reads encoder from file.
-
set_input_symbols
(syms:SymbolTable)¶ Sets the input symbol table.
Parameters: syms – A SymbolTable. See also:
set_output_symbols
.
-
set_output_symbols
(syms:SymbolTable)¶ Sets the output symbol table.
Parameters: syms – A SymbolTable. See also:
set_input_symbols
.
-
type
() → EncodeType¶ Returns encoder type.
-
write
(filename:str) → bool¶ Writes encoder to file.
Returns: True if write was successful, False otherwise.
-
-
class
kaldi.fstext.
StdEncodeTable
¶ Encode table for StdArc.
- StdEncodeTable(flags):
- Creates a new encode table with the given flags.
-
decode
(key:int) → Tuple¶ Decodes an encoded arc label back to labels and cost.
-
encode
(arc:StdArc) → int¶ Encodes the given arc (either labels or weights or both).
-
flags
() → int¶ Returns encoding flags.
-
get_label
(arc:StdArc) → int¶ Looks up the encoded label for the given arc.
Returns -1 if arc is not found.
-
input_symbols
() → SymbolTable¶ Returns input symbols.
-
output_symbols
() → SymbolTable¶ Returns output symbols.
-
read
(strm:istream, source:str) → StdEncodeTable¶ Reads encode table from input stream.
-
set_input_symbols
(syms:SymbolTable)¶ Sets input symbols.
-
set_output_symbols
(syms:SymbolTable)¶ Sets output symbols.
-
size
() → int¶ Returns the size of the table.
-
write
(strm:ostream, source:str) → bool¶ Writes table to output stream.
-
class
kaldi.fstext.
StdFstCompiler
(isymbols=None, osymbols=None, ssymbols=None, acceptor=False, keep_isymbols=False, keep_osymbols=False, keep_state_numbering=False, allow_negative_labels=False)[source]¶ Compiler for FSTs over the tropical semiring.
This class is used to compile FSTs specified using the AT&T FSM library format described here:
http://web.eecs.umich.edu/~radev/NLP-fall2015/resources/fsm_archive/fsm.5.html
This is the same format used by the
fstcompile
executable.
FstCompiler options (symbol tables, etc.) are set at construction time:
compiler = FstCompiler(isymbols=ascii_syms, osymbols=ascii_syms)
Once constructed, FstCompiler instances behave like a file handle opened for writing:
# /ba+/
print("0 1 50 50", file=compiler)
print("1 2 49 49", file=compiler)
print("2 2 49 49", file=compiler)
print("2", file=compiler)
The
compile
method returns an actual FST instance:
sheep_machine = compiler.compile()
Compilation flushes the internal buffer, so the compiler instance can be reused to compile new machines with the same symbol tables, etc.
Parameters: - isymbols – An optional SymbolTable used to label input symbols.
- osymbols – An optional SymbolTable used to label output symbols.
- ssymbols – An optional SymbolTable used to label states.
- acceptor – Should the FST be rendered in acceptor format if possible?
- keep_isymbols – Should the input symbol table be stored in the FST?
- keep_osymbols – Should the output symbol table be stored in the FST?
- keep_state_numbering – Should the state numbering be preserved?
- allow_negative_labels – Should negative labels be allowed? (Not recommended; may cause conflicts).
-
compile
()¶ Compiles the FST in the string buffer.
This method compiles the FST and returns the resulting machine.
Returns: The FST described by the string buffer. Raises: RuntimeError
– Compilation failed.
-
write
(expression)¶ Writes a string into the compiler string buffer.
This method adds a line to the compiler string buffer. It can also be invoked with a print call, like so:
compiler = FstCompiler()
print("0 0 49 49", file=compiler)
print("0", file=compiler)
Parameters: expression – A string expression to add to compiler string buffer.
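The text format consumed by the compiler can be illustrated without PyKaldi. The following is a minimal pure-Python sketch, assuming transducer-format input with numeric labels; the `parse_att` helper is hypothetical and is not part of the PyKaldi API.

```python
# Pure-Python sketch of the AT&T FSM text format; not the PyKaldi compiler.

def parse_att(text):
    """Parses numeric AT&T-format lines into (arcs, finals).

    Arc lines:   src dst ilabel olabel [weight]
    Final lines: state [weight]
    A missing weight is read as 0.0 (tropical semiring One).
    """
    arcs, finals = [], {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) <= 2:
            # One or two fields: a final state with an optional weight.
            weight = float(fields[1]) if len(fields) == 2 else 0.0
            finals[int(fields[0])] = weight
        else:
            src, dst, ilabel, olabel = map(int, fields[:4])
            weight = float(fields[4]) if len(fields) == 5 else 0.0
            arcs.append((src, dst, ilabel, olabel, weight))
    return arcs, finals


# Same lines as the /ba+/ machine written to the compiler above.
sheep = "0 1 50 50\n1 2 49 49\n2 2 49 49\n2\n"
arcs, finals = parse_att(sheep)
```

Here `arcs` holds three unweighted transitions and `finals` marks state 2 as final with weight One.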
-
class
kaldi.fstext.
StdVectorFst
(fst=None)[source]¶ Vector FST over the tropical semiring.
Parameters: fst (StdFst) – The input FST over the tropical semiring. If provided, its contents are used for initializing the new FST. Defaults to None
.
-
add_arc
(state, arc)¶ Adds a new arc to the FST and returns self.
Parameters: - state – The integer index of the source state.
- arc – The arc to add.
Returns: self.
Raises: IndexError
– State index out of range. See also:
add_state
.
-
add_state
()¶ Adds a new state to the FST and returns the state ID.
Returns: The integer index of the new state.
-
arcs
(state)¶ Returns an iterator over arcs leaving the specified state.
Parameters: state – The source state index. Returns: An ArcIterator. See also:
mutable_arcs
, states
.
-
arcsort
(sort_type='ilabel')¶ Sorts arcs leaving each state of the FST.
This operation destructively sorts arcs leaving each state using either input or output labels.
Parameters: sort_type – Either “ilabel” (sort arcs according to input labels) or “olabel” (sort arcs according to output labels). Returns: self. Raises: ValueError
– Unknown sort type. See also:
topsort
.
-
closure
(closure_plus=False)¶ Computes concatenative closure.
This operation destructively converts the FST to its concatenative closure. If A transduces string x to y with weight a, then the closure transduces x to y with weight a, xx to yy with weight a ⊗ a, xxx to yyy with weight a ⊗ a ⊗ a, and so on. The empty string is also transduced to itself with semiring One if
closure_plus
is False.
Parameters: closure_plus – If True, do not accept the empty string. Returns: self.
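As a worked illustration of the weights described above, the following pure-Python sketch computes the weight the closure assigns to n repetitions of x in the tropical semiring, where the semiring product is ordinary addition and One is 0.0. The `closure_weight` helper is hypothetical and only models the arithmetic, not the FST operation itself.

```python
# Tropical semiring arithmetic only; no FST is built here.
ONE = 0.0


def otimes(a, b):
    """Semiring product in the tropical semiring (ordinary addition)."""
    return a + b


def closure_weight(a, n):
    """Weight given to n repetitions of x when x maps to y with weight a."""
    weight = ONE
    for _ in range(n):
        weight = otimes(weight, a)
    return weight


# n = 0 is the empty string, which is accepted only when closure_plus is False.
weights = [closure_weight(0.5, n) for n in range(4)]
```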
-
concat
(ifst)¶ Computes the concatenation (product) of two FSTs.
This operation destructively concatenates the FST with a second FST. If A transduces string x to y with weight a and B transduces string w to v with weight b, then their concatenation transduces string xw to yv with weight a ⊗ b.
Parameters: ifst – The second input FST. Returns: self.
-
connect
()¶ Removes unsuccessful paths.
This operation destructively trims the FST, removing states and arcs that are not part of any successful path.
Returns: self.
-
copy
()¶ Makes a copy of the FST.
Returns: A copy of the FST.
-
decode
(encoder)¶ Decodes encoded labels and/or weights.
This operation reverses the encoding performed by
encode
.
Parameters: encoder – An EncodeMapper object used to encode the FST. Returns: self. See also:
encode
.
-
delete_arcs
(state, n=None)¶ Deletes arcs leaving a particular state.
Parameters: - state – The integer index of a state.
- n – An optional argument indicating how many arcs to be deleted. If this argument is None, all arcs from this state are deleted.
Returns: self.
Raises: IndexError
– State index out of range. See also:
delete_states
.
-
delete_states
(states=None)¶ Deletes states.
Parameters: states – An optional iterable of integer indices of the states to be deleted. If this argument is omitted, all states are deleted. Returns: self. Raises: IndexError
– State index out of range. See also:
delete_arcs
.
-
draw
(filename, isymbols=None, osymbols=None, ssymbols=None, acceptor=False, title='', width=8.5, height=11, portrait=False, vertical=False, ranksep=0.4, nodesep=0.25, fontsize=14, precision=5, float_format='g', show_weight_one=False)¶ Writes out the FST in Graphviz text format.
This method writes out the FST in the dot graph description language. The graph can be rendered using the
dot
binary provided by Graphviz.
Parameters: - filename (str) – The string location of the output dot/Graphviz file.
- isymbols – An optional symbol table used to label input symbols.
- osymbols – An optional symbol table used to label output symbols.
- ssymbols – An optional symbol table used to label states.
- acceptor (bool) – Should the figure be rendered in acceptor format if possible? Defaults to False.
- title (str) – An optional string indicating the figure title. Defaults to the empty string.
- width (float) – The figure width, in inches. Defaults to 8.5".
- height (float) – The figure height, in inches. Defaults to 11".
- portrait (bool) – Should the figure be rendered in portrait rather than landscape? Defaults to False.
- vertical (bool) – Should the figure be rendered bottom-to-top rather than left-to-right? Defaults to False.
- ranksep (float) – The minimum separation between ranks, in inches. Defaults to 0.4".
- nodesep (float) – The minimum separation between nodes, in inches. Defaults to 0.25".
- fontsize (int) – Font size, in points. Defaults to 14pt.
- precision (int) – Numeric precision for floats, in number of chars. Defaults to 5.
- float_format ('e', 'f' or 'g') – One of: ‘e’, ‘f’ or ‘g’. Defaults to ‘g’.
- show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults to False.
For more information about the rendering options, see
man dot
. See also:
text
.
-
encode
(encoder)¶ Encodes labels and/or weights.
This operation allows for the representation of a weighted transducer as a weighted acceptor, an unweighted transducer, or an unweighted acceptor by considering the pair (input label, output label), the pair (input label, weight), or the triple (input label, output label, weight) as a single label. Applying this operation mutates the EncodeMapper argument, which can then be used to decode.
Parameters: encoder – An EncodeMapper object used to encode the FST. Returns: self. See also:
decode
.
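The encoding idea can be illustrated with a small pure-Python sketch that treats each (input label, output label, weight) triple as a single label. This is not the PyKaldi implementation: the tuple representation of arcs and the `encode_arcs` and `decode_arcs` helpers are assumptions made purely for illustration.

```python
# Pure-Python illustration of arc encoding; not the PyKaldi implementation.
# An arc is modelled as (ilabel, olabel, weight, nextstate).

def encode_arcs(arcs, table):
    """Replaces each (ilabel, olabel, weight) triple with a single label.

    `table` maps triples to encoded labels and is mutated in place, which
    mirrors how the encoder argument accumulates state during encoding.
    """
    encoded = []
    for ilabel, olabel, weight, nextstate in arcs:
        key = (ilabel, olabel, weight)
        # Encoded labels start at 1 because 0 is reserved for epsilon.
        label = table.setdefault(key, len(table) + 1)
        # The result is an unweighted acceptor arc (tropical One is 0.0).
        encoded.append((label, label, 0.0, nextstate))
    return encoded


def decode_arcs(arcs, table):
    """Restores the original triples using the table filled by encode_arcs."""
    inverse = {label: key for key, label in table.items()}
    decoded = []
    for label, _, _, nextstate in arcs:
        ilabel, olabel, weight = inverse[label]
        decoded.append((ilabel, olabel, weight, nextstate))
    return decoded


arcs = [(1, 2, 0.5, 1), (1, 2, 0.5, 2), (3, 4, 1.5, 2)]
table = {}
encoded = encode_arcs(arcs, table)
restored = decode_arcs(encoded, table)
```

Because the table is filled during encoding, the same table must be passed to the decoding step, just as the same encoder object is passed to both methods.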
-
final
(state)¶ Returns the final weight of a state.
Parameters: state – The integer index of a state. Returns: The final Weight of that state. Raises: IndexError
– State index out of range.
-
from_bytes
(s)¶ Returns the FST represented by the bytes object.
Parameters: s (bytes) – The bytes object representing the FST. Returns: An FST object.
-
input_symbols
()¶ Returns the input symbol table.
Returns: The input symbol table. See Also:
output_symbols()
.
-
invert
()¶ Inverts the FST’s transduction.
This operation destructively inverts the FST’s transduction by exchanging input and output labels.
Returns: self.
-
minimize
(delta=0.0009765625, allow_nondet=False)¶ Minimizes the FST.
This operation destructively performs the minimization of deterministic weighted automata and transducers. If the input FST A is an acceptor, this operation produces the minimal acceptor B equivalent to A, i.e. the acceptor with a minimal number of states that is equivalent to A. If the input FST A is a transducer, this operation internally builds an equivalent transducer with a minimal number of states. However, this minimality is obtained by allowing transitions to have strings of symbols as output labels; this is known in the literature as a real-time transducer. Such transducers are not directly supported by the library, so this function converts them by expanding each string-labeled transition into a sequence of transitions. This creates new states and hence loses the minimality property.
Parameters: - delta – Comparison/quantization delta (default: 0.0009765625).
- allow_nondet – Attempt minimization of non-deterministic FST?
Returns: self.
-
mutable_arcs
(state)¶ Returns a mutable iterator over arcs leaving the specified state.
Parameters: state – The source state index. Returns: A MutableArcIterator.
-
num_arcs
(state=None)¶ Returns the number of arcs, counting them if necessary.
If state is
None
, returns the number of arcs in the FST. Otherwise, returns the number of arcs leaving that state.
Parameters: state – The integer index of a state. Defaults to None
.
Returns: The number of arcs leaving a state or the number of arcs in the FST.
Note: This method counts the number of arcs in the FST by iterating over the states and summing up the number of arcs leaving each state.
Raises: IndexError
– State index out of range. See also:
num_states
.
-
num_input_epsilons
(state)¶ Returns the number of arcs with epsilon input labels leaving a state.
Parameters: state – The integer index of a state. Returns: The number of epsilon-input-labeled arcs leaving that state. Raises: IndexError
– State index out of range. See also:
num_output_epsilons
.
-
num_output_epsilons
(state)¶ Returns the number of arcs with epsilon output labels leaving a state.
Parameters: state – The integer index of a state. Returns: The number of epsilon-output-labeled arcs leaving that state. Raises: IndexError
– State index out of range. See also:
num_input_epsilons
.
-
num_states
()¶ Returns the number of states, counting them if necessary.
Returns: The number of states. See also:
num_arcs
.
-
output_symbols
()¶ Returns the output symbol table.
Returns: The output symbol table. See Also:
input_symbols()
.
-
project
(project_output=False)¶ Converts the FST to an acceptor using input or output labels.
This operation destructively projects an FST onto its domain or range by either copying each arc’s input label to its output label (the default) or vice versa.
Parameters: project_output – Project onto output labels? Returns: self. See also:
decode
, encode
, relabel
, relabel_tables
.
-
properties
(mask, test)¶ Provides property bits.
This method provides user access to the properties attributes for the FST. The resulting value is a long integer, but when it is cast to a boolean, it represents whether or not the FST has the
mask
property.
Parameters: - mask – The property mask to be compared to the FST’s properties.
- test – Should any unknown values be computed before comparing against the mask?
Returns: A 64-bit bitmask representing the requested properties.
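The bitmask convention described above can be sketched in plain Python. The property bits below are hypothetical values chosen for this illustration and are not the constants used by OpenFst or PyKaldi; the `properties` helper only models the masking step.

```python
# Hypothetical property bits, chosen only for this illustration.
ACCEPTOR = 0x1
I_DETERMINISTIC = 0x2
ACYCLIC = 0x4


def properties(stored, mask):
    """Returns the subset of stored property bits selected by mask."""
    return stored & mask


stored = ACCEPTOR | ACYCLIC

# Casting the result to bool answers "is this single property set?".
is_acceptor = bool(properties(stored, ACCEPTOR))
is_deterministic = bool(properties(stored, I_DETERMINISTIC))

# Testing several properties at once requires comparing against the mask,
# because a non-zero result only means that at least one bit is set.
wanted = ACCEPTOR | I_DETERMINISTIC
has_both = properties(stored, wanted) == wanted
```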
-
prune
(weight=None, nstate=-1, delta=0.0009765625)¶ Removes paths with weights below a certain threshold.
This operation deletes states and arcs in the input FST that do not belong to a successful path whose weight is no more (w.r.t. the natural semiring order) than the threshold ⊗ the weight of the shortest path in the input FST. Weights must be commutative and have the path property.
Parameters: - weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired weight threshold below which paths are pruned; if None, no paths are pruned.
- nstate – State number threshold (default: -1).
- delta – Comparison/quantization delta (default: 0.0009765625).
Returns: self.
See also: The constructive variant.
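The pruning criterion can be illustrated at the level of whole paths. The sketch below assumes the tropical semiring, where the semiring product is addition and the natural order is the usual ordering on floats; the `surviving_paths` helper is hypothetical, and the real operation works on states and arcs rather than on an explicit list of paths.

```python
def surviving_paths(path_weights, threshold):
    """Keeps paths whose weight is at most threshold (x) shortest weight.

    In the tropical semiring (x) is addition, so the bound is simply
    shortest + threshold.
    """
    shortest = min(path_weights.values())
    bound = shortest + threshold
    return {path: w for path, w in path_weights.items() if w <= bound}


# Successful paths of some FST, keyed by their label sequence.
paths = {"a b": 1.0, "a c": 2.5, "d": 4.0}
kept = surviving_paths(paths, 2.0)
```

With a threshold of 2.0 and a shortest path of weight 1.0, only paths of weight at most 3.0 survive.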
-
push
(to_final=False, delta=0.0009765625, remove_total_weight=False)¶ Pushes weights towards the initial or final states.
This operation destructively produces an equivalent transducer by pushing the weights towards the initial state or toward the final states. When pushing weights towards the initial state, the sum of the weight of the outgoing transitions and final weight at any non-initial state is equal to one in the resulting machine. When pushing weights towards the final states, the sum of the weight of the incoming transitions at any state is equal to one. Weights need to be left distributive when pushing towards the initial state and right distributive when pushing towards the final states.
Parameters: - to_final – Push towards final states?
- delta – Comparison/quantization delta (default: 0.0009765625).
- remove_total_weight – If pushing weights, should the total weight be removed?
Returns: self.
See also: The constructive variant, which also supports label pushing.
-
read
(filename)¶ Reads an FST from a file.
Parameters: filename (str) – The location of the input file. Returns: An FST object. Raises: RuntimeError
– Read failed.
-
read_from_stream
(strm, ropts)¶ Reads an FST from an input stream.
Parameters: - strm (istream) – The input stream to read from.
- ropts (FstReadOptions) – FST reading options.
Returns: An FST object.
Raises: RuntimeError
– Read failed.
-
relabel
(ipairs=None, opairs=None)¶ Replaces input and/or output labels using pairs of labels.
This operation destructively relabels the input and/or output labels of the FST using pairs of the form (old_ID, new_ID); omitted indices are identity-mapped.
Parameters: - ipairs – An iterable containing (old index, new index) integer pairs.
- opairs – An iterable containing (old index, new index) integer pairs.
Returns: self.
Raises: ValueError
– No relabeling pairs specified. See also:
decode
, encode
, project
, relabel_tables
.
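The identity mapping of omitted labels can be shown with a short pure-Python sketch. Arcs are reduced to (ilabel, olabel) pairs and the `relabel` function below is a hypothetical stand-in, not the PyKaldi method; unlike the method it returns a new list instead of mutating in place.

```python
def relabel(arcs, ipairs=None, opairs=None):
    """Relabels (ilabel, olabel) arcs; labels not listed map to themselves."""
    if not ipairs and not opairs:
        raise ValueError("No relabeling pairs specified.")
    imap = dict(ipairs or [])
    omap = dict(opairs or [])
    return [(imap.get(i, i), omap.get(o, o)) for i, o in arcs]


arcs = [(1, 1), (2, 3), (4, 4)]
relabeled = relabel(arcs, ipairs=[(1, 10), (2, 20)], opairs=[(3, 30)])
```

Input labels 1 and 2 and output label 3 are rewritten, while label 4 and output label 1 pass through unchanged.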
-
relabel_tables
(old_isymbols=None, new_isymbols=None, unknown_isymbol='', attach_new_isymbols=True, old_osymbols=None, new_osymbols=None, unknown_osymbol='', attach_new_osymbols=True)¶ Replaces input and/or output labels using SymbolTables.
This operation destructively relabels the input and/or output labels of the FST using user-specified symbol tables; omitted symbols are identity-mapped.
Parameters: - old_isymbols – The old SymbolTable for input labels, defaulting to the FST’s input symbol table.
- new_isymbols – A SymbolTable used to relabel the input labels.
- unknown_isymbol – Input symbol to use to relabel OOVs (if empty, OOVs raise an exception).
- attach_new_isymbols – Should new_isymbols be made the FST’s input symbol table?
- old_osymbols – The old SymbolTable for output labels, defaulting to the FST’s output symbol table.
- new_osymbols – A SymbolTable used to relabel the output labels.
- unknown_osymbol – Output symbol to use to relabel OOVs (if empty, OOVs raise an exception).
- attach_new_osymbols – Should new_osymbols be made the FST’s output symbol table?
Returns: self.
Raises: ValueError
– No SymbolTable specified.
-
reserve_arcs
(state, n)¶ Reserve n arcs at a particular state (best effort).
Parameters: - state – The integer index of a state.
- n – The number of arcs to reserve.
Returns: self.
Raises: IndexError
– State index out of range. See also:
reserve_states
.
-
reserve_states
(n)¶ Reserve n states (best effort).
Parameters: n – The number of states to reserve. Returns: self. See also:
reserve_arcs
.
-
reweight
(potentials, to_final=False)¶ Reweights an FST using an iterable of potentials.
This operation destructively reweights an FST according to the potentials and in the direction specified by the user. An arc of weight w, with an origin state of potential p and destination state of potential q, is reweighted by p^{-1} ⊗ (w ⊗ q) when reweighting towards the initial state, and by (p ⊗ w) ⊗ q^{-1} when reweighting towards the final states. The weights must be left distributive when reweighting towards the initial state and right distributive when reweighting towards the final states (e.g., TropicalWeight and LogWeight).
Parameters: - potentials – An iterable of TropicalWeights.
- to_final – Push towards final states?
Returns: self.
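The reweighting formula can be checked numerically in the tropical semiring, where the semiring product is addition and the inverse is negation. The sketch below covers only the direction towards the initial state; the `reweight_to_initial` helper is hypothetical, and its treatment of final weights (dividing by the state's potential) is an assumption of this sketch rather than something stated above.

```python
def reweight_to_initial(arcs, finals, potentials):
    """Tropical-semiring reweighting towards the initial state.

    arcs: list of (src, dst, weight); finals: {state: weight}.
    Each arc weight w becomes p^{-1} (x) (w (x) q) = w + q - p, where p and
    q are the potentials of the source and destination states.
    """
    new_arcs = [(s, d, w + potentials[d] - potentials[s]) for s, d, w in arcs]
    # Assumption: a final weight is divided by the potential of its state.
    new_finals = {s: w - potentials[s] for s, w in finals.items()}
    return new_arcs, new_finals


# A single path 0 -> 1 -> 2 with state 2 final.
arcs = [(0, 1, 1.0), (1, 2, 2.0)]
finals = {2: 0.5}
potentials = [0.0, 3.0, 1.0]
new_arcs, new_finals = reweight_to_initial(arcs, finals, potentials)

# Intermediate potentials cancel along the path, so the total changes only
# by the potential of the start state (0.0 here).
old_total = sum(w for _, _, w in arcs) + finals[2]
new_total = sum(w for _, _, w in new_arcs) + new_finals[2]
```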
-
rmepsilon
(connect=True, weight=None, nstate=-1, delta=0.0009765625)¶ Removes epsilon transitions.
This operation destructively removes epsilon transitions (i.e., those where both input and output labels are epsilon) from an FST.
Parameters: - connect – Should output be trimmed?
- weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired weight threshold below which paths are pruned; if None, no paths are pruned.
- nstate – State number threshold (default: -1).
- delta – Comparison/quantization delta (default: 0.0009765625).
Returns: self.
See also: The constructive variant, which also supports epsilon removal in reverse (and which may be more efficient).
-
set_final
(state, weight=None)¶ Sets the final weight for a state.
Parameters: - state – The integer index of a state.
- weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired final weight; if omitted, it is set to semiring One.
Raises: IndexError
– State index out of range. See also:
set_start
.
-
set_input_symbols
(syms)¶ Sets the input symbol table.
Passing
None
as a value will delete the input symbol table.
Parameters: syms – A SymbolTable. Returns: self. See also:
set_output_symbols
.
-
set_output_symbols
(syms)¶ Sets the output symbol table.
Passing
None
as a value will delete the output symbol table.
Parameters: syms – A SymbolTable. Returns: self. See also:
set_input_symbols
.
-
set_properties
(props, mask)¶ Sets the properties bits.
Parameters: Returns: self.
-
set_start
(state)¶ Sets the initial state.
Parameters: state – The integer index of a state. Returns: self. Raises: IndexError
– State index out of range. See also:
set_final
.
-
start
()¶ Returns the start state.
Returns: The start state if start state is set, -1 otherwise.
-
states
()¶ Returns an iterator over all states in the FST.
Returns: A StateIterator object for the FST. See also:
arcs
, mutable_arcs
.
-
text
(isymbols=None, osymbols=None, ssymbols=None, acceptor=False, show_weight_one=False, missing_symbol='')¶ Produces a human-readable string representation of the FST.
This method generates a human-readable string representation of the FST. The caller may optionally specify SymbolTables used to label input labels, output labels, or state labels, respectively.
Parameters: - isymbols – An optional symbol table used to label input symbols.
- osymbols – An optional symbol table used to label output symbols.
- ssymbols – An optional symbol table used to label states.
- acceptor (bool) – Should the FST be rendered in acceptor format if possible? Defaults to False.
- show_weight_one (bool) – Should weights equivalent to semiring One be printed? Defaults to False.
- missing_symbol – The string to be printed when symbol table lookup fails.
Returns: A formatted string representing the FST.
-
to_bytes
()¶ Returns a bytes object representing the FST.
Returns: A bytes object.
-
topsort
()¶ Sorts transitions by state IDs.
This operation destructively topologically sorts the FST, if it is acyclic; otherwise it remains unchanged. Once sorted, all transitions are from lower state IDs to higher state IDs.
Returns: self. See also:
arcsort
.
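The renumbering that a topological sort produces can be sketched in pure Python. The example uses Kahn's algorithm on (src, dst) pairs; this choice of algorithm and the `topological_order` helper are assumptions for illustration and say nothing about how the library implements the operation.

```python
from collections import deque


def topological_order(num_states, arcs):
    """Returns a list mapping old state IDs to new ones, or None if cyclic.

    arcs is a list of (src, dst) pairs. Kahn's algorithm is used here.
    """
    indegree = [0] * num_states
    successors = [[] for _ in range(num_states)]
    for src, dst in arcs:
        successors[src].append(dst)
        indegree[dst] += 1
    queue = deque(s for s in range(num_states) if indegree[s] == 0)
    order = [None] * num_states
    next_id = 0
    while queue:
        state = queue.popleft()
        order[state] = next_id
        next_id += 1
        for dst in successors[state]:
            indegree[dst] -= 1
            if indegree[dst] == 0:
                queue.append(dst)
    # Some state was never reached with indegree zero: the graph has a cycle.
    return order if next_id == num_states else None


arcs = [(2, 0), (0, 1), (2, 1)]
order = topological_order(3, arcs)
sorted_arcs = [(order[s], order[d]) for s, d in arcs]
cyclic = topological_order(2, [(0, 1), (1, 0)])
```

After renumbering, every arc in `sorted_arcs` goes from a lower state ID to a higher one, while the cyclic input yields `None` and would be left unchanged.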
-
type
()¶ Returns the FST type.
Returns: The FST type.
-
union
(ifst)¶ Computes the union (sum) of two FSTs.
This operation computes the union (sum) of two FSTs. If A transduces string x to y with weight a and B transduces string w to v with weight b, then their union transduces x to y with weight a and w to v with weight b.
Parameters: ifst – The second input FST. Returns: self.
-
verify
()¶ Verifies that an FST’s contents are sane.
Returns: True if the contents are sane, False otherwise.
-
write
(filename)¶ Serializes FST to a file.
This method writes the FST to a file in a binary format.
Parameters: filename (str) – The location of the output file. Raises: IOError
– Write failed.
-
write_to_stream
(strm, wopts)¶ Serializes FST to an output stream.
Parameters: - strm (ostream) – The output stream to write to.
- wopts (FstWriteOptions) – FST writing options.
Returns: True if write was successful, False otherwise.
Raises: RuntimeError
– Write failed.
-
-
class
kaldi.fstext.
StdVectorFstArcIterator
(fst, state)[source]¶ Arc iterator for a vector FST over the tropical semiring.
This class is used for iterating over the arcs leaving some state. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the
arcs
method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.
Creates a new arc iterator.
Parameters: - fst – The fst.
- state – The state index.
Raises: IndexError
– State index out of range.
-
done
()¶ Indicates whether the iterator is exhausted or not.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: True if the iterator is exhausted, False otherwise.
-
flags
()¶ Returns the current iterator behavioral flags.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: The current iterator behavioral flags as an integer.
-
next
()¶ Advances the iterator.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
position
()¶ Returns the position of the iterator.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: The iterator’s position, expressed as an integer.
-
reset
()¶ Resets the iterator to the initial position.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
seek
(a)¶ Advance the iterator to a new position.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Parameters: a (int) – The position to seek to.
-
set_flags
(flags, mask)¶ Sets the current iterator behavioral flags.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Parameters:
-
value
()¶ Returns the current arc.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
class
kaldi.fstext.
StdVectorFstMutableArcIterator
(fst, state)[source]¶ Mutable arc iterator for a vector FST over the tropical semiring.
This class is used for iterating over the arcs leaving some state and optionally replacing them with new ones. In addition to the full C++ API, it also supports the iterator protocol. Calling the
__iter__
method of a mutable arc iterator object returns an iterator over
(arc, setter)
pairs. The
setter
is a bound method of the mutable arc iterator object that can be used to replace the current arc with a new one. Most users should just call the
mutable_arcs
method of a vector FST object instead of directly constructing this iterator and take advantage of the Pythonic API, e.g.
for arc, setter in fst.mutable_arcs(0):
    setter(StdArc(arc.ilabel, 0, arc.weight, arc.nextstate))
Creates a new arc iterator.
Parameters: - fst – The fst.
- state – The state index.
Raises: IndexError
– State index out of range.
-
done
()¶ Indicates whether the iterator is exhausted or not.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: True if the iterator is exhausted, False otherwise.
-
flags
()¶ Returns the current iterator behavioral flags.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: The current iterator behavioral flags as an integer.
-
next
()¶ Advances the iterator.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
position
()¶ Returns the position of the iterator.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: The iterator’s position, expressed as an integer.
-
reset
()¶ Resets the iterator to the initial position.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
seek
(a)¶ Advance the iterator to a new position.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Parameters: a (int) – The position to seek to.
-
set_flags
(flags, mask)¶ Sets the current iterator behavioral flags.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Parameters:
-
set_value
(arc)¶ Replace the current arc with a new arc.
Parameters: arc – The arc to replace the current arc with.
-
value
()¶ Returns the current arc.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
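The (arc, setter) protocol described for this class can be mimicked in pure Python. In the sketch below arcs are plain tuples and `mutable_items` is a hypothetical generator; it only illustrates how a setter paired with each element lets the loop body replace that element in place.

```python
def mutable_items(items):
    """Yields (item, setter) pairs; calling setter replaces the current item."""
    for index in range(len(items)):
        # Bind index as a default argument so each setter targets its own slot.
        def setter(new_item, index=index):
            items[index] = new_item
        yield items[index], setter


# Arcs are modelled as (ilabel, olabel, weight, nextstate) tuples.
arcs = [(1, 5, 0.5, 1), (2, 6, 1.0, 2)]
for (ilabel, olabel, weight, nextstate), setter in mutable_items(arcs):
    # Replace every output label with 0 (epsilon).
    setter((ilabel, 0, weight, nextstate))
```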
-
class
kaldi.fstext.
StdVectorFstStateIterator
(fst)[source]¶ State iterator for a vector FST over the tropical semiring.
This class is used for iterating over the states. In addition to the full C++ API, it also supports the iterator protocol. Most users should just call the
states
method of an FST object instead of directly constructing this iterator and take advantage of the Pythonic API.
Creates a new state iterator.
Parameters: fst – The fst.
-
done
()¶ Indicates whether the iterator is exhausted or not.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: True if the iterator is exhausted, False otherwise.
-
next
()¶ Advances the iterator.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
reset
()¶ Resets the iterator to the initial position.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
value
()¶ Returns the current state index.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
-
class
kaldi.fstext.
SymbolTable
¶ Symbol table.
- SymbolTable():
- Creates a new symbol table.
This class can be used to programmatically construct a SymbolTable in memory, e.g.
import string
table = SymbolTable()
table.set_name("alphabet")
table.add_symbol("<eps>")
for symbol in string.ascii_lowercase:
    table.add_symbol(symbol)
table.write_text("alphabet.syms")
-
add_pair
(symbol:str, key:int) → int¶ Adds a symbol with given key to the table and returns the index.
This method adds a (symbol, key) pair to the table. If symbol is already in the table with a different key, then the return value will be the already existing key. Otherwise, return value will be the given key.
Parameters: - symbol – A symbol string.
- key – A non-negative index for the symbol (-1 is reserved for “no symbol requested”).
Returns: The integer index of the new symbol.
-
add_symbol
(symbol:str) → int¶ Adds a symbol to the table and returns the index.
This method adds a symbol to the table. The associated value key is automatically assigned by the symbol table.
Parameters: symbol – A symbol string. Returns: The integer index of the new symbol.
-
add_table
(table:SymbolTable)¶ Adds another SymbolTable to this table.
This method merges another symbol table into the current table. All key values will be offset by the current available key.
Parameters: table – A SymbolTable to be merged with the current table.
-
available_key
() → int¶ Returns the current available key (i.e. highest key + 1).
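The key-assignment rules described for add_pair, add_symbol and available_key can be modelled with a toy class. `MiniSymbolTable` is hypothetical and follows only the behaviour stated in this section; it does not handle the case where a key is already taken by a different symbol.

```python
class MiniSymbolTable:
    """Toy symbol table following the behaviour described above."""

    def __init__(self):
        self._key_of = {}
        self._symbol_of = {}
        self._available_key = 0

    def add_pair(self, symbol, key):
        if symbol in self._key_of:
            return self._key_of[symbol]  # the existing key wins
        self._key_of[symbol] = key
        self._symbol_of[key] = symbol
        # The available key is always the highest key plus one.
        self._available_key = max(self._available_key, key + 1)
        return key

    def add_symbol(self, symbol):
        # The key is assigned automatically from the available key.
        return self.add_pair(symbol, self._available_key)

    def available_key(self):
        return self._available_key

    def find_index(self, symbol):
        return self._key_of.get(symbol, -1)

    def find_symbol(self, key):
        return self._symbol_of.get(key, "")


table = MiniSymbolTable()
eps = table.add_symbol("<eps>")
a = table.add_pair("a", 5)
b = table.add_symbol("b")
again = table.add_pair("a", 9)
```

Adding "a" a second time with a different key returns the key it already has, and lookups that fail return -1 or the empty string.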
-
checksum
() → str¶ Returns the label-agnostic MD5 checksum for the table.
-
copy
() → SymbolTable¶ Returns a copy of the symbol table.
-
find_index
(symbol:str) → int¶ Given a symbol, finds the associated index.
Parameters: symbol – A symbol string. Returns: The index associated with the symbol. -1 if symbol is not found.
-
find_symbol
(key:int) → str¶ Given an index, finds the associated symbol.
Parameters: key – An index. Returns: The symbol associated with the index key. Empty string if index is not found.
-
from_name
(name:str) → SymbolTable¶ Creates a new SymbolTable with the given name.
-
get_nth_key
(pos:int) → int¶ Retrieves the integer index of the n-th key in the table.
Parameters: pos – The n-th key to retrieve. Returns: The integer index of the n-th key or -1 if index is not found.
-
labeled_checksum
() → str¶ Returns the label-dependent MD5 checksum of the table.
-
member_index
(key:int) → bool¶ Given an index, returns whether it is found in the table.
This method returns a boolean indicating whether the given index is present in the table. If one intends to perform subsequent lookup, it is much better to simply call the
find_symbol
method and check the return value.
Parameters: key – An index. Returns: Whether or not the key is present in the table.
-
member_symbol
(symbol:str) → bool¶ Given a symbol, returns whether it is found in the table.
This method returns a boolean indicating whether the given symbol is present in the table. If one intends to perform subsequent lookup, it is much better to simply call the
find_index
method and check the return value.
Parameters: symbol – A symbol. Returns: Whether or not the symbol is present in the table.
-
name
() → str¶ Returns the name of the table.
-
num_symbols
() → int¶ Returns the number of symbols in the table.
-
read
(filename:str) → SymbolTable¶ Reads symbol table from binary file.
This class method creates a new SymbolTable.
Parameters: filename – The string location of the input binary file. Returns: A new SymbolTable instance. See also:
SymbolTable.read_text
.
-
read_text
(filename:str, opts:SymbolTableTextOptions=default) → SymbolTable¶ Reads symbol table from text file.
This class method creates a new SymbolTable.
Parameters: - filename – The string location of the input text file.
- opts (SymbolTableTextOptions) – The symbol table reading options.
Returns: A new SymbolTable instance.
See also:
SymbolTable.read
.
-
remove_symbol
(key:int)¶ Removes the symbol with the given key.
-
set_name
(new_name:str)¶ Sets the name of the table.
-
write
(filename:str) → bool¶ Serializes symbol table to a file.
This method writes the SymbolTable to a file in binary format.
Parameters: filename – The string location of the output file. Returns: True if write was successful, False otherwise.
-
write_text
(filename:str) → bool¶ Writes symbol table to text file.
This method writes the SymbolTable to a file in human-readable format.
Parameters: filename – The string location of the output file. Returns: True if write was successful, False otherwise.
-
class
kaldi.fstext.
SymbolTableIterator
[source]¶ Symbol table iterator.
This class is used for iterating over the (index, symbol) pairs in a symbol table. In addition to the full C++ API, it also supports the iterator protocol, e.g.
# Returns a symbol table containing only symbols referenced by fst.
def prune_symbol_table(fst, syms, inp=True):
    seen = set([0])
    for s in fst.states():
        for a in fst.arcs(s):
            seen.add(a.ilabel if inp else a.olabel)
    pruned = SymbolTable()
    for label, symbol in SymbolTableIterator(syms):
        if label in seen:
            pruned.add_pair(symbol, label)
    return pruned
Parameters: table – The symbol table.
-
done
() → bool¶ Indicates whether the iterator is exhausted or not.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: True if the iterator is exhausted, False otherwise.
-
next
()¶ Advances the iterator.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
reset
()¶ Resets the iterator to the initial position.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
-
symbol
() → str¶ Returns the current symbol string.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: A symbol string.
-
value
() → int¶ Returns the current integer index.
This method is provided for compatibility with the C++ API only; most users should use the Pythonic API.
Returns: An integer index.
-
-
class
kaldi.fstext.
SymbolTableTextOptions
¶ Options for reading symbol table from text file.
- SymbolTableTextOptions(allow_negative_labels=False):
- Creates options for reading symbol table from text file.
Parameters: allow_negative_labels (bool) – Allow negative labels? -
allow_negative_labels
¶ Allow negative labels? (Not recommended; may cause conflicts).
-
fst_field_separator
¶ Set of characters used as a separator between printed fields.
-
class
kaldi.fstext.
TropicalWeight
[source]¶ Tropical weight factory.
This class is used for creating new
TropicalWeight
instances.
- TropicalWeight():
- Creates an uninitialized
TropicalWeight
instance.
- TropicalWeight(weight):
- Creates a new
TropicalWeight
instance initialized with the weight.
Parameters: weight (float or FloatWeight) – The weight value.
-
from_float
(f:float) → TropicalWeight¶ Create a new tropical weight from a float.
-
from_other
(weight:TropicalWeight) → TropicalWeight¶ Create a new tropical weight from another.
-
hash
() → int¶ Returns the hash for the weight.
-
member
() → bool¶ Checks if weight is a member of the tropical semiring.
-
no_weight
() → TropicalWeight¶ No weight in tropical semiring.
-
one
() → TropicalWeight¶ One in tropical semiring, i.e. 0.0.
-
properties
() → int¶ Returns weight properties.
-
quantize
(delta:float=default) → TropicalWeight¶ Quantizes the weight.
-
reverse
() → TropicalWeight¶ Reverses the weight.
-
type
() → str¶ Returns weight type.
-
value
¶ Float value of the weight.
-
zero
() → TropicalWeight¶ Zero in tropical semiring, i.e. float +infinity.
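The values of `one` (0.0) and `zero` (+infinity) follow from how the tropical semiring combines weights: "plus" takes the minimum and "times" adds. The sketch below illustrates this with plain Python floats rather than `TropicalWeight` objects; the function names and the rounding rule used in `quantize` are assumptions for illustration, not the PyKaldi API.

```python
import math

# Plain-float stand-ins for the tropical semiring; not PyKaldi objects.
ZERO = math.inf  # semiring Zero: identity for plus, annihilator for times
ONE = 0.0        # semiring One: identity for times


def plus(a, b):
    """Tropical 'addition' keeps the smaller (better) cost."""
    return min(a, b)


def times(a, b):
    """Tropical 'multiplication' adds costs along a path."""
    return a + b


def quantize(w, delta=1.0 / 1024):
    """Rounds w to the nearest multiple of delta; infinities pass through."""
    if math.isinf(w):
        return w
    return math.floor(w / delta + 0.5) * delta


path_a = times(1.5, 2.0)     # cost of one path with two arcs
path_b = times(0.5, 4.0)     # cost of an alternative path
best = plus(path_a, path_b)  # combining alternatives keeps the cheaper one
```

The default `delta` above equals 0.0009765625, the comparison/quantization delta that appears throughout this module.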
-
kaldi.fstext.
arcmap
(ifst, map_type='identity', delta=0.0009765625, weight=None)[source]¶ Constructively applies a transform to all arcs and final states.
This operation transforms each arc and final state in the input FST using one of the following:
- identity: maps to self.
- input_epsilon: replaces all input labels with epsilon.
- invert: reciprocates all non-Zero weights.
- output_epsilon: replaces all output labels with epsilon.
- plus: adds a constant to all weights.
- quantize: quantizes weights.
- rmweight: replaces all non-Zero weights with 1.
- superfinal: redirects final states to a new superfinal state.
- times: right-multiplies a constant to all weights.
Parameters: - ifst – The input FST.
- map_type – A string matching a known mapping operation (see above).
- delta – Comparison/quantization delta (ignored unless map_type is quantize; default: 0.0009765625).
- weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring, passed to the arc-mapper; this is ignored unless map_type is plus (in which case it defaults to semiring Zero) or times (in which case it defaults to semiring One).
Returns: An FST with arcs and final states remapped.
Raises: ValueError – Unknown map type.
See also: statemap.
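To make several of the map types concrete, here is a minimal pure-Python sketch over the tropical semiring. It is not the PyKaldi implementation: arcs are hypothetical `(ilabel, olabel, weight, nextstate)` tuples, weights are plain floats, final weights are left out for brevity, and `arcmap_sketch` is an illustrative name.

```python
import math

ZERO, ONE = math.inf, 0.0  # tropical semiring Zero and One as plain floats
EPSILON = 0                # label 0 is epsilon by OpenFst convention


def map_arc(arc, map_type, constant):
    """Maps one (ilabel, olabel, weight, nextstate) tuple."""
    ilabel, olabel, w, dest = arc
    if map_type == "identity":
        pass
    elif map_type == "input_epsilon":
        ilabel = EPSILON
    elif map_type == "output_epsilon":
        olabel = EPSILON
    elif map_type == "invert":
        w = w if w == ZERO else -w  # tropical reciprocal is negation
    elif map_type == "rmweight":
        w = w if w == ZERO else ONE
    elif map_type == "plus":
        w = min(w, constant)        # tropical plus is min
    elif map_type == "times":
        w = w + constant            # tropical times is ordinary addition
    else:
        raise ValueError("Unknown map type: {}".format(map_type))
    return (ilabel, olabel, w, dest)


def arcmap_sketch(arcs, map_type="identity", weight=None):
    """Returns a new arc list; the input list is left untouched."""
    if weight is None:
        # Defaults follow the text: One for times, Zero for plus.
        weight = ONE if map_type == "times" else ZERO
    return [map_arc(arc, map_type, weight) for arc in arcs]


arcs = [(1, 2, 0.5, 1), (3, 4, ZERO, 2)]
inverted = arcmap_sketch(arcs, "invert")
scaled = arcmap_sketch(arcs, "times", weight=1.0)
```

With the default constants, both `plus` and `times` leave every weight unchanged, which is why those defaults are safe.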
-
kaldi.fstext.
compat_symbols
(syms1:SymbolTable, syms2:SymbolTable, warning:bool=default) → bool¶ Returns true if the two symbol tables have equal checksums.
Passing in
None
for either table always returns true.
-
kaldi.fstext.
compose
(ifst1, ifst2, connect=True, compose_filter='auto')[source]¶ Constructively composes two FSTs.
This operation computes the composition of two FSTs. If A transduces string x to y with weight a and B transduces y to z with weight b, then their composition transduces string x to z with weight a otimes b. The output labels of the first transducer or the input labels of the second transducer must be sorted (or otherwise support appropriate matchers).
Parameters: - ifst1 – The first input FST.
- ifst2 – The second input FST.
- connect – Should output be trimmed?
- compose_filter – A string matching a known composition filter; one of: “alt_sequence”, “auto”, “match”, “null”, “sequence”, “trivial”.
Returns: A composed FST.
See also:
arcsort
.
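The definition of composition can be illustrated with a small pure-Python sketch for epsilon-free transducers over the tropical semiring, where "a otimes b" is `a + b`. This is an illustration under stated assumptions, not how PyKaldi or OpenFst implement it: FSTs are hypothetical dicts with `start`, `finals` and `arcs` keys, no matchers or filters are involved, and the result is not trimmed.

```python
from collections import deque


def compose_sketch(a, b):
    """Composes two epsilon-free FSTs given as dicts (tropical weights)."""
    start = (a["start"], b["start"])
    arcs, finals = {}, {}
    queue, seen = deque([start]), {start}
    while queue:
        state = queue.popleft()
        qa, qb = state
        arcs[state] = []
        # A pair state is final when both components are final.
        if qa in a["finals"] and qb in b["finals"]:
            finals[state] = a["finals"][qa] + b["finals"][qb]
        for ilabel, mid_a, wa, na in a["arcs"].get(qa, []):
            for mid_b, olabel, wb, nb in b["arcs"].get(qb, []):
                if mid_a != mid_b:
                    continue  # output of A must match input of B
                dest = (na, nb)
                arcs[state].append((ilabel, olabel, wa + wb, dest))
                if dest not in seen:
                    seen.add(dest)
                    queue.append(dest)
    return {"start": start, "finals": finals, "arcs": arcs}


# A maps label 1 to 2 with weight 1.0; B maps label 2 to 3 with weight 2.0.
A = {"start": 0, "finals": {1: 0.0}, "arcs": {0: [(1, 2, 1.0, 1)]}}
B = {"start": 0, "finals": {1: 0.5}, "arcs": {0: [(2, 3, 2.0, 1)]}}
AB = compose_sketch(A, B)
```

The nested loop over arc pairs is what label sorting avoids in a real implementation: with sorted labels, matching arcs can be found by search instead of exhaustive comparison.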
-
kaldi.fstext.
deserialize_symbol_table
(str:bytes) → SymbolTable¶ Deserializes a symbol table.
-
kaldi.fstext.
determinize
(ifst, delta=0.0009765625, weight=None, nstate=-1, subsequential_label=0, det_type='functional', increment_subsequential_label=False)[source]¶ Constructively determinizes a weighted FST.
This operation creates an equivalent FST that has the property that no state has two transitions with the same input label. For this algorithm, epsilon transitions are treated as regular symbols (cf. rmepsilon).
Parameters: - ifst – The input FST.
- delta – Comparison/quantization delta (default: 0.0009765625).
- weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired weight threshold below which paths are pruned; if None, no paths are pruned.
- nstate – State number threshold (default: -1).
- subsequential_label – Input label of arc corresponding to residual final output when producing a subsequential transducer.
- det_type – Type of determinization; one of: “functional” (input transducer is functional), “nonfunctional” (input transducer is not functional) and “disambiguate” (input transducer is not functional but only keep the min of ambiguous outputs).
- increment_subsequential_label – Increment subsequential when creating several arcs for the residual final output at a given state.
Returns: An equivalent deterministic FST.
Raises: ValueError – Unknown determinization type.
See also: disambiguate, rmepsilon.
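The property that determinization establishes is easy to check directly. The helper below is a hypothetical pure-Python check, not a PyKaldi function; it assumes arcs are stored per state as `(ilabel, olabel, weight, nextstate)` tuples and, like the algorithm, treats epsilon (label 0) as a regular symbol.

```python
def is_input_deterministic(arcs_by_state):
    """True if no state has two outgoing arcs with the same input label."""
    for arcs in arcs_by_state.values():
        ilabels = [ilabel for ilabel, _olabel, _weight, _dest in arcs]
        if len(ilabels) != len(set(ilabels)):
            return False
    return True


# State 0 has two arcs reading label 1, so this machine is not deterministic.
nondeterministic = {0: [(1, 1, 0.5, 1), (1, 1, 1.5, 2)], 1: [], 2: []}
# Here every state has at most one arc per input label.
deterministic = {0: [(1, 1, 0.5, 1), (2, 2, 1.5, 2)], 1: [], 2: []}
```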
-
kaldi.fstext.
difference
(ifst1, ifst2, connect=True, compose_filter='auto')[source]¶ Constructively computes the difference of two FSTs.
This operation computes the difference between two FSAs. Only strings that are in the first automaton but not in second are retained in the result. The first argument must be an acceptor; the second argument must be an unweighted, epsilon-free, deterministic acceptor. The output labels of the first transducer or the input labels of the second transducer must be sorted (or otherwise support appropriate matchers).
Parameters: - ifst1 – The first input FST.
- ifst2 – The second input FST.
- connect – Should the output FST be trimmed?
- compose_filter – A string matching a known composition filter; one of: “alt_sequence”, “auto”, “match”, “null”, “sequence”, “trivial”.
Returns: An FST representing the difference of the FSTs.
-
kaldi.fstext.
disambiguate
(ifst, delta=0.0009765625, weight=None, nstate=-1, subsequential_label=0)[source]¶ Constructively disambiguates a weighted transducer.
This operation disambiguates a weighted transducer. The result will be an equivalent FST that has the property that no two successful paths have the same input labeling. For this algorithm, epsilon transitions are treated as regular symbols (cf. rmepsilon).
Parameters: - ifst – The input FST.
- delta – Comparison/quantization delta (default: 0.0009765625).
- weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired weight threshold below which paths are pruned; if None, no paths are pruned.
- nstate – State number threshold.
- subsequential_label – Input label of arc corresponding to residual final output when producing a subsequential transducer.
Returns: An equivalent disambiguated FST.
See also: determinize, rmepsilon.
-
kaldi.fstext.
epsnormalize
(ifst, eps_norm_output=False)[source]¶ Constructively epsilon-normalizes an FST.
This operation creates an equivalent FST that is epsilon-normalized. An acceptor is epsilon-normalized if it is epsilon-removed (cf. rmepsilon). A transducer is input epsilon-normalized if, in addition, along any path, all arcs with epsilon input labels follow all arcs with non-epsilon input labels. Output epsilon-normalized is defined similarly. The input FST must be functional.
Parameters: - ifst – The input FST.
- eps_norm_output – Should the FST be output epsilon-normalized?
Returns: An equivalent epsilon-normalized FST.
See also:
rmepsilon
.
-
kaldi.fstext.
equal
(ifst1, ifst2, delta=0.0009765625)[source]¶ Are two FSTs equal?
This function tests whether two FSTs have the same states with the same numbering and the same transitions with the same labels and weights in the same order.
Parameters: - ifst1 – The first input FST.
- ifst2 – The second input FST.
- delta – Comparison/quantization delta (default: 0.0009765625).
Returns: True if the FSTs satisfy the above condition, else False.
See also: equivalent, isomorphic, randequivalent.
-
kaldi.fstext.
equivalent
(ifst1, ifst2, delta=0.0009765625)[source]¶ Are the two acceptors equivalent?
This operation tests whether two epsilon-free deterministic weighted acceptors are equivalent, that is if they accept the same strings with the same weights.
Parameters: - ifst1 – The first input FST.
- ifst2 – The second input FST.
- delta – Comparison/quantization delta (default: 0.0009765625).
Returns: True if the FSTs satisfy the above condition, else False.
Raises: RuntimeError – Equivalence test encountered error.
See also: equal, isomorphic, randequivalent.
-
kaldi.fstext.
indices_to_symbols
(symbol_table, indices)[source]¶ Converts indices to symbols by looking them up in the symbol table.
Parameters: - symbol_table (SymbolTable) – The symbol table.
- indices (List[int]) – The list of indices.
Returns: The list of symbols corresponding to the given indices.
Return type: List[str]
Raises: KeyError
– If an index is not found in the symbol table.
-
kaldi.fstext.
intersect
(ifst1, ifst2, connect=True, compose_filter='auto')[source]¶ Constructively intersects two FSTs.
This operation computes the intersection (Hadamard product) of two FSTs. Only strings that are in both automata are retained in the result. The two arguments must be acceptors. One of the arguments must be label-sorted (or otherwise support appropriate matchers).
Parameters: - ifst1 – The first input FST.
- ifst2 – The second input FST.
- connect – Should output be trimmed?
- compose_filter – A string matching a known composition filter; one of: “alt_sequence”, “auto”, “match”, “null”, “sequence”, “trivial”.
Returns: An intersected FST.
-
kaldi.fstext.
isomorphic
(ifst1, ifst2, delta=0.0009765625)[source]¶ Are the two acceptors isomorphic?
This operation determines if two transducers with a certain required determinism have the same states, irrespective of numbering, and the same transitions with the same labels and weights, irrespective of ordering. In other words, FSTs A, B are isomorphic if and only if the states of A can be renumbered and the transitions leaving each state reordered so the two are equal (according to the definition given in equal).
Parameters: - ifst1 – The first input FST.
- ifst2 – The second input FST.
- delta – Comparison/quantization delta (default: 0.0009765625).
Returns: True if the two transducers satisfy the above condition, else False.
See also: equal, equivalent, randequivalent.
-
kaldi.fstext.
prune
(ifst, weight=None, nstate=-1, delta=0.0009765625)[source]¶ Constructively removes paths with weights below a certain threshold.
This operation deletes states and arcs in the input FST that do not belong to a successful path whose weight is no more (w.r.t the natural semiring order) than the threshold t otimes the weight of the shortest path in the input FST. Weights must be commutative and have the path property.
Parameters: - ifst – The input FST.
- weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired weight threshold below which paths are pruned; if None, no paths are pruned.
- nstate – State number threshold (default: -1).
- delta – Comparison/quantization delta (default: 0.0009765625).
Returns: A pruned FST.
See also: The destructive variant.
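The pruning criterion can be shown on explicitly enumerated paths. This is a deliberate simplification (the real operation works on states and arcs, not on a path list), and `prune_paths` is a hypothetical helper over the tropical semiring, where "t otimes shortest" is `t + shortest` and the natural order is the usual `<=` on floats.

```python
def prune_paths(path_weights, threshold):
    """Keeps paths whose weight is within threshold of the best path."""
    best = min(path_weights.values())
    limit = threshold + best  # threshold otimes shortest-path weight
    return {path: w for path, w in path_weights.items() if w <= limit}


paths = {"a b": 1.0, "a c": 2.5, "d": 4.0}
kept = prune_paths(paths, threshold=2.0)
```

With a best path of weight 1.0 and a threshold of 2.0, every path heavier than 3.0 is dropped.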
-
kaldi.fstext.
push
(ifst, push_weights=False, push_labels=False, remove_common_affix=False, remove_total_weight=False, to_final=False, delta=0.0009765625)[source]¶ Constructively pushes weights/labels towards initial or final states.
This operation produces an equivalent transducer by pushing the weights and/or the labels towards the initial state or toward the final states.
When pushing weights towards the initial state, the sum of the weight of the outgoing transitions and final weight at any non-initial state is equal to 1 in the resulting machine. When pushing weights towards the final states, the sum of the weight of the incoming transitions at any state is equal to 1. Weights need to be left distributive when pushing towards the initial state and right distributive when pushing towards the final states.
Pushing labels towards the initial state consists in minimizing at every state the length of the longest common prefix of the output labels of the outgoing paths. Pushing labels towards the final states consists in minimizing at every state the length of the longest common suffix of the output labels of the incoming paths.
Parameters: - ifst – The input FST.
- push_weights – Should weights be pushed?
- push_labels – Should labels be pushed?
- remove_common_affix – If pushing labels, should common prefix/suffix be removed?
- remove_total_weight – If pushing weights, should total weight be removed?
- to_final – Push towards final states?
- delta – Comparison/quantization delta (default: 0.0009765625).
Returns: An equivalent pushed FST.
See also: The destructive variant.
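Weight pushing towards the initial state can be sketched in a few lines for the tropical semiring. The code below is an illustration under several assumptions, not the PyKaldi implementation: arcs are hypothetical `(src, label, weight, dest)` tuples, every state can reach a final state, there are no negative cycles, and the start state has no incoming arcs (so the total weight can be kept on its outgoing arcs).

```python
import math


def distance_to_final(arcs, finals, num_states):
    """Shortest distance from each state to a final state (tropical)."""
    dist = [finals.get(q, math.inf) for q in range(num_states)]
    changed = True
    while changed:  # relax until no distance improves
        changed = False
        for src, _label, weight, dest in arcs:
            if weight + dist[dest] < dist[src]:
                dist[src] = weight + dist[dest]
                changed = True
    return dist


def push_weights_to_initial(arcs, finals, start, num_states):
    """Reweights arcs so weight sits as early as possible on each path."""
    dist = distance_to_final(arcs, finals, num_states)
    total = dist[start]
    pushed_arcs = []
    for src, label, weight, dest in arcs:
        new_weight = weight + dist[dest] - dist[src]
        if src == start:
            new_weight += total  # keep the total weight in the machine
        pushed_arcs.append((src, label, new_weight, dest))
    pushed_finals = {}
    for q, rho in finals.items():
        pushed_finals[q] = rho - dist[q] + (total if q == start else 0.0)
    return pushed_arcs, pushed_finals


arcs = [(0, 1, 1.0, 1), (0, 2, 4.0, 2), (1, 3, 2.0, 2)]
finals = {2: 0.5}
pushed_arcs, pushed_finals = push_weights_to_initial(arcs, finals, 0, 3)
```

After pushing, the minimum over outgoing arc weights and final weight at each non-initial state is 0.0, the tropical One, while the weight of every complete path is unchanged.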
-
kaldi.fstext.
randequivalent
(ifst1, ifst2, npath=1, delta=0.0009765625, seed=None, select='uniform', max_length=2147483647)[source]¶ Are two acceptors stochastically equivalent?
This operation tests whether two FSTs are equivalent by randomly generating paths alternately in each of the two FSTs. For each randomly generated path, the algorithm computes for each of the two FSTs the sum of the weights of all the successful paths sharing the same input and output labels as the randomly generated path and checks that these two values are within delta.
Parameters: - ifst1 – The first input FST.
- ifst2 – The second input FST.
- npath – The number of random paths to generate.
- delta – Comparison/quantization delta.
- seed – An optional seed value for random path generation; if None, the current time and process ID are used.
- select – A string matching a known random arc selection type; one of: “uniform”, “log_prob”, “fast_log_prob”.
- max_length – The maximum length of each random path.
Returns: True if the two transducers satisfy the above condition, else False.
Raises: RuntimeError – Random equivalence test encountered error.
See also: equal, equivalent, isomorphic, randgen.
-
kaldi.fstext.
randgen
(ifst, npath=1, seed=None, select='uniform', max_length=2147483647, weighted=False, remove_total_weight=False)[source]¶ Randomly generate successful paths in an FST.
This operation randomly generates a set of successful paths in the input FST. This relies on a mechanism for selecting arcs, specified using the select argument. The default selector, “uniform”, randomly selects a transition using a uniform distribution. The “log_prob” selector randomly selects a transition w.r.t. the weights treated as negative log probabilities after normalizing for the total weight leaving the state. In all cases, finality is treated as a transition to a super-final state.
Parameters: - ifst – The input FST.
- npath – The number of random paths to generate.
- seed – An optional seed value for random path generation; if None, the current time and process ID are used.
- select – A string matching a known random arc selection type; one of: “uniform”, “log_prob”, “fast_log_prob”.
- max_length – The maximum length of each random path.
- weighted – Should the output be weighted by path count?
- remove_total_weight – Should the total weight be removed (ignored when
weighted
is False)?
Returns: An FST containing one or more random paths.
See also:
randequivalent
.
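The difference between the “uniform” and “log_prob” selectors comes down to the distribution used to pick the next arc at a state. The sketch below shows that distribution for plain float weights; `log_prob_distribution` and `select_arc` are hypothetical helper names, and the final weight is assumed to have been appended to the list as one more "arc" to a super-final state.

```python
import math
import random


def log_prob_distribution(weights):
    """Turns negative log probabilities into a normalized distribution."""
    unnormalized = [math.exp(-w) for w in weights]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def select_arc(weights, rng, select="uniform"):
    """Returns the position of the chosen arc among those leaving a state."""
    if select == "uniform":
        return rng.randrange(len(weights))
    if select == "log_prob":
        probs = log_prob_distribution(weights)
        return rng.choices(range(len(weights)), weights=probs)[0]
    raise ValueError("Unknown selection type: {}".format(select))


# Weights need not be normalized: both arcs here get probability 0.5.
even = log_prob_distribution([0.0, 0.0])
# -log(0.5), -log(0.25), -log(0.25) map back to 0.5, 0.25, 0.25.
skewed = log_prob_distribution([math.log(2), math.log(4), math.log(4)])
picked = select_arc([math.log(2), math.log(4), math.log(4)],
                    random.Random(0), select="log_prob")
```

Passing an explicit `random.Random(seed)` plays the role of the `seed` argument: the same seed reproduces the same choices.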
-
kaldi.fstext.
read_fst_kaldi
(rxfilename)[source]¶ Reads FST using Kaldi I/O mechanisms.
Does not support reading in text mode.
Parameters: rxfilename (str) – Extended filename for reading the FST.
Returns: An FST object.
Raises:
-
kaldi.fstext.
relabel_symbol_table
(table:SymbolTable, pairs:list<tuple<int, int>>) → SymbolTable¶ Relabels a symbol table as specified by the input list of pairs.
The new symbol table only retains symbols for which a relabeling is explicitly specified.
Parameters: - table – A symbol table.
- pairs – A list of (old label, new label) pairs.
Returns: A new symbol table.
-
kaldi.fstext.
replace
(pairs, root_label, call_arc_labeling='input', return_arc_labeling='neither', epsilon_on_replace=False, return_label=0)[source]¶ Recursively replaces arcs in the root FST with other FST(s).
This operation performs the dynamic replacement of arcs in one FST with another FST, allowing the definition of FSTs analogous to RTNs. It takes as input a set of pairs formed by a non-terminal label and its corresponding FST, and a label identifying the root FST in that set. The resulting FST is obtained by taking the root FST and recursively replacing each arc having a nonterminal as output label by its corresponding FST. More precisely, an arc from state s to state d with (nonterminal) output label n in this FST is replaced by redirecting this “call” arc to the initial state of a copy F of the FST for n, and adding “return” arcs from each final state of F to d. Optional arguments control how the call and return arcs are labeled; by default, the only non-epsilon label is placed on the call arc.
Parameters: - pairs – An iterable of (nonterminal label, FST) pairs, where the former is an unsigned integer and the latter is an Fst instance.
- root_label – Label identifying the root FST.
- call_arc_labeling – A string indicating which call arc labels should be non-epsilon. One of: “input” (default), “output”, “both”, “neither”. This value is set to “neither” if epsilon_on_replace is True.
- return_arc_labeling – A string indicating which return arc labels should be non-epsilon. One of: “input”, “output”, “both”, “neither” (default). This value is set to “neither” if epsilon_on_replace is True.
- epsilon_on_replace – Should call and return arcs be epsilon arcs? If True, this effectively overrides call_arc_labeling and return_arc_labeling, setting both to “neither”.
- return_label – The integer label for return arcs.
Returns: An FST resulting from expanding the input RTN.
-
kaldi.fstext.
reverse
(ifst, require_superinitial=True)[source]¶ Constructively reverses an FST’s transduction.
This operation reverses an FST. If A transduces string x to y with weight a, then the reverse of A transduces the reverse of x to the reverse of y with weight a.Reverse(). (Typically, a = a.Reverse() and Arc = RevArc, e.g., TropicalWeight and LogWeight.) In general, e.g., when the weights only form a left or right semiring, the output arc type must match the input arc type.
Parameters: - ifst – The input FST.
- require_superinitial – Should a superinitial state be created?
Returns: A reversed FST.
-
kaldi.fstext.
rmepsilon
(ifst, connect=True, reverse=False, queue_type='auto', delta=0.0009765625, weight=None, nstate=-1)[source]¶ Constructively removes epsilon transitions from an FST.
This operation removes epsilon transitions (those where both input and output labels are epsilon) from an FST.
Parameters: - ifst – The input FST.
- connect – Should output be trimmed?
- reverse – Should epsilon transitions be removed in reverse order?
- queue_type – A string matching a known queue type; one of: “auto”, “fifo”, “lifo”, “shortest”, “state”, “top”.
- delta – Comparison/quantization delta (default: 0.0009765625).
- weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired weight threshold; paths with weights below this threshold will be pruned.
- nstate – State number threshold (default: -1).
Returns: An equivalent FST with no epsilon transitions.
-
kaldi.fstext.
serialize_symbol_table
(table:SymbolTable) → bytes¶ Serializes a symbol table.
-
kaldi.fstext.
shortestdistance
(ifst, reverse=False, source=-1, queue_type='auto', delta=0.0009765625)[source]¶ Compute the shortest distance from the initial or final state.
This operation computes the shortest distance from the initial state to every state (when reverse is False) or from every state to the final states (when reverse is True). The shortest distance from p to q is the oplus-sum of the weights of all the paths between p and q. The weights must be right (if reverse is False) or left (if reverse is True) distributive, and k-closed (i.e., 1 oplus x oplus x^2 oplus … oplus x^{k + 1} = 1 oplus x oplus x^2 oplus … oplus x^k; e.g., TropicalWeight).
Parameters: - ifst – The input FST.
- reverse – Should the reverse distance (from each state to the final state) be computed?
- source – Source state (this is ignored if reverse is True). If NO_STATE_ID (-1), use FST’s initial state.
- queue_type – A string matching a known queue type; one of: “auto”, “fifo”, “lifo”, “shortest”, “state”, “top” (this is ignored if reverse is True).
- delta – Comparison/quantization delta (default: 0.0009765625).
Returns: A list of Weight objects representing the shortest distance for each state.
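In the tropical semiring, summing over all paths means taking the minimum path cost, so the forward shortest distance reduces to a familiar relaxation loop. The sketch below assumes plain float weights, a hypothetical adjacency dict of `(weight, nextstate)` pairs, a FIFO queue, and no negative cycles; it is not the PyKaldi implementation.

```python
import math
from collections import deque


def shortest_distance_sketch(arcs_by_state, num_states, source=0):
    """Returns one tropical distance per state, measured from source."""
    dist = [math.inf] * num_states  # semiring Zero: not reached yet
    dist[source] = 0.0              # semiring One at the source
    queue = deque([source])
    while queue:
        state = queue.popleft()
        for weight, dest in arcs_by_state.get(state, []):
            candidate = dist[state] + weight  # times: extend the path
            if candidate < dist[dest]:        # plus: keep the minimum
                dist[dest] = candidate
                queue.append(dest)
    return dist


arcs_by_state = {0: [(1.0, 1), (5.0, 2)], 1: [(1.5, 2)]}
distances = shortest_distance_sketch(arcs_by_state, num_states=4)
```

State 3 has no incoming arcs, so its distance stays at infinity, the tropical Zero.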
-
kaldi.fstext.
shortestpath
(ifst, nshortest=1, unique=False, queue_type='auto', delta=0.0009765625, weight=None, nstate=-1)[source]¶ Construct an FST containing the shortest path(s) in the input FST.
This operation produces an FST containing the n-shortest paths in the input FST. The n-shortest paths are the n-lowest weight paths w.r.t. the natural semiring order. The single path that can be read from the ith of at most n transitions leaving the initial state of the resulting FST is the ith shortest path. The weights need to be right distributive and have the path property. They also need to be left distributive for n-shortest with n > 1 (e.g., TropicalWeight).
Parameters: - ifst – The input FST.
- nshortest – The number of paths to return.
- unique – Should the resulting FST only contain distinct paths? (Requires the input FST to be an acceptor; epsilons are treated as if they are regular symbols.)
- queue_type – A string matching a known queue type; one of: “auto”, “fifo”, “lifo”, “shortest”, “state”, “top”.
- delta – Comparison/quantization delta (default: 0.0009765625).
- weight – A Weight in the FST semiring or an object that can be converted to a Weight in the FST semiring indicating the desired weight threshold below which paths are pruned; if omitted, no paths are pruned.
- nstate – State number threshold (default: -1).
Returns: An FST containing the n-shortest paths.
-
kaldi.fstext.
statemap
(ifst, map_type)[source]¶ Constructively applies a transform to all states.
This operation transforms each state according to the requested map type. Two state-mapping operations are currently supported (see map_type below).
Parameters: - ifst – The input FST.
- map_type – A string matching a known mapping operation; one of: “arc_sum” (sum weights of identically-labeled multi-arcs), “arc_unique” (deletes non-unique identically-labeled multi-arcs).
Returns: An FST with states remapped.
Raises: ValueError – Unknown map type.
See also: arcmap.
-
kaldi.fstext.
symbols_to_indices
(symbol_table, symbols)[source]¶ Converts symbols to indices by looking them up in the symbol table.
Parameters: - symbol_table (SymbolTable) – The symbol table.
- symbols (List[str]) – The list of symbols.
Returns: The list of indices corresponding to the given symbols.
Return type: List[int]
Raises: KeyError
– If a symbol is not found in the symbol table.
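The contract of the two conversion helpers (a list in, a list out, `KeyError` on the first unknown entry) can be mimicked with plain dictionaries. The functions below are hypothetical stand-ins that take dicts instead of a `SymbolTable`; they only illustrate the documented behaviour.

```python
def symbols_to_indices_sketch(index_of, symbols):
    """Maps each symbol to its index; raises KeyError if one is unknown."""
    indices = []
    for symbol in symbols:
        if symbol not in index_of:
            raise KeyError("Symbol not found: {}".format(symbol))
        indices.append(index_of[symbol])
    return indices


def indices_to_symbols_sketch(symbol_of, indices):
    """Maps each index to its symbol; raises KeyError if one is unknown."""
    symbols = []
    for index in indices:
        if index not in symbol_of:
            raise KeyError("Index not found: {}".format(index))
        symbols.append(symbol_of[index])
    return symbols


index_of = {"<eps>": 0, "hello": 1, "world": 2}
symbol_of = {index: symbol for symbol, index in index_of.items()}
ids = symbols_to_indices_sketch(index_of, ["hello", "world"])
words = indices_to_symbols_sketch(symbol_of, ids)
```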
-
kaldi.fstext.
synchronize
(ifst)[source]¶ Constructively synchronizes an FST.
This operation synchronizes a transducer. The result will be an equivalent FST that has the property that during the traversal of a path, the delay is either zero or strictly increasing, where the delay is the difference between the number of non-epsilon output labels and input labels along the path. For the algorithm to terminate, the input transducer must have bounded delay, i.e., the delay of every cycle must be zero.
Parameters: ifst – The input FST. Returns: An equivalent synchronized FST.
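The notion of delay used above is simple to compute for a single path. The helper below is a hypothetical illustration, assuming a path is a list of `(ilabel, olabel)` pairs with 0 standing for epsilon; it returns the running delay after each arc.

```python
def path_delays(path):
    """Running delay: non-epsilon outputs minus non-epsilon inputs so far."""
    delay, delays = 0, []
    for ilabel, olabel in path:
        delay += (olabel != 0) - (ilabel != 0)
        delays.append(delay)
    return delays


# Reads one symbol silently, then maps 2 to 7, then emits 8 without input.
delays = path_delays([(1, 0), (2, 7), (0, 8)])
# A cycle with this labeling has zero net delay, so the delay stays bounded.
cycle_net_delay = path_delays([(1, 0), (0, 9)])[-1]
```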
-
kaldi.fstext.
write_fst_kaldi
(fst, wxfilename)[source]¶ Writes FST using Kaldi I/O mechanisms.
FST is written in binary mode without Kaldi binary mode header.
Parameters: - fst – The FST to write.
- wxfilename (str) – Extended filename for writing the FST.
Raises: IOError
– If writing fails.
kaldi.fstext.enums¶
Functions
GetArcSortType | Calls C++ function
GetClosureType | Calls C++ function
GetComposeFilter | Calls C++ function
GetDeterminizeType | Calls C++ function
GetEncodeFlags | Calls C++ function
GetEpsNormalizeType | Calls C++ function
GetMapType | Calls C++ function
GetProjectType | Calls C++ function
GetPushFlags | Calls C++ function
GetQueueType | Calls C++ function
GetRandArcSelection | Calls C++ function
GetReplaceLabelType | Calls C++ function
GetReweightType | Calls C++ function
Classes
ArcSortType | An enumeration.
ClosureType | An enumeration.
ComposeFilter | An enumeration.
DeterminizeType | An enumeration.
EncodeType | An enumeration.
EpsNormalizeType | An enumeration.
MapType | An enumeration.
MatchType | An enumeration.
ProjectType | An enumeration.
QueueType | An enumeration.
RandArcSelection | An enumeration.
ReplaceLabelType | An enumeration.
ReweightType | An enumeration.
-
class
kaldi.fstext.enums.
ComposeFilter
¶ An enumeration.
-
ALT_SEQUENCE_FILTER
= 4¶
-
AUTO_FILTER
= 0¶
-
MATCH_FILTER
= 5¶
-
NULL_FILTER
= 1¶
-
SEQUENCE_FILTER
= 3¶
-
TRIVIAL_FILTER
= 2¶
-
-
class
kaldi.fstext.enums.
DeterminizeType
¶ An enumeration.
-
DETERMINIZE_DISAMBIGUATE
= 2¶
-
DETERMINIZE_FUNCTIONAL
= 0¶
-
DETERMINIZE_NONFUNCTIONAL
= 1¶
-
-
kaldi.fstext.enums.
GetArcSortType
(str:str) -> (success:bool, sort_type:ArcSortType)¶ Calls C++ function bool ::fst::script::GetArcSortType(::std::string, ::fst::script::ArcSortType*)
-
kaldi.fstext.enums.
GetClosureType
(closure_plus:bool) → ClosureType¶ Calls C++ function ::fst::ClosureType ::fst::script::GetClosureType(bool)
-
kaldi.fstext.enums.
GetComposeFilter
(str:str) -> (success:bool, compose_filter:ComposeFilter)¶ Calls C++ function bool ::fst::script::GetComposeFilter(::std::string, ::fst::ComposeFilter*)
-
kaldi.fstext.enums.
GetDeterminizeType
(str:str) -> (success:bool, det_type:DeterminizeType)¶ Calls C++ function bool ::fst::script::GetDeterminizeType(::std::string, ::fst::DeterminizeType*)
-
kaldi.fstext.enums.
GetEncodeFlags
(encode_labels:bool, encode_weights:bool) → int¶ Calls C++ function unsigned int ::fst::script::GetEncodeFlags(bool, bool)
-
kaldi.fstext.enums.
GetEpsNormalizeType
(eps_norm_output:bool) → EpsNormalizeType¶ Calls C++ function ::fst::EpsNormalizeType ::fst::script::GetEpsNormalizeType(bool)
-
kaldi.fstext.enums.
GetMapType
(str:str) -> (success:bool, sort_type:MapType)¶ Calls C++ function bool ::fst::script::GetMapType(::std::string, ::fst::script::MapType*)
-
kaldi.fstext.enums.
GetProjectType
(project_output:bool) → ProjectType¶ Calls C++ function ::fst::ProjectType ::fst::script::GetProjectType(bool)
-
kaldi.fstext.enums.
GetPushFlags
(push_weights:bool, push_labels:bool, remove_total_weight:bool, remove_common_affix:bool) → int¶ Calls C++ function unsigned int ::fst::script::GetPushFlags(bool, bool, bool, bool)
-
kaldi.fstext.enums.
GetQueueType
(str:str) -> (success:bool, queue_type:QueueType)¶ Calls C++ function bool ::fst::script::GetQueueType(::std::string, ::fst::QueueType*)
-
kaldi.fstext.enums.
GetRandArcSelection
(str:str) -> (success:bool, ras:RandArcSelection)¶ Calls C++ function bool ::fst::script::GetRandArcSelection(::std::string, ::fst::script::RandArcSelection*)
-
kaldi.fstext.enums.
GetReplaceLabelType
(str:str, epsilon_on_replace:bool) -> (success:bool, rlt:ReplaceLabelType)¶ Calls C++ function bool ::fst::script::GetReplaceLabelType(::std::string, bool, ::fst::ReplaceLabelType*)
-
kaldi.fstext.enums.
GetReweightType
(to_final:bool) → ReweightType¶ Calls C++ function ::fst::ReweightType ::fst::script::GetReweightType(bool)
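Several of the string-based lookups above return a `(success, value)` pair instead of raising. A minimal sketch of consuming that convention follows. It does not import PyKaldi: the `ComposeFilter` class is a stand-in that copies the values listed for `kaldi.fstext.enums.ComposeFilter`, the name table reuses the filter strings accepted by `compose`, and `get_compose_filter` and `compose_filter_or_raise` are hypothetical helpers. What the real function returns as the value on failure is not stated here, so the placeholder below is an assumption.

```python
from enum import IntEnum


class ComposeFilter(IntEnum):
    """Stand-in copying the documented ComposeFilter values."""
    AUTO_FILTER = 0
    NULL_FILTER = 1
    TRIVIAL_FILTER = 2
    SEQUENCE_FILTER = 3
    ALT_SEQUENCE_FILTER = 4
    MATCH_FILTER = 5


_FILTER_BY_NAME = {
    "alt_sequence": ComposeFilter.ALT_SEQUENCE_FILTER,
    "auto": ComposeFilter.AUTO_FILTER,
    "match": ComposeFilter.MATCH_FILTER,
    "null": ComposeFilter.NULL_FILTER,
    "sequence": ComposeFilter.SEQUENCE_FILTER,
    "trivial": ComposeFilter.TRIVIAL_FILTER,
}


def get_compose_filter(name):
    """Returns (success, compose_filter), like the documented signature."""
    value = _FILTER_BY_NAME.get(name)
    if value is None:
        # Placeholder value on failure; this choice is an assumption.
        return False, ComposeFilter.AUTO_FILTER
    return True, value


def compose_filter_or_raise(name):
    """Turns the (success, value) pair into a Pythonic exception."""
    success, value = get_compose_filter(name)
    if not success:
        raise ValueError("Unknown compose filter type: {}".format(name))
    return value


chosen = compose_filter_or_raise("sequence")
```

Checking the success flag before using the value matters because a failed lookup still returns something in the second slot.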
-
class
kaldi.fstext.enums.
MapType
¶ An enumeration.
-
ARC_SUM_MAPPER
= 0¶
-
ARC_UNIQUE_MAPPER
= 1¶
-
IDENTITY_MAPPER
= 2¶
-
INPUT_EPSILON_MAPPER
= 3¶
-
INVERT_MAPPER
= 4¶
-
OUTPUT_EPSILON_MAPPER
= 5¶
-
PLUS_MAPPER
= 6¶
-
POWER_MAPPER
= 7¶
-
QUANTIZE_MAPPER
= 8¶
-
RMWEIGHT_MAPPER
= 9¶
-
SUPERFINAL_MAPPER
= 10¶
-
TIMES_MAPPER
= 11¶
-
TO_LOG64_MAPPER
= 13¶
-
TO_LOG_MAPPER
= 12¶
-
TO_STD_MAPPER
= 14¶
-
-
class
kaldi.fstext.enums.
MatchType
¶ An enumeration.
-
MATCH_BOTH
= 3¶
-
MATCH_INPUT
= 1¶
-
MATCH_NONE
= 4¶
-
MATCH_OUTPUT
= 2¶
-
MATCH_UNKNOWN
= 5¶
-
-
class
kaldi.fstext.enums.
QueueType
¶ An enumeration.
-
AUTO_QUEUE
= 7¶
-
FIFO_QUEUE
= 1¶
-
LIFO_QUEUE
= 2¶
-
OTHER_QUEUE
= 8¶
-
SCC_QUEUE
= 6¶
-
SHORTEST_FIRST_QUEUE
= 3¶
-
STATE_ORDER_QUEUE
= 5¶
-
TOP_ORDER_QUEUE
= 4¶
-
TRIVIAL_QUEUE
= 0¶
-
-
class
kaldi.fstext.enums.
RandArcSelection
¶ An enumeration.
-
FAST_LOG_PROB_ARC_SELECTOR
= 2¶
-
LOG_PROB_ARC_SELECTOR
= 1¶
-
UNIFORM_ARC_SELECTOR
= 0¶
-
kaldi.fstext.properties¶
FST Properties.
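The basic constants in this module are single-bit flags, so a set of properties is one integer that can be built with `|` and tested with `&`. Most properties come in positive/negative pairs (for example `ACCEPTOR` and `NOT_ACCEPTOR`); when neither bit is set, the property is simply unknown. The sketch below copies a few values from the listing rather than importing the module, and `has_property`, `tri_state` and `is_single_bit` are hypothetical helper names.

```python
# Values copied from the constants listed in this module.
ACCEPTOR = 65536
NOT_ACCEPTOR = 131072
I_LABEL_SORTED = 268435456
NOT_I_LABEL_SORTED = 536870912
WEIGHTED = 4294967296
UNWEIGHTED = 8589934592


def has_property(props, flag):
    """True if every bit of flag is set in props."""
    return (props & flag) == flag


def tri_state(props, positive, negative):
    """Returns True, False, or None when the property is unknown."""
    if has_property(props, positive):
        return True
    if has_property(props, negative):
        return False
    return None


def is_single_bit(flag):
    """True if flag is a power of two, i.e. exactly one bit is set."""
    return flag > 0 and (flag & (flag - 1)) == 0


props = ACCEPTOR | I_LABEL_SORTED | UNWEIGHTED
weighted = tri_state(props, WEIGHTED, UNWEIGHTED)
```

The compound constants further down the listing (the ones ending in `_PROPERTIES`) are masks that combine many of these bits and are used in the same way.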
-
kaldi.fstext.properties.
EXPANDED
= 1¶
-
kaldi.fstext.properties.
MUTABLE
= 2¶
-
kaldi.fstext.properties.
ERROR
= 4¶
-
kaldi.fstext.properties.
ACCEPTOR
= 65536¶
-
kaldi.fstext.properties.
NOT_ACCEPTOR
= 131072¶
-
kaldi.fstext.properties.
I_DETERMINISTIC
= 262144¶
-
kaldi.fstext.properties.
NON_I_DETERMINISTIC
= 524288¶
-
kaldi.fstext.properties.
O_DETERMINISTIC
= 1048576¶
-
kaldi.fstext.properties.
NON_O_DETERMINISTIC
= 2097152¶
-
kaldi.fstext.properties.
EPSILONS
= 4194304¶
-
kaldi.fstext.properties.
NO_EPSILONS
= 8388608¶
-
kaldi.fstext.properties.
I_EPSILONS
= 16777216¶
-
kaldi.fstext.properties.
NO_I_EPSILONS
= 33554432¶
-
kaldi.fstext.properties.
O_EPSILONS
= 67108864¶
-
kaldi.fstext.properties.
NO_O_EPSILONS
= 134217728¶
-
kaldi.fstext.properties.
I_LABEL_SORTED
= 268435456¶
-
kaldi.fstext.properties.
NOT_I_LABEL_SORTED
= 536870912¶
-
kaldi.fstext.properties.
O_LABEL_SORTED
= 1073741824¶
-
kaldi.fstext.properties.
NOT_O_LABEL_SORTED
= 2147483648¶
-
kaldi.fstext.properties.
WEIGHTED
= 4294967296¶
-
kaldi.fstext.properties.
UNWEIGHTED
= 8589934592¶
-
kaldi.fstext.properties.
CYCLIC
= 17179869184¶
-
kaldi.fstext.properties.
ACYCLIC
= 34359738368¶
-
kaldi.fstext.properties.
INITIAL_CYCLIC
= 68719476736¶
-
kaldi.fstext.properties.
INITIAL_ACYCLIC
= 137438953472¶
-
kaldi.fstext.properties.
TOP_SORTED
= 274877906944¶
-
kaldi.fstext.properties.
NOT_TOP_SORTED
= 549755813888¶
-
kaldi.fstext.properties.
ACCESSIBLE
= 1099511627776¶
-
kaldi.fstext.properties.
NOT_ACCESSIBLE
= 2199023255552¶
-
kaldi.fstext.properties.
COACCESSIBLE
= 4398046511104¶
-
kaldi.fstext.properties.
NOT_COACCESSIBLE
= 8796093022208¶
-
kaldi.fstext.properties.
STRING
= 17592186044416¶
-
kaldi.fstext.properties.
NOT_STRING
= 35184372088832¶
-
kaldi.fstext.properties.
WEIGHTED_CYCLES
= 70368744177664¶
-
kaldi.fstext.properties.
UNWEIGHTED_CYCLES
= 140737488355328¶
-
kaldi.fstext.properties.
NULL_PROPERTIES
= 164284018786304¶
-
kaldi.fstext.properties.
COPY_PROPERTIES
= 281474976645124¶
-
kaldi.fstext.properties.
INTRINSIC_PROPERTIES
= 281474976645123¶
-
kaldi.fstext.properties.
EXTRINSIC_PROPERTIES
= 4¶
-
kaldi.fstext.properties.
SET_START_PROPERTIES
= 225193725198343¶
-
kaldi.fstext.properties.
SET_FINAL_PROPERTIES
= 215491394076679¶
-
kaldi.fstext.properties.
ADD_STATE_PROPERTIES
= 258385232461831¶
-
kaldi.fstext.properties.
ADD_ARC_PROPERTIES
= 76509027631111¶
-
kaldi.fstext.properties.
SET_ARC_PROPERTIES
= 7¶
-
kaldi.fstext.properties.
DELETE_STATE_PROPERTIES
= 141194274603015¶
-
kaldi.fstext.properties.
DELETE_ARC_PROPERTIES
= 152189390880775¶
-
kaldi.fstext.properties.
STATE_SORT_PROPERTIES
= 227873784791047¶
-
kaldi.fstext.properties.
ARC_SORT_PROPERTIES
= 281470950113287¶
-
kaldi.fstext.properties.
I_LABEL_INVARIANT_PROPERTIES
= 281474107441159¶
-
kaldi.fstext.properties.
O_LABEL_INVARIANT_PROPERTIES
= 281471538167815¶
-
kaldi.fstext.properties.
WEIGHT_INVARIANT_PROPERTIES
= 70355859210247¶
-
kaldi.fstext.properties.
ADD_SUPERFINAL_PROPERTIES
= 262506881417223¶
-
kaldi.fstext.properties.
RM_SUPERFINAL_PROPERTIES
= 243539050430471¶
-
kaldi.fstext.properties.
BINARY_PROPERTIES
= 7¶
-
kaldi.fstext.properties.
TRINARY_PROPERTIES
= 281474976645120¶
-
kaldi.fstext.properties.
POS_TRINARY_PROPERTIES
= 93824992215040¶
-
kaldi.fstext.properties.
NEG_TRINARY_PROPERTIES
= 187649984430080¶
-
kaldi.fstext.properties.
FST_PROPERTIES
= 281474976645127¶
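The property constants above are bit flags, so they are usually combined with `|` and tested with `&`. The following is a minimal pure-Python sketch that copies a few values from the listing; the helper `has_all` and the sample `props` word are hypothetical and not part of PyKaldi.

```python
# Values copied from the kaldi.fstext.properties listing above.
ACCEPTOR = 65536
NOT_ACCEPTOR = 131072
ACYCLIC = 34359738368
I_LABEL_SORTED = 268435456
BINARY_PROPERTIES = 7
POS_TRINARY_PROPERTIES = 93824992215040
NEG_TRINARY_PROPERTIES = 187649984430080
TRINARY_PROPERTIES = 281474976645120
FST_PROPERTIES = 281474976645127


def has_all(props: int, mask: int) -> bool:
    """Hypothetical helper: True if every bit of `mask` is set in `props`."""
    return (props & mask) == mask


# A made-up property word describing an acyclic acceptor.
props = ACCEPTOR | ACYCLIC

print(has_all(props, ACCEPTOR | ACYCLIC))  # True
print(has_all(props, I_LABEL_SORTED))      # False

# Each "negative" flag sits one bit above its "positive" partner.
print(NOT_ACCEPTOR == ACCEPTOR << 1)       # True

# The composite masks are unions of the simpler ones.
print((POS_TRINARY_PROPERTIES | NEG_TRINARY_PROPERTIES) == TRINARY_PROPERTIES)  # True
print((BINARY_PROPERTIES | TRINARY_PROPERTIES) == FST_PROPERTIES)               # True
```

A trinary property is known to hold when its positive bit is set, known not to hold when its negative bit is set, and unknown when neither is set.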
kaldi.fstext.special¶
Functions
add_subsequential_loop |
Adds a subsequential symbol loop to the input FST. |
compose_context |
Creates a context FST and composes it on the left with input fst. |
compose_context_left_biphone |
Creates a context FST and composes it on the left with input fst. |
compose_deterministic_on_demand_fst |
Composes an FST with a deterministic on demand FST. |
create_ilabel_info_symbol_table |
Creates a symbol table from the ilabel info and phones symbol table. |
determinize_lattice |
Determinizes lattice. |
determinize_star |
Implements a special determinization with epsilon removal. |
determinize_star_in_log |
Performs determinize_star in place in log semiring. |
get_encoding_multiple |
Returns the smallest multiple of 1000 > nonterm_phones_offset. |
push_in_log |
Push weights/labels in log semiring. |
push_special |
Pushes weights in log semiring in a special way. |
read_ilabel_info |
Reads ilabel info from input stream. |
remove_eps_local |
Removes epsilon arcs locally. |
table_compose |
Performs table composition. |
table_compose_cache |
Performs cached table composition. |
table_compose_cache_lattice |
Performs cached table composition on lattices. |
table_compose_lattice |
Performs table composition on lattices. |
write_ilabel_info |
Writes ilabel info to output stream. |
Classes
LatticeTableComposeCache |
Cache for table compose. |
NonterminalValues |
An enumeration. |
ScaleDeterministicOnDemandFst |
A DeterministicOnDemandFst scaling the weights of another. |
StdBackoffDeterministicOnDemandFst |
Deterministic on demand backoff language model. |
StdCacheDeterministicOnDemandFst |
A DeterministicOnDemandFst caching the arcs of another. |
StdComposeDeterministicOnDemandFst |
A DeterministicOnDemandFst implementing the composition of others. |
StdDeterministicOnDemandFst |
Base class for deterministic on demand FSTs over the tropical semiring. |
StdInverseContextFst |
Inverse of the context FST “C” in “HCLG” over the tropical semiring. |
StdInverseLeftBiphoneContextFst |
Inverse of the left-biphone context FST “C” over the tropical semiring. |
StdTableComposeCache |
Cache for table compose. |
StdUnweightedNgramFst |
A DeterministicOnDemandFst in which states encode an n-gram history. |
TableComposeOptions |
Options for table composition. |
TableMatcherOptions |
Options for table matcher. |
-
class
kaldi.fstext.special.
LatticeTableComposeCache
¶ Cache for table compose.
Used for doing multiple compositions while caching the same matcher.
This version is for composing FSTs over lattice semiring.
-
from_compose_opts
(opts:TableComposeOptions=default) → LatticeTableComposeCache¶ Creates a new
LatticeTableComposeCache
instance.
-
opts
¶ Table compose options.
-
-
class
kaldi.fstext.special.
NonterminalValues
¶ An enumeration.
-
kNontermBegin
= 1¶
-
kNontermBigNumber
= 10000000¶
-
kNontermBos
= 0¶
-
kNontermEnd
= 2¶
-
kNontermMediumNumber
= 1000¶
-
kNontermReenter
= 3¶
-
kNontermUserDefined
= 4¶
-
-
class
kaldi.fstext.special.
ScaleDeterministicOnDemandFst
¶ A DeterministicOnDemandFst scaling the weights of another.
For instance, to subtract existing LM scores from a lattice you could use this with a negative weight; and to interpolate LMs you can also use this with weights less than one.
Parameters: - scale (float) – The scaling factor.
- det_fst (StdDeterministicOnDemandFst) – The input deterministic on demand FST.
-
final
(state:int) → TropicalWeight¶ Returns the final weight of the given state.
-
get_arc
(s:int, ilabel:int) -> (success:bool, oarc:StdArc)¶ Creates an on demand arc and returns it.
Returns: The created arc.
-
start
() → int¶ Returns the start state index.
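To make the `start` / `final` / `get_arc` interface concrete, here is a pure-Python sketch of a weight-scaling wrapper. It does not use PyKaldi: `Arc`, `ToyDeterministicFst` and `ToyScale` are hypothetical stand-ins, and multiplying costs by the scale is an assumption based on the description above.

```python
import math
from typing import Dict, NamedTuple, Optional, Tuple


class Arc(NamedTuple):
    ilabel: int
    olabel: int
    weight: float  # tropical cost (negative log probability)
    nextstate: int


class ToyDeterministicFst:
    """Hypothetical stand-in with at most one arc per (state, ilabel)."""

    def __init__(self, start: int, arcs: Dict[Tuple[int, int], Arc],
                 finals: Dict[int, float]):
        self._start, self._arcs, self._finals = start, arcs, finals

    def start(self) -> int:
        return self._start

    def final(self, state: int) -> float:
        # Non-final states get the tropical zero, i.e. infinite cost.
        return self._finals.get(state, math.inf)

    def get_arc(self, s: int, ilabel: int) -> Tuple[bool, Optional[Arc]]:
        arc = self._arcs.get((s, ilabel))
        return arc is not None, arc


class ToyScale:
    """Wraps another toy FST and multiplies every cost by `scale`."""

    def __init__(self, scale: float, det_fst: ToyDeterministicFst):
        self._scale, self._fst = scale, det_fst

    def start(self) -> int:
        return self._fst.start()

    def final(self, state: int) -> float:
        return self._scale * self._fst.final(state)

    def get_arc(self, s: int, ilabel: int) -> Tuple[bool, Optional[Arc]]:
        ok, arc = self._fst.get_arc(s, ilabel)
        if not ok:
            return False, None
        return True, arc._replace(weight=self._scale * arc.weight)


lm = ToyDeterministicFst(0, {(0, 1): Arc(1, 1, 2.0, 1)}, {1: 0.5})
minus_lm = ToyScale(-1.0, lm)  # negative scale: subtracts the LM cost
ok, arc = minus_lm.get_arc(0, 1)
print(ok, arc.weight, minus_lm.final(1))  # True -2.0 -0.5
```

Composing a lattice with something like `minus_lm` would cancel the scores contributed by `lm`, which is the "subtract existing LM scores" use case mentioned above.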
-
class
kaldi.fstext.special.
StdBackoffDeterministicOnDemandFst
¶ Deterministic on demand backoff language model.
This class wraps a conventional Fst, representing a language model, with a “DeterministicOnDemandFst” interface. Backoff arcs in the language model should carry the epsilon label (label 0), and there should be no other epsilons in the language model. The backoff (i.e. epsilon) arcs are followed if a particular arc (or a final-prob) is not found at the current state.
Parameters: fst (StdFst) – Input language model FST. -
final
(state:int) → TropicalWeight¶ Returns the final weight of the given state.
-
get_arc
(s:int, ilabel:int) -> (success:bool, oarc:StdArc)¶ Creates an on demand arc and returns it.
Returns: The created arc.
-
start
() → int¶ Returns the start state index.
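The backoff rule described above can be sketched in pure Python as follows. This is not PyKaldi code: the `ARCS` table is a made-up two-state model, and accumulating the backoff costs by addition (the tropical "times") is an assumption of this sketch.

```python
from typing import Dict, Optional, Tuple

EPSILON = 0  # label carried by backoff arcs

# state -> {ilabel: (cost, next_state)}; a made-up two-state backoff model.
ARCS: Dict[int, Dict[int, Tuple[float, int]]] = {
    0: {3: (2.0, 0)},                     # unigram state
    1: {2: (1.0, 1), EPSILON: (0.5, 0)},  # history state with a backoff arc
}


def get_arc(s: int, ilabel: int) -> Tuple[bool, Optional[Tuple[float, int]]]:
    """Follows backoff arcs from state `s` until an arc with `ilabel` is found."""
    if ilabel == EPSILON:
        raise ValueError("epsilon cannot be requested directly")
    backoff_cost = 0.0
    while True:
        out = ARCS.get(s, {})
        if ilabel in out:
            cost, nextstate = out[ilabel]
            return True, (backoff_cost + cost, nextstate)
        if EPSILON not in out:
            return False, None  # nowhere left to back off to
        eps_cost, s = out[EPSILON]
        backoff_cost += eps_cost


print(get_arc(1, 2))  # (True, (1.0, 1)): matched directly
print(get_arc(1, 3))  # (True, (2.5, 0)): backoff cost 0.5 plus unigram cost 2.0
print(get_arc(1, 9))  # (False, None): label unknown even after backing off
```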
-
-
class
kaldi.fstext.special.
StdCacheDeterministicOnDemandFst
¶ A DeterministicOnDemandFst caching the arcs of another.
Parameters: - fst (StdDeterministicOnDemandFst) – The input deterministic on demand FST.
- num_cached_arcs (int) – Number of arcs to keep in the cache.
-
final
(state:int) → TropicalWeight¶ Returns the final weight of the given state.
-
get_arc
(s:int, ilabel:int) -> (success:bool, oarc:StdArc)¶ Creates an on demand arc and returns it.
Returns: The created arc.
-
start
() → int¶ Returns the start state index.
-
class
kaldi.fstext.special.
StdComposeDeterministicOnDemandFst
¶ A DeterministicOnDemandFst implementing the composition of others.
Parameters: - fst1 (StdDeterministicOnDemandFst) – The first deterministic on demand FST.
- fst2 (StdDeterministicOnDemandFst) – The second deterministic on demand FST.
-
final
(state:int) → TropicalWeight¶ Returns the final weight of the given state.
-
get_arc
(s:int, ilabel:int) -> (success:bool, oarc:StdArc)¶ Creates an on demand arc and returns it.
Returns: The created arc.
-
start
() → int¶ Returns the start state index.
-
class
kaldi.fstext.special.
StdDeterministicOnDemandFst
¶ Base class for deterministic on demand FSTs over the tropical semiring.
-
final
(state:int) → TropicalWeight¶ Returns the final weight of the given state.
-
get_arc
(s:int, ilabel:int) -> (success:bool, oarc:StdArc)¶ Creates an on demand arc and returns it.
Returns: The created arc.
-
start
() → int¶ Returns the start state index.
-
-
class
kaldi.fstext.special.
StdInverseContextFst
¶ Inverse of the context FST “C” in “HCLG” over the tropical semiring.
InverseContextFst represents the inverse of the context FST “C” (the “C” in “HCLG”) which transduces from symbols representing phone context windows (e.g. “a, b, c”) to individual phones, e.g. “a”. So InverseContextFst transduces from phones to symbols representing phone context windows. The point is that the inverse is deterministic, so the DeterministicOnDemandFst interface is applicable, which turns out to be a convenient way to implement this.
This doesn’t implement the full Fst interface; it implements the DeterministicOnDemandFst interface, which is much simpler and is sufficient for what we need to do with this.
Search for “hbka.pdf” (“Speech Recognition with Weighted Finite State Transducers”) by M. Mohri, for more context.
Parameters: - subsequential_symbol (int) – Integer index of the subsequential symbol.
- phones (List[int]) – Integer indices for the phones.
- disambig_syms (List[int]) – Integer indices for disambiguation symbols.
- context_width (int) – Size of context window.
- central_position (int) – Position of central phone in context window, from 0..N-1.
-
final
(state:int) → TropicalWeight¶ Returns the final weight of the given state.
-
get_arc
(s:int, ilabel:int) -> (success:bool, oarc:StdArc)¶ Creates an on demand arc and returns it.
Returns: The created arc.
-
ilabel_info
() → list<list<int>>¶ Returns input label info.
-
start
() → int¶ Returns the start state index.
-
class
kaldi.fstext.special.
StdInverseLeftBiphoneContextFst
¶ Inverse of the left-biphone context FST “C” over the tropical semiring.
This does not take the arguments ‘context_width’ or ‘central_position’ because they are assumed to be (2, 1) meaning a system with left-biphone context; and there is no subsequential symbol because it is not needed in systems without right context.
-
final
(state:int) → TropicalWeight¶ Returns the final weight of the given state.
-
get_arc
(s:int, ilabel:int) -> (success:bool, oarc:StdArc)¶ Creates an on demand arc and returns it.
Returns: The created arc.
-
ilabel_info
() → list<list<int>>¶ Returns input label info.
-
start
() → int¶ Returns the start state index.
-
-
class
kaldi.fstext.special.
StdTableComposeCache
¶ Cache for table compose.
Used for doing multiple compositions while caching the same matcher.
-
from_compose_opts
(opts:TableComposeOptions=default) → StdTableComposeCache¶ Creates a new
StdTableComposeCache
instance.
-
opts
¶ Table compose options.
-
-
class
kaldi.fstext.special.
StdUnweightedNgramFst
¶ A DeterministicOnDemandFst in which states encode an n-gram history.
Conceptually, for n-gram order n and k labels, the FST is an unweighted acceptor with about k^(n-1) states (ignoring end effects). However, the FST is created on demand and doesn’t need the label vocabulary; get_arc matches on any input label. This class is primarily used by
compose_deterministic_on_demand_fst
to expand the n-gram history of lattices.
Parameters: n (int) – N-gram order.
-
final
(state:int) → TropicalWeight¶ Returns the final weight of the given state.
-
get_arc
(s:int, ilabel:int) -> (success:bool, oarc:StdArc)¶ Creates an on demand arc and returns it.
Returns: The created arc.
-
start
() → int¶ Returns the start state index.
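The idea that states encode an n-gram history and are created on demand can be sketched without PyKaldi as below. `ToyNgramHistoryFst` is a hypothetical class; the choices of an empty start history, `olabel == ilabel` and zero cost are assumptions made for illustration.

```python
from typing import Dict, List, Tuple


class ToyNgramHistoryFst:
    """Hypothetical on-demand acceptor whose states are label histories."""

    def __init__(self, n: int):
        if n < 1:
            raise ValueError("n-gram order must be at least 1")
        self._keep = n - 1  # number of labels remembered per state
        self._history_of: List[Tuple[int, ...]] = [()]
        self._state_of: Dict[Tuple[int, ...], int] = {(): 0}

    def start(self) -> int:
        return 0

    def final(self, state: int) -> float:
        return 0.0  # unweighted: every state is final with zero cost

    def get_arc(self, s: int, ilabel: int) -> Tuple[bool, Tuple[int, int, float, int]]:
        # Append the label, then keep only the last n-1 labels as the new history.
        history = self._history_of[s] + (ilabel,)
        history = history[-self._keep:] if self._keep > 0 else ()
        if history not in self._state_of:
            self._state_of[history] = len(self._history_of)
            self._history_of.append(history)
        return True, (ilabel, ilabel, 0.0, self._state_of[history])


fst = ToyNgramHistoryFst(3)
_, arc_a = fst.get_arc(fst.start(), 5)  # history (5,)
_, arc_b = fst.get_arc(arc_a[3], 7)     # history (5, 7)
_, arc_c = fst.get_arc(arc_b[3], 9)     # history (7, 9): the oldest label is dropped
print(arc_a[3], arc_b[3], arc_c[3])     # 1 2 3
```

Because any label is accepted, no vocabulary is needed up front; only histories that are actually visited get a state number.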
-
-
class
kaldi.fstext.special.
TableComposeOptions
¶ Options for table composition.
-
connect
¶ Connect output
-
filter_type
¶ Which pre-defined filter to use.
-
from_matcher_opts
(mo:TableMatcherOptions, connect:bool=default, filter_type:ComposeFilter=default, table_match_type:MatchType=default) → TableComposeOptions¶ Creates a new
TableComposeOptions
instance.
-
min_table_size
¶ Minimum table size.
-
table_match_type
¶ Type of table match.
-
table_ratio
¶ Construct the table if it would be at least this full.
-
-
class
kaldi.fstext.special.
TableMatcherOptions
¶ Options for table matcher.
Table matcher is a matcher specialized for the case where the output side of the left FST always has either all-epsilons coming out of a state, or a majority of the symbol table. Therefore we can either store nothing (for the all-epsilon case) or store a lookup table from labels to arc offsets. Since the table matcher has to iterate over all arcs in each left-hand state the first time it sees it, this matcher type is not efficient if you compose with something very small on the right – unless you do it multiple times and keep the matcher around.
The table matcher class is not exposed to Python code directly. Instances of
TableMatcherOptions
can be passed to table_compose()
and TableComposeCache
to control the table matcher behavior.
-
min_table_size
¶ Minimum table size.
-
table_ratio
¶ Construct the table if it would be at least this full.
-
-
kaldi.fstext.special.
add_subsequential_loop
(subseq_symbol:int, fst:StdMutableFst)¶ Adds a subsequential symbol loop to the input FST.
Modifies the FST so that it transduces the same paths, but the input side of the paths can all have the subsequential symbol ‘$’ appended to them any number of times.
Parameters: - subseq_symbol (int) – Integer index for the subsequential symbol.
- fst (StdFst) – Input FST.
-
kaldi.fstext.special.
compose_context
(disambig_syms, N, P, ifst)[source]¶ Creates a context FST and composes it on the left with input fst.
Outputs the label information along with the composed FST. Input FST should be mutable since the algorithm adds the subsequential loop to it.
Returns: Output FST, label information tuple.
Return type: Tuple[StdVectorFst, List[List[int]]]
-
kaldi.fstext.special.
compose_context_left_biphone
(nonterm_phones_offset:int, disambig_syms:list<int>, ifst:StdVectorFst, ofst:StdVectorFst) → list<list<int>>¶ Creates a context FST and composes it on the left with input fst.
This is a variant of the function compose_context() that is intended for use with the “grammar FST” framework. It does not take the ‘context_width’ and ‘central_position’ arguments because they are assumed to be 2 and 1 respectively (meaning, left-biphone phonetic context).
Parameters: - nonterm_phones_offset (int) – The integer index of the first non-terminal symbol.
- disambig_syms (List[int]) – Disambiguation symbols.
- ifst (StdVectorFst) – Input FST.
- ofst (StdVectorFst) – Output FST.
Returns: Label information.
Return type: List[List[int]]
-
kaldi.fstext.special.
compose_deterministic_on_demand_fst
(fst1, fst2, inverse=False)[source]¶ Composes an FST with a deterministic on demand FST.
If inverse is True, computes
ofst = Compose(Inverse(fst2), fst1)
. Note that the arguments are reversed in this case.
This function does not trim its output.
Parameters: - fst1 (StdFst) – The input FST.
- fst2 (StdDeterministicOnDemandFst) – The input deterministic on demand FST.
- inverse (bool) – Deterministic FST on the left?
Returns: A composed FST.
-
kaldi.fstext.special.
create_ilabel_info_symbol_table
(info:list<list<int>>, phones_symtab:SymbolTable, separator:str, disambig_prefix:str) → SymbolTable¶ Creates a symbol table from the ilabel info and phones symbol table.
This is mainly used for debugging.
-
kaldi.fstext.special.
determinize_lattice
(ifst, compact_output=True, delta=0.0009765625, max_mem=-1, max_loop=-1)[source]¶ Determinizes lattice.
Implements a special form of determinization with epsilon removal, optimized for a phase of lattice generation.
See kaldi/src/fstext/determinize-lattice.h for details.
Parameters: - ifst (LatticeFst) – Input lattice.
- compact_output (bool) – Whether the output is a compact lattice.
- delta (float) – Comparison/quantization delta.
- max_mem (int) – If positive, determinization will fail when the algorithm’s (approximate) memory consumption crosses this threshold.
- max_loop (int) – If positive, can be used to detect non-determinizable input (a case that wouldn’t be caught by max_mem).
Returns: A determinized lattice.
Raises: RuntimeError
– If determinization fails.
-
kaldi.fstext.special.
determinize_star
(ifst, delta=0.0009765625, max_states=-1, allow_partial=False)[source]¶ Implements a special determinization with epsilon removal.
See kaldi/src/fstext/determinize-star.h for details.
Parameters: - ifst (StdFst) – Input fst over the tropical semiring.
- delta (float) – Comparison/quantization delta.
- max_states (int) – If positive, determinization will fail when max states is reached.
- allow_partial (bool) – If True, the algorithm will output partial results when the specified max states is reached (when larger than zero), instead of raising an exception.
Returns: A determinized FST.
Raises: RuntimeError
– If determinization fails.
-
kaldi.fstext.special.
determinize_star_in_log
(fst:StdVectorFst, delta:float=default, max_states:int=default)¶ Performs determinize_star in place in log semiring.
Raises: RuntimeError
– If determinization fails.
See Also:
determinize_star()
-
kaldi.fstext.special.
get_encoding_multiple
(nonterm_phones_offset:int) → int¶ Returns the smallest multiple of 1000 > nonterm_phones_offset.
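As a rough illustration of the one-line description above, the following pure-Python sketch computes the smallest multiple of 1000 that is strictly greater than the offset. It mirrors only the documented sentence, not the C++ implementation; the function name and the strict reading of ">" at exact multiples are assumptions.

```python
K_NONTERM_MEDIUM_NUMBER = 1000  # value of NonterminalValues.kNontermMediumNumber


def smallest_multiple_above(nonterm_phones_offset: int) -> int:
    """Hypothetical helper: smallest multiple of 1000 strictly above the offset."""
    return K_NONTERM_MEDIUM_NUMBER * (
        nonterm_phones_offset // K_NONTERM_MEDIUM_NUMBER + 1
    )


print(smallest_multiple_above(347))   # 1000
print(smallest_multiple_above(1500))  # 2000
```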
-
kaldi.fstext.special.
push_in_log
(ifst, push_weights=False, push_labels=False, remove_common_affix=False, remove_total_weight=False, to_final=False, delta=0.0009765625)[source]¶ Push weights/labels in log semiring.
Destructively pushes weights/labels towards initial or final states.
Parameters: - ifst (StdVectorFst) – Input fst over the tropical semiring.
- push_weights – Should weights be pushed?
- push_labels – Should labels be pushed?
- remove_common_affix – If pushing labels, should common prefix/suffix be removed?
- remove_total_weight – If pushing weights, should total weight be removed?
- to_final – Push towards final states?
- delta – Comparison/quantization delta (default: 0.0009765625).
-
kaldi.fstext.special.
push_special
(fst:StdVectorFst, delta:float=default)¶ Pushes weights in log semiring in a special way.
Destructively pushes weights in the log semiring such that any leftover weight after pushing gets distributed evenly along the FST, and doesn’t end up either at the start or at the end. Basically it pushes the weights such that the total weight of each state (i.e. the sum of the arc probabilities plus the final-prob) is the same for all states.
Parameters: - fst (StdFst) – Input fst over the tropical semiring.
- delta – Comparison/quantization delta (default: 0.0009765625).
-
kaldi.fstext.special.
read_ilabel_info
(is:istream, binary:bool) → list<list<int>>¶ Reads ilabel info from input stream.
-
kaldi.fstext.special.
remove_eps_local
(fst, special=False)[source]¶ Removes epsilon arcs locally.
Removes some (but not necessarily all) epsilons in an FST, using an algorithm that is guaranteed to never increase the number of arcs in the FST (and will also never increase the number of states).
See kaldi/src/fstext/remove-eps-local.h for details.
Parameters: - fst (StdVectorFst) – Input fst over the tropical semiring.
- special (bool) – Preserve stochasticity when casting to log semiring.
-
kaldi.fstext.special.
table_compose
(ifst1:StdFst, ifst2:StdFst, ofst:StdMutableFst, opts:TableComposeOptions=default)¶ Performs table composition.
-
kaldi.fstext.special.
table_compose_cache
(ifst1:StdFst, ifst2:StdFst, ofst:StdMutableFst, cache:StdTableComposeCache)¶ Performs cached table composition.
-
kaldi.fstext.special.
table_compose_cache_lattice
(ifst1:LatticeFst, ifst2:LatticeFst, ofst:LatticeMutableFst, cache:LatticeTableComposeCache)¶ Performs cached table composition on lattices.
-
kaldi.fstext.special.
table_compose_lattice
(ifst1:LatticeFst, ifst2:LatticeFst, ofst:LatticeMutableFst, opts:TableComposeOptions=default)¶ Performs table composition on lattices.
-
kaldi.fstext.special.
write_ilabel_info
(os:ostream, binary:bool, info:list<list<int>>)¶ Writes ilabel info to output stream.
kaldi.fstext.utils¶
Functions
acoustic_lattice_scale |
Returns a 2x2 matrix for scaling acoustic cost in lattice weights. |
apply_probability_scale |
Applies a probability scale to the FST. |
cast_log_to_std |
Casts FST in log semiring to tropical semiring. |
cast_std_to_log |
Casts FST in tropical semiring to log semiring. |
clear_symbols |
Sets all input/output labels of the FST to zero. |
compact_lattice_has_alignment |
Checks if compact lattice has state-level alignments. |
convert_compact_lattice_to_lattice |
Converts compact lattice to lattice. |
convert_lattice_to_compact_lattice |
Converts lattice to compact lattice. |
convert_lattice_to_std |
Converts lattice to FST over tropical semiring. |
convert_nbest_to_list |
Converts n-best FST to a list of FSTs. |
convert_std_to_lattice |
Converts FST over tropical semiring to lattice. |
default_lattice_scale |
Returns a default 2x2 matrix for scaling lattice weights. |
equal_align |
Generates sequences from the input FST with exactly “length” symbols. |
following_input_symbols_are_same |
Checks if all arcs exiting any state have the same input symbol. |
get_input_symbols |
Gets input labels of the FST as a sorted unique list. |
get_linear_symbol_sequence |
Extracts linear symbol sequences from the input FST. |
get_output_symbols |
Gets output labels of the FST as a sorted unique list. |
get_symbols |
Gets labels in the symbol table as a sorted unique list. |
graph_lattice_scale |
Returns a 2x2 matrix for scaling graph cost in lattice weights. |
highest_numbered_input_symbol |
Returns the highest numbered input label of the FST (zero if FST is empty). |
highest_numbered_output_symbol |
Returns the highest numbered output label of the FST (zero if FST is empty). |
is_stochastic_fst |
Checks if FST is stochastic. |
is_stochastic_fst_in_log |
Checks if FST is stochastic in log semiring. |
lattice_scale |
Returns a 2x2 matrix for scaling graph and acoustic costs in lattice weights. |
make_following_input_symbols_same |
Ensures that all arcs exiting any state have the same input symbol. |
make_linear_acceptor |
Creates an unweighted linear acceptor from the label sequence. |
make_linear_acceptor_with_alternatives |
Creates an unweighted acceptor with a linear structure. |
make_preceding_input_symbols_same |
Ensures that all arcs entering any state have the same input symbol. |
map_input_symbols |
Maps input labels to labels given in the symbol map. |
minimize_encoded_std_fst |
Minimizes FST in place after encoding labels and weights. |
nbest_as_fsts |
Outputs (up to) n-best paths in the FST as a list of FSTs. |
phi_compose |
Performs composition by handling phi (failure) transitions. |
phi_compose_lattice |
Performs lattice composition by handling phi (failure) transitions. |
preceding_input_symbols_are_same |
Checks if all arcs entering any state have the same input symbol. |
propagate_final |
Propagates final-probs through “phi” transitions. |
remove_alignments_from_compact_lattice |
Removes state-level alignments in a compact lattice. |
remove_some_input_symbols |
Replaces given input labels with zeros. |
remove_useless_arcs |
Removes arcs that are not on best paths for any input symbol sequence. |
remove_weights |
Removes FST weights. |
rho_compose |
Performs composition by handling rho transitions. |
safe_determinize_minimize_wrapper |
Performs safe determinization and minimization. |
safe_determinize_minimize_wrapper_in_log |
Performs safe determinization and minimization in log semiring. |
safe_determinize_wrapper |
Performs safe determinization. |
scale_compact_lattice |
Scales the compact lattice weights. |
scale_lattice |
Scales the lattice weights. |
-
kaldi.fstext.utils.
acoustic_lattice_scale
(acwt:float) → list<list<float>>¶ Returns a 2x2 matrix for scaling acoustic cost in lattice weights.
-
kaldi.fstext.utils.
apply_probability_scale
(scale:float, fst:StdMutableFst)¶ Applies a probability scale to the FST.
This is applicable to FSTs in the log or tropical semiring. It multiplies the arc and final weights by
scale
[this is not the multiplication operation of the semiring, it’s actual multiplication, which is equivalent to taking a power in the semiring].
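The distinction between ordinary multiplication and the semiring product can be checked numerically. The sketch below uses plain floats as tropical/log costs (negative log probabilities); it does not call PyKaldi, and `scale_cost` is a hypothetical helper.

```python
import math


def scale_cost(cost: float, scale: float) -> float:
    """Ordinary multiplication of a cost, as a probability scale would apply it."""
    return scale * cost


p = 0.25
cost = -math.log(p)

# Scaling the cost by 0.5 corresponds to raising the probability to the power 0.5.
scaled = scale_cost(cost, 0.5)
print(math.isclose(scaled, -math.log(p ** 0.5)))  # True

# By contrast, the semiring "times" adds costs, i.e. multiplies probabilities.
semiring_times = cost + cost
print(math.isclose(semiring_times, -math.log(p * p)))  # True
```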
-
kaldi.fstext.utils.
cast_log_to_std
(ifst:LogVectorFst) → StdVectorFst¶ Casts FST in log semiring to tropical semiring.
-
kaldi.fstext.utils.
cast_std_to_log
(ifst:StdVectorFst) → LogVectorFst¶ Casts FST in tropical semiring to log semiring.
-
kaldi.fstext.utils.
clear_symbols
(clear_input:bool, clear_output:bool, fst:StdMutableFst)¶ Sets all input/output labels of the FST to zero.
Does not alter symbol tables.
-
kaldi.fstext.utils.
compact_lattice_has_alignment
(fst:CompactLatticeExpandedFst) → bool¶ Checks if compact lattice has state-level alignments.
-
kaldi.fstext.utils.
convert_compact_lattice_to_lattice
(ifst, invert=True)[source]¶ Converts compact lattice to lattice.
Parameters: - ifst (CompactLatticeFst) – The input compact lattice.
- invert (bool) – Invert input and output labels.
Returns: The output lattice.
Return type:
-
kaldi.fstext.utils.
convert_lattice_to_compact_lattice
(ifst, invert=True)[source]¶ Converts lattice to compact lattice.
Parameters: - ifst (LatticeFst) – The input lattice.
- invert (bool) – Invert input and output labels.
Returns: The output compact lattice.
Return type:
-
kaldi.fstext.utils.
convert_lattice_to_std
(ifst)[source]¶ Converts lattice to FST over tropical semiring.
Parameters: ifst (LatticeFst) – The input lattice. Returns: The output FST. Return type: StdVectorFst
-
kaldi.fstext.utils.
convert_nbest_to_list
(fst:StdFst) → list<StdVectorFst>¶ Converts n-best FST to a list of FSTs.
-
kaldi.fstext.utils.
convert_std_to_lattice
(ifst)[source]¶ Converts FST over tropical semiring to lattice.
Parameters: ifst (StdFst) – The input FST. Returns: The output lattice. Return type: LatticeVectorFst
-
kaldi.fstext.utils.
default_lattice_scale
() → list<list<float>>¶ Returns a default 2x2 matrix for scaling lattice weights.
-
kaldi.fstext.utils.
equal_align
(ifst:StdFst, length:int, rand_seed:int, ofst:StdMutableFst, num_retries:int=default) → bool¶ Generates sequences from the input FST with exactly “length” symbols.
This is similar to randgen, but it generates a sequence with exactly “length” input symbols. It returns
True
on success, False
on failure (failure is partly random but should never happen in practice for normal speech models). It generates a random path through the input FST, finds out which subset of the states it visits along the way have self-loops with input symbols on them, and outputs a path with exactly enough self-loops to have the requested number of input symbols. Note that EqualAlign does not use the probabilities on the FST. It just uses equal probabilities in the first stage of selection (since the output will anyway not be a truly random sample from the FST). The input FST “ifst” must be connected or this may enter an infinite loop.
-
kaldi.fstext.utils.
following_input_symbols_are_same
(end_is_epsilon:bool, fst:StdFst) → bool¶ Checks if all arcs exiting any state have the same input symbol.
Returns true if and only if the FST is such that the input symbols on arcs exiting any given state all have the same value. If
end_is_epsilon == True
, treats final-states as epsilon output arcs [i.e. ensures only epsilons can exit final-states].
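The check described above can be sketched on a plain list of arcs. This is a pure-Python illustration rather than the PyKaldi function: the arc tuples, the `finals` set and the function name are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

Arc = Tuple[int, int, int, int]  # (source, ilabel, olabel, destination)


def following_input_symbols_same(arcs: Iterable[Arc], finals: Set[int],
                                 end_is_epsilon: bool) -> bool:
    """True if the arcs leaving each state all share one input label."""
    seen: Dict[int, Set[int]] = defaultdict(set)
    for src, ilabel, _olabel, _dst in arcs:
        seen[src].add(ilabel)
    if end_is_epsilon:
        # Being final counts as having an outgoing epsilon (label 0) arc.
        for state in finals:
            seen[state].add(0)
    return all(len(labels) <= 1 for labels in seen.values())


arcs = [(0, 1, 5, 1), (0, 1, 6, 1), (1, 2, 7, 1)]
print(following_input_symbols_same(arcs, {1}, end_is_epsilon=False))  # True
print(following_input_symbols_same(arcs, {1}, end_is_epsilon=True))   # False
```

In the second call state 1 is final and also has an outgoing arc with input label 2, so it mixes epsilon with a non-epsilon label.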
-
kaldi.fstext.utils.
get_input_symbols
(fst:StdFst, include_eps:bool) → list<int>¶ Gets input labels of the FST as a sorted unique list.
-
kaldi.fstext.utils.
get_linear_symbol_sequence
(fst)[source]¶ Extracts linear symbol sequences from the input FST.
Parameters: fst – The input FST. Returns: The tuple (isymbols, osymbols, total_weight).
-
kaldi.fstext.utils.
get_output_symbols
(fst:StdFst, include_eps:bool) → list<int>¶ Gets output labels of the FST as a sorted unique list.
-
kaldi.fstext.utils.
get_symbols
(symtab:SymbolTable, include_eps:bool) → list<int>¶ Gets labels in the symbol table as a sorted unique list.
-
kaldi.fstext.utils.
graph_lattice_scale
(lmwt:float) → list<list<float>>¶ Returns a 2x2 matrix for scaling graph cost in lattice weights.
-
kaldi.fstext.utils.
highest_numbered_input_symbol
(fst:StdFst) → int¶ Returns the highest numbered input label of the FST (zero if FST is empty).
-
kaldi.fstext.utils.
highest_numbered_output_symbol
(fst:StdFst) → int¶ Returns the highest numbered output label of the FST (zero if FST is empty).
-
kaldi.fstext.utils.
is_stochastic_fst
(fst:StdFst, delta:float=default, min_sum:TropicalWeight=default, max_sum:TropicalWeight=default) → bool¶ Checks if FST is stochastic.
This function returns true if, in the semiring of the FST, the sum (within the semiring) of all the arcs out of each state in the FST is one, to within delta.
Parameters: - fst – The FST that we are testing.
- delta – The tolerance to within which we test equality to 1.
- min_sum – If provided, it will be set to the minimum sum of weights.
- max_sum – If provided, it will be set to the maximum sum of weights.
Returns: True if the FST is stochastic, and False otherwise.
-
kaldi.fstext.utils.
is_stochastic_fst_in_log
(fst:StdFst, delta:float=default, min_sum:TropicalWeight=default, max_sum:TropicalWeight=default) → bool¶ Checks if FST is stochastic in log semiring.
This function returns true if, in the log semiring, the sum of all the arcs out of each state in the FST is one, to within delta.
Parameters: - fst – The FST that we are testing.
- delta – The tolerance to within which we test equality to 1.
- min_sum – If provided, it will be set to the minimum sum of weights.
- max_sum – If provided, it will be set to the maximum sum of weights.
Returns: True if the FST is stochastic, and False otherwise.
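A stochasticity test in the log semiring amounts to a log-sum-exp over the costs leaving each state. The pure-Python sketch below assumes costs are negative log probabilities and counts the final weight as one more outgoing term; both the data layout and that treatment of final weights are assumptions of this sketch, not taken from the PyKaldi API.

```python
import math
from typing import Dict, List, Tuple


def log_add_costs(costs: List[float]) -> float:
    """Log-semiring sum: -log(sum(exp(-c))) computed stably."""
    finite = [c for c in costs if c != math.inf]
    if not finite:
        return math.inf  # semiring zero
    m = min(finite)
    return m - math.log(sum(math.exp(m - c) for c in finite))


def is_stochastic_in_log(states: Dict[int, Tuple[float, List[float]]],
                         delta: float = 2.0 ** -10) -> bool:
    """`states` maps state -> (final cost, list of outgoing arc costs)."""
    for final_cost, arc_costs in states.values():
        total = log_add_costs([final_cost] + arc_costs)
        if abs(total) > delta:  # semiring one is cost 0.0
            return False
    return True


good = {
    0: (math.inf, [-math.log(0.5), -math.log(0.5)]),
    1: (-math.log(0.25), [-math.log(0.75)]),
}
bad = {0: (math.inf, [-math.log(0.5)])}
print(is_stochastic_in_log(good), is_stochastic_in_log(bad))  # True False
```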
-
kaldi.fstext.utils.
lattice_scale
(lmwt:float, acwt:float) → list<list<float>>¶ Returns a 2x2 matrix for scaling graph and acoustic costs in lattice weights.
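The scaling helpers return small 2x2 matrices that act on the (graph cost, acoustic cost) pair of a lattice weight. The sketch below shows the layout as I understand it from Kaldi's C++ helpers (a diagonal matrix with the graph scale first); the `toy_` function names and `apply_scale` are hypothetical and do not call PyKaldi.

```python
from typing import List, Tuple


def toy_lattice_scale(lmwt: float, acwt: float) -> List[List[float]]:
    """Assumed layout: row 0 scales the graph cost, row 1 the acoustic cost."""
    return [[lmwt, 0.0], [0.0, acwt]]


def toy_graph_lattice_scale(lmwt: float) -> List[List[float]]:
    return toy_lattice_scale(lmwt, 1.0)


def toy_acoustic_lattice_scale(acwt: float) -> List[List[float]]:
    return toy_lattice_scale(1.0, acwt)


def toy_default_lattice_scale() -> List[List[float]]:
    return toy_lattice_scale(1.0, 1.0)  # identity: leaves weights unchanged


def apply_scale(scale: List[List[float]],
                weight: Tuple[float, float]) -> Tuple[float, float]:
    """Multiplies the 2x2 matrix with a (graph, acoustic) cost pair."""
    graph, acoustic = weight
    return (scale[0][0] * graph + scale[0][1] * acoustic,
            scale[1][0] * graph + scale[1][1] * acoustic)


weight = (3.0, 50.0)
print(apply_scale(toy_default_lattice_scale(), weight))      # (3.0, 50.0)
print(apply_scale(toy_acoustic_lattice_scale(0.5), weight))  # (3.0, 25.0)
print(apply_scale(toy_lattice_scale(2.0, 0.5), weight))      # (6.0, 25.0)
```

The off-diagonal entries are zero in all four helpers; a general 2x2 matrix could also mix graph and acoustic costs.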
-
kaldi.fstext.utils.
make_following_input_symbols_same
(end_is_epsilon:bool, fst:StdMutableFst)¶ Ensures that all arcs exiting any state have the same input symbol.
Detects states that have differing input symbols going out, and inserts, for each of the following arcs with non-epsilon input symbol, a new dummy state that has an epsilon link from the fst state. The output symbol and weight stay on the link to the dummy state (in order to keep the FST output-deterministic and stochastic, if it already was). If
end_is_epsilon == True
, treats “being a final-state” like having an epsilon output link.
-
kaldi.fstext.utils.
make_linear_acceptor
(labels:list<int>, ofst:StdMutableFst)¶ Creates an unweighted linear acceptor from the label sequence.
-
kaldi.fstext.utils.
make_linear_acceptor_with_alternatives
(labels:list<list<int>>, ofst:StdMutableFst)¶ Creates an unweighted acceptor with a linear structure.
Each position in the input list is a list of labels. Each position must have at least one alternative. Epsilon/0 is treated like a normal symbol.
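The linear structure described above can be sketched as a list of arcs. This pure-Python illustration does not build a PyKaldi FST; the arc tuple layout and the function name are hypothetical. A plain linear acceptor is the special case with exactly one alternative per position.

```python
from typing import List, Tuple

Arc = Tuple[int, int, int, int]  # (source, ilabel, olabel, destination)


def linear_acceptor_arcs(labels: List[List[int]]) -> Tuple[List[Arc], int]:
    """Returns (arcs, final_state); state 0 is the start state."""
    arcs: List[Arc] = []
    for state, alternatives in enumerate(labels):
        if not alternatives:
            raise ValueError("each position needs at least one alternative")
        for label in alternatives:
            # Acceptor: input and output labels are equal. Label 0 is kept as-is.
            arcs.append((state, label, label, state + 1))
    return arcs, len(labels)


arcs, final_state = linear_acceptor_arcs([[1], [2, 3], [0]])
print(arcs)         # [(0, 1, 1, 1), (1, 2, 2, 2), (1, 3, 3, 2), (2, 0, 0, 3)]
print(final_state)  # 3
```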
-
kaldi.fstext.utils.
make_preceding_input_symbols_same
(start_is_epsilon:bool, fst:StdMutableFst)¶ Ensures that all arcs entering any state have the same input symbol.
Detects states that have differing input symbols going in, and inserts, for each of the preceding arcs with non-epsilon input symbol, a new dummy state that has an epsilon link to the fst state. If
start_is_epsilon == True
, ensures that start-state can have only epsilon-links into it.
-
kaldi.fstext.utils.
map_input_symbols
(symbol_map:list<int>, fst:StdMutableFst)¶ Maps input labels to labels given in the symbol map.
-
kaldi.fstext.utils.
minimize_encoded_std_fst
(fst:StdVectorFst, delta:float=default)¶ Minimizes FST in place after encoding labels and weights.
Similar to the minimize operation, except that it pushes neither the weights nor the labels.
Parameters: - fst (StdVectorFst) – Input FST.
- delta (float) – Quantization delta (default=0.0009765625).
-
kaldi.fstext.utils.
nbest_as_fsts
(fst:StdFst, n:int) → list<StdVectorFst>¶ Outputs (up to) n-best paths in the FST as a list of FSTs.
-
kaldi.fstext.utils.
phi_compose
(fst1:StdFst, fst2:StdFst, phi_label:int, ofst:StdMutableFst)¶ Performs composition by handling phi (failure) transitions.
This is a version of composition where the right hand FST (fst2) is treated as a backoff language model, with the phi symbol (e.g. #0) treated as a “failure transition”, only taken when there is no match for the requested symbol.
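The failure-transition rule can be illustrated with a small plain-Python model of a backoff language model. This is only a sketch of the matching rule, not of the composition algorithm; the PHI value, the backoff_lm dictionary layout and the follow helper are all hypothetical.

```python
PHI = -1  # hypothetical label standing in for the phi symbol (e.g. #0)

# Toy backoff model: state -> {label: (cost, next_state)}.
backoff_lm = {
    "bigram": {10: (0.5, "bigram"), PHI: (1.0, "unigram")},
    "unigram": {10: (2.0, "bigram"), 20: (3.0, "bigram")},
}


def follow(state, label):
    """Return (total_cost, next_state), taking phi arcs only on failure."""
    cost = 0.0
    while label not in backoff_lm[state]:
        if PHI not in backoff_lm[state]:
            raise KeyError(f"label {label} cannot be matched from {state}")
        phi_cost, state = backoff_lm[state][PHI]
        cost += phi_cost  # costs add up along the path (tropical Times)
    arc_cost, next_state = backoff_lm[state][label]
    return cost + arc_cost, next_state
```

Label 10 is matched directly in the "bigram" state, so the phi arc is not taken; label 20 has no match there, so the phi arc is followed and its cost is added.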
-
kaldi.fstext.utils.
phi_compose_lattice
(fst1:LatticeFst, fst2:LatticeFst, phi_label:int, ofst:LatticeMutableFst)¶ Performs lattice composition by handling phi (failure) transitions.
This is a version of composition where the right hand FST (fst2) is treated as a backoff language model, with the phi symbol (e.g. #0) treated as a “failure transition”, only taken when there is no match for the requested symbol.
-
kaldi.fstext.utils.
preceding_input_symbols_are_same
(start_is_epsilon:bool, fst:StdFst) → bool¶ Checks if all arcs entering any state have the same input symbol.
Returns true if and only if the FST is such that the input symbols on arcs entering any given state all have the same value. If start_is_epsilon == True, treats start-state as an epsilon input arc [i.e. ensures only epsilons can enter start-state].
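The check can be modelled in plain Python over a list of arcs. The sketch below does not use PyKaldi types; the function name, the (src, ilabel, dst) tuple layout and the explicit start argument are assumptions for illustration.

```python
def entering_labels_are_same(arcs, start, start_is_epsilon):
    """Check that every state is entered by a single input label.

    arcs: iterable of (src, ilabel, dst) tuples (hypothetical layout).
    """
    entering = {}
    if start_is_epsilon:
        # Treat the start state as if an epsilon (label 0) arc entered it.
        entering[start] = 0
    for _src, ilabel, dst in arcs:
        # Remember the first label seen for dst and compare later ones to it.
        if entering.setdefault(dst, ilabel) != ilabel:
            return False
    return True
```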
-
kaldi.fstext.utils.
propagate_final
(phi_label:int, fst:StdMutableFst)¶ Propagates final-probs through “phi” transitions.
Note that here, phi_label may be epsilon. If you have a backoff language model with special symbols (“phi”) on the backoff arcs instead of epsilon, you may use phi_compose() to compose with it, but this won’t handle final probabilities correctly. To fix this, first call propagate_final() on the FST with phi’s in it (fst2 in phi_compose()). If a state does not have a final-prob but has a phi transition, this function sets the state’s final-prob to (phi-prob * final-prob-of-dest-state), and does so recursively, i.e. it follows phi transitions on the dest state first. It behaves as if there were a super-final state with a special symbol leading to it from each currently final state. Note that this may not behave as desired if there are epsilons in your FST; it might be better to remove those before calling this function.
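The recursion can be sketched in plain Python with costs (negated log-probabilities), where the product phi-prob * final-prob becomes a sum. The dictionary layouts and the helper name below are assumptions for this sketch, which also assumes that phi arcs form no cycles.

```python
import math


def propagate_final_cost(final_cost, phi_arc, state):
    """Return the final cost of state, following phi arcs recursively.

    final_cost: dict state -> cost; a missing state is not final.
    phi_arc: dict state -> (phi_cost, dest_state).
    Both layouts are hypothetical and used for illustration only.
    """
    if state in final_cost:
        return final_cost[state]
    if state not in phi_arc:
        return math.inf  # not final and no phi arc to follow
    phi_cost, dest = phi_arc[state]
    # Resolve the destination first, then add the cost of the phi arc.
    cost = phi_cost + propagate_final_cost(final_cost, phi_arc, dest)
    if cost != math.inf:
        final_cost[state] = cost
    return cost


final_cost = {2: 0.25}
phi_arc = {0: (1.0, 1), 1: (0.5, 2)}
result = propagate_final_cost(final_cost, phi_arc, 0)
```

States 0 and 1 become final because a chain of phi arcs leads from them to the final state 2.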
-
kaldi.fstext.utils.
remove_alignments_from_compact_lattice
(fst:CompactLatticeMutableFst)¶ Removes state-level alignments in a compact lattice.
-
kaldi.fstext.utils.
remove_some_input_symbols
(to_remove:list<int>, fst:StdMutableFst)¶ Replaces given input labels with zeros.
-
kaldi.fstext.utils.
remove_useless_arcs
(fst:StdMutableFst)¶ Removes arcs that are not on best paths for any input symbol sequence.
This removes arcs such that there is no input symbol sequence for which the best path through the FST would contain those arcs [for these purposes, epsilon is not treated as a real symbol]. This is mainly geared towards decoding-graph FSTs which may contain transitions that have less likely words on them that would never be taken. We do not claim that this algorithm removes all such arcs; it just does the best job it can. Only works for tropical (not log) semiring as it uses NaturalLess.
-
kaldi.fstext.utils.
remove_weights
(fst:StdMutableFst)¶ Removes FST weights.
-
kaldi.fstext.utils.
rho_compose
(fst1:StdFst, fst2:StdFst, rho_label:int, ofst:StdMutableFst)¶ Performs composition by handling rho transitions.
This is a version of composition where the right hand FST (fst2) has special “rho transitions” which are taken whenever no normal transition matches; these transitions will be rewritten with whatever symbol was on the first FST.
-
kaldi.fstext.utils.
safe_determinize_minimize_wrapper
(ifst:StdMutableFst, ofst:StdVectorFst, delta:float=default)¶ Performs safe determinization and minimization.
Like safe_determinize_wrapper(), but also does encoded minimization, which is safe. This algorithm will destroy ifst.
-
kaldi.fstext.utils.
safe_determinize_minimize_wrapper_in_log
(ifst:StdVectorFst, ofst:StdVectorFst, delta:float=default)¶ Performs safe determinization and minimization in log semiring.
Like safe_determinize_minimize_wrapper(), but first casts to the log semiring. This algorithm will destroy ifst.
-
kaldi.fstext.utils.
safe_determinize_wrapper
(ifst:StdMutableFst, ofst:StdMutableFst, delta:float=default)¶ Performs safe determinization.
This is a form of determinization that will never blow up. Note that ifst is non-const and can be destroyed by this operation. It does not do epsilon removal, so that it is safe to cast to the log semiring, apply this operation, and still maintain equivalence in the tropical semiring.
-
kaldi.fstext.utils.
scale_compact_lattice
(scale:list<list<float>>, fst:CompactLatticeMutableFst)¶ Scales the compact lattice weights.
Scales the pair of weights in CompactLatticeWeight by viewing the pair (a, b) as a 2-vector and pre-multiplying by the 2x2 matrix in scale. E.g. typically scale would equal [[1, 0], [0, acwt]] if we want to scale the acoustics by acwt.
-
kaldi.fstext.utils.
scale_lattice
(scale:list<list<float>>, fst:LatticeMutableFst)¶ Scales the lattice weights.
Scales the pair of weights in LatticeWeight by viewing the pair (a, b) as a 2-vector and pre-multiplying by the 2x2 matrix in scale. E.g. typically scale would equal [[1, 0], [0, acwt]] if we want to scale the acoustics by acwt.
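The matrix arithmetic behind the lattice scaling functions can be written out in plain Python. The sketch below does not call PyKaldi; diagonal_scale and scale_pair are hypothetical helpers, and the diagonal form of the matrix is an assumption based on the [[1, 0], [0, acwt]] example.

```python
def diagonal_scale(lmwt, acwt):
    """Build a 2x2 matrix that scales graph costs by lmwt and acoustic costs by acwt."""
    return [[lmwt, 0.0], [0.0, acwt]]


def scale_pair(pair, scale):
    """Pre-multiply the 2-vector (a, b) by the 2x2 matrix scale."""
    a, b = pair
    return (
        scale[0][0] * a + scale[0][1] * b,
        scale[1][0] * a + scale[1][1] * b,
    )


# Graph cost 2.0 is kept, acoustic cost 10.0 is halved.
scaled = scale_pair((2.0, 10.0), diagonal_scale(1.0, 0.5))
```

Off-diagonal entries would mix the two costs, which is why a full matrix is accepted rather than just two scalars.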
kaldi.fstext.weight¶
PyKaldi has support for the following weight types:
- Tropical weight.
- Log weight.
- Lattice weight.
- Compact lattice weight.
- KWS time weight.
- KWS index weight.
-
kaldi.fstext.weight.
DELTA
= 0.0009765625¶
-
kaldi.fstext.weight.
LEFT_SEMIRING
= 1¶
-
kaldi.fstext.weight.
RIGHT_SEMIRING
= 2¶
-
kaldi.fstext.weight.
SEMIRING
= 3¶
-
kaldi.fstext.weight.
COMMUTATIVE
= 4¶
-
kaldi.fstext.weight.
IDEMPOTENT
= 8¶
-
kaldi.fstext.weight.
PATH
= 16¶
-
kaldi.fstext.weight.
NUM_RANDOM_WEIGHTS
= 5¶
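The property constants are bit flags, so a value returned by a properties() method can be decoded with bitwise operations. The sketch below copies the flag values listed above into plain Python; flag_names and the example value are hypothetical.

```python
# Flag values copied from the constants listed above.
LEFT_SEMIRING = 1
RIGHT_SEMIRING = 2
SEMIRING = 3  # LEFT_SEMIRING | RIGHT_SEMIRING
COMMUTATIVE = 4
IDEMPOTENT = 8
PATH = 16

SINGLE_BIT_FLAGS = {
    "LEFT_SEMIRING": LEFT_SEMIRING,
    "RIGHT_SEMIRING": RIGHT_SEMIRING,
    "COMMUTATIVE": COMMUTATIVE,
    "IDEMPOTENT": IDEMPOTENT,
    "PATH": PATH,
}


def flag_names(properties):
    """Return the names of the single-bit flags set in a properties value."""
    return [name for name, bit in SINGLE_BIT_FLAGS.items() if properties & bit]


example = SEMIRING | COMMUTATIVE  # hypothetical properties value
names = flag_names(example)
```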
Functions
approx_equal_compact_lattice_weight | Checks if given compact lattice weights are approximately equal.
approx_equal_float_weight | Checks if given float weights are approximately equal.
approx_equal_lattice_weight | Checks if given lattice weights are approximately equal.
compact_lattice_weight_to_cost | Converts compact lattice weight to cost.
compare_compact_lattice_weight | Compares input compact lattice weights.
compare_lattice_weight | Compares input lattice weights.
divide_compact_lattice_weight | \(\oslash\) operation in the compact lattice semiring.
divide_kws_index_weight | \(\oslash\) operation in the KWS index semiring.
divide_lattice_weight | \(\oslash\) operation in the lattice semiring.
divide_log_weight | \(\oslash\) operation in the log semiring.
divide_tropical_lt_tropical_weight | \(\oslash\) operation in the KWS time semiring.
divide_tropical_weight | \(\oslash\) operation in the tropical semiring.
get_log_to_tropical_converter | Returns a callable for converting log weight to tropical weight.
get_tropical_to_log_converter | Returns a callable for converting tropical weight to log weight.
lattice_weight_to_cost | Converts lattice weight to cost.
lattice_weight_to_tropical | Converts lattice weight to tropical weight.
plus_compact_lattice_weight | \(\oplus\) operation in the compact lattice semiring.
plus_kws_index_weight | \(\oplus\) operation in the KWS index semiring.
plus_lattice_weight | \(\oplus\) operation in the lattice semiring.
plus_log_weight | \(\oplus\) operation in the log semiring.
plus_tropical_lt_tropical_weight | \(\oplus\) operation in the KWS time semiring.
plus_tropical_weight | \(\oplus\) operation in the tropical semiring.
power_log_weight | Power operation in the log semiring.
power_tropical_weight | Power operation in the tropical semiring.
scale_compact_lattice_weight | Scales compact lattice weight.
scale_lattice_weight | Scales lattice weight.
times_compact_lattice_weight | \(\otimes\) operation in the compact lattice semiring.
times_kws_index_weight | \(\otimes\) operation in the KWS index semiring.
times_lattice_weight | \(\otimes\) operation in the lattice semiring.
times_log_weight | \(\otimes\) operation in the log semiring.
times_tropical_lt_tropical_weight | \(\otimes\) operation in the KWS time semiring.
times_tropical_weight | \(\otimes\) operation in the tropical semiring.
tropical_weight_to_cost | Converts tropical weight to cost.
Classes
CompactLatticeNaturalLess | Comparison object in compact lattice semiring.
CompactLatticeWeight | Compact lattice weight.
DivideType | An enumeration.
FloatLimits | Float limits.
FloatWeight | Base class for float weight types.
KwsIndexWeight | KWS index weight.
KwsTimeWeight | KWS time weight.
LatticeNaturalLess | Comparison object in lattice semiring.
LatticeWeight | Lattice weight.
LogWeight | Log weight.
TropicalWeight | Tropical weight.
-
class
kaldi.fstext.weight.
CompactLatticeNaturalLess
¶ Comparison object in compact lattice semiring.
-
class
kaldi.fstext.weight.
CompactLatticeWeight
¶ Compact lattice weight.
-
from_other
(other:CompactLatticeWeight) → CompactLatticeWeight¶ Create a new compact lattice weight from another.
-
from_pair
(w:LatticeWeight, s:list<int>) → CompactLatticeWeight¶ Create a new compact lattice weight from a weight string pair.
-
get_int_size_string
() → str¶ Returns int size string.
-
hash
() → int¶ Returns the hash for the weight.
-
member
() → bool¶ Checks if weight is a member of the compact lattice semiring.
-
no_weight
() → CompactLatticeWeight¶ No weight in compact lattice semiring.
-
one
() → CompactLatticeWeight¶ One in compact lattice semiring.
-
properties
() → int¶ Returns weight properties.
-
quantize
(delta:float=default) → CompactLatticeWeight¶ Quantizes the weight.
-
reverse
() → CompactLatticeWeight¶ Reverses the weight.
-
string
¶ The string as a list of integers.
-
type
() → str¶ Returns weight type.
-
weight
¶ The weight.
-
zero
() → CompactLatticeWeight¶ Zero in compact lattice semiring.
-
-
class
kaldi.fstext.weight.
DivideType
¶ An enumeration of the division types accepted by the typ argument of the divide_* functions.
-
DIVIDE_ANY
= 2¶
-
DIVIDE_LEFT
= 0¶
-
DIVIDE_RIGHT
= 1¶
-
-
class
kaldi.fstext.weight.
FloatLimits
¶ Float limits.
-
neg_infinity
() → float¶ Returns float -infinity.
-
number_bad
() → float¶ Returns float bad number.
-
pos_infinity
() → float¶ Returns float +infinity.
-
-
class
kaldi.fstext.weight.
FloatWeight
¶ Base class for float weight types.
-
from_float
(f:float) → FloatWeight¶ Create a new float weight from a float.
-
from_other
(weight:FloatWeight) → FloatWeight¶ Create a new float weight from another.
-
hash
() → int¶ Returns the hash for the weight.
-
value
¶ Float value of the weight.
-
-
class
kaldi.fstext.weight.
KwsIndexWeight
¶ KWS index weight.
A tropical weight triplet with lexicographic ordering.
-
from_components
(w1:TropicalWeight, w2:KwsTimeWeight) → KwsIndexWeight¶ Creates a new KWS index weight from component weights.
-
member
() → bool¶ Checks if weight is a member of the KWS index semiring.
-
no_weight
() → KwsIndexWeight¶ No weight in KWS index semiring.
-
one
() → KwsIndexWeight¶ One in KWS index semiring.
-
properties
() → int¶ Returns weight properties.
-
quantize
(delta:float=default) → KwsIndexWeight¶ Quantizes the weight.
-
reverse
() → KwsIndexWeight¶ Reverses the weight.
-
type
() → str¶ Returns weight type.
-
value1
¶ The first component weight.
-
value2
¶ The second component weight.
-
zero
() → KwsIndexWeight¶ Zero in KWS index semiring.
-
-
class
kaldi.fstext.weight.
KwsTimeWeight
¶ KWS time weight.
A tropical weight pair with lexicographic ordering.
-
from_components
(w1:TropicalWeight, w2:TropicalWeight) → KwsTimeWeight¶ Creates a new KWS time weight from component weights.
-
member
() → bool¶ Checks if weight is a member of the KWS time semiring.
-
no_weight
() → KwsTimeWeight¶ No weight in the KWS time semiring.
-
one
() → KwsTimeWeight¶ One in the KWS time semiring.
-
properties
() → int¶ Returns weight properties.
-
quantize
(delta:float=default) → KwsTimeWeight¶ Quantizes the weight.
-
reverse
() → KwsTimeWeight¶ Reverses the weight.
-
type
() → str¶ Returns weight type.
-
value1
¶ The first component weight.
-
value2
¶ The second component weight.
-
zero
() → KwsTimeWeight¶ Zero in the KWS time semiring.
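Lexicographic ordering on a pair of tropical costs can be modelled with Python tuples, which also compare lexicographically. The sketch below assumes that \(\oplus\) keeps the smaller pair and that \(\otimes\) adds the components; plus_lex and times_lex are hypothetical helpers, not PyKaldi functions.

```python
def plus_lex(w1, w2):
    """Keep the pair that is smaller in lexicographic order.

    The first component decides; the second only breaks ties.
    """
    return min(w1, w2)  # Python compares tuples lexicographically


def times_lex(w1, w2):
    """Combine two pairs by adding the costs component by component."""
    return (w1[0] + w2[0], w1[1] + w2[1])
```

For instance, plus_lex((1.0, 5.0), (1.0, 3.0)) falls through to the second component because the first components tie.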
-
-
class
kaldi.fstext.weight.
LatticeNaturalLess
¶ Comparison object in lattice semiring.
-
class
kaldi.fstext.weight.
LatticeWeight
¶ Lattice weight.
-
from_other
(other:LatticeWeight) → LatticeWeight¶ Create a new lattice weight from another.
-
from_pair
(a:float, b:float) → LatticeWeight¶ Create a new lattice weight from a pair of floats.
-
hash
() → int¶ Returns the hash for the weight.
-
member
() → bool¶ Checks if weight is a member of the lattice semiring.
-
no_weight
() → LatticeWeight¶ No weight in lattice semiring.
-
one
() → LatticeWeight¶ One in lattice semiring, i.e. (0.0, 0.0).
-
properties
() → int¶ Returns weight properties.
-
quantize
(delta:float=default) → LatticeWeight¶ Quantizes the weight.
-
reverse
() → LatticeWeight¶ Reverses the weight.
-
type
() → str¶ Returns weight type.
-
value1
¶ Float value of the first weight.
-
value2
¶ Float value of the second weight.
-
zero
() → LatticeWeight¶ Zero in lattice semiring, i.e. (+infinity, +infinity).
-
-
class
kaldi.fstext.weight.
LogWeight
¶ Log weight.
-
from_float
(f:float) → LogWeight¶ Create a new log weight from a float.
-
from_other
(weight:LogWeight) → LogWeight¶ Create a new log weight from another.
-
hash
() → int¶ Returns the hash for the weight.
-
member
() → bool¶ Checks if weight is a member of the log semiring.
-
no_weight
() → LogWeight¶ No weight in log semiring.
-
one
() → LogWeight¶ One in log semiring, i.e. 0.0.
-
properties
() → int¶ Returns weight properties.
-
quantize
(delta:float=default) → LogWeight¶ Quantizes the weight.
-
reverse
() → LogWeight¶ Reverses the weight.
-
type
() → str¶ Returns weight type.
-
value
¶ Float value of the weight.
-
zero
() → LogWeight¶ Zero in log semiring, i.e. float +infinity.
-
-
class
kaldi.fstext.weight.
TropicalWeight
¶ Tropical weight.
-
from_float
(f:float) → TropicalWeight¶ Create a new tropical weight from a float.
-
from_other
(weight:TropicalWeight) → TropicalWeight¶ Create a new tropical weight from another.
-
hash
() → int¶ Returns the hash for the weight.
-
member
() → bool¶ Checks if weight is a member of the tropical semiring.
-
no_weight
() → TropicalWeight¶ No weight in tropical semiring.
-
one
() → TropicalWeight¶ One in tropical semiring, i.e. 0.0.
-
properties
() → int¶ Returns weight properties.
-
quantize
(delta:float=default) → TropicalWeight¶ Quantizes the weight.
-
reverse
() → TropicalWeight¶ Reverses the weight.
-
type
() → str¶ Returns weight type.
-
value
¶ Float value of the weight.
-
zero
() → TropicalWeight¶ Zero in tropical semiring, i.e. float +infinity.
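Quantization of a float weight can be pictured as rounding the value to the nearest multiple of delta. The plain-Python sketch below assumes that rule and leaves non-finite values unchanged; it uses Python doubles rather than the single-precision floats of the underlying library, so results may differ in the last digits.

```python
import math

DELTA = 0.0009765625  # default delta listed above, equal to 2 ** -10


def quantize(value, delta=DELTA):
    """Round value to the nearest multiple of delta (assumed rule)."""
    if not math.isfinite(value):
        # Infinite or NaN values cannot be rounded and are returned as is.
        return value
    return math.floor(value / delta + 0.5) * delta
```

Values closer together than roughly delta / 2 tend to collapse to the same quantized weight, which is useful when weights are compared or hashed.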
-
-
kaldi.fstext.weight.
approx_equal_compact_lattice_weight
(w1:CompactLatticeWeight, w2:CompactLatticeWeight, delta:float=default) → bool¶ Checks if given compact lattice weights are approximately equal.
-
kaldi.fstext.weight.
approx_equal_float_weight
(w1:FloatWeight, w2:FloatWeight, delta:float=default) → bool¶ Checks if given float weights are approximately equal.
-
kaldi.fstext.weight.
approx_equal_lattice_weight
(w1:LatticeWeight, w2:LatticeWeight, delta:float=default) → bool¶ Checks if given lattice weights are approximately equal.
-
kaldi.fstext.weight.
compact_lattice_weight_to_cost
(w:CompactLatticeWeight) → float¶ Converts compact lattice weight to cost.
-
kaldi.fstext.weight.
compare_compact_lattice_weight
(w1:CompactLatticeWeight, w2:CompactLatticeWeight) → int¶ Compares input compact lattice weights.
-
kaldi.fstext.weight.
compare_lattice_weight
(w1:LatticeWeight, w2:LatticeWeight) → int¶ Compares input lattice weights.
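The sign convention of the comparison is easy to get backwards, so here is a plain-Python sketch of the rule as implemented in Kaldi's lattice semiring, to the best of our understanding: the weight with the lower total cost ranks higher, and ties are broken on the first component. Weights are modelled as (value1, value2) tuples and compare_pairs is a hypothetical helper.

```python
def compare_pairs(w1, w2):
    """Return 1 if w1 ranks higher (lower cost), -1 if w2 does, 0 if equal."""
    total1 = w1[0] + w1[1]
    total2 = w2[0] + w2[1]
    if total1 < total2:
        return 1  # a smaller total cost ranks higher in the semiring
    if total1 > total2:
        return -1
    # Equal totals: the smaller first component wins.
    if w1[0] < w2[0]:
        return 1
    if w1[0] > w2[0]:
        return -1
    return 0
```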
-
kaldi.fstext.weight.
divide_compact_lattice_weight
(w1:CompactLatticeWeight, w2:CompactLatticeWeight, typ:DivideType=default) → CompactLatticeWeight¶ \(\oslash\) operation in the compact lattice semiring.
-
kaldi.fstext.weight.
divide_kws_index_weight
(w1:KwsIndexWeight, w2:KwsIndexWeight, typ:DivideType=default) → KwsIndexWeight¶ \(\oslash\) operation in the KWS index semiring.
-
kaldi.fstext.weight.
divide_lattice_weight
(w1:LatticeWeight, w2:LatticeWeight, typ:DivideType=default) → LatticeWeight¶ \(\oslash\) operation in the lattice semiring.
-
kaldi.fstext.weight.
divide_log_weight
(w1:LogWeight, w2:LogWeight, typ:DivideType=default) → LogWeight¶ \(\oslash\) operation in the log semiring.
-
kaldi.fstext.weight.
divide_tropical_lt_tropical_weight
(w1:KwsTimeWeight, w2:KwsTimeWeight, typ:DivideType=default) → KwsTimeWeight¶ \(\oslash\) operation in the KWS time semiring.
-
kaldi.fstext.weight.
divide_tropical_weight
(w1:TropicalWeight, w2:TropicalWeight, typ:DivideType=default) → TropicalWeight¶ \(\oslash\) operation in the tropical semiring.
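In the tropical semiring \(\otimes\) is addition of costs, so \(\oslash\) amounts to subtraction. The plain-Python sketch below works on raw float costs; the handling of infinite operands is an assumption modelled on OpenFst behaviour, and divide_cost is a hypothetical helper.

```python
import math


def divide_cost(x, y):
    """Tropical division on raw costs: the inverse of adding y."""
    if y == math.inf:
        return math.nan  # dividing by the semiring Zero is undefined
    if x == math.inf:
        return math.inf  # Zero divided by anything else stays Zero
    return x - y
```

Adding y back to the quotient recovers x, e.g. divide_cost(5.0, 2.0) + 2.0 equals 5.0.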
-
kaldi.fstext.weight.
get_log_to_tropical_converter
() -> (w:LogWeight) → TropicalWeight¶ Returns a callable for converting log weight to tropical weight.
-
kaldi.fstext.weight.
get_tropical_to_log_converter
() -> (w:TropicalWeight) → LogWeight¶ Returns a callable for converting tropical weight to log weight.
-
kaldi.fstext.weight.
lattice_weight_to_cost
(w:LatticeWeight) → float¶ Converts lattice weight to cost.
-
kaldi.fstext.weight.
lattice_weight_to_tropical
(w_in:LatticeWeight) → TropicalWeight¶ Converts lattice weight to tropical weight.
-
kaldi.fstext.weight.
plus_compact_lattice_weight
(w1:CompactLatticeWeight, w2:CompactLatticeWeight) → CompactLatticeWeight¶ \(\oplus\) operation in the compact lattice semiring.
-
kaldi.fstext.weight.
plus_kws_index_weight
(w1:KwsIndexWeight, w2:KwsIndexWeight) → KwsIndexWeight¶ \(\oplus\) operation in the KWS index semiring.
-
kaldi.fstext.weight.
plus_lattice_weight
(w1:LatticeWeight, w2:LatticeWeight) → LatticeWeight¶ \(\oplus\) operation in the lattice semiring.
-
kaldi.fstext.weight.
plus_log_weight
(w1:LogWeight, w2:LogWeight) → LogWeight¶ \(\oplus\) operation in the log semiring.
-
kaldi.fstext.weight.
plus_tropical_lt_tropical_weight
(w1:KwsTimeWeight, w2:KwsTimeWeight) → KwsTimeWeight¶ \(\oplus\) operation in the KWS time semiring.
-
kaldi.fstext.weight.
plus_tropical_weight
(w1:TropicalWeight, w2:TropicalWeight) → TropicalWeight¶ \(\oplus\) operation in the tropical semiring.
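The difference between \(\oplus\) in the tropical and log semirings is worth spelling out. The plain-Python sketch below operates on raw float costs (negated log-probabilities) instead of PyKaldi weight objects; the helper names are hypothetical.

```python
import math


def plus_tropical_cost(x, y):
    """Tropical Plus keeps the better (smaller) cost."""
    return min(x, y)


def plus_log_cost(x, y):
    """Log Plus sums the probabilities: -log(exp(-x) + exp(-y))."""
    if x == math.inf:
        return y
    if y == math.inf:
        return x
    low, high = min(x, y), max(x, y)
    # Numerically stable form of the expression in the docstring.
    return low - math.log1p(math.exp(low - high))
```

With two paths of equal cost 1.0, the tropical result stays at 1.0, while the log result drops to 1.0 - log(2) because the probabilities add up.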
-
kaldi.fstext.weight.
power_log_weight
(weight:LogWeight, scalar:float) → LogWeight¶ Power operation in the log semiring.
-
kaldi.fstext.weight.
power_tropical_weight
(weight:TropicalWeight, scalar:float) → TropicalWeight¶ Power operation in the tropical semiring.
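Since \(\otimes\) adds costs in both the tropical and log semirings, raising a weight to a power corresponds to multiplying its cost by the scalar. A minimal sketch on raw float costs, assuming that rule (power_cost is a hypothetical helper):

```python
def power_cost(cost, scalar):
    """Apply Times to a cost `scalar` times, i.e. multiply the cost by scalar."""
    return cost * scalar


# Squaring a weight with cost 1.5 is the same as Times(w, w): 1.5 + 1.5.
squared = power_cost(1.5, 2)
```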
-
kaldi.fstext.weight.
scale_compact_lattice_weight
(w:CompactLatticeWeight, scale:list<list<float>>) → CompactLatticeWeight¶ Scales compact lattice weight.
-
kaldi.fstext.weight.
scale_lattice_weight
(w:LatticeWeight, scale:list<list<float>>) → LatticeWeight¶ Scales lattice weight.
-
kaldi.fstext.weight.
times_compact_lattice_weight
(w1:CompactLatticeWeight, w2:CompactLatticeWeight) → CompactLatticeWeight¶ \(\otimes\) operation in the compact lattice semiring.
-
kaldi.fstext.weight.
times_kws_index_weight
(w1:KwsIndexWeight, w2:KwsIndexWeight) → KwsIndexWeight¶ \(\otimes\) operation in the KWS index semiring.
-
kaldi.fstext.weight.
times_lattice_weight
(w1:LatticeWeight, w2:LatticeWeight) → LatticeWeight¶ \(\otimes\) operation in the lattice semiring.
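The two lattice semiring operations can be sketched on (value1, value2) tuples in plain Python. This assumes the usual Kaldi definitions: \(\oplus\) keeps the weight with the lower total cost, breaking ties on the first component, and \(\otimes\) adds component by component. The helper names are hypothetical.

```python
def plus_pairs(w1, w2):
    """Keep the weight with the lower total cost; ties go to the lower value1."""
    # min() returns its first argument when the keys are equal.
    return min((w1, w2), key=lambda w: (w[0] + w[1], w[0]))


def times_pairs(w1, w2):
    """Add the two cost components independently."""
    return (w1[0] + w2[0], w1[1] + w2[1])
```

Keeping the two components separate under \(\otimes\) is what allows them to be rescaled independently later.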
-
kaldi.fstext.weight.
times_log_weight
(w1:LogWeight, w2:LogWeight) → LogWeight¶ \(\otimes\) operation in the log semiring.
-
kaldi.fstext.weight.
times_tropical_lt_tropical_weight
(w1:KwsTimeWeight, w2:KwsTimeWeight) → KwsTimeWeight¶ \(\otimes\) operation in the KWS time semiring.
-
kaldi.fstext.weight.
times_tropical_weight
(w1:TropicalWeight, w2:TropicalWeight) → TropicalWeight¶ \(\otimes\) operation in the tropical semiring.
-
kaldi.fstext.weight.
tropical_weight_to_cost
(w:TropicalWeight) → float¶ Converts tropical weight to cost.