pydrobert.kaldi.io.util
Kaldi I/O utilities
- pydrobert.kaldi.io.util.infer_kaldi_data_type(obj)
Infer the appropriate kaldi data type for this object
The following map is used (in order):
Object                                   KaldiDataType
an int                                   Int32
a boolean                                Bool
a float*                                 Base
str                                      Token
2-dim numpy array float32                FloatMatrix
1-dim numpy array float32                FloatVector
2-dim numpy array float64                DoubleMatrix
1-dim numpy array float64                DoubleVector
1-dim numpy array of int32               Int32Vector
2-dim numpy array of int32*              Int32VectorVector
(matrix-like, float or int)              WaveMatrix**
an empty container                       BaseMatrix
container of str                         TokenVector
1-dim py container of ints               Int32Vector
2-dim py container of ints*              Int32VectorVector
2-dim py container of pairs of floats    BasePairVector
matrix-like python container             DoubleMatrix
vector-like python container             DoubleVector
*The same data types could represent a Double or an Int32PairVector, respectively. Care should be taken in these cases.
**The first element is the wave data, the second its sample frequency. The wave data can be a 2d numpy float array of the same precision as KaldiDataType.BaseMatrix, or a matrix-like python container of floats and/or ints.
- Returns
- pydrobert.kaldi.io.util.parse_kaldi_input_path(path)
Determine the characteristics of an input stream by its path
Returns a 4-tuple of the following information:
If path is not an rspecifier (TableType.NotATable):
    Classify path as an rxfilename
    return a tuple of (TableType, path, RxfilenameType, dict())
else:
    Put all rspecifier options (once, sorted, called_sorted, permissive, background) into a dictionary
    Extract the embedded rxfilename and classify it
    return a tuple of (TableType, rxfilename, RxfilenameType, options)
- Parameters
path (str) – A string that would be passed to pydrobert.kaldi.io.open
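The branching described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the library's implementation: the names RSPECIFIER_FLAGS, classify_rxfilename, and sketch_parse_input_path are hypothetical, the string labels stand in for the TableType and RxfilenameType enum members, and the flag spellings (o, s, cs, p, bg and their "n" negations) follow Kaldi's usual rspecifier conventions.

```python
# Hypothetical sketch: mirrors the documented 4-tuple, but uses plain
# strings instead of pydrobert-kaldi's enums.
RSPECIFIER_FLAGS = {
    "o": ("once", True), "no": ("once", False),
    "s": ("sorted", True), "ns": ("sorted", False),
    "cs": ("called_sorted", True), "ncs": ("called_sorted", False),
    "p": ("permissive", True), "np": ("permissive", False),
    "bg": ("background", True),
}


def classify_rxfilename(rxfilename):
    """Label an rxfilename (assumed Kaldi naming conventions)."""
    if rxfilename in ("", "-"):
        return "StandardInput"
    if rxfilename.endswith("|"):
        return "PipedInput"  # e.g. "gunzip -c feats.ark.gz |"
    _, sep, tail = rxfilename.rpartition(":")
    if sep and tail.isdigit():
        return "FileWithOffset"  # e.g. "feats.ark:1234"
    return "File"


def sketch_parse_input_path(path):
    """Return (table_type, rxfilename, rxfilename_type, options)."""
    prefix, sep, rest = path.partition(":")
    tokens = prefix.split(",") if sep else []
    kinds = [t for t in tokens if t in ("ark", "scp")]
    flags = [t for t in tokens if t not in ("ark", "scp")]
    # An rspecifier names exactly one of ark/scp and only known flags.
    if len(kinds) != 1 or any(t not in RSPECIFIER_FLAGS for t in flags):
        return "NotATable", path, classify_rxfilename(path), {}
    # Assumed defaults: every option is off unless its flag is present.
    options = {name: False for name, _ in RSPECIFIER_FLAGS.values()}
    for t in flags:
        name, value = RSPECIFIER_FLAGS[t]
        options[name] = value
    table_type = "ArchiveTable" if kinds[0] == "ark" else "ScriptTable"
    return table_type, rest, classify_rxfilename(rest), options
```

With this sketch, "ark,s,cs:feats.ark" is treated as an archive table whose options mark it as sorted and called-sorted, while "feats.ark:1234" has no valid table prefix and falls through to the rxfilename branch with an empty options dictionary.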
- pydrobert.kaldi.io.util.parse_kaldi_output_path(path)
Determine the characteristics of an output stream by its path
Returns a 4-tuple of the following information:
If path is not a wspecifier (TableType.NotATable):
    Classify path as a wxfilename
    return a tuple of (TableType, path, WxfilenameType, dict())
If path is an archive or script:
    Put all wspecifier options (binary, flush, permissive) into a dictionary
    Extract the embedded wxfilename and classify it
    return a tuple of (TableType, wxfilename, WxfilenameType, options)
If path contains both an archive and a script (TableType.BothTables):
    Put all wspecifier options (binary, flush, permissive) into a dictionary
    Extract both embedded wxfilenames and classify them
    return a tuple of (TableType, (arch_wxfilename, script_wxfilename), (arch_WxfilenameType, script_WxfilenameType), options)
- Parameters
path (str) – A string that would be passed to pydrobert.kaldi.io.open()
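The three cases above can be sketched in plain Python to show how the shape of the returned tuple changes. This is a rough illustration, not the library's code: WSPECIFIER_FLAGS, classify_wxfilename, and sketch_parse_output_path are hypothetical names, the string labels stand in for the TableType and WxfilenameType enum members, and the flag spellings (b, t, f, nf, p, np) and defaults (binary and flush on, permissive off) are assumed from Kaldi's usual wspecifier conventions.

```python
# Hypothetical sketch: mirrors the documented 4-tuple, but uses plain
# strings instead of pydrobert-kaldi's enums.
WSPECIFIER_FLAGS = {
    "b": ("binary", True), "t": ("binary", False),
    "f": ("flush", True), "nf": ("flush", False),
    "p": ("permissive", True), "np": ("permissive", False),
}


def classify_wxfilename(wxfilename):
    """Label a wxfilename (assumed Kaldi naming conventions)."""
    if wxfilename in ("", "-"):
        return "StandardOutput"
    if wxfilename.startswith("|"):
        return "PipedOutput"  # e.g. "| gzip -c > out.ark.gz"
    return "File"


def sketch_parse_output_path(path):
    """Return (table_type, wxfilename(s), wxfilename_type(s), options)."""
    prefix, sep, rest = path.partition(":")
    tokens = prefix.split(",") if sep else []
    kinds = [t for t in tokens if t in ("ark", "scp")]
    flags = [t for t in tokens if t not in ("ark", "scp")]
    is_table = kinds in (["ark"], ["scp"], ["ark", "scp"])
    if not is_table or any(t not in WSPECIFIER_FLAGS for t in flags):
        return "NotATable", path, classify_wxfilename(path), {}
    # Assumed defaults, overridden by any flags that are present.
    options = {"binary": True, "flush": True, "permissive": False}
    for t in flags:
        name, value = WSPECIFIER_FLAGS[t]
        options[name] = value
    if kinds == ["ark", "scp"]:
        # Both tables: the archive name comes first, then the script name.
        arch, _, script = rest.partition(",")
        types = (classify_wxfilename(arch), classify_wxfilename(script))
        return "BothTables", (arch, script), types, options
    table_type = "ArchiveTable" if kinds == ["ark"] else "ScriptTable"
    return table_type, rest, classify_wxfilename(rest), options
```

Note that only the both-tables case returns tuples in the second and third positions, so a caller needs to check the table type before unpacking them.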