pydrobert.kaldi.io.duck_streams

Submodule for reading and writing objects one at a time, like (un)packing C structs

class pydrobert.kaldi.io.duck_streams.KaldiInput(path, header=True)[source]

Bases: KaldiIOBase

A kaldi input stream from which objects can be read one at a time

Parameters
  • path (str) – An extended readable file path

  • header (bool) – If False, no attempt will be made to look for the “binary” header in the stream; the stream will be assumed to be binary

close()[source]

Close and flush the underlying IO object

This method has no effect if the file is already closed

read(kaldi_dtype, value_style='b', read_binary=None)[source]

Read in one object from the stream

Parameters
  • kaldi_dtype (KaldiDataType) – The type of object to read

  • value_style (Literal['b', 's', 'd']) – 'wm' readers can provide not only the audio buffer ('b') of a wave file, but its sampling rate ('s'), and/or duration (in sec, 'd'). Setting value_style to some combination of 'b', 's', and/or 'd' will cause the reader to return a tuple of that information. If value_style is only one character, the result will not be contained in a tuple

  • read_binary (bool, optional) – If set, the object will be read as either binary (True) or text (False). The default behaviour is to read according to the binary attribute. Ignored if there’s only one way to read the data
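The value_style packing rule can be pictured with a small, library-independent sketch. The helper select_wave_values below is hypothetical and not part of pydrobert.kaldi; it assumes the buffer is a list of channels (each a list of samples) and that the returned tuple follows the order of the characters in value_style.

```python
def select_wave_values(buffer, sample_rate, value_style="b"):
    """Hypothetical helper mimicking the documented value_style rule.

    Assumes buffer is a list of channels, each a list of samples.
    """
    if not value_style or any(c not in "bsd" for c in value_style):
        raise ValueError("value_style must combine 'b', 's', and/or 'd'")
    num_samples = len(buffer[0]) if buffer else 0
    available = {
        "b": buffer,                     # the audio buffer itself
        "s": sample_rate,                # sampling rate
        "d": num_samples / sample_rate,  # duration in seconds
    }
    # Assumption: values come back in the order requested.
    values = tuple(available[c] for c in value_style)
    # A single character yields the bare value rather than a 1-tuple.
    return values[0] if len(values) == 1 else values


buf = [[0, 1, 2, 3]]                         # one channel, four samples
only_buffer = select_wave_values(buf, 4.0)   # the buffer, not wrapped in a tuple
rate_and_duration = select_wave_values(buf, 4.0, "sd")  # (4.0, 1.0)
```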

readable()[source]

Return whether this object was opened for reading

writable()[source]

Return whether this object was opened for writing

class pydrobert.kaldi.io.duck_streams.KaldiOutput(path, header=True)[source]

Bases: KaldiIOBase

A kaldi output stream to which objects can be written one at a time

Parameters
  • path (str) – An extended writable file path

  • header (bool) – Whether to write a binary header when opening the stream (True) or not (False)

close()[source]

Close and flush the underlying IO object

This method has no effect if the file is already closed

readable()[source]

Return whether this object was opened for reading

writable()[source]

Return whether this object was opened for writing

write(obj, kaldi_dtype=None, error_on_str=True, write_binary=True)[source]

Write one object to the stream

Parameters
  • obj (Any) – The object to write

  • kaldi_dtype (Optional[KaldiDataType]) – The type of object to write

  • error_on_str (bool) – Token vectors ('tv') accept sequences of whitespace-free ASCII/UTF strings. A str is also a sequence of characters, which may satisfy the token requirements. If error_on_str is True, a ValueError is raised when writing a str as a token vector. Otherwise a str can be written

  • write_binary (bool) – The object will be written as binary (True) or text (False)

Raises

ValueError – If unable to determine a proper data type

See also

pydrobert.kaldi.io.util.infer_kaldi_data_type

Illustrates how different inputs are mapped to data types
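The ambiguity that error_on_str guards against in write() is that a str iterates as a sequence of one-character strings, so it can pass for a token vector by accident. The validator below is a hypothetical sketch of that rule, not the library's implementation; check_token_vector is an illustrative name.

```python
def check_token_vector(obj, error_on_str=True):
    """Hypothetical validator mirroring the documented error_on_str rule."""
    if isinstance(obj, str) and error_on_str:
        # A bare str is probably a mistake: the caller likely meant [obj].
        raise ValueError("refusing to write a bare str as a token vector")
    tokens = list(obj)
    for token in tokens:
        if not isinstance(token, str) or any(ch.isspace() for ch in token):
            raise ValueError(f"not a whitespace-free token: {token!r}")
    return tokens


tokens = check_token_vector(["hello", "world"])          # accepted as two tokens
chars = check_token_vector("abc", error_on_str=False)    # ['a', 'b', 'c']
```

With error_on_str left at True, passing "abc" raises ValueError instead of silently writing three single-character tokens.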

pydrobert.kaldi.io.duck_streams.open_duck_stream(path, mode='r', header=True)[source]

Open a “duck” stream

“Duck” streams provide an interface for reading or writing kaldi objects, one at a time. Essentially: remember the order things go in, then pull them out in the same order.

Duck streams can read/write binary or text data. It is mostly up to the user how to read or write data, though the following rules establish the default:

  1. An input stream that does not look for a ‘binary header’ is binary

  2. An input stream that looks for and finds a binary header when opening is binary

  3. An input stream that looks for but does not find a binary header when opening is a text stream

  4. An output stream is always binary. However, the user may choose not to write a binary header. If the result is later opened as an input stream that looks for a header, rule 3 applies and it is treated as a text stream
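The default rules for input streams can be condensed into a few lines. This is a sketch rather than the library's code: both function names are hypothetical, and it assumes the binary header is the two-byte marker (a NUL byte followed by "B") that Kaldi uses to flag binary data.

```python
def starts_with_binary_header(first_bytes):
    """Assumed check: Kaldi binary data begins with NUL followed by 'B'."""
    return first_bytes[:2] == b"\x00B"


def input_is_binary(check_header, header_found=False):
    """Default mode of an input stream under rules 1 to 3."""
    if not check_header:
        return True       # rule 1: no check, assume binary
    return header_found   # rule 2 (found: binary) and rule 3 (missing: text)


# Rule 4 in action: data written without a header, then read back with
# header checking enabled, is treated as text.
headerless = b"payload written with header=False"
treated_as_binary = input_is_binary(True, starts_with_binary_header(headerless))  # False
```

In that last case the caller can still force binary reads by passing read_binary=True to read().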

Parameters
  • path (str) – The extended file name to be opened. This can be quite exotic. More details can be found on the Kaldi website.

  • mode (Literal['r', 'r+', 'w']) – Whether to open the stream for input ('r') or output ('w'). 'r+' is equivalent to 'r'

  • header (bool) – Setting this to True will either check for a ‘binary header’ in an input stream, or write a binary header for an output stream. If False, no check/write is performed
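The "same order in, same order out" contract can be sketched as a round trip. To keep the example runnable without Kaldi installed, the opener is passed in as an argument; round_trip is a hypothetical helper, and open_stream is assumed to behave like open_duck_stream(path, mode, header), returning objects with the write, read, and close methods documented above.

```python
def round_trip(open_stream, path, items):
    """Write (obj, kaldi_dtype) pairs to path, then read them back in order.

    open_stream is assumed to have the signature of open_duck_stream.
    """
    out = open_stream(path, mode="w", header=True)
    try:
        for obj, kaldi_dtype in items:
            out.write(obj, kaldi_dtype)
    finally:
        out.close()  # flush before reopening for input

    inp = open_stream(path, mode="r", header=True)
    try:
        # Types must be requested in the order the objects were written;
        # the stream itself does not record them.
        return [inp.read(kaldi_dtype) for _, kaldi_dtype in items]
    finally:
        inp.close()
```

With the library installed, passing open_duck_stream as open_stream and a list such as [(["hello", "world"], "tv")] as items would exercise the real streams. The exact Python type that read returns for each data type is not specified here.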