Command-Line Interface ====================== write-table-to-pickle --------------------- :: write-table-to-pickle -h usage: write-table-to-pickle [-h] [-v VERBOSE] [--config CONFIG] [--print-args PRINT_ARGS] [-i IN_TYPE] [-o OUT_TYPE] rspecifier value_out [key_out] Write a kaldi table to pickle file(s) The inverse is write-pickle-to-table positional arguments: rspecifier The table to read value_out A path to write (key,value) pairs to, or just values if key_out was set. If it ends in ".gz", the file will be gzipped key_out A path to write keys to. If it ends in ".gz", the file will be gzipped optional arguments: -h, --help show this help message and exit -v VERBOSE, --verbose VERBOSE Verbose level (higher->more logging) --config CONFIG --print-args PRINT_ARGS -i IN_TYPE, --in-type IN_TYPE The type of kaldi data type to read. Defaults to base matrix -o OUT_TYPE, --out-type OUT_TYPE The numpy data type to cast values to. The default is dependent on the input type. String types will be written as (tuples of) strings write-pickle-to-table --------------------- :: write-pickle-to-table -h usage: write-pickle-to-table [-h] [-v VERBOSE] [--config CONFIG] [--print-args PRINT_ARGS] [-o OUT_TYPE] value_in [key_in] wspecifier Write pickle file(s) contents to a table The inverse is write-table-to-pickle positional arguments: value_in A path to read (key,value) pairs from, or just values if key_in was set. If it ends in ".gz", the file is assumed to be gzipped key_in A path to read keys from. If it ends in ".gz", the file is assumed to be gzipped wspecifier The table to write to optional arguments: -h, --help show this help message and exit -v VERBOSE, --verbose VERBOSE Verbose level (higher->more logging) --config CONFIG --print-args PRINT_ARGS -o OUT_TYPE, --out-type OUT_TYPE The type of kaldi data type to read. Defaults to base matrix compute-error-rate ------------------ :: compute-error-rate -h usage: compute-error-rate [-h] [-v VERBOSE] [--config CONFIG] [--print-args PRINT_ARGS] [--print-tables PRINT_TABLES] [--strict STRICT] [--insertion-cost INSERTION_COST] [--deletion-cost DELETION_COST] [--substitution-cost SUBSTITUTION_COST] [--include-inserts-in-cost INCLUDE_INSERTS_IN_COST] [--report-accuracy REPORT_ACCURACY] ref_rspecifier hyp_rspecifier [out_path] Compute error rates between reference and hypothesis token vectors Two common error rates in speech are the word (WER) and phone (PER), though the computation is the same. Given a reference and hypothesis sequence, the error rate is error_rate = (substitutions + insertions + deletions) / (ref_tokens * 100) Where the number of substitutions (e.g. "A B C -> A D C"), deletions (e.g. "A B C -> A C"), and insertions (e.g. "A B C -> A D B C") are determined by Levenshtein distance. positional arguments: ref_rspecifier Rspecifier pointing to reference (gold standard) transcriptions hyp_rspecifier Rspecifier pointing to hypothesis transcriptions out_path Path to print results to. Default is stdout. optional arguments: -h, --help show this help message and exit -v VERBOSE, --verbose VERBOSE Verbose level (higher->more logging) --config CONFIG --print-args PRINT_ARGS --print-tables PRINT_TABLES If set, will print breakdown of insertions, deletions, and subs to out_path --strict STRICT If set, missing utterances will cause an error --insertion-cost INSERTION_COST Cost (in terms of edit distance) to perform an insertion --deletion-cost DELETION_COST Cost (in terms of edit distance) to perform a deletion --substitution-cost SUBSTITUTION_COST Cost (in terms of edit distance) to perform a substitution --include-inserts-in-cost INCLUDE_INSERTS_IN_COST Whether to include insertions in error rate calculations --report-accuracy REPORT_ACCURACY Whether to report accuracy (1 - error_rate) instead of the error rate normalize-feat-lens ------------------- :: normalize-feat-lens -h usage: normalize-feat-lens [-h] [-v VERBOSE] [--config CONFIG] [--print-args PRINT_ARGS] [--type TYPE] [--tolerance TOLERANCE] [--strict STRICT] [--pad-mode {zero,constant,edge,symmetric,mean}] [--side {left,right,center}] feats_in_rspecifier len_in_rspecifier feats_out_wspecifier Ensure features match some reference lengths Incoming features are either clipped or padded to match reference lengths (stored as an int32 table), if they are within tolerance. positional arguments: feats_in_rspecifier The features to be normalized len_in_rspecifier The reference lengths (int32 table) feats_out_wspecifier The output features optional arguments: -h, --help show this help message and exit -v VERBOSE, --verbose VERBOSE Verbose level (higher->more logging) --config CONFIG --print-args PRINT_ARGS --type TYPE The kaldi type of the input/output features --tolerance TOLERANCE How many frames deviation from reference to tolerate before error. The default is to be infinitely tolerant (a feat I'm sure we all desire) --strict STRICT Whether missing keys in len_in and lengths beyond the threshold cause an error (true) or are skipped with a warning (false) --pad-mode {zero,constant,edge,symmetric,mean} If frames are being padded to the features, specify how they should be padded. zero=zero pad, edge=pad with rightmost frame, symmetric=pad with reverse of frame edges, mean=pad with mean feature values --side {left,right,center} If an utterance needs to be padded or truncated, specify what side of the utterance to do this on. left=beginning, right=end, center=distribute evenly on either side write-table-to-torch-dir ------------------------ :: write-table-to-torch-dir -h usage: write-table-to-torch-dir [-h] [-v VERBOSE] [--config CONFIG] [--print-args PRINT_ARGS] [-i IN_TYPE] [-o {float,double,half,byte,char,short,int,long}] [--file-prefix FILE_PREFIX] [--file-suffix FILE_SUFFIX] rspecifier dir Write a Kaldi table to a series of PyTorch data files in a directory Writes to a folder in the format: folder/ ... The contents of the file "" will be a PyTorch tensor corresponding to the entry in the table for "" positional arguments: rspecifier The table to read dir The folder to write files to optional arguments: -h, --help show this help message and exit -v VERBOSE, --verbose VERBOSE Verbose level (higher->more logging) --config CONFIG --print-args PRINT_ARGS -i IN_TYPE, --in-type IN_TYPE The type of table to read -o {float,double,half,byte,char,short,int,long}, --out-type {float,double,half,byte,char,short,int,long} The type of torch tensor to write. If unset, it is inferrred from the input type --file-prefix FILE_PREFIX The file prefix indicating a torch data file --file-suffix FILE_SUFFIX The file suffix indicating a torch data file write-torch-dir-to-table ------------------------ :: write-torch-dir-to-table -h usage: write-torch-dir-to-table [-h] [-v VERBOSE] [--config CONFIG] [--print-args PRINT_ARGS] [-o OUT_TYPE] [--file-prefix FILE_PREFIX] [--file-suffix FILE_SUFFIX] dir wspecifier Write a data directory containing PyTorch data files to a Kaldi table Reads from a folder in the format: folder/ ... Where each file contains a PyTorch tensor. The contents of the file "" will be written as a value in a Kaldi table with key "" positional arguments: dir The folder to read files from wspecifier The table to write to optional arguments: -h, --help show this help message and exit -v VERBOSE, --verbose VERBOSE Verbose level (higher->more logging) --config CONFIG --print-args PRINT_ARGS -o OUT_TYPE, --out-type OUT_TYPE The type of table to write to --file-prefix FILE_PREFIX The file prefix indicating a torch data file --file-suffix FILE_SUFFIX The file suffix indicating a torch data file