pydrobert.kaldi.eval.util
Utilities for evaluation
- pydrobert.kaldi.eval.util.edit_distance(ref, hyp, insertion_cost=1, deletion_cost=1, substitution_cost=1, return_tables=False)
Levenshtein (edit) distance
- Parameters:
ref (Sequence) – Sequence of tokens of reference text (source)
hyp (Sequence) – Sequence of tokens of hypothesis text (target)
insertion_cost (int) – Penalty for hyp inserting a token to ref
deletion_cost (int) – Penalty for hyp deleting a token from ref
substitution_cost (int) – Penalty for hyp swapping tokens in ref
return_tables (bool) – See below
- Returns:
distances (int or (int, dict, dict, dict, dict)) – Returns the edit distance of hyp from ref. If return_tables is True, this returns a tuple of the edit distance, a dict of insertion counts, a dict of deletion counts, a dict of substitution counts per ref token, and a dict of counts of ref tokens. Any tokens with count 0 are excluded from the dictionaries.
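
To illustrate how the three cost parameters interact, here is a minimal, self-contained sketch of a weighted Levenshtein distance over token sequences. It is not the library's implementation: the name `edit_distance_sketch` is hypothetical, it does not cover return_tables, and it assumes that matching tokens cost nothing and that the result is the minimum total penalty.

```python
def edit_distance_sketch(ref, hyp, insertion_cost=1, deletion_cost=1,
                         substitution_cost=1):
    """Weighted Levenshtein distance (illustrative sketch, not the library code)."""
    # prev[j]: cheapest way to turn the ref prefix seen so far into hyp[:j].
    # With an empty ref prefix, every hyp token must be inserted.
    prev = [j * insertion_cost for j in range(len(hyp) + 1)]
    for i, ref_tok in enumerate(ref, start=1):
        # With an empty hyp prefix, every ref token must be deleted.
        cur = [i * deletion_cost]
        for j, hyp_tok in enumerate(hyp, start=1):
            # Matching tokens are assumed to be free; otherwise pay for a swap.
            swap = prev[j - 1] + (0 if ref_tok == hyp_tok else substitution_cost)
            delete = prev[j] + deletion_cost      # hyp dropped ref_tok
            insert = cur[j - 1] + insertion_cost  # hyp added hyp_tok
            cur.append(min(swap, delete, insert))
        prev = cur
    return prev[-1]


ref = "the cat sat".split()
hyp = "the cat sat down".split()
print(edit_distance_sketch(ref, hyp))  # 1: one inserted token
print(edit_distance_sketch("kitten", "sitting"))  # 3
# A substitution costing more than a deletion plus an insertion is never chosen.
print(edit_distance_sketch("kitten", "sitting", substitution_cost=3))  # 5
```

Any sequence of hashable or comparable tokens works here, so strings are compared character by character and lists of words are compared word by word.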