pydrobert.kaldi.eval.util
Utilities for evaluation
- pydrobert.kaldi.eval.util.edit_distance(ref, hyp, insertion_cost=1, deletion_cost=1, substitution_cost=1, return_tables=False)
Levenshtein (edit) distance
- Parameters
ref (Sequence) – Sequence of tokens of reference text (source)
hyp (Sequence) – Sequence of tokens of hypothesis text (target)
insertion_cost (int) – Penalty for hyp inserting a token to ref
deletion_cost (int) – Penalty for hyp deleting a token from ref
substitution_cost (int) – Penalty for hyp swapping tokens in ref
return_tables (bool) – See below
- Returns
distances (int or (int, dict, dict, dict, dict)) – Returns the edit distance of hyp from ref. If return_tables is True, this returns a tuple of the edit distance, a dict of insertion counts, a dict of deletion counts, a dict of substitution counts per ref token, and a dict of counts of ref tokens. Any tokens with count 0 are excluded from the dictionaries.
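To illustrate how the three cost parameters enter the distance, here is a minimal pure-Python sketch of a weighted Levenshtein computation. It is not the library's implementation: the name edit_distance_sketch is hypothetical, it covers only the return_tables=False case, and it assumes tokens can be compared with ==.

```python
from typing import Hashable, Sequence


def edit_distance_sketch(
    ref: Sequence[Hashable],
    hyp: Sequence[Hashable],
    insertion_cost: int = 1,
    deletion_cost: int = 1,
    substitution_cost: int = 1,
) -> int:
    """Weighted Levenshtein distance using a rolling one-row table.

    Hypothetical helper for illustration only; not part of pydrobert.kaldi.
    """
    # prev[j] holds the cost of turning the ref prefix processed so far
    # into the first j hyp tokens. With no ref tokens, that is j insertions.
    prev = [j * insertion_cost for j in range(len(hyp) + 1)]
    for i, ref_tok in enumerate(ref, start=1):
        # Turning i ref tokens into an empty hyp takes i deletions.
        cur = [i * deletion_cost]
        for j, hyp_tok in enumerate(hyp, start=1):
            match_or_sub = prev[j - 1] + (
                0 if ref_tok == hyp_tok else substitution_cost
            )
            delete = prev[j] + deletion_cost      # ref token missing from hyp
            insert = cur[j - 1] + insertion_cost  # extra token in hyp
            cur.append(min(match_or_sub, delete, insert))
        prev = cur
    return prev[-1]


ref = "the cat sat on the mat".split()
hyp = "the cat sit on mat".split()
distance = edit_distance_sketch(ref, hyp)
print(distance)  # 2: one substitution (sat -> sit) and one deletion (the)
```

Raising substitution_cost to 2 makes a substitution no cheaper than a deletion followed by an insertion, which is one way to see how the penalties trade off against each other.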