Speech Recognition — Weighted Finite-State Transducers (WFST)

An HMM model with bigram words

Overview

Finite State Automaton (Finite State Machine)

Semirings

Finite-State Acceptor (FSA)

Weighted Finite-State Transducers (WFST)

Operations

Composition

Determinization

A WFST is deterministic when:
  • no two arcs leaving the same state (node) share the same input label, and
  • no arc has an empty (ε) input label (although some toolkits, such as OpenFst, allow this by treating ε as an ordinary symbol).
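
The two conditions above can be checked mechanically. The following is a minimal sketch, not a toolkit API: the `Arc` tuple, the `EPS` marker, and the `is_deterministic` helper are hypothetical names introduced only for illustration. The `allow_epsilon` flag mimics the relaxed behaviour mentioned for some toolkits by treating the empty label like any other symbol.

```python
from collections import namedtuple

# Hypothetical arc representation for illustration only (not OpenFst's API).
Arc = namedtuple("Arc", "src ilabel olabel weight dst")
EPS = "<eps>"  # assumed marker for the empty label


def is_deterministic(arcs, allow_epsilon=False):
    """Check the two determinism conditions on a list of arcs.

    1. No state has two outgoing arcs with the same input label.
    2. No arc has an empty input label, unless allow_epsilon is True,
       in which case the empty label is treated as an ordinary symbol.
    """
    seen = set()
    for arc in arcs:
        if arc.ilabel == EPS and not allow_epsilon:
            return False
        key = (arc.src, arc.ilabel)
        if key in seen:
            return False
        seen.add(key)
    return True


# State 0 has two arcs labelled "a": not deterministic.
nondet = [Arc(0, "a", "x", 0.5, 1), Arc(0, "a", "y", 1.5, 2)]
# Distinct input labels from state 0: deterministic.
det = [Arc(0, "a", "x", 0.5, 1), Arc(0, "b", "y", 1.5, 2)]
# A single arc with an empty input label.
with_eps = [Arc(0, EPS, "x", 0.0, 1)]
```

Determinization itself (building an equivalent deterministic machine) is a separate and more involved algorithm; this sketch only tests whether a given machine already satisfies the property.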

Minimization

More information (optional)

WFST with ASR

Source of the example
The decoding graph is the composition H ∘ C ∘ L ∘ G, where:
  • G is the grammar/language model acceptor.
  • L is the pronunciation lexicon (it maps phone sequences to words).
  • C is the context-dependency relabeling (it converts context-dependent phones to context-independent phones).
  • H is the HMM structure (it maps HMM state sequences to context-dependent phones).
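
To make the cascade concrete, here is a minimal sketch of composing a toy lexicon L with a toy grammar G. It rests on several assumptions that are not from the article: the `Arc` tuple and `compose` function are hypothetical, weights are combined by addition (as in the tropical semiring), and neither machine contains ε labels. Real lexicons do use ε outputs, and toolkits such as OpenFst handle them with additional machinery that this sketch omits. The labels and weights are made up.

```python
from collections import namedtuple

# Hypothetical arc representation for illustration only.
Arc = namedtuple("Arc", "src ilabel olabel weight dst")


def compose(arcs_a, start_a, finals_a, arcs_b, start_b, finals_b):
    """Epsilon-free composition A ∘ B.

    An arc of A is paired with an arc of B whenever A's output label
    equals B's input label. Weights are added (tropical semiring).
    States of the result are pairs (state of A, state of B).
    """
    out_a, out_b = {}, {}
    for arc in arcs_a:
        out_a.setdefault(arc.src, []).append(arc)
    for arc in arcs_b:
        out_b.setdefault(arc.src, []).append(arc)

    start = (start_a, start_b)
    stack, visited, arcs = [start], {start}, []
    while stack:
        qa, qb = stack.pop()
        for a in out_a.get(qa, []):
            for b in out_b.get(qb, []):
                if a.olabel != b.ilabel:
                    continue
                dst = (a.dst, b.dst)
                arcs.append(
                    Arc((qa, qb), a.ilabel, b.olabel, a.weight + b.weight, dst)
                )
                if dst not in visited:
                    visited.add(dst)
                    stack.append(dst)
    finals = {s for s in visited if s[0] in finals_a and s[1] in finals_b}
    return arcs, start, finals


# Toy L: the phone "ay" can be read as the word "I" or the word "eye".
L_arcs = [Arc(0, "ay", "I", 0.0, 0), Arc(0, "ay", "eye", 0.5, 0)]
# Toy G: a one-word grammar that prefers "I" (lower cost) over "eye".
G_arcs = [Arc(0, "I", "I", 0.25, 1), Arc(0, "eye", "eye", 2.0, 1)]

LG_arcs, LG_start, LG_finals = compose(L_arcs, 0, {0}, G_arcs, 0, {1})
best = min(LG_arcs, key=lambda arc: arc.weight)  # cheapest phone-to-word arc
```

The composed machine reads phones and emits words, with each arc carrying the combined lexicon and grammar cost. Composing C and then H on the left in the same way would, under the same simplifying assumptions, yield a machine that reads HMM states and emits words.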

Decoder

FSA U with 3 audio frames

Applications

Next

Credit & reference
