Software

  • The Festival speech synthesis system - Developed at Edinburgh, Festival offers a general framework for building speech synthesis systems and includes examples of various modules. As a whole it offers full text-to-speech through a number of APIs: from shell level, through a Scheme command interpreter, as a C++ library, from Java, and through an Emacs interface. Festival is multilingual (currently English (British and American) and Spanish), though English is the most advanced. Other groups release new languages for the system. Full tools and documentation for building new voices are available through Carnegie Mellon's FestVox project (http://festvox.org). Festival is free software: Festival and the speech tools are distributed under an X11-type licence that allows unrestricted commercial and non-commercial use.
  • The HTK speech recognition toolkit - Developed at Cambridge, the Hidden Markov Model Toolkit (HTK) is a portable toolkit for building and manipulating hidden Markov models. HTK is primarily used for speech recognition research, although it has been used for numerous other applications, including research into speech synthesis, character recognition and DNA sequencing. HTK is in use at hundreds of sites worldwide.
  • The HTS HMM-based speech synthesis system - Developed at Nagoya Institute of Technology, HTS is maintained by the HTS working group and others. The training part of HTS is implemented as a modified version of HTK and released as patch code to HTK. The patch code is released under a free software licence.
  • The Kaldi speech recognition toolkit - Written in C++ and licensed under the Apache License v2.0, Kaldi is intended for use by speech recognition researchers. Kaldi is similar in aims and scope to HTK. The goal is to have modern and flexible code that is easy to modify and extend. Important features include: code-level integration with Finite State Transducers (FSTs); extensive linear algebra support; extensible design; open licence; complete speech recognition recipes.