ref: 0f56c7165a95c158a0eac16a3b78721664a4e7cb
parent: e7e4868ad7a65eb6ed2966dd5048c4bfeb14f23f
author: Alan W Black <awb@cs.cmu.edu>
date: Sat Oct 21 06:14:34 EDT 2017
make it look better in github page
--- a/README.md
+++ b/README.md
@@ -5,7 +5,7 @@
All rights reserved
http://cmuflite.org
-Flite is an open source small fast run-time text to speeh engine. It
+Flite is an open source small fast run-time text to speech engine. It
is the latest addition to the suite of free software synthesis tools
including University of Edinburgh's Festival Speech Synthesis System
and Carnegie Mellon University's FestVox project, tools, scripts and
@@ -28,40 +28,57 @@
o Flite is designed for very small devices, such as PDAs, and also
for large server machines which need to serve lots of ports.
+
o Flite is not a replacement for Festival but an alternative run time
engine for voices developed in the FestVox framework where size and
speed are crucial.
+
o Flite is all in ANSI C, it contains no C++ or Scheme, thus requires
more care in programming, and is harder to customize at run time.
+
o It is thread safe
+
o Voices, lexicons and language descriptions can be compiled
(mostly automatically for voices and lexicons) into C representations
from their FestVox formats
+
o All voices, lexicons and language model data are const and in the
text segment (i.e. they may be put in ROM). As they are linked in
at compile time, there is virtually no startup delay.
+
o Although the synthesized output is not exactly the same as the same
voice in Festival they are effectively equivalent. That is, flite
doesn't sound better or worse than the equivalent voice in festival,
just faster, smaller and scalable.
+
o For standard diphone voices, maximum run time memory
requirements are approximately less than twice the memory requirement
for the waveform generated. For 32bit architectures
this effectively means under 1M.
+
o The flite program supports synthesis of individual strings or files
(utterance by utterance) to direct audio devices or to waveform files.
+
o The flite library offers simple functions suitable for use in specific
applications.
+
Flite is distributed with a single 8K diphone voice (derived from the
cmu_us_kal voice), a pruned lexicon (derived from
cmulex) and a set of models for US English. Here are comparisons
with Festival using basically the same 8KHz diphone voice
+
Flite Festival
+
core code 60K 2.6M
+
USEnglish 100K ??
+
lexicon 600K 5M
+
diphone 1.8M 2.1M
+
runtime <1M 16-20M
+
On a 500Mhz PIII, a timing test of the first two chapters of
"Alice in Wonderland" (doc/alice) was done. This produces about
@@ -75,9 +92,11 @@
o A good C compiler, some of these files are quite large and some C
compilers might choke on these, gcc is fine. Sun CC 3.01 has been
tested too. Visual C++ 6.0 is known to fail on the large diphone
- database files. We recommend you use GCC under bash for Windows,
+ database files. We recommend you use GCC under Windows Subsystem for Linux,
Cygwin or mingw32 instead.
+
o GNU Make
+
o An audio device isn't required as flite can write its output to
a waveform file.
@@ -87,14 +106,23 @@
o Various Intel Linux systems (and iPaq Linux), under various versions
of GCC (2.7.2 to 6.x)
+
o Mac OS X
+
o Various Android devices
+
o Various openwrt devices
+
o FreeBSD 3.x and 4.x
+
o Solaris 5.7, and Solaris 9
+
o Windows 2000/XP and later under Cygwin 1.3.5 and later
+
o Windows 10 with bash on ubuntu on Windows
+
o Successfully compiles and runs under 64Bit Linux architectures
+
o OSF1 V4.0 (gives an unimportant warning about sizes when compiling cst_val.c)
Previously we supported PalmOS and Windows CE but these seem to be rare
@@ -111,72 +139,113 @@
----
New in 2.1 (Oct 2017)
+
o Improved Indic front end support (thanks to Suresh Bazaj @ Hear2Read)
+
o 18 English Voices (various accents)
+
o 12 Indian Voices (Bengali, Gujarati, Hindi, Kannada, Marathi, Panjabi
Tamil and Telugu) usually with bilingual (with English) support
+
o Can do byteswap architectures [again] (ar9331 yun arduino, zsun etc)
+
o flitecheck front-end test suite
+
o grapheme based festvox builds give working flitevox voices
+
o SAPI support for CG voices (thanks to Alok Parlikar @ Cobalt Speech and
Language INC)
+
o gcc 6.x support
+
o .flitevox files (and models) 40% of previous size, but same quality
New in 2.0.0 (Dec 2014)
o Indic language support (Hindi, Tamil and Telugu)
+
o SSML support
+
o CG voices as files accessible by file:/// and http://
(and set of 13 voices to load)
+
o random forest (multimodel support) improves voice quality
+
o Supports different sample rates/mgc order to tune for speed
+
o Kal diphone 500K smaller
+
o Fixed lots of API issues
+
o thread safe (again) [after initialization]
+
o Generalized tokenstreams (used in Bard Storyteller)
+
o simple-Pulseaudio support
+
o Improved Android support
+
o Removed PalmOS support from distribution
+
o Companion multilingual ebook reader Bard Storyteller
- http://festvox.org/bard/
+ https://github.com/festvox/bard
New in 1.4.1 (March 2010)
+
o better ssml support (actually does something)
+
o better clunit support (smaller)
+
o Android support
New in 1.4 (December 2009)
o crude multi-voice selection support (may change)
+
o 4 basic voices are included 3 clustergen (awb, rms and slt) plus
the kal diphone database
+
o CMULEX now uses maximum onset for syllabification
+
o alsa support
+
o Clustergen support (including mlpg with mixed excitation)
But is still slow on limited processors
+
o Windows support with Visual Studio (specifically for the Olympus
Spoken Dialog System)
+
o WinCE support is redone with cegcc/mingw32ce with example
TTS app: Flowm: Flite on Windows Mobile
+
o Speed-ups in feature interpretation limiting calls to alloc
+
o Speed-ups (and fixes) for converting clunits festvox voices
New in 1.3-release (October 2005)
+
o fixes to lpc residual extraction to give better quality output
+
o An updated lexicon (festlex_CMU from festival-2.0.95) and better
- compression its about 30% of the previous size, with about
+ compression its about 30% of the previous size, with about
the same accuracy
- o Fairly substantial code movements to better support PalmOS and
+ o Fairly substantial code movements to better support PalmOS and
multi-platform cross compilation builds
+
o A PalmOS 5.0 port with a small example talking app ("flop")
+
o runs under ix86_64 linux
New in 1.2-release (February 2003)
+
o A build process for diphone and clunits/ldom voices
FestVox voices can be converted (sometimes) automatically
+
o Various bug fixes
+
o Initial support for Mac OS X (not talking to audio device yet)
but compiles and runs
+
o Text files can be synthesized to a single audio file
+
o (optional) shared library support (Linux)
Compilation
@@ -184,13 +253,21 @@
In general
- tar zxvf flite-2.0.0-release.tar.gz
- cd flite-2.0.0-release
+ tar zxvf flite-2.1-current.tar.gz
+
+ cd flite-2.1-current
./configure
make
Where tar is gnu tar (gtar), and make is gnu make (gmake).
+Or
+
+ git clone http://github.com/festvox/flite
+ cd flite
+ ./configure
+ make
+
Configuration should be automatic, but maybe doesn't work in all cases
especially if you have some new compiler. You can explicitly set the
compiler in config/config and add any options you see fit. Configure
@@ -259,32 +336,49 @@
debugging. Some typical examples are
./bin/flite --sets join_type=simple_join doc/intro
+
Use simple concatenation of diphones without prosodic modification
+
./bin/flite -pw doc/alice
- Print sentences as they are said
+
+ Print sentences as they are said
+
./bin/flite --setf duration_stretch=1.5 doc/alice
+
Make it speak slower
+
./bin/flite --setf int_f0_target_mean=145 doc/alice
+
Make it speak higher
The talking clock is an example talking clode as discussed on
http://festvox.org/ldom it requires a single argument HH:MM
under Unix you can call it
+
./bin/flite_time `date +%H:%M`
-./bin/flite -lv
+./bin/flite -lv
+
List the voices linked in directly in this build
./bin/flite -voice rms -f doc/alice
+
Speak with the US male rms voice
+
./bin/flite -voice awb -f doc/alice
+
Speak with the "Scottish" male awb voice
+
./bin/flite -voice slt -f doc/alice
+
Speak with the US female slt voice
./bin/flite -voice http://festvox.org/flite/voices/US/cmu_us_aew.flitevox -f doc/alice
+
Speak with AEW voice, download on the fly from festvox.org
+
./bin/flite -voice voices/cmu_us_ahw.flitevox -f doc/alice
+
Speak with AHW voice loaded from the local file.
Voice names are identified as loadable files if the name includes a
@@ -313,8 +407,10 @@
process.
We expect that often voices will be loaded from external files, and we
-have now set up a voice repository on
+have now set up a voice repository on
+
http://festvox.org/flite/voices/LANG/*.flitevox
+
If you visit there with a browser you can hear the examples. You can
also download the .flitevox files to your machine so you don't need a
network connection every time you need to load a voice.
@@ -322,8 +418,8 @@
We are now actively adding to this list of available voices in English (16)
and other languages.
-Bard Storyteller: http://festvox.org/bard/
--------------------------------------------
+Bard Storyteller: https://github.com/festvox/bard
+--------------------------------------------------
Bard is a companion app that reads ebooks, both displaying them and
actually reading them to you out loud using flite. Bard supports a