This section describes how the several versions of Sphinx are
currently used. Treat it as a set of guidelines, and keep in mind
that there is no definitive answer.
Comparison
We have some regression tests comparing Sphinx4 to s3 (flat decoder)
and to s3.3 (fast decoder) on several different tasks, ranging from
digits only to medium-large vocabulary. s3 (flat decoder) is often the
most accurate, but Sphinx4 is both faster and more accurate than s3.3
in some of these tests.
If you're familiar with ARPA evaluations (using databases available
via the Linguistic Data Consortium (LDC) at U. Penn), you can find
WSJ 5k and RM1 results in the "Medium vocab" table, and WSJ 20k
results in the "Large vocab" table.
Which version to use depends on how familiar you are with C (Sphinx3)
or Java (Sphinx4), and on how easily each one integrates into your
system.