The Sphinx Group at Carnegie Mellon University is committed to
releasing the long-running, DARPA-funded Sphinx projects widely, in
order to stimulate the creation of speech-using tools and applications
and to advance the state of the art both in speech recognition itself
and in related areas, including dialog systems and speech synthesis.
The Sphinx Group has been supported for many years by funding from
the Defense Advanced Research Projects Agency, and the recognition
engines to be released are those that the group used for the various
DARPA projects and their respective evaluations.
Recent support for the project has also come from Telefónica I & D,
Sun Microsystems, and Mitsubishi Electric Research Labs.
The licensing terms for the Sphinx engines and tools are derived
from the BSD license and are based, in particular, on the license for
the Apache web server. There is no restriction against commercial use
or redistribution. (License terms for CMU Sphinx)
The packages that the CMU Sphinx Group is releasing are a set of
reasonably mature, world-class speech components that provide a basic
level of technology to anyone interested in creating speech-using
applications without the once-prohibitive initial investment cost in
research and development; the same components are open to peer review
by all researchers in the field, and are used for linguistic research
as well.
Note, however, that Sphinx is not a final product. Those with a
certain level of expertise can achieve great results with the versions
of Sphinx available here, but a naive user will certainly need further
help. In other words, the software available here is meant for expert
users, not for users with no experience in speech.
This site will be the canonical location for the release of the
Sphinx trainers, recognizers, acoustic and language models, and
documentation.
Try a System
If you'd like to have a chance to try out an application that
uses CMU Sphinx, try one of these.
-
Roomline, a
system that handles conference room reservations within CMU. You can
reach it at the toll-free number 1-877-CMU-PLAN (1-877-268-7526) or at
+1 412 268 1084.
Note that your call will be recorded for development purposes
and may be shared with other researchers. We do not yet have a policy
for placing such recordings into a publicly available database, so
there is no guarantee that this data will become publicly available,
though we're motivated to set that up in the future.
-
Let's Go, a spoken dialog system for the general public. The Let's
Go! project works in the domain of bus information, providing
schedules and route information for the city of Pittsburgh's Port
Authority Transit (PAT) buses. You can interact with a version of this
system right now by calling 412-268-3526 (requires some knowledge of
Pittsburgh's transit system).
Bug Tracking and Discussion Groups
There are also fora for bug tracking and discussion on the
SourceForge site. Please go there to get help, ask questions, report
bugs, and see the latest work. The work is currently pre-version 1.0,
so there is a lot yet to be done.
There is also an IRC
channel (#cmusphinx on irc.freenode.net) for real-time discussion.
Platforms
-
GNU/Linux, Unix variants, and Windows NT or later