Author Archive for admin



As it comes with the Sphinx-4 package, the HelloWorld.jar example only recognizes the following words:
(Good morning | Hello)
( Bhiksha | Evandro | Paul | Philip | Rita | Will )
To expand this vocabulary, we need to modify a grammar file that the JSGFGrammar class imports, and then we need to rebuild the HelloWorld.jar using ant.
In […]
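To make the size of that default vocabulary concrete, here is a minimal Java sketch that lists every phrase the two groups above allow. It assumes the grammar pairs one greeting with one name, which is how the two groups read. The class name `HelloPhrases` and its members are made up for illustration and are not part of the Sphinx-4 API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this does not use or depend on Sphinx-4.
final class HelloPhrases {

    // The two groups of alternatives quoted from the HelloWorld demo.
    static final String[] GREETINGS = {"Good morning", "Hello"};
    static final String[] NAMES = {"Bhiksha", "Evandro", "Paul", "Philip", "Rita", "Will"};

    // Builds every "greeting name" combination (assumes one greeting followed by one name).
    static List<String> allPhrases() {
        List<String> phrases = new ArrayList<>();
        for (String greeting : GREETINGS) {
            for (String name : NAMES) {
                phrases.add(greeting + " " + name);
            }
        }
        return phrases;
    }

    public static void main(String[] args) {
        // Print each phrase on its own line.
        for (String phrase : allPhrases()) {
            System.out.println(phrase);
        }
    }
}
```

Under that assumption, two greetings and six names give twelve phrases, which is why the demo feels so constrained until the grammar file is edited.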

Sphinx 4 Architecture

(Diagram taken from "Sphinx-4: A Flexible Open Source Framework for Speech Recognition".)
NOTE: This post is not complete.
The beauty of the Sphinx 4 architecture is its modularity and pluggability. Previously, speech recognition programs were built to fulfill specific roles: continuous speech vs. non-continuous, large vocabulary vs. smaller vocabulary, etc. Now with Sphinx 4, […]

Bluetooth Headset results

Bluetooth headsets could turn out to be the most widely available interface for speaking to the computer, so we wanted to do some tests to see if the demos that came with Sphinx-4 would work with a headset.
One useful thing to note is that I had to jump through some hoops to get […]

Getting started with Sphinx 4 :: HelloWorld!!

Sphinx is an HMM-based speech recognition system developed at CMU. There are four versions of Sphinx. For the moment, I’m exploring Sphinx-4 because it is written in Java, and it would probably combine most easily with the Video Comments work being done at ITP.

Getting everything you need to run Sphinx 4
First, download the bin (or the source, if you want to compile everything yourself) of Sphinx 4 from SourceForge.

Then, to run any speech applications in Java, whether speech recognition or text to speech, you must get the Java Speech API set up.

After downloading the Sphinx-4 bin, you need to “unpack” the jsapi.jar by accepting the BCL license. Instructions for setting up JSAPI 1.0 on UNIX and Windows systems are here, but I’ll repeat how I set it up on my Mac right here for good measure:

In Terminal, change directory to the lib folder in the Sphinx-4 package that you downloaded where the jsapi.sh file sits.
cd sphinx4-1/sphinx4-1.0beta/lib

If you type ls, you should see a file called jsapi.sh in this directory.

Then type chmod +x ./jsapi.sh
Then type sh ./jsapi.sh
A long document, the BCL license, should show up. Scroll down to the end of it and agree to it by typing ‘y’ when prompted to do so. Then, when you press Enter, the jsapi.jar should be unpacked and ready in this same lib directory.

Move this jsapi.jar file to your Java Extensions folder ( yourComputer: System/Library/Java/Extensions ). Now Java on your computer should be able to find the Java Speech API!
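If you want to confirm that the move worked before running any demos, one option is a tiny Java check like the sketch below. It is not part of the Sphinx-4 package: the class name `JsapiCheck` is made up, and it assumes that jsapi.jar provides the class javax.speech.Central, as JSAPI 1.0 does.

```java
// Hypothetical helper, not part of Sphinx-4: checks whether the JVM can load a class by name.
final class JsapiCheck {

    // Returns true if the named class is visible to this JVM, false otherwise.
    static boolean isClassAvailable(String className) {
        try {
            // initialize=false: only look the class up, do not run its static initializers.
            Class.forName(className, false, JsapiCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: jsapi.jar provides javax.speech.Central, as JSAPI 1.0 does.
        String jsapiClass = "javax.speech.Central";
        if (isClassAvailable(jsapiClass)) {
            System.out.println("JSAPI found: " + jsapiClass);
        } else {
            System.out.println("JSAPI not found; check where jsapi.jar was placed.");
        }
    }
}
```

Compile it with javac and run it with java from any directory. If it reports that JSAPI was not found, the jar is probably not in a folder that your Java installation searches.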

Now run some demos!
To make sure everything is set up correctly, we can now run the demos that came with the Sphinx-4 download. In the bin directory, let’s run the HelloWorld.jar that came with the package. In this application, there is a fixed vocabulary of words that you can speak for your computer to hear.

Run it in Terminal by changing directory to the bin folder ( sphinx4-1/sphinx4-1.0beta/bin ) and then launch the HelloWorld app by typing
java -mx312m -jar HelloWorld.jar
A more in-depth tutorial about running the HelloWorld.jar and what the app does is on the Sphinx website.

I found one useful summary of what speech recognition is here. It details the types of speech recognizers, including speaker-independent, speaker-dependent, continuous speech recognition, isolated speech recognition, and vocabulary-constrained systems. As the technology exists now, it seems that one has to figure out a compromise between the vocabulary size that the computer can […]



