This article will quickly get you up and running with Windows speech recognition using the .NET library System.Speech. After completing these steps you will be able to transcribe audio files or speech from your microphone. Examples are in C#.
- Add a reference to the System.Speech assembly
- Create a SpeechRecognitionEngine object
- Set up event handlers for your audio source
- Load a grammar library
- Set audio parameters
- Scan the audio and output the recognized words
Add References
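You need a reference to the System.Speech assembly to use the speech recognition system. In Visual Studio you can add it through Project -> Add Reference -> System.Speech. Then import the following namespaces (System.Diagnostics is needed for the Debug.Print calls used later):
using System;
using System.Diagnostics;
using System.Speech;
using System.Speech.Recognition;
using System.Speech.AudioFormat;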
Create SpeechRecognitionEngine Object
The SpeechRecognitionEngine object is your main access point to all aspects of the speech recognition system.
SpeechRecognitionEngine sre = new SpeechRecognitionEngine();
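The parameterless constructor picks the system's default recognizer. As a minimal sketch, assuming at least one desktop speech recognizer is installed, the following lists the installed recognizers and then creates an engine for a specific culture. The en-US culture and the class name are assumptions chosen for illustration.

```csharp
using System;
using System.Globalization;
using System.Speech.Recognition;

class RecognizerListing
{
    static void Main()
    {
        // List every recognizer installed on this machine.
        foreach (RecognizerInfo info in SpeechRecognitionEngine.InstalledRecognizers())
        {
            Console.WriteLine("{0} ({1})", info.Name, info.Culture.Name);
        }

        // Assumption: an en-US recognizer is installed. The constructor
        // throws an ArgumentException if no recognizer supports the culture.
        using (SpeechRecognitionEngine sre =
            new SpeechRecognitionEngine(new CultureInfo("en-US")))
        {
            Console.WriteLine("Using: " + sre.RecognizerInfo.Name);
        }
    }
}
```

The engine implements IDisposable, so wrapping it in a using block releases the underlying audio and recognizer resources when you are done.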
Set Up Event Handlers
First, a quick introduction to how the analysis works. An audio source is opened and the SpeechRecognitionEngine (the sre) scans the audio stream until it detects speech. It then analyzes that audio until the stream goes quiet, which is not necessarily the end of the stream. As the stream is analyzed, the sre guesses what the words are by looking at both the phonemes and each word's context among the words around it.
It is important to remember that there is a difference between an audio file and an audio block. An audio file is the complete audio from start to end. An audio block is the unit the sre analyzes: the audio between the point where speech is detected and the following period in which no speech is detected.
Let’s go over each event quickly, since I found Microsoft’s documentation lacking.
- SpeechHypothesized – Raised as a block of audio is analyzed and the recognizer tries to construct words and phrases that make sense. For example, the sound “I” could also be “eye.” Context determines which version fits the phrase or sentence.
- SpeechRecognized – Raised when a block of audio has finished analyzing. The last valid hypothesis is passed on as the recognized result.
- RecognizeCompleted – Raised when an asynchronous recognition operation (RecognizeAsync) has completely finished, for example at the end of the stream.
- AudioSignalProblemOccurred – Raised when something is wrong with the audio signal. Read e.AudioSignalProblem to find out what kind of problem it is (for example, too noisy, too loud, or too soft).
- SpeechDetected – Raised when the beginning of speech is detected.
- SpeechRecognitionRejected – Raised when the system cannot confidently return a result for a block of audio.
sre.SpeechHypothesized += new EventHandler<SpeechHypothesizedEventArgs>(sre_SpeechHypothesized);
sre.SpeechRecognized += new EventHandler<SpeechRecognizedEventArgs>(sre_SpeechRecognized);
sre.RecognizeCompleted += new EventHandler<RecognizeCompletedEventArgs>(sre_RecognizeCompleted);
sre.AudioSignalProblemOccurred += new EventHandler<AudioSignalProblemOccurredEventArgs>(sre_AudioSignalProblemOccurred);
sre.SpeechDetected += new EventHandler<SpeechDetectedEventArgs>(sre_SpeechDetected);
sre.SpeechRecognitionRejected += new EventHandler<SpeechRecognitionRejectedEventArgs>(sre_SpeechRecognitionRejected);
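The explicit new EventHandler<T>(...) wrapper is optional in C#. A shorter way to subscribe, sketched here with a hypothetical class and handler name, is to use a method group or a lambda. Note that the handler is static because it is attached from the static Main method.

```csharp
using System;
using System.Speech.Recognition;

class SubscribeSketch
{
    static void Main()
    {
        using (SpeechRecognitionEngine sre = new SpeechRecognitionEngine())
        {
            // Method group: the compiler creates the EventHandler<T> delegate.
            sre.SpeechDetected += OnSpeechDetected;

            // Lambda: convenient for one-line handlers.
            sre.SpeechRecognized += (sender, e) => Console.WriteLine(e.Result.Text);
        }
    }

    // Static so that it can be attached from the static Main method.
    static void OnSpeechDetected(object sender, SpeechDetectedEventArgs e)
    {
        // AudioPosition is the offset into the input where speech began.
        Console.WriteLine("Speech detected at " + e.AudioPosition);
    }
}
```

This sketch only wires up the handlers; no grammar is loaded and no recognition is started, so it prints nothing when run.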
Load Grammar
A grammar defines the words and phrases the engine can recognize. You can create your own grammar restricted to specific words, or you can use a general dictation grammar that covers all words. Below is an example of both.
string[] words = {"power", "on", "off"};
Choices choices = new Choices(words);
GrammarBuilder gb = new GrammarBuilder(choices);
Grammar grammar = new Grammar(gb);
sre.LoadGrammar(grammar);
or
DictationGrammar dg = new DictationGrammar();
sre.LoadGrammar(dg);
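A Choices list matches any single word from the list. To recognize short phrases such as "power on" or "power off", a GrammarBuilder can chain a fixed word and a set of choices. The following is a minimal sketch; the grammar name and class name are hypothetical, and it assumes the recognizer's culture matches the current culture of the thread that builds the grammar.

```csharp
using System;
using System.Speech.Recognition;

public class GrammarSketch
{
    // Builds a grammar that matches "power on" or "power off".
    public static Grammar BuildPowerGrammar()
    {
        GrammarBuilder gb = new GrammarBuilder("power");
        gb.Append(new Choices("on", "off"));

        Grammar grammar = new Grammar(gb);
        grammar.Name = "powerCommands"; // hypothetical name, handy for logging
        return grammar;
    }

    static void Main()
    {
        using (SpeechRecognitionEngine sre = new SpeechRecognitionEngine())
        {
            sre.LoadGrammar(BuildPowerGrammar());
            // Several grammars can be loaded at the same time.
            sre.LoadGrammar(new DictationGrammar());
            Console.WriteLine("Grammars loaded: " + sre.Grammars.Count);
        }
    }
}
```

A small command grammar like this usually gives the recognizer far fewer candidates to choose from than free dictation does.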
Audio Parameters
The audio parameters determine where your audio comes from and how long a period of silence must last to mark the end of an audio block.
For your mic:
sre.SetInputToDefaultAudioDevice();
For an audio file:
sre.SetInputToWaveFile("c:/audio.wav");
Set timeout period:
sre.EndSilenceTimeout = new TimeSpan(0, 0, 2);
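EndSilenceTimeout is one of several timeout properties on the engine. As a sketch, assuming the file c:\audio.wav exists and using illustrative values rather than recommendations, the related settings look like this:

```csharp
using System;
using System.Speech.Recognition;

class TimeoutSketch
{
    static void Main()
    {
        using (SpeechRecognitionEngine sre = new SpeechRecognitionEngine())
        {
            // Assumption: this file exists; the call throws otherwise.
            sre.SetInputToWaveFile(@"c:\audio.wav");

            // Give up if the input starts with this much silence.
            sre.InitialSilenceTimeout = TimeSpan.FromSeconds(5);
            // Give up if only background noise is heard for this long.
            sre.BabbleTimeout = TimeSpan.FromSeconds(5);
            // Silence that ends a block when the input is unambiguous.
            sre.EndSilenceTimeout = TimeSpan.FromSeconds(2);
            // Silence that ends a block when the input is ambiguous.
            sre.EndSilenceTimeoutAmbiguous = TimeSpan.FromSeconds(2.5);
        }
    }
}
```

Longer end-silence values keep pauses inside a sentence from splitting it into separate blocks, at the cost of slower results.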
Start Analyzing Audio
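Now that everything is set up, you can run the Recognize or RecognizeAsync methods. Recognize blocks until it has recognized the first block of audio and then returns. RecognizeAsync returns immediately and, when called with RecognizeMode.Multiple, does not stop until the end of an audio file or, in the case of a microphone, until you tell it to stop. Called with no arguments, RecognizeAsync also stops after a single recognition.
First recognized audio block:
sre.Recognize();
Entire file:
sre.RecognizeAsync(RecognizeMode.Multiple);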
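Because RecognizeAsync returns immediately, a console program has to stay alive until recognition finishes, and RecognizeAsync needs RecognizeMode.Multiple to continue past the first block. The following is a minimal sketch of transcribing a whole file, assuming c:\audio.wav exists; the class name is hypothetical.

```csharp
using System;
using System.Speech.Recognition;
using System.Threading;

class TranscribeFile
{
    static void Main()
    {
        using (SpeechRecognitionEngine sre = new SpeechRecognitionEngine())
        using (ManualResetEvent done = new ManualResetEvent(false))
        {
            sre.LoadGrammar(new DictationGrammar());
            sre.SetInputToWaveFile(@"c:\audio.wav"); // assumed to exist

            sre.SpeechRecognized += (s, e) => Console.WriteLine(e.Result.Text);
            // Raised when the asynchronous operation ends, for example
            // when the end of the file is reached.
            sre.RecognizeCompleted += (s, e) => done.Set();

            // Returns immediately; recognition continues in the background.
            sre.RecognizeAsync(RecognizeMode.Multiple);

            // Keep the process alive until the whole file has been processed.
            done.WaitOne();
        }
    }
}
```

For microphone input there is no end of file, so you would call sre.RecognizeAsyncStop() or sre.RecognizeAsyncCancel() when you want recognition to end.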
Event Handlers:
void sre_AudioSignalProblemOccurred(object sender, AudioSignalProblemOccurredEventArgs e)
{
    Debug.Print(e.AudioSignalProblem.ToString());
}

void sre_SpeechHypothesized(object sender, SpeechHypothesizedEventArgs e)
{
    Debug.Print(e.Result.Text);
}

void sre_SpeechRecognitionRejected(object sender, SpeechRecognitionRejectedEventArgs e)
{
    Debug.Print("Rejected!");
}

void sre_RecognizeCompleted(object sender, RecognizeCompletedEventArgs e)
{
    Debug.Print("Recognition Complete!");
}

void sre_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
{
    Debug.Print(e.Result.Text);
}

void sre_SpeechDetected(object sender, SpeechDetectedEventArgs e)
{
    Debug.Print("Speech Detected!");
}
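Each recognition result also carries a confidence score, which a handler can use to discard doubtful results. The following is a sketch of a SpeechRecognized handler that does this; the class name and the 0.6 cutoff are assumptions for illustration, not values recommended by the library.

```csharp
using System;
using System.Speech.Recognition;

public static class ConfidenceSketch
{
    // Hypothetical cutoff; tune it against your own recordings.
    const float MinConfidence = 0.6f;

    public static bool IsConfident(float confidence)
    {
        return confidence >= MinConfidence;
    }

    public static void OnSpeechRecognized(object sender, SpeechRecognizedEventArgs e)
    {
        // Confidence is a relative score between 0.0 and 1.0.
        if (!IsConfident(e.Result.Confidence))
        {
            Console.WriteLine("Ignored low-confidence result: " + e.Result.Text);
            return;
        }

        // Each recognized word carries its own confidence score.
        foreach (RecognizedWordUnit word in e.Result.Words)
        {
            Console.WriteLine("{0} ({1:F2})", word.Text, word.Confidence);
        }
    }
}
```

Attach it with sre.SpeechRecognized += ConfidenceSketch.OnSpeechRecognized; in place of, or alongside, the handler above.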



Hi,
This webpage has given me a great start with my Speech Recognition project. I just had one question: once I entered the code above in Visual C# 2010 and tried to compile it, I got errors like
“Error 1 An object reference is required for the non-static field, method, or property ‘Hello.sre_SpeechHypothesized(object, System.Speech.Recognition.SpeechHypothesizedEventArgs)’ C:\Users\Owner\Documents\Visual Studio 2010\Projects\Learning\Learning\Class1.cs 16 35 Learning”
This is just one of the errors I am seeing, and I am not sure how to solve this problem. Can you please provide me some assistance with this?
Thanks
Divey
Tough to know without looking at your actual code file, but the error is that you are missing a reference. This basically means that you are calling something, but the program doesn’t have a definition for it. In your example many things may not be defined, and I can’t know which one.
1. Hello isn’t defined
2. sre_SpeechHypothesized is not defined
3. You are missing the ‘using’ references
My code right now is exactly the same as the example that has been given above. I just need to make it work first, then I will implement my own application with it. I am pasting the code with this, please try to help me. I have tried a lot to fix the errors, but it is not working.
using System;
using System.IO;
using System.Speech;
using System.Speech.Recognition;
using System.Speech.AudioFormat;

public class Hello
{
    public static void Main()
    {
        SpeechRecognitionEngine sre = new SpeechRecognitionEngine();
        sre.SpeechHypothesized += new EventHandler<SpeechHypothesizedEventArgs>(sre_SpeechHypothesized);
        sre.SpeechRecognized += new EventHandler<SpeechRecognizedEventArgs>(sre_SpeechRecognized);
        sre.RecognizeCompleted += new EventHandler<RecognizeCompletedEventArgs>(sre_RecognizeCompleted);
        sre.AudioSignalProblemOccurred += new EventHandler<AudioSignalProblemOccurredEventArgs>(sre_AudioSignalProblemOccurred);
        sre.SpeechDetected += new EventHandler<SpeechDetectedEventArgs>(sre_SpeechDetected);
        sre.SpeechRecognitionRejected += new EventHandler<SpeechRecognitionRejectedEventArgs>(sre_SpeechRecognitionRejected);
        DictationGrammar dg = new DictationGrammar();
        sre.LoadGrammar(dg);
        sre.SetInputToWaveFile("hello.wav");
        sre.EndSilenceTimeout = new TimeSpan(0, 0, 2);
        sre.Recognize();
    }

    void sre_SpeechHypothesized(object sender, SpeechHypothesizedEventArgs e)
    {
        Console.WriteLine(e.Result.Text);
    }

    void sre_AudioSignalProblemOccurred(object sender, AudioSignalProblemOccurredEventArgs e)
    {
        Console.WriteLine(e.AudioSignalProblem.ToString());
    }

    void sre_SpeechRecognitionRejected(object sender, SpeechRecognitionRejectedEventArgs e)
    {
        Console.WriteLine("Rejected!");
    }

    void sre_RecognizeCompleted(object sender, RecognizeCompletedEventArgs e)
    {
        Console.WriteLine("Recognition Complete!");
    }

    void sre_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
    {
        Console.WriteLine(e.Result.Text);
    }

    void sre_SpeechDetected(object sender, SpeechDetectedEventArgs e)
    {
        Console.WriteLine("Speech Detected!");
    }
}
Edit Note: Looks like cut and paste of EventArg casts didn’t make it through WordPress. In my initial response I thought that he had failed to cut and paste the proper event handler code, but that doesn’t appear to be the case.
I am going to go back and fix up his initial post so that nobody else gets confused copy and pasting it.
You need to decide how you want your program structured. You would benefit from reading some web articles on constructing a program. In this case you are calling all your methods from Main, which is a static entry point. Thus all the methods have to be static. You could add “static” in front of each of the methods and get your program to compile, but this may not be what you want. Instead you may want to create an instance of your class and have it call an entry point method. Then you wouldn’t have to add static before each of the methods. Read up on static members and classes too. For example:
public static void Main()
{
    Hello hello = new Hello();
    hello.doStuff();
}

private void doStuff()
{
    //all the stuff from my post
    SpeechRecognitionEngine sre = new SpeechRecognitionEngine();
    //...
}
Thanks for that help, now the errors are finally gone. What i need to know is how can i tell if the recognition is being done, like i don’t know how to set an output text document, where the audio will be written as text.
If you used the methods I provided, the output will be written to the console in Visual Studio. Just Google how to bring up the console. If you want to write to a file, take a look at FileInfo and FileStream. I would also recommend that you sign up for a Stack Overflow account. There you can ask questions and get answers very quickly. They have rules about how quickly you can post as a new user, but it is a fantastic place for programmers of all skill levels.
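As a rough sketch of the file approach, the handler below appends each recognized phrase to a text file. It uses File.AppendAllText instead of FileStream for brevity, and the class name and file path are hypothetical.

```csharp
using System;
using System.IO;
using System.Speech.Recognition;

public class TranscriptWriter
{
    private readonly string path;

    public TranscriptWriter(string path)
    {
        this.path = path;
    }

    // Appends one recognized phrase per line.
    public void Append(string text)
    {
        File.AppendAllText(path, text + Environment.NewLine);
    }

    // Handler for the SpeechRecognized event.
    public void OnSpeechRecognized(object sender, SpeechRecognizedEventArgs e)
    {
        Append(e.Result.Text);
    }
}
```

To use it, create a writer with something like new TranscriptWriter("transcript.txt") and attach it with sre.SpeechRecognized += writer.OnSpeechRecognized; before starting recognition.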
Hey, thanks a lot for the help and I will definitely look into getting a Stack Overflow account (I think I have one of these). But I am still stuck with the code that I have right now. For some reason, when I run the code, it crashes and gives an error saying the input file is not found. I have put my audio file in the correct destination path, but it is still not recognising the input. I would greatly appreciate some more help with this topic.
Thanks
Divey
Your path must be formatted incorrectly then. I would double check that everything is correct.
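One way to check is to resolve the path yourself before handing it to the engine. A relative path such as "hello.wav" is resolved against the program's current working directory, which for a default Visual Studio project is typically the build output folder (for example bin\Debug), not the folder that holds the source files. The helper below is a hypothetical sketch:

```csharp
using System;
using System.IO;

public static class WavPath
{
    // Returns the absolute path, or throws if the file does not exist.
    public static string Resolve(string path)
    {
        string full = Path.GetFullPath(path);
        if (!File.Exists(full))
        {
            // The message shows exactly where the program looked.
            throw new FileNotFoundException("Audio file not found: " + full, full);
        }
        return full;
    }
}
```

Call it as sre.SetInputToWaveFile(WavPath.Resolve("hello.wav")); and the exception message will show the full path that was actually searched.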
Hello and thank you for taking time to publish this very helpful document. It is the best and most helpful document I have ever seen on speech recognition using C#. I have implemented the code, however, it is not giving me accurate results. I have tried the code with different audio files. Can you please give me any tips for increasing accuracy? Thank you again!
oh this is great..