I am trying to get SAPI 5.4 (and also the MS Speech Platform SDK v11) to perform continuous speech recognition on audio coming from a Skype call.
I use SKYPE4COMLib to capture the audio coming from Skype and send it to a TCP port by issuing an ALTER CALL command. Skype can redirect call audio either to a file or to a TCP socket. The file worked fine, but I want real-time recognition, so I am using a TCP socket.
Then I built a TCP receiver to collect the incoming audio data and passed the resulting byte array to SAPI as a MemoryStream. I configured SAPI to expect 16-bit, 16 kHz, mono PCM audio. However, the recognition event never fires.
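For reference, here is a simplified sketch of the pieces involved, not my exact code. It assumes System.Speech (the Speech Platform SDK equivalent lives in Microsoft.Speech), a dictation grammar, and a hypothetical port number 12345. The class name `SkypeAudioSketch` and the helper `BufferAll` are made up for illustration. Note that this version buffers the whole connection before starting recognition, so it only shows the API calls and the audio format, not the real-time case I am after.

```csharp
using System;
using System.IO;
using System.Net;
using System.Net.Sockets;
using System.Speech.AudioFormat;
using System.Speech.Recognition;
using System.Threading;

static class SkypeAudioSketch
{
    // Hypothetical port; it must match the port Skype was told to send audio to.
    const int Port = 12345;

    // Copies everything from the source stream into a MemoryStream and rewinds it.
    // On a network stream this blocks until the sender closes the connection.
    public static MemoryStream BufferAll(Stream source)
    {
        var buffer = new MemoryStream();
        source.CopyTo(buffer);
        buffer.Position = 0; // rewind so the recognizer reads from the start
        return buffer;
    }

    static void Main()
    {
        var listener = new TcpListener(IPAddress.Loopback, Port);
        listener.Start();

        using (TcpClient client = listener.AcceptTcpClient())
        using (MemoryStream audio = BufferAll(client.GetStream()))
        using (var engine = new SpeechRecognitionEngine())
        using (var done = new ManualResetEvent(false))
        {
            listener.Stop();

            engine.LoadGrammar(new DictationGrammar());
            engine.SpeechRecognized += (s, e) => Console.WriteLine(e.Result.Text);
            engine.RecognizeCompleted += (s, e) => done.Set();

            // Raw PCM: 16 kHz, 16-bit, mono, as described above.
            var format = new SpeechAudioFormatInfo(
                16000, AudioBitsPerSample.Sixteen, AudioChannel.Mono);
            engine.SetInputToAudioStream(audio, format);

            // Multiple mode keeps recognizing until the input stream ends.
            engine.RecognizeAsync(RecognizeMode.Multiple);
            done.WaitOne();
        }
    }
}
```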
I tried saving this raw audio to disk instead and then reading it into SAPI, and that works fine, so the data itself is good and Skype is sending the audio correctly. However, this approach does not give me the continuous recognition that I need.
The SAPI recognition code works fine with a WAV file or a raw file loaded from disk, and with a microphone. I just can't get it to work with a MemoryStream.
I found this similar question, but none of the suggestions worked for me, and the discussion seems to have died down:
Stream input to System.Speech.Recognition.SpeechRecognitionEngine
Does anyone have any recommendations on how to get SAPI to perform continuous speech recognition on raw audio passed in as a MemoryStream in C#?