How to decode speech input

I want to create an API that translates human speech into IPA (International Phonetic Alphabet) notation. Where can I find resources on how to decode speech at the level of the raw audio signal? I have looked for an existing API, but most of what I found transcribes directly into the Roman alphabet. I want to build something more precise in its ability to distinguish speech sounds.

1 answer

I would like to start by saying that this project is much harder and more involved than you might think. Speech-to-text processing is a very large and complex field with a huge body of research behind it. Most recognizers go straight to Roman-alphabet text because most of their processing is a probabilistic mapping of ambiguous sounds, taken in the context of the other ambiguous sounds around them, to guess which words make sense together. You are more likely to find something that gives you Soundex than IPA. That said, this is a problem that has been attacked on several fronts. The best starting point is probably the Sphinx project from CMU.

http://cmusphinx.sourceforge.net/wiki/start
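To make the Soundex point concrete, here is a minimal sketch of the classic American Soundex code in Python. It is not taken from Sphinx or any other library; the function name `soundex` is just illustrative, and it assumes plain ASCII words. It shows how coarse the encoding is compared with IPA: whole groups of consonants collapse into a single digit and vowels are dropped entirely.

```python
# Consonant groups used by American Soundex; every other letter is uncoded.
SOUNDEX_GROUPS = {
    "1": "bfpv",
    "2": "cgjkqsxz",
    "3": "dt",
    "4": "l",
    "5": "mn",
    "6": "r",
}
SOUNDEX_CODES = {ch: digit for digit, chars in SOUNDEX_GROUPS.items() for ch in chars}


def soundex(word):
    """Return the 4-character American Soundex code for an ASCII word."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    # The first letter's own code suppresses an immediately repeated code.
    prev = SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            # h and w are skipped and do not separate repeated codes.
            continue
        code = SOUNDEX_CODES.get(ch, "")
        if code and code != prev:
            result += code
        # Vowels have no code, so they reset prev and allow a repeat.
        prev = code
        if len(result) == 4:
            break
    # Pad short codes with zeros to a fixed width of four.
    return result.ljust(4, "0")


if __name__ == "__main__":
    for name in ("Robert", "Rupert", "Ashcraft", "Tymczak"):
        print(name, soundex(name))
```

"Robert" and "Rupert" come out with the same code even though they are pronounced differently, which is exactly the kind of distinction an IPA transcription would keep.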

