## Real-Time Formant Tracking

One method of speech recognition is to estimate the resonant
frequencies of a speech signal. These frequencies, known as formants,
tell us the vowel being spoken. There are a number of ways to do
this. In my first attempt, I tried using the Levinson-Durbin
algorithm to find the LPC coefficients over short intervals of speech.
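
As a rough illustration of that first step, the sketch below computes the autocorrelation of one frame and runs the Levinson-Durbin recursion in plain Python. The function names, the sign convention A(z) = 1 + a[1] z^-1 + ... + a[p] z^-p, and the use of a biased autocorrelation estimate are assumptions made for illustration; this is not the original C671x code.

```python
def autocorrelation(frame, max_lag):
    """Biased autocorrelation r[0..max_lag] of one (already windowed) frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation values r.

    Returns (a, err) with a[0] == 1, where the prediction-error filter is
    A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order and err is the final
    prediction-error energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate (e.g. all-zero) frame: stop early
        # Reflection coefficient for stage i.
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        # Update coefficients 1..i-1 from the previous stage, then append k.
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= (1.0 - k * k)
    return a, err
```

The poles of the LPC model are the roots of A(z); a pole at angle θ (in radians) corresponds to a resonance near θ·fs/(2π) Hz for sampling rate fs.
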
From there it is just a matter of finding the poles. The root-finding
code turned out to be much too slow for real time. However, now that I
understand how to write optimized code on the C671x, I believe it may
still be possible to do it this way. Next, I tried estimating the LPC
coefficients using LMS*. By using LMS, the coefficients are known at
every sample (instead of at the end of each window). This allows for
tracking the roots. I did this by randomly seeding Newton's
method for each root and running one iteration after each LMS update+.
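
A minimal sketch of this idea follows, assuming a plain LMS linear predictor and one Newton step per root after every sample. The function names, the step size `mu`, the seeding range, and the test signal are all hypothetical choices for illustration, not the code that ran on the DSP.

```python
import cmath
import math
import random


def lms_update(w, history, sample, mu):
    """One LMS step for a linear predictor; history[k] holds x[n-1-k].

    Updates w in place and returns the prediction error.
    """
    pred = sum(wk * xk for wk, xk in zip(w, history))
    err = sample - pred
    for k in range(len(w)):
        w[k] += mu * err * history[k]
    return err


def newton_step(w, z):
    """One Newton iteration on P(z) = z^p - w[0] z^(p-1) - ... - w[p-1]."""
    coeffs = [1.0] + [-wk for wk in w]
    p, dp = 0j, 0j
    for c in coeffs:          # Horner's rule for P(z) and P'(z) together
        dp = dp * z + p
        p = p * z + c
    if dp == 0:
        return z              # flat spot: leave the estimate unchanged
    return z - p / dp


def track(samples, order, mu, fs, seed=0):
    """Run LMS over the samples, refining each root estimate once per sample."""
    rng = random.Random(seed)
    w = [0.0] * order
    history = [0.0] * order
    # Random seeds inside the unit disk, one per root.
    roots = [cmath.rect(rng.uniform(0.5, 1.0), rng.uniform(-math.pi, math.pi))
             for _ in range(order)]
    for x in samples:
        lms_update(w, history, x, mu)
        history = [x] + history[:-1]
        roots = [newton_step(w, z) for z in roots]
    # Convert pole angles to frequencies in Hz.
    freqs = [abs(cmath.phase(z)) * fs / (2.0 * math.pi) for z in roots]
    return w, roots, freqs
```

For example, `track([math.cos(math.pi * n / 2) for n in range(2000)], 2, 0.1, 8000.0)` feeds in a sinusoid at one quarter of the sampling rate. Nothing in this sketch stops two estimates from settling on the same root, and no re-seeding is attempted.
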
The biggest problem with this method is that the algorithm may lose
track of roots. Then the algorithm has to be re-seeded repeatedly
until the correct root is found again. In addition, LMS is not very
good at estimating formants. After running a number of simulations, I
decided to use a 25

I tested the real-time implementation with four different vowels. Only one vowel worked well (it managed to find the most important formants at least part of the time). For two of the vowels, the tracker appeared to find at least one of the formants. For one vowel, it didn't seem to work at all.

This is the video (which runs very quickly) of the tracker. The x-axis plots the lower frequency; the y-axis plots the higher frequency (Hz). The spot where the markers tend to cluster is the correct location according to F. R. Moore's "Elements of Computer Music".

You may find the visualization code useful. Also available is a description of how to get real-time data into MATLAB. You will need test.cpp and test.def as well.

*http://citeseer.ist.psu.edu/cache/papers/cs/25646/