Abstract
Many adaptive methods have been proposed to improve the recognition performance of talker-independent speech systems. These methods generally seek to extend the higher recognition rates of talker-dependent systems to talker-independent systems, typically by grouping talkers into several categories, usually by gender or vocal-tract size. In this paper we investigate a similar idea, but categorize each utterance independently. Each utterance is processed using several spectral compressions, and the compression with the maximum likelihood is used to train a better model. During testing, the spectral compression with the maximum likelihood is likewise used to decode the utterance. Although the spectral compressions divided the utterances well, this did not translate into a significant improvement in performance, and the increase in computational cost was significant.
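
The per-utterance selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the compression function, the set of compression factors, or the acoustic model, so everything here is an assumption made for illustration: `compress_spectrum` uses a simple linear rescaling of the frequency axis, the factor grid is arbitrary, and a single diagonal Gaussian stands in for the recognizer's actual model. All function names are hypothetical.

```python
import numpy as np


def compress_spectrum(frames, alpha):
    """Rescale the frequency axis of each frame by the factor alpha.

    frames: array of shape (n_frames, n_bins) holding magnitude spectra.
    Output bin k takes the value found at input position alpha * k,
    using linear interpolation and clamping to the last bin.
    (Assumed form of compression; the paper's function may differ.)
    """
    n_bins = frames.shape[1]
    bins = np.arange(n_bins)
    src = np.clip(alpha * bins, 0, n_bins - 1)
    return np.stack([np.interp(src, bins, frame) for frame in frames])


def gaussian_log_likelihood(frames, mean, var):
    """Total log-likelihood of all frames under one diagonal Gaussian.

    This is a stand-in for the recognizer's acoustic model score.
    """
    diff = frames - mean
    return float(np.sum(-0.5 * (np.log(2.0 * np.pi * var) + diff ** 2 / var)))


def select_compression(frames, mean, var, factors=(0.9, 0.95, 1.0, 1.05, 1.1)):
    """Score the utterance under each candidate factor and keep the best.

    Returns the maximum-likelihood factor and the per-factor scores.
    The factor grid is an arbitrary illustrative choice.
    """
    scores = {
        alpha: gaussian_log_likelihood(compress_spectrum(frames, alpha), mean, var)
        for alpha in factors
    }
    best = max(scores, key=scores.get)
    return best, scores
```

Under these assumptions, training would re-estimate the model from each utterance compressed with its selected factor, and testing would decode each utterance with the factor returned by `select_compression`. Scoring every candidate factor multiplies the per-utterance work by the size of the grid, which is consistent with the increase in computational cost noted in the abstract.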
Original language | English |
---|---|
Pages (from-to) | 1235-1238 |
Number of pages | 4 |
Journal | Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing |
Volume | 2 |
Publication status | Published - 1997 |
Externally published | Yes |
Event | Proceedings of the 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP. Part 1 (of 5) - Munich, Ger; Duration: 21 Apr 1997 → 24 Apr 1997 |
ASJC Scopus subject areas
- Software
- Signal Processing
- Electrical and Electronic Engineering