Abstract
It is well known that the accuracy of speaker recognition systems deteriorates drastically when there is an acoustic mismatch between the speech obtained during training and testing. In this paper, we propose Modified Segmental Histogram Equalization to improve the robustness of a speaker verification system operating in telephone environments. The technique transforms the features extracted from short adjacent segments of speech within an utterance so that their statistics conform to those of a Gaussian distribution with zero mean and unit variance across all recording conditions. In doing so, the feature statistics become less environment-dependent. Experiments performed on the NIST 2000 database show significant improvements in performance.
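The segment-wise mapping to a standard Gaussian described in the abstract can be illustrated with a minimal sketch. It shows plain rank-based histogram equalization applied independently to adjacent, non-overlapping segments; it is not the paper's "Modified" variant, whose details the abstract does not give. The function names, the `segment_len` value of 300 frames, and the `(rank + 0.5) / n` estimate of the empirical CDF are all assumptions made for illustration.

```python
import numpy as np
from statistics import NormalDist


def equalize_segment(segment):
    """Map each feature dimension of one segment towards N(0, 1).

    `segment` has shape (n_frames, n_dims). Hypothetical helper, not from the paper.
    """
    segment = np.asarray(segment, dtype=float)
    n_frames = segment.shape[0]
    # Double argsort gives the rank of every frame within its dimension (0 = smallest).
    ranks = np.argsort(np.argsort(segment, axis=0), axis=0)
    # Empirical CDF values kept strictly inside (0, 1) so the inverse CDF is finite.
    cdf = (ranks + 0.5) / n_frames
    # Inverse CDF of the standard Gaussian (zero mean, unit variance).
    inv_cdf = np.vectorize(NormalDist().inv_cdf, otypes=[float])
    return inv_cdf(cdf)


def segmental_equalization(features, segment_len=300):
    """Equalize adjacent, non-overlapping segments of an utterance independently.

    `segment_len` (in frames) is an assumed value, not one taken from the paper.
    """
    features = np.asarray(features, dtype=float)
    out = np.empty_like(features)
    for start in range(0, features.shape[0], segment_len):
        stop = start + segment_len
        out[start:stop] = equalize_segment(features[start:stop])
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic, skewed "features" with an offset, standing in for cepstral coefficients.
    utterance = rng.exponential(size=(600, 3)) + 5.0
    equalized = segmental_equalization(utterance, segment_len=300)
    print(equalized.mean(axis=0), equalized.std(axis=0))
```

Because the mapping depends only on the rank of each frame within its segment, any monotonic distortion of a feature dimension (for example a channel-dependent offset or scaling) leaves the equalized output unchanged, which is one way to read the abstract's claim that the statistics become less environment-dependent.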
Original language | English |
---|---|
Pages (from-to) | 479-486 |
Number of pages | 8 |
Journal | Pattern Recognition Letters |
Volume | 27 |
Issue number | 5 |
Publication status | Published - 1 Apr 2006 |
Externally published | Yes |
Keywords
- Histogram Equalization
- Mismatched conditions
- NIST 2000
- Speaker verification
ASJC Scopus subject areas
- Software
- Signal Processing
- Computer Vision and Pattern Recognition
- Artificial Intelligence