Combining classifier decisions for robust speaker identification

Daniel J. Mashao, Marshalleno Skosan

Research output: Contribution to journal > Article > peer-review

63 Citations (Scopus)

Abstract

In this work, we combine the decisions of two classifiers as an alternative means of improving the performance of a speaker recognition system in adverse environments. The classifiers differ in their feature sets. One system is based on the popular mel-frequency cepstral coefficients (MFCC) and the other on the new parametric feature-sets (PFS) algorithm. Both feature vectors use mel-scale spectral warping and are computed in the cepstral domain, but the feature sets differ in their use of spectral filters and compressions. The two classifiers achieve similar recognition rates, but they are complementary. This shows that there is information that MFCC does not capture, and that PFS can add further information for improved performance. Several ways of combining these classifiers give significant improvements in a speaker identification task using the very large, telephone-degraded NTIMIT database.
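
The abstract does not say which combination rules were used, so the following is only a minimal sketch of one common way to combine the decisions of two classifiers: a weighted sum rule over normalized scores. It assumes each classifier returns one log-likelihood per enrolled speaker. The function names, the `weight_a` parameter, and the example scores are hypothetical and are not taken from the paper.

```python
import math


def softmax(log_scores):
    """Turn per-speaker log-likelihoods into values that sum to 1."""
    peak = max(log_scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in log_scores]
    total = sum(exps)
    return [e / total for e in exps]


def combine_sum_rule(scores_a, scores_b, weight_a=0.5):
    """Weighted sum of the two classifiers' normalized scores (assumed rule)."""
    post_a = softmax(scores_a)
    post_b = softmax(scores_b)
    return [weight_a * pa + (1.0 - weight_a) * pb
            for pa, pb in zip(post_a, post_b)]


def identify(scores_a, scores_b, weight_a=0.5):
    """Return the index of the speaker with the highest combined score."""
    combined = combine_sum_rule(scores_a, scores_b, weight_a)
    return max(range(len(combined)), key=combined.__getitem__)


# Made-up log-likelihoods for three enrolled speakers (illustration only).
mfcc_scores = [-9.0, -9.2, -12.0]   # MFCC-based classifier narrowly prefers speaker 0
pfs_scores = [-11.0, -8.0, -10.0]   # PFS-based classifier clearly prefers speaker 1

print(identify(mfcc_scores, pfs_scores))
```

In this made-up case the two classifiers disagree, and the combined decision follows the more confident one. That is the sense in which complementary classifiers can help each other, although the paper's own combination methods may differ from this sketch.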

Original language: English
Pages (from-to): 147-155
Number of pages: 9
Journal: Pattern Recognition
Volume: 39
Issue number: 1
Publication status: Published - Jan 2006
Externally published: Yes

Keywords

  • Gaussian mixture model
  • Multiple classifier systems
  • Parametric feature sets
  • Speaker identification

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Computer Vision and Pattern Recognition
  • Artificial Intelligence
