TY - GEN
T1 - Specific language impairment detection through voice analysis
AU - Slogrove, Kayleigh Joy
AU - van der Haar, Dustin
N1 - Publisher Copyright: © Springer Nature Switzerland AG 2020.
PY - 2020
Y1 - 2020
AB - Specific Language Impairment (SLI) is a communication disorder that affects children's mastery of language and conversation. The proposed system aims to provide an alternative diagnosis method that does not rely on specific assessment tools. The system accepts a voice sample from the child and detects indicators in that sample that differentiate individuals with SLI. These indicators are based on the timbre and pitch characteristics of sound. Three feature spaces are calculated, followed by derived features, and each is paired with three classifiers to determine the most accurate combination. The three feature spaces are Chroma, Mel-frequency cepstral coefficients (MFCC), and Tonnetz; the three classifiers are Support Vector Machines, Random Forest, and a Recurrent Neural Network. MFCC, which represents the timbre characteristic, was the most accurate feature vector across all classifiers, and Random Forest was the most accurate classifier across all feature spaces. The most accurate combination was the MFCC feature vector with the Random Forest classifier, at an accuracy of 99%. The MFCC feature vector contains the most extracted features, which is given as the reason for its high accuracy. However, this accuracy decreases when the recorded word is three syllables or longer. The proposed system has proven to be a valid method for detecting SLI.
KW - Chroma
KW - Machine learning
KW - Mel frequency cepstral coefficient
KW - Pitch
KW - Random Forest
KW - Recurrent Neural Network
KW - Sound
KW - Specific Language Impairment
KW - Support Vector Machines
KW - Timbre
KW - Tonnetz
KW - Voice
UR - http://www.scopus.com/inward/record.url?scp=85089237944&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-53337-3_10
DO - 10.1007/978-3-030-53337-3_10
M3 - Conference contribution
AN - SCOPUS:85089237944
SN - 9783030533366
T3 - Lecture Notes in Business Information Processing
SP - 130
EP - 141
BT - Business Information Systems - 23rd International Conference, BIS 2020, Proceedings
A2 - Abramowicz, Witold
A2 - Klein, Gary
PB - Springer
T2 - 23rd International Conference on Business Information Systems, BIS 2020
Y2 - 8 June 2020 through 10 June 2020
ER -