Predicting the Level of Emotion by Means of Indonesian Speech Signal

Abstract: Understanding human emotion is important for developing better, smoother interpersonal relations, all the more so because human thinking and behavior are strongly influenced by emotion. In line with this need, an expert system capable of predicting a speaker's emotional state would be useful in many practical applications. Such systems, based on speech signals, have been widely developed for various languages. This study evaluates the extent to which Mel-Frequency Cepstral Coefficient (MFCC) features, together with the Teager energy feature, derived from Indonesian speech signals relate to four emotion types: happy, sad, angry, and fear. The study uses empirical data of nearly 300 speech signals collected from four amateur actors and actresses speaking 15 prescribed Indonesian sentences. Using a support vector machine classifier, the empirical findings suggest that the Teager energy and the first MFCC are crucial features, and that the prediction can reach an accuracy of 86%. The accuracy increases quickly with the first few MFCC features; the fourth and subsequent features have a negligible effect on it.
Keywords: Indonesian speech, Mel-frequency cepstral coefficient, Teager energy, support vector machine
Authors: Fergyanto E. Gunawan, Kanyadian Idananta
Journal Code: jptkomputergg170121
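
The abstract names the Teager energy as a key feature but does not state how it is computed. As a rough illustration only, the sketch below uses the standard discrete Teager-Kaiser energy operator, psi[n] = x[n]^2 - x[n-1] * x[n+1], and averages it over the signal to obtain one scalar per utterance. The averaging step, the function names, and the synthetic test tone are assumptions made for this example, not details taken from the study.

```python
import math


def teager_energy(signal):
    """Discrete Teager-Kaiser energy for every interior sample.

    psi[n] = x[n]^2 - x[n-1] * x[n+1], defined for n = 1 .. len(signal) - 2.
    """
    if len(signal) < 3:
        raise ValueError("at least three samples are needed")
    return [
        signal[n] ** 2 - signal[n - 1] * signal[n + 1]
        for n in range(1, len(signal) - 1)
    ]


def mean_teager_energy(signal):
    """Average the operator output into one scalar (an assumed summary step)."""
    psi = teager_energy(signal)
    return sum(psi) / len(psi)


# Hypothetical stand-in for a speech frame: a pure tone with amplitude 0.5
# and digital frequency 0.3 rad/sample.
amplitude, omega = 0.5, 0.3
tone = [amplitude * math.cos(omega * n) for n in range(200)]
feature = mean_teager_energy(tone)
```

For a pure tone the operator output is constant and equal to (amplitude * sin(omega))^2, so the feature grows with both loudness and frequency. In a pipeline like the one the abstract describes, such a scalar would be combined with the MFCC values of each utterance before being passed to a support vector machine classifier.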
