


The main work of this paper is research on algorithms for psychological counseling and personality analysis based on speech emotion. First, speech emotion recognition and related work are briefly introduced. Then, starting from deep learning and after studying the relevant neural network architectures, we construct a hybrid model that combines a convolutional neural network (CNN) with a long short-term memory (LSTM) network to recognize speech emotion. We used the CASIA Chinese data set, which contains 7,200 utterances, to train and test the model. The recognition rate of the model on this data set reached 0.8365, indicating good recognition performance. We also tested the model on a natural speech data set collected during the experiment, which showed that the model has a certain recognition ability on such data. In addition, comparative experiments showed that the hybrid neural network model performs significantly better than either a CNN or an LSTM network alone. Finally, we conducted a personality analysis according to the Five-Factor Model.
