CN105976820B - Voice emotion analysis system


Info

Publication number
CN105976820B
CN105976820B (application CN201610415352.2A)
Authority
CN
China
Prior art keywords
voice
emotion
voice signal
speech
signal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610415352.2A
Other languages
Chinese (zh)
Other versions
CN105976820A (en)
Inventor
陈晓群
陈志坤
陈卫东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SHANGHAI LIANGXIANG ELECTRONIC TECHNOLOGY Co Ltd
Shanghai Liangzhi Intelligent Equipment Co Ltd
Shanghai Liangxiang Intelligent Engineering Co Ltd
Original Assignee
SHANGHAI LIANGXIANG ELECTRONIC TECHNOLOGY Co Ltd
Shanghai Liangzhi Intelligent Equipment Co Ltd
Shanghai Liangxiang Intelligent Engineering Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SHANGHAI LIANGXIANG ELECTRONIC TECHNOLOGY Co Ltd, Shanghai Liangzhi Intelligent Equipment Co Ltd, Shanghai Liangxiang Intelligent Engineering Co Ltd
Priority to CN201610415352.2A
Publication of CN105976820A
Application granted
Publication of CN105976820B
Legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 17/00 Speaker identification or verification
    • G10L 17/26 Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L 25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L 25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L 25/63 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for estimating an emotional state

Abstract

The invention provides a speech emotion analysis system comprising a voice transcription device connected to a PC, on which run: a voice signal processing unit for removing background noise from the voice signal acquired by the PC; and a voice emotion analysis unit for analyzing the emotion expressed by the voice signal output by the voice signal processing unit and displaying the emotion information through a display unit. The system can covertly record a suspect's voice data at any time and place, and later analysis of that data yields a conclusion as to whether the suspect is lying.

Description

Voice emotion analysis system
Technical Field
The invention relates to a system for analyzing the emotion type conveyed by a voice signal.
Background
The lie detector in current use is an instrument that records multiple physiological responses. In criminal investigation it can assist interrogation by revealing the psychological state of a suspect under questioning, so as to judge whether the suspect is connected to a case. Modern science has shown that human physiology does change when a person lies. Some changes can be observed with the naked eye as a series of unnatural body movements, such as scratching the ears or shaking the legs and feet. Other physiological changes are not readily perceptible, for example: abnormal respiratory rate and volume, with respiratory depression and breath-holding; an accelerated pulse, raised blood pressure, increased cardiac output and altered blood composition, so that the skin of the face and neck becomes noticeably pale or flushed; increased secretion of the subcutaneous sweat glands, causing the skin to sweat, first between the eyes or on the upper lip and most markedly on the fingers and palms; dilation of the pupils; gastric contraction and abnormal secretion of digestive juices, leading to dryness of the mouth, tongue and lips; and muscle tension and trembling that impair speech. These physiological parameters are governed by the autonomic nervous system: they are generally not under conscious control but operate autonomously, and they show a series of conditioned-reflex phenomena under external stimulation.
Lie detectors were created to capture the above phenomena. A modern lie detector consists of sensors, a host unit and a microcomputer. The sensors are attached to the subject's body surface to collect information on changes in physiological parameters; the host unit is an electronic assembly that converts the analog signals acquired by the sensors into digital signals; and the microcomputer stores and analyzes the input digital signals to produce the lie-detection result. The existing lie detector therefore suffers from its bulk: lie detection can only be carried out by bringing the suspect to a specific place, and sensors must be attached to the suspect during the test, so covert lie detection is impossible.
Disclosure of Invention
The invention aims to provide a system capable of covertly performing lie detection on a suspect at any time and place.
In order to achieve the above object, a technical solution of the present invention is to provide a speech emotion analysis system, including:
a voice transcription device having a voice data recording terminal and two voice lead-out terminals, wherein the voice data recording terminal is connected to the voice output of the handset of a telephone, one of the two voice lead-out terminals is connected to the voice input of the telephone base unit and the other is connected to a PC, so that the voice signal picked up by the handset is led out to the PC through one voice lead-out terminal and to the telephone base unit through the other;
running on the PC are:
a voice signal processing unit for removing background noise from the voice signal acquired by the PC;
and a voice emotion analysis unit for analyzing the emotion expressed by the voice signal output by the voice signal processing unit and displaying the emotion information through a display unit.
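By way of orientation only, the following is a minimal sketch of how these two PC-side units might be composed; the patent defines only the roles of the units, not an API, so every class and function name here is invented for illustration.

```python
# Illustrative composition of the PC-side pipeline described above:
# processing unit -> emotion analysis unit -> display unit.
class VoiceSignalProcessingUnit:
    def __init__(self, denoise):
        self.denoise = denoise          # e.g. a noise-removal function

    def process(self, raw_signal, noise_sample):
        return self.denoise(raw_signal, noise_sample)

class VoiceEmotionAnalysisUnit:
    def __init__(self, analyze, display):
        self.analyze = analyze          # clean signal -> emotion label
        self.display = display          # emotion label -> user-visible output

    def run(self, clean_signal):
        emotion = self.analyze(clean_signal)
        self.display(emotion)
        return emotion
```

Wired this way, the PC acquires a signal from the transcription device, passes it through the processing unit, and hands the result to the analysis unit for display.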
Preferably, the system further comprises a microphone connected to the PC, and the PC collects the voice data obtained by either the voice transcription device or the microphone.
Preferably, the system further comprises a voice recording device, which acquires and stores the voice signal over a period of time and transmits it to the voice signal processing unit running on the PC.
Preferably, the speech emotion analyzing unit includes:
a voice database, in which different emotions are first defined, reference voice signals produced under each emotion by speakers of different age groups and genders are then collected, and the waveform and frequency of each reference voice signal are recorded;
and a voice signal comparison unit for obtaining the waveform and frequency of the voice signal processed by the voice signal processing unit, comparing them against the waveform and frequency of each reference voice signal stored in the voice database for the age group and gender of the speaker, finding the reference voice signal that best matches the current voice signal, and reporting the emotion of the current voice signal as the emotion to which that best-matching reference signal belongs.
Preferably, the speech emotion analyzing unit includes:
a validity analysis unit for performing emotion-parameter validity analysis on the voice signal provided by the voice signal processing unit, so as to extract speech emotion feature parameters;
a feature parameter classifier for dividing the speech emotion feature parameters obtained by the validity analysis unit into short-time features and long-time features, wherein the pitch frequency, the short-time energy, two Mel-frequency cepstral coefficients and five Mel-frequency sub-band energies serve as the short-time features, and the Mel energy-spectrum dynamic coefficient serves as the long-time feature;
a short-time feature processing unit for training on and recognizing the short-time features with a hidden Markov model, so as to obtain the emotion corresponding to the current voice information;
and a long-time feature processing unit for training on and recognizing the long-time features with a support vector machine, so as to obtain the emotion corresponding to the current voice information.
Preferably, a survey analysis unit also runs on the PC for judging the user's emotion by analyzing a questionnaire answered with exact yes-or-no responses.
The voice emotion analysis system provided by the invention can covertly record a suspect's voice data at any time and place, and later analysis of that data yields a conclusion as to whether the suspect is lying.
Drawings
FIG. 1 is a system block diagram of an embodiment of a speech emotion analysis system provided by the present invention.
Detailed Description
In order to make the invention more comprehensible, preferred embodiments are described in detail below with reference to the accompanying drawings.
Example 1
The speech emotion analysis system provided by the invention can work in three different modes, as follows:
Mode one, survey mode.
In this mode, the user uses the survey analysis unit running on the PC to complete the questionnaire displayed on the PC with exact yes-or-no answers, and the survey analysis unit then analyzes the answers given to judge the user's current emotion.
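A minimal sketch of such a survey analysis unit follows; the patent specifies neither the questions nor the scoring rule, so both, along with the answer-to-emotion mapping, are invented here purely for illustration.

```python
# Hypothetical yes/no questionnaire: each question votes for one emotion.
QUESTIONS = [
    ("Do you feel tense right now?", "anxious"),
    ("Did you sleep well last night?", "calm"),
    ("Are you worried about this conversation?", "anxious"),
]

def analyze_survey(answers):
    """answers: list of booleans, one exact yes/no per question.
    Returns the emotion with the most 'yes' votes (illustrative rule)."""
    scores = {}
    for (_, emotion), yes in zip(QUESTIONS, answers):
        if yes:
            scores[emotion] = scores.get(emotion, 0) + 1
    return max(scores, key=scores.get) if scores else "neutral"
```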
Mode two, offline mode.
In this mode, the system is configured as shown in FIG. 1 and comprises a recording device and a PC. The recording device acquires and stores the voice signal over a period of time and transmits it to the PC. The recording device is a portable voice recorder or a recording device connected to the telephone line.
A voice signal processing unit and a voice emotion analysis unit run on the PC. The voice signal processing unit removes background noise from the voice signal collected by the PC.
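The patent does not name a specific noise-removal algorithm for the voice signal processing unit; the sketch below assumes plain spectral subtraction as one plausible implementation. The frame length, hop size, and the availability of a noise-only sample are all illustrative assumptions.

```python
import numpy as np

def spectral_subtraction(signal, noise_sample, frame_len=512, hop=256):
    """Suppress stationary background noise by subtracting an estimated
    noise magnitude spectrum from each windowed frame (sketch only)."""
    window = np.hanning(frame_len)
    # Estimate the noise magnitude spectrum from a noise-only segment.
    frames = [noise_sample[i:i + frame_len] * window
              for i in range(0, len(noise_sample) - frame_len + 1, hop)]
    noise_mag = np.mean([np.abs(np.fft.rfft(f)) for f in frames], axis=0)

    out = np.zeros(len(signal))
    for i in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[i:i + frame_len] * window
        spec = np.fft.rfft(frame)
        # Subtract the noise estimate, flooring the magnitude at zero.
        mag = np.maximum(np.abs(spec) - noise_mag, 0.0)
        # Rebuild the frame with the original phase and overlap-add.
        out[i:i + frame_len] += np.fft.irfft(
            mag * np.exp(1j * np.angle(spec)), frame_len)
    return out
```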
in this embodiment, referring to fig. 1, the speech emotion analysis unit includes a speech database and a speech signal comparison unit.
For the voice database, different emotions are first defined, reference voice signals produced under each emotion by speakers of different age groups and genders are obtained, and the waveform and frequency of each reference voice signal are recorded in the database.
The voice signal comparison unit obtains the waveform and frequency of the voice signal processed by the voice signal processing unit, compares them against the waveform and frequency of each reference voice signal stored in the voice database for the age group and gender of the speaker, finds the reference voice signal that best matches the current voice signal, and reports the emotion of the current voice signal as the emotion to which that best-matching reference signal belongs.
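As a rough illustration, the sketch below reduces each voice signal to a coarse magnitude-spectrum profile and picks the nearest reference recorded for the speaker's age group and gender. The database field names and the Euclidean distance are assumptions; the patent says only that waveform and frequency are compared.

```python
import numpy as np

def profile(signal, n_bands=32):
    """Reduce a signal to a coarse magnitude-spectrum profile."""
    spec = np.abs(np.fft.rfft(signal))
    bands = np.array_split(spec, n_bands)
    return np.array([b.mean() for b in bands])

def best_match(signal, database, age_group, gender):
    """database: list of dicts such as
    {'age_group': ..., 'gender': ..., 'emotion': ..., 'profile': ...}
    (a hypothetical layout). Returns the emotion of the closest reference."""
    candidates = [r for r in database
                  if r['age_group'] == age_group and r['gender'] == gender]
    query = profile(signal)
    closest = min(candidates,
                  key=lambda r: np.linalg.norm(query - r['profile']))
    return closest['emotion']
```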
Mode three, online mode.
In this mode, the recording device is replaced by a voice transcription device or a microphone. The voice transcription device has a voice data recording terminal and two voice lead-out terminals: the voice data recording terminal is connected to the voice output of the handset of the telephone, one of the two voice lead-out terminals is connected to the voice input of the telephone base unit and the other is connected to the PC, so that the voice signal picked up by the handset is led out to the PC through one voice lead-out terminal and to the telephone base unit through the other.
The microphone is connected to the PC, and the PC collects the voice data obtained by either the voice transcription device or the microphone.
After the voice data has been collected by the voice transcription device or the microphone, the subsequent processing is the same as in mode two.
Example 2
This embodiment differs from embodiment 1 in the configuration of the speech emotion analysis unit; in this embodiment, the speech emotion analysis unit comprises:
a validity analysis unit for performing emotion-parameter validity analysis on the voice signal provided by the voice signal processing unit, so as to extract speech emotion feature parameters;
a feature parameter classifier for dividing the speech emotion feature parameters obtained by the validity analysis unit into short-time features and long-time features, wherein the pitch frequency, the short-time energy, two Mel-frequency cepstral coefficients and five Mel-frequency sub-band energies serve as the short-time features, and the Mel energy-spectrum dynamic coefficient serves as the long-time feature;
a short-time feature processing unit for training on and recognizing the short-time features with a hidden Markov model, so as to obtain the emotion corresponding to the current voice information;
and a long-time feature processing unit for training on and recognizing the long-time features with a support vector machine, so as to obtain the emotion corresponding to the current voice information.
Other principles and operation of this embodiment are the same as those of embodiment 1.
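The patent names this feature split and the two classifiers but gives no further algorithmic detail, so the following sketch fills the gap with assumed tooling: librosa for feature extraction, hmmlearn for the hidden Markov models, and scikit-learn for the support vector machine. Every parameter value (pitch range, frame sizes, three HMM states) is an illustrative assumption.

```python
import numpy as np
import librosa
from hmmlearn import hmm
from sklearn.svm import SVC

def short_time_features(y, sr):
    """Frame-level features: pitch, short-time energy, 2 MFCCs, 5 Mel bands."""
    pitch = librosa.yin(y, fmin=50, fmax=500, sr=sr)
    energy = librosa.feature.rms(y=y)[0]
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=2)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=5)
    n = min(len(pitch), len(energy), mfcc.shape[1], mel.shape[1])
    return np.vstack([pitch[:n], energy[:n], mfcc[:, :n], mel[:, :n]]).T

def long_time_feature(y, sr):
    """Utterance-level dynamics of the Mel energy spectrum (one vector)."""
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=5)
    return librosa.feature.delta(mel).mean(axis=1)

def train(clips, labels, sr, emotions):
    """One HMM per emotion over short-time sequences, one SVM overall."""
    hmms = {}
    for e in emotions:
        seqs = [short_time_features(y, sr)
                for y, l in zip(clips, labels) if l == e]
        model = hmm.GaussianHMM(n_components=3, n_iter=50)
        model.fit(np.vstack(seqs), lengths=[len(s) for s in seqs])
        hmms[e] = model
    svm = SVC().fit([long_time_feature(y, sr) for y in clips], labels)
    return hmms, svm

def predict(y, sr, hmms, svm):
    """Short-time branch: highest HMM log-likelihood; long-time branch: SVM."""
    st = short_time_features(y, sr)
    short_emotion = max(hmms, key=lambda e: hmms[e].score(st))
    long_emotion = svm.predict([long_time_feature(y, sr)])[0]
    return short_emotion, long_emotion
```

In this sketch the short-time branch picks the emotion whose HMM assigns the highest log-likelihood to the frame sequence, while the long-time branch classifies one dynamics vector per utterance, mirroring the two-branch structure described above.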

Claims (4)

1. A speech emotion analysis system, comprising:
a voice transcription device having a voice data recording terminal and two voice lead-out terminals, wherein the voice data recording terminal is connected to the voice output of the handset of a telephone, one of the two voice lead-out terminals is connected to the voice input of the telephone base unit and the other is connected to a PC, so that the voice signal picked up by the handset is led out to the PC through one voice lead-out terminal and to the telephone base unit through the other;
running on the PC are:
a voice signal processing unit for removing background noise from the voice signal acquired by the PC;
and a voice emotion analysis unit for analyzing the emotion expressed by the voice signal output by the voice signal processing unit and displaying the emotion information through a display unit, the voice emotion analysis unit comprising:
a voice database, in which different emotions are first defined, reference voice signals produced under each emotion by speakers of different age groups and genders are then collected, and the waveform and frequency of each reference voice signal are recorded;
and a voice signal comparison unit for obtaining the waveform and frequency of the voice signal processed by the voice signal processing unit, comparing them against the waveform and frequency of each reference voice signal stored in the voice database for the age group and gender of the speaker, finding the reference voice signal that best matches the current voice signal, and reporting the emotion of the current voice signal as the emotion to which that best-matching reference signal belongs.
2. The system of claim 1, further comprising a microphone connected to the PC, wherein the PC collects either the voice data obtained by the voice transcription device or the voice data obtained by the microphone.
3. The system of claim 1, further comprising a recording device, wherein the recording device acquires and stores the voice signal over a period of time and transmits it to the voice signal processing unit running on the PC.
4. The system of claim 1, wherein a survey analysis unit also runs on the PC for judging the user's emotion by analyzing a questionnaire answered with exact yes-or-no responses.
CN201610415352.2A 2016-06-14 2016-06-14 Voice emotion analysis system Active CN105976820B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610415352.2A CN105976820B (en) 2016-06-14 2016-06-14 Voice emotion analysis system


Publications (2)

Publication Number Publication Date
CN105976820A (en) 2016-09-28
CN105976820B (en) 2019-12-31

Family

ID=57011457

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610415352.2A Active CN105976820B (en) 2016-06-14 2016-06-14 Voice emotion analysis system

Country Status (1)

Country Link
CN (1) CN105976820B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108510981B (en) * 2018-04-12 2020-07-24 三星电子(中国)研发中心 Method and system for acquiring voice data
US10984795B2 (en) 2018-04-12 2021-04-20 Samsung Electronics Co., Ltd. Electronic apparatus and operation method thereof
CN109063551A (en) * 2018-06-20 2018-12-21 新华网股份有限公司 Validity test method of talking and system
CN108899046A (en) * 2018-07-12 2018-11-27 东北大学 A kind of speech-emotion recognition method and system based on Multistage Support Vector Machine classification
CN111400539B (en) * 2019-01-02 2023-05-30 阿里巴巴集团控股有限公司 Voice questionnaire processing method, device and system
TWI719429B (en) * 2019-03-19 2021-02-21 瑞昱半導體股份有限公司 Audio processing method and audio processing system
CN110033778B (en) * 2019-05-07 2021-07-23 苏州市职业大学 Real-time identification and correction system for lie state
CN110488973B (en) * 2019-07-23 2020-11-10 清华大学 Virtual interactive message leaving system and method
CN110432916A (en) * 2019-08-13 2019-11-12 上海莫吉娜智能信息科技有限公司 Lie detection system and lie detecting method based on millimetre-wave radar

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101201980A (en) * 2007-12-19 2008-06-18 北京交通大学 Remote Chinese language teaching system based on voice affection identification
CN101873378A (en) * 2010-06-11 2010-10-27 湖北海山科技有限公司 Remote monitoring mobile phone based on 3G wireless network
CN104185868A (en) * 2012-01-24 2014-12-03 澳尔亚有限公司 Voice authentication and speech recognition system and method
CN103117061A (en) * 2013-02-05 2013-05-22 广东欧珀移动通信有限公司 Method and device for identifying animals based on voice
CN103886869A (en) * 2014-04-09 2014-06-25 北京京东尚科信息技术有限公司 Information feedback method and system based on speech emotion recognition
CN105869657A (en) * 2016-06-03 2016-08-17 竹间智能科技(上海)有限公司 System and method for identifying voice emotion

Also Published As

Publication number Publication date
CN105976820A (en) 2016-09-28

Similar Documents

Publication Publication Date Title
CN105976820B (en) Voice emotion analysis system
Bi et al. AutoDietary: A wearable acoustic sensor system for food intake recognition in daily life
Matos et al. An automated system for 24-h monitoring of cough frequency: the Leicester Cough Monitor
Drugman et al. Objective study of sensor relevance for automatic cough detection
US20080045805A1 (en) Method and System of Indicating a Condition of an Individual
CN111712183A (en) In-ear non-verbal audio event classification system and method
Schuller et al. Automatic recognition of physiological parameters in the human voice: Heart rate and skin conductance
Hartelius et al. Long-term phonatory instability in individuals with multiple sclerosis
CN110367934B (en) Health monitoring method and system based on non-voice body sounds
US20150313508A1 (en) Systems, Methods, and Media for Finding and Matching Tremor Signals
CN102770063A (en) Physiological signal quality classification methods and systems for ambulatory monitoring
Patil et al. The physiological microphone (PMIC): A competitive alternative for speaker assessment in stress detection and speaker verification
CN110811647B (en) Multi-channel hidden lie detection method based on ballistocardiogram signal
JP2015229040A (en) Emotion analysis system, emotion analysis method, and emotion analysis program
CN108478224A (en) Intense strain detecting system and detection method based on virtual reality Yu brain electricity
US20150164363A1 (en) Knowledge discovery based on brainwave response to external stimulation
CN113838544A (en) System, method and computer program product for providing feedback relating to medical examinations
Tran et al. Stethoscope-sensed speech and breath-sounds for person identification with sparse training data
CN205493847U (en) Pronunciation analytic system
Mahmoudi et al. Sensor-based system for automatic cough detection and classification
CN113143208A (en) Pain sensitivity assessment system and method based on multi-dimensional measurement
JP2005066044A (en) Respiratory sound data processor and program
JP3764663B2 (en) Psychosomatic diagnosis system
Freitas et al. Multimodal corpora for silent speech interaction
CN104793743A (en) Virtual social contact system and control method thereof

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant