CN105976820B - Voice emotion analysis system - Google Patents
- Publication number
- CN105976820B (application CN201610415352.2A)
- Authority
- CN
- China
- Prior art keywords
- voice
- emotion
- voice signal
- speech
- signal
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/63—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for estimating an emotional state
Abstract
The invention provides a speech emotion analysis system comprising: a voice transcription device connected to a PC, the PC running a voice signal processing unit for removing background noise from the voice signal acquired by the PC, and a voice emotion analysis unit for analyzing the emotion expressed by the processed voice signal and displaying the emotion information on a display unit. The system can covertly record a suspect's voice at any time and place, and later analysis of the recording yields a conclusion as to whether the suspect is lying.
Description
Technical Field
The invention relates to a system for analyzing emotion types corresponding to voice signals.
Background
The lie detector currently in use is an instrument that records multiple physiological responses. It assists criminal investigation by revealing the psychological state of a suspect under interrogation, helping to judge whether the suspect is connected to a case. Modern science has established that lying produces real physiological changes. Some are visible to the naked eye as unnatural body movements such as ear-scratching or leg- and foot-shaking. Others are harder to perceive: abnormal respiratory rate and volume, with episodes of suppressed breathing or breath-holding; a faster pulse, raised blood pressure, increased cardiac output and altered blood composition, making the skin of the face and neck noticeably pale or flushed; increased subcutaneous sweat-gland secretion, with sweating appearing first between the eyes or on the upper lip and most pronounced on the fingers and palms; dilated pupils; stomach contraction and abnormal secretion of digestive fluids, leaving the mouth, tongue and lips dry; and muscle tension and trembling that can cause loss of speech. These physiological parameters are governed by the autonomic nervous system: they are generally not under conscious control but respond autonomously, producing a series of conditioned-reflex phenomena under external stimulation.
To capture these phenomena, the lie detector was invented. A modern lie detector consists of sensors, a host unit and a microcomputer. The sensors are attached to the subject's body surface to collect changes in physiological parameters; the host unit, an electronic assembly, converts the analog signals from the sensors into digital signals; the microcomputer stores and analyzes the digital signals to produce the lie-detection result. The existing lie detector is therefore bulky: a test can only be carried out by bringing the suspect to a specific location, and sensors must be attached to the suspect, so covert lie detection is impossible.
Disclosure of Invention
The invention aims to provide a system capable of covert lie detection of a suspect at any time and place.
In order to achieve the above object, a technical solution of the present invention is to provide a speech emotion analysis system, including:
a voice transcription device having a voice data recording terminal and two voice output terminals; the recording terminal is connected to the voice output of the telephone receiver, one output terminal is connected to the voice input of the telephone base unit, and the other is connected to a PC, so that the voice signal picked up by the receiver is routed both to the PC through one output terminal and to the telephone base unit through the other;
running on the PC are:
a voice signal processing unit for removing background noise from the voice signal acquired by the PC;
and a voice emotion analysis unit for analyzing the emotion expressed by the voice signal processed by the voice signal processing unit and displaying the emotion information on a display unit.
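The patent does not specify how the voice signal processing unit removes background noise. As an illustrative sketch only, a minimal spectral-subtraction denoiser (a common classical choice, not necessarily the patent's method) that assumes a separately captured noise-only sample might look like this:

```python
import numpy as np

def spectral_subtraction(noisy, noise_sample, frame_len=256):
    """Suppress stationary background noise by subtracting an estimated
    noise magnitude spectrum from each frame, keeping the noisy phase.
    `noise_sample` is a stretch of signal known to contain noise only."""
    noise_mag = np.abs(np.fft.rfft(noise_sample[:frame_len]))
    out = np.zeros(len(noisy))
    for start in range(0, len(noisy) - frame_len + 1, frame_len):
        frame = noisy[start:start + frame_len]
        spec = np.fft.rfft(frame)
        # Subtract the noise magnitude estimate, flooring at zero.
        mag = np.maximum(np.abs(spec) - noise_mag, 0.0)
        out[start:start + frame_len] = np.fft.irfft(
            mag * np.exp(1j * np.angle(spec)), n=frame_len)
    return out
```

In practice a real front end would use overlapping windows and a smoothed noise estimate; the frame-by-frame version above only shows the core idea.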
Preferably, the system further comprises a microphone connected to the PC; the PC collects voice data from either the voice transcription device or the microphone.
Preferably, the system further comprises a voice recording device, which acquires and stores the voice signal over a period of time and transfers it to the voice signal processing unit running on the PC.
Preferably, the speech emotion analyzing unit includes:
a voice database, built by first defining different emotions, then collecting reference voice signals produced by people of different age groups and genders under each emotion, and recording the waveform and frequency of each reference signal in the database;
and a voice signal comparison unit, which obtains the waveform and frequency of the voice signal processed by the voice signal processing unit, compares them with the waveform and frequency of each stored reference signal matching the speaker's age group and gender, finds the reference signal that best matches the current voice signal, and reports the emotion of that reference signal as the emotion of the current voice signal.
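The comparison step described above amounts to a nearest-neighbour lookup keyed by age group and gender. The sketch below is a hypothetical reduction of it: the database layout, the two-number feature vector (mean pitch, energy) standing in for "waveform and frequency", and the Euclidean distance are all illustrative assumptions, since the patent does not define the matching criterion:

```python
import numpy as np

# Hypothetical reference database: for each (age group, gender) key,
# a list of (emotion label, feature vector) pairs. The feature vector
# here is (mean pitch in Hz, mean energy) -- an illustrative stand-in.
REFERENCE_DB = {
    ("adult", "female"): [
        ("calm",   np.array([210.0, 0.02])),
        ("angry",  np.array([280.0, 0.09])),
        ("afraid", np.array([250.0, 0.05])),
    ],
}

def match_emotion(age_group, gender, features):
    """Return the emotion of the closest reference signal for the
    speaker's age group and gender (nearest-neighbour sketch)."""
    candidates = REFERENCE_DB[(age_group, gender)]
    best = min(candidates, key=lambda c: np.linalg.norm(c[1] - features))
    return best[0]
```

For example, a query vector of (275 Hz, 0.08) for an adult female speaker would match the "angry" reference above.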
Preferably, the speech emotion analyzing unit includes:
a validity analysis unit for performing emotion-parameter validity analysis on the voice signal supplied by the voice signal processing unit, so as to extract speech emotion feature parameters;
a feature parameter classifier for dividing the speech emotion feature parameters obtained by the validity analysis unit into short-time and long-time features: pitch frequency, short-time energy, two Mel-frequency cepstral coefficients and five Mel-frequency sub-band energies serve as short-time features, and the Mel energy-spectrum dynamic coefficient serves as the long-time feature;
a short-time feature processing unit for training and recognition on the short-time features using a hidden Markov model, yielding the emotion of the current voice;
and a long-time feature processing unit for training and recognition on the long-time features using a support vector machine, likewise yielding the emotion of the current voice.
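Two of the short-time features named above, short-time energy and pitch frequency, can be computed per frame as follows. This is an illustrative sketch only: the MFCC and Mel sub-band terms, and the HMM/SVM training stages themselves, are omitted, and the autocorrelation pitch tracker and 60-400 Hz search range are assumptions rather than anything the patent specifies:

```python
import numpy as np

def short_time_features(frame, sr=8000):
    """Compute short-time energy and pitch frequency for one frame.
    Pitch is estimated as the autocorrelation peak within a plausible
    speaking range (60-400 Hz)."""
    energy = float(np.sum(frame ** 2))
    # One-sided autocorrelation: ac[l] correlates the frame with itself
    # shifted by l samples.
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = sr // 400, sr // 60          # lag bounds for 60-400 Hz
    lag = lo + int(np.argmax(ac[lo:hi]))  # strongest periodicity
    pitch = sr / lag
    return pitch, energy
```

Frame-wise sequences of such vectors would then feed the HMM (short-time path), while utterance-level dynamic coefficients would feed the SVM (long-time path).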
Preferably, a survey analysis unit also runs on the PC, for judging the user's emotion by analyzing a questionnaire answered with definite yes/no responses.
The voice emotion analysis system provided by the invention can covertly record a suspect's voice at any time and place; later analysis of the recording yields a conclusion as to whether the suspect is lying.
Drawings
FIG. 1 is a system block diagram of an embodiment of a speech emotion analysis system provided by the present invention.
Detailed Description
In order to make the invention more comprehensible, preferred embodiments are described in detail below with reference to the accompanying drawings.
Example 1
The speech emotion analysis system provided by the invention can operate in three modes:
Mode one: survey mode.
In this mode, the user completes the questionnaire displayed on the PC with definite answers, using the survey analysis unit running on the PC; the survey analysis unit then analyzes the answers and judges the user's current emotion.
Mode two: offline mode
In this mode, the system is configured as shown in fig. 1 and comprises a recording device and a PC. The recording device acquires and stores the voice signal over a period of time and transfers it to the PC. The recording device is a portable voice recorder or a recording device connected to a telephone line.
A voice signal processing unit and a voice emotion analysis unit run on the PC. The voice signal processing unit removes background noise from the voice signal collected by the PC.
in this embodiment, referring to fig. 1, the speech emotion analysis unit includes a speech database and a speech signal comparison unit.
For the voice database, different emotions are defined firstly, reference voice signals given by people of different ages and different sexes under different emotions are obtained, and the waveform and the frequency of each reference voice signal are recorded in the voice database.
For the voice signal comparison unit, the waveform and the frequency of the voice signal processed by the voice signal processing unit are obtained, the obtained waveform and the frequency are compared with the waveform and the frequency of each reference voice signal stored in the voice database and corresponding to the age group and the gender of the sender of the voice signal, a reference voice signal which is most matched with the current voice signal is obtained, and the emotion contained in the current voice signal is given according to the emotion to which the obtained reference voice signal belongs.
Mode three: online mode
In this mode, the recording device is replaced by a voice transcription device or a microphone. The voice transcription device has a voice data recording terminal and two voice output terminals: the recording terminal is connected to the voice output of the telephone receiver, one output terminal is connected to the voice input of the telephone base unit, and the other is connected to the PC, so that the voice signal picked up by the receiver is routed both to the PC and to the telephone base unit.
The microphone is connected to the PC; the PC collects voice data from either the voice transcription device or the microphone.
After the voice data is collected by the voice transcription device or the microphone, subsequent processing is the same as in mode two.
Example 2
This embodiment differs from embodiment 1 only in the configuration of the voice emotion analysis unit, which here comprises:
a validity analysis unit for performing emotion-parameter validity analysis on the voice signal supplied by the voice signal processing unit, so as to extract speech emotion feature parameters;
a feature parameter classifier for dividing the speech emotion feature parameters obtained by the validity analysis unit into short-time and long-time features: pitch frequency, short-time energy, two Mel-frequency cepstral coefficients and five Mel-frequency sub-band energies serve as short-time features, and the Mel energy-spectrum dynamic coefficient serves as the long-time feature;
a short-time feature processing unit for training and recognition on the short-time features using a hidden Markov model, yielding the emotion of the current voice;
and a long-time feature processing unit for training and recognition on the long-time features using a support vector machine, likewise yielding the emotion of the current voice.
Other principles and operation of this embodiment are the same as those of embodiment 1.
Claims (4)
1. A speech emotion analysis system, comprising:
a voice transcription device having a voice data recording terminal and two voice output terminals; the recording terminal is connected to the voice output of the telephone receiver, one output terminal is connected to the voice input of the telephone base unit, and the other is connected to a PC, so that the voice signal picked up by the receiver is routed both to the PC through one output terminal and to the telephone base unit through the other;
running on the PC are:
a voice signal processing unit for removing background noise from the voice signal acquired by the PC;
and a speech emotion analysis unit, which analyzes the emotion expressed by the voice signal processed by the voice signal processing unit and displays the emotion information through a display unit, the speech emotion analysis unit comprising:
a voice database, built by first defining different emotions, then collecting reference voice signals produced by people of different age groups and genders under each emotion, and recording the waveform and frequency of each reference signal in the database;
and a voice signal comparison unit, which obtains the waveform and frequency of the voice signal processed by the voice signal processing unit, compares them with the waveform and frequency of each stored reference signal matching the speaker's age group and gender, finds the reference signal that best matches the current voice signal, and reports the emotion of that reference signal as the emotion of the current voice signal.
2. The system of claim 1, further comprising a microphone connected to the PC, wherein the PC collects voice data from either the voice transcription device or the microphone.
3. The system of claim 1, further comprising a recording device, wherein the recording device acquires and stores the voice signal over a period of time and transmits it to the voice signal processing unit running on the PC.
4. The system of claim 1, wherein a survey analysis unit also runs on the PC, for judging the user's emotion by analyzing a questionnaire answered with definite yes/no responses.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN201610415352.2A | 2016-06-14 | 2016-06-14 | Voice emotion analysis system
Publications (2)
Publication Number | Publication Date
---|---
CN105976820A | 2016-09-28
CN105976820B | 2019-12-31
Family
ID=57011457
Family Applications (1)
Application Number | Title | Priority Date | Filing Date
---|---|---|---
CN201610415352.2A (CN105976820B, Active) | Voice emotion analysis system | 2016-06-14 | 2016-06-14
Country Status (1)
Country | Link
---|---
CN | CN105976820B
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
CN108510981B * | 2018-04-12 | 2020-07-24 | Samsung Electronics (China) R&D Center | Method and system for acquiring voice data
US10984795B2 | 2018-04-12 | 2021-04-20 | Samsung Electronics Co., Ltd. | Electronic apparatus and operation method thereof
CN109063551A * | 2018-06-20 | 2018-12-21 | Xinhuanet Co., Ltd. | Conversation validity test method and system
CN108899046A * | 2018-07-12 | 2018-11-27 | Northeastern University | Speech emotion recognition method and system based on multi-stage support vector machine classification
CN111400539B * | 2019-01-02 | 2023-05-30 | Alibaba Group Holding Limited | Voice questionnaire processing method, device and system
TWI719429B * | 2019-03-19 | 2021-02-21 | Realtek Semiconductor Corp. | Audio processing method and audio processing system
CN110033778B * | 2019-05-07 | 2021-07-23 | Suzhou Vocational University | Real-time identification and correction system for lie state
CN110488973B * | 2019-07-23 | 2020-11-10 | Tsinghua University | Virtual interactive message leaving system and method
CN110432916A * | 2019-08-13 | 2019-11-12 | Shanghai Mojina Intelligent Information Technology Co., Ltd. | Lie detection system and lie detection method based on millimeter-wave radar
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
CN101201980A * | 2007-12-19 | 2008-06-18 | Beijing Jiaotong University | Remote Chinese language teaching system based on speech emotion recognition
CN101873378A * | 2010-06-11 | 2010-10-27 | Hubei Haishan Technology Co., Ltd. | Remote monitoring mobile phone based on 3G wireless network
CN103117061A * | 2013-02-05 | 2013-05-22 | Guangdong OPPO Mobile Telecommunications Co., Ltd. | Method and device for identifying animals based on voice
CN103886869A * | 2014-04-09 | 2014-06-25 | Beijing Jingdong Shangke Information Technology Co., Ltd. | Information feedback method and system based on speech emotion recognition
CN104185868A * | 2012-01-24 | 2014-12-03 | Auraya Pty Ltd | Voice authentication and speech recognition system and method
CN105869657A * | 2016-06-03 | 2016-08-17 | Zhujian Intelligent Technology (Shanghai) Co., Ltd. | System and method for identifying voice emotion
Legal Events
Code | Title
---|---
C06 | Publication
PB01 | Publication
C10 | Entry into substantive examination
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant