CN109272986A - A kind of dog sound sensibility classification method based on artificial neural network - Google Patents
- Publication number
- CN109272986A (application number CN201810995254.XA)
- Authority
- CN
- China
- Prior art keywords
- sound
- dog
- frame
- neural network
- zero
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS / G10—MUSICAL INSTRUMENTS; ACOUSTICS / G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING / G10L15/00—Speech recognition
- G10L15/16—Speech classification or search using artificial neural networks
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
- G10L15/063—Training (under G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice)
- G10L15/08—Speech classification or search
- G10L15/25—Speech recognition using non-acoustical features: position of the lips, movement of the lips or face analysis
Abstract
The present invention relates to a dog sound emotion classification method based on an artificial neural network, and belongs to the field of audio signal processing. The invention studies the relationships between the short-time energy, zero-crossing rate and pitch period of the sound signal and the dog's emotion. These three features are effective in distinguishing four emotions: happiness, anger, pain and fear. They are extracted as parameters for emotion classification, a sound emotion model is obtained by training a BP neural network, and the model is then used to classify the dog's vocal emotions. Finally, classification errors are corrected by comparing the results against the dog's facial-expression features, reducing the misrecognition rate. The algorithm of the invention is simple, theoretically clear, and easy to implement.
Description
Technical field
The present invention relates to a dog sound emotion classification method based on an artificial neural network, and belongs to the technical field of audio signal processing.
Background technique
Considerable progress has been made in research on human speech emotion, but research on the vocal emotion of animals is still at a blank stage. The vocalization frequencies of most animals differ greatly from those of humans, but the vocalization frequency of dogs is very close to the human range, so the vocal emotions of dogs and of humans are similar in character, and dogs likewise experience emotions such as happiness, pain, anger and fear. This does not mean, however, that methods from human speech-emotion research can be applied directly to dog sound emotion classification: sound has many characteristic parameters, but not all of them reflect emotion.
Summary of the invention
The technical problem to be solved by the present invention is to provide a dog sound emotion classification method based on an artificial neural network. Short-time energy, zero-crossing rate and pitch period are extracted, and the extracted sound feature parameters are used to train an artificial neural network. The trained model classifies the dog's emotional sounds automatically, and the classification results are finally corrected using the dog's facial-expression features, reducing the misclassification rate.
The technical solution adopted by the present invention is a dog sound emotion classification method based on an artificial neural network, comprising the following steps:
(1) Dog sound and expression acquisition: collect sounds for the four emotions of happiness, pain, anger and fear, and collect facial-expression images corresponding to the four emotions.
(2) Sound preprocessing: mainly pre-emphasis, framing and windowing.
(3) Sound feature parameter extraction: extract feature parameters from the sound sequence to be measured, namely short-time energy, zero-crossing rate and pitch period; these feature parameters can effectively separate the four emotional sounds.
(4) Expression feature parameter extraction: extract expression feature parameters, i.e. local texture feature parameters, from the corresponding facial images; the extracted expression features can effectively distinguish the dog's emotions.
(5) Build the artificial neural network: obtain the dog sound emotion classification model by training with the BP neural network algorithm.
(6) Training and testing: use 80% of the collected sound samples as the training set and 20% as the test set.
(7) Correct the model's misclassifications using the expression features of the images.
Specifically, in the audio acquisition of step (1), the sampling frequency satisfies the Nyquist sampling theorem, f_s ≥ 2f_h, where f_h is the highest frequency in the signal. The channel count is set to mono, the sampling frequency to 4.8 kHz, and the quantization precision to 16 bits. The dog's facial expressions are captured by photographing, and each collected sound is labeled with the expression corresponding to it.
Specifically, the preprocessing of step (2) comprises the following steps:
(2.1) Pre-emphasis: boosting the high-frequency part of the audio spectrum makes the spectrum flatter. It can usually be realized in two ways, by an analog circuit or by a digital circuit, and is generally implemented with a first-order high-pass digital filter whose transfer function is H(z) = 1 − αz⁻¹, where α lies in the range [0.9, 1.0] and is usually taken as 0.95.
(2.2) Framing: because the sound signal is only short-term stationary, it must be divided into frames so that each frame can be processed as a stationary signal. To reduce the variation between adjacent frames, consecutive frames overlap. A frame length of 25 ms is typical, with a frame shift of half the frame length.
(2.3) Windowing: windowing keeps the signal more continuous at the frame boundaries before Fourier expansion and avoids the Gibbs effect; after windowing, the originally aperiodic sound signal takes on some characteristics of a periodic function. In speech signal analysis, common window functions include the rectangular window, the Hanning window and the Hamming window.
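For illustration only (the function names are ours, not the patent's), framing with 50% overlap and a Hamming window can be sketched as:

```python
import math

def frame_signal(x, frame_len, hop):
    """Split a signal into overlapping frames; hop = frame_len // 2 gives 50% overlap."""
    return [x[i:i + frame_len] for i in range(0, len(x) - frame_len + 1, hop)]

def hamming(N):
    """Hamming window of length N: w[n] = 0.54 - 0.46 * cos(2*pi*n / (N-1))."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]

frames = frame_signal(list(range(10)), 4, 2)
print(len(frames))  # frames start at samples 0, 2, 4, 6, so 4 frames
windowed = [[w * s for w, s in zip(hamming(4), f)] for f in frames]
```

In the embodiment below the frame length is 128 samples and the hop 64, which is exactly this 50%-overlap scheme.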
Specifically, the sound feature parameter extraction of step (3) comprises the following steps:
(3.1) Short-time energy extraction: the short-time energy is the energy of one frame of the sound signal, from which the amplitude characteristics of the signal can be observed. Let the sound signal be x(n) and let the l-th frame obtained after preprocessing be x_l(n); the short-time energy is then
E_l = Σ x_l(n)² (n = 0, 1, …, N−1),
where E_l is the short-time energy of the l-th frame of the sound signal and N is the length of one frame.
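The short-time energy of a frame is a one-line computation; this sketch is purely illustrative:

```python
def short_time_energy(frame):
    """E_l = sum of squared samples over one frame."""
    return sum(s * s for s in frame)

print(short_time_energy([3.0, -4.0]))  # 9.0 + 16.0 = 25.0
```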
(3.2) Zero-crossing rate extraction: the short-time zero-crossing rate is a common time-domain feature of a sound signal; it is the number of times the signal passes through zero within a short interval. The method differs for continuous and discrete signals: the zero-crossing rate of a continuous signal can be obtained by inspecting its waveform, while for a discrete signal it is obtained by counting the sign changes between successive samples. The number of zero crossings per unit time is called the average zero-crossing rate.
The short-time average zero-crossing count z_l of the frame signal x_l(n) is defined as
z_l = (1/2) Σ |sgn[x_l(n)] − sgn[x_l(n−1)]| (n = 1, …, N−1),
where x_l(n) is the l-th frame of the sound signal, N is the length of one frame, z_l is the short-time zero-crossing count of the l-th frame, and sgn[·] is the sign function:
sgn[x] = 1 for x ≥ 0, and sgn[x] = −1 for x < 0.
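A hypothetical sketch (our own helper names) of the zero-crossing count, using the sign function defined above:

```python
def sgn(v):
    """Sign function as defined above: 1 for v >= 0, -1 otherwise."""
    return 1 if v >= 0 else -1

def zero_crossings(frame):
    """z = (1/2) * sum over n of |sgn(x[n]) - sgn(x[n-1])|."""
    return sum(abs(sgn(frame[n]) - sgn(frame[n - 1]))
               for n in range(1, len(frame))) // 2

print(zero_crossings([1, -1, 1, -1]))  # alternating signs: 3 crossings
```

Each sign flip contributes |1 − (−1)| = 2 to the sum, and the factor 1/2 converts that back into a count of crossings.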
(3.3) Pitch period extraction: compute the short-time autocorrelation function of each frame,
R(k) = Σ x_i(m) x_i(m+k),
where x_i(m) is the windowed sound signal and k is the time lag, and estimate the pitch period of each frame from it.
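A hedged sketch of autocorrelation-based pitch estimation: the usual approach is to take the lag at which R(k) peaks within a plausible range. The search range [k_min, k_max] is our assumption; the patent does not specify it.

```python
import math

def autocorr(frame, k):
    """R(k) = sum over m of x[m] * x[m + k] within one frame."""
    return sum(frame[m] * frame[m + k] for m in range(len(frame) - k))

def pitch_period(frame, k_min, k_max):
    """Take the lag maximizing R(k) over a plausible range as the pitch period."""
    return max(range(k_min, k_max + 1), key=lambda k: autocorr(frame, k))

fs = 4800  # the sampling rate used in the patent
frame = [math.sin(2 * math.pi * 200 * n / fs) for n in range(240)]  # 200 Hz tone
print(pitch_period(frame, 10, 100))  # 24 samples, i.e. 4800 / 200
```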
Specifically, in step (4), expression feature parameters are extracted using the LBP (local binary pattern) algorithm:
LBP is a local texture extraction algorithm for images; it preserves grayscale information and captures the relationship between a pixel and the values in its neighborhood.
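The basic 3×3 LBP operator, in its common textbook form, is sketched below; the patent does not specify which LBP variant is used, so this is an assumption for illustration:

```python
def lbp_code(patch):
    """Basic 3x3 LBP: threshold the 8 neighbours against the centre pixel
    and pack the comparison bits into one 8-bit code."""
    center = patch[1][1]
    # neighbours taken clockwise starting from the top-left corner
    neighbours = [patch[0][0], patch[0][1], patch[0][2], patch[1][2],
                  patch[2][2], patch[2][1], patch[2][0], patch[1][0]]
    return sum(1 << i for i, p in enumerate(neighbours) if p >= center)

patch = [[9, 9, 9],
         [1, 5, 1],
         [1, 1, 1]]
print(lbp_code(patch))  # only the top row exceeds the centre: bits 0-2 set -> 7
```

A histogram of these codes over an image region is what serves as the local texture feature vector.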
Specifically, the BP neural network algorithm of step (5) is as follows:
Let the input layer have n nodes, the hidden layer l nodes, and the output layer m nodes. The weight from input node i to hidden node j is w_ij, the weight from hidden node j to output node k is w_jk, the bias of hidden node j is a_j, and the bias of output node k is b_k. The learning rate is η, and the excitation function g(x) is the sigmoid function:
g(x) = 1 / (1 + e^(−x)).
The output of hidden node j is
H_j = g(Σ_i w_ij x_i + a_j).
The output of output node k is
O_k = Σ_j w_jk H_j + b_k.
The error is calculated as
E = (1/2) Σ_k (Y_k − O_k)²,
where Y_k is the desired output; writing e_k = Y_k − O_k, E can be expressed as E = (1/2) Σ_k e_k². In these formulas, i = 1, …, n, j = 1, …, l, k = 1, …, m.
The weight update formulas are:
w_ij ← w_ij + η H_j (1 − H_j) x_i Σ_k w_jk e_k,
w_jk ← w_jk + η H_j e_k.
The bias update formulas are:
a_j ← a_j + η H_j (1 − H_j) Σ_k w_jk e_k,
b_k ← b_k + η e_k.
Finally, judge whether the algorithm iteration has finished: there are several ways to decide convergence, commonly either running a specified number of iterations or checking whether the difference between two consecutive errors falls below a specified value.
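A minimal runnable sketch of one BP update step, assuming (as the formulas above suggest) a sigmoid hidden layer and a linear output layer; the variable names follow the patent's notation, but the code is illustrative, not the patented implementation:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def bp_step(x, y, w_ih, w_ho, a, b, eta=0.1):
    """One forward/backward pass; returns the squared error E before the update."""
    n, l, m = len(x), len(a), len(b)
    # forward pass: H_j = g(sum_i w_ij x_i + a_j), O_k = sum_j w_jk H_j + b_k
    H = [sigmoid(sum(w_ih[i][j] * x[i] for i in range(n)) + a[j]) for j in range(l)]
    O = [sum(w_ho[j][k] * H[j] for j in range(l)) + b[k] for k in range(m)]
    e = [y[k] - O[k] for k in range(m)]
    # hidden-layer deltas through the sigmoid derivative H(1 - H)
    d = [H[j] * (1 - H[j]) * sum(w_ho[j][k] * e[k] for k in range(m)) for j in range(l)]
    # weight and bias updates, matching the update formulas in the text
    for j in range(l):
        for k in range(m):
            w_ho[j][k] += eta * H[j] * e[k]
        for i in range(n):
            w_ih[i][j] += eta * x[i] * d[j]
        a[j] += eta * d[j]
    for k in range(m):
        b[k] += eta * e[k]
    return 0.5 * sum(ek * ek for ek in e)

# toy check: the error shrinks over repeated updates on a single sample
w_ih = [[0.1, -0.2], [0.05, 0.1], [0.0, 0.2]]
w_ho = [[0.1], [-0.1]]
a, b = [0.0, 0.0], [0.0]
errors = [bp_step([0.5, -0.3, 0.8], [1.0], w_ih, w_ho, a, b) for _ in range(50)]
print(errors[-1] < errors[0])  # True
```

The hidden deltas are computed before any weights are changed, so the backward pass uses the same weights as the forward pass, as standard backpropagation requires.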
The beneficial effects of the present invention are: the invention can be applied in the field of audio recognition. Compared with the prior art, the neural network has good self-learning, self-organization and fault tolerance, and its calculation process is relatively simple. Three characteristic parameters are chosen for the sound characteristics of dogs, and combining them with expression features reduces the classification misrecognition rate.
Detailed description of the invention
Fig. 1 is the overall flow chart of the present invention;
Fig. 2 is the BP neural network training flow chart of the present invention.
Specific embodiment
The invention is described in further detail below with reference to the drawings and specific embodiments, but the protection scope of the present invention is not limited to this content.
Embodiment 1: as shown in Figs. 1 and 2, a dog sound emotion recognition method based on an artificial neural network comprises the following steps:
(1) Dog sound and expression acquisition: collect sounds for the four emotions of happiness, pain, anger and fear, and collect facial-expression images corresponding to the four emotions.
(2) The sampling frequency satisfies the Nyquist sampling theorem, f_s ≥ 2f_h, where f_h is the highest frequency in the signal. The channel count is set to mono, the sampling frequency to 4.8 kHz, and the quantization precision to 16 bits.
(3) The pre-emphasis filter parameter α is taken as 0.95, the frame length used for framing is 128 samples, the frame shift is 64 samples, and the window function is the Hamming window.
(4) Pre-emphasis: boosting the high-frequency part of the speech spectrum makes the spectrum flatter. It can usually be realized in two ways, by an analog circuit or by a digital circuit, and is generally implemented with a first-order high-pass digital filter whose transfer function is H(z) = 1 − αz⁻¹, where α lies in the range [0.9, 1.0] and is usually taken as 0.95.
(5) Framing: because the sound signal is only short-term stationary, it must be divided into frames so that each frame can be processed as a stationary signal. To reduce the variation between adjacent frames, consecutive frames overlap. A frame length of 25 ms is typical, with a frame shift of half the frame length.
(6) Windowing: windowing keeps the signal more continuous at the frame boundaries before Fourier expansion and avoids the Gibbs effect; after windowing, the originally aperiodic sound signal takes on some characteristics of a periodic function. In speech signal analysis, common window functions include the rectangular window, the Hanning window and the Hamming window.
(7) Read the preprocessed data: this step is implemented in software.
(8) Short-time energy extraction: the short-time energy is the energy of one frame of the sound signal, from which the amplitude characteristics of the signal can be observed. Let the sound signal be x(n) and let the l-th frame obtained after preprocessing be x_l(n); the short-time energy is then E_l = Σ x_l(n)² (n = 0, 1, …, N−1), where E_l is the short-time energy of the l-th frame of the sound signal and N is the length of one frame.
(9) Zero-crossing rate extraction: the short-time zero-crossing rate is a common time-domain feature of a sound signal; it is the number of times the signal passes through zero within a short interval. The method differs for continuous and discrete signals: the zero-crossing rate of a continuous signal can be obtained by inspecting its waveform, while for a discrete signal it is obtained by counting the sign changes between successive samples. The number of zero crossings per unit time is called the average zero-crossing rate.
The short-time average zero-crossing count z_l of the frame signal x_l(n) is defined as z_l = (1/2) Σ |sgn[x_l(n)] − sgn[x_l(n−1)]| (n = 1, …, N−1), where x_l(n) is the l-th frame of the sound signal, N is the length of one frame, z_l is the short-time zero-crossing count of the l-th frame, and sgn[·] is the sign function: sgn[x] = 1 for x ≥ 0, and sgn[x] = −1 for x < 0.
(10) Pitch period extraction: compute the short-time autocorrelation function of each frame, R(k) = Σ x_i(m) x_i(m+k), where x_i(m) is the windowed sound signal and k is the time lag, and estimate the pitch period of each frame from it.
(11) Expression feature parameter extraction: expression feature parameters are extracted using the LBP algorithm. LBP is a local texture extraction algorithm for images; it preserves grayscale information and captures the relationship between a pixel and the values in its neighborhood.
(12) Build the artificial neural network: let the input layer have n nodes and the hidden layer l nodes; the output layer has m nodes, where m = 4 (one per emotion). The weight from input node i to hidden node j is w_ij, the weight from hidden node j to output node k is w_jk, the bias of hidden node j is a_j, and the bias of output node k is b_k. The learning rate η is set to 0.01, and the excitation function g(x) is the sigmoid function:
g(x) = 1 / (1 + e^(−x)).
The output of hidden node j is H_j = g(Σ_i w_ij x_i + a_j).
The output of output node k is O_k = Σ_j w_jk H_j + b_k.
The error is calculated as E = (1/2) Σ_k (Y_k − O_k)², where Y_k is the desired output; writing e_k = Y_k − O_k, E can be expressed as E = (1/2) Σ_k e_k². In these formulas, i = 1, …, n, j = 1, …, l, k = 1, …, m.
The weight update formulas are:
w_ij ← w_ij + η H_j (1 − H_j) x_i Σ_k w_jk e_k,
w_jk ← w_jk + η H_j e_k.
The bias update formulas are:
a_j ← a_j + η H_j (1 − H_j) Σ_k w_jk e_k,
b_k ← b_k + η e_k.
Finally, judge whether the algorithm iteration has finished: there are several ways to decide convergence, commonly either running a specified number of iterations or checking whether the difference between two consecutive errors falls below a specified value.
(13) Training and testing: the sound samples are divided into two independent parts, a training set and a test set. The training set is used to train the model so that the neural network meets the expected requirements, and the test set is used to evaluate the model. The training set accounts for 80% of the samples and the test set for 20%, with both parts drawn at random from the samples.
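The random 80/20 split can be sketched as follows; the fixed seed is an illustrative choice, not something the patent specifies:

```python
import random

def split_samples(samples, train_frac=0.8, seed=42):
    """Randomly partition samples into a training set and a test set."""
    rng = random.Random(seed)
    idx = list(range(len(samples)))
    rng.shuffle(idx)
    cut = int(len(samples) * train_frac)
    return [samples[i] for i in idx[:cut]], [samples[i] for i in idx[cut:]]

train, test = split_samples(list(range(100)))
print(len(train), len(test))  # 80 20
```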
(14) Error correction: classifying emotional sounds with the trained model inevitably produces some wrong results; correcting the classification results with the expression features of the images reduces the misclassification rate.
For the vocal characteristics of dogs, this invention chooses three characteristic parameters, extracts them for classification, and combines them with the dog's expression features: the classification results are corrected by the expression features, reducing the classification error rate.
The embodiments of the present invention have been explained in detail above with reference to the drawings, but the present invention is not limited to the above embodiments; within the scope of knowledge of a person skilled in the art, various changes can also be made without departing from the concept of the invention.
Claims (6)
1. A dog sound emotion classification method based on an artificial neural network, characterized by comprising the following steps:
(1) dog sound and expression acquisition: collecting sounds for the four emotions of happiness, pain, anger and fear, and collecting facial-expression images corresponding to the four emotions;
(2) sound preprocessing: including pre-emphasis, framing and windowing;
(3) sound feature parameter extraction: extracting feature parameters from the sound sequence to be measured, namely short-time energy, zero-crossing rate and pitch period;
(4) expression feature parameter extraction: extracting expression feature parameters, i.e. local texture feature parameters of the image, from the corresponding facial images;
(5) building the artificial neural network: obtaining the dog sound emotion classification model by training with the BP neural network algorithm;
(6) training and testing: using 80% of the collected sound samples as the training set and 20% as the test set;
(7) correcting the model's misclassifications using the expression features of the images.
2. The dog sound emotion classification method based on an artificial neural network according to claim 1, characterized in that: in step (1), sound is collected by a recording device, and the sampling frequency satisfies the Nyquist sampling theorem, f_s ≥ 2f_h, where f_h is the highest frequency in the signal; the dog's facial expressions can be captured by photographing, and each collected sound is labeled with the expression corresponding to it.
3. The dog sound emotion classification method based on an artificial neural network according to claim 1, characterized in that: in step (2), the pre-emphasis filter parameter α is taken as 0.95, the frame length used for framing is 128 samples, the frame shift is 64 samples, and the window function is the Hanning window.
4. The dog sound emotion classification method based on an artificial neural network according to claim 1, characterized in that the sound feature parameter extraction in step (3) comprises the following steps:
(3.1) short-time energy extraction: the short-time energy is the energy of one frame of the sound signal; letting the sound signal be x(n) and the l-th frame obtained after preprocessing be x_l(n), the short-time energy is E_l = Σ x_l(n)² (n = 0, 1, …, N−1), where E_l is the short-time energy of the l-th frame and N is the length of one frame;
(3.2) zero-crossing rate extraction: the short-time zero-crossing rate is the number of times the signal passes through zero in a short interval; for a continuous signal the zero-crossing rate can be obtained by inspecting the waveform, while for a discrete signal it is obtained by counting the sign changes between successive samples; the number of zero crossings per unit time is called the average zero-crossing rate;
the short-time average zero-crossing count z_l of the frame signal x_l(n) is defined as z_l = (1/2) Σ |sgn[x_l(n)] − sgn[x_l(n−1)]| (n = 1, …, N−1), where x_l(n) is the l-th frame of the sound signal, N is the length of one frame, z_l is the short-time zero-crossing count of the l-th frame, and sgn[·] is the sign function: sgn[x] = 1 for x ≥ 0, and sgn[x] = −1 for x < 0;
(3.3) pitch period extraction: the short-time autocorrelation function R(k) = Σ x_i(m) x_i(m+k) is computed for each frame, where x_i(m) is the windowed sound signal and k is the time lag, and the pitch period of each frame is estimated from it.
5. The dog sound emotion classification method based on an artificial neural network according to claim 1, characterized in that: in step (4), the expression feature parameters are extracted using the LBP algorithm.
6. The dog sound emotion classification method based on an artificial neural network according to claim 1, characterized in that the BP neural network algorithm in step (5) comprises:
(5.1) network initialization:
let the input layer have n nodes, the hidden layer l nodes, and the output layer m nodes; the weight from input node i to hidden node j is w_ij, the weight from hidden node j to output node k is w_jk, the bias of hidden node j is a_j, the bias of output node k is b_k, the learning rate is η, and the excitation function g(x) is the sigmoid function g(x) = 1 / (1 + e^(−x));
(5.2) the output of hidden node j: H_j = g(Σ_i w_ij x_i + a_j);
(5.3) the output of output node k: O_k = Σ_j w_jk H_j + b_k;
(5.4) the calculation of the error: E = (1/2) Σ_k (Y_k − O_k)², where Y_k is the desired output; writing e_k = Y_k − O_k, E can be expressed as E = (1/2) Σ_k e_k²; in these formulas, i = 1, …, n, j = 1, …, l, k = 1, …, m;
(5.5) the weight update formulas: w_ij ← w_ij + η H_j (1 − H_j) x_i Σ_k w_jk e_k; w_jk ← w_jk + η H_j e_k;
(5.6) the bias update formulas: a_j ← a_j + η H_j (1 − H_j) Σ_k w_jk e_k; b_k ← b_k + η e_k;
(5.7) finally judging whether the algorithm iteration has finished: convergence is judged by running a specified number of iterations, that is, by checking whether the difference between two consecutive errors is less than a specified value.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810995254.XA CN109272986A (en) | 2018-08-29 | 2018-08-29 | A kind of dog sound sensibility classification method based on artificial neural network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810995254.XA CN109272986A (en) | 2018-08-29 | 2018-08-29 | A kind of dog sound sensibility classification method based on artificial neural network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109272986A true CN109272986A (en) | 2019-01-25 |
Family
ID=65154951
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810995254.XA Pending CN109272986A (en) | 2018-08-29 | 2018-08-29 | A kind of dog sound sensibility classification method based on artificial neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109272986A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110970037A (en) * | 2019-11-28 | 2020-04-07 | 歌尔股份有限公司 | Pet language identification method and device, electronic equipment and readable storage medium |
CN111444137A (en) * | 2020-03-26 | 2020-07-24 | 湖南搜云网络科技股份有限公司 | Multimedia file identity recognition method based on feature codes |
CN111916067A (en) * | 2020-07-27 | 2020-11-10 | 腾讯科技(深圳)有限公司 | Training method and device of voice recognition model, electronic equipment and storage medium |
CN111951812A (en) * | 2020-08-26 | 2020-11-17 | 杭州情咖网络技术有限公司 | Animal emotion recognition method and device and electronic equipment |
CN112634947A (en) * | 2020-12-18 | 2021-04-09 | 大连东软信息学院 | Animal voice and emotion feature set sequencing and identifying method and system |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110082574A1 (en) * | 2009-10-07 | 2011-04-07 | Sony Corporation | Animal-machine audio interaction system |
CN103544962A (en) * | 2012-07-10 | 2014-01-29 | 腾讯科技(深圳)有限公司 | Animal status information release method and device |
CN104700829A (en) * | 2015-03-30 | 2015-06-10 | 中南民族大学 | System and method for recognizing voice emotion of animal |
CN105976809A (en) * | 2016-05-25 | 2016-09-28 | 中国地质大学(武汉) | Voice-and-facial-expression-based identification method and system for dual-modal emotion fusion |
CN106340309A (en) * | 2016-08-23 | 2017-01-18 | 南京大空翼信息技术有限公司 | Dog bark emotion recognition method and device based on deep learning |
CN106531173A (en) * | 2016-11-11 | 2017-03-22 | 努比亚技术有限公司 | Terminal-based animal data processing method and terminal |
TW201713284A (en) * | 2015-10-15 | 2017-04-16 | 昌泰科醫股份有限公司 | Sensing device for measuring physiological condition of pets capable of capturing the sound of the pet and accordingly determining the current mood or health status of the pet |
CN108320735A (en) * | 2018-01-23 | 2018-07-24 | 北京易智能科技有限公司 | A kind of emotion identification method and system of multi-data fusion |
CN110175526A (en) * | 2019-04-28 | 2019-08-27 | 平安科技(深圳)有限公司 | Dog Emotion identification model training method, device, computer equipment and storage medium |
Patent Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110082574A1 (en) * | 2009-10-07 | 2011-04-07 | Sony Corporation | Animal-machine audio interaction system |
CN103544962A (en) * | 2012-07-10 | 2014-01-29 | 腾讯科技(深圳)有限公司 | Animal status information release method and device |
CN104700829A (en) * | 2015-03-30 | 2015-06-10 | 中南民族大学 | System and method for recognizing voice emotion of animal |
TW201713284A (en) * | 2015-10-15 | 2017-04-16 | 昌泰科醫股份有限公司 | Sensing device for measuring physiological condition of pets capable of capturing the sound of the pet and accordingly determining the current mood or health status of the pet |
CN105976809A (en) * | 2016-05-25 | 2016-09-28 | 中国地质大学(武汉) | Voice-and-facial-expression-based identification method and system for dual-modal emotion fusion |
CN106340309A (en) * | 2016-08-23 | 2017-01-18 | 南京大空翼信息技术有限公司 | Dog bark emotion recognition method and device based on deep learning |
CN106531173A (en) * | 2016-11-11 | 2017-03-22 | 努比亚技术有限公司 | Terminal-based animal data processing method and terminal |
CN108320735A (en) * | 2018-01-23 | 2018-07-24 | 北京易智能科技有限公司 | A kind of emotion identification method and system of multi-data fusion |
CN110175526A (en) * | 2019-04-28 | 2019-08-27 | 平安科技(深圳)有限公司 | Dog Emotion identification model training method, device, computer equipment and storage medium |
Non-Patent Citations (1)
Title |
---|
Xu Zhaosong et al., "Research on Speech Emotion Recognition Based on BP Neural Network" (基于BP神经网络的语音情感识别研究), Software Guide (《软件导刊》), vol. 13, no. 04, 23 April 2014, pages 11-13 |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110970037A (en) * | 2019-11-28 | 2020-04-07 | 歌尔股份有限公司 | Pet language identification method and device, electronic equipment and readable storage medium |
CN111444137A (en) * | 2020-03-26 | 2020-07-24 | 湖南搜云网络科技股份有限公司 | Multimedia file identity recognition method based on feature codes |
CN111916067A (en) * | 2020-07-27 | 2020-11-10 | 腾讯科技(深圳)有限公司 | Training method and device of voice recognition model, electronic equipment and storage medium |
CN111951812A (en) * | 2020-08-26 | 2020-11-17 | 杭州情咖网络技术有限公司 | Animal emotion recognition method and device and electronic equipment |
CN112634947A (en) * | 2020-12-18 | 2021-04-09 | 大连东软信息学院 | Animal voice and emotion feature set sequencing and identifying method and system |
CN112634947B (en) * | 2020-12-18 | 2023-03-14 | 大连东软信息学院 | Animal voice and emotion feature set sequencing and identifying method and system |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106878677B (en) | Student classroom mastery degree evaluation system and method based on multiple sensors | |
CN109272986A (en) | A kind of dog sound sensibility classification method based on artificial neural network | |
CN108831485A (en) | Method for distinguishing speek person based on sound spectrograph statistical nature | |
CN103996155A (en) | Intelligent interaction and psychological comfort robot service system | |
CN111798874A (en) | Voice emotion recognition method and system | |
CN103544963A (en) | Voice emotion recognition method based on core semi-supervised discrimination and analysis | |
AL-Dhief et al. | Voice pathology detection using machine learning technique | |
CN102411932B (en) | Methods for extracting and modeling Chinese speech emotion in combination with glottis excitation and sound channel modulation information | |
CN105448291A (en) | Parkinsonism detection method and detection system based on voice | |
CN102655003B (en) | Method for recognizing emotion points of Chinese pronunciation based on sound-track modulating signals MFCC (Mel Frequency Cepstrum Coefficient) | |
CN110111797A (en) | Method for distinguishing speek person based on Gauss super vector and deep neural network | |
CN108922541A (en) | Multidimensional characteristic parameter method for recognizing sound-groove based on DTW and GMM model | |
Wang et al. | Speaker recognition based on MFCC and BP neural networks | |
CN109727608A (en) | A kind of ill voice appraisal procedure based on Chinese speech | |
Murugappan et al. | DWT and MFCC based human emotional speech classification using LDA | |
CN115346561B (en) | Depression emotion assessment and prediction method and system based on voice characteristics | |
CN115410711B (en) | White feather broiler health monitoring method based on sound signal characteristics and random forest | |
CN112397074A (en) | Voiceprint recognition method based on MFCC (Mel frequency cepstrum coefficient) and vector element learning | |
Gallardo-Antolín et al. | On combining acoustic and modulation spectrograms in an attention LSTM-based system for speech intelligibility level classification | |
Warlaumont et al. | Data-driven automated acoustic analysis of human infant vocalizations using neural network tools | |
Sharma et al. | Processing and analysis of human voice for assessment of Parkinson disease | |
Chaves et al. | Katydids acoustic classification on verification approach based on MFCC and HMM | |
CN111091816B (en) | Data processing system and method based on voice evaluation | |
CN102750950B (en) | Chinese emotion speech extracting and modeling method combining glottal excitation and sound track modulation information | |
Marck et al. | Identification, analysis and characterization of base units of bird vocal communication: The white spectacled bulbul (Pycnonotus xanthopygos) as a case study |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20190125 |