CN103258531A - Harmonic wave feature extracting method for irrelevant speech emotion recognition of speaker - Google Patents


Info

Publication number
CN103258531A
Authority
CN
China
Prior art keywords
harmonic
voice signal
speaker
formula
extracting method
Prior art date
Legal status
Granted
Application number
CN2013102079615A
Other languages
Chinese (zh)
Other versions
CN103258531B (en
Inventor
王坤侠
安宁
李廉
Current Assignee
Deep Blue Technology Shanghai Co Ltd
Original Assignee
安宁
Priority date
Filing date
Publication date
Application filed by 安宁 filed Critical 安宁
Priority to CN201310207961.5A priority Critical patent/CN103258531B/en
Publication of CN103258531A publication Critical patent/CN103258531A/en
Application granted granted Critical
Publication of CN103258531B publication Critical patent/CN103258531B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Complex Calculations (AREA)
  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)

Abstract

The invention discloses a harmonic feature extraction method for speaker-independent speech emotion recognition. The method comprises the following steps: (1) constructing a harmonic coefficient model based on the Fourier series; (2) extracting the harmonic coefficient characteristic parameters of the speech signal according to the constructed harmonic coefficient model to form feature vectors; (3) feeding the feature vectors as input data into a support vector machine (SVM) classification model and carrying out a speaker-independent speech emotion recognition test; and (4) after training and testing, outputting the effect of the harmonic coefficient characteristic parameters on speaker-independent speech emotion recognition. By applying the harmonic coefficient characteristic parameters to speaker-independent speech emotion recognition, the method greatly improves the recognition rate.

Description

A harmonic feature extraction method for speaker-independent speech emotion recognition
Technical field
The present invention relates to an audio signal processing method, and in particular to a harmonic feature extraction method for speaker-independent speech emotion recognition.
Background technology
With the continuing development of pattern recognition and affective computing, using computers to automatically identify a speaker's emotional state and its changes from the speech signal, that is, speech emotion recognition, has attracted the attention of many researchers. Speech is an important medium of human communication and the basic way people convey information to one another; the speech signal carries not only the actual semantic content but also rich emotional information.
Research on speech emotion recognition has important practical significance for making computers more intelligent and humanized, for developing new human-machine environments, and for advancing disciplines such as psychology. At present, speech emotion recognition technology is already having a noticeable impact on people's work, study and daily life. In education, it can be applied to real-time online teaching to strengthen teaching effectiveness and improve teaching quality; in entertainment, affective interaction techniques can build anthropomorphic characters and lifelike game scenes; in industry, intelligent household appliances, mobile phones, automobiles and the like can understand our emotions and respond accordingly, providing better services for our work and life; in medicine, emotional changes in patients with certain illnesses (such as depression, anxiety and other mental disorders) and in elderly people living alone can be detected so that help can be offered. In addition, speech emotion recognition can play a significant role in information retrieval, network communication and other fields, so its range of application is very wide.
Feature extraction is the foundation of speech emotion recognition: it extracts from the speech signal the essential characteristics that express the emotion in speech. Speech features can be divided into two classes, phonetic features and prosodic features, and researchers have tried many affective features. A large number of studies show that the characteristic parameters commonly used in speech emotion recognition, such as fundamental frequency, formant coefficients, linear prediction coefficients and cepstral coefficients, are effective features. How to find new speech features with stronger expressive power and greater robustness remains a pressing problem in the field of speaker-independent speech emotion recognition.
In summary, acoustic features are the basis of speech signal analysis, and good acoustic features can reveal the essential properties of the speech signal. Although research on speech emotion recognition has made progress, it is still far from meeting society's requirements for practical use, mainly in two respects:
(1) no simple acoustic characteristic parameters have yet been found that can identify emotion reliably;
(2) at present, most speech emotion feature extraction methods rely on the short-time stationarity of the speech signal and assume that adjacent speech frames are independent of each other; such feature extraction methods lose the dynamic characteristics of the speech signal.
Summary of the invention
In view of the above problems, the present invention proposes a speech emotion characteristic parameter extraction method based on a harmonic coefficient model, used for speaker-independent speech emotion recognition; this harmonic coefficient feature can improve the recognition performance of speech emotion recognition.
The present invention is realized as follows: a harmonic feature extraction method for speaker-independent speech emotion recognition, comprising the following steps:
Step 1: construct the harmonic coefficient model based on the Fourier series. A speech signal x(m) satisfies the Fourier series of formula (1):

x(m) = \sum_{k=1}^{M} \left[ a_k(m) \cos\!\left(2\pi k F_0(m)\, m\right) + b_k(m) \sin\!\left(2\pi k F_0(m)\, m\right) \right] \quad (1)

When the speech signal x(m) is stationary within a given time interval, a speech signal x(m) of finite length N gives an N-point discrete signal [x(0), ..., x(N-1)], which after the discrete Fourier transform yields the spectral signal [X(0), ..., X(N-1)]; the discrete Fourier transform is defined by formula (2):

X(k) = \sum_{m=0}^{N-1} x(m)\, e^{-j\frac{2\pi}{N} mk}, \quad k = 0, 1, 2, \ldots, N-1 \quad (2)

The discrete Fourier transform is expressed as the linear system X = Wx of formula (3), where W is the N \times N transform matrix with entries W(k, m) = e^{-j\frac{2\pi}{N} km}, k, m = 0, 1, \ldots, N-1; this constitutes the harmonic coefficient model of the speech signal, X(k) are the harmonic coefficients over the interval 0 to N-1, and k is the harmonic order.
Step 2: extract the characteristic parameters based on the harmonic coefficient model.
First, harmonic coefficient characteristic parameter extraction: the speech signal x(m) is divided into frames with a frame length of 16 ms and a frame shift of 8 ms; according to the harmonic coefficient model of the speech signal, the harmonic coefficients of each frame are calculated, giving the harmonic coefficients of the speech signal as formula (4):

X(N, I) = \left[ X(0,1)\; X(1,1)\; \cdots\; X(N-1,1)\; \cdots\; X(0,i-1)\; X(1,i-1)\; \cdots\; X(N-1,i-1)\; X(0,i)\; X(1,i)\; \cdots\; X(N-1,i) \right] \quad (4)

where i is the frame index; the harmonic coefficients of each order of x(m) are aggregated according to formula (4), and their maximum, minimum, median, mean and variance are calculated, giving the global feature vector of the speech signal as formula (5):

X_{min} = \min(X(N,1), X(N,2), \ldots, X(N,i))
X_{max} = \max(X(N,1), X(N,2), \ldots, X(N,i))
X_{med} = \mathrm{median}(X(N,1), X(N,2), \ldots, X(N,i))
X_{mea} = \frac{1}{k} \sum_{i=1}^{k} X(N,i)
X_{std} = \sum_{i=1}^{k} \left( X(N,i) - X_{mea} \right)^2 \quad (5)

Second, harmonic coefficient difference characteristic parameter extraction: the harmonic coefficients of each order obtained in the harmonic coefficient characteristic parameter extraction step are subjected to first-order and second-order difference operations according to formula (6):

\Delta X = X(N, i+1) - X(N, i), \quad i = 1, 2, \ldots, I-1
\Delta\Delta X = \Delta X(N, i+1) - \Delta X(N, i), \quad i = 1, 2, \ldots, I-2 \quad (6)

giving the dynamic harmonic coefficient sequence of the speech signal; likewise, the first-order and second-order difference statistics are calculated according to formula (5), giving the global dynamic feature vector of the speech signal.
Step 3: the feature vectors extracted in step 2 are used as input data and fed into a support vector machine (SVM) classification model, and a speaker-independent speech emotion recognition test is carried out.
Step 4: after training and testing, the effect of the harmonic coefficient characteristic parameters on speaker-independent speech emotion recognition is output.
As a further improvement of the above scheme, in step 3, for a given training set (x_i, y_i), i = 1, \ldots, n, x_i \in R^d, y_i \in \{+1, -1\}, the optimal hyperplane \omega \cdot x + b = 0 is obtained by minimizing formula (7):

p(\omega, \xi) = \frac{1}{2} \omega^{T} \omega + C \sum_{i=1}^{l} \xi_i \quad (7)

where \xi_i are slack variables and the parameter C is introduced to balance the complexity of the system against the misclassification rate.
Preferably, in step 3, the decision function of the quadratic optimization problem is defined as

f(x) = \operatorname{sgn}\!\left( \sum_{i=1}^{n} \alpha_i^{*} y_i K(x_i, x) + b^{*} \right)

where K is the kernel function, x_i are the support vectors corresponding to the Lagrange multipliers \alpha_i^{*}, n is the number of support vectors, and b^{*} is the bias parameter.
As a further improvement of the above scheme, in step 2, the extraction of the characteristic parameters based on the harmonic coefficient model is divided into four stages: sampling and quantization, pre-emphasis, windowing, and harmonic feature extraction. The speech signal is first sampled and quantized, converting the analog signal into a digital signal; the high-frequency part of the speech signal is then boosted to flatten the spectrum of the speech signal, realizing pre-emphasis, wherein the pre-emphasis uses the digital filter with Z transfer function H(z) = 1 - 0.95z^{-1} and the windowing function is a Hamming window.
This method applies the harmonic coefficient features of speech to speaker-independent speech emotion recognition and greatly improves the recognition rate. Compared with the prior art, the beneficial effects of the present invention are as follows: the present invention proposes a harmonic coefficient model of speech, extracts the harmonic coefficient features of speech, including local features and global features, and applies them to speaker-independent speech emotion recognition; compared with traditional features, these speech features greatly improve the performance of speech emotion recognition. Applied to intelligent household appliances, medical assistance, safety monitoring and similar areas, this technology can provide humanized, emotion-aware services and products.
Description of drawings
Fig. 1 is a structural diagram of the speaker-independent speech emotion recognition module provided by a preferred embodiment of the present invention.
Fig. 2 shows the basic procedure of the feature extraction of the present invention.
Embodiment
In order to make the objectives, technical solutions and advantages of the present invention clearer, the present invention is further described below in conjunction with the drawings and embodiments. It should be understood that the specific embodiments described here are intended only to explain the present invention and not to limit it.
The harmonic feature extraction method for speaker-independent speech emotion recognition of the present invention has a processing structure comprising the main modules of feature extraction based on the harmonic coefficient model, model training based on the support vector machine, and recognition output, as shown in Fig. 1. The harmonic feature extraction method comprises the following steps: (1) constructing a harmonic coefficient model based on the Fourier series; (2) extracting the harmonic coefficient characteristic parameters of the speech signal according to the harmonic coefficient model constructed in step (1); (3) forming feature vectors from the characteristic parameters extracted in step (2), feeding the feature vectors as input data into a support vector machine (SVM) classification model, and carrying out a speaker-independent speech emotion recognition test; (4) after training and testing, outputting the effect of the harmonic coefficient characteristic parameters on speaker-independent speech emotion recognition.
The feature extraction is divided into four stages, as shown in Fig. 2: sampling and quantization, pre-emphasis, windowing, and harmonic feature extraction. The speech signal is first sampled and quantized, converting the analog signal into a digital signal. Because the speech signal is affected by the glottal excitation and by lip and nose radiation, its high-frequency part needs to be boosted so that the spectrum of the signal becomes flatter; this is the pre-emphasis of the speech signal. The pre-emphasis uses the digital filter with Z transfer function H(z) = 1 - 0.95z^{-1}, and the windowing function is a Hamming window.
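By way of illustration only, the following Python sketch mirrors the pre-processing stages described above: pre-emphasis with H(z) = 1 - 0.95z^{-1}, 16 ms frames with an 8 ms shift, and Hamming windowing. The sampling rate and the function names are assumptions and are not specified by the patent.

```python
import numpy as np

def preprocess(x, fs=16000, frame_ms=16, shift_ms=8):
    """Pre-emphasis, framing and Hamming windowing of a speech signal x.

    fs (sampling rate) is an assumed value; the patent fixes only the
    frame length (16 ms) and the frame shift (8 ms).
    """
    # Pre-emphasis filter H(z) = 1 - 0.95 z^-1
    x = np.append(x[0], x[1:] - 0.95 * x[:-1])

    frame_len = int(fs * frame_ms / 1000)   # 256 samples at 16 kHz
    shift = int(fs * shift_ms / 1000)       # 128 samples at 16 kHz
    if len(x) < frame_len:
        raise ValueError("signal shorter than one frame")
    n_frames = 1 + (len(x) - frame_len) // shift

    window = np.hamming(frame_len)
    frames = np.stack([x[i * shift: i * shift + frame_len] * window
                       for i in range(n_frames)])
    return frames  # shape: (n_frames, frame_len)
```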
(1) Construction of the harmonic coefficient model based on the Fourier series is implemented as follows.
A speech signal x(m) can be written in the mathematical form of formula (1), which is called the Fourier series of the speech signal x(m):

x(m) = \sum_{k=1}^{M} \left[ a_k(m) \cos\!\left(2\pi k F_0(m)\, m\right) + b_k(m) \sin\!\left(2\pi k F_0(m)\, m\right) \right] \quad (1)

Assume that the speech signal is stationary within a 10-30 ms interval, so that the N-point discrete signal [x(0), ..., x(N-1)] yields, after the discrete Fourier transform, the spectral signal [X(0), ..., X(N-1)]. For a speech signal x(m) of finite length N, the discrete Fourier transform is defined as:

X(k) = \sum_{m=0}^{N-1} x(m)\, e^{-j\frac{2\pi}{N} mk}, \quad k = 0, 1, 2, \ldots, N-1 \quad (2)

The discrete Fourier transform can be expressed as the linear system X = Wx:

\begin{bmatrix} X(0) \\ X(1) \\ X(2) \\ \vdots \\ X(N-2) \\ X(N-1) \end{bmatrix}
=
\begin{bmatrix}
1 & 1 & 1 & \cdots & 1 & 1 \\
1 & e^{-j\frac{2\pi}{N}} & e^{-j\frac{4\pi}{N}} & \cdots & e^{-j\frac{2(N-2)\pi}{N}} & e^{-j\frac{2(N-1)\pi}{N}} \\
1 & e^{-j\frac{4\pi}{N}} & e^{-j\frac{8\pi}{N}} & \cdots & e^{-j\frac{4(N-2)\pi}{N}} & e^{-j\frac{4(N-1)\pi}{N}} \\
\vdots & \vdots & \vdots &  & \vdots & \vdots \\
1 & e^{-j\frac{2(N-2)\pi}{N}} & e^{-j\frac{4(N-2)\pi}{N}} & \cdots & e^{-j\frac{2(N-2)^2\pi}{N}} & e^{-j\frac{2(N-1)(N-2)\pi}{N}} \\
1 & e^{-j\frac{2(N-1)\pi}{N}} & e^{-j\frac{4(N-1)\pi}{N}} & \cdots & e^{-j\frac{2(N-1)(N-2)\pi}{N}} & e^{-j\frac{2(N-1)^2\pi}{N}}
\end{bmatrix}
\begin{bmatrix} x(0) \\ x(1) \\ x(2) \\ \vdots \\ x(N-2) \\ x(N-1) \end{bmatrix} \quad (3)

where the transform matrix W has entries W(k, m) = e^{-j\frac{2\pi}{N} km}, X(k) are the harmonic coefficients over the interval 0 to N-1, and k is the harmonic order.
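A minimal sketch of the harmonic coefficient computation for a single frame: the linear system X = Wx of formula (3) is the N-point discrete Fourier transform of formula (2), so it can be evaluated either with the explicit transform matrix W or, equivalently, with an FFT. This is an illustrative reading of formulas (2)-(3), not code taken from the patent.

```python
import numpy as np

def dft_matrix(N):
    """Transform matrix W of formula (3): W[k, m] = exp(-j*2*pi*k*m/N)."""
    k = np.arange(N).reshape(-1, 1)
    m = np.arange(N).reshape(1, -1)
    return np.exp(-2j * np.pi * k * m / N)

def harmonic_coefficients(frame):
    """Harmonic coefficients X(k), k = 0..N-1, of one windowed frame (formula (2))."""
    W = dft_matrix(len(frame))
    X = W @ frame                              # linear system X = Wx
    assert np.allclose(X, np.fft.fft(frame))   # identical to the FFT up to rounding
    return X
```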
(2) Extraction of the characteristic parameters based on the harmonic coefficient model
1. Harmonic coefficient characteristic parameter extraction
Based on the short-time stationarity of the speech signal, the speech signal is divided into frames with a frame length of 16 ms and a frame shift of 8 ms. The harmonic coefficient model of (1) is used to calculate the harmonic coefficients of each frame, and the harmonic coefficients of the speech signal are written as formula (4):

X(N, I) = \left[ X(0,1)\; X(1,1)\; \cdots\; X(N-1,1)\; \cdots\; X(0,i-1)\; X(1,i-1)\; \cdots\; X(N-1,i-1)\; X(0,i)\; X(1,i)\; \cdots\; X(N-1,i) \right] \quad (4)

where i is the frame index and N is the number of harmonic coefficients. The harmonic coefficients of each order are then aggregated according to formula (4), and their maximum, minimum, median, mean and variance are calculated, giving the global feature vector of the speech signal as formula (5):

X_{min} = \min(X(N,1), X(N,2), \ldots, X(N,i))
X_{max} = \max(X(N,1), X(N,2), \ldots, X(N,i))
X_{med} = \mathrm{median}(X(N,1), X(N,2), \ldots, X(N,i))
X_{mea} = \frac{1}{k} \sum_{i=1}^{k} X(N,i)
X_{std} = \sum_{i=1}^{k} \left( X(N,i) - X_{mea} \right)^2 \quad (5)
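The global statistics of formula (5) can be read, for each harmonic order, as the minimum, maximum, median, mean and spread over all frames of an utterance. A hedged sketch follows; the array layout (frames along the first axis, magnitudes of the harmonic coefficients) is an assumption, and np.std is used for the spread term, whereas formula (5) as printed sums squared deviations.

```python
import numpy as np

def global_statistics(X_frames):
    """Global feature vector in the spirit of formula (5).

    X_frames: array of shape (n_frames, N) holding the per-frame harmonic
    coefficients, e.g. np.abs() of the DFT output (the use of magnitudes
    is an assumption; the patent writes X(N, i) directly).
    """
    feats = [np.min(X_frames, axis=0),
             np.max(X_frames, axis=0),
             np.median(X_frames, axis=0),
             np.mean(X_frames, axis=0),
             np.std(X_frames, axis=0)]
    return np.concatenate(feats)  # one global vector per utterance
```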
2. Harmonic coefficient difference characteristic parameter extraction
Feature vector differences are used to capture the continuous dynamic trajectory of the speech feature vector: the first-order difference of a feature vector gives its rate of change, and the second-order difference gives the acceleration of that change. The harmonic coefficients of each order obtained in 1. are subjected to first-order and second-order difference operations according to formula (6):

\Delta X = X(N, i+1) - X(N, i), \quad i = 1, 2, \ldots, I-1
\Delta\Delta X = \Delta X(N, i+1) - \Delta X(N, i), \quad i = 1, 2, \ldots, I-2 \quad (6)

giving the dynamic harmonic coefficient sequence of the speech signal. Likewise, the statistics of the first-order and second-order differences are calculated according to formula (5), giving the global dynamic feature vector of the speech signal.
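A sketch of the first- and second-order differences of formula (6), taken along the frame axis; it reuses the global_statistics helper from the previous sketch to build the dynamic global feature vector.

```python
import numpy as np

def delta_features(X_frames):
    """Dynamic features from first- and second-order differences (formula (6))."""
    dX = np.diff(X_frames, n=1, axis=0)    # delta X(N, i) = X(N, i+1) - X(N, i)
    ddX = np.diff(X_frames, n=2, axis=0)   # second-order difference of the coefficients
    # the dynamic global feature vector reuses the statistics of formula (5)
    return np.concatenate([global_statistics(dX), global_statistics(ddX)])
```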
(3) Speaker-independent speech emotion recognition based on the harmonic coefficient features
The feature vectors obtained in 1. and 2. are used as the input to the support vector machine; a support vector machine model is trained and established, and the recognition result is output. The specific procedure is as follows.
For a given training set (x_i, y_i), i = 1, \ldots, n, x_i \in R^d, y_i \in \{+1, -1\}, the optimal hyperplane \omega \cdot x + b = 0 can be obtained by minimizing formula (7):

p(\omega, \xi) = \frac{1}{2} \omega^{T} \omega + C \sum_{i=1}^{l} \xi_i \quad (7)

where \xi_i are slack variables and the parameter C is introduced to balance the complexity of the system against the misclassification rate. Solving this quadratic optimization problem, the decision function is defined as

f(x) = \operatorname{sgn}\!\left( \sum_{i=1}^{n} \alpha_i^{*} y_i K(x_i, x) + b^{*} \right)

where K is the kernel function, x_i are the support vectors corresponding to the Lagrange multipliers \alpha_i^{*}, n is the number of support vectors, and b^{*} is the bias parameter. For a linear support vector machine such a kernel function is entirely adequate; for a nonlinear support vector machine, a nonlinear mapping kernel function maps the data into a high-dimensional feature space in which the optimal hyperplane exists.
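The patent does not name an SVM implementation; a hedged sketch using scikit-learn (an assumed library, with an assumed RBF kernel as the nonlinear mapping) could look like this:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def train_emotion_svm(features, labels, C=1.0):
    """Train an SVM emotion classifier on utterance-level feature vectors.

    C is the penalty parameter weighing model complexity against the
    misclassification rate (formula (7)); the RBF kernel is an assumption.
    """
    clf = make_pipeline(StandardScaler(), SVC(C=C, kernel="rbf"))
    clf.fit(np.asarray(features), np.asarray(labels))
    return clf
```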
(4) Experimental results
The experiments of the present invention were carried out on the Berlin emotional speech corpus. The Berlin emotional speech database was recorded by the working group of Prof. W. Sendlmeier at the Technical University of Berlin; it contains emotional utterances from 5 male and 5 female speakers covering 7 emotional states: sadness, anger, fear, disgust, happiness, boredom and neutral. The experiment uses a cross-validation method. Forty harmonic coefficients of the speech are extracted as characteristic parameters and used for speaker-independent emotion recognition; the recognition rate is 76.3%, an improvement of 5.4% over traditional features such as energy, formants, zero-crossing rate and fundamental frequency. The confusion matrix is given in Table 1.
Table 1: emotion recognition confusion matrix
[Confusion matrix image not reproduced]
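The patent states only that a cross-validation method was used. One common protocol for speaker-independent evaluation on a corpus such as the Berlin database is leave-one-speaker-out cross-validation; the following scikit-learn sketch is an assumption about that protocol, not a description of the original experiment.

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def speaker_independent_accuracy(features, labels, speakers):
    """Mean accuracy under leave-one-speaker-out cross-validation."""
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
    scores = cross_val_score(clf, np.asarray(features), np.asarray(labels),
                             groups=np.asarray(speakers), cv=LeaveOneGroupOut())
    return scores.mean()
```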
The above is only a preferred embodiment of the present invention and is not intended to limit the present invention; any modification, equivalent substitution or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims (4)

1. A harmonic feature extraction method for speaker-independent speech emotion recognition, characterized in that it comprises the following steps:
Step 1: construct the harmonic coefficient model based on the Fourier series. A speech signal x(m) satisfies the Fourier series of formula (1):

x(m) = \sum_{k=1}^{M} \left[ a_k(m) \cos\!\left(2\pi k F_0(m)\, m\right) + b_k(m) \sin\!\left(2\pi k F_0(m)\, m\right) \right] \quad (1)

When the speech signal x(m) is stationary within a given time interval, a speech signal x(m) of finite length N gives an N-point discrete signal [x(0), ..., x(N-1)], which after the discrete Fourier transform yields the spectral signal [X(0), ..., X(N-1)]; the discrete Fourier transform is defined by formula (2):

X(k) = \sum_{m=0}^{N-1} x(m)\, e^{-j\frac{2\pi}{N} mk}, \quad k = 0, 1, 2, \ldots, N-1 \quad (2)

The discrete Fourier transform is expressed as the linear system X = Wx of formula (3), where W is the N \times N transform matrix with entries W(k, m) = e^{-j\frac{2\pi}{N} km}, k, m = 0, 1, \ldots, N-1; this constitutes the harmonic coefficient model of the speech signal, X(k) are the harmonic coefficients over the interval 0 to N-1, and k is the harmonic order;
Step 2: extract the characteristic parameters based on the harmonic coefficient model.
First, harmonic coefficient characteristic parameter extraction: the speech signal x(m) is divided into frames with a frame length of 16 ms and a frame shift of 8 ms; according to the harmonic coefficient model of the speech signal, the harmonic coefficients of each frame are calculated, giving the harmonic coefficients of the speech signal as formula (4):

X(N, I) = \left[ X(0,1)\; X(1,1)\; \cdots\; X(N-1,1)\; \cdots\; X(0,i-1)\; X(1,i-1)\; \cdots\; X(N-1,i-1)\; X(0,i)\; X(1,i)\; \cdots\; X(N-1,i) \right] \quad (4)

where i is the frame index; the harmonic coefficients of each order of x(m) are aggregated according to formula (4), and their maximum, minimum, median, mean and variance are calculated, giving the global feature vector of the speech signal as formula (5):

X_{min} = \min(X(N,1), X(N,2), \ldots, X(N,i))
X_{max} = \max(X(N,1), X(N,2), \ldots, X(N,i))
X_{med} = \mathrm{median}(X(N,1), X(N,2), \ldots, X(N,i))
X_{mea} = \frac{1}{k} \sum_{i=1}^{k} X(N,i)
X_{std} = \sum_{i=1}^{k} \left( X(N,i) - X_{mea} \right)^2 \quad (5)

Second, harmonic coefficient difference characteristic parameter extraction: the harmonic coefficients of each order obtained in the harmonic coefficient characteristic parameter extraction step are subjected to first-order and second-order difference operations according to formula (6):

\Delta X = X(N, i+1) - X(N, i), \quad i = 1, 2, \ldots, I-1
\Delta\Delta X = \Delta X(N, i+1) - \Delta X(N, i), \quad i = 1, 2, \ldots, I-2 \quad (6)

giving the dynamic harmonic coefficient sequence of the speech signal; likewise, the first-order and second-order difference statistics are calculated according to formula (5), giving the global dynamic feature vector of the speech signal;
Step 3: the feature vectors extracted in step 2 are used as input data and fed into a support vector machine (SVM) classification model, and a speaker-independent speech emotion recognition test is carried out;
Step 4: after training and testing, the effect of the harmonic coefficient characteristic parameters on speaker-independent speech emotion recognition is output.
2. The harmonic feature extraction method for speaker-independent speech emotion recognition according to claim 1, characterized in that in step 3, for a given training set (x_i, y_i), i = 1, \ldots, n, x_i \in R^d, y_i \in \{+1, -1\}, the optimal hyperplane \omega \cdot x + b = 0 is obtained by minimizing formula (7):

p(\omega, \xi) = \frac{1}{2} \omega^{T} \omega + C \sum_{i=1}^{l} \xi_i \quad (7)

where \xi_i are slack variables and the parameter C is introduced to balance the complexity of the system against the misclassification rate.
3. The harmonic feature extraction method for speaker-independent speech emotion recognition according to claim 2, characterized in that in step 3, the decision function of the quadratic optimization problem is defined as

f(x) = \operatorname{sgn}\!\left( \sum_{i=1}^{n} \alpha_i^{*} y_i K(x_i, x) + b^{*} \right)

where K is the kernel function, x_i are the support vectors corresponding to the Lagrange multipliers \alpha_i^{*}, n is the number of support vectors, and b^{*} is the bias parameter.
4. The harmonic feature extraction method for speaker-independent speech emotion recognition according to claim 1, characterized in that in step 2, the extraction of the characteristic parameters based on the harmonic coefficient model is divided into four stages: sampling and quantization, pre-emphasis, windowing, and harmonic feature extraction; the speech signal is first sampled and quantized, converting the analog signal into a digital signal; the high-frequency part of the speech signal is boosted to flatten its spectrum, realizing pre-emphasis of the speech signal, wherein the pre-emphasis uses the digital filter with Z transfer function H(z) = 1 - 0.95z^{-1} and the windowing function is a Hamming window.
CN201310207961.5A 2013-05-29 2013-05-29 A kind of harmonic characteristic extracting method of the speech emotion recognition had nothing to do for speaker Active CN103258531B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310207961.5A CN103258531B (en) 2013-05-29 2013-05-29 A kind of harmonic characteristic extracting method of the speech emotion recognition had nothing to do for speaker

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310207961.5A CN103258531B (en) 2013-05-29 2013-05-29 A kind of harmonic characteristic extracting method of the speech emotion recognition had nothing to do for speaker

Publications (2)

Publication Number Publication Date
CN103258531A true CN103258531A (en) 2013-08-21
CN103258531B CN103258531B (en) 2015-11-11

Family

ID=48962405

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310207961.5A Active CN103258531B (en) 2013-05-29 2013-05-29 A kind of harmonic characteristic extracting method of the speech emotion recognition had nothing to do for speaker

Country Status (1)

Country Link
CN (1) CN103258531B (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107564543A (en) * 2017-09-13 2018-01-09 苏州大学 A kind of Speech Feature Extraction of high touch discrimination
CN108346436A (en) * 2017-08-22 2018-07-31 腾讯科技(深圳)有限公司 Speech emotional detection method, device, computer equipment and storage medium
CN108577866A (en) * 2018-04-03 2018-09-28 中国地质大学(武汉) A kind of system and method for multidimensional emotion recognition and alleviation
CN108777140A (en) * 2018-04-27 2018-11-09 南京邮电大学 Phonetics transfer method based on VAE under a kind of training of non-parallel corpus
CN112118027A (en) * 2019-11-02 2020-12-22 广东石油化工学院 PLC channel impulse noise detection method and system
CN113555038A (en) * 2021-07-05 2021-10-26 东南大学 Speaker independent speech emotion recognition method and system based on unsupervised field counterwork learning

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060129392A1 (en) * 2004-12-13 2006-06-15 Lg Electronics Inc Method for extracting feature vectors for speech recognition
CN1975856A (en) * 2006-10-30 2007-06-06 邹采荣 Speech emotion identifying method based on supporting vector machine
CN101261832A (en) * 2008-04-21 2008-09-10 北京航空航天大学 Extraction and modeling method for Chinese speech sensibility information
CN101685634A (en) * 2008-09-27 2010-03-31 上海盛淘智能科技有限公司 Children speech emotion recognition method

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060129392A1 (en) * 2004-12-13 2006-06-15 Lg Electronics Inc Method for extracting feature vectors for speech recognition
CN1975856A (en) * 2006-10-30 2007-06-06 邹采荣 Speech emotion identifying method based on supporting vector machine
CN101261832A (en) * 2008-04-21 2008-09-10 北京航空航天大学 Extraction and modeling method for Chinese speech sensibility information
CN101685634A (en) * 2008-09-27 2010-03-31 上海盛淘智能科技有限公司 Children speech emotion recognition method

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108346436A (en) * 2017-08-22 2018-07-31 腾讯科技(深圳)有限公司 Speech emotional detection method, device, computer equipment and storage medium
US11189302B2 (en) 2017-08-22 2021-11-30 Tencent Technology (Shenzhen) Company Limited Speech emotion detection method and apparatus, computer device, and storage medium
US11922969B2 (en) 2017-08-22 2024-03-05 Tencent Technology (Shenzhen) Company Limited Speech emotion detection method and apparatus, computer device, and storage medium
CN107564543A (en) * 2017-09-13 2018-01-09 苏州大学 A kind of Speech Feature Extraction of high touch discrimination
CN107564543B (en) * 2017-09-13 2020-06-26 苏州大学 Voice feature extraction method with high emotion distinguishing degree
CN108577866A (en) * 2018-04-03 2018-09-28 中国地质大学(武汉) A kind of system and method for multidimensional emotion recognition and alleviation
CN108777140A (en) * 2018-04-27 2018-11-09 南京邮电大学 Phonetics transfer method based on VAE under a kind of training of non-parallel corpus
CN112118027A (en) * 2019-11-02 2020-12-22 广东石油化工学院 PLC channel impulse noise detection method and system
CN113555038A (en) * 2021-07-05 2021-10-26 东南大学 Speaker independent speech emotion recognition method and system based on unsupervised field counterwork learning
CN113555038B (en) * 2021-07-05 2023-12-29 东南大学 Speaker-independent voice emotion recognition method and system based on unsupervised domain countermeasure learning

Also Published As

Publication number Publication date
CN103258531B (en) 2015-11-11

Similar Documents

Publication Publication Date Title
CN109817246B (en) Emotion recognition model training method, emotion recognition device, emotion recognition equipment and storage medium
CN102800316B (en) Optimal codebook design method for voiceprint recognition system based on nerve network
CN103258531A (en) Harmonic wave feature extracting method for irrelevant speech emotion recognition of speaker
CN109767778B (en) Bi-L STM and WaveNet fused voice conversion method
CN101226743A (en) Method for recognizing speaker based on conversion of neutral and affection sound-groove model
CN104123933A (en) Self-adaptive non-parallel training based voice conversion method
CN108154879B (en) Non-specific human voice emotion recognition method based on cepstrum separation signal
CN103065629A (en) Speech recognition system of humanoid robot
CN102723078A (en) Emotion speech recognition method based on natural language comprehension
CN101930735A (en) Speech emotion recognition equipment and speech emotion recognition method
CN102237083A (en) Portable interpretation system based on WinCE platform and language recognition method thereof
Wang et al. Research on speech emotion recognition technology based on deep and shallow neural network
CN106024010A (en) Speech signal dynamic characteristic extraction method based on formant curves
CN109065073A (en) Speech-emotion recognition method based on depth S VM network model
CN103456302A (en) Emotion speaker recognition method based on emotion GMM model weight synthesis
Waghmare et al. Emotion recognition system from artificial marathi speech using MFCC and LDA techniques
Zhang et al. Speech emotion recognition using combination of features
Chauhan et al. Speech to text converter using Gaussian Mixture Model (GMM)
CN103258537A (en) Method utilizing characteristic combination to identify speech emotions and device thereof
Garg et al. Survey on acoustic modeling and feature extraction for speech recognition
Jie Speech emotion recognition based on convolutional neural network
Zhao et al. Transferring age and gender attributes for dimensional emotion prediction from big speech data using hierarchical deep learning
Paul et al. Automated speech recognition of isolated words using neural networks
CN103886859A (en) Voice conversion method based on one-to-many codebook mapping
Shah Wavelet packets for speech emotion recognition

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20180306

Address after: Room 426, 5F Creator Space, Business Center, Hefei University City Commercial Center, west of Feicui Road, north of Danxia Road, Hefei Economic and Technological Development Zone, Anhui Province, 230000

Patentee after: Hefei wing Mdt InfoTech Ltd

Address before: Room 3902, Building 15, Wanda Plaza, No. 150 Ma'anshan Road, Baohe District, Hefei, Anhui Province, 230000

Patentee before: An Ning

TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20190117

Address after: Unit 1001, 369 Weining Road, Changning District, Shanghai, 200336 (9th floor of actual floor)

Patentee after: Deep blue Technology (Shanghai) Co., Ltd.

Address before: 230000 Room 426, 5F Creator Space, Business Center, Hefei City Business Center, Feicui Road, North of Danxia Road, Hefei Economic and Technological Development Zone, Anhui Province

Patentee before: Hefei wing Mdt InfoTech Ltd

TR01 Transfer of patent right