CN103871411A - Text-independent speaker identifying device based on line spectrum frequency difference value


Info

Publication number
CN103871411A
Authority
CN
China
Prior art keywords: model, line spectral, super, parameter, sup
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410134694.8A
Other languages
Chinese (zh)
Inventor
马占宇
齐峰
张洪刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Posts and Telecommunications
Priority to CN201410134694.8A
Publication of CN103871411A
Legal status (current): Pending

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract


An embodiment of the invention discloses a text-independent speaker identification method based on line spectral frequency differences. The method comprises the following steps. Feature extraction step: convert the line spectral frequency parameters into line spectral frequency parameter differences through a linear transformation, and combine the current frame with its two adjacent frames (one before and one after) to form a line spectral frequency feature supervector. Model training step: use a super-Dirichlet mixture model to model the distribution of the feature supervectors, and solve for the parameters of the model. Identification step: extract features from the speech sequence of the person to be identified according to step one, input them into the models obtained in step two, compute the likelihood value for each probability model, obtain the maximum likelihood value, and confirm the speaker number. Embodiments of the invention can improve the text-independent speaker identification rate and therefore have great practical value.

Description

A text-independent speaker identification device based on line spectral frequency differences
Technical field
The present invention describes a text-independent speaker recognition system based on linearly transformed line spectral frequency parameters and a super-Dirichlet mixture model.
Background technology
With the development of computer technology, identification or verification using human biometric characteristics (such as fingerprints, voiceprints, and faces) has significant research and application value. Speaker recognition uses the speech parameters in the speech waveform that reflect the physiological and behavioral speaking characteristics of a person to automatically determine whether a speaker belongs to a set of recorded speakers and, further, to confirm the speaker's identity. Speaker recognition comprises two parts: speaker identification and speaker verification. A speaker identification system generally includes three parts: extracting features that can represent the speaker; training, for each speaker, an independent model that captures the statistical regularities of the selected features; and finally making a decision by comparing the input data with the obtained models.
For the first part, feature extraction, analyzing the speech signal based on vocal tract characteristics is an effective approach in current speaker recognition. Commonly used features include Mel-frequency cepstral coefficients (MFCC) and line spectral frequencies (LSF). Traditional MFCC vectors express dynamic information through differencing; the present invention instead adopts a feature supervector represented by line spectral frequency differences in order to preserve the original neighborhood information. In addition, the method of the present invention also takes into account the high-frequency information that MFCC ignores, which is useful for machine discrimination of speakers.
Current recognition methods can be divided into three classes: template matching methods, probabilistic model methods, and artificial neural network methods. A probabilistic model uses a probability density function to describe the distribution of a speaker's speech feature space, and a set of parameters of this probability density function serves as the speaker model. The Gaussian mixture model (GMM) has been widely used in text-independent speaker recognition systems because it is simple and efficient. However, the super-Dirichlet mixture model (SDMM) of the present invention can better describe the boundedness and ordering of the extracted features.
According to the identification object, speaker recognition can be divided into two classes: text-dependent and text-independent. Text-dependent speaker recognition requires the speaker to pronounce keywords or key sentences as the training text and to pronounce the same content during identification. Text-independent speaker recognition does not constrain the spoken content during either training or identification; the identification object is free speech, so features and methods that characterize the speaker must be found in unconstrained speech signals, which makes building the speaker model relatively difficult. In addition, a text-dependent recognition system is easily spoofed with stolen recordings and is inconvenient to use. What is described in the present invention is a text-independent recognition system.
Summary of the invention
In order to overcome the above-mentioned defects of the prior art and to improve the text-independent speaker discrimination rate, the present invention provides a text-independent speaker identification device based on linearly transformed line spectral frequency parameters and a super-Dirichlet mixture model.
To achieve the above object, the text-independent speaker identification method proposed by the present invention comprises the following steps:
One. Feature extraction step:
A. Line spectral frequency parameter transformation step: in the linear predictive coding (LPC) model of speech, convert the line spectral frequency parameters into line spectral frequency parameter differences through a linear transformation;
B. Line spectral frequency feature supervector generation step: combine the current frame with its two adjacent frames (one before and one after) to form a feature supervector that expresses the dynamic information.
Two. Model training step: train a model for each speaker from a frame sequence of length T; use the super-Dirichlet mixture model (SDMM) to model the distribution of the feature supervectors; solve the equations by a gradient method to obtain the parameters α of the model; and finally obtain a series of models, one model corresponding to each speaker.
Three. Identification matching step: input a speech sample of a speaker in the training set into the series of trained probability models; transform the parameters and generate the feature supervector by the method of step one; compute the likelihood value for each probability model with the models trained in step two; and take the maximum likelihood value among them to confirm the speaker number.
According to a text-independent speaker identification method of an embodiment of the invention, the line spectral frequency parameter transformation step described in step A exploits (1) the non-negativity, (2) the ordering, and (3) the boundedness of the line spectral frequency parameters to transform them into the line spectral frequency parameter differences ΔLSF, which are characterized by: (1) lying in the open interval (0, 1), and (2) summing to 1. The detailed process of this step is as follows:
1) The K-dimensional line spectral frequency parameter is represented as $\mathbf{s} = [s_1, s_2, \ldots, s_K]^T$, satisfying $0 < s_1 < s_2 < \cdots < s_K < \pi$;
2) The transformed (K+1)-dimensional line spectral frequency parameter difference ΔLSF is $\mathbf{x} = [x_1, x_2, \ldots, x_{K+1}]^T$, where

$$x_i = \begin{cases} s_1/\pi, & i = 1 \\ (s_i - s_{i-1})/\pi, & 1 < i \le K \\ (\pi - s_K)/\pi, & i = K+1. \end{cases}$$
According to a text-independent speaker identification method of an embodiment of the invention, the line spectral frequency feature supervector generation step described in step B combines the current frame x(t) with its adjacent frames to form a supervector that expresses dynamic information; in the present invention this supervector comprises three subvectors. Assuming the interval between the current frame and both the previous and the next frame is τ, only the two neighboring frames of the current frame are considered here, namely the previous frame x(t−τ) and the next frame x(t+τ), and the generated feature supervector is 3(K+1)-dimensional. The detailed process is as follows:
1) The (K+1)-dimensional line spectral frequency parameter difference vector is $\mathbf{x}(t) = [x_{1,1}, x_{1,2}, \ldots, x_{1,K+1}]^T$;
2) The supervector containing the dynamic information is

$$\mathbf{x}_{\sup}(t) \triangleq \begin{bmatrix} \mathbf{x}(t) \\ \mathbf{x}(t-\tau) \\ \mathbf{x}(t+\tau) \end{bmatrix} = \begin{bmatrix} x_{1,1}(t), \ldots, x_{1,K+1}(t), x_{2,1}(t), \ldots, x_{2,K+1}(t), x_{3,1}(t), \ldots, x_{3,K+1}(t) \end{bmatrix}^T, \quad \tau = 1, 2, \ldots$$
According to a text-independent speaker identification method of an embodiment of the invention, the detailed steps of the model training described in step two are:
1) The feature subvectors x(t), x(t−τ), x(t+τ) in $\mathbf{x}_{\sup}$ are mutually independent and each follows a Dirichlet distribution, so the supervector $\mathbf{x}_{\sup}$ follows the super-Dirichlet probability density distribution:

$$\mathrm{SDir}(\mathbf{x}_{\sup}; \boldsymbol{\alpha}) = \prod_{n=1}^{3} \frac{\Gamma\!\left(\sum_{k=1}^{K+1} \alpha_{n,k}\right)}{\prod_{k=1}^{K+1} \Gamma(\alpha_{n,k})} \prod_{k=1}^{K+1} (x_{n,k})^{\alpha_{n,k}-1}$$

2) For a temporal sequence of line spectral frequency parameter difference subvectors x(1), …, x(t), …, x(T), we have $X = [\mathbf{x}_{\sup}(1), \ldots, \mathbf{x}_{\sup}(T)]$, and the line spectral frequency parameter differences are modeled with the super-Dirichlet mixture model (SDMM):

$$f(X) = \prod_{t=1}^{T} \sum_{m=1}^{M} \pi_m\, \mathrm{SDir}(\mathbf{x}_{\sup}(t); \boldsymbol{\alpha}^{(m)})$$

where the weight factor is

$$\pi_m = \frac{1}{T} \sum_{t=1}^{T} \bar{z}_{tm} = \frac{1}{T} \sum_{t=1}^{T} \frac{\pi_m\, \mathrm{SDir}(\mathbf{x}_{\sup}(t); \boldsymbol{\alpha}^{(m)})}{\sum_{m'=1}^{M} \pi_{m'}\, \mathrm{SDir}(\mathbf{x}_{\sup}(t); \boldsymbol{\alpha}^{(m')})}.$$

3) Compute the model parameters: for the m-th mixture component, the parameter vector $\boldsymbol{\alpha}^{(m)}$ is divided into 3 subvectors, each parameter subvector corresponding to one subvector of $\mathbf{x}_{\sup}$. All parameters can then be obtained by solving the update equation (reproduced in the original only as an image, Figure BDA0000486788020000034).
The beneficial effect of the present invention is that, compared with the prior art, the invention applies transformed line spectral frequency parameter supervectors as the extracted speaker features, trains the model with the super-Dirichlet mixture distribution, and provides a complete implementation system for application; test results have verified the efficiency of the invention, which has strong practicality.
Brief description of the drawings
Fig. 1 is a flow chart of the steps of the method provided by the invention;
Fig. 2 is a flow chart of the steps of the line spectral frequency parameter transformation;
Fig. 3 is a flow chart of the steps of constructing the feature supervector.
Embodiment
Specific embodiments of the present invention are described in detail below in conjunction with the accompanying drawings.
Fig. 1 is the flow chart of the present invention, in which the dotted lines indicate the flow of the training part and the solid lines indicate the flow of the identification part. The method comprises the following steps:
Step one: feature extraction; perform feature extraction on the speech sequence of the speaker to be trained.
Step S1: convert the line spectral frequency parameters into line spectral frequency parameter differences;
Step S2: generate the line spectral frequency feature supervector;
Step two: model training
Step S3: use the super-Dirichlet mixture model to model the distribution of the feature supervectors, and solve for the parameters of the model;
Step three: identification process
Repeat steps S1 and S2 of step one on the speech sequence of the speaker to be identified to generate the feature supervector, and input it into the models obtained by training in step S3.
Step S4: compute the likelihood value for each probability model, obtain the maximum likelihood value, and confirm the speaker number.
Each step is specifically described below:
Step S1 implements the line spectral frequency parameter transformation: the line spectral frequency parameters of the linear predictive coding model of speech are converted into line spectral frequency parameter differences through a linear transformation. Fig. 2 gives the detailed flow of the method as follows:
1) Input: line spectral frequency parameters $\mathbf{s} = [s_1, s_2, \ldots, s_K]^T$;
2) In step 11, loop i from 1 to K+1; the difference obtained at each iteration is

$$x_i = \begin{cases} s_1/\pi, & i = 1 \\ (s_i - s_{i-1})/\pi, & 1 < i \le K \\ (\pi - s_K)/\pi, & i = K+1; \end{cases}$$

3) Output: line spectral frequency parameter difference vector $\mathbf{x} = [x_1, x_2, \ldots, x_{K+1}]^T$.
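The transformation above is simple enough to state compactly in code. The following Python sketch is not part of the original patent; the function name and the use of NumPy are our own illustrative choices. It computes the ΔLSF vector from a K-dimensional LSF vector:

```python
import numpy as np

def lsf_to_delta_lsf(s):
    """Map a K-dim LSF vector s (with 0 < s_1 < ... < s_K < pi) to the
    (K+1)-dim difference vector x, where each x_i lies in (0, 1) and
    the entries sum to 1."""
    s = np.asarray(s, dtype=float)
    # Pad with the interval endpoints 0 and pi, then take normalized
    # first-order differences; this reproduces the three-case formula above.
    padded = np.concatenate(([0.0], s, [np.pi]))
    return np.diff(padded) / np.pi
```

Because the entries of the output are positive and sum to 1, they lie on the probability simplex, which is what makes the Dirichlet family a natural model for them in step S3.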
Step S2 generates the line spectral frequency feature supervector: the current frame x(t) is combined with its two adjacent frames to form a supervector expressing dynamic information. Assuming the interval between the current frame and both the previous and the next frame is τ, in the present invention this supervector comprises three subvectors: the current frame x(t), the previous frame x(t−τ), and the next frame x(t+τ); the generated feature supervector is 3(K+1)-dimensional. Fig. 3 gives a schematic diagram of the detailed flow; the steps are as follows:
1) Input: the (K+1)-dimensional line spectral frequency parameter difference vector $\mathbf{x}(t) = [x_{1,1}, x_{1,2}, \ldots, x_{1,K+1}]^T$;
2) Output:

$$\mathbf{x}_{\sup}(t) \triangleq \begin{bmatrix} \mathbf{x}(t) \\ \mathbf{x}(t-\tau) \\ \mathbf{x}(t+\tau) \end{bmatrix} = \begin{bmatrix} x_{1,1}(t), \ldots, x_{1,K+1}(t), x_{2,1}(t), \ldots, x_{2,K+1}(t), x_{3,1}(t), \ldots, x_{3,K+1}(t) \end{bmatrix}^T, \quad \tau = 1, 2, \ldots$$
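As an illustrative sketch of this step (again Python of our own choosing, not from the patent), the supervector can be built by stacking each frame with its two τ-spaced neighbors. Frames at the sequence boundaries lack a neighbor and are skipped here; the patent does not specify boundary handling, so this is an assumption:

```python
import numpy as np

def feature_supervectors(frames, tau=1):
    """Stack [x(t); x(t - tau); x(t + tau)] for every frame t that has both
    neighbours. `frames` is a (T, K+1) array of Delta-LSF vectors; the result
    is an array of shape (T - 2*tau, 3, K+1), one supervector per row."""
    T = frames.shape[0]
    return np.stack(
        [np.stack([frames[t], frames[t - tau], frames[t + tau]])
         for t in range(tau, T - tau)]
    )
```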
Step S3 uses the super-Dirichlet mixture model to model the distribution of the feature supervectors and solves for the parameters of the model. The detailed steps are:
1) The feature subvectors x(t), x(t−τ), x(t+τ) in $\mathbf{x}_{\sup}$ are mutually independent and each follows a Dirichlet distribution, so the supervector $\mathbf{x}_{\sup}$ follows the super-Dirichlet distribution:

$$\mathrm{SDir}(\mathbf{x}_{\sup}; \boldsymbol{\alpha}) = \prod_{n=1}^{3} \frac{\Gamma\!\left(\sum_{k=1}^{K+1} \alpha_{n,k}\right)}{\prod_{k=1}^{K+1} \Gamma(\alpha_{n,k})} \prod_{k=1}^{K+1} (x_{n,k})^{\alpha_{n,k}-1}$$

where $\alpha_{1,k}$, $\alpha_{2,k}$, $\alpha_{3,k}$ are the parameter subvectors.
2) For a temporal sequence of line spectral frequency parameter difference subvectors x(1), …, x(t), …, x(T), we have $X = [\mathbf{x}_{\sup}(1), \ldots, \mathbf{x}_{\sup}(T)]$, and with the super-Dirichlet mixture model (SDMM) containing M components, the probability of the object vector is obtained:

$$f(X) = \prod_{t=1}^{T} \sum_{m=1}^{M} \pi_m\, \mathrm{SDir}(\mathbf{x}_{\sup}(t); \boldsymbol{\alpha}^{(m)})$$

where the weight factor is

$$\pi_m = \frac{1}{T} \sum_{t=1}^{T} \bar{z}_{tm} = \frac{1}{T} \sum_{t=1}^{T} \frac{\pi_m\, \mathrm{SDir}(\mathbf{x}_{\sup}(t); \boldsymbol{\alpha}^{(m)})}{\sum_{m'=1}^{M} \pi_{m'}\, \mathrm{SDir}(\mathbf{x}_{\sup}(t); \boldsymbol{\alpha}^{(m')})},$$

$\pi_m$ is the non-negative weight of the m-th component, and $\sum_{m=1}^{M} \pi_m = 1$.
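To make the density and the mixture concrete, here is a minimal Python sketch of the super-Dirichlet log-density, the SDMM log-likelihood, and the weight update above. It is illustrative code, not from the patent, and assumes SciPy for gammaln and logsumexp:

```python
import numpy as np
from scipy.special import gammaln, logsumexp

def log_sdir(x_sup, alpha):
    """Log super-Dirichlet density: the product of three independent
    Dirichlet densities. x_sup and alpha are (3, K+1) arrays; each row
    of x_sup lies in (0, 1) and sums to 1."""
    return float(np.sum(
        gammaln(alpha.sum(axis=1)) - gammaln(alpha).sum(axis=1)
        + ((alpha - 1.0) * np.log(x_sup)).sum(axis=1)
    ))

def sdmm_log_likelihood(X, weights, alphas):
    """log f(X) = sum_t log sum_m pi_m SDir(x_sup(t); alpha(m)).
    X: (T, 3, K+1); weights: (M,); alphas: (M, 3, K+1)."""
    log_p = np.array([[log_sdir(x, a) for a in alphas] for x in X])  # (T, M)
    return float(logsumexp(log_p + np.log(weights), axis=1).sum())

def update_weights(X, weights, alphas):
    """Weight update pi_m = (1/T) sum_t z_bar_tm, where z_bar_tm is the
    posterior responsibility of component m for frame t."""
    log_p = np.array([[log_sdir(x, a) for a in alphas] for x in X])
    log_p = log_p + np.log(weights)
    resp = np.exp(log_p - logsumexp(log_p, axis=1, keepdims=True))
    return resp.mean(axis=0)
```

Working in the log domain avoids the numerical underflow that the product over T frames would otherwise cause.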
3) Compute the model parameters: for the m-th mixture component, the parameter vector $\boldsymbol{\alpha}^{(m)}$ is divided into 3 subvectors, each parameter subvector corresponding to one subvector of $\mathbf{x}_{\sup}$. All parameters can then be obtained by solving the update equation (reproduced in the original only as an image, Figure BDA0000486788020000054).
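The exact gradient update for α survives in the source only as an image, so the sketch below substitutes a generic numerical maximizer for it: it fits one component's parameters by maximizing the responsibility-weighted log-likelihood with L-BFGS. This is an assumption standing in for the patent's own gradient method, and it reuses log_sdir from the sketch above:

```python
import numpy as np
from scipy.optimize import minimize

def fit_alpha(X, resp_m, alpha0):
    """Fit one component's (3, K+1) parameter array alpha by maximizing
    sum_t resp_m[t] * log SDir(X[t]; alpha). A log-parameterization keeps
    every alpha entry positive, as the Dirichlet requires."""
    shape = alpha0.shape

    def neg_ll(log_a):
        a = np.exp(log_a).reshape(shape)
        return -sum(r * log_sdir(x, a) for x, r in zip(X, resp_m))

    res = minimize(neg_ll, np.log(alpha0).ravel(), method="L-BFGS-B")
    return np.exp(res.x).reshape(shape)
```

Alternating this update with update_weights gives an EM-style training loop that yields one (weights, alphas) model per speaker.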
Step S4: at identification time, the speech to be identified is input into the series of models of all speakers obtained by training in step S3, and the number of the identified speaker is determined as the number of the model with the maximum likelihood value.
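A sketch of this decision rule (illustrative names; sdmm_log_likelihood is reused from the sketch above): score the supervector sequence of the unknown speaker against every enrolled model and return the index of the best one:

```python
import numpy as np

def identify_speaker(X, models):
    """models is a list of (weights, alphas) pairs, one per enrolled speaker,
    in speaker-number order. Returns the number (index) of the model that
    assigns X the maximum likelihood."""
    scores = [sdmm_log_likelihood(X, w, a) for (w, a) in models]
    return int(np.argmax(scores))
```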
The embodiments of the proposed method and of each module have been set forth above in conjunction with the accompanying drawings. From the description of the above embodiments, those skilled in the art can clearly understand that the present invention may be realized by software plus a necessary general-purpose hardware platform, and certainly also by hardware, but the former is in many cases the better embodiment. Based on such an understanding, the part of the technical solution of the present invention that in essence contributes over the prior art may be embodied in the form of a computer software product stored in a storage medium, including instructions that cause one or more computer devices to carry out the method described in each embodiment of the present invention.
According to the idea of the present invention, the specific implementation and application scope may vary. In summary, the contents of this description should not be construed as limiting the present invention.
The above-described embodiments of the present invention do not limit the scope of protection of the invention. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present invention shall be included within the protection scope of the present invention.

Claims (4)

1. A text-independent speaker identification method based on linearly transformed line spectral frequency parameters and a super-Dirichlet mixture model, characterized in that it comprises the following steps:
One. Feature extraction step:
A. Line spectral frequency parameter transformation step: in the linear predictive coding model of speech, convert the line spectral frequency parameters into line spectral frequency parameter differences through a linear transformation;
B. Line spectral frequency feature supervector generation step: combine the current frame with its two adjacent frames (one before and one after) to form a feature supervector that expresses the dynamic information.
Two. Model training step: train a model for each speaker from a frame sequence of length T; use the super-Dirichlet mixture model (SDMM) to model the distribution of the feature supervectors; solve the equations by a gradient method to obtain the parameters α of the model; and finally obtain a series of models, one model corresponding to each speaker.
Three. Identification matching step: input a speech sample of a speaker in the training set into the series of trained probability models; transform the parameters and generate the feature supervector by the method of step one; compute the likelihood value for each probability model with the models trained in step two; and take the maximum likelihood value among them to confirm the speaker number.
2. The text-independent speaker identification method of claim 1, characterized in that the line spectral frequency parameter transformation step described in step A is:
1) The K-dimensional line spectral frequency parameter is represented as $\mathbf{s} = [s_1, s_2, \ldots, s_K]^T$, satisfying $0 < s_1 < s_2 < \cdots < s_K < \pi$;
2) The transformed (K+1)-dimensional line spectral frequency parameter difference ΔLSF is $\mathbf{x} = [x_1, x_2, \ldots, x_{K+1}]^T$, where

$$x_i = \begin{cases} s_1/\pi, & i = 1 \\ (s_i - s_{i-1})/\pi, & 1 < i \le K \\ (\pi - s_K)/\pi, & i = K+1. \end{cases}$$
3. The text-independent speaker identification method of claim 1, wherein the line spectral frequency feature supervector generation step described in step B combines the current frame x(t) with its adjacent frames to form a supervector expressing dynamic information, the supervector comprising three subvectors. Assuming the interval between the current frame and both the previous and the next frame is τ, only the two neighboring frames of the current frame are considered, namely the previous frame x(t−τ) and the next frame x(t+τ), and the generated feature supervector is 3(K+1)-dimensional. The detailed process is as follows:
1) The (K+1)-dimensional line spectral frequency parameter difference vector is $\mathbf{x}(t) = [x_{1,1}, x_{1,2}, \ldots, x_{1,K+1}]^T$;
2) The supervector containing the dynamic information is

$$\mathbf{x}_{\sup}(t) \triangleq \begin{bmatrix} \mathbf{x}(t) \\ \mathbf{x}(t-\tau) \\ \mathbf{x}(t+\tau) \end{bmatrix} = \begin{bmatrix} x_{1,1}(t), \ldots, x_{1,K+1}(t), x_{2,1}(t), \ldots, x_{2,K+1}(t), x_{3,1}(t), \ldots, x_{3,K+1}(t) \end{bmatrix}^T, \quad \tau = 1, 2, \ldots$$
4. The text-independent speaker identification method of claim 1, wherein the detailed steps of the model training described in step two are:
1) The feature subvectors x(t), x(t−τ), x(t+τ) in $\mathbf{x}_{\sup}$ are mutually independent and each follows a Dirichlet distribution, so the supervector $\mathbf{x}_{\sup}$ follows the super-Dirichlet probability density distribution:

$$\mathrm{SDir}(\mathbf{x}_{\sup}; \boldsymbol{\alpha}) = \prod_{n=1}^{3} \frac{\Gamma\!\left(\sum_{k=1}^{K+1} \alpha_{n,k}\right)}{\prod_{k=1}^{K+1} \Gamma(\alpha_{n,k})} \prod_{k=1}^{K+1} (x_{n,k})^{\alpha_{n,k}-1}$$

2) For a temporal sequence of line spectral frequency parameter difference subvectors x(1), …, x(t), …, x(T), we have $X = [\mathbf{x}_{\sup}(1), \ldots, \mathbf{x}_{\sup}(T)]$, and the line spectral frequency parameter differences are modeled with the super-Dirichlet mixture model (SDMM):

$$f(X) = \prod_{t=1}^{T} \sum_{m=1}^{M} \pi_m\, \mathrm{SDir}(\mathbf{x}_{\sup}(t); \boldsymbol{\alpha}^{(m)})$$

where the weight factor is

$$\pi_m = \frac{1}{T} \sum_{t=1}^{T} \bar{z}_{tm} = \frac{1}{T} \sum_{t=1}^{T} \frac{\pi_m\, \mathrm{SDir}(\mathbf{x}_{\sup}(t); \boldsymbol{\alpha}^{(m)})}{\sum_{m'=1}^{M} \pi_{m'}\, \mathrm{SDir}(\mathbf{x}_{\sup}(t); \boldsymbol{\alpha}^{(m')})}.$$

3) Compute the model parameters: for the m-th mixture component, the parameter vector $\boldsymbol{\alpha}^{(m)}$ is divided into 3 subvectors, each parameter subvector corresponding to one subvector of $\mathbf{x}_{\sup}$. All parameters can then be obtained by solving the update equation (reproduced in the original only as an image, Figure FDA0000486788010000024).
CN201410134694.8A 2014-04-03 2014-04-03 Text-independent speaker identifying device based on line spectrum frequency difference value Pending CN103871411A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410134694.8A CN103871411A (en) 2014-04-03 2014-04-03 Text-independent speaker identifying device based on line spectrum frequency difference value


Publications (1)

Publication Number Publication Date
CN103871411A true CN103871411A (en) 2014-06-18

Family

ID=50909875

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410134694.8A Pending CN103871411A (en) 2014-04-03 2014-04-03 Text-independent speaker identifying device based on line spectrum frequency difference value

Country Status (1)

Country Link
CN (1) CN103871411A (en)



Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103685185A (en) * 2012-09-14 2014-03-26 上海掌门科技有限公司 Mobile equipment voiceprint registration and authentication method and system
CN103207961A (en) * 2013-04-23 2013-07-17 曙光信息产业(北京)有限公司 User verification method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHANYU MA, ARNE LEIJON: "Super-Dirichlet Mixture Models Using Differential Line Spectral Frequencies for Text-Independent Speaker Identification", INTERSPEECH 2011, 27 August 2011, pages 2360-2363 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108630207A (en) * 2017-03-23 2018-10-09 富士通株式会社 Method for identifying speaker and speaker verification's equipment
CN108694949A (en) * 2018-03-27 2018-10-23 佛山市顺德区中山大学研究院 Method for distinguishing speek person and its device based on reorder super vector and residual error network
CN108694949B (en) * 2018-03-27 2021-06-22 佛山市顺德区中山大学研究院 Speaker identification method and device based on reordering supervectors and residual error network


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication (application publication date: 20140618)