CN1652206A - Voiceprint recognition method - Google Patents

Voiceprint recognition method

Info

Publication number
CN1652206A
Authority
CN
China
Prior art keywords: vector sequence, sequence, model, speaker
Prior art date
Legal status
Granted
Application number
CNA2005100599131A
Other languages
Chinese (zh)
Other versions
CN1302456C (en)
Inventor
Zheng Fang (郑方)
Xiong Zhenyu (熊振宇)
Song Zhanjiang (宋战江)
Current Assignee
Beijing D-Ear Technologies Co., Ltd.
Original Assignee
Individual
Priority date
Filing date
Publication date
Application filed by Individual
Priority to CNB2005100599131A
Publication of CN1652206A
Application granted
Publication of CN1302456C
Active legal status
Anticipated expiration


Abstract

The present invention provides a voiceprint recognition method, belonging to the field of identity recognition based on biometric characteristics. The method comprises the following steps: first, acoustic features are extracted from the voice waveforms of several speakers to form each speaker's feature vector sequence; a universal background model is constructed from these feature vector sequences, and a probability model is trained for each speaker; acoustic features are then extracted from the speech to be identified to form its feature vector sequence, which is rearranged to obtain a reordered feature vector sequence; for each vector in the reordered sequence, kernel Gaussian mixtures are selected from a Gaussian mixture tree; the probability likelihood scores matching the reordered feature vector sequence against each speaker's probability model are computed and summed, pruning is applied as the sums accumulate, and the model with the maximum score is taken as the recognition result.

Description

A voiceprint recognition method
Technical field
The present invention relates to a voiceprint recognition method, and belongs to the technical field of identity recognition based on biometric characteristics.
Background technology
In the prior art, the text-independent voiceprint recognition (Voiceprint Recognition) method based on the universal background model (Universal Background Model, hereinafter UBM) comprises three parts: the training method of the universal background model UBM, the training method of the speaker models, and the voiceprint recognition method.
The universal background model UBM is trained as follows:
(1) Extract acoustic features from the voice waveforms of multiple speakers to form each speaker's feature vector sequence;
(2) Construct a universal background model from the feature vector sequences. All speakers' feature vectors are clustered with an existing clustering algorithm (such as the classical LBG algorithm) to obtain a mixture of K Gaussian distributions, where the k-th Gaussian has mean vector μ_k and diagonal covariance matrix Σ_k. Let w_k denote the fraction of all feature vectors assigned to the k-th Gaussian during clustering. The universal background model is then

UBM = {μ_k^ubm, Σ_k^ubm, w_k^ubm | 1 ≤ k ≤ K}.
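The clustering step above can be sketched as follows. This is a minimal k-means stand-in for the LBG clustering the patent mentions, summarizing each cluster as a diagonal-covariance Gaussian; the function name, parameters, and empty-cluster handling are illustrative, not from the patent:

```python
import numpy as np

def train_ubm(features, K=8, iters=20, seed=0):
    """Cluster feature vectors into K groups and summarize each group as a
    diagonal-covariance Gaussian (a k-means stand-in for LBG clustering)."""
    rng = np.random.default_rng(seed)
    # Initialize centroids from K distinct feature vectors.
    mu = features[rng.choice(len(features), K, replace=False)]
    for _ in range(iters):
        # Assign each vector to its nearest centroid.
        d = ((features[:, None, :] - mu[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        # Re-estimate centroids (skip clusters that went empty).
        for k in range(K):
            if (labels == k).any():
                mu[k] = features[labels == k].mean(0)
    # Diagonal variances, and weights w_k = fraction of vectors in cluster k.
    var = np.stack([features[labels == k].var(0) + 1e-6
                    if (labels == k).any() else np.ones(features.shape[1])
                    for k in range(K)])
    w = np.bincount(labels, minlength=K) / len(features)
    return mu, var, w
```

The returned triple (means, variances, weights) plays the role of {μ_k, Σ_k, w_k | 1 ≤ k ≤ K} above.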
The speaker models are trained as follows:
(1) Extract acoustic features from each speaker's voice waveform to form that speaker's feature vector sequence;
(2) Adapt the universal background model to each speaker's feature vector sequence to obtain that speaker's voiceprint model, and gather all the voiceprint models into a model bank. Any existing adaptation method (such as classical MAP adaptation) may be used; the Gaussian mixtures of a speaker's voiceprint model M = {μ_k, Σ_k, w_k | 1 ≤ k ≤ K} correspond one-to-one with those of the universal background model UBM = {μ_k^ubm, Σ_k^ubm, w_k^ubm | 1 ≤ k ≤ K}.
The voiceprint recognition procedure is:
(1) Extract acoustic features from the speech of the person to be identified to form the feature vector sequence to be identified;
(2) Match this feature vector sequence against each voiceprint model in the model bank one by one to obtain a matching score (also called the log-likelihood score, or simply the likelihood or score) for each speaker's voiceprint model. The matching score between the feature vector sequence and a speaker model is computed as follows: for each frame X_t (1 ≤ t ≤ T) of the sequence to be identified X = {X_1, ..., X_T}, first match it against the universal background model UBM = {μ_k^ubm, Σ_k^ubm, w_k^ubm | 1 ≤ k ≤ K} to find the N Gaussian mixtures k_1, ..., k_N that best match X_t; then use the corresponding Gaussian mixtures of the speaker's voiceprint model M = {μ_k, Σ_k, w_k | 1 ≤ k ≤ K} to compute the frame's matching score

S(X_t | M) = ln Σ_{n=1}^{N} w_{k_n} · p(X_t | μ_{k_n}, Σ_{k_n});

the score of the whole sequence is then

S(X | M) = Σ_{t=1}^{T} S(X_t | M);
(3) Depending on the type of recognition (closed-set identification, open-set identification, or verification), apply a rejection decision where needed, and output the result.
Shortcoming: the main problem of the UBM-based voiceprint recognition method is that the amount of computation required for recognition is too large. The computation comprises:
(1) For each frame's feature vector X_t, 1 ≤ t ≤ T, the N best-matching mixtures must be selected from the universal background model; since the number of mixtures in the UBM is usually very large (typically 1,024 or 2,048), this step is expensive. (2) A matching score must be computed for every speaker model; although each speaker model only requires the scores of N Gaussian mixtures (usually N = 4), a very large number of speaker models likewise causes a very large amount of computation.
Summary of the invention
The object of the present invention is to propose a voiceprint recognition method that overcomes the excessive computational cost of the existing UBM-based voiceprint recognition method and improves the speed of voiceprint recognition.
The voiceprint recognition method proposed by the present invention comprises the following steps:
(1) Extract acoustic features from the voice waveforms of multiple speakers to form each speaker's feature vector sequence;
(2) Construct a universal background model from the above feature vector sequences;
(3) Construct a Gaussian mixture tree from the universal background model;
(4) Train each speaker's probability model from the universal background model;
(5) Extract acoustic features from the speech to be identified to form its feature vector sequence, and rearrange the feature vectors to obtain a reordered feature vector sequence;
(6) For each vector in the reordered feature vector sequence, select the kernel Gaussian mixtures from the constructed Gaussian mixture tree;
(7) Using the kernel Gaussian mixtures, compute the probability likelihood scores matching the reordered feature vectors of the speech to be identified against each speaker's probability model;
(8) Sum the probability likelihood scores for each speaker's probability model, prune as the sums accumulate, and take the model with the maximum score as the recognition result.
In the above method, the rearrangement of the feature vectors in step (5) to obtain the reordered feature vector sequence comprises the following steps:
(1) From the feature vector sequence X = {X_1, ..., X_T}, select vectors at intervals of n to form the vector sequence O = {X_1, X_{1+n}, X_{1+2n}, ...}, and initialize a sequence Y = O;
(2) In sequence Y, take the arithmetic mean of the indices of each pair of adjacent vectors from left to right; if the vector whose index is nearest to this mean is not yet in Y, take it from X and append it to a new vector sequence Q;
(3) Append the resulting vector sequence Q to the back of Y;
(4) Repeat steps (2) and (3) until all vectors of X = {X_1, ..., X_T} have been arranged into Y.
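The reordering steps above can be sketched over frame indices (0-based here). The final straggler fill-in is my own addition to guarantee termination for awkward values of T and n; it is not part of the patent's description:

```python
def reorder_indices(T, n):
    """Coarse-to-fine reordering of frame indices 0..T-1, a sketch of the
    observation-reordering step: spread the frames out so that early pruning
    decisions already see the whole utterance."""
    y = list(range(0, T, n))          # every n-th frame first: O = Y
    seen = set(y)
    while len(y) < T:
        q = []
        for a, b in zip(y, y[1:]):    # adjacent entries of Y, left to right
            m = round((a + b) / 2)    # index nearest the midpoint
            if m not in seen:
                seen.add(m)
                q.append(m)
        if not q:                     # fill any stragglers (termination guard)
            q = [i for i in range(T) if i not in seen]
            seen.update(q)
        y += q                        # append Q to the back of Y
    return y
```

For example, with T = 9 and n = 4 the sequence starts [0, 4, 8] and then fills in the midpoints, so every prefix samples the utterance roughly uniformly.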
In the above method, the selection of kernel Gaussian mixtures from the constructed Gaussian mixture tree for each feature vector comprises the following steps:
(1) Let all child nodes of the root of the Gaussian mixture tree be the candidate node set;
(2) For the given feature vector, compute the likelihood score of each Gaussian distribution in the candidate node set;
(3) If the candidate nodes are leaf nodes, select the N Gaussian distributions with the highest likelihood scores as the kernel Gaussian mixtures; otherwise, select the K nodes with the highest likelihood scores, take all child nodes of these K nodes as the new candidate node set, and repeat steps (2) and (3).
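The tree search above amounts to a beam search; here is a minimal sketch assuming each node stores one Gaussian and each leaf carries its mixture index in the UBM. The Node class and the uniform-depth assumption are mine, not the patent's:

```python
import numpy as np

class Node:
    def __init__(self, mu, var, children=None, index=None):
        self.mu, self.var = mu, var        # Gaussian stored at this node
        self.children = children or []     # empty list for leaf nodes
        self.index = index                 # leaf's mixture index in the UBM

def log_gauss(x, mu, var):
    # Log-density of a diagonal-covariance Gaussian at x.
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mu) ** 2 / var)

def select_kernels(x, root, N=4, K=2):
    """Beam search down the Gaussian tree: keep the K best nodes per level,
    and return the indices of the N best leaves (the kernel mixtures)."""
    cand = root.children                   # start from the root's children
    while True:
        scored = sorted(cand, key=lambda nd: log_gauss(x, nd.mu, nd.var),
                        reverse=True)
        if not scored[0].children:         # reached the leaf level
            return [nd.index for nd in scored[:N]]
        # Expand the K best internal nodes into the next candidate set.
        cand = [c for nd in scored[:K] for c in nd.children]
```

Because only K subtrees are expanded per level, far fewer Gaussians are evaluated per frame than the 1,024 or 2,048 mixtures of a flat UBM.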
In the above method, the pruning over the sums of probability likelihood scores in step (8), taking the maximum score as the recognition result, comprises the following steps:
(1) Let the set of all speakers' probability models be the candidate set;
(2) For each vector in the reordered vector sequence in turn, compute the likelihood score of every probability model in the candidate set, and set the threshold Θ_τ = S(τ) − B, where S(τ) is the highest likelihood score among the models in the candidate set after the first τ frames of the reordered sequence have been computed, and B is a constant chosen according to the recognition requirements;
(3) Delete from the candidate set every speaker model whose likelihood score is below the threshold;
(4) Repeat steps (2) and (3) until only one model remains in the candidate set or all vectors have been processed.
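The pruning loop can be sketched as follows, with a hypothetical per-frame scoring callback standing in for the likelihood computation; the dictionary layout and names are illustrative:

```python
def prune_identify(Y, models, score_frame, B=10.0):
    """Frame-synchronous pruning over the reordered sequence Y: after each
    frame, drop every model more than B below the current best score."""
    cand = dict(models)                   # name -> model; all speakers start live
    total = {name: 0.0 for name in cand}  # accumulated log-likelihood scores
    for y in Y:
        for name, m in cand.items():
            total[name] += score_frame(y, m)
        best = max(total[name] for name in cand)
        theta = best - B                  # pruning threshold Θ_τ = S(τ) − B
        cand = {n: m for n, m in cand.items() if total[n] >= theta}
        if len(cand) == 1:                # only one candidate left: stop early
            break
    # Return the highest-scoring surviving model.
    return max(cand, key=lambda n: total[n])
```

Because Y is the coarse-to-fine reordered sequence, poorly matching speakers tend to fall below Θ_τ after only a few frames, which is where the speed-up comes from.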
The voiceprint recognition method proposed by the present invention introduces tree-based kernel selection (TBKS) and observation-reordering-based pruning (ORBP) into the UBM-based voiceprint recognition system. Without substantially reducing the recognition rate, it greatly reduces the computation required for voiceprint recognition and improves its speed. The method of the present invention and the general UBM-based method were tested on a speech database of 1,031 speakers and 1,816 test utterances: the general UBM-based method achieved a recognition accuracy of 95.32%, while the method of the present invention achieved 95.26% and ran 16 times faster.
Description of drawings
Fig. 1 is a schematic diagram of the structure of the Gaussian mixture tree used in the method of the present invention.
Embodiment
The voiceprint recognition method proposed by the present invention first extracts acoustic features from the voice waveforms of multiple speakers to form each speaker's feature vector sequence; constructs a universal background model from these sequences; constructs a Gaussian mixture tree from the universal background model; trains each speaker's probability model from the universal background model; extracts acoustic features from the speech to be identified, forms its feature vector sequence, and rearranges it into a reordered feature vector sequence; selects kernel Gaussian mixtures from the constructed Gaussian mixture tree for each vector of the reordered sequence; computes, using the kernel Gaussian mixtures, the probability likelihood scores matching the reordered feature vectors against each speaker's probability model; and sums these scores per speaker, pruning as the sums accumulate, taking the model with the maximum score as the recognition result.
An embodiment of the present invention is described below.
The embodiment of the voiceprint recognition method of the present invention comprises the training of the universal background model, the construction of the Gaussian mixture tree over the universal background model, the training of the speaker models, and the voiceprint recognition, described as follows:
The concrete steps of training the universal background model in this embodiment are:
(1) Take the voice data of 60 male and 60 female speakers, analyze the raw speech waveform data, and discard all silent segments;
(2) With a frame width of 32 milliseconds and a frame shift of half the frame width, extract 16-dimensional linear prediction cepstral coefficients (LPCC) from each frame and compute their regression parameters, forming 32-dimensional feature vectors; the feature vectors of all frames form the feature vector sequence;
(3) Cluster the speakers' feature vector sequences with the classical LBG algorithm to obtain a mixture of 1,024 Gaussian distributions, where the k-th Gaussian has mean vector μ_k and diagonal covariance matrix Σ_k. Let w_k be the fraction of all feature vectors assigned to the k-th Gaussian during LBG clustering. The universal background model is then UBM = {μ_k, Σ_k, w_k | 1 ≤ k ≤ K}.
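Steps (1)-(2) above (silence removal and framing) might be sketched as follows. The energy threshold used to discard quiet frames is an assumption, since the patent does not specify its silence detector; the LPCC extraction itself is omitted:

```python
import numpy as np

def frame_signal(x, sr, frame_ms=32, energy_ratio=0.1):
    """Split a waveform into 32 ms frames with a half-frame shift and drop
    low-energy (silent) frames. The threshold choice is an assumption."""
    flen = int(sr * frame_ms / 1000)      # samples per 32 ms frame
    hop = flen // 2                       # frame shift = half the frame width
    n = 1 + max(0, (len(x) - flen) // hop)
    frames = np.stack([x[i * hop : i * hop + flen] for i in range(n)])
    # Keep frames whose mean energy exceeds a fraction of the average energy.
    energy = (frames ** 2).mean(axis=1)
    return frames[energy > energy_ratio * energy.mean()]
```

Each retained frame would then be passed to an LPCC extractor to produce the 16-dimensional cepstra plus their 16 regression parameters.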
The concrete steps of constructing the Gaussian mixture tree over the universal background model in this embodiment are:
(1) Specify a tree structure of 5 layers: the root node (first layer) has 16 child nodes, each second-layer node has 4 child nodes, each third-layer node has 4 child nodes, and the number of fourth-layer nodes is determined by the construction method of the Gaussian mixture tree;
(2) Construct the Gaussian mixture tree with the aforementioned construction method.
The concrete steps of training the speaker models in this embodiment are:
(1) Take one speaker's voice data, analyze the raw speech waveform data, and discard all silent segments;
(2) With a frame width of 32 milliseconds and a frame shift of half the frame width, extract 16-dimensional linear prediction cepstral coefficients (LPCC) from each frame and compute their regression parameters, forming 32-dimensional feature vectors; the feature vectors of all frames form the feature vector sequence;
(3) Adapt the universal background model to the speaker's feature vector sequence with the classical MAP method to obtain the speaker model;
(4) If any speaker remains untrained, return to step (1) for the next speaker; otherwise the training process ends.
The voiceprint recognition of this embodiment comprises the following steps:
(1) Collect the voice data of the speaker to be identified, analyze the raw speech waveform data, and discard all silent segments;
(2) With the same frame width and frame shift used in voiceprint model training, extract 16-dimensional linear prediction cepstral coefficients (LPCC) from each frame and compute their regression parameter vectors, forming 32-dimensional feature vectors to be identified; the feature vectors of all frames form the feature vector sequence to be identified X = {X_1, ..., X_T};
(3) Apply the observation-reordering-based pruning method to resequence X = {X_1, ..., X_T} into a new sequence Y = {Y_1, ..., Y_T};
(4) Let the candidate set be all the speakers' voiceprint models in the voiceprint model bank;
(5) For each frame's feature vector Y_τ, 1 ≤ τ ≤ T, apply the aforementioned best-matching-mixture search to find the 4 Gaussian mixtures of the universal background model that best match this frame, with labels k_1, k_2, k_3, k_4;
(6) Take a speaker's voiceprint model M = {μ_k, Σ_k, w_k | 1 ≤ k ≤ K} from the candidate set and compute its matching score
S(Y_τ | M) = Σ_{i=1}^{4} w_{k_i} · p(Y_τ | μ_{k_i}, Σ_{k_i});
then compute the accumulated score of this model
S(M) = Σ_{t=1}^{τ} ln S(Y_t | M);
(7) Find the speaker model with the highest accumulated score in the candidate set; denoting its accumulated score S_max(τ), set the pruning threshold Θ_τ = S_max(τ) − B, and delete from the candidate set every voiceprint model whose matching score is below Θ_τ;
(8) Repeat the above steps until only one speaker model remains in the candidate set or the whole feature vector sequence has been processed;
(9) Take the maximum accumulated score S_max(T) in the candidate set and the corresponding speaker model M_max as the recognition result; output the result, and the voiceprint recognition process ends.

Claims (4)

1. A voiceprint recognition method, characterized in that the method comprises the following steps:
(1) extracting acoustic features from the voice waveforms of multiple speakers to form each speaker's feature vector sequence;
(2) constructing a universal background model from the above feature vector sequences;
(3) constructing a Gaussian mixture tree from the universal background model;
(4) training each speaker's probability model from the universal background model;
(5) extracting acoustic features from the speech to be identified to form its feature vector sequence, and rearranging the feature vectors to obtain a reordered feature vector sequence;
(6) for each vector in the reordered feature vector sequence, selecting the kernel Gaussian mixtures from the constructed Gaussian mixture tree;
(7) using the kernel Gaussian mixtures, computing the probability likelihood scores matching the reordered feature vectors of the speech to be identified against each speaker's probability model;
(8) summing the probability likelihood scores for each speaker's probability model, pruning, and taking the model with the maximum score as the recognition result.
2, the method for claim 1 is characterized in that wherein step (5) with the eigenvector rearrangement, and the method for the feature vector sequence that obtains reordering may further comprise the steps:
(1) at feature vector sequence X={X 1... X TIn, n selects vector with the interval, forms vector sequence O={X 1, X 1+n, X 1+2n... }, set up sequence Y, make Y=O;
(2) in sequence Y, get the arithmetic mean of the sequence number of adjacent vector from left to right successively, if from the vector of the nearest sequence number correspondence of this mean value not in above-mentioned Y, then from X, take out this vector and join among the new vector sequence Q;
(3) back of adding the above-mentioned vector sequence Q that obtains to vector sequence Y;
(4) repeating step (2) and (3) are up to vector sequence X={X 1... X TIn all vector arrange alls in vector sequence Y.
3, the method for claim 1 is characterized in that wherein being each eigenvector, selects the method for core Gaussian Mixture from the Gaussian Mixture tree that makes up, and comprises the steps:
(1) all child nodes of establishing the root node of Gaussian Mixture tree are the both candidate nodes set;
(2) to described each eigenvector, the likelihood mark of each Gaussian distribution in the calculated candidate node set;
(3) if both candidate nodes is a leaf node, then select N the highest Gaussian distribution of likelihood mark as the core Gaussian Mixture; If both candidate nodes is not a leaf node, then select K the highest node of likelihood mark, all child nodes of K node are gathered as both candidate nodes, repeat above-mentioned steps (2) and (3).
4, the method for claim 1 is characterized in that wherein step (8) is carried out beta pruning to the summation of probability likelihood mark, and that gets the mark maximum is the method for recognition result, may further comprise the steps:
(1) the probability model set of establishing all speakers is candidate collection;
(2) successively to each vector in the described vector sequence that reorders, the likelihood mark of all probability models in the calculated candidate set, and threshold value Θ is set τ=S (τ)-B, wherein, S (τ) is for calculating in the vector sequence that reorders behind the τ frame, and the highest likelihood mark of model in the candidate collection, B be the constant according to the identification requirement setting;
(3) all likelihood marks are deleted from candidate collection less than the speaker model of above-mentioned threshold value;
(4) repeating step (2) and (3), only surplus next model in candidate collection, or all vectors have all been calculated.
CNB2005100599131A 2005-04-01 2005-04-01 Voiceprint recognition method Active CN1302456C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2005100599131A CN1302456C (en) 2005-04-01 2005-04-01 Voiceprint recognition method


Publications (2)

Publication Number Publication Date
CN1652206A true CN1652206A (en) 2005-08-10
CN1302456C CN1302456C (en) 2007-02-28

Family

ID=34876833

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2005100599131A Active CN1302456C (en) 2005-04-01 2005-04-01 Sound veins identifying method

Country Status (1)

Country Link
CN (1) CN1302456C (en)




Also Published As

Publication number Publication date
CN1302456C (en) 2007-02-28


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
ASS Succession or assignment of patent right

Owner name: BEIJING D-EAR TECHNOLOGIES CO., LTD.

Free format text: FORMER OWNER: ZHENG FANG

Effective date: 20121231

C41 Transfer of patent application or patent right or utility model
TR01 Transfer of patent right

Effective date of registration: 20121231

Address after: 100084 room 1005, B building, Tsinghua Science and Technology Park, Haidian District, Beijing

Patentee after: BEIJING D-EAR TECHNOLOGIES Co.,Ltd.

Address before: 100084 Haidian District Tsinghua Yuan, Beijing, Tsinghua University, West 14-4-202

Patentee before: Zheng Fang

DD01 Delivery of document by public notice

Addressee: Mi Qingshan

Document name: payment instructions

DD01 Delivery of document by public notice

Addressee: BEIJING D-EAR TECHNOLOGIES Co., Ltd. (person in charge of patents)

Document name: Notice of Termination of Patent Rights
