CN106448685A - System and method for identifying voice prints based on phoneme information - Google Patents

System and method for identifying voice prints based on phoneme information Download PDF

Info

Publication number
CN106448685A
CN106448685A (Application CN201610880776.6A)
Authority
CN
China
Prior art keywords
phoneme
numeric string
information
module
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610880776.6A
Other languages
Chinese (zh)
Other versions
CN106448685B (en)
Inventor
郑榕
张策
王黎明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Yuanjian Information Technology Co Ltd
Original Assignee
Beijing Yuanjian Technologies Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Yuanjian Technologies Co ltd filed Critical Beijing Yuanjian Technologies Co ltd
Priority to CN201610880776.6A priority Critical patent/CN106448685B/en
Publication of CN106448685A publication Critical patent/CN106448685A/en
Application granted granted Critical
Publication of CN106448685B publication Critical patent/CN106448685B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/06 Decision making techniques; Pattern matching strategies
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/02 Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/04 Training, enrolment or model building
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/18 Artificial neural networks; Connectionist approaches

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Game Theory and Decision Science (AREA)
  • Business, Economics & Management (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Telephonic Communication Services (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to a system and a method for voiceprint authentication based on phoneme information. The system comprises a phoneme forced-alignment module based on a Mandarin Chinese speech recognizer, a phoneme-dependent model creation module, and a neural-network classifier module based on a dropout strategy. The method comprises the following steps: defining 16 phoneme classes for Mandarin Chinese digit-string voiceprints and explicitly using the pronunciation-class information of each digit string; using the Viterbi forced-alignment algorithm with the Mandarin Chinese speech recognizer to obtain the phoneme boundaries of the text content of each digit string; building phoneme-dependent models with a text-independent algorithm; and scoring against the phoneme-dependent models to obtain a score vector. The system and method segment phoneme information, model each phoneme, and analyze the discriminative power of each phoneme-dependent model; the proposed neural-network training method based on the dropout strategy solves the problem of missing phonemes in digit strings and improves the performance of digit-string voiceprint authentication systems.

Description

Voiceprint authentication system and method based on phoneme information
Technical field
The present invention relates to the technical field of voiceprint authentication systems, and in particular to a voiceprint authentication system and method based on phoneme information.
Background technology
Biometric recognition is a technology that identifies a person's identity according to the physiological and behavioral characteristics inherent to the human body. It is hard to forget, difficult to counterfeit or steal, always carried with the person, and usable anytime and anywhere. With the rapid development of the Internet, traditional identity authentication techniques increasingly fail to meet the demands of user experience and security. Voiceprint recognition technology is easy to use and, owing to its broad application prospects and its great social and economic benefits, has attracted extensive attention and high regard from all walks of life.
Voiceprint recognition, also known as speaker recognition, is a kind of biometric technology. It distinguishes a speaker's identity through the speech parameters in the speech waveform that reflect the speaker's physiological and behavioral characteristics. It is safe and convenient, and its data are easy to acquire.
In recent years, text-dependent speaker recognition has become a focus of the user-authentication field. Owing to major progress in the field of text-independent speaker recognition, many researchers have tried to apply text-independent speaker recognition algorithms to text-dependent areas such as digit-string voiceprint recognition.
Under digit-string authentication conditions, researchers have compared Joint Factor Analysis (JFA), Gaussian Mixture Model-Nuisance Attribute Projection (GMM-NAP) and Hidden Markov Model-Nuisance Attribute Projection (HMM-NAP). Compared with JFA, the NAP-based algorithms performed better, because training JFA requires a large amount of labeled data and there is a mismatch between the training data of the JFA matrices and the digit-string test data.
In text-independent speaker recognition, both JFA and the total-variability (iVector) algorithm based on Probabilistic Linear Discriminant Analysis (PLDA) rely on large amounts of development data. A growing amount of work is devoted to the problem of transferring limited in-domain development data to out-of-domain application data, for example adaptation to lexical gaps and back-off algorithms.
A digit-string speech corpus of 536 speakers was recorded with Android and iOS (Apple) mobile phones. It is divided into two scenarios: a global condition and a rand-n condition. In the global condition, enrollment and verification use the same digit-string content; in the rand-n condition, each digit string is a random string of length n, which is safer than the global condition in application systems that must resist replay attacks. The three enrollment/verification conditions involved in the present invention are shown in Table 1: a fixed full-digit password, a dynamic 8-digit password, and a dynamic 6-digit password. Each scenario is divided into a development set and an evaluation set. The development set is used to train the Universal Background Model (UBM), the total-variability matrix (iVector T matrix), the Linear Discriminant Analysis (LDA) matrix, and so on. Under the three evaluation conditions, every speaker has three enrollment utterances and one test utterance, and each test utterance is compared against all speaker models.
Table 1: Examples of the digit-password formats
Table 2 compares the equal error rate (EER) of the GMM-NAP and iVector voiceprint authentication systems. The results show that, as the digit-string length increases, the performance of the voiceprint authentication system improves significantly and consistently. However, neither the GMM-NAP nor the iVector system makes use of phoneme (phone) information; both are direct applications of text-independent voiceprint recognition to a text-dependent scenario. In digit-string voiceprint applications, ignoring phoneme information, or failing to use it effectively, limits the practical effectiveness of text-independent recognition algorithms.
Table 2: Equal error rate comparison of GMM-NAP and iVector systems under different test conditions

System     Fixed full-digit password   Dynamic 8-digit password   Dynamic 6-digit password
GMM-NAP    2.09%                       2.64%                      3.76%
iVector    1.87%                       2.40%                      3.32%
Summary of the invention
The object of the present invention is to propose a voiceprint authentication system and method based on phoneme information that can perform phoneme segmentation, build phoneme-dependent (phone-dependent) models and analyze their discriminative power, solve the problem of missing phonemes in digit strings, and improve the performance of digit-string voiceprint authentication systems.
To achieve the above technical purpose, the technical solution of the present invention is realized as follows:
A voiceprint authentication system based on phoneme information comprises a phoneme forced-alignment module based on a Mandarin Chinese speech recognizer, a phoneme-dependent model creation module, and a neural-network classifier module based on a dropout strategy;
the phoneme forced-alignment module based on the Mandarin Chinese speech recognizer is used to segment a digit string into 16 phoneme classes;
the phoneme-dependent model creation module is used to build phoneme-dependent models and to analyze the discriminative power of each phoneme-dependent model for voiceprint authentication, characterizing the distinguishing features of the speaker rather than differences between vocabulary items;
the neural-network classifier module based on the dropout strategy is used to fuse the complementary information of the phoneme-dependent models.
A voiceprint authentication method based on phoneme information comprises the following steps:
S01: define 16 phoneme classes for Mandarin Chinese digit-string voiceprints and explicitly use the pronunciation-class information of each digit string;
S02: based on the Mandarin Chinese speech recognizer, use the Viterbi forced-alignment algorithm to obtain the phoneme boundaries corresponding to the text content of each digit string, completing the phoneme segmentation of the speech content, i.e. the mapping from speech feature vectors to phonemes; the feature-vector subsets belonging to each phoneme are obtained, and each subset is treated as an independent data stream for subsequent processing;
S03: build phoneme-dependent models with a text-independent algorithm; the model-building process reduces the number of parameters of each phoneme-dependent model to avoid over-training;
S04: score against the phoneme-dependent models to obtain a score vector.
Further, in step S04 a back-end fused classifier is trained with the dropout strategy of a neural-network algorithm.
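To make the flow of steps S01 to S04 concrete, the following Python sketch strings them together. It is illustrative only: the forced-alignment segment format and the per-phoneme mean-vector scoring are assumptions standing in for the patent's text-independent phoneme-dependent models (such as GMM-NAP or iVector), and the 16-dimensional score vector it produces would be fed to the dropout-trained neural-network back end described later.

```python
# High-level sketch of S01-S04 (assumed interfaces, not the patent's implementation).
import numpy as np

NUM_PHONEMES = 16          # S01: 16 phoneme classes for Mandarin digit strings
MISSING = 0.0              # placeholder score for phonemes absent from the utterance

def segment(features, alignment):
    """S02: map feature vectors to phonemes using forced-alignment (start, end, phoneme_id) segments."""
    subsets = {}
    for start, end, pid in alignment:
        subsets.setdefault(pid, []).append(features[start:end])
    return {pid: np.concatenate(chunks) for pid, chunks in subsets.items()}

def enroll(features, alignment):
    """S03: build one (toy) phoneme-dependent model per phoneme from enrollment speech."""
    return {pid: frames.mean(axis=0) for pid, frames in segment(features, alignment).items()}

def score_vector(models, features, alignment):
    """S04: score each phoneme stream against its model to obtain a 16-dim score vector."""
    xi = np.full(NUM_PHONEMES, MISSING)
    for pid, frames in segment(features, alignment).items():
        if pid in models:
            m, t = models[pid], frames.mean(axis=0)
            xi[pid] = float(m @ t / (np.linalg.norm(m) * np.linalg.norm(t) + 1e-9))
    return xi   # fed to the dropout-trained neural-network back-end classifier

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    enroll_feats, test_feats = rng.standard_normal((400, 39)), rng.standard_normal((200, 39))
    enroll_ali = [(i * 25, (i + 1) * 25, i) for i in range(16)]   # toy: all 16 phonemes seen
    test_ali = [(0, 100, 2), (100, 200, 7)]                       # toy: only 2 phonemes seen
    models = enroll(enroll_feats, enroll_ali)
    print(score_vector(models, test_feats, test_ali))
```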
Beneficial effects of the present invention:
(1) The present invention uses a typical Mandarin Chinese speech recognizer with the Viterbi forced-alignment algorithm to obtain the phoneme boundaries corresponding to the text content of each digit string and to complete the phoneme segmentation of the speech content; compared with common segmentation based on algorithms such as Dynamic Time Warping (DTW), this segmentation is more accurate;
(2) The present invention defines 16 pronunciation classes for the digit-string pronunciations of Mandarin Chinese, avoiding the over-training problem caused by too few feature vectors in a phoneme class; it builds phoneme-dependent models and analyzes the discriminative power of each phoneme-dependent model for voiceprint authentication; the phoneme-dependent models characterize the distinguishing features of the speaker rather than the differences between vocabulary items;
(3) To further improve the use of the information in the phoneme-dependent models, and considering that in practical applications an authentication utterance contains only part of the phoneme set so that dimensions of the score vector may be missing, a neural-network back-end classifier is trained with the dropout strategy, realizing a fused decision over the phoneme-dependent score vector and clearly improving the system performance of voiceprint authentication.
Description of the drawings
Fig. 1 is the processing flowchart of the back-end classifier for the phoneme-dependent score vector in the present invention;
Fig. 2 shows the equal-error-rate experimental results of the different phoneme-dependent models in the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention fall within the protection scope of the present invention.
The present invention proposes a digit-string voiceprint authentication method that explicitly uses phoneme information and combines it with neural-network classification. For each digit string, the Viterbi forced-alignment algorithm of the Mandarin Chinese speech recognizer completes the phoneme segmentation of the speech content. The number of training parameters of each phoneme-dependent model is reduced to avoid the over-training that the small amount of training speech per phoneme model might cause, and the discriminative power of each phoneme model for voiceprint recognition is analyzed. Because dimensions of the phoneme-dependent score vector may be missing, a back-end fused classifier is trained with the dropout strategy of a neural-network algorithm, which improves the use of phoneme-dependent information and further improves the system performance of digit-string voiceprint authentication.
Table 3 gives the phonemic representation of the ten Mandarin Chinese digit pronunciations. Note that the digit "1" has two pronunciations, "yi" and "yao", so the ten Mandarin digits correspond to 16 phonemes.
Table 3: Mandarin Chinese pronunciation phonemes of the ten digits
In " the whole numerical ciphers of fixation " condition, phoneme content immobilizes." dynamic 8 bit digital password " and " dynamic 6 The phoneme content of numerical ciphers " is also known, because the random algorithm that digital text is typically based on background system is pushed or base Generated according to special algorithm in OTP dynamic password (One-time Password).
Based on the Mandarin Chinese speech recognition system, the Viterbi forced-alignment algorithm is used to obtain the phoneme boundaries corresponding to the text content, completing the phoneme segmentation of the speech content, i.e. the mapping from speech feature vectors to phonemes.
Therefore, given the acoustic feature vector sequence χ = x_1, ..., x_T of a digit-string utterance, it can be segmented into discrete subsets χ_1, ..., χ_16, where x ∈ χ_i denotes a feature vector belonging to the i-th phoneme. Each subset is treated as an independent data stream for subsequent processing. In the voiceprint enrollment stage, the 16 phoneme-dependent models of speaker s (one model per phoneme) are obtained by a text-independent training algorithm. Note that the enrollment speech must cover all ten digits; in the present invention, three digit strings are used for enrollment, ensuring that every digit occurs at least once in each speaker's enrollment speech.
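As a small illustration (not taken from the patent) of the enrollment constraint just stated, the following check verifies that a set of enrollment digit strings covers every digit at least once, so that every phoneme-dependent model can be trained.

```python
# Illustrative check of the enrollment coverage requirement described above.
def covers_all_digits(enrollment_strings: list[str]) -> bool:
    """True if the enrollment digit strings together contain every digit 0-9 at least once."""
    seen = set("".join(enrollment_strings))
    return set("0123456789") <= seen

if __name__ == "__main__":
    print(covers_all_digits(["0123456789", "3141592653", "2718281828"]))  # True
    print(covers_all_digits(["123123", "456456", "789789"]))              # False: '0' missing
```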
In the voiceprint authentication stage, under the "fixed full-digit password" condition a 16-dimensional score vector ξ = (ξ_1, ..., ξ_16) is obtained, and a decision can be made by averaging the score vector ξ or by training a back-end classifier with a method such as logistic regression. Under rand-n conditions such as the "dynamic 8-digit password" and "dynamic 6-digit password", however, dimensions of the score vector ξ may be missing, because the test utterance contains only part of the phoneme set. To solve this problem, the dropout strategy from neural-network algorithms is adopted, which is an effective way to improve generalization.
The dropout training algorithm of the neural network is standard stochastic gradient descent, except that during the forward computation some input units and hidden units are randomly ignored with a certain probability γ. Only the active units participate in back-propagation and the gradient computation. Because dropout is not used at recognition time, the output of each layer is rescaled during training:

y^l = δ( W^l (bm ∗ y^(l-1)) + b^l )

where δ(·), W^l and b^l are respectively the activation function, the weights of layer l and the bias of layer l; bm is a binary mask indicating which dimensions are dropped, and ∗ denotes element-wise multiplication. The retained activations are rescaled by 1/(1 − γ) so that no adjustment is needed at recognition time.
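A minimal numpy sketch of one such dropout forward step is given below. It assumes a ReLU activation for δ(·) and uses inverted-dropout rescaling by 1/(1 − γ); it illustrates the mechanism only and is not the patent's implementation.

```python
# Illustrative dropout forward step: drop units with probability gamma via a binary
# mask and rescale the survivors, so nothing changes when dropout is off at test time.
import numpy as np

def dropout_forward(y_prev: np.ndarray, W: np.ndarray, b: np.ndarray,
                    gamma: float, train: bool = True) -> np.ndarray:
    if train:
        bm = (np.random.rand(*y_prev.shape) >= gamma).astype(y_prev.dtype)  # binary mask
        y_prev = bm * y_prev / (1.0 - gamma)  # inverted-dropout rescaling
    z = y_prev @ W + b
    return np.maximum(z, 0.0)  # delta(.) chosen here as ReLU for illustration

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal(16)               # 16-dim phoneme score vector as input
    W, b = rng.standard_normal((16, 32)), np.zeros(32)
    print(dropout_forward(x, W, b, gamma=0.2).shape)  # (32,)
```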
This process can be viewed as an effective model-averaging method, i.e. an averaged representation of a large number of different networks that share weights but have dropped-out units. As shown in Fig. 1, a neural-network classifier containing one hidden layer is trained. Its input is the score vector, and its output contains two units representing the target-verification class and the impostor-verification class respectively. To handle the missing score-vector dimensions under rand-n conditions such as the "dynamic 8-digit password" and "dynamic 6-digit password", the dropout strategy is applied to the input layer with probability γ during network training. In the verification stage, the log-likelihood ratio computed as follows is used as the system output:

LLR(ξ) = log p(ξ | target verification class) − log p(ξ | impostor verification class)
where p(ξ | target verification class) and p(ξ | impostor verification class) are the likelihoods of the score vector ξ. By Bayes' formula, the likelihoods can be converted into posterior form:
p(ξ | target verification class) = p(target verification class | ξ) p(ξ) / p(target verification class)
p(ξ | impostor verification class) = p(impostor verification class | ξ) p(ξ) / p(impostor verification class)
where p(target verification class | ξ) and p(impostor verification class | ξ) are the posteriors of the score vector ξ obtained by the forward computation of the network, and p(target verification class) and p(impostor verification class) are the priors of the target-verification class and the impostor-verification class estimated from the training set. p(ξ) is independent of either model and can be ignored in the LLR computation.
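The sketch below ties these pieces together: a one-hidden-layer classifier with dropout on the input layer and an LLR computed from the network posteriors and the class priors, with p(ξ) cancelling in the ratio. The hidden-layer size, the ReLU activation and the random initialization are assumptions for illustration; only the overall structure follows the description above.

```python
# Illustrative one-hidden-layer classifier with input-layer dropout and LLR scoring.
import numpy as np

class ScoreVectorClassifier:
    def __init__(self, in_dim=16, hidden=64, gamma=0.2, seed=0):
        rng = np.random.default_rng(seed)
        self.gamma = gamma
        self.W1 = rng.standard_normal((in_dim, hidden)) * 0.1
        self.b1 = np.zeros(hidden)
        self.W2 = rng.standard_normal((hidden, 2)) * 0.1   # outputs: [target, impostor]
        self.b2 = np.zeros(2)

    def posteriors(self, xi: np.ndarray, train: bool = False) -> np.ndarray:
        x = xi
        if train:  # dropout on the input layer only, as in the text
            mask = (np.random.rand(*x.shape) >= self.gamma).astype(x.dtype)
            x = mask * x / (1.0 - self.gamma)
        h = np.maximum(x @ self.W1 + self.b1, 0.0)
        logits = h @ self.W2 + self.b2
        e = np.exp(logits - logits.max())
        return e / e.sum()                                  # p(class | xi)

    def llr(self, xi: np.ndarray, prior_target: float, prior_impostor: float) -> float:
        post = self.posteriors(xi, train=False)
        # Bayes: p(xi | class) is proportional to p(class | xi) / p(class); p(xi) cancels.
        return float(np.log(post[0] / prior_target) - np.log(post[1] / prior_impostor))

if __name__ == "__main__":
    clf = ScoreVectorClassifier()
    xi = np.random.randn(16)   # phoneme-dependent score vector (missing dims set to 0)
    print(clf.llr(xi, prior_target=0.1, prior_impostor=0.9))
```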
The discriminative power of each phoneme model for voiceprint recognition is analyzed first. Considering that the amount of training speech for each phoneme model is small, the number of training parameters of each phoneme-dependent model is reduced in order to avoid over-training. Fig. 2 gives the equal-error-rate comparison of the phoneme-dependent models.
As can be seen from Fig. 2, first, across all phoneme-dependent models iVector is slightly better than the GMM-NAP model. Second, the EER of the worst-performing consonant "w" is about five times that of the best-performing vowel "an". This experimental result is instructive for practical applications: an online system can avoid pushing digits with poor performance, such as "5 [wu]".
A back-end classifier is trained with the dropout neural network to produce a fused output from the phoneme-dependent score vector. Table 4 gives the equal-error-rate comparison of the phoneme-dependent models with different back-end classifiers. For convenience of comparison, the authentication performance obtained by averaging the phoneme-dependent scores of the GMM-NAP and iVector systems is also given; the averaged score is the arithmetic mean of the available components of the score vector ξ.
Table 4: Equal-error-rate comparison of the phoneme-dependent models with different back-end classifiers
As can be seen from Table 4, the algorithm of the present invention, which explicitly uses phoneme information and fuses it with a neural-network back end, can effectively improve the system performance of digit-string voiceprint authentication. Compared with score averaging, the neural-network back-end classifier achieves a lower equal error rate and better performance. Compared with the GMM-NAP and iVector results in Table 2, under the three different enrollment/verification conditions the algorithm with phoneme-dependent models and a neural-network back-end classifier achieves a relative EER reduction of about 20%.
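For comparison with the neural-network back end, the score-averaging baseline referenced in Table 4 can be sketched as follows; skipping absent dimensions when phonemes are missing is an assumption made for illustration and not a detail taken from the patent.

```python
# Illustrative score-averaging baseline: mean of the phoneme-dependent scores that
# are actually present in the test utterance.
import numpy as np

def average_fusion(xi: np.ndarray, present_mask: np.ndarray) -> float:
    """xi: 16-dim phoneme-dependent scores; present_mask: 1 where the phoneme occurred."""
    present = present_mask.astype(bool)
    return float(xi[present].mean()) if present.any() else 0.0

if __name__ == "__main__":
    xi = np.random.randn(16)
    mask = np.zeros(16)
    mask[[0, 3, 5, 9]] = 1   # only four phonemes occur in a short dynamic password
    print(average_fusion(xi, mask))
```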
The foregoing is only the preferred embodiments of the present invention and is not intended to limit the present invention. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims (3)

1. A voiceprint authentication system based on phoneme information, characterized in that it comprises a phoneme forced-alignment module based on a Mandarin Chinese speech recognizer, a phoneme-dependent model creation module, and a neural-network classifier module based on a dropout strategy;
the phoneme forced-alignment module based on the Mandarin Chinese speech recognizer is used to segment a digit string into 16 phoneme classes;
the phoneme-dependent model creation module is used to build phoneme-dependent models and to analyze the discriminative power of each phoneme-dependent model for voiceprint authentication;
the neural-network classifier module based on the dropout strategy is used to fuse the complementary information of the phoneme-dependent models.
2. A voiceprint authentication method based on phoneme information, characterized in that it comprises the following steps:
S01: defining 16 phoneme classes for Mandarin Chinese digit-string voiceprints and explicitly using the pronunciation-class information of each digit string;
S02: based on the Mandarin Chinese speech recognizer, using the Viterbi forced-alignment algorithm to obtain the phoneme boundaries corresponding to the text content of each digit string, completing the phoneme segmentation of the speech content and obtaining the feature-vector subsets belonging to each phoneme;
S03: building phoneme-dependent models with a text-independent algorithm;
S04: scoring against the phoneme-dependent models to obtain a score vector.
3. The voiceprint authentication method based on phoneme information according to claim 2, characterized in that in step S04 a back-end fused classifier is trained with the dropout strategy of a neural-network algorithm.
CN201610880776.6A 2016-10-09 2016-10-09 Voiceprint authentication system and method based on phoneme information Active CN106448685B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610880776.6A CN106448685B (en) 2016-10-09 2016-10-09 Voiceprint authentication system and method based on phoneme information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610880776.6A CN106448685B (en) 2016-10-09 2016-10-09 Voiceprint authentication system and method based on phoneme information

Publications (2)

Publication Number Publication Date
CN106448685A true CN106448685A (en) 2017-02-22
CN106448685B CN106448685B (en) 2019-11-22

Family

ID=58172115

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610880776.6A Active CN106448685B (en) 2016-10-09 2016-10-09 Voiceprint authentication system and method based on phoneme information

Country Status (1)

Country Link
CN (1) CN106448685B (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108198574A (en) * 2017-12-29 2018-06-22 科大讯飞股份有限公司 Change of voice detection method and device
CN108648760A (en) * 2018-04-17 2018-10-12 四川长虹电器股份有限公司 Real-time sound-groove identification System and method for
CN109065023A (en) * 2018-08-23 2018-12-21 广州势必可赢网络科技有限公司 A kind of voice identification method, device, equipment and computer readable storage medium
CN110111798A (en) * 2019-04-29 2019-08-09 平安科技(深圳)有限公司 A kind of method and terminal identifying speaker
CN110689895A (en) * 2019-09-06 2020-01-14 北京捷通华声科技股份有限公司 Voice verification method and device, electronic equipment and readable storage medium
CN110875044A (en) * 2018-08-30 2020-03-10 中国科学院声学研究所 Speaker identification method based on word correlation score calculation
CN111243603A (en) * 2020-01-09 2020-06-05 厦门快商通科技股份有限公司 Voiceprint recognition method, system, mobile terminal and storage medium
CN111341320A (en) * 2020-02-28 2020-06-26 中国工商银行股份有限公司 Phrase voice voiceprint recognition method and device
CN111785284A (en) * 2020-08-19 2020-10-16 科大讯飞股份有限公司 Method, device and equipment for recognizing text-independent voiceprint based on phoneme assistance
CN114299921A (en) * 2021-12-07 2022-04-08 浙江大学 Voiceprint security scoring method and system for voice command
CN115831120A (en) * 2023-02-03 2023-03-21 北京探境科技有限公司 Corpus data acquisition method and device, electronic equipment and readable storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070033041A1 (en) * 2004-07-12 2007-02-08 Norton Jeffrey W Method of identifying a person based upon voice analysis
CN101467204A (en) * 2005-05-27 2009-06-24 普提克斯科技股份有限公司 Method and system for bio-metric voice print authentication
CN204465555U (en) * 2015-04-14 2015-07-08 时代亿宝(北京)科技有限公司 Based on the voiceprint authentication apparatus of time type dynamic password
CN104834849A (en) * 2015-04-14 2015-08-12 时代亿宝(北京)科技有限公司 Dual-factor identity authentication method and system based on voiceprint recognition and face recognition

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070033041A1 (en) * 2004-07-12 2007-02-08 Norton Jeffrey W Method of identifying a person based upon voice analysis
CN101467204A (en) * 2005-05-27 2009-06-24 普提克斯科技股份有限公司 Method and system for bio-metric voice print authentication
CN204465555U (en) * 2015-04-14 2015-07-08 时代亿宝(北京)科技有限公司 Based on the voiceprint authentication apparatus of time type dynamic password
CN104834849A (en) * 2015-04-14 2015-08-12 时代亿宝(北京)科技有限公司 Dual-factor identity authentication method and system based on voiceprint recognition and face recognition

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
张涛涛 (Zhang Taotao): "Research on Voice Voiceprint Password Verification Technology" (语音声纹密码验证技术研究), China Master's Theses Full-text Database, Information Science and Technology series *

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108198574A (en) * 2017-12-29 2018-06-22 科大讯飞股份有限公司 Change of voice detection method and device
CN108198574B (en) * 2017-12-29 2020-12-08 科大讯飞股份有限公司 Sound change detection method and device
CN108648760A (en) * 2018-04-17 2018-10-12 四川长虹电器股份有限公司 Real-time sound-groove identification System and method for
CN109065023A (en) * 2018-08-23 2018-12-21 广州势必可赢网络科技有限公司 A kind of voice identification method, device, equipment and computer readable storage medium
CN110875044A (en) * 2018-08-30 2020-03-10 中国科学院声学研究所 Speaker identification method based on word correlation score calculation
CN110875044B (en) * 2018-08-30 2022-05-03 中国科学院声学研究所 Speaker identification method based on word correlation score calculation
CN110111798A (en) * 2019-04-29 2019-08-09 平安科技(深圳)有限公司 A kind of method and terminal identifying speaker
CN110111798B (en) * 2019-04-29 2023-05-05 平安科技(深圳)有限公司 Method, terminal and computer readable storage medium for identifying speaker
CN110689895A (en) * 2019-09-06 2020-01-14 北京捷通华声科技股份有限公司 Voice verification method and device, electronic equipment and readable storage medium
CN111243603A (en) * 2020-01-09 2020-06-05 厦门快商通科技股份有限公司 Voiceprint recognition method, system, mobile terminal and storage medium
CN111341320B (en) * 2020-02-28 2023-04-14 中国工商银行股份有限公司 Phrase voice voiceprint recognition method and device
CN111341320A (en) * 2020-02-28 2020-06-26 中国工商银行股份有限公司 Phrase voice voiceprint recognition method and device
CN111785284A (en) * 2020-08-19 2020-10-16 科大讯飞股份有限公司 Method, device and equipment for recognizing text-independent voiceprint based on phoneme assistance
CN111785284B (en) * 2020-08-19 2024-04-30 科大讯飞股份有限公司 Text-independent voiceprint recognition method, device and equipment based on phoneme assistance
CN114299921B (en) * 2021-12-07 2022-11-18 浙江大学 Voiceprint security scoring method and system for voice command
CN114299921A (en) * 2021-12-07 2022-04-08 浙江大学 Voiceprint security scoring method and system for voice command
CN115831120A (en) * 2023-02-03 2023-03-21 北京探境科技有限公司 Corpus data acquisition method and device, electronic equipment and readable storage medium
CN115831120B (en) * 2023-02-03 2023-06-16 北京探境科技有限公司 Corpus data acquisition method and device, electronic equipment and readable storage medium

Also Published As

Publication number Publication date
CN106448685B (en) 2019-11-22

Similar Documents

Publication Publication Date Title
CN106448685A (en) System and method for identifying voice prints based on phoneme information
KR101995547B1 (en) Neural Networks for Speaker Verification
CN104575490B (en) Spoken language pronunciation evaluating method based on deep neural network posterior probability algorithm
US10013972B2 (en) System and method for identifying speakers
ES2605779T3 (en) Speaker Recognition
Gomez-Alanis et al. On joint optimization of automatic speaker verification and anti-spoofing in the embedding space
TWI527023B (en) A voiceprint recognition method and apparatus
Dileep et al. GMM-based intermediate matching kernel for classification of varying length patterns of long duration speech using support vector machines
CN107492382A (en) Voiceprint extracting method and device based on neutral net
CN107731233A (en) A kind of method for recognizing sound-groove based on RNN
CN104765996B (en) Voiceprint password authentication method and system
KR20060070603A (en) Two stage utterance verification method and device of speech recognition system
CN104240706B (en) It is a kind of that the method for distinguishing speek person that similarity corrects score is matched based on GMM Token
CN104462912B (en) Improved biometric password security
Lopez-Otero et al. Analysis of gender and identity issues in depression detection on de-identified speech
Wang et al. A network model of speaker identification with new feature extraction methods and asymmetric BLSTM
Wang et al. Automatic detection of speaker state: Lexical, prosodic, and phonetic approaches to level-of-interest and intoxication classification
CN105280181A (en) Training method for language recognition model and language recognition method
Safavi et al. Fraud detection in voice-based identity authentication applications and services
CN104464738B (en) A kind of method for recognizing sound-groove towards Intelligent mobile equipment
Folorunso et al. A review of voice-base person identification: state-of-the-art
Li et al. Cost-sensitive learning for emotion robust speaker recognition
Chen et al. Speech emotion classification using acoustic features
US6499012B1 (en) Method and apparatus for hierarchical training of speech models for use in speaker verification
Wang et al. Capture interspeaker information with a neural network for speaker identification

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CP01 Change in the name or title of a patent holder
CP01 Change in the name or title of a patent holder

Address after: East Zone 9A, 9th Floor, Building 1, No. 158 West Fourth Ring North Road, Haidian District, Beijing, 100142

Patentee after: Beijing Yuan Jian Polytron Technologies Inc.

Address before: East Zone 9A, 9th Floor, Building 1, No. 158 West Fourth Ring North Road, Haidian District, Beijing, 100142

Patentee before: Beijing Yuanjian Technologies Co.,Ltd.

TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20231010

Address after: No. 016, Xiaocuigezhuang Village, Gaolou Town, Sanhe City, Langfang City, Hebei Province, 065200

Patentee after: Liu Xuefeng

Address before: East Zone 9A, 9th Floor, Building 1, No. 158 West Fourth Ring North Road, Haidian District, Beijing, 100142

Patentee before: Beijing Yuan Jian Polytron Technologies Inc.

TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20240218

Address after: Room 320, 3rd Floor, Building A, No. 119 West Fourth Ring North Road, Haidian District, Beijing, 100000

Patentee after: Beijing Yuanjian Information Technology Co.,Ltd.

Country or region after: China

Address before: No. 016, Xiaocuigezhuang Village, Gaolou Town, Sanhe City, Langfang City, Hebei Province, 065200

Patentee before: Liu Xuefeng

Country or region before: China