CN108806695A - Self-updating anti-fraud method, apparatus, computer device and storage medium - Google Patents

Self-updating anti-fraud method, apparatus, computer device and storage medium

Info

Publication number
CN108806695A
Authority
CN
China
Prior art keywords
voiceprint
voice data
fraud
training
blacklist
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810345256.4A
Other languages
Chinese (zh)
Inventor
郑斯奇
王健宗
肖京
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN201810345256.4A
Priority to PCT/CN2018/095486 (published as WO2019200744A1)
Publication of CN108806695A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 - Speaker identification or verification techniques
    • G10L17/02 - Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
    • G10L17/04 - Training, enrolment or model building
    • G10L25/00 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
    • G10L25/03 - characterised by the type of extracted parameters
    • G10L25/18 - the extracted parameters being spectral information of each sub-band
    • G10L25/24 - the extracted parameters being the cepstrum
    • G10L25/45 - characterised by the type of analysis window
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 - Network architectures or network communication protocols for network security
    • H04L63/14 - for detecting or protecting against malicious traffic
    • H04L63/1441 - Countermeasures against malicious traffic
    • H04L63/1466 - Active attacks involving interception, injection, modification, spoofing of data unit addresses, e.g. hijacking, packet injection or TCP sequence number attacks


Abstract

Provided herein are a self-updating anti-fraud method, apparatus, computer device and storage medium. When new fraud voice data is added to the blacklist voiceprint library, the training parameters of the voiceprint training model are retrained on the fraud voice data in the blacklist voiceprint library to obtain an updated voiceprint training model. First voice data is received, and the updated voiceprint training model computes the similarity score between the first voice data and the fraud voice data in the blacklist voiceprint library. If the similarity score is higher than the set similarity threshold, the first voice data is judged to be fraud voice data and is added to the blacklist voiceprint library as new fraud voice data, whereupon the training parameters of the voiceprint training model are retrained on all fraud voice data to obtain an updated voiceprint training model. By continually updating the voiceprint training model, the scheme adapts better to voice anti-fraud, improves anti-fraud accuracy, overcomes the false-alarm defect, and improves detection efficiency.

Description

Self-updating anti-fraud method, apparatus, computer device and storage medium
Technical field
This application relates to the technical field of call-voice anti-fraud, and in particular to a self-updating anti-fraud method, apparatus, computer device and storage medium.
Background technology
At present, the business scope of many large financial companies covers insurance, banking, investment and other lines, and each line of business needs to communicate with the same customer and to perform anti-fraud identification. Verifying customer identity and identifying fraud have therefore become important parts of ensuring business security. To meet the real-time demands of the business, some financial companies have begun to verify customer identity and identify fraud by means of speech recognition. For anti-fraud identification, a company keeps historical fraud voice data and builds a blacklist speech library; voiceprint recognition technology is then used to extract voiceprint feature information and build a blacklist voiceprint library.
When a new call comes in, voiceprint feature information is automatically extracted from the speaker's voice and compared with the voiceprint feature information in the blacklist voiceprint library. If the speaker's voice is judged to match an entry in the blacklist voiceprint library, the system prompts that the speaker may be a fraudulent identity. The difficulty with this scheme is that it is a 1-to-N voiceprint matching process: when the number of entries in the blacklist voiceprint library is very large (for example, N greater than 500), accuracy is relatively low and detection efficiency is poor. For example, for any given speaker, as long as the voice of any one of the 500 blacklisted people is similar to the speaker, the speaker may be judged a fraudulent user and an alarm triggered, so false alarms occur.
Summary of the invention
The main purpose of this application is to provide a self-updating anti-fraud method, apparatus, computer device and storage medium, overcoming the low-accuracy defect of prior-art anti-fraud on voice data.
To achieve the above object, this application provides a self-updating anti-fraud method comprising the following steps:
when new fraud voice data is added to the blacklist voiceprint library, retraining the training parameters of the voiceprint training model based on the fraud voice data in the blacklist voiceprint library to obtain an updated voiceprint training model;
receiving first voice data, and computing, with the updated voiceprint training model, the similarity score between the first voice data and the fraud voice data in the blacklist voiceprint library;
if the similarity score is higher than the set similarity threshold, judging the first voice data to be fraud voice data.
Further, the training parameters are the between-class covariance matrix and the mean of the discriminant vectors in the PLDA matrix, and the voiceprint training model is a Gaussian mixture model; the step of retraining the training parameters of the voiceprint training model based on the fraud voice data in the blacklist voiceprint library comprises:
extracting discriminant vectors from the fraud voice data in the blacklist voiceprint library;
training the PLDA matrix with the discriminant vectors, and updating the between-class covariance matrix and the mean of the discriminant vectors in the PLDA matrix.
Further, the step of computing, with the updated voiceprint training model, the similarity score between the first voice data and the fraud voice data in the blacklist voiceprint library comprises:
extracting the discriminant vector of the first voice data with a discriminant-vector extractor;
inputting the discriminant vector of the first voice data into the updated voiceprint training model and computing the similarity score between the first voice data and the fraud voice data in the blacklist voiceprint library.
Further, the step of extracting the discriminant vector of the first voice data with a discriminant-vector extractor specifically comprises:
extracting the voiceprint features of the first voice data to form a voiceprint feature vector, and computing the voiceprint feature vector with the discriminant-vector extractor to extract the discriminant vector of the first voice data.
Further, the training parameters are the between-class covariance matrix and the mean of the discriminant vectors in the PLDA matrix, and the voiceprint training model is a Gaussian mixture model; before the step of retraining the training parameters of the voiceprint training model based on the fraud voice data in the blacklist voiceprint library when new fraud voice data is added to the blacklist voiceprint library, the method comprises:
extracting the voiceprint features of user voice data, and building the voiceprint feature vector corresponding to the voice data based on the voiceprint features;
inputting the voiceprint feature vector into the Gaussian mixture model and the discriminant-vector extractor for training;
extracting, with the trained discriminant-vector extractor, multiple discriminant vectors from multiple utterances of fraud voice data;
training the PLDA matrix with the multiple discriminant vectors, including the between-class covariance matrix and the mean of the discriminant vectors in the PLDA matrix.
Further, the step of inputting the voiceprint feature vector into the Gaussian mixture model and the discriminant-vector extractor for training comprises:
taking the voiceprint feature vector as the data input of the Gaussian mixture model and the discriminant-vector extractor, and training the Gaussian mixture model and the discriminant-vector extractor with the EM algorithm.
Further, the voiceprint feature is the mel-frequency cepstral coefficient; the step of extracting the voiceprint features of the user voice data and building the voiceprint feature vector corresponding to the voice data based on the voiceprint features comprises:
performing pre-emphasis, framing and windowing on the voice data in turn;
obtaining a spectrum for each window by Fourier transform;
filtering the spectrum with a mel filter bank to obtain a mel spectrum;
performing cepstral analysis on the mel spectrum to obtain mel-frequency cepstral coefficients;
building the voiceprint feature vector from the mel-frequency cepstral coefficients.
This application also provides a self-updating anti-fraud apparatus comprising:
an updating unit, configured to, when new fraud voice data is added to the blacklist voiceprint library, retrain the training parameters of the voiceprint training model based on all fraud voice data in the blacklist voiceprint library to obtain an updated voiceprint training model;
a scoring unit, configured to compute, with the updated voiceprint training model, the similarity score between first voice data and the fraud voice data in the blacklist voiceprint library;
a judging unit, configured to judge the first voice data to be fraud voice data when the similarity score is higher than the set similarity threshold.
This application also provides a computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of any of the above methods when executing the computer program.
This application also provides a computer storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of any of the above methods.
The self-updating anti-fraud method, apparatus, computer device and storage medium provided herein have the following beneficial effects:
when new fraud voice data is added to the blacklist voiceprint library, the training parameters of the voiceprint training model are retrained on all fraud voice data to obtain an updated voiceprint training model; by continually updating the voiceprint training model, the scheme adapts better to voice anti-fraud and improves anti-fraud accuracy. When new first voice data arrives, the updated voiceprint training model computes the similarity score between the first voice data and the fraud voice data in the blacklist voiceprint library; what is computed is a single similarity score against all fraud voice data, which improves accuracy, overcomes the false-alarm defect, and improves detection efficiency.
Description of the drawings
Fig. 1 is a schematic diagram of the steps of the self-updating anti-fraud method in one embodiment of the application;
Fig. 2 is a schematic diagram of the specific sub-steps of step S1 in one embodiment of the application;
Fig. 3 is a schematic diagram of the specific sub-steps of step S2 in one embodiment of the application;
Fig. 4 is a structural diagram of the self-updating anti-fraud apparatus in one embodiment of the application;
Fig. 5 is a structural diagram of the self-updating anti-fraud apparatus in another embodiment of the application;
Fig. 6 is a structural schematic block diagram of the computer device in one embodiment of the application.
The realization of the purpose, functional characteristics and advantages of the application will be further described with reference to the accompanying drawings in combination with the embodiments.
Detailed description of the embodiments
In order to make the purpose, technical solution and advantages of the application clearer, the application is further elaborated below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the application and are not intended to limit it.
Referring to Fig. 1, one embodiment of the application provides a self-updating anti-fraud method comprising the following steps:
Step S1: when new fraud voice data is added to the blacklist voiceprint library, retraining the training parameters of the voiceprint training model based on the fraud voice data in the blacklist voiceprint library to obtain an updated voiceprint training model;
Step S2: receiving first voice data, and computing, with the updated voiceprint training model, the similarity score between the first voice data and the fraud voice data in the blacklist voiceprint library;
Step S3: if the similarity score is higher than the set similarity threshold, judging the first voice data to be fraud voice data.
As described in step S1 above, in this embodiment multiple items of fraud voice data are stored in the blacklist voiceprint library; fraud voice data refers to voice data uttered by a fraudulent user, and the first voice data is the voice data of a new user. The blacklist voiceprint library stores the voiceprint data of a certain number of fraudulent users, and when a user is newly judged to be fraudulent, the fraud voice data of that new fraudulent user is usually added to the blacklist voiceprint library to enrich it. It should be understood that the fraud voice data of a newly judged fraudulent user may be fraud voice data detected by the model in this application, or may have been detected in advance by other means; that is, fraud voice data from any source can be added to the blacklist voiceprint library in this application.
If voiceprint recognition needs to be performed on a large amount of voice data, a voiceprint training model is needed to train on the voiceprint features of the voice data. At present, a voiceprint training model is typically trained once on the fraud voice data in a fixed blacklist voiceprint library, and the training parameters of that model are then reused indefinitely; as the number of new users grows, such a model clearly cannot predict fraudulent users accurately. For example, when new fraud voice data is added to the blacklist voiceprint library, continuing to identify fraudulent users with the previous training parameters may be inaccurate and may also produce false alarms. Therefore, in this embodiment, when new fraud voice data is added to the blacklist voiceprint library, the training parameters of the voiceprint training model are retrained with all fraud voice data in the blacklist voiceprint library to obtain an updated voiceprint training model. This is a loop-iteration process: whenever new fraud voice data is added, one self-update is performed. Then, as described in step S2, when new first voice data arrives, the updated voiceprint training model computes the similarity score between the first voice data and the fraud voice data in the blacklist voiceprint library. Finally, as described in step S3, if the similarity score is higher than the set similarity threshold, the first voice data is judged to be fraud voice data. Further, step S3 may be followed by step S4: adding the first voice data to the blacklist voiceprint library as new fraud voice data so as to update the voiceprint training model. In this way, while identifying fraud voice data with the voiceprint training model, the scheme also continually and cyclically updates the blacklist voiceprint library to optimize the voiceprint training model, making its identification of fraud voice data ever more accurate.
In one embodiment, as described in step S1, suppose the blacklist voiceprint library of a bank stores the fraud voice data of 500 users (including the voiceprint features in that fraud voice data); the voiceprint training model can be trained in advance on the voiceprint features of the 500 users in the blacklist voiceprint library to obtain the training parameters at that time. If the fraud voice data of one more user marked as fraudulent is added to the blacklist voiceprint library, the model is retrained on the voiceprint features of the 501 users and the training parameters are obtained again. This process iterates once whenever new fraud voice data is added. Afterwards, when new first voice data reaches the bank's terminal device, the similarity score of that first voice data is computed based on the updated voiceprint training model.
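To make the loop concrete, the following minimal Python sketch mirrors steps S1 to S4, with a cosine score against the mean blacklist discriminant vector standing in for the PLDA scoring described later; the class name, the stand-in scoring and the threshold value of 0.8 are illustrative assumptions, not part of the patent.

    import numpy as np

    class SelfUpdatingAntiFraud:
        """Sketch of the self-update loop; utterances are stand-in i-vectors."""

        def __init__(self, blacklist_ivectors, threshold=0.8):
            self.blacklist = [np.asarray(v, float) for v in blacklist_ivectors]
            self.threshold = threshold
            self._retrain()

        def _retrain(self):
            # Stand-in for retraining the PLDA parameters: here we only keep
            # the mean discriminant vector of the whole blacklist (the patent
            # also updates the between-class covariance matrix).
            self.mean_ivec = np.mean(self.blacklist, axis=0)

        def add_fraud_sample(self, ivec):
            """One self-update (step S1/S4): enrich the blacklist, retrain."""
            self.blacklist.append(np.asarray(ivec, float))
            self._retrain()

        def check(self, first_ivec):
            """Steps S2/S3: one score for a new call against the library."""
            v = np.asarray(first_ivec, float)
            score = v @ self.mean_ivec / (
                np.linalg.norm(v) * np.linalg.norm(self.mean_ivec))
            if score > self.threshold:      # judged fraud voice data
                self.add_fraud_sample(v)    # feeds the next self-update
                return True
            return False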
Referring to Fig. 2, in one embodiment the training parameters are the between-class covariance matrix and the mean of the discriminant vectors in the PLDA matrix, the voiceprint training model is a Gaussian mixture model, and step S1 of retraining the training parameters of the voiceprint training model based on all fraud voice data in the blacklist voiceprint library comprises:
Step S11: extracting discriminant vectors (i-vectors) from all fraud voice data in the blacklist voiceprint library. A discriminant vector is an acoustic feature of a speaker and reflects acoustic differences between speakers.
In this step, the discriminant vector of every item of fraud voice data in the blacklist voiceprint library needs to be extracted, and the extraction can be performed with a trained discriminant-vector extractor.
In one embodiment, the detailed process of step S11 is: extract the voiceprint features of all fraud voice data in the blacklist voiceprint library, build the corresponding voiceprint feature vectors from those features, and input the voiceprint feature vectors into the Gaussian mixture model and the discriminant-vector extractor for training, obtaining a trained discriminant-vector extractor. Specifically, the voiceprint feature vectors are input into the Gaussian mixture model for training, iterating until the maximum log-likelihood of the voiceprint feature vectors no longer changes, at which point training of the Gaussian mixture model is complete. The discriminant-vector extractor then computes the log-likelihood of the voiceprint feature vectors under the Gaussian mixture model, iterating until the log-likelihood no longer changes, which yields a trained discriminant-vector extractor. Finally, the trained discriminant-vector extractor extracts the discriminant vector of each item of fraud voice data.
Step S12: training the PLDA matrix with the discriminant vectors, and updating the between-class covariance matrix and the mean of the discriminant vectors in the PLDA matrix.
In this embodiment, the voiceprint training model used is a Gaussian mixture model, and the discriminant vectors are mainly used to train two parameters of the PLDA (Probabilistic Linear Discriminant Analysis) matrix: the between-class covariance matrix and the mean of the discriminant vectors. The PLDA matrix refers to the between-class covariance matrix trained from the i-vectors (discriminant vectors) of all speakers; it can represent the covariance between the multiple utterances of one speaker and the multiple utterances of other speakers. The PLDA covariance matrix helps to better extract the speaker's own voice information contained in the i-vector and to eliminate, as far as possible, the influence of channel differences.
The training parameters updated in this embodiment are the between-class covariance matrix and the mean of the discriminant vectors in the PLDA matrix; this completes the self-update. The similarity score between the first voice data and the fraud voice data in the blacklist voiceprint library can then be computed from the updated between-class covariance matrix and discriminant-vector mean: the score is a single similarity result computed in the PLDA matrix from the discriminant vector of the first voice data and the mean of the discriminant vectors.
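As a minimal sketch of the two parameters named here, the function below assumes each blacklisted speaker is represented by a matrix of i-vectors (one row per utterance) and estimates the between-class covariance from the scatter of per-speaker means around the global mean; this estimator is one common convention and an assumption, not quoted from the patent.

    import numpy as np

    def update_plda_parameters(speaker_ivectors):
        """speaker_ivectors: list of (n_utterances_i, dim) arrays, one per
        blacklisted speaker. Returns the two training parameters updated in
        step S12: the global mean of the discriminant vectors and the
        between-class covariance matrix."""
        all_ivecs = np.vstack(speaker_ivectors)
        global_mean = all_ivecs.mean(axis=0)

        # Between-class covariance: scatter of the per-speaker mean
        # i-vectors around the global mean (assumed estimator).
        speaker_means = np.stack([s.mean(axis=0) for s in speaker_ivectors])
        centered = speaker_means - global_mean
        between_cov = centered.T @ centered / len(speaker_ivectors)
        return global_mean, between_cov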
Referring to Fig. 3, in one embodiment, step S2 of computing, with the updated voiceprint training model, the similarity score between the first voice data and the fraud voice data in the blacklist voiceprint library comprises:
Step S21: extracting the discriminant vector of the first voice data with the discriminant-vector extractor;
Step S22: inputting the discriminant vector of the first voice data into the updated voiceprint training model and computing the similarity score between the first voice data and the fraud voice data in the blacklist voiceprint library.
In this embodiment, step S21 is specifically: extract the voiceprint features of the first voice data to form a voiceprint feature vector, and compute the voiceprint feature vector with the discriminant-vector extractor to extract the discriminant vector of the first voice data.
The discriminant-vector extractor may be trained in advance, or it may be trained on the voiceprint feature vectors of the fraud voice data in the blacklist voiceprint library. In one embodiment, the training steps of the discriminant-vector extractor comprise: extracting the voiceprint features of the fraud voice data, building the corresponding voiceprint feature vectors from those features, and inputting the voiceprint feature vectors into the Gaussian mixture model and the discriminant-vector extractor for training, which yields a trained discriminant-vector extractor. The trained discriminant-vector extractor can then compute the voiceprint feature vector of the first voice data and thereby extract its discriminant vector.
In step S22, the similarity score between the first voice data and the fraud voice data in the blacklist voiceprint library is computed from the updated between-class covariance matrix and discriminant-vector mean in the updated voiceprint training model: the score is a single similarity result computed in the PLDA matrix from the discriminant vector of the first voice data and the updated mean of the discriminant vectors. When the similarity score exceeds the set threshold, the first voice data corresponds to fraud voice data in the blacklist voiceprint library, and the utterer of the first voice data is judged to be a fraudulent user.
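A sketch of the step S22 decision follows, under the simplifying assumption that the similarity score is the Gaussian log-density of the new discriminant vector under the updated mean and between-class covariance; the patent specifies only that a single score is computed in the PLDA matrix from the i-vector and the updated mean, so the exact scoring function here is an assumption.

    import numpy as np
    from scipy.stats import multivariate_normal

    def similarity_score(first_ivec, global_mean, between_cov):
        """Simplified stand-in for the PLDA similarity score: log-likelihood
        of the new discriminant vector under the updated parameters."""
        return multivariate_normal.logpdf(first_ivec, mean=global_mean,
                                          cov=between_cov, allow_singular=True)

    def is_fraud(first_ivec, global_mean, between_cov, threshold):
        """Step S3: above-threshold scores are judged fraud voice data."""
        return similarity_score(first_ivec, global_mean, between_cov) > threshold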
In one embodiment, before step S1 of retraining the training parameters of the voiceprint training model based on all fraud voice data in the blacklist voiceprint library when new fraud voice data is added, the method comprises:
Step S101: extracting the voiceprint features of user voice data and building the corresponding voiceprint feature vector from those features. The user voice data is the data used when training the model and can in fact be any fraud voice data in the blacklist voiceprint library.
Step S102: inputting the voiceprint feature vector into the Gaussian mixture model and the discriminant-vector extractor for training.
Step S103: extracting, with the trained discriminant-vector extractor, multiple discriminant vectors from multiple utterances of training voice. The training voice can be multiple items of fraud voice data from the same person.
Step S104: training the PLDA matrix with the multiple discriminant vectors, including the between-class covariance matrix and the mean of the discriminant vectors in the PLDA matrix.
Specifically, step S102 of inputting the voiceprint feature vector into the Gaussian mixture model and the discriminant-vector extractor for training comprises: taking the voiceprint feature vector as the data input of the Gaussian mixture model and the discriminant-vector extractor, and training them with the EM algorithm (Expectation-Maximization algorithm), an iterative algorithm used for maximum-likelihood estimation of the Gaussian mixture model.
This embodiment can use a voiceprint training model trained in advance and update its training parameters when the blacklist voiceprint library is updated; of course, the voiceprint training model can also first be trained on user voice data to obtain a dedicated training model. Therefore, in this embodiment, the voiceprint training model is trained before step S1, i.e. steps S101 to S104. It is worth noting that the specific implementations of steps S101, S102 and S103 in this embodiment can refer to steps S11 and S21 in the above embodiments; the processes are roughly the same, the difference being that they are directed at different voice data (for example, steps S11 and S21 are directed at fraud voice data and the first voice data respectively), and step S104 is implemented in the same way as step S12, which is not repeated here.
In one embodiment, the voiceprint feature is the mel-frequency cepstral coefficient, and step S101 of extracting the voiceprint features of the user voice data and building the corresponding voiceprint feature vector specifically comprises:
a. performing pre-emphasis, framing and windowing on the voice data in turn;
b. obtaining a spectrum for each window by Fourier transform;
c. filtering the spectrum with a mel filter bank to obtain a mel spectrum;
d. performing cepstral analysis on the mel spectrum to obtain mel-frequency cepstral coefficients;
e. building the voiceprint feature vector from the mel-frequency cepstral coefficients.
It can be understood that the specific implementation of steps a to e for extracting the voiceprint feature vector of voice data in this embodiment can equally be used in steps S11 and S21 above.
In this embodiment, steps a and b constitute the preprocessing of the voice data. The cepstral analysis includes taking the logarithm and applying an inverse transform; the inverse transform is generally realized by the DCT (discrete cosine transform), and the 2nd to 13th coefficients after the DCT are kept. Cepstral analysis of the mel spectrum yields the mel-frequency cepstral coefficients (MFCCs), which are the voiceprint features of one frame of speech; finally, the MFCC features of every frame are assembled into the voiceprint feature vector.
Specifically, in this embodiment, the pre-emphasis in step a is in fact a high-pass filter whose role is to filter out low frequencies and make the high-frequency characteristics of the voice data more prominent. The transfer function of the high-pass filter is H(Z) = 1 - αZ^(-1), where Z is the z-domain variable of the audio signal and α is a constant coefficient; in one embodiment of the application the value of α is 0.97.
The purpose of the framing in step a is as follows: since voice data is stationary only over a short period, a segment of voice data is divided into N short-time signal segments; and to avoid losing the continuity characteristics of speech, adjacent frames share an overlap region, generally 1/2 of the frame length.
After framing, each frame is treated as a stationary signal. Each frame must later be expanded by Fourier transform to obtain mel spectral features, and here the following effect appears: when a periodic function with discontinuities (such as a rectangular pulse) is expanded as a Fourier series and a finite number of terms is synthesized, peaks appear in the synthesized waveform near the discontinuities of the original signal; the more terms are chosen, the closer the peaks move to the discontinuities, and when the number of terms is very large the peak value tends to a constant, approximately equal to 9% of the total jump. This phenomenon is known as the Gibbs effect. Because the start and end of a frame are in general discontinuous, the framed signal deviates more and more from the original signal. Therefore, after framing, the voice data must be windowed, precisely to reduce the discontinuity of the signal at the start and end of each frame; and since the speech signal is generally stationary only over a short time, only the data inside the window is processed at a time.
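The following numpy-only sketch walks through steps a to e: pre-emphasis with α = 0.97, framing with a half-frame overlap, Hamming windowing, Fourier transform, a triangular mel filter bank, and cepstral analysis keeping the 2nd to 13th DCT coefficients. The sample rate, frame length and filter count are illustrative assumptions.

    import numpy as np
    from scipy.fftpack import dct

    def mfcc(signal, sr=16000, frame_len=400, n_filters=26, alpha=0.97):
        # a. pre-emphasis: H(Z) = 1 - alpha * Z^(-1)
        emphasized = np.append(signal[0], signal[1:] - alpha * signal[:-1])
        # a. framing with 1/2 frame-length overlap, then Hamming windowing
        hop = frame_len // 2
        n_frames = 1 + (len(emphasized) - frame_len) // hop
        frames = np.stack([emphasized[i * hop: i * hop + frame_len]
                           for i in range(n_frames)])
        frames *= np.hamming(frame_len)
        # b. power spectrum of each window via Fourier transform
        power = np.abs(np.fft.rfft(frames, frame_len)) ** 2 / frame_len
        # c. triangular mel filter bank -> mel spectrum
        mel_pts = np.linspace(0, 2595 * np.log10(1 + (sr / 2) / 700),
                              n_filters + 2)
        hz_pts = 700 * (10 ** (mel_pts / 2595) - 1)
        bins = np.floor((frame_len + 1) * hz_pts / sr).astype(int)
        fbank = np.zeros((n_filters, frame_len // 2 + 1))
        for m in range(1, n_filters + 1):
            l, c, r = bins[m - 1], bins[m], bins[m + 1]
            fbank[m - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
            fbank[m - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
        mel_spec = power @ fbank.T
        # d. cepstral analysis: log then DCT, keeping coefficients 2 to 13
        mfccs = dct(np.log(mel_spec + 1e-10), type=2, axis=1,
                    norm='ortho')[:, 1:13]
        # e. per-frame MFCCs assembled into the voiceprint feature vectors
        return mfccs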
After the voiceprint feature vector of the voice data has been obtained in step e, the discriminant vector can be computed with the discriminant-vector extractor. Specifically, one embodiment provides the following concrete computation process for the discriminant vector.
1. Selecting Gaussian components:
First, the parameters in the discriminant-vector extractor model are used to compute the log-likelihood of each frame of data under the different Gaussian components; the columns of the log-likelihood matrix are sorted in parallel and the top N Gaussian components are chosen, finally yielding, for each frame of data, a matrix of the N components of the Gaussian mixture with the highest values. The log-likelihood matrix is computed as follows:
loglike = E(X) * D(X)^(-1) * X^T - 0.5 * D(X)^(-1) * (X^.2)^T
where loglike is the log-likelihood matrix, i.e. the log-likelihood of each frame computed under the Gaussian mixture model;
E(X) is the mean matrix trained by the universal background model;
D(X) is the covariance matrix;
X is the data matrix;
X^.2 is the matrix with each value squared.
2. Computing posterior probabilities:
For each frame of data X, X * X^T is computed to obtain a symmetric matrix, which is reduced to its lower triangle; the elements are arranged in order into a single row, so that the N frames become a matrix in which each row is a vector of the lower-triangle dimension, and all frame vectors are combined into a new data matrix. At the same time, each covariance matrix used for probability computation in the discriminant-vector extractor model is likewise reduced to its lower triangle, forming a matrix of the same kind as the new data matrix. With the mean matrices and covariance matrices in the discriminant-vector extractor model, the log-likelihood of each frame of data under the selected Gaussian components is computed, softmax regression is then applied, and finally normalization yields the posterior distribution of each frame over the Gaussian mixture; the per-frame probability distribution vectors form the probability matrix.
The per-component log-likelihood is computed as:
loglikes_i = C_i + E_i * Cov_i^(-1) * X_i^T - 0.5 * Cov_i^(-1) * (X_i^.2)^T
where loglikes_i is the i-th row vector of the log-likelihood matrix;
C_i is the constant term of the i-th component;
E_i is the mean matrix of the i-th component;
Cov_i is the covariance matrix of the i-th component;
X_i is the i-th frame of voice data.
Softmax regression (the information-entropy method) is applied to each row vector of loglikes to obtain the posterior probability of each frame under the Gaussian mixture model:
X_i = exp(X_i - max(X)) / Σ exp(X_i - max(X))
where X_i is the i-th value of a row of the log-likelihood matrix;
max(X) is the maximum value of that row vector.
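A short numpy sketch of this step for a diagonal-covariance mixture is given below; it computes the per-frame log-likelihood matrix in the form shown above and then applies the row-wise softmax. The shapes and the diagonal-covariance assumption are illustrative.

    import numpy as np

    def frame_posteriors(X, means, variances, log_consts):
        """X: (n_frames, dim) features; means, variances: (n_comp, dim)
        diagonal GMM parameters; log_consts: (n_comp,) constant terms C_i.
        Returns the per-frame posterior matrix of shape (n_frames, n_comp)."""
        inv_var = 1.0 / variances
        # loglikes_i = C_i + E_i * Cov_i^(-1) * X^T - 0.5 * Cov_i^(-1) * (X^.2)^T
        loglikes = (log_consts
                    + X @ (means * inv_var).T
                    - 0.5 * (X ** 2) @ inv_var.T)
        # row-wise softmax: X_i = exp(X_i - max(X)) / sum exp(X_i - max(X))
        shifted = loglikes - loglikes.max(axis=1, keepdims=True)
        probs = np.exp(shifted)
        return probs / probs.sum(axis=1, keepdims=True)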
3. Computing the first-order and second-order coefficients:
The first-order coefficients are obtained by summing the columns of the probability matrix above, and the second-order coefficients can be obtained by multiplying the transpose of the probability matrix by the data matrix.
The first-order coefficient is computed as follows:
Gamma_i = Σ_j loglikes_ji
where Gamma_i is the i-th element of the first-order coefficient vector;
loglikes_ji is the i-th element of the j-th row of the probability matrix computed above.
The second-order coefficient is computed as follows:
X = loglikes^T * feats
where X is the second-order coefficient matrix;
loglikes^T is the transposed probability matrix;
feats is the voiceprint feature vector.
4. Computing the discriminant vector:
After the first-order and second-order coefficients have been computed, the linear term and the quadratic term are computed in parallel, and the discriminant vector is computed from them:
ivector = quadratic^(-1) * linear
where ivector is the discriminant vector.
The linear term is computed as (summing over the Gaussian components i):
linear = Σ_i ( M_i^T * Σ_i^(-1) * X_i^T )
where M_i is the mean matrix of the i-th model in the universal model;
Σ_i is the covariance matrix of the i-th model;
X_i is the i-th row vector of the second-order coefficient matrix.
The quadratic term is computed as:
quadratic = I + Σ_i ( m_i * M_i^T * Σ_i^(-1) * M_i )
where m is the first-order coefficient vector;
M_i is the mean matrix of the i-th model in the universal model;
Σ_i is the covariance matrix of the i-th model.
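The sketch below assembles the linear and quadratic terms in the accumulated form given above and solves quadratic * ivector = linear; the identity prior term and the per-component accumulation are assumptions consistent with the symbol definitions above.

    import numpy as np

    def compute_ivector(gamma, second_order, M, inv_covs, dim):
        """gamma: (n_comp,) first-order coefficients; second_order:
        (n_comp, feat_dim) second-order coefficient matrix; M:
        (n_comp, feat_dim, dim) per-component mean/projection matrices;
        inv_covs: (n_comp, feat_dim) inverse diagonal covariances."""
        linear = np.zeros(dim)
        quadratic = np.eye(dim)                   # identity prior (assumed)
        for i in range(len(gamma)):
            Mi_w = M[i] * inv_covs[i][:, None]    # Sigma_i^(-1) applied to M_i
            linear += Mi_w.T @ second_order[i]    # M_i^T Sigma_i^(-1) X_i^T
            quadratic += gamma[i] * (M[i].T @ Mi_w)
        # ivector = quadratic^(-1) * linear, solved without explicit inversion
        return np.linalg.solve(quadratic, linear)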
In one specific embodiment, in the Gaussian mixture model trained with the EM algorithm, the likelihood of an extracted voiceprint feature can be represented with K Gaussian components:
p(x) = Σ_{k=1}^{K} w_k * p(x|k)
where p(x) is the probability that the sample is generated by the Gaussian mixture model;
w_k is the weight of the k-th Gaussian component;
p(x|k) is the probability that the sample is generated by the k-th Gaussian component;
K is the number of Gaussian components.
The weights of the Gaussian components satisfy Σ_{i=1}^{K} w_i = 1, and the parameters of the whole Gaussian mixture model can be expressed as {w_i, μ_i, Σ_i}, where
w_i is the weight of the i-th Gaussian component;
μ_i is the mean of the i-th Gaussian component;
Σ_i is the covariance of the i-th Gaussian component.
This model can be trained with the unsupervised EM algorithm; the objective function uses maximum-likelihood estimation, i.e. the parameters are chosen to maximize the log-likelihood of the samples:
log P(X) = Σ_{i=1}^{N} log p(x_i)
where log P(X) is the log-likelihood of the samples under the Gaussian mixture model;
p(x_i) is the mixture likelihood of the i-th sample;
N is the number of training samples.
The training parameters updated at each iteration step are as follows:
w'_i = (1/N) Σ_{j=1}^{N} p(i|x_j, θ)
μ'_i = Σ_j p(i|x_j, θ) * x_j / Σ_j p(i|x_j, θ)
Σ'_i = Σ_j p(i|x_j, θ) * (x_j - μ'_i)(x_j - μ'_i)^T / Σ_j p(i|x_j, θ)
where w'_i is the updated weight of the i-th Gaussian component;
μ'_i is the updated mean of the i-th Gaussian component;
Σ'_i is the updated covariance of the i-th Gaussian component.
Here p(i|x_j, θ) is the posterior probability of the i-th mixture component:
p(i|x_j, θ) = w_i * p_i(x_j|θ_i) / Σ_{k=1}^{K} w_k * p_k(x_j|θ_k)
where w_i is the weight of the i-th mixture component;
p_i(x_j|θ_i) is the probability of the i-th mixture component;
K is the number of mixture components.
Iteration continues until the maximum log-likelihood of the samples no longer changes.
After training is complete, the weight vector of the Gaussian mixture model, the constant vector, the N covariance matrices, the matrix of means multiplied by covariances, and so on are obtained; together these constitute a trained universal background model.
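As a minimal sketch of this EM training, scikit-learn's GaussianMixture implements the same weight, mean and covariance updates and iterates until the log-likelihood converges; the random feature matrix and the component count of 64 below are illustrative assumptions.

    import numpy as np
    from sklearn.mixture import GaussianMixture

    # feats: voiceprint feature vectors (e.g. MFCC frames), random stand-ins here
    feats = np.random.randn(5000, 12)

    # Diagonal-covariance GMM trained with EM; tol stops iteration once the
    # log-likelihood no longer changes appreciably (the UBM of the text).
    ubm = GaussianMixture(n_components=64, covariance_type='diag',
                          max_iter=200, tol=1e-4, random_state=0).fit(feats)

    print(ubm.weights_.shape, ubm.means_.shape, ubm.covariances_.shape)
    # -> (64,) (64, 12) (64, 12): the weights, means and covariances of the UBM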
In conclusion for the anti-fraud method of the self refresh provided in the embodiment of the present application, blacklist vocal print is added in library When new fraud voice data, for the training parameters of all fraud voice data re -training vocal print training patterns, obtain more Vocal print training pattern after new;Vocal print training pattern is constantly updated, voice is counter to cheat preferably to adapt to, and it is accurate to promote anti-fraud Property;When with stylish first voice data access, by the updated vocal print training pattern calculate the first voice data with The similarity score of fraud voice data in blacklist vocal print library, that calculate is one between all fraud voice data A similarity score promotes accuracy rate, overcomes the defect of wrong report, while promoting detection efficiency.
Referring to Fig. 4, an embodiment of the application also provides a self-updating anti-fraud apparatus comprising:
an updating unit 10, configured to, when new fraud voice data is added to the blacklist voiceprint library, retrain the training parameters of the voiceprint training model based on the fraud voice data in the blacklist voiceprint library to obtain an updated voiceprint training model;
a scoring unit 20, configured to receive first voice data and compute, with the updated voiceprint training model, the similarity score between the first voice data and the fraud voice data in the blacklist voiceprint library;
a judging unit 30, configured to judge the first voice data to be fraud voice data when the similarity score is higher than the set similarity threshold.
In this embodiment, multiple items of fraud voice data are stored in the blacklist voiceprint library; fraud voice data refers to voice data uttered by a fraudulent user, and the first voice data is the voice data of a new user. The blacklist voiceprint library stores the voiceprint data of a certain number of fraudulent users, and when a user is newly judged to be fraudulent, the fraud voice data of that new fraudulent user is usually added to the blacklist voiceprint library to enrich it. It should be understood that this fraud voice data may have been detected by the model in this application or detected in advance by other means; that is, fraud voice data from any source can be added to the blacklist voiceprint library in this application.
If voiceprint recognition needs to be performed on a large amount of voice data, a voiceprint training model is needed to train on the voiceprint features of the voice data. At present, a voiceprint training model is typically trained once on the fraud voice data in a fixed blacklist voiceprint library, and the training parameters of that model are then reused indefinitely; as the number of new users grows, such a model clearly cannot predict fraudulent users accurately. For example, when new fraud voice data is added to the blacklist voiceprint library, continuing to identify fraudulent users with the previous training parameters may be inaccurate and may also produce false alarms. Therefore, in this embodiment, when new fraud voice data is added to the blacklist voiceprint library, the updating unit 10 retrains the training parameters of the voiceprint training model with all fraud voice data in the blacklist voiceprint library to obtain an updated voiceprint training model; this is a loop-iteration process, and one self-update is performed whenever new fraud voice data is added. Then, when new first voice data arrives, the scoring unit 20 computes, with the updated voiceprint training model, the similarity score between the first voice data and the fraud voice data in the blacklist voiceprint library. Finally, if the judging unit 30 determines that the similarity score is higher than the set similarity threshold, the first voice data is judged to be fraud voice data. Further, the self-updating anti-fraud apparatus may also comprise a processing unit, configured to add the first voice data to the blacklist voiceprint library as new fraud voice data so as to update the voiceprint training model. In this way, while identifying fraud voice data with the voiceprint training model, the scheme also continually and cyclically updates the blacklist voiceprint library to optimize the voiceprint training model, making its identification of fraud voice data ever more accurate.
In one embodiment, the blacklist voiceprint library of a bank stores the fraud voice data of 500 users (including the voiceprint features in that fraud voice data); the voiceprint training model can be trained in advance on the voiceprint features of the 500 users in the blacklist voiceprint library to obtain the training parameters at that time. If the fraud voice data of one more user marked as fraudulent is added to the blacklist voiceprint library, the model is retrained on the voiceprint features of the 501 users and the training parameters are obtained again. This process iterates once whenever new fraud voice data is added. Afterwards, when new first voice data reaches the bank's terminal device, the similarity score of that first voice data is computed based on the updated voiceprint training model.
In one embodiment, the training parameters are the between-class covariance matrix and the mean of the discriminant vectors in the PLDA matrix, the voiceprint training model is a Gaussian mixture model, and the updating unit 10 comprises:
an extraction subunit, configured to extract discriminant vectors from all fraud voice data in the blacklist voiceprint library. A discriminant vector is an acoustic feature of a speaker and reflects acoustic differences between speakers. The discriminant vector of every item of fraud voice data in the blacklist voiceprint library needs to be extracted, and the extraction can be performed with a trained discriminant-vector extractor.
In one embodiment, the specific extraction process of the extraction subunit is: extract the voiceprint features of all fraud voice data in the blacklist voiceprint library, build the corresponding voiceprint feature vectors from those features, and input the voiceprint feature vectors into the Gaussian mixture model and the discriminant-vector extractor for training, obtaining a trained discriminant-vector extractor. Specifically, the voiceprint feature vectors are input into the Gaussian mixture model for training, iterating until the maximum log-likelihood of the voiceprint feature vectors no longer changes, at which point training of the Gaussian mixture model is complete. The discriminant-vector extractor then computes the log-likelihood of the voiceprint feature vectors under the Gaussian mixture model, iterating until the log-likelihood no longer changes, which yields a trained discriminant-vector extractor. Finally, the trained discriminant-vector extractor extracts the discriminant vector of each item of fraud voice data;
an updating subunit, configured to train the PLDA matrix with the discriminant vectors and update the between-class covariance matrix and the mean of the discriminant vectors in the PLDA matrix.
In this embodiment, the voiceprint training model used is a Gaussian mixture model, and the discriminant vectors are mainly used to train two parameters of the PLDA matrix: the between-class covariance matrix and the mean of the discriminant vectors. The PLDA matrix refers to the between-class covariance matrix trained from the i-vectors (discriminant vectors) of all speakers; it can represent the covariance between the multiple utterances of one speaker and the multiple utterances of other speakers. The PLDA covariance matrix helps to better extract the speaker's own voice information contained in the i-vector and to eliminate, as far as possible, the influence of channel differences.
The training parameters updated in this embodiment are the between-class covariance matrix and the mean of the discriminant vectors in the PLDA matrix; this completes the self-update. The similarity score between the first voice data and the fraud voice data in the blacklist voiceprint library can then be computed from the updated between-class covariance matrix and discriminant-vector mean: the score is a single similarity result computed in the PLDA matrix from the discriminant vector of the first voice data and the mean of the discriminant vectors.
In one embodiment, the scoring unit 20 comprises:
an extraction module, configured to extract the discriminant vector of the first voice data with the discriminant-vector extractor;
a computing module, configured to input the discriminant vector of the first voice data into the updated voiceprint training model and compute the similarity score between the first voice data and the fraud voice data in the blacklist voiceprint library.
In this embodiment, the extraction module is specifically configured to: extract the voiceprint features of the first voice data to form a voiceprint feature vector, and compute the voiceprint feature vector with the discriminant-vector extractor to extract the discriminant vector of the first voice data.
The discriminant-vector extractor may be trained in advance, or it may be trained on the voiceprint feature vectors of the fraud voice data in the blacklist voiceprint library. In one embodiment, the training process of the discriminant-vector extractor comprises: extracting the voiceprint features of the fraud voice data, building the corresponding voiceprint feature vectors from those features, and inputting the voiceprint feature vectors into the Gaussian mixture model and the discriminant-vector extractor for training, which yields a trained discriminant-vector extractor. The trained discriminant-vector extractor can then compute the voiceprint feature vector of the first voice data and thereby extract its discriminant vector.
The computing module is specifically configured to: compute the similarity score between the first voice data and the fraud voice data in the blacklist voiceprint library from the updated between-class covariance matrix and discriminant-vector mean in the updated voiceprint training model; the score is a single similarity result computed in the PLDA matrix from the discriminant vector of the first voice data and the updated mean of the discriminant vectors. When the similarity score exceeds the set threshold, the first voice data corresponds to fraud voice data in the blacklist voiceprint library, and the utterer of the first voice data is judged to be a fraudulent user.
Referring to Fig. 5, in one embodiment, the self-updating anti-fraud apparatus further comprises:
a first extraction unit 101, configured to extract the voiceprint features of user voice data and build the corresponding voiceprint feature vector from those features; the user voice data is the data used when training the model and can in fact be any fraud voice data in the blacklist voiceprint library;
a first training unit 102, configured to input the voiceprint feature vector into the Gaussian mixture model and the discriminant-vector extractor for training; specifically, the voiceprint feature vector is taken as the data input of the Gaussian mixture model and the discriminant-vector extractor, which are trained with the EM algorithm;
a second extraction unit 103, configured to extract, with the trained discriminant-vector extractor, multiple discriminant vectors from multiple utterances of training voice; the training voice can be multiple items of fraud voice data from the same person;
a second training unit 104, configured to train the PLDA matrix with the multiple discriminant vectors, including the between-class covariance matrix and the mean of the discriminant vectors in the PLDA matrix.
This embodiment can use a voiceprint training model trained in advance and update its training parameters when the blacklist voiceprint library is updated; of course, the voiceprint training model can also first be trained on user voice data to obtain a dedicated training model. It is worth noting that the specific implementations of the first extraction unit 101, the first training unit 102 and the second extraction unit 103 in this embodiment can refer to the extraction subunit and the extraction module in the above embodiments; the processes are roughly the same, the difference being that they are directed at different voice data (for example, the extraction subunit and the extraction module are directed at fraud voice data and the first voice data respectively), and the second training unit 104 is implemented in the same way as the updating subunit, which is not repeated here.
In one embodiment, the voiceprint feature is the mel-frequency cepstral coefficient, and the first extraction unit 101 specifically comprises:
a preprocessing module, configured to perform pre-emphasis, framing and windowing on the voice data in turn;
a transform module, configured to obtain a spectrum for each window by Fourier transform;
a filtering module, configured to filter the spectrum with a mel filter bank to obtain a mel spectrum;
an analysis module, configured to perform cepstral analysis on the mel spectrum to obtain mel-frequency cepstral coefficients;
a building module, configured to build the voiceprint feature vector from the mel-frequency cepstral coefficients.
It can be understood that the specific implementations of the preprocessing module, transform module, filtering module, analysis module and building module in this embodiment can equally be used in the extraction subunit and the extraction module above.
In this embodiment, the preprocessing module and the transform module perform the preprocessing of the voice data. The cepstral analysis includes taking the logarithm and applying an inverse transform; the inverse transform is generally realized by the DCT (discrete cosine transform), and the 2nd to 13th coefficients after the DCT are kept. Cepstral analysis of the mel spectrum yields the mel-frequency cepstral coefficients (MFCCs), which are the voiceprint features of one frame of speech; finally, the MFCC features of every frame are assembled into the voiceprint feature vector.
Specifically, in this embodiment, the pre-emphasis in the preprocessing module is in fact a high-pass filter whose role is to filter out low frequencies and make the high-frequency characteristics of the voice data more prominent. The transfer function of the high-pass filter is H(Z) = 1 - αZ^(-1), where Z is the z-domain variable of the audio signal and α is a constant coefficient; in one embodiment of the application the value of α is 0.97.
The purpose of the framing in the preprocessing module is as follows: since voice data is stationary only over a short period, a segment of voice data is divided into N short-time signal segments; and to avoid losing the continuity characteristics of speech, adjacent frames share an overlap region, generally 1/2 of the frame length.
After framing, each frame is treated as a stationary signal. Each frame must later be expanded by Fourier transform to obtain mel spectral features, and here the following effect appears: when a periodic function with discontinuities (such as a rectangular pulse) is expanded as a Fourier series and a finite number of terms is synthesized, peaks appear in the synthesized waveform near the discontinuities of the original signal; the more terms are chosen, the closer the peaks move to the discontinuities, and when the number of terms is very large the peak value tends to a constant, approximately equal to 9% of the total jump. This phenomenon is known as the Gibbs effect. Because the start and end of a frame are in general discontinuous, the framed signal deviates more and more from the original signal. Therefore, after framing, the voice data must be windowed, precisely to reduce the discontinuity of the signal at the start and end of each frame; and since the speech signal is generally stationary only over a short time, only the data inside the window is processed at a time.
After the construction module above obtains the voiceprint feature vector of the voice data, the discriminant vector can be computed by the discriminant vector extractor. Specifically, in one embodiment, a concrete procedure for computing the discriminant vector is given below.
1. Selecting Gaussian components:
First, using the parameters in the discriminant vector extractor model, the log-likelihood of each frame of data under each Gaussian component is computed; each column of the log-likelihood matrix is sorted in parallel and the top-N Gaussian components are chosen, finally giving a matrix of the N highest-scoring components of the Gaussian mixture for every frame. The log-likelihood matrix is computed as follows:
loglike = E(X) * D(X)^(-1) * X^T - 0.5 * D(X)^(-1) * (X^.2)^T
Parameter:Loglike is likelihood logarithm value matrix, i.e., the likelihood logarithm calculated under mixed Gauss model per frame Value;
E (X) trains the Mean Matrix come for universal background model;
D (X) is covariance matrix;
X is data matrix;
X.2For matrix, each value is squared.
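Under the diagonal-covariance reading implied by the formula above, a sketch of the per-frame log-likelihood matrix and the top-N component selection (all names and shapes are illustrative assumptions):

    import numpy as np

    def select_top_gaussians(X, means, variances, n_top=20):
        """X: (frames, dims); means, variances: (components, dims).
        Computes the data-dependent part of the per-frame log-likelihood,
        loglike = E/D * X^T - 0.5 * (1/D) * (X^2)^T, then keeps the
        indices of the N best-scoring components for each frame."""
        loglike = (means / variances) @ X.T \
                  - 0.5 * (1.0 / variances) @ (X ** 2).T
        # loglike has shape (components, frames); sort each column
        top = np.argsort(-loglike, axis=0)[:n_top]    # (n_top, frames)
        return top.T                                  # (frames, n_top)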
2. Computing the posterior probabilities:
For each frame of data X, X*X^T is computed, giving a symmetric matrix, which is reduced to its lower triangle; the elements are arranged in order so that the N frames become a matrix with one row per frame, each row a vector whose dimension equals the number of lower-triangular elements, and all frame vectors are combined into a new data matrix. At the same time, each covariance matrix used for the probability computation in the discriminant vector extractor model is likewise reduced to its lower triangle, forming a matrix of the same layout as the new data matrix. Using the mean matrix and covariance matrices in the discriminant vector extractor model, the log-likelihood of every frame under its selected Gaussian components is computed; a Softmax regression is then applied and the result is normalized, yielding the posterior distribution of every frame over the Gaussian mixture. The per-frame probability distribution vectors are assembled into the probability matrix:
The log-likelihood formula for a frame under a selected component involves the following quantities:
loglikes_i is the i-th row vector of the log-likelihood matrix;
C_i is the constant term of the i-th component;
E_i is the mean matrix of the i-th component;
Cov_i is the covariance matrix of the i-th component;
X_i is the i-th frame of voice data.
A Softmax regression (the information-entropy method) is applied to every row vector of loglikes, giving the posterior probability of every frame under the Gaussian mixture:

x_i = exp(x_i - max(X)) / Σ exp(x_j - max(X))

where x_i is the i-th value of a row of the log-likelihood matrix, and max(X) is the maximum value of that row vector.
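Subtracting the row maximum before exponentiating is the usual numerical-stability trick; a minimal sketch, assuming one row per frame:

    import numpy as np

    def row_softmax(loglikes):
        """Numerically stable softmax over each row of the log-likelihood
        matrix: subtracting the row maximum leaves the result unchanged
        but prevents overflow in exp()."""
        shifted = loglikes - loglikes.max(axis=1, keepdims=True)
        e = np.exp(shifted)
        return e / e.sum(axis=1, keepdims=True)   # per-frame posteriors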
3. Computing the first- and second-order coefficients:
The first-order coefficients are obtained by summing the columns of the probability matrix above, and the second-order coefficients can be obtained by multiplying the transpose of the probability matrix by the data matrix.

The first-order coefficient formula is as follows:

Gamma_i = Σ_j loglikes_ji

where Gamma_i is the i-th element of the first-order coefficient vector, and loglikes_ji is the i-th element of the j-th row of the probability matrix computed above.
The second-order coefficient formula is as follows:

X = loglikes^T * feats

where X is the second-order coefficient matrix, loglikes^T is the transposed probability matrix, and feats is the voiceprint feature vector.
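These two quantities are the familiar zeroth- and first-order sufficient statistics; a sketch under the shapes used above (posteriors with one row per frame is an assumed layout):

    import numpy as np

    def accumulate_statistics(posteriors, feats):
        """posteriors: (frames, components) probability matrix;
        feats: (frames, dims) voiceprint feature vectors.
        Returns the first-order coefficients (column sums of the
        posteriors) and the second-order matrix loglikes^T * feats."""
        gamma = posteriors.sum(axis=0)      # Gamma_i = sum_j loglikes_ji
        second = posteriors.T @ feats       # X = loglikes^T * feats
        return gamma, second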
4. Computing the discriminant vector:
With the first- and second-order coefficients computed, the linear and quadratic terms are computed in parallel, and the discriminant vector is obtained from them:

Ivector = quadratic^(-1) * linear

where Ivector is the discriminant vector.
The linear term is computed from the following quantities:
M_i is the mean matrix of the i-th component of the universal model;
Σ_i is the covariance matrix of the i-th component;
X_i is the i-th row vector of the second-order coefficient matrix.
The quadratic term is computed from the following quantities:
M is the first-order coefficient vector;
M_i is the mean matrix of the i-th component of the universal model;
Σ_i is the covariance matrix of the i-th component.
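However the linear and quadratic terms are accumulated, the final step is a single linear solve; a minimal sketch (linear and quadratic are assumed to have been accumulated per the formulas above):

    import numpy as np

    def extract_ivector(quadratic, linear):
        """Ivector = quadratic^-1 * linear, computed as a linear solve,
        which is cheaper and more stable than forming the inverse."""
        return np.linalg.solve(quadratic, linear)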
In one specific embodiment, in the Gaussian mixture model trained with the EM algorithm, the likelihood of an extracted voiceprint feature can be represented with K Gaussian components:

P(x) = Σ_{k=1}^{K} w_k * p(x|k)
Parameters: P(x) is the probability that the sample is generated by the Gaussian mixture model;
w_k is the weight of the k-th Gaussian component;
p(x|k) is the probability that the sample is generated by the k-th Gaussian component;
K is the number of Gaussian components.
The parameters of the whole Gaussian mixture model can be expressed as {w_i, μ_i, Σ_i}, where:
w_i is the weight of the i-th Gaussian component;
μ_i is the mean of the i-th Gaussian component;
Σ_i is the covariance of the i-th Gaussian component.
This model can be trained with the unsupervised EM algorithm, with maximum-likelihood estimation as the objective function, i.e. the parameters are chosen so as to maximize the log-likelihood:

log P(x) = Σ_{i=1}^{K} log p(x_i)

where log P(x) is the maximum log-likelihood of the sample under the Gaussian mixture model; p(x_i) is the probability that the sample is generated by the i-th Gaussian component; and K is the number of Gaussian components.
The training parameters of the model updated at each iteration step are as follows:
w′_i is the weight of the i-th Gaussian component;
μ′_i is the mean of the i-th Gaussian component;
Σ′_i is the covariance of the i-th Gaussian component.
where p(i|x_j, θ) is the posterior probability of the i-th mixture component, which by Bayes' rule is:

p(i|x_j, θ) = w_i * p_i(x_j|θ_i) / Σ_{n=1}^{N} w_n * p_n(x_j|θ_n)

in which w_i is the weight of the i-th mixture component; p_i(x_j|θ_i) is the probability of x_j under the i-th mixture component; and N is the number of mixture components.
The iteration continues until the maximum log-likelihood of the samples no longer changes.
After training completes, the weight vector of the Gaussian mixture model, the constant vector, the N covariance matrices, the matrix of the means multiplied by the covariance terms, and so on, together constitute a trained universal background model.
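As a hedged sketch of this EM training stage: scikit-learn's GaussianMixture runs the same unsupervised EM loop, so pooling the MFCC frames of many speakers and fitting a mixture gives a stand-in universal background model (library choice, component count and file path are illustrative assumptions, not prescribed by the patent):

    import numpy as np
    from sklearn.mixture import GaussianMixture

    # feats: pooled (frames, dims) MFCC matrix from many speakers
    feats = np.load("pooled_mfcc.npy")               # illustrative path

    ubm = GaussianMixture(n_components=512,          # K components
                          covariance_type="diag",    # diagonal covariances
                          max_iter=100, tol=1e-4)    # EM until converged
    ubm.fit(feats)                                   # unsupervised EM

    # ubm.weights_, ubm.means_, ubm.covariances_ correspond to
    # the model parameters {w_i, mu_i, Sigma_i} described above.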
In conclusion for the anti-rogue device of the self refresh provided in the embodiment of the present application, blacklist vocal print is added in library When new fraud voice data, for the training parameters of all fraud voice data re -training vocal print training patterns, obtain more Vocal print training pattern after new;Vocal print training pattern is constantly updated, voice is counter to cheat preferably to adapt to, and it is accurate to promote anti-fraud Property;When with stylish first voice data access, by the updated vocal print training pattern calculate the first voice data with The similarity score of fraud voice data in blacklist vocal print library, that calculate is one between all fraud voice data A similarity score promotes accuracy rate, overcomes the defect of wrong report, while promoting detection efficiency.
Referring to Fig. 6, an embodiment of this application also provides a computer equipment, which may be a server whose internal structure is as shown in Fig. 6. The computer equipment includes a processor, a memory, a network interface and a database connected by a system bus. The processor of the computer equipment provides computation and control capability. The memory of the computer equipment includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program and a database. The internal memory provides an environment for running the operating system and the computer program in the non-volatile storage medium. The database of the computer equipment stores data such as the voiceprint training model. The network interface of the computer equipment communicates with external terminals through a network connection. When the computer program is executed by the processor, a self-updating anti-fraud method is realized.
The processor executes the steps of the self-updating anti-fraud method above: when new fraud voice data is added to the blacklist voiceprint library, retraining the training parameters of the voiceprint training model based on the fraud voice data in the blacklist voiceprint library to obtain an updated voiceprint training model; receiving first voice data, and computing, through the updated voiceprint training model, the similarity score between the first voice data and the fraud voice data in the blacklist voiceprint library; and, if the similarity score is higher than a set similarity threshold, judging the first voice data to be fraud voice data.
In one embodiment, the training parameters are the between-class covariance matrix in the PLDA matrix and the mean of the discriminant vectors, and the voiceprint training model is a Gaussian mixture model; the step in which the processor retrains the training parameters of the voiceprint training model based on the fraud voice data in the blacklist voiceprint library includes:
extracting a discriminant vector from each fraud voice data in the blacklist voiceprint library;
training the PLDA matrix with the discriminant vectors, and updating the between-class covariance matrix in the PLDA matrix and the mean of the discriminant vectors.
In one embodiment, the step in which the processor computes, through the updated voiceprint training model, the similarity score between the first voice data and the fraud voice data in the blacklist voiceprint library includes:
extracting the discriminant vector of the first voice data with the discriminant vector extractor;
inputting the discriminant vector of the first voice data into the updated voiceprint training model to compute the similarity score between the first voice data and the fraud voice data in the blacklist voiceprint library (a simplified end-to-end sketch follows below).
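A simplified end-to-end sketch of this scoring step. Cosine similarity over length-normalized discriminant vectors is used here as a stand-in for the PLDA score, since the exact scoring function is not spelled out in this embodiment; the function name, the 0.7 threshold and the array shapes are all illustrative assumptions:

    import numpy as np

    def score_against_blacklist(first_ivec, blacklist_ivecs, threshold=0.7):
        """Scores a new utterance's discriminant vector against the
        blacklist. Cosine similarity on length-normalized vectors is a
        simplified stand-in for PLDA scoring; threshold is illustrative."""
        q = first_ivec / np.linalg.norm(first_ivec)
        B = blacklist_ivecs / np.linalg.norm(blacklist_ivecs,
                                             axis=1, keepdims=True)
        score = float((B @ q).max())      # best match over the blacklist
        return score, score > threshold   # True -> judged as fraud voice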
In one embodiment, the step in which the processor extracts the discriminant vector of the first voice data with the discriminant vector extractor specifically includes:
extracting the voiceprint features of the first voice data to form a voiceprint feature vector, and computing the voiceprint feature vector through the discriminant vector extractor to extract the discriminant vector of the first voice data.
In one embodiment, the training parameters are the between-class covariance matrix in the PLDA matrix and the mean of the discriminant vectors, and the voiceprint training model is a Gaussian mixture model; before the step in which the processor, when new fraud voice data is added to the blacklist voiceprint library, retrains the training parameters of the voiceprint training model based on all the fraud voice data in the blacklist voiceprint library, the method includes:
extracting the voiceprint features in user voice data, and constructing the voiceprint feature vector of the voice data based on the voiceprint features;
inputting the voiceprint feature vector into the Gaussian mixture model and the discriminant vector extractor for training;
extracting multiple discriminant vectors from one or more fraud voice recordings with the trained discriminant vector extractor;
training the PLDA matrix with the multiple discriminant vectors, training the between-class covariance matrix in the PLDA matrix and the mean of the discriminant vectors (a sketch of this parameter update follows below).
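A minimal sketch of updating these two training parameters from speaker-labeled discriminant vectors. Estimating the between-class covariance from per-speaker means is one standard estimator, assumed here rather than quoted from the patent; all names are illustrative:

    import numpy as np

    def update_plda_parameters(ivectors, speaker_ids):
        """ivectors: (n, dim) discriminant vectors; speaker_ids: length-n
        labels. Returns the global mean of the discriminant vectors and
        the between-class covariance estimated over per-speaker means."""
        speaker_ids = np.asarray(speaker_ids)
        mean = ivectors.mean(axis=0)
        class_means = np.stack([ivectors[speaker_ids == s].mean(axis=0)
                                for s in np.unique(speaker_ids)])
        centered = class_means - mean
        between_cov = centered.T @ centered / len(class_means)
        return mean, between_cov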
In one embodiment, the step in which the processor inputs the voiceprint feature vector into the Gaussian mixture model and the discriminant vector extractor for training includes:
using the voiceprint feature vector as the input data of the Gaussian mixture model and the discriminant vector extractor, and training the Gaussian mixture model and the discriminant vector extractor with the EM algorithm.
In one embodiment, the voiceprint features are Mel-frequency cepstral coefficients, and the step in which the processor extracts the voiceprint features in the user voice data and constructs the voiceprint feature vector of the voice data based on the voiceprint features includes:
performing pre-emphasis, framing and windowing on the voice data in turn;
obtaining the spectrum of each windowed frame by Fourier transform;
filtering the spectrum through a Mel filter bank to obtain a Mel spectrum;
performing cepstral analysis on the Mel spectrum to obtain the Mel-frequency cepstral coefficients;
constructing the voiceprint feature vector based on the Mel-frequency cepstral coefficients.
It will be understood by those skilled in the art that the structure shown in Fig. 6 is only a block diagram of the part of the structure relevant to the scheme of this application, and does not constitute a limitation on the computer equipment to which the scheme of this application is applied.
An embodiment of this application also provides a computer storage medium on which a computer program is stored; when the computer program is executed by a processor, a self-updating anti-fraud method is realized, specifically: when new fraud voice data is added to the blacklist voiceprint library, retraining the training parameters of the voiceprint training model based on the fraud voice data in the blacklist voiceprint library to obtain an updated voiceprint training model; receiving first voice data, and computing, through the updated voiceprint training model, the similarity score between the first voice data and the fraud voice data in the blacklist voiceprint library; and, if the similarity score is higher than a set similarity threshold, judging the first voice data to be fraud voice data.
In one embodiment, the training parameters are the between-class covariance matrix in the PLDA matrix and the mean of the discriminant vectors, and the voiceprint training model is a Gaussian mixture model; the step in which the processor retrains the training parameters of the voiceprint training model based on all the fraud voice data in the blacklist voiceprint library includes:
extracting a discriminant vector from each fraud voice data in the blacklist voiceprint library;
training the PLDA matrix with the discriminant vectors, and updating the between-class covariance matrix in the PLDA matrix and the mean of the discriminant vectors.
In one embodiment, the step in which the processor computes, through the updated voiceprint training model, the similarity score between the first voice data and the fraud voice data in the blacklist voiceprint library includes:
extracting the discriminant vector of the first voice data with the discriminant vector extractor;
inputting the discriminant vector of the first voice data into the updated voiceprint training model to compute the similarity score between the first voice data and the fraud voice data in the blacklist voiceprint library.
In one embodiment, the step in which the processor extracts the discriminant vector of the first voice data with the discriminant vector extractor specifically includes:
extracting the voiceprint features of the first voice data to form a voiceprint feature vector, and computing the voiceprint feature vector through the discriminant vector extractor to extract the discriminant vector of the first voice data.
In one embodiment, the training parameters are the between-class covariance matrix in the PLDA matrix and the mean of the discriminant vectors, and the voiceprint training model is a Gaussian mixture model; before the step in which the processor, when new fraud voice data is added to the blacklist voiceprint library, retrains the training parameters of the voiceprint training model based on all the fraud voice data in the blacklist voiceprint library, the method includes:
extracting the voiceprint features in user voice data, and constructing the voiceprint feature vector of the voice data based on the voiceprint features;
inputting the voiceprint feature vector into the Gaussian mixture model and the discriminant vector extractor for training;
extracting multiple discriminant vectors from one or more fraud voice recordings with the trained discriminant vector extractor;
training the PLDA matrix with the multiple discriminant vectors, training the between-class covariance matrix in the PLDA matrix and the mean of the discriminant vectors.
In one embodiment, the step in which the processor inputs the voiceprint feature vector into the Gaussian mixture model and the discriminant vector extractor for training includes:
using the voiceprint feature vector as the input data of the Gaussian mixture model and the discriminant vector extractor, and training the Gaussian mixture model and the discriminant vector extractor with the EM algorithm.
In one embodiment, the voiceprint features are Mel-frequency cepstral coefficients, and the step in which the processor extracts the voiceprint features in the user voice data and constructs the voiceprint feature vector of the voice data based on the voiceprint features includes:
performing pre-emphasis, framing and windowing on the voice data in turn;
obtaining the spectrum of each windowed frame by Fourier transform;
filtering the spectrum through a Mel filter bank to obtain a Mel spectrum;
performing cepstral analysis on the Mel spectrum to obtain the Mel-frequency cepstral coefficients;
constructing the voiceprint feature vector based on the Mel-frequency cepstral coefficients.
One of ordinary skill in the art will appreciate that all or part of the flows in the methods of the embodiments above can be completed by instructing the relevant hardware through a computer program; the computer program can be stored in a non-volatile computer-readable storage medium, and when executed, the computer program may include the flows of the embodiments of each method above. Any reference to memory, storage, database or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable ROM (EEPROM) or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link (Synchlink) DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM) and Rambus dynamic RAM (RDRAM), etc.
It should be noted that, in this document, the terms "include" and "comprise" or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, apparatus, article or method that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, apparatus, article or method. In the absence of further limitation, an element qualified by the phrase "including a ..." does not exclude the presence of other identical elements in the process, apparatus, article or method that includes the element.
The above is only the preferred embodiment of this application and does not limit the patent scope of this application; any equivalent structure or equivalent flow transformation made using the content of the specification and drawings of this application, applied directly or indirectly in other related technical fields, is likewise included in the patent protection scope of this application.

Claims (10)

1. A self-updating anti-fraud method, characterized by comprising the following steps:
when new fraud voice data is added to the blacklist voiceprint library, retraining the training parameters of the voiceprint training model based on the fraud voice data in the blacklist voiceprint library to obtain an updated voiceprint training model;
receiving first voice data, and computing the similarity score between the first voice data and the fraud voice data in the blacklist voiceprint library through the updated voiceprint training model;
if the similarity score is higher than a set similarity threshold, judging the first voice data to be fraud voice data.
2. The self-updating anti-fraud method according to claim 1, characterized in that the training parameters are the between-class covariance matrix in the PLDA matrix and the mean of the discriminant vectors, and the voiceprint training model is a Gaussian mixture model; the step of retraining the training parameters of the voiceprint training model based on the fraud voice data in the blacklist voiceprint library comprises:
extracting a discriminant vector from each fraud voice data in the blacklist voiceprint library;
training the PLDA matrix with the discriminant vectors, and updating the between-class covariance matrix in the PLDA matrix and the mean of the discriminant vectors.
3. The self-updating anti-fraud method according to claim 1, characterized in that the step of computing the similarity score between the first voice data and the fraud voice data in the blacklist voiceprint library through the updated voiceprint training model comprises:
extracting the discriminant vector of the first voice data with the discriminant vector extractor;
inputting the discriminant vector of the first voice data into the updated voiceprint training model to compute the similarity score between the first voice data and the fraud voice data in the blacklist voiceprint library.
4. The self-updating anti-fraud method according to claim 3, characterized in that the step of extracting the discriminant vector of the first voice data with the discriminant vector extractor specifically comprises:
extracting the voiceprint features of the first voice data to form a voiceprint feature vector, and computing the voiceprint feature vector through the discriminant vector extractor to extract the discriminant vector of the first voice data.
5. The self-updating anti-fraud method according to any one of claims 1-4, characterized in that the training parameters are the between-class covariance matrix in the PLDA matrix and the mean of the discriminant vectors, and the voiceprint training model is a Gaussian mixture model; before the step of, when new fraud voice data is added to the blacklist voiceprint library, retraining the training parameters of the voiceprint training model based on the fraud voice data in the blacklist voiceprint library, the method comprises:
extracting the voiceprint features in user voice data, and constructing the voiceprint feature vector of the voice data based on the voiceprint features;
inputting the voiceprint feature vector into the Gaussian mixture model and the discriminant vector extractor for training;
extracting multiple discriminant vectors from one or more fraud voice recordings with the trained discriminant vector extractor;
training the PLDA matrix with the multiple discriminant vectors, training the between-class covariance matrix in the PLDA matrix and the mean of the discriminant vectors.
6. The self-updating anti-fraud method according to claim 5, characterized in that the step of inputting the voiceprint feature vector into the Gaussian mixture model and the discriminant vector extractor for training comprises:
using the voiceprint feature vector as the input data of the Gaussian mixture model and the discriminant vector extractor, and training the Gaussian mixture model and the discriminant vector extractor with the EM algorithm.
7. The self-updating anti-fraud method according to claim 5, characterized in that the voiceprint features are Mel-frequency cepstral coefficients, and the step of extracting the voiceprint features in user voice data and constructing the voiceprint feature vector of the voice data based on the voiceprint features comprises:
performing pre-emphasis, framing and windowing on the voice data in turn;
obtaining the spectrum of each windowed frame by Fourier transform;
filtering the spectrum through a Mel filter bank to obtain a Mel spectrum;
performing cepstral analysis on the Mel spectrum to obtain the Mel-frequency cepstral coefficients;
constructing the voiceprint feature vector based on the Mel-frequency cepstral coefficients.
8. A self-updating anti-fraud apparatus, characterized by comprising:
an updating unit, configured to, when new fraud voice data is added to the blacklist voiceprint library, retrain the training parameters of the voiceprint training model based on the fraud voice data in the blacklist voiceprint library to obtain an updated voiceprint training model;
a scoring unit, configured to receive first voice data and compute, through the updated voiceprint training model, the similarity score between the first voice data and the fraud voice data in the blacklist voiceprint library;
a judging unit, configured to judge the first voice data to be fraud voice data when the similarity score is higher than a set similarity threshold.
9. A computer equipment, comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, realizes the steps of the method according to any one of claims 1 to 7.
10. A computer storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, realizes the steps of the method according to any one of claims 1 to 7.
CN201810345256.4A 2018-04-17 2018-04-17 Anti- fraud method, apparatus, computer equipment and the storage medium of self refresh Pending CN108806695A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201810345256.4A CN108806695A (en) 2018-04-17 2018-04-17 Anti- fraud method, apparatus, computer equipment and the storage medium of self refresh
PCT/CN2018/095486 WO2019200744A1 (en) 2018-04-17 2018-07-12 Self-updated anti-fraud method and apparatus, computer device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810345256.4A CN108806695A (en) 2018-04-17 2018-04-17 Anti- fraud method, apparatus, computer equipment and the storage medium of self refresh

Publications (1)

Publication Number Publication Date
CN108806695A true CN108806695A (en) 2018-11-13

Family

ID=64094806

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810345256.4A Pending CN108806695A (en) 2018-04-17 2018-04-17 Anti- fraud method, apparatus, computer equipment and the storage medium of self refresh

Country Status (2)

Country Link
CN (1) CN108806695A (en)
WO (1) WO2019200744A1 (en)

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030182119A1 (en) * 2001-12-13 2003-09-25 Junqua Jean-Claude Speaker authentication system and method
CN102567788A (en) * 2010-12-28 2012-07-11 中国移动通信集团重庆有限公司 Real-time identification system and real-time identification method for fraudulent practice in communication services
CN102760434A (en) * 2012-07-09 2012-10-31 华为终端有限公司 Method for updating voiceprint feature model and terminal
CN103971700A (en) * 2013-08-01 2014-08-06 哈尔滨理工大学 Voice monitoring method and device
US20150206538A1 (en) * 2014-01-17 2015-07-23 Agnitio, S.L. Tamper-resistant element for use in speaker recognition
CN106157959A (en) * 2015-03-31 2016-11-23 讯飞智元信息科技有限公司 Sound-groove model update method and system
CN106981289A (en) * 2016-01-14 2017-07-25 芋头科技(杭州)有限公司 A kind of identification model training method and system and intelligent terminal
CN106251874A (en) * 2016-07-27 2016-12-21 深圳市鹰硕音频科技有限公司 A kind of voice gate inhibition and quiet environment monitoring method and system
CN106506454A (en) * 2016-10-10 2017-03-15 江苏通付盾科技有限公司 Fraud business recognition method and device
CN107068154A (en) * 2017-03-13 2017-08-18 平安科技(深圳)有限公司 The method and system of authentication based on Application on Voiceprint Recognition
CN106991312A (en) * 2017-04-05 2017-07-28 百融(北京)金融信息服务股份有限公司 Internet based on Application on Voiceprint Recognition is counter to cheat authentication method
CN107680600A (en) * 2017-09-11 2018-02-09 平安科技(深圳)有限公司 Sound-groove model training method, audio recognition method, device, equipment and medium

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020107756A1 (en) * 2018-11-27 2020-06-04 深圳前海微众银行股份有限公司 Credit anti-fraud method, system, device and computer-readable storage medium
CN109994118B (en) * 2019-04-04 2022-10-11 平安科技(深圳)有限公司 Voice password verification method and device, storage medium and computer equipment
CN109994118A (en) * 2019-04-04 2019-07-09 平安科技(深圳)有限公司 Speech cipher verification method, device, storage medium and computer equipment
CN110311909A (en) * 2019-06-28 2019-10-08 平安科技(深圳)有限公司 The abnormality determination method and device of terminal device network access
CN110311909B (en) * 2019-06-28 2021-12-24 平安科技(深圳)有限公司 Method and device for judging abnormity of network access of terminal equipment
WO2021103913A1 (en) * 2019-11-27 2021-06-03 华为技术有限公司 Voice anti-counterfeiting method and apparatus, terminal device, and storage medium
CN112863523B (en) * 2019-11-27 2023-05-16 华为技术有限公司 Voice anti-counterfeiting method and device, terminal equipment and storage medium
CN112863523A (en) * 2019-11-27 2021-05-28 华为技术有限公司 Voice anti-counterfeiting method and device, terminal equipment and storage medium
CN113112992B (en) * 2019-12-24 2022-09-16 中国移动通信集团有限公司 Voice recognition method and device, storage medium and server
CN113112992A (en) * 2019-12-24 2021-07-13 中国移动通信集团有限公司 Voice recognition method and device, storage medium and server
CN111611566A (en) * 2020-05-12 2020-09-01 珠海造极声音科技有限公司 Speaker verification system and replay attack detection method thereof
CN111611566B (en) * 2020-05-12 2023-09-05 珠海造极智能生物科技有限公司 Speaker verification system and replay attack detection method thereof
CN111933147A (en) * 2020-06-22 2020-11-13 厦门快商通科技股份有限公司 Voiceprint recognition method, system, mobile terminal and storage medium
CN111933147B (en) * 2020-06-22 2023-02-14 厦门快商通科技股份有限公司 Voiceprint recognition method, system, mobile terminal and storage medium
CN112331230A (en) * 2020-11-17 2021-02-05 平安科技(深圳)有限公司 Method and device for identifying fraudulent conduct, computer equipment and storage medium
CN112735438A (en) * 2020-12-29 2021-04-30 科大讯飞股份有限公司 Online voiceprint feature updating method and device, storage device and modeling device
CN112735438B (en) * 2020-12-29 2024-05-31 科大讯飞股份有限公司 Online voiceprint feature updating method and device, storage device and modeling device
CN113590873A (en) * 2021-07-23 2021-11-02 中信银行股份有限公司 Processing method and device for white list voiceprint feature library and electronic equipment
WO2023124248A1 (en) * 2021-12-28 2023-07-06 荣耀终端有限公司 Voiceprint recognition method and apparatus

Also Published As

Publication number Publication date
WO2019200744A1 (en) 2019-10-24

Similar Documents

Publication Publication Date Title
CN108806695A (en) Anti- fraud method, apparatus, computer equipment and the storage medium of self refresh
CN107610707B (en) A kind of method for recognizing sound-groove and device
US9373330B2 (en) Fast speaker recognition scoring using I-vector posteriors and probabilistic linear discriminant analysis
CN110378562B (en) Voice quality inspection method, device, computer equipment and storage medium
Wu et al. SAS: A speaker verification spoofing database containing diverse attacks
CN110047490A (en) Method for recognizing sound-groove, device, equipment and computer readable storage medium
CN108154371A (en) Electronic device, the method for authentication and storage medium
CN108986798B (en) Processing method, device and the equipment of voice data
CN108922544A (en) General vector training method, voice clustering method, device, equipment and medium
CN113223536B (en) Voiceprint recognition method and device and terminal equipment
CN109065022A (en) I-vector vector extracting method, method for distinguishing speek person, device, equipment and medium
CN108091326A (en) A kind of method for recognizing sound-groove and system based on linear regression
CN112233651B (en) Dialect type determining method, device, equipment and storage medium
CN108922543A (en) Model library method for building up, audio recognition method, device, equipment and medium
Novotný et al. Analysis of Speaker Recognition Systems in Realistic Scenarios of the SITW 2016 Challenge.
CN111161713A (en) Voice gender identification method and device and computing equipment
CN110379433A (en) Method, apparatus, computer equipment and the storage medium of authentication
Kheder et al. A unified joint model to deal with nuisance variabilities in the i-vector space
Khosravani et al. Nonparametrically trained PLDA for short duration i-vector speaker verification
CN109360573A (en) Livestock method for recognizing sound-groove, device, terminal device and computer storage medium
Mirghafori et al. An adaptive speaker verification system with speaker dependent a priori decision thresholds.
Ferrer et al. Joint PLDA for simultaneous modeling of two factors
Singh et al. Language identification using sparse representation: A comparison between gmm supervector and i-vector based approaches
Zajíc et al. Fisher vectors in PLDA speaker verification system
Chaudhari et al. Transformation enhanced multi-grained modeling for text-independent speaker recognition.

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20181113)