CN105869630A - Method and system for detecting voice spoofing attack of speakers on basis of deep learning - Google Patents

Method and system for detecting voice spoofing attack of speakers on basis of deep learning

Info

Publication number
CN105869630A
Authority
CN
China
Prior art keywords
degree
depth
neural network
voice
speaker
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610478041.0A
Other languages
Chinese (zh)
Other versions
CN105869630B (en)
Inventor
钱彦旻
陈楠昕
俞凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sipic Technology Co Ltd
Original Assignee
Shanghai Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiaotong University
Priority to CN201610478041.0A
Publication of CN105869630A
Application granted
Publication of CN105869630B
Current legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/14 Speech classification or search using statistical models, e.g. Hidden Markov Models [HMMs]
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/16 Speech classification or search using artificial neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Probability & Statistics with Applications (AREA)
  • Cable Transmission Systems, Equalization Of Radio And Reduction Of Echo (AREA)
  • Complex Calculations (AREA)

Abstract

The invention discloses a method and a system for detecting voice spoofing attacks on speakers based on deep learning. The method comprises: constructing an audio training set; initializing a deep feed-forward neural network and a deep recurrent neural network; and training the two networks with the multi-frame feature vectors and the single-frame vector sequences of the training set, respectively. In the test phase, the frame-level and sequence-level feature vectors of the audio under test are fed into two trained linear discriminant analysis (LDA) models, the two resulting scores are weighted to obtain a fused score, and the fused score is compared with a predefined threshold to identify voice spoofing. The method captures local features while also grasping global information. LDA models serve as classifiers in the identification and verification phase, and the spoofing decision is made by score fusion, which, according to the invention, greatly improves the accuracy of voice spoofing detection.

Description

Speaker voice spoofing attack detection method and system based on deep learning
Technical field
The present invention relates to a technology in the field of intelligent speech, and in particular to a deep-learning-based method and system for detecting voice spoofing attacks on speakers.
Background
A voice spoofing attack is a technique that forges the voice of a specific target in order to attack an automatic speaker recognition system. Speaker recognition is currently widely used in many fields, such as identity authentication, Internet security, human-computer interaction, banking and securities systems, and military and criminal investigation. In recent years, attacks on speaker recognition systems have fallen mainly into four categories: impersonation, replay, speech synthesis, and voice conversion. Research shows that the main problem of traditional voice spoofing attack detection lies in feature extraction: existing feature extraction methods have many deficiencies in expressing the characteristics of the human voice and in robustness.
In the prior art, the characteristic parameters commonly used in the feature extraction stage of voice spoofing detection and recognition are mainly spectral feature parameters, phase feature parameters, cochlea-based auditory features, and perceptual features. These feature extraction methods remain deficient in characterizing genuine and spoofed speech, which affects detection accuracy. Moreover, these methods all rely on auditory features of the speech signal and lose its dynamic features, so robustness is poor and recognition results are unsatisfactory.
In the recognition model stage, the mainstream methods are the Gaussian mixture model (GMM) and the support vector machine (SVM). Both methods are suited to processing continuous signals, but they are limited by their training criteria and have weak expressive power. As a result, they can distinguish between samples of different classes only crudely, and their recognition performance is poor.
Summary of the invention
Existing traditional voice spoofing attack detection methods suffer from feature extraction that cannot accurately characterize the features distinguishing spoofed speech from genuine speech, from the loss of the dynamic features of the speech signal, and from poor robustness and poor recognition results. To address these shortcomings, the present invention proposes a deep-learning-based method and system for detecting voice spoofing attacks on speakers. In the feature extraction stage, deep learning models extract feature vectors under two different frameworks: a frame-level feature representation based on a deep feed-forward neural network, and a sequence-level feature representation based on a deep recurrent neural network. Together they capture local features while also grasping global information. In the identification and verification stage, linear discriminant analysis is used as the classifier and the decision is made by score fusion. The present invention can greatly improve the accuracy of voice spoofing detection.
The present invention is achieved by the following technical solutions:
The present invention relates to a deep-learning-based method for detecting voice spoofing attacks on speakers: an audio training set is built; a deep feed-forward neural network and a deep recurrent neural network are initialized and trained with the multi-frame feature vectors and the single-frame vector sequences of the training set, respectively; in the test phase, the frame-level and sequence-level feature vectors of the audio under test are fed into two trained linear discriminant analysis models, the two resulting scores are weighted to form a fused score, and the fused score is compared with a predefined threshold to identify voice spoofing.
Training the deep feed-forward neural network and the deep recurrent neural network specifically comprises: the acoustic features of the enrollment audio extracted with a multi-frame Mel filter bank, i.e. the filter-bank features, are used to train the deep feed-forward neural network; the audio training set is then passed through the deep feed-forward neural network, and the frame-level feature vector of the audio is obtained from the last hidden layer of the network. The acoustic features of the enrollment audio extracted with the multi-frame Mel filter bank are likewise used to train the deep recurrent neural network; the features are then normalized, and the sequence-level feature vector of the audio is obtained from the last hidden layer of the deep recurrent neural network.
During back-propagation in training the deep feed-forward neural network and the deep recurrent neural network, the learning rate is determined by a simulated-annealing and early-stopping strategy.
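As a rough illustration of how a learning rate governed by annealing and early stopping might behave, the following sketch replays a list of per-epoch validation losses. The halving factor, the minimum learning rate, the stopping rule, and the function name `train_with_annealing` are all assumptions made for illustration; the patent does not specify them.

```python
def train_with_annealing(validation_losses, initial_lr=0.1, decay=0.5,
                         min_lr=1e-4):
    """Replay per-epoch validation losses and return the schedule that results.

    Returns (learning rate used in each epoch, number of epochs run).
    """
    lr = initial_lr
    best = float("inf")
    history = []
    for loss in validation_losses:
        history.append(lr)
        if loss < best:
            best = loss      # validation loss still improving: keep the rate
        else:
            lr *= decay      # no improvement: anneal (shrink) the learning rate
            if lr < min_lr:
                break        # early stop once the rate has become negligible
    return history, len(history)


# Hypothetical validation losses for eight epochs.
losses = [1.0, 0.8, 0.7, 0.75, 0.69, 0.70, 0.71, 0.72]
rates, epochs = train_with_annealing(losses, initial_lr=0.1, decay=0.5,
                                     min_lr=0.02)
```

In a real training loop the losses would come from evaluating the network on held-out data after each epoch rather than from a fixed list.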
The multi-frame input refers to a 31-frame window, with 15 frames on each side of the current frame.
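The 31-frame context window can be sketched as a frame-splicing step that stacks each frame with its 15 left and 15 right neighbours. How the edges of an utterance are handled is not stated in the patent; repeating the first and last frame, as done here, is an assumption, as is the function name `splice_frames`.

```python
def splice_frames(frames, left=15, right=15):
    """Stack each frame with its neighbours into one multi-frame vector.

    frames: list of per-frame feature lists, all of the same length.
    Edge frames are padded by repeating the first/last frame (an assumption).
    """
    n = len(frames)
    spliced = []
    for t in range(n):
        window = []
        for offset in range(-left, right + 1):
            idx = min(max(t + offset, 0), n - 1)  # clamp to the valid range
            window.extend(frames[idx])
        spliced.append(window)
    return spliced


# Toy input: 40 frames of 3-dimensional features.
frames = [[float(t), t + 0.1, t + 0.2] for t in range(40)]
spliced = splice_frames(frames)
```

Each spliced vector has 31 times the per-frame dimension, with the current frame in the middle of the window.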
The acoustic features, i.e. the Mel filter-bank acoustic features, are obtained by filtering the frequency-domain audio signal under test with a bank of Mel filters, which yields a filtered array, i.e. the Mel spectrum. Each band-pass filter outputs one filter-bank coefficient, so the length of the array equals the number of filters in the Mel filter bank.
The Mel filters use, but are not limited to, triangular window filters.
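A minimal sketch of a triangular Mel filter bank applied to one power spectrum follows. The Hz-to-Mel formula used is the common 2595·log10(1 + f/700) convention, and the number of filters (24), the number of spectrum bins (257), and the 16 kHz sample rate are illustrative assumptions; none of these values is given in the patent.

```python
import math


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def triangular_mel_filterbank(num_filters, num_bins, sample_rate):
    """Build num_filters triangular filters over num_bins bins (0 Hz..Nyquist)."""
    nyquist = sample_rate / 2.0
    # num_filters + 2 edge points, equally spaced on the Mel scale.
    mel_points = [hz_to_mel(nyquist) * i / (num_filters + 1)
                  for i in range(num_filters + 2)]
    hz_points = [mel_to_hz(m) for m in mel_points]
    bin_hz = [nyquist * b / (num_bins - 1) for b in range(num_bins)]
    bank = []
    for m in range(1, num_filters + 1):
        lo, mid, hi = hz_points[m - 1], hz_points[m], hz_points[m + 1]
        weights = []
        for f in bin_hz:
            if lo < f <= mid:
                weights.append((f - lo) / (mid - lo))   # rising slope
            elif mid < f < hi:
                weights.append((hi - f) / (hi - mid))   # falling slope
            else:
                weights.append(0.0)
        bank.append(weights)
    return bank


def log_filterbank_features(power_spectrum, bank, floor=1e-10):
    """One log filter-bank coefficient per filter."""
    return [math.log(max(sum(w * p for w, p in zip(weights, power_spectrum)),
                         floor))
            for weights in bank]


bank = triangular_mel_filterbank(num_filters=24, num_bins=257,
                                 sample_rate=16000)
flat_spectrum = [1.0] * 257          # toy power spectrum of a single frame
features = log_filterbank_features(flat_spectrum, bank)
```

The output array has one coefficient per filter, matching the statement that its length equals the number of filters in the bank.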
The deep feed-forward neural network comprises several hidden layers that are fully connected to one another; its parameters are randomly initialized and updated by the back-propagation algorithm;
The deep recurrent neural network comprises several hidden layers in which, in addition to the full connections, each hidden layer also has a connection to itself; this connection propagates information from the previous time step and thereby serves as memory.
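The difference between the two hidden-layer types can be sketched with a single layer of each kind. This is only an illustration: the tanh activation, the toy weights, and the function names are assumptions, and a real network would stack several such layers.

```python
import math


def dense(weights, vector, bias):
    """Affine map: weights @ vector + bias, with weights given as a list of rows."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def feedforward_hidden(x, w_in, bias):
    """Fully connected hidden layer: depends only on the current input."""
    return [math.tanh(a) for a in dense(w_in, x, bias)]


def recurrent_hidden(sequence, w_in, w_rec, bias):
    """Hidden layer with a self-connection carrying the previous state forward."""
    h = [0.0] * len(bias)
    states = []
    for x in sequence:
        a_in = dense(w_in, x, bias)
        a_rec = dense(w_rec, h, [0.0] * len(bias))  # contribution of h at t-1
        h = [math.tanh(i + r) for i, r in zip(a_in, a_rec)]
        states.append(h)
    return states


# Toy weights and a three-frame input sequence (all values hypothetical).
w_in = [[0.5, -0.2], [0.1, 0.3]]
w_rec = [[0.4, 0.0], [0.0, 0.4]]
bias = [0.0, 0.1]
sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
states = recurrent_hidden(sequence, w_in, w_rec, bias)
```

With the recurrent weights set to zero the recurrent layer reduces to the feed-forward layer applied frame by frame, which makes the role of the self-connection explicit.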
The different nodes of the network output layer represent the different attack types or genuine human speech; the whole neural network classifies the input speech, with cross-entropy as the objective function.
The frame-level and sequence-level feature vectors are output by the deep feed-forward neural network and the deep recurrent neural network, respectively, and are preferably normalized so that they have the same L2-norm length.
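Normalizing a feature vector to a fixed L2-norm length can be sketched as follows. Normalizing to unit length, and the small `eps` guard against division by zero, are assumptions; the patent only requires that the vectors share the same L2 norm.

```python
import math


def l2_normalize(vector, eps=1e-12):
    """Scale a vector so that its L2 (Euclidean) norm is 1."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / max(norm, eps) for v in vector]


frame_level = l2_normalize([3.0, 4.0])            # toy frame-level feature
sequence_level = l2_normalize([0.5, -1.5, 2.0])   # toy sequence-level feature
```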
The two trained linear discriminant analysis (LDA) models are obtained as follows: the frame-level and sequence-level feature vectors obtained from the last hidden layers of the deep feed-forward neural network and the deep recurrent neural network are used to train two LDA models, respectively. In the LDA model, the density of each class is modeled by a multivariate Gaussian distribution:

$$f_k(x) = \frac{1}{(2\pi)^{p/2}\,|\Sigma_k|^{1/2}} \exp\left(-\frac{1}{2}(x-\mu_k)^{T}\Sigma_k^{-1}(x-\mu_k)\right)$$

where $\Sigma_k$ and $\mu_k$ are the covariance matrix and the mean of the k-th class, respectively, and $p$ is the dimension of $x$. The LDA model assumes a common covariance, $\Sigma_k = \Sigma,\ \forall k$, and the posterior probability is given by Bayes' formula:

$$P(G=k \mid X=x) = \frac{f_k(x)\,\pi_k}{\sum_{l=1}^{K} f_l(x)\,\pi_l}$$

where $\pi_k$ is the prior probability of the k-th class and $K$ is the number of classes.
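The LDA posterior under the shared-covariance assumption can be sketched in plain Python. Because the covariance is common to all classes, the Gaussian normalizer and the quadratic term in x cancel in Bayes' formula, so the posterior is a softmax over the linear scores x^T Σ^{-1} μ_k − ½ μ_k^T Σ^{-1} μ_k + log π_k. The class means, covariance, and priors below are made-up toy values, and estimating them from training features is not shown.

```python
import math


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def lda_posteriors(x, means, shared_cov, priors):
    """Posterior P(class k | x) for Gaussian classes with one shared covariance."""
    scores = []
    for mu, prior in zip(means, priors):
        w = solve(shared_cov, mu)                             # Sigma^{-1} mu_k
        score = sum(wi * xi for wi, xi in zip(w, x))          # x^T Sigma^{-1} mu_k
        score -= 0.5 * sum(wi * mi for wi, mi in zip(w, mu))  # -1/2 mu_k^T Sigma^{-1} mu_k
        score += math.log(prior)
        scores.append(score)
    top = max(scores)                                         # for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy two-class problem (e.g. genuine vs. one attack type); values are hypothetical.
means = [[0.0, 0.0], [2.0, 2.0]]
shared_cov = [[1.0, 0.2], [0.2, 1.0]]
priors = [0.5, 0.5]
posterior = lda_posteriors([0.1, -0.1], means, shared_cov, priors)
```

A point close to the first class mean receives most of the posterior mass for that class, and a point midway between the two means with equal priors is split evenly.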
The score weights of the two LDA models are preferably adjusted according to performance on the development set.
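One simple way to adjust the fusion weight on a development set is a grid search, sketched below. The patent does not say how the weights are tuned; the grid of eleven weights, the accuracy criterion, the fixed threshold, the convention that higher scores mean genuine speech, and all of the toy scores are assumptions.

```python
def tune_fusion_weight(dev_frame, dev_seq, dev_labels, threshold=0.5, steps=11):
    """Pick the frame-level weight w in [0, 1] with the best dev-set accuracy.

    The fused score is w * frame_score + (1 - w) * sequence_score, and a trial
    is predicted genuine (True) when the fused score reaches the threshold.
    """
    best_weight, best_acc = 0.0, -1.0
    for i in range(steps):
        w = i / (steps - 1)
        correct = 0
        for f, s, label in zip(dev_frame, dev_seq, dev_labels):
            fused = w * f + (1.0 - w) * s
            correct += int((fused >= threshold) == label)
        acc = correct / len(dev_labels)
        if acc > best_acc:              # keep the first weight reaching the best accuracy
            best_weight, best_acc = w, acc
    return best_weight, best_acc


# Hypothetical development-set scores from the two LDA models.
dev_frame = [0.9, 0.8, 0.3, 0.65]
dev_seq = [0.45, 0.7, 0.2, 0.1]
dev_labels = [True, True, False, False]   # True = genuine speech
best_weight, best_acc = tune_fusion_weight(dev_frame, dev_seq, dev_labels)
```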
The number of classes equals the number of output-layer nodes of the neural network, i.e. the number of attack types plus 1.
The present invention also relates to a deep-learning-based system for detecting voice spoofing attacks on speakers, comprising a log-spectrum feature extraction module, a deep neural network module, and a linear discriminant module, wherein: the log-spectrum feature extraction module is connected to the deep neural network module and transmits the acoustic feature information of the audio under test; the deep neural network module outputs feature vector information, derived from the acoustic feature information, to the linear discriminant module for training; and the trained linear discriminant module judges and scores the feature vector information of the audio under test, thereby detecting voice spoofing.
Technical effect
Compared with the prior art, the feature vectors extracted by deep learning as proposed in the present invention characterize the speech of the speaker more accurately; and using a linear discriminant analysis (LDA) model as the classifier in the classification and recognition stage reduces within-class differences and enlarges between-class gaps. Recognition performance is good, robustness is strong, and accuracy is substantially higher than that of existing methods. The technical effects of the present invention include:
1) recognition accuracy is greatly improved over existing methods;
2) the extracted features characterize the individual characteristics of the speaker more accurately;
3) the deep learning strategy avoids over-fitting of the network;
4) deep learning makes the features more discriminative;
5) robustness is stronger across different channels and environments.
In addition, the present invention is more robust under unknown, complex conditions.
Brief description of the drawings
Fig. 1 is a schematic flow chart of the present invention.
Detailed description of the embodiments
Embodiment 1
This embodiment is tested on the newly released ASVSpoof2015 data set and compared with existing baseline methods; the results are shown in Table 1. It can be seen that the method proposed by the present invention achieves the best results to date.
The speaker voice spoofing attack detection system of this embodiment comprises a log-spectrum feature extraction module, a deep neural network module, and a linear discriminant module, wherein: the log-spectrum feature extraction module is connected to the deep neural network module and transmits the acoustic feature information of the audio under test; the deep neural network module outputs feature vector information, derived from the acoustic feature information, to the linear discriminant module for training; and the trained linear discriminant module judges and scores the feature vector information of the audio under test, thereby detecting voice spoofing.
The detection process of this embodiment using the above system is as follows:
Step 1) Build the audio training set (the training set of ASVSpoof2015) and randomly initialize the deep neural network composed of the deep feed-forward neural network and the deep recurrent neural network;
The loss function of the deep neural network is the cross-entropy, with a Euclidean-distance (L2-norm) weight decay term whose coefficient is 10^-6.
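The loss described here, cross-entropy plus an L2 weight decay term with coefficient 10^-6, can be sketched for a single training example. Whether the penalty is the plain sum of squared weights or carries an extra factor of one half is not stated in the patent; the plain sum used here, and the function names, are assumptions.

```python
import math


def softmax(logits):
    top = max(logits)                       # subtract the max for stability
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def loss_with_weight_decay(logits, target, weights, decay=1e-6):
    """Cross-entropy of one example plus decay * sum of squared weights."""
    cross_entropy = -math.log(softmax(logits)[target])
    l2_penalty = sum(w * w for w in weights)
    return cross_entropy + decay * l2_penalty


# Toy example: two output nodes, uniform prediction, two network weights.
loss = loss_with_weight_decay([0.0, 0.0], target=0, weights=[3.0, 4.0])
```

In practice `weights` would be the flattened list of all network parameters and the cross-entropy would be averaged over a mini-batch.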
Random initialization means that the initial values of the network parameters are drawn at random. The back-propagation algorithm with stochastic gradient descent (SGD) is then used to adjust the parameters of the deep feed-forward neural network, and the SGD-based back-propagation through time (BPTT) algorithm is used to adjust the parameters of the deep recurrent neural network.
Step 2) Training phase: the deep feed-forward neural network is trained with the multi-frame feature vectors of the training audio, using a window of 31 frames that extends 15 frames to the left and to the right; the deep recurrent neural network is trained with the single-frame vector sequences of the training audio, using the SGD-based BPTT algorithm. The learning rate is determined by a simulated-annealing and early-stopping strategy; training uses the cross-entropy with a weight decay term of 10^-6. After network training is complete, the training audio is passed through the deep feed-forward neural network and the deep recurrent neural network, and the frame-level and sequence-level feature vectors obtained from their last hidden layers are used to train two linear discriminant analysis models. Finally, the score weights of the two models are adjusted according to performance on the development set.
Step 3) Test phase: the frame-level and sequence-level feature vectors of the audio under test are computed and fed into the trained linear discriminant analysis models; the two resulting scores are weighted to form a fused score, which is compared with the threshold determined in training to identify voice spoofing.
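The final decision of the test phase, weighting the two LDA scores and comparing the result with a threshold, can be sketched as below. The convention that higher scores indicate genuine speech, and the particular weight and threshold values, are assumptions for illustration.

```python
def detect_spoof(frame_score, sequence_score, weight, threshold):
    """Fuse the two LDA scores and flag the trial as spoofed below the threshold.

    Returns (fused score, True if the audio is judged to be spoofed).
    """
    fused = weight * frame_score + (1.0 - weight) * sequence_score
    return fused, fused < threshold


# Hypothetical scores for one genuine-looking and one spoofed-looking trial.
genuine_fused, genuine_flag = detect_spoof(0.9, 0.7, weight=0.6, threshold=0.5)
spoof_fused, spoof_flag = detect_spoof(0.2, 0.1, weight=0.6, threshold=0.5)
```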
The comparison between the present invention and existing algorithms is given in the following table:
Those skilled in the art may make local adjustments to the above specific implementation in different ways without departing from the principle and purpose of the present invention. The scope of protection of the present invention is defined by the claims and is not limited by the above specific implementation; every implementation within that scope is bound by the present invention.

Claims (10)

1. A deep-learning-based method for detecting voice spoofing attacks on speakers, characterized in that an audio training set is built, and a deep feed-forward neural network and a deep recurrent neural network are initialized and trained with the multi-frame feature vectors and the single-frame vector sequences of the training set, respectively; in the test phase, the frame-level and sequence-level feature vectors of the audio under test are fed into two trained linear discriminant analysis models, the two resulting scores are weighted to form a fused score, and the fused score is compared with a predefined threshold to identify voice spoofing.
2. The speaker voice spoofing attack detection method according to claim 1, characterized in that training the deep feed-forward neural network and the deep recurrent neural network specifically comprises: the acoustic features of the enrollment audio extracted with a multi-frame Mel filter bank, i.e. the filter-bank features, are used to train the deep feed-forward neural network; the audio training set is then passed through the deep feed-forward neural network, and the frame-level feature vector of the audio is obtained from the last hidden layer of the network; the acoustic features of the enrollment audio extracted with the multi-frame Mel filter bank are used to train the deep recurrent neural network; the features are then normalized, and the sequence-level feature vector of the audio is obtained from the last hidden layer of the deep recurrent neural network.
3. The speaker voice spoofing attack detection method according to claim 1 or 2, characterized in that, during back-propagation in training the deep feed-forward neural network and the deep recurrent neural network, the learning rate is determined by a simulated-annealing and early-stopping strategy.
4. The speaker voice spoofing attack detection method according to claim 1, characterized in that the acoustic features, i.e. the Mel filter-bank acoustic features, are obtained by filtering the frequency-domain audio signal under test with a bank of Mel filters, which yields a filtered array, i.e. the Mel spectrum, wherein each band-pass filter outputs one filter-bank coefficient and the length of the array equals the number of filters in the Mel filter bank.
5. The speaker voice spoofing attack detection method according to claim 1 or 2, characterized in that the deep feed-forward neural network comprises several hidden layers that are fully connected to one another, its parameters being randomly initialized and updated by the back-propagation algorithm; and the deep recurrent neural network comprises several hidden layers in which, in addition to the full connections, each hidden layer also has a connection to itself, which propagates information from the previous time step and thereby serves as memory.
6. The speaker voice spoofing attack detection method according to claim 1, characterized in that the frame-level and sequence-level feature vectors are output by the deep feed-forward neural network and the deep recurrent neural network, respectively, and are normalized so that they have the same L2-norm length.
7. The speaker voice spoofing attack detection method according to claim 1, characterized in that the two trained linear discriminant analysis (LDA) models are obtained as follows: the frame-level and sequence-level feature vectors obtained from the last hidden layers of the deep feed-forward neural network and the deep recurrent neural network are used to train two LDA models, respectively; in the LDA model, the density of each class is modeled by a multivariate Gaussian distribution:

$$f_k(x) = \frac{1}{(2\pi)^{p/2}\,|\Sigma_k|^{1/2}} \exp\left(-\frac{1}{2}(x-\mu_k)^{T}\Sigma_k^{-1}(x-\mu_k)\right)$$

where $\Sigma_k$ and $\mu_k$ are the covariance matrix and the mean of the k-th class, respectively, and $p$ is the dimension of $x$; the LDA model assumes $\Sigma_k = \Sigma,\ \forall k$, and the posterior probability is given by Bayes' formula:

$$P(G=k \mid X=x) = \frac{f_k(x)\,\pi_k}{\sum_{l=1}^{K} f_l(x)\,\pi_l}$$

where $\pi_k$ is the prior probability of the k-th class and $K$ is the number of classes.
8. The speaker voice spoofing attack detection method according to claim 1 or 7, characterized in that the score weights of the two LDA models are adjusted according to performance on the development set.
9. The speaker voice spoofing attack detection method according to claim 7, characterized in that the number of classes equals the number of output-layer nodes of the neural network, i.e. the number of attack types plus 1.
10. A deep-learning-based system for detecting voice spoofing attacks on speakers, characterized by comprising a log-spectrum feature extraction module, a deep neural network module, and a linear discriminant module, wherein: the log-spectrum feature extraction module is connected to the deep neural network module and transmits the acoustic feature information of the audio under test; the deep neural network module outputs feature vector information, derived from the acoustic feature information, to the linear discriminant module for training; and the trained linear discriminant module judges and scores the feature vector information of the audio under test, thereby detecting voice spoofing.
CN201610478041.0A 2016-06-27 2016-06-27 Speaker's voice spoofing attack detection method and system based on deep learning Active CN105869630B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610478041.0A CN105869630B (en) 2016-06-27 2016-06-27 Speaker's voice spoofing attack detection method and system based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610478041.0A CN105869630B (en) 2016-06-27 2016-06-27 Speaker's voice spoofing attack detection method and system based on deep learning

Publications (2)

Publication Number Publication Date
CN105869630A true CN105869630A (en) 2016-08-17
CN105869630B CN105869630B (en) 2019-08-02

Family

ID=56655288

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610478041.0A Active CN105869630B (en) 2016-06-27 2016-06-27 Speaker's voice spoofing attack detection method and system based on deep learning

Country Status (1)

Country Link
CN (1) CN105869630B (en)

Cited By (57)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106875007A (en) * 2017-01-25 2017-06-20 上海交通大学 End-to-end deep neural network is remembered based on convolution shot and long term for voice fraud detection
CN106991999A (en) * 2017-03-29 2017-07-28 北京小米移动软件有限公司 Audio recognition method and device
CN107221320A (en) * 2017-05-19 2017-09-29 百度在线网络技术(北京)有限公司 Train method, device, equipment and the computer-readable storage medium of acoustic feature extraction model
CN107527616A (en) * 2017-09-29 2017-12-29 上海与德通讯技术有限公司 Intelligent identification Method and robot
CN108172224A (en) * 2017-12-19 2018-06-15 浙江大学 The method without vocal command control voice assistant based on the defence of machine learning
CN108281158A (en) * 2018-01-12 2018-07-13 平安科技(深圳)有限公司 Voice biopsy method, server and storage medium based on deep learning
CN108320732A (en) * 2017-01-13 2018-07-24 阿里巴巴集团控股有限公司 The method and apparatus for generating target speaker's speech recognition computation model
CN108417217A (en) * 2018-01-11 2018-08-17 苏州思必驰信息科技有限公司 Speaker Identification network model training method, method for distinguishing speek person and system
GB2560620A (en) * 2017-01-20 2018-09-19 Ford Global Tech Llc Recurrent deep convolutional neural network for object detection
CN108711436A (en) * 2018-05-17 2018-10-26 哈尔滨工业大学 Speaker verification's system Replay Attack detection method based on high frequency and bottleneck characteristic
CN109065069A (en) * 2018-10-10 2018-12-21 广州市百果园信息技术有限公司 A kind of audio-frequency detection, device, equipment and storage medium
CN109147799A (en) * 2018-10-18 2019-01-04 广州势必可赢网络科技有限公司 A kind of method, apparatus of speech recognition, equipment and computer storage medium
CN109165726A (en) * 2018-08-17 2019-01-08 联智科技(天津)有限责任公司 A kind of neural network embedded system for without speaker verification's text
CN109394476A (en) * 2018-12-06 2019-03-01 上海神添实业有限公司 The automatic intention assessment of brain flesh information and upper limb intelligent control method and system
CN109448759A (en) * 2018-12-28 2019-03-08 武汉大学 A kind of anti-voice authentication spoofing attack detection method based on gas explosion sound
CN109767776A (en) * 2019-01-14 2019-05-17 广东技术师范学院 A kind of deception speech detection method based on intensive neural network
CN109920447A (en) * 2019-01-29 2019-06-21 天津大学 Recording fraud detection method based on sef-adapting filter Amplitude & Phase feature extraction
CN110110732A (en) * 2019-05-08 2019-08-09 杭州视在科技有限公司 A kind of intelligence inspection algorithm for kitchen after food and drink
CN110335591A (en) * 2019-07-04 2019-10-15 广州云从信息科技有限公司 A kind of parameter management method, device, machine readable media and equipment
CN110348189A (en) * 2019-06-17 2019-10-18 五邑大学 A kind of identity spoofing detection method and its system, device, storage medium
CN110349586A (en) * 2019-07-23 2019-10-18 北京邮电大学 Telecommunication fraud detection method and device
CN110414536A (en) * 2019-07-17 2019-11-05 北京得意音通技术有限责任公司 Data characteristics extracting method, playback detection method, storage medium and electronic equipment
CN110491391A (en) * 2019-07-02 2019-11-22 厦门大学 A kind of deception speech detection method based on deep neural network
CN110827837A (en) * 2019-10-18 2020-02-21 中山大学 Whale activity audio classification method based on deep learning
CN111028852A (en) * 2019-11-06 2020-04-17 杭州哲信信息技术有限公司 Noise removing method in intelligent calling system based on CNN
CN111243621A (en) * 2020-01-14 2020-06-05 四川大学 Construction method of GRU-SVM deep learning model for synthetic speech detection
CN111295674A (en) * 2017-11-01 2020-06-16 国际商业机器公司 Protecting cognitive systems from gradient-based attacks by using spoof gradients
CN111316668A (en) * 2017-11-14 2020-06-19 思睿逻辑国际半导体有限公司 Detection of loudspeaker playback
CN111327608A (en) * 2020-02-14 2020-06-23 中南大学 Application layer malicious request detection method and system based on cascade deep neural network
CN111418009A (en) * 2019-10-31 2020-07-14 支付宝(杭州)信息技术有限公司 Personalized speaker verification system and method
CN111755014A (en) * 2020-07-02 2020-10-09 四川长虹电器股份有限公司 Domain-adaptive replay attack detection method and system
CN111771213A (en) * 2018-02-16 2020-10-13 杜比实验室特许公司 Speech style migration
US10984083B2 (en) 2017-07-07 2021-04-20 Cirrus Logic, Inc. Authentication of user using ear biometric data
US11017252B2 (en) 2017-10-13 2021-05-25 Cirrus Logic, Inc. Detection of liveness
US11023755B2 (en) 2017-10-13 2021-06-01 Cirrus Logic, Inc. Detection of liveness
US11037574B2 (en) 2018-09-05 2021-06-15 Cirrus Logic, Inc. Speaker recognition and speaker change detection
US11042617B2 (en) 2017-07-07 2021-06-22 Cirrus Logic, Inc. Methods, apparatus and systems for biometric processes
US11042618B2 (en) 2017-07-07 2021-06-22 Cirrus Logic, Inc. Methods, apparatus and systems for biometric processes
US11042616B2 (en) 2017-06-27 2021-06-22 Cirrus Logic, Inc. Detection of replay attack
CN113362822A (en) * 2021-06-08 2021-09-07 北京计算机技术及应用研究所 Black box voice confrontation sample generation method with auditory masking
CN113555023A (en) * 2021-09-18 2021-10-26 中国科学院自动化研究所 Method for joint modeling of voice authentication and speaker recognition
US11164588B2 (en) 2017-06-28 2021-11-02 Cirrus Logic, Inc. Magnetic detection of replay attack
CN113641980A (en) * 2021-08-23 2021-11-12 北京百度网讯科技有限公司 Authentication method and apparatus, electronic device, and medium
US11264037B2 (en) 2018-01-23 2022-03-01 Cirrus Logic, Inc. Speaker identification
US11270707B2 (en) 2017-10-13 2022-03-08 Cirrus Logic, Inc. Analysing speech signals
US11276409B2 (en) 2017-11-14 2022-03-15 Cirrus Logic, Inc. Detection of replay attack
CN114283817A (en) * 2021-12-27 2022-04-05 思必驰科技股份有限公司 Speaker verification method and system
US11475899B2 (en) 2018-01-23 2022-10-18 Cirrus Logic, Inc. Speaker identification
CN115273859A (en) * 2021-04-30 2022-11-01 清华大学 Safety testing method and device for voice verification device
US11538455B2 (en) 2018-02-16 2022-12-27 Dolby Laboratories Licensing Corporation Speech style transfer
US11631402B2 (en) 2018-07-31 2023-04-18 Cirrus Logic, Inc. Detection of replay attack
US11704397B2 (en) 2017-06-28 2023-07-18 Cirrus Logic, Inc. Detection of replay attack
US11705135B2 (en) 2017-10-13 2023-07-18 Cirrus Logic, Inc. Detection of liveness
US11735189B2 (en) 2018-01-23 2023-08-22 Cirrus Logic, Inc. Speaker identification
US11748462B2 (en) 2018-08-31 2023-09-05 Cirrus Logic Inc. Biometric authentication
US11755701B2 (en) 2017-07-07 2023-09-12 Cirrus Logic Inc. Methods, apparatus and systems for authentication
US11829461B2 (en) 2017-07-07 2023-11-28 Cirrus Logic Inc. Methods, apparatus and systems for audio playback

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102436810A (en) * 2011-10-26 2012-05-02 华南理工大学 Record replay attack detection method and system based on channel mode noise
CN104954532A (en) * 2015-06-19 2015-09-30 深圳天珑无线科技有限公司 Voice recognition method, voice recognition device and mobile terminal
CN105139857A (en) * 2015-09-02 2015-12-09 广东顺德中山大学卡内基梅隆大学国际联合研究院 Countercheck method for automatically identifying speaker aiming to voice deception

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ALAN GODOY et al.: "Using Deep Learning for Detecting Spoofing Attacks on Speech Signals", arXiv *

Cited By (82)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108320732A (en) * 2017-01-13 2018-07-24 阿里巴巴集团控股有限公司 Method and apparatus for generating a target speaker's speech recognition computation model
GB2560620A (en) * 2017-01-20 2018-09-19 Ford Global Tech Llc Recurrent deep convolutional neural network for object detection
CN106875007A (en) * 2017-01-25 2017-06-20 上海交通大学 End-to-end deep neural network based on convolutional long short-term memory for voice fraud detection
CN106991999B (en) * 2017-03-29 2020-06-02 北京小米移动软件有限公司 Voice recognition method and device
CN106991999A (en) * 2017-03-29 2017-07-28 北京小米移动软件有限公司 Voice recognition method and device
CN107221320A (en) * 2017-05-19 2017-09-29 百度在线网络技术(北京)有限公司 Method, device, equipment and computer-readable storage medium for training an acoustic feature extraction model
US11042616B2 (en) 2017-06-27 2021-06-22 Cirrus Logic, Inc. Detection of replay attack
US12026241B2 (en) 2017-06-27 2024-07-02 Cirrus Logic Inc. Detection of replay attack
US11164588B2 (en) 2017-06-28 2021-11-02 Cirrus Logic, Inc. Magnetic detection of replay attack
US11704397B2 (en) 2017-06-28 2023-07-18 Cirrus Logic, Inc. Detection of replay attack
US11714888B2 (en) 2017-07-07 2023-08-01 Cirrus Logic Inc. Methods, apparatus and systems for biometric processes
US11755701B2 (en) 2017-07-07 2023-09-12 Cirrus Logic Inc. Methods, apparatus and systems for authentication
US11829461B2 (en) 2017-07-07 2023-11-28 Cirrus Logic Inc. Methods, apparatus and systems for audio playback
US11042618B2 (en) 2017-07-07 2021-06-22 Cirrus Logic, Inc. Methods, apparatus and systems for biometric processes
US11042617B2 (en) 2017-07-07 2021-06-22 Cirrus Logic, Inc. Methods, apparatus and systems for biometric processes
US10984083B2 (en) 2017-07-07 2021-04-20 Cirrus Logic, Inc. Authentication of user using ear biometric data
CN107527616A (en) * 2017-09-29 2017-12-29 上海与德通讯技术有限公司 Intelligent identification method and robot
US11705135B2 (en) 2017-10-13 2023-07-18 Cirrus Logic, Inc. Detection of liveness
US11270707B2 (en) 2017-10-13 2022-03-08 Cirrus Logic, Inc. Analysing speech signals
US11023755B2 (en) 2017-10-13 2021-06-01 Cirrus Logic, Inc. Detection of liveness
US11017252B2 (en) 2017-10-13 2021-05-25 Cirrus Logic, Inc. Detection of liveness
CN111295674B (en) * 2017-11-01 2024-01-30 国际商业机器公司 Protecting cognitive systems from gradient-based attacks by using spoof gradients
CN111295674A (en) * 2017-11-01 2020-06-16 国际商业机器公司 Protecting cognitive systems from gradient-based attacks by using spoof gradients
US11276409B2 (en) 2017-11-14 2022-03-15 Cirrus Logic, Inc. Detection of replay attack
CN111316668A (en) * 2017-11-14 2020-06-19 思睿逻辑国际半导体有限公司 Detection of loudspeaker playback
CN111316668B (en) * 2017-11-14 2021-09-28 思睿逻辑国际半导体有限公司 Detection of loudspeaker playback
US11051117B2 (en) 2017-11-14 2021-06-29 Cirrus Logic, Inc. Detection of loudspeaker playback
CN108172224A (en) * 2017-12-19 2018-06-15 浙江大学 Machine-learning-based method for defending a voice assistant against control by silent commands
CN108172224B (en) * 2017-12-19 2019-08-27 浙江大学 Machine-learning-based method for defending a voice assistant against control by silent commands
CN108417217B (en) * 2018-01-11 2021-07-13 思必驰科技股份有限公司 Speaker recognition network model training method, speaker recognition method and system
CN108417217A (en) * 2018-01-11 2018-08-17 苏州思必驰信息科技有限公司 Speaker recognition network model training method, speaker recognition method and system
CN108281158A (en) * 2018-01-12 2018-07-13 平安科技(深圳)有限公司 Voice liveness detection method based on deep learning, server and storage medium
WO2019136909A1 (en) * 2018-01-12 2019-07-18 平安科技(深圳)有限公司 Voice living-body detection method based on deep learning, server and storage medium
US11264037B2 (en) 2018-01-23 2022-03-01 Cirrus Logic, Inc. Speaker identification
US11735189B2 (en) 2018-01-23 2023-08-22 Cirrus Logic, Inc. Speaker identification
US11475899B2 (en) 2018-01-23 2022-10-18 Cirrus Logic, Inc. Speaker identification
US11694695B2 (en) 2018-01-23 2023-07-04 Cirrus Logic, Inc. Speaker identification
CN111771213B (en) * 2018-02-16 2021-10-08 杜比实验室特许公司 Speech style migration
CN111771213A (en) * 2018-02-16 2020-10-13 杜比实验室特许公司 Speech style migration
US11538455B2 (en) 2018-02-16 2022-12-27 Dolby Laboratories Licensing Corporation Speech style transfer
CN108711436A (en) * 2018-05-17 2018-10-26 哈尔滨工业大学 Speaker verification system replay attack detection method based on high frequency and bottleneck characteristics
CN108711436B (en) * 2018-05-17 2020-06-09 哈尔滨工业大学 Speaker verification system replay attack detection method based on high frequency and bottleneck characteristics
US11631402B2 (en) 2018-07-31 2023-04-18 Cirrus Logic, Inc. Detection of replay attack
CN109165726A (en) * 2018-08-17 2019-01-08 联智科技(天津)有限责任公司 Neural network embedded system for text-independent speaker verification
US11748462B2 (en) 2018-08-31 2023-09-05 Cirrus Logic Inc. Biometric authentication
US11037574B2 (en) 2018-09-05 2021-06-15 Cirrus Logic, Inc. Speaker recognition and speaker change detection
CN109065069A (en) * 2018-10-10 2018-12-21 广州市百果园信息技术有限公司 Audio detection method, device, equipment and storage medium
US11948595B2 (en) 2018-10-10 2024-04-02 Bigo Technology Pte. Ltd. Method for detecting audio, device, and storage medium
CN109147799A (en) * 2018-10-18 2019-01-04 广州势必可赢网络科技有限公司 Speech recognition method, apparatus, equipment and computer storage medium
CN109394476A (en) * 2018-12-06 2019-03-01 上海神添实业有限公司 Automatic intention recognition from brain and muscle information, and upper-limb intelligent control method and system
CN109448759A (en) * 2018-12-28 2019-03-08 武汉大学 Voice authentication anti-spoofing attack detection method based on pop noise
CN109767776A (en) * 2019-01-14 2019-05-17 广东技术师范学院 Deception voice detection method based on dense neural network
CN109767776B (en) * 2019-01-14 2023-12-15 广东技术师范大学 Deception voice detection method based on dense neural network
CN109920447B (en) * 2019-01-29 2021-07-13 天津大学 Recording fraud detection method based on adaptive filter amplitude phase characteristic extraction
CN109920447A (en) * 2019-01-29 2019-06-21 天津大学 Recording fraud detection method based on adaptive filter amplitude phase characteristic extraction
CN110110732A (en) * 2019-05-08 2019-08-09 杭州视在科技有限公司 Intelligent inspection algorithm for restaurant back kitchens
CN110348189A (en) * 2019-06-17 2019-10-18 五邑大学 Identity spoofing detection method, system, device and storage medium
CN110491391A (en) * 2019-07-02 2019-11-22 厦门大学 Deception voice detection method based on deep neural network
CN110335591A (en) * 2019-07-04 2019-10-15 广州云从信息科技有限公司 Parameter management method, device, machine-readable medium and equipment
CN110414536A (en) * 2019-07-17 2019-11-05 北京得意音通技术有限责任公司 Data feature extraction method, playback detection method, storage medium and electronic equipment
CN110414536B (en) * 2019-07-17 2022-03-25 北京得意音通技术有限责任公司 Playback detection method, storage medium, and electronic device
CN110349586A (en) * 2019-07-23 2019-10-18 北京邮电大学 Telecommunication fraud detection method and device
CN110827837A (en) * 2019-10-18 2020-02-21 中山大学 Whale activity audio classification method based on deep learning
CN110827837B (en) * 2019-10-18 2022-02-22 中山大学 Whale activity audio classification method based on deep learning
US11031018B2 (en) 2019-10-31 2021-06-08 Alipay (Hangzhou) Information Technology Co., Ltd. System and method for personalized speaker verification
US11244689B2 (en) 2019-10-31 2022-02-08 Alipay (Hangzhou) Information Technology Co., Ltd. System and method for determining voice characteristics
CN111418009B (en) * 2019-10-31 2023-09-05 支付宝(杭州)信息技术有限公司 Personalized speaker verification system and method
CN111418009A (en) * 2019-10-31 2020-07-14 支付宝(杭州)信息技术有限公司 Personalized speaker verification system and method
US10997980B2 (en) 2019-10-31 2021-05-04 Alipay (Hangzhou) Information Technology Co., Ltd. System and method for determining voice characteristics
WO2020098828A3 (en) * 2019-10-31 2020-09-03 Alipay (Hangzhou) Information Technology Co., Ltd. System and method for personalized speaker verification
CN111028852A (en) * 2019-11-06 2020-04-17 杭州哲信信息技术有限公司 Noise removing method in intelligent calling system based on CNN
CN111243621A (en) * 2020-01-14 2020-06-05 四川大学 Construction method of GRU-SVM deep learning model for synthetic speech detection
CN111327608A (en) * 2020-02-14 2020-06-23 中南大学 Application layer malicious request detection method and system based on cascade deep neural network
CN111755014A (en) * 2020-07-02 2020-10-09 四川长虹电器股份有限公司 Domain-adaptive replay attack detection method and system
CN111755014B (en) * 2020-07-02 2022-06-03 四川长虹电器股份有限公司 Domain-adaptive replay attack detection method and system
CN115273859A (en) * 2021-04-30 2022-11-01 清华大学 Safety testing method and device for voice verification device
CN115273859B (en) * 2021-04-30 2024-05-28 清华大学 Safety testing method and device for voice verification device
CN113362822A (en) * 2021-06-08 2021-09-07 北京计算机技术及应用研究所 Black-box voice adversarial sample generation method with auditory masking
CN113641980A (en) * 2021-08-23 2021-11-12 北京百度网讯科技有限公司 Authentication method and apparatus, electronic device, and medium
CN113555023A (en) * 2021-09-18 2021-10-26 中国科学院自动化研究所 Method for joint modeling of voice authentication and speaker recognition
CN113555023B (en) * 2021-09-18 2022-01-11 中国科学院自动化研究所 Method for joint modeling of voice authentication and speaker recognition
CN114283817A (en) * 2021-12-27 2022-04-05 思必驰科技股份有限公司 Speaker verification method and system

Also Published As

Publication number Publication date
CN105869630B (en) 2019-08-02

Similar Documents

Publication Publication Date Title
CN105869630B (en) Speaker's voice spoofing attack detection method and system based on deep learning
CN104732978B (en) Text-dependent speaker recognition method based on joint deep learning
CN110491391B (en) Deception voice detection method based on deep neural network
CN108231067A (en) Acoustic scene recognition method based on convolutional neural networks and random forest classification
CN102034288B (en) Multiple biological characteristic identification-based intelligent door control system
CN107886943A (en) Voiceprint recognition method and device
CN110428843A (en) Deep learning method for voice gender identification
CN106448684A (en) Deep-belief-network-characteristic-vector-based channel-robust voiceprint recognition system
CN102968990B (en) Speaker identifying method and system
CN104916289A (en) Quick acoustic event detection method under vehicle-driving noise environment
CN106898355B (en) Speaker identification method based on secondary modeling
CN111611566B (en) Speaker verification system and replay attack detection method thereof
CN104978507A (en) Intelligent well logging evaluation expert system identity authentication method based on voiceprint recognition
CN110211604A (en) Deep residual network structure for voice deformation detection
CN108198561A (en) Pirated-recording speech detection method based on convolutional neural networks
CN111613240B (en) Camouflage voice detection method based on attention mechanism and Bi-LSTM
CN111816185A (en) Method and device for identifying speaker in mixed voice
CN104221079A (en) Modified Mel filter bank structure using spectral characteristics for sound analysis
CN113343198B (en) Video-based random gesture authentication method and system
CN105513598A (en) Playback voice detection method based on distribution of information quantity in frequency domain
Gomez-Alanis et al. Performance evaluation of front-and back-end techniques for ASV spoofing detection systems based on deep features
Gautam et al. Biometric system from heart sound using wavelet based feature set
CN109920447B (en) Recording fraud detection method based on adaptive filter amplitude phase characteristic extraction
CN107274912A (en) Device source identification method for mobile phone recordings
Woubie et al. Voice quality features for replay attack detection

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20200617

Address after: Room 105G, 199 GuoShoujing Road, Pudong New Area, Shanghai, 200120

Patentee after: Shanghai Jiaotong University Intellectual Property Management Co.,Ltd.

Address before: No. 800 Dongchuan Road, Shanghai, 200240

Patentee before: Shanghai Jiao Tong University

TR01 Transfer of patent right

Effective date of registration: 20201028

Address after: 215123 14 Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou, Jiangsu.

Patentee after: AI SPEECH Ltd.

Address before: Room 105G, 199 GuoShoujing Road, Pudong New Area, Shanghai, 200120

Patentee before: Shanghai Jiaotong University Intellectual Property Management Co.,Ltd.

CP01 Change in the name or title of a patent holder

Address after: 215123 14 Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou, Jiangsu.

Patentee after: Sipic Technology Co.,Ltd.

Address before: 215123 14 Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou, Jiangsu.

Patentee before: AI SPEECH Ltd.
