CN105869630B - Speaker's voice spoofing attack detection method and system based on deep learning - Google Patents


Info

Publication number
CN105869630B
Authority
CN
China
Prior art keywords
neural network
depth
voice
speaker
audio
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610478041.0A
Other languages
Chinese (zh)
Other versions
CN105869630A (en)
Inventor
钱彦旻
陈楠昕
俞凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sipic Technology Co Ltd
Original Assignee
Shanghai Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiaotong University
Priority to CN201610478041.0A
Publication of CN105869630A
Application granted
Publication of CN105869630B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/14 Speech classification or search using statistical models, e.g. Hidden Markov Models [HMMs]
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/16 Speech classification or search using artificial neural networks

Abstract

A deep-learning-based method and system for detecting voice spoofing attacks on speakers. An audio training set is constructed, and a deep feedforward neural network and a deep recurrent neural network are initialized and trained on the multi-frame feature vectors and the single-frame vector sequences of the training set, respectively. In the test phase, the frame-level and sequence-level feature vectors of the audio under test are fed into two trained linear discriminant analysis (LDA) models, respectively; the two resulting scores are weighted to form the final score, which is compared with a predefined threshold to discriminate spoofed speech. The invention captures local features while also retaining global information. Using linear discriminant analysis as the classifier in the verification stage and deciding by score fusion can greatly improve the accuracy of voice spoofing detection.

Description

Speaker's voice spoofing attack detection method and system based on deep learning
Technical field
The present invention relates to a technology in the field of intelligent speech, specifically a deep-learning-based method and system for detecting voice spoofing attacks on speakers.
Background art
A voice spoofing attack is a technique that forges the voice of a specific target in order to attack an automatic speaker recognition system. Speaker recognition technology is now widely used in many fields, such as identity authentication, Internet security, human-computer interaction, banking and securities systems, and military and criminal investigation. In recent years, attacks on speaker recognition systems have fallen into four main classes: impersonation, replay, speech synthesis, and voice conversion. Studies show that the main problem of traditional voice spoofing attack detection lies in feature extraction: existing feature extraction methods have many shortcomings in how well they represent voice characteristics and in robustness.
In the existing techniques of recent years for detecting and identifying voice spoofing attacks, the feature parameters commonly used in the feature extraction stage are mainly spectral features, phase features, cochlea-based auditory features, and perceptual features. These feature extraction methods still fall short in characterizing the difference between genuine and spoofed speech, which reduces detection accuracy. In addition, these methods all rely on the auditory features of the speech signal and lose its dynamic features, so robustness is poor and recognition performance is unsatisfactory.
In the recognition model stage, the mainstream methods are Gaussian mixture models (GMM) and support vector machines (SVM). Both methods are suited to handling continuous signals, but they are limited by their training criteria and have weak representational power: they can only separate classes of samples that are easy to tell apart, so recognition performance is poor.
Summary of the invention
The present invention addresses the shortcomings of existing traditional voice spoofing attack detection methods, namely that the extracted features cannot accurately characterize what distinguishes spoofed speech from genuine speech, that the dynamic features of the speech signal are lost, that robustness is poor, and that recognition performance is unsatisfactory. It proposes a deep-learning-based method and system for detecting voice spoofing attacks on speakers, in which feature vectors are extracted with deep learning models. Two different frameworks are used in the feature extraction stage: a frame-level feature representation based on a deep feedforward neural network, and a sequence-level feature representation based on a deep recurrent neural network. Together they capture local features and also retain global information. In the verification stage, linear discriminant analysis (LDA) is used as the classifier and the decision is made by score fusion. The invention can greatly improve the accuracy of voice spoofing detection.
The present invention is achieved by the following technical solutions:
The present invention relates to a deep-learning-based method for detecting voice spoofing attacks on speakers: an audio training set is constructed, and a deep feedforward neural network and a deep recurrent neural network are initialized and trained on the multi-frame feature vectors and the single-frame vector sequences of the training set, respectively. In the test phase, the frame-level and sequence-level feature vectors of the audio under test are fed into two trained linear discriminant analysis (LDA) models, respectively; the two resulting scores are weighted to form the final score, which is compared with a predefined threshold to discriminate spoofed speech.
Training the deep feedforward neural network and the deep recurrent neural network specifically means the following. The deep feedforward neural network is trained on the multi-frame acoustic features of the enrollment audio extracted with a Mel filter bank, i.e. filter-bank features; the audio training set is then passed through the deep feedforward neural network, and the frame-level feature vectors of the audio are taken from the last hidden layer of the network. The deep recurrent neural network is trained on the acoustic features of the enrollment audio extracted with the multi-frame Mel filter bank; the features are then normalized, and the sequence-level feature vector of the audio is taken from the last hidden layer of the deep recurrent neural network.
When training the deep feedforward neural network and the deep recurrent neural network, the learning rate used during back-propagation is determined by simulated annealing and an early-stopping strategy.
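The text names simulated annealing and early stopping but gives no schedule. The following is a minimal sketch under assumed rules: the learning rate is halved whenever the validation loss fails to improve, and training stops after a few consecutive stalled epochs. The function name `anneal_and_stop` and the values of `initial_lr`, `decay`, `patience`, and `min_delta` are illustrative assumptions, not taken from the patent.

```python
def anneal_and_stop(val_losses, initial_lr=0.01, decay=0.5, patience=3, min_delta=1e-4):
    """Replay per-epoch validation losses under an assumed annealing rule.

    Returns (learning rate used in each epoch, index of the last epoch run).
    """
    lr = initial_lr
    best = float("inf")
    bad_epochs = 0
    rates = []
    for epoch, loss in enumerate(val_losses):
        rates.append(lr)
        if loss < best - min_delta:
            # Validation loss improved: keep the current learning rate.
            best = loss
            bad_epochs = 0
        else:
            # Stalled epoch: shrink the step size (assumed annealing rule).
            bad_epochs += 1
            lr *= decay
            if bad_epochs >= patience:
                return rates, epoch  # early stopping
    return rates, len(val_losses) - 1


# Hypothetical validation losses, for illustration only.
rates, stop_epoch = anneal_and_stop([1.0, 0.8, 0.7, 0.71, 0.70, 0.72, 0.69])
```

With these made-up losses the rate stays at its initial value while the loss falls, is halved after each stalled epoch, and training ends before the final epoch is reached.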
The multi-frame input refers to a 31-frame window, i.e. the current frame plus 15 frames on each side.
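The 31-frame window can be illustrated with a short sketch that stacks each frame with its 15 left and 15 right neighbours. The 40-dimensional feature size and the edge-frame repetition used for padding at utterance boundaries are assumptions; the patent does not say how boundaries are handled.

```python
import numpy as np


def splice_frames(features, left=15, right=15):
    """Stack each frame with its neighbours: (T, D) -> (T, (left + 1 + right) * D)."""
    features = np.asarray(features, dtype=float)
    num_frames = features.shape[0]
    # Assumption: repeat the first and last frame so every frame has full context.
    padded = np.pad(features, ((left, right), (0, 0)), mode="edge")
    width = left + 1 + right
    windows = [padded[t:t + width].reshape(-1) for t in range(num_frames)]
    return np.stack(windows)


# Toy input: 100 frames of 40-dimensional filter-bank features (random values).
feats = np.random.default_rng(0).standard_normal((100, 40))
spliced = splice_frames(feats)  # one 31 * 40 = 1240-dimensional vector per frame
```

Each row of `spliced` is the multi-frame vector for one time step, with the current frame in the middle of the window.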
The acoustic feature, the i.e. acoustic feature of Mel filter group, by passing through one group of Mel filter on frequency domain Audio signal to be detected be filtered, obtain one group of filtered array, i.e. Mel frequency spectrum, wherein each bandpass filter A Filter-bank coefficient is exported, the length of array is equal to the number of filter in Mel filter group.
The Mel filter, using but be not limited to triangle window filter.
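As a rough illustration of the filter-bank computation described above, the sketch below builds triangular filters spaced evenly on the Mel scale and applies them to the power spectrum of one frame. The Hz-to-Mel formula, the 40 filters, the 512-point FFT, the 16 kHz sampling rate, and the final logarithm are common choices assumed here; the patent does not fix these values.

```python
import numpy as np


def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(num_filters, fft_size, sample_rate):
    """Triangular filters, shape (num_filters, fft_size // 2 + 1)."""
    num_bins = fft_size // 2 + 1
    # Filter edges are equally spaced on the Mel scale from 0 Hz to Nyquist.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), num_filters + 2)
    hz_points = mel_to_hz(mel_points)
    bin_freqs = np.linspace(0.0, sample_rate / 2.0, num_bins)
    bank = np.zeros((num_filters, num_bins))
    for m in range(num_filters):
        low, centre, high = hz_points[m], hz_points[m + 1], hz_points[m + 2]
        rising = (bin_freqs - low) / (centre - low)
        falling = (high - bin_freqs) / (high - centre)
        bank[m] = np.maximum(0.0, np.minimum(rising, falling))
    return bank


def log_filterbank_features(frame, bank, fft_size):
    """One log filter-bank coefficient per filter for a single windowed frame."""
    power = np.abs(np.fft.rfft(frame, n=fft_size)) ** 2
    return np.log(bank @ power + 1e-10)


bank = mel_filterbank(num_filters=40, fft_size=512, sample_rate=16000)
# Toy frame: 25 ms of a 1 kHz tone at 16 kHz, Hamming-windowed.
frame = np.hamming(400) * np.sin(2 * np.pi * 1000 * np.arange(400) / 16000)
coeffs = log_filterbank_features(frame, bank, fft_size=512)
```

The output array has one entry per filter, matching the statement that its length equals the number of filters in the bank.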
The deep feedforward neural network contains several hidden layers with full connections between them; its parameters are randomly initialized and trained with the back-propagation algorithm.
The deep recurrent neural network contains several hidden layers which, in addition to the full connections, also have connections from each hidden layer to itself; these propagate information from the previous time step and thereby retain memory.
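The hidden-to-hidden self-connection can be sketched with a single recurrent layer. The tanh non-linearity, the layer sizes, the random weights, and the choice of the final time step's hidden state as the sequence-level vector are all assumptions made for illustration; the patent only states that the sequence-level vector comes from the last hidden layer.

```python
import numpy as np


def rnn_hidden_states(frames, w_in, w_rec, bias):
    """Run one recurrent layer over a (T, D) sequence; returns (T, H) hidden states."""
    hidden = np.zeros(w_rec.shape[0])
    states = []
    for x_t in frames:
        # w_rec @ hidden is the self-connection carrying the previous step's information.
        hidden = np.tanh(w_in @ x_t + w_rec @ hidden + bias)
        states.append(hidden)
    return np.stack(states)


rng = np.random.default_rng(0)
frames = rng.standard_normal((50, 40))        # 50 single-frame feature vectors
w_in = 0.1 * rng.standard_normal((64, 40))    # input-to-hidden weights
w_rec = 0.1 * rng.standard_normal((64, 64))   # hidden-to-hidden weights
states = rnn_hidden_states(frames, w_in, w_rec, np.zeros(64))
sequence_vector = states[-1]                  # assumed sequence-level representation
```

A deep recurrent network would stack several such layers, feeding the hidden states of one layer as the input sequence of the next.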
The different nodes of the network output layer represent the different attack types or genuine human speech. The whole neural network classifies the input speech, with cross entropy as the objective function.
The frame-level and sequence-level feature vectors are output by the deep feedforward neural network and the deep recurrent neural network, respectively, and are preferably length-normalized so that all vectors have the same L2 norm.
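Length normalization to a common L2 norm can be written in a few lines. Scaling to unit norm and the small `eps` guard against division by zero are assumptions; the patent only requires that the vectors share the same L2 norm.

```python
import numpy as np


def length_normalize(vectors, eps=1e-12):
    """Scale each row vector to unit L2 norm."""
    vectors = np.asarray(vectors, dtype=float)
    norms = np.linalg.norm(vectors, axis=-1, keepdims=True)
    return vectors / np.maximum(norms, eps)


normalized = length_normalize([[3.0, 4.0], [0.5, 0.0]])
```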
The two trained linear discriminant analysis (LDA) models are obtained as follows: the frame-level and sequence-level feature vectors taken from the last hidden layer of the deep feedforward neural network and of the deep recurrent neural network are used to train two LDA models, respectively. In the LDA model, the density of each class is modeled by a multivariate Gaussian distribution:

$$f_k(x) = \frac{1}{(2\pi)^{p/2}\,|\Sigma_k|^{1/2}} \exp\left(-\frac{1}{2}(x-\mu_k)^{T}\,\Sigma_k^{-1}\,(x-\mu_k)\right)$$

where $p$ is the dimension of the feature vector $x$, and $\Sigma_k$ and $\mu_k$ are the covariance matrix and mean of the k-th class, respectively. The LDA model assumes $\Sigma_k = \Sigma$ for all $k$, and the posterior probability is given by Bayes' formula:

$$\Pr(G = k \mid X = x) = \frac{f_k(x)\,\pi_k}{\sum_{t=1}^{K} f_t(x)\,\pi_t}$$

where $\pi_k$ is the prior probability of the k-th class and $K$ is the number of classes.
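The LDA classifier described above can be sketched directly from its definition: estimate class priors, class means, and one covariance matrix shared by all classes, then apply Bayes' rule. The pooled-covariance estimator, the function names, and the two-class toy data are assumptions for illustration; the patent does not describe the estimation procedure.

```python
import numpy as np


def fit_lda(features, labels, num_classes):
    """Estimate class priors, class means, and one pooled covariance matrix."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    dim = features.shape[1]
    priors = np.zeros(num_classes)
    means = np.zeros((num_classes, dim))
    pooled = np.zeros((dim, dim))
    for k in range(num_classes):
        x_k = features[labels == k]
        priors[k] = len(x_k) / len(features)
        means[k] = x_k.mean(axis=0)
        centred = x_k - means[k]
        pooled += centred.T @ centred
    pooled /= len(features) - num_classes
    return priors, means, pooled


def lda_posteriors(x, priors, means, pooled):
    """Posterior probability of every class for one vector x, via Bayes' rule."""
    precision = np.linalg.inv(pooled)
    diffs = x - means  # shape (K, dim)
    # The Gaussian normalising constant is shared by all classes and cancels.
    log_scores = np.log(priors) - 0.5 * np.einsum("kd,de,ke->k", diffs, precision, diffs)
    log_scores -= log_scores.max()  # numerical stability
    scores = np.exp(log_scores)
    return scores / scores.sum()


# Toy data: class 0 stands in for genuine speech, class 1 for one attack type.
rng = np.random.default_rng(0)
genuine = rng.standard_normal((200, 2)) + np.array([2.0, 0.0])
spoofed = rng.standard_normal((200, 2)) + np.array([-2.0, 0.0])
features = np.vstack([genuine, spoofed])
labels = np.array([0] * 200 + [1] * 200)
priors, means, pooled = fit_lda(features, labels, num_classes=2)
posterior = lda_posteriors(np.array([2.0, 0.0]), priors, means, pooled)
```

Because the covariance is shared, the log-posterior differences between classes are linear in x, which is what makes the decision boundaries linear.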
The score weights of the two LDA models are preferably tuned according to their performance on the development set.
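One simple way to tune the weights on a development set is a grid search over a single mixing weight. The sketch below assumes the two scores are combined as a convex combination and that the criterion is the number of wrong decisions at a fixed threshold; both are assumptions, as are the function name and the toy scores.

```python
def tune_frame_weight(frame_scores, sequence_scores, labels_genuine, threshold, steps=11):
    """Grid-search the frame-level weight that minimises development-set errors."""
    best_weight, best_errors = 0.0, len(labels_genuine) + 1
    for i in range(steps):
        weight = i / (steps - 1)
        errors = 0
        for f, s, genuine in zip(frame_scores, sequence_scores, labels_genuine):
            fused = weight * f + (1.0 - weight) * s
            if (fused >= threshold) != genuine:
                errors += 1
        if errors < best_errors:
            best_weight, best_errors = weight, errors
    return best_weight, best_errors


# Hypothetical development-set scores, for illustration only.
best_weight, best_errors = tune_frame_weight(
    frame_scores=[0.9, 0.2, 0.6, 0.4],
    sequence_scores=[0.1, 0.9, 0.8, 0.3],
    labels_genuine=[True, False, True, False],
    threshold=0.5,
)
```

In practice the criterion would more likely be an error rate computed over many trials, but the search structure stays the same.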
The number of classes equals the number of output-layer nodes of the neural network, i.e. the number of attack types plus one.
The present invention also relates to a deep-learning-based system for detecting voice spoofing attacks on speakers, comprising a log-spectrum feature extraction module, a deep neural network module, and a linear discriminant module, wherein: the log-spectrum feature extraction module is connected to the deep neural network module and transmits the acoustic feature information of the audio under test; the deep neural network module outputs feature vector information, derived from the acoustic feature information, to the linear discriminant module for training; and after training, the linear discriminant module judges and scores the feature vector information of the audio under test, thereby detecting voice spoofing.
Technical effect
Compared with the prior art, the feature vectors extracted by deep learning as proposed in the present invention represent the speaker's voice characteristics more accurately. In the classification stage, the linear discriminant analysis (LDA) model used as the classifier reduces within-class differences and widens the gap between classes, giving good recognition performance, strong robustness, and a large accuracy improvement over existing methods. The technical effects of the invention include:
1) recognition accuracy is greatly improved over existing methods;
2) the extracted features characterize the speaker's individual traits more accurately;
3) the deep learning training strategy avoids over-fitting of the network;
4) deep learning makes the features more discriminative;
5) robustness is stronger across different channels and environments.
In addition, the invention is more robust under unknown, complex conditions.
Description of the drawings
Fig. 1 is a flow diagram of the present invention.
Detailed description of embodiments
Embodiment 1
This embodiment is tested on the recently released ASVspoof 2015 data set and compared with existing baseline methods; the results are shown in Table 1. The proposed method achieves the best results to date.
The system for detecting voice spoofing attacks on speakers in this embodiment comprises a log-spectrum feature extraction module, a deep neural network module, and a linear discriminant module, wherein: the log-spectrum feature extraction module is connected to the deep neural network module and transmits the acoustic feature information of the audio under test; the deep neural network module outputs feature vector information, derived from the acoustic feature information, to the linear discriminant module for training; and after training, the linear discriminant module judges and scores the feature vector information of the audio under test, thereby detecting voice spoofing.
The detection procedure of this embodiment using the above system is as follows:
Step 1) Construct the audio training set (the training set of ASVspoof 2015) and randomly initialize the deep neural networks, which consist of the deep feedforward neural network and the deep recurrent neural network.
The loss function of the deep neural networks is cross entropy with an L2-norm weight decay term whose coefficient is 10⁻⁶.
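The loss just described can be sketched as mean cross entropy plus a small L2 penalty on the weights. Whether the coefficient multiplies the sum of squared weights or half of it is not stated in the patent; the sketch assumes the former. The six output classes and the weight shape in the toy example are also assumptions.

```python
import numpy as np


def softmax(logits):
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def loss_with_weight_decay(logits, labels, weights, decay=1e-6):
    """Mean cross entropy plus decay * (sum of squared weights)."""
    probs = softmax(logits)
    cross_entropy = -np.mean(np.log(probs[np.arange(len(labels)), labels]))
    penalty = decay * sum(np.sum(w ** 2) for w in weights)
    return cross_entropy + penalty


# Toy example: two samples, six output classes, one 3 x 3 weight matrix.
logits = np.zeros((2, 6))
labels = np.array([0, 5])
weights = [np.ones((3, 3))]
loss = loss_with_weight_decay(logits, labels, weights)
```

With uniform logits the cross-entropy term equals the logarithm of the number of classes, and the penalty adds nine times the decay coefficient.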
Random initialization means that the initial values of the network parameters are drawn at random. The back-propagation algorithm based on stochastic gradient descent (SGD) is used to adjust the parameters of the deep feedforward neural network, and the SGD-based back-propagation through time (BPTT) algorithm is used to adjust the parameters of the deep recurrent neural network.
Step 2) Training phase: the deep feedforward neural network is trained on the multi-frame feature vectors of the training audio, with a window of 31 frames (15 frames of context on each side); the deep recurrent neural network is trained on the single-frame vector sequences of the training audio using the SGD-based BPTT algorithm. The learning rate is determined by simulated annealing and early stopping; training uses cross entropy with a weight decay term of 10⁻⁶. After network training is complete, the training audio is passed through the deep feedforward neural network and the deep recurrent neural network, and the frame-level and sequence-level feature vectors obtained from the last hidden layer of each network are used to train two linear discriminant models. Finally, the score weights of the two models are tuned according to their performance on the development set.
Step 3) Test phase: the frame-level and sequence-level feature vectors of the audio under test are computed and fed into the corresponding trained linear discriminant analysis models; the two resulting scores are weighted to form the final score, which is compared with the threshold obtained in training to discriminate spoofed speech.
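The final decision in step 3 reduces to a weighted sum and a threshold comparison. The sketch below assumes a convex combination of the two scores and that higher scores indicate genuine speech; how the frame-level scores of one utterance are pooled into a single score is not specified in the patent and is left out. The function names and numeric values are illustrative.

```python
def fuse_scores(frame_score, sequence_score, frame_weight=0.5):
    """Weighted sum of the two LDA scores for one utterance."""
    return frame_weight * frame_score + (1.0 - frame_weight) * sequence_score


def is_genuine(frame_score, sequence_score, threshold, frame_weight=0.5):
    """Accept the utterance as genuine when the fused score reaches the threshold."""
    return fuse_scores(frame_score, sequence_score, frame_weight) >= threshold


fused = fuse_scores(0.9, 0.5, frame_weight=0.25)
decision = is_genuine(frame_score=0.9, sequence_score=0.5, threshold=0.55, frame_weight=0.25)
```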
A more detailed comparison between the present invention and existing algorithms is as follows:
The above specific implementation may be locally adjusted in different ways by those skilled in the art without departing from the principle and purpose of the invention. The scope of protection of the invention is defined by the claims and is not limited by the above specific implementation; every implementation within that scope is bound by the invention.

Claims (9)

1. A deep-learning-based method for detecting voice spoofing attacks on speakers, characterized in that an audio training set is constructed, and a deep feedforward neural network and a deep recurrent neural network are initialized and trained on the multi-frame feature vectors and the single-frame vector sequences of the training set, respectively; in the test phase, the frame-level and sequence-level feature vectors of the audio under test are fed into two trained linear discriminant analysis models, respectively, the two resulting scores are weighted to form the final score, and the final score is compared with a predefined threshold to discriminate spoofed speech;
The two trained linear discriminant analysis models are obtained as follows: the frame-level and sequence-level feature vectors taken from the last hidden layer of the deep feedforward neural network and of the deep recurrent neural network are used to train two linear discriminant models, respectively; in the linear discriminant analysis model, the density of each class is modeled by a multivariate Gaussian distribution:

$$f_k(x) = \frac{1}{(2\pi)^{p/2}\,|\Sigma_k|^{1/2}} \exp\left(-\frac{1}{2}(x-\mu_k)^{T}\,\Sigma_k^{-1}\,(x-\mu_k)\right)$$

where $x$ denotes the speech feature of each frame, $p$ denotes the dimension of the feature variable, and $\Sigma_k$ and $\mu_k$ are the covariance matrix and mean of the k-th class, respectively; the linear discriminant analysis model assumes $\Sigma_k = \Sigma$ for all $k$, where $\Sigma$ denotes the covariance matrix, which expresses the degree of correlation between the variables of each dimension, and $K$ denotes the total number of Gaussians; the posterior probability is given by Bayes' formula:

$$\Pr(G = k \mid X = x) = \frac{f_k(x)\,\pi_k}{\sum_{t=1}^{K} f_t(x)\,\pi_t}$$

where $\pi_k$ is the prior probability of the k-th class, $G$ denotes the Gaussian index, $X$ denotes the observed feature vector, and $\pi_t$ denotes the weight of the t-th Gaussian component.
2. The method for detecting voice spoofing attacks on speakers according to claim 1, characterized in that training the deep feedforward neural network and the deep recurrent neural network specifically means: the deep feedforward neural network is trained on the multi-frame acoustic features of the enrollment audio extracted with a Mel filter bank, i.e. filter-bank features, and the audio training set is then passed through the deep feedforward neural network, the frame-level feature vectors of the audio being taken from the last hidden layer of the network; the deep recurrent neural network is trained on the acoustic features of the enrollment audio extracted with the multi-frame Mel filter bank, the features are then normalized, and the sequence-level feature vector of the audio is taken from the last hidden layer of the deep recurrent neural network.
3. The method for detecting voice spoofing attacks on speakers according to claim 1 or 2, characterized in that, when training the deep feedforward neural network and the deep recurrent neural network, the learning rate used during back-propagation is determined by simulated annealing and an early-stopping strategy.
4. The method for detecting voice spoofing attacks on speakers according to claim 2, characterized in that the acoustic features, i.e. the Mel filter-bank features, are obtained by filtering the audio signal under test in the frequency domain with a bank of Mel filters, which yields an array of filtered values, i.e. the Mel spectrum, wherein each band-pass filter outputs one filter-bank coefficient and the length of the array equals the number of filters in the Mel filter bank.
5. The method for detecting voice spoofing attacks on speakers according to claim 1 or 2, characterized in that the deep feedforward neural network contains several hidden layers with full connections between them, its parameters being randomly initialized and trained with the back-propagation algorithm; the deep recurrent neural network contains several hidden layers which, in addition to the full connections, also have connections from each hidden layer to itself, which propagate information from the previous time step and thereby retain memory.
6. The method for detecting voice spoofing attacks on speakers according to claim 1, characterized in that the frame-level and sequence-level feature vectors are output by the deep feedforward neural network and the deep recurrent neural network, respectively, and are length-normalized so that all vectors have the same L2 norm.
7. The method for detecting voice spoofing attacks on speakers according to claim 1, characterized in that the score weights of the two linear discriminant analysis models are tuned according to their performance on the development set.
8. The method for detecting voice spoofing attacks on speakers according to claim 1, characterized in that the number of classes equals the number of output-layer nodes of the neural network, i.e. the number of attack types plus one.
9. A deep-learning-based system for detecting voice spoofing attacks on speakers that implements the method of any one of claims 1 to 8, characterized by comprising a log-spectrum feature extraction module, a deep neural network module, and a linear discriminant module, wherein: the log-spectrum feature extraction module is connected to the deep neural network module and transmits the acoustic feature information of the audio under test; the deep neural network module outputs feature vector information, derived from the acoustic feature information, to the linear discriminant module for training; and after training, the linear discriminant module judges and scores the feature vector information of the audio under test, thereby detecting voice spoofing.
CN201610478041.0A 2016-06-27 2016-06-27 Speaker's voice spoofing attack detection method and system based on deep learning Active CN105869630B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610478041.0A CN105869630B (en) 2016-06-27 2016-06-27 Speaker's voice spoofing attack detection method and system based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610478041.0A CN105869630B (en) 2016-06-27 2016-06-27 Speaker's voice spoofing attack detection method and system based on deep learning

Publications (2)

Publication Number Publication Date
CN105869630A CN105869630A (en) 2016-08-17
CN105869630B true CN105869630B (en) 2019-08-02

Family

ID=56655288

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610478041.0A Active CN105869630B (en) 2016-06-27 2016-06-27 Speaker's voice spoofing attack detection method and system based on deep learning

Country Status (1)

Country Link
CN (1) CN105869630B (en)

Families Citing this family (55)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108320732A (en) * 2017-01-13 2018-07-24 阿里巴巴集团控股有限公司 The method and apparatus for generating target speaker's speech recognition computation model
US20180211403A1 (en) * 2017-01-20 2018-07-26 Ford Global Technologies, Llc Recurrent Deep Convolutional Neural Network For Object Detection
CN106875007A (en) * 2017-01-25 2017-06-20 上海交通大学 End-to-end deep neural network is remembered based on convolution shot and long term for voice fraud detection
CN106991999B (en) * 2017-03-29 2020-06-02 北京小米移动软件有限公司 Voice recognition method and device
CN107221320A (en) * 2017-05-19 2017-09-29 百度在线网络技术(北京)有限公司 Train method, device, equipment and the computer-readable storage medium of acoustic feature extraction model
GB2578386B (en) 2017-06-27 2021-12-01 Cirrus Logic Int Semiconductor Ltd Detection of replay attack
GB201713697D0 (en) 2017-06-28 2017-10-11 Cirrus Logic Int Semiconductor Ltd Magnetic detection of replay attack
GB2563953A (en) 2017-06-28 2019-01-02 Cirrus Logic Int Semiconductor Ltd Detection of replay attack
GB201801528D0 (en) 2017-07-07 2018-03-14 Cirrus Logic Int Semiconductor Ltd Method, apparatus and systems for biometric processes
GB201801527D0 (en) 2017-07-07 2018-03-14 Cirrus Logic Int Semiconductor Ltd Method, apparatus and systems for biometric processes
GB201801532D0 (en) 2017-07-07 2018-03-14 Cirrus Logic Int Semiconductor Ltd Methods, apparatus and systems for audio playback
GB201801530D0 (en) 2017-07-07 2018-03-14 Cirrus Logic Int Semiconductor Ltd Methods, apparatus and systems for authentication
GB201801526D0 (en) 2017-07-07 2018-03-14 Cirrus Logic Int Semiconductor Ltd Methods, apparatus and systems for authentication
CN107527616A (en) * 2017-09-29 2017-12-29 上海与德通讯技术有限公司 Intelligent identification Method and robot
GB2567503A (en) 2017-10-13 2019-04-17 Cirrus Logic Int Semiconductor Ltd Analysing speech signals
GB201801664D0 (en) 2017-10-13 2018-03-21 Cirrus Logic Int Semiconductor Ltd Detection of liveness
GB201801663D0 (en) 2017-10-13 2018-03-21 Cirrus Logic Int Semiconductor Ltd Detection of liveness
GB201801661D0 (en) 2017-10-13 2018-03-21 Cirrus Logic International Uk Ltd Detection of liveness
GB201804843D0 (en) 2017-11-14 2018-05-09 Cirrus Logic Int Semiconductor Ltd Detection of replay attack
US10657259B2 (en) * 2017-11-01 2020-05-19 International Business Machines Corporation Protecting cognitive systems from gradient based attacks through the use of deceiving gradients
GB201801659D0 (en) 2017-11-14 2018-03-21 Cirrus Logic Int Semiconductor Ltd Detection of loudspeaker playback
CN108172224B (en) * 2017-12-19 2019-08-27 浙江大学 Method based on the defence of machine learning without vocal command control voice assistant
CN108417217B (en) * 2018-01-11 2021-07-13 思必驰科技股份有限公司 Speaker recognition network model training method, speaker recognition method and system
CN108281158A (en) * 2018-01-12 2018-07-13 平安科技(深圳)有限公司 Voice biopsy method, server and storage medium based on deep learning
US11475899B2 (en) 2018-01-23 2022-10-18 Cirrus Logic, Inc. Speaker identification
US11264037B2 (en) 2018-01-23 2022-03-01 Cirrus Logic, Inc. Speaker identification
US11735189B2 (en) 2018-01-23 2023-08-22 Cirrus Logic, Inc. Speaker identification
US11538455B2 (en) 2018-02-16 2022-12-27 Dolby Laboratories Licensing Corporation Speech style transfer
EP3752964B1 (en) * 2018-02-16 2023-06-28 Dolby Laboratories Licensing Corporation Speech style transfer
CN108711436B (en) * 2018-05-17 2020-06-09 哈尔滨工业大学 Speaker verification system replay attack detection method based on high frequency and bottleneck characteristics
US10692490B2 (en) 2018-07-31 2020-06-23 Cirrus Logic, Inc. Detection of replay attack
CN109165726A (en) * 2018-08-17 2019-01-08 联智科技(天津)有限责任公司 A kind of neural network embedded system for without speaker verification's text
US10915614B2 (en) 2018-08-31 2021-02-09 Cirrus Logic, Inc. Biometric authentication
US11037574B2 (en) 2018-09-05 2021-06-15 Cirrus Logic, Inc. Speaker recognition and speaker change detection
CN109065069B (en) 2018-10-10 2020-09-04 广州市百果园信息技术有限公司 Audio detection method, device, equipment and storage medium
CN109147799A (en) * 2018-10-18 2019-01-04 广州势必可赢网络科技有限公司 A kind of method, apparatus of speech recognition, equipment and computer storage medium
CN109394476B (en) * 2018-12-06 2021-01-19 上海神添实业有限公司 Method and system for automatic intention recognition of brain muscle information and intelligent control of upper limbs
CN109448759A (en) * 2018-12-28 2019-03-08 武汉大学 A kind of anti-voice authentication spoofing attack detection method based on gas explosion sound
CN109767776B (en) * 2019-01-14 2023-12-15 广东技术师范大学 Deception voice detection method based on dense neural network
CN109920447B (en) * 2019-01-29 2021-07-13 天津大学 Recording fraud detection method based on adaptive filter amplitude phase characteristic extraction
CN110110732B (en) * 2019-05-08 2020-04-28 杭州视在科技有限公司 Intelligent inspection method for catering kitchen
CN110348189A (en) * 2019-06-17 2019-10-18 五邑大学 A kind of identity spoofing detection method and its system, device, storage medium
CN110491391B (en) * 2019-07-02 2021-09-17 厦门大学 Deception voice detection method based on deep neural network
CN110335591A (en) * 2019-07-04 2019-10-15 广州云从信息科技有限公司 A kind of parameter management method, device, machine readable media and equipment
CN110414536B (en) * 2019-07-17 2022-03-25 北京得意音通技术有限责任公司 Playback detection method, storage medium, and electronic device
CN110349586B (en) * 2019-07-23 2022-05-13 北京邮电大学 Telecommunication fraud detection method and device
CN110827837B (en) * 2019-10-18 2022-02-22 中山大学 Whale activity audio classification method based on deep learning
SG11202010803VA (en) 2019-10-31 2020-11-27 Alipay Hangzhou Inf Tech Co Ltd System and method for determining voice characteristics
CN111028852A (en) * 2019-11-06 2020-04-17 杭州哲信信息技术有限公司 Noise removing method in intelligent calling system based on CNN
CN111243621A (en) * 2020-01-14 2020-06-05 四川大学 Construction method of GRU-SVM deep learning model for synthetic speech detection
CN111327608B (en) * 2020-02-14 2021-02-02 中南大学 Application layer malicious request detection method and system based on cascade deep neural network
CN111755014B (en) * 2020-07-02 2022-06-03 四川长虹电器股份有限公司 Domain-adaptive replay attack detection method and system
CN113362822B (en) * 2021-06-08 2022-09-30 北京计算机技术及应用研究所 Black box voice confrontation sample generation method with auditory masking
CN113641980A (en) * 2021-08-23 2021-11-12 北京百度网讯科技有限公司 Authentication method and apparatus, electronic device, and medium
CN113555023B (en) * 2021-09-18 2022-01-11 中国科学院自动化研究所 Method for joint modeling of voice authentication and speaker recognition

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102436810A (en) * 2011-10-26 2012-05-02 华南理工大学 Record replay attack detection method and system based on channel mode noise
CN104954532A (en) * 2015-06-19 2015-09-30 深圳天珑无线科技有限公司 Voice recognition method, voice recognition device and mobile terminal
CN105139857A (en) * 2015-09-02 2015-12-09 广东顺德中山大学卡内基梅隆大学国际联合研究院 Countercheck method for automatically identifying speaker aiming to voice deception

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Using Deep Learning for Detecting Spoofing Attacks on Speech Signals; Alan Godoy et al.; arXiv; 2016-01-19; pp. 1-5

Also Published As

Publication number Publication date
CN105869630A (en) 2016-08-17

Similar Documents

Publication Publication Date Title
CN105869630B (en) Speaker's voice spoofing attack detection method and system based on deep learning
CN104732978B (en) The relevant method for distinguishing speek person of text based on combined depth study
CN110491391B (en) Deception voice detection method based on deep neural network
CN108231067A (en) Sound scenery recognition methods based on convolutional neural networks and random forest classification
CN102968990B (en) Speaker identifying method and system
CN107886943A (en) A kind of method for recognizing sound-groove and device
CN109584884A (en) A kind of speech identity feature extractor, classifier training method and relevant device
Tapkir et al. Novel spectral root cepstral features for replay spoof detection
CN110428843A (en) A kind of voice gender identification deep learning method
CN110211604A (en) A kind of depth residual error network structure for voice deformation detection
CN106531174A (en) Animal sound recognition method based on wavelet packet decomposition and spectrogram features
CN104978507A (en) Intelligent well logging evaluation expert system identity authentication method based on voiceprint recognition
CN107784215B (en) Audio unit based on intelligent terminal carries out the user authen method and system of labiomaney
CN111816185A (en) Method and device for identifying speaker in mixed voice
CN111611566B (en) Speaker verification system and replay attack detection method thereof
Gomez-Alanis et al. Performance evaluation of front-and back-end techniques for ASV spoofing detection systems based on deep features
CN111613240A (en) Camouflage voice detection method based on attention mechanism and Bi-LSTM
Gautam et al. Biometric system from heart sound using wavelet based feature set
WO2022268183A1 (en) Video-based random gesture authentication method and system
CN107274912A (en) A kind of equipment source discrimination method of mobile phone recording
Sailor et al. Unsupervised Representation Learning Using Convolutional Restricted Boltzmann Machine for Spoof Speech Detection.
CN111785262B (en) Speaker age and gender classification method based on residual error network and fusion characteristics
Islam et al. Neural-Response-Based Text-Dependent speaker identification under noisy conditions
Purnapatra et al. Longitudinal study of voice recognition in children
Neelima et al. Spoofing det ection and count ermeasure is aut omat ic speaker verificat ion syst em using dynamic feat ures

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20200617

Address after: Room 105G, 199 GuoShoujing Road, Pudong New Area, Shanghai, 200120

Patentee after: Shanghai Jiaotong University Intellectual Property Management Co.,Ltd.

Address before: No. 800 Dongchuan Road, Shanghai, 200240

Patentee before: SHANGHAI JIAO TONG University

TR01 Transfer of patent right

Effective date of registration: 20201028

Address after: 215123 14 Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou, Jiangsu.

Patentee after: AI SPEECH Ltd.

Address before: Room 105G, 199 GuoShoujing Road, Pudong New Area, Shanghai, 200120

Patentee before: Shanghai Jiaotong University Intellectual Property Management Co.,Ltd.

CP01 Change in the name or title of a patent holder

Address after: 215123 14 Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou, Jiangsu.

Patentee after: Sipic Technology Co.,Ltd.

Address before: 215123 14 Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou, Jiangsu.

Patentee before: AI SPEECH Ltd.