CN111755014A - Domain-adaptive replay attack detection method and system - Google Patents
Domain-adaptive replay attack detection method and system
- Publication number
- CN111755014A (application CN202010630019.XA)
- Authority
- CN
- China
- Prior art keywords
- replay
- shared
- module
- detection module
- voiceprint feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/02—Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/18—Artificial neural networks; Connectionist approaches
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/24—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being the cepstrum
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/30—Authentication, i.e. establishing the identity or authorisation of security principals
- G06F21/31—User authentication
- G06F21/32—User authentication using biometric data, e.g. fingerprints, iris scans or voiceprints
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q20/00—Payment architectures, schemes or protocols
- G06Q20/38—Payment protocols; Details thereof
- G06Q20/40—Authorisation, e.g. identification of payer or payee, verification of customer or shop credentials; Review and approval of payers, e.g. check credit lines or negative lists
- G06Q20/401—Transaction verification
- G06Q20/4014—Identity check for transactions
- G06Q20/40145—Biometric identity checks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/08—Network architectures or network communication protocols for network security for authentication of entities
- H04L63/0861—Network architectures or network communication protocols for network security for authentication of entities using biometrical features, e.g. fingerprint, retina-scan
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Computational Linguistics (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Signal Processing (AREA)
- Theoretical Computer Science (AREA)
- Biomedical Technology (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Biophysics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Circuit For Audible Band Transducer (AREA)
- Soundproofing, Sound Blocking, And Sound Damping (AREA)
Abstract
The invention discloses a domain-adaptive replay attack detection method for sound recordings, comprising the following steps: extracting acoustic features from at least one recording region of the recording; extracting a shared voiceprint feature vector from the acoustic features; and detecting, by a domain-adaptive method, whether the recording is a replayed recording from the shared voiceprint feature vector. The invention keeps the replay attack detection system robust when replay devices, replay environments, and speakers vary across domains.
Description
Technical Field
The invention relates to the technical field of voice signal processing, and in particular to a domain-adaptive replay attack detection method and system for sound recordings.
Background
In recent years, with the rapid development of artificial intelligence technology, more and more products built on it have appeared in daily life, and smart speakers in particular have risen rapidly. Voiceprint recognition is a standard feature of almost all smart speakers, and a user can log in to an account, pay for purchases, and so on using their own voice. Replay attack detection is an extremely important part of a voiceprint recognition system: it judges whether the voice comes from a live person or from a recording. Because replay devices, environments, and speakers are diverse, this domain diversity degrades the performance of replay attack detection systems.
Disclosure of Invention
To address the domain diversity of replay attacks on sound recordings, the invention provides a domain-adaptive replay attack detection method and system. A shared voiceprint feature extraction module is designed: the acoustic features of the speech are input into the shared module, shared voiceprint features are extracted, and the shared voiceprint features are then input into four sub-classification modules, namely a replay attack detection module, a replay device detection module, a replay environment detection module, and a replay speaker detection module. The error gradient of the replay attack detection module is fed back directly to the shared voiceprint feature extraction module and to the replay attack detection module. The error gradients of the replay device detection module, the replay environment detection module, and the replay speaker detection module are fed back to their respective modules and, after being inverted, to the shared voiceprint feature extraction module. The method and system thereby enhance the domain adaptivity of the system and improve its replay attack detection capability.
The invention achieves this purpose through the following technical scheme:
A domain-adaptive replay attack detection method for sound recordings comprises the following steps:
computing acoustic features from at least one recording region of the recording, wherein the acoustic features comprise Mel-frequency cepstral coefficients (MFCC) or power-normalized cepstral coefficients (PNCC);
extracting a shared voiceprint feature vector from the acoustic features;
and detecting, by a domain-adaptive method, whether the recording is a replayed recording from the shared voiceprint feature vector.
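The first step above can be illustrated with a minimal MFCC sketch in plain Python. It is not the patent's implementation: the sample rate, frame length, hop size, filter count, number of coefficients, and all function names are assumptions chosen to keep the example small, and a naive DFT stands in for an FFT.

```python
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def frame_signal(x, frame_len, hop):
    """Split x into overlapping frames; a trailing partial frame is dropped."""
    return [x[i:i + frame_len] for i in range(0, len(x) - frame_len + 1, hop)]

def power_spectrum(frame):
    """Apply a Hamming window, then a naive DFT; return power for bins 0..N/2."""
    n = len(frame)
    win = [0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]
    xs = [a * b for a, b in zip(frame, win)]
    out = []
    for k in range(n // 2 + 1):
        re = sum(v * math.cos(2 * math.pi * k * t / n) for t, v in enumerate(xs))
        im = -sum(v * math.sin(2 * math.pi * k * t / n) for t, v in enumerate(xs))
        out.append((re * re + im * im) / n)
    return out

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centres are evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    mels = [i * top / (n_filters + 1) for i in range(n_filters + 2)]
    centers = [mel_to_hz(m) * n_fft / sample_rate for m in mels]  # fractional bins
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = centers[j - 1], centers[j], centers[j + 1]
        row = []
        for b in range(n_fft // 2 + 1):
            if lo < b <= mid:
                row.append((b - lo) / (mid - lo))
            elif mid < b < hi:
                row.append((hi - b) / (hi - mid))
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def mfcc(x, sample_rate=8000, frame_len=64, hop=32, n_filters=12, n_ceps=8):
    """Return one list of n_ceps cepstral coefficients per frame."""
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    feats = []
    for frame in frame_signal(x, frame_len, hop):
        p = power_spectrum(frame)
        # Log mel-band energies; the small constant avoids log(0).
        log_e = [math.log(sum(w * v for w, v in zip(row, p)) + 1e-10) for row in bank]
        # DCT-II of the log energies gives the cepstral coefficients.
        feats.append([
            sum(e * math.cos(math.pi * c * (m + 0.5) / n_filters)
                for m, e in enumerate(log_e))
            for c in range(n_ceps)
        ])
    return feats
```

Each frame of the recording region yields one coefficient vector, and the sequence of vectors is what a shared voiceprint feature extractor would consume. A practical system would more likely use an FFT-based audio library and longer frames.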
Further, in the detection phase, the shared voiceprint feature vector is also used to detect the corresponding target of at least one domain-adaptive adversarial task associated with replay attack detection, the domain-adaptive adversarial task comprising: a replay device detection task, a replay environment detection task, and a replay speaker detection task.
Furthermore, the shared voiceprint feature vector is extracted by a shared voiceprint feature module, whether the recording is replayed is detected by a replay attack detection module, the replay device detection task is performed by a replay device detection module, the replay environment detection task is performed by a replay environment detection module, and the replay speaker detection task is performed by a replay speaker detection module.
Further, the shared voiceprint feature module, the replay attack detection module, the replay device detection module, the replay environment detection module, and the replay speaker detection module are each formed of a deep neural network comprising one or a combination of a convolutional neural network (CNN), a recurrent neural network (RNN, LSTM, GRU), and a time-delay neural network (TDNN).
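The arrangement of one shared module feeding four detection modules can be sketched as follows. This is a simplified illustration only: a single linear layer with mean pooling stands in for the deep networks (CNN, RNN, TDNN) named in the text, and the class name, layer sizes, and class counts are hypothetical.

```python
import math
import random

def linear(weights, bias, x):
    """Dense layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def init_layer(rng, n_out, n_in):
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

class SharedEncoderWithHeads:
    """Shared voiceprint encoder followed by four classification heads."""

    def __init__(self, n_feat, n_shared, n_devices, n_envs, n_speakers, seed=0):
        rng = random.Random(seed)
        self.shared = init_layer(rng, n_shared, n_feat)
        self.heads = {
            "attack": init_layer(rng, 2, n_shared),          # genuine vs. replay
            "device": init_layer(rng, n_devices, n_shared),
            "environment": init_layer(rng, n_envs, n_shared),
            "speaker": init_layer(rng, n_speakers, n_shared),
        }

    def embed(self, frames):
        """Mean-pool frame features, then project to the shared vector."""
        n = len(frames)
        pooled = [sum(f[i] for f in frames) / n for i in range(len(frames[0]))]
        return [math.tanh(v) for v in linear(*self.shared, pooled)]

    def forward(self, frames):
        """Return class probabilities from every head for one recording."""
        h = self.embed(frames)
        return {name: softmax(linear(w, b, h)) for name, (w, b) in self.heads.items()}
```

Calling `forward` on a list of frame-level feature vectors returns one probability vector per task; only the `"attack"` output answers the replay question, while the other three exist to shape the shared vector during training.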
Further, the method also comprises a training method for each module, wherein the weight of the shared voiceprint feature module is $W_f$, the weight of the replay attack detection module is $W_a$, the weight of the replay device detection module is $W_d$, the weight of the replay speaker detection module is $W_s$, and the weight of the replay environment detection module is $W_e$. The training steps of each module are as follows:
S0: inputting the acoustic features of the recording into the shared voiceprint feature module, and extracting a shared voiceprint feature vector;
S1: inputting the shared voiceprint feature vector from S0 into the replay attack detection module, and outputting a classification error $L_a$;
S2: inputting the shared voiceprint feature vector from S0 into the replay device detection module, and outputting a classification error $L_d$;
S3: inputting the shared voiceprint feature vector from S0 into the replay speaker detection module, and outputting a classification error $L_s$;
S4: inputting the shared voiceprint feature vector from S0 into the replay environment detection module, and outputting a classification error $L_e$;
S5: updating each weight according to update equations that are not reproduced in this text, in which the learning-rate symbol is likewise missing and $\alpha_1$, $\alpha_2$, $\alpha_3$ are the weights of the replay device detection module, the replay speaker detection module, and the replay environment detection module, respectively;
S6: repeating steps S0 to S5 until the modules converge.
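The update equations for S5 do not survive in this text. One reading consistent with the surrounding description (each detection module descends its own error, while the shared module receives the replay attack gradient together with the inverted, weighted gradients of the three auxiliary modules) is given below. It is an assumption, not the patent's own formula: the symbol $\eta$ for the learning rate is introduced here, and whether the $\alpha_i$ also scale the auxiliary module updates is not known.

```latex
\begin{aligned}
W_a &\leftarrow W_a - \eta\,\frac{\partial L_a}{\partial W_a}, \qquad
W_d \leftarrow W_d - \eta\,\frac{\partial L_d}{\partial W_d}, \\
W_s &\leftarrow W_s - \eta\,\frac{\partial L_s}{\partial W_s}, \qquad
W_e \leftarrow W_e - \eta\,\frac{\partial L_e}{\partial W_e}, \\
W_f &\leftarrow W_f - \eta\left(
  \frac{\partial L_a}{\partial W_f}
  - \alpha_1 \frac{\partial L_d}{\partial W_f}
  - \alpha_2 \frac{\partial L_s}{\partial W_f}
  - \alpha_3 \frac{\partial L_e}{\partial W_f}
\right)
\end{aligned}
```

Under this reading the shared module is pushed toward features that help replay detection but carry little information about device, speaker, or environment, which is the usual gradient-reversal form of domain-adversarial training.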
An embodiment of the invention further provides a domain-adaptive replay attack detection system for sound recordings, comprising the following modules:
an acoustic feature extraction module for extracting acoustic features from at least one recording region of the recording;
a shared voiceprint feature extraction module for extracting a shared voiceprint feature vector from the acoustic features;
a detection module for detecting, from the shared voiceprint feature vector, whether the recording is a replay attack.
Further, the detection module is also used to detect at least one domain-adaptive adversarial task associated with the replay attack.
Further, the shared voiceprint feature extraction module and the detection module each comprise a deep neural network module.
Further, the system also comprises a training module for training the deep neural network modules in the shared voiceprint feature extraction module and the detection module.
The beneficial effects of the invention are as follows:
the invention addresses the performance degradation of replay attack detection systems caused by domain diversity in replay devices, environments, and speakers, and keeps the replay attack detection system robust under such diversity.
Drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1: a schematic diagram of a domain adaptive replay attack detection method;
FIG. 2: a schematic diagram of a training method in a field-adaptive replay attack detection method;
FIG. 3: a structural schematic diagram of a domain-adaptive replay attack detection system.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be described in detail below. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the examples given herein without any inventive step, are within the scope of the present invention.
Example one
A domain-adaptive replay attack detection method proposed by the present invention is described with reference to fig. 1 and 2, where fig. 1 shows a flowchart of the replay attack detection method and fig. 2 shows a training flowchart of the domain-adaptive replay attack detection method.
In step S101, acoustic features are extracted from at least one recording region of the recording, the acoustic features including Mel-frequency cepstral coefficients (MFCC) or power-normalized cepstral coefficients (PNCC);
in step S102, a shared voiceprint feature vector is extracted from the acoustic features extracted in step S101;
in step S103, whether the recording is a replay attack is detected from the shared voiceprint feature vector extracted in step S102. In the same detection phase, the shared voiceprint feature vector is also used to detect the corresponding target of at least one domain-adaptive adversarial task associated with replay attack detection, the domain-adaptive adversarial tasks including, but not limited to, a replay device detection task, a replay environment detection task, and a replay speaker detection task, and the detection results of all the domain-adaptive adversarial tasks are obtained. The shared voiceprint feature vector is extracted by a shared voiceprint feature module; whether the recording is replayed is detected by a replay attack detection module; the replay device detection task is performed by a replay device detection module; the replay environment detection task is performed by a replay environment detection module; and the replay speaker detection task is performed by a replay speaker detection module. Each of these five modules is formed of a deep neural network comprising one or a combination of convolutional neural networks (CNN), recurrent neural networks (RNN, LSTM, GRU), and time-delay neural networks (TDNN).
In addition, the method of the present invention further comprises a training method of a shared voiceprint feature module, a replay attack detection module, a replay device detection module, a replay environment detection module and a replay speaker detection module, as shown in fig. 2.
In step S201, a recording sample from the training set, its true replay label, and its true labels for the domain-adaptive adversarial tasks are acquired;
in step S202, with the weight of the shared voiceprint feature module denoted $W_f$, the weight of the replay attack detection module denoted $W_a$, the weight of the replay device detection module denoted $W_d$, the weight of the replay speaker detection module denoted $W_s$, and the weight of the replay environment detection module denoted $W_e$, the acoustic features of the recording are input into the shared voiceprint feature module to extract a shared voiceprint feature vector, which is input into the replay attack detection module, the replay device detection module, the replay speaker detection module, and the replay environment detection module to obtain the replay detection result and the detection results of the domain-adaptive adversarial tasks;
in step S203, the replay detection result is compared with the true replay label to obtain the detection error $L_a$;
in step S204, the parameters of the replay attack detection module are updated by back propagation at a given learning rate (the update equation is not reproduced in this text; only the fragment $W_d \leftarrow$ survives);
in step S205, the detection results of the domain-adaptive adversarial tasks are compared with their true labels to obtain the detection errors $L_d$, $L_s$, $L_e$;
in step S206, the parameters of the domain-adaptive adversarial task detection modules are updated by back propagation at a given learning rate (the update equations are not reproduced in this text);
in step S207, the parameters of the shared voiceprint feature module are updated by back propagation using the inverted detection errors of the domain-adaptive adversarial tasks together with the replay detection error (the update equation is not reproduced in this text), where $\alpha_1$, $\alpha_2$, $\alpha_3$ are the weights of the replay device detection module, the replay speaker detection module, and the replay environment detection module, respectively;
in step S208, it is determined whether the modules have converged, the number of training iterations has reached the set maximum, or the module error has reached the set minimum; if any one of these conditions is satisfied, training terminates; otherwise steps S201 to S208 are repeated.
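Steps S201 to S208 can be sketched as a small training loop with manually derived gradients. This is an illustration under stated assumptions, not the patent's implementation: a single linear layer stands in for the shared network, every head is a binary logistic classifier, the auxiliary gradients are negated and scaled by hypothetical `alphas` before reaching the shared layer, and the function name, defaults, and stopping thresholds are invented for the example.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(samples, alphas, n_shared=3, lr=0.1, max_iters=200,
          min_error=0.05, tol=1e-6, seed=0):
    """samples: list of (x, labels); labels maps each task name to 0 or 1.
    alphas: weight of each auxiliary (adversarial) task, keyed by task name."""
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    W_f = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_shared)]
    # One logistic head per task: n_shared weights followed by a bias.
    heads = {t: [rng.uniform(-0.5, 0.5) for _ in range(n_shared)] + [0.0]
             for t in ["attack", *alphas]}
    history, prev = [], None
    for _ in range(max_iters):
        total = 0.0
        for x, labels in samples:
            h = [sum(w * v for w, v in zip(row, x)) for row in W_f]  # shared vector
            g_h = [0.0] * n_shared  # gradient arriving at the shared vector
            for task, w in heads.items():
                p = sigmoid(sum(wi * hi for wi, hi in zip(w, h)) + w[-1])
                err = p - labels[task]  # d(cross-entropy)/d(logit)
                # Attack gradient passes through unchanged; auxiliary gradients
                # are inverted and scaled before reaching the shared layer.
                sign = 1.0 if task == "attack" else -alphas[task]
                for i in range(n_shared):
                    g_h[i] += sign * err * w[i]
                    w[i] -= lr * err * h[i]  # each head descends its own error
                w[-1] -= lr * err
                if task == "attack":
                    q = p if labels[task] == 1 else 1.0 - p
                    total -= math.log(max(q, 1e-12))
            for i in range(n_shared):
                for j in range(n_in):
                    W_f[i][j] -= lr * g_h[i] * x[j]
        mean = total / len(samples)
        history.append(mean)
        # Stop on minimum error or convergence; max_iters bounds the loop.
        if mean <= min_error or (prev is not None and abs(prev - mean) < tol):
            break
        prev = mean
    return W_f, heads, history
```

The returned `history` holds the mean replay detection loss per pass over the data, so it can be inspected to see which of the three stopping conditions ended training. In an autograd framework the same effect is usually obtained with a gradient-reversal layer between the shared module and the auxiliary heads.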
The domain-adaptive replay attack detection method provided by this embodiment keeps the replay attack detection system robust under domain diversity in replay devices, environments, and speakers.
Example two
A domain-adaptive replay attack detection system proposed by the present invention is described with reference to fig. 3, of which fig. 3 shows the constituent modules. Referring to fig. 3, the system includes an acoustic feature extraction module 301, a shared voiceprint feature extraction module 302, a detection module 303, and a training module 304.
The acoustic feature extraction module 301 extracts acoustic features from at least one recording region, or from the whole recording, in the recording data;
the shared voiceprint feature extraction module 302 extracts a shared voiceprint feature vector from the acoustic features produced by the acoustic feature extraction module 301;
the detection module 303 detects whether the recording is a replayed recording from the shared voiceprint feature vector produced by the shared voiceprint feature extraction module 302. The detection module 303 may also detect, from the shared voiceprint feature vector, at least one domain-adaptive adversarial task associated with the replay attack, the adversarial tasks including a replay device detection task, a replay environment detection task, and a replay speaker detection task, and obtain the detection results of all the domain-adaptive adversarial tasks.
The training module 304 trains the deep neural network modules in the shared voiceprint feature extraction module 302 and the detection module 303; replay attack detection and the at least one associated domain-adaptive adversarial task are trained simultaneously, and the training steps are as described in Example one above.
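How modules 301 to 303 chain together at detection time can be sketched as a thin wrapper around three pluggable callables. The class name, the field names, the `"attack"` score key, and the 0.5 decision threshold are all hypothetical; the training module 304 is left out because it acts only before deployment.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

Frames = List[List[float]]

@dataclass
class ReplayDetectionSystem:
    extract_acoustic: Callable[[Sequence[float]], Frames]    # role of module 301
    extract_shared: Callable[[Frames], List[float]]          # role of module 302
    detect: Callable[[List[float]], Dict[str, List[float]]]  # role of module 303
    threshold: float = 0.5  # hypothetical decision threshold

    def __call__(self, recording: Sequence[float]) -> dict:
        frames = self.extract_acoustic(recording)
        vector = self.extract_shared(frames)
        scores = self.detect(vector)
        # scores["attack"] is assumed to be [p_genuine, p_replay].
        return {"is_replay": scores["attack"][1] >= self.threshold,
                "scores": scores}
```

Keeping the three stages as separate callables mirrors the module boundaries in FIG. 3 and lets any one stage be swapped (for example a different acoustic front end) without touching the others.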
The domain-adaptive replay attack detection system provided by this second embodiment keeps the replay attack detection system robust under domain diversity in replay devices, environments, and speakers.
Those skilled in the art will understand that all or part of the processes of the methods in the above embodiments may be implemented by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the processes of the above method embodiments. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a random access memory (RAM), or the like.
The above describes only specific embodiments of the present invention, and the scope of the invention is not limited to them. Any change or substitution that a person skilled in the art can readily conceive within the technical scope of the invention is covered by the scope of the invention, so the protection scope of the invention is that of the appended claims. The technical features described in the above embodiments can be combined in any suitable manner where there is no contradiction; to avoid unnecessary repetition, the possible combinations are not described separately. Any combination of the embodiments is also possible and is to be regarded as part of the disclosure, as long as it does not depart from the spirit of the invention.
Claims (10)
1. A domain-adaptive replay attack detection method, characterized by comprising the following steps:
extracting acoustic features from at least one recording region of the recording;
extracting a shared voiceprint feature vector from the acoustic features; and
detecting, by a domain-adaptive method, whether the recording is a replayed recording from the shared voiceprint feature vector.
2. The domain-adaptive replay attack detection method of claim 1, wherein the acoustic features include Mel-frequency cepstral coefficients or power-normalized cepstral coefficients.
3. The domain-adaptive replay attack detection method of claim 1, wherein the shared voiceprint feature vector is used to detect the corresponding target of at least one domain-adaptive adversarial task associated with replay attack detection, the domain-adaptive adversarial task comprising: a replay device detection task, a replay environment detection task, and a replay speaker detection task.
4. The domain-adaptive replay attack detection method of claim 3, wherein the shared voiceprint feature vector is extracted by a shared voiceprint feature extraction module, whether the recording is replayed is detected by a replay attack detection module, the replay device detection task is performed by a replay device detection module, the replay environment detection task is performed by a replay environment detection module, and the replay speaker detection task is performed by a replay speaker detection module.
5. The domain-adaptive replay attack detection method of claim 4, wherein the shared voiceprint feature extraction module, the replay attack detection module, the replay device detection module, the replay environment detection module, and the replay speaker detection module are each formed of a deep neural network, and the deep neural network comprises one or more networks selected from the group consisting of a convolutional neural network, a recurrent neural network, and a time-delay neural network.
6. The domain-adaptive replay attack detection method of claim 4 or 5, wherein the training steps for each module are as follows:
wherein the weight of the shared voiceprint feature module is $W_f$, the weight of the replay attack detection module is $W_a$, the weight of the replay device detection module is $W_d$, the weight of the replay speaker detection module is $W_s$, and the weight of the replay environment detection module is $W_e$,
S0: inputting the acoustic features of the recording into the shared voiceprint feature module, and extracting a shared voiceprint feature vector;
S1: inputting the shared voiceprint feature vector from S0 into the replay attack detection module, and outputting a classification error $L_a$;
S2: inputting the shared voiceprint feature vector from S0 into the replay device detection module, and outputting a classification error $L_d$;
S3: inputting the shared voiceprint feature vector from S0 into the replay speaker detection module, and outputting a classification error $L_s$;
S4: inputting the shared voiceprint feature vector from S0 into the replay environment detection module, and outputting a classification error $L_e$;
S5: updating each weight according to update equations that are not reproduced in this text, in which the learning-rate symbol is likewise missing and $\alpha_1$, $\alpha_2$, $\alpha_3$ are the weights of the replay device detection module, the replay speaker detection module, and the replay environment detection module, respectively;
S6: repeating steps S0 to S5 until the modules converge.
7. A detection system for the domain-adaptive replay attack detection method of any one of claims 1 to 8, comprising:
an acoustic feature extraction module for extracting acoustic features from at least one recording region of the recording;
a shared voiceprint feature extraction module for extracting a shared voiceprint feature vector from the acoustic features;
a detection module for detecting, from the shared voiceprint feature vector, whether the recording is a replay attack.
8. The domain-adaptive replay attack detection system of claim 7, wherein the detection module is further configured to detect at least one domain-adaptive adversarial task associated with a replay attack.
9. The domain-adaptive replay attack detection system of claim 7, wherein the shared voiceprint feature extraction module and the detection module each further comprise a deep neural network module.
10. The domain-adaptive replay attack detection system of any one of claims 7 to 9, further comprising a training module for training the deep neural network modules in the shared voiceprint feature extraction module and the detection module.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010630019.XA (CN111755014B) | 2020-07-02 | 2020-07-02 | Domain-adaptive replay attack detection method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010630019.XA (CN111755014B) | 2020-07-02 | 2020-07-02 | Domain-adaptive replay attack detection method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111755014A (en) | 2020-10-09 |
CN111755014B (en) | 2022-06-03 |
Family
ID=72678889
Family Applications (1)
Application Number | Status | Priority Date | Filing Date | Title |
---|---|---|---|---|
CN202010630019.XA (CN111755014B) | Active | 2020-07-02 | 2020-07-02 | Domain-adaptive replay attack detection method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111755014B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113284486A (en) * | 2021-07-26 | 2021-08-20 | Institute of Automation, Chinese Academy of Sciences | Robust voice identification method for environmental countermeasure |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105139857A (en) * | 2015-09-02 | 2015-12-09 | 广东顺德中山大学卡内基梅隆大学国际联合研究院 | Countercheck method for automatically identifying speaker aiming to voice deception |
CN105702263A (en) * | 2016-01-06 | 2016-06-22 | 清华大学 | Voice playback detection method and device |
CN105869630A (en) * | 2016-06-27 | 2016-08-17 | 上海交通大学 | Method and system for detecting voice spoofing attack of speakers on basis of deep learning |
CN106531172A (en) * | 2016-11-23 | 2017-03-22 | 湖北大学 | Speaker voice playback identification method and system based on environmental noise change detection |
CN108039176A (en) * | 2018-01-11 | 2018-05-15 | 广州势必可赢网络科技有限公司 | Voiceprint authentication method and device for preventing recording attack and access control system |
CN108806698A (en) * | 2018-03-15 | 2018-11-13 | 中山大学 | A kind of camouflage audio recognition method based on convolutional neural networks |
US20190013033A1 (en) * | 2016-08-19 | 2019-01-10 | Amazon Technologies, Inc. | Detecting replay attacks in voice-based authentication |
US20190115033A1 (en) * | 2017-10-13 | 2019-04-18 | Cirrus Logic International Semiconductor Ltd. | Detection of liveness |
CN109841219A (en) * | 2019-03-15 | 2019-06-04 | 慧言科技(天津)有限公司 | Replay Attack method is cheated using speech amplitude information and a variety of phase-detection voices |
CN110491391A (en) * | 2019-07-02 | 2019-11-22 | 厦门大学 | A kind of deception speech detection method based on deep neural network |
CN110718229A (en) * | 2019-11-14 | 2020-01-21 | 国微集团(深圳)有限公司 | Detection method for record playback attack and training method corresponding to detection model |
- 2020-07-02: CN application CN202010630019.XA, patent CN111755014B, status Active
Patent Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105139857A (en) * | 2015-09-02 | 2015-12-09 | 广东顺德中山大学卡内基梅隆大学国际联合研究院 | Countercheck method for automatically identifying speaker aiming to voice deception |
CN105702263A (en) * | 2016-01-06 | 2016-06-22 | 清华大学 | Voice playback detection method and device |
CN105869630A (en) * | 2016-06-27 | 2016-08-17 | 上海交通大学 | Method and system for detecting voice spoofing attack of speakers on basis of deep learning |
US20190013033A1 (en) * | 2016-08-19 | 2019-01-10 | Amazon Technologies, Inc. | Detecting replay attacks in voice-based authentication |
US20200118577A1 (en) * | 2016-08-19 | 2020-04-16 | Amazon Technologies, Inc. | Detecting replay attacks in voice-based authentication |
US10510352B2 (en) * | 2016-08-19 | 2019-12-17 | Amazon Technologies, Inc. | Detecting replay attacks in voice-based authentication |
CN106531172A (en) * | 2016-11-23 | 2017-03-22 | 湖北大学 | Speaker voice playback identification method and system based on environmental noise change detection |
US20190115033A1 (en) * | 2017-10-13 | 2019-04-18 | Cirrus Logic International Semiconductor Ltd. | Detection of liveness |
CN108039176A (en) * | 2018-01-11 | 2018-05-15 | 广州势必可赢网络科技有限公司 | Voiceprint authentication method and device for preventing recording attack and access control system |
CN108806698A (en) * | 2018-03-15 | 2018-11-13 | 中山大学 | A kind of camouflage audio recognition method based on convolutional neural networks |
CN109841219A (en) * | 2019-03-15 | 2019-06-04 | 慧言科技(天津)有限公司 | Replay Attack method is cheated using speech amplitude information and a variety of phase-detection voices |
CN110491391A (en) * | 2019-07-02 | 2019-11-22 | 厦门大学 | A kind of deception speech detection method based on deep neural network |
CN110718229A (en) * | 2019-11-14 | 2020-01-21 | 国微集团(深圳)有限公司 | Detection method for record playback attack and training method corresponding to detection model |
Non-Patent Citations (2)
Title |
---|
VALENTI G: "An end-to-end spoofing countermeasure for automatic speaker verification using evolving recurrent neural networks", Odyssey 2018: The Speaker and Language Recognition Workshop * |
ZHUXIN CHEN: "Recurrent Neural Networks for Automatic Replay Spoofing Attack Detection", ICASSP * |
Also Published As
Publication number | Publication date |
---|---|
CN111755014B (en) | 2022-06-03 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
EP3504703B1 (en) | A speech recognition method and apparatus | |
Li et al. | Online direction of arrival estimation based on deep learning | |
Cui et al. | Data augmentation for deep neural network acoustic modeling | |
Li et al. | Developing far-field speaker system via teacher-student learning | |
CN110503970A (en) | A kind of audio data processing method, device and storage medium | |
Cheng et al. | Replay detection using CQT-based modified group delay feature and ResNeWt network in ASVspoof 2019 | |
CN109584896A (en) | A kind of speech chip and electronic equipment | |
CN108122563A (en) | Improve voice wake-up rate and the method for correcting DOA | |
JP6703460B2 (en) | Audio processing device, audio processing method, and audio processing program | |
CN108417224A (en) | The training and recognition methods of two way blocks model and system | |
Ge et al. | Deep learning approach in DOA estimation: A systematic literature review | |
CN114708857A (en) | Speech recognition model training method, speech recognition method and corresponding device | |
Basbug et al. | Acoustic scene classification using spatial pyramid pooling with convolutional neural networks | |
Takeda et al. | Unsupervised adaptation of neural networks for discriminative sound source localization with eliminative constraint | |
Chang et al. | Audio adversarial examples generation with recurrent neural networks | |
CN110930996A (en) | Model training method, voice recognition method, device, storage medium and equipment | |
CN115208507A (en) | Privacy protection method and device based on white-box voice countermeasure sample | |
CN112180318A (en) | Sound source direction-of-arrival estimation model training and sound source direction-of-arrival estimation method | |
CN111755014B (en) | Domain-adaptive replay attack detection method and system | |
CN112133293A (en) | Phrase voice sample compensation method based on generation countermeasure network and storage medium | |
Suh et al. | Phoneme segmentation of continuous speech using multi-layer perceptron | |
Salvati et al. | End-to-End Speaker Identification in Noisy and Reverberant Environments Using Raw Waveform Convolutional Neural Networks. | |
CN114664288A (en) | Voice recognition method, device, equipment and storage medium | |
KR20210131067A (en) | Method and appratus for training acoustic scene recognition model and method and appratus for reconition of acoustic scene using acoustic scene recognition model | |
Iqbal et al. | Enhancing audio augmentation methods with consistency learning | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||