CN108694952A - Electronic device, identity authentication method and storage medium - Google Patents

Electronic device, identity authentication method and storage medium

Info

Publication number
CN108694952A
CN108694952A (application CN201810311721.2A)
Authority
CN
China
Prior art keywords
user
reading
voice
vocal print
print feature
Prior art date
Legal status
Granted
Application number
CN201810311721.2A
Other languages
Chinese (zh)
Other versions
CN108694952B (en)
Inventor
王健宗
于夕畔
李瑾瑾
肖京
Current Assignee
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN201810311721.2A priority Critical patent/CN108694952B/en
Priority to PCT/CN2018/102208 priority patent/WO2019196305A1/en
Publication of CN108694952A publication Critical patent/CN108694952A/en
Application granted granted Critical
Publication of CN108694952B publication Critical patent/CN108694952B/en
Current legal status: Active

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 17/00 - Speaker identification or verification techniques
    • G10L 17/02 - Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
    • G10L 17/04 - Training, enrolment or model building
    • G10L 17/06 - Decision making techniques; Pattern matching strategies
    • G10L 17/08 - Use of distortion metrics or a particular distance between probe pattern and reference templates
    • G10L 17/16 - Hidden Markov models [HMM]
    • G10L 17/18 - Artificial neural networks; Connectionist approaches
    • G10L 17/22 - Interactive procedures; Man-machine interfaces
    • G10L 17/24 - Interactive procedures; Man-machine interfaces, the user being prompted to utter a password or a predefined phrase
    • G10L 25/00 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L 25/03 - Speech or voice analysis techniques characterised by the type of extracted parameters
    • G10L 25/24 - Speech or voice analysis techniques characterised by the extracted parameters being the cepstrum

Landscapes

  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Game Theory and Decision Science (AREA)
  • Business, Economics & Management (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Collating Specific Patterns (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The present invention relates to an electronic device, an identity verification method and a storage medium. The method includes: in an IVR scenario, when a user transacts business, announcing a random code of a first preset number of digits for the user to read aloud, and after the user reads it, building an acoustic model of a preset type for the announced random code and another for the user's read-aloud speech; performing a forced overall alignment of the acoustic model of the announced random code and the acoustic model of the user's read-aloud speech, and calculating, with a predetermined algorithm, the probability that the two aligned acoustic models are identical; if the probability exceeds a preset first threshold, extracting the voiceprint feature vector of the user's read-aloud speech, obtaining the standard voiceprint feature vector stored for the user at registration, and calculating the distance between the two vectors so as to verify the user's identity. The present invention applies a double verification to the user's identity and can confirm the user's identity accurately.

Description

Electronic device, identity verification method and storage medium
Technical field
The present invention relates to the field of communication technology, and more particularly to an electronic device, an identity verification method and a storage medium.
Background
At present, in interactive voice response (IVR) scenarios, there are schemes that combine IVR with voiceprint recognition to verify a client's identity, for example when a client calls to activate a credit card or modify a password after receiving the card. In the prior art IVR scenario, because remote voiceprint verification is not carried out face to face, a client may commit fraud with a pre-prepared synthesized voice, so the client's identity cannot be confirmed accurately and the security of identity verification is low.
Summary of the invention
The purpose of the present invention is to provide an electronic device, an identity verification method and a storage medium, which aim to apply a double verification to the user's identity so that the user's identity can be confirmed accurately.
To achieve the above object, the present invention provides an electronic device, which includes a memory and a processor connected to the memory. The memory stores a processing system that can run on the processor, and the processing system, when executed by the processor, implements the following steps:
An acoustic model establishment step: in an interactive voice response (IVR) scenario, when a user transacts business, announce a random code of a first preset number of digits for the user to read aloud, and after the user reads it, build an acoustic model of a preset type for the announced random code and another for the user's read-aloud speech;
A forced overall alignment step: perform a forced overall alignment of the acoustic model of the announced random code and the acoustic model of the user's read-aloud speech, and calculate, with a predetermined algorithm, the probability that the two aligned acoustic models are identical;
An identity verification step: if the probability that the two aligned acoustic models are identical exceeds a preset first threshold, extract the voiceprint feature vector of the user's read-aloud speech, obtain the standard voiceprint feature vector stored for the user at registration, and calculate the distance between the two vectors so as to verify the user's identity.
Preferably, when the processing system is executed by the processor, the following steps are also implemented:
When a user registers a voiceprint in an IVR scenario, announce a random code of a second preset number of digits for the user to read aloud a preset number of times, and after each reading build an acoustic model of the preset type for the announced random code and another for the read-aloud speech;
Perform a forced overall alignment of the acoustic model of each announced random code and the acoustic model of the corresponding read-aloud speech, and calculate, with the predetermined algorithm, the probability that the two aligned acoustic models are identical;
If, for every reading, the probability that the two aligned acoustic models are identical exceeds a preset second threshold, extract the voiceprint feature vector of each read-aloud utterance and calculate the pairwise distances between the vectors to analyze whether every reading comes from the same user;
If so, store the voiceprint feature vector as the user's standard voiceprint feature vector.
Preferably, the acoustic model of the preset type is a deep neural network-hidden Markov model.
Preferably, the step of extracting the voiceprint feature vector of the user's read-aloud speech includes:
applying pre-emphasis and windowing to the read-aloud speech, applying a Fourier transform to each windowed frame to obtain the corresponding spectrum, and passing the spectrum through a Mel filter bank to obtain the Mel spectrum;
performing cepstral analysis on the Mel spectrum to obtain the Mel-frequency cepstral coefficients (MFCC), and forming the voiceprint feature vector of the user's read-aloud speech from the MFCC.
To achieve the above object, the present invention also provides an identity verification method, which includes:
S1: in an IVR scenario, when a user transacts business, announce a random code of a first preset number of digits for the user to read aloud, and after the user reads it, build an acoustic model of a preset type for the announced random code and another for the user's read-aloud speech;
S2: perform a forced overall alignment of the acoustic model of the announced random code and the acoustic model of the user's read-aloud speech, and calculate, with a predetermined algorithm, the probability that the two aligned acoustic models are identical;
S3: if the probability that the two aligned acoustic models are identical exceeds a preset first threshold, extract the voiceprint feature vector of the user's read-aloud speech, obtain the standard voiceprint feature vector stored for the user at registration, and calculate the distance between the two vectors so as to verify the user's identity.
Preferably, before step S1 the method further includes:
S01: when a user registers a voiceprint in an IVR scenario, announce a random code of a second preset number of digits for the user to read aloud a preset number of times, and after each reading build an acoustic model of the preset type for the announced random code and another for the read-aloud speech;
S02: perform a forced overall alignment of the acoustic model of each announced random code and the acoustic model of the corresponding read-aloud speech, and calculate, with the predetermined algorithm, the probability that the two aligned acoustic models are identical;
S03: if, for every reading, the probability that the two aligned acoustic models are identical exceeds a preset second threshold, extract the voiceprint feature vector of each read-aloud utterance and calculate the pairwise distances between the vectors to analyze whether every reading comes from the same user;
S04: if so, store the voiceprint feature vector as the user's standard voiceprint feature vector.
Preferably, the acoustic model of the preset type is a deep neural network-hidden Markov model.
Preferably, the step of extracting the voiceprint feature vector of the user's read-aloud speech includes:
applying pre-emphasis and windowing to the read-aloud speech, applying a Fourier transform to each windowed frame to obtain the corresponding spectrum, and passing the spectrum through a Mel filter bank to obtain the Mel spectrum;
performing cepstral analysis on the Mel spectrum to obtain the Mel-frequency cepstral coefficients (MFCC), and forming the voiceprint feature vector of the user's read-aloud speech from the MFCC.
Preferably, the step of calculating the distance between the voiceprint feature vector of the user's read-aloud speech and the standard voiceprint feature vector includes computing the cosine distance $d(\vec{A},\vec{B}) = 1 - \frac{\vec{A}\cdot\vec{B}}{\lVert\vec{A}\rVert\,\lVert\vec{B}\rVert}$,
where $\vec{A}$ is the standard voiceprint feature vector and $\vec{B}$ is the voiceprint feature vector of the user's read-aloud speech.
The present invention also provides a computer-readable storage medium on which a processing system is stored; when the processing system is executed by a processor, the steps of the identity verification method described above are implemented.
The beneficial effects of the present invention are as follows: when identity recognition is performed in an IVR scenario, having the user read a random code aloud effectively prevents fraud with a pre-prepared synthesized voice; combining the random code with voiceprint recognition realizes a double verification of the user's identity, so that the user's identity can be confirmed accurately and the security of identity verification in IVR scenarios is improved. In addition, performing a forced overall alignment of the acoustic model of the announced random code and the acoustic model of the user's read-aloud speech reduces the amount of computation and improves recognition efficiency.
Description of the drawings
Fig. 1 is a schematic diagram of an optional application environment for the embodiments of the present invention;
Fig. 2 is a flow diagram of one embodiment of the identity verification method of the present invention.
Detailed description of the embodiments
To make the purpose, technical solution and advantages of the present invention clearer, the present invention is further elaborated below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are only used to explain the present invention and are not intended to limit it. All other embodiments obtained by those of ordinary skill in the art, based on the embodiments of the present invention and without creative work, shall fall within the protection scope of the present invention.
It should be noted that descriptions involving "first", "second" and the like in the present invention are used for description purposes only and cannot be interpreted as indicating or implying relative importance or implicitly indicating the number of the indicated technical features. Thus, a feature defined with "first" or "second" may explicitly or implicitly include at least one such feature. In addition, the technical solutions of the embodiments can be combined with each other, but only on the basis that they can be implemented by those of ordinary skill in the art; when a combination of technical solutions is contradictory or cannot be realized, it should be considered that such a combination does not exist and is not within the protection scope claimed by the present invention.
As shown in Fig. 1, which is a schematic diagram of the application environment of a preferred embodiment of the identity verification method of the present invention, the application environment includes an electronic device 1 and a terminal device. The electronic device 1 can exchange data with the terminal device through a suitable technology such as a network or near-field communication. In this embodiment, the user logs in to the interactive voice response (IVR) system of the electronic device 1 through the terminal device to perform voiceprint registration and voiceprint recognition.
The terminal device includes, but is not limited to, any electronic product that can interact with a user through a keyboard, mouse, remote control, touch pad, voice-control device or the like, for example a mobile device such as a personal computer, tablet computer, smartphone, personal digital assistant (PDA), game console, Internet Protocol Television (IPTV), smart wearable device or navigation device, or a fixed terminal such as a digital TV, desktop computer, notebook or server.
The electronic device 1 is a device that can automatically perform numerical computation and/or information processing according to preset or stored instructions. The electronic device 1 may be a computer, a single network server, a server group composed of multiple network servers, or a cloud composed of a large number of hosts or network servers based on cloud computing, where cloud computing is a kind of distributed computing: a super virtual computer composed of a group of loosely coupled computers.
In this embodiment, the electronic device 1 may include, but is not limited to, a memory 11, a processor 12 and a network interface 13 that can be communicatively connected to one another through a system bus, and the memory 11 stores a processing system that can run on the processor 12. It should be noted that Fig. 1 only shows the electronic device 1 with components 11-13; it should be understood that not all of the shown components are required, and more or fewer components may be implemented instead.
The memory 11 includes an internal memory and at least one type of readable storage medium. The internal memory provides a cache for the operation of the electronic device 1; the readable storage medium may be a non-volatile storage medium such as a flash memory, hard disk, multimedia card, card-type memory (for example SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk or optical disk. In some embodiments the readable storage medium may be an internal storage unit of the electronic device 1, for example the hard disk of the electronic device 1; in other embodiments it may be an external storage device of the electronic device 1, for example a plug-in hard disk, smart media card (SMC), secure digital (SD) card or flash card equipped on the electronic device 1. In this embodiment, the readable storage medium of the memory 11 is usually used to store the operating system and various application software installed on the electronic device 1, for example the program code of the processing system of an embodiment of the present invention. The memory 11 can also be used to temporarily store various types of data that have been output or will be output.
The processor 12 may, in some embodiments, be a central processing unit (CPU), controller, microcontroller, microprocessor or other data processing chip. The processor 12 is usually used to control the overall operation of the electronic device 1, for example to perform control and processing related to data exchange or communication with the terminal device. In this embodiment, the processor 12 is used to run the program code stored in the memory 11 or to process data, for example to run the processing system.
The network interface 13 may include a wireless network interface or a wired network interface, and is usually used to establish a communication connection between the electronic device 1 and other electronic equipment. In this embodiment, the network interface 13 is mainly used to connect the electronic device 1 with one or more terminal devices and to establish data transmission channels and communication connections between the electronic device 1 and the one or more terminal devices.
The processing system is stored in the memory 11 and includes at least one computer-readable instruction stored in the memory 11. The at least one computer-readable instruction can be executed by the processor 12 to implement the methods of the embodiments of the present application, and it can be divided into different logical modules according to the functions that its parts implement.
In one embodiment, the following steps are implemented when the above processing system is executed by the processor 12:
An acoustic model establishment step: in an IVR scenario, when a user transacts business, announce a random code of a first preset number of digits for the user to read aloud, and after the user reads it, build an acoustic model of a preset type for the announced random code and another for the user's read-aloud speech.
In the IVR scenario, the user sends an identity code, such as an ID card number, when requesting to transact business. After receiving the user's request, the system analyzes whether the business being transacted requires further identity verification and, according to the identity code, whether the user has a registered voiceprint. If further identity verification is required and the user has a registered voiceprint, a random code of a first preset number of digits (for example 8 digits) is generated and announced in speech form using speech synthesis, and the user is guided to read it aloud.
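As a concrete illustration of the announcement step, the following is a minimal Python sketch that generates an 8-digit random code and hands it to the IVR prompt; the announce callback standing in for the IVR speech-synthesis interface is a hypothetical placeholder, not part of the patent.

```python
import secrets

def generate_random_code(num_digits: int = 8) -> str:
    # A random code of the preset number of digits, freshly drawn per session.
    return "".join(secrets.choice("0123456789") for _ in range(num_digits))

def prompt_user(announce) -> str:
    # 'announce' is a hypothetical callback into the IVR text-to-speech layer.
    code = generate_random_code()
    announce("Please read the following digits aloud: " + " ".join(code))
    return code
```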
After the user reads the code aloud, an acoustic model of a preset type is built for the speech of the announced random code and another acoustic model of the preset type is built for the user's read-aloud speech. In a preferred embodiment, the acoustic model of the preset type is a deep neural network-hidden Markov model, i.e. a DNN-HMM acoustic model; in other embodiments it may also be another acoustic model, for example a hidden Markov acoustic model.
In a specific example, taking the DNN-HMM acoustic model as an example, the HMM describes the dynamic changes of the speech signal, and each output node of the DNN estimates the posterior probability of a state of a continuous-density HMM, which yields the DNN-HMM model. The speech of the announced random code and the user's read-aloud speech are both series of syllables, which are recognized into words and hence into series of characters. In this embodiment, when the DNN-HMM acoustic models are built, the DNN-HMM acoustic model of the speech of the announced random code and the DNN-HMM acoustic model of the user's read-aloud speech are obtained, on the basis of a predetermined character speech library, through adaptive training with global character acoustics.
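For readers unfamiliar with hybrid DNN-HMM modelling, the sketch below shows the standard way the DNN's state posteriors are turned into HMM emission scores by dividing out the state priors; it is a general illustration of the technique under that assumption, not code taken from the patent.

```python
import numpy as np

def emission_log_scores(dnn_posteriors: np.ndarray,
                        state_priors: np.ndarray) -> np.ndarray:
    # In a hybrid DNN-HMM, the emission likelihood of state s at frame t is
    # approximated by the scaled posterior p(s | o_t) / p(s), so the log
    # emission score is log p(s | o_t) - log p(s).
    eps = 1e-10
    return np.log(dnn_posteriors + eps) - np.log(state_priors + eps)
```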
A forced overall alignment step: perform a forced overall alignment of the acoustic model of the announced random code and the acoustic model of the user's read-aloud speech, and calculate, with a predetermined algorithm, the probability that the two aligned acoustic models are identical.
Performing a forced overall alignment (forced alignment) of the acoustic model of the announced random code and the acoustic model of the user's read-aloud speech, compared with the traditional method of word-by-word comparison, greatly reduces the amount of computation in this embodiment and helps to improve recognition efficiency.
The predetermined algorithm is, in one embodiment, a posterior probability algorithm. In other embodiments it may also be a similarity algorithm, for example the edit distance between the characters of the two aligned acoustic models (the smaller the edit distance, the larger the probability that the two aligned acoustic models are identical), or the longest common subsequence algorithm (the closer the length of the longest common subsequence is to the lengths of the character sequences of the two aligned acoustic models, the larger the probability that the two aligned acoustic models are identical).
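As a hedged illustration of the similarity-algorithm variant, the sketch below computes the edit distance between the two aligned character sequences and maps it to a score in [0, 1]; the mapping 1 - distance / max_length is an assumption made for illustration, not a formula stated in the patent.

```python
def edit_distance(a: str, b: str) -> int:
    # Classic single-row dynamic-programming Levenshtein distance.
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            cur = dp[j]
            dp[j] = min(dp[j] + 1,          # deletion
                        dp[j - 1] + 1,      # insertion
                        prev + (ca != cb))  # substitution
            prev = cur
    return dp[-1]

def sequence_similarity(announced: str, read_aloud: str) -> float:
    # Assumed mapping from edit distance to an "identical probability" score.
    longest = max(len(announced), len(read_aloud), 1)
    return 1.0 - edit_distance(announced, read_aloud) / longest
```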
An identity verification step: if the probability that the two aligned acoustic models are identical exceeds a preset first threshold, extract the voiceprint feature vector of the user's read-aloud speech, obtain the standard voiceprint feature vector stored for the user at registration, and calculate the distance between the two vectors so as to verify the user's identity.
In this embodiment, if the probability that the two aligned acoustic models are identical exceeds the preset first threshold (for example 0.985), the characters read aloud by the user are considered consistent with the announced random code. Since what is announced is a random code, fraud with a pre-prepared synthesized voice is effectively prevented, which improves the security of identity recognition.
In one embodiment, the step of extracting the voiceprint feature vector of the user's read-aloud speech includes: applying pre-emphasis and windowing to the read-aloud speech, applying a Fourier transform to each windowed frame to obtain the corresponding spectrum, and passing the spectrum through a Mel filter bank to obtain the Mel spectrum; performing cepstral analysis on the Mel spectrum to obtain the Mel-frequency cepstral coefficients (MFCC), and forming the voiceprint feature vector of the user's read-aloud speech from the MFCC.
Specifically, the read-aloud speech is first divided into frames, and pre-emphasis is then applied to the framed speech data. Pre-emphasis is in fact a high-pass filtering operation that filters out low-frequency data so that the high-frequency characteristics of the speech data stand out; the transfer function of the high-pass filter is H(z) = 1 - αz^{-1}, where z is the speech data and α is a constant factor whose value is preferably 0.97. Since the framed speech deviates from the original speech to some extent, windowing also needs to be applied to the speech data.
In this embodiment, cepstral analysis on the Mel spectrum consists, for example, of taking the logarithm and applying an inverse transform; the inverse transform is usually realized by a discrete cosine transform (DCT), and the 2nd to 13th coefficients after the DCT are taken as the Mel-frequency cepstral coefficients (MFCC). The MFCC of each frame are the voiceprint features of that frame of speech data, and the MFCC of all frames form a feature matrix, which is the voiceprint feature vector of the user's read-aloud speech.
In this embodiment, the MFCC of the speech data are taken to form the corresponding voiceprint feature vector. Because their frequency bands, unlike the linearly spaced bands of the normal cepstrum, approximate the human auditory system more closely, the accuracy of identity verification can be improved.
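To make the feature-extraction pipeline above concrete, here is a minimal NumPy/SciPy sketch that follows the described steps (pre-emphasis with α = 0.97, framing and Hamming windowing, FFT, Mel filter bank, logarithm, DCT, keeping the 2nd to 13th coefficients); the frame length, hop size and number of Mel filters are illustrative assumptions, not values given in the patent.

```python
import numpy as np
from scipy.fftpack import dct

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    # Triangular filters spaced evenly on the Mel scale.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sr).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):
            fbank[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fbank[i - 1, k] = (right - k) / max(right - center, 1)
    return fbank

def mfcc_matrix(signal, sr, frame_len=0.025, frame_hop=0.01,
                alpha=0.97, n_filters=26, n_fft=512):
    # 1. Pre-emphasis: y[n] = x[n] - alpha * x[n-1]  (H(z) = 1 - alpha * z^-1).
    emphasized = np.append(signal[0], signal[1:] - alpha * signal[:-1])
    # 2. Framing (signal assumed longer than one frame) and Hamming windowing.
    flen, fhop = int(frame_len * sr), int(frame_hop * sr)
    n_frames = 1 + (len(emphasized) - flen) // fhop
    frames = np.stack([emphasized[i * fhop:i * fhop + flen]
                       for i in range(n_frames)]) * np.hamming(flen)
    # 3. FFT -> power spectrum -> Mel filter bank -> log (cepstral analysis).
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2
    log_mel = np.log(power @ mel_filterbank(n_filters, n_fft, sr).T + 1e-10)
    # 4. DCT; keep the 2nd to 13th coefficients as the MFCC of each frame.
    cepstra = dct(log_mel, type=2, axis=1, norm='ortho')
    return cepstra[:, 1:13]          # one 12-dimensional MFCC row per frame
```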
In one embodiment, the distance between the voiceprint feature vector of the user's read-aloud speech and the standard voiceprint feature vector is calculated as the cosine distance of the two, namely $d(\vec{A},\vec{B}) = 1 - \frac{\vec{A}\cdot\vec{B}}{\lVert\vec{A}\rVert\,\lVert\vec{B}\rVert}$,
where $\vec{A}$ is the standard voiceprint feature vector and $\vec{B}$ is the voiceprint feature vector of the user's read-aloud speech.
If the cosine distance is less than or equal to a preset distance threshold, the identity verification passes; if the cosine distance is greater than the preset distance threshold, the identity verification does not pass.
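A minimal sketch of this verification decision follows, assuming the per-frame MFCC matrices are first averaged over frames into fixed-length vectors; that averaging step and the 0.3 threshold are assumptions for illustration, since the patent does not state how the frame-level matrix is reduced to a single vector or what threshold is used.

```python
import numpy as np

def cosine_distance(a: np.ndarray, b: np.ndarray) -> float:
    # d(A, B) = 1 - (A . B) / (||A|| * ||B||)
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def verify_identity(standard_mfcc: np.ndarray, current_mfcc: np.ndarray,
                    distance_threshold: float = 0.3) -> bool:
    # Average the per-frame MFCC rows into one voiceprint vector (assumed step).
    standard_vec = standard_mfcc.mean(axis=0)
    current_vec = current_mfcc.mean(axis=0)
    # Pass if the cosine distance does not exceed the preset threshold.
    return cosine_distance(standard_vec, current_vec) <= distance_threshold
```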
In one embodiment, the standard voiceprint feature vector stored after the user registers successfully is obtained through the following voiceprint-registration steps:
When a user registers a voiceprint in an IVR scenario, announce a random code of a second preset number of digits for the user to read aloud a preset number of times, and after each reading build an acoustic model of the preset type for the announced random code and another for the read-aloud speech;
Perform a forced overall alignment of the acoustic model of each announced random code and the acoustic model of the corresponding read-aloud speech, and calculate, with the predetermined algorithm, the probability that the two aligned acoustic models are identical;
If, for every reading, the probability that the two aligned acoustic models are identical exceeds a preset second threshold, extract the voiceprint feature vector of each read-aloud utterance and calculate the pairwise distances between the vectors to analyze whether every reading comes from the same user;
If so, store the voiceprint feature vector as the user's standard voiceprint feature vector;
If not, prompt the user to read again and repeat the voiceprint-registration steps.
Specifically, in the IVR scenario the user sends an identity code, such as an ID card number, when requesting to register. After receiving the user's request, a random code of a second preset number of digits (for example 8 digits) is generated and announced in speech form using speech synthesis, and the user is guided to read it aloud a preset number of times (for example 3 times).
After each reading, an acoustic model of the preset type is built for the speech of the announced random code and another for the user's read-aloud speech. In a preferred embodiment, the acoustic model of the preset type is a deep neural network-hidden Markov model, i.e. a DNN-HMM acoustic model; in other embodiments it may also be another acoustic model, for example a hidden Markov acoustic model. For specific examples, refer to the embodiment above; details are not repeated here.
In a specific example, taking the DNN-HMM acoustic model as an example, the HMM describes the dynamic changes of the speech signal, and each output node of the DNN estimates the posterior probability of a state of a continuous-density HMM, which yields the DNN-HMM model. The speech of each announced random code and the user's read-aloud speech are series of syllables, which are recognized into words and hence into series of characters. In this embodiment, when the DNN-HMM acoustic models are built, the DNN-HMM acoustic model of the speech of each announced random code and the DNN-HMM acoustic model of the user's read-aloud speech are obtained, on the basis of a predetermined character speech library, through adaptive training with global character acoustics.
Performing a forced overall alignment (forced alignment) of the acoustic model of each announced random code and the acoustic model of the user's read-aloud speech, compared with the traditional word-by-word comparison, greatly reduces the amount of computation and helps to improve recognition efficiency.
The predetermined algorithm is, in one embodiment, a posterior probability algorithm; in other embodiments it may also be a similarity algorithm. For specific examples, refer to the embodiment above; details are not repeated here.
In this embodiment, if the probability that the two aligned acoustic models are identical exceeds the preset second threshold (for example 0.985) for every reading, the characters read aloud by the user each time are considered consistent with the announced random code. Since what is announced is a random code, fraud with a pre-prepared synthesized voice is effectively prevented, which improves the security of identity recognition.
In one embodiment, the step of extracting the voiceprint feature vector of each read-aloud utterance is essentially the same as the method of extracting the voiceprint feature vector of speech in the embodiment above; details are not repeated here.
In one embodiment, the step of calculating the pairwise distances between the voiceprint feature vectors is essentially the same as the calculation of the cosine distance above; details are not repeated here.
If each cosine distance is less than or equal to the preset distance threshold, every reading comes from the same user, and the voiceprint feature vector is stored as the user's standard voiceprint feature vector; if a cosine distance is greater than the preset distance threshold, the readings do not all come from the same user, and the user is prompted to register again.
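A hedged sketch of this enrollment check follows, reusing the cosine_distance helper from the sketch above; averaging the accepted enrollment vectors into the stored standard voiceprint is an assumption for illustration, since the patent only says that "the voiceprint feature vector" is stored.

```python
from itertools import combinations
import numpy as np

def enroll_voiceprint(enrollment_vectors, distance_threshold: float = 0.3):
    # enrollment_vectors: one fixed-length voiceprint vector per reading.
    # Returns the standard voiceprint vector if all pairwise cosine distances
    # stay within the threshold, otherwise None so the caller can re-prompt.
    for a, b in combinations(enrollment_vectors, 2):
        if cosine_distance(a, b) > distance_threshold:
            return None                              # not the same speaker
    return np.mean(enrollment_vectors, axis=0)       # assumed choice of stored vector
```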
Compared with the prior art, when identity recognition is performed in an IVR scenario, having the user read a random code aloud effectively prevents fraud with a pre-prepared synthesized voice, and combining the random code with voiceprint recognition realizes a double verification of the user's identity, so that the identity can be confirmed accurately and the security of identity verification in IVR scenarios is improved. In addition, performing a forced overall alignment of the acoustic model of the announced random code and the acoustic model of the user's read-aloud speech reduces the amount of computation and improves recognition efficiency.
As shown in Fig. 2, which is a flow diagram of one embodiment of the identity verification method of the present invention, the identity verification method includes the following steps:
Step S1: in an IVR scenario, when a user transacts business, announce a random code of a first preset number of digits for the user to read aloud, and after the user reads it, build an acoustic model of a preset type for the announced random code and another for the user's read-aloud speech.
In the IVR scenario, the user sends an identity code, such as an ID card number, when requesting to transact business. After receiving the user's request, the system analyzes whether the business being transacted requires further identity verification and, according to the identity code, whether the user has a registered voiceprint. If further identity verification is required and the user has a registered voiceprint, a random code of a first preset number of digits (for example 8 digits) is generated and announced in speech form using speech synthesis, and the user is guided to read it aloud.
After the user reads the code aloud, an acoustic model of a preset type is built for the speech of the announced random code and another for the user's read-aloud speech. In a preferred embodiment, the acoustic model of the preset type is a deep neural network-hidden Markov model, i.e. a DNN-HMM acoustic model; in other embodiments it may also be another acoustic model, for example a hidden Markov acoustic model.
In a specific example, taking the DNN-HMM acoustic model as an example, the HMM describes the dynamic changes of the speech signal, and each output node of the DNN estimates the posterior probability of a state of a continuous-density HMM, which yields the DNN-HMM model. The speech of the announced random code and the user's read-aloud speech are both series of syllables, which are recognized into words and hence into series of characters. In this embodiment, when the DNN-HMM acoustic models are built, the DNN-HMM acoustic model of the speech of the announced random code and the DNN-HMM acoustic model of the user's read-aloud speech are obtained, on the basis of a predetermined character speech library, through adaptive training with global character acoustics.
Step S2: perform a forced overall alignment of the acoustic model of the announced random code and the acoustic model of the user's read-aloud speech, and calculate, with a predetermined algorithm, the probability that the two aligned acoustic models are identical.
The forced overall alignment (forced alignment) of the two acoustic models, compared with the traditional method of word-by-word comparison, greatly reduces the amount of computation and helps to improve recognition efficiency.
The predetermined algorithm is, in one embodiment, a posterior probability algorithm; in other embodiments it may also be a similarity algorithm, for example the edit distance between the characters of the two aligned acoustic models (the smaller the edit distance, the larger the probability that the two aligned acoustic models are identical), or the longest common subsequence algorithm (the closer the length of the longest common subsequence is to the lengths of the character sequences of the two aligned acoustic models, the larger the probability that the two aligned acoustic models are identical).
Step S3: if the probability that the two aligned acoustic models are identical exceeds a preset first threshold, extract the voiceprint feature vector of the user's read-aloud speech, obtain the standard voiceprint feature vector stored for the user at registration, and calculate the distance between the two vectors so as to verify the user's identity.
In this embodiment, if the probability that the two aligned acoustic models are identical exceeds the preset first threshold (for example 0.985), the characters read aloud by the user are considered consistent with the announced random code. Since what is announced is a random code, fraud with a pre-prepared synthesized voice is effectively prevented, which improves the security of identity recognition.
In one embodiment, the step of extracting the voiceprint feature vector of the user's read-aloud speech includes: applying pre-emphasis and windowing to the read-aloud speech, applying a Fourier transform to each windowed frame to obtain the corresponding spectrum, and passing the spectrum through a Mel filter bank to obtain the Mel spectrum; performing cepstral analysis on the Mel spectrum to obtain the Mel-frequency cepstral coefficients (MFCC), and forming the voiceprint feature vector of the user's read-aloud speech from the MFCC.
Specifically, the read-aloud speech is first divided into frames, and pre-emphasis is then applied to the framed speech data. Pre-emphasis is in fact a high-pass filtering operation that filters out low-frequency data so that the high-frequency characteristics of the speech data stand out; the transfer function of the high-pass filter is H(z) = 1 - αz^{-1}, where z is the speech data and α is a constant factor whose value is preferably 0.97. Since the framed speech deviates from the original speech to some extent, windowing also needs to be applied to the speech data.
In this embodiment, cepstral analysis on the Mel spectrum consists, for example, of taking the logarithm and applying an inverse transform; the inverse transform is usually realized by a discrete cosine transform (DCT), and the 2nd to 13th coefficients after the DCT are taken as the MFCC. The MFCC of each frame are the voiceprint features of that frame, and the MFCC of all frames form a feature matrix, which is the voiceprint feature vector of the user's read-aloud speech. Because the Mel frequency bands, unlike the linearly spaced bands of the normal cepstrum, approximate the human auditory system more closely, the accuracy of identity verification can be improved.
In one embodiment, the distance between the voiceprint feature vector of the user's read-aloud speech and the standard voiceprint feature vector is calculated as the cosine distance of the two, namely $d(\vec{A},\vec{B}) = 1 - \frac{\vec{A}\cdot\vec{B}}{\lVert\vec{A}\rVert\,\lVert\vec{B}\rVert}$, where $\vec{A}$ is the standard voiceprint feature vector and $\vec{B}$ is the voiceprint feature vector of the user's read-aloud speech. If the cosine distance is less than or equal to a preset distance threshold, the identity verification passes; if it is greater than the preset distance threshold, the identity verification does not pass.
In one embodiment, the standard voiceprint feature vector stored after the user registers successfully is obtained through the following voiceprint-registration steps:
When a user registers a voiceprint in an IVR scenario, announce a random code of a second preset number of digits for the user to read aloud a preset number of times, and after each reading build an acoustic model of the preset type for the announced random code and another for the read-aloud speech;
Perform a forced overall alignment of the acoustic model of each announced random code and the acoustic model of the corresponding read-aloud speech, and calculate, with the predetermined algorithm, the probability that the two aligned acoustic models are identical;
If, for every reading, the probability that the two aligned acoustic models are identical exceeds a preset second threshold, extract the voiceprint feature vector of each read-aloud utterance and calculate the pairwise distances between the vectors to analyze whether every reading comes from the same user;
If so, store the voiceprint feature vector as the user's standard voiceprint feature vector;
If not, prompt the user to read again and repeat the voiceprint-registration steps.
Specifically, in the IVR scenario the user sends an identity code, such as an ID card number, when requesting to register. After receiving the user's request, a random code of a second preset number of digits (for example 8 digits) is generated and announced in speech form using speech synthesis, and the user is guided to read it aloud a preset number of times (for example 3 times).
After each reading, an acoustic model of the preset type (preferably a DNN-HMM acoustic model; in other embodiments, for example, a hidden Markov acoustic model) is built for the speech of the announced random code and another for the user's read-aloud speech, in the same way as in the embodiment above; details are not repeated here.
The forced overall alignment of the acoustic model of each announced random code and the acoustic model of the user's read-aloud speech, compared with the traditional word-by-word comparison, greatly reduces the amount of computation and helps to improve recognition efficiency. The predetermined algorithm is, in one embodiment, a posterior probability algorithm; in other embodiments it may also be a similarity algorithm, as in the embodiment above.
In this embodiment, if the probability that the two aligned acoustic models are identical exceeds the preset second threshold (for example 0.985) for every reading, the characters read aloud by the user each time are considered consistent with the announced random code. Since what is announced is a random code, fraud with a pre-prepared synthesized voice is effectively prevented, which improves the security of identity recognition.
The step of extracting the voiceprint feature vector of each read-aloud utterance and the step of calculating the pairwise cosine distances are essentially the same as in the embodiment above; details are not repeated here.
If each cosine distance is less than or equal to the preset distance threshold, every reading comes from the same user, and the voiceprint feature vector is stored as the user's standard voiceprint feature vector; if a cosine distance is greater than the preset distance threshold, the readings do not all come from the same user, and the user is prompted to register again.
The present invention also provides a computer-readable storage medium on which a processing system is stored; when the processing system is executed by a processor, the steps of the identity verification method described above are implemented.
The above embodiment numbers of the present invention are for description only and do not represent the relative merits of the embodiments.
Through the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be realized by software plus a necessary general hardware platform, and of course also by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, or the part of it that contributes to the prior art, can be embodied in the form of a software product. The software product is stored in a storage medium (such as a ROM/RAM, magnetic disk or optical disk) and includes instructions for causing a terminal device (which may be a mobile phone, computer, server, air conditioner, network device, etc.) to execute the methods described in the embodiments of the present invention.
The above are only preferred embodiments of the present invention and are not intended to limit its scope. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present invention, applied directly or indirectly in other related technical fields, is likewise included within the protection scope of the present invention.

Claims (10)

1. An electronic device, characterized in that the electronic device comprises a memory and a processor connected to the memory, the memory stores a processing system that can run on the processor, and the processing system, when executed by the processor, implements the following steps:
an acoustic model establishment step: in an interactive voice response (IVR) scenario, when a user transacts business, announcing a random code of a first preset number of digits for the user to read aloud, and after the user reads it, building an acoustic model of a preset type for the announced random code and another for the user's read-aloud speech;
a forced overall alignment step: performing a forced overall alignment of the acoustic model of the announced random code and the acoustic model of the user's read-aloud speech, and calculating, with a predetermined algorithm, the probability that the two aligned acoustic models are identical;
an identity verification step: if the probability that the two aligned acoustic models are identical exceeds a preset first threshold, extracting the voiceprint feature vector of the user's read-aloud speech, obtaining the standard voiceprint feature vector stored for the user at registration, and calculating the distance between the two vectors so as to verify the user's identity.
2. The electronic device according to claim 1, characterized in that the processing system, when executed by the processor, also implements the following steps:
when a user registers a voiceprint in an IVR scenario, announcing a random code of a second preset number of digits for the user to read aloud a preset number of times, and after each reading building an acoustic model of the preset type for the announced random code and another for the read-aloud speech;
performing a forced overall alignment of the acoustic model of each announced random code and the acoustic model of the corresponding read-aloud speech, and calculating, with the predetermined algorithm, the probability that the two aligned acoustic models are identical;
if, for every reading, the probability that the two aligned acoustic models are identical exceeds a preset second threshold, extracting the voiceprint feature vector of each read-aloud utterance and calculating the pairwise distances between the vectors to analyze whether every reading comes from the same user;
if so, storing the voiceprint feature vector as the user's standard voiceprint feature vector.
3. The electronic device according to claim 1 or 2, characterized in that the acoustic model of the preset type is a deep neural network-hidden Markov model.
4. The electronic device according to claim 1 or 2, characterized in that the step of extracting the voiceprint feature vector of the user's read-aloud speech comprises:
applying pre-emphasis and windowing to the read-aloud speech, applying a Fourier transform to each windowed frame to obtain the corresponding spectrum, and passing the spectrum through a Mel filter bank to obtain the Mel spectrum;
performing cepstral analysis on the Mel spectrum to obtain the Mel-frequency cepstral coefficients (MFCC), and forming the voiceprint feature vector of the user's read-aloud speech from the MFCC.
5. a kind of method of authentication, which is characterized in that the method for the authentication includes:
S1 under interactive voice answering IVR scenes when user's transacting business, reports the random code of the first presetting digit capacity for this Family with reading, and after with reading be respectively this random code for reporting and the user this sound of preset kind is established with the voice of reading Learn model;
S2, by this report random code acoustic model and the user this with the voice of reading acoustic model force it is whole Body alignment operation calculates the identical probability of two acoustic models after the alignment using pre-defined algorithm;
S3, if the identical probability of two acoustic models after the alignment is more than preset first threshold value, extract the user this with reading Voice vocal print feature vector, obtain the standard vocal print feature vector that the user prestores after succeeding in registration, and calculate the use Family this with the voice of reading vocal print feature vector and the standard vocal print feature vector distance, with to the user progress identity test Card.
6. the method for authentication according to claim 5, which is characterized in that before the step S1, further include:
S01 when user carries out voiceprint registration under interactive voice answering IVR scenes, reports the random code of the second presetting digit capacity It is respectively that the random code reported and user establish the default class with the voice of reading after every time with reading for user with reading default time The acoustic model of type;
S02 respectively carries out the acoustic model for the random code reported every time and corresponding user with the acoustic model of the voice of reading Whole alignment operation is forced, the identical probability of two acoustic models after alignment is calculated using pre-defined algorithm;
S03 extracts each user with reading if the identical probability of two acoustic models after alignment is all higher than default second threshold Whether the vocal print feature vector of voice, calculate the distance of vocal print feature vector two-by-two, be every time same with the user of reading with analysis User;
S04, if so, the standard vocal print feature vector using the vocal print feature vector as the user stores.
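Correspondingly, a sketch of the S01-S04 registration decision under assumptions: the audio capture and alignment scoring of S01/S02 are taken as given inputs, and averaging the read-back voiceprints into one stored template is an assumption the claim does not state:

```python
from itertools import combinations

def register_voiceprint(readback_audios, readback_scores, extract_voiceprint, distance,
                        second_threshold=0.8, same_user_threshold=0.5):
    """Hypothetical S01-S04 decision logic; helpers and thresholds are assumed."""
    # S02/S03 gate: every read-back must match its announced code closely enough.
    if not all(score > second_threshold for score in readback_scores):
        return None

    # S03: pairwise distances between the voiceprints of all read-backs.
    voiceprints = [extract_voiceprint(audio) for audio in readback_audios]
    same_user = all(distance(x, y) <= same_user_threshold
                    for x, y in combinations(voiceprints, 2))

    # S04: keep an enrolled template only if all read-backs agree.
    if not same_user:
        return None
    return sum(voiceprints) / len(voiceprints)   # assumption: mean vector as the template
```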
7. the method for authentication according to claim 5 or 6, which is characterized in that the acoustic model of the preset kind For deep neural network-hidden Markov model.
8. the method for authentication according to claim 5 or 6, which is characterized in that described extraction user this with read The step of vocal print feature vector of voice include:
To the user, this with the voice of reading carries out preemphasis and windowing process, and carrying out Fourier transform to each adding window obtains The frequency spectrum is inputted Meier filter to export to obtain Meier frequency spectrum by corresponding frequency spectrum;
Cepstral analysis is carried out on Meier frequency spectrum to obtain mel-frequency cepstrum coefficient MFCC, is based on mel-frequency cepstrum system Number MFCC forms the user, and this is vectorial with the vocal print feature of the voice of reading.
9. the method for authentication according to claim 5 or 6, which is characterized in that described calculating user this with read Voice vocal print feature vector and standard vocal print feature vector apart from the step of include:
Wherein, describedIt is described for standard vocal print feature vectorFor the user this with reading voice Vocal print feature vector.
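Since the distance formula itself is only available as an image, the following is one plausible instantiation, an assumption rather than the patent's formula: the cosine distance between the two voiceprint vectors.

```python
import numpy as np

def voiceprint_distance(probe, enrolled):
    """Assumed metric: cosine distance between two voiceprint feature vectors.
    The patent's actual formula is not reproduced in the text."""
    probe = np.asarray(probe, dtype=float)
    enrolled = np.asarray(enrolled, dtype=float)
    cosine = np.dot(probe, enrolled) / (np.linalg.norm(probe) * np.linalg.norm(enrolled))
    return 1.0 - cosine
```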
10. A computer-readable storage medium, characterized in that a processing system is stored on the computer-readable storage medium, and when the processing system is executed by a processor, the steps of the method of identity verification according to any one of claims 5 to 9 are implemented.
CN201810311721.2A 2018-04-09 2018-04-09 Electronic device, identity authentication method and storage medium Active CN108694952B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201810311721.2A CN108694952B (en) 2018-04-09 2018-04-09 Electronic device, identity authentication method and storage medium
PCT/CN2018/102208 WO2019196305A1 (en) 2018-04-09 2018-08-24 Electronic device, identity verification method, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810311721.2A CN108694952B (en) 2018-04-09 2018-04-09 Electronic device, identity authentication method and storage medium

Publications (2)

Publication Number Publication Date
CN108694952A (en) 2018-10-23
CN108694952B CN108694952B (en) 2020-04-28

Family

ID=63844884

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810311721.2A Active CN108694952B (en) 2018-04-09 2018-04-09 Electronic device, identity authentication method and storage medium

Country Status (2)

Country Link
CN (1) CN108694952B (en)
WO (1) WO2019196305A1 (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103680497A (en) * 2012-08-31 2014-03-26 百度在线网络技术(北京)有限公司 Voice recognition system and voice recognition method based on video
CN103986725A (en) * 2014-05-29 2014-08-13 中国农业银行股份有限公司 Client side, server side and identity authentication system and method
CN107517207A (en) * 2017-03-13 2017-12-26 平安科技(深圳)有限公司 Server, auth method and computer-readable recording medium

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109448732A (en) * 2018-12-27 2019-03-08 科大讯飞股份有限公司 A kind of digit string processing method and processing device
CN109448732B (en) * 2018-12-27 2021-06-08 科大讯飞股份有限公司 Digital string voice processing method and device
CN110536029A (en) * 2019-08-15 2019-12-03 咪咕音乐有限公司 A kind of exchange method, network side equipment, terminal device, storage medium and system
CN110491393A (en) * 2019-08-30 2019-11-22 科大讯飞股份有限公司 The training method and relevant apparatus of vocal print characterization model
CN110491393B (en) * 2019-08-30 2022-04-22 科大讯飞股份有限公司 Training method of voiceprint representation model and related device
CN111161746A (en) * 2019-12-31 2020-05-15 苏州思必驰信息科技有限公司 Voiceprint registration method and system

Also Published As

Publication number Publication date
WO2019196305A1 (en) 2019-10-17
CN108694952B (en) 2020-04-28

Similar Documents

Publication Publication Date Title
CN107527620B (en) Electronic device, the method for authentication and computer readable storage medium
CN107517207A (en) Server, auth method and computer-readable recording medium
US9940935B2 (en) Method and device for voiceprint recognition
US11862176B2 (en) Reverberation compensation for far-field speaker recognition
CN108694952A (en) Electronic device, the method for authentication and storage medium
CN107993071A (en) Electronic device, auth method and storage medium based on vocal print
CN108154371A (en) Electronic device, the method for authentication and storage medium
CN108281158A (en) Voice biopsy method, server and storage medium based on deep learning
CN110473552A (en) Speech recognition authentication method and system
EP3989217A1 (en) Method for detecting an audio adversarial attack with respect to a voice input processed by an automatic speech recognition system, corresponding device, computer program product and computer-readable carrier medium
CN108650266A (en) Server, the method for voice print verification and storage medium
CN109065022A (en) I-vector vector extracting method, method for distinguishing speek person, device, equipment and medium
CN110753263A (en) Video dubbing method, device, terminal and storage medium
CN112037800A (en) Voiceprint nuclear model training method and device, medium and electronic equipment
CN107229691A (en) A kind of method and apparatus for being used to provide social object
CN109545226B (en) Voice recognition method, device and computer readable storage medium
CN113112992B (en) Voice recognition method and device, storage medium and server
CN109739968A (en) A kind of data processing method and device
CN108630208B (en) Server, voiceprint-based identity authentication method and storage medium
CN112382296A (en) Method and device for voiceprint remote control of wireless audio equipment
CN113436633B (en) Speaker recognition method, speaker recognition device, computer equipment and storage medium
CN115101054A (en) Voice recognition method, device and equipment based on hot word graph and storage medium
CN116975823A (en) Data processing method, device, computer equipment, storage medium and product
CN115223569A (en) Speaker verification method based on deep neural network, terminal and storage medium
CN116403585A (en) Outbound customer identification method and system based on robustness characteristics

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant