CN107517207A - Server, identity verification method and computer-readable storage medium - Google Patents

Server, identity verification method and computer-readable storage medium

Info

Publication number
CN107517207A
Authority
CN
China
Prior art keywords
voiceprint
voiceprint feature
voice
password
feature vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710715433.9A
Other languages
Chinese (zh)
Inventor
王健宗
查高密
程宁
肖京
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to PCT/CN2017/105031 (WO2018166187A1)
Publication of CN107517207A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/08 Network architectures or network communication protocols for network security for authentication of entities
    • H04L63/0861 Network architectures or network communication protocols for network security for authentication of entities using biometrical features, e.g. fingerprint, retina-scan
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/02 Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/04 Training, enrolment or model building
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/06 Decision making techniques; Pattern matching strategies
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/18 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/24 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being the cepstrum

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Acoustics & Sound (AREA)
  • Human Computer Interaction (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Biomedical Technology (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Business, Economics & Management (AREA)
  • General Engineering & Computer Science (AREA)
  • Game Theory and Decision Science (AREA)
  • Collating Specific Patterns (AREA)
  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The present invention relates to a server, an identity verification method and a computer-readable storage medium. The server includes a memory and a processor connected to the memory; the memory stores an identity verification system executable on the processor, and the following steps are implemented when the identity verification system is executed by the processor: after an identity verification request is received, a voice acquisition text is sent at random to the client; the password voice spoken by the user and sent by the client is received, and the password characters corresponding to the password voice are recognized; if the password characters are consistent with the standard password characters corresponding to the voice acquisition text, the current voiceprint feature vector of the password voice is constructed, the corresponding standard voiceprint feature vector is determined according to a predetermined mapping relationship, the distance between the current voiceprint feature vector and the determined standard voiceprint feature vector is calculated using a predetermined distance formula, and the user's identity is verified according to the distance. The present invention can improve the security of identity verification.

Description

Server, identity verification method and computer-readable storage medium
Technical field
The present invention relates to the field of communication technology, and in particular to a server, an identity verification method and a computer-readable storage medium.
Background
At present, the business of large financial companies spans insurance, banking, investment and other lines. Each line of business generally needs to communicate with the same customers, and this communication takes several forms (for example, by telephone or face to face). Verifying the customer's identity before communicating is an important part of ensuring business security.
To meet the real-time demands of the business, many financial companies verify customer identity manually. Because the customer base is huge, however, manual verification is neither accurate nor efficient. To address this, other existing schemes use voiceprint-based identity verification, but such schemes cannot prevent a criminal from passing voiceprint verification with a fake recording, which leaves a security risk.
Summary of the invention
The object of the present invention is to provide a server, an identity verification method and a computer-readable storage medium that improve the security of identity verification.
To achieve the above object, the present invention provides a server. The server includes a memory and a processor connected to the memory; the memory stores an identity verification system executable on the processor, and the following steps are implemented when the identity verification system is executed by the processor:
S1, after an identity verification request carrying an identity identifier is received from a client, randomly sending to the client a voice acquisition text for the user to respond to;
S2, receiving the password voice that the user speaks in response to the voice acquisition text and that the client sends, and performing character recognition on the password voice to identify the password characters corresponding to the password voice;
S3, if the password characters are consistent with the standard password characters corresponding to the voice acquisition text, constructing the current voiceprint feature vector of the password voice, determining the standard voiceprint feature vector corresponding to the user's identity identifier according to a predetermined mapping relationship between identity identifiers and standard voiceprint feature vectors, calculating the distance between the current voiceprint feature vector and the determined standard voiceprint feature vector using a predetermined distance formula, and verifying the user's identity according to the distance.
Preferably, step S2 includes:
receiving the password voice spoken by the user and sent by the client, and analyzing whether the password voice is usable; if the password voice is unusable, prompting the client to record the password voice again; if the password voice is usable, performing character recognition on the password voice.
Preferably, when the identity verification system is executed by the processor, the following steps are also implemented:
if the password characters are inconsistent with the standard password characters corresponding to the voice acquisition text, randomly sending another voice acquisition text to the client for the user to respond to;
counting the cumulative number of voice acquisition texts sent to the client and, if the number is greater than or equal to a preset number, terminating the response to the identity verification request.
Preferably, the step of constructing the current voiceprint feature vector of the password voice includes:
processing the password voice with a predetermined filter to extract voiceprint features of a preset type, and constructing the voiceprint feature vector corresponding to the password voice based on the extracted voiceprint features;
inputting the constructed voiceprint feature vector into a pre-trained background channel model to construct the current voiceprint feature vector.
The step of calculating the distance between the current voiceprint feature vector and the determined standard voiceprint feature vector using a predetermined distance formula, and verifying the user's identity according to the distance, includes:
calculating the cosine distance between the current voiceprint discriminant vector and the determined standard voiceprint feature vector, the two operands of the calculation being the standard voiceprint feature vector and the current voiceprint feature vector;
if the cosine distance is less than or equal to a preset distance threshold, the identity verification passes;
if the cosine distance is greater than the preset distance threshold, the identity verification fails.
To achieve the above object, the present invention also provides a server. The server includes a memory and a processor connected to the memory; the memory stores a voiceprint-recognition-based identity verification system executable on the processor, and the following steps are implemented when that system is executed by the processor:
S101, after the voice data of a user undergoing identity verification is received, obtaining the voiceprint features of the voice data and constructing the corresponding voiceprint feature vector based on the voiceprint features;
S102, inputting the voiceprint feature vector into a pre-trained background channel model to construct the current voiceprint discriminant vector corresponding to the voice data;
S103, calculating the spatial distance between the current voiceprint discriminant vector and the pre-stored standard voiceprint discriminant vector of the user, verifying the user's identity based on the distance, and generating a verification result.
To achieve the above object, the present invention also provides an identity verification method, which includes:
S1, after an identity verification request carrying an identity identifier is received from a client, randomly sending to the client a voice acquisition text for the user to respond to;
S2, receiving the password voice that the user speaks in response to the voice acquisition text and that the client sends, and performing character recognition on the password voice to identify the password characters corresponding to the password voice;
S3, if the password characters are consistent with the standard password characters corresponding to the voice acquisition text, constructing the current voiceprint feature vector of the password voice, determining the standard voiceprint feature vector corresponding to the user's identity identifier according to a predetermined mapping relationship between identity identifiers and standard voiceprint feature vectors, calculating the distance between the current voiceprint feature vector and the determined standard voiceprint feature vector using a predetermined distance formula, and verifying the user's identity according to the distance.
Preferably, step S2 includes:
receiving the password voice spoken by the user and sent by the client, and analyzing whether the password voice is usable; if the password voice is unusable, prompting the client to record the password voice again; if the password voice is usable, performing character recognition on the password voice.
Preferably, the following is also included after step S2:
if the password characters are inconsistent with the standard password characters corresponding to the voice acquisition text, randomly sending another voice acquisition text to the client for the user to respond to;
counting the cumulative number of voice acquisition texts sent to the client and, if the number is greater than or equal to a preset number, terminating the response to the identity verification request.
Preferably, the step of constructing the current voiceprint feature vector of the password voice includes:
processing the password voice with a predetermined filter to extract voiceprint features of a preset type, and constructing the voiceprint feature vector corresponding to the password voice based on the extracted voiceprint features;
inputting the constructed voiceprint feature vector into a pre-trained background channel model to construct the current voiceprint feature vector.
The step of calculating the distance between the current voiceprint feature vector and the determined standard voiceprint feature vector using a predetermined distance formula, and verifying the user's identity according to the distance, includes:
calculating the cosine distance between the current voiceprint discriminant vector and the determined standard voiceprint feature vector, the two operands of the calculation being the standard voiceprint feature vector and the current voiceprint feature vector;
if the cosine distance is less than or equal to a preset distance threshold, the identity verification passes;
if the cosine distance is greater than the preset distance threshold, the identity verification fails.
Preferably, the background channel model is a Gaussian mixture model, and training the background channel model includes:
obtaining a preset number of voice data samples, obtaining the voiceprint features corresponding to each voice data sample, and constructing the voiceprint feature vector corresponding to each voice data sample based on those voiceprint features;
dividing the voiceprint feature vectors corresponding to the voice data samples into a training set of a first ratio and a validation set of a second ratio, the sum of the first ratio and the second ratio being less than or equal to 1;
training the Gaussian mixture model with the voiceprint feature vectors in the training set and, after training is complete, verifying the accuracy of the trained Gaussian mixture model with the validation set;
if the accuracy is greater than a preset threshold, ending model training and using the trained Gaussian mixture model as the background channel model; if the accuracy is less than or equal to the preset threshold, increasing the number of voice data samples and retraining based on the enlarged set of voice data samples.
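The split-and-retrain procedure above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: `load_samples`, `fit` and `accuracy` are hypothetical callables standing in for sample collection, Gaussian mixture model training and validation scoring, and the starting size, step size and upper bound on the sample count are invented for the example.

```python
import random

def split_samples(vectors, train_ratio, val_ratio, seed=0):
    """Split feature vectors into a training set and a validation set.

    The two ratios must be positive and sum to at most 1, as the text requires.
    """
    if train_ratio <= 0 or val_ratio <= 0 or train_ratio + val_ratio > 1 + 1e-9:
        raise ValueError("ratios must be positive and sum to at most 1")
    shuffled = list(vectors)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_ratio)
    n_val = int(len(shuffled) * val_ratio)
    return shuffled[:n_train], shuffled[n_train:n_train + n_val]

def train_background_model(load_samples, fit, accuracy, threshold,
                           start, step, max_samples,
                           train_ratio=0.75, val_ratio=0.25):
    """Retrain with more samples until validation accuracy exceeds the threshold.

    load_samples(n), fit(train) and accuracy(model, val) are hypothetical hooks;
    max_samples is an assumed safety bound that the source text does not mention.
    """
    n = start
    while n <= max_samples:
        train, val = split_samples(load_samples(n), train_ratio, val_ratio)
        model = fit(train)
        if accuracy(model, val) > threshold:
            return model  # accuracy is high enough: training ends
        n += step         # otherwise enlarge the sample set and retrain
    raise RuntimeError("accuracy threshold not reached within max_samples")
```

In practice `fit` would wrap whatever Gaussian mixture trainer is in use; the loop only shows the control flow of enlarging the sample set when accuracy is at or below the threshold.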
The present invention also provides a computer-readable storage medium on which an identity verification system is stored; when executed by a processor, the identity verification system implements the steps of the identity verification method described above.
The beneficial effects of the present invention are as follows. If another person attempts identity verification with a fake recording that already exists or was prepared in advance, then, because the voice acquisition text is sent at random, the recognized password characters will not match the corresponding standard password characters, so such recordings are rejected. If another person records their own voice instead, the subsequent voiceprint feature check fails. The present embodiment therefore amounts to two rounds of identity verification, giving a double-verification effect, and improves the security of identity verification while maintaining the accuracy and efficiency of user identity verification.
Brief description of the drawings
Fig. 1 is a schematic diagram of an optional application environment for the embodiments of the present invention;
Fig. 2 is a schematic flowchart of an embodiment of the identity verification method of the present invention.
Embodiment
In order to make the purpose, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are intended only to explain the present invention and not to limit it. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
It should be noted that descriptions involving "first", "second" and the like in the present invention are for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of the technical features referred to. A feature qualified by "first" or "second" may therefore explicitly or implicitly include at least one such feature. In addition, the technical solutions of the embodiments may be combined with one another, but only where those of ordinary skill in the art can implement the combination; where a combination of technical solutions is contradictory or cannot be implemented, the combination should be regarded as non-existent and outside the scope of protection claimed by the present invention.
Fig. 1 is a schematic diagram of the application environment of a preferred embodiment of the identity verification method of the present invention. The application environment includes a server 1 and a terminal device 2. The server 1 can exchange data with the terminal device 2 through a network, near-field communication, or another suitable technology.
A client for sending identity verification requests to the server 1 is installed on the terminal device 2. The terminal device 2 includes, but is not limited to, any electronic product capable of human-computer interaction with a user through a keyboard, mouse, remote control, touch pad, voice-control device or the like, for example a mobile device such as a personal computer, tablet computer, smartphone, personal digital assistant (PDA), game console, Internet Protocol Television (IPTV), smart wearable device or navigation device, or a fixed terminal such as a digital TV, desktop computer, notebook or server.
The server 1 is a device capable of automatically performing numerical computation and/or information processing according to instructions set or stored in advance. The server 1 may be a computer, a single network server, a server group composed of multiple network servers, or a cloud-computing-based cloud composed of a large number of hosts or network servers, where cloud computing is a form of distributed computing: a super virtual computer made up of a group of loosely coupled computers.
In the present embodiment, the server 1 may include, but is not limited to, a memory 11, a processor 12 and a network interface 13 that are communicatively connected to one another through a system bus; the memory 11 stores an identity verification system executable on the processor 12. It should be noted that Fig. 1 shows only the server 1 with components 11-13; it should be understood that not all of the illustrated components are required, and more or fewer components may be implemented instead.
The memory 11 includes internal memory and at least one type of readable storage medium. The internal memory provides a cache for the operation of the server 1. The readable storage medium may be a non-volatile storage medium such as flash memory, a hard disk, a multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, a magnetic disk or an optical disc. In some embodiments, the readable storage medium may be an internal storage unit of the server 1, such as the hard disk of the server 1; in other embodiments, the non-volatile storage medium may be an external storage device of the server 1, such as a plug-in hard disk, smart media card (SMC), secure digital (SD) card or flash card fitted to the server 1. In the present embodiment, the readable storage medium of the memory 11 is generally used to store the operating system and the application software installed on the server 1, such as the program code of the identity verification system in an embodiment of the present invention. The memory 11 may also be used to temporarily store data that has been output or is to be output.
In some embodiments, the processor 12 may be a central processing unit (CPU), controller, microcontroller, microprocessor or other data processing chip. The processor 12 is generally used to control the overall operation of the server 1, for example to perform control and processing related to data interaction or communication with the terminal device 2. In the present embodiment, the processor 12 is used to run the program code or process the data stored in the memory 11, for example to run the identity verification system.
The network interface 13 may include a wireless network interface or a wired network interface and is generally used to establish communication connections between the server 1 and other electronic devices. In the present embodiment, the network interface 13 is mainly used to connect the server 1 to one or more terminal devices 2 and to establish data transmission channels and communication connections between them.
The identity verification system is stored in the memory 11 and includes at least one computer-readable instruction stored in the memory 11. The at least one computer-readable instruction can be executed by the processor 12 to implement the methods of the embodiments of the present application, and can be divided into different logic modules according to the functions implemented by its parts.
In one embodiment, the following steps are implemented when the above identity verification system is executed by the processor 12:
Step S1, after an identity verification request carrying an identity identifier is received from a client, randomly sending to the client a voice acquisition text for the user to respond to;
Here, the user operates the client, which sends an identity verification request carrying an identity identifier to the server; after receiving the request, the server randomly sends to the client a voice acquisition text for the user to respond to.
The identity identifier may be the user's ID card number, mobile phone number or the like. There are multiple voice acquisition texts, and the server sends one of them to the client at random in order to prevent other people from passing identity verification with an existing fake recording. The voice acquisition text may be the text of a random password to be recorded by voice, or the text of a question whose spoken answer serves as the random password. For example, if the voice acquisition text is "Please record the string of digits ***", the user responds by recording the speech "Please record the string of digits ***"; as another example, if the voice acquisition text is the question "Where is your birthplace?", the user responds by recording "My birthplace is ***".
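As a rough illustration of step S1, the sketch below picks one voice acquisition text at random and remembers which standard password characters the server expects back. The catalog contents, the `pending` dictionary and the function name are all assumptions made for the example; the source only states that several texts exist and that one is sent at random.

```python
import secrets

# Hypothetical catalog: voice acquisition text -> standard password characters.
PROMPT_CATALOG = {
    "Please read the digits 4 7 1 9 2 6": "471926",
    "Please read the digits 8 0 3 5 1 2": "803512",
    "Please read the digits 2 9 6 4 0 7": "296407",
}

def issue_challenge(identity: str, pending: dict) -> str:
    """Choose a prompt at random and record the expected characters for this identity."""
    # secrets.choice draws from a cryptographically strong source, so the
    # prompt is hard for an attacker holding a prepared recording to predict.
    prompt = secrets.choice(list(PROMPT_CATALOG))
    pending[identity] = PROMPT_CATALOG[prompt]
    return prompt
```

A call such as `issue_challenge("user-001", pending)` returns the text to show on the client and leaves the expected characters in `pending["user-001"]` for the later comparison in step S3.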
Step S2, receiving the password voice that the user speaks in response to the voice acquisition text and that the client sends, and performing character recognition on the password voice to identify the password characters corresponding to the password voice;
In the present embodiment, the user may record the password voice on the client as follows: following the voice acquisition text, the user presses a predetermined physical or virtual button, which causes the sound recording unit to start recording; when the user releases the button, recording stops and the recorded speech is sent to the server as the password voice.
When the password voice is recorded, interference from ambient noise and from the recording device should be avoided as far as possible. The recording device should be kept at a suitable distance from the user, and recording devices with high distortion should be avoided where possible; mains power is preferred, with the current kept stable; a sensor should be used when recording over the telephone.
After the server receives the password voice, it performs character recognition on it, that is, converts the password voice into individual characters. The password voice may be converted into characters directly, or it may first be denoised to further reduce interference. So that the voiceprint features of the password voice can be extracted, the recorded password voice is voice data of a preset data length, or voice data longer than the preset data length.
Step S3, if the password characters are consistent with the standard password characters corresponding to the voice acquisition text, constructing the current voiceprint feature vector of the password voice, determining the standard voiceprint feature vector corresponding to the user's identity identifier according to a predetermined mapping relationship between identity identifiers and standard voiceprint feature vectors, calculating the distance between the current voiceprint feature vector and the determined standard voiceprint feature vector using a predetermined distance formula, and verifying the user's identity according to the distance.
In the present embodiment, there are multiple voice acquisition texts, and multiple standard password characters are pre-stored on the server; the voice acquisition texts and the standard password characters are in one-to-one correspondence. After the password characters corresponding to the password voice have been recognized, the standard password characters corresponding to the voice acquisition text that was sent are obtained, and the recognized password characters are checked for consistency with them.
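The consistency check can be sketched as a plain string comparison. The normalization step (dropping whitespace and case) and the use of a constant-time comparison are assumptions added for the example; the source only requires that the recognized characters be consistent with the standard ones.

```python
import hmac

def normalize(chars: str) -> str:
    """Drop whitespace and case so that '47 19 26' and '471926' compare equal."""
    return "".join(chars.split()).lower()

def password_matches(recognized: str, standard: str) -> bool:
    """Compare recognized password characters with the stored standard characters."""
    a = normalize(recognized).encode("utf-8")
    b = normalize(standard).encode("utf-8")
    # compare_digest avoids leaking the position of the first mismatch via timing.
    return hmac.compare_digest(a, b)
```

Only when this check succeeds does the flow continue to the voiceprint comparison; a mismatch leads to the retry path described later.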
If the recognized password characters are consistent with the corresponding standard password characters, the current voiceprint feature vector of the password voice is constructed. Voiceprint features come in several types, such as wide-band voiceprints, narrow-band voiceprints and amplitude voiceprints; the voiceprint feature of the present embodiment is preferably the mel-frequency cepstral coefficients (MFCC) of the voice data. To construct the corresponding voiceprint feature vector, the voiceprint features of the password voice are assembled into a feature data matrix, and this feature data matrix is the voiceprint feature vector of the password voice.
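MFCC extraction relies on a filter bank spaced evenly on the mel scale. The sketch below shows only that mapping, using the widely used conversion mel = 2595 · log10(1 + f / 700); the full pipeline (framing, spectrum, filter bank energies, logarithm, discrete cosine transform) is not shown, and the filter count and frequency range in the usage note are assumptions rather than values from the source.

```python
import math

def hz_to_mel(freq_hz: float) -> float:
    """Convert a frequency in Hz to the mel scale."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)

def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_center_frequencies(num_filters: int, low_hz: float, high_hz: float) -> list:
    """Center frequencies (Hz) of filters spaced evenly on the mel scale."""
    low_mel, high_mel = hz_to_mel(low_hz), hz_to_mel(high_hz)
    step = (high_mel - low_mel) / (num_filters + 1)
    return [mel_to_hz(low_mel + step * (i + 1)) for i in range(num_filters)]
```

For instance, `mel_center_frequencies(26, 0.0, 8000.0)` yields 26 centers that crowd together at low frequencies and spread out toward 8 kHz, which is the property that makes the mel scale suit speech.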
There are several kinds of distance between vectors, including cosine distance and Euclidean distance. Preferably, the distance between the current voiceprint feature vector and the determined standard voiceprint feature vector in the present embodiment is the cosine distance, which uses the cosine of the angle between two vectors in a vector space as a measure of the difference between two individuals.
The standard voiceprint feature vector is a voiceprint feature vector stored in advance. Before the distance is calculated, the corresponding standard voiceprint feature vector is obtained according to the user's identity identifier.
When the calculated distance is less than or equal to the preset distance threshold, verification passes; otherwise, verification fails.
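The distance formula itself is not reproduced in this text. A minimal sketch, assuming the common definition of cosine distance as one minus the cosine of the angle between the vectors (chosen because the threshold rule treats a smaller value as a closer match), might look like this; the threshold value is purely illustrative.

```python
import math

def cosine_distance(x, y) -> float:
    """1 - cos(angle between x and y); 0 means same direction, 2 means opposite."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_x * norm_y)

def verify(current, standard, threshold: float = 0.3) -> bool:
    """Pass when the distance is at or below the (assumed) preset threshold."""
    return cosine_distance(current, standard) <= threshold
```

Because cosine distance ignores vector magnitude, two recordings of the same speaker at different volumes can still compare as close, which is one plausible reason to prefer it over Euclidean distance here.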
Compared with the prior art, if another person attempts identity verification with a fake recording that already exists or was prepared in advance, then, because the voice acquisition text is sent at random, the recognized password characters will not match the corresponding standard password characters, so such recordings are rejected. If another person records their own voice instead, the subsequent voiceprint feature check fails. The present embodiment therefore amounts to two rounds of identity verification, giving a double-verification effect, and improves the security of identity verification while maintaining the accuracy and efficiency of user identity verification.
In a preferred embodiment, to prevent the audio quality of the password voice from affecting the result of voiceprint feature verification, on the basis of the embodiment of Fig. 1 above, step S2 includes: receiving the password voice spoken by the user and sent by the client, and analyzing whether the password voice is usable; if the password voice is unusable, prompting the client to record the password voice again, or, if the password voice is usable, performing character recognition on the password voice.
Whether the password voice is usable is determined by analyzing whether the duration of the portion in which the user speaks exceeds a preset duration, whether the background noise volume of the password voice is below a first preset volume, and/or whether the speaking volume exceeds a second preset volume. If all of these checks are satisfied, the password voice is usable and subsequent operations such as character recognition can proceed. Conversely, if the speaking duration is less than the preset duration, or the background noise volume is greater than or equal to the first preset volume, or the speaking volume is less than or equal to the second preset volume, the password voice is unusable, and the client is prompted to record the password voice again.
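The usability check described above can be sketched as follows. This is only an illustration: the function name, the field names, and all three threshold values are assumptions, and the measurement of speaking duration and volumes is taken as already done elsewhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UsabilityThresholds:
    # All three values are hypothetical; the text only says they are preset.
    min_speech_seconds: float = 1.0  # preset duration
    max_noise_volume: float = 0.05   # first preset volume
    min_speech_volume: float = 0.2   # second preset volume


def is_password_voice_usable(speech_seconds, noise_volume, speech_volume,
                             thresholds=UsabilityThresholds()):
    """Return True only if every check passes; otherwise the client
    should be prompted to record the password voice again."""
    return (speech_seconds > thresholds.min_speech_seconds
            and noise_volume < thresholds.max_noise_volume
            and speech_volume > thresholds.min_speech_volume)
```

A recording that fails any single check is treated as unusable, which matches the "conversely" branch of the paragraph above.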
In a preferred embodiment, when the identity verification system is executed by the processor, the following steps are also implemented: if the password characters are inconsistent with the standard password characters corresponding to the voice acquisition text, a voice acquisition text for the user to respond to is again sent at random to the client; the number of voice acquisition texts sent to the client is accumulated, and if that number is greater than or equal to a preset number, the response to the identity verification request is terminated.
If the user records an incorrect password voice, that is, the password characters are inconsistent with the standard password characters corresponding to the voice acquisition text, the user can be given another chance by again sending a voice acquisition text at random to the client. At the same time, to prevent excessive password verification attempts from wasting computing resources, the number of attempts can be capped: the response to the identity verification request continues only while the accumulated number of voice acquisition texts sent to the client is below the preset number, and is terminated once that number is greater than or equal to the preset number.
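A minimal sketch of the retry limit, assuming a per-request session object; the class name, the `pick_text` callback, and the default limit of 3 are hypothetical and not taken from the text.

```python
class ChallengeSession:
    """Counts the voice acquisition texts sent for one verification request."""

    def __init__(self, max_challenges=3):  # preset number; value is an assumption
        self.max_challenges = max_challenges
        self.sent = 0
        self.terminated = False

    def issue_challenge(self, pick_text):
        """Return a new voice acquisition text, or None once the accumulated
        count has reached the preset number and the response is terminated."""
        if self.terminated or self.sent >= self.max_challenges:
            self.terminated = True
            return None
        self.sent += 1
        return pick_text()
```

Calling `issue_challenge` after each mismatch either hands out a fresh random text or signals that the request should no longer be served.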
In a preferred embodiment, on the basis of the above embodiment, the step of building the current voiceprint feature vector of the password voice in step S3 includes: processing the password voice with a predetermined filter to extract voiceprint features of a preset type, and building the voiceprint feature vector corresponding to the password voice from the extracted preset-type voiceprint features; and inputting the built voiceprint feature vector into a pre-trained background channel model to construct the current voiceprint feature vector.
The predetermined filter is preferably a Mel filter. First, pre-emphasis, framing, and windowing are applied to the password voice; in this embodiment, this processing is performed after the password voice of the user undergoing identity verification is received. Pre-emphasis is in effect high-pass filtering: it filters out low-frequency data so that the high-frequency characteristics of the password voice become more prominent. Specifically, the transfer function of the high-pass filter is $H(Z) = 1 - \alpha Z^{-1}$, where Z is the speech data and α is a constant factor, preferably 0.97. Because a speech signal is stationary only over short intervals, a segment of speech is divided into N short-time signals (that is, N frames), and, to avoid losing the continuity of the sound, adjacent frames share an overlapping region, generally 1/2 of the frame length. After framing, each frame is treated as a stationary signal. However, because of the Gibbs effect, and because the start frame and end frame of the password voice are discontinuous, the framed signal deviates further from the original speech; windowing therefore needs to be applied to the password voice.
A Fourier transform is applied to each windowed frame to obtain the corresponding spectrum;
The spectrum is input into the Mel filter to obtain the Mel spectrum;
Cepstral analysis is performed on the Mel spectrum to obtain the Mel frequency cepstrum coefficients (MFCC), and the corresponding voiceprint feature vector is composed from these MFCC. Cepstral analysis consists, for example, of taking the logarithm and applying an inverse transform; the inverse transform is generally implemented with the discrete cosine transform (DCT), and the 2nd through 13th coefficients after the DCT are taken as the MFCC. The MFCC are the voiceprint features of one frame of the password voice; the MFCC of every frame are assembled into a feature data matrix, and this feature data matrix is the voiceprint feature vector of the password voice.
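The first stage of the pipeline above (pre-emphasis, framing with half-frame overlap, windowing) might look like the sketch below. The text does not name a window function, so the Hamming window is an assumption, as are the function names; only α = 0.97 and the 1/2 overlap come from the description.

```python
import math


def pre_emphasize(samples, alpha=0.97):
    """Time-domain form of H(Z) = 1 - alpha * Z^-1: y[n] = x[n] - alpha * x[n-1]."""
    if not samples:
        return []
    return [samples[0]] + [samples[n] - alpha * samples[n - 1]
                           for n in range(1, len(samples))]


def split_frames(samples, frame_len, hop=None):
    """Split into frames; by default adjacent frames overlap by half a frame."""
    hop = hop or max(1, frame_len // 2)
    return [samples[start:start + frame_len]
            for start in range(0, len(samples) - frame_len + 1, hop)]


def hamming(frame):
    """Apply a Hamming window (an assumed choice) to one frame."""
    n = len(frame)
    if n == 1:
        return list(frame)
    return [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
            for i, x in enumerate(frame)]
```

Each windowed frame would then go through the Fourier transform, the Mel filter, and cepstral analysis as described.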
Then, the voiceprint feature vector is input into the background channel model generated by prior training. Preferably, the background channel model is a Gaussian mixture model; the voiceprint feature vector is processed with the background channel model to obtain the corresponding current voiceprint feature vector (that is, the i-vector).
Specifically, the calculation process includes:
1) Select Gaussian models: first, the parameters of the universal background channel model are used to compute the log-likelihood of each frame of data under the different Gaussian models; each column of the log-likelihood matrix is sorted in parallel and the top N Gaussian models are selected, finally yielding a matrix of the values of each frame of data in the Gaussian mixture model:
$$\text{Loglike} = E(X)\, D(X)^{-1}\, X^{T} - 0.5\, D(X)^{-1}\, (X^{.2})^{T}$$
where Loglike is the log-likelihood matrix, E(X) is the mean matrix trained by the universal background channel model, D(X) is the covariance matrix, X is the data matrix, and $X^{.2}$ is the matrix with each value squared.
2) Calculate posterior probabilities: $X X^{T}$ is computed for each frame of data X, giving a symmetric matrix that can be reduced to a lower triangular matrix; its elements are arranged in order into a single row, so that the computation operates on N frames of vectors whose dimension is the number of elements of the lower triangular matrix, and the vectors of all frames are combined into a new data matrix. The covariance matrices used to compute probabilities in the universal background model are likewise each reduced to a lower triangular matrix, giving matrices of the same form as the new data matrix. The log-likelihood of each frame of data under its selected Gaussian models is then computed from the mean matrix and covariance matrix of the universal background channel model, Softmax regression is applied, and finally normalization yields the posterior probability distribution of each frame over the Gaussian mixture model; the probability distribution vectors of all frames form the probability matrix.
3) Extract the current voiceprint feature vector: first the first-order and second-order coefficients are calculated. The first-order coefficients can be obtained by summing the probability matrix over its rows:
$$\text{Gamma}_i = \sum_{j} \text{loglikes}_{ji}$$
where $\text{Gamma}_i$ is the i-th element of the first-order coefficient vector and $\text{loglikes}_{ji}$ is the element in row j, column i of the probability matrix.
The second-order coefficients can be obtained by multiplying the transpose of the probability matrix by the data matrix:
$$X = \text{Loglike}^{T} \cdot \text{feats}$$
where X is the second-order coefficient matrix, Loglike is the probability matrix, and feats is the feature data matrix.
After the first-order and second-order coefficients are obtained, the first-order term and the quadratic term are computed in parallel, and the current voiceprint feature vector is then calculated from the first-order term and the quadratic term.
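Steps 2) and 3) can be illustrated with a small sketch that turns per-frame log-likelihoods into posteriors and then accumulates the two coefficient sets. The function names are hypothetical, plain nested lists stand in for matrices, and the lower-triangular packing and the final i-vector computation are not shown. The naming follows this description; i-vector literature usually calls these two quantities zeroth-order and first-order statistics.

```python
import math


def softmax(row):
    """Numerically stable softmax over one frame's log-likelihoods."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def posteriors(loglikes):
    """loglikes[j][i] is the log-likelihood of frame j under Gaussian i."""
    return [softmax(row) for row in loglikes]


def first_order(prob):
    """Gamma_i: sum over frames j of prob[j][i]."""
    return [sum(column) for column in zip(*prob)]


def second_order(prob, feats):
    """Transpose of the probability matrix times the feature data matrix."""
    n_frames, n_gauss, dim = len(prob), len(prob[0]), len(feats[0])
    return [[sum(prob[j][i] * feats[j][d] for j in range(n_frames))
             for d in range(dim)]
            for i in range(n_gauss)]
```

With two frames that are equally likely under two Gaussians, every posterior is 0.5, so each Gaussian receives a first-order coefficient of 1.0 and half of each frame's features.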
In a preferred embodiment, on the basis of the above embodiment, the step in step S3 of calculating the distance between the current voiceprint feature vector and the determined standard voiceprint feature vector with a predetermined distance formula, and verifying the user's identity according to that distance, includes: calculating the cosine distance between the current voiceprint feature vector and the determined standard voiceprint feature vector, the two operands of the formula being the standard voiceprint feature vector and the current voiceprint feature vector.
If the cosine distance is less than or equal to the preset distance threshold, identity verification passes; if the cosine distance is greater than the preset distance threshold, identity verification fails.
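The threshold comparison can be sketched as below. The exact formula is not reproduced in this text, so the sketch assumes the common definition of cosine distance as one minus the cosine of the angle between the vectors, which is consistent with smaller values passing; the function names and the threshold of 0.3 are hypothetical.

```python
import math


def cosine_distance(a, b):
    """Assumed definition: 1 - cos(angle between a and b); 0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / norm


def verify(current, standard, threshold=0.3):  # threshold value is an assumption
    """Pass when the distance is less than or equal to the preset threshold."""
    return cosine_distance(current, standard) <= threshold
```

Because the measure depends only on the angle, scaling either vector does not change the outcome.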
The present invention also provides another server whose hardware architecture is similar to that of the server of Fig. 1 above: it includes a memory and a processor connected to the memory, and is connected to external terminal devices through a network interface. The difference is that the memory stores a voiceprint-recognition-based identity verification system that can run on the processor, and when executed by the processor this system implements the following steps:
S101, after the speech data of a user undergoing identity verification is received, obtaining the voiceprint features of the speech data and building the corresponding voiceprint feature vector from those voiceprint features;
In this embodiment, the speech data is collected by a voice capture device (for example, a microphone), which sends the collected speech data to the voiceprint-recognition-based identity verification system.
When speech data is collected, interference from ambient noise and from the voice capture device should be avoided as far as possible. The voice capture device should be kept at a suitable distance from the user, devices with high distortion should be avoided where possible, and the power supply is preferably mains power with a stable current; a sensor should be used when recording over the telephone. Before the voiceprint features are extracted, the speech data can be denoised to further reduce interference. So that the voiceprint features of the speech data can be extracted, the collected speech data has a preset data length or exceeds a preset data length.
Voiceprint features come in several types, such as wideband voiceprints, narrowband voiceprints, and amplitude voiceprints; in this embodiment the voiceprint feature is preferably the Mel Frequency Cepstrum Coefficient (MFCC) of the speech data. To build the corresponding voiceprint feature vector, the voiceprint features of the speech data are assembled into a feature data matrix, and this feature data matrix is the voiceprint feature vector of the speech data.
S102, inputting the voiceprint feature vector into the background channel model generated by prior training, to construct the current voiceprint discrimination vector corresponding to the speech data;
Here the voiceprint feature vector is input into the background channel model generated by prior training. Preferably, the background channel model is a Gaussian mixture model; the voiceprint feature vector is processed with the background channel model to obtain the corresponding current voiceprint discrimination vector (that is, the i-vector).
Specifically, the calculation process includes:
1) Select Gaussian models: first, the parameters of the universal background channel model are used to compute the log-likelihood of each frame of data under the different Gaussian models; each column of the log-likelihood matrix is sorted in parallel and the top N Gaussian models are selected, finally yielding a matrix of the values of each frame of data in the Gaussian mixture model:
$$\text{Loglike} = E(X)\, D(X)^{-1}\, X^{T} - 0.5\, D(X)^{-1}\, (X^{.2})^{T}$$
where Loglike is the log-likelihood matrix, E(X) is the mean matrix trained by the universal background channel model, D(X) is the covariance matrix, X is the data matrix, and $X^{.2}$ is the matrix with each value squared.
2) Calculate posterior probabilities: $X X^{T}$ is computed for each frame of data X, giving a symmetric matrix that can be reduced to a lower triangular matrix; its elements are arranged in order into a single row, so that the computation operates on N frames of vectors whose dimension is the number of elements of the lower triangular matrix, and the vectors of all frames are combined into a new data matrix. The covariance matrices used to compute probabilities in the universal background model are likewise each reduced to a lower triangular matrix, giving matrices of the same form as the new data matrix. The log-likelihood of each frame of data under its selected Gaussian models is then computed from the mean matrix and covariance matrix of the universal background channel model, Softmax regression is applied, and finally normalization yields the posterior probability distribution of each frame over the Gaussian mixture model; the probability distribution vectors of all frames form the probability matrix.
3) Extract the current voiceprint discrimination vector: first the first-order and second-order coefficients are calculated. The first-order coefficients can be obtained by summing the probability matrix over its rows:
$$\text{Gamma}_i = \sum_{j} \text{loglikes}_{ji}$$
where $\text{Gamma}_i$ is the i-th element of the first-order coefficient vector and $\text{loglikes}_{ji}$ is the element in row j, column i of the probability matrix.
The second-order coefficients can be obtained by multiplying the transpose of the probability matrix by the data matrix:
$$X = \text{Loglike}^{T} \cdot \text{feats}$$
where X is the second-order coefficient matrix, Loglike is the probability matrix, and feats is the feature data matrix.
After the first-order and second-order coefficients are obtained, the first-order term and the quadratic term are computed in parallel, and the current voiceprint discrimination vector is then calculated from the first-order term and the quadratic term.
Preferably, the background channel model is a Gaussian mixture model, and the following is performed before step S1 above:
obtaining a preset number of speech data samples, obtaining the voiceprint features corresponding to each speech data sample, and building the voiceprint feature vector corresponding to each speech data sample from those voiceprint features;
dividing the voiceprint feature vectors corresponding to the speech data samples into a training set of a first proportion and a validation set of a second proportion, the sum of the first proportion and the second proportion being less than or equal to 1;
training the Gaussian mixture model with the voiceprint feature vectors in the training set and, after training is complete, verifying the accuracy of the trained Gaussian mixture model with the validation set;
if the accuracy is greater than a preset threshold, model training ends and the trained Gaussian mixture model is used as the background channel model of step S2; or, if the accuracy is less than or equal to the preset threshold, the number of speech data samples is increased and training is repeated with the enlarged set of speech data samples.
When the Gaussian mixture model is trained with the voiceprint feature vectors in the training set, the likelihood of an extracted D-dimensional voiceprint feature can be expressed with K Gaussian components as:
$$P(x) = \sum_{k=1}^{K} w_k\, p(x \mid k)$$
where P(x) is the probability that a speech data sample is generated by the Gaussian mixture model, $w_k$ is the weight of each Gaussian model, $p(x \mid k)$ is the probability that the sample is generated by the k-th Gaussian model, and K is the number of Gaussian models.
The parameters of the whole Gaussian mixture model can be expressed as $\{w_i, \mu_i, \Sigma_i\}$, where $w_i$ is the weight of the i-th Gaussian model, $\mu_i$ is the mean of the i-th Gaussian model, and $\Sigma_i$ is the covariance of the i-th Gaussian model. The Gaussian mixture model can be trained with the unsupervised EM algorithm. After training, the weight vector, constant vector, N covariance matrices, the matrix of means multiplied by covariances, and so on of the Gaussian mixture model are obtained, which together constitute a trained Gaussian mixture model.
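As an illustration of the mixture likelihood, the sketch below evaluates a weighted sum of K Gaussian densities for one D-dimensional feature. It assumes diagonal covariances for simplicity, which the text does not state, and the function names are hypothetical; EM training itself is not shown.

```python
import math


def diag_gaussian_pdf(x, mean, var):
    """Density of a D-dimensional Gaussian with diagonal covariance (assumed)."""
    log_p = 0.0
    for xi, mi, vi in zip(x, mean, var):
        log_p += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
    return math.exp(log_p)


def gmm_pdf(x, weights, means, variances):
    """P(x): sum over components k of w_k * p(x | k)."""
    return sum(w * diag_gaussian_pdf(x, m, v)
               for w, m, v in zip(weights, means, variances))
```

For a single standard-normal component the result reduces to the ordinary Gaussian density, which gives a quick sanity check on the weights and variances.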
S103, calculating the spatial distance between the current voiceprint discrimination vector and the pre-stored standard voiceprint discrimination vector of the user, verifying the user's identity based on that distance, and generating a verification result.
Several measures of the distance between two vectors exist, including cosine distance and Euclidean distance. Preferably, the spatial distance in this embodiment is the cosine distance, which uses the cosine of the angle between two vectors in a vector space as a measure of the difference between two individuals.
The standard voiceprint discrimination vector is a voiceprint discrimination vector obtained and stored in advance. It is stored together with the identification information of its corresponding user and can accurately represent that user's identity. Before the spatial distance is calculated, the stored voiceprint discrimination vector is retrieved according to the identification information provided by the user.
When the calculated spatial distance is less than or equal to the preset distance threshold, verification passes; otherwise, verification fails.
Compared with the prior art, the background channel model generated by prior training in this embodiment is obtained by mining and comparative training on a large amount of speech data. While retaining the user's voiceprint features to the greatest extent, this model accurately characterizes the background voiceprint features present when the user speaks and can remove them during recognition, extracting the intrinsic features of the user's voice. This can significantly improve the accuracy of user identity verification and improve its efficiency. In addition, this embodiment makes full use of the vocal-tract-related voiceprint features in speech; such voiceprint features place no restriction on the text, which gives greater flexibility in recognition and verification.
As shown in Fig. 2, which is a schematic flowchart of an embodiment of the identity verification method of the present invention, the identity verification method includes the following steps:
Step S1, after an identity verification request carrying an identity identifier is received from the client, sending at random to the client a voice acquisition text for the user to respond to;
The user operates on the client, which sends an identity verification request carrying an identity identifier to the server; after receiving the request, the server sends at random to the client a voice acquisition text for the user to respond to.
The identity identifier can be the user's ID card number, mobile phone number, or the like. There are multiple voice acquisition texts for the user to respond to, and the server sends one of them to the client at random in order to prevent others from verifying with an existing forged recording. The voice acquisition text can be the text corresponding to a random password that must be recorded by voice, or the text of a question about the random password that must be recorded by voice. For example, if the voice acquisition text is "Please record the digit string ***", the user responds by recording the speech "Please record the digit string ***"; as another example, if the voice acquisition text is the question "Where is your birthplace?", the user responds by recording "My birthplace is ***".
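One way to produce a random digit-string text of the kind shown in the example is sketched below. The function name, the prompt wording, and the six-digit length are assumptions; the text only requires that the text sent be chosen at random.

```python
import secrets


def make_digit_challenge(length=6):
    """Return (voice acquisition text, standard password characters).

    Uses the secrets module so that the digits are hard to predict.
    """
    digits = "".join(secrets.choice("0123456789") for _ in range(length))
    return f"Please record the digit string {digits}", digits
```

The server would send the first element to the client and keep the second as the standard password characters for the later comparison.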
Step S2, receiving the password voice, spoken by the user on the basis of the voice acquisition text and sent by the client, performing character recognition on the password voice, and identifying the password characters corresponding to the password voice;
In this embodiment, the user can record the password voice on the client as follows: following the voice acquisition text, the user presses a predetermined physical or virtual button, which causes the sound recording unit to start recording; when the user releases the button, recording stops, and the recorded speech is sent to the server as the password voice.
When the password voice is recorded, interference from ambient noise and from the recording equipment should be avoided as far as possible. The recording equipment should be kept at a suitable distance from the user, equipment with high distortion should be avoided where possible, and the power supply is preferably mains power with a stable current; a sensor should be used when recording over the telephone.
After receiving the password voice, the server performs character recognition on it, that is, converts the password voice into individual characters. The password voice can be converted into characters directly, or it can first be denoised to further reduce interference. So that the voiceprint features of the password voice can be extracted, the recorded password voice has a preset data length or exceeds a preset data length.
Step S3, if the password characters are consistent with the standard password characters corresponding to the voice acquisition text, building the current voiceprint feature vector of the password voice, determining the standard voiceprint feature vector corresponding to the user's identity identifier according to a predetermined mapping between identity identifiers and standard voiceprint feature vectors, calculating the distance between the current voiceprint feature vector and the determined standard voiceprint feature vector with a predetermined distance formula, and verifying the user's identity according to that distance.
In this embodiment there are multiple voice acquisition texts, and multiple standard password characters are pre-stored on the server; the voice acquisition texts correspond one-to-one with the standard password characters. After the password characters corresponding to the password voice are recognized, the standard password characters corresponding to the voice acquisition text that was sent are obtained, and it is judged whether the recognized password characters are consistent with the corresponding standard password characters.
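The one-to-one lookup and consistency check might be sketched as follows. The in-memory dictionary, the challenge identifier, and the normalization (dropping spaces and punctuation, ignoring case) are all assumptions made for illustration; the text does not say how recognized characters are compared.

```python
# Hypothetical store: one standard password per voice acquisition text.
STANDARD_PASSWORDS = {"challenge-1": "482913"}


def normalize(chars):
    """Drop spaces and punctuation and ignore case before comparing."""
    return "".join(ch for ch in chars if ch.isalnum()).lower()


def check_password(challenge_id, recognized):
    """True when the recognized characters match the stored standard ones."""
    standard = STANDARD_PASSWORDS.get(challenge_id)
    return standard is not None and normalize(recognized) == normalize(standard)
```

Only when this check succeeds would the voiceprint feature vector be built and compared.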
If the recognized password characters are consistent with the corresponding standard password characters, the current voiceprint feature vector of the password voice is then built. Voiceprint features come in several types, such as wideband voiceprints, narrowband voiceprints, and amplitude voiceprints; in this embodiment the voiceprint feature is preferably the Mel Frequency Cepstrum Coefficient (MFCC) of the speech data. To build the corresponding voiceprint feature vector, the voiceprint features of the password voice are assembled into a feature data matrix, and this feature data matrix is the voiceprint feature vector of the password voice.
Several measures of the distance between two vectors exist, including cosine distance and Euclidean distance. Preferably, in this embodiment the distance between the current voiceprint feature vector and the determined standard voiceprint feature vector is the cosine distance, which uses the cosine of the angle between two vectors in a vector space as a measure of the difference between two individuals.
The standard voiceprint feature vector is a voiceprint feature vector stored in advance. Before the distance is calculated, the corresponding standard voiceprint feature vector is obtained according to the user's identity identifier.
When the calculated distance is less than or equal to the preset distance threshold, verification passes; otherwise, verification fails.
In a preferred embodiment, to prevent the audio quality of the password voice from affecting the result of voiceprint feature verification, on the basis of the embodiment of Fig. 2 above, step S2 includes: receiving the password voice spoken by the user and sent by the client, and analyzing whether the password voice is usable; if the password voice is unusable, prompting the client to record the password voice again, or, if the password voice is usable, performing character recognition on the password voice.
Whether the password voice is usable is determined by analyzing whether the duration of the portion in which the user speaks exceeds a preset duration, whether the background noise volume of the password voice is below a first preset volume, and/or whether the speaking volume exceeds a second preset volume. If all of these checks are satisfied, the password voice is usable and subsequent operations such as character recognition can proceed. Conversely, if the speaking duration is less than the preset duration, or the background noise volume is greater than or equal to the first preset volume, or the speaking volume is less than or equal to the second preset volume, the password voice is unusable, and the client is prompted to record the password voice again.
In a preferred embodiment, on the basis of the embodiment of Fig. 2 above, the identity verification method further includes the following steps: if the password characters are inconsistent with the standard password characters corresponding to the voice acquisition text, a voice acquisition text for the user to respond to is again sent at random to the client; the number of voice acquisition texts sent to the client is accumulated, and if that number is greater than or equal to a preset number, the response to the identity verification request is terminated.
If the user records an incorrect password voice, that is, the password characters are inconsistent with the standard password characters corresponding to the voice acquisition text, the user can be given another chance by again sending a voice acquisition text at random to the client. At the same time, to prevent excessive password verification attempts from wasting computing resources, the number of attempts can be capped: the response to the identity verification request continues only while the accumulated number of voice acquisition texts sent to the client is below the preset number, and is terminated once that number is greater than or equal to the preset number.
In a preferred embodiment, on the basis of the above embodiment, the step of building the current voiceprint feature vector of the password voice in step S3 includes: processing the password voice with a predetermined filter to extract voiceprint features of a preset type, and building the voiceprint feature vector corresponding to the password voice from the extracted preset-type voiceprint features; and inputting the built voiceprint feature vector into a pre-trained background channel model to construct the current voiceprint feature vector.
The predetermined filter is preferably a Mel filter. First, pre-emphasis, framing, and windowing are applied to the password voice; in this embodiment, this processing is performed after the password voice of the user undergoing identity verification is received. Pre-emphasis is in effect high-pass filtering: it filters out low-frequency data so that the high-frequency characteristics of the password voice become more prominent. Specifically, the transfer function of the high-pass filter is $H(Z) = 1 - \alpha Z^{-1}$, where Z is the speech data and α is a constant factor, preferably 0.97. Because a speech signal is stationary only over short intervals, a segment of speech is divided into N short-time signals (that is, N frames), and, to avoid losing the continuity of the sound, adjacent frames share an overlapping region, generally 1/2 of the frame length. After framing, each frame is treated as a stationary signal. However, because of the Gibbs effect, and because the start frame and end frame of the password voice are discontinuous, the framed signal deviates further from the original speech; windowing therefore needs to be applied to the password voice.
A Fourier transform is applied to each windowed frame to obtain the corresponding spectrum;
The spectrum is input into the Mel filter to obtain the Mel spectrum;
Cepstral analysis is performed on the Mel spectrum to obtain the Mel frequency cepstrum coefficients (MFCC), and the corresponding voiceprint feature vector is composed from these MFCC. Cepstral analysis consists, for example, of taking the logarithm and applying an inverse transform; the inverse transform is generally implemented with the discrete cosine transform (DCT), and the 2nd through 13th coefficients after the DCT are taken as the MFCC. The MFCC are the voiceprint features of one frame of the password voice; the MFCC of every frame are assembled into a feature data matrix, and this feature data matrix is the voiceprint feature vector of the password voice.
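The cepstral analysis step (logarithm, DCT, keep the 2nd through 13th coefficients) can be sketched for a single frame as follows. The unnormalized DCT-II, the small floor applied before the logarithm, and the function names are assumptions; the Mel band energies are taken as already computed by the Mel filter.

```python
import math


def dct_ii(values):
    """Unnormalized DCT-II of a sequence."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
                for i, v in enumerate(values))
            for k in range(n)]


def mfcc_from_mel_energies(mel_energies, first=2, last=13):
    """Log of each Mel band energy, then DCT, then keep coefficients
    first..last counted from 1, as in the description."""
    log_energies = [math.log(max(e, 1e-10)) for e in mel_energies]  # floor avoids log(0)
    cepstrum = dct_ii(log_energies)
    return cepstrum[first - 1:last]
```

Stacking the 12 coefficients returned for each frame row by row gives the feature data matrix that serves as the voiceprint feature vector.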
Then, the voiceprint feature vector is input into the background channel model generated by prior training. Preferably, the background channel model is a Gaussian mixture model; the voiceprint feature vector is processed with the background channel model to obtain the corresponding current voiceprint feature vector (that is, the i-vector).
Specifically, the calculation process includes:
1) Select Gaussian models: first, the parameters of the universal background channel model are used to compute the log-likelihood of each frame of data under the different Gaussian models; each column of the log-likelihood matrix is sorted in parallel and the top N Gaussian models are selected, finally yielding a matrix of the values of each frame of data in the Gaussian mixture model:
$$\text{Loglike} = E(X)\, D(X)^{-1}\, X^{T} - 0.5\, D(X)^{-1}\, (X^{.2})^{T}$$
where Loglike is the log-likelihood matrix, E(X) is the mean matrix trained by the universal background channel model, D(X) is the covariance matrix, X is the data matrix, and $X^{.2}$ is the matrix with each value squared.
2) Calculate posterior probabilities: $X X^{T}$ is computed for each frame of data X, giving a symmetric matrix that can be reduced to a lower triangular matrix; its elements are arranged in order into a single row, so that the computation operates on N frames of vectors whose dimension is the number of elements of the lower triangular matrix, and the vectors of all frames are combined into a new data matrix. The covariance matrices used to compute probabilities in the universal background model are likewise each reduced to a lower triangular matrix, giving matrices of the same form as the new data matrix. The log-likelihood of each frame of data under its selected Gaussian models is then computed from the mean matrix and covariance matrix of the universal background channel model, Softmax regression is applied, and finally normalization yields the posterior probability distribution of each frame over the Gaussian mixture model; the probability distribution vectors of all frames form the probability matrix.
3) Extraction of the current voiceprint feature vector: First, the first-order and second-order coefficients are calculated. The first-order coefficients can be obtained by summing the probability matrix over its rows:
$\mathrm{Gamma}_i = \sum_{j} \mathrm{loglikes}_{ji}$,
where $\mathrm{Gamma}_i$ is the i-th element of the first-order coefficient vector and $\mathrm{loglikes}_{ji}$ is the element in row j, column i of the probability matrix.
The second-order coefficients can be obtained by multiplying the transpose of the probability matrix by the data matrix:
$X = \mathrm{Loglike}^{T} \cdot \mathrm{feats}$, where X is the second-order coefficient matrix, Loglike is the probability matrix, and feats is the feature data matrix.
After the first-order and second-order coefficients have been calculated, the first-order term and the quadratic term are computed in parallel, and the current voiceprint feature vector is then calculated from the first-order term and the quadratic term.
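The two coefficient computations can be sketched as follows, using a small probability matrix (frames by Gaussians) and feature matrix (frames by dimensions). The function names and toy values are illustrative assumptions. The final mapping from these coefficients to the i-vector is not sketched, because the text does not give its formula.

```python
def first_order_coefficients(prob):
    """Gamma_i = sum over frames j of prob[j][i]."""
    num_gauss = len(prob[0])
    return [sum(row[i] for row in prob) for i in range(num_gauss)]


def second_order_coefficients(prob, feats):
    """X = prob^T * feats, a (Gaussians x feature-dimension) matrix."""
    num_gauss = len(prob[0])
    dim = len(feats[0])
    num_frames = len(prob)
    return [
        [sum(prob[j][i] * feats[j][d] for j in range(num_frames))
         for d in range(dim)]
        for i in range(num_gauss)
    ]


# Three frames, two Gaussians, two feature dimensions (illustrative values).
prob = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
feats = [[2.0, 4.0], [6.0, 8.0], [10.0, 12.0]]
gamma = first_order_coefficients(prob)
x_stats = second_order_coefficients(prob, feats)
```

Because every Gaussian's entry depends only on its own column of the probability matrix, both computations can be run in parallel per Gaussian.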
In a preferred embodiment, on the basis of the above embodiment, the step in S3 of calculating the distance between the current voiceprint feature vector and the determined standard voiceprint feature vector using a predetermined distance formula, and verifying the user's identity according to the distance, includes: calculating the cosine distance between the current voiceprint discriminant vector and the determined standard voiceprint feature vector,
the two operands of the cosine distance being the standard voiceprint feature vector and the current voiceprint feature vector. If the cosine distance is less than or equal to a preset distance threshold, identity verification passes; if the cosine distance is greater than the preset distance threshold, identity verification fails.
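A minimal sketch of this decision rule follows. The exact cosine distance formula is not legible in this text, so the sketch assumes the common convention of one minus the cosine similarity, which is consistent with a smaller distance meaning a closer match. The names `cosine_distance` and `verify` and the threshold value are hypothetical.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cos(angle between a and b); 0 means the same direction.

    Assumed convention: the source formula is not reproduced in the text.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def verify(current, standard, threshold):
    """Pass when the distance is less than or equal to the preset threshold."""
    return cosine_distance(current, standard) <= threshold


same_direction = verify([1.0, 0.0], [2.0, 0.0], 0.2)
orthogonal = verify([1.0, 0.0], [0.0, 1.0], 0.2)
```

Because the distance depends only on direction, scaling either vector does not change the result. The threshold would have to be tuned on real enrollment and verification data.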
In a preferred embodiment, on the basis of the above embodiment, the background channel model is a Gaussian mixture model, and training the background channel model includes:
Obtaining a preset number of speech data samples, obtaining the voiceprint features corresponding to each speech data sample, and building the voiceprint feature vector corresponding to each speech data sample from those voiceprint features;
Dividing the voiceprint feature vectors corresponding to the speech data samples into a training set of a first proportion and a validation set of a second proportion, the sum of the first proportion and the second proportion being less than or equal to 1;
Training the Gaussian mixture model with the voiceprint feature vectors in the training set and, after training is complete, verifying the accuracy of the trained Gaussian mixture model with the validation set;
If the accuracy is greater than a preset threshold, model training ends and the trained Gaussian mixture model is used as the background channel model to be applied; or, if the accuracy is less than or equal to the preset threshold, the number of speech data samples is increased and training is performed again on the enlarged set of speech data samples.
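The training procedure above can be sketched as a split followed by a retrain loop. This is only an outline under stated assumptions: `load_vectors`, `fit`, and `accuracy` are hypothetical callables supplied by the caller, and the default proportions, starting sample count, step size, and round limit are illustrative values not taken from the disclosure.

```python
import random


def split_samples(vectors, first_ratio, second_ratio, seed=0):
    """Shuffle, then split into a training set and a validation set."""
    if first_ratio + second_ratio > 1.0:
        raise ValueError("the two proportions must sum to at most 1")
    shuffled = list(vectors)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * first_ratio)
    n_val = int(len(shuffled) * second_ratio)
    return shuffled[:n_train], shuffled[n_train:n_train + n_val]


def train_background_model(load_vectors, fit, accuracy, threshold,
                           first_ratio=0.75, second_ratio=0.25,
                           start=100, step=100, max_rounds=10):
    """Retrain on more samples until validation accuracy exceeds threshold."""
    count = start
    for _ in range(max_rounds):
        train_set, val_set = split_samples(
            load_vectors(count), first_ratio, second_ratio)
        model = fit(train_set)
        if accuracy(model, val_set) > threshold:
            return model, count
        count += step  # accuracy too low: enlarge the sample set
    raise RuntimeError("accuracy threshold not reached")


# Stand-in callables so the control flow can be exercised without real audio.
model, used = train_background_model(
    load_vectors=lambda n: list(range(n)),
    fit=len,
    accuracy=lambda m, val: m / 300,
    threshold=0.6,
)
```

The round limit is a safeguard added in this sketch. The text itself only states that the sample count is increased and training repeated.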
When the Gaussian mixture model is trained with the voiceprint feature vectors in the training set, the likelihood corresponding to an extracted D-dimensional voiceprint feature can be expressed with K Gaussian components as:
$P(x) = \sum_{k=1}^{K} w_k\, p(x \mid k)$,
where P(x) is the probability that the speech data sample is generated by the Gaussian mixture model, $w_k$ is the weight of the k-th Gaussian model, $p(x \mid k)$ is the probability that the sample is generated by the k-th Gaussian model, and K is the number of Gaussian models.
The parameters of the whole Gaussian mixture model can be expressed as $\{w_i, \mu_i, \Sigma_i\}$, where $w_i$ is the weight of the i-th Gaussian model, $\mu_i$ is the mean of the i-th Gaussian model, and $\Sigma_i$ is the covariance of the i-th Gaussian model. The Gaussian mixture model can be trained with the unsupervised EM algorithm. After training is complete, the weight vector, the constant vector, the N covariance matrices, the matrix of means multiplied by covariances, and so on are obtained; together these constitute the trained Gaussian mixture model.
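The mixture likelihood and one EM iteration can be sketched as follows. For brevity the sketch is one-dimensional, whereas the text describes D-dimensional features. The variance floor and the toy data are assumptions added for illustration, and the function names are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def mixture_likelihood(x, weights, means, variances):
    """P(x) = sum_k w_k * p(x | k)."""
    return sum(w * gaussian_pdf(x, m, v)
               for w, m, v in zip(weights, means, variances))


def em_step(data, weights, means, variances, var_floor=1e-6):
    """One EM iteration for a 1-D Gaussian mixture."""
    k = len(weights)
    # E-step: responsibility of each component for each sample.
    resp = []
    for x in data:
        joint = [w * gaussian_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
        total = sum(joint)
        resp.append([j / total for j in joint])
    # M-step: re-estimate weights, means and variances.
    new_w, new_m, new_v = [], [], []
    for i in range(k):
        n_i = sum(r[i] for r in resp)
        mean = sum(r[i] * x for r, x in zip(resp, data)) / n_i
        var = sum(r[i] * (x - mean) ** 2 for r, x in zip(resp, data)) / n_i
        new_w.append(n_i / len(data))
        new_m.append(mean)
        new_v.append(max(var, var_floor))  # floor avoids collapsing variances
    return new_w, new_m, new_v


# Two well-separated clusters (illustrative values).
data = [-2.1, -1.9, -2.0, 2.0, 2.2, 1.8]
params = ([0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])
for _ in range(20):
    params = em_step(data, *params)
weights, means, variances = params
```

No labels are used anywhere in `em_step`, which is the sense in which the EM training is unsupervised.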
The present invention also provides a computer-readable storage medium on which an identity verification system is stored; when executed by a processor, the identity verification system implements the steps of the identity verification method described above.
The above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.
From the description of the above embodiments, those skilled in the art will clearly understand that the methods of the above embodiments can be implemented by software together with the necessary general-purpose hardware platform, and of course also by hardware, although in many cases the former is the better implementation. On this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes a number of instructions that cause a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to perform the methods described in the embodiments of the present invention.
The above are only preferred embodiments of the present invention and do not thereby limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the description and drawings of the present invention, or any direct or indirect use in other related technical fields, is likewise included within the scope of patent protection of the present invention.

Claims (11)

1. A server, characterized in that the server comprises a memory and a processor connected to the memory, the memory storing an identity verification system executable on the processor, the identity verification system implementing the following steps when executed by the processor:
S1, after receiving an identity verification request carrying an identity identifier sent by a client, randomly sending to the client a voice acquisition text for the user to respond to;
S2, receiving the password speech, sent by the client, that the user has read aloud based on the voice acquisition text, and performing character recognition on the password speech to identify the password characters corresponding to the password speech;
S3, if the password characters are consistent with the standard password characters corresponding to the voice acquisition text, building the current voiceprint feature vector of the password speech, determining the standard voiceprint feature vector corresponding to the user's identity identifier according to a predetermined mapping between identity identifiers and standard voiceprint feature vectors, calculating the distance between the current voiceprint feature vector and the determined standard voiceprint feature vector using a predetermined distance formula, and verifying the user's identity according to the distance.
2. The server according to claim 1, characterized in that step S2 comprises:
receiving the password speech read aloud by the user and sent by the client, and analyzing whether the password speech is usable; if the password speech is unusable, prompting the client to record the password speech again; or, if the password speech is usable, performing character recognition on the password speech.
3. The server according to claim 1 or 2, characterized in that the identity verification system, when executed by the processor, further implements the following steps:
if the password characters are inconsistent with the standard password characters corresponding to the voice acquisition text, again randomly sending to the client a voice acquisition text for the user to respond to;
counting the number of times a voice acquisition text has been sent to the client and, if that number is greater than or equal to a preset number, terminating the response to the identity verification request.
4. The server according to claim 1 or 2, characterized in that the step of building the current voiceprint feature vector of the password speech comprises:
processing the password speech with a predetermined filter to extract voiceprint features of a preset type, and building the voiceprint feature vector corresponding to the password speech from the extracted voiceprint features of the preset type;
inputting the built voiceprint feature vector into the pre-trained background channel model to construct the current voiceprint feature vector;
the step of calculating the distance between the current voiceprint feature vector and the determined standard voiceprint feature vector using a predetermined distance formula, and verifying the user's identity according to the distance, comprises:
calculating the cosine distance between the current voiceprint discriminant vector and the determined standard voiceprint feature vector, the two operands of the cosine distance being the standard voiceprint feature vector and the current voiceprint feature vector;
if the cosine distance is less than or equal to a preset distance threshold, identity verification passes;
if the cosine distance is greater than the preset distance threshold, identity verification fails.
5. A server, characterized in that the server comprises a memory and a processor connected to the memory, the memory storing a voiceprint-recognition-based identity verification system executable on the processor, the voiceprint-recognition-based identity verification system implementing the following steps when executed by the processor:
S101, after receiving speech data of a user undergoing identity verification, obtaining the voiceprint features of the speech data, and building the corresponding voiceprint feature vector based on the voiceprint features;
S102, inputting the voiceprint feature vector into a background channel model generated by prior training, to construct the current voiceprint discriminant vector corresponding to the speech data;
S103, calculating the spatial distance between the current voiceprint discriminant vector and the pre-stored standard voiceprint discriminant vector of the user, verifying the user's identity based on the distance, and generating a verification result.
6. An identity verification method, characterized in that the identity verification method comprises:
S1, after receiving an identity verification request carrying an identity identifier sent by a client, randomly sending to the client a voice acquisition text for the user to respond to;
S2, receiving the password speech, sent by the client, that the user has read aloud based on the voice acquisition text, and performing character recognition on the password speech to identify the password characters corresponding to the password speech;
S3, if the password characters are consistent with the standard password characters corresponding to the voice acquisition text, building the current voiceprint feature vector of the password speech, determining the standard voiceprint feature vector corresponding to the user's identity identifier according to a predetermined mapping between identity identifiers and standard voiceprint feature vectors, calculating the distance between the current voiceprint feature vector and the determined standard voiceprint feature vector using a predetermined distance formula, and verifying the user's identity according to the distance.
7. The identity verification method according to claim 6, characterized in that step S2 comprises:
receiving the password speech read aloud by the user and sent by the client, and analyzing whether the password speech is usable; if the password speech is unusable, prompting the client to record the password speech again; or, if the password speech is usable, performing character recognition on the password speech.
8. The identity verification method according to claim 6 or 7, characterized in that, after step S2, the method further comprises:
if the password characters are inconsistent with the standard password characters corresponding to the voice acquisition text, again randomly sending to the client a voice acquisition text for the user to respond to;
counting the number of times a voice acquisition text has been sent to the client and, if that number is greater than or equal to a preset number, terminating the response to the identity verification request.
9. The identity verification method according to claim 6 or 7, characterized in that the step of building the current voiceprint feature vector of the password speech comprises:
processing the password speech with a predetermined filter to extract voiceprint features of a preset type, and building the voiceprint feature vector corresponding to the password speech from the extracted voiceprint features of the preset type;
inputting the built voiceprint feature vector into the pre-trained background channel model to construct the current voiceprint feature vector;
the step of calculating the distance between the current voiceprint feature vector and the determined standard voiceprint feature vector using a predetermined distance formula, and verifying the user's identity according to the distance, comprises:
calculating the cosine distance between the current voiceprint discriminant vector and the determined standard voiceprint feature vector, the two operands of the cosine distance being the standard voiceprint feature vector and the current voiceprint feature vector;
if the cosine distance is less than or equal to a preset distance threshold, identity verification passes;
if the cosine distance is greater than the preset distance threshold, identity verification fails.
10. The identity verification method according to claim 9, characterized in that the background channel model is a Gaussian mixture model, and training the background channel model comprises:
obtaining a preset number of speech data samples, obtaining the voiceprint features corresponding to each speech data sample, and building the voiceprint feature vector corresponding to each speech data sample from those voiceprint features;
dividing the voiceprint feature vectors corresponding to the speech data samples into a training set of a first proportion and a validation set of a second proportion, the sum of the first proportion and the second proportion being less than or equal to 1;
training the Gaussian mixture model with the voiceprint feature vectors in the training set and, after training is complete, verifying the accuracy of the trained Gaussian mixture model with the validation set;
if the accuracy is greater than a preset threshold, ending model training and using the trained Gaussian mixture model as the background channel model; or, if the accuracy is less than or equal to the preset threshold, increasing the number of speech data samples and training again on the enlarged set of speech data samples.
11. A computer-readable storage medium, characterized in that an identity verification system is stored on the computer-readable storage medium, and the identity verification system, when executed by a processor, implements the steps of the identity verification method according to any one of claims 6 to 10.
CN201710715433.9A 2017-03-13 2017-08-20 Server, auth method and computer-readable recording medium Pending CN107517207A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/105031 WO2018166187A1 (en) 2017-03-13 2017-09-30 Server, identity verification method and system, and a computer-readable storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710147695.XA CN107068154A (en) 2017-03-13 2017-03-13 The method and system of authentication based on Application on Voiceprint Recognition
CN201710147695X 2017-03-13

Publications (1)

Publication Number Publication Date
CN107517207A true CN107517207A (en) 2017-12-26

Family

ID=59622093

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201710147695.XA Pending CN107068154A (en) 2017-03-13 2017-03-13 The method and system of authentication based on Application on Voiceprint Recognition
CN201710715433.9A Pending CN107517207A (en) 2017-03-13 2017-08-20 Server, auth method and computer-readable recording medium

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN201710147695.XA Pending CN107068154A (en) 2017-03-13 2017-03-13 The method and system of authentication based on Application on Voiceprint Recognition

Country Status (3)

Country Link
CN (2) CN107068154A (en)
TW (1) TWI641965B (en)
WO (2) WO2018166112A1 (en)

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108091326A (en) * 2018-02-11 2018-05-29 张晓雷 A kind of method for recognizing sound-groove and system based on linear regression
CN108447489A (en) * 2018-04-17 2018-08-24 清华大学 A kind of continuous voiceprint authentication method and system of band feedback
CN108630208A (en) * 2018-05-14 2018-10-09 平安科技(深圳)有限公司 Server, auth method and storage medium based on vocal print
CN108694952A (en) * 2018-04-09 2018-10-23 平安科技(深圳)有限公司 Electronic device, the method for authentication and storage medium
CN108768654A (en) * 2018-04-09 2018-11-06 平安科技(深圳)有限公司 Auth method, server based on Application on Voiceprint Recognition and storage medium
CN108834138A (en) * 2018-05-25 2018-11-16 四川斐讯全智信息技术有限公司 A kind of distribution method and system based on voice print database
CN109087647A (en) * 2018-08-03 2018-12-25 平安科技(深圳)有限公司 Application on Voiceprint Recognition processing method, device, electronic equipment and storage medium
CN109147797A (en) * 2018-10-18 2019-01-04 平安科技(深圳)有限公司 Client service method, device, computer equipment and storage medium based on Application on Voiceprint Recognition
CN109256138A (en) * 2018-08-13 2019-01-22 平安科技(深圳)有限公司 Auth method, terminal device and computer readable storage medium
CN109450850A (en) * 2018-09-26 2019-03-08 深圳壹账通智能科技有限公司 Auth method, device, computer equipment and storage medium
CN109473108A (en) * 2018-12-15 2019-03-15 深圳壹账通智能科技有限公司 Auth method, device, equipment and storage medium based on Application on Voiceprint Recognition
CN109545226A (en) * 2019-01-04 2019-03-29 平安科技(深圳)有限公司 A kind of audio recognition method, equipment and computer readable storage medium
CN109816508A (en) * 2018-12-14 2019-05-28 深圳壹账通智能科技有限公司 Method for authenticating user identity, device based on big data, computer equipment
CN110046910A (en) * 2018-12-13 2019-07-23 阿里巴巴集团控股有限公司 The method and apparatus for obtaining customer group relevant to particular customer
CN110322888A (en) * 2019-05-21 2019-10-11 平安科技(深圳)有限公司 Credit card unlocking method, device, equipment and computer readable storage medium
CN110334603A (en) * 2019-06-06 2019-10-15 视联动力信息技术股份有限公司 Authentication system
WO2019218512A1 (en) * 2018-05-14 2019-11-21 平安科技(深圳)有限公司 Server, voiceprint verification method, and storage medium
CN110971755A (en) * 2019-11-18 2020-04-07 武汉大学 Double-factor identity authentication method based on PIN code and pressure code
CN111597531A (en) * 2020-04-07 2020-08-28 北京捷通华声科技股份有限公司 Identity authentication method and device, electronic equipment and readable storage medium
CN111613230A (en) * 2020-06-24 2020-09-01 泰康保险集团股份有限公司 Voiceprint verification method, voiceprint verification device, voiceprint verification equipment and storage medium
CN111710340A (en) * 2020-06-05 2020-09-25 深圳市卡牛科技有限公司 Method, device, server and storage medium for identifying user identity based on voice
CN112669841A (en) * 2020-12-18 2021-04-16 平安科技(深圳)有限公司 Training method and device for multilingual speech generation model and computer equipment

Families Citing this family (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107068154A (en) * 2017-03-13 2017-08-18 平安科技(深圳)有限公司 The method and system of authentication based on Application on Voiceprint Recognition
CN107527620B (en) 2017-07-25 2019-03-26 平安科技(深圳)有限公司 Electronic device, the method for authentication and computer readable storage medium
CN107993071A (en) * 2017-11-21 2018-05-04 平安科技(深圳)有限公司 Electronic device, auth method and storage medium based on vocal print
CN108172230A (en) * 2018-01-03 2018-06-15 平安科技(深圳)有限公司 Voiceprint registration method, terminal installation and storage medium based on Application on Voiceprint Recognition model
CN108154371A (en) * 2018-01-12 2018-06-12 平安科技(深圳)有限公司 Electronic device, the method for authentication and storage medium
CN108269575B (en) * 2018-01-12 2021-11-02 平安科技(深圳)有限公司 Voice recognition method for updating voiceprint data, terminal device and storage medium
CN108766444B (en) * 2018-04-09 2020-11-03 平安科技(深圳)有限公司 User identity authentication method, server and storage medium
CN108806695A (en) * 2018-04-17 2018-11-13 平安科技(深圳)有限公司 Anti- fraud method, apparatus, computer equipment and the storage medium of self refresh
CN109101801B (en) * 2018-07-12 2021-04-27 北京百度网讯科技有限公司 Method, apparatus, device and computer readable storage medium for identity authentication
CN110867189A (en) * 2018-08-28 2020-03-06 北京京东尚科信息技术有限公司 Login method and device
CN110880325B (en) * 2018-09-05 2022-06-28 华为技术有限公司 Identity recognition method and equipment
CN109377662A (en) * 2018-09-29 2019-02-22 途客易达(天津)网络科技有限公司 Charging pile control method, device and electronic equipment
CN109257362A (en) * 2018-10-11 2019-01-22 平安科技(深圳)有限公司 Method, apparatus, computer equipment and the storage medium of voice print verification
CN109378002B (en) * 2018-10-11 2024-05-07 平安科技(深圳)有限公司 Voiceprint verification method, voiceprint verification device, computer equipment and storage medium
CN109524026B (en) * 2018-10-26 2022-04-26 北京网众共创科技有限公司 Method and device for determining prompt tone, storage medium and electronic device
CN109473105A (en) * 2018-10-26 2019-03-15 平安科技(深圳)有限公司 The voice print verification method, apparatus unrelated with text and computer equipment
CN109360573A (en) * 2018-11-13 2019-02-19 平安科技(深圳)有限公司 Livestock method for recognizing sound-groove, device, terminal device and computer storage medium
CN109493873A (en) * 2018-11-13 2019-03-19 平安科技(深圳)有限公司 Livestock method for recognizing sound-groove, device, terminal device and computer storage medium
CN109636630A (en) * 2018-12-07 2019-04-16 泰康保险集团股份有限公司 Method, apparatus, medium and electronic equipment of the detection for behavior of insuring
CN110298150B (en) * 2019-05-29 2021-11-26 上海拍拍贷金融信息服务有限公司 Identity verification method and system based on voice recognition
CN110738998A (en) * 2019-09-11 2020-01-31 深圳壹账通智能科技有限公司 Voice-based personal credit evaluation method, device, terminal and storage medium
CN110473569A (en) * 2019-09-11 2019-11-19 苏州思必驰信息科技有限公司 Detect the optimization method and system of speaker's spoofing attack
CN111402899B (en) * 2020-03-25 2023-10-13 中国工商银行股份有限公司 Cross-channel voiceprint recognition method and device
CN111625704A (en) * 2020-05-11 2020-09-04 镇江纵陌阡横信息科技有限公司 Non-personalized recommendation algorithm model based on user intention and data cooperation
CN111899566A (en) * 2020-08-11 2020-11-06 南京畅淼科技有限责任公司 Ship traffic management system based on AIS
CN112289324B (en) * 2020-10-27 2024-05-10 湖南华威金安企业管理有限公司 Voiceprint identity recognition method and device and electronic equipment
CN112802481A (en) * 2021-04-06 2021-05-14 北京远鉴信息技术有限公司 Voiceprint verification method, voiceprint recognition model training method, device and equipment
CN113421575B (en) * 2021-06-30 2024-02-06 平安科技(深圳)有限公司 Voiceprint recognition method, voiceprint recognition device, voiceprint recognition equipment and storage medium
CN114780787A (en) * 2022-04-01 2022-07-22 杭州半云科技有限公司 Voiceprint retrieval method, identity verification method, identity registration method and device

Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101064043A (en) * 2006-04-29 2007-10-31 上海优浪信息科技有限公司 Sound-groove gate inhibition system and uses thereof
CN102695112A (en) * 2012-06-09 2012-09-26 九江妙士酷实业有限公司 Automobile player and volume control method thereof
CN102916815A (en) * 2012-11-07 2013-02-06 华为终端有限公司 Method and device for checking identity of user
CN103220286A (en) * 2013-04-10 2013-07-24 郑方 Identity verification system and identity verification method based on dynamic password voice
CN103632504A (en) * 2013-12-17 2014-03-12 上海电机学院 Silence reminder for library
CN103986725A (en) * 2014-05-29 2014-08-13 中国农业银行股份有限公司 Client side, server side and identity authentication system and method
CN104157301A (en) * 2014-07-25 2014-11-19 广州三星通信技术研究有限公司 Method, device and terminal deleting voice information blank segment
CN104427076A (en) * 2013-08-30 2015-03-18 中兴通讯股份有限公司 Recognition method and recognition device for automatic answering of calling system
CN104992708A (en) * 2015-05-11 2015-10-21 国家计算机网络与信息安全管理中心 Short-time specific audio detection model generating method and short-time specific audio detection method
CN105100911A (en) * 2014-05-06 2015-11-25 夏普株式会社 Intelligent multimedia system and method
CN105321293A (en) * 2014-09-18 2016-02-10 广东小天才科技有限公司 Danger detection and warning method and danger detection and warning smart device
CN105611461A (en) * 2016-01-04 2016-05-25 浙江宇视科技有限公司 Noise suppression method, apparatus and system for voice application system of front-end device
CN105869645A (en) * 2016-03-25 2016-08-17 腾讯科技(深圳)有限公司 Voice data processing method and device
CN106210323A (en) * 2016-07-13 2016-12-07 广东欧珀移动通信有限公司 A kind of speech playing method and terminal unit
CN106971717A (en) * 2016-01-14 2017-07-21 芋头科技(杭州)有限公司 Robot and audio recognition method, the device of webserver collaborative process

Family Cites Families (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8645137B2 (en) * 2000-03-16 2014-02-04 Apple Inc. Fast, language-independent method for user authentication by voice
CN1170239C (en) * 2002-09-06 2004-10-06 浙江大学 Palm acoustic-print verifying system
TWI234762B (en) * 2003-12-22 2005-06-21 Top Dihital Co Ltd Voiceprint identification system for e-commerce
US7447633B2 (en) * 2004-11-22 2008-11-04 International Business Machines Corporation Method and apparatus for training a text independent speaker recognition system using speech data with text labels
US7536304B2 (en) * 2005-05-27 2009-05-19 Porticus, Inc. Method and system for bio-metric voice print authentication
CN102479511A (en) * 2010-11-23 2012-05-30 盛乐信息技术(上海)有限公司 Large-scale voiceprint authentication method and system
TW201301261A (en) * 2011-06-27 2013-01-01 Hon Hai Prec Ind Co Ltd Identity authentication system and method thereof
CN102238190B (en) * 2011-08-01 2013-12-11 安徽科大讯飞信息科技股份有限公司 Identity authentication method and system
CN102509547B (en) * 2011-12-29 2013-06-19 辽宁工业大学 Method and system for voiceprint recognition based on vector quantization based
US9042867B2 (en) * 2012-02-24 2015-05-26 Agnitio S.L. System and method for speaker recognition on mobile devices
CN102820033B (en) * 2012-08-17 2013-12-04 南京大学 Voiceprint identification method
CN104765996B (en) * 2014-01-06 2018-04-27 讯飞智元信息科技有限公司 Voiceprint password authentication method and system
CN104978507B (en) * 2014-04-14 2019-02-01 中国石油化工集团公司 A kind of Intelligent controller for logging evaluation expert system identity identifying method based on Application on Voiceprint Recognition
CN104485102A (en) * 2014-12-23 2015-04-01 智慧眼(湖南)科技发展有限公司 Voiceprint recognition method and device
CN104751845A (en) * 2015-03-31 2015-07-01 江苏久祥汽车电器集团有限公司 Voice recognition method and system used for intelligent robot
CN105096955B (en) * 2015-09-06 2019-02-01 广东外语外贸大学 A kind of speaker's method for quickly identifying and system based on model growth cluster
CN105575394A (en) * 2016-01-04 2016-05-11 北京时代瑞朗科技有限公司 Voiceprint identification method based on global change space and deep learning hybrid modeling
CN106169295B (en) * 2016-07-15 2019-03-01 腾讯科技(深圳)有限公司 Identity vector generation method and device
CN106373576B (en) * 2016-09-07 2020-07-21 Tcl科技集团股份有限公司 Speaker confirmation method and system based on VQ and SVM algorithms
CN107068154A (en) * 2017-03-13 2017-08-18 平安科技(深圳)有限公司 The method and system of authentication based on Application on Voiceprint Recognition

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WANG J C, LEE H P, WANG J F: "《Robust Environmental Sound Recognition for Home Automation》", 《IEEE TRANSACTIONS ON AUTOMATION SCIENCE & ENGINEERING》 *

Cited By (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108091326B (en) * 2018-02-11 2021-08-06 张晓雷 Voiceprint recognition method and system based on linear regression
CN108091326A (en) * 2018-02-11 2018-05-29 张晓雷 A kind of method for recognizing sound-groove and system based on linear regression
CN108768654B (en) * 2018-04-09 2020-04-21 平安科技(深圳)有限公司 Identity verification method based on voiceprint recognition, server and storage medium
CN108694952B (en) * 2018-04-09 2020-04-28 平安科技(深圳)有限公司 Electronic device, identity authentication method and storage medium
CN108694952A (en) * 2018-04-09 2018-10-23 平安科技(深圳)有限公司 Electronic device, the method for authentication and storage medium
CN108768654A (en) * 2018-04-09 2018-11-06 平安科技(深圳)有限公司 Auth method, server based on Application on Voiceprint Recognition and storage medium
WO2019196305A1 (en) * 2018-04-09 2019-10-17 平安科技(深圳)有限公司 Electronic device, identity verification method, and storage medium
CN108447489B (en) * 2018-04-17 2020-05-22 清华大学 Continuous voiceprint authentication method and system with feedback
CN108447489A (en) * 2018-04-17 2018-08-24 清华大学 A kind of continuous voiceprint authentication method and system of band feedback
CN108630208A (en) * 2018-05-14 2018-10-09 平安科技(深圳)有限公司 Server, auth method and storage medium based on vocal print
WO2019218512A1 (en) * 2018-05-14 2019-11-21 平安科技(深圳)有限公司 Server, voiceprint verification method, and storage medium
WO2019218515A1 (en) * 2018-05-14 2019-11-21 平安科技(深圳)有限公司 Server, voiceprint-based identity authentication method, and storage medium
CN108834138A (en) * 2018-05-25 2018-11-16 四川斐讯全智信息技术有限公司 A kind of distribution method and system based on voice print database
CN109087647A (en) * 2018-08-03 2018-12-25 平安科技(深圳)有限公司 Voiceprint recognition processing method, device, electronic equipment and storage medium
CN109256138A (en) * 2018-08-13 2019-01-22 平安科技(深圳)有限公司 Identity verification method, terminal device and computer readable storage medium
CN109256138B (en) * 2018-08-13 2023-07-07 平安科技(深圳)有限公司 Identity verification method, terminal device and computer readable storage medium
CN109450850A (en) * 2018-09-26 2019-03-08 深圳壹账通智能科技有限公司 Auth method, device, computer equipment and storage medium
CN109147797A (en) * 2018-10-18 2019-01-04 平安科技(深圳)有限公司 Customer service method, device, computer equipment and storage medium based on voiceprint recognition
CN109147797B (en) * 2018-10-18 2024-05-07 平安科技(深圳)有限公司 Customer service method, device, computer equipment and storage medium based on voiceprint recognition
CN110046910A (en) * 2018-12-13 2019-07-23 阿里巴巴集团控股有限公司 The method and apparatus for obtaining customer group relevant to particular customer
CN110046910B (en) * 2018-12-13 2023-04-14 蚂蚁金服(杭州)网络技术有限公司 Method and equipment for judging validity of transaction performed by customer through electronic payment platform
CN109816508A (en) * 2018-12-14 2019-05-28 深圳壹账通智能科技有限公司 Method for authenticating user identity, device based on big data, computer equipment
CN109473108A (en) * 2018-12-15 2019-03-15 深圳壹账通智能科技有限公司 Auth method, device, equipment and storage medium based on voiceprint recognition
CN109545226A (en) * 2019-01-04 2019-03-29 平安科技(深圳)有限公司 Speech recognition method, equipment and computer readable storage medium
CN110322888A (en) * 2019-05-21 2019-10-11 平安科技(深圳)有限公司 Credit card unlocking method, device, equipment and computer readable storage medium
CN110322888B (en) * 2019-05-21 2023-05-30 平安科技(深圳)有限公司 Credit card unlocking method, apparatus, device and computer readable storage medium
CN110334603A (en) * 2019-06-06 2019-10-15 视联动力信息技术股份有限公司 Authentication system
CN110971755A (en) * 2019-11-18 2020-04-07 武汉大学 Double-factor identity authentication method based on PIN code and pressure code
CN111597531A (en) * 2020-04-07 2020-08-28 北京捷通华声科技股份有限公司 Identity authentication method and device, electronic equipment and readable storage medium
CN111710340A (en) * 2020-06-05 2020-09-25 深圳市卡牛科技有限公司 Method, device, server and storage medium for identifying user identity based on voice
CN111613230A (en) * 2020-06-24 2020-09-01 泰康保险集团股份有限公司 Voiceprint verification method, voiceprint verification device, voiceprint verification equipment and storage medium
CN112669841A (en) * 2020-12-18 2021-04-16 平安科技(深圳)有限公司 Training method and device for multilingual speech generation model and computer equipment

Also Published As

Publication number Publication date
CN107068154A (en) 2017-08-18
TW201833810A (en) 2018-09-16
WO2018166187A1 (en) 2018-09-20
WO2018166112A1 (en) 2018-09-20
TWI641965B (en) 2018-11-21

Similar Documents

Publication Publication Date Title
CN107517207A (en) Server, auth method and computer-readable recording medium
WO2019100606A1 (en) Electronic device, voiceprint-based identity verification method and system, and storage medium
KR102159217B1 (en) Electronic device, identification method, system and computer-readable storage medium
WO2020119448A1 (en) Voice information verification
Liu et al. An MFCC‐based text‐independent speaker identification system for access control
CN110047490A (en) Voiceprint recognition method, device, equipment and computer readable storage medium
WO2019136912A1 (en) Electronic device, identity authentication method and system, and storage medium
CN103971690A (en) Voiceprint recognition method and device
CN103794207A (en) Dual-mode voice identity recognition method
CN107766868A (en) Classifier training method and device
CN105096955A (en) Speaker rapid identification method and system based on growing and clustering algorithm of models
CN108650266B (en) Server, voiceprint verification method and storage medium
CN110473552A (en) Speech recognition authentication method and system
CN104517066A (en) Folder encrypting method
CN109378014A (en) Mobile device source identification method and system based on convolutional neural networks
CN108694952B (en) Electronic device, identity authentication method and storage medium
CN113177850A (en) Method and device for multi-party identity authentication of insurance
CN111933154B (en) Method, equipment and computer readable storage medium for recognizing fake voice
CN108630208B (en) Server, voiceprint-based identity authentication method and storage medium
CN114003883A (en) Portable digital identity authentication equipment and identity authentication method
CN111916074A (en) Cross-device voice control method, system, terminal and storage medium
TW201944320A (en) Payment authentication method, device, equipment and storage medium
CN112562691B (en) Voiceprint recognition method, voiceprint recognition device, computer equipment and storage medium
CN115690920B (en) Credible living body detection method for medical identity authentication and related equipment
CN113436633B (en) Speaker recognition method, speaker recognition device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20171226