CN107517207A - Server, authentication method and computer-readable storage medium - Google Patents
Server, authentication method and computer-readable storage medium
- Publication number: CN107517207A
- Application number: CN201710715433.9A
- Authority: CN (China)
- Prior art keywords: voiceprint feature, voice, password, feature vector
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- H04L63/0861 — Network architectures or protocols for network security: authentication of entities using biometrical features, e.g. fingerprint, retina scan
- G10L15/26 — Speech recognition: speech-to-text systems
- G10L17/02 — Speaker identification or verification: preprocessing operations; pattern representation or modelling; feature selection or extraction
- G10L17/04 — Speaker identification or verification: training, enrolment or model building
- G10L17/06 — Speaker identification or verification: decision making techniques; pattern matching strategies
- G10L25/18 — Speech or voice analysis: the extracted parameters being spectral information of each sub-band
- G10L25/24 — Speech or voice analysis: the extracted parameters being the cepstrum
Abstract
The present invention relates to a server, an authentication method and a computer-readable storage medium. The server includes a memory and a processor connected to the memory, the memory storing an authentication system runnable on the processor. When the authentication system is executed by the processor, the following steps are realized: after an authentication request is received, a voice acquisition text is randomly sent to the client; the password voice reported by the user and sent by the client is received, and the password characters corresponding to the password voice are recognized; if the password characters are consistent with the standard password characters corresponding to the voice acquisition text, the current voiceprint feature vector of the password voice is built, the corresponding standard voiceprint feature vector is determined according to a predetermined mapping relation, the distance between the current voiceprint feature vector and the determined standard voiceprint feature vector is calculated using a predetermined distance calculation formula, and identity verification is performed on the user according to the distance. The present invention can improve the security of identity verification.
Description
Technical field
The present invention relates to the field of communication technology, and more particularly to a server, an authentication method and a computer-readable storage medium.
Background technology
At present, the business scope of a large financial company covers multiple lines such as insurance, banking and investment, and each business line generally needs to communicate with the same client in a variety of ways (for example by telephone or face to face). Before such communication takes place, verifying the client's identity is an important part of ensuring business security.
To meet the real-time demands of the business, many financial companies verify the identity of clients manually. However, because the customer base is huge, manual discriminant analysis of a client's identity is neither accurate nor efficient. To solve this problem, other existing schemes perform identity verification using a voiceprint scheme, but such schemes cannot exclude criminals who pass voiceprint verification using a forged recording, and therefore carry a certain safety risk.
Summary of the invention
The object of the present invention is to provide a server, an authentication method and a computer-readable storage medium, intended to improve the security of identity verification.
To achieve the above object, the present invention provides a server. The server includes a memory and a processor connected to the memory, the memory storing an authentication system runnable on the processor. When the authentication system is executed by the processor, the following steps are realized:
S1, after an authentication request carrying an identity identifier sent by a client is received, randomly sending to the client a voice acquisition text for the user to respond to;
S2, receiving the password voice reported by the user, sent by the client based on the voice acquisition text, and performing character recognition on the password voice to identify the password characters corresponding to the password voice;
S3, if the password characters are consistent with the standard password characters corresponding to the voice acquisition text, building the current voiceprint feature vector of the password voice, determining the standard voiceprint feature vector corresponding to the identity identifier of the user according to a predetermined mapping relation between identity identifiers and standard voiceprint feature vectors, calculating the distance between the current voiceprint feature vector and the determined standard voiceprint feature vector using a predetermined distance calculation formula, and performing identity verification on the user according to the distance.
Preferably, the step S2 includes:
receiving the password voice reported by the user and sent by the client, and analyzing whether the password voice is usable; if the password voice is unusable, prompting the client to re-record the password voice, or, if the password voice is usable, performing character recognition on the password voice.
Preferably, when the authentication system is executed by the processor, the following steps are also realized:
if the password characters are inconsistent with the standard password characters corresponding to the voice acquisition text, randomly sending to the client again a voice acquisition text for the user to respond to;
accumulating the number of voice acquisition texts sent to the client, and if the number is greater than or equal to a preset number of times, terminating the response to the authentication request.
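As a minimal sketch of the retry rule above, the following Python counts every voice acquisition text sent within one authentication request and refuses to send more once a preset limit is reached. The class name, the limit value and the method API are illustrative assumptions, not taken from the patent.

```python
MAX_PROMPTS = 3  # illustrative "preset number of times"; the patent leaves it configurable

class AuthSession:
    """Tracks one authentication request: accumulates the number of voice
    acquisition texts sent and terminates the response once the limit is hit."""

    def __init__(self, max_prompts: int = MAX_PROMPTS):
        self.max_prompts = max_prompts
        self.prompts_sent = 0
        self.terminated = False

    def send_prompt(self) -> bool:
        """Return True if another voice acquisition text may be sent;
        return False (and mark the session terminated) once the
        accumulated count has reached the preset number of times."""
        if self.prompts_sent >= self.max_prompts:
            self.terminated = True
            return False
        self.prompts_sent += 1
        return True
```

In use, the server would call `send_prompt()` each time recognized password characters turn out inconsistent with the standard ones, and stop responding to the request when it returns False.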
Preferably, the step of building the current voiceprint feature vector of the password voice includes:
processing the password voice with a predetermined filter to extract voiceprint features of a preset type, and building the voiceprint feature vector corresponding to the password voice based on the extracted preset-type voiceprint features;
inputting the built voiceprint feature vector into a pre-trained background channel model to construct the current voiceprint feature vector.
The step of calculating the distance between the current voiceprint feature vector and the determined standard voiceprint feature vector using the predetermined distance calculation formula and performing identity verification on the user according to the distance includes:
calculating the cosine distance between the current voiceprint discriminant vector and the determined standard voiceprint feature vector:

    d(x, y) = 1 - (x · y) / (||x|| ||y||)

where y is the standard voiceprint feature vector and x is the current voiceprint feature vector;
if the cosine distance is less than or equal to a preset distance threshold, identity verification passes;
if the cosine distance is greater than the preset distance threshold, identity verification does not pass.
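The cosine-distance check can be sketched in Python with NumPy. The threshold value below is an illustrative assumption, since the patent only speaks of a preset distance threshold without fixing it.

```python
import numpy as np

DISTANCE_THRESHOLD = 0.25  # illustrative preset threshold, not specified in the patent

def cosine_distance(current: np.ndarray, standard: np.ndarray) -> float:
    """Cosine distance between the current and the standard voiceprint
    feature vectors: 1 minus the cosine of the angle between them."""
    cos_sim = float(np.dot(current, standard) /
                    (np.linalg.norm(current) * np.linalg.norm(standard)))
    return 1.0 - cos_sim

def verify(current: np.ndarray, standard: np.ndarray,
           threshold: float = DISTANCE_THRESHOLD) -> bool:
    """Identity verification passes when the distance does not exceed the threshold."""
    return cosine_distance(current, standard) <= threshold
```

Identical vectors give a distance of 0 (verification passes); orthogonal vectors give a distance of 1 (verification fails for any reasonable threshold).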
To achieve the above object, the present invention also provides a server. The server includes a memory and a processor connected to the memory, the memory storing a voiceprint-recognition-based identity verification system runnable on the processor. When the voiceprint-recognition-based identity verification system is executed by the processor, the following steps are realized:
S101, after the speech data of a user undergoing identity verification is received, obtaining the voiceprint features of the speech data, and building the corresponding voiceprint feature vector based on the voiceprint features;
S102, inputting the voiceprint feature vector into a pre-trained background channel model to construct the current voiceprint discriminant vector corresponding to the speech data;
S103, calculating the spatial distance between the current voiceprint discriminant vector and the prestored standard voiceprint discriminant vector of the user, performing identity verification on the user based on the distance, and generating a verification result.
To achieve the above object, the present invention also provides an authentication method. The authentication method includes:
S1, after an authentication request carrying an identity identifier sent by a client is received, randomly sending to the client a voice acquisition text for the user to respond to;
S2, receiving the password voice reported by the user, sent by the client based on the voice acquisition text, and performing character recognition on the password voice to identify the password characters corresponding to the password voice;
S3, if the password characters are consistent with the standard password characters corresponding to the voice acquisition text, building the current voiceprint feature vector of the password voice, determining the standard voiceprint feature vector corresponding to the identity identifier of the user according to a predetermined mapping relation between identity identifiers and standard voiceprint feature vectors, calculating the distance between the current voiceprint feature vector and the determined standard voiceprint feature vector using a predetermined distance calculation formula, and performing identity verification on the user according to the distance.
Preferably, the step S2 includes:
receiving the password voice reported by the user and sent by the client, and analyzing whether the password voice is usable; if the password voice is unusable, prompting the client to re-record the password voice, or, if the password voice is usable, performing character recognition on the password voice.
Preferably, the method also includes, after the step S2:
if the password characters are inconsistent with the standard password characters corresponding to the voice acquisition text, randomly sending to the client again a voice acquisition text for the user to respond to;
accumulating the number of voice acquisition texts sent to the client, and if the number is greater than or equal to a preset number of times, terminating the response to the authentication request.
Preferably, the step of building the current voiceprint feature vector of the password voice includes:
processing the password voice with a predetermined filter to extract voiceprint features of a preset type, and building the voiceprint feature vector corresponding to the password voice based on the extracted preset-type voiceprint features;
inputting the built voiceprint feature vector into a pre-trained background channel model to construct the current voiceprint feature vector.
The step of calculating the distance between the current voiceprint feature vector and the determined standard voiceprint feature vector using the predetermined distance calculation formula and performing identity verification on the user according to the distance includes:
calculating the cosine distance between the current voiceprint discriminant vector and the determined standard voiceprint feature vector:

    d(x, y) = 1 - (x · y) / (||x|| ||y||)

where y is the standard voiceprint feature vector and x is the current voiceprint feature vector;
if the cosine distance is less than or equal to a preset distance threshold, identity verification passes;
if the cosine distance is greater than the preset distance threshold, identity verification does not pass.
Preferably, the background channel model is a Gaussian mixture model, and training the background channel model includes:
obtaining a predetermined number of speech data samples, obtaining the voiceprint features corresponding to each speech data sample, and building the voiceprint feature vector corresponding to each speech data sample based on those voiceprint features;
dividing the voiceprint feature vectors corresponding to the speech data samples into a training set of a first ratio and a validation set of a second ratio, the sum of the first ratio and the second ratio being less than or equal to 1;
training the Gaussian mixture model using the voiceprint feature vectors in the training set, and, after training is completed, verifying the accuracy rate of the trained Gaussian mixture model using the validation set;
if the accuracy rate is greater than a preset threshold, ending model training and using the trained Gaussian mixture model as the background channel model, or, if the accuracy rate is less than or equal to the preset threshold, increasing the number of speech data samples and re-training based on the increased speech data samples.
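The training loop above can be sketched as follows. As a deliberate simplification, the "model" here is a single-Gaussian centroid standing in for a real Gaussian mixture model (which would be fitted by EM, e.g. with a library such as scikit-learn), and the accuracy measure is a toy acceptance rate; the split ratios, accuracy threshold, acceptance radius and round limit are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_model(train: np.ndarray) -> np.ndarray:
    """Stand-in for GMM training: the centroid of the training vectors.
    A real system would fit a Gaussian mixture model here."""
    return train.mean(axis=0)

def accuracy(model: np.ndarray, validation: np.ndarray, radius: float = 5.0) -> float:
    """Toy accuracy rate: fraction of validation vectors within a fixed
    Euclidean radius of the model centroid."""
    dists = np.linalg.norm(validation - model, axis=1)
    return float((dists <= radius).mean())

def train_background_model(samples: np.ndarray, train_ratio: float = 0.7,
                           val_ratio: float = 0.3, acc_threshold: float = 0.9,
                           max_rounds: int = 5) -> np.ndarray:
    """Split into training and validation sets, train, check the accuracy
    rate, and enlarge the sample set and retrain while it stays too low."""
    for _ in range(max_rounds):
        n = len(samples)
        n_train = int(n * train_ratio)
        n_val = int(n * val_ratio)
        train, val = samples[:n_train], samples[n_train:n_train + n_val]
        model = fit_model(train)
        if accuracy(model, val) > acc_threshold:
            return model  # accuracy high enough: use as background channel model
        # accuracy too low: increase the number of speech data samples and retrain
        samples = np.vstack([samples, rng.normal(size=(n, samples.shape[1]))])
    return model
```

The structure of the loop (split, train, validate, grow the sample set on failure) mirrors the patent's procedure even though the model itself is a placeholder.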
The present invention also provides a computer-readable storage medium storing an authentication system; when the authentication system is executed by a processor, the steps of the above authentication method are realized.
The beneficial effects of the invention are as follows: if another person performs identity verification using an existing or prepared forged recording, then, owing to the randomness of the voice acquisition text that is sent, the recognized password characters will be inconsistent with the corresponding standard password characters, which prevents such a person from passing identity verification with an existing or prepared forged recording; if another person records his own voice to perform identity verification, he subsequently cannot pass the voiceprint feature verification. The present embodiment is therefore equivalent to performing identity verification twice and has the effect of double verification: while the accuracy rate and efficiency of user identity verification are ensured, the security of identity verification is improved.
Brief description of the drawings
Fig. 1 is a schematic diagram of the application environment of each optional embodiment of the present invention;
Fig. 2 is a schematic flow chart of one embodiment of the authentication method of the present invention.
Detailed description of the embodiments
In order to make the purpose, technical scheme and advantages of the present invention clearer, the present invention is further elaborated below in conjunction with the drawings and embodiments. It should be appreciated that the specific embodiments described herein only explain the present invention and are not intended to limit it. Based on the embodiments of the present invention, every other embodiment obtained by those of ordinary skill in the art without creative work belongs to the scope of protection of the present invention.
It should be noted that descriptions involving "first", "second" and so on in the present invention are used for descriptive purposes only and cannot be interpreted as indicating or implying relative importance or the quantity of the technical features indicated. Thus, a feature defined as "first" or "second" may expressly or implicitly include at least one such feature. In addition, the technical schemes of the embodiments can be combined with each other, but only where those of ordinary skill in the art can implement the combination; when a combination of technical schemes is contradictory or cannot be realized, it should be understood that the combination does not exist and is not within the scope of protection claimed by the present application.
As shown in Fig. 1, it is a schematic diagram of the application environment of the preferred embodiment of the authentication method of the present invention. The application environment includes a server 1 and a terminal device 2. The server 1 can carry out data interaction with the terminal device 2 through suitable technologies such as a network or near-field communication.
A client for sending an authentication request to the server 1 is installed on the terminal device 2. The terminal device 2 includes, but is not limited to, any electronic product that can carry out human-computer interaction with a user through a keyboard, mouse, remote control, touch pad or voice-operated device, for example a mobile device such as a personal computer, tablet computer, smart phone, personal digital assistant (PDA), game machine, Internet Protocol television (IPTV), intelligent wearable device or navigation device, or a fixed terminal such as a digital TV, desktop computer, notebook computer or server.
The server 1 is a device that can automatically carry out numerical computation and/or information processing according to instructions that have been set or stored in advance. The server 1 can be a computer, a single network server, a server group composed of multiple network servers, or a cloud composed of a large number of hosts or network servers based on cloud computing, where cloud computing is a kind of distributed computing: a super virtual computer composed of a collection of loosely coupled computers.
In the present embodiment, the server 1 may include, but is not limited to, a memory 11, a processor 12 and a network interface 13 that can be communicatively connected with each other through a system bus, the memory 11 storing an authentication system runnable on the processor 12. It should be noted that Fig. 1 only shows the server 1 with components 11-13; it should be understood that not all the components shown are required, and more or fewer components may be implemented instead.
The memory 11 includes an internal memory and at least one type of readable storage medium. The internal memory provides a cache for the operation of the server 1; the readable storage medium can be a non-volatile storage medium such as a flash memory, hard disk, multimedia card, card-type memory (for example SD or DX memory), random access memory (RAM), static random-access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk or optical disk. In some embodiments, the readable storage medium can be an internal storage unit of the server 1, for example the hard disk of the server 1; in other embodiments, the non-volatile storage medium can also be an external storage device of the server 1, for example a plug-in hard disk, smart media card (SMC), secure digital (SD) card or flash card equipped on the server 1. In the present embodiment, the readable storage medium of the memory 11 is generally used to store the operating system and various application software installed on the server 1, such as the program code of the authentication system in one embodiment of the present invention. In addition, the memory 11 can also be used to temporarily store various data that has been output or will be output.
The processor 12 can, in some embodiments, be a central processing unit (CPU), controller, microcontroller, microprocessor or other data processing chip. The processor 12 is generally used to control the overall operation of the server 1, for example performing the control and processing related to data interaction or communication with the terminal device 2. In the present embodiment, the processor 12 is used to run the program code stored in the memory 11 or to process data, for example to run the authentication system.
The network interface 13 may include a wireless network interface or a wired network interface, and is generally used to establish a communication connection between the server 1 and other electronic devices. In the present embodiment, the network interface 13 is mainly used to connect the server 1 with one or more terminal devices 2 and to establish a data transmission channel and communication connection between the server 1 and the one or more terminal devices 2.
The authentication system is stored in the memory 11 and includes at least one computer-readable instruction stored in the memory 11. The at least one computer-readable instruction can be executed by the processor 12 to realize the method of each embodiment of the present application, and can be divided into different logical modules according to the functions its respective parts realize.
In one embodiment, when the above authentication system is executed by the processor 12, the following steps are realized:
Step S1, after an authentication request carrying an identity identifier sent by a client is received, randomly sending to the client a voice acquisition text for the user to respond to.
The user operates on the client, which sends to the server an authentication request carrying the identity identifier; after the server receives the authentication request, it randomly sends to the client a voice acquisition text for the user to respond to.
The identity identifier can be, for example, the identity card number of the user or the mobile phone number of the user. There are many voice acquisition texts for the user to respond to, and the server randomly sends one of them to the client; the purpose is to prevent other people from performing identity verification with an existing forged recording. The voice acquisition text can be the text corresponding to a random password to be recorded by voice, or the text of a question about the random password to be recorded by voice. For example, if the voice acquisition text is "Please record the string of digits * * *", the user records the voice of the string of digits * * * when responding according to the voice acquisition text; as another example, if the voice acquisition text is the question "Where is your birthplace?", the user records "My birthplace is * * *" when responding according to it.
Step S2, receiving the password voice reported by the user, sent by the client based on the voice acquisition text, and performing character recognition on the password voice to identify the password characters corresponding to the password voice.
In the present embodiment, the way the user records the password voice at the client can be: according to the voice acquisition text, after the user presses a predetermined physical button or virtual key, the recording unit is controlled to record the voice; after the user releases the button, the recording stops, and the recorded voice is sent to the server as the password voice.
When recording the password voice, interference from ambient noise and from the recording equipment should be prevented as far as possible. The recording equipment should keep a suitable distance from the user, equipment with large distortion should be avoided where possible, the power supply should preferably be mains power with a stable current, and a sensor should be used when making a telephone recording.
After the server receives the password voice, it performs character recognition on the password voice, that is, converts the password voice into characters one by one. The password voice can be converted into characters directly, or noise processing can first be applied to the password voice to further reduce interference. In order to extract the voiceprint features of the password voice, the recorded password voice is speech data of a preset data length, or speech data longer than the preset data length.
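The length requirement above can be sketched as a simple usability gate. The sampling rate and minimum duration below are illustrative assumptions: the patent only requires that the recording reach a preset data length so that voiceprint features can be extracted from it.

```python
SAMPLE_RATE = 16000   # assumed sampling rate in Hz, not specified in the patent
MIN_SECONDS = 3.0     # illustrative "preset data length" expressed as a duration

def password_voice_usable(samples: list) -> bool:
    """Return True when the recorded password voice reaches the preset
    data length, i.e. contains enough samples for feature extraction."""
    return len(samples) >= int(SAMPLE_RATE * MIN_SECONDS)
```

A recording that fails this check would trigger the re-recording prompt described earlier rather than proceeding to character recognition.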
Step S3, if the password characters are consistent with the standard password characters corresponding to the voice acquisition text, building the current voiceprint feature vector of the password voice, determining the standard voiceprint feature vector corresponding to the identity identifier of the user according to the predetermined mapping relation between identity identifiers and standard voiceprint feature vectors, calculating the distance between the current voiceprint feature vector and the determined standard voiceprint feature vector using the predetermined distance calculation formula, and performing identity verification on the user according to the distance.
In the present embodiment, there are many voice acquisition texts, and there are likewise many standard password characters prestored on the server; the voice acquisition texts correspond one-to-one with the standard password characters. After the password characters corresponding to the password voice are recognized, the standard password characters corresponding to the sent voice acquisition text are obtained, and it is judged whether the recognized password characters are consistent with the corresponding standard password characters.
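The one-to-one correspondence and the consistency judgment might look like this in Python; the dictionary contents and the function name are hypothetical illustrations, not from the patent.

```python
# Illustrative one-to-one mapping between voice acquisition texts and the
# standard password characters prestored on the server (contents hypothetical).
STANDARD_PASSWORDS = {
    "Please record the string of digits 4 8 2 9": "4829",
    "Please record the string of digits 7 1 3 5": "7135",
}

def characters_match(sent_prompt: str, recognized_chars: str) -> bool:
    """Obtain the standard password characters corresponding to the voice
    acquisition text that was sent, and judge whether the recognized
    password characters are consistent with them."""
    return STANDARD_PASSWORDS.get(sent_prompt) == recognized_chars
```

Only when this judgment succeeds does the flow proceed to building the voiceprint feature vector.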
If the recognized password characters are consistent with the corresponding standard password characters, the current voiceprint feature vector of the password voice is further built. Voiceprint features include many types, such as the broadband voiceprint, narrowband voiceprint and amplitude voiceprint; the voiceprint feature of the present embodiment is preferably the Mel-frequency cepstrum coefficients (MFCC) of the speech data. When building the corresponding voiceprint feature vector, the voiceprint features of the password voice are composed into a feature matrix, and this feature matrix is the voiceprint feature vector of the password voice.
There are multiple distances between vectors, including the cosine distance and the Euclidean distance. Preferably, in this embodiment the distance between the current voiceprint feature vector and the determined standard voiceprint feature vector is the cosine distance, which uses the cosine of the angle between two vectors in vector space as a measure of the difference between two individuals.

The standard voiceprint feature vector is a prestored voiceprint feature vector. Before the distance is calculated, the corresponding standard voiceprint feature vector is obtained according to the user identity.

When the calculated distance is less than or equal to a preset distance threshold, verification succeeds; otherwise, verification fails.
Compared with the prior art, if another person attempts identity verification with an existing or prepared fake recording, then because the transmitted voice acquisition text is random, the recognized password characters will be inconsistent with the corresponding standard password characters, which prevents others from passing identity verification with an existing or prepared fake recording. If another person records his or her own voice for identity verification, the voiceprint feature check will fail. This embodiment therefore performs identity verification twice and has the effect of double verification: it guarantees the accuracy and efficiency of user identity verification while improving the security of identity verification.
In a preferred embodiment, to prevent the audio quality of the password voice from affecting the result of voiceprint feature verification, on the basis of the embodiment of Fig. 1 above, step S2 includes: receiving the password voice reported by the user and sent by the client, and analyzing whether the password voice is usable; if the password voice is unusable, prompting the client to re-record the password voice; or, if the password voice is usable, performing character recognition on the password voice.
Whether the password voice is usable is based on the following analyses: whether the duration of the part in which the user speaks exceeds a preset duration, whether the background noise volume of the password voice is below a first preset volume, and/or whether the speaking volume exceeds a second preset volume. If all of the above analysis results are satisfied, the password voice is usable and subsequent operations such as character recognition can be performed. Conversely, if the duration of the speaking part is less than the preset duration, or the background noise volume of the password voice is greater than or equal to the first preset volume, or the speaking volume is less than or equal to the second preset volume, the password voice is unusable, and the client is prompted to re-record the password voice.
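The three usability checks above can be sketched as a simple frame-energy analysis. The RMS-energy heuristic and all threshold values below are illustrative assumptions, not the preset values of this embodiment:

```python
import numpy as np

def is_usable(samples, sr, min_speech_s=1.0, max_noise_rms=0.01, min_speech_rms=0.05):
    """Rough usability check: enough speech, quiet background, loud enough speech.

    Frames are classified as speech/background by RMS energy; the thresholds
    stand in for the patent's preset duration and first/second preset volumes."""
    frame = int(0.025 * sr)                      # 25 ms analysis frames
    n = len(samples) // frame
    frames = samples[:n * frame].reshape(n, frame)
    rms = np.sqrt((frames ** 2).mean(axis=1))
    speech = rms >= min_speech_rms               # "speaking part" frames
    speech_dur = speech.sum() * frame / sr
    noise_rms = rms[~speech].mean() if (~speech).any() else 0.0
    speech_rms = rms[speech].mean() if speech.any() else 0.0
    return bool(speech_dur > min_speech_s        # duration check
                and noise_rms < max_noise_rms    # background-noise check
                and speech_rms > min_speech_rms) # speaking-volume check
```

If any check fails, the caller would prompt the client to re-record before attempting character recognition.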
In a preferred embodiment, when the identity verification system is executed by the processor, the following steps are also implemented: if the password characters are inconsistent with the standard password characters corresponding to the voice acquisition text, randomly sending a voice acquisition text for user response to the client again; accumulating the number of voice acquisition texts sent to the client, and terminating the response to the identity verification request if that number is greater than or equal to a preset number of times.
If the user recorded a wrong password voice, i.e., the password characters are inconsistent with the standard password characters corresponding to the voice acquisition text, the server can again randomly send the client a voice acquisition text for user response, giving the user another chance. Meanwhile, to prevent excessive password verification from wasting computer resources, the number of password verifications can be limited to less than a preset number of times; that is, the accumulated number of voice acquisition texts sent to the client must stay below the preset number, and when that number is greater than or equal to the preset number, the response to the identity verification request is terminated.
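The retry accounting described above can be sketched as follows; the class and its field names are hypothetical illustrations of the counting rule, not part of this embodiment:

```python
import random

class PromptSession:
    """Tracks how many voice acquisition texts have been sent for one
    identity verification request, terminating at a preset limit."""
    def __init__(self, prompts, max_attempts=3):
        self.prompts = prompts
        self.max_attempts = max_attempts
        self.sent = 0

    def next_prompt(self):
        """Return a random voice acquisition text, or None once the
        accumulated count reaches the preset number of times."""
        if self.sent >= self.max_attempts:
            return None          # terminate the response to the request
        self.sent += 1
        return random.choice(self.prompts)
```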
In a preferred embodiment, on the basis of the above embodiments, the step of constructing the current voiceprint feature vector of the password voice in step S3 includes: processing the password voice with a predetermined filter to extract voiceprint features of a preset type, and constructing the voiceprint feature vector corresponding to the password voice based on the extracted preset-type voiceprint features; and inputting the constructed voiceprint feature vector into a pre-trained background channel model to construct the current voiceprint feature vector.
The predetermined filter is preferably a Mel filter. First, pre-emphasis, framing, and windowing are applied to the password voice. In this embodiment, after the password voice of the user undergoing identity verification is received, the password voice is processed. Pre-emphasis is in fact a high-pass filtering operation that filters out low-frequency data so that the high-frequency characteristics of the password voice stand out. Specifically, the transfer function of the high-pass filter is H(Z) = 1 - α*Z^-1, where Z is the speech data and α is a constant factor, with a preferred value of 0.97. Because a speech signal is stationary only over short periods of time, a segment of the speech signal is divided into N short-time segments (i.e., N frames), and to avoid losing the continuity characteristics of the sound, adjacent frames overlap by a region that is generally 1/2 of the frame length. After the password voice is framed, each frame is processed as a stationary signal; however, owing to the Gibbs effect, the start and end of each frame are discontinuous, so after framing the signal deviates further from the original speech. Windowing must therefore be applied to the password voice.
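The pre-emphasis, framing, and windowing steps can be sketched as below, with α = 0.97 and a 1/2-frame overlap as the text specifies; the 400-sample frame length and the Hamming window are illustrative choices:

```python
import numpy as np

def preprocess(signal, frame_len=400):
    """Pre-emphasis H(Z) = 1 - 0.97*Z^-1, half-overlapping frames, windowing."""
    alpha = 0.97
    # High-pass pre-emphasis: each sample minus alpha times the previous one.
    emphasized = np.append(signal[0], signal[1:] - alpha * signal[:-1])
    hop = frame_len // 2                          # adjacent frames overlap by 1/2
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    frames = np.stack([emphasized[i * hop: i * hop + frame_len]
                       for i in range(n_frames)])
    # Window each frame to soften the discontinuous edges (Gibbs effect).
    return frames * np.hamming(frame_len)
```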
A Fourier transform is applied to each windowed frame to obtain its spectrum;

the spectrum is input into the Mel filter, which outputs a Mel spectrum;

cepstral analysis is performed on the Mel spectrum to obtain the Mel frequency cepstrum coefficients (MFCC), and the corresponding voiceprint feature vector is composed from these MFCCs. Cepstral analysis consists, for example, of taking the logarithm and applying an inverse transform; the inverse transform is generally realized with a discrete cosine transform (DCT), and the 2nd through 13th coefficients after the DCT are taken as the MFCC coefficients. The MFCCs are the voiceprint features of each frame of the password voice; the MFCCs of all frames form a feature data matrix, and this feature data matrix is the voiceprint feature vector of the password voice.
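The spectrum → Mel filter → logarithm → DCT chain, keeping the 2nd through 13th coefficients, can be sketched for a single frame as follows. The triangular Mel filterbank construction is a standard textbook implementation shown for illustration, not this embodiment's exact filter:

```python
import numpy as np
from scipy.fftpack import dct

def mfcc_frame(frame, sr=16000, n_fft=512, n_mels=26):
    """One windowed frame -> 12 MFCCs (DCT coefficients 2..13)."""
    spectrum = np.abs(np.fft.rfft(frame, n_fft)) ** 2        # Fourier transform
    mel = lambda f: 2595 * np.log10(1 + f / 700)             # Hz -> Mel scale
    mel_pts = np.linspace(mel(0), mel(sr / 2), n_mels + 2)
    hz_pts = 700 * (10 ** (mel_pts / 2595) - 1)
    bins = np.floor((n_fft + 1) * hz_pts / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):                           # triangular Mel filters
        l, c, r = bins[m - 1], bins[m], bins[m + 1]
        fbank[m - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[m - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    mel_spec = np.log(fbank @ spectrum + 1e-10)              # Mel spectrum, log
    return dct(mel_spec, norm='ortho')[1:13]                 # keep coeffs 2..13
```

Stacking the result for every frame gives the feature data matrix the text describes.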
Then, the voiceprint feature vector is input into the pre-trained background channel model. Preferably, the background channel model is a Gaussian mixture model; the background channel model processes the voiceprint feature vector to produce the corresponding current voiceprint feature vector (i.e., the i-vector).
Specifically, the calculation process includes:
1) Select Gaussian models: first, using the parameters of the universal background channel model, compute the log-likelihood of each frame of data under the different Gaussian models, sort each column of the log-likelihood matrix in parallel, and select the top N Gaussian models, finally obtaining, for each frame of data, a matrix of its values in the Gaussian mixture model:

Loglike = E(X) * D(X)^-1 * X^T - 0.5 * D(X)^-1 * (X^.2)^T,

where Loglike is the log-likelihood matrix, E(X) is the mean matrix trained from the universal background channel model, D(X) is the covariance matrix, X is the data matrix, and X^.2 denotes the element-wise square of the matrix.
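Under the usual diagonal-covariance reading of D(X), the Loglike expression above is the frame-versus-Gaussian score matrix, up to a per-Gaussian constant that is omitted here. A minimal sketch:

```python
import numpy as np

def loglike_matrix(E, D, X):
    """Loglike = E(X)*D(X)^-1*X^T - 0.5*D(X)^-1*(X^.2)^T (the patent's formula).

    E: (K, F) mean matrix, D: (K, F) diagonal covariances, X: (N, F) frames.
    Returns a (K, N) score matrix; the per-Gaussian constant term
    (log-determinant and mean quadratic) is dropped, as in the formula."""
    D_inv = 1.0 / D
    return (E * D_inv) @ X.T - 0.5 * D_inv @ (X ** 2).T
```

Sorting each column of the result and keeping the top N rows implements the Gaussian-selection step.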
2) Compute posterior probabilities: for each frame of data X, compute X*X^T to obtain a symmetric matrix, which can be reduced to a lower triangular matrix whose elements are arranged in order into a single row, becoming a vector whose dimension is the number of frames N times the number of elements of the lower triangular matrix; the vectors of all frames are combined into a new data matrix. Meanwhile, the covariance matrix used for computing probabilities in the universal background model is likewise reduced to a lower triangular matrix, becoming a matrix shaped like the new data matrix. Using the mean matrix and covariance matrix of the universal background channel model, the log-likelihood of each frame of data under its selected Gaussian models is computed, followed by a Softmax regression and finally a normalization operation, yielding each frame's posterior probability distribution over the Gaussian mixture model; the probability distribution vectors of all frames form the probability matrix.
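The Softmax-and-normalize step that turns each frame's log-likelihoods into a row of the posterior probability matrix can be sketched as:

```python
import numpy as np

def posterior_matrix(loglike):
    """Softmax each frame's log-likelihood row into a posterior distribution.

    loglike: (N, K) log-likelihoods of N frames under K Gaussians.
    Returns an (N, K) probability matrix with rows summing to 1."""
    shifted = loglike - loglike.max(axis=1, keepdims=True)   # numerical stability
    p = np.exp(shifted)                                      # Softmax numerator
    return p / p.sum(axis=1, keepdims=True)                  # normalization
```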
3) Extract the current voiceprint feature vector: first compute the first-order and second-order statistics. The first-order statistic can be obtained by summing over the rows of the probability matrix:

Gamma_i = sum_j loglike_(j,i),

where Gamma_i is the i-th element of the first-order statistic vector, and loglike_(j,i) is the element in row j, column i of the probability matrix.
The second-order statistic can be obtained by multiplying the transpose of the probability matrix by the data matrix:

X = Loglike^T * feats,

where X is the second-order statistic matrix, Loglike is the probability matrix, and feats is the feature data matrix.
After the first-order and second-order statistics are computed, the linear and quadratic terms are computed in parallel, and the current voiceprint feature vector is then calculated from the linear and quadratic terms.
In a preferred embodiment, on the basis of the above embodiments, the step in step S3 of calculating the distance between the current voiceprint feature vector and the determined standard voiceprint feature vector using a predetermined distance calculation formula and verifying the user's identity according to that distance includes: calculating the cosine distance between the current voiceprint feature vector and the determined standard voiceprint feature vector:

cos(A, B) = (A · B) / (||A|| * ||B||),

where A is the standard voiceprint feature vector and B is the current voiceprint feature vector. If the cosine distance is less than or equal to a preset distance threshold, identity verification passes; if the cosine distance is greater than the preset distance threshold, identity verification fails.
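The cosine-distance check can be sketched as follows. Since the text treats a small "cosine distance" as a good match, the distance is read here as 1 minus the cosine similarity, so that smaller means more alike; the threshold value is an illustrative stand-in for the preset one:

```python
import numpy as np

def verify(standard_vec, current_vec, threshold=0.4):
    """Pass if the cosine distance (1 - cosine similarity) is at or below
    the preset threshold. The 0.4 threshold is illustrative only."""
    cos_sim = np.dot(standard_vec, current_vec) / (
        np.linalg.norm(standard_vec) * np.linalg.norm(current_vec))
    return bool((1.0 - cos_sim) <= threshold)
```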
The present invention also provides another server. This server is similar in hardware architecture to the server of Fig. 1 above: it includes a memory and a processor connected to the memory, and is connected to external terminal devices through a network interface. The difference is that the memory stores a voiceprint-recognition-based identity verification system that can run on the processor; when this system is executed by the processor, the following steps are implemented:
S101: after the speech data of the user undergoing identity verification is received, obtain the voiceprint features of the speech data, and construct the corresponding voiceprint feature vector based on the voiceprint features;
In this embodiment, the speech data is collected by a voice capture device (for example, a microphone), and the voice capture device sends the collected speech data to the voiceprint-recognition-based identity verification system.

When collecting speech data, interference from ambient noise and from the voice capture device itself should be prevented as much as possible. The voice capture device should keep a suitable distance from the user, a capture device with high distortion should be avoided where possible, the power supply should preferably be mains power, and the current should be kept stable; a sensor should be used for telephone recordings. Before the voiceprint features are extracted from the speech data, noise reduction may be applied to the speech data to further reduce interference. So that the voiceprint features can be extracted, the collected speech data is speech data of a preset data length, or speech data longer than the preset data length.
Voiceprint features come in multiple types, such as wideband voiceprints, narrowband voiceprints, and amplitude voiceprints; the voiceprint feature of this embodiment is preferably the Mel Frequency Cepstrum Coefficient (MFCC) of the speech data. When constructing the corresponding voiceprint feature vector, the voiceprint features of the speech data are assembled into a feature data matrix, and this feature data matrix is the voiceprint feature vector of the speech data.
S102: input the voiceprint feature vector into a pre-trained background channel model, so as to construct the current voiceprint discriminant vector corresponding to the speech data;

The voiceprint feature vector is input into the pre-trained background channel model. Preferably, the background channel model is a Gaussian mixture model; the background channel model processes the voiceprint feature vector to produce the corresponding current voiceprint discriminant vector (i.e., the i-vector).
Specifically, the calculation process includes:
1) Select Gaussian models: first, using the parameters of the universal background channel model, compute the log-likelihood of each frame of data under the different Gaussian models, sort each column of the log-likelihood matrix in parallel, and select the top N Gaussian models, finally obtaining, for each frame of data, a matrix of its values in the Gaussian mixture model:

Loglike = E(X) * D(X)^-1 * X^T - 0.5 * D(X)^-1 * (X^.2)^T,

where Loglike is the log-likelihood matrix, E(X) is the mean matrix trained from the universal background channel model, D(X) is the covariance matrix, X is the data matrix, and X^.2 denotes the element-wise square of the matrix.
2) Compute posterior probabilities: for each frame of data X, compute X*X^T to obtain a symmetric matrix, which can be reduced to a lower triangular matrix whose elements are arranged in order into a single row, becoming a vector whose dimension is the number of frames N times the number of elements of the lower triangular matrix; the vectors of all frames are combined into a new data matrix. Meanwhile, the covariance matrix used for computing probabilities in the universal background model is likewise reduced to a lower triangular matrix, becoming a matrix shaped like the new data matrix. Using the mean matrix and covariance matrix of the universal background channel model, the log-likelihood of each frame of data under its selected Gaussian models is computed, followed by a Softmax regression and finally a normalization operation, yielding each frame's posterior probability distribution over the Gaussian mixture model; the probability distribution vectors of all frames form the probability matrix.
3) Extract the current voiceprint discriminant vector: first compute the first-order and second-order statistics. The first-order statistic can be obtained by summing over the rows of the probability matrix:

Gamma_i = sum_j loglike_(j,i),

where Gamma_i is the i-th element of the first-order statistic vector, and loglike_(j,i) is the element in row j, column i of the probability matrix.
The second-order statistic can be obtained by multiplying the transpose of the probability matrix by the data matrix:

X = Loglike^T * feats,

where X is the second-order statistic matrix, Loglike is the probability matrix, and feats is the feature data matrix.
After the first-order and second-order statistics are computed, the linear and quadratic terms are computed in parallel, and the current voiceprint discriminant vector is then calculated from the linear and quadratic terms.
Preferably, the background channel model is a Gaussian mixture model, and before the above step S1 the following is included:

obtaining a predetermined number of speech data samples, obtaining the voiceprint features corresponding to each speech data sample, and constructing the voiceprint feature vector corresponding to each speech data sample based on those features;

dividing the voiceprint feature vectors corresponding to the speech data samples into a training set of a first ratio and a validation set of a second ratio, the sum of the first ratio and the second ratio being less than or equal to 1;

training the Gaussian mixture model using the voiceprint feature vectors in the training set, and after training is completed, verifying the accuracy of the trained Gaussian mixture model using the validation set;

if the accuracy is greater than a preset threshold, model training ends and the trained Gaussian mixture model is used as the background channel model of step S2; or, if the accuracy is less than or equal to the preset threshold, increasing the number of speech data samples and retraining based on the increased samples.
When the Gaussian mixture model is trained using the voiceprint feature vectors in the training set, the likelihood of an extracted D-dimensional voiceprint feature under K Gaussian components can be expressed as:

P(x) = sum_{k=1..K} w_k * p(x | k),

where P(x) is the probability (the Gaussian mixture model) that a speech data sample is generated by the Gaussian mixture model, w_k is the weight of each Gaussian model, p(x | k) is the probability that the sample is generated by the k-th Gaussian model, and K is the number of Gaussian models.
The parameters of the whole Gaussian mixture model can be expressed as {w_i, μ_i, Σ_i}, where w_i is the weight of the i-th Gaussian model, μ_i is its mean, and Σ_i is its covariance. The Gaussian mixture model is trained with the unsupervised EM algorithm. After training is completed, the weight vector, constant vector, N covariance matrices, mean-times-covariance matrices, and so on of the Gaussian mixture model are obtained; together these constitute one trained Gaussian mixture model.
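The train/validate loop above can be sketched with scikit-learn's GaussianMixture standing in for the background channel model. The split ratio, the mean validation log-likelihood used as an "accuracy" proxy, and the threshold are illustrative assumptions, not this embodiment's values:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def train_background_model(samples, n_components=8, train_ratio=0.7,
                           score_threshold=-100.0, seed=0):
    """Split voiceprint vectors into training/validation sets, fit a GMM
    with EM, and accept it only if the validation score clears a threshold.

    samples: (n, D) array of voiceprint feature vectors.
    Returns (model, score); model is None when the score is too low,
    signalling that more speech data samples should be gathered."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(samples))
    cut = int(train_ratio * len(samples))
    train, valid = samples[idx[:cut]], samples[idx[cut:]]
    gmm = GaussianMixture(n_components=n_components, random_state=seed).fit(train)
    score = gmm.score(valid)             # mean per-sample log-likelihood
    return (gmm, score) if score > score_threshold else (None, score)
```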
S103: calculate the spatial distance between the current voiceprint discriminant vector and the prestored standard voiceprint discriminant vector of the user, verify the user's identity based on that distance, and generate the verification result.
There are multiple distances between vectors, including the cosine distance and the Euclidean distance. Preferably, the spatial distance of this embodiment is the cosine distance, which uses the cosine of the angle between two vectors in vector space as a measure of the difference between two individuals.
The standard voiceprint discriminant vector is a voiceprint discriminant vector obtained and stored in advance; it is stored together with the identification information of its corresponding user and can accurately characterize the user's identity. Before the spatial distance is calculated, the stored voiceprint discriminant vector is obtained according to the identification information provided by the user.

When the calculated spatial distance is less than or equal to a preset distance threshold, verification succeeds; otherwise, verification fails.
Compared with the prior art, the pre-trained background channel model of this embodiment is obtained by mining and comparative training on a large amount of speech data. While retaining the user's voiceprint features to the greatest extent, this model accurately characterizes the background voiceprint features present when the user speaks, removes those features during recognition, and extracts the intrinsic features of the user's voice, which can significantly improve the accuracy of user identity verification and improve its efficiency. In addition, this embodiment makes full use of the voiceprint features in speech related to the vocal tract; these voiceprint features need not be limited to any particular text, so recognition and verification enjoy greater flexibility.
As shown in Fig. 2, Fig. 2 is a schematic flowchart of an embodiment of the identity verification method of the present invention. The identity verification method comprises the following steps:
Step S1: after the identity verification request carrying the identity sent by the client is received, randomly send the client a voice acquisition text for user response;

The user operates on the client, which sends the identity verification request carrying the identity to the server; after the server receives the identity verification request, it randomly sends the client a voice acquisition text for user response.
The identity can be the user's identification card number or phone number, etc. There are multiple voice acquisition texts for user response, and the server randomly sends one of them to the client, the purpose being to prevent others from passing identity verification with an existing fake recording. The voice acquisition text can be a text corresponding to a random password to be recorded by voice, or a text posing a question about the random password to be recorded by voice. For example, the voice acquisition text may be "please record the digit string * * *", and the user, responding according to the voice acquisition text, records the voice "please record the digit string * * *"; as another example, the voice acquisition text may be the question text "where is your birthplace", and the user, responding according to the voice acquisition text, records "my birthplace is * * *".
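Randomly choosing one voice acquisition text per request can be sketched as below; the prompt list and the six-digit random code are illustrative stand-ins for the embodiment's texts:

```python
import random

PROMPTS = [
    "Please record the digit string {code}",   # random-password prompt
    "Where is your birthplace",                # question prompt
]

def make_prompt():
    """Pick one voice acquisition text at random, filling in a fresh
    random code so a prepared fake recording cannot match it."""
    text = random.choice(PROMPTS)
    code = "".join(random.choice("0123456789") for _ in range(6))
    return text.format(code=code) if "{code}" in text else text
```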
Step S2: receive the password voice reported by the user that the client sends based on the voice acquisition text, perform character recognition on the password voice, and recognize the password characters corresponding to the password voice;

In this embodiment, the way the user records the password voice at the client can be: according to the voice acquisition text, after the user presses a predetermined physical button or virtual key, the recording unit is controlled to record voice; after the user releases the button, voice recording stops, and the recorded voice is sent to the server as the password voice.
When recording the password voice, interference from ambient noise and from the voice recording equipment should be prevented as much as possible. The voice recording equipment should keep a suitable distance from the user, recording equipment with high distortion should be avoided where possible, the power supply should preferably be mains power, and the current should be kept stable; a sensor should be used for telephone recordings.
After the server receives the password voice, it performs character recognition on the password voice, i.e., converts the password voice into individual characters. The password voice may be converted into characters directly, or noise reduction may first be applied to the password voice to further reduce interference. So that the voiceprint features of the password voice can be extracted, the recorded password voice is speech data of a preset data length, or speech data longer than the preset data length.
Step S3: if the password characters are consistent with the standard password characters corresponding to the voice acquisition text, construct the current voiceprint feature vector of the password voice, determine the standard voiceprint feature vector corresponding to the user's identity according to a predetermined mapping between identities and standard voiceprint feature vectors, calculate the distance between the current voiceprint feature vector and the determined standard voiceprint feature vector using a predetermined distance calculation formula, and verify the user's identity according to that distance.
In this embodiment, there are multiple voice acquisition texts, and multiple standard password characters are prestored on the server; the voice acquisition texts correspond one-to-one with the standard password characters. After the password characters corresponding to the password voice are recognized, the standard password characters corresponding to the transmitted voice acquisition text are obtained, and the server judges whether the recognized password characters are consistent with the corresponding standard password characters.
If the recognized password characters are consistent with the corresponding standard password characters, the current voiceprint feature vector of the password voice is then constructed. Voiceprint features come in multiple types, such as wideband voiceprints, narrowband voiceprints, and amplitude voiceprints; the voiceprint feature of this embodiment is preferably the Mel Frequency Cepstrum Coefficient (MFCC) of the speech data. When constructing the corresponding voiceprint feature vector, the voiceprint features of the password voice are assembled into a feature data matrix, and this feature data matrix is the voiceprint feature vector of the password voice.
There are multiple distances between vectors, including the cosine distance and the Euclidean distance. Preferably, in this embodiment the distance between the current voiceprint feature vector and the determined standard voiceprint feature vector is the cosine distance, which uses the cosine of the angle between two vectors in vector space as a measure of the difference between two individuals.

The standard voiceprint feature vector is a prestored voiceprint feature vector. Before the distance is calculated, the corresponding standard voiceprint feature vector is obtained according to the user identity.

When the calculated distance is less than or equal to a preset distance threshold, verification succeeds; otherwise, verification fails.
In a preferred embodiment, to prevent the audio quality of the password voice from affecting the result of voiceprint feature verification, on the basis of the embodiment of Fig. 2 above, step S2 includes: receiving the password voice reported by the user and sent by the client, and analyzing whether the password voice is usable; if the password voice is unusable, prompting the client to re-record the password voice; or, if the password voice is usable, performing character recognition on the password voice.
Whether the password voice is usable is based on the following analyses: whether the duration of the part in which the user speaks exceeds a preset duration, whether the background noise volume of the password voice is below a first preset volume, and/or whether the speaking volume exceeds a second preset volume. If all of the above analysis results are satisfied, the password voice is usable and subsequent operations such as character recognition can be performed. Conversely, if the duration of the speaking part is less than the preset duration, or the background noise volume of the password voice is greater than or equal to the first preset volume, or the speaking volume is less than or equal to the second preset volume, the password voice is unusable, and the client is prompted to re-record the password voice.
In a preferred embodiment, on the basis of the embodiment of Fig. 2 above, the identity verification method further includes the following steps: if the password characters are inconsistent with the standard password characters corresponding to the voice acquisition text, randomly sending a voice acquisition text for user response to the client again; accumulating the number of voice acquisition texts sent to the client, and terminating the response to the identity verification request if that number is greater than or equal to a preset number of times.
If the user recorded a wrong password voice, i.e., the password characters are inconsistent with the standard password characters corresponding to the voice acquisition text, the server can again randomly send the client a voice acquisition text for user response, giving the user another chance. Meanwhile, to prevent excessive password verification from wasting computer resources, the number of password verifications can be limited to less than a preset number of times; that is, the accumulated number of voice acquisition texts sent to the client must stay below the preset number, and when that number is greater than or equal to the preset number, the response to the identity verification request is terminated.
In a preferred embodiment, on the basis of the above embodiments, the step of constructing the current voiceprint feature vector of the password voice in step S3 includes: processing the password voice with a predetermined filter to extract voiceprint features of a preset type, and constructing the voiceprint feature vector corresponding to the password voice based on the extracted preset-type voiceprint features; and inputting the constructed voiceprint feature vector into a pre-trained background channel model to construct the current voiceprint feature vector.
The predetermined filter is preferably a Mel filter. First, pre-emphasis, framing, and windowing are applied to the password voice. In this embodiment, after the password voice of the user undergoing identity verification is received, the password voice is processed. Pre-emphasis is in fact a high-pass filtering operation that filters out low-frequency data so that the high-frequency characteristics of the password voice stand out. Specifically, the transfer function of the high-pass filter is H(Z) = 1 - α*Z^-1, where Z is the speech data and α is a constant factor, with a preferred value of 0.97. Because a speech signal is stationary only over short periods of time, a segment of the speech signal is divided into N short-time segments (i.e., N frames), and to avoid losing the continuity characteristics of the sound, adjacent frames overlap by a region that is generally 1/2 of the frame length. After the password voice is framed, each frame is processed as a stationary signal; however, owing to the Gibbs effect, the start and end of each frame are discontinuous, so after framing the signal deviates further from the original speech. Windowing must therefore be applied to the password voice.
A Fourier transform is applied to each windowed frame to obtain the corresponding spectrum; the spectrum is fed into the Mel filter bank, which outputs the Mel spectrum. Cepstral analysis is then performed on the Mel spectrum to obtain the Mel-frequency cepstral coefficients (MFCC): the cepstral analysis consists of, for example, taking the logarithm and applying an inverse transform, the inverse transform generally being realized by a discrete cosine transform (DCT), and the 2nd through 13th DCT coefficients are taken as the MFCC. The MFCC of a frame are the voiceprint feature of that frame of password speech; the per-frame MFCC are assembled into a feature data matrix, and this feature data matrix is the voiceprint feature vector of the password speech.
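The spectrum, Mel filtering and cepstral-analysis steps can be sketched as follows; the filter-bank size, FFT length and sample rate are illustrative assumptions, and only the 2nd through 13th DCT coefficients are kept, as the text describes.

```python
import numpy as np

def mel_filterbank(n_filters=26, n_fft=512, sr=16000):
    """Triangular filters evenly spaced on the Mel scale."""
    mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    inv_mel = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    pts = inv_mel(np.linspace(mel(0), mel(sr / 2), n_filters + 2))
    bins = np.floor((n_fft + 1) * pts / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        fb[i - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)   # rising edge
        fb[i - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)   # falling edge
    return fb

def mfcc(frames, n_fft=512, n_filters=26, n_ceps=12):
    """Spectrum -> Mel spectrum -> log -> DCT; keep coefficients 2..13."""
    spec = np.abs(np.fft.rfft(frames, n_fft)) ** 2           # power spectrum
    mel_spec = spec @ mel_filterbank(n_filters, n_fft).T
    log_mel = np.log(mel_spec + 1e-10)                       # cepstral analysis: log
    n = log_mel.shape[1]
    k = np.arange(n)
    dct = np.cos(np.pi / n * (k[:, None] + 0.5) * k[None, :])  # DCT-II basis
    ceps = log_mel @ dct
    return ceps[:, 1:1 + n_ceps]                             # 2nd..13th coefficients

rng = np.random.default_rng(1)
frames = rng.standard_normal((79, 400))   # stand-in windowed frames
features = mfcc(frames)                   # one 12-dim MFCC row per frame
print(features.shape)                     # (79, 12)
```

The rows of `features` are the per-frame MFCC; stacked together they form the feature data matrix the text calls the voiceprint feature vector.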
Then the voiceprint feature vector is input into the background channel model generated by training in advance; preferably, the background channel model is a Gaussian mixture model. The background channel model is used to process the voiceprint feature vector and derive the corresponding current voiceprint feature vector (i.e. the i-vector).
Specifically, the calculation comprises the following steps:
1) Gaussian component selection: first, the log-likelihood of each frame of data under each Gaussian component is computed from the parameters of the universal background channel model; the columns of the log-likelihood matrix are sorted in parallel and the top-N Gaussian components are selected, finally yielding, for each frame of data, a matrix of its values over the Gaussian mixture:
Loglike = E(X) · D(X)⁻¹ · Xᵀ − 0.5 · D(X)⁻¹ · (X.^2)ᵀ,
where Loglike is the log-likelihood matrix, E(X) is the mean matrix trained from the universal background channel model, D(X) is the covariance matrix, X is the data matrix, and X.^2 denotes the element-wise square of X.
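Assuming diagonal covariances, the per-component log-likelihood formula above (keeping only the terms that depend on the frame) and the top-N component selection can be sketched as follows; the 64-component mixture and N = 5 are illustrative assumptions.

```python
import numpy as np

def component_loglikes(X, means, variances):
    """Frame-dependent part of the log-likelihood per Gaussian component:
    E(X) D(X)^-1 X^T - 0.5 D(X)^-1 (X.^2)^T, with diagonal covariances."""
    prec = 1.0 / variances                                   # D(X)^-1, shape (K, D)
    return X @ (means * prec).T - 0.5 * (X ** 2) @ prec.T    # (n_frames, K)

def top_n_components(loglike, n=5):
    """Indices of the N best-scoring Gaussian components for each frame."""
    return np.argsort(loglike, axis=1)[:, ::-1][:, :n]

rng = np.random.default_rng(2)
X = rng.standard_normal((79, 12))             # MFCC feature data matrix
means = rng.standard_normal((64, 12))         # UBM mean matrix, 64 components
variances = rng.uniform(0.5, 2.0, (64, 12))   # UBM diagonal covariances
ll = component_loglikes(X, means, variances)
top = top_n_components(ll)
print(ll.shape, top.shape)                    # (79, 64) (79, 5)
```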
2) Posterior probability computation: for each frame of data X, the product X·Xᵀ is computed, giving a symmetric matrix that can be reduced to a lower triangular matrix; its elements are arranged in order into a single row, so that the data become a vector, and the vectors of all frames are combined into a new data matrix of dimension (number of frames) × (number of lower-triangular elements). Meanwhile, the covariance matrices used for computing probabilities in the universal background model are each likewise reduced to a lower triangular matrix and arranged into a matrix of the same form as the new data matrix. The mean matrix and covariance matrix of the universal background channel model are then used to compute, for each frame of data, the log-likelihood under its selected Gaussian components; a Softmax regression is applied and the result normalized, giving each frame's posterior distribution over the Gaussian mixture, and the per-frame probability distribution vectors are assembled into a probability matrix.
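The Softmax-and-normalize step that turns the per-frame log-likelihoods into posterior distributions over the mixture can be sketched as:

```python
import numpy as np

def posteriors(loglike):
    """Softmax over components followed by normalization, giving each frame's
    posterior distribution over the Gaussian mixture."""
    shifted = loglike - loglike.max(axis=1, keepdims=True)   # numerical stability
    p = np.exp(shifted)
    return p / p.sum(axis=1, keepdims=True)

rng = np.random.default_rng(3)
ll = rng.standard_normal((79, 64))     # stand-in log-likelihood matrix
gamma = posteriors(ll)                 # probability matrix: one distribution per frame
print(np.allclose(gamma.sum(axis=1), 1.0))   # True: each row sums to 1
```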
3) Extraction of the current voiceprint feature vector: first the first-order and second-order coefficients are computed. The first-order coefficient is obtained by summing the probability matrix over its rows:
Γᵢ = Σⱼ loglikeⱼᵢ,
where Γᵢ is the i-th element of the first-order coefficient vector and loglikeⱼᵢ is the element in row j, column i of the probability matrix.
The second-order coefficient is obtained by multiplying the transpose of the probability matrix by the data matrix:
X = Loglikeᵀ · feats,
where X is the second-order coefficient matrix, Loglike is the probability matrix, and feats is the feature data matrix.
After the first-order and second-order coefficients have been computed, the linear and quadratic terms are computed in parallel, and the current voiceprint feature vector is then calculated from the linear and quadratic terms.
In a preferred embodiment, on the basis of the above embodiment, the step in S3 of calculating the distance between the current voiceprint feature vector and the determined standard voiceprint feature vector using the predetermined distance calculation formula and verifying the user's identity according to the distance comprises: computing the cosine distance between the current voiceprint discriminant vector and the determined standard voiceprint feature vector:
cos(x̄, ȳ) = (x̄ · ȳ) / (|x̄| |ȳ|),
where x̄ is the standard voiceprint feature vector and ȳ is the current voiceprint feature vector. If the cosine distance is less than or equal to the preset distance threshold, identity verification passes; if the cosine distance is greater than the preset distance threshold, identity verification fails.
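A sketch of the cosine-distance decision rule described above. Since the patent's formula image is not reproduced here, the distance is taken as 1 minus the cosine similarity (an assumption consistent with "distance ≤ threshold → pass"), and the 0.25 threshold is purely illustrative.

```python
import numpy as np

def cosine_similarity(a, b):
    """cos(a, b) = a·b / (|a| |b|)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def verify(current, standard, threshold=0.25):
    """Pass when the cosine distance (1 - similarity) is within the threshold."""
    distance = 1.0 - cosine_similarity(current, standard)
    return distance <= threshold

standard = np.array([1.0, 2.0, 3.0])                 # enrolled voiceprint vector
print(verify(np.array([1.1, 2.0, 2.9]), standard))   # True: nearly parallel vectors
print(verify(np.array([-1.0, 0.0, 1.0]), standard))  # False: far apart
```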
In a preferred embodiment, on the basis of the above embodiment, the background channel model is a Gaussian mixture model, and training the background channel model comprises:
obtaining a predetermined number of speech data samples, obtaining the voiceprint features corresponding to each speech data sample, and constructing, from those voiceprint features, the voiceprint feature vector corresponding to each speech data sample;
dividing the voiceprint feature vectors of the speech data samples into a training set of a first ratio and a validation set of a second ratio, the sum of the first ratio and the second ratio being less than or equal to 1;
training the Gaussian mixture model with the voiceprint feature vectors in the training set and, after training is complete, verifying the accuracy of the trained Gaussian mixture model with the validation set;
if the accuracy is greater than a preset threshold, ending the model training and using the trained Gaussian mixture model as the background channel model to be applied; or, if the accuracy is less than or equal to the preset threshold, increasing the number of speech data samples and retraining on the enlarged sample set.
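The split-train-validate-grow loop above can be sketched as follows. The model trainer and accuracy evaluator here are deliberately trivial stubs (accuracy is pretended to rise with the amount of training data) purely to exercise the control flow; they are not a real GMM trainer.

```python
import numpy as np

def split(samples, train_ratio=0.7):
    """Shuffle and split the voiceprint vectors into a training set and a
    validation set; the two ratios sum to at most 1, as the text requires."""
    rng = np.random.default_rng(0)
    idx = rng.permutation(len(samples))
    cut = int(train_ratio * len(samples))
    return samples[idx[:cut]], samples[idx[cut:]]

def train_until_accurate(samples, train_model, accuracy_on, threshold=0.9,
                         grow=lambda s: np.vstack([s, s])):
    """Train; if validation accuracy is too low, enlarge the sample set and retrain."""
    while True:
        train_set, val_set = split(samples)
        model = train_model(train_set)
        if accuracy_on(model, val_set) > threshold:
            return model
        samples = grow(samples)   # "increase the number of speech data samples"

# Stub trainer/evaluator: the "model" just records its training-set size,
# and accuracy is faked to grow with the amount of training data.
samples = np.ones((100, 12))
model = train_until_accurate(
    samples,
    train_model=lambda s: {"n": len(s)},
    accuracy_on=lambda m, v: min(1.0, m["n"] / 500.0),
)
print(model["n"])   # the loop doubled the data until accuracy cleared the threshold
```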
When the Gaussian mixture model is trained with the voiceprint feature vectors in the training set, the likelihood of an extracted D-dimensional voiceprint feature under the K Gaussian components can be expressed as:
P(x) = Σₖ₌₁ᴷ wₖ · p(x | k),
where P(x) is the probability that a speech data sample is generated by the Gaussian mixture model, wₖ is the weight of the k-th Gaussian component, p(x | k) is the probability that the sample is generated by the k-th Gaussian component, and K is the number of Gaussian components.
The parameters of the whole Gaussian mixture model can be expressed as {wᵢ, μᵢ, Σᵢ}, where wᵢ is the weight of the i-th Gaussian component, μᵢ its mean, and Σᵢ its covariance. The Gaussian mixture model is trained with the unsupervised EM algorithm. After training is complete, the weight vector of the Gaussian mixture model, the constant vector, the N covariance matrices, the means multiplied by the covariances, and so on are obtained; together they constitute one trained Gaussian mixture model.
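The mixture likelihood P(x) = Σₖ wₖ p(x | k) can be evaluated directly; this sketch assumes diagonal-covariance components, and the toy two-component one-dimensional mixture is illustrative.

```python
import numpy as np

def gmm_likelihood(x, weights, means, variances):
    """P(x) = sum_k w_k * p(x | k) with diagonal-covariance Gaussian components."""
    diff2 = (x - means) ** 2                    # squared deviations, shape (K, D)
    log_p = -0.5 * (np.log(2 * np.pi * variances) + diff2 / variances).sum(axis=1)
    return float(weights @ np.exp(log_p))       # weighted sum over components

# Toy mixture: two equal-weight unit-variance components at 0 and 4.
weights = np.array([0.5, 0.5])
means = np.array([[0.0], [4.0]])
variances = np.array([[1.0], [1.0]])
p = gmm_likelihood(np.array([0.0]), weights, means, variances)
print(round(p, 4))   # ~0.5 * N(0; 0, 1), since the far component contributes little
```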
The present invention also provides a computer-readable storage medium on which an identity verification system is stored; when the identity verification system is executed by a processor, the steps of the identity verification method described above are realized.
The serial numbers of the above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.
Through the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be realized by software plus the necessary general-purpose hardware platform, and of course also by hardware, but in many cases the former is the better implementation. Based on such an understanding, the technical solution of the present invention, or the part of it that contributes over the prior art, can be embodied in the form of a software product stored in a storage medium (such as ROM/RAM, magnetic disk or optical disc) and including a number of instructions that cause a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device or the like) to perform the methods described in the embodiments of the present invention.
The above are only preferred embodiments of the present invention and are not intended to limit its scope; every equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present invention, whether applied directly or indirectly in other related technical fields, is likewise included within the scope of protection of the present invention.
Claims (11)
1. A server, characterized in that the server comprises a memory and a processor connected to the memory, the memory storing an identity verification system runnable on the processor, the identity verification system, when executed by the processor, realizing the following steps:
S1: after receiving an identity verification request carrying an identity identifier sent by a client, randomly sending to the client a voice acquisition text for the user to read in response;
S2: receiving the password speech, sent by the client, that the user reads aloud from the voice acquisition text, and performing character recognition on the password speech to identify the password characters corresponding to the password speech;
S3: if the password characters are consistent with the standard password characters corresponding to the voice acquisition text, constructing the current voiceprint feature vector of the password speech, determining the standard voiceprint feature vector corresponding to the user's identity identifier according to a predetermined mapping between identity identifiers and standard voiceprint feature vectors, calculating the distance between the current voiceprint feature vector and the determined standard voiceprint feature vector with a predetermined distance calculation formula, and verifying the user's identity according to the distance.
2. The server according to claim 1, characterized in that the step S2 comprises:
receiving the password speech, sent by the client, that the user reads aloud; analyzing whether the password speech is usable; if the password speech is unusable, prompting the client to re-record the password speech; or, if the password speech is usable, performing character recognition on the password speech.
3. The server according to claim 1 or 2, characterized in that the identity verification system, when executed by the processor, further realizes the following steps:
if the password characters are inconsistent with the standard password characters corresponding to the voice acquisition text, randomly sending the client another voice acquisition text for the user to read in response;
accumulating the number of times a voice acquisition text has been sent to the client and, if that number is greater than or equal to a preset number of times, terminating the response to the identity verification request.
4. The server according to claim 1 or 2, characterized in that the step of constructing the current voiceprint feature vector of the password speech comprises:
processing the password speech with a predetermined filter to extract voiceprint features of a preset type, and constructing the voiceprint feature vector corresponding to the password speech from the extracted voiceprint features of the preset type;
inputting the constructed voiceprint feature vector into a background channel model generated by training in advance, to construct the current voiceprint feature vector;
and the step of calculating the distance between the current voiceprint feature vector and the determined standard voiceprint feature vector with the predetermined distance calculation formula and verifying the user's identity according to the distance comprises:
calculating the cosine distance between the current voiceprint discriminant vector and the determined standard voiceprint feature vector: cos(x̄, ȳ) = (x̄ · ȳ) / (|x̄| |ȳ|), where x̄ is the standard voiceprint feature vector and ȳ is the current voiceprint feature vector;
if the cosine distance is less than or equal to a preset distance threshold, identity verification passes;
if the cosine distance is greater than the preset distance threshold, identity verification fails.
5. A server, characterized in that the server comprises a memory and a processor connected to the memory, the memory storing a system for identity verification based on voiceprint recognition that is runnable on the processor, the system for identity verification based on voiceprint recognition, when executed by the processor, realizing the following steps:
S101: after the speech data of a user undergoing identity verification is received, obtaining the voiceprint features of the speech data and constructing the corresponding voiceprint feature vector from the voiceprint features;
S102: inputting the voiceprint feature vector into a background channel model generated by training in advance, to construct the current voiceprint discriminant vector corresponding to the speech data;
S103: calculating the spatial distance between the current voiceprint discriminant vector and the pre-stored standard voiceprint discriminant vector of the user, verifying the user's identity based on that distance, and generating a verification result.
6. An identity verification method, characterized in that the identity verification method comprises:
S1: after receiving an identity verification request carrying an identity identifier sent by a client, randomly sending to the client a voice acquisition text for the user to read in response;
S2: receiving the password speech, sent by the client, that the user reads aloud from the voice acquisition text, and performing character recognition on the password speech to identify the password characters corresponding to the password speech;
S3: if the password characters are consistent with the standard password characters corresponding to the voice acquisition text, constructing the current voiceprint feature vector of the password speech, determining the standard voiceprint feature vector corresponding to the user's identity identifier according to a predetermined mapping between identity identifiers and standard voiceprint feature vectors, calculating the distance between the current voiceprint feature vector and the determined standard voiceprint feature vector with a predetermined distance calculation formula, and verifying the user's identity according to the distance.
7. The identity verification method according to claim 6, characterized in that the step S2 comprises:
receiving the password speech, sent by the client, that the user reads aloud; analyzing whether the password speech is usable; if the password speech is unusable, prompting the client to re-record the password speech; or, if the password speech is usable, performing character recognition on the password speech.
8. The identity verification method according to claim 6 or 7, characterized in that after the step S2 the method further comprises:
if the password characters are inconsistent with the standard password characters corresponding to the voice acquisition text, randomly sending the client another voice acquisition text for the user to read in response;
accumulating the number of times a voice acquisition text has been sent to the client and, if that number is greater than or equal to a preset number of times, terminating the response to the identity verification request.
9. The identity verification method according to claim 6 or 7, characterized in that the step of constructing the current voiceprint feature vector of the password speech comprises:
processing the password speech with a predetermined filter to extract voiceprint features of a preset type, and constructing the voiceprint feature vector corresponding to the password speech from the extracted voiceprint features of the preset type;
inputting the constructed voiceprint feature vector into a background channel model generated by training in advance, to construct the current voiceprint feature vector;
and the step of calculating the distance between the current voiceprint feature vector and the determined standard voiceprint feature vector with the predetermined distance calculation formula and verifying the user's identity according to the distance comprises:
calculating the cosine distance between the current voiceprint discriminant vector and the determined standard voiceprint feature vector: cos(x̄, ȳ) = (x̄ · ȳ) / (|x̄| |ȳ|), where x̄ is the standard voiceprint feature vector and ȳ is the current voiceprint feature vector;
if the cosine distance is less than or equal to a preset distance threshold, identity verification passes;
if the cosine distance is greater than the preset distance threshold, identity verification fails.
10. The identity verification method according to claim 9, characterized in that the background channel model is a Gaussian mixture model and training the background channel model comprises:
obtaining a predetermined number of speech data samples, obtaining the voiceprint features corresponding to each speech data sample, and constructing, from those voiceprint features, the voiceprint feature vector corresponding to each speech data sample;
dividing the voiceprint feature vectors of the speech data samples into a training set of a first ratio and a validation set of a second ratio, the sum of the first ratio and the second ratio being less than or equal to 1;
training the Gaussian mixture model with the voiceprint feature vectors in the training set and, after training is complete, verifying the accuracy of the trained Gaussian mixture model with the validation set;
if the accuracy is greater than a preset threshold, ending the model training and using the trained Gaussian mixture model as the background channel model; or, if the accuracy is less than or equal to the preset threshold, increasing the number of speech data samples and retraining on the enlarged sample set.
11. A computer-readable storage medium, characterized in that an identity verification system is stored on the computer-readable storage medium, and when the identity verification system is executed by a processor, the steps of the identity verification method according to any one of claims 6 to 10 are realized.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2017/105031 WO2018166187A1 (en) | 2017-03-13 | 2017-09-30 | Server, identity verification method and system, and a computer-readable storage medium |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710147695.XA CN107068154A (en) | 2017-03-13 | 2017-03-13 | The method and system of authentication based on Application on Voiceprint Recognition |
CN201710147695X | 2017-03-13 |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107517207A true CN107517207A (en) | 2017-12-26 |
Family
ID=59622093
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710147695.XA Pending CN107068154A (en) | 2017-03-13 | 2017-03-13 | The method and system of authentication based on Application on Voiceprint Recognition |
CN201710715433.9A Pending CN107517207A (en) | 2017-03-13 | 2017-08-20 | Server, auth method and computer-readable recording medium |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710147695.XA Pending CN107068154A (en) | 2017-03-13 | 2017-03-13 | The method and system of authentication based on Application on Voiceprint Recognition |
Country Status (3)
Country | Link |
---|---|
CN (2) | CN107068154A (en) |
TW (1) | TWI641965B (en) |
WO (2) | WO2018166112A1 (en) |
Cited By (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108091326A (en) * | 2018-02-11 | 2018-05-29 | 张晓雷 | A kind of method for recognizing sound-groove and system based on linear regression |
CN108447489A (en) * | 2018-04-17 | 2018-08-24 | 清华大学 | A kind of continuous voiceprint authentication method and system of band feedback |
CN108630208A (en) * | 2018-05-14 | 2018-10-09 | 平安科技(深圳)有限公司 | Server, auth method and storage medium based on vocal print |
CN108694952A (en) * | 2018-04-09 | 2018-10-23 | 平安科技(深圳)有限公司 | Electronic device, the method for authentication and storage medium |
CN108768654A (en) * | 2018-04-09 | 2018-11-06 | 平安科技(深圳)有限公司 | Auth method, server based on Application on Voiceprint Recognition and storage medium |
CN108834138A (en) * | 2018-05-25 | 2018-11-16 | 四川斐讯全智信息技术有限公司 | A kind of distribution method and system based on voice print database |
CN109087647A (en) * | 2018-08-03 | 2018-12-25 | 平安科技(深圳)有限公司 | Application on Voiceprint Recognition processing method, device, electronic equipment and storage medium |
CN109147797A (en) * | 2018-10-18 | 2019-01-04 | 平安科技(深圳)有限公司 | Client service method, device, computer equipment and storage medium based on Application on Voiceprint Recognition |
CN109256138A (en) * | 2018-08-13 | 2019-01-22 | 平安科技(深圳)有限公司 | Auth method, terminal device and computer readable storage medium |
CN109450850A (en) * | 2018-09-26 | 2019-03-08 | 深圳壹账通智能科技有限公司 | Auth method, device, computer equipment and storage medium |
CN109473108A (en) * | 2018-12-15 | 2019-03-15 | 深圳壹账通智能科技有限公司 | Auth method, device, equipment and storage medium based on Application on Voiceprint Recognition |
CN109545226A (en) * | 2019-01-04 | 2019-03-29 | 平安科技(深圳)有限公司 | A kind of audio recognition method, equipment and computer readable storage medium |
CN109816508A (en) * | 2018-12-14 | 2019-05-28 | 深圳壹账通智能科技有限公司 | Method for authenticating user identity, device based on big data, computer equipment |
CN110046910A (en) * | 2018-12-13 | 2019-07-23 | 阿里巴巴集团控股有限公司 | The method and apparatus for obtaining customer group relevant to particular customer |
CN110322888A (en) * | 2019-05-21 | 2019-10-11 | 平安科技(深圳)有限公司 | Credit card unlocking method, device, equipment and computer readable storage medium |
CN110334603A (en) * | 2019-06-06 | 2019-10-15 | 视联动力信息技术股份有限公司 | Authentication system |
WO2019218512A1 (en) * | 2018-05-14 | 2019-11-21 | 平安科技(深圳)有限公司 | Server, voiceprint verification method, and storage medium |
CN110971755A (en) * | 2019-11-18 | 2020-04-07 | 武汉大学 | Double-factor identity authentication method based on PIN code and pressure code |
CN111597531A (en) * | 2020-04-07 | 2020-08-28 | 北京捷通华声科技股份有限公司 | Identity authentication method and device, electronic equipment and readable storage medium |
CN111613230A (en) * | 2020-06-24 | 2020-09-01 | 泰康保险集团股份有限公司 | Voiceprint verification method, voiceprint verification device, voiceprint verification equipment and storage medium |
CN111710340A (en) * | 2020-06-05 | 2020-09-25 | 深圳市卡牛科技有限公司 | Method, device, server and storage medium for identifying user identity based on voice |
CN112669841A (en) * | 2020-12-18 | 2021-04-16 | 平安科技(深圳)有限公司 | Training method and device for multilingual speech generation model and computer equipment |
Families Citing this family (29)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107068154A (en) * | 2017-03-13 | 2017-08-18 | 平安科技(深圳)有限公司 | The method and system of authentication based on Application on Voiceprint Recognition |
CN107527620B (en) | 2017-07-25 | 2019-03-26 | 平安科技(深圳)有限公司 | Electronic device, the method for authentication and computer readable storage medium |
CN107993071A (en) * | 2017-11-21 | 2018-05-04 | 平安科技(深圳)有限公司 | Electronic device, auth method and storage medium based on vocal print |
CN108172230A (en) * | 2018-01-03 | 2018-06-15 | 平安科技(深圳)有限公司 | Voiceprint registration method, terminal installation and storage medium based on Application on Voiceprint Recognition model |
CN108154371A (en) * | 2018-01-12 | 2018-06-12 | 平安科技(深圳)有限公司 | Electronic device, the method for authentication and storage medium |
CN108269575B (en) * | 2018-01-12 | 2021-11-02 | 平安科技(深圳)有限公司 | Voice recognition method for updating voiceprint data, terminal device and storage medium |
CN108766444B (en) * | 2018-04-09 | 2020-11-03 | 平安科技(深圳)有限公司 | User identity authentication method, server and storage medium |
CN108806695A (en) * | 2018-04-17 | 2018-11-13 | 平安科技(深圳)有限公司 | Anti- fraud method, apparatus, computer equipment and the storage medium of self refresh |
CN109101801B (en) * | 2018-07-12 | 2021-04-27 | 北京百度网讯科技有限公司 | Method, apparatus, device and computer readable storage medium for identity authentication |
CN110867189A (en) * | 2018-08-28 | 2020-03-06 | 北京京东尚科信息技术有限公司 | Login method and device |
CN110880325B (en) * | 2018-09-05 | 2022-06-28 | 华为技术有限公司 | Identity recognition method and equipment |
CN109377662A (en) * | 2018-09-29 | 2019-02-22 | 途客易达(天津)网络科技有限公司 | Charging pile control method, device and electronic equipment |
CN109257362A (en) * | 2018-10-11 | 2019-01-22 | 平安科技(深圳)有限公司 | Method, apparatus, computer equipment and the storage medium of voice print verification |
CN109378002B (en) * | 2018-10-11 | 2024-05-07 | 平安科技(深圳)有限公司 | Voiceprint verification method, voiceprint verification device, computer equipment and storage medium |
CN109524026B (en) * | 2018-10-26 | 2022-04-26 | 北京网众共创科技有限公司 | Method and device for determining prompt tone, storage medium and electronic device |
CN109473105A (en) * | 2018-10-26 | 2019-03-15 | 平安科技(深圳)有限公司 | The voice print verification method, apparatus unrelated with text and computer equipment |
CN109360573A (en) * | 2018-11-13 | 2019-02-19 | 平安科技(深圳)有限公司 | Livestock method for recognizing sound-groove, device, terminal device and computer storage medium |
CN109493873A (en) * | 2018-11-13 | 2019-03-19 | 平安科技(深圳)有限公司 | Livestock method for recognizing sound-groove, device, terminal device and computer storage medium |
CN109636630A (en) * | 2018-12-07 | 2019-04-16 | 泰康保险集团股份有限公司 | Method, apparatus, medium and electronic equipment of the detection for behavior of insuring |
CN110298150B (en) * | 2019-05-29 | 2021-11-26 | 上海拍拍贷金融信息服务有限公司 | Identity verification method and system based on voice recognition |
CN110738998A (en) * | 2019-09-11 | 2020-01-31 | 深圳壹账通智能科技有限公司 | Voice-based personal credit evaluation method, device, terminal and storage medium |
CN110473569A (en) * | 2019-09-11 | 2019-11-19 | 苏州思必驰信息科技有限公司 | Detect the optimization method and system of speaker's spoofing attack |
CN111402899B (en) * | 2020-03-25 | 2023-10-13 | 中国工商银行股份有限公司 | Cross-channel voiceprint recognition method and device |
CN111625704A (en) * | 2020-05-11 | 2020-09-04 | 镇江纵陌阡横信息科技有限公司 | Non-personalized recommendation algorithm model based on user intention and data cooperation |
CN111899566A (en) * | 2020-08-11 | 2020-11-06 | 南京畅淼科技有限责任公司 | Ship traffic management system based on AIS |
CN112289324B (en) * | 2020-10-27 | 2024-05-10 | 湖南华威金安企业管理有限公司 | Voiceprint identity recognition method and device and electronic equipment |
CN112802481A (en) * | 2021-04-06 | 2021-05-14 | 北京远鉴信息技术有限公司 | Voiceprint verification method, voiceprint recognition model training method, device and equipment |
CN113421575B (en) * | 2021-06-30 | 2024-02-06 | 平安科技(深圳)有限公司 | Voiceprint recognition method, voiceprint recognition device, voiceprint recognition equipment and storage medium |
CN114780787A (en) * | 2022-04-01 | 2022-07-22 | 杭州半云科技有限公司 | Voiceprint retrieval method, identity verification method, identity registration method and device |
Citations (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101064043A (en) * | 2006-04-29 | 2007-10-31 | 上海优浪信息科技有限公司 | Sound-groove gate inhibition system and uses thereof |
CN102695112A (en) * | 2012-06-09 | 2012-09-26 | 九江妙士酷实业有限公司 | Automobile player and volume control method thereof |
CN102916815A (en) * | 2012-11-07 | 2013-02-06 | 华为终端有限公司 | Method and device for checking identity of user |
CN103220286A (en) * | 2013-04-10 | 2013-07-24 | 郑方 | Identity verification system and identity verification method based on dynamic password voice |
CN103632504A (en) * | 2013-12-17 | 2014-03-12 | 上海电机学院 | Silence reminder for library |
CN103986725A (en) * | 2014-05-29 | 2014-08-13 | 中国农业银行股份有限公司 | Client side, server side and identity authentication system and method |
CN104157301A (en) * | 2014-07-25 | 2014-11-19 | 广州三星通信技术研究有限公司 | Method, device and terminal deleting voice information blank segment |
CN104427076A (en) * | 2013-08-30 | 2015-03-18 | 中兴通讯股份有限公司 | Recognition method and recognition device for automatic answering of calling system |
CN104992708A (en) * | 2015-05-11 | 2015-10-21 | 国家计算机网络与信息安全管理中心 | Short-time specific audio detection model generating method and short-time specific audio detection method |
CN105100911A (en) * | 2014-05-06 | 2015-11-25 | 夏普株式会社 | Intelligent multimedia system and method |
CN105321293A (en) * | 2014-09-18 | 2016-02-10 | 广东小天才科技有限公司 | Danger detection and warning method and danger detection and warning smart device |
CN105611461A (en) * | 2016-01-04 | 2016-05-25 | 浙江宇视科技有限公司 | Noise suppression method, apparatus and system for voice application system of front-end device |
CN105869645A (en) * | 2016-03-25 | 2016-08-17 | 腾讯科技(深圳)有限公司 | Voice data processing method and device |
CN106210323A (en) * | 2016-07-13 | 2016-12-07 | 广东欧珀移动通信有限公司 | A kind of speech playing method and terminal unit |
CN106971717A (en) * | 2016-01-14 | 2017-07-21 | 芋头科技(杭州)有限公司 | Robot and audio recognition method, the device of webserver collaborative process |
Family Cites Families (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8645137B2 (en) * | 2000-03-16 | 2014-02-04 | Apple Inc. | Fast, language-independent method for user authentication by voice |
CN1170239C (en) * | 2002-09-06 | 2004-10-06 | 浙江大学 | Palm acoustic-print verifying system |
TWI234762B (en) * | 2003-12-22 | 2005-06-21 | Top Dihital Co Ltd | Voiceprint identification system for e-commerce |
US7447633B2 (en) * | 2004-11-22 | 2008-11-04 | International Business Machines Corporation | Method and apparatus for training a text independent speaker recognition system using speech data with text labels |
US7536304B2 (en) * | 2005-05-27 | 2009-05-19 | Porticus, Inc. | Method and system for bio-metric voice print authentication |
CN102479511A (en) * | 2010-11-23 | 2012-05-30 | 盛乐信息技术(上海)有限公司 | Large-scale voiceprint authentication method and system |
TW201301261A (en) * | 2011-06-27 | 2013-01-01 | Hon Hai Prec Ind Co Ltd | Identity authentication system and method thereof |
CN102238190B (en) * | 2011-08-01 | 2013-12-11 | 安徽科大讯飞信息科技股份有限公司 | Identity authentication method and system |
CN102509547B (en) * | 2011-12-29 | 2013-06-19 | 辽宁工业大学 | Method and system for voiceprint recognition based on vector quantization based |
US9042867B2 (en) * | 2012-02-24 | 2015-05-26 | Agnitio S.L. | System and method for speaker recognition on mobile devices |
CN102820033B (en) * | 2012-08-17 | 2013-12-04 | 南京大学 | Voiceprint identification method |
CN104765996B (en) * | 2014-01-06 | 2018-04-27 | 讯飞智元信息科技有限公司 | Voiceprint password authentication method and system |
CN104978507B (en) * | 2014-04-14 | 2019-02-01 | 中国石油化工集团公司 | A kind of Intelligent controller for logging evaluation expert system identity identifying method based on Application on Voiceprint Recognition |
CN104485102A (en) * | 2014-12-23 | 2015-04-01 | 智慧眼(湖南)科技发展有限公司 | Voiceprint recognition method and device |
CN104751845A (en) * | 2015-03-31 | 2015-07-01 | 江苏久祥汽车电器集团有限公司 | Voice recognition method and system used for intelligent robot |
CN105096955B (en) * | 2015-09-06 | 2019-02-01 | 广东外语外贸大学 | A kind of speaker's method for quickly identifying and system based on model growth cluster |
CN105575394A (en) * | 2016-01-04 | 2016-05-11 | 北京时代瑞朗科技有限公司 | Voiceprint identification method based on global change space and deep learning hybrid modeling |
CN106169295B (en) * | 2016-07-15 | 2019-03-01 | 腾讯科技(深圳)有限公司 | Identity vector generation method and device |
CN106373576B (en) * | 2016-09-07 | 2020-07-21 | Tcl科技集团股份有限公司 | Speaker confirmation method and system based on VQ and SVM algorithms |
CN107068154A (en) * | 2017-03-13 | 2017-08-18 | 平安科技(深圳)有限公司 | The method and system of authentication based on Application on Voiceprint Recognition |
- 2017
- 2017-03-13 CN CN201710147695.XA patent/CN107068154A/en active Pending
- 2017-06-30 WO PCT/CN2017/091361 patent/WO2018166112A1/en active Application Filing
- 2017-08-20 CN CN201710715433.9A patent/CN107517207A/en active Pending
- 2017-09-30 WO PCT/CN2017/105031 patent/WO2018166187A1/en active Application Filing
- 2017-10-13 TW TW106135250A patent/TWI641965B/en active
Patent Citations (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101064043A (en) * | 2006-04-29 | 2007-10-31 | 上海优浪信息科技有限公司 | Voiceprint access control system and uses thereof |
CN102695112A (en) * | 2012-06-09 | 2012-09-26 | 九江妙士酷实业有限公司 | Automobile player and volume control method thereof |
CN102916815A (en) * | 2012-11-07 | 2013-02-06 | 华为终端有限公司 | Method and device for checking identity of user |
CN103220286A (en) * | 2013-04-10 | 2013-07-24 | 郑方 | Identity verification system and identity verification method based on dynamic password voice |
CN104427076A (en) * | 2013-08-30 | 2015-03-18 | 中兴通讯股份有限公司 | Recognition method and recognition device for automatic answering of calling system |
CN103632504A (en) * | 2013-12-17 | 2014-03-12 | 上海电机学院 | Silence reminder for library |
CN105100911A (en) * | 2014-05-06 | 2015-11-25 | 夏普株式会社 | Intelligent multimedia system and method |
CN103986725A (en) * | 2014-05-29 | 2014-08-13 | 中国农业银行股份有限公司 | Client side, server side and identity authentication system and method |
CN104157301A (en) * | 2014-07-25 | 2014-11-19 | 广州三星通信技术研究有限公司 | Method, device and terminal for deleting blank segments from voice information |
CN105321293A (en) * | 2014-09-18 | 2016-02-10 | 广东小天才科技有限公司 | Danger detection and warning method and smart device |
CN104992708A (en) * | 2015-05-11 | 2015-10-21 | 国家计算机网络与信息安全管理中心 | Short-time specific audio detection model generating method and short-time specific audio detection method |
CN105611461A (en) * | 2016-01-04 | 2016-05-25 | 浙江宇视科技有限公司 | Noise suppression method, apparatus and system for voice application system of front-end device |
CN106971717A (en) * | 2016-01-14 | 2017-07-21 | 芋头科技(杭州)有限公司 | Robot, and speech recognition method and device for collaborative processing with a web server |
CN105869645A (en) * | 2016-03-25 | 2016-08-17 | 腾讯科技(深圳)有限公司 | Voice data processing method and device |
CN106210323A (en) * | 2016-07-13 | 2016-12-07 | 广东欧珀移动通信有限公司 | Speech playback method and terminal device |
Non-Patent Citations (1)
Title |
---|
WANG J C, LEE H P, WANG J F: "Robust Environmental Sound Recognition for Home Automation", IEEE Transactions on Automation Science & Engineering * |
Cited By (32)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108091326B (en) * | 2018-02-11 | 2021-08-06 | 张晓雷 | Voiceprint recognition method and system based on linear regression |
CN108091326A (en) * | 2018-02-11 | 2018-05-29 | 张晓雷 | Voiceprint recognition method and system based on linear regression |
CN108768654B (en) * | 2018-04-09 | 2020-04-21 | 平安科技(深圳)有限公司 | Identity verification method based on voiceprint recognition, server and storage medium |
CN108694952B (en) * | 2018-04-09 | 2020-04-28 | 平安科技(深圳)有限公司 | Electronic device, identity authentication method and storage medium |
CN108694952A (en) * | 2018-04-09 | 2018-10-23 | 平安科技(深圳)有限公司 | Electronic device, identity authentication method and storage medium |
CN108768654A (en) * | 2018-04-09 | 2018-11-06 | 平安科技(深圳)有限公司 | Identity verification method based on voiceprint recognition, server and storage medium |
WO2019196305A1 (en) * | 2018-04-09 | 2019-10-17 | 平安科技(深圳)有限公司 | Electronic device, identity verification method, and storage medium |
CN108447489B (en) * | 2018-04-17 | 2020-05-22 | 清华大学 | Continuous voiceprint authentication method and system with feedback |
CN108447489A (en) * | 2018-04-17 | 2018-08-24 | 清华大学 | Continuous voiceprint authentication method and system with feedback |
CN108630208A (en) * | 2018-05-14 | 2018-10-09 | 平安科技(深圳)有限公司 | Server, voiceprint-based identity authentication method and storage medium |
WO2019218512A1 (en) * | 2018-05-14 | 2019-11-21 | 平安科技(深圳)有限公司 | Server, voiceprint verification method, and storage medium |
WO2019218515A1 (en) * | 2018-05-14 | 2019-11-21 | 平安科技(深圳)有限公司 | Server, voiceprint-based identity authentication method, and storage medium |
CN108834138A (en) * | 2018-05-25 | 2018-11-16 | 四川斐讯全智信息技术有限公司 | Distribution method and system based on a voiceprint database |
CN109087647A (en) * | 2018-08-03 | 2018-12-25 | 平安科技(深圳)有限公司 | Voiceprint recognition processing method and device, electronic device, and storage medium |
CN109256138A (en) * | 2018-08-13 | 2019-01-22 | 平安科技(深圳)有限公司 | Identity verification method, terminal device and computer-readable storage medium |
CN109256138B (en) * | 2018-08-13 | 2023-07-07 | 平安科技(深圳)有限公司 | Identity verification method, terminal device and computer readable storage medium |
CN109450850A (en) * | 2018-09-26 | 2019-03-08 | 深圳壹账通智能科技有限公司 | Identity verification method, device, computer device and storage medium |
CN109147797A (en) * | 2018-10-18 | 2019-01-04 | 平安科技(深圳)有限公司 | Customer service method, device, computer device and storage medium based on voiceprint recognition |
CN109147797B (en) * | 2018-10-18 | 2024-05-07 | 平安科技(深圳)有限公司 | Customer service method, device, computer equipment and storage medium based on voiceprint recognition |
CN110046910A (en) * | 2018-12-13 | 2019-07-23 | 阿里巴巴集团控股有限公司 | The method and apparatus for obtaining customer group relevant to particular customer |
CN110046910B (en) * | 2018-12-13 | 2023-04-14 | 蚂蚁金服(杭州)网络技术有限公司 | Method and equipment for judging validity of transaction performed by customer through electronic payment platform |
CN109816508A (en) * | 2018-12-14 | 2019-05-28 | 深圳壹账通智能科技有限公司 | User identity authentication method and device based on big data, and computer device |
CN109473108A (en) * | 2018-12-15 | 2019-03-15 | 深圳壹账通智能科技有限公司 | Identity verification method, device, equipment and storage medium based on voiceprint recognition |
CN109545226A (en) * | 2019-01-04 | 2019-03-29 | 平安科技(深圳)有限公司 | Speech recognition method, device and computer-readable storage medium |
CN110322888A (en) * | 2019-05-21 | 2019-10-11 | 平安科技(深圳)有限公司 | Credit card unlocking method, device, equipment and computer readable storage medium |
CN110322888B (en) * | 2019-05-21 | 2023-05-30 | 平安科技(深圳)有限公司 | Credit card unlocking method, apparatus, device and computer readable storage medium |
CN110334603A (en) * | 2019-06-06 | 2019-10-15 | 视联动力信息技术股份有限公司 | Authentication system |
CN110971755A (en) * | 2019-11-18 | 2020-04-07 | 武汉大学 | Double-factor identity authentication method based on PIN code and pressure code |
CN111597531A (en) * | 2020-04-07 | 2020-08-28 | 北京捷通华声科技股份有限公司 | Identity authentication method and device, electronic equipment and readable storage medium |
CN111710340A (en) * | 2020-06-05 | 2020-09-25 | 深圳市卡牛科技有限公司 | Method, device, server and storage medium for identifying user identity based on voice |
CN111613230A (en) * | 2020-06-24 | 2020-09-01 | 泰康保险集团股份有限公司 | Voiceprint verification method, voiceprint verification device, voiceprint verification equipment and storage medium |
CN112669841A (en) * | 2020-12-18 | 2021-04-16 | 平安科技(深圳)有限公司 | Training method and device for multilingual speech generation model and computer equipment |
Also Published As
Publication number | Publication date |
---|---|
CN107068154A (en) | 2017-08-18 |
TW201833810A (en) | 2018-09-16 |
WO2018166187A1 (en) | 2018-09-20 |
WO2018166112A1 (en) | 2018-09-20 |
TWI641965B (en) | 2018-11-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107517207A (en) | Server, identity verification method and computer-readable recording medium | |
WO2019100606A1 (en) | Electronic device, voiceprint-based identity verification method and system, and storage medium | |
KR102159217B1 (en) | Electronic device, identification method, system and computer-readable storage medium | |
WO2020119448A1 (en) | Voice information verification | |
Liu et al. | An MFCC‐based text‐independent speaker identification system for access control | |
CN110047490A (en) | Voiceprint recognition method, device, equipment and computer-readable storage medium | |
WO2019136912A1 (en) | Electronic device, identity authentication method and system, and storage medium | |
CN103971690A (en) | Voiceprint recognition method and device | |
CN103794207A (en) | Dual-mode voice identity recognition method | |
CN107766868A (en) | Classifier training method and device | |
CN105096955A (en) | Rapid speaker identification method and system based on model-growth clustering | |
CN108650266B (en) | Server, voiceprint verification method and storage medium | |
CN110473552A (en) | Speech recognition authentication method and system | |
CN104517066A (en) | Folder encrypting method | |
CN109378014A (en) | Mobile device source identification method and system based on convolutional neural networks | |
CN108694952B (en) | Electronic device, identity authentication method and storage medium | |
CN113177850A (en) | Method and device for multi-party identity authentication of insurance | |
CN111933154B (en) | Method, equipment and computer readable storage medium for recognizing fake voice | |
CN108630208B (en) | Server, voiceprint-based identity authentication method and storage medium | |
CN114003883A (en) | Portable digital identity authentication equipment and identity authentication method | |
CN111916074A (en) | Cross-device voice control method, system, terminal and storage medium | |
TW201944320A (en) | Payment authentication method, device, equipment and storage medium | |
CN112562691B (en) | Voiceprint recognition method, voiceprint recognition device, computer equipment and storage medium | |
CN115690920B (en) | Credible living body detection method for medical identity authentication and related equipment | |
CN113436633B (en) | Speaker recognition method, speaker recognition device, computer equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 2017-12-26