CN110110513A - Identity authentication method, device and storage medium based on face and voiceprint - Google Patents

Identity authentication method, device and storage medium based on face and voiceprint

Info

Publication number
CN110110513A
Authority
CN
China
Prior art keywords
face
information
voice
voiceprint
authenticated
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910337540.1A
Other languages
Chinese (zh)
Inventor
陈继华
陈志国
陈凯迪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Liwei Zhilian Technology Co Ltd
Shanghai Yueling Information Technology Co Ltd
Shenzhen ZNV Technology Co Ltd
Original Assignee
Shenzhen Liwei Zhilian Technology Co Ltd
Shanghai Yueling Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Liwei Zhilian Technology Co Ltd, Shanghai Yueling Information Technology Co Ltd filed Critical Shenzhen Liwei Zhilian Technology Co Ltd
Priority to CN201910337540.1A priority Critical patent/CN110110513A/en
Publication of CN110110513A publication Critical patent/CN110110513A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F 21/30: Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F 21/31: User authentication
    • G06F 21/32: User authentication using biometric data, e.g. fingerprints, iris scans or voiceprints
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/40: Spoof detection, e.g. liveness detection
    • G06V 40/45: Detection of the body part being alive

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Collating Specific Patterns (AREA)
  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)

Abstract

The invention discloses an identity authentication method based on face and voiceprint, comprising: when an identity authentication request is received, acquiring the image information and audio information of an object to be authenticated; extracting the face information from the image information, and extracting the voiceprint information from the audio information; performing liveness detection on the object to be authenticated according to the face information and the audio information; when the object to be authenticated passes the liveness detection, performing identity detection on the object to be authenticated according to the face information and the voiceprint information; and when the object to be authenticated passes the identity detection, determining that the identity authentication succeeds. The invention also discloses an identity authentication device based on face and voiceprint and a computer-readable storage medium. By performing liveness detection on the object to be authenticated based on both face information and audio information, the invention improves the security of identity authentication.

Description

Identity authentication method, device and storage medium based on face and voiceprint
Technical field
The present invention relates to the field of information security technology, and in particular to an identity authentication method, device and computer-readable storage medium based on face and voiceprint.
Background art
With the rapid development of computer science, identity authentication based on an account and a password has become increasingly easy to crack, and its security keeps decreasing. Techniques that use intrinsic physiological characteristics of the human body, such as the face or the voice, for identity authentication have therefore emerged: at registration time, the face or voice of the registered user is stored; at authentication time, the face or voice of the user to be verified is acquired, and identity authentication is performed according to whether the acquired face or voice matches the stored face or voice.
However, face information can be obtained from a photo of the user, and the user's voice can also be obtained by recording, so a photo and a recording can pass the authentication of current face-based or voice-based identity authentication systems. As a result, the security of current authentication schemes that rely on face recognition or speech recognition alone is relatively low.
The above content is provided only to facilitate understanding of the technical solution of the present invention and does not constitute an admission that it is prior art.
Summary of the invention
The main purpose of the present invention is to provide an identity authentication method, device and computer-readable storage medium based on face and voiceprint, aiming to solve the technical problem that current face-based or voice-based identity authentication systems can be passed with a photo and a recording, so that the security of authentication based on face recognition or speech recognition is relatively low.
To achieve the above object, the present invention provides an identity authentication method based on face and voiceprint, the identity authentication method based on face and voiceprint comprising the following steps:
when an identity authentication request is received, acquiring the image information and audio information of an object to be authenticated;
extracting the face information from the image information, and extracting the voiceprint information from the audio information;
performing liveness detection on the object to be authenticated according to the face information and the audio information;
when the object to be authenticated passes the liveness detection, performing identity detection on the object to be authenticated according to the face information and the voiceprint information;
when the object to be authenticated passes the identity detection, determining that the identity authentication succeeds.
Preferably, the step of performing identity detection on the object to be authenticated according to the face information and the voiceprint information comprises:
acquiring the account information contained in the identity authentication request;
determining pre-stored face information and pre-stored voiceprint information according to the account information;
judging whether the face information matches the pre-stored face information and whether the voiceprint information matches the pre-stored voiceprint information, wherein when the face information matches the pre-stored face information and the voiceprint information matches the pre-stored voiceprint information, it is determined that the object to be authenticated passes the identity detection.
Preferably, the step of performing liveness detection on the object to be authenticated according to the face information and the audio information comprises:
performing liveness detection on the face information, and performing liveness detection on the audio information;
when both the face information and the audio information pass the liveness detection, determining that the object to be authenticated passes the liveness detection.
Preferably, the step of performing liveness detection on the face information comprises:
continuously acquiring image information at a preset time interval within a preset duration;
acquiring the pixel sum of the sclera region in each piece of image information;
judging, according to the pixel sums of the sclera regions, whether a blink action exists in the continuously acquired image information;
when a blink action exists, determining that the face information passes the liveness detection.
Preferably, the step of performing liveness detection on the audio information comprises:
sending a voice liveness challenge instruction to the terminal, the voice liveness challenge instruction containing a sample speech, so that the terminal plays the sample speech, and collects and feeds back a challenge speech;
when the challenge speech fed back by the terminal is received, calculating the deviation between the challenge speech and the sample speech;
when the deviation is less than or equal to a preset deviation threshold, determining that the audio information passes the liveness detection.
Preferably, the step of calculating the deviation between the challenge speech and the sample speech comprises:
acquiring scoring parameters of the challenge speech and of the sample speech respectively, wherein the scoring parameters include at least one of pause duration, speaking rate, volume, pronunciation stress and pronunciation duration;
calculating the deviation of each scoring parameter between the challenge speech and the sample speech;
acquiring the weight corresponding to each scoring parameter, and weighting the deviation of each scoring parameter according to the weights to obtain the deviation between the challenge speech and the sample speech.
Preferably, the step of extracting the face information from the image information comprises:
acquiring image quality parameters of the face region in the image information, wherein the image quality parameters include face tilt angle, brightness and noise;
when the quality parameters are within preset parameter ranges, extracting the face information from the image information.
Preferably, the step of extracting the voiceprint information from the audio information comprises:
performing endpoint detection on the audio information to obtain the effective speech in the audio information;
successively performing quantization encoding, high-frequency emphasis and frame-splitting with windowing on the effective speech;
extracting the voiceprint information from the processed effective speech.
In addition, to achieve the above object, the present invention also provides an identity authentication device based on face and voiceprint, the device comprising: a memory, a processor, and an identity authentication program based on face and voiceprint that is stored on the memory and executable on the processor, wherein the identity authentication program based on face and voiceprint, when executed by the processor, implements the steps of the identity authentication method based on face and voiceprint described in any of the above items.
In addition, to achieve the above object, the present invention also provides a computer-readable storage medium, on which an identity authentication program based on face and voiceprint is stored, wherein the identity authentication program based on face and voiceprint, when executed by a processor, implements the steps of the identity authentication method based on face and voiceprint described in any of the above items.
According to the identity authentication method, device and computer-readable storage medium based on face and voiceprint proposed by the embodiments of the present invention, when an identity authentication request is received, the server acquires the image information and audio information of an object to be authenticated; extracts the face information from the image information and the voiceprint information from the audio information; performs liveness detection on the object to be authenticated according to the face information and the audio information; when the object to be authenticated passes the liveness detection, performs identity detection on the object to be authenticated according to the face information and the voiceprint information; and when the object to be authenticated passes the identity detection, determines that the identity authentication succeeds. By performing liveness detection on the object to be authenticated based on both face information and audio information, the present invention improves the security of identity authentication.
Brief description of the drawings
Fig. 1 is a schematic structural diagram of the device in the hardware running environment involved in the embodiments of the present invention;
Fig. 2 is a schematic flowchart of an embodiment of the identity authentication method based on face and voiceprint according to the present invention;
Fig. 3 is a schematic flowchart of another embodiment of the identity authentication method based on face and voiceprint according to the present invention.
The realization of the object, the functional characteristics and the advantages of the present invention will be further described with reference to the accompanying drawings in combination with the embodiments.
Specific embodiment
It should be understood that the specific embodiments described herein are merely illustrative of the present invention and are not intended to limit the present invention.
The main solution of the embodiments of the present invention is: when an identity authentication request is received, acquiring the image information and audio information of an object to be authenticated; extracting the face information from the image information, and extracting the voiceprint information from the audio information; performing liveness detection on the object to be authenticated according to the face information and the audio information; when the object to be authenticated passes the liveness detection, performing identity detection on the object to be authenticated according to the face information and the voiceprint information; and when the object to be authenticated passes the identity detection, determining that the identity authentication succeeds.
In the prior art, a face-based or voice-based identity authentication system can be passed using a photo and a recording, so the security of authentication based on face recognition or speech recognition is relatively low.
The present invention provides a solution in which the object to be authenticated must first pass liveness detection based on both face information and audio information, which effectively prevents identity authentication from being carried out with a photo and a recording and improves the security of identity authentication.
As shown in Fig. 1, Fig. 1 is a schematic structural diagram of the device in the hardware running environment involved in the embodiments of the present invention.
As shown in Fig. 1, the device may include: a processor 1001, such as a CPU, a user interface 1003, a memory 1004, and a communication bus 1002. The communication bus 1002 is used to realize connection and communication between these components. The user interface 1003 may include a display (Display) and an input unit such as a keyboard (Keyboard); optionally, the user interface 1003 may further include a standard wired interface and a wireless interface. The memory 1004 may be a high-speed RAM memory, or a stable non-volatile memory such as a magnetic disk memory. Optionally, the memory 1004 may also be a storage device independent of the aforementioned processor 1001.
Those skilled in the art will understand that the device structure shown in Fig. 1 does not constitute a limitation of the device, which may include more or fewer components than illustrated, combine certain components, or adopt a different arrangement of components.
As shown in Fig. 1, the memory 1004, as a computer storage medium, may include an operating system, a user interface module, and an identity authentication program based on face and voiceprint.
In the device shown in Fig. 1, the user interface 1003 is mainly used to connect to a terminal and perform data communication with the terminal, and the processor 1001 may be used to call the identity authentication program based on face and voiceprint stored in the memory 1004 and perform the following operations:
when an identity authentication request is received, acquiring the image information and audio information of the object to be authenticated;
extracting the face information from the image information, and extracting the voiceprint information from the audio information;
performing liveness detection on the object to be authenticated according to the face information and the audio information;
when the object to be authenticated passes the liveness detection, performing identity detection on the object to be authenticated according to the face information and the voiceprint information;
when the object to be authenticated passes the identity detection, determining that the identity authentication succeeds.
Further, the processor 1001 may call the identity authentication program based on face and voiceprint stored in the memory 1004 and also perform the following operations:
acquiring the account information contained in the identity authentication request;
determining pre-stored face information and pre-stored voiceprint information according to the account information;
judging whether the face information matches the pre-stored face information and whether the voiceprint information matches the pre-stored voiceprint information, wherein when the face information matches the pre-stored face information and the voiceprint information matches the pre-stored voiceprint information, it is determined that the object to be authenticated passes the identity detection.
Further, the processor 1001 may call the identity authentication program based on face and voiceprint stored in the memory 1004 and also perform the following operations:
performing liveness detection on the face information, and performing liveness detection on the audio information;
when both the face information and the audio information pass the liveness detection, determining that the object to be authenticated passes the liveness detection.
Further, the processor 1001 may call the identity authentication program based on face and voiceprint stored in the memory 1004 and also perform the following operations:
continuously acquiring image information at a preset time interval within a preset duration;
acquiring the pixel sum of the sclera region in each piece of image information;
judging, according to the pixel sums of the sclera regions, whether a blink action exists in the continuously acquired image information;
when a blink action exists, determining that the face information passes the liveness detection.
Further, the processor 1001 may call the identity authentication program based on face and voiceprint stored in the memory 1004 and also perform the following operations:
sending a voice liveness challenge instruction to the terminal, the voice liveness challenge instruction containing a sample speech, so that the terminal plays the sample speech, and collects and feeds back a challenge speech;
when the challenge speech fed back by the terminal is received, calculating the deviation between the challenge speech and the sample speech;
when the deviation is less than or equal to a preset deviation threshold, determining that the audio information passes the liveness detection.
Further, the processor 1001 may call the identity authentication program based on face and voiceprint stored in the memory 1004 and also perform the following operations:
acquiring scoring parameters of the challenge speech and of the sample speech respectively, wherein the scoring parameters include at least one of pause duration, speaking rate, volume, pronunciation stress and pronunciation duration;
calculating the deviation of each scoring parameter between the challenge speech and the sample speech;
acquiring the weight corresponding to each scoring parameter, and weighting the deviation of each scoring parameter according to the weights to obtain the deviation between the challenge speech and the sample speech.
Further, the processor 1001 may call the identity authentication program based on face and voiceprint stored in the memory 1004 and also perform the following operations:
acquiring image quality parameters of the face region in the image information, wherein the image quality parameters include face tilt angle, brightness and noise;
when the quality parameters are within preset parameter ranges, extracting the face information from the image information.
Further, the processor 1001 may call the identity authentication program based on face and voiceprint stored in the memory 1004 and also perform the following operations:
performing endpoint detection on the audio information to obtain the effective speech in the audio information;
successively performing quantization encoding, high-frequency emphasis and frame-splitting with windowing on the effective speech;
extracting the voiceprint information from the processed effective speech.
According to the above solution, when an identity authentication request is received, the server acquires the image information and audio information of the object to be authenticated; extracts the face information from the image information and the voiceprint information from the audio information; performs liveness detection on the object to be authenticated according to the face information and the audio information; when the object to be authenticated passes the liveness detection, performs identity detection on the object to be authenticated according to the face information and the voiceprint information; and when the object to be authenticated passes the identity detection, determines that the identity authentication succeeds. By performing liveness detection on the object to be authenticated based on both face information and audio information, the present invention improves the security of identity authentication.
Referring to Fig. 2, Fig. 2 is a schematic flowchart of an embodiment of the identity authentication method based on face and voiceprint according to the present invention. The identity authentication method based on face and voiceprint includes:
Step S10: when an identity authentication request is received, acquiring the image information and audio information of the object to be authenticated;
The identity authentication method based on face and voiceprint provided by the present invention is mainly used for identity authentication, in particular identity authentication that uses intrinsic physiological characteristics of the human body. The executing subject of the present invention may be a server or a terminal, for example an authentication server; the present invention is described below with a server as the executing subject.
When the user logs in, login account information is input through the terminal, and the terminal sends an identity authentication request to the server, the identity authentication request containing the login account information. When the server receives the identity authentication request sent by the terminal, it sends to the terminal an acquisition instruction for the image information and audio information of the object to be authenticated, so that the terminal collects and feeds back the image information and audio information of the object to be authenticated according to the acquisition instruction. When the terminal receives the acquisition instruction for the image information and audio information of the object to be authenticated, it collects the image information and audio information of the object to be authenticated and sends them to the server.
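As a concrete illustration of this exchange, the minimal sketch below models the server side of the request and acquisition flow. All class, field and method names here (AuthRequest, send_instruction, wait_for_reply and so on) are hypothetical and are not taken from the patent.

```python
# Hypothetical sketch of the request/acquisition exchange described above.
# The transport, message fields and terminal methods are assumptions for illustration.

from dataclasses import dataclass


@dataclass
class AuthRequest:
    account_id: str          # login account information carried by the terminal


@dataclass
class CapturedMedia:
    image_bytes: bytes       # image information of the object to be authenticated
    audio_bytes: bytes       # audio information of the object to be authenticated


def handle_auth_request(request: AuthRequest, terminal) -> CapturedMedia:
    """On an authentication request, instruct the terminal to capture and return media."""
    # Server -> terminal: acquisition instruction for image and audio information.
    terminal.send_instruction({"type": "capture", "media": ["image", "audio"]})
    # Terminal -> server: the captured image and audio of the object to be authenticated.
    reply = terminal.wait_for_reply()
    return CapturedMedia(image_bytes=reply["image"], audio_bytes=reply["audio"])
```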
Step S20: extracting the face information from the image information, and extracting the voiceprint information from the audio information;
After the server receives the image information and audio information of the object to be authenticated, it extracts the face information from the image information and the voiceprint information from the audio information respectively. The server performs face recognition on the image information and judges whether the face recognition succeeds. When the face recognition fails, the server may resend the image acquisition instruction to the terminal so as to reacquire the image information of the user to be authenticated through the terminal; when the face recognition succeeds, the face information in the image information is extracted. The server performs voiceprint recognition on the audio information and judges whether the voiceprint recognition succeeds. When the voiceprint recognition fails, the server may resend the audio acquisition instruction to the terminal so as to reacquire the audio information of the user to be authenticated through the terminal; when the voiceprint recognition succeeds, the server extracts the voiceprint information from the audio information.
Extracting the face information from the image information may be carried out as follows: after the server receives the image information, it parses the image information and detects whether a face appears in it. Specifically, the server extracts the facial texture feature information from the picture and detects whether facial features are present. When the detection result does not contain facial features, it is determined that no face appears in the image information, and the server may resend the image acquisition instruction to the terminal so that the terminal reacquires and feeds back the image information of the user to be authenticated. When the detection result contains facial features, it is determined that a face appears in the image information.
When a face appears in the image information, the server extracts the picture of the face region in the image information and analyzes whether the picture quality of the face region meets the face picture quality requirements. Specifically, the server acquires the image quality parameters of the face region in the image information, wherein the image quality parameters include face tilt angle, brightness and noise, and then judges whether each image quality parameter is within its preset parameter range. When every quality parameter is within its preset parameter range, it is determined that the picture quality of the face region meets every face picture quality requirement; when at least one quality parameter exceeds its corresponding preset parameter range, it is determined that the picture quality of the face region does not meet the face picture quality requirements. Likewise, when the picture quality of the face region does not meet the face picture quality requirements, the server may resend the image acquisition instruction to the terminal so that the terminal reacquires and feeds back the image information of the user to be authenticated. It should be understood that the preset parameter ranges can be set according to the actual situation and are not specifically limited here.
When the picture quality of the face region meets every face picture quality requirement, the server extracts the face information from the image information and judges whether the extraction of the face feature information succeeds. If the extraction succeeds, the face region picture can be analyzed into face feature information; if the extraction fails, the face region picture cannot be analyzed into face feature information. When the extraction of the face information fails, the server may resend the image acquisition instruction to the terminal so that the terminal reacquires and feeds back the image information of the user to be authenticated.
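The face detection and quality gating described above can be sketched roughly as follows. This is a minimal sketch, assuming an OpenCV Haar-cascade face detector is available; the brightness range and the Laplacian-variance noise/blur proxy are illustrative stand-ins for the patent's tilt-angle, brightness and noise checks, not its actual criteria.

```python
# Sketch of the face-detection and quality-gating step, assuming OpenCV is available.
# The quality thresholds and the noise proxy are illustrative assumptions only.

from typing import Optional

import cv2
import numpy as np

FACE_CASCADE = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")


def extract_face_region(image_bgr: np.ndarray) -> Optional[np.ndarray]:
    """Return the face region if a face is found and its picture quality is acceptable."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    faces = FACE_CASCADE.detectMultiScale(gray, scaleFactor=1.2, minNeighbors=5)
    if len(faces) == 0:
        return None                      # no face: ask the terminal to recapture

    x, y, w, h = faces[0]
    face = gray[y:y + h, x:x + w]

    brightness = float(face.mean())                             # brightness check
    sharpness = float(cv2.Laplacian(face, cv2.CV_64F).var())    # crude noise/blur proxy
    if not (60.0 <= brightness <= 200.0):                       # assumed preset range
        return None
    if sharpness < 20.0:                                        # assumed preset range
        return None
    return face                          # handed on to face feature extraction
```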
Extracting the voiceprint information from the audio information may be carried out as follows:
First, speech endpoint detection is performed on the acquired audio information to obtain the effective speech in the audio information. Because the start and the end of a recording made by the user through the terminal usually contain silent segments or background noise, in order to prevent the system from performing unnecessary processing, the server first uses an endpoint detection method to detect the start point and the end point of the valid speech signal in the acquired audio information, removes the invalid speech segments, and obtains the effective speech in the audio information.
Quantization encoding is then performed on the effective speech. The recorded speech signal is an analog signal, which is not convenient for computer processing. After obtaining the effective speech, the server performs quantization encoding on the amplitude of the speech signal of the effective speech, converting the effective speech from an analog signal into a digital signal.
High-frequency emphasis is performed on the quantization-encoded effective speech. Since the speech frequency range of a normal person is generally 40 Hz to 4000 Hz, speech above about 800 Hz is regarded as the high-frequency part of the speech signal. Because of the energy loss caused by radiation at the lips, the spectral components of the higher-frequency part are smaller, so the high-frequency part of the speech spectrum is more difficult to analyze than the low-frequency part. In order to analyze and process the speech signal uniformly, the server emphasizes the high-frequency part of the speech signal obtained in the preceding steps.
Frame splitting and windowing are performed on the high-frequency-emphasized effective speech. The server divides the emphasized speech signal into multiple short-time speech segments, turning it into a sequence of analysis frames, so that the speech signal can be given short-time processing with the methods used for stationary processes.
Finally, the voiceprint information is extracted from the processed effective speech, and the server judges whether the extraction of the voiceprint feature information succeeds. If the extraction succeeds, the speech signal can be analyzed into voiceprint feature information; if the extraction fails, the speech signal cannot be analyzed into voiceprint feature information. When the extraction of the voiceprint feature information fails, the server may resend the audio acquisition instruction to the terminal so that the terminal reacquires and feeds back the audio information of the user to be authenticated.
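A minimal NumPy sketch of this voiceprint front end is given below, under the assumption that the terminal's analog-to-digital conversion has already produced a sampled waveform (so the quantization-encoding step is taken as done). The energy-based endpoint detector, the 0.97 pre-emphasis coefficient, the 25 ms Hamming-windowed frames at an assumed 16 kHz sample rate, and the per-frame log-energy used as a stand-in feature are all assumptions for illustration, not the patent's implementation.

```python
# Sketch of the voiceprint front end: endpoint detection, pre-emphasis,
# framing + windowing, and a placeholder feature per frame. Assumptions throughout;
# frame_len=400 and hop=160 correspond to 25 ms / 10 ms at 16 kHz.

import numpy as np


def endpoint_detect(signal: np.ndarray, frame_len: int = 400, thresh: float = 1e-4) -> np.ndarray:
    """Keep only the span between the first and last high-energy frame (effective speech)."""
    frames = [signal[i:i + frame_len] for i in range(0, len(signal) - frame_len, frame_len)]
    energy = np.array([float(np.mean(f ** 2)) for f in frames])
    active = np.where(energy > thresh)[0]
    if active.size == 0:
        return signal[:0]
    return signal[active[0] * frame_len:(active[-1] + 1) * frame_len]


def pre_emphasis(signal: np.ndarray, alpha: float = 0.97) -> np.ndarray:
    """Boost the high-frequency part (roughly above 800 Hz) of the speech signal."""
    return np.append(signal[0], signal[1:] - alpha * signal[:-1])


def frame_and_window(signal: np.ndarray, frame_len: int = 400, hop: int = 160) -> np.ndarray:
    """Split into short-time frames and apply a Hamming window to each frame."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] for i in range(n_frames)])
    return frames * np.hamming(frame_len)


def extract_voiceprint(raw_audio) -> np.ndarray:
    """Run the front end and return per-frame log-energies as a stand-in feature vector."""
    signal = np.asarray(raw_audio, dtype=float)
    speech = endpoint_detect(signal)
    if speech.size < 400:
        return np.array([])          # not enough effective speech: ask for re-recording
    speech = pre_emphasis(speech)
    frames = frame_and_window(speech)
    return np.log(np.sum(frames ** 2, axis=1) + 1e-10)
```

In a real system the last step would feed the windowed frames into a proper speaker-embedding model; the log-energy vector here only marks where that feature extraction would sit in the pipeline.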
Step S30: performing liveness detection on the object to be authenticated according to the face information and the audio information;
After the face information and the voiceprint information are successfully extracted, the server performs liveness detection on the object to be authenticated according to the face information and the audio information, and judges whether the object to be authenticated is a living body.
Step S40: when the object to be authenticated passes the liveness detection, performing identity detection on the object to be authenticated according to the face information and the voiceprint information;
When the object to be authenticated passes the liveness detection, the object to be authenticated is a living body, and the server further performs identity detection on the object to be authenticated according to the face information and the voiceprint information. When the object to be authenticated does not pass the liveness detection, the object to be authenticated is not a living body; the server may end the verification and send the terminal a message prompting that the object to be authenticated is not a living body, or resend the acquisition request for image information and audio information to the terminal and perform identity authentication again.
Specifically, the server acquires the account information contained in the identity authentication request, and then determines the pre-stored face information and pre-stored voiceprint information according to the account information, that is, acquires the face information and voiceprint information associated with the account information. The server then judges whether the face information matches the pre-stored face information and whether the voiceprint information matches the pre-stored voiceprint information. When the face information matches the pre-stored face information and the voiceprint information matches the pre-stored voiceprint information, it is determined that the object to be authenticated passes the identity detection. When the face information does not match the pre-stored face information, or the voiceprint information does not match the pre-stored voiceprint information, or both, it is determined that the object to be authenticated does not pass the identity detection.
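A compact sketch of this identity-detection step is shown below. The patent only requires that the extracted face and voiceprint information match the templates pre-stored for the account; the cosine-similarity comparison and the 0.8 thresholds used here are assumptions for illustration.

```python
# Sketch of identity detection against the templates pre-stored for the account.
# Cosine similarity and the thresholds are illustrative assumptions.

import numpy as np

FACE_THRESHOLD = 0.8        # assumed preset face similarity threshold
VOICE_THRESHOLD = 0.8       # assumed preset voiceprint similarity threshold


def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-10))


def identity_detect(account_id: str, face_feat: np.ndarray, voice_feat: np.ndarray,
                    user_db: dict) -> bool:
    """Pass only if BOTH the face and the voiceprint match the account's stored templates."""
    record = user_db.get(account_id)
    if record is None:
        return False                       # no pre-stored information for this account
    face_ok = cosine(face_feat, record["face"]) >= FACE_THRESHOLD
    voice_ok = cosine(voice_feat, record["voiceprint"]) >= VOICE_THRESHOLD
    return face_ok and voice_ok
```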
Step S50: when the object to be authenticated passes the identity detection, determining that the identity authentication succeeds.
When the object to be authenticated does not pass the identity detection, the server determines that the identity authentication fails; at this point the server may end the verification and send an authentication failure message to the terminal. When the object to be authenticated passes the identity detection, the server determines that the identity authentication succeeds.
Further, in order to reduce the overhead of the server and the terminal, in an embodiment of the present invention the image information of the object to be authenticated is acquired first, the face information in the image information is extracted, and liveness detection is performed on the face information; only when the face information passes the liveness detection are the audio information of the object to be authenticated acquired, the voiceprint information in the audio information extracted, and liveness detection performed on the audio information. Alternatively, the audio information of the object to be authenticated is acquired first, the voiceprint information in the audio information is extracted, and liveness detection is performed on the audio information; only when the audio information passes the liveness detection are the image information of the object to be authenticated acquired, the face information in the image information extracted, and liveness detection performed on the face information, as in the sketch below.
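This staged check might look like the following sketch; the terminal capture methods and the two liveness callbacks are hypothetical placeholders for the steps detailed in the second embodiment.

```python
# Sketch of the staged liveness check that skips acquiring and processing the second
# modality when the first one already fails. All names are hypothetical placeholders.

def staged_liveness(terminal, face_liveness, voice_liveness, face_first: bool = True) -> bool:
    """Short-circuit: acquire the second modality only if the first passes liveness."""
    if face_first:
        frames = terminal.capture_image_sequence()
        if not face_liveness(frames):              # blink detection
            return False
        audio = terminal.capture_audio()
        return voice_liveness(terminal, audio)     # voice challenge
    audio = terminal.capture_audio()
    if not voice_liveness(terminal, audio):
        return False
    frames = terminal.capture_image_sequence()
    return face_liveness(frames)
```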
In the technical solution provided by this embodiment, when an identity authentication request is received, the server acquires the image information and audio information of the object to be authenticated; extracts the face information from the image information and the voiceprint information from the audio information; performs liveness detection on the object to be authenticated according to the face information and the audio information; when the object to be authenticated passes the liveness detection, performs identity detection on the object to be authenticated according to the face information and the voiceprint information; and when the object to be authenticated passes the identity detection, determines that the identity authentication succeeds. By performing liveness detection on the object to be authenticated based on both face information and audio information, the present invention improves the security of identity authentication.
Further, referring to Fig. 3, Fig. 3 is a schematic flowchart of another embodiment of the identity authentication method based on face and voiceprint according to the present invention. On the basis of the embodiment shown in Fig. 2, the refinement of step S30 includes:
Step S31: performing liveness detection on the face information, and performing liveness detection on the audio information;
Step S32: when both the face information and the audio information pass the liveness detection, determining that the object to be authenticated passes the liveness detection.
In this embodiment, the server extracts the face information from the image information and the voiceprint information from the audio information, and then performs liveness detection on the face information and on the audio information respectively. When both the face information and the audio information pass the liveness detection, the server determines that the object to be authenticated passes the liveness detection. When at least one of the face information and the audio information fails the liveness detection, the server determines that the object to be authenticated does not pass the liveness detection.
In this embodiment, liveness detection of the face information can be realized by blink detection. Blinking is a spontaneous physiological behavior of the human body, and using whether a blink occurs as the condition for distinguishing a living face from a photographed face requires no active cooperation from the user.
Specifically, within a preset duration the server continuously acquires image information through the terminal at a preset time interval (the image acquisition interval), wherein the preset duration and the preset time interval can be set according to the actual situation, for example according to the blinking behavior of human eyes. In practice, human eyes blink more than ten times per minute, the eyes close roughly once every 2 to 4 seconds, and one blink takes about 0.2 to 0.4 seconds. Therefore, in order to ensure that enough pictures are captured during a blink so that the blink analysis can be carried out, the preset time interval may be set to one hundred milliseconds. Further, it should be understood that the image information used for blink detection must contain face information before detection can proceed; the server therefore checks each acquired image, extracts the facial texture feature information of the picture, and detects whether facial features are present. When no facial features are present, no face appears in the image information and resampling is required. When face information is present, a face appears in the image information, and the server acquires the pixel sum of the sclera region in each piece of image information. The pixel sum of the sclera region can be obtained as follows: when face information is present in the image information, the server extracts the eye region in the image information, performs graying and binarization on the eye region, removes interference information to obtain the eye image, and then counts the eyeball pixels, i.e. the pixel sum inside the sclera.
The server processes each collected image in the above manner and counts the pixels of the eyeball image in each piece of image information. The server then judges, according to the pixel sums of the sclera regions, whether a blink action exists in the continuously acquired image information. Specifically, the server analyzes how the eyeball pixel sum changes across the images acquired within the preset duration. If, across the images, the eyeball pixels (the pixels inside the sclera) go from present to absent, the pixel count of the eyeball image fluctuates periodically across the images in a way that matches the pattern of human blinking, and it can be judged that a blink action has occurred. When at least one blink action occurs, it is determined that the face information passes the liveness detection and the face image is a living face; when no blink action occurs, it is determined that the face information does not pass the liveness detection and the face image is not a living face.
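A minimal sketch of this blink check follows. It assumes the per-frame eyeball (sclera-interior) pixel counts have already been obtained by the graying and binarization described above, and the open-closed-open criterion with a fixed ratio threshold is a simplified stand-in for the periodic-fluctuation analysis in the text.

```python
# Sketch of blink detection from per-frame eyeball (sclera-interior) pixel counts.
# Frames are assumed to be sampled about every 100 ms over the preset duration;
# the "drops close to zero" criterion is a simplified stand-in for the analysis above.

def detect_blink(eyeball_pixel_counts: list, closed_ratio: float = 0.2) -> bool:
    """Return True if the eye appears open, then closed, then open again."""
    if not eyeball_pixel_counts:
        return False
    open_level = max(eyeball_pixel_counts)
    if open_level == 0:
        return False                       # eye region never visible: no decision
    closed_thresh = closed_ratio * open_level
    seen_open = seen_closed_after_open = False
    for count in eyeball_pixel_counts:
        if count > closed_thresh:
            if seen_closed_after_open:
                return True                # open -> closed -> open: a blink occurred
            seen_open = True
        elif seen_open:
            seen_closed_after_open = True
    return False
```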
In this embodiment, liveness detection of the audio information can be realized by a voice challenge. Specifically, the server randomly generates a voiceprint challenge code sequence; a voiceprint challenge code sequence is a short utterance whose characteristic information includes: the time interval between every word (or character), the speaking rate, the volume, the pronunciation stress and the pronunciation length of the voice prompt. After generating the voiceprint challenge code sequence, the server performs speech synthesis on the voiceprint challenge code using speech synthesis technology to obtain the sample speech.
After the server synthesizes the sample speech, it sends a voice liveness challenge instruction to the terminal, the voice liveness challenge instruction containing the sample speech, so that the terminal plays the sample speech, and collects and feeds back the challenge speech. Specifically, after receiving the voice liveness challenge instruction, the terminal plays the sample speech and prompts the user to repeat the sample speech in the same style as the sample. After the terminal collects the challenge speech input by the user, it sends the challenge speech to the server. When the server receives the challenge speech sent by the terminal, it calculates the voice liveness score deviation between the challenge speech and the sample speech.
Specifically, the parameters of the voice liveness score deviation can be preset and can be set according to the actual situation. For example, the parameters of the voice liveness score deviation may include pause duration, speaking rate, volume, pronunciation stress and pronunciation duration, and each parameter can be given a different weight (0 ≤ weight ≤ 1) according to the test scenario or different security standards. The server acquires the scoring parameters of the challenge speech and of the sample speech respectively, wherein the scoring parameters include at least one of pause duration, speaking rate, volume, pronunciation stress and pronunciation duration, and then calculates the deviation of each scoring parameter between the challenge speech and the sample speech, for example the deviation S1 of the pause duration between the user's challenge speech and the prompted challenge code, the deviation S2 of the speaking rate, the deviation S3 of the volume, the deviation S4 of the pronunciation stress and the deviation S5 of the pronunciation duration. The server then acquires the weight corresponding to each scoring parameter and weights the deviation of each scoring parameter according to the weights to obtain the deviation between the challenge speech and the sample speech, i.e. the voice liveness score deviation can be calculated by the following formula: voice liveness score deviation = weight1 × S1 + weight2 × S2 + weight3 × S3 + weight4 × S4 + weight5 × S5.
After the server calculates the deviation between the sample speech and the challenge speech, it judges whether the deviation is less than or equal to a preset deviation threshold. When the deviation is less than or equal to the preset deviation threshold, the server determines that the audio information passes the liveness detection, i.e. the voice liveness detection result is a living voice; when the deviation is greater than the preset deviation threshold, the server determines that the audio information does not pass the liveness detection, i.e. the voice liveness detection result is a non-living voice.
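The weighted score above can be computed as in the sketch below. The patent fixes only the weighted-sum form (deviation = weight1 × S1 + ... + weight5 × S5); the relative-difference measure used for each parameter, the equal example weights and the example threshold are assumptions.

```python
# Sketch of the voice liveness score: deviation = w1*S1 + w2*S2 + w3*S3 + w4*S4 + w5*S5.
# The relative-difference measure, the example weights and the threshold are assumptions.

PARAMS = ["pause_duration", "speech_rate", "volume", "stress", "duration"]
WEIGHTS = {"pause_duration": 0.2, "speech_rate": 0.2, "volume": 0.2,
           "stress": 0.2, "duration": 0.2}          # 0 <= weight <= 1, per security policy
DEVIATION_THRESHOLD = 0.15                           # assumed preset deviation threshold


def relative_diff(a: float, b: float) -> float:
    return abs(a - b) / (abs(b) + 1e-10)


def voice_liveness_deviation(challenge: dict, sample: dict) -> float:
    """Weighted sum of per-parameter deviations between challenge and sample speech."""
    return sum(WEIGHTS[p] * relative_diff(challenge[p], sample[p]) for p in PARAMS)


def voice_liveness_passed(challenge: dict, sample: dict) -> bool:
    """Living voice only if the weighted deviation stays within the preset threshold."""
    return voice_liveness_deviation(challenge, sample) <= DEVIATION_THRESHOLD
```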
In addition, in any embodiment of the present invention, to further enhance the security of identity authentication, when the user registers through the terminal the server also performs liveness detection on the face information and audio information first, and only when the liveness detection of both the face information and the audio information passes does the server store the user's account information, face information and voiceprint information in association. Specifically, when the server receives a registration request from the terminal, it acquires the account information sent by the terminal and sends an image information and audio information acquisition request to the terminal; when it obtains the image information and audio information sent by the terminal, it extracts the face information from the image information and the voiceprint information from the audio information, and then performs liveness detection on the face information and the audio information. Only when the liveness detection of both the face information and the audio information passes does the server store the user's account information, face information and voiceprint information in association, so as to ensure that the registering object is a living body and further improve the security of identity authentication. The acquisition of the image information and audio information, the extraction of the face information and the voiceprint information, and the liveness detection steps for the face information and the audio information are the same as during identity authentication and are not repeated here.
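A compact sketch of this registration path is shown below; the extraction and liveness helpers are passed in as placeholders for the steps already described, and the dictionary used as the user information database is purely illustrative.

```python
# Sketch of registration: templates are stored only after both liveness checks pass.
# The extraction and liveness helpers stand for the steps described earlier; the
# dictionary "database" is an illustration only.

def register_user(account_id: str, image_seq, audio, user_db: dict,
                  extract_face, extract_voiceprint,
                  face_liveness, voice_liveness) -> bool:
    face_feat = extract_face(image_seq[0])
    voice_feat = extract_voiceprint(audio)
    if face_feat is None or voice_feat is None or len(voice_feat) == 0:
        return False                               # extraction failed: ask to recapture
    if not (face_liveness(image_seq) and voice_liveness(audio)):
        return False                               # refuse to enroll a non-living source
    # Store account, face template and voiceprint template in association.
    user_db[account_id] = {"face": face_feat, "voiceprint": voice_feat}
    return True
```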
In the technical solution provided by this embodiment, the server performs liveness detection on the face information by running blink detection on the face information in continuously acquired images, and performs liveness detection on the audio information through a voice challenge; when both the face information and the audio information pass the liveness detection, it is determined that the object to be authenticated passes the liveness detection. This improves the accuracy of the liveness detection of the face information and the audio information, and thereby improves the security of identity authentication.
In addition, in order to achieve the above object, an embodiment of the present invention also provides an authentication server that performs living-body identity registration and living-body identity authentication based on face and voice. The authentication server includes: a face recognition module, a face liveness detection module, a voiceprint recognition module, a voice liveness detection module, a user determination module, a face comparison module, a face judgment module, a voiceprint comparison module, a voiceprint judgment module, and a user information database.
The face recognition module is used to identify face feature information; it obtains face feature information by performing face recognition on the face photo, and the detailed implementation of face recognition is described in the first embodiment. The face liveness detection module is used to perform face liveness detection, specifically by blink detection; the detailed implementation of face liveness detection is described in the second embodiment. The voiceprint recognition module is used to identify voiceprint feature information; it obtains voiceprint feature information by performing voiceprint recognition on the voice data, and the detailed implementation of voiceprint recognition is described in the first embodiment. The voice liveness detection module is used to issue a voiceprint challenge and detect voice liveness: the user responds to the voiceprint challenge and the module calculates the voice liveness score deviation; the detailed implementation of voice liveness detection is described in the second embodiment. The user determination module is used to store the user's account information, face feature information and voiceprint feature information in the user information database of the authentication server when the face recognition result, the face liveness detection result, the voiceprint recognition result and the voice liveness detection result are all successful. The face comparison module is used to compare the face feature information with the user information database of the authentication server to obtain a face comparison result; the face comparison judges whether two faces belong to the same person by means of a face feature comparison algorithm and a preset face similarity threshold. The face judgment module is used to judge whether the face comparison result corresponds to a registered user: if the similarity between the face feature information analyzed from the face photo and the face feature information stored in the user information database is equal to or higher than the preset threshold, and the user account used to log in to the system is identical to the corresponding user account in the user information database, the object is determined to be a registered user; if the similarity is lower than the preset threshold, or the login account differs from the corresponding account in the user information database, the object to be authenticated is determined not to be a registered user. The voiceprint comparison module is used to compare the voiceprint feature information with the user information database of the authentication server to obtain a voiceprint comparison result; the voiceprint comparison judges whether two voices belong to the same person by means of a voiceprint feature comparison algorithm and a preset voiceprint similarity threshold. The voiceprint judgment module is used to judge whether the voiceprint comparison result corresponds to a registered user: if the similarity between the voiceprint feature information analyzed from the voice data and the voiceprint feature information stored in the user information database is equal to or higher than the preset threshold, and the login account is identical to the corresponding user account in the user information database, the voiceprint comparison result is judged to be a registered user; if the similarity is lower than the preset threshold, or the login account differs from the corresponding account in the user information database, it is determined not to be a registered user. The user information database is used to store the user's account information, face feature information and voiceprint feature information; the data can be stored in various ways, for example in a relational database.
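The face judgment module and the voiceprint judgment module described above apply the same rule: a registered user is recognized only when the similarity reaches the preset threshold and the login account matches the account stored with the compared template. A small sketch of that shared decision, with assumed names, follows.

```python
# Sketch of the judgment rule shared by the face and voiceprint judgment modules:
# registered user only if similarity >= preset threshold AND the login account matches
# the account stored with the compared template. Names and thresholds are assumptions.

def is_registered_user(similarity: float, threshold: float,
                       login_account: str, stored_account: str) -> bool:
    """Judgment rule used by both the face and the voiceprint judgment modules."""
    return similarity >= threshold and login_account == stored_account


# Example use: both modalities must independently report a registered user.
# face_ok = is_registered_user(face_similarity, FACE_THRESHOLD, account, stored_account)
# voice_ok = is_registered_user(voice_similarity, VOICE_THRESHOLD, account, stored_account)
```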
In addition, in order to achieve the above object, an embodiment of the present invention also provides a terminal for performing identity registration and identity authentication based on face and voice. The terminal includes: an input keyboard, a user account information receiving module, a user account information sending module, a camera, an image information receiving module, an image information sending module, a microphone, a voice data receiving module, and a voice data sending module.
The input keyboard is used by the user to input account information. The user account information receiving module is used to receive, through the input keyboard, the account information input by the user at registration or login; the account information of the user should at least include the user's account name, and it should be understood that it may also include other additional information of the user. The user account information sending module is used to send the account information input at registration or login to the authentication server. The camera is used to collect the image information of the user (the user to be authenticated or the user to be registered). The image information receiving module is used to receive the image information of the user through the camera. The image information sending module is used to send the image information to the authentication server. The microphone is used to record voice data. The voice data receiving module is used to receive the user's voice data through the microphone. The voice data sending module is used to send the voice data to the authentication server.
It should be understood that the terminal involved in the embodiments of the present invention may be a mobile phone, a PC or another terminal equipped with an input keyboard, a camera and a microphone, which is not specifically limited here.
In addition, to achieve the above object, an embodiment of the present invention also provides an identity authentication device based on face and voiceprint, the device comprising: a memory, a processor, and an identity authentication program based on face and voiceprint that is stored on the memory and executable on the processor, wherein the identity authentication program based on face and voiceprint, when executed by the processor, implements the steps of the identity authentication method based on face and voiceprint described in any of the above embodiments.
In addition, to achieve the above object, an embodiment of the present invention also provides a computer-readable storage medium, on which an identity authentication program based on face and voiceprint is stored, wherein the identity authentication program based on face and voiceprint, when executed by a processor, implements the steps of the identity authentication method based on face and voiceprint described in any of the above embodiments.
It should be noted that, in this document, the terms "include" and "comprise" and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or system that includes a series of elements includes not only those elements but also other elements that are not explicitly listed, or also includes elements inherent to such a process, method, article or system. In the absence of further restrictions, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article or system that includes that element.
The serial numbers of the above embodiments of the present invention are for description only and do not represent the relative merits of the embodiments.
Through the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be realized by means of software plus the necessary general-purpose hardware platform, and of course also by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product, which is stored in a storage medium as described above (such as ROM/RAM, magnetic disk or optical disk) and includes several instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, etc.) to execute the methods described in the embodiments of the present invention.
The above are only preferred embodiments of the present invention and are not intended to limit the scope of the invention. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present invention, applied directly or indirectly in other related technical fields, is likewise included in the scope of patent protection of the present invention.

Claims (10)

1. An identity authentication method based on face and voiceprint, characterized in that the identity authentication method based on face and voiceprint comprises the following steps:
when an identity authentication request is received, acquiring image information and sound information of an object to be authenticated;
extracting face information from the image information, and extracting voiceprint information from the sound information;
performing liveness detection on the object to be authenticated according to the face information and the sound information;
when the object to be authenticated passes the liveness detection, performing identity detection on the object to be authenticated according to the face information and the voiceprint information;
when the object to be authenticated passes the identity detection, determining that the identity authentication is successful.
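For orientation only, the following is a minimal Python sketch of the claim-1 flow. It is not part of the original disclosure; the helper callables (extract_face, extract_voiceprint, liveness_check, identity_check) are hypothetical placeholders for the sub-steps detailed in claims 2 to 8.

```python
from dataclasses import dataclass

@dataclass
class AuthRequest:
    image: bytes     # image information of the object to be authenticated
    audio: bytes     # sound information of the object to be authenticated
    account_id: str  # account information carried by the authentication request

def authenticate(request: AuthRequest,
                 extract_face, extract_voiceprint,
                 liveness_check, identity_check) -> bool:
    """Return True only if the object passes liveness detection first
    and identity detection second, as required by claim 1."""
    face_info = extract_face(request.image)           # cf. claim 7
    voiceprint = extract_voiceprint(request.audio)    # cf. claim 8
    if not liveness_check(face_info, request.audio):  # cf. claims 3 to 6
        return False
    return identity_check(face_info, voiceprint, request.account_id)  # cf. claim 2
```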
2. The identity authentication method based on face and voiceprint according to claim 1, characterized in that the step of performing identity detection on the object to be authenticated according to the face information and the voiceprint information comprises:
acquiring account information included in the identity authentication request;
determining pre-stored face information and pre-stored voiceprint information according to the account information;
judging whether the face information matches the pre-stored face information and whether the voiceprint information matches the pre-stored voiceprint information, wherein when the face information matches the pre-stored face information and the voiceprint information matches the pre-stored voiceprint information, it is determined that the object to be authenticated passes the identity detection.
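A possible reading of claim 2 in code, assuming the face and voiceprint are represented as embedding vectors and compared by cosine similarity against templates pre-stored for the account; the metric, the template store, and the thresholds are illustrative assumptions rather than features of the disclosure.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # small epsilon avoids division by zero on degenerate vectors
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def identity_check(face_vec: np.ndarray, voice_vec: np.ndarray,
                   account_id: str, template_store: dict,
                   face_threshold: float = 0.8,
                   voice_threshold: float = 0.8) -> bool:
    """Pass only when BOTH the face and the voiceprint match the templates
    pre-stored for the account carried by the authentication request."""
    stored_face, stored_voice = template_store[account_id]
    face_ok = cosine_similarity(face_vec, stored_face) >= face_threshold
    voice_ok = cosine_similarity(voice_vec, stored_voice) >= voice_threshold
    return face_ok and voice_ok
```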
3. The identity authentication method based on face and voiceprint according to claim 1, characterized in that the step of performing liveness detection on the object to be authenticated according to the face information and the sound information comprises:
performing liveness detection on the face information, and performing liveness detection on the sound information;
when both the face information and the sound information pass the liveness detection, determining that the object to be authenticated passes the liveness detection.
4. The identity authentication method based on face and voiceprint according to claim 3, characterized in that the step of performing liveness detection on the face information comprises:
continuously acquiring image information at preset time intervals within a preset duration;
obtaining the pixel sum of the sclera image in each piece of the image information;
judging, according to the pixel sums of the sclera images, whether a blink action exists in the continuously acquired image information;
when a blink action exists, determining that the face information passes the liveness detection.
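An illustrative sketch of the sclera-based blink check in claim 4. The claim only requires judging from the sclera pixel sums of the consecutively captured images whether a blink occurs; the dip-ratio heuristic below (the sclera area collapsing and then recovering) is an assumption.

```python
from typing import Sequence

def has_blink(sclera_pixel_sums: Sequence[float], dip_ratio: float = 0.4) -> bool:
    """Treat a frame whose sclera pixel sum falls below dip_ratio * max as
    'eye closed'; a blink is a closed frame with open frames on both sides."""
    if len(sclera_pixel_sums) < 3:
        return False
    baseline = max(sclera_pixel_sums)
    closed = [s < dip_ratio * baseline for s in sclera_pixel_sums]
    for j in range(1, len(closed) - 1):
        if closed[j] and not all(closed[:j]) and not all(closed[j + 1:]):
            return True
    return False

# usage sketch, with frames sampled at the preset interval within the preset duration:
# has_blink([1200, 1180, 300, 1150, 1210])  # -> True (dip and recovery)
```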
5. The identity authentication method based on face and voiceprint according to claim 3, characterized in that the step of performing liveness detection on the sound information comprises:
sending a voice liveness challenge instruction to the terminal, the voice liveness challenge instruction including a sample voice, so that the terminal plays the sample voice and collects and feeds back a challenge voice;
when the challenge voice fed back by the terminal is received, calculating a deviation between the challenge voice and the sample voice;
when the deviation is less than or equal to a preset deviation threshold, determining that the sound information passes the liveness detection.
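The challenge-response flow of claim 5 could look roughly as follows; send_to_terminal, receive_from_terminal, and compute_deviation are hypothetical callables (the disclosure does not prescribe a transport), and the threshold value is illustrative.

```python
def voice_liveness_check(sample_voice: bytes,
                         send_to_terminal, receive_from_terminal,
                         compute_deviation,
                         deviation_threshold: float = 0.2) -> bool:
    # 1. send the voice liveness challenge instruction, carrying the sample voice
    send_to_terminal({"type": "voice_liveness_challenge", "sample": sample_voice})
    # 2. the terminal plays the sample, records the spoken response, and feeds it back
    challenge_voice = receive_from_terminal()
    # 3. compare the fed-back challenge voice with the sample voice (cf. claim 6)
    deviation = compute_deviation(challenge_voice, sample_voice)
    # 4. pass liveness only if the deviation stays within the preset threshold
    return deviation <= deviation_threshold
```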
6. The identity authentication method based on face and voiceprint according to claim 5, characterized in that the step of calculating the deviation between the challenge voice and the sample voice comprises:
respectively obtaining scoring parameters of the challenge voice and the sample voice, wherein the scoring parameters include at least one of pause duration, speech rate, volume, pronunciation stress, and pronunciation duration;
calculating the deviation of each scoring parameter between the challenge voice and the sample voice;
obtaining a weight value corresponding to each scoring parameter, and weighting the deviation of each scoring parameter according to the weight value to obtain the deviation between the challenge voice and the sample voice.
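A sketch of the weighted deviation in claim 6, assuming each scoring parameter has been reduced to a single measured value and the per-parameter deviations are relative deviations; the relative-deviation formula and the example weights are assumptions, not taken from the disclosure.

```python
def challenge_deviation(challenge: dict, sample: dict, weights: dict) -> float:
    """challenge and sample map parameter names (e.g. 'pause_duration',
    'speech_rate', 'volume', 'stress', 'duration') to measured values;
    weights maps the same names to weighting factors."""
    total = 0.0
    for name, weight in weights.items():
        reference = sample[name]
        deviation = abs(challenge[name] - reference) / (abs(reference) + 1e-9)
        total += weight * deviation
    return total

# usage sketch:
# dev = challenge_deviation(measured, expected,
#                           {"speech_rate": 0.4, "pause_duration": 0.3, "volume": 0.3})
# alive = dev <= 0.2  # preset deviation threshold (illustrative)
```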
7. The identity authentication method based on face and voiceprint according to any one of claims 1 to 6, characterized in that the step of extracting the face information from the image information comprises:
obtaining image quality parameters of the face region in the image information, wherein the image quality parameters include a face inclination angle, brightness, and noise;
when the quality parameters are within a preset parameter range, extracting the face information from the image information.
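The quality gate of claim 7 might be expressed as below; the concrete ranges for tilt, brightness, and noise are placeholders, since the claim only states that the parameters must fall within preset ranges before the face information is extracted.

```python
from dataclasses import dataclass

@dataclass
class FaceQuality:
    tilt_deg: float    # face inclination angle in degrees
    brightness: float  # mean brightness of the face region, e.g. on a 0-255 scale
    noise: float       # noise estimate, e.g. local standard deviation

def quality_ok(q: FaceQuality,
               max_tilt: float = 20.0,
               brightness_range: tuple = (60.0, 200.0),
               max_noise: float = 15.0) -> bool:
    """Extract the face only when every quality parameter is in range."""
    return (abs(q.tilt_deg) <= max_tilt
            and brightness_range[0] <= q.brightness <= brightness_range[1]
            and q.noise <= max_noise)
```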
8. The identity authentication method based on face and voiceprint according to any one of claims 1 to 6, characterized in that the step of extracting the voiceprint information from the sound information comprises:
performing endpoint detection on the sound information to obtain effective speech in the sound information;
successively performing quantization encoding, high-frequency reinforcement, and framing and windowing on the effective speech;
extracting the voiceprint information from the processed effective speech.
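A rough Python/NumPy sketch of the claim-8 preprocessing chain on a mono PCM signal: energy-based endpoint detection, pre-emphasis as the high-frequency reinforcement, then framing with a Hamming window. Frame sizes and thresholds are illustrative, quantization encoding is assumed to have happened at capture (e.g. 16-bit PCM), and the final voiceprint feature extraction (e.g. MFCCs) is not shown.

```python
import numpy as np

def endpoint_detect(signal: np.ndarray, frame_len: int = 400,
                    energy_ratio: float = 0.05) -> np.ndarray:
    """Keep only frames whose short-time energy exceeds a fraction of the maximum."""
    n_frames = len(signal) // frame_len
    if n_frames == 0:
        return signal.copy()
    frames = signal[:n_frames * frame_len].reshape(n_frames, frame_len)
    energy = (frames ** 2).sum(axis=1)
    return frames[energy > energy_ratio * energy.max()].reshape(-1)

def pre_emphasis(signal: np.ndarray, alpha: float = 0.97) -> np.ndarray:
    """Reinforce high frequencies: y[n] = x[n] - alpha * x[n-1]."""
    return np.append(signal[0], signal[1:] - alpha * signal[:-1])

def frame_and_window(signal: np.ndarray, frame_len: int = 400,
                     hop: int = 160) -> np.ndarray:
    """Split into overlapping frames and apply a Hamming window to each."""
    if len(signal) < frame_len:
        signal = np.pad(signal, (0, frame_len - len(signal)))
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hamming(frame_len)
    return np.stack([signal[i * hop:i * hop + frame_len] * window
                     for i in range(n_frames)])

# usage sketch:
# speech = endpoint_detect(raw_signal)
# frames = frame_and_window(pre_emphasis(speech))
```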
9. An identity authentication device based on face and voiceprint, characterized in that the identity authentication device based on face and voiceprint comprises: a memory, a processor, and an identity authentication program based on face and voiceprint that is stored in the memory and executable on the processor, wherein the identity authentication program based on face and voiceprint, when executed by the processor, implements the steps of the identity authentication method based on face and voiceprint according to any one of claims 1 to 8.
10. A computer-readable storage medium, characterized in that an identity authentication program based on face and voiceprint is stored on the computer-readable storage medium, and the identity authentication program based on face and voiceprint, when executed by a processor, implements the steps of the identity authentication method based on face and voiceprint according to any one of claims 1 to 8.
CN201910337540.1A 2019-04-24 2019-04-24 Identity identifying method, device and storage medium based on face and vocal print Pending CN110110513A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910337540.1A CN110110513A (en) 2019-04-24 2019-04-24 Identity identifying method, device and storage medium based on face and vocal print

Publications (1)

Publication Number Publication Date
CN110110513A true CN110110513A (en) 2019-08-09

Family

ID=67486577

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910337540.1A Pending CN110110513A (en) 2019-04-24 2019-04-24 Identity identifying method, device and storage medium based on face and vocal print

Country Status (1)

Country Link
CN (1) CN110110513A (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6907134B1 (en) * 1999-03-18 2005-06-14 Omron Corporation Personal identification device and method
CN101317763A (en) * 2002-10-15 2008-12-10 沃尔沃技术公司 Method and arrangement for interpreting a subjects head and eye activity
CN102413101A (en) * 2010-09-25 2012-04-11 盛乐信息技术(上海)有限公司 Voice-print authentication system having voice-print password voice prompting function and realization method thereof
US20130127591A1 (en) * 2011-11-20 2013-05-23 International Business Machines Corporation Secure facilities access
CN105426723A (en) * 2015-11-20 2016-03-23 北京得意音通技术有限责任公司 Voiceprint identification, face identification and synchronous in-vivo detection-based identity authentication method and system
CN105718874A (en) * 2016-01-18 2016-06-29 北京天诚盛业科技有限公司 Method and device of in-vivo detection and authentication
CN107404381A (en) * 2016-05-19 2017-11-28 阿里巴巴集团控股有限公司 A kind of identity identifying method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
丁莹 (Ding Ying): "Research on Liveness Detection Methods in Face Recognition (人脸识别中活体检测方法研究)", China Master's Theses Full-text Database, Information Science and Technology Series *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110738998A (en) * 2019-09-11 2020-01-31 深圳壹账通智能科技有限公司 Voice-based personal credit evaluation method, device, terminal and storage medium
CN112633201A (en) * 2020-12-29 2021-04-09 交通银行股份有限公司 Multi-mode in-vivo detection method and device, computer equipment and storage medium
CN113506018A (en) * 2021-07-26 2021-10-15 中国工商银行股份有限公司 Online job processing method, device and system
CN113641980A (en) * 2021-08-23 2021-11-12 北京百度网讯科技有限公司 Authentication method and apparatus, electronic device, and medium

Similar Documents

Publication Publication Date Title
WO2018113526A1 (en) Face recognition and voiceprint recognition-based interactive authentication system and method
CN110110513A (en) Identity identifying method, device and storage medium based on face and vocal print
TWI706268B (en) Identity authentication method and device
CN104834849B (en) Dual-factor identity authentication method and system based on Application on Voiceprint Recognition and recognition of face
US11727942B2 (en) Age compensation in biometric systems using time-interval, gender and age
CN109769099B (en) Method and device for detecting abnormality of call person
CN104361276B (en) A kind of multi-modal biological characteristic identity identifying method and system
US6810480B1 (en) Verification of identity and continued presence of computer users
JP2023511104A (en) A Robust Spoofing Detection System Using Deep Residual Neural Networks
CN105426723A (en) Voiceprint identification, face identification and synchronous in-vivo detection-based identity authentication method and system
JP2003132023A (en) Personal authentication method, personal authentication device and personal authentication system
AU2009290150A1 (en) Voice authentication system and methods
CN111881726A (en) Living body detection method and device and storage medium
CN111611568A (en) Face voiceprint rechecking terminal and identity authentication method thereof
CN109389028A (en) Face identification method, device, equipment and storage medium based on motion analysis
CN110310645A (en) Sound control method, device and the storage medium of intelligence control system
US20230306792A1 (en) Spoof Detection Based on Challenge Response Analysis
US10936705B2 (en) Authentication method, electronic device, and computer-readable program medium
CN111611437A (en) Method and device for preventing face voiceprint verification and replacement attack
CN110516426A (en) Identity identifying method, certification terminal, device and readable storage medium storing program for executing
Bigun et al. Combining biometric evidence for person authentication
CN108550368A (en) A kind of processing method of voice data
Chetty et al. Multimedia sensor fusion for retrieving identity in biometric access control systems
WO2016058540A1 (en) Identity authentication method and apparatus and storage medium
EP3572961B1 (en) Method and system for continuous verification of user identity in an online service using multi-biometric data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (Application publication date: 20190809)