CN105989264A - Bioassay method and bioassay system for biological characteristics - Google Patents

Publication number
CN105989264A
CN105989264A (application CN201510053281.1A)
Authority
CN
China
Prior art keywords
action
sequence
fit
goodness
response
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510053281.1A
Other languages
Chinese (zh)
Other versions
CN105989264B (en)
Inventor
邓琼 (Deng Qiong)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Keaosen Data Technology Co Ltd
Original Assignee
Beijing Keaosen Data Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Keaosen Data Technology Co Ltd filed Critical Beijing Keaosen Data Technology Co Ltd
Priority to CN201510053281.1A priority Critical patent/CN105989264B/en
Publication of CN105989264A publication Critical patent/CN105989264A/en
Application granted granted Critical
Publication of CN105989264B publication Critical patent/CN105989264B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)
  • Collating Specific Patterns (AREA)

Abstract

The invention provides a liveness detection method and system for biometric features, capable of rejecting spoofing attacks such as face photographs, face videos and audio recordings. The method includes: generating random action and voice sequence instructions; presenting the random action sequence instructions to the user in visual and auditory form, while simultaneously capturing the user's face video and voice data and feeding them back to the user in real time; and analyzing how well the captured data matches the random action sequence instructions, so as to judge whether the user is a live person. With this scheme, action and voice instructions are generated randomly, so that prepared face-photo, video or voice attacks can be prevented; presenting the random action sequence instructions audio-visually helps the user understand them; feeding back and displaying the user's video and voice synchronously effectively guides the user to perform the corresponding actions and utterances; authentication security is thereby improved, together with product usability and user experience.

Description

Biometric liveness detection method and system
Technical field
The present invention relates to the field of automated image analysis and biometric identification technology, and in particular to a biometric liveness detection method and system.
Background technology
Biometric recognition has important applications in authentication and authorization; for example, using face recognition for identity verification in mobile payment can strengthen identity security in Internet and mobile-Internet services. However, face recognition systems are vulnerable to attacks with forged faces, endangering information and identity security. For example, an attacker may obtain a facial image of the account owner by some means, produce a forgery such as a photograph, a video or a mask, and present it to the recognition system in place of the real face in order to gain unauthorized access.
Current techniques for distinguishing real persons from forged faces fall mainly into two classes. The first class is texture-based: by capturing rich detail of human skin texture and analyzing its high-frequency components, real faces can be distinguished from forgeries. The second class is based on motion-pattern analysis: a person generally appears against some background, and when the person moves, the background does not move accordingly; for a real person, the motion of the face region is relatively independent of the motion of the background, so analyzing the relative motion patterns of the face region and the background region can distinguish real persons from forgeries.
However, the inventor found during product development and use that, with continual improvements in photographic capture and printing technology, skin photographs of ever higher definition containing rich texture detail can now be obtained, substantially reducing the reliability of the first, texture-based class of methods; and the second class of methods generally cannot defeat video-replay attacks. There is thus still a lack of face recognition methods and systems that can effectively identify real persons and resist attacks with forged features.
Summary of the invention
To overcome the above problems of existing biometric anti-spoofing techniques, the present invention provides a biometric liveness detection method and system that combine human–computer interaction with pattern recognition, in order to reject spoofing attacks such as face photographs, face videos and audio recordings, and to improve the security of biometric recognition.
The biometric liveness detection method and system provided by the present application are detailed as follows.
According to a first aspect of the embodiments of the present application, a biometric liveness detection method is provided, including:
generating a random action sequence instruction;
converting the code of the random action sequence instruction into text, visual and/or auditory encodings, and presenting it visually, audibly, or in a combination of the two;
capturing a response image sequence of the user;
synchronously presenting the response image sequence and the random action sequence instruction visually;
analyzing the user's response action sequence in the response image sequence;
judging whether the response action sequence matches the action sequence corresponding to the random action sequence instruction; if it matches, the response action sequence is judged to come from a live person.
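The claimed steps reduce to a short challenge–response loop. The following is a minimal sketch of that loop; all identifiers (`ACTIONS`, `generate_instruction`, `is_live`) are illustrative placeholders, not names from the patent, and the matching criterion shown (exact sequence equality) is deliberately simpler than the goodness-of-fit comparison described later.

```python
import random

# Hypothetical candidate action set; the patent's examples include head
# turns, nodding, opening the mouth and blinking.
ACTIONS = ["turn_left", "turn_right", "nod", "open_mouth", "blink"]

def generate_instruction(n=4):
    """Step 1: generate a random action sequence instruction of n actions."""
    return [random.choice(ACTIONS) for _ in range(n)]

def is_live(instruction, observed_actions):
    """Final step: the recognized response sequence must match the
    instructed sequence for the user to be judged a live person."""
    return observed_actions == instruction

instr = ["turn_left", "nod", "open_mouth", "blink"]   # one possible instruction
assert is_live(instr, list(instr))                    # compliant live user
# A prepared video cannot anticipate a fresh random prompt:
assert not is_live(instr, ["blink", "nod", "open_mouth", "turn_left"])
```

Because the instruction is freshly randomized each time, a pre-recorded photo or video almost surely fails the match.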
The biometric liveness detection method may further include:
generating a random voice sequence instruction;
converting the code of the random voice sequence instruction into text, visual and/or auditory encodings, and presenting it visually, audibly, or in a combination of the two;
capturing the user's response voice sequence;
analyzing the user's response voice in the response voice sequence;
judging whether the response voice matches the voice sequence corresponding to the random voice sequence instruction; if it matches, the response voice sequence is judged to come from a live person.
In the method, a timestamp is assigned to each action in the random action sequence instruction; the timestamp identifies either the duration of each action, or the start time and end time of each action, and is generated randomly.
Analyzing the user's response action sequence in the response image sequence includes:
detecting the face in each image of the response image sequence;
locating key points on each face;
computing the head pose rotation angle from the located face key points;
computing the facial expression type from the located face key points;
obtaining the user's response action sequence from the head pose rotation angle and the facial expression type;
comparing the response action sequence with the action sequence corresponding to the random action sequence instruction, and computing an action-type goodness of fit;
comparing the action-type goodness of fit with a first preset threshold: if it exceeds the first preset threshold, the response action sequence is judged to come from a live person; otherwise it is not.
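The patent does not fix a formula for the action-type goodness of fit; one natural reading — the fraction of instructed action types that the recognized response reproduces in order — can be sketched as follows. The threshold value is an assumption for illustration only.

```python
def action_type_fit(instructed, observed):
    """One possible action-type goodness of fit: the fraction of positions
    where the recognized action type equals the instructed one.
    (Illustrative definition; the patent does not specify the formula.)"""
    if not instructed:
        return 0.0
    hits = sum(a == b for a, b in zip(instructed, observed))
    return hits / len(instructed)

FIRST_THRESHOLD = 0.8  # hypothetical "first preset threshold"

instructed = ["turn_left", "nod", "open_mouth", "blink"]
observed   = ["turn_left", "nod", "open_mouth", "turn_right"]  # last action wrong
fit = action_type_fit(instructed, observed)
assert abs(fit - 0.75) < 1e-9
assert not fit > FIRST_THRESHOLD   # 0.75 <= 0.8, so rejected as non-live
```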
Analyzing the user's response action sequence in the response image sequence may further include:
for each action in the response action sequence, computing the duration of that action;
comparing the computed duration of each action with the action's timestamp, and computing an action-time goodness of fit;
computing an overall action goodness of fit = action-type goodness of fit + w × action-time goodness of fit, where w is a weight;
comparing the overall action goodness of fit with a second preset threshold: if it exceeds the second preset threshold, the response action sequence is judged to come from a live person; otherwise it is not.
The biometric liveness detection method may further include:
recognizing the content of the response voice sequence;
computing a voice-content goodness of fit for the response voice sequence;
computing an overall goodness of fit = action-type goodness of fit + w1 × action-time goodness of fit + w2 × voice-content goodness of fit, where w1 and w2 are weights;
comparing the overall goodness of fit with a third preset threshold: if it exceeds the third preset threshold, the response action sequence is judged to come from a live person; otherwise it is not.
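The three-term fusion above is a simple weighted sum. A sketch, with the weights and the third threshold chosen purely for illustration (the patent leaves their values to the security-level setting):

```python
def overall_fit(type_fit, time_fit, voice_fit, w1=0.5, w2=0.5):
    """Overall goodness of fit = type fit + w1*time fit + w2*voice fit,
    per the description; w1, w2 are illustrative weight values."""
    return type_fit + w1 * time_fit + w2 * voice_fit

THIRD_THRESHOLD = 1.5  # hypothetical "third preset threshold"

score = overall_fit(0.9, 0.8, 0.7)   # 0.9 + 0.5*0.8 + 0.5*0.7 = 1.65
assert abs(score - 1.65) < 1e-9
assert score > THIRD_THRESHOLD       # judged to come from a live person
```

Raising the threshold (or the instruction complexity) trades convenience for security, as the security-level remark below indicates.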
The complexity of the random action sequence instruction and the magnitudes of the first, second and third preset thresholds are set according to a security level.
Corresponding to the first aspect, according to a second aspect of the embodiments of the present application, a biometric liveness detection system is provided, including:
an action sequence instruction generating unit, for generating a random action sequence instruction;
an action instruction presentation unit, including a display and a speaker, for first converting the code of the random action sequence instruction into text, visual and/or auditory encodings, and presenting it visually, audibly, or in a combination of the two;
wherein the display shows the text and/or visually encoded pictures of the random action sequence instruction, and the speaker plays the text and/or auditorily encoded sound of the random action sequence instruction;
an image capture unit, for capturing the user's response face image sequence;
a response action presentation unit, for synchronously presenting the response image sequence and the random action sequence instruction visually;
an action analysis unit, for analyzing the user's response action sequence in the response face image sequence;
an action goodness-of-fit judging unit, for judging whether the response action sequence matches the action sequence corresponding to the random action sequence instruction; if it matches, the response action sequence is judged to come from a live person.
The biometric liveness detection system may further include:
a voice instruction generating unit, for generating a random voice instruction;
a voice instruction presentation unit, including a display and a speaker, for first converting the code of the random voice sequence instruction into text, visual and/or auditory encodings, and presenting it visually, audibly, or in a combination of the two;
a voice capture unit, for capturing the user's response voice sequence;
a voice analysis unit, for analyzing the user's response voice in the response voice sequence;
a voice goodness-of-fit judging unit, for judging whether the response voice matches the voice sequence corresponding to the random voice sequence instruction; if it matches, the response voice sequence is judged to come from a live person.
The random action sequence instruction generating unit assigns a timestamp to each action in the random action sequence instruction; the timestamp identifies either the duration of each action, or the start time and end time of each action, and is generated randomly.
The action analysis unit includes:
a face detection subunit, for detecting the face in each image of the response action sequence;
a key-point locating subunit, for locating key points on each face;
a head-pose-angle computing subunit, for computing the head pose rotation angle from the located face key points;
a facial-expression-type computing subunit, for computing the facial expression type from the located face key points;
an action sequence recognition subunit, for obtaining the response action sequence from the head pose rotation angle and the facial expression type.
The action goodness-of-fit judging unit includes:
an action-type goodness-of-fit computing subunit, for comparing the response action sequence with the action sequence corresponding to the action sequence instruction and computing an action-type goodness of fit;
a first judging subunit, for comparing the action-type goodness of fit with a first preset threshold: if it exceeds the first preset threshold, the user's response action types match the random action sequence instruction, and the response action sequence is judged to come from a live person; otherwise it is not.
The action analysis unit may further include:
an action-time computing subunit, for computing, for each action in the response action sequence, the duration of that action;
an action-time goodness-of-fit computing subunit, for comparing the computed duration of each action with the action's timestamp and computing an action-time goodness of fit;
an overall-action goodness-of-fit computing subunit, for computing the overall action goodness of fit = action-type goodness of fit + w × action-time goodness of fit, where w is a weight;
a second judging subunit, for comparing the overall action goodness of fit with a second preset threshold: if it exceeds the second preset threshold, the user's response actions match the action sequence corresponding to the random action sequence instruction, and the response action sequence is judged to come from a live person; otherwise it is not.
The biometric liveness detection system may further include:
a voice analysis unit, for recognizing the content of the response voice sequence;
a voice goodness-of-fit computing unit, for computing the voice-content goodness of fit of the response voice sequence;
an overall goodness-of-fit computing unit, for computing the overall goodness of fit = action-type goodness of fit + w1 × action-time goodness of fit + w2 × voice-content goodness of fit, where w1 and w2 are weights;
a third judging unit, for comparing the overall goodness of fit with a third preset threshold: if it exceeds the third preset threshold, the response action sequence is judged to come from a live person; otherwise it is not.
The complexity of the random action sequence instruction and the magnitudes of the first, second and third preset thresholds are set according to a security level.
The technical scheme provided by the embodiments of the present application can have the following beneficial effects: action and voice instructions are generated randomly, making it difficult to attack with prepared face photographs, videos or voice material; presenting the random action sequence instruction audio-visually effectively helps the user understand the instruction; feeding back and presenting the user's actions and voice synchronously effectively guides the user to perform the corresponding actions and utterances; authentication security is thereby improved, together with product usability and user experience.
It should be understood that the general description above and the detailed description below are merely exemplary and explanatory, and do not limit the application.
Accompanying drawing explanation
To explain the embodiments of the present application or the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Evidently, a person of ordinary skill in the art could obtain further drawings from these drawings without inventive effort.
Fig. 1 is a flow diagram of a biometric liveness detection method according to an exemplary embodiment of the present application.
Fig. 2 is a structural diagram of a biometric liveness detection system according to an exemplary embodiment of the present application.
Fig. 3 is a schematic view of the visual presentation of the random action sequence instruction and the random voice sequence instruction by the biometric liveness detection system.
Fig. 4 is a schematic view of the visual presentation of the random action sequence instruction when the display is in portrait orientation.
Fig. 5 is a schematic view of the visual presentation of the random action sequence instruction when the display is in landscape orientation.
Fig. 6 is a schematic view of the simultaneous visual presentation of the text of the random action sequence instruction and the response action sequence.
Fig. 7 is a schematic view of the simultaneous display of the random voice sequence instruction and the random action sequence instruction together with the captured user response action sequence.
Detailed description of the embodiments
Exemplary embodiments are described in detail here, with examples shown in the accompanying drawings. Where the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present application; rather, they are merely examples of apparatuses and methods consistent with some aspects of the application as detailed in the appended claims.
To provide a thorough understanding of the application, numerous specific details are given in the following description; however, those skilled in the art should understand that the application can be practiced without these details. In other embodiments, well-known methods, processes, components and circuits are not described in detail, so as not to obscure the embodiments unnecessarily.
Fig. 1 is a flow diagram of a biometric liveness detection method according to an exemplary embodiment of the present application. As shown in Fig. 1, the method includes:
Step S101: generating a random action sequence instruction.
The random action sequence instruction tells the user what actions to perform. It is described by action types, and may further assign a timestamp to each action, the timestamp identifying either the duration of each action or the action's start and end times. A timestamp that identifies a duration is a relative timestamp, characterizing how long each action type in the instruction lasts; timestamps that identify start and end times are absolute timestamps, from which each action's duration can be computed as its end time minus its start time. Both relative and absolute timestamps can be generated randomly, so that the randomness of the action sequence instruction is higher; when absolute timestamps are generated randomly, the start time of a later action must be greater than the end time of the previous action. The action sequence instruction may be a single action instruction or a combination of several. Generating the random action sequence instruction may include:
(a1) Randomly determine an action count N, e.g. N = 4.
(a2) Randomly select N action types from a candidate action-type set and combine them; the order of the N action types in the combination is random, e.g. 4 randomly generated action types (turn head left 30° → turn right 10° → turn left 20° → turn right 40°), or (turn head left 30° → straighten to 0° → open mouth → close mouth).
(a3) Randomly assign a duration and/or an action count to each action type, the duration being how long each action type lasts. The duration can be added to the action sequence instruction as a relative or absolute timestamp, i.e. each action type description carries a timestamp identifying its duration; adding durations in timestamp form makes them easy to associate with their action types and to separate the individual action types, and is consistent and not error-prone. When absolute timestamps are assigned, the start time of each action equals the end time of the previous action.
An action count may be specified for each action type; the default count is 1. If only a time is specified and the user performs the action several times within it, subsequent recognition only checks whether the action was performed within the specified time, without distinguishing how many times. Alternatively, the count may be specified without a duration; or both duration and count may be specified, with each action type's duration and count being the same or different; or durations may be specified for some action types and not others, counts for some and not others, and so on. For example, with durations and counts specified as: shake head left-to-right 2 times, duration 2 seconds; shake head up-and-down 3 times, duration 3 seconds; open mouth 2 times, duration 1 second; close eyes 3 times, duration 2 seconds, the corresponding action sequence instruction is: (shake head left-to-right 2 times, duration 2 s) → (shake head up-and-down 3 times, duration 3 s) → (open mouth 2 times, duration 1 s) → (close eyes 3 times, duration 2 s), with the durations given as relative timestamps. When the times are given as absolute timestamps, the corresponding instruction can be: (from second 0 to second 2, shake head left-to-right 2 times) → (from second 2 to second 5, shake head up-and-down 3 times) → (from second 5 to second 6, open mouth 2 times) → (from second 6 to second 8, close eyes 3 times).
The random action sequence instruction can be given different complexity according to the security level; for example, when the security level is high, the action count N in the sequence is increased.
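Steps (a1)–(a3) above can be sketched as follows. The action names, duration choices and count range are assumptions for illustration; the sketch generates absolute timestamps, so each action's start equals the previous action's end, as the text requires.

```python
import random

# Hypothetical candidate action types, loosely matching the examples above.
ACTION_TYPES = ["shake_head_lr", "shake_head_ud", "open_mouth", "close_eyes",
                "turn_left_30", "turn_right_40"]

def generate_action_sequence(seed=None):
    """(a1) random action count N; (a2) random action types in random order;
    (a3) random duration per action, recorded as absolute timestamps."""
    rng = random.Random(seed)
    n = rng.randint(3, 5)                       # (a1) random count
    seq, t = [], 0.0
    for _ in range(n):
        action = rng.choice(ACTION_TYPES)       # (a2) random type
        duration = rng.choice([1.0, 2.0, 3.0])  # (a3) random duration
        seq.append({"action": action, "start": t, "end": t + duration})
        t += duration                           # next action starts here
    return seq

seq = generate_action_sequence(seed=42)
assert 3 <= len(seq) <= 5
for prev, cur in zip(seq, seq[1:]):
    assert cur["start"] == prev["end"]   # absolute timestamps are contiguous
```

Raising `n`'s upper bound would implement the higher-complexity instruction for a higher security level.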
Step S102: converting the code of the random action sequence instruction into text, visual and/or auditory encodings, and presenting it visually, audibly, or in a combination of the two.
Specifically, the code of the random action sequence instruction is converted into text and/or a visual encoding, rendered as the text, image or animation of actions such as "open mouth", "close mouth", "turn head left", "lower head" or "blink", and presented visually to the user on the display; or the code is converted into text and/or an auditory encoding, i.e. converted into text and then into speech by a TTS (Text To Speech) engine, with the speaker announcing, for example, "open mouth", "close mouth", "turn head left", "lower head" or "blink"; or it is presented to the user in a combination of the visual and auditory modes. Audio-visual presentation and prompting help the user understand the instruction, so that the corresponding action can be performed promptly in response to the prompt.
Step S103: capturing the user's response image sequence.
A camera or other image/video capture device can be used to film the user's actions, collecting a response face image sequence in which each image is a captured video frame.
Step S104: synchronously presenting the response image sequence and the random action sequence instruction visually.
Synchronous visual presentation means displaying the captured response images on the screen together with the random action sequence instruction, feeding back to the user in time so that the user can adjust their actions to conform to the random action sequence instruction.
Step S105: analyzing the user's response action sequence in the response image sequence.
Step S105 includes:
(a1) detecting the face in each image of the response action sequence;
(a2) locating key points on each face;
(a3) computing the head pose rotation angle from the located face key points;
(a4) computing the facial expression type from the located face key points;
(a5) computing the response action sequence from the head pose and expression type;
(a6) comparing the computed response action sequence with the action sequence corresponding to the random action sequence instruction, and computing the action-type goodness of fit.
Face detection is performed on each image in the response action sequence; a face detector based on local features and Adaboost learning can be used, as can a detector obtained by neural-network training. If a face is detected, the subsequent steps continue; if no face is detected, the image is skipped; if no face is detected in any image, the whole process terminates, and the user can be prompted visually or audibly to start again.
After a face is detected in an image, key points are located on it: for each face image, a corresponding preset set of key points is chosen, e.g. 68 key points, whose coordinates allow the detailed contour of the face to be traced. The pose and expression class of the face are computed on the basis of the key points.
In another possible embodiment, a feature-based estimation method can be used to obtain the head pose rotation angle and facial expression type: face image data under a large number of different poses and expressions are collected in advance, appearance features are extracted from the face images, and a pose estimation classifier is trained by means such as SVM or regression; the trained classifier is then used to estimate the pose and expression of a face image. For example, Gabor feature extraction or LBP (Local Binary Patterns) feature extraction can be applied to the face image, and SVM (Support Vector Machine) training can produce pose and expression classifiers to perform pose estimation and expression classification of the face images.
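As a toy stand-in for the keypoint-based pose computation of step (a3), the yaw (left–right rotation) can be roughly inferred from how far the nose tip is offset from the midpoint between the eyes. This is a crude geometric sketch under the assumption of three located keypoints in image coordinates; it is not the patent's method, which leaves the pose computation unspecified.

```python
import math

def estimate_yaw(left_eye, right_eye, nose_tip):
    """Crude yaw estimate (degrees) from three keypoints (x, y):
    a nose tip offset from the eye midpoint suggests a turned head.
    Toy geometry only; real systems fit a 3D head model to many keypoints."""
    mid_x = (left_eye[0] + right_eye[0]) / 2.0
    eye_dist = right_eye[0] - left_eye[0]
    # offset normalised by inter-ocular distance, mapped to an angle
    offset = (nose_tip[0] - mid_x) / eye_dist
    return math.degrees(math.asin(max(-1.0, min(1.0, 2.0 * offset))))

# frontal face: nose tip centred between the eyes -> yaw ~ 0
assert abs(estimate_yaw((30, 40), (70, 40), (50, 60))) < 1e-9
# nose tip shifted toward the right eye -> positive yaw (head turned)
assert estimate_yaw((30, 40), (70, 40), (60, 60)) > 0
```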
After the head pose rotation angle and facial expression type corresponding to each face image are obtained, the response face image sequence is segmented according to them, in order to separate and recognize the human action corresponding to each action instruction and obtain the response action sequence. The segmentation can follow the timestamps of the action sequence instruction, or the action types in the instruction.
When segmenting according to the timestamps of the action sequence instruction, the collected response facial image sequence is cut according to the movement times derived from the timestamps. When a timestamp is a relative timestamp, i.e. its value is the duration of the action, segmentation is performed according to the relative timestamps; when a timestamp is an absolute timestamp, segmentation is performed according to the start and end times of each action. For example, if the action sequence instruction is (shake the head from left to right 2 times, movement time 2 seconds) → (shake the head from top to bottom 3 times, movement time 3 seconds) → (open the mouth 2 times, movement time 1 second) → (close the eyes 3 times, movement time 2 seconds), the durations marked by the timestamps are 2, 3, 1 and 2 seconds respectively, and the response facial image sequence is cut into segments of 2, 3, 1 and 2 seconds. If the action sequence instruction is (from the 0th to the 2nd second, shake the head from left to right 2 times) → (from the 2nd to the 5th second, shake the head from top to bottom 3 times) → (from the 5th to the 6th second, open the mouth 2 times) → (from the 6th to the 8th second, close the eyes 3 times), the start and end times marked by the timestamps are the 0th to 2nd second, the 2nd to 5th second, the 5th to 6th second and the 6th to 8th second respectively, and the response facial image sequence is cut according to these start and end times. For each segment of the response facial image sequence obtained by the segmentation, the corresponding human action is recognized by combining the head pose angle and facial expression type detected for each image. Taking head shaking as an example, the head pose angles of the facial images obtained during the shaking differ from one another; the head pose angle data of each segment are combined, the motion features of the segment are extracted, and a conventional human action recognition algorithm is applied to obtain the action corresponding to each segment, and optionally also the number of repetitions of that action. The actions and repetition counts corresponding to the segments are then combined in their original temporal order to obtain the response action sequence.
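The timestamp-based slicing described above can be sketched as follows. The frame rate, the list-of-frames representation, and the helper name are illustrative assumptions, not part of the original disclosure.

```python
# Sketch: slice a response facial image sequence by the per-action durations
# ("relative timestamps") carried in a random action-sequence instruction.
# Frames are modeled as a flat list; fps is an assumed capture rate.

def slice_by_relative_timestamps(frames, durations, fps=10):
    """frames: per-frame data in temporal order; durations: seconds per action."""
    segments = []
    start = 0
    for d in durations:
        n = int(round(d * fps))              # frames covered by this action
        segments.append(frames[start:start + n])
        start += n
    return segments

# Example from the text: actions lasting 2 s, 3 s, 1 s and 2 s (8 s total).
frames = list(range(80))                     # 8 s of video at 10 fps
segs = slice_by_relative_timestamps(frames, [2, 3, 1, 2])
# segments hold 20, 30, 10 and 20 frames respectively
```

Each segment is then passed to the action recognizer independently, preserving the original temporal order.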
When segmenting according to the action types in the action sequence instruction — i.e. according to the head pose angle and facial expression type of each facial image, and according to the action types, repetition counts and order given in the instruction — action recognition is performed on the whole response facial image sequence type by type. For example, if the first action type in the instruction is head shaking with a repetition count of 2, the system checks whether a head shaking action, and its repetition count, can be recognized in the response facial image sequence. If the shaking action is recognized, the facial images corresponding to it are cut out of the sequence, the position of the cut-out segment within the whole sequence is recorded — for instance, that the cut-out facial images lie at the front of the response facial image sequence — and the repetition count of the recognized action, e.g. the number of head shakes, is recorded as well. The second action type in the instruction is then recognized in the remaining response facial image sequence. For example, if the second action type is nodding with a repetition count of 3, the system checks whether a nodding action and its repetition count can be recognized in the remainder; if the nodding action is recognized, the corresponding facial images are cut out, and both their temporal position within the whole sequence and their temporal relation to the first cut-out segment are kept — e.g. whether this segment lies at the front, middle or tail of the response facial image sequence, and whether it precedes or follows the segment cut out first — while the repetition count of this segment's action is also recorded. This continues until the last action type in the instruction has been processed. After the segmentation, the human actions and repetition counts corresponding to the cut-out segments are combined in their original temporal order to obtain the response action sequence. The action recognition applied to the response facial image sequence may use a conventional action recognition algorithm based on the head pose angle and facial expression type of each image. Note that when the response facial image sequence is segmented by action type, the action recognition is already performed during the segmentation; each segment may be recognized again after the segmentation to verify the result, or the recognition may be omitted and the response action sequence derived directly from the segmentation result.
When the response facial image sequence is segmented according to the timestamps or the action types of the action sequence instruction, a relative or absolute timestamp may be attached to each segment to identify the duration, or the start and end times, of the corresponding action. When segmenting by timestamp, each segment already has an explicit duration or start and end time, so no additional timestamp needs to be attached. When segmenting by action type, for each cut-out segment the times of its first and last facial images in temporal order are taken as the start and end times of the action corresponding to that segment, and a timestamp is attached to the corresponding human action in the response action sequence accordingly. Attaching timestamps to the response action sequence helps to separate the action types, helps to select the remaining response facial image sequence for recognizing the next human action during segmentation, and helps to compute the movement time of each action.
Regarding the segmentation of the response facial image sequence: segmenting by timestamp is simple, but requires the user to act in strict accordance with the timing. Since the duration of a human action is hard to control precisely, the user can only roughly meet the required duration — e.g. an instructed 2 s head shake may actually take 2.2 s — so timestamp-based segmentation may yield segments whose human actions are incomplete or contaminated by residual images of other actions, causing recognition errors. Segmenting by action type is more complex, but the resulting segments allow complete human actions to be recognized accurately.
In one possible embodiment, when the response facial image sequence is segmented by action type, if the first action type of the instruction cannot be recognized anywhere in the sequence, or if an action of that type is recognized but its corresponding segment does not lie at the front (first part) of the response facial image sequence, recognition can be declared failed immediately and the subsequent steps skipped. In this way, when the user's first action already fails to meet the requirement, or when a forged feature cannot perform the action at all, the current user is judged to be non-live and the subsequent flow terminates, blocking a potentially hostile attack quickly and economically.
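The type-driven segmentation with early failure can be sketched as below, assuming a hypothetical per-frame action classifier has already labeled each frame; the label strings and function name are illustrative.

```python
# Sketch: action-type-driven segmentation. Frames matching the currently
# expected instruction type are cut off the front of the remaining sequence;
# if the expected type is not found at the front, we fail early (non-live).

def slice_by_action_type(frame_labels, instructed_types):
    segments, rest = [], list(frame_labels)
    for expected in instructed_types:
        seg = []
        while rest and rest[0] == expected:
            seg.append(rest.pop(0))
        if not seg:                  # expected action missing at the front
            return None              # early failure -> judge non-live
        segments.append(seg)
    return segments

labels = ["shake"] * 4 + ["nod"] * 6 + ["mouth"] * 2
segs = slice_by_action_type(labels, ["shake", "nod", "mouth"])
```

Returning `None` as soon as the first instructed type is absent mirrors the early-termination embodiment above: no later step runs for a user whose first action already fails.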
The computed response action sequence is compared with the action sequence corresponding to the action sequence instruction to calculate an action-type goodness of fit: each segment of the response action sequence is compared with its corresponding action instruction in terms of action type and repetition count, and a weight is assigned to each segment according to the comparison result. For example, if the first action type in the action sequence instruction is head shaking with a repetition count of 3, and the first action in the response action sequence is head shaking repeated 3 times, the weight S1 of that first action may be set to 1; if it is head shaking but repeated only 2 times, S1 may be set to 0.7; and so on. The weights of all segments are summed to obtain the action-type goodness of fit.
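A minimal sketch of this scoring rule follows. The weight values 1.0 and 0.7 are the ones given in the text; the tuple representation of an action and the zero weight for a wrong type are assumptions.

```python
# Sketch of the action-type goodness of fit: per-action weights reflect how
# well the recognized (type, count) pair matches the instruction, then are
# summed. Actions are modeled as (type, repetition_count) tuples.

def action_weight(instructed, recognized):
    if recognized is None or recognized[0] != instructed[0]:
        return 0.0                   # wrong or missing action type (assumed)
    return 1.0 if recognized[1] == instructed[1] else 0.7

def type_goodness_of_fit(instruction, response):
    return sum(action_weight(i, r) for i, r in zip(instruction, response))

instruction = [("shake", 3), ("mouth", 2)]
response = [("shake", 3), ("mouth", 1)]
score = type_goodness_of_fit(instruction, response)   # 1.0 + 0.7 = 1.7
```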
When the random action sequence instruction is in the mode of action types plus relative timestamps, step S105 may further include:
(b1) for each action in the computed response action sequence, calculating the movement time of that action;
(b2) comparing the movement time of each action with the relative timestamp of that action, and calculating a movement-time goodness of fit;
(b3) calculating the overall action goodness of fit = action-type goodness of fit + w × movement-time goodness of fit, where w is a weight.
Specifically, the movement time corresponding to each segment of the obtained response action sequence is compared with the corresponding relative timestamp. According to the comparison result, a time weight is assigned to each segment of the response action sequence, and the time weights of all segments are summed to obtain the movement-time goodness of fit. The time weight of a segment may equal (1 − movement-time error), or equal (1 / movement-time error). For example, if the relative timestamp in the random action sequence instruction is t1 and the movement time of the corresponding segment in the response action sequence is t2, the movement-time error of this segment is |t2 − t1| / t2.
The overall action goodness of fit is then: action-type goodness of fit + w × movement-time goodness of fit, where w is a weight.
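The relative-timestamp scoring path can be sketched as below. Taking the time weight as (1 − movement-time error) is one of the two options the text offers; computing the error relative to the instructed duration, and clamping at zero, are assumptions.

```python
# Sketch of steps (b1)-(b3): per-action time weights are summed into a
# movement-time goodness of fit, then combined with the action-type goodness
# of fit as overall = type_fit + w * time_fit.

def time_weight(expected, actual):
    """1 - relative movement-time error, clamped to be non-negative."""
    return max(0.0, 1.0 - abs(actual - expected) / expected)

def overall_goodness(type_fit, expected_times, actual_times, w=0.5):
    time_fit = sum(time_weight(e, a)
                   for e, a in zip(expected_times, actual_times))
    return type_fit + w * time_fit

# An instructed 2 s shake actually took 2.2 s; the 1 s action was on time.
score = overall_goodness(1.7, [2.0, 1.0], [2.2, 1.0], w=0.5)
# time weights 0.9 and 1.0 -> overall 1.7 + 0.5 * 1.9 = 2.65
```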
When the random action sequence instruction is in the mode of action types plus absolute timestamps, step S105 may further include:
(c1) for each action in the computed response action sequence, calculating the movement time of that action;
(c2) comparing the movement time of each action with the absolute timestamp of that action, and calculating a movement-time goodness of fit;
(c3) calculating the overall action goodness of fit = action-type goodness of fit + w × movement-time goodness of fit, where w is a weight.
Specifically, the movement time corresponding to each segment of the obtained response action sequence is compared with the corresponding absolute timestamp. According to the comparison result, a time weight is assigned to each segment of the response action sequence, and the time weights of all segments are summed to obtain the movement-time goodness of fit. The time weight of a segment may equal (1 − movement-time error), or equal (1 / movement-time error). For example, if the absolute timestamp in the random action sequence instruction runs from the t1-th to the t2-th second, and the movement time of the corresponding segment in the response action sequence is T, the movement-time error of this segment is |T − (t2 − t1)| / T.
The overall action goodness of fit is then: action-type goodness of fit + w × movement-time goodness of fit, where w is a weight.
Additionally, when the random action sequence instruction is in the mode of action types plus absolute timestamps, an alternative scheme may include:
analyzing whether the user has performed the specified instruction action at the specified timestamps. For example, suppose the action instruction sequence is (from the 0th to the 2nd second, turn the head left to 30 degrees) → (from the 2nd to the 4th second, turn right to 10 degrees) → (from the 4th to the 5th second, turn left to 20 degrees) → (from the 5th to the 7th second, turn right to 40 degrees); the start times of the absolute timestamps are the 0th, 2nd, 4th and 5th second, and the end times are the 2nd, 4th, 5th and 7th second respectively. The system then tests whether the head is deflected 30 degrees to the left at the 2nd second, 10 degrees to the right at the 4th second, 20 degrees to the left at the 5th second, and 40 degrees to the right at the 7th second. If all four head actions match, the response action sequence is judged to come from a live person; otherwise it is not considered to come from a live person.
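This pose-at-timestamp check can be sketched as follows; the `pose_at` callback (measured yaw at a given time), the sign convention (left negative, right positive), and the angular tolerance are all assumptions added for illustration.

```python
# Sketch of the absolute-timestamp scheme: at each instructed end time the
# measured head yaw is compared with the instructed angle within a tolerance.

def check_pose_schedule(pose_at, schedule, tol_deg=5.0):
    """schedule: list of (end_time_s, expected_yaw_deg); left is negative."""
    return all(abs(pose_at(t) - yaw) <= tol_deg for t, yaw in schedule)

# Schedule from the text: left 30 deg by 2 s, right 10 deg by 4 s,
# left 20 deg by 5 s, right 40 deg by 7 s.
schedule = [(2, -30), (4, 10), (5, -20), (7, 40)]
measured = {2: -29, 4: 12, 5: -21, 7: 38}     # assumed measurements
ok = check_pose_schedule(lambda t: measured[t], schedule)   # True -> live
```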
Step S106: judging whether the response action sequence meets the action sequence corresponding to the random action sequence instruction; if it does, judging that the response action sequence comes from a live person.
Specifically, the action-type goodness of fit is compared with a first preset threshold. If the action-type goodness of fit exceeds the first preset threshold, the user's response action types in the response action sequence meet the action sequence instruction, and the response action sequence is judged to come from a live person; otherwise it is not considered to come from a live person. When the random action sequence instruction is in a mode of action types plus timestamps, the overall action goodness of fit may instead be compared with a second preset threshold: if it exceeds the second preset threshold, the user's response actions meet the action sequence corresponding to the instruction, and the response action sequence is judged to come from a live person; otherwise it is not.
The first and second preset thresholds may be set according to the required level of security; for example, for a high security level, the first and second preset thresholds are set to larger values.
The application is further illustrated below with an application case in a mobile-payment liveness-verification environment, so that those skilled in the art may better understand its principles and uses.
During mobile payment, to prevent forged features from passing identity verification by mistake, it must be determined whether the current user is a real person. To keep the case clear, the main steps of the application are described by example. Suppose that during the mobile payment process, the candidate instruction set of the liveness recognition system includes three common actions: {shake the head, open the mouth, blink}.
(1a) When the system starts liveness recognition, it randomly generates an action sequence instruction, e.g. "shake the head from left to right 3 times, movement time 6 seconds; open the mouth 2 times, movement time 1 second; blink 4 times, movement time 2 seconds", generates a schematic diagram of the action instruction in animated form, and presents it to the user.
(2a) Following the action instruction diagram, the user faces the camera and performs the corresponding actions as required; the system starts shooting and collects the response facial image sequence, and after the user has completed all the actions, shooting ends and the system stops collecting the response facial image sequence.
(3a) A pose-estimation classifier is obtained by Gabor feature extraction and SVM training, and is applied frame by frame to the collected response facial image sequence to estimate the pose of each image, including the states of the head, eyes, nose and mouth.
(4a) The response facial image sequence is cut, according to the timestamps of the action instruction, into three segments with durations of 6, 1 and 2 seconds. According to the pose of each facial image, the human action corresponding to each segment is recognized, yielding the response action sequence.
If the human action recognized from the first segment of the response facial image sequence is shaking the head from left to right 3 times, the weight S1a of the corresponding first action in the response action sequence (action type: shake the head from left to right; repetition count: 3) is set to 1; if the recognized action is shaking the head from left to right only 2 times, a smaller weight is set, and a smaller one still for a single shake; if no left-to-right head shaking is recognized from the first segment, S1a is set to 0.
If the human action recognized from the second segment of the response facial image sequence is opening the mouth 2 times, the weight S2a of the corresponding second action in the response action sequence is set to 1; if the mouth is opened only once, a smaller weight is set; if no mouth-opening action is recognized from the second segment, S2a is set to 0.
If the human action recognized from the third segment of the response facial image sequence is blinking 4 times, the weight S3a of the corresponding third action in the response action sequence is set to 1; progressively smaller weights are set for 3, 2 or 1 blinks; if no blinking action is recognized from the third segment, S3a is set to 0.
(5a) The action-type goodness of fit between the response action sequence and the random action sequence instruction is calculated as Sa = S1a + S2a + S3a and compared with the first preset threshold. If S1a = 1, S2a = 0.5 and S3a = 0.25, then the action-type goodness of fit Sa = 1.75; with a preset threshold of 2, the action-type goodness of fit is below the first preset threshold, recognition fails, the current user is judged non-live, and accordingly identity verification cannot pass.
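The arithmetic of step (5a) can be checked directly; the values are exactly those of the worked example above.

```python
# Step (5a) as executable arithmetic: with per-action weights
# S1a = 1, S2a = 0.5, S3a = 0.25 the action-type goodness of fit is 1.75,
# which is below the preset threshold of 2, so recognition fails.
weights = [1.0, 0.5, 0.25]
type_fit = sum(weights)          # 1.75
threshold = 2.0
is_live = type_fit > threshold   # False -> identity verification is rejected
```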
If the response action sequence is instead segmented by action type, step (4a) may be replaced with:
(4b) Recognition of the left-to-right head-shaking action is performed on the whole response facial image sequence. If the action is recognized and its corresponding response facial image segment lies at the front of the response facial image sequence, but the repetition count is fewer than three, the weight S1b of the first action in the response action sequence is set to a reduced value; if the repetition count reaches three, S1b is set to 1. The collection start time t0 of the response facial image sequence and the collection time t1 of the last facial image of the last left-to-right head shake are recorded, and the movement time of the first action in the response action sequence is calculated as t1 − t0. Other weight values may also be used, and different values may be set for different repetition counts.
The mouth-opening action is then recognized in the response facial image sequence after time t1. If fewer than 2 mouth openings are recognized, the weight S2b of the second action in the response action sequence is set to a reduced value; if the count reaches 2, S2b is set to 1. The collection time t2 of the last facial image of the last mouth opening is recorded, and t2 − t1 is taken as the movement time of the second action.
The blinking action is then recognized in the response facial image sequence after time t2. If fewer than 4 blinks are recognized, the weight S3b of the third action in the response action sequence is set to a reduced value; if the count reaches 4, S3b is set to 1. The collection time t3 of the last facial image of the last blink is recorded, and t3 − t2 is taken as the movement time of the third action.
Correspondingly, step (5a) may be replaced with:
(5b) The overall goodness of fit between the response action sequence and the random action sequence instruction is calculated as:
Sb = S1b + S2b + S3b + η ( |t1 − t0 − T1| / (t1 − t0) + |t2 − t1 − T2| / (t2 − t1) + |t3 − t2 − T3| / (t3 − t2) ), where T1, T2 and T3 are the timestamps corresponding to head shaking, mouth opening and blinking in the random action sequence instruction respectively, and η is a weight coefficient.
The second preset threshold θ is set empirically according to the required security level. When the overall goodness of fit exceeds the second preset threshold θ, the user is judged to be a live person; otherwise, a non-live person.
In one possible embodiment, the biological-feature liveness detection method provided by the embodiments of the application further includes:
(d1) generating a random voice sequence instruction;
(d2) converting the code of the random voice sequence instruction into text, visual and/or auditory coding, and presenting it as a visual picture, an audible sound, or a combination of the two;
(d3) collecting the user's response voice sequence;
(d4) analyzing the user's response voice in the response voice sequence;
(d5) judging whether the response voice meets the voice sequence corresponding to the random voice sequence instruction; if it does, judging that the response voice sequence comes from a live person.
The random voice sequence instruction may be a string of words or a string of voice fragments, whose content is randomly generated, or assembled into a voice instruction sequence by randomly drawing sound templates from a sound template library. The generated voice instruction sequence may be shown to the user on a display in the form of text or images, announced to the user through a speaker by speech playback, or indicated through both display and speaker by a combination of text, images and speech playback. Upon receiving the instruction, the user speaks according to it, i.e. produces the response voice, and a recording device collects the user's response voice sequence. When the voice instruction sequence is a string of text, or is composed of sound templates, audio analysis and recognition — which may be conventional audio content analysis and recognition — can be applied to the collected response voice sequence, and the recognition result is compared against the voice instruction sequence: if the percentage of matching content exceeds a preset threshold, e.g. 90%, the user's response voice sequence is judged to come from a live person. When the random voice instruction is a string of text, the recognition result may be converted into text, and the converted text compared with the voice instruction sequence: if the proportion of words in which the converted text agrees with the voice instruction sequence exceeds a preset threshold, e.g. 90%, the user's response voice sequence is judged to come from a live person. When the random voice instruction sequence is a string of voice fragments, waveform matching analysis may be applied between the collected response voice sequence and the voice instruction sequence: if the degree of waveform match between the response voice sequence and the random voice instruction sequence exceeds a preset threshold, the user's response voice sequence is judged to come from a live person.
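The text-mode voice check can be sketched as below. Comparing position-by-position characters is one possible matching granularity; the text leaves the exact comparison (characters vs. words, alignment method) open, so this is only an illustrative assumption.

```python
# Sketch of the text-mode voice check: the recognized response transcript is
# compared position-by-position with the instructed string, and the match
# ratio is tested against the preset threshold (90% in the text).

def voice_match_ratio(instructed, recognized):
    if not instructed and not recognized:
        return 0.0
    same = sum(1 for a, b in zip(instructed, recognized) if a == b)
    return same / max(len(instructed), len(recognized))

# Assumed transcripts: the recognizer misheard one character.
ratio = voice_match_ratio("open sesame now", "open sesame how")
is_live = ratio > 0.9            # 14/15 ~ 0.933 -> passes the 90% threshold
```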
In one possible embodiment, image analysis and voice analysis are combined so that both the user's actions and voice are analyzed to judge whether the user is live. The biological-feature liveness detection method may then include:
(e1) generating a random action sequence instruction;
(e2) converting the code of the random action sequence instruction into text, visual and/or auditory coding, and presenting it as a visual picture, an audible sound, or a combination of the two;
(e3) collecting the user's response image sequence;
(e4) analyzing the user's response action sequence in the response image sequence;
(e5) judging whether the response action sequence meets the action sequence corresponding to the random action sequence instruction, and calculating the overall action goodness of fit;
(e6) generating a random voice instruction;
(e7) converting the code of the random voice sequence instruction into text, visual and/or auditory coding, and presenting it as a visual picture, an audible sound, or a combination of the two;
(e8) collecting the user's response voice sequence;
(e9) analyzing the user's response voice in the response voice sequence;
(e10) calculating the voice-content goodness of fit of the response voice sequence: the content of the response voice sequence is aligned with the random voice instruction, a weight is assigned to each voice unit of the response voice sequence according to the comparison, and the weights are summed to obtain the voice-content goodness of fit;
(e11) calculating the overall goodness of fit = overall action goodness of fit + w2 × voice-content goodness of fit = action-type goodness of fit + w1 × movement-time goodness of fit + w2 × voice-content goodness of fit, where w1 and w2 are weights;
(e12) comparing the overall goodness of fit with a third preset threshold; if it exceeds the third preset threshold, judging that the response comes from a live person, and otherwise not considering it to come from a live person.
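Steps (e11)–(e12) can be sketched as a simple weighted combination; all numeric values below (the component scores, weights w1 and w2, and the third preset threshold) are illustrative assumptions.

```python
# Sketch of steps (e11)-(e12): the overall score combines the action-type
# goodness of fit, the movement-time goodness of fit and the voice-content
# goodness of fit with weights w1 and w2, then is compared with the third
# preset threshold.

def combined_goodness(type_fit, time_fit, voice_fit, w1=0.5, w2=0.5):
    return type_fit + w1 * time_fit + w2 * voice_fit

score = combined_goodness(2.0, 1.8, 0.95)   # 2.0 + 0.9 + 0.475 = 3.375
is_live = score > 3.0                       # assumed third preset threshold
```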
From the description of the above method embodiments, those skilled in the art can clearly understand that the application can be implemented by software plus a necessary general-purpose hardware platform, or by hardware alone, though the former is in many cases the preferred implementation. Based on this understanding, the part of the technical solution of the application that contributes over the prior art can be embodied in the form of a software product, stored on a storage medium and including instructions that cause a smart device to perform all or part of the steps of the methods described in the embodiments of the application. The aforementioned storage medium includes various media capable of storing data and program code, such as read-only memory (ROM), random access memory (RAM), magnetic disks or optical discs.
Fig. 2 is a structural diagram of a biological-feature liveness detection system shown in an exemplary embodiment of the application. As shown in Fig. 2, the system includes:
an action sequence instruction generating unit U201, configured to generate a random action sequence instruction;
an action instruction display unit U202, including a display and a speaker, configured to first convert the code of the random action sequence instruction into text, visual and/or auditory coding, and then present it as a visual picture, an audible sound, or a combination of the two;
an image collecting unit U203, configured to collect the user's response facial image sequence;
a response action display unit U204, configured to visually present the response image sequence in synchronization with the random action sequence instruction;
a motion analysis unit U205, configured to analyze the user's response action sequence in the response facial image sequence;
an action goodness-of-fit judging unit U206, configured to judge whether the response action sequence meets the action sequence corresponding to the random action sequence instruction, and if so, to judge that the response action sequence comes from a live person.
The biological-feature liveness detection system may further include:
a voice instruction generating unit, configured to generate a random voice instruction;
a voice instruction display unit, including a display and a speaker, configured to first convert the code of the random voice sequence instruction into text, visual and/or auditory coding, and then present it as a visual picture, an audible sound, or a combination of the two;
a voice collecting unit, configured to collect the user's response voice sequence;
a voice analysis unit, configured to analyze the user's response voice in the response voice sequence;
a voice goodness-of-fit judging unit, configured to judge whether the response voice sequence meets the voice sequence corresponding to the voice instruction, and if so, to judge that the response voice sequence comes from a live person.
In one possible embodiment, the action sequence instruction generating unit specifies, in the random action sequence instruction, a timestamp for each action; the timestamp identifies the movement time of each action, or its start time and end time, and is randomly generated.
The action instruction display unit converts the code of the random action sequence instruction into text and/or visual coding — e.g. text captions, images or animations of actions such as "open the mouth", "close the mouth", "turn the head left", "bow the head" or "blink" — and presents it visually to the user through the display; or it converts the code into text and/or auditory coding, converting the text into speech through a TTS (Text To Speech) engine and broadcasting sounds such as "open the mouth", "close the mouth", "turn the head left", "bow the head" or "blink" through the speaker; or it presents the instruction to the user in a combined visual and auditory manner. Audio-visual prompting helps the user understand the instruction, so as to make the corresponding actions in time according to it.
The image collecting unit — a camera or other image/video capture device — shoots the user's actions and thereby collects the response facial image sequence, each image of which is a video frame obtained by the shooting.
The response action display unit visually presents the response image sequence on the screen in synchronization with the random action sequence instruction, feeding it back to the user in time so that the user can adjust his or her own actions to make them consistent with the random action sequence instruction.
The motion analysis unit may include:
a face detection subunit, configured to detect the face in each image of the response image sequence;
a key point locating subunit, configured to locate key points on each face;
a head pose angle computing subunit, configured to compute the head pose angle from the located face key points;
a facial expression type computing subunit, configured to compute the facial expression type from the located face key points;
an action sequence recognition subunit, configured to compute the response action sequence from the head pose angles and the facial expression types.
The action goodness-of-fit judging unit may include:
an action-type goodness-of-fit computing subunit, configured to compare the computed response action sequence with the action sequence corresponding to the random action sequence instruction and calculate the action-type goodness of fit;
a first judging subunit, configured to compare the action-type goodness of fit with the first preset threshold; if the action-type goodness of fit exceeds the first preset threshold, the response action types in the response action sequence meet the random action sequence instruction and the response action sequence is judged to come from a live person, and otherwise it is not considered to come from a live person.
The motion analysis unit may further include:
an action time computation subunit, which calculates the action time of each action in the computed response action sequence;
an action time goodness-of-fit computation subunit, which compares the action time of each action with the randomly generated timestamps and calculates the action time goodness of fit;
an overall action goodness-of-fit computation subunit, which calculates the overall action goodness of fit as: overall action goodness of fit = action type goodness of fit + w × action time goodness of fit, where w is a weight;
a second judgment subunit, which compares the overall action goodness of fit with a second preset threshold; if it exceeds the second preset threshold, the response actions in the response action sequence conform to the action sequence corresponding to the instruction and the response action sequence is judged to come from a live person; otherwise it is not considered to come from a live person.
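Only the formula overall = type fit + w × time fit comes from the patent; the tolerance-based definition of the time fit below, and the example values, are assumptions:

```python
def time_fit(action_times, stamps, tolerance=0.5):
    """Fraction of actions performed within `tolerance` seconds of their
    randomly generated timestamps (illustrative definition)."""
    if not stamps:
        return 1.0
    hits = sum(1 for t, s in zip(action_times, stamps) if abs(t - s) <= tolerance)
    return hits / len(stamps)

def overall_action_fit(type_fit_value, time_fit_value, w=0.5):
    """Overall action goodness of fit = type fit + w * time fit (per the patent)."""
    return type_fit_value + w * time_fit_value
```

Checking timing as well as type is what defeats a pre-recorded video: even if the attacker guesses the action set, the randomly generated timestamps are unknown in advance.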
In one possible embodiment, the biometric liveness detection system provided by the embodiments of the present application may further include:
a speech analysis unit, for recognizing the content of the response speech sequence;
a speech goodness-of-fit computing unit, for calculating the speech content goodness of fit of the response speech sequence;
an overall goodness-of-fit computing unit, for calculating the overall goodness of fit as: overall goodness of fit = action type goodness of fit + w1 × action time goodness of fit + w2 × speech content goodness of fit, where w1 and w2 are weights;
a third judging unit, which compares the overall goodness of fit with a third preset threshold; if it exceeds the third preset threshold, the response action sequence is judged to come from a live person; otherwise it is not considered to come from a live person.
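The three-term combination and the third-threshold decision follow directly from the patent's formula; the weight and threshold values themselves are unspecified there, so the defaults below are illustrative:

```python
def overall_fit(type_fit, time_fit, voice_fit, w1=0.5, w2=0.5):
    """Overall = action type fit + w1 * time fit + w2 * voice content fit."""
    return type_fit + w1 * time_fit + w2 * voice_fit

def judge_live(type_fit, time_fit, voice_fit, threshold, w1=0.5, w2=0.5):
    """Third judging unit: live only if the overall fit exceeds the threshold."""
    return overall_fit(type_fit, time_fit, voice_fit, w1, w2) > threshold
```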
The complexity of the random action sequence instruction and the values of the first, second, and third preset thresholds are set according to the required security level.
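The patent does not give concrete security levels. A hypothetical configuration table (all names and numbers are assumptions) illustrating the stated idea that higher security demands longer random sequences and stricter thresholds:

```python
# Illustrative only: level names, sequence lengths, and thresholds are assumptions.
SECURITY_LEVELS = {
    "low":    {"seq_len": 2, "t1": 0.6, "t2": 0.9, "t3": 1.2},
    "medium": {"seq_len": 4, "t1": 0.7, "t2": 1.1, "t3": 1.5},
    "high":   {"seq_len": 6, "t1": 0.8, "t2": 1.3, "t3": 1.8},
}

def configure(level):
    """Return the instruction length and the three preset thresholds for a level."""
    return SECURITY_LEVELS[level]
```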
Fig. 3 is a schematic diagram of the visual presentation of the random action sequence instruction and the random speech sequence instruction in the biometric liveness detection system. (1), (2), and (3) represent the random action sequence instruction (turn left 45 degrees → face front → turn right 45 degrees), presented with both text and images; (4) represents the random speech sequence instruction, presented in written form. In this example the user is asked to read a short sentence; a string of random digits may also be read.
Fig. 4 is a schematic diagram of the synchronized visual presentation of the random action sequence instruction and the user's response image sequence when the display of a biometric liveness detection system is in portrait orientation. To better guide the captured subject in performing an action sequence that conforms to the random action sequence instruction, the instruction and the collected response image sequence are presented visually on the display at the same time. In portrait orientation, the random action sequence instruction is shown in the upper right corner of the collected response image sequence, guiding the user in real time to perform the corresponding response actions. In Fig. 4, (1) to (4) show the random action sequence instruction front face → side face → front face → open mouth and the corresponding response image sequence.
Fig. 5 is a schematic diagram of the synchronized visual presentation of the random action sequence instruction and the user's response image sequence when the display is in landscape orientation. In Fig. 5, (1) to (4) show the random action sequence instruction front face → side face → front face → open mouth and the corresponding response image sequence.
Fig. 6 is a schematic diagram of the simultaneous visual presentation of instruction text and the response image sequence, in which the upper part displays each random action sequence instruction one by one and the lower part displays the collected user response image sequence.
Fig. 7 is a schematic diagram of the synchronized display of the random speech sequence instruction and the random action sequence instruction together with the collected user response image sequence.
The embodiments in this specification are described in a progressive manner; identical or similar parts of the embodiments may be referred to between them, and each embodiment focuses on its differences from the others. In particular, since the device and system embodiments are substantially similar to the method embodiments, their description is relatively simple, and for relevant parts reference may be made to the description of the method embodiments. The device and system embodiments described above are merely illustrative: the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units, i.e., they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the embodiment. Those of ordinary skill in the art can understand and implement the embodiments without creative effort.
It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, system, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or also includes elements inherent to such a process, method, system, or device. In the absence of further limitation, an element defined by the statement "including a..." does not exclude the existence of other identical elements in the process, method, article, or device that includes that element.
The above are only specific embodiments of the application, enabling those skilled in the art to understand or implement the application. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the application. Therefore, the application is not intended to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (14)

1. A biometric liveness detection method, characterized by comprising:
generating a random action sequence instruction;
converting the code of the random action sequence instruction into text, a visual encoding, and/or an auditory encoding, and presenting it as a visible image, an audible sound, or a combination of the two;
collecting a user response image sequence;
presenting the response image sequence visually in synchronization with the random action sequence instruction;
analyzing the user's response action sequence in the response image sequence;
judging whether the response action sequence conforms to the action sequence corresponding to the random action sequence instruction, and if so, judging that the response action sequence comes from a live person.
2. The biometric liveness detection method according to claim 1, characterized by further comprising:
generating a random speech sequence instruction;
converting the code of the random speech sequence instruction into text, a visual encoding, and/or an auditory encoding, and presenting it as a visible image, an audible sound, or a combination of the two;
collecting a user response speech sequence;
analyzing the user's response speech in the response speech sequence;
judging whether the response speech conforms to the speech sequence corresponding to the random speech sequence instruction, and if so, judging that the response speech sequence comes from a live person.
3. The biometric liveness detection method according to claim 1, characterized in that a timestamp is assigned to each action in the random action sequence instruction, the timestamp identifies the action time of each action or the start time and end time of each action, and the timestamp is randomly generated.
4. The biometric liveness detection method according to claim 1, characterized in that analyzing the user's response action sequence in the response image sequence comprises:
detecting the face in each image of the response image sequence;
locating key points on each face;
computing head pose angles from the located face key points;
computing the facial expression type from the located face key points;
obtaining the user's response action sequence from the head pose angles and facial expression types;
comparing the response action sequence with the action sequence corresponding to the random action sequence instruction, and calculating the action type goodness of fit;
comparing the action type goodness of fit with a first preset threshold; if the action type goodness of fit exceeds the first preset threshold, judging that the response action sequence comes from a live person, otherwise not considering it to come from a live person.
5. The biometric liveness detection method according to claim 4, characterized in that analyzing the user's response action sequence in the response image sequence further comprises:
calculating the action time of each action in the response action sequence;
comparing the calculated action time of each action with the timestamp of that action, and calculating the action time goodness of fit;
calculating the overall action goodness of fit = action type goodness of fit + w × action time goodness of fit, where w is a weight;
comparing the overall action goodness of fit with a second preset threshold; if the overall action goodness of fit exceeds the second preset threshold, judging that the response action sequence comes from a live person, otherwise not considering it to come from a live person.
6. The biometric liveness detection method according to claim 5, characterized by further comprising:
recognizing the content of the response speech sequence;
calculating the speech content goodness of fit of the response speech sequence;
calculating the overall goodness of fit = action type goodness of fit + w1 × action time goodness of fit + w2 × speech content goodness of fit, where w1 and w2 are weights;
comparing the overall goodness of fit with a third preset threshold; if the overall goodness of fit exceeds the third preset threshold, judging that the response action sequence comes from a live person, otherwise not considering it to come from a live person.
7. The biometric liveness detection method according to any one of claims 4-6, characterized in that the complexity of the random action sequence instruction and the values of the first preset threshold, the second preset threshold, and the third preset threshold are set according to the security level.
8. A biometric liveness detection system, characterized by comprising:
an action sequence instruction generating unit, for generating a random action sequence instruction;
an action instruction display unit, comprising a display and a speaker, for first converting the code of the random action sequence instruction into text, a visual encoding, and/or an auditory encoding, and presenting it as a visible image, an audible sound, or a combination of the two;
an image acquisition unit, for collecting a user response face image sequence;
a response action display unit, for presenting the response image sequence visually in synchronization with the random action sequence instruction;
a motion analysis unit, for analyzing the user's response action sequence in the response face image sequence;
an action goodness-of-fit judging unit, for judging whether the response action sequence conforms to the action sequence corresponding to the random action sequence instruction, and if so, judging that the response action sequence comes from a live person.
9. The biometric liveness detection system according to claim 8, characterized by further comprising:
a speech instruction generating unit, for generating a random speech instruction;
a speech instruction display unit, comprising a display and a speaker, for first converting the code of the random speech sequence instruction into text, a visual encoding, and/or an auditory encoding, and presenting it as a visible image, an audible sound, or a combination of the two;
a speech collecting unit, for collecting a user response speech sequence;
a speech analysis unit, for analyzing the user's response speech in the response speech sequence;
a speech goodness-of-fit judging unit, for judging whether the response speech conforms to the speech sequence corresponding to the random speech sequence instruction, and if so, judging that the response speech sequence comes from a live person.
10. The biometric liveness detection system according to claim 8, characterized in that a timestamp is assigned to each action in the random action sequence instruction, the timestamp identifies the action time of each action or the start time and end time of each action, and the timestamp is randomly generated.
11. The biometric liveness detection system according to claim 8, characterized in that the motion analysis unit comprises:
a face detection subunit, for detecting the face in each image of the response image sequence;
a key point location subunit, for locating key points on each face;
a head pose angle computation subunit, for computing head pose angles from the located face key points;
a facial expression type computation subunit, for computing the facial expression type from the located face key points;
an action sequence recognition subunit, for obtaining the user's response action sequence from the head pose angles and facial expression types;
and the action goodness-of-fit judging unit comprises:
an action type goodness-of-fit computation subunit, for comparing the response action sequence with the action sequence corresponding to the random action sequence instruction and calculating the action type goodness of fit;
a first judgment subunit, for comparing the action type goodness of fit with a first preset threshold; if the action type goodness of fit exceeds the first preset threshold, the response action types in the response action sequence conform to the random action sequence instruction and the response action sequence is judged to come from a live person, otherwise it is not considered to come from a live person.
12. The biometric liveness detection system according to claims 10 and 11, characterized in that the motion analysis unit further comprises:
an action time computation subunit, for calculating the action time of each action in the response action sequence;
an action time goodness-of-fit computation subunit, for comparing the calculated action time of each action with the timestamp of that action and calculating the action time goodness of fit;
an overall action goodness-of-fit computation subunit, for calculating the overall action goodness of fit = action type goodness of fit + w × action time goodness of fit, where w is a weight;
a second judgment subunit, for comparing the overall action goodness of fit with a second preset threshold; if it exceeds the second preset threshold, the response actions in the response action sequence conform to the action sequence corresponding to the instruction and the response action sequence is judged to come from a live person, otherwise it is not considered to come from a live person.
13. The biometric liveness detection system according to claim 9, characterized by further comprising:
a speech analysis unit, for recognizing the content of the response speech sequence;
a speech goodness-of-fit computing unit, for calculating the speech content goodness of fit of the response speech sequence;
an overall goodness-of-fit computing unit, for calculating the overall goodness of fit = action type goodness of fit + w1 × action time goodness of fit + w2 × speech content goodness of fit, where w1 and w2 are weights;
a third judging unit, for comparing the overall goodness of fit with a third preset threshold; if it exceeds the third preset threshold, judging that the response action sequence comes from a live person, otherwise not considering it to come from a live person.
14. The biometric liveness detection system according to any one of claims 8-13, characterized in that the complexity of the random action sequence instruction and the values of the first preset threshold, the second preset threshold, and the third preset threshold are set according to the security level.
CN201510053281.1A 2015-02-02 2015-02-02 Biological characteristic living body detection method and system Active CN105989264B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510053281.1A CN105989264B (en) 2015-02-02 2015-02-02 Biological characteristic living body detection method and system


Publications (2)

Publication Number Publication Date
CN105989264A true CN105989264A (en) 2016-10-05
CN105989264B CN105989264B (en) 2020-04-07

Family

ID=57036969

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510053281.1A Active CN105989264B (en) 2015-02-02 2015-02-02 Biological characteristic living body detection method and system

Country Status (1)

Country Link
CN (1) CN105989264B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101159016A (en) * 2007-11-26 2008-04-09 清华大学 Living body detecting method and system based on human face physiologic moving
CN101694691A (en) * 2009-07-07 2010-04-14 北京中星微电子有限公司 Method and device for synthesizing facial images
CN102087703A (en) * 2009-12-07 2011-06-08 三星电子株式会社 Method for determining frontal face pose
CN103440479A (en) * 2013-08-29 2013-12-11 湖北微模式科技发展有限公司 Method and system for detecting living body human face
CN103778360A (en) * 2012-10-26 2014-05-07 华为技术有限公司 Face unlocking method and device based on motion analysis

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018192406A1 (en) * 2017-04-20 2018-10-25 腾讯科技(深圳)有限公司 Identity authentication method and apparatus, and storage medium
CN107273794A (en) * 2017-04-28 2017-10-20 北京建筑大学 Live body discrimination method and device in a kind of face recognition process
CN107358152B (en) * 2017-06-02 2020-09-08 广州视源电子科技股份有限公司 Living body identification method and system
CN107358152A (en) * 2017-06-02 2017-11-17 广州视源电子科技股份有限公司 A kind of vivo identification method and system
CN109784124A (en) * 2017-11-10 2019-05-21 北京嘀嘀无限科技发展有限公司 A kind of determination method of vivo identification, decision-making system and computer installation
CN112270299A (en) * 2018-04-25 2021-01-26 北京嘀嘀无限科技发展有限公司 System and method for recognizing head movement
US11270140B2 (en) 2018-05-09 2022-03-08 Hangzhou Hikvision Digital Technology Co., Ltd. Illegal attack prevention
CN110473311A (en) * 2018-05-09 2019-11-19 杭州海康威视数字技术股份有限公司 Take precautions against rogue attacks method, apparatus and electronic equipment
CN109325330A (en) * 2018-08-01 2019-02-12 平安科技(深圳)有限公司 Micro- expression lock generates and unlocking method, device, terminal device and storage medium
CN110866418B (en) * 2018-08-27 2023-05-09 阿里巴巴集团控股有限公司 Image base generation method, device, equipment, system and storage medium
CN110866418A (en) * 2018-08-27 2020-03-06 阿里巴巴集团控股有限公司 Image base generation method, device, equipment, system and storage medium
CN109815810A (en) * 2018-12-20 2019-05-28 北京以萨技术股份有限公司 A kind of biopsy method based on single camera
CN109947238B (en) * 2019-01-17 2020-07-14 电子科技大学 Non-cooperative gesture recognition method based on WIFI
CN109947238A (en) * 2019-01-17 2019-06-28 电子科技大学 A method of the non-cooperative gesture identification based on WIFI
CN110263691A (en) * 2019-06-12 2019-09-20 合肥中科奔巴科技有限公司 Head movement detection method based on android system
CN110309743A (en) * 2019-06-21 2019-10-08 新疆铁道职业技术学院 Human body attitude judgment method and device based on professional standard movement
CN112287909A (en) * 2020-12-24 2021-01-29 四川新网银行股份有限公司 Double-random in-vivo detection method for randomly generating detection points and interactive elements
CN112836627A (en) * 2021-01-29 2021-05-25 支付宝(杭州)信息技术有限公司 Living body detection method and apparatus
CN112990113A (en) * 2021-04-20 2021-06-18 北京远鉴信息技术有限公司 Living body detection method and device based on facial expression of human face and electronic equipment
CN113743196A (en) * 2021-07-23 2021-12-03 北京眼神智能科技有限公司 Living body detection method, living body detection device and storage medium
CN114676409A (en) * 2022-02-28 2022-06-28 广西柳钢东信科技有限公司 Online electronic signing method based on mobile phone screen video and AI voice synthesis

Also Published As

Publication number Publication date
CN105989264B (en) 2020-04-07


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant