WO2021082548A1 - Living body testing method and apparatus, server and facial recognition device - Google Patents


Info

Publication number
WO2021082548A1
Authority
WO
WIPO (PCT)
Prior art keywords
target
scene
feature
feature group
sample
Prior art date
Application number
PCT/CN2020/103962
Other languages
French (fr)
Chinese (zh)
Inventor
曹佳炯
Original Assignee
支付宝(杭州)信息技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 支付宝(杭州)信息技术有限公司
Publication of WO2021082548A1 publication Critical patent/WO2021082548A1/en

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 - Detection; Localisation; Normalisation
    • G06V40/168 - Feature extraction; Face representation
    • G06V40/172 - Classification, e.g. identification
    • G06V40/40 - Spoof detection, e.g. liveness detection
    • G06V40/45 - Detection of the body part being alive

Definitions

  • This specification relates to the field of Internet technology, and in particular to a living body detection method, apparatus, server, and face recognition device.
  • Face recognition technology is often used to determine a user's identity by performing face recognition on collected image data, and then to provide the user with a corresponding service or grant a corresponding permission.
  • A living body detection model corresponding to a specific scene is trained to perform living body detection, that is, to determine whether the face object in the image data to be recognized is a real human face rather than a photo, video, or mask.
  • This specification provides a living body detection method, apparatus, server, and face recognition device, so that a preset living body detection model trained and established in a second scene can be effectively used to perform more efficient and accurate living body detection on the target object in image data collected in a first scene.
  • The living body detection method, apparatus, server, and face recognition device provided in this specification are implemented as follows:
  • A living body detection method includes: acquiring target image data, where the target image data includes image data containing a target object collected in a first scene; calling a preset living body detection model to extract a target feature group from the target image data, and determining, through the preset living body detection model, the probability value that the target object is a non-living object as the target probability, where the preset living body detection model includes a model trained using sample data of a second scene; determining the target feature group distance according to the target feature group and an anchor point feature group of the first scene, where the anchor point feature group of the first scene is determined according to sample data of the first scene; and determining whether the target object is a living object according to the target feature group distance and the target probability.
  • A living body detection apparatus includes: an acquisition module for acquiring target image data, where the target image data includes image data containing a target object collected in a first scene; a utilization module for using a preset living body detection model to extract a target feature group from the target image data and to determine, through the preset living body detection model, the probability value that the target object is a non-living object as the target probability, where the preset living body detection model includes a model trained using sample data of a second scene; a first determining module for determining the target feature group distance according to the target feature group and an anchor point feature group of the first scene, where the anchor point feature group of the first scene is determined according to sample data of the first scene; and a second determining module configured to determine whether the target object is a living object according to the target feature group distance and the target probability.
  • a server includes a processor and a memory for storing executable instructions of the processor.
  • When the processor executes the instructions: target image data is acquired, where the target image data includes image data containing a target object collected in a first scene; a preset living body detection model is used to extract a target feature group from the target image data, and the probability value that the target object is a non-living object is determined through the preset living body detection model as the target probability, where the preset living body detection model includes a model trained using sample data of a second scene; the target feature group distance is determined according to the target feature group and an anchor point feature group of the first scene, where the anchor point feature group of the first scene is determined according to sample data of the first scene; and whether the target object is a living object is determined according to the target feature group distance and the target probability.
  • A computer-readable storage medium stores computer instructions that, when executed, implement: acquiring target image data, where the target image data includes image data containing a target object collected in a first scene; using a preset living body detection model to extract a target feature group from the target image data, and determining, through the preset living body detection model, the probability value that the target object is a non-living object as the target probability, where the preset living body detection model includes a model trained using sample data of a second scene; determining the target feature group distance according to the target feature group and an anchor point feature group of the first scene, where the anchor point feature group of the first scene is determined according to sample data of the first scene; and determining whether the target object is a living object according to the target feature group distance and the target probability.
  • a face recognition device includes a processor and a memory for storing executable instructions of the processor.
  • When the processor executes the instructions, the above living body detection method is implemented to determine whether the target object in the target image data used for face recognition is a living object; when it is determined that the target object is not a living object, it is determined that face recognition fails.
  • The living body detection method, apparatus, server, and face recognition device provided in this specification process the target image data collected in the first scene by using a preset living body detection model trained on the second scene, extracting the corresponding target feature group and determining, through the model, the target probability that the target object in the target image data is a non-living object. At the same time, an anchor point feature group of the first scene, determined from sample data of the first scene, is introduced and used to determine the target feature group distance between the target feature group and the anchor point feature group. The target probability and the target feature group distance are then combined to determine more accurately whether the target object in the target image data collected in the first scene is a living object. As a result, there is no need to separately train a corresponding living body detection model for the first scene: a living body detection model already trained in another scene can be reused to efficiently perform living body detection on target objects in the target image data collected in the first scene.
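The overall flow above can be sketched in code. The sketch below is illustrative only: the stand-in model class, the threshold values, and the simple AND-combination of the two checks are assumptions, not the concrete implementation (which fuses the two reference bases via weighted scores, as the scenario example describes).

```python
import numpy as np

class PresetLivenessModel:
    """Stand-in for the preset living body detection model trained on
    second-scene sample data. A real model would be a neural network;
    a fixed random linear map plus a logistic score keeps this runnable."""
    def __init__(self, dim=4, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.standard_normal((dim, dim))

    def extract_feature_group(self, image_vec):
        # The "target feature group" extracted from the target image data.
        return self.w @ np.asarray(image_vec, dtype=float)

    def non_living_probability(self, feature_group):
        # The "target probability" that the target object is non-living.
        return float(1.0 / (1.0 + np.exp(-feature_group.mean())))

def detect_living(image_vec, model, first_scene_anchors,
                  dist_threshold=5.0, prob_threshold=0.5):
    """Combine the second-scene model's probability with the distance of the
    target feature group to the first scene's anchor point features.
    Thresholds are placeholders; the two checks are simply ANDed here."""
    f = model.extract_feature_group(image_vec)
    p = model.non_living_probability(f)
    group_distance = min(np.linalg.norm(f - a) for a in first_scene_anchors)
    return group_distance < dist_threshold and p < prob_threshold
```

The key point of the scheme is visible in `detect_living`: the second-scene model is used unchanged, and only the anchor distance adapts the decision to the first scene.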
  • FIG. 1 is a schematic diagram of an embodiment of the system structure composition of the living body detection method provided by the embodiment of this specification;
  • FIG. 2 is a schematic diagram of an embodiment of applying the living body detection method provided in the embodiment of this specification in an example of a scene;
  • FIG. 3 is a schematic diagram of an embodiment of applying the living body detection method provided by the embodiment of this specification in an example of a scene
  • FIG. 4 is a schematic flowchart of a living body detection method provided by an embodiment of this specification.
  • FIG. 5 is a schematic diagram of an embodiment of a living body detection method provided by an embodiment of this specification.
  • FIG. 6 is a schematic diagram of an embodiment of a living body detection method provided by an embodiment of this specification.
  • FIG. 7 is a schematic diagram of the structural composition of a server provided by an embodiment of this specification.
  • FIG. 8 is a schematic diagram of the structural composition of a living body detection device provided by an embodiment of this specification.
  • The embodiments of this specification provide a living body detection method, which can be applied to a system architecture including a server and a collection terminal.
  • The collection terminal is arranged in the first scene and is coupled with the server in a wired or wireless manner to facilitate data interaction.
  • the collection terminal may be specifically used to collect the target image data of the first scene, and send the target image data to the server.
  • the target image data includes image data including a target object (for example, a human face, etc.) collected in the first scene.
  • The server may be specifically configured to call a preset living body detection model, extract a target feature group from the target image data, and determine, through the preset living body detection model and based on the target feature group, the probability value that the target object is a non-living object as the target probability.
  • The preset living body detection model includes a model obtained by training on sample data of the second scene. The target feature group distance is determined according to the target feature group and the anchor point feature group of the first scene, where the anchor point feature group of the first scene is determined according to sample data of the first scene. Whether the target object is a living object is then determined according to the target feature group distance and the target probability.
  • The server may be a back-end service server on the data processing platform side that can implement functions such as data transmission and data processing.
  • the server may be an electronic device with data operation, storage functions, and network interaction functions; it may also be a software program that runs in the electronic device and provides support for data processing, storage, and network interaction.
  • the number of the servers is not specifically limited.
  • the server may specifically be one server, or several servers, or a server cluster formed by several servers.
  • The collection terminal may be a front-end device deployed at a specific scene area that can implement functions such as data collection and data transmission.
  • the collection terminal may be, for example, a surveillance camera, or other electronic equipment equipped with a camera, such as a tablet computer, a notebook computer, a smart phone, and the like.
  • company A can apply the living body detection method provided in the embodiment of this specification to perform living body detection on the face image collected by the company's access control system.
  • Company A previously set up an attendance system inside the company, with a face recognition device (recorded as the first face recognition device) deployed in the attendance system. The device includes a camera and a processor.
  • The attendance system can call the above first face recognition device to perform face recognition based on the face photo taken by the camera, so as to identify and confirm the identity of the employee clocking in. Refer to Figure 2.
  • The processor first calls the trained first living body detection model to perform living body detection on the face object in the photo, to determine whether the face object in the collected face photo is a living object. If the first living body detection model determines that the face in the photo belongs to a living object, the face recognition device, having passed living body detection, performs further face recognition on the photo, determines the identity of the corresponding employee, and feeds the determined identity information back to the attendance system to complete the employee's attendance record.
  • Specifically, corresponding features can first be extracted from the face photo, and whether the face object in the photo is a living object is then determined according to the extracted features. If the first living body detection model determines that the face in the photo does not belong to a living object, living body detection fails. In that case, the face recognition device judges that someone is using a photo, video, or mask containing another person's face to clock in on that person's behalf; no further face recognition is performed on the photo, an alarm is issued, the attendance record is stopped, and the user is prompted that the operation failed.
  • the above-mentioned second face recognition device includes a camera and a processor.
  • the above-mentioned camera may be arranged above the gate position outside the company to collect face photos of people who are about to enter the company.
  • The processor in the second face recognition device likewise needs to use a living body detection model to first perform living body detection on the face photos collected by its camera. After detection determines that the face object in a collected photo is a living object, further face recognition is performed to determine whether the identity corresponding to the face object is an employee of Company A, and the determination result is fed back to the access control system. If the identity corresponding to the face object is determined to be an employee of Company A, the access control system automatically opens the company door so that the employee can enter smoothly.
  • Company A currently has only the first living body detection model, which has already been trained.
  • That model was designed and trained for the application scenario of the attendance system corresponding to the first face recognition device, so it is not suitable for the application scenario of the access control system corresponding to the second face recognition device. If the first living body detection model were applied directly in the access control scene, recognition errors would often occur.
  • In the attendance scenario, the camera is usually installed indoors, environmental conditions are relatively stable, ambient light is relatively sufficient, and the resulting face photos are usually relatively clear. Moreover, employees generally cooperate with the camera when clocking in. Therefore, in the application scenario of the attendance system, the image quality of the face photos to be processed by the living body detection model is relatively high.
  • Accordingly, the living body detection model can be relatively strict in the specific detection process.
  • In the access control scenario, by contrast, environmental conditions are more complex, changeable, and less stable, so ambient light may be inadequate when face photos are taken: for example, light may be too strong at noon and too weak at night, and the collected face photos may not be clear enough. In addition, employees usually do not cooperate with the camera when entering the company. The image quality of the face photos collected in the application scenario of the access control system and processed by the living body detection model is therefore usually relatively low. Furthermore, Company A's accuracy requirements for the access control system are not as high as those for the attendance system.
  • Consequently, if the first living body detection model, built for the application scenario of the attendance system, is applied directly in the second face recognition device to determine whether the face object in a collected face photo is a living object, detection errors are likely to occur. For example, company employees may often fail to be recognized, so the door is not opened for them in time.
  • In this case, the processor of the second face recognition device can first directly call the trained first living body detection model, which is suited to the attendance scenario, to perform feature extraction on the face photos containing face objects collected outdoors by the camera of the second device, obtaining the corresponding target feature group. Further, the probability value, determined by the first living body detection model based on the target feature group, that the face object is a non-living object can be used as the target probability.
  • Because of the scene mismatch, the target probability value obtained may not be completely accurate, but it can serve as one reference basis for judging living objects. Generally, the larger the target probability value, the more likely the corresponding face object is a non-living object rather than a living one.
  • At the same time, the processor may compare the target feature group extracted by the first living body detection model with a predetermined anchor point feature group of the application scenario of the access control system, and calculate the feature distance between the target feature group and the anchor point feature group.
  • The aforementioned anchor point feature group can be understood as a feature set containing typical features of positive samples under different situations (for example, different environmental conditions) in the application scenario of the access control system.
  • the aforementioned anchor point feature group may be determined in advance according to the positive sample data in the sample data in the application scenario of the access control system.
  • the above-mentioned positive samples may specifically include image data containing real human faces.
  • The feature distance between the target feature group and the anchor point feature group can be used to measure the degree of difference between the target feature group and the positive-sample features of the application scenario of the access control system under different conditions (for example, different lighting conditions or different acquisition angles).
  • This distance can also be used as a reference basis for judging living objects.
  • Unlike the target probability, this distance is reference data that already takes into account the environmental conditions and accuracy requirements of the application scenario of the access control system.
  • Generally, the larger the feature distance between the target feature group and the anchor point feature group, the more likely the corresponding face object is a non-living object.
  • Specifically, the feature distance between the target feature group and each feature in the anchor point feature group can be calculated separately; the feature distance between the target feature group and the anchor point feature group is then determined from these per-feature distances, and can be recorded as the target feature group distance.
  • The above feature distance can be determined in terms of the following quantities:
  • Distance denotes the feature distance between the target feature group and the anchor point feature group;
  • D_center = ||f - f_center||_2 denotes the feature distance between the target feature group and the center point feature in the anchor point feature group;
  • D_K = ||f - f_K||_2 denotes the feature distance between the target feature group and the feature numbered K in the anchor point feature group;
  • f denotes the target feature group, f_center denotes the center point feature in the anchor point feature group, and the subscript 2 denotes the L2 norm (a modulus operation).
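The per-feature distances can be sketched as follows. The definitions of D_center and D_K come from the text above, but the way they are combined into a single distance is an assumption here (an equally weighted sum of D_center and the mean of the D_K); the function and argument names are likewise illustrative.

```python
import numpy as np

def target_feature_group_distance(f, f_center, anchor_features):
    """Feature distance between target feature group f and the anchor point
    feature group, per the definitions above:
      D_center = ||f - f_center||_2
      D_k      = ||f - f_k||_2  for each anchor feature f_k
    How these are combined is assumed: D_center plus the mean of the D_k."""
    f = np.asarray(f, dtype=float)
    d_center = np.linalg.norm(f - f_center)
    d_k = [np.linalg.norm(f - fk) for fk in anchor_features]
    return float(d_center + sum(d_k) / len(d_k))
```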
  • By combining these two reference bases at the same time, whether the face object in a face photo collected in this scene is a living object can be determined more accurately and efficiently.
  • Specifically, a first score is determined according to the feature distance between the target feature group and the anchor point feature group, and a second score is determined according to the target probability.
  • The first score can be determined by comparing the feature distance between the target feature group and the anchor point feature group with a preset distance threshold. If the feature distance is less than the preset distance threshold, a relatively high first score is obtained; conversely, if the feature distance is greater than the preset distance threshold, the first score is relatively low.
  • Similarly, the second score can be determined by comparing the target probability with a preset probability threshold. If the target probability is less than the preset probability threshold, a relatively high second score is obtained; conversely, if the target probability is greater than the preset probability threshold, the second score is relatively low.
  • The above preset distance threshold and preset probability threshold can be set according to the specific situation in combination with the specific accuracy requirements; this specification does not limit their specific values.
  • a weighted sum of the first score and the second score may be performed according to a preset weighting rule to obtain a third score.
  • Specifically, the first weight corresponding to the first score and the second weight corresponding to the second score can be determined according to the preset weighting rule; the sum of the product of the first score and the first weight and the product of the second score and the second weight is then used as the above third score.
  • The above third score can be understood as an evaluation score obtained by comprehensively considering the two reference bases, the target probability and the target feature group distance; whether the face object in the detected face photo is a living object can therefore be determined more accurately according to the third score.
  • the third score may be compared with a preset score threshold to obtain a comparison result. According to the comparison result, it is determined whether the target object is a living object.
  • If the third score is less than or equal to the preset score threshold, it can be determined that the face object is not a living object; face recognition is then directly judged to have failed, and no further face recognition is performed. The second face recognition device feeds the failed recognition result back to the access control system, which then does not open the door for the person. If the comparison shows that the third score is greater than the preset score threshold, it can be determined that the face object is a living object, and further face recognition can be performed.
  • If it is further determined that the identity corresponding to the face object is an employee of Company A, the second face recognition device feeds the successful recognition result back to the access control system, which then automatically opens the door for the person based on the face recognition result.
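The score computation and decision described above can be sketched as a small Python function. The binary 0/1 scoring, the equal weights, and the threshold values are assumptions; the text leaves their concrete values to the accuracy requirements of the specific scene.

```python
def liveness_by_scores(group_distance, target_prob,
                       dist_threshold=1.0, prob_threshold=0.5,
                       first_weight=0.5, second_weight=0.5,
                       score_threshold=0.5):
    """Weighted fusion of the two reference bases: the target feature group
    distance and the target probability. Returns True when the face object
    is judged to be a living object."""
    # First score: high when the target feature group is close to the anchors.
    first_score = 1.0 if group_distance < dist_threshold else 0.0
    # Second score: high when the model's non-living probability is low.
    second_score = 1.0 if target_prob < prob_threshold else 0.0
    # Third score: weighted sum per the preset weighting rule.
    third_score = first_weight * first_score + second_weight * second_score
    # Greater than the preset score threshold -> living object.
    return third_score > score_threshold
```

With equal weights and a 0.5 score threshold, both checks must pass; raising one weight lets a single strong signal dominate, which is where the scene's accuracy requirements enter.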
  • Regarding the anchor point feature group: Company A can, in advance, use the camera of the second face recognition device deployed in the access control system to collect photos containing real human faces under a variety of outdoor environmental conditions as positive sample data, forming a positive sample data set, which can be denoted as X.
  • x_i can be specifically used to represent the photo numbered i in the positive sample data set.
  • Specifically, the collection time and the weather data at the time of collection can be recorded for each photo. Then, according to the collection time and the weather data, face photos corresponding to different combinations of collection time and weather can be selected from the large number of collected face photos as positive sample data.
  • each positive sample data in the positive sample data set can be input to the trained first living body detection model, and the corresponding sample feature can be extracted through the first living body detection model.
  • each sample feature corresponds to a positive sample data.
  • f_i can be specifically used to represent the sample feature corresponding to the photo numbered i.
  • corresponding feature processing can be performed on the above-mentioned sample features respectively.
  • For example, an average feature (which may be denoted as f_mean) can be determined according to the specific situation; the average feature is then subtracted from each of the multiple sample features to obtain the processed sample features.
  • The above sample features can also be normalized, so that their values fall within a unified numerical range, thereby reducing errors in subsequent processing.
  • A corresponding processed sample feature set can then be formed from the processed sample features, which can be denoted as F', where f_i' represents the processed sample feature corresponding to the photo numbered i.
  • The center point feature can be determined by computing the feature average of the above processed sample feature set.
  • Specifically, the center point feature can be determined according to the following formula: f_center = (1/N) * Σ_{i=1..N} f_i', where f_center denotes the center point feature and N denotes the number of processed sample features.
  • Selecting sample features that meet the requirements enables the chosen sample features to cover the targeted application scenario of the access control system relatively comprehensively.
  • The methods listed here for determining the sample features that meet the requirements are merely illustrative.
  • For example, the processed sample features can be sorted by their feature distance from the center point feature in descending order; the preset number of top-ranked processed sample features can then be taken as the sample features that meet the requirements.
  • The sample features that meet the requirements can also be determined by applying a TopK operation to the feature distances between the processed sample features and the center point feature, where:
  • f_a_t can be used to represent the qualifying sample feature numbered t;
  • K can represent a preset number;
  • TopK() can be represented as an operation that obtains the top K items in numerical order.
  • the anchor point feature group corresponding to the application scenario of the access control system can be established according to the above-mentioned center point feature and the selected sample feature that meets the requirements.
  • the anchor point feature group may specifically include a center point feature and a sample feature that meets the requirements.
  • Thus, an anchor point feature group that can effectively and comprehensively reflect the environmental characteristics of the application scenario of the access control system is obtained.
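The construction steps above can be sketched as follows. The L2 normalization choice, the function name, and the use of NumPy arrays are assumptions; the steps themselves (mean subtraction, normalization, center point feature, TopK selection by distance) follow the text.

```python
import numpy as np

def build_anchor_point_feature_group(sample_features, k=3):
    """Build the first scene's anchor point feature group from the features
    extracted from positive sample data.
    Steps: subtract the average feature f_mean, L2-normalize each feature,
    compute the center point feature as the feature average, then keep the k
    processed features farthest from the center (TopK by distance) as the
    qualifying sample features."""
    feats = np.asarray(sample_features, dtype=float)
    feats = feats - feats.mean(axis=0)                    # subtract f_mean
    norms = np.linalg.norm(feats, axis=1, keepdims=True)
    feats = feats / np.where(norms == 0.0, 1.0, norms)    # normalize f_i -> f_i'
    f_center = feats.mean(axis=0)                         # center point feature
    dists = np.linalg.norm(feats - f_center, axis=1)      # ||f_i' - f_center||_2
    qualifying = feats[np.argsort(dists)[::-1][:k]]       # TopK by distance
    return f_center, qualifying
```

Keeping the features farthest from the center gives the anchor group coverage of the scene's extreme conditions, while the center point feature represents its typical condition.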
  • Subsequently, the second face recognition device comprehensively uses the first living body detection model (built for the attendance scenario) and the above anchor point feature group (built for the access control scenario) to perform living body detection on the face photos collected by the camera, and records and counts the error ratio within each preset time period.
  • The above error ratio can be understood as the ratio of the number of living body detection errors within a preset time period to the total number of living body detections performed in that period.
  • The processor of the second face recognition device compares the error ratio with a preset ratio threshold. If the error ratio is greater than the threshold, it can be determined that the characteristics of the data to be processed in the current application scenario of the access control system have changed, and that living body detection based on the previously determined anchor point feature group would incur relatively large errors.
  • In that case, an anchor point feature group corresponding to the latest situation of the application scenario of the access control system can be re-determined to replace the one currently in use, updating the anchor point feature group to reduce detection errors.
  • the second face recognition device can still have a high accuracy rate when performing live body detection on the face photo in the application scenario of the current access control system based on the new anchor point feature group.
  • the living body detection method provided in this specification can effectively reuse a living body detection model that has already been trained in another application scenario, and efficiently perform relatively accurate living body detection on the face objects in the face photos collected in the current application scenario.
  • an embodiment of the present specification provides a living body detection method, wherein the method is specifically applied to the server side.
  • the method may include the following content.
  • S401 Acquire target image data, where the target image data includes image data including the target object collected in the first scene.
  • the above-mentioned target image data can be specifically understood as the image data collected in the first scene and containing the target object to be detected.
  • the aforementioned target image data may specifically be photos, images, and the like. This specification does not limit the specific form or type of the above-mentioned target image data.
  • the above-mentioned target image data can also be intercepted from multimedia data such as video images.
  • the image frame containing the target object can be intercepted from the surveillance video as the above-mentioned target image data.
  • the aforementioned target object may be specifically determined according to the corresponding application scenario.
  • the above-mentioned target object may be the face data of the user.
  • the above-mentioned target object may be the user's iris data.
  • the target objects listed above are only schematic illustrations. During specific implementation, according to specific circumstances, the aforementioned target object may also be object data of other types of content. This specification does not impose limitations in this regard.
  • the above-mentioned first scenario may be specifically understood as a specific application scenario targeted by the living body detection method.
  • the above-mentioned first scenario may be a business scenario in which an access control system automatically opens the door for a user who has passed face recognition, to allow such a user to enter. It may also be a scenario in which a face payment system verifies the identity of a paying user through face recognition; if the identity is verified, the system responds to the user's payment instruction and calls the funds data in the user's account to write off the transaction order.
  • the application scenario can also be one in which an identity determination system matches the collected iris of a user against irises stored in an identity information database, so as to determine the identity information of that user, and so on.
  • the first scenarios listed above are only schematic illustrations. During specific implementation, according to specific conditions and business requirements, other forms or types of application scenarios may also be introduced as the above-mentioned first scenario. This specification does not impose limitations in this regard.
  • the target object in the target image data may be subjected to living body detection to determine whether the target object is a living object. If living body detection finds that the target object is not a living object, the target object in the target image data may be a photo or mask containing another person's face, a picture containing another person's iris, or the like. It can then be judged that the target object in the target image is not a real person but may be a disguised attack, and further identification of the target object can be stopped.
  • image data including the target object in the first scene may be collected by an image acquisition device such as a camera as the target image data.
  • the above-mentioned manners for obtaining target image data are merely schematic illustrations.
  • other suitable methods may also be used to obtain the target image data containing the target object in the first scene. This specification does not impose limitations in this regard.
  • S403 Invoking a preset living body detection model, extracting a target feature group from the target image data, and determining the probability value of the target object being a non-living body object through the preset living body detection model as the target probability,
  • the preset living body detection model includes a model obtained by training using sample data of the second scene.
  • the aforementioned target feature group may specifically include image feature data extracted from target image data for determining whether the target object is a living object. For example, whether there are features such as the reflection of the frame of the mobile phone or the reflection of the photo paper in the image data. For another example, the feature of the displacement change between the key points of the target object (for example, the position of the corner of the mouth in the face) in two consecutive frames of pictures in the video data, and so on.
  • the target feature groups listed above are only schematic illustrations. During specific implementation, according to the specific application scenario, other types of feature data may also be used as the target feature group. This specification does not impose limitations in this regard.
  • the above-mentioned living object can be specifically understood as a characteristic object of a real person, for example, a real person's face, a real person's iris, and so on.
  • a non-living object can be specifically understood as a data object disguised as a characteristic object of a real person, for example, a picture containing a real person's face or a face mask.
  • the aforementioned target probability may specifically include a probability value used to reflect that the target object in the target image data is not a living object.
  • the value of the target probability is larger, correspondingly, the target object in the target image is more likely to be not a living object.
  • the value of the target probability is smaller, correspondingly, the target object in the target image is more likely to be a living object.
  • the aforementioned preset living body detection model may specifically include a pre-trained model for living body detection on the image data of the second scene.
  • the foregoing second scenario may be an application scenario different from the first scenario, in which the environmental conditions involved and/or the accuracy requirements for detection differ.
  • the foregoing second scenario may also be an application scenario that is the same or similar to the first scenario.
  • the above-mentioned preset living body detection model is established using the sample data of the second scene, based on the characteristics of the environmental conditions and detection accuracy requirements of the second scene, in order to determine whether the target object in the image data to be recognized is a living object.
  • the target image data to be recognized is input to the above-mentioned preset living body detection model.
  • when the preset living body detection model is running, it can first extract the corresponding image features from the input image data; it then determines, according to those image features, the probability value that the corresponding target object is a non-living object; finally, the aforementioned probability value is compared with a preset judgment threshold, and when the probability value is greater than the preset judgment threshold, it can be judged that the target object is not a living object.
  • if the preset living body detection model corresponding to the second scene is directly applied to the first scene to perform living body detection on the target object in the target image data collected in the first scene, the accuracy of the detection may not be high, and detection errors are prone to occur.
  • however, since applying the preset living body detection model in the first scene deals with the same kind of problem as in the second scene, the model can still be used to process the target image data collected in the first scene to extract the corresponding image features; at the same time, it can also judge, based on those image features, whether the target object in the target image data is a living object and give the corresponding probability value. Although the accuracy of this probability value is not high, it still has a certain reference value.
  • the preset living body detection model that has been trained in the second scene can be directly called, and the target image data containing the target object collected in the first scene can be input into it.
  • the corresponding image features can then be extracted from the above-mentioned target image data as the target feature group.
  • the aforementioned target probability value can be used as a type of subsequent reference data used to determine whether the target object is a living object.
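Step S403 can be sketched roughly as follows. The model's interface (`extract_features`, `nonliving_probability`) and the stub implementation are hypothetical; the source does not define a concrete API, only that the second-scene model both extracts the target feature group and outputs a non-living probability:

```python
import math

class StubLivenessModel:
    """Stand-in for the preset living body detection model trained on the
    second scene (hypothetical interface, for illustration only)."""

    def extract_features(self, image):
        # Toy "target feature group": per-channel means of the image pixels.
        pixels = [px for row in image for px in row]
        return [sum(px[c] for px in pixels) / len(pixels) for c in range(3)]

    def nonliving_probability(self, features):
        # Toy sigmoid score in [0, 1]; a real model would use its classifier head.
        return 1.0 / (1.0 + math.exp(-sum(features)))

def step_s403(model, image):
    features = model.extract_features(image)             # target feature group
    target_prob = model.nonliving_probability(features)  # P(non-living)
    return features, target_prob

image = [[(0.0, 0.0, 0.0)] * 2 for _ in range(2)]  # tiny all-black "photo"
feats, prob = step_s403(StubLivenessModel(), image)
print(len(feats), prob)  # -> 3 0.5
```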
  • S405 Determine the distance of the target feature group according to the target feature group and the anchor point feature group of the first scene, where the anchor point feature group of the first scene is determined according to the sample data of the first scene.
  • the aforementioned anchor point feature group can be specifically understood as a feature set that includes image features of positive samples in different situations in the first scene.
  • the positive samples in the above different situations may specifically include image data containing living objects collected under different environmental conditions (for example, different light intensity, shooting angles, shooting distances, etc.).
  • image data containing living objects under different conditions can be collected in advance for the first scene as positive sample data, and then the anchor point feature group can be established based on the positive sample data.
  • in addition to collecting and using positive sample data in the above manner to establish the anchor point feature group, part of the negative sample data can also be collected and used in the process of establishing the anchor point feature group.
  • the positive sample data may be mixed with the collected negative sample data, and the above-mentioned anchor point feature group can then be established based on the mixed sample data containing both positive and negative samples.
  • the noise caused by the negative samples in the scene can be introduced, so that the established anchor point feature group can better reflect the image characteristics in the real scene and have better effects.
  • the above-mentioned negative sample data may specifically include image data that does not contain living objects collected under different environmental conditions (for example, different light intensity, shooting angle, shooting distance, etc.).
  • the aforementioned target feature group distance can be specifically understood as the distance between the target feature group and the aforementioned anchor point feature group.
  • the feature distance between the target feature group and the anchor point feature group can be specifically used to measure the degree of difference between the target feature group and the positive sample features of the first scene in different situations.
  • the above-mentioned feature distance may also be used, in combination with the specific characteristics of the first scene, as a kind of reference data for subsequently determining whether the target object is a living object. By being tied to the positive sample data of the first scene, this reference data takes into account the characteristics of the first scene's environmental conditions and accuracy requirements, and makes up for the shortcoming that the previously determined target probability does not consider the specific characteristics of the first scene.
  • the distance value of the target feature group is larger, correspondingly, the similarity between the features of the target feature group and the anchor point feature group is also lower, and the corresponding target object is more likely to be not a living object.
  • the distance value of the target feature group is smaller, the similarity between the target feature group and the anchor point feature group is correspondingly higher, and the corresponding target object is more likely to be a living object.
  • the foregoing determination of the target feature group distance based on the target feature group and the anchor point feature group of the first scene may include: calculating the modulus of the difference between the target feature group and each feature in the anchor point feature group, as the feature distance between the target feature group and that feature; then, according to these feature distances, determining the feature distance between the target feature group and the anchor point feature group, which can be abbreviated as the target feature group distance.
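A minimal sketch of this distance computation follows. The per-anchor distance is the modulus (L2 norm) of the difference, as described above; how the per-anchor distances are combined into one group distance is not fixed by the source, so the `reduce` choice (min or mean) is an assumption:

```python
import math

def l2_distance(a, b):
    """Modulus (L2 norm) of the difference between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def target_feature_group_distance(target, anchors, reduce="min"):
    """Distance between the target feature group and the anchor point
    feature group; the reduction rule is an illustrative assumption."""
    dists = [l2_distance(target, a) for a in anchors]
    return min(dists) if reduce == "min" else sum(dists) / len(dists)

target = [1.0, 0.0]
anchors = [[1.0, 0.0], [0.0, 1.0]]
print(target_feature_group_distance(target, anchors))  # -> 0.0
```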
  • S407 Determine whether the target object is a living object according to the target feature group distance and the target probability.
  • the above-mentioned living object may be a real person's face, or a real person's iris, etc., instead of non-real person props such as a photo of a human face or iris, a mask, etc.
  • by comprehensively using the target feature group distance and the target probability, the specific characteristics of the first scene and its differences from the second scene can be taken into account, so that the preset living body detection model trained in the second scene can be used to accurately judge whether the target object in the target image data collected in the first scene is a living object.
  • the foregoing determination of whether the target object is a living object based on the distance of the target feature group and the target probability may include the following content: determining the first score according to the distance of the target feature group ; Determine the second score according to the target probability.
  • the first score can be determined by comparing the distance of the target feature group with a preset distance threshold. If the target feature group distance is less than the preset distance threshold, a relatively high first score can be obtained. On the contrary, if the distance of the target feature group is greater than the preset distance threshold, the first score obtained will be relatively low.
  • the second score can be determined by comparing the target probability with a preset ratio threshold.
  • if the target probability is less than the preset ratio threshold, a relatively high second score can be obtained. On the contrary, if the aforementioned target probability is greater than the preset ratio threshold, the second score obtained will be relatively low.
  • the above-mentioned preset distance threshold and preset ratio threshold can be set according to specific conditions in combination with specific accuracy requirements. The specific values of the preset distance threshold and the preset ratio threshold are not limited in this specification.
  • a weighted sum of the first score and the second score may be performed according to a preset weighting rule to obtain a third score.
  • the first weight corresponding to the first score and the second weight corresponding to the second score can be determined according to the preset weighting rule; the product of the first score and the first weight is then added to the product of the second score and the second weight, and the resulting sum is used as the above-mentioned third score.
  • the above-mentioned third score can be specifically understood as an evaluation score obtained by comprehensively considering the two reference data of target probability and target feature group distance.
  • the third score may be compared with a preset score threshold to obtain a comparison result. According to the comparison result, it is determined whether the target object is a living object. Specifically, for example, if it is determined according to the comparison result that the third score is less than or equal to the preset score threshold, it can be determined that the target object in the target image data collected in the first scene is not a living object. On the contrary, if it is determined that the third score is greater than the preset score threshold according to the comparison result, it can be determined that the target object in the target image data collected in the first scene is a living object.
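The decision logic of step S407 can be sketched as below. The source specifies the structure (two scores, a weighted sum, a threshold comparison) but not the concrete score values, weights, or thresholds, so all numbers here are assumptions:

```python
def liveness_decision(group_distance, target_prob,
                      dist_threshold=0.5, prob_threshold=0.5,
                      w1=0.5, w2=0.5, score_threshold=0.5):
    """Sketch of step S407 with illustrative score values and weights."""
    # First score: high when the target feature group is close to the anchors.
    first = 1.0 if group_distance < dist_threshold else 0.0
    # Second score: high when the model's non-living probability is low.
    second = 1.0 if target_prob < prob_threshold else 0.0
    third = w1 * first + w2 * second  # weighted sum
    # Greater than the score threshold -> judged to be a living object.
    return third > score_threshold

print(liveness_decision(0.2, 0.1))  # -> True  (both scores high)
print(liveness_decision(0.9, 0.8))  # -> False (both scores low)
```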
  • the living body detection method provided by the embodiment of this specification can effectively use the preset living body detection model trained and established in the second scene, and efficiently analyze the target object in the image data collected in the first scene. Perform more accurate live detection.
  • the anchor point feature group of the first scene may be specifically established in the following manner: collect image data containing living objects in the first scene as the positive sample data of the first scene; call the preset living body detection model to extract sample features from the positive sample data; determine the center point feature according to the sample features, where the center point feature may be obtained by averaging the sample features; calculate the feature distance between each sample feature and the center point feature; and establish the anchor point feature group of the first scene according to the sample features and the feature distances between the sample features and the center point feature.
  • in this way, an anchor point feature group can be obtained that more comprehensively covers the data characteristics of the first scene in different situations, so as to compensate for the errors caused by directly using the preset living body detection model without considering the differences in environmental conditions, processing accuracy requirements, and so on between the first scene and the second scene.
  • the environmental characteristics at the time of collection can be recorded when the image data in the first scene is collected.
  • the lighting conditions of the photo can be recorded.
  • the image data within a period of time is collected in the first scene, and the image data containing the living object can be filtered from the above image data as the positive sample data.
  • a photo containing a human face of a real person can be selected from the aforementioned photos as the positive sample data of the first scene.
  • the image data containing living objects corresponding to different environmental characteristics can be screened out as positive sample data in a targeted manner. In this way, the acquired positive sample data of the first scene can more comprehensively cover the data characteristics of the first scene in different situations.
  • the positive sample data of the first scene obtained above may be respectively input into the called preset living body detection model. It should be noted that in this embodiment, it is not necessary to use the above-mentioned preset living body detection model to detect whether the target object in the positive sample data is a living object, but only need to use the above-mentioned preset living body detection model from the above The image features corresponding to each positive sample data are extracted from the positive sample data as sample features.
  • the corresponding center point feature can be determined based on the above-mentioned sample features. Specifically, the center point feature can be obtained by summing and averaging the sample features. Furthermore, with the center point feature as a reference, the modulus of the difference between each sample feature and the center point feature can be calculated as the feature distance of that sample feature. According to these feature distances, the sample features with larger distances from the center point feature are selected from the multiple sample features as the sample features that meet the requirements, so that the selected sample features can better cover the data characteristics of the image data collected under different conditions in the first scene.
  • the above-mentioned establishment of the anchor point feature group of the first scene according to the sample feature and the feature distance between the sample feature and the center point feature may include the following content during specific implementation: Among the sample features, the sample features whose feature distance from the center point feature is greater than the feature distance threshold are selected as the sample features that meet the requirements; based on the center point feature and the sample features that meet the requirements, the anchor point of the first scene is established Feature group.
  • the sample features can be sorted by their feature distance from the center point feature, from largest to smallest, and the preset number of highest-ranked sample features can be selected as the above-mentioned sample features that meet the requirements.
  • it is also possible to compare the feature distance between each sample feature and the center point feature with the feature distance threshold, and screen out the sample features whose feature distance from the center point feature is greater than the feature distance threshold as the above-mentioned sample features that meet the requirements.
  • the above-mentioned methods of screening out the sample features that meet the requirements are only schematic illustrations. During specific implementation, other suitable methods may also be used to screen out the sample features that meet the requirements according to the specific situation. This specification does not impose limitations in this regard.
  • a feature set can be further established based on the sample features that meet the requirements and the center point features as the anchor point feature group for the first scene.
  • the anchor point feature group may specifically include a sample feature that meets the requirements, and a center point feature.
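The construction just described (center point feature plus the sample features whose distance from it exceeds a threshold) can be sketched as follows; the threshold value and example features are illustrative:

```python
import math

def l2(a, b):
    """Modulus (L2 norm) of the difference between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_anchor_feature_group(sample_features, distance_threshold):
    """Center point feature = mean of the positive-sample features; keep
    the samples whose distance from the center exceeds the threshold
    (the 'sample features that meet the requirements'); the anchor
    point feature group is the center feature plus those samples."""
    dim = len(sample_features[0])
    center = [sum(f[i] for f in sample_features) / len(sample_features)
              for i in range(dim)]
    kept = [f for f in sample_features if l2(f, center) > distance_threshold]
    return [center] + kept

feats = [[0.0, 0.0], [2.0, 0.0], [1.0, 1.0]]
group = build_anchor_feature_group(feats, distance_threshold=0.9)
print(len(group))  # -> 3 (center plus two sufficiently distant samples)
```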
  • corresponding feature processing can also be performed on the sample features obtained above.
  • the corresponding average feature can be determined according to the overall numerical values of the sample features; the processed sample features can then be obtained by subtracting the average feature from each sample feature. Subsequently, the processed sample features can be used in place of the originally used sample features to determine an anchor point feature group with better accuracy.
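This optional mean-subtraction step can be sketched in a few lines (the example feature values are illustrative):

```python
def center_sample_features(sample_features):
    """Subtract the average feature from each sample feature, producing
    the processed sample features described above."""
    n, dim = len(sample_features), len(sample_features[0])
    mean = [sum(f[i] for f in sample_features) / n for i in range(dim)]
    return [[x - m for x, m in zip(f, mean)] for f in sample_features]

processed = center_sample_features([[1.0, 3.0], [3.0, 1.0]])
print(processed[0])  # -> [-1.0, 1.0]
```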
  • the method may further include the following content: performing Huffman encoding on the features in the anchor point feature group of the first scene respectively, to obtain the compressed anchor point feature group of the first scene; and saving the compressed anchor point feature group of the first scene.
  • the anchor point feature group of the first scene can be compressed by Huffman coding, and the compressed anchor point feature group can be saved and managed, which effectively reduces the resource occupation and consumption involved in saving and managing the anchor point feature group.
  • using Huffman coding to compress the anchor point feature group is only a schematic illustration.
  • other suitable compression methods may also be used to compress the anchor point feature group. This specification does not impose limitations in this regard.
  • in addition to compressing the anchor point feature group, the extracted sample features can themselves be compressed and saved by Huffman coding after extraction, which can further reduce resource occupation and consumption.
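As a rough sketch of the compression idea: the source only names Huffman coding, so the quantization of floating-point features to discrete symbols and all example values here are illustrative assumptions:

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Build a Huffman code table for a symbol sequence (minimal sketch;
    a real system would first quantize the features to symbols)."""
    freq = Counter(symbols)
    if len(freq) == 1:  # degenerate single-symbol case
        return {next(iter(freq)): "0"}
    heap = [(n, i, {s: ""}) for i, (s, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)  # unique tiebreaker so dicts are never compared
    while len(heap) > 1:
        n1, _, t1 = heapq.heappop(heap)
        n2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in t1.items()}
        merged.update({s: "1" + c for s, c in t2.items()})
        heapq.heappush(heap, (n1 + n2, counter, merged))
        counter += 1
    return heap[0][2]

# Quantized anchor features (hypothetical example values)
quantized = [3, 3, 3, 7, 7, 1]
table = huffman_code(quantized)
encoded = "".join(table[s] for s in quantized)
print(len(encoded) < 8 * len(quantized))  # -> True: shorter than raw bytes
```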
  • the foregoing determination of whether the target object is a living object based on the distance of the target feature group and the target probability may include the following content: determining the first score according to the distance of the target feature group Determine the second score according to the target probability; perform a weighted summation of the first score and the second score according to a preset weighting rule to obtain a third score; combine the third score with a preset score threshold The comparison is performed to obtain a comparison result; according to the comparison result, it is determined whether the target object is a living object.
  • the two reference data, target probability and target feature group distance, can be used comprehensively; that is, there is no need to re-establish and use a living body detection model corresponding to the first scene, while the specific characteristics of the first scene are still taken into account, so that the target object in the image data collected in the first scene can be accurately subjected to living body detection.
  • when the method is specifically implemented, it may further include the following content: when it is determined that the target object in the target image data to be detected is not a living object,
  • subsequent further identification and determination of the target object can be stopped, and the permission application request corresponding to the target image data can be rejected.
  • the recognition of the face in the face photo and the matching of identity information can be stopped, and the payment application request initiated by the user can be rejected to protect the safety of other people's property.
  • when the method is specifically implemented, it may further include the following content: counting the error ratio within a preset time period; comparing the error ratio with a preset ratio threshold; and, when it is determined that the error ratio is greater than the preset ratio threshold, re-determining the anchor point feature group of the first scene.
  • the rate of error in the live body detection in the most recent week can be counted.
  • the above-mentioned error ratio can be specifically obtained by dividing the number of errors in the detection of a living body within the preset time period by the total number of detections of a living body processed within the preset time period. Further, the error ratio can be compared with a preset ratio threshold to obtain the corresponding comparison result. According to the comparison result, it can be determined whether the anchor point feature group currently used for living body detection conforms to the specific situation of the current scene.
  • if the error ratio is greater than the preset ratio threshold, the currently used anchor point feature group no longer matches the specific situation of the current scene, and the anchor point feature group is updated.
  • by updating the anchor point feature group regularly in this way, the accuracy of detecting living objects can be improved and the error ratio reduced, so that over a relatively long period of time the target image data collected in the first scene can be subjected to living body detection more accurately.
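The periodic check described above can be sketched as follows; the 5% default threshold is an assumption, since the source leaves the ratio threshold unspecified:

```python
def should_rebuild_anchor_group(error_count, total_count, ratio_threshold=0.05):
    """Return True when the fraction of living body detection errors in
    the preset time window exceeds the threshold, signalling that the
    anchor point feature group should be re-determined."""
    if total_count == 0:
        return False  # no detections performed, nothing to conclude
    error_ratio = error_count / total_count
    return error_ratio > ratio_threshold

print(should_rebuild_anchor_group(3, 1000))   # -> False (0.3% <= 5%)
print(should_rebuild_anchor_group(80, 1000))  # -> True  (8%  >  5%)
```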
  • the living body detection method processes the target image data collected in the first scene by calling a preset living body detection model trained on the second scene, extracting the corresponding target feature group, and The above model determines the target probability that the target object in the target image data belongs to the non-living object based on the target feature group; at the same time, the anchor point feature for the first scene determined based on the positive sample data of the first scene is introduced and used Group to determine the feature distance between the target feature group and the anchor point feature group; and then combine the target probability and the target feature group distance to more accurately determine whether the target object in the target image data collected in the first scene is a living object.
  • the preset living detection model trained and established in the second scene can be effectively used to efficiently perform relatively accurate living detection of the target object in the image data collected in the first scene. Since there is no need to additionally train the corresponding living body detection model for the first scene, the processing cost and processing time of living body detection are effectively reduced.
  • the embodiment of the present specification also provides a server, including a processor and a memory for storing executable instructions of the processor.
  • during specific implementation, the processor can execute the following steps according to the instructions: acquiring target image data, wherein the target image data includes image data containing the target object collected in the first scene; calling a preset living body detection model, extracting a target feature group from the target image data, and determining, through the preset living body detection model, the probability value that the target object is a non-living object as the target probability, wherein the preset living body detection model includes a model trained using sample data of the second scene; determining the target feature group distance according to the target feature group and the anchor point feature group of the first scene, wherein the anchor point feature group of the first scene is determined according to the sample data of the first scene; and determining, according to the target feature group distance and the target probability, whether the target object is a living object.
  • the embodiment of this specification also provides another specific server, where the server includes a network communication port 701, a processor 702, and a memory 703.
  • the above structures are connected by internal cables so that each structure can carry out specific data interactions.
  • the network communication port 701 may be specifically used to obtain target image data, where the target image data includes image data including the target object collected in the first scene.
  • the processor 702 may be specifically configured to: call a preset living body detection model, extract a target feature group from the target image data, and determine, through the preset living body detection model, the probability value that the target object is a non-living object as the target probability, wherein the preset living body detection model includes a model trained using sample data of the second scene; determine the target feature group distance according to the target feature group and the anchor point feature group of the first scene, wherein the anchor point feature group of the first scene is determined according to the sample data of the first scene; and determine, according to the target feature group distance and the target probability, whether the target object is a living object.
  • the memory 703 may be specifically used to store corresponding instruction programs.
  • the network communication port 701 may be a virtual port that is bound to different communication protocols, so that different data can be sent or received.
  • the network communication port may be port 80 responsible for web data communication, port 21 responsible for FTP data communication, or port 25 responsible for mail data communication.
  • the network communication port may also be a physical communication interface or a communication chip.
  • it may be a wireless mobile network communication chip, such as a GSM or CDMA chip; it may also be a Wi-Fi chip or a Bluetooth chip.
  • the processor 702 may be implemented in any suitable manner. For example, the processor may take the form of a microprocessor, or of a processor together with a computer-readable medium, logic gates, switches, an application-specific integrated circuit (ASIC), a programmable logic controller, an embedded microcontroller, and so on; this specification is not limited in this respect.
  • the memory 703 may include multiple levels. In this specification, anything that can store binary data may be a memory: in an integrated circuit, a circuit that has a storage function but no physical form is also called a memory, such as a RAM or a FIFO; in a system, a storage device in physical form is also called a memory, such as a memory stick or a TF card.
  • the embodiment of the present specification also provides a computer storage medium based on the above-mentioned living body detection method.
  • the computer storage medium stores computer program instructions which, when executed, implement: acquiring target image data, where the target image data includes image data, collected in the first scene, that contains the target object; calling a preset living body detection model, extracting a target feature group from the target image data, and determining through the model the probability value that the target object is a non-living object as the target probability, where the preset living body detection model includes a model trained using sample data of the second scene; determining the target feature group distance according to the target feature group and the anchor point feature group of the first scene, where the anchor point feature group of the first scene is determined according to the sample data of the first scene; and determining whether the target object is a living object according to the target feature group distance and the target probability.
  • the aforementioned storage medium includes, but is not limited to, random access memory (RAM), read-only memory (ROM), cache, hard disk drive (HDD), or memory card.
  • the memory can be used to store computer program instructions.
  • the network communication unit may be an interface, set up in accordance with a standard stipulated by a communication protocol, that is used for network connection and communication.
  • This specification also provides a face recognition device, where the face recognition device at least includes a camera and a processor.
  • the aforementioned camera is specifically used to obtain target image data, where the target image data includes image data, collected in the first scene, that contains the target object.
  • the above-mentioned processor is specifically configured to call a preset living body detection model, extract a target feature group from the target image data, and determine through the preset living body detection model the probability value that the target object is a non-living object as the target probability, where the preset living body detection model includes a model trained using sample data of the second scene; determine the target feature group distance according to the target feature group and the anchor point feature group of the first scene, where the anchor point feature group of the first scene is determined according to the sample data of the first scene; and determine whether the target object is a living object according to the target feature group distance and the target probability.
  • if the processor determines that the target object is not a living object, it determines that the face recognition has failed, and no further face recognition is performed on the target image data; if it determines that the target object is a living object, further face recognition can be performed on the target image to determine the identity information of the user matching the face in the target image.
  • the embodiment of this specification also provides a living body detection device, which may specifically include the following structural modules.
  • the obtaining module 801 may be specifically used to obtain target image data, where the target image data includes image data, collected in the first scene, that contains the target object;
  • the using module 803 may be specifically used to extract a target feature group from the target image data using a preset living body detection model, and to determine through the preset living body detection model the probability value that the target object is a non-living object as the target probability, where the preset living body detection model includes a model trained using sample data of the second scene;
  • the first determining module 805 may be specifically configured to determine the target feature group distance according to the target feature group and the anchor point feature group of the first scene, where the anchor point feature group of the first scene is determined according to the sample data of the first scene;
  • the second determining module 807 may be specifically configured to determine whether the target object is a living object according to the target feature group distance and the target probability.
  • the device may specifically further include an establishment module, and the module may specifically include the following structural units:
  • the collecting unit may be specifically used to collect image data containing a living object in the first scene as the positive sample data of the first scene;
  • the calling unit may be specifically used to call the preset living body detection model to extract sample features from the positive sample data;
  • the first determining unit may be specifically configured to determine the center point feature according to the sample features;
  • the calculation unit may be specifically used to calculate the feature distance between each sample feature and the center point feature;
  • the establishing unit may be specifically configured to establish the anchor point feature group of the first scene according to the sample features and their feature distances from the center point feature.
  • the establishment unit may specifically include the following structural sub-units:
  • the screening subunit may be specifically used to screen out, from the sample features, those whose feature distance from the center point feature is greater than a feature distance threshold, as the sample features that meet the requirements;
  • the establishing subunit may be specifically used to establish the anchor point feature group of the first scene according to the center point feature and the qualifying sample features.
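The collecting, calling, determining, calculating and establishing steps above can be sketched as follows. This is a minimal illustration under assumptions not fixed by the text: features are vectors, the center point feature is their mean, the distance is Euclidean, and the threshold value is a hypothetical placeholder.

```python
import numpy as np

def build_anchor_group(sample_features, distance_threshold=2.0):
    """Establish an anchor point feature group from the positive sample
    features of a scene: the center point feature plus the sample
    features whose distance from the center exceeds the threshold."""
    samples = np.asarray(sample_features, dtype=float)
    center = samples.mean(axis=0)                      # center point feature
    dists = np.linalg.norm(samples - center, axis=1)   # feature distances
    qualifying = samples[dists > distance_threshold]   # screening subunit
    return [center] + list(qualifying)                 # establishing subunit
```

Keeping the far-from-center samples alongside the center point, as the screening subunit describes, retains the spread of the scene's live samples rather than only their average.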
  • the establishment module may specifically further include the following units:
  • the coding unit may be specifically used to perform Huffman coding on each of the features in the anchor point feature group of the first scene, to obtain the compressed anchor point feature group of the first scene;
  • the storage unit may be specifically used to save the compressed anchor point feature group of the first scene.
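A minimal sketch of the Huffman coding step, assuming each anchor feature has already been quantized to bytes (the quantization scheme is hypothetical and not part of this text); the code table is built over byte frequencies with the standard heap construction.

```python
import heapq
from collections import Counter

def huffman_code(data: bytes) -> dict:
    """Build a Huffman code table (byte value -> bit string) from byte frequencies."""
    freq = Counter(data)
    if len(freq) == 1:  # degenerate case: only one distinct symbol
        return {next(iter(freq)): "0"}
    # Heap entries: (frequency, tie-breaker, {symbol: code-so-far}).
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        f1, _, t1 = heapq.heappop(heap)  # two least frequent subtrees
        f2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in t1.items()}
        merged.update({s: "1" + c for s, c in t2.items()})
        heapq.heappush(heap, (f1 + f2, counter, merged))
        counter += 1
    return heap[0][2]

def compress(data: bytes) -> str:
    """Encode a quantized feature as a bit string using its Huffman table."""
    table = huffman_code(data)
    return "".join(table[b] for b in data)
```

Because feature bytes are typically far from uniformly distributed, the variable-length codes shorten the stored anchor group, which is the point of the compressed anchor point feature group above.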
  • the second determining module may specifically include the following structural units:
  • the scoring unit may be specifically configured to determine a first score according to the target feature group distance; determine a second score according to the target probability; and perform a weighted summation of the first score and the second score according to a preset weighting rule to obtain a third score;
  • the first comparison unit may be specifically configured to compare the third score with a preset score threshold to obtain a comparison result
  • the second determining unit may be specifically configured to determine whether the target object is a living object according to the comparison result.
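The scoring, comparison and determination steps above can be sketched as follows; the mapping from distance to the first score, the weights and the score threshold are all hypothetical stand-ins for the preset weighting rule and preset score threshold.

```python
import math

def decide_living(target_feature_group_distance, target_probability,
                  w1=0.5, w2=0.5, score_threshold=0.5):
    """Combine the feature-group distance and the non-living probability."""
    first_score = math.exp(-target_feature_group_distance)  # near the anchors -> high score
    second_score = 1.0 - target_probability                 # low non-living probability -> high score
    third_score = w1 * first_score + w2 * second_score      # preset weighting rule
    return third_score >= score_threshold                   # comparison result: living or not
```

Weighting the two signals lets the anchor-based distance, computed from first-scene samples, correct the probability produced by a model trained on the second scene.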
  • the device may specifically further include a processing module, which may be specifically configured to reject the permission application request corresponding to the target image data in the case that the second determining module determines that the target object is not a living object.
  • the device may further include an update module, and the module may specifically include the following structural units:
  • the statistical unit may be specifically used to calculate the error ratio within a preset time period;
  • the second comparison unit may be specifically used to compare the error ratio with a preset ratio threshold
  • the third determining unit may be specifically configured to re-determine the anchor point feature group of the first scene when it is determined that the error ratio is greater than a preset ratio threshold.
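The statistics, comparison and re-determination steps above can be sketched as follows; here the error ratio is taken to be the fraction of detections in the preset time period later found to be wrong, and the ratio threshold is a hypothetical value.

```python
def should_refresh_anchor_group(error_flags, ratio_threshold=0.1):
    """error_flags: one boolean per detection in the preset time period,
    True meaning the detection was later found to be in error."""
    if not error_flags:
        return False
    error_ratio = sum(error_flags) / len(error_flags)  # statistical unit
    return error_ratio > ratio_threshold               # second comparison unit
```

When this returns True, the third determining unit re-determines the anchor point feature group of the first scene from fresh sample data, so the anchors track drift in the scene.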
  • the units, devices, or modules described in the foregoing embodiments may be specifically implemented by computer chips or entities, or implemented by products with certain functions.
  • for the convenience of description, the functions of the above device are divided into various modules and described separately.
  • the functions of each module can be implemented in the same one or more software and/or hardware, or a module that implements the same function can be implemented by a combination of multiple sub-modules or sub-units.
  • the device embodiments described above are merely illustrative.
  • the division of the units is only a logical function division; in actual implementation there may be other divisions, for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the living body detection device uses the using module to call the preset living body detection model, trained on the second scene, to process the target image data collected in the first scene and extract the corresponding target feature group, and determines through the model, based on the target feature group, the target probability that the target object in the target image data is a non-living object; at the same time, the first determining module introduces and uses the anchor point feature group of the first scene, determined from the positive sample data of the first scene, to determine the feature distance between the target feature group and the anchor point feature group; then the second determining module combines the target probability and the target feature group distance to determine more accurately whether the target object in the target image data collected in the first scene is a living object. In this way, the preset living body detection model trained and established in the second scene can be used effectively to perform efficient and relatively accurate living body detection on the target object in the image data collected in the first scene.
  • as for the controller, in addition to implementing it purely with computer-readable program code, it is entirely possible to logically program the method steps so that the controller realizes the same functions in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Therefore, such a controller can be regarded as a hardware component, and the devices included in it for realizing various functions can also be regarded as structures within the hardware component; or even, the devices for realizing various functions can be regarded both as software modules implementing the method and as structures within the hardware component.
  • program modules include routines, programs, objects, components, data structures, classes, etc. that perform specific tasks or implement specific abstract data types.
  • This specification can also be practiced in distributed computing environments. In these distributed computing environments, tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules can be located in local and remote computer storage media including storage devices.


Abstract

Provided are a living body testing method and apparatus, a server and a facial recognition device. In one embodiment, the method comprises: by calling a preset living body testing model obtained on the basis of second scenario training, processing target image data collected from a first scenario; extracting and obtaining a corresponding target feature group; determining, by means of the model and on the basis of the target feature group, a probability value of a target object in the target image data being a non-living object, and taking the probability value as a target probability; meanwhile, introducing and using an anchor point feature group of the first scenario that is determined on the basis of sample data of the first scenario, so as to determine a target feature group distance between the target feature group and the anchor point feature group; and by synthesizing the target probability and the target feature group distance, accurately determining whether the target object in the target image data collected from the first scenario is a living body object.

Description

Living body detection method, device, server and face recognition equipment

Technical field

This specification belongs to the field of Internet technology, and in particular relates to a living body detection method, device, server and face recognition equipment.

Background
With the development of technology, face recognition is applied more and more widely. In many application scenarios, face recognition technology is often used to determine the identity of a user by performing face recognition on collected image data, and then to provide the user with a corresponding service or grant a corresponding permission.

However, there are now many cases where someone impersonates another person by using photos or videos containing that person's face, or by wearing a mask of that person's face, in order to slip through face recognition, threatening the security of others' rights and interests. Therefore, in some embodiments, before face recognition is performed, a living body detection model trained for the specific scene is usually used first to perform living body detection, in order to determine whether the face object in the image data to be recognized is a real human face rather than a photo, a video or a mask. Moreover, because the environmental conditions and recognition requirements corresponding to different scenes differ, a corresponding living body detection model often has to be trained and established separately for each scene.

Therefore, a method that can perform living body detection efficiently is urgently needed.
Summary of the invention

This specification provides a living body detection method, device, server and face recognition equipment, so that a preset living body detection model trained and established in a second scene can be used effectively to perform efficient and relatively accurate living body detection on a target object in image data collected in a first scene.

The living body detection method, device, server and face recognition equipment provided in this specification are implemented as follows.
A living body detection method, including: acquiring target image data, where the target image data includes image data, collected in a first scene, that contains a target object; calling a preset living body detection model, extracting a target feature group from the target image data, and determining through the preset living body detection model the probability value that the target object is a non-living object as the target probability, where the preset living body detection model includes a model trained using sample data of a second scene; determining a target feature group distance according to the target feature group and an anchor point feature group of the first scene, where the anchor point feature group of the first scene is determined according to sample data of the first scene; and determining whether the target object is a living object according to the target feature group distance and the target probability.
A living body detection device, including: an obtaining module for obtaining target image data, where the target image data includes image data, collected in a first scene, that contains a target object; a using module for using a preset living body detection model to extract a target feature group from the target image data and determining through the preset living body detection model the probability value that the target object is a non-living object as the target probability, where the preset living body detection model includes a model trained using sample data of a second scene; a first determining module for determining a target feature group distance according to the target feature group and an anchor point feature group of the first scene, where the anchor point feature group of the first scene is determined according to sample data of the first scene; and a second determining module for determining whether the target object is a living object according to the target feature group distance and the target probability.
A server, including a processor and a memory for storing processor-executable instructions, where the processor, when executing the instructions, implements: acquiring target image data, where the target image data includes image data, collected in a first scene, that contains a target object; using a preset living body detection model to extract a target feature group from the target image data, and determining through the preset living body detection model the probability value that the target object is a non-living object as the target probability, where the preset living body detection model includes a model trained using sample data of a second scene; determining a target feature group distance according to the target feature group and an anchor point feature group of the first scene, where the anchor point feature group of the first scene is determined according to sample data of the first scene; and determining whether the target object is a living object according to the target feature group distance and the target probability.
A computer-readable storage medium on which computer instructions are stored, where the instructions, when executed, implement: acquiring target image data, where the target image data includes image data, collected in a first scene, that contains a target object; using a preset living body detection model to extract a target feature group from the target image data, and determining through the preset living body detection model the probability value that the target object is a non-living object as the target probability, where the preset living body detection model includes a model trained using sample data of a second scene; determining a target feature group distance according to the target feature group and an anchor point feature group of the first scene, where the anchor point feature group of the first scene is determined according to sample data of the first scene; and determining whether the target object is a living object according to the target feature group distance and the target probability.
A face recognition device, including a processor and a memory for storing processor-executable instructions, where the processor, when executing the instructions, implements the above living body detection method to determine whether the target object in the target image data used for face recognition is a living object, and, when it is determined that the target object is not a living object, determines that the face recognition has failed.
With the living body detection method, device, server and face recognition equipment provided in this specification, the target image data collected in the first scene is processed using the preset living body detection model trained on the second scene to extract the corresponding target feature group, and the model determines, based on the target feature group, the target probability that the target object in the target image data is a non-living object; at the same time, the anchor point feature group of the first scene, determined from the sample data of the first scene, is introduced and used to determine the target feature group distance between the target feature group and the anchor point feature group; finally, the target probability and the target feature group distance are combined to determine relatively accurately whether the target object in the target image data collected in the first scene is a living object. In this way, there is no need to train a separate living body detection model for the first scene: a living body detection model already trained for another scene can be used effectively to perform efficient living body detection on the target object in the target image data collected in the first scene.
Brief description of the drawings

In order to explain the embodiments of this specification more clearly, the drawings needed in the embodiments are briefly introduced below. The drawings in the following description are only some of the embodiments recorded in this specification; for a person of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
FIG. 1 is a schematic diagram of an embodiment of the system architecture to which the living body detection method provided by an embodiment of this specification is applied;

FIG. 2 is a schematic diagram of an embodiment of applying the living body detection method provided by an embodiment of this specification in an example scene;

FIG. 3 is a schematic diagram of an embodiment of applying the living body detection method provided by an embodiment of this specification in an example scene;

FIG. 4 is a schematic flowchart of the living body detection method provided by an embodiment of this specification;

FIG. 5 is a schematic diagram of an embodiment of the living body detection method provided by an embodiment of this specification;

FIG. 6 is a schematic diagram of an embodiment of the living body detection method provided by an embodiment of this specification;

FIG. 7 is a schematic diagram of the structural composition of the server provided by an embodiment of this specification;

FIG. 8 is a schematic diagram of the structural composition of the living body detection device provided by an embodiment of this specification.
Detailed description of embodiments

In order to enable those skilled in the art to better understand the technical solutions in this specification, the technical solutions in the embodiments of this specification are described clearly and completely below in conjunction with the drawings in the embodiments of this specification. Obviously, the described embodiments are only a part of the embodiments of this specification, rather than all of them. Based on the embodiments in this specification, all other embodiments obtained by a person of ordinary skill in the art without creative effort shall fall within the protection scope of this specification.
An embodiment of this specification provides a living body detection method, which can be applied to a system architecture containing a server and a collection terminal. As shown in FIG. 1, the collection terminal is deployed in the first scene and is coupled with the server in a wired or wireless manner for data interaction. With this system there is no need to train a corresponding living body detection model for the current scene; instead, a living body detection model already trained in another scene can be called to perform efficient living body detection on the target object in the image data collected in the current scene.
Specifically, the collection terminal can be used to collect target image data of the first scene and send the target image data to the server, where the target image data includes image data, collected in the first scene, that contains a target object (for example, a human face). The server can be used to call a preset living body detection model, extract a target feature group from the target image data, and determine through the preset living body detection model, based on the target feature group, the probability value that the target object is a non-living object as the target probability, where the preset living body detection model includes a model trained using sample data of the second scene; determine the target feature group distance according to the target feature group and the anchor point feature group of the first scene, where the anchor point feature group of the first scene is determined according to the sample data of the first scene; and determine whether the target object is a living object according to the target feature group distance and the target probability.
In this embodiment, the server may be a back-end service server applied on the data processing platform side and capable of functions such as data transmission and data processing. Specifically, the server may be an electronic device with data computation, storage and network interaction functions, or a software program running on such an electronic device that provides support for data processing, storage and network interaction. The number of servers is not specifically limited in this implementation: there may be one server, several servers, or a server cluster formed by several servers.

In this embodiment, the collection terminal may be a front-end device applied on the side of a specific scene area and capable of functions such as data collection and data transmission. Specifically, the collection terminal may be, for example, a surveillance camera, or another electronic device equipped with a camera, such as a tablet computer, a notebook computer or a smartphone.
In an example scenario, company A can apply the living body detection method provided by the embodiments of this specification to perform living body detection on the face images collected by the company's access control system.

In this example, company A has previously set up an attendance system inside the company, in which a face recognition device (denoted the first face recognition device) is deployed; this device includes a camera and a processor. When an employee enters the company and clocks in, the attendance system can call the first face recognition device to perform face recognition on the face photo taken by the camera, in order to identify and confirm the identity information of the clocking-in employee, as shown in FIG. 2.
Specifically, to prevent employees from checking in on behalf of others, after the first face recognition device captures the face photo of the checking-in employee through the camera, the processor first invokes a trained first living body detection model to perform living body detection on the face object in the photo, so as to determine whether the face object in the captured photo is a living object. Only if the first living body detection model determines that the face in the photo belongs to a living object does the photo pass living body detection; the face recognition device then performs further face recognition on the photo, determines the identity information of the employee corresponding to the face, and feeds the determined identity information back to the attendance system to complete that employee's attendance record.
In a specific application, the first living body detection model may first extract corresponding features from the face photo and then determine, based on the extracted features, whether the face object in the photo is a living object. If the first living body detection model determines that the face in the photo is not the face of a living object, the photo fails living body detection. In that case, the face recognition device concludes that someone is attempting to check in on another person's behalf using a photo, a video, or a mask containing another person's face; it performs no further face recognition on the photo, and may instead issue an alarm, stop the attendance recording, and prompt the user that check-in has failed.
Currently, Company A plans to deploy, outside the company, an access control system that includes a face recognition device (denoted as the second face recognition device); see Figure 3. The second face recognition device includes a camera and a processor. Specifically, the camera may be mounted above the gate outside the company to capture face photos of people who are about to enter. The processor in the second face recognition device likewise needs to use a living body detection model to first perform living body detection on the face photos captured by the camera. Only after the detection confirms that the face object in a captured photo is a living object does the processor perform further face recognition on that face, to determine whether the identity corresponding to the face object is an employee of Company A, and it then feeds the result back to the access control system. Based on this result, the access control system automatically opens the company gate only after the identity corresponding to the face object in the photo is confirmed to be an employee of Company A, so that the employee can enter the company smoothly.
However, there is currently no living body detection model tailored to the application scenario of the access control system, and it would usually take a large amount of time and resources to train such a model.
Although Company A currently has a trained first living body detection model, that model was designed and trained for the application scenario of the attendance system corresponding to the first face recognition device. It is therefore not suitable for the application scenario of the access control system corresponding to the second face recognition device. If the first living body detection model is applied directly to the access control scenario, recognition errors are likely to occur.
Specifically, for example, in the application scenario of the attendance system targeted by the first face recognition device, the camera is usually installed indoors, the environmental conditions are relatively stable, and the ambient light is relatively sufficient when face photos are captured, so the resulting photos are usually fairly clear. In addition, in the attendance scenario, employees tend to cooperate with the camera while checking in. Consequently, the image quality of the face photos to be processed by the living body detection model in the attendance scenario is relatively high. Furthermore, because Company A has high precision requirements for its attendance system, the living body detection model is also relatively strict in its detection process.
In the application scenario of the access control system targeted by the second face recognition device, by contrast, the camera is usually installed outdoors. The environmental conditions are more complex, changeable, and unstable, so the ambient light when capturing face photos is also unstable: for example, the light may be too strong at noon and too weak at night, and the captured face photos may therefore not be clear enough. In addition, in the access control scenario, people usually do not, out of habit, cooperate with the camera when entering the company. As a result, the image quality of the face photos collected in the access control scenario and processed by the living body detection model is usually relatively low. Furthermore, Company A's precision requirements for the access control system are not as high as those for the attendance system.
Therefore, if the first living body detection model designed for the attendance scenario is applied directly in the second face recognition device to determine whether the face object in a photo captured by that device is a living object, detection errors are likely to occur. For example, it may frequently happen that the company's own employees are not recognized, so the gate is not opened for them in time.
In some embodiments, a living body detection model is usually rebuilt for the access control scenario by retraining from scratch, at a large cost in resources and time. Alternatively, sample data corresponding to the access control scenario may be collected and used to train and adjust the first living body detection model, yielding an adjusted living body detection model, different from the first one, that is suitable for the access control scenario.
In this scenario example, by contrast, with the living body detection method provided by the embodiments of this specification, there is no need to retrain separately for the access control scenario, or to adjust the existing model to obtain a new living body detection model.
In a specific implementation, the processor of the second face recognition device may first directly invoke the trained first living body detection model (suited to the attendance scenario) to perform feature extraction on a face photo, containing a face object, captured outdoors by the camera of the second face recognition device, obtaining a corresponding target feature group. Further, the probability value, determined by the first living body detection model based on the target feature group, that the face object is a non-living object may be taken as the target probability. It should be noted that, because the first living body detection model was trained for the attendance scenario, when it is applied in the access control scenario to judge whether the face object in a captured photo is a living object, it does not take into account the specific characteristics of the access control scenario in terms of environmental conditions, precision requirements, and so on. The resulting target probability value is therefore not necessarily fully accurate, but it can serve as one reference for judging living objects. In general, the larger the target probability value, the more likely the corresponding face object is a non-living object rather than a living one.
Further, the processor may compare the target feature group extracted by the first living body detection model with a predetermined anchor feature group for the application scenario of the access control system, and compute the feature distance between the target feature group and the anchor feature group.
The anchor feature group can be understood as a feature set capturing the typical characteristics of positive samples under the various conditions (for example, various environmental conditions) of the access control scenario. Specifically, the anchor feature group may be determined in advance from the positive sample data among the sample data of the access control scenario, where a positive sample may be image data containing a real human face.
The feature distance between the target feature group and the anchor feature group can be used to measure how much the target feature group differs from the positive-sample features of the access control scenario under various conditions (for example, different lighting conditions or different capture angles). This quantity can also serve as a reference for judging living objects. It should be noted, however, that it is a reference that already takes into account the environmental conditions, precision requirements, and other characteristics of the access control scenario. In general, the larger the feature distance between the target feature group and the anchor feature group, the more likely the corresponding face object is not a living object.
In this scenario example, when computing the feature distance between the target feature group and the anchor feature group, the norm of the difference between the target feature group and each feature in the anchor feature group may first be computed as the feature distance between the target feature group and that feature. The feature distance between the target feature group and the anchor feature group as a whole, which may be denoted as the target feature group distance, is then determined from these per-feature distances.
Specifically, the feature distance may be determined according to the following formula:

Distance = {D_center, D_1, D_2, ..., D_K} = {||f - f_center||_2, ||f - f_a1||_2, ||f - f_a2||_2, ..., ||f - f_aK||_2}

where Distance denotes the feature distance between the target feature group and the anchor feature group; D_center denotes the feature distance between the target feature group and the center point feature of the anchor feature group; D_K denotes the feature distance between the target feature group and the feature numbered K in the anchor feature group; f denotes the target feature group; f_center denotes the center point feature of the anchor feature group; f_aK denotes the feature numbered K in the anchor feature group; and ||·||_2 denotes taking the norm (the L2 norm).
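The distance vector above can be sketched in code as follows. This is a minimal illustration in Python; the toy feature values and the list-based layout of the anchor group are assumptions made for the example, not part of the specification:

```python
import math

def feature_distances(f, f_center, anchors):
    """Compute Distance = {D_center, D_1, ..., D_K} from the formula above."""
    def l2(a, b):
        # ||a - b||_2, the L2 norm of the element-wise difference
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return [l2(f, f_center)] + [l2(f, f_a) for f_a in anchors]

# Toy example: 4-dimensional features, K = 2 anchor features.
f = [1.0, 0.0, 0.0, 0.0]                  # target feature group
f_center = [0.0, 0.0, 0.0, 0.0]           # center point feature
anchors = [[0.0, 1.0, 0.0, 0.0],          # f_a1
           [1.0, 1.0, 0.0, 0.0]]          # f_a2
dist = feature_distances(f, f_center, anchors)
print(dist[0])  # D_center = 1.0
```

A larger value anywhere in the resulting vector indicates that the target feature group lies farther from the positive-sample features of the access control scenario.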
After the target probability and the feature distance between the target feature group and the anchor feature group (a distance that takes into account the environmental conditions, precision requirements, and other specific characteristics of the access control scenario) have been determined in the manner described above, both quantities can be combined to judge, relatively accurately and efficiently, whether the face object in a face photo collected in the current access control scenario is a living object.
Specifically, a first score is determined from the feature distance between the target feature group and the anchor feature group, and a second score is determined from the target probability. For example, the first score may be determined by comparing the feature distance between the target feature group and the anchor feature group with a preset distance threshold: if the feature distance is smaller than the preset distance threshold, a relatively high first score is obtained; conversely, if the feature distance is larger than the preset distance threshold, the first score is relatively low. Similarly, the second score may be determined by comparing the target probability with a preset ratio threshold: if the target probability is smaller than the preset ratio threshold, a relatively high second score is obtained; conversely, if the target probability is larger than the preset ratio threshold, the second score is relatively low.
The preset distance threshold and the preset ratio threshold may be set according to the specific situation in combination with the specific precision requirements. This specification does not limit their specific values.
Further, a weighted sum of the first score and the second score may be computed according to a preset weighting rule to obtain a third score. Specifically, a first weight corresponding to the first score and a second weight corresponding to the second score may be determined according to the preset weighting rule; the sum of the product of the first score and the first weight and the product of the second score and the second weight is then taken as the third score.
The third score can be understood as an evaluation score that jointly takes into account both references: the target probability and the target feature group distance. Whether the face object in the detected face photo is a living object can then be determined relatively accurately from the third score.
Specifically, the third score may be compared with a preset score threshold to obtain a comparison result, and whether the target object is a living object is determined according to the comparison result.
For example, if the comparison result shows that the third score is less than or equal to the preset score threshold, the face object can be judged not to be a living object; face recognition can then be directly determined to have failed, and no further face recognition is performed. The second face recognition device feeds this failure result back to the access control system, which, based on the result, does not open the gate for the person. If the comparison result shows that the third score is greater than the preset score threshold, the face object can be judged to be a living object, and further face recognition can then be performed. If the further face recognition determines that the identity corresponding to the face object is an employee of Company A, face recognition is determined to be successful. The second face recognition device feeds this success result back to the access control system, which automatically opens the gate for the person.
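The scoring and decision steps above can be sketched as follows. This is a minimal sketch: the concrete threshold values, the 0/1 score mapping, and the equal weights are assumptions chosen for the example, since the specification leaves them to the implementer:

```python
def liveness_decision(target_probability, group_distance,
                      dist_threshold=1.0, prob_threshold=0.5,
                      w1=0.5, w2=0.5, score_threshold=0.5):
    """Fuse the anchor-distance score and the model-probability score."""
    # First score: high when the target feature group is close to the anchors.
    first_score = 1.0 if group_distance < dist_threshold else 0.0
    # Second score: high when the model's non-living probability is low.
    second_score = 1.0 if target_probability < prob_threshold else 0.0
    # Third score: weighted sum under the preset weighting rule.
    third_score = w1 * first_score + w2 * second_score
    # Greater than the score threshold -> treat the face object as living.
    return third_score > score_threshold

print(liveness_decision(0.2, 0.4))  # close to anchors, low probability -> True
print(liveness_decision(0.9, 2.5))  # far from anchors, high probability -> False
```

Any monotone mapping from distance and probability to scores would fit the description; the hard 0/1 mapping here is only the simplest choice.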
In this way, without retraining for the access control scenario or modifying the model to obtain a new living body detection model, the trained first living body detection model (suited to the attendance scenario) can be invoked directly, in combination with the anchor feature group of the access control scenario, to perform living body detection quickly and accurately on face objects in photos collected in the access control scenario.
In another scenario example, Company A may, in advance, use the camera of the second face recognition device deployed in the access control system to collect photos containing real human faces under a variety of outdoor environmental conditions as positive sample data, forming a positive sample data set, which may be denoted as X. The positive sample data set contains multiple photos with real human faces and may be expressed, for example, as X = {x_1, x_2, ..., x_i, ..., x_N}, where x_i denotes the photo numbered i in the positive sample data set.
When acquiring the positive sample data, in order for it to cover the different environmental conditions of the access control scenario as comprehensively as possible, the collection time and the weather data at the time of collection may be recorded for each face photo. Based on the collection time and weather data, face photos corresponding to different combinations of collection time and weather can then be selected from the large number of collected face photos to serve as positive sample data.
After the positive sample data is obtained, each positive sample in the positive sample data set may be input into the trained first living body detection model, and the corresponding sample feature is extracted by the model, with each sample feature corresponding to one positive sample. A sample feature set is then built from the multiple sample features and may be denoted as F; for example, F = {f_1, f_2, ..., f_i, ..., f_N}, where f_i denotes the sample feature corresponding to the photo numbered i.
The sample features may then first be subjected to corresponding feature processing. Specifically, for example, an average feature (which may be denoted as f_mean) may be determined according to the specific situation; the average feature is then subtracted from each of the sample features to obtain the processed sample features. This normalizes the sample features so that their values fall within a unified numerical range, reducing errors in subsequent processing. A corresponding processed sample feature set is then built from the processed sample features and may be denoted as F′; for example, F′ = {f′_1, f′_2, ..., f′_i, ..., f′_N}, where f′_i denotes the processed sample feature corresponding to the photo numbered i.
Based on the processed sample features, the center point feature can be determined by computing the average of the processed sample feature set. Specifically, for example, the center point feature may be determined according to the following formula:

f_center = (1/N) · (f′_1 + f′_2 + ... + f′_N)

where f_center denotes the center point feature.
The feature distance between each processed sample feature and the center point feature is then computed, and based on these distances, the processed sample features whose distance from the center point feature is greater than a feature distance threshold are selected from the processed sample features as qualifying sample features. This allows the selected sample features to cover the targeted access control scenario relatively comprehensively.
Of course, the above way of determining the qualifying sample features is merely illustrative. In a specific implementation, the processed sample features may instead be sorted in descending order by their feature distance from the center point feature, and a preset number of the top-ranked processed sample features may be taken as the qualifying sample features. Specifically, for example, the qualifying sample features may be determined as follows:

{f_a1, f_a2, ..., f_at, ..., f_aK} = TopK(||f′_i - f_center||)

where f_at denotes the qualifying sample feature numbered t, K denotes the preset number, and TopK(·) denotes an operation that obtains the K data items with the largest values.
The anchor feature group corresponding to the application scenario of the access control system can then be built from the center point feature and the selected qualifying sample features; that is, the anchor feature group includes the center point feature and the qualifying sample features. Specifically, the anchor feature group may be denoted as

{f_center, f_a1, f_a2, ..., f_aK}

An anchor feature group that effectively and comprehensively reflects the environmental characteristics of the application scenario of the access control system is thus obtained.
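Putting the steps of this scenario example together, constructing the anchor feature group from positive-sample features might look like the following sketch. The average feature f_mean is assumed to be supplied separately (the specification only says it is determined according to the specific situation), and the toy feature values are invented for illustration:

```python
import math

def l2(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_anchor_group(features, f_mean, k):
    """Build {f_center, f_a1, ..., f_aK} from the sample feature set F."""
    # Normalization: subtract the average feature from every sample feature (F').
    processed = [[x - m for x, m in zip(f, f_mean)] for f in features]
    n = len(processed)
    # Center point feature: element-wise mean of the processed sample features.
    f_center = [sum(f[d] for f in processed) / n for d in range(len(f_mean))]
    # TopK: keep the K processed features farthest from the center point.
    top_k = sorted(processed, key=lambda f: l2(f, f_center), reverse=True)[:k]
    return [f_center] + top_k

# Toy example: four 2-dimensional sample features, K = 2.
features = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [4.0, 4.0]]
f_mean = [1.0, 1.0]  # assumed average feature
anchor_group = build_anchor_group(features, f_mean, k=2)
print(anchor_group[0])  # center point feature
```

The distance-threshold variant described first differs only in how the features are filtered; the TopK variant shown here has the advantage of yielding a fixed-size anchor group.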
While the second face recognition device performs living body detection on camera-captured face photos by jointly using the first living body detection model (for the attendance scenario) and the anchor feature group (for the access control scenario), it also records and counts the error ratio within each preset time period. The error ratio can be understood as the ratio of the number of living body detection errors in a preset time period to the total number of living body detections performed in that period.
In a specific implementation, the processor of the second face recognition device compares the error ratio with a preset ratio threshold. If the error ratio is determined to be greater than the preset ratio threshold, it can be concluded that the characteristics of the data to be processed in the current access control scenario have changed, and that continuing to perform living body detection based on the previously determined anchor feature group would incur relatively large detection errors. In that case, an anchor feature group corresponding to the latest situation of the access control scenario can be re-determined to replace the currently used one, updating the anchor feature group so as to reduce detection errors.
For example, the positive sample data from a recent time period (for example, the most recent week) may be obtained, a new anchor feature group may be re-determined from that data, and the previously used anchor feature group may be replaced with the new one, so that the second face recognition device can still achieve high accuracy when performing living body detection on face photos in the current access control scenario based on the new anchor feature group.
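The error-ratio monitoring described here can be sketched as follows. This is a minimal sketch; the class name, the per-period bookkeeping, and the way detection outcomes are fed in are assumptions made for the example:

```python
class AnchorGroupMonitor:
    """Track the error ratio within a preset time period and signal
    when the anchor feature group should be rebuilt from recent samples."""

    def __init__(self, ratio_threshold):
        self.ratio_threshold = ratio_threshold
        self.errors = 0
        self.total = 0

    def record(self, was_error):
        # Called once per living body detection performed in the period.
        self.total += 1
        if was_error:
            self.errors += 1

    def needs_update(self):
        # Error ratio: detection errors / total detections in the period.
        if self.total == 0:
            return False
        return self.errors / self.total > self.ratio_threshold

    def reset_period(self):
        # Start a new preset time period.
        self.errors = 0
        self.total = 0

monitor = AnchorGroupMonitor(ratio_threshold=0.1)
for outcome in [False, False, True, False, True]:  # 2 errors in 5 detections
    monitor.record(outcome)
print(monitor.needs_update())  # error ratio 0.4 exceeds 0.1 -> True
```

When needs_update() returns True, the device would rebuild the anchor feature group from recent positive samples (for example, via the construction sketched earlier) and then call reset_period().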
As can be seen from the above scenario examples, the living body detection method provided in this specification can effectively reuse a living body detection model already trained for other application scenarios to perform relatively accurate living body detection, efficiently, on face objects in face photos collected in the current application scenario.
Referring to Figure 4, an embodiment of this specification provides a living body detection method, which is applied on the server side. In a specific implementation, the method may include the following content.
S401: Acquire target image data, where the target image data includes image data, collected in a first scene, that contains a target object.
In some embodiments, the target image data can be understood as image data, collected in the first scene, that contains the target object to be detected. Specifically, the target image data may be a photo, image data, or the like. This specification does not limit the specific form of the target image data.
In a specific implementation, the target image data may also be extracted from multimedia data such as video footage. For example, an image frame containing the target object may be extracted from surveillance video and used as the target image data.
In some embodiments, the target object may be determined according to the corresponding application scenario. For example, in a face-scan payment scenario, the target object may be the user's face data; in an iris check-in scenario, the target object may be the user's iris data. Of course, it should be noted that the target objects listed above are merely illustrative. In specific implementation, depending on the circumstances, the target object may also be object data of other types. This specification does not limit this.
In some embodiments, the first scenario may be understood as the specific application scenario targeted by the liveness detection method. Specifically, the first scenario may be a business scenario in which an access-control system automatically opens a door for a user who has passed facial recognition, allowing that user to enter. It may also be an application scenario in which a facial payment system authenticates the identity of a paying user through facial recognition and, if the authentication succeeds, responds to the user's payment instruction by drawing on the funds data in the user's account to settle the transaction order. It may also be an application scenario in which an identity determination system matches a collected user iris against irises stored in an identity information database to determine the user's identity, and so on. Of course, the first scenarios listed above are merely illustrative. In specific implementation, other forms or types of application scenarios may also be introduced as the first scenario according to specific conditions and business requirements. This specification does not limit this.
For the first scenarios listed above, liveness detection may first be performed on the target object in the target image data to determine whether the target object is a living object. If liveness detection finds that the target object is not a living object, the target object in the target image data may be a photo or mask containing another person's face, a picture containing another person's iris, or the like. It can then be determined that the target object in the target image is not a real person but may be a spoofing attack, and further recognition of the target object can be stopped.
In some embodiments, image data containing the target object in the first scenario may be collected by an image acquisition device such as a camera and used as the target image data. Of course, it should be noted that the above manner of acquiring target image data is merely illustrative. In specific implementation, other suitable methods may also be used to acquire the target image data containing the target object in the first scenario, depending on the circumstances. This specification does not limit this.
S403: Invoke a preset liveness detection model, extract a target feature group from the target image data, and use the preset liveness detection model to determine a probability value that the target object is a non-living object as a target probability, where the preset liveness detection model includes a model trained using sample data of a second scenario.
In some embodiments, the target feature group may include image feature data extracted from the target image data for judging whether the target object is a living object; for example, features such as reflections from a mobile phone frame or from photo paper in the image data, or the displacement of key points of the target object (for example, the corners of the mouth in a face) between two consecutive frames of video data, and so on. Of course, the target feature groups listed above are merely illustrative. In specific implementation, other types of feature data may also be used as the target feature group according to the specific application scenario. This specification does not limit this. Here, a living object may be understood as a characteristic object of a real person, for example, a real person's face or iris. In contrast, a non-living object may be understood as a data object disguised as such a characteristic object, for example, a picture containing a real person's face or a face mask.
In some embodiments, the target probability may include a probability value reflecting that the target object in the target image data is not a living object. Generally, the larger the target probability, the more likely the target object in the target image is not a living object; conversely, the smaller the target probability, the more likely the target object is a living object.
In some embodiments, the preset liveness detection model may include a previously trained model for performing liveness detection on image data of the second scenario. The second scenario may be an application scenario different from the first scenario, differing in environmental conditions and/or detection accuracy requirements. Of course, the second scenario may also be the same as or similar to the first scenario.
The preset liveness detection model is trained and established using sample data from the second scenario, targeting the environmental conditions, detection accuracy requirements, and other characteristics of that scenario. It can therefore perform liveness detection relatively accurately on image data containing target objects collected in the second scenario, determining whether the target object in the image data to be recognized is a living object.
In the second scenario, in specific implementation, the target image data to be recognized is input into the preset liveness detection model. When the preset liveness detection model runs, it can first extract corresponding image features from the input image data, then determine, according to those image features, the probability value that the corresponding target object is a non-living object, and finally compare that probability value with a preset decision threshold. When the probability value is greater than the preset decision threshold, it can be determined that the target object is not a living object.
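The inference flow just described can be sketched as follows. This is a minimal illustration only: the feature extractor, the probability head, and the 0.5 decision threshold are hypothetical stand-ins, not the actual trained model disclosed here.

```python
# Hypothetical stand-ins for the preset liveness detection model's two
# stages: feature extraction and a probability head, followed by the
# threshold comparison described above.

def extract_features(image_data):
    # Stand-in feature extraction: per-row pixel mean, for illustration only.
    return [sum(row) / len(row) for row in image_data]

def spoof_probability(features):
    # Stand-in model head: maps features to the probability that the
    # object is NOT a living object, clamped to [0, 1].
    score = sum(features) / len(features)
    return max(0.0, min(1.0, score / 255.0))

def detect(image_data, threshold=0.5):
    features = extract_features(image_data)   # the "target feature group"
    p = spoof_probability(features)           # the "target probability"
    # Probability above the decision threshold => judged non-living.
    return {"probability": p, "is_live": p <= threshold}
```

A real deployment would replace both stand-ins with the trained model's layers; only the thresholding step is structural.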
Because the second scenario differs from the first scenario in environmental conditions, processing accuracy requirements, and so on, directly applying the preset liveness detection model corresponding to the second scenario to the first scenario, to perform liveness detection on target objects in target image data collected in the first scenario, may result in low detection accuracy and be prone to detection errors. However, since the problem the preset liveness detection model addresses in the first scenario is the same as in the second scenario, the model can still process the target image data collected in the first scenario and extract corresponding image features from it. Meanwhile, it can also judge, based on those image features, whether the target object in the target image data is a living object and give a corresponding probability value. Although this probability value is not highly accurate, it still has some reference value.
Therefore, in this embodiment, referring to FIG. 5, the preset liveness detection model already trained in the second scenario can be invoked directly, and the target image data containing the target object collected in the first scenario can be input into it. By running the liveness detection model, corresponding image features can first be extracted from the target image data as the target feature group. Further, the preset liveness detection model can judge, based on the extracted target feature group, whether the target object in the target image data is a living object, and obtain the probability value that the target object is not a living object as the target probability. This target probability value can serve as one kind of reference data subsequently used to judge whether the target object is a living object.
In this way, there is no need to separately train or adjust a liveness detection model for the first scenario, which reduces processing cost and cycle time. Instead, the liveness detection model already trained in another application scenario can be invoked directly to perform feature extraction on the target image data of the first scenario and obtain the required target feature group. The model can then judge, based on that target feature group, whether the target object is a living object, and the target probability determined by the model serves as one kind of reference data for the subsequent final judgment of whether the target object in the target image data collected in the first scenario is a living object.
S405: Determine a target feature group distance according to the target feature group and an anchor feature group of the first scenario, where the anchor feature group of the first scenario is determined according to sample data of the first scenario.
In some embodiments, the anchor feature group may be understood as a feature set covering the image characteristics of positive samples under different conditions in the first scenario. The positive samples under different conditions may include image data containing living objects collected under different environmental conditions (for example, different illumination intensities, shooting angles, shooting distances, and so on).
In this embodiment, in specific implementation, image data containing living objects under different conditions may be collected in advance for the first scenario as positive sample data, and the anchor feature group may then be established from that positive sample data.
In some embodiments, besides collecting and using positive sample data to establish the anchor feature group as described above, some negative sample data may also be collected and used in the process of establishing the anchor feature group. Specifically, part of the collected negative sample data may be mixed into the positive sample data, and the anchor feature group may then be established from the sample data containing both the negative and the positive sample data. This introduces the noise caused by negative samples in the scenario, so that the established anchor feature group better reflects the image characteristics of the real scenario and performs better. The negative sample data may include image data that does not contain living objects, collected under different environmental conditions (for example, different illumination intensities, shooting angles, shooting distances, and so on).
In some embodiments, the target feature group distance may be understood as the distance between the target feature group and the anchor feature group. This feature distance can be used to measure the degree of difference between the target feature group and the positive-sample features of the first scenario under different conditions. It can also serve as a kind of reference data, combined with the specific characteristics of the first scenario, for the subsequent judgment of whether the target object is a living object. By drawing on the positive sample data of the first scenario, this reference data takes into account the environmental conditions, accuracy requirements, and other characteristics of the first scenario, making up for the fact that the previously determined target probability does not consider the specific characteristics of the first scenario. Generally, the larger the target feature group distance, the lower the similarity between the target feature group and the anchor feature group, and the more likely the corresponding target object is not a living object; conversely, the smaller the target feature group distance, the higher the similarity, and the more likely the target object is a living object.
In some embodiments, determining the target feature group distance according to the target feature group and the anchor feature group of the first scenario may include: separately computing the norm of the difference between the target feature group and each feature in the anchor feature group as the feature distance between the target feature group and that anchor feature; and determining, from these feature distances, the feature distance between the target feature group and the anchor feature group, which may be abbreviated as the target feature group distance.
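The distance computation above can be sketched as follows. The Euclidean norm and the use of the minimum per-anchor distance as the aggregate are illustrative assumptions; the disclosure only specifies a norm of the difference per anchor feature followed by an aggregation step.

```python
import math

def feature_distance(a, b):
    # Norm (here: Euclidean) of the element-wise difference between
    # two feature vectors, as described in the embodiment.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def target_group_distance(target, anchor_group):
    # Feature distance from the target feature group to each anchor
    # feature, aggregated (here: by minimum, an assumption) into the
    # single "target feature group distance".
    return min(feature_distance(target, anchor) for anchor in anchor_group)
```

Other aggregations (mean, weighted sum) fit the same structure; the choice trades sensitivity to outlier anchors against coverage of the anchor set.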
S407: Determine whether the target object is a living object according to the target feature group distance and the target probability.
In some embodiments, depending on the targeted application scenario, the living object may be a real person's face, a real person's iris, or the like, rather than a non-real prop such as a photo or mask containing a face or iris.
In some embodiments, by jointly using the target feature group distance and the target probability, both the specific characteristics of the first scenario and its differences from the second scenario are taken into account, so that the preset liveness detection model trained in the second scenario can be used to accurately judge whether the target object in the target image data collected in the first scenario is a living object.
In some embodiments, determining whether the target object is a living object according to the target feature group distance and the target probability may include: determining a first score according to the target feature group distance, and determining a second score according to the target probability. For example, the first score can be determined by comparing the target feature group distance with a preset distance threshold: if the distance is less than the threshold, a relatively high first score is obtained; conversely, if the distance is greater than the threshold, the first score is relatively low. Similarly, the second score can be determined by comparing the target probability with a preset ratio threshold: if the probability is less than the threshold, a relatively high second score is obtained; conversely, if it is greater, the second score is relatively low. The preset distance threshold and preset ratio threshold can be set according to the circumstances and the specific accuracy requirements; this specification does not limit their specific values.
Further, the first score and the second score can be weighted and summed according to a preset weighting rule to obtain a third score. Specifically, a first weight corresponding to the first score and a second weight corresponding to the second score can be determined according to the preset weighting rule, and the sum of the product of the first score and the first weight and the product of the second score and the second weight is taken as the third score. The third score may be understood as an evaluation score obtained by jointly considering the two kinds of reference data: the target probability and the target feature group distance.
Because the third score effectively takes into account and reflects the specific characteristics of the first scenario, whether the target object in the detected target image data is a living object can be determined relatively accurately from the third score. Specifically, the third score can be compared with a preset score threshold to obtain a comparison result, and whether the target object is a living object is determined according to that result. For example, if the comparison shows that the third score is less than or equal to the preset score threshold, it can be determined that the target object in the target image data collected in the first scenario is not a living object; conversely, if the third score is greater than the preset score threshold, it can be determined that the target object is a living object.
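The two-score fusion described above can be sketched as follows. The concrete thresholds, weights, and binary 0/1 scores are illustrative assumptions; the disclosure specifies only the structure: two scores, a weighted sum, and a final comparison with a score threshold.

```python
def is_live(distance, probability,
            dist_threshold=1.0, prob_threshold=0.5,
            w1=0.5, w2=0.5, score_threshold=0.5):
    # First score from the target feature group distance (assumed 0/1).
    first_score = 1.0 if distance < dist_threshold else 0.0
    # Second score from the target probability (assumed 0/1).
    second_score = 1.0 if probability < prob_threshold else 0.0
    # Weighted sum of the two scores gives the third score.
    third_score = w1 * first_score + w2 * second_score
    # Greater than the score threshold => judged to be a living object.
    return third_score > score_threshold
```

With equal weights, both evidence sources must agree before the object is judged living; shifting the weights lets one source dominate when it is known to be more reliable in the first scenario.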
As can be seen from the above, the liveness detection method provided by the embodiments of this specification can effectively use the preset liveness detection model trained and established in the second scenario to efficiently and relatively accurately perform liveness detection on the target object in the image data collected in the first scenario.
In some embodiments, the anchor feature group of the first scenario may be established as follows: collect image data containing living objects in the first scenario as positive sample data of the first scenario; invoke the preset liveness detection model to extract sample features from the positive sample data; determine a center point feature according to the sample features; compute the feature distance between each sample feature and the center point feature; and establish the anchor feature group of the first scenario according to the sample features and their feature distances from the center point feature. In this way, an anchor feature group can be obtained that comprehensively covers the data characteristics of different conditions in the first scenario, compensating for the errors that would arise from directly using the preset liveness detection model without considering the differences between the first and second scenarios in environmental conditions, processing accuracy requirements, and so on.
In some embodiments, referring to FIG. 6, the environmental characteristics at collection time can be recorded while collecting image data in the first scenario. For example, when a photo is taken by a camera deployed in the first scenario, the illumination conditions at the time of shooting can be recorded. In this way, image data over a period of time is collected in the first scenario, and image data containing living objects can then be filtered out as positive sample data. For example, photos containing real persons' faces can be selected from the collected photos as positive sample data of the first scenario. When filtering the positive sample data of the first scenario, image data containing living objects corresponding to different environmental characteristics can be selected in a targeted manner according to the recorded environmental characteristics, so that the acquired positive sample data of the first scenario comprehensively covers the data characteristics of different conditions in the first scenario.
In some embodiments, the positive sample data of the first scenario obtained above may be input into the invoked preset liveness detection model. It should be noted that, in this embodiment, the preset liveness detection model is not used to detect whether the target object in the positive sample data is a living object; it is only used to extract, from each item of positive sample data, the corresponding image features as sample features.
Further, the corresponding center point feature can be determined from the sample features. Specifically, the center point feature can be obtained by summing and averaging the sample features. Then, taking the center point feature as a reference, the norm of the difference between each sample feature and the center point feature can be computed as that sample feature's feature distance. According to these feature distances, sample features with larger distances from the center point feature can be selected from the multiple sample features as qualifying sample features; qualifying sample features obtained in this way better cover the data characteristics of image data collected under different conditions in the first scenario.
In some embodiments, establishing the anchor feature group of the first scenario according to the sample features and their feature distances from the center point feature may include: selecting, from the sample features, those whose feature distance from the center point feature is greater than a feature distance threshold as qualifying sample features; and establishing the anchor feature group of the first scenario from the center point feature and the qualifying sample features.
In some embodiments, in specific implementation, the sample features can be sorted in descending order of their feature distance from the center point, and a preset number of top-ranked sample features can be selected as the qualifying sample features. Alternatively, each sample feature's distance from the center point feature can be compared with the feature distance threshold, and the sample features whose distance is greater than the threshold can be selected as the qualifying sample features. Of course, it should be noted that the ways of selecting qualifying sample features listed above are merely illustrative. In specific implementation, other suitable methods may also be used to select qualifying sample features according to the circumstances. This specification does not limit this.
After the qualifying sample features are determined, a feature set can be established from the qualifying sample features and the center point feature as the anchor feature group for the first scenario. The anchor feature group may specifically include the qualifying sample features and the center point feature.
In some embodiments, corresponding feature processing may also be performed on the sample features obtained above. Specifically, an average feature can first be determined according to the overall values of the sample features; the sample features can then be processed by subtracting the average feature from each of them, yielding the processed sample features. The processed sample features can subsequently replace the originally used sample features to determine a more accurate anchor feature group.
In some embodiments, after the anchor feature group of the first scenario is established, the method may further include: performing Huffman coding separately on the features in the anchor feature group of the first scenario to obtain a compressed anchor feature group of the first scenario, and saving the compressed anchor feature group. In this way, the anchor feature group of the first scenario can be compressed by Huffman coding, and the compressed anchor feature group can be saved and managed, effectively reducing the resource occupation and consumption of saving and managing the anchor feature group. Of course, it should be noted that using Huffman coding to compress the anchor feature group is merely illustrative. In specific implementation, other suitable compression methods may also be used according to the circumstances and processing requirements. This specification does not limit this.
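A minimal sketch of Huffman-coding the anchor features is shown below. Quantizing the floating-point feature values to integer symbols before coding is an assumption added for illustration; the disclosure states only that Huffman coding is applied to the features.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    # Build a Huffman code table {symbol: bitstring} from symbol frequencies.
    freq = Counter(symbols)
    if len(freq) == 1:                          # degenerate one-symbol case
        return {next(iter(freq)): "0"}
    # Heap entries carry a tiebreak counter so dicts are never compared.
    heap = [(n, i, {s: ""}) for i, (s, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        n1, _, t1 = heapq.heappop(heap)
        n2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in t1.items()}
        merged.update({s: "1" + c for s, c in t2.items()})
        heapq.heappush(heap, (n1 + n2, counter, merged))
        counter += 1
    return heap[0][2]

def compress_anchor_group(anchor_group, levels=256):
    # Quantize each feature value to an integer symbol, then Huffman-encode.
    flat = [v for feature in anchor_group for v in feature]
    lo, hi = min(flat), max(flat)
    scale = (hi - lo) or 1.0
    symbols = [int((v - lo) / scale * (levels - 1)) for v in flat]
    table = huffman_code(symbols)
    bits = "".join(table[s] for s in symbols)
    # Return the bitstring plus what a decoder needs to invert the steps.
    return bits, table, (lo, hi)
```

The quantization step is lossy; a lossless variant would Huffman-code the raw byte representation of the features instead.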
In some embodiments, in addition to compressing the anchor feature group, the extracted sample features may also be compressed with Huffman coding before being saved, which further reduces resource occupation and consumption.
In some embodiments, determining whether the target object is a living object based on the target feature group distance and the target probability may include: determining a first score from the target feature group distance; determining a second score from the target probability; computing a weighted sum of the first score and the second score according to a preset weighting rule to obtain a third score; comparing the third score with a preset score threshold to obtain a comparison result; and determining, based on the comparison result, whether the target object is a living object. In this way, the two reference data, the target probability and the target feature group distance, are used in combination: no living body detection model needs to be rebuilt for the first scene, yet the specific characteristics of the first scene are still taken into account, so that living body detection can be performed accurately on the target object in the image data collected in the first scene.
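The score fusion above can be sketched as follows. The conversions from distance and probability to scores, the equal default weights, and the 0.5 threshold are all illustrative assumptions; the patent only fixes the overall structure (two scores, a preset weighted sum, a threshold comparison).

```python
def is_living_object(target_group_distance, target_probability,
                     w_distance=0.5, w_probability=0.5,
                     distance_scale=1.0, score_threshold=0.5):
    """Fuse the anchor-distance score with the model's non-living probability."""
    # First score: a small distance to the first scene's anchor features
    # suggests a living object, so map distance into (0, 1] decreasingly.
    first_score = 1.0 / (1.0 + target_group_distance / distance_scale)
    # Second score: the model's probability that the object IS living.
    second_score = 1.0 - target_probability
    # Third score: preset weighted sum of the two scores.
    third_score = w_distance * first_score + w_probability * second_score
    # Comparison with the preset score threshold yields the final decision.
    return third_score >= score_threshold
```

A sample close to the scene anchors with a low non-living probability passes; a distant sample the model also flags as non-living is rejected.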
In some embodiments, when it is determined from the target feature group distance and the target probability that the target object is not a living object, the method may further include: rejecting the permission application request corresponding to the target image data.
In this embodiment, when it is determined that the target object in the target image to be detected is not a living object, further identification of the target object can be stopped and the permission application request corresponding to the target image data can be rejected. For example, in a face-scanning payment scenario, when the face in the face photo to be detected is determined not to be a living object, it can be inferred that a user may be impersonating another person, for instance by presenting a picture containing another person's face or a face mask, in order to make a payment. In that case, recognition of the face in the photo and matching of identity information can be stopped, and the payment request initiated by that user rejected, so as to protect the property of others.
In some embodiments, the method may further include: counting the error ratio within a preset time period; comparing the error ratio with a preset ratio threshold; and, when the error ratio is determined to be greater than the preset ratio threshold, re-determining the anchor feature group of the first scene.
Specifically, at every preset time period, for example once a week, the error ratio of the living body detections performed during the most recent week can be computed. The error ratio can be obtained by dividing the number of erroneous living body detections within the preset time period by the total number of living body detections processed within that period. The error ratio is then compared with the preset ratio threshold to obtain a comparison result, from which it can be judged whether the anchor feature group currently used for living body detection still matches the specific conditions of the current scene. For example, if the comparison shows that the error ratio is greater than the preset ratio threshold, it can be judged that the scene conditions may have changed, so that living body detection based on the previously determined anchor feature group produces relatively large detection errors. In that case, positive sample data from the most recent period can be obtained, the anchor feature group of the scene re-determined from that data, and the previously used anchor feature group replaced with the newly determined one; the newly determined anchor features, combined with the preset living body detection model, are then used to perform living body detection on image data collected in the scene. If the comparison shows that the error ratio is less than or equal to the preset ratio threshold, it can be judged that the currently used anchor feature group still covers the current scene well, and the group need not be updated. By counting the error ratio within a preset time period and deciding, from the comparison of the error ratio with the preset ratio threshold, whether to re-determine the anchor feature group of the first scene, the anchor feature group can be updated periodically, improving the accuracy of living body detection and reducing the error ratio, so that over a long period of time the target image data collected in the first scene can be subjected to living body detection accurately.
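The periodic check described above reduces to a small decision rule. The 5% default threshold below is an illustrative assumption; the patent leaves the ratio threshold preset but unspecified.

```python
def should_redetermine_anchors(error_count, total_count, ratio_threshold=0.05):
    """Decide whether the first scene's anchor feature group needs refreshing.

    error_count / total_count over the preset period (e.g. one week) gives
    the error ratio; exceeding the threshold suggests the scene has drifted
    and the anchor feature group should be re-determined from recent
    positive samples.
    """
    if total_count == 0:
        return False  # no detections this period, nothing to judge
    error_ratio = error_count / total_count
    return error_ratio > ratio_threshold
```

When this returns true, recent positive sample data is collected and the anchor feature group rebuilt; otherwise the existing group is kept.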
As can be seen from the above, the living body detection method provided by the embodiments of this specification processes the target image data collected in the first scene by invoking the preset living body detection model trained on the second scene, extracts the corresponding target feature group, and uses the model to determine, based on that target feature group, the target probability that the target object in the target image data is a non-living object; at the same time, it introduces the anchor feature group for the first scene, determined from the positive sample data of the first scene, to compute the feature distance between the target feature group and the anchor feature group; the target probability and the target feature group distance are then combined to determine accurately whether the target object in the target image data collected in the first scene is a living object. The preset living body detection model trained and established in the second scene can thus be used effectively and efficiently to perform accurate living body detection on the target object in the image data collected in the first scene. Since no additional living body detection model needs to be trained for the first scene, the processing cost and processing time of living body detection are effectively reduced.
The embodiments of this specification further provide a server, including a processor and a memory for storing processor-executable instructions. When executing the instructions, the processor may perform the following steps: acquiring target image data, where the target image data includes image data containing a target object collected in a first scene; invoking a preset living body detection model, extracting a target feature group from the target image data, and determining, through the preset living body detection model, the probability value that the target object is a non-living object as the target probability, where the preset living body detection model includes a model trained with sample data of a second scene; determining a target feature group distance from the target feature group and an anchor feature group of the first scene, where the anchor feature group of the first scene is determined from sample data of the first scene; and determining, from the target feature group distance and the target probability, whether the target object is a living object.
To carry out the above instructions more accurately, referring to FIG. 7, the embodiments of this specification further provide another specific server, where the server includes a network communication port 701, a processor 702, and a memory 703, these components being connected by internal cables so that each can engage in specific data interactions.
The network communication port 701 may be specifically used to acquire target image data, where the target image data includes image data containing a target object collected in the first scene.
The processor 702 may be specifically configured to invoke a preset living body detection model, extract a target feature group from the target image data, and determine, through the preset living body detection model, the probability value that the target object is a non-living object as the target probability, where the preset living body detection model includes a model trained with sample data of the second scene; determine a target feature group distance from the target feature group and the anchor feature group of the first scene, where the anchor feature group of the first scene is determined from sample data of the first scene; and determine, from the target feature group distance and the target probability, whether the target object is a living object.
The memory 703 may be specifically used to store the corresponding instruction programs.
In this embodiment, the network communication port 701 may be a virtual port bound to different communication protocols so that different data can be sent or received. For example, the network communication port may be port 80, responsible for web data communication; port 21, responsible for FTP data communication; or port 25, responsible for mail data communication. In addition, the network communication port may also be a physical communication interface or a communication chip, for example a wireless mobile network communication chip such as a GSM or CDMA chip, a Wi-Fi chip, or a Bluetooth chip.
In this embodiment, the processor 702 may be implemented in any suitable manner. For example, the processor may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (such as software or firmware) executable by the (micro)processor, logic gates, switches, an application-specific integrated circuit (ASIC), a programmable logic controller, an embedded microcontroller, and so on. This specification imposes no limitation.
In this embodiment, the memory 703 may comprise multiple levels. In a digital system, anything capable of storing binary data may be a memory; in an integrated circuit, a circuit with a storage function but no physical form is also called a memory, such as a RAM or a FIFO; in a system, a storage device in physical form is also called a memory, such as a memory module or a TF card.
The embodiments of this specification further provide a computer storage medium based on the above living body detection method. The computer storage medium stores computer program instructions which, when executed, implement: acquiring target image data, where the target image data includes image data containing a target object collected in a first scene;
using a preset living body detection model to extract a target feature group from the target image data, and determining, through the preset living body detection model, the probability value that the target object is a non-living object as the target probability, where the preset living body detection model includes a model trained with sample data of a second scene; determining a target feature group distance from the target feature group and an anchor feature group of the first scene, where the anchor feature group of the first scene is determined from sample data of the first scene; and determining, from the target feature group distance and the target probability, whether the target object is a living object.
In this embodiment, the storage medium includes, but is not limited to, a random access memory (RAM), a read-only memory (ROM), a cache, a hard disk drive (HDD), or a memory card. The memory may be used to store computer program instructions. The network communication unit may be an interface, set up in accordance with a standard stipulated by a communication protocol, used for network connection and communication.
In this embodiment, the specific functions and effects realized by the program instructions stored in the computer storage medium can be understood by comparison with other embodiments, and will not be repeated here.
This specification further provides a face recognition device, where the face recognition device includes at least a camera and a processor. The camera is specifically used to acquire target image data, where the target image data includes image data containing a target object collected in the first scene. The processor is specifically configured to invoke a preset living body detection model, extract a target feature group from the target image data, and determine, through the preset living body detection model, the probability value that the target object is a non-living object as the target probability, where the preset living body detection model includes a model trained with sample data of the second scene; determine a target feature group distance from the target feature group and the anchor feature group of the first scene, where the anchor feature group of the first scene is determined from sample data of the first scene; and determine, from the target feature group distance and the target probability, whether the target object is a living object. When the processor determines that the target object is not a living object, it determines that face recognition has failed and performs no further face recognition on the target image data; when it determines that the target object is a living object, it may perform further face recognition on the target image to determine the identity information of the user matching the face in the target image.
Referring to FIG. 8, at the software level the embodiments of this specification further provide a living body detection apparatus, which may specifically include the following structural modules.
The acquisition module 801 may be specifically used to acquire target image data, where the target image data includes image data containing a target object collected in the first scene;
The use module 803 may be specifically used to extract a target feature group from the target image data using a preset living body detection model, and to determine, through the preset living body detection model, the probability value that the target object is a non-living object as the target probability, where the preset living body detection model includes a model trained with sample data of the second scene;
The first determining module 805 may be specifically used to determine a target feature group distance from the target feature group and the anchor feature group of the first scene, where the anchor feature group of the first scene is determined from sample data of the first scene;
The second determining module 807 may be specifically used to determine, from the target feature group distance and the target probability, whether the target object is a living object.
In some embodiments, the apparatus may further include an establishing module, which may specifically include the following structural units:
a collecting unit, specifically usable for collecting image data containing living objects in the first scene as positive sample data of the first scene;
an invoking unit, specifically usable for invoking the preset living body detection model to extract sample features from the positive sample data;
a first determining unit, specifically usable for determining a center point feature from the sample features;
a computing unit, specifically usable for computing the feature distances between the sample features and the center point feature;
an establishing unit, specifically usable for establishing the anchor feature group of the first scene from the sample features and the feature distances between the sample features and the center point feature.
In some embodiments, the establishing unit may specifically include the following structural subunits:
a screening subunit, specifically usable for screening out, from the sample features, those whose feature distance to the center point feature is greater than a feature distance threshold as the qualifying sample features;
an establishing subunit, specifically usable for establishing the anchor feature group of the first scene from the center point feature and the qualifying sample features.
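The units and subunits above can be sketched end to end as follows. Using the mean of the sample features as the center point feature and Euclidean distance as the feature distance are illustrative assumptions; the patent does not fix these choices here.

```python
import math

def build_anchor_feature_group(sample_features, distance_threshold):
    """Build an anchor feature group from positive-sample features.

    Determines a center point feature, computes each sample feature's
    distance to it, screens out the samples farther than the threshold as
    the qualifying features, and combines them with the center.
    """
    n = len(sample_features)
    dim = len(sample_features[0])
    # First determining unit: center point feature (per-dimension mean).
    center = [sum(f[d] for f in sample_features) / n for d in range(dim)]

    # Computing unit: feature distance between a sample and the center.
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    # Screening subunit: keep samples whose distance exceeds the threshold.
    qualified = [f for f in sample_features
                 if dist(f, center) > distance_threshold]
    # Establishing subunit: center point feature plus qualifying features.
    return [center] + qualified
```

The far-from-center samples are kept because they delimit the boundary of the scene's feature region, while the center summarizes its typical appearance.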
In some embodiments, the establishing module may further include the following units:
a coding unit, specifically usable for performing Huffman coding on each feature in the anchor feature group of the first scene to obtain a compressed anchor feature group of the first scene;
a storage unit, specifically usable for saving the compressed anchor feature group of the first scene.
In some embodiments, the second determining module may specifically include the following structural units:
a scoring unit, specifically usable for determining a first score from the target feature group distance, determining a second score from the target probability, and computing a weighted sum of the first score and the second score according to a preset weighting rule to obtain a third score;
a first comparing unit, specifically usable for comparing the third score with a preset score threshold to obtain a comparison result;
a second determining unit, specifically usable for determining, from the comparison result, whether the target object is a living object.
In some embodiments, the apparatus may further include a processing module, specifically usable for rejecting the permission application request corresponding to the target image data when the second determining module determines that the target object is not a living object.
In some embodiments, the apparatus may further include an updating module, which may specifically include the following structural units:
a counting unit, specifically usable for counting the error ratio within a preset time period;
a second comparing unit, specifically usable for comparing the error ratio with a preset ratio threshold;
a third determining unit, specifically usable for re-determining the anchor feature group of the first scene when the error ratio is determined to be greater than the preset ratio threshold.
It should be noted that the units, apparatuses, or modules described in the above embodiments may be implemented by computer chips or entities, or by products with certain functions. For convenience of description, the above apparatus is described with its functions divided into various modules. Of course, when implementing this specification, the functions of the modules may be implemented in one or more pieces of software and/or hardware, or a module implementing one function may be implemented by a combination of multiple submodules or subunits. The apparatus embodiments described above are merely illustrative; for example, the division into units is only a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
As can be seen from the above, in the living body detection apparatus provided by the embodiments of this specification, the use module invokes the preset living body detection model trained on the second scene to process the target image data collected in the first scene and extract the corresponding target feature group, and the model determines, from that target feature group, the target probability that the target object in the target image data is a non-living object; the first determining module introduces the anchor feature group for the first scene, determined from the positive sample data of the first scene, to compute the feature distance between the target feature group and the anchor feature group; the second determining module then combines the target probability and the target feature group distance to determine accurately whether the target object in the target image data collected in the first scene is a living object. The preset living body detection model trained and established in the second scene can thus be used effectively and efficiently to perform accurate living body detection on the target object in the image data collected in the first scene.
Although this specification provides method operation steps as described in the embodiments or flowcharts, more or fewer operation steps may be included based on conventional or non-inventive means. The order of steps listed in the embodiments is only one of many possible execution orders and does not represent the only one. When an actual apparatus or client product executes, the steps may be executed sequentially or in parallel according to the methods shown in the embodiments or drawings (for example, in a parallel-processor or multithreaded environment, or even a distributed data processing environment). The terms "comprise", "include", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, product, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, product, or device. Without further limitation, the presence of additional identical or equivalent elements in the process, method, product, or device including the listed elements is not excluded. Words such as "first" and "second" are used to denote names and do not denote any particular order.
Those skilled in the art also know that, besides implementing the controller purely as computer-readable program code, it is entirely possible, by logically programming the method steps, to enable the controller to achieve the same functions in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller may therefore be regarded as a hardware component, and the means included within it for realizing various functions may also be regarded as structures within the hardware component. Or even, the means for realizing various functions may be regarded both as software modules implementing the method and as structures within the hardware component.
This specification may be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, classes, and the like that perform particular tasks or implement particular abstract data types. This specification may also be practiced in distributed computing environments, where tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including storage devices.
From the description of the above embodiments, those skilled in the art can clearly understand that this specification can be implemented by means of software plus a necessary general hardware platform. Based on this understanding, the technical solution of this specification may essentially be embodied in the form of a software product, which may be stored in a storage medium such as a ROM/RAM, magnetic disk, or optical disc, and which includes a number of instructions causing a computer device (which may be a personal computer, a mobile terminal, a server, a network device, or the like) to execute the methods described in the embodiments, or parts thereof, of this specification.
本说明书中的各个实施例采用递进的方式描述,各个实施例之间相同或相似的部分互相参见即可,每个实施例重点说明的都是与其他实施例的不同之处。本说明书可用于众多通用或专用的计算机系统环境或配置中。例如:个人计算机、服务器计算机、手持设备或便携式设备、平板型设备、多处理器系统、基于微处理器的系统、置顶盒、可编程的电子设备、网络PC、小型计算机、大型计算机、包括以上任何系统或设备的分布式计算环境等等。The various embodiments in this specification are described in a progressive manner, and the same or similar parts between the various embodiments can be referred to each other, and each embodiment focuses on the differences from other embodiments. This manual can be used in many general-purpose or special-purpose computer system environments or configurations. For example: personal computers, server computers, handheld devices or portable devices, tablet devices, multi-processor systems, microprocessor-based systems, set-top boxes, programmable electronic devices, network PCs, small computers, large computers, including the above Distributed computing environment of any system or device, etc.
虽然通过实施例描绘了本说明书,本领域普通技术人员知道,本说明书有许多变形和变化而不脱离本说明书的精神,希望所附的权利要求包括这些变形和变化而不脱离本说明书的精神。Although this specification has been described through the embodiments, those of ordinary skill in the art know that there are many variations and changes in this specification without departing from the spirit of this specification, and it is hoped that the appended claims include these variations and changes without departing from the spirit of this specification.

Claims (16)

  1. A living body detection method, comprising:
    acquiring target image data, wherein the target image data comprises image data containing a target object collected in a first scene;
    extracting a target feature group from the target image data by using a preset living body detection model, and determining, through the preset living body detection model, a probability value that the target object is a non-living object as a target probability, wherein the preset living body detection model comprises a model trained with sample data of a second scene;
    determining a target feature group distance according to the target feature group and an anchor point feature group of the first scene, wherein the anchor point feature group of the first scene is determined according to sample data of the first scene; and
    determining, according to the target feature group distance and the target probability, whether the target object is a living object.
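To make the claimed pipeline concrete, the following sketch shows how a target feature group distance and a model-produced spoof probability might jointly decide liveness. It is illustrative only and not part of the claims: the Euclidean metric, the nearest-anchor aggregation, and both thresholds are assumptions (claim 5 below specifies a weighted-score variant of the final decision).

```python
import numpy as np

def feature_group_distance(target_features, anchor_features):
    """Distance between a target feature group and the scene's anchor feature
    group, taken here as the mean nearest-anchor Euclidean distance."""
    dists = []
    for f in target_features:
        nearest = min(np.linalg.norm(f - a) for a in anchor_features)
        dists.append(nearest)
    return float(np.mean(dists))

def is_live(target_features, anchor_features, spoof_probability,
            distance_threshold=1.0, probability_threshold=0.5):
    """Live only if the target features stay close to the deployment scene's
    anchors AND the model's spoof probability is low. Thresholds are
    illustrative placeholders, not values from the specification."""
    d = feature_group_distance(target_features, anchor_features)
    return d < distance_threshold and spoof_probability < probability_threshold
```

The anchor distance acts as a cross-scene consistency check: a model trained on the second scene can still reject inputs whose features drift far from what live faces look like in the first (deployment) scene.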
  2. The method according to claim 1, wherein the anchor point feature group of the first scene is determined in the following manner:
    collecting image data containing living objects in the first scene as positive sample data of the first scene;
    extracting sample features from the positive sample data by using the preset living body detection model;
    determining a center point feature according to the sample features;
    calculating feature distances between the sample features and the center point feature; and
    establishing the anchor point feature group of the first scene according to the sample features and the feature distances between the sample features and the center point feature.
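A minimal sketch of the anchor-group construction in claim 2, assuming the sample features are fixed-length vectors, the center point feature is their centroid, and distances are Euclidean; none of these representation choices are fixed by the claim:

```python
import numpy as np

def build_anchor_group(sample_features):
    """Build an anchor feature group from positive (live) samples collected in
    the deployment scene: the centroid serves as the center point feature, and
    each sample's distance to it is recorded alongside the sample."""
    feats = np.asarray(sample_features, dtype=float)
    center = feats.mean(axis=0)                       # center point feature
    distances = np.linalg.norm(feats - center, axis=1)
    return {"center": center, "features": feats, "distances": distances}
```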
  3. The method according to claim 2, wherein establishing the anchor point feature group of the first scene according to the sample features and the feature distances between the sample features and the center point feature comprises:
    selecting, from the sample features, sample features whose feature distance from the center point feature is greater than a feature distance threshold as qualifying sample features; and
    establishing the anchor point feature group of the first scene according to the center point feature and the qualifying sample features.
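The selection step in claim 3 keeps the samples that lie farther from the center than the threshold, so the anchor group consists of the centroid plus the boundary samples of the live-feature cluster. A sketch under the same vector-and-Euclidean assumptions as above (the threshold value is a hypothetical parameter):

```python
import numpy as np

def filter_anchor_features(sample_features, center, distance_threshold):
    """Retain only samples whose distance to the center point feature exceeds
    the threshold; these outlying samples plus the center form the group."""
    feats = np.asarray(sample_features, dtype=float)
    center = np.asarray(center, dtype=float)
    dists = np.linalg.norm(feats - center, axis=1)
    kept = feats[dists > distance_threshold]
    return {"center": center, "features": kept}
```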
  4. The method according to claim 2, wherein after the anchor point feature group of the first scene is established, the method further comprises:
    performing Huffman coding on the features in the anchor point feature group of the first scene respectively to obtain a compressed anchor point feature group of the first scene; and
    saving the compressed anchor point feature group of the first scene.
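Huffman coding, referenced in claim 4, assigns shorter codewords to more frequent symbols; since features are typically real-valued vectors, this sketch assumes they have first been quantized into discrete symbols (the quantization scheme is not specified by the claim):

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Build a Huffman code table for a sequence of quantized feature values;
    rarer values receive longer codewords, enabling lossless compression."""
    freq = Counter(symbols)
    if len(freq) == 1:  # degenerate case: a single distinct symbol
        return {next(iter(freq)): "0"}
    # Heap entries: [total weight, tie-breaker, [symbol, code], ...]
    heap = [[w, i, [s, ""]] for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    nxt = len(heap)
    while len(heap) > 1:
        lo = heapq.heappop(heap)
        hi = heapq.heappop(heap)
        for pair in lo[2:]:
            pair[1] = "0" + pair[1]  # left branch
        for pair in hi[2:]:
            pair[1] = "1" + pair[1]  # right branch
        heapq.heappush(heap, [lo[0] + hi[0], nxt] + lo[2:] + hi[2:])
        nxt += 1
    return {s: code for s, code in heap[0][2:]}
```

The resulting table (plus the encoded bitstream) is what would be persisted as the compressed anchor feature group.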
  5. The method according to claim 1, wherein determining, according to the target feature group distance and the target probability, whether the target object is a living object comprises:
    determining a first score according to the target feature group distance, and determining a second score according to the target probability;
    performing a weighted summation of the first score and the second score according to a preset weighting rule to obtain a third score;
    comparing the third score with a preset score threshold to obtain a comparison result; and
    determining, according to the comparison result, whether the target object is a living object.
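The weighted fusion of claim 5 can be sketched as follows. The mapping from distance to a score, the equal weights, and the decision threshold are all illustrative assumptions; the claim only requires that the two scores be combined by a preset weighting rule and compared against a threshold:

```python
def fused_decision(group_distance, spoof_probability,
                   w_distance=0.5, w_probability=0.5,
                   max_distance=2.0, score_threshold=0.5):
    """Map the anchor distance to a [0, 1] liveness score, invert the spoof
    probability, take the weighted sum, and threshold the result."""
    score_1 = max(0.0, 1.0 - group_distance / max_distance)   # first score
    score_2 = 1.0 - spoof_probability                         # second score
    score_3 = w_distance * score_1 + w_probability * score_2  # third score
    return score_3 >= score_threshold, score_3
```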
  6. The method according to claim 1, wherein in a case where it is determined, according to the target feature group distance and the target probability, that the target object is not a living object, the method further comprises:
    rejecting a permission application request corresponding to the target image data.
  7. The method according to claim 6, further comprising:
    calculating an error ratio within a preset time period;
    comparing the error ratio with a preset ratio threshold; and
    re-determining the anchor point feature group of the first scene in a case where it is determined that the error ratio is greater than the preset ratio threshold.
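The monitoring loop of claim 7 can be sketched as a sliding-window error counter that signals when the scene's anchor feature group should be rebuilt. The window length, the ratio threshold, and the definition of an erroneous decision are assumptions for illustration:

```python
import time
from collections import deque

class ErrorRatioMonitor:
    """Track the ratio of erroneous decisions inside a sliding time window;
    when it exceeds the threshold, the anchor group needs re-determination."""

    def __init__(self, window_seconds=3600.0, ratio_threshold=0.1):
        self.window = window_seconds
        self.threshold = ratio_threshold
        self.events = deque()  # (timestamp, was_error) pairs

    def record(self, was_error, now=None):
        now = time.time() if now is None else now
        self.events.append((now, was_error))
        # Drop events that have aged out of the window.
        while self.events and self.events[0][0] < now - self.window:
            self.events.popleft()

    def needs_anchor_rebuild(self):
        if not self.events:
            return False
        errors = sum(1 for _, e in self.events if e)
        return errors / len(self.events) > self.threshold
```

In deployment, a rejected request later confirmed as a live user (or vice versa) would be recorded as an error, and a sustained spike triggers recollection of first-scene positive samples.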
  8. A living body detection apparatus, comprising:
    an acquisition module, configured to acquire target image data, wherein the target image data comprises image data containing a target object collected in a first scene;
    a use module, configured to extract a target feature group from the target image data by using a preset living body detection model, and determine, through the preset living body detection model, a probability value that the target object is a non-living object as a target probability, wherein the preset living body detection model comprises a model trained with sample data of a second scene;
    a first determining module, configured to determine a target feature group distance according to the target feature group and an anchor point feature group of the first scene, wherein the anchor point feature group of the first scene is determined according to sample data of the first scene; and
    a second determining module, configured to determine, according to the target feature group distance and the target probability, whether the target object is a living object.
  9. The apparatus according to claim 8, further comprising an establishment module, comprising:
    a collection unit, configured to collect image data containing living objects in the first scene as positive sample data of the first scene;
    a calling unit, configured to call the preset living body detection model to extract sample features from the positive sample data;
    a first determining unit, configured to determine a center point feature according to the sample features;
    a calculation unit, configured to calculate feature distances between the sample features and the center point feature; and
    an establishment unit, configured to establish the anchor point feature group of the first scene according to the sample features and the feature distances between the sample features and the center point feature.
  10. The apparatus according to claim 9, wherein the establishment unit comprises:
    a screening subunit, configured to select, from the sample features, sample features whose feature distance from the center point feature is greater than a feature distance threshold as qualifying sample features; and
    an establishment subunit, configured to establish the anchor point feature group of the first scene according to the center point feature and the qualifying sample features.
  11. The apparatus according to claim 9, wherein the establishment module further comprises:
    an encoding unit, configured to perform Huffman coding on the features in the anchor point feature group of the first scene respectively to obtain a compressed anchor point feature group of the first scene; and
    a storage unit, configured to save the compressed anchor point feature group of the first scene.
  12. The apparatus according to claim 8, wherein the second determining module comprises:
    a scoring unit, configured to determine a first score according to the target feature group distance, determine a second score according to the target probability, and perform a weighted summation of the first score and the second score according to a preset weighting rule to obtain a third score;
    a first comparison unit, configured to compare the third score with a preset score threshold to obtain a comparison result; and
    a second determining unit, configured to determine, according to the comparison result, whether the target object is a living object.
  13. The apparatus according to claim 8, further comprising a processing module, configured to reject a permission application request corresponding to the target image data in a case where the second determining module determines that the target object is not a living object.
  14. The apparatus according to claim 13, further comprising an update module, comprising:
    a statistics unit, configured to calculate an error ratio within a preset time period;
    a second comparison unit, configured to compare the error ratio with a preset ratio threshold; and
    a third determining unit, configured to re-determine the anchor point feature group of the first scene in a case where it is determined that the error ratio is greater than the preset ratio threshold.
  15. A server, comprising a processor and a memory for storing processor-executable instructions, wherein the processor implements the steps of the method according to any one of claims 1 to 7 when executing the instructions.
  16. A face recognition device, comprising a processor and a memory for storing processor-executable instructions, wherein the processor, when executing the instructions, implements the steps of the method according to any one of claims 1 to 7 to determine whether a target object in target image data used for face recognition is a living object, and determines that face recognition has failed in a case where it is determined that the target object is not a living object.
PCT/CN2020/103962 2019-10-28 2020-07-24 Living body testing method and apparatus, server and facial recognition device WO2021082548A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911029215.5A CN111091047B (en) 2019-10-28 2019-10-28 Living body detection method and device, server and face recognition equipment
CN201911029215.5 2019-10-28

Publications (1)

Publication Number Publication Date
WO2021082548A1

Family

ID=70393044

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/103962 WO2021082548A1 (en) 2019-10-28 2020-07-24 Living body testing method and apparatus, server and facial recognition device

Country Status (2)

Country Link
CN (1) CN111091047B (en)
WO (1) WO2021082548A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114241544A (en) * 2021-11-15 2022-03-25 Beijing Baidu Netcom Science and Technology Co., Ltd. Image recognition method and device, electronic equipment and storage medium
CN114241544B (en) * 2021-11-15 2023-06-27 Beijing Baidu Netcom Science and Technology Co., Ltd. Image recognition method, device, electronic equipment and storage medium

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111091047B (en) * 2019-10-28 2021-08-27 Alipay (Hangzhou) Information Technology Co., Ltd. Living body detection method and device, server and face recognition equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130127827A1 (en) * 2011-11-22 2013-05-23 Derek Shiell Multiview Face Content Creation
CN109670452A (en) * 2018-12-20 2019-04-23 北京旷视科技有限公司 Method for detecting human face, device, electronic equipment and Face datection model
CN110084113A (en) * 2019-03-20 2019-08-02 阿里巴巴集团控股有限公司 Biopsy method, device, system, server and readable storage medium storing program for executing
CN111091047A (en) * 2019-10-28 2020-05-01 支付宝(杭州)信息技术有限公司 Living body detection method and device, server and face recognition equipment

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105608450B (en) * 2016-03-01 2018-11-27 Tianjin Zhongke Intelligent Identification Industry Technology Research Institute Co., Ltd. Heterogeneous face recognition method based on deep convolutional neural networks
CN106874857B (en) * 2017-01-19 2020-12-01 Tencent Technology (Shanghai) Co., Ltd. Living body discrimination method and system based on video analysis
US10970648B2 (en) * 2017-08-30 2021-04-06 International Business Machines Corporation Machine learning for time series using semantic and time series data
CN108038456B (en) * 2017-12-19 2024-01-26 Seetatech (Beijing) Technology Co., Ltd. Anti-spoofing method in face recognition systems
CN109255322B (en) * 2018-09-03 2019-11-19 Beijing Chengzhi Zhongke Haitu Technology Co., Ltd. Human face living body detection method and device
CN109583342B (en) * 2018-11-21 2022-11-04 Chongqing University of Posts and Telecommunications Human face living body detection method based on transfer learning
CN109858381A (en) * 2019-01-04 2019-06-07 Shenzhen OneConnect Smart Technology Co., Ltd. Living body detection method, device, computer equipment and storage medium
CN109934204A (en) * 2019-03-22 2019-06-25 Chongqing University of Posts and Telecommunications Facial expression recognition method based on convolutional neural networks
CN110378219B (en) * 2019-06-13 2021-11-19 Beijing Megvii Technology Co., Ltd. Living body detection method, device, electronic equipment and readable storage medium
CN110211604A (en) * 2019-06-17 2019-09-06 Guangdong Polytechnic Normal University Deep residual network structure for voice deformation detection

Also Published As

Publication number Publication date
CN111091047B (en) 2021-08-27
CN111091047A (en) 2020-05-01


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
    Ref document number: 20883432; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase
    Ref country code: DE
122 Ep: pct application non-entry in european phase
    Ref document number: 20883432; Country of ref document: EP; Kind code of ref document: A1