CN116959125A - Data processing method and related device - Google Patents


Info

Publication number
CN116959125A
Authority
CN
China
Prior art keywords: image, sample, negative, living body, countermeasure
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310787686.2A
Other languages
Chinese (zh)
Inventor
李博
陈兆宇
吴双
丁守鸿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202310787686.2A
Publication of CN116959125A


Classifications

    • G06V 40/45 Spoof detection, e.g. liveness detection: detection of the body part being alive
    • G06N 3/094 Computing arrangements based on biological models; neural networks; adversarial learning
    • G06V 10/765 Image or video recognition or understanding using classification, e.g. of video objects, using rules for classification or partitioning the feature space
    • G06V 10/774 Image or video recognition or understanding: generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
    • G06V 10/82 Image or video recognition or understanding using neural networks
    • G06V 40/16 Recognition of biometric patterns: human faces, e.g. facial parts, sketches or expressions
    • Y02T 10/40 Climate change mitigation technologies related to transportation: engine management systems

Abstract

The application provides a data processing method and a related device, which can be applied to the technical field of computer vision. In the data processing method provided by the embodiment of the application, a living body detection pre-training model is first obtained by training with a positive sample image and M classes of negative sample images, so that the generated living body detection pre-training model can determine whether an image contains real object living body information. Sample countermeasure images are then generated from the positive sample image and the M classes of negative sample images and given corresponding labels, and the living body detection pre-training model is trained again with the sample countermeasure images and their labels. In this way, data enhancement is performed from the countermeasure angle, and the living body detection pre-training model is optimized with the enhanced training data, so that the trained living body detection model can accurately detect a real object living body and its defense capability is improved.

Description

Data processing method and related device
Technical Field
The present application relates to the field of computer vision, and in particular, to a data processing method and related apparatus.
Background
With the continuous development of computer vision technology, face recognition has been widely applied in production and daily life, so the security of face recognition systems is of great importance. Living body detection is a key link in face recognition: it safeguards the security of the face recognition system by resisting face attacks.
At present, the training data used to train a living body detection model is usually enhanced by changing the illumination, brightness, angle or contrast of an image, or by applying moderate deformation to it. These data enhancement methods merely add transformations on top of the original data to enlarge the diversity within the original data categories, so the defense capability of the resulting living body detection model is poor.
Disclosure of Invention
The embodiment of the application provides a data processing method and a related device, which solve the problem of poor defensive ability of a living body detection model in the prior art.
One aspect of the present application provides a data processing method, including:
acquiring a positive sample image and M types of negative sample images, wherein the positive sample image contains real object living body information, the positive sample image carries positive sample labels, the M types of negative sample images correspond to M negative sample types, the negative sample image does not contain real object living body information, and the negative sample image carries negative sample labels, and M is more than or equal to 1;
Training a living body detection model according to the positive sample image carrying the positive sample label and the M-class negative sample image carrying the negative sample label, and generating a living body detection pre-training model;
inputting a positive sample image into a diffusion model to generate a positive sample countermeasure image, and inputting M types of negative sample images into the diffusion model to generate M types of negative sample countermeasure images, wherein the positive sample countermeasure image is an image generated by adding interference information to the positive sample image, and the negative sample countermeasure image is an image generated by adding the interference information to the negative sample image;
setting a first target label for the positive sample countermeasure image and setting a second target label for each negative sample countermeasure image in the M-type negative sample countermeasure image, wherein the first target label is one of the M-type negative sample labels corresponding to the M-type negative sample image, and the second target label is used for representing that the negative sample countermeasure image contains real object living body information;
training the living body detection pre-training model according to the positive sample countermeasure image carrying the first target label and the M-class negative sample countermeasure image carrying the second target label, so as to optimize parameters of the living body detection pre-training model and generate a trained living body detection model.
Another aspect of the present application provides a data processing apparatus, comprising: a sample image acquisition module, a model pre-training module, a sample countermeasure image generation module, a target label setting module and a model retraining module. Specifically:
the sample image acquisition module is used for acquiring a positive sample image and M types of negative sample images, wherein the positive sample image contains real object living body information, the positive sample image carries positive sample labels, the M types of negative sample images correspond to M negative sample types, the negative sample image does not contain real object living body information, and the negative sample image carries negative sample labels, and M is more than or equal to 1;
the model pre-training module is used for training the living body detection model according to the positive sample image carrying the positive sample label and the M-type negative sample image carrying the negative sample label to generate a living body detection pre-training model;
the sample countermeasure image generation module is used for inputting a positive sample image into the diffusion model to generate a positive sample countermeasure image, inputting M types of negative sample images into the diffusion model to generate M types of negative sample countermeasure images, wherein the positive sample countermeasure image is an image generated by adding interference information on the positive sample image, and the negative sample countermeasure image is an image generated by adding the interference information on the negative sample image;
The target label setting module is used for setting a first target label for the positive sample countermeasure image and setting a second target label for each negative sample countermeasure image in the M-type negative sample countermeasure image, wherein the first target label is one of the M-type negative sample labels corresponding to the M-type negative sample image, and the second target label is used for representing that the negative sample countermeasure image contains real object living body information;
the model retraining module is used for training the living body detection pre-training model according to the positive sample countermeasure image carrying the first target label and the M-class negative sample countermeasure image carrying the second target label, so as to optimize parameters of the living body detection pre-training model and generate a trained living body detection model.
In another implementation of the embodiment of the present application, the model retraining module is further configured to:
inputting the positive sample countermeasure image carrying the first target label into the living body detection pre-training model, and outputting a first probability value and M second probability values corresponding to the positive sample countermeasure image through the living body detection pre-training model, wherein the first probability value is used for representing the possibility that the positive sample countermeasure image contains living body information of a real object, and the M second probability values are used for representing the possibility that the positive sample countermeasure image corresponds to each negative sample category in the M negative sample categories;
inputting the M classes of negative sample countermeasure images carrying the second target labels into the living body detection pre-training model, and outputting a third probability value and M fourth probability values corresponding to each negative sample countermeasure image through the living body detection pre-training model, wherein the third probability value is used for characterizing the possibility that the negative sample countermeasure image contains living body information of a real object, and the M fourth probability values are used for characterizing the possibility that the negative sample countermeasure image corresponds to each of the M negative sample categories;
calculating a loss function according to the first probability value and the M second probability values corresponding to the positive sample countermeasure images, and the third probability value and the M fourth probability values corresponding to each negative sample countermeasure image;
and training the living body detection pre-training model according to the loss function to optimize parameters of the living body detection pre-training model, and generating a trained living body detection model.
In another implementation of the embodiment of the present application, the model retraining module is further configured to:
determining, from the M second probability values, the second probability value of the negative sample category corresponding to the positive sample countermeasure image as a first target probability value;
determining, from the M fourth probability values corresponding to each negative sample countermeasure image, the largest probability value as a second target probability value;
And calculating a loss function according to the first probability value, the first target probability value, the second target probability value corresponding to each negative sample countermeasure image and the third probability value.
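The embodiments above do not spell out the exact form of this loss function. The following is a minimal sketch, assuming a targeted negative log-likelihood that raises the first target probability value for positive countermeasure images and the third ("live") probability value for negative countermeasure images; the names, tensor shapes and the omission of possible margin terms over the remaining probability values are all assumptions.

```python
import torch
import torch.nn.functional as F

def boundary_constraint_loss(pos_logits, pos_target_cls, neg_logits):
    """Hedged sketch of the loss over countermeasure images.

    pos_logits:     [Bp, 1 + M] model outputs for positive countermeasure images
    pos_target_cls: [Bp] attack class assigned as the first target label (1..M)
    neg_logits:     [Bn, 1 + M] model outputs for negative countermeasure images
    Class 0 is assumed to be the "contains real object living body information" class.
    """
    pos_probs = F.softmax(pos_logits, dim=-1)
    neg_probs = F.softmax(neg_logits, dim=-1)

    # First target probability value: probability of the assigned attack class.
    p_target = pos_probs.gather(1, pos_target_cls.unsqueeze(1)).squeeze(1)
    # Third probability value: probability that the negative countermeasure image is "live".
    p_live = neg_probs[:, 0]

    # Targeted cross-entropy: push both probabilities toward 1.
    return -(torch.log(p_target + 1e-8).mean() + torch.log(p_live + 1e-8).mean())
```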
In another implementation of the embodiment of the present application, the model retraining module is further configured to:
calculating a gradient value of the positive sample countermeasure image according to the loss function, and generating the positive sample countermeasure image gradient value;
calculating the gradient value of the M-class negative sample countermeasure image according to the loss function, and generating the gradient value of the M-class negative sample countermeasure image;
performing gradient processing on the positive sample countermeasure image according to the positive sample countermeasure image gradient value to generate a positive sample countermeasure composite image;
performing gradient processing on the corresponding negative sample countermeasure image according to each negative sample countermeasure image gradient value in the M classes of negative sample countermeasure image gradient values, and generating M classes of negative sample countermeasure composite images;
and training the living body detection pre-training model according to the positive sample countermeasure composite image and the M classes of negative sample countermeasure composite images, so as to optimize parameters of the living body detection pre-training model and generate a trained living body detection model.
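The update rule behind the "gradient processing" is not given here. A common way to turn a loss gradient into a perturbed image is a signed-gradient (FGSM-style) step; the sketch below makes that assumption, and the step size and sign convention are likewise illustrative.

```python
import torch

def gradient_composite(adv_images, loss, step_size=2.0 / 255):
    """Sketch of 'gradient processing': perturb each countermeasure image along
    the sign of its loss gradient to obtain a countermeasure composite image.
    adv_images must have been created with requires_grad=True."""
    grads, = torch.autograd.grad(loss, adv_images)
    composite = adv_images - step_size * grads.sign()   # descend the targeted loss
    return composite.clamp(0.0, 1.0).detach()           # stay in the valid pixel range
```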
In another implementation of the embodiment of the present application, the model retraining module is further configured to:
Inputting the positive sample countermeasure composite image into the living body detection pre-training model, and generating, through the living body detection pre-training model, a first prediction result of whether the positive sample countermeasure composite image contains living body information of a real object;
inputting the M classes of negative sample countermeasure composite images into the living body detection pre-training model, and predicting, through the living body detection pre-training model, M classes of second prediction results of whether the M classes of negative sample countermeasure composite images contain living body information of a real object;
training the living body detection pre-training model according to the first prediction result, the first target label, the M-class second prediction result and the M-class second target label to optimize parameters of the living body detection pre-training model and generate a trained living body detection model.
In another implementation of the embodiment of the present application, the model retraining module is further configured to:
inputting the positive sample countermeasure composite image into the living body detection pre-training model, and generating, through a classification module in the living body detection pre-training model, a first prediction probability that the positive sample countermeasure composite image contains living body information of a real object and M second prediction probabilities that the positive sample countermeasure composite image corresponds to the M negative sample categories;
And determining the maximum value of the first prediction probability and the M second prediction probabilities as a first prediction result.
In another implementation of the embodiment of the present application, the model retraining module is further configured to:
inputting each type of negative-sample countermeasure synthetic image in the M types of negative-sample countermeasure synthetic images into a living body detection pre-training model, and generating a third prediction probability that each type of negative-sample countermeasure synthetic image in the M types of negative-sample countermeasure synthetic images contains living body information of a real object and M fourth prediction probabilities that each type of negative-sample countermeasure synthetic images corresponds to M negative-sample categories through a classification module in the living body detection pre-training model;
and determining whether the M-class negative-sample anti-synthesis image contains M-class second prediction results of real object living body information or not according to the third prediction probability corresponding to each class of negative-sample anti-synthesis image in the M-class negative-sample anti-synthesis image and the maximum value in the M fourth prediction probabilities.
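Reading the two embodiments above together, the prediction result appears to be determined by the class with the largest predicted probability. A minimal sketch under that reading (the class ordering is an assumption):

```python
import torch
import torch.nn.functional as F

def prediction_result(logits):
    """Return True where an image is predicted to contain real object living
    body information, assuming class 0 is the live class and classes 1..M are
    the negative sample categories."""
    probs = F.softmax(logits, dim=-1)    # first/third prediction probability plus M others
    pred_class = probs.argmax(dim=-1)    # class holding the maximum probability
    return pred_class == 0
```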
In another implementation of the embodiment of the present application, the sample countermeasure image generation module is further configured to:
inputting the positive sample image into a diffusion model, and encoding the positive sample image through an image encoding network in the diffusion model to generate positive sample image vector data;
Inputting the positive sample image vector data into a diffusion network in a diffusion model, and adding interference data to the positive sample image vector data through the diffusion network to generate positive sample interference vector data;
inputting the positive sample interference vector data into a feature extraction network in the diffusion model, and carrying out feature extraction on the positive sample interference vector data through the feature extraction network to generate a positive sample interference feature vector;
inputting the positive sample interference feature vector into a noise reduction network in the diffusion model, and carrying out noise reduction on the positive sample interference feature vector through the noise reduction network to generate positive sample noise reduction vector data;
the positive sample noise reduction vector data is input to an image decoding network in the diffusion model to generate a positive sample countermeasure image.
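A schematic wiring of the five sub-networks described above. The concrete encoder, diffusion, feature extraction, noise reduction and decoder networks are placeholders supplied by the caller (an assumption, since the patent does not fix their architectures); the sketch only shows how data flows between them.

```python
import torch.nn as nn

class DiffusionCountermeasureGenerator(nn.Module):
    """Sketch of the diffusion model flow described above."""

    def __init__(self, encoder, diffusion_net, feature_net, denoise_net, decoder):
        super().__init__()
        self.encoder = encoder              # image encoding network
        self.diffusion_net = diffusion_net  # diffusion network (adds interference data)
        self.feature_net = feature_net      # feature extraction network
        self.denoise_net = denoise_net      # noise reduction network
        self.decoder = decoder              # image decoding network

    def forward(self, image):
        vec = self.encoder(image)           # sample image vector data
        noisy = self.diffusion_net(vec)     # sample interference vector data
        feats = self.feature_net(noisy)     # sample interference feature vector
        denoised = self.denoise_net(feats)  # sample noise reduction vector data
        return self.decoder(denoised)       # sample countermeasure image
```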
In another implementation of the embodiment of the present application, the sample countermeasure image generation module is further configured to:
inputting each class of negative sample images in the M classes of negative sample images into the diffusion model, and encoding each class of negative sample images through an image encoding network in the diffusion model to generate negative sample image vector data corresponding to each class of negative sample images in the M classes of negative sample images;
inputting the negative sample image vector data corresponding to each class of negative sample images into a diffusion network in the diffusion model, and adding interference data to the negative sample image vector data through the diffusion network to generate negative sample interference vector data corresponding to each class of negative sample images;
inputting the negative sample interference vector data corresponding to each class of negative sample images into a feature extraction network in the diffusion model, and carrying out feature extraction on the negative sample interference vector data through the feature extraction network to generate negative sample interference feature vectors corresponding to each class of negative sample images;
inputting the negative sample interference feature vectors corresponding to each class of negative sample images into a noise reduction network in the diffusion model, and carrying out noise reduction on the negative sample interference feature vectors through the noise reduction network to generate negative sample noise reduction vector data corresponding to each class of negative sample images;
and inputting the negative sample noise reduction vector data corresponding to each class of negative sample images into an image decoding network in the diffusion model, to generate the negative sample countermeasure image corresponding to each class of negative sample images in the M classes of negative sample images.
In another implementation of the embodiment of the present application, the sample countermeasure image generation module is further configured to:
training the living body detection pre-training model according to the positive sample countermeasure image carrying the first target label and the M-class negative sample countermeasure image carrying the second target label, so as to optimize the learning rate and the iteration step length of the living body detection pre-training model and generate a trained living body detection model.
In another implementation manner of the embodiment of the present application, the data processing apparatus further includes: a target image acquisition module and a real object living body detection prediction module. Specifically:
the target image acquisition module is used for acquiring a target image;
the real object living body detection prediction module is used for inputting the target image into a living body detection model which is trained, and outputting a target prediction result of which the target image contains real object living body information through the trained living body detection model.
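A minimal inference sketch corresponding to the two modules above, assuming the trained living body detection model outputs 1 + M class scores with class 0 meaning that the target image contains real object living body information; the tensor shape is likewise an assumption.

```python
import torch

def predict_target_image(trained_model, target_image):
    """target_image: preprocessed tensor of shape [3, 224, 224] (assumed)."""
    trained_model.eval()
    with torch.no_grad():
        probs = torch.softmax(trained_model(target_image.unsqueeze(0)), dim=-1)
    contains_live_object = probs.argmax(dim=-1).item() == 0   # target prediction result
    return contains_live_object, probs.squeeze(0)
```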
Another aspect of the present application provides a computer apparatus comprising:
memory, transceiver, processor, and bus system;
wherein the memory is used for storing programs;
the processor is used for executing the program in the memory, including performing the methods of the above aspects;
the bus system is used for connecting the memory and the processor, so that the memory and the processor communicate with each other.
Another aspect of the application provides a computer readable storage medium having instructions stored therein which, when run on a computer, cause the computer to perform the methods of the above aspects.
Another aspect of the application provides a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device performs the methods provided in the above aspects.
From the above technical solutions, the embodiment of the present application has the following advantages:
the application provides a data processing method and a related device, wherein the method comprises the following steps: acquiring a positive sample image and M classes of negative sample images, wherein the positive sample image contains real object living body information and carries a positive sample label, the M classes of negative sample images correspond to M negative sample categories, the negative sample images do not contain real object living body information and carry negative sample labels, and M is more than or equal to 1; training a living body detection model according to the positive sample image carrying the positive sample label and the M classes of negative sample images carrying the negative sample labels, and generating a living body detection pre-training model; inputting the positive sample image into a diffusion model to generate a positive sample countermeasure image, and inputting the M classes of negative sample images into the diffusion model to generate M classes of negative sample countermeasure images, wherein the positive sample countermeasure image is an image generated by adding interference information to the positive sample image, and the negative sample countermeasure image is an image generated by adding the interference information to the negative sample image; setting a first target label for the positive sample countermeasure image and setting a second target label for each negative sample countermeasure image in the M classes of negative sample countermeasure images, wherein the first target label is one of the M classes of negative sample labels corresponding to the M classes of negative sample images, and the second target label is used for representing that the negative sample countermeasure image contains real object living body information; and training the living body detection pre-training model according to the positive sample countermeasure image carrying the first target label and the M classes of negative sample countermeasure images carrying the second target label, so as to optimize parameters of the living body detection pre-training model and generate a trained living body detection model. In the data processing method provided by the embodiment of the application, the living body detection pre-training model is first trained with the positive sample image and the M classes of negative sample images, so that the generated living body detection pre-training model can determine whether an image contains real object living body information; sample countermeasure images are then generated from the positive sample image and the M classes of negative sample images and given corresponding labels; the living body detection pre-training model is then trained again with the sample countermeasure images and their labels. In this way, data enhancement is performed from the countermeasure angle, and the living body detection pre-training model is optimized with the enhanced training data, so that the trained living body detection model can accurately detect a real object living body and its defense capability is improved.
Drawings
FIG. 1 is a schematic diagram of a data processing system according to an embodiment of the present application;
FIG. 2 is a flow chart of a data processing method according to an embodiment of the present application;
FIG. 3 is a flowchart of a data processing method according to another embodiment of the present application;
FIG. 4 is a flowchart of a data processing method according to another embodiment of the present application;
FIG. 5 is a flowchart of a data processing method according to another embodiment of the present application;
FIG. 6 is a flowchart of a data processing method according to another embodiment of the present application;
FIG. 7 is a flowchart of a data processing method according to another embodiment of the present application;
FIG. 8 is a flowchart of a data processing method according to another embodiment of the present application;
FIG. 9 is a flowchart of a data processing method according to another embodiment of the present application;
FIG. 10 is a flowchart of a data processing method according to another embodiment of the present application;
FIG. 11 is a flowchart of a data processing method according to another embodiment of the present application;
FIG. 12 is a flowchart of a data processing method according to another embodiment of the present application;
FIG. 13 is a schematic diagram illustrating a flow of a data processing method according to an embodiment of the present application;
FIG. 14 is a schematic diagram of a pre-training stage of a living body detection model according to an embodiment of the present application;
FIG. 15 is a schematic diagram of the stage of generating boundary-constrained countermeasure samples provided by an embodiment of the present application;
FIG. 16 is a block diagram of a diffusion model according to an embodiment of the present application;
FIG. 17 (a) is a schematic diagram of the classification boundary surface of a living body detection model trained using clean data according to an embodiment of the present application;
FIG. 17 (b) is a schematic diagram of the classification boundary surface of a living body detection model trained using mixed countermeasure data according to an embodiment of the present application;
FIG. 18 is a schematic diagram illustrating a data processing apparatus according to an embodiment of the present application;
FIG. 19 is a schematic diagram of a data processing apparatus according to another embodiment of the present application;
fig. 20 is a schematic diagram of a server structure according to an embodiment of the present application.
Detailed Description
The embodiment of the application provides a data processing method. First, a living body detection pre-training model is trained with a positive sample image and M classes of negative sample images, so that the generated living body detection pre-training model can determine whether an image contains real object living body information. Sample countermeasure images are then generated from the positive sample image and the M classes of negative sample images and given corresponding labels, and the living body detection pre-training model is trained again with the sample countermeasure images and their labels. Data enhancement is thus performed from the angle of countermeasure attack and defense, and the living body detection pre-training model is optimized with the enhanced training data, so that the trained living body detection model can accurately detect a real object living body and its defense capability is improved.
The terms "first," "second," "third," "fourth" and the like in the description and in the claims and in the above drawings, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the application described herein may be implemented, for example, in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "includes" and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed or inherent to such process, method, article, or apparatus.
Artificial Intelligence (AI) is the theory, method, technique and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain optimal results. In other words, artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and to produce new intelligent machines that can react in a way similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning and decision-making.
Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, including both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. Artificial intelligence software technologies mainly include computer vision technology, speech processing technology, natural language processing technology, and machine learning/deep learning.
Computer Vision (CV) is the science of studying how to make machines "see"; more specifically, it uses cameras and computers in place of human eyes to recognize and measure targets, and further performs graphics processing so that the computer produces images more suitable for human observation or for transmission to instruments for detection. As a scientific discipline, computer vision studies related theories and technologies in an attempt to build artificial intelligence systems that can acquire information from images or multidimensional data. Computer vision techniques typically include image processing, image recognition, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D techniques, virtual reality, augmented reality, and simultaneous localization and mapping, as well as common biometric recognition techniques such as face recognition and fingerprint recognition.
Machine Learning (ML) is a multi-field interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory and other disciplines. It studies how a computer can simulate or implement human learning behavior to acquire new knowledge or skills and reorganize existing knowledge structures so as to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent; it is applied throughout all areas of artificial intelligence. Machine learning and deep learning typically include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstration.
With the continuous development of computer vision technology, face recognition has been widely applied in production and daily life, and the security of face recognition systems is therefore of great importance. Living body detection, as a key link in face recognition, guarantees the security of the face recognition system by resisting face attacks; for example, in intelligent face-scanning payment services, real object living body detection strongly guarantees the security of face-scanning payment.
However, in practical application scenarios of living body detection, owing to the existence of illicit industries, attack types have become highly diversified and attack forms keep emerging in new variations. Images containing face information are attacked in various forms: for example, a photo containing a face is printed or displayed on a screen (such as a tablet computer), or face information is attacked in the form of a head model or a mask. Among the endlessly varied attack materials there is always one that the model has never seen, and such material is used to bypass the living body detection model; once an unknown scene or unknown object appears, the judgment of the living body detection model is disturbed and its defense capability is reduced.
The reason is that the training data cannot cover the entire class space; in other words, even if the training data reaches several millions of samples, it is still sparse in the entire class space. Therefore, the embodiment of the present application provides a data processing method in which the living body detection model is trained with both sample images and sample countermeasure images, so that the performance of the living body detection model is improved as a whole, the defense capability of the living body detection model is improved, and living faces can be accurately detected.
Face recognition is an important component of biometric identification, and its security has also attracted much attention. Face living body detection, serving as the outpost that guarantees the security of face recognition, has been studied extensively. Before deep neural networks appeared, face living body detection mainly relied on manually extracted texture features of living bodies to classify real persons versus attacks; better detection performance can be achieved with convolutional neural networks. The related art mainly falls into two categories: face living body detection methods based on manual feature extraction, and face living body detection methods based on deep neural networks.
Face living body detection methods based on manual feature extraction are mainly deployed in lightweight face recognition services at the front end, where face living body detection must achieve good detection performance with limited peripheral resources. The main technical principle of such algorithms is therefore to use traditional manual feature extraction methods and classification algorithms to perform the real-person/attack binary classification. The general flow is as follows: first, face samples of many real persons and many recaptured attack samples are collected on multiple camera devices as the training data set (on a scale of fewer than 100,000 samples); then features are extracted from the real-person and recaptured samples using manual feature extraction methods such as binary statistical image features; and finally classification is performed using classifiers such as SVMs and Bayesian networks.
The appearance of deep neural networks has greatly advanced visual tasks. Face living body detection methods based on deep neural networks can be described from two aspects, 2D image living body texture analysis and 3D depth information processing: 1) Face living body detection mainly relies on capturing the fine texture and material differences between 2D RGB images of real persons and attacks for classification, with a CNN responsible for extracting features of real-person and attack images and training a classification model. During training, the classification loss function constrains the parameter updates of the CNN to converge to a state with a low classification error. 2) In addition to 2D RGB images, certain hardware devices (such as binocular cameras, RGBD cameras, structured-light sensors and Kinect) can acquire images with 3D information. A depth image of the face can be obtained directly or indirectly through these hardware devices, and the depth image is then used to judge whether the current user is a real face or a planar recapture attack made of paper, a screen or the like.
However, whether based on manual feature extraction or on deep neural networks, the data enhancement methods considered are illumination, brightness, angle, contrast, moderate deformation and the like. These data enhancement methods all add transformations on top of the original data to enlarge the data diversity of the original data categories and thereby improve the accuracy and generalization of the living body detection model; none of them considers data enhancement from the angle of countermeasure attack. Therefore, the embodiment of the present application provides a data processing method that enhances the detection capability of the face living body model with countermeasure data based on a diffusion model: the specificity of countermeasure samples in the data distribution is used to augment the training data, and the living body detection model is trained with the augmented training data, which further improves the defense capability and generalization of the living body detection model.
According to the data processing method provided by the embodiment of the application, the living body detection pre-training model is first trained with the positive sample image and the M classes of negative sample images, so that the generated living body detection pre-training model can determine whether an image contains real object living body information; sample countermeasure images are then generated from the positive sample image and the M classes of negative sample images and given corresponding labels; the living body detection pre-training model is then trained again with the sample countermeasure images and their labels. Data enhancement is thus performed from the countermeasure angle and the living body detection pre-training model is optimized accordingly, so that the trained living body detection model can accurately detect living faces and its defense capability is improved.
The living body detection model trained by the method provided by the embodiment of the application can be deployed directly in front of the face recognition model to inspect the picture input to face recognition: if the picture is a real face, it enters the subsequent recognition flow; if it is an attack picture, an error is reported and the user is prompted to retry. The living body detection model trained by the method provided by the embodiment of the application can be applied to all face recognition related applications, including but not limited to: online face payment scenarios, offline face payment scenarios, face access control and unlocking scenarios, mobile phone face recognition scenarios, automatic face recognition customs clearance scenarios, and the like.
For ease of understanding, referring to fig. 1, fig. 1 is a diagram illustrating the application environment of the data processing method according to an embodiment of the present application. As shown in fig. 1, the data processing method is applied to a data processing system. The data processing system includes a server and a terminal device. The server may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, a content delivery network (Content Delivery Network, CDN), and basic cloud computing services such as big data and artificial intelligence platforms. The terminal may be, but is not limited to, a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, or the like. The terminal and the server may be directly or indirectly connected through wired or wireless communication, which is not limited in the embodiments of the present application.
In the data processing system, the server first acquires a positive sample image and M classes of negative sample images, wherein the positive sample image contains real object living body information and carries a positive sample label, the M classes of negative sample images correspond to M negative sample categories, the negative sample images do not contain real object living body information and carry negative sample labels, and M is more than or equal to 1. Secondly, the server trains a living body detection model according to the positive sample image carrying the positive sample label and the M classes of negative sample images carrying the negative sample labels, and generates a living body detection pre-training model. Then, the server inputs the positive sample image into a diffusion model to generate a positive sample countermeasure image, and inputs the M classes of negative sample images into the diffusion model to generate M classes of negative sample countermeasure images, wherein the positive sample countermeasure image is an image generated by adding interference information to the positive sample image, and the negative sample countermeasure image is an image generated by adding the interference information to the negative sample image. The server then sets a first target label for the positive sample countermeasure image and a second target label for each negative sample countermeasure image in the M classes of negative sample countermeasure images, wherein the first target label is one of the M classes of negative sample labels corresponding to the M classes of negative sample images, and the second target label is used for representing that the negative sample countermeasure image contains real object living body information. Finally, the server trains the living body detection pre-training model according to the positive sample countermeasure image carrying the first target label and the M classes of negative sample countermeasure images carrying the second target label, so as to optimize parameters of the living body detection pre-training model and generate a trained living body detection model.
The data processing method of the present application will be described below from the perspective of the server. It should be noted that the data processing method provided by the present application is not limited to the scenario of detecting a real human face; it can also be applied to living body detection of other biometric features, such as the palm. For ease of understanding, the embodiments of the application are illustrated by taking the real face detection scenario as an example.
Referring to fig. 2, the data processing method provided in the embodiment of the application includes steps S110 to S150. It should be noted that steps S110 to S150 form the process of training the living body detection model. Specifically:
s110, acquiring a positive sample image and an M-type negative sample image.
The positive sample image contains real object living body information, the positive sample image carries a positive sample label, the M negative sample images correspond to M negative sample categories, the negative sample image does not contain real object living body information, the negative sample image carries a negative sample label, and M is more than or equal to 1.
It can be understood that, in order to make the living body detection model better predict whether the target image contains the face information, when the living body detection model is trained, the living body detection model is subjected to comparison learning through the positive sample image and the negative sample image so as to improve the accuracy of predicting whether the target image contains the face information.
The positive sample image is a real human face photo directly acquired from an acquisition device (such as a camera acquisition device), the positive sample image contains real object living body information, the positive sample image carries a positive sample label, and the positive sample label is used for representing that the positive sample image contains real object living body information.
The negative sample image is an image obtained by secondary processing of a real face photo (also called a face attack picture), such as an image obtained by high-definition recapture of a real face photo, an image obtained by displaying a real face photo on a display, an image obtained by printing a real face photo, an image obtained by combining or splicing part of the features in a real face photo with a real face (for example, an image in which a printed picture of the eyes in a real face photo is placed over the eyes of a real face to occlude them), or an image of a head model or mask made from real object living body information. The negative sample image carries a negative sample label, and the negative sample label is used for representing that the negative sample image does not contain real object living body information, as well as the category of the negative sample image.
For ease of understanding, in one embodiment of the application, the negative sample categories include: the whole paper sheet type, the paper cutout type, the head model/mask type, and the high-definition recapture type. Referring to Table 1, the positive sample label of the positive sample image is 0; the negative sample label corresponding to a negative sample image obtained by printing a real face photo (a whole-paper-sheet-type negative sample image) is 1; the negative sample label corresponding to a negative sample image in which part of the features of a real face photo are printed out and combined or spliced with a real face (a paper-cutout-type negative sample image) is 2; the negative sample label corresponding to a negative sample image of a head model or mask made from real object living body information (a head-model/mask-type negative sample image) is 3; and the negative sample label corresponding to a negative sample image obtained by high-definition recapture of a real face photo (a high-definition-recapture-type negative sample image) is 4.
TABLE 1
Image type                   Real label
Real face                    0
Whole paper sheet            1
Paper cutout                 2
Head model / mask            3
High-definition recapture    4
It should be noted that, in the embodiment of the present application, the number of positive sample images is not limited, and the number of each type of negative sample image in the M types of negative sample images is not limited.
The image sizes of the positive sample image and the negative sample image meet the size requirement of the living body detection model on input pictures, that is, the positive sample image and the negative sample image are pre-processed images. The pre-processing flow is as follows, with a sketch given after this paragraph: 1) a face detection model is used to determine the five-point landmark coordinates of the face in the image; 2) a circumscribed rectangle is obtained from the five-point coordinates and expanded by 2.226 times to obtain a detection frame of roughly double the face size; 3) the face detection frame is further expanded to twice its size, the face image is cropped out using the doubled detection frame, and the cropped face image is resized to 256 × 256 × 3; 4) finally, the 256 × 256 × 3 image is randomly cropped to obtain a standard sample image with a final size of 224 × 224 × 3.
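A minimal sketch of this pre-processing under stated assumptions: `detect_landmarks` is a hypothetical five-point landmark detector, and the expansion factors are simplified to a single doubling of the face box.

```python
import cv2
import numpy as np

def preprocess_face(image_bgr, detect_landmarks, out_size=256, crop_size=224):
    """Crop and resize a face region as described above (illustrative only)."""
    pts = np.asarray(detect_landmarks(image_bgr), dtype=np.float32)   # 5 x 2 landmarks
    x, y, w, h = cv2.boundingRect(pts)                                # circumscribed rectangle
    cx, cy = x + w / 2.0, y + h / 2.0
    side = int(max(w, h) * 2)                                         # roughly double face size
    x0, y0 = max(int(cx - side / 2), 0), max(int(cy - side / 2), 0)

    h_img, w_img = image_bgr.shape[:2]
    face = image_bgr[y0:min(y0 + side, h_img), x0:min(x0 + side, w_img)]
    face = cv2.resize(face, (out_size, out_size))                     # 256 x 256 x 3

    dy, dx = np.random.randint(0, out_size - crop_size + 1, size=2)   # random crop
    return face[dy:dy + crop_size, dx:dx + crop_size]                 # 224 x 224 x 3
```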
S120, training the living body detection model according to the positive sample image carrying the positive sample label and the M-type negative sample image carrying the negative sample label, and generating a living body detection pre-training model.
It can be understood that the living body detection model is trained according to the positive sample image carrying the positive sample label and the M-type negative sample image carrying the negative sample label, so that the living body detection pre-training model is generated by comparing and learning the positive sample image and the M-type negative sample image, and the living body detection pre-training model can accurately classify whether the image contains living body information of a real object or not.
In a specific embodiment of the present application, the pre-training is performed according to the classification of the positive sample image and the negative sample images, i.e. the living body detection model is trained on the above five classes. For an input image, the living body detection pre-training model can output a prediction probability of being a real face, a prediction probability of the whole paper sheet type, a prediction probability of the paper cutout type, a prediction probability of the head model/mask type, and a prediction probability of the high-definition recapture type. Each prediction probability represents the likelihood that the input image belongs to that class, and the sum of all prediction probabilities is 1. For example, if the input image is a real face photo, the prediction probability of the real face class approaches 1 and the other four prediction probabilities approach 0; if the input image is a whole paper sheet, the prediction probability of the whole paper sheet type approaches 1 while the other four approach 0, and so on.
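A minimal sketch of such a (1 + M)-way classifier; the backbone, feature dimension and the single linear head are assumptions, since the patent does not fix a concrete architecture.

```python
import torch
import torch.nn as nn

class LivenessPretrainClassifier(nn.Module):
    """Five-way (real face + four attack types) classification head."""

    def __init__(self, backbone, feature_dim=512, num_classes=5):
        super().__init__()
        self.backbone = backbone                       # placeholder feature extractor
        self.classifier = nn.Linear(feature_dim, num_classes)

    def forward(self, x):
        logits = self.classifier(self.backbone(x))
        return torch.softmax(logits, dim=-1)           # five probabilities summing to 1
```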
S130, inputting the positive sample image into the diffusion model to generate a positive sample countermeasure image, and inputting the M-class negative sample image into the diffusion model to generate an M-class negative sample countermeasure image.
The positive sample countermeasure image is an image generated by adding interference information to the positive sample image, and the negative sample countermeasure image is an image generated by adding interference information to the negative sample image.
It is understood that step S130 is a data enhancement step for the training data. Each positive sample image is processed by the diffusion model to generate a positive sample countermeasure image; specifically, interference information is added to the positive sample image through the diffusion model to generate the positive sample countermeasure image. Each negative sample image is processed by the diffusion model to generate a negative sample countermeasure image; specifically, interference information is added to the negative sample image through the diffusion model to generate the negative sample countermeasure image. The diffusion model may process images in batches by image type, or process all types of images at the same time. Since the number of positive sample images and negative sample images used to train the living body detection model in step S120 is large, and only part of them requires corresponding countermeasure images in the subsequent training, a part of the positive sample images and a part of the negative sample images may be extracted from the full sets by random sampling before step S130 to generate the positive sample countermeasure images and the negative sample countermeasure images.
S140, setting a first target label for the positive sample challenge image, and setting a second target label for each negative sample challenge image in the M-class negative sample challenge image.
The first target label is one of M types of negative sample labels corresponding to the M types of negative sample images, and the second target label is used for representing that the negative sample countermeasure image contains real object living body information.
It will be appreciated that a first target label is set for each positive sample countermeasure image and a second target label is set for each negative sample countermeasure image.
In a specific embodiment of the present application, referring to Table 2, the label of the real face is 0, the label of the whole paper sheet type is 1, the label of the paper cut type is 2, the label of the head model mask type is 3, and the label of the high-definition flip type is 4. Any one of labels 1 to 4 is randomly set for each positive sample countermeasure image, generating the first target label of each positive sample countermeasure image; label 0 is set for each negative sample countermeasure image, generating the second target label of each negative sample countermeasure image.
TABLE 2
Image type | Real label | Target label (attack label)
Real face | 0 | 1 to 4
Whole paper sheet | 1 | 0
Paper cut | 2 | 0
Head model mask | 3 | 0
High-definition flip | 4 | 0
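As an illustration of the label assignment in Table 2 (the function name and the use of Python's random module are assumptions of the example), the target label can be derived from the real label as follows.

import random

def assign_target_label(real_label: int) -> int:
    # Real face (real label 0) is attacked toward a randomly chosen negative class 1-4;
    # every negative class (real labels 1-4) is attacked toward the real-face class 0.
    if real_label == 0:
        return random.randint(1, 4)
    return 0

# e.g. assign_target_label(0) -> one of 1, 2, 3, 4; assign_target_label(3) -> 0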
And S150, training the living body detection pre-training model according to the positive sample countermeasure image carrying the first target label and the M classes of negative sample countermeasure images carrying the second target label, so as to optimize parameters of the living body detection pre-training model and generate a trained living body detection model.
It is understood that step S150 is a countermeasure training step. The living body detection pre-training model is trained with the positive sample countermeasure image carrying the first target label and the M classes of negative sample countermeasure images carrying the second target label, so as to optimize the parameters of the living body detection pre-training model and generate a trained living body detection model.
According to the data processing method provided by the embodiment of the application, the living body detection pre-training model is first obtained by training with the positive sample image and the M classes of negative sample images, so that the generated living body detection pre-training model can determine whether an image contains real object living body information; then, sample countermeasure images are generated from the positive sample image and the M classes of negative sample images and given corresponding labels; and then the living body detection pre-training model is trained again according to the sample countermeasure images and their labels. In this way, data enhancement is performed from the countermeasure-attack angle, and the living body detection pre-training model is optimized with the enhanced training data, so that the trained living body detection model can accurately detect a living body of a real object and the defense capability is improved.
In an alternative embodiment of the data processing method provided in the corresponding embodiment of fig. 2, referring to fig. 3, step S150 further includes sub-steps S151 to S154. Specific:
s151, inputting the positive sample countermeasure image carrying the first target label into the living body detection pre-training model, and outputting, through the living body detection pre-training model, a first probability value and M second probability values corresponding to the positive sample countermeasure image.
Wherein the first probability value is used for representing the possibility that the positive sample countermeasure image contains real object living body information, and the M second probability values are used for representing the possibility that the positive sample countermeasure image corresponds to each of the M negative sample categories.
It will be appreciated that the positive sample challenge image is input to the live pre-training model, which outputs m+1 probability values that characterize the likelihood that the positive sample challenge image belongs to either the positive sample or M negative sample categories.
S152, inputting M negative sample countermeasure images carrying second target labels into a living body detection pre-training model, and outputting a third probability value and M fourth probability values corresponding to each negative sample countermeasure image through the living body detection pre-training model.
Wherein the third predicted probability value characterizes a likelihood that the negative-sample challenge image contains real-object living information, and the M fourth probability values are used to characterize a likelihood that the negative-sample challenge image corresponds to each of the M negative-sample categories.
It will be appreciated that the negative-sample challenge image is input to the live pre-training model, which outputs m+1 probability values that characterize the likelihood that the negative-sample challenge image belongs to the positive-sample or M negative-sample categories.
S153, calculating a loss function according to the first probability value and the M second probability values corresponding to the positive sample countermeasure images, and the third probability value and the M fourth probability values corresponding to each negative sample countermeasure image.
It will be appreciated that the loss function is calculated from the probability values output by the in vivo pre-training model. In a particular embodiment, for a positive sample challenge image, a probability value for the positive sample challenge image corresponding to a real person tag (tag 0) and a probability value for a first target tag corresponding to the positive sample challenge image are determined.
For example, as shown in Table 3, if the positive sample label (real label) corresponding to a positive sample countermeasure image is 0 and the corresponding first target label (attack label) is 1, the first probability value corresponding to label 0 and the second probability value corresponding to label 1 are determined; the second probability value corresponding to the attack label is negated and added to the first probability value corresponding to the real label, so as to obtain the loss value of the positive sample countermeasure image, i.e., L = 0.9874 + (-0.0035) = 0.9839.
TABLE 3
For another example, as shown in Table 4, if the negative sample label (real label) corresponding to a negative sample countermeasure image is 3 and the corresponding second target label (attack label) is 0, the third probability value corresponding to label 3 and the fourth probability value corresponding to label 0 are determined; the fourth probability value corresponding to the attack label is negated and added to the third probability value corresponding to the real label, so as to obtain the loss value of the negative sample countermeasure image, i.e., L = 0.9567 + (-0.0267) = 0.93.
TABLE 4
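A minimal sketch of the per-image loss value worked out above, assuming the model outputs a five-element probability vector indexed by class label; the numeric comments reproduce the examples of Tables 3 and 4.

def countermeasure_loss_value(probs, real_label: int, target_label: int) -> float:
    # probs: length-5 probability vector output by the living body detection
    # pre-training model for one sample countermeasure image.
    # The probability of the target (attack) label is negated and added to the
    # probability of the real label, as in the worked examples above.
    return probs[real_label] - probs[target_label]

# Positive sample countermeasure image (Table 3): probs[0] = 0.9874, probs[1] = 0.0035
#   -> loss = 0.9874 + (-0.0035) = 0.9839
# Negative sample countermeasure image (Table 4): probs[3] = 0.9567, probs[0] = 0.0267
#   -> loss = 0.9567 + (-0.0267) = 0.93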
And S154, training the living body detection pre-training model according to the loss function to optimize parameters of the living body detection pre-training model and generating a trained living body detection model.
It will be appreciated that the living body detection pre-training model is trained according to the loss function to optimize the parameters of the living body detection pre-training model, resulting in a trained living body detection model.
According to the method provided by the embodiment of the application, the total loss function is calculated from the probability values of each countermeasure sample image for each category, and then the living body detection pre-training model is trained with the loss function so as to optimize the parameters of the living body detection pre-training model and generate a trained living body detection model, so that the trained living body detection model can accurately detect a living body of a real object and the defensive capability is improved.
In an alternative embodiment of the data processing method provided in the corresponding embodiment of fig. 3 of the present application, referring to fig. 4, the substep S153 further includes substeps S1531 to S1533. Specific:
s1531, determining the second probability value of the corresponding negative sample category of the positive sample countermeasure image as the first target probability value from the M second probability values.
It is understood that, for each positive sample countermeasure image, the probability value corresponding to the category of the first target label (attack label) of the positive sample countermeasure image is determined, from the M second probability values corresponding to the positive sample countermeasure image, as the first target probability value.
S1532, determining the largest probability value in the probability values from the M fourth probability values corresponding to each negative sample countermeasure image as the second target probability value.
It is understood that, for each negative sample countermeasure image, the probability value corresponding to the category of its negative sample label (real label), which for a correctly classified negative sample countermeasure image is the largest of its M fourth probability values, is determined as the second target probability value.
S1533, calculating a loss function according to the first probability value, the first target probability value, the second target probability value and the third probability value corresponding to each negative sample countermeasure image.
It will be appreciated that a loss value is calculated for each sample countermeasure image, and the loss function is obtained from the loss values of all the sample countermeasure images.
According to the method provided by the embodiment of the application, the loss function of the positive sample countermeasure image is calculated through the probability value of the real human face corresponding to the positive sample countermeasure image and the probability value of the class corresponding to the first target label (attack label), the loss function of the negative sample countermeasure image is calculated through the probability value of the real human face corresponding to the negative sample countermeasure image and the probability value of the class corresponding to the negative sample label (real label), the total loss function is generated, the living body detection pre-training model is trained to optimize the parameters of the living body detection pre-training model, and the trained living body detection model is generated, so that the living body detection of the real object can be accurately detected by the trained living body detection model, and the defending capability is improved.
In an alternative embodiment of the data processing method provided in the corresponding embodiment of fig. 2, referring to fig. 5, the substep S154 further includes substeps S1541 to S1545. Specific:
s1541, calculating a gradient value of the positive sample countermeasure image from the loss function, and generating a positive sample countermeasure image gradient value.
S1542, calculating the gradient value of the M-class negative sample countermeasure image according to the loss function, and generating the gradient value of the M-class negative sample countermeasure image.
It will be appreciated that the gradient value of each type of sample challenge image is calculated from the corresponding loss function of that type of sample challenge image.
S1543, performing gradient processing on the positive sample countermeasure image according to the positive sample countermeasure image gradient value, and generating a positive sample countermeasure composite image.
S1544, performing gradient processing on the corresponding negative sample countermeasure image according to each negative sample countermeasure image gradient value in the M classes of negative sample countermeasure image gradient values, and generating M classes of negative sample countermeasure composite images.
It will be appreciated that the sample challenge image is gradient processed according to the gradient value of each sample challenge image to generate a sample challenge composite image. The sample challenge composite image (also called challenge hidden variable) is represented by the following formula:
Z_T = Z_{T-1} − α · ∇_{Z_{T-1}} Loss
wherein Z_T represents the sample countermeasure composite image at gradient iteration T, T represents the gradient iteration index, Z_{T-1} represents the sample countermeasure composite image at gradient iteration T−1, α is a step-size coefficient, and ∇_{Z_{T-1}} Loss represents the gradient of the loss value of the sample countermeasure composite image.
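As a sketch only, one gradient-processing iteration on the countermeasure hidden variable under the update rule above could be written as follows; the use of PyTorch autograd and the loss_fn callable are assumptions of the example.

import torch

def gradient_step(z: torch.Tensor, loss_fn, alpha: float = 0.01) -> torch.Tensor:
    # z: sample countermeasure composite image (hidden variable) Z_{T-1},
    #    with requires_grad enabled.
    # loss_fn(z): scalar loss value of the sample countermeasure composite image.
    loss = loss_fn(z)
    grad, = torch.autograd.grad(loss, z)
    # Z_T = Z_{T-1} - alpha * gradient of the loss with respect to Z_{T-1}.
    return (z - alpha * grad).detach().requires_grad_(True)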
S1545, training the living body detection pre-training model according to the positive sample countermeasure synthetic image and the M classes of negative sample countermeasure synthetic images, so as to optimize parameters of the living body detection pre-training model and generate a trained living body detection model.
It will be appreciated that the living body detection pre-training model is trained on the positive sample countermeasure synthetic image and the negative sample countermeasure synthetic images to optimize the parameters of the living body detection pre-training model and generate a trained living body detection model.
According to the method provided by the embodiment of the application, the positive sample countermeasure synthetic image corresponding to the positive sample countermeasure image and the negative sample countermeasure synthetic images corresponding to the negative sample countermeasure images are generated by means of gradient processing, and the living body detection pre-training model is trained with the positive sample countermeasure synthetic image and the negative sample countermeasure synthetic images so as to optimize the parameters of the living body detection pre-training model, so that the trained living body detection model can accurately detect a living body of a real object and the defensive capability is improved.
In an alternative embodiment of the data processing method provided in the corresponding embodiment of fig. 5 of the present application, referring to fig. 6, the substep S1545 further includes substeps S15451 to S15453. Specific:
s15451, inputting the positive sample challenge composite image into the living body detection pre-training model, and predicting by the living body detection pre-training model to generate a first prediction result of whether the positive sample challenge composite image contains living body information of the real object.
It can be understood that the positive sample countermeasure composite image is taken as the input of the living body detection pre-training model, and the living body detection pre-training model predicts whether the positive sample countermeasure composite image contains living body information of a real object, so as to obtain the first prediction result.
S15452, inputting the M classes of negative sample countermeasure composite images into the living body detection pre-training model, and predicting, through the living body detection pre-training model, M classes of second prediction results of whether the M classes of negative sample countermeasure composite images contain living body information of a real object.
It can be understood that each negative sample countermeasure composite image is taken as the input of the living body detection pre-training model, and the living body detection pre-training model predicts whether the negative sample countermeasure composite image contains living body information of a real object, so as to obtain the second prediction result.
S15453, training the living body detection pre-training model according to the first prediction result, the first target label, the M-class second prediction result and the M-class second target label to optimize parameters of the living body detection pre-training model and generate a trained living body detection model.
It may be understood that, according to the first prediction result and the first target label corresponding to each positive sample countermeasure composite image, the prediction deviation of the living body detection pre-training model for the positive sample countermeasure composite image may be determined; according to the second prediction result and the second target label corresponding to each negative sample countermeasure composite image, the prediction deviation of the living body detection pre-training model for the negative sample countermeasure composite image may be determined; and the living body detection pre-training model is further trained by means of contrast learning, so as to optimize the parameters of the living body detection pre-training model and generate a trained living body detection model.
According to the method provided by the embodiment of the application, the positive sample countermeasure composite image and the negative sample countermeasure composite images are predicted by the living body detection pre-training model to obtain the first prediction result corresponding to the positive sample countermeasure composite image and the second prediction results corresponding to the negative sample countermeasure composite images respectively, and then the living body detection pre-training model is trained according to the prediction results and the target labels so as to optimize the parameters of the living body detection pre-training model and generate a trained living body detection model, so that the trained living body detection model can accurately detect a living body of a real object and the defensive capability is improved.
In an alternative embodiment of the data processing method provided in the corresponding embodiment of fig. 6, referring to fig. 7, the substep S15451 includes substeps S54511 to S54512. Specific:
s54511, inputting the positive sample countermeasure synthetic image into a living body detection pre-training model, and generating a first prediction probability that the positive sample countermeasure synthetic image contains living body information of a real object and M second prediction probabilities that the positive sample countermeasure synthetic image corresponds to M negative sample categories through a classification module in the living body detection pre-training model.
It will be appreciated that the positive sample challenge composite image is taken as input to the in vivo detection pre-training model, and the predictive probability of the positive sample challenge composite image for each class is generated by the classification module in the in vivo detection pre-training model.
S54512, determining the maximum value of the first prediction probability and the M second prediction probabilities as a first prediction result.
It will be appreciated that the class with the largest probability among the prediction probabilities of the positive sample countermeasure synthetic image for each class is the result of the living body detection pre-training model predicting the positive sample countermeasure synthetic image. Since the living body detection pre-training model can accurately predict whether an image contains living body information of a real object, the result of predicting the positive sample countermeasure synthetic image should be the result of containing real object living body information, that is, the maximum probability among the prediction probabilities for each class is the first prediction probability corresponding to the real human face.
For example, if the real label of the positive sample countermeasure image corresponding to the positive sample countermeasure synthetic image is 0 and the attack label of the corresponding positive sample countermeasure image is 1, then when the positive sample countermeasure synthetic image is predicted by the living body detection pre-training model, the probability value of the real human face is the largest of all probability values.
According to the method provided by the embodiment of the application, the classification module in the living body detection pre-training model predicts the probability value of the positive sample countermeasure synthetic image for each class, and the class corresponding to the maximum probability value among all probability values is taken as the prediction result, so that the accuracy of the output result of the living body detection pre-training model is improved.
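A small illustrative sketch of this argmax selection; the example probability vector is hypothetical and merely chosen so that the real-person class (index 0) is the largest while the attack class just exceeds 0.2.

import torch

def first_prediction_result(probs: torch.Tensor) -> int:
    # probs: the (M+1)-element prediction probabilities for one positive sample
    # countermeasure synthetic image; the class with the largest probability is
    # taken as the prediction result.
    return int(torch.argmax(probs).item())

# e.g. first_prediction_result(torch.tensor([0.62, 0.21, 0.07, 0.06, 0.04])) -> 0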
In an alternative embodiment of the data processing method provided in the corresponding embodiment of fig. 6, referring to fig. 8, the substep S15452 includes substeps S54521 to S54522. Specific:
s54521, inputting each type of negative-sample countermeasure synthetic image in the M types of negative-sample countermeasure synthetic images into the living body detection pre-training model, and generating a third prediction probability that each type of negative-sample countermeasure synthetic image in the M types of negative-sample countermeasure synthetic images contains living body information of a real object and M fourth prediction probabilities that each type of negative-sample countermeasure synthetic image corresponds to M negative-sample categories through a classification module in the living body detection pre-training model.
It will be appreciated that each negative-working challenge composite image is taken as input to the in-vivo detection pre-training model, and the predictive probability of each negative-working challenge composite image corresponding to each class is generated by the classification module in the in-vivo detection pre-training model.
S54522, determining the maximum value of the third prediction probability and the M fourth prediction probabilities corresponding to each class of negative sample countermeasure synthetic images in the M classes of negative sample countermeasure synthetic images as the M classes of second prediction results of whether the M classes of negative sample countermeasure synthetic images contain real object living body information.
It will be appreciated that the class with the largest probability among the prediction probabilities of a negative sample countermeasure synthetic image for each class is the result of the living body detection pre-training model predicting that negative sample countermeasure synthetic image. Since the living body detection pre-training model can accurately predict whether an image contains living body information of a real object, the result of predicting the negative sample countermeasure synthetic image should be the type indicated by the label of the corresponding negative sample image, that is, the maximum probability among the prediction probabilities for each class is the fourth prediction probability corresponding to that negative sample category.
For example, if the negative sample label (real label) of the negative sample countermeasure image corresponding to a negative sample countermeasure synthetic image is 3 and the second target label (attack label) is 0, then when the negative sample countermeasure synthetic image is predicted by the living body detection pre-training model, the probability value corresponding to the head model mask should be the largest of all probability values.
According to the method provided by the embodiment of the application, the classification module in the living body detection pre-training model predicts the probability value of the negative sample countermeasure synthetic image for each class, and the class corresponding to the maximum probability value among all probability values is taken as the prediction result, so that the accuracy of the output result of the living body detection pre-training model is improved.
In an alternative embodiment of the data processing method provided in the corresponding embodiment of fig. 2, referring to fig. 9, step S130 includes sub-steps S1301 to S1309. Specific:
s1301, inputting the positive sample image into a diffusion model, and encoding the positive sample image through an image encoding network in the diffusion model to generate positive sample image vector data.
It will be appreciated that the positive sample image is encoded by the image encoding network in the diffusion model with the positive sample image as input to the diffusion model, generating positive sample image vector data. The image encoding network may be an encoder. Encoding an image refers to converting image information into corresponding vector information.
S1303, inputting the positive sample image vector data to a diffusion network in a diffusion model, and adding interference data to the positive sample image vector data through the diffusion network to generate positive sample interference vector data.
It will be appreciated that the positive sample image vector data is processed through a diffusion network, and the interference vector is added to the positive sample image vector data, or a part of the vectors in the positive sample image vector data are replaced in a vector replacement manner, so as to attack the positive sample image, and generate the positive sample interference vector data.
S1305, inputting the positive sample interference vector data to a feature extraction network in the diffusion model, and carrying out feature extraction on the positive sample interference vector data through the feature extraction network to generate a positive sample interference feature vector.
It may be appreciated that feature extraction is performed on the positive sample interference vector data, specifically, by comparing the positive sample interference vector data with the positive sample image vector data, so as to determine the interference information and generate the positive sample interference feature vector.
S1307, inputting the positive sample interference feature vector into a noise reduction network in the diffusion model, and carrying out noise reduction on the positive sample interference feature vector through the noise reduction network to generate positive sample noise reduction vector data.
It will be appreciated that the positive sample interference feature vector is noise reduced by the noise reduction network to reduce interference and generate positive sample noise reduction vector data.
S1309, the positive sample noise reduction vector data is input to the image decoding network in the diffusion model, and a positive sample countermeasure image is generated.
It will be appreciated that the image decoding network may be a decoder, by which the vector data is generated into a corresponding image, i.e. the positive sample noise reduction vector data is converted into an image, resulting in a positive sample challenge image.
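For illustration only, sub-steps S1301 to S1309 can be organized as the pipeline below; the encoder, diffusion, feature-extraction, noise-reduction and decoder modules are placeholders standing in for the corresponding networks of the diffusion model rather than concrete implementations.

import torch
import torch.nn as nn

class DiffusionCountermeasureGenerator(nn.Module):
    def __init__(self, encoder: nn.Module, diffusion: nn.Module,
                 feature_extractor: nn.Module, denoiser: nn.Module,
                 decoder: nn.Module):
        super().__init__()
        self.encoder = encoder                      # image encoding network
        self.diffusion = diffusion                  # diffusion network (adds interference)
        self.feature_extractor = feature_extractor  # feature extraction network
        self.denoiser = denoiser                    # noise reduction network
        self.decoder = decoder                      # image decoding network

    def forward(self, positive_image: torch.Tensor) -> torch.Tensor:
        z = self.encoder(positive_image)            # S1301: positive sample image vector data
        z_pert = self.diffusion(z)                  # S1303: positive sample interference vector data
        feats = self.feature_extractor(z_pert)      # S1305: positive sample interference feature vector
        z_denoised = self.denoiser(feats)           # S1307: positive sample noise reduction vector data
        return self.decoder(z_denoised)             # S1309: positive sample countermeasure image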
According to the data processing method provided by the embodiment of the application, the positive sample is processed through the diffusion model, the positive sample countermeasure image is generated, the data enhancement is performed from the viewpoint of countermeasure and defense, the diversity of training data is expanded, and the model is trained through the positive sample countermeasure data, so that the defensive capability and generalization of the living body detection model are further improved.
In an alternative embodiment of the data processing method provided in the corresponding embodiment of fig. 2, referring to fig. 10, step S130 includes sub-steps S1300 to S1308. Specific:
s1300, inputting each class of negative sample images in the M classes of negative sample images into the diffusion model, and encoding each class of negative sample images through the image encoding network in the diffusion model to generate negative sample image vector data corresponding to each class of negative sample images in the M classes of negative sample images.
It will be appreciated that each negative image is taken as input to the diffusion model, and the negative image is encoded by the image encoding network in the diffusion model to generate negative image vector data. The image encoding network may be an encoder. Encoding an image refers to converting image information into corresponding vector information.
S1302, inputting the negative sample image vector data corresponding to each class of negative sample images in the M classes of negative sample images to the diffusion network in the diffusion model, adding interference data to the negative sample image vector data through the diffusion network, and generating negative sample interference vector data corresponding to each class of negative sample images in the M classes of negative sample images.
It will be appreciated that the negative sample image vector data is processed through a diffusion network, and the interference vector is added to the negative sample image vector data, or a part of the vectors in the negative sample image vector data are replaced in a vector replacement manner, so as to attack the negative sample image, and generate the negative sample interference vector data.
S1304, inputting the negative sample interference vector data corresponding to each class of negative sample images in the M classes of negative sample images to the feature extraction network in the diffusion model, and carrying out feature extraction on the negative sample interference vector data through the feature extraction network to generate negative sample interference feature vectors corresponding to each class of negative sample images in the M classes of negative sample images.
It can be understood that feature extraction is performed on the negative sample interference vector data, specifically, by comparing the negative sample interference vector data with the negative sample image vector data, so as to determine the interference information and generate the negative sample interference feature vector.
S1306, inputting the negative sample interference feature vectors corresponding to each class of negative sample images in the M classes of negative sample images into the noise reduction network in the diffusion model, and carrying out noise reduction on the negative sample interference feature vectors through the noise reduction network to generate negative sample noise reduction vector data corresponding to each class of negative sample images in the M classes of negative sample images.
It can be appreciated that the negative-sample interference feature vector is noise reduced by the noise reduction network to reduce interference and generate negative-sample noise reduction vector data.
S1308, inputting the negative sample noise reduction vector data corresponding to each class of negative sample images in the M classes of negative sample images into the image decoding network in the diffusion model, and generating the negative sample countermeasure images corresponding to each class of negative sample images in the M classes of negative sample images.
It will be appreciated that the image decoding network may be a decoder, by which the vector data is generated into a corresponding image, i.e. the negative-sample noise-reduction vector data is converted into an image, resulting in a negative-sample countermeasure image.
According to the data processing method provided by the embodiment of the application, the negative sample is processed through the diffusion model to generate the negative sample countermeasure image, the data enhancement is performed from the viewpoint of countermeasure and defense, the diversity of training data is expanded, and the model is trained through the negative sample countermeasure data, so that the defensive capability and generalization of the living body detection model are further improved.
Screening countermeasure samples merely to improve the robustness of the living body detection model cannot, by itself, serve as a suitable data enhancement means for improving the accuracy of the living body detection model. Because the countermeasure sample images generated in this way lie entirely within the distribution of the target classes, using them for training makes the classification boundary surface of the living body detection model jagged; although this improves the robustness of the living body detection model to countermeasure data, data near the boundary surface with relatively low confidence is also misclassified. This compromises the accuracy of the living body detection model.
Through the above analysis, the objective to be achieved is to improve the accuracy and generalization of the living body detection model while improving the robustness, so in the embodiment of the present application the countermeasure samples are screened by adding a boundary constraint. In other words, the countermeasure sample image used for data enhancement should not cross the boundary too far. Specific: In an alternative embodiment of the data processing method provided in the corresponding embodiment of fig. 2, referring to fig. 11, step S150 further includes a sub-step S1501. Specific:
S1501, training the living body detection pre-training model according to the positive sample countermeasure image carrying the first target label and the M classes of negative sample countermeasure images carrying the second target label, so as to optimize the learning rate and the iteration step length of the living body detection pre-training model and generate a trained living body detection model.
It can be understood that, by further controlling the learning rate and the iteration step length of the model, the attack iteration can be stopped as soon as the probability value represented by the element where the target label is located just exceeds 0.2, and the resulting countermeasure sample can be used as a reasonable countermeasure data enhancement.
In the method provided by the embodiment of the application, in the attack process, the probability value represented by the element where the target label is positioned is just more than 0.2 by controlling the learning rate and the iteration step length, so that the attack iteration can be stopped and used as a reasonable countermeasure data enhancement. The challenge data thus obtained will be in the vicinity of the classification boundary surface, rather than in the distribution of the target classes, and the model classification boundary surface trained by such challenge data will be softer but not jagged. Therefore, the defects of training data can be supplemented on the basis of the original precision of the model, the defending capability of the model is further improved, and the safety of the model is improved.
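A minimal sketch of the boundary-constrained iteration described above: the attack stops as soon as the probability of the element where the target label is located just exceeds 0.2, instead of pushing it toward 1. The step size, iteration cap and the model_probs/loss_fn callables are assumptions of the example.

import torch

def boundary_constrained_attack(z: torch.Tensor, model_probs, loss_fn,
                                target_label: int, alpha: float = 0.01,
                                max_iters: int = 100, threshold: float = 0.2):
    # z: countermeasure hidden variable with requires_grad enabled.
    # model_probs(z): five-element probability vector for the image decoded from z.
    # loss_fn(z): scalar countermeasure loss of the current composite image.
    for _ in range(max_iters):
        if model_probs(z)[target_label] > threshold:
            break                                  # just past the boundary: stop the attack
        loss = loss_fn(z)
        grad, = torch.autograd.grad(loss, z)
        z = (z - alpha * grad).detach().requires_grad_(True)
    return z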
In an alternative embodiment of the data processing method provided in the corresponding embodiment of fig. 2, referring to fig. 12, the data processing method further includes steps S210 to S220. It should be noted that, steps S210 to S220 are the process of predicting the trained model. Specific:
s210, acquiring a target image.
S220, inputting the target image into the trained living body detection model, and outputting, through the trained living body detection model, a target prediction result of whether the target image contains living body information of a real object.
It can be understood that according to the method provided by the embodiment of the application, the trained living body detection model is utilized to predict the living body information of the real object on the target image, so that the accurate detection of the living body of the real object is improved, and the defensive power of the model is improved.
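As an illustrative sketch only, the prediction stage of steps S210 to S220 might look as follows; the preprocessing of the target image into a 3x224x224 tensor and the simple argmax decision are assumptions of the example.

import torch

@torch.no_grad()
def predict_liveness(model, target_image: torch.Tensor) -> dict:
    # target_image: preprocessed 3x224x224 tensor of the image to be checked.
    model.eval()
    probs = torch.softmax(model(target_image.unsqueeze(0)), dim=1)[0]
    predicted_class = int(torch.argmax(probs).item())
    return {
        "is_live": predicted_class == 0,  # class 0: contains real object living body information
        "class": predicted_class,         # classes 1-4: the four attack categories
        "probabilities": probs.tolist(),
    }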
For ease of understanding, a data processing method applied to in vivo detection model training will be described below with reference to fig. 13, including four processes: data preparation and preprocessing, in vivo detection model pre-training, generating boundary-constrained challenge sample data, and diffusion model-based challenge sample synthesis.
First, the data preparation and preprocessing process will be described in detail: the data required for model training includes a number of positive sample images and a number of negative sample images. The positive sample image contains real object living body information, for example, a real human face photo is used as a positive sample image; the negative sample image does not contain real object living body information. So that the living body detection model can learn more negative example data, as many categories of negative sample data as possible are required; for example, face attack pictures are taken as negative sample images, and the negative sample categories can include whole paper sheet (a negative sample image obtained by printing a real face photo), paper cut (a negative sample image obtained by combining or splicing partial features cut out of a real face photo with a real face), head model mask (a negative sample image obtained by photographing a head model or mask made to imitate a real face), high-definition flip (a negative sample image obtained by re-shooting a real face photo in high definition), and the like.
All sample images are preprocessed; the preprocessing is implemented as follows (a sketch is given after this list):
1) Determining the five-point coordinates of the face by using a face detection model;
2) Obtaining a circumscribed rectangle from the five-point coordinates, and expanding the circumscribed rectangle by 2.226 times to obtain a face detection frame of one face size;
3) Expanding the one-times face detection frame to twice its size, cropping the face image with the two-times face detection frame, and resizing the cropped face image to 256×256×3;
4) Finally, randomly cropping the 256×256×3 image to obtain a standard sample image with a final size of 224×224×3.
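Only as an outline of the four preprocessing steps above, assuming a landmark detector that returns five (x, y) key points and using OpenCV/NumPy as stand-ins; the helper name and the clipping logic are assumptions of the example.

import random
import numpy as np
import cv2

def preprocess_face(image: np.ndarray, landmarks: np.ndarray) -> np.ndarray:
    # landmarks: (5, 2) array with the five facial key points (step 1).
    x_min, y_min = landmarks.min(axis=0)
    x_max, y_max = landmarks.max(axis=0)                        # circumscribed rectangle (step 2)
    cx, cy = (x_min + x_max) / 2.0, (y_min + y_max) / 2.0
    w = (x_max - x_min) * 2.226 * 2                             # expand 2.226x, then double (steps 2-3)
    h = (y_max - y_min) * 2.226 * 2
    x0, y0 = int(max(cx - w / 2, 0)), int(max(cy - h / 2, 0))
    x1 = int(min(cx + w / 2, image.shape[1]))
    y1 = int(min(cy + h / 2, image.shape[0]))
    face = cv2.resize(image[y0:y1, x0:x1], (256, 256))          # crop and resize to 256x256x3 (step 3)
    top, left = random.randint(0, 32), random.randint(0, 32)    # random crop offsets (step 4)
    return face[top:top + 224, left:left + 224]                 # 224x224x3 standard sample image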
Next, a detailed explanation is given of the in-vivo detection model pre-training process: referring to fig. 14, fig. 14 is a schematic diagram illustrating a pre-training phase of a living body detection model according to an embodiment of the present application. After the standard sample images of 224×224×3 RGB colors are acquired, a real label is added to all the sample images.
For example, the label of a positive sample image (a photo of a human face) is set to 0, the label of a negative sample image of the whole sheet type is set to 1, the label of a negative sample image of the sheet cut type is set to 2, the label of a negative sample image of the head mold face type is set to 3, and the label of a negative sample image of the high definition flip type is set to 4.
Then a living body detection model M (initial model) is pre-trained according to the sample images carrying labels, wherein ResNet-18 can be selected as the backbone of the model; other network models may also be selected as the backbone, for example, to ensure the timeliness of forward inference, NAS or the like may be used to search for a network model with fewer parameters as the backbone.
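Purely as an example of such a backbone choice (torchvision's ResNet-18 constructor is used here as one concrete stand-in for the backbone; any other suitable network could replace it):

import torch.nn as nn
from torchvision.models import resnet18

def build_liveness_model(num_classes: int = 5) -> nn.Module:
    # ResNet-18 backbone with a five-way classification head:
    # 0 = real person, 1-4 = the four attack categories.
    return resnet18(num_classes=num_classes)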
The pre-training process is performed according to the classification of the sample image, that is, the model is trained according to the above five classifications, and when epoch=15, the training is stopped, so as to obtain a living body detection pre-training model M'. After the pre-training is finished, the living body detection pre-training model M' can accurately judge whether the image contains living body information of a real object or not, and can accurately classify the attack image in four attack categories, namely, after any image is input into the living body detection pre-training model, the living body detection pre-training model outputs five probability values corresponding to the five categories, wherein the probability that the input image belongs to the categories from 0 to 4 is respectively represented, and the sum of the five probability values is 1.
Next, the generation of the boundary-constrained countermeasure sample data will be described in detail: referring to fig. 15, fig. 15 is a schematic diagram of the stage of generating boundary-constrained countermeasure samples according to an embodiment of the present application. After the living body detection pre-training model M' is obtained, the sample images need to be sampled. For example, assuming that each category has 10,000 sample images, each category is randomly sampled at a proportion of 5%, so that 500 images per category, i.e., 2,500 sample images across the 5 categories, are obtained. Next, a sample countermeasure image is generated for each of the 2,500 sample images. Then, target labels are generated for the 2,500 sample countermeasure images, specifically: each of the 500 positive sample countermeasure images containing real object living body information is randomly given the label of one negative sample class, i.e., any integer from 1 to 4; all of the remaining 2,000 negative sample countermeasure images are given label 0, i.e., the label of the real person. This attack label is called the target label; it differs from the real label, and the subsequent countermeasure attack aims to move the sample from its real label toward its target label.
Then, the 2,500 sample countermeasure images are respectively input into the living body detection pre-training model M' for prediction, and each sample countermeasure image yields 5 probability values. Because the living body detection pre-training model M' has been pre-trained, it basically outputs the correct probability distribution for the 2,500 sample countermeasure images: for a positive sample countermeasure image, the probability value of the class with label 0 among the five output probabilities is infinitely close to 1 and the probability values of the classes corresponding to the other four labels are infinitely close to 0; for a negative sample countermeasure image, the probability value of the class corresponding to one of the four attack-category labels is infinitely close to 1. After this probability vector is obtained, the countermeasure attack can be started.
Finally, the challenge sample synthesis based on the diffusion model is explained in detail: referring to fig. 16, fig. 16 is a block diagram of a diffusion model according to an embodiment of the present application. It should be noted that the diffusion model may be replaced by other generative methods, such as a flow model. In the embodiment of the application, a diffusion model-based attack countermeasure algorithm is adopted to conduct directional attack. The basic procedure for challenge sample synthesis is as follows:
1) And inputting each sample image into a classification module Classifier M' of the pre-training model, and outputting a corresponding probability vector.
2) The largest element of the 5-dimensional probability vector corresponding to each sample image is taken as Loss1.
3) And taking the probability value of the element corresponding to the target label in the probability vector corresponding to each sample image as loss2.
4) Calculating the total loss value Loss_adv of each class of sample images, Loss_adv = Loss1 + Loss2, and then calculating the gradient value of that class of sample images from the total loss value of each class of sample images.
5) And inputting each type of sample image into a diffusion model, and performing diffusion inversion calculation to obtain hidden variable values (sample countermeasure images) corresponding to each type of sample image.
6) Generating a countermeasure hidden variable (sample countermeasure composite image) from the gradient value of the sample image and the sample countermeasure image, where the sample countermeasure composite image (also referred to as the countermeasure hidden variable) is expressed by the following formula:
Z_T = Z_{T-1} − α · ∇_{Z_{T-1}} Loss_adv
wherein Z_T represents the sample countermeasure composite image at gradient iteration T, T represents the gradient iteration index, Z_{T-1} represents the sample countermeasure composite image at gradient iteration T−1, α is a step-size coefficient, and ∇_{Z_{T-1}} Loss_adv represents the gradient of the loss value of the sample countermeasure composite image.
In general, to achieve a completely successful attack, both α and the number of attack iterations are made large enough to ensure that the countermeasure sample image crosses the classification boundary surface of the model, thereby becoming a sample of the target class. However, in the embodiment of the present application, although screening countermeasure samples in this way improves the robustness of the model, it cannot be used as a suitable data enhancement means to improve the accuracy of the model. Because the countermeasure sample images thus generated lie entirely within the distribution of the target classes, training with them drives the classification boundary surface of the model into the form of fig. 17: fig. 17 (a) shows the classification boundary surface of a living body detection model trained using clean data, and fig. 17 (b) shows the classification boundary surface of a living body detection model trained using mixed countermeasure data. As is clear from fig. 17 (a) and 17 (b), the living body detection model serrates its classification boundary surface to better fit the countermeasure samples; although this gives the model improved robustness to countermeasure data, some data close to the boundary surface with relatively low confidence is also misclassified. This compromises the accuracy of the model.
Through the above analysis, the objective to be achieved should be to improve the accuracy and generalization of the model while improving the robustness, so in the embodiment of the present application the countermeasure samples are screened by adding a boundary constraint. In other words, the countermeasure sample image used for data enhancement should not cross the boundary too far. The living body detection model adopted in the embodiment of the application outputs a probability vector of length 5 whose five element values sum to 1; therefore, as long as the learning rate and the iteration step length are controlled during the attack so that the probability value represented by the element where the target label is located just exceeds 0.2, the attack iteration can be stopped and the sample can be used as a reasonable countermeasure data enhancement, instead of waiting for that value to approach 1. The countermeasure data thus obtained will lie in the vicinity of the classification boundary surface rather than within the distribution of the target classes, and the classification boundary surface of a model trained with such countermeasure data will be softer rather than jagged, as shown on the far right side of fig. 16. Therefore, the deficiencies of the training data can be supplemented on the basis of the original precision of the model, the defense capability of the model is further improved, and the safety of the model is improved.
Besides living body detection in face recognition, the method can also be applied to living body detection of other biometric features.
According to the method provided by the embodiment of the application, boundary-constrained countermeasure data enhancement is applied to the selected countermeasure sample data, so that the lack of heterogeneous data of certain categories can be well compensated, the defensive capability and generalization of the living body detection model are further improved, and the safety and robustness of the face recognition service are effectively ensured.
The data processing apparatus of the present application will be described in detail with reference to fig. 18. Fig. 18 is a schematic diagram of an embodiment of a data processing apparatus 10 according to an embodiment of the present application, where the data processing apparatus 10 includes: a sample image acquisition module 110, a model pre-training module 120, a sample challenge image generation module 130, a target tag setting module 140, and a model retraining module 150. Specific:
the sample image acquisition module 110 is configured to acquire a positive sample image and an M-class negative sample image.
The positive sample image contains real object living body information, the positive sample image carries a positive sample label, the M negative sample images correspond to M negative sample categories, the negative sample image does not contain real object living body information, the negative sample image carries a negative sample label, and M is more than or equal to 1.
The model pre-training module 120 is configured to train the biopsy model according to the positive sample image carrying the positive sample label and the M-class negative sample image carrying the negative sample label, and generate a biopsy pre-training model.
The sample countermeasure image generating module 130 is configured to input the positive sample image into the diffusion model, generate a positive sample countermeasure image, and input the M-class negative sample image into the diffusion model, generate an M-class negative sample countermeasure image.
The positive sample countermeasure image is an image generated by adding interference information to the positive sample image, and the negative sample countermeasure image is an image generated by adding interference information to the negative sample image.
The target label setting module 140 is configured to set a first target label for the positive sample challenge image and a second target label for each of the M-class negative sample challenge images.
The first target label is one of M types of negative sample labels corresponding to the M types of negative sample images, and the second target label is used for representing that the negative sample countermeasure image contains real object living body information.
The model retraining module 150 is configured to train the living body detection pre-training model according to the positive sample countermeasure image carrying the first target tag and the M classes of negative sample countermeasure images carrying the second target tag, so as to optimize parameters of the living body detection pre-training model and generate a trained living body detection model.
According to the data processing device provided by the embodiment of the application, the data enhancement is performed from the anti-attack angle, and the living body detection pre-training model is optimized through the training data after the data enhancement, so that the living body detection model after training can accurately detect the living body of a real object, and the defending capability is improved.
In an alternative embodiment of the data processing apparatus provided in the corresponding embodiment of fig. 18 of the present application, the model retraining module 150 is further configured to:
and inputting the positive sample countermeasures carrying the first target labels into the living body detection pre-training model, and outputting first probability values and M second probability values corresponding to the positive sample countermeasures through the living body detection pre-training model.
Wherein the first probability value is used for representing the possibility that the positive sample countermeasure image contains real object living body information, and the M second probability values are used for representing the possibility that the positive sample countermeasure image corresponds to each of the M negative sample categories.
And inputting the M negative sample countermeasure images carrying the second target labels into the living body detection pre-training model, and outputting a third probability value and M fourth probability values corresponding to each negative sample countermeasure image through the living body detection pre-training model.
Wherein the third predicted probability value characterizes a likelihood that the negative-sample challenge image contains real-object living information, and the M fourth probability values are used to characterize a likelihood that the negative-sample challenge image corresponds to each of the M negative-sample categories.
And calculating a loss function according to the first probability value and the M second probability values corresponding to the positive sample countermeasure images, and the third probability value and the M fourth probability values corresponding to each negative sample countermeasure image.
And training the living body detection pre-training model according to the loss function to optimize parameters of the living body detection pre-training model, and generating a trained living body detection model.
According to the device provided by the embodiment of the application, the total loss function is calculated from the probability values of each countermeasure sample image for each category, and then the living body detection pre-training model is trained with the loss function so as to optimize the parameters of the living body detection pre-training model and generate a trained living body detection model, so that the trained living body detection model can accurately detect a living body of a real object and the defensive capability is improved.
In an alternative embodiment of the data processing apparatus provided in the corresponding embodiment of fig. 18 of the present application, the model retraining module 150 is further configured to:
And determining the second probability value of the corresponding negative sample category of the positive sample countermeasure image from the M second probability values as the first target probability value.
And determining the largest probability value in the probability values from M fourth probability values corresponding to each negative sample countermeasure image as a second target probability value.
And calculating a loss function according to the first probability value, the first target probability value, the second target probability value corresponding to each negative sample countermeasure image and the third probability value.
According to the device provided by the embodiment of the application, the loss function of the positive sample countermeasure image is calculated through the probability value of the real human face corresponding to the positive sample countermeasure image and the probability value of the class corresponding to the first target label (attack label), the loss function of the negative sample countermeasure image is calculated through the probability value of the real human face corresponding to the negative sample countermeasure image and the probability value of the class corresponding to the negative sample label (real label), the total loss function is generated, the living body detection pre-training model is trained to optimize the parameters of the living body detection pre-training model, and the trained living body detection model is generated, so that the living body detection of the real object can be accurately detected by the trained living body detection model, and the defending capability is improved.
In an alternative embodiment of the data processing apparatus provided in the corresponding embodiment of fig. 18 of the present application, the model retraining module 150 is further configured to:
and calculating the gradient value of the positive sample countermeasure image according to the loss function, and generating the gradient value of the positive sample countermeasure image.
And calculating the gradient value of the M-class negative sample countermeasure image according to the loss function, and generating the gradient value of the M-class negative sample countermeasure image.
And carrying out gradient processing on the positive sample countermeasure image according to the positive sample countermeasure image gradient value, and generating a positive sample countermeasure composite image.
And carrying out gradient processing on the corresponding negative sample countermeasure image according to each negative sample countermeasure image gradient value in the M classes of negative sample countermeasure image gradient values, and generating M classes of negative sample countermeasure composite images.
And training the living body detection pre-training model according to the positive sample countermeasure composite image and the M classes of negative sample countermeasure composite images, so as to optimize parameters of the living body detection pre-training model and generate a trained living body detection model.
According to the device provided by the embodiment of the application, the positive sample countermeasure composite image corresponding to the positive sample countermeasure image and the negative sample countermeasure composite images corresponding to the negative sample countermeasure images are generated by means of gradient processing, and the living body detection pre-training model is trained with the positive sample countermeasure composite image and the negative sample countermeasure composite images so as to optimize the parameters of the living body detection pre-training model, so that the trained living body detection model can accurately detect a living body of a real object and the defensive capability is improved.
In an alternative embodiment of the data processing apparatus provided in the corresponding embodiment of fig. 18 of the present application, the model retraining module 150 is further configured to:
the positive sample countermeasure synthetic image is input into a living body detection pre-training model, and a first prediction result of whether the positive sample countermeasure synthetic image contains living body information of a real object is generated through the living body detection pre-training model prediction.
Inputting the M-class negative-sample antibody synthesized image into a living body detection pre-training model, and predicting and generating an M-class second prediction result of whether the M-class negative-sample antibody synthesized image contains living body information of a real object through the living body detection pre-training model.
Training the living body detection pre-training model according to the first prediction result, the first target label, the M-class second prediction result and the M-class second target label to optimize parameters of the living body detection pre-training model and generate a trained living body detection model.
According to the device provided by the embodiment of the application, the positive sample countermeasure composite image and the negative sample countermeasure composite images are predicted through the living body detection pre-training model to obtain the first prediction result corresponding to the positive sample countermeasure composite image and the second prediction results corresponding to the negative sample countermeasure composite images, and the living body detection pre-training model is then trained according to the prediction results and the target labels so as to optimize its parameters and generate a trained living body detection model, so that the trained living body detection model can accurately detect a real object living body, and the defensive capability is improved.
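As a hedged sketch of one retraining step on the composite images, the code below assumes the model outputs per-class logits (index 0 standing for the "real" class, indices 1..M for the negative sample categories), that the target labels are given as class-index tensors, and that a standard cross-entropy loss is used; none of these conventions is fixed by the embodiment.

    import torch.nn.functional as F

    def retrain_step(model, optimizer, pos_composite, neg_composites,
                     first_target_label, second_target_labels):
        logits_pos = model(pos_composite)      # prediction for the positive countermeasure composite image
        logits_neg = model(neg_composites)     # predictions for the M classes of negative composite images
        loss = F.cross_entropy(logits_pos, first_target_label) \
             + F.cross_entropy(logits_neg, second_target_labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()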
In an alternative embodiment of the data processing apparatus provided in the corresponding embodiment of fig. 18 of the present application, the model retraining module 150 is further configured to:
the positive sample countermeasure synthetic image is input into a living body detection pre-training model, and a first prediction probability that the positive sample countermeasure synthetic image contains living body information of a real object and M second prediction probabilities that the positive sample countermeasure synthetic image corresponds to M negative sample categories are generated through a classification module in the living body detection pre-training model.
And determining the maximum value of the first prediction probability and the M second prediction probabilities as a first prediction result.
According to the device provided by the embodiment of the application, the classification module in the living body detection pre-training model predicts, for the positive sample countermeasure composite image, a probability value for each category, and the category corresponding to the maximum of all these probability values is used as the prediction result, so that the accuracy of the output result of the living body detection pre-training model is improved.
In an alternative embodiment of the data processing apparatus provided in the corresponding embodiment of fig. 18 of the present application, the model retraining module 150 is further configured to:
Each class of negative sample countermeasure composite images in the M classes of negative sample countermeasure composite images is input into the living body detection pre-training model, and a third prediction probability that each class of negative sample countermeasure composite images contains real object living body information and M fourth prediction probabilities that each class corresponds to the M negative sample categories are generated through the classification module in the living body detection pre-training model.
And determining, according to the maximum value among the third prediction probability and the M fourth prediction probabilities corresponding to each class of negative sample countermeasure composite images, the M classes of second prediction results of whether the M classes of negative sample countermeasure composite images contain real object living body information.
According to the device provided by the embodiment of the application, the classification module in the living body detection pre-training model predicts, for each negative sample countermeasure composite image, a probability value for each category, and the category corresponding to the maximum of all these probability values is used as the prediction result, so that the accuracy of the output result of the living body detection pre-training model is improved.
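A short sketch of the classification step described in the two embodiments above, assuming the classification module exposes per-class logits (index 0 meaning "contains real object living body information", indices 1..M the negative sample categories); that output convention is an assumption of the sketch.

    import torch
    import torch.nn.functional as F

    @torch.no_grad()
    def classify(model, image):
        probs = F.softmax(model(image), dim=-1)   # first/third prediction probability plus the M others
        predicted = probs.argmax(dim=-1)          # category with the maximum probability value
        return predicted, probs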
In an alternative embodiment of the data processing apparatus provided in the corresponding embodiment of fig. 18 of the present application, the sample countermeasure image generating module 130 is further configured to:
and inputting the positive sample image into a diffusion model, and encoding the positive sample image through an image encoding network in the diffusion model to generate positive sample image vector data.
The positive sample image vector data is input to a diffusion network in a diffusion model, interference data is added to the positive sample image vector data through the diffusion network, and positive sample interference vector data is generated.
And inputting the positive sample interference vector data into a feature extraction network in the diffusion model, and carrying out feature extraction on the positive sample interference vector data through the feature extraction network to generate a positive sample interference feature vector.
And inputting the positive sample interference feature vector into a noise reduction network in the diffusion model, and carrying out noise reduction on the positive sample interference feature vector through the noise reduction network to generate positive sample noise reduction vector data.
The positive sample noise reduction vector data is input to an image decoding network in the diffusion model to generate a positive sample countermeasure image.
According to the data processing device provided by the embodiment of the application, the positive sample is processed through the diffusion model to generate the positive sample countermeasure image, the data enhancement is performed from the viewpoint of countermeasure and defense, the diversity of training data is expanded, and the model is trained through the positive sample countermeasure data, so that the defensive capability and generalization of the living body detection model are further improved.
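The five-stage pipeline above can be read as encode, diffuse, extract features, denoise, decode. The sketch below wires placeholder sub-networks together in that order; every module here is an assumption, since the embodiment does not fix the architectures of the encoding, diffusion, feature extraction, noise reduction, or decoding networks.

    import torch.nn as nn

    class DiffusionPerturber(nn.Module):
        def __init__(self, encoder, diffuser, feature_net, denoiser, decoder):
            super().__init__()
            self.encoder = encoder          # image encoding network
            self.diffuser = diffuser        # diffusion network (adds interference data)
            self.feature_net = feature_net  # feature extraction network
            self.denoiser = denoiser        # noise reduction network
            self.decoder = decoder          # image decoding network

        def forward(self, image):
            vec = self.encoder(image)        # positive sample image vector data
            noisy = self.diffuser(vec)       # positive sample interference vector data
            feats = self.feature_net(noisy)  # positive sample interference feature vector
            denoised = self.denoiser(feats)  # positive sample noise reduction vector data
            return self.decoder(denoised)    # positive sample countermeasure image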
In an alternative embodiment of the data processing apparatus provided in the corresponding embodiment of fig. 18 of the present application, the sample countermeasure image generating module 130 is further configured to:
Each class of negative sample images in the M classes of negative sample images is input into the diffusion model, and each class of negative sample images is encoded through the image encoding network in the diffusion model to generate negative sample image vector data corresponding to each class of negative sample images in the M classes of negative sample images.
The negative sample image vector data corresponding to each class of negative sample images in the M classes of negative sample images is input to the diffusion network in the diffusion model, interference data is added to the negative sample image vector data through the diffusion network, and negative sample interference vector data corresponding to each class of negative sample images in the M classes of negative sample images is generated.
The negative sample interference vector data corresponding to each class of negative sample images in the M classes of negative sample images is input to the feature extraction network in the diffusion model, feature extraction is carried out on the negative sample interference vector data through the feature extraction network, and negative sample interference feature vectors corresponding to each class of negative sample images in the M classes of negative sample images are generated.
The negative sample interference feature vectors corresponding to each class of negative sample images in the M classes of negative sample images are input to the noise reduction network in the diffusion model, noise reduction is carried out on the negative sample interference feature vectors through the noise reduction network, and negative sample noise reduction vector data corresponding to each class of negative sample images in the M classes of negative sample images is generated.
The negative sample noise reduction vector data corresponding to each class of negative sample images in the M classes of negative sample images is input to the image decoding network in the diffusion model, and the negative sample countermeasure image corresponding to each class of negative sample images in the M classes of negative sample images is generated.
According to the data processing device provided by the embodiment of the application, the negative sample is processed through the diffusion model to generate the negative sample countermeasure image, the data enhancement is performed from the viewpoint of countermeasure and defense, the diversity of training data is expanded, and the model is trained through the negative sample countermeasure data, so that the defensive capability and generalization of the living body detection model are further improved.
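Under the same assumptions as the DiffusionPerturber sketch above, the per-class processing can be a plain loop; perturber and negative_images_by_class are hypothetical names used only for illustration.

    # Apply the same encode → diffuse → extract → denoise → decode path per negative class.
    negative_countermeasure_images = {
        cls: perturber(images) for cls, images in negative_images_by_class.items()
    }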
Countermeasure samples generated without screening can improve the robustness of the living body detection model, but they cannot serve as a suitable data enhancement means for improving its accuracy. Because countermeasure sample images generated in this way lie entirely within the distribution of the target classes, using them for training makes the classification boundary surface of the living body detection model jagged; while this improves the robustness of the living body detection model to countermeasure data, some data near the boundary surface for which the model is not very confident is also misclassified. This compromises the accuracy of the living body detection model.
Through the above analysis, the objective to be achieved is to improve the accuracy and generalization of the living body detection model while improving its robustness, so in the embodiment of the present application the countermeasure samples are screened by adding a boundary constraint. In other words, the countermeasure sample images used for data enhancement should not cross the boundary too far. Specifically: in an alternative embodiment of the data processing apparatus provided in the corresponding embodiment of fig. 18 of the present application, the model retraining module 150 is further configured to:
Training the living body detection pre-training model according to the positive sample countermeasure image carrying the first target label and the M classes of negative sample countermeasure images carrying the second target labels, so as to optimize the learning rate and the iteration step length of the living body detection pre-training model and generate a trained living body detection model.
In the device provided by the embodiment of the application, during the attack process, the learning rate and the iteration step length are controlled so that the attack iteration is stopped as soon as the probability value of the element where the target label is located just exceeds 0.2, and the resulting image is used as a reasonable countermeasure data enhancement. The countermeasure data obtained in this way lies in the vicinity of the classification boundary surface rather than inside the distribution of the target classes, and the classification boundary surface of a model trained with such countermeasure data is softer rather than jagged. Therefore, the deficiencies of the training data can be supplemented without sacrificing the original precision of the model, the defending capability of the model is further improved, and the security of the model is improved.
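A hedged sketch of this boundary-constrained attack: small signed-gradient steps are iterated, and the iteration stops as soon as the target-label probability just exceeds the 0.2 threshold mentioned above, so the sample stays near the classification boundary surface rather than deep inside the target class. The step size, step count, and use of a log-probability objective are assumptions of the sketch.

    import torch
    import torch.nn.functional as F

    def boundary_constrained_attack(model, image, target_label, step_size=1.0 / 255,
                                    max_steps=50, threshold=0.2):
        x = image.clone().detach().requires_grad_(True)   # image assumed to have shape (1, C, H, W)
        for _ in range(max_steps):
            probs = F.softmax(model(x), dim=-1)
            if probs[0, target_label] > threshold:        # probability just crossed the constraint: stop iterating
                break
            loss = -torch.log(probs[0, target_label] + 1e-8)
            grad, = torch.autograd.grad(loss, x)
            x = (x - step_size * grad.sign()).clamp(0.0, 1.0).detach().requires_grad_(True)
        return x.detach()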
In an alternative embodiment of the data processing apparatus provided in the corresponding embodiment of fig. 18 of the present application, referring to fig. 19, the data processing apparatus 10 further includes: the target image acquisition module 210 and the real object living body detection prediction module 220. Specifically:
The target image acquisition module 210 is configured to acquire a target image.
The real object living body detection prediction module 220 is configured to input the target image into a trained living body detection model, and output a target prediction result that the target image contains real object living body information through the trained living body detection model.
It can be understood that the device provided by the embodiment of the application uses the trained living body detection model to predict whether the target image contains real object living body information, which improves the accurate detection of the real object living body and improves the defending capability of the model.
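A minimal inference sketch for this embodiment; treating class index 0 as "real object living body information present" and the simple scaling of the input are assumptions of the sketch, and any resizing or normalization the model expects is omitted.

    import torch
    import torch.nn.functional as F
    from torchvision.io import read_image

    @torch.no_grad()
    def liveness_predict(trained_model, image_path):
        trained_model.eval()
        img = read_image(image_path).float() / 255.0           # target image as a CxHxW tensor in [0, 1]
        probs = F.softmax(trained_model(img.unsqueeze(0)), dim=-1)
        is_live = bool(probs.argmax(dim=-1).item() == 0)        # target prediction result
        return is_live, probs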
Fig. 20 is a schematic diagram of a server structure provided in an embodiment of the present application, where the server 300 may vary considerably in configuration or performance, and may include one or more central processing units (central processing units, CPU) 322 (e.g., one or more processors) and memory 332, one or more storage media 330 (e.g., one or more mass storage devices) storing applications 342 or data 344. Wherein the memory 332 and the storage medium 330 may be transitory or persistent. The program stored on the storage medium 330 may include one or more modules (not shown), each of which may include a series of instruction operations on a server. Still further, the central processor 322 may be configured to communicate with the storage medium 330 and execute a series of instruction operations in the storage medium 330 on the server 300.
The server 300 may also include one or more power supplies 326, one or more wired or wireless network interfaces 350, one or more input/output interfaces 358, and/or one or more operating systems 341, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, etc.
The steps performed by the server in the above embodiments may be based on the server structure shown in fig. 20.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, which are not repeated herein.
In the several embodiments provided in the present application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of elements is merely a logical functional division, and there may be additional divisions of actual implementation, e.g., multiple elements or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The above embodiments are only for illustrating the technical solution of the present application, and not for limiting the same; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims (15)

1. A method of data processing, comprising:
acquiring a positive sample image and M types of negative sample images, wherein the positive sample image contains real object living body information, the positive sample image carries a positive sample label, the M types of negative sample images correspond to M negative sample types, the negative sample image does not contain real object living body information, the negative sample image carries a negative sample label, and M is more than or equal to 1;
training a living body detection model according to the positive sample image carrying the positive sample label and the M-type negative sample image carrying the negative sample label to generate a living body detection pre-training model;
inputting the positive sample image into a diffusion model to generate a positive sample countermeasure image, and inputting the M-class negative sample image into the diffusion model to generate an M-class negative sample countermeasure image, wherein the positive sample countermeasure image is an image generated by adding interference information to the positive sample image, and the negative sample countermeasure image is an image generated by adding the interference information to the negative sample image;
Setting a first target label for the positive sample countermeasure image and setting a second target label for each negative sample countermeasure image in the M-class negative sample countermeasure image, wherein the first target label is one of the M-class negative sample labels corresponding to the M-class negative sample image, and the second target label is used for representing that the negative sample countermeasure image contains real object living body information;
training the living body detection pre-training model according to the positive sample countermeasure image carrying the first target label and the M classes of negative sample countermeasure images carrying the second target labels so as to optimize parameters of the living body detection pre-training model, and generating a trained living body detection model.
2. The data processing method of claim 1, wherein the training the living body detection pre-training model based on the positive sample countermeasure image carrying the first target label and the M classes of negative sample countermeasure images carrying the second target labels to optimize parameters of the living body detection pre-training model, generating a trained living body detection model, comprises:
inputting the positive sample countermeasure image carrying the first target label into the living body detection pre-training model, and outputting a first probability value and M second probability values corresponding to the positive sample countermeasure image through the living body detection pre-training model, wherein the first probability value is used for characterizing the possibility that the positive sample countermeasure image contains real object living body information, and the M second probability values are used for characterizing the possibility that the positive sample countermeasure image corresponds to each negative sample category in the M negative sample categories;
inputting the M classes of negative sample countermeasure images carrying the second target labels into the living body detection pre-training model, and outputting a third probability value and M fourth probability values corresponding to each negative sample countermeasure image through the living body detection pre-training model, wherein the third probability value characterizes the possibility that the negative sample countermeasure image contains real object living body information, and the M fourth probability values are used for characterizing the possibility that the negative sample countermeasure image corresponds to each negative sample category in the M negative sample categories;
calculating a loss function according to the first probability value and the M second probability values corresponding to the positive sample countermeasure image, and the third probability value and the M fourth probability values corresponding to each negative sample countermeasure image;
and training the living body detection pre-training model according to the loss function so as to optimize parameters of the living body detection pre-training model and generate a trained living body detection model.
3. The data processing method of claim 2, wherein calculating a loss function from the first probability value and the M second probability values corresponding to the positive sample countermeasure image, and the third probability value and the M fourth probability values corresponding to each negative sample countermeasure image comprises:
Determining a second probability value of a corresponding negative sample class of the positive sample countermeasure image from the M second probability values as a first target probability value;
determining the maximum probability value in the probability values as a second target probability value from M fourth probability values corresponding to each negative sample countermeasure image;
and calculating a loss function according to the first probability value, the first target probability value, the second target probability value corresponding to each negative sample countermeasure image and the third probability value.
4. The data processing method of claim 2, wherein the training the living body detection pre-training model according to the loss function to optimize parameters of the living body detection pre-training model, generating a trained living body detection model, comprises:
calculating the gradient value of the positive sample countermeasure image according to the loss function, and generating a positive sample countermeasure image gradient value;
calculating the gradient value of the M-class negative sample countermeasure image according to the loss function, and generating the gradient value of the M-class negative sample countermeasure image;
performing gradient processing on the positive sample countermeasure image according to the positive sample countermeasure image gradient value to generate a positive sample countermeasure composite image;
Performing gradient processing on the corresponding negative sample countermeasure image according to each negative sample countermeasure image gradient value in the M classes of negative sample countermeasure image gradient values to generate M classes of negative sample countermeasure composite images;
training the living body detection pre-training model according to the positive sample countermeasure composite image and the M classes of negative sample countermeasure composite images so as to optimize parameters of the living body detection pre-training model, and generating a trained living body detection model.
5. The data processing method of claim 4, wherein the training the living body detection pre-training model based on the positive sample countermeasure composite image and the M classes of negative sample countermeasure composite images to optimize parameters of the living body detection pre-training model, generating a trained living body detection model, comprises:
inputting the positive sample countermeasure composite image into the living body detection pre-training model, and generating, through prediction by the living body detection pre-training model, a first prediction result of whether the positive sample countermeasure composite image contains real object living body information;
inputting the M classes of negative sample countermeasure composite images into the living body detection pre-training model, and generating, through prediction by the living body detection pre-training model, M classes of second prediction results of whether the M classes of negative sample countermeasure composite images contain real object living body information;
And training the living body detection pre-training model according to the first prediction result, the first target label, the M-class second prediction result and the M-class second target label so as to optimize parameters of the living body detection pre-training model and generate a trained living body detection model.
6. The data processing method according to claim 5, wherein the inputting the positive sample countermeasure composite image into the living body detection pre-training model, and generating, through prediction by the living body detection pre-training model, a first prediction result of whether the positive sample countermeasure composite image contains real object living body information, comprises:
inputting the positive sample countermeasure composite image into the living body detection pre-training model, and generating, through a classification module in the living body detection pre-training model, a first prediction probability that the positive sample countermeasure composite image contains real object living body information, and M second prediction probabilities that the positive sample countermeasure composite image corresponds to the M negative sample categories;
and determining the maximum value of the first prediction probability and the M second prediction probabilities as a first prediction result.
7. The data processing method according to claim 5, wherein the inputting the M classes of negative sample countermeasure composite images into the living body detection pre-training model, and generating, through prediction by the living body detection pre-training model, M classes of second prediction results of whether the M classes of negative sample countermeasure composite images contain real object living body information, comprises:
inputting each class of negative sample countermeasure composite images in the M classes of negative sample countermeasure composite images into the living body detection pre-training model, and generating, through a classification module in the living body detection pre-training model, a third prediction probability that each class of negative sample countermeasure composite images contains real object living body information, and M fourth prediction probabilities that each class of negative sample countermeasure composite images corresponds to the M negative sample categories;
determining, according to the third prediction probability corresponding to each class of negative sample countermeasure composite images in the M classes of negative sample countermeasure composite images and the maximum value among the M fourth prediction probabilities, the M classes of second prediction results of whether the M classes of negative sample countermeasure composite images contain real object living body information.
8. The data processing method of claim 1, wherein the inputting the positive sample image into a diffusion model to generate a positive sample countermeasure image comprises:
inputting the positive sample image into a diffusion model, and encoding the positive sample image through an image encoding network in the diffusion model to generate positive sample image vector data;
inputting the positive sample image vector data to a diffusion network in the diffusion model, and adding interference data to the positive sample image vector data through the diffusion network to generate positive sample interference vector data;
Inputting the positive sample interference vector data to a feature extraction network in the diffusion model, and carrying out feature extraction on the positive sample interference vector data through the feature extraction network to generate a positive sample interference feature vector;
inputting the positive sample interference feature vector to a noise reduction network in the diffusion model, and carrying out noise reduction on the positive sample interference feature vector through the noise reduction network to generate positive sample noise reduction vector data;
and inputting the positive sample noise reduction vector data to an image decoding network in the diffusion model to generate a positive sample countermeasure image.
9. The data processing method of claim 1, wherein the inputting the M classes of negative sample images into the diffusion model to generate M classes of negative sample countermeasure images comprises:
inputting each class of negative sample images in the M classes of negative sample images into the diffusion model, and encoding each class of negative sample images through the image encoding network in the diffusion model to generate negative sample image vector data corresponding to each class of negative sample images in the M classes of negative sample images;
inputting the negative sample image vector data corresponding to each class of negative sample images in the M classes of negative sample images into the diffusion network in the diffusion model, and adding interference data to the negative sample image vector data through the diffusion network to generate negative sample interference vector data corresponding to each class of negative sample images in the M classes of negative sample images;
inputting the negative sample interference vector data corresponding to each class of negative sample images in the M classes of negative sample images into the feature extraction network in the diffusion model, and carrying out feature extraction on the negative sample interference vector data through the feature extraction network to generate negative sample interference feature vectors corresponding to each class of negative sample images in the M classes of negative sample images;
inputting the negative sample interference feature vectors corresponding to each class of negative sample images in the M classes of negative sample images into the noise reduction network in the diffusion model, and carrying out noise reduction on the negative sample interference feature vectors through the noise reduction network to generate negative sample noise reduction vector data corresponding to each class of negative sample images in the M classes of negative sample images;
inputting the negative sample noise reduction vector data corresponding to each class of negative sample images in the M classes of negative sample images into the image decoding network in the diffusion model to generate the negative sample countermeasure image corresponding to each class of negative sample images in the M classes of negative sample images.
10. The data processing method of claim 1, wherein the training the living body detection pre-training model based on the positive sample countermeasure image carrying the first target label and the M classes of negative sample countermeasure images carrying the second target labels to optimize parameters of the living body detection pre-training model, generating a trained living body detection model, comprises:
training the living body detection pre-training model according to the positive sample countermeasure image carrying the first target label and the M classes of negative sample countermeasure images carrying the second target labels so as to optimize the learning rate and the iteration step length of the living body detection pre-training model and generate a trained living body detection model.
11. The data processing method of claim 1, wherein the method further comprises:
acquiring a target image;
and inputting the target image into a trained living body detection model, and outputting a target prediction result of the target image containing living body information of the real object through the trained living body detection model.
12. A data processing apparatus, comprising:
the sample image acquisition module is used for acquiring a positive sample image and M types of negative sample images, wherein the positive sample image contains real object living body information, the positive sample image carries a positive sample label, the M types of negative sample images correspond to M negative sample categories, the negative sample image does not contain real object living body information, the negative sample image carries a negative sample label, and M is more than or equal to 1;
the model pre-training module is used for training a living body detection model according to the positive sample image carrying the positive sample label and the M classes of negative sample images carrying the negative sample labels, so as to generate a living body detection pre-training model;
A sample countermeasure image generating module, configured to input the positive sample image into a diffusion model, generate a positive sample countermeasure image, and input the M types of negative sample images into the diffusion model, generate M types of negative sample countermeasure images, where the positive sample countermeasure image is an image generated by adding interference information to the positive sample image, and the negative sample countermeasure image is an image generated by adding interference information to the negative sample image;
a target label setting module, configured to set a first target label for the positive sample countermeasure image and set a second target label for each negative sample countermeasure image in the M classes of negative sample countermeasure images, where the first target label is one of the M classes of negative sample labels corresponding to the M classes of negative sample images, and the second target label is used to characterize that the negative sample countermeasure image contains real object living body information;
and the model retraining module is used for training the living body detection pre-training model according to the positive sample countermeasure image carrying the first target label and the M classes of negative sample countermeasure images carrying the second target labels so as to optimize parameters of the living body detection pre-training model and generate a trained living body detection model.
13. A computer device, comprising: memory, transceiver, processor, and bus system;
wherein the memory is used for storing programs;
the processor being configured to execute a program in the memory, including performing the data processing method according to any one of claims 1 to 11;
the bus system is used for connecting the memory and the processor so as to enable the memory and the processor to communicate.
14. A computer readable storage medium comprising instructions which, when run on a computer, cause the computer to perform the data processing method of any of claims 1 to 11.
15. A computer program product comprising a computer program, characterized in that the computer program is executed by a processor for performing a data processing method according to any of claims 1 to 11.

