CN115019118B - Countermeasure image generation method, device, computer device, and storage medium - Google Patents


Info

Publication number
CN115019118B
CN115019118B (application CN202210471139.9A)
Authority
CN
China
Prior art keywords
image
gradient
target
classification
loss
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210471139.9A
Other languages
Chinese (zh)
Other versions
CN115019118A (en)
Inventor
刘彦宏
王洪斌
吴海英
蒋宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Mashang Xiaofei Finance Co Ltd
Original Assignee
Mashang Xiaofei Finance Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mashang Xiaofei Finance Co Ltd filed Critical Mashang Xiaofei Finance Co Ltd
Priority to CN202210471139.9A priority Critical patent/CN115019118B/en
Publication of CN115019118A publication Critical patent/CN115019118A/en
Application granted granted Critical
Publication of CN115019118B publication Critical patent/CN115019118B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77: Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774: Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The embodiment of the application discloses a method, an apparatus, a computer device and a storage medium for generating a countermeasure image. The specific steps of generating the intermediate countermeasure image include: inputting the image to be processed into a pre-trained target detection model, and outputting classification prediction information and region prediction coordinate information of each detection object; inputting the classification prediction information and the corresponding classification label information of each detection object into a prediction classification loss function to obtain the corresponding target classification loss; determining the target positioning loss of each detection object; constructing a target gradient loss function to calculate a target gradient image corresponding to the image to be processed; and superimposing the target gradient image on the image to be processed to generate an intermediate countermeasure image, which serves as the image to be processed in the next iteration. The generation of the intermediate countermeasure image is repeated until the intermediate countermeasure image contains no detection object identified as the first category by the target detection model, thereby obtaining the target countermeasure image and improving the efficiency and performance of countermeasure image generation.

Description

Countermeasure image generation method, device, computer device, and storage medium
Technical Field
The present application relates to the field of computer vision, and in particular, to a method and apparatus for generating a countermeasure image, a computer device, and a storage medium.
Background
In recent years, deep learning network models have become the most widely applied models in fields such as computer vision, and target detection technology based on such models has broad application in intelligent supervision, i.e., monitoring surveillance images for abnormal objects that do not meet specifications.
In the related art, target objects of each predefined category can be detected by a target detection model, and whether an abnormal object exists is then determined from the detection result. Given the high safety requirements of such supervision, the robustness of the intelligent model is often evaluated by means of adversarial attacks.
During research and practice on the prior art, the inventors found that the countermeasure images generated by existing methods can only cause the target detection model to identify the detection frame of an abnormal object inaccurately; they cannot make the identification of the abnormal object fail altogether, so the countermeasure efficiency and performance are low.
Disclosure of Invention
The embodiment of the application provides a method, an apparatus, a computer device and a storage medium for generating a countermeasure image, which can improve the efficiency and performance of adversarial attacks.
In order to solve the above technical problems, the embodiment of the application provides the following technical solutions:
A countermeasure image generation method, comprising:
The specific steps of generating the intermediate countermeasure image include:
inputting the image to be processed into a pre-trained target detection model, and outputting classification prediction information and region prediction coordinate information of each detection object;
Inputting the classification prediction information and the corresponding classification label information of each detection object into a prediction classification loss function, and performing the type of gradient calculation corresponding to the category to which the classification label information belongs, to obtain the corresponding target classification loss;
The prediction classification loss function comprises a gradient increment calculation item and a gradient decrement calculation item, wherein the gradient increment calculation item is used for gradient increment calculation of classification loss of a detection object of which the classification label information belongs to a first category, and the gradient decrement calculation item is used for gradient decrement calculation of classification loss of a detection object of which the classification label information belongs to a second category;
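As an illustration only (the claim does not fix a concrete loss), the increment/decrement split can be sketched in Python, assuming cross-entropy as the per-object classification loss and a plus/minus sign convention for the two calculation terms; the function names and the sign choice are assumptions, not the patent's formula:

```python
import math

def cross_entropy(pred_probs, label_idx):
    """Cross-entropy of one detection's predicted class distribution."""
    return -math.log(max(pred_probs[label_idx], 1e-12))

def target_classification_loss(detections, first_category):
    """Toy prediction classification loss with two calculation terms.

    `detections` is a list of (pred_probs, label_idx) pairs. Objects whose
    label belongs to the first (abnormal) category contribute through the
    gradient-increment term (+); all other objects contribute through the
    gradient-decrement term (-). Averaging over the object count follows
    the later subunit description.
    """
    increment = sum(cross_entropy(p, y) for p, y in detections if y == first_category)
    decrement = sum(cross_entropy(p, y) for p, y in detections if y != first_category)
    return (increment - decrement) / max(len(detections), 1)
```

Under this sign convention, ascending the gradient of the loss pushes the model away from first-category detections while keeping second-category detections stable.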
Determining target positioning loss according to the region prediction coordinate information of each detection object and the corresponding region label coordinate information;
Constructing a target gradient loss function based on the target classification loss and the target positioning loss;
calculating a target gradient image corresponding to the image to be processed through the target gradient loss function;
the target gradient image is superimposed on the image to be processed to generate an intermediate countermeasure image, wherein the intermediate countermeasure image is the image to be processed for gradient calculation in the next iteration;
The step of generating an intermediate countermeasure image is repeatedly performed until no detection object identified as the first category by the target detection model exists in the intermediate countermeasure image, and the intermediate countermeasure image is determined as the target countermeasure image.
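The claimed repeat-until-no-first-category loop can be illustrated with a deliberately tiny stand-in: a linear "detector" score replaces the detection network, each iteration steps against the sign of the gradient, and the accumulated perturbation is clipped to a preset disturbance range. Everything here (the linear model, the names, the parameters) is hypothetical, a sketch of the loop structure rather than the patented method:

```python
def generate_adversarial(image, w, b, eps_step=0.1, eps_max=1.0, max_iters=50):
    """Perturb `image` until a toy linear detector s = w.x + b no longer
    fires (s <= 0, standing in for "no first-category object detected")."""
    x = list(image)
    for _ in range(max_iters):
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        if score <= 0:          # no first-category detection remains
            return x
        # for a linear model the gradient of the score w.r.t. x is just w;
        # step against its sign, FGSM-style
        x = [xi - eps_step * (1 if wi > 0 else -1 if wi < 0 else 0)
             for wi, xi in zip(w, x)]
        # keep the total perturbation within [-eps_max, +eps_max]
        x = [min(max(xi, x0 - eps_max), x0 + eps_max)
             for xi, x0 in zip(x, image)]
    return x
```

The real method would backpropagate the target gradient loss function through the detection network to obtain the per-pixel gradient, but the loop skeleton (score, step, clip, repeat) is the same.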
A countermeasure image generating apparatus, comprising:
The generation module is used for generating an intermediate countermeasure image, and specifically comprises the following units: an output unit, used for inputting the image to be processed into the pre-trained target detection model and outputting the classification prediction information and the region prediction coordinate information of each detection object; an input unit, used for inputting the classification prediction information of each detection object and the corresponding classification label information into the prediction classification loss function, and performing the type of gradient calculation corresponding to the category to which the classification label information belongs, to obtain the corresponding target classification loss, wherein the prediction classification loss function comprises a gradient increment calculation term and a gradient decrement calculation term, the gradient increment calculation term being used for gradient increment calculation of the classification loss of detection objects whose classification label information belongs to the first category, and the gradient decrement calculation term being used for gradient decrement calculation of the classification loss of detection objects whose classification label information belongs to the second category; a first determining unit, configured to determine the target positioning loss according to the region prediction coordinate information of each detection object and the corresponding region label coordinate information; a construction unit, for constructing a target gradient loss function based on the target classification loss and the target positioning loss; a calculating unit, used for calculating a target gradient image corresponding to the image to be processed through the target gradient loss function; and a superposition unit, used for superimposing the target gradient image on the image to be processed and generating an intermediate countermeasure image, wherein the intermediate countermeasure image is the image to be processed for gradient calculation in the next iteration.
And an iteration module, used for repeatedly executing the step of generating the intermediate countermeasure image until no detection object identified as the first category by the target detection model exists in the intermediate countermeasure image, and determining the intermediate countermeasure image as the target countermeasure image.
In some embodiments, the predictive classification loss function includes a gradient increment calculation term for gradient increment calculation and a gradient decrement calculation term for gradient decrement calculation, the input unit comprising:
a first determining subunit, configured to determine a category to which the classification tag information of each detection object belongs;
A second determination subunit configured to determine, as a first detection object, a detection object whose category is a first category, and determine, as a second detection object, a detection object whose category is a second category;
The first input subunit is used for inputting the classification prediction information of the first detection object and the corresponding classification label information into a gradient increment calculation item to perform gradient increment calculation so as to obtain a first classification loss;
The second input subunit is used for inputting the classification prediction information of the second detection object and the corresponding classification label information into a gradient decreasing calculation item for gradient decreasing calculation to obtain a second classification loss;
And a third determination subunit configured to determine a target classification loss based on the first classification loss and the second classification loss.
In some embodiments, the third determining subunit is configured to:
summing up and calculating the first classification loss and the second classification loss to obtain a third classification loss;
And calculating the ratio of the third classification loss to the number of the detection objects to obtain the target classification loss.
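A minimal sketch of the averaging described by the third determining subunit (sum the two per-category losses to obtain the third classification loss, then divide by the number of detection objects); the function name is illustrative:

```python
def combine_classification_losses(first_loss, second_loss, num_detections):
    """Sum the first and second classification losses, then average over
    the number of detected objects to obtain the target classification loss."""
    third_loss = first_loss + second_loss
    return third_loss / num_detections
```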
In some embodiments, the apparatus further comprises a second determining unit for:
calculating a spatial distance between the classification prediction information and each classification label information;
and determining classification label information corresponding to the classification prediction information of each detection object according to the size of the space distance.
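Illustratively, if "spatial distance" is taken to be Euclidean distance (an assumption; the patent does not name the metric), assigning the label corresponding to each prediction reduces to a nearest-vector lookup:

```python
def assign_label(pred_vector, label_vectors):
    """Return the index of the classification label whose vector lies
    closest (smallest Euclidean distance) to the prediction vector."""
    def dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5
    return min(range(len(label_vectors)),
               key=lambda i: dist(pred_vector, label_vectors[i]))
```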
In some embodiments, the building unit is configured to:
weighting the target classification loss by a preset weight to obtain a weighted target classification loss;
And constructing a target gradient loss function based on the weighted target classification loss and the target positioning loss.
In some embodiments, the computing unit is configured to:
Calculating a gradient value of each pixel in the image to be processed according to the target gradient loss function to obtain a first gradient image;
Processing the first gradient image through a symbol function to obtain a second gradient image;
And calculating the product of the preset disturbance value and the second gradient image to obtain a target gradient image.
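The three computing-unit steps above amount to an FGSM-style update. A sketch of the last two steps (apply the sign function to the first gradient image, then scale by the preset disturbance value) in pure Python, with illustrative names and a flat list of per-pixel gradients standing in for an image:

```python
def target_gradient_image(gradients, epsilon=0.03):
    """Second gradient image = sign of each pixel gradient; target
    gradient image = preset disturbance value times the second gradient
    image."""
    def sign(g):
        return 1 if g > 0 else -1 if g < 0 else 0
    return [epsilon * sign(g) for g in gradients]
```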
In some embodiments, the superposition unit comprises:
The superposition subunit is used for superimposing the target gradient image on the image to be processed to obtain a first countermeasure image;
The acquisition subunit is used for acquiring an image to be processed which is input into the pre-trained target detection model for the first time and determining the image to be processed as an initial image;
And a generation subunit, configured to limit the pixel difference of each pixel between the first countermeasure image and the initial image to within a preset disturbance value, and generate an intermediate countermeasure image.
In some embodiments, the generating subunit is configured to:
Calculating pixel differences between each first pixel in the first countermeasure image and the corresponding second pixel in the initial image to form a pixel difference set;
Limiting each pixel difference in the pixel difference set to within a preset disturbance value range via a clipping function;
the minimum value of the preset disturbance value range is a negative preset disturbance value, and the maximum value of the preset disturbance value range is a positive preset disturbance value;
And superimposing the limited pixel difference set onto the initial image to generate an intermediate countermeasure image.
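The clipping described by the generating subunit can be sketched as follows for a flat list of pixels; the function name and the flat-image representation are illustrative:

```python
def clip_perturbation(first_adv, initial, eps=0.1):
    """Clip each pixel difference between the first countermeasure image
    and the initial image to [-eps, +eps], then superimpose the clipped
    differences back onto the initial image."""
    diffs = [min(max(a - b, -eps), eps) for a, b in zip(first_adv, initial)]
    return [b + d for b, d in zip(initial, diffs)]
```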
A method of training an object detection model, comprising:
Inputting a sample image set into an initial object detection model and performing model training to obtain a trained object detection model, wherein the sample image set comprises sample countermeasure images generated by the above countermeasure image generation method.
An image recognition method, comprising:
And inputting the image to be identified into an object detection model for object identification to obtain an identification result, wherein the object detection model is obtained by training according to the training method of the object detection model.
A computer readable storage medium storing a plurality of instructions adapted to be loaded by a processor to perform the steps in the countermeasure image generation method described above.
A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps in the above-described challenge image generation method or training method of an object detection model when the computer program is executed.
A computer program product or computer program comprising computer instructions stored in a storage medium. The processor of the computer device reads the computer instructions from the storage medium, and the processor executes the computer instructions so that the computer performs the steps in the above-described countermeasure image generation method or training method of the object detection model.
The embodiment of the application generates the intermediate countermeasure image through the following specific steps: inputting the image to be processed into a pre-trained target detection model, and outputting classification prediction information and region prediction coordinate information of each detection object; inputting the classification prediction information and the corresponding classification label information of each detection object into a prediction classification loss function, and performing the type of gradient calculation corresponding to the category to which the classification label information belongs, to obtain the corresponding target classification loss, wherein the prediction classification loss function comprises a gradient increment calculation term for gradient increment calculation of the classification loss of detection objects whose classification label information belongs to the first category, and a gradient decrement calculation term for gradient decrement calculation of the classification loss of detection objects whose classification label information belongs to the second category; determining the target positioning loss according to the region prediction coordinate information of each detection object and the corresponding region label coordinate information; constructing a target gradient loss function based on the target classification loss and the target positioning loss; calculating a target gradient image corresponding to the image to be processed through the target gradient loss function; superimposing the target gradient image on the image to be processed to generate an intermediate countermeasure image, which is the image to be processed for gradient calculation in the next iteration; and repeating the step of generating the intermediate countermeasure image until no detection object identified as the first category by the target detection model exists in the intermediate countermeasure image, whereupon the intermediate countermeasure image is determined as the target countermeasure image. Accordingly, each detection object undergoes the corresponding gradient increment or gradient decrement calculation according to the category of its classification label information, yielding the target classification loss; a target gradient loss function is constructed from this loss, the image to be processed is evaluated through it to generate a target gradient image, which is superimposed on the image to be processed to produce an intermediate countermeasure image; and the process repeats until the intermediate countermeasure image contains no detection object identified as the first category by the target detection model, at which point it is determined to be the target countermeasure image. Compared with schemes in which the generated countermeasure image merely makes the target detection model locate the detection frame of the abnormal object inaccurately, the present application causes the abnormal detection object to be identified as a normal object, completely defeating the recognition capability of the target detection model and greatly improving the efficiency and performance of countermeasure image generation.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic view of a scene of a countermeasure image generation system provided by an embodiment of the present application;
FIG. 2a is a flow chart of a method for generating a countermeasure image according to an embodiment of the present application;
Fig. 2b is a schematic structural diagram of an object detection network according to an embodiment of the present application;
fig. 2c is a schematic view of a scenario of a countermeasure image generating method according to an embodiment of the present application;
FIG. 3 is another flow chart of a method for generating a countermeasure image according to an embodiment of the present application;
Fig. 4 is a schematic structural diagram of a server according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to fall within the scope of the application.
The embodiment of the application provides a method, an apparatus, a computer device and a storage medium for generating a countermeasure image. The countermeasure image generation method can be applied to a countermeasure image generating apparatus. The countermeasure image generating apparatus may be integrated in a computer device, which may be a terminal having a data processing function. The terminal may be, but is not limited to, a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart watch, and the like. The computer device may also be a server, where the server may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, content delivery network (CDN) acceleration services, and basic cloud computing services such as big data and artificial intelligence platforms.
Referring to fig. 1, a schematic view of a scene of the countermeasure image generating system according to the present application is shown. As shown, the computer device generates an intermediate countermeasure image through the following specific steps: inputting the image to be processed into a pre-trained target detection model, and outputting classification prediction information and region prediction coordinate information of each detection object; inputting the classification prediction information and the corresponding classification label information of each detection object into a prediction classification loss function, and performing the type of gradient calculation corresponding to the category to which the classification label information belongs, to obtain the corresponding target classification loss, wherein the prediction classification loss function comprises a gradient increment calculation term for gradient increment calculation of the classification loss of detection objects whose classification label information belongs to the first category, and a gradient decrement calculation term for gradient decrement calculation of the classification loss of detection objects whose classification label information belongs to the second category; determining the target positioning loss according to the region prediction coordinate information of each detection object and the corresponding region label coordinate information; constructing a target gradient loss function based on the target classification loss and the target positioning loss; calculating a target gradient image corresponding to the image to be processed through the target gradient loss function; superimposing the target gradient image on the image to be processed to generate an intermediate countermeasure image, the intermediate countermeasure image being the image to be processed for gradient calculation in the next iteration; and repeating the step of generating the intermediate countermeasure image until no detection object identified as the first category by the target detection model exists in the intermediate countermeasure image, whereupon the intermediate countermeasure image is determined as the target countermeasure image. It should be noted that the schematic view of countermeasure image generation shown in fig. 1 is only an example; the scenario described in the embodiment of the present application serves to explain the technical solution of the application more clearly and does not constitute a limitation of the technical solution provided by the present application. Those skilled in the art will appreciate that, with the evolution of target detection model training and the emergence of new service scenarios, the technical solution provided by the application is equally applicable to similar technical problems.
The following will describe in detail.
In the present embodiment, description will be made from the viewpoint of a countermeasure image generating apparatus, which may be integrated in a server equipped with a storage unit and a microprocessor and having computing capability.
Referring to fig. 2a, fig. 2a is a flowchart illustrating a countermeasure image generation method according to an embodiment of the application. The countermeasure image generation method includes: generating an intermediate countermeasure image, the step of generating the intermediate countermeasure image being repeatedly performed until no detection object identified as the first category by the target detection model exists in the intermediate countermeasure image, whereupon the intermediate countermeasure image is determined as the target countermeasure image. In the present embodiment, the specific steps of generating the intermediate countermeasure image include steps 101 to 106.
In step 101, an image to be processed is input into a pre-trained target detection model, and classification prediction information and region prediction coordinate information of each detection object are output.
Computer Vision (CV) is the science of studying how to make machines "see"; more specifically, it replaces human eyes with cameras and computers to perform machine vision tasks such as recognition, tracking and measurement of targets, and further performs graphic processing so that the result is an image more suitable for human observation or for transmission to an instrument for detection. As a scientific discipline, research on computer vision theory and technology attempts to build artificial intelligence systems that can acquire information from images or multidimensional data. Computer vision techniques typically include information processing, image recognition, image semantic understanding, image retrieval, optical character recognition (OCR), video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D techniques, virtual reality, augmented reality, and simultaneous localization and mapping, as well as common biometric recognition techniques such as face recognition and fingerprint recognition.
The scheme provided by the embodiment of the application relates to the technologies such as computer vision technology of artificial intelligence, and the like, and is specifically described by the following embodiments:
In order to better understand the embodiments of the present application, it should be noted that the target detection model in the embodiments of the present application may be the Faster R-CNN object detection network; please refer to fig. 2b, which is a schematic structural diagram of the object detection network according to the embodiments of the present application.
The object detection network 10 may be divided into mainly 4 parts:
The basic convolutional network 12 (conv layers) is a convolutional neural network, for example comprising 13 convolutional (conv) layers, 13 linear rectification function (ReLU) layers and 4 pooling layers, and is mainly used for extracting the feature map information 13 (feature maps) from the image 11 to be processed.
The region proposal network 14 (Region Proposal Network, RPN) is configured to generate candidate regions (region proposals). Specifically, it classifies anchors in the feature map information 13 via a normalization function (softmax), obtains positive and negative information, and determines the positive classifications as candidate regions, i.e., regions of the image preliminarily classified as containing objects; the classification results corresponding to these candidate regions constitute their classification information, such as person, animal, or building.
Further, the offset of the border regression (bounding box regression) of the anchor can be calculated, the candidate region is adjusted according to the offset of the border regression, the final target candidate region 15 (proposal) is obtained, and the target candidate region 15 which is too small and exceeds the border is removed, so that the positioning frame selection of the target object is realized.
The interest pooling layer 16 (ROI pooling) is responsible for collecting the target candidate region 15 and the feature map information 13, and calculating the feature map information (proposal feature maps) of the region with the size meeting the condition, and sending the feature map information to the subsequent layer for processing.
The classifier 17 (Classifier), which may include a fully connected layer and a normalization layer, combines the region feature map information through these layers and calculates the classification result corresponding to the region feature map. The target candidate region 15 may be fine-tuned according to the classification result, and the fine-tuned target candidate region 15 is determined as the final accurate detection region of interest (corresponding to the region prediction coordinate information in the embodiment of the present application); the classification result corresponding to the region of interest is its classification information (corresponding to the classification prediction information of the object in the embodiment of the present application).
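As a rough illustration of what the interest pooling layer does (not the patent's or Faster R-CNN's exact implementation, which handles spatial scales, multiple channels and fractional bin boundaries), a single-channel toy ROI max-pooling that reduces a rectangular region of a feature map to a fixed-size grid:

```python
def roi_pool(feature_map, roi, out_size=2):
    """Max-pool the region roi = (x0, y0, x1, y1) of a 2-D feature map
    (list of rows) down to an out_size x out_size grid."""
    x0, y0, x1, y1 = roi
    h, w = y1 - y0, x1 - x0
    pooled = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            ys = range(y0 + i * h // out_size, y0 + (i + 1) * h // out_size)
            xs = range(x0 + j * w // out_size, x0 + (j + 1) * w // out_size)
            row.append(max(feature_map[y][x] for y in ys for x in xs))
        pooled.append(row)
    return pooled
```

Whatever the size of the candidate region, the output is a fixed out_size x out_size block, which is what lets the subsequent fully connected layers operate on region feature maps of uniform shape.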
The pre-trained target detection model is a model trained on a large number of image samples, where each image sample is annotated with region label coordinate information and classification label information for each object. The region label coordinate information represents the detection frame of the object, and the classification label information is the calibration information of each object, such as cat, dog, or pig, so that the pre-trained target detection model can quickly identify the classification prediction information (i.e., calibration information) and region prediction coordinate information (i.e., detection frame) of each detection object in an image.
Therefore, a common application scenario of the target detection model is to rapidly monitor whether non-compliant abnormal objects appear in images captured by a surveillance camera. For example, intelligent security inspection of carry-on luggage at an airport or subway must detect object categories that trigger abnormal alarms, such as controlled knives, guns, or water bottles. Given the high safety requirements of such supervision, the robustness of the target detection model is often evaluated by means of adversarial attacks, so how to generate an effective countermeasure image is a problem to be solved.
In the embodiment of the present application, the image to be processed can be input into the pre-trained target detection model, which outputs the classification prediction information and region prediction coordinate information of each detection object in the image to be processed; both the classification prediction information and the region prediction coordinate information can be expressed as vectors.
Referring to fig. 2c, fig. 2c is a schematic view of a scenario of the countermeasure image generation method according to an embodiment of the present application. In the processing scenario 20, the image to be processed may be input into the pre-trained target detection model, which outputs the classification prediction information (also referred to as the detection frame type prediction vector) and the region prediction coordinate information (the detection frame coordinate prediction vector).
In some embodiments, before inputting the classification prediction information and the corresponding classification label information of each detection object into the prediction classification loss function, the method may further include:
(1) Calculating a spatial distance between the classification prediction information and each classification label information;
(2) And determining classification label information corresponding to the classification prediction information of each detection object according to the size of the spatial distance.
The spatial distance may be a true distance between vectors. In one embodiment, the spatial distance may be the Euclidean distance, a commonly used distance definition referring to the true distance between two points in m-dimensional space, or the natural length of a vector (i.e., the distance from the point to the origin). In two and three dimensions, the Euclidean distance is the actual distance between two points.
In this way, the Euclidean distance between the classification prediction information and each piece of classification label information may be calculated. For example, suppose the classification prediction information is (0.2, 0.3, 0.8) and the classification label information includes cat (1, 0, 0), dog (0, 1, 0), and pig (0, 0, 1); the spatial distance to pig (0, 0, 1) is the smallest, so pig (0, 0, 1) may be used as the classification label information of the classification prediction information (0.2, 0.3, 0.8). By analogy, the classification label information corresponding to the classification prediction information of each detection object can be calculated.
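The nearest-label matching described above can be sketched as follows (a toy illustration; the three label names and the three-class setup are assumptions for this example, not the patent's actual model):

```python
import numpy as np

# Hypothetical one-hot classification label vectors (assumed for illustration)
LABELS = {
    "cat": np.array([1.0, 0.0, 0.0]),
    "dog": np.array([0.0, 1.0, 0.0]),
    "pig": np.array([0.0, 0.0, 1.0]),
}

def nearest_label(prediction):
    """Return the label whose one-hot vector has the smallest
    Euclidean distance to the classification prediction vector."""
    return min(LABELS, key=lambda name: np.linalg.norm(prediction - LABELS[name]))

print(nearest_label(np.array([0.2, 0.3, 0.8])))  # pig
```

For the prediction (0.2, 0.3, 0.8) the distance to pig (0, 0, 1) is the smallest of the three, matching the worked example above.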
In step 102, the classification prediction information and the corresponding classification label information of each detection object are input into a prediction classification loss function, and gradient calculation of the corresponding type of the class to which the corresponding classification label information belongs is performed, so as to obtain the corresponding target classification loss.
In order to make an abnormal object be identified as a normal object, thereby disabling the detection function of the pre-trained target detection model, the prediction classification loss function can be designed to include a gradient increment calculation item and a gradient decrement calculation item: the gradient increment calculation item is used for gradient increment calculation of the classification loss of detection objects whose classification label information belongs to the first category, and the gradient decrement calculation item is used for gradient decrement calculation of the classification loss of detection objects whose classification label information belongs to the second category.
That is, in the embodiment of the present application, the object classes that the pre-trained target detection model can identify can be divided into two categories, where the first category can be understood as the abnormal category and the second category as the normal category. For example, suppose the set of identifiable object classes is Y, comprising M abnormal object classes, denoted V = {v_1, ..., v_M}, and N normal object classes, denoted U = {u_1, ..., u_N}, with Y = U ∪ V.
In the embodiment of the present application, gradient ascent is used to find the argument value at which a function is maximized, so that the difference between the classification prediction information and the corresponding classification label information grows larger and larger, and the pre-trained target detection model tends to identify the detection region of an abnormal object as a normal object. Gradient descent is used to find the argument value at which a function is minimized; here it makes the difference between the classification prediction information and the corresponding classification label smaller and smaller, so that the pre-trained target detection model still tends to identify the detection region of a normal object as a normal object, i.e., the recognition of normal objects is not changed.
In one embodiment, the predictive classification loss function may be expressed in the following equation:
The Lcls (F (x), { yk }) is a prediction classification loss function, where n is the number of detected objects, L CE () represents a cross entropy loss function, F (x) k is a prediction probability vector of the kth prediction region with respect to Y classes, Y k is the true class of the kth detection box (i.e., classification label information), in the embodiment of the present application, each target object corresponds to a classification loss term being L k=LCE(F(x)k,yk), yk e V is adopted for the object as an abnormal object class, Gradient increment calculation mode, namely setting gradient increment calculation mode, wherein the/>The classification label information representing j abnormal objects, in contrast, for the class of normal objects, adopts a gradient decreasing calculation mode of yk epsilon U and L k=-LCE(F(x)k,yk), namely, a mode of increasing negative sign (-) to change an original gradient increasing mode into gradient decreasing mode.
The classification prediction information of each detection object and the corresponding classification label information are sequentially input into the prediction classification loss function, and the corresponding gradient calculation is performed according to the category to which the classification label information belongs: gradient increment calculation when the category is the first category, and gradient decrement calculation when the category is the second category, so as to obtain the corresponding target classification loss.
In some embodiments, the prediction classification loss function includes a gradient increment calculation term used for gradient increment calculation and a gradient decrement calculation term used for gradient decrement calculation. The step of inputting the classification prediction information of each detection object and the corresponding classification label information into the prediction classification loss function and performing the corresponding type of gradient calculation according to the category to which the classification label information belongs, so as to obtain the corresponding target classification loss, includes the steps of:
(1) Determining the category to which the classification label information of each detection object belongs;
(2) Determining a detection object with a category of a first category as a first detection object, and determining a detection object with a category of a second category as a second detection object;
(3) Inputting the classification prediction information of the first detection object and the corresponding classification label information into a gradient increment calculation item to perform gradient increment calculation so as to obtain a first classification loss;
(4) Inputting the classification prediction information of the second detection object and the corresponding classification label information into a gradient decreasing calculation item for gradient decreasing calculation to obtain a second classification loss;
(5) A target classification loss is determined based on the first classification loss and the second classification loss.
Wherein, a normal category and an abnormal category can be set, the first category is the abnormal category, and the second category is the normal category. The abnormal category and the normal category may be preset in advance, for example, a control tool, a gun, and a water bottle are determined as the abnormal category, and objects such as a mobile phone and a computer are determined as the normal category.
Therefore, the category to which the classification label information of each detection object belongs is determined to be the first category or the second category. A detection object whose category is the first category, i.e., a detection object that is an abnormal object, is determined as a first detection object, and a detection object whose category is the second category, i.e., a detection object that is a normal object, is determined as a second detection object.
Further, please refer to the following formula:
L_k = (y_k ∈ V) ⊙ L_CE(F(x)_k, y_k) - (y_k ∈ U) ⊙ L_CE(F(x)_k, y_k)

where L_k is the classification loss term corresponding to each target object; (y_k ∈ V) ⊙ L_CE(F(x)_k, y_k) is the gradient increment calculation term, which applies when y_k ∈ V, i.e., when the category of the detection object's classification label information is the first category; and -(y_k ∈ U) ⊙ L_CE(F(x)_k, y_k) is the gradient decrement calculation term, which applies when y_k ∈ U, i.e., when the category is the second category.
Based on the formula, the classification prediction information of the first detection object and the corresponding classification label information are input into a gradient increment calculation item to perform gradient increment calculation to obtain first classification loss, and the classification prediction information of the second detection object and the corresponding classification label information are input into a gradient decrement calculation item to perform gradient decrement calculation to obtain second classification loss.
Finally, the first classification loss and the second classification loss are aggregated to obtain the total loss (i.e., the target classification loss).
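A minimal sketch of the per-object signed cross-entropy described by these steps (the class names and the softmax-probability inputs are assumptions for illustration, not the patent's exact implementation):

```python
import numpy as np

# Assumed split into the first (abnormal, V) and second (normal, U) categories
ABNORMAL = {"knife", "gun", "bottle"}

def cross_entropy(prob, onehot):
    """Standard cross-entropy between a probability vector and a one-hot label."""
    return float(-np.sum(onehot * np.log(prob + 1e-12)))

def per_object_loss(prob, onehot, label_name):
    """Gradient increment term (+CE) for abnormal objects,
    gradient decrement term (-CE) for normal objects."""
    ce = cross_entropy(prob, onehot)
    return ce if label_name in ABNORMAL else -ce

def target_classification_loss(losses):
    """Third classification loss = sum of all per-object losses;
    target classification loss = ratio to the number of detections."""
    return sum(losses) / len(losses)
```

Maximizing the positive CE term pushes abnormal predictions away from their true labels, while the negated term keeps normal objects recognized as before.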
In some embodiments, the step of determining a target classification loss based on the first classification loss and the second classification loss comprises:
(1.1) summing the first classification loss and the second classification loss to obtain a third classification loss;
(1.2) calculating the ratio of the third classification loss to the total number of detected objects to obtain the target classification loss.
Please refer to the following formula:

L_cls = (1/n) · Σ_{k=1}^{n} L_k

Based on this formula, the first classification loss and the second classification loss can be summed to obtain the third classification loss, and the ratio of the third classification loss to the number n of detection objects identified in the image to be processed is then calculated to obtain the target classification loss L_cls.
In step 103, a target positioning loss is determined according to the region prediction coordinate information and the corresponding region tag coordinate information of each detection object.
The area prediction coordinate information and the corresponding area tag coordinate information of each detection object can be substituted into the existing prediction positioning loss function to calculate, so that the corresponding target positioning loss L loc can be obtained.
In step 104, a target gradient loss function is constructed based on the target classification loss and the target localization loss.
The target classification loss and the target positioning loss can be added to construct the target gradient loss function; since the target classification loss is included in it, the target gradient loss function encodes the bias rule of identifying abnormal objects as normal object classes.
In some embodiments, the step of constructing the target gradient loss function based on the target classification loss and the target localization loss may include:
(1) Weighting the target classification loss by a preset weight to obtain a weighted target classification loss;
(2) And constructing a target gradient loss function based on the weighted target classification loss and the target positioning loss.
Wherein, the following formula can be referred to together:
L = γ·L_cls + L_loc

where γ is a hyperparameter, that is, the preset weight, and L is the target gradient loss function. Since the bias rule of identifying abnormal objects as normal object classes is hidden in the target classification loss, the hyperparameter may be used for weighting; for example, γ may be 0.9, 1, and so on. Based on the above formula, the target classification loss L_cls is weighted by the hyperparameter to obtain the weighted target classification loss, and the target gradient loss function is the sum of the weighted target classification loss γL_cls and the target positioning loss L_loc.
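The weighted combination is a one-liner (γ = 0.9 here is just the example weight mentioned above; the function name is my own):

```python
def target_gradient_loss(l_cls, l_loc, gamma=0.9):
    """L = gamma * L_cls + L_loc, with gamma the preset weight (hyperparameter)."""
    return gamma * l_cls + l_loc

print(target_gradient_loss(2.0, 1.0))  # 0.9 * 2.0 + 1.0 = 2.8
```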
In step 105, a target gradient image corresponding to the image to be processed is calculated by the target gradient loss function.
In the embodiment of the present application, a countermeasure image needs to be generated as an adversarial sample against the pre-trained target detection model. The countermeasure image can be understood as the image to be processed with some disturbance imperceptible to the human eye added (the disturbance does not affect human recognition but easily fools the model), so that the target detection model makes a wrong judgment.
The target gradient loss function encodes the bias rule of identifying abnormal objects as normal object classes, i.e., the rule of gradient-increasing the loss value between the prediction result of an abnormal object and its classification label information, so that the prediction result of the abnormal object deviates more and more from the true target value. In order to add the disturbance in the correct direction, the gradient of the target gradient loss function with respect to the image to be processed, i.e., the target gradient image, can be calculated; it is an image of the same size as the image to be processed.
In some embodiments, the step of calculating the target gradient image corresponding to the image to be processed through the target gradient loss function may include:
(1) Calculating a gradient value of each pixel in the image to be processed according to a target gradient loss function to obtain a first gradient image;
(2) Processing the first gradient image through a symbol function to obtain a second gradient image;
(3) Calculating the product of the preset disturbance value and the second gradient image to obtain the target gradient image.
Wherein, the following formula can be referred to together:
δ = x' + ε · sign(∇_{x'} L)

where δ is the first countermeasure image after adding the disturbance, x' is the current image to be processed, ε is the disturbance value (which may be a constant), and sign() is the sign function (generally written sign(x)), which takes the sign of a number: sign(x) = 1 when x > 0, sign(x) = 0 when x = 0, and sign(x) = -1 when x < 0. ∇_{x'} L represents the gradient of the target gradient loss function with respect to each pixel in x'. Thus the gradient value of each pixel of the image to be processed can be calculated through the target gradient loss function, yielding the first gradient image, which has the same size as the image to be processed except that the value of each pixel is a gradient. The first gradient image is then processed through the sign function, converting each pixel from a gradient to 1, 0, or -1, yielding the second gradient image. Finally, the preset disturbance value ε is multiplied by the second gradient image to obtain the target gradient image, which biases the pre-trained target detection model's recognition of abnormal objects toward the normal object classes.
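The sign step of this formula can be sketched as follows (a toy array stands in for the image; in practice the gradient would come from backpropagating the target gradient loss, which is omitted here):

```python
import numpy as np

def make_target_gradient_image(gradient, eps):
    """Second gradient image = sign of the first gradient image;
    target gradient image = eps * sign(gradient)."""
    return eps * np.sign(gradient)

grad = np.array([2.0, -3.0, 0.0])             # first gradient image (toy values)
print(make_target_gradient_image(grad, 0.5))  # yields [0.5, -0.5, 0.0]
```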
In step 106, the target gradient image is superimposed on the image to be processed, generating an intermediate countermeasure image.
Since the target gradient image biases the pre-trained target detection model's recognition of abnormal objects toward the normal object classes, it can be directly superimposed on the image to be processed to generate the intermediate countermeasure image. Owing to the disturbance added to it, the trained target detection model's recognition of abnormal objects in the intermediate countermeasure image is biased toward recognition as normal object classes.
In order to achieve a better countermeasure effect, further enhancement is required: the intermediate countermeasure image is used as the image to be processed for gradient calculation in the next iteration, i.e., the gradient on the recognition of abnormal objects needs to be increased continuously; please refer to step 107.
In some embodiments, the step of superimposing the target gradient image onto the image to be processed to generate an intermediate countermeasure image may include:
(1) The target gradient image is overlapped to an image to be processed, and a first countermeasure image is obtained;
(2) Acquiring the image to be processed that was first input into the pre-trained target detection model, and determining it as the initial image;
(3) Limiting the pixel difference of each pixel between the first countermeasure image and the initial image to within the preset disturbance value, and generating the intermediate countermeasure image.
Referring again to the above formula, the target gradient image is superimposed on the image to be processed to obtain the first countermeasure image.
Further, to meet the requirement that the image is not easily perceived as disturbed by a person, the image to be processed that has not undergone any disturbance operation, i.e., the one first input into the pre-trained target detection model, is determined as the initial image. Each pixel in the first countermeasure image is then compared with the corresponding pixel in the initial image, and the pixel difference is limited to within the preset disturbance value, i.e., the absolute value of each pixel difference must lie within the preset disturbance value. If the absolute value of the difference between a pixel in the first countermeasure image and the corresponding pixel in the initial image is greater than the preset disturbance value and the pixel in the first countermeasure image is greater than the corresponding pixel in the initial image, the value of that pixel in the first countermeasure image is reduced; if the absolute value of the difference is greater than the preset disturbance value and the pixel in the first countermeasure image is smaller than the corresponding pixel in the initial image, the value of that pixel is increased. This continues until the absolute value of the pixel difference of every pixel between the first countermeasure image and the initial image lies within the preset disturbance value, and the first countermeasure image satisfying this condition is determined as the intermediate countermeasure image.
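The per-pixel limiting described above amounts to clamping the candidate image into an interval around the initial image, which can be sketched with np.clip (the function name is my own):

```python
import numpy as np

def limit_to_disturbance(first_adv, initial, eps):
    """Clamp every pixel of the first countermeasure image so that its
    difference from the initial image stays within +/- eps."""
    return np.clip(first_adv, initial - eps, initial + eps)

x0 = np.array([1.0, 0.0, 0.5])
adv = np.array([1.3, -0.2, 0.55])
print(limit_to_disturbance(adv, x0, 0.1))  # clipped to [1.1, -0.1, 0.55]
```

Pixels that overshoot the bound are reduced, pixels that undershoot are increased, and pixels already within the bound are left unchanged, exactly as the text describes.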
In step 107, the step of generating the intermediate countermeasure image is repeatedly performed until no detection object identified as the first category by the target detection model exists in the intermediate countermeasure image, and the intermediate countermeasure image is determined as the target countermeasure image.
In order to achieve a complete attack effect, that is, to make the pre-trained target detection model identify abnormal objects as normal object classes, steps 101 to 106 of generating the intermediate countermeasure image may be repeatedly performed, with the intermediate countermeasure image taken as the image to be processed for gradient calculation in the next iteration, so that the gradient increment on abnormal recognition grows larger and the interference becomes stronger, until the target detection model no longer identifies any detection object in the intermediate countermeasure image as the first category, i.e., the classification prediction information of every detection object corresponds to classification label information of a normal object class. This indicates that the attack is complete, and the intermediate countermeasure image is determined as the target countermeasure image.
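Putting steps 101 to 106 together, the iteration loop has roughly this shape (the callbacks `grad_fn` and `has_abnormal_fn` are placeholders standing in for the model-dependent pieces; this is a sketch under those assumptions, not the patent's implementation):

```python
import numpy as np

def generate_target_countermeasure_image(x0, grad_fn, has_abnormal_fn,
                                         step_eps, ball_eps, max_iter=100):
    """Iterate sign-gradient steps until the detector no longer reports
    any first-category (abnormal) object, keeping every pixel within
    +/- ball_eps of the initial image x0."""
    x = x0.copy()
    for _ in range(max_iter):
        if not has_abnormal_fn(x):                   # attack complete
            break
        x = x + step_eps * np.sign(grad_fn(x))       # superimpose gradient image
        x = np.clip(x, x0 - ball_eps, x0 + ball_eps) # stay within disturbance
    return x

# Toy check: the stand-in "detector" flags x while its mean stays below 0.3
out = generate_target_countermeasure_image(
    np.zeros(4), grad_fn=lambda v: np.ones(4),
    has_abnormal_fn=lambda v: float(v.mean()) < 0.3,
    step_eps=0.1, ball_eps=1.0)
print(out)  # each pixel ends near 0.3
```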
In the related art, the adversarial attack mode only makes the object detection model identify an abnormal object as some other class, which may itself still be an abnormal class or trigger a warning. In contrast, by increasing the loss for all abnormal classes, the object detection model is made to identify abnormal objects as normal object classes, giving stronger attack performance without repeated independent training and greatly improving the generation efficiency and performance of the countermeasure image.
As can be seen from the above, in the embodiment of the present application, the image to be processed is input into the pre-trained target detection model, and the classification prediction information and region prediction coordinate information of each detection object are output; the classification prediction information and corresponding classification label information of each detection object are input into the prediction classification loss function, and the gradient calculation of the type corresponding to the category of the classification label information is performed to obtain the corresponding target classification loss, where the prediction classification loss function comprises a gradient increment calculation item used for gradient increment calculation of the classification loss of detection objects whose classification label information belongs to the first category, and a gradient decrement calculation item used for gradient decrement calculation of the classification loss of detection objects whose classification label information belongs to the second category; the target positioning loss is determined according to the region prediction coordinate information and corresponding region label coordinate information of each detection object; the target gradient loss function is constructed based on the target classification loss and the target positioning loss; the target gradient image corresponding to the image to be processed is calculated through the target gradient loss function; the target gradient image is superimposed on the image to be processed to generate the intermediate countermeasure image, which serves as the image to be processed for gradient calculation in the next iteration; and the step of generating the intermediate countermeasure image is repeatedly performed until no detection object identified as the first category by the target detection model exists in the intermediate countermeasure image, which is then determined as the target countermeasure image.
Accordingly, each detection object undergoes the corresponding gradient increment or gradient decrement calculation according to the category of its classification label information through the preset classification loss function, yielding the target classification loss; the target gradient loss function is constructed based on it; the image to be processed is processed through the target gradient loss function to generate the target gradient image, which is superimposed on the image to be processed to generate the intermediate countermeasure image; and this process is repeated until no detection object identified as the first category by the target detection model exists in the intermediate countermeasure image, which is determined as the target countermeasure image. Compared with schemes in which the generated countermeasure image only makes the target detection model's recognition of the detection frame of an abnormal object inaccurate, the present application makes the detection object of an abnormal object be recognized as a normal object, completely defeating the recognition performance of the target detection model and greatly improving the efficiency and performance of countermeasure image generation.
In the present embodiment, description will be given taking an example in which the countermeasure image generating apparatus is specifically integrated in a server, with specific reference to the following description.
Referring to fig. 3, fig. 3 is another flow chart of the countermeasure image generation method according to the embodiment of the present application. The method flow may include: the server generates an intermediate countermeasure image and repeatedly performs the step of generating the intermediate countermeasure image until no detection object identified as the first category by the target detection model exists in the intermediate countermeasure image, and determines the intermediate countermeasure image as the target countermeasure image. In an embodiment, the specific steps by which the server generates the intermediate countermeasure image include steps 201 to 208.
In step 201, the server inputs the image to be processed into a pre-trained target detection model, and outputs classification prediction information and region prediction coordinate information of each detection object.
To better explain the embodiment of the present application, the scenario may be intelligent security inspection of carry-on luggage at an airport or subway. In this scenario, object categories that trigger abnormal alarms, such as controlled knives, guns, or water bottles, need to be detected; given the high safety requirements of such supervision, the robustness of the target detection model is often evaluated by means of adversarial attacks.
Any image to be processed can be obtained in advance and input into a pre-trained target detection model, the target detection model can identify the image to be processed, and classification prediction information and region prediction coordinate information of each detection object are output. The classification prediction information may be a vector and the region prediction coordinate information may be a vector.
The classification prediction information characterizes a predicted object type, such as a control knife, gun, water bottle, cell phone, or computer, and the region prediction coordinate information characterizes a detection frame of the predicted object.
In step 202, the server calculates a spatial distance between the classification prediction information and each classification label information, and determines classification label information corresponding to the classification prediction information of each detection object according to the size of the spatial distance.
The classification label information may represent an object such as a controlled knife, gun, water bottle, mobile phone, or computer as a vector: (1, 0, 0, 0, 0) is the classification label information of the controlled knife, (0, 1, 0, 0, 0) that of the gun, (0, 0, 1, 0, 0) that of the water bottle, (0, 0, 0, 1, 0) that of the mobile phone, and (0, 0, 0, 0, 1) that of the computer. That is, each vector element characterizes whether the object corresponds to that class label: a value closer to 0 represents a smaller probability of being that class label, and a value closer to 1 a greater probability. The spatial distance is the Euclidean distance.
In this way, the classification prediction information may be a vector of the pre-trained target detection model's prediction probabilities for each class label, for example (0.2, 0.3, 0.8, 0.1, 0.01). The Euclidean distance between the classification prediction information and each piece of classification label information may be calculated, and the label with the smallest Euclidean distance taken as the match; that is, it may be determined that the classification label information corresponding to this classification prediction information is (0, 0, 1, 0, 0), i.e., the class label is the water bottle. In this manner, the classification label information corresponding to the classification prediction information of each detection object can be determined in turn.
In step 203, the server determines the category to which the classification label information of each detection object belongs; determines detection objects whose category is the first category as first detection objects and those whose category is the second category as second detection objects; inputs the classification prediction information of the first detection objects and the corresponding classification label information into the gradient increment calculation term for gradient increment calculation to obtain the first classification loss; inputs the classification prediction information of the second detection objects and the corresponding classification label information into the gradient decrement calculation term for gradient decrement calculation to obtain the second classification loss; sums the first classification loss and the second classification loss to obtain the third classification loss; and calculates the ratio of the third classification loss to the total number of detection objects to obtain the target classification loss.
Wherein, a normal category and an abnormal category can be set, the first category being the abnormal category and the second category being the normal category. The abnormal and normal categories may be preset; for example, the controlled knife, the gun and the water bottle are determined as abnormal categories, and objects such as the mobile phone and the computer are determined as normal categories.
Therefore, whether the category to which the classification label information of each detection object belongs is the first category or the second category is determined. The detection object whose category is the first category, i.e., an abnormal object, is then determined as a first detection object, and the detection object whose category is the second category, i.e., a normal object, is determined as a second detection object.
Further, please refer to the following formula:

L_k = 1(y_k ∈ V)·L_CE(F(x)_k, y_k) − 1(y_k ∈ U)·L_CE(F(x)_k, y_k)

Here L_k corresponds to the classification loss term of the k-th detection object, L_CE is the cross-entropy loss, F(x)_k is the classification prediction information of the k-th detection object, y_k is the corresponding classification label information, V is the set of first-category (abnormal) labels, and U is the set of second-category (normal) labels. The term 1(y_k ∈ V)·L_CE(F(x)_k, y_k) is the gradient increment calculation item: when y_k ∈ V, that is, when the class to which the classification label information of the detection object belongs is the first class, this item is used for calculation. The term −1(y_k ∈ U)·L_CE(F(x)_k, y_k) is the gradient decrement calculation item: when y_k ∈ U, that is, when the class to which the classification label information belongs is the second class, this item is used for calculation.
Based on the above formula, the classification prediction information of the first detection object and the corresponding classification label information are substituted into the gradient increment calculation item for gradient increment calculation to obtain the first classification loss, and the classification prediction information of the second detection object and the corresponding classification label information are substituted into the gradient decrement calculation item for gradient decrement calculation to obtain the second classification loss.
Please refer to the following formula:

L_cls = (1/n)·Σ_{k=1}^{n} L_k

Based on this formula, the first classification losses and the second classification losses are summed to obtain the third classification loss, and the ratio of the third classification loss to the total number n of detection objects identified in the image to be detected is calculated to obtain the target classification loss L_cls.
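The signed averaging above can be sketched as follows; the abnormal/normal index sets V and U are assumptions for illustration (following the five-class example earlier in the text):

```python
import numpy as np

V = {0, 1, 2}  # assumed first-category (abnormal) labels: knife, gun, water bottle
U = {3, 4}     # assumed second-category (normal) labels: phone, computer

def cross_entropy(pred, label_idx, eps=1e-12):
    # Cross-entropy of a predicted probability vector against a one-hot label.
    return -np.log(pred[label_idx] + eps)

def target_classification_loss(preds, label_idxs):
    # L_k enters with a positive sign for abnormal labels (gradient increment
    # item) and a negative sign for normal labels (gradient decrement item);
    # the target classification loss is the sum divided by the number of
    # detection objects n.
    total = 0.0
    for p, y in zip(preds, label_idxs):
        ce = cross_entropy(np.asarray(p, dtype=float), y)
        total += ce if y in V else -ce
    return total / len(label_idxs)
```

With one abnormal and one normal object predicted with equal confidence, the two terms cancel, which is what lets the attack push the abnormal losses up while pulling the normal losses down.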
In step 204, the server determines a target positioning loss according to the region prediction coordinate information and the corresponding region tag coordinate information of each detection object.
The area prediction coordinate information and the corresponding area tag coordinate information of each detection object may be substituted into the existing prediction positioning loss function to perform calculation, so as to obtain the corresponding target positioning loss L loc.
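The text leaves the choice of positioning loss to "the existing prediction positioning loss function"; one common choice, used here purely as an illustrative assumption, is the smooth-L1 loss between predicted and label box coordinates:

```python
import numpy as np

def smooth_l1_loc_loss(pred_box, label_box):
    # Smooth-L1 over box coordinates (x1, y1, x2, y2): quadratic for small
    # coordinate errors, linear for large ones. This is one example of an
    # "existing prediction positioning loss function", not the one the
    # patent prescribes.
    d = np.abs(np.asarray(pred_box, dtype=float) - np.asarray(label_box, dtype=float))
    return float(np.sum(np.where(d < 1.0, 0.5 * d ** 2, d - 0.5)))
```

Summing this loss over all detection objects gives the target positioning loss L_loc.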
In step 205, the server performs weighting processing on the target classification loss with a preset weight, so as to obtain a weighted target classification loss, and constructs a target gradient loss function based on the weighted target classification loss and the target positioning loss.
Wherein, the following formula can be referred to together:
L = γ·L_cls + L_loc
Here γ is a hyperparameter, that is, the preset weight, and L is the target gradient loss function. Since the rule of shifting the identification of abnormal objects towards normal object classes is carried by the target classification loss, the target classification loss may be weighted; for example, γ may be 0.95. Based on the above formula, the target classification loss L_cls is weighted by γ to obtain the weighted target classification loss γ·L_cls, and the target gradient loss function is constructed as the sum of the weighted target classification loss γ·L_cls and the target positioning loss L_loc.
In step 206, the server calculates a gradient value of each pixel in the image to be processed according to the target gradient loss function to obtain a first gradient image, processes the first gradient image through the sign function to obtain a second gradient image, and calculates a product of a preset disturbance value and the second gradient image to obtain the target gradient image.
According to the embodiment of the application, a countermeasure image needs to be generated as a countermeasure sample to attack the pre-trained target detection model. The countermeasure image can be understood as the image to be processed with a small disturbance added that cannot be perceived by the human eye, so that the target detection model makes an erroneous judgment on abnormal objects.
Wherein, the sign function is sign() (generally expressed as sign(x)); its function is to take the sign (positive or negative) of a number: when x > 0, sign(x) = 1; when x = 0, sign(x) = 0; when x < 0, sign(x) = −1.
How the target gradient image is calculated can be understood by referring to the following formula:

δ = x' + ε·sign(∇_{x'} L)

Here δ is the first countermeasure image with the disturbance added, x' is the current image to be processed, and ε is the preset disturbance value, which may be a constant. ∇_{x'} L denotes the gradient value of the target gradient loss function with respect to each pixel in x', which can be understood as the first gradient image; the gradient value of each pixel point in the image to be processed is thus obtained through the target gradient loss function. The first gradient image has the same size as the image to be processed; the difference is that the value of each pixel on the first gradient image is a gradient. The first gradient image is therefore processed through the sign function, each pixel being converted from a gradient into 1, 0 or −1, to obtain the second gradient image. Finally, the preset disturbance value ε is multiplied by the second gradient image to obtain the target gradient image. The target gradient image shifts the identification of the pre-trained target detection model towards the normal object classes; that is, it shifts the identification away from the controlled knife, the gun and the water bottle and towards classes such as the mobile phone and the computer.
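Given the first gradient image (the per-pixel gradient of the target gradient loss function), the sign and scaling steps of step 206 can be sketched as below; the value of ε is an illustrative assumption:

```python
import numpy as np

def target_gradient_image(first_gradient_image, eps=0.03):
    # Step 206: sign() turns each per-pixel gradient into 1, 0 or -1
    # (the second gradient image); multiplying by the preset disturbance
    # value eps gives the target gradient image.
    second_gradient_image = np.sign(first_gradient_image)
    return eps * second_gradient_image
```

Every pixel of the result is ε, 0 or −ε, so the added disturbance has the same magnitude at each perturbed pixel regardless of the gradient's size.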
In step 207, the server superimposes the target gradient image on the image to be processed to obtain the first countermeasure image, and acquires the image to be processed that was first input into the pre-trained target detection model, determining it as the initial image.
Wherein, referring again to the formula δ = x' + ε·sign(∇_{x'} L): the target gradient image ε·sign(∇_{x'} L) is superimposed on the image to be processed x' to obtain the first countermeasure image δ.
Further, in order to meet the requirement that the disturbed image is not easily perceived by a person, the image to be processed that has not undergone any disturbance operation, i.e., the image first input into the pre-trained target detection model, is acquired and determined as the initial image.
In step 208, the server calculates a pixel difference between each first pixel in the first countermeasure image and a corresponding second pixel in the initial image to form a pixel difference set, limits each pixel difference in the pixel difference set to be within a preset disturbance value range through a clipping function, and superimposes the pixel difference set after the pixel difference limit on the initial image to generate an intermediate countermeasure image.
Please refer to the following formula:
x″ = x + clip(δ − x, −ε, ε)
Here x″ is the intermediate countermeasure image and clip() is a clipping function. The pixel difference of each pixel between the first countermeasure image δ and the initial image x is calculated to form a pixel difference set. The minimum value of the preset disturbance value range is the negative preset disturbance value (−ε) and the maximum value is the positive preset disturbance value (ε), so the preset disturbance value range is [−ε, ε]. Each pixel difference is limited to this range by the clipping function: a pixel difference larger than ε is set to ε, and a pixel difference smaller than −ε is set to −ε. The change amplitude of each pixel therefore does not exceed ε, which ensures that the disturbance is imperceptible to the naked eye. Based on this, the pixel difference set after limiting is superimposed on the initial image to generate the intermediate countermeasure image.
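The clipping step of step 208, x″ = x + clip(δ − x, −ε, ε), can be sketched as follows (the ε value is an illustrative assumption):

```python
import numpy as np

def clip_to_initial(delta, x_initial, eps=0.03):
    # Limit each pixel difference between the first countermeasure image
    # delta and the initial image x_initial to [-eps, eps], then superimpose
    # the limited differences back on the initial image. This keeps every
    # pixel of the intermediate countermeasure image within eps of the
    # unperturbed initial image.
    diff = np.clip(delta - x_initial, -eps, eps)
    return x_initial + diff
```

Because the clip is always taken relative to the initial image, the accumulated disturbance stays bounded by ε no matter how many iterations are performed.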
In step 209, the server repeatedly performs the step of generating the intermediate challenge image until no detection object identified as the first class by the target detection model exists in the intermediate challenge image, and determines the intermediate challenge image as the target challenge image.
In order to achieve a complete attack effect, that is, to make the pre-trained target detection model identify the abnormal objects as normal object classes, the steps of generating the intermediate countermeasure image, that is, steps 201 to 208, are repeatedly performed, with the intermediate countermeasure image of each iteration taken as the image to be processed for gradient calculation in the next iteration. The gradient of the abnormal identification thus increases and the interference becomes more and more severe, until the target detection model cannot identify the controlled knife, the gun or the water bottle in the intermediate countermeasure image, that is, until the classification prediction information of every detection object corresponds to the classification label information of a normal object class (the mobile phone or the computer). The attack is then complete, and the intermediate countermeasure image is determined as the target countermeasure image.
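Steps 201 to 209 taken together can be sketched as an iterative loop. `loss_gradient` and `detects_abnormal` below are placeholder assumptions standing in for the pre-trained target detection model and its loss; ε is illustrative:

```python
import numpy as np

def generate_target_countermeasure_image(x_initial, loss_gradient,
                                         detects_abnormal, eps=0.03,
                                         max_iters=100):
    # Repeat the perturb-and-clip step until the model no longer detects any
    # first-category (abnormal) object, then return the intermediate
    # countermeasure image as the target countermeasure image.
    x = x_initial.copy()
    for _ in range(max_iters):
        # First countermeasure image: gradient-sign step on the current image.
        delta = x + eps * np.sign(loss_gradient(x))
        # Intermediate countermeasure image: clip relative to the initial image.
        x = x_initial + np.clip(delta - x_initial, -eps, eps)
        if not detects_abnormal(x):
            break
    return x
```

A toy run with a constant gradient and a threshold "detector" shows the loop terminating once the perturbed image crosses the detector's decision boundary.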
In the related art, a countermeasure image may cause the target detection model to identify a controlled knife as a gun, which still triggers a warning. The embodiment of the application, by contrast, increases the loss of all abnormal categories so that the target detection model identifies all abnormal objects as normal object classes and no warning is triggered. The attack is therefore stronger, repeated independent training is not needed, and the efficiency and performance of generating the countermeasure image are greatly improved; moreover, training with the target countermeasure image increases the robustness of the target detection model.
In some embodiments, in order to increase the robustness of the detection model (which may also be referred to as an object detection model), the generated target countermeasure image and conventional training image samples may be input together, as a sample image set, into the initial object detection model for model training, so as to obtain a trained object detection model. The trained object detection model has better robustness and is not easily confused by an attack image, so that inputting an image to be identified into the trained object detection model for object identification achieves a better identification effect.
As can be seen from the above, in the embodiment of the present application, the image to be processed is input into the pre-trained target detection model, and the classification prediction information and the region prediction coordinate information of each detection object are output; the classification prediction information and the corresponding classification label information of each detection object are input into the prediction classification loss function, and gradient calculation of the type corresponding to the class to which the classification label information belongs is performed to obtain the corresponding target classification loss; the prediction classification loss function comprises a gradient increment calculation item and a gradient decrement calculation item, wherein the gradient increment calculation item is used for gradient increment calculation of the classification loss of a detection object whose classification label information belongs to the first category, and the gradient decrement calculation item is used for gradient decrement calculation of the classification loss of a detection object whose classification label information belongs to the second category; the target positioning loss is determined according to the region prediction coordinate information of each detection object and the corresponding region label coordinate information; the target gradient loss function is constructed based on the target classification loss and the target positioning loss; the target gradient image corresponding to the image to be processed is calculated through the target gradient loss function; the target gradient image is superimposed on the image to be processed to generate an intermediate countermeasure image, the intermediate countermeasure image being the image to be processed for gradient calculation in the next iteration; and the step of generating the intermediate countermeasure image is repeatedly performed until no detection object identified as the first category by the target detection model exists in the intermediate countermeasure image, whereupon the intermediate countermeasure image is determined as the target countermeasure image. Accordingly, corresponding gradient increment or gradient decrement calculation is performed for each detection object according to the category to which its classification label information belongs, through the preset classification loss function, to obtain the target classification loss; the target gradient loss function is constructed based on the target classification loss; the image to be processed is processed through the target gradient loss function to generate the target gradient image, which is superimposed on the image to be processed to generate the intermediate countermeasure image; and the process is repeated until no detection object identified as the first category by the target detection model exists in the intermediate countermeasure image, which is then determined as the target countermeasure image. Compared with schemes in which the generated countermeasure image merely makes the target detection model inaccurate in recognizing the target detection frame corresponding to an abnormal object, the embodiment of the application makes the recognition result for every abnormal object be a normal object, completely destroys the recognition performance of the target detection model, and greatly improves the efficiency and performance of generating the countermeasure image.
The embodiment of the application also provides a training method of the object detection model, which specifically comprises the following steps: inputting a sample image set into an initial object detection model, and performing model training to obtain a trained object detection model, wherein the sample image set comprises a sample challenge image, and the sample challenge image is generated by the challenge image generation method in the embodiment.
In this embodiment, the object detection model is different from the aforementioned object detection model in that the training sample of the object detection model includes a sample challenge image, and the training sample of the aforementioned object detection model does not include a sample challenge image. In the embodiment, in the model training process, in each model iterative training process, outputting a loss value corresponding to a current object detection model, and when the loss value meets a preset requirement, finishing training, and determining that the current object detection model is a trained object detection model; or under the condition that the number of times of model iterative training meets the preset number of times, finishing training to obtain a trained object detection model. The sample challenge image in this embodiment is obtained from the challenge image generated by the challenge image generating method described in the above embodiment, and the specific generating method is referred to the above description and will not be repeated here. The object detection model is trained by using the sample countermeasure images, so that the obtained trained object detection model can increase the recognition accuracy of the model, increase the capability of the model against attacks and increase the robustness of the object detection model.
The embodiment of the application also provides an image recognition method, which is used for inputting the image to be recognized into an object detection model for object recognition to obtain a recognition result, wherein the object detection model is obtained by training according to the training method of the object detection model.
In an actual application scene, for example intelligent security inspection of carried luggage in an airport or subway, object types that should trigger an abnormal alarm, such as a controlled knife, a gun or a water bottle, need to be detected: an image corresponding to the luggage is obtained and input into the object detection model for object recognition, and a corresponding recognition result is obtained.
The embodiment of the application provides a structural schematic diagram of a countermeasure image generating device, wherein the countermeasure image generating device can comprise a generating module and an iteration module, and the generating module comprises an output unit, an input unit, a first determining unit, a constructing unit, a calculating unit and a superposition unit.
The generation module is used for generating an intermediate countermeasure image, and specifically comprises the following steps:
And the output unit is used for inputting the image to be processed into the pre-trained target detection model and outputting the classification prediction information and the region prediction coordinate information of each detection object. In some embodiments, the apparatus further comprises a second determining unit for:
calculating a spatial distance between the classification prediction information and each classification label information;
and determining classification label information corresponding to the classification prediction information of each detection object according to the size of the spatial distance.
And the input unit is used for inputting the classification prediction information of each detection object and the corresponding classification label information into the prediction classification loss function, and carrying out gradient calculation of the corresponding type of the class to which the corresponding classification label information belongs to obtain the corresponding target classification loss.
The predictive classification loss function comprises a gradient increment calculation item and a gradient decrement calculation item, wherein the gradient increment calculation item is used for gradient increment calculation of classification loss of a detection object of which the classification label information belongs to a first category, and the gradient decrement calculation item is used for gradient decrement calculation of classification loss of a detection object of which the classification label information belongs to a second category.
In some embodiments, the predictive classification loss function includes a gradient increment calculation term for gradient increment calculation and a gradient decrement calculation term for gradient decrement calculation, the input unit 303 including:
a first determining subunit, configured to determine a category to which the classification tag information of each detection object belongs;
A second determination subunit configured to determine, as a first detection object, a detection object whose category is a first category, and determine, as a second detection object, a detection object whose category is a second category;
The first input subunit is used for inputting the classification prediction information of the first detection object and the corresponding classification label information into a gradient increment calculation item to perform gradient increment calculation so as to obtain a first classification loss;
the second input subunit is used for inputting the classification prediction information of the second detection object and the corresponding classification label information into a gradient decreasing calculation item for gradient decreasing calculation to obtain a second classification loss;
And a third determination subunit configured to determine a target classification loss based on the first classification loss and the second classification loss.
In some embodiments, the third determining subunit is configured to:
Summing the first classification loss and the second classification loss to obtain a third classification loss;
And calculating the ratio of the third classification loss to the number of the detected objects to obtain the target classification loss.
And the first determining unit is used for determining the target positioning loss according to the region prediction coordinate information of each detection object and the corresponding region label coordinate information.
And a construction unit for constructing a target gradient loss function based on the target classification loss and the target localization loss.
In some embodiments, the building unit is configured to:
Weighting the target classification loss by a preset weight to obtain a weighted target classification loss;
And constructing a target gradient loss function based on the weighted target classification loss and the target positioning loss.
And the calculating unit is used for calculating a target gradient image corresponding to the image to be processed through the target gradient loss function.
In some embodiments, the computing unit is configured to:
Calculating a gradient value of each pixel in the image to be processed according to the target gradient loss function to obtain a first gradient image;
Processing the first gradient image through a symbol function to obtain a second gradient image;
and calculating the product of the preset disturbance value and the second gradient image to obtain a target gradient image.
And the superposition unit is used for superposing the target gradient image to the image to be processed and generating an intermediate countermeasure image, wherein the intermediate countermeasure image is the image to be processed for gradient calculation in the next iteration.
In some embodiments, the superposition unit comprises:
A superposition subunit, configured to superimpose the target gradient image on an image to be processed, so as to obtain a first countermeasure image;
The acquisition subunit is used for acquiring an image to be processed which is input into the pre-trained target detection model for the first time and determining the image to be processed as an initial image;
And a generation subunit configured to limit the pixel difference of each pixel between the first countermeasure image and the initial image within the preset disturbance value range, and generate the intermediate countermeasure image.
In some embodiments, the generating subunit is configured to:
calculating pixel differences between each first pixel in the first contrast image and a corresponding second pixel in the initial image to form a pixel difference set;
defining each pixel difference in the set of pixel differences within a preset disturbance value range by a clipping function;
The minimum value of the preset disturbance value range is a negative preset disturbance value, and the maximum value of the preset disturbance value range is a positive preset disturbance value;
A set of pixel differences after pixel difference definition is superimposed on the initial image to generate an intermediate challenge image.
And an iteration module for repeatedly performing the step of generating an intermediate challenge image until no detection object identified as the first class by the target detection model exists in the intermediate challenge image, and determining the intermediate challenge image as the target challenge image.
The specific implementation of each unit can be referred to the previous embodiments, and will not be repeated here.
The embodiment of the application also provides a computer device, as shown in fig. 4, which shows a schematic structural diagram of a server according to the embodiment of the application, specifically:
The computer device may include one or more processors 401 of a processing core, memory 402 of one or more computer readable storage media, a power supply 403, and an input unit 404, among other components. Those skilled in the art will appreciate that the computer device structure shown in FIG. 4 is not limiting of the computer device and may include more or fewer components than shown, or may be combined with certain components, or a different arrangement of components. Wherein:
the processor 401 is a control center of the computer device, connects various parts of the entire computer device using various interfaces and lines, and performs various functions of the computer device and processes data by running or executing software programs and/or modules stored in the memory 402, and calling data stored in the memory 402, thereby performing overall monitoring of the computer device. Optionally, processor 401 may include one or more processing cores; alternatively, the processor 401 may integrate an application processor and a modem processor, wherein the application processor mainly processes an operating system, a user interface, an application program, etc., and the modem processor mainly processes wireless communication. It will be appreciated that the modem processor described above may not be integrated into the processor 401.
The memory 402 may be used to store software programs and modules, and the processor 401 executes various functional applications and data processing by executing the software programs and modules stored in the memory 402. The memory 402 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program (such as a sound playing function, an image playing function, etc.) required for at least one function, and the like; the storage data area may store data created according to the use of the server, etc. In addition, memory 402 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid-state storage device. Accordingly, the memory 402 may also include a memory controller to provide the processor 401 with access to the memory 402.
The computer device further comprises a power supply 403 for supplying power to the various components, optionally, the power supply 403 may be logically connected to the processor 401 by a power management system, so that functions of charge, discharge, and power consumption management are performed by the power management system. The power supply 403 may also include one or more of any of a direct current or alternating current power supply, a recharging system, a power failure detection circuit, a power converter or inverter, a power status indicator, and the like.
The computer device may also include an input unit 404, which input unit 404 may be used to receive input numeric or character information and to generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control.
Although not shown, the computer device may further include a display unit or the like, which is not described herein. In particular, in this embodiment, the processor 401 in the computer device loads executable files corresponding to the processes of one or more application programs into the memory 402 according to the following instructions, and the processor 401 executes the application programs stored in the memory 402, so as to implement the various method steps provided in the foregoing embodiment, as follows:
The specific steps of generating the intermediate countermeasure image include: inputting the image to be processed into the pre-trained target detection model, and outputting the classification prediction information and the region prediction coordinate information of each detection object; inputting the classification prediction information and the corresponding classification label information of each detection object into the prediction classification loss function, and performing gradient calculation of the type corresponding to the class to which the classification label information belongs to obtain the corresponding target classification loss; the prediction classification loss function comprises a gradient increment calculation item and a gradient decrement calculation item, wherein the gradient increment calculation item is used for gradient increment calculation of the classification loss of a detection object whose classification label information belongs to the first category, and the gradient decrement calculation item is used for gradient decrement calculation of the classification loss of a detection object whose classification label information belongs to the second category; determining the target positioning loss according to the region prediction coordinate information of each detection object and the corresponding region label coordinate information; constructing the target gradient loss function based on the target classification loss and the target positioning loss; calculating the target gradient image corresponding to the image to be processed through the target gradient loss function; and superimposing the target gradient image on the image to be processed to generate an intermediate countermeasure image, the intermediate countermeasure image being the image to be processed for gradient calculation in the next iteration. The step of generating the intermediate countermeasure image is repeatedly performed until no detection object identified as the first category by the target detection model exists in the intermediate countermeasure image, and the intermediate countermeasure image is determined as the target countermeasure image.
In the foregoing embodiments, the descriptions of the embodiments are focused on, and for those portions of an embodiment that are not described in detail, reference may be made to the foregoing detailed description of the countermeasure image generation method or the data processing method, which are not described herein.
Those of ordinary skill in the art will appreciate that all or a portion of the steps of the various methods of the above embodiments may be performed by instructions, or by instructions controlling associated hardware, which may be stored in a computer-readable storage medium and loaded and executed by a processor.
To this end, an embodiment of the present application provides a computer-readable storage medium having stored therein a plurality of instructions that can be loaded by a processor to perform steps in any one of the countermeasure image generation method or the data processing method provided by the embodiment of the present application. For example, the instructions may perform the steps of:
The specific steps of generating the intermediate countermeasure image include: inputting an image to be processed into a pre-trained target detection model, and outputting classification prediction information and region prediction coordinate information of each detection object; inputting the classification prediction information and corresponding classification label information of each detection object into a prediction classification loss function, and performing gradient calculation of the type corresponding to the category to which the classification label information belongs, to obtain a corresponding target classification loss, where the prediction classification loss function includes a gradient increment calculation item used for gradient increment calculation of the classification loss of detection objects whose classification label information belongs to a first category, and a gradient decrement calculation item used for gradient decrement calculation of the classification loss of detection objects whose classification label information belongs to a second category; determining a target positioning loss according to the region prediction coordinate information of each detection object and the corresponding region label coordinate information; constructing a target gradient loss function based on the target classification loss and the target positioning loss; calculating, through the target gradient loss function, a target gradient image corresponding to the image to be processed; and superimposing the target gradient image onto the image to be processed to generate an intermediate countermeasure image, where the intermediate countermeasure image serves as the image to be processed for gradient calculation in the next iteration. The step of generating the intermediate countermeasure image is repeatedly performed until no detection object identified by the target detection model as the first category exists in the intermediate countermeasure image, at which point the intermediate countermeasure image is determined as the target countermeasure image.
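The loop described above has the shape of an iterative sign-gradient attack. Below is a minimal numerical sketch of that iteration under assumed simplifications: the "detector", the step size ALPHA, the budget EPS, and the masks are all illustrative stand-ins, not the patent's actual model, loss, or hyperparameters.

```python
import numpy as np

ALPHA = 0.05     # per-step disturbance value (assumed)
EPS = 0.5        # total perturbation budget used by the clipping step (assumed)
MAX_ITERS = 100

def first_class_detected(image, first_mask):
    """Toy detector: a first-category object 'exists' while the masked region stays bright."""
    return image[first_mask].mean() > 0.5

def loss_gradient(image, first_mask):
    """Toy gradient of a loss that rises as first-category pixels darken:
    ascending it pushes those pixels down and suppresses the detection."""
    g = np.zeros_like(image)
    g[first_mask] = -1.0
    return g

def generate_target_adversarial(image, first_mask):
    init = image.copy()                       # image first input to the model
    adv = image.copy()
    for _ in range(MAX_ITERS):
        if not first_class_detected(adv, first_mask):
            break                             # termination condition of the method
        # target gradient image: sign of the gradient scaled by the step size
        grad_img = ALPHA * np.sign(loss_gradient(adv, first_mask))
        adv = adv + grad_img                  # superimpose onto image to be processed
        adv = init + np.clip(adv - init, -EPS, EPS)   # limit pixel differences
    return adv

img = np.full((8, 8), 0.8)
mask = np.zeros((8, 8), dtype=bool)
mask[2:6, 2:6] = True
adv = generate_target_adversarial(img, mask)
```

In the toy run, the masked region darkens step by step until the termination check fails, while pixels outside the region (where the gradient is zero) are untouched.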
According to one aspect of the present application, there is provided a computer program product or computer program comprising computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, so that the computer device performs the methods provided in the various alternative implementations of the above embodiments.
For the specific implementation of each operation above, reference may be made to the previous embodiments, which are not repeated here.
The computer-readable storage medium may comprise: a read-only memory (ROM), a random-access memory (RAM), a magnetic disk, an optical disc, and the like.
Because the instructions stored in the computer-readable storage medium can execute the steps of any data processing method provided by the embodiments of the present application, they can achieve the beneficial effects of any such method; for details, see the previous embodiments, which are not repeated here.
The foregoing describes in detail the countermeasure image generation method, apparatus, computer device, and storage medium provided by the embodiments of the present application, with specific examples used to illustrate the principles and implementations of the present application; the above description of the embodiments is intended only to help in understanding the method and core idea of the present application. Meanwhile, since those skilled in the art may make changes to the specific implementations and the scope of application in light of the ideas of the present application, this description should not be construed as limiting the present application.

Claims (13)

1. A countermeasure image generation method, characterized by comprising:
generating an intermediate countermeasure image, the specific steps of which include:
inputting an image to be processed into a pre-trained target detection model, and outputting classification prediction information and region prediction coordinate information of each detection object;
inputting the classification prediction information and corresponding classification label information of each detection object into a prediction classification loss function, and performing gradient calculation of the type corresponding to the category to which the classification label information belongs, to obtain a corresponding target classification loss,
wherein the prediction classification loss function comprises a gradient increment calculation item and a gradient decrement calculation item, the gradient increment calculation item being used for gradient increment calculation of the classification loss of a detection object whose classification label information belongs to a first category, and the gradient decrement calculation item being used for gradient decrement calculation of the classification loss of a detection object whose classification label information belongs to a second category;
determining a target positioning loss according to the region prediction coordinate information of each detection object and the corresponding region label coordinate information;
constructing a target gradient loss function based on the target classification loss and the target positioning loss;
calculating, through the target gradient loss function, a target gradient image corresponding to the image to be processed;
superimposing the target gradient image onto the image to be processed to generate the intermediate countermeasure image, wherein the intermediate countermeasure image serves as the image to be processed for gradient calculation in the next iteration; and
repeatedly performing the step of generating the intermediate countermeasure image until no detection object identified by the target detection model as the first category exists in the intermediate countermeasure image, and determining the intermediate countermeasure image as a target countermeasure image.
2. The countermeasure image generation method according to claim 1, wherein the prediction classification loss function comprises the gradient increment calculation item for gradient increment calculation and the gradient decrement calculation item for gradient decrement calculation;
and wherein inputting the classification prediction information and corresponding classification label information of each detection object into the prediction classification loss function, and performing gradient calculation of the type corresponding to the category to which the classification label information belongs, to obtain a corresponding target classification loss, comprises:
determining the category to which the classification label information of each detection object belongs;
determining a detection object whose category is the first category as a first detection object, and determining a detection object whose category is the second category as a second detection object;
inputting the classification prediction information of the first detection object and the corresponding classification label information into the gradient increment calculation item for gradient increment calculation, to obtain a first classification loss;
inputting the classification prediction information of the second detection object and the corresponding classification label information into the gradient decrement calculation item for gradient decrement calculation, to obtain a second classification loss; and
determining a target classification loss based on the first classification loss and the second classification loss.
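As an illustration only, the following sketch instantiates the gradient increment and gradient decrement calculation items with signed cross-entropy terms (the claim does not fix the underlying loss; the opposing signs for the two categories are the essential part), and folds in the summation and averaging of claim 3. All names and values are assumptions for exposition.

```python
import numpy as np

def cross_entropy(pred_probs, label_idx):
    """Per-object cross-entropy against the labelled class (assumed base loss)."""
    return -np.log(pred_probs[label_idx] + 1e-12)

def target_classification_loss(preds, labels, first_category):
    """preds: per-object probability vectors; labels: per-object class indices.
    Objects of the first category contribute +CE (loss pushed up by the
    gradient increment item), the rest contribute -CE (loss pushed down)."""
    first = [cross_entropy(p, y) for p, y in zip(preds, labels) if y in first_category]
    second = [-cross_entropy(p, y) for p, y in zip(preds, labels) if y not in first_category]
    third = sum(first) + sum(second)        # claim 3: sum to a third classification loss
    return third / len(preds)               # claim 3: ratio to the total number of objects

preds = [np.array([0.7, 0.3]), np.array([0.2, 0.8])]
labels = [0, 1]
loss = target_classification_loss(preds, labels, first_category={0})
```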
3. The countermeasure image generation method according to claim 2, wherein determining the target classification loss based on the first classification loss and the second classification loss comprises:
summing the first classification loss and the second classification loss to obtain a third classification loss; and
calculating the ratio of the third classification loss to the total number of detection objects to obtain the target classification loss.
4. The countermeasure image generation method according to claim 1, further comprising, before inputting the classification prediction information and corresponding classification label information of each detection object into the prediction classification loss function:
calculating a spatial distance between the classification prediction information and each piece of classification label information; and
determining the classification label information corresponding to the classification prediction information of each detection object according to the magnitude of the spatial distance.
5. The countermeasure image generation method according to claim 1, wherein constructing the target gradient loss function based on the target classification loss and the target positioning loss comprises:
weighting the target classification loss by a preset weight to obtain a weighted target classification loss; and
constructing the target gradient loss function based on the weighted target classification loss and the target positioning loss.
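A one-line sketch of this construction, with CLS_WEIGHT as an assumed stand-in for the preset weight (the patent does not disclose its value):

```python
CLS_WEIGHT = 2.0  # preset weight (illustrative value)

def target_gradient_loss(cls_loss, loc_loss, w=CLS_WEIGHT):
    """Weighted target classification loss combined with the target positioning loss."""
    return w * cls_loss + loc_loss
```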
6. The countermeasure image generation method according to claim 1, wherein calculating the target gradient image corresponding to the image to be processed through the target gradient loss function comprises:
calculating a gradient value of each pixel in the image to be processed according to the target gradient loss function to obtain a first gradient image;
processing the first gradient image through a sign function to obtain a second gradient image; and
calculating the product of a preset disturbance value and the second gradient image to obtain the target gradient image.
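The three steps of this claim reduce to a sign-and-scale operation on the per-pixel gradients; a sketch with an illustrative gradient array and an assumed preset disturbance value:

```python
import numpy as np

EPS = 0.01  # preset disturbance value (assumed)

def target_gradient_image(first_gradient_image, eps=EPS):
    second = np.sign(first_gradient_image)  # sign function keeps only the direction
    return eps * second                     # scale by the preset disturbance value

g = np.array([[0.3, -1.2],
              [0.0,  4.5]])                 # illustrative first gradient image
tgi = target_gradient_image(g)
```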
7. The countermeasure image generation method according to claim 6, wherein superimposing the target gradient image onto the image to be processed to generate the intermediate countermeasure image comprises:
superimposing the target gradient image onto the image to be processed to obtain a first countermeasure image;
acquiring the image to be processed that was first input into the pre-trained target detection model, and determining it as an initial image; and
limiting the pixel difference between each pixel of the first countermeasure image and the corresponding pixel of the initial image within a preset disturbance value range, and generating the intermediate countermeasure image.
8. The countermeasure image generation method according to claim 7, wherein limiting the pixel difference between each pixel of the first countermeasure image and the corresponding pixel of the initial image within the preset disturbance value range, and generating the intermediate countermeasure image, comprises:
calculating the pixel difference between each first pixel in the first countermeasure image and the corresponding second pixel in the initial image to form a pixel difference set;
limiting each pixel difference in the pixel difference set within the preset disturbance value range by a clipping function,
wherein the minimum value of the preset disturbance value range is the negative of the preset disturbance value and the maximum value is the positive preset disturbance value; and
superimposing the clipped pixel difference set onto the initial image to generate the intermediate countermeasure image.
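A sketch of this clipping step, with EPS standing in for the preset disturbance value (an assumed value):

```python
import numpy as np

EPS = 0.1  # preset disturbance value (assumed)

def intermediate_adversarial(first_adv, init, eps=EPS):
    diff = first_adv - init            # pixel difference set
    diff = np.clip(diff, -eps, eps)    # clipping function limits to [-eps, +eps]
    return init + diff                 # superimpose onto the initial image

init = np.array([0.5, 0.5, 0.5])
first_adv = np.array([0.9, 0.45, 0.1])
mid = intermediate_adversarial(first_adv, init)
```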
9. A method of training an object detection model, comprising:
inputting a sample image set into an initial object detection model and performing model training to obtain a trained object detection model, wherein the sample image set comprises a sample countermeasure image generated by the countermeasure image generation method according to any one of claims 1 to 8.
10. An image recognition method, comprising:
inputting an image to be identified into an object detection model for object identification to obtain an identification result, wherein the object detection model is trained according to the training method of the object detection model of claim 9.
11. A countermeasure image generation apparatus, characterized by comprising:
a generation module for generating an intermediate countermeasure image, the generation module specifically comprising: an output unit for inputting an image to be processed into a pre-trained target detection model and outputting classification prediction information and region prediction coordinate information of each detection object; an input unit for inputting the classification prediction information and corresponding classification label information of each detection object into a prediction classification loss function, and performing gradient calculation of the type corresponding to the category to which the classification label information belongs, to obtain a corresponding target classification loss, wherein the prediction classification loss function comprises a gradient increment calculation item used for gradient increment calculation of the classification loss of a detection object whose classification label information belongs to a first category, and a gradient decrement calculation item used for gradient decrement calculation of the classification loss of a detection object whose classification label information belongs to a second category; a first determining unit for determining a target positioning loss according to the region prediction coordinate information of each detection object and the corresponding region label coordinate information; a construction unit for constructing a target gradient loss function based on the target classification loss and the target positioning loss; a calculating unit for calculating, through the target gradient loss function, a target gradient image corresponding to the image to be processed; and a superposition unit for superimposing the target gradient image onto the image to be processed to generate the intermediate countermeasure image, wherein the intermediate countermeasure image serves as the image to be processed for gradient calculation in the next iteration; and
an iteration module for repeatedly performing the step of generating the intermediate countermeasure image until no detection object identified by the target detection model as the first category exists in the intermediate countermeasure image, and determining the intermediate countermeasure image as a target countermeasure image.
12. A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the countermeasure image generation method of any one of claims 1 to 8 or of the training method of the object detection model of claim 9.
13. A computer-readable storage medium storing a plurality of instructions adapted to be loaded by a processor to perform the steps of the countermeasure image generation method of any one of claims 1 to 8 or of the training method of the object detection model of claim 9.
CN202210471139.9A 2022-04-28 2022-04-28 Countermeasure image generation method, device, computer device, and storage medium Active CN115019118B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210471139.9A CN115019118B (en) 2022-04-28 2022-04-28 Countermeasure image generation method, device, computer device, and storage medium

Publications (2)

Publication Number Publication Date
CN115019118A (en) 2022-09-06
CN115019118B (en) 2024-06-21

Family

ID=83066831

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210471139.9A Active CN115019118B (en) 2022-04-28 2022-04-28 Countermeasure image generation method, device, computer device, and storage medium

Country Status (1)

Country Link
CN (1) CN115019118B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111488895A (en) * 2019-01-28 2020-08-04 北京达佳互联信息技术有限公司 Countermeasure data generation method, device, equipment and storage medium
CN111627044A (en) * 2020-04-26 2020-09-04 上海交通大学 Target tracking attack and defense method based on deep network

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11455515B2 (en) * 2019-09-24 2022-09-27 Robert Bosch Gmbh Efficient black box adversarial attacks exploiting input data structure


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant