CN114419346A - Model robustness detection method, device, equipment and medium

Model robustness detection method, device, equipment and medium

Info

Publication number
CN114419346A
Authority
CN
China
Prior art keywords
robustness
model
target model
target
attack
Prior art date
Legal status
Granted
Application number
CN202111673631.6A
Other languages
Chinese (zh)
Other versions
CN114419346B (en)
Inventor
Inventor not disclosed
Current Assignee
Beijing Real AI Technology Co Ltd
Original Assignee
Beijing Real AI Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Real AI Technology Co Ltd
Priority to CN202111673631.6A
Publication of CN114419346A
Application granted
Publication of CN114419346B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Abstract

The embodiments of the present application relate to the field of computer technology and provide a method, an apparatus, a device, and a medium for detecting the robustness of a model, wherein the method comprises the following steps: acquiring a target image set, wherein the target image set comprises at least one adversarial sample image; inputting each of the at least one adversarial sample image to a target model to be attacked; obtaining an output result of the target model, wherein the output result comprises the similarity between an original sample image and each adversarial sample image in the input target image set; obtaining robustness detection data of the target model based on the output result and a preset similarity threshold of the target model; and determining a robustness diagnosis result of the target model according to the robustness detection data. The scheme outputs an objective and accurate robustness diagnosis result for the target model, making it easy for a user to quickly judge how safe and reliable the target model is when facing adversarial-sample attacks.

Description

Model robustness detection method, device, equipment and medium
Technical Field
The embodiments of the present application relate to the field of computer technology, and in particular to a method, an apparatus, a device, and a medium for detecting the robustness of a model.
Background
With the development of artificial intelligence, deep neural networks (DNNs) are widely applied in the field of computer vision (CV) to power deep learning (DL) models. Existing DL models, however, carry high security risks and offer poor reliability when facing adversarial-sample attacks.
In the security protection field, face recognition models are commonly used to identify people: a large amount of face data is first collected to train the model, and faces captured in real time are then recognized against it. When facing the threat of adversarial samples, the main existing solution is to collect more data and enlarge the training set, thereby improving the recognition accuracy of the model. However, such a solution is not targeted at adversarial attacks, and the model's judgment still suffers from the interference of adversarial samples; a method for detecting the robustness of a model is therefore currently lacking.
Disclosure of Invention
In order to solve the above technical problem, embodiments of the present application provide a method, an apparatus, a device, and a medium for detecting the robustness of a model.
In a first aspect, an embodiment of the present application provides a method for detecting the robustness of a model, comprising:
acquiring a target image set, wherein the target image set comprises at least one adversarial sample image; inputting each of the at least one adversarial sample image to a target model to be attacked; obtaining an output result of the target model, wherein the output result comprises the similarity between an original sample image and each adversarial sample image in the input target image set; obtaining robustness detection data of the target model based on the output result and a preset similarity threshold of the target model; and determining a robustness diagnosis result of the target model according to the robustness detection data.
In a second aspect, an embodiment of the present application provides a robustness detection apparatus for a model, which implements the above robustness detection method and comprises:
an input-output module, configured to acquire a target image set, the target image set comprising at least one adversarial sample image;
a processing module, configured to input each of the at least one adversarial sample image to a target model to be attacked; obtain an output result of the target model, wherein the output result comprises the similarity between an original sample image and each adversarial sample image in the input target image set; obtain robustness detection data of the target model based on the output result and a preset similarity threshold of the target model; and determine a robustness diagnosis result of the target model according to the robustness detection data.
An embodiment of the present application further provides an electronic device, comprising: a processor; and a memory for storing instructions executable by the processor; wherein the processor is configured to read the executable instructions from the memory and execute them to implement the robustness detection method of the model.
An embodiment of the present application further provides a computer-readable storage medium storing a computer program for executing the robustness detection method of the model.
Compared with the prior art, the technical scheme provided by the embodiment of the application has the following advantages:
the method, apparatus, device, and medium for detecting the robustness of a model provided by the embodiments of the present application work as follows: first, a target image set comprising at least one adversarial sample image is acquired; each adversarial sample image is then input to a target model to be attacked; an output result of the target model is obtained, and robustness detection data of the target model is derived from the output result and a preset similarity threshold of the target model; finally, a robustness diagnosis result of the target model is determined from the robustness detection data. This scheme realizes simulated, live-fire attack drills against the target model, fills the gap in the computer vision field regarding the detection of and defense against adversarial attacks, and at the same time outputs an objective and accurate robustness diagnosis result for the target model, making it easy for a user to quickly judge how safe and reliable the target model is when facing adversarial-sample attacks.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and, together with the description, serve to explain the principles of the embodiments of the application.
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below; other drawings can be obtained from them by those skilled in the art without inventive effort.
Fig. 1 is a schematic view of an application scenario of a detection method according to an embodiment of the present application;
FIG. 2 is a flowchart of a robustness detection method of the model according to an embodiment of the present application;
fig. 3 is a scene schematic diagram of an attack target model according to an embodiment of the present application;
FIG. 4 is a graphical representation of a robust diagnostic result according to an embodiment of the present application;
FIG. 5 is a graphical representation of another robust diagnostic result according to an embodiment of the present application;
FIG. 6 is a schematic illustration of a diagnosis in a simulated range environment according to an embodiment of the present application;
FIG. 7 is a schematic diagram illustrating a diagnosis of a bank identification system according to an embodiment of the present application;
FIG. 8 is a schematic diagram illustrating a diagnosis in a scene of driver identity recognition according to an embodiment of the present application;
FIG. 9 is a schematic diagram of the method as oriented to different types of users according to an embodiment of the present application;
FIG. 10 is a schematic illustration of a horizontal comparison according to an embodiment of the present application;
FIG. 11 is a schematic illustration of another horizontal comparison according to an embodiment of the present application;
FIG. 12 is a block diagram of a robustness detection apparatus of the model according to an embodiment of the present application;
fig. 13 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order that the above-mentioned objects, features and advantages of the present application may be more clearly understood, the solution of the present application will be further described below. It should be noted that the embodiments and features of the embodiments of the present application may be combined with each other without conflict.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application, but the present application may be practiced in other ways than those described herein; it is to be understood that the embodiments described in this specification are only some embodiments of the present application and not all embodiments.
Deep neural networks are widely used in the field of computer vision to power deep learning models. Despite the ever-increasing performance of these models, existing deep learning models are very unreliable in the face of adversarial samples (i.e., inputs crafted with slight perturbations that mislead the neural network), which can seriously undermine the security of a system. A growing body of research shows that adversarial attacks can cause severe physical-world effects, possibly in real-world scenarios beyond what is expected in the laboratory. For example, an adversarial facial patch generated by GenAP and printed out can unlock a smartphone and cause a facial recognition system to make mistakes. Recognizing this growing safety challenge, many regional, national, and international agencies have introduced guidelines to standardize the use of artificial intelligence techniques and to establish standards for achieving technical robustness against safety risks. This suggests that it is necessary for computer vision engineers, as the users and builders of models, to recognize the potential risk of adversarial attacks. However, in current practice, detection and defense methods for reducing potential adversarial threats are mainly proposed by defense-algorithm engineers after deep learning models have been deployed in computer vision applications, and computer vision engineers are rarely involved in this process. Thus, both enterprise departments and individual computer vision engineers need a way to evaluate the robustness of the models they use.
Among adversarial attacks, the evasion attack is the most common attack on the inference process of a machine learning model. It refers to designing or computing an input that a human can correctly recognize under normal conditions but that is incorrectly classified by the model. A typical example is altering certain pixels in a picture before uploading it, so that an image recognition system cannot classify it correctly; in some cases, such adversarial examples can even deceive humans. After noise is added to an original picture, the model outputs a recognition result that differs greatly from the original: for example, an image of the Alps is recognized as an image of a dog, or, after a glasses frame is added to an image of user A, the model outputs another user B.
Based on this, in the security protection field where face recognition models are used to identify people, the main existing solution to the threat of adversarial samples is to collect more data and enlarge the training set, thereby improving the recognition accuracy of the model. However, such solutions are not targeted at adversarial attacks, and the model's judgment still suffers from the interference of adversarial samples. The underlying reason is that most CV engineers lack knowledge and experience in the field of attack and defense: they do not know how to deal with the threat of adversarial samples, nor from what aspects to evaluate the adversarial robustness of their own models. Therefore, CV engineers and related industry departments need a method for detecting or evaluating the robustness of a model, to help them diagnose model robustness, provide ideas for improving the model, and horizontally evaluate the security development of the industry.
Based on this, the embodiments of the present application provide a method, an apparatus, a device, and a medium for detecting the robustness of a model. In this scheme, at least one adversarial sample image is input to the target model to be attacked; the output result of the target model, i.e., the similarity obtained when an original sample image and an adversarial sample attack the target model, is then used to detect the robustness of the target model and obtain robustness detection data, and the robustness diagnosis result of the target model is determined from the robustness detection data of the attack behavior. This scheme realizes simulated, live-fire attack drills against the target model, fills the gap in the computer vision field regarding the detection of and defense against adversarial attacks, and at the same time outputs an objective and accurate robustness diagnosis result for the target model, making it easy for a user to quickly judge how safe and reliable the target model is when facing adversarial-sample attacks. For ease of understanding, the embodiments of the present application are described below.
The embodiment of the present application provides a flowchart of a method for detecting the robustness of a model, which is executed by an electronic device such as a mobile phone, a computer, or a server. It should be noted that the embodiments merely use the robustness detection method of the model as an example to explain the present application. In this embodiment, the electronic device can be connected to a projector and an image acquisition device, and the picture of the electronic device is displayed on a holographic film through the projector. Fig. 1 illustrates a structure for attacking a target model with adversarial sample images based on holographic imaging according to an embodiment of the present disclosure. As shown in fig. 1, an electronic device 110 is connected to a projector 120 and an image capture device 130, and the content displayed on the screen of the electronic device 110 can be shown on a holographic film 140 through the projector 120. After the electronic device 110 is connected to the projector 120, the focal length and distortion of the projector can be adjusted by a worker so that the picture of the electronic device 110 is clearly displayed on the holographic film 140. When an adversarial perturbation is displayed on the screen of the electronic device 110 and a human face appears on the other side of the holographic film 140, the image acquisition device 130 takes a photograph in the direction of the holographic film 140 and can capture both an original image serving as the attack object and an adversarial image attacking that original image, thereby realizing an attack on the target model. The robustness detection scene shown in fig. 1 has the advantages of a wide application range, easy convergence, convenient debugging, and easy detection; it can simulate and drill attacks on the target model and output an objective and accurate robustness diagnosis result for the target model.
It can be understood that the image acquisition device used in the embodiments of the present application may be an independent image capture apparatus or a camera built into the electronic device; the present application is explained only by taking an image acquisition device independent of the electronic device as an example, which is not intended to limit the application.
In addition, the scheme provided by the embodiments of the present application involves technologies such as artificial intelligence (AI), natural language processing (NLP), and machine learning (ML), which are illustrated below:
the AI is a theory, method, technique and application system that simulates, extends and expands human intelligence, senses the environment, acquires knowledge and uses the knowledge to obtain the best results using a digital computer or a machine controlled by a digital computer. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence is the research of the design principle and the realization method of various intelligent machines. The machine has the functions of perception, reasoning and decision making.
The AI technology is a comprehensive subject, and relates to the field of extensive technology, both hardware level technology and software level technology. The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and the like.
NLP is an important direction in the fields of computer science and artificial intelligence. It studies various theories and methods that enable efficient communication between humans and computers using natural language. Natural language processing is a science integrating linguistics, computer science and mathematics. Therefore, the research in this field will involve natural language, i.e. the language that people use everyday, so it is closely related to the research of linguistics. Natural language processing techniques typically include text processing, semantic understanding, machine translation, robotic question and answer, knowledge mapping, and the like.
Fig. 2 is a flowchart of a method for detecting the robustness of a model according to an embodiment of the present disclosure; the method is performed by an electronic device such as a mobile phone, a computer, or a server, and comprises the following steps:
step S102, obtaining a target image set, wherein the target image set comprises at least one confrontation sample image.
In different tasks such as image recognition, image comparison, image tracking and the like, the obtained target image sets are different. For example, in the face comparison task, the confrontation sample images in the target image set are images including faces, and the included faces usually belong to different objects. The confrontation sample image can be obtained by adding invisible micro-disturbance to the original image, for example, adding a blocking object such as an eyeshade and glasses to the face of the original image to obtain the confrontation sample image; or adding tiny noise disturbance invisible to human eyes on the original image to obtain a confrontation sample image. Fighting the sample image can cause the model to make false judgments.
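The patent does not fix a particular perturbation algorithm, so the following is only a minimal sketch of the additive-noise idea; the epsilon budget and the random-noise generator are assumptions:

```python
from typing import Optional

import numpy as np

def make_adversarial(original: np.ndarray, epsilon: float = 8.0,
                     rng: Optional[np.random.Generator] = None) -> np.ndarray:
    """Add a bounded random perturbation to an HxWxC uint8 image.

    epsilon caps the per-pixel change so the perturbation stays
    imperceptible to the human eye.
    """
    rng = rng or np.random.default_rng(0)
    noise = rng.uniform(-epsilon, epsilon, size=original.shape)
    perturbed = np.clip(original.astype(np.float64) + noise, 0.0, 255.0)
    return perturbed.astype(np.uint8)
```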
Step S104, inputting each of the at least one adversarial sample image to a target model to be attacked.
Taking the face comparison task as an example, the target model may be a face comparison model. The input data of the target model comprises an adversarial sample image, and the output data comprises the similarity between the original sample image and the adversarial sample image, where the original sample image is the image serving as the attack target and the adversarial sample image is the image attacking the original sample image.
Step S106, obtaining an output result of the target model.
The output result comprises the similarity between an original sample image and each adversarial sample image in the input target image set.
In this embodiment, the similarity between the original sample image and each adversarial sample image in the target image set is calculated by the target model; that is, the adversarial sample image is used to attack the target model, and the obtained similarity represents the degree of matching, as calculated by the target model, between the original sample image and the adversarial sample image.
Step S108, obtaining robustness detection data of the target model based on the output result and a preset similarity threshold of the target model.
In this embodiment, the robustness detection data of the target model is obtained based on the similarity in the output result and the preset similarity threshold of the target model. The robustness detection data may include, but is not limited to: the true similarity between the original sample image and the adversarial sample image, whether recognition was accurate, whether the attack succeeded, and the duration of the attack behavior. The similarity threshold is a characteristic value inherent to the target model.
Step S110, determining a robustness diagnosis result of the target model according to the robustness detection data.
In the above method for detecting the robustness of a model, the output result of the target model, i.e., the similarity obtained when the original sample image and the adversarial sample attack the target model, is used to detect the robustness of the target model and obtain robustness detection data, and the robustness diagnosis result of the target model is determined from the robustness detection data of the attack behavior. This scheme realizes simulated, live-fire attack drills against the target model, fills the gap in the computer vision field regarding the detection of and defense against adversarial attacks, and at the same time outputs an objective and accurate robustness diagnosis result, making it easy for a user to quickly judge how safe and reliable the target model is when facing adversarial-sample attacks.
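As a rough illustration (not the claimed implementation itself), steps S102 to S110 can be sketched as a single loop; the model callable returning a similarity score, and the example threshold of 0.28 mentioned later in this description, are assumptions about the target model's interface:

```python
from typing import Callable, List

import numpy as np

def detect_robustness(model: Callable[[np.ndarray, np.ndarray], float],
                      original: np.ndarray,
                      adversarial_set: List[np.ndarray],
                      threshold: float = 0.28) -> dict:
    """Steps S102-S110: attack the model with each adversarial sample,
    compare the returned similarity against the preset threshold, and
    summarize the outcomes as a robustness diagnosis."""
    successes = 0
    for adv in adversarial_set:            # S104: input each sample
        similarity = model(original, adv)  # S106: obtain the output result
        if similarity > threshold:         # S108: the attack succeeded
            successes += 1
    attack_success_rate = successes / len(adversarial_set)
    # S110: here, a lower success rate is read as higher robustness
    return {"attack_success_rate": attack_success_rate,
            "robustness": 1.0 - attack_success_rate}
```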
For ease of understanding, the robustness detection method of the model provided in the embodiments of the present application is described in detail below.
In one embodiment, the target model to be attacked may be selected from a plurality of candidate models constructed in advance.
According to the target model and the task it implements, such as face recognition, an original sample image serving as the attack object and an adversarial sample image attacking the original sample image are acquired from the target image set. The original sample image and the adversarial sample image correspond to different user objects: as shown in fig. 3, the original sample image may be the user with ID: 126457, and the adversarial sample image may be the user with ID: 116822. The original sample image and the adversarial sample image may come from the same image set or from different image sets, which is not limited here.
Considering that the emphasis of model detection differs greatly across fields and users, another embodiment is provided here in order to meet the personalized detection needs of different users and to broadly adapt to robustness detection requirements in more scenarios: a custom-diagnosis entry can be offered to the user to realize personalized diagnosis, comprising: in response to the user's upload operation for custom diagnosis, acquiring the target model to be attacked and the target image set uploaded by the user, and acquiring from the target image set an original sample image serving as the attack object and an adversarial sample image attacking the original sample image.
In this embodiment, the target model, the original sample image, and the adversarial sample image are obtained from the user's upload operation, and attack behaviors based on this acquisition can meet the user's personalized diagnosis needs.
After the original sample image and the adversarial sample image are obtained according to the above embodiment, the adversarial sample image is input to the target model to be attacked, and the output result of the target model is obtained. In one implementation of obtaining the output result, the computational logic of the target model is obtained first; a historical model matching the computational logic of the target model is then searched for in a preset historical diagnosis library; and the output result of the target model is determined from the historical output result of that historical model. The output result here is the final diagnostic data of the target model; that is, the final diagnostic data of the target model is output (with a confidence attached) by drawing on the historical diagnostic data of the historical model.
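A minimal sketch of this reuse path, assuming the computational logic can be reduced to a comparable fingerprint such as an architecture string; the fingerprint scheme and the library layout below are purely illustrative:

```python
from typing import Optional, Tuple

# Hypothetical history library mapping a model fingerprint to its
# historical output result and a confidence score.
HISTORY_LIBRARY = {
    "resnet50/cosine-similarity": ({"attack_success_rate": 0.35}, 0.9),
}

def lookup_history(fingerprint: str) -> Optional[Tuple[dict, float]]:
    """Return the historical output result (and its confidence) of a
    model whose computational logic matches the given fingerprint."""
    return HISTORY_LIBRARY.get(fingerprint)
```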
In another implementation, referring to fig. 3, when obtaining the output result of the target model, a region, a wearable item (such as glasses, a mask, or a sticker), and a facial feature (such as the whole face, nose, mouth, or eyes) are first selected to determine the attack region; the target model then calculates the similarity between the original sample image and the adversarial sample image within the attack region. This similarity, calculated after the target model is attacked, is taken as the output result of the target model.
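The patent leaves the target model's internals open; face comparison models typically score a pair of images by the distance between their embeddings, so the cosine-similarity sketch below is an assumption about one plausible interface:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two face-embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def compare(embed, original_img: np.ndarray, adversarial_img: np.ndarray) -> float:
    """Embed both images (e.g. with a face recognition backbone passed
    in as `embed`) and return the similarity the target model outputs."""
    return cosine_similarity(embed(original_img), embed(adversarial_img))
```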
This embodiment provides a method for obtaining the robustness detection data of the target model based on the output result and the similarity threshold, comprising the following steps 1 to 3.
Step 1: if the similarity included in the output result is greater than the similarity threshold, record first detection data, wherein the first detection data includes at least one of the following: a misrecognition result, a recognition failure record, an attack success record, and a first attack duration record.
Specifically, the similarity included in the output result is compared with the similarity threshold of the target model, which may be a characteristic value inherent to the target model, for example 0.28. A similarity greater than the threshold means that the adversarial sample image with which the product service side attacked the target model has succeeded: when recognizing the face corresponding to the original sample image, the target model identifies it as the adversarial sample image (i.e., the man with ID: 116822), whereas the result that should actually be output is ID: 126457. The recognition is therefore wrong, i.e., the attack has succeeded. At this point, the first detection data corresponding to the current attack behavior can be recorded: the misrecognition result, the recognition failure record, the attack success record, and the first attack duration record. It can be understood that the first detection data indicates that the target model has mediocre robustness, poor safety and reliability, and is easy to attack.
Step 2: if the similarity included in the output result is not greater than the similarity threshold, record second detection data, wherein the second detection data includes at least one of the following: a recognition success record, an attack failure record, and a second attack duration record.
Contrary to step 1, a similarity not greater than the threshold means that, when attacked by the product service side with the adversarial sample image, the target model resists the attack and outputs an accurate recognition result for the face corresponding to the original sample image; the target model thus has good robustness and is not easy to attack.
Step 3: take the first detection data or the second detection data as the robustness detection data corresponding to the current attack behavior.
Specifically, for a single attack behavior, the first detection data or the second detection data is taken as the robustness detection data of that attack, according to the comparison between the similarity included in the output result and the similarity threshold.
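A sketch of steps 1 to 3, keeping one detection-data record per attack behavior; the record layout below is an assumption, since the patent only names the fields:

```python
import time
from dataclasses import dataclass

@dataclass
class DetectionRecord:
    attack_succeeded: bool      # attack success / attack failure record
    recognized_correctly: bool  # recognition success / misrecognition result
    attack_duration: float      # first / second attack duration record

def record_attack(model, original, adversarial, threshold: float) -> DetectionRecord:
    """Steps 1-3: compare the model's similarity output against the
    threshold and keep one detection record for this attack behavior."""
    start = time.monotonic()
    similarity = model(original, adversarial)
    duration = time.monotonic() - start
    succeeded = similarity > threshold  # step 1 (succeeded) vs. step 2
    return DetectionRecord(attack_succeeded=succeeded,
                           recognized_correctly=not succeeded,
                           attack_duration=duration)
```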
At least one attack is launched on the target model in the above manner to obtain at least one piece of robustness detection data, from which the robustness diagnosis result of the target model is determined, as follows.
(1) Calculate a robustness index of the target model according to the first detection data corresponding to at least one attack behavior; the robustness index includes recognition accuracy, attack success rate, and/or attack success time information.
Specifically, the recognition accuracy is calculated from the proportion of recognition failure records in the first detection data to the total number of attack behaviors, the attack success rate is calculated from the proportion of attack success records to the total number of attack behaviors, and the attack success time information is obtained from the average duration of the first attack duration records across the pieces of first detection data (see the sketch after this list).
(2) Calculate the difficulty-of-attack coefficient of the target model according to the robustness index.
(3) Perform data processing and visual display on several items among the first detection data, the second detection data, the robustness index, and the difficulty-of-attack coefficient to obtain the robustness diagnosis result.
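A sketch of items (1) and (2), reusing the DetectionRecord sketch above; the patent gives no closed-form expression for the difficulty-of-attack coefficient, so the formula below is purely illustrative:

```python
from typing import List

def robustness_index(records: List[DetectionRecord]) -> dict:
    """Item (1): aggregate per-attack records into the robustness index."""
    total = len(records)
    failures = sum(1 for r in records if not r.recognized_correctly)
    successes = sum(1 for r in records if r.attack_succeeded)
    durations = [r.attack_duration for r in records if r.attack_succeeded]
    return {
        # derived from the proportion of recognition-failure records
        "recognition_accuracy": 1.0 - failures / total,
        # proportion of attack-success records over all attacks
        "attack_success_rate": successes / total,
        # average duration of the successful attacks
        "avg_attack_time": sum(durations) / len(durations) if durations else 0.0,
    }

def difficulty_coefficient(index: dict) -> float:
    """Item (2), illustrative form only: a model that is fooled less
    often, and only after longer attacks, is harder to attack."""
    return (1.0 - index["attack_success_rate"]) * (1.0 + index["avg_attack_time"])
```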
The visually displayed robustness diagnosis result may follow the example provided in fig. 5 and include, for instance: the attack algorithm, number of attacks, and attack task; the data involved in the attack behaviors of different enterprise users: model name, data set name, model robustness, attack success rate, and attack time; line graphs horizontally comparing enterprise models in terms of adversarial robustness, recognition accuracy, difficulty-of-attack coefficient, and data acquisition difficulty; and radar charts comparing this product with similar products across several dimensions.
The robustness detection method provided by this embodiment can be applied in many different scenarios to perform robustness diagnosis on any model as required. Based on this, referring to fig. 6, the method provided by this embodiment may further comprise: performing, in a preset simulated range environment, simulation diagnosis on the target model based on the robustness detection method of the model, to obtain a robustness diagnosis simulation result of the target model.
Specifically, when the model robustness detection method is implemented, the target model to be attacked and the target image set uploaded by the user are acquired in response to the user's upload operation for custom diagnosis, and an original sample image serving as the attack object and an adversarial sample image attacking the original sample image are acquired from the target image set. Each of the at least one adversarial sample image is then input to the target model to be attacked, and the following output result of the target model is obtained: the similarity between the original sample image and the adversarial sample images in the input target image set. Robustness detection data of the target model is then obtained based on the output result and the preset similarity threshold of the target model.
For ease of understanding, two examples of possible simulated range environments are provided below.
Example one simulates a bank's range environment. Referring to fig. 7, in a bank identity recognition system, robustness detection of the face recognition model helps prevent the model from being interfered with by adversarial samples, which would cause recognition errors and financial risks. The method can be used by a bank's technical department for security self-improvement, for analyzing the advantages of similar banking businesses, and so on. At present, some banking businesses operate by swiping an identity card together with face recognition; if someone who is not the identity card holder wears certain facial decorations or specially processed glasses, they may be recognized as the card holder in front of the camera, causing a financial security incident. The method provided by this embodiment can therefore be used to perform simulation diagnosis on the robustness of the face recognition model, so as to avoid an attacker who is not the original identity card holder being successfully matched through an adversarial attack. In use, corresponding adversarial sample images are generated from the face data set and target model provided by the enterprise under test; the robustness of the target model is then detected using the adversarial sample images and the original sample images in the face data set; robustness detection data such as recognition accuracy, attack success rate, and attack time are aggregated to calculate the robustness index and safety factor of the target model; and improvement directions and suggestions for the model currently used by the bank under test are given, together with before-and-after comparisons of several attack indices.
Example two simulates a ride-hailing range environment. Referring to fig. 8, detecting the robustness of the face recognition model used to verify driver identity helps prevent the model from being interfered with by adversarial samples, which would cause recognition errors, misidentify the driver, and create safety risks. The method can be used by ride-hailing enterprises for security self-improvement, competitive-product security analysis, and so on. The in-vehicle camera recognizes the driver's face and checks whether the driver is registered. Considering that a driver may upload an adversarial sample image in advance through gray-market means, or wear facial decorations bearing another person's facial features during detection, a model that has not undergone robustness diagnosis may, when comparing the input images, still conclude that the person is the registered driver even when the registered driver is not operating the vehicle. The method provided by this embodiment is therefore used to perform simulation diagnosis on the robustness of the face recognition model, so as to avoid an adversarial sample of the driver's own picture being successfully matched to another person; that is, after comparing the input images, a target model that has undergone robustness diagnosis concludes that the person is not the registered driver. In use, corresponding adversarial sample images are generated from the face data set and model provided by the enterprise under test; the robustness of the model is then detected using the adversarial sample images and the original sample images in the face data set; the model's robustness index is calculated by aggregating recognition accuracy, attack success rate, and attack time; and, based on historical analysis results, the adversarial-robustness strengths and weaknesses of similar ride-hailing products are displayed, with improvement directions and suggestions given for the enterprise under test.
In practical application, the robustness detection method provided by the embodiments of the present application can serve both enterprise users and individual users. Generally speaking, enterprise users aim to develop their own solutions but lack knowledge of, or the ability to judge, the stability of the models involved. Individual users, for example a developer who writes a model independently, need a diagnostic tool for their self-built model in order to optimize it continuously and deliver a model with better robustness.
The robustness detection method provided by this embodiment does not require the person using the target model to be a technician; the user can be any ordinary person without technical knowledge. A user only needs to provide the target model to be detected to obtain a robustness diagnosis result, and intuitive suggestions are given quickly and accurately.
Referring to fig. 9, different embodiments of applying the robustness detection method are provided here for enterprise users and individual users, respectively.
For individual users, the custom data set input by the user and the self-uploaded target model are first obtained from the user's upload operation, and an original sample image serving as the attack object and an adversarial sample image attacking the original sample image are acquired from the data set. An attack behavior is then launched against the target model based on the adversarial sample image; that is, the similarity between the original sample image and the adversarial sample image is calculated by the target model to obtain the image similarity, and the robustness detection data of the target model is obtained based on the image similarity and the similarity threshold. The robustness diagnosis of the target model is performed in this manner. Finally, robustness detection data such as key feature points, attack success time information, and attack success rate are labeled, sorted, and visually displayed to obtain the robustness diagnosis result. Specifically, the robustness diagnosis result may be presented as a report following the example provided in fig. 4, and may include: the model name, data set name, model robustness before and after the attack, key feature points, confidence ranking and image similarity, attack time, attack success rate, and so on.
For enterprise users, the robustness diagnosis result is obtained according to the robustness detection method provided by this embodiment; the business models of different enterprise users can then be compared horizontally, which suits needs such as competitive-product analysis and industry research reports. Specifically, the attention mechanism of each sample is adjusted based on the data of the current enterprise user, and adversarial sample images are generated in a targeted manner for different custom-diagnosis upload operations; the target model is then attacked in a targeted way with these adversarial sample images. The enterprise data may include points of interest, strengths, weaknesses, the distribution of employee development directions, professional data of development staff in their professional fields, the enterprise's historical data in its professional field, and so on. The enterprise data may be an enterprise portrait drawn in advance by the service side through crawling, or an enterprise portrait actively provided by the enterprise, which is not limited here.
To provide better generalization for individual and enterprise users, this embodiment provides several methods for the horizontal comparison of robustness diagnosis results.
Referring to fig. 10, in one embodiment, first historical diagnostic data is obtained, where the first historical diagnostic data comprises the robustness diagnostic data of a plurality of reference models, and the robustness diagnostic data of each reference model is obtained either by the robustness detection method of the model or through diagnosis by a plurality of model-robustness diagnostic tools. The plurality of reference models (reference model 1, reference model 2, ..., reference model i) are models of the same type as, but of different sources from, the target model.
In implementation, the robustness diagnostic data of each reference model can be obtained using the robustness detection method provided by the present application, or through diagnosis by several model-robustness diagnostic tools; for example, a different diagnostic tool is used for each reference model to obtain its robustness diagnostic data.
The robustness detection data of the target model is then compared horizontally with the first historical diagnostic data.
By horizontally comparing the robustness detection data of the target model with the first historical diagnostic data, this embodiment can display the differences between the diagnostic data produced for the target model by the robustness detection method of the present application and the diagnostic data of other reference models of the same kind.
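A minimal sketch of this horizontal comparison, lining the target model's metrics up against the reference models' historical data; the metric names follow the earlier sketches and the numbers are placeholders:

```python
def horizontal_comparison(target: dict, references: dict) -> None:
    """Print the target model's metrics next to each reference model's."""
    metrics = ["recognition_accuracy", "attack_success_rate", "avg_attack_time"]
    print(f"{'model':<18}" + "".join(f"{m:>22}" for m in metrics))
    for name, data in {"target model": target, **references}.items():
        print(f"{name:<18}" + "".join(f"{data[m]:>22.3f}" for m in metrics))

# Example with placeholder numbers:
horizontal_comparison(
    {"recognition_accuracy": 0.91, "attack_success_rate": 0.09,
     "avg_attack_time": 3.2},
    {"reference model 1": {"recognition_accuracy": 0.84,
                           "attack_success_rate": 0.16,
                           "avg_attack_time": 2.1}},
)
```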
Referring to fig. 11, in another embodiment, second historical diagnostic data is obtained, comprising the robustness diagnostic data produced for the target model by a plurality of model-robustness diagnostic tools (model-robustness diagnostic tool 1, model-robustness diagnostic tool 2, ..., model-robustness diagnostic tool i), whose diagnostic methods differ from the model robustness detection method provided by the present application.
The robustness detection data of the target model is then compared horizontally with the second historical diagnostic data.
This embodiment horizontally compares the diagnostic data obtained by different diagnostic methods for the same target model, and can display the differences between the robustness detection method of the present application and other similar diagnostic tools when diagnosing the target model. For example, the multi-dimensional radar chart in fig. 5 compares this product (i.e., the robustness detection method of the present application) with similar products (i.e., the other similar diagnostic tools) in terms of data acquisition speed, richness of diagnostic information, algorithm running time, and diagnosis period. The radar chart intuitively shows that this product has clear advantages in data acquisition speed and richness of diagnostic information, and is on par with similar products in algorithm running time and diagnosis period.
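A sketch of such a multi-dimensional radar chart using matplotlib; the four dimensions come from the comparison just described, and the scores are placeholder values:

```python
import numpy as np
import matplotlib.pyplot as plt

dims = ["data acquisition speed", "diagnostic info richness",
        "algorithm running time", "diagnosis period"]
ours = [0.9, 0.9, 0.6, 0.6]     # placeholder scores for this product
similar = [0.6, 0.5, 0.6, 0.6]  # placeholder scores for a similar product

angles = np.linspace(0, 2 * np.pi, len(dims), endpoint=False).tolist()
angles += angles[:1]            # close the polygon

fig, ax = plt.subplots(subplot_kw={"projection": "polar"})
for label, scores in [("this product", ours), ("similar product", similar)]:
    values = scores + scores[:1]
    ax.plot(angles, values, label=label)
    ax.fill(angles, values, alpha=0.15)
ax.set_xticks(angles[:-1])
ax.set_xticklabels(dims)
ax.legend(loc="lower right")
plt.show()
```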
In summary, the method for detecting the robustness of a model provided by the embodiments of the present application fills the gap in the computer vision field, where attention currently focuses on business indicators such as model accuracy and recall, regarding the detection of and defense against adversarial attacks, thereby greatly improving the security of models deployed at scale and reducing incidents affecting personal and financial safety. The scheme can also simulate and drill attacks on the target model and output an objective and accurate robustness diagnosis result, making it easy for a user to quickly judge how safe and reliable the target model is when facing adversarial-sample attacks.
Referring to fig. 12, an embodiment of the present application further provides a robustness detection apparatus for a model, which implements the above robustness detection method and comprises the following modules:
an input-output module 1202, configured to acquire a target image set, the target image set comprising at least one adversarial sample image;
a processing module 1204, configured to input each of the at least one adversarial sample image to a target model to be attacked; obtain an output result of the target model, wherein the output result comprises the similarity between an original sample image and each adversarial sample image in the input target image set; obtain robustness detection data of the target model based on the output result and a preset similarity threshold of the target model; and determine a robustness diagnosis result of the target model according to the robustness detection data.
In addition, the robustness detecting apparatus for the model may further include a display module (not shown in the figure) for displaying the robustness diagnosis result of the target model.
In some embodiments, the processing module 1204 is specifically configured to:
if the similarity included in the output result is greater than the similarity threshold, record first detection data, the first detection data comprising at least one of: a misrecognition result, a recognition failure record, an attack success record, and a first attack duration record; if the similarity included in the output result is not greater than the similarity threshold, record second detection data, the second detection data comprising at least one of: a recognition success record, an attack failure record, and a second attack duration record; and take the first detection data or the second detection data as the robustness detection data corresponding to the current attack behavior.
In some embodiments, the processing module 1204 is specifically configured to:
calculate a robustness index of the target model according to the first detection data corresponding to at least one attack behavior, the robustness index comprising recognition accuracy, attack success rate, and/or attack success time information; calculate a difficulty-of-attack coefficient of the target model according to the robustness index; and perform data processing and visual display on several items among the first detection data, the second detection data, the robustness index, and the difficulty-of-attack coefficient to obtain the robustness diagnosis result.
In some embodiments, the processing module 1204 is further configured to:
acquire first historical diagnostic data, the first historical diagnostic data comprising robustness diagnostic data of a plurality of reference models, the robustness diagnostic data of each reference model being obtained either by the robustness detection method of the model or through diagnosis by a plurality of model-robustness diagnostic tools, wherein the plurality of reference models are models of the same type as, but of different sources from, the target model; and compare the robustness detection data of the target model horizontally with the first historical diagnostic data.
In some embodiments, the processing module 1204 is further configured to:
obtain second historical diagnostic data, the second historical diagnostic data comprising robustness diagnostic data produced for the target model by a plurality of model-robustness diagnostic tools; and compare the robustness detection data of the target model horizontally with the second historical diagnostic data.
In some embodiments, the processing module 1204 is further configured to: perform, in a preset simulated range environment, simulation diagnosis on the target model based on the robustness detection method of the model, to obtain a robustness diagnosis simulation result of the target model.
In some embodiments, the processing module 1204 is specifically configured to:
acquire the computational logic of the target model; search a preset historical diagnosis library for a historical model matching the computational logic of the target model; and determine the output result of the target model according to the historical output result of the historical model.
In some embodiments, the input-output module 1202 is further configured to: in response to a user's upload operation for custom diagnosis, acquire the target model to be attacked and the target image set uploaded by the user, and acquire from the target image set an original sample image serving as the attack object and an adversarial sample image attacking the original sample image.
The implementation principle and technical effects of the apparatus provided by this embodiment are the same as those of the foregoing method embodiments; for brevity, where the apparatus embodiment is silent on a point, reference may be made to the corresponding content in the method embodiments.
Fig. 13 is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in fig. 13, the electronic device 1300 includes one or more processors 1301 and memory 1302.
The processor 1301 may be a Central Processing Unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in the electronic device 1300 to perform desired functions.
Memory 1302 may include one or more computer program products that may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, Random Access Memory (RAM), cache memory (cache), and/or the like. The non-volatile memory may include, for example, Read Only Memory (ROM), hard disk, flash memory, etc. One or more computer program instructions may be stored on the computer-readable storage medium and executed by the processor 1301 to implement the robustness detection method of the model of the embodiments of the present application described above and/or other desired functions. Various contents such as an input signal, a signal component, a noise component, etc. may also be stored in the computer-readable storage medium.
In one example, the electronic device 1300 may further include: an input device 1303 and an output device 1304, which are interconnected by a bus system and/or other form of connection mechanism (not shown).
The input device 1303 may include, for example, a keyboard, a mouse, and the like.
The output device 1304 may output various information including the determined distance information, direction information, and the like to the outside. The output devices 1304 may include, for example, a display, speakers, a printer, and a communication network and its connected remote output devices, among others.
Of course, for simplicity, only some of the components of the electronic device 1300 relevant to the present application are shown in fig. 13, omitting components such as buses, input/output interfaces, and the like. In addition, the electronic device 1300 may include any other suitable components depending on the particular application.
Further, the present embodiment also provides a computer-readable storage medium storing a computer program for executing the robustness detection method of the above model.
The method, apparatus, electronic device, and computer program product of the medium for detecting the robustness of a model provided in the embodiments of the present application comprise a computer-readable storage medium storing program code; the instructions included in the program code may be used to execute the methods described in the foregoing method embodiments, and for specific implementations, reference may be made to those embodiments, which are not repeated here.
It is noted that, in this document, relational terms such as "first" and "second," and the like, may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The above description is merely exemplary of the present application and is presented to enable those skilled in the art to understand and practice the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (11)

1. A method for detecting the robustness of a model, characterized by comprising the following steps:
acquiring a target image set, wherein the target image set comprises at least one adversarial sample image;
inputting each of the at least one adversarial sample image to a target model to be attacked;
obtaining an output result of the target model, wherein the output result comprises the similarity between an original sample image and each adversarial sample image in the input target image set;
obtaining robustness detection data of the target model based on the output result and a preset similarity threshold of the target model; and
determining a robustness diagnosis result of the target model according to the robustness detection data.
2. The method according to claim 1, wherein obtaining the robustness detection data of the target model based on the output result and the preset similarity threshold of the target model comprises:
if a similarity included in the output result is greater than the similarity threshold, recording first detection data, wherein the first detection data comprises at least one of the following: an error recognition result, a recognition failure record, an attack success record, and a first attack duration record;
if a similarity included in the output result is not greater than the similarity threshold, recording second detection data, wherein the second detection data comprises at least one of the following: a recognition success record, an attack failure record, and a second attack duration record; and
taking the first detection data or the second detection data as the robustness detection data corresponding to the current attack behavior.
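An illustrative sketch of the branch in claim 2, under the assumption that a similarity above the threshold means the target model mistakes the adversarial image for the original (attack success); the record field names are invented for illustration:

```python
import time

def run_single_attack(query_similarity, original_image, adv_image, threshold=0.7):
    """Record first or second detection data for one attack behavior.
    Field names are illustrative, not the patent's wording."""
    start = time.perf_counter()
    sim = query_similarity(original_image, adv_image)
    duration = time.perf_counter() - start  # stands in for the attack duration record
    if sim > threshold:
        # the model treats the adversarial image as the original: attack succeeded
        return {"kind": "first", "recognition": "error", "attack": "success",
                "attack_duration_s": duration, "similarity": sim}
    # the model distinguishes the adversarial image from the original: attack failed
    return {"kind": "second", "recognition": "success", "attack": "failure",
            "attack_duration_s": duration, "similarity": sim}
```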
3. The method of claim 2, wherein determining the robustness diagnosis result of the target model according to the robustness detection data comprises:
calculating a robustness index of the target model according to the first detection data corresponding to at least one attack behavior, wherein the robustness index comprises recognition accuracy, attack success rate, and/or attack success time information;
calculating an attack difficulty coefficient of the target model according to the robustness index; and
performing data processing and visual display on a plurality of items among the first detection data, the second detection data, the robustness index, and the attack difficulty coefficient to obtain the robustness diagnosis result.
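The aggregation of claim 3 might look as follows; the concrete formula for the attack difficulty coefficient is an assumed placeholder, since the claim does not fix one, and deriving recognition accuracy as the complement of the attack success rate is likewise an assumption:

```python
def robustness_indices(records):
    """Aggregate per-attack records (as produced by run_single_attack above)
    into a robustness index and an attack difficulty coefficient."""
    n = len(records)
    successes = [r for r in records if r["attack"] == "success"]
    attack_success_rate = len(successes) / n if n else 0.0
    # assumption: the model recognizes correctly exactly when the attack fails
    recognition_accuracy = 1.0 - attack_success_rate
    mean_success_time = (
        sum(r["attack_duration_s"] for r in successes) / len(successes)
        if successes else float("inf")
    )
    # assumed convention: a larger coefficient means the model is harder to attack
    difficulty = (1.0 - attack_success_rate) * mean_success_time if successes else float("inf")
    return {
        "recognition_accuracy": recognition_accuracy,
        "attack_success_rate": attack_success_rate,
        "mean_attack_success_time_s": mean_success_time,
        "attack_difficulty_coefficient": difficulty,
    }
```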
4. The method of claim 1, further comprising:
acquiring first historical diagnosis data, wherein the first historical diagnosis data comprises robustness diagnosis data of a plurality of reference models, the robustness diagnosis data of each reference model being obtained by the model robustness detection method or by diagnosis with a plurality of model robustness diagnosis tools, and the plurality of reference models being models of the same type as, but from different sources than, the target model; and
laterally comparing the robustness detection data of the target model with the first historical diagnosis data.
5. The method of claim 1, further comprising:
acquiring second historical diagnosis data, wherein the second historical diagnosis data comprises robustness diagnosis data obtained by a plurality of model robustness diagnosis tools diagnosing the target model; and
laterally comparing the robustness detection data of the target model with the second historical diagnosis data.
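The lateral comparison of claims 4 and 5 can be illustrated by ranking the target model's attack success rate against the historical diagnosis data; the ranking criterion (lower attack success rate means more robust) is an assumption for illustration:

```python
def lateral_comparison(target_metrics: dict, historical_metrics: dict) -> dict:
    """Rank the target model against reference models or other diagnosis tools.
    `historical_metrics` maps a model/tool name to its attack success rate."""
    combined = dict(historical_metrics)
    combined["target_model"] = target_metrics["attack_success_rate"]
    # sort ascending: the most robust (least attackable) model comes first
    ranking = sorted(combined.items(), key=lambda kv: kv[1])
    target_rank = [name for name, _ in ranking].index("target_model") + 1
    return {"ranking": ranking, "target_rank": target_rank, "out_of": len(ranking)}
```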
6. The method of claim 1, further comprising:
performing, in a preset target range environment, simulation diagnosis on the target model based on the model robustness detection method, to obtain a robustness diagnosis simulation result of the target model.
7. The method of claim 1, wherein obtaining the output result of the target model comprises:
acquiring computational logic of the target model;
searching a preset historical diagnosis library for a historical model matching the computational logic of the target model; and
determining the output result of the target model according to the historical output result of the matching historical model.
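Claim 7's lookup by computational logic could, for example, fingerprint a serialized model graph and query a preset history library; hashing a string form with SHA-256 is one assumed realization, not the patent's prescription:

```python
import hashlib

def logic_fingerprint(model_graph_repr: str) -> str:
    """Fingerprint a serialized representation of the model's computational
    logic; SHA-256 over a string form is one assumed realization."""
    return hashlib.sha256(model_graph_repr.encode("utf-8")).hexdigest()

def lookup_historical_output(model_graph_repr: str, history_library: dict):
    """Return the historical output result of a matching historical model from
    the preset historical diagnosis library, or None when there is no match."""
    return history_library.get(logic_fingerprint(model_graph_repr))
```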
8. The method of claim 1, further comprising:
in response to an upload operation by a user for custom diagnosis, acquiring the target model to be attacked and the target image set uploaded by the user, and acquiring, from the target image set, an original sample image serving as an attack object and the adversarial sample images attacking the original sample image.
9. An apparatus for detecting robustness of a model, comprising:
an input-output module, configured to acquire a target image set, the target image set comprising at least one adversarial sample image; and
a processing module, configured to: input the at least one adversarial sample image respectively to a target model to be attacked; obtain an output result of the target model, wherein the output result comprises a similarity between an original sample image and each adversarial sample image in the input target image set; obtain robustness detection data of the target model based on the output result and a preset similarity threshold of the target model; and determine a robustness diagnosis result of the target model according to the robustness detection data.
10. An electronic device, comprising:
a processor; and
a memory for storing instructions executable by the processor;
wherein the processor is configured to read the executable instructions from the memory and execute the instructions to implement the model robustness detection method of any one of claims 1-8.
11. A computer-readable storage medium, wherein the storage medium stores a computer program for performing the model robustness detection method of any one of claims 1-8.
CN202111673631.6A 2021-12-31 2021-12-31 Model robustness detection method, device, equipment and medium Active CN114419346B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111673631.6A CN114419346B (en) 2021-12-31 2021-12-31 Model robustness detection method, device, equipment and medium

Publications (2)

Publication Number Publication Date
CN114419346A (en) 2022-04-29
CN114419346B (en) 2022-09-30

Family

ID=81271148

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111673631.6A Active CN114419346B (en) 2021-12-31 2021-12-31 Model robustness detection method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN114419346B (en)

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109636624A (en) * 2018-10-29 2019-04-16 平安医疗健康管理股份有限公司 Generation method, device, equipment and the storage medium of air control audit model
US20200380300A1 (en) * 2019-05-30 2020-12-03 Baidu Usa Llc Systems and methods for adversarially robust object detection
CN110222831A (en) * 2019-06-13 2019-09-10 百度在线网络技术(北京)有限公司 Robustness appraisal procedure, device and the storage medium of deep learning model
CN110443350A (en) * 2019-07-10 2019-11-12 平安科技(深圳)有限公司 Model quality detection method, device, terminal and medium based on data analysis
WO2021109695A1 (en) * 2019-12-06 2021-06-10 支付宝(杭州)信息技术有限公司 Adversarial attack detection method and device
CN111950628A (en) * 2020-08-11 2020-11-17 上海交通大学 Robustness evaluation and enhancement system of artificial intelligence image classification model
CN112907552A (en) * 2021-03-09 2021-06-04 百度在线网络技术(北京)有限公司 Robustness detection method, device and program product for image processing model
CN113361328A (en) * 2021-05-08 2021-09-07 浙江工业大学 Robustness assessment and analysis method and system for signal modulation type classification
CN113222074A (en) * 2021-06-15 2021-08-06 百度在线网络技术(北京)有限公司 Method and device for evaluating target detection model
CN113378988A (en) * 2021-07-06 2021-09-10 浙江工业大学 Deep learning system robustness enhancement method and device based on particle swarm optimization
CN113792791A (en) * 2021-09-14 2021-12-14 百度在线网络技术(北京)有限公司 Processing method and device for visual model

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
FUXUN YU et al.: "Interpreting and Evaluating Neural Network Robustness", arXiv *
SUN Hao et al.: "A Survey of Adversarial Robustness Techniques for Deep Convolutional Neural Network Image Recognition Models", Journal of Radars *
WANG Kedi et al.: "A Survey of Model Robustness Research in Adversarial AI Environments", Journal of Cyber Security *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115526055A (en) * 2022-09-30 2022-12-27 北京瑞莱智慧科技有限公司 Model robustness detection method, related device and storage medium
CN115909020A (en) * 2022-09-30 2023-04-04 北京瑞莱智慧科技有限公司 Model robustness detection method, related device and storage medium
CN115909020B (en) * 2022-09-30 2024-01-09 北京瑞莱智慧科技有限公司 Model robustness detection method, related device and storage medium
CN115526055B (en) * 2022-09-30 2024-02-13 北京瑞莱智慧科技有限公司 Model robustness detection method, related device and storage medium

Also Published As

Publication number Publication date
CN114419346B (en) 2022-09-30

Similar Documents

Publication Publication Date Title
CN111709409B (en) Face living body detection method, device, equipment and medium
CN112163638B (en) Method, device, equipment and medium for defending image classification model back door attack
CN101558431B (en) Face authentication device
CN114419346B (en) Model robustness detection method, device, equipment and medium
CN112673381B (en) Method and related device for identifying confrontation sample
Braunegg et al. Apricot: A dataset of physical adversarial attacks on object detection
Chen et al. Backdoor attacks and defenses for deep neural networks in outsourced cloud environments
CN116311214B (en) License plate recognition method and device
Terhörst et al. Pixel-level face image quality assessment for explainable face recognition
CN115050064A (en) Face living body detection method, device, equipment and medium
CN114241587B (en) Evaluation method and device for human face living body detection confrontation robustness
Eyiokur et al. A survey on computer vision based human analysis in the COVID-19 era
Hu et al. On digital image trustworthiness
CN113569611A (en) Image processing method, image processing device, computer equipment and storage medium
WO2023165616A1 (en) Method and system for detecting concealed backdoor of image model, storage medium, and terminal
Yang et al. Deep learning based real-time facial mask detection and crowd monitoring
Zhang The Understanding of Spatial-Temporal Behaviors
CN114638356A (en) Static weight guided deep neural network back door detection method and system
Kazmi et al. From Pixel to Peril: Investigating Adversarial Attacks on Aerial Imagery through Comprehensive Review and Prospective Trajectories
Kartbayev et al. Development of a computer system for identity authentication using artificial neural networks
Tolba et al. 3D Arabic sign language recognition using linear combination of multiple 2D views
Wibowo et al. Mask Use Detection in Public Places Using the Convolutional Neural Network Algorithm
CN117058627B (en) Public place crowd safety distance monitoring method, medium and system
Pooshideh A Novel Active Solution for Two-Dimensional Face Presentation Attack Detection
CN111652290B (en) Method and device for detecting countermeasure sample

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant