CN111325319A - Method, device, equipment and storage medium for detecting neural network model - Google Patents

Method, device, equipment and storage medium for detecting neural network model

Info

Publication number
CN111325319A
CN111325319A
Authority
CN
China
Prior art keywords
neural network
network model
sample
image
initial
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010078047.5A
Other languages
Chinese (zh)
Other versions
CN111325319B (en)
Inventor
李嘉麟
陈锡显
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Cloud Computing Beijing Co Ltd
Original Assignee
Tencent Cloud Computing Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Cloud Computing Beijing Co Ltd filed Critical Tencent Cloud Computing Beijing Co Ltd
Priority to CN202010078047.5A priority Critical patent/CN111325319B/en
Publication of CN111325319A publication Critical patent/CN111325319A/en
Application granted granted Critical
Publication of CN111325319B publication Critical patent/CN111325319B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a method, a device, equipment and a storage medium for detecting a neural network model; the method comprises the following steps: constructing an attention map of the first neural network model based on the structural features of the first neural network model; constructing an adversarial sample function based on the attention map of the first neural network model and the classification result of the first neural network model for the initial sample; performing update iteration processing based on the adversarial sample function and the initial sample, and taking the iteration result as the adversarial sample corresponding to the initial sample; and classifying the adversarial sample through the second neural network model to obtain the classification result of the adversarial sample, and determining the probability of correct classification by the second neural network model according to the difference between the classification result of the adversarial sample and the classification result of the initial sample. With the method and the device, accurate adversarial samples can be acquired automatically and the probability of correct classification by the neural network model can be determined, thereby improving the efficiency of robustness detection.

Description

Method, device, equipment and storage medium for detecting neural network model
Technical Field
The present invention relates to artificial intelligence technology, and in particular, to a method and an apparatus for detecting a neural network model, an electronic device, and a storage medium.
Background
Artificial Intelligence (AI) is a comprehensive discipline of computer science that studies the design principles and implementation methods of various intelligent machines, so that machines have the capabilities of perception, reasoning and decision making. Artificial intelligence technology covers a wide range of fields, for example natural language processing and machine learning/deep learning; as the technology develops, it will be applied in more fields and deliver increasingly important value.
Neural networks have attracted a great deal of attention in both academic and industrial fields and have achieved breakthrough results in a number of application areas, including image recognition, speech recognition, and text classification. Robustness is an important characteristic for detecting the stability performance of the neural network, and the detection of the robustness of the neural network is an important research direction.
However, in the related art, when the robustness of a neural network needs to be detected, the adversarial samples used for detection are mainly selected manually and at random; detecting the robustness of a neural network with randomly selected adversarial samples has low accuracy.
Disclosure of Invention
The embodiment of the invention provides a detection method and device for a neural network model, an electronic device, and a storage medium, which can automatically obtain accurate adversarial samples and improve the probability of correct classification by the neural network model, thereby improving the efficiency of robustness detection.
The technical scheme of the embodiment of the invention is realized as follows:
the embodiment of the invention provides a method for detecting an image neural network model, which comprises the following steps:
constructing an attention map of a first image neural network model based on structural features of the first image neural network model;
constructing an adversarial image sample function based on the attention map of the first image neural network model and a classification result of the first image neural network model for an initial image sample;
performing update iteration processing based on the adversarial image sample function and the initial image sample, and taking the iteration result as an adversarial image sample corresponding to the initial image sample;
classifying the adversarial image sample through a second image neural network model to obtain a classification result of the adversarial image sample, and
determining the probability of correct image classification by the second image neural network model according to the difference between the classification result of the adversarial image sample and the classification result of the initial image sample.
The embodiment of the invention provides a detection method of a neural network model, which comprises the following steps:
constructing an attention map of a first neural network model based on structural features of the first neural network model;
constructing an adversarial sample function based on the attention map of the first neural network model and a classification result of the first neural network model for an initial sample;
performing update iteration processing based on the adversarial sample function and the initial sample, and taking the iteration result as an adversarial sample corresponding to the initial sample;
classifying the adversarial sample through a second neural network model to obtain a classification result of the adversarial sample, and
determining the robustness of the second neural network model according to the difference between the classification result of the adversarial sample and the classification result of the initial sample.
The embodiment of the invention provides a detection device of a neural network model, which comprises:
the first processing module is used for constructing an attention map of a first neural network model based on structural features of the first neural network model;
the construction module is used for constructing an adversarial sample function based on the attention map of the first neural network model and the classification result of the first neural network model for the initial sample;
the updating module is used for performing update iteration processing based on the adversarial sample function and the initial sample, and taking the iteration result as the adversarial sample corresponding to the initial sample;
and the second processing module is used for classifying the adversarial sample through a second neural network model to obtain the classification result of the adversarial sample, and determining the robustness of the second neural network model according to the difference between the classification result of the adversarial sample and the classification result of the initial sample.
In the above technical solution, the apparatus further includes:
a first determining module for determining a similarity of each of a plurality of third neural network models to the second neural network model based on functions of the third neural network models, input data, and output data;
and determining the third neural network model corresponding to the maximum similarity as the first neural network model.
In the above technical solution, the apparatus further includes:
a second determining unit, configured to construct, according to a function of the second neural network model, the first neural network model having the same function as the second neural network model;
and training the first neural network model based on the input data and the output data of the second neural network model to obtain a trained first neural network model.
In the above technical solution, the apparatus further includes:
the first classification module is used for classifying the initial samples through the first neural network model to obtain a classification result of the first neural network model for the initial samples;
the first classification module is further configured to perform encoding processing on the initial sample through the first neural network model to obtain features of the initial sample;
classifying the characteristics of the initial sample to obtain a class label corresponding to the initial sample; wherein the type of the initial sample comprises one of: an image sample; a text sample; a speech sample.
In the above technical solution, the first processing module is further configured to perform attention processing on each layer structure in the first neural network model to obtain attention information of each layer structure;
and combining the attention information of the structures of all layers to obtain an attention map of the first neural network model.
In the above technical solution, the first processing module is further configured to determine output characteristics of each layer structure in the first neural network model corresponding to the initial sample;
and carrying out derivation processing aiming at the initial sample on the output characteristics of the initial sample corresponding to each layer structure, and taking a derivation result as the attention information of each layer structure corresponding to the initial sample.
In the above technical solution, the construction module is further configured to construct a loss function of the first neural network model based on the classification result of the first neural network model for the initial sample and the adversarial sample;
and construct an adversarial sample function for iterative updating based on the loss function of the first neural network model, the attention map of the first neural network model, and the initial sample.
In the above technical solution, the construction module is further configured to perform dot multiplication on the attention information of each layer structure in the first neural network model corresponding to the initial sample and the output features of each layer structure corresponding to the adversarial sample to obtain a first dot-product result;
perform dot multiplication on the attention information of each layer structure in the first neural network model corresponding to the initial sample and the output features of each layer structure corresponding to the initial sample to obtain a second dot-product result;
determine the difference between the first dot-product result and the second dot-product result;
and construct, with the loss function of the first neural network model, the initial sample, and the difference as parameters, a function for iterating the initial sample and taking the iteration result as the adversarial sample.
In the above technical solution, the construction module is further configured to perform difference processing (subtraction) on the first dot-product result and the second dot-product result to obtain a processing result;
and determine the square of the processing result as the difference between the first dot-product result and the second dot-product result.
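As an illustration of the difference term described above, the following is a minimal Python/PyTorch sketch (not the patent's own code); the function name attention_feature_difference, and the assumption that the per-layer attention information and output features are available as lists of equally shaped tensors, are hypothetical:

    import torch

    def attention_feature_difference(attn_maps, feats_adv, feats_clean):
        # attn_maps[i]: attention information S_i of the i-th layer for the initial sample
        # feats_adv[i]: output features of the i-th layer for the adversarial sample
        # feats_clean[i]: output features of the i-th layer for the initial sample
        diff = 0.0
        for s_i, t_adv, t_clean in zip(attn_maps, feats_adv, feats_clean):
            first = (s_i * t_adv).sum()      # first dot-product result
            second = (s_i * t_clean).sum()   # second dot-product result
            diff = diff + (first - second) ** 2  # squared difference between the two results
        return diff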
In the above technical solution, the apparatus further includes:
a third processing module, configured to integrate, into the adversarial sample function, a difference constraint of the adversarial sample relative to the initial sample, so that
the adversarial sample differs from the initial sample by no more than a difference threshold.
In the above technical solution, the updating module is further configured to take the initial sample as the initial adversarial sample and substitute it into the adversarial sample function to determine the derivative of the adversarial sample function with respect to the initial adversarial sample;
take the sum of the derivative and the initial adversarial sample as a new adversarial sample and substitute it into the adversarial sample function to continue iteratively updating the new adversarial sample;
and take the new adversarial sample obtained by iterative updating when the adversarial sample function converges as the adversarial sample corresponding to the initial sample.
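A minimal sketch of this update iteration in Python/PyTorch, assuming a callable adv_fn that evaluates the adversarial sample function for a candidate sample; the function names, the step count, and the convergence tolerance are illustrative assumptions rather than values taken from the patent:

    import torch

    def iterate_adversarial_sample(adv_fn, x0, max_steps=50, tol=1e-6):
        x_adv = x0.clone().detach().requires_grad_(True)   # initial adversarial sample = initial sample
        prev_value = None
        for _ in range(max_steps):
            value = adv_fn(x_adv)                           # substitute into the adversarial sample function
            grad, = torch.autograd.grad(value, x_adv)       # derivative with respect to the current sample
            x_adv = (x_adv + grad).detach().requires_grad_(True)  # new sample = derivative + current sample
            if prev_value is not None and abs(value.item() - prev_value) < tol:
                break                                       # treat a stalled function value as convergence
            prev_value = value.item()
        return x_adv.detach()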
In the above technical solution, the apparatus further includes:
the second classification module is used for performing classification processing on the initial sample through the second neural network model to obtain a classification result of the initial sample;
the device further comprises:
and the training module is used for training the second neural network model based on the adversarial samples when the probability of correct classification by the second neural network model is lower than a probability threshold, so that the probability of correct classification by the second neural network model becomes higher than the probability threshold.
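For illustration, a generic fine-tuning sketch of this retraining step in Python/PyTorch; the function name, optimizer, learning rate, and epoch count are assumptions rather than settings specified by the patent:

    import torch
    import torch.nn.functional as F

    def adversarial_finetune(second_model, x_adv, y_true, epochs=5, lr=1e-4):
        # Train the model under test on adversarial samples (with their true labels)
        # so that its classification accuracy on them rises above the threshold.
        optimizer = torch.optim.Adam(second_model.parameters(), lr=lr)
        second_model.train()
        for _ in range(epochs):
            optimizer.zero_grad()
            loss = F.cross_entropy(second_model(x_adv), y_true)
            loss.backward()
            optimizer.step()
        return second_model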
The embodiment of the invention provides a detection device of a neural network model, which comprises:
a memory for storing executable instructions;
and the processor is used for implementing the detection method of the image neural network model provided by the embodiment of the invention or the detection method of the neural network model provided by the embodiment of the invention when the executable instructions stored in the memory are executed.
The embodiment of the present invention provides a computer-readable storage medium, which stores executable instructions for causing a processor to execute the method for detecting an image neural network model provided in the embodiment of the present invention or the method for detecting a neural network model provided in the embodiment of the present invention.
The embodiment of the invention has the following beneficial effects:
constructing an attention map of the first neural network model from the structural features of the first neural network model, and introducing the attention map of the first neural network model into the adversarial sample function, so that an accurate adversarial sample corresponding to the initial sample can be obtained automatically through update iteration of the adversarial sample function; based on the accurate adversarial sample, robustness detection can be performed on the second neural network model quickly and accurately, improving the efficiency of robustness detection.
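As a concrete illustration of this detection step, the following Python/PyTorch sketch compares the second model's predictions on the initial samples and on the adversarial samples generated from them; the function name robustness_score and the use of prediction agreement as the measure of correct classification are assumptions for illustration only:

    import torch

    @torch.no_grad()
    def robustness_score(second_model, x_clean, x_adv):
        # Fraction of adversarial samples whose predicted class matches the prediction
        # for the corresponding initial sample (higher = more robust second model).
        pred_clean = second_model(x_clean).argmax(dim=1)
        pred_adv = second_model(x_adv).argmax(dim=1)
        return (pred_adv == pred_clean).float().mean().item()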
Drawings
Fig. 1 is a schematic application scenario diagram of a detection system 10 of a neural network model provided in an embodiment of the present invention;
FIG. 2 is a schematic diagram of face recognition provided by an embodiment of the present invention;
FIG. 3 is a schematic diagram of speech recognition provided by an embodiment of the present invention;
fig. 4 is a schematic structural diagram of a detection apparatus 500 of a neural network model provided in an embodiment of the present invention;
Figs. 5-8 are schematic flow charts of a method for detecting a neural network model provided by an embodiment of the present invention;
fig. 9 is a schematic application scenario diagram of a detection system 10A of an image neural network model according to an embodiment of the present invention;
FIG. 10 is a flowchart illustrating a method for detecting an image neural network model according to an embodiment of the present invention;
FIG. 11 is an alternative structural diagram of a deep neural network provided by an embodiment of the present invention;
FIG. 12 is a schematic diagram of an adversarial attack provided by an embodiment of the present invention;
FIG. 13 is a schematic flow chart of an alternative method for detecting a neural network model according to an embodiment of the present invention;
Figs. 14A-14B are attention maps corresponding to different neural network models provided by embodiments of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention will be further described in detail with reference to the accompanying drawings. The described embodiments should not be construed as limiting the present invention, and all other embodiments obtained by a person of ordinary skill in the art without creative effort shall fall within the protection scope of the present invention.
In the description that follows, the terms "first", "second", and the like are intended only to distinguish similar objects and do not indicate a particular ordering of the objects. It should be understood that "first", "second", and the like may be interchanged in certain circumstances or orders, so that the embodiments of the invention described herein can be practiced in an order other than that illustrated or described herein.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used herein is for the purpose of describing embodiments of the invention only and is not intended to be limiting of the invention.
Before the embodiments of the present invention are described in further detail, the terms and expressions used in the embodiments of the present invention are explained; the following explanations apply to these terms and expressions.
1) Image recognition: the technology of using computers to process, analyze, and understand images in order to identify targets and objects in various patterns; it is a typical practical application of deep learning algorithms. Image recognition technology is generally divided into face recognition and object recognition: face recognition is mainly applied to security inspection, identity verification, and mobile payment, while object recognition is mainly applied to the commodity circulation process, particularly to unmanned retail fields such as unattended shelves and smart retail cabinets.
2) And (3) voice recognition: techniques for a machine to translate speech signals into corresponding text or commands through a recognition and understanding process. The voice recognition technology mainly comprises three aspects of a feature extraction technology, a pattern matching criterion and a model training technology.
3) Natural Language Processing (NLP): an important direction in the fields of computer science and artificial intelligence that enables effective communication between humans and computers in natural language. Natural language processing is a science that integrates linguistics, computer science, and mathematics; it studies natural language, i.e., the language people use daily, and is therefore closely related to linguistics. Natural language processing techniques typically include text processing, semantic understanding, machine translation, robot question answering, knowledge graphs, and the like.
4) Text classification: a computer automatically classifies and labels a text set (or other entities or objects) according to a certain classification system or standard; it is automatic classification based on a classification system. Text classification generally includes processes such as text representation, classifier selection and training, and evaluation of and feedback on classification results, where text representation can be further divided into steps such as text preprocessing, indexing and statistics, and feature extraction. The overall functional modules of a text classification system are as follows: (1) preprocessing: formatting the original corpus into the same format to facilitate subsequent unified processing; (2) indexing: decomposing documents into basic processing units while reducing the overhead of subsequent processing; (3) statistics: word frequency statistics and the associated probabilities of terms (words, concepts) and categories; (4) feature extraction: extracting features that reflect the document topic from the document; (5) classifier: training the classifier; (6) evaluation: analyzing the test results of the classifier.
5) Adversarial sample: for a trained neural network, an initial sample can be recognized correctly, but adding small perturbations to the initial sample produces an adversarial sample. The human eye can barely distinguish the adversarial sample from the initial sample, yet the trained neural network cannot recognize the adversarial sample correctly; in other words, data that has been modified in a way imperceptible to humans but is misrecognized by the machine is an adversarial sample. For example, for a trained neural network used for picture classification and a picture that it classifies correctly, small perturbations are artificially added to the pixels of the picture; the trained neural network will, with high probability, misclassify the perturbed picture, whereas the human eye does not misclassify it and may not even perceive the perturbation. The perturbed picture is an adversarial sample.
6) Robustness: the ability of a system or organization to withstand or overcome adverse conditions. For example, in machine learning, noise may be added when training a model (e.g., adversarial training) to test the robustness of the algorithm, which here can be understood as the algorithm's tolerance to data changes. Robustness is often used to describe the ability of a complex system to adapt when more comprehensive factors are taken into account.
7) Attention Mechanism: in the training process of a neural network, a way of learning the importance of different parts of the input and integrating them accordingly. It involves two main aspects: deciding which part of the input the neural network model needs to pay attention to, and allocating limited information-processing resources to the important parts. The attention mechanism enables the neural network to focus on a subset of its inputs (or features), i.e., to select particular inputs, and it can be applied to any type of input regardless of its shape. When computing power is limited, the attention mechanism is a resource allocation scheme and the primary means of solving the information overload problem, allocating computing resources to more important tasks.
The neural network described in the embodiment of the present invention may be applied to various fields, for example, an image recognition neural network, a speech recognition neural network, a text classification neural network, and the like.
In order to at least solve the above technical problems of the related art, embodiments of the present invention provide a method and an apparatus for detecting a neural network model, an electronic device, and a storage medium, which can automatically obtain accurate adversarial samples and improve the efficiency of robustness detection. An exemplary application of the detection device of the neural network model provided in the embodiments of the present invention is described below. The device may be a server, for example a server deployed in the cloud: according to a first neural network model (reference neural network model), a second neural network model (neural network model to be detected), and an initial sample provided by other devices or users, it constructs an attention map of the first neural network model based on the structural features of the first neural network model, constructs an adversarial sample optimization function (i.e., an adversarial sample function) based on the attention map and the classification result of the first neural network model for the initial sample, obtains an accurate adversarial sample corresponding to the initial sample through the adversarial sample optimization function, detects the robustness of the second neural network model according to the accurate adversarial sample, and presents the robustness of the second neural network model to the user. The device may also be any of various types of user terminals, such as a notebook computer, a tablet computer, a desktop computer, or a mobile device (e.g., a mobile phone or a personal digital assistant), for example a handheld terminal: according to the first neural network model, the second neural network model, and the initial sample input by the user on the handheld terminal, it obtains the adversarial sample corresponding to the initial sample, performs robustness detection on the second neural network model based on the adversarial sample, obtains the robustness of the second neural network model, and displays it on the display interface of the handheld terminal, so that the user can learn the robustness of the second neural network model.
By way of example, referring to fig. 1, fig. 1 is a schematic view of an application scenario of a detection system 10 of a neural network model provided in an embodiment of the present invention, a terminal 200 is connected to a server 100 through a network 300, and the network 300 may be a wide area network or a local area network, or a combination of both.
The terminal 200 may be used to obtain the first neural network model, the second neural network model and the initial samples, for example, when the user inputs the first neural network model, the second neural network model and the initial samples through the input interface, the terminal automatically obtains the first neural network model, the second neural network model and the initial samples input by the user after the input is completed.
In some embodiments, the terminal 200 locally executes the detection method of the neural network model provided by the embodiments of the present invention, to obtain the adversarial sample corresponding to the initial sample according to the first neural network model, the second neural network model, and the initial sample input by the user, and performs robustness detection on the second neural network model based on the adversarial sample to obtain the robustness of the second neural network model. For example, a detection assistant is installed on the terminal 200, and the user inputs the first neural network model, the second neural network model, and the initial sample in the detection assistant; according to these inputs, the terminal 200 constructs an attention map of the first neural network model based on the structural features of the first neural network model, constructs an adversarial sample optimization function based on the attention map and the classification result of the first neural network model for the initial sample, obtains an accurate adversarial sample corresponding to the initial sample through the adversarial sample optimization function, detects the robustness of the second neural network model according to the accurate adversarial sample, and displays the robustness of the second neural network model on the display interface 210 of the terminal 200, so that the user can learn the robustness of the second neural network model and select a second neural network model with high robustness for applications such as image recognition, speech recognition, and text classification.
In some embodiments, the terminal 200 may also send the first neural network model, the second neural network model, and the initial sample input by the user on the terminal 200 to the server 100 through the network 300 and invoke the neural-network-model detection function provided by the server 100, and the server 100 obtains the robustness of the second neural network model through the detection method of the neural network model provided by the embodiments of the present invention. For example, a detection assistant is installed on the terminal 200, and the user inputs the first neural network model, the second neural network model, and the initial sample in the detection assistant; the terminal sends them to the server 100 through the network 300; the server 100 receives the first neural network model, the second neural network model, and the initial sample, constructs an attention map of the first neural network model based on the structural features of the first neural network model, constructs an adversarial sample optimization function based on the attention map and the classification result of the first neural network model for the initial sample, obtains an accurate adversarial sample corresponding to the initial sample through the adversarial sample optimization function, detects the robustness of the second neural network model according to the accurate adversarial sample, and returns the robustness of the second neural network model to the detection assistant, which displays it on the display interface 210 of the terminal 200; alternatively, the server 100 directly provides the robustness of the second neural network model, so that the user can learn the robustness of the second neural network model and select a second neural network model with high robustness for applications such as image recognition, speech recognition, and text classification.
In one implementation scenario, to detect the robustness of image recognition, the server or the terminal may, according to the input first image-recognition neural network model (reference neural network model), second image-recognition neural network model (neural network model to be detected), and initial sample (initial image sample), construct an attention map of the first image-recognition neural network model based on its structural features, construct an adversarial sample optimization function based on the attention map of the first image-recognition neural network model and the classification result of the first image-recognition neural network model for the initial sample, perform update iteration processing based on the adversarial sample optimization function and the initial sample, take the iteration result as the adversarial sample (adversarial image sample) corresponding to the initial sample, classify the adversarial sample through the second image-recognition neural network model to obtain the classification result of the adversarial sample, and determine the robustness of the second image-recognition neural network model according to the difference between the classification result of the adversarial sample and the classification result of the initial sample, thereby determining the robustness of the neural network model to be detected. When the robustness of the neural network model to be detected is strong, it can be applied to image recognition scenarios.
For example, in a face recognition system, in order to detect the robustness of a face recognition system to be detected (including a face-recognition neural network model to be detected), the neural-network-model detection function provided by the embodiment of the invention is invoked: a face adversarial sample optimization function is constructed according to a known face recognition system (including a reference face-recognition neural network model) and an initial face sample, a face adversarial sample corresponding to the initial face sample is generated according to the optimization function, and the robustness of the face recognition system to be detected is detected through the accurate face adversarial sample. When it is determined that the robustness of the face recognition system to be detected is poor, developers need to be reminded to improve its robustness and strengthen the safety factor, so as to prevent malicious actors from attacking the face recognition system to be detected. When it is determined that the robustness of the face recognition system to be detected is strong, it can be applied to face recognition scenarios. Referring to FIG. 2, FIG. 2 is a schematic diagram of face recognition provided by an embodiment of the present invention: even if noise is added to the faces of Xiao Ming, Xiao Qiang, and Xiao Hua in FIG. 2, the face recognition system to be detected can still accurately recognize that the faces in FIG. 2 correspond to Xiao Ming, Xiao Qiang, and Xiao Hua respectively.
For example, in an automatic driving system, in order to detect the robustness of an automatic driving system to be detected (including a vehicle-recognition neural network model to be detected), the neural-network-model detection function provided by the embodiment of the invention is invoked: based on a known vehicle recognition system (including a reference vehicle-recognition neural network model) and an initial vehicle sample, an adversarial vehicle sample optimization function is constructed, an adversarial vehicle sample corresponding to the initial vehicle sample is generated according to the optimization function, and the robustness of the automatic driving system to be detected is detected through the accurate adversarial vehicle sample. When it is determined that the robustness of the automatic driving system is poor (the adversarial vehicle sample cannot be recognized accurately), developers need to be reminded to improve the robustness of the automatic driving system to be detected and strengthen the safety factor, so as to avoid traffic accidents caused by malicious actors attacking the automatic driving system to be detected. When it is determined that the robustness of the automatic driving system to be detected is strong, it can be applied to automatic driving scenarios, improving quality of life.
In one implementation scenario, to detect the robustness of speech recognition, the server or the terminal may, according to the input first speech-recognition neural network model (reference neural network model), second speech-recognition neural network model (neural network model to be detected), and initial sample (initial speech sample), construct an attention map of the first speech-recognition neural network model based on its structural features, construct an adversarial sample optimization function based on the attention map of the first speech-recognition neural network model and the classification result of the first speech-recognition neural network model for the initial sample, perform update iteration processing based on the adversarial sample optimization function and the initial sample, take the iteration result as the adversarial sample (adversarial speech sample) corresponding to the initial sample, classify the adversarial sample through the second speech-recognition neural network model to obtain the classification result of the adversarial sample, and determine the robustness of the second speech-recognition neural network model according to the difference between the classification result of the adversarial sample and the classification result of the initial sample, thereby determining the robustness of the neural network model to be detected. When the robustness of the neural network model to be detected is strong, it can be applied to speech recognition scenarios.
For example, in an access control system, in order to detect the robustness of an access control system to be detected (including a speech-recognition neural network model to be detected), the neural-network-model detection function provided by the embodiment of the invention is invoked: based on a known speech recognition system (including a reference speech-recognition neural network model) and an initial speech sample, an adversarial speech sample optimization function is constructed, an adversarial speech sample corresponding to the initial speech sample is generated according to the optimization function, and the robustness of the access control system to be detected is detected through the accurate adversarial speech sample. When it is determined that the robustness of the access control system is poor (the adversarial speech sample cannot be recognized accurately, i.e., a particular person's voice cannot be recognized accurately), developers need to be reminded to improve the robustness of the access control system to be detected and strengthen the safety factor, so as to prevent malicious actors from attacking the access control system and entering and leaving at will. When it is determined that the robustness of the access control system to be detected is strong, it can be applied to various access control scenarios, improving quality of life. Referring to FIG. 3, FIG. 3 is a schematic diagram of speech recognition provided by an embodiment of the present invention: even if noise is added to Xiao Qiang's voice in FIG. 3, the access control system to be detected can still accurately recognize that the voice in FIG. 3 is Xiao Qiang's, and then decide whether to open the door according to Xiao Qiang's access permission.
In one implementation scenario, to detect the robustness of text recognition, the server or the terminal may, according to the input first text-recognition neural network model (reference neural network model), second text-recognition neural network model (neural network model to be detected), and initial sample (initial text sample), construct an attention map of the first text-recognition neural network model based on its structural features, construct an adversarial sample optimization function based on the attention map of the first text-recognition neural network model and the classification result of the first text-recognition neural network model for the initial sample, perform update iteration processing based on the adversarial sample optimization function and the initial sample, take the iteration result as the adversarial sample (adversarial text sample) corresponding to the initial sample, classify the adversarial sample through the second text-recognition neural network model to obtain the classification result of the adversarial sample, and determine the robustness of the second text-recognition neural network model according to the difference between the classification result of the adversarial sample and the classification result of the initial sample, thereby determining the robustness of the neural network model to be detected. When the robustness of the neural network model to be detected is strong, it can be applied to text recognition scenarios.
For example, in an advertisement recommendation system, in order to detect the robustness of an advertisement recommendation system to be detected (including an advertisement-recognition neural network model to be detected), the neural-network-model detection function provided by the embodiment of the invention is invoked: based on a known advertisement recognition system (including a reference advertisement-recognition neural network model) and an initial advertisement sample, an adversarial advertisement sample optimization function is constructed, an adversarial advertisement sample corresponding to the initial advertisement sample is generated according to the optimization function, and the robustness of the advertisement recommendation system to be detected is detected through the accurate adversarial advertisement sample. When it is determined that the robustness of the advertisement recommendation system is poor (the adversarial advertisement sample cannot be recognized accurately, i.e., a certain advertisement cannot be recognized accurately and therefore cannot be recommended accurately), developers need to be reminded to improve the robustness of the advertisement recommendation system to be detected so as to achieve accurate advertisement recommendation. When it is determined that the robustness of the advertisement recommendation system to be detected is strong, it can be applied to various advertisement recommendation scenarios.
Continuing with the structure of the detecting device of the neural network model provided by the embodiment of the present invention, the detecting device of the neural network model may be various terminals, such as a mobile phone, a computer, etc., or may be the server 100 shown in fig. 1.
Referring to fig. 4, fig. 4 is a schematic structural diagram of a detection apparatus 500 of a neural network model according to an embodiment of the present invention, where the detection apparatus 500 of the neural network model shown in fig. 4 includes: at least one processor 510, memory 550, at least one network interface 520, and a user interface 530. The various components in the neural network model detection device 500 are coupled together by a bus system 540. It is understood that the bus system 540 is used to enable communications among the components. The bus system 540 includes a power bus, a control bus, and a status signal bus in addition to a data bus. For clarity of illustration, however, the various buses are labeled as bus system 540 in fig. 4.
The Processor 510 may be an integrated circuit chip having Signal processing capabilities, such as a general purpose Processor, a Digital Signal Processor (DSP), or other programmable logic device, discrete gate or transistor logic device, discrete hardware components, or the like, wherein the general purpose Processor may be a microprocessor or any conventional Processor, or the like.
The user interface 530 includes one or more output devices 531 enabling presentation of media content, including one or more speakers and/or one or more visual display screens. The user interface 530 also includes one or more input devices 532, including user interface components to facilitate user input, such as a keyboard, mouse, microphone, touch screen display, camera, other input buttons and controls.
The memory 550 may comprise volatile memory or nonvolatile memory, and may also comprise both volatile and nonvolatile memory. The non-volatile Memory may be a Read Only Memory (ROM), and the volatile Memory may be a Random Access Memory (RAM). The memory 550 described in connection with embodiments of the invention is intended to comprise any suitable type of memory. Memory 550 optionally includes one or more storage devices physically located remote from processor 510.
In some embodiments, memory 550 can store data to support various operations, examples of which include programs, modules, and data structures, or subsets or supersets thereof, as exemplified below.
An operating system 551 including system programs for processing various basic system services and performing hardware-related tasks, such as a framework layer, a core library layer, a driver layer, etc., for implementing various basic services and processing hardware-based tasks;
a network communication module 552 for communicating with other computing devices via one or more (wired or wireless) network interfaces 520, exemplary network interfaces 520 including: Bluetooth, Wireless Fidelity (Wi-Fi), Universal Serial Bus (USB), and the like;
a display module 553 for enabling presentation of information (e.g., a user interface for operating peripherals and displaying content and information) via one or more output devices 531 (e.g., a display screen, speakers, etc.) associated with the user interface 530;
an input processing module 554 to detect one or more user inputs or interactions from one of the one or more input devices 532 and to translate the detected inputs or interactions.
In some embodiments, the detection Device of the neural network model provided in the embodiments of the present invention may be implemented by combining software and hardware, and as an example, the detection Device of the neural network model provided in the embodiments of the present invention may be a processor in the form of a hardware decoding processor, which is programmed to execute the detection method of the neural network model provided in the embodiments of the present invention, for example, the processor in the form of the hardware decoding processor may employ one or more Application Specific Integrated Circuits (ASICs), DSPs, Programmable Logic Devices (PLDs), Complex Programmable Logic Devices (CPLDs), Field Programmable Gate Arrays (FPGAs), or other electronic components.
In other embodiments, the detection apparatus for a neural network model provided in the embodiments of the present invention may be implemented in software, and fig. 4 illustrates the detection apparatus 555 for a neural network model stored in the memory 550, which may be software in the form of programs and plug-ins, and includes a series of modules, including a first processing module 5551, a building module 5552, an updating module 5553, a second processing module 5554, a first determining module 5555, a second determining module 5556, a first classifying module 5557, a third processing module 5558, a second classifying module 5559, and a training module 5560; the first processing module 5551, the building module 5552, the updating module 5553, the second processing module 5554, the first determining module 5555, the second determining module 5556, the first classifying module 5557, the third processing module 5558, the second classifying module 5559, and the training module 5560 are used to implement the method for detecting the neural network model provided by the embodiment of the invention.
As can be understood from the foregoing, the detection method of the neural network model provided in the embodiment of the present invention may be implemented by various types of detection devices of the neural network model, such as an intelligent terminal and a server.
The following describes a method for detecting a neural network model according to an embodiment of the present invention, with reference to an exemplary application and implementation of a server according to an embodiment of the present invention. Referring to fig. 5, fig. 5 is a schematic flowchart of a method for detecting a neural network model according to an embodiment of the present invention, which is described with reference to the steps shown in fig. 5.
In step 101, an attention map of the first neural network model is constructed based on structural features of the first neural network model.
After the user inputs the first neural network model, the initial sample, and the second neural network model at the terminal, the terminal may send them to the server. After receiving the first neural network model, the initial sample, and the second neural network model, the server may construct an attention map of the first neural network model based on the structural features of the first neural network model, so that an adversarial sample optimization function can subsequently be constructed from the attention map of the first neural network model to generate the adversarial sample corresponding to the initial sample.
Referring to FIG. 6, FIG. 6 is an alternative flowchart provided by an embodiment of the present invention; in some embodiments, FIG. 6 shows that, on the basis of FIG. 5, the flow further includes step 106.
In step 106, determining a similarity of each third neural network model to the second neural network model based on the functions of the plurality of third neural network models, the input data, and the output data; and determining the third neural network model corresponding to the maximum similarity as the first neural network model.
In order to determine a first neural network model (reference neural network model) that is similar to the second neural network model (neural network model to be detected), a plurality of third neural network models (candidate neural network models) having functions similar to those of the second neural network model may be determined. The functional similarity of each third neural network model to the second neural network model is determined based on the functions of the plurality of third neural network models and the function of the second neural network model; for example, when a third neural network model performs face recognition and the second neural network model performs pedestrian recognition, their functional similarity is 80%, whereas when a third neural network model performs face recognition and the second neural network model performs vehicle recognition, their functional similarity is 20%. The input-data similarity of each third neural network model to the second neural network model is determined based on the input data of the plurality of third neural network models and the input data of the second neural network model; for example, when the input data of a third neural network model are faces and the input data of the second neural network model are pedestrians, their input-data similarity is 80%, and when the input data of a third neural network model are faces and the input data of the second neural network model are vehicles, their input-data similarity is 20%. Likewise, the output-data similarity of each third neural network model to the second neural network model is determined based on the output data of the plurality of third neural network models and the output data of the second neural network model; for example, when the output data of a third neural network model are faces and the output data of the second neural network model are pedestrians, their output-data similarity is 80%, and when the output data of a third neural network model are faces and the output data of the second neural network model are vehicles, their output-data similarity is 20%. The functional similarity, input-data similarity, and output-data similarity of each third neural network model to the second neural network model are combined to obtain the overall similarity of each third neural network model to the second neural network model, and the third neural network model corresponding to the maximum similarity is determined as the first neural network model, i.e., one of the candidate neural network models is determined as the reference neural network model.
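A small illustrative sketch of this selection step in Python; the candidate names, the example scores, and the equal weighting of the three similarities are hypothetical, since the patent does not fix a particular combination rule:

    def overall_similarity(func_sim, input_sim, output_sim, weights=(1/3, 1/3, 1/3)):
        # Combine functional, input-data, and output-data similarity into one score.
        return sum(w * s for w, s in zip(weights, (func_sim, input_sim, output_sim)))

    # Hypothetical candidates with (functional, input-data, output-data) similarities.
    candidates = {
        "face_recognition_model": (0.8, 0.8, 0.8),
        "vehicle_recognition_model": (0.2, 0.2, 0.2),
    }
    # The candidate with the maximum overall similarity becomes the first (reference) model.
    reference_name = max(candidates, key=lambda name: overall_similarity(*candidates[name]))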
In some embodiments, prior to constructing the attention map of the first neural network model, the method further comprises: according to the function of the second neural network model, constructing a first neural network model with the same function as the second neural network model; and training the first neural network model based on the input data and the output data of the second neural network model to obtain the trained first neural network model.
When no first neural network model similar to the second neural network model exists, a first neural network model having the same function as the second neural network model can be constructed according to the function of the second neural network model; for example, if the second neural network model is used for face recognition, a first neural network model for face recognition is constructed. The first neural network model is then trained using the input data and output data of the second neural network model to obtain a trained first neural network model, so that the trained first neural network model approximates the second neural network model and can be used to generate adversarial samples from initial samples.
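A minimal training sketch for this case in Python/PyTorch, treating the second model's recorded inputs and output probabilities as the training data for the constructed first model; the KL-divergence objective, optimizer, and hyperparameters are illustrative assumptions:

    import torch
    import torch.nn.functional as F

    def train_first_model(first_model, recorded_inputs, recorded_outputs, epochs=10, lr=1e-3):
        # recorded_inputs / recorded_outputs: batches of inputs fed to the second model
        # and the class probabilities it produced for them.
        optimizer = torch.optim.Adam(first_model.parameters(), lr=lr)
        first_model.train()
        for _ in range(epochs):
            for x, target_probs in zip(recorded_inputs, recorded_outputs):
                optimizer.zero_grad()
                log_probs = F.log_softmax(first_model(x), dim=1)
                loss = F.kl_div(log_probs, target_probs, reduction="batchmean")
                loss.backward()
                optimizer.step()
        return first_model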
Referring to fig. 7, fig. 7 is an alternative flowchart provided by an embodiment of the present invention, and in some embodiments, fig. 7 shows that step 101 in fig. 5 may be implemented by steps 1011 to 1012 shown in fig. 7.
In step 1011, performing attention processing on each layer structure in the first neural network model to obtain attention information of each layer structure;
in step 1012, the attention information of each layer structure is combined to obtain an attention map of the first neural network model.
In order to obtain the important structural features of the first neural network model, attention processing needs to be performed on each layer structure in the known first neural network model to obtain the attention information of each layer structure, so that the importance of each layer structure is extracted and an accurate confrontation sample can subsequently be generated; the attention information of the respective layer structures is then combined to obtain the attention map of the first neural network model, i.e., to determine the importance of each layer structure.
In some embodiments, the attention processing of the respective layer structures in the first neural network model to obtain the attention information of the respective layer structures includes: determining output characteristics of each layer structure in the first neural network model corresponding to the initial sample; and performing derivation processing aiming at the initial sample on the output characteristics of the initial sample corresponding to each layer structure, and taking the derivation result as the attention information of the initial sample corresponding to each layer structure.
In the case that each layer structure of the first neural network model is known, the output feature of each layer structure corresponding to the initial sample can be determined. For example, for the known i-th layer structure of the first neural network model (e.g., a convolutional layer, a pooling layer, etc.), the output feature of the i-th layer structure is determined as the matrix $T_i(x)$, where $x$ represents the input initial sample; $T_i(x)$ is differentiated with respect to $x$, and the absolute value of the derivative is the matrix $S_i(x)$, i.e., the attention information (importance) of the i-th layer structure, where the matrix $S_i(x)$ and the matrix $T_i(x)$ are of the same size.
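A minimal sketch of collecting the per-layer output features $T_i(x)$ and attention information $S_i(x)$ is given below, assuming PyTorch. Because the full derivative of a feature map with respect to the input does not by itself have the same size as $T_i(x)$, this sketch takes $S_i(x)$ as the absolute gradient of the model's top class score with respect to the layer-i output, evaluated at the initial sample; that choice of differentiation target is an assumption of the sketch, not a definitive reading of this embodiment.

```python
import torch
import torch.nn as nn

def layer_features_and_attention(model: nn.Module, x: torch.Tensor):
    """Return per-layer output features T_i(x) and attention information S_i(x)."""
    features, hooks = [], []
    for layer in model.children():               # each layer structure of the model
        hooks.append(layer.register_forward_hook(
            lambda _m, _inp, out, store=features: store.append(out)))
    logits = model(x)                            # forward pass fills `features`
    for h in hooks:
        h.remove()
    score = logits.max()                         # scalar differentiation target (assumption)
    grads = torch.autograd.grad(score, features, allow_unused=True)
    attention = [g.abs() if g is not None else torch.zeros_like(f)
                 for g, f in zip(grads, features)]
    return features, attention
```

Combining the returned per-layer attention tensors corresponds to assembling the attention map described above.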
In step 102, a confrontation sample optimization function is constructed based on the attention map of the first neural network model and the classification results of the first neural network model for the initial samples.
After the server determines the attention map of the first neural network model, a confrontation sample optimization function (confrontation sample function) may be constructed according to the attention map of the first neural network model and the classification result of the first neural network model for the initial samples, so as to subsequently generate confrontation samples according to the confrontation sample optimization function.
Referring to fig. 8, fig. 8 is an optional flowchart provided in an embodiment of the present invention. In some embodiments, fig. 8 illustrates that step 102 in fig. 5 may be implemented by steps 1021 to 1022 illustrated in fig. 8, and further includes step 107.
In step 107, classifying the initial samples through the first neural network model to obtain a classification result of the first neural network model for the initial samples;
in step 1071, the initial sample is encoded by the first neural network model to obtain the characteristics of the initial sample;
in step 1072, the features of the initial sample are classified to obtain a class label corresponding to the initial sample.
Wherein the type of the initial sample comprises one of: an image sample; a text sample; a speech sample. That is, the first neural network model is not limited to image recognition, speech recognition, text classification, or the like. Before the confrontation sample optimization function is constructed, the initial sample needs to be classified through the first neural network model to obtain the classification result of the first neural network model for the initial sample, so that the confrontation sample optimization function can be constructed according to this classification result. Specifically, the first neural network model encodes the initial sample to obtain the features of the initial sample, and classifies the features of the initial sample to obtain the class label (class probability) corresponding to the initial sample. Different neural network models may have different encoding processes and classification processes for different initial samples.
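A minimal sketch of this encode-then-classify structure is given below, assuming PyTorch; the `encoder` and `classifier` sub-modules are placeholders for whatever backbone the first neural network model actually uses.

```python
import torch
import torch.nn as nn

class FirstModel(nn.Module):
    """Encode an initial sample into features, then classify the features."""
    def __init__(self, encoder: nn.Module, classifier: nn.Module):
        super().__init__()
        self.encoder = encoder        # encoding: initial sample -> features
        self.classifier = classifier  # classification: features -> class scores

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        features = self.encoder(x)                              # features of the initial sample
        return torch.softmax(self.classifier(features), dim=1)  # class label (class probability)
```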
In some embodiments, constructing the antagonistic sample optimization function based on the attention map of the first neural network model, the classification results of the first neural network model for the initial samples, comprises: constructing a loss function of the first neural network model based on the classification result of the first neural network model for the initial sample and the confrontation sample; and constructing a confrontation sample optimization function for iterative updating based on the loss function of the first neural network, the attention map of the first neural network model and the initial sample.
After the server obtains the classification result of the first neural network model for the initial sample, a loss function of the first neural network model is constructed based on the classification result of the first neural network model for the initial sample and the initial confrontation sample (the initial sample is used as the initial confrontation sample), and a confrontation sample optimization function for iterative updating is constructed based on the loss function of the first neural network model, the attention map of the first neural network model and the initial sample, where the function based on these quantities can be expressed in various transformed forms, such as squares, exponents, and the like.
In some embodiments, constructing a confrontation sample optimization function for iterative updating based on the loss function of the first neural network, the attention map of the first neural network model, and the initial samples comprises: performing point multiplication on attention information of each layer structure in the first neural network model corresponding to the initial sample and output characteristics of the countermeasure sample corresponding to each layer structure to obtain a first point multiplication result; performing point multiplication on the attention information of the initial sample corresponding to each layer structure in the first neural network model and the output characteristics of the initial sample corresponding to each layer structure to obtain a second point multiplication result; determining a difference between the first dot product result and the second dot product result; and constructing a function for iterating the initial sample and taking the iteration result as a countermeasure sample by taking the loss function, the initial sample and the difference of the first neural network model as parameters.
In order to construct the confrontation sample optimization function, the attention information of each layer structure in the first neural network model corresponding to the initial sample is point-multiplied with the output features of each layer structure corresponding to the confrontation sample to obtain a first point multiplication result; the attention information of each layer structure corresponding to the initial sample is point-multiplied with the output features of each layer structure corresponding to the initial sample to obtain a second point multiplication result; the difference between the first point multiplication result and the second point multiplication result is determined; and a function that iterates the initial sample and takes the iteration result as the confrontation sample is constructed with the loss function of the first neural network model, the initial sample and the difference as parameters, where the loss function of the first neural network model, the initial sample and the difference can be expressed in various transformed forms, such as squares, roots, exponents, and the like. For example, the confrontation sample optimization function may be

$$\max_{\hat{x}} \; \mathcal{L}(\hat{x}) + \sum_{i=1}^{n} \left( S_i(x) \cdot T_i(\hat{x}) - S_i(x) \cdot T_i(x) \right)^2$$

where $\mathcal{L}(\hat{x})$ is the loss function of the first neural network model, $x$ is the initial sample, $\hat{x}$ is the confrontation sample, $S_i(x) \cdot T_i(\hat{x})$ is the first point multiplication result, $S_i(x) \cdot T_i(x)$ is the second point multiplication result, and $\left( S_i(x) \cdot T_i(\hat{x}) - S_i(x) \cdot T_i(x) \right)^2$ is the difference between the first point multiplication result and the second point multiplication result.
In some embodiments, determining the difference between the first point multiplication result and the second point multiplication result comprises: performing difference processing on the first point multiplication result and the second point multiplication result to obtain a processing result; the square of the processing result is determined as the difference between the first dot product and the second dot product.
The difference between the first point multiplication result and the second point multiplication result can be expressed in various transformed forms, such as a square. For example, difference processing is performed on the first point multiplication result and the second point multiplication result to obtain a processing result, and the square of the processing result is determined as the difference between the first point multiplication result and the second point multiplication result; that is, the difference may take the squared form $\left( S_i(x) \cdot T_i(\hat{x}) - S_i(x) \cdot T_i(x) \right)^2$ used in the confrontation sample optimization function above.
In some embodiments, the method further comprises: in the confrontation sample optimization function, a difference constraint of the confrontation samples relative to the initial samples is fused so that the difference of the confrontation samples relative to the initial samples does not exceed a difference threshold.
In order to reduce the difference between the subsequently generated confrontation sample and the initial sample, i.e., so that the human eye cannot directly distinguish the confrontation sample from the initial sample, a difference constraint of the confrontation sample relative to the initial sample can be fused into the confrontation sample optimization function. For example, the constraint

$$\left\| \hat{x} - x \right\| \le \epsilon$$

is added to the confrontation sample optimization function above, where $\hat{x}$ is the confrontation sample, $x$ is the initial sample, and $\epsilon$ is the difference constraint (difference threshold).
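A minimal sketch of evaluating the confrontation sample optimization function is given below, assuming PyTorch and reusing the per-layer features and attention information collected by the earlier sketch. Treating the point multiplication as an element-wise product, using cross-entropy as the loss function of the first model, and summing the per-layer squared differences with equal weight are assumptions of the sketch; the difference constraint is not applied here, since it is enforced during the update iteration of step 103 below.

```python
import torch
import torch.nn.functional as F

def adversarial_objective(model, x_adv, label, attention, features_x, collect_features):
    """attention:        S_i(x), per-layer attention information of the initial sample
       features_x:       T_i(x), per-layer output features of the initial sample
       collect_features: function (model, sample) -> per-layer output features T_i(sample)"""
    features_adv = collect_features(model, x_adv)           # T_i(x_adv)
    loss = F.cross_entropy(model(x_adv), label)             # loss of the first model (assumed form)
    feature_term = sum(((s * t_adv - s * t_x) ** 2).sum()   # squared point-multiplication difference
                       for s, t_adv, t_x in zip(attention, features_adv, features_x))
    return loss + feature_term                               # value to be increased by iteration
```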
In step 103, an update iteration process is performed based on the challenge sample optimization function and the initial sample, and the iteration result is used as the challenge sample corresponding to the initial sample.
After the confrontation sample optimization function is constructed, updating iteration processing can be performed according to the confrontation sample optimization function and the initial sample, and an iteration result is used as the confrontation sample corresponding to the initial sample.
In some embodiments, performing an update iteration process based on the challenge sample optimization function and the initial sample, and taking an iteration result as a challenge sample corresponding to the initial sample, includes: taking the initial sample as an initial challenge sample and substituting the initial sample into a challenge sample optimization function to determine a derivative of the challenge sample optimization function for the initial challenge sample; taking the sum of the derivative and the initial confrontation sample as a new confrontation sample, and substituting the new confrontation sample into the sample optimization function to continuously update the new confrontation sample in an iterative manner; and taking a new countermeasure sample obtained by iterative updating when the countermeasure sample optimization function is converged as a countermeasure sample corresponding to the initial sample.
In order to iteratively generate the confrontation samples, the initial samples are used as the initial confrontation samples and are substituted into the confrontation sample optimization function, the derivatives of the confrontation sample optimization function for the initial confrontation samples are determined through an inverse gradient descent method, the sum of the derivatives and the initial confrontation samples is used as a new confrontation sample, the sample optimization function is substituted to continue to iteratively update the new confrontation samples until the confrontation sample optimization function converges, and the new confrontation samples obtained through iterative updating when the confrontation sample optimization function converges are used as the confrontation samples corresponding to the initial samples (finally generated confrontation samples). The initial confrontation sample can be formed by adding preset noise on the basis of the initial sample.
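A minimal sketch of this update iteration is given below, assuming PyTorch; `objective_fn` is a closure around the optimization function of the previous sketch, and the step of adding the raw derivative, the clipping used to enforce the difference constraint, the iteration limit and the convergence tolerance are illustrative assumptions.

```python
import torch

def generate_adversarial(objective_fn, x, epsilon=8 / 255, max_iter=50, tol=1e-4):
    """objective_fn(x_adv) -> scalar value of the confrontation sample optimization function."""
    x_adv = x.clone()                                      # initial confrontation sample
    prev = None
    for _ in range(max_iter):
        x_adv = x_adv.detach().requires_grad_(True)
        value = objective_fn(x_adv)
        grad, = torch.autograd.grad(value, x_adv)          # derivative for the current sample
        x_adv = x_adv.detach() + grad                      # new sample = old sample + derivative
        x_adv = x + (x_adv - x).clamp(-epsilon, epsilon)   # keep the difference within epsilon
        if prev is not None and abs(value.item() - prev) < tol:   # crude convergence check
            break
        prev = value.item()
    return x_adv.detach()
```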
In step 104, the confrontation samples are classified by the second neural network model, and classification results of the confrontation samples are obtained.
After the countermeasure samples corresponding to the initial samples are generated, the countermeasure samples can be classified through the second neural network model to obtain the classification results of the countermeasure samples, so that the robustness of the second neural network model can be judged according to the classification results of the countermeasure samples. For example, the initial sample is a certain vehicle picture, the generated countermeasure sample is also a vehicle picture, and there is no obvious difference visible to human eyes from the initial sample, and the countermeasure sample is classified by the second neural network model, so that the classification result of the countermeasure sample may be obtained as the vehicle picture, or the classification result of the countermeasure sample may not be obtained, or the classification result of the countermeasure sample may be obtained as a non-vehicle picture.
In step 105, the probability of correct classification of the second neural network model is determined according to the difference between the classification result of the antagonistic sample and the classification result of the initial sample.
In some embodiments, before determining the probability that the second neural network model is correctly classified according to the difference between the classification result of the antagonistic sample and the classification result of the initial sample, the method further comprises: classifying the initial samples through a second neural network model to obtain the classification results of the initial samples; and when the probability of the correct classification of the second neural network model is lower than the probability threshold, performing training processing on the second neural network model based on the countermeasure samples so as to enable the probability of the correct classification of the second neural network model to be higher than the probability threshold.
After the classification result of the confrontation sample and the classification result of the initial sample are obtained, the ratio of the number of samples correctly classified by the second neural network model to the total number of samples, i.e., the robustness of the second neural network model, can be determined according to the difference between the classification result of the confrontation sample and the classification result of the initial sample. When the robustness of the second neural network model is determined to be lower than the probability threshold, i.e., the robustness is low, the second neural network model can be retrained based on the confrontation samples, so that the second neural network model correctly classifies the confrontation samples, the probability of correct classification by the second neural network model is improved, and the robustness of the second neural network model is enhanced.
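A minimal sketch of this robustness measurement is given below, assuming PyTorch; the 0.9 probability threshold in the usage comment is an illustrative assumption.

```python
import torch

def correct_classification_probability(second_model, initial_samples, adversarial_samples):
    """Fraction of confrontation samples still classified like their initial samples."""
    with torch.no_grad():
        initial_pred = second_model(initial_samples).argmax(dim=1)
        adversarial_pred = second_model(adversarial_samples).argmax(dim=1)
    return (initial_pred == adversarial_pred).float().mean().item()

# probability = correct_classification_probability(model_a, x_batch, x_adv_batch)
# if probability < 0.9:   # below the probability threshold (0.9 is an assumed value)
#     retrain model_a on the confrontation samples to raise the probability
```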
Now, the method for detecting the neural network model provided in the embodiment of the present invention has been described with reference to the exemplary application and implementation of the server provided in the embodiment of the present invention, and a scheme for detecting the neural network model by matching each module in the detecting apparatus 555 of the neural network model provided in the embodiment of the present invention is continuously described below.
A first processing module 5551, configured to construct an attention map of a first neural network model based on structural features of the first neural network model; a construction module 5552, configured to construct a confrontation sample optimization function based on the attention map of the first neural network model and the classification result of the first neural network model for the initial sample; an updating module 5553, configured to perform an updating iteration process based on the confrontation sample optimization function and the initial sample, and use an iteration result as a confrontation sample corresponding to the initial sample; the second processing module 5554 is configured to perform classification processing on the confrontation sample through a second neural network model to obtain a classification result of the confrontation sample, and determine a probability that the second neural network model is correctly classified according to a difference between the classification result of the confrontation sample and the classification result of the initial sample.
In some embodiments, the detecting device 555 of the neural network model further includes: a first determining module 5555 for determining a similarity of each of a plurality of third neural network models to the second neural network model based on functions, input data, and output data of the third neural network models; and determining the third neural network model corresponding to the maximum similarity as the first neural network model.
In some embodiments, the detecting device 555 of the neural network model further includes: a second determining unit 5556, configured to construct the first neural network model having the same function as the second neural network model according to the function of the second neural network model; and training the first neural network model based on the input data and the output data of the second neural network model to obtain a trained first neural network model.
In some embodiments, the detecting device 555 of the neural network model further includes: a first classification module 5557, configured to perform classification processing on the initial sample through the first neural network model, so as to obtain a classification result of the initial sample by the first neural network model; the first classification module 5557 is further configured to perform encoding processing on the initial sample through the first neural network model to obtain features of the initial sample; classifying the characteristics of the initial sample to obtain a class label corresponding to the initial sample; wherein the type of the initial sample comprises one of: an image sample; a text sample; a speech sample.
In some embodiments, the first processing module 5551 is further configured to perform attention processing on each layer structure in the first neural network model to obtain attention information of each layer structure; and combining the attention information of the structures of all layers to obtain an attention map of the first neural network model.
In some embodiments, the first processing module 5551 is further configured to determine output characteristics of each layer structure in the first neural network model corresponding to the initial sample; and carrying out derivation processing aiming at the initial sample on the output characteristics of the initial sample corresponding to each layer structure, and taking a derivation result as the attention information of each layer structure corresponding to the initial sample.
In some embodiments, the building module 5552 is further configured to build a loss function of the first neural network model based on the classification result of the first neural network model for the initial sample and the antagonistic sample; constructing a confrontation sample optimization function for iterative updating based on the loss function of the first neural network, the attention map of the first neural network model, and the initial samples.
In some embodiments, the building module 5552 is further configured to perform a point multiplication on the attention information of each layer structure in the first neural network model corresponding to the initial sample and the output features of each layer structure corresponding to the confrontation sample to obtain a first point multiplication result; performing point multiplication on the attention information of each layer structure in the first neural network model corresponding to the initial sample and the output characteristics of each layer structure corresponding to the initial sample to obtain a second point multiplication result; determining a difference between the first point multiplication result and the second point multiplication result; and constructing a function for iterating the initial sample and taking an iteration result as a countermeasure sample by taking the loss function of the first neural network model, the initial sample and the difference as parameters.
In some embodiments, the constructing module 5552 is further configured to perform a difference processing on the first point multiplication result and the second point multiplication result to obtain a processing result; determining a square of the processing result as a difference between the first dot product and the second dot product.
In some embodiments, the detecting device 555 of the neural network model further includes: a third processing module 5558, configured to blend in a difference constraint of the confrontation sample with respect to the initial sample in the confrontation sample optimization function, so that the difference of the confrontation sample with respect to the initial sample does not exceed a difference threshold.
In some embodiments, the update module 5553 is further configured to take the initial sample as an initial challenge sample and substitute the challenge sample optimization function to determine a derivative of the challenge sample optimization function with respect to the initial challenge sample; taking the sum of the derivative and the initial confrontation sample as a new confrontation sample, and substituting the sample optimization function to continue to iteratively update the new confrontation sample; and taking a new countermeasure sample obtained by iterative updating when the countermeasure sample optimization function is converged as a countermeasure sample corresponding to the initial sample.
In some embodiments, the detecting device 555 of the neural network model further includes: the second classification module 5559 is configured to perform classification processing on the initial sample through the second neural network model to obtain a classification result of the initial sample; the detecting device 555 of the neural network model further includes: a training module 5560, configured to train the second neural network model based on the antagonistic sample when the probability of correct classification of the second neural network model is lower than a probability threshold, so that the probability of correct classification of the second neural network model is higher than the probability threshold.
Hereinafter, an exemplary application of the embodiment of the present invention in an image application scene will be specifically described.
The embodiment of the invention provides a method and an apparatus for detecting an image neural network model, an electronic device and a storage medium, which can automatically obtain accurate confrontation image samples and improve the efficiency of robustness detection. An exemplary application of the detection device of the image neural network model provided in the embodiment of the present invention is described below. The detection device may be a server, for example a server deployed in the cloud: according to a first image neural network model, a second image neural network model and an initial image sample provided by another device or a user, the server constructs an attention map of the first image neural network model based on the structural features of the first image neural network model, constructs a confrontation image sample optimization function based on the attention map and the classification result of the first image neural network model for the initial image sample, obtains an accurate confrontation image sample corresponding to the initial image sample through the confrontation image sample optimization function, detects the probability of correct image classification of the second image neural network model, i.e., its robustness, according to the confrontation image sample, and shows the robustness of the second image neural network model to the user. The detection device may also be any of various types of user terminals, such as a notebook computer, a tablet computer, a desktop computer, or a mobile device (e.g., a mobile phone or a personal digital assistant), for example a handheld terminal: according to a first image neural network model, a second image neural network model and an initial image sample input by the user on the handheld terminal, a confrontation image sample corresponding to the initial image sample is obtained, robustness detection is performed on the second image neural network model based on the confrontation image sample to obtain its robustness, and the robustness of the second image neural network model is displayed on the display interface of the handheld terminal, so that the user can know the robustness of the second image neural network model.
By way of example, referring to fig. 9, fig. 9 is a schematic view of an application scenario of a detection system 10A of an image neural network model provided in an embodiment of the present invention, a terminal 200A is connected to a server 100A through a network 300A, and the network 300A may be a wide area network or a local area network, or a combination of both.
The terminal 200A may be used to obtain a first image neural network model, a second image neural network model and an initial image sample, for example, when a user inputs a face recognition model (the first image neural network model), an entrance guard model (the second neural network model) and the initial face sample through an input interface, the terminal automatically obtains the face recognition model, the entrance guard model and the initial face sample input by the user after the input is completed.
In some embodiments, the terminal 200A locally executes the detection method of the image neural network model provided in the embodiments of the present invention to complete obtaining the confrontation face sample corresponding to the initial face sample according to the face recognition model, the access control model and the initial face sample input by the user, performing robustness detection on the access control model based on the confrontation face sample to obtain robustness of the access control model, and displaying the robustness of the access control model on the display interface 210A of the terminal 200A, so that the user can know the robustness of the second neural network model, and screen the second neural network model with high robustness for image recognition, voice recognition, text classification and other applications.
In some embodiments, the terminal 200A may also send, to the server 100A through the network 300A, the face recognition model, the access control model and the initial face sample input by the user on the terminal 200A, and invoke the detection function of the image neural network model provided by the server 100A. The server 100A obtains the robustness of the access control model through the detection method of the image neural network model provided by the embodiment of the present invention and displays it on the display interface 210A of the terminal 200A, or the server 100A directly provides the robustness of the access control model, so that the user knows the robustness of the access control model and it is ensured that, when the access control model is put into application, faces are correctly recognized, avoiding situations in which a legitimate user cannot pass the access control or an illegal person passes the access control.
The following describes a method for detecting an image neural network model according to an embodiment of the present invention, with reference to an exemplary application and implementation of a server according to an embodiment of the present invention. Referring to fig. 10, fig. 10 is a schematic flowchart of a method for detecting an image neural network model according to an embodiment of the present invention, and the steps shown in fig. 10 are combined for description.
In step 201, an attention map of the first image neural network model is constructed based on structural features of the first image neural network model.
In an exemplary embodiment, when a user needs to detect the robustness of a second image neural network model (access control model) to ensure that the access control model can be put into use, the user can input a first image neural network model (face recognition model), an initial image sample (initial face sample) and the second image neural network model at a terminal; the terminal sends the first image neural network model, the initial image sample and the second image neural network model to a server, and the server constructs an attention map of the first image neural network model based on the structural features of the face recognition model.
In some embodiments, prior to constructing the attention map of the first image neural network model, in step 206, the similarity of each third image neural network model to the second image neural network model is determined based on the functions, input data and output data of the plurality of third image neural network models; and the third image neural network model corresponding to the maximum similarity is determined as the first image neural network model.
When the second image neural network model is determined to be the entrance guard model for recognizing the human face, a plurality of existing human face recognition models, namely a plurality of third image neural network models, are determined, the functions of the existing human face recognition models, the input data of the human face and the output human face recognition result are determined, the similarity of the existing human face recognition models and the entrance guard model is determined according to the functions of the human face recognition models, the input data of the human face and the output human face recognition result, and the human face recognition model with the maximum similarity is determined to be the first image neural network model for subsequently detecting the robustness of the entrance guard model.
In some embodiments, before constructing the attention map of the first image neural network model, a first image neural network model having the same function as the second image neural network model may be constructed according to the function of the second image neural network model; and training the first image neural network model based on the input data and the output data of the second image neural network model to obtain the trained first image neural network model.
When the face recognition model similar to the entrance guard model does not exist, a first image neural network model for face recognition can be constructed according to the functions of the entrance guard model, the input data and the output data of the entrance guard are adopted, the first image neural network model is trained, the trained first image neural network model is obtained, the trained first image neural network model is similar to the entrance guard model, and the trained first image neural network model is used for generating a confrontation image sample according to the initial image sample.
In some embodiments, constructing an attention map of the first image neural network model based on structural features of the first image neural network model comprises: in step 2011, attention processing is performed on each layer structure in the first image neural network model to obtain attention information of each layer structure; in step 2012, the attention information of each layer structure is combined to obtain an attention map of the first image neural network model.
In order to obtain important structural features in the first image neural network model, attention processing needs to be performed on each layer structure in the known first image neural network model to obtain attention information of each layer structure, so as to extract importance of each layer structure, and the attention information of each layer structure is combined to obtain an attention map of the first image neural network model, that is, to determine importance of each layer structure.
In some embodiments, the attention processing of the structures in the first image neural network model to obtain the attention information of the structures includes: determining output characteristics of each layer structure in the first image neural network model corresponding to the initial image sample; and performing derivation processing aiming at the initial image sample on the output characteristics of the initial image sample corresponding to each layer structure, and taking the derivation result as the attention information of the initial image sample corresponding to each layer structure.
In step 202, a countering image sample optimization function is constructed based on the attention map of the first image neural network model and the classification results of the first image neural network model for the initial image samples.
And constructing a confrontation image sample optimization function through the attention force drawing of the first image neural network model and the classification result of the first image neural network model for the initial image sample, so as to generate the confrontation image sample according to the confrontation image sample optimization function.
In some embodiments, before the confrontation image sample optimization function is constructed, in step 207, the initial image sample is classified by the first image neural network model to obtain the classification result of the first image neural network model for the initial image sample. Classifying the initial image sample through the first image neural network model to obtain this classification result includes: encoding the initial image sample through the first image neural network model to obtain the features of the initial image sample, and classifying the features to obtain the class label corresponding to the initial image sample.
In some embodiments, constructing the antagonistic image sample optimization function based on the attention map of the first image neural network model, the classification results of the first image neural network model for the initial image samples, comprises: in step 2021, constructing a loss function of the first image neural network model based on the classification result of the first image neural network model for the initial image sample and the antagonistic image sample; in step 2022, a antagonistic image sample optimization function for iterative updating is constructed based on the loss function of the first image neural network, the attention map of the first image neural network model, and the initial image samples.
Illustratively, after the server obtains the classification result of the face recognition model for the initial face sample, a loss function of the face recognition model is constructed based on the classification result of the face recognition model for the initial face sample and the initial confrontation face sample, and a confrontation face sample optimization function for iterative update is constructed based on the loss function of the face recognition model, the attention map of the face recognition model and the initial face sample.
In some embodiments, constructing a countering image sample optimization function for iterative updating based on the loss function of the first image neural network, the attention map of the first image neural network model, and the initial image samples comprises: performing point multiplication on attention information of each layer structure in the first image neural network model corresponding to the initial image sample and output characteristics of the antagonistic image sample corresponding to each layer structure to obtain a first point multiplication result; performing point multiplication on the attention information of each layer structure corresponding to the initial image sample in the first image neural network model and the output characteristics of each layer structure corresponding to the initial image sample to obtain a second point multiplication result; determining a difference between the first dot product result and the second dot product result; and constructing a function for iterating the initial image sample and taking an iteration result as a countermeasure image sample by taking the loss function, the initial image sample and the difference of the first image neural network model as parameters.
In some embodiments, determining the difference between the first point multiplication result and the second point multiplication result comprises: performing difference processing on the first point multiplication result and the second point multiplication result to obtain a processing result; the square of the processing result is determined as the difference between the first dot product and the second dot product.
Illustratively, the specific confrontation image sample optimization function may be

$$\max_{\hat{x}} \; \mathcal{L}(\hat{x}) + \sum_{i=1}^{n} \left( S_i(x) \cdot T_i(\hat{x}) - S_i(x) \cdot T_i(x) \right)^2$$

where $\mathcal{L}(\hat{x})$ is the loss function of the first image neural network model, $x$ is the initial face sample, $\hat{x}$ is the confrontation face sample, $S_i(x) \cdot T_i(\hat{x})$ is the first point multiplication result, $S_i(x) \cdot T_i(x)$ is the second point multiplication result, and $\left( S_i(x) \cdot T_i(\hat{x}) - S_i(x) \cdot T_i(x) \right)^2$ is the difference between the first point multiplication result and the second point multiplication result.
In some embodiments, a disparity constraint of the confrontation image samples relative to the initial image samples may be incorporated into the confrontation image sample optimization function such that the confrontation image samples differ relative to the initial image samples by no more than a disparity threshold, thereby reducing the disparity of the confrontation image samples relative to the initial image samples.
In step 203, an update iteration process is performed based on the confrontation image sample optimization function and the initial image sample, and the iteration result is used as the confrontation image sample corresponding to the initial image sample.
In some embodiments, performing an update iteration process based on the robust image sample optimization function and the initial image sample, and taking an iteration result as the robust image sample corresponding to the initial image sample, includes: taking the initial image sample as an initial confrontation image sample and substituting the confrontation image sample optimization function to determine a derivative of the confrontation image sample optimization function for the initial confrontation image sample; taking the sum of the derivative and the initial confrontation image sample as a new confrontation image sample, and substituting the new confrontation image sample into the image sample optimization function to continuously update the new confrontation image sample in an iterative manner; and taking a new confrontation image sample obtained by iterative update when the confrontation image sample optimization function is converged as a confrontation image sample corresponding to the initial image sample.
Illustratively, in order to iteratively generate the confrontation face samples, the initial face samples are used as initial confrontation face samples and are substituted into a confrontation face sample optimization function, the derivatives of the confrontation face sample optimization function for the initial confrontation face samples are determined through an inverse gradient descent method, the sum of the derivatives and the initial confrontation face samples is used as new confrontation face samples, the new confrontation face samples are continuously updated iteratively by being substituted into the face sample optimization function until the confrontation face sample optimization function converges, and the new confrontation face samples obtained by iterative updating when the confrontation face sample optimization function converges are used as the confrontation samples corresponding to the initial face samples.
In step 204, the confrontation image samples are classified by the second image neural network model, so as to obtain a classification result of the confrontation image samples.
Illustratively, after the confrontation face sample corresponding to the initial face sample is generated, the confrontation face sample can be classified through the access control model to obtain a classification result of the confrontation face sample, so that the robustness of the access control model can be judged according to the classification result of the confrontation face sample.
In step 205, the probability of correct image classification of the second image neural network model is determined according to the difference between the classification result of the challenge image sample and the classification result of the initial image sample.
Wherein, the probability of the second image neural network model for correct image classification is the robustness of the second image neural network model.
In some embodiments, before determining the probability of the second image neural network model classifying the correct image according to the difference between the classification result of the challenge image sample and the classification result of the initial image sample, the method further comprises: classifying the initial image sample through a second image neural network model to obtain a classification result of the initial image sample; and when the probability of the correct image classification of the second image neural network model is lower than the probability threshold, training the second image neural network model based on the confrontation image samples so as to enable the probability of the correct image classification of the second image neural network model to be higher than the probability threshold.
Illustratively, after the classification result of the confrontation face sample and the classification result of the initial face sample are obtained, the ratio of the number of samples correctly classified by the access control model to the total number of samples, i.e., the robustness of the access control model, can be determined according to the difference between the classification result of the confrontation face sample and the classification result of the initial face sample. When the robustness of the access control model is determined to be lower than the probability threshold, i.e., the robustness is low, the access control model can be retrained based on the confrontation face samples, so that the access control model correctly classifies the confrontation face samples, the probability of correct face classification by the access control model is improved, and the robustness of the access control model is enhanced.
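A minimal sketch of retraining the tested model on the confrontation face samples is given below, assuming PyTorch; the optimizer, loss function and epoch count are illustrative assumptions.

```python
import torch
import torch.nn as nn

def adversarial_retrain(second_model: nn.Module,
                        adversarial_samples: torch.Tensor,
                        correct_labels: torch.Tensor,
                        epochs: int = 5,
                        lr: float = 1e-4) -> nn.Module:
    """Retrain the tested model so it classifies the confrontation samples correctly."""
    optimizer = torch.optim.Adam(second_model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    second_model.train()
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(second_model(adversarial_samples), correct_labels)
        loss.backward()
        optimizer.step()
    return second_model
```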
Embodiments of the present invention also provide a computer-readable storage medium storing executable instructions, which when executed by a processor, will cause the processor to perform a method for detecting a neural network model provided by embodiments of the present invention, for example, a method for detecting a neural network model as shown in fig. 5 to 8, or a method for detecting an image neural network model as shown in fig. 10.
In some embodiments, the storage medium may be a memory such as an FRAM, a ROM, a PROM, an EPROM, an EEPROM, a flash memory, a magnetic surface memory, an optical disk, or a CD-ROM; or may be various devices including one or any combination of the above memories.
In some embodiments, executable instructions may be written in any form of programming language (including compiled or interpreted languages), in the form of programs, software modules, scripts or code, and may be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
By way of example, executable instructions may correspond, but do not necessarily have to correspond, to files in a file system, and may be stored in a portion of a file that holds other programs or data, e.g., in one or more scripts in a HyperText Markup Language (HTML) document, in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub-programs, or portions of code).
By way of example, executable instructions may be deployed to be executed on one computing device or on multiple computing devices at one site or distributed across multiple sites and interconnected by a communication network.
In the following, an exemplary application of the embodiments of the present invention in a practical application scenario will be described.
The embodiment of the invention is not limited to the deep neural network and can also be applied to other neural networks. Compared with a traditional neural network, a deep neural network contains more and deeper network modules, so that the learning capability of the deep neural network model is stronger. As shown in fig. 11, fig. 11 is an alternative structural diagram of a deep neural network provided in an embodiment of the present invention; the deep neural network is composed of one or more convolutional layers and a fully-connected layer at the top, and also includes associated weights and pooling layers. The deep neural network provided in fig. 11 may be composed of an input layer (85 × 85 in size), convolutional layer 1 (convolved with a convolution kernel of 6 × 40), pooling layer 1 (40 × 40 in size, 2 × 2 downsampling), convolutional layer 2 (convolved with a convolution kernel of 5 × 80), pooling layer 2 (18 × 80 in size, 2 × 2 downsampling), a fully-connected layer (fully-connected unfolding of the output of pooling layer 2), and an output layer (regression on the output of the fully-connected layer). Different deep neural networks have different structures, and embodiments of the present invention are not limited to a neural network of a particular structure.
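A minimal sketch of the example structure of fig. 11 is given below, assuming PyTorch and reading "convolution kernel 6 × 40" as a 6 × 6 kernel with 40 output channels and "5 × 80" as a 5 × 5 kernel with 80 output channels; the single-channel 85 × 85 input and the number of output classes are assumptions.

```python
import torch.nn as nn

# Derived sizes under the stated assumptions: 85x85 input -> 80x80x40 after conv1
# -> 40x40x40 after 2x2 pooling -> 36x36x80 after conv2 -> 18x18x80 after 2x2 pooling.
example_deep_net = nn.Sequential(
    nn.Conv2d(1, 40, kernel_size=6),    # convolutional layer 1
    nn.MaxPool2d(2),                    # pooling layer 1 (2x2 downsampling)
    nn.Conv2d(40, 80, kernel_size=5),   # convolutional layer 2
    nn.MaxPool2d(2),                    # pooling layer 2 (2x2 downsampling)
    nn.Flatten(),                       # fully-connected unfolding of pooling layer 2
    nn.Linear(18 * 18 * 80, 10),        # output layer (10 classes assumed)
)
```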
Although deep neural networks perform well in many tasks in computer vision, natural language processing and other fields, a problem exists: deep neural networks achieve high accuracy in these tasks but are very vulnerable to attacks by confrontation samples (confrontation attacks). Although a confrontation sample is only slightly perturbed relative to the initial sample, so that the human visual system cannot detect the perturbation, such a confrontation sample can cause the deep neural network to fail at its task. As shown in fig. 12, fig. 12 is a schematic view of the confrontation attack provided in the embodiment of the present invention: on the basis of an initial sample (a normal car image, which can be correctly identified as a car), interference data (camera noise, etc.) is superimposed to generate a confrontation sample (which looks the same as the initial sample to the human eye), and the deep neural network cannot correctly identify the confrontation sample as a car. When such a deep neural network is applied to an automatic driving scenario, it cannot correctly identify a car on the road, which leads to traffic accidents such as collisions and prevents safe driving. In order to solve the problem of confrontation attacks, research in the fields of confrontation attacks and deep learning security must be carried out; otherwise, existing voice recognition systems, image recognition systems and the like based on deep neural networks will produce errors due to confrontation attacks and cause disasters.
The migration type attack refers to a process of generating a counter sample attack neural network model A (a neural network model to be tested) through a neural network model B (a reference neural network model). When two neural network models (a neural network model a and a neural network model B) exist, when a countermeasure sample of the neural network model a needs to be generated, the structure and parameters of the neural network model a need to be known, but actually the structure and parameters of the neural network model a may not be known, so that a better-performance neural network model B needs to be determined (the structure and parameters of the neural network model B may be known), and the countermeasure sample is generated through the neural network model B.
Robustness of a neural network model refers to its ability to withstand migration attacks. Many cloud products provide deep-neural-network-based services such as image recognition, text recognition and voice recognition. If these products are not robust, they will produce many false recognition results, which poses a great potential risk; therefore, these products must pass a certain robustness test before they go online. To achieve robustness detection of a neural network model, the confrontation samples used for detecting the neural network model can be generated based on migration attacks. When the neural network model B is known, a loss function $\mathcal{L}(\hat{x})$ can be used on the basis of the neural network model B while constraining the generated confrontation sample $\hat{x}$ to be similar to the original initial sample, so that the confrontation sample $\hat{x}$ can be used to attack the unknown neural network model A, where $\mathcal{L}(\hat{x})$ represents the loss function of the neural network model B (e.g., the error rate of classification). Specifically, an objective function that combines the loss of the neural network model B with the per-layer feature differences, for example

$$\max_{\hat{x}} \; \mathcal{L}(\hat{x}) + \sum_{i=1}^{n} \left( T_i(\hat{x}) - T_i(x) \right)^2,$$

may be employed while satisfying the constraint $\|\hat{x} - x\| \le \epsilon$ to generate the confrontation sample $\hat{x}$, where $T_i(x)$ represents the transformation features (output features) corresponding to the i-th layer structure of the neural network model B, so that the confrontation sample generated by the neural network model B can better attack the neural network model A, i.e., the neural network model B is used to perform a migration attack on the neural network model A. However, the detection effect of robustness detection using confrontation samples generated by this initial objective function is still biased, and neural network models with poor robustness cannot be detected well.
In order to solve the above problem, an embodiment of the present invention provides a method for detecting a neural network model, as shown in fig. 13, fig. 13 is an optional flow diagram of the method for detecting a neural network model provided in the embodiment of the present invention, and the method includes the following steps: 1) acquiring a neural network model for migration; 2) constructing an attention force chart; 3) establishing a confrontation sample optimization function; 4) solving by adopting a reverse gradient to obtain a migration type confrontation sample; 5) robustness of the neural network model is detected using the challenge samples. The above steps are specifically described as follows:
1) obtaining a neural network model for migration
For the detection of robustness, the neural network model A to be tested (the second neural network model) is known. In order to simulate a real detection environment, under the condition that the structure and parameters of the neural network model A to be tested are unknown while its function, input data and output data are known, a neural network model B for migration (the reference neural network model, i.e., the first neural network model) can be built by oneself: the function of the neural network model B is the same as that of the neural network model A, the neural network model B is trained using the input data and output data of the neural network model A, and the trained neural network model B is the required deep neural network for migration, namely the reference neural network model.
In addition, according to the functions, input data and output data of a plurality of candidate neural network models, the similarity between each candidate neural network model and the neural network model A can be determined, and the candidate neural network model corresponding to the maximum similarity is determined as a deep neural network for migration, namely a reference neural network model.
2) Constructing an attention map
After the neural network model B for migration is determined, since the structure and parameters of the neural network model B are known, each layer structure of the neural network model B can be determined, and the output feature of each layer structure corresponding to the initial sample can be determined. That is, for the known i-th layer structure of the neural network model B (e.g., a convolutional layer, a pooling layer, etc.), the output feature of the i-th layer structure is determined as the matrix $T_i(x)$, where $x$ represents the input initial sample; $T_i(x)$ is differentiated with respect to $x$, and the absolute value of the derivative is the matrix $S_i(x)$, i.e., the attention information (importance) of the i-th layer structure, where the matrix $S_i(x)$ and the matrix $T_i(x)$ are of the same size. The attention information of the respective layer structures is combined to construct an attention map (recording the importance of the data at each position). As shown in figs. 14A-14B, figs. 14A-14B are attention maps corresponding to different neural network models provided in the embodiment of the present invention. The Inception V3 model, the Inception ResNet V2 model and the Residual Network (ResNet) 152 model are typical representatives of neural network models, and many products realize their respective functions depending on the Inception V3 model, the Inception ResNet V2 model and the ResNet 152 model. In fig. 14A, based on the Inception V3 model, the Inception ResNet V2 model and the ResNet 152 model, attention maps corresponding to these models are generated for initial sample 1 (a building photograph taken at night): in the attention map corresponding to the Inception V3 model, the closer to center 1, the larger the value of the matrix $S_i(x)$ corresponding to the Inception V3 model; in the attention map corresponding to the Inception ResNet V2 model, the closer to center 2, the larger the value of the matrix $S_i(x)$ corresponding to the Inception ResNet V2 model; and in the attention map corresponding to the ResNet 152 model, the closer to center 3, the larger the value of the matrix $S_i(x)$ corresponding to the ResNet 152 model. In fig. 14B, based on the same three models, attention maps are generated for initial sample 2 (a building photograph taken in the daytime): in the attention map corresponding to the Inception V3 model, the closer to center 4, the larger the value of the matrix $S_i(x)$ corresponding to the Inception V3 model; in the attention map corresponding to the Inception ResNet V2 model, the closer to center 5, the larger the value of the matrix $S_i(x)$ corresponding to the Inception ResNet V2 model; and in the attention map corresponding to the ResNet 152 model, the closer to center 6, the larger the value of the matrix $S_i(x)$ corresponding to the ResNet 152 model.
3) Establishing the countermeasure sample optimization function
In order to obtain a countermeasure sample from an initial sample and the reference neural network model B, the embodiment of the invention constructs the following countermeasure sample optimization function $F(\tilde{x})$:

$$\max_{\tilde{x}}\ F(\tilde{x}) = J(\tilde{x}) + \sum_{i=1}^{n}\bigl(S_i(x)\odot T_i(\tilde{x}) - S_i(x)\odot T_i(x)\bigr)^2$$

while satisfying the constraint

$$\lVert \tilde{x} - x \rVert \le \varepsilon ,$$

thereby introducing the attention mechanism into the optimization function. Here $J(\tilde{x})$ denotes the loss function of the reference neural network model B, $x$ denotes the initial sample, $\tilde{x}$ denotes the countermeasure sample, $\varepsilon$ denotes the difference threshold of the countermeasure sample relative to the initial sample, $T_i(x)$ denotes the output features of the i-th layer structure of the reference neural network model B for the initial sample, $T_i(\tilde{x})$ denotes the output features of the i-th layer structure of the reference neural network model B for the countermeasure sample, $S_i(x)$ denotes the attention information of the i-th layer structure of the reference neural network model B (the values in $S_i(x)$ may be scaled down or up by an order, for example by taking the square root or the square), $n$ denotes the total number of layer structures of the reference neural network model B, and $\odot$ denotes the element-wise (dot) product of matrices.
As shown in Figs. 14A-14B, intuitively, the larger the value of $S_i(x)$, the more sensitive the i-th layer of the reference neural network model B is to x, i.e., the more important the features at the corresponding positions of $T_i(x)$ and $T_i(\tilde{x})$. These important features best reflect the function of the reference neural network model B and therefore, since model B has a function similar to that of the neural network model A to be tested, also reflect the function of model A. The important features are thus the commonality between the reference neural network model B and the neural network model A to be tested, so a countermeasure sample generated by model B on the basis of these important features transfers better to model A and can be used to detect the robustness of model A. As can be seen from Figs. 14A-14B, the positions and sizes of the attention regions produced by the different neural networks (the annular regions surrounding centers 1, 2, and 3, and those surrounding centers 4, 5, and 6) are substantially consistent, which confirms that the important features with commonality determine the functions of the neural network models (the reference neural network model B and the neural network model A to be tested).
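A minimal sketch of this optimization objective, assuming that the per-layer squared differences are summed into a scalar and that the cross-entropy loss stands in for the unspecified loss function J of the reference neural network model B; it reuses the output of the hypothetical layer_attention() helper from the earlier sketch.

```python
import torch.nn.functional as nnF

def adversarial_objective(model_b, layers, x_adv, y, T_x, S_x):
    """Objective F(x_adv) to be maximized: the classification loss J of the
    reference model B on x_adv (y is the class model B assigns to the clean
    sample x) plus, for every layer i, the summed squared difference between
    the attention-weighted features of x_adv and of x. T_x[i] and S_x[i] are
    the clean-sample quantities T_i(x) and S_i(x), treated as constants.
    """
    feats = {}
    hooks = [l.register_forward_hook(
                 lambda m, inp, out, k=i: feats.__setitem__(k, out))
             for i, l in enumerate(layers)]
    logits = model_b(x_adv)
    for h in hooks:
        h.remove()
    loss = nnF.cross_entropy(logits, y)                      # J(x_adv)
    attn_term = sum(((S_x[i] * feats[i] - S_x[i] * T_x[i]) ** 2).sum()
                    for i in feats)                          # sum_i (S_i(x)*T_i(x_adv) - S_i(x)*T_i(x))^2
    return loss + attn_term
```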
4) Obtaining a migration-type countermeasure sample by a reverse gradient solution
After the countermeasure sample optimization function is determined, the final countermeasure sample $\tilde{x}$ can be generated by a reverse stochastic gradient descent (SGD) method: with $\tilde{x}$ as the variable, the countermeasure sample optimization function is iterated until it converges, the initial value of $\tilde{x}$ being the initial sample $x$. The forward SGD method would iterate

$$\tilde{x} \leftarrow \tilde{x} - \frac{\partial F(\tilde{x})}{\partial \tilde{x}},$$

where $\partial F(\tilde{x})/\partial \tilde{x}$ denotes the derivative of the optimization function $F(\tilde{x})$ of step 3) with respect to $\tilde{x}$. Since the countermeasure sample optimization function of step 3) is to be maximized, the embodiment of the invention instead adopts the reverse SGD method, which iterates

$$\tilde{x} \leftarrow \tilde{x} + \frac{\partial F(\tilde{x})}{\partial \tilde{x}}.$$
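A sketch of this reverse-SGD iteration under stated assumptions: the step size, the fixed iteration count (used in place of an explicit convergence test), the L-infinity projection that enforces the ε constraint of step 3), and the clamp to a valid image range are all illustrative choices; adversarial_objective() is the hypothetical helper from the previous sketch.

```python
import torch

def generate_adversarial(model_b, layers, x, y, T_x, S_x,
                         eps=8 / 255, step=1 / 255, iters=40):
    """Reverse-SGD iteration: repeatedly add the (scaled) derivative of the
    objective to the current countermeasure sample, starting from the initial
    sample x, then project back so that the difference from x stays within eps.
    """
    x_adv = x.clone().detach()
    for _ in range(iters):
        x_adv.requires_grad_(True)
        obj = adversarial_objective(model_b, layers, x_adv, y, T_x, S_x)
        grad, = torch.autograd.grad(obj, x_adv)
        with torch.no_grad():
            x_adv = x_adv + step * grad                  # ascend the objective (reverse SGD)
            x_adv = x + (x_adv - x).clamp(-eps, eps)     # keep ||x_adv - x|| <= eps
            x_adv = x_adv.clamp(0.0, 1.0)                # keep a valid image
    return x_adv.detach()
```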
5) Detecting the robustness of the neural network model using the countermeasure samples
Even when the structure and parameters of a neural network model are not known, lawless persons can still attack the products and applications built on the model by generating countermeasure samples through migration (transfer) attacks. Therefore, the countermeasure capability of migration-attack samples is well suited to detecting the robustness of the neural network model. The countermeasure samples generated by the embodiment of the invention are combined with the attention map, so that the important features in the attention map are extracted; they therefore have a strong countermeasure capability and can effectively detect the neural network model to be tested.
When the accuracy of the neural network model to be tested on the countermeasure samples is lower than its accuracy on the initial samples by less than 5%, the robustness of the neural network model to be tested is considered strong. When the robustness of the neural network model to be tested is determined to be weak, the following measures can be taken to improve it: 1) redesign the structure of the neural network model to be tested; 2) retrain the neural network model to be tested with the generated countermeasure samples so as to enhance its robustness. In addition, when the robustness of the neural network model to be tested has not yet been improved, the input data fed to the model can be preprocessed, for example by filtering, dimensionality reduction, or principal component extraction, so that noise in the input data is removed; this prevents the model from being disturbed by noise added to the input data and failing to recognize the input correctly.
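As an illustration of this 5% criterion, the sketch below compares the accuracy of the model under test on clean samples and on the generated countermeasure samples and reports whether the accuracy drop stays below the threshold; the loader names and the report format are assumptions.

```python
import torch

def robustness_report(model_a, clean_loader, adv_loader, drop_threshold=0.05):
    """Compare accuracy on clean vs. countermeasure samples; report the model as
    robust if the accuracy drop is below drop_threshold (5% as discussed above)."""
    def accuracy(loader):
        correct = total = 0
        model_a.eval()
        with torch.no_grad():
            for images, labels in loader:
                preds = model_a(images).argmax(dim=1)
                correct += (preds == labels).sum().item()
                total += labels.numel()
        return correct / max(total, 1)

    clean_acc = accuracy(clean_loader)
    adv_acc = accuracy(adv_loader)
    return {"clean_acc": clean_acc,
            "adv_acc": adv_acc,
            "robust": (clean_acc - adv_acc) < drop_threshold}
```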
Therefore, by the above method, the embodiment of the invention can effectively generate countermeasure samples for detecting the neural network model to be tested even when its structure and parameters are unknown, and thereby detect its robustness. This provides a security-testing capability for ToB applications (applications that solve an industry problem, mainly the systematization or cloud migration of business processes in government, medical care, education, and similar fields) and ToC applications (applications facing end users that address individual needs such as shopping, social networking, and daily life), improves economic benefit, strengthens the business competitiveness of neural-network-based cloud services including but not limited to speech recognition and image recognition, and provides a more reliable security guarantee for deploying such applications.
To sum up, an attention map of the first neural network model is constructed from the structural features of the first neural network model, and this attention map is introduced into the countermeasure sample optimization function, so that an accurate countermeasure sample corresponding to the initial sample can be obtained automatically by updating and iterating the optimization function; based on the accurate countermeasure samples, robustness detection can be carried out on the second neural network model rapidly and accurately, improving the efficiency of robustness detection.
The above description is only an example of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, and improvement made within the spirit and scope of the present invention are included in the protection scope of the present invention.

Claims (15)

1. A method for detecting an image neural network model, the method comprising:
constructing an attention map of a first image neural network model based on structural features of the first image neural network model;
constructing a confrontation image sample function based on an attention map of the first image neural network model and a classification result of the first image neural network model for an initial image sample;
updating and iterating the confrontation image sample function and the initial image sample, and taking an iteration result as a confrontation image sample corresponding to the initial image sample;
classifying the confrontation image samples through a second image neural network model to obtain classification results of the confrontation image samples, and
determining the probability of correct image classification of the second image neural network model according to the difference between the classification result of the confrontation image sample and the classification result of the initial image sample.
2. The method of claim 1, wherein prior to said constructing an attention map of said first image neural network model, said method further comprises:
determining a similarity of each third image neural network model to the second image neural network model based on functions, input data, and output data of a plurality of third image neural network models;
and determining a third image neural network model corresponding to the maximum similarity as the first image neural network model.
3. The method of claim 1, wherein prior to said constructing an attention map of said first image neural network model, said method further comprises:
according to the function of the second image neural network model, constructing the first image neural network model with the same function as the second image neural network model;
and training the first image neural network model based on the input data and the output data of the second image neural network model to obtain the trained first image neural network model.
4. The method of claim 1, wherein prior to constructing the confrontation image sample function, the method further comprises:
classifying the initial image sample through the first image neural network model to obtain a classification result of the first image neural network model for the initial image sample;
the classifying the initial image sample through the first image neural network model to obtain a classification result of the first image neural network model for the initial image sample includes:
and coding the initial image sample through the first image neural network model to obtain the characteristics of the initial image sample.
5. The method of claim 1, wherein constructing an attention map of the first image neural network model based on structural features of the first image neural network model comprises:
performing attention processing on each layer structure in the first image neural network model to obtain attention information of each layer structure;
and combining the attention information of the structures of all layers to obtain an attention map of the first image neural network model.
6. The method of claim 5, wherein the attention processing of each layer structure in the first image neural network model to obtain the attention information of each layer structure comprises:
determining output characteristics of each layer structure in the first image neural network model corresponding to the initial image sample;
and performing derivation processing aiming at the initial image sample on the output characteristics of the initial image sample corresponding to each layer structure, and taking the derivation result as the attention information of each layer structure corresponding to the initial image sample.
7. The method of claim 1, wherein constructing a confrontation image sample function based on an attention map of the first image neural network model and a classification result of the first image neural network model for an initial image sample comprises:
constructing a loss function of the first image neural network model based on the classification result of the first image neural network model for the initial image sample and the antagonistic image sample;
constructing a countermeasure image sample function for iterative updating based on the loss function of the first image neural network, the attention map of the first image neural network model, and the initial image samples.
8. The method of claim 7, wherein constructing a confrontation image sample function for iterative updating based on the loss function of the first image neural network, the attention map of the first image neural network model, and the initial image samples comprises:
performing point multiplication on attention information of each layer structure in the first image neural network model corresponding to the initial image sample and output characteristics of each layer structure corresponding to the confrontation image sample to obtain a first point multiplication result;
performing point multiplication on attention information of each layer structure in the first image neural network model corresponding to the initial image sample and output characteristics of each layer structure corresponding to the initial image sample to obtain a second point multiplication result;
determining a difference between the first point multiplication result and the second point multiplication result;
and constructing a function for iterating the initial image sample and taking an iteration result as a countermeasure image sample by taking the loss function of the first image neural network model, the initial image sample and the difference as parameters.
9. The method of claim 8, wherein determining the difference between the first point multiplication result and the second point multiplication result comprises:
performing difference processing on the first point multiplication result and the second point multiplication result to obtain a processing result;
determining a square of the processing result as a difference between the first dot product and the second dot product.
10. The method according to any one of claims 7-9, further comprising:
in the confrontation image sample function, incorporating a difference constraint of the confrontation image sample relative to the initial image sample, so that the confrontation image sample differs from the initial image sample by no more than a difference threshold.
11. The method of claim 1, wherein the performing an update iteration process based on the confrontation image sample function and the initial image sample, and regarding an iteration result as a confrontation image sample corresponding to the initial image sample, comprises:
taking the initial image sample as an initial confrontation image sample and substituting the confrontation image sample function to determine a derivative of the confrontation image sample function for the initial confrontation image sample;
adding the derivative to the initial confrontation image sample as a new confrontation image sample and substituting it into the confrontation image sample function to continue to iteratively update the new confrontation image sample;
and taking a new confrontation image sample obtained by iterative update when the confrontation image sample function is converged as a confrontation image sample corresponding to the initial image sample.
12. A method for detecting a neural network model, the method comprising:
constructing an attention map of a first neural network model based on structural features of the first neural network model;
constructing a confrontation sample function based on an attention map of the first neural network model and a classification result of the first neural network model for an initial sample;
updating and iterating the confrontation sample function and the initial sample, and taking an iteration result as a confrontation sample corresponding to the initial sample;
classifying the confrontation samples through a second neural network model to obtain classification results of the confrontation samples, and
determining the probability of correct classification of the second neural network model according to the difference between the classification result of the confrontation sample and the classification result of the initial sample.
13. An apparatus for detecting a neural network model, the apparatus comprising:
the first processing module is used for constructing an attention force map of a first neural network model based on structural features of the first neural network model;
the construction module is used for constructing a confrontation sample function based on the attention force drawing of the first neural network model and the classification result of the first neural network model for the initial sample;
the updating module is used for performing updating iteration processing on the basis of the confrontation sample function and the initial sample, and taking an iteration result as a confrontation sample corresponding to the initial sample;
and the second processing module is used for carrying out classification processing on the confrontation samples through a second neural network model to obtain the classification results of the confrontation samples, and determining the probability of correct classification of the second neural network model according to the difference between the classification results of the confrontation samples and the classification results of the initial samples.
14. An apparatus for detecting a neural network model, the apparatus comprising:
a memory for storing executable instructions;
a processor for implementing the method for detecting the image neural network model according to any one of claims 1 to 11 or the method for detecting the neural network model according to claim 12 when executing the executable instructions stored in the memory.
15. A computer-readable storage medium having stored thereon executable instructions for causing a processor to perform the method for detecting an image neural network model according to any one of claims 1 to 11, or the method for detecting a neural network model according to claim 12, when executed.
CN202010078047.5A 2020-02-02 2020-02-02 Neural network model detection method, device, equipment and storage medium Active CN111325319B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010078047.5A CN111325319B (en) 2020-02-02 2020-02-02 Neural network model detection method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111325319A (en) 2020-06-23
CN111325319B (en) 2023-11-28

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40024877

Country of ref document: HK

GR01 Patent grant