CN112712126B - Picture identification method - Google Patents
- Publication number
- CN112712126B (application CN202110009127A)
- Authority
- CN
- China
- Prior art keywords
- picture
- branch
- network
- layer
- initial
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/22—Matching criteria, e.g. proximity measures
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/08—Learning methods
Abstract
The application provides a picture identification method that employs a pre-trained picture identification model, wherein the picture identification model comprises: a main network, a branch network, a branch point, and a first processing module. The main network is any convolutional neural network model. The branch point is arranged at a predetermined position. The branch network comprises branch-network convolution layers with a preset layer number and preset channel widths, both determined according to the downsampling layers in the main network. The picture identification method comprises the following steps: the main network performs first feature processing on the picture to be identified to obtain a first processed picture; the branch network identifies the first processed picture and determines a branch identification result; the first processing module receives the branch identification result output by the branch network and determines the cross entropy from the branch identification result and the maximum-value position of the branch identification result; and if the cross entropy is smaller than a preset threshold value, the branch identification result is output. The method provided by the application improves picture identification accuracy.
Description
Technical Field
The application relates to the technical field of image processing, and in particular to a picture identification method.
Background
Using convolutional neural network models to identify picture categories is a common approach in picture identification. As convolutional neural network models have been widely applied in practice, their adaptation to different picture types has steadily improved, and with it the accuracy of picture identification.
Pictures inevitably face various security threats during transmission, including attacks aimed specifically at degrading picture identification accuracy. Although an attacker cannot know the exact structure of the convolutional neural network model, the attacker can estimate its rough structure from the model's usual recognition results. Based on this estimated structure, the attacker adds small, targeted perturbations to the picture to be identified. These perturbations are difficult to detect with the naked eye yet are picked up by the convolutional neural network model, which treats them as original parts of the picture to be identified, thereby reducing the overall identification accuracy.
To cope with such attacks, the convolutional neural network model is modified so as to retain high identification accuracy under perturbation, typically by adjusting the model's overall parameters. However, adjusting the whole model only yields high recognition accuracy for specific attacked pictures to be identified, while accuracy for ordinary, unattacked pictures drops.
Based on this, a picture identification method is needed to solve the prior-art problem that adversarial attacks reduce the identification accuracy of convolutional neural network models.
Disclosure of Invention
A picture identification method is provided to solve the prior-art problem that adversarial attacks reduce the identification accuracy of convolutional neural network models.
The application provides a picture identification method that employs a pre-trained picture identification model, wherein the picture identification model comprises: a main network, a branch network, a branch point, and a first processing module;
the main network is any convolutional neural network model; the main network comprises an input layer, a plurality of common convolution layers, a plurality of downsampling layers and a main network output layer which are sequentially connected from front to back;
the branch point is arranged at a predetermined position between the first common convolution layer and the pooling layer of the main network; the first common convolution layer is the common convolution layer closest to the input layer;
the part from the first common convolution layer to the branch point in the main network forms a first main network processing module; the part from the branch point to the main network output layer in the main network forms a second main network processing module;
the branch network comprises branch-network convolution layers, with a preset layer number and preset channel widths, and a branch network output layer, which are sequentially connected from front to back; the preset layer number and the preset channel widths are determined according to the downsampling layers in the main network;
wherein the branch network is connected with the main network through the branch point;
the branch network output layer and the branch point are connected with the first processing module;
the picture identification method comprises the following steps:
the input layer receives a picture to be identified and transmits the picture to be identified to the first main network processing module;
the first main network processing module performs first characteristic processing on the picture to be identified to obtain a first processed picture;
the branch point stores the first processed picture;
the branch network identifies the first processed picture and determines a branch identification result;
the first processing module receives the branch identification result output by the branch network and determines the cross entropy according to the branch identification result and the maximum-value position of the branch identification result; if the cross entropy is smaller than a preset threshold value, the branch identification result is output;
if the cross entropy is greater than or equal to the preset threshold value, the branch point transmits the first processed picture to the second main network processing module; the second main network processing module performs second characteristic processing on the first processed picture to obtain a main recognition result; and the main recognition result is output.
Optionally, the preset layer number is the number of downsampling layers between the branch point and the main network output layer in the main network.
Optionally, the predetermined position is determined by:
counting the total layer number of the common convolution layer and the downsampling layer in the main network;
counting the number of common convolution layers and downsampling layers from the input layer to any given position in the main network;
and if that count is greater than or equal to one quarter of the total number of layers and less than or equal to one third of the total number of layers, determining the corresponding position as a predetermined position.
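The position rule above can be sketched in a few lines of Python (a hypothetical helper written for illustration; the function and argument names are not from the patent):

```python
def valid_branch_positions(num_conv, num_downsample):
    """Candidate branch-point positions, counted as the number of common
    convolution and downsampling layers between the input layer and the
    branch point.  A position qualifies when that count lies between one
    quarter and one third of the total layer count, inclusive."""
    total = num_conv + num_downsample          # total countable layers
    lo, hi = total / 4, total / 3              # allowed window
    return [k for k in range(1, total + 1) if lo <= k <= hi]
```

For a main network with 9 common convolution layers and 3 downsampling layers (12 countable layers in all), the branch point may sit after the 3rd or 4th layer.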
Optionally, the preset channel width is determined by the following method:
determining the channel width of a first downsampling layer as a preset channel width of a first branch network convolution layer;
wherein the first downsampling layer is any downsampling layer; the first branch-network convolution layer is the branch-network convolution layer corresponding to the first downsampling layer.
Optionally, the picture identification model is obtained through training by the following method:
inputting a first sample picture into an initial picture identification model with a preset structure, and performing a first model training to obtain an intermediate picture identification model; the first sample picture does not include an attack sample picture under attack;
inputting a second sample picture into the intermediate picture identification model, and acquiring attack sample pictures in the second sample picture; the second sample picture is obtained by modifying the first sample picture with an attack algorithm;
changing the structure of the intermediate picture identification model;
and inputting a third sample picture comprising the first sample picture and the attack sample picture into an initial branch network in the intermediate picture identification model with the changed structure, performing second model training, determining parameters of the branch network, and obtaining a picture identification model.
Optionally, the initial picture identification model includes an initial main network, an initial branch network, the branch point and a second processing module;
the structure of the initial main network is the same as that of the main network; the initial branch network and the branch network have the same structure; the initial branch network is connected with the initial main network at a preset position through the branch point;
the initial main network output layer of the initial main network and the initial branch network output layer of the initial branch network are connected with the second processing module; the second processing module is used for comparing the processing result of the initial branch network with the processing result of the initial main network.
Optionally, inputting the first sample picture into an initial picture identification model with a predetermined structure, performing a first model training, and obtaining an intermediate picture identification model, including the following steps:
inputting the first sample picture into the initial picture identification model for multiple training;
after each training is finished, acquiring a branch network loss function corresponding to the initial branch network and a main network loss function corresponding to the initial main network;
adding the branch network loss function and the main network loss function to obtain a total loss function;
and determining an initial picture identification model when the total loss function is optimal as the intermediate picture identification model.
Optionally, inputting the second sample picture into the intermediate picture identification model, and obtaining the attack sample picture in the second sample picture includes:
inputting the second sample picture into the intermediate picture identification model, and respectively acquiring a second sample picture branch identification result corresponding to the initial branch network and a second sample picture main identification result corresponding to the initial main network;
determining a similarity value between the branch identification result of the second sample picture and the main identification result of the second sample picture;
and if the similarity value is smaller than a preset similarity threshold value, determining the corresponding second sample picture as the attack sample picture.
Optionally, changing the structure of the intermediate picture identification model includes:
removing the second processing module;
and adding the first processing module, and connecting the branch network output layer and the branch point with the first processing module.
In the method, a branch network is added to an existing main network and trained on attacked pictures to be identified, so that the branch network achieves high identification accuracy on attacked pictures while the main network retains its high identification accuracy on unattacked pictures. The branch network is derived from the main network, and since an attacker cannot predict the output of the branch network, mounting an effective attack is difficult. During identification, the branch network processes the picture first; when the branch network's result can serve as the final identification result, the workload of the picture identification model is greatly reduced. The method provided by the embodiment of the application thus improves identification accuracy for attacked pictures to be identified without reducing identification accuracy for unattacked pictures.
Drawings
Fig. 1 is a schematic structural diagram of a picture recognition model according to an embodiment of the present application;
fig. 2 is a flow chart of a picture identification method according to an embodiment of the present application;
fig. 3 is a flow chart of a training method of a picture recognition model according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of an initial picture recognition model according to an embodiment of the present application;
fig. 5 is a flow chart corresponding to a method for identifying an attack sample according to an embodiment of the present application;
fig. 6 is a flow chart of another picture identification method according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
Before explaining the method provided by the embodiment of the present application, a pre-trained image recognition model for image recognition provided by the embodiment of the present application is first described. Fig. 1 is a schematic structural diagram of a picture recognition model according to the embodiment of the present application.
The picture recognition model comprises: a main network 1, a branch network 2, a branch point 3, and a first processing module 4.
The main network 1 is any convolutional neural network model: any convolutional neural network model usable for picture recognition can serve as the main network 1, and the embodiment of the present application places no limitation on it. The main network 1 can complete picture identification on its own. The embodiment of the application adds the branch network 2 alongside the main network 1, and the branch network 2 assists the main network 1 in identifying attacked pictures to be identified, improving the accuracy of the whole picture identification model.
The main network 1 includes an input layer, a plurality of common convolution layers, a plurality of downsampling layers, and a main network output layer, connected sequentially from front to back. The main network output layer consists of a pooling layer and a fully connected layer, one of each. In the embodiment of the application, the number of common convolution layers is constrained only in that there must be at least two of them.
The common convolution layer and the downsampling layer are used for extracting the characteristic quantity of the picture to be identified. The pooling layer is used for reducing the dimension of the extracted characteristic quantity. The full connection layer is used for outputting the final recognition result.
The branch point 3 is provided at a predetermined position between the first common convolution layer and the pooling layer of the main network 1. The first common convolution layer is the common convolution layer closest to the input layer.
Specifically, the predetermined position is determined by:
the total number of layers of the normal convolutional layer and the downsampling layer in the main network 1 is counted.
The total number of layers from the input layer to any position of the main network 1 and any position of the downsampling layer is counted.
Setting the branching point 3 at a predetermined position ensures that the main network 1 has performed feature amount extraction at least once, instead of directly processing the picture to be recognized by the branching network 2.
And if the total number of layers of the arbitrary position is more than or equal to one fourth of the total number of layers and less than or equal to one third of the total number of layers, determining the corresponding arbitrary position as the preset position.
For convenience of explanation, in the embodiment of the present application, the part of the main network 1 from the first common convolution layer to the branch point 3 constitutes the first main network processing module 11, and the part from the branch point 3 to the main network output layer constitutes the second main network processing module 12.
The branch network 2 includes branch-network convolution layers, with a preset number of layers and preset channel widths, and a branch network output layer, connected sequentially from front to back. The branch network output layer consists of a branch-network pooling layer and a branch-network fully connected layer.
The preset number of layers and the preset channel widths are determined according to the downsampling layers in the main network 1. The branch-network pooling layer has the same structure as the main network's pooling layer, and the branch-network fully connected layer has the same structure as the main network's fully connected layer.
It should be further noted that, in the embodiment of the present application, the branch network 2 is determined according to the main network 1. Therefore, the preset layer number is the layer number of the downsampling layer between the branch point 3 and the output layer of the main network 1 in the main network 1, that is, the layer number of the downsampling layer in the second main network processing module 12.
The preset channel width is the channel width of the downsampling layer.
Wherein the branch network 2 is connected to the main network 1 via a branch point 3.
The output layer of the branch network 2 and the branch point 3 are connected to a first processing module 4.
The preset channel width is determined by the following method:
the channel width of the first downsampling layer is determined as the preset channel width of the convolutional layer of the first branch network 2.
Wherein the first downsampling layer is any sampling layer. The first branch network 2 convolution layer is the branch network 2 convolution layer corresponding to the first downsampling layer.
It should be noted that one downsampling layer corresponds to one branch network 2 convolution layer in the branch network 2. The channel width of the downsampling layer is the preset channel width of the convolution layer of the corresponding branch network 2.
For example, if there are three downsampling layers in the second main network processing module 12, three branch network convolution layers are correspondingly disposed in the branch network 2. The first downsampling layer from the branching point corresponds to the first branching network convolution layer in the branching network 2, and the preset channel width of the branching network convolution layer is the channel width of the first downsampling layer from the branching point.
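The correspondence just described can be sketched as follows (a minimal illustration; representing the main network's downsampling layers as a list of `(layer_index, channel_width)` pairs is an assumption, not the patent's notation):

```python
def branch_network_spec(downsample_layers, branch_position):
    """Derive the branch network 2 from the main network 1: the preset
    layer number equals the number of downsampling layers located after
    the branch point, and each branch convolution layer inherits the
    channel width of its corresponding downsampling layer."""
    after = [(i, w) for i, w in downsample_layers if i > branch_position]
    preset_layer_number = len(after)
    preset_channel_widths = [w for _, w in after]
    return preset_layer_number, preset_channel_widths
```

With three downsampling layers after the branch point, three branch-network convolution layers are produced, each matching its counterpart's channel width.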
Fig. 2 is a schematic flow chart of a picture identifying method according to an embodiment of the present application. As shown in fig. 2, the embodiment of the present application includes the following steps:
in step S201, the input layer receives the picture to be identified and transmits the picture to be identified to the first main network processing module.
Step S202, a first main network processing module performs first feature processing on a picture to be identified to obtain a first processed picture.
In this embodiment, the branching point 3 is set after the first common convolution layer, so as to enable the main network 1 to perform feature processing on the picture to be identified at least once.
In step S203, the branch point stores the first processed picture.
Step S204, the branch network identifies the first processed picture and determines a branch identification result.
It should be noted that, the branch network 2 provided in the embodiment of the present application has a good recognition degree for the attacked picture to be recognized, so the picture to be recognized is first processed through the branch network 2, and the problem that the accuracy of the recognition result is not high due to the fact that the main network 1 directly recognizes the picture to be recognized is avoided.
In step S205, the first processing module receives the branch identification result output by the branch network, and determines the cross entropy according to the branch identification result and the maximum value position of the branch identification result.
For example, if the branch identification result is [0.1, 0.2, 0.5, 0.2], the maximum-value position of the branch identification result is [0, 0, 1, 0]. Let p be the branch identification result, i.e. p = [0.1, 0.2, 0.5, 0.2], and let q be the maximum-value position, i.e. q = [0, 0, 1, 0]. The cross entropy of the two is H(q, p) = -Σₙ q(n) log p(n) = -1 × log₁₀(0.5) ≈ 0.3, so 0.3 is the corresponding cross entropy.
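The calculation in this example can be reproduced directly (a sketch; the worked example uses base-10 logarithms, so `math.log10` is used here):

```python
import math

def branch_cross_entropy(p):
    """Cross entropy H(q, p) between a branch identification result p and
    its one-hot maximum-value position q; only the arg-max term of the sum
    survives, since q is zero everywhere else."""
    m = max(p)
    q = [1.0 if v == m else 0.0 for v in p]
    return -sum(qi * math.log10(pi) for qi, pi in zip(q, p) if qi > 0)
```

`branch_cross_entropy([0.1, 0.2, 0.5, 0.2])` evaluates to -log₁₀(0.5) ≈ 0.301, matching the ≈0.3 above.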
Step S206, judging whether the cross entropy is smaller than a preset threshold, if so, executing step S207, otherwise, executing step S208.
It should be noted that, in the embodiment of the present application, steps S205 and S206 determine whether the branch identification result reaches the expected accuracy; if it does, the branch identification result is simply taken as the final identification result. This avoids a large amount of feature-extraction work in which the main network 1 would re-confirm the picture to be identified, reducing the workload of the picture identification model.
Step S207, outputting the branch identification result.
In step S208, the branch point transmits the first processed picture to the second main network processing module 12.
Since the branch point stored the first processed picture in step S203, in this step it can pass that stored result on to the second main network processing module 12.
Step S209, the second main network processing module performs second feature processing on the first processed picture to obtain a main recognition result. And outputs the main recognition result.
It should be noted that, in the embodiment of the present application, the main network 1 has good recognition accuracy for the picture to be recognized that is not attacked. Therefore, if the branch recognition result does not reach the intended accuracy, the main network 1 is required to continue the recognition processing of the picture to be recognized and take the main recognition result as the final recognition result.
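Steps S201–S209 amount to the following control flow (a schematic sketch; the callables `first_stage`, `branch_net`, and `second_stage` are hypothetical stand-ins for the two main-network processing modules and the branch network):

```python
import math

def branch_cross_entropy(p):
    # With q the one-hot arg-max of p, H(q, p) reduces to -log(max(p));
    # base-10 logs to match the patent's worked example.
    return -math.log10(max(p))

def identify(picture, first_stage, branch_net, second_stage, threshold):
    """Branch-first inference: accept the branch identification result when
    its cross entropy is below the preset threshold, otherwise fall back to
    the second main-network processing module."""
    first_processed = first_stage(picture)        # S201-S202: first feature processing
    branch_result = branch_net(first_processed)   # S204: branch identification
    if branch_cross_entropy(branch_result) < threshold:
        return branch_result                      # S207: confident branch result
    return second_stage(first_processed)          # S208-S209: main recognition result
```

Note that only the first processed picture, stored at the branch point, is handed to the second stage, so the first feature processing is never repeated.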
The image recognition model in the process of executing the steps is trained in advance, and the training process of the image recognition model provided by the embodiment of the application is specifically described next.
Fig. 3 is a schematic flow chart of a training method of a picture recognition model according to an embodiment of the present application. It should be noted that, the training method provided in the embodiment of the present application includes two training processes, namely, a first training process and a second training process. In the two training processes, the structures of the picture identification models are different, and the corresponding effects of the picture identification models are also different. The training method provided by the embodiment of the application comprises the following steps:
step S301, inputting the first sample picture into an initial picture recognition model with a preset structure, and performing a first model training to obtain an intermediate picture recognition model.
Wherein the first sample picture does not include an attack sample picture that is under attack.
At this point, the initial picture recognition model is not structurally identical to the finally established pre-trained picture recognition model. Fig. 4 is a schematic structural diagram of an initial picture recognition model according to an embodiment of the present application.
As shown in fig. 4, the initial picture recognition model includes an initial main network 61, an initial branch network 62, a branch point 3, and a second processing module 5.
The structure of the initial main network 61 is the same as that of the main network 1. I.e. the initial main network 61 also comprises an input layer, a plurality of normal convolution layers and a plurality of downsampling layers. The initial branch network 62 is of the same structure as the branch network 2, i.e. the initial branch network 62 also comprises a plurality of initial branch network convolution layers. The initial branch network 62 is connected to the initial main network 61 at a predetermined position through the branch point 3.
The initial main network output layer of the initial main network 61 and the initial branch network output layer of the initial branch network 62 are connected to the second processing module 5. The second processing module 5 is configured to compare the processing result of the initial branch network 62 with the processing result of the initial main network 61.
Similar to the trained picture recognition model, the part of the initial main network 61 from the first initial common convolution layer to the branch point 3 forms the first initial main-network processing module 611, and the part of the initial main network 61 from the branch point 3 to the initial main network output layer forms the second initial main-network processing module 612.
Specifically, step 301 includes the steps of:
first, a first sample picture is input into an initial picture recognition model, and training is performed for a plurality of times.
Then, after each training, the branch network loss function corresponding to the initial branch network 62 and the main network loss function corresponding to the initial main network 61 are acquired.
It should be noted that, after each training, the initial branch network 62 and the initial main network 61 generate a loss function to be optimized, respectively.
Then, the branch network loss function and the main network loss function are added to obtain the total loss function; in this step the two losses are summed linearly. In the embodiment of the present application, gradient descent is performed with the total loss function as the optimization target.
And finally, determining the initial picture identification model when the total loss function is optimal as an intermediate picture identification model.
It should be noted that, step 301 trains the initial branch network 62 and the initial main network 61 with the same first sample picture, and both the initial branch network 62 and the initial main network 61 after training have better recognition function for the picture that is not attacked.
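The epoch-selection logic of this first training stage can be sketched as follows (representing each epoch by a `(branch_loss, main_loss)` pair is an assumption for illustration; real training would of course also drive gradient descent on the summed loss):

```python
def select_intermediate_model(epoch_losses):
    """First training stage: after each epoch the branch-network loss and
    the main-network loss are added linearly, and the checkpoint with the
    smallest total loss becomes the intermediate picture recognition model."""
    totals = [branch + main for branch, main in epoch_losses]
    best_epoch = min(range(len(totals)), key=totals.__getitem__)
    return best_epoch, totals[best_epoch]
```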
Step S302, inputting the second sample picture into the intermediate picture identification model, and obtaining an attack sample picture in the second sample picture.
The second sample picture is obtained by modifying the first sample picture with an attack algorithm. This step imitates an attacker attacking the picture to be identified by adding interference to it.
Fig. 5 is a schematic flow chart of a method for identifying attack samples according to an embodiment of the present application, which mainly comprises the following steps:
step S501, inputting the second sample picture into the intermediate picture recognition model, and respectively obtaining a second sample picture branch recognition result corresponding to the initial branch network and a second sample picture main recognition result corresponding to the initial main network.
Step S502, determining a similarity value between the branch identification result of the second sample picture and the main identification result of the second sample picture.
In the embodiment of the present application, the similarity value can be judged by a plurality of methods. For example, the inner product of the second sample picture branch identification result and the second sample picture main identification result can be computed: the larger the inner product, the more similar the two results. Alternatively, the norm distance of the difference vector between the branch identification result and the main identification result can be calculated: the smaller the norm distance, the more similar the two results. Whichever method is selected, the branch identification result and the main identification result must first be normalized.
Another feasible method provided by the present application is to determine the cross entropy between the second sample picture branch identification result and the second sample picture main identification result.
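The candidate similarity measures discussed above (inner product, norm distance of the difference vector, cross entropy) can be sketched as follows, assuming the recognition results are raw score vectors that are first normalized with a softmax (the normalization step the text requires). This is an illustrative sketch, not the claimed implementation.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def inner_product_similarity(branch_out, main_out):
    p, q = softmax(branch_out), softmax(main_out)
    return float(np.dot(p, q))                     # larger -> more similar

def norm_distance(branch_out, main_out, p_ord=2):
    p, q = softmax(branch_out), softmax(main_out)
    return float(np.linalg.norm(p - q, p_ord))     # smaller -> more similar

def cross_entropy(branch_out, main_out):
    p, q = softmax(branch_out), softmax(main_out)
    return float(-(p * np.log(q + 1e-12)).sum())   # smaller -> more similar

# Hypothetical branch/main score pairs: one agreeing, one disagreeing.
agree    = (np.array([4.0, 0.1, 0.1]), np.array([3.5, 0.2, 0.1]))
disagree = (np.array([4.0, 0.1, 0.1]), np.array([0.1, 4.0, 0.1]))
```

A second sample picture would then be flagged as an attack sample when the chosen similarity value falls below its preset threshold (or, for the distance-style measures, rises above it), as in steps S503 to S505.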
Step S503, judging whether the similarity value is smaller than a preset similarity threshold, if so, executing step S504, otherwise, executing step S505.
In the embodiment of the application, the preset similarity threshold is determined empirically and can be adjusted according to actual needs.
Step S504, determining the corresponding second sample picture as an attack sample picture.
In step S505, the corresponding second sample picture is determined as a normal sample picture.
It should be noted that, in practice, the intermediate picture recognition model may itself be used in picture recognition to determine whether a picture to be identified has been attacked; steps S501 to S505 are exactly this attack-determination process.
Since the attack on the picture to be identified is crafted against the original main network, the recognition accuracy of the initial main network 61 drops once the picture is attacked. A typical attacker does not expect the intermediate picture recognition model to contain the initial branch network 62 and therefore cannot tailor the interference to it; on this premise, the initial branch network 62 retains a high recognition accuracy for the picture to be identified. If the similarity between the branch recognition result and the main recognition result is low, the initial branch network 62 and the initial main network 61 disagree on the picture to be identified, which indicates that the picture has been attacked.
It should be noted that steps S501 to S505 can independently determine whether a picture to be identified has been attacked, so the intermediate picture recognition model can be put into use on its own. If attacked pictures are found during use, they are collected, mixed into the third sample picture, and the subsequent step S304 is performed.
Step S303, changing the structure of the intermediate picture recognition model.
Specifically, step S303 includes the steps of:
the second processing module 5 is removed.
The first processing module 4 is added and the branching network output layer and the branching point 3 are connected to the first processing module 4.
The structure of the intermediate picture recognition model is adjusted to the structure of the trained picture recognition model in step S303.
Step S304, inputting a third sample picture comprising the first sample picture and the attack sample picture into an initial branch network in the intermediate picture identification model after the structure is changed, performing a second model training, determining parameters of the branch network, and obtaining a picture identification model.
In the second model training process, the main network 1 does not participate in training, and the parameters of the main network 1 are not adjusted. Step S304 gives the branch network 2 a better recognition capability for attacked pictures to be identified; in essence, after step S304 is performed, the branch network 2 is used to recognize attacked pictures to be identified, while the main network 1 is used to recognize pictures that have not been attacked.
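The second model training, in which the main network is frozen and only the branch network is updated on the mixed third sample pictures, can be sketched as follows. This is a hypothetical PyTorch-style illustration with stand-in layer sizes, not the claimed implementation.

```python
import torch
import torch.nn as nn

trunk = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
main_head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10))
branch_head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10))

# The main network (trunk and main head) does not participate in this stage:
for p in list(trunk.parameters()) + list(main_head.parameters()):
    p.requires_grad = False

# Only the branch network's parameters are handed to the optimizer.
opt = torch.optim.SGD(branch_head.parameters(), lr=0.01)

x = torch.randn(4, 3, 16, 16)          # stand-in for third sample pictures
y = torch.randint(0, 10, (4,))

main_before = [p.clone() for p in main_head.parameters()]
loss = nn.functional.cross_entropy(branch_head(trunk(x)), y)
opt.zero_grad()
loss.backward()
opt.step()                              # only branch parameters move
```

Freezing the main network preserves its accuracy on unattacked pictures while the branch specializes on attacked ones.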
It should be specifically noted that, since the intermediate picture recognition model provided in the embodiment of the present application is of practical significance on its own, another picture recognition method is also provided in the embodiment of the present application. Fig. 6 is a schematic flow chart of this other picture recognition method. The method proceeds as follows:
Step S601, training the pre-built initial picture recognition model a plurality of times to obtain an intermediate picture recognition model.
The structure of the initial image recognition model is shown in fig. 4, which has been described above and will not be described here again. The process of training the initial picture recognition model is consistent with step S301, and will not be described here again.
Step S602, inputting the picture to be identified into an intermediate picture identification model to obtain an attack picture.
Specifically, the initial branch network and the initial main network in the intermediate picture identification model each identify the picture to be identified; if the difference between their identification results is large, the picture to be identified is determined to be an attack picture, the attack picture is stored, and an alarm instruction is sent to the staff. The alarm instruction reminds the staff that the structure of the initial main network may have been discovered by an attacker and that the intermediate picture recognition model needs to be optimized.
Step S603, after the number of attack pictures reaches a preset number, mixing the attack pictures into the pictures to be identified, and training the intermediate picture identification model multiple times to obtain the picture identification model.
In addition, step S603 need not be executed in every practical deployment; when recognizing pictures, only steps S601 and S602 may be performed. The intermediate recognition model can both recognize a picture and judge whether the picture to be recognized has been attacked. Step S603 may be understood as a self-updating method of the intermediate recognition model after attacks have been identified. The training process in step S603 is the same as that in step S304 and is not repeated here.
According to the method provided by the embodiment of the present application, a branch network is added on the basis of the existing main network and trained on attacked pictures to be identified, so that the branch network attains high recognition accuracy on attacked pictures while the main network retains high recognition accuracy on pictures that have not been attacked. The branch network is derived from the main network, yet an attacker cannot predict its output, making an effective attack difficult. In the recognition process, the branch network identifies the picture first; if its result can serve as the final recognition result, the workload of the picture recognition model is greatly reduced. The method thus improves recognition accuracy for attacked pictures to be identified without reducing recognition accuracy for unattacked pictures.
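The inference-time gating summarized above (the branch network answers first; the main network is consulted only when the branch is unconfident) can be sketched as follows. Following the claims, confidence is measured here as the cross entropy between the branch output and the one-hot vector at its own argmax; the threshold value 0.5 and all score vectors are hypothetical choices for illustration.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def recognize(branch_logits, main_logits, threshold=0.5):
    p = softmax(branch_logits)
    # Cross entropy between p and the one-hot vector at p's own argmax
    # reduces to -log of the maximum probability: small when confident.
    confidence_ce = -np.log(p[np.argmax(p)] + 1e-12)
    if confidence_ce < threshold:
        return "branch", int(np.argmax(p))       # branch result is final
    q = softmax(main_logits)                     # otherwise use the main network
    return "main", int(np.argmax(q))

confident = np.array([8.0, 0.1, 0.1])    # sharply peaked branch output
uncertain = np.array([1.0, 0.9, 0.8])    # nearly flat branch output
main_logits = np.array([0.2, 2.0, 0.1])
```

When the branch is confident its answer is emitted immediately and the second main-network processing module never runs, which is where the workload saving comes from.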
The invention is operational with numerous general purpose or special purpose computing system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
The invention may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This invention is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.
It is to be understood that the invention is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the invention is limited only by the appended claims.
Claims (8)
1. A picture identification method, characterized by adopting a pre-trained picture identification model, wherein the picture identification model comprises: a main network, a branch network, a branch point, and a first processing module;
the main network is any convolutional neural network model; the main network comprises an input layer, a plurality of common convolution layers, a plurality of downsampling layers and a main network output layer which are sequentially connected from front to back;
the branch point is arranged at a preset position between the first common convolution layer and the pooling layer of the main network; the first common convolution layer is the common convolution layer closest to the input layer;
the part from the first common convolution layer to the branch point in the main network forms a first main network processing module; the part from the branch point to the main network output layer in the main network forms a second main network processing module;
the branch network comprises a branch network convolution layer and a branch network output layer, wherein the branch network convolution layer and the branch network output layer are sequentially connected from front to back and have preset layer numbers and channel widths; the preset layer number and the preset channel width are determined according to the downsampling layer in the main network;
wherein the branch network is connected with the main network through the branch point;
the branch network output layer and the branch point are connected with the first processing module;
the picture identification method comprises the following steps:
the input layer receives a picture to be identified and transmits the picture to be identified to the first main network processing module;
the first main network processing module performs first characteristic processing on the picture to be identified to obtain a first processed picture;
the branch point stores the first processed picture;
the branch network identifies the first processed picture and determines a branch identification result;
the first processing module receives the branch identification result output by the branch network and determines cross entropy according to the branch identification result and the maximum value position of the branch identification result; outputting a branch identification result if the cross entropy is smaller than a preset threshold value;
if the cross entropy is greater than or equal to the preset threshold value, the branch point transmits the first processed picture to the second main network processing module; the second main network processing module performs second characteristic processing on the first processed picture to obtain a main recognition result; and outputting the main recognition result;
the picture identification model is obtained through training by the following method:
inputting a first sample picture into an initial picture identification model with a preset structure, and performing a first model training to obtain an intermediate picture identification model; the first sample picture does not include an attack sample picture under attack;
inputting a second sample picture into the intermediate picture identification model, and acquiring an attack sample picture in the second sample picture; the second sample picture is obtained by modifying the first sample picture by using an attack algorithm;
changing the structure of the intermediate picture identification model;
and inputting a third sample picture comprising the first sample picture and the attack sample picture into an initial branch network in the intermediate picture identification model with the changed structure, performing second model training, determining parameters of the branch network, and obtaining a picture identification model.
2. The picture identification method according to claim 1, wherein the preset layer number is the number of downsampling layers between the branch point and the main network output layer in the main network.
3. The picture identification method according to claim 1, wherein the preset position is determined by:
counting the total number of common convolution layers and downsampling layers in the main network;
counting, for any position in the main network, the number of common convolution layers and downsampling layers from the input layer to that position;
and if the number of layers counted for that position is greater than or equal to one quarter of the total number of layers and less than or equal to one third of the total number of layers, determining that position as the preset position.
4. The picture recognition method according to claim 1, wherein the preset channel width is determined by:
determining the channel width of a first downsampling layer as a preset channel width of a first branch network convolution layer;
wherein the first downsampling layer is any one of the downsampling layers; the first branch network convolution layer is the branch network convolution layer corresponding to the first downsampling layer.
5. The picture recognition method as claimed in claim 1, wherein the initial picture recognition model includes an initial main network, an initial branch network, the branch point, and a second processing module;
the structure of the initial main network is the same as that of the main network; the initial branch network and the branch network have the same structure; the initial branch network is connected with the initial main network at a preset position through the branch point;
the initial main network output layer of the initial main network and the initial branch network output layer of the initial branch network are connected with the second processing module; the second processing module is used for comparing the processing result of the initial branch network with the processing result of the initial main network.
6. The picture recognition method according to claim 1, wherein inputting the first sample picture into an initial picture recognition model of a predetermined structure, performing a first model training, and obtaining an intermediate picture recognition model, comprises the steps of:
inputting the first sample picture into the initial picture identification model for multiple training;
after each training is finished, acquiring a branch network loss function corresponding to the initial branch network and a main network loss function corresponding to the initial main network;
adding the branch network loss function and the main network loss function to obtain a total loss function;
and determining an initial picture identification model when the total loss function is optimal as the intermediate picture identification model.
7. The method for recognizing a picture according to claim 5, wherein inputting a second sample picture into the intermediate picture recognition model, obtaining an attack sample picture in the second sample picture, comprises:
inputting the second sample picture into the intermediate picture identification model, and respectively acquiring a second sample picture branch identification result corresponding to the initial branch network and a second sample picture main identification result corresponding to the initial main network;
determining a similarity value between the branch identification result of the second sample picture and the main identification result of the second sample picture;
and if the similarity value is smaller than a preset similarity threshold value, determining the corresponding second sample picture as the attack sample picture.
8. The picture recognition method as claimed in claim 5, wherein changing the structure of the intermediate picture recognition model comprises:
removing the second processing module;
and adding the first processing module, and connecting the branch network output layer and the branch point with the first processing module.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110009127.XA CN112712126B (en) | 2021-01-05 | 2021-01-05 | Picture identification method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112712126A CN112712126A (en) | 2021-04-27 |
CN112712126B true CN112712126B (en) | 2024-03-19 |
Family
ID=75548302
Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108171260A (en) * | 2017-12-15 | 2018-06-15 | 百度在线网络技术(北京)有限公司 | A kind of image identification method and system |
CN109375186A (en) * | 2018-11-22 | 2019-02-22 | 中国人民解放军海军航空大学 | Radar target identification method based on the multiple dimensioned one-dimensional convolutional neural networks of depth residual error |
CN109784153A (en) * | 2018-12-10 | 2019-05-21 | 平安科技(深圳)有限公司 | Emotion identification method, apparatus, computer equipment and storage medium |
CN109829443A (en) * | 2019-02-23 | 2019-05-31 | 重庆邮电大学 | Video behavior recognition methods based on image enhancement Yu 3D convolutional neural networks |
CN110084318A (en) * | 2019-05-07 | 2019-08-02 | 哈尔滨理工大学 | A kind of image-recognizing method of combination convolutional neural networks and gradient boosted tree |
CN110443286A (en) * | 2019-07-18 | 2019-11-12 | 广州华多网络科技有限公司 | Training method, image-recognizing method and the device of neural network model |
CN110516745A (en) * | 2019-08-28 | 2019-11-29 | 北京达佳互联信息技术有限公司 | Training method, device and the electronic equipment of image recognition model |
CN111199217A (en) * | 2020-01-09 | 2020-05-26 | 上海应用技术大学 | Traffic sign identification method and system based on convolutional neural network |
CN111626317A (en) * | 2019-08-14 | 2020-09-04 | 广东省智能制造研究所 | Semi-supervised hyperspectral data analysis method based on double-flow conditional countermeasure generation network |
CN111709190A (en) * | 2020-06-24 | 2020-09-25 | 国电联合动力技术有限公司 | Wind turbine generator operation data image identification method and device |
CN111860545A (en) * | 2020-07-30 | 2020-10-30 | 元神科技(杭州)有限公司 | Image sensitive content identification method and system based on weak detection mechanism |
CN111860147A (en) * | 2020-06-11 | 2020-10-30 | 北京市威富安防科技有限公司 | Pedestrian re-identification model optimization processing method and device and computer equipment |
CN112115973A (en) * | 2020-08-18 | 2020-12-22 | 吉林建筑大学 | Convolutional neural network based image identification method |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11195057B2 (en) * | 2014-03-18 | 2021-12-07 | Z Advanced Computing, Inc. | System and method for extremely efficient image and pattern recognition and artificial intelligence platform |
US11074495B2 (en) * | 2013-02-28 | 2021-07-27 | Z Advanced Computing, Inc. (Zac) | System and method for extremely efficient image and pattern recognition and artificial intelligence platform |
Non-Patent Citations (3)
Title |
---|
Zahra Dehghan Doolab; Seyyed Ali Seyyedsalehi; Narjes Soltani Dehaghani. "Nonlinear Normalization of Input Patterns to Handwritten Character Variability in Handwriting Recognition Neural Network". 2012 International Conference on Biomedical Engineering and Biotechnology, 2012, full text. *
Image Big Data Recognition Based on Convolutional Neural Networks; Wu Haili; Journal of Shanxi Datong University (Natural Science Edition), No. 02; full text. *
Face Attribute Recognition Method Based on Multi-Task Learning; Li Ya; Zhang Yunan; Peng Cheng; Yang Junqin; Liu Miao; Computer Engineering, No. 03; full text. *
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||