CN112712126A - Picture identification method - Google Patents

Picture identification method

Info

Publication number
CN112712126A
CN112712126A (application CN202110009127.XA)
Authority
CN
China
Prior art keywords
picture
branch
network
initial
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110009127.XA
Other languages
Chinese (zh)
Other versions
CN112712126B (en)
Inventor
王中风
何鎏璐
王美琪
林军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University
Original Assignee
Nanjing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University filed Critical Nanjing University
Priority to CN202110009127.XA priority Critical patent/CN112712126B/en
Publication of CN112712126A publication Critical patent/CN112712126A/en
Application granted granted Critical
Publication of CN112712126B publication Critical patent/CN112712126B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The application provides a picture identification method that uses a pre-trained picture identification model. The picture identification model comprises a main network, a branch network, a branch point, and a first processing module. The main network is any convolutional neural network model; the branch point is arranged at a predetermined position; the branch network comprises branch network convolution layers with a preset number of layers and a preset channel width, both determined from the down-sampling layers in the main network. The picture identification method comprises: the main network performs first feature processing on the picture to be identified to obtain a first processed picture; the branch network identifies the first processed picture and determines a branch identification result; the first processing module receives the branch identification result output by the branch network and determines the cross entropy between the branch identification result and the one-hot vector at its maximum-value position; and if the cross entropy is smaller than a preset threshold, the branch identification result is output. The method provided by the application improves picture identification accuracy.

Description

Picture identification method
Technical Field
The present application relates to the field of image processing technologies, and in particular, to a picture identification method.
Background
Recognizing picture categories with convolutional neural network models is a common technique in picture recognition. As convolutional neural network models have found wide practical application, different models have become increasingly well adapted to different picture types, and the accuracy of picture identification has improved accordingly.
Pictures are inevitably exposed to various security threats during transmission. Among these, adversarial attacks specifically target picture recognition accuracy. Although an attacker cannot fully determine the specific structure of the convolutional neural network model, the attacker can infer its approximate structure from the model's previous recognition results. Based on this inferred structure, the attacker adds slight, targeted perturbations to the picture to be identified. These subtle perturbations are difficult to notice visually, but they are picked up by the convolutional neural network model, which treats them as an original part of the picture to be recognized, thereby degrading the overall recognition accuracy.
To cope with adversarial attacks and improve the recognition accuracy of a convolutional neural network model under interference, the model's parameters are usually adjusted as a whole. However, such whole-model adjustment only yields high identification accuracy for attacked pictures to be identified; for ordinary, un-attacked pictures the identification accuracy is reduced.
There is therefore a need for a picture identification method that addresses the reduction in recognition accuracy of convolutional neural network models caused by adversarial attacks in the prior art.
Disclosure of Invention
A picture identification method is provided to solve the problem in the prior art that the identification accuracy of a convolutional neural network model is reduced by adversarial attacks.
The application provides a picture identification method that uses a pre-trained picture identification model, wherein the picture identification model comprises: a main network, a branch network, a branch point, and a first processing module;
the main network is any convolutional neural network model; the main network comprises an input layer, a plurality of common convolution layers, a plurality of down-sampling layers, and a main network output layer, connected in sequence from front to back;
the branch point is arranged at a predetermined position between the first common convolution layer of the main network and the pooling layer; the first common convolution layer is the common convolution layer closest to the input layer;
wherein a portion of the main network from the first normal convolution layer to the branch point constitutes a first main network processing module; the part from the branch point to the main network output layer in the main network forms a second main network processing module;
the branch network comprises branch network convolution layers with a preset number of layers and a preset channel width, and a branch network output layer, connected in sequence from front to back; the preset number of layers and the preset channel width are determined according to the down-sampling layers in the main network;
wherein the branch network is connected with the main network through the branch point;
the branch network output layer and the branch point are connected with the first processing module;
the picture identification method comprises the following steps:
the input layer receives a picture to be identified and transmits the picture to be identified to the first main network processing module;
the first main network processing module performs first feature processing on the picture to be identified to obtain a first processed picture;
the branch point stores the first processing picture;
the branch network identifies the first processed picture and determines a branch identification result;
the first processing module receives the branch identification result output by the branch network and determines the cross entropy according to the branch identification result and the maximum position of the branch identification result; if the cross entropy is smaller than a preset threshold value, outputting a branch identification result;
if the cross entropy is greater than or equal to the preset threshold value, the branch point transmits the first processed picture to the second main network processing module; the second main network processing module performs second feature processing on the first processed picture to obtain a main identification result; and the main identification result is output. Optionally, the preset number of layers is the number of down-sampling layers between the branch point and the main network output layer in the main network.
Optionally, the predetermined position is determined by:
counting the total number of the common convolution layers and the downsampling layers in the main network;
counting the number of common convolution layers and down-sampling layers from the input layer to any given position in the main network;
and if the layer count at a given position is greater than or equal to one quarter of the total layer count and less than or equal to one third of the total layer count, determining that position as the predetermined position.
Optionally, the preset channel width is determined by the following method:
determining the channel width of a first down-sampling layer as the preset channel width of a first branch network convolution layer;
wherein the first down-sampling layer is any one of the down-sampling layers, and the first branch network convolution layer is the branch network convolution layer corresponding to the first down-sampling layer.
Optionally, the image recognition model is obtained by training through the following method:
inputting a first sample picture into an initial picture recognition model with a preset structure, and performing first model training to obtain an intermediate picture recognition model; the first sample picture does not comprise an attacked sample picture;
inputting a second sample picture into the intermediate picture identification model, and acquiring an attack sample picture in the second sample picture; the second sample picture is obtained by modifying the first sample picture using an attack algorithm;
changing the structure of the intermediate picture recognition model;
and inputting a third sample picture comprising the first sample picture and the attack sample picture into an initial branch network in the intermediate picture recognition model with the changed structure, performing second model training, determining parameters of the branch network, and obtaining the picture recognition model.
Optionally, the initial picture recognition model includes an initial main network, an initial branch network, the branch point, and a second processing module;
the structure of the initial main network is the same as that of the main network; the initial branch network and the branch network have the same structure; the initial branch network is connected to the initial main network at a predetermined location through the branch point;
an initial main network output layer of the initial main network and an initial branch network output layer of the initial branch network are connected with the second processing module; the second processing module is configured to compare a processing result of the initial branch network with a processing result of the initial main network.
Optionally, inputting the first sample picture into an initial picture recognition model with a predetermined structure, performing first model training, and obtaining an intermediate picture recognition model, including the following steps:
inputting the first sample picture into the initial picture recognition model, and performing multiple training;
after each training is finished, obtaining a branch network loss function corresponding to the initial branch network and a main network loss function corresponding to the initial main network;
adding the branch network loss function and the main network loss function to obtain a total loss function;
and determining the initial picture identification model when the total loss function is optimal as the intermediate picture identification model.
Optionally, inputting the second sample picture into the intermediate picture identification model, and acquiring the attack sample picture in the second sample picture, including:
inputting the second sample picture into the intermediate picture identification model, and respectively obtaining a second sample picture branch identification result corresponding to the initial branch network and a second sample picture main identification result corresponding to the initial main network;
determining a similarity value between the second sample picture branch identification result and the second sample picture main identification result;
and if the similarity value is smaller than a preset similarity threshold value, determining the corresponding second sample picture as the attack sample picture.
Optionally, changing the structure of the intermediate image recognition model includes:
removing the second processing module;
and adding the first processing module, and connecting the branch network output layer and the branch point with the first processing module.
According to the method, a branch network is added to an existing main network and is trained on attacked pictures to be identified, so that the branch network achieves high identification accuracy on attacked pictures while the main network retains high identification accuracy on un-attacked pictures. The branch network is derived from the main network; an attacker cannot anticipate its structure and therefore finds it difficult to mount an effective attack. During picture identification, the branch network identifies the picture first, and if the branch identification result can serve as the final result, the workload of the picture identification model is greatly reduced. The method provided by the embodiments of the application thus improves identification accuracy for attacked pictures without reducing identification accuracy for un-attacked pictures.
Drawings
Fig. 1 is a schematic structural diagram of a picture recognition model provided in the present application;
fig. 2 is a schematic flowchart of a picture identification method according to an embodiment of the present application;
fig. 3 is a schematic flowchart of a training method for a picture recognition model according to an embodiment of the present disclosure;
fig. 4 is a schematic structural diagram of an initial picture recognition model according to an embodiment of the present disclosure;
fig. 5 is a flowchart illustrating a method for identifying an attack sample according to an embodiment of the present application;
fig. 6 is a flowchart illustrating another picture identification method according to an embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
Before explaining the method provided by the embodiments of the present application, the pre-trained picture recognition model used for picture recognition is introduced first. Fig. 1 is a schematic structural diagram of the picture recognition model provided by an embodiment of the present application.
The picture recognition model comprises: a main network 1, a branch network 2, a branch point 3 and a first processing module 4.
The main network 1 is any convolutional neural network model. Any convolutional neural network model that can be used for picture recognition may be used as the main network 1 in the embodiments of the present application. The embodiment of the present application does not limit the main network 1. The main network 1 can perform the picture recognition function alone. According to the embodiment of the application, the branch network 2 is added outside the main network 1, and the branch network 2 assists the main network 1 to identify the attacked picture to be identified, so that the accuracy of the whole picture identification model is improved.
The main network 1 includes an input layer, a plurality of normal convolution layers, a plurality of down-sampling layers, and a main network output layer, connected in sequence from front to back. The main network output layer is composed of a pooling layer and a fully connected layer; there is exactly one pooling layer and one fully connected layer. In this embodiment, the number of normal convolution layers is constrained to a certain extent: there are at least two normal convolution layers.
Note that the normal convolution layers and down-sampling layers extract feature quantities of the picture to be recognized, the pooling layer reduces the dimensionality of the extracted features, and the fully connected layer outputs the final recognition result.
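For concreteness, the sketch below shows what such a main network could look like in PyTorch. It is only an illustrative assumption: the layer counts, channel widths (32/64/128/256), and the number of classes are placeholder choices, not values specified by this application.

```python
import torch
import torch.nn as nn

class MainNetwork(nn.Module):
    """Input layer -> normal convolution layers -> down-sampling layers -> pooling -> fully connected."""
    def __init__(self, num_classes=10):  # the class count is an assumed placeholder
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),               # normal convolution layer
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),              # normal convolution layer
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),    # down-sampling layer, channel width 64
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),   # down-sampling layer, channel width 128
            nn.Conv2d(128, 256, 3, stride=2, padding=1), nn.ReLU(),  # down-sampling layer, channel width 256
        )
        self.pool = nn.AdaptiveAvgPool2d(1)    # pooling layer: reduces the feature dimensionality
        self.fc = nn.Linear(256, num_classes)  # fully connected layer: outputs the recognition result

    def forward(self, x):
        x = self.features(x)
        x = self.pool(x).flatten(1)
        return self.fc(x)
```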
The branch point 3 is provided at a predetermined position between the first normal convolution layer and the pooling layer of the main network 1. The first normal convolution layer is the normal convolution layer closest to the input layer.
Specifically, the predetermined position is determined by:
and counting the total number of the common convolution layers and the downsampling layers in the main network 1.
The total number of arbitrary position layers of the normal convolution layer and the down-sampling layer from the input layer to the arbitrary position of the main network 1 is counted.
Setting the branch point 3 at a predetermined position ensures that the main network 1 performs feature amount extraction at least once, instead of directly processing the picture to be recognized by the branch network 2.
And if the total layer number of the arbitrary positions is more than or equal to one fourth of the total layer number and less than or equal to one third of the total layer number, determining the corresponding arbitrary positions as the preset positions.
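As a minimal illustration of this rule, the helper below enumerates the candidate branch-point positions; the counting convention (positions counted after each convolution or down-sampling layer from the input layer) is an assumption made for the sketch.

```python
def candidate_branch_positions(total_layers):
    """Positions whose depth lies between 1/4 and 1/3 of the total number of
    normal convolution and down-sampling layers in the main network."""
    lower, upper = total_layers / 4, total_layers / 3
    # position k means "after the k-th convolution/down-sampling layer counted from the input layer"
    return [k for k in range(1, total_layers + 1) if lower <= k <= upper]

# e.g. a main network with 12 such layers gives candidate positions 3 and 4
print(candidate_branch_positions(12))  # [3, 4]
```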
For convenience of illustration, in the embodiment of the present application, the portion from the first normal convolution layer to the branch point 3 in the main network 1 constitutes the first main network processing module 11. The part of the master network 1 from the branch point 3 to the output layer of the master network 1 constitutes a second master network processing module 12.
The branch network 2 includes branch network convolution layers with a preset number of layers and a preset channel width, followed by a branch network output layer, connected in sequence from front to back. The branch network output layer is composed of a branch network pooling layer and a branch network fully connected layer.
The preset number of layers and the preset channel width are determined according to the down-sampling layer in the main network 1. The structure of the branched network pooling layer is the same as that of the pooling layer. The structure of the branch full-connection layer is the same as that of the full-connection layer.
It should be further noted that, in the embodiment of the present application, the branch network 2 is determined according to the main network 1. Therefore, the preset number of layers is the number of downsampling layers between the branch point 3 and the output layer of the main network 1 in the main network 1, that is, the number of downsampling layers in the second main network processing module 12.
The preset channel width is the channel width of the down-sampling layer.
The branch network 2 is connected to the main network 1 via a branch point 3.
The output layer of the branch network 2 and the branch point 3 are connected to a first processing module 4.
The preset channel width is determined as follows:
The channel width of a first down-sampling layer is taken as the preset channel width of a first branch network convolution layer.
Here, the first down-sampling layer is any one of the down-sampling layers, and the first branch network convolution layer is the branch network convolution layer in the branch network 2 corresponding to that down-sampling layer.
Each down-sampling layer corresponds to one branch network convolution layer in the branch network 2, and the channel width of the down-sampling layer is the preset channel width of the corresponding branch network convolution layer.
For example, if there are three down-sampling layers in the second main network processing module 12, three branch network convolution layers are provided in the branch network 2. The first down-sampling layer counted from the branch point corresponds to the first branch network convolution layer in the branch network 2, and the preset channel width of that branch network convolution layer is the channel width of that down-sampling layer.
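A minimal sketch of this construction is shown below, reusing the PyTorch style of the earlier example; the concrete channel widths (64, 128, 256) and the input width at the branch point (32) are assumptions carried over from that sketch, not values fixed by the application.

```python
import torch.nn as nn

def build_branch_network(downsample_channels, in_channels, num_classes=10):
    """One branch network convolution layer per down-sampling layer after the branch
    point, each with that down-sampling layer's channel width, followed by the
    branch network pooling and fully connected layers."""
    layers, c_in = [], in_channels
    for c_out in downsample_channels:              # preset number of layers = len(downsample_channels)
        layers += [nn.Conv2d(c_in, c_out, 3, stride=2, padding=1), nn.ReLU()]
        c_in = c_out
    return nn.Sequential(
        *layers,
        nn.AdaptiveAvgPool2d(1),                   # branch network pooling layer
        nn.Flatten(),
        nn.Linear(c_in, num_classes),              # branch network fully connected layer
    )

# three down-sampling layers after the branch point -> three branch network convolution layers
branch_net = build_branch_network([64, 128, 256], in_channels=32)
```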
Fig. 2 is a schematic flow chart of a picture identification method according to an embodiment of the present application. As shown in fig. 2, the embodiment of the present application includes the following steps:
step S201, the input layer receives the picture to be recognized and transmits the picture to be recognized to the first main network processing module.
Step S202, the first main network processing module performs first feature processing on the picture to be identified to obtain a first processed picture.
In this embodiment, the branch point 3 is set after the first normal convolution layer so that the main network 1 performs at least one feature-processing step on the picture to be recognized.
In step S203, the branch point stores the first processed picture.
Step S204, the branch network identifies the first processed picture and determines a branch identification result.
It should be noted that the branch network 2 provided in this embodiment recognizes attacked pictures well, so the picture to be recognized is processed by the branch network 2 first; this avoids the low recognition accuracy that would result from the main network 1 recognizing an attacked picture directly.
In step S205, the first processing module receives the branch identification result output by the branch network, and determines the cross entropy according to the branch identification result and the maximum position of the branch identification result.
For example, if the branch recognition result is [0.1, 0.2, 0.5, 0.2], the one-hot vector at its maximum-value position is [0, 0, 1, 0]. Let p be the branch recognition result, p = [0.1, 0.2, 0.5, 0.2], and let q be the maximum-value position vector, q = [0, 0, 1, 0]. The cross entropy of the two is H(q, p) = −Σ q(n)·log p(n) = −log(0.5) ≈ 0.3 (base-10 logarithm), so 0.3 is the corresponding cross entropy.
Step S206, determining whether the cross entropy is smaller than a preset threshold, if the cross entropy is smaller than the preset threshold, executing step S207, otherwise, executing step S208.
It should be noted that steps S205 and S206 determine whether the branch recognition result reaches the expected accuracy; if it does, the branch recognition result is simply taken as the final recognition result. This avoids a large amount of repeated feature extraction by the main network 1 to re-confirm the picture to be recognized, and reduces the workload of the picture recognition model.
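The check performed in steps S205 and S206 can be sketched as follows; the branch result is assumed to already be a probability vector (as in the worked example above), and the base-10 logarithm and the example threshold value are assumptions chosen only to match that example.

```python
import torch
import torch.nn.functional as F

def early_exit(branch_probs, threshold):
    """Cross entropy between the branch identification result and the one-hot vector
    at its maximum-value position; exit early when it is below the preset threshold."""
    q = F.one_hot(branch_probs.argmax(dim=-1), branch_probs.shape[-1]).float()
    h = -(q * torch.log10(branch_probs)).sum(dim=-1)   # H(q, p) = -sum_n q(n) * log p(n)
    return bool((h < threshold).all())

p = torch.tensor([0.1, 0.2, 0.5, 0.2])
print(early_exit(p, threshold=0.5))  # True: H(q, p) = -log10(0.5) ≈ 0.30 < 0.5
```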
In step S207, a branch recognition result is output.
In step S208, the branch point transmits the first processed picture to the second main network processing module.
Because the branch point stored the first processed picture in step S203, the result of that earlier processing can be passed on to the second main network processing module.
Step S209, the second master network processing module performs second feature processing on the first processed picture to obtain a master identification result. And outputs the main recognition result.
It should be noted that, in the embodiment of the present application, the main network 1 has good identification accuracy for the picture to be identified that is not attacked. Therefore, if the branch recognition result does not reach the expected accuracy, the main network 1 is required to continue the recognition processing of the picture to be recognized, and the main recognition result is taken as the final recognition result.
The picture recognition model in the above steps is trained in advance, and the following describes specifically the training process of the picture recognition model provided in the embodiment of the present application.
Fig. 3 is a schematic flow chart of a training method for an image recognition model according to an embodiment of the present disclosure. It should be noted that the training method provided in the embodiment of the present application includes two training processes, i.e., a first training process and a second training process. In the two training processes, the structure of the image recognition model is different, and correspondingly, the function of the image recognition model is also different. The training method provided by the embodiment of the application comprises the following steps:
step S301, inputting a first sample picture into an initial picture recognition model with a preset structure, performing first model training, and obtaining an intermediate picture recognition model.
Wherein the first sample picture does not include the attacked sample picture.
At this stage, the structure of the initial picture recognition model differs from that of the finally established, pre-trained picture recognition model. Fig. 4 is a schematic structural diagram of the initial picture recognition model provided in the embodiment of the present application.
As shown in fig. 4, the initial picture recognition model includes an initial main network 61, an initial branch network 62, a branch point 3, and a second processing module 5.
The structure of the initial main network 61 is the same as that of the main network 1; that is, the initial main network 61 also includes an input layer, a plurality of normal convolution layers, and a plurality of down-sampling layers. The initial branch network 62 has the same structure as the branch network 2, i.e., it also includes a plurality of initial branch network convolution layers. The initial branch network 62 is connected to the initial main network 61 at the predetermined position through the branch point 3.
The initial main network output layer of the initial main network 61 and the initial branch network output layer of the initial branch network 62 are connected to the second processing module 5. The second processing module 5 is configured to compare the processing result of the initial branch network 62 with the processing result of the initial main network 61.
Similar to the trained picture recognition model, the part of the initial main network 61 from the first initial common convolution layer to the branch point 3 constitutes a first initial main network processing module 611, and the part of the initial main network 61 from the branch point 3 to the initial main network output layer constitutes a second initial main network processing module 612.
Specifically, step 301 includes the following steps:
firstly, a first sample picture is input into an initial picture recognition model, and multiple times of training are carried out.
Then, after each training is completed, a branch network loss function corresponding to the initial branch network 62 and a main network loss function corresponding to the initial main network 61 are obtained.
It should be noted that after each training, the initial branch network 62 and the initial main network 61 respectively generate a loss function to be optimized.
Then, the branch network loss function and the main network loss function are added to obtain a total loss function. In this step, the two loss functions are added linearly. In the embodiment of the present application, the total loss function is the target of gradient descent.
And finally, determining the initial picture identification model when the total loss function is optimal as an intermediate picture identification model.
It should be noted that in step S301 the initial branch network 62 and the initial main network 61 are trained on the same first sample pictures, and after training both networks recognize un-attacked pictures well.
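A sketch of one step of this first training is given below. The split of the main network into `first_part` (the first main network processing module) and `second_part` (the second main network processing module) is an assumed naming, and the optimizer is assumed to hold the parameters of both networks.

```python
import torch
import torch.nn.functional as F

def first_training_step(main_net, branch_net, optimizer, images, labels):
    """Both networks see the same un-attacked first sample pictures; the two
    cross-entropy losses are added linearly and minimized by gradient descent."""
    optimizer.zero_grad()
    feat = main_net.first_part(images)        # first (initial) main network processing module
    main_out = main_net.second_part(feat)     # second (initial) main network processing module
    branch_out = branch_net(feat)             # initial branch network fed from the branch point
    loss = F.cross_entropy(main_out, labels) + F.cross_entropy(branch_out, labels)  # total loss
    loss.backward()
    optimizer.step()
    return loss.item()
```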
Step S302, inputting the second sample picture into the intermediate picture recognition model, and acquiring an attack sample picture in the second sample picture.
The second sample picture is obtained by modifying the first sample picture using an attack algorithm. This step simulates an attacker attacking the picture to be identified by adding interference to it.
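The application does not name a specific attack algorithm; as an illustrative stand-in, the sketch below uses the fast gradient sign method (FGSM), a common way to add small, visually imperceptible interference. The perturbation budget `eps` is an assumed value.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, images, labels, eps=8 / 255):
    """Add a small perturbation in the direction that increases the classification
    loss of the attacked model (one possible attack algorithm, used here only as
    an example of how second sample pictures could be generated)."""
    images = images.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)
    loss.backward()
    adversarial = images + eps * images.grad.sign()
    return adversarial.clamp(0, 1).detach()
```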
As shown in fig. 5, a flowchart corresponding to the method for identifying an attack sample provided in the embodiment of the present application is shown. The method mainly comprises the following steps:
step S501, inputting the second sample picture into the intermediate picture recognition model, and respectively obtaining a second sample picture branch recognition result corresponding to the initial branch network and a second sample picture main recognition result corresponding to the initial main network.
Step S502, determining a similarity value between the branch recognition result of the second sample picture and the main recognition result of the second sample picture.
The similarity value can be determined in several ways. For example, the inner product of the second sample picture branch identification result and the second sample picture main identification result can be computed: the larger the inner product, the more similar the two results. Alternatively, the norm distance of the difference vector between the two results can be computed: the smaller the norm distance, the more similar they are. Note that whichever method is chosen, the second sample picture branch identification result and the second sample picture main identification result must first be normalized.
Another possible method provided by the present application is to determine a cross entropy between the branch recognition result of the second sample picture and the main recognition result of the second sample picture.
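A sketch of such a similarity check is given below; the inner product is used as the measure, softmax is used for the normalization step, and the threshold value is an assumption.

```python
import torch

def is_attack_sample(branch_result, main_result, threshold=0.5):
    """Normalize both recognition results, compare them, and flag the picture as an
    attack sample when the similarity falls below the preset similarity threshold."""
    p = torch.softmax(branch_result, dim=-1)   # normalized branch identification result
    q = torch.softmax(main_result, dim=-1)     # normalized main identification result
    similarity = (p * q).sum()                 # inner product: larger means more similar
    # alternatives: torch.norm(p - q) (smaller -> more similar) or a cross entropy between p and q
    return similarity.item() < threshold
```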
Step S503, determining whether the similarity value is smaller than a preset similarity threshold; if the similarity value is smaller than the preset similarity threshold, executing step S504, otherwise executing step S505.
In the embodiment of the application, the preset similarity threshold is determined empirically and can be adjusted according to actual needs.
Step S504, determining the corresponding second sample picture as an attack sample picture.
In step S505, the corresponding second sample picture is determined as a normal sample picture.
It should be noted that, actually, the intermediate image recognition model may be used in image recognition to determine whether an image to be recognized is attacked or not. Step S501 to step S505 are also the process of determining whether the picture is attacked or not.
Because an attack on a picture to be recognized is crafted against the original initial main network 61, the recognition accuracy of the initial main network 61 drops once the picture is attacked. The attacker, however, does not anticipate that the intermediate picture recognition model also contains an initial branch network 62 and therefore cannot tailor the interference to it; under this premise, the initial branch network 62 still recognizes the attacked picture with high accuracy. If the similarity between the branch recognition result and the main recognition result is low, the two networks disagree on the picture to be recognized, which indicates that the picture has been attacked.
Steps S501 to S505 provided in the embodiment of the present application can independently determine whether a picture to be recognized has been attacked, and may be performed on their own when required.
In particular, the intermediate picture recognition model can be put into use, and attacked pictures found during use can be collected. The attacked pictures are then mixed into the third sample picture set and the subsequent step S304 is executed.
Step S303, the structure of the intermediate picture recognition model is changed.
Specifically, step S303 includes the following steps:
the second processing module 5 is removed.
A first processing module 4 is added and the branch network output layer and the branch point 3 are connected to the first processing module 4.
After step S303, the structure of the intermediate image recognition model is adjusted to the structure of the trained image recognition model.
Step S304, inputting a third sample picture comprising the first sample picture and the attack sample picture into an initial branch network in the intermediate picture recognition model with the changed structure, performing second model training, determining parameters of the branch network, and obtaining the picture recognition model.
It should be noted that, during the second model training, the main network 1 does not participate in the training and its parameters are not adjusted. Step S304 gives the branch network 2 good recognition capability for attacked pictures to be recognized; in essence, after step S304, the branch network 2 is used to recognize attacked pictures while the main network 1 is used to recognize un-attacked pictures.
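One step of this second training could look like the sketch below: the main network parameters are frozen and only the branch network is updated on the third sample pictures (clean plus attack samples). As before, `first_part` is an assumed name for the first main network processing module, and the optimizer is assumed to hold only the branch network parameters.

```python
import torch
import torch.nn.functional as F

def second_training_step(main_net, branch_net, optimizer, images, labels):
    """Second model training: the main network does not participate and is not
    updated; only the branch network parameters are adjusted."""
    for p in main_net.parameters():
        p.requires_grad_(False)                 # main network parameters stay fixed
    optimizer.zero_grad()
    with torch.no_grad():
        feat = main_net.first_part(images)      # features stored at the branch point
    loss = F.cross_entropy(branch_net(feat), labels)
    loss.backward()
    optimizer.step()                            # optimizer holds only branch_net parameters
    return loss.item()
```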
Since the intermediate picture recognition model provided in the embodiment of the present application is itself of practical use, the embodiment of the application also provides another picture identification method. Fig. 6 is a schematic flow chart of this other picture identification method. The method is as follows:
step S601, training the pre-built initial picture recognition model for multiple times to obtain an intermediate picture recognition model.
The structure of the initial image recognition model is shown in fig. 4, which has already been described above and will not be described herein again. The process of training the initial image recognition model is consistent with step S301, and is not described herein again.
Step S602, inputting the picture to be identified into the intermediate picture identification model to obtain an attack picture.
Specifically, the initial branch network and the initial main network in the intermediate picture recognition model each recognize the picture to be recognized. If their recognition results differ significantly, the picture to be recognized is determined to be an attack picture; the attack picture is stored and an alarm instruction is sent to the operator. The alarm instruction reminds the operator that an attacker may have deduced the structure of the initial main network and that the intermediate picture recognition model needs to be optimized.
And step S603, when the number of the attack pictures reaches a preset number, mixing the attack pictures into the pictures to be identified, and training the intermediate picture identification model for multiple times to obtain the picture identification model.
In addition, step S603 need not be executed in actual operation; when recognizing a picture, only steps S601 and S602 may be performed. That is, the intermediate recognition model can both recognize the picture and judge whether the picture to be recognized has been attacked. Step S603 may be understood as a self-updating step for the intermediate recognition model after it determines that it has been attacked. The training process for the intermediate picture recognition model in step S603 is the same as in step S304 and is not repeated here.
According to the method provided by the embodiment of the application, a branch network is added to an existing main network and is trained on attacked pictures to be recognized, so that the branch network achieves high recognition accuracy on attacked pictures while the main network retains high recognition accuracy on un-attacked pictures. The branch network is derived from the main network; an attacker cannot anticipate its structure and therefore finds it difficult to mount an effective attack. During picture identification, the branch network identifies the picture first, and if the branch identification result can serve as the final result, the workload of the picture identification model is greatly reduced. The method provided by the embodiment of the application thus improves identification accuracy for attacked pictures without reducing identification accuracy for un-attacked pictures.
The invention is operational with numerous general purpose or special purpose computing system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet-type devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
The invention may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This invention is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.
It will be understood that the invention is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the invention is limited only by the appended claims.

Claims (9)

1. A picture recognition method, characterized in that a pre-trained picture recognition model is adopted, the picture recognition model comprising: a main network, a branch network, a branch point, and a first processing module;
the main network is any one convolutional neural network model; the main network comprises an input layer, a plurality of common convolution layers, a plurality of down-sampling layers and a main network output layer which are sequentially connected from front to back;
the branch point is arranged at a preset position between the first common convolution layer of the main network and the pooling layer; the first common convolution layer is the common convolution layer closest to the input layer;
wherein a portion of the main network from the first normal convolution layer to the branch point constitutes a first main network processing module; the part from the branch point to the main network output layer in the main network forms a second main network processing module;
the branch network comprises a branch network convolution layer and a branch network output layer which are sequentially connected from front to back and have preset number of layers and preset channel width; the preset number of layers and the preset channel width are determined according to the down-sampling layer in the main network;
wherein the branch network is connected with the main network through the branch point;
the branch network output layer and the branch point are connected with the first processing module;
the picture identification method comprises the following steps:
the input layer receives a picture to be identified and transmits the picture to be identified to the first main network processing module;
the first main network processing module performs first feature processing on the picture to be identified to obtain a first processed picture;
the branch point stores the first processing picture;
the branch network identifies the first processed picture and determines a branch identification result;
the first processing module receives the branch identification result output by the branch network and determines the cross entropy according to the branch identification result and the maximum position of the branch identification result; if the cross entropy is smaller than a preset threshold value, outputting a branch identification result;
if the cross entropy is greater than or equal to the preset threshold value, the branch point transmits the first processed picture to the second main network processing module; the second main network processing module performs second feature processing on the first processed picture to obtain a main identification result; and outputting the main identification result.
2. The method according to claim 1, wherein the preset number of layers is the number of downsampling layers from the branch point to the output layer of the main network.
3. The picture recognition method according to claim 1, wherein the predetermined position is determined by:
counting the total number of the common convolution layers and the downsampling layers in the main network;
counting the number of common convolution layers and down-sampling layers from the input layer to any given position in the main network;
and if the layer count at a given position is greater than or equal to one quarter of the total layer count and less than or equal to one third of the total layer count, determining that position as the preset position.
4. The picture recognition method according to claim 1, wherein the preset channel width is determined by the following method:
determining the channel width of a first down-sampling layer as the preset channel width of a first branch network convolution layer;
wherein the first down-sampling layer is any one of the down-sampling layers, and the first branch network convolution layer is the branch network convolution layer corresponding to the first down-sampling layer.
5. The picture recognition method according to claim 1, wherein the picture recognition model is obtained by training:
inputting a first sample picture into an initial picture recognition model with a preset structure, and performing first model training to obtain an intermediate picture recognition model; the first sample picture does not comprise an attacked sample picture;
inputting a second sample picture into the intermediate picture identification model, and acquiring an attack sample picture in the second sample picture; the second sample picture is obtained by modifying the first sample picture using an attack algorithm;
changing the structure of the intermediate picture recognition model;
and inputting a third sample picture comprising the first sample picture and the attack sample picture into an initial branch network in the intermediate picture recognition model with the changed structure, performing second model training, determining parameters of the branch network, and obtaining the picture recognition model.
6. The picture recognition method according to claim 5, wherein the initial picture recognition model comprises an initial main network, an initial branch network, the branch point and a second processing module;
the structure of the initial main network is the same as that of the main network; the initial branch network and the branch network have the same structure; the initial branch network is connected to the initial main network at a predetermined location through the branch point;
an initial main network output layer of the initial main network and an initial branch network output layer of the initial branch network are connected with the second processing module; the second processing module is configured to compare a processing result of the initial branch network with a processing result of the initial main network.
7. The picture recognition method according to claim 5, wherein the first sample picture is input into an initial picture recognition model with a predetermined structure, and a first model training is performed to obtain an intermediate picture recognition model, comprising the steps of:
inputting the first sample picture into the initial picture recognition model, and performing multiple training;
after each training is finished, obtaining a branch network loss function corresponding to the initial branch network and a main network loss function corresponding to the initial main network;
adding the branch network loss function and the main network loss function to obtain a total loss function;
and determining the initial picture identification model when the total loss function is optimal as the intermediate picture identification model.
8. The picture recognition method according to claim 5, wherein inputting the second sample picture into the intermediate picture recognition model, and obtaining the attack sample picture in the second sample picture comprises:
inputting the second sample picture into the intermediate picture identification model, and respectively obtaining a second sample picture branch identification result corresponding to the initial branch network and a second sample picture main identification result corresponding to the initial main network;
determining a similarity value between the second sample picture branch identification result and the second sample picture main identification result;
and if the similarity value is smaller than a preset similarity threshold value, determining the corresponding second sample picture as the attack sample picture.
9. The picture recognition method according to claim 5, wherein changing the structure of the intermediate picture recognition model comprises:
removing the second processing module;
and adding the first processing module, and connecting the branch network output layer and the branch point with the first processing module.
CN202110009127.XA 2021-01-05 2021-01-05 Picture identification method Active CN112712126B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110009127.XA CN112712126B (en) 2021-01-05 2021-01-05 Picture identification method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110009127.XA CN112712126B (en) 2021-01-05 2021-01-05 Picture identification method

Publications (2)

Publication Number Publication Date
CN112712126A true CN112712126A (en) 2021-04-27
CN112712126B CN112712126B (en) 2024-03-19

Family

ID=75548302

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110009127.XA Active CN112712126B (en) 2021-01-05 2021-01-05 Picture identification method

Country Status (1)

Country Link
CN (1) CN112712126B (en)

Patent Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180204111A1 (en) * 2013-02-28 2018-07-19 Z Advanced Computing, Inc. System and Method for Extremely Efficient Image and Pattern Recognition and Artificial Intelligence Platform
US20200184278A1 (en) * 2014-03-18 2020-06-11 Z Advanced Computing, Inc. System and Method for Extremely Efficient Image and Pattern Recognition and Artificial Intelligence Platform
CN108171260A (en) * 2017-12-15 2018-06-15 百度在线网络技术(北京)有限公司 A kind of image identification method and system
CN109375186A (en) * 2018-11-22 2019-02-22 中国人民解放军海军航空大学 Radar target identification method based on the multiple dimensioned one-dimensional convolutional neural networks of depth residual error
CN109784153A (en) * 2018-12-10 2019-05-21 平安科技(深圳)有限公司 Emotion identification method, apparatus, computer equipment and storage medium
CN109829443A (en) * 2019-02-23 2019-05-31 重庆邮电大学 Video behavior recognition methods based on image enhancement Yu 3D convolutional neural networks
CN110084318A (en) * 2019-05-07 2019-08-02 哈尔滨理工大学 A kind of image-recognizing method of combination convolutional neural networks and gradient boosted tree
CN110443286A (en) * 2019-07-18 2019-11-12 广州华多网络科技有限公司 Training method, image-recognizing method and the device of neural network model
CN111626317A (en) * 2019-08-14 2020-09-04 广东省智能制造研究所 Semi-supervised hyperspectral data analysis method based on double-flow conditional countermeasure generation network
CN110516745A (en) * 2019-08-28 2019-11-29 北京达佳互联信息技术有限公司 Training method, device and the electronic equipment of image recognition model
CN111199217A (en) * 2020-01-09 2020-05-26 上海应用技术大学 Traffic sign identification method and system based on convolutional neural network
CN111860147A (en) * 2020-06-11 2020-10-30 北京市威富安防科技有限公司 Pedestrian re-identification model optimization processing method and device and computer equipment
CN111709190A (en) * 2020-06-24 2020-09-25 国电联合动力技术有限公司 Wind turbine generator operation data image identification method and device
CN111860545A (en) * 2020-07-30 2020-10-30 元神科技(杭州)有限公司 Image sensitive content identification method and system based on weak detection mechanism
CN112115973A (en) * 2020-08-18 2020-12-22 吉林建筑大学 Convolutional neural network based image identification method

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
ZAHRA DEHGHAN DOOLAB; SEYYED ALI SEYYEDSALEHI; NARJES SOLTANI DEHAGHANI: "" Nonlinear Normalization of Input Patterns to Handwritten Character Variability in Handwriting Recognition Neural Network"", 《2012 INTERNATIONAL CONFERENCE ON BIOMEDICAL ENGINEERING AND BIOTECHNOLOGY》 *
  • WU HAILI: "Image big data recognition based on convolutional neural networks", Journal of Shanxi Datong University (Natural Science Edition), no. 02
  • LI YA; ZHANG YUNAN; PENG CHENG; YANG JUNQIN; LIU MIAO: "Face attribute recognition method based on multi-task learning", Computer Engineering, no. 03

Also Published As

Publication number Publication date
CN112712126B (en) 2024-03-19

Similar Documents

Publication Publication Date Title
CN114186632B (en) Method, device, equipment and storage medium for training key point detection model
CN110516734B (en) Image matching method, device, equipment and storage medium
CN111724370B (en) Multi-task image quality evaluation method and system based on uncertainty and probability
CN111680544B (en) Face recognition method, device, system, equipment and medium
CN110689136B (en) Deep learning model obtaining method, device, equipment and storage medium
US20230066703A1 (en) Method for estimating structural vibration in real time
JP2023541752A (en) Neural network model training methods, image retrieval methods, equipment and media
CN112232426A (en) Training method, device and equipment of target detection model and readable storage medium
CN111192313A (en) Method for robot to construct map, robot and storage medium
CN115223251A (en) Training method and device for signature detection model, electronic equipment and storage medium
CN110489423A (en) A kind of method, apparatus of information extraction, storage medium and electronic equipment
CN114581794B (en) Geographic digital twin information acquisition method and device, electronic equipment and storage medium
CN110533184B (en) Network model training method and device
CN110874638A (en) Behavior analysis-oriented meta-knowledge federation method, device, electronic equipment and system
CN114417739A (en) Method and device for recommending process parameters under abnormal working conditions
CN114154622A (en) Algorithm model for traffic operation system flow data acquisition missing completion
CN112712126B (en) Picture identification method
CN112966547A (en) Neural network-based gas field abnormal behavior recognition early warning method, system, terminal and storage medium
CN109871448B (en) Short text classification method and system
CN111627029A (en) Method and device for acquiring image instance segmentation result
CN116777646A (en) Artificial intelligence-based risk identification method, apparatus, device and storage medium
CN116707859A (en) Feature rule extraction method and device, and network intrusion detection method and device
CN114882273B (en) Visual identification method, device, equipment and storage medium applied to narrow space
CN114459482B (en) Boundary simplification method, path planning method, device, equipment and system
CN111553324B (en) Human body posture predicted value correction method, device, server and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant