CN113537389A - Robust image classification method and device based on model embedding - Google Patents
- Publication number: CN113537389A
- Application number: CN202110898433.3A
- Authority: CN (China)
- Prior art keywords: image sample, model, training image, label, neural network
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The present disclosure provides a robust image classification method and apparatus based on model embedding, wherein the method comprises: inputting an image to be classified into an image classification deep neural network model, and outputting an image category recognition result corresponding to the image to be classified. The image classification deep neural network model is trained as follows: a training image sample is input into a deep neural network model to obtain the embedding feature of the training image sample at the last hidden layer of the model; the noise label of the training image sample is diluted according to the embedding feature; samples are screened according to the diluted labels to obtain screened image samples; and the model parameters of the deep neural network model are updated according to the screened image samples to obtain the image classification deep neural network model. The technical scheme of the method and apparatus can ensure robust image classification performance while reducing labeling cost, so as to save the labor and material costs of image labeling.
Description
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a robust image classification method and apparatus based on model embedding.
Background
Image classification based on deep learning has made great progress in the fully supervised setting, but the performance of classification algorithms in this setting often depends heavily on the accuracy of image labeling. In larger-scale practical applications, due to annotators' limited domain expertise, work fatigue, and other human factors, the collected training data sets are often noisy, i.e., some images are inaccurately annotated. Designing a robust classification model that resists the negative influence of label noise is therefore key to extending the application scenarios of image classification models. Some existing technical solutions attempt to design robust image classification models. One prior-art scheme uses model predictions to try to correct wrong image labels, and then trains later-stage models with the corrected images; another uses model predictions to screen training samples, and further trains later-stage models with the screened clean images (images with accurate labels).
However, the above prior-art schemes all share a serious technical drawback: although early model predictions can effectively help reduce the effect of labeling noise, some persistent noise remains, and the "early" stage of model training is not controllable. More specifically, early model predictions are not one hundred percent accurate, and secondary training based on those predictions can accumulate errors, producing iteratively worse classification models. Compared with model prediction, the model-embedding-based method proposed in this scheme is more robust and effectively avoids the accumulated-error problem.
It is well known that a DNN requires a long training process to obtain reliable predictions, which necessarily conflicts with the "early" stage of the memory effect. Although a DNN has the opportunity to learn clean images during early training, it cannot yet learn a reliable clean classification model, and noisy images participate in training, causing the model to begin fitting the noise. As a consequence, the classification performance of early models first rises and then falls, and cannot be optimized. Some methods use a moving average to continually reduce the number of discarded noisy images, in order to reach a compromise between the "long term" required by DNN training and the "early" stage required by the memory effect. However, this compromise requires manual adjustment, and the adjustment process is laborious and time-consuming.
Disclosure of Invention
The invention provides a robust image classification method and device based on model embedding, to overcome the defect in the prior art that manual adjustment is required and the adjustment process is laborious and time-consuming, and to ensure robust image classification performance while reducing labeling cost, thereby saving the labor and material costs of image labeling.
In a first aspect, the present disclosure provides a robust image classification method based on model embedding, including:
acquiring an image to be classified;
inputting the image to be classified into an image classification deep neural network model, and outputting an image category recognition result corresponding to the image to be classified;
the image classification deep neural network model is obtained by training through the following method:
inputting a training image sample into a deep neural network model, and acquiring the embedding feature of the training image sample at the last hidden layer of the neural network model;
diluting the noise label of the training image sample according to the embedding characteristics to obtain a diluted label;
performing sample screening according to the diluted label to obtain a screened image sample;
inputting the screened image sample into the deep neural network model, and updating the model parameters of the deep neural network model to obtain the image classification deep neural network model.
According to the robust image classification method based on model embedding provided by the present disclosure, the diluting the noise label of the training image sample according to the embedding feature specifically includes:
step one: acquiring the K nearest neighbors of each training image sample;
propagating the label information of the K nearest neighbors to the corresponding training image sample, so as to dilute the noise label of the training image sample;
step two: repeating step one until convergence, and determining the final dilution result of the noise label of the training image sample.
According to the robust image classification method based on model embedding provided by the present disclosure, the image sample screening according to the diluted label specifically includes:
determining a noise label of the training image sample and a dilution label corresponding to the training image sample;
inputting the noise label and the dilution label into a first model, and determining the weight of the image sample;
and screening the image sample according to the weight.
According to the robust image classification method based on model embedding provided by the present disclosure, the model parameters of the deep neural network model are updated according to a second model, wherein the second model is:
$$w \leftarrow w - \eta\,\nabla_w \frac{1}{|B|}\sum_{x_i \in B} e_i\,\ell\big(f(x_i),\,\tilde{y}_i\big)$$

wherein $B$ is the set of screened training image samples; $\nabla_w$ is the gradient operator with respect to $w$; $w$ is a parameter of the deep neural network model; $e_i$ is the weight of the $i$-th training image sample; $\tilde{y}_i$ is the noise label corresponding to the training image sample; $f(x_i)$ is the model's prediction for the training image sample; $\ell$ is a loss function; and $\eta$ is the learning rate (step size; this symbol is assumed in the reconstruction).
According to the robust image classification method based on model embedding provided by the present disclosure, the method for obtaining the K nearest neighbor of each training image sample is as follows:
determining the Euclidean distance between each training image sample and every other training image sample in the training image sample set;
sorting the Euclidean distances in ascending order;
and selecting the first K Euclidean distances, wherein the training image samples corresponding to the first K distances are the K nearest neighbors of the given training image sample.
According to the robust image classification method based on model embedding provided by the present disclosure, the first model is:
$$e_i = \mathbb{I}\big(\tilde{y}_i = \arg\max_j Z_{ij}\big)$$

wherein $e_i$ is the weight of the $i$-th training image sample; $\tilde{y}_i$ is the noise label corresponding to the training image sample; $Z$ is an $n \times C$ matrix representing the dilution labels, where $C$ is the number of classes; and $\arg\max_j Z_{ij}$ denotes the value of $j$ at which $Z_{ij}$ attains its maximum.
In a second aspect, the present disclosure provides a robust image classification apparatus based on model embedding, including:
the first processing module is used for acquiring an image to be classified;
the second processing module is used for inputting the image to be classified into an image classification deep neural network model and outputting an image category recognition result corresponding to the image to be classified;
the image classification deep neural network model is obtained by training through the following method:
inputting a training image sample into a deep neural network model, and acquiring the embedding feature of the training image sample at the last hidden layer of the neural network model;
diluting the noise label of the training image sample according to the embedding characteristics to obtain a diluted label;
performing sample screening according to the diluted label to obtain a screened image sample;
inputting the screened image sample into the deep neural network model, and updating the model parameters of the deep neural network model to obtain the image classification deep neural network model.
According to the robust image classification device based on model embedding provided by the present disclosure, the diluting the noise label of the training image sample according to the embedding feature specifically includes:
step one: acquiring the K nearest neighbors of each training image sample;
propagating the label information of the K nearest neighbors to the corresponding training image sample, so as to dilute the noise label of the training image sample;
step two: repeating step one until convergence, and determining the final dilution result of the noise label of the training image sample.
According to the robust image classification device based on model embedding provided by the present disclosure, the image sample screening according to the diluted label specifically includes:
determining a noise label of the training image sample and a dilution label corresponding to the training image sample;
inputting the noise label and the dilution label into a first model, and determining the weight of the image sample;
and screening the image sample according to the weight.
According to the robust image classification device based on model embedding provided by the present disclosure, the model parameters of the deep neural network model are updated according to a second model, wherein the second model is:
$$w \leftarrow w - \eta\,\nabla_w \frac{1}{|B|}\sum_{x_i \in B} e_i\,\ell\big(f(x_i),\,\tilde{y}_i\big)$$

wherein $B$ is the set of screened training image samples; $\nabla_w$ is the gradient operator with respect to $w$; $w$ is a parameter of the deep neural network model; $e_i$ is the weight of the $i$-th training image sample; $\tilde{y}_i$ is the noise label corresponding to the training image sample; $f(x_i)$ is the model's prediction for the training image sample; $\ell$ is a loss function; and $\eta$ is the learning rate (step size; this symbol is assumed in the reconstruction).
According to the robust image classification device based on model embedding provided by the present disclosure, the method for obtaining the K nearest neighbor of each training image sample is as follows:
determining the Euclidean distance between each training image sample and every other training image sample in the training image sample set;
sorting the Euclidean distances in ascending order;
and selecting the first K Euclidean distances, wherein the training image samples corresponding to the first K distances are the K nearest neighbors of the given training image sample.
According to the robust image classification device based on model embedding provided by the present disclosure, the first model is:

$$e_i = \mathbb{I}\big(\tilde{y}_i = \arg\max_j Z_{ij}\big)$$

wherein $e_i$ is the weight of the $i$-th training image sample; $\tilde{y}_i$ is the noise label corresponding to the training image sample; $Z$ is an $n \times C$ matrix representing the dilution labels, where $C$ is the number of classes; and $\arg\max_j Z_{ij}$ denotes the value of $j$ at which $Z_{ij}$ attains its maximum.
In a third aspect, the present disclosure also provides an electronic device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the robust image classification method based on model embedding as described in any one of the above.
In a fourth aspect, the present disclosure also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the robust image classification method based on model embedding as described in any one of the above.
According to the robust image classification method and device based on model embedding provided by the present disclosure, a training image sample is input into a deep neural network model, and the embedding feature of the training image sample at the last hidden layer of the model is obtained; the noise label of the training image sample is diluted according to the embedding feature, sample images are screened based on the diluted labels, and the parameters of the model are updated with the screened images. The technical scheme of the present disclosure can therefore ensure robust image classification performance while reducing labeling cost, so as to save the labor and material costs of image labeling.
Drawings
In order to more clearly illustrate the technical solutions of the present disclosure or the prior art, the drawings needed for the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present disclosure, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a schematic flow chart of the robust image classification method based on model embedding provided by the present disclosure;
FIG. 2 is a schematic flow chart of a robust image classification algorithm based on model embedding provided by the present disclosure;
FIG. 3 is a schematic diagram of a sample k neighbor finding process provided by the present disclosure;
FIG. 4 is a schematic diagram of a k-nearest neighbor based tag dilution process provided by the present disclosure;
FIG. 5 is a schematic diagram illustrating comparison of performance of a model prediction method and a model embedding method provided by the present disclosure;
FIG. 6 is a schematic structural diagram of a robust image classification apparatus based on model embedding provided by the present disclosure;
fig. 7 is a schematic structural diagram of an electronic device provided by the present disclosure.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present disclosure more clear, the technical solutions of the embodiments of the present disclosure will be described clearly and completely with reference to the drawings in the embodiments of the present disclosure, and it is obvious that the described embodiments are some, but not all, of the embodiments of the present disclosure. All other embodiments, which can be obtained by a person skilled in the art without making creative efforts based on the embodiments of the present disclosure, belong to the protection scope of the embodiments of the present disclosure.
A robust image classification method based on model embedding provided by the embodiments of the present disclosure is described below with reference to fig. 1 to fig. 2, which includes:
step 100: acquiring an image to be classified;
Specifically, image classification, as a fundamental image processing problem, has important applications in fields such as medical imaging, satellite remote sensing, and Internet crowdsourcing. In recent years, image classification based on deep learning has achieved good results. Accordingly, when a deep neural network model is used for image classification, the image to be classified is first acquired.
Step 200: inputting the image to be classified into an image classification deep neural network model, and outputting an image category recognition result corresponding to the image to be classified;
the image classification deep neural network model is obtained by training through the following method:
inputting a training image sample into a deep neural network model, and acquiring the embedding feature of the training image sample at the last hidden layer of the neural network model;
diluting the noise label of the training image sample according to the embedding characteristics to obtain a diluted label;
performing sample screening according to the diluted label to obtain a screened image sample;
inputting the screened image sample into the deep neural network model, and updating the model parameters of the deep neural network model to obtain the image classification deep neural network model.
Specifically, a Deep Neural Network (DNN) can be understood as a neural network with many hidden layers, also called a deep feed-forward network (DFN) or multi-layer perceptron (MLP). Divided by the positions of the layers, the layers inside a DNN can be classified into an input layer, hidden layers, and an output layer: generally, the first layer is the input layer, the last layer is the output layer, and all layers in between are hidden layers. The layers are fully connected, i.e., any neuron in the i-th layer is connected to every neuron in the (i+1)-th layer.
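As a minimal illustration (not the patent's network), the fully connected forward pass just described can be sketched in NumPy, exposing the last hidden layer activation that the method later uses as the embedding feature phi(x); the function name and the ReLU activation are assumptions of this sketch:

```python
import numpy as np

def mlp_forward(x, weights, biases):
    """Fully connected DNN forward pass: input layer -> hidden layers -> output
    layer, where every neuron of layer i connects to every neuron of layer i+1.
    Returns the output logits and the last hidden layer activation (the
    embedding feature phi(x) used later for label dilution)."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(0.0, h @ W + b)   # hidden layer with ReLU (assumed)
    phi = h                              # embedding feature of the last hidden layer
    logits = h @ weights[-1] + biases[-1]
    return logits, phi
```

In the method described below, only `phi` (not the logits) is used for label dilution, which is the point of the model-embedding approach.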
Before a deep neural network is used to classify images, the model needs to be trained. In existing training methods, although the DNN has the opportunity to learn clean images early in training, it cannot yet learn a reliable clean classification model, and noisy images participate in training, so the model begins to fit the noise; as a consequence, the classification performance of early models first rises and then falls and cannot be optimized. Alternatively, a moving-average method continually reduces the number of discarded noisy images to reach a compromise between the "long term" required by DNN training and the "early" stage required by the memory effect; however, this compromise requires manual adjustment, and the adjustment process is laborious and time-consuming.
In practical application scenarios, given the labor and material costs involved or the subjectivity of the classification task, the training data that can actually be collected are generally affected by external noise, which limits the performance of traditional deep learning algorithms. Therefore, designing a robust image classification algorithm that works with noisy training images is of great significance for both academic research and industrial applications. The model embedding features adopted in this scheme remain robust over a longer period of the training process, avoiding the problem of manual parameter tuning.
During the early training of a deep neural network, model embedding is more robust than model prediction. Based on this observation, a label-noise-reduction scheme based on model embedding is designed, and a clean-image screening scheme is further designed for DNN model training.
Fig. 2 shows the basic flow of the robust image classification algorithm based on model embedding proposed in this patent. Specifically, training image samples (input images) are fed into a DNN for training; the embedding features of the last hidden layer of the model (model embedding) are used for label noise dilution (label dilution); the diluted labels are further used for image sample screening (image selection); the screened clean samples (selected images) are fed back into the DNN training process; and finally a robust classification model is output. Images are then classified with the obtained image classification model.
According to the robust image classification method based on model embedding, a training image sample is input into a deep neural network model, and the embedding feature of the training image sample at the last hidden layer of the model is obtained; the noise label of the training image sample is diluted according to the embedding feature, sample images are screened based on the diluted labels, and the parameters of the model are updated with the screened images. The technical scheme of the present disclosure can therefore ensure robust image classification performance while reducing labeling cost, so as to save the labor and material costs of image labeling.
According to the robust image classification method based on model embedding provided by the embodiment of the disclosure, the diluting the noise label of the training image sample according to the embedding feature specifically comprises:
step one: acquiring the K nearest neighbors of each training image sample;
propagating the label information of the K nearest neighbors to the corresponding training image sample, so as to dilute the noise label of the training image sample;
step two: repeating step one until convergence, and determining the final dilution result of the noise label of the training image sample.
Specifically, during label dilution, the robust information provided by model embedding is further extracted using the K Nearest Neighbors (KNN) of the training image samples. Fig. 3 illustrates the process of finding the K nearest neighbors of image samples, i.e., finding the K image samples closest to each training image sample in Euclidean space (where ● and ✕ respectively indicate correctly/incorrectly labeled image samples). Fig. 4 illustrates the KNN-based noise label dilution process. In each dilution round, the K nearest neighbors of each training image sample are first found; then the label information of those neighbors is propagated to the image sample and its label is updated; this process iterates for T rounds until convergence.
Given input training data $D = \{(x_i, \tilde{y}_i)\}_{i=1}^{n}$ (where $x_i$ denotes a training image, $\tilde{y}_i$ denotes the noise label corresponding to the image, and $n$ denotes the total number of training image samples), the goal is to use these data to train a DNN $f(x)$ such that, given any test data $x_t$, $f(x_t)$ accurately predicts its true label $y_t$. Let $w$ denote the parameters of $f(x)$ to be updated, $\phi(x)$ denote the embedding feature of the last hidden layer of $f(x)$, and $NN_k(x_i; \phi)$ denote the K nearest neighbors of an image sample $x_i$ under the feature $\phi(x)$.
In each round of label dilution, for each image sample $x_i$, its embedding feature $\phi(x_i)$ is first used to obtain its K nearest neighbors $NN_k(x_i; \phi)$, and the label information of these neighbors is then propagated to the image sample $x_i$ to obtain its dilution label $z_i$.
The t-th iteration is formally expressed as:

$$z_i^{(t)} = (1-\alpha)\sum_{x_j \in NN_k(x_i;\,\phi)} W_{ij}\, z_j^{(t-1)} + \alpha\, z_i^{(t-1)} \quad (1)$$

wherein $W$ denotes a similarity matrix constructed from $\phi(x)$; the first term on the right side of equation (1) propagates the dilution labels of the K nearest neighbors of sample $x_i$, the second term keeps the dilution label of sample $x_i$ from iteration $t-1$, and $\alpha$ is a weight balancing the two terms. The above equation is written in matrix form as:
$$Z^{(t)} = (1-\alpha)\, W Z^{(t-1)} + \alpha\, Z^{(t-1)} \quad (2)$$
The dilution label $Z^{(0)}$ is initialized with the one-hot representation of the noise labels $\tilde{y}_i$, and $Z$ is then updated with equation (2) until convergence.
Next, the generation of the similarity matrix $W$ is described in detail. Given the model embedding features $\{\phi(x_i)\}_{i=1}^{n}$ of the training samples, a sparse matrix $A$ of size $n \times n$ is first constructed, whose elements are:

$$A_{ij} = \begin{cases} \exp\!\big(-\lVert \phi(x_i)-\phi(x_j)\rVert_2^2 \,/\, \gamma\big), & x_j \in NN_k(x_i;\,\phi) \\ 0, & \text{otherwise} \end{cases} \quad (3)$$

where $\gamma$ is a hyperparameter, typically set to 4 (the Gaussian form of $A_{ij}$ is assumed in this reconstruction). Finally, the similarity matrix is constructed as $W = D^{-1/2} W' D^{-1/2}$, where $W' = (A + A^{\top})/2$ (a symmetrization of $A$, assumed) and $D = \mathrm{diag}(W'\mathbf{1})$.
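The label dilution procedure of equations (1)-(2), together with a similarity matrix construction, can be sketched in NumPy. The Gaussian affinity, the symmetrization of A, and the function name `dilute_labels` are assumptions of this sketch, not the patented implementation:

```python
import numpy as np

def dilute_labels(phi, noisy_labels, n_classes, k=3, alpha=0.5, gamma=4.0, n_iter=30):
    """K-nearest-neighbor label dilution over embedding features phi (n x d)."""
    n = phi.shape[0]
    # pairwise squared Euclidean distances between embeddings
    d2 = ((phi[:, None, :] - phi[None, :, :]) ** 2).sum(-1)
    # sparse affinity A: Gaussian similarity to the k nearest neighbors only
    A = np.zeros((n, n))
    for i in range(n):
        nn = np.argsort(d2[i])[1:k + 1]          # skip the sample itself
        A[i, nn] = np.exp(-d2[i, nn] / gamma)
    # symmetrize, then normalize: W = D^{-1/2} W' D^{-1/2}
    Wp = (A + A.T) / 2
    D = Wp.sum(axis=1)
    D_inv_sqrt = np.zeros_like(D)
    D_inv_sqrt[D > 0] = D[D > 0] ** -0.5
    W = D_inv_sqrt[:, None] * Wp * D_inv_sqrt[None, :]
    # initialize Z with the one-hot noisy labels, then iterate equation (2)
    Z = np.eye(n_classes)[noisy_labels]
    for _ in range(n_iter):
        Z = (1 - alpha) * (W @ Z) + alpha * Z
    return Z
```

With enough iterations, a mislabeled sample surrounded by correctly labeled neighbors ends up with a diluted label dominated by its neighbors' class.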
According to the robust image classification method based on model embedding provided by the embodiment of the disclosure, the image sample screening according to the diluted label specifically includes:
determining a noise label of the training image sample and a dilution label corresponding to the training image sample;
inputting the noise label and the dilution label into a first model, and determining the weight of the image sample;
and screening the image sample according to the weight.
Specifically, given the training samples $\{x_i\}_{i=1}^{n}$ with noise labels $\{\tilde{y}_i\}_{i=1}^{n}$ and dilution labels $Z$, the training weight $e_i$ of each sample is obtained by equation (4), and the DNN parameter $w$ is further updated by equation (5):

$$e_i = \mathbb{I}\big(\tilde{y}_i = \arg\max_j Z_{ij}\big) \quad (4)$$

$$w \leftarrow w - \eta\,\nabla_w \frac{1}{|B|}\sum_{x_i \in B} e_i\,\ell\big(f(x_i),\,\tilde{y}_i\big) \quad (5)$$

wherein $B$ is the set of screened training image samples; $\nabla_w$ is the gradient operator with respect to $w$; $w$ is a parameter of the deep neural network model; $e_i$ is the weight of the $i$-th training image sample; $\tilde{y}_i$ is the noise label corresponding to the training image sample; $f(x_i)$ is the model's prediction for the training image sample; $\ell$ is a loss function; and $\eta$ is the learning rate (step size; this symbol is assumed in the reconstruction). In equation (4), $\arg\max_j Z_{ij}$ denotes the value of $j$ at which $Z_{ij}$ attains its maximum, where $Z$ is an $n \times C$ matrix and $C$ is the number of classes; each row of $Z$ corresponds to a one-hot label, in which every element is 0 or 1 and there is one and only one 1, so $\arg\max_j Z_{ij}$ is the position of the 1 in the $i$-th one-hot label. $\mathbb{I}(a)$ is an indicator function: when $a$ is true, $\mathbb{I}(a) = 1$; otherwise $\mathbb{I}(a) = 0$. Thus, when $\tilde{y}_i = \arg\max_j Z_{ij}$ holds, the weight of the sample is 1, i.e., the image sample is retained by the screening; otherwise the weight is 0, i.e., the image sample is screened out.
The parameters of the deep neural network model are then updated according to the screened image samples via equation (5).
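The screening rule of equation (4) and the weighted update of equation (5) can be sketched as follows, using a plain softmax (linear) classifier in place of the full DNN; the function name `screen_and_update` and the learning-rate handling are illustrative assumptions:

```python
import numpy as np

def screen_and_update(w, X, noisy_labels, Z, lr=0.1):
    """Equation (4) screening and an equation (5)-style weighted gradient step,
    illustrated on a linear softmax classifier instead of a full DNN."""
    # equation (4): e_i = 1 iff the noisy label matches the diluted label's argmax
    e = (noisy_labels == Z.argmax(axis=1)).astype(float)
    B = np.flatnonzero(e)                      # screened (assumed clean) batch
    # forward pass: softmax probabilities f(x_i)
    logits = X[B] @ w
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    # gradient of the mean cross-entropy loss over the screened batch
    Y = np.eye(w.shape[1])[noisy_labels[B]]
    grad = X[B].T @ (p - Y) / len(B)
    return w - lr * grad, e
```

Samples whose noisy label disagrees with the diluted label receive weight 0 and never contribute to the gradient, which is what prevents the accumulated-error problem described earlier.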
According to the robust image classification method based on model embedding provided by the embodiment of the disclosure, the method for obtaining the K nearest neighbor of each training image sample comprises the following steps:
determining the Euclidean distance between each training image sample and every other training image sample in the training image sample set;
sorting the Euclidean distances in ascending order;
and selecting the first K Euclidean distances, wherein the training image samples corresponding to the first K distances are the K nearest neighbors of the given training image sample.
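The neighbor search described above (compute pairwise Euclidean distances, sort each row in ascending order, keep the first K indices) can be sketched directly:

```python
import numpy as np

def k_nearest_neighbors(embeddings, k):
    """For each sample, indices of its K nearest neighbors in Euclidean space."""
    sq = (embeddings ** 2).sum(axis=1)
    # pairwise squared distances via ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b
    d2 = sq[:, None] + sq[None, :] - 2 * embeddings @ embeddings.T
    np.fill_diagonal(d2, np.inf)        # a sample is not its own neighbor
    # sort each row in ascending order and keep the first K indices
    return np.argsort(d2, axis=1)[:, :k]
```

Squared distances suffice here, since sorting by squared distance gives the same order as sorting by distance.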
In the experiments, the CIFAR-10 and CIFAR-100 data sets were used to verify the effectiveness of the method proposed in this patent. Specifically, CIFAR-10 and CIFAR-100 each contain 50,000 training images and 10,000 test images, with 10 and 100 image classes, respectively. Note that these data sets are originally clean; the experiments convert them into noisy data sets by manually injecting label noise in two ways: symmetric flipping and asymmetric flipping. Symmetric flipping, also called random flipping, flips the image labels of each category into any other label with a certain probability ρ. Asymmetric flipping, also called pair flipping, flips the image labels of each category into a fixed semantically similar label (e.g., cat → dog, deer → horse) with a certain probability ρ; this flipping mode corresponds to fine-grained image classification scenarios in practical applications. The probability ρ is also called the noise rate, i.e., the proportion of noisy images. The noise rates considered for asymmetric flipping are 40%, 50%, and 60%; the noise rates considered for symmetric flipping are 20%, 30%, and 40%. The experimental results are shown in Tables 1 and 2 below:
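The two noise-injection schemes can be sketched as follows; the function names and random-number handling are illustrative assumptions:

```python
import numpy as np

def symmetric_flip(labels, n_classes, rho, seed=0):
    """Symmetric (random) flipping: with probability rho, replace a label
    with a uniformly chosen *different* class."""
    rng = np.random.default_rng(seed)
    noisy = labels.copy()
    for i in np.flatnonzero(rng.random(len(labels)) < rho):
        noisy[i] = rng.choice([c for c in range(n_classes) if c != labels[i]])
    return noisy

def asymmetric_flip(labels, pair_map, rho, seed=0):
    """Asymmetric (pair) flipping: with probability rho, map each class to a
    fixed semantically similar class (e.g. cat -> dog), given by pair_map."""
    rng = np.random.default_rng(seed)
    noisy = labels.copy()
    for i in np.flatnonzero(rng.random(len(labels)) < rho):
        noisy[i] = pair_map[labels[i]]
    return noisy
```

For asymmetric flipping, `pair_map` fixes the target class per source class, which is what makes the noise class-dependent rather than uniform.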
table 1: CIFAR-10 data set experimental results
Table 2: CIFAR-100 data set experimental results
Tables 1 and 2 show the experimental results (test accuracy and standard deviation) of the present scheme (LEND) and the existing schemes (CE, Co-teaching, JoCor and GCE) on the CIFAR-10 and CIFAR-100 data sets, respectively. The Best row reports the best test result obtained during training, the Last row reports the final test result, and the best result in each group of experiments is marked in bold.
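The label-noise injection used in these experiments can be sketched as follows (an illustrative sketch: the function names, the use of NumPy, and the example pair map are assumptions of this illustration; only the flipping rules themselves come from the description above):

```python
import numpy as np

def symmetric_flip(labels, num_classes, rho, rng):
    """With probability rho, replace each label with a uniformly random *other* label."""
    labels = labels.copy()
    flip = rng.random(len(labels)) < rho
    for i in np.where(flip)[0]:
        # choose among all labels except the current one
        choices = [c for c in range(num_classes) if c != labels[i]]
        labels[i] = rng.choice(choices)
    return labels

def asymmetric_flip(labels, pair_map, rho, rng):
    """With probability rho, flip each label to a semantically similar class
    given by pair_map (e.g. cat -> dog, deer -> horse)."""
    labels = labels.copy()
    flip = rng.random(len(labels)) < rho
    for i in np.where(flip)[0]:
        labels[i] = pair_map.get(labels[i], labels[i])
    return labels
```

Because symmetric flipping never maps a label to itself, the fraction of corrupted images equals the noise rate ρ in expectation.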
In addition, to verify that the model embedding adopted by this scheme is superior to model prediction, the classification performance curves of the two methods during training are further shown in Fig. 5, where the dotted line represents the accuracy of the model embedding method and the solid line represents the accuracy of the model prediction method.
The test results show that the robust image classification method based on model embedding provided by this patent obtains better results than the existing schemes under various label-noise scenarios and noise rates. They also verify that the model embedding adopted by this scheme is more robust than model prediction. The technical scheme can therefore be applied to classification tasks on noisy images while ensuring a more accurate classification result.
In a second aspect, the embodiment of the present disclosure, in combination with fig. 6, provides a robust image classification apparatus based on model embedding, where the apparatus includes:
the first processing module 61 is used for acquiring an image to be classified;
the second processing module 62 is configured to input the image to be classified into an image classification deep neural network model, and output an image classification recognition result corresponding to the image to be classified;
the image classification deep neural network model is obtained by training through the following method:
inputting a training image sample into a deep neural network model, and acquiring the embedded characteristic of the training image sample in the last hidden layer of the neural network model;
diluting the noise label of the training image sample according to the embedding characteristics to obtain a diluted label;
performing sample screening according to the diluted label to obtain a screened image sample;
inputting the screened image sample into the deep neural network model, and updating the model parameters of the deep neural network model to obtain the image classification deep neural network model.
Since the apparatus provided by the embodiment of the present disclosure can be used to execute the method described in the above embodiment, and its operation principle and beneficial effects are similar, a detailed description is omitted here; for specific contents, refer to the description of the above embodiment.
According to the robust image classification device based on model embedding, a training image sample is input into a deep neural network model, and the embedding characteristics of the training image sample at the last hidden layer of the neural network model are obtained; the noise label of the training image sample is diluted according to the embedding characteristics, the sample images are screened based on the diluted labels, and the parameters of the model are updated with the screened images. The technical scheme of the present disclosure can thereby reduce labeling cost while ensuring robust image classification performance, saving the manpower and material cost of image labeling.
According to the robust image classification device based on model embedding provided by the embodiment of the present disclosure, the diluting the noise label of the training image sample according to the embedding feature specifically includes:
step one: acquiring the K neighbors of each training image sample;
transmitting the label information of the K neighbor to the corresponding training image sample to realize the dilution of the noise label of the training image sample;
step two: and repeating the first step until convergence occurs, and determining a final dilution result of the noise label of the training image sample.
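Steps one and two above can be sketched as the following iteration (a minimal sketch: representing labels as one-hot rows and averaging the K neighbors' label distributions is an assumption of this illustration, not a propagation rule prescribed by the patent):

```python
import numpy as np

def dilute_labels(noisy_onehot, neighbors, num_iters=50, tol=1e-6):
    """Iteratively transmit each sample's K-neighbor label information onto
    the sample itself, repeating until the diluted labels converge."""
    y = noisy_onehot.astype(float).copy()
    for _ in range(num_iters):
        # each row becomes the mean of its K neighbors' label distributions
        propagated = y[neighbors].mean(axis=1)
        if np.abs(propagated - y).max() < tol:   # step two: stop at convergence
            break
        y = propagated
    return y
```

In this sketch a sample whose noisy label disagrees with all of its neighbors is gradually pulled toward the neighbors' class, which is the dilution effect the steps above describe.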
According to the robust image classification device based on model embedding provided by the embodiment of the present disclosure, the image sample screening according to the diluted label specifically includes:
determining a noise label of the training image sample and a dilution label corresponding to the training image sample;
inputting the noise label and the dilution label into a first model, and determining the weight of the image sample;
and screening the image sample according to the weight.
According to the robust image classification device based on model embedding provided by the embodiment of the disclosure, the model parameters of the deep neural network model are updated according to a second model, wherein the second model is as follows:
w ← w − α · ∇_w Σ_{i∈B} e_i · ℓ(f(x_i), ŷ_i)

wherein B is the set of screened training image samples, ∇_w is the gradient operator with respect to w, w is a parameter of the deep neural network model, α is the learning rate, e_i is the weight of the i-th training image sample, ŷ_i represents the noise label corresponding to the i-th training image sample, f(x_i) is the prediction of the deep neural network for the i-th training image sample, and ℓ is the loss function.
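The weighted parameter update of formula (5) can be sketched numerically as follows (a sketch under stated assumptions: a linear model f(x) = x @ w and a squared loss stand in for the patent's abstract f and ℓ, and the learning-rate argument lr is this sketch's choice):

```python
import numpy as np

def weighted_sgd_step(w, X, y_noisy, e, lr=0.1):
    """One gradient step on the weighted loss over the screened batch B.
    Assumes a linear model f(x) = x @ w and squared loss for illustration."""
    pred = X @ w                           # f(x_i) for every screened sample
    residual = pred - y_noisy              # derivative of the squared loss w.r.t. pred (up to a factor of 2)
    grad = X.T @ (e * residual) / len(X)   # weighted gradient with respect to w
    return w - lr * grad                   # descend along the weighted gradient
```

A sample whose weight e_i is zero contributes nothing to the gradient, which is how the screened-out samples are excluded from the update.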
According to the robust image classification device based on model embedding provided by the embodiment of the disclosure, the K nearest neighbors of each training image sample are obtained through the following steps:
determining Euclidean spatial distance values between each training image sample and other training image samples in the training image sample set;
sorting the Euclidean spatial distance values from small to large;
and screening out the first K Euclidean spatial distance values, wherein the training image samples corresponding to the first K Euclidean spatial distance values are K neighbors of each training image sample.
According to the robust image classification device based on model embedding provided by the embodiment of the disclosure, the first model is as follows:
e_i = 1 if ŷ_i = argmax_j Ỹ_ij, and e_i = 0 otherwise,

wherein e_i is the weight of the i-th training image sample, ŷ_i represents the noise label corresponding to the i-th training image sample, Ỹ is an n × C matrix representing the diluted labels, C is the number of classes, and argmax_j Ỹ_ij represents the value of j at which Ỹ_ij attains its maximum.
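One reading of the first model, in which e_i indicates agreement between the noise label and the argmax of the dilution matrix, can be sketched as follows (the indicator form of e_i and the function names are assumptions of this illustration):

```python
import numpy as np

def sample_weights(noisy_labels: np.ndarray, diluted: np.ndarray) -> np.ndarray:
    """e_i = 1 when the noise label agrees with argmax_j of row i of the
    diluted label matrix, else 0 (an agreement-indicator assumption)."""
    return (noisy_labels == diluted.argmax(axis=1)).astype(float)

def screen_samples(noisy_labels: np.ndarray, diluted: np.ndarray):
    """Screening: keep the indices of the samples whose weight is nonzero."""
    e = sample_weights(noisy_labels, diluted)
    return np.where(e > 0)[0], e
```

Under this reading, a sample is screened out exactly when its diluted label contradicts its noisy label, i.e. when its K neighbors collectively vote for a different class.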
Fig. 7 illustrates a physical structure diagram of an electronic device. As shown in Fig. 7, the electronic device may include: a processor (processor) 710, a communication interface (Communications Interface) 720, a memory (memory) 730, and a communication bus 740, wherein the processor 710, the communication interface 720, and the memory 730 communicate with each other via the communication bus 740. The processor 710 may invoke logic instructions in the memory 730 to perform the robust image classification method based on model embedding, the method comprising: acquiring an image to be classified; inputting the image to be classified into an image classification deep neural network model, and outputting an image classification recognition result corresponding to the image to be classified; the image classification deep neural network model is obtained by training through the following method: inputting a training image sample into a deep neural network model, and acquiring the embedded characteristic of the training image sample in the last hidden layer of the neural network model; diluting the noise label of the training image sample according to the embedding characteristics to obtain a diluted label; performing sample screening according to the diluted label to obtain a screened image sample; inputting the screened image sample into the deep neural network model, and updating the model parameters of the deep neural network model to obtain the image classification deep neural network model.
In addition, the logic instructions in the memory 730 can be implemented in the form of software functional units and stored in a computer readable storage medium when the software functional units are sold or used as independent products. Based on such understanding, the technical solutions of the embodiments of the present disclosure may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods described in the embodiments of the present disclosure. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
In another aspect, the present disclosure also provides a computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform the robust image classification method based on model embedding provided by the above methods, the method comprising: acquiring an image to be classified; inputting the image to be classified into an image classification deep neural network model, and outputting an image classification recognition result corresponding to the image to be classified; the image classification deep neural network model is obtained by training through the following method: inputting a training image sample into a deep neural network model, and acquiring the embedded characteristic of the training image sample in the last hidden layer of the neural network model; diluting the noise label of the training image sample according to the embedding characteristics to obtain a diluted label; performing sample screening according to the diluted label to obtain a screened image sample; inputting the screened image sample into the deep neural network model, and updating the model parameters of the deep neural network model to obtain the image classification deep neural network model.
In yet another aspect, the present disclosure also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the robust image classification method based on model embedding provided above, the method comprising: acquiring an image to be classified; inputting the image to be classified into an image classification deep neural network model, and outputting an image classification recognition result corresponding to the image to be classified; the image classification deep neural network model is obtained by training through the following method: inputting a training image sample into a deep neural network model, and acquiring the embedded characteristic of the training image sample in the last hidden layer of the neural network model; diluting the noise label of the training image sample according to the embedding characteristics to obtain a diluted label; performing sample screening according to the diluted label to obtain a screened image sample; inputting the screened image sample into the deep neural network model, and updating the model parameters of the deep neural network model to obtain the image classification deep neural network model.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solutions of the present disclosure, not to limit them; although the present disclosure has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present disclosure.
Claims (14)
1. A robust image classification method based on model embedding is characterized by comprising the following steps:
acquiring an image to be classified;
inputting the image to be classified into an image classification deep neural network model, and outputting an image classification recognition result corresponding to the image to be classified;
the image classification deep neural network model is obtained by training through the following method:
inputting a training image sample into a deep neural network model, and acquiring the embedded characteristic of the training image sample in the last hidden layer of the neural network model;
diluting the noise label of the training image sample according to the embedding characteristics to obtain a diluted label;
performing sample screening according to the diluted label to obtain a screened image sample;
inputting the screened image sample into the deep neural network model, and updating the model parameters of the deep neural network model to obtain the image classification deep neural network model.
2. The method for robust image classification based on model embedding according to claim 1, wherein the diluting the noise label of the training image sample according to the embedding feature specifically comprises:
step one: acquiring the K neighbors of each training image sample;
transmitting the label information of the K neighbor to the corresponding training image sample to realize the dilution of the noise label of the training image sample;
step two: and repeating the first step until convergence occurs, and determining a final dilution result of the noise label of the training image sample.
3. The model-embedding-based robust image classification method according to claim 1, wherein the image sample screening according to the diluted label specifically comprises:
determining a noise label of the training image sample and a dilution label corresponding to the training image sample;
inputting the noise label and the dilution label into a first model, and determining the weight of the image sample;
and screening the image sample according to the weight.
4. The method of claim 1, wherein the model parameters of the deep neural network model are updated according to a second model, wherein the second model is:
w ← w − α · ∇_w Σ_{i∈B} e_i · ℓ(f(x_i), ŷ_i)

wherein B is the set of screened training image samples, ∇_w is the gradient operator with respect to w, w is a parameter of the deep neural network model, α is the learning rate, e_i is the weight of the i-th training image sample, ŷ_i represents the noise label corresponding to the i-th training image sample, f(x_i) is the prediction of the deep neural network for the i-th training image sample, and ℓ is the loss function.
5. The method for robust image classification based on model embedding according to claim 2, wherein the method for obtaining the K-nearest neighbor of each training image sample is as follows:
determining Euclidean spatial distance values between each training image sample and other training image samples in the training image sample set;
sorting the Euclidean spatial distance values from small to large;
and screening out the first K Euclidean spatial distance values, wherein the training image samples corresponding to the first K Euclidean spatial distance values are K neighbors of each training image sample.
6. The method of claim 3, wherein the first model is:
7. A robust image classification apparatus based on model embedding, comprising:
the first processing module is used for acquiring an image to be classified;
the second processing module is used for inputting the image to be classified into an image classification deep neural network model and outputting an image class identification result corresponding to the image to be classified;
the image classification deep neural network model is obtained by training through the following method:
inputting a training image sample into a deep neural network model, and acquiring the embedded characteristic of the training image sample in the last hidden layer of the neural network model;
diluting the noise label of the training image sample according to the embedding characteristics to obtain a diluted label;
performing sample screening according to the diluted label to obtain a screened image sample;
inputting the screened image sample into the deep neural network model, and updating the model parameters of the deep neural network model to obtain the image classification deep neural network model.
8. The robust image classification device based on model embedding of claim 7, wherein the diluting the noise label of the training image sample according to the embedding feature specifically comprises:
step one: acquiring the K neighbors of each training image sample;
transmitting the label information of the K neighbor to the corresponding training image sample to realize the dilution of the noise label of the training image sample;
step two: and repeating the first step until convergence occurs, and determining a final dilution result of the noise label of the training image sample.
9. The model-embedding-based robust image classification device according to claim 7, wherein the image sample screening according to the diluted label specifically comprises:
determining a noise label of the training image sample and a dilution label corresponding to the training image sample;
inputting the noise label and the dilution label into a first model, and determining the weight of the image sample;
and screening the image sample according to the weight.
10. The apparatus of claim 7, wherein the model parameters of the deep neural network model are updated according to a second model, wherein the second model is:
w ← w − α · ∇_w Σ_{i∈B} e_i · ℓ(f(x_i), ŷ_i)

wherein B is the set of screened training image samples, ∇_w is the gradient operator with respect to w, w is a parameter of the deep neural network model, α is the learning rate, e_i is the weight of the i-th training image sample, ŷ_i represents the noise label corresponding to the i-th training image sample, f(x_i) is the prediction of the deep neural network for the i-th training image sample, and ℓ is the loss function.
11. The robust image classification device based on model embedding of claim 8 is characterized in that the method for obtaining the K nearest neighbor of each training image sample is as follows:
determining Euclidean spatial distance values between each training image sample and other training image samples in the training image sample set;
sorting the Euclidean spatial distance values from small to large;
and screening out the first K Euclidean spatial distance values, wherein the training image samples corresponding to the first K Euclidean spatial distance values are K neighbors of each training image sample.
12. The model-based embedded robust image classification device of claim 9, wherein the first model is:
13. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor when executing the program performs the steps of the method for robust image classification based on model embedding according to any of claims 1 to 6.
14. A non-transitory computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the robust image classification method based on model embedding of any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110898433.3A CN113537389B (en) | 2021-08-05 | 2021-08-05 | Robust image classification method and device based on model embedding |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110898433.3A CN113537389B (en) | 2021-08-05 | 2021-08-05 | Robust image classification method and device based on model embedding |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113537389A true CN113537389A (en) | 2021-10-22 |
CN113537389B CN113537389B (en) | 2023-11-07 |
Family
ID=78122088
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110898433.3A Active CN113537389B (en) | 2021-08-05 | 2021-08-05 | Robust image classification method and device based on model embedding |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113537389B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115618935A (en) * | 2022-12-21 | 2023-01-17 | 北京航空航天大学 | Robustness loss function searching method and system for classified task label noise |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105335756A (en) * | 2015-10-30 | 2016-02-17 | 苏州大学 | Robust learning model and image classification system |
CN108171261A (en) * | 2017-12-21 | 2018-06-15 | 苏州大学 | Adaptive semi-supervision image classification method, device, equipment and the medium of robust |
CN108399421A (en) * | 2018-01-31 | 2018-08-14 | 南京邮电大学 | A kind of zero sample classification method of depth of word-based insertion |
US20200242415A1 (en) * | 2019-01-30 | 2020-07-30 | Coretronic Corporation | Training method of neural network and classification method based on neural network and device thereof |
CN111488904A (en) * | 2020-03-03 | 2020-08-04 | 清华大学 | Image classification method and system based on confrontation distribution training |
CN112766386A (en) * | 2021-01-25 | 2021-05-07 | 大连理工大学 | Generalized zero sample learning method based on multi-input multi-output fusion network |
Non-Patent Citations (1)
Title |
---|
黎健成;袁春;宋友;: "基于卷积神经网络的多标签图像自动标注", 计算机科学, no. 07 * |
Also Published As
Publication number | Publication date |
---|---|
CN113537389B (en) | 2023-11-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109934293B (en) | Image recognition method, device, medium and confusion perception convolutional neural network | |
US11816183B2 (en) | Methods and systems for mining minority-class data samples for training a neural network | |
CN110334742B (en) | Graph confrontation sample generation method based on reinforcement learning and used for document classification and adding false nodes | |
EP3629246A1 (en) | Systems and methods for neural architecture search | |
Goodfellow et al. | Generative adversarial nets | |
US8325999B2 (en) | Assisted face recognition tagging | |
US11068747B2 (en) | Computer architecture for object detection using point-wise labels | |
US11037027B2 (en) | Computer architecture for and-or neural networks | |
US11593619B2 (en) | Computer architecture for multiplier-less machine learning | |
CN113010683B (en) | Entity relationship identification method and system based on improved graph attention network | |
US20210295112A1 (en) | Image recognition learning device, image recognition device, method and program | |
US20200272812A1 (en) | Human body part segmentation with real and synthetic images | |
CN115552481A (en) | System and method for fine tuning image classification neural network | |
CN112465226A (en) | User behavior prediction method based on feature interaction and graph neural network | |
WO2020190951A1 (en) | Neural network trained by homographic augmentation | |
Akpinar et al. | Sample complexity bounds for recurrent neural networks with application to combinatorial graph problems | |
CN113537389A (en) | Robust image classification method and device based on model embedding | |
JP2022507144A (en) | Computer architecture for artificial image generation | |
CN116883751A (en) | Non-supervision field self-adaptive image recognition method based on prototype network contrast learning | |
JP2019028484A (en) | Attribute identification apparatus, attribute identification model learning apparatus, method and program | |
US20230394304A1 (en) | Method and Apparatus for Neural Network Based on Energy-Based Latent Variable Models | |
US11587323B2 (en) | Target model broker | |
JP6993250B2 (en) | Content feature extractor, method, and program | |
CN111709479B (en) | Image classification method and device | |
CN115661847B (en) | Table structure recognition and model training method, device, equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |