CN113537389A - Robust image classification method and device based on model embedding - Google Patents

Robust image classification method and device based on model embedding Download PDF

Info

Publication number
CN113537389A
Authority
CN
China
Prior art keywords
image sample
model
training image
label
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110898433.3A
Other languages
Chinese (zh)
Other versions
CN113537389B (en)
Inventor
沈力
张闯
宫辰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jingdong Technology Information Technology Co Ltd
Original Assignee
Jingdong Technology Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jingdong Technology Information Technology Co Ltd filed Critical Jingdong Technology Information Technology Co Ltd
Priority to CN202110898433.3A priority Critical patent/CN113537389B/en
Publication of CN113537389A publication Critical patent/CN113537389A/en
Application granted granted Critical
Publication of CN113537389B publication Critical patent/CN113537389B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Abstract

The present disclosure provides a robust image classification method and apparatus based on model embedding, wherein the method comprises: inputting an image to be classified into an image classification deep neural network model, and outputting an image class identification result corresponding to the image to be classified. The image classification deep neural network model is obtained by inputting a training image sample into a deep neural network model to obtain the embedded features of the training image sample in the last hidden layer of the neural network model; diluting the noise label of the training image sample according to the embedded features; screening samples according to the diluted label to obtain screened image samples; and updating the model parameters of the deep neural network model according to the screened image samples. The technical scheme of the method and the apparatus can reduce labeling cost while ensuring robust image classification performance, thereby saving the manpower and material costs of image labeling.

Description

Robust image classification method and device based on model embedding
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a robust image classification method and apparatus based on model embedding.
Background
Image classification based on deep learning has made great progress in the fully supervised setting, but the performance of classification algorithms in this setting depends heavily on the accuracy of the image labels. In larger-scale practical applications, because of human factors such as the annotators' limited domain knowledge and work fatigue, the acquired training data set is often noisy, i.e., some pictures are not accurately annotated. Designing a robust classification model that resists the negative influence of label noise is therefore key to extending the application scenarios of image classification models. Some existing technical solutions attempt to design robust image classification models: one prior-art scheme uses model predictions to try to correct wrong image labels and then uses the corrected images for later-stage model training; another scheme uses model predictions to screen the training samples and further uses the screened clean images (images with accurate labels) to train the later-stage model.
However, the above prior-art schemes share a serious technical drawback: although early model predictions can effectively help reduce the effect of labeling noise, part of the noise persists, and the "early" phase of model training is not controllable. More specifically, the predictions of the early model are not one hundred percent accurate, so secondary training that relies on them can accumulate errors, making the iteratively trained classification model increasingly worse. Compared with model prediction, the model-embedding-based method provided by the present scheme is more robust and can effectively avoid the problem of accumulated errors.
It is well known that a DNN requires a long training process to obtain reliable predictions, which necessarily conflicts with the "early" phase exploited by the memorization effect. Although the DNN has the opportunity to learn clean images during early training, it cannot yet learn a reliable clean classification model, and noisy images participating in training cause the model to begin fitting the noise. As a consequence, the classification performance of early models first rises and then falls, and cannot reach its optimum. Some methods use a moving average to continually reduce the number of noisy images discarded, in order to reach a compromise between the "long term" required for DNN training and the "early" phase required by the memorization effect. However, this compromise requires manual adjustment, and the adjustment process is laborious and time-consuming.
Disclosure of Invention
The present disclosure provides a robust image classification method and device based on model embedding, which are used to overcome the defect in the prior art that manual adjustment is needed and the adjustment process is laborious and time-consuming, and to ensure robust image classification performance while reducing the labeling cost, so as to save the manpower and material costs of image labeling.
In a first aspect, the present disclosure provides a robust image classification method based on model embedding, including:
acquiring an image to be classified;
inputting the image to be classified into an image classification deep neural network model, and outputting an image classification recognition result corresponding to the image to be classified;
the image classification deep neural network model is obtained by training through the following method:
inputting a training image sample into a deep neural network model, and acquiring the embedded characteristic of the training image sample in the last hidden layer of the neural network model;
diluting the noise label of the training image sample according to the embedding characteristics to obtain a diluted label;
performing sample screening according to the diluted label to obtain a screened image sample;
inputting the screened image sample into the deep neural network model, and updating the model parameters of the deep neural network model to obtain the image classification deep neural network model.
According to the robust image classification method based on model embedding provided by the present disclosure, the diluting the noise label of the training image sample according to the embedding feature specifically includes:
step one: acquiring the K nearest neighbors of each training image sample;
transmitting the label information of the K nearest neighbors to the corresponding training image sample to realize the dilution of the noise label of the training image sample;
step two: and repeating the first step until convergence occurs, and determining a final dilution result of the noise label of the training image sample.
According to the robust image classification method based on model embedding provided by the present disclosure, the image sample screening according to the diluted label specifically includes:
determining a noise label of the training image sample and a dilution label corresponding to the training image sample;
inputting the noise label and the dilution label into a first model, and determining the weight of the image sample;
and screening the image sample according to the weight.
According to the robust image classification method based on model embedding provided by the present disclosure, the model parameters of the deep neural network model are updated according to a second model, wherein the second model is:
w ← w - η·∇_w [ (1/|B|)·Σ_{i∈B} e_i·ℓ( f(x_i), ỹ_i ) ]
wherein B is the set of screened training image samples, ∇_w is the gradient operator with respect to w, w is a parameter of the deep neural network model, η is the learning rate, e_i is the weight of the i-th training image sample, ỹ_i represents the noise label corresponding to the i-th training image sample, f(x_i) is the prediction of the deep neural network model on the training image sample x_i, and ℓ(·,·) is the loss function.
According to the robust image classification method based on model embedding provided by the present disclosure, the method for obtaining the K nearest neighbor of each training image sample is as follows:
determining Euclidean spatial distance values between each training image sample and other training image samples in the training image sample set;
sorting the Euclidean spatial distance values from small to large;
and screening out the first K Euclidean spatial distance values, wherein the training image samples corresponding to the first K Euclidean spatial distance values are K neighbors of each training image sample.
According to the robust image classification method based on model embedding provided by the present disclosure, the first model is:
e_i = I( ỹ_i = argmax_j Z_ij )
wherein e_i is the weight of the i-th training image sample, ỹ_i represents the noise label corresponding to the i-th training image sample, Z is an n×C matrix representing the dilution labels, where n is the number of training image samples and C is the number of classes, argmax_j Z_ij represents the value of j at which Z_ij attains its maximum, and I(·) is the indicator function, which equals 1 when its argument is true and 0 otherwise.
In a second aspect, the present disclosure provides a robust image classification apparatus based on model embedding, including:
the first processing module is used for acquiring an image to be classified;
the second processing module is used for inputting the image to be classified into an image classification deep neural network model and outputting an image class identification result corresponding to the image to be classified;
the image classification deep neural network model is obtained by training through the following method:
inputting a training image sample into a deep neural network model, and acquiring the embedded characteristic of the training image sample in the last hidden layer of the neural network model;
diluting the noise label of the training image sample according to the embedding characteristics to obtain a diluted label;
performing sample screening according to the diluted label to obtain a screened image sample;
inputting the screened image sample into the deep neural network model, and updating the model parameters of the deep neural network model to obtain the image classification deep neural network model.
According to the robust image classification device based on model embedding provided by the present disclosure, the diluting the noise label of the training image sample according to the embedding feature specifically includes:
step one: acquiring the K nearest neighbors of each training image sample;
transmitting the label information of the K nearest neighbors to the corresponding training image sample to realize the dilution of the noise label of the training image sample;
step two: and repeating the first step until convergence occurs, and determining a final dilution result of the noise label of the training image sample.
According to the robust image classification device based on model embedding provided by the present disclosure, the image sample screening according to the diluted label specifically includes:
determining a noise label of the training image sample and a dilution label corresponding to the training image sample;
inputting the noise label and the dilution label into a first model, and determining the weight of the image sample;
and screening the image sample according to the weight.
According to the robust image classification device based on model embedding provided by the present disclosure, the model parameters of the deep neural network model are updated according to a second model, wherein the second model is:
w ← w - η·∇_w [ (1/|B|)·Σ_{i∈B} e_i·ℓ( f(x_i), ỹ_i ) ]
wherein B is the set of screened training image samples, ∇_w is the gradient operator with respect to w, w is a parameter of the deep neural network model, η is the learning rate, e_i is the weight of the i-th training image sample, ỹ_i represents the noise label corresponding to the i-th training image sample, f(x_i) is the prediction of the deep neural network model on the training image sample x_i, and ℓ(·,·) is the loss function.
According to the robust image classification device based on model embedding provided by the present disclosure, the method for obtaining the K nearest neighbor of each training image sample is as follows:
determining Euclidean spatial distance values between each training image sample and other training image samples in the training image sample set;
sorting the Euclidean spatial distance values from small to large;
and screening out the first K Euclidean spatial distance values, wherein the training image samples corresponding to the first K Euclidean spatial distance values are K neighbors of each training image sample.
The robust image classification device based on model embedding is provided according to the present disclosure, wherein the first model is:
Figure BDA0003198892500000061
wherein e isiIs the weight of the ith training image sample,
Figure BDA0003198892500000062
representing the noise label corresponding to the training image sample,
Figure BDA0003198892500000063
is an n x C matrix, representing the dilution signature, where C is the number of classifications,
Figure BDA0003198892500000064
to represent
Figure BDA0003198892500000065
The value of j is taken at the maximum value.
In a third aspect, the present disclosure also provides an electronic device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the robust image classification method based on model embedding as described in any one of the above.
In a fourth aspect, the present disclosure also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the robust image classification method based on model embedding as described in any of the above.
According to the robust image classification method and device based on model embedding provided by the present disclosure, a training image sample is input into a deep neural network model to obtain the embedding features of the training image sample in the last hidden layer of the neural network model; the noise label of the training image sample is diluted according to the embedding features, sample images are screened based on the diluted label, and the parameters of the model are updated with the screened images. The technical scheme of the present disclosure can therefore reduce labeling cost while ensuring robust image classification performance, so as to save the manpower and material costs of image labeling.
Drawings
In order to more clearly illustrate the technical solutions of the present disclosure or the prior art, the drawings needed for the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present disclosure, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a flow chart diagram of a robust image classification method based on model embedding provided by the present disclosure;
FIG. 2 is a schematic flow chart of a robust image classification algorithm based on model embedding provided by the present disclosure;
FIG. 3 is a schematic diagram of a sample k neighbor finding process provided by the present disclosure;
FIG. 4 is a schematic diagram of a k-nearest-neighbor-based label dilution process provided by the present disclosure;
FIG. 5 is a schematic diagram illustrating comparison of performance of a model prediction method and a model embedding method provided by the present disclosure;
FIG. 6 is a schematic structural diagram of a robust image classification apparatus based on model embedding provided by the present disclosure;
fig. 7 is a schematic structural diagram of an electronic device provided by the present disclosure.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present disclosure more clear, the technical solutions of the embodiments of the present disclosure will be described clearly and completely with reference to the drawings in the embodiments of the present disclosure, and it is obvious that the described embodiments are some, but not all, of the embodiments of the present disclosure. All other embodiments, which can be obtained by a person skilled in the art without making creative efforts based on the embodiments of the present disclosure, belong to the protection scope of the embodiments of the present disclosure.
A robust image classification method based on model embedding provided by the embodiments of the present disclosure is described below with reference to fig. 1 to fig. 2, which includes:
step 100: acquiring an image to be classified;
in particular, the image classification problem is an important application as a bottom-layer image processing problem in the fields of medical imaging, satellite image remote sensing, internet crowdsourcing and the like. In recent years, the problem of image classification based on deep learning has been achieved with good results. Therefore, when the deep neural network model is used for image classification, an image to be classified is acquired first.
Step 200: inputting the image to be classified into an image classification deep neural network model, and outputting an image classification recognition result corresponding to the image to be classified;
the image classification deep neural network model is obtained by training through the following method:
inputting a training image sample into a deep neural network model, and acquiring the embedded characteristic of the training image sample in the last hidden layer of the neural network model;
diluting the noise label of the training image sample according to the embedding characteristics to obtain a diluted label;
performing sample screening according to the diluted label to obtain a screened image sample;
inputting the screened image sample into the deep neural network model, and updating the model parameters of the deep neural network model to obtain the image classification deep neural network model.
Specifically, a Deep Neural Network (DNN) can be understood as a neural network with many hidden layers, also called a deep feed-forward network (DFN) or multi-layer perceptron (MLP). According to the positions of the different layers, the neural network layers inside a DNN can be divided into the input layer, the hidden layers and the output layer; generally, the first layer is the input layer, the last layer is the output layer, and the layers in between are all hidden layers. Adjacent layers are fully connected, i.e., any neuron of the i-th layer is connected to every neuron of the (i+1)-th layer.
Before a deep neural network is used to classify images, the deep neural network model needs to be trained. In existing training methods, although the DNN has the opportunity to learn clean images during early training, it cannot yet learn a reliable clean classification model, and noisy images participating in training cause the model to begin fitting the noise; as a consequence, the classification performance of early models first rises and then falls, and cannot reach its optimum. Alternatively, a moving-average method is used to continually reduce the number of discarded noisy images so as to reach a compromise between the "long term" required for DNN training and the "early" phase required by the memorization effect. However, this compromise requires manual adjustment, and the adjustment process is laborious and time-consuming.
In actual application scenarios, considering manpower and material costs or the subjectivity of the classification task, the training data that can actually be acquired is generally affected by external noise, which limits the performance of traditional deep learning algorithms. Therefore, how to design a robust image classification algorithm based on noisy training images is of great significance in both academic research and industrial applications. The model embedding features adopted in this scheme remain robust over a longer period of the training process, which avoids the problem of manual parameter tuning.
In the early training process of a deep neural network, model embedding is more robust than model prediction. Based on this observation, a label noise-reduction scheme based on model embedding is designed, and a clean-image screening scheme is further designed for DNN model training.
FIG. 2 shows the basic flow of the robust image classification algorithm based on model embedding proposed in this patent. Specifically, training image samples (input images) are fed into a DNN for training; the embedding features of the last hidden layer of the model (model embedding) are taken for label noise dilution (label dilution); the diluted labels are further used for image sample screening (image selection); the screened clean samples (selected images) are fed back into the DNN training process; and finally a robust classification model is output. Images are then classified based on the obtained image classification model.
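As an illustration of how the embedding features of the last hidden layer can be obtained in practice, the following sketch defines a small classification network that returns both its class predictions and its last-hidden-layer features. The framework (PyTorch), the architecture and all layer sizes are illustrative assumptions; the patent does not prescribe a particular network.

```python
import torch
import torch.nn as nn

class EmbeddingClassifier(nn.Module):
    """Toy DNN that exposes its last hidden layer as the model embedding phi(x)."""

    def __init__(self, in_dim: int = 3 * 32 * 32, hidden_dim: int = 128, num_classes: int = 10):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Flatten(),
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, hidden_dim), nn.ReLU(),  # last hidden layer -> phi(x)
        )
        self.head = nn.Linear(hidden_dim, num_classes)

    def forward(self, x: torch.Tensor):
        phi = self.backbone(x)    # embedding features of the last hidden layer
        logits = self.head(phi)   # class predictions f(x)
        return logits, phi

# Usage: collect the embeddings of a batch of training samples for label dilution.
model = EmbeddingClassifier()
images = torch.randn(8, 3, 32, 32)   # stand-in for a batch of training images
logits, embeddings = model(images)   # embeddings has shape (8, 128)
```

The embeddings of all training samples collected in this way are the features φ(x) used by the label dilution and sample screening steps described below.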
According to the robust image classification method based on model embedding, a training image sample is input into a deep neural network model to obtain the embedding features of the training image sample in the last hidden layer of the neural network model; the noise label of the training image sample is diluted according to the embedding features, sample images are screened based on the diluted label, and the parameters of the model are updated with the screened images. The technical scheme of the present disclosure can therefore reduce labeling cost while ensuring robust image classification performance, so as to save the manpower and material costs of image labeling.
According to the robust image classification method based on model embedding provided by the embodiment of the disclosure, the diluting the noise label of the training image sample according to the embedding feature specifically comprises:
step one: acquiring the K nearest neighbors of each training image sample;
transmitting the label information of the K nearest neighbors to the corresponding training image sample to realize the dilution of the noise label of the training image sample;
step two: and repeating the first step until convergence occurs, and determining a final dilution result of the noise label of the training image sample.
Specifically, during label dilution, the robust information provided by model embedding is further extracted using the K nearest neighbors (KNN) of the training image samples. FIG. 3 shows a schematic diagram of the process of finding the K nearest neighbors of image samples, i.e., finding the K image samples closest to each training image sample in Euclidean space (where ● and × respectively indicate that an image sample is correctly/incorrectly labeled). FIG. 4 illustrates the K-nearest-neighbor-based noise label dilution process. In each round of dilution, the K nearest neighbors of each training image sample are first found, the label information of these neighbors is then propagated to the image sample, and the label of the image sample is updated; the process iterates for T rounds until convergence.
Given input training data D = {(x_i, ỹ_i)}_{i=1}^n, where x_i denotes a training image, ỹ_i denotes the noise label corresponding to the image, and n denotes the total number of training image samples, this data is used to train a DNN f(x) such that, for any test data x_t, f(x_t) can accurately predict its true label y_t. Let w denote the trainable parameters of f(x), φ(x) denote the embedding features of the last hidden layer of f(x), and NN_K(x_i; φ) denote the K nearest neighbors of image sample x_i under the feature φ(x).
In each round of label dilution, for each image sample x_i, its embedding feature φ(x_i) is first used to obtain its K nearest neighbors NN_K(x_i; φ), and the label information of these neighbors is then propagated to the image sample x_i to obtain its dilution label z_i.
Formally, the t-th iteration of the dilution process is expressed as follows:
z_i^(t) = (1-α)·Σ_{x_j ∈ NN_K(x_i; φ)} W_ij·z_j^(t-1) + α·z_i^(t-1)    (1)
wherein W denotes a similarity matrix constructed based on φ(x); the first term on the right side of formula (1) aggregates the label information of the K nearest neighbors of sample x_i, the second term on the right is the label of sample x_i at the (t-1)-th iteration, and α is a weight parameter balancing the two terms. Written in matrix form, the above equation becomes:
Z^(t) = (1-α)·W·Z^(t-1) + α·Z^(t-1)    (2)
The dilution label Z^(0) is initialized with the one-hot representation of the noise labels ỹ, and Z is then updated using equation (2) until convergence.
Next, the generation process of the similarity matrix W is described in detail. Given the model embedding features {φ(x_i)}_{i=1}^n of the training samples, a sparse affinity matrix A of size n×n is first constructed, whose element A_ij measures the similarity between φ(x_i) and φ(x_j) when x_j belongs to the K nearest neighbors NN_K(x_i; φ) and is 0 otherwise, where γ is a hyperparameter of the similarity measure, typically set to 4. Finally, the similarity matrix is constructed as W = D^(-1/2)·W′·D^(-1/2), where W′ is the symmetric matrix obtained from A and D = diag(W′·1).
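The sketch below builds a K-nearest-neighbor affinity matrix A on the embedding features and normalizes it as W = D^(-1/2)·W′·D^(-1/2). Since the exact affinity formula is not reproduced above, a Gaussian kernel on the Euclidean distance (with γ acting as its bandwidth) and the symmetrization W′ = A + Aᵀ are stated assumptions; only the K-sparsity of A, the hyperparameter γ and the final normalization follow the description.

```python
import numpy as np

def build_similarity_matrix(phi: np.ndarray, k: int = 10, gamma: float = 4.0) -> np.ndarray:
    """Build the normalized similarity matrix W from the embeddings phi (n x d).

    Assumptions: a Gaussian kernel on Euclidean distance plays the role of the
    affinity, and A is symmetrized as W' = A + A^T before normalization.
    """
    n = phi.shape[0]
    dists = np.linalg.norm(phi[:, None, :] - phi[None, :, :], axis=-1)  # pairwise distances
    A = np.zeros((n, n))
    for i in range(n):
        order = np.argsort(dists[i])
        neighbors = order[order != i][:k]                    # K nearest neighbors of sample i
        A[i, neighbors] = np.exp(-dists[i, neighbors] ** 2 / gamma)
    W_prime = A + A.T                                        # symmetrize (assumed)
    D = W_prime.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(D, 1e-12)))
    return D_inv_sqrt @ W_prime @ D_inv_sqrt                 # W = D^(-1/2) W' D^(-1/2)
```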
According to the robust image classification method based on model embedding provided by the embodiment of the disclosure, the image sample screening according to the diluted label specifically includes:
determining a noise label of the training image sample and a dilution label corresponding to the training image sample;
inputting the noise label and the dilution label into a first model, and determining the weight of the image sample;
and screening the image sample according to the weight.
Specifically, given the training samples {(x_i, ỹ_i)}_{i=1}^n and their dilution labels {z_i}_{i=1}^n, the training weight e_i of each sample can be obtained by formula (4), and the DNN parameter w is further updated by formula (5):
e_i = I( ỹ_i = argmax_j Z_ij )    (4)
w ← w - η·∇_w [ (1/|B|)·Σ_{i∈B} e_i·ℓ( f(x_i), ỹ_i ) ]    (5)
wherein B is the set of screened training image samples, ∇_w is the gradient operator with respect to w, w is a parameter of the deep neural network model, η is the learning rate, e_i is the weight of the i-th training image sample, ỹ_i represents the noise label corresponding to the i-th training image sample, f(x_i) is the prediction of the deep neural network model on the training image sample x_i, and ℓ(·,·) is the loss function.
In formula (4), argmax_j Z_ij denotes the value of j at which Z_ij attains its maximum, where Z is the n×C matrix of dilution labels and C is the number of classes. The noise labels are represented in one-hot form: in a one-hot label each element is 0 or 1, and there is one and only one 1; ỹ_i is the position of the 1 in the i-th one-hot noise label, i.e., the class index of the noise label. I(a) is the indicator function: I(a) = 1 when a is true and I(a) = 0 otherwise. Thus, when ỹ_i = argmax_j Z_ij holds, the weight of the sample is 1, i.e., the image sample is selected; when it does not hold, the weight of the sample is 0, i.e., the image sample is not selected.
The parameters of the deep neural network model are updated according to the screened image samples in the manner of formula (5).
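To make the screening and update steps concrete, the following sketch computes the weights e_i of formula (4) from the dilution-label matrix Z and the noisy labels, and performs one gradient step restricted to the selected samples in the spirit of formula (5). The use of PyTorch, cross-entropy as the loss ℓ, and a model that returns (logits, embedding) as in the earlier sketch are illustrative assumptions.

```python
import numpy as np
import torch
import torch.nn.functional as F

def sample_weights(Z: np.ndarray, noisy_labels: np.ndarray) -> np.ndarray:
    """Formula (4): e_i = 1 iff the noisy label agrees with argmax_j Z_ij."""
    return (np.argmax(Z, axis=1) == noisy_labels).astype(np.float32)

def filtered_update(model, optimizer, images: torch.Tensor,
                    noisy_labels: torch.Tensor, weights: torch.Tensor) -> None:
    """One update in the spirit of formula (5): only selected samples (e_i = 1) contribute."""
    logits, _ = model(images)                        # f(x_i); the embedding is not needed here
    per_sample_loss = F.cross_entropy(logits, noisy_labels, reduction="none")
    selected = weights > 0                           # the screened set B
    if selected.any():
        loss = per_sample_loss[selected].mean()      # average loss over B
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```

In use, the weights for a mini-batch would be sliced from `sample_weights(Z, noisy_labels)` computed on the full training set after each dilution round.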
According to the robust image classification method based on model embedding provided by the embodiment of the disclosure, the method for obtaining the K nearest neighbor of each training image sample comprises the following steps:
determining Euclidean spatial distance values between each training image sample and other training image samples in the training image sample set;
sorting the Euclidean spatial distance values from small to large;
and screening out the first K Euclidean spatial distance values, wherein the training image samples corresponding to the first K Euclidean spatial distance values are K neighbors of each training image sample.
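A direct implementation of the three steps above (compute the pairwise Euclidean distances, sort them from small to large, keep the first K) might look as follows; the brute-force distance computation is an illustrative choice, and any approximate nearest-neighbor index could be substituted for large n.

```python
import numpy as np

def k_nearest_neighbors(phi: np.ndarray, k: int) -> np.ndarray:
    """Return an (n, k) array of K-nearest-neighbor indices under Euclidean distance."""
    dists = np.linalg.norm(phi[:, None, :] - phi[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)           # a sample is not its own neighbor
    # Sort each row's distances from small to large and keep the first K indices.
    return np.argsort(dists, axis=1)[:, :k]
```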
In the experiments, the CIFAR-10 and CIFAR-100 data sets are used to verify the effectiveness of the method proposed in this patent. Specifically, CIFAR-10 and CIFAR-100 each contain 50000 training images and 10000 test images, with 10 and 100 image classes, respectively. Note that these data sets are initially clean; the experiments convert them into noisy data sets by manually adding label noise in two ways: symmetric flipping and asymmetric flipping. Symmetric flipping, also called random flipping, flips the image labels of each class to all other labels with a certain probability ρ; asymmetric flipping, also called pair flipping, flips the image labels of each class to a semantically similar label (such as cat → dog, deer → horse) with a certain probability ρ, and this flipping mode corresponds to fine-grained image classification scenarios in practical applications. The probability ρ is also called the noise rate, i.e., the proportion of noisy images. The noise rates considered for asymmetric flipping are 40%, 50% and 60%; the noise rates considered for symmetric flipping are 20%, 30% and 40%. The experimental results are shown in Tables 1 and 2 below:
table 1: CIFAR-10 data set experimental results
Table 2: CIFAR-100 data set experimental results
Tables 1 and 2 show the experimental results (test accuracy and standard deviation) of the present scheme (LEND) and existing schemes (CE, Co-teaching, JoCor and GCE) on the CIFAR-10 and CIFAR-100 data sets, respectively. The "Best" row reports the optimal test result during training, the "Last" row reports the final test result, and the best result of each group of experiments is marked in bold.
In addition, to verify that the model embedding adopted by this scheme is superior to model prediction, the classification performance curves of the two methods during training are further compared, as shown in FIG. 5, where the dotted line denotes the accuracy of the model embedding method and the solid line denotes the accuracy of the model prediction method.
The test results show that the robust image classification method based on model embedding provided by this patent obtains better results than existing schemes under various label noise scenarios and noise rates. Meanwhile, it is verified that the model embedding adopted by this scheme is more robust than model prediction. Therefore, the technical scheme can be applied to classification tasks on noisy images and ensures a more accurate classification result.
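For reference, the symmetric and asymmetric label flipping used above to construct the noisy training sets can be sketched as follows. Flipping uniformly over the remaining classes for symmetric noise follows the description; the pairing map for asymmetric noise (e.g. cat → dog) is dataset-specific and is therefore passed in as a hypothetical dictionary argument.

```python
import numpy as np

def add_symmetric_noise(labels: np.ndarray, num_classes: int, rho: float,
                        rng: np.random.Generator) -> np.ndarray:
    """Flip each label to a uniformly chosen *other* class with probability rho."""
    noisy = labels.copy()
    flip = rng.random(labels.shape[0]) < rho
    for i in np.where(flip)[0]:
        choices = [c for c in range(num_classes) if c != labels[i]]
        noisy[i] = rng.choice(choices)
    return noisy

def add_asymmetric_noise(labels: np.ndarray, pair_map: dict, rho: float,
                         rng: np.random.Generator) -> np.ndarray:
    """Flip each label to its semantically paired class (e.g. cat -> dog) with probability rho."""
    noisy = labels.copy()
    flip = rng.random(labels.shape[0]) < rho
    for i in np.where(flip)[0]:
        noisy[i] = pair_map.get(int(labels[i]), labels[i])
    return noisy
```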
In a second aspect, the embodiment of the present disclosure, in combination with fig. 6, provides a robust image classification apparatus based on model embedding, where the apparatus includes:
the first processing module 61 is used for acquiring an image to be classified;
the second processing module 62 is configured to input the image to be classified into an image classification deep neural network model, and output an image classification recognition result corresponding to the image to be classified;
the image classification deep neural network model is obtained by training through the following method:
inputting a training image sample into a deep neural network model, and acquiring the embedded characteristic of the training image sample in the last hidden layer of the neural network model;
diluting the noise label of the training image sample according to the embedding characteristics to obtain a diluted label;
performing sample screening according to the diluted label to obtain a screened image sample;
inputting the screened image sample into the deep neural network model, and updating the model parameters of the deep neural network model to obtain the image classification deep neural network model.
Since the apparatus provided by the embodiment of the present disclosure can be used to execute the method described in the above embodiment, and its operating principle and beneficial effects are similar, detailed descriptions are omitted here; for specific contents, reference can be made to the description of the above embodiment.
According to the robust image classification device based on model embedding, a training image sample is input into a deep neural network model to obtain the embedding features of the training image sample in the last hidden layer of the neural network model; the noise label of the training image sample is diluted according to the embedding features, sample images are screened based on the diluted label, and the parameters of the model are updated with the screened images. The technical scheme of the present disclosure can therefore reduce labeling cost while ensuring robust image classification performance, so as to save the manpower and material costs of image labeling.
According to the robust image classification device based on model embedding provided by the embodiment of the present disclosure, the diluting the noise label of the training image sample according to the embedding feature specifically includes:
step one: acquiring the K nearest neighbors of each training image sample;
transmitting the label information of the K nearest neighbors to the corresponding training image sample to realize the dilution of the noise label of the training image sample;
step two: and repeating the first step until convergence occurs, and determining a final dilution result of the noise label of the training image sample.
According to the robust image classification device based on model embedding provided by the embodiment of the present disclosure, the image sample screening according to the diluted label specifically includes:
determining a noise label of the training image sample and a dilution label corresponding to the training image sample;
inputting the noise label and the dilution label into a first model, and determining the weight of the image sample;
and screening the image sample according to the weight.
According to the robust image classification device based on model embedding provided by the embodiment of the disclosure, the model parameters of the deep neural network model are updated according to a second model, wherein the second model is as follows:
w ← w - η·∇_w [ (1/|B|)·Σ_{i∈B} e_i·ℓ( f(x_i), ỹ_i ) ]
wherein B is the set of screened training image samples, ∇_w is the gradient operator with respect to w, w is a parameter of the deep neural network model, η is the learning rate, e_i is the weight of the i-th training image sample, ỹ_i represents the noise label corresponding to the i-th training image sample, f(x_i) is the prediction of the deep neural network model on the training image sample x_i, and ℓ(·,·) is the loss function.
According to the robust image classification device based on model embedding provided by the embodiment of the disclosure, the method for obtaining the K nearest neighbor of each training image sample comprises the following steps:
determining Euclidean spatial distance values between each training image sample and other training image samples in the training image sample set;
sorting the Euclidean spatial distance values from small to large;
and screening out the first K Euclidean spatial distance values, wherein the training image samples corresponding to the first K Euclidean spatial distance values are K neighbors of each training image sample.
According to the robust image classification device based on model embedding provided by the embodiment of the disclosure, the first model is as follows:
e_i = I( ỹ_i = argmax_j Z_ij )
wherein e_i is the weight of the i-th training image sample, ỹ_i represents the noise label corresponding to the i-th training image sample, Z is an n×C matrix representing the dilution labels, where n is the number of training image samples and C is the number of classes, argmax_j Z_ij represents the value of j at which Z_ij attains its maximum, and I(·) is the indicator function, which equals 1 when its argument is true and 0 otherwise.
Fig. 7 illustrates a physical structure diagram of an electronic device, and as shown in fig. 7, the electronic device may include: a processor (processor)710, a communication Interface (Communications Interface)720, a memory (memory)730, and a communication bus 740, wherein the processor 710, the communication Interface 720, and the memory 730 communicate with each other via the communication bus 740. Processor 710 may invoke logic instructions in memory 730 to perform a robust image classification method based on model embedding, the method comprising: acquiring an image to be classified; inputting the image to be classified into an image classification deep neural network model, and outputting an image classification recognition result corresponding to the image to be classified; the image classification deep neural network model is obtained by training through the following method: inputting a training image sample into a deep neural network model, and acquiring the embedded characteristic of the training image sample in the last hidden layer of the neural network model; diluting the noise label of the training image sample according to the embedding characteristics to obtain a diluted label; performing sample screening according to the diluted label to obtain a screened image sample; inputting the screened image sample into the deep neural network model, and updating the model parameters of the deep neural network model to obtain the image classification deep neural network model.
In addition, the logic instructions in the memory 730 can be implemented in the form of software functional units and stored in a computer readable storage medium when the software functional units are sold or used as independent products. Based on such understanding, the technical solutions of the embodiments of the present disclosure may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods described in the embodiments of the present disclosure. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
In another aspect, the present disclosure also provides a computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform a robust image classification method based on model embedding provided by the above methods, the method comprising: acquiring an image to be classified; inputting the image to be classified into an image classification deep neural network model, and outputting an image classification recognition result corresponding to the image to be classified; the image classification deep neural network model is obtained by training through the following method: inputting a training image sample into a deep neural network model, and acquiring the embedded characteristic of the training image sample in the last hidden layer of the neural network model; diluting the noise label of the training image sample according to the embedding characteristics to obtain a diluted label; performing sample screening according to the diluted label to obtain a screened image sample; inputting the screened image sample into the deep neural network model, and updating the model parameters of the deep neural network model to obtain the image classification deep neural network model.
In yet another aspect, the present disclosure also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the robust image classification method based on model embedding provided above, the method comprising: acquiring an image to be classified; inputting the image to be classified into an image classification deep neural network model, and outputting an image classification recognition result corresponding to the image to be classified; the image classification deep neural network model is obtained by training through the following method: inputting a training image sample into a deep neural network model, and acquiring the embedded characteristic of the training image sample in the last hidden layer of the neural network model; diluting the noise label of the training image sample according to the embedding characteristics to obtain a diluted label; performing sample screening according to the diluted label to obtain a screened image sample; inputting the screened image sample into the deep neural network model, and updating the model parameters of the deep neural network model to obtain the image classification deep neural network model.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solutions of the present disclosure, not to limit them; although the present disclosure has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present disclosure.

Claims (14)

1. A robust image classification method based on model embedding is characterized by comprising the following steps:
acquiring an image to be classified;
inputting the image to be classified into an image classification deep neural network model, and outputting an image classification recognition result corresponding to the image to be classified;
the image classification deep neural network model is obtained by training through the following method:
inputting a training image sample into a deep neural network model, and acquiring the embedded characteristic of the training image sample in the last hidden layer of the neural network model;
diluting the noise label of the training image sample according to the embedding characteristics to obtain a diluted label;
performing sample screening according to the diluted label to obtain a screened image sample;
inputting the screened image sample into the deep neural network model, and updating the model parameters of the deep neural network model to obtain the image classification deep neural network model.
2. The method for robust image classification based on model embedding according to claim 1, wherein the diluting the noise label of the training image sample according to the embedding feature specifically comprises:
step one: acquiring the K nearest neighbors of each training image sample;
transmitting the label information of the K nearest neighbors to the corresponding training image sample to realize the dilution of the noise label of the training image sample;
step two: and repeating the first step until convergence occurs, and determining a final dilution result of the noise label of the training image sample.
3. The model-embedding-based robust image classification method according to claim 1, wherein the image sample screening according to the diluted label specifically comprises:
determining a noise label of the training image sample and a dilution label corresponding to the training image sample;
inputting the noise label and the dilution label into a first model, and determining the weight of the image sample;
and screening the image sample according to the weight.
4. The method of claim 1, wherein the model parameters of the deep neural network model are updated according to a second model, wherein the second model is:
w ← w - η·∇_w [ (1/|B|)·Σ_{i∈B} e_i·ℓ( f(x_i), ỹ_i ) ]
wherein B is the set of screened training image samples, ∇_w is the gradient operator with respect to w, w is a parameter of the deep neural network model, η is the learning rate, e_i is the weight of the i-th training image sample, ỹ_i represents the noise label corresponding to the i-th training image sample, f(x_i) is the prediction of the deep neural network model on the training image sample x_i, and ℓ(·,·) is the loss function.
5. The method for robust image classification based on model embedding according to claim 2, wherein the method for obtaining the K-nearest neighbor of each training image sample is as follows:
determining Euclidean spatial distance values between each training image sample and other training image samples in the training image sample set;
sorting the Euclidean spatial distance values from small to large;
and screening out the first K Euclidean spatial distance values, wherein the training image samples corresponding to the first K Euclidean spatial distance values are K neighbors of each training image sample.
6. The method of claim 3, wherein the first model is:
e_i = I( ỹ_i = argmax_j Z_ij )
wherein e_i is the weight of the i-th training image sample, ỹ_i represents the noise label corresponding to the i-th training image sample, Z is an n×C matrix representing the dilution labels, where n is the number of training image samples and C is the number of classes, argmax_j Z_ij represents the value of j at which Z_ij attains its maximum, and I(·) is the indicator function, which equals 1 when its argument is true and 0 otherwise.
7. A robust image classification apparatus based on model embedding, comprising:
the first processing module is used for acquiring an image to be classified;
the second processing module is used for inputting the image to be classified into an image classification deep neural network model and outputting an image class identification result corresponding to the image to be classified;
the image classification deep neural network model is obtained by training through the following method:
inputting a training image sample into a deep neural network model, and acquiring the embedded characteristic of the training image sample in the last hidden layer of the neural network model;
diluting the noise label of the training image sample according to the embedding characteristics to obtain a diluted label;
performing sample screening according to the diluted label to obtain a screened image sample;
inputting the screened image sample into the deep neural network model, and updating the model parameters of the deep neural network model to obtain the image classification deep neural network model.
8. The robust image classification device based on model embedding of claim 7, wherein the diluting the noise label of the training image sample according to the embedding feature specifically comprises:
step one: acquiring the K nearest neighbors of each training image sample;
transmitting the label information of the K nearest neighbors to the corresponding training image sample to realize the dilution of the noise label of the training image sample;
step two: and repeating the first step until convergence occurs, and determining a final dilution result of the noise label of the training image sample.
9. The model-embedding-based robust image classification device according to claim 7, wherein the image sample screening according to the diluted label specifically comprises:
determining a noise label of the training image sample and a dilution label corresponding to the training image sample;
inputting the noise label and the dilution label into a first model, and determining the weight of the image sample;
and screening the image sample according to the weight.
10. The apparatus of claim 7, wherein the model parameters of the deep neural network model are updated according to a second model, wherein the second model is:
w ← w - η·∇_w [ (1/|B|)·Σ_{i∈B} e_i·ℓ( f(x_i), ỹ_i ) ]
wherein B is the set of screened training image samples, ∇_w is the gradient operator with respect to w, w is a parameter of the deep neural network model, η is the learning rate, e_i is the weight of the i-th training image sample, ỹ_i represents the noise label corresponding to the i-th training image sample, f(x_i) is the prediction of the deep neural network model on the training image sample x_i, and ℓ(·,·) is the loss function.
11. The robust image classification device based on model embedding of claim 8 is characterized in that the method for obtaining the K nearest neighbor of each training image sample is as follows:
determining Euclidean spatial distance values between each training image sample and other training image samples in the training image sample set;
sorting the Euclidean spatial distance values from small to large;
and screening out the first K Euclidean spatial distance values, wherein the training image samples corresponding to the first K Euclidean spatial distance values are K neighbors of each training image sample.
12. The robust image classification device based on model embedding according to claim 9, wherein the first model is:
e_i = I( ỹ_i = argmax_j Z_ij )
wherein e_i is the weight of the i-th training image sample, ỹ_i represents the noise label corresponding to the i-th training image sample, Z is an n×C matrix representing the dilution labels, where n is the number of training image samples and C is the number of classes, argmax_j Z_ij represents the value of j at which Z_ij attains its maximum, and I(·) is the indicator function, which equals 1 when its argument is true and 0 otherwise.
13. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor when executing the program performs the steps of the method for robust image classification based on model embedding according to any of claims 1 to 6.
14. A non-transitory computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the robust image classification method based on model embedding of any one of claims 1 to 6.
CN202110898433.3A 2021-08-05 2021-08-05 Robust image classification method and device based on model embedding Active CN113537389B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110898433.3A CN113537389B (en) 2021-08-05 2021-08-05 Robust image classification method and device based on model embedding

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110898433.3A CN113537389B (en) 2021-08-05 2021-08-05 Robust image classification method and device based on model embedding

Publications (2)

Publication Number Publication Date
CN113537389A true CN113537389A (en) 2021-10-22
CN113537389B CN113537389B (en) 2023-11-07

Family

ID=78122088

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110898433.3A Active CN113537389B (en) 2021-08-05 2021-08-05 Robust image classification method and device based on model embedding

Country Status (1)

Country Link
CN (1) CN113537389B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115618935A (en) * 2022-12-21 2023-01-17 北京航空航天大学 Robustness loss function searching method and system for classified task label noise

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105335756A (en) * 2015-10-30 2016-02-17 苏州大学 Robust learning model and image classification system
CN108171261A (en) * 2017-12-21 2018-06-15 苏州大学 Adaptive semi-supervision image classification method, device, equipment and the medium of robust
CN108399421A (en) * 2018-01-31 2018-08-14 南京邮电大学 A kind of zero sample classification method of depth of word-based insertion
US20200242415A1 (en) * 2019-01-30 2020-07-30 Coretronic Corporation Training method of neural network and classification method based on neural network and device thereof
CN111488904A (en) * 2020-03-03 2020-08-04 清华大学 Image classification method and system based on confrontation distribution training
CN112766386A (en) * 2021-01-25 2021-05-07 大连理工大学 Generalized zero sample learning method based on multi-input multi-output fusion network

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105335756A (en) * 2015-10-30 2016-02-17 苏州大学 Robust learning model and image classification system
CN108171261A (en) * 2017-12-21 2018-06-15 苏州大学 Adaptive semi-supervision image classification method, device, equipment and the medium of robust
CN108399421A (en) * 2018-01-31 2018-08-14 南京邮电大学 A kind of zero sample classification method of depth of word-based insertion
US20200242415A1 (en) * 2019-01-30 2020-07-30 Coretronic Corporation Training method of neural network and classification method based on neural network and device thereof
CN111488904A (en) * 2020-03-03 2020-08-04 清华大学 Image classification method and system based on confrontation distribution training
CN112766386A (en) * 2021-01-25 2021-05-07 大连理工大学 Generalized zero sample learning method based on multi-input multi-output fusion network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
黎健成; 袁春; 宋友: "Multi-label automatic image annotation based on convolutional neural networks" (基于卷积神经网络的多标签图像自动标注), Computer Science (计算机科学), no. 07 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115618935A (en) * 2022-12-21 2023-01-17 北京航空航天大学 Robustness loss function searching method and system for classified task label noise

Also Published As

Publication number Publication date
CN113537389B (en) 2023-11-07

Similar Documents

Publication Publication Date Title
CN109934293B (en) Image recognition method, device, medium and confusion perception convolutional neural network
US11816183B2 (en) Methods and systems for mining minority-class data samples for training a neural network
CN110334742B (en) Graph confrontation sample generation method based on reinforcement learning and used for document classification and adding false nodes
EP3629246A1 (en) Systems and methods for neural architecture search
Goodfellow et al. Generative adversarial nets
US8325999B2 (en) Assisted face recognition tagging
US11068747B2 (en) Computer architecture for object detection using point-wise labels
US11037027B2 (en) Computer architecture for and-or neural networks
US11593619B2 (en) Computer architecture for multiplier-less machine learning
CN113010683B (en) Entity relationship identification method and system based on improved graph attention network
US20210295112A1 (en) Image recognition learning device, image recognition device, method and program
US20200272812A1 (en) Human body part segmentation with real and synthetic images
CN115552481A (en) System and method for fine tuning image classification neural network
CN112465226A (en) User behavior prediction method based on feature interaction and graph neural network
WO2020190951A1 (en) Neural network trained by homographic augmentation
Akpinar et al. Sample complexity bounds for recurrent neural networks with application to combinatorial graph problems
CN113537389A (en) Robust image classification method and device based on model embedding
JP2022507144A (en) Computer architecture for artificial image generation
CN116883751A (en) Non-supervision field self-adaptive image recognition method based on prototype network contrast learning
JP2019028484A (en) Attribute identification apparatus, attribute identification model learning apparatus, method and program
US20230394304A1 (en) Method and Apparatus for Neural Network Based on Energy-Based Latent Variable Models
US11587323B2 (en) Target model broker
JP6993250B2 (en) Content feature extractor, method, and program
CN111709479B (en) Image classification method and device
CN115661847B (en) Table structure recognition and model training method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant