CN114549906A - Improved image classification algorithm for step-by-step training of Top-k loss function - Google Patents
Improved image classification algorithm for step-by-step training of Top-k loss function
- Publication number
- CN114549906A (application CN202210185010.1A)
- Authority
- CN
- China
- Prior art keywords
- loss function
- improved
- module
- training
- deep learning
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Software Systems (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses an image classification algorithm based on step-by-step training with an improved Top-k loss function. The system comprises an image data preprocessing module, a deep learning feature extraction module and a system prediction output module. The image data preprocessing module preprocesses the input image data; the deep learning feature extraction module performs step-by-step training with the improved Top-k loss function using deep learning, and the classifier module is a deep learning network module; the system output module processes the output of the classifier and outputs the decision result. By using the improved Top-k loss function with step-by-step training, the image classification system breaks through the accuracy limit of the deep neural network without modifying the network structure, and the improved algorithm effectively raises the classification accuracy.
Description
Technical Field
The invention belongs to the field of image classification, and particularly relates to an image classification system based on deep learning and step-by-step training with an improved Top-k loss function.
Background
Image Classification technology has developed rapidly as image classification plays an increasingly important role in many areas of daily life. The core research problem is: given a set of images, each labelled with a single class, predict the classes of a new set of test images and measure the accuracy of those predictions. Traditional image classification algorithms extract manually designed features, which are difficult to design and have strong limitations, so they cannot handle complex tasks. For algorithms such as the K-nearest-neighbour algorithm (KNN) and the support vector machine (SVM), the design difficulty is often high, the combination of feature extraction and classifier algorithms is complex, and high classification accuracy is difficult to achieve.
In recent years, deep learning has been increasingly applied to image classification systems. Deep learning methods have achieved many breakthrough results in practical applications and have gradually become an important tool of artificial intelligence. The convolutional neural network is one of the deep learning algorithms. Compared with traditional hand-crafted features, deep features do not require a complex and time-consuming feature extraction algorithm to be designed by hand; only an effective neural network model needs to be designed, and the classification accuracy is high. However, a deep neural network often cannot break through its accuracy limit without modifying the network structure, so an improved algorithm is needed to raise the accuracy effectively.
Disclosure of Invention
The invention aims to provide a step-by-step training algorithm with an improved Top-k multi-loss function for image classification. Compared with the original loss function, the method achieves higher classification accuracy; and because it uses a deep learning method, it omits the steps of manually extracting features and manually selecting a classifier, overcoming the difficulty of feature extraction and classification in traditional methods.
The solution provided by the invention adopts a step-by-step training algorithm with an improved Top-k multi-loss function for image classification to achieve higher classification accuracy. The system comprises: an image data preprocessing module; a deep learning feature extraction module; and a system prediction output module.
The image data preprocessing module represents the image data as a three-dimensional tensor. Its functions include discarding irrelevant input data to reduce negative effects, and it generally comprises changing the brightness of the original image, changing the contrast of the original image, centre-cropping the image, changing the colour function, and denoising. Preprocessing converts the image data into an input form accepted by the image target classification module; in training mode the image target data categories are labelled and the data set required by the machine learning method is selected.
The working method of image data preprocessing in training mode is as follows: for a three-channel image, the brightness of the original image is changed, the contrast of the original image is changed, the image is centre-cropped, the colour function is changed, denoising is performed, and preprocessing such as uniform upsampling, centre cropping and rotation is applied to expand the data volume. During training, the parameters of the convolutional neural network are updated to obtain a CNN capable of accurately classifying the specified data set.
The working method of image target data preprocessing in test mode is as follows: for the three-channel image, the brightness of the original image is changed, the contrast of the original image is changed, the colour function is changed, denoising is performed, and uniform upsampling is applied, but no data expansion is performed. The upsampling is to a fixed resolution; in the following embodiments the upsampled image resolution is 224 × 224 pixels. The final classification result is then obtained. A sketch of both preprocessing modes is given below.
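A minimal sketch of the two preprocessing modes, assuming a torchvision pipeline; the jitter ranges, rotation angle and normalization statistics are assumptions, since the operations are named above but their parameters are not.

```python
import torchvision.transforms as T

# Training mode: augmentation expands the data volume.
train_transform = T.Compose([
    T.Resize(256),                                    # unified upsampling
    T.ColorJitter(brightness=0.2, contrast=0.2,       # change brightness / contrast
                  saturation=0.2),                    # change colour function
    T.RandomRotation(15),                             # rotation to expand the data
    T.CenterCrop(224),                                # centre trimming to 224 x 224
    T.ToTensor(),                                     # three-dimensional tensor (C, H, W)
    T.Normalize(mean=[0.485, 0.456, 0.406],
                std=[0.229, 0.224, 0.225]),
])

# Test mode: uniform upsampling only, no data expansion.
test_transform = T.Compose([
    T.Resize((224, 224)),                             # fixed resolution 224 x 224
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406],
                std=[0.229, 0.224, 0.225]),
])
```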
The image target classification module comprises a deep learning feature extraction module and a loss function step-by-step training module; a supervised learning algorithm is used during network training. The CNN feature extraction module and the classifier module use the improved Top-k loss function. Training a model with a lower top-2 error is simpler than improving top-1 classification: as long as the first two predicted categories contain the true label, the fault tolerance of the model can be effectively improved. The base classifiers produced by this joint training strategy are used for the proposed ensemble learning, and several CNN models are trained separately with the training strategy. In the first step, the CNN is trained with a cross-entropy loss function; the network is initialised from an ImageNet pre-trained model and fine-tuned on the data set. The cross-entropy loss function was created to optimise the top-1 loss, so training the model with it yields a network with high top-1 accuracy. In the second step, a top-k loss function established for the first k correct labels is used for fine-tuning, with the model trained in the first step as the initial weights of the top-k training, so that an optimised network with unchanged top-1 accuracy and higher top-2 accuracy is obtained and the recognition capability of the model is improved. The network is trained by the two loss functions together to form a unified network. One possible formulation of the Top-k loss is sketched below.
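The improved loss itself is not spelled out in this text; purely as an illustration, a minimal PyTorch sketch of one common differentiable top-k surrogate (a top-k hinge loss in the spirit of the smooth top-k losses of Berrada et al., ICLR 2018, cited below, not necessarily the exact formulation of the invention) follows.

```python
import torch

def topk_hinge_loss(logits: torch.Tensor, target: torch.Tensor,
                    k: int = 2, margin: float = 1.0) -> torch.Tensor:
    """Penalise samples whose true-class score does not exceed the k-th
    largest non-target score by at least `margin` (illustrative surrogate)."""
    target_score = logits.gather(1, target.unsqueeze(1)).squeeze(1)   # s_y
    masked = logits.clone()
    masked.scatter_(1, target.unsqueeze(1), float("-inf"))            # remove s_y
    kth_nontarget = masked.topk(k, dim=1).values[:, -1]               # k-th largest of the rest
    return torch.clamp(margin + kth_nontarget - target_score, min=0).mean()
```

In the second training step such a function would be used in place of the cross-entropy criterion, e.g. `topk_hinge_loss(model(x), y, k=2)`.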
The system output module processes the output of the classifier and outputs the decision result.
Compared with the traditional cross-entropy loss function, the improved image classification algorithm with step-by-step training of the Top-k loss function disclosed by the invention achieves higher top-1 and top-2 classification accuracy while reducing the complexity of manual work. Without changing the model architecture, the top-1 and top-2 accuracy is improved relative to the cross-entropy loss function. Compared with the traditional training method, the improved algorithm has better robustness and performs better on the CIFAR-10 data set.
Drawings
FIG. 1 is a flow chart of an image classification algorithm using deep learning;
FIG. 2 is a flow chart of image classification using two loss functions for step-wise training;
FIG. 3 is a graph comparing the accuracy of the conventional training method and of the proposed step-by-step training with the Top-k loss function;
Detailed Description
The present invention is described in further detail below with reference to the attached drawings.
First, a data set is selected. The data set we select is CIFAR-10, a benchmark data set widely used for image classification. The data set contains 60000 images divided into 10 categories (airplane, automobile, bird, cat, deer, dog, frog, horse, ship and truck), with 50000 images for training and 10000 images for testing, all of size 32 × 32 pixels. The data set details are shown in the following table:
TABLE 1 CIFAR-10 data set information Table
Data set | Training set | Validation set | Test set | Categories |
---|---|---|---|---|
CIFAR-10 | 50000 | 0 | 10000 | 10 |
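A minimal sketch of loading the CIFAR-10 split of Table 1 with torchvision; the `ToTensor` transform is a placeholder for the train/test preprocessing pipelines sketched earlier, and the batch size and worker count are assumptions.

```python
import torchvision
import torchvision.transforms as T
from torch.utils.data import DataLoader

transform = T.ToTensor()   # placeholder; substitute the train/test pipelines above
train_set = torchvision.datasets.CIFAR10(root="./data", train=True,
                                          download=True, transform=transform)
test_set = torchvision.datasets.CIFAR10(root="./data", train=False,
                                         download=True, transform=transform)

train_loader = DataLoader(train_set, batch_size=32, shuffle=True, num_workers=4)
test_loader = DataLoader(test_set, batch_size=32, shuffle=False, num_workers=4)

print(len(train_set), len(test_set), train_set.classes)   # 50000, 10000, the 10 class names
```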
Selection of the deep learning models: the models selected for testing are as follows (an instantiation sketch follows the list):
Inception-v3. Inception-v3 has strong image feature extraction and classification performance and is a widely used image recognition model. It consists of symmetric and asymmetric building blocks, including convolutional, average pooling, max pooling, padding and fully connected layers. Batch normalization is used extensively throughout the model and applied to the activation inputs.
DPN92. The DPN uses a High-Order RNN structure (HORNN) to link DenseNet and ResNet, showing that DenseNet can extract new features from previous layers, while ResNet essentially reuses the features extracted by previous layers. By combining the advantages of the two structures, the DPN network effectively improves classification efficiency.
ResNet. ResNet uses identity mappings to pass the output of an earlier layer directly to a later layer, so that increasing the depth of the network does not increase the error: a deeper network does not raise the training-set error, and the vanishing gradient problem is alleviated.
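For illustration, the three backbones could be instantiated roughly as follows; the torchvision/timm calls and weight-enum names are assumptions about the toolchain, not part of the original disclosure.

```python
import torch.nn as nn
import torchvision.models as models

num_classes = 10  # CIFAR-10

# ResNet-18 initialised from ImageNet weights, classifier replaced for 10 classes.
resnet18 = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
resnet18.fc = nn.Linear(resnet18.fc.in_features, num_classes)

# Inception-v3 (torchvision's version expects 299x299 inputs and has an
# auxiliary classifier head, whose output layer is replaced as well).
inception = models.inception_v3(weights=models.Inception_V3_Weights.IMAGENET1K_V1)
inception.fc = nn.Linear(inception.fc.in_features, num_classes)
inception.AuxLogits.fc = nn.Linear(inception.AuxLogits.fc.in_features, num_classes)

# DPN-92 is not shipped with torchvision; the timm package provides one implementation:
# import timm
# dpn92 = timm.create_model("dpn92", pretrained=True, num_classes=num_classes)
```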
The models were trained in the experimental environment shown in the table below.
Table 2 experimental environment table
The model training parameters are shown in the following table. The batch size is adjusted according to the number of model parameters to reduce the GPU memory occupied and the model training time; a configuration sketch follows the parameter table.
TABLE 3 Model initialization parameter table
Parameter | Value |
---|---|
Learning rate | 0.1 |
Learning rate decay | 0.1 |
Momentum | 0.9 |
Batch size | 32/24 |
Training epochs | 200 |
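A configuration sketch matching Table 3, assuming a PyTorch workflow; the choice of SGD and the decay milestones are assumptions, since the table lists values but not the optimiser or the decay schedule.

```python
import torch

# model: any of the backbones instantiated in the earlier sketch
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer,
                                                 milestones=[100, 150],  # assumed epochs
                                                 gamma=0.1)              # learning-rate decay
num_epochs = 200
batch_size = 32   # reduced to 24 for the larger models to fit GPU memory
```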
In the training process, the image classification algorithm with step-by-step training of the improved Top-k loss function is used, as shown in FIG. 2. Training a model with a lower top-2 error is simpler than improving top-1 classification: as long as the first two predicted categories contain the true label, the fault tolerance of the model can be effectively improved. The base classifiers produced by this joint training strategy are used for the proposed ensemble learning, and several CNN models are trained separately with the strategy. In the first step, the CNN is trained with a cross-entropy loss function; the network is initialised from an ImageNet pre-trained model and fine-tuned on the data set. Since the cross-entropy loss function was created to optimise the top-1 loss, training the model with it yields a network with high top-1 accuracy. In the second step, the top-k loss function established for the first k correct labels is used for fine-tuning, with the model from the first step as the initial weights, so that an optimised network with unchanged top-1 accuracy and higher top-2 accuracy is obtained and the recognition capability of the model is improved. However, because of label ambiguity and feature uncertainty there is no perfect recognition method; besides improving the efficiency of feature extraction, the predicted label may be hidden between top-1 and top-k, and the improved top-2 classification accuracy makes it possible to extract and use the top-2 label effectively. The network is trained by the two loss functions together to form a unified network; the two training steps are sketched below.
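A minimal sketch of the two training steps, reusing the loaders, backbone and the illustrative `topk_hinge_loss` surrogate from the earlier sketches; the stage-2 learning rate and the checkpoint file name are assumptions.

```python
import torch
import torch.nn as nn

def run_epochs(model, loader, criterion, optimizer, epochs):
    model.train()
    for _ in range(epochs):
        for images, labels in loader:
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()

# Step 1: fine-tune the ImageNet-initialised model with cross-entropy (top-1 oriented).
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
run_epochs(model, train_loader, nn.CrossEntropyLoss(), optimizer, epochs=200)
torch.save(model.state_dict(), "stage1.pth")

# Step 2: reload the step-1 weights and fine-tune with the Top-k loss (k=2),
# aiming to raise top-2 accuracy while preserving top-1 accuracy.
model.load_state_dict(torch.load("stage1.pth"))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)   # assumed lower lr
run_epochs(model, train_loader,
           lambda out, y: topk_hinge_loss(out, y, k=2), optimizer, epochs=200)
```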
After training, the models were tested; the results for the three deep learning models ResNet, DPN92 and Inception-v3 with the traditional cross-entropy loss function are shown in the following table:
TABLE 3 Classification results on the CIFAR-10 data set using the cross-entropy loss function
Method | Accuracy (%) |
---|---|
ResNet18 | 96.50 |
DPN92 | 97.56 |
Inception-v3 | 97.19 |
The same three models, ResNet, DPN92 and Inception-v3, were also tested after step-by-step training with the proposed Top-k loss function; the comparison is shown in the following table:
Table 4 Comparison of classification results using different loss functions with the Inception-v3, DPN92 and ResNet18 models on the CIFAR-10 data set.
The improved image classification algorithm with step-by-step training of the Top-k loss function attempts to reduce the top-2 generalisation error of the model while guaranteeing top-1 precision; by constraining the top-k loss value, the importance of several of the model's output values and of the top-k loss is emphasised, further improving classification performance. The two loss functions were compared on the CIFAR-10 data set. A base classifier using the top-k loss function is superior to one using the cross-entropy loss function in both top-1 and top-2 precision, showing that the classification performance of the model improves after the top-k loss function is applied. Specifically, after applying the top-k loss function, the top-1 precision of the models improves by about 0.03%–0.2% and the top-2 precision by about 0.2%–0.6% on the CIFAR-10 data set; FIG. 3 shows the stability of the top-k loss function on the CIFAR-10 data set. A sketch of evaluating the top-1 and top-2 accuracies is given below.
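A sketch of computing top-1 and top-2 accuracy on the test loader; the metric is standard, but the function itself is an illustration rather than the original code.

```python
import torch

@torch.no_grad()
def topk_accuracy(model, loader, ks=(1, 2)):
    """Percentage of test images whose true label is among the k highest-scoring
    predictions, for each k in `ks`."""
    model.eval()
    correct = {k: 0 for k in ks}
    total = 0
    for images, labels in loader:
        preds = model(images).topk(max(ks), dim=1).indices           # (batch, max_k)
        for k in ks:
            correct[k] += (preds[:, :k] == labels.unsqueeze(1)).any(dim=1).sum().item()
        total += labels.size(0)
    return {k: 100.0 * correct[k] / total for k in ks}

# e.g. topk_accuracy(model, test_loader) returns the top-1 and top-2 accuracies
```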
It will be understood by those skilled in the art that the foregoing is only an exemplary embodiment of the present invention and is not intended to limit the invention to the particular forms disclosed; various modifications, substitutions and improvements within the spirit and scope of the invention are possible and fall within the scope of the appended claims.
Claims (4)
1. An improved image classification algorithm with step-by-step training of a Top-k loss function, characterized by comprising the following modules:
(1) image data preprocessing module
(2) Deep learning feature extraction module
(3) A system prediction output module.
2. The image classification algorithm with step-by-step training of the improved Top-k loss function according to claim 1, characterized in that the image data preprocessing module in module (1) preprocesses the input image data; the deep learning feature extraction module in module (2) performs step-by-step training with the improved Top-k loss function using deep learning, and the classifier module is the fully connected layer of the deep learning network; and the system prediction output module in module (3) processes the output of the classifier and outputs the decision result.
3. The image classification algorithm with step-by-step training of the improved Top-k loss function according to claim 1, characterized in that the step-by-step training with the improved Top-k loss function can replace the cross-entropy loss function to construct a better network with improved top-1 and top-2 accuracy. The traditional cross-entropy loss function is expressed mathematically as:
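The equation itself is not reproduced in this text; in its standard form (an assumption about the notation, not the original formula), the cross-entropy loss over C classes is

```latex
\mathcal{L}_{\mathrm{CE}}(s, y) \;=\; -\sum_{i=1}^{C} y_i \log p_i,
\qquad p_i \;=\; \frac{e^{s_i}}{\sum_{j=1}^{C} e^{s_j}},
```

where s denotes the network scores (logits), p the softmax probabilities and y the one-hot true label.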
The top-k loss depends on whether y is part of the top-k predictions, which is equivalent to comparing the top k predictions with the true label. The Top-k loss function is expressed mathematically as:
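Again the formula is not reproduced here; the standard top-k (0/1) error on which such losses are built (an assumption about the exact form used in the claim) is

```latex
\ell_{k}(s, y) \;=\; \mathbb{1}\!\left[\, y \notin \operatorname{top\text{-}k}(s) \,\right]
\;=\; \mathbb{1}\!\left[\, s_{[k]} > s_y \,\right],
```

where s_{[k]} denotes the k-th largest score; the improved loss of the claim is a differentiable surrogate of this quantity, one possible surrogate being sketched in the description above.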
4. The image classification algorithm with step-by-step training of the improved Top-k loss function according to claim 1, characterized in that the feature extraction capability of the model is improved, a better classification effect is obtained, and the robustness of the model is improved, all without modifying the network structure.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210185010.1A CN114549906A (en) | 2022-02-28 | 2022-02-28 | Improved image classification algorithm for step-by-step training of Top-k loss function |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210185010.1A CN114549906A (en) | 2022-02-28 | 2022-02-28 | Improved image classification algorithm for step-by-step training of Top-k loss function |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114549906A true CN114549906A (en) | 2022-05-27 |
Family
ID=81679445
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210185010.1A Pending CN114549906A (en) | 2022-02-28 | 2022-02-28 | Improved image classification algorithm for step-by-step training of Top-k loss function |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114549906A (en) |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109034281A (en) * | 2018-07-18 | 2018-12-18 | 中国科学院半导体研究所 | The Chinese handwritten body based on convolutional neural networks is accelerated to know method for distinguishing |
CN109635947A (en) * | 2018-12-14 | 2019-04-16 | 安徽省泰岳祥升软件有限公司 | Machine reading based on answer sampling understands model training method and device |
CN110245592A (en) * | 2019-06-03 | 2019-09-17 | 上海眼控科技股份有限公司 | A method of for promoting pedestrian's weight discrimination of monitoring scene |
CN112580507A (en) * | 2020-12-18 | 2021-03-30 | 合肥高维数据技术有限公司 | Deep learning text character detection method based on image moment correction |
CN113962329A (en) * | 2021-11-15 | 2022-01-21 | 长沙理工大学 | Novel image recognition algorithm based on deep ensemble learning |
Non-Patent Citations (1)
Title |
---|
LEONARD BERRADA ET AL.: "Smooth Loss Functions for Deep Top-k Classification", ICLR 2018, 31 December 2018 (2018-12-31), pages 1-25 *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109685115B (en) | Fine-grained conceptual model with bilinear feature fusion and learning method | |
CN103605972B (en) | Non-restricted environment face verification method based on block depth neural network | |
CN107967484B (en) | Image classification method based on multi-resolution | |
CN110321967B (en) | Image classification improvement method based on convolutional neural network | |
CN109002755B (en) | Age estimation model construction method and estimation method based on face image | |
CN106022363B (en) | A kind of Chinese text recognition methods suitable under natural scene | |
CN110188827B (en) | Scene recognition method based on convolutional neural network and recursive automatic encoder model | |
CN109241995B (en) | Image identification method based on improved ArcFace loss function | |
CN103955702A (en) | SAR image terrain classification method based on depth RBF network | |
CN110059769B (en) | Semantic segmentation method and system based on pixel rearrangement reconstruction and used for street view understanding | |
CN105184298A (en) | Image classification method through fast and locality-constrained low-rank coding process | |
CN113128478B (en) | Model training method, pedestrian analysis method, device, equipment and storage medium | |
CN113283590B (en) | Defending method for back door attack | |
KR102645698B1 (en) | Method and apparatus for face recognition robust to alignment shape of the face | |
CN112101364B (en) | Semantic segmentation method based on parameter importance increment learning | |
CN112232395B (en) | Semi-supervised image classification method for generating countermeasure network based on joint training | |
CN106611156B (en) | Pedestrian identification method and system based on self-adaptive depth space characteristics | |
CN111401156A (en) | Image identification method based on Gabor convolution neural network | |
CN115995040A (en) | SAR image small sample target recognition method based on multi-scale network | |
Yu et al. | Exemplar-based recursive instance segmentation with application to plant image analysis | |
CN114626476A (en) | Bird fine-grained image recognition method and device based on Transformer and component feature fusion | |
CN110188646B (en) | Human ear identification method based on fusion of gradient direction histogram and local binary pattern | |
CN111310820A (en) | Foundation meteorological cloud chart classification method based on cross validation depth CNN feature integration | |
CN113962329A (en) | Novel image recognition algorithm based on deep ensemble learning | |
CN113297964A (en) | Video target recognition model and method based on deep migration learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||