CN114549906A - Improved image classification algorithm for step-by-step training of Top-k loss function - Google Patents

Publication number
CN114549906A
Authority
CN
China
Prior art keywords
loss function
improved
module
training
deep learning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210185010.1A
Other languages
Chinese (zh)
Inventor
邓泽林
胡钰聪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changsha University of Science and Technology
Original Assignee
Changsha University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changsha University of Science and Technology
Priority to CN202210185010.1A
Publication of CN114549906A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an image classification algorithm based on step-by-step training with an improved Top-k loss function. The system comprises an image data preprocessing module, a deep learning feature extraction module, and a system prediction output module. The image data preprocessing module preprocesses the input image data. The deep learning feature extraction module uses deep learning and is improved by step-by-step training with the improved Top-k loss function; the classifier module is a deep learning network module. The system output module processes the output of the classifier and outputs the decision result. By using step-by-step training with the improved Top-k loss function, the image classification system breaks through the accuracy limit of the deep neural network without modifying the network structure, and the accuracy is effectively improved.

Description

Improved image classification algorithm for step-by-step training of Top-k loss function
Technical Field
The invention belongs to the field of image classification, and in particular relates to a deep-learning-based image classification system trained step by step with an improved Top-k loss function.
Background
Image classification technology has developed as image classification has come to play an increasingly important role in many areas of daily life. The research problem is as follows: given a set of images that are each labeled with a single class, predict the classes of a new set of test images and measure the accuracy of the predictions. Traditional image classification algorithms, such as the k-nearest neighbors algorithm (KNN) and the support vector machine (SVM), rely on manually designed features. Such features have significant limitations and are difficult to design, so they are not adequate for some complex tasks; moreover, combining feature extraction with a classifier algorithm is complex, and high classification accuracy is difficult to achieve.
In recent years, deep learning has been increasingly applied to image classification systems. Deep learning methods have achieved a number of breakthroughs in practical applications and have gradually become an important tool for artificial intelligence. The convolutional neural network (CNN) is one such deep learning algorithm. Compared with traditional manually designed features, deep features do not require the manual design of a complex and time-consuming feature extraction algorithm; only an effective neural network model needs to be designed, and the classification accuracy is high. However, a deep neural network often cannot break through its accuracy limit without modification of the network structure, so an improved algorithm is needed to effectively improve the accuracy.
Disclosure of Invention
The invention aims to provide a step-by-step training algorithm with an improved Top-k multi-loss function for image classification. Compared with training with the original loss function, the method achieves higher classification accuracy. Because it uses deep learning, it also omits the steps of manually extracting features and manually selecting a classifier, and thereby overcomes the difficulty of feature extraction and classification in traditional methods.
The solution provided by the invention adopts a step-by-step training algorithm with an improved Top-k multi-loss function for image classification to achieve higher classification accuracy. It comprises an image data preprocessing module, a deep learning feature extraction module, and a system prediction output module.
The image data preprocessing module represents the image data as a three-dimensional tensor. Its functions are as follows. Irrelevant input data is discarded to reduce negative effects; the preprocessing generally includes changing the brightness of the original image, changing its contrast, center-cropping the image, adjusting its color, denoising, and the like. The image data is preprocessed into an input form that the image target classification module can accept; in training mode, the image target data is labeled with its categories, and the data set required by the machine learning method is selected.
The working method of image data preprocessing in training mode is as follows: for a three-channel image, the brightness and contrast of the original image are changed, the image is center-cropped, its color is adjusted, and denoising is carried out; preprocessing such as uniform upsampling, center cropping, and rotation is then applied to expand the amount of data. In training mode, the parameters of the convolutional neural network are updated to obtain a CNN capable of accurately classifying the specified data set.
The working method of image target data preprocessing in test mode is as follows: for a three-channel image, the brightness and contrast of the original image are changed, its color is adjusted, denoising is carried out, and uniform upsampling is applied; no data expansion is performed. All images are upsampled to a fixed resolution; in the following embodiments, the upsampled image resolution is 224 × 224 pixels. The final classification result is then obtained.
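The difference between the training-mode and test-mode preprocessing described above can be sketched as follows. This is a minimal, framework-free illustration and not the implementation of the invention: the function names, the nested-list image layout, and the random parameter ranges are assumptions, and the color adjustment, denoising, and rotation steps are omitted for brevity. Only the 224 × 224 target resolution comes from the text.

```python
# Illustrative preprocessing sketch. An image is a nested list where
# image[row][col] = [r, g, b] with values in 0..255. All names and
# parameter ranges are assumptions, not taken from the patent.
import random


def clamp(value):
    """Keep a channel value inside the valid 0..255 range."""
    return max(0.0, min(255.0, value))


def adjust_brightness(image, delta):
    """Add a constant offset to every channel value."""
    return [[[clamp(c + delta) for c in px] for px in row] for row in image]


def adjust_contrast(image, factor):
    """Scale every channel value around the mean intensity of the image."""
    values = [c for row in image for px in row for c in px]
    mean = sum(values) / len(values)
    return [[[clamp(mean + factor * (c - mean)) for c in px] for px in row]
            for row in image]


def upsample_nearest(image, size):
    """Resize to size x size pixels with nearest-neighbour sampling."""
    height, width = len(image), len(image[0])
    return [[list(image[r * height // size][c * width // size])
             for c in range(size)] for r in range(size)]


def center_crop(image, size):
    """Cut a size x size window out of the middle of the image."""
    top = (len(image) - size) // 2
    left = (len(image[0]) - size) // 2
    return [row[left:left + size] for row in image[top:top + size]]


def preprocess(image, training, size=224, rng=random):
    """Test mode is deterministic; training mode adds random augmentation."""
    if training:
        image = adjust_brightness(image, rng.uniform(-20, 20))
        image = adjust_contrast(image, rng.uniform(0.8, 1.2))
        # Upsample slightly larger, then crop back to the target resolution.
        image = center_crop(upsample_nearest(image, size + 32), size)
    else:
        image = upsample_nearest(image, size)
    return image
```

Calling preprocess(image, training=False) on a 32 × 32 CIFAR-10 image returns a 224 × 224 image with no random changes, while training=True applies the hypothetical brightness and contrast jitter before resizing and cropping.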
The image target classification module comprises a deep learning feature extraction module and a loss function step-by-step training module, and a supervised learning algorithm is used during network training. The CNN feature extraction module and the classifier module are trained with the improved Top-k loss function. Training a model with a lower Top-2 error is easier than improving Top-1 classification, and the fault tolerance of the model can be effectively improved as long as the top two predicted categories contain the true label. The base classifiers obtained with this joint training strategy are used for the ensemble learning we propose, and several CNN models are each trained with the strategy. In the first step, we train the CNN with a cross-entropy loss function; the network is initialized from an ImageNet pre-trained model and fine-tuned on the data set. The cross-entropy loss function is designed to optimize the top-1 loss, so training with it yields a network with high top-1 accuracy. In the second step, the network is fine-tuned with the top-k loss function, which is designed so that the correct label falls within the top k predictions; the model trained in the first step provides the initial weights for this training. The result is an optimized network with a higher top-2 accuracy and an unchanged top-1 accuracy, and the recognition capability of the model is improved. The network is thus trained by the two loss functions together to form a unified network.
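As a reference for the two losses named above, the following LaTeX sketch may help. The notation is an assumption: s denotes the vector of scores over C classes, y the true label, and the subscript [k] selects the k-th largest entry. The cross-entropy form is the standard one. The top-k form is the hinge-style top-k loss found in the literature and is shown only for illustration; it is not necessarily the formula used by the invention.

```latex
% Cross-entropy (softmax) loss for scores s and true label y
\mathcal{L}_{\mathrm{CE}}(s, y) = -\log \frac{\exp(s_y)}{\sum_{j=1}^{C} \exp(s_j)}

% Hinge-style top-k loss: (s_{\setminus y})_{[k]} is the k-th largest
% score among the wrong classes
\mathcal{L}_{k}(s, y) = \max\bigl\{0,\; 1 + (s_{\setminus y})_{[k]} - s_y\bigr\}
```

With k = 1 the second expression reduces to the multi-class hinge loss. With k = 2 it is zero whenever the true-class score exceeds the second-largest wrong-class score by at least the margin of 1, so that at most one wrong class can rank above the true class.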
The system output module processes the output of the classifier and outputs the decision result.
Compared with the traditional cross-entropy loss function, the improved image classification algorithm for step-by-step training with the Top-k loss function disclosed by the invention achieves higher Top-1 and Top-2 classification accuracy while reducing the complexity of manual calculation. Without changing the model architecture, both Top-1 and Top-2 accuracy are improved relative to the cross-entropy loss function. Compared with the traditional training method, the algorithm is more robust and performs better on the CIFAR-10 data set.
Drawings
FIG. 1 is a flow chart of an image classification algorithm using deep learning;
FIG. 2 is a flow chart of image classification using two loss functions for step-wise training;
FIG. 3 is a graph comparing the accuracy of the conventional training method with that of the proposed step-by-step training with the Top-k loss function.
Detailed Description
The present invention is described in further detail below with reference to the attached drawings.
First, a data set is selected. We use the CIFAR-10 image classification data set, a benchmark data set widely used for image classification. It contains 60000 images divided into 10 categories (airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck), with 50000 images for training and 10000 images for testing; all images are 32 × 32 pixels. The data set details are shown in the following table:
TABLE 1 CIFAR-10 data set information
Data set | Training set | Validation set | Test set | Categories
CIFAR-10 | 50000 | 0 | 10000 | 10
The following deep learning models are selected for testing:
Inception-v3: Inception-v3 has strong image feature extraction and classification performance and is a widely used image recognition model. It consists of symmetric and asymmetric building blocks, including convolutional, average pooling, max pooling, padding, and fully connected layers. Batch normalization is used extensively throughout the model and is applied to activation inputs.
DPN92: DPN uses the higher-order RNN (HORNN) structure to link DenseNet and ResNet. It was shown that DenseNet can extract new features from previous layers, whereas ResNet essentially reuses the features extracted by previous layers. By combining the advantages of the two structures, the DPN network can effectively improve classification efficiency.
ResNet: ResNet uses identity mappings to pass the output of an earlier layer directly to later layers. As the depth of the network increases, the training error does not rise, and the vanishing gradient problem is alleviated.
The models were trained in the experimental environment shown in the table below.
TABLE 2 Experimental environment
[Table provided as images in the original publication: Figure BDA0003522795040000041, Figure BDA0003522795040000051]
The model training parameters are shown in the following table. The batch size is adjusted according to the number of model parameters to reduce GPU memory usage and model training time.
TABLE 3 Model initialization parameters
Parameter | Value
Learning rate | 0.1
Learning rate decay | 0.1
Momentum | 0.9
Batch size | 32/24
Training epochs | 200
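One way the parameters in the table above could fit together is sketched below, under stated assumptions. The learning rate decay of 0.1 is assumed to be a multiplicative factor, and the MILESTONES epochs are hypothetical, because the source does not state when the decay is applied. The update rule is ordinary SGD with momentum, which is an assumption about the optimizer.

```python
# Step-decay schedule plus one SGD-with-momentum update, using the table values.
# MILESTONES is a hypothetical choice: the source does not say when decay occurs.
BASE_LR = 0.1        # learning rate
DECAY = 0.1          # learning rate decay (assumed multiplicative)
MOMENTUM = 0.9       # momentum
EPOCHS = 200         # number of training rounds
MILESTONES = (100, 150)


def learning_rate(epoch):
    """Multiply the base rate by DECAY once for every milestone already passed."""
    passed = sum(1 for m in MILESTONES if epoch >= m)
    return BASE_LR * DECAY ** passed


def sgd_momentum_step(weight, grad, velocity, lr):
    """velocity <- momentum * velocity + grad; weight <- weight - lr * velocity."""
    velocity = MOMENTUM * velocity + grad
    return weight - lr * velocity, velocity
```

Under these assumptions the rate stays at 0.1 until the first milestone and is then reduced tenfold at each milestone for the remainder of the 200 epochs.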
During training, the image classification algorithm for step-by-step training with the improved Top-k loss function is used, as shown in FIG. 2. Training a model with a lower Top-2 error is easier than improving Top-1 classification, and the fault tolerance of the model can be effectively improved as long as the top two predicted classes contain the true label. The base classifiers obtained with this joint training strategy are used for the ensemble learning we propose, and several CNN models are each trained with the strategy. In the first step, we train the CNN with a cross-entropy loss function; the network is initialized from an ImageNet pre-trained model and fine-tuned on the data set. The cross-entropy loss function is designed to optimize the top-1 loss, so training with it yields a network with high top-1 accuracy. In the second step, we fine-tune with the top-k loss function, which is designed so that the correct label falls within the top k predictions, using the model trained in the first step as the initial weights. This yields an optimized network with a higher top-2 accuracy and an unchanged top-1 accuracy, and the recognition capability of the model is improved. Because of label ambiguity and feature uncertainty, however, no method recognizes every image correctly. Beyond improving the efficiency of feature extraction, the correct label often lies between the top-1 and top-k predictions, so the improved top-2 classification accuracy lets the top-2 labels be extracted and used effectively. The network is trained by the two loss functions together to form a unified network.
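The two-step schedule described above can be illustrated on a toy problem. The sketch below is an assumption-laden stand-in, not the method of the invention: a small linear classifier replaces the CNN, zero initialization replaces the ImageNet pre-trained weights, the second step uses a hinge-style top-k loss chosen for illustration, and the data and hyperparameters are made up. What it does show is the control flow: train with cross-entropy first, then fine-tune the resulting weights with a top-k loss.

```python
# Two-step schedule on a toy linear classifier (a stand-in for the CNN).
# Step 1 minimises cross-entropy; step 2 fine-tunes the same weights with a
# top-k hinge loss. Loss form, toy data and hyperparameters are assumptions.
import math


def scores(weights, x):
    """Linear class scores s_c = w_c . x for every class c."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def cross_entropy_grad(s, y):
    """Gradient of -log softmax(s)[y] with respect to the scores."""
    m = max(s)
    exps = [math.exp(v - m) for v in s]
    total = sum(exps)
    return [e / total - (1.0 if c == y else 0.0) for c, e in enumerate(exps)]


def topk_hinge_grad(s, y, k):
    """Subgradient of max(0, 1 + k-th largest wrong score - s[y])."""
    wrong = sorted((c for c in range(len(s)) if c != y),
                   key=lambda c: s[c], reverse=True)
    rival = wrong[k - 1]
    grad = [0.0] * len(s)
    if 1.0 + s[rival] - s[y] > 0.0:
        grad[rival], grad[y] = 1.0, -1.0
    return grad


def train(weights, data, grad_fn, epochs, lr):
    """Plain SGD; returns new weights and leaves the input untouched."""
    weights = [row[:] for row in weights]
    for _ in range(epochs):
        for x, y in data:
            g = grad_fn(scores(weights, x), y)
            for c, gc in enumerate(g):
                for i, xi in enumerate(x):
                    weights[c][i] -= lr * gc * xi
    return weights


# Toy three-class data: (features, label); the last feature acts as a bias.
DATA = [([1.0, 0.0, 1.0], 0), ([0.9, 0.1, 1.0], 0),
        ([0.0, 1.0, 1.0], 1), ([0.1, 0.9, 1.0], 1),
        ([-1.0, -1.0, 1.0], 2), ([-0.9, -1.1, 1.0], 2)]

init = [[0.0] * 3 for _ in range(3)]
# Step 1: cross-entropy training from the initial weights.
step1 = train(init, DATA, cross_entropy_grad, epochs=200, lr=0.1)
# Step 2: top-2 fine-tuning that starts from the step-1 weights.
step2 = train(step1, DATA, lambda s, y: topk_hinge_grad(s, y, k=2),
              epochs=20, lr=0.01)
```

The second call to train receives step1 rather than init, which mirrors the use of the first-step model as the initial weights for top-k training; the smaller learning rate in step 2 is a hypothetical choice intended to keep the top-1 behaviour of the first step largely intact.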
After training, test results were obtained for the three deep learning models ResNet, DPN92, and Inception-v3. The results with the traditional cross-entropy loss function are shown in the following table:
TABLE 3 Classification results on the CIFAR-10 data set using the cross-entropy loss function
Method | Accuracy (%)
ResNet18 | 96.50
DPN92 | 97.56
Inception-v3 | 97.19
After training, test results were obtained for the three deep learning models ResNet, DPN92, and Inception-v3. The results of step-by-step training with the proposed Top-k loss function are shown in the following table:
TABLE 4 Comparison of classification results using different loss functions with the Inception-v3, DPN92, and ResNet18 models on the CIFAR-10 data set
[Table provided as an image in the original publication: Figure BDA0003522795040000061]
The proposed image classification algorithm for step-by-step training with the improved Top-k loss function preserves Top-1 accuracy while seeking to reduce the Top-2 generalization error of the model. By constraining the Top-k loss value, it emphasizes the importance of multiple output values of the model and of the Top-k loss, which further improves classification performance. The two loss functions were compared on the CIFAR-10 data set. The base classifier using the top-k loss function outperforms the base classifier using the cross-entropy loss function in both top-1 and top-2 accuracy, so the classification performance of the model improves after the top-k loss function is applied. Specifically, on the CIFAR-10 data set the top-1 accuracy of the model improves by about 0.03% to 0.2% and the top-2 accuracy by about 0.2% to 0.6%. FIG. 3 shows the stability of the top-k loss function on the CIFAR-10 data set.
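For readers unfamiliar with the top-1 and top-2 metrics compared above, the following sketch shows how such accuracies are commonly computed. The function name and the example scores are made up for illustration and are not data from the experiments.

```python
def topk_accuracy(score_rows, labels, k):
    """Share of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(score_rows, labels):
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Made-up scores for four samples over three classes (not data from the patent).
SCORES = [[0.7, 0.2, 0.1],
          [0.1, 0.3, 0.6],
          [0.4, 0.5, 0.1],
          [0.2, 0.3, 0.5]]
LABELS = [0, 1, 0, 0]

top1 = topk_accuracy(SCORES, LABELS, k=1)   # 0.25: only the first sample is a top-1 hit
top2 = topk_accuracy(SCORES, LABELS, k=2)   # 0.75: the second and third become hits
```

Top-2 accuracy can never be lower than top-1 accuracy on the same predictions, which is why a sample counts as correct under the top-2 criterion as long as the top two predicted classes contain the true label.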
It will be understood by those skilled in the art that the foregoing is only an exemplary embodiment of the present invention, and is not intended to limit the invention to the particular forms disclosed, since various modifications, substitutions and improvements within the spirit and scope of the invention are possible and within the scope of the appended claims.

Claims (4)

1. An improved image classification algorithm for step-by-step training of a Top-k loss function, characterized by comprising the following modules:
(1) an image data preprocessing module;
(2) a deep learning feature extraction module;
(3) a system prediction output module.
2. The improved image classification algorithm for step-by-step training of a Top-k loss function according to claim 1, characterized in that the image data preprocessing module of module (1) preprocesses the input image data; the deep learning feature extraction module of module (2) uses deep learning and is improved by step-by-step training with the improved Top-k loss function; the classifier module is a fully connected layer of the deep learning network; and the system prediction output module of module (3) processes the output of the classifier and outputs the decision result.
3. The improved image classification algorithm for step-by-step training of a Top-k loss function according to claim 1, characterized in that step-by-step training with the improved Top-k loss function can replace the cross-entropy loss function to construct a better network with improved Top-1 accuracy and Top-2 accuracy. The traditional cross-entropy loss function is expressed mathematically as:
[Formula provided as an image in the original publication: Figure FDA0003522795030000011]
The Top-k loss depends on whether the true label y is among the top-k predictions, which is equivalent to comparing the top k predictions with the true label. The Top-k loss function is expressed mathematically as:
[Formula provided as an image in the original publication: Figure FDA0003522795030000012]
4. The improved image classification algorithm for step-by-step training of a Top-k loss function according to claim 1, characterized in that the feature extraction capability of the model is improved, a better classification effect is obtained, and the robustness of the model is improved without modifying the network structure.
CN202210185010.1A 2022-02-28 2022-02-28 Improved image classification algorithm for step-by-step training of Top-k loss function Pending CN114549906A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210185010.1A CN114549906A (en) 2022-02-28 2022-02-28 Improved image classification algorithm for step-by-step training of Top-k loss function

Publications (1)

Publication Number Publication Date
CN114549906A true CN114549906A (en) 2022-05-27

Family

ID=81679445

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210185010.1A Pending CN114549906A (en) 2022-02-28 2022-02-28 Improved image classification algorithm for step-by-step training of Top-k loss function

Country Status (1)

Country Link
CN (1) CN114549906A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109034281A (en) * 2018-07-18 2018-12-18 中国科学院半导体研究所 The Chinese handwritten body based on convolutional neural networks is accelerated to know method for distinguishing
CN109635947A (en) * 2018-12-14 2019-04-16 安徽省泰岳祥升软件有限公司 Machine reading based on answer sampling understands model training method and device
CN110245592A (en) * 2019-06-03 2019-09-17 上海眼控科技股份有限公司 A method of for promoting pedestrian's weight discrimination of monitoring scene
CN112580507A (en) * 2020-12-18 2021-03-30 合肥高维数据技术有限公司 Deep learning text character detection method based on image moment correction
CN113962329A (en) * 2021-11-15 2022-01-21 长沙理工大学 Novel image recognition algorithm based on deep ensemble learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LEONARD BERRADA ET AL.: "Smooth Loss Functions for Deep Top-k Classification", ICLR 2018, 31 December 2018 (2018-12-31), pages 1 - 25 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination