CN113077388B - Data-augmented deep semi-supervised extreme learning machine image classification method and system - Google Patents


Info

Publication number: CN113077388B (granted); other version: CN113077388A
Application number: CN202110448092.XA
Authority: CN (China)
Original language: Chinese (zh)
Legal status: Active (granted)
Inventors: 曾宇骏, 呼晓畅, 徐昕, 方强, 周思航
Original and current assignee: National University of Defense Technology
Application filed by National University of Defense Technology
Prior art keywords: image, data, training, network model, label

Classifications

    • G06T 3/4007: Scaling of whole images or parts thereof based on interpolation, e.g. bilinear interpolation
    • G06F 18/214: Generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F 18/24: Classification techniques
    • G06F 18/253: Fusion techniques of extracted features
    • G06N 3/045: Neural network architectures; combinations of networks
    • G06N 3/08: Neural network learning methods
    • G06V 10/40: Extraction of image or video features
    • Y02T 10/40: Engine management systems


Abstract

The invention discloses a data-augmented deep semi-supervised extreme learning machine image classification method and system. The method extracts features from training images with a deep convolutional network model; fine-tunes the model on a small set of manually labeled data and generates pseudo labels for the unlabeled training images; fuses the high-level semantic features extracted from the training images with low-level shallow structural features to obtain fused image features; augments the fused image features and the training-image labels by random linear interpolation; and trains a single-hidden-layer feedforward neural network on the augmented features and labels, which replaces the fully connected layer of the deep convolutional network model to yield the final image classification network. The method requires little manual labeling, resists noise robustly, classifies and recognizes well, and extends readily to other tasks.

Description

Data-augmented deep semi-supervised extreme learning machine image classification method and system
Technical Field
The invention relates to the technical field of image classification and target recognition, and in particular to a data-augmented deep semi-supervised extreme learning machine image classification method and system.
Background
Most visual target recognition methods with strong performance rely on deep learning, and supervised training of a deep neural network model usually requires large-scale manually labeled data. However, manually labeling large amounts of data is expensive and, in some application scenarios, impractical. Recent research has therefore focused on deep semi-supervised learning, which improves visual classification and recognition performance by combining a small amount of high-quality labeled data with a large amount of easily obtained unlabeled data. Many deep semi-supervised learning techniques have been proposed, including entropy minimization, pseudo-labeling, consistency regularization, pre-training, and generative-model-based methods.
Among them, three families of techniques are most representative:
The first is pseudo-label-based deep semi-supervised learning. A model trained on a small amount of manually labeled data predicts class labels for the unlabeled data; these predictions serve as pseudo labels, the pseudo-labeled data are added to the manually labeled data to retrain the image class prediction model, the pseudo labels are refreshed, and the process iterates until an acceptable classification accuracy is reached. However, the labels the model predicts can be noisy, and the data itself may also carry noise, so models trained on such samples often degrade, and their classification accuracy falls short of practical requirements.
The second is deep semi-supervised learning based on consistency regularization, which penalizes inconsistent predictions on unlabeled data under different perturbations. Because it relies on multiple predictions, it also suffers from overfitting to noisy labels. Various refinements, such as generating higher-quality pseudo labels or more effective perturbations, reduce this overfitting, but the improvement is clear only at low noise intensity; at high noise intensity the results remain unacceptable.
The third decomposes feature representation and classification into two independent stages: the deep neural network is pre-trained through an unsupervised auxiliary task to find a compressed feature representation of the input before the classifier is learned. Pursuing a good representation model and a good classifier separately helps analyze and mitigate overfitting. However, the unsupervised auxiliary task lacks label guidance, so the learned representation risks being inconsistent with the final task. For example, an autoencoder reconstructs every pixel of the original image to guide feature learning, while the final classification task may depend on only a small fraction of those pixels. In addition, this approach makes it difficult to configure the classifier effectively, uses sample information inefficiently, and trains slowly.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: overcoming the problems of the prior art, to provide a data-augmented deep semi-supervised extreme learning machine image classification method and system that require little manual labeling, resist noise robustly, classify and recognize well, and extend readily to other tasks.
In order to solve the above technical problem, the invention adopts the following technical scheme:
A data-augmented deep semi-supervised extreme learning machine image classification method comprises the following steps for learning and training an image classification network model:
s1, extracting features of the training image by adopting a depth convolution network model; training the deep convolutional network model aiming at part of artificial label data of the training image to realize fine tuning optimization, and predicting and generating a corresponding pseudo label for the label-free training image through the fine tuning optimized deep convolutional network model;
s2, fusing the high-level semantic features extracted from the training images with the low-level shallow structure features to obtain fused image features;
s3, augmenting the fusion image characteristics and labels of the training image by adopting a random linear interpolation technology;
s4, randomly dividing the augmented fusion image features and labels into batches, sequentially inputting the batches into the single hidden layer feedforward neural network, updating the weight of a network output layer, and repeating the steps until the training of the single hidden layer feedforward neural network is completed; and removing the full connection layer in the deep convolutional network model, and connecting the full connection layer with the trained single hidden layer feedforward neural network to form an image classification and identification network model for realizing the end-to-end classification and identification of the corresponding image target.
Optionally, the deep convolutional network model in step S1 is a pre-trained 13-CNN deep convolutional network model.
Optionally, the training loss function of the 13-CNN deep convolutional network model is:

$$l_{cost} = \frac{1}{l}\sum_{i=1}^{l} H\left(y_i, p(y_i|x_i)\right) + \frac{1}{n-l}\sum_{i=l+1}^{n} H\left(\tilde{y}_i, p(y_i|x_i)\right) + \lambda_0 R_0 + \lambda_1 R_1 + \lambda_2 H\left(p(y|x)\right)$$

where $l_{cost}$ is the training loss, $\lambda_0$, $\lambda_1$ and $\lambda_2$ are weight coefficients, $R_0$ is the consistency regularization term, $R_1$ is the cross-entropy regularization term, $y_i$ is the sample label, $\tilde{y}_i$ is the label the model self-estimates for unlabeled data, $x_i$ is the $i$-th sample, $n$ is the number of samples, $p(y_i|x_i)$ is the predicted output of the model, $l$ is the number of labeled samples, and:

$$R_1 = \sum_{c=1}^{C} p_c \log\frac{p_c}{\bar{p}_c}, \qquad \bar{p}_c = \frac{1}{n}\sum_{i=1}^{n} p(y=c\,|\,x_i)$$

where $C$ is the number of sample classes, $p_c$ is the average (prior) class-marginal probability, $\bar{p}_c$ is the class-marginal probability predicted by the model, $p(y|x)$ is the sample conditional probability output by the model, and $H$ is the entropy.
Optionally, the fusion in step S2 is a feature-vectorized cascade (concatenation) fusion, with the functional expression:

$$f_c(x) = concat\left(ReLU\left(GAP(f_s(x))\right),\; ReLU\left(GAP(f_h(x))\right)\right)$$

where $f_c(x)$ is the fused image feature after cascade fusion, $concat$ denotes vector concatenation, $ReLU$ is the linear rectification function, $GAP$ is the global average pooling function, and $f_s(x)$, $f_h(x)$ are respectively the low-level shallow structural feature and the high-level semantic feature output by the deep neural network.
Optionally, the functional expression for augmenting the fused image features and the training-image labels by random linear interpolation in step S3 is:

$$\tilde{X} = \lambda X_i + (1-\lambda) X_j, \qquad \tilde{Y} = \lambda Y_i + (1-\lambda) Y_j$$

where $\tilde{X}$ is the interpolated image feature matrix, $X_j$ and $X_i$ are the image feature matrices before interpolation, $\tilde{Y}$ is the interpolated label matrix, $Y_j$ and $Y_i$ are the label matrices before interpolation, and $\lambda$ is a weight vector sampled from the beta distribution.
Optionally, the weight vector $\lambda$ sampled from the beta distribution follows the density:

$$f(x; \alpha, \gamma) = \frac{x^{\alpha-1}(1-x)^{\gamma-1}}{\int_0^1 u^{\alpha-1}(1-u)^{\gamma-1}\,du}, \qquad 0 \le x \le 1$$

where $f(x;\alpha,\gamma)$ is the density from which $\lambda$ is sampled, $\alpha, \gamma > 0$ are the control parameters of the beta distribution, and $x$ and $u$ are function variables.
Optionally, the functional expression for updating the output-layer weights of the network in step S4 is:

$$K_{k+1} = K_k + \tilde{H}_{k+1}^T \tilde{H}_{k+1}$$
$$\beta_{k+1} = \beta_k + K_{k+1}^{-1}\tilde{H}_{k+1}^T\left(\tilde{T}_{k+1} - \tilde{H}_{k+1}\beta_k\right)$$

where $K_{k+1}$ and $K_k$ are weight matrices, $\beta_{k+1}$ and $\beta_k$ are the iterated solution parameters, $\tilde{H}_{k+1}$ is the hidden-layer output matrix of the single-hidden-layer feedforward neural network for the input $(k+1)$-th batch of augmented data and labels $(\tilde{X}_{k+1}, \tilde{Y}_{k+1})$, and $\tilde{T}_{k+1}$ is the label matrix, with initial values:

$$K_0 = \tilde{H}_0^T \tilde{H}_0 + \frac{I}{c}, \qquad \beta_0 = K_0^{-1}\tilde{H}_0^T \tilde{T}_0$$

where $\tilde{H}_0$ is the hidden-layer output matrix of the single-hidden-layer feedforward neural network for the input initial batch of augmented data and labels $(\tilde{X}_0, \tilde{Y}_0)$, $c$ is the weight coefficient, and $I$ is the identity matrix.
Optionally, step S4 is followed by: inputting an image to be recognized into the image classification network model, obtaining the classification result for that image, and outputting it.
In addition, the invention also provides a data-augmented deep semi-supervised extreme learning machine image classification system, comprising an interconnected microprocessor and memory, wherein the microprocessor is programmed or configured to execute the steps of the data-augmented deep semi-supervised extreme learning machine image classification method.
Furthermore, the invention also provides a computer-readable storage medium storing a computer program programmed or configured to execute the data-augmented deep semi-supervised extreme learning machine image classification method.
Compared with the prior art, the invention has two main advantages:
1. The invention follows a deep semi-supervised learning route, training the image classification network on a small amount of labeled data and a large amount of unlabeled data, which greatly reduces the cost of manual labeling while preserving classification accuracy. It also improves the existing deep semi-supervised framework by decoupling feature learning from classifier training and optimizing each separately, so the network learns to extract image features better oriented to the specific classification task. The classifier itself is trained on the extreme learning machine principle, further improving the generalization of classification and recognition.
2. When the classifier is trained with the extreme learning machine, a data augmentation mechanism is integrated into the objective function of the online extreme learning machine, so the trained classifier tolerates noise in the training data and its labels, effectively improving the robustness of classification and recognition. Moreover, the data-augmented online extreme learning method is not limited to the deep semi-supervised framework: it also applies to classifier training in supervised learning tasks, giving it a degree of task extensibility.
Drawings
FIG. 1 is a schematic diagram of a basic flow of a method according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of an implementation principle of the method according to the embodiment of the present invention.
FIG. 3 compares the performance of the method of the embodiment of the invention with related representative methods on the standard image classification benchmarks CIFAR-10 and CIFAR-100.
FIG. 4 compares the classification accuracy of the method of the embodiment of the invention with one group of representative methods on CIFAR-10 and CIFAR-100.
FIG. 5 compares the classification accuracy of the method of the embodiment of the invention with another group of representative methods on CIFAR-10 and CIFAR-100.
Detailed Description
The invention is further described below with reference to the drawings and specific preferred embodiments of the description, without thereby limiting the scope of protection of the invention.
As shown in FIG. 1 and FIG. 2, the data-augmented deep semi-supervised extreme learning machine image classification method of this embodiment comprises the following steps for learning and training an image classification network model:
S1, extracting features from the training images with a deep convolutional network model; training the model on the manually labeled portion of the training data to fine-tune it, and using the fine-tuned model to predict a pseudo label for each unlabeled training image;
S2, fusing the high-level semantic features extracted from the training images with the low-level shallow structural features to obtain fused image features;
S3, augmenting the fused image features and the training-image labels by random linear interpolation;
S4, randomly dividing the augmented fused features and labels into batches, feeding the batches into a single-hidden-layer feedforward neural network in sequence, and updating the network's output-layer weights until training is complete; then removing the fully connected layer of the deep convolutional network model and attaching the trained single-hidden-layer feedforward network in its place, forming an image classification network that classifies and recognizes image targets end to end.
Step S1 extracts deep convolutional features and generates pseudo labels: a deep convolutional neural network with basic classification capability is first trained on the small amount of manually labeled image data, then used to perform a preliminary classification of the large amount of unlabeled images, assigning each a pseudo label. The original image data thus become a data set containing both manually labeled and pseudo-labeled images, on which the deep convolutional network is retrained. This converts the original semi-supervised problem into a supervised learning process, and the extracted deep convolutional features acquire task relevance.
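The pseudo-label assignment in step S1 can be sketched in a few lines (a minimal NumPy illustration; the function name `generate_pseudo_labels` and the optional confidence threshold are assumptions for clarity, not part of the patent, which simply assigns the model's predicted class to each unlabeled image):

```python
import numpy as np

def generate_pseudo_labels(probs, threshold=0.0):
    """Assign the arg-max class as a pseudo label to each unlabeled sample.

    probs     : (n, C) softmax outputs of the fine-tuned network
    threshold : samples whose top probability falls below it are dropped
    Returns (kept_indices, pseudo_labels).
    """
    confidence = probs.max(axis=1)                 # top softmax score per sample
    keep = np.where(confidence >= threshold)[0]    # confident samples only
    return keep, probs[keep].argmax(axis=1)

# toy example: three unlabeled samples, C = 2 classes
probs = np.array([[0.90, 0.10],
                  [0.40, 0.60],
                  [0.55, 0.45]])
idx, labels = generate_pseudo_labels(probs, threshold=0.5)
```

With `threshold=0.5` all three samples are kept and receive the classes 0, 1 and 0; raising the threshold discards low-confidence pseudo labels, which the Background identifies as the main source of model degradation.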
As an alternative embodiment, the deep convolutional network model in step S1 is a pre-trained 13-CNN deep convolutional network model.
In this embodiment, the training loss function of the 13-CNN deep convolutional network model is:

$$l_{cost} = \frac{1}{l}\sum_{i=1}^{l} H\left(y_i, p(y_i|x_i)\right) + \frac{1}{n-l}\sum_{i=l+1}^{n} H\left(\tilde{y}_i, p(y_i|x_i)\right) + \lambda_0 R_0 + \lambda_1 R_1 + \lambda_2 H\left(p(y|x)\right)$$

where $l_{cost}$ is the training loss, $\lambda_0$, $\lambda_1$ and $\lambda_2$ are weight coefficients, $R_0$ is the consistency regularization term, $R_1$ is the cross-entropy regularization term, $y_i$ is the sample label, $\tilde{y}_i$ is the label the model self-estimates for unlabeled data, $x_i$ is the $i$-th sample, $n$ is the number of samples, $p(y_i|x_i)$ is the predicted output of the model, $l$ is the number of labeled samples, and:

$$R_1 = \sum_{c=1}^{C} p_c \log\frac{p_c}{\bar{p}_c}, \qquad \bar{p}_c = \frac{1}{n}\sum_{i=1}^{n} p(y=c\,|\,x_i)$$

where $C$ is the number of sample classes, $p_c$ is the average (prior) class-marginal probability, $\bar{p}_c$ is the class-marginal probability predicted by the model, $p(y|x)$ is the sample conditional probability output by the model, and $H$ is the entropy.
Step S2 performs multi-level deep convolutional feature fusion: from the multi-level convolutional features extracted by the deep network, a shallow feature reflecting image structure information and a deep feature reflecting image category semantics are selected and fused by feature-vectorized cascade (concatenation). Specifically, in this embodiment the fused image feature is obtained as:

$$f_c(x) = concat\left(ReLU\left(GAP(f_s(x))\right),\; ReLU\left(GAP(f_h(x))\right)\right)$$

where $f_c(x)$ is the fused image feature after cascade fusion, $concat$ denotes vector concatenation, $ReLU$ is the linear rectification function, $GAP$ is the global average pooling function, and $f_s(x)$, $f_h(x)$ are respectively the low-level shallow structural feature and the high-level semantic feature output by the deep neural network. In this embodiment, taking 13-CNN as an example, the outputs of the 3rd and 18th convolutional layers may be selected and concatenated after global average pooling and linear rectification.
The global average pooling function $GAP$ is:

$$GAP(X)_k = \frac{1}{WH}\sum_{i=1}^{W}\sum_{j=1}^{H} x_{i,j,k}$$

The linear rectification function $ReLU$ is:

$$ReLU(x) = \max(0, x)$$

where $W$, $H$ and $C$ are respectively the width, height and number of channels of the image data $X$ input to the global average pooling function, $x_{i,j,k}$ is the data value at row $j$, column $i$ of the $k$-th channel of $X$, and $\max$ takes the maximum.
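The cascade fusion can be written directly from these formulas. Below is a minimal NumPy illustration; the feature-map shapes are arbitrary stand-ins for the layer-3 and layer-18 outputs, not values from the patent:

```python
import numpy as np

def gap(x):
    """Global average pooling: a (W, H, C) feature map -> a (C,) vector."""
    return x.mean(axis=(0, 1))

def relu(x):
    """Linear rectification: ReLU(x) = max(0, x), elementwise."""
    return np.maximum(0.0, x)

def fuse(f_s, f_h):
    """f_c(x) = concat(ReLU(GAP(f_s)), ReLU(GAP(f_h)))."""
    return np.concatenate([relu(gap(f_s)), relu(gap(f_h))])

f_s = np.ones((8, 8, 4))    # stand-in shallow (structural) feature map
f_h = -np.ones((2, 2, 16))  # stand-in deep (semantic) feature map
fused = fuse(f_s, f_h)      # length 4 + 16 = 20; negative means rectified to 0
```

Because pooling precedes concatenation, the fused vector length is simply the sum of the two channel counts, independent of the spatial sizes of the maps.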
In step S3 of this embodiment, for the data-augmented online extreme learning classification, the image features obtained in steps S1 and S2 and their corresponding labels are augmented by random linear interpolation, specifically using the existing mixup method. The functional expression is:

$$\tilde{X} = \lambda X_i + (1-\lambda) X_j, \qquad \tilde{Y} = \lambda Y_i + (1-\lambda) Y_j$$

where $\tilde{X}$ is the interpolated image feature matrix, $X_j$ and $X_i$ are the image feature matrices before interpolation, $\tilde{Y}$ is the interpolated label matrix, $Y_j$ and $Y_i$ are the label matrices before interpolation, and $\lambda$ is a weight vector sampled from the beta distribution.
In this embodiment, the weight vector $\lambda$ sampled from the beta distribution follows the density:

$$f(x; \alpha, \gamma) = \frac{x^{\alpha-1}(1-x)^{\gamma-1}}{\int_0^1 u^{\alpha-1}(1-u)^{\gamma-1}\,du}, \qquad 0 \le x \le 1$$

where $f(x;\alpha,\gamma)$ is the density from which $\lambda$ is sampled, $\alpha, \gamma > 0$ are the control parameters of the beta distribution, and $x$ and $u$ are function variables. In this embodiment, $\alpha = \gamma$.
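The mixup augmentation of step S3, with per-sample weights drawn from Beta(α, γ) and α = γ as in this embodiment, can be sketched as follows (the shuffled pairing via a random permutation is the standard mixup convention and an assumption here):

```python
import numpy as np

def mixup(X, Y, alpha=1.0, rng=None):
    """Random linear interpolation: blend each sample with a shuffled
    partner using per-sample weights lambda ~ Beta(alpha, alpha)."""
    if rng is None:
        rng = np.random.default_rng(0)
    n = X.shape[0]
    lam = rng.beta(alpha, alpha, size=(n, 1))  # weight vector from the beta distribution
    perm = rng.permutation(n)                  # shuffled partners (the X_j / Y_j role)
    X_mix = lam * X + (1.0 - lam) * X[perm]
    Y_mix = lam * Y + (1.0 - lam) * Y[perm]
    return X_mix, Y_mix

X = np.arange(12, dtype=float).reshape(4, 3)  # 4 fused feature vectors
Y = np.eye(2)[[0, 1, 0, 1]]                   # their one-hot labels
X_mix, Y_mix = mixup(X, Y)
```

Because each augmented label row is a convex combination of two one-hot rows, every row of `Y_mix` still sums to 1, i.e. the soft labels remain valid class distributions.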
In this embodiment, step S4 replaces the fully connected layer at the end of the deep convolutional network model with a single-hidden-layer feedforward neural network, combines data augmentation with the extreme learning machine, defines the objective function shown below, and trains the single-hidden-layer feedforward neural network accordingly:

$$\min_{\beta}\; \frac{1}{2}\left\|\beta\right\|_F^2 + \frac{c}{2}\left\|\tilde{H}\beta - \tilde{T}\right\|_F^2$$

where $\lambda$ is the weight vector sampled from the beta distribution, $\tilde{H}$ is the hidden-layer output matrix of the single-hidden-layer feedforward neural network on the augmented data, $\beta$ is the output-layer weight matrix of the network to be learned, $F$ denotes the Frobenius norm, and $c$ is a weight coefficient. Accordingly, the output weights of the single-hidden-layer feedforward neural network are given by:

$$\beta^* = \left(\tilde{H}^T\tilde{H} + \frac{I}{c}\right)^{-1}\tilde{H}^T\tilde{T}$$

where $\beta^*$ is the output weight matrix of the single-hidden-layer feedforward neural network, $\tilde{H}$ is the data matrix, consisting of $N$ samples of feature dimension $d$, $c$ is the weight coefficient, $I$ is the identity matrix, and $\tilde{T}$ is the label matrix, with:

$$\tilde{T} = \lambda Y_i + (1-\lambda) Y_j$$

where $\lambda$ is the weight vector sampled from the beta distribution, $Y_i$ is the original label matrix, and $Y_j$ is a shuffled label matrix.
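The closed-form solution can be exercised with a self-contained sketch of an extreme learning machine: the hidden layer's input weights are random and fixed, and only the output weights are solved in ridge-regression form. The tanh activation, hidden size, and toy data below are illustrative assumptions, not values from the patent:

```python
import numpy as np

def elm_train(X, T, n_hidden=50, c=1e3, seed=0):
    """Single-hidden-layer feedforward net trained ELM-style:
    random fixed input weights, ridge solve for output weights beta."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))  # fixed random input weights
    b = rng.standard_normal(n_hidden)                # fixed random biases
    H = np.tanh(X @ W + b)                           # hidden-layer output matrix
    # beta* = (H^T H + I/c)^(-1) H^T T
    beta = np.linalg.solve(H.T @ H + np.eye(n_hidden) / c, H.T @ T)
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# two well-separated 2-D clusters with one-hot targets
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, (10, 2)), rng.normal(3.0, 0.1, (10, 2))])
T = np.repeat(np.eye(2), 10, axis=0)
W, b, beta = elm_train(X, T)
pred = elm_predict(X, W, b, beta).argmax(axis=1)
```

Because the hidden weights are never updated, training reduces to one linear solve, which is what makes the subsequent online (per-batch) update of the patent practical.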
In this embodiment, the functional expression for updating the output-layer weights of the network in step S4 is:

$$K_{k+1} = K_k + \tilde{H}_{k+1}^T \tilde{H}_{k+1}$$
$$\beta_{k+1} = \beta_k + K_{k+1}^{-1}\tilde{H}_{k+1}^T\left(\tilde{T}_{k+1} - \tilde{H}_{k+1}\beta_k\right)$$

where $K_{k+1}$ and $K_k$ are weight matrices, $\beta_{k+1}$ and $\beta_k$ are the iterated solution parameters, $\tilde{H}_{k+1}$ is the hidden-layer output matrix of the single-hidden-layer feedforward neural network for the input $(k+1)$-th batch of augmented data and labels $(\tilde{X}_{k+1}, \tilde{Y}_{k+1})$, and $\tilde{T}_{k+1}$ is the label matrix, with initial values:

$$K_0 = \tilde{H}_0^T \tilde{H}_0 + \frac{I}{c}, \qquad \beta_0 = K_0^{-1}\tilde{H}_0^T \tilde{T}_0$$

where $\tilde{H}_0$ is the hidden-layer output matrix of the single-hidden-layer feedforward neural network for the input initial batch of augmented data and labels $(\tilde{X}_0, \tilde{Y}_0)$, $c$ is the weight coefficient, and $I$ is the identity matrix.
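This recursive update reproduces the batch solution exactly: folding batches in one at a time yields the same output weights as solving on all data at once, which is why the network can be trained batch by batch without revisiting earlier data. A minimal NumPy sketch (matrix sizes are arbitrary stand-ins):

```python
import numpy as np

def oselm_init(H0, T0, c=1.0):
    """Initial batch: K0 = H0^T H0 + I/c ;  beta0 = K0^{-1} H0^T T0."""
    K = H0.T @ H0 + np.eye(H0.shape[1]) / c
    return K, np.linalg.solve(K, H0.T @ T0)

def oselm_update(K, beta, Hk, Tk):
    """One recursive step: K_{k+1} = K_k + Hk^T Hk ;
    beta_{k+1} = beta_k + K_{k+1}^{-1} Hk^T (Tk - Hk beta_k)."""
    K = K + Hk.T @ Hk
    beta = beta + np.linalg.solve(K, Hk.T @ (Tk - Hk @ beta))
    return K, beta

rng = np.random.default_rng(0)
H = rng.standard_normal((20, 5))   # hidden-layer outputs for 20 augmented samples
T = rng.standard_normal((20, 2))   # corresponding (soft) label matrix

# sequential: initial batch of 10, then one update with the remaining 10
K, beta = oselm_init(H[:10], T[:10])
K, beta = oselm_update(K, beta, H[10:], T[10:])

# batch: all 20 samples solved at once (c = 1, so the ridge term is I)
beta_batch = np.linalg.solve(H.T @ H + np.eye(5), H.T @ T)
```

The equality follows from $K_{k+1}\beta_{k+1} = K_k\beta_k + \tilde{H}_{k+1}^T\tilde{T}_{k+1}$, so after all batches $K\beta$ equals the full $\tilde{H}^T\tilde{T}$.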
After the single-hidden-layer feedforward neural network is trained, it is attached to the deep convolutional network used in steps S1 and S2 (with the fully connected layer removed), completing the classification and recognition of input images. In this embodiment, step S4 is followed by: inputting an image to be recognized into the image classification network model, obtaining the classification result for that image, and outputting it.
FIG. 3 compares the performance of the method of this embodiment with related representative methods on the standard image classification benchmarks CIFAR-10 and CIFAR-100. The method also classifies well when the image data are noisy: as shown in FIG. 4 and FIG. 5, its classification accuracy on CIFAR-10 and CIFAR-100 is compared with representative methods under different label noise intensities (0%-70%).
In addition, this embodiment further provides a data-augmented deep semi-supervised extreme learning machine image classification system, comprising an interconnected microprocessor and memory, wherein the microprocessor is programmed or configured to execute the steps of the data-augmented deep semi-supervised extreme learning machine image classification method.
This embodiment also provides a computer-readable storage medium storing a computer program programmed or configured to execute the aforementioned data-augmented deep semi-supervised extreme learning machine image classification method.
In summary, the method of this embodiment comprises: acquiring training images to be classified and recognized, and performing deep convolutional feature encoding and fusion for each image; training an initial classification and recognition network on the small labeled portion of the image data and classifying the unlabeled image data with it to obtain corresponding pseudo labels; augmenting the extracted image features together with their labels and pseudo labels; and removing the fully connected layer of the initial network, replacing it with a single hidden layer feedforward neural network layer, and learning that layer's weights with a data-augmented online over-limit learning machine (online sequential extreme learning machine), yielding the final image classification and recognition network model for end-to-end classification and recognition of the corresponding image targets. The method requires little manual annotation, is robust to noise interference, achieves good classification and recognition performance, and extends readily to new tasks.
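As a rough illustration of the pipeline summarized above, the following NumPy sketch replaces the fused deep convolutional features with random stand-ins, applies mixup-style interpolation, and trains the output weights of a random-hidden-layer network in closed form. All dimensions, the tanh activation, and the regularization constant `c` are illustrative assumptions, not values from the patent.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the fused deep convolutional features of n training images
# (the real method would extract and fuse CNN features here).
n, d, n_classes, n_hidden, c = 200, 16, 3, 64, 10.0
X = rng.normal(size=(n, d))
y = rng.integers(0, n_classes, size=n)
Y = np.eye(n_classes)[y]                 # one-hot labels / pseudo labels

# Mixup-style random linear interpolation of features and labels (step S3).
lam = rng.beta(1.0, 1.0, size=(n, 1))
perm = rng.permutation(n)
X_aug = lam * X + (1.0 - lam) * X[perm]
Y_aug = lam * Y + (1.0 - lam) * Y[perm]

# Single hidden layer feedforward network with random hidden weights (step S4).
W = rng.normal(size=(d, n_hidden))
b = rng.normal(size=n_hidden)
H = np.tanh(X_aug @ W + b)

# Closed-form regularized output weights: beta = (I/c + H^T H)^{-1} H^T Y.
beta = np.linalg.solve(np.eye(n_hidden) / c + H.T @ H, H.T @ Y_aug)

# End-to-end prediction: features -> hidden layer -> argmax over classes.
pred = np.argmax(np.tanh(X @ W + b) @ beta, axis=1)
```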
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-readable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited to the above embodiments, and all technical solutions belonging to the idea of the present invention belong to the protection scope of the present invention. It should be noted that modifications and embellishments within the scope of the invention may occur to those skilled in the art without departing from the principle of the invention, and are considered to be within the scope of the invention.

Claims (9)

1. A data-augmented deep semi-supervised over-limit learning image classification method, characterized by comprising the following steps for learning and training an image classification and recognition network model:
s1, extracting features of the training image by adopting a depth convolution network model; training the deep convolutional network model aiming at part of artificial label data of the training image to realize fine tuning optimization, and predicting and generating a corresponding pseudo label for the label-free training image through the fine tuning optimized deep convolutional network model;
s2, fusing the high-level semantic features extracted from the training images with the low-level shallow structure features to obtain fused image features;
s3, augmenting the fusion image characteristics and labels of the training image by adopting a random linear interpolation technology;
s4, randomly dividing the augmented fusion image features and labels into batches, sequentially inputting the batches into the single hidden layer feedforward neural network, updating the weight of a network output layer, and repeating the steps until the training of the single hidden layer feedforward neural network is completed; removing a full connection layer in the deep convolutional network model, and connecting the full connection layer with a trained single hidden layer feedforward neural network to form an image classification and identification network model for realizing end-to-end classification and identification of a corresponding image target; the function expression for updating the weight of the network output layer is as follows:
$$K_{k+1} = K_k + H_{k+1}^{T} H_{k+1}, \qquad \beta_{k+1} = \beta_k + K_{k+1}^{-1} H_{k+1}^{T}\left(T_{k+1} - H_{k+1}\beta_k\right)$$

in the above formulas, $K_{k+1}$ and $K_k$ are weight matrices, $\beta_{k+1}$ and $\beta_k$ are respectively the iteratively solved parameters, $H_{k+1}$ is the hidden layer output matrix of the single hidden layer feedforward neural network for the input $(k+1)$-th batch of augmented data and labels $(X_{k+1}, T_{k+1})$, and $T_{k+1}$ is the label matrix, with initial values:

$$K_0 = \frac{I}{c} + H_0^{T} H_0, \qquad \beta_0 = K_0^{-1} H_0^{T} T_0$$

wherein $H_0$ is the hidden layer output matrix of the single hidden layer feedforward neural network for the input initial batch of augmented data and labels $(X_0, T_0)$, $c$ is the weight coefficient, and $I$ is the identity matrix.
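A minimal NumPy sketch of the recursive output-weight update described in claim 1 (an online sequential extreme learning machine step). The function names and batch split are illustrative assumptions; the demo checks that the sequential updates reproduce the one-shot regularized least-squares solution computed on all data at once.

```python
import numpy as np

def oselm_init(H0, T0, c=1.0):
    """Initial batch: K0 = I/c + H0^T H0, beta0 = K0^{-1} H0^T T0."""
    K = np.eye(H0.shape[1]) / c + H0.T @ H0
    beta = np.linalg.solve(K, H0.T @ T0)
    return K, beta

def oselm_update(K, beta, Hk, Tk):
    """One recursion: K <- K + Hk^T Hk; beta <- beta + K^{-1} Hk^T (Tk - Hk beta)."""
    K = K + Hk.T @ Hk
    beta = beta + np.linalg.solve(K, Hk.T @ (Tk - Hk @ beta))
    return K, beta

# Demo: three batches of hidden layer outputs and label matrices.
rng = np.random.default_rng(0)
H_all = rng.normal(size=(60, 8))
T_all = rng.normal(size=(60, 3))
c = 10.0
K, beta = oselm_init(H_all[:20], T_all[:20], c)
for s in (slice(20, 40), slice(40, 60)):
    K, beta = oselm_update(K, beta, H_all[s], T_all[s])

# Batch (non-sequential) regularized least-squares solution for comparison.
beta_batch = np.linalg.solve(np.eye(8) / c + H_all.T @ H_all, H_all.T @ T_all)
```

Because the recursion is algebraically exact recursive least squares, the two solutions agree up to floating-point error, which is what makes batch-wise online training equivalent to training on the full augmented set.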
2. The data-augmented deep semi-supervised over-limit learning image classification method according to claim 1, wherein the deep convolutional network model in step S1 is a pre-trained 13-layer convolutional (13-CNN) deep network model.
3. The data-augmented deep semi-supervised over-limit learning image classification method according to claim 2, wherein the training loss function of the 13-CNN deep convolutional network model is:
$$l_{cos} = -\frac{1}{l}\sum_{i=1}^{l} y_i \log p(y_i \mid x_i) - \frac{\lambda_0}{n-l}\sum_{i=l+1}^{n} \hat{y}_i \log p(y_i \mid x_i) + \lambda_1 R_0 + \lambda_2 R_1$$

in the above formula, $l_{cos}$ represents the training loss function, $\lambda_0$, $\lambda_1$ and $\lambda_2$ are weight coefficients, $R_0$ is a consistency regularization term, $R_1$ is a cross-entropy regularization term, $y_i$ is the sample label, $\hat{y}_i$ is the label the model estimates for the unlabeled data, $x_i$ is the $i$-th sample, $n$ is the number of samples, $p(y_i \mid x_i)$ is the predicted output of the model, and $l$ is the number of labeled samples, with:

$$R_0 = \sum_{c=1}^{C} p_c \log\frac{p_c}{\bar{h}_c}, \qquad R_1 = \frac{1}{n}\sum_{i=1}^{n} H\big(p(y \mid x_i)\big)$$

wherein $C$ is the number of sample classes, $p_c$ is the average class marginal probability, $\bar{h}_c$ is the class marginal probability predicted by the model, $p(y \mid x)$ is the sample conditional probability output by the model, and $H$ is the entropy.
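One plausible NumPy instantiation of the regularized loss in claim 3. The uniform class prior for $p_c$, the placement of the weight coefficients, and their default values are assumptions (the patent's formula images are not reproduced here); only the ingredients named in the claim — cross-entropy, a class-marginal term, and an entropy term — are taken from the text.

```python
import numpy as np

def pseudo_label_loss(probs, targets, lam1=0.8, lam2=0.4):
    """Cross-entropy against (pseudo-)labels plus two regularizers:
    R0 = sum_c p_c log(p_c / hbar_c), a KL term pulling the model's
         average class marginal hbar_c toward a prior p_c, and
    R1 = mean per-sample prediction entropy."""
    eps = 1e-12
    n, C = probs.shape
    ce = -np.mean(np.sum(targets * np.log(probs + eps), axis=1))
    p_c = np.full(C, 1.0 / C)           # assumed uniform class prior
    h_bar = probs.mean(axis=0)          # model's average class marginal
    r0 = np.sum(p_c * np.log(p_c / (h_bar + eps)))
    r1 = -np.mean(np.sum(probs * np.log(probs + eps), axis=1))
    return ce + lam1 * r0 + lam2 * r1

# Demo: for uniform predictions R0 vanishes and the entropy term is log(C).
probs = np.full((4, 3), 1.0 / 3.0)
targets = np.eye(3)[[0, 1, 2, 0]]
loss = pseudo_label_loss(probs, targets)
```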
4. The data-augmented deep semi-supervised over-limit learning image classification method according to claim 1, wherein the fusion in step S2 is a vectorized feature cascade (concatenation) whose functional expression is:
$$f_c(x) = concat\big(ReLU(GAP(f_s(x))),\ ReLU(GAP(f_h(x)))\big)$$

in the above formula, $f_c(x)$ is the fused image feature after cascade fusion, $concat$ denotes the vector concatenation operation, $ReLU$ is the linear rectification function, $GAP$ is the global average pooling function, and $f_s(x)$, $f_h(x)$ are respectively the low-level shallow structure feature and the high-level semantic feature output by the deep neural network.
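A NumPy sketch of the cascade fusion in claim 4 for a single image. The nesting order ReLU(GAP(·)) and the channel counts are assumptions; the claim fixes the ingredients (concat, ReLU, GAP) but its formula image is not reproduced here.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def gap(fmap):
    """Global average pooling over the spatial axes of a (C, H, W) map."""
    return fmap.mean(axis=(1, 2))

def fuse(f_s, f_h):
    """f_c = concat(ReLU(GAP(f_s)), ReLU(GAP(f_h))): pool each feature map
    to a channel vector, rectify, and concatenate shallow + semantic parts."""
    return np.concatenate([relu(gap(f_s)), relu(gap(f_h))])

# Demo: a 64-channel shallow map and a 256-channel high-level map.
f_s = np.ones((64, 28, 28))
f_h = -np.ones((256, 7, 7))
f_c = fuse(f_s, f_h)
```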
5. The data-augmented deep semi-supervised over-limit learning image classification method according to claim 1, wherein the functional expression for augmenting the fused image features and labels of the training images with the random linear interpolation technique in step S3 is:
$$\tilde{X} = \lambda X_i + (1-\lambda) X_j, \qquad \tilde{Y} = \lambda Y_i + (1-\lambda) Y_j$$

in the above formulas, $\tilde{X}$ is the interpolated image feature matrix, $X_j$ and $X_i$ are the image feature matrices before interpolation, $\tilde{Y}$ is the interpolated label matrix, $Y_j$ and $Y_i$ are the label matrices before interpolation, and $\lambda$ is a weight vector sampled from a beta distribution.
6. The data-augmented deep semi-supervised over-limit learning image classification method according to claim 5, wherein the computational function of the weight vector $\lambda$ sampled from the beta distribution is:
$$f(x;\alpha,\gamma) = \frac{x^{\alpha-1}(1-x)^{\gamma-1}}{\int_0^1 u^{\alpha-1}(1-u)^{\gamma-1}\,du}$$

in the above formula, $f(x;\alpha,\gamma)$ is the computational function (probability density) of the weight vector $\lambda$ sampled from the beta distribution, $\alpha$ and $\gamma$ are beta distribution control parameters greater than 0, and $x$ and $u$ are function variables.
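Claims 5 and 6 together describe mixup-style interpolation with Beta-distributed weights. A NumPy sketch, where the per-sample pairing by random permutation is an implementation assumption:

```python
import numpy as np

def mixup(X, Y, alpha=1.0, gamma=1.0, rng=None):
    """Random linear interpolation of feature matrix X and one-hot label
    matrix Y, with per-sample weights lambda ~ Beta(alpha, gamma)."""
    if rng is None:
        rng = np.random.default_rng()
    lam = rng.beta(alpha, gamma, size=(len(X), 1))  # column for broadcasting
    perm = rng.permutation(len(X))                  # random partner samples
    X_mix = lam * X + (1.0 - lam) * X[perm]
    Y_mix = lam * Y + (1.0 - lam) * Y[perm]
    return X_mix, Y_mix

# Demo: 8 samples, 5 features, 4 classes.
rng = np.random.default_rng(1)
X = rng.normal(size=(8, 5))
Y = np.eye(4)[rng.integers(0, 4, size=8)]
X_mix, Y_mix = mixup(X, Y, alpha=0.75, gamma=0.75, rng=rng)
```

Since each mixed label row is a convex combination of one-hot rows, the augmented label matrix still rows-sums to 1, i.e. it remains a valid soft-label distribution.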
7. The data-augmented deep semi-supervised over-limit learning image classification method according to claim 1, further comprising, after step S4: inputting an image to be recognized into the image classification and recognition network model, and obtaining and outputting the image classification and recognition result corresponding to the image to be recognized.
8. A data-augmented deep semi-supervised over-limit learning image classification system comprising a microprocessor and a memory connected to each other, characterized in that the microprocessor is programmed or configured to perform the steps of the data-augmented deep semi-supervised over-limit learning image classification method of any one of claims 1 to 7.
9. A computer-readable storage medium having stored thereon a computer program programmed or configured to perform the data-augmented deep semi-supervised over-limit learning image classification method of any one of claims 1 to 7.
CN202110448092.XA 2021-04-25 2021-04-25 Data-augmented deep semi-supervised over-limit learning image classification method and system Active CN113077388B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110448092.XA CN113077388B (en) 2021-04-25 2021-04-25 Data-augmented deep semi-supervised over-limit learning image classification method and system


Publications (2)

Publication Number Publication Date
CN113077388A CN113077388A (en) 2021-07-06
CN113077388B true CN113077388B (en) 2022-08-09

Family

ID=76618604

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110448092.XA Active CN113077388B (en) 2021-04-25 2021-04-25 Data-augmented deep semi-supervised over-limit learning image classification method and system

Country Status (1)

Country Link
CN (1) CN113077388B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113673591B (en) * 2021-08-13 2023-12-01 上海交通大学 Self-adjusting sampling optimization image classification method, device and medium
CN114462558A (en) * 2022-04-13 2022-05-10 南昌工程学院 Data-augmented supervised learning image defect classification method and system
CN114821204B (en) * 2022-06-30 2023-04-07 山东建筑大学 Meta-learning-based embedded semi-supervised learning image classification method and system
CN115272777B (en) * 2022-09-26 2022-12-23 山东大学 Semi-supervised image analysis method for power transmission scene
CN116168348B (en) * 2023-04-21 2024-01-30 成都睿瞳科技有限责任公司 Security monitoring method, system and storage medium based on image processing
CN117710763A (en) * 2023-11-23 2024-03-15 广州航海学院 Image noise recognition model training method, image noise recognition method and device

Citations (5)

Publication number Priority date Publication date Assignee Title
CN106897737A (en) * 2017-01-24 2017-06-27 北京理工大学 A kind of high-spectrum remote sensing terrain classification method based on the learning machine that transfinites
CN106960176A (en) * 2017-02-22 2017-07-18 华侨大学 A kind of pedestrian's gender identification method based on transfinite learning machine and color characteristic fusion
CN107403191A (en) * 2017-07-03 2017-11-28 杭州电子科技大学 A kind of semi-supervised learning machine sorting technique that transfinites with depth structure
CN109740539A (en) * 2019-01-04 2019-05-10 上海理工大学 3D object identification method based on transfinite learning machine and fusion convolutional network
CN110598728A (en) * 2019-07-23 2019-12-20 杭州电子科技大学 Semi-supervised ultralimit learning machine classification method based on graph balance regularization

Family Cites Families (3)

Publication number Priority date Publication date Assignee Title
CN104680144B (en) * 2015-03-02 2018-06-05 华为技术有限公司 Based on the lip reading recognition methods and device for projecting very fast learning machine
WO2018184222A1 (en) * 2017-04-07 2018-10-11 Intel Corporation Methods and systems using improved training and learning for deep neural networks
CN112116088A (en) * 2020-08-24 2020-12-22 丽水学院 Incremental semi-supervised over-limit learning machine system for adaptively determining number of hidden nodes


Non-Patent Citations (2)

Title
Traffic Sign Recognition Using Deep Convolutional Networks and Extreme Learning Machine; Yujun Zeng et al.; IScIDE 2015, Part I, LNCS 9242; 2015; pp. 272-280 *
Traffic Sign Recognition Using Kernel Extreme Learning Machines With Deep Perceptual Features; Yujun Zeng et al.; IEEE Transactions on Intelligent Transportation Systems; 2016; pp. 1-7 *


Similar Documents

Publication Publication Date Title
CN113077388B (en) Data-augmented deep semi-supervised over-limit learning image classification method and system
CN111079532B (en) Video content description method based on text self-encoder
WO2020063715A1 (en) Method and system for training binary quantized weight and activation function for deep neural networks
CN107526785B (en) Text classification method and device
CN111552807B (en) Short text multi-label classification method
CN110110080A (en) Textual classification model training method, device, computer equipment and storage medium
CN108710896B (en) Domain learning method based on generative confrontation learning network
CN111275046B (en) Character image recognition method and device, electronic equipment and storage medium
JP2023549579A (en) Temporal Bottleneck Attention Architecture for Video Behavior Recognition
CN109389166A (en) The depth migration insertion cluster machine learning method saved based on partial structurtes
CN113159072B (en) Online ultralimit learning machine target identification method and system based on consistency regularization
CN113283590B (en) Defending method for back door attack
CN109242097B (en) Visual representation learning system and method for unsupervised learning
Flenner et al. A deep non-negative matrix factorization neural network
CN110892409A (en) Method and apparatus for analyzing images
Cheng et al. A survey on deep neural network pruning-taxonomy, comparison, analysis, and recommendations
Ulaganathan et al. Isolated handwritten Tamil character recognition using convolutional neural networks
CN114882278A (en) Tire pattern classification method and device based on attention mechanism and transfer learning
CN112527959B (en) News classification method based on pooling convolution embedding and attention distribution neural network
CN116075820A (en) Method, non-transitory computer readable storage medium and apparatus for searching image database
Passalis et al. Deep temporal logistic bag-of-features for forecasting high frequency limit order book time series
Nguyen-Duc et al. Particle-based Adversarial Local Distribution Regularization.
CN115797642A (en) Self-adaptive image semantic segmentation algorithm based on consistency regularization and semi-supervision field
CN115862015A (en) Training method and device of character recognition system, and character recognition method and device
Chu et al. Mixed-precision quantized neural network with progressively decreasing bitwidth for image classification and object detection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant