CN113077388B - Data-augmented deep semi-supervised over-limit learning image classification method and system - Google Patents
- Publication number
- CN113077388B (application number CN202110448092.XA)
- Authority
- CN
- China
- Prior art keywords
- image
- data
- training
- network model
- label
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 238000000034 method Methods 0.000 title claims abstract description 58
- 238000012549 training Methods 0.000 claims abstract description 60
- 238000013528 artificial neural network Methods 0.000 claims abstract description 29
- 230000004927 fusion Effects 0.000 claims abstract description 21
- 230000003190 augmentative effect Effects 0.000 claims abstract description 13
- 238000005516 engineering process Methods 0.000 claims abstract description 9
- 239000011159 matrix material Substances 0.000 claims description 39
- 230000006870 function Effects 0.000 claims description 36
- 238000009826 distribution Methods 0.000 claims description 14
- 239000013598 vector Substances 0.000 claims description 14
- 238000004590 computer program Methods 0.000 claims description 10
- 238000013434 data augmentation Methods 0.000 claims description 8
- 238000013527 convolutional neural network Methods 0.000 claims description 7
- 230000003416 augmentation Effects 0.000 claims description 6
- 238000011176 pooling Methods 0.000 claims description 6
- 238000003860 storage Methods 0.000 claims description 6
- 238000005457 optimization Methods 0.000 claims description 4
- 238000004364 calculation method Methods 0.000 claims description 3
- 238000010586 diagram Methods 0.000 description 12
- 230000000694 effects Effects 0.000 description 5
- 238000002372 labelling Methods 0.000 description 5
- 238000012545 processing Methods 0.000 description 5
- 230000008569 process Effects 0.000 description 3
- 230000003321 amplification Effects 0.000 description 2
- 238000003062 neural network model Methods 0.000 description 2
- 230000000007 visual effect Effects 0.000 description 2
- 238000013145 classification model Methods 0.000 description 1
- 238000013135 deep learning Methods 0.000 description 1
- 238000013461 design Methods 0.000 description 1
- 238000000605 extraction Methods 0.000 description 1
- 238000004519 manufacturing process Methods 0.000 description 1
- 230000007246 mechanism Effects 0.000 description 1
- 230000000116 mitigating effect Effects 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 238000011160 research Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4007—Scaling of whole images or parts thereof, e.g. expanding or contracting based on interpolation, e.g. bilinear interpolation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Biology (AREA)
- Biophysics (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Molecular Biology (AREA)
- General Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Biomedical Technology (AREA)
- Health & Medical Sciences (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a data-augmented deep semi-supervised over-limit learning (extreme learning machine based) image classification method and system. The method extracts features from training images with a deep convolutional network model; fine-tunes the model on a small amount of manually labeled data and uses the fine-tuned model to generate pseudo labels for the unlabeled training images; fuses the high-level semantic features extracted from the training images with low-level shallow structural features to obtain fused image features; augments the fused image features and the training-image labels by random linear interpolation; and trains a single-hidden-layer feedforward neural network on the augmented fused features and labels, which replaces the fully connected layer of the deep convolutional network model to yield the final image classification and recognition network model. The method requires little manual labeling, enlarges the training data, offers robust resistance to noise interference, good classification and recognition performance, and strong task extensibility.
Description
Technical Field
The invention relates to the technical field of image classification and target recognition, and in particular to a data-augmented deep semi-supervised over-limit learning (i.e., extreme learning machine based) image classification method and system.
Background
Most current high-performing visual target recognition methods adopt deep learning, which typically requires supervised training of a deep neural network model on large-scale manually labeled data. However, manually labeling large amounts of data is expensive and may even be impractical in certain application scenarios. In recent years, research has therefore increasingly focused on deep semi-supervised learning, which improves visual target classification and recognition by combining a small amount of high-quality labeled data with a large amount of easily obtained unlabeled data. A variety of deep semi-supervised learning techniques have been proposed, including entropy minimization, pseudo-labeling, consistency regularization, pre-training, and generative-model-based approaches.
Among them, three types of techniques are most representative:
The first is pseudo-label-based deep semi-supervised learning. A model trained on a small amount of manually labeled data predicts class labels for the unlabeled data; these predictions serve as pseudo labels, which are added to the manually labeled data to retrain the image class-label prediction model; the pseudo labels are then updated, and the process iterates until an acceptable classification accuracy is reached. However, the labels predicted by the model may be noisy, and the data itself may contain a certain amount of noise, so image classification models trained on such samples often suffer from model degradation, and their accuracy is difficult to bring up to practical application requirements.
The second is deep semi-supervised learning based on consistency regularization, which penalizes inconsistent predictions on unlabeled data under different perturbations. Because it relies on multiple predictions, this technique also suffers from overfitting to noisy labels. Various refinements, such as generating higher-quality pseudo labels or more effective perturbations, have been proposed to reduce the overfitting, but the improvement is evident only at low noise intensity, and the results remain unacceptable when the noise intensity is high.
The third decouples feature representation and classification into two independent stages: before the classifier is learned, the deep neural network is pre-trained through an unsupervised auxiliary learning task to find a compressed feature representation of the input data. This approach pursues a better feature-representation model and a better classifier separately, and has advantages in analyzing and mitigating the overfitting problem. However, the unsupervised auxiliary task lacks label guidance, so the learned representation often risks being inconsistent with the final task. For example, an autoencoder reconstructs all pixels of the original image to guide feature learning, whereas the final classification task may depend on only a small portion of the pixels. In addition, with this technique it is difficult to configure the classifier effectively; sample information is used inefficiently, making efficient learning and training hard to achieve.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: in view of the problems in the prior art, provide a data-augmented deep semi-supervised over-limit learning image classification method and system that require little manual labeling, enlarge the training data, and offer robust resistance to noise interference, good classification and recognition performance, and strong task extensibility.
In order to solve the technical problems, the invention adopts the technical scheme that:
a data-augmented deep semi-supervised over-limit learning image classification method comprises the following steps of learning and training an image classification recognition network model:
s1, extracting features of the training image by adopting a depth convolution network model; training the deep convolutional network model aiming at part of artificial label data of the training image to realize fine tuning optimization, and predicting and generating a corresponding pseudo label for the label-free training image through the fine tuning optimized deep convolutional network model;
s2, fusing the high-level semantic features extracted from the training images with the low-level shallow structure features to obtain fused image features;
s3, augmenting the fusion image characteristics and labels of the training image by adopting a random linear interpolation technology;
s4, randomly dividing the augmented fusion image features and labels into batches, sequentially inputting the batches into the single hidden layer feedforward neural network, updating the weight of a network output layer, and repeating the steps until the training of the single hidden layer feedforward neural network is completed; and removing the full connection layer in the deep convolutional network model, and connecting the full connection layer with the trained single hidden layer feedforward neural network to form an image classification and identification network model for realizing the end-to-end classification and identification of the corresponding image target.
Optionally, the deep convolutional network model in step S1 is a pre-trained 13-CNN deep convolutional network model.
Optionally, the training loss function l_cos of the 13-CNN deep convolutional network model combines a supervised cross-entropy term over the l labeled samples with a consistency regularization term R0 and a cross-entropy regularization term R1, weighted by coefficients λ0, λ1 and λ2. Here y_i is the label of the i-th sample x_i, n is the number of samples, ŷ_i is the label the model self-estimates for unlabeled data, and p(y_i|x_i) is the predicted output of the model. Within the regularization terms, C is the number of sample classes, p_c is the mean class-marginal probability, p̃_c is the class-marginal probability predicted by the model, p(y|x) is the conditional probability output by the model, and H denotes entropy.
Optionally, the fusion in step S2 is a feature-vectorized cascade fusion, expressed as
f_c(x) = concat( GAP( ReLU( f_s(x) ) ), GAP( ReLU( f_h(x) ) ) )
where f_c(x) is the cascade-fused image feature, concat denotes vector concatenation, ReLU is the linear rectification function, GAP is the global average pooling function, and f_s(x) and f_h(x) are, respectively, the low-level shallow structural feature and the high-level semantic feature output by the deep neural network.
Optionally, in step S3 the fused image features and the labels of the training images are augmented by random linear interpolation as
X̃ = λ X_i + (1 − λ) X_j,  Ỹ = λ Y_i + (1 − λ) Y_j
where X̃ is the interpolated image feature matrix, X_i and X_j are the image feature matrices before interpolation, Ỹ is the interpolated label matrix, Y_i and Y_j are the label matrices before interpolation, and λ is a weight vector sampled from a beta distribution.
Optionally, the weight vector λ is sampled from the beta distribution with density
f(x; α, γ) = x^(α−1) (1 − x)^(γ−1) / ∫₀¹ u^(α−1) (1 − u)^(γ−1) du
where α and γ are control parameters of the beta distribution, both greater than 0, and x and u are function variables.
Optionally, the weights of the network output layer in step S4 are updated as
K_(k+1) = K_k + H̃_(k+1)ᵀ H̃_(k+1)
β_(k+1) = β_k + K_(k+1)⁻¹ H̃_(k+1)ᵀ ( Ỹ_(k+1) − H̃_(k+1) β_k )
where K_(k+1) and K_k are weight matrices, β_(k+1) and β_k are the iterative solution parameters, and H̃_(k+1) is the hidden-layer output matrix of the single-hidden-layer feedforward neural network for the (k+1)-th input batch of augmented data with labels Ỹ_(k+1), with initial values
K_0 = H̃_0ᵀ H̃_0 + I/c,  β_0 = K_0⁻¹ H̃_0ᵀ Ỹ_0
where H̃_0 is the hidden-layer output matrix for the initial batch of augmented data with labels Ỹ_0, c is a weight coefficient, and I is the identity matrix.
Optionally, step S4 is followed by: inputting an image to be recognized into the image classification and recognition network model, and obtaining and outputting the image classification and recognition result corresponding to the image.
In addition, the invention provides a data-augmented deep semi-supervised over-limit learning image classification system, comprising a microprocessor and a memory connected to each other, the microprocessor being programmed or configured to execute the steps of the data-augmented deep semi-supervised over-limit learning image classification method.
Furthermore, the invention provides a computer-readable storage medium storing a computer program programmed or configured to execute the data-augmented deep semi-supervised over-limit learning image classification method.
Compared with the prior art, the invention has the main advantages that:
1. The invention adopts a deep semi-supervised learning route, training the image classification and recognition network model on a small amount of labeled data and a large amount of unlabeled data, which greatly reduces the cost of manual data labeling while preserving classification and recognition accuracy. It also improves on existing deep semi-supervised frameworks by decoupling feature learning from classifier training and optimizing each separately, so that the network model learns to extract image features better oriented to the specific classification and recognition task. Classifier training follows the over-limit (extreme) learning machine principle, further improving the generalization of classification and recognition.
2. When the over-limit learning machine is used to train the classifier, a data augmentation mechanism is introduced and integrated into the design of the objective function of the online over-limit learning machine, so that the trained classifier effectively tolerates noise in the training data and its labels, improving the robustness of classification and recognition. Moreover, the data-augmented online over-limit learning method is not limited to the deep semi-supervised framework: it also applies to classifier training in supervised learning tasks, giving it a degree of task extensibility.
Drawings
FIG. 1 is a schematic diagram of a basic flow of a method according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of an implementation principle of the method according to the embodiment of the present invention.
FIG. 3 shows the performance of the method of the embodiment of the invention on the standard international image classification databases CIFAR-10 and CIFAR-100, compared with related representative methods.
FIG. 4 is a schematic diagram comparing the classification and recognition accuracy of the method of the embodiment on CIFAR-10 and CIFAR-100 with that of one group of representative methods.
FIG. 5 is a schematic diagram comparing the classification and recognition accuracy of the method of the embodiment on CIFAR-10 and CIFAR-100 with that of another group of representative methods.
Detailed Description
The invention is further described below with reference to the drawings and specific preferred embodiments of the description, without thereby limiting the scope of protection of the invention.
As shown in FIG. 1 and FIG. 2, the data-augmented deep semi-supervised over-limit learning image classification method of this embodiment comprises the following steps for learning and training an image classification and recognition network model:
S1, extracting features from the training images with a deep convolutional network model; fine-tuning the model by training it on the manually labeled portion of the training images, and using the fine-tuned model to predict corresponding pseudo labels for the unlabeled training images;
S2, fusing the high-level semantic features extracted from the training images with low-level shallow structural features to obtain fused image features;
S3, augmenting the fused image features and the labels of the training images by random linear interpolation;
S4, randomly dividing the augmented fused image features and labels into batches, feeding the batches sequentially into a single-hidden-layer feedforward neural network and updating the weights of the network output layer, repeating until training of the single-hidden-layer feedforward network is complete; then removing the fully connected layer of the deep convolutional network model and connecting the trained single-hidden-layer feedforward network in its place, forming the image classification and recognition network model for end-to-end classification and recognition of the corresponding image targets.
Step S1 extracts deep convolutional features of the images and generates pseudo labels. A deep convolutional neural network with basic classification and recognition capability is first trained on the small amount of manually labeled image data and used to perform preliminary classification of the large amount of unlabeled images, yielding corresponding pseudo labels. The original image data is thereby converted into a data set containing both manually labeled and pseudo-labeled images, on which the deep convolutional network is retrained. The original semi-supervised problem is thus converted into a supervised learning process, and the extracted deep convolutional features acquire task relevance.
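The pseudo-label generation of step S1 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `predict_proba` callable stands in for the fine-tuned deep convolutional network, and the one-hot encoding of the argmax prediction is an assumed convention.

```python
import numpy as np

def generate_pseudo_labels(predict_proba, unlabeled_x, num_classes):
    """Assign one-hot pseudo labels to unlabeled images using the
    class predictions of a fine-tuned network (step S1).
    `predict_proba` is any callable returning class probabilities."""
    probs = predict_proba(unlabeled_x)        # shape (n, num_classes)
    hard = np.argmax(probs, axis=1)           # predicted class per image
    pseudo = np.eye(num_classes)[hard]        # one-hot pseudo labels
    return pseudo

# Toy usage with a stand-in "model" that normalizes three scores.
rng = np.random.default_rng(0)
x_unlabeled = rng.normal(size=(4, 8))
fake_proba = lambda x: np.abs(x[:, :3]) / np.abs(x[:, :3]).sum(1, keepdims=True)
labels = generate_pseudo_labels(fake_proba, x_unlabeled, 3)
print(labels.shape)  # (4, 3), one one-hot row per unlabeled image
```

The pseudo-labeled set is then merged with the manually labeled set and the network is retrained, as the paragraph above describes.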
As an alternative embodiment, the deep convolutional network model in step S1 is a 13-CNN deep convolutional network model with pre-training.
In this embodiment, the training loss function l_cos of the 13-CNN deep convolutional network model combines a supervised cross-entropy term over the l labeled samples with a consistency regularization term R0 and a cross-entropy regularization term R1, weighted by coefficients λ0, λ1 and λ2. Here y_i is the label of the i-th sample x_i, n is the number of samples, ŷ_i is the label the model self-estimates for unlabeled data, and p(y_i|x_i) is the predicted output of the model. Within the regularization terms, C is the number of sample classes, p_c is the mean class-marginal probability, p̃_c is the class-marginal probability predicted by the model, p(y|x) is the conditional probability output by the model, and H denotes entropy.
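The exact composition of l_cos is shown only as an image in the original publication, so the sketch below is an assumed form, not the patent's formula: it combines the three ingredients the glossary names, namely labeled cross-entropy, a class-marginal consistency term R0 (KL divergence between a uniform prior p_c and the predicted marginal p̃_c), and an entropy term R1, weighted by λ0, λ1 and λ2. All parameter values are illustrative.

```python
import numpy as np

def semi_supervised_loss(p_labeled, y_onehot, p_unlabeled,
                         lam0=1.0, lam1=0.8, lam2=0.4, eps=1e-12):
    """Assumed composition of the 13-CNN training loss: labeled
    cross-entropy plus regularizers R0 and R1 from the glossary."""
    # Supervised cross-entropy over the l labeled samples
    ce = -np.mean(np.sum(y_onehot * np.log(p_labeled + eps), axis=1))
    # R0: KL between a uniform class prior p_c and the predicted marginal
    c = p_labeled.shape[1]
    prior = np.full(c, 1.0 / c)
    marginal = p_unlabeled.mean(axis=0)
    r0 = np.sum(prior * np.log(prior / (marginal + eps)))
    # R1: mean prediction entropy H(p(y|x)) on unlabeled data
    r1 = -np.mean(np.sum(p_unlabeled * np.log(p_unlabeled + eps), axis=1))
    return lam0 * ce + lam1 * r0 + lam2 * r1
```

The R0 term discourages the predicted class marginal from collapsing onto a few classes, while R1 pushes individual predictions toward confident (low-entropy) outputs, the usual roles of consistency and entropy regularization in pseudo-label training.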
In step S2, multi-level deep convolutional feature fusion selects, from the multi-level convolutional features extracted by the deep convolutional network, a shallow feature reflecting image structural information and a deep feature reflecting image category semantics, and performs feature-vectorized cascade fusion:
f_c(x) = concat( GAP( ReLU( f_s(x) ) ), GAP( ReLU( f_h(x) ) ) )
where f_c(x) is the cascade-fused image feature, concat denotes vector concatenation, ReLU is the linear rectification function, GAP is the global average pooling function, and f_s(x) and f_h(x) are, respectively, the low-level shallow structural feature and the high-level semantic feature output by the deep neural network. In this embodiment, taking 13-CNN as an example, the output features of the 3rd and 18th convolutional layers may be selected and cascaded after global pooling and linear rectification processing.
The global average pooling function GAP maps image data X of width W, height H and C channels to a C-dimensional vector whose k-th component is
GAP(X)_k = (1 / (W·H)) Σ_{j=1..H} Σ_{i=1..W} x_{i,j,k}
and the linear rectification function is
ReLU(x) = max(0, x)
where x_{i,j,k} is the data point in the i-th column and j-th row of the k-th channel of X, and max takes the maximum value.
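The fusion of step S2 can be sketched in NumPy as below. The ReLU-then-GAP order is an assumption (the text says only that the two feature maps are cascaded after global pooling and linear rectification processing), and the feature-map shapes are illustrative.

```python
import numpy as np

def global_average_pool(x):
    """GAP over the spatial dims of a (W, H, C) feature map -> (C,) vector."""
    return x.mean(axis=(0, 1))

def fuse_features(f_shallow, f_deep):
    """Step S2: vectorize each feature map with ReLU + GAP, then
    concatenate the shallow structural and deep semantic vectors."""
    relu = lambda t: np.maximum(t, 0.0)   # ReLU(x) = max(0, x)
    v_s = global_average_pool(relu(f_shallow))
    v_h = global_average_pool(relu(f_deep))
    return np.concatenate([v_s, v_h])

# Example: a 32x32x64 shallow map and a 4x4x256 deep map fuse to 320 dims.
fused = fuse_features(np.ones((32, 32, 64)), np.ones((4, 4, 256)))
print(fused.shape)  # (320,)
```

Because GAP removes the spatial dimensions, the two maps can be concatenated even though their spatial resolutions differ, which is what makes cascading a 3rd-layer and an 18th-layer feature practical.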
In step S3 of this embodiment, for online over-limit learning classification based on data augmentation, the image features obtained in steps S1 and S2 and their corresponding labels and pseudo labels are augmented by random linear interpolation, specifically using the existing mixup method:
X̃ = λ X_i + (1 − λ) X_j,  Ỹ = λ Y_i + (1 − λ) Y_j
where X̃ is the interpolated image feature matrix, X_i and X_j are the image feature matrices before interpolation, Ỹ is the interpolated label matrix, Y_i and Y_j are the label matrices before interpolation, and λ is a weight vector sampled from a beta distribution.
In this embodiment, the weight vector λ is sampled from the beta distribution with density
f(x; α, γ) = x^(α−1) (1 − x)^(γ−1) / ∫₀¹ u^(α−1) (1 − u)^(γ−1) du
where α and γ are control parameters of the beta distribution, both greater than 0, and x and u are function variables. In this embodiment, α = γ.
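A minimal NumPy sketch of the step-S3 augmentation, assuming the standard mixup formulation: a per-sample λ drawn from Beta(α, α) (α = γ, as in this embodiment) and partner samples X_j, Y_j obtained by shuffling the batch. Function and parameter names are illustrative.

```python
import numpy as np

def mixup_augment(X, Y, alpha=0.75, rng=None):
    """Step S3: mixup-style random linear interpolation of fused
    features X (N, d) and one-hot labels Y (N, C)."""
    if rng is None:
        rng = np.random.default_rng()
    lam = rng.beta(alpha, alpha, size=(X.shape[0], 1))  # weight vector
    perm = rng.permutation(X.shape[0])                  # shuffled partner j
    X_aug = lam * X + (1.0 - lam) * X[perm]             # X~ = lam*Xi + (1-lam)*Xj
    Y_aug = lam * Y + (1.0 - lam) * Y[perm]             # Y~ = lam*Yi + (1-lam)*Yj
    return X_aug, Y_aug
```

Because each augmented label is a convex combination of two one-hot labels, the rows of Y_aug remain valid probability vectors, which is what lets the augmented pairs feed directly into the output-layer training of step S4.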
In step S4 of this embodiment, the fully connected layer at the end of the deep convolutional network model is replaced by a single-hidden-layer feedforward neural network, and data augmentation is further combined with the over-limit learning machine by defining the objective
min_β ‖ H̃ β − Ỹ ‖_F² + (1/c) ‖ β ‖_F²
where λ is the weight vector sampled from the beta distribution, H̃ is the hidden-layer output matrix of the single-hidden-layer feedforward network on the augmented data, β is the output-layer weight matrix to be optimized, ‖·‖_F denotes the Frobenius norm, and c is a weight coefficient. Accordingly, the output weights of the single-hidden-layer feedforward network are given by
β* = ( H̃ᵀ H̃ + I/c )⁻¹ H̃ᵀ Ỹ
where β* denotes the output weights, H̃ is the data matrix of hidden-layer outputs for N samples of feature dimension d, I is the identity matrix, and Ỹ is the augmented label matrix
Ỹ = λ Y_i + (1 − λ) Y_j
where Y_i is the original label matrix and Y_j is a shuffled (reordered) label matrix.
In this embodiment, the weights of the network output layer in step S4 are updated as
K_(k+1) = K_k + H̃_(k+1)ᵀ H̃_(k+1)
β_(k+1) = β_k + K_(k+1)⁻¹ H̃_(k+1)ᵀ ( Ỹ_(k+1) − H̃_(k+1) β_k )
where K_(k+1) and K_k are weight matrices, β_(k+1) and β_k are the iterative solution parameters, and H̃_(k+1) is the hidden-layer output matrix of the single-hidden-layer feedforward neural network for the (k+1)-th input batch of augmented data with labels Ỹ_(k+1), with initial values
K_0 = H̃_0ᵀ H̃_0 + I/c,  β_0 = K_0⁻¹ H̃_0ᵀ Ỹ_0
where H̃_0 is the hidden-layer output matrix for the initial batch of augmented data with labels Ỹ_0, c is a weight coefficient, and I is the identity matrix.
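The update above matches the standard OS-ELM (online sequential extreme learning machine) recursive least-squares scheme, and the sketch below implements that recursion in NumPy. The random tanh hidden layer and all sizes are illustrative assumptions; they are not specified by the patent.

```python
import numpy as np

class OnlineELM:
    """Batch-by-batch training of a single-hidden-layer feedforward
    network's output weights (step S4), via the OS-ELM recursion."""
    def __init__(self, d_in, n_hidden, n_classes, c=100.0, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(size=(d_in, n_hidden))  # fixed random input weights
        self.b = rng.normal(size=n_hidden)
        self.c = c                                  # regularization weight
        self.K = None
        self.beta = np.zeros((n_hidden, n_classes))

    def _hidden(self, X):
        return np.tanh(X @ self.W + self.b)         # hidden-layer output H~

    def partial_fit(self, X, Y):
        H = self._hidden(X)
        if self.K is None:
            # Initial batch: K0 = H0^T H0 + I/c, beta0 = K0^-1 H0^T Y0
            self.K = H.T @ H + np.eye(H.shape[1]) / self.c
            self.beta = np.linalg.solve(self.K, H.T @ Y)
        else:
            # Recursive update: K += H^T H; beta += K^-1 H^T (Y - H beta)
            self.K += H.T @ H
            self.beta += np.linalg.solve(self.K, H.T @ (Y - H @ self.beta))
        return self

    def predict(self, X):
        return np.argmax(self._hidden(X) @ self.beta, axis=1)
```

Because the recursion is algebraically exact, the weights after the last batch equal the one-shot ridge solution β* = (H̃ᵀH̃ + I/c)⁻¹H̃ᵀỸ computed over all batches at once; only the output weights β are learned, while the hidden layer stays fixed, which is what makes the per-batch update cheap.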
After training of the single-hidden-layer feedforward neural network is finished, it is connected to the deep convolutional network used in steps S1 and S2 (with the fully connected layer removed), completing the network for classification and recognition of input images. In this embodiment, step S4 is followed by: inputting the image to be recognized into the image classification and recognition network model, and obtaining and outputting the corresponding image classification and recognition result.
FIG. 3 shows the performance of the method of this embodiment on the standard international image classification databases CIFAR-10 and CIFAR-100, compared with related representative methods. In particular, the method retains good classification and recognition performance when the image data contain noise: FIG. 4 and FIG. 5 compare the classification and recognition accuracy of this embodiment on CIFAR-10 and CIFAR-100 with that of representative methods under different label-noise intensities (0%-70%).
In addition, this embodiment provides a data-augmented deep semi-supervised over-limit learning image classification system, comprising a microprocessor and a memory connected to each other, the microprocessor being programmed or configured to execute the steps of the data-augmented deep semi-supervised over-limit learning image classification method.
This embodiment also provides a computer-readable storage medium storing a computer program programmed or configured to execute the aforementioned data-augmented deep semi-supervised over-limit learning image classification method.
In summary, the method of this embodiment obtains training images to be classified and recognized, and performs deep convolutional feature encoding and fusion on each image; trains an initial classification and recognition network on the small labeled portion of the image data and classifies the unlabeled image data to obtain corresponding pseudo-labels; augments the acquired image features together with their labels and pseudo-labels; and removes the fully connected layer of the initial classification and recognition network, replaces it with a single-hidden-layer feedforward neural network layer, and trains the weights of that layer with a data-augmented online over-limit learning machine to obtain the final image classification and recognition network model, which realizes end-to-end classification and recognition of the corresponding image targets. The method requires little manual labeling, is robust to noise interference, achieves good classification and recognition performance, and extends readily to new tasks.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-readable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description covers only preferred embodiments of the present invention, and the protection scope of the invention is not limited to these embodiments; all technical solutions within the inventive concept belong to the protection scope of the invention. It should be noted that modifications and refinements made by those skilled in the art without departing from the principle of the invention are also considered within its protection scope.
Claims (9)
1. A data-augmented deep semi-supervised over-limit learning image classification method, characterized by comprising the following steps for learning and training an image classification and recognition network model:
S1, extracting features of the training images using a deep convolutional network model; training the deep convolutional network model on the manually labeled portion of the training images to achieve fine-tuning optimization, and predicting corresponding pseudo-labels for the unlabeled training images with the fine-tuned deep convolutional network model;
s2, fusing the high-level semantic features extracted from the training images with the low-level shallow structure features to obtain fused image features;
S3, augmenting the fused image features and labels of the training images using a random linear interpolation technique;
S4, randomly dividing the augmented fused image features and labels into batches, sequentially inputting the batches into the single-hidden-layer feedforward neural network and updating the weights of the network output layer, repeating until training of the single-hidden-layer feedforward neural network is complete; removing the fully connected layer in the deep convolutional network model and connecting it with the trained single-hidden-layer feedforward neural network to form the image classification and recognition network model, realizing end-to-end classification and recognition of the corresponding image targets; the function expressions for updating the weights of the network output layer are:

$K_{k+1} = K_k + H_{k+1}^{\mathrm{T}} H_{k+1}$

$\beta_{k+1} = \beta_k + K_{k+1}^{-1} H_{k+1}^{\mathrm{T}} \left( T_{k+1} - H_{k+1}\beta_k \right)$

In the above formulas, $K_{k+1}$ and $K_k$ are weight matrices, $\beta_{k+1}$ and $\beta_k$ are the iteratively solved parameters, $H_{k+1}$ is the hidden-layer output matrix of the single-hidden-layer feedforward neural network for the input $(k+1)$-th batch of augmented data and labels, and $T_{k+1}$ is the label matrix, with initial values:

$K_0 = \dfrac{I}{c} + H_0^{\mathrm{T}} H_0, \qquad \beta_0 = K_0^{-1} H_0^{\mathrm{T}} T_0$
2. The data-augmented deep semi-supervised over-limit learning image classification method of claim 1, wherein the deep convolutional network model in step S1 is a pre-trained 13-CNN deep convolutional network model.
3. The data-augmented deep semi-supervised over-limit learning image classification method of claim 2, wherein the training loss function of the 13-CNN deep convolutional network model is as follows:
in the above-mentioned formula, the compound has the following structure,l cos a function representing the loss of training is represented,λ 0 ,λ 1 andλ 2 in order to be the weight coefficient,R 0 in order to be a consistent regularization term,R 1 in order to cross-entropy regularization terms,y i in order to be the label of the sample,the model is given a label for the unlabeled data self-estimation,x i is as followsiThe number of the samples is one,nas to the number of samples,p(y i |x i ) Is the predicted output of the model and is,lthe number of marked samples is as follows:
4. The data-augmented deep semi-supervised over-limit learning image classification method of claim 1, wherein the fused image features in step S2 are obtained by feature vectorization cascade fusion, with the function expression:
in the above-mentioned formula, the compound has the following structure,f c (x) For the fused image features after the cascade fusion,concatwhich represents a vector concatenation operation, is shown,ReLUin order to be a linear rectification function,GAPis a function of the global average pooling,f s (x), f h (x) Respectively a low-level shallow structure characteristic and a high-level semantic characteristic output by the deep neural network.
5. The data-augmented deep semi-supervised over-limit learning image classification method of claim 1, wherein the function expression for augmenting the fused image features and labels of the training images by random linear interpolation in step S3 is:
in the above formula, the first and second carbon atoms are,for the interpolated image feature matrix,X j andX i is the image feature matrix before the interpolation,for the label matrix after the interpolation, the label matrix is,Y j andY i is the label matrix before the interpolation,λis a weight vector sampled in the beta distribution.
6. The data-augmented deep semi-supervised over-limit learning image classification method of claim 5, wherein the calculation function for the weight $\lambda$ sampled from the beta distribution is:
in the above-mentioned formula, the compound has the following structure,f(x:α,γ) For weight vector sampled in beta distributionλThe expression of the computational function of (2),α,γcontrol parameters for beta distributions greater than 0 respectively,xanduare all function variable unknowns.
7. The data-augmented deep semi-supervised over-limit learning image classification method of claim 1, wherein step S4 is followed by: inputting the image to be recognized into the image classification and recognition network model, and obtaining and outputting the corresponding image classification and recognition result.
8. A data-augmented deep semi-supervised over-limit learning image classification system, comprising a microprocessor and a memory connected to each other, characterized in that the microprocessor is programmed or configured to perform the steps of the data-augmented deep semi-supervised over-limit learning image classification method of any one of claims 1 to 7.
9. A computer-readable storage medium having stored thereon a computer program programmed or configured to perform the data-augmented deep semi-supervised over-limit learning image classification method of any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110448092.XA CN113077388B (en) | 2021-04-25 | 2021-04-25 | Data-augmented deep semi-supervised over-limit learning image classification method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113077388A CN113077388A (en) | 2021-07-06 |
CN113077388B true CN113077388B (en) | 2022-08-09 |
Family
ID=76618604
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110448092.XA Active CN113077388B (en) | 2021-04-25 | 2021-04-25 | Data-augmented deep semi-supervised over-limit learning image classification method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113077388B (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113673591B (en) * | 2021-08-13 | 2023-12-01 | 上海交通大学 | Self-adjusting sampling optimization image classification method, device and medium |
CN114462558A (en) * | 2022-04-13 | 2022-05-10 | 南昌工程学院 | Data-augmented supervised learning image defect classification method and system |
CN114821204B (en) * | 2022-06-30 | 2023-04-07 | 山东建筑大学 | Meta-learning-based embedded semi-supervised learning image classification method and system |
CN115272777B (en) * | 2022-09-26 | 2022-12-23 | 山东大学 | Semi-supervised image analysis method for power transmission scene |
CN116168348B (en) * | 2023-04-21 | 2024-01-30 | 成都睿瞳科技有限责任公司 | Security monitoring method, system and storage medium based on image processing |
CN117710763A (en) * | 2023-11-23 | 2024-03-15 | 广州航海学院 | Image noise recognition model training method, image noise recognition method and device |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106897737A (en) * | 2017-01-24 | 2017-06-27 | 北京理工大学 | A kind of high-spectrum remote sensing terrain classification method based on the learning machine that transfinites |
CN106960176A (en) * | 2017-02-22 | 2017-07-18 | 华侨大学 | A kind of pedestrian's gender identification method based on transfinite learning machine and color characteristic fusion |
CN107403191A (en) * | 2017-07-03 | 2017-11-28 | 杭州电子科技大学 | A kind of semi-supervised learning machine sorting technique that transfinites with depth structure |
CN109740539A (en) * | 2019-01-04 | 2019-05-10 | 上海理工大学 | 3D object identification method based on transfinite learning machine and fusion convolutional network |
CN110598728A (en) * | 2019-07-23 | 2019-12-20 | 杭州电子科技大学 | Semi-supervised ultralimit learning machine classification method based on graph balance regularization |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104680144B (en) * | 2015-03-02 | 2018-06-05 | 华为技术有限公司 | Based on the lip reading recognition methods and device for projecting very fast learning machine |
WO2018184222A1 (en) * | 2017-04-07 | 2018-10-11 | Intel Corporation | Methods and systems using improved training and learning for deep neural networks |
CN112116088A (en) * | 2020-08-24 | 2020-12-22 | 丽水学院 | Incremental semi-supervised over-limit learning machine system for adaptively determining number of hidden nodes |
Non-Patent Citations (2)
Title |
---|
Traffic Sign Recognition Using Deep Convolutional Networks and Extreme Learning Machine;Yujun Zeng,et al.;《IScIDE 2015, Part I, LNCS 9242》;20151231;272–280 * |
Traffic Sign Recognition Using Kernel Extreme Learning Machines With Deep Perceptual Features;Yujun Zeng,et al.;《IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS》;20161231;1-7 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113077388B (en) | Data-augmented deep semi-supervised over-limit learning image classification method and system | |
CN111079532B (en) | Video content description method based on text self-encoder | |
WO2020063715A1 (en) | Method and system for training binary quantized weight and activation function for deep neural networks | |
CN107526785B (en) | Text classification method and device | |
CN111552807B (en) | Short text multi-label classification method | |
CN110110080A (en) | Textual classification model training method, device, computer equipment and storage medium | |
CN108710896B (en) | Domain learning method based on generative confrontation learning network | |
CN111275046B (en) | Character image recognition method and device, electronic equipment and storage medium | |
JP2023549579A (en) | Temporal Bottleneck Attention Architecture for Video Behavior Recognition | |
CN109389166A (en) | The depth migration insertion cluster machine learning method saved based on partial structurtes | |
CN113159072B (en) | Online ultralimit learning machine target identification method and system based on consistency regularization | |
CN113283590B (en) | Defending method for back door attack | |
CN109242097B (en) | Visual representation learning system and method for unsupervised learning | |
Flenner et al. | A deep non-negative matrix factorization neural network | |
CN110892409A (en) | Method and apparatus for analyzing images | |
Cheng et al. | A survey on deep neural network pruning-taxonomy, comparison, analysis, and recommendations | |
Ulaganathan et al. | Isolated handwritten Tamil character recognition using convolutional neural networks | |
CN114882278A (en) | Tire pattern classification method and device based on attention mechanism and transfer learning | |
CN112527959B (en) | News classification method based on pooling convolution embedding and attention distribution neural network | |
CN116075820A (en) | Method, non-transitory computer readable storage medium and apparatus for searching image database | |
Passalis et al. | Deep temporal logistic bag-of-features for forecasting high frequency limit order book time series | |
Nguyen-Duc et al. | Particle-based Adversarial Local Distribution Regularization. | |
CN115797642A (en) | Self-adaptive image semantic segmentation algorithm based on consistency regularization and semi-supervision field | |
CN115862015A (en) | Training method and device of character recognition system, and character recognition method and device | |
Chu et al. | Mixed-precision quantized neural network with progressively decreasing bitwidth for image classification and object detection |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||