CN115952851A - Self-supervision continuous learning method based on information loss mechanism - Google Patents
- Publication number: CN115952851A
- Application number: CN202211375805.5A
- Authority: CN (China)
- Prior art keywords: model, self, image, feature, learning
- Prior art date: 2022-11-04
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Abstract
The invention provides a self-supervised continuous learning method based on an information loss mechanism, which comprises: (1) an unsupervised continuous learning framework based on information loss, which causes the model to learn only the important feature representations on continuous tasks; (2) an InfoDrop loss term based on the self-supervised learning paradigm, which helps the model still extract the important feature representations of test samples after the InfoDrop mechanism has been removed in the testing stage. In addition, the unsupervised continuous learning framework proposed by the invention can be used together with most continuous learning strategies. By discarding unimportant image information, the model focuses only on the feature representations of important image information, which relieves the limitation of model capacity and improves the performance of the self-supervised model without introducing samples of historical tasks or parameter information of historical models.
Description
Technical Field
The invention belongs to the field of image processing and mainly aims to improve the performance of self-supervised continuous learning models; the method is mainly applied to the field of image classification.
Background
In recent years, Deep Learning (DL) has achieved remarkable success in fields such as machine learning and natural language processing. The focus of DL is to develop Deep Neural Networks (DNNs) through offline training on fixed or predefined data sets, on which they exhibit significant performance for the corresponding task. However, DNNs are also limited: a trained DNN is fixed, and the parameters inside the network do not change during operation, which means that the DNN remains static after deployment and cannot adapt to a changing environment. Not all real-world applications are static; in particular, applications associated with autonomous agents involve processing continuously changing data, and over time the data or tasks faced by the model may change, so static models do not perform well in such scenarios. One possible solution is to retrain the network whenever the data distribution changes; however, complete retraining on the expanded data set is computationally intensive and infeasible in real-world, resource-constrained environments, which creates the need for new algorithms that enable continuous learning with efficient use of resources.
Continuous learning presents needs and challenges in many real-world scenarios: a robot needs to autonomously learn new behaviors as its environment changes, so as to adapt to the new environment and complete new tasks; an autonomous driving program needs to adapt to different environments, for example from rural roads to highways and from well-lit locations to dim ones; intelligent dialogue systems need to adapt to different users and situations; and smart medical applications need to adapt to new cases, new hospitals and varying medical conditions.
Continuous Learning (CL) studies the problem of learning from non-stationary data streams and aims to expand the adaptive capacity of a model, so that the model can learn the corresponding knowledge of different tasks while memorizing the features learned on historical tasks. According to whether the input data carry labels, continuous learning can be divided into Supervised Continuous Learning (SCL) and Unsupervised Continuous Learning (UCL). Supervised continuous learning usually concentrates on a series of related tasks in which manually assigned labels are attached to the input data, so that the task information and task boundary information needed for generalization can be obtained; this setting no longer matches real situations: unknown task labels, undefined task boundaries and the unavailability of large amounts of class-labeled data have led to unsupervised and self-supervised continuous learning methods. Self-supervised learning is a branch of unsupervised learning that aims to eliminate the need for manual annotation in representation learning and learns representations of the data from unlabeled raw information. A true self-supervised continuous learning algorithm can use continuously arriving, non-independently-and-identically-distributed data streams to learn a robust and adaptive model without forgetting the knowledge already acquired.
In recent years, research on CL has focused mainly on SCL, and these results generally cannot be extended to practical application scenarios with biased data distributions. Research on UCL, which does not rely on manual annotation or supervision information, has therefore been receiving increasing attention. Although research in the UCL field is young, the problems are complex and results are still few, existing work has shown that relying on manually annotated data is not essential for continuous learning, that unsupervised visual representations can alleviate catastrophic forgetting, and that UCL can even outperform SCL. Reference: Madaan, D., Yoon, J., Li, Y., Liu, Y., & Hwang, S. J. (2021). Representational continuity for unsupervised continual learning. International Conference on Learning Representations. To further improve the performance of unsupervised models, a lightweight, model-independent method, information loss (InfoDrop), has attracted attention; it improves the robustness and interpretability of a model by reducing the texture bias of Convolutional Neural Networks (CNNs). The invention aims to combine the information loss mechanism with an unsupervised continuous learning framework, improve the performance of the model, construct a more robust and reasonable continuous learning model, and promote the development of unsupervised continuous learning technology.
Disclosure of Invention
The invention relates to a self-supervised continuous learning method that, by introducing the InfoDrop mechanism into a self-supervised model, leads the model to extract the important image features in continuous learning tasks. The method selects and discards unimportant image information by calculating the self-information of image patches, guiding the model to focus on the important regions of the image and thereby improving the performance of the self-supervised model.
The method first constructs a self-supervised continuous learning framework based on the information loss mechanism, divides the CIFAR-10 data set into 5 tasks, trains the model on the corresponding data sets in the order in which the tasks arrive, and tests the accuracy of the model with a KNN algorithm. The method is characterized in that the information loss mechanism is introduced into the self-supervised learning framework to improve the performance of the model. From the perspective of model capacity, the invention mainly does the following work: 1) constructing a self-supervised learning model and a self-supervised continuous learning paradigm; 2) establishing an information loss mechanism based on self-information and the Dropout method, which helps the model drop the unimportant features in an image and keep the important ones, and integrating this mechanism into the self-supervised continuous learning framework; 3) combining an InfoDrop loss term with the self-supervised loss paradigm, which avoids having to fine-tune the model after the InfoDrop mechanism is removed for testing; 4) training on the CIFAR-10 data set, testing the accuracy of the model on the test set with a KNN classification algorithm, evaluating the performance of the model, and comparing it with various continuous learning strategies. Through this work, the proposed unsupervised continuous learning method is applicable to various continuous learning strategies and can improve the performance of models under different strategies, giving it high applicability.
To facilitate the description of the present disclosure, certain terms are first defined.
Definition 1: residual convolutional neural network (ResNet). By adding 'residual connections' to a convolutional network, the degradation phenomenon of deep networks during training is resolved and the trainable depth of the neural network is greatly increased; compared with a traditional convolutional neural network, a residual network trains better and is easier to optimize. In the present invention, the residual convolutional neural network used is the ResNet18 network.
Definition 2: adaptive average pooling layer. The adaptive average pooling layer compresses the spatial dimensions by taking the average of the data in the corresponding dimensions and adaptively outputs a result of the specified size, which suppresses some useless features to a certain extent.
Definition 3: SimSiam. A simple Siamese (twin) network model; the SimSiam model maximizes the similarity between two augmented views of one image and learns representations without negative sample pairs, large batches or momentum encoders.
Definition 4: Dropout method. Dropout is a regularization method for the over-fitting problem of neural networks: a drop probability is set for the neurons of a certain layer of the network, and during training some neurons are randomly discarded according to the set probability.
Definition 5: image patch. A patch can be understood as an image block: during the operation of a neural network, the network divides the picture into many small blocks, and a convolution kernel looks at only one small block at a time; such a small block is called a patch.
Definition 6: ReLU activation layer. Also called the rectified linear unit, it is a commonly used activation function in artificial neural networks, usually referring to the nonlinear function represented by the ramp function and its variants, with expression f(x) = max(0, x).
The technical scheme of the invention is a continuous image feature extraction method based on an information loss mechanism, which comprises the following steps:
Step 1: preprocessing the data set;
acquiring real-world object images, labeling the real images according to the types of objects they contain, normalizing the pixel values of all pictures, scaling and cropping the pictures, and dividing the images into a plurality of data sets, wherein each data set comprises different categories of images;
Step 2: constructing a self-supervised learning model;
The self-supervised learning model consists of a feature encoder f_Θ and a feature prediction head h; the feature encoder f_Θ is formed by cascading a feature extraction module f_b and a feature projection module f_g, i.e. f_Θ = f_g ∘ f_b; the feature extraction module is constructed with the residual convolutional neural network ResNet18, whose first layer is a convolutional neural network block, whose second to fifth layers are residual network blocks, and whose last layer is an adaptive average pooling layer; the feature projection module is formed by connecting two linear layers; the input of the feature encoder f_Θ is an image x and its output is the feature representation z = f_Θ(x) of the image; the feature prediction head h is formed by connecting two linear layers, its input is the feature z of the image and its output is the prediction p = h(z); the block structure of the convolutional neural network is shown in Fig. 1, the block structure of the residual network block is shown in Fig. 2, and the structure of the residual convolutional neural network ResNet18 is shown in Fig. 3;
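For illustration only (not part of the claimed method), a PyTorch-style sketch of the encoder and prediction head described in this step is given below; the hidden-layer widths, the ReLU between the linear layers, and the module names are assumptions rather than values taken from the disclosure.

```python
import torch.nn as nn
import torchvision

class SelfSupervisedModel(nn.Module):
    """Sketch of the feature encoder f_Theta (ResNet-18 backbone f_b followed by a
    two-layer projection f_g) and the two-layer prediction head h."""
    def __init__(self, feat_dim=2048, pred_dim=512):
        super().__init__()
        backbone = torchvision.models.resnet18(weights=None)
        backbone.fc = nn.Identity()            # keep conv/residual blocks + adaptive avg pooling
        self.f_b = backbone                    # feature extraction module
        self.f_g = nn.Sequential(              # feature projection module: two linear layers
            nn.Linear(512, feat_dim), nn.ReLU(inplace=True),
            nn.Linear(feat_dim, feat_dim))
        self.h = nn.Sequential(                # feature prediction head: two linear layers
            nn.Linear(feat_dim, pred_dim), nn.ReLU(inplace=True),
            nn.Linear(pred_dim, feat_dim))

    def forward(self, x):
        z = self.f_g(self.f_b(x))              # feature representation z = f_Theta(x)
        p = self.h(z)                          # prediction p = h(z)
        return z, p
```

For small CIFAR-scale inputs the first convolution of the backbone is often replaced by a 3x3 convolution; that adjustment is likewise an implementation choice, not part of the description.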
Step 3: constructing a self-supervised continuous learning paradigm;
Self-supervised continuous learning aims to learn feature representations of images on a series of sequentially arriving unlabeled tasks {T_t}, t = 1, ..., T, where each task has a data set D_t with a different distribution; in general, an image x is randomly sampled from the data set D_t, and two image transformation operations are applied to x to obtain two correlated views x^1 and x^2; one view x^1 is encoded by the feature encoder to obtain its feature z^1 = f(x^1), and similarly the feature z^2 = f(x^2) of the other view x^2 is obtained; the goal of self-supervised continuous learning is that, at any time τ in training, the model learns image representations for the historical tasks {T_1, ..., T_{τ-1}} and the current task T_τ:

Θ* = argmin_Θ Σ_{t=1}^{τ} E_{x∼D_t}[ L^{SSL}(x; Θ) ] ≈ argmin_Θ Σ_{t=1}^{τ} (1/|B_t|) Σ_{x_{i,t}∈B_t} L^{SSL}_{i,t}(Θ)

wherein the mini-batch B_t is sampled from D_t, t = 1, ..., τ, the average over the mini-batch approximates the expectation operator E[·], and x_{i,t} denotes the i-th sample in the mini-batch randomly sampled from the data set D_t; the loss term L^{SSL}_{i,t} is the self-supervised learning loss, for which the self-supervised loss of SimSiam is used here:

L^{SSL}_{i,t} = -1/2 [ (p^1_{i,t}/||p^1_{i,t}||_2) · (stopgrad(z^2_{i,t})/||stopgrad(z^2_{i,t})||_2) + (p^2_{i,t}/||p^2_{i,t}||_2) · (stopgrad(z^1_{i,t})/||stopgrad(z^1_{i,t})||_2) ]

wherein z^1_{i,t} is the feature representation of x^1_{i,t} produced by the feature encoder, p^2_{i,t} is the prediction, produced by the feature prediction head, of the feature representation z^2_{i,t}, stopgrad(·) denotes stopping gradient back-propagation through its argument, and ||·||_2 is the two-norm operator;
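As an illustration of the loss term above, the SimSiam-style symmetric negative cosine similarity with stop-gradient can be sketched as follows; this is a minimal reference implementation, not the patented code.

```python
import torch.nn.functional as F

def simsiam_loss(p1, p2, z1, z2):
    """Symmetric negative cosine similarity with stop-gradient on the target
    features, matching the SimSiam-style loss described above."""
    def d(p, z):
        z = z.detach()                 # stopgrad(z): no gradient flows through the target
        return -F.cosine_similarity(p, z, dim=-1).mean()
    return 0.5 * d(p1, z2) + 0.5 * d(p2, z1)

# usage sketch: z1, p1 = model(x1); z2, p2 = model(x2); loss = simsiam_loss(p1, p2, z1, z2)
```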
However, achieving the goal of self-supervised learning is challenging: in a continuous learning setting it is usually assumed that data from historical tasks are no longer available, i.e. the data sets D_t, t = 1, ..., τ-1, cannot be accessed, while the optimal parameters Θ* of the model over the data sets D_t, t = 1, ..., τ, must still be found; therefore, continuous learning strategies need to be introduced to help the model keep its performance on historical tasks while learning the current task;
and 4, step 4: establishing an information loss mechanism
An InfoDrap mechanism, namely an information-based Dropout method, is introduced to help a continuous learning model to discard unimportant features in an image and only keep the important features; if the image patch input by the neuron contains less information, the Infodrop mechanism zeros the output of the neuron with higher probability, otherwise, keeps the output of the neuron; specifically, the first in the neural network is calculated under Boltzmann distributionThe output of the jth neuron of the c-th channel in the layer->The discarding factor of (2):
wherein,is the ^ th or greater in the neural network>Input patch for jth neuron of the c-th channel in the layer;When the self-information in the input patch of the neuron is low, the output of the neuron is discarded with a high probability, namely, the neural network is prompted to reduce the attention to the low-information area in the image; t is a temperature coefficient and is a 'soft threshold' of an InfoDrap mechanism, when T becomes small, namely the threshold is reduced, most of the patch is reserved, and only few patches with low self-information are lost; when T becomes infinite, i.e., the threshold goes high, the InfoDrop mechanism willDegenerates to the conventional Dropout mechanism, all patches will be dropped with equal probability;Is->A probability distribution of (a);
to approximate distributionInfoDrap mechanism hypothesis &>Is greater than or equal to>Is sampled from the distribution->When/is>Repeating the pattern of patch in its vicinity results in a higher ≧ greater>And therefore a low self-information; define a distribution->The estimation of (d) is:
wherein R representsThe manhattan radius of the field, | | · | |, represents the euclidean distance, h is the bandwidth, 6 is the bandwidth; fromCan be observed when->And its neighborhood->The more different the patch within, the more self-information it contains, i.e. </or>Will be set to zero with a lower probability;
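The following sketch illustrates the InfoDrop computation for the input of a single convolutional layer: a kernel-density estimate of each patch over its spatial neighborhood gives its self-information, and low self-information is converted into a high drop probability. The default values of kernel_size, radius, bandwidth, temperature and drop_rate, the wrap-around neighborhood, and the softmax normalization are assumptions for illustration and are not taken from the disclosure.

```python
import torch
import torch.nn.functional as F

def infodrop_mask(x, kernel_size=3, radius=2, bandwidth=1.0, temperature=0.1, drop_rate=0.2):
    """Return a {0,1} keep-mask for the outputs of a conv layer whose input is x
    (shape B x C x H x W): locations whose input patch has low self-information
    are dropped with higher probability."""
    B, C, H, W = x.shape
    pad = kernel_size // 2
    patches = F.unfold(x, kernel_size, padding=pad).transpose(1, 2)   # (B, H*W, C*k*k)

    # Kernel density estimate of each patch over its spatial neighborhood
    # (Manhattan radius `radius`); torch.roll wraps at the borders, a simplification.
    density = torch.zeros(B, H * W, device=x.device)
    idx = torch.arange(H * W, device=x.device).reshape(H, W)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = torch.roll(idx, shifts=(dy, dx), dims=(0, 1)).reshape(-1)
            diff = patches - patches[:, shifted, :]
            density += torch.exp(-diff.pow(2).sum(-1) / (2 * bandwidth ** 2))

    # Self-information: low density (unusual patch) -> high information -> keep.
    self_info = -torch.log(density / density.sum(dim=1, keepdim=True) + 1e-8)

    # Boltzmann form: drop probability grows as self-information falls;
    # drop_rate sets the average fraction of locations that are zeroed.
    drop_prob = drop_rate * (H * W) * torch.softmax(-self_info / temperature, dim=1)
    drop_prob = drop_prob.clamp(0.0, 1.0).reshape(B, 1, H, W)
    keep_mask = torch.bernoulli(1.0 - drop_prob)
    return keep_mask   # multiply the layer's output by this mask during training only
```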
Step 5: constructing a self-supervised continuous learning framework based on the information loss mechanism;
the method comprises the steps that a model is expected to learn feature representations of regions with important information in an image on a data set of a current task, and features of unimportant regions are ignored, so that the model can be guaranteed to be capable of learning at least key feature representations under the condition of limited model capacity; generally, an InfoDrop mechanism is implemented when a neural network model is optimized on a training set, and the InfoDrop mechanism is cancelled when the performance of the neural network model is verified on a test set, but most of areas with low self-information in an image can be discarded by the InfoDrop mechanism, so that larger distribution deviation occurs in the training data set and the test data set, and the performance of the model on the test set can be influenced; therefore, before testing the model, the model with the InfoDrop mechanism removed is usually optimized for the second time on the training set; however, the second optimization consumes additional training time and also introduces the effect of unimportant information areas in the image on the model; in order to avoid adverse effects brought by second optimization, an information loss mechanism suitable for self-supervision continuous learning is constructed based on a self-supervision learning model; when in taskWhen the model is trained, infoDrap loss is introduced on the basis of an auto-supervised loss term, and the following auto-supervised learning paradigm with an InfoDrap mechanism is constructed:
the self-supervision learning paradigm comprises two terms, wherein the first term is an original self-supervision loss term, and the second term is an InfoDrop regular term; wherein,for a model with an InfoDrap mechanism>Is recorded as &>Eyes->And f Θ Sharing the network weight; by minimizing InfoDrop regular terms, model f without the InfoDrop mechanism can be made Θ Is greater or less than>And a model with an InfoDrap mechanism @>Is greater or less than>Approximation to promote model f Θ Actively capturing the characteristics of the area with important information without adopting an InfoDrap mechanism, and ignoring unimportant characteristics; method frame schematic see figure 4
Step 6: (1) process the data set according to step 1 to obtain the data sets of a plurality of tasks; (2) construct the self-supervised learning model according to step 2; (3) train the model on the training set of each task according to the arrival order of the tasks;
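For orientation, steps 6 to 8 can be read as the following training loop; the helper functions and the task format are assumptions used only for illustration.

```python
def run_continual_learning(tasks, model, train_ssl, knn_evaluate):
    """tasks: ordered list of (train_loader, test_loader) pairs, one per task."""
    results = []
    for t, (train_loader, _) in enumerate(tasks):
        # Train the self-supervised model on the current task only; data from
        # earlier tasks is assumed to be unavailable (continuous-learning setting).
        train_ssl(model, train_loader)
        # After each task, evaluate on the test sets of all tasks seen so far
        # with a KNN classifier built on frozen features (steps 7 and 8).
        accs = [knn_evaluate(model, tasks[s][0], tasks[s][1]) for s in range(t + 1)]
        results.append(accs)
    return results
```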
Step 7: evaluating the performance of the model by using a KNN algorithm;
a) Compute the similarity between the feature f_i of a test sample x_i and each feature v_j in the feature library built from the training set (v_j = f_b(x_j)): s_ij = cos(f_i, v_j);
b) Take the K features with the largest similarities s_ij as the K-neighbor set N_K of the test sample x_i, and compute the scores of x_i over the C categories; the category with the highest score is the predicted classification of the test sample, and the score of the test sample x_i on a category is obtained by accumulating the similarities s_ij of the neighbors in N_K that belong to that category;
Step 8: after the model has been trained on each task, the feature extraction module f_b in the feature encoder f_Θ of the model is used to extract representations of the test-set images of each task, and the validity of the model's representations is then evaluated with the KNN classification algorithm; the test results are shown in Table 1.
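An illustrative sketch of the KNN evaluation of steps 7 and 8 follows; the value of K, the use of cosine similarity via normalized features, and the similarity-weighted voting are assumptions.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def knn_evaluate(f_b, train_loader, test_loader, num_classes, k=200, device="cpu"):
    """Build a feature library from the training set with the frozen feature
    extraction module f_b, then classify each test sample by similarity-weighted
    voting among its K nearest neighbors."""
    feats, labels = [], []
    for x, y in train_loader:
        feats.append(F.normalize(f_b(x.to(device)), dim=1))
        labels.append(y.to(device))
    library, labels = torch.cat(feats), torch.cat(labels)     # feature library v_i = f_b(x_i)

    correct = total = 0
    for x, y in test_loader:
        f = F.normalize(f_b(x.to(device)), dim=1)
        sims = f @ library.T                                   # cosine similarities s_ij
        topk_sims, topk_idx = sims.topk(k, dim=1)              # K-neighbor set
        votes = torch.zeros(x.size(0), num_classes, device=device)
        votes.scatter_add_(1, labels[topk_idx], topk_sims)     # per-category scores
        pred = votes.argmax(dim=1)
        correct += (pred == y.to(device)).sum().item()
        total += y.size(0)
    return correct / total
```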
The innovations here are:
(1) The invention establishes, based on the InfoDrop mechanism, a framework that prompts the self-supervised model to extract important features on continuous tasks. On a continuous learning task, the model, due to its limited capacity, must make a trade-off between preserving the feature representation abilities of past tasks and learning the feature representation abilities of the current task. By discarding unimportant image information, the framework lets the model attend only to the feature representations of important image information, which relieves the limitation of model capacity and improves the performance of the self-supervised model without introducing samples of historical tasks or parameter information of historical models.
(2) The invention designs an InfoDrop loss term based on the self-supervised loss model; by optimizing this loss term, the model retains the ability to directly extract the important feature representations of test samples after the InfoDrop mechanism is removed in the testing stage, which avoids fine-tuning the model.
Drawings
FIG. 1 is a block diagram of a convolutional network block of the method of the present invention
FIG. 2 is a block diagram of the residual convolutional neural network of the present invention
FIG. 3 is the structure diagram of the residual convolution neural network Resnet18 of the method of the present invention
FIG. 4 is a schematic diagram of the method of the present invention
Detailed Description
Step 1: preprocessing the data set;
The CIFAR-10 data set (http://www.cs.toronto.edu/~kriz/cifar.html) is downloaded; CIFAR-10 contains real-world color pictures of 10 categories. Each category contains 5000 training pictures and 1000 test pictures, and the image resolution is 32 x 32. The CIFAR-10 data set is divided into 5 tasks, the data set of each task comprises two randomly selected image categories, and the image categories of the data sets of the different tasks do not overlap;
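A sketch of this preprocessing step, splitting CIFAR-10 into 5 disjoint two-class tasks, might look as follows; the random class assignment, the seed, and the omission of the normalization and cropping transforms are simplifications.

```python
import numpy as np
from torchvision.datasets import CIFAR10

def split_cifar10_into_tasks(root="./data", num_tasks=5, seed=0):
    """Split the 10 CIFAR-10 classes into `num_tasks` disjoint tasks of 2 classes each
    and return, for every task, the indices of its training and test images."""
    rng = np.random.default_rng(seed)
    class_order = rng.permutation(10)
    task_classes = np.split(class_order, num_tasks)            # 2 classes per task

    train = CIFAR10(root, train=True, download=True)
    test = CIFAR10(root, train=False, download=True)
    tasks = []
    for classes in task_classes:
        tr_idx = [i for i, y in enumerate(train.targets) if y in classes]
        te_idx = [i for i, y in enumerate(test.targets) if y in classes]
        tasks.append({"classes": classes.tolist(), "train_idx": tr_idx, "test_idx": te_idx})
    return tasks
```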
Step 2: constructing a self-supervised learning model;
The self-supervised learning model consists of a feature encoder f_Θ and a feature prediction head h. The feature encoder f_Θ is formed by cascading a feature extraction module f_b and a feature projection module f_g, i.e. f_Θ = f_g ∘ f_b. The feature extraction module is constructed with the residual convolutional neural network ResNet18: its first layer is a convolutional neural network block, its second to fifth layers are residual network blocks, and its last layer is an adaptive average pooling layer. The feature projection module is formed by connecting two linear layers. The input of the feature encoder f_Θ is an image x and its output is the feature representation z = f_Θ(x) of the image. The feature prediction head h is formed by connecting two linear layers; its input is the feature z of the image and its output is the prediction p = h(z). The block structure of the convolutional neural network is shown in Fig. 1, the block structure of the residual network block is shown in Fig. 2, and the structure of the residual convolutional neural network ResNet18 is shown in Fig. 3.
Step 3: constructing a self-supervised continuous learning paradigm;
Self-supervised continuous learning aims to learn feature representations of images on a series of sequentially arriving unlabeled tasks {T_t}, t = 1, ..., T, where each task has a data set D_t with a different distribution. In general, an image x is randomly sampled from the data set D_t, and two image transformation operations are applied to x to obtain two correlated views x^1 and x^2. One view x^1 is encoded by the feature encoder to obtain its feature z^1 = f(x^1), and similarly the feature z^2 = f(x^2) of the other view x^2 is obtained. The goal of self-supervised continuous learning is that, at any time τ in training, the model learns image representations for the historical tasks {T_1, ..., T_{τ-1}} and the current task T_τ:

Θ* = argmin_Θ Σ_{t=1}^{τ} E_{x∼D_t}[ L^{SSL}(x; Θ) ] ≈ argmin_Θ Σ_{t=1}^{τ} (1/|B_t|) Σ_{x_{i,t}∈B_t} L^{SSL}_{i,t}(Θ)

where the mini-batch B_t is sampled from D_t, t = 1, ..., τ, the average over the mini-batch approximates the expectation operator E[·], and x_{i,t} denotes the i-th sample in the mini-batch randomly sampled from the data set D_t. The loss term L^{SSL}_{i,t} is the self-supervised learning loss, for which the self-supervised loss of SimSiam is used here:

L^{SSL}_{i,t} = -1/2 [ (p^1_{i,t}/||p^1_{i,t}||_2) · (stopgrad(z^2_{i,t})/||stopgrad(z^2_{i,t})||_2) + (p^2_{i,t}/||p^2_{i,t}||_2) · (stopgrad(z^1_{i,t})/||stopgrad(z^1_{i,t})||_2) ]

where z^1_{i,t} is the feature representation of x^1_{i,t} produced by the feature encoder, p^2_{i,t} is the prediction, produced by the feature prediction head, of the feature representation z^2_{i,t}, stopgrad(·) denotes stopping gradient back-propagation through its argument, and ||·||_2 is the two-norm operator.
However, achieving the goal of self-supervised learning is challenging. In a continuous learning setting it is usually assumed that data from historical tasks are no longer available, i.e. the data sets D_t, t = 1, ..., τ-1, cannot be accessed, while the optimal parameters Θ* of the model over the data sets D_t, t = 1, ..., τ, must still be found. Therefore, continuous learning strategies need to be introduced to help the model keep its performance on historical tasks while learning the current task.
Step 4: establishing an information loss mechanism;
The InfoDrop mechanism, an information-based Dropout method, is introduced to help the continuous learning model discard unimportant features in an image and keep only the important ones. If the image patch fed to a neuron contains little information, the InfoDrop mechanism zeros the output of that neuron with a higher probability, and otherwise keeps the output. Specifically, the drop coefficient of the output of the j-th neuron of the c-th channel in the l-th layer of the neural network is computed under a Boltzmann distribution:

r^l_{c,j} ∝ exp( -I(x^l_j) / T )

where x^l_j is the input patch of the j-th neuron of the c-th channel in the l-th layer, and I(x) := -log p(x) is defined as the self-information of x. When the self-information of a neuron's input patch is low, the output of that neuron is discarded with a higher probability, i.e. the neural network is prompted to pay less attention to low-information regions of the image. T is a temperature coefficient and acts as a 'soft threshold' of the InfoDrop mechanism: when T becomes small, i.e. the threshold is lowered, most patches are kept and only the few patches with the lowest self-information are dropped; when T becomes infinite, i.e. the threshold is raised, the InfoDrop mechanism degenerates into the conventional Dropout mechanism and all patches are dropped with equal probability. p(·) is the probability distribution of the patches x^l_j.
To approximate the distribution p, the InfoDrop mechanism assumes that all patches in the neighborhood of x^l_j are sampled from that distribution; when x^l_j repeats the pattern of the patches in its vicinity, its estimated probability is high and its self-information is therefore low. The estimate of the distribution p is defined as:

p̂(x^l_j) ∝ Σ_{k∈N(j,R)} exp( -||x^l_j - x^l_k||^2 / (2h^2) )

where N(j,R) denotes the set of patches within a Manhattan radius R of x^l_j, ||·|| denotes the Euclidean distance, and h is the bandwidth. It can be observed from this estimate that the more x^l_j differs from the patches in its neighborhood, the more self-information it contains, i.e. the lower the probability that its neuron output is set to zero.
Step 5: constructing a self-supervised continuous learning framework based on the information loss mechanism;
It is expected that, on the data set of the current task, the model learns only the feature representations of the regions of an image that carry important information and ignores the features of unimportant regions, so that under limited model capacity the model is guaranteed to learn at least the key feature representations. Usually, the InfoDrop mechanism is applied while the neural network model is optimized on the training set and is removed when the performance of the model is verified on the test set; however, because the InfoDrop mechanism discards most regions with low self-information in an image, a large distribution shift appears between the training data set and the test data set, which degrades the performance of the model on the test set. Therefore, before testing the model, the model with the InfoDrop mechanism removed is usually optimized a second time on the training set. This second optimization consumes additional training time and also re-introduces the influence of unimportant image regions on the model. To avoid the adverse effects of the second optimization, an information loss mechanism adapted to self-supervised continuous learning is constructed on the basis of the self-supervised learning model: when training on task T_τ, an InfoDrop loss is introduced on top of the self-supervised loss term, yielding a self-supervised learning paradigm with the InfoDrop mechanism that contains two terms, the first being the original self-supervised loss term and the second being the InfoDrop regularization term. The model with the InfoDrop mechanism is denoted f̂_Θ, its feature representation of the image is written ẑ, and f̂_Θ shares its network weights with f_Θ. By minimizing the InfoDrop regularization term, the feature z of the model f_Θ without the InfoDrop mechanism is made to approximate the feature ẑ of the model f̂_Θ with the InfoDrop mechanism, which prompts f_Θ to actively capture the features of regions carrying important information and to ignore unimportant features even when the InfoDrop mechanism is not applied. A schematic of the method framework is shown in Fig. 4.
Step 6: process the data set according to step 1 to obtain the data sets of a plurality of tasks; construct the self-supervised learning model according to step 2; and train the model on the training set of each task according to the order in which the tasks arrive.
Step 7: evaluating the performance of the model by using a KNN algorithm;
(1) Convert the training set D_t of task T_t into a feature library V_t = {v_i}, where v_i = f_b(x_i);
a) compute the similarity between the feature f_i of a test sample x_i and each feature v_j in the feature library: s_ij = cos(f_i, v_j);
b) take the K features with the largest similarities s_ij as the K-neighbor set N_K of the test sample x_i, and compute the scores of x_i over the C categories; the category with the highest score is the predicted category of the test sample, and the score of the test sample x_i on a category is obtained by accumulating the similarities s_ij of the neighbors in N_K that belong to that category.
Step 8: after the model has been trained on each task, the feature extraction module f_b in the feature encoder f_Θ of the model is used to extract representations of the test-set images of each task, and the validity of the model's representations is then evaluated with the KNN classification algorithm. The test results are shown in Table 1. The superiority of the proposed self-supervised continuous learning framework based on the information loss mechanism is verified on five typical continuous learning strategies: FINETUNE, DER, SI, LUMP and CaSSLe. It can be seen from Table 1 that the proposed framework significantly alleviates the catastrophic forgetting phenomenon and improves the accuracy of the model on each task.
The picture size is 32 x 32 x 3.
The picture categories are: airplanes, cars, birds, cats, deer, dogs, frogs, horses, ships, and trucks.
Learning rate: 0.003
Training batch size N: 256
Number of iterations: 200
Table 1 shows the experimental results of the method of the present invention.
Claims (1)
1. A continuous image feature extraction method based on an information loss mechanism, comprising the following steps:
Step 1: preprocessing the data set;
acquiring real-world object images, labeling the real images according to the types of objects they contain, normalizing the pixel values of all pictures, scaling and cropping the pictures, and dividing the images into a plurality of data sets, wherein each data set comprises different categories of images;
Step 2: constructing a self-supervised learning model;
the self-supervised learning model consists of a feature encoder f_Θ and a feature prediction head h; the feature encoder f_Θ is formed by cascading a feature extraction module f_b and a feature projection module f_g, i.e. f_Θ = f_g ∘ f_b; the feature extraction module is constructed with the residual convolutional neural network ResNet18, whose first layer is a convolutional neural network block, whose second to fifth layers are residual network blocks, and whose last layer is an adaptive average pooling layer; the feature projection module is formed by connecting two linear layers; the input of the feature encoder f_Θ is an image x and its output is the feature representation z = f_Θ(x) of the image; the feature prediction head h is formed by connecting two linear layers, its input is the feature z of the image and its output is the prediction p = h(z) of the image feature;
Step 3: constructing a self-supervised continuous learning paradigm;
self-supervised continuous learning aims to learn feature representations of images on a series of sequentially arriving unlabeled tasks {T_t}, t = 1, ..., T, where each task has a data set D_t with a different distribution; in general, an image x is randomly sampled from the data set D_t, and two image transformation operations are applied to x to obtain two correlated views x^1 and x^2; one view x^1 is encoded by the feature encoder to obtain its feature z^1 = f(x^1), and similarly the feature z^2 = f(x^2) of the other view x^2 is obtained; the goal of self-supervised continuous learning is that, at any time τ in training, the model learns image representations for the historical tasks {T_1, ..., T_{τ-1}} and the current task T_τ:

Θ* = argmin_Θ Σ_{t=1}^{τ} E_{x∼D_t}[ L^{SSL}(x; Θ) ] ≈ argmin_Θ Σ_{t=1}^{τ} (1/|B_t|) Σ_{x_{i,t}∈B_t} L^{SSL}_{i,t}(Θ)

wherein the loss term is computed on mini-batches B_t sampled from D_t, t = 1, ..., τ, the average over the mini-batch approximates the expectation operator E[·], and x_{i,t} denotes the i-th sample in the mini-batch randomly sampled from the data set D_t; the loss term L^{SSL}_{i,t} is the self-supervised learning loss, for which the self-supervised loss of SimSiam is used here:

L^{SSL}_{i,t} = -1/2 [ (p^1_{i,t}/||p^1_{i,t}||_2) · (stopgrad(z^2_{i,t})/||stopgrad(z^2_{i,t})||_2) + (p^2_{i,t}/||p^2_{i,t}||_2) · (stopgrad(z^1_{i,t})/||stopgrad(z^1_{i,t})||_2) ]

wherein z^1_{i,t} is the feature representation of x^1_{i,t} produced by the feature encoder, p^2_{i,t} is the prediction, produced by the feature prediction head, of the feature representation z^2_{i,t}, stopgrad(·) denotes stopping gradient back-propagation through its argument, and ||·||_2 is the two-norm operator;
however, achieving the goal of self-supervised learning is challenging; since in a continuous learning setting it is usually assumed that data from historical tasks are not available, i.e. the data sets D_t, t = 1, ..., τ-1, cannot be accessed, the optimal parameters Θ* of the model over the data sets D_t, t = 1, ..., τ, must be solved for without that data; therefore, continuous learning strategies need to be introduced to help the model keep its performance on historical tasks while learning the current task;
Step 4: establishing an information loss mechanism;
the InfoDrop mechanism, an information-based Dropout method, is introduced to help the continuous learning model discard unimportant features in an image and keep only the important ones; if the image patch fed to a neuron contains little information, the InfoDrop mechanism zeros the output of that neuron with a higher probability, and otherwise keeps the output; specifically, the drop coefficient of the output of the j-th neuron of the c-th channel in the l-th layer of the neural network is computed under a Boltzmann distribution:

r^l_{c,j} ∝ exp( -I(x^l_j) / T )

wherein x^l_j is the input patch of the j-th neuron of the c-th channel in the l-th layer, and I(x) := -log p(x) is defined as the self-information of x; when the self-information of a neuron's input patch is low, the output of that neuron is discarded with a higher probability, i.e. the neural network is prompted to pay less attention to low-information regions of the image; T is a temperature coefficient and acts as a 'soft threshold' of the InfoDrop mechanism: when T becomes small, i.e. the threshold is lowered, most patches are kept and only the few patches with the lowest self-information are dropped; when T becomes infinite, i.e. the threshold is raised, the InfoDrop mechanism degenerates into the conventional Dropout mechanism and all patches are dropped with equal probability; p(·) is the probability distribution of the patches x^l_j;
to approximate the distribution p, the InfoDrop mechanism assumes that all patches in the neighborhood of x^l_j are sampled from that distribution; when x^l_j repeats the pattern of the patches in its vicinity, its estimated probability is high and its self-information is therefore low; the estimate of the distribution p is defined as:

p̂(x^l_j) ∝ Σ_{k∈N(j,R)} exp( -||x^l_j - x^l_k||^2 / (2h^2) )

wherein N(j,R) denotes the set of patches within a Manhattan radius R of x^l_j, ||·|| denotes the Euclidean distance, and h is the bandwidth; it can be observed from this estimate that the more x^l_j differs from the patches in its neighborhood, the more self-information it contains, i.e. the lower the probability that its neuron output is set to zero;
Step 5: constructing a self-supervised continuous learning framework based on the information loss mechanism;
it is expected that, on the data set of the current task, the model learns the feature representations of the regions of an image that carry important information and ignores the features of unimportant regions, so that under limited model capacity the model is guaranteed to learn at least the key feature representations; usually, the InfoDrop mechanism is applied while the neural network model is optimized on the training set and is removed when the performance of the model is verified on the test set; however, because the InfoDrop mechanism discards most regions with low self-information in an image, a large distribution shift appears between the training data set and the test data set, which affects the performance of the model on the test set; therefore, before testing the model, the model with the InfoDrop mechanism removed is usually optimized a second time on the training set; this second optimization consumes additional training time and also re-introduces the influence of unimportant image regions on the model; to avoid the adverse effects of the second optimization, an information loss mechanism adapted to self-supervised continuous learning is constructed on the basis of the self-supervised learning model: when training on task T_τ, an InfoDrop loss is introduced on top of the self-supervised loss term, yielding a self-supervised learning paradigm with the InfoDrop mechanism that contains two terms, the first being the original self-supervised loss term and the second being the InfoDrop regularization term; the model with the InfoDrop mechanism is denoted f̂_Θ, its feature representation of the image is written ẑ, and f̂_Θ shares its network weights with f_Θ; by minimizing the InfoDrop regularization term, the feature z of the model f_Θ without the InfoDrop mechanism is made to approximate the feature ẑ of the model f̂_Θ with the InfoDrop mechanism, which prompts f_Θ to actively capture the features of regions carrying important information and to ignore unimportant features even when the InfoDrop mechanism is not applied;
Step 6: (1) processing the data set according to step 1 to obtain the data sets of a plurality of tasks; (2) constructing the self-supervised learning model according to step 2; (3) training the model on the training set of each task according to the arrival order of the tasks;
Step 7: evaluating the performance of the model by using a KNN algorithm;
a) computing the similarity between the feature f_i of a test sample x_i and each feature v_j in the feature library built from the training set (v_j = f_b(x_j)): s_ij = cos(f_i, v_j);
b) taking the K features with the largest similarities s_ij as the K-neighbor set N_K of the test sample x_i, and computing the scores of x_i over the C categories; the category with the highest score is the predicted category of the test sample, and the score of the test sample x_i on a category is obtained by accumulating the similarities s_ij of the neighbors in N_K that belong to that category;
Step 8: after the model has been trained on each task, the feature extraction module f_b in the feature encoder f_Θ of the model is used to extract representations of the test-set images of each task, and the validity of the model's representations is then evaluated with the KNN classification algorithm.
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202211375805.5A | 2022-11-04 | 2022-11-04 | Self-supervision continuous learning method based on information loss mechanism |
Publications (2)

| Publication Number | Publication Date |
|---|---|
| CN115952851A | 2023-04-11 |
| CN115952851B | 2024-10-01 |

Family
ID=87288106

Family Applications (1)

| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202211375805.5A (Active) | Self-supervision continuous learning method based on information loss mechanism | 2022-11-04 | 2022-11-04 |

Country Status (1)

| Country | Link |
|---|---|
| CN | CN115952851B (en) |
Patent Citations (3)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN109690576A * | 2016-07-18 | 2019-04-26 | DeepMind Technologies Limited (渊慧科技有限公司) | Training machine learning models in multiple machine learning tasks |
| CN114612847A * | 2022-03-31 | 2022-06-10 | Changsha University of Science and Technology | Method and system for detecting distortion of Deepfake video |
| CN114758195A * | 2022-05-10 | 2022-07-15 | Xi'an Jiaotong University | Human motion prediction method capable of realizing continuous learning |
Non-Patent Citations (2)

| Title |
|---|
| ALESSANDRO ACHILLE et al.: "Information Dropout: Learning Optimal Representations Through Noisy Computation", IEEE Transactions on Pattern Analysis and Machine Intelligence, 31 December 2018, pages 2897-2905, XP011698769, DOI: 10.1109/TPAMI.2017.2784440 * |
| MO Jianwen et al.: "Incremental Learning Based on Neuron Regularization and Resource Release" (基于神经元正则和资源释放的增量学习), Journal of South China University of Technology (Natural Science Edition), vol. 50, no. 6, 30 June 2022, pages 71-80 * |
Also Published As

| Publication number | Publication date |
|---|---|
| CN115952851B | 2024-10-01 |
Legal Events

| Code | Title |
|---|---|
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |
| GR01 | Patent grant |