CN116452897A - Cross-domain small sample classification method, system, equipment and storage medium - Google Patents
Cross-domain small sample classification method, system, equipment and storage medium
- Publication number: CN116452897A (application CN202310717959.6A)
- Authority: CN (China)
- Prior art keywords: domain, model, target domain, sample, pseudo
- Prior art date: 2023-06-16
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06V 10/764 — Image or video recognition or understanding using pattern recognition or machine learning: classification, e.g. of video objects
- G06N 3/045 — Computing arrangements based on biological models; neural networks: combinations of networks
- G06N 3/096 — Computing arrangements based on biological models; neural network learning methods: transfer learning
- G06V 10/762 — Image or video recognition or understanding using pattern recognition or machine learning: clustering, e.g. of similar faces in social networks
- G06V 10/774 — Processing image or video features in feature spaces: generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
- G06V 10/82 — Image or video recognition or understanding using pattern recognition or machine learning: using neural networks
Abstract
The invention discloses a cross-domain small sample classification method, system, device and storage medium that share a common scheme, wherein: pseudo labels for the unlabeled image samples of the target domain set are obtained directly with a clustering algorithm, so that the class-discrimination confusion caused by self-training methods is avoided entirely while the target domain semantic information is still exploited, and the time and computation consumed by training a weak classifier are also saved; and a domain-mixed training task is constructed on the source domain set and the pseudo-labeled target domain subset, using the target semantic information (expressed in the form of pseudo labels) to promote migration of the model to the target domain, thereby better improving the adaptability and classification performance of the model on the target domain.
Description
Technical Field
The present invention relates to the field of computer vision, and in particular to a cross-domain small sample classification method, system, device, and storage medium.
Background
In recent years, the development of deep learning has brought significant breakthroughs to small sample classification. Small sample classification requires rapidly acquiring new knowledge from a small number of samples; the key lies in learning prior knowledge from a training set and generalizing it to new data sets. Current small sample classification methods all assume that there is only a small domain difference between the training set and the target data set, which is rarely the case in real-world scenarios. The objective of the cross-domain small sample classification problem is to achieve small sample classification under large domain differences. As an emerging research direction, it is more realistic and practical than the traditional small sample classification problem and has attracted great attention from researchers.
The cross-domain small sample classification problem aims to identify unlabeled image samples in a target domain set, under large domain differences, given only a few labeled image samples (image samples carrying semantic class labels). It involves four basic concepts: the source domain set (source set), the target domain set (target set), the support set and the query set. Each category in the source domain set has many image samples, all of them labeled. In the target domain set, only a small number of image samples per class have labels; the rest are unlabeled. The categories of the target domain set do not overlap those of the source domain set, and there is a large difference in domain style between the two. Both the support set and the query set are sampled from the target domain set. The support set contains N categories, each with only K samples (K is small), all labeled. The query set contains only these same N categories, but all of its samples are unlabeled. A cross-domain small sample task is typically defined as an N-way K-shot task (i.e., given the support set, classify each query set sample into one of the N classes): a classifier must be trained using the source domain set and a portion of the target domain set (containing only unlabeled image samples), and the samples in the query set are then classified using the classifier and the support set.
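For illustration, the following minimal sketch shows how one such N-way K-shot task could be assembled; the function and parameter names are illustrative only (not taken from the patent), and the data is assumed to be grouped by class:

```python
import random

def sample_episode(data_by_class, n_way=5, k_shot=1, q_query=15):
    """Build one N-way K-shot task: a labeled support set plus an unlabeled query set."""
    classes = random.sample(list(data_by_class), n_way)
    support, query = [], []
    for episode_label, cls in enumerate(classes):
        samples = random.sample(data_by_class[cls], k_shot + q_query)
        # The first K samples form the support set, the rest the query set; both
        # carry the episode-local label (0..N-1) rather than the original class label.
        support += [(x, episode_label) for x in samples[:k_shot]]
        query += [(x, episode_label) for x in samples[k_shot:]]
    return support, query
```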
Current cross-domain small sample classification methods mainly rest on two techniques: the self-training method (self-training) and the pre-training-fine-tuning mode (pre-training and fine-tuning). The self-training method trains a weak classifier on a small amount of labeled data and then uses the weak classifier to label the unlabeled image samples, thereby mitigating the shortage of labeled image samples. The pre-training-fine-tuning mode is a widely used model-based transfer learning method: a model with strong generalization capability (the pre-trained model) is first obtained by training on a large data set, fine-tuning is then carried out on the downstream task, and finally a model adapted to the target task (the target classification model) is obtained. The training and testing flow of a cross-domain small sample classification model based on the self-training method and the pre-training-fine-tuning mode is therefore as follows:
(1) Perform supervised training on the source domain set to obtain a weak classifier.
(2) Label the unlabeled image samples of the target domain set with the weak classifier to obtain pseudo labels.
(3) Perform supervised training using the source domain set and the pseudo-labeled target domain samples to obtain a pre-trained model.
(4) At small sample classification time, use the support set samples as anchors to fine-tune the pre-trained model, obtaining the target classification model, and classify the query set samples.
The related research is described below in terms of the self-training method and the pre-training-fine-tuning mode, respectively.
The self-training method is mature in the prior art and has been widely applied in semi-supervised learning and domain adaptation. In self-training for cross-domain small sample learning, supervised image classification training is first performed using all the labeled image samples in the source domain set to obtain a conventional image classifier, which is then used to label the unlabeled target domain set samples and obtain pseudo labels. The prior art document "Self-training for few-shot transfer across extreme task differences" (Cheng Perng Phoo and Bharath Hariharan. Self-training for Few-shot Transfer Across Extreme Task Differences. International Conference on Learning Representations, 2021.) introduced the self-training method for the first time. That method adopts a 10-layer residual network (ResNet-10) as the feature extractor, and a fully connected layer whose output dimension matches the number of source domain set categories as the classifier for labeling the target domain set samples. After labeling, the pseudo label of a target domain set sample is the probability of the sample being predicted as each category of the source domain set. The prior art document "Dynamic distillation network for cross-domain few-shot recognition with unlabeled data" (Ashraful Islam, Chun-Fu Richard Chen, Rameswar Panda, Leonid Karlinsky, Rogerio Feris, and Richard J. Radke. Dynamic Distillation Network for Cross-Domain Few-Shot Recognition with Unlabeled Data. Advances in Neural Information Processing Systems, 34:3584-3595, 2021.) uses a dynamic distillation algorithm and consistency regularization to optimize the quality of the pseudo labels. The self-training method can, to a certain extent, explore the semantic structure of the unlabeled target domain set samples and alleviate the difficulties in model training and domain migration caused by missing labels. However, the experimental results of the above documents are not ideal, mainly because of the limitations of the self-training method under the small sample setting. Under this setting, the source domain set data and the unlabeled target domain set samples come from different categories, so it is clearly unreasonable for the self-training method to predict a probability distribution of a target domain set sample over the source domain set categories. Since pseudo labels defined over the source domain set categories are used to directly optimize a classification network aimed at the target domain set categories, the resulting model becomes confused when distinguishing cross-domain categories, which is not conducive to the desired recognition on the target domain. The present method sidesteps this technical bottleneck by not using the self-training method at all, so that the algorithm pays more attention to the target domain set samples and categories and is better suited to small sample classification in the target domain.
The pre-training-fine-tuning mode is widely used in numerous computer vision tasks, especially cross-domain small sample learning. The prior art document "A broader study of cross-domain few-shot learning" (Yunhui Guo, Noel C. Codella, Leonid Karlinsky, James V. Codella, John R. Smith, Kate Saenko, Tajana Rosing, and Rogerio Feris. A Broader Study of Cross-Domain Few-Shot Learning. In European Conference on Computer Vision, pages 124-141. Springer, 2020.) presents the current cross-domain small sample classification benchmark, and its experiments indicate that the performance of the pre-training-fine-tuning mode is significantly better than that of current meta-learning based small sample learning methods. The previously cited document "Self-training for few-shot transfer across extreme task differences" performs general supervised image classification training with the source domain set and the pseudo-labeled target domain set samples to obtain a pre-trained model; during fine-tuning, the feature extractor of the model is fixed, the support set samples are used as anchors, and the fully connected layer of the model is retrained to obtain a small sample classifier suited to the target classes.
However, the above cross-domain small sample classification methods still have major technical bottlenecks and limitations. First, the specificity of the cross-domain small sample problem setting means that a small sample classification model obtained with a self-training method becomes confused when distinguishing cross-domain categories. Second, the pre-training-fine-tuning mode currently in use is weakly targeted at the cross-domain problem: during pre-training, the pseudo-labeled image samples and the source domain samples are treated equally, the utilization of target domain samples is low, and the model's adaptation to the target domain is not genuinely emphasized, resulting in poor robustness and low accuracy.
Disclosure of Invention
The invention aims to provide a cross-domain small sample classification method, system, device, and storage medium that improve the accuracy of cross-domain small sample classification.
The aim of the invention is achieved by the following technical solutions:
a cross-domain small sample classification method, comprising:
generating pseudo labels for unlabeled image samples in the target domain set through a clustering algorithm to form a pseudo-labeled target domain subset;
sampling from the source domain set and the pseudo-labeled target domain subset with a mixed sampling strategy to form a domain-mixed episode set, assigning the domain-mixed episode set labels specific to the episodic training task, and performing episodic training on a small sample classification model to obtain a domain-mixed model;
classifying unlabeled image samples in the query set based on the domain-mixed model and the support set; wherein both the support set and the query set are sampled from the target domain set.
A cross-domain small sample classification system, comprising:
a pseudo label generation unit, used to generate pseudo labels for the unlabeled image samples in the target domain set through a clustering algorithm to form a pseudo-labeled target domain subset;
a domain-mixed learning unit, used to sample from the source domain set and the pseudo-labeled target domain subset with a mixed sampling strategy to form a domain-mixed episode set, assign the domain-mixed episode set labels specific to the episodic training task, and perform episodic training to obtain a domain-mixed model;
a classification unit, used to classify the unlabeled image samples in the query set based on the domain-mixed model and the support set; wherein both the support set and the query set are sampled from the target domain set.
A processing apparatus, comprising: one or more processors; a memory for storing one or more programs;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the aforementioned methods.
A readable storage medium storing a computer program which, when executed by a processor, implements the method described above.
According to the technical scheme provided by the invention, the pseudo labels of the unlabeled image samples in the target domain set are obtained directly with a clustering algorithm, so that the class-discrimination confusion caused by self-training methods is avoided entirely while the target domain semantic information is still exploited, and the time and computation consumed by training a weak classifier are also saved; and a domain-mixed training task is constructed on the source domain set and the pseudo-labeled target domain subset, using the target semantic information (expressed in the form of pseudo labels) to promote migration of the model to the target domain, thereby better improving the adaptability and classification performance of the model on the target domain.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention; other drawings can be obtained from these drawings by a person skilled in the art without inventive effort.
FIG. 1 is a flow chart of a cross-domain small sample classification method provided by an embodiment of the invention;
FIG. 2 is a schematic diagram of a cross-domain small sample classification method including a refinement process according to an embodiment of the present invention;
FIG. 3 is a flow chart of a pseudo label generation method according to an embodiment of the present invention;
FIG. 4 is a flow chart of a domain-mixed learning method according to an embodiment of the present invention;
FIG. 5 is a flow chart of a model refinement method for the specific target domain provided by an embodiment of the present invention;
FIG. 6 is a schematic diagram of a cross-domain small sample classification system according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of a processing device according to an embodiment of the present invention.
Detailed Description
The following clearly and completely describes the technical solutions in the embodiments of the present invention with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person skilled in the art based on the embodiments of the invention without inventive effort fall within the protection scope of the invention.
The terms that may be used herein will first be described as follows:
the terms "comprises," "comprising," "includes," "including," "has," "having" or other similar referents are to be construed to cover a non-exclusive inclusion. For example: including a particular feature (e.g., a starting material, component, ingredient, carrier, formulation, material, dimension, part, means, mechanism, apparatus, step, procedure, method, reaction condition, processing condition, parameter, algorithm, signal, data, product or article of manufacture, etc.), should be construed as including not only a particular feature but also other features known in the art that are not explicitly recited.
The cross-domain small sample classification method, system, device, and storage medium are described in detail below. Details not described in the embodiments of the present invention belong to the prior art known to those skilled in the art. Where specific conditions are not noted in the examples of the present invention, the conventional conditions in the art or the conditions suggested by the manufacturer apply.
Example 1
The embodiment of the invention provides a cross-domain small sample classification method which, as shown in FIG. 1, mainly comprises the following steps:
and step 1, generating pseudo labels of unlabeled image samples in the target domain set through a clustering algorithm to form a pseudo-labeled target domain subset.
In the embodiment of the invention, generating pseudo labels for the unlabeled image samples in the target domain set means generating, for the unlabeled image samples given in the target domain set for training, pseudo labels that express potential target domain semantic information. Specifically: a clustering algorithm is run on the unlabeled image samples in the target domain set to obtain a number of clusters; each cluster is regarded as a proxy class of the target domain set, and the label of the proxy class is taken as the pseudo label of every unlabeled image sample in that cluster.
All the given unlabeled image samples available for training, together with their pseudo labels, form the pseudo-labeled target domain subset.
Step 2: sample from the source domain set and the pseudo-labeled target domain subset with a mixed sampling strategy to form a domain-mixed episode set, assign the domain-mixed episode set labels specific to the episodic training task, and perform episodic training to obtain a domain-mixed model.
In the embodiment of the invention, a domain-mixed learning scheme is provided. Specifically, both the data processing and the model training follow an episodic training mode rather than the conventional supervised learning mode used in the prior art. First, a mixed sampling strategy is used to generate a domain-mixed episode set from the source domain set and the pseudo-labeled target domain subset simultaneously; the episode set is assigned labels specific to the episodic training task, and episodic training is then performed on the small sample classification model.
The scheme provided by the embodiment of the invention is plug-and-play: the small sample classification model (i.e., the model trained in the episodic training mode) is an existing model, for example a 10-layer residual network (ResNet-10).
Step 3: classify the unlabeled image samples in the query set based on the domain-mixed model and the support set.
In the embodiment of the invention, the domain-mixed model obtained in step 2 and the support set can be used directly to classify the unlabeled image samples in the query set; this part can be implemented with reference to conventional techniques. Both the support set and the query set here are sampled from the target domain set; their definitions can be found in the Background section above.
Preferably, in order to obtain a small sample classification model fully adapted to the target domain, the domain-mixed model may be refined to obtain a refined model (i.e., the small sample classification model for the target domain); the refined model and the support set are then used to classify the image samples in the query set, which further improves accuracy. FIG. 2 shows the corresponding flow. Refining the domain-mixed model to obtain the refined model comprises the following steps: sampling the pseudo-labeled target domain subset to obtain a target domain episode training set; using a knowledge distillation algorithm, taking the domain-mixed model as the teacher model and initializing the parameters of the student model; inputting the target domain episode training set into the teacher model and the student model and calculating the knowledge distillation loss; constructing positive and negative sample pairs from the target domain episode training set, inputting them into the student model, and calculating the contrastive learning loss; and optimizing the student model with the contrastive learning loss and the knowledge distillation loss, the optimized student model being the refined model.
According to the scheme provided by the embodiment of the invention, a clustering algorithm replaces the self-training method and directly yields the pseudo labels of the target domain set samples; the class-discrimination confusion caused by self-training is thus avoided entirely while the target domain semantic information is still exploited, and the time and labor consumed by training a weak classifier are avoided. Furthermore, to compensate for the inconsistency between domains and categories, the invention provides a domain-mixed learning scheme, which constructs a domain-mixed training task on the source domain set and the pseudo-labeled target domain subset and uses the target semantic information (expressed in the form of pseudo labels) to promote migration of the model to the target domain. In addition, the invention also provides a model refinement method for the specific target domain, which migrates the model further to the target domain so as to obtain a small sample classification model fully adapted to the target domain, improving accuracy even more.
In order to demonstrate the technical scheme and technical effects of the invention more clearly, the method provided by the embodiment of the invention is described in detail below through specific embodiments.
1. Pseudo label generation.
In the embodiment of the invention, the pseudo labels of the unlabeled image samples in the target domain set are generated through a clustering algorithm. The clustering algorithm clusters unlabeled image samples according to their similarity, dividing all of them into a number of clusters such that the similarity within a cluster is high and the similarity between clusters is low; it is an unsupervised learning method.
FIG. 3 shows a flow chart of the pseudo label generation method. First, a clustering model is trained using the unlabeled image samples in the target domain set as the training set. The clustering model computes feature representations of the unlabeled image samples in the feature space and assigns image samples with high similarity to the same cluster according to the similarity of their feature representations; here, "high similarity" means a similarity not smaller than a set threshold, whose value can be chosen according to the actual situation or by experience. Then, each cluster is regarded as a distinct proxy class of the target domain set; the proxy classes express the potential target domain semantic structure, and unlabeled image samples in the same cluster can be regarded as belonging to the same target domain category. Finally, each unlabeled image sample is labeled with the pseudo label of its proxy class, forming the pseudo-labeled target domain subset. These operations are finished offline in advance and do not participate in the training of the cross-domain small sample classification network, which reduces the time consumed during training.
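As a concrete illustration, the following is a minimal sketch of this offline step, assuming a pretrained feature extractor has already produced a feature matrix and using scikit-learn's KMeans; the patent does not prescribe a particular clustering algorithm, so k-means and the number of proxy classes are assumed choices:

```python
import numpy as np
from sklearn.cluster import KMeans

def generate_pseudo_labels(features: np.ndarray, num_proxy_classes: int) -> np.ndarray:
    """Cluster unlabeled target-domain features; each cluster becomes one proxy class."""
    kmeans = KMeans(n_clusters=num_proxy_classes, n_init=10).fit(features)
    # The cluster index of each sample serves as its pseudo label.
    return kmeans.labels_

# features: (num_unlabeled_samples, feature_dim) array from, e.g., a ResNet-10 backbone
# pseudo_labels = generate_pseudo_labels(features, num_proxy_classes=64)
```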
2. Domain-mixed learning.
Domain-mixed learning constructs a cross-domain training task using the target domain semantic information expressed as pseudo labels together with source domain set samples. In the learning process, the data processing and the model training follow an episodic training mode (episode training), a training strategy commonly used in current small sample learning: when training the model, the setting of the test stage is simulated; for example, the same support set / query set division as in testing is adopted, giving what may be called a training support set and a training query set. Training the model on training support sets and training query sets sampled from the training set simulates the test scenario during training, so the model performs better at test time.
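To make the episodic training mode concrete, the sketch below shows one way a training support set and training query set can drive a parameter update; a prototypical-network-style loss is assumed here purely for illustration, since the scheme is designed to mount on any episodic training algorithm:

```python
import torch
import torch.nn.functional as F

def episode_loss(encoder, support_x, support_y, query_x, query_y, n_way):
    """support_y / query_y are episode-local labels in 0..n_way-1 (LongTensors)."""
    s = encoder(support_x)                                  # (N*K, d) support embeddings
    # Class prototype = mean embedding of that class's support samples.
    prototypes = torch.stack([s[support_y == c].mean(dim=0) for c in range(n_way)])
    q = encoder(query_x)                                    # (N*Q, d) query embeddings
    logits = -torch.cdist(q, prototypes)                    # nearer prototype => larger logit
    return F.cross_entropy(logits, query_y)
```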
In the embodiment of the invention, a mixed sampling strategy is used to sample from the source domain set and the pseudo-labeled target domain subset to form the domain-mixed episode set. A preferred implementation is as follows: set a sampling probability p such that a category is sampled from the pseudo-labeled target domain subset with probability p and from the source domain set with probability 1-p, and extract the corresponding image samples from the pseudo-labeled target domain subset or the source domain set according to the sampled categories to form the domain-mixed episode set. The value of p may depend on the numbers of image samples in the source domain set and in the pseudo-labeled target domain subset.
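A minimal sketch of this preferred implementation follows (names and default values are illustrative): each of the episode's N classes is drawn from the pseudo-labeled target domain subset with probability p and from the source domain set otherwise:

```python
import random

def sample_mixed_episode(source_by_class, pseudo_target_by_class, p=0.5,
                         n_way=5, k_shot=1, q_query=15):
    support, query = [], []
    for episode_label in range(n_way):
        # With probability p the class comes from the pseudo-labeled target subset.
        pool = pseudo_target_by_class if random.random() < p else source_by_class
        cls = random.choice(list(pool))   # for brevity a class may repeat; sampling
                                          # classes without replacement is equally valid
        samples = random.sample(pool[cls], k_shot + q_query)
        support += [(x, episode_label) for x in samples[:k_shot]]
        query += [(x, episode_label) for x in samples[k_shot:]]
    return support, query
```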
In the embodiment of the invention, after the domain-mixed episode set is obtained by sampling, it is assigned labels specific to the episodic training task, and episodic training is then performed on the small sample classification model, yielding a model that preliminarily realizes cross-domain small sample classification, called the domain-mixed model. Here, assigning labels specific to the episodic training task means assigning each image sample in the domain-mixed episode set a unique label used for episodic training, which serves as the supervision signal during episodic training; since both the label assignment and the episodic training scheme can be implemented with reference to conventional techniques, details are omitted. FIG. 4 shows a flow chart of the domain-mixed learning method. The domain-mixed learning provided by the embodiment of the invention can be mounted directly on any episodic training algorithm, executes efficiently, and greatly improves the generalization performance of the model.
3. Model refinement for the specific target domain.
Model refinement for the specific target domain means further migrating the model to the target domain by adjusting it with the target domain semantic information expressed as pseudo labels. Specifically, the episodic training mode is again followed. First, the pseudo-labeled target domain subset is sampled to obtain a target domain episode training set. Then, using a knowledge distillation algorithm, the domain-mixed model is taken as the teacher model and the parameters of the student model are initialized. The target domain episode training set is input into the teacher model and the student model and the knowledge distillation loss is calculated; positive and negative sample pairs are constructed from the target domain episode training set, input into the student model, and the contrastive learning loss is calculated. Finally, the student model is optimized with the contrastive learning loss and the knowledge distillation loss; the optimized student model is the refined model.
In the embodiment of the present invention, the knowledge distillation algorithm extracts the knowledge contained in one trained model (the teacher model) into another model (the student model). The knowledge distillation loss is calculated as follows: the output of the teacher model is softened with a softmax function (normalized exponential function) carrying a temperature parameter and taken as the supervision information for the student model; the knowledge distillation loss is then calculated from the output of the student model and this supervision information.
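A minimal sketch of this loss follows, assuming a PyTorch implementation (the patent does not prescribe a framework) and an illustrative temperature value:

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, T=4.0):
    soft_targets = F.softmax(teacher_logits / T, dim=1)     # softened teacher output
    log_probs = F.log_softmax(student_logits / T, dim=1)
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(log_probs, soft_targets, reduction="batchmean") * T * T
```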
In the embodiment of the invention, contrastive learning is an unsupervised learning paradigm requiring the model to learn feature representations of sample data from unlabeled images. Specifically, an image sample is compared in the feature space with semantically similar instances (positive samples) and semantically dissimilar instances (negative samples); through the contrastive learning loss, the representations of positive pairs are pulled closer in the feature space and those of negative pairs are pushed farther apart, so better feature representations of the samples are learned. The contrastive learning loss is calculated as follows: two different data enhancement processes are applied to each image sample in the target domain episode training set to obtain a first enhanced sample and a second enhanced sample for each image sample; positive and negative sample pairs are constructed from the first and second enhanced samples and input into the student model, and the contrastive learning loss is calculated from the output of the student model. The first and second enhanced samples of the same image sample form a positive pair, while any combination of enhanced samples from different image samples forms a negative pair.
For example, given image samples A and B: the first and second enhanced samples of image sample A form a positive pair, and the first and second enhanced samples of image sample B form a positive pair; the first enhanced sample of image sample A forms a negative pair with each of the first and second enhanced samples of image sample B; and the second enhanced sample of image sample A likewise forms a negative pair with each of the first and second enhanced samples of image sample B.
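The sketch below implements this pairing rule as an NT-Xent-style loss (an assumed instantiation; the patent only fixes the positive/negative pairing), where z1[i] and z2[i] are the student model's embeddings of the two enhanced copies of image i:

```python
import torch
import torch.nn.functional as F

def contrastive_loss(z1, z2, temperature=0.1):
    batch = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)      # (2B, d) unit-norm embeddings
    sim = z @ z.t() / temperature                           # pairwise cosine similarities
    sim.fill_diagonal_(float("-inf"))                       # a copy is not its own positive
    # The positive of row i is its other copy: i+B for the first half, i-B for the second.
    targets = torch.cat([torch.arange(batch) + batch, torch.arange(batch)]).to(z.device)
    return F.cross_entropy(sim, targets)
```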
FIG. 5 shows a flow chart of the model refinement method for the specific target domain. The teacher model and the student model in the teacher-student framework have the same structure: the teacher model is the domain-mixed model obtained by the above training and does not participate in subsequent training, while the student model is initialized from that domain-mixed model and is trained further. In the refinement process, the two different data enhancement processes may be one strong enhancement and one weak enhancement, in which case the first and second enhanced samples may be called the strong copy and the weak copy; the specific implementation of the strong or weak enhancement follows conventional techniques and is not detailed here.
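Putting the pieces together, one refinement step might look like the sketch below, which reuses the distillation_loss and contrastive_loss sketches above; `student.embed` is a hypothetical hook returning feature embeddings, and routing the weak copy to the teacher is likewise an assumption:

```python
import torch

def refine_step(teacher, student, optimizer, weak_batch, strong_batch, lam=1.0):
    with torch.no_grad():
        teacher_logits = teacher(weak_batch)      # frozen teacher sees the weak copy
    kd = distillation_loss(student(weak_batch), teacher_logits)
    # `student.embed` is a hypothetical feature-embedding hook, not a real API.
    cl = contrastive_loss(student.embed(weak_batch), student.embed(strong_batch))
    loss = kd + lam * cl                          # lam balances the two losses (assumed)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```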
The scheme provided by the embodiment of the invention mainly has the following beneficial effects. First, the invention proposes replacing the self-training method with a clustering algorithm, which solves the class-discrimination confusion caused by self-training and also avoids the time and computation consumed by training a weak classifier. Second, the invention provides a domain-mixed learning scheme that uses the target domain semantic information (expressed as pseudo labels) to promote migration of the model to the target domain, greatly improving the generalization of the model, so the method is more efficient and robust. Finally, a model refinement method for the specific target domain is provided, which migrates the model further to the target domain to obtain a small sample classification model fully adapted to the target domain, improving accuracy even more. The invention has been tested and verified on multiple benchmark data sets; its accuracy reaches the current state of the art, and its performance is greatly improved over the prior art.
From the description of the above embodiments, it is apparent to those skilled in the art that the above embodiments may be implemented in software, or in software plus a necessary general hardware platform. With this understanding, the technical solutions of the foregoing embodiments may be embodied in a software product, which may be stored in a nonvolatile storage medium (a CD-ROM, a USB flash drive, a removable hard disk, etc.) and includes several instructions for causing a computer device (a personal computer, a server, a network device, etc.) to perform the methods of the embodiments of the present invention.
Example 2
The present invention also provides a cross-domain small sample classification system, mainly used to implement the method provided in the foregoing embodiment. As shown in FIG. 6, the system mainly comprises:
a pseudo label generation unit, used to generate pseudo labels for the unlabeled image samples in the target domain set through a clustering algorithm to form a pseudo-labeled target domain subset;
a domain-mixed learning unit, used to sample from the source domain set and the pseudo-labeled target domain subset with a mixed sampling strategy to form a domain-mixed episode set, assign the domain-mixed episode set labels specific to the episodic training task, and perform episodic training to obtain a domain-mixed model;
a classification unit, used to classify the unlabeled image samples in the query set based on the domain-mixed model and the support set; wherein both the support set and the query set are sampled from the target domain set.
It is apparent to those skilled in the art that, for convenience and brevity of description, only the above division of functional modules is illustrated; in practical application, the above functions may be allocated to different functional modules as needed, i.e., the internal structure of the system may be divided into different functional modules to perform all or part of the functions described above.
Example 3
The present invention also provides a processing device which, as shown in FIG. 7, mainly comprises: one or more processors; and a memory for storing one or more programs; wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the methods provided by the foregoing embodiments.
Further, the processing device also comprises at least one input device and at least one output device; within the processing device, the processor, the memory, the input device, and the output device are connected by a bus.
In the embodiment of the invention, the specific types of the memory, the input device and the output device are not limited; for example:
the input device can be a touch screen, an image acquisition device, a physical key or a mouse and the like;
the output device may be a display terminal;
the memory may be a random access memory (Random Access Memory, RAM) or a non-volatile memory, such as disk storage.
Example 4
The invention also provides a readable storage medium storing a computer program which, when executed by a processor, implements the method provided by the foregoing embodiments.
The readable storage medium according to the embodiment of the present invention may be provided as a computer-readable storage medium in the aforementioned processing device, for example as the memory in the processing device. The readable storage medium may be any medium capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a magnetic disk, or an optical disk.
The foregoing is only a preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions easily contemplated by those skilled in the art within the scope of the present invention should be included in the scope of the present invention. Therefore, the protection scope of the present invention should be subject to the protection scope of the claims.
Claims (10)
1. A cross-domain small sample classification method, comprising:
generating pseudo labels for unlabeled image samples in a target domain set through a clustering algorithm to form a pseudo-labeled target domain subset;
sampling from a source domain set and the pseudo-labeled target domain subset with a mixed sampling strategy to form a domain-mixed episode set, assigning the domain-mixed episode set labels specific to an episodic training task, and performing episodic training on a small sample classification model to obtain a domain-mixed model;
classifying unlabeled image samples in a query set based on the domain-mixed model and a support set; wherein both the support set and the query set are sampled from the target domain set.
2. The cross-domain small sample classification method of claim 1, wherein generating pseudo labels for the unlabeled image samples in the target domain set through a clustering algorithm comprises:
running a clustering algorithm on the unlabeled image samples in the target domain set to obtain a plurality of clusters;
and regarding each cluster as a proxy class of the target domain set, and taking the label of the proxy class as the pseudo label of each unlabeled image sample in the cluster.
3. The cross-domain small sample classification method of claim 2, wherein running a clustering algorithm on the unlabeled image samples in the target domain set to obtain a plurality of clusters comprises:
clustering according to the similarity among the unlabeled image samples, and dividing all the unlabeled image samples into a plurality of clusters.
4. The cross-domain small sample classification method of claim 1, wherein sampling from the source domain set and the pseudo-labeled target domain subset with a mixed sampling strategy to form the domain-mixed episode set comprises:
setting a sampling probability p such that a category is sampled from the pseudo-labeled target domain subset with probability p and from the source domain set with probability 1-p, and extracting the corresponding image samples from the pseudo-labeled target domain subset or the source domain set according to the sampled categories to form the domain-mixed episode set.
5. The cross-domain small sample classification method of claim 1, further comprising: refining the domain-mixed model to obtain a refined model, and classifying the image samples in the query set using the refined model and the support set;
wherein refining the domain-mixed model to obtain the refined model comprises: sampling the pseudo-labeled target domain subset to obtain a target domain episode training set; using a knowledge distillation algorithm, taking the domain-mixed model as a teacher model and initializing the parameters of a student model; inputting the target domain episode training set into the teacher model and the student model and calculating a knowledge distillation loss; constructing positive and negative sample pairs from the target domain episode training set, inputting the positive and negative sample pairs into the student model, and calculating a contrastive learning loss; and optimizing the student model with the contrastive learning loss and the knowledge distillation loss, the optimized student model being the refined model.
6. The cross-domain small sample classification method of claim 5, wherein inputting the target domain episode training set into the teacher model and the student model and calculating the knowledge distillation loss comprises:
softening the output of the teacher model with a softmax function carrying a temperature parameter and taking it as the supervision information for the student model; wherein the softmax function is a normalized exponential function;
and calculating the knowledge distillation loss from the output of the student model and the supervision information.
7. The cross-domain small sample classification method of claim 5, wherein constructing positive and negative sample pairs from the target domain episode training set, inputting the positive and negative sample pairs into the student model, and calculating the contrastive learning loss comprises:
applying two different data enhancement processes to each image sample in the target domain episode training set to obtain a first enhanced sample and a second enhanced sample for each image sample, constructing positive and negative sample pairs from the first and second enhanced samples, inputting the positive and negative sample pairs into the student model, and calculating the contrastive learning loss from the output of the student model; wherein the first and second enhanced samples of the same image sample form a positive sample pair, and any combination of enhanced samples of different image samples forms a negative sample pair.
8. A cross-domain small sample classification system, comprising:
a pseudo label generation unit, used to generate pseudo labels for the unlabeled image samples in the target domain set through a clustering algorithm to form a pseudo-labeled target domain subset;
a domain-mixed learning unit, used to sample from the source domain set and the pseudo-labeled target domain subset with a mixed sampling strategy to form a domain-mixed episode set, assign the domain-mixed episode set labels specific to the episodic training task, and perform episodic training to obtain a domain-mixed model;
a classification unit, used to classify the unlabeled image samples in the query set based on the domain-mixed model and the support set; wherein both the support set and the query set are sampled from the target domain set.
9. A processing apparatus, comprising: one or more processors; a memory for storing one or more programs;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-7.
10. A readable storage medium storing a computer program, which when executed by a processor implements the method of any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310717959.6A CN116452897B (en) | 2023-06-16 | 2023-06-16 | Cross-domain small sample classification method, system, equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310717959.6A CN116452897B (en) | 2023-06-16 | 2023-06-16 | Cross-domain small sample classification method, system, equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116452897A true CN116452897A (en) | 2023-07-18 |
CN116452897B CN116452897B (en) | 2023-10-20 |
Family
ID=87134207
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310717959.6A Active CN116452897B (en) | 2023-06-16 | 2023-06-16 | Cross-domain small sample classification method, system, equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116452897B (en) |
- 2023-06-16: CN application CN202310717959.6A granted as CN116452897B (status: active)
Patent Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20230111287A1 (en) * | 2020-06-16 | 2023-04-13 | Huawei Technologies Co., Ltd. | Learning proxy mixtures for few-shot classification |
JP2022042487A (en) * | 2020-09-02 | 2022-03-14 | 富士通株式会社 | Method for training domain adaptive neural network |
CN114241260A (en) * | 2021-12-14 | 2022-03-25 | 四川大学 | Open set target detection and identification method based on deep neural network |
CN114332568A (en) * | 2022-03-16 | 2022-04-12 | 中国科学技术大学 | Training method, system, equipment and storage medium of domain adaptive image classification network |
CN115278520A (en) * | 2022-07-08 | 2022-11-01 | 南京邮电大学 | 5G indoor positioning method based on fingerprint database migration reconstruction |
CN114998602A (en) * | 2022-08-08 | 2022-09-02 | 中国科学技术大学 | Domain adaptive learning method and system based on low confidence sample contrast loss |
CN115359074A (en) * | 2022-10-20 | 2022-11-18 | 之江实验室 | Image segmentation and training method and device based on hyper-voxel clustering and prototype optimization |
CN115879533A (en) * | 2022-12-02 | 2023-03-31 | 西安交通大学 | Analog incremental learning method and system based on analog learning |
CN115984621A (en) * | 2023-01-09 | 2023-04-18 | 宁波拾烨智能科技有限公司 | Small sample remote sensing image classification method based on restrictive prototype comparison network |
Non-Patent Citations (4)
- Da Li et al., "Episodic training for domain generalization", Proceedings of the IEEE/CVF International Conference on Computer Vision.
- Wentao Chen et al., "Cross-Domain Cross-Set Few-Shot Learning via Learning Compact and Aligned Representations", Computer Vision - ECCV 2022.
- Yuchi Qiu et al., "Cluster learning-assisted directed evolution", Nature Computational Science.
- Zhang Youjia, "Research on few-shot learning methods based on graph neural networks", China Master's Theses Full-text Database, Information Science and Technology series.
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116763259A (en) * | 2023-08-17 | 2023-09-19 | 普希斯(广州)科技股份有限公司 | Multi-dimensional control method and device for beauty equipment and beauty equipment |
CN116763259B (en) * | 2023-08-17 | 2023-12-08 | 普希斯(广州)科技股份有限公司 | Multi-dimensional control method and device for beauty equipment and beauty equipment |
Also Published As
Publication number | Publication date |
---|---|
CN116452897B (en) | 2023-10-20 |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |
| GR01 | Patent grant |