CN114782779B - Small sample image feature learning method and device based on feature distribution migration - Google Patents

Small sample image feature learning method and device based on feature distribution migration Download PDF

Info

Publication number
CN114782779B
CN114782779B CN202210487387.2A CN202210487387A CN114782779B CN 114782779 B CN114782779 B CN 114782779B CN 202210487387 A CN202210487387 A CN 202210487387A CN 114782779 B CN114782779 B CN 114782779B
Authority
CN
China
Prior art keywords
class
distribution
sample
feature
samples
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210487387.2A
Other languages
Chinese (zh)
Other versions
CN114782779A (en
Inventor
李晓旭
王湘阳
刘俊
金志宇
任凯
张文斌
曾俊瑀
李睿凡
陶剑
董洪飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lanzhou University of Technology
Original Assignee
Lanzhou University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lanzhou University of Technology filed Critical Lanzhou University of Technology
Priority to CN202210487387.2A priority Critical patent/CN114782779B/en
Publication of CN114782779A publication Critical patent/CN114782779A/en
Application granted granted Critical
Publication of CN114782779B publication Critical patent/CN114782779B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/243Classification techniques relating to the number of classes
    • G06F18/2431Multiple classes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a small sample image feature learning method and device based on feature distribution migration, which utilizes the data of a base class to combine with a gradient descent method in the early stage to optimize parameters of an embedded module and a distribution learning module, and does not need additional parameter setting when carrying out distribution correction in the later stage; in addition, it is generally assumed that each dimension in the feature representation follows a gaussian distribution, so that the mean and variance of the gaussian distribution can be passed between similar classes, reducing the bias so that statistics of these classes can be better estimated with a sufficient number of samples, and then correcting the distribution of samples using a distribution correction model, thereby classifying new classes of samples more accurately. Meanwhile, the method can be matched with any classifier and any feature extractor without additional parameters, solves the problem of prototype deviation in small sample image classification, improves the image classification effect and has high practical value.

Description

Small sample image feature learning method and device based on feature distribution migration
Technical Field
The invention relates to the technical field of image classification, in particular to a small sample image feature learning method and device based on feature distribution migration.
Background
In recent years, with the development of computer technology, people browse increasingly rich information, a large number of pictures are uploaded to a network every day, and due to the large number, people cannot classify the pictures. On many large sample image classification tasks, the recognition performance of machines has exceeded that of humans. However, when the sample size is small, the recognition level of the machine still has a large gap from that of human beings. Therefore, there is a strong social need to study efficient and reliable image classification algorithms.
The small sample classification (Few-shot Classification) belongs to the category of small sample Learning (Few-shot Learning), and often includes two types of data, namely base type data and new type data, with disjoint class spaces. Small sample classification aims at learning classification rules with knowledge learned from base class data and a small number of labeled samples (support samples) of new class data, accurately predicting the class of unlabeled samples (query samples) in new class tasks.
For the shortcomings of the prior art, first, the prior deep learning technique is not applicable to the task of classifying small samples with few marked samples. Thus, how to learn a high-recognition feature representation based on the base class data and the new class data with few labeled samples is a worth exploring problem. Second, for deviations in distributed prototypes, the learned prototype deviation (bias) is often large because of the very few marked samples. Therefore, how to improve the classification performance of small sample images by reducing prototype bias is also a challenging task. Finally, errors exist in the discriminant of the base class data features and the mobility of the base class data features on the new class data, and classification inaccuracy is easy to cause.
Disclosure of Invention
Aiming at the problem of prototype deviation in small sample image classification, the invention provides a small sample image feature learning method and device based on feature distribution migration, which mainly assumes that each dimension in feature representation follows Gaussian distribution, thus the mean value and variance of the Gaussian distribution can be transferred between similar categories, and the categories are judged by combining and comparing the distances between samples or between samples and distribution prototypes, the statistical data of the categories is better estimated under enough sample numbers, and the classification effect of the images is improved.
In order to achieve the above object, the present invention provides the following technical solutions:
in one aspect, the invention provides a small sample image feature learning method based on feature distribution migration, which comprises the following steps:
s1, preprocessing data, wherein the data comprises a training set and a testing set;
s2, pre-training an embedded module f by using base class data θ Obtaining a good characteristic space;
s3, D is train Input to the embedding module f θ Obtaining a sample characteristic diagram, and inputting the sample characteristic diagram into a distribution learning module g φ In the method, a loss function is minimized, and a distribution learning module g is optimized φ
S4, dividing the new class data into support sets
Figure BDA0003629704630000021
And query set->
Figure BDA0003629704630000022
Support set->
Figure BDA0003629704630000023
Through the embedded module f θ And a distribution learning module g φ Calculating distribution prototype of each class->
Figure BDA0003629704630000024
And->
Figure BDA0003629704630000025
/>
S5, calculating the class probability of each class in the basic class data, selecting the first n maximum classes, combining the distribution of the n classes with the distribution of the current class to obtain a corrected distribution prototype of each class
Figure BDA0003629704630000026
And->
Figure BDA0003629704630000027
S6, calculating the prediction probability of the new type query sample.
Further, the pretreatment method in step S1 is as follows:
s11, data is processed
Figure BDA0003629704630000028
Is divided into->
Figure BDA0003629704630000029
Figure BDA00036297046300000210
And->
Figure BDA00036297046300000211
Two parts, and the class spaces of the two parts are mutually exclusive, D train For adjusting parameters during training, D test Evaluating the performance of the model as new data;
s12, for the C-way K-shot classification task, from D train Randomly selecting C classes, randomly selecting M samples from each class, wherein K samples are used as support samples S i The remaining M-K samples are used as query samples Q i ,S i And Q i Form a task T i The method comprises the steps of carrying out a first treatment on the surface of the Similarly, for D test With tasks
Figure BDA00036297046300000212
Further, an embedding module f containing four convolution blocks is used in step S2 θ For image extraction features, which contain convolutional layers, pooling layers and nonlinear activation functions, each convolutional block uses a convolutional kernel with a window size of 3*3, a batch normalization, a RELU nonlinear layer, a 2 x 2 max pooling layer, and the max pooling layer of the last two blocks is clipped.
Further, the distribution learning module g in step S3 φ Is composed of two fully connected layers for extracting the distribution representation of image features to obtain each classMean and variance of individual samples.
Further, the minimizing of the loss function in step S3 uses a gradient descent algorithm, and the weights ω and the deviations b are continuously adjusted so that the value of the loss function becomes smaller and smaller.
Further, the step S3 specifically includes:
s31, D in the base class train Input embedding module f θ Sequentially passing through a convolution layer, a pooling layer and an activation function to obtain a sample feature map of each class;
s32, calculating the mean mu of each class sample feature map c Sum of variances sigma c For the embedding module f, compared with the spatial distribution of the characteristics of the pre-training samples θ Is adjusted by calculating the mean mu of each class according to formulas (1) and (2) c Sum of variances sigma c
Figure BDA0003629704630000031
Figure BDA0003629704630000032
In which x is i Feature vector expressed as the ith sample of C in base class, n c Expressed as the total number of samples in class C;
s33, inputting various sample feature images into the distribution learning module g φ Obtaining the average value of each sample
Figure BDA0003629704630000033
Sum of variances->
Figure BDA0003629704630000034
Calculating each sample x using Gaussian distribution formula (3) i Category probability:
Figure BDA0003629704630000035
sigma in c Covariance matrix expressed as C-class characteristics, and the calculation formula is shown as formula (4):
Figure BDA0003629704630000036
s34, minimizing a loss function by using a cross entropy formula, and optimizing a distribution learning module g φ Parameters, the formula is shown as (5):
Figure BDA0003629704630000037
where y is represented as a set of labeled feature vectors.
Further, the step 4 is specifically as follows:
s41, each task
Figure BDA0003629704630000038
By support set->
Figure BDA0003629704630000039
And query set->
Figure BDA00036297046300000310
Composition;
s42, supporting the collection
Figure BDA0003629704630000041
Input embedding module f θ Obtaining the average value mu of each class sample characteristic diagram c Sum of variances sigma c
S43, inputting various sample feature images into the distribution learning module g φ Obtaining the average value of each sample
Figure BDA0003629704630000042
Sum of variances
Figure BDA0003629704630000043
S44, according toMean value of each sample
Figure BDA0003629704630000044
Sum of variances->
Figure BDA0003629704630000045
Calculating +.about.using equations (6) and (7)>
Figure BDA0003629704630000046
Distribution prototype of each class->
Figure BDA0003629704630000047
And->
Figure BDA0003629704630000048
Figure BDA0003629704630000049
Figure BDA00036297046300000410
In the formula (6), S c Represented as a support set
Figure BDA00036297046300000411
Class C, x i Expressed as support set->
Figure BDA00036297046300000412
In the class C sample of (C),
Figure BDA00036297046300000413
represented as sample x i Mean, mu c Expressed as the mean of class C, i.e., the distribution of class C, the overall expression (6) is expressed as a weighted harmonic mean of class C to represent the locations of distribution prototypes of different classes in the model for tightening intra-class relationships and satisfying recognition gaps;
the purpose of equation (7) is to solve for the variance of class C to eliminate class independent representations of individual data with sufficient class information to reduce the magnitude variation of the overall class information.
Further, step S5 specifically includes:
s51, calculating the class probability of each class in the basic class sample data, wherein the formula is as follows:
Figure BDA00036297046300000414
in the method, in the process of the invention,
Figure BDA00036297046300000415
the mean and variance of class C in the base class sample data is represented as subject to gaussian distribution,
Figure BDA00036297046300000416
expressed as support set->
Figure BDA00036297046300000417
Average value of class C, S d Expressed as support set->
Figure BDA00036297046300000418
The distribution of class C is used as an input, and a distance set is compared with the distribution of class C in the basic class sample data;
s52, selecting the first n maximum categories, combining the distribution of the n categories with the distribution of the current category, and adopting the following formula:
Figure BDA00036297046300000419
where topn (·) is expressed as a slave input distance set S d An operator for selecting a top element, S N To store the nearest n base class sample data with respect to the feature vector;
s53, inputting the combined classes into formulas (6) and (7) to obtain corrected distribution prototypes of each class
Figure BDA00036297046300000420
And->
Figure BDA00036297046300000421
/>
Figure BDA0003629704630000051
Figure BDA0003629704630000052
In the formula (6), S c Represented as a support set
Figure BDA0003629704630000053
Class C, x i Expressed as support set->
Figure BDA0003629704630000054
In the class C sample of (C),
Figure BDA0003629704630000055
represented as sample x i Mean, mu c Expressed as the mean of class C, i.e., the distribution of class C, the overall expression (6) is expressed as a weighted harmonic mean of class C to represent the locations of distribution prototypes of different classes in the model for tightening intra-class relationships and satisfying recognition gaps;
the purpose of equation (7) is to solve for the variance of class C to eliminate class independent representations of individual data with sufficient class information to reduce the magnitude variation of the overall class information.
Further, the step S6 specifically includes:
s61, the query set of the new class data
Figure BDA0003629704630000056
Sample information input embedding module f of (2) θ Obtaining the average value mu of each class sample characteristic diagram c Sum of variances sigma c
S62, inputting various sample feature images into a distribution learning module g φ Obtaining the average value of each sample
Figure BDA0003629704630000057
Sum of variances->
Figure BDA0003629704630000058
S63, the average value of each sample is calculated
Figure BDA0003629704630000059
Sum of variances->
Figure BDA00036297046300000510
Inputting a formula (3), calculating the prediction probability of a new class query sample, inputting the prediction probability into a measurement module, and outputting a corresponding class label:
Figure BDA00036297046300000511
sigma in c Covariance matrix expressed as C-class characteristics, and the calculation formula is shown as formula (4):
Figure BDA00036297046300000512
on the other hand, the invention also provides a task self-adaptive measurement learning device facing the small sample image, which is used for realizing any one of the methods, and comprises the following modules:
the embedding module is used for carrying out feature extraction processing on the image samples and constructing a feature space, wherein the image samples comprise a base class sample, a new class support sample and a query sample;
the distribution learning module is used for extracting distribution representation of image characteristics and obtaining the mean value and variance of each sample in the class;
the distribution correction module is used for carrying out distribution correction on the new class samples by using the distribution of the base class samples and constructing an image distribution correction model;
and the measurement module is used for classifying the new class query set samples by using the optimized distribution of the base class samples and obtaining class labels.
Compared with the prior art, the invention has the beneficial effects that:
the invention establishes the small sample image feature learning method and device based on feature distribution migration, can be matched with any classifier and feature extractor, does not need additional parameters, solves the problem of prototype deviation in small sample image classification, improves the image classification effect, and has high practical value.
The device optimizes the parameters of the embedded module and the distribution learning module by utilizing the data of the base class and the gradient descent method in the early stage, and does not need additional parameter setting when carrying out distribution correction in the later stage; in addition, it is generally assumed that each dimension in the feature representation follows a gaussian distribution, so that the mean and variance of the gaussian distribution can be passed between similar classes, reducing the bias so that statistics of these classes can be better estimated with a sufficient number of samples, and then correcting the distribution of samples using a distribution correction model, thereby classifying new classes of samples more accurately.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments described in the present invention, and other drawings may be obtained according to these drawings for a person having ordinary skill in the art.
Fig. 1 is a flowchart of a small sample image feature learning method based on feature distribution migration according to an embodiment of the present invention.
Fig. 2 is a feature learning network structure diagram of transfer learning and distribution transfer of a feature learning model of a small sample image based on feature distribution transfer according to an embodiment of the present invention.
Fig. 3 is a flowchart of a distribution correction module according to an embodiment of the present invention.
Fig. 4 is a schematic diagram of a functional module of a small sample image feature learning device based on feature distribution migration according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. Embodiments of the present invention are intended to be within the scope of the present invention as defined by the appended claims.
According to one aspect disclosed herein, there is provided a small sample image feature learning method based on feature distribution migration, as shown in fig. 1, including the following steps:
s1, preprocessing data, wherein the data comprises a training set and a testing set;
specifically, the pretreatment method of step S1 includes:
s11, data is processed
Figure BDA0003629704630000071
Is divided into->
Figure BDA0003629704630000072
Figure BDA0003629704630000073
And->
Figure BDA0003629704630000074
Two parts, and the class spaces of the two parts are mutually exclusive, D train For adjusting parameters during training, D test Evaluating the performance of the model as new data;
s12, for the C-way K-shot classification task, from D train Randomly selecting C classes, randomly selecting M samples from each class, wherein K samples are used as support samples S i The remaining M-K samples are used as queriesSample Q i ,S i And Q i Form a task T i The method comprises the steps of carrying out a first treatment on the surface of the Similarly, for D test With tasks
Figure BDA0003629704630000075
S2, pre-training an embedded module f by using base class data θ Obtaining a good characteristic space;
further, an embedding module f containing four convolution blocks is used in step S2 θ For image extraction features, which contain convolutional layers, pooling layers and nonlinear activation functions, each convolutional block uses a convolutional kernel with a window size of 3*3, a batch normalization, a RELU nonlinear layer, a 2 x 2 max pooling layer, and the max pooling layer of the last two blocks is clipped. For example, for an 84 x 3RGB image, a 3x3 convolution kernel with 64 filters is used for each block.
S3, D is train Input to the embedding module f θ Obtaining a sample characteristic diagram, and inputting the sample characteristic diagram into a distribution learning module g φ In the method, a loss function is minimized, and a distribution learning module g is optimized φ
Wherein, the distribution learning module g φ The method consists of two full connection layers, and is used for extracting distribution representation of image characteristics to obtain the mean value and variance of each sample in the class.
The minimizing of the loss function uses a gradient descent algorithm, constantly adjusting the weights ω and the deviations b so that the value of the loss function becomes smaller and smaller. Random gradient descent, batch gradient descent, etc. may also be substituted.
Specifically, step S3 includes:
s31, D in the base class train Input embedding module f θ Sequentially passing through a convolution layer, a pooling layer and an activation function to obtain a sample feature map of each class;
s32, calculating the mean mu of each class sample feature map c Sum of variances sigma c For the embedding module f, compared with the spatial distribution of the characteristics of the pre-training samples θ Is adjusted according to the formulas (1) and (2)Mean μ of each class c Sum of variances sigma c
Figure BDA0003629704630000081
Figure BDA0003629704630000082
In which x is i Feature vector expressed as the ith sample of C in base class, n c Expressed as the total number of samples in class C;
s33, inputting various sample feature images into the distribution learning module g φ Obtaining the average value of each sample
Figure BDA0003629704630000083
Sum of variances->
Figure BDA0003629704630000084
Calculating each sample x using Gaussian distribution formula (3) i Category probability:
Figure BDA0003629704630000085
sigma in c Covariance matrix expressed as C-class characteristics, and the calculation formula is shown as formula (4):
Figure BDA0003629704630000086
s34, minimizing a loss function by using a cross entropy formula, and optimizing a distribution learning module g φ Parameters, the formula is shown as (5):
Figure BDA0003629704630000087
where y is represented as a set of labeled feature vectors.
S4, supporting the collection
Figure BDA0003629704630000088
Through the embedded module f θ And a distribution learning module g φ Calculating distribution prototype of each class->
Figure BDA0003629704630000089
And->
Figure BDA00036297046300000810
Specifically, step 4 includes:
s41, dividing the new class data into support sets
Figure BDA00036297046300000811
And query set->
Figure BDA00036297046300000812
Every task->
Figure BDA00036297046300000813
By support set->
Figure BDA00036297046300000814
And a query set
Figure BDA00036297046300000815
Composition;
s42, supporting the collection
Figure BDA00036297046300000816
Input embedding module f θ Obtaining the average value mu of each class sample characteristic diagram c Sum of variances sigma c
S43, inputting various sample feature images into the distribution learning module g φ Obtaining the average value of each sample
Figure BDA0003629704630000091
Sum of variances
Figure BDA0003629704630000092
S44, according to the average value of each sample
Figure BDA0003629704630000093
Sum of variances->
Figure BDA0003629704630000094
Calculating +.about.using equations (6) and (7)>
Figure BDA0003629704630000095
Distribution prototype of each class->
Figure BDA0003629704630000096
And->
Figure BDA0003629704630000097
/>
Figure BDA0003629704630000098
Figure BDA0003629704630000099
In the formula (6), S c Represented as a support set
Figure BDA00036297046300000910
Class C, x i Expressed as support set->
Figure BDA00036297046300000911
In the class C sample of (C),
Figure BDA00036297046300000912
represented as sample x i Mean, mu c Expressed as the mean of class C, i.e., the distribution of class C, the overall expression (6) is expressed as a weighted harmonic mean of class C to represent the locations of the distribution prototypes of the different classes in the model for tightening within the classThe relation sum satisfies the recognition gap;
the purpose of equation (7) is to solve for the variance of class C to eliminate class independent representations of individual data with sufficient class information to reduce the magnitude variation of the overall class information.
S5, calculating the class probability of each class in the basic class data, selecting the first n maximum classes, combining the distribution of the n classes with the distribution of the current class to obtain a corrected distribution prototype of each class
Figure BDA00036297046300000913
And->
Figure BDA00036297046300000914
Specifically, step S5 includes:
s51, calculating the class probability of each class in the basic class sample data, wherein the formula is as follows:
Figure BDA00036297046300000915
in the method, in the process of the invention,
Figure BDA00036297046300000916
the mean and variance of class C in the base class sample data is represented as subject to gaussian distribution,
Figure BDA00036297046300000917
expressed as support set->
Figure BDA00036297046300000918
Average value of class C, S d Expressed as support set->
Figure BDA00036297046300000919
The distribution of class C is used as an input, and a distance set is compared with the distribution of class C in the basic class sample data;
s52, selecting the first n maximum categories, combining the distribution of the n categories with the distribution of the current category, and adopting the following formula:
Figure BDA00036297046300000920
where topn (·) is expressed as a slave input distance set S d An operator for selecting a top element, S N To store the nearest n base class sample data with respect to the feature vector;
s53, inputting the combined classes into formulas (6) and (7) to obtain corrected distribution prototypes of each class
Figure BDA00036297046300000921
And
Figure BDA00036297046300000922
Figure BDA0003629704630000101
Figure BDA0003629704630000102
in the formula (6), S c Represented as a support set
Figure BDA0003629704630000103
Class C, x i Expressed as support set->
Figure BDA0003629704630000104
In the class C sample of (C),
Figure BDA0003629704630000105
represented as sample x i Mean, mu c Expressed as the mean of class C, i.e., the distribution of class C, the overall expression (6) is expressed as a weighted harmonic mean of class C to represent the locations of distribution prototypes of different classes in the model for tightening intra-class relationships and satisfying recognition gaps; />
The purpose of equation (7) is to solve for the variance of class C to eliminate class independent representations of individual data with sufficient class information to reduce the magnitude variation of the overall class information.
S6, calculating the prediction probability of the new type query sample, and outputting the type label.
Specifically, step S6 includes:
s61, the query set of the new class data
Figure BDA0003629704630000106
Sample information input embedding module f of (2) θ Obtaining the average value mu of each class sample characteristic diagram c Sum of variances sigma c
S62, inputting various sample feature images into a distribution learning module g φ Obtaining the average value of each sample
Figure BDA0003629704630000107
Sum of variances
Figure BDA0003629704630000108
S63, the average value of each sample is calculated
Figure BDA0003629704630000109
Sum of variances->
Figure BDA00036297046300001010
Inputting a formula (3), calculating the prediction probability of a new class query sample, inputting the prediction probability into a measurement module, and outputting a corresponding class label:
Figure BDA00036297046300001011
sigma in c Covariance matrix expressed as C-class characteristics, and the calculation formula is shown as formula (4):
Figure BDA00036297046300001012
on the other hand, the invention also provides a task self-adaptive measurement learning device facing the small sample image, which is used for realizing any one of the methods, and comprises the following modules:
the embedding module is used for carrying out feature extraction processing on the image samples and constructing a feature space, wherein the image samples comprise a base class sample, a new class support sample and a query sample;
the distribution learning module is used for extracting distribution representation of image characteristics and obtaining the mean value and variance of each sample in the class;
the distribution correction module is used for carrying out distribution correction on the new class samples by using the distribution of the base class samples and constructing an image distribution correction model;
and the measurement module is used for classifying the new class query set samples by using the optimized distribution of the base class samples and obtaining class labels.
The small sample image feature learning method based on feature distribution migration is established on a pre-training feature extractor and a classification model, can be matched with any classifier and feature extractor, and does not need additional parameters; in the small sample image feature learning method device based on feature distribution migration, a distribution learning module is adopted, and it is generally assumed that each dimension in feature representation follows Gaussian distribution, so that the mean value and variance of the Gaussian distribution can be transferred between similar categories, and the weighted harmonic mean of the corresponding categories is calculated by calculating the mean value and variance of each sample to represent the positions of distribution prototypes of different categories in a model, and the distances represented by each category are compared to learn to classify the samples. The aim is to update and classify the query sample by fusing the sample characteristics connected with the query sample and the similarity of the distances between the sample characteristics.
The specific embodiments of the small sample image feature learning method and model based on feature distribution migration are set forth above in connection with the accompanying drawings. The implementation of the method and apparatus will be apparent to those skilled in the art from the description of the embodiments above.
It should be noted that, in the drawings and the text of the specification, the undescribed implementation is a form known to those skilled in the art, and not described in detail. Furthermore, the above definitions of the elements and methods are not limited to the specific structures, shapes or modes mentioned in the examples, and may be simply modified or replaced by those of ordinary skill in the art.
Furthermore, unless specifically described or steps must occur sequentially, the order of the steps is not limited to that listed above and may be varied or rearranged according to the desired design. And the above examples can be mixed and matched with each other or other examples based on design and reliability, i.e. the technical features in different implementations can be freely combined to form more implementation examples. The algorithms and displays presented herein are not inherently related to any particular computer, virtual system, or other apparatus. Various general-purpose systems may also be used with the teachings herein. The required structure for a construction of such a system is apparent from the description above. In addition, the disclosure herein is not directed to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings of the present disclosure as described herein, and the above description of specific languages is provided for disclosure of enablement and best mode of the present disclosure.
Similarly, it should be appreciated that in the above description of exemplary embodiments disclosed herein, various features disclosed herein are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various disclosed aspects. However, the disclosed method should not be construed as reflecting the following schematic diagram: i.e., the claims are directed to the disclosed herein with more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this disclosure.
The foregoing examples are merely specific embodiments of the present application, and are not intended to limit the scope of the present application, but the present application is not limited thereto, and those skilled in the art will appreciate that while the foregoing examples are described in detail, the present application is not limited thereto. Any person skilled in the art, within the technical scope of the disclosure of the present application, may modify or easily conceive of changes to the technical solutions described in the foregoing embodiments or make equivalent substitutions for some of the technical details; such modifications, changes or substitutions do not depart from the spirit and scope of the corresponding technical solutions. Are intended to be encompassed within the scope of this application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (9)

1. The small sample image feature learning method based on feature distribution migration is characterized by comprising the following steps of:
s1, preprocessing data, wherein the data comprises a training set and a testing set;
s2, pre-training an embedded module f by using base class data θ Obtaining a feature space;
s3, D is train Input to the embedding module f θ Obtaining a sample characteristic diagram, and inputting the sample characteristic diagram into a distribution learning module g φ In the method, a loss function is minimized, and a distribution learning module g is optimized φ
S4, dividing the new class data into support sets
Figure FDA0004197274470000011
And query set->
Figure FDA0004197274470000012
Support set->
Figure FDA0004197274470000013
Through the embedded module f θ Distribution learning moduleg φ Calculating distribution prototype of each class->
Figure FDA0004197274470000014
And->
Figure FDA0004197274470000015
S5, calculating the class probability of each class in the basic class data, selecting the first n maximum classes, combining the distribution of the n classes with the distribution of the current class to obtain a corrected distribution prototype of each class
Figure FDA0004197274470000016
And->
Figure FDA0004197274470000017
The step S5 specifically comprises the following steps:
s51, calculating the class probability of each class in the basic class sample data, wherein the formula is as follows:
Figure FDA0004197274470000018
in the method, in the process of the invention,
Figure FDA0004197274470000019
the mean and variance of class C in the base class sample data are expressed as Gaussian distribution, ++>
Figure FDA00041972744700000110
Expressed as support set->
Figure FDA00041972744700000111
Average value of class C, S d Expressed as support set->
Figure FDA00041972744700000112
The distribution of class C is used as input and is phase with the distribution of class C in the basic class sample dataA set of distances compared;
s52, selecting the first n maximum categories, combining the distribution of the n categories with the distribution of the current category, and adopting the following formula:
Figure FDA00041972744700000113
where topn (·) is expressed as a slave input distance set S d An operator for selecting a top element, S N To store the nearest n base class sample data with respect to the feature vector;
s53, inputting the combined classes into formulas (6) and (7) to obtain corrected distribution prototypes of each class
Figure FDA00041972744700000114
And->
Figure FDA00041972744700000115
Figure FDA00041972744700000116
Figure FDA00041972744700000117
In the formula (6), S c Represented as a support set
Figure FDA00041972744700000118
Class C, x i Expressed as support set->
Figure FDA00041972744700000119
Samples in class C of (a), +.>
Figure FDA00041972744700000120
Represented as sample x i Mean, mu c Expressed as the mean of class C, i.e., the distribution of class C, the overall expression (6) is expressed as a weighted harmonic mean of class C to represent the locations of distribution prototypes of different classes in the model for tightening intra-class relationships and satisfying recognition gaps;
the purpose of equation (7) is to solve for the variance of class C to eliminate class independent representation of individual data with sufficient class information to reduce the magnitude variation of overall class information;
s6, calculating the prediction probability of the new type query sample.
2. The small sample image feature learning method based on feature distribution migration of claim 1, wherein the preprocessing method of step S1 is as follows:
s11, data is processed
Figure FDA0004197274470000021
Is divided into->
Figure FDA0004197274470000022
Figure FDA0004197274470000023
And->
Figure FDA0004197274470000024
Two parts, and the class spaces of the two parts are mutually exclusive, D train For adjusting parameters during training, D test Evaluating the performance of the model as new data;
s12, for the C-way K-shot classification task, from D train Randomly selecting C classes, randomly selecting M samples from each class, wherein K samples are used as support samples S i The remaining M-K samples are used as query samples Q i ,S i And Q i Form a task T i The method comprises the steps of carrying out a first treatment on the surface of the Similarly, for D test With tasks
Figure FDA0004197274470000025
3. The method for learning small sample image features based on feature distribution migration according to claim 1, wherein an embedding module f including four convolution blocks is used in step S2 θ For image extraction features, which contain convolutional layers, pooling layers and nonlinear activation functions, each convolutional block uses a convolutional kernel with a window size of 3*3, a batch normalization, a RELU nonlinear layer, a 2 x 2 max pooling layer, and the max pooling layer of the last two blocks is clipped.
4. The small sample image feature learning method based on feature distribution migration of claim 1, wherein the feature distribution learning module g in step S3 φ The method consists of two full connection layers, and is used for extracting distribution representation of image characteristics to obtain the mean value and variance of each sample in the class.
5. The small sample image feature learning method based on feature distribution migration according to claim 1, wherein the minimizing of the loss function in step S3 uses a gradient descent algorithm, and the value of the loss function becomes smaller and smaller by continuously adjusting the weight ω and the deviation b.
6. The small sample image feature learning method based on feature distribution migration of claim 1, wherein step S3 specifically includes:
s31, D in the base class train Input embedding module f θ Sequentially passing through a convolution layer, a pooling layer and an activation function to obtain a sample feature map of each class;
s32, calculating the mean mu of each class sample feature map c Sum of variances sigma c For the embedding module f, compared with the spatial distribution of the characteristics of the pre-training samples θ Is adjusted by calculating the mean mu of each class according to formulas (1) and (2) c Sum of variances sigma c
Figure FDA0004197274470000031
Figure FDA0004197274470000032
In which x is i Feature vector expressed as the ith sample of C in base class, n c Expressed as the total number of samples in class C;
s33, inputting various sample feature images into the distribution learning module g φ Obtaining the average value mu of each sample xi Sum of variances
Figure FDA0004197274470000033
Calculating each sample x using Gaussian distribution formula (3) i Category probability:
Figure FDA0004197274470000034
sigma in c Covariance matrix expressed as C-class characteristics, and the calculation formula is shown as formula (4):
Figure FDA0004197274470000035
s34, minimizing a loss function by using a cross entropy formula, and optimizing a distribution learning module g φ Parameters, the formula is shown as (5):
Figure FDA0004197274470000036
/>
where y is represented as a set of labeled feature vectors.
7. The small sample image feature learning method based on feature distribution migration of claim 1, wherein step S4 specifically comprises the following steps:
s41, dividing the new class data into support sets
Figure FDA0004197274470000037
And query set->
Figure FDA0004197274470000038
Every task->
Figure FDA0004197274470000039
By support set->
Figure FDA00041972744700000310
And query set->
Figure FDA00041972744700000311
Composition;
s42, supporting the collection
Figure FDA00041972744700000312
Input embedding module f θ Obtaining the average value mu of each class sample characteristic diagram c Sum of variances sigma c
S43, inputting various sample feature images into the distribution learning module g φ Obtaining the average value of each sample
Figure FDA00041972744700000313
Sum of variances->
Figure FDA00041972744700000314
S44, according to the average value of each sample
Figure FDA00041972744700000315
Sum of variances->
Figure FDA00041972744700000316
Calculating +.about.using equations (6) and (7)>
Figure FDA00041972744700000317
Distribution prototype of each class->
Figure FDA0004197274470000041
And->
Figure FDA0004197274470000042
Figure FDA0004197274470000043
Figure FDA0004197274470000044
In the formula (6), S c Represented as a support set
Figure FDA0004197274470000045
Class C, x i Expressed as support set->
Figure FDA0004197274470000046
Samples in class C of (a), +.>
Figure FDA0004197274470000047
Represented as sample x i Mean, mu c Expressed as the mean of class C, i.e., the distribution of class C, the overall expression (6) is expressed as a weighted harmonic mean of class C to represent the locations of distribution prototypes of different classes in the model for tightening intra-class relationships and satisfying recognition gaps;
the purpose of equation (7) is to solve for the variance of class C to eliminate class independent representations of individual data with sufficient class information to reduce the magnitude variation of the overall class information.
8. The small sample image feature learning method based on feature distribution migration of claim 1, wherein step S6 specifically comprises:
s61, the query set of the new class data
Figure FDA0004197274470000048
Sample information input embedding module f of (2) θ Obtaining the average value mu of each class sample characteristic diagram c Sum of variances sigma c
S62, inputting various sample feature images into a distribution learning module g φ Obtaining the average value of each sample
Figure FDA0004197274470000049
Sum of variances->
Figure FDA00041972744700000410
S63, the average value of each sample is calculated
Figure FDA00041972744700000411
Sum of variances->
Figure FDA00041972744700000412
Inputting a formula (3), calculating the prediction probability of a new class query sample, inputting the prediction probability into a measurement module, and outputting a corresponding class label:
Figure FDA00041972744700000413
sigma in c Covariance matrix expressed as C-class characteristics, and the calculation formula is shown as formula (4):
Figure FDA00041972744700000414
9. a task adaptive metric learning device for small sample image, characterized in that it is configured to implement the task adaptive metric learning method for small sample image according to any one of claims 1-8, and comprises the following modules:
the embedding module is used for carrying out feature extraction processing on the image samples and constructing a feature space, wherein the image samples comprise a base class sample, a new class support sample and a query sample;
the distribution learning module is used for extracting distribution representation of image characteristics and obtaining the mean value and variance of each sample in the class;
the distribution correction module is used for carrying out distribution correction on the new class samples by using the distribution of the base class samples and constructing an image distribution correction model;
and the measurement module is used for classifying the new class query set samples by using the optimized distribution of the base class samples and obtaining class labels.
CN202210487387.2A 2022-05-06 2022-05-06 Small sample image feature learning method and device based on feature distribution migration Active CN114782779B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210487387.2A CN114782779B (en) 2022-05-06 2022-05-06 Small sample image feature learning method and device based on feature distribution migration

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210487387.2A CN114782779B (en) 2022-05-06 2022-05-06 Small sample image feature learning method and device based on feature distribution migration

Publications (2)

Publication Number Publication Date
CN114782779A CN114782779A (en) 2022-07-22
CN114782779B true CN114782779B (en) 2023-06-02

Family

ID=82436001

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210487387.2A Active CN114782779B (en) 2022-05-06 2022-05-06 Small sample image feature learning method and device based on feature distribution migration

Country Status (1)

Country Link
CN (1) CN114782779B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101750611A (en) * 2009-12-02 2010-06-23 哈尔滨工程大学 Underwater robot object detection device and detection method
CN105469111A (en) * 2015-11-19 2016-04-06 浙江大学 Small sample set object classification method on basis of improved MFA and transfer learning

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP4154175A4 (en) * 2020-06-16 2023-07-19 Huawei Technologies Co., Ltd. Learning proxy mixtures for few-shot classification
CN111858991A (en) * 2020-08-06 2020-10-30 南京大学 Small sample learning algorithm based on covariance measurement
CN113610151B (en) * 2021-08-05 2022-05-03 哈尔滨理工大学 Small sample image classification system based on prototype network and self-encoder
CN114328921B (en) * 2021-12-27 2024-04-09 湖南大学 Small sample entity relation extraction method based on distribution calibration
CN114333027B (en) * 2021-12-31 2024-05-14 之江实验室 Cross-domain novel facial expression recognition method based on combined and alternate learning frames
CN114387473A (en) * 2022-01-12 2022-04-22 南通大学 Small sample image classification method based on base class sample characteristic synthesis
CN114387474A (en) * 2022-01-12 2022-04-22 南通大学 Small sample image classification method based on Gaussian prototype classifier

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101750611A (en) * 2009-12-02 2010-06-23 哈尔滨工程大学 Underwater robot object detection device and detection method
CN105469111A (en) * 2015-11-19 2016-04-06 浙江大学 Small sample set object classification method on basis of improved MFA and transfer learning

Also Published As

Publication number Publication date
CN114782779A (en) 2022-07-22

Similar Documents

Publication Publication Date Title
CN107480261B (en) Fine-grained face image fast retrieval method based on deep learning
US20160140425A1 (en) Method and apparatus for image classification with joint feature adaptation and classifier learning
CN112101437B (en) Fine granularity classification model processing method based on image detection and related equipment thereof
CN112115783A (en) Human face characteristic point detection method, device and equipment based on deep knowledge migration
CN110782420A (en) Small target feature representation enhancement method based on deep learning
CN111898703B (en) Multi-label video classification method, model training method, device and medium
US11887270B2 (en) Multi-scale transformer for image analysis
CN105184298A (en) Image classification method through fast and locality-constrained low-rank coding process
WO2022252458A1 (en) Classification model training method and apparatus, device, and medium
CN116503399B (en) Insulator pollution flashover detection method based on YOLO-AFPS
CN110084284A (en) Target detection and secondary classification algorithm and device based on region convolutional neural networks
CN114943859B (en) Task related metric learning method and device for small sample image classification
CN114782752B (en) Small sample image integrated classification method and device based on self-training
CN112329571B (en) Self-adaptive human body posture optimization method based on posture quality evaluation
CN116503398B (en) Insulator pollution flashover detection method and device, electronic equipment and storage medium
CN114782779B (en) Small sample image feature learning method and device based on feature distribution migration
CN113762331A (en) Relational self-distillation method, apparatus and system, and storage medium
CN117058235A (en) Visual positioning method crossing various indoor scenes
CN113508377A (en) Image retrieval method and image retrieval system
CN115294381B (en) Small sample image classification method and device based on feature migration and orthogonal prior
CN110069647A (en) Image tag denoising method, device, equipment and computer readable storage medium
JPWO2012077818A1 (en) Method for determining transformation matrix of hash function, hash type approximate nearest neighbor search method using the hash function, apparatus and computer program thereof
CN116012878A (en) Pedestrian re-identification method and system
CN112633323B (en) Gesture detection method and system for classroom
Zheng et al. Defence against adversarial attacks using clustering algorithm

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant