CN115294381A - Small sample image classification method and device based on feature migration and orthogonal prior

Small sample image classification method and device based on feature migration and orthogonal prior

Info

Publication number
CN115294381A
CN115294381A
Authority
CN
China
Prior art keywords
orthogonal
feature
prior
small sample
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210487137.9A
Other languages
Chinese (zh)
Other versions
CN115294381B (en)
Inventor
Li Xiaoxu
Zhang Zhimin
Liu Jun
Tang Zhuohe
Liu Zhongyuan
Zhang Wenbin
Zeng Junyu
Ma Zhanyu
Tao Jian
Dong Hongfei
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lanzhou University of Technology
Original Assignee
Lanzhou University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lanzhou University of Technology
Priority to CN202210487137.9A
Publication of CN115294381A
Application granted
Publication of CN115294381B
Legal status: Active

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T - CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 - Road transport of goods or passengers
    • Y02T10/10 - Internal combustion engine [ICE] based vehicles
    • Y02T10/40 - Engine management systems


Abstract

The invention discloses a small sample image classification method and device based on feature migration and orthogonal prior, which build on small sample image classification research based on deep metrics to study a small sample classification framework with highly discriminative feature extraction. By introducing feature migration and an orthogonal prior into small sample image feature learning, assuming that new classes and base classes share a feature extraction mode, and assuming that the features of different new classes are orthogonal, an orthogonalized feature adaptation network is constructed to learn an orthogonal feature subspace, so that the features of different classes are mutually orthogonal and the discriminability of the features is improved. This is of great significance for the theoretical study of small sample learning and for promoting the wide application of machine recognition technology, and contributes to breaking through the theoretical bottleneck of small sample learning and to mastering advanced artificial intelligence technology in China.

Description

Small sample image classification method and device based on feature migration and orthogonal prior
Technical Field
The invention relates to the technical field of image classification, in particular to a small sample image classification method and device based on feature migration and orthogonal prior.
Background
In recent years, with the development of deep learning, the recognition performance of machines has surpassed that of humans on many large sample image classification tasks. However, when the sample size is small, the recognition level of machines still falls far short of that of humans. Therefore, image classification with a small number of training samples, especially small sample image classification (Few-shot Image Classification) with only one or a few labeled samples per class, has received much attention from researchers in recent years.
Small sample classification (Few-shot Classification) belongs to the category of small sample learning (Few-shot Learning), and usually involves two types of data whose class spaces are disjoint: base class data and new class data. Small sample classification aims to learn classification rules using the knowledge learned from the base class data together with a small number of labeled samples (support samples) of the new class data, and to accurately predict the classes of the unlabeled samples (query samples) in a new class task; the framework of small sample classification is shown in fig. 1.
Small sample image classification is a research problem urgently to be solved in the fields of computer vision and artificial intelligence. Existing successful large sample image classification methods depend heavily on the number of samples, while the sample sizes of objects in the real world follow a long-tail distribution, i.e. the sample sizes of a large number of objects are seriously insufficient; for example, in fields such as military affairs, medical treatment, industry and astronomy, sample collection consumes a large amount of manpower, material resources, time and economic cost, and collecting large-scale image samples is difficult. Therefore, research on small sample image classification has important value for the wide application of image classification technology.
In the prior art, classification methods based on deep metrics mainly judge the class by comparing the distances between samples or between a sample and a class prototype. Techniques such as data augmentation and transfer learning are often combined to make up for insufficient data and the model's susceptibility to overfitting, and good classification performance has been obtained on many small sample classification tasks. However, compared with large sample image classification, the performance of existing small sample image classification is still unsatisfactory, which greatly limits the practicality of small sample image classification technology, and some problems remain to be solved urgently, in particular learning highly discriminative features. For large sample image classification, existing deep learning techniques can learn highly discriminative image features by increasing model capacity and sample size. However, for small sample classification tasks in which labeled samples are scarce, these techniques are not applicable. Therefore, how to learn a highly discriminative feature representation based on base class data and new class data with few labeled samples is a problem to be solved.
Disclosure of Invention
The invention provides a small sample image classification method and device based on feature migration and orthogonal prior, aimed at the technical problem of learning highly discriminative features in small sample image classification.
In order to achieve the above purpose, the invention provides the following technical scheme:
the invention firstly provides a small sample image classification method based on feature migration and orthogonal prior, which comprises the following steps:
S1, data preparation: pre-training on images to obtain an embedding module f_θ for extracting image features, the images comprising a training set and a test set;
s2, introducing an orthogonal prior thought into the convolutional neural network model, and constructing a feature learning network model based on feature migration and orthogonal prior;
S3, training to optimize the objective function of the orthogonal prior feature learning network model;
and S4, classifying the images in the test set by using the optimized image orthogonal prior feature learning network model.
Further, step S1 includes:
S11, dividing the data D into two parts, D_train and D_test, whose class spaces are disjoint; D_train is used as base class data to train the model, and D_test is used as new class data to test the model;
S12, for a C-way K-shot classification task, randomly selecting C categories from D_train and randomly selecting M samples in each category, wherein K samples are used as support samples S_i and the remaining M-K samples are used as query samples Q_i; S_i and Q_i form a task T_i; likewise, tasks T_i^test are constructed from D_test;
S13, a first training stage: pre-training the embedding module f_θ with base class data; f_θ contains 4 convolution blocks, each containing a convolutional layer, a batch normalization layer, a pooling layer and a nonlinear activation layer using ReLU as the activation function; each convolution block uses 3×3 convolution kernels, the input is a three-channel RGB image, the pooling layers are 2×2 max pooling layers, and the max pooling layers of the last two blocks are removed.
Further, in step S2, in the feature learning network model based on feature migration and orthogonal prior, the orthogonalized feature adaptation network consists of three parts: the embedding module f_θ, an orthogonal adaptation module g_φ, and a metric module; the orthogonal adaptation module g_φ is composed of two convolutional layers with 5×5 convolution kernels, and is used to transform the new class sample features and learn an orthogonalized feature subspace.
Further, step S3 includes:
S31, in the second training stage, performing a classification task on the new class data: all support samples are input into the embedding module f_θ with fixed parameters to obtain the corresponding support sample features f_θ(S_ck);
S32, performing feature transformation with the orthogonal adaptation module to obtain g_φ(f_θ(S_ck));
S33, multiplying the transformed features by the mask M_c of each class so that features of different classes are pairwise orthogonal;
S34, calculating the cosine distance C(P_ci, P_cj) (i ∈ [0, K), i ≠ j) between same-class features with the metric module;
S35, optimizing the orthogonal adaptation module g_φ with a mean square error loss function.
Further, the calculation formula of step S33 is:

$$P_{ck} = g_{\varphi}(f_{\theta}(S_{ck})) \odot M_c \qquad (1)$$

wherein S_ck is the kth support sample of class c, ⊙ represents element-wise multiplication of matrices of the same order, and M_cijh is the value at row i, column j of channel h of the class-c mask M_c, whose elements are constituted as follows:

$$M_{cijh} = \begin{cases} 1, & cH/C \le h < (c+1)H/C \\ 0, & \text{otherwise} \end{cases} \qquad (2)$$

where C is the total number of categories under the current task, H is the number of feature channels, and H is an integer multiple of C; in the above formula, each class c is assigned its own contiguous block of H/C channels (h indexed from 0), within which M_cijh equals 1, with the values at the remaining positions being 0.
Further, the calculation formula of step S34 is as follows:

$$C(P_{ci}, P_{cj}) = \frac{\sum \left( P_{ci} \odot P_{cj} \right)}{\left\| P_{ci} \right\| \left\| P_{cj} \right\|} \qquad (3)$$

wherein C(P_ci, P_cj) is the cosine distance between same-class features, K is the number of support samples, c denotes the c-th class, P_ci represents the i-th support sample feature in class c, P_cj represents the j-th support sample feature in class c, ⊙ represents element-wise multiplication of matrices, and ||P_ci|| denotes the two-norm of the matrix P_ci.
Further, the mean square error loss function of step S35 is calculated as follows:

$$L = \sum_{c=1}^{N} \sum_{\substack{i, j \in [0, K) \\ i \ne j}} \mathrm{MSE}\left[ \cos(P_{ci}, P_{cj}), 1 \right] \qquad (4)$$

wherein N is the total number of classes under the current task and C(P_ci, P_cj) is the cosine distance between same-class features, with MSE[cos(P_ci, P_cj), 1] = [cos(P_ci, P_cj) - 1]².
After the loss on the support samples is calculated, gradient descent is performed, and a mini-batch Adam optimizer is adopted to update the orthogonal adaptation module g_φ; training is repeated over multiple tasks until the network converges.
Further, the specific steps of the Adam adaptive optimization algorithm are as follows:
Initialize: v_dW = 0, S_dW = 0, v_db = 0, S_db = 0, which represent the biased first and second moment estimates respectively; dW and db represent the differentials of W and b respectively.
Calculate the Momentum exponentially weighted averages:

$$v_{dW} = \beta_1 v_{dW} + (1 - \beta_1)\, dW \qquad (5)$$

$$v_{db} = \beta_1 v_{db} + (1 - \beta_1)\, db \qquad (6)$$

Calculate the exponentially weighted averages of the squared gradient differentials, as in the RMSprop algorithm:

$$S_{dW} = \beta_2 S_{dW} + (1 - \beta_2)(dW)^2 \qquad (7)$$

$$S_{db} = \beta_2 S_{db} + (1 - \beta_2)(db)^2 \qquad (8)$$

Calculate the bias corrections of the Momentum and RMSprop terms:
Bias correction of the Momentum term:

$$v_{dW}^{corrected} = \frac{v_{dW}}{1 - \beta_1^t} \qquad (9)$$

$$v_{db}^{corrected} = \frac{v_{db}}{1 - \beta_1^t} \qquad (10)$$

Bias correction of the RMSprop term:

$$S_{dW}^{corrected} = \frac{S_{dW}}{1 - \beta_2^t} \qquad (11)$$

$$S_{db}^{corrected} = \frac{S_{db}}{1 - \beta_2^t} \qquad (12)$$

Perform gradient descent and update the weights:

$$W = W - \alpha \frac{v_{dW}^{corrected}}{\sqrt{S_{dW}^{corrected}} + \varepsilon} \qquad (13)$$

$$b = b - \alpha \frac{v_{db}^{corrected}}{\sqrt{S_{db}^{corrected}} + \varepsilon} \qquad (14)$$

In equations (5)-(14), t denotes the t-th iteration, α denotes the learning rate, which controls the update rate of the weights, ε denotes a very small constant, β_1 and β_2 denote the exponential decay rates of the first and second moment estimates respectively, and v^corrected and S^corrected denote the first and second moment estimates after bias correction.
Further, step S4 includes:
S41, testing process: each task T_i^test consists of a support set S_i^test and a query set Q_i^test; the query set Q_i^test of the test set is input into the embedding module f_θ and the fine-tuned orthogonal adaptation module g_φ to obtain the features g_φ(f_θ(Q_ck^test));
S42, multiplying the features output by the orthogonal adaptation module by the masks M_c corresponding to the different classes respectively, the specific operation being as in formula (1):

$$P_{ck}^{test} = g_{\varphi}(f_{\theta}(Q_{ck}^{test})) \odot M_c \qquad (1)$$

wherein Q_ck^test is the kth query sample, ⊙ represents element-wise multiplication of matrices of the same order, and M_cijh is the value at row i, column j of channel h of the class-c mask M_c, whose elements are as in formula (2): C is the total number of categories under the current task, H is the number of feature channels, and H is an integer multiple of C; each class c is assigned its own contiguous block of H/C channels (h indexed from 0), within which M_cijh equals 1, with the values at the remaining positions being 0;
S43, sending the product into the metric module and calculating the cosine distances between the query sample P_ck^test and all support samples;
S44, taking the class of the closest support sample as the predicted class of the query sample.
On the other hand, the invention further provides a small sample image classification device based on feature migration and orthogonal prior, which is used for implementing the above method and comprises the following functional modules:
a pre-training module, which pre-trains on images to obtain the embedding module f_θ for extracting image features, the images comprising a training set and a test set;
a processing module, which introduces the orthogonal prior idea and constructs a feature learning network model based on feature migration and orthogonal prior;
a calculation module, which solves the model parameters by training to optimize the objective function of the orthogonal prior feature learning network model;
a classification module, which classifies the images of the test set with the optimized orthogonal prior feature learning network model.
Compared with the prior art, the invention has the following beneficial effects:
The invention discloses a small sample image classification method and device based on feature migration and orthogonal prior, which are based on deep convolutional neural networks (DCNN) and, building on small sample image classification research based on deep metrics, study a small sample classification framework with highly discriminative feature extraction. By introducing feature migration and an orthogonal prior into small sample image feature learning, assuming that new classes and base classes share a feature extraction mode, and assuming that the features of different new classes are orthogonal, i.e. mutually uncorrelated, an orthogonalized feature adaptation network is constructed to learn an orthogonal feature subspace, so that the features of different classes are mutually orthogonal, different classes are easily distinguished, and the discriminability of the features is improved. This is of great significance for the theoretical study of small sample learning and for promoting the wide application of machine recognition technology, and contributes to breaking through the theoretical bottleneck of small sample learning and to mastering advanced artificial intelligence technology in China.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed in the embodiments are briefly described below. It is obvious that the drawings in the following description are only some embodiments described in the present invention, and for those skilled in the art, other drawings can be obtained from these drawings without creative effort.
FIG. 1 is a small-sample Classification (Few-shot Classification) framework.
Fig. 2 is a flowchart of a small sample image classification method and apparatus based on feature migration and orthogonal prior provided by an embodiment of the present invention.
Fig. 3 is a structure diagram of the embedding module f_θ provided by an embodiment of the present invention.
Fig. 4 is a small sample image feature learning network diagram introducing feature migration and orthogonal prior provided by the embodiment of the present invention.
Fig. 5 is a model structure diagram of the orthogonal adaptation module g_φ provided by an embodiment of the present invention.
Fig. 6 is a schematic diagram of a functional module of a small sample image classification device based on feature migration and orthogonal prior provided by an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments of the present invention; it is obvious that the described embodiments are some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
The invention provides a small sample image classification method based on feature migration and orthogonal prior, the flow of which is shown in fig. 2. The method comprises the following steps:
S1, data preparation: pre-training on images to obtain an embedding module f_θ for extracting image features, the images comprising a training set and a test set;
specifically, step S1 includes:
s11, data are processed
Figure RE-GDA0003856221160000072
Is divided into
Figure RE-GDA0003856221160000073
And
Figure RE-GDA0003856221160000074
two parts, and the two parts have mutually exclusive class spaces, and the D is converted into the D train As base class data training model, D test Testing the model as new data;
s12, for the C-way K-shot classification task, selecting D train Randomly selecting C categories, randomly selecting M samples in each category, wherein K samples are used as support samples S i And the rest M-K samples are used as query samples Q i ,S i And Q i Form a task T i Same for D test Has a task
Figure RE-GDA0003856221160000075
S13, a first training stage: pre-training embedded module f by base class data θ ,f θ The method comprises the steps of containing 4 convolution blocks, wherein each convolution block contains a convolution layer, a pooling layer and a nonlinear activation function; the window size of a convolution kernel used by each convolution block is 3 multiplied by 3, a batch normalization, RGB three channels, a pooling layer, a 2 multiplied by 2 maximum pooling layer, the maximum pooling layers of the last two blocks are cut, and a nonlinear activation layer is used as an activation function of the nonlinear activation layer. For example, for an 84 × 84 × 3RGB image, a 3 × 3 convolution kernel with 64 filters is used per block. Each block is composed of 1 convolution, 1 ReLu, one pooling, as shown in fig. 3. The pre-trained embedded modules can be multiplexed according to different scenes.
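The following is a minimal PyTorch sketch of an embedding module matching this description; the class name Conv4Embedding, the channel-width argument and the exact in-block ordering are illustrative assumptions, not the patent's reference implementation.

```python
import torch
import torch.nn as nn

class Conv4Embedding(nn.Module):
    """Illustrative embedding module f_theta: four blocks of
    3x3 conv + batch norm + ReLU; only the first two blocks keep
    the 2x2 max pooling (it is cut from the last two)."""
    def __init__(self, in_channels=3, hidden=64):
        super().__init__()
        layers = []
        for i in range(4):
            layers += [
                nn.Conv2d(in_channels if i == 0 else hidden, hidden,
                          kernel_size=3, padding=1),
                nn.BatchNorm2d(hidden),
                nn.ReLU(inplace=True),
            ]
            if i < 2:  # max pooling removed in the last two blocks
                layers.append(nn.MaxPool2d(2))
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)

# e.g. an 84x84 RGB image yields a 64-channel feature map
feat = Conv4Embedding()(torch.randn(1, 3, 84, 84))
print(feat.shape)  # torch.Size([1, 64, 21, 21])
```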
S2, introducing the orthogonal prior idea into the convolutional neural network model, and constructing a feature learning network model based on feature migration and orthogonal prior, as shown in fig. 4.
Specifically, in step S2, in the feature learning network model based on feature migration and orthogonal prior, the orthogonalized feature adaptation network consists of three parts: the embedding module f_θ, the orthogonal adaptation module g_φ, and the metric module. The orthogonal adaptation module g_φ is composed of two convolutional layers with 5×5 convolution kernels and is used to transform the new class sample features and learn an orthogonalized feature subspace, as shown in fig. 5 and sketched below.
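A minimal sketch of the two-layer adaptation module, assuming 64-channel features, padding that preserves the spatial size, and a ReLU between the two layers (all three are assumptions; the patent only specifies two convolutional layers with 5×5 kernels):

```python
import torch.nn as nn

class OrthogonalAdapter(nn.Module):
    """Illustrative orthogonal adaptation module g_phi: two 5x5
    convolutional layers that map embedding features into an
    orthogonalized feature subspace."""
    def __init__(self, channels=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=5, padding=2),
            nn.ReLU(inplace=True),  # assumed; not specified in the text
            nn.Conv2d(channels, channels, kernel_size=5, padding=2),
        )

    def forward(self, x):
        return self.net(x)
```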
S3, training to optimize the objective function of the orthogonal prior feature learning network model.
Specifically, step S3 includes:
S31, in the second training stage, performing a classification task on the new class data: all support samples are input into the embedding module f_θ with fixed parameters to obtain the corresponding support sample features f_θ(S_ck);
S32, performing feature transformation with the orthogonal adaptation module to obtain g_φ(f_θ(S_ck));
S33, multiplying the transformed features by the mask (Mask) M_c of each class so that features of different classes are pairwise orthogonal.
step S33 is calculated as:
Figure RE-GDA0003856221160000084
wherein S is ck For the kth supported sample of class c,
Figure RE-GDA0003856221160000085
representing the same orderMultiplication of corresponding elements of the matrix, M cijh Is a mask M of class c c In the ith row and jth column, the value of the h channel, M c The elements of (a) are constituted as follows:
Figure RE-GDA0003856221160000086
c is the total category number of the current task, H is the number of the characteristic channels, and H is an integral multiple of C; in the above formula, when h is within a given range (the range of h starts from 0), M cijh Equal to 1 and the values of the remaining positions are 0.
S34, calculating the cosine distance C(P_ci, P_cj) (i ∈ [0, K), i ≠ j) between same-class features with the metric module, which gives the cosine distances between same-class features under each of the classes.
Step S34 is calculated as follows:

$$C(P_{ci}, P_{cj}) = \frac{\sum \left( P_{ci} \odot P_{cj} \right)}{\left\| P_{ci} \right\| \left\| P_{cj} \right\|} \qquad (3)$$

wherein C(P_ci, P_cj) is the cosine distance between same-class features, K is the number of support samples, c denotes the c-th class, P_ci represents the i-th support sample feature in class c, P_cj represents the j-th support sample feature in class c, ⊙ represents element-wise multiplication of matrices, and ||P_ci|| denotes the two-norm of the matrix P_ci.
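As a sketch of formula (3), treating each masked feature map as a flattened vector (the flattening and the small epsilon for numerical stability are assumptions):

```python
import torch

def cosine(p_i: torch.Tensor, p_j: torch.Tensor, eps: float = 1e-8):
    """Cosine similarity between two feature maps, flattened to vectors."""
    a, b = p_i.flatten(), p_j.flatten()
    return (a * b).sum() / (a.norm() * b.norm() + eps)
```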
S35, optimizing the orthogonal adaptation module g_φ with a mean square error loss function.
Step S35 uses the mean square error loss function calculated as follows:

$$L = \sum_{c=1}^{N} \sum_{\substack{i, j \in [0, K) \\ i \ne j}} \mathrm{MSE}\left[ \cos(P_{ci}, P_{cj}), 1 \right] \qquad (4)$$

wherein N is the total number of classes under the current task and C(P_ci, P_cj) is the cosine distance between same-class features, with MSE[cos(P_ci, P_cj), 1] = [cos(P_ci, P_cj) - 1]².
After the loss on the support samples is calculated, gradient descent is performed, and a mini-batch Adam optimizer is adopted to update the orthogonal adaptation module g_φ; training is repeated over multiple tasks until the network converges, as in the sketch below.
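Putting the pieces together, a minimal sketch of this second-stage fine-tuning loop, using torch.optim.Adam and the helpers sketched above (Conv4Embedding, OrthogonalAdapter, class_masks, cosine); sample_task, the learning rate and the task budget are hypothetical:

```python
import itertools
import torch

C, num_tasks = 5, 1000                        # 5-way tasks; budget is assumed
f_theta = Conv4Embedding(hidden=60).eval()    # fixed after pre-training; 60 = 12*C channels
g_phi = OrthogonalAdapter(channels=60)        # only this module is updated
opt = torch.optim.Adam(g_phi.parameters(), lr=1e-3)

for _ in range(num_tasks):                    # repeat over tasks until convergence
    support, labels = sample_task()           # hypothetical C-way K-shot sampler
    with torch.no_grad():
        base = f_theta(support)               # support features, parameters fixed
    feats = g_phi(base)                       # orthogonal adaptation
    masks = class_masks(C, *feats.shape[1:])
    loss = 0.0
    for c in range(C):
        P = [feats[k] * masks[c]              # formula (1): mask class-c features
             for k in range(feats.shape[0]) if labels[k] == c]
        for i, j in itertools.permutations(range(len(P)), 2):
            loss = loss + (cosine(P[i], P[j]) - 1.0) ** 2   # MSE toward 1
    opt.zero_grad()
    loss.backward()
    opt.step()
```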
The Adam adaptive optimization algorithm comprises the following specific steps:
Initialize: v_dW = 0, S_dW = 0, v_db = 0, S_db = 0, which represent the biased first and second moment estimates respectively; dW and db represent the differentials of W and b respectively.
Calculate the Momentum exponentially weighted averages:

$$v_{dW} = \beta_1 v_{dW} + (1 - \beta_1)\, dW \qquad (5)$$

$$v_{db} = \beta_1 v_{db} + (1 - \beta_1)\, db \qquad (6)$$

Calculate the exponentially weighted averages of the squared gradient differentials, as in the RMSprop algorithm:

$$S_{dW} = \beta_2 S_{dW} + (1 - \beta_2)(dW)^2 \qquad (7)$$

$$S_{db} = \beta_2 S_{db} + (1 - \beta_2)(db)^2 \qquad (8)$$

Calculate the bias corrections of the Momentum and RMSprop terms:
Bias correction of the Momentum term:

$$v_{dW}^{corrected} = \frac{v_{dW}}{1 - \beta_1^t} \qquad (9)$$

$$v_{db}^{corrected} = \frac{v_{db}}{1 - \beta_1^t} \qquad (10)$$

Bias correction of the RMSprop term:

$$S_{dW}^{corrected} = \frac{S_{dW}}{1 - \beta_2^t} \qquad (11)$$

$$S_{db}^{corrected} = \frac{S_{db}}{1 - \beta_2^t} \qquad (12)$$

Perform gradient descent and update the weights:

$$W = W - \alpha \frac{v_{dW}^{corrected}}{\sqrt{S_{dW}^{corrected}} + \varepsilon} \qquad (13)$$

$$b = b - \alpha \frac{v_{db}^{corrected}}{\sqrt{S_{db}^{corrected}} + \varepsilon} \qquad (14)$$

In equations (5)-(14), t denotes the t-th iteration, α denotes the learning rate, which controls the update rate of the weights, ε denotes a very small constant, β_1 and β_2 denote the exponential decay rates of the first and second moment estimates respectively, and v^corrected and S^corrected denote the first and second moment estimates after bias correction.
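Equations (5)-(14) are the standard Adam update; as a sketch, one Adam step for a single parameter matrix W in NumPy (the hyperparameter defaults shown are common choices, not values mandated by the patent):

```python
import numpy as np

def adam_step(W, dW, v, S, t, alpha=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for parameter W with gradient dW, following (5)-(14);
    v and S carry the biased first/second moment estimates between calls."""
    v = beta1 * v + (1 - beta1) * dW                  # (5) momentum average
    S = beta2 * S + (1 - beta2) * dW ** 2             # (7) RMSprop average
    v_hat = v / (1 - beta1 ** t)                      # (9) bias correction
    S_hat = S / (1 - beta2 ** t)                      # (11) bias correction
    W = W - alpha * v_hat / (np.sqrt(S_hat) + eps)    # (13) weight update
    return W, v, S
```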
S4, classifying the images of the test set with the optimized image orthogonal prior feature learning network model.
Step S4 comprises the following steps:
S41, testing process: each task T_i^test consists of a support set S_i^test and a query set Q_i^test; the query set Q_i^test of the test set is input into the embedding module f_θ and the fine-tuned orthogonal adaptation module g_φ to obtain the features g_φ(f_θ(Q_ck^test));
S42, multiplying the features output by the orthogonal adaptation module by the masks M_c corresponding to the different classes respectively, the specific operation being as in formula (1):

$$P_{ck}^{test} = g_{\varphi}(f_{\theta}(Q_{ck}^{test})) \odot M_c \qquad (1)$$

wherein Q_ck^test is the kth query sample, ⊙ represents element-wise multiplication of matrices of the same order, and M_cijh is the value at row i, column j of channel h of the class-c mask M_c, whose elements are as in formula (2): C is the total number of categories under the current task, H is the number of feature channels, and H is an integer multiple of C; each class c is assigned its own contiguous block of H/C channels (h indexed from 0), within which M_cijh equals 1, with the values at the remaining positions being 0;
S43, sending the product into the metric module and calculating the cosine distances between the query sample P_ck^test and all support samples; in the training stage the metric module only calculates cosine distances between same-class features rather than across classes, so the metric module is used differently in the testing stage;
S44, taking the class of the closest support sample as the predicted class of the query sample. Unlike conventional training, the model is fine-tuned with the support samples under the new classes, and the query samples are tested directly after the optimization is completed; a minimal sketch follows.
On the other hand, the invention further provides a small sample image classification device based on feature migration and orthogonal prior, which is used for implementing the above method and, as shown in fig. 6, comprises the following functional modules:
a pre-training module, which pre-trains on images to obtain the embedding module f_θ for extracting image features, the images comprising a training set and a test set;
a processing module, which introduces the orthogonal prior idea and constructs a feature learning network model based on feature migration and orthogonal prior;
a calculation module, which solves the model parameters by training to optimize the objective function of the orthogonal prior feature learning network model;
a classification module, which classifies the images of the test set with the optimized orthogonal prior feature learning network model.
By introducing feature migration and an orthogonal prior into small sample image feature learning, assuming that new classes and base classes share a feature extraction mode, and assuming that the features of different new classes are orthogonal, the invention constructs an orthogonalized feature adaptation network to learn an orthogonal feature subspace, so that the features of different classes are mutually orthogonal and the feature discriminability is improved.
The proposed small sample image classification method and device based on feature migration and orthogonal prior have been described in detail above with reference to the accompanying drawings. From the above description of the embodiments, those skilled in the art will clearly understand how to implement the method and the device.
The algorithms and displays presented herein are not inherently related to any particular computer, virtual machine, or other apparatus. Various general purpose systems may also be used with the teachings herein. The required structure for constructing such a system will be apparent from the description above. Moreover, this disclosure is not intended to be limited to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the disclosure as described herein, and any descriptions of specific languages are provided above to disclose the best mode disclosed herein.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments disclosed herein, various features disclosed herein are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various disclosed aspects. However, this method of disclosure is not to be interpreted as reflecting an intention that the claims require more features than are expressly recited in each claim. Rather, as the following claims reflect, disclosed aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this disclosure.
The above-mentioned embodiments are only specific embodiments of the present application, used to illustrate the technical solutions of the present application rather than to limit them, and the protection scope of the present application is not limited thereto. Although the present application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that they can still modify the technical solutions described in the foregoing embodiments, easily conceive of changes, or make equivalent substitutions for some technical features within the technical scope of the present disclosure; such modifications, changes or substitutions do not depart from the spirit and scope of the embodiments of the present application and are intended to be covered by the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A small sample image classification method based on feature migration and orthogonal prior, characterized by comprising the following steps:
S1, data preparation: pre-training on images to obtain an embedding module f_θ for extracting image features, the images comprising a training set and a test set;
S2, introducing the orthogonal prior idea into the convolutional neural network model, and constructing a feature learning network model based on feature migration and orthogonal prior;
S3, training to optimize the objective function of the orthogonal prior feature learning network model;
S4, classifying the images of the test set with the optimized image orthogonal prior feature learning network model.
2. The small sample image classification method based on feature migration and orthogonal prior according to claim 1, wherein step S1 comprises:
S11, dividing the data D into two parts, D_train and D_test, whose class spaces are mutually exclusive; D_train is used as base class data to train the model, and D_test is used as new class data to test the model;
S12, for a C-way K-shot classification task, randomly selecting C categories from D_train and randomly selecting M samples in each category, wherein K samples are used as support samples S_i and the remaining M-K samples are used as query samples Q_i; S_i and Q_i form a task T_i; likewise, tasks T_i^test are constructed from D_test;
S13, a first training stage: pre-training the embedding module f_θ with base class data; f_θ contains 4 convolution blocks, each containing a convolutional layer, a batch normalization layer, a pooling layer and a nonlinear activation layer using ReLU as the activation function; each convolution block uses 3×3 convolution kernels, the input is a three-channel RGB image, the pooling layers are 2×2 max pooling layers, and the max pooling layers of the last two blocks are removed.
3. The small sample image classification method based on feature migration and orthogonal prior according to claim 1, wherein in the feature learning network model based on feature migration and orthogonal prior in step S2, the orthogonalized feature adaptation network consists of three parts: the embedding module f_θ, an orthogonal adaptation module g_φ, and a metric module; the orthogonal adaptation module g_φ is composed of two convolutional layers with 5×5 convolution kernels, and is used to transform the new class sample features and learn an orthogonalized feature subspace.
4. The small sample image classification method based on feature migration and orthogonal prior according to claim 1, wherein training to optimize the orthogonal prior feature learning network model in step S3 comprises:
S31, in the second training stage, performing a classification task on the new class data: all support samples are input into the embedding module f_θ with fixed parameters to obtain the corresponding support sample features f_θ(S_ck);
S32, performing feature transformation with the orthogonal adaptation module to obtain g_φ(f_θ(S_ck));
S33, multiplying the transformed features by the mask M_c of each class so that features of different classes are pairwise orthogonal;
S34, calculating the cosine distance C(P_ci, P_cj) (i ∈ [0, K), i ≠ j) between same-class features with the metric module;
S35, optimizing the orthogonal adaptation module g_φ with a mean square error loss function.
5. The small sample image classification method based on feature migration and orthogonal prior according to claim 4, wherein step S33 is calculated by the formula:

$$P_{ck} = g_{\varphi}(f_{\theta}(S_{ck})) \odot M_c \qquad (1)$$

wherein S_ck is the kth support sample of class c, ⊙ represents element-wise multiplication of matrices of the same order, and M_cijh is the value at row i, column j of channel h of the class-c mask M_c, whose elements are constituted as follows:

$$M_{cijh} = \begin{cases} 1, & cH/C \le h < (c+1)H/C \\ 0, & \text{otherwise} \end{cases} \qquad (2)$$

wherein C is the total number of categories under the current task, H is the number of feature channels, and H is an integer multiple of C; in the above formula, each class c is assigned its own contiguous block of H/C channels (h indexed from 0), within which M_cijh equals 1, with the values at the remaining positions being 0.
6. The small sample image classification method based on feature migration and orthogonal prior according to claim 4, wherein the calculation formula of step S34 is as follows:

$$C(P_{ci}, P_{cj}) = \frac{\sum \left( P_{ci} \odot P_{cj} \right)}{\left\| P_{ci} \right\| \left\| P_{cj} \right\|} \qquad (3)$$

wherein C(P_ci, P_cj) is the cosine distance between same-class features, K is the number of support samples, c denotes the c-th class, P_ci represents the i-th support sample feature in class c, P_cj represents the j-th support sample feature in class c, ⊙ represents element-wise multiplication of matrices, and ||P_ci|| denotes the two-norm of the matrix P_ci.
7. The small sample image classification method based on feature migration and orthogonal prior according to claim 4, wherein the mean square error loss function of step S35 is calculated as follows:

$$L = \sum_{c=1}^{N} \sum_{\substack{i, j \in [0, K) \\ i \ne j}} \mathrm{MSE}\left[ \cos(P_{ci}, P_{cj}), 1 \right] \qquad (4)$$

wherein N is the total number of classes under the current task and C(P_ci, P_cj) is the cosine distance between same-class features, with MSE[cos(P_ci, P_cj), 1] = [cos(P_ci, P_cj) - 1]²;
after the loss on the support samples is calculated, gradient descent is performed, and a mini-batch Adam optimizer is adopted to update the orthogonal adaptation module g_φ; training is repeated over multiple tasks until the network converges.
8. The small sample image classification method based on feature migration and orthogonal prior according to claim 7, wherein the Adam adaptive optimization algorithm comprises the following specific steps:
Initialize: v_dW = 0, S_dW = 0, v_db = 0, S_db = 0, which represent the biased first and second moment estimates respectively; dW and db represent the differentials of W and b respectively.
Calculate the Momentum exponentially weighted averages:

$$v_{dW} = \beta_1 v_{dW} + (1 - \beta_1)\, dW \qquad (5)$$

$$v_{db} = \beta_1 v_{db} + (1 - \beta_1)\, db \qquad (6)$$

Calculate the exponentially weighted averages of the squared gradient differentials, as in the RMSprop algorithm:

$$S_{dW} = \beta_2 S_{dW} + (1 - \beta_2)(dW)^2 \qquad (7)$$

$$S_{db} = \beta_2 S_{db} + (1 - \beta_2)(db)^2 \qquad (8)$$

Calculate the bias corrections of the Momentum and RMSprop terms:
Bias correction of the Momentum term:

$$v_{dW}^{corrected} = \frac{v_{dW}}{1 - \beta_1^t} \qquad (9)$$

$$v_{db}^{corrected} = \frac{v_{db}}{1 - \beta_1^t} \qquad (10)$$

Bias correction of the RMSprop term:

$$S_{dW}^{corrected} = \frac{S_{dW}}{1 - \beta_2^t} \qquad (11)$$

$$S_{db}^{corrected} = \frac{S_{db}}{1 - \beta_2^t} \qquad (12)$$

Perform gradient descent and update the weights:

$$W = W - \alpha \frac{v_{dW}^{corrected}}{\sqrt{S_{dW}^{corrected}} + \varepsilon} \qquad (13)$$

$$b = b - \alpha \frac{v_{db}^{corrected}}{\sqrt{S_{db}^{corrected}} + \varepsilon} \qquad (14)$$

In equations (5)-(14), t denotes the t-th iteration, α denotes the learning rate, which controls the update rate of the weights, ε denotes a very small constant, β_1 and β_2 denote the exponential decay rates of the first and second moment estimates respectively, and v^corrected and S^corrected denote the first and second moment estimates after bias correction.
9. The small sample image classification method based on feature migration and orthogonal prior according to claim 1, wherein step S4 comprises:
S41, testing process: each task T_i^test consists of a support set S_i^test and a query set Q_i^test; the query set Q_i^test of the test set is input into the embedding module f_θ and the fine-tuned orthogonal adaptation module g_φ to obtain the features g_φ(f_θ(Q_ck^test));
S42, multiplying the features output by the orthogonal adaptation module by the masks M_c corresponding to the different classes respectively, the specific operation being as in formula (1):

$$P_{ck}^{test} = g_{\varphi}(f_{\theta}(Q_{ck}^{test})) \odot M_c \qquad (1)$$

wherein Q_ck^test is the kth query sample, ⊙ represents element-wise multiplication of matrices of the same order, and M_cijh is the value at row i, column j of channel h of the class-c mask M_c, whose elements are constituted as follows:

$$M_{cijh} = \begin{cases} 1, & cH/C \le h < (c+1)H/C \\ 0, & \text{otherwise} \end{cases} \qquad (2)$$

wherein C is the total number of categories under the current task, H is the number of feature channels, and H is an integer multiple of C; each class c is assigned its own contiguous block of H/C channels (h indexed from 0), within which M_cijh equals 1, with the values at the remaining positions being 0;
S43, sending the product into the metric module and calculating the cosine distances between the query sample P_ck^test and all support samples;
S44, taking the class of the closest support sample as the predicted class of the query sample.
10. A small sample image classification device based on feature migration and orthogonal prior, used for implementing the method of any one of claims 1-9, characterized by comprising the following functional modules:
a pre-training module, which pre-trains on images to obtain the embedding module f_θ for extracting image features, the images comprising a training set and a test set;
a processing module, which introduces the orthogonal prior idea and constructs a feature learning network model based on feature migration and orthogonal prior;
a calculation module, which solves the model parameters by training to optimize the objective function of the orthogonal prior feature learning network model;
a classification module, which classifies the images of the test set with the optimized orthogonal prior feature learning network model.
CN202210487137.9A 2022-05-06 2022-05-06 Small sample image classification method and device based on feature migration and orthogonal prior Active CN115294381B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210487137.9A CN115294381B (en) 2022-05-06 2022-05-06 Small sample image classification method and device based on feature migration and orthogonal prior

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210487137.9A CN115294381B (en) 2022-05-06 2022-05-06 Small sample image classification method and device based on feature migration and orthogonal prior

Publications (2)

Publication Number Publication Date
CN115294381A (en) 2022-11-04
CN115294381B (en) 2023-06-30

Family

ID=83819949

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210487137.9A Active CN115294381B (en) 2022-05-06 2022-05-06 Small sample image classification method and device based on feature migration and orthogonal prior

Country Status (1)

Country Link
CN (1) CN115294381B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116778268A (en) * 2023-04-20 2023-09-19 江苏济远医疗科技有限公司 Sample selection deviation relieving method suitable for medical image target classification

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018162933A1 (en) * 2017-03-10 2018-09-13 Artificial Intelligence Research Group Limited Improved object recognition system
CN109508655A (en) * 2018-10-28 2019-03-22 北京化工大学 The SAR target identification method of incomplete training set based on twin network
CN110188795A (en) * 2019-04-24 2019-08-30 华为技术有限公司 Image classification method, data processing method and device
CN110929603A (en) * 2019-11-09 2020-03-27 北京工业大学 Weather image identification method based on lightweight convolutional neural network
CN113379614A (en) * 2021-03-31 2021-09-10 西安理工大学 Computed ghost imaging reconstruction recovery method based on Resnet network
WO2022016802A1 (en) * 2020-07-21 2022-01-27 上海集成电路研发中心有限公司 Physical feature map- and dcnn-based computation method for machine learning-based inverse lithography technology solution

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018162933A1 (en) * 2017-03-10 2018-09-13 Artificial Intelligence Research Group Limited Improved object recognition system
CN109508655A (en) * 2018-10-28 2019-03-22 北京化工大学 The SAR target identification method of incomplete training set based on twin network
CN110188795A (en) * 2019-04-24 2019-08-30 华为技术有限公司 Image classification method, data processing method and device
CN110929603A (en) * 2019-11-09 2020-03-27 北京工业大学 Weather image identification method based on lightweight convolutional neural network
WO2022016802A1 (en) * 2020-07-21 2022-01-27 上海集成电路研发中心有限公司 Physical feature map- and dcnn-based computation method for machine learning-based inverse lithography technology solution
CN113379614A (en) * 2021-03-31 2021-09-10 西安理工大学 Computed ghost imaging reconstruction recovery method based on Resnet network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHUO SUN et al.: "Amortized Bayesian Prototype Meta-learning: A New Probabilistic Meta-learning Approach to Few-shot Image Classification", vol. 130, pages 1414-1422

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116778268A (en) * 2023-04-20 2023-09-19 江苏济远医疗科技有限公司 Sample selection deviation relieving method suitable for medical image target classification

Also Published As

Publication number Publication date
CN115294381B (en) 2023-06-30

Similar Documents

Publication Publication Date Title
WO2022160771A1 (en) Method for classifying hyperspectral images on basis of adaptive multi-scale feature extraction model
CN109948029A (en) Based on the adaptive depth hashing image searching method of neural network
CN106682694A (en) Sensitive image identification method based on depth learning
CN111242157A (en) Unsupervised domain self-adaption method combining deep attention feature and conditional opposition
CN112347970B (en) Remote sensing image ground object identification method based on graph convolution neural network
CN110991549A (en) Countermeasure sample generation method and system for image data
CN111160553B (en) Novel field self-adaptive learning method
CN105913087A (en) Object identification method based on optimal pooled convolutional neural network
CN110490227A (en) A kind of few sample image classification method based on Feature Conversion
CN113221694B (en) Action recognition method
CN111401156B (en) Image identification method based on Gabor convolution neural network
CN110110845B (en) Learning method based on parallel multi-level width neural network
CN112818969A (en) Knowledge distillation-based face pose estimation method and system
CN111523586B (en) Noise-aware-based full-network supervision target detection method
CN114943859B (en) Task related metric learning method and device for small sample image classification
CN115294381A (en) Small sample image classification method and device based on feature migration and orthogonal prior
CN111582373A (en) Radiation source identification method based on weighted migration extreme learning machine algorithm
CN112200262B (en) Small sample classification training method and device supporting multitasking and cross-tasking
CN116070713A (en) Method for relieving Non-IID influence based on interpretable federal learning
CN114818945A (en) Small sample image classification method and device integrating category adaptive metric learning
CN114155554A (en) Transformer-based camera domain pedestrian re-recognition method
CN112782660A (en) Radar target identification method based on Bert
CN113077009A (en) Tunnel surrounding rock lithology identification method based on migration learning model
CN110110769A (en) A kind of image classification method based on width radial primary function network
CN111882061B (en) Convolutional neural network training method based on hierarchical random gradient descent

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CB03 Change of inventor or designer information

Inventor after: Zhang Zhimin, Dong Hongfei, Li Xiaoxu, Liu Jun, Tang Zhuohe, Liu Zhongyuan, Zhang Wenbin, Zeng Junyu, Ma Zhanyu, Tao Jian

Inventor before: Li Xiaoxu, Dong Hongfei, Zhang Zhimin, Liu Jun, Tang Zhuohe, Liu Zhongyuan, Zhang Wenbin, Zeng Junyu, Ma Zhanyu, Tao Jian

CB03 Change of inventor or designer information