CN115272692A - Small sample image classification method and system based on feature pyramid and feature fusion - Google Patents

Small sample image classification method and system based on feature pyramid and feature fusion


Publication number
CN115272692A
CN115272692A (application CN202210733595.6A)
Authority
CN
China
Prior art keywords
feature
pyramid
module
relation
small sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210733595.6A
Other languages
Chinese (zh)
Inventor
王先知
许洁斌
艾浩然
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT filed Critical South China University of Technology SCUT
Priority to CN202210733595.6A priority Critical patent/CN115272692A/en
Publication of CN115272692A publication Critical patent/CN115272692A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/55Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/761Proximity, similarity or dissimilarity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks


Abstract

The invention discloses a small sample image classification method based on a feature pyramid and feature fusion, which comprises the following steps: S1, constructing a feature pyramid relation network model comprising a feature extraction module, a relation module and a feature fusion module; S2, expanding the data set and dividing it into a training set, a verification set and a test set; S3, training the model, sampling a support set and a query set from the training set; S4, inputting support set images and query set images, extracting their features with the feature extraction module, outputting the feature vectors of the images, and fusing the feature vectors with the feature fusion module; S5, inputting the fused feature vectors into the relation module, which outputs similarity values between the support set and query set images, and processing all similarity values to obtain a final similarity value; S6, calculating the model loss, updating the model parameters, and iterating the training until the loss value stabilizes; S7, saving the trained model and using it for small sample image classification testing.

Description

Small sample image classification method and system based on feature pyramid and feature fusion
Technical Field
The invention relates to the field of small sample learning and meta-learning, in particular to a small sample image classification method and system based on a feature pyramid and feature fusion.
Background
Deep neural network models usually need a large number of labeled training samples to train well. In practice, labeling samples often consumes a great deal of manpower and material resources, and in some cases little usable sample data exists at all; training directly on a small number of samples then causes overfitting. Small sample learning arose to solve this problem.
A basic model for small sample learning is defined as p = C(f(x|θ)|w), where f denotes the feature extractor, C denotes the classifier, x denotes the input image to be recognized, θ denotes the parameters of the feature extractor f, w denotes the parameters of the classifier C, and p denotes the prediction output by the model. Because the number of samples in small sample learning is small, directly training the model parameters θ and w causes overfitting, and accuracy drops on the target task.
A training set of similar prior tasks with a large amount of available data is defined as D_base, and the small sample data set containing the target task is defined as D_novel. The model is first trained on D_base to learn good parameters θ and w. Starting from these initialization parameters, the model is then trained on D_novel to obtain new parameters θ_1 and w_1, which replace the original ones; the updated model p = C(f(x|θ_1)|w_1) can then accurately complete the image classification task.
Around the core problem of scarce samples, existing small sample learning strategies mainly fall into methods based on data enhancement, metric learning, models, and parameter optimization. Metric learning, also called similarity learning, aims to learn a similarity metric S(·,·) under which similar samples receive high similarity scores and dissimilar samples receive low ones. S can be a fixed, non-learned distance metric or a learnable neural network, and the similarity score it outputs can be used to classify query samples of the test set. However, existing metric-based small sample learning only considers the similarity between the final outputs of the model and ignores the intermediate layers of the network, so recognition accuracy is low, which affects the final classification accuracy of the model.
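To make the metric-learning idea concrete, the sketch below classifies a query feature vector with a fixed, non-learned similarity metric (cosine similarity). This is a hedged illustration of similarity learning in general, not the patent's relation module; all names and vectors are invented for the example.

```python
import numpy as np

def cosine_similarity(a, b):
    # S(a, b): a fixed (non-learned) similarity metric in [-1, 1]
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def classify_query(query_vec, support_vecs):
    # support_vecs: {class_label: feature vector}; pick the most similar class
    scores = {c: cosine_similarity(query_vec, v) for c, v in support_vecs.items()}
    return max(scores, key=scores.get), scores

support = {"cat": np.array([1.0, 0.0, 0.2]), "dog": np.array([0.0, 1.0, 0.1])}
query = np.array([0.9, 0.1, 0.2])
label, scores = classify_query(query, support)
print(label)  # "cat": the support class most similar to the query
```

A learnable relation module replaces the fixed metric S with a small network trained to output the similarity score directly.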
Disclosure of Invention
The invention aims to overcome the problem of low accuracy of small sample image classification in the prior art, and provides a small sample image classification method based on a feature pyramid and feature fusion.
In order to achieve the purpose, the technical scheme adopted by the invention is as follows:
a small sample image classification method based on feature pyramid and feature fusion comprises the following steps:
s1, constructing a characteristic pyramid relation network model of a plurality of layers of neural networks, wherein each layer of neural network comprises a characteristic extraction module, a relation module and a characteristic fusion module;
s2, acquiring a data set, expanding the data set, and dividing the expanded data set into a training set, a verification set and a test set;
s3, training a characteristic pyramid relation network model by adopting a C-way K-shot mode, and sampling a support set and a query set from a training set respectively in each training;
s4, inputting a support set image and a query set image, extracting the features of the images by a feature extraction module, outputting the feature vectors of the images, and fusing the feature vectors of the support set image and the query set image by a feature fusion module;
s5, inputting the fused feature vectors into a relation module, outputting the similarity values of the support set image and the query set image by the relation module, and processing the similarity values output by all the relation modules to obtain a final similarity value;
s6, calculating the loss of the characteristic pyramid relation network model, updating parameters of the characteristic pyramid relation network model, and repeating iterative training until the loss error value tends to be stable;
and S7, storing the trained characteristic pyramid relation network model, and using the characteristic pyramid relation network model for small sample image classification testing.
Further, the feature fusion module comprises a feature fusion term:
C′(F_S, F_Q) = Concate(F_S, F_Q, Mul(F_S, F_Q))
where F_S denotes the feature vector of a support set image, F_Q denotes the feature vector of a query set image, Concate(·,·,·) denotes concatenation along the feature channel, and Mul(·,·) denotes element-wise multiplication of the feature maps at corresponding positions.
Further, in step S6, scoring the similarity of a pair of images is treated as a regression task, and the mean square error (MSE) function is used as the loss function of each layer of the network:
MSE(r, y_S, y_Q) = (r − 1(y_S == y_Q))²
where r denotes the similarity score output by a layer of the network, y_S denotes the label of the support set image, and y_Q denotes the label of the query set image.
Further, in step S6, the loss of the feature pyramid relation network model is calculated with the loss function:
Loss = Σ_{l=1}^{n} MSE(r_l, y_S, y_Q)
where r_l denotes the similarity score output by the l-th layer of the network, y_S denotes the label of the support set image, y_Q denotes the label of the query set image, MSE denotes the mean square error function, and n denotes the number of network layers.
Further, in step S2, the acquired data set is expanded by rotation at 90 degrees, 180 degrees, and 270 degrees.
Further, in the feature pyramid relation network model, the activation function of the last fully connected layer of the relation module is a Sigmoid function, and all other activation functions are ReLU functions.
A small sample image classification system based on feature pyramid and feature fusion comprises:
a feature extraction module, used for extracting features of the input images;
a feature fusion module, used for fusing the features of the input images;
and a relation module, used for judging the similarity between the support set image features and the query set image features.
Further, the feature extraction module comprises four convolution blocks and two 2×2 max-pooling layers, connected in the order: convolution block, max-pooling layer, convolution block, max-pooling layer, convolution block, convolution block.
Further, the relation module comprises two convolution blocks, two 2×2 max-pooling layers, a ReLU fully connected layer and a Sigmoid fully connected layer, connected in the order: convolution block, max-pooling layer, convolution block, max-pooling layer, ReLU fully connected layer, Sigmoid fully connected layer.
Further, each convolution block comprises a convolution layer, a Batch Norm layer and a ReLU activation layer; the convolution kernel size of the convolution layer is 3×3 and the number of output channels is 64.
Compared with the prior art, the invention improves the classification accuracy of small sample images by constructing a feature pyramid relation network (FPRN) model; because the FPRN model is lightweight, detection results can still be obtained quickly, with high accuracy.
Drawings
FIG. 1 is a schematic diagram of the feature pyramid relation network FPRN.
FIG. 2 is a schematic structural diagram of the feature extraction module.
FIG. 3 is a schematic structural diagram of the relation module.
FIG. 4 is a schematic structural diagram of the convolution block Conv Block.
Detailed Description
The method and system for classifying small sample images based on feature pyramid and feature fusion of the present invention will be further described with reference to the accompanying drawings and specific embodiments.
The invention discloses a small sample image classification method based on a feature pyramid and feature fusion, which comprises the following steps:
S1, constructing a feature pyramid relation network model with several layers of neural networks, wherein each layer of neural network comprises a feature extraction module, a relation module and a feature fusion module.
S2, acquiring a data set, expanding it, and dividing the expanded data set into a training set, a verification set and a test set.
S3, training the feature pyramid relation network model in C-way K-shot mode, sampling a support set and a query set from the training set in each training episode.
S4, inputting support set images and query set images, extracting their features with the feature extraction module, outputting the feature vectors of the images, and fusing the feature vectors of the support set and query set images with the feature fusion module.
S5, inputting the fused feature vectors into the relation module, which outputs similarity values between the support set and query set images, and processing the similarity values output by all relation modules to obtain a final similarity value.
S6, calculating the loss of the feature pyramid relation network model, updating its parameters, and iterating the training until the loss value stabilizes.
S7, saving the trained feature pyramid relation network model and using it for small sample image classification testing.
Referring to fig. 1, the invention also discloses a small sample image classification system based on feature pyramid and feature fusion, which comprises a feature extraction module, a feature fusion module and a relation module, wherein the feature extraction module is used for extracting features of input images, the feature fusion module is used for fusing the features of the input images, and the relation module is used for judging the similarity of the features of the input support set images and the features of the query set images.
Specifically, in a neural network, the deeper the layer, the larger its receptive field and the more it attends to the global features of the image; the shallower the layer, the smaller its receptive field and the more it attends to local features. For example, when classifying animals, a deep layer can distinguish species-specific characteristics, while a shallow layer extracts hair features, background texture features and the like, which can also help distinguish animal species. Based on this, the invention proposes the Feature Pyramid Relation Network (FPRN) model.
The feature pyramid relation network model has several layers of neural networks, each comprising a Feature Extraction Module (FEM), a feature fusion module, and a Relation Module (RM). The feature extraction module extracts features of the input images, the feature fusion module fuses them, and the relation module judges the similarity between support set image features and query set image features.
As shown in fig. 2, the feature extraction module comprises four convolution blocks and two 2×2 max-pooling layers, connected in the order: convolution block, max-pooling layer, convolution block, max-pooling layer, convolution block, convolution block. The support and query features output by each convolution block form a pair of feature maps used for feature fusion.
As shown in fig. 3, the relation module comprises two convolution blocks, two 2×2 max-pooling layers, a ReLU fully connected layer and a Sigmoid fully connected layer, connected in the order: convolution block, max-pooling layer, convolution block, max-pooling layer, ReLU fully connected layer, Sigmoid fully connected layer. The input of the relation module is the features fused by the feature fusion module, and its output is a similarity score used to judge the similarity between the support set image features and the query set image features.
As shown in FIG. 4, each convolution block consists of a convolution layer with a 3×3 kernel and 64 output channels, a Batch Norm layer, and a ReLU activation layer.
In the feature pyramid relational network model, all other activation functions are ReLU functions except that the activation function of the last full-connection layer of the relational module is a Sigmoid function. The last output of the relationship module uses the Sigmoid function because the present invention expects to output a similarity score between 0 and 1.
The feature fusion module comprises a feature fusion term:
C′(F_S, F_Q) = Concate(F_S, F_Q, Mul(F_S, F_Q))
where F_S denotes the feature map produced by a convolution block for a support set image, F_Q denotes the feature map produced by a convolution block for a query set image, Concate(·,·,·) denotes concatenation along the feature channel, and Mul(·,·) denotes element-wise multiplication of the feature maps at corresponding positions.
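The fusion term C′ above can be sketched directly in NumPy. This is a minimal illustration under assumed (C, H, W) feature-map shapes, not the patent's implementation:

```python
import numpy as np

def feature_fusion(f_s, f_q):
    """C'(F_S, F_Q) = Concate(F_S, F_Q, Mul(F_S, F_Q)).

    f_s, f_q: feature maps of shape (C, H, W) for a support image and a
    query image. Concatenation is along the channel axis; Mul is the
    element-wise product at corresponding positions.
    """
    return np.concatenate([f_s, f_q, f_s * f_q], axis=0)

f_s = np.ones((64, 5, 5))        # assumed support feature map
f_q = 2 * np.ones((64, 5, 5))    # assumed query feature map
fused = feature_fusion(f_s, f_q)
print(fused.shape)  # (192, 5, 5): the channel count triples
```

The fused tensor is what the relation module of the corresponding layer receives as input.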
The acquired data set is expanded by rotations of 90, 180 and 270 degrees.
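The rotation-based expansion can be sketched as follows. Whether rotated copies keep their original labels or form new classes is not specified in the text, so this hedged sketch only generates the rotated images:

```python
import numpy as np

def expand_with_rotations(images):
    """Return the original images plus their 90, 180 and 270 degree rotations.

    images: array of shape (N, H, W). The result stacks all four
    orientations, quadrupling the number of images.
    """
    rotations = [np.rot90(images, k=k, axes=(1, 2)) for k in range(4)]
    return np.concatenate(rotations, axis=0)

imgs = np.arange(2 * 4 * 4).reshape(2, 4, 4).astype(float)
expanded = expand_with_rotations(imgs)
print(expanded.shape)  # (8, 4, 4): 4x as many images
```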
For the C-way 1-shot problem, the support set feature map and the query set feature map extracted by the feature extraction module can be fused directly. For the C-way K-shot (K > 1) problem, the feature maps the feature extraction module extracts from the K support images of a class are added element-wise at corresponding positions, the result is fused with the query feature map, and the similarity score is then computed in the relation module.
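The element-wise aggregation of the K support feature maps described above can be sketched as follows (shapes assumed for illustration):

```python
import numpy as np

def aggregate_support(support_maps):
    # C-way K-shot with K > 1: the K support feature maps of one class
    # are added element-wise at corresponding positions before being
    # fused with the query feature map.
    return np.sum(support_maps, axis=0)

# 5-shot example: five assumed (64, 5, 5) support feature maps of one class
k_shot_maps = np.stack([np.full((64, 5, 5), i + 1.0) for i in range(5)])
class_map = aggregate_support(k_shot_maps)
print(class_map.shape)  # (64, 5, 5): one aggregated map per class
```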
In the training stage, C categories are randomly drawn from the training set, and K samples are drawn from each category as the support set of the feature pyramid relation network (FPRN) model; a batch of samples drawn from the remaining data of the C categories serves as the query set. The FPRN model is expected to learn from the C×K support samples the ability to distinguish the C classes, and each training episode samples different classes.
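Episode sampling in the C-way K-shot training stage can be sketched as follows; the dictionary layout and function name are assumptions for illustration:

```python
import random

def sample_episode(dataset, c_way, k_shot, query_per_class):
    """Sample one C-way K-shot training episode.

    dataset: {class_label: [samples]}. Returns (support, query) lists of
    (sample, class) pairs; the C classes differ between episodes.
    """
    classes = random.sample(sorted(dataset), c_way)
    support, query = [], []
    for c in classes:
        picked = random.sample(dataset[c], k_shot + query_per_class)
        support += [(x, c) for x in picked[:k_shot]]
        query += [(x, c) for x in picked[k_shot:]]
    return support, query

# toy data set: 10 classes of 20 samples each
data = {f"class{i}": list(range(20)) for i in range(10)}
s, q = sample_episode(data, c_way=5, k_shot=1, query_per_class=10)
print(len(s), len(q))  # 5 50
```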
The feature vectors obtained from the support set image and the query set image at different network depths are fused by the feature fusion term. In the proposed feature fusion term, Mul(F_S, F_Q) introduces interaction between F_S and F_Q, making the regions of interest more prominent and the similarity easier for the feature pyramid relation network to judge.
Each relation module outputs a similarity score between 0 and 1, and the scores output by the relation modules of all layers are combined by weighted averaging to obtain the final similarity score of the Feature Pyramid Relation Network (FPRN) model.
Scoring the similarity of a support set image and query set image pair is treated as a regression task, and the mean square error (MSE) function is used as the loss function of each layer of the network:
MSE(r, y_S, y_Q) = (r − 1(y_S == y_Q))²
where r denotes the similarity score output by a layer of the network, y_S denotes the label of the support set image, and y_Q denotes the label of the query set image. The indicator (y_S == y_Q) takes the value 1 when the labels are the same and 0 when they differ.
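A minimal sketch of this per-layer loss, with the indicator implemented as a label comparison (function name assumed):

```python
def mse_loss(r, y_s, y_q):
    # MSE(r, y_S, y_Q) = (r - 1(y_S == y_Q))^2, where the indicator
    # 1(y_S == y_Q) is 1 for same-class pairs and 0 otherwise.
    target = 1.0 if y_s == y_q else 0.0
    return (r - target) ** 2

print(mse_loss(0.9, "cat", "cat"))  # small loss: labels match, score near 1
print(mse_loss(0.9, "cat", "dog"))  # large loss: labels differ, score near 1
```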
In the Feature Pyramid Relation Network (FPRN) model, the relation module of each layer outputs a similarity score, so the total loss function of the FPRN model is:
Loss = Σ_{l=1}^{n} MSE(r_l, y_S, y_Q)
where r_l denotes the similarity score output by the l-th layer, y_S denotes the label of the support set image, y_Q denotes the label of the query set image, MSE denotes the mean square error function, and n denotes the number of layers.
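The total loss, summing the per-layer MSE terms over all n layers, can be sketched as follows (uniform summation assumed, per the formula above):

```python
def total_loss(layer_scores, y_s, y_q):
    # Loss = sum over layers l = 1..n of MSE(r_l, y_S, y_Q)
    target = 1.0 if y_s == y_q else 0.0
    return sum((r - target) ** 2 for r in layer_scores)

# assumed per-layer similarity scores r_l for a same-class pair
scores = [0.8, 0.9, 0.95]
loss = total_loss(scores, "cat", "cat")
print(loss)  # sums the squared errors of all three layers
```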
The loss of the feature pyramid relation network (FPRN) model is computed with this loss function, and the model parameters are updated by back-propagation. Training iterates until the loss value computed by the loss function stabilizes.
The trained feature pyramid relation network model is saved and used for small sample image classification testing. The proposed small sample image classification method based on the feature pyramid and feature fusion achieves good results on two public data sets.
The Omniglot dataset contains 1623 character classes from 50 different languages, each class containing 20 samples written by different people. During training, each 20-way 1-shot episode consists of 1 support set image and 10 query set images per category, and each 20-way 5-shot episode consists of 5 support set images and 5 query set images per category. During testing, the classification result of the feature pyramid relation network (FPRN) model is evaluated over 1000 episodes randomly sampled from the test set, sampling 1 test set image per episode in the 1-shot setting and 5 in the 5-shot setting.
The miniImagenet dataset consists of 60000 color images in 100 categories, 600 samples each, with 64 categories for training, 16 for validation and 20 for testing. On miniImagenet, the invention adopts the 5-way 1-shot and 5-way 5-shot settings. During training, each 5-way 1-shot episode consists of 1 support set image and 15 query set images per category, and each 5-way 5-shot episode consists of 5 support set images and 10 query set images per category. During testing, the classification result of the Feature Pyramid Relation Network (FPRN) model is evaluated over 600 randomly sampled episodes, sampling 15 test set images each time in both the 5-way 1-shot and 5-way 5-shot settings.
The invention compares the image classification results of the Feature Pyramid Relation Network (FPRN) model with other popular metric-learning-based small sample learning models: Siamese Networks, Prototype Networks, Matching Networks and Relation Networks. The comparison of the FPRN model against these baselines on the Omniglot dataset is shown in Table 1.
Table 1 Omniglot dataset experimental results
(table values are embedded as an image in the source and are not reproduced here)
The comparison of the Feature Pyramid Relation Network (FPRN) model against the Siamese Network, Prototype Network, Matching Network and Relation Network baselines on the miniImagenet dataset is shown in Table 2.
Table 2 miniImagenet dataset experimental results
(table values are embedded as an image in the source and are not reproduced here)
As Tables 1 and 2 show, the experimental data indicate that the proposed Feature Pyramid Relation Network (FPRN) model achieves the highest classification accuracy in every experiment. On the Omniglot dataset, the proposed FPRN model reaches 98.3% classification accuracy in the 20-way 1-shot setting and 99.2% in the 20-way 5-shot setting. On the miniImagenet dataset, it reaches 50.2% classification accuracy in the 5-way 1-shot setting and 66.7% in the 5-way 5-shot setting.
The invention also compares the detection speed of the Relation Network model and the Feature Pyramid Relation Network (FPRN) model in the 5-way 1-shot setting on the miniImagenet dataset. On the NVIDIA Quadro P2000 graphics card used in the experiment, the inference speed of the Relation Network is 17.1 fps and that of the FPRN model is 16.3 fps, i.e. the FPRN model is 4.7% slower. In this setting, the detection accuracy of the FPRN model is 50.2% and that of the Relation Network model is 47.3%: an absolute improvement of 2.9 percentage points, or a relative improvement of 6.1%. The FPRN model thus trades 4.7% of detection speed for a 6.1% relative accuracy gain.
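The reported accuracy and speed trade-off can be reproduced arithmetically (values taken from the text above):

```python
fprn_acc, rn_acc = 50.2, 47.3   # 5-way 1-shot accuracy (%)
fprn_fps, rn_fps = 16.3, 17.1   # inference speed (frames per second)

absolute_gain = fprn_acc - rn_acc                 # accuracy, percentage points
relative_gain = absolute_gain / rn_acc * 100      # accuracy, relative %
speed_cost = (rn_fps - fprn_fps) / rn_fps * 100   # slowdown, relative %

print(round(absolute_gain, 1), round(relative_gain, 1), round(speed_cost, 1))
# 2.9 6.1 4.7
```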
In conclusion, the invention improves the accuracy of small sample image classification by constructing the feature pyramid relation network (FPRN) model; because the FPRN model is lightweight, detection results can still be obtained quickly, with high accuracy.
The above description is intended to describe in detail the preferred embodiments of the present invention, but the embodiments are not intended to limit the scope of the claims of the present invention, and all equivalent changes and modifications made within the technical spirit of the present invention should fall within the scope of the claims of the present invention.

Claims (10)

1. A small sample image classification method based on feature pyramid and feature fusion is characterized by comprising the following steps:
s1, constructing a characteristic pyramid relation network model of a plurality of layers of neural networks, wherein each layer of neural network comprises a characteristic extraction module, a relation module and a characteristic fusion module;
s2, acquiring a data set, expanding the data set, and dividing the expanded data set into a training set, a verification set and a test set;
s3, training a characteristic pyramid relation network model by adopting a C-way K-shot mode, and sampling a support set and a query set from a training set respectively in each training;
s4, inputting a support set image and a query set image, extracting the features of the images by a feature extraction module, outputting the feature vectors of the images, and fusing the feature vectors of the support set image and the query set image by a feature fusion module;
s5, inputting the fused feature vectors into a relation module, outputting the similarity values of the support set image and the query set image by the relation module, and processing the similarity values output by all the relation modules to obtain a final similarity value;
s6, calculating the loss of the feature pyramid relation network model, updating the parameters of the feature pyramid relation network model, and repeating the iterative training until the loss error value stabilizes;
and S7, saving the trained feature pyramid relation network model and using it for small sample image classification testing.
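The C-way K-shot episodic sampling of step S3 can be sketched as follows. This is an illustrative helper, not part of the claim: the `pool` data layout, the function name `sample_episode`, and the query size `Q` are my own assumptions.

```python
# Sketch of C-way K-shot episode sampling: draw C classes from a labelled pool,
# then K support images and Q query images per class (assumed helper, not from the patent).
import random

def sample_episode(pool, C=5, K=1, Q=15, rng=random):
    """pool: dict mapping class label -> list of images."""
    classes = rng.sample(sorted(pool), C)
    support, query = [], []
    for label in classes:
        imgs = rng.sample(pool[label], K + Q)
        support += [(img, label) for img in imgs[:K]]
        query += [(img, label) for img in imgs[K:]]
    return support, query

# toy pool: 10 classes, 20 "images" (ints) each
pool = {c: list(range(20)) for c in range(10)}
support, query = sample_episode(pool, C=5, K=1, Q=15)
print(len(support), len(query))  # 5 75
```

Each episode thus yields C*K support samples and C*Q query samples, matching the 5-way 1-shot setting used in the experiments.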
2. The feature pyramid and feature fusion based small sample image classification method of claim 1, characterized in that the feature fusion module includes a feature fusion term, which is:
C′(F_S, F_Q) = Concate(F_S, F_Q, Mul(F_S, F_Q))
where F_S represents the feature vector of the support set image, F_Q represents the feature vector of the query set image, Concate(·,·) denotes concatenation along the feature channel, and Mul(·,·) denotes element-wise multiplication of the feature maps.
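The fusion term above can be sketched minimally in NumPy; axis 0 plays the role of the feature-channel dimension, and the shapes are illustrative assumptions.

```python
# Minimal sketch of C'(F_S, F_Q) = Concate(F_S, F_Q, Mul(F_S, F_Q)):
# element-wise product, then concatenation along the channel axis.
import numpy as np

def fuse(f_s, f_q):
    return np.concatenate([f_s, f_q, f_s * f_q], axis=0)

f_s = np.ones((64, 5, 5))          # support-set feature map (C, H, W)
f_q = np.full((64, 5, 5), 2.0)     # query-set feature map
fused = fuse(f_s, f_q)
print(fused.shape)                 # (192, 5, 5): channel count triples
```

Note that the fused tensor has three times the channel count of either input, which the downstream relation module must account for.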
3. The small sample image classification method based on feature pyramid and feature fusion as claimed in claim 1, characterized in that in step S6, predicting the similarity score of a pair of images is treated as a regression task, and the mean square error (MSE) function is used as the loss function of each layer of neural network, the MSE function being:
MSE(r, y_S, y_Q) = (r − 1(y_S == y_Q))^2
where r represents the similarity score output by each layer of neural network, y_S represents the label of the support set image, and y_Q represents the label of the query set image.
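The per-layer MSE term can be sketched directly from the formula; the function name is my own, and 1(·) is realized as a label-equality test.

```python
# Per-layer MSE loss: target is 1 when the support and query labels match, else 0.
def mse_loss(r, y_s, y_q):
    target = 1.0 if y_s == y_q else 0.0
    return (r - target) ** 2

print(mse_loss(0.8, 3, 3))  # ≈ 0.04: matching labels, score should approach 1
print(mse_loss(0.5, 0, 1))  # 0.25: mismatched labels, score should approach 0
```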
4. The small sample image classification method based on feature pyramid and feature fusion as claimed in claim 3, characterized in that in step S6, the loss of the feature pyramid relation network model is calculated by using a loss function, and the loss function is:
Loss = Σ_{l=1}^{n} MSE(r_l, y_S, y_Q)
where r_l represents the similarity score output by the l-th layer of neural network, y_S represents the label of the support set image, y_Q represents the label of the query set image, MSE represents the mean square error function, and n represents the number of neural network layers.
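The total loss of claim 4 sums the per-layer MSE terms over the n pyramid levels; a minimal sketch, assuming the per-layer scores are given as a list:

```python
# Total loss: sum of per-layer MSE terms over the n pyramid levels
# (the list `scores` holds r_1 .. r_n; helper name is my own).
def total_loss(scores, y_s, y_q):
    target = 1.0 if y_s == y_q else 0.0
    return sum((r - target) ** 2 for r in scores)

# three pyramid levels, matching labels: 0.01 + 0.04 + 0.09
print(total_loss([0.9, 0.8, 0.7], 2, 2))  # ≈ 0.14
```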
5. The method for classifying small sample images based on feature pyramid and feature fusion as claimed in claim 1, wherein in step S2, the acquired data set is expanded by rotation at 90 degrees, 180 degrees and 270 degrees.
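The rotation-based expansion of claim 5 quadruples the data set; a sketch with NumPy, where the helper name is my own:

```python
# Data set expansion by 90, 180 and 270 degree rotations (claim 5):
# each image contributes the original plus three rotated copies.
import numpy as np

def expand_with_rotations(images):
    out = []
    for img in images:
        out += [img, np.rot90(img, 1), np.rot90(img, 2), np.rot90(img, 3)]
    return out

imgs = [np.arange(16).reshape(4, 4)]
expanded = expand_with_rotations(imgs)
print(len(expanded))  # 4: the original plus three rotated copies
```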
6. The method for classifying small sample images based on feature pyramid and feature fusion as claimed in claim 1, wherein in the feature pyramid relation network model, the activation function of the last fully connected layer of the relation module is the Sigmoid function, and all other activation functions are the ReLU function.
7. A small sample image classification system based on feature pyramid and feature fusion is characterized by comprising:
the characteristic extraction module is used for extracting the characteristics of the input image;
the characteristic fusion module is used for fusing the characteristics of the input image;
and the relation module is used for judging the similarity of the input image characteristics of the support set and the image characteristics of the query set.
8. The small sample image classification system based on feature pyramid and feature fusion of claim 7, characterized in that the feature extraction module comprises four convolution blocks and two 2×2 max pooling layers, connected in the sequence: convolution block, max pooling layer, convolution block, max pooling layer, convolution block, convolution block.
9. The feature pyramid and feature fusion based small sample image classification system of claim 7, wherein the relation module comprises two convolution blocks, two 2×2 max pooling layers, a ReLU fully connected layer and a Sigmoid fully connected layer, connected in the sequence: convolution block, max pooling layer, convolution block, max pooling layer, ReLU fully connected layer, Sigmoid fully connected layer.
10. The feature pyramid and feature fusion based small sample image classification system of claim 8 or 9, where the convolution block comprises a convolution layer with a 3×3 convolution kernel and 64 output channels, a Batch Norm layer and a ReLU activation function layer.
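The convolution block of claim 10 and the feature extraction module of claim 8 can be sketched in PyTorch. This is an illustrative reconstruction, not the patent's implementation: the padding, the conv-pool ordering, and the 84×84 input size are my own assumptions (the latter matching the standard miniImagenet resolution).

```python
# Sketch of the claim-10 convolution block (3x3 conv, 64 channels, BatchNorm, ReLU)
# and the claim-8 feature extraction module (four blocks, two 2x2 max pools).
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch=64):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

feature_extractor = nn.Sequential(
    conv_block(3), nn.MaxPool2d(2),     # conv block, 2x2 max pool
    conv_block(64), nn.MaxPool2d(2),    # conv block, 2x2 max pool
    conv_block(64),                     # third conv block
    conv_block(64),                     # fourth conv block
)

x = torch.randn(1, 3, 84, 84)           # assumed miniImagenet-sized input
print(feature_extractor(x).shape)       # torch.Size([1, 64, 21, 21])
```

With only two pooling stages, the output keeps a 21×21 spatial extent, which leaves room for the later pyramid levels and the fusion step.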
CN202210733595.6A 2022-06-27 2022-06-27 Small sample image classification method and system based on feature pyramid and feature fusion Pending CN115272692A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210733595.6A CN115272692A (en) 2022-06-27 2022-06-27 Small sample image classification method and system based on feature pyramid and feature fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210733595.6A CN115272692A (en) 2022-06-27 2022-06-27 Small sample image classification method and system based on feature pyramid and feature fusion

Publications (1)

Publication Number Publication Date
CN115272692A true CN115272692A (en) 2022-11-01

Family

ID=83761081

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210733595.6A Pending CN115272692A (en) 2022-06-27 2022-06-27 Small sample image classification method and system based on feature pyramid and feature fusion

Country Status (1)

Country Link
CN (1) CN115272692A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115775340A (en) * 2023-02-13 2023-03-10 北京科技大学 Feature modulation-based self-adaptive small sample image classification method and device
CN116597167A (en) * 2023-06-06 2023-08-15 中国人民解放军92942部队 Permanent magnet synchronous motor small sample demagnetization fault diagnosis method, storage medium and system
CN116597167B (en) * 2023-06-06 2024-02-27 中国人民解放军92942部队 Permanent magnet synchronous motor small sample demagnetization fault diagnosis method, storage medium and system

Similar Documents

Publication Publication Date Title
CN108171209B (en) Face age estimation method for metric learning based on convolutional neural network
CN109241317B (en) Pedestrian Hash retrieval method based on measurement loss in deep learning network
CN108960409B (en) Method and device for generating annotation data and computer-readable storage medium
Unnikrishnan et al. Toward objective evaluation of image segmentation algorithms
WO2019015246A1 (en) Image feature acquisition
CN115272692A (en) Small sample image classification method and system based on feature pyramid and feature fusion
CN110717554B (en) Image recognition method, electronic device, and storage medium
CN109740679B (en) Target identification method based on convolutional neural network and naive Bayes
CN112200211B (en) Small sample fish identification method and system based on residual network and transfer learning
CN109063112B (en) Rapid image retrieval method, model and model construction method based on multitask learning deep semantic hash
CN111950528B (en) Graph recognition model training method and device
CN109919252B (en) Method for generating classifier by using few labeled images
CN108009560B (en) Commodity image similarity category judgment method and device
CN110738102A (en) face recognition method and system
CN111860656B (en) Classifier training method, device, equipment and storage medium
CN112784921A (en) Task attention guided small sample image complementary learning classification algorithm
Xu et al. Discriminative analysis for symmetric positive definite matrices on lie groups
CN111325237A (en) Image identification method based on attention interaction mechanism
CN112132145A (en) Image classification method and system based on model extended convolutional neural network
CN114299362A (en) Small sample image classification method based on k-means clustering
CN112232374A (en) Irrelevant label filtering method based on depth feature clustering and semantic measurement
Sunitha et al. Novel content based medical image retrieval based on BoVW classification method
CN111340067B (en) Redistribution method for multi-view classification
Cho et al. A space-time graph optimization approach based on maximum cliques for action detection
CN114299342B (en) Unknown mark classification method in multi-mark picture classification based on deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination