CN114723998A - Small sample image classification method and device based on large-boundary Bayes prototype learning - Google Patents

Small sample image classification method and device based on large-boundary Bayes prototype learning

Info

Publication number
CN114723998A
CN114723998A (application CN202210482490.8A; granted as CN114723998B)
Authority
CN
China
Prior art keywords
sample
class
model
prototype
distribution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210482490.8A
Other languages
Chinese (zh)
Other versions
CN114723998B (en)
Inventor
李晓旭
郭紫杰
刘俊
武继杰
宋琪
张文斌
曾俊瑀
马占宇
陶剑
董洪飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lanzhou University of Technology
Original Assignee
Lanzhou University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lanzhou University of Technology filed Critical Lanzhou University of Technology
Priority to CN202210482490.8A priority Critical patent/CN114723998B/en
Publication of CN114723998A publication Critical patent/CN114723998A/en
Application granted granted Critical
Publication of CN114723998B publication Critical patent/CN114723998B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06F 18/24155 Bayesian classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 5/00 Computing arrangements using knowledge-based models
    • G06N 5/04 Inference or reasoning models
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00 Road transport of goods or passengers
    • Y02T 10/10 Internal combustion engine [ICE] based vehicles
    • Y02T 10/40 Engine management systems

Abstract

The invention discloses a small sample image classification method and device based on large-boundary Bayesian prototype learning. The method comprises the following steps: S1, preprocessing data, the data comprising a training set and a test set; S2, constructing a Bayesian prototype distribution model based on a convolutional neural network and a Gaussian distribution model; S3, solving the model parameters based on the optimized objective function of the Bayesian prototype distribution model; and S4, classifying the images of the test set with the optimized Bayesian prototype distribution model and evaluating the performance of the model. The invention introduces the ideas of large-boundary classification and variational inference, builds on distribution-estimation modeling, and establishes a small sample image classification method and device based on large-boundary Bayesian prototype learning, solving the prototype-deviation problem in small sample image classification, improving the classification of images, and having high practical value.

Description

Small sample image classification method and device based on large-boundary Bayes prototype learning
Technical Field
The invention relates to the technical field of image classification, in particular to a small sample image classification method and device based on large-boundary Bayesian prototype learning.
Background
In recent years, with the development of computer technology, the information people browse has become increasingly rich; a large number of pictures are uploaded to the network every day, and because there are so many of them, they cannot be classified manually. On many large-sample image classification tasks, the recognition performance of machines has surpassed that of humans. However, when the sample size is small, the recognition level of machines still lags far behind that of humans. There is therefore an urgent social need for efficient and reliable image classification algorithms.
Human beings can recognize a new object from a very small number of samples; for example, a child only needs to see a single picture in a book to accurately judge what a "banana" or a "strawberry" is. Small sample learning means that researchers expect a machine learning model that has learned a large amount of data of certain categories to learn quickly from only a small amount of data when it meets a new category.
Small sample classification belongs to the category of small sample learning and usually involves two sets of data with disjoint class spaces: base class data and new class data. Small sample classification aims to learn classification rules using the knowledge learned from the base class data together with a small number of labeled samples of the new class data, and to accurately predict the classes of the unlabeled samples in a new class task.
In prior metric learning, a class is predicted by point estimation using a fixed point, generally the center point; but such single-point metric learning has an inherent vulnerability to noise and easily produces deviation. Estimating such a specific point is difficult, for example when the limited support points are unevenly distributed in the embedding space, because they are easily affected by noise. Furthermore, such points also lack interpretability, because a single embedding is not sufficient to represent a class. We consider that the points of one class are not isolated in the embedding space but are sampled from a high-dimensional distribution, and the distribution of each class describes it better than a few points can. Therefore, to address the problems caused by point estimation, a distribution-estimation framework based on variational inference is proposed; although exact inference is intractable, variational inference is theoretically attractive and computationally practical.
However, the variational inference method needs a model over the complete data set, which makes embedding all elements inefficient, so the resulting distributions are poor and the deviation of the class prototypes is large; in addition, the optimization scheme of MAML (model-agnostic meta-learning) can affect the gradient. Therefore, building on prior work, a generation mechanism is shared between the support set and the target set and then combined into a large class-specific distribution, so that the confidence of the class of a target point can be computed intuitively from the distribution of each class and used for classification, which helps keep the samples from being affected by deviation.
Disclosure of Invention
Aiming at the prototype-deviation problem in small sample image classification, the invention provides a small sample image classification method and device based on large-boundary Bayesian prototype learning. It introduces the ideas of large-boundary classification and variational inference, uses distribution modeling to improve the MAML (model-agnostic meta-learning) optimization scheme, and increases intra-class similarity while reducing inter-class similarity. Based on variational inference, the invention replaces the point estimation of metric learning with class estimation, eliminating the inherent noise vulnerability and deviation of single-point metric learning. It introduces a small number of random variables to represent class prototypes, learns the posterior distribution of the samples by variational inference on this basis, shares a generation mechanism between the support set and the target set, and combines them into a large class-specific distribution. The confidence of the class of a target point is computed intuitively from the estimated class distributions and then used for classification to obtain the prediction result, which avoids the influence of deviation on the samples, enhances classification accuracy and robustness, and reduces cost.
In order to achieve the above purpose, the invention provides the following technical scheme:
on one hand, the invention provides a small sample image classification method based on large-boundary Bayesian prototype learning, which comprises the following steps:
s1, preprocessing data, wherein the data comprise a training set and a test set;
s2, constructing a Bayesian prototype distribution model based on a convolutional neural network and a Gaussian distribution model, the model being composed of an embedding module f_θ and a distribution mapping module; the embedding module is used for extracting sample features and comprises four convolution blocks, each comprising a convolutional layer, a pooling layer, and a nonlinear activation function; the distribution mapping module consists of a fully connected layer and a conversion layer and is used for converting the deep features of a sample into a multi-dimensional Gaussian distribution;
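The distribution mapping module described above (fully connected layer plus conversion layer) can be sketched in plain Python as follows; the softplus conversion and the half/half split of the output vector are assumptions, since the patent only states that one part of the vector is the mean and the other the raw variance:

```python
import math

def distribution_mapping(feature, W, b):
    """Map a deep feature vector to a diagonal Gaussian (mu, sigma^2).

    The fully connected layer produces an output vector whose first half
    is taken as the sample mean mu and whose second half is the raw
    variance vector; a softplus 'conversion layer' (an assumption -- the
    patent only requires the result to be a valid variance) makes it
    positive, giving sigma^2.
    """
    # fully connected layer: out = W @ feature + b
    out = [sum(w * f for w, f in zip(row, feature)) + b_i
           for row, b_i in zip(W, b)]
    d = len(out) // 2
    mu = out[:d]
    sigma2 = [math.log1p(math.exp(r)) for r in out[d:]]  # softplus > 0
    return mu, sigma2
```

With a 128-dimensional fully connected layer as stated in the patent, `len(out)` would be 128 and the resulting Gaussian 64-dimensional, matching the 64-dimensional distributions mentioned later.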
s3, solving model parameters based on the objective function of the optimized Bayesian prototype distribution model;
and S4, classifying the images of the test set by using the optimized Bayesian prototype distribution model, and evaluating the performance of the model.
Further, the preprocessing method of step S1 is:
s11, dividing the data into a training set D_train and a test set D_test whose class spaces are mutually exclusive; D_train serves as the base class data for training the model, and D_test serves as the new class data for evaluating the model's performance;
s12, for the C-way K-shot classification task, C categories are randomly selected from D_train and M samples are randomly selected in each category, of which K serve as support samples S_i and the remaining M−K serve as query samples Q_i; S_i and Q_i form a task T_i. Likewise, tasks T_j^test are formed from D_test.
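The C-way K-shot episode construction of step S12 can be sketched as follows; the `sample_episode` helper and its dictionary dataset format are illustrative, not from the patent:

```python
import random

def sample_episode(dataset, C=5, K=1, M=16, rng=random):
    """Sample one C-way K-shot task T_i from base-class data.

    dataset: dict mapping class label -> list of samples.
    Picks C classes and M samples per class; the first K of each class
    form the support set S_i, the remaining M-K form the query set Q_i.
    """
    classes = rng.sample(sorted(dataset), C)
    support, query = [], []
    for c in classes:
        picked = rng.sample(dataset[c], M)
        support += [(x, c) for x in picked[:K]]
        query += [(x, c) for x in picked[K:]]
    return support, query
```

A 5-way 1-shot episode with M = 16 thus yields 5 support samples and 75 query samples per task.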
Further, in step S11, the feature extractor F follows a four-layer convolution architecture, where each convolution block comprises a 3 × 3 convolution with 64 filters, batch normalization, a ReLU nonlinearity, and a 2 × 2 max pooling layer; the max pooling layers of the last two blocks are removed, and the output size of the feature extractor is 64 × 5 × 5 = 1600. A generator is then constructed with a 128-dimensional fully connected layer, and the 64-dimensional distributions of the support set and the target set are finally obtained through aggregation.
Further, in step S12, the fully connected layer maps the deep features of a sample into a vector, one part of which serves as the sample mean μ and the other part as the raw variance vector of the sample; the conversion layer then converts the raw variance vector into the sample's variance vector σ².
Further, step S3 specifically includes:
s31, for a task T_i in D_train, first inputting all support samples and query samples into the embedding module f_θ;
s32, using the convolutional neural network of the embedding module, passing the support samples through the convolutional layer, pooling layer, and activation layer in turn, finally extracting the features of the images;
s33, passing the support samples and query samples through the fully connected layer to obtain the global features of the samples;
s34, after a sample passes through the fully connected layer, taking one part of the vector as the sample mean μ and the other part as the raw variance vector, converting the raw variance vector into the variance vector σ² through the conversion layer, and finally obtaining the distribution N(μ(x_i), σ²(x_i)) of each support sample;
S35, obtaining a class prototype of the support sample by using the formulas (1) and (2);
Figure BDA0003628130460000042
Figure BDA0003628130460000043
x in the formula (1)iRepresentative sample, ScRepresents the C-th class, μ (x) in the support samplei) Representative sample xiMean value of (d) (. mu.)cRepresenting the mean value of the class C, namely the distribution of the class C, wherein the formula (1) represents the weighted harmonic mean of the class C, representing the positions of different classes of prototypes in the model, the formula (2) represents the variance of the class C, representing the fluctuation degree of the sample, and the formula (1) and the formula (2) are combined to represent the class prototypes of the sample;
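A minimal sketch of the class-prototype aggregation of formulas (1) and (2), assuming the "weighted harmonic mean" is the usual precision-weighted fusion of the per-sample Gaussians (samples with small variance contribute more to the class mean):

```python
def class_prototype(mus, sigma2s):
    """Aggregate the Gaussians of one class's support samples into a
    class prototype N(mu_c, sigma_c^2), per formulas (1) and (2).

    mus, sigma2s: equal-length lists of per-sample mean / variance
    vectors (plain Python lists); aggregation is per dimension.
    """
    d = len(mus[0])
    mu_c, sigma2_c = [], []
    for j in range(d):
        precisions = [1.0 / s[j] for s in sigma2s]  # 1 / sigma^2(x_i)
        total = sum(precisions)
        mu_c.append(sum(m[j] * p for m, p in zip(mus, precisions)) / total)
        sigma2_c.append(1.0 / total)  # harmonic aggregation of variances
    return mu_c, sigma2_c
```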
s36, calculating the prediction probability Pr(μ(x_i) | q(z|S_c)) of all support samples using formula (3):

Pr(μ(x_i) | q(z|S_c)) = N( μ(x_i); μ_c, σ_c² )   (3)

In formula (3), q(z|S_c) is the distribution of the class-c support set samples and Pr(·) denotes the prediction probability. The prediction probability of all support samples for class c is given by the Gaussian distribution in formula (3), where the mean μ_c represents the mean of the vectors in all dimensions and also the position of the sample, concentrating the class information extracted from the samples, and the variance σ_c² represents the significance in different dimensions;
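The Gaussian prediction probability of formula (3) and classification by class confidence can be sketched as follows; `predict` returns the class whose distribution assigns the embedded sample the highest log density (the normalization over classes is omitted, as it does not change the argmax):

```python
import math

def log_gaussian(x, mu, sigma2):
    """Diagonal-Gaussian log density log N(x; mu, sigma2), the (log)
    confidence of formula (3)."""
    return sum(-0.5 * (math.log(2 * math.pi * s) + (xj - mj) ** 2 / s)
               for xj, mj, s in zip(x, mu, sigma2))

def predict(x_mu, prototypes):
    """Classify an embedded sample by the class prototype under which its
    mean embedding has the highest log density.

    prototypes: dict mapping class label -> (mu_c, sigma2_c).
    """
    return max(prototypes, key=lambda c: log_gaussian(x_mu, *prototypes[c]))
```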
s37, obtaining the classification loss L_R(S) of the whole task T from the prediction probability:

L_R(S) = − Σ_{(x_i, y_i) ∈ S} log Pr( μ(x_i) | q(z|S_{y_i}) )   (4)

Formula (4) represents the classification loss of the whole task T, where S denotes the support set samples, c the class of a sample, Pr the prediction probability, and S_c the class-c support samples; the left side of the equation is the support-sample loss function and the right side is a cross-entropy loss;
s38, calculating the inter-class loss using formula (5), i.e. the KL divergence between the distributions of different classes, which measures the degree of similarity between the predicted and true samples:

L_inter = − Σ_{c ≠ c′} KL( q(z|S_c) ‖ q(z|S_{c′}) )   (5)
s39, calculating the total loss L using formula (6):

L = L_R + L_inter   (6)
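The total loss of formula (6) can be sketched as follows; the closed-form KL divergence between diagonal Gaussians is standard, but the pairwise summation and the sign of the inter-class term are assumptions, since the patent only names the terms L_R and L_inter:

```python
import math

def kl_diag_gauss(mu_p, s2_p, mu_q, s2_q):
    """KL( N(mu_p, s2_p) || N(mu_q, s2_q) ) for diagonal Gaussians."""
    return sum(0.5 * (math.log(sq / sp) + (sp + (mp - mq) ** 2) / sq - 1.0)
               for mp, sp, mq, sq in zip(mu_p, s2_p, mu_q, s2_q))

def total_loss(class_losses, prototypes):
    """L = L_R + L_inter per formula (6).

    The inter-class term here rewards well-separated class distributions
    by subtracting the pairwise KL divergences (assumed sign: minimizing
    L then pushes class prototypes apart, i.e. a large boundary).
    """
    l_r = sum(class_losses)
    names = sorted(prototypes)
    l_inter = -sum(kl_diag_gauss(*prototypes[a], *prototypes[b])
                   for i, a in enumerate(names) for b in names[i + 1:])
    return l_r + l_inter
```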
s310, adopting the MAML optimization scheme and performing multi-step stochastic gradient descent with the support-sample loss function of formula (6) to obtain new network parameters;
s311, calculating the classification prediction loss of the query samples under these parameters using formula (4), and updating the model with an Adam optimizer.
Further, the specific steps of step S4 are:
s41, each task T_j^test consists of a support set S_j^test and a query set Q_j^test; all support samples are passed through the Bayesian prototype distribution model to obtain the distribution N(μ(x), σ²(x)) of each sample x, and the class prototypes of the support samples are calculated using formulas (1) and (2):

μ_c = ( Σ_{x_i ∈ S_c} μ(x_i)/σ²(x_i) ) / ( Σ_{x_i ∈ S_c} 1/σ²(x_i) )   (1)

σ_c² = ( Σ_{x_i ∈ S_c} 1/σ²(x_i) )^(−1)   (2)

In formula (1), x_i denotes a sample, S_c the c-th class of support samples, μ(x_i) the mean of sample x_i, and μ_c the mean of class c, i.e. the distribution of class c. Formula (1) is the weighted harmonic mean of class c and represents the positions of the different class prototypes in the model; formula (2) is the variance of class c and represents the degree of fluctuation of the samples; together they represent the class prototype of the samples;
s42, inputting the query samples into the trained embedding module f_θ; the matrix features output by the embedding module are then passed through the Bayesian prototype distribution model to obtain the distribution of each class under the query samples, the confidence of the class of the target point is computed intuitively, and classification is performed with this confidence to obtain the prediction result.
On the other hand, the invention also provides a small sample image classification device based on large-boundary Bayesian prototype learning, used for realizing any one of the above methods and comprising the following modules:
a data preprocessing module: used for preprocessing the data, the data comprising a training set and a test set;
a network model construction module: used for introducing a convolutional neural network model and a multi-dimensional Gaussian distribution to construct the Bayesian prototype distribution model, the model being composed of an embedding module f_θ and a distribution mapping module; the embedding module is used for extracting sample features and comprises four convolution blocks, each comprising a convolutional layer, a pooling layer, and a nonlinear activation function; the distribution mapping module consists of a fully connected layer and a conversion layer and is used for converting the deep features of a sample into a multi-dimensional Gaussian distribution;
a training model parameter module: used for solving the model parameters based on the optimized objective function of the Bayesian prototype distribution model;
a test model performance module: used for classifying the images of the test set with the optimized Bayesian prototype distribution model and evaluating the performance of the model.
Compared with the prior art, the invention has the following beneficial effects:
The invention introduces the ideas of large-boundary classification and variational inference and builds on distribution-estimation modeling to establish a small sample image classification method and device based on large-boundary Bayesian prototype learning. It introduces a small number of random variables to represent class prototypes, learns the posterior distribution of the samples by variational inference on this basis, shares a generation mechanism between the support set and the target set, and combines them into a large class-specific distribution. By estimating the class distributions, the confidence of the class of a target point is computed intuitively and used for classification to obtain the prediction result, which helps keep the samples from being affected by deviation, enhances classification accuracy and robustness, and reduces cost. In general, the method, which allows a complete Bayesian analysis in the model, is easier to interpret than previous work, eliminates the bias of point estimation, and improves prediction performance. The small sample image classification method based on large-boundary Bayesian prototype learning skillfully applies the idea of variational inference and uses the MAML optimization scheme to strengthen the inherent features of classes and suppress irrelevant features, solving the prototype-deviation problem in small sample image classification, improving image classification performance, and having high practical value.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed in the embodiments are briefly described below. Obviously, the drawings in the following description are only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a variational bayesian prototype learning network of a computational prototype according to an embodiment of the present invention.
Fig. 2 is a block flow diagram of a small sample image classification device based on large-boundary bayesian prototype learning according to an embodiment of the present invention.
Fig. 3 is a flowchart of a convolutional neural network according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without inventive effort fall within the protection scope of the present invention.
According to one aspect disclosed herein, there is provided a small sample image classification method based on large-boundary bayesian prototype learning, as shown in fig. 1, comprising the following steps:
s1, preprocessing data, wherein the data comprise a training set and a test set;
specifically, the preprocessing method of step S1 is:
S11, dividing the data into a training set D_train and a test set D_test whose class spaces are mutually exclusive; D_train serves as the base class data for training the model, and D_test serves as the new class data for evaluating the model's performance. The feature extractor F follows a four-layer convolution architecture: each convolution block in the convolutional neural network comprises a 3 × 3 convolution with 64 filters, batch normalization, a ReLU nonlinearity, and a 2 × 2 max pooling layer, and the max pooling layers of the last two blocks are removed. The output size of the feature extractor is 64 × 5 × 5 = 1600. A generator is then constructed with a 128-dimensional fully connected layer, and the 64-dimensional distributions of the support set and the target set are finally obtained through aggregation.
S12, for the C-way K-shot classification task, C categories are randomly selected from D_train and M samples are randomly selected in each category, of which K serve as support samples S_i and the remaining M−K serve as query samples Q_i; S_i and Q_i form a task T_i. Likewise, tasks T_j^test are formed from D_test. The fully connected layer maps the deep features of a sample into a vector, one part of which serves as the sample mean μ and the other part as the raw variance vector of the sample; the conversion layer then converts the raw variance vector into the sample's variance vector σ².
And S2, constructing a Bayesian prototype distribution model based on the convolutional neural network and the Gaussian distribution model.
Specifically, the model is composed of an embedding module f_θ and a distribution mapping module; the embedding module is used for extracting sample features and comprises four convolution blocks, each comprising a convolutional layer, a pooling layer, and a nonlinear activation function; the distribution mapping module consists of a fully connected layer and a conversion layer and is used for converting the deep features of a sample into a multi-dimensional Gaussian distribution.
S3, based on the objective function of the optimized Bayesian prototype distribution model, solving model parameters;
specifically, step S3 includes:
S31, for a task T_i in D_train, first inputting all support samples and query samples into the embedding module f_θ;
S32, using the convolutional neural network of the embedding module, passing the support samples through the convolutional layer, pooling layer, and activation layer in turn, finally extracting the features of the images;
S33, passing the support samples and query samples through the fully connected layer to obtain the global features of the samples;
S34, after a sample passes through the fully connected layer, taking one part of the vector as the sample mean μ and the other part as the raw variance vector, converting the raw variance vector into the variance vector σ² through the conversion layer, and finally obtaining the distribution N(μ(x_i), σ²(x_i)) of each support sample;
S35, obtaining a class prototype of the support sample by using the formulas (1) and (2);
Figure BDA0003628130460000085
Figure BDA0003628130460000086
in the formula (1), xiRepresentative sample, ScRepresents the C-th class, μ (x) in the support samplei) Representative sample xiMean value of (a), mucRepresenting the distribution of the mean value of the class C, namely the class C, wherein the formula (1) represents the weighted harmonic mean of the class C, representing the positions of different classes of prototypes in the model, the formula (2) represents the variance of the class C, representing the fluctuation degree of the sample, and the formula (1) and the formula (2) are combined to represent the class prototypes of the sample;
S36, calculating the prediction probability Pr(μ(x_i) | q(z|S_c)) of all support samples using formula (3):

Pr(μ(x_i) | q(z|S_c)) = N( μ(x_i); μ_c, σ_c² )   (3)

In formula (3), q(z|S_c) is the distribution of the class-c support set samples and Pr(·) denotes the prediction probability. The prediction probability of all support samples for class c is given by the Gaussian distribution in formula (3), where the mean μ_c represents the mean of the vectors in all dimensions and also the position of the sample, concentrating the class information extracted from the samples, and the variance σ_c² represents the significance in different dimensions;
S37, obtaining the classification loss L_R(S) of the whole task T from the prediction probability:

L_R(S) = − Σ_{(x_i, y_i) ∈ S} log Pr( μ(x_i) | q(z|S_{y_i}) )   (4)

Formula (4) represents the classification loss of the whole task T, where S denotes the support set samples, c the class of a sample, Pr the prediction probability, and S_c the class-c support samples; the left side of the equation is the support-sample loss function and the right side is a cross-entropy loss;
S38, calculating the inter-class loss using formula (5), i.e. the KL divergence between the distributions of different classes, which measures the degree of similarity between the predicted and true samples:

L_inter = − Σ_{c ≠ c′} KL( q(z|S_c) ‖ q(z|S_{c′}) )   (5)
S39, calculating the total loss L using formula (6):

L = L_R + L_inter   (6)
S310, adopting the MAML optimization scheme and performing multi-step stochastic gradient descent with the support-sample loss function of formula (6) to obtain new network parameters;
S311, calculating the classification prediction loss of the query samples under these parameters using formula (4), and updating the model with an Adam optimizer.
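Steps S310 and S311 follow the MAML pattern: an inner loop of SGD steps on the support loss yields task-adapted parameters, and the query loss evaluated at those parameters drives the outer update. A first-order, single-parameter sketch (gradients are passed in as callables; the patent uses Adam for the outer update, plain SGD is shown here for brevity):

```python
def maml_step(theta, support_loss_grad, query_loss_grad,
              inner_lr=0.01, inner_steps=5, outer_lr=0.001):
    """One MAML-style update on a single task.

    Inner loop: several SGD steps on the support loss (formula (6)) give
    task-adapted parameters theta'. Outer loop: the query-loss gradient
    (formula (4)) evaluated at theta' updates the shared initialisation
    theta (first-order approximation, second derivatives ignored).
    """
    theta_prime = theta
    for _ in range(inner_steps):
        theta_prime = theta_prime - inner_lr * support_loss_grad(theta_prime)
    # outer update with the query loss at the adapted parameters
    return theta - outer_lr * query_loss_grad(theta_prime)
```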
And S4, classifying the images of the test set by using the optimized Bayesian prototype distribution model, and evaluating the performance of the model.
Specifically, the step S4 is:
s41, each task
Figure BDA0003628130460000092
By supporting set
Figure BDA0003628130460000093
And query set
Figure BDA0003628130460000094
Composition, all supporting samples are subjected to a Bayesian prototype distribution model to obtain the distribution of each sample x
Figure BDA0003628130460000095
Calculating a class prototype supporting the sample by using a formula (1) and a formula (2);
Figure BDA0003628130460000096
Figure BDA0003628130460000097
x in the formula (1)iRepresentative sample, ScRepresents the C-th class in the support sample, μ (x)i) Representative sample xiMean value of (d) (. mu.)cRepresenting the distribution of the mean value of the class C, namely the class C, wherein the formula (1) represents the weighted harmonic mean of the class C, representing the positions of different classes of prototypes in the model, the formula (2) represents the variance of the class C, representing the fluctuation degree of the sample, and the formula (1) and the formula (2) are combined to represent the class prototypes of the sample;
S42, inputting the query samples into the trained embedding module f_θ; the matrix features output by the embedding module are then passed through the Bayesian prototype distribution model to obtain the distribution of each class under the query samples, the confidence of the class of the target point is computed directly, and classification is performed with this confidence to obtain the prediction result.
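Step S4 as a whole can be sketched end to end on already-embedded features; the fixed per-sample variance and the simple mean prototype used here are simplifications of the model's learned distributions, for illustration only:

```python
import math

def evaluate_episode(support, query, sample_var=0.5):
    """Evaluate one test task: fuse each class's support embeddings into
    a Gaussian prototype (fixed per-sample variance assumed), then label
    each query by the class distribution giving it the highest log
    density, and return the accuracy.

    support, query: lists of (embedding_vector, class_label) pairs.
    """
    protos = {}
    for x, c in support:
        protos.setdefault(c, []).append(x)
    for c, xs in protos.items():
        d = len(xs[0])
        mu = [sum(x[j] for x in xs) / len(xs) for j in range(d)]
        protos[c] = (mu, [sample_var / len(xs)] * d)  # variance shrinks with shots

    def logpdf(x, mu, s2):
        return sum(-0.5 * (math.log(2 * math.pi * s) + (xj - mj) ** 2 / s)
                   for xj, mj, s in zip(x, mu, s2))

    correct = 0
    for x, c in query:
        pred = max(protos, key=lambda k: logpdf(x, *protos[k]))
        correct += pred == c
    return correct / len(query)
```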
On the other hand, the invention also provides a small sample image classification device based on large-boundary Bayesian prototype learning, which is used for realizing any one of the methods, and comprises the following modules:
a data preprocessing module: used for preprocessing the data, the data comprising a training set and a test set;
a network model construction module: used for introducing a convolutional neural network model and a multi-dimensional Gaussian distribution to construct the Bayesian prototype distribution model, the model being composed of an embedding module f_θ and a distribution mapping module; the embedding module is used for extracting sample features and comprises four convolution blocks, each comprising a convolutional layer, a pooling layer, and a nonlinear activation function; the distribution mapping module consists of a fully connected layer and a conversion layer and is used for converting the deep features of a sample into a multi-dimensional Gaussian distribution;
a training model parameter module: used for solving the model parameters by optimizing the objective function of the Bayesian prototype distribution model;
a test model performance module: used for classifying the images of the test set with the optimized Bayesian prototype distribution model and evaluating the performance of the model.
The method and the device have the following advantages:
1. To address the problem that single-point metric learning is fragile to inherent noise and prone to bias, biased class-specific distributions are eliminated through stochastic variational inference, classifier-free prediction is achieved from the distribution statistics of new samples, and the classification accuracy and robustness of the small sample learning classification framework are enhanced;
2. When the limited support points are unevenly distributed in the embedding space, a single specific point is hard to estimate and, being insufficient to represent a class, lacks interpretability; a distribution-estimation framework based on variational inference is therefore adopted, sampling from a high-dimensional distribution;
3. Building on the shared generation mechanism between the support set and the data set found in earlier variational-inference work, this mechanism is folded into a broad class-specific distribution and combined with Bayesian analysis to give a variational Bayesian framework; the model is optimized flexibly in the MAML fashion, and by observing the class confidences and classifying with them, the intrinsic characteristics of each class are reinforced and irrelevant features suppressed, thereby reducing bias.
The detailed implementation of the proposed method and device for classifying small sample images based on large-boundary Bayesian prototype learning has been described above with reference to the accompanying drawings. From the above description of the embodiments, the implementation of the method and the device will be clear to those skilled in the art.
It is noted that the implementations not described in the drawings and in the specification are all forms known to those of ordinary skill in the art and are not described in detail. Furthermore, the above definitions of the various elements and methods are not limited to the various specific structures, shapes or arrangements mentioned in the examples, which may be easily modified or substituted by those of ordinary skill in the art.
Further, unless specifically described or steps must occur sequentially, the order of the steps is not limited to that listed above and may be varied or rearranged as desired by the desired design. And the above examples can be mixed and matched with each other or with other examples based on design and reliability considerations, i.e. technical features in different implementations can be freely combined to form more implementation examples. The algorithms and displays presented herein are not inherently related to any particular computer, virtual machine, or other apparatus. Various general purpose systems may also be used with the teachings herein. The required structure for constructing such a system will be apparent from the description above. Moreover, this disclosure is not intended to be limited to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the disclosure as described herein, and any descriptions of specific languages are provided above to disclose the best mode disclosed herein.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments, various features are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding the understanding of one or more of the disclosed aspects. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed subject matter requires more features than are expressly recited in each claim. Rather, as the following claims reflect, disclosed aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this disclosure.
The above-mentioned embodiments are only specific embodiments of the present application, used to illustrate rather than to limit its technical solutions, and the scope of the present application is not limited thereto. Although the present application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that they can still modify or change the embodiments described above, or make equivalent substitutions for some of the techniques; such modifications, changes or substitutions do not depart from the spirit and scope of the embodiments of the present application and are intended to be covered by its scope. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (7)

1. A small sample image classification method based on large-boundary Bayesian prototype learning is characterized by comprising the following steps:
s1, preprocessing data, wherein the data comprise a training set and a testing set;
S2. Constructing a Bayesian prototype distribution model based on a convolutional neural network and a Gaussian distribution model, the model consisting of an embedding module fθ and a distribution mapping module; the embedding module is used to extract sample features and comprises four convolution blocks, each convolution block comprising a convolutional layer, a pooling layer and a nonlinear activation function; the distribution mapping module consists of a fully connected layer and a conversion layer and is used to convert deep features of a sample into a multidimensional Gaussian distribution;
S3. Solving the model parameters by optimizing the objective function of the Bayesian prototype distribution model;
and S4, classifying the images of the test set by using the optimized Bayesian prototype distribution model, and evaluating the performance of the model.
2. The method for classifying small sample images based on large-boundary Bayesian prototype learning according to claim 1, wherein the preprocessing method in step S1 is as follows:
S11. The data set is divided into a training set Dtrain and a test set Dtest (the set notation appears only as images in the original); the class spaces of the two parts are mutually exclusive, Dtrain serves as base-class data for training the model, and Dtest serves as new data for evaluating the performance of the model;
S12. For a C-way K-shot classification task, C classes are randomly selected from Dtrain and M samples are randomly selected from each class, K of which serve as support samples Si and the remaining M-K as query samples Qi; Si and Qi form a task Ti. Likewise, tasks are formed for Dtest (the task notation appears only as an image in the original).
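The episode construction of step S12 can be sketched as follows; the function name and the toy data are illustrative assumptions, and only the roles of C, K, M, Si and Qi follow the claim:

```python
import random

def sample_episode(data_by_class, C=5, K=1, M=16, rng=None):
    """Build one C-way K-shot task Ti: pick C classes, draw M samples per
    class, keep K as support Si and the remaining M-K as query Qi (mirrors
    step S12; the sampler itself is an illustrative sketch)."""
    rng = rng or random.Random(0)
    classes = rng.sample(sorted(data_by_class), C)   # C distinct classes
    support, query = {}, {}
    for c in classes:
        picked = rng.sample(data_by_class[c], M)     # M samples, no repeats
        support[c], query[c] = picked[:K], picked[K:]
    return support, query

# toy "dataset": 6 classes with 20 sample identifiers each
toy = {f"class{i}": [f"class{i}_img{j}" for j in range(20)] for i in range(6)}
Si, Qi = sample_episode(toy, C=5, K=1, M=16)
```

Repeatedly drawing such episodes during training exposes the model to many small classification problems, which is the standard episodic protocol for few-shot learning.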
3. The method for classifying small sample images based on large-boundary Bayesian prototype learning according to claim 2, characterized in that in step S11 a feature extractor F is built following a four-layer convolutional architecture, each convolution block comprising a 3 × 3 convolution with 64 filters, batch normalization, a ReLU nonlinearity and a 2 × 2 max-pooling layer, the max-pooling layers of the last two blocks being removed, and the output size of the feature extractor being 64 × 5 × 5 = 1600; a generator is constructed with a 128-dimensional fully connected layer, and 64-dimensional distributions of the support set and the target set are finally obtained through aggregation.
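The stated output size 64 × 5 × 5 = 1600 can be checked with a small shape-propagation sketch. The 84 × 84 input resolution and a 2 × 2 pooling step in every block are assumptions not stated in the claim (the claim's remark about removing the last two pooling layers would give a different spatial size, so this sketch shows one configuration that reproduces the stated 1600):

```python
def conv4_output_size(hw=84, channels=64, blocks=4):
    """Trace the spatial size through a Conv-4 embedding: each block is a
    3x3 same-padding convolution (size preserved) followed by 2x2 max
    pooling (size halved, floor division). The 84x84 input and pooling in
    every block are assumptions; the claim only fixes 64*5*5 = 1600."""
    for _ in range(blocks):
        hw = hw // 2          # 3x3 'same' conv keeps hw; 2x2 pool halves it
    return channels * hw * hw, hw

flat, spatial = conv4_output_size()
# 84 -> 42 -> 21 -> 10 -> 5, so the flattened feature has 64 * 5 * 5 = 1600 entries
```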
4. The method for classifying small sample images based on large-boundary Bayesian prototype learning according to claim 2, characterized in that in step S12 the fully connected layer maps the deep features of a sample into a vector, one part of which serves as the sample mean μ and the other part as the raw variance vector of the sample (shown only as an image in the original); the conversion layer then converts the raw variance vector into the sample variance vector σ2.
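The mean/variance split and the conversion layer of claim 4 can be sketched as follows. Using softplus for the conversion is an assumption: the claim only requires that the raw vector be mapped to a (positive) variance vector:

```python
import numpy as np

def split_and_convert(fc_output):
    """Split the fully connected layer's output into a mean vector mu and a
    raw variance vector, then apply a softplus conversion to obtain a
    strictly positive variance sigma^2 (softplus is an illustrative choice;
    the patent only names a 'conversion layer')."""
    fc_output = np.asarray(fc_output, dtype=float)
    half = fc_output.size // 2
    mu, raw_var = fc_output[:half], fc_output[half:]
    sigma2 = np.log1p(np.exp(raw_var))   # softplus keeps every variance > 0
    return mu, sigma2

mu, sigma2 = split_and_convert([0.3, -1.2, 0.0, -5.0])
# mu = [0.3, -1.2]; sigma2 = softplus([0.0, -5.0]), both entries positive
```

A positivity-enforcing conversion is needed because the raw fully connected output is unconstrained, while a Gaussian variance must be positive.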
5. The method for classifying small sample images based on bayesian prototype learning with large boundaries as claimed in claim 1, wherein step S3 specifically comprises:
S31. For a task Ti in Dtrain, all support samples and query samples are first input into the embedding module fθ;
S32. Using the convolutional neural network of the embedding module, the support samples pass sequentially through a convolutional layer, a pooling layer and an activation layer, finally yielding the extracted image features fθ(xi);
S33. The global features of the samples are obtained after the support samples and query samples pass through the fully connected layer;
S34. After a sample passes through the fully connected layer, part of the output vector serves as the sample mean μ and the other part as the raw variance vector of the sample; the conversion layer converts the raw variance vector into the sample variance vector σ2, finally giving the distribution of each support sample (the distribution notation appears only as an image in the original);
S35. The class prototype of the support samples is obtained using formulas (1) and (2) (both rendered as images in the original);
In formula (1), xi denotes a sample, Sc denotes the c-th class among the support samples, μ(xi) denotes the mean of sample xi, and μc denotes the mean of class c, i.e. the distribution of class c; formula (1) is the weighted harmonic mean of class c and represents the positions of the different class prototypes in the model; formula (2) is the variance of class c and represents the degree of fluctuation of the samples; together, formulas (1) and (2) characterize the class prototype of the samples;
S36. The prediction probability Pr(μ(xi)|q(z|Sc)) of all support samples is calculated using formula (3):
Pr(μ(xi)|q(z|Sc)) = N(μ(xi); μc, σc2)  (3)
In formula (3), q(z|Sc) is the distribution of the class-c support-set samples and Pr(·) denotes the prediction probability; the prediction probability of all support samples of class c is given by the Gaussian distribution in formula (3), whose mean μc is the mean of the vectors over all dimensions and also represents the position of the samples, collecting the class information extracted from them, and whose variance σc2 represents the significance of the different dimensions;
S37. The classification loss LR(S) of the whole task T is obtained from the prediction probability:
LR(S) = -Σc Σxi∈Sc log Pr(μ(xi)|q(z|Sc))  (4)
Formula (4) gives the classification loss of the whole task T, where S denotes the support-set samples, c indexes the classes, Pr denotes the prediction probability and Sc denotes the class-c support samples; the left-hand side is the support-sample loss function and the right-hand side a cross-entropy loss;
S38. The inter-class loss is calculated using formula (5), i.e. the degree of similarity between the predicted samples and the true samples, given by the KL divergence between the distributions of the different classes:
(formula (5) is rendered as an image in the original)
S39. The total loss L is calculated using formula (6):
L=LR+Linter (6)
S310. Adopting the MAML optimization scheme, multi-step stochastic gradient descent is performed with the support-sample loss function in formula (6) to obtain new network parameters;
S311. The classification prediction loss of the query samples under these parameters is calculated using formula (4), and the model is updated with the Adam optimizer.
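The KL divergence underlying formula (5) and the total loss of formula (6) can be sketched as follows. Since formula (5) appears only as an image in this text, the pairing of the KL terms in Linter (summing over ordered pairs of distinct classes) is an illustrative assumption; only the closed-form diagonal-Gaussian KL and the combination L = LR + Linter are standard:

```python
import numpy as np

def kl_diag_gauss(mu1, var1, mu2, var2):
    """Closed-form KL divergence KL(N1 || N2) between diagonal Gaussians."""
    mu1, var1, mu2, var2 = map(np.asarray, (mu1, var1, mu2, var2))
    return 0.5 * np.sum(
        np.log(var2 / var1) + (var1 + (mu1 - mu2) ** 2) / var2 - 1.0
    )

def total_loss(l_r, class_dists):
    """Total loss L = LR + Linter (formula (6)); here Linter sums the KL
    divergence over ordered pairs of distinct class distributions, an
    assumed stand-in for the patent's formula (5)."""
    l_inter = sum(
        kl_diag_gauss(*class_dists[a], *class_dists[b])
        for a in class_dists for b in class_dists if a != b
    )
    return l_r + l_inter

# two identical hypothetical class distributions -> Linter = 0, so L = LR
dists = {"c1": ([0.0], [1.0]), "c2": ([0.0], [1.0])}
L = total_loss(2.5, dists)
```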
6. The method for classifying small sample images based on large-boundary Bayesian prototype learning according to claim 1, wherein the step S4 comprises the following steps:
S41. Each task of Dtest consists of a support set and a query set (the task and set notation appears only as images in the original); all support samples are passed through the Bayesian prototype distribution model to obtain the distribution of each sample x;
The class prototype of the support samples is then calculated using formulas (1) and (2) (both rendered as images in the original);
In formula (1), xi denotes a sample, Sc denotes the c-th class among the support samples, μ(xi) denotes the mean of sample xi, and μc denotes the mean of class c, i.e. the distribution of class c; formula (1) is the weighted harmonic mean of class c and represents the positions of the different class prototypes in the model; formula (2) is the variance of class c and represents the degree of fluctuation of the samples; together, formulas (1) and (2) characterize the class prototype of the samples;
S42. The query samples are input into the trained embedding module fθ; the matrix features output by the embedding module are then passed through the Bayesian prototype distribution model to obtain the distribution of each class under the query samples, the confidence that a target point belongs to each class is computed directly, and classification is performed with these confidences to obtain the prediction result.
7. A small sample image classification device based on large-boundary Bayesian prototype learning, for implementing the method of any one of claims 1 to 6, characterized by comprising the following modules:
a data preprocessing module: used for preprocessing the data, the data comprising a training set and a test set;
a network model construction module: used for introducing a convolutional neural network model and a multidimensional Gaussian distribution to construct a small sample image classification model based on large-boundary Bayesian prototype learning, the model consisting of an embedding module fθ and a distribution mapping module; the embedding module is used to extract sample features and comprises four convolution blocks, each convolution block comprising a convolutional layer, a pooling layer and a nonlinear activation function; the distribution mapping module consists of a fully connected layer and a conversion layer and is used to convert deep features of a sample into a multidimensional Gaussian distribution;
a training model parameter module: used for solving the model parameters by optimizing the objective function of the Bayesian prototype distribution model;
a test model performance module: used for classifying the images of the test set with the optimized Bayesian prototype distribution model and evaluating the performance of the model.
CN202210482490.8A 2022-05-05 2022-05-05 Small sample image classification method and device based on large-boundary Bayesian prototype learning Active CN114723998B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210482490.8A CN114723998B (en) 2022-05-05 2022-05-05 Small sample image classification method and device based on large-boundary Bayesian prototype learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210482490.8A CN114723998B (en) 2022-05-05 2022-05-05 Small sample image classification method and device based on large-boundary Bayesian prototype learning

Publications (2)

Publication Number Publication Date
CN114723998A true CN114723998A (en) 2022-07-08
CN114723998B CN114723998B (en) 2023-06-20

Family

ID=82231221

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210482490.8A Active CN114723998B (en) 2022-05-05 2022-05-05 Small sample image classification method and device based on large-boundary Bayesian prototype learning

Country Status (1)

Country Link
CN (1) CN114723998B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116500575A (en) * 2023-05-11 2023-07-28 Lanzhou University of Technology Extended target tracking method and device based on variational Bayesian theory

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110414600A (en) * 2019-07-27 2019-11-05 西安电子科技大学 A kind of extraterrestrial target small sample recognition methods based on transfer learning
CN111476292A (en) * 2020-04-03 2020-07-31 北京全景德康医学影像诊断中心有限公司 Small sample element learning training method for medical image classification processing artificial intelligence
WO2021051987A1 (en) * 2019-09-18 2021-03-25 华为技术有限公司 Method and apparatus for training neural network model
CN112633382A (en) * 2020-12-25 2021-04-09 浙江大学 Mutual-neighbor-based few-sample image classification method and system
US20210398004A1 (en) * 2020-06-19 2021-12-23 Electronics And Telecommunications Research Institute Method and apparatus for online bayesian few-shot learning
CN114124437A (en) * 2021-09-28 2022-03-01 西安电子科技大学 Encrypted flow identification method based on prototype convolutional network

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110414600A (en) * 2019-07-27 2019-11-05 西安电子科技大学 A kind of extraterrestrial target small sample recognition methods based on transfer learning
WO2021051987A1 (en) * 2019-09-18 2021-03-25 华为技术有限公司 Method and apparatus for training neural network model
CN111476292A (en) * 2020-04-03 2020-07-31 北京全景德康医学影像诊断中心有限公司 Small sample element learning training method for medical image classification processing artificial intelligence
US20210398004A1 (en) * 2020-06-19 2021-12-23 Electronics And Telecommunications Research Institute Method and apparatus for online bayesian few-shot learning
CN112633382A (en) * 2020-12-25 2021-04-09 浙江大学 Mutual-neighbor-based few-sample image classification method and system
CN114124437A (en) * 2021-09-28 2022-03-01 西安电子科技大学 Encrypted flow identification method based on prototype convolutional network

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
JIAN ZHANG et al.: "Variational Few-Shot Learning", pages 1685 - 1694 *
JINGYI XU et al.: "Variational Feature Disentangling for Fine-Grained Few-Shot Classification", Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 8812 - 8821 *
ZHUO SUN et al.: "Amortized Bayesian Prototype Meta-learning: A New Probabilistic Meta-learning Approach to Few-shot Image Classification", vol. 130, pages 1414 - 1422 *
CAO Jie et al.: "Small-sample image classification method based on sliding feature vectors", vol. 51, no. 5, pages 1785 - 1791 *
LI Yuanmu; WANG Zhanqing: "An improved small-batch handwritten character recognition algorithm", Journal of Chinese Computer Systems, vol. 41, no. 07, pages 1541 - 1546 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116500575A (en) * 2023-05-11 2023-07-28 Lanzhou University of Technology Extended target tracking method and device based on variational Bayesian theory
CN116500575B (en) * 2023-05-11 2023-12-22 Lanzhou University of Technology Extended target tracking method and device based on variational Bayesian theory

Also Published As

Publication number Publication date
CN114723998B (en) 2023-06-20

Similar Documents

Publication Publication Date Title
CN110929164A (en) Interest point recommendation method based on user dynamic preference and attention mechanism
Mu et al. Urban land use and land cover change prediction via self-adaptive cellular based deep learning with multisourced data
CN114332578A (en) Image anomaly detection model training method, image anomaly detection method and device
Lu CNN Convolutional layer optimisation based on quantum evolutionary algorithm
CN113761259A (en) Image processing method and device and computer equipment
CN109978870A (en) Method and apparatus for output information
CN113378706B (en) Drawing system for assisting children in observing plants and learning biological diversity
CN115995018A (en) Long tail distribution visual classification method based on sample perception distillation
CN115115830A (en) Improved Transformer-based livestock image instance segmentation method
CN114943859B (en) Task related metric learning method and device for small sample image classification
CN112560948A (en) Eye fundus map classification method and imaging method under data deviation
CN111126155B (en) Pedestrian re-identification method for generating countermeasure network based on semantic constraint
CN115222896A (en) Three-dimensional reconstruction method and device, electronic equipment and computer-readable storage medium
CN114723998A (en) Small sample image classification method and device based on large-boundary Bayes prototype learning
CN111242028A (en) Remote sensing image ground object segmentation method based on U-Net
Zhang et al. Zero-small sample classification method with model structure self-optimization and its application in capability evaluation
CN112270334B (en) Few-sample image classification method and system based on abnormal point exposure
CN113762331A (en) Relational self-distillation method, apparatus and system, and storage medium
CN116665039A (en) Small sample target identification method based on two-stage causal intervention
CN114818945A (en) Small sample image classification method and device integrating category adaptive metric learning
CN111241165B (en) Artificial intelligence education system based on big data and data processing method
CN108960406B (en) MEMS gyroscope random error prediction method based on BFO wavelet neural network
Yadav et al. Plant Pathologist-A Machine Learning Diagnostician for the Plant Disease
CN109726690A (en) Learner behavior image multizone based on DenseCap network describes method
Şimşek et al. A Study on Deep Learning Methods in the Concept of Digital Industry 4.0

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant