CN113255701A

CN113255701A - Small sample learning method and system based on absolute-relative learning framework

Info

Publication number: CN113255701A
Application number: CN202110700741.0A
Authority: CN
Inventors: 张洪广; 马琳茹; 杨雄军; 李东阳; 保金祯
Original assignee: Institute of Network Engineering Institute of Systems Engineering Academy of Military Sciences
Current assignee: Institute of Network Engineering Institute of Systems Engineering Academy of Military Sciences
Priority date: 2021-06-24
Filing date: 2021-06-24
Publication date: 2021-08-13
Anticipated expiration: 2041-06-24
Also published as: CN113255701B

Abstract

The invention provides a small sample learning method and a system based on an absolute-relative learning framework, which comprises the following steps: calling a representation extraction module to perform representation extraction on each image sample in the training set so as to obtain a feature vector of each image sample; calling an absolute learning module, and training the feature vector of each image sample to determine a first prediction result of each image sample based on the category and a second prediction result based on the semantics; combining the feature vectors of every two image samples into a group of sample feature pairs, and splicing the two feature vectors in each group of sample feature pairs into a group of composite vectors; calling a relative learning module to train the sample feature pairs so as to determine a first similarity based on the category and a second similarity based on the semantic meaning of two feature vectors in each group of sample feature pairs; and calculating a loss function of the model according to the first prediction result, the second prediction result, the first similarity and the second similarity so as to complete the training of the model.

Description

Small sample learning method and system based on absolute-relative learning framework

Technical Field

The invention belongs to the technical field of small sample learning, and particularly relates to a small sample learning method and system based on an absolute-relative learning framework.

Background

In recent years, although deep learning has achieved significant effects in the computer vision field such as target recognition, scene understanding, semantic segmentation, and the like, there are some key bottleneck problems, for example, the existing deep learning model depends heavily on centralized large-scale annotation training data, and when the annotation training data is deficient, the generalization of the deep learning model is significantly reduced, resulting in limited application scenarios. Different from the deep learning model at the present stage, human beings can quickly master a new target concept only by a small amount of data under the general condition, and with the inspiration, researchers provide a 'small sample learning' concept, aiming at exploring a method for solving a new problem by using knowledge and experience summarized from the existing samples under the condition that the quantity of new target class marking samples is very small, and gradually becoming a new research hotspot in the field of machine learning. At present, small sample learning problems are mainly based on algorithm modeling and performance evaluation of a target identification scene, and in short, the small sample learning problems can be summarized into a W-way K-shot balanced sampling target identification task, namely, W different classes are randomly sampled from a training set at first, K marked samples are randomly sampled from each class respectively to construct a support set S, and the classification is carried out by measuring the relation between the unmarked samples in a query set Q and the support set. In the training stage, a large number of meta-training tasks are constructed in the above mode to pre-train the model parameters, then the obtained parameters are transferred to a target test type, and the meta-testing tasks are constructed to evaluate the model performance.

One of the widely used small sample learning methods is a learning technology based on relationship measurement, and the core idea is to obtain the relationship between different samples by measuring the similarity between a support set and a query set sample pair representation, and use this as a classification basis to perform sample prediction. The method generally comprises two modules, a representation extraction module and a relation measurement module, wherein the representation extraction module is mainly responsible for embedding the image sample into a convolution vector space, and the relation measurement module is used for calculating a similarity score between a support-query sample pair. The matching network and the relation network model are typical representatives of the small sample learning method, wherein the former calculates the relation between the support sample and the query sample by introducing an attention function, and the latter adaptively learns and measures the similarity between different sample pairs by using a deep convolution network so as to judge the class of the test sample.

Although the small sample learning method based on metric learning achieves certain effect in the task scene of small sample target identification, the following problems also exist:

(1) although the measurement indexes based on the specific distance function are simple and effective, the measurement indexes are not suitable for describing the sample relationship in a complex scene, and the overfitting problem is easy to occur; (2) the sample relation model based on the binarization relation label is too ideal, lacks of smoothness of decision making, and cannot objectively describe the correlation between different objects in the real world, so that great deviation exists in the decision making of the model in fine granularity and coarse granularity identification; (3) the relationship network only compares the similarity between the sample pairs, and ignores the specific meaning of the category to which the sample actually belongs, so that the representation extraction module lacks important target concept knowledge in the training process, thereby limiting the performance improvement of small sample learning.

Disclosure of Invention

The invention provides a small sample learning scheme based on an absolute-relative learning framework, which aims to solve the technical problem in the prior art, namely how to design a novel relation network so that a small sample learning model can fully utilize class labels and semantic standard information to improve the image representation effect, and further improve the classification performance of training samples under the condition of few training samples. According to the scheme, the similarity learning and the concept learning are combined, so that the small sample learning model based on the metric learning can better capture the complex class relation between the image pairs, and a better classification effect is obtained.

The invention discloses a small sample learning method based on an absolute-relative learning framework in a first aspect. The absolute-relative learning framework comprises a representation extraction module, an absolute learning module and a relative learning module. The small sample learning method is used for training a model, the model is used for image recognition, and the small sample learning method specifically comprises the following steps: step S1, calling the representation extraction module to perform representation extraction on each image sample in the training set so as to obtain a feature vector of each image sample; step S2, calling the absolute learning module, training the feature vector of each image sample to determine a first prediction result of each image sample based on the category and a second prediction result based on the semantics; step S3, combining the feature vectors of every two image samples into a group of sample feature pairs, and splicing the two feature vectors in each group of sample feature pairs into a group of composite vectors; step S4, calling the relative learning module, training the sample feature pairs to determine a first similarity based on category and a second similarity based on semantic of two feature vectors in each group of sample feature pairs; step S5, calculating a loss function of the model according to the first prediction result, the second prediction result, the first similarity, and the second similarity, so as to complete training of the model.

According to the method of the first aspect of the present invention, in step S2, the determining the first prediction result specifically includes: and calling a first absolute learning submodule in the absolute learning module, and calculating a first prediction result of each image sample based on the category by learning the category knowledge of each image sample and utilizing a cross entropy loss function, wherein the first absolute learning submodule is a category predictor.

According to the method of the first aspect of the present invention, in step S2, the determining the second prediction result specifically includes: and calling a second absolute learning submodule in the absolute learning module, and calculating a second prediction result of each image sample based on the semantics by learning the semantic knowledge of each image sample and utilizing a mean square error loss function, wherein the second absolute learning submodule is a semantic predictor.

According to the method of the first aspect of the present invention, in step S4, the determining the first similarity specifically includes: and calling a first relative learning submodule in the relative learning module, taking the spliced synthetic vector as input, and calculating a first similarity of two feature vectors in each group of sample feature pairs based on the category by using a mean square error loss function, wherein the first relative learning submodule is a relative learner for describing the category relationship.

According to the method of the first aspect of the present invention, in step S4, the determining the second similarity specifically includes: and calling a second relative learning submodule in the relative learning module, taking the spliced synthetic vector as input, and calculating a semantic-based second similarity of the two feature vectors in each group of sample feature pairs by using a mean square error loss function, wherein the second relative learning submodule is a relative learner for describing a semantic relationship.

According to the method of the first aspect of the present invention, in step S5, the calculating the loss function of the model specifically includes: the first prediction result, the second prediction result, the first similarity and the second similarity respectively have corresponding weight factors, and the loss function of the model is calculated based on the weight factors.

According to the method of the first aspect of the present invention, in step S5, completing the training of the model specifically includes: repeating the steps S1 to S4 for multiple times to obtain a plurality of loss functions of the model, determining a minimum value from the plurality of loss functions, and using the model in the state of the minimum value for the image recognition to finish the training of the model.

The invention discloses a small sample learning system based on an absolute-relative learning framework in a second aspect. The absolute-relative learning framework comprises a representation extraction module, an absolute learning module and a relative learning module. The small sample learning system is used for training a model, the model is used for image recognition, and the small sample learning system specifically comprises: the first processing unit is configured to invoke the representation extraction module to perform representation extraction on each image sample in a training set so as to obtain a feature vector of each image sample; a second processing unit, configured to invoke the absolute learning module, train the feature vector of each image sample, and determine a first prediction result based on category and a second prediction result based on semantic meaning of each image sample; a third processing unit configured to combine the feature vectors of every two image samples into a set of sample feature pairs, and splice the two feature vectors in each set of sample feature pairs into a set of composite vectors; a fourth processing unit, configured to invoke the relative learning module, train the sample feature pairs, and determine a first similarity based on category and a second similarity based on semantic of two feature vectors in each set of sample feature pairs; a fifth processing unit, configured to calculate a loss function of the model according to the first prediction result, the second prediction result, the first similarity and the second similarity, so as to complete training of the model.

According to the system of the second aspect of the invention, the second processing unit is specifically configured to: and calling a first absolute learning submodule in the absolute learning module, and calculating a first prediction result of each image sample based on the category by learning the category knowledge of each image sample and utilizing a cross entropy loss function, wherein the first absolute learning submodule is a category predictor.

According to the system of the second aspect of the invention, the second processing unit is specifically configured to: and calling a second absolute learning submodule in the absolute learning module, and calculating a second prediction result of each image sample based on the semantics by learning the semantic knowledge of each image sample and utilizing a mean square error loss function, wherein the second absolute learning submodule is a semantic predictor.

According to the system of the second aspect of the present invention, the fourth processing unit is specifically configured to: and calling a first relative learning submodule in the relative learning module, taking the spliced synthetic vector as input, and calculating a first similarity of two feature vectors in each group of sample feature pairs based on the category by using a mean square error loss function, wherein the first relative learning submodule is a relative learner for describing the category relationship.

According to the system of the second aspect of the present invention, the fourth processing unit is specifically configured to: and calling a second relative learning submodule in the relative learning module, taking the spliced synthetic vector as input, and calculating a semantic-based second similarity of the two feature vectors in each group of sample feature pairs by using a mean square error loss function, wherein the second relative learning submodule is a relative learner for describing a semantic relationship.

According to the system of the second aspect of the invention, the fifth processing unit is specifically configured to: the first prediction result, the second prediction result, the first similarity and the second similarity respectively have corresponding weight factors, and the loss function of the model is calculated based on the weight factors.

According to the system of the second aspect of the present invention, the first processing unit, the second processing unit, the third processing unit, the fourth processing unit, and the fifth processing unit are utilized multiple times to obtain a plurality of loss functions of the model, a minimum value is determined from the plurality of loss functions, and the model in the state of the minimum value is used for the image recognition to complete the training of the model.

A third aspect of the invention discloses an electronic device. The electronic device comprises a memory and a processor, wherein the memory stores a computer program, and the processor implements the steps of the small sample learning method based on the absolute-relative learning framework according to any one of the first aspect of the disclosure when executing the computer program.

A fourth aspect of the invention discloses a computer-readable storage medium. The computer readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of the small sample learning method based on an absolute-relative learning architecture according to any one of the first aspects of the present disclosure.

In conclusion, the technical scheme of the invention combines similarity learning and concept learning, so that the small sample learning model based on metric learning can better capture the complex class relation between image pairs, thereby obtaining better classification effect.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the embodiments or the description in the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.

FIG. 1 is a diagram of an absolute-relative learning architecture according to an embodiment of the present invention;

FIG. 2 is a flowchart of a small sample learning method based on an absolute-relative learning architecture according to an embodiment of the present invention;

FIG. 3 is a block diagram of a small sample learning system based on an absolute-relative learning architecture according to an embodiment of the present invention;

fig. 4 is a block diagram of an electronic device according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The invention discloses a small sample learning method based on an absolute-relative learning framework in a first aspect. Fig. 1 is a schematic diagram of an absolute-relative learning architecture according to an embodiment of the present invention, and as shown in fig. 1, the absolute-relative learning architecture 100 includes a representation extraction module 101, an absolute learning module 102, and a relative learning module 103, where the absolute learning module 102 includes a first absolute learning submodule 1021 and a second absolute learning submodule 1022, and the relative learning module 103 includes a first relative learning submodule 1031 and a second relative learning submodule 1032.

In some embodiments, the token extraction module 101 is primarily responsible for convolutional token extraction on randomly sampled samples, each branch in the absolute learning module 102 encourages the token extraction module 101 and the absolute learner to learn the absolute conceptual knowledge of the class itself (e.g., sample class, color, texture, word sense description, etc.) by using class-level or sample-level annotations of the samples as a monitor, while branches in the relative learning module 103 generate relationship labels based on different types of class annotations, thereby describing the relationships between samples at multiple granularities (e.g., whether the classes are the same, whether the color and texture are consistent, whether the word sense description is consistent, etc.), and using them as a monitor signal to train the relationships between pairs of relative learner metric samples and calculate classification probabilities. The original relationship learning network can be considered as a branch in the relative learning module 103. Compared with the traditional relational network which only considers the similarity difference between the sample pairs, the method introduces the category concept and the semantic annotation information when constructing the sample classification model so as to describe the characteristics of the support set samples from multiple angles and further enrich the available distinguishing representation during prediction. The absolute learning module 102 and the relative learning module 103 simultaneously perform end-to-end training, wherein outputs of all branches except the category relationship learning branch act on the relationship learning branch in a feedback connection manner, so that the recognition accuracy is further improved.

Fig. 2 is a flowchart of a small sample learning method based on an absolute-relative learning architecture according to an embodiment of the present invention, where the small sample learning method is used to train a model, and the model is used for image recognition, and as shown in fig. 2, the small sample learning method specifically includes: step S1, calling the representation extraction module 101 to perform representation extraction on each image sample in the training set to obtain a feature vector of each image sample; step S2, invoking the absolute learning module 102, and training the feature vector of each image sample to determine a first prediction result based on category and a second prediction result based on semantic meaning of each image sample; step S3, combining the feature vectors of every two image samples into a group of sample feature pairs, and splicing the two feature vectors in each group of sample feature pairs into a group of composite vectors; step S4, invoking the relative learning module 103, and training the sample feature pairs to determine a first similarity based on category and a second similarity based on semantic of two feature vectors in each group of sample feature pairs; step S5, calculating a loss function of the model according to the first prediction result, the second prediction result, the first similarity, and the second similarity, so as to complete training of the model.

In step S1, the representation extraction module 101 is invoked to perform representation extraction on each image sample in the training set to obtain a feature vector of each image sample.

In some embodiments, samples in a training set are combinedX _i, X _jInputting the sample characteristic vector into a characterization extraction module 101f to obtain a corresponding sample characteristic vector

And

wherein the representation extraction module 101 is a general deep convolutional neural network.

In step S2, the absolute learning module 102 is invoked to train the feature vector of each image sample to determine a first category-based prediction result and a second semantic-based prediction result of each image sample.

In some embodiments, determining the first prediction result specifically includes: calling a first absolute learning submodule 1021 in the absolute learning module 102, and calculating a first prediction result of each image sample based on the category by learning the category knowledge of each image sample and using a cross entropy loss function, wherein the first absolute learning submodule 1021 is a category predictor.

In some embodiments, determining the second prediction result specifically includes: and calling a second absolute learning submodule 1022 in the absolute learning module 102, and calculating a second semantic-based prediction result of each image sample by learning semantic knowledge of each image sample and using a mean square error loss function, wherein the second absolute learning submodule 1022 is a semantic predictor.

In some embodiments, the feature vector is input to a first absolute learning submodule 1021 h_cAnd a second absolute learning submodule 1022 h_aTraining to capture specific class concepts and semantic knowledge of the image sample, where h_cThe method is a category predictor and mainly responsible for learning the category concept knowledge of the samples, and the adopted loss function is a cross entropy loss function

In the formula

Representing class predictor pair samplesX _iN represents the number of training samples; h is_aThe semantic predictor is mainly responsible for learning semantic knowledge of a prediction sample, and the adopted loss function is a mean square error loss function

In the formula

Representing semantic annotation data predicted by the semantic predictor,a _ithe actual annotation data for the image sample.

In step S3, the feature vectors of each two image samples are combined into a set of sample feature pairs, and the two feature vectors in each set of sample feature pairs are spliced into a set of composite vectors.

In step S4, the relative learning module 103 is invoked to train the sample feature pairs to determine a first similarity based on category and a second similarity based on semantic of two feature vectors in each set of sample feature pairs.

In some embodiments, determining the first similarity specifically includes: calling a first relative learning submodule 1031 in the relative learning module 103, taking the spliced resultant vector as an input, and calculating a first similarity based on a category of two feature vectors in each group of sample feature pairs by using a mean square error loss function, wherein the first relative learning submodule 1031 is a relative learner for describing a category relationship.

In some embodiments, determining the second similarity specifically comprises: and calling a second relative learning sub-module 1032 in the relative learning module 103, taking the spliced synthetic vector as an input, and calculating a semantic-based second similarity of the two feature vectors in each group of sample feature pairs by using a mean square error loss function, wherein the second relative learning sub-module 1032 is a relative learner for describing a semantic relationship.

In some embodiments, while the absolute learning module 102 is training, sample feature pairs are constructed and input into respective branches of the relative learning module 103 for model training to learn the degree of similarity between two feature vectors contained in a sample pair. The relative learning module 103 is essentially a deep neural network, whose input is a synthesized vector after the concatenation of sample feature pairs, and in the training process, there are two relative learners r described by category relations with different granularities_c(first relative learning submodule 1031) and relative learner r of semantic relationship description_a(second relative learning submodule 1032) using a mean square error loss function

And

in the formula (I), wherein,

and

respectively represent the sample pair relations learned by two opposite learners,

and

then the sample pair relationship labels generated using category annotation and semantic annotation, respectively, are represented.

In step S5, a loss function of the model is calculated according to the first prediction result, the second prediction result, the first similarity and the second similarity, so as to complete training of the model.

In some embodiments, calculating the loss function of the model specifically comprises: the first prediction result, the second prediction result, the first similarity and the second similarity respectively have corresponding weight factors, and the loss function of the model is calculated based on the weight factors

In some embodiments, in step S5, completing the training of the model specifically includes: repeating the steps S1 to S4 for multiple times to obtain a plurality of loss functions of the model, determining a minimum value from the plurality of loss functions, and using the model in the state of the minimum value for the image recognition to finish the training of the model.

In some embodiments, a small sample learning model based on an absolute-relative learning architecture will enable end-to-end training of the network with a weighted sum of the learner loss functions described above as the final loss function. The final loss function is specifically:

in the formula

α, β, γ represent weighting factors for respective loss functions of the learners.

The invention discloses a small sample learning system based on an absolute-relative learning framework in a second aspect. The absolute-relative learning architecture (shown in fig. 1) includes a representation extraction module 101, an absolute learning module 102, and a relative learning module 103.

Fig. 3 is a structural diagram of a small sample learning system based on an absolute-relative learning architecture according to an embodiment of the present invention, as shown in fig. 3. The small sample learning system 300 is configured to train a model, where the model is used for image recognition, and the small sample learning system 300 specifically includes: a first processing unit 301, configured to invoke the representation extraction module 101 to perform representation extraction on each image sample in a training set to obtain a feature vector of each image sample; a second processing unit 302, configured to invoke the absolute learning module 102 to train a feature vector of each image sample to determine a first category-based prediction result and a second semantic-based prediction result of each image sample; a third processing unit 303, configured to combine the feature vectors of every two image samples into a set of sample feature pairs, and splice the two feature vectors in each set of sample feature pairs into a set of composite vectors; a fourth processing unit 304, configured to invoke the relative learning module 103 to train the sample feature pairs to determine a first similarity based on class and a second similarity based on semantics of two feature vectors in each set of sample feature pairs; a fifth processing unit 305, configured to calculate a loss function of the model according to the first prediction result, the second prediction result, the first similarity, and the second similarity, so as to complete training of the model.

According to the system of the second aspect of the present invention, the second processing unit 302 is specifically configured to: calling a first absolute learning submodule 1021 in the absolute learning module 102, and calculating a first prediction result of each image sample based on the category by learning the category knowledge of each image sample and using a cross entropy loss function, wherein the first absolute learning submodule 1021 is a category predictor.

According to the system of the second aspect of the present invention, the second processing unit 302 is specifically configured to: and calling a second absolute learning submodule 1022 in the absolute learning module 102, and calculating a second semantic-based prediction result of each image sample by learning semantic knowledge of each image sample and using a mean square error loss function, wherein the second absolute learning submodule 1022 is a semantic predictor.

According to the system of the second aspect of the present invention, the fourth processing unit 304 is specifically configured to: calling a first relative learning submodule 1031 in the relative learning module 103, taking the spliced resultant vector as an input, and calculating a first similarity based on a category of two feature vectors in each group of sample feature pairs by using a mean square error loss function, wherein the first relative learning submodule 1031 is a relative learner for describing a category relationship.

According to the system of the second aspect of the present invention, the fourth processing unit 304 is specifically configured to: and calling a second relative learning sub-module 1032 in the relative learning module 103, taking the spliced synthetic vector as an input, and calculating a semantic-based second similarity of the two feature vectors in each group of sample feature pairs by using a mean square error loss function, wherein the second relative learning sub-module 1032 is a relative learner for describing a semantic relationship.

According to the system of the second aspect of the present invention, the fifth processing unit 305 is specifically configured to: the first prediction result, the second prediction result, the first similarity and the second similarity respectively have corresponding weight factors, and the loss function of the model is calculated based on the weight factors.

According to the system of the second aspect of the present invention, the first processing unit 301, the second processing unit 302, the third processing unit 303, the fourth processing unit 304, and the fifth processing unit 305 are utilized multiple times to obtain a plurality of loss functions of the model, a minimum value is determined from the plurality of loss functions, and the model in the state of the minimum value is used for the image recognition, so as to complete the training of the model.

Fig. 4 is a block diagram of an electronic device according to an embodiment of the present invention, and as shown in fig. 4, the electronic device includes a processor, a memory, a communication interface, a display screen, and an input device, which are connected by a system bus. Wherein the processor of the electronic device is configured to provide computing and control capabilities. The memory of the electronic equipment comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The communication interface of the electronic device is used for carrying out wired or wireless communication with an external terminal, and the wireless communication can be realized through WIFI, an operator network, Near Field Communication (NFC) or other technologies. The display screen of the electronic equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the electronic equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the electronic equipment, an external keyboard, a touch pad or a mouse and the like.

It will be understood by those skilled in the art that the structure shown in fig. 4 is only a partial block diagram related to the technical solution of the present disclosure, and does not constitute a limitation of the electronic device to which the solution of the present application is applied, and a specific electronic device may include more or less components than those shown in the drawings, or combine some components, or have a different arrangement of components.

The invention has the technical characteristics and obvious effects that: firstly, the method overcomes the defect that the conceptual information of the sample category is difficult to effectively utilize in the traditional relational network model, and improves the classification performance of the small sample learning model in the visual field; secondly, the invention provides a small sample learning method based on an absolute-relative learning framework, which can combine similar learning and concept learning to construct richer image representations and provide important thought reference for multi-modal learning in the field of image classification.

The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.

The above-mentioned embodiments only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims

1. A small sample learning method based on an absolute-relative learning framework is characterized in that:

the absolute-relative learning framework comprises a representation extraction module, an absolute learning module and a relative learning module;

the small sample learning method is used for training a model, the model is used for image recognition, and the small sample learning method specifically comprises the following steps:

step S1, calling the representation extraction module to perform representation extraction on each image sample in the training set so as to obtain a feature vector of each image sample;

step S2, calling the absolute learning module, training the feature vector of each image sample to determine a first prediction result of each image sample based on the category and a second prediction result based on the semantics;

step S3, combining the feature vectors of every two image samples into a group of sample feature pairs, and splicing the two feature vectors in each group of sample feature pairs into a group of composite vectors;

step S4, calling the relative learning module, training the sample feature pairs to determine a first similarity based on category and a second similarity based on semantic of two feature vectors in each group of sample feature pairs;

step S5, calculating a loss function of the model according to the first prediction result, the second prediction result, the first similarity, and the second similarity, so as to complete training of the model.

2. The method for learning the small sample based on the absolute-relative learning architecture of claim 1, wherein in the step S2, the determining the first prediction result specifically includes:

and calling a first absolute learning submodule in the absolute learning module, and calculating a first prediction result of each image sample based on the category by learning the category knowledge of each image sample and utilizing a cross entropy loss function, wherein the first absolute learning submodule is a category predictor.

3. The method for learning the small sample based on the absolute-relative learning architecture of claim 1, wherein in the step S2, the determining the second prediction result specifically includes:

and calling a second absolute learning submodule in the absolute learning module, and calculating a second prediction result of each image sample based on the semantics by learning the semantic knowledge of each image sample and utilizing a mean square error loss function, wherein the second absolute learning submodule is a semantic predictor.

4. The method for learning the small sample based on the absolute-relative learning framework according to claim 1, wherein in the step S4, the determining the first similarity specifically includes:

and calling a first relative learning submodule in the relative learning module, taking the spliced synthetic vector as input, and calculating a first similarity of two feature vectors in each group of sample feature pairs based on the category by using a mean square error loss function, wherein the first relative learning submodule is a relative learner for describing the category relationship.

5. The method for learning the small sample based on the absolute-relative learning architecture of claim 1, wherein in the step S4, the determining the second similarity specifically includes:

and calling a second relative learning submodule in the relative learning module, taking the spliced synthetic vector as input, and calculating a semantic-based second similarity of the two feature vectors in each group of sample feature pairs by using a mean square error loss function, wherein the second relative learning submodule is a relative learner for describing a semantic relationship.

6. The method for learning the small sample based on the absolute-relative learning architecture of claim 1, wherein in the step S5, the calculating the loss function of the model specifically comprises:

the first prediction result, the second prediction result, the first similarity and the second similarity respectively have corresponding weight factors, and the loss function of the model is calculated based on the weight factors.

7. The method for learning the small sample based on the absolute-relative learning architecture of claim 1, wherein in the step S5, the training of the model specifically includes:

repeating the steps S1 to S4 for multiple times to obtain a plurality of loss functions of the model, determining a minimum value from the plurality of loss functions, and using the model in the state of the minimum value for the image recognition to finish the training of the model.

8. A small sample learning system based on an absolute-relative learning architecture, wherein:

the small sample learning system is used for training a model, the model is used for image recognition, and the small sample learning system specifically comprises:

the first processing unit is configured to invoke the representation extraction module to perform representation extraction on each image sample in a training set so as to obtain a feature vector of each image sample;

a second processing unit, configured to invoke the absolute learning module, train the feature vector of each image sample, and determine a first prediction result based on category and a second prediction result based on semantic meaning of each image sample;

a third processing unit configured to combine the feature vectors of every two image samples into a set of sample feature pairs, and splice the two feature vectors in each set of sample feature pairs into a set of composite vectors;

a fourth processing unit, configured to invoke the relative learning module, train the sample feature pairs, and determine a first similarity based on category and a second similarity based on semantic of two feature vectors in each set of sample feature pairs;

a fifth processing unit, configured to calculate a loss function of the model according to the first prediction result, the second prediction result, the first similarity and the second similarity, so as to complete training of the model.

9. An electronic device, comprising a memory storing a computer program and a processor implementing the steps of the small sample learning method based on an absolute-relative learning architecture of any one of claims 1 to 7 when the computer program is executed.

10. A computer-readable storage medium, characterized in that a computer program is stored thereon, which, when being executed by a processor, implements the steps of the small sample learning method based on an absolute-relative learning architecture of any one of claims 1 to 7.