CN110717554B - Image recognition method, electronic device, and storage medium - Google Patents


Info

Publication number
CN110717554B
Authority
CN
China
Prior art keywords
query
image
classification result
query image
classification
Prior art date
Legal status
Active
Application number
CN201911233317.9A
Other languages
Chinese (zh)
Other versions
CN110717554A (en)
Inventor
赖锦祥
胡永涛
戴景文
Current Assignee
Guangdong Virtual Reality Technology Co Ltd
Original Assignee
Guangdong Virtual Reality Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Guangdong Virtual Reality Technology Co Ltd filed Critical Guangdong Virtual Reality Technology Co Ltd
Priority to CN201911233317.9A
Publication of CN110717554A
Application granted
Publication of CN110717554B

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/254Fusion techniques of classification results, e.g. of results related to same input data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • G06N20/20Ensemble learning

Abstract

The application discloses an image recognition method, an electronic device, and a storage medium. The image recognition method comprises the following steps: extracting a first query set and a first support set from an image set to be identified, wherein the first support set comprises first support images with category labels belonging to a first number of categories, and the first query set comprises one or more first query images of unlabeled categories; and inputting the first query set and the first support set into a pre-trained small-sample network model to obtain a comprehensive classification result for each first query image. By the above method, images can be classified.

Description

Image recognition method, electronic device, and storage medium
Technical Field
The present application relates to the field of machine learning algorithms, and in particular, to an image recognition method, an electronic device, and a storage medium.
Background
Deep learning is now used ever more extensively. However, deep learning can only be realized by relying on a large amount of data, so the workload of collecting data in the early stage is large, and sometimes the quantity of existing data cannot meet the requirements of deep learning. A human being, on the other hand, needs only a few observed samples to recognize a new object; for example, a child can recognize an apple after seeing one or a few pictures. The discrimination ability of humans far exceeds the ability of deep-learning machines to discriminate objects. Small-sample (few-shot) learning, by contrast, can learn a good classification model even when each category has only one or a few samples, so the small-sample model comes closer to human capability.
Disclosure of Invention
The embodiment of the application provides an image identification method based on small sample learning, electronic equipment and a computer readable storage medium, which can classify images.
A first aspect of the application provides an image recognition method based on small-sample learning. The method comprises the following steps: extracting a first query set and a first support set from the image set to be identified, wherein the first support set comprises first support images with category labels belonging to a first number of categories, and the first query set comprises one or more first query images of unlabeled categories;
inputting the first query set and the first support set into a pre-trained small-sample network model to obtain a comprehensive classification result for each first query image, wherein the small-sample network model comprises an integration layer, a first meta-learner, a second meta-learner, and a third meta-learner; the first meta-learner is used to obtain a first classification result of a first query image based on the similarity between the first query image and the first support images; the second meta-learner is used to obtain a second classification result of the first query image based on a first distance between the first query image and the first support images; the third meta-learner is used to classify the first query images contained in the first query set to obtain a third classification result; and the integration layer is used to integrate the first, second, and third classification results of the same first query image to obtain the comprehensive classification result of each first query image.
A second aspect of the present application provides an electronic device comprising a processor, a memory coupled to the processor, wherein the memory stores a computer program, and the processor is configured to execute the computer program stored in the memory to perform the method provided in the first aspect.
A third aspect of the present application provides a computer-readable storage medium having program code stored therein, the program code being invoked by a processor to perform the method of the first aspect.
The image recognition method based on small-sample learning provided by the embodiments of the application extracts the first query set and the first support set from the image set to be identified and inputs them into the small-sample network model, and the first, second, and third meta-learners in the model respectively produce the first, second, and third classification results of each first query image. The three meta-learners classify by different criteria: the first meta-learner classifies based on the similarity between the first query image and the first support set, the second meta-learner classifies based on the first distance between the first query image and the first support set, and the third meta-learner is an ordinary sample classifier that classifies the first query image directly. The integration layer in the small-sample network model then integrates the three classification results into a comprehensive classification result, which can improve the accuracy with which the small-sample network classifies images.
Drawings
Fig. 1 is a schematic flowchart of an image recognition method based on small sample learning according to a first embodiment of the present application;
FIG. 2 is a diagram of a network model architecture for small sample learning provided by a second embodiment of the present application;
fig. 3 is a schematic flowchart of an image recognition method based on small sample learning according to a third embodiment of the present application;
FIG. 4 is a schematic flowchart of a network model training method based on small sample learning according to a fourth embodiment of the present application;
fig. 5 is a schematic flowchart of a method for verifying a small sample network model according to a fifth embodiment of the present application;
fig. 6 is a schematic structural diagram of an electronic device according to a sixth embodiment of the present application.
Detailed Description
The present application will be described in detail with reference to the drawings and examples.
Referring to fig. 1, fig. 1 is a schematic flowchart of an image recognition method based on small sample learning according to a first embodiment of the present application. As shown in fig. 1, the image recognition method based on small sample learning of the present embodiment includes:
s110: a first query set and a first support set are extracted from a set of images to be identified.
Wherein the first support set includes a first number of categories of category-tagged first support images and the first query set includes one or more unlabeled categories of first query images.
The specific categories of the images in the image set to be identified are not all known; that is, the set includes a few images of known categories and remaining unidentified images of unlabeled categories. The images with category labels can be extracted as the first support set, and the images of unlabeled categories can be used as the first query set.
In one embodiment, S categories are selected from the P categories in the image set to be identified, where S and P are positive integers and S ≤ P. Then, Y samples with category labels are selected from each selected category to form the first support set; that is, the first support set contains S×Y first support images. One or more of the images without category labels can be selected as first query images to form the first query set.
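A minimal sketch of the episode construction just described: select S of the P labelled categories, take Y labelled images per category as the first support set, and use the unlabelled images as the first query set. The function and container names are illustrative, not from the patent.

```python
import random

def build_episode(labelled, unlabelled, s, y, seed=0):
    """labelled: dict mapping category name -> list of labelled image ids;
    unlabelled: iterable of image ids whose category is not labelled."""
    rng = random.Random(seed)
    cats = rng.sample(sorted(labelled), s)                    # S of the P categories
    support = {c: rng.sample(labelled[c], y) for c in cats}   # Y labelled images each
    query = list(unlabelled)                                  # first query set
    return support, query

labelled = {"cat": [1, 2, 3], "dog": [4, 5, 6], "car": [7, 8, 9]}
support, query = build_episode(labelled, [10, 11], s=2, y=2)
# the first support set now holds S*Y = 4 labelled images across 2 categories
```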
S120: and inputting the first query set and the first support set into a small sample network model trained in advance to obtain a comprehensive classification result of each first query image.
The small sample network model comprises an integration layer, a first element learner, a second element learner and a third element learner, wherein the first element learner is used for obtaining a first classification result of a first query image based on the similarity between the first query image and a first support image, the second element learner is used for obtaining a second classification result of the first query image based on a first distance between the first query image and the first support image, the third element learner is used for classifying the first query image contained in the first query set to obtain a third classification result of the first query image, and the integration layer is used for integrating the first classification result, the second classification result and the third classification result of the same first query image to obtain a comprehensive classification result of each first query image.
In one embodiment, the first meta-learner may be a Relation-Network. Relation-Networks can use trainable convolutional layers to calculate the similarity between a first query image in the first query set and the image-category features in the first support set. For example, suppose the first support set is clustered to obtain a category a with cluster center A and a category b with cluster center B. A first query image is input into the small-sample network model as part of the first query set, and its features are extracted by a convolutional neural network. The Relation-Network then calculates the similarity between the features of the first query image and the cluster centers A and B in the first support set. If the similarity to cluster center A is higher, the first query image belongs to image category a; otherwise, it belongs to image category b.
It is to be understood that the process of separating a plurality of physical or abstract objects in a collection into classes composed of similar objects is referred to as clustering. In this embodiment, the images in the first query set and/or the first support set are aggregated into a plurality of classes, and each cluster center is the mean of the image features in the corresponding class.
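Under this definition, a cluster center is simply the mean of the image features in its class, which can be sketched as follows (illustrative names, NumPy for brevity):

```python
import numpy as np

def cluster_centers(features, labels):
    """Mean feature vector per class -- the 'cluster center' described above.
    features: (num_images, d) array; labels: (num_images,) class ids."""
    return {c: features[labels == c].mean(axis=0) for c in np.unique(labels)}

feats = np.array([[0., 0.], [2., 2.], [10., 10.], [12., 12.]])
labs = np.array([0, 0, 1, 1])
centers = cluster_centers(feats, labs)   # class 0 -> [1, 1], class 1 -> [11, 11]
```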
Optionally, the first meta-learner includes a relation module configured to calculate the similarity between the features of a first query image, extracted by the convolutional neural network, and the cluster centers of the first support set. The Relation-Network calculates the similarity between the features of each first query image and each cluster center, and each first query image is assigned to the category with the greatest similarity, yielding the first classification result.
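The trainable similarity of the relation module can be sketched as a small network applied to the concatenation of a query feature and a cluster center. The weights below are random stand-ins for the trained layers, so only the shape of the computation (concatenate, score, pick the most similar category), not the actual scores, reflects the patent:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4                             # toy feature dimension
W1 = rng.normal(size=(2 * d, 8))  # stand-ins for trainable relation-module weights
W2 = rng.normal(size=(8, 1))

def relation_score(query_feat, center):
    """Score a (query feature, cluster center) pair with a tiny MLP."""
    pair = np.concatenate([query_feat, center])
    hidden = np.maximum(pair @ W1, 0.0)             # ReLU
    return 1.0 / (1.0 + np.exp(-(hidden @ W2)[0]))  # similarity in (0, 1)

def classify_by_relation(query_feat, centers):
    """Assign the query to the category with the greatest similarity."""
    return max(centers, key=lambda c: relation_score(query_feat, centers[c]))

centers = {"a": np.ones(d), "b": -np.ones(d)}
pred = classify_by_relation(np.ones(d), centers)
```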
The second meta-learner may be a Proto-Network. Proto-Networks can use trainable convolutional layers to calculate the distance between a first query image in the first query set and the image categories in the first support set; the distance may be a Euclidean distance or a cosine distance. For example, suppose the first support set is clustered to obtain a category a with cluster center A and a category b with cluster center B. A first query image is input into the small-sample network model as part of the first query set, and its features are extracted by a convolutional neural network. The Proto-Network then calculates the Euclidean distance between the features of the first query image and the cluster centers A and B in the first support set. If the distance to cluster center A is smaller, the first query image belongs to image category a; otherwise, it belongs to image category b.
Optionally, the second meta-learner may obtain the second classification result of a first query image based on a first distance between the first query image and the cluster centers of the first support set. Specifically, the Proto-Network calculates the distance between the features of each first query image and each cluster center, which may be a Euclidean distance or a cosine distance, and each first query image is assigned to the category whose cluster center is nearest, yielding the second classification result.
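This nearest-center rule, with either distance choice the text allows, can be sketched as (illustrative names):

```python
import numpy as np

def proto_classify(query_feat, centers, metric="euclidean"):
    """Assign the query to the class whose cluster center is nearest,
    using Euclidean or cosine distance."""
    def dist(a, b):
        if metric == "euclidean":
            return np.linalg.norm(a - b)
        # cosine distance = 1 - cosine similarity (non-zero vectors assumed)
        return 1.0 - a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    return min(centers, key=lambda c: dist(query_feat, centers[c]))

centers = {"a": np.array([0., 0.]), "b": np.array([10., 10.])}
# [1, 1] is nearer to center a, so the second classification result is "a"
```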
The third meta-learner may be a Classifier-Network. Classifier-Networks can directly perform classification prediction on images according to the image features extracted by the convolutional network. Unlike the first two meta-learners, the input of the third meta-learner does not include the cluster centers of the first support set.
The third meta-learner may include a pooling layer for pooling the features of the first query image and a distance calculation module. Pooling, which may also be referred to as downsampling, aggregates the feature values at a position in the feature plane and its neighboring positions and uses the aggregated result as the value at that position. In this embodiment, the image feature matrix is pooled by the pooling layer, which reduces the dimensionality of the image feature vector output by the convolutional layer. The filter size of the pooling layer may be 1×1, 2×2, 5×5, and so on; when the filter size is 1×1, the feature matrix is effectively not pooled.
The distance calculation module is used to calculate a second distance between the pooled features of the first query image and a free vector. The free vector is an adjustable vector; its initial value can be randomly assigned or manually specified, and it is continuously adjusted during training. Specifically, the Classifier-Network calculates the distance between the pooled image features and the free vector, which may be a Euclidean distance or a cosine distance.
In the third meta-learner, the calculated distances are normalized to obtain a distance weight between each first query image and each category, and each first query image is assigned to the category corresponding to the smallest distance weight, yielding the third classification result.
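Putting the pooling layer, the free vectors, and the distance normalization together, the third meta-learner can be sketched as follows. One free vector per category is an assumption the text implies but does not state outright:

```python
import numpy as np

def third_learner(feature_map, free_vectors):
    """feature_map: (h, w, channels) features of one first query image;
    free_vectors: (num_classes, channels) adjustable vectors."""
    pooled = feature_map.mean(axis=(0, 1))                  # pooling layer
    dists = np.linalg.norm(free_vectors - pooled, axis=1)   # second distance
    weights = np.exp(-dists) / np.exp(-dists).sum()         # normalized distance weights
    return weights, int(np.argmax(weights))                 # class with the smallest distance

fmap = np.ones((3, 3, 2))                  # toy 3x3 feature map with 2 channels
free = np.array([[1., 1.], [5., 5.]])      # one free vector per category
weights, pred = third_learner(fmap, free)  # pooled feature [1, 1] matches class 0
```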
Optionally, the first distance and the second distance are calculated in the same way. That is, when the first distance is the euclidean distance, the second distance is also the euclidean distance; when the first distance is a cosine distance, the second distance is also a cosine distance.
The integration layer may be configured to integrate the first classification result, the second classification result, and the third classification result to obtain a comprehensive classification result of the first query image.
In one embodiment, the calculation method for obtaining the comprehensive classification result according to the first classification result, the second classification result, and the third classification result is as follows:
mapping a first classification result and a second classification result of the first query image to a fixed numerical value interval, and respectively carrying out normalization processing on a third classification result of the first query image, the mapped first classification result and the mapped second classification result; and calculating the sum of the first classification result, the second classification result and the third classification result after the normalization processing to obtain the probability of each class of the first query image in the first quantity of classes, and taking the class with the maximum probability as the comprehensive classification result of the first query image. The above calculation method can also be expressed as:
Ŷ = softmax(sigmoid(Y₁)) + softmax(sigmoid(Y₂)) + softmax(Y₃)

wherein Ŷ is the comprehensive classification result of the first query image, Y₁ is its first classification result, Y₂ is its second classification result, and Y₃ is its third classification result. The comprehensive classification result of the first query image is obtained through the above formula.
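The integration can be sketched numerically with toy two-class result vectors (illustrative values; softmax over the third result follows the normalization procedure described above):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def integrate(y1, y2, y3):
    """Integration layer: squash the similarity-based results y1, y2 with
    sigmoid, normalize all three with softmax, sum, and take the most
    probable class as the comprehensive classification result."""
    total = softmax(sigmoid(y1)) + softmax(sigmoid(y2)) + softmax(y3)
    return total, int(np.argmax(total))

y1 = np.array([3.0, 0.5])   # first classification result (may exceed 1)
y2 = np.array([2.0, 1.0])   # second classification result (may exceed 1)
y3 = np.array([0.7, 0.3])   # third result, already probability-like
total, pred = integrate(y1, y2, y3)   # all three favor class 0
```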
It can be understood that the final classification result of each meta-learner is a vector. The sigmoid (logistic) function maps the classification result vector obtained by a meta-learner into a fixed numerical interval, limiting the output values; this interval may be (0, 1). Softmax is a normalized exponential function that normalizes the classification result vector of a meta-learner, i.e., maps each element of the result vector to a probability.
Since the first classification result Y₁ and the second classification result Y₂ are both similarity measures between features, they may contain large values (values greater than 1). A sigmoid function is therefore needed to map the result vectors Y₁ and Y₂ into (0, 1), after which a softmax function normalizes the mapped vectors, so that the final outputs for the first and second classification results are classification probabilities. The third classification result Y₃ is already a classification probability result (its values are less than 1) and can therefore be used directly in the calculation of the final classification result.
In a second embodiment provided by the present application, as shown in fig. 2, the small-sample network model includes a convolutional layer, a clustering layer, a first meta-learner, a second meta-learner, a third meta-learner, and an integration layer.
The inputs of the small-sample network model are the first support set Strain (N×J) and the first query set Qtrain (N×K). Both are fed into the convolutional layer, and a convolutional neural network extracts the features of the input first support images and first query images. The clustering layer clusters the features of all first support images extracted by the convolutional layer to obtain N cluster centers, each of which corresponds to one category. The three meta-learners then produce the first, second, and third classification results, and finally the integration layer integrates these results to obtain the comprehensive classification result of the first query image.
By this method, the first query set and the first support set extracted from the image set to be identified can be input into the small-sample network model, and the first, second, and third meta-learners can respectively obtain the first, second, and third classification results of each first query image. The three meta-learners classify by different criteria: the first meta-learner classifies based on the similarity between the first query set and the first support set, the second meta-learner classifies based on the first distance between the first query set and the first support set, and the third meta-learner classifies the first query images directly. The integration layer in the small-sample network model then integrates the three classification results into a comprehensive classification result, which can improve the accuracy of classification of the first query images in the first query set.
Referring to fig. 3, fig. 3 is a flowchart illustrating an image recognition method based on small sample learning according to a third embodiment of the present application.
S310: the third, pre-trained learner is fine-tuned using the first support image of the first support set.
The number of image classes in the first support set is greater than 1. Optionally, the first support images of the first support set are images with class labels; the pre-trained third meta-learner is retrained with these labeled images, fine-tuning its parameters (the free vectors) in the small-sample network model and optimizing the model so as to improve the accuracy of the classification results for the unknown-class images in the subsequent image set to be recognized. The filter size of the pooling layer may be increased before fine-tuning the third meta-learner, for example to 3×3.
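The fine-tuning step touches only the free vectors of the third meta-learner; the rest of the network is left as trained. A gradient-descent sketch under the assumption that the second distance is the squared Euclidean distance (function names are illustrative):

```python
import numpy as np

def finetune_free_vectors(free_vectors, support_feats, support_labels,
                          lr=0.5, steps=10):
    """Adjust each class's free vector toward the pooled features of its
    labelled first-support images: gradient step on ||fv - x||^2 w.r.t. fv."""
    fv = free_vectors.copy()
    for _ in range(steps):
        for x, y in zip(support_feats, support_labels):
            fv[y] -= lr * 2.0 * (fv[y] - x)
    return fv

free = np.array([[0., 0.], [10., 10.]])
tuned = finetune_free_vectors(free, np.array([[2., 2.]]), np.array([0]))
# class 0's free vector moves toward its support feature; class 1 is untouched
```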
S320: a first query set and a first support set are extracted from a set of images to be identified.
S330: and inputting the first query set and the first support set into a small sample network model trained in advance to obtain a comprehensive classification result of each first query image.
In the present embodiment, the detailed descriptions of S320 and S330 refer to S110 and S120 in the first embodiment, which are not repeated here.
By this method, before the image recognition process is carried out, the parameters (free vectors) used by the third meta-learner in the small-sample network model are fine-tuned using the first support images (the images with class labels) of the first support set, and the fine-tuned small-sample network model is used to recognize the images in the first query set, further improving the accuracy of the classification results.
Referring to fig. 4, fig. 4 is a schematic flowchart of a network model training method based on small sample learning according to a fourth embodiment of the present application. As shown in fig. 4, the network model training method based on small sample learning of the present application includes:
s410: a second query set and a second support set are extracted from the training image set.
The training image set comprises a second number of classes of images with class labels, the classes of the images of the training image set are different from the classes of the images of the image set to be recognized, the second support set comprises a third number of classes of second support images with class labels, the second query set comprises a third number of classes of second query images with class labels, and the third number is smaller than or equal to the second number.
The images in the training image set are divided into a second number of categories, and each category has a specified number of images. The number of different categories of images may be the same or different. The second query set and the second support set each comprise a third number of categories of images, and the categories of images in the second query set and the second support set are the same. And the number of categories in the second query set and the second support set is less than or equal to the number of categories in the training image set, i.e. the third number is less than or equal to the second number.
For example, N (the third number) classes are selected from the M (the second number) classes of the training image set Xtrain, where N ≤ M. Then K samples are selected from each selected class to form the second query set Qtrain; that is, Qtrain contains N×K second query images. J samples are selected from each selected class to form the second support set Strain, which contains N×J images. J and K may or may not be equal. In other embodiments, the image categories in the second query set belong to the image categories in the second support set, and the numbers of images of different categories in the second query set and the second support set may be the same or different.
S420: and inputting the second query set and the second support set into the small sample network model.
S430: and clustering the second support images contained in the second support set to obtain a third number of clustering centers.
Features of the input second support image and the second query image are extracted through the convolutional layer of the small sample network. The clustering layer is used for clustering the characteristics of all the second support images extracted by the convolutional layer to obtain a third number of clustering centers as the clustering centers of the second support set.
S440: determining the similarity between a currently input second query image and each clustering center of a second support set through a first meta-learner of the small sample network model, obtaining a first classification result of the second query image, determining a first distance between the second query image and each clustering center through a second meta-learner of the small sample network model, obtaining a second classification result of the second query image, and classifying the second query image contained in the second query set through a third meta-learner of the small sample network model to obtain a third classification result of the second query image.
For a detailed description of S440, please refer to the first embodiment, which is not described herein.
S450: and calculating classification errors between the classification label of the currently input second query image and the first classification result, the second classification result and the third classification result to obtain a comprehensive error.
It can be understood that the smaller the composite error, the more accurate the classification of the network model representing the trained small sample.
According to classification errors between the class label of the currently input second query image and the first classification result, the second classification result and the third classification result, the process of obtaining the comprehensive error is as follows:
mapping the first classification result of the second query image to a fixed numerical value interval, and calculating the mean square error between the classification label of the second query image and the mapped first classification result to obtain a first classification error; normalizing the second classification result of the second query image, and calculating a cross entropy error between the classification label of the second query image and the normalized second classification result to obtain a second classification error; normalizing the third classification result of the second query image, and calculating a cross entropy error between the classification label of the second query image and the normalized third classification result to obtain a third classification error; and obtaining the sum of the first classification error, the second classification error and the third classification error to obtain a comprehensive error. The calculation formula of the composite error can be as follows:
L = MSE(Y, sigmoid(Y₁)) + CE(Y, softmax(Y₂)) + CE(Y, softmax(Y₃))

wherein L is the classification error, representing the error between the classification results obtained by the small-sample network model and the true classification; Y is the category label of the second query image, i.e., its true classification; MSE(Y, sigmoid(Y₁)) is the mean square error between the true classification Y and the mapped first classification result Y₁; CE(Y, softmax(Y₂)) is the cross-entropy error between the true classification Y and the normalized second classification result Y₂; and CE(Y, softmax(Y₃)) is the cross-entropy error between the true classification Y and the normalized third classification result Y₃.
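The composite error described above — mean square error for the sigmoid-mapped first result and cross-entropy for the softmax-normalized second and third results — can be sketched as follows (illustrative names; a one-hot category label is assumed):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def composite_error(y_true, y1, y2, y3, eps=1e-12):
    """y_true: one-hot category label of the second query image."""
    mse = np.mean((y_true - sigmoid(y1)) ** 2)          # first classification error
    ce2 = -np.sum(y_true * np.log(softmax(y2) + eps))   # second classification error
    ce3 = -np.sum(y_true * np.log(softmax(y3) + eps))   # third classification error
    return mse + ce2 + ce3                              # composite error

y = np.array([1., 0.])
good = composite_error(y, np.array([5., -5.]), np.array([5., -5.]), np.array([5., -5.]))
bad = composite_error(y, np.array([-5., 5.]), np.array([-5., 5.]), np.array([-5., 5.]))
# a confident correct prediction yields a smaller composite error than a wrong one
```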
Of course, the classification errors between the three classification results and the true classification may also all be calculated in the same way; for example, all three may be mean square errors between the respective classification result and the true classification.
S460: and judging whether the current small sample network model meets the preset condition or not by using the calculated comprehensive error. If yes, go to step 470; if not, go to S480.
S470: and finishing the training of the small sample network model.
S480: and adjusting parameters of the small sample network model according to the comprehensive error, acquiring a first classification result, a second classification result and a third classification result of a next input second query image according to the adjusted small sample network model, and calculating to obtain the comprehensive error.
In this step, the adjusted small sample network model reclassifies the input second query image so as to reduce the comprehensive error until the small sample network model meets the preset condition, thereby optimizing the small sample network model.
Optionally, the parameters in the small sample network model may include at least one of: the parameters of the convolutional neural network, the calculation parameters of the cluster centers, the parameters of the first meta-learner, and the free vectors of the third meta-learner. After this step is executed, the process jumps back to step S420 and the above process is repeated until training is completed or the number of iterations reaches the preset condition.
The predetermined condition for stopping training generally includes the classification error being less than a predetermined threshold. The preset threshold may be determined according to actual experience or multiple tests, and is not limited herein.
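The loop of steps S420 through S480 can be sketched as follows (a simplified sketch: the `model` object with `classify`, `comprehensive_error`, and `adjust` methods is an illustrative assumption, not an interface defined by the patent):

```python
def train_few_shot(model, episodes, max_steps, error_threshold):
    """Training loop sketch for steps S420-S480: classify the current
    second query image, compute the comprehensive error, and, while the
    preset condition is not met, adjust the model parameters and continue
    with the next input query image."""
    error = float("inf")
    for step, (query, label) in enumerate(episodes):
        if step >= max_steps:
            break
        first, second, third = model.classify(query)                 # S420-S440
        error = model.comprehensive_error(first, second, third, label)  # S450
        if error < error_threshold:                                  # S460 -> S470
            return step, error
        model.adjust(error)                                          # S480
    return max_steps, error
```

Training stops either when the comprehensive error falls below the preset threshold or when the step budget is exhausted, mirroring the "preset condition" described above.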
In addition to the comprehensive error, an intra-sample prototype variance (ISPV) and an inter-prototype variance (IPV) of the small sample network may also be calculated. The intra-sample prototype variance is used to evaluate the intra-class distance of the second query image classification, and the inter-prototype variance is used to evaluate the inter-class distance of the second query image classification.
In this embodiment, the intra-class distance is the mean square distance between the image features within the same image class and that class's cluster center, and the inter-class distance is the mean square distance between the cluster centers of different image classes. It is generally considered that the larger the inter-class distance and the smaller the intra-class distance, the better the model. The calculation formulas of the intra-sample prototype variance (ISPV) and the inter-prototype variance (IPV) can be expressed as:
ISPV_c = (1/N_c) · Σ_{i=1..N_c} d(x_i, p_c)²

IPV = mean over all category pairs (c, c'), c ≠ c', of d(p_c, p_{c'})²

where c denotes a category, p_c denotes the cluster center corresponding to category c, x_i denotes the i-th image in category c, N_c denotes the number of images in category c, d(x_i, p_c) denotes the distance between image x_i in category c and the cluster center p_c of category c, and d(p_c, p_{c'}) denotes the distance between the cluster centers of categories c and c'.
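The two variances above can be sketched as follows (shapes and names are illustrative; squared Euclidean distance is assumed for d):

```python
import numpy as np

def prototype_variances(features_by_class, centers):
    """Compute the intra-sample prototype variance (ISPV) per class and
    the inter-prototype variance (IPV) over all class pairs.

    features_by_class: list of (N_c, D) arrays, one per class.
    centers: (C, D) array of cluster centers (prototypes)."""
    # ISPV_c: mean squared distance from each image feature to its class center.
    ispv = [float(np.mean(np.sum((feats - c) ** 2, axis=1)))
            for feats, c in zip(features_by_class, centers)]
    # IPV: mean squared distance between distinct cluster centers.
    C = len(centers)
    pair_dists = [np.sum((centers[i] - centers[j]) ** 2)
                  for i in range(C) for j in range(i + 1, C)]
    ipv = float(np.mean(pair_dists))
    return ispv, ipv
```

A well-separated model shows small ISPV values (tight classes) and a large IPV (distant prototypes).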
In the process of training the small sample network model, the quality of the model can be assessed by observing changes in the intra-sample prototype variance and the inter-prototype variance. Specifically, the preset condition for stopping training may further include the intra-sample prototype variance being less than a first threshold and/or the inter-prototype variance being greater than a second threshold.
After the training is completed, the method may jump back to step S410 to extract a second query set and a second support set again and execute the loop. The number of categories and the number of images per category in the second query set and the second support set may be the same or different across extractions. It should be noted that fully completing the training of the small sample network model requires that all image classes in the training image set Xtrain have been extracted and trained on, though this need not be accomplished in a single extraction-training pass.
It should be noted that, during the training process, the filter size of the pooling layer of the third meta-learner may be 1×1, i.e., no pooling is actually performed. In this embodiment, the trained small sample network model may be used to identify images.
Referring to fig. 5, fig. 5 is a schematic flowchart of verifying a small sample network model according to a fifth embodiment of the present application. As shown in fig. 5, after completing the training of the small sample network model, the small sample network model may be further verified, including:
s510: a third query set and a third support set are extracted from the set of test images.
Optionally, the test image set comprises a fourth number of categories of category-labeled images, the categories of the images of the test image set being different from the categories of the images of the training image set, the third support set comprises a fifth number of categories of third support images, the third query set comprises a fifth number of categories of third query images, the fifth number being smaller than or equal to the fourth number.
S520: the third query set and the third support set are input into the small sample network model.
The small sample network model trained by the above embodiments can be verified using the third query set and the third support set. The third query set and the third support set may be input into the small sample network model that has been trained by using the training image set, so as to obtain a comprehensive classification result of the third query image in the third query set, and the specific acquiring process may refer to the foregoing description, and will not be repeated here. The combined classification results of the third query images in the third query set are then compared to the category labels to verify the recognition effect of the small-sample network model.
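Comparing the comprehensive classification results against the category labels amounts to computing a recognition accuracy, which can be sketched as:

```python
def recognition_accuracy(predicted, labels):
    """Fraction of third query images whose comprehensive classification
    result matches the category label."""
    assert len(predicted) == len(labels) and len(labels) > 0
    correct = sum(p == y for p, y in zip(predicted, labels))
    return correct / len(labels)
```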
Because the classification by the third meta-learner in the small sample network model trained with the training image set is performed independently on the second query set, the free vector is influenced only by the second query set. During verification, the second query set is replaced by the third query set, and the two sets differ; therefore the parameters of the third meta-learner need to be adjusted according to the actually input third query set before verification.
In the process of verifying the small sample network model, the quality of the model can likewise be assessed by observing changes in the intra-sample prototype variance and the inter-prototype variance.
S530: the filter size of the pooling layer in the third learner is increased.
Because the number of images in the test image set is generally smaller than the number of images in the training image set, increasing the filter size of the pooling layer in the third meta-learner can prevent overfitting of the model and improve its generalization capability. For example, the filter of the pooling layer in the third meta-learner may be sized to 5×5.
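Average pooling with a configurable filter size can be sketched as follows (a simplified illustration: with k = 1 the operation is the identity, matching the training stage; a larger k such as 5 aggregates neighboring features as in the verification stage):

```python
import numpy as np

def avg_pool2d(feature_map, k):
    """Average pooling with a k x k filter and stride k over a 2-D
    feature map. k = 1 performs no pooling; larger k smooths the
    features, which helps curb overfitting on small test sets."""
    h, w = feature_map.shape
    h2, w2 = h // k, w // k
    trimmed = feature_map[:h2 * k, :w2 * k]      # drop ragged border
    return trimmed.reshape(h2, k, w2, k).mean(axis=(1, 3))
```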
S540: the third learner is trained using a third set of queries and a third set of supports.
In this step, the third meta-learner is fine-tuned only according to the error between the third classification result it outputs and the class label, and the object of adjustment is the free vector. This is a fine-tuning phase: the parameters of the first and second meta-learners are not changed during the process. The purpose of this stage is to further refine the parameters of the third meta-learner and improve the adaptability of the small sample network model to different image sets, thereby improving the accuracy of the small sample network model.
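The constraint that only the free vectors change can be sketched as below (an illustrative stand-in: a prototype-style update per class is used here in place of the patent's unspecified gradient step, and the nearest-free-vector rule stands in for the third classification result):

```python
import numpy as np

def finetune_free_vectors(free_vecs, pooled_feats, labels, lr=0.1, steps=50):
    """Fine-tuning sketch: adjust only the free vectors of the third
    meta-learner (one vector per class); all other model parameters are
    never touched. Each labeled pooled feature nudges its class's free
    vector toward itself."""
    free_vecs = free_vecs.copy()          # first/second meta-learners frozen
    for _ in range(steps):
        for f, y in zip(pooled_feats, labels):
            free_vecs[y] += lr * (f - free_vecs[y])
    return free_vecs

def classify_by_free_vector(free_vecs, feat):
    # Third classification result: class of the nearest free vector.
    d = np.sum((free_vecs - feat) ** 2, axis=1)
    return int(np.argmin(d))
```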
S550: the small sample network model is verified by the third support image and the third query image.
And verifying the trained small sample network model by extracting a third query set (third query image) and a third support set (third support image) from the test image set so as to check the recognition effect of the small sample network model on the test image sets with different categories.
In this embodiment, the trained small sample network model is further verified, and the verified small sample network model can be used to identify images.
Referring to fig. 6, fig. 6 is a schematic structural diagram of an electronic device according to a sixth embodiment of the present application. As shown in fig. 6, the electronic device includes a processor 610, and a memory 620 coupled to the processor 610, wherein the memory 620 stores a computer program, and the processor 610 is configured to execute the computer program stored in the memory 620 to perform the method provided in any of the above embodiments and possible combinations.
Processor 610 may also be referred to as a CPU (Central Processing Unit). The processor 610 may be an integrated circuit chip having signal processing capabilities. The processor 610 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
In one embodiment, a computer-readable storage medium is also provided, in which program code is stored, the program code being called by a processor to execute the method provided in any one of the above embodiments and possible combinations.
It should be noted that all or part of the flow in the methods of the above embodiments may be implemented by instructing relevant hardware through a computer program. The computer program may be stored in a computer-readable storage medium, and when executed by a processor, implements the steps of the method embodiments. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, etc. The computer-readable medium may include any entity or device capable of carrying the computer program code: a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like.
The above description is only an example of the present application and is not intended to limit the scope of the present application, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the specification and the drawings, or which are directly or indirectly applied to other related technical fields, are included in the scope of the present application.

Claims (9)

1. An image recognition method based on small sample learning is characterized by comprising the following steps:
extracting a first query set and a first support set from a set of images to be identified, the first support set comprising a first number of categories of category-tagged first support images, the first query set comprising one or more unlabeled categories of first query images;
inputting the first query set and the first support set into a small sample network model trained in advance to obtain a comprehensive classification result of each first query image, wherein the small sample network model comprises an integration layer, a first meta-learner, a second meta-learner and a third meta-learner, the first meta-learner is used for obtaining a first classification result of the first query image based on the similarity between the first query image and the first support image, the second meta-learner is used for obtaining a second classification result of the first query image based on the first distance between the first query image and the first support image, the third meta-learner is used for classifying the first query image contained in the first query set to obtain a third classification result of the first query image, and the integration layer is used for obtaining a comprehensive classification result of each first query image by integrating the first classification result, the second classification result and the third classification result of the same first query image;
the training mode of the small sample network model comprises the following steps:
extracting a second query set and a second support set from the training image set, the training image set comprising a second number of categories of category-labeled images, the categories of the images of the training image set being different from the categories of the images of the image set to be recognized, the second support set comprising a third number of categories of category-labeled second support images, the second query set comprising a third number of categories of category-labeled second query images, the third number being smaller than or equal to the second number;
inputting the second query set and the second support set into a small sample network model;
clustering the second support images contained in the second support set to obtain a third number of clustering centers;
determining, by the first meta-learner of the small sample network model, a similarity between the currently input second query image and each of the cluster centers of the second support set, and obtaining the first classification result of the second query image, determining, by the second meta-learner of the small sample network model, a first distance between the second query image and each of the cluster centers, and obtaining the second classification result of the second query image, and classifying, by the third meta-learner of the small sample network model, the second query image included in the second query set to obtain the third classification result of the second query image;
calculating classification errors among the class label of the currently input second query image and the first classification result, the second classification result and the third classification result to obtain a comprehensive error, adjusting parameters of the small sample network model according to the comprehensive error, obtaining the first classification result, the second classification result and the third classification result of the next input second query image according to the adjusted small sample network model, and calculating the comprehensive error until the comprehensive error meets a preset condition.
2. The method of claim 1, further comprising, prior to obtaining the comprehensive classification result for each first query image:
fine-tuning the third meta-learner pre-trained with the first support image of the first support set.
3. The method of claim 1,
the third meta-learner comprises a pooling layer and a distance calculating module, wherein the pooling layer is used for pooling the features of the first query image, the distance calculating module is used for calculating a second distance between the pooled features of the first query image and a free vector, and the free vector is an adjustable parameter in the third meta-learner.
4. The method of claim 1, wherein computing the integrated classification result for the first query image comprises:
mapping the first and second classification results of the first query image to a fixed numerical interval;
respectively carrying out normalization processing on the third classification result of the first query image, the first classification result after mapping and the second classification result after mapping;
and calculating the sum of the first classification result, the second classification result and the third classification result after the normalization processing to obtain the probability of each class of the first query image in the first quantity of classes, and taking the class with the maximum probability as the comprehensive classification result of the first query image.
5. The method of claim 1, further comprising, after said inputting the second query set and the second support set into a small sample network model:
extracting image features of an input image through a convolution layer of the small sample network model;
the determining, by the first meta-learner of the small-sample network model, similarities between the currently-input second query image and respective cluster centers of the second support set includes:
calculating, by a relationship module of the first meta-learner, a similarity between image features of the second query image and each of the cluster centers of the second support set.
6. The method of claim 1, wherein said calculating classification errors among the class label of the currently input second query image and the first classification result, the second classification result and the third classification result to obtain a comprehensive error comprises:
mapping the first classification result of the second query image to a fixed numerical value interval, and calculating a mean square error between a classification label of the second query image and the mapped first classification result to obtain a first classification error;
normalizing the second classification result of the second query image, and calculating a cross entropy error between the classification label of the second query image and the normalized second classification result to obtain a second classification error;
normalizing the third classification result of the second query image, and calculating a cross entropy error between the classification label of the second query image and the normalized third classification result to obtain a third classification error;
and obtaining the sum of the first classification error, the second classification error and the third classification error to obtain the comprehensive error.
7. The method of claim 1, wherein the verification of the small sample network model comprises:
extracting a third query set and a third support set from a test image set, the test image set including a fourth number of categories of category-labeled images, the categories of the images of the test image set being distinct from the categories of the images of the training image set, the third support set including a fifth number of categories of third support images, the third query set including a fifth number of categories of category-labeled third query images, the fifth number being less than or equal to the fourth number;
inputting the third query set and the third support set into the small sample network model;
increasing the size of a filter of a pooling layer in the third meta-learner;
training the third meta-learner with the third query set and the third support set;
and verifying the small sample network model through the third support image and the third query image.
8. An electronic device, comprising a processor, a memory coupled to the processor, wherein the memory stores a computer program, and wherein the processor is configured to execute the computer program stored by the memory to perform the method of any of claims 1-7.
9. A computer-readable storage medium, having stored thereon program code that can be invoked by a processor to perform the method according to any one of claims 1 to 7.
CN201911233317.9A 2019-12-05 2019-12-05 Image recognition method, electronic device, and storage medium Active CN110717554B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911233317.9A CN110717554B (en) 2019-12-05 2019-12-05 Image recognition method, electronic device, and storage medium


Publications (2)

Publication Number Publication Date
CN110717554A CN110717554A (en) 2020-01-21
CN110717554B true CN110717554B (en) 2023-02-28

Family

ID=69215740

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911233317.9A Active CN110717554B (en) 2019-12-05 2019-12-05 Image recognition method, electronic device, and storage medium

Country Status (1)

Country Link
CN (1) CN110717554B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111639679B (en) * 2020-05-09 2022-03-04 西北工业大学 Small sample learning method based on multi-scale metric learning
CN112434721B (en) * 2020-10-23 2023-09-01 特斯联科技集团有限公司 Image classification method, system, storage medium and terminal based on small sample learning
CN112949693A (en) * 2021-02-02 2021-06-11 北京嘀嘀无限科技发展有限公司 Training method of image classification model, image classification method, device and equipment
CN113065634A (en) * 2021-02-26 2021-07-02 华为技术有限公司 Image processing method, neural network training method and related equipment
CN113052802B (en) * 2021-03-11 2024-04-09 南京大学 Small sample image classification method, device and equipment based on medical image
CN113298150A (en) * 2021-05-25 2021-08-24 东北林业大学 Small sample plant disease identification method based on transfer learning and self-learning
CN113673583A (en) * 2021-07-30 2021-11-19 浙江大华技术股份有限公司 Image recognition method, recognition network training method and related device
CN115424053B (en) * 2022-07-25 2023-05-02 北京邮电大学 Small sample image recognition method, device, equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109522850A (en) * 2018-11-22 2019-03-26 中山大学 A kind of movement similarity estimating method based on small-sample learning
CN109800811A (en) * 2019-01-24 2019-05-24 吉林大学 A kind of small sample image-recognizing method based on deep learning
CN110276394A (en) * 2019-06-21 2019-09-24 扬州大学 Power equipment classification method based on deep learning under a kind of small sample


Also Published As

Publication number Publication date
CN110717554A (en) 2020-01-21


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant