CN113627522A - Image classification method, device and equipment based on relational network and storage medium - Google Patents

Image classification method, device and equipment based on relational network and storage medium Download PDF

Info

Publication number
CN113627522A
Authority
CN
China
Prior art keywords
image
support set
target image
similarity
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110907203.9A
Other languages
Chinese (zh)
Other versions
CN113627522B (en)
Inventor
梁军
余嘉琳
余松森
苏俊光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China Normal University
Original Assignee
South China Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China Normal University
Priority to CN202110907203.9A
Publication of CN113627522A
Application granted
Publication of CN113627522B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/047Probabilistic or stochastic networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4038Image mosaicing, e.g. composing plane images from plane sub-images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2200/00Indexing scheme for image data processing or generation, in general
    • G06T2200/32Indexing scheme for image data processing or generation, in general involving image mosaicing


Abstract

The invention relates to an image classification method based on a relation network, comprising the following steps: acquiring a target image; inputting the target image and the support set images into a trained image classification model to obtain the similarity between the target image and each category of image in the support set, where the image classification model comprises an embedding module and a measurement module, the embedding module is a stochastic depth network, and the measurement module comprises a convolutional layer and a fully-connected layer connected to each other; and obtaining the category of the target image according to the maximum similarity. In this method, a stochastic depth network replaces the convolutional layers of the relation network in the embedding module. By randomly removing redundant layers, this network optimizes the training of the residual network, so that the depth of the network can be increased while avoiding overfitting, more accurate support set and query set image features can be extracted, and the category judgment of the query set is further improved.

Description

Image classification method, device and equipment based on relational network and storage medium
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to a method, an apparatus, a device, and a storage medium for classifying images based on a relationship network.
Background
In recent years, the unprecedented breakthroughs of deep learning in various fields have depended largely on large amounts of available labeled data, which must be collected and annotated at great cost. This severely limits extension to new categories, and more importantly, deep learning models struggle when only a small amount of labeled data is available. The problem of small-sample (few-shot) learning based on relation networks has therefore become a hot topic of recent research.
The goal of small-sample research is to design a learning model that can learn quickly and identify the classes of new samples from only a small number of labeled samples. The current research lines applicable to the small-sample problem are data augmentation, meta learning, and metric learning. Data augmentation can to some extent relieve the overfitting and data-scarcity problems that arise when training on a small amount of data, but it cannot fundamentally solve the small-sample problem. Meta learning lifts the model from learning from raw data to learning from tasks, providing a new direction for research on the small-sample learning problem.
Extracting image features with a deep convolutional network is a key step in small-sample learning, yet with existing learning methods it is difficult for a deep convolutional network to improve a model's classification accuracy on small-sample tasks.
Deep neural networks suffer from vanishing gradients, from information gradually diminishing as it flows forward, and from excessively long training times, all of which make them very difficult to train. In some tasks, a shallow neural network is simple in structure and easy to train but has poor expressive power, while a deep neural network is expressive but contains more redundant layers and is harder to train.
Disclosure of Invention
Based on this, the present invention provides an image classification method, apparatus, device and storage medium based on a relational network, which can extract more accurate support set image features and query set image features to further improve the category judgment of a query set.
In a first aspect, an embodiment of the present application provides an image classification method based on a relationship network, including the following steps:
acquiring a target image;
inputting the target image and the support set image into a trained image classification model to obtain the similarity between the target image and each class image in the support set image; the image classification model comprises an embedding module and a measurement module, wherein the embedding module is a random depth network, and the measurement module comprises a convolution layer and a full-connection layer which are connected with each other;
and obtaining the category of the target image according to the maximum similarity.
Further, inputting the target image and the support set image into a trained image classification model to obtain the similarity between the target image and each category image in the support set image, including:
inputting the target image and the support set image into the random depth network, and extracting the features of the target image and the support set image;
splicing the extracted features of the target image and the features of the support set image to obtain a spliced image;
inputting the spliced image into the convolutional layer, and further extracting the characteristics of the spliced image;
and inputting the extracted characteristics of the spliced images into the full-connection layer to obtain the similarity between the target image and each category of image in the support set image.
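The four steps above can be sketched in PyTorch. This is an illustrative reading of the pipeline, not the patent's implementation: `embed` stands in for the stochastic depth embedding network, `metric` for the measurement module, and all layer shapes and channel counts are assumptions.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins: `embed` plays the role of the stochastic-depth
# embedding network, `metric` the measurement module (conv layer + FC layer).
embed = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.ReLU())
metric = nn.Sequential(
    nn.Conv2d(128, 64, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
    nn.Flatten(), nn.Linear(64, 1), nn.Sigmoid(),  # relation score in [0, 1]
)

def relation_scores(target, support):
    """target: (1, 3, H, W); support: (C, 3, H, W), one image per category."""
    f_t = embed(target)                        # features of the target image
    f_s = embed(support)                       # features of the support set
    f_t = f_t.expand(f_s.size(0), -1, -1, -1)  # pair target with every category
    stitched = torch.cat([f_s, f_t], dim=1)    # "splicing" along the channel axis
    return metric(stitched).squeeze(1)         # C similarities

scores = relation_scores(torch.randn(1, 3, 84, 84), torch.randn(5, 3, 84, 84))
predicted_class = scores.argmax().item()       # category with maximum similarity
```

Note that stitching happens on the channel axis, so the measurement module's first convolution sees twice the embedding's channel count.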
Further, extracting features of the target image and the support set image comprises:
randomly discarding redundant layers according to the rule generated by the survival probability when processing the target image and the support set images;
obtaining the feature map of the target image f_φ(x_j) and the feature maps of the support set images f_φ(x_i), where x_j is the target image and x_i is a support set image.
Further, obtaining the similarity between the target image and each image in the support set image includes:
obtaining the matching degree between the target image and each category of image in the support set by analyzing the extracted features of the stitched image, as shown in formula (1):
r_{i,j} = g_φ(C(f_φ(x_i), f_φ(x_j))), i = 1, 2, …, C    (1)
where C(·,·) denotes feature concatenation, f_φ(x_i) is the feature map of a support set image, f_φ(x_j) is the feature map of the target image, r_{i,j} represents the similarity between the target image and support set category i, and C is the number of support set categories, so that C similarities are generated.
Further, the training process of the image classification model comprises the following steps:
acquiring a query set image and a training set image;
inputting the query set image and the training set image into the random depth network, and extracting the characteristics of the query set image and the training set image;
splicing the extracted characteristics of the images of the query set and the training set to obtain a spliced image;
inputting the output result of the random depth network into the convolutional layer, and further extracting the characteristics of the query set image and the training set image;
and inputting the output result of the convolutional layer to the full-connection layer to obtain the similarity between the image of the query set and each image in the image of the training set.
In a second aspect, an embodiment of the present application provides an apparatus, including:
the image acquisition module is used for acquiring a target image, a query set image and a support set image;
and the similarity judging module is used for inputting the target image and the support set image into a trained image classification model to obtain the similarity between the target image and each image in the support set image.
And the image classification module is used for obtaining the category of the target image according to the size of the similarity.
Further, in an apparatus provided in an embodiment of the present application, the similarity determining module includes:
a first input unit, configured to input the target image and the support set image to the random depth network, and extract features of the target image and the support set image;
the first splicing unit is used for splicing the extracted features of the target image and the features of the support set image to obtain a spliced image;
the second input unit is used for inputting the spliced image to the first convolution layer and further extracting the characteristics of the spliced image;
and the third input unit is used for inputting the extracted characteristics of the spliced images into the full-connection layer to obtain the similarity between the target image and each image in the support set image.
Further, an apparatus provided in an embodiment of the present application further includes a training module:
the training module is used for inputting the images of the query set and the images of the support set into the image classification model for training to obtain an image classification model set corresponding to the image classification model, and classifying and identifying the images of the query set by adopting the image classification model.
In a third aspect, an embodiment of the present application provides an electronic device, including:
the system comprises a processor and a memory, wherein the memory stores a program which can be called by the processor;
wherein the processor, when executing the program, implements the method for image classification based on a relational network according to the first aspect.
In a fourth aspect, the present application provides a computer-readable storage medium storing a computer program; the computer program, when executed by a processor, implements the steps of the relation-network-based image classification method according to the first aspect.
In the embodiments of the application, in order to extract more accurate support set and query set image features and thereby further improve the category judgment of the query set, an improved model is proposed on the basis of the relation network using the techniques of few-shot learning, and the improved model is applied to the image classification problem. The difference between this model and the relation network is that a stochastic depth network replaces the original four convolutional layers in the embedding module. The stochastic depth network deepens the embedding module and optimizes the training of the residual network by randomly removing redundant layers, preventing overfitting while increasing the number of layers.
Because some residual modules in the model are not activated, the idea of model fusion is in fact embodied: the depth of the model is random during training and fixed during prediction, so models of different depths are effectively fused at test time, which makes the network simpler.
For a better understanding and practice, the invention is described in detail below with reference to the accompanying drawings.
Drawings
FIG. 1 is a flow chart of a method for image classification based on a relational network according to the present invention;
FIG. 2 is a diagram of the original ResNet structure in a random deep network;
FIG. 3 is a schematic diagram of probability of survival generation in a random deep network;
FIG. 4 is a diagram illustrating an image classification model according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of the construction of a convolutional layer and a fully-connected layer in a metrology module;
FIG. 6 is a schematic block diagram of an image classification apparatus based on a relational network according to the present invention;
fig. 7 is a schematic diagram of a similarity determination module in an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
It should be understood that the embodiments described are only some embodiments of the present application, and not all embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without any creative effort belong to the protection scope of the embodiments in the present application.
The terminology used in the embodiments of the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the embodiments of the present application. As used in the examples of this application and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application; rather, they are merely examples of apparatus and methods consistent with certain aspects of the application, as detailed in the appended claims. In the description of the present application, it is to be understood that the terms "first," "second," "third," and the like are used solely to distinguish between similar objects, and are not necessarily used to describe a particular order or sequence, nor are they to be construed as indicating or implying relative importance. The specific meaning of the above terms in the present application can be understood by those of ordinary skill in the art as appropriate.
Further, in the description of the present application, "a plurality" means two or more unless otherwise specified. "And/or" describes an association relationship between associated objects, meaning three relationships may exist; for example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. The character "/" generally indicates that the associated objects are in an "or" relationship.
To solve the technical problem in the background art, an embodiment of the present application provides an image classification method based on a relationship network, as shown in fig. 1, the method includes the following steps:
in step S101, a target image is acquired;
in step S102, inputting the target image and the support set image into a trained image classification model to obtain a similarity between the target image and each category image in the support set image; the image classification model comprises an embedding module and a measurement module, wherein the embedding module is a random depth network, and the measurement module comprises a convolution layer and a full-connection layer which are connected with each other;
in step S103, the category of the target image is obtained according to the maximum similarity.
The target image is an image whose category label is to be recognized.
Specifically, the image classification model is a model that, from a small number of labeled support set images and a given target image, extracts the features of the support set images and the target image and performs recognition and classification by measuring the distance between the extracted features. The similarity between the target image and each category of image in the support set is the distance between the features of the target image and the features of the support set images: the greater the distance, the lower the similarity, and the smaller the distance, the higher the similarity.
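The final decision rule — obtaining the category from the maximum similarity — reduces to an argmax over the per-category relation scores. A minimal sketch (category names and score values are purely illustrative):

```python
def classify(similarities):
    """similarities: dict mapping support-set category name -> relation score."""
    # the predicted category is the one with the maximum similarity
    return max(similarities, key=similarities.get)

scores = {"cat": 0.12, "dog": 0.87, "bird": 0.31}  # illustrative scores only
assert classify(scores) == "dog"
```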
Experiments on different data sets show that this training method can effectively relieve the difficulty of training deep networks and greatly improve model precision and training speed. Fig. 2 shows the original ResNet structure, where f denotes the residual branch and id the identity mapping; the two parts are summed, activated, and then output. This process can be represented by the following equation:
H_l = ReLU(f_l(H_{l-1}) + id(H_{l-1}))    (1)
where H_{l-1} denotes the input to the l-th residual block and H_l denotes its output.
Stochastic depth adds a random variable b during training, where the distribution of b satisfies a Bernoulli distribution, multiplies the residual branch f by b, and thereby randomly discards the residual part. If b = 1, the structure is the original ResNet structure; when b = 0, the residual branch is not activated and the whole block degenerates to an identity function. This process can be represented by the following equation:
H_l = ReLU(b_l f_l(H_{l-1}) + id(H_{l-1}))    (2)
b satisfies a Bernoulli distribution and takes only the values 0 and 1, where the probability of 0 is 1 − p and the probability of 1 is p. This p is called the survival probability: it is the probability that b = 1, and it is set as a smooth function of the residual-layer index l, decreasing linearly from p_0 = 1 to p_L = 0.5, where L is the total number of residual blocks. The formula is as follows:
p_l = 1 − (l/L)(1 − p_L)    (3)
where p_l is the survival probability of the l-th layer during training and L is the total number of residual blocks. The resulting rule for p is shown in fig. 3.
Because the embedding module should not use an overly complex network, a model optimized from ResNet-18 is used (the "18" in ResNet-18 refers to its 18 weighted layers). Since the embedding module needs to extract the feature maps of the target image and the support set images and feed these feature maps as input to the measurement module, the last two layers of ResNet-18 — the pooling layer and the fully-connected layer — are removed, yielding a ResNet-16 model. ResNet-16 is then further optimized by randomly discarding redundant layers according to the rule generated by the survival probability p, finally forming Stochastic Depth-16.
In a specific embodiment, as shown in fig. 4-5, fig. 4-5 are specific structures of an image classification model, wherein the embedding module is a random depth network, and the metrology module comprises a convolutional layer and a fully-connected layer connected to each other.
The random deep network uses Stochastic Depth-16.
The convolutional layer comprises convolution block 1 and convolution block 2, and the fully-connected part comprises max pooling layer 1, a ReLU activation layer, max pooling layer 2, and a Sigmoid layer. Each convolution block comprises a convolution kernel, a batch normalization layer, and a ReLU activation layer; the parameters of each convolution kernel are the same, namely a 64-channel 3 × 3 kernel, and each max pooling layer is 2 × 2.
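A sketch of this measurement module in PyTorch follows. Only the 64-channel 3 × 3 kernels and 2 × 2 max pooling are taken from the text; the input channel count of convolution block 1, the hidden fully-connected size, and the input spatial size are assumptions the patent does not state:

```python
import torch
import torch.nn as nn

def conv_block(in_ch):
    # each convolution block: 64-channel 3x3 kernel + batch norm + ReLU
    return nn.Sequential(
        nn.Conv2d(in_ch, 64, kernel_size=3, padding=1),
        nn.BatchNorm2d(64),
        nn.ReLU(),
    )

# measurement module: two convolution blocks, each followed by 2x2 max pooling,
# then a fully-connected part ending in a Sigmoid relation score.
metric_module = nn.Sequential(
    conv_block(128),           # stitched feature maps (128 channels is assumed)
    nn.MaxPool2d(2),           # max pooling layer 1
    conv_block(64),
    nn.MaxPool2d(2),           # max pooling layer 2
    nn.Flatten(),
    nn.Linear(64 * 5 * 5, 8),  # hidden FC size is an assumption
    nn.ReLU(),
    nn.Linear(8, 1),
    nn.Sigmoid(),
)

score = metric_module(torch.randn(1, 128, 21, 21))  # 21x21 -> 10x10 -> 5x5
```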
According to the specific structure of the image classification model, inputting the target image and the support set images into the trained image classification model specifically comprises the following steps:
inputting the target image and the support set image into the random depth network, and extracting the features of the target image and the support set image;
splicing the extracted features of the target image and the features of the support set image to obtain a spliced image;
inputting the spliced image into the convolutional layer, and further extracting the characteristics of the spliced image;
and inputting the extracted characteristics of the spliced images into the full-connection layer to obtain the similarity between the target image and each category of image in the support set image.
Specifically, extracting the features of the target image and the support set image includes:
randomly discarding redundant layers according to the rule generated by the survival probability when processing the target image and the support set images, and obtaining the feature map of the target image f_φ(x_j) and the feature maps of the support set images f_φ(x_i), where x_j is the target image and x_i is a support set image.
Obtaining the similarity between the target image and each image in the support set image, including:
obtaining the matching degree between the target image and each category of image in the support set by analyzing the extracted features of the stitched image, as shown in formula (4):
r_{i,j} = g_φ(C(f_φ(x_i), f_φ(x_j))), i = 1, 2, …, C    (4)
where C(·,·) denotes feature concatenation, f_φ(x_i) is the feature map of a support set image, f_φ(x_j) is the feature map of the target image, r_{i,j} represents the similarity between the target image and support set category i, and C is the number of support set categories, so that C similarities are generated.
In a specific embodiment, the training process of the image classification model comprises the following steps:
acquiring a query set image and a training set image;
inputting the query set image and the training set image into the random depth network, and extracting the characteristics of the query set image and the training set image;
splicing the extracted characteristics of the images of the query set and the training set to obtain a spliced image;
inputting the output result of the random depth network into the convolutional layer, and further extracting the characteristics of the query set image and the training set image;
and inputting the output result of the convolutional layer to the full-connection layer to obtain the similarity between the image of the query set and each image in the image of the training set.
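The training steps above can be sketched as one episodic update. This is a toy-scale illustration: the embedding and measurement networks are stand-ins, and the MSE regression target (1 for matching pairs, 0 otherwise) is borrowed from the original relation-network formulation as an assumption, since the patent does not name the loss:

```python
import torch
import torch.nn as nn

embed = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU())  # toy embedding
metric = nn.Sequential(nn.Conv2d(16, 8, 3, padding=1), nn.ReLU(),
                       nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                       nn.Linear(8, 1), nn.Sigmoid())
optimizer = torch.optim.Adam(
    list(embed.parameters()) + list(metric.parameters()), lr=1e-3)

def train_step(query, support, query_labels):
    """query: (Q, 3, H, W); support: (C, 3, H, W); query_labels: (Q,) indices."""
    C, Q = support.size(0), query.size(0)
    f_q, f_s = embed(query), embed(support)
    # pair every query image with every support category, then stitch channels
    f_q = f_q.unsqueeze(1).expand(-1, C, -1, -1, -1)    # (Q, C, ch, H, W)
    f_s = f_s.unsqueeze(0).expand(Q, -1, -1, -1, -1)
    pairs = torch.cat([f_s, f_q], dim=2).flatten(0, 1)  # (Q*C, 2ch, H, W)
    scores = metric(pairs).view(Q, C)                   # C similarities per query
    # regression target: 1 for the true category, 0 for the others (assumed loss)
    target = torch.zeros(Q, C).scatter_(1, query_labels.unsqueeze(1), 1.0)
    loss = nn.functional.mse_loss(scores, target)
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    return loss.item()

loss = train_step(torch.randn(4, 3, 32, 32), torch.randn(5, 3, 32, 32),
                  torch.randint(0, 5, (4,)))
```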
As shown in fig. 6, it is a schematic block diagram of an apparatus 200 for classifying a target image based on a relationship network according to the present invention, including:
an image obtaining module 210, configured to obtain a target image, a query set image, and a support set image.
And the similarity judging module 220 is configured to input the target image and the support set image into a trained image classification model, so as to obtain a similarity between the target image and each image in the support set image.
An image category obtaining module 230, configured to obtain a category of the target image according to the size of the similarity.
As shown in fig. 7, the similarity determination module 220 includes:
a first input unit 221, configured to input the target image and the support set images into the random depth network, so as to obtain feature maps of the target image and the support set images;
a first stitching unit 222, configured to stitch the target image feature map with the feature map of each support set category image to obtain a stitched feature map;
a second input unit 223, configured to input the stitching feature map into the convolutional layer, and extract features of the stitching feature map;
a third input unit 224, configured to input the output result of the convolutional layer to the fully-connected layer, so as to obtain the similarity between the target image and each category of image in the support set images.
In a preferred embodiment, the image classification system further includes a training module, where the training module is configured to input the query set images and the support set images into an image classification model for training, to obtain an image classification model set corresponding to the image classification model, and perform classification and identification on the query set images by using the image classification model.
Corresponding to the image classification method based on the relational network, an embodiment of the present application further provides an electronic device, including:
at least one processor and at least one memory;
the memory stores a program that can be called by the processor;
when the processor executes the program, the steps of the image classification method based on the relational network can be realized.
In particular, the electronic device may be a computer or a server.
Corresponding to the above-mentioned image classification method based on the relational network, an embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored, and the computer program, when executed by a processor, implements the steps of the image classification method based on the relational network.
In a specific embodiment, the image classification method based on the relation network provided by the invention was evaluated on the mini-ImageNet dataset and the RP2K dataset.
The experiments were implemented with the PyTorch framework; the experimental environment is shown in Table 1 below.
[Table 1: experimental environment — contents not recoverable from the source]
1) mini-ImageNet dataset
mini-ImageNet is derived from ImageNet and contains 100 classes of 100 samples each, with each picture sized 84 × 84; 64 classes are used for training, 16 for validation, and 20 for testing. We tested both the 5-way 1-shot and 5-way 5-shot tasks. In the model of the application, the four convolutional layers of the embedding module in the relation network are replaced by a stochastic depth network, and the measurement module is kept consistent with the relation network. On the mini-ImageNet dataset, the experimental results are shown in Table 2: the precision of the model improves by 1.58% and 1.21% on the 5-way 1-shot and 5-way 5-shot tasks, respectively.
[Table 2: results on the mini-ImageNet dataset — contents not recoverable from the source]
2) RP2K data set
The RP2K dataset is a large-scale item image dataset for retail product classification. It collects over 500,000 images of retail products covering 2,000 different categories, and is currently the largest product picture dataset. To verify whether our improved model can classify small-sample retail product images more effectively, we followed the mini-ImageNet setup and randomly drew 100 product categories from RP2K: 64 as the training set, 16 as the validation set, and 20 as the test set. The drawing and splitting were repeated 3 times, the 3 resulting data sets were input into the model to obtain 3 results, and their average was taken as the final result. Because RP2K pictures are not all the same size, we uniformly resized all pictures to 84 × 84. The experimental setup was otherwise the same as described above. The results are shown in Table 3: on the RP2K dataset, compared with the relation network, the SD-RNET model improves precision by 0.85% and 0.26% on the 5-way 1-shot and 5-way 5-shot tasks, respectively.
(Table 3: results on RP2K; rendered as an image in the original publication and not reproduced here.)
In the embodiment of the application, in order to extract more accurate image features from the support set and the query set and thereby improve category prediction for the query set, an improved model is proposed on the basis of the relation network using few-shot learning, and applied to the image classification problem. The model differs from the relation network in that a stochastic depth network replaces the original four convolutional layers in the embedding module. The stochastic depth network deepens the embedding module while optimizing the training of the residual network by randomly dropping redundant layers, which prevents overfitting even as the number of layers grows.
Because only part of the residual blocks are active in any given training pass, the model in fact embodies the idea of model fusion: the depth of the model is random during training but fixed during prediction, so at test time models of different depths are effectively fused, making the network simpler.
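The training-time dropping and test-time fusion described above can be sketched with a toy numeric stand-in for the residual blocks. This is an illustrative sketch, not the patent's implementation: the blocks here are scalar functions rather than convolutional residual branches, and the linear survival-probability rule is the one from Huang et al.'s stochastic depth paper:

```python
import random

def survival_probs(n_blocks, p_last=0.5):
    # linear decay rule: p_l = 1 - (l / L) * (1 - p_L); deeper blocks
    # are dropped more often during training
    return [1 - (l / n_blocks) * (1 - p_last) for l in range(1, n_blocks + 1)]

def forward(x, blocks, probs, training, rng=None):
    rng = rng or random.Random()
    for f, p in zip(blocks, probs):
        if training:
            if rng.random() < p:   # keep the residual branch with prob. p
                x = x + f(x)
            # otherwise the block is skipped: identity shortcut only
        else:
            x = x + p * f(x)       # test time: scale branch by survival prob.
    return x

blocks = [lambda v: 0.1 * v] * 4   # toy stand-ins for residual branches
probs = survival_probs(4, p_last=0.5)
y = forward(1.0, blocks, probs, training=False)
```

At test time every branch is active but scaled by its survival probability, which is the "fusion of models of different depths" the description refers to.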
The above-mentioned embodiments express only several embodiments of the present invention, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the inventive concept, and these fall within the protection scope of the present invention.

Claims (10)

1. An image classification method based on a relational network, characterized in that the method comprises the following steps:
acquiring a target image;
inputting the target image and the support set images into a trained image classification model to obtain the similarity between the target image and each category of image in the support set images; the image classification model comprises an embedding module and a metric module, wherein the embedding module is a stochastic depth network, and the metric module comprises a convolutional layer and a fully connected layer connected to each other;
and determining the category of the target image according to the maximum similarity.
2. The method of claim 1, wherein inputting the target image and the support set image into a trained image classification model to obtain a similarity between the target image and each class image in the support set image comprises:
inputting the target image and the support set images into the stochastic depth network, and extracting features of the target image and the support set images;
concatenating the extracted features of the target image with the features of the support set images to obtain a concatenated feature map;
inputting the concatenated feature map into the convolutional layer to further extract its features;
and inputting the extracted features of the concatenated feature map into the fully connected layer to obtain the similarity between the target image and each category of image in the support set images.
3. The method of claim 2, wherein extracting the features of the target image and the support set image comprises:
randomly discarding redundant layers, according to rules generated from the survival probabilities, while extracting the features of the target image and the support set images;
obtaining the feature map f_φ(x_j) of the target image and the feature map f_φ(x_i) of the support set image, wherein x_j is the target image and x_i is a support set image.
4. The method of claim 3, wherein obtaining the similarity between the target image and each image in the support set images comprises:
obtaining the matching degree between the target image and each category of image in the support set images by analyzing the extracted features of the concatenated feature map, as shown in formula 1;

r_{i,j} = g_φ(C(f_φ(x_i), f_φ(x_j))), i = 1, 2, …, C    (formula 1)

wherein f_φ(x_i) is the feature map of the support set image, f_φ(x_j) is the feature map of the target image, r_{i,j} represents the similarity between the target image and the support set category, and C is the number of support set category images, so that C similarity scores are generated.
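Formula 1 can be illustrated numerically. The sketch below is a toy stand-in in which g_φ is a fixed linear map plus a sigmoid rather than the convolutional and fully connected layers of the metric module; the feature values and weights are invented for illustration:

```python
import math

def relation_score(feat_support, feat_target, weights, bias=0.0):
    pair = feat_support + feat_target            # C(., .): concatenation
    z = sum(w * v for w, v in zip(weights, pair)) + bias
    return 1 / (1 + math.exp(-z))                # sigmoid -> score in (0, 1)

f_xi = [0.2, 0.9]          # support-set feature map (flattened, illustrative)
f_xj = [0.1, 0.8]          # target-image feature map
w = [0.5, 1.0, 0.5, 1.0]   # invented weights standing in for g_phi
r = relation_score(f_xi, f_xj, w)
```

Repeating this for each of the C support categories yields C scores, and the category with the maximum score is taken as the prediction.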
5. The image classification method based on a relational network according to claim 3, wherein the training process of the image classification model comprises:
acquiring a query set image and a training set image;
inputting the query set images and the training set images into the stochastic depth network, and extracting features of the query set images and the training set images;
concatenating the extracted features of the query set images with those of the training set images to obtain a concatenated feature map;
inputting the output of the stochastic depth network into the convolutional layer to further extract features of the query set images and the training set images;
and inputting the output of the convolutional layer into the fully connected layer to obtain the similarity between the query set images and each image in the training set images.
6. An image classification apparatus based on a relational network, characterized by comprising:
the image acquisition module is used for acquiring a target image, a query set image and a support set image;
the similarity judging module is used for inputting the target image and the support set images into a trained image classification model to obtain the similarity between the target image and each image in the support set images; and
the image classification module is used for determining the category of the target image according to the maximum similarity.
7. The image classification device based on the relational network according to claim 6, wherein the similarity determination module comprises:
a first input unit, configured to input the target image and the support set images into the stochastic depth network and extract features of the target image and the support set images;
a first concatenation unit, configured to concatenate the extracted features of the target image with the features of the support set images to obtain a concatenated feature map;
a second input unit, configured to input the concatenated feature map into the convolutional layer to further extract its features;
and a third input unit, configured to input the extracted features of the concatenated feature map into the fully connected layer to obtain the similarity between the target image and each image in the support set images.
8. The image classification device based on the relational network according to claim 7, further comprising:
a training module, configured to input the query set images and the support set images into the image classification model for training to obtain the trained image classification model, and to classify and identify the query set images using the trained image classification model.
9. An electronic device, comprising:
the system comprises a processor and a memory, wherein the memory stores a program which can be called by the processor;
wherein the processor, when executing the program, implements the image classification method of any one of claims 1 to 5.
10. A computer-readable storage medium storing a computer program, characterized in that:
the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1-5.
CN202110907203.9A 2021-08-09 Image classification method, device, equipment and storage medium based on relational network Active CN113627522B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110907203.9A CN113627522B (en) 2021-08-09 Image classification method, device, equipment and storage medium based on relational network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110907203.9A CN113627522B (en) 2021-08-09 Image classification method, device, equipment and storage medium based on relational network

Publications (2)

Publication Number Publication Date
CN113627522A true CN113627522A (en) 2021-11-09
CN113627522B CN113627522B (en) 2024-07-02


Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115965817A (en) * 2023-01-05 2023-04-14 北京百度网讯科技有限公司 Training method and device of image classification model and electronic equipment

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112434721A (en) * 2020-10-23 2021-03-02 特斯联科技集团有限公司 Image classification method, system, storage medium and terminal based on small sample learning
US20210232915A1 (en) * 2020-01-23 2021-07-29 UMNAI Limited Explainable neural net architecture for multidimensional data


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
GAO HUANG et al.: "Deep Networks with Stochastic Depth", arXiv:1603.09382v3 [cs.LG], pages 1-16 *
SONGSEN YU et al.: "Sketch works ranking based on improved transfer learning model", Multimedia Tools and Applications, 23 August 2023, pages 33663-33678 *
DAI LEICHAO et al.: "A Robust Few-Shot Learning Method" (in Chinese), Journal of Chinese Computer Systems (小型微型计算机系统), vol. 42, no. 2, pages 340-347 *


Similar Documents

Publication Publication Date Title
US20220237788A1 (en) Multiple instance learner for tissue image classification
Zhao et al. A visual long-short-term memory based integrated CNN model for fabric defect image classification
US20180247156A1 (en) Machine learning systems and methods for document matching
CN106294344B (en) Video retrieval method and device
CN113360701B (en) Sketch processing method and system based on knowledge distillation
CN110175613A (en) Street view image semantic segmentation method based on Analysis On Multi-scale Features and codec models
WO2019015246A1 (en) Image feature acquisition
CN112464865A (en) Facial expression recognition method based on pixel and geometric mixed features
CN112132014B (en) Target re-identification method and system based on non-supervised pyramid similarity learning
Akrim et al. Classification of Tajweed Al-Qur'an on Images Applied Varying Normalized Distance Formulas
CN113761259A (en) Image processing method and device and computer equipment
CN112560710B (en) Method for constructing finger vein recognition system and finger vein recognition system
CN116612335B (en) Few-sample fine-granularity image classification method based on contrast learning
CN114913923A (en) Cell type identification method aiming at open sequencing data of single cell chromatin
CN109993187A (en) A kind of modeling method, robot and the storage device of object category for identification
CN115393666A (en) Small sample expansion method and system based on prototype completion in image classification
CN108229505A (en) Image classification method based on FISHER multistage dictionary learnings
CN111414930B (en) Deep learning model training method and device, electronic equipment and storage medium
CN113496260A (en) Grain depot worker non-standard operation detection method based on improved YOLOv3 algorithm
CN112861881A (en) Honeycomb lung recognition method based on improved MobileNet model
CN108805152A (en) A kind of scene classification method and device
CN116524243A (en) Classification method and device for fossil images of penstones
Winiarti et al. Application of Artificial Intelligence in Digital Architecture to Identify Traditional Javanese Buildings
CN115033700A (en) Cross-domain emotion analysis method, device and equipment based on mutual learning network
CN113627522A (en) Image classification method, device and equipment based on relational network and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant