CN111062424A - Small sample food image recognition model training method and food image recognition method - Google Patents

Small sample food image recognition model training method and food image recognition method

Info

Publication number
CN111062424A
CN111062424A
Authority
CN
China
Prior art keywords
image
positive
relationship
sample
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911232161.2A
Other languages
Chinese (zh)
Inventor
Min Weiqing
Lyu Yongqiang
Jiang Shuqiang
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Computing Technology of CAS
Original Assignee
Institute of Computing Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Computing Technology of CAS filed Critical Institute of Computing Technology of CAS
Priority to CN201911232161.2A priority Critical patent/CN111062424A/en
Publication of CN111062424A publication Critical patent/CN111062424A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention provides a small sample food image recognition model training method and a food image recognition method. The model training method comprises the following steps: constructing triples each containing a positive sample image, a negative sample image and an anchor image from a training data set, and inputting them into a ternary convolutional neural network to extract the feature representation of each triple; fusing the feature maps to obtain the feature maps of the positive and negative sample image pairs; inputting the pair feature maps into a relation learning network to obtain relation value scores; and screening the positive and negative sample pairs based on the relation value scores, then training the feature embedding network and the relation learning network with the screened positive and negative sample pairs.

Description

Small sample food image recognition model training method and food image recognition method
Technical Field
The invention relates to the technical field of food recognition, and in particular to a small sample food image recognition model training method and a food image recognition method that fuse a ternary convolutional neural network and a relation network.
Background
Food recognition is an important research topic in computer vision, data mining, multimedia social interaction and related fields, and is widely applied to automated food detection, diet management, food trend and popularity analysis, smart homes and food safety. Food data sets collected from the real world follow a typical long-tail distribution: only a small number of samples can be collected for many uncommon food categories, so small sample food recognition is a problem to be solved urgently.
However, no existing work addresses the problem of small sample food image recognition. This is because small sample food image recognition faces many challenges, including: (1) fine-grained discrimination: distinguishing the fine-grained differences of food images within and between classes is essential for food recognition; (2) lack of rigid structure: many food images have no fixed spatial layout, many foods have no fixed structure, and structural information is not readily available; (3) training difficulty: because the training class space and the test class space are disjoint, it is harder to obtain guidance about the test classes from the training classes.
Existing small sample recognition methods focus only on the similarity information between image pairs and neglect the fine-grained distinctions of images within and between classes; they cannot make fine-grained distinctions between images.
Disclosure of Invention
Aiming at the problems and defects of the prior art, the invention provides a small sample food image recognition model training method and a corresponding recognition method that fuse a ternary convolutional neural network and a relation network.
In order to achieve the purpose, the invention provides the following technical scheme:
according to one aspect of the invention, a small sample food image recognition model training method is provided, and is characterized in that the method comprises the following steps:
constructing triples each including a positive sample image, a negative sample image and an anchor image from a training data set, inputting each triple into a ternary convolutional neural network to extract its feature representation, and acquiring the feature maps of a convolutional layer;
fusing the feature maps of the positive and negative sample images with the feature map of the anchor image, respectively, to obtain positive and negative sample image pair feature maps;
inputting the positive and negative sample image pair feature maps into a relation learning network to obtain the corresponding relation value scores, screening the positive and negative sample pairs based on the relation value scores, and training the ternary convolutional neural network and the relation learning network with the screened positive and negative sample pairs.
In a preferred implementation, when the relation learning network is trained, the hinge loss function used is:

L = \sum_{i=1}^{P} \sum_{a=1}^{K} \left[ m + \max_{j \neq i,\, n} g_\varphi\big(\tau(f_\theta(x_a^i), f_\theta(x_n^j))\big) - \min_{p \neq a} g_\varphi\big(\tau(f_\theta(x_a^i), f_\theta(x_p^i))\big) \right]_+

where P is the number of randomly selected sample classes, K is the number of samples randomly selected from each class, m is the relationship control threshold, g_φ(·) is the relation value score obtained by the relation learning network, φ denotes the parameters of the relation learning network, and τ(·,·) is the fused feature map.
In another preferred implementation, the screening is performed by selecting triples whose positive sample pair relation score is greater than β and whose negative sample pair relation score is less than or equal to α, where β and α are preset parameters.
In another preferred implementation, β and α take on values of 0.6 and 0.4, respectively.
In another preferred implementation, the relation scores of the positive and negative sample image pairs are calculated as:

r^{\pm} = g_\varphi\big(\tau(f_\theta(x), f_\theta(x^{\pm}))\big)

where r is the relation value score, g_φ is the relation learning network with parameters φ, τ(f_θ(x), f_θ(x-)) is the fused feature of the negative sample image pair, and τ(f_θ(x), f_θ(x+)) is the fused feature of the positive sample image pair.
In another preferred implementation, in the feature extraction step, the features of the last convolutional layer of the ternary convolutional neural network are extracted as the feature representation of the image.
In another preferred implementation manner, the relationship learning network includes two convolutional layers and two fully-connected layers, where the convolutional layers are used to perform convolutional learning on the input fusion features, and the fully-connected layers are used to perform learning and dimension reduction processing on the convolutional results.
According to another aspect of the invention, a method for small sample food image recognition by using a model trained by the method is provided, and the method comprises the following steps:
selecting one image from each of the C categories of images of the training set as a support set, taking a target image as a query set, pairing the image in the query set with each image in the support set to form image pairs, and inputting the image pairs into the trained ternary convolutional neural network to extract feature maps;
and fusing the feature maps of each image pair, inputting the fused features into the trained relation learning network to obtain the corresponding relation scores, and determining the category of the target image based on the image pair with the maximum relation score.
According to another aspect of the invention, a computer-readable storage medium is provided, on which a computer program is stored, wherein the program, when executed by a processor, implements the method.
According to another aspect of the present invention, there is provided a computer device comprising a memory and a processor, the memory having stored thereon a computer program operable on the processor, wherein the processor implements the method when executing the program.
Compared with the prior art, the invention has the beneficial effects that:
the small sample Food image recognition model training method and the corresponding recognition method provided by the invention integrate the ternary convolution neural network and the relational network, the ternary neural network can be used for learning more fine-grained information, and meanwhile, the linear measurement method is replaced by organically integrating the nonlinear measurement method by means of the relational network, so that the classification performance is improved to the greatest extent, and the best classification performance is achieved on a plurality of public data sets (ETH Food-101, Vireofood-172 and ChinesFoodNet).
By using the proposed "limited batch hard" screening rule, triples unsuitable for training are filtered out and the model is trained only on qualifying triples, which further ensures the reliability of model training and improves classification accuracy.
Drawings
The invention is illustrated and described by way of example only, and not by way of limitation, in the following drawings:
fig. 1 is a schematic frame diagram of a small sample food identification method fusing a ternary convolutional neural network and a relational network according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of a relationship network structure adopted in the embodiment of the present invention.
Fig. 3 is a schematic flow chart of the "limited batch hard" method adopted in the embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions, design methods, and advantages of the present invention more apparent, the present invention will be further described in detail by specific embodiments with reference to the accompanying drawings. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Brief introduction to small sample learning
In the small sample learning problem, the samples are usually organized into a series of training sets and test sets. Assume there are C training classes with N labelled training samples in total, and define the training set as

D_{base} = \{(x_i, y_i)\}_{i=1}^{N}

where x_i is a sampled image and y_i is the label of x_i. For the test set, assume there are L new classes with M test samples in total, and define the test sample set as

D_{novel} = \{(\tilde{x}_j, \tilde{y}_j)\}_{j=1}^{M}

with label set \{\tilde{y}_j\}_{j=1}^{M}. It is worth noting that the class spaces of the training set and the test set are completely disjoint. The small sample problem is to identify unknown classes using known classes.
For small sample learning, a support set and a query set are first defined. Taking the training set as an example, C classes are randomly sampled from the training set D_{base}, and K samples are randomly sampled from each class to form the support set

S = \{(x_s, y_s)\}_{s=1}^{C \times K}.

The query set

Q = \{(x_q, y_q)\}_{q=1}^{n}

is formed by randomly selecting one of the C support-set categories and randomly sampling n samples from the selected category. If the support set contains C different classes and each class contains K samples, the task is called "C-way K-shot". In small sample learning settings K is usually small, e.g., K = 1 or K = 5. The aim of a C-way K-shot task is, given a query image x_q, to learn a classification mapping using the support set S and derive a probability distribution over the query classes, P(\hat{y}_q \mid x_q, S), where \hat{y}_q is the predicted label.
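As a concrete illustration of the C-way K-shot sampling described above, the following minimal Python sketch (not part of the patent; the images_by_class layout and function name are assumptions for illustration) draws one episode:

```python
import random

def sample_episode(images_by_class, C=5, K=1, n_query=1):
    """Sample one C-way K-shot episode.

    images_by_class: dict mapping class label -> list of image identifiers.
    """
    classes = random.sample(list(images_by_class), C)            # C support classes
    support = [(img, c) for c in classes
               for img in random.sample(images_by_class[c], K)]  # K shots per class
    query_class = random.choice(classes)                         # query comes from a support class
    used = {img for img, _ in support}
    pool = [img for img in images_by_class[query_class] if img not in used]
    query = [(img, query_class) for img in random.sample(pool, n_query)]
    return support, query
```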
According to one embodiment of the invention, the small sample food image identification method comprises the following steps:
1. Defining the triple input. Three food image inputs x-, x and x+ denote the negative ("negative"), anchor ("anchor") and positive ("positive") sample images, respectively, where x and x+ are samples belonging to the same class and x- is a sample belonging to a different class from x. Two categories A and B are selected from the training categories: two images are randomly selected from category A as the anchor and the positive sample, and one image is selected from category B as the negative sample, giving the triple (anchor, positive, negative). The data set is partitioned into a test set and a training set whose class spaces are disjoint from each other.
Aiming at the learning characteristics of small sample food images, in the training stage triples are constructed from the training set of the data set as the input of the ternary neural network model to train the model; in the testing stage, "C-way K-shot" image pairs (i.e., a subset containing C classes with K samples per class) are constructed from the test set as the input of the deep neural network model to test its performance. The invention mainly tests the 5-way 1-shot setting: for example, 5 categories are randomly selected and one image from each category forms the support set; one of the 5 categories is then chosen and another image from it is selected as the query image; finally, the category of the query image is determined from among the 5 categories. A sketch of the triple construction is given below.
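The following Python sketch illustrates the triple construction just described (illustrative only; images_by_class is the same assumed dict layout as above):

```python
import random

def sample_triplet(images_by_class):
    """Return one (anchor, positive, negative) triple of image identifiers."""
    cls_a, cls_b = random.sample(list(images_by_class), 2)       # categories A and B
    anchor, positive = random.sample(images_by_class[cls_a], 2)  # two images from A
    negative = random.choice(images_by_class[cls_b])             # one image from B
    return anchor, positive, negative
```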
2. The constructed triples are respectively input into the ternary convolutional neural network, i.e., the feature embedding sub-network of the deep neural network, to obtain the feature representation of each triple. Different deep neural networks can be adopted as the feature embedding sub-network.
To briefly describe its structure, the invention takes the commonly used VGG16 network as an example for the related experimental analysis: the feature embedding sub-network is a VGG16 network without the fully connected layers, consisting of 13 convolutional layers; the extracted features are those of the last convolutional layer, with feature dimensions 14 × 14 × 512.
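As a hedged illustration (not the patent's code), the following PyTorch sketch builds such an embedding sub-network; truncating VGG16 before its final max-pooling layer is an assumption made so that a 224 × 224 input yields the 14 × 14 × 512 feature map stated above:

```python
import torch
import torch.nn as nn
import torchvision.models as models

# Keep only the convolutional part of VGG16 and drop the final max-pooling
# layer, so the output is the conv5_3 feature map: 512 x 14 x 14 for a
# 224 x 224 input (truncation point assumed from the stated dimensions).
vgg = models.vgg16()
embed_net = nn.Sequential(*list(vgg.features.children())[:-1])

x = torch.randn(1, 3, 224, 224)   # one dummy food image
feat = embed_net(x)
print(feat.shape)                 # torch.Size([1, 512, 14, 14])
```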
Unlike prior-art ternary neural networks, the invention extracts the feature map of a convolutional layer rather than the features of a fully connected layer.
In the prior art, assuming the feature embedding of a sample is f_θ(x), where θ denotes the parameters of the feature embedding network, the fully connected layer before the classification layer is adopted as the embedded representation of an image, and the distance representation of a triple is then obtained with a fixed distance metric (e.g., the L2 distance):

d^{\pm} = \lVert f_\theta(x) - f_\theta(x^{\pm}) \rVert_2

In the invention, the feature maps obtained by inputting the triple into the image feature embedding network are denoted f_θ(x-), f_θ(x) and f_θ(x+), and these are taken as the feature representations of the images. The inventors note that, compared with the features of a fully connected layer, the features of a convolutional layer retain the spatial information of the image, and the features of the last convolutional layer also carry stronger semantic information, which is more beneficial for small sample food image recognition. Therefore, in a preferred embodiment of the invention, the features of the last convolutional layer are taken as the feature representation of the image.
3. The feature maps of the negative sample image and the positive sample image are each deeply fused with the feature map of the anchor using a fusion operator τ, giving the fused features of the negative sample image pair and the positive sample image pair, τ(f_θ(x), f_θ(x+)) and τ(f_θ(x), f_θ(x-)) respectively. Many feature fusion methods exist; this embodiment uses feature splicing (concatenation): for 14 × 14 × 512 feature maps, the features of the negative sample image pair and the positive sample image pair are depth-fused respectively, and the resulting feature map has dimensions 14 × 14 × 1024.
In this embodiment, the feature embedding is expressed as the feature map extracted from the convolutional layer, which not only fits the input of the relation learning network but also contains richer image information than a fully connected layer. Following the assumptions above, the feature embedding of a sample is denoted f_θ(x), where θ denotes the parameters of the feature embedding network, and the fusion operator τ (feature depth fusion) is used to obtain the fused feature pair:

\big(\tau(f_\theta(x), f_\theta(x^{+})),\ \tau(f_\theta(x), f_\theta(x^{-}))\big)
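The fusion operator can be illustrated with a short PyTorch sketch of the channel-wise splicing described above (an illustrative assumption, not the patent's code):

```python
import torch

def tau(feat_anchor, feat_other):
    """Feature depth fusion by channel-wise splicing (concatenation)."""
    return torch.cat([feat_anchor, feat_other], dim=1)  # two [B,512,14,14] maps -> [B,1024,14,14]

f_a = torch.randn(2, 512, 14, 14)   # anchor feature maps
f_p = torch.randn(2, 512, 14, 14)   # positive-sample feature maps
fused_pos = tau(f_a, f_p)
print(fused_pos.shape)              # torch.Size([2, 1024, 14, 14])
```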
4. The fused features of the positive and negative sample image pairs are respectively input into the relation learning network to obtain the relation scores of the positive and negative sample image pairs:

r^{\pm} = g_\varphi\big(\tau(f_\theta(x), f_\theta(x^{\pm}))\big)

where r is the relation value score and g_φ is the relation learning network, whose parameter values φ are obtained by training; τ(f_θ(x), f_θ(x-)) is the feature depth fusion of the negative sample image pair and τ(f_θ(x), f_θ(x+)) is the feature depth fusion of the positive sample image pair. The emphasis of combining the ternary network with the relation network is that the fused features of the positive and negative sample image pairs enter the relation learning network to obtain nonlinear relation scores.
As shown in fig. 2, the relation learning network proposed by the invention consists of two convolutional layers and two fully connected layers. The convolutional layers perform convolution on the input fused features to learn a fused representation; each uses the ReLU nonlinear activation function and consists of 64 convolution kernels of size 3 × 3. The two fully connected layers perform dimension reduction and relation score learning on the convolution result, use ReLU and Sigmoid nonlinear activation functions respectively, and have output dimensions of 8 and 1 respectively. The fused features of the positive and negative sample image pairs are input into the relation learning network separately, so each image pair obtains a 1-dimensional relation score. It should be noted that although fig. 2 shows the relation learning network, each of its layers is constructed in the conventional manner and is not described in detail here.
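A hedged PyTorch sketch of this relation learning network follows. The two pooling layers and the flattened dimension are assumptions (the patent does not specify how the 14 × 14 spatial map is reduced before the fully connected layers); the layer widths and activations follow the description above:

```python
import torch
import torch.nn as nn

class RelationNet(nn.Module):
    def __init__(self, in_channels=1024):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_channels, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                       # 14x14 -> 7x7 (assumed pooling)
            nn.Conv2d(64, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                       # 7x7 -> 3x3 (assumed pooling)
        )
        self.fc = nn.Sequential(
            nn.Linear(64 * 3 * 3, 8), nn.ReLU(),   # 8-dim fully connected layer, ReLU
            nn.Linear(8, 1), nn.Sigmoid(),         # 1-dim relation score in (0, 1)
        )

    def forward(self, fused):                      # fused: [B, 1024, 14, 14]
        h = self.conv(fused)
        return self.fc(h.flatten(1)).squeeze(1)    # [B] relation scores

scores = RelationNet()(torch.randn(4, 1024, 14, 14))  # four fused pairs -> four scores
```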
The relation network serves as a nonlinear metric function, replacing the linear metric function of the traditional ternary neural network. It can adapt through learning to different network models and different data, whereas a fixed linear distance cannot change; the discriminative power of the nonlinear metric is stronger than that of the linear metric (the experimental results demonstrate its superiority).
In the training phase, the fused feature maps τ(f_θ(x), f_θ(x-)) and τ(f_θ(x), f_θ(x+)) are respectively input into the relation learning network g_φ, and the hinge loss function is defined as:

L = \big[\, m + g_\varphi(\tau(f_\theta(x), f_\theta(x^{-}))) - g_\varphi(\tau(f_\theta(x), f_\theta(x^{+}))) \,\big]_+

where m is the relationship control threshold, g_φ(·) is the relation value score obtained by the relation network, and φ denotes the parameters of the relation network.
The relation learning network is trained with this hinge loss function.
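As an illustrative numeric check (numbers chosen for illustration, not from the patent): with m = 0.3, a positive pair scoring r+ = 0.8 and a negative pair scoring r- = 0.4 give [0.3 + 0.4 - 0.8]_+ = 0, so a well-separated triple incurs no penalty, whereas r+ = 0.5 and r- = 0.45 give [0.3 + 0.45 - 0.5]_+ = 0.25, pushing the network to enlarge the gap between the pair scores until it exceeds the threshold m.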
Since the last fully connected layer of the relation network uses the Sigmoid activation function, the final relation score r lies between 0 and 1, where 1 can be regarded as completely similar and 0 as completely dissimilar. However, with the traditional "batch hard" (batch hardest sample) screening rule, when the relation score is used as the criterion for screening triples, many triples are unsuitable for model training; for example, many positive sample image pairs have relation scores far smaller than those of negative sample image pairs, which produces excessive loss values and is very disadvantageous for model training.
Therefore, unlike the conventional method, the relation scores of the positive and negative samples are screened with the proposed "limited batch hard" screening rule: after all triples are input into the relation learning network, the positive and negative samples are screened according to their relation scores, and only triples whose positive sample pair score is greater than or equal to β and whose negative sample image pair score is less than or equal to α are retained (β and α are hyper-parameters, set before the learning process starts, used respectively to select trainable positive sample pairs and trainable negative sample pairs; experimental verification gives β = 0.6 and α = 0.4). This effectively avoids the triples in the batch data that are extremely difficult to train and reduces the difficulty of network training.
Based on the "limited batch hard" triple selection scheme, the original loss function is adjusted; under the new triple sampling scheme, the final loss function is:

L = \sum_{i=1}^{P} \sum_{a=1}^{K} \left[ m + \max_{j \neq i,\, n} g_\varphi\big(\tau(f_\theta(x_a^i), f_\theta(x_n^j))\big) - \min_{p \neq a} g_\varphi\big(\tau(f_\theta(x_a^i), f_\theta(x_p^i))\big) \right]_+

\text{s.t.} \quad g_\varphi\big(\tau(f_\theta(x_a^i), f_\theta(x_p^i))\big) \ge \beta, \qquad g_\varphi\big(\tau(f_\theta(x_a^i), f_\theta(x_n^j))\big) \le \alpha

where [·]_+ denotes the hinge max(·, 0), the subscripts of the min and max functions indicate the combinations of samples within the same class and across different classes, and "s.t." is the mathematical abbreviation of "subject to", introducing the constraint conditions. For each batch of data, P classes are randomly selected and K samples are randomly selected from each class, so each batch contains P × K images. The hinge loss function forces the relation value score of a positive sample image pair to be greater than that of a negative sample image pair; it not only guides the feature embedding model to generate the image embeddings but also guides the learning of the relation network at the same time.
By adopting this hinge loss function and simultaneously constraining the positive and negative sample relation values, the model can learn more discriminative information. A hedged sketch of this screening-plus-loss computation is given below.
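The following PyTorch sketch implements one reading of the "limited batch hard" rule above; the tensor layout, masking strategy and the default value of m are assumptions for illustration, not the patent's code:

```python
import torch

def limited_batch_hard_loss(pos_scores, neg_scores, m=0.3, beta=0.6, alpha=0.4):
    """Hinge loss over relation scores with 'limited batch hard' screening.

    pos_scores: [B, n_pos] relation scores of each anchor's positive pairs.
    neg_scores: [B, n_neg] relation scores of each anchor's negative pairs.
    """
    # Screening: keep positives scoring >= beta and negatives scoring <= alpha.
    pos = pos_scores.masked_fill(pos_scores < beta, float("inf"))
    neg = neg_scores.masked_fill(neg_scores > alpha, float("-inf"))
    hard_pos = pos.min(dim=1).values   # hardest retained positive per anchor
    hard_neg = neg.max(dim=1).values   # hardest retained negative per anchor
    # Anchors whose pairs were all screened out yield relu(-inf) = 0, i.e. no loss.
    return torch.relu(m + hard_neg - hard_pos).mean()
```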
In the testing stage or the application stage, images are selected in the "C-way K-shot" manner to generate the corresponding support and query sets, and the corresponding image pairs are generated from each support set and query set. For example, for a 5-way 1-shot group, 5 images from 5 categories form the support set; one of the 5 selected categories is then chosen and an additional image from it is taken as the query set (in the application stage, the image to be classified serves as the query set). Finally, the image in the query set is paired with each image in the support set, giving 4 negative sample image pairs and 1 positive sample image pair, which are then input into the trained feature embedding network and relation learning sub-network.
The constructed 5-way 1-shot image pairs are input into the feature embedding sub-network (a deep neural network) respectively, and the feature maps of the convolutional layer are extracted. The fusion operator τ is used to deeply fuse the feature maps of the negative sample image pairs and the positive sample image pair, yielding the fused features of the 4 negative sample image pairs and the 1 positive sample image pair.
The fused features of the positive and negative sample image pairs are respectively input into the relation learning network to obtain the relation scores of the image pairs:

r_s = g_\varphi\big(\tau(f_\theta(x_q), f_\theta(x_s))\big), \quad s = 1, \dots, 5

where x_q is the query image and x_s is the s-th support image.
Five relation scores are finally obtained. If the relation score of the positive sample image pair is higher than the scores of the other four negative sample pairs, the classification is correct; if the positive sample image pair's relation score is not the highest, a classification error is indicated. The category of the query set is the category of the positive sample in the support set, thereby realizing food image classification.
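The decision rule can be sketched as follows (illustrative only; embed_net and relation_net follow the earlier sketches, and the function name and argument layout are assumptions):

```python
import torch

def classify_query(query_img, support_imgs, support_labels, embed_net, relation_net):
    """Assign the query image the label of the highest-scoring support image."""
    with torch.no_grad():
        q = embed_net(query_img.unsqueeze(0))                    # [1, 512, 14, 14]
        scores = []
        for s_img in support_imgs:                               # 5 support images for 5-way 1-shot
            s = embed_net(s_img.unsqueeze(0))
            fused = torch.cat([q, s], dim=1)                     # the fusion operator tau
            scores.append(relation_net(fused))
        scores = torch.stack(scores).flatten()                   # 5 relation scores
    return support_labels[int(scores.argmax())]
```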
To verify the performance advantages of the food image recognition method of the invention over existing food image recognition methods, the inventors conducted experiments on several common data sets. The recognition method of the invention and the models and corresponding methods of the Siamese Network, Matching Network, Relation Network, Triplet Network, etc. were run on the common data sets ETH Food-101, VireoFood-172, ChineseFoodNet, etc., with 1000 groups of "5-way 1-shot" image pairs randomly generated from the test data of each data set.
Table 1 shows the results of comparing the performance of the method of the invention with the other reference methods.
As can be seen from Table 1, the ternary neural network based on a nonlinear distance (the relation learning network) proposed by the invention achieves the best classification performance on the three popular common data sets. Compared with the linear-distance ternary neural network, the accuracy is improved by 0.8%, 1.7% and 3.7% respectively, which proves the superiority of the relation learning network; the invention thus provides a brand-new technical concept for small sample food image recognition.
[Table 1, rendered as an image in the original publication: comparison of the classification accuracy of the method of the invention with the reference methods on the three data sets.]
Table 2 further illustrates the method of the invention: the parameter settings were explored to obtain the group with the best test performance, where the two values in parentheses in the table are α and β respectively.
[Table 2, rendered as an image in the original publication: classification accuracy under different (α, β) parameter settings.]
Having described embodiments of the present invention, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein is chosen in order to best explain the principles of the embodiments, the practical application, or improvements made to the technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (10)

1. A small sample food image recognition model training method is characterized by comprising the following steps:
constructing triples each including a positive sample image, a negative sample image and an anchor image from a training data set, inputting each triple into a ternary convolutional neural network to extract its feature representation, and acquiring the feature maps of a convolutional layer;
fusing the feature maps of the positive and negative sample images with the feature map of the anchor image, respectively, to obtain positive and negative sample image pair feature maps;
inputting the positive and negative sample image pair feature maps into a relation learning network to obtain the corresponding relation value scores, screening the positive and negative sample pairs based on the relation value scores, and training the ternary convolutional neural network and the relation learning network with the screened positive and negative sample pairs.
2. The small sample food image recognition model training method according to claim 1, wherein, when the relation learning network is trained, the hinge loss function adopted is:

L = \sum_{i=1}^{P} \sum_{a=1}^{K} \left[ m + \max_{j \neq i,\, n} g_\varphi\big(\tau(f_\theta(x_a^i), f_\theta(x_n^j))\big) - \min_{p \neq a} g_\varphi\big(\tau(f_\theta(x_a^i), f_\theta(x_p^i))\big) \right]_+

where P is the number of randomly selected sample classes, K is the number of samples randomly selected from each class, m is the relationship control threshold, g_φ(·) is the relation value score obtained by the relation learning network, φ denotes the parameters of the relation learning network, and τ(·,·) is the fused feature map.
3. The small sample food image recognition model training method according to claim 1, wherein the screening is performed by selecting triples whose positive sample pair relation score is greater than β and whose negative sample pair relation score is less than or equal to α, where β and α are preset parameters.
4. The small sample food image recognition model training method of claim 3, wherein the values of β and α are 0.6 and 0.4, respectively.
5. The small sample food image recognition model training method according to claim 3, wherein the relation scores of the positive and negative sample image pairs are calculated as:

r^{\pm} = g_\varphi\big(\tau(f_\theta(x), f_\theta(x^{\pm}))\big)

where r is the relation value score, g_φ is the relation learning network with parameters φ, τ(f_θ(x), f_θ(x-)) is the fused feature of the negative sample image pair, and τ(f_θ(x), f_θ(x+)) is the fused feature of the positive sample image pair.
6. The small sample food image recognition model training method according to claim 1, wherein, in the feature extraction step, the features of the last convolutional layer of the ternary convolutional neural network are extracted as the feature representation of the image.
7. The small sample food image recognition model training method according to claim 1, wherein the relation learning network comprises two convolutional layers and two fully connected layers, the convolutional layers being used to perform convolution on the input fused features and the fully connected layers being used to perform learning and dimension reduction on the convolution result.
8. A method for small sample food product image recognition using a model trained by the method of any one of claims 1-7, comprising:
selecting one image from each of the C categories of images of the training set as a support set, taking a target image as a query set, pairing the image in the query set with each image in the support set to form image pairs, and inputting the image pairs into the trained ternary convolutional neural network to extract feature maps;
and fusing the feature maps of each image pair, inputting the fused features into the trained relation learning network to obtain the corresponding relation scores, and determining the category of the target image based on the image pair with the maximum relation score.
9. A computer-readable storage medium, on which a computer program is stored, which program, when being executed by a processor, carries out the method according to any one of claims 1 to 7.
10. A computer device comprising a memory and a processor, on which memory a computer program is stored which is executable on the processor, characterized in that the processor implements the method of any one of claims 1 to 7 when executing the program.
CN201911232161.2A 2019-12-05 2019-12-05 Small sample food image recognition model training method and food image recognition method Pending CN111062424A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911232161.2A CN111062424A (en) 2019-12-05 2019-12-05 Small sample food image recognition model training method and food image recognition method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911232161.2A CN111062424A (en) 2019-12-05 2019-12-05 Small sample food image recognition model training method and food image recognition method

Publications (1)

Publication Number Publication Date
CN111062424A true CN111062424A (en) 2020-04-24

Family

ID=70299777

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911232161.2A Pending CN111062424A (en) 2019-12-05 2019-12-05 Small sample food image recognition model training method and food image recognition method

Country Status (1)

Country Link
CN (1) CN111062424A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111626212A (en) * 2020-05-27 2020-09-04 腾讯科技(深圳)有限公司 Method and device for identifying object in picture, storage medium and electronic device
CN111797237A (en) * 2020-07-10 2020-10-20 第四范式(北京)技术有限公司 Text entity relation identification method, system and medium
CN111882000A (en) * 2020-08-04 2020-11-03 天津大学 Network structure and method applied to small sample fine-grained learning
CN112052762A (en) * 2020-08-27 2020-12-08 西安电子科技大学 Small sample ISAR image target identification method based on Gaussian prototype
CN113486202A (en) * 2021-07-01 2021-10-08 南京大学 Method for classifying small sample images
CN113716146A (en) * 2021-07-23 2021-11-30 武汉纺织大学 Paper towel product packaging detection method based on deep learning
CN115965817A (en) * 2023-01-05 2023-04-14 北京百度网讯科技有限公司 Training method and device of image classification model and electronic equipment

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109308497A (en) * 2018-10-27 2019-02-05 北京航空航天大学 A kind of multidirectional scale dendrography learning method based on multi-tag network

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109308497A (en) * 2018-10-27 2019-02-05 北京航空航天大学 A kind of multidirectional scale dendrography learning method based on multi-tag network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LYU YONGQIANG ET AL.: "Small Sample Food Image Recognition Fusing a Ternary Convolutional Neural Network and a Relation Network", COMPUTER SCIENCE *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111626212A (en) * 2020-05-27 2020-09-04 腾讯科技(深圳)有限公司 Method and device for identifying object in picture, storage medium and electronic device
CN111626212B (en) * 2020-05-27 2023-09-26 腾讯科技(深圳)有限公司 Method and device for identifying object in picture, storage medium and electronic device
CN111797237A (en) * 2020-07-10 2020-10-20 第四范式(北京)技术有限公司 Text entity relation identification method, system and medium
CN111797237B (en) * 2020-07-10 2024-05-07 第四范式(北京)技术有限公司 Text entity relationship recognition method, system and medium
CN111882000A (en) * 2020-08-04 2020-11-03 天津大学 Network structure and method applied to small sample fine-grained learning
CN112052762A (en) * 2020-08-27 2020-12-08 西安电子科技大学 Small sample ISAR image target identification method based on Gaussian prototype
CN113486202A (en) * 2021-07-01 2021-10-08 南京大学 Method for classifying small sample images
CN113486202B (en) * 2021-07-01 2023-08-04 南京大学 Method for classifying small sample images
CN113716146A (en) * 2021-07-23 2021-11-30 武汉纺织大学 Paper towel product packaging detection method based on deep learning
CN115965817A (en) * 2023-01-05 2023-04-14 北京百度网讯科技有限公司 Training method and device of image classification model and electronic equipment

Similar Documents

Publication Publication Date Title
CN111062424A (en) Small sample food image recognition model training method and food image recognition method
TWI677852B (en) A method and apparatus, electronic equipment, computer readable storage medium for extracting image feature
CN113378632B (en) Pseudo-label optimization-based unsupervised domain adaptive pedestrian re-identification method
CN109685135B (en) Few-sample image classification method based on improved metric learning
CN108171209B (en) Face age estimation method for metric learning based on convolutional neural network
CN111126386B (en) Sequence domain adaptation method based on countermeasure learning in scene text recognition
Kuncheva et al. PCA feature extraction for change detection in multidimensional unlabeled data
CN111126482B (en) Remote sensing image automatic classification method based on multi-classifier cascade model
Saito et al. Robust active learning for the diagnosis of parasites
CN105303179A (en) Fingerprint identification method and fingerprint identification device
EP3311311A1 (en) Automatic entity resolution with rules detection and generation system
CN105389326B (en) Image labeling method based on weak matching probability typical relevancy models
CN109783666A (en) A kind of image scene map generation method based on iteration fining
CN107798351B (en) Deep learning neural network-based identity recognition method and system
CN111931505A (en) Cross-language entity alignment method based on subgraph embedding
CN112560710B (en) Method for constructing finger vein recognition system and finger vein recognition system
CN115712740B (en) Method and system for multi-modal implication enhanced image text retrieval
CN111564179A (en) Species biology classification method and system based on triple neural network
CN109145704A (en) A kind of human face portrait recognition methods based on face character
CN115309860A (en) False news detection method based on pseudo twin network
CN116385832A (en) Bimodal biological feature recognition network model training method
CN108229692B (en) Machine learning identification method based on dual contrast learning
CN111786999B (en) Intrusion behavior detection method, device, equipment and storage medium
Lim et al. A scene image is nonmutually exclusive—a fuzzy qualitative scene understanding
CN113516156A (en) Fine-grained image classification method based on multi-source information fusion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20200424)