CN117975445A - Food identification method, system, equipment and medium - Google Patents

Food identification method, system, equipment and medium Download PDF

Info

Publication number
CN117975445A
Authority
CN
China
Prior art keywords
loss
class
distance
densenet
center
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202410377890.1A
Other languages
Chinese (zh)
Other versions
CN117975445B (en)
Inventor
王晨旭
宋婷婷
吴秦
周浩杰
王宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangnan University
Original Assignee
Jiangnan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangnan University filed Critical Jiangnan University
Priority to CN202410377890.1A priority Critical patent/CN117975445B/en
Publication of CN117975445A publication Critical patent/CN117975445A/en
Application granted granted Critical
Publication of CN117975445B publication Critical patent/CN117975445B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Landscapes

  • Image Analysis (AREA)

Abstract

The invention relates to a food identification method, system, equipment, and medium. The method comprises the following steps: step S1: acquiring a plurality of food images and training a DenseNet network on the food images; step S2: during training, calculating a classification loss, a class center distance loss, and an inter-class distance loss from the output features of the DenseNet network, and constructing a total loss based on the classification loss, the class center distance loss, and the inter-class distance loss; step S3: if the total loss is smaller than a preset loss, training of the DenseNet network is completed; if the total loss is not smaller than the preset loss, continuing to train the DenseNet network until the total loss is smaller than the preset loss; step S4: identifying a food image to be identified through the trained DenseNet network. The invention can effectively identify food images and reduce false detection.

Description

Food identification method, system, equipment and medium
Technical Field
The invention relates to the technical field of food identification, and in particular to a food identification method, system, equipment, and medium.
Background
Food recognition is widely applied in many scenarios such as restaurant service, health management, e-commerce platforms, and intelligent agriculture. By recognizing a customer's dining information, automatic settlement of commodities can be realized; by recognizing the dishes, ingredients, or other nutrition information in an image, the user's nutrition intake can be analyzed so as to supervise the user's dietary health; on an e-commerce platform, detailed information about food can be displayed through food recognition, improving the user's shopping experience. In addition, food recognition can be applied to social media, where users share and recommend dishes and restaurants, increasing social interaction. Therefore, food recognition is becoming a research hotspot in many fields such as computer vision, medical and health informatics, and agriculture.
The recognition and classification of food images is an important branch of fine-grained image recognition. The main task is to identify the food category in an image by using technologies such as computer vision, deep learning, and image processing. Food image recognition methods can be mainly divided into two types: traditional image-based methods and deep learning-based methods. Traditional image-based methods distinguish food categories through manually designed features such as color histograms, texture, and shape; however, these methods are extremely sensitive to data quality, and their adaptability is not ideal when facing the uneven image quality and large image variation of the food classification task. In contrast, deep learning-based methods can better accommodate the food recognition task, but their accuracy still needs to be improved.
At present, the development of food recognition still faces the following difficulties. First, intra-class variation is large and inter-class variation is small. The same food may be prepared with different ingredients and cooking methods in different regions, so images of the same food can show obvious visual differences; meanwhile, since different dishes may use similar raw materials or cooking techniques, they can also look very similar, which increases the difficulty of food classification. Second, food image backgrounds are complex and noisy. Food images are mostly taken in varied environments with complex backgrounds, such as dim lighting, steam, strong reflections, and tableware of various shapes and colors, all of which make food images cluttered and noisy. Third, food data sets are unevenly distributed. The number of images per category differs greatly: common home-style dishes such as Mapo tofu have thousands of sample images, while rare dishes may have very few, sometimes only a few dozen.
In summary, the above problems all lead to low dish recognition accuracy and false detections.
Disclosure of Invention
Therefore, the invention aims to solve the technical problems of low food image detection accuracy and false detection in the prior art.
In order to solve the technical problems, the invention provides a food identification method, which comprises the following steps:
Step S1: acquiring a plurality of food images, and training a DenseNet network on the food images;
Step S2: during training of the DenseNet network, calculating a classification loss, a class center distance loss, and an inter-class distance loss according to the output features of the DenseNet network, and constructing a total loss based on the classification loss, the class center distance loss, and the inter-class distance loss;
wherein each food image has a category based on its dish; the classification loss is the loss of the DenseNet network's classification prediction for the food images; the class center distance loss is the loss of the distance relation between each sample in the feature vectors output by the DenseNet network and the class center of its own class; the inter-class distance loss is the loss of the difference between the distance relations between each sample and the centers of the other classes and the distance relations among the class centers themselves, in the feature vectors output by the DenseNet network; a class center is the common feature representation shared by the samples of one class in the feature vectors output by the DenseNet network;
Step S3: if the total loss is smaller than a preset loss, training of the DenseNet network is completed; if the total loss is not smaller than the preset loss, continuing to train the DenseNet network until the total loss is smaller than the preset loss;
Step S4: identifying a food image to be identified through the trained DenseNet network.
In one embodiment of the present invention, step S2 further includes setting a plurality of training rounds for the DenseNet network training; if the current training round is smaller than a training round threshold $T$, the classification loss is directly used as the total loss; otherwise, the classification loss, the class center distance loss, and the inter-class distance loss are summed to obtain the total loss.
In one embodiment of the present invention, calculating the classification loss, the class center distance loss, and the inter-class distance loss according to the output features of the DenseNet network and constructing the total loss based on them in step S2 includes:
initializing a feature sum matrix $F \in \mathbb{R}^{n \times d}$ and an image quantity matrix $C \in \mathbb{R}^{n}$, wherein $F$ is used to accumulate the sum of all feature vectors belonging to the same category, $C$ is used to count the number of samples of each category contained in the training data, $n$ is the number of categories, $d$ is the feature dimension, and $\mathbb{R}^{n \times d}$ is the feature space;
inputting the plurality of food images into the DenseNet network, and obtaining the feature vector output by the last layer of the DenseNet network and the feature vector output by the penultimate layer: $(u_i, v_i) = f(x_i)$, wherein $x_i$ represents the $i$-th image, $f$ is the feature extractor of the DenseNet network, and $u_i$ and $v_i$ are the penultimate-layer and last-layer feature vectors, respectively;
initializing a class center matrix $M \in \mathbb{R}^{n \times d}$ holding the class centers of the different classes in the feature space;
using $v_i$ and $y_i$ to update $F$ and $C$, wherein $y_i$ is the label of the image $x_i$;
predicting with the feature vector output by the last layer of the DenseNet network and the feature vector output by the penultimate layer through two different fully connected layers, and adding the prediction results of the two fully connected layers to obtain the predicted classification result $\hat{y}_i = h(u_i, v_i)$, wherein $h$ is the classifier formed by the two different fully connected layers connected to the DenseNet network;
if the current training round $e$ satisfies $e < T$, wherein $T$ is the training round threshold, calculating the classification loss $L_{cls}$ based on the predicted classification result and taking $L_{cls}$ as the total loss; otherwise, summing the classification loss, the class center distance loss, and the inter-class distance loss to obtain the total loss, wherein the class center distance loss and the inter-class distance loss are calculated as follows: according to the updated $F$ and $C$, updating the class center matrix as $c_j^{e} = F_j / C_j$ for every class $j$ with $C_j > 0$ (classes with $C_j = 0$ keep their previous centers), and calculating the class center distance loss and the inter-class distance loss according to the updated class center matrix; the class center matrix $M^{e-1}$ updated in round $e-1$ is used in the calculation of the class center distance loss $L_{cen}^{e}$ and the inter-class distance loss $L_{int}^{e}$ of round $e$, and the class center matrix $M^{e}$ updated in round $e$ is used in the calculation of the class center distance loss $L_{cen}^{e+1}$ and the inter-class distance loss $L_{int}^{e+1}$ of round $e+1$.
In one embodiment of the present invention, the classification loss is formulated as:
$L_{cls} = -\dfrac{1}{N}\sum_{i=1}^{N}\sum_{j=1}^{n} y_{ij}\log p_{ij}$
wherein $N$ represents the number of samples, $n$ represents the number of categories, $y_{ij}$ represents the true label of the $i$-th sample: $y_{ij} = 1$ when the $i$-th sample belongs to the $j$-th category and $y_{ij} = 0$ in the remaining cases; $p_{ij}$ represents the probability that the $i$-th sample is predicted as the $j$-th category.
In one embodiment of the present invention, the class center distance loss is calculated as:
$L_{cen} = \dfrac{1}{N}\sum_{i=1}^{N} \left\| v_i - c_{y_i}^{e} \right\|_2$
wherein $N$ represents the number of samples, $v_i$ is the feature vector output by the DenseNet network for the $i$-th sample, $y_i$ is the category of that sample, $c_{y_i}^{e}$ is the class center of the $y_i$-th class in round $e$, and $\|\cdot\|_2$ is the Euclidean distance.
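For illustration only, the class center distance loss above can be sketched in PyTorch as follows; the tensor names `features`, `labels`, and `centers` are assumed placeholders and not part of the claimed invention:

```python
import torch

def class_center_distance_loss(features: torch.Tensor,
                               labels: torch.Tensor,
                               centers: torch.Tensor) -> torch.Tensor:
    """Mean Euclidean distance between each sample's feature vector and the
    class center of its own class (a sketch of the L_cen term above)."""
    # features: (N, d) feature vectors output by the network
    # labels:   (N,)   integer class labels y_i
    # centers:  (n, d) class center matrix M^e of the current round
    own_centers = centers[labels]                        # c_{y_i} for each sample
    return (features - own_centers).norm(p=2, dim=1).mean()
```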
In one embodiment of the present invention, the inter-class distance loss is calculated as follows:
constructing the distance relation $r_{jk}^{e}$ between each pair of class centers from the cosine similarity $\cos(c_j^{e}, c_k^{e})$, wherein $\cos(\cdot,\cdot)$ is the cosine similarity, $c_j^{e}$ is the class center of the $j$-th class in round $e$, $c_k^{e}$ is the class center of the $k$-th class in round $e$, and $\alpha$ is a first hyperparameter;
constructing the distance relation $s_{ik}^{e}$ between each sample and the centers of the other classes from the similarity between $v_i^{e}$ and $c_k^{e}$, wherein $v_i^{e}$ is the feature vector of the $i$-th sample in round $e$, $c_k^{e}$ is the class center of the $k$-th class in round $e$ with the $i$-th sample not belonging to the $k$-th category, and $\alpha$ is the first hyperparameter;
calculating the inter-class distance loss:
$L_{int} = \dfrac{1}{N}\sum_{i=1}^{N} \left\| S_i^{e} - R_{y_i}^{e} \right\|_2$
wherein $N$ represents the number of samples; $S_i^{e}$ is the set formed by the relations $s_{ik}^{e}$ ($k \ne y_i$), i.e. the set of distance relations between the $i$-th sample and the class centers of the other classes in round $e$; $R_{y_i}^{e}$ is the set formed by the relations $r_{y_i k}^{e}$ ($k \ne y_i$), i.e. the set of distance relations between the class center of the $y_i$-th class and the other class centers in round $e$; and $\|\cdot\|_2$ is the Euclidean distance.
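The following sketch illustrates one possible reading of the inter-class distance loss. Because the exact closed form of the relations $r_{jk}^{e}$ and $s_{ik}^{e}$ is not given in the text, the sketch assumes they are cosine similarities scaled by the first hyperparameter $\alpha$, with each sample's own class masked out; these choices and all names are assumptions for illustration only:

```python
import torch
import torch.nn.functional as F

def inter_class_distance_loss(features: torch.Tensor,
                              labels: torch.Tensor,
                              centers: torch.Tensor,
                              alpha: float = 0.1) -> torch.Tensor:
    """Sketch of L_int: the Euclidean distance between each sample's relations
    to the other class centers and its own class center's relations to those
    centers, averaged over the samples."""
    n = centers.size(0)
    f = F.normalize(features, dim=1)                 # (N, d) sample features
    c = F.normalize(centers, dim=1)                  # (n, d) class centers
    sample_rel = f @ c.t() / alpha                   # assumed relations s_ik
    center_rel = c @ c.t() / alpha                   # assumed relations r_jk
    mask = F.one_hot(labels, n).bool()               # mask each sample's own class
    s = sample_rel.masked_fill(mask, 0.0)
    r = center_rel[labels].masked_fill(mask, 0.0)    # relations of center c_{y_i}
    return (s - r).norm(p=2, dim=1).mean()
```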
In one embodiment of the present invention, step S1 further includes performing data enhancement processing on the plurality of acquired food images, the enhancement processing including:
randomly selecting images from the plurality of food images, resizing the selected images to 256×256, and then cropping the resized images to 224×224;
or horizontally flipping the food images.
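As a minimal sketch of the two enhancement operations described above (using torchvision; combining the resize-and-crop with a random horizontal flip in one pipeline is an assumption):

```python
from torchvision import transforms

train_transform = transforms.Compose([
    transforms.Resize((256, 256)),      # resize the selected image to 256x256
    transforms.RandomCrop(224),         # then crop the resized image to 224x224
    transforms.RandomHorizontalFlip(),  # or flip the food image horizontally
    transforms.ToTensor(),
])
```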
In order to solve the above technical problems, the present invention provides a food recognition system, including:
Training module: used for acquiring a plurality of food images and training a DenseNet network on the food images;
Total loss construction module: used for, during training of the DenseNet network, calculating a classification loss, a class center distance loss, and an inter-class distance loss according to the output features of the DenseNet network, and constructing a total loss based on the classification loss, the class center distance loss, and the inter-class distance loss;
wherein each food image has a category based on its dish; the classification loss is the loss of the DenseNet network's classification prediction for the food images; the class center distance loss is the loss of the distance relation between each sample in the feature vectors output by the DenseNet network and the class center of its own class; the inter-class distance loss is the loss of the difference between the distance relations between each sample and the centers of the other classes and the distance relations among the class centers themselves, in the feature vectors output by the DenseNet network; a class center is the common feature representation shared by the samples of one class in the feature vectors output by the DenseNet network;
Judging module: used for completing the training of the DenseNet network if the total loss is smaller than a preset loss, and continuing to train the DenseNet network until the total loss is smaller than the preset loss if the total loss is not smaller than the preset loss;
Identification module: used for identifying a food image to be identified through the trained DenseNet network.
In order to solve the technical problem, the invention provides electronic equipment, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor realizes the steps of the food identification method when executing the computer program.
To solve the above technical problem, the present invention provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the food identification method as described above.
Compared with the prior art, the technical scheme of the invention has the following advantages:
According to the method, distinguishing detail features are found by calculating the distances between the class centers and each sample, and the class centers are dynamically updated as the training rounds increase so as to find the common feature representation best suited to the current class, which alleviates, to a certain extent, the sparse and unbalanced distribution of food image data;
in order to minimize the intra-class distance, the class center distance loss is designed according to the distance between each sample and the center of its own class; in order to maximize the inter-class distance, the inter-class distance loss is designed according to the distances between each sample and the centers of the other classes and the distances among the class centers; continuously training the network with these losses effectively alleviates the problems of large intra-class variation and small inter-class variation;
the trained DenseNet network can effectively identify food images with a low false detection rate.
Drawings
In order that the invention may be more readily understood, a more particular description of the invention will be rendered by reference to specific embodiments thereof that are illustrated in the appended drawings.
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a simplified flow chart of DenseNet networks in combination with a dynamic class center module in accordance with an embodiment of the present invention;
FIG. 3 is a detailed flow chart of DenseNet networks in combination with a dynamic class center module in an embodiment of the present invention.
Detailed Description
The present invention will be further described with reference to the accompanying drawings and specific examples, which are not intended to be limiting, so that those skilled in the art will better understand the invention and practice it.
Example 1
Referring to fig. 1, the present invention relates to a food identification method, comprising:
Step S1: acquiring a plurality of food images, and training a DenseNet network on the food images;
Step S2: during training of the DenseNet network, calculating a classification loss, a class center distance loss, and an inter-class distance loss according to the output features of the DenseNet network, and constructing a total loss based on the classification loss, the class center distance loss, and the inter-class distance loss;
wherein each food image has a category based on its dish; the classification loss is the loss of the DenseNet network's classification prediction for the food images; the class center distance loss is the loss of the distance relation between each sample (each sample corresponds to one feature vector) in the feature vectors output by the DenseNet network and the class center of its own class; the inter-class distance loss is the loss of the difference between the distance relations between each sample and the centers of the other classes and the distance relations among the class centers themselves, in the feature vectors output by the DenseNet network; a class center is the common feature representation shared by the samples of one class in the feature vectors output by the DenseNet network (note that in this embodiment the initial class centers are obtained randomly, and during subsequent training a class center is obtained by summing and averaging the feature vectors of all samples of the same class);
Step S3: if the total loss is smaller than a preset loss, training of the DenseNet network is completed; if the total loss is not smaller than the preset loss, continuing to train the DenseNet network until the total loss is smaller than the preset loss;
Step S4: identifying a food image to be identified through the trained DenseNet network.
The present embodiment is described in detail below:
As shown in fig. 1, the model adopted in the food recognition method based on dynamic class centers includes a backbone network (a DenseNet network) combined with a dynamic class center module, as shown in fig. 2; since the DenseNet network itself is prior art, this embodiment does not describe it in detail.
Step 1, model training: the model parameters are adjusted mainly by means of labeled training data, so that the model learns a feature representation suited to the data and acquires the knowledge and rules needed to solve the food recognition problem. For images from different classes input to the model, the main flow of model training is as follows:
1. Basic feature extraction of the image. This part obtains the basic features of each image through the DenseNet network (specifically a DenseNet161 network), as shown in part (a) of fig. 3. The DenseNet network of this embodiment adopts a dense connection mode so that the input of each layer is connected to the inputs of the preceding layers, which effectively promotes information flow, improves the utilization efficiency of parameters, and improves recognition accuracy. It should be noted that in this embodiment the last-layer output feature and the penultimate-layer output feature of the DenseNet network are used as the input of the dynamic class center module to dynamically update the class center of each class and enhance the extraction of distinguishing features; meanwhile, they are also used as the inputs of two different fully connected layers (FC1 and FC2), respectively, to realize the classification prediction of the input image, and the classification loss is calculated from the prediction result as:
$L_{cls} = -\dfrac{1}{N}\sum_{i=1}^{N}\sum_{j=1}^{n} y_{ij}\log p_{ij}$
wherein $N$ represents the number of samples (a sample is an image), $n$ represents the number of categories, $y_{ij}$ represents the true label of the $i$-th sample: $y_{ij} = 1$ if and only if the $i$-th sample belongs to the $j$-th category, and $y_{ij} = 0$ in the remaining cases; $p_{ij}$ represents the probability that the $i$-th sample is predicted as the $j$-th category.
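The two-branch classification head described above (FC1 on the last-layer feature, FC2 on the penultimate-layer feature, with the two prediction results added before the cross-entropy loss) might be sketched as follows; the class name, the pooled feature dimensions, and the way the features are obtained are assumptions, not the reference implementation:

```python
import torch
import torch.nn as nn

class DualHeadClassifier(nn.Module):
    """Sketch of the classifier h: two different fully connected layers whose
    prediction results are summed."""
    def __init__(self, last_dim: int, penult_dim: int, num_classes: int):
        super().__init__()
        self.fc1 = nn.Linear(last_dim, num_classes)     # FC1 on the last-layer feature
        self.fc2 = nn.Linear(penult_dim, num_classes)   # FC2 on the penultimate-layer feature

    def forward(self, v_last: torch.Tensor, v_penult: torch.Tensor) -> torch.Tensor:
        return self.fc1(v_last) + self.fc2(v_penult)    # summed logits \hat{y}_i

# classification loss L_cls as cross-entropy on the summed logits, e.g.
# (feature dimensions are illustrative, assuming pooled DenseNet-161 features):
# classifier = DualHeadClassifier(last_dim=2208, penult_dim=2112, num_classes=208)
# loss_cls = nn.CrossEntropyLoss()(classifier(v_last, v_penult), labels)
```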
2. Dynamically updating the class centers according to the extracted features. The dynamic class center module takes as input the $d$-dimensional feature vectors compressed from the feature maps output by the penultimate layer and the last layer of the DenseNet network (corresponding to the $n$ categories), and dynamically updates the class center of each category. In this way the distances between categories and samples are described non-parametrically in the manner of a dynamic graph, so that inter-class variation is increased, intra-class variation is reduced, and the sparsity and imbalance of the data in representation learning are alleviated. In this embodiment, an initialized class center matrix $M$ is first obtained from a Gaussian distribution; then, for each batch $B$, the image quantity matrix $C$ of each category contained in the batch and the feature sum matrix $F$ of each category are counted and used to update the class centers. For example, if in batch $B$ of round $e$ the $j$-th category contains $C_j$ images (when $C_j = 0$, the class center of the $j$-th class is not updated) whose feature sum is $F_j$, its class center is updated as $c_j^{e} = F_j / C_j$. The class center distance loss and the inter-class distance loss are then calculated respectively, and the losses are finally back-propagated to the feature extractor $f$ and the classifier $h$.
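A sketch of how the feature sum matrix $F$, the image quantity matrix $C$, and the class centers could be maintained for a batch, following the description above; the function and variable names are illustrative assumptions:

```python
import torch

def update_class_centers(features: torch.Tensor,
                         labels: torch.Tensor,
                         centers: torch.Tensor) -> torch.Tensor:
    """Accumulate F and C over the batch and refresh each class center as
    F_j / C_j; classes absent from the batch keep their previous center."""
    n, d = centers.shape
    feat_sum = torch.zeros(n, d, device=features.device)   # feature sum matrix F
    count = torch.zeros(n, device=features.device)          # image quantity matrix C
    feat_sum.index_add_(0, labels, features)                 # F_j += v_i where y_i = j
    count.index_add_(0, labels, torch.ones(labels.size(0), device=features.device))
    new_centers = centers.clone()
    present = count > 0
    new_centers[present] = feat_sum[present] / count[present].unsqueeze(1)
    return new_centers

# initial class centers drawn from a Gaussian distribution, as described:
# centers = torch.randn(num_classes, feature_dim)
```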
The dynamic class center module of this embodiment is shown in fig. 3, where $v_i$ denotes the different samples in the feature vectors and $c_j$ denotes the class centers of the classes to which the different samples correspond. Part (b) of the dynamic class center module in fig. 3 is used to generate the class centers, which are then updated through parts (c) and (d) of fig. 3; part (c) of the dynamic class center module in fig. 3 is used to calculate the inter-class distance; part (d) of the dynamic class center module in fig. 3 is used to calculate the intra-class distance. The principle of the dynamic class center module of this embodiment is specifically as follows:
Input: data set ,/>Respectively express/>Middle/>Personal image and its tag, feature extractor of DenseNet network/>Classifier/>, formed by two different fully connected layers (FC 1 and FC 2) of DenseNet network connectionsCurrent training turns/>Category number/>Feature dimension/>
Initializing: initializing class center matrices of different classes using gaussian distribution,/>Is a feature space;
s1: setting multiple groups of training rounds, and for the current training round S2, executing;
S2: initializing feature sum matrices And image quantity matrix/>All values are 0, where/(v)For accumulating the sum of all eigenvectors belonging to the same category,/>For calculating the number of samples of each category contained in each batch;
S3: setting the iteration number, which refers to the data set Divided by batch/>And rounding if the result is not an integer. It should be noted that the same dataset/>, is used for each roundTraining, only per batch/>The specific images selected may vary;
s4: randomly sampling a batch of training data
S5: for training dataS6 is performed for each image in (a);
S6: inputting an image into a DenseNet network, and obtaining a feature vector output by the last layer of the DenseNet network and a feature vector output by the last but last layer, wherein the formula is as follows: ,/>
S7: by means of And/>Update/>And/>,/>For/>Is a label of (2);
S8: predicting the feature vector output by the last layer and the feature vector output by the last layer of DenseNet networks through two different fully connected layers (FC 1 and FC 2), and adding the prediction results of the two different fully connected layers to obtain a prediction classification result
S9: judging training dataIf every data in the (2) is executed, S5-S8 is executed, if yes, S10 is executed, and if not, S5-S8 is executed continuously;
s10: judging Whether or not to hold,/>If the training round threshold is met, executing S11, and if the training round threshold is not met, executing S12;
s11: calculating the classification loss according to the predicted classification result of DenseNet network Classifying the loss/>As a total loss;
s12: summing the classification loss, the center-to-class distance loss and the inter-class distance loss to obtain a total loss, wherein the calculation method of the center-to-class distance loss and the inter-class distance loss comprises the following steps: according to updated And/>Updating the class center matrix: /(I)According to the updated class center matrix/>Calculating center-to-class distance loss and inter-class distance loss; note that the/>, will beWheel updated class center matrix/>For/>Wheel class center distance loss/>And inter-class distance loss/>In the calculation, the first/>Wheel updated class center matrix/>For/>Wheel class center distance loss/>And inter-class distance loss/>Calculating;
s13: calculate class center distance loss (calculated based on each sample and class center distance, as shown in part (d) of the dynamic class center module in fig. 3):
Wherein, Feature vector and/>, for DenseNet network outputsFor/>One sample of each category,/>For/>In wheel/>Class center of individual class,/>Is Euclidean distance;
According to the first Wheel updated class center matrix/>Constructing a distance relation between each class center (as shown in a part (c) of the dynamic class center module in fig. 3):
Wherein, Is cosine similarity,/>For/>In wheel/>Class center of individual class,/>Is the firstIn wheel/>Class center of individual class,/>Is a first super parameter;
S14: according to the first Wheel updated class center matrix/>Constructing a distance relationship between each sample and the other class center (as shown in part (c) of the dynamic class center module in fig. 3):
Wherein, For/>In wheel/>Feature vector of individual samples,/>For/>In wheel/>Class center of individual class and/>Not belonging to the/>Category,/>Is a first super parameter;
S15: calculating inter-class distance loss (calculated based on distance relationships between each sample and other class centers, distance relationships between respective class centers):
Wherein, Is formed by/>Set of compositions, th/>Individual category satisfaction/>And/>,/>For/>Wheel number/>Set of distance relationships between individual samples and class centers of other classes,/>Is/>Collection of compositions,/>First, theCategory/>And/>,/>For/>Wheel number/>And the distance relation set between the individual class center and each other class center.
It should be noted that although the class center matrix $M^{e-1}$ updated in the previous round (round $e-1$) is used, it ultimately represents the loss of the current round (round $e$); therefore, the formulas all adopt the superscript $e$ rather than $e-1$.
S16: calculate the total loss:
Wherein, Respectively second and third super parameters;
s17: with the increase of training rounds, the total loss is gradually reduced (i.e. the total loss is smaller than the preset loss), the network is trained until the loss tends to be stable, the optimal network parameters are obtained, the trained DenseNet network is further obtained, and after model training is completed, the learnable parameters are returned ,/>
Step 2, deploying the model (i.e. the trained DenseNet network) to the server side and completing inference. When deploying the model, this embodiment uses the Flask framework on the server side to provide services for the application, define the routes, process the logic, and generate responses. The network settings and firewall of the server are configured, and the service program is started and run. The request parameters are obtained with request.args.get, the response data is wrapped with jsonify, and the image is loaded from the picture URL with Image.open. During inference and recognition, the input image is preprocessed in the same way as during training, the pre-trained model is loaded, the image is input into the loaded model to predict the result, and the predicted result is fed back to the user side.
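A minimal Flask sketch of the service described above; the route name, the checkpoint path, the preprocessing, and the way the picture URL is fetched are assumptions for illustration:

```python
import io
import requests
import torch
from flask import Flask, jsonify, request
from PIL import Image
from torchvision import transforms

app = Flask(__name__)
model = torch.load("food_densenet161.pth", map_location="cpu")  # assumed checkpoint path
model.eval()

preprocess = transforms.Compose([           # preprocessing consistent with training
    transforms.Resize((256, 256)),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])

@app.route("/predict")                      # assumed route name
def predict():
    url = request.args.get("image_url")     # picture URL passed as a request parameter
    img = Image.open(io.BytesIO(requests.get(url, timeout=10).content)).convert("RGB")
    with torch.no_grad():
        output = model(preprocess(img).unsqueeze(0))
        logits = output[0] if isinstance(output, tuple) else output
        pred = int(logits.argmax(dim=1))
    return jsonify({"category": pred})      # feedback wrapped with jsonify

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```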
Experimental results and analysis
First, a dataset used in experiments will be described, comprising two broad categories:
(1) Chinese food: the Chinese food data set contains 208 categories, all of which are Chinese dishes; the training set contains 145,065 images, the validation set 20,253 images, and the test set 20,310 images, all of single dishes. In the experiments, the training set of Chinese food is used as the training data to train the model, and the validation set is used to verify the performance of the model; during testing, the test set and the 92 images collected in this embodiment are used as test sets to measure the classification accuracy respectively.
(2) Our data: in this embodiment, 92 images were collected and selected from various scenes such as school canteens and home meals as test samples, covering a total of 27 categories that all exist among the categories of Chinese food.
Experimental settings: in this embodiment, a DenseNet161 network is used as the backbone network and is pre-trained on ImageNet. In the training stage, the stochastic gradient descent method is used with weight decay, and the momentum is 0.9. Initial learning rates are set separately for the backbone network and the dynamic class center module: the initial learning rate of the backbone network is 0.1, and the learning rate of the dynamic class center module is set to 0.5. The model is trained for 100 rounds in total, the dynamic class center module starts training at the 2nd round, and the learning rates are reduced by a factor of 0.1 at the 40th and 70th rounds. The batch size during training is 16. To enhance the training data, this embodiment uses the following data enhancement methods: (1) randomly selecting images, resizing them directly to 256×256, and then cropping them to 224×224; (2) horizontally flipping the images.
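The training configuration above (SGD with momentum 0.9, learning rates 0.1 and 0.5 for the backbone and the dynamic class center module, 100 rounds, a 0.1× learning-rate decay at rounds 40 and 70, batch size 16) might be set up as follows; the parameter groups and the weight-decay value are assumed placeholders, since the decay rate is not legible in the source:

```python
import torch
import torch.nn as nn

# placeholder parameter groups; in practice they come from the DenseNet-161
# backbone and the dynamic class center module respectively
backbone_params = nn.Linear(8, 8).parameters()
center_module_params = nn.Linear(8, 8).parameters()

optimizer = torch.optim.SGD(
    [
        {"params": backbone_params, "lr": 0.1},        # backbone initial learning rate
        {"params": center_module_params, "lr": 0.5},   # dynamic class center module
    ],
    momentum=0.9,
    weight_decay=1e-4,  # assumed value; not recoverable from the text
)
# reduce the learning rates by a factor of 0.1 at rounds 40 and 70 of 100
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[40, 70], gamma=0.1)
batch_size, num_epochs = 16, 100
```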
The method of this embodiment is compared with the current advanced method, and specifically is as follows:
To verify the effectiveness of the dynamic class center method, the accuracy of the method of this embodiment is also compared with that of other state-of-the-art methods on the data sets Chinese food and Our data, as shown in Table 1. The accuracy of this embodiment is the highest, both when the top-5 prediction scores are considered and when only the top-1 prediction score is taken. When the top-5 prediction scores are considered, the accuracy of this embodiment reaches 96.71% and 94.44% on Chinese food and Our data respectively, which fully demonstrates that the DenseNet161 network of this embodiment can extract distinguishing features and generalizes well. Meanwhile, the model size is 31.83 M and the inference time reaches 0.049 seconds per image, which is entirely sufficient for deployment to a mobile terminal and for practical application.
Table 1 Experimental results on Chinese food and Our data
Example two
The present embodiment provides a food recognition system including:
Training module: used for acquiring a plurality of food images and training a DenseNet network on the food images;
Total loss construction module: used for, during training of the DenseNet network, calculating a classification loss, a class center distance loss, and an inter-class distance loss according to the output features of the DenseNet network, and constructing a total loss based on the classification loss, the class center distance loss, and the inter-class distance loss;
wherein each food image has a category based on its dish; the classification loss is the loss of the DenseNet network's classification prediction for the food images; the class center distance loss is the loss of the distance relation between each sample in the feature vectors output by the DenseNet network and the class center of its own class; the inter-class distance loss is the loss of the difference between the distance relations between each sample and the centers of the other classes and the distance relations among the class centers themselves, in the feature vectors output by the DenseNet network; a class center is the common feature representation shared by the samples of one class in the feature vectors output by the DenseNet network;
Judging module: used for completing the training of the DenseNet network if the total loss is smaller than a preset loss, and continuing to train the DenseNet network until the total loss is smaller than the preset loss if the total loss is not smaller than the preset loss;
Identification module: used for identifying a food image to be identified through the trained DenseNet network.
Example III
The present embodiment provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the food identification method of embodiment one when executing the computer program.
Example IV
The present embodiment provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the food identification method of embodiment one.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein. The solutions in the embodiments of the present application can be implemented in various computer languages, for example the object-oriented programming language Java and the interpreted scripting language JavaScript.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the application.
It is apparent that the above examples are given by way of illustration only and are not limiting of the embodiments. Other variations and modifications of the present invention will be apparent to those of ordinary skill in the art in light of the foregoing description. It is not necessary here nor is it exhaustive of all embodiments. And obvious variations or modifications thereof are contemplated as falling within the scope of the present invention.

Claims (8)

1. A method for identifying food, characterized by comprising the following steps:
Step S1: acquiring a plurality of food images, and training a DenseNet network on the food images;
Step S2: during training of the DenseNet network, calculating a classification loss, a class center distance loss, and an inter-class distance loss according to the output features of the DenseNet network, and constructing a total loss based on the classification loss, the class center distance loss, and the inter-class distance loss;
wherein each food image has a category based on its dish; the classification loss is the loss of the DenseNet network's classification prediction for the food images; the class center distance loss is the loss of the distance relation between each sample in the feature vectors output by the DenseNet network and the class center of its own class; the inter-class distance loss is the loss of the difference between the distance relations between each sample and the centers of the other classes and the distance relations among the class centers themselves, in the feature vectors output by the DenseNet network; a class center is the common feature representation shared by the samples of one class in the feature vectors output by the DenseNet network;
step S2 further includes setting a plurality of training rounds for the DenseNet network training; if the current training round is smaller than a training round threshold $T$, the classification loss is directly used as the total loss; otherwise, the classification loss, the class center distance loss, and the inter-class distance loss are summed to obtain the total loss;
in step S2, calculating the classification loss, the class center distance loss, and the inter-class distance loss according to the output features of the DenseNet network and constructing the total loss based on them includes:
initializing a feature sum matrix $F \in \mathbb{R}^{n \times d}$ and an image quantity matrix $C \in \mathbb{R}^{n}$, wherein $F$ is used to accumulate the sum of all feature vectors belonging to the same category, $C$ is used to count the number of samples of each category contained in the training data, $n$ is the number of categories, $d$ is the feature dimension, and $\mathbb{R}^{n \times d}$ is the feature space;
inputting the plurality of food images into the DenseNet network, and obtaining the feature vector output by the last layer of the DenseNet network and the feature vector output by the penultimate layer: $(u_i, v_i) = f(x_i)$, wherein $x_i$ represents the $i$-th image, $f$ is the feature extractor of the DenseNet network, and $u_i$ and $v_i$ are the penultimate-layer and last-layer feature vectors, respectively;
initializing a class center matrix $M \in \mathbb{R}^{n \times d}$ holding the class centers of the different classes in the feature space;
using $v_i$ and $y_i$ to update $F$ and $C$, wherein $y_i$ is the label of the image $x_i$;
predicting with the feature vector output by the last layer of the DenseNet network and the feature vector output by the penultimate layer through two different fully connected layers, and adding the prediction results of the two fully connected layers to obtain the predicted classification result $\hat{y}_i = h(u_i, v_i)$, wherein $h$ is the classifier formed by the two different fully connected layers connected to the DenseNet network;
if the current training round $e$ satisfies $e < T$, wherein $T$ is the training round threshold, calculating the classification loss $L_{cls}$ based on the predicted classification result and taking $L_{cls}$ as the total loss; otherwise, summing the classification loss, the class center distance loss, and the inter-class distance loss to obtain the total loss, wherein the class center distance loss and the inter-class distance loss are calculated as follows: according to the updated $F$ and $C$, updating the class center matrix as $c_j^{e} = F_j / C_j$ for every class $j$ with $C_j > 0$, and calculating the class center distance loss and the inter-class distance loss according to the updated class center matrix; the class center matrix $M^{e-1}$ updated in round $e-1$ is used in the calculation of the class center distance loss $L_{cen}^{e}$ and the inter-class distance loss $L_{int}^{e}$ of round $e$, and the class center matrix $M^{e}$ updated in round $e$ is used in the calculation of the class center distance loss $L_{cen}^{e+1}$ and the inter-class distance loss $L_{int}^{e+1}$ of round $e+1$;
step S3: if the total loss is smaller than a preset loss, training of the DenseNet network is completed; if the total loss is not smaller than the preset loss, continuing to train the DenseNet network until the total loss is smaller than the preset loss;
Step S4: identifying a food image to be identified through the trained DenseNet network.
2. The method of claim 1, wherein the classification loss is formulated as:
$L_{cls} = -\dfrac{1}{N}\sum_{i=1}^{N}\sum_{j=1}^{n} y_{ij}\log p_{ij}$
wherein $N$ represents the number of samples, $n$ represents the number of categories, $y_{ij}$ represents the true label of the $i$-th sample: $y_{ij} = 1$ when the $i$-th sample belongs to the $j$-th category and $y_{ij} = 0$ in the remaining cases; $p_{ij}$ represents the probability that the $i$-th sample is predicted as the $j$-th category.
3. The method of claim 1, wherein the class center distance loss is calculated as:
$L_{cen} = \dfrac{1}{N}\sum_{i=1}^{N} \left\| v_i - c_{y_i}^{e} \right\|_2$
wherein $N$ represents the number of samples, $v_i$ is the feature vector output by the DenseNet network for the $i$-th sample, $y_i$ is the category of that sample, $c_{y_i}^{e}$ is the class center of the $y_i$-th class in round $e$, and $\|\cdot\|_2$ is the Euclidean distance.
4. The method of claim 1, wherein the inter-class distance loss is calculated as follows:
constructing the distance relation $r_{jk}^{e}$ between each pair of class centers from the cosine similarity $\cos(c_j^{e}, c_k^{e})$, wherein $\cos(\cdot,\cdot)$ is the cosine similarity, $c_j^{e}$ is the class center of the $j$-th class in round $e$, $c_k^{e}$ is the class center of the $k$-th class in round $e$, and $\alpha$ is a first hyperparameter;
constructing the distance relation $s_{ik}^{e}$ between each sample and the centers of the other classes from the similarity between $v_i^{e}$ and $c_k^{e}$, wherein $v_i^{e}$ is the feature vector of the $i$-th sample in round $e$, $c_k^{e}$ is the class center of the $k$-th class in round $e$ with the $i$-th sample not belonging to the $k$-th category, and $\alpha$ is the first hyperparameter;
calculating the inter-class distance loss:
$L_{int} = \dfrac{1}{N}\sum_{i=1}^{N} \left\| S_i^{e} - R_{y_i}^{e} \right\|_2$
wherein $N$ represents the number of samples; $S_i^{e}$ is the set formed by the relations $s_{ik}^{e}$ ($k \ne y_i$), i.e. the set of distance relations between the $i$-th sample and the class centers of the other classes in round $e$; $R_{y_i}^{e}$ is the set formed by the relations $r_{y_i k}^{e}$ ($k \ne y_i$), i.e. the set of distance relations between the class center of the $y_i$-th class and the other class centers in round $e$; and $\|\cdot\|_2$ is the Euclidean distance.
5. The method of claim 1, wherein step S1 further includes performing data enhancement processing on the acquired food images, the data enhancement processing including:
randomly selecting images from the plurality of food images, resizing the selected images to 256×256, and then cropping the resized images to 224×224;
or horizontally flipping the food images.
6. A food identification system, characterized by comprising:
a training module, used for acquiring a plurality of food images and training a DenseNet network on the food images;
a total loss construction module, used for, during training of the DenseNet network, calculating a classification loss, a class center distance loss, and an inter-class distance loss according to the output features of the DenseNet network, and constructing a total loss based on the classification loss, the class center distance loss, and the inter-class distance loss;
wherein each food image has a category based on its dish; the classification loss is the loss of the DenseNet network's classification prediction for the food images; the class center distance loss is the loss of the distance relation between each sample in the feature vectors output by the DenseNet network and the class center of its own class; the inter-class distance loss is the loss of the difference between the distance relations between each sample and the centers of the other classes and the distance relations among the class centers themselves, in the feature vectors output by the DenseNet network; a class center is the common feature representation shared by the samples of one class in the feature vectors output by the DenseNet network;
the total loss construction module is further used for setting a plurality of training rounds for the DenseNet network training; if the current training round is smaller than a training round threshold $T$, the classification loss is directly used as the total loss; otherwise, the classification loss, the class center distance loss, and the inter-class distance loss are summed to obtain the total loss;
calculating the classification loss, the class center distance loss, and the inter-class distance loss according to the output features of the DenseNet network and constructing the total loss based on them by the total loss construction module includes:
initializing a feature sum matrix $F \in \mathbb{R}^{n \times d}$ and an image quantity matrix $C \in \mathbb{R}^{n}$, wherein $F$ is used to accumulate the sum of all feature vectors belonging to the same category, $C$ is used to count the number of samples of each category contained in the training data, $n$ is the number of categories, $d$ is the feature dimension, and $\mathbb{R}^{n \times d}$ is the feature space;
inputting the plurality of food images into the DenseNet network, and obtaining the feature vector output by the last layer of the DenseNet network and the feature vector output by the penultimate layer: $(u_i, v_i) = f(x_i)$, wherein $x_i$ represents the $i$-th image, $f$ is the feature extractor of the DenseNet network, and $u_i$ and $v_i$ are the penultimate-layer and last-layer feature vectors, respectively;
initializing a class center matrix $M \in \mathbb{R}^{n \times d}$ holding the class centers of the different classes in the feature space;
using $v_i$ and $y_i$ to update $F$ and $C$, wherein $y_i$ is the label of the image $x_i$;
predicting with the feature vector output by the last layer of the DenseNet network and the feature vector output by the penultimate layer through two different fully connected layers, and adding the prediction results of the two fully connected layers to obtain the predicted classification result $\hat{y}_i = h(u_i, v_i)$, wherein $h$ is the classifier formed by the two different fully connected layers connected to the DenseNet network;
if the current training round $e$ satisfies $e < T$, wherein $T$ is the training round threshold, calculating the classification loss $L_{cls}$ based on the predicted classification result and taking $L_{cls}$ as the total loss; otherwise, summing the classification loss, the class center distance loss, and the inter-class distance loss to obtain the total loss, wherein the class center distance loss and the inter-class distance loss are calculated as follows: according to the updated $F$ and $C$, updating the class center matrix as $c_j^{e} = F_j / C_j$ for every class $j$ with $C_j > 0$, and calculating the class center distance loss and the inter-class distance loss according to the updated class center matrix; the class center matrix $M^{e-1}$ updated in round $e-1$ is used in the calculation of the class center distance loss $L_{cen}^{e}$ and the inter-class distance loss $L_{int}^{e}$ of round $e$, and the class center matrix $M^{e}$ updated in round $e$ is used in the calculation of the class center distance loss $L_{cen}^{e+1}$ and the inter-class distance loss $L_{int}^{e+1}$ of round $e+1$;
a judging module, used for completing the training of the DenseNet network if the total loss is smaller than a preset loss, and continuing to train the DenseNet network until the total loss is smaller than the preset loss if the total loss is not smaller than the preset loss;
an identification module, used for identifying a food image to be identified through the trained DenseNet network.
7. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized by: the processor, when executing the computer program, implements the steps of the method for identifying food according to any one of claims 1 to 5.
8. A computer-readable storage medium having stored thereon a computer program, characterized by: the computer program, when executed by a processor, implements the steps of the method for identifying food according to any one of claims 1 to 5.
CN202410377890.1A 2024-03-29 2024-03-29 Food identification method, system, equipment and medium Active CN117975445B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410377890.1A CN117975445B (en) 2024-03-29 2024-03-29 Food identification method, system, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410377890.1A CN117975445B (en) 2024-03-29 2024-03-29 Food identification method, system, equipment and medium

Publications (2)

Publication Number Publication Date
CN117975445A true CN117975445A (en) 2024-05-03
CN117975445B CN117975445B (en) 2024-05-31

Family

ID=90858282

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410377890.1A Active CN117975445B (en) 2024-03-29 2024-03-29 Food identification method, system, equipment and medium

Country Status (1)

Country Link
CN (1) CN117975445B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111259738A (en) * 2020-01-08 2020-06-09 科大讯飞股份有限公司 Face recognition model construction method, face recognition method and related device
CN111523483A (en) * 2020-04-24 2020-08-11 北京邮电大学 Chinese food dish image identification method and device
US20210390355A1 (en) * 2020-06-13 2021-12-16 Zhejiang University Image classification method based on reliable weighted optimal transport (rwot)
CN116258938A (en) * 2022-12-09 2023-06-13 西北工业大学 Image retrieval and identification method based on autonomous evolution loss

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111259738A (en) * 2020-01-08 2020-06-09 科大讯飞股份有限公司 Face recognition model construction method, face recognition method and related device
CN111523483A (en) * 2020-04-24 2020-08-11 北京邮电大学 Chinese food dish image identification method and device
US20210390355A1 (en) * 2020-06-13 2021-12-16 Zhejiang University Image classification method based on reliable weighted optimal transport (rwot)
CN116258938A (en) * 2022-12-09 2023-06-13 西北工业大学 Image retrieval and identification method based on autonomous evolution loss

Also Published As

Publication number Publication date
CN117975445B (en) 2024-05-31

Similar Documents

Publication Publication Date Title
CN107578060B (en) Method for classifying dish images based on depth neural network capable of distinguishing areas
TWI532013B (en) Image quality analysis method and system
CN110909182B (en) Multimedia resource searching method, device, computer equipment and storage medium
CN109857844B (en) Intent recognition method and device based on ordering dialogue text and electronic equipment
CN111400507B (en) Entity matching method and device
CN112348117B (en) Scene recognition method, device, computer equipment and storage medium
US11809985B2 (en) Algorithmic apparel recommendation
CN110033342A (en) A kind of training method and device, a kind of recommended method and device of recommended models
CN104462301B (en) A kind for the treatment of method and apparatus of network data
CN109300059B (en) Dish recommending method and device
CN109460519B (en) Browsing object recommendation method and device, storage medium and server
CN109359515A (en) A kind of method and device that the attributive character for target object is identified
CN110119479B (en) Restaurant recommendation method, restaurant recommendation device, restaurant recommendation equipment and readable storage medium
CN116821308A (en) Generation method, training method and device of model and storage medium
CN106157156A (en) A kind of cooperation recommending system based on communities of users
Mohanty et al. The food recognition benchmark: Using deep learning to recognize food in images
CN110046959B (en) Method, device, equipment and storage medium for determining business category of merchant
CN107133854A (en) Information recommendation method and device
CN110851571A (en) Data processing method and device, electronic equipment and computer readable storage medium
CN111428007A (en) Cross-platform based synchronous push feedback method
Chun et al. Development of Korean food image classification model using public food image dataset and deep learning methods
CN114637920A (en) Object recommendation method and device
Suddul et al. A comparative study of deep learning methods for food classification with images
CN117523271A (en) Large-scale home textile image retrieval method, device, equipment and medium based on metric learning
CN117975445B (en) Food identification method, system, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant