CN112668529A

CN112668529A - Dish sample image enhancement identification method

Info

Publication number: CN112668529A
Application number: CN202011643563.4A
Authority: CN
Inventors: 瞿晨非; 井焜
Original assignee: Synthesis Electronic Technology Co Ltd
Current assignee: Synthesis Electronic Technology Co Ltd
Priority date: 2020-12-31
Filing date: 2020-12-31
Publication date: 2021-04-16

Abstract

The invention provides a dish sample image enhancement and identification method, comprising: obtaining complete images of dishes, each held in a single piece of tableware, and constructing an original dish training set; constructing a generative adversarial network comprising a generator G and a discriminator D, and generating a standardized dish training set from the original dish training set; constructing a dish identification network and training it on the standardized dish training set to obtain a dish identification model, which ends the training process; collecting images of the dishes to be identified on the current day, constructing a dish identification base library, standardizing the base library images, and passing them through the trained dish identification model to obtain and store dish feature vectors; when a dish is to be identified on that day, obtaining its standardized representation and its feature representation by using steps 4, 5 and 6; and comparing the features of that dish with all the stored features, the dish under test being taken to be the base library dish whose features are closest. The invention improves the recognition capability of the recognition model, reduces the differences among samples of the same type, and can reduce the model training cost.

Description

Dish sample image enhancement identification method

Technical Field

The invention relates to a dish sample image enhancement identification method, in particular to a dish identification method based on image data enhancement.

Background

At present, deep-learning-based dish identification mainly targets images of dishes held in tableware. During identification, the dish region is extracted first, and tableware information is introduced by extracting the outer edge matrix of the tableware. Because the training samples do not cover the sample space sufficiently, the identification accuracy is strongly affected by how much of the tableware the dish occupies. In addition, when a background modeling method is used to remove redundant tableware information, the result still carries considerable redundant information and is not standardized, which hinders the deep learning training and recognition process. In existing deep learning dish identification applications, the dish patterns in the training data are complex and unevenly distributed, so the training sample space is uneven and the accuracy after model training is low. A method that directly enhances the dish image region is therefore designed to improve accuracy during both training and use.

Disclosure of Invention

The invention aims to provide an image recognition method that enhances dish images, improving the recognition capability of the recognition model and reducing the differences among samples of the same type, thereby reducing the model training cost.

To achieve this purpose, the invention is realized by the following technical scheme:

Step 1: obtaining, through an image acquisition unit, a complete image of the dish held in a single piece of tableware, and constructing an original dish training set from such images;

Step 2: constructing a generative adversarial network comprising a generator G and a discriminator D, both formed by convolutional neural networks, and generating a standardized dish training set from the original dish training set;

Step 3: constructing a dish identification network using ResNet-50, training it on the standardized dish training set generated in step 2 to obtain a dish identification model, and ending the training process;

Step 4: in the application process, first collecting images of the dishes to be identified on the current day, each single picture being collected in the same way as in step 1;

Step 5: constructing a dish identification base library, and generating standardized dishes from the dish images obtained in step 4 by using the adversarial network of step 2;

Step 6: generating dish features by passing the standardized base library images generated in step 5 through the dish identification model trained in step 3, and obtaining and storing the dish feature vectors;

Step 7: when a dish is to be identified on that day, obtaining its standardized representation and its feature representation by using steps 4, 5 and 6;

Step 8: comparing the dish features generated in step 7 with all the features stored in step 6 by computing Euclidean distances, and taking the dish under test to be the base library dish whose features are closest.

Preferably, generating the standardized dish training set from the training set in step 2 comprises the following steps:

1) training the generator and the discriminator with the obtained original dish training set as real samples, so that dish samples acceptable to the human eye can finally be generated;

2) during training, using the generator G to generate an image of a dish of a specified category, and using the discriminator D to judge whether that image is an image of a dish of the specified category;

3) carrying out the training alternately between the two networks: the judgment result of the discriminator D is used to optimize the accuracy of the images generated by the generator G, and the generation result of the generator G is used to optimize the ability of the discriminator D to judge whether an image is real;

4) the final optimization function (not reproduced in this text) is defined in terms of the following quantities: x represents an input real picture, z represents noise information, and c represents an artificial control condition; D(x, c) is the judgment of the discriminator D on the original real picture under the given artificial control condition, G(z, c) represents the picture generated by the generator G from the noise z under the artificial control condition c, and D(G(z, c), c) represents the judgment of the discriminator D on the authenticity of G(z, c);

5) the control condition c specifies the area, shape and proportion of the dish in the tableware; the control parameters modify how the generator produces the dish result, so that the dish images generated by the generator G have the same form of expression for non-dish features, the spatial position, proportion and shape of the generated dish in the tableware are nearly identical, and the internal structure of the dish is not changed;

6) after training, the obtained original dish training set is input into the generator G with the artificial control condition c specified, so that each dish image in the set yields a corresponding standardized dish, forming the standardized dish training set.

The advantages of the invention are:

1. A generative adversarial network is used to enhance the original images according to the characteristics of the dishes, enriching the image features and improving the recognition capability of the recognition model.

2. The original sample set is enhanced uniformly by the generative adversarial network with controlled model parameters, so that the dish images share a uniform pattern overall and the differences among samples of the same type are reduced, which can reduce the model training cost.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention.

FIG. 1 is a schematic diagram of the training process of the present invention.

FIG. 2 is a schematic diagram of the process of constructing a dish library according to the present invention.

FIG. 3 is a flow chart illustrating an identification process according to the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The scheme mainly comprises the following steps:

Step 1: A complete image of the dish held in a single piece of tableware is obtained through an image acquisition unit, and an original dish training set is constructed from such images.

Step 2: A generative adversarial network is constructed, and a standardized dish training set is generated from the original dish training set.

Further, the generative adversarial network comprises a generator G and a discriminator D, both formed by convolutional neural networks.

Further, the original dish training set obtained in step 1 is used as the set of real samples, and the generator and the discriminator are trained so that dish samples acceptable to the human eye can finally be generated.

Further, during training, the generator G is used to generate an image of a dish of a specified category, and the discriminator D is used to judge whether that image is an image of a dish of the specified category.

Further, the training is carried out alternately between the two networks: the judgment result of the discriminator D is used to optimize the accuracy of the generated images, and the generation result of the generator G is used to optimize the ability of the discriminator D to judge whether an image is real.

Further, the final optimization function (not reproduced in this text) is defined in terms of the following quantities: x represents an input real picture, z represents noise information, and c represents an artificial control condition; D(x, c) is the judgment of the discriminator D on the original real picture under the given artificial control condition, G(z, c) represents the picture generated by the generator G from the noise z under the artificial control condition c, and D(G(z, c), c) represents the judgment of the discriminator D on the authenticity of G(z, c).
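The formula itself did not survive in this text. Given the symbol definitions above, it is presumably the standard conditional GAN minimax objective; the following is a reconstruction under that assumption and not the verbatim formula of the patent. The distribution names p_data and p_z are assumed notation.

```latex
\min_{G} \max_{D} V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)} \bigl[ \log D(x, c) \bigr]
  + \mathbb{E}_{z \sim p_{z}(z)} \bigl[ \log \bigl( 1 - D(G(z, c), c) \bigr) \bigr]
```

Under this form, the discriminator D is trained to raise V by scoring real pictures x high and generated pictures G(z, c) low, while the generator G is trained to lower V by producing pictures that D scores as real, which matches the alternating training described above.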

Further, the artificial control condition c specifies the area, shape and proportion of the dish in the tableware. The control parameters modify how the generator produces the dish result, so that the dish images generated by the generator G have the same form of expression for non-dish features, the spatial position, proportion and shape of the generated dish in the tableware are nearly identical, and the internal structure of the dish is not changed.

Further, after training, the original dish training set obtained in step 1 is input into the generator G with the artificial control condition c specified, so that each dish image in the set yields a corresponding standardized dish, forming the standardized dish training set.
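The standardization pass described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `ControlCondition` fields and their numeric encoding, the placeholder image type, the `standardize_training_set` helper and the `dummy_generator` stand-in are all hypothetical, and the trained generator G is assumed to be callable on an image together with c.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Placeholder image type (assumption): rows of grayscale pixel values.
Image = List[List[float]]


@dataclass(frozen=True)
class ControlCondition:
    """Hypothetical encoding of the artificial control condition c."""
    area: float        # assumed: target area of the dish region in the tableware
    shape: str         # assumed: target outline of the dish region, e.g. "round"
    proportion: float  # assumed: target size of the dish relative to the tableware


def standardize_training_set(
    samples: List[Tuple[Image, str]],
    generator: Callable[[Image, ControlCondition], Image],
    c: ControlCondition,
) -> List[Tuple[Image, str]]:
    """Run every labeled image through the generator under one shared c.

    Labels are carried over unchanged, so the output can be used directly
    as the standardized dish training set.
    """
    return [(generator(image, c), label) for image, label in samples]


def dummy_generator(image: Image, c: ControlCondition) -> Image:
    """Stand-in for the trained generator G (NOT a real GAN).

    It only scales pixel values by c.proportion so the example is runnable.
    """
    return [[pixel * c.proportion for pixel in row] for row in image]


c = ControlCondition(area=0.6, shape="round", proportion=0.5)
original_set = [
    ([[1.0, 2.0], [3.0, 4.0]], "mapo tofu"),
    ([[2.0, 2.0], [2.0, 2.0]], "fried rice"),
]
standardized_set = standardize_training_set(original_set, dummy_generator, c)
```

Using a single shared c for the whole set is what gives every output the same layout in the tableware; in a real system `dummy_generator` would be replaced by the trained network.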

Step 3: A dish identification network is constructed and trained on the standardized dish training set generated in step 2 to obtain a dish identification model. The training process then ends.

Further, the dish identification network may be trained using ResNet-50.

Step 4: In the application process, the dishes to be identified on the current day are collected first; each single picture is collected in the same way as in step 1.

Step 5: A dish identification base library is constructed, and standardized dishes are generated from the dish images obtained in step 4 by using the adversarial network of step 2.

Step 6: Dish features are generated: the standardized base library images generated in step 5 are passed through the dish identification model trained in step 3, and the resulting dish feature vectors are obtained and stored.
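Steps 4 to 6 amount to a small data pipeline, which can be sketched in Python as follows. The sketch is illustrative only: `build_base_library`, `standardize` and `extract_features` are hypothetical names, and the two stand-in functions merely imitate the trained generator and the trained identification model so that the example runs without any model.

```python
from typing import Callable, Dict, List, Tuple

# Placeholder types (assumptions): an image is rows of pixel values,
# a feature is a flat list of floats.
Image = List[List[float]]
Feature = List[float]


def build_base_library(
    daily_dishes: List[Tuple[str, Image]],
    standardize: Callable[[Image], Image],
    extract_features: Callable[[Image], Feature],
) -> Dict[str, Feature]:
    """Map each dish collected on the current day to its stored feature vector."""
    library: Dict[str, Feature] = {}
    for name, image in daily_dishes:
        standardized = standardize(image)                # step 5: standardized base image
        library[name] = extract_features(standardized)   # step 6: feature vector, stored
    return library


def standardize(image: Image) -> Image:
    """Stand-in for the generator of step 2: rescales pixels by the image maximum."""
    peak = max(max(row) for row in image)
    return [[pixel / peak for pixel in row] for row in image]


def extract_features(image: Image) -> Feature:
    """Stand-in for the model of step 3: uses the mean of each row as a feature."""
    return [sum(row) / len(row) for row in image]


daily_dishes = [
    ("braised pork", [[2.0, 4.0], [4.0, 2.0]]),
    ("stir-fried greens", [[2.0, 2.0], [4.0, 8.0]]),
]
base_library = build_base_library(daily_dishes, standardize, extract_features)
```

Because the library is rebuilt from the dishes collected on the current day, new dishes can be added by computing their feature vectors, without retraining the identification model.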

Step 7: When a dish is to be identified, its standardized representation and its feature representation are obtained by using steps 4, 5 and 6.

Step 8: The dish features generated in step 7 are compared with all the features stored in step 6, and the dish under test is taken to be the base library dish whose features are closest.

Further, the Euclidean distance may be used for the feature comparison.
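The comparison in step 8 can be sketched as a nearest-neighbor search under the Euclidean distance. The following Python example is a minimal illustration: the `match_dish` helper, the dish names and the two-dimensional feature vectors are hypothetical, and real feature vectors from the identification model would be much longer. It relies on `math.dist` from the standard library (Python 3.8 or later).

```python
import math
from typing import Dict, Sequence, Tuple


def match_dish(
    query: Sequence[float],
    base_library: Dict[str, Sequence[float]],
) -> Tuple[str, float]:
    """Return the base library dish closest to the query, with its distance."""
    if not base_library:
        raise ValueError("the base library is empty")
    # Pick the dish whose stored feature vector has the smallest
    # Euclidean distance to the query feature vector.
    best_name = min(
        base_library,
        key=lambda name: math.dist(query, base_library[name]),
    )
    return best_name, math.dist(query, base_library[best_name])


# Hypothetical stored features (step 6) and a query feature (step 7).
base_library = {
    "braised pork": [0.0, 0.0],
    "stir-fried greens": [3.0, 4.0],
}
name, distance = match_dish([3.0, 3.0], base_library)
```

Here the query lies at distance 1.0 from "stir-fried greens" and about 4.24 from "braised pork", so the former is returned. The text does not specify a rejection threshold for dishes that are absent from the base library, so none is assumed here.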

Claims

1. A dish sample image enhancement identification method, characterized by comprising the following steps:

Step 4: collecting the dishes to be identified on the current day, each single picture being collected in the same way as in step 1;

Step 5: establishing a dish identification base library, and generating standardized dishes from the pictures through the adversarial network, thereby generating standardized base library images;

Step 6: generating dish features by passing the standardized base library images through the dish identification model trained in step 3, and obtaining and storing the dish feature vectors;

Step 8: comparing the dish features generated in step 7 with all the features in step 6 by computing Euclidean distances, and taking the dish under test to be the base library dish whose features are closest;

2. The dish sample image enhancement identification method according to claim 1, wherein generating the standardized dish training set from the training set in step 2 comprises the following steps: