CN111310520B - Dish identification method, cashing method, dish ordering method and related devices - Google Patents

Dish identification method, cashing method, dish ordering method and related devices

Info

Publication number
CN111310520B
CN111310520B (application CN201811513717.0A)
Authority
CN
China
Prior art keywords
dish
category
image
model
image data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811513717.0A
Other languages
Chinese (zh)
Other versions
CN111310520A (en)
Inventor
汪海洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201811513717.0A
Publication of CN111310520A
Application granted
Publication of CN111310520B
Legal status: Active (anticipated expiration not listed)


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/10: Terrestrial scenes
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/60: Type of objects
    • G06V20/68: Food, e.g. fruit or vegetables

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a dish identification method, a cashier method, a dish ordering method, related devices, a computing device and a storage medium, wherein the dish identification method comprises the following steps: inputting a dish image to be identified into a first dish category identification model for processing, to obtain the image features output by the bottleneck layer of the first dish category identification model, wherein the bottleneck layer comprises all processing layers before the last processing layer of the first dish category identification model; and inputting the image features into a second dish category identification model for identification, so as to obtain the category of the dish in the dish image.

Description

Dish identification method, cashing method, dish ordering method and related devices
Technical Field
The invention relates to the technical field of image processing, and in particular to a dish identification method, a cashier method, a dish ordering method, related devices, a computing device and a storage medium.
Background
When diners eat in a restaurant, merchants generally want to provide services such as quick ordering, serving and checkout so that time is not wasted unnecessarily. During peak dining periods, however, and especially in canteens, the flow of people is so large that the staff cannot cope with the workload. The service then struggles to meet demand, and problems such as wrong orders, misplaced dishes, long waits for dishes and overly long bill-settlement times easily arise, resulting in a poor dining experience. Among these, long waits for dishes and overly long bill-settlement times are the most common.
To solve these problems, settlement and order-taking efficiency can be improved by a self-service cashier robot, which avoids the slow settlement and calculation errors of manual pricing, as well as orders being forgotten when staff are busy. For a self-service cashier robot to guarantee the speed and accuracy of settlement and order-taking, the embedded dish identification algorithm must identify different dishes with high accuracy and responsiveness, and a deep neural network is generally used to build such a dish identification algorithm.
However, existing dish identification methods implemented with deep neural networks have two drawbacks. On the one hand, a large number of dish samples from actual use must be collected in order to train the deep neural network. On the other hand, because there are so many dish samples, the network must have a complex structure with many layers to reach the required identification accuracy, which leads to long training times. Even if the trained deep neural network achieves a good identification effect, the complex network structure may slow down the identification response when it is applied to dish identification, which again fails to provide a good user experience. A new dish identification method is therefore needed to optimize the processing procedure.
Disclosure of Invention
To this end, the present invention provides a dish identification scheme, as well as a cashier scheme and a dish ordering scheme based on dish identification, in an effort to solve, or at least alleviate, the problems presented above.
According to an aspect of the present invention, there is provided a dish identification method comprising the following steps: first, a dish image to be identified is input into a first dish category identification model for processing, to obtain the image features output by the bottleneck layer of the first dish category identification model, wherein the bottleneck layer comprises all processing layers before the last processing layer of the first dish category identification model; the image features are then input into a second dish category identification model for identification, so as to obtain the category of the dish in the dish image.
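As a concrete illustration of this two-stage pipeline, the following minimal sketch assumes a PyTorch model whose hypothetical bottleneck attribute holds all processing layers before the last processing layer, and a second model with a scikit-learn-style predict interface; the patent does not prescribe any framework, so these names and interfaces are assumptions, not the claimed implementation.

```python
import torch

def identify_dish(dish_image, first_dish_model, second_dish_model):
    """Two-stage identification: extract bottleneck features with the first
    model, then classify them with the second model. dish_image is assumed
    to be a 3-D (C, H, W) tensor."""
    first_dish_model.eval()
    with torch.no_grad():
        # Output of all processing layers before the last processing layer.
        features = first_dish_model.bottleneck(dish_image.unsqueeze(0))
    # The second model (e.g. an SVM) maps the features to a dish category.
    return second_dish_model.predict(features.flatten(start_dim=1).numpy())[0]
```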
Optionally, in the dish identification method according to the present invention, the first dish category identification model is determined by performing fine-tuning on a pre-trained category identification model.
Optionally, in the dish identification method according to the present invention, performing fine-tuning on the pre-trained category identification model comprises: modifying the number of outputs of the classifier in the pre-trained category identification model according to the number of dish categories corresponding to a first dish image data set; loading the parameters of all processing layers before the last processing layer of the pre-trained category identification model into the corresponding layers of the modified category identification model to generate the first dish category identification model; and training the first dish category identification model on the first dish image data set so that the output of the first dish category identification model indicates the category of the dish in an input dish image.
Optionally, in the dish identification method according to the present invention, the category identification model is trained on a pre-acquired image data set so that the output of the category identification model indicates the category of the image content in an input image.
Optionally, in the dish identification method according to the present invention, the image data set comprises a plurality of pieces of image data, each piece of image data comprising a training image and the category of the image content in the training image, and training the model on the pre-acquired image data set comprises: for each piece of image data in the image data set, inputting the training image in the image data into the category identification model to obtain a first category identification result for the training image output by the category identification model; and adjusting the parameters of the category identification model based on the difference between the category of the image content in the training image and the first category identification result.
Optionally, in the dish identification method according to the present invention, the image data set is an ImageNet data set.
Optionally, in the dish identification method according to the present invention, the first dish image data set comprises a plurality of pieces of dish image data, each piece of dish image data comprising a dish training image and the category of the dish in the dish training image, and training the first dish category identification model on the first dish image data set comprises: for each piece of dish image data in the first dish image data set, inputting the dish training image in the dish image data into the first dish category identification model to obtain a first dish category identification result for the dish training image output by the first dish category identification model; and adjusting the parameters of the first dish category identification model based on the difference between the category of the dish in the dish training image and the first dish category identification result.
Optionally, in the dish identification method according to the present invention, adjusting the parameters of the first dish category identification model comprises: adjusting the parameters of the several processing layers of the first dish category identification model that are close to the output end.
Optionally, in the dish identification method according to the present invention, the second dish category identification model is trained on a pre-acquired second dish image data set together with the first dish category identification model, so that the output of the second dish category identification model indicates the category of the dish in an input dish image.
Optionally, in the dish identification method according to the present invention, the second dish image data set comprises a plurality of pieces of specific dish image data, each piece of specific dish image data comprising a specific dish training image and the category of the dish in the specific dish training image, and training the model on the pre-acquired second dish image data set and the first dish category identification model comprises: inputting the specific dish training image in the specific dish image data into the first dish category identification model for processing, to obtain the training image features output by the bottleneck layer of the first dish category identification model; inputting the training image features into the second dish category identification model to obtain a second category identification result, output by the second dish category identification model, corresponding to the dish in the specific dish training image; and adjusting the parameters of the second dish category identification model based on the category of the dish in the specific dish training image and the second category identification result.
Optionally, in the dish identification method according to the present invention, the first dish category identification model includes a deep neural network including a plurality of processing layers.
Optionally, in the dish identification method according to the present invention, the deep neural network is a convolutional neural network, and each processing layer is any one of a convolutional layer, a pooling layer and a fully connected layer.
Optionally, in the dish identification method according to the present invention, the second dish category identification model includes a support vector machine model.
According to still another aspect of the present invention, there is provided a cashier method comprising the following steps: first, one or more dish images corresponding to a current order are acquired, wherein each dish image contains the corresponding dish; each dish image is input into a first dish category identification model for processing, to obtain the image features output by the bottleneck layer of the first dish category identification model, wherein the bottleneck layer comprises all processing layers before the last processing layer of the first dish category identification model; the image features are input into a second dish category identification model for identification, so as to obtain the category of the dish in each dish image; the price of each dish is acquired according to the category of the dish in the dish image; and the bill amount corresponding to the current order is calculated based on the price of the dish in each dish image and the number of dish images.
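Under the same assumptions as the earlier sketch, the bill calculation of the cashier method can be illustrated as follows; price_table and identify_dish are illustrative names introduced here, not part of the claimed method.

```python
def settle_order(dish_images, first_dish_model, second_dish_model, price_table):
    """Identify each dish image and sum the prices to obtain the bill amount
    of the current order. price_table is an assumed mapping from dish
    category to unit price; identify_dish is the two-stage sketch above."""
    bill_amount = 0.0
    for dish_image in dish_images:
        category = identify_dish(dish_image, first_dish_model, second_dish_model)
        bill_amount += price_table[category]
    return bill_amount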
Optionally, in the cashier method according to the present invention, the first dish category identification model is determined by performing fine-tuning on a pre-trained category identification model.
Optionally, in the cashier method according to the present invention, performing fine-tuning on the pre-trained category identification model comprises: modifying the number of outputs of the classifier in the pre-trained category identification model according to the number of dish categories corresponding to a first dish image data set; loading the parameters of all processing layers before the last processing layer of the pre-trained category identification model into the corresponding layers of the modified category identification model to generate the first dish category identification model; and training the first dish category identification model on the first dish image data set so that the output of the first dish category identification model indicates the category of the dish in an input dish image.
Optionally, in the cashier method according to the present invention, the category identification model is trained on a pre-acquired image data set so that the output of the category identification model indicates the category of the image content in an input image.
According to a further aspect of the present invention, there is provided a dish ordering method comprising the following steps: first, one or more dish images corresponding to a current order are acquired, wherein each dish image contains the corresponding dish; each dish image is input into a first dish category identification model for processing, to obtain the image features output by the bottleneck layer of the first dish category identification model, wherein the bottleneck layer comprises all processing layers before the last processing layer of the first dish category identification model; the image features are input into a second dish category identification model for identification, so as to obtain the category of the dish in each dish image; whether serving of the dish has timed out is determined according to the category of the dish in the dish image; and if serving of the dish has timed out, a dish ordering message is sent to the corresponding client.
Optionally, the dish ordering method according to the present invention further comprises: counting the stock quantity of the food materials corresponding to the dish according to the category of the dish in the dish image; and if the stock quantity is lower than a preset food material quantity, sending a replenishment message to the client.
Optionally, the dish ordering method according to the present invention further comprises: if the stock quantity is not lower than the preset food material quantity, sending a dish making message to the client.
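A minimal sketch combining the serving-timeout check of the dish ordering method with the optional stock checks above; kitchen_client and its send_* methods, as well as the timing and stock parameters, are hypothetical stand-ins for the messages sent to the client and are not defined by the patent.

```python
import time

def check_current_order(dish_images, first_dish_model, second_dish_model,
                        order_time, timeout_seconds, stock, min_stock, kitchen_client):
    """For each identified dish: send a dish ordering message if serving has
    timed out, a replenishment message if food-material stock is low, and
    otherwise a dish making message. kitchen_client is hypothetical."""
    for dish_image in dish_images:
        category = identify_dish(dish_image, first_dish_model, second_dish_model)
        if time.time() - order_time > timeout_seconds:
            kitchen_client.send_dish_ordering_message(category)   # urge the kitchen
        if stock.get(category, 0) < min_stock:
            kitchen_client.send_replenishment_message(category)   # restock food materials
        else:
            kitchen_client.send_dish_making_message(category)     # start preparing the dish
```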
Optionally, in the dish ordering method according to the present invention, the first dish category identification model is determined by performing fine-tuning on a pre-trained category identification model.
Optionally, in the dish ordering method according to the present invention, performing fine-tuning on the pre-trained category identification model comprises: modifying the number of outputs of the classifier in the pre-trained category identification model according to the number of dish categories corresponding to a first dish image data set; loading the parameters of all processing layers before the last processing layer of the pre-trained category identification model into the corresponding layers of the modified category identification model to generate the first dish category identification model; and training the first dish category identification model on the first dish image data set so that the output of the first dish category identification model indicates the category of the dish in an input dish image.
Optionally, in the dish ordering method according to the present invention, the category identification model is trained on a pre-acquired image data set so that the output of the category identification model indicates the category of the image content in an input image.
According to still another aspect of the present invention, there is provided a dish identification device, which comprises a feature extraction module and an identification module. The feature extraction module is adapted to input a dish image to be identified into the first dish category identification model for processing, to obtain the image features output by the bottleneck layer of the first dish category identification model, wherein the bottleneck layer comprises all processing layers before the last processing layer of the first dish category identification model; the identification module is adapted to input the image features into the second dish category identification model for identification, so as to obtain the category of the dish in the dish image.
According to still another aspect of the present invention, there is provided a cashier device, which comprises a first acquisition module, a feature extraction module, an identification module, a second acquisition module and a calculation module. The first acquisition module is adapted to acquire one or more dish images corresponding to the current order, wherein each dish image contains the corresponding dish; the feature extraction module is adapted to input each dish image into the first dish category identification model for processing, to obtain the image features output by the bottleneck layer of the first dish category identification model, wherein the bottleneck layer comprises all processing layers before the last processing layer of the first dish category identification model; the identification module is adapted to input the image features into the second dish category identification model for identification, so as to obtain the category of the dish in each dish image; the second acquisition module is adapted to acquire the price of the dish according to the category of the dish in the dish image; and the calculation module is adapted to calculate the bill amount corresponding to the current order based on the price of the dish in each dish image and the number of dish images.
According to still another aspect of the present invention, there is provided a dish ordering device, which comprises an acquisition module, a feature extraction module, an identification module, a determination module and a sending module. The acquisition module is adapted to acquire one or more dish images corresponding to the current order, wherein each dish image contains the corresponding dish; the feature extraction module is adapted to input each dish image into the first dish category identification model for processing, to obtain the image features output by the bottleneck layer of the first dish category identification model, wherein the bottleneck layer comprises all processing layers before the last processing layer of the first dish category identification model; the identification module is adapted to input the image features into the second dish category identification model for identification, so as to obtain the category of the dish in each dish image; the determination module is adapted to determine, according to the category of the dish in the dish image, whether serving of the dish has timed out; and the sending module is adapted to send a dish ordering message to the corresponding client when serving of the dish has timed out.
According to yet another aspect of the present invention, there is provided a computing device comprising one or more processors, a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, and the one or more programs comprise instructions for performing the dish identification method, the cashier method and/or the dish ordering method according to the present invention.
According to yet another aspect of the present invention, there is also provided a computer readable storage medium storing one or more programs, the one or more programs comprising instructions which, when executed by a computing device, cause the computing device to perform the dish identification method, the cashier method and/or the dish ordering method according to the present invention.
According to the dish identification scheme of the present invention, forward inference is performed on the dish image to be identified with the first dish category identification model, and the image features of the bottleneck layer are extracted and input into the second dish category identification model to determine the category of the dish. The first dish category identification model is determined by fine-tuning a pre-trained category identification model; because the category identification model is trained on a large-scale image data set and therefore already has strong recognition capability, the first dish category identification model starts from a good initial network structure. A general first dish image data set is then used to complete transfer training of the first dish category identification model, further improving its recognition performance on dishes. Considering that dishes prepared from the same raw materials may differ greatly between restaurants, a small number of dish samples from each restaurant are collected to form a specific second dish image data set, which is combined with the first dish category identification model to train the second dish category identification model and obtain more accurate recognition. The expected dish recognition effect can thus be achieved without collecting a large number of dish samples from actual use, which greatly saves development time and cost.
Further, with the cashier scheme and the dish ordering scheme based on this dish identification, and on the premise that the accuracy and speed of dish category recognition are guaranteed, the cashier scheme can settle bills quickly and accurately, and the dish ordering scheme can promptly notify the kitchen to speed up the preparation of dishes whose serving has timed out, thereby improving the diner's dining experience.
Drawings
To the accomplishment of the foregoing and related ends, certain illustrative aspects are described herein in connection with the following description and the annexed drawings, which set forth the various ways in which the principles disclosed herein may be practiced, and all aspects and equivalents thereof are intended to fall within the scope of the claimed subject matter. The above, as well as additional objects, features, and advantages of the present disclosure will become more apparent from the following detailed description when read in conjunction with the accompanying drawings. Like reference numerals generally refer to like parts or elements throughout the present disclosure.
FIG. 1 shows a schematic diagram of a dish identification system 100 according to one embodiment of the invention;
fig. 2 shows a schematic diagram of a cashier system 200 according to an embodiment of the invention;
FIG. 3 shows a schematic diagram of a dish ordering system 300 according to one embodiment of the invention;
FIG. 4 illustrates a block diagram of a computing device 400, according to one embodiment of the invention;
FIG. 5A shows a schematic diagram of a dish identification process according to one embodiment of the invention;
FIG. 5 shows a flow chart of a dish identification method 500 according to one embodiment of the invention;
FIG. 6A shows a schematic diagram of a category identification model according to one embodiment of the invention;
FIG. 6B shows a schematic diagram of a first dish category identification model according to one embodiment of the invention;
fig. 7 shows a flow chart of a cashier method 600 according to an embodiment of the invention;
FIG. 8 shows a flow chart of a dish ordering method 700 according to one embodiment of the invention;
FIG. 9 shows a schematic diagram of a dish identification device 800 according to one embodiment of the invention;
fig. 10 shows a schematic diagram of a cashier device 900 according to an embodiment of the invention; and
fig. 11 shows a schematic diagram of a dish ordering apparatus 1000 according to an embodiment of the invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
Fig. 1 shows a schematic diagram of a dish identification system 100 according to an embodiment of the invention. It should be noted that the dish identification system 100 in fig. 1 is merely exemplary; in practice there may be different numbers of client devices and servers in the dish identification system 100. The client devices are typically devices with a shooting function and may be mobile terminals, such as smartphones and tablet computers, or computing devices such as PCs, and the invention is not limited in this respect.
As shown in fig. 1, the dish identification system 100 includes a client device 110 and a server 120, wherein a dish identification device (not shown in the figure) resides in the server 120. According to one embodiment of the present invention, a camera is provided in the client device 110, and after a dish image is captured by the camera, the dish image is uploaded to the server 120.
The server 120 identifies the category of the dish in the received dish image through the dish identification device. Specifically, the dish identification device inputs the dish image to be identified into the first dish category identification model for processing to obtain the image features output by the bottleneck layer of the first dish category identification model, and inputs the image features into the second dish category identification model for identification to obtain the category of the dish in the dish image, where the bottleneck layer comprises all processing layers before the last processing layer of the first dish category identification model. Further, in practice the dish identification device is not limited to being deployed in the server 120; it may also be deployed in the client device 110 to avoid dependence on a communication network, such as a 4G network, thereby improving the availability of the identification application under no-network or weak-signal conditions and reducing operation and maintenance costs.
The dish identification system 100 described above is explained below for two specific application scenarios: cashiering and dish ordering. In the cashier scenario, the dish identification system 100 is applied to bill settlement in a restaurant, forming a cashier system 200. Fig. 2 shows a schematic diagram of a cashier system 200 according to an embodiment of the invention. It should be noted that the cashier system 200 in fig. 2 is only exemplary; in practice there may be different numbers of client devices and servers in the cashier system 200, and the client devices are typically devices with a shooting function, which may be mobile terminals, such as smartphones and tablet computers, or computing devices such as PCs, and the invention is not limited in this respect.
As shown in fig. 2, the cashier system 200 includes a client device 210 and a server 220, wherein a cashier device (not shown) resides in the server 220. According to one embodiment of the present invention, a camera is disposed in the client device 210, and after a dish image is captured by the camera, the dish image is uploaded to the server 220. The server 220 performs bill settlement for the order corresponding to the dishes in the received dish images through the cashier device.
When a diner has selected the desired dishes A1, A2 and A3, the dishes A1, A2 and A3 can be carried on a tray to the checkout counter for settlement. The checkout counter in the restaurant is provided with a settlement robot with a built-in client device 210, through which images of dishes A1, A2 and A3 can be taken.
It should be noted that the dishes A1, A2 and A3 may be photographed separately to form the corresponding single-dish images B1, B2 and B3, or may be photographed together to form one dish image B4 that contains dishes A1, A2 and A3. If dish images B1, B2 and B3 are taken separately, the three dish images B1, B2 and B3 are uploaded to the server 220 for dish identification, and the order associated with the dishes is then settled; if only dish image B4 is taken, dish image B4 is uploaded to the server 220 and must first be segmented, and the subsequent processing is performed after images respectively containing dishes A1, A2 and A3 have been obtained. Of course, the invention is not limited in this regard.
Taking the case where the server 220 receives the dish image A1 uploaded by the client device 210 as an example, the cashier device inputs the dish image A1 into the first dish category identification model for processing to obtain the image features output by the bottleneck layer of the first dish category identification model, and then inputs the image features into the second dish category identification model for identification, obtaining fish-flavored shredded pork as the category of the dish in dish image A1. Dish images A2 and A3 are identified in the same way, and their categories are stir-fried rape and pork rib and white gourd soup respectively. The price of the fish-flavored shredded pork is 15 yuan, the price of the stir-fried rape is 9 yuan and the price of the pork rib and white gourd soup is 12 yuan, so it is calculated that the current diner should pay 36 yuan, and this result is fed back to the client device 210. Finally, the settlement robot presents the 36-yuan meal fee to the diner for payment. In addition, it should be noted that in practice the cashier device is not limited to being deployed in the server 220; it may also be deployed in the client device 210.
In the dish ordering scenario, the dish identification system 100 is applied to ordering (urging) dishes in a restaurant, forming a dish ordering system 300. Fig. 3 shows a schematic diagram of a dish ordering system 300 according to an embodiment of the invention. It should be noted that the dish ordering system 300 in fig. 3 is merely exemplary; in practice there may be different numbers of client devices and servers in the dish ordering system 300, where whether a client device needs a shooting function depends on how it is used, and a client device may be a mobile terminal, such as a smartphone or tablet computer, or a computing device such as a PC, and the invention is not limited in this respect.
As shown in fig. 3, the dish ordering system 300 includes a client device 310, a server 320 and a client device 330, wherein a dish ordering device (not shown) resides in the server 320. According to an embodiment of the present invention, a camera is disposed in the client device 310 and can capture the dish images in a menu; when a diner has decided which dishes to order, the client device 310 is used to capture the corresponding dish images in the menu, forming a corresponding order that is associated with one or more dish images corresponding to the dishes selected by the diner. The client device 330 is usually disposed in the kitchen and can receive various messages from the server 320, so that kitchen staff can perform corresponding operations according to the messages, such as replenishing food materials or speeding up the preparation of dishes.
After the client device 310 uploads the current order to the server 320, the server 320 obtains, through the dish ordering device, the 3 dish images corresponding to the current order, namely dish image A1, dish image A2 and dish image A3. Taking dish image A1 as an example, the dish image A1 is input into the first dish category identification model for processing to obtain the image features output by the bottleneck layer of the first dish category identification model, and the image features are input into the second dish category identification model for identification, obtaining fish-flavored shredded pork as the category of the dish in dish image A1. Dish images A2 and A3 are identified in the same way, and their categories are stir-fried rape and pork rib and white gourd soup respectively.
At this time, the dish ordering device in the server 320 has received the serving message fed back by the client device 330, which indicates that the fish-flavored shredded pork and the stir-fried rape have already been prepared and served, but the pork rib and white gourd soup has not yet been prepared and its serving has timed out. Based on this, a dish ordering message is sent to the corresponding client (typically software such as a kitchen management system) on the client device 330 to prompt the kitchen staff to speed up the preparation of the pork rib and white gourd soup. It should be noted that in practice the dish ordering device is not limited to being deployed in the server 320; it may also be deployed in the client device 310, in which case the client device 330 communicates directly with the client device 310.
According to one embodiment of the present invention, the server 120 in the dish identification system 100, the server 220 in the cashier system 200 and the server 320 in the dish ordering system 300 may each be implemented by the computing device 400 described below. FIG. 4 illustrates a block diagram of a computing device 400, according to one embodiment of the invention.
As shown in FIG. 4, in a basic configuration 402, computing device 400 typically includes a system memory 406 and one or more processors 404. A memory bus 408 may be used for communication between the processor 404 and the system memory 406.
Depending on the desired configuration, processor 404 may be any type of processor, including but not limited to: a microprocessor (μP), a microcontroller (μC), a digital signal processor (DSP), or any combination thereof. Processor 404 may include one or more levels of cache, such as a first level cache 410 and a second level cache 412, a processor core 414, and registers 416. The example processor core 414 may include an Arithmetic Logic Unit (ALU), a Floating Point Unit (FPU), a digital signal processing core (DSP core), or any combination thereof. The example memory controller 418 may be used with the processor 404, or in some implementations, the memory controller 418 may be an internal part of the processor 404.
Depending on the desired configuration, system memory 406 may be any type of memory including, but not limited to: volatile memory (such as RAM), non-volatile memory (such as ROM, flash memory, etc.), or any combination thereof. The system memory 406 may include an operating system 420, one or more programs 422, and data 424. In some implementations, the program 422 may be arranged to execute instructions on an operating system by the one or more processors 404 using the data 424.
Computing device 400 may also include an interface bus 440 that facilitates communication from various interface devices (e.g., output devices 442, peripheral interfaces 444, and communication devices 446) to basic configuration 402 via bus/interface controller 430. The example output device 442 includes a graphics processing unit 448 and an audio processing unit 450. They may be configured to facilitate communication with various external devices such as a display or speakers via one or more a/V ports 452. Example peripheral interfaces 444 may include a serial interface controller 454 and a parallel interface controller 456, which may be configured to facilitate communications with external devices such as input devices (e.g., keyboard, mouse, pen, voice input device, touch input device) or other peripherals (e.g., printer, scanner, etc.) via one or more I/O ports 458. An example communication device 446 may include a network controller 460, which may be arranged to facilitate communication with one or more other computing devices 462 over a network communication link via one or more communication ports 464.
The network communication link may be one example of a communication medium. Communication media may typically be embodied by computer readable instructions, data structures or program modules in a modulated data signal, such as a carrier wave or other transport mechanism, and may include any information delivery media. A "modulated data signal" may be a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of non-limiting example, communication media may include wired media, such as a wired network or a dedicated wired connection, and wireless media, such as acoustic, Radio Frequency (RF), microwave, Infrared (IR) or other wireless media. The term computer readable media as used herein may include both storage media and communication media.
Computing device 400 may be implemented as a server, such as a file server, database server, application server, WEB server, etc., as part of a small-sized portable (or mobile) electronic device, such as a cellular telephone, personal Digital Assistant (PDA), personal media player device, wireless WEB-watch device, personal headset device, application-specific device, or a hybrid device that may include any of the above functions. Computing device 400 may also be implemented as a personal computer including desktop and notebook computer configurations.
In some embodiments, computing device 400 is implemented as server 120, server 220 and/or server 320, and is configured to perform the dish identification method 500, the cashier method 600 and/or the dish ordering method 700 according to the present invention. The program 422 of the computing device 400 includes a plurality of program instructions for executing the dish identification method 500, the cashier method 600 and/or the dish ordering method 700 according to the present invention, and the data 424 may further store configuration information of the dish identification system 100, the cashier system 200 and/or the dish ordering system 300.
Fig. 5A shows a schematic diagram of a dish identification process according to an embodiment of the invention. As shown in fig. 5A, the dish image to be identified is input into the first dish category identification model for processing to obtain the image features output by the bottleneck layer of the first dish category identification model, and the image features are then input into the second dish category identification model for identification to obtain the category of the dish in the dish image. The first dish category identification model is determined by fine-tuning a category identification model pre-trained on an image data set, and is then trained on the first dish image data set; the second dish category identification model is trained on the second dish image data set together with the first dish category identification model.
Fig. 5 shows a flow chart of a dish identification method 500 according to an embodiment of the invention. As shown in fig. 5, the method 500 begins at step S510. In step S510, the dish image to be identified is input into the first dish category identification model for processing to obtain the image features output by the bottleneck layer of the first dish category identification model, where the bottleneck layer comprises all processing layers before the last processing layer of the first dish category identification model.
According to one embodiment of the invention, the first dish category identification model is determined by fine-tuning a pre-trained category identification model. The category identification model comprises a deep neural network that comprises a plurality of processing layers. In this embodiment, the deep neural network is a convolutional neural network, and each processing layer is any one of a convolutional layer, a pooling layer and a fully connected layer. In other words, the category identification model is a model determined based on a convolutional neural network, and fig. 6A shows a schematic diagram of the category identification model according to an embodiment of the present invention.
As shown in fig. 6A, the category identification model comprises M+1 processing layers and 1 classifier that outputs the final N1 recognition results. A fully connected layer is usually chosen as the last, i.e. the (M+1)-th, processing layer, and the first M processing layers may be convolutional layers, pooling layers or fully connected layers, or activation function layers, normalization layers and the like. The network structure adopted by the category identification model can be adjusted appropriately according to the actual application scenario, the network training situation, the system configuration, the performance requirements and so on; such adjustments can readily be conceived by anyone who understands the solution of the present invention and also fall within the protection scope of the present invention, so they are not described in detail here.
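For illustration only, a category identification model with this kind of structure might be sketched in PyTorch as follows; the patent does not specify a framework, and all layer types and sizes here are assumptions rather than the claimed network.

```python
import torch.nn as nn

class CategoryRecognitionModel(nn.Module):
    """Illustrative structure: the first M processing layers (convolution,
    activation and pooling here) form the bottleneck, a fully connected
    layer acts as the (M+1)-th processing layer, and a classifier outputs
    N1 recognition results."""

    def __init__(self, n1_classes=1000):
        super().__init__()
        self.bottleneck = nn.Sequential(            # first M processing layers
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(7), nn.Flatten(),
        )
        self.last_layer = nn.Linear(64 * 7 * 7, 256)   # (M+1)-th processing layer
        self.classifier = nn.Linear(256, n1_classes)   # outputs N1 recognition results

    def forward(self, x):
        return self.classifier(self.last_layer(self.bottleneck(x)))
```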
After the network structure of the category identification model has been determined, the category identification model must first be trained for subsequent use. According to one embodiment of the invention, the category identification model is trained on a pre-acquired image data set so that the output of the category identification model indicates the category of the image content in an input image. The image data set comprises a plurality of pieces of image data, each piece of image data comprising a training image and the category of the image content in the training image.
When training the category identification model, for each piece of image data in the image data set, the training image in the image data is first input into the category identification model to obtain the first category identification result for the training image output by the category identification model; the parameters of the category identification model are then adjusted based on the difference between the category of the image content in the training image and the first category identification result, usually by means of a back-propagation algorithm.
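A minimal sketch of this training procedure, assuming the PyTorch model above, a cross-entropy loss as the measure of the difference, and an image data set yielding (tensor, integer label) pairs; all of these choices are assumptions made for illustration.

```python
import torch
import torch.nn as nn

def train_category_model(model, image_data_set, epochs=10, lr=1e-3):
    """For each piece of image data, obtain the first category recognition
    result and adjust the model parameters by back-propagating the loss
    between it and the labelled category."""
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for training_image, category in image_data_set:   # (tensor, int) pairs assumed
            optimizer.zero_grad()
            result = model(training_image.unsqueeze(0))    # first category recognition result
            loss = loss_fn(result, torch.tensor([category]))
            loss.backward()                                # back-propagation
            optimizer.step()
```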
In this embodiment, the ImageNet data set is used as the image data set. ImageNet is a large visual database for visual object recognition research that provides a large number of public images. The ImageNet data set contains more than 20000 categories; a typical category, such as "balloon" or "strawberry", contains several hundred images, which facilitates the training of models that recognize object or biological categories.
After the category identification model has been trained with the ImageNet data set, fine-tuning can be performed on the trained category identification model to determine the first dish category identification model. According to one embodiment of the present invention, the fine-tuning of the pre-trained category identification model may be performed as follows. First, the number of outputs of the classifier in the pre-trained category identification model is modified according to the number of dish categories corresponding to the first dish image data set.
In this embodiment, the first dish image data set comprises a plurality of pieces of dish image data, each piece of dish image data comprising a dish training image and the category of the dish in the dish training image; the categories of the dishes in all the dish training images are counted, and the total number of distinct categories is taken as the number of dish categories. In general, images of various dishes can be downloaded from the Internet and preprocessed accordingly, for example by cropping, rotation and denoising, to form dish training images with uniform size and resolution for training a model that recognizes dish categories.
If the number of dish categories corresponding to the first dish image data set is N2, the number of outputs of the classifier in the pre-trained category identification model is modified to N2. Then, the parameters of all processing layers before the last processing layer of the pre-trained category identification model are loaded into the corresponding layers of the modified category identification model to generate the first dish category identification model. According to one embodiment of the present invention, referring to fig. 6A, the parameters of all processing layers before the (M+1)-th processing layer, i.e. the first M processing layers, of the pre-trained category identification model are loaded into the corresponding layers of the modified category identification model to generate the first dish category identification model.
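A possible sketch of these two steps under the same assumptions as above; copy.deepcopy stands in for loading the pre-trained parameters into the corresponding layers, and the classifier attribute follows the illustrative structure sketched earlier.

```python
import copy
import torch.nn as nn

def build_first_dish_model(pretrained_model, n2_dish_categories):
    """Copy the pre-trained category identification model so that the
    parameters of the first M processing layers are carried over, then
    replace the classifier so that it outputs N2 dish categories."""
    model = copy.deepcopy(pretrained_model)                        # carries over pre-trained parameters
    in_features = model.classifier.in_features                     # assumes the structure sketched above
    model.classifier = nn.Linear(in_features, n2_dish_categories)  # new N2-way classifier
    return model
```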
Obviously, the first dish category identification model has a network structure similar to that of the category identification model, i.e. the first dish category identification model comprises a deep neural network that comprises a plurality of processing layers. In this embodiment, the deep neural network is a convolutional neural network, and each processing layer is any one of a convolutional layer, a pooling layer and a fully connected layer. In other words, the first dish category identification model is also a model determined based on a convolutional neural network, and fig. 6B shows a schematic diagram of the first dish category identification model according to an embodiment of the present invention. As shown in fig. 6B, the first dish category identification model comprises M+1 processing layers and 1 classifier, which outputs N2 recognition results. The M processing layers before the (M+1)-th processing layer form the bottleneck layer of the first dish category identification model. Of course, for the category identification model shown in fig. 6A, the M processing layers before the (M+1)-th processing layer can likewise be regarded as the bottleneck layer of the category identification model.
After the network structure of the first dish category identification model has been formed in this way, the first dish category identification model is trained on the first dish image data set so that its output indicates the category of the dish in an input dish image. Specifically, for each piece of dish image data, the dish training image in the dish image data is first input into the first dish category identification model to obtain the first dish category identification result for the dish training image output by the first dish category identification model, and the parameters of the first dish category identification model are then adjusted based on the difference between the category of the dish in the dish training image and the first dish category identification result.
Considering that the trained category identification model already has good recognition capability, the parameters of the first M processing layers transplanted from the trained category identification model, that is, most of the processing layers near the input end, can remain unchanged, and generally only the parameters of the several processing layers near the output end of the first dish category identification model are adjusted, for example the parameters of the (M-1)-th to (M+1)-th processing layers.
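This partial freezing could be sketched as follows under the same illustrative assumptions; here only the (M+1)-th processing layer and the classifier remain trainable, which is one of several reasonable choices rather than the mandated one.

```python
def freeze_bottleneck(first_dish_model):
    """Keep the transplanted parameters of the first M processing layers
    unchanged and train only the layers near the output end (here the
    (M+1)-th processing layer and the classifier)."""
    for param in first_dish_model.bottleneck.parameters():
        param.requires_grad = False
    # When building the optimizer, pass only the trainable parameters, e.g.
    # torch.optim.SGD(filter(lambda p: p.requires_grad, first_dish_model.parameters()), lr=1e-3)
```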
In step S510, after the dish image A to be identified has been input into the first dish category identification model, the image features output by the bottleneck layer of the first dish category identification model are obtained, and step S520 is then performed. In step S520, the image features are input into the second dish category identification model for identification, so as to obtain the category of the dish in the dish image.
According to one embodiment of the invention, the second dish category identification model is trained on a pre-acquired second dish image data set together with the first dish category identification model, so that the output of the second dish category identification model indicates the category of the dish in an input dish image. The second dish image data set comprises a plurality of pieces of specific dish image data, each piece of specific dish image data comprising a specific dish training image and the category of the dish in the specific dish training image.
In this embodiment, the specific dish training images are usually the dish images provided by the dining place, such as a restaurant, where dish identification is required. Because dishes of the same category cooked by different restaurants may differ considerably, the dish images of that restaurant should be collected and preprocessed accordingly, for example by cropping, rotation and denoising, to form specific dish training images with uniform size and resolution for training a model that recognizes the categories of these specific dishes.
When training the second dish category identification model, for each piece of specific dish image data in the second dish image data set, the specific dish training image in the specific dish image data is first input into the first dish category identification model for processing to obtain the training image features output by the bottleneck layer of the first dish category identification model. The training image features are then input into the second dish category identification model to obtain the second category identification result, output by the second dish category identification model, corresponding to the dish in the specific dish training image. Finally, the parameters of the second dish category identification model are adjusted based on the category of the dish in the specific dish training image and the second category identification result. In this embodiment, the second dish category identification model includes a Support Vector Machine (SVM) model.
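A minimal sketch of this training step, using scikit-learn's SVC as an assumed stand-in for the support vector machine model and the illustrative first dish category identification model from the sketches above.

```python
import numpy as np
import torch
from sklearn.svm import SVC

def train_second_model(first_dish_model, specific_dish_data_set):
    """Extract bottleneck features with the first dish category identification
    model and fit a support vector machine on them."""
    first_dish_model.eval()
    features, labels = [], []
    with torch.no_grad():
        for training_image, category in specific_dish_data_set:
            feat = first_dish_model.bottleneck(training_image.unsqueeze(0))  # bottleneck output
            features.append(feat.flatten().numpy())
            labels.append(category)
    svm = SVC(kernel="linear")
    svm.fit(np.array(features), np.array(labels))
    return svm
```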
As a classification algorithm, the support vector machine improves the generalization ability of the learning machine by seeking structural risk minimization, minimizing both the empirical risk and the confidence interval, so that good statistical rules can be obtained even when the number of statistical samples is small. In plain terms, a support vector machine model is a binary classifier whose basic model is a linear classifier with the largest margin in the feature space; in other words, the learning strategy of the support vector machine is margin maximization, which can finally be converted into the solution of a convex quadratic programming problem.
Obviously, the support vector machine model adopted in the second dish category identification model is a multi-class classifier. Current methods for constructing multi-class support vector machine classifiers fall mainly into two categories: direct methods and indirect methods. A direct method modifies the objective function directly, combining the parameter solutions of multiple classification planes into one optimization problem, and realizes multi-class classification by solving this optimization problem in one pass. Indirect methods mainly construct the multi-class classifier by combining several binary classifiers; common methods include the one-against-all method and the one-against-one method.
In the one-against-all method, during training the samples of one class are in turn assigned to one class and all remaining samples to another class, so K classes of samples yield K binary classifiers; during classification, an unknown sample is assigned to the class with the largest classification function value. In the one-against-one method, a binary classifier is designed between every pair of classes, so K classes of samples require K(K-1)/2 binary classifiers; when an unknown sample is classified, the class that receives the most votes is taken as the class of the unknown sample.
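In scikit-learn terms, the two indirect constructions can be sketched as follows; this is an assumption about tooling for illustration, not part of the patent.

```python
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import LinearSVC

# One-against-all: K binary classifiers, one per dish category.
one_against_all_svm = OneVsRestClassifier(LinearSVC())
# One-against-one: K(K-1)/2 binary classifiers; a vote decides the category.
one_against_one_svm = OneVsOneClassifier(LinearSVC())
```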
Which of these methods is adopted to implement the multi-class support vector machine model can be selected according to the actual situation, and the invention is not limited in this respect. Moreover, the second dish category identification model is not limited to being constructed based on a support vector machine model; it can also be implemented with logistic regression (LR), the gradient boosting decision tree (GBDT) algorithm and so on. The invention does not limit which algorithm or model is used to construct the second dish category identification model; the choice can be made according to the actual application scenario, the network training situation, the system configuration, the performance requirements and so on, and the model construction process and the corresponding parameters of the chosen approach can be adjusted appropriately. Such choices can readily be conceived by anyone who understands the solution of the present invention and also fall within the protection scope of the present invention, so they are not described in detail here.
According to one embodiment of the present invention, the image features of dish image A obtained in step S510 are input into the second dish category identification model, and the category of the dish in dish image A is identified as fish-flavored shredded pork.
Fig. 7 shows a flow chart of a cashier method 600 according to an embodiment of the invention. As shown in fig. 7, the method 600 begins at step S610. In step S610, one or more menu images corresponding to the current order are obtained, where the menu images include corresponding menu items. According to one embodiment of the invention, the current order corresponds to 3 dish images, namely dish image A1, dish image A2 and dish image A3.
Subsequently, step S620 is performed to input each dish image into the first dish category recognition model for processing, so as to obtain the image features output by the bottleneck layer in the first dish category recognition model, where the bottleneck layer comprises all processing layers before the last processing layer in the first dish category recognition model. The first dish category recognition model is determined by fine-tuning a pre-trained category recognition model.
According to one embodiment of the present invention, the fine-tuning may be performed on the pre-trained category recognition model in the following manner. First, the output number of the classifier in the pre-trained category recognition model is modified according to the number of dish categories corresponding to the first dish image data set; then the parameters of all processing layers before the last processing layer in the pre-trained category recognition model are correspondingly loaded into the modified category recognition model to generate the first dish category recognition model; finally the first dish category recognition model is trained based on the first dish image data set so that its output indicates the category of the dish in an input dish image. Here the category recognition model is a model trained based on a pre-acquired image data set such that its output indicates the category of the image content in an input image.
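As a non-authoritative sketch of these fine-tuning steps, the snippet below assumes PyTorch with a recent torchvision and uses an ImageNet-pretrained ResNet-50 as the pre-trained category recognition model; the backbone, the number of dish categories and the training hyperparameters are assumptions for illustration, not prescriptions of the patent.

```python
# Sketch of the fine-tuning described above (assumed framework: PyTorch).
import torch
import torch.nn as nn
from torchvision import models

num_dish_categories = 100  # assumed size of the first dish image data set's label space

# 1. Load the pre-trained category recognition model; all layers before the
#    final fully-connected layer keep their pre-trained parameters.
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)

# 2. Modify the classifier so its output number matches the dish categories.
model.fc = nn.Linear(model.fc.in_features, num_dish_categories)

# 3. Train on the first dish image data set so the output indicates the dish category.
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

def train_step(images, dish_labels):
    optimizer.zero_grad()
    logits = model(images)                 # forward pass
    loss = criterion(logits, dish_labels)  # difference between label and prediction
    loss.backward()                        # adjust parameters from that difference
    optimizer.step()
    return loss.item()
```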
In this embodiment, the dish images A1, A2 and A3 are respectively input into the first dish category recognition model for processing, and the image features output by the bottleneck layer of the first dish category recognition model are C1, C2 and C3 respectively.
In step S630, the image features are input into the second dish category recognition model for recognition, so as to obtain the category of the dish in each dish image. According to one embodiment of the invention, after the image features C1, C2 and C3 are input into the second dish category recognition model for recognition, the category of the dish in dish image A1 is fish-flavored shredded pork, the category of the dish in dish image A2 is stir-fried rape, and the category of the dish in dish image A3 is pork rib and white gourd soup.
Next, step S640 is performed to obtain the price of each dish according to the category of the dish in the dish image. According to one embodiment of the invention, the obtained price of the fish-flavored shredded pork is 15 yuan, that of the stir-fried rape is 9 yuan, and that of the pork rib and white gourd soup is 12 yuan.
Finally, in step S650, a billing amount corresponding to the current order is calculated based on the price of the dishes in each of the dish images and the number of the dish images. According to one embodiment of the invention, the current order corresponds to a billing amount of 15×1+9×1+12×1=36 yuan.
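Steps S640 and S650 amount to a price lookup followed by a sum over the recognized dishes; a minimal sketch, with a hypothetical price table and function name, might look like this.

```python
# Hypothetical price table for the recognized dish categories (in yuan).
PRICES = {
    "fish-flavored shredded pork": 15,
    "stir-fried rape": 9,
    "pork rib and white gourd soup": 12,
}

def billing_amount(recognized_categories):
    # Each dish image contributes one dish, so the bill is the sum of the
    # per-dish prices over all dish images in the current order.
    return sum(PRICES[category] for category in recognized_categories)

print(billing_amount(["fish-flavored shredded pork",
                      "stir-fried rape",
                      "pork rib and white gourd soup"]))  # 36
```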
The processing procedure for identifying the category of the dishes in the dish images in steps S620 and S630 is disclosed in detail in the description of the method 500, and will not be repeated here.
Fig. 8 shows a flow chart of a dish ordering method 700 according to an embodiment of the invention. As shown in fig. 8, the method 700 begins at step S710. In step S710, one or more dish images corresponding to the current order are obtained, where the dish images contain the corresponding dishes. According to one embodiment of the invention, the current order corresponds to 3 dish images, namely dish image A1, dish image A2 and dish image A3.
Subsequently, step S720 is performed, in which each dish image is respectively input into the first dish category recognition model for processing, so as to obtain the image features output by the bottleneck layer in the first dish category recognition model, where the bottleneck layer comprises all processing layers before the last processing layer in the first dish category recognition model. The first dish category recognition model is determined by fine-tuning a pre-trained category recognition model.
According to one embodiment of the present invention, the fine-tuning may be performed on the pre-trained category recognition model in the same manner as described above for method 600: the output number of the classifier in the pre-trained category recognition model is modified according to the number of dish categories corresponding to the first dish image data set, the parameters of all processing layers before the last processing layer in the pre-trained category recognition model are correspondingly loaded into the modified category recognition model to generate the first dish category recognition model, and the first dish category recognition model is trained based on the first dish image data set so that its output indicates the category of the dish in an input dish image. Here the category recognition model is a model trained based on a pre-acquired image data set such that its output indicates the category of the image content in an input image.
In this embodiment, the dish images A1, A2 and A3 are respectively input into the first dish category recognition model for processing, and the image features output by the bottleneck layer of the first dish category recognition model are C1, C2 and C3 respectively.
In step S730, the image features are input into the second dish category recognition model for recognition, so as to obtain the category of the dish in each dish image. According to one embodiment of the invention, after the image features C1, C2 and C3 are input into the second dish category recognition model for recognition, the category of the dish in dish image A1 is fish-flavored shredded pork, the category of the dish in dish image A2 is stir-fried rape, and the category of the dish in dish image A3 is pork rib and white gourd soup.
After the categories of the dishes are determined, according to another embodiment of the invention, the stock quantity of the food materials corresponding to each dish is counted according to the category of the dish in the dish image; if the stock quantity is lower than the preset food material quantity, a replenishment message is sent to the client, and if the stock quantity is not lower than the preset food material quantity, a dish making message is sent to the corresponding client.
In this embodiment, taking the pork rib and white gourd soup as an example, the food materials corresponding to the dish include pork ribs and white gourd, whose counted stock quantities are D1 and D2 respectively. To make the pork rib and white gourd soup, the preset food material quantity of pork ribs is ΔD1 and that of white gourd is ΔD2; since D1 > ΔD1 and D2 > ΔD2, the dish making message corresponding to the pork rib and white gourd soup is sent to the corresponding client.
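A minimal sketch of this inventory branch is given below; the stock table, per-serving quantities and messaging callbacks are all hypothetical placeholders, not part of the patented scheme.

```python
# Hypothetical inventory check for the branch described above.
STOCK = {"pork ribs": 8.0, "white gourd": 5.0}       # current stock (kg), placeholder
REQUIRED = {"pork ribs": 0.5, "white gourd": 0.3}    # preset quantity per serving (kg)
DISH_MATERIALS = {"pork rib and white gourd soup": ["pork ribs", "white gourd"]}

def handle_dish(category, send_replenishment, send_preparation):
    for material in DISH_MATERIALS[category]:
        if STOCK[material] < REQUIRED[material]:
            send_replenishment(material)   # stock below the preset quantity
            return
    send_preparation(category)             # enough stock: ask the kitchen to prepare

handle_dish("pork rib and white gourd soup",
            send_replenishment=lambda m: print("replenish:", m),
            send_preparation=lambda c: print("prepare:", c))
```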
Next, step S740 is performed to determine, according to the category of the dish in each dish image, whether serving of the dish has timed out. According to one embodiment of the invention, the fish-flavored shredded pork and the stir-fried rape have already been made and served, but the pork rib and white gourd soup has not yet been made, so its serving has timed out.
Finally, in step S750, if serving of a dish has timed out, a dish ordering message is sent to the corresponding client. According to one embodiment of the invention, since serving of the pork rib and white gourd soup has timed out, a dish ordering message is sent to the client (usually software such as a kitchen management system) to prompt the kitchen staff to speed up preparation of the pork rib and white gourd soup.
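The serving-timeout check of steps S740 and S750 can be sketched as follows; the timeout threshold, timestamps and client call are assumptions for illustration only.

```python
# Sketch of the serving-timeout check; all values below are assumed.
import time

SERVING_TIMEOUT_S = 20 * 60   # assumed preset serving timeout (20 minutes)

def check_and_expedite(order_time_s, served, category, send_order_message):
    # If the dish has not been served within the preset time, ask the kitchen
    # client to speed up preparation of that dish.
    if not served and time.time() - order_time_s > SERVING_TIMEOUT_S:
        send_order_message(category)

check_and_expedite(order_time_s=time.time() - 25 * 60, served=False,
                   category="pork rib and white gourd soup",
                   send_order_message=lambda c: print("expedite:", c))
```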
The processing procedure for identifying the category of the dishes in the dish images in steps S720 and S730 is disclosed in detail in the description of the method 500, and will not be repeated here.
Fig. 9 shows a schematic diagram of a dish identification device 800 according to an embodiment of the invention. As shown in fig. 9, the dish recognition device 800 includes a feature extraction module 810 and a recognition module 820.
The feature extraction module 810 is adapted to input the dish image to be identified into the first dish type identification model for processing, so as to obtain the image features output by the bottleneck layer in the first dish type identification model, wherein the bottleneck layer comprises all processing layers before the last processing layer in the first dish type identification model.
According to one embodiment of the invention, the first dish category recognition model is determined by fine-tuning a pre-trained category recognition model. The feature extraction module 810 is further adapted to perform the fine-tuning based on the pre-trained category recognition model, and to that end is adapted to modify the output number of the classifier in the pre-trained category recognition model according to the number of dish categories corresponding to the first dish image data set, correspondingly load the parameters of all processing layers before the last processing layer in the pre-trained category recognition model into the modified category recognition model to generate the first dish category recognition model, and train the first dish category recognition model based on the first dish image data set, so that the output of the first dish category recognition model indicates the category of the dish in an input dish image.
According to one embodiment of the invention, the first dish category recognition model comprises a deep neural network comprising a plurality of processing layers. The deep neural network is a convolutional neural network, and each processing layer is any one of a convolutional layer, a pooling layer and a fully-connected layer.
In other words, the first dish category recognition model is also a model determined based on a convolutional neural network. Fig. 6B shows a schematic diagram of the first dish category recognition model according to an embodiment of the present invention. As shown in fig. 6B, the first dish category recognition model includes M+1 processing layers and 1 classifier, and outputs N2 recognition results. The M processing layers before the (M+1)-th processing layer constitute the bottleneck layer of the first dish category recognition model.
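One possible way to read the bottleneck-layer output (the M processing layers before the classifier) is sketched below; it assumes the PyTorch model from the fine-tuning sketch above and temporarily bypasses the classifier, which is only one of several ways to expose intermediate features.

```python
# Sketch: extract bottleneck features (output of all layers before the
# classifier) from the fine-tuned model of the earlier sketch.
import torch
import torch.nn as nn

def extract_bottleneck_features(model, dish_images):
    # Temporarily swap the final classifier for an identity mapping so the
    # forward pass returns the bottleneck-layer output instead of class scores.
    classifier, model.fc = model.fc, nn.Identity()
    model.eval()
    with torch.no_grad():
        features = model(dish_images)   # shape: (batch, feature_dim)
    model.fc = classifier               # restore the classifier
    return features
```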
According to one embodiment of the invention, the category recognition model is a model trained based on a pre-acquired image data set such that the output of the category recognition model indicates the category of the image content in an input image. The image data set comprises a plurality of pieces of image data, each piece of image data comprising a training image and the category of the image content in that training image. The feature extraction module 810 is further adapted to perform model training based on the pre-acquired image data set, and to that end is adapted to take, for each piece of image data in the image data set, the training image in the image data as input to the category recognition model to obtain a first category recognition result for the training image output by the category recognition model, and to adjust the parameters of the category recognition model based on the difference between the category of the image content in the training image and the first category recognition result. In this embodiment, the image data set is the ImageNet data set.
According to one embodiment of the invention, the first dish image data set comprises a plurality of pieces of dish image data, each piece of dish image data comprising a dish training image and the category of the dish in that dish training image. In general, images of various dishes can be downloaded from the internet and subjected to corresponding image preprocessing, such as cropping, rotation and denoising, to form dish training images of uniform size and resolution for training a model that recognizes dish categories.
The feature extraction module 810 is further adapted to train the first dish category recognition model based on the first dish image data set, and to that end is adapted to take, for each piece of dish image data in the first dish image data set, the dish training image in the dish image data as input to the first dish category recognition model to obtain a first dish category recognition result for the dish training image output by the first dish category recognition model, and to adjust the parameters of the first dish category recognition model based on the difference between the category of the dish in the dish training image and the first dish category recognition result.
In this embodiment, the feature extraction module 810 is further adapted to adjust only the parameters of a number of processing layers close to the output end of the first dish category recognition model, as sketched below.
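Adjusting only the processing layers close to the output end can be illustrated as follows; which layers are left trainable (here the last residual stage and the classifier of a ResNet) is an assumption of the sketch, not a requirement of the scheme.

```python
# Sketch: freeze early processing layers and adjust only the layers near
# the output end, assuming the torchvision ResNet from the earlier sketch.
import torch

def freeze_early_layers(model, trainable=("layer4", "fc")):
    for name, param in model.named_parameters():
        # Only parameters in the named layers near the output stay trainable.
        param.requires_grad = any(name.startswith(t) for t in trainable)

freeze_early_layers(model)
trainable_params = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable_params, lr=1e-4, momentum=0.9)
```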
The identification module 820 is adapted to input the image features into the second dish category identification model for identification to obtain the category of the dishes in the dish image.
According to one embodiment of the invention, the second dish category recognition model is a model trained based on a pre-acquired second dish image data set and the first dish category recognition model, such that the output of the second dish category recognition model indicates the category of the dish in an input dish image. In this embodiment, the second dish category recognition model comprises a support vector machine model.
According to one embodiment of the invention, the second dish image data set comprises a plurality of pieces of specific dish image data, each piece of specific dish image data comprising a specific dish training image and the category of the dish in that specific dish training image. In this embodiment, the specific dish training images are usually dish images provided by the dining place, such as a restaurant, where dish recognition is required; because dishes of the same category cooked by different restaurants may differ considerably, dish images should be collected from that restaurant and subjected to corresponding image preprocessing, such as cropping, rotation and denoising, to form specific dish training images of uniform size and resolution for training a model that recognizes the specific dish categories.
The recognition module 820 is further adapted to perform model training based on the pre-acquired second dish image data set and the first dish category recognition model, and to that end is adapted to input, for each piece of specific dish image data, the specific dish training image into the first dish category recognition model for processing to obtain the training image features output by the bottleneck layer in the first dish category recognition model, to input those training image features into the second dish category recognition model to obtain the second category recognition result output by the second dish category recognition model for the dish in the specific dish training image, and to adjust the parameters of the second dish category recognition model based on the difference between the category of the dish in the specific dish training image and the second category recognition result.
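A sketch of this training of the second dish category recognition model is given below; it reuses the hypothetical bottleneck-feature extractor from the earlier sketch and fits a scikit-learn SVM, with all data-loading details left as placeholders.

```python
# Sketch: train the second dish category recognition model (here an SVM)
# on bottleneck features of restaurant-specific dish images. The model and
# extract_bottleneck_features come from the earlier sketches.
import numpy as np
from sklearn.svm import SVC

def train_second_model(model, specific_dish_images, dish_labels):
    # 1. Forward the specific dish training images through the first model
    #    and collect the bottleneck-layer outputs as training features.
    features = extract_bottleneck_features(model, specific_dish_images).numpy()
    # 2. Fit the SVM so its output indicates the dish category; fitting
    #    adjusts its parameters from label-versus-prediction differences.
    svm = SVC(kernel="linear")
    svm.fit(features, np.asarray(dish_labels))
    return svm
```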
Specific steps and embodiments for dish identification are disclosed in detail in the descriptions based on fig. 5A to 6B, and are not repeated here.
Fig. 10 shows a schematic diagram of a cashier device 900 according to an embodiment of the invention. As shown in fig. 10, the cashier device 900 includes a first acquisition module 910, a feature extraction module 920, an identification module 930, a second acquisition module 940, and a calculation module 950.
The first obtaining module 910 is adapted to obtain one or more dish images corresponding to the current order, where the dish images contain the corresponding dishes.
The feature extraction module 920 is adapted to input each dish image into the first dish classification model for processing, so as to obtain the image features output by the bottleneck layer in the first dish classification model, where the bottleneck layer includes all processing layers before the last processing layer in the first dish classification model.
According to one embodiment of the invention, the first dish category recognition model is determined by fine-tuning a pre-trained category recognition model. The feature extraction module 920 is further adapted to perform the fine-tuning based on the pre-trained category recognition model, and to that end is adapted to modify the output number of the classifier in the pre-trained category recognition model according to the number of dish categories corresponding to the first dish image data set, correspondingly load the parameters of all processing layers before the last processing layer in the pre-trained category recognition model into the modified category recognition model to generate the first dish category recognition model, and train the first dish category recognition model based on the first dish image data set, so that the output of the first dish category recognition model indicates the category of the dish in an input dish image. Here the category recognition model is a model trained based on a pre-acquired image data set such that its output indicates the category of the image content in an input image.
The identification module 930 is adapted to input the image features into a second dish category identification model for identification to obtain the category of the dish in the dish image.
The second obtaining module 940 is adapted to obtain a price of the dish according to a category of the dish in the dish image.
The calculating module 950 is adapted to calculate a billing amount corresponding to the current order based on the price of the dishes in each of the dish images and the number of the dish images.
Specific steps and embodiments of cashing are disclosed in detail in the descriptions based on fig. 2 and 7, and are not repeated here.
Fig. 11 shows a schematic diagram of a dish ordering device 1000 according to an embodiment of the invention. As shown in fig. 11, the dish ordering device 1000 includes an acquisition module 1010, a feature extraction module 1020, an identification module 1030, a determination module 1040, and a sending module 1050.
The acquisition module 1010 is adapted to obtain one or more dish images corresponding to the current order, where the dish images contain the corresponding dishes.
The feature extraction module 1020 is adapted to input each dish image into the first dish classification model for processing, so as to obtain image features output by a bottleneck layer in the first dish classification model, wherein the bottleneck layer comprises all processing layers before the last processing layer in the first dish classification model.
According to one embodiment of the invention, the first dish category recognition model is determined by fine-tuning a pre-trained category recognition model. The feature extraction module 1020 is further adapted to perform the fine-tuning based on the pre-trained category recognition model, and to that end is adapted to modify the output number of the classifier in the pre-trained category recognition model according to the number of dish categories corresponding to the first dish image data set, correspondingly load the parameters of all processing layers before the last processing layer in the pre-trained category recognition model into the modified category recognition model to generate the first dish category recognition model, and train the first dish category recognition model based on the first dish image data set, so that the output of the first dish category recognition model indicates the category of the dish in an input dish image. Here the category recognition model is a model trained based on a pre-acquired image data set such that its output indicates the category of the image content in an input image.
The recognition module 1030 is adapted to input the image features into the second dish category recognition model for recognition, so as to obtain the category of the dish in each dish image.
The determining module 1040 is adapted to determine, according to the category of the dish in the dish image, whether serving of the dish has timed out.
The sending module 1050 is adapted to send a dish ordering message to the corresponding client when serving of a dish has timed out.
According to one embodiment of the present invention, the sending module 1050 is further adapted to count, according to the category of the dish in the dish image, the stock quantity of the food materials corresponding to the dish, to send a replenishment message to the client when the stock quantity is lower than the preset food material quantity, and to send a dish making message to the client when the stock quantity is not lower than the preset food material quantity.
Specific steps and embodiments of dish ordering are disclosed in detail in the descriptions based on fig. 3 and 8, and are not repeated here.
A conventional dish identification method is generally realized with an ordinary deep neural network: to guarantee accuracy, a large number of actually used dish samples must be collected for network training, and the complex network structure leads to low identification efficiency. In the dish identification scheme provided by the embodiments of the invention, forward inference is performed on the dish image to be identified using the first dish category recognition model, and the image features of the bottleneck layer are extracted and input into the second dish category recognition model to determine the category of the dish. The first dish category recognition model is determined by fine-tuning a pre-trained category recognition model; since the category recognition model is trained on a large-scale image data set and therefore has high recognition capability, the first dish category recognition model starts from a good initial network structure. Migration training of the first dish category recognition model is then completed with a general first dish image data set, further improving recognition performance on dishes. Considering that dishes prepared from the same raw materials in different restaurants may differ greatly, a small number of dish samples from each restaurant are collected to form a specific second dish image data set, which is combined with the first dish category recognition model to train the second dish category recognition model, thereby obtaining more accurate recognition capability. The expected dish recognition effect can thus be achieved without collecting a large number of actually used dish samples, greatly saving development time and cost.
Further, the cashing scheme and the dish ordering scheme provided on the basis of this dish identification can, while guaranteeing the accuracy and speed of dish category recognition, realize fast and accurate bill settlement, and can promptly notify the kitchen to speed up preparation of dishes whose serving has timed out, thereby improving the dining experience of diners.
In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be construed as reflecting the intention that: i.e., the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
Those skilled in the art will appreciate that the modules or units or groups of devices in the examples disclosed herein may be arranged in a device as described in this embodiment, or alternatively may be located in one or more devices different from the devices in this example. The modules in the foregoing examples may be combined into one module or may be further divided into a plurality of sub-modules.
Those skilled in the art will appreciate that the modules in the apparatus of the embodiments may be adaptively changed and disposed in one or more apparatuses different from the embodiments. The modules or units or groups of embodiments may be combined into one module or unit or group, and furthermore they may be divided into a plurality of sub-modules or sub-units or groups. Any combination of all features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or units of any method or apparatus so disclosed, may be used in combination, except insofar as at least some of such features and/or processes or units are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings), may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Furthermore, those skilled in the art will appreciate that while some embodiments described herein include some features but not others included in other embodiments, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the following claims, any of the claimed embodiments can be used in any combination.
Furthermore, some of the embodiments are described herein as methods or combinations of method elements that may be implemented by a processor of a computer system or by other means of performing the functions. Thus, a processor with the necessary instructions for implementing the described method or method element forms a means for implementing the method or method element. Furthermore, the elements of the apparatus embodiments described herein are examples of the following apparatus: the apparatus is for carrying out the functions performed by the elements for carrying out the objects of the invention.
The various techniques described herein may be implemented in connection with hardware or software or, alternatively, with a combination of both. Thus, the methods and apparatus of the present invention, or certain aspects or portions of the methods and apparatus of the present invention, may take the form of program code (i.e., instructions) embodied in tangible media, such as floppy diskettes, CD-ROMs, hard drives, or any other machine-readable storage medium, wherein, when the program is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the invention.
In the case of program code execution on programmable computers, the computing device will generally include a processor, a storage medium readable by the processor (including volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device. Wherein the memory is configured to store program code; the processor is configured to execute the dish identification method, the cashing method and/or the dish ordering method of the present invention according to instructions in said program code stored in the memory.
By way of example, and not limitation, computer readable media include computer storage media and communication media. Computer storage media store information such as computer readable instructions, data structures, program modules, or other data. Communication media typically embody computer readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism and include any information delivery media. Combinations of any of the above are also included within the scope of computer readable media.
As used herein, unless otherwise specified, the use of the ordinal terms "first," "second," "third," etc., to describe a general object merely denotes different instances of like objects, and is not intended to imply that the objects so described must have a given order, whether temporally, spatially, in ranking, or in any other manner.
While the invention has been described with respect to a limited number of embodiments, those skilled in the art, having benefit of the above description, will appreciate that other embodiments are contemplated within the scope of the invention as described herein. Furthermore, it should be noted that the language used in the specification has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the inventive subject matter. Accordingly, many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the appended claims. The disclosure of the present invention is intended to be illustrative, but not limiting, of the scope of the invention, which is defined by the appended claims.

Claims (24)

1. A method of dish identification for a particular dining venue, comprising:
inputting a dish image to be identified into a first dish type identification model for processing so as to acquire image characteristics output by a bottleneck layer in the first dish type identification model, wherein the bottleneck layer comprises all processing layers before the last processing layer in the first dish type identification model, the first dish type identification model is determined based on a pre-trained type identification model and adopts a first dish image data set for fine adjustment processing, and the first dish image data set is a general dish image data set;
inputting the image characteristics into a second dish category recognition model for recognition so as to acquire the category of the dish in the dish image, wherein the second dish category recognition model is model-trained based on a pre-acquired second dish image data set and the first dish category recognition model, so that the output of the second dish category recognition model indicates the category of the dish in the input dish image, and the second dish image data set is the dish image data set of the specific dining place.
2. The method of claim 1, wherein the fine tuning based on the pre-trained class identification model comprises:
modifying the output number of the classifiers in the pre-trained class identification model according to the number of the classes of the dishes corresponding to the first dish image data set;
correspondingly loading parameters of all processing layers before the last processing layer in the pre-trained category identification model into the modified category identification model to generate the first dish category identification model;
training the first dish category recognition model based on the first set of dish image data so that output of the first dish category recognition model indicates a category of dishes in the input dish image.
3. The method of claim 1 or 2, wherein the category identification model is a model trained based on a pre-acquired set of image data such that an output of the category identification model is indicative of a category of image content in the input image.
4. A method as claimed in claim 3, the set of image data comprising a plurality of pieces of image data, each piece of image data comprising a training image and a category of image content in the training image, the model training based on the set of pre-acquired image data comprising:
for each piece of image data in the image data set, taking a training image in the image data as input, and inputting the training image into the category recognition model to obtain a first category recognition result of the training image output by the category recognition model;
and adjusting parameters of the category identification model based on the difference between the category of the image content in the training image and the first category identification result.
5. The method of claim 3 or 4, wherein the image data set is an ImageNet data set.
6. The method of claim 2, wherein the first set of dish image data comprises a plurality of pieces of dish image data, each piece of dish image data comprising a dish training image and a category of a dish in the dish training image, the training the first dish category recognition model based on the first set of dish image data comprising:
Inputting the dish training images in the dish image data as input to the first dish type recognition model for each piece of dish image data in the first dish image data set to obtain a first dish type recognition result of the dish training images, which is output by the first dish type recognition model;
and adjusting parameters of the first dish category identification model based on the difference between the category of the dish in the dish training image and the first dish category identification result.
7. The method of claim 6, wherein the adjusting parameters of the first item category identification model comprises:
and adjusting parameters of a plurality of processing layers close to an output end in the first dish category identification model.
8. The method of claim 1, wherein the second set of dish image data comprises a plurality of pieces of specific dish image data, each piece of specific dish image data comprising a specific dish training image and a category of dishes in the specific dish training image, the model training based on the pre-acquired second set of dish image data, and the first dish category recognition model comprising:
Inputting specific dish training images in the specific dish image data into the first dish type identification model for processing to obtain training image features output by a bottleneck layer in the first dish type identification model;
inputting the training image features as input into the second dish category recognition model to obtain a second category recognition result, output by the second dish category recognition model, corresponding to the dish in the specific dish training image;
and adjusting parameters of the second dish category recognition model based on the category of the dishes in the specific dish training image and the second category recognition result.
9. The method of claim 1, wherein the first dish category identification model comprises a deep neural network comprising a plurality of processing layers.
10. The method of claim 9, wherein the deep neural network is a convolutional neural network and the processing layer is any one of a convolutional layer, a pooling layer, and a fully-connected layer.
11. The method of claim 1, wherein the second dish category identification model comprises a support vector machine model.
12. A cashier method for a specific dining venue, comprising:
acquiring one or more dish images corresponding to the current order, wherein the dish images comprise corresponding dishes;
inputting each dish image into a first dish type recognition model respectively for processing so as to acquire image characteristics output by a bottleneck layer in the first dish type recognition model, wherein the bottleneck layer comprises all processing layers before the last processing layer in the first dish type recognition model, and the first dish type recognition model is determined based on a pre-trained type recognition model and by adopting a first dish image data set for fine adjustment processing;
inputting the image characteristics into a second dish category identification model for identification so as to acquire the categories of dishes in the dish image;
acquiring the price of the dishes according to the categories of the dishes in the dishes image;
and calculating the bill amount corresponding to the current order based on the price of the dishes in each dish image and the number of the dish images, wherein the second dish type recognition model is based on a second dish image data set acquired in advance, and the first dish type recognition model is used for model training, so that the output of the second dish type recognition model indicates the type of the dishes in the input dish image, and the second dish image data set is the dish image data set of the specific dining place.
13. The method of claim 12, wherein the fine tuning based on the pre-trained class identification model comprises:
modifying the output number of the classifiers in the pre-trained class identification model according to the number of the classes of the dishes corresponding to the first dish image data set;
correspondingly loading parameters of all processing layers before the last processing layer in the pre-trained category identification model into the modified category identification model to generate the first dish category identification model;
training the first dish category recognition model based on the first set of dish image data so that output of the first dish category recognition model indicates a category of dishes in the input dish image.
14. The method of claim 12 or 13, wherein the category identification model is a model trained based on a pre-acquired set of image data such that an output of the category identification model is indicative of a category of image content in the input image.
15. A method for ordering dishes at a specific dining place, comprising:
acquiring one or more dish images corresponding to the current order, wherein the dish images comprise corresponding dishes;
Inputting each dish image into a first dish type recognition model respectively for processing so as to acquire image characteristics output by a bottleneck layer in the first dish type recognition model, wherein the bottleneck layer comprises all processing layers before the last processing layer in the first dish type recognition model, and the first dish type recognition model is determined based on a pre-trained type recognition model and by adopting a first dish image data set for fine adjustment processing;
inputting the image characteristics into a second dish category recognition model for recognition so as to acquire the category of the dish in the dish image, wherein the second dish category recognition model is model-trained based on a pre-acquired second dish image data set and the first dish category recognition model, so that the output of the second dish category recognition model indicates the category of the dish in the input dish image, and the second dish image data set is the dish image data set of the specific dining place;
determining, according to the category of the dish in the dish image, whether serving of the dish has timed out;
and if serving of the dish has timed out, sending a dish ordering message to the corresponding client.
16. The method of claim 15, further comprising:
counting the stock quantity of the dishes corresponding to the food materials according to the categories of the dishes in the dishes image;
and if the stock quantity is lower than the preset food material quantity, sending a replenishment message to the client.
17. The method of claim 16, further comprising:
and if the stock quantity is not lower than the preset food material quantity, sending a dish making message to the client.
18. The method of claim 15, wherein the fine tuning based on the pre-trained class identification model comprises:
modifying the output number of the classifiers in the pre-trained class identification model according to the number of the classes of the dishes corresponding to the first dish image data set;
correspondingly loading parameters of all processing layers before the last processing layer in the pre-trained category identification model into the modified category identification model to generate the first dish category identification model;
training the first dish category recognition model based on the first set of dish image data so that output of the first dish category recognition model indicates a category of dishes in the input dish image.
19. The method of claim 15, wherein the category identification model is a model trained based on a pre-acquired set of image data such that an output of the category identification model indicates a category of image content in the input image.
20. A dish identification device for a specific dining place, comprising:
the feature extraction module is suitable for inputting a dish image to be identified into a first dish type identification model for processing so as to obtain image features output by a bottleneck layer in the first dish type identification model, wherein the bottleneck layer comprises all processing layers before the last processing layer in the first dish type identification model;
the identification module is suitable for inputting the image characteristics into a second dish category identification model to identify so as to acquire the categories of dishes in the dish image,
the first dish category recognition model is determined based on a pre-trained category recognition model and is fine-tuned using a first dish image data set, the second dish category recognition model is model-trained based on a pre-acquired second dish image data set and the first dish category recognition model, so that the output of the second dish category recognition model indicates the category of the dish in an input dish image, and the second dish image data set is the dish image data set of the specific dining place.
21. A cashier device for a specific dining venue comprising:
the first acquisition module is suitable for acquiring one or more dish images corresponding to the current order, wherein the dish images comprise corresponding dishes;
the feature extraction module is suitable for inputting each dish image into a first dish type recognition model respectively for processing so as to acquire image features output by a bottleneck layer in the first dish type recognition model, wherein the bottleneck layer comprises all processing layers before the last processing layer in the first dish type recognition model, and the first dish type recognition model is determined based on a pre-trained type recognition model and is subjected to fine adjustment processing by adopting a first dish image data set;
the identification module is suitable for inputting the image characteristics into a second dish category identification model to identify so as to acquire the category of the dish in the dish image, wherein the second dish category identification model is model-trained based on a second dish image data set acquired in advance and the first dish category identification model, so that the output of the second dish category identification model indicates the category of the dish in the input dish image, and the second dish image data set is a dish image data set of the specific dining place;
The second acquisition module is suitable for acquiring the price of the dishes according to the categories of the dishes in the dishes image;
and the calculating module is suitable for calculating the bill amount corresponding to the current order based on the prices of the dishes in the dish images and the quantity of the dish images.
22. A menu ordering device for a specific dining place, comprising:
the acquisition module is suitable for acquiring one or more dish images corresponding to the current order, wherein the dish images comprise corresponding dishes;
the feature extraction module is suitable for inputting each dish image into a first dish type recognition model respectively for processing so as to acquire image features output by a bottleneck layer in the first dish type recognition model, wherein the bottleneck layer comprises all processing layers before the last processing layer in the first dish type recognition model, and the first dish type recognition model is determined based on a pre-trained type recognition model and is subjected to fine adjustment processing by adopting a first dish image data set;
the identification module is suitable for inputting the image characteristics into a second dish category identification model to identify so as to acquire the category of the dish in the dish image, wherein the second dish category identification model is model-trained based on a second dish image data set acquired in advance and the first dish category identification model, so that the output of the second dish category identification model indicates the category of the dish in the input dish image, and the second dish image data set is a dish image data set of the specific dining place;
The determining module is suitable for determining, according to the category of the dish in the dish image, whether serving of the dish has timed out;
and the sending module is suitable for sending a dish ordering message to the corresponding client when serving of the dish has timed out.
23. A computing device, comprising:
one or more processors;
a memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs comprising instructions for performing any of the methods of claims 1-19.
24. A computer readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by a computing device, cause the computing device to perform any of the methods of claims 1-19.
CN201811513717.0A 2018-12-11 2018-12-11 Dish identification method, cashing method, dish ordering method and related devices Active CN111310520B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811513717.0A CN111310520B (en) 2018-12-11 2018-12-11 Dish identification method, cashing method, dish ordering method and related devices


Publications (2)

Publication Number Publication Date
CN111310520A CN111310520A (en) 2020-06-19
CN111310520B true CN111310520B (en) 2023-11-21

Family

ID=71157846

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811513717.0A Active CN111310520B (en) 2018-12-11 2018-12-11 Dish identification method, cashing method, dish ordering method and related devices

Country Status (1)

Country Link
CN (1) CN111310520B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111906780B (en) * 2020-06-30 2022-04-01 上海擎朗智能科技有限公司 Article distribution method, robot and medium
CN114332468A (en) * 2020-09-30 2022-04-12 中国移动通信有限公司研究院 Image processing method, device and storage medium

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1986007175A1 (en) * 1985-05-31 1986-12-04 Ncr Corporation Restaurant control system
WO2002008986A1 (en) * 2000-07-22 2002-01-31 Lee Joon Suk Remote la carte system and a method
JP2003187335A (en) * 2001-12-20 2003-07-04 Nec Yonezawa Ltd Automatic merchandise adjustment system, merchandise adjustment device and merchandise cart
CN103208156A (en) * 2013-02-06 2013-07-17 浙江科技学院 Automatic catering settlement system and automatic catering settlement method
CN103679587A (en) * 2013-12-31 2014-03-26 牛占峰 Ordering system for catering service
CN103679585A (en) * 2013-12-31 2014-03-26 牛占峰 Catering service management system
CN104077842A (en) * 2014-07-02 2014-10-01 浙江大学 Freestyle restaurant self-service payment device based on image identification and application method of device
CN105894403A (en) * 2016-03-29 2016-08-24 江苏木盟智能科技有限公司 Dish crossing-out system
CN106203318A (en) * 2016-06-29 2016-12-07 浙江工商大学 The camera network pedestrian recognition method merged based on multi-level depth characteristic
CN106355248A (en) * 2016-08-26 2017-01-25 深圳先进技术研究院 Deep convolution neural network training method and device
CN107688823A (en) * 2017-07-20 2018-02-13 北京三快在线科技有限公司 A kind of characteristics of image acquisition methods and device, electronic equipment
CN107844790A (en) * 2017-11-15 2018-03-27 上海捷售智能科技有限公司 A kind of vegetable identification and POS and method based on image recognition
CN108256474A (en) * 2018-01-17 2018-07-06 百度在线网络技术(北京)有限公司 For identifying the method and apparatus of vegetable

Also Published As

Publication number Publication date
CN111310520A (en) 2020-06-19

Similar Documents

Publication Publication Date Title
CN110362677B (en) Text data category identification method and device, storage medium and computer equipment
US20210117948A1 (en) Mobile device platform for automated visual retail product recognition
US10097678B2 (en) Apparatus and method for generating group profile
WO2022213465A1 (en) Neural network-based image recognition method and apparatus, electronic device, and medium
Kaur et al. Deep neural network for food image classification and nutrient identification: A systematic review
CN111368893A (en) Image recognition method and device, electronic equipment and storage medium
CN111598487B (en) Data processing and model training method, device, electronic equipment and storage medium
CN111310520B (en) Dish identification method, cashing method, dish ordering method and related devices
CN110675135A (en) Multi-person common payment method, device, medium and electronic equipment
TWI803823B (en) Resource information pushing method, device, server and storage medium
CN111737473B (en) Text classification method, device and equipment
CN109785000A (en) Customer resources distribution method, device, storage medium and terminal
CN110909222A (en) User portrait establishing method, device, medium and electronic equipment based on clustering
CN107103564A (en) Self-help ordering method and terminal device
CN109325518A (en) Classification method, device, electronic equipment and the computer readable storage medium of image
CN108174237A (en) Image combining method and device
CN114119989A (en) Training method and device for image feature extraction model and electronic equipment
CN105190474B (en) The granularity through application control for power efficiency classification
CN115271931A (en) Credit card product recommendation method and device, electronic equipment and medium
US20220351018A1 (en) Deep learning system for dynamic prediction of order preparation times
Pang et al. MS-YOLOv5: a lightweight algorithm for strawberry ripeness detection based on deep learning
Nordin et al. Food image recognition for price calculation using convolutional neural network
CN113761379A (en) Commodity recommendation method and device, electronic equipment and medium
CN113393230A (en) Dish settlement method, device, equipment, medium and system in restaurant
Aguilar et al. Uncertainty modeling and deep learning applied to food image analysis

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant