WO2020215952A1

WO2020215952A1 - Object recognition method and system

Info

Publication number: WO2020215952A1
Application number: PCT/CN2020/080767
Authority: WO
Inventors: 马事伟; 吴江旭; 张伟华; 石海龙; 张洪光; 徐荣图; 胡淼枫; 王璟璟
Original assignee: 北京京东尚科信息技术有限公司; 北京京东世纪贸易有限公司
Priority date: 2019-04-23
Filing date: 2020-03-24
Publication date: 2020-10-29
Also published as: CN111832590B; CN111832590A

Abstract

An object recognition method and system, relating to the field of image recognition. The method comprises: obtaining one or more images to undergo recognition, wherein the one or more images comprise one or more objects to undergo recognition (110); using a trained pre-recognition model, and determining whether or not the probability that the one or more images are clear and contain a complete object is greater than a threshold value (120); and if the probability is greater than the threshold value, identifying the type of each of the one or more objects (130). The method can improve the accuracy and efficiency of object recognition.

Description

Item identification method and system

Cross references to related applications

This application is based on the application with the CN application number 201910325808.X and the filing date of April 23, 2019, and claims its priority. The disclosure of the CN application is hereby incorporated into this application as a whole.

Technical field

The present disclosure relates to the field of image recognition, and in particular to an item recognition method and system.

Background technique

In the restaurant settlement system, the dishes need to be identified first, and then settled according to the prices corresponding to the dishes. In related technologies, dishes can be identified based on computer vision technology. For example, the sensor triggers the image acquisition device to take a picture of the dish, and then recognizes the dish; or, processes each image collected by the image acquisition device to identify the dish information in the image.

Summary of the invention

According to one aspect of the present disclosure, an item identification method is proposed, which includes: acquiring one or more images to be identified, wherein the images to be identified include one or more items to be identified; using a trained pre-recognition model to determine whether the probability that an image to be identified is clear and contains a complete item is greater than a threshold; and if the probability is greater than the threshold, identifying the category of each item.

In some embodiments, for multiple consecutive images to be identified, if the probability that the first image is clear and contains a complete item is greater than a first threshold, and the probability that each of the other images is clear and contains a complete item is greater than a second threshold, category recognition is performed on the items contained in the first of the consecutive images.

In some embodiments, training the pre-recognition model includes: labeling sample images that are clear and contain complete items as positive sample images, and labeling all other sample images as negative sample images; and training the pre-recognition model based on the positive sample images and the negative sample images, so that the trained pre-recognition model can be used to determine whether the probability that an image to be recognized is clear and contains a complete item is greater than the threshold.

In some embodiments, identifying the category of each item includes: inputting the image to be recognized into an item detection model, and extracting the area information and the first-level category corresponding to each item in the image; determining the valid categories among the first-level categories; and inputting the area information of the items belonging to a valid category into an item recognition model, extracting the item feature corresponding to each piece of area information, and comparing each extracted item feature with the item features in an item feature library to determine the second-level category of each item in the image to be recognized.

In some embodiments, training the item detection model and the item recognition model includes: labeling the area information and the first-level category corresponding to each item in the sample images to generate first annotation information, and training the item detection model based on the sample images and the first annotation information, so that the trained item detection model can determine the area information and the first-level category corresponding to each item in the image to be recognized; and labeling the item features corresponding to the area information of valid-category items in the sample images to generate second annotation information, and training the item recognition model based on the sample images and the second annotation information, so that the trained item recognition model can extract the item feature corresponding to the area information of each item in the image to be recognized.

In some embodiments, the item features in the item feature library that are valid within a predetermined time are determined; and the item feature corresponding to each piece of area information is compared with the valid item features in the item feature library to determine the second-level category of each item in the image to be identified.

In some embodiments, the minimum distance between the item feature of each item and the item features in the item feature library is determined; if the minimum distance is less than or equal to a distance threshold, the category corresponding to the library feature closest to the item's feature is taken as the second-level category of the item; if the minimum distance is greater than the distance threshold, the user is prompted as to whether item category and attribute information needs to be input; and if it needs to be input, the item category and attribute information is added; otherwise, the category corresponding to the library feature closest to the item's feature is taken as the second-level category of the item.

In some embodiments, the corresponding attribute information is matched according to the category of each item.

In some embodiments, after the attribute information is matched, in response to the user modifying the attribute information corresponding to the category of an item, the image to be recognized is marked as a training image or a test image, so that the item detection model and the item recognition model can be trained or tested based on that image.

In some embodiments, the size information of each item is determined based on the item detection model, and the corresponding attribute information is matched according to the category and size of each item; it is determined whether multiple items in the image to be identified form an item combination, and if they do, the attribute information corresponding to the item combination is matched; it is determined whether the sum of the attributes corresponding to multiple items in the image to be identified meets a preset condition, and if it does, the attribute sum is processed according to the preset condition; and the matching time at which attribute information is matched to an item is determined, and the attribute information corresponding to each item is determined according to the matching time.

According to another aspect of the present disclosure, an item identification system is also provided, including: an image acquisition module configured to acquire one or more images to be identified, wherein the images to be identified include one or more items to be identified; a pre-recognition module configured to use a trained pre-recognition model to determine whether the probability that an image to be recognized is clear and contains a complete item is greater than a threshold; and an item determination module configured to identify the category of each item when the probability is greater than the threshold.

In some embodiments, the pre-recognition module is further configured to: for multiple consecutive images, if the probability that the first image is clear and contains a complete item is greater than a first threshold, and the probability that each of the other images is clear and contains a complete item is greater than a second threshold, send the first of the consecutive images to the item determination module; and the item determination module is configured to perform category recognition on the items contained in the first of the consecutive images.

In some embodiments, the pre-recognition module is further configured to label sample images that are clear and contain complete items as positive sample images, and to label all other sample images as negative sample images; and to train the pre-recognition model based on the positive sample images and the negative sample images, so that the trained pre-recognition model can be used to determine whether the probability that an image to be recognized is clear and contains a complete item is greater than the threshold.

In some embodiments, the item determination module includes: an item detection module configured to input the image to be recognized into the item detection model, and extract the area information and the first-level category corresponding to each item in the image based on the item detection model; an item management module configured to determine the valid categories among the first-level categories; and an item recognition module configured to input the area information of the items belonging to a valid category into the item recognition model, extract the item feature corresponding to each piece of area information based on the item recognition model, and compare each extracted item feature with the item features in the item feature library to determine the second-level category of each item in the image to be recognized.

In some embodiments, the item management module is configured to determine the valid categories among the first-level categories; the item detection module is configured to input the image to be recognized into the item detection model, extract the area information and the first-level category corresponding to each item in the image, call the item management module, and input the area information of the items belonging to a valid category into the item recognition module; and the item recognition module is further configured to annotate the item features corresponding to the area information of valid-category items in the sample images to generate second annotation information, and to train the item recognition model based on the sample images and the second annotation information, so that the trained item recognition model can extract the item feature corresponding to the area information of each item in the image to be recognized.

In some embodiments, the item management module is configured to determine the item features in the item feature library that are valid within a predetermined time; and the item recognition module is further configured to compare the item feature corresponding to each piece of area information with the valid item features in the item feature library to determine the second-level category of each item in the image to be identified.

In some embodiments, the item recognition module is configured to determine the minimum distance between the item feature of each item and the item features in the item feature library; if the minimum distance is less than or equal to a distance threshold, take the category corresponding to the library feature closest to the item's feature as the second-level category of the item; if the minimum distance is greater than the distance threshold, prompt the user as to whether item category and attribute information needs to be input; and if it needs to be input, add the item category and attribute information; otherwise, take the category corresponding to the library feature closest to the item's feature as the second-level category of the item.

In some embodiments, an attribute matching unit is configured to match corresponding attribute information according to the category of each item.

In some embodiments, the item management module is further configured to, after the attribute information is matched, in response to the user modifying the attribute information corresponding to the category of an item, mark the image to be recognized as a training image or a test image, so that the item detection model and the item recognition model can be trained or tested based on that image.

In some embodiments, the attribute matching unit is further configured to: match corresponding attribute information according to the category and size of each item, wherein the item detection module is further configured to determine the size information of each item based on the item detection model; determine whether multiple items in the image to be recognized form an item combination, and if they do, match the attribute information corresponding to the item combination; determine whether the sum of the attributes corresponding to multiple items in the image to be recognized meets a preset condition, and if it does, process the attribute sum according to the preset condition; and determine the matching time at which attribute information is matched to an item, and determine the attribute information corresponding to each item according to the matching time.

According to another aspect of the present disclosure, an item identification system is also provided, including: a memory; and a processor coupled to the memory, the processor configured to execute the above-mentioned item identification method based on instructions stored in the memory.

According to another aspect of the present disclosure, a non-transitory computer-readable storage medium is also provided, on which computer program instructions are stored, and when the instructions are executed by a processor, the above-mentioned item identification method is realized.

Through the following detailed description of exemplary embodiments of the present disclosure with reference to the accompanying drawings, other features and advantages of the present disclosure will become clear.

Description of the drawings

The drawings constituting a part of the specification describe the embodiments of the present disclosure, and together with the specification, serve to explain the principle of the present disclosure.

With reference to the accompanying drawings, the present disclosure can be understood more clearly according to the following detailed description, in which:

FIG. 1 is a schematic flowchart of some embodiments of the object identification method of the present disclosure.

FIG. 2 is a schematic flowchart of other embodiments of the object identification method of the present disclosure.

FIG. 3 is a schematic structural diagram of some embodiments of the object identification system of the present disclosure.

FIG. 4 is a schematic structural diagram of other embodiments of the object identification system of the present disclosure.

FIG. 5 is a schematic structural diagram of other embodiments of the object identification system of the present disclosure.

FIG. 6 is a schematic structural diagram of other embodiments of the object identification system of the present disclosure.

FIG. 7 is a schematic structural diagram of other embodiments of the object identification system of the present disclosure.

FIG. 8 is a schematic structural diagram of other embodiments of the object identification system of the present disclosure.

Detailed description

Various exemplary embodiments of the present disclosure will now be described in detail with reference to the accompanying drawings. It should be noted that unless specifically stated otherwise, the relative arrangement, numerical expressions and numerical values of the components and steps set forth in these embodiments do not limit the scope of the present disclosure.

At the same time, it should be understood that, for ease of description, the sizes of the various parts shown in the drawings are not drawn in accordance with actual proportional relationships.

The following description of at least one exemplary embodiment is actually only illustrative, and in no way serves as any limitation to the present disclosure and its application or use.

The technologies, methods, and equipment known to those of ordinary skill in the relevant fields may not be discussed in detail, but where appropriate, they should be regarded as part of the specification.

In all examples shown and discussed here, any specific value should be interpreted as merely exemplary, rather than as a limitation. Therefore, other examples of the exemplary embodiment may have different values.

It should be noted that similar reference numerals and letters indicate similar items in the following drawings, so once a certain item is defined in one drawing, it does not need to be further discussed in subsequent drawings.

In order to make the objectives, technical solutions, and advantages of the present disclosure clearer, the following further describes the present disclosure in detail with reference to specific embodiments and drawings.

The inventor found that using a sensor to trigger an image acquisition device to photograph a dish before identifying it requires an additional sensor, which increases cost. Moreover, debris in the recognition area can cause false triggering. In addition, after the customer places the tray in the recognition area, the sensor needs some time to respond before its state stabilizes, so there is a trigger delay that affects the customer experience.

Recognizing every collected image, by contrast, places higher computational demands on the server, and recognizing all images indiscriminately also reduces recognition accuracy.

In step 110, one or more images to be identified are acquired, where the images to be identified include one or more objects to be identified. For example, in a restaurant, when a customer purchases a dish and a bowl of rice, the dish and rice can be placed in the recognition area, and the recognition area can be photographed by a camera to obtain an image containing the dish and rice.

In step 120, the trained pre-recognition model is used to determine whether the probability that the image to be recognized is clear and contains a complete item is greater than a threshold.

In some embodiments, the pre-recognition model may be trained in advance: a certain number of sample images are collected and classified. For example, sample images that are clear and contain complete items are annotated as positive sample images, all other sample images are annotated as negative sample images, and the pre-recognition model is trained based on the positive and negative sample images. The output of the pre-recognition model is compared with the annotation of the corresponding sample to determine whether the comparison result meets the requirement of the loss function of the pre-recognition model. The parameters of the pre-recognition model are optimized and adjusted iteratively until the comparison result meets the requirement of the loss function, and the pre-recognition model is then saved.

For example, a customer purchases a dish and a bowl of rice, first places them on a tray, and then places the tray in the recognition area. While the customer is placing the tray, the tray is moving. If the tray has not completely entered the recognition area when the image acquisition device captures an image, the tray in the image is incomplete: some dishes are not captured at all, or only a small part of a dish is captured, which affects the accuracy of subsequent recognition. In addition, images captured while the tray is moving suffer from motion blur, which also affects the accuracy of subsequent recognition. Therefore, invalid images are excluded, and dishes are recognized only in images that are clear and contain the complete tray.

During the normal customer settlement process, a certain number of images of the recognition area are collected, including images with no tray in the area, images of the tray just entering the recognition area, images of the tray halfway into the recognition area, and images of the tray completely inside the recognition area. The images are then classified: images that are clear and show the tray completely inside the recognition area are labeled as positive sample images, all other images are labeled as negative sample images, and the pre-recognition model is trained based on the positive and negative sample images. If the customer does not use a tray, images that are clear and contain complete dishes, beverages, and other priced goods are labeled as positive sample images, the other images are labeled as negative sample images, and the pre-recognition model is then trained.
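
The labeling rule described above can be sketched as follows. This is only an illustrative sketch: the `SampleImage` type, its `is_clear` and `is_complete` fields, and the file names are hypothetical and stand in for whatever manual annotation is actually used.

```python
from dataclasses import dataclass


@dataclass
class SampleImage:
    name: str
    is_clear: bool      # hypothetical annotation: no motion blur
    is_complete: bool   # hypothetical annotation: tray (or items) fully inside the area


def label_sample(sample: SampleImage) -> int:
    """Return 1 for a positive sample (clear AND complete), 0 for a negative one."""
    return 1 if sample.is_clear and sample.is_complete else 0


# Hypothetical images mirroring the four situations listed in the text.
samples = [
    SampleImage("no_tray.jpg", is_clear=True, is_complete=False),
    SampleImage("tray_entering.jpg", is_clear=False, is_complete=False),
    SampleImage("tray_half_in.jpg", is_clear=True, is_complete=False),
    SampleImage("tray_fully_in.jpg", is_clear=True, is_complete=True),
]
labels = {s.name: label_sample(s) for s in samples}
```

Only the image of the tray fully inside the recognition area is labeled positive; every other image becomes a negative sample.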

In step 130, when the probability is greater than the threshold, the category of each item is identified. That is, in this embodiment, instead of performing item category recognition on all images, it first determines whether the image meets the requirements, and performs item recognition on the images that meet the requirements.

In the foregoing embodiment, item recognition is performed only on images for which the probability of being clear and containing a complete item is greater than the threshold, rather than on all images, which can improve the accuracy and efficiency of the recognition system.
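
A minimal sketch of this gating step is given below. The two callables `pre_recognize` and `recognize_items` are hypothetical placeholders for the trained pre-recognition model and the downstream item recognition, and the default threshold of 0.5 is an assumption, not a value from the disclosure.

```python
from typing import Any, Callable, List, Optional


def recognize_if_valid(
    image: Any,
    pre_recognize: Callable[[Any], float],
    recognize_items: Callable[[Any], List[str]],
    threshold: float = 0.5,  # assumed value for illustration
) -> Optional[List[str]]:
    """Run item recognition only when the image passes pre-recognition.

    Returns the recognized categories, or None when the image is rejected.
    """
    probability = pre_recognize(image)  # P(image is clear and contains a complete item)
    if probability > threshold:
        return recognize_items(image)
    return None  # invalid image: skip the expensive recognition step
```

Rejected images never reach the recognition models, which is where the reduction in processing load comes from.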

In some embodiments, in order to further reduce the burden of image processing, if, among multiple consecutive images to be recognized, the probability that each image is clear and contains a complete item is greater than the threshold, category recognition is performed only on the items contained in the first of the consecutive images.

In some embodiments, in order to further reduce the burden of image processing and improve the stability of the system, if, among multiple consecutive images to be recognized, the probability that the first image is clear and contains a complete item is greater than a first threshold, and the probability that each of the other images is clear and contains a complete item is greater than a second threshold, category recognition is performed on the items contained in the first of the consecutive images.

For example, the probability that the current image is clear and contains a complete item is calculated. When this probability for the first image is greater than 0.9, the items contained in the first image are identified. When the probability that each of the second to Nth images is clear and contains a complete item is greater than 0.1, the second to Nth images are not processed.
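
The two-threshold rule can be sketched as a small frame selector. The function name `select_frames` is hypothetical, and the reset behavior (a run of consecutive images ends once the probability falls to the second threshold or below) is an assumed reading of the text rather than something the disclosure states explicitly.

```python
from typing import Iterable, List


def select_frames(
    probabilities: Iterable[float],
    first_threshold: float = 0.9,   # example value from the text
    second_threshold: float = 0.1,  # example value from the text
) -> List[int]:
    """Return the indices of the frames that should be sent to item recognition."""
    selected: List[int] = []
    in_run = False  # True while an already-recognized placement is assumed to be in view
    for index, p in enumerate(probabilities):
        if in_run and p > second_threshold:
            continue  # same placement as the recognized first frame: do not process
        # Either no run is active or the current run has just ended (assumption).
        in_run = p > first_threshold
        if in_run:
            selected.append(index)  # first frame of a new run: recognize it
    return selected


frames = select_frames([0.05, 0.5, 0.95, 0.97, 0.6, 0.3, 0.05, 0.92])
```

With the probabilities above, only frames 2 and 7 are selected: frame 2 starts a run, frames 3 to 5 stay above the second threshold and are skipped, frame 6 ends the run, and frame 7 starts a new one. Using a low second threshold keeps brief dips in the probability from triggering a second recognition of the same tray.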

In step 210, one or more images to be identified are acquired, where the images to be identified include one or more items to be identified.

In step 220, the trained pre-recognition model is used to determine whether the probability that the image to be recognized is clear and contains a complete item is greater than a threshold.

In step 230, when the probability is greater than the threshold, the image to be recognized is input to the item detection model, and the region information and the first level category corresponding to each item in the image to be recognized are extracted based on the item detection model. The first-level category refers to the category to which the item belongs, such as dishes, fruits, and beverages.

In some embodiments, the item detection model can be pre-trained: the area information and the first-level category corresponding to each item in the sample images are annotated to generate first annotation information, and the item detection model is trained based on the sample images and the first annotation information. The output of the item detection model is compared with the first annotation information to determine whether the comparison result meets the requirement of the loss function of the item detection model. The parameters of the item detection model are optimized and adjusted iteratively until the comparison result meets the requirement of the loss function, and the item detection model is then saved.

In a restaurant, the recognition area may contain priced items such as dishes, yogurt, fruits, and beverages, and may also contain non-priced items such as keys, badges, wallets, mobile phones, chopsticks, spoons, and hands. Therefore, the category of each item in the image can be determined first so that invalid categories can be removed.

When training the item detection model, mark the items in the collected images as categories such as dishes, yogurt, fruits, drinks, keys, badges, wallets, mobile phones, chopsticks, spoons, and hands. Then input the image to the item detection model for training. After the item detection model is trained, when an image is input, the item detection model can output the area information and category information of each item in the image.

In step 240, the valid categories among the first-level categories are determined according to configuration information. For example, non-priced items may be mistaken for priced items. Therefore, the categories of non-priced items are removed and only the categories of priced items are kept, to avoid misidentification.
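
A minimal sketch of this filtering step follows. The `Detection` type, the box layout, and the contents of `VALID_CATEGORIES` are assumptions for illustration; the disclosure only says that valid categories come from configuration information.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass
class Detection:
    box: Tuple[int, int, int, int]  # assumed layout: (x_min, y_min, x_max, y_max)
    category: str                   # first-level category from the item detection model


# Hypothetical configuration: only priced categories are valid.
VALID_CATEGORIES: Set[str] = {"dish", "yogurt", "fruit", "beverage"}


def keep_valid(
    detections: Iterable[Detection],
    valid_categories: Set[str] = VALID_CATEGORIES,
) -> List[Detection]:
    """Drop detections whose first-level category is not configured as valid."""
    return [d for d in detections if d.category in valid_categories]


detections = [
    Detection((10, 10, 120, 110), "dish"),
    Detection((130, 20, 170, 60), "keys"),
    Detection((200, 15, 260, 90), "beverage"),
    Detection((0, 0, 40, 200), "hand"),
]
valid = keep_valid(detections)
```

Only the dish and beverage regions are passed on to the item recognition model; the keys and the hand are discarded before any feature extraction.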

In step 250, the area information of the items belonging to a valid category in the image to be recognized is input into the item recognition model, the item feature corresponding to each piece of area information is extracted based on the item recognition model, and each extracted item feature is compared with the item features in the item feature library to determine the second-level category of each item in the image to be recognized. The second-level category corresponds to the specific identity of the item, for example, whether a certain dish is fried green peppers or fried cabbage.

In some embodiments, the item features corresponding to the area information of valid-category items in the sample images are annotated to generate second annotation information, and the item recognition model is trained based on the sample images and the second annotation information. The output of the item recognition model is compared with the second annotation information to determine whether the comparison result meets the requirement of the loss function of the item recognition model. The parameters of the item recognition model are optimized and adjusted iteratively until the comparison result meets the requirement of the loss function, and the item recognition model is then saved.

For example, to register a dish to be sold, an image of the dish is first collected and input into the item detection model, which outputs the area information and category of the dish. The dish feature corresponding to the area information of the dish is then annotated, and the image and the annotation information are input into the item recognition model to train it. The dish features are stored in the feature library. When the area information corresponding to a dish is input into the item recognition model, the item recognition model calls the feature library, compares the output dish feature with the dish features stored in the feature library, and identifies the specific information corresponding to the dish, for example, whether the dish is fried cabbage or fried green pepper.

In the above embodiment, the images are pre-recognized first and unqualified images are removed, and the categories of the items in the remaining images are then identified. Invalid categories are removed, item features are extracted only for the area information of items belonging to valid categories, and specific items are identified based on those features, which improves the accuracy of item identification.

In some other embodiments of the present disclosure, the item features in the item feature library that are valid within a predetermined time are determined; the item feature corresponding to each piece of area information is compared with the valid item features in the item feature library to determine the second-level category of each item.

For example, the item feature library stores the features of each dish for each period, but the vegetables that make up a certain dish may differ slightly between seasons, and certain dishes may not be on sale during certain periods. Therefore, the features of dishes not currently on sale can be set as invalid features, and the features of dishes currently on sale as valid features. When identifying a dish, the feature of the dish to be identified is compared with the valid dish features in the feature library to determine the specific dish.
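
One way to sketch the valid-feature selection is to attach a sale period to each library entry and filter by date. The `FeatureEntry` type and its `on_sale_from` and `on_sale_until` fields are hypothetical; the disclosure does not specify how validity within the predetermined time is stored.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Sequence


@dataclass
class FeatureEntry:
    category: str              # second-level category, e.g. a specific dish
    feature: Sequence[float]   # feature vector stored at registration time
    on_sale_from: date         # hypothetical validity window (inclusive)
    on_sale_until: date


def valid_entries(library: Iterable[FeatureEntry], today: date) -> List[FeatureEntry]:
    """Keep only the entries for dishes that are on sale on the given day."""
    return [e for e in library if e.on_sale_from <= today <= e.on_sale_until]


library = [
    FeatureEntry("fried cabbage", [1.0, 0.0], date(2019, 1, 1), date(2019, 12, 31)),
    FeatureEntry("fried green pepper", [0.0, 1.0], date(2019, 6, 1), date(2019, 8, 31)),
]
winter = valid_entries(library, date(2019, 2, 1))
summer = valid_entries(library, date(2019, 7, 1))
```

Restricting the comparison to valid entries shrinks the candidate set, so a dish that is out of season cannot be returned as a match.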

In the above embodiment, the item feature corresponding to each piece of area information is compared only with the valid item features in the item feature library to determine the second-level category of each item in the image to be identified, which reduces interference in the item identification process and further improves the accuracy of recognition.

In some embodiments of the present disclosure, when comparing an item feature with the item features in the feature library, the minimum distance between the item feature of each item and the item features in the item feature library is first determined. If the minimum distance is less than or equal to a distance threshold, the category corresponding to the library feature closest to the item's feature is taken as the second-level category of the item. If the minimum distance is greater than the distance threshold, the user is prompted as to whether item category and attribute information needs to be input; if it needs to be input, the item category and attribute information is added; otherwise, the category corresponding to the library feature closest to the item's feature is taken as the second-level category of the item.

The distance is, for example, the Euclidean distance, and its magnitude reflects the degree of similarity: the smaller the distance, the more similar the feature of the item to be identified is to the item feature in the item feature library. When the distance exceeds the distance threshold, the item feature library may not contain the feature of the item to be identified. Therefore, the user can be prompted as to whether to input item category and attribute information. If the user inputs this information, a new item needs to be registered. If the user does not, the category corresponding to the library feature closest to the item's feature is taken as the second-level category of the item.
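
The nearest-feature comparison can be sketched with the Euclidean distance from the Python standard library. The function name `match_second_level`, the dictionary-based library, and the example vectors and threshold are assumptions for illustration; real item features would be much longer vectors produced by the item recognition model.

```python
import math
from typing import Dict, Sequence, Tuple


def match_second_level(
    feature: Sequence[float],
    feature_library: Dict[str, Sequence[float]],
    distance_threshold: float,
) -> Tuple[str, float, bool]:
    """Return (closest_category, minimum_distance, needs_prompt).

    needs_prompt is True when the minimum distance exceeds the threshold,
    i.e. when the user should be asked whether a new item must be registered.
    """
    if not feature_library:
        raise ValueError("feature library is empty")
    closest, min_distance = min(
        ((name, math.dist(feature, ref)) for name, ref in feature_library.items()),
        key=lambda pair: pair[1],
    )
    return closest, min_distance, min_distance > distance_threshold


# Hypothetical two-dimensional features for two registered dishes.
feature_library = {"fried cabbage": [1.0, 0.0], "fried green pepper": [0.0, 1.0]}
known = match_second_level([0.9, 0.1], feature_library, distance_threshold=0.5)
unknown = match_second_level([3.0, 0.0], feature_library, distance_threshold=0.5)
```

In both calls the closest category is fried cabbage, but only the second call sets the prompt flag. If the user declines to register a new item, the closest category is still used, as described in the text.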

In some other embodiments of the present disclosure, after identifying the category of each item, the attribute information corresponding to the item is matched. In some embodiments, the attribute information is, for example, price. For example, after recognizing that a certain dish is stir-fried cabbage, the price corresponding to the dish can be matched. In the settlement, if there are multiple dishes, the multiple dishes can be settled.

In this embodiment, since the accuracy of item identification is improved, the attribute information of the item can be more accurately matched. When the attribute information is price information, the accuracy of commodity settlement can be improved.

In other embodiments of the present disclosure, after the attribute information is matched, in response to the user modifying the attribute information corresponding to the category of the item, the image to be recognized is marked as a training image or a test image, so that the item detection model and the item recognition model are trained or tested based on that image. For example, if a certain dish is recognized as stir-fried cabbage and the price of stir-fried cabbage is matched, but the user modifies the settlement price at settlement, the dish was identified incorrectly. Therefore, the image containing the dish can be used as a training image or a test image to train or test the item detection model and the item recognition model. Through automatic iteration of the models, their recognition accuracy can be improved.

In some other embodiments of the present disclosure, the size information of each item is determined based on the item detection model, and the corresponding attribute information is matched according to the category and size of each item. For example, the attribute information is the price. For large and small dishes, the size boundary of the large and small dishes can be calculated, that is, the average value of the large dishes and the average value of the small dishes. The size of the recognized dish is compared with the size boundary to determine whether the recognized dish is a large portion or a small portion, and then the corresponding price is matched.
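One possible reading of the size comparison above is sketched below. The text does not state exactly how the boundary is derived from the two averages, so taking the midpoint of the average large size and the average small size is an assumption, as are the function names and the sample sizes. The prices of 6 yuan and 3 yuan follow the eight-treasure porridge example given later in the description.

```python
from statistics import mean


def size_boundary(large_sizes, small_sizes):
    """Assumed boundary: midpoint between the two average sizes."""
    return (mean(large_sizes) + mean(small_sizes)) / 2


def portion_price(size, boundary, prices):
    """Classify a detected size as large or small and look up its price."""
    portion = "large" if size >= boundary else "small"
    return portion, prices[portion]


# Hypothetical detected sizes (for example, bounding-box widths in pixels).
boundary = size_boundary([400, 420, 440], [200, 220, 240])
prices = {"large": 6, "small": 3}
large_result = portion_price(350, boundary, prices)
small_result = portion_price(250, boundary, prices)
```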

In some other embodiments of the present disclosure, it is determined whether multiple items in the image to be identified satisfy an item combination, and if the multiple items satisfy the item combination, the attribute information corresponding to the item combination is matched. For example, set-meal information is configured for restaurant settlement. Suppose fried cabbage ordered alone costs 15 yuan, a bowl of rice ordered alone costs 2 yuan, and fried cabbage together with rice costs 16 yuan. After it is recognized that the image contains fried cabbage and rice, the price of 16 yuan needs to be matched.
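The item-combination matching can be illustrated with a short sketch using the prices from the example above. The function name `settle`, the data layout, and the greedy order in which combinations are applied are assumptions; the disclosure does not specify how overlapping combinations would be resolved.

```python
from collections import Counter


def settle(items, unit_prices, combos):
    """Total price, applying each configured combination greedily.

    combos is a list of (tuple_of_item_names, combo_price).
    """
    remaining = Counter(items)
    total = 0
    for combo_items, combo_price in combos:
        need = Counter(combo_items)
        if not need:
            continue  # ignore an empty combination definition
        # Apply the combination as long as all of its items are present.
        while all(remaining[name] >= n for name, n in need.items()):
            remaining -= need
            total += combo_price
    # Items not covered by any combination are charged at the unit price.
    total += sum(unit_prices[name] * n for name, n in remaining.items())
    return total


unit_prices = {"fried cabbage": 15, "rice": 2}
combos = [(("fried cabbage", "rice"), 16)]
set_meal_total = settle(["fried cabbage", "rice"], unit_prices, combos)
extra_rice_total = settle(["fried cabbage", "rice", "rice"], unit_prices, combos)
```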

In some other embodiments of the present disclosure, it is determined whether the attribute sum corresponding to multiple items in the image to be identified meets a preset condition, and if the attribute sum meets the preset condition, the attribute sum is processed according to the preset condition. For example, when a restaurant sells dishes, there may be a gift promotion, such as a free drink when the total reaches 20 yuan. Therefore, if the sum of the prices corresponding to the multiple identified dishes is greater than 20 yuan, a beverage can be given as a gift.
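A preset condition on the attribute sum could be checked as in the following sketch. The function name `apply_gift` and the return format are assumptions; the strict "greater than 20 yuan" comparison follows the example sentence above.

```python
def apply_gift(prices, threshold=20, gift="beverage"):
    """Return (total, gifts); a gift is added when total exceeds threshold."""
    total = sum(prices)
    gifts = [gift] if total > threshold else []
    return total, gifts


with_gift = apply_gift([15, 9])
without_gift = apply_gift([15, 2])
```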

In some other embodiments of the present disclosure, the time at which attribute information is matched to an item is determined, and the attribute information corresponding to each item is determined according to that matching time. For example, for restaurant settlement, a discount period and a discount rate can be configured. It is then determined whether the time at which the dish is matched with a price falls within the discount period, and if so, the dish is matched with the discounted price corresponding to that period.
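Time-dependent attribute matching might look like the sketch below. The period boundaries, the discount rate of 0.8, and the half-open interval convention (start inclusive, end exclusive) are assumptions made for the example.

```python
from datetime import time


def price_at(base_price, match_time, discount_periods):
    """Return the price that applies at match_time.

    discount_periods is a list of (start, end, rate) tuples; a period
    applies when start <= match_time < end.
    """
    for start, end, rate in discount_periods:
        if start <= match_time < end:
            return round(base_price * rate, 2)
    return base_price


# Hypothetical evening discount: 20% off between 19:00 and 21:00.
periods = [(time(19, 0), time(21, 0), 0.8)]
evening_price = price_at(15, time(19, 30), periods)
noon_price = price_at(15, time(12, 0), periods)
```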

FIG. 3 is a schematic structural diagram of some embodiments of the item identification system of the present disclosure. The system includes an image acquisition module 310, a pre-recognition module 320, and an item determination module 330.

The image acquisition module 310 is configured to acquire one or more images to be identified, where the images to be identified include one or more objects to be identified.

The pre-recognition module 320 is configured to use the trained pre-recognition model to determine whether the probability that the image to be recognized is clear and contains a complete item is greater than a threshold.

In some embodiments, the pre-recognition model may be trained in advance: a certain number of sample images are collected and classified. For example, sample images that are clear and contain a complete item are annotated as positive sample images, sample images that are not positive sample images are annotated as negative sample images, and the pre-recognition model is trained based on the positive sample images and the negative sample images.

The item determination module 330 is configured to identify the category of each item when the probability is greater than the threshold. That is, in this embodiment, instead of performing item category recognition on all images, it first determines whether the image meets the requirements, and performs item recognition on the images that meet the requirements.

In the above-mentioned embodiment, item recognition is performed only on images whose probability of being clear and containing a complete item is greater than the threshold, rather than on all images, which can improve the accuracy and efficiency of the recognition system.

In some other embodiments of the present disclosure, the pre-recognition module 320 is further configured to send the first image of a plurality of consecutive images to the item determination module 330 if the probability that the first image is clear and contains a complete item is greater than a first threshold and the probability that each of the other images is clear and contains a complete item is greater than a second threshold. The item determination module 330 is configured to perform category recognition on the items contained in the first image of the plurality of consecutive images.

For example, the probability that the current image is clear and contains a complete item is calculated. If this probability is greater than 0.9 for the first image, category recognition is performed on the items contained in the first image. When the probability that the second to Nth images are clear and contain a complete item is greater than 0.1, the second to Nth images are not processed, which can reduce the processing burden of the item recognition system and improve system stability.
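The two-threshold gating of consecutive images can be sketched as a small state machine. This is one interpretation, stated as an assumption: a run starts at the first image whose probability exceeds the first threshold, later images are skipped while their probability stays above the second threshold, and the run ends once the probability drops to the second threshold or below. The function name and the sample probabilities are hypothetical.

```python
def select_frames(probabilities, first_threshold=0.9, second_threshold=0.1):
    """Return indices of the images that would be sent for recognition."""
    selected = []
    in_run = False  # True once an image of the current run has been sent
    for index, p in enumerate(probabilities):
        if in_run and p > second_threshold:
            continue  # same scene as the image already sent; skip it
        in_run = False  # no run in progress, or the run has just ended
        if p > first_threshold:
            selected.append(index)
            in_run = True
    return selected


probs = [0.05, 0.5, 0.95, 0.97, 0.6, 0.3, 0.05, 0.92, 0.95]
sent = select_frames(probs)
```

In this example only the images at indices 2 and 7 are sent: each is the first image of its run, and the images that follow within the same run are skipped.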

Fig. 4 is a schematic structural diagram of other embodiments of the object identification system of the present disclosure. The item determination module 330 in the system includes an item detection module 331, an item management module 332, and an item identification module 333.

The item detection module 331 is configured to input the image to be recognized into the item detection model, extract the area information and the first-level category corresponding to each item in the image to be recognized based on the item detection model, call the item management module 332, and input the area information of the items belonging to the valid categories to the item identification module 333. The first-level category refers to the broad category to which the item belongs, such as dishes, fruits, and beverages.

In some embodiments, the item detection model can be trained in advance: the area information and the first-level category corresponding to the items in the sample images are annotated to generate first annotation information, and the item detection model is trained based on the sample images and the first annotation information.

The item management module 332 is configured to determine the valid categories among the first-level categories. In a restaurant, the recognition area may contain priced items such as dishes, yogurt, fruits, and beverages, as well as non-priced items such as keys, badges, wallets, mobile phones, chopsticks, spoons, and hands. Therefore, after the first-level category of each item is identified, the invalid categories are removed first, and only the valid categories are retained.
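Filtering detections by valid first-level category could be sketched as follows. The dictionary layout of a detection, the bounding-box tuples, and the contents of `VALID_CATEGORIES` are assumptions for illustration only.

```python
# Hypothetical configuration: first-level categories that are priced.
VALID_CATEGORIES = {"dish", "fruit", "beverage"}


def keep_valid(detections, valid_categories=VALID_CATEGORIES):
    """Keep only detections whose first-level category is valid."""
    return [d for d in detections if d["category"] in valid_categories]


detections = [
    {"category": "dish", "box": (10, 10, 200, 200)},
    {"category": "mobile phone", "box": (220, 40, 300, 180)},
    {"category": "beverage", "box": (310, 20, 380, 160)},
    {"category": "hand", "box": (0, 150, 60, 240)},
]
valid = keep_valid(detections)
```

Only the area information of the retained detections would then be passed on for second-level recognition.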

The item recognition module 333 is configured to input the area information of the items belonging to the valid categories in the image to be recognized into the item recognition model, extract the item features corresponding to each piece of area information based on the item recognition model, and compare the item features corresponding to each piece of area information with the item features in the item feature library to determine the second-level category of each item in the image to be recognized. The second-level category corresponds to the specific identity of the item, for example, whether a certain dish is fried green peppers or fried cabbage.

In some embodiments, the item features corresponding to the area information of the effective category items in the sample image are annotated, the second annotation information is generated, and the item recognition model is trained based on the sample image and the second annotation information.

In some other embodiments of the present disclosure, the item management module 332 is further configured to determine the valid item features in the item feature library within a predetermined time. The item identification module 333 is also configured to compare the item features corresponding to each piece of area information with the valid item features in the item feature library to determine the second-level category of each item in the image to be identified.

In some other embodiments of the present disclosure, the item recognition module 333 is configured to determine the minimum distance between the item feature of each item and the item features in the item feature library. If the minimum distance is less than or equal to the distance threshold, the category corresponding to the library feature closest to the item feature of the item is taken as the second-level category of the item. If the minimum distance is greater than the distance threshold, the user is prompted whether to enter item category and attribute information. If the item category and attribute need to be entered, the item category and attribute information are added; otherwise, the category corresponding to the library feature closest to the item feature of the item is taken as the second-level category of the item.

In some other embodiments of the present disclosure, as shown in FIG. 5, the system further includes an attribute matching unit 510 configured to match corresponding attribute information according to the category of each item. In some embodiments, the attribute information is, for example, price. For example, after recognizing that a certain dish is stir-fried cabbage, the price corresponding to the dish can be matched. In the settlement, if there are multiple dishes, the multiple dishes can be settled.

In other embodiments of the present disclosure, the item management module 332 is further configured to, after the attribute information is matched and in response to the user modifying the attribute information corresponding to the category of the item, mark the image to be recognized as a training image or a test image, so that the item detection model and the item recognition model are trained or tested based on that image. For example, if a certain dish is recognized as stir-fried cabbage and the price of stir-fried cabbage is matched, but the user modifies the settlement price at settlement, the dish was identified incorrectly. Therefore, the image containing the dish can be used as a training image or a test image to train or test the item detection model and the item recognition model. Through automatic iteration of the models, their recognition accuracy can be improved.

In some other embodiments of the present disclosure, the attribute matching unit 510 is further configured to match the corresponding attribute information according to the category and size of each item, wherein the item detection module 331 is further configured to determine the size information of each item based on the item detection model. For example, the attribute information is the price. For large and small dishes, the size boundary of the large and small dishes can be calculated, that is, the average value of the large dishes and the average value of the small dishes. The size of the recognized dish is compared with the size boundary to determine whether the recognized dish is a large portion or a small portion, and then the corresponding price is matched.

In some other embodiments of the present disclosure, the attribute matching unit 510 is further configured to determine whether multiple items in the image to be identified satisfy the item combination, and if the multiple items satisfy the item combination, match the attribute information corresponding to the item combination.

In some other embodiments of the present disclosure, the attribute matching unit 510 is further configured to determine whether the attribute sum corresponding to multiple items in the image to be identified meets a preset condition, and if the preset condition is met, to process the attribute sum according to the preset condition.

In some other embodiments of the present disclosure, the attribute matching unit 510 is further configured to determine the time at which attribute information is matched to an item, and to determine the attribute information corresponding to each item according to that matching time.

In the following, the present disclosure will be introduced by taking the application of the item identification system to the field of restaurant settlement as an example.

As shown in FIG. 6, this embodiment includes a registration module 610, a pre-identification module 620, an item detection module 630, an item identification module 640, an item management module 650, a search module 660, a feature library 670, and a settlement module 680. The settlement module 680 corresponds to the attribute matching unit 510.

First, the commodities need to be registered in the system, and the registration module 610 calls the camera to collect images of the settlement area. For accurate subsequent identification, only one commodity is placed in the settlement area when registering priced commodities such as dishes and beverages, for example, a single plate of fried cabbage. The registration module 610 inputs the image to the item detection module 630. The item detection module 630 detects the area information of the item and sends the area information to the item identification module 640. The item identification module 640 extracts the feature of the item and then stores the feature in the feature library 670.

When the customer checks out, the commodities are brought to the checkout counter. After the settlement module 680 calls the camera to capture an image, the image is sent to the pre-identification module 620 to determine whether the image is usable. That is, it is judged whether the probability that the image is clear and contains the complete commodity is greater than the threshold, and whether the image is the first of the consecutive images whose probability is greater than the threshold. If so, the pre-identification module 620 sends the image to the item detection module 630. The item detection module 630 detects the category of each item included in the image and outputs the area information corresponding to each item. Because no sensor is needed to trigger the camera when capturing commodities, the cost is reduced and the response efficiency is improved compared with a sensor-triggered setup.

During dish registration and dish identification, dish detection needs to be performed on the collected images. The dish registration desk is usually placed in the back kitchen so that restaurant staff can register dishes conveniently. However, the back kitchen is usually cluttered, and irrelevant objects may appear near the registration desk. If non-dish items and other non-priced items cannot be filtered out, they may be entered into the feature library and cause misrecognition. When dishes are identified, the collected images often include, in addition to the dishes, chopsticks, spoons, badges, mobile phones, wallets, hands, and other items. These non-priced items may be detected as dishes and cause misrecognition. Calling the item management module 650 removes non-priced items and addresses the problem that commodity detection is easily disturbed.

The item management module 650 can configure which categories are non-priced items. For example, some restaurants sell drinks and some do not, so a restaurant can configure whether drinks participate in pricing according to its actual situation. As another example, if the restaurant gives away fruit as a promotion, fruit can be configured not to participate in pricing. In addition, keys, badges, wallets, mobile phones, chopsticks, spoons, hands, and the like can be configured in the item management module 650 as non-priced items.

A restaurant may sell dozens of dishes at a time, while the variety of dishes sold over a year may reach tens of thousands, and the feature library stores a correspondingly large number of dish features, including some for very similar dishes. If the full feature library is used for dish recognition, misrecognition is likely. Therefore, the item management module 650 may also set the features of commodities not currently on sale as invalid features. For example, the item management module 650 records the dishes sold in each period of the day and their prices, and triggers menu synchronization through a timer. During synchronization, all the commodity features in the feature library 670 are first invalidated, and then, according to the entered menu information, the features of the commodities sold in the current period are set to valid. This yields a valid commodity feature library and an invalid commodity feature library, which addresses the problem that similar commodities are easily misrecognized. The item management module 650 can also process daily order data and compile statistics on dish sales and customer orders.
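The menu synchronization step (invalidate everything, then revalidate the current period's dishes) can be sketched as below. The function name, the dictionary layout of the feature library and the menu, and the period names are hypothetical; the timer that triggers synchronization is not modeled.

```python
def synchronize_menu(feature_library, menu, period):
    """Mark only the features of dishes sold in `period` as valid.

    feature_library maps dish name -> {"feature": ..., "valid": bool}.
    menu maps period name -> set of dish names sold in that period.
    Returns the set of dish names whose features are now valid.
    """
    # Step 1: invalidate every feature in the library.
    for entry in feature_library.values():
        entry["valid"] = False
    # Step 2: revalidate the features of dishes sold in the current period.
    for name in menu.get(period, ()):
        if name in feature_library:
            feature_library[name]["valid"] = True
    return {name for name, entry in feature_library.items() if entry["valid"]}


feature_library = {
    "stir-fried cabbage": {"feature": [0.0, 1.0], "valid": True},
    "fried green peppers": {"feature": [1.0, 0.0], "valid": True},
    "eight-treasure porridge": {"feature": [0.5, 0.5], "valid": False},
}
menu = {
    "breakfast": {"eight-treasure porridge"},
    "lunch": {"stir-fried cabbage", "fried green peppers"},
}
valid_now = synchronize_menu(feature_library, menu, "breakfast")
```

Subsequent feature comparison would then consider only the entries marked valid.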

The item identification module 640 determines the feature information corresponding to the area information of each commodity in a priced category, queries the feature library 670 through the search module 660 to find the feature closest to the feature of the commodity, outputs the specific category corresponding to the commodity, and sends the commodity information to the settlement module 680.

A settlement system based on dish identification usually requires dishes to be registered before meal service begins. However, in actual restaurant use, some dishes, such as temporary dishes, are only served some time after meal service has begun. These temporary dishes cannot be registered before meal service begins and therefore cannot be identified at settlement.

In some embodiments, if the item identification module 640 determines that the distance between the commodity feature and the closest feature in the feature library 670 is greater than the distance threshold, the user, for example the settlement clerk, may be prompted whether to register a temporary dish. If so, the registration of the temporary dish is completed by entering the dish name and price, and the dish information is sent to the settlement module 680; if not, the dish information is sent to the settlement module 680 according to the current recognition result. This embodiment can solve the problem that temporary dishes cannot be identified.

The settlement module 680 performs settlement according to the commodity category and price.

Some dishes in the restaurant come in different portion sizes, and large and small portions are priced differently. For example, a large portion of eight-treasure porridge is 6 yuan and a small portion is 3 yuan. However, apart from the difference in size, large and small portions are essentially the same in appearance. Therefore, the size information of the dishes needs to be identified, and the prices of large and small portions are set in the settlement module 680. When the restaurant sells dishes, there may be set-meal discounts. For example, clear soup ramen alone is 9 yuan, beef slices alone are 9 yuan, and the combination of clear soup ramen and beef slices is 16 yuan. Therefore, after the dishes are identified, it is necessary to determine whether they form a set meal, and the set-meal price needs to be set in the settlement module 680. Some restaurants discount certain dishes during certain time periods, for example in the evening. Therefore, the discount period and discount rate also need to be configured in the settlement module 680. Some restaurants run spend-threshold gift promotions, so the gift information can also be set in the settlement module 680.

In the foregoing embodiment, since the accuracy of product identification is improved, the accuracy of product settlement can be improved, user experience can be improved, and the cost of product settlement can be reduced.

In other embodiments, the system further includes an IoT (Internet of Things) platform 6100, an annotation platform 6110, and an algorithm server 6120. During settlement, if the user modifies the settlement price, the commodity in the image was identified incorrectly. The item management module 650 uploads the incorrectly recognized image to the IoT platform 6100, the IoT platform 6100 submits the day's error data to the annotation platform 6110, and the annotation platform 6110 returns the annotated data to the algorithm server 6120 after annotation is complete. The algorithm server 6120 randomly divides the annotated data into a training set and a test set, and performs model training and model testing to improve model iteration efficiency. Before commodities are registered, the algorithm server 6120 trains each model used in the commodity recognition process.
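The random division of annotated data into a training set and a test set can be sketched as follows. The 80/20 ratio, the fixed seed, and the function name are assumptions; the disclosure only states that the division is random.

```python
import random


def split_dataset(samples, test_fraction=0.2, seed=0):
    """Randomly split samples into (training_set, test_set)."""
    shuffled = list(samples)
    # A seeded generator keeps the split reproducible for this example.
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical identifiers of ten annotated images.
train_set, test_set = split_dataset(range(10))
```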

In some embodiments, the registration module 610, the pre-identification module 620, and the settlement module 680 may be installed on the client, and the item detection module 630, the item identification module 640, the item management module 650, the search module 660, and the feature library 670 may be installed on the server. In addition, the modules in the client can communicate with the modules in the server through the service module 690. The IoT platform 6100, the annotation platform 6110, and the algorithm server 6120 can be deployed in the cloud.

FIG. 7 is a schematic structural diagram of other embodiments of the object identification system of the present disclosure. The system includes a memory 710 and a processor 720, where the memory 710 may be a magnetic disk, flash memory or any other non-volatile storage medium. The memory is used to store instructions in the embodiments corresponding to FIGS. 1 and 2. The processor 720 is coupled to the memory 710 and may be implemented as one or more integrated circuits, such as a microprocessor or a microcontroller. The processor 720 is configured to execute instructions stored in the memory.

In some embodiments, as shown in FIG. 8, the system 800 includes a memory 810 and a processor 820. The processor 820 is coupled to the memory 810 through the bus 830. The system 800 can also be connected to an external storage device 850 through the storage interface 840 to access external data, and can also be connected to a network or another computer system (not shown) through the network interface 860, which will not be described in detail here.

In this embodiment, the data instructions are stored in the memory, and the above instructions are processed by the processor, which can improve the accuracy of item identification.

In other embodiments, a computer-readable storage medium has computer program instructions stored thereon, and when the instructions are executed by a processor, the steps of the method in the embodiments corresponding to FIGS. 1 and 2 are implemented. Those skilled in the art should understand that the embodiments of the present disclosure may be provided as methods, devices, or computer program products. Therefore, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the present disclosure may take the form of a computer program product implemented on one or more computer-usable non-transitory storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.

The present disclosure is described with reference to flowcharts and/or block diagrams of methods, devices (systems) and computer program products according to embodiments of the present disclosure. It should be understood that each process and/or block in the flowchart and/or block diagram, and combinations of processes and/or blocks in the flowchart and/or block diagram, can be realized by computer program instructions. These computer program instructions can be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing equipment to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing equipment produce a device for realizing the functions specified in one or more processes in the flowchart and/or one or more blocks in the block diagram.

These computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing equipment to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device, which implements the functions specified in one or more processes in the flowchart and/or one or more blocks in the block diagram.

These computer program instructions can also be loaded onto a computer or other programmable data processing equipment, so that a series of operation steps are executed on the computer or other programmable equipment to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable equipment provide steps for implementing the functions specified in one or more processes in the flowchart and/or one or more blocks in the block diagram.

So far, the present disclosure has been described in detail. In order to avoid obscuring the concept of the present disclosure, some details known in the art are not described. Based on the above description, those skilled in the art can fully understand how to implement the technical solutions disclosed herein.

Although some specific embodiments of the present disclosure have been described in detail through examples, those skilled in the art should understand that the above examples are only for illustration and not for limiting the scope of the present disclosure. Those skilled in the art should understand that the above embodiments can be modified without departing from the scope and spirit of the present disclosure. The scope of the present disclosure is defined by the appended claims.

Claims

An item identification method, including:

Acquiring one or more images to be identified, wherein the images to be identified include one or more objects to be identified;

Using the trained pre-recognition model, determine whether the probability that the image to be recognized is clear and contains a complete item is greater than a threshold; and

In the case where the probability is greater than the threshold, the category of each item is identified.
The article identification method according to claim 1, further comprising:

If the probability that the first image is clear and contains a complete item is greater than a first threshold, and the probability that each of the other images is clear and contains a complete item is greater than a second threshold, category recognition is performed on the items contained in the first image of the plurality of consecutive images.
The article identification method according to claim 1, wherein training the pre-identification model comprises:

Mark the image in the sample image that is clear and contains the complete article as a positive sample image, and mark the image in the sample image that is not a positive sample image as a negative sample image; and

The pre-recognition model is trained based on the positive sample image and the negative sample image, so as to determine whether the probability that the image to be recognized is clear and contains a complete item is greater than a threshold according to the trained pre-recognition model.
The method for identifying items according to any one of claims 1 to 3, wherein identifying the category of each item comprises:

Inputting the image to be recognized into an item detection model, and extracting the area information and the first-level category corresponding to each item in the image to be recognized based on the item detection model;

Determine the valid categories in the first level category; and

The area information of the items belonging to the valid categories in the image to be recognized is input into the item recognition model, the item features corresponding to each piece of area information are extracted based on the item recognition model, and the item features corresponding to each piece of area information are compared with the item features in the item feature library to determine the second-level category of each item in the image to be recognized.
The article identification method of claim 4, wherein training the article detection model and the article identification model comprises:

The area information and the first-level category corresponding to the items in the sample image are annotated to generate first annotation information, and the item detection model is trained based on the sample image and the first annotation information, so that the area information and the first-level category corresponding to each item in the image to be recognized are determined according to the trained item detection model; and

Annotate the item features corresponding to the area information of the valid-category items in the sample image to generate second annotation information, and train the item recognition model based on the sample image and the second annotation information, so that the item features corresponding to the area information of each item in the image to be recognized are extracted according to the trained item recognition model.
The item identification method according to claim 4, wherein:

Determining the valid item features in the item feature library within a predetermined time; and

Comparing the item features corresponding to each piece of area information with the valid item features in the item feature library to determine the second-level category of each item in the image to be recognized.
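One possible reading of this claim is that each entry in the item feature library is valid only during some time window, and that only entries valid at recognition time take part in the comparison. The sketch below illustrates that reading; the `LibraryEntry` type and its `valid_from` and `valid_until` fields are assumptions made for the example and are not specified in the claims.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Sequence


@dataclass
class LibraryEntry:
    category: str
    feature: Sequence[float]
    valid_from: datetime   # assumed field: start of the validity window
    valid_until: datetime  # assumed field: end of the validity window (exclusive)


def valid_entries(library: List[LibraryEntry], now: datetime) -> List[LibraryEntry]:
    """Keep only the entries whose validity window contains `now`."""
    return [e for e in library if e.valid_from <= now < e.valid_until]
```

Restricting the comparison to the filtered entries shrinks the search space, which is presumably the motivation for the step.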
The item identification method according to claim 4, wherein:

Determining the minimum distance between the item feature of each item and the item features in the item feature library;

If the minimum distance is less than or equal to a distance threshold, taking the category corresponding to the item feature in the item feature library that is closest to the item feature of the item as the second-level category of the item;

If the minimum distance is greater than the distance threshold, prompting the user as to whether to input item category and attribute information; and

If the item category and attribute information are to be input, adding the item category and attribute information; otherwise, taking the category corresponding to the item feature in the item feature library that is closest to the item feature of the item as the second-level category of the item.
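The threshold logic of this claim might look like the following sketch. Several details are assumptions for illustration only: Euclidean distance as the metric, the `ask_user` callback (which returns `None` when the user declines to enter anything), and the choice to register the new feature under the entered category and return that category.

```python
from math import dist
from typing import Callable, Dict, List, Optional, Sequence, Tuple


def classify(feature: Sequence[float],
             library: Dict[str, List[float]],
             attributes: Dict[str, dict],
             threshold: float,
             ask_user: Callable[[], Optional[Tuple[str, dict]]]) -> str:
    """Return the second-level category for one item feature."""
    # Nearest library entry and its distance to the item feature.
    nearest = min(library, key=lambda name: dist(feature, library[name]))
    if dist(feature, library[nearest]) <= threshold:
        return nearest                   # close enough: accept the nearest category
    entry = ask_user()                   # too far: ask whether to enter a new item
    if entry is None:
        return nearest                   # user declined: fall back to the nearest category
    category, attrs = entry
    library[category] = list(feature)    # assumption: store the new feature in the library
    attributes[category] = attrs         # add the entered attribute information
    return category
```

Note that the library must be non-empty for `min` to succeed; a production system would need to handle the empty case separately.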
The item identification method according to claim 4, further comprising:

Matching corresponding attribute information according to the category of each item.
The item identification method according to claim 8, further comprising:

After the attribute information is matched, in response to the user modifying the attribute information corresponding to the category of an item, marking the image to be recognized as a training image or a test image, so that the item detection model and the item recognition model are trained or tested based on the image to be recognized.
The item identification method according to claim 8, further comprising at least one of the following steps:

Determining size information of each item based on the item detection model, and matching the corresponding attribute information according to the category and size of each item;

Determining whether multiple items in the image to be recognized form an item combination, and if so, matching the attribute information corresponding to the item combination;

Determining whether the sum of the attributes corresponding to the multiple items in the image to be recognized meets a preset condition, and if so, processing the sum according to the preset condition; and

Determining the matching time at which attribute information is matched to the items, and determining the attribute information corresponding to each item according to the matching time.
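As an illustration of how these optional steps could combine in a restaurant settlement setting, the sketch below treats the attribute as a price in cents. Every table, constant, and rule ordering here is hypothetical example data chosen for the sketch: size-dependent prices, one combination price, an off-peak time window, and a reduction applied once the sum reaches a threshold.

```python
from datetime import time
from typing import List, Tuple

# Hypothetical price table in cents, keyed by (category, size).
PRICE_BY_SIZE = {("rice", "small"): 100, ("rice", "large"): 200,
                 ("soup", "small"): 300, ("steak", "large"): 2000}
# Hypothetical combination: these categories together have one fixed price.
COMBO_PRICE = {frozenset({"rice", "soup"}): 350}
# Hypothetical preset condition on the sum.
SUM_THRESHOLD, SUM_REDUCTION = 2000, 200
# Hypothetical time rule: items cost 80% between 14:00 and 17:00.
OFF_PEAK_START, OFF_PEAK_END, OFF_PEAK_PERCENT = time(14, 0), time(17, 0), 80


def item_price(category: str, size: str, at: time) -> int:
    """Match a price by category and size, adjusted by the matching time."""
    price = PRICE_BY_SIZE[(category, size)]
    if OFF_PEAK_START <= at < OFF_PEAK_END:
        price = price * OFF_PEAK_PERCENT // 100
    return price


def settle(items: List[Tuple[str, str]], at: time) -> int:
    """Return the total in cents for (category, size) pairs matched at time `at`."""
    categories = frozenset(category for category, _ in items)
    if len(items) == len(categories) and categories in COMBO_PRICE:
        total = COMBO_PRICE[categories]   # the items form a known combination
    else:
        total = sum(item_price(c, s, at) for c, s in items)
    if total >= SUM_THRESHOLD:            # the sum meets the preset condition
        total -= SUM_REDUCTION
    return total
```

Integer cents are used instead of floating-point amounts so that the arithmetic is exact.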
An item identification system, comprising:

An image acquisition module configured to acquire one or more images to be recognized, wherein the images to be recognized include one or more items to be recognized;

A pre-recognition module configured to use a trained pre-recognition model to determine whether the probability that the image to be recognized is clear and contains a complete item is greater than a threshold; and

An item determination module configured to identify the category of each item when the probability is greater than the threshold.
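The gating relationship between the pre-recognition module and the item determination module can be sketched as follows. The function name, the callables standing in for the two modules, and the default threshold of 0.5 are assumptions for illustration; the claims do not fix any of them.

```python
from typing import Callable, List, Optional


def identify_if_usable(image: object,
                       pre_recognize: Callable[[object], float],
                       identify: Callable[[object], List[str]],
                       threshold: float = 0.5) -> Optional[List[str]]:
    """Identify item categories only when the image passes pre-recognition.

    `pre_recognize` stands in for the trained pre-recognition model and returns
    the probability that the image is clear and contains a complete item.
    """
    probability = pre_recognize(image)
    if probability > threshold:   # strictly greater, as recited in the claim
        return identify(image)
    return None                   # image rejected: skip the costlier identification
```

Returning `None` for rejected images lets the caller distinguish "not processed" from "processed, no items found" (an empty list).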
An item identification system, comprising:

A memory; and

A processor coupled to the memory, the processor being configured to execute the item identification method according to any one of claims 1 to 10 based on instructions stored in the memory.
A non-transitory computer-readable storage medium having computer program instructions stored thereon, wherein the instructions, when executed by a processor, implement the item identification method according to any one of claims 1 to 10.