WO2023138447A1 - AI weighing system, and method for improving precision of AI model by using various types of data sets - Google Patents

Info

Publication number
WO2023138447A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
data
processing
weighing system
model
Prior art date
Application number
PCT/CN2023/071665
Other languages
French (fr)
Chinese (zh)
Inventor
朱辰 (Zhu Chen)
西田翔 (Sho Nishida)
杉山聪 (Satoshi Sugiyama)
Original Assignee
索尼半导体解决方案公司 (Sony Semiconductor Solutions Corporation)
索尼集团公司 (Sony Group Corporation)
朱辰
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sony Semiconductor Solutions Corporation (索尼半导体解决方案公司), Sony Group Corporation (索尼集团公司), and Zhu Chen (朱辰)
Publication of WO2023138447A1 publication Critical patent/WO2023138447A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01GWEIGHING
    • G01G19/00Weighing apparatus or methods adapted for special purposes not provided for in the preceding groups
    • G01G19/40Weighing apparatus or methods adapted for special purposes not provided for in the preceding groups with provisions for indicating, recording, or computing price or other quantities dependent on the weight
    • G01G19/413Weighing apparatus or methods adapted for special purposes not provided for in the preceding groups with provisions for indicating, recording, or computing price or other quantities dependent on the weight using electromechanical or electronic computing means
    • G01G19/414Weighing apparatus or methods adapted for special purposes not provided for in the preceding groups with provisions for indicating, recording, or computing price or other quantities dependent on the weight using electromechanical or electronic computing means using electronic computing means only
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes

Definitions

  • the present invention relates to an AI (Artificial Intelligence) weighing system; more specifically, it relates to an AI weighing (AI scale) system utilizing an image sensor with an AI processing function.
  • Such an AI weighing device automatically recognizes a product by establishing an intelligent recognition model and analyzing a product image using the model.
  • Such an AI weighing device avoids the steps of manual weighing and printing barcodes, and avoids the need to manually search or memorize the prices of various commodities, thereby greatly improving the work efficiency of places such as supermarkets and saving labor costs.
  • In existing devices, the identification of goods is carried out by sending the images captured by the camera to an external computing device such as the cloud, and the learning, establishment, training, and retraining of the AI model are also carried out in the cloud.
  • Data transmission depends on network traffic, which may be unstable and subject to delays or even errors, significantly affecting recognition speed and accuracy.
  • Cloud AI recognition depends on the stability of the network and cannot operate offline; it is therefore heavily dependent on the cloud environment.
  • the pictures sent to the cloud may contain private information, such as the user's personal information, which poses a risk of privacy leakage.
  • the recognition accuracy of the recognition model of the existing AI weighing device needs to be improved.
  • In a place such as a supermarket, goods come in various packages, sales forms, and the like. For example, fresh goods may be boxed, wrapped in plastic bags, or self-packaged; at the same time, they may be sold whole or divided (for example, sliced or cut into pieces). All of this makes it difficult to identify products accurately.
  • the present invention can perform AI recognition offline by using an image sensor with AI processing function, thereby significantly improving the recognition accuracy and recognition speed of the intelligent weighing device, and can be used offline.
  • the present invention also significantly improves the recognition accuracy of the AI model by utilizing image data containing various data sets.
  • the AI weighing system may include: a storage platform for placing a target object, the storage platform can weigh the target object; a first camera device for photographing and identifying the target object placed on the storage platform, the first camera device includes an image sensor, and the image sensor can perform AI processing for identifying the target object in an offline state; and an output unit is used for outputting the recognition result and the weighing result of the target object.
  • the image sensor of the camera device of the AI weighing system may be a CMOS image sensor chip, the CMOS image sensor chip includes a first substrate and a second substrate, the first substrate has a plurality of pixels for converting optical signals into electrical signals, and the second substrate has a memory and a processing circuit, the memory stores an AI model, and the processing circuit has a function of performing the AI processing based on the electrical signal by using the AI model.
  • the AI models stored on the image sensor include a first inference model.
  • the processing circuit of the stacked CMOS image sensor chip generates image data.
  • the processing circuit may include: a learning part that retrains the AI model based on the image data; and an inference part that uses the AI model to recognize the target object.
  • the AI weighing system may further include one or more computing devices located in a cloud environment, and the one or more computing devices have corresponding processors and memories.
  • the processing circuit of the stacked CMOS image sensor chip of the first camera generates image data, and the image data is sent to the one or more computing devices.
  • the one or more computing devices create a second inference model based on the image data generated by the stacked CMOS image sensor chip, and may directly deploy the second inference model into the memory of the stacked CMOS image sensor chip such that the first inference model is updated.
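The cloud-retrain-and-redeploy cycle described in the bullets above can be sketched as follows. All class and function names here are hypothetical (not from the patent), and the "training" is a toy stand-in for fitting a real DNN:

```python
class SensorMemory:
    """Stands in for the on-chip memory (212 / 3021) holding the AI model."""
    def __init__(self, model):
        self.model = model  # the "first inference model"

def train_second_model(image_data, labels):
    # Placeholder for cloud-side training: a real system would fit a
    # neural network here; we "learn" a lookup table from images to labels.
    return {img: lab for img, lab in zip(image_data, labels)}

def deploy(sensor_memory, second_model):
    # Deploying replaces the first inference model on-chip, so that
    # subsequent inference runs fully offline.
    sensor_memory.model = second_model
    return sensor_memory

# Usage: images uploaded from the sensor are retrained on in the cloud,
# and the resulting "second inference model" is pushed back to the chip.
mem = SensorMemory(model={})
imgs, labs = ["img_a", "img_b"], ["apple", "banana"]
mem = deploy(mem, train_second_model(imgs, labs))
```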
  • the stacked CMOS image sensor chip is capable of selecting a size of the image data including a full sensor size and a Video Graphics Array (VGA) size based on the AI processing of the processing circuit.
  • When the stacked CMOS image sensor outputs image data having the full sensor size, the AI processing includes extracting an overall image or a partial image of the target object from the full-sensor-size data, and the partial image includes a VGA-sized image.
  • the image data captured by the first camera device of the AI weighing system may include: a profile image of the target object; and/or a cross-sectional image of the target object; and/or a packaging image of the target object.
  • the image data captured by the first camera device of the AI weighing system may also include a partially enlarged image of the target object.
  • the first camera device of the AI weighing system may further include a ToF sensor, and the AI processing includes combining RGB data output by the image sensor and ToF data output by the ToF sensor.
  • the first camera device of the AI weighing system may further include a multi-wavelength sensor, and the AI processing can utilize the output of the multi-wavelength sensor to optimize the identification of the target object.
  • the first camera device of the AI weighing system may further include a polarized light sensor, and the AI processing can utilize the output of the polarized light sensor to optimize the identification of the target object.
  • the AI weighing system may further include: a second camera device for capturing an image of the target person and sending it to the first camera device, and wherein the AI processing includes acquiring feature data from the image of the target person captured by the second camera device and using the feature data to assist in identifying the target object.
  • the feature data includes anonymous feature data such as gender and age of the target person.
  • the AI weighing system may further include: a third camera device, configured to acquire an image of the target person and send it to the first camera device, and wherein the AI processing includes performing SLAM processing on the image of the target person captured by the third camera device, and outputting metadata of the processing result to assist in identifying the target object.
  • the third camera device also acquires an image of the shopping cart or shopping basket of the target person and sends it to the first camera device, and the AI processing includes performing SLAM processing on the image of the shopping cart or shopping basket, and outputting metadata of the processing result to assist in identifying the target object.
  • the third camera device of the present invention may include a plurality of image sensors positioned at different positions within the moving range of the target person.
  • the AI processing of the AI weighing system may also include obtaining other information including ambient temperature, area address and/or weather conditions, and assist in identifying the target object based on the other information.
  • the method for increasing the recognition accuracy of an AI model by using multiple data sets includes: acquiring image data of an item; creating an AI model using learning data, the learning data including the image data and the name and attribute of the item; applying the AI model to identify the target object, wherein the learning data includes at least two of the following three data sets: an outline image of the item; a cross-sectional image of the item; and a packaging image of the item.
  • the learning data used to create the AI model may also include a data set of partially enlarged images of the item.
  • the image data of the item includes RGB image data and at least one of the following data: ToF data, multi-wavelength data, polarization data.
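As a hedged illustration of the method above, assembling learning data from at least two of the three named data-set types might look like the following sketch; the function and field names are invented for illustration:

```python
def build_learning_data(item_name, attributes, datasets):
    """datasets maps a data-set kind to a list of images.
    Per the method above, at least two of outline / cross_section /
    packaging must be present; a partially enlarged data set is optional."""
    required = {"outline", "cross_section", "packaging"}
    present = required & {k for k, v in datasets.items() if v}
    if len(present) < 2:
        raise ValueError("need at least two of outline/cross_section/packaging")
    samples = []
    for kind, images in datasets.items():
        for img in images:
            # Each sample pairs an image with the item's name and attributes.
            samples.append({"image": img, "kind": kind,
                            "name": item_name, "attributes": attributes})
    return samples

data = build_learning_data(
    "watermelon", {"sold_divided": True},
    {"outline": ["w1.png", "w2.png"], "cross_section": ["w_half.png"]})
```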
  • the recognition speed and precision of the AI weighing system are significantly improved, and especially, the target object can be recognized accurately and quickly even in an offline state. Also, the method for creating an AI model according to the present invention significantly improves the accuracy of the recognition model by adding different data sets.
  • FIG. 1 is a schematic diagram showing a first embodiment of an AI weighing system according to the present invention.
  • FIG. 2 is a schematic diagram showing an embodiment of an image sensor according to the present invention.
  • FIG. 3 is a schematic diagram showing a stacked CMOS image sensor chip according to the present invention.
  • FIG. 4 is a flowchart illustrating steps for creating, training or retraining an AI model according to the present invention.
  • FIG. 5 is a flowchart illustrating an AI recognition method according to the present invention.
  • FIG. 6 is a schematic diagram illustrating an image output and/or processing mode of an image sensor according to the present invention.
  • Fig. 7 is a schematic diagram showing a second embodiment of the AI weighing system according to the present invention.
  • Fig. 8 is a schematic diagram showing a third embodiment of the AI weighing system according to the present invention.
  • Fig. 9 is a schematic diagram showing a fourth embodiment of the AI weighing system according to the present invention.
  • FIG. 10 is a schematic diagram showing a fifth embodiment of the AI weighing system according to the present invention.
  • the AI weighing system includes: a storage platform 1, a camera device 2, and an output unit 3, wherein a target object 4 is placed on the storage platform 1.
  • the storage platform 1 has a built-in weighing device, which can weigh the target object 4.
  • the output unit 3 can be a display as shown in FIG. 1 , such as a liquid crystal display, to display the recognition result and weighing result of the target object 4 to the user.
  • the output unit 3 may also output the recognition result and the weighing result in audio form.
  • the output unit 3 can be a touch display to interact with the user.
  • the object of the present invention may also be goods stored in a warehouse.
  • the imaging device 2 is, for example, arranged above the storage platform 1 to photograph the target object.
  • the imaging device 2 includes an image sensor 21 such as a CMOS image sensor.
  • the image sensor 21 according to the present invention is capable of not only storing and outputting captured images, but also performing various processing including AI processing on image data.
  • AI processing includes obtaining various information (metadata such as feature data of objects) from image data, and recognizing objects in images.
  • the image sensor 21 can also communicate with a cloud environment (cloud).
  • the AI weighing system of the present invention does not depend on network connections and cloud servers, and can quickly and accurately identify objects even in an offline state.
  • FIG. 2 shows a functional block diagram of an image sensor 21 according to the invention.
  • the image sensor 21 includes an imaging unit 211, a memory 212, and an AI processing unit 213.
  • the image sensor 21 further includes a control unit (not shown) that controls the imaging unit 211.
  • the imaging unit 211 photographs the target object, sends the imaging data to the AI processing unit 213, and stores the imaging data in the memory 212.
  • the memory 212 stores an AI model, for example, an inference model for recognizing a target ("first inference model").
  • the AI model is, for example, a neural network computing model for computer vision created using learning of a deep neural network (DNN) by executing a program, for example, stored in memory 212 and/or in a cloud environment.
  • the AI model may be a learning model utilizing a multi-layer neural network. It should be understood that for the AI model here, any suitable known AI model and algorithm may be selected according to usage and requirements.
  • the AI processing unit 213 is, for example, a graphics processing unit (GPU), so that the image data of the target object can be processed using the AI inference model stored in the memory 212, and the processing result is sent to the output unit 3.
  • the AI processing unit 213 may include an inference unit 2131 and a learning unit 2132 .
  • the inference unit 2131 uses the AI inference model stored in the memory 212 to recognize the target image captured and sent by the imaging unit 211 , and sends the recognition result to the output unit 3 .
  • the learning unit 2132 retrains the AI inference model based on the object image and the confirmed recognition result.
  • the learning unit 2132 can also use data in the cloud environment to retrain the AI inference model. Retraining can be done when the AI weighing system is idle and/or optionally when the network is available, so as not to affect the efficiency with which the AI weighing system operates.
  • the learning unit 2132 can perform learning based on the data stored in the memory 212, and create and train an AI model.
  • the learning unit 2132 can also use the data generated during the application process and other learning data (for example, data in the cloud environment) to retrain the recognition model.
  • the learning part 2132 can also train the learning model by using learning data to change the weights of various parameters in the AI inference model; and/or by preparing multiple AI inference models and then changing the AI inference model to be used according to the content of the calculation process.
  • the training of the AI inference model by the learning unit 2132 is preferably performed when the AI weighing system is idle.
  • the AI processing unit 213 of the image sensor 21 may only include the inference unit 2131 .
  • the function of the learning part 2132 is executed on the cloud environment when the AI weighing system is idle and when the network is available.
  • the AI processing unit 213 of the image sensor 21 can use the AI inference model stored in the memory 212 to only perform the AI recognition of the target, thereby further improving the recognition speed, and completely independent of network connection and cloud environment.
  • the learning part of the AI processing part according to the present invention can be located in any external computing device of the AI weighing system accessible in a wired or wireless manner, such as cloud environment (cloud), edge server, core network, etc.
  • a cloud server located in a cloud environment may include one or more computing devices with corresponding processors and memory capable of high-speed processing of large amounts of (and updated in real time) data.
  • the AI weighing system executes the training of a second inference model in the cloud environment when the system is idle and the network is available, and automatically deploys the second inference model to the memory 212 of the image sensor 21, for example periodically, to update the first inference model in the memory 212.
  • the cloud environment can acquire, store and process large amounts of data, it is conducive to the establishment and retraining of AI inference models.
  • the AI inference model on the system can be updated when the network is available or the AI weighing system is idle, thereby improving the recognition accuracy without affecting the recognition speed when the system is working.
  • image data generated by the image sensor 21 is also sent to and stored on one or more computing devices in the cloud.
  • the computing device located in the cloud environment can create and/or retrain the AI inference model based on the image data generated by the image sensor 21, and can directly deploy the AI inference model to the memory 212 of the image sensor 21, thereby updating the AI inference model stored in the image sensor 21.
  • the AI processing unit 213 sends the recognition result of the target to the output unit 3 .
  • the recognition result may include N options (N ≥ 1), and when N > 1, the N options are ranked by likelihood.
  • the output unit 3 displays three possible results for the target object 4, and the first option is the best result recognized by the system. If the best result is not the correct result for the target object 4, the user can select the correct result on the interactive display screen of the output unit 3. This selection by the user is used as historical data (learning data) for the feedback mechanism of the present invention.
  • the output unit 3 can also highlight the best result among multiple possible results by darkening the color, changing the font, enlarging the font size, and the like.
  • the image of the target used in the AI recognition process and the correct option are stored in the memory 212 of the image sensor 21 or sent to the cloud environment, and are fed back to the learning part 2132.
  • the learning part 2132 and/or the cloud processor associate the image of the target object and its correct options as learning data for retraining the AI inference model, so as to dynamically and continuously optimize the recognition accuracy of the model.
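A minimal sketch of this feedback mechanism follows: the captured image is paired with the user-confirmed option and queued as learning data for retraining. The class and method names are illustrative only:

```python
class FeedbackBuffer:
    """Collects (image, correct label) pairs for the learning part."""
    def __init__(self):
        self.samples = []

    def record(self, image, predicted_options, user_choice=None):
        # If the top-ranked option was wrong, the user's selection becomes
        # the ground truth; otherwise the best result stands as confirmed.
        label = user_choice if user_choice is not None else predicted_options[0]
        self.samples.append((image, label))
        return label

buf = FeedbackBuffer()
buf.record("img_001", ["banana", "plantain", "mango"])           # best result accepted
buf.record("img_002", ["apple", "peach", "nectarine"], "peach")  # user corrected
```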
  • the image sensor 21 may be a stacked complementary metal oxide semiconductor (CMOS) image sensor chip.
  • the stacked CMOS image sensor chip includes a first substrate 301 and a second substrate 302 .
  • a pixel array part 3011 composed of a plurality of pixels is arranged on the first substrate 301, and the pixel array part 3011 converts optical signals into electrical signals through photoelectric conversion, and transmits them to the second substrate 302 (the connection between the first substrate and the second substrate is not shown).
  • a memory 3021 and a processing circuit 3022 are arranged on the second substrate 302 .
  • the processing circuit 3022 includes, for example, a DSP (Digital Signal Processor), which generates image data based on electrical signals transmitted from the first substrate 301, and stores the image data in the memory 3021 (for example, the memory 212 shown in FIG. 2 ).
  • the memory 3021 stores an AI model, for example, an inference model for AI recognition of a target.
  • the processing circuit 3022 executes the AI processing function of the inference unit 2131 of the AI processing unit 213 in FIG. 2.
  • the processing circuit 3022 can also perform the function of the learning unit 2132 in FIG. 2 as described above.
  • the stacked CMOS image sensor chip may further include a third substrate.
  • the memory and processing circuitry may be located on the second substrate and/or the third substrate, respectively. That is, the pixel array unit, the memory, and the processing unit may be respectively located on different substrates.
  • the image sensor 21 of the present invention can itself perform AI recognition processing on the image of the target object, thereby enabling recognition of the target object in an offline state.
  • Fig. 4 illustrates the steps of creating or retraining an AI model according to the present invention.
  • First, the AI processing part (specifically, the learning part 2132, located in the cloud environment or on the image sensor) acquires learning data. The learning data includes data sets of item images together with the names and attributes of the items, and also includes historical identification data of the AI weighing system.
  • The learning part 2132 then uses the learning data to create an AI model, or to train or retrain an already created AI model.
  • Finally, the created or updated AI model is output.
  • the image data used to create or train the AI model includes a variety of different image data sets, such as outline images of items, images of items with packaging, cross-sectional images of items, partial images of items, and the like.
  • the AI model of the present invention is created using more than two kinds of data sets from the above-mentioned data sets.
  • In step S01, the following are collected: multiple sets of images of the overall outline of the item (data set 1), such as profiles of the item taken from different angles; multiple sets of images of the packaged item (data set 2), such as images of the item in packaging of various colors (for example, various common plastic bags); multiple sets of images of different cross-sections of the item (data set 3), such as images of the item divided into different shapes; and locally enlarged images of the item (data set 4).
  • For different items, different data sets can be selected for modeling. For example, for watermelons and other commodities that may be sold in parts, two data sets (i.e., data sets 1 and 3) can be collected: the overall image and the segmented cross-sectional image.
  • In this way, when a customer chooses to buy, for example, half a watermelon, the product can be quickly and accurately identified.
  • Similarly, for items such as grapes, two data sets (namely, data sets 1 and 4) can be collected, so that the grapes can be quickly identified by confirming local details.
  • the item image data used to create or train the AI model described above is generally an RGB image or a black and white image.
  • the image data used for creating or training the AI model may also include one or more of three-dimensional data, polarization data, and multi-wavelength data of the object.
  • the ToF sensor can be used to obtain the three-dimensional data of the item, that is, the ToF data (data set 5), and it is fused with the RGB image of the item (at least one of the data sets 1 to 4) to obtain a stereo image of the item.
  • Stereo images can fully present the surface features of objects, thereby optimizing the recognition accuracy of AI models.
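The RGB/ToF fusion described above can be sketched as a per-pixel merge into an RGB-D representation. This is a simplified illustration: a real system would first align the two sensors' coordinate frames, which is omitted here:

```python
def fuse_rgb_tof(rgb, depth):
    """rgb: HxW grid of (r, g, b) tuples; depth: HxW grid of ToF distances.
    Returns an HxW grid of (r, g, b, d) pixels."""
    if len(rgb) != len(depth) or any(len(r) != len(d)
                                     for r, d in zip(rgb, depth)):
        raise ValueError("RGB and ToF maps must share the same resolution")
    # Append each pixel's depth to its color channels.
    return [[(*pix, d) for pix, d in zip(rgb_row, d_row)]
            for rgb_row, d_row in zip(rgb, depth)]

rgb = [[(255, 0, 0), (0, 255, 0)]]
tof = [[0.42, 0.45]]  # distances in metres
fused = fuse_rgb_tof(rgb, tof)
```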
  • the polarized image of the item obtained by a polarized light sensor (data set 6) can also be used when modeling. By using polarized images, it is possible to avoid the problem of the captured image being unclear due to reflections on the surface of the object.
  • the item can also be photographed using a multi-wavelength sensor to obtain a multi-wavelength image of the item (data set 7).
  • multi-wavelength sensors are able to capture small differences in the surface color of items
  • the combined use of multi-wavelength images with RGB images of items is beneficial for modeling for identifying the presence of many species of the same item (e.g., the same fruit with different origins).
  • the AI model of the present invention can significantly improve the recognition accuracy compared with the conventional recognition model. Furthermore, the present invention further improves the recognition accuracy of the AI model by making the image data used for modeling include other image data (data sets 5-7) other than RGB images.
  • the photographing device according to the present invention zooms in and processes the part of the item based on optical zoom instead of digital zoom commonly used in the prior art.
  • the acquisition of the partially enlarged image of the target by the shooting device according to the present invention is also based on optical zoom. Therefore, the image quality is not degraded by enlarging the image, thereby further improving the recognition accuracy. This will be described in detail below with reference to FIG. 6 .
  • Fig. 5 shows the steps of the method for AI identification of a target by using the AI weighing identification system of the present invention.
  • In step S501, after sensing that the target object 4 is placed on the storage platform 1, the camera device 2 photographs the target object and acquires multiple images of it, including overall outline images, partial images, etc., as will be described below with reference to FIG. 6.
  • the camera device 2 can also acquire image data other than the RGB image of the target through other sensors, such as ToF data, polarized light images and multi-wavelength images.
  • In step S502, the AI processing unit 213 in the imaging device 2 performs AI processing on the target image to obtain high-quality image data for AI recognition.
  • the AI processing here includes intercepting the local detail image of the target object from the captured image, as described below with reference to FIG. 6 .
  • the AI processing of the image further includes fusing or combining the RGB image with other sensor data such as a ToF sensor to optimize the image data of the target object.
  • Fig. 6 shows an image output and/or processing mode of an image sensor according to the present invention.
  • the size of the image output by the image sensor 21 according to the present invention can be selected from a variety of pixel sizes, for example, 4056×3040 pixels (12M full sensor size), 1947×1459 pixels (covering the entire shelf), or 640×480 pixels (Video Graphics Array (VGA) size), as shown in FIG. 6. Therefore, under the control of the control unit, the imaging unit 211 of the image sensor 21 can choose to capture a panoramic image (mode 1), an overall image of the target object (mode 2), or a partial image (mode 3). Mode 3 can optimize the recognition result when the target object is, for example, grapes.
  • the AI processing unit 213 can also crop an image of the object region from the full-sensor-size image, and can crop a VGA-size image from the full-sensor-size image or from the object-region image. In this way, local detailed images of objects can also be acquired.
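The cropping just described can be sketched as follows, with the frame modelled as a nested list of pixels; the helper names and window placement are illustrative:

```python
FULL_W, FULL_H = 4056, 3040   # 12M full sensor size (mode 1)
VGA_W, VGA_H = 640, 480       # VGA size (mode 3)

def crop(frame, x, y, w, h):
    """Cut a w x h window whose top-left corner is at (x, y)."""
    if x + w > len(frame[0]) or y + h > len(frame):
        raise ValueError("crop window exceeds frame bounds")
    return [row[x:x + w] for row in frame[y:y + h]]

def crop_vga(frame, x, y):
    # Mode 3: a local-detail patch at VGA resolution, cut from a larger
    # frame (full sensor size or an object-region image).
    return crop(frame, x, y, VGA_W, VGA_H)

# Toy frame, small but larger than VGA; each pixel value encodes (x, y).
frame = [[(x, y) for x in range(700)] for y in range(500)]
patch = crop_vga(frame, 10, 5)
```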
  • the present invention can utilize an image sensor with an AI processing function to output a partially enlarged image of a target object.
  • the above-mentioned image enlargement process according to the present invention is based on optical zoom, so the quality of the enlarged image will not be affected at all, thereby greatly improving the recognition accuracy and efficiency compared with the prior art.
  • Although the data volume of the image captured by the camera device of the present invention is larger than that of a traditional camera device, the image sensor of the present invention itself has AI processing and recognition functions, rather than sending image data to the cloud for processing and recognition as in the prior art; thus the recognition speed is not affected, and the risk of loss or error during data transmission is avoided. Therefore, by using high-quality images, the present invention greatly improves the recognition accuracy of the target object without affecting the recognition speed.
  • In step S503, the AI processing unit 213 uses the AI inference model in the memory 212 to recognize the target object based on the image data processed in step S502.
  • the recognition result may include N (N ≥ 1) options.
  • the AI processing unit 213 may weight different options in combination with other obtained information, so as to optimize the ranking of multiple options.
  • Other information refers to information that may affect the choice of goods by the user (for example, a customer in a supermarket or store), such as the user's anonymous characteristic data (such as gender and age), the address of the system's location, and weather and temperature information.
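The re-ranking of the N options with contextual weights can be sketched as below; the multiplicative weighting scheme is invented for illustration and is not specified in the patent:

```python
def rerank(options, context_weights):
    """options: list of (label, base_score) from the inference model.
    context_weights: label -> factor derived from contextual information
    (user demographics, location, weather, etc.).
    Returns labels sorted by weighted score, highest first."""
    scored = [(label, score * context_weights.get(label, 1.0))
              for label, score in options]
    return [label for label, _ in sorted(scored, key=lambda p: p[1],
                                         reverse=True)]

options = [("iceberg lettuce", 0.48), ("cabbage", 0.46), ("bok choy", 0.06)]
# e.g. contextual data suggests cabbage is far more popular at this
# store in the current season, nudging its score upward:
ranked = rerank(options, {"cabbage": 1.3})
```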
  • In step S504, the AI processing unit 213 outputs the recognition result to the output unit 3.
  • the output unit 3 displays the recognition result to the user. If the first option is not the correct result, the user can select the correct result in other options, or manually input the correct result.
  • In step S505, the AI processing unit 213 stores the image data and the correct recognition result in the memory 212, and/or uploads them to the memory of the computing device in the cloud environment. As mentioned earlier, this data can be used to retrain the AI inference model according to the feedback mechanism.
  • Fig. 7 shows a second embodiment of the AI weighing system according to the present invention.
  • the image sensor 21 of the imaging device 2 is capable of outputting RGB color images and/or black and white images.
  • the imaging device 2 further includes other sensors such as a ToF sensor 22 , a polarization sensor 23 and a multi-wavelength sensor 24 . These other sensors are all connected to the image sensor 21 .
  • the AI processing unit 213 can combine or fuse the data output by various types of sensors for AI recognition, so as to optimize the image of the target object and increase the recognition accuracy of the target object.
  • The ToF sensor 22 is a ToF (Time of Flight) distance image sensor that measures the distance to the target by detecting the time of flight (time difference) of light that is emitted by a light source, reflected by the target, and returned to the sensor.
  • the AI processing unit 213 can fuse the RGB image obtained from the imaging unit 211 with the ToF image output from the ToF sensor 22 to obtain a stereoscopic image of the target. Recognition based on the stereo image can increase the recognition accuracy of the target.
  • the polarization sensor 23 is, for example, a sensor obtained by incorporating a polarization element, which is an independent component in a conventional polarization camera, into a CMOS image sensor.
  • The polarization sensor 23 can capture objects that cannot be seen clearly because of light reflection. For example, a plastic bag used to package vegetables shows uneven reflection, and the unevenness of the object's surface is rendered in fine detail, thereby improving the captured image of the target object.
  • the AI processing unit 213 can choose to use the output of the polarization sensor to optimize the recognition accuracy of the target object.
  • the multi-wavelength sensor 24 can capture small differences in the color of the target object by utilizing different sensitivities to different wavelengths of light.
  • By utilizing the target object's differing responses to different wavelengths output by the multi-wavelength sensor 24, the AI processing unit 213 can capture subtle color differences, which is particularly advantageous when identifying fruits such as citrus.
  • the AI processing unit 213 can selectively combine or fuse one or more of the data output by the above sensors with the data output by the image sensor 21 .
  • the second embodiment further improves the recognition effect by using multiple types of sensor data individually or in combination.
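The selective combination or fusion of sensor outputs described above can be sketched as a simple channel-stacking step before inference. The array shapes and the stacking scheme are illustrative assumptions; an actual fusion network could combine the modalities very differently.

```python
import numpy as np

def fuse_channels(rgb, tof=None, polar=None, multiwave=None):
    """Stack the RGB image with whichever auxiliary sensor outputs are
    present (ToF depth, polarization, multi-wavelength), producing one
    multi-channel array that an AI model can consume. Each input is
    assumed to be an H x W x C array aligned to the same pixel grid."""
    channels = [rgb]
    for extra in (tof, polar, multiwave):
        if extra is not None:
            channels.append(extra)
    return np.concatenate(channels, axis=-1)
```

For a 480 × 640 RGB frame plus a one-channel ToF depth map, the fused input has four channels; with no auxiliary sensors it falls back to plain RGB.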
  • Fig. 8 shows a third embodiment of the AI weighing system according to the present invention.
  • the difference between the third embodiment and the first embodiment is that two imaging devices are provided.
  • the AI weighing system according to this embodiment includes a camera 2 and a camera 5 communicating with the camera 2 .
  • the imaging device 2 is the same as that of the first embodiment.
  • The camera device 5 is arranged at a position different from that of the camera device 2, for example, on a shelf in front of the checkout counter of a supermarket or the like. As shown in FIG. 8 , the imaging device 5 is used to capture an image of a target person (for example, a customer) and transmit the captured image to the imaging device 2 .
  • the AI processing unit 213 of the imaging device 2 can acquire anonymous personal information (characteristic data) of the customer, such as metadata representing gender and age, from the image of the customer.
  • the AI processing section 213 may only store and output the anonymous personal information without saving the image of the customer.
  • the AI processing unit 213 assists in identifying the target based on the preference information associated with the anonymous personal information, for example, by changing the weight of an option in the identification result.
  • the camera 5 itself may also include an image sensor having an AI processing function.
  • the image sensor processes the image of the customer to obtain characteristic data of the customer (for example, the customer's gender and age).
  • the camera 5 sends only the metadata obtained from the captured image, not the image itself, to the camera 2, thereby fully protecting personal privacy while increasing transmission and processing speed.
  • the camera device 5 may include an RGB image sensor and a ToF sensor for respectively obtaining the RGB image and the ToF image of the customer, and sending the two images to the AI processing unit 213 of the camera device 2 .
  • the AI processing unit 213 combines the RGB image and the ToF image of the customer to generate a stereoscopic image of the customer, and obtains characteristic data of the customer based on the stereoscopic image.
  • When the camera 5 itself includes an image sensor with an AI processing function, the above-mentioned AI processing (including generating a stereoscopic image, and feature analysis based on the stereoscopic image) is performed on the camera 5 side, and the camera 5 sends only the processing result to the camera 2.
  • For example, the camera device 5 determines from the captured image that the customer is a woman of about 25 years old, and sends this metadata (i.e., "female, 25 years old") to the camera device 2 .
  • the imaging device 2 stores the metadata in the memory 212 and sends it to the inference unit 2131 .
  • the preference information of a 25-year-old female is stored in the memory 212, and the preference information will be regularly updated according to historical data.
  • the inference unit 2131 uses the AI model to obtain a preliminary recognition result of the object based on the image of the object captured by the imaging unit 211, which may be orange, tangerine, tomato or lemon.
  • The inference unit 2131 learns that women around the age of 25 prefer sweeter oranges and uses this as auxiliary identification information. In the absence of other recognition parameters, the inference unit 2131 will therefore rank orange first among the recognition results according to this preference. After the customer's payment is completed, the above metadata and the final result for the target object (whether the user selected the first-ranked orange) can be used to retrain the AI model.
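The orange example above can be sketched as a preference lookup keyed on the anonymous metadata. The preference table, its keys, and the weight values are purely hypothetical illustrations.

```python
# Hypothetical preference table keyed by (gender, age bracket); entries and
# weights are illustrative, e.g. periodically updated from historical data.
PREFERENCES = {
    ("female", "20-29"): {"orange": 1.3},
}

def apply_preference(options, gender, age):
    """Boost candidate scores with the preference entry matching the
    customer's anonymous metadata, then return the re-ranked options."""
    lo = age // 10 * 10
    weights = PREFERENCES.get((gender, f"{lo}-{lo + 9}"), {})
    return sorted(
        options,
        key=lambda opt: opt[1] * weights.get(opt[0], 1.0),
        reverse=True,
    )
```

With the preliminary result `[("tangerine", 0.30), ("orange", 0.28), ("tomato", 0.22), ("lemon", 0.20)]` and the metadata "female, 25 years old", orange becomes the first-ranked option (0.28 × 1.3 > 0.30); without matching metadata the preliminary ranking is unchanged.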
  • the camera 5 can also take an image of the shopping basket or shopping cart held by the customer, and send the image data to the camera 2.
  • the camera device 2 can pre-recognize the image data obtained from the camera device 5 .
  • the image data captured by the imaging device 5 can be subjected to AI processing for pre-identification, and the metadata of the processing result can be directly sent to the imaging device 2 .
  • camera device 5 may send a list of objects identified from the image data to camera device 2 .
  • the pre-recognition can be used as auxiliary information for the final AI recognition performed by the imaging device 2 .
  • By using the metadata (for example, gender and age range), the recognition accuracy of the AI model is further improved.
  • the AI processing unit 213 can also identify the customer's QR code (two-dimensional code) and obtain anonymous information such as the customer's purchase history for updating preference information and retraining the AI model.
  • Fig. 9 shows a fourth embodiment of the AI weighing system according to the present invention.
  • the difference between the fourth embodiment and the third embodiment is that an imaging device 6 is also provided, that is, this embodiment includes three imaging devices 2 , 5 and 6 .
  • the imaging devices 2 and 5 are the same as those of the second embodiment.
  • the imaging device 6 is provided at a different position from the imaging devices 2 and 5 .
  • a plurality of camera devices 6 are respectively installed at a plurality of different positions in a place such as a supermarket.
  • FIG. 9 shows an imaging device 6 installed in a refrigerator in a supermarket.
  • the camera device 6 can be set at each commodity display place of the supermarket.
  • the imaging device 6 photographs customers within the imaging range, and sends the captured image to the imaging device 2 .
  • the camera device 2 can perform simultaneous localization and mapping (SLAM) processing on the image of the customer.
  • The AI processing unit 213 analyzes the feature points in an image, determines whether a specific feature point has moved by a certain vector relative to another image, and generates a SLAM map by combining the feature-point data of multiple images taken in succession. In this way, in combination with the product information stored in the system memory for the container where the camera device 6 is installed, the AI processing unit 213 can determine which products the customer takes out or puts down at that container.
  • the camera device 6 can also take pictures of customers' shopping carts or shopping baskets within the shooting range, and send the captured images to the camera device 2 .
  • the AI processing unit 213 can perform SLAM processing on the image of the shopping cart or shopping basket, so as to determine the product information put into or taken away from the shopping cart or shopping basket.
  • the AI processing unit 213 can also compare the images of shopping carts or shopping baskets captured by multiple camera devices 6 arranged at different locations, and determine the product information put into or taken away from the shopping cart or shopping basket through image differences.
  • The camera 6 can send an overall image of the customer and his or her shopping cart or shopping basket to the camera 2, and the AI processing unit 213 can determine from the overall image which products the customer puts into or takes out of the shopping cart or shopping basket.
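Determining what was put into or taken out of a cart or basket by comparing images from different times or camera positions can be sketched, at the item-list level, as a multiset difference. The per-image recognition that produces these item lists is assumed to have been done already by the AI sensor; the function names are illustrative.

```python
from collections import Counter

def basket_diff(earlier_items, later_items):
    """Given the item labels recognized in two successive basket images,
    return (added, removed) as lists of labels, handling duplicates
    correctly via multiset (Counter) subtraction."""
    before, after = Counter(earlier_items), Counter(later_items)
    return list((after - before).elements()), list((before - after).elements())
```

For example, `basket_diff(["apple"], ["apple", "orange"])` reports that an orange was added and nothing removed.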
  • the camera 6 itself may also include an image sensor with an AI processing function, and the AI image sensor performs the aforementioned AI processing on the camera 6 and sends metadata of the processing result to the camera 2 .
  • For example, the metadata may indicate that a customer took out or put down a product at a container at a certain time, or that a product was put into or taken out of a customer's shopping cart or basket at a certain time.
  • the imaging device 2 can process the image of the customer captured by the imaging device 6 to obtain the customer's anonymous information (for example, gender, age, etc.).
  • When the imaging device 6 itself has an AI image sensor, only the metadata of the above information may be sent to the AI processing unit 213 of the imaging device 2 .
  • The AI processing unit 213 can associate the customer's metadata with the product data selected by the customer and store them in the memory 212 for retraining the AI model.
  • By using the metadata (for example, gender and age range), the recognition accuracy of the AI model is further improved.
  • Fig. 10 shows a fifth embodiment of the AI weighing system according to the present invention.
  • multiple AI weighing and payment systems may be installed in the same store, and multiple systems are connected to each other in a wireless or wired manner.
  • multiple AI weighing systems communicate with each other to update the store's database (including product information, customer's anonymous personal information and purchase history data, etc.).
  • the AI model of each AI image sensor is retrained to continuously improve the recognition accuracy of the AI model.
  • the training and retraining of the AI model can also be performed on the computing device in the cloud environment, and the updated AI model can be deployed to each AI weighing system.
  • database information of different stores can also be uploaded to computing devices in the cloud environment.
  • the AI processing units of the AI weighing systems in different stores can also communicate in real time.
  • Each store can also obtain other information from the cloud environment, such as real-time temperature, area address, and local climate. Utilizing this information can further help improve the recognition accuracy of the AI processing unit for the target object (for example, a product in a store).
  • the application scenario of the present invention is described above by taking the weighing of goods in a supermarket as an example.
  • the present invention is obviously not limited to this.
  • the AI recognition system of the present invention can also be applied to any scene where commodities/items need to be recognized, and is not limited to weighing scenes.


Abstract

The present invention relates to an AI weighing system, and a method for improving the precision of an AI model by using various types of data sets. The AI weighing system may comprise: an object placement table on which a target object is placed, wherein the object placement table can weigh the target object; a first photographing apparatus for photographing and recognizing the target object placed on the object placement table, wherein the first photographing apparatus comprises an image sensor capable of executing, in an offline state, AI processing for recognizing the target object; and an output portion for outputting a recognition result and a weighing result of the target object. The AI weighing system according to the present invention significantly improves recognition speed and precision.

Description

AI weighing system and method for increasing the accuracy of an AI model using multiple data sets
References to Related Applications
This application claims the benefit of Chinese Patent Application No. 202210065055.5, filed with the State Intellectual Property Office of the People's Republic of China on January 20, 2022, the entire contents of which are hereby incorporated herein by reference.
Technical Field
The present invention relates to an AI (Artificial Intelligence) weighing system and, more specifically, to an AI weighing (AI scale) system utilizing an image sensor with an AI processing function.
Background Art
In sales venues such as supermarkets, tools that allow customers to check out and pay by themselves have already appeared. For example, in a supermarket, customers can scan the barcode on a product and pay by themselves. However, barcodes still need to be attached to merchandise manually. In particular, fresh products also require manual identification and weighing before a barcode can be printed. Because of the wide variety of products, manual product identification is time-consuming, laborious, and error-prone.
At present, artificial intelligence (AI) weighing devices based on computer vision technology have also appeared. Such an AI weighing device automatically recognizes a product by building an intelligent recognition model and using the model to analyze product images. It eliminates the steps of manual weighing and barcode printing and removes the need to manually look up or memorize the prices of various products, thereby greatly improving the work efficiency of venues such as supermarkets and saving labor costs.
Summary of the Invention
However, in existing AI weighing devices, products are recognized by sending images captured by a camera to an external computing device such as the cloud, and the learning, building, training, and retraining of the AI model are likewise carried out in the cloud. This gives rise to at least the following three problems. First, data transmission depends on data traffic and may be unstable, delayed, or even lossy, which significantly affects recognition speed and accuracy. Second, cloud AI recognition depends on network stability and cannot be performed offline; it is therefore heavily dependent on the cloud environment. Third, the pictures sent to the cloud may contain private information, such as the user's personal information, creating a risk of privacy leakage.
At the same time, the recognition accuracy of the recognition models of existing AI weighing devices needs improvement. In particular, in venues such as supermarkets, even products of the same kind come in many packaging and sales forms. Fresh products, for example, may come boxed, in plastic bags, or in their own packaging, and may be sold whole or in portions (for example, sliced or cut into pieces). All of this makes accurate product recognition difficult.
To solve the above problems, the present invention performs AI recognition in an offline state by using an image sensor with an AI processing function, thereby significantly improving the recognition accuracy and recognition speed of the intelligent weighing device, which can also be used offline.
The present invention also significantly improves the recognition accuracy of the AI model by using image data comprising multiple types of data sets.
The AI weighing system according to the present invention may include: an object placement table on which a target object is placed and which can weigh the target object; a first camera device for photographing and recognizing the target object placed on the object placement table, the first camera device including an image sensor capable of performing, in an offline state, AI processing for recognizing the target object; and an output unit for outputting the recognition result and the weighing result of the target object.
The image sensor of the camera device of the AI weighing system according to the present invention may be a CMOS image sensor chip that includes a first substrate and a second substrate, the first substrate having a plurality of pixels that convert optical signals into electrical signals, and the second substrate having a memory and a processing circuit, the memory storing an AI model, and the processing circuit being capable of performing the AI processing based on the electrical signals by using the AI model.
The AI model stored on the image sensor includes a first inference model.
According to one aspect of the present invention, the processing circuit of the stacked CMOS image sensor chip generates image data, and the processing circuit may include: a learning part that retrains the AI model based on the image data; and an inference part that recognizes the target object using the AI model.
According to another aspect of the present invention, the AI weighing system may further include one or more computing devices located in a cloud environment, each having a corresponding processor and memory. The processing circuit of the stacked CMOS image sensor chip of the first camera device generates image data, which is sent to the one or more computing devices. The one or more computing devices create a second inference model based on the image data generated by the stacked CMOS image sensor chip and may deploy the second inference model directly into the memory of the stacked CMOS image sensor chip so that the first inference model is updated.
According to another aspect of the present invention, the stacked CMOS image sensor chip can select the size of the image data, the sizes including a full sensor size and a Video Graphics Array (VGA) size based on the AI processing of the processing circuit.
According to another aspect of the present invention, when the stacked CMOS image sensor outputs image data having the full sensor size, the AI processing includes cropping an overall image or a partial image of the target object from the full-sensor-size image data, the partial image including a VGA-size image.
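The cropping of a whole or partial image from full-sensor-size data described above can be sketched as array slicing. The bounding box is assumed to come from the on-chip detection step, and the center-crop-to-VGA policy is an illustrative choice, not the disclosed implementation.

```python
import numpy as np

VGA_H, VGA_W = 480, 640  # VGA size: height x width

def crop_target(full_frame, bbox):
    """Cut the target region (top, left, height, width) out of a
    full-sensor-size frame, then center-crop to VGA size when the
    region is larger than VGA in both dimensions."""
    top, left, h, w = bbox
    roi = full_frame[top:top + h, left:left + w]
    if roi.shape[0] > VGA_H and roi.shape[1] > VGA_W:
        y = (roi.shape[0] - VGA_H) // 2
        x = (roi.shape[1] - VGA_W) // 2
        roi = roi[y:y + VGA_H, x:x + VGA_W]
    return roi
```

A detected region larger than VGA is reduced to 480 × 640, while a smaller region (e.g. a close-up of part of the target) is returned as-is.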
According to another aspect of the present invention, the image data captured by the first camera device of the AI weighing system may include: an outline image of the target object; and/or a cross-sectional image of the target object; and/or a packaging image of the target object.
According to another aspect of the present invention, the image data captured by the first camera device of the AI weighing system may also include a partially enlarged image of the target object.
According to another aspect of the present invention, the first camera device of the AI weighing system may further include a ToF sensor, and the AI processing includes combining the RGB data output by the image sensor with the ToF data output by the ToF sensor.
According to another aspect of the present invention, the first camera device of the AI weighing system may further include a multi-wavelength sensor, and the AI processing can use the output of the multi-wavelength sensor to optimize recognition of the target object.
According to another aspect of the present invention, the first camera device of the AI weighing system may further include a polarization sensor, and the AI processing can use the output of the polarization sensor to optimize recognition of the target object.
According to another aspect of the present invention, the AI weighing system may further include a second camera device for capturing an image of a target person and sending it to the first camera device, wherein the AI processing includes extracting feature data from the image of the target person captured by the second camera device and using the feature data to assist in recognizing the target object. The feature data includes anonymous characteristic data of the target person, such as gender and age.
According to another aspect of the present invention, the AI weighing system may further include a third camera device for capturing an image of the target person and sending it to the first camera device, wherein the AI processing includes performing SLAM processing on the image of the target person captured by the third camera device and outputting metadata of the processing result to assist in recognizing the target object. The third camera device also captures an image of the target person's shopping cart or shopping basket and sends it to the first camera device, and the AI processing includes performing SLAM processing on the image of the shopping cart or shopping basket and outputting metadata of the processing result to assist in recognizing the target object.
Preferably, the third camera device of the present invention may include a plurality of image sensors positioned at different locations within the target person's range of movement.
According to another aspect of the present invention, the AI processing of the AI weighing system may also include obtaining other information, including ambient temperature, area address, and/or climate conditions, and can use that information to assist in recognizing the target object.
The method according to the present invention for increasing the recognition accuracy of an AI model by using multiple data sets includes: acquiring image data of an item; creating an AI model using learning data, the learning data including the image data and the item's name and attributes; and applying the AI model to recognize a target object, wherein the learning data includes at least two of the following three data sets: an outline image of the item; a cross-sectional image of the item; and a packaging image of the item.
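The requirement above — that the learning data cover at least two of the three image data sets — can be sketched with a simple validation step before model creation. The record layout and field names are illustrative assumptions, not part of the disclosed method.

```python
from dataclasses import dataclass, field

# The three data sets the method names; "close_up" is the optional fourth.
REQUIRED_KINDS = {"outline", "cross_section", "packaging"}

@dataclass
class LearningSample:
    """One learning record: an image plus the item's name and attributes."""
    image_path: str
    item_name: str
    attributes: dict = field(default_factory=dict)
    image_kind: str = "outline"  # "outline", "cross_section", "packaging", ...

def validate_learning_data(samples):
    """Check that the samples cover at least two of the three required
    image data sets before an AI model is created from them."""
    covered = {s.image_kind for s in samples} & REQUIRED_KINDS
    if len(covered) < 2:
        raise ValueError("learning data must include at least two of: "
                         + ", ".join(sorted(REQUIRED_KINDS)))
    return samples
```

A sample set containing, say, outline and packaging images passes the check; a set with only one of the three kinds is rejected.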
Preferably, the learning data used to create the AI model may also include a data set of partially enlarged images of the item.
Preferably, the image data of the item includes RGB image data and at least one of the following: ToF data, multi-wavelength data, and polarization data.
Through one or more of the above aspects of the present invention, the recognition speed and accuracy of the AI weighing system are significantly improved; in particular, the target object can be recognized accurately and quickly even in an offline state. Moreover, the method for creating an AI model according to the present invention significantly improves the accuracy of the recognition model by adding different data sets.
Description of Drawings
These and other more detailed and specific features of various embodiments will be disclosed more fully in the following description with reference to the accompanying drawings, in which:
Fig. 1 is a schematic diagram showing a first embodiment of the AI weighing system according to the present invention.
Fig. 2 is a schematic diagram showing an embodiment of the image sensor according to the present invention.
Fig. 3 is a schematic diagram showing the stacked CMOS image sensor chip according to the present invention.
Fig. 4 is a flowchart showing the steps for creating, training, or retraining an AI model according to the present invention.
Fig. 5 is a flowchart showing the AI recognition method according to the present invention.
Fig. 6 is a schematic diagram showing image output and/or processing modes of the image sensor according to the present invention.
Fig. 7 is a schematic diagram showing a second embodiment of the AI weighing system according to the present invention.
Fig. 8 is a schematic diagram showing a third embodiment of the AI weighing system according to the present invention.
Fig. 9 is a schematic diagram showing a fourth embodiment of the AI weighing system according to the present invention.
Fig. 10 is a schematic diagram showing a fifth embodiment of the AI weighing system according to the present invention.
Detailed Description
In the following description, numerous details are set forth; however, it will be apparent to those skilled in the art that these specific details are merely exemplary and are not intended to limit the scope of the present application.
<First Embodiment>
As shown in FIG. 1, the AI weighing system according to the first embodiment of the present invention includes a placement table 1, a camera device 2, and an output unit 3, with a target object 4 placed on the placement table 1. The placement table 1 has a built-in weighing device that can weigh the target object 4. The output unit 3 may be a display as shown in FIG. 1, such as a liquid crystal display, that shows the user the recognition result and the weighing result of the target object 4. Optionally, the output unit 3 may also output the recognition and weighing results as audio, and it may be a touch display for interacting with the user. Herein, products sold in supermarkets (especially fresh products such as fruits and vegetables) are used as examples of the target object of the present invention, but it is easy to understand that the invention is not limited to these specific products. For example, the target object may also be goods stored in a warehouse.
The camera device 2 is arranged, for example, above the placement table 1 to photograph the target object. The camera device 2 includes an image sensor 21, for example a CMOS image sensor. The image sensor 21 according to the present invention can not only store and output captured images but can also perform various processing on the image data, including AI processing. The AI processing includes obtaining various information from the image data (metadata such as feature data of the target) and recognizing the target object in the image. As shown in FIG. 2, the image sensor 21 can also communicate with a cloud environment (the cloud). However, because the image sensor 21 itself can perform AI processing, the AI weighing system of the present invention, unlike existing weighing recognition systems, does not depend on a network connection or a cloud server, and can recognize the target object quickly and accurately even in an offline state.
[Structure of an image sensor with an AI processing function]
FIG. 2 shows a functional block diagram of the image sensor 21 according to the present invention. As shown in FIG. 2, the image sensor 21 includes an imaging unit 211, a memory 212, and an AI processing unit 213. The image sensor 21 further includes a control unit (not shown) that controls the imaging unit. The imaging unit 211 photographs the target object, sends the imaging data to the AI processing unit 213, and may store the imaging data in the memory 212.
The memory 212 stores an AI model, for example an inference model for recognizing the target object (the "first inference model"). The AI model is, for example, a neural-network computation model for computer vision, created through deep neural network (DNN) learning by executing a program stored, for example, in the memory 212 and/or in a memory in the cloud environment. The AI model may also be a learning model using a multi-layer neural network. It should be understood that any suitable known AI model and algorithm may be selected here according to the application and requirements.
The AI processing unit 213 is, for example, a graphics processing unit (GPU), so that the image data of the target object can be processed using the AI inference model stored in the memory 212, and the processing result can be sent to the output unit 3.
Preferably, the AI processing unit 213 may include an inference unit 2131 and a learning unit 2132. The inference unit 2131 uses the AI inference model stored in the memory 212 to recognize the target-object image captured and sent by the imaging unit 211, and sends the recognition result to the output unit 3. The learning unit 2132 retrains the AI inference model based on the target-object image and the confirmed recognition result. The learning unit 2132 may also retrain the AI inference model using data in the cloud environment. Retraining can be performed when the AI weighing system is idle and/or, optionally, when the network is available, so as not to affect the efficiency of the AI weighing system during operation.
Optionally, the learning unit 2132 may learn based on the data stored in the memory 212, and may create and train an AI model. The learning unit 2132 may also retrain the recognition model using data generated during use as well as other learning data (for example, data in the cloud environment). Preferably, the learning unit 2132 may further train the learning model by using the learning data to change the weights of various parameters within the AI inference model, and/or by preparing multiple AI inference models and switching the AI inference model to be used according to the content of the computation. In addition, as described above, the training of the AI inference model by the learning unit 2132 is preferably performed while the AI weighing system is idle.
Alternatively, the AI processing unit 213 of the image sensor 21 according to the present invention may include only the inference unit 2131. The functions of the learning unit 2132 are then executed in the cloud environment when the AI weighing system is idle and the network is available.
In other words, the AI processing unit 213 of the image sensor 21 may use the AI inference model stored in the memory 212 to perform only the AI recognition of the target object, which further increases the recognition speed and makes recognition completely independent of the network connection and the cloud environment. In this respect, the learning unit of the AI processing unit according to the present invention may be located in any external computing device accessible to the AI weighing system by wire or wirelessly, such as a cloud environment (the cloud), an edge server, a core network, and so on. A cloud server located in the cloud environment may include one or more computing devices with corresponding processors and memory, capable of high-speed processing of large amounts of (and continuously updated) data. Meanwhile, the AI weighing system trains an inference model (the "second inference model") in the cloud environment when the system is idle and the network is available, and deploys this inference model to the memory 212 of the image sensor 21, for example automatically and periodically, to update the inference model in the memory 212 (the "first inference model"). Since the cloud environment can acquire, store, and process large amounts of data, it facilitates the creation and retraining of AI inference models.
At the same time, the AI inference model on the system can be updated when the network is available or the AI weighing system is idle, improving recognition accuracy without affecting the recognition speed while the system is in operation.
In addition, in this respect, the image data generated by the image sensor 21 is also sent to and stored in one or more computing devices in the cloud. A computing device located in the cloud environment can create and/or retrain the AI inference model based on the image data generated by the image sensor 21, and can deploy this AI inference model directly into the memory 212 of the image sensor 21, thereby updating the AI inference model stored in the image sensor 21.
After completing the AI recognition, the AI processing unit 213 sends the recognition result of the target object to the output unit 3. The recognition result may include N options (N≥1); when N>1, the N options are arranged according to their likelihood ranking. As shown in FIG. 1, the output unit 3 displays three possible results for the target object 4, the first option being the best result recognized by the system. If this best result is not the correct result for the target object 4, the user can select the correct result on the interactive display of the output unit 3. This user selection is used as historical data (learning data) for the feedback mechanism of the present invention. Besides ranking, when displaying the best result, the output unit 3 may also highlight the best result among the multiple candidates, for example by deepening its color, changing its font, or enlarging its font size.
Specifically, according to the feedback mechanism of the present invention, when the final option selected by the user (that is, the result of the target object as identified by the user) is not the first of the N options, the target-object image used in the AI recognition process together with the correct option is stored in the memory 212 of the image sensor 21 or sent to the cloud environment, and is fed back to the learning unit 2132. The learning unit 2132 and/or the cloud processor associates the target-object image with its correct option as learning data for retraining the AI inference model, thereby dynamically and continuously optimizing the recognition accuracy of the model.
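The feedback mechanism described above can be sketched as follows. This is a minimal illustration only; the class and method names are hypothetical and not taken from the patent.

```python
# Hypothetical sketch of the feedback mechanism: when the user's final
# selection differs from the top-ranked option, the image and the corrected
# label are queued as learning data for later retraining.

class FeedbackCollector:
    def __init__(self):
        self.learning_data = []  # list of (image, correct_label) pairs

    def record(self, image, ranked_options, user_choice):
        """Store a training sample only when the top-1 prediction was wrong."""
        if ranked_options and user_choice != ranked_options[0]:
            self.learning_data.append((image, user_choice))

    def drain(self):
        """Hand accumulated samples to the learning unit (e.g. when idle)."""
        batch, self.learning_data = self.learning_data, []
        return batch


collector = FeedbackCollector()
collector.record(image=b"...", ranked_options=["orange", "tangerine", "lemon"],
                 user_choice="tangerine")   # top-1 was wrong -> queued
collector.record(image=b"...", ranked_options=["apple"],
                 user_choice="apple")       # top-1 was correct -> ignored
assert len(collector.drain()) == 1
```

Draining the buffer only when the system is idle mirrors the patent's point that retraining should not affect recognition speed during operation.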
[Construction of an image sensor with an AI processing function]
The specific construction of the image sensor of the present invention will now be described with reference to FIG. 3. As shown in FIG. 3, the image sensor 21 may be a stacked complementary metal-oxide-semiconductor (CMOS) image sensor chip. The stacked CMOS image sensor chip includes a first substrate 301 and a second substrate 302. A pixel array unit 3011 composed of a plurality of pixels is arranged on the first substrate 301; the pixel array unit 3011 converts optical signals into electrical signals through photoelectric conversion and transmits them to the second substrate 302 (the connection between the first and second substrates is not shown). A memory 3021 and a processing circuit 3022 are arranged on the second substrate 302. The processing circuit 3022 includes, for example, a DSP (Digital Signal Processor), which generates image data based on the electrical signals transmitted from the first substrate 301 and stores the image data in the memory 3021 (for example, the memory 212 shown in FIG. 2). The memory 3021 stores an AI model, for example an inference model for AI recognition of the target object. The processing circuit 3022 performs the AI processing function of the inference unit 2131 of the AI processing unit 213 in FIG. 2; that is, it performs AI recognition processing based on the electrical signals from the pixel array unit using the AI model stored in the memory. The processing circuit 3022 may also perform the functions of the learning unit 2132 in FIG. 2 as described above.
Optionally, the stacked CMOS image sensor chip may further include a third substrate. The memory and the processing circuit may be located on the second substrate and/or the third substrate, respectively. That is, the pixel array unit, the memory, and the processing circuit may each be located on a different substrate.
Based on the above construction, the image sensor 21 of the present invention can itself perform AI recognition processing on the image of the target object, thereby enabling recognition of the target object in an offline state.
[Method of creating/training the AI model]
FIG. 4 shows the steps of creating or retraining an AI model according to the present invention. In step S01, the AI processing unit (specifically, the learning unit 2132 located in the cloud environment or in the image sensor) acquires learning data; the learning data includes a data set of item images together with the names and attributes of the items, and also includes the historical recognition data of the AI weighing system. In step S02, the learning unit 2132 creates an AI model using the learning data, or trains or retrains an already created AI model. In step S03, the created or updated AI model is output.
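Steps S01 to S03 can be sketched as a minimal pipeline. This is a self-contained illustration under stated assumptions: the function names are invented, and a trivial nearest-centroid classifier over hand-made feature vectors stands in for the DNN described in the text.

```python
# Minimal, self-contained sketch of steps S01-S03 (names are illustrative,
# not from the patent): gather learning data, fit a stand-in model, output it.

def acquire_learning_data():                       # step S01
    # Each sample: (feature vector of an item image, item name).
    return [((1.0, 0.2), "orange"), ((0.9, 0.3), "orange"),
            ((0.1, 0.8), "lemon"), ((0.2, 0.9), "lemon")]

def train_model(samples):                          # step S02
    centroids = {}
    for vec, label in samples:
        sums, n = centroids.setdefault(label, ([0.0] * len(vec), 0))
        centroids[label] = ([s + v for s, v in zip(sums, vec)], n + 1)
    # Average each label's vectors into a centroid.
    return {label: tuple(s / n for s in sums)
            for label, (sums, n) in centroids.items()}

def predict(model, vec):                           # inference with the S03 output
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    return min(model, key=lambda label: dist(model[label], vec))

model = train_model(acquire_learning_data())       # step S03: output the model
assert predict(model, (0.95, 0.25)) == "orange"
assert predict(model, (0.15, 0.85)) == "lemon"
```

In the patent's system the retrained model produced at S03 would then be deployed back into the memory 212 of the image sensor 21.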
According to the present invention, the image data used to create or train the AI model includes several different image data sets, such as outline images of the item, images of the item in packaging, cross-sectional images of the item, and partial images of the item. Two or more of the above data sets are used to create the AI model of the present invention. Specifically, in step S01, the following may be collected: multiple sets of images of the overall outline of the item (data set 1), for example outlines photographed from different angles; multiple sets of images of the packaged item (data set 2), for example images of items in packaging of various colors (such as common plastic bags); multiple sets of images of different cross-sections of the item (data set 3), for example images of the item cut into different shapes; and partially enlarged images of the item (data set 4). Different data sets can be chosen for modeling different items. For example, for items such as watermelon, which may be cut up and sold in parts, both overall images and cut cross-sectional images (that is, data sets 1 and 3) can be collected, so that when a customer buys, for example, half a watermelon, the product can be recognized quickly and accurately. As another example, for grapes, the overall-image and partial-image data sets (that is, data sets 1 and 4) can be used, so that grapes can be recognized quickly by confirming local details.
The item image data described above for creating or training the AI model is generally RGB images or black-and-white images. Preferably, however, according to the present invention the image data used to create or train the AI model may also include one or more of three-dimensional data, polarization data, and multi-wavelength data of the item.
Specifically, a ToF sensor can be used to obtain three-dimensional data of the item, that is, ToF data (data set 5), which is fused with the RGB image of the item (at least one of data sets 1 to 4) to obtain a stereoscopic image of the item. A stereoscopic image can fully present the surface features of the item, thereby optimizing the recognition accuracy of the AI model. For items with smooth surfaces, a polarization image of the item obtained by a polarization sensor (data set 6) can be used during modeling. Using polarization images avoids the problem of captured images being unclear due to reflections on the item's surface. In addition, a multi-wavelength sensor can be used to photograph the item to obtain a multi-wavelength image (data set 7). Since multi-wavelength sensors can capture small differences in the surface color of items, using multi-wavelength images in combination with RGB images of the item (at least one of data sets 1 to 4) is beneficial for modeling aimed at distinguishing the many varieties of one kind of item (for example, the same fruit from different places of origin).
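One common way to realize the RGB/ToF fusion mentioned above is to append the depth map as a fourth channel so each pixel carries (R, G, B, D). The sketch below is an assumption about how such fusion could be done, not the patent's implementation; the function name is invented.

```python
# Hedged sketch of RGB + ToF fusion (data sets 1-4 with data set 5): the
# depth map from the ToF sensor is appended as a fourth per-pixel channel.

def fuse_rgb_tof(rgb, depth):
    """rgb: H x W x 3 nested lists; depth: H x W; returns H x W x 4 (RGB-D)."""
    if len(rgb) != len(depth) or any(len(r) != len(d) for r, d in zip(rgb, depth)):
        raise ValueError("RGB and ToF frames must share the same resolution")
    return [[list(px) + [z] for px, z in zip(row, drow)]
            for row, drow in zip(rgb, depth)]

rgb = [[(255, 128, 0), (10, 20, 30)]]     # 1x2 RGB frame
depth = [[0.42, 0.57]]                    # matching ToF depth map (metres)
fused = fuse_rgb_tof(rgb, depth)
assert fused == [[[255, 128, 0, 0.42], [10, 20, 30, 0.57]]]
```

In practice the two sensors must also be registered (aligned pixel-to-pixel) before such channel-wise fusion is meaningful.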
As described above, by using two or more of the above data sets (data sets 1 to 4), the AI model of the present invention can significantly improve recognition accuracy compared with conventional recognition models. Moreover, by having the image data used for modeling include image data other than RGB images (data sets 5 to 7), the present invention further improves the recognition accuracy of the AI model.
In addition, when collecting overall and partial images of the item, the imaging device according to the present invention enlarges and processes parts of the item based on optical zoom rather than the digital zoom commonly used in the prior art. Correspondingly, when recognizing the target object, the imaging device according to the present invention likewise acquires partially enlarged images of the target object based on optical zoom. Therefore, the image quality is not degraded by enlarging the image, which further improves the recognition accuracy. This will be described in detail below with reference to FIG. 6.
[AI recognition method]
FIG. 5 shows the steps of the method of AI recognition of the target object using the AI weighing system of the present invention. In step S501, after sensing that the target object 4 has been placed on the platform 1, the imaging device 2 photographs the target object and acquires multiple images of it, including an overall outline image, partial images, etc., as will be described below with reference to FIG. 6. Preferably, the imaging device 2 can also acquire image data of the target object other than RGB images through other sensors, such as ToF data, polarization images, and multi-wavelength images. In step S502, the AI processing unit 213 in the imaging device 2 performs AI processing on the target-object images to obtain high-quality image data for AI recognition. The AI processing here includes cropping local detail images of the target object from the captured image, as described below with reference to FIG. 6. Preferably, the AI processing of the images also includes fusing or combining the RGB images with data from other sensors, such as a ToF sensor, to optimize the image data of the target object.
FIG. 6 shows the image output and/or processing modes of the image sensor according to the present invention. The size of the image output by the image sensor 21 according to the present invention can be selected from multiple pixel sizes, for example the 4056×3040 pixels (12M, full sensor size), 1947×1459 pixels (covering the entire platform), or 640×480 pixels (Video Graphics Array (VGA) size) shown in FIG. 6. Accordingly, under the control of the control unit, the imaging unit 211 of the image sensor 21 can choose to capture a panoramic image (mode 1), an overall image of the target object (mode 2), or a partial image (mode 3). Mode 3 can optimize the recognition result when the target object is, for example, grapes.
On the other hand, when the imaging unit 211 outputs a 12M full-sensor-size image, the AI processing unit 213 can also crop an image of the target-object region from the full-sensor-size image, and can crop a VGA-size image from the full-sensor-size image or from the target-object-region image. In this way, local detail images of the target object can likewise be acquired.
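The two-stage cropping just described can be illustrated as follows. This is a toy sketch with an invented helper and a tiny stand-in frame, not the sensor's actual firmware logic.

```python
# Illustrative sketch of the cropping above: cutting a region of interest
# (the target-object region, then a smaller "VGA-size" detail window) out
# of the full-sensor-size frame.

def crop(frame, left, top, width, height):
    """frame: H x W nested lists; returns the width x height sub-image."""
    if top + height > len(frame) or left + width > len(frame[0]):
        raise ValueError("crop window exceeds frame bounds")
    return [row[left:left + width] for row in frame[top:top + height]]

# A tiny 4x4 stand-in for the 4056x3040 full-sensor frame.
full = [[(y, x) for x in range(4)] for y in range(4)]
roi = crop(full, left=1, top=1, width=2, height=2)    # "target-object region"
assert roi == [[(1, 1), (1, 2)], [(2, 1), (2, 2)]]
detail = crop(roi, left=0, top=0, width=1, height=1)  # "VGA-size" detail crop
assert detail == [[(1, 1)]]
```

Because the full-sensor frame is captured at native resolution (optical zoom rather than digital upscaling), such crops retain full pixel fidelity, which is the point made in the surrounding text.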
In this way, the present invention can use an image sensor with AI processing functions to output partially enlarged images of the target object. Unlike the digital zoom used in conventional recognition technology, the above image-enlargement processing according to the present invention is based on optical zoom, so the quality of the enlarged image is not affected at all, greatly improving recognition accuracy and efficiency compared with the prior art. At the same time, although the amount of image data captured by the imaging device of the present invention is larger than that of a conventional imaging device, the image sensor of the present invention itself has AI processing and recognition functions, rather than having to send the image data to the cloud for processing and recognition as in the prior art; the recognition speed is therefore not affected, and the risk of loss or error during data transmission is avoided. Thus, by using high-quality images, the present invention greatly improves the recognition accuracy of the target object without affecting the recognition speed.
In step S503, the AI processing unit 213 uses the AI inference model in the memory 212 to recognize the target object based on the image data processed in step S502. The recognition result may include N (N≥1) options. Here, preferably, the AI processing unit 213 may weight the different options using other acquired information, so as to optimize the ranking of the multiple options. The other information refers to information that may influence the user's (for example, a customer in a supermarket or store) choice of goods, such as anonymous feature data of the recognized user (e.g., gender and age), address information of the system's location, and weather and temperature information.
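The option weighting in step S503 can be sketched as a simple re-ranking. This is a hypothetical illustration: the function, the candidate scores, and the context factors are all invented for the example.

```python
# Hypothetical sketch of the weighting in step S503: each candidate's base
# confidence is multiplied by a context factor derived from "other
# information" (user features, location, weather, ...), then re-ranked.

def rerank(candidates, context_weights):
    """candidates: {label: base_confidence}; context_weights: {label: factor}.
    Returns labels sorted by weighted confidence, best first."""
    scored = {label: conf * context_weights.get(label, 1.0)
              for label, conf in candidates.items()}
    return sorted(scored, key=scored.get, reverse=True)

candidates = {"orange": 0.40, "tangerine": 0.45, "tomato": 0.10, "lemon": 0.05}
# e.g. preference data suggests this customer demographic favors oranges:
context = {"orange": 1.5}
assert rerank(candidates, context)[0] == "orange"   # 0.40 * 1.5 = 0.60 > 0.45
```

With an empty context the base ranking is returned unchanged, so the weighting only reorders options when auxiliary information is actually available.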
In step S504, the AI processing unit 213 outputs the above recognition result to the output unit 3. The output unit 3 displays the recognition result to the user. If the first option is not the correct result, the user can select the correct result among the other options, or manually enter the correct result. In step S505, the AI processing unit 213 stores the image data and the correct recognition result in the memory 212 and/or uploads them to the memory of a computing device in the cloud environment. As described above, according to the feedback mechanism, these data can be used to retrain the AI inference model.
<Second Embodiment>
FIG. 7 shows a second embodiment of the AI weighing system according to the present invention.
The image sensor 21 of the imaging device 2 according to the first embodiment can output RGB color images and/or black-and-white images. In the second embodiment, as shown in FIG. 7, the imaging device 2 further includes other sensors, such as a ToF sensor 22, a polarization sensor 23, and a multi-wavelength sensor 24. These other sensors are all connected to the image sensor 21. The AI processing unit 213 can combine or fuse the data output by the various types of sensors for AI recognition, so as to optimize the image of the target object and increase the recognition accuracy of the target object.
The ToF sensor 22 is a ToF (Time of Flight) distance image sensor that measures the distance to the target object by detecting the flight time (time difference) taken for light emitted by a light source to be reflected by the target object and reach the sensor. The AI processing unit 213 can fuse the RGB image obtained from the imaging unit 211 with the ToF image output by the ToF sensor 22 to obtain a stereoscopic image of the target object. Recognition based on this stereoscopic image can increase the recognition accuracy of the target object.
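The time-of-flight principle stated above reduces to one formula: the light travels to the object and back, so distance = (speed of light × flight time) / 2. A small sketch with illustrative numbers:

```python
# ToF distance principle: the measured flight time is a round trip,
# so the one-way distance is halved. Values below are illustrative.

C = 299_792_458.0  # speed of light in m/s

def tof_distance(flight_time_s):
    """Distance to the target for a measured round-trip flight time."""
    return C * flight_time_s / 2.0

# A round trip of ~6.67 ns corresponds to a target about 1 m away.
d = tof_distance(6.671e-9)
assert abs(d - 1.0) < 0.01
```

The nanosecond-scale timing involved is why ToF depth sensing is implemented in dedicated sensor hardware rather than in software.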
The polarization sensor 23 is, for example, a sensor obtained by building the polarizing element, which is an independent component in a conventional polarization camera, into a CMOS image sensor. The polarization sensor 23 can photograph subjects that are hard to see clearly due to light reflection; for example, a plastic bag packaging vegetables produces uneven reflections, and the sensor can finely render the unevenness of the object's surface, thereby optimizing the captured image of the target object. When the target object produces reflections due to a smooth surface or the like, the AI processing unit 213 can choose to use the output of the polarization sensor to optimize the recognition accuracy of the target object.
The multi-wavelength sensor 24 exploits different sensitivities to light of different wavelengths to capture small differences in the color of the target object. Using the wavelength-sensitivity differences of the target object output by the multi-wavelength sensor 24 is particularly advantageous for the AI processing unit 213 when recognizing fruits such as citrus.
According to the second embodiment of the present invention, the AI processing unit 213 can selectively combine or fuse one or more kinds of the data output by the above sensors with the data output by the image sensor 21. In this way, compared with the first embodiment, which uses only RGB images, the second embodiment further improves the recognition performance through the separate or combined use of multiple types of sensor data.
<Third Embodiment>
FIG. 8 shows a third embodiment of the AI weighing system according to the present invention. The third embodiment differs from the first embodiment in that two imaging devices are provided. As shown in FIG. 8, the AI weighing system according to this embodiment includes an imaging device 2 and an imaging device 5 that communicates with the imaging device 2. The imaging device 2 is the same as that of the first embodiment.
The imaging device 5 is arranged at a position different from that of the imaging device 2 above the platform, for example in front of the checkout counter of a supermarket or similar venue. As shown in FIG. 8, the imaging device 5 is used to capture an image of a target person (for example, a customer) and send the captured image to the imaging device 2. The AI processing unit 213 of the imaging device 2 can obtain anonymous personal information (feature data) of the customer from the customer's image, for example metadata characterizing gender, age, and so on. The AI processing unit 213 may store and output only this anonymous personal information, without saving the customer's image. The AI processing unit 213 assists the recognition of the target object based on preference information associated with this anonymous personal information, for example by changing the weight of a particular option in the recognition result.
On the other hand, similarly to the imaging device 2, the imaging device 5 itself may also include an image sensor with AI processing functions. This image sensor processes the customer's image to obtain the customer's feature data (for example, the customer's gender and age). The imaging device 5 sends only the metadata obtained from the captured image, rather than the image itself, to the imaging device 2, thereby fully protecting personal privacy while increasing transmission and processing speed.
Preferably, the imaging device 5 may include an RGB image sensor and a ToF sensor, for obtaining an RGB image and a ToF image of the customer, respectively, and sending both images to the AI processing unit 213 of the imaging device 2. The AI processing unit 213 combines the customer's RGB image and ToF image to generate a stereoscopic image of the customer, and obtains the customer's feature data based on this stereoscopic image. Where the imaging device 5 itself includes an image sensor with AI processing functions, the above AI processing (including generating the stereoscopic image and the feature analysis based on it) is performed on the imaging device 5 side, and the imaging device 5 sends only the processing result to the imaging device 2.
Specifically, for example, the imaging device 5 determines from the captured image that the customer is a woman of about 25, and sends this metadata (that is, "female, 25") to the imaging device 2. The imaging device 2 stores the metadata in the memory 212 and sends it to the inference unit 2131. The memory 212 stores preference information for 25-year-old women, which is regularly updated based on historical data. As shown in FIG. 8, after the item the customer is buying is placed on the platform 1, the inference unit 2131 uses the AI model to obtain, based on the target-object image captured by the imaging unit 211, a preliminary recognition result indicating that the object may be an orange, a tangerine, a tomato, or a lemon. Meanwhile, from the customer information obtained from the imaging device 5, the inference unit 2131 learns that women around 25 prefer sweeter oranges, and uses this as auxiliary recognition information. In the absence of other recognition parameters, the inference unit 2131 will, based on this preference, ultimately place orange as the first-ranked option in the recognition result. After the customer completes payment, the above metadata and the final result for the target object (whether the user selected the first-ranked orange) can be used to retrain the AI model.
Optionally, the camera device 5 may also capture an image of the shopping basket or shopping cart held by the customer and send the image data to the camera device 2. The camera device 2 can perform pre-recognition on the image data received from the camera device 5. Alternatively, the camera device 5 may itself perform AI processing for pre-recognition on the image data it captures and send the metadata of the processing result directly to the camera device 2. For example, the camera device 5 may send the list of items recognized from the image data to the camera device 2. The pre-recognition result can serve as auxiliary information for the final AI recognition performed by the camera device 2.
Furthermore, after the customer completes payment, the metadata of the above AI processing results for the customer's image (for example, gender and age range) and information such as the final list of goods purchased by the customer can be stored in the memory 212 and/or in a computing device in the cloud environment, to be used for updating the preference information associated with the anonymous personal information and for retraining the AI model. In this way, the recognition accuracy of the AI model is further improved.
In this regard, when the customer pays with a mobile phone, the AI processing unit 213 can also recognize the customer's QR code to obtain anonymous information about the customer, such as purchase history, for updating the preference information and retraining the AI model.
<Fourth Embodiment>
FIG. 9 shows a fourth embodiment of the AI weighing system according to the present invention. The fourth embodiment differs from the third embodiment in that a camera device 6 is additionally provided; that is, this embodiment includes three camera devices 2, 5, and 6. The camera devices 2 and 5 are the same as in the second embodiment.
The camera device 6 according to this embodiment is installed at a position different from those of the camera devices 2 and 5. Specifically, a plurality of camera devices 6 are installed at a plurality of different positions in a venue such as a supermarket. For example, FIG. 9 shows a camera device 6 installed in a refrigerated cabinet in a supermarket. Of course, a camera device 6 may be installed at each merchandise display area in the supermarket.
As shown in FIG. 9, the camera device 6 photographs customers within its shooting range and sends the captured images to the camera device 2. The camera device 2 can perform simultaneous localization and mapping (SLAM) processing on the images of the customer. Specifically, the AI processing unit 213 analyzes the feature points in an image, determines whether a given feature point has moved by a certain vector relative to another image, and generates a SLAM map by merging the feature-point data of multiple images captured in succession. In this way, combined with the product information, stored in the system memory, for the cabinet where the camera device 6 is installed, the AI processing unit 213 can determine which products the customer has taken from or put back at that cabinet.
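The feature-point analysis described above can be illustrated with a simplified displacement estimate between two frames. Real SLAM pipelines detect features (e.g. ORB/FAST) and match them robustly; this numpy toy stands in for that step and is not the patent's implementation.

```python
import numpy as np

def dominant_shift(pts_prev: np.ndarray, pts_curr: np.ndarray) -> np.ndarray:
    """Estimate the dominant displacement vector of matched feature points
    between two successive frames. The median makes the estimate robust
    to a few mismatched (outlier) points."""
    shifts = pts_curr - pts_prev      # per-point motion vectors
    return np.median(shifts, axis=0)

prev = np.array([[10, 20], [40, 25], [70, 30], [15, 60]], dtype=float)
curr = prev + np.array([5.0, -2.0])   # whole scene shifted by (5, -2)
curr[0] += [30, 30]                   # one mismatched outlier
print(dominant_shift(prev, curr))     # [ 5. -2.]
```

Accumulating such per-frame shifts over many images is, in highly simplified form, how feature-point data from successive frames is merged into a map.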
Optionally, the camera device 6 may also photograph the shopping cart or shopping basket of a customer within its shooting range and send the captured images to the camera device 2. Similarly, the AI processing unit 213 can perform SLAM processing on the images of the shopping cart or basket to determine which products have been put into or taken out of it. Alternatively, the AI processing unit 213 can also compare images of the shopping cart or basket captured by multiple camera devices 6 installed at different positions, and determine from the differences between the images which products have been put into or taken out of the cart or basket.
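At its simplest, comparing snapshots from camera devices at different locations reduces to diffing the recognized item lists from the two images. The item names below are illustrative.

```python
from collections import Counter

def cart_delta(items_before, items_after):
    """Compare the recognized item lists from two cart snapshots and
    report what was added or removed in between. Multiset (Counter)
    subtraction handles duplicate items correctly."""
    before, after = Counter(items_before), Counter(items_after)
    return {"added": dict(after - before), "removed": dict(before - after)}

snap_a = ["milk", "apple", "apple"]   # recognized at cabinet A
snap_b = ["milk", "apple", "bread"]   # recognized later at cabinet B
print(cart_delta(snap_a, snap_b))
# {'added': {'bread': 1}, 'removed': {'apple': 1}}
```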
Of course, when both the customer and his or her shopping cart or basket are within the shooting range of the camera device 6, the camera device 6 can send an overall image of the customer and the cart or basket to the camera device 2, and the AI processing unit 213 can determine, based on that overall image, which products the customer has put into or taken out of the cart or basket.
On the other hand, similar to the camera device 5, the camera device 6 itself may include an image sensor with an AI processing function; this AI image sensor performs the above AI processing on the camera device 6, and the metadata of the processing result is sent to the camera device 2. Examples of such metadata: a customer took a product from, or returned it to, a certain cabinet at a certain time; a product was put into or removed from a customer's shopping cart or basket at a certain time.
In this way, since accurate information about all or some of the products the customer has selected in the store can be obtained, both the accuracy and the efficiency of the AI processing unit 213 in recognizing the target (the list of products selected by the customer) are greatly improved.
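One way such shelf- and cart-level observations can improve both accuracy and efficiency is by narrowing the recognition search to products the customer was actually seen picking up. The scores and item names in this sketch are illustrative assumptions.

```python
def restrict_candidates(scores: dict, observed_items: set) -> dict:
    """Keep only recognition candidates that the shelf/cart cameras saw
    the customer pick up; fall back to the full candidate set if none
    of the observed items match (so recognition never dead-ends)."""
    narrowed = {k: v for k, v in scores.items() if k in observed_items}
    return narrowed or scores

scores = {"orange": 0.35, "tangerine": 0.33, "lemon": 0.20, "tomato": 0.12}
observed = {"tangerine", "bread"}      # items seen taken from cabinets

narrowed = restrict_candidates(scores, observed)
print(max(narrowed, key=narrowed.get))  # tangerine
```

Filtering before the final classification both removes impossible candidates and shrinks the search space, which is consistent with the accuracy and efficiency gains described above.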
In addition, similar to the third embodiment, the camera device 2 can process the images of the customer captured by the camera device 6 to obtain anonymous information about the customer (for example, gender and age). When the camera device 6 itself has an AI image sensor, only the metadata of this information may be sent to the AI processing unit 213 of the camera device 2. Meanwhile, the AI processing unit 213 can associate the customer's metadata with the data of the products the customer selected and store the association in the memory 212 for retraining the AI model.
Similar to the third embodiment, after the customer completes payment, the metadata of the above AI processing results for the customer's image (for example, gender and age range) and information such as the final list of goods purchased by the customer can be stored in the memory 212 and/or in the cloud for retraining the AI model. In this way, the recognition accuracy of the AI model is further improved.
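The post-payment retraining flow can be sketched as a buffer that accumulates outcome records (features, anonymous metadata, final purchased label) and triggers retraining once enough new samples arrive. The training step itself is a placeholder, since the patent leaves the algorithm unspecified; the threshold and record format are assumptions.

```python
class RetrainingBuffer:
    """Accumulates post-payment outcomes and periodically retrains.
    Illustrative sketch; the actual retraining mechanism is unspecified."""

    def __init__(self, threshold: int = 1000):
        self.samples = []
        self.threshold = threshold

    def record(self, features, metadata, final_label):
        """Store one outcome; retrain when the buffer is full."""
        self.samples.append((features, metadata, final_label))
        if len(self.samples) >= self.threshold:
            self.retrain()

    def retrain(self):
        batch, self.samples = self.samples, []
        # placeholder: e.g. fine-tune the on-sensor model, then redeploy it
        print(f"retraining on {len(batch)} samples")

buf = RetrainingBuffer(threshold=2)
buf.record([0.1, 0.9], {"gender": "F", "age": "20-29"}, "orange")
buf.record([0.4, 0.6], {"gender": "M", "age": "30-39"}, "lemon")
# prints "retraining on 2 samples"
```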
<Fifth Embodiment>
FIG. 10 shows a fifth embodiment of the AI weighing system according to the present invention. As shown in FIG. 10, multiple AI weighing and payment systems may be installed in the same store and connected to one another wirelessly or by wire. The multiple AI weighing systems thus communicate with one another and update the store's database (including product information, customers' anonymous personal information, purchase history data, and the like). Based on the updated database, the AI model of each AI image sensor is retrained, continuously improving its recognition accuracy. Of course, the training and retraining of the AI model may also be performed on a computing device in the cloud environment, with the updated AI model then deployed to each AI weighing system.
In addition, the database information of different stores (store A and store B shown in FIG. 10) can also be uploaded to a computing device in the cloud environment. Using the databases of the different stores, a large database is built on the computing device in the cloud environment; this large database is used to create, train, and retrain AI models and to update the AI model of each store. Using the cloud environment, the AI processing units of the AI weighing systems in different stores can also communicate in real time.
In addition, each store can obtain other information from, for example, the cloud environment, such as real-time temperature, regional address, and local climate. This information can also help improve the recognition accuracy of the AI processing unit for targets (for example, products in the store).
The application scenarios of the present invention have been described above using the weighing of goods in a supermarket as an example. However, the present invention is obviously not limited thereto. Those skilled in the art will understand that the AI recognition system of the present invention can also be applied to any scene in which goods or items need to be recognized, and is not limited to weighing scenes.

Claims (20)

  1. An AI weighing system, comprising:
    a storage table for placing a target object, the storage table being capable of weighing the target object;
    a first camera device for photographing and recognizing the target object placed on the storage table, the first camera device comprising an image sensor capable of performing, in an offline state, AI processing for recognizing the target object; and
    an output unit for outputting the recognition result and the weighing result of the target object.
  2. The AI weighing system according to claim 1, wherein the image sensor comprises a stacked CMOS image sensor chip having a first substrate and a second substrate, the first substrate having a plurality of pixels that convert optical signals into electrical signals, and the second substrate having a memory and a processing circuit, the memory storing an AI model, and the processing circuit performing the AI processing based on the electrical signals by using the AI model.
  3. The AI weighing system according to claim 2, wherein the AI model comprises a first inference model.
  4. The AI weighing system according to claim 3, wherein the processing circuit of the stacked CMOS image sensor chip generates image data, and the processing circuit comprises:
    a learning unit that retrains the first inference model based on the image data; and
    an inference unit that recognizes the target object by using the first inference model.
  5. The AI weighing system according to claim 3, further comprising one or more computing devices located in a cloud environment, the one or more computing devices having corresponding processors and memories;
    wherein the processing circuit of the stacked CMOS image sensor chip generates image data, and the image data is sent to the one or more computing devices.
  6. The AI weighing system according to claim 5, wherein the one or more computing devices create a second inference model based on the image data generated by the stacked CMOS image sensor chip, and directly deploy the second inference model into the memory of the stacked CMOS image sensor chip, so that the first inference model is updated.
  7. The AI weighing system according to claim 4 or 5, wherein the stacked CMOS image sensor chip is capable of selecting the size of the image data, the size including a full sensor size and a video graphics array (VGA) size based on the AI processing of the processing circuit.
  8. The AI weighing system according to claim 7, wherein, when the stacked CMOS image sensor chip outputs the image data having the full sensor size, the AI processing comprises cropping an overall image or a partial image of the target object from the image data having the full sensor size, the partial image including a VGA-size image.
  9. The AI weighing system according to any one of claims 1 to 6, wherein the image data captured by the first camera device comprises:
    a contour image of the target object; and/or
    a cross-sectional image of the target object; and/or
    a package image of the target object.
  10. The AI weighing system according to any one of claims 1 to 6, wherein the first camera device further comprises a ToF sensor, and the AI processing includes combining RGB data output by the image sensor with ToF data output by the ToF sensor.
  11. The AI weighing system according to any one of claims 1 to 6, wherein the first camera device further comprises a multi-wavelength sensor and/or a polarization sensor, and the AI processing can use the output of the multi-wavelength sensor and/or the polarization sensor to optimize recognition of the target object.
  12. The AI weighing system according to any one of claims 1 to 6, further comprising:
    a second camera device for acquiring an image of a target person and sending it to the first camera device, wherein
    the AI processing includes acquiring feature data from the image of the target person acquired by the second camera device and using the feature data to assist in recognizing the target object.
  13. The AI weighing system according to claim 12, wherein the feature data includes the gender and age of the target person.
  14. The AI weighing system according to any one of claims 1 to 6, further comprising:
    a third camera device for acquiring an image of a target person and sending it to the first camera device, wherein
    the AI processing includes performing SLAM processing on the image of the target person acquired by the third camera device and outputting metadata of the processing result to assist in recognizing the target object.
  15. The AI weighing system according to claim 14, wherein the third camera device also acquires an image of the shopping cart or shopping basket of the target person and sends it to the first camera device, and the AI processing includes performing SLAM processing on the image of the shopping cart or shopping basket and outputting metadata of the processing result to assist in recognizing the target object.
  16. The AI weighing system according to claim 14, wherein the third camera device includes a plurality of image sensors positioned at different locations within the movement range of the target person.
  17. The AI weighing system according to any one of claims 1 to 6, wherein the AI processing further includes acquiring other information including ambient temperature, regional address, and/or climate conditions, and the other information can be used to assist in recognizing the target object.
  18. A method for improving the recognition accuracy of an AI model by using multiple types of data sets, comprising:
    acquiring image data of an item;
    creating an AI model by using learning data, the learning data including the image data and the name and attributes of the item; and
    applying the AI model to recognize a target object,
    wherein the learning data includes at least two of the following three data sets:
    a contour image of the item;
    a cross-sectional image of the item;
    a package image of the item.
  19. The method according to claim 18, wherein the image data of the item includes RGB image data and at least one of the following:
    ToF data;
    multi-wavelength data;
    polarization data.
  20. The method according to claim 19, wherein the learning data further includes a data set of partial images of the item.
PCT/CN2023/071665 2022-01-20 2023-01-10 Ai weighing system, and method for improving precision of ai model by using various types of data sets WO2023138447A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210065055.5 2022-01-20
CN202210065055.5A CN116503724A (en) 2022-01-20 2022-01-20 AI weighing system and method for increasing accuracy of AI model using multiple data sets

Publications (1)

Publication Number Publication Date
WO2023138447A1 true WO2023138447A1 (en) 2023-07-27

Family

ID=87318935

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/071665 WO2023138447A1 (en) 2022-01-20 2023-01-10 Ai weighing system, and method for improving precision of ai model by using various types of data sets

Country Status (2)

Country Link
CN (1) CN116503724A (en)
WO (1) WO2023138447A1 (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107767590A (en) * 2016-08-16 2018-03-06 深圳仪普生科技有限公司 Automatic identification commercialization bar code electronic scale and Automatic identification method
CN109508974A (en) * 2018-11-29 2019-03-22 华南理工大学 A kind of shopping accounting system and method based on Fusion Features
CN110705533A (en) * 2019-09-09 2020-01-17 武汉联析医疗技术有限公司 AI recognition and grabbing system for inspection report
US20200151692A1 (en) * 2018-04-18 2020-05-14 Sbot Technologies, Inc. d/b/a Caper Inc. Systems and methods for training data generation for object identification and self-checkout anti-theft
CN111814614A (en) * 2020-06-28 2020-10-23 袁精侠 Intelligent object-identifying electronic scale weighing method and system
CN112307799A (en) * 2019-07-24 2021-02-02 鲁班嫡系机器人(深圳)有限公司 Gesture recognition method, device, system, storage medium and equipment


Also Published As

Publication number Publication date
CN116503724A (en) 2023-07-28


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
Ref document number: 23742762
Country of ref document: EP
Kind code of ref document: A1