CN116051972A - Container identification method, device, container access equipment and storage medium - Google Patents

Container identification method, device, container access equipment and storage medium

Info

Publication number: CN116051972A
Application number: CN202111249636.6A
Authority: CN (China)
Prior art keywords: container, target, category, image, pixel
Legal status: Pending
Other languages: Chinese (zh)
Inventor: 赵仕伟 (Zhao Shiwei)
Current Assignee: Beijing Jizhijia Technology Co Ltd
Original Assignee: Beijing Jizhijia Technology Co Ltd
Application filed by Beijing Jizhijia Technology Co Ltd
Priority application: CN202111249636.6A
PCT application: PCT/CN2022/127120 (published as WO2023061506A1)

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a container identification method, a container identification device, container access equipment and a storage medium. The container identification method comprises the following steps: acquiring an image acquired by an image sensor, wherein the image carries color data and depth data of each pixel; determining a candidate container region in the image according to the color data and/or the depth data of each pixel in the image; classifying the target container in the candidate container region according to the color data and the depth data of each pixel in the candidate container region, respectively, to obtain a first recognition result and a second recognition result; and determining the category of the target container based on the first recognition result and the second recognition result. In this scheme, the first recognition result based on the color data and the second recognition result based on the depth data compensate each other when the category is determined, so that the obtained category of the target container is more accurate.

Description

Container identification method, device, container access equipment and storage medium
Technical Field
The present invention relates to the field of warehousing technology, and in particular to a container identification method, a container identification device, container access equipment, and a storage medium.
Background
In recent years, with the rapid development of electronic commerce, user order volumes have grown geometrically and warehouses need to store large numbers of articles, so automatic sorting of articles has become key to improving warehousing efficiency.
At present, automatic sorting of articles in a warehouse is mostly realized by container access equipment. Such equipment includes gripping devices such as suction cups and robotic arms, through which a target container holding articles to be sorted can be taken off a rack and moved to a designated position. In actual logistics scenarios, several types of containers (such as ordinary bins, hollowed-out bins, cartons and the like) are often stored on the same rack or on different racks, and a different grabbing strategy must be selected for each type of container to ensure that the container is not damaged when grabbed. Therefore, automatically identifying the container category while ensuring the accuracy of the identification result is key to realizing automatic sorting of articles.
Disclosure of Invention
In view of the above, embodiments of the present invention provide a container identification method, device, container access apparatus and storage medium, so as to solve the technical defects existing in the prior art.
According to a first aspect of an embodiment of the present invention, there is provided a container identification method, including:
acquiring an image acquired by an image sensor, wherein the image carries color data and depth data of each pixel;
determining a candidate container area in the image according to the color data and/or the depth data of each pixel in the image, wherein the candidate container area is an image area containing a target container;
classifying and identifying the target container in the candidate container region according to the color data and the depth data of each pixel in the candidate container region respectively to obtain a first identification result and a second identification result;
and determining a category of the target container based on the first recognition result and the second recognition result.
Optionally, the step of determining the candidate container region in the image based on color data and/or depth data of each pixel in the image comprises:
determining a target area formed by a plurality of adjacent pixels with continuous color data according to the color data of each pixel in the image; and/or determining a target area composed of a plurality of adjacent pixels with continuous depth data according to the depth data of each pixel in the image;
based on the target region, a candidate container region is determined.
Optionally, the step of determining the candidate container region based on the target region comprises:
if there are a plurality of target areas, acquiring first center position data of the image and second center position data of each target area;
for any target area, calculating the center distance between the target area and the image according to the second center position data of the target area and the first center position data; and/or calculating average depth data of the target area according to the depth data of each pixel in the target area;
determining a target area whose center distance is less than a first preset threshold as part of the candidate container region; and/or determining a target area whose average depth data is less than a second preset threshold as part of the candidate container region.
Optionally, the step of classifying and identifying the target container in the candidate container region according to the color data and the depth data of each pixel in the candidate container region to obtain a first identification result and a second identification result includes:
invoking a trained first recognition model, and processing color data of each pixel in the candidate container area to obtain a first recognition result;
and calling the trained second recognition model, and processing the depth data of each pixel in the candidate container region to obtain a second recognition result.
Optionally, the training method of the first recognition model includes:
acquiring a plurality of first training samples, wherein the first training samples comprise color data of each pixel and carry label information of the real category of the container;
acquiring an initial first identification model;
training the initial first recognition model according to the plurality of first training samples to obtain the trained first recognition model, wherein the input of the first recognition model is the color data of each pixel and its output is a corresponding first predicted container category;
a method of training a second recognition model, comprising:
acquiring a plurality of second training samples, wherein the second training samples comprise depth data of each pixel and carry label information of the real category of the container;
acquiring an initial second recognition model;
training the initial second recognition model according to the plurality of second training samples to obtain the trained second recognition model, wherein the input of the second recognition model is the depth data of each pixel and its output is a corresponding second predicted container category.
Optionally, the first recognition result includes a first category and a first confidence level, and the second recognition result includes a second category and a second confidence level;
the step of determining the category of the target container based on the first recognition result and the second recognition result comprises:
and determining the category of the target container according to the first category and the first confidence level and the second category and the second confidence level.
Optionally, the step of determining the category of the target container according to the first category and the first confidence level, and the second category and the second confidence level includes:
for different categories of the target container, respectively acquiring weights preset for determining the category according to the color data and determining the category according to the depth data;
for any category, weighting the first confidence level and the second confidence level by using the weights to obtain a weighted value;
and determining the category of the target container according to the weighted value.
Optionally, after the step of determining the category of the target container based on the first identification result and the second identification result, the method further includes:
determining a target grabbing strategy corresponding to the category according to the category of the target container;
and driving the target grabbing device to grab the target container according to the target grabbing strategy.
Optionally, the container is a cargo box.
According to a second aspect of an embodiment of the present invention, there is provided a container identification device including:
an acquisition module configured to acquire an image acquired by an image sensor, wherein the image carries color data and depth data of each pixel;
a positioning module configured to determine a candidate container region in the image according to color data and/or depth data of each pixel in the image, wherein the candidate container region is an image region containing a target container;
an identification module configured to classify and identify the target container in the candidate container region according to the color data and the depth data of each pixel in the candidate container region, respectively, to obtain a first recognition result and a second recognition result;
and a category determination module configured to determine a category of the target container based on the first recognition result and the second recognition result.
Optionally, the positioning module comprises a target area determining unit and a candidate container area determining unit;
a target area determining unit configured to determine a target area composed of a plurality of adjacent pixels whose color data is continuous, based on the color data of each pixel in the image; and/or determining a target area composed of a plurality of adjacent pixels with continuous depth data according to the depth data of each pixel in the image;
and a candidate container region determination unit configured to determine a candidate container region based on the target region.
Optionally, the candidate container region determining unit includes an acquiring subunit, a calculating subunit, and a determining subunit;
an acquisition subunit configured to acquire first center position data of the image and second center position data of each target area if there are a plurality of target areas;
a calculating subunit configured to calculate, for any one of the target areas, a center distance of the target area from the image based on the second center position data and the first center position data of the target area; and/or calculating average depth data of the target area according to the depth data of each pixel in the target area;
a determining subunit configured to determine a target region having a center distance less than a first preset threshold as a part of the candidate container region; and/or determining a target region with the average depth data less than a second preset threshold as part of the candidate container region.
Optionally, the recognition module is further configured to call the trained first recognition model, and process the color data of each pixel in the candidate container area to obtain a first recognition result; and calling the trained second recognition model, and processing the depth data of each pixel in the candidate container region to obtain a second recognition result.
Optionally, the apparatus further comprises: the first training module and the second training module;
the first training module is configured to acquire a plurality of first training samples, wherein the first training samples comprise color data of each pixel and carry label information of the real category of the container; acquire an initial first recognition model; and train the initial first recognition model according to the plurality of first training samples to obtain the trained first recognition model, wherein the input of the first recognition model is the color data of each pixel and its output is a corresponding first predicted container category;
the second training module is configured to acquire a plurality of second training samples, wherein the second training samples comprise depth data of each pixel and carry label information of the real category of the container; acquire an initial second recognition model; and train the initial second recognition model according to the plurality of second training samples to obtain the trained second recognition model, wherein the input of the second recognition model is the depth data of each pixel and its output is a corresponding second predicted container category.
Optionally, the first recognition result includes a first category and a first confidence level, and the second recognition result includes a second category and a second confidence level;
the category determination module is further configured to determine the category of the target container according to the first category and the first confidence level, and the second category and the second confidence level.
Optionally, the category determination module is further configured to obtain, for the different categories of the target container, the weights preset for determining the category according to the color data and for determining the category according to the depth data; for any category, weight the first confidence level and the second confidence level using the weights to obtain a weighted value; and determine the category of the target container according to the weighted values.
Optionally, the apparatus further comprises: a container grabbing control module;
the container grabbing control module is configured to determine a target grabbing strategy corresponding to the category according to the category of the target container; and driving the target grabbing device to grab the target container according to the target grabbing strategy.
Optionally, the container is a cargo box.
According to a third aspect of an embodiment of the present invention, there is provided a container access apparatus comprising: an image sensor, a memory, and a processor;
the image sensor is used for acquiring an image and transmitting the image to the processor;
the memory is configured to store computer-executable instructions that, when executed by the processor, implement the container identification method described above.
According to a fourth aspect of embodiments of the present invention, there is provided a computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the above-described container identification method.
The embodiments provide a container identification method, a container identification device, container access equipment and a storage medium. The container identification method comprises the following steps: acquiring an image acquired by an image sensor, wherein the image carries color data and depth data of each pixel; determining a candidate container region in the image according to the color data and/or the depth data of each pixel in the image; classifying the target container in the candidate container region according to the color data and the depth data of each pixel in the candidate container region, respectively, to obtain a first recognition result and a second recognition result; and determining the category of the target container based on the first recognition result and the second recognition result.
When container identification is carried out, the candidate container region containing the target container is first located in the image according to the color data and/or the depth data of each pixel, thereby locating the target container. The first recognition result, based on the color data of each pixel in the candidate container region, is then combined with the second recognition result, based on the depth data of each pixel in that region, to determine the category of the target container. Because the color-based and depth-based recognition results compensate each other when the category is determined, the obtained category of the target container is more accurate.
Drawings
FIG. 1 is a flow chart of a method of identifying a container provided in one embodiment of the invention;
FIG. 2 is a flow chart of another method of identifying a container provided in accordance with one embodiment of the present invention;
FIG. 3 is a flow chart of a method of gripping containers provided in one embodiment of the invention;
FIG. 4 is a schematic view of a container identification device according to an embodiment of the present invention;
FIG. 5 is a block diagram of a container access device according to an embodiment of the present invention.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. However, the present invention may be embodied in many forms other than those described herein, and those skilled in the art can make similar modifications without departing from its spirit; the invention is therefore not limited to the specific embodiments disclosed below.
The terminology used in the one or more embodiments of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the one or more embodiments of the invention. As used in one or more embodiments of the invention and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of the present invention refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that, although the terms first, second, etc. may be used in one or more embodiments of the invention to describe various information, this information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, without departing from the scope of one or more embodiments of the invention, a first may also be referred to as a second, and similarly, a second may also be referred to as a first. Depending on the context, the word "if" as used herein may be interpreted as "when", "upon" or "in response to determining".
First, terms related to one or more embodiments of the present invention will be explained.
A container: also commonly called a bin or a cargo box, is an entity that holds articles such as goods or materials; containers include plastic bins, cartons, plastic baskets, and the like.
Container access device: also called a carry-transfer robot (C robot), is an automated device capable of taking containers from a rack, putting them back, and transferring them.
Color data: various data representing pixel color space, such as RGB, HSV, etc.
Depth data: data characterizing the depth of each pixel; the larger the value, the deeper (farther from the image sensor) the pixel lies.
In current warehouse container sorting scenarios, container identification mainly follows one of two schemes. In the first, some containers bear text on their surfaces stating what type of container they are; an RGB camera captures an image of the container surface, character recognition is performed on the captured image, and the container category is determined from the character recognition result. In the second, an image acquisition device captures images of the container and the captured images are fed into a pre-trained recognition model; because the model is trained on a large number of sample images with category labels, it can directly output the category of the container.
However, in practical scenarios only a small proportion of containers carry a text description on their surface, so the first scheme has a very narrow range of application. In the second scheme, the recognition model generally performs poorly on containers of different materials with similar colors. In summary, current container identification schemes do not identify well in complex scenarios where multiple kinds of containers exist.
In order to cope with the above-described problems, the present invention provides a container recognition method, and at the same time, provides a container recognition apparatus, a container access device, and a computer-readable storage medium, which are described in detail one by one in the following embodiments.
The container identification method provided by the embodiment of the present invention may be executed by a computing device that performs container identification, or by container access equipment that automatically accesses containers. The method may be implemented by at least one of software, a hardware circuit, and a logic circuit arranged inside the executing entity.
Fig. 1 shows a flowchart of a container identification method according to an embodiment of the present invention, which specifically includes the following steps.
Step 102, acquiring an image acquired by an image sensor, wherein the image carries color data and depth data of each pixel.
In an embodiment of the present invention, the image sensor is of a type, such as an RGB-D camera, that obtains color data and depth data for each pixel when an image is acquired. The image sensor shoots the scene within its field of view and transmits the captured image to the executing entity of the embodiment, which thereby obtains the color data and depth data of each pixel carried by the image. In some cases, the image sensor may output a color image and a depth image: the color data of each pixel in the color image is the color data of each pixel in the acquired image, and the depth data of each pixel in the depth image is the depth data of each pixel in the acquired image. The embodiment of the invention does not particularly limit the output form of the image sensor. In the embodiment of the invention, the scene within the field of view of the image sensor is a scene in which containers are placed.
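As a concrete illustration of this acquisition step (the patent does not name a particular sensor or SDK; an Intel RealSense camera accessed through the pyrealsense2 library is assumed here), a minimal sketch of grabbing one frame in which every pixel carries both color data and depth data might look as follows:

```python
import numpy as np
import pyrealsense2 as rs  # assumption: an Intel RealSense RGB-D camera

pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
profile = pipeline.start(config)
depth_scale = profile.get_device().first_depth_sensor().get_depth_scale()

align = rs.align(rs.stream.color)                # register depth onto the color frame
frames = align.process(pipeline.wait_for_frames())

color = np.asanyarray(frames.get_color_frame().get_data())                 # HxWx3 color data
depth = np.asanyarray(frames.get_depth_frame().get_data()) * depth_scale   # HxW depth in metres
```

Aligning the depth stream onto the color stream gives the per-pixel correspondence between the two kinds of data that the later positioning and classification steps rely on.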
Step 104, determining a candidate container area in the image according to the color data and/or the depth data of each pixel in the image, wherein the candidate container area is an image area containing the target container.
After the image acquired by the image sensor is obtained, the color data and the depth data of each pixel in the image are available. Since the surface color and depth of a container generally follow certain rules (for example, the surface of a container is usually a solid color such as blue, red or green, and its depth is generally fixed), the candidate container region containing the target container can be determined from the pixel information of each pixel in the image. Specifically, it can be determined whether the color data of each pixel is specified color data (color data representing a solid color) and/or whether the depth data of each pixel is specified depth data (the fixed depth of the container), and a region composed of a plurality of pixels whose color data is the specified color data and/or whose depth data is the specified depth data is determined as the candidate container region. Where the image sensor outputs a color image and a depth image, the candidate container region can be determined from the color data of each pixel of the color image and from the depth data of each pixel of the depth image.
In one implementation manner of the embodiment of the present invention, step 104 may be specifically implemented by the following manner: determining a target area formed by a plurality of adjacent pixels with continuous color data according to the color data of each pixel in the image; and/or determining a target area composed of a plurality of adjacent pixels with continuous depth data according to the depth data of each pixel in the image. Based on the target region, a candidate container region is determined.
Although the surface color and depth of a container generally follow certain rules, they are influenced by factors such as illumination and viewing angle, so the collected color data and depth data of each pixel carry errors within a certain range. To eliminate the influence of these factors and improve the precision of the container positioning result, in the embodiment of the invention a target area composed of a plurality of adjacent pixels with continuous color data can be determined according to the color data of each pixel in the image, and/or a target area composed of a plurality of adjacent pixels with continuous depth data can be determined according to the depth data of each pixel in the image, and the candidate container region is then determined based on the determined target areas.
Specifically, when determining the candidate container region, a region composed of a plurality of adjacent pixels whose color data is continuous may be determined as the candidate container region based on the color data of each pixel in the image, or a region composed of a plurality of adjacent pixels whose depth data is continuous may be determined as the candidate container region based on the depth data of each pixel in the image, or a region composed of a plurality of adjacent pixels whose color data is continuous and whose depth data is also continuous may be determined as the candidate container region based on the color data and the depth data of each pixel in the image.
The process of dividing out candidate container regions is similar to clustering: pixels meeting a continuity condition (that is, the color data and/or depth data of a plurality of adjacent pixels is continuous) are clustered into one class, and that class forms a candidate container region. The continuity condition may be that the color data of all pixels in a region lies within a certain range (the range is set considering the color deviation introduced by illumination), and/or that the depth data of all pixels lies within a certain range (the range is set considering the depth deviation introduced by the shooting angle). Of course, the continuity condition is not limited to this; it may also be that the Euclidean distance between the color data of the pixels in a region is smaller than a certain threshold (set considering the color deviation due to illumination), and/or that the Euclidean distance between the depth data of the pixels is smaller than a certain threshold (set considering the depth deviation due to the shooting angle).
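The patent leaves the exact grouping algorithm open. As one possible sketch, the continuity condition on depth data can be realized as a breadth-first region growing over the depth map, where the tolerance `tol` and the minimum region size `min_pixels` are assumed tuning parameters corresponding to the deviation ranges discussed above:

```python
import numpy as np
from collections import deque

def grow_depth_regions(depth, tol=0.01, min_pixels=500):
    """Group adjacent pixels whose depth data is continuous (neighbour-to-
    neighbour difference below tol) into target areas; discard tiny areas."""
    h, w = depth.shape
    labels = -np.ones((h, w), dtype=int)
    regions, next_label = [], 0
    for sy in range(h):
        for sx in range(w):
            if labels[sy, sx] != -1:
                continue
            queue = deque([(sy, sx)])
            labels[sy, sx] = next_label
            pixels = [(sy, sx)]
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w and labels[ny, nx] == -1
                            and abs(depth[ny, nx] - depth[y, x]) < tol):
                        labels[ny, nx] = next_label
                        queue.append((ny, nx))
                        pixels.append((ny, nx))
            if len(pixels) >= min_pixels:
                regions.append(np.array(pixels))
            next_label += 1
    return regions
```

A production system would more likely use vectorized connected-component labelling (for example scipy.ndimage.label after thresholding), but the effect is the same: adjacent pixels with continuous data end up in one target area.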
In one implementation manner of the embodiment of the present invention, the step of determining the candidate container area based on the target area may be specifically implemented as follows:
if there are a plurality of target areas, acquiring first center position data of the image and second center position data of each target area;
for any target area, calculating the center distance between the target area and the image according to the second center position data of the target area and the first center position data; and/or calculating average depth data of the target area according to the depth data of each pixel in the target area;
determining a target area whose center distance meets a first preset condition as part of the candidate container region; and/or determining a target area whose average depth data meets a second preset condition as part of the candidate container region.
In some special situations, for example when a plurality of containers are placed on a shelf but only one container is taken out at a time, the container near the center of the image sensor's field of view is generally located and grabbed in order to improve the accuracy of the positioning result. Therefore, if a plurality of target areas exist, a target area whose distance between its center and the center of the field of view (the center of the field of view being the center of the acquired image) meets the first preset condition, and/or whose average depth meets the second preset condition, is selected as part of the candidate container region. The first preset condition is a condition under which the position data of the candidate container region is optimal, for example that the distance to the center of the field of view is smallest, or smaller than a certain threshold. The second preset condition is a condition under which the depth data of the candidate container region is optimal, for example that the difference between the average depth data of the area and the plane depth of the acquired image is smaller than a certain threshold, or that the Euclidean distance between the depth data of the pixels in the area is smallest, and so on.
Specifically, the distance between the center of an area and the center of the image sensor's field of view is determined as follows: the first center position data of the image and the second center position data of each target area are acquired, and for any target area the center distance between the target area and the image is calculated from the target area's second center position data and the first center position data. The first center position data may be calculated from the internal parameters of the image sensor, which include the focal length, viewing angle and the like; the calculation is the same as the conventional process of determining a camera's center point and is not repeated here. The average depth of an area is determined as follows: for any target area, the average depth data is calculated from the depth data of each pixel in the target area, specifically by summing the depth data of every pixel in the area and dividing by the number of pixels.
In one implementation manner of the embodiment of the present invention, the step of determining, as a part of the candidate container area, the target area whose center distance meets the first preset condition may be specifically implemented by the following manner: a target region having a center distance less than a first preset threshold is determined as part of the candidate container region. The step of determining a target region for which the average depth data meets the second preset condition as part of the candidate container region may be specifically implemented as follows: a target region having average depth data less than a second preset threshold is determined as part of the candidate container region.
To ensure the accuracy of the container positioning result, a target area within a first preset threshold of the center of the image sensor's field of view is preferably selected as part of the candidate container region, and/or a target area whose average depth data is smaller than a second preset threshold. Further, the target area with the smallest distance from the center of the field of view may be selected as part of the candidate container region, and/or the target area with the smallest average depth data. This part of the candidate container region is the image area covered by the target container, so the target container is guaranteed to be near the center of the image sensor's field of view and/or closest to the image sensor, and hence to appear clearly in the acquired image, which improves the precision of container positioning and the detection rate of the target container.
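Continuing the earlier sketch, the screening by center distance (first preset threshold) and by average depth (second preset threshold) could be written as follows; the threshold values are illustrative assumptions, not values from the patent:

```python
import numpy as np

def select_candidates(regions, depth, image_center,
                      max_center_dist=80.0, max_mean_depth=1.5):
    """Keep target areas near the image centre (first preset threshold, in
    pixels) and close to the sensor (second preset threshold, in metres)."""
    selected = []
    for pix in regions:                          # pix: (N, 2) array of (row, col)
        center = pix.mean(axis=0)                # second centre position data
        center_dist = np.linalg.norm(center - np.asarray(image_center))
        mean_depth = depth[pix[:, 0], pix[:, 1]].mean()
        if center_dist < max_center_dist and mean_depth < max_mean_depth:
            selected.append(pix)
    return selected
```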
Step 106, classifying and identifying the target container in the candidate container region according to the color data and the depth data of each pixel in the candidate container region, respectively, to obtain a first recognition result and a second recognition result.
Once the candidate container region is obtained, the positioning of the target container is complete: it is known which area of the image is the area where the target container is located, and the category of the target container can then be identified on the premise of accurate positioning. In the embodiment of the invention, since the color data and the depth data are both image information, an image-based classification method can be used to classify the target container. Image-based classification methods include those based on deep learning or machine learning models, those based on image feature analysis, and the like. Classification based on a deep learning or machine learning model is an end-to-end intelligent recognition method with the characteristics of high efficiency and high accuracy.
In classification based on a deep learning/machine learning model, a neural network is trained on sample images. Each sample image contains a container and is labeled in advance with the container's category. The sample image is input into the neural network, a category recognition result for the container in the sample image is obtained through the network's forward computation, the recognition result is compared with the labeled category information to obtain a loss value, the network parameters are adjusted based on the loss value, and the step of inputting sample images into the network is executed again. Through repeated iteration the loss value gradually decreases; training stops when the loss value has fallen to a certain degree (i.e., the network converges) or the number of iterations reaches a certain count, and the network at that point is the final recognition model. In application, the acquired image is input into the recognition model, which can directly output the category recognition result of the container in the image.
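As an illustration of the training procedure just described (the patent does not specify a framework or architecture; PyTorch with a cross-entropy loss and illustrative stopping values is assumed), the loop might be sketched as:

```python
import torch
import torch.nn as nn

def train_recognition_model(model, loader, epochs=50, lr=1e-3, target_loss=0.05):
    """Iterate: forward pass, loss against the labelled category,
    back-propagation, parameter update; stop on convergence or max epochs."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()
    for epoch in range(epochs):
        running_loss = 0.0
        for images, labels in loader:            # sample images + category labels
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
            running_loss += loss.item()
        if running_loss / len(loader) < target_loss:  # "loss reduced to a certain degree"
            break
    return model
```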
For different types of containers there may be a certain difference between the recognition result based on color data and the recognition result based on depth data; for example, for a plastic container the result based on depth data may be more accurate than the result based on color data. Therefore, in the embodiment of the invention, the target container in the candidate container region is classified according to the color data and the depth data of each pixel in the region, respectively. The color data and depth data compensate each other across different types of containers, with one of the recognition results being relatively accurate, so the method is applicable to more types of containers.
In one implementation manner of the embodiment of the present invention, step 106 may be specifically implemented by the following manner:
invoking a trained first recognition model, and processing color data of each pixel in the candidate container area to obtain a first recognition result;
and calling the trained second recognition model, and processing the depth data of each pixel in the candidate container region to obtain a second recognition result.
In specific implementation, for the candidate container region, a trained first recognition model can be called to process color data of each pixel in the candidate container region to obtain a first recognition result, wherein the first recognition model is obtained by training a preset neural network based on a color image of a sample container.
Optionally, the training method of the first recognition model specifically may include the following steps:
acquiring a plurality of first training samples, wherein the first training samples comprise color data of each pixel and carry label information of the real category of the container;
acquiring an initial first identification model;
training the initial first recognition model according to the plurality of first training samples to obtain the trained first recognition model, wherein the input of the first recognition model is the color data of each pixel and its output is a corresponding first predicted container category.
A large number of first training samples are acquired; a first training sample may specifically be an RGB image, comprising the color data of each pixel and carrying label information of the real container category. During training, a first training sample is input into the first recognition model, a first predicted container category for the container in the sample is obtained through the model's forward computation, the prediction is compared with the label information of the real container category to obtain a loss value, the model parameters are adjusted based on the loss value, and the step of inputting first training samples is repeated. After repeated iterations the loss value gradually decreases; training stops when the loss value has fallen below a preset loss threshold (i.e., the network converges) or the number of iterations reaches a certain count, yielding the trained first recognition model.
Meanwhile, the trained second recognition model can be called to process the depth data of each pixel in the candidate container region, and a second recognition result is obtained, wherein the second recognition model is obtained by training a preset neural network based on the depth image of the sample container.
Optionally, the training method of the second recognition model specifically may include the following steps:
acquiring a plurality of second training samples, wherein the second training samples comprise depth data of each pixel and carry label information of the real category of the container;
acquiring an initial second recognition model;
training the initial second recognition model according to the plurality of second training samples to obtain the trained second recognition model, wherein the input of the second recognition model is the depth data of each pixel and its output is a corresponding second predicted container category.
A large number of second training samples are acquired; a second training sample may specifically be a depth image, comprising the depth data of each pixel and carrying label information of the real container category. During training, a second training sample is input into the second recognition model, a second predicted container category for the container in the sample is obtained through the model's forward computation, the prediction is compared with the label information of the real container category to obtain a loss value, the model parameters are adjusted based on the loss value, and the step of inputting second training samples is repeated. After repeated iterations the loss value gradually decreases; training stops when the loss value has fallen below a preset loss threshold (i.e., the network converges) or the number of iterations reaches a certain count, yielding the trained second recognition model.
In particular, the first recognition result and the second recognition result may include a category of the target container and a confidence level for the category.
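Putting the two model calls together (a sketch only; the tensor shapes, eval-mode handling, and the softmax-confidence convention are assumptions not fixed by the patent), step 106 could look like:

```python
import torch

def classify_candidate(rgb, depth, rgb_model, depth_model, classes):
    """Run the two trained models on the candidate container region and
    return the first and second recognition results as (category, confidence).
    rgb and depth are CHW float tensors cropped to the candidate region."""
    rgb_model.eval()
    depth_model.eval()
    with torch.no_grad():
        p1 = torch.softmax(rgb_model(rgb.unsqueeze(0)), dim=1)[0]      # color branch
        p2 = torch.softmax(depth_model(depth.unsqueeze(0)), dim=1)[0]  # depth branch
    first_result = (classes[int(p1.argmax())], p1.max().item())
    second_result = (classes[int(p2.argmax())], p2.max().item())
    return first_result, second_result
```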
Step 108, determining the category of the target container based on the first identification result and the second identification result.
After the first recognition result and the second recognition result are obtained, they can be combined to determine the category of the target container. For example, if the two results are identical and both give a certain category, the target container can be determined to be of that category. As another example, if both results identify several categories, it is judged whether overlapping categories exist, and if so the category of the target container can be determined to be the overlapping category. To further improve classification accuracy, the confidence in the first recognition result and the confidence in the second recognition result can be fused, and the category of the target container determined from the fused confidence.
In one implementation of the embodiment of the present invention, the first recognition result includes a first category and a first confidence, and the second recognition result includes a second category and a second confidence. Accordingly, step 108 may be specifically implemented as follows: determining the category of the target container according to the first category and the first confidence, and the second category and the second confidence.
In practical application, the first recognition result includes a first category and a first confidence, the first confidence being the probability that the target container is of the first category, and the second recognition result includes a second category and a second confidence, the second confidence being the probability that the target container is of the second category. The category of the target container is then determined from the first category and first confidence together with the second category and second confidence; combining the categories and confidences of the two recognition results improves the accuracy of category identification. In a specific implementation, the following cases can be distinguished:
first, determining the category of the target container as a first category when the first confidence is greater than a first threshold, the second confidence is less than a second threshold, and the first category is inconsistent with the second category.
For example, if the first category is plastic container, the second category is paper container, the first confidence is 85% (greater than the first threshold of 75%) and the second confidence is 40% (less than the second threshold of 60%), then the category of the target container may be determined to be plastic container.
Second, determining the category of the target container as the second category when the first confidence is less than the third threshold, the second confidence is greater than the fourth threshold, and the first category is inconsistent with the second category.
For example, if the first category is plastic container, the second category is paper container, the first confidence is 60% (less than the third threshold of 75%) and the second confidence is 80% (greater than the fourth threshold of 60%), the category of the target container may be determined to be paper container.
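These two cases can be summarized in a small helper; the threshold values are the example figures from the text (75% and 60%), and returning None signals the ambiguous case handled later by weighted fusion or a reminder message:

```python
def decide_by_thresholds(first_cat, first_conf, second_cat, second_conf,
                         t1=0.75, t2=0.60, t3=0.75, t4=0.60):
    """Case 1: colour result confident, depth result not -> first category.
    Case 2: depth result confident, colour result not -> second category."""
    if first_cat == second_cat:
        return first_cat
    if first_conf > t1 and second_conf < t2:
        return first_cat
    if first_conf < t3 and second_conf > t4:
        return second_cat
    return None  # ambiguous: fall through to weighted fusion or a reminder
```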
In one implementation manner of the embodiment of the present invention, the step of determining the category of the target container according to the first category and the first confidence level, and the second category and the second confidence level may be specifically implemented as follows:
for different categories of the target container, respectively acquiring weights preset for determining the category according to the color data and determining the category according to the depth data;
for any category, weighting the first confidence and the second confidence with the weights to obtain a weighted value;
and determining the category of the target container according to the weighted values.
In the scheme that fuses the confidences of the first and second recognition results, the weights preset for determining the category according to the color data and for determining the category according to the depth data are first obtained for the different categories of the target container. Because the precision of category recognition from color data and from depth data differs across container categories, corresponding weights can be preset according to the actual situation; for example, if the recognition result from color data is more accurate, it can be assigned a larger weight. Then, for any category, the first confidence and the second confidence are weighted with the preset weights to obtain a weighted value.
The weighted value characterizes the likelihood that the target container is of a certain category: the greater the weighted value, the more likely the target container is of that category. For example, for the plastic container category, suppose the confidence that the target container is a plastic container is 10% according to the color data and 90% according to the depth data, and the weights preset for determining a plastic container from color data and from depth data are 0.1 and 0.9; the weighted value is then (0.9 × 0.9 + 0.1 × 0.1) × 100% = 82%. For the paper container category, suppose the confidence that the target container is a paper container is 80% according to the color data and 20% according to the depth data, and the preset weights are 0.7 and 0.3; the weighted value is then (0.8 × 0.7 + 0.2 × 0.3) × 100% = 62%.
Accordingly, the category of the target container may be determined from the weighted values: since the 82% weighted value is greater than the 62% weighted value in the foregoing example, the target container may be determined to be a plastic container. Alternatively, a threshold may be set: if a weighted value is greater than or equal to the preset threshold, the category of the target container is determined to be the category corresponding to that weighted value. For the example above, with the threshold set to 75%, 82% is greater than 75% while 62% is less than 75%, so the target container is determined to be a plastic container.
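The worked example above can be reproduced directly; the per-category weight pairs are the illustrative values from the text, not fixed by the patent:

```python
# (weight for the colour-based confidence, weight for the depth-based confidence)
WEIGHTS = {"plastic": (0.1, 0.9), "paper": (0.7, 0.3)}  # example values from the text

def weighted_value(category, conf_color, conf_depth):
    """Weight the two confidences for one category, as in the description."""
    w_color, w_depth = WEIGHTS[category]
    return conf_color * w_color + conf_depth * w_depth

print(weighted_value("plastic", 0.10, 0.90))  # 0.1*0.1 + 0.9*0.9 = 0.82
print(weighted_value("paper",   0.80, 0.20))  # 0.8*0.7 + 0.2*0.3 = 0.62
```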
In this embodiment, the first confidence and the second confidence are weighted, and the category of the target container is determined according to the weighted values. Weighting the two confidences makes full use of the mutual compensation between the color data and the depth data, improving the precision of container identification.
In one implementation of an embodiment of the present invention, after step 108, the container identification method may further include the steps of:
if the first category is completely different from the second category, or the first confidence is smaller than the first threshold and the second confidence is smaller than the second threshold, or the weighted value calculated for every category is smaller than the preset threshold, a reminder message is output.
Each of these cases indicates that the category of the target container is unlikely to have been identified accurately, which may be caused by factors in the actual scene. To ensure that subsequent grabbing of the container is accurate, a reminder message can be output to prompt the user to handle the target container manually. The reminder may take the form of a voice prompt, a short message, a buzzer alarm, or the like.
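As a sketch of these reminder conditions (each recognition result may contain several candidate categories, as noted earlier; "completely different" is read as "no category in common", and the thresholds are illustrative assumptions):

```python
def needs_manual_handling(first_cats, first_conf, second_cats, second_conf,
                          weighted_values, t1=0.75, t2=0.60, w_min=0.75):
    """Return True when a reminder message should be output."""
    if set(first_cats).isdisjoint(second_cats):
        return True   # the two recognition results share no category at all
    if first_conf < t1 and second_conf < t2:
        return True   # both confidences are low
    if all(v < w_min for v in weighted_values.values()):
        return True   # every per-category weighted value is below the threshold
    return False
```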
When the embodiment of the invention is applied, the candidate container region containing the target container is first located in the image according to the color data and/or the depth data of each pixel, thereby locating the target container. The first recognition result, based on the color data of each pixel in the candidate container region, is then combined with the second recognition result, based on the depth data of each pixel in that region, to determine the category of the target container. Because the color-based and depth-based recognition results compensate each other when the category is determined, the obtained category of the target container is more accurate.
Based on the embodiment shown in fig. 1, fig. 2 shows a flowchart of another container identification method according to an embodiment of the present invention, which specifically includes the following steps.
Step 102, acquiring an image acquired by an image sensor, wherein the image carries color data and depth data of each pixel.
Step 104, determining a candidate container area in the image according to the color data and/or the depth data of each pixel in the image, wherein the candidate container area is an image area containing the target container.
Step 106, classifying and identifying the target container in the candidate container region according to the color data and the depth data of each pixel in the candidate container region, respectively, to obtain a first recognition result and a second recognition result.
Step 108, determining the category of the target container based on the first identification result and the second identification result.
Steps 102-108 in the embodiment of fig. 2 are the same as those in the embodiment of fig. 1, and detailed descriptions of the embodiment of fig. 1 are omitted here.
Step 110, determining a target grabbing strategy corresponding to the category according to the category of the target container.
Different types of containers need corresponding grabbing strategies; for example, a plastic container is grabbed with a suction cup, while a hollowed-out container is grabbed with a robotic arm. This ensures not only grabbing efficiency but also grabbing stability, preventing accidents and damage to the container caused by an unsuitable grabbing strategy.
Therefore, the grabbing strategies corresponding to different categories can be stored in advance, and once the category of the target container is obtained, the target grabbing strategy corresponding to that category can be looked up in the recorded correspondence. Of course, a deep-learning-based approach may also be used to determine the target grabbing strategy; this is not specifically limited herein. A grabbing strategy is not limited to the choice of grabbing device and may also cover grabbing force, grabbing angle, grabbing speed and the like.
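A minimal sketch of the pre-stored correspondence and its lookup (all category names and strategy fields here are hypothetical; the patent only requires that a strategy covering device choice, force, angle, speed, etc. be retrievable by category):

```python
# Hypothetical category -> grabbing strategy table, stored in advance.
GRAB_STRATEGIES = {
    "plastic_bin":  {"device": "suction_cup", "force": "medium", "speed": "fast"},
    "hollowed_bin": {"device": "robotic_arm", "force": "low",    "speed": "slow"},
    "carton":       {"device": "suction_cup", "force": "low",    "speed": "medium"},
}

def target_strategy(category):
    """Look up the pre-stored strategy for the recognised category; None lets
    the caller emit a reminder message when no suitable strategy exists."""
    return GRAB_STRATEGIES.get(category)
```

A deep-learning-based mapping could replace the table without changing the caller.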
Step 112, driving the target grabbing device to grab the target container according to the target grabbing strategy.
After the target grabbing strategy is determined, the target grabbing device can be driven to grab the target container. Some container access equipment is provided with several grabbing devices, for example both a robotic arm and a suction cup, in which case a suitable target grabbing device is selected according to the target grabbing strategy. Specifically, after the target grabbing strategy is determined, a driving instruction is issued to the corresponding target grabbing device according to the strategy, driving and controlling the motor of the target grabbing device so that the motor moves the device and the target container is grabbed.
In a typical scenario, however, the container access device is fitted with only one grabbing device. If the target grabbing strategy determined from the category of the target container calls for a grabbing device that is not installed on the container access device, the device can conclude that it cannot grab the target container effectively, or at least cannot guarantee a successful grab. In that case a reminder message can be output to prompt a worker to handle the target container manually; the reminder may take the form of a voice prompt, a text message, a buzzer alarm, and the like. A sketch of this fallback logic is given below.
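Continuing the illustration, here is a minimal sketch of the fallback, assuming hypothetical device names and stand-in notify/drive helpers; none of these are the patent's own API.

```python
# Illustrative fallback: if no installed gripper matches the strategy,
# output a reminder instead of attempting the grab.
CATEGORY_TO_DEVICE = {"plastic_box": "suction_cup", "hollow_box": "robot_arm"}

def notify_worker(message: str) -> None:
    print("REMINDER:", message)  # stand-in for a voice / SMS / buzzer alert

def drive_gripper(device: str) -> None:
    print("driving motor of", device)  # stand-in for the motor drive command

def grab_or_remind(category: str, installed: set) -> bool:
    """Grab with the matching device, or remind a worker if it is absent."""
    device = CATEGORY_TO_DEVICE.get(category)
    if device is None or device not in installed:
        notify_worker(f"cannot grab '{category}' automatically; please handle manually")
        return False
    drive_gripper(device)
    return True

# Example: an access device fitted only with a suction cup.
grab_or_remind("hollow_box", {"suction_cup"})  # prints a reminder, returns False
```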
When the embodiment of the invention is applied, the candidate container region containing the target container is first located in the image according to the color data and/or depth data of each pixel, thereby locating the target container; then the first recognition result, based on the color data of each pixel in the candidate container region, is combined with the second recognition result, based on the depth data of those pixels, to determine the category of the target container, and because the two results compensate for each other, the resulting category is more accurate. Further, a target grabbing strategy corresponding to the determined category is selected, one better suited to the target container, so that when the target grabbing device is driven according to this strategy, the grabbing process is more stable, accidents and container damage due to an improper grabbing strategy are prevented, the accident rate is lowered, and losses are reduced.
The container in the foregoing embodiments may be a cargo box. For ease of understanding, the container identification method provided by the embodiment of the invention is described below in an application scenario of grabbing a cargo box. In this embodiment, the container access device comprises a picking-and-placing mechanism for picking and placing containers, with an RGB-D camera mounted at the bottom of the picking-and-placing mechanism and a grabbing device capable of grabbing containers also arranged on it. The container access device further comprises a processor that executes the container identification method shown in fig. 3; fig. 3 shows a flowchart of a container grabbing method provided by one embodiment of the invention, which specifically includes the following steps.
First, the RGB-D camera collects data.
Color image data (i.e., the color data) and depth information data (i.e., the depth data) are acquired.
Second, candidate container regions are extracted.
The color image data constitute an RGB map and the depth information data constitute a depth map. Regions in which the depth data are continuous are searched for in the depth map, and regions in which the color data are continuous are searched for in the RGB map, as candidate container regions. If several candidate container regions are found, the region closest to the camera center and with the smallest average depth (i.e., the smallest average distance from the camera plane) is taken as the final candidate container region; a sketch of this step is given below.
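As an illustration only, here is a minimal sketch of this step under assumed interfaces: a depth-continuity flood fill grows a candidate region from a seed pixel, and a selection helper keeps the region nearest the image center with the smallest average depth. The seed points, the tolerance value, and the combined selection score are assumptions, not values fixed by the patent.

```python
import numpy as np
from collections import deque

def continuous_region(depth_map, seed, tol=0.01):
    """Grow a region of adjacent pixels whose depth data are continuous:
    a neighbour joins while its depth differs from the current pixel's
    by less than `tol` metres (the tolerance is an assumed value)."""
    h, w = depth_map.shape
    mask = np.zeros((h, w), dtype=bool)
    mask[seed] = True
    queue = deque([seed])
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if (0 <= ny < h and 0 <= nx < w and not mask[ny, nx]
                    and abs(depth_map[ny, nx] - depth_map[y, x]) < tol):
                mask[ny, nx] = True
                queue.append((ny, nx))
    return mask

def pick_final_candidate(regions, depth_map):
    """Keep the region nearest the image centre with the smallest average
    depth; folding the two criteria into one score is an assumption."""
    h, w = depth_map.shape
    cy, cx = h / 2.0, w / 2.0
    best, best_score = None, np.inf
    for mask in regions:
        ys, xs = np.nonzero(mask)
        centre_dist = np.hypot(ys.mean() - cy, xs.mean() - cx)
        mean_depth = depth_map[mask].mean()
        score = centre_dist + mean_depth  # simple combined criterion
        if score < best_score:
            best, best_score = mask, score
    return best
```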
Third, container identification based on visual information is performed.
The visual information consists of the depth data in the depth map and the color data in the RGB map. The candidate container region in the depth map and the one in the RGB map are each classified, and a recognition result with its confidence is output for each. Classification methods include, but are not limited to, deep learning models and machine learning models; a sketch of this two-branch step is given below.
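As a brief illustration, the two-branch classification might look as follows; the models' `predict` interface, returning a (category, confidence) pair, is an assumed convention rather than anything the patent prescribes.

```python
# Two-branch classification sketch; `rgb_model` and `depth_model` stand
# for any trained classifiers (e.g. CNNs) with an assumed interface.
def classify_candidate(rgb_crop, depth_crop, rgb_model, depth_model):
    """Classify the candidate region independently from color and depth."""
    first_category, first_conf = rgb_model.predict(rgb_crop)        # first recognition result
    second_category, second_conf = depth_model.predict(depth_crop)  # second recognition result
    return (first_category, first_conf), (second_category, second_conf)
```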
Fourth, the recognition results are fused and the container grabbing strategy is decided.
The two recognition results output in the third step are fused according to their respective confidences. As in the foregoing embodiments, the fusion may compute an average, a weighted sum, or the like, yielding a final confidence from which the accurate category of the cargo box is determined. A grabbing strategy corresponding to that category can then be selected. If the cargo box cannot be grabbed successfully, or success cannot be guaranteed, a reminder message can be sent to a worker to prompt manual handling; as in the foregoing embodiments, such situations include the fused confidence falling below a preset threshold, or the container access device lacking a grabbing device matching the grabbing strategy. A weighted-fusion sketch is given below.
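As an illustration only, here is a minimal sketch of the weighted fusion; the per-category weight presets and the example values are assumptions.

```python
# Weighted fusion sketch; the weight presets are hypothetical.
def fuse_results(first, second, weights):
    """Fuse the (category, confidence) pairs from the two branches.

    `weights[category]` holds the preset (color_weight, depth_weight)
    pair for that category; the fused category is the one with the
    highest weighted score."""
    scores = {}
    for (category, confidence), branch in ((first, 0), (second, 1)):
        weight = weights[category][branch]
        scores[category] = scores.get(category, 0.0) + weight * confidence
    return max(scores, key=scores.get)

# Example with assumed presets: color is trusted more for cartons.
weights = {"carton": (0.6, 0.4), "plastic_box": (0.5, 0.5)}
fuse_results(("carton", 0.8), ("plastic_box", 0.7), weights)  # -> "carton"
```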
The invention lets the color data and the depth data of the RGB-D camera compensate for each other, combining the two recognition results with their confidences to reach a more accurate container recognition result, and it can accommodate containers of more categories and materials. This improves the stability of container grabbing in complex scenes with diverse containers, lowers the accident rate, and reduces losses.
Corresponding to the above method embodiment, the present invention further provides a container identification device embodiment, and fig. 4 shows a schematic structural diagram of a container identification device according to one embodiment of the present invention. As shown in fig. 4, the apparatus includes:
an acquisition module 420 configured to acquire an image acquired by the image sensor, wherein the image carries color data and depth data for each pixel;
a positioning module 440 configured to determine a candidate container region in the image based on the color data and/or the depth data of each pixel in the image, wherein the candidate container region is an image region containing the target container;
the identifying module 460 is configured to respectively perform classification and identification on the target container in the candidate container area according to the color data and the depth data of each pixel in the candidate container area, so as to obtain a first identifying result and a second identifying result;
a category determination module 480 configured to determine a category of the target container based on the first recognition result and the second recognition result.
By applying this embodiment of the invention, an image acquired by the image sensor is obtained, the image carrying color data and depth data of each pixel; a candidate container region is determined in the image from the color data and/or depth data of each pixel; the target container in the candidate container region is classified according to the color data and the depth data of each pixel in that region, giving a first recognition result and a second recognition result; and the category of the target container is determined from the two results. During identification, the candidate container region containing the target container is first located, thereby locating the target container; the first recognition result, based on color data, is then combined with the second recognition result, based on depth data, so that the two compensate for each other and the resulting category of the target container is more accurate.
Optionally, the positioning module 440 includes a target region determination unit and a candidate container region determination unit;
a target area determining unit configured to determine a target area composed of a plurality of adjacent pixels whose color data is continuous, based on the color data of each pixel in the image; and/or determining a target area composed of a plurality of adjacent pixels with continuous depth data according to the depth data of each pixel in the image;
and a candidate container region determination unit configured to determine a candidate container region based on the target region.
Optionally, the candidate container region determining unit includes an acquiring subunit, a calculating subunit, and a determining subunit;
an acquisition subunit configured to acquire first center position data of the image and second center position data of each target area if the number of target areas is plural;
a calculating subunit configured to calculate, for any one of the target areas, a center distance of the target area from the image based on the second center position data and the first center position data of the target area; and/or calculating average depth data of the target area according to the depth data of each pixel in the target area;
a determining subunit configured to determine a target region having a center distance less than a first preset threshold as a part of the candidate container region; and/or determining a target region with the average depth data less than a second preset threshold as part of the candidate container region.
Optionally, the recognition module 460 is further configured to invoke the trained first recognition model, and process the color data of each pixel in the candidate container area to obtain a first recognition result; and calling the trained second recognition model, and processing the depth data of each pixel in the candidate container region to obtain a second recognition result.
Optionally, the apparatus further comprises: the first training module and the second training module;
the first training module is configured to acquire a plurality of first training samples, each comprising the color data of each pixel and carrying label information of the real container category; acquire an initial first recognition model; and train the initial first recognition model on the first training samples to obtain the trained first recognition model, wherein the input of the first recognition model is the color data of each pixel and its output is the corresponding first predicted container category;
the second training module is configured to acquire a plurality of second training samples, each comprising the depth data of each pixel and carrying label information of the real container category; acquire an initial second recognition model; and train the initial second recognition model on the second training samples to obtain the trained second recognition model, wherein the input of the second recognition model is the depth data of each pixel and its output is the corresponding second predicted container category. A minimal training sketch common to both models is given below.
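As an illustration only, the following is a minimal PyTorch-style sketch of such a training loop; the data loader, model architecture, and hyper-parameters (epochs, learning rate) are assumptions, not specified by the patent.

```python
# Minimal training sketch applicable to either recognition model.
import torch
import torch.nn as nn

def train_recognition_model(model, loader, epochs=10, lr=1e-3):
    """Train a classifier on batches of (pixel_data, true_category) pairs.

    `pixel_data` is the color data for the first model or the depth
    data for the second; labels carry the real container category."""
    optimiser = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for pixel_data, labels in loader:
            optimiser.zero_grad()
            loss = criterion(model(pixel_data), labels)
            loss.backward()
            optimiser.step()
    return model
```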
Optionally, the first recognition result includes a first category and a first confidence level, and the second recognition result includes a second category and a second confidence level;
the category determination module 480 is further configured to determine a category of the target container based on the first category and the first confidence level, and the second category and the second confidence level.
Optionally, the category determining module 480 is further configured to: for the different categories of the target container, respectively obtain the weights preset for determining the category from the color data and for determining it from the depth data; for any category, weight the first confidence and the second confidence with those weights to obtain a weighted value; and determine the category of the target container according to the weighted value.
Optionally, the apparatus further comprises: a container grabbing control module;
the container grabbing control module is configured to determine a target grabbing strategy corresponding to the category according to the category of the target container; and driving the target grabbing device to grab the target container according to the target grabbing strategy.
Optionally, the container is a cargo box.
The above is an illustrative scheme of the container identification device of this embodiment. It should be noted that the technical solution of the container identification device and that of the container identification method belong to the same concept; for details not described here, refer to the description of the container identification method.
Fig. 5 shows a block diagram of a container access device according to an embodiment of the present invention. The components of the container access device 500 include, but are not limited to, an image sensor 510, a memory 520, and a processor 530. The processor 530 is coupled to the image sensor 510 and the memory 520 via a bus 540, and a database 560 is used to hold data.
The container access device 500 also includes an access device 550, which enables the container access device 500 to communicate via one or more networks 570. Examples of such networks include the public switched telephone network (PSTN), a local area network (LAN), a wide area network (WAN), a personal area network (PAN), or a combination of communication networks such as the Internet. The access device 550 may include one or more network interfaces of any type, wired or wireless, such as a network interface card (NIC), an IEEE 802.11 wireless local area network (WLAN) interface, a worldwide interoperability for microwave access (WiMAX) interface, an Ethernet interface, a universal serial bus (USB) interface, a cellular network interface, a Bluetooth interface, a near field communication (NFC) interface, and so forth.
In one embodiment of the invention, the above-described components of the container access device 500, as well as other components not shown in fig. 5, may also be connected to each other, for example by a bus. It should be understood that the block diagram of the container access device shown in fig. 5 is for exemplary purposes only and is not intended to limit the scope of the present invention. Those skilled in the art may add or replace other components as desired.
Wherein the image sensor 510 is configured to collect an image and transmit the image to the processor 530; the processor 530 is configured to execute computer-executable instructions that, when executed by the processor, implement:
acquiring an image acquired by the image sensor 510, wherein the image carries color data and depth data of each pixel;
determining a candidate container area in the image according to the color data and/or the depth data of each pixel in the image, wherein the candidate container area is an image area containing a target container;
classifying and identifying the target container in the candidate container region according to the color data and the depth data of each pixel in the candidate container region respectively to obtain a first identification result and a second identification result;
based on the first recognition result and the second recognition result, a category of the target container is determined.
After the category of the target container is obtained, a target grabbing strategy corresponding to the category can be determined according to the category of the target container, and the target grabbing device is driven to grab the target container according to the target grabbing strategy.
The above is an illustrative scheme of the container access device of this embodiment. It should be noted that the technical solution of the container access device and that of the container identification method described above belong to the same concept; for details not described here, refer to the description of the container identification method.
An embodiment of the present invention also provides a computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the steps of the container identification method described above.
The above is an illustrative scheme of the computer-readable storage medium of this embodiment. It should be noted that the technical solution of the storage medium and that of the container identification method described above belong to the same concept; for details not described here, refer to the description of the container identification method.
The foregoing describes certain embodiments of the present invention. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
The computer instructions include computer program code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and so forth. It should be noted that the content of the computer-readable medium may be restricted as required by legislation and patent practice in a given jurisdiction; in some jurisdictions, for example, the computer-readable medium excludes electrical carrier signals and telecommunication signals.
It should be noted that, for simplicity of description, the foregoing method embodiments are described as a series of action combinations, but those skilled in the art should understand that the embodiments of the present invention are not limited by the order of actions described, as some steps may be performed in another order or simultaneously. Further, those skilled in the art will appreciate that the embodiments described in the specification are preferred embodiments, and the actions and modules involved are not necessarily all required by the embodiments of the invention.
In the foregoing embodiments, the descriptions of the embodiments are emphasized, and for parts of one embodiment that are not described in detail, reference may be made to the related descriptions of other embodiments.
The preferred embodiments of the invention disclosed above are intended only to help explain the invention; they are not exhaustive, nor do they limit the invention to the precise forms disclosed. Obviously, many modifications and variations are possible in light of the above teachings. The embodiments were chosen and described in order to best explain the principles of the invention and its practical application, thereby enabling others skilled in the art to understand and use the invention. The invention is limited only by the claims and their full scope and equivalents.

Claims (12)

1. A method of identifying a container, comprising:
acquiring an image acquired by an image sensor, wherein the image carries color data and depth data of each pixel;
determining a candidate container area in the image according to the color data and/or the depth data of each pixel in the image, wherein the candidate container area is an image area containing a target container;
classifying and identifying the target container in the candidate container area according to the color data and the depth data of each pixel in the candidate container area to obtain a first identification result and a second identification result;
based on the first recognition result and the second recognition result, a category of the target container is determined.
2. The method of claim 1, wherein the step of determining candidate container regions in the image from color data and/or depth data for each pixel in the image comprises:
determining a target area formed by a plurality of adjacent pixels with continuous color data according to the color data of each pixel in the image; and/or determining a target area composed of a plurality of adjacent pixels with continuous depth data according to the depth data of each pixel in the image;
determining a candidate container region based on the target area.
3. The method of claim 2, wherein the step of determining candidate container regions based on the target region comprises:
if the number of the target areas is a plurality of, acquiring first center position data of the image and second center position data of each target area;
calculating the center distance between any target area and the image according to the second center position data and the first center position data of the target area; and/or calculating average depth data of the target area according to the depth data of each pixel in the target area;
determining a target region having the center distance less than a first preset threshold as part of the candidate container region; and/or determining a target region for which the average depth data is less than a second preset threshold as part of the candidate container region.
4. The method according to claim 1, wherein the step of classifying and identifying the target container in the candidate container region according to the color data and the depth data of each pixel in the candidate container region to obtain a first identification result and a second identification result includes:
invoking a trained first recognition model, and processing the color data of each pixel in the candidate container region to obtain the first recognition result;
and calling a trained second recognition model, and processing the depth data of each pixel in the candidate container region to obtain the second recognition result.
5. The method of claim 4, wherein the training method of the first recognition model comprises:
acquiring a plurality of first training samples, wherein the first training samples comprise color data of each pixel and carry label information of the real type of the container;
acquiring an initial first identification model;
training the initial first recognition model according to the plurality of first training samples to obtain a trained first recognition model, wherein the input of the first recognition model is the color data of each pixel and the output of the first recognition model is the corresponding first predicted container category;
the training method of the second recognition model comprises the following steps:
acquiring a plurality of second training samples, wherein the second training samples comprise depth data of each pixel and carry tag information of the real class of the container;
acquiring an initial second recognition model;
and training the initial second recognition model according to the plurality of second training samples to obtain a trained second recognition model, wherein the input of the second recognition model is the depth data of each pixel and the output of the second recognition model is the corresponding second predicted container category.
6. The method of any one of claims 1-5, wherein the first recognition result comprises a first category and a first confidence level and the second recognition result comprises a second category and a second confidence level;
the step of determining the category of the target container based on the first recognition result and the second recognition result includes:
determining the category of the target container according to the first category and the first confidence and the second category and the second confidence.
7. The method of claim 6, wherein the step of determining the category of the target container based on the first category and the first confidence level, and the second category and the second confidence level, comprises:
for different categories of the target container, respectively acquiring weights preset for determining the category according to the color data and determining the category according to the depth data;
for any category, weighting the first confidence level and the second confidence level by using the weights to obtain a weighted value;
and determining the category of the target container according to the weighted value.
8. The method of any of claims 1-5, further comprising, after the step of determining the category of the target container based on the first recognition result and the second recognition result:
determining a target grabbing strategy corresponding to the category according to the category of the target container;
and driving a target grabbing device to grab the target container according to the target grabbing strategy.
9. The method of any one of claims 1-5, wherein the container is a cargo box.
10. A container identification device, comprising:
the acquisition module is configured to acquire an image acquired by the image sensor, wherein the image carries color data and depth data of each pixel;
a positioning module configured to determine a candidate container region in the image according to color data and/or depth data of each pixel in the image, wherein the candidate container region is an image region containing a target container;
the identification module is configured to respectively carry out classification identification on the target container in the candidate container area according to the color data and the depth data of each pixel in the candidate container area to obtain a first identification result and a second identification result;
a category determination module configured to determine a category of the target container based on the first recognition result and the second recognition result.
11. A container access apparatus comprising: an image sensor, a memory, and a processor;
the image sensor is used for acquiring an image and transmitting the image to the processor;
the memory is configured to store computer-executable instructions that, when executed by the processor, implement the container identification method of any one of claims 1 to 9.
12. A computer readable storage medium storing computer executable instructions which when executed by a processor implement the container identification method of any one of claims 1 to 9.
CN202111249636.6A 2021-10-15 2021-10-26 Container identification method, device, container access equipment and storage medium Pending CN116051972A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202111249636.6A CN116051972A (en) 2021-10-26 2021-10-26 Container identification method, device, container access equipment and storage medium
PCT/CN2022/127120 WO2023061506A1 (en) 2021-10-15 2022-10-24 Container identification method and apparatus, container access device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111249636.6A CN116051972A (en) 2021-10-26 2021-10-26 Container identification method, device, container access equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116051972A true CN116051972A (en) 2023-05-02

Family

ID=86129970

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111249636.6A Pending CN116051972A (en) 2021-10-15 2021-10-26 Container identification method, device, container access equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116051972A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination