WO2023083273A1 - Grip point information acquisition method and apparatus, electronic device, and storage medium


Info

Publication number: WO2023083273A1
Application number: PCT/CN2022/131217
Authority: WIPO (PCT)
Prior art keywords: information, point, item, mask, grabbed
Other languages: French (fr), Chinese (zh)
Inventors: 李辉, 司林林, 丁有爽, 邵天兰
Original assignee: 梅卡曼德(北京)机器人科技有限公司
Application filed by 梅卡曼德(北京)机器人科技有限公司
Priority claimed from Chinese patent applications CN202111329097.7, CN202111329083.5, CN202111338191.9, and CN202111329085.4, all filed on November 10, 2021

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/70: Determining position or orientation of objects or cameras
    • G06T7/73: Determining position or orientation of objects or cameras using feature-based methods

Definitions

  • The embodiments of the present application relate to the field of automatic control and program control of robotic arms and grippers, and in particular to a grip point information acquisition method, apparatus, electronic device, and storage medium.
  • The automatic control of a robotic arm or gripper needs to be based on point cloud data of the object to be grasped.
  • The usual practice is to first use a 3D camera to collect point cloud data, then determine the trajectory points or grasping point of the robotic arm from the point cloud data, and finally, based on the trajectory points, grasping point, and other information, control the robotic arm to perform the grasping operation at the appropriate position along the appropriate trajectory.
  • The robotic arm operates in three-dimensional space, and its motion trajectory extends in the three directions of length, width, and height; the trajectory points or grasping point should therefore carry three-dimensional information on the X-axis, Y-axis, and Z-axis.
  • The point cloud data collected by the camera includes information in these three dimensions.
  • However, in some industrial scenarios it is difficult to obtain clear point cloud information for the objects to be grasped, for example in the cosmetics industry, where glass bottles, especially black glass bottles, are used to hold liquids.
  • Such glass bottles return a weak light signal, and their reflective surfaces are easily disturbed by multiple reflections from surrounding objects.
  • Moreover, the glass is transparent, producing a large amount of diffuse-reflection and multiple-reflection data, which makes it difficult to collect usable point cloud data.
  • The present application is proposed in order to overcome the above problems, or at least partially solve them. Specifically, first, when the point cloud of the object to be grasped cannot be obtained, the present application can obtain the grasping point information of the object with the help of other image data, such as color pictures, so that the robot or gripper can grasp the object directly from the grasping point information without relying on a point cloud of the object, which effectively solves the problem of grasping items when point clouds are missing. Second, the present application proposes methods for correcting the mask and obtaining grasping point association information, so that when an accurate item mask cannot be obtained, the inaccurate mask can be corrected to recover a mask as accurate as possible and, from it, accurate grasping point information; this effectively avoids inaccurate grasping points caused by an inaccurate mask, which would otherwise lead to imprecise grasps or dropped items. Third, the present application proposes a method by which the robot automatically converts the input grasping point association information into grasping point information: based on the environmental characteristics of the object to be grasped, it automatically acquires reference information that enables the conversion and obtains the grasping point information from that reference information for the robot to grasp with; this makes it possible to reconstruct complete grasping point information from the available information when part of it is missing, and reduces manual intervention.
  • The present application provides a grasping point information acquisition method, apparatus, electronic device, and storage medium.
  • Based on the grasping point association information, the grasping point information used to control the robot to grasp the object is obtained, where the grasping point association information is parameter information used to determine the grasping point information, and the grasping point information represents the parameter information the robot needs to grasp the object.
  • An image data processing method for mask correction and acquisition of grasping point association information includes:
  • The mask of the object to be grasped is further processed to obtain a correction mask of the object, and the grasping point association information is obtained based on the correction mask.
  • A method for obtaining a correction mask of an object and obtaining grasping point association information includes:
  • The correction mask of the object to be grasped is obtained based on the inscribed circle, and the grasping point association information is obtained based on the correction mask.
  • Another method for obtaining a correction mask of an object and obtaining grasping point association information includes:
  • The correction mask of the object to be grasped is obtained based on the processing result of the circle detection algorithm, and the grasping point association information is obtained based on the correction mask.
  • Yet another method for obtaining a correction mask of an object and obtaining grasping point association information includes:
  • A template matching algorithm is used to process the mask of the object to be grasped;
  • The correction mask of the object to be grasped is obtained based on the processing result of the template matching algorithm, and the grasping point association information is obtained based on the correction mask.
  • A method for converting grasping point association information into grasping point information includes:
  • The grasping point information of the object to be grasped is generated.
  • An image acquisition module, configured to acquire a non-point cloud image containing the object to be grasped;
  • A mask generation module, configured to process the non-point cloud image to obtain the mask of the object to be grasped;
  • A mask processing module, configured to process the mask of the object to be grasped to obtain grasping point association information;
  • A grasping point information generation module, configured to obtain, based on the grasping point association information, the grasping point information used to control the robot to grasp the object, where the grasping point association information is parameter information for determining the grasping point information, and the grasping point information indicates the parameter information the robot needs to grasp the object.
  • The electronic device includes a memory, a processor, and a computer program stored in the memory and runnable on the processor.
  • When the processor executes the computer program, the method for obtaining grasping point information in any of the above embodiments is implemented.
  • The computer-readable storage medium in the embodiments of the present application stores a computer program, and when the computer program is executed by a processor, the method for acquiring grasping point information in any of the above embodiments is implemented.
  • FIG. 1 is a schematic flowchart of a method for acquiring grasping point information in a scene with a poor point cloud, in some embodiments of the present application;
  • FIG. 2 is a schematic flowchart of a method for obtaining a correction mask and grasping point association information in some embodiments of the present application;
  • FIG. 3 is a schematic flowchart of a method for converting grasping point association information into grasping point information in some embodiments of the present application;
  • FIG. 4 is a schematic diagram of an object to be grasped and of the object mask obtained with a non-dedicated deep learning network in some embodiments of the present application;
  • FIG. 5 is a schematic structural diagram of a grasping point information acquisition apparatus in some embodiments of the present application;
  • FIG. 6 is a schematic structural diagram of an electronic device according to some embodiments of the present application.
  • FIG. 1 shows a schematic flowchart of a method for acquiring grasping point information of an object according to an embodiment of the present application. As shown in FIG. 1, the method includes:
  • Step S100: acquiring a non-point cloud image containing the object to be grasped;
  • Step S110: processing the non-point cloud image to obtain a mask of the object to be grasped;
  • Step S120: processing the mask of the object to be grasped to obtain grasping point association information, where the grasping point association information is parameter information for determining the grasping point information;
  • Step S130: based on the grasping point association information, acquiring the grasping point information used to control the robot to grasp the object.
  • Regarding step S100: in an industrial scene, a robot usually needs to grasp a large number of items, so there may be multiple objects to be grasped rather than one.
  • For example, the objects to be grasped may be a group of black cosmetic bottles placed in a material frame, from which the robot must pick the bottles and transport them to other locations.
  • Because black glass itself returns a weak light signal and is easily disturbed by multiple reflections from surrounding objects, producing a large amount of diffuse-reflection and multiple-reflection data, shooting with an industrial camera is likely to fail to produce a point cloud, making it difficult to collect the 3D point cloud data required for control. The method of the present application is therefore particularly suitable for such a scenario: it identifies the objects to be grasped from image data other than point cloud data, calculates the grasping point information, and controls the robot to perform the grasp.
  • The non-point cloud image data in this embodiment may preferably be a color picture of the object, such as a 2D color picture. Unlike a point cloud, a 2D color image can clearly show the items it contains; even an item that cannot generate a point cloud, such as a black transparent glass bottle, can be clearly photographed and identified. The image can be taken by an industrial camera, with the group of objects to be grasped placed under the vision sensor to obtain the image data.
  • Regarding step S110: any existing method can be used to calculate the mask of the object.
  • For example, the mask of the object to be grasped can be generated by a deep learning network.
  • Image recognition is a conventional application of deep learning networks, and general-purpose deep learning networks capable of image recognition already exist in the prior art.
  • An existing, pre-built deep learning network for identifying items or extracting item masks can therefore be used: the non-point cloud image is input into the network for processing to obtain the mask.
  • The network can be trained in advance.
  • The acquired 2D color image is input into the deep learning network, which processes the color image, identifies the region of interest in the image, generates an image mask for that region, and uses the mask to cover the corresponding area of the original image.
  • In this way, the mask of the object to be grasped can be obtained from the color image, for example as in the sketch below.
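As a hedged illustration of step S110, the sketch below feeds the 2D color image to an off-the-shelf instance segmentation network. The patent does not prescribe a particular network, so torchvision's pre-trained Mask R-CNN and the 0.5 thresholds are assumptions for illustration only.

```python
# Sketch of step S110, assuming torchvision's pre-trained Mask R-CNN stands in
# for the "general deep learning network"; the patent does not mandate this
# model or these thresholds.
import cv2
import numpy as np
import torch
from torchvision.models.detection import maskrcnn_resnet50_fpn

model = maskrcnn_resnet50_fpn(pretrained=True).eval()  # weights argument name varies by torchvision version

def get_item_mask(bgr_image: np.ndarray, score_thresh: float = 0.5) -> np.ndarray:
    """Return a binary mask (uint8, 0/255) of the highest-scoring detected item."""
    rgb = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2RGB)
    tensor = torch.from_numpy(rgb).permute(2, 0, 1).float() / 255.0
    with torch.no_grad():
        pred = model([tensor])[0]
    keep = pred["scores"] > score_thresh
    if not keep.any():
        raise ValueError("no item detected in the image")
    soft = pred["masks"][keep][0, 0].numpy()     # soft mask of the best detection
    return (soft > 0.5).astype(np.uint8) * 255   # binary mask covering the item region
```

As the text notes, such a general network yields a usable but often imperfect mask, which motivates the correction methods that follow.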
  • The grasping point association information of the object may then be calculated within the mask area.
  • The grasping point association information is information that, compared with the grasping point information, lacks some components and is by itself insufficient for the robot to grasp, or information that cannot be used by the robot directly but can be used to calculate the grasping point information.
  • The specific position of the grasping point depends on the gripper used and on the object to be grasped; for example, the grasping point may be the center point of the graspable area of the object.
  • In the cosmetic bottle example, the center point of the bottle mouth can be selected as the grasping point.
  • A non-customized, general-purpose deep learning network can be used to identify items in a wide range of situations.
  • When such a network identifies the items in a photo and extracts their masks, the generated masks usually do not match the items exactly: a mask may sometimes extend slightly beyond the object's graspable area, sometimes fall slightly inside it, and its shape often does not coincide with the graspable area.
  • FIG. 4 shows an object to be grasped and a typical mask generated for it by such a deep learning network.
  • FIG. 4 depicts the scene in which the object to be grasped is the above-mentioned cosmetic bottle; the marked part is the bottle-mouth area of the glass bottle, that is, the grasping area for the robotic arm.
  • The shaded part is the mask extracted by the general deep learning network. It can be seen that in this case the mask is not accurate, so the grasping point calculated from it is naturally inaccurate as well. An inaccurate grasping point may make it impossible to pick up the bottle, or cause the bottle to be dropped during grasping.
  • One solution is to design a deep learning network dedicated to this scenario and train it repeatedly to improve the accuracy of its output, so that an accurate mask, and hence accurate grasping points, can be obtained once an image is input into the network.
  • Another solution is to process the inaccurate mask and obtain the grasping point information from the processed mask.
  • The present application preferably adopts the latter solution. Specifically, it provides a lower-cost and more general method for mask correction and grasping point acquisition, which is one of the key points of this application.
  • FIG. 2 shows a schematic flowchart of an image data processing method for mask correction and acquisition of grasping point association information according to an embodiment of the present application. As shown in FIG. 2, the method includes:
  • Step S200: performing morphological processing on the mask of the object to be grasped;
  • Step S210: further processing the mask of the object to be grasped to obtain a correction mask of the object, and obtaining the grasping point association information based on the correction mask.
  • As noted above, the object mask obtained by a general deep learning network usually cannot fit the object outline perfectly (the grasping area in FIG. 4 is the area of the bottle mouth, and it is easy to see a large gap between the mask and the actual bottle mouth): the mask may be skewed and may contain many holes. This matters little in general object recognition applications, but the present application targets industrial object-grasping scenes with high precision requirements, where such errors are intolerable. The obtained mask therefore needs to be processed.
  • The first step is morphological processing, which changes the graphic form of the mask area.
  • The morphological processing may be dilation.
  • Dilating the image fills in defects such as missing and irregular regions. For example, for each pixel in the mask, a certain number of surrounding points, say 8 to 25, can be set to the same color as that pixel. This amounts to filling in the neighborhood of every pixel, so any missing parts inside the object mask are filled in. After this processing the mask becomes complete, with no gaps, while also becoming slightly "fatter" due to the expansion; moderate dilation helps the subsequent image processing operations.
  • The morphological processing can also be an opening operation or a closing operation.
  • Both the opening and the closing operations combine dilation with erosion; the erosion removes the over-filled parts, which further improves the graphic accuracy of the mask area, as in the sketch below.
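A minimal sketch of the morphological processing of step S200, using OpenCV; the 5x5 elliptical kernel and single iteration are tuning assumptions, not values from the patent.

```python
# Sketch of step S200: dilation, or an opening/closing operation, applied to
# the binary item mask. Kernel size is an assumption to be tuned per scene.
import cv2
import numpy as np

def morph_clean(mask: np.ndarray, mode: str = "dilate") -> np.ndarray:
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    if mode == "dilate":   # fills small holes and defects; mask grows slightly
        return cv2.dilate(mask, kernel, iterations=1)
    if mode == "open":     # erode then dilate: removes small specks
        return cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    if mode == "close":    # dilate then erode: fills holes without growing the mask
        return cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
    raise ValueError(f"unknown mode: {mode}")
```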
  • Regarding step S210: the present application discloses three processing methods for obtaining the correction mask of the object and the grasping point association information.
  • The first processing method includes:
  • Step S220: obtaining the circumscribed rectangle of the mask of the object to be grasped;
  • Step S221: generating the inscribed circle of the circumscribed rectangle, based on the circumscribed rectangle of the mask;
  • Step S222: obtaining the correction mask of the object to be grasped based on the inscribed circle, and obtaining the grasping point association information based on the correction mask.
  • Any circumscribed rectangle algorithm may be used to obtain the circumscribed rectangle of the mask.
  • For example, the four corner points of the circumscribed rectangle can be generated from the mask of the object, and the rectangle then generated from those corner points, as shown in the sketch after this paragraph. Specifically, the X and Y coordinate values of each pixel in the mask are computed, and the smallest X value, smallest Y value, largest X value, and largest Y value are selected. These four values are then combined into point coordinates: the minimum X and minimum Y values form (Xmin, Ymin), the maximum X and maximum Y values form (Xmax, Ymax), the minimum X and maximum Y values form (Xmin, Ymax), and the maximum X and minimum Y values form (Xmax, Ymin). Taking the points (Xmin, Ymin), (Xmax, Ymax), (Xmin, Ymax), and (Xmax, Ymin) as the four corners, the circumscribed rectangle is generated.
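The corner-point construction just described translates almost directly into code; the sketch below is that transcription (the function name is an assumption, not the patent's).

```python
# Sketch of step S220: the min/max pixel coordinates of the mask give the
# axis-aligned circumscribed rectangle (Xmin, Ymin)-(Xmax, Ymax).
import numpy as np

def circumscribed_rectangle(mask: np.ndarray):
    ys, xs = np.nonzero(mask)            # coordinates of all pixels inside the mask
    x_min, x_max = xs.min(), xs.max()
    y_min, y_max = ys.min(), ys.max()
    # Four corners as described in the text above.
    corners = [(x_min, y_min), (x_max, y_max), (x_min, y_max), (x_max, y_min)]
    return corners
```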
  • Regarding step S221: the key point of this application is to use an inscribed circle algorithm as part of calculating the correction mask and grasping point information, without improving the algorithm itself. The specific inscribed circle algorithm is therefore not limited: any inscribed circle algorithm can be used in this application, as long as it yields the inscribed circle of the above circumscribed rectangle.
  • The part of the mask enclosed by the inscribed circle can then be computed, and this part is used as the correction mask of the object to be grasped.
  • The contour of the resulting correction mask has the same shape and size as the inscribed circle; the position of the circle center is calculated on this correction mask, and the circle-center information is used as the grasping point association information.
  • For example, the two-dimensional position of the circle center, i.e. its X-axis and Y-axis values, may be used as the X-axis and Y-axis position information of the grasping point.
  • The correction mask obtained in this way has a shape consistent with the shape of the bottle mouth, as in the sketch below.
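A sketch of steps S221 and S222, under the assumption that the inscribed circle of the axis-aligned circumscribed rectangle is the circle centered at the rectangle's center with radius half its shorter side; the patent leaves the choice of inscribed circle algorithm open.

```python
# Sketch of steps S221-S222: clip the mask to the inscribed circle of its
# circumscribed rectangle; the circle center gives the 2D grab-point
# association information (X and Y of the grasping point).
import cv2
import numpy as np

def inscribed_circle_correction(mask: np.ndarray):
    ys, xs = np.nonzero(mask)
    x_min, x_max = xs.min(), xs.max()
    y_min, y_max = ys.min(), ys.max()
    cx, cy = (x_min + x_max) // 2, (y_min + y_max) // 2   # rectangle center
    r = min(x_max - x_min, y_max - y_min) // 2            # half the shorter side
    circle = np.zeros_like(mask)
    cv2.circle(circle, (int(cx), int(cy)), int(r), 255, thickness=-1)
    corrected = cv2.bitwise_and(mask, circle)   # mask part enclosed by the circle
    return corrected, (int(cx), int(cy))        # correction mask, circle center
```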
  • The second processing method includes:
  • Step S230: using a circle detection algorithm to process the mask of the object to be grasped;
  • Step S231: obtaining the correction mask of the object based on the processing result of the circle detection algorithm, and obtaining the grasping point association information based on the correction mask.
  • A circle detection algorithm, also called a circle-finding algorithm, can detect circular features in irregular graphics and find the circles contained in them.
  • Commonly used algorithms include the circle Hough transform, the random Hough transform, and random circle detection.
  • The focus of this embodiment is to use a circle detection algorithm to find circles in the morphologically processed mask; which circle detection algorithm to use is not limited. Since the bottle mouth itself is circular and the collected mask contains some features of its shape, the circle found in the mask area roughly marks the position of the bottle mouth.
  • Regarding step S231: after the circle in the mask is found by the circle detection algorithm, the part of the mask enclosed by the circle can be used as the correction mask of the object. The circle center is then calculated, and the circle-center information is used as the grasping point association information. As in the first method, this may be the two-dimensional position of the circle center, used as the X-axis and Y-axis position information of the grasping point.
  • Compared with the first method, the second method does not need to construct a circumscribed rectangle and inscribed circle that do not exist in the image; it only needs to find the circular part of the existing mask area, so its accuracy is higher.
  • The first and second methods are well suited to industrial scenes in which the area to be grasped is circular, and can significantly improve the accuracy of the resulting grasping point association information. A sketch of the second method follows.
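A sketch of the second method (steps S230 and S231) using OpenCV's circle Hough transform; all parameter values below are assumptions to be tuned per scene, not values from the patent.

```python
# Sketch of steps S230-S231: find the strongest circle in the processed mask,
# clip the mask to it, and take the circle center as the 2D grab point.
import cv2
import numpy as np

def hough_circle_correction(mask: np.ndarray):
    circles = cv2.HoughCircles(mask, cv2.HOUGH_GRADIENT, dp=1.2, minDist=20,
                               param1=100, param2=20, minRadius=5, maxRadius=0)
    if circles is None:
        return None                              # no circle found in the mask
    cx, cy, r = circles[0, 0]                    # strongest detected circle
    region = np.zeros_like(mask)
    cv2.circle(region, (int(cx), int(cy)), int(r), 255, thickness=-1)
    corrected = cv2.bitwise_and(mask, region)    # mask part enclosed by the circle
    return corrected, (float(cx), float(cy))     # circle center as grab point X/Y
```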
  • The third processing method includes:
  • Step S240: based on a pre-saved template of the object to be grasped, using a template matching algorithm to process the mask of the object;
  • Step S241: obtaining the correction mask of the object based on the processing result of the template matching algorithm, and obtaining the grasping point association information based on the correction mask.
  • The template of the object to be grasped can be a template of the whole object or a template of its grasping area.
  • For example, when the object to be grasped is a black glass cosmetic bottle and the gripper grasps it by the mouth, the template can be a three-dimensional template of the entire bottle, or a template of only the graspable bottle-mouth area.
  • A matching algorithm is then used to match the template within the mask area.
  • The template is equivalent to a known small image, and template matching is equivalent to searching for a target in a larger image that contains that small image: it is known that the target exists in the image and that it has the same size, orientation, and image elements as the template, so the template matching algorithm can find the target, that is, the small image, and determine its pose.
  • This embodiment does not limit the specific matching algorithm. Since the mask itself carries no color information, the emphasis is on shape matching rather than color matching, so this application preferably uses a shape-based matching algorithm.
  • When performing template matching, the match can be considered successful when the shape similarity reaches 70-95%.
  • Which value to use can be selected and adjusted according to the needs of the actual application scenario; those skilled in the art may also set a specific value or range for the shape similarity according to the required matching accuracy.
  • Regarding step S241: after the shape matching the pre-stored template is found in the mask, the part of the mask enclosed by that shape can be used as the correction mask, from which the grasping point association information is further calculated.
  • Because a template is used, the object can be matched and grasped whatever the shape of its grasping area; the method is not limited to scenes in which the grasping area is circular. Correspondingly, the positions of the grasping points differ between different objects.
  • For example, the grasping point may be the center point of the correction mask, and the grasping point association information may be the two-dimensional position of that center point, i.e. the X-axis and Y-axis position information of the grasping point.
  • The third method achieves a high standard in both accuracy and computation speed, and can be used for grasping any object. A sketch follows.
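A sketch of the third method (steps S240 and S241). OpenCV's matchTemplate is used here as a stand-in for the shape-based matcher named above (it matches intensity patterns rather than contour shape), and the 0.8 threshold is one point inside the 70-95% range given earlier; both are assumptions.

```python
# Sketch of steps S240-S241: locate the pre-saved template in the mask,
# keep the matched region as the correction mask, and take its center
# point as the 2D grab-point association information.
import cv2
import numpy as np

def template_match_correction(mask: np.ndarray, template: np.ndarray,
                              threshold: float = 0.8):
    result = cv2.matchTemplate(mask, template, cv2.TM_CCOEFF_NORMED)
    _, max_val, _, max_loc = cv2.minMaxLoc(result)
    if max_val < threshold:                  # similarity below the chosen threshold
        return None
    th, tw = template.shape[:2]
    x, y = max_loc                           # top-left corner of the matched region
    corrected = np.zeros_like(mask)
    corrected[y:y + th, x:x + tw] = mask[y:y + th, x:x + tw]
    center = (x + tw / 2.0, y + th / 2.0)    # center point as grab point X/Y
    return corrected, center
```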
  • The grasping point association information may be incomplete compared with the grasping point information, in which case the robot cannot use it directly to perform grasping.
  • In one embodiment, the dimension information missing from the grasping point association information is preset, and the grasping point information used to control the robot to grasp the object is then obtained from the grasping point association information together with that preset missing dimension information.
  • For example, the grasping point association information can be two-dimensional information about the grasping point (such as X-axis and Y-axis information), while the grasping point information required by the robot is three-dimensional or higher-dimensional: three-dimensional information that also includes Z-axis information, or four-dimensional information that additionally includes a rotation angle or clamping depth. Alternatively, the grasping point association information can be three-dimensional, with the required grasping point information being four-dimensional or higher-dimensional.
  • To convert the grasping point association information into grasping point information, the information missing from the association information can be preset.
  • For example, when the grasping point association information is two-dimensional (or three-dimensional) information, the missing third-dimension (or fourth-dimension) information is preset through prior detection or manual input.
  • For instance, the specific height information can be entered manually in advance; or, when the object to be grasped is a fragile product placed at a specific angle, its corresponding grasping angle can be entered manually.
  • Based on the preset information, the grasping point association information can then be converted into grasping point information for the robot to use.
  • This approach requires the missing dimension information to be input manually before grasping.
  • When the items to be grasped have a uniform structure and placement, manually entering the uniform dimension information once can significantly improve processing efficiency. A minimal sketch follows.
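A minimal sketch of this preset-based conversion; the preset Z value below is a made-up example for uniformly placed items, not a value from the patent.

```python
# Sketch: complete the 2D grab-point association information (X, Y) with a
# preset, manually entered missing dimension (here the Z height).
PRESET_Z = 0.35  # meters; illustrative value entered in advance for uniform items

def complete_grab_point(xy):
    x, y = xy
    return (x, y, PRESET_Z)   # 3D grab point usable by the robot controller
```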
  • The present application also proposes a method for converting grasping point association information into grasping point information that requires no manual intervention and is applicable when the structure and placement of the items are not uniform.
  • FIG. 3 shows a schematic flowchart of a method for converting grasping point association information into grasping point information according to an embodiment of the present application. As shown in FIG. 3, the method includes:
  • Step S300: acquiring reference object information for the object to be grasped;
  • Step S310: processing the reference object information to obtain reference information for the object to be grasped, where the reference information is information determined from the reference object information;
  • Step S320: generating the grasping point information of the object based on its reference information and the grasping point association information.
  • In other words, an item with a qualified point cloud can serve as a reference object to supply the information missing from the grasping point information of the object to be grasped.
  • For example, an item whose point cloud allows its Z-axis information to be recognized can be used as a reference object.
  • The grasping point association information may be the two-dimensional information of the grasping point; for example, it may include the X-axis and Y-axis information of the grasping point.
  • The corresponding grasping point information may be the three-dimensional information of the grasping point, or four-dimensional or higher-dimensional information.
  • Since the control system needs the X-axis, Y-axis, and Z-axis information of the grasping point to control the position of the robot's grasp, it cannot perform grasping from the two-dimensional information alone; the two-dimensional grasping point information must be converted into three-dimensional information in a subsequent step.
  • The information obtainable from a non-point cloud image is usually two-dimensional (rather than one-dimensional). By combining the grasping point association information with the missing height dimension, the corresponding three-dimensional grasping point information, that is, the grasping point information, is obtained.
  • The reference object may include other items to be grasped and/or the material frame.
  • The reference object may be an item close to the object to be grasped, or other similar items placed together with it.
  • The point cloud of the material frame is usually complete, and as long as the frame is not strongly deformed, its height is consistent at every position; that is, at each position the height of the material frame equals the height of the object to be grasped or differs from it by a fixed offset, which makes the frame suitable as a reference object. Therefore, when the point cloud of the object to be grasped cannot be obtained in this scenario, the 2D color image data of the object and the point cloud data of the recognized frame can be collected for the subsequent steps.
  • The reference object should have a qualified point cloud.
  • In the overall point cloud data captured from a given position, the point cloud quality of the individual objects differs: suitable point cloud data may be unobtainable for some objects to be grasped yet obtainable for others. In that case, an object with a better point cloud can be selected as the reference for the other objects.
  • The point cloud information can be obtained with a 3D industrial camera.
  • A 3D industrial camera is generally equipped with two lenses, which capture the group of objects from different angles; after processing, a three-dimensional image of the objects can be produced. The group of items is placed under the vision sensor, and both lenses shoot at the same time.
  • Using the relative attitude parameters of the two captured images, a general binocular stereo vision algorithm calculates the X, Y, and Z coordinate values and the orientation of each point, which are converted into the point cloud data of the group of objects to be grasped (the standard depth-from-disparity relation is sketched below).
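For background only: with a calibrated binocular rig, depth follows the standard stereo relation Z = f * B / d (focal length f, baseline B, disparity d). The values below are illustrative assumptions and do not come from the patent.

```python
# Background sketch of binocular depth recovery, not part of the patented method.
def depth_from_disparity(d_pixels: float, f_pixels: float = 1200.0,
                         baseline_m: float = 0.06) -> float:
    """Depth in meters for a pixel with disparity d_pixels (assumed rig parameters)."""
    return f_pixels * baseline_m / d_pixels
```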
  • Components such as laser detectors, visible light detectors such as LEDs, infrared detectors, and radar detectors may also be used to generate point clouds; this application does not limit the specific implementation.
  • Alternatively, a two-dimensional color image of the three-dimensional object area, together with the depth image corresponding to it, may be acquired along a depth direction perpendicular to the object.
  • The two-dimensional color map corresponds to the image of the plane area perpendicular to the preset depth direction; each pixel in the corresponding depth map maps to a pixel in the color map, and the value of each depth pixel is the depth value at that pixel.
  • The obtained reference object information may thus be a point cloud of the reference object or a depth map of the reference object.
  • The reference information is information determined from the reference object information of the object to be grasped. Take an industrial scene in which multiple black glass cosmetic bottles to be grasped are arranged in a material frame. After the overall point cloud of the item group is obtained, it can be further analyzed to identify a clearer reference: for example, the point cloud of the material frame, or the point clouds of those bottle mouths that are clearer, can be identified within the overall point cloud. The identified point cloud is then processed to extract its height information, or Z-axis information, as the reference information. Similarly, depth information (corresponding to Z-axis information) can be extracted from a depth map of the reference object as reference information. Although this embodiment takes height as the information missing from the two-dimensional grasping point information, those skilled in the art will understand that when the missing information is not height, the reference information need not be height either. A sketch follows.
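A sketch of step S310, under the assumption that the reference height is reduced to a single robust statistic (the median) over the reference object's Z values or its region in a depth map; the patent does not fix this choice.

```python
# Sketch of step S310: extract a reference height (Z) from a reference object
# with a usable point cloud, or from its region in a depth map.
import numpy as np

def reference_height_from_points(reference_points: np.ndarray) -> float:
    """reference_points: (N, 3) array of X/Y/Z points of the reference object."""
    return float(np.median(reference_points[:, 2]))

def reference_height_from_depth(depth_map: np.ndarray,
                                region_mask: np.ndarray) -> float:
    vals = depth_map[region_mask > 0]
    return float(np.median(vals[vals > 0]))   # ignore invalid zero-depth pixels
```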
  • Concretely, the grasping point information of the object can be generated as follows: a reference information adjustment value is preset, the reference information is adjusted using this value, and the grasping point information of the object is then generated from the adjusted reference information and the grasping point association information.
  • The reference information adjustment value may be set according to the inherent difference between the reference information and the information missing from the grasping point association information.
  • For example, the inherent difference between the reference information and the missing information is the height difference between the point cloud of the reference object and the point cloud of the object to be grasped.
  • When bottle mouths with clear point clouds are used as the reference, the height of the reference point cloud is the same as the height of all the bottle mouths; in that case no adjustment value needs to be set, or the adjustment value is 0.
  • After the bottle-mouth height information is obtained, it is combined with the two-dimensional grasping point information to produce three-dimensional grasping point information, based on which the gripper can perform the grasp.
  • When the material frame is used as the reference, the height obtained from its point cloud may or may not equal that of the bottles. If the heights differ, that is, there is a height difference between the point cloud of the reference object and that of the object to be grasped, an adjustment value can be preset according to this difference. A sketch follows.
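A sketch of step S320 with the reference information adjustment value; the function name and default are illustrative assumptions.

```python
# Sketch of step S320: offset the reference height by the preset height
# difference between reference object and item, then combine it with the
# 2D grab-point association information.
def grab_point_from_reference(xy, ref_z: float, adjustment: float = 0.0):
    x, y = xy
    z = ref_z + adjustment    # adjustment is 0 when the heights coincide
    return (x, y, z)          # 3D grab point for the robot
```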
  • The robots or grippers appearing in the above embodiments can include various general-purpose fixtures.
  • General-purpose fixtures are fixtures whose structure has been standardized and which have a wide scope of application, for example three-jaw and four-jaw chucks for lathes, and flat tongs and indexing heads for milling machines.
  • By clamping method, fixtures can be divided into manual clamping fixtures, pneumatic clamping fixtures, hydraulic clamping fixtures, gas-hydraulic linkage clamping fixtures, electromagnetic fixtures, vacuum fixtures, and so on, or other bionic devices capable of picking up items.
  • The present application does not limit the specific type of gripper, as long as it can perform the grasping operation on the item.
  • In summary, first, when the point cloud of the object to be grasped cannot be obtained, the application can obtain the grasping point information of the object with the help of other image data such as color pictures, so that the robot or gripper can grasp the object directly from the grasping point information without a point cloud of the object, which effectively solves the problem of grasping items when point clouds are missing;
  • second, this application proposes three methods for correcting the mask and obtaining the grasping point association information, so that when an accurate item mask cannot be obtained, the inaccurate mask can be corrected to recover a mask as accurate as possible and obtain the grasping point information, effectively avoiding the imprecise grasps or dropped items caused by inaccurate grasping points derived from an inaccurate mask;
  • third, the application proposes a method by which the robot automatically converts the input grasping point association information into grasping point information, automatically acquiring, from the environmental characteristics of the object to be grasped, reference information that enables the conversion, and obtaining the grasping point information from it for the robot to grasp with; this makes it possible to reconstruct complete grasping point information from the available information when part of it is missing, and reduces manual intervention.
  • FIG. 5 shows an apparatus for obtaining grasping point information according to yet another embodiment of the present application, which includes:
  • an image acquisition module 400, configured to acquire a non-point cloud image containing the object to be grasped;
  • a mask generation module 410, configured to process the non-point cloud image to obtain the mask of the object to be grasped;
  • a mask processing module 420, configured to process the mask of the object to be grasped to obtain the grasping point association information;
  • a grasping point information generation module 430, configured to obtain, based on the grasping point association information, the grasping point information used to control the robot to grasp the object, where the grasping point association information is parameter information used to determine the grasping point information, and the grasping point information indicates the parameter information the robot needs to grasp the object.
  • In the image acquisition module 400, the non-point cloud image includes a color image.
  • The mask generation module 410 is specifically configured to process the non-point cloud image based on deep learning to generate the mask of the object to be grasped.
  • The mask generation module 410 is specifically configured to use a pre-built deep learning network and input the non-point cloud image into that network for processing.
  • In the mask processing module 420, the grasping point includes the center point of the graspable area of the object.
  • The mask processing module 420 is specifically configured to perform morphological processing on the mask of the object to be grasped.
  • In the mask processing module 420, the morphological processing includes morphological dilation.
  • The mask processing module 420 is specifically configured to: obtain the circumscribed rectangle of the mask of the object to be grasped; generate the inscribed circle of the circumscribed rectangle based on that rectangle; and obtain, based on the inscribed circle, the correction mask of the object and/or the grasping point association information.
  • The mask processing module 420 is specifically configured to: generate the four corner points of the circumscribed rectangle based on the mask of the object to be grasped, and generate the circumscribed rectangle based on those corner points.
  • The mask processing module 420 is specifically configured to: use a circle detection algorithm to process the mask of the object to be grasped, and obtain the correction mask of the object and/or the grasping point association information based on the processing result of the circle detection algorithm.
  • In the mask processing module 420, the circle detection algorithm includes a circle Hough transform algorithm, a random Hough transform algorithm, and/or a random circle detection algorithm.
  • The mask processing module 420 is specifically configured to: use a template matching algorithm to process the mask of the object to be grasped based on the pre-saved template of the object, and obtain the correction mask of the object and/or the grasping point association information based on the processing result of the template matching algorithm.
  • In the mask processing module 420, the matching algorithm includes a shape-based matching algorithm.
  • The grasping point information generation module 430 is specifically configured to: preset the dimension information missing from the grasping point association information, and obtain, from the association information and the preset missing dimension information, the grasping point information used to control the robot to grasp the object.
  • The grasping point information generation module 430 is specifically configured to: acquire the reference object information of the object to be grasped; process the reference object information to obtain the reference information of the object, the reference information being determined from the reference object information; and generate the grasping point information of the object from its reference information and the grasping point association information.
  • In the grasping point information generation module 430, the grasping point association information includes the two-dimensional information of the grasping point.
  • In the grasping point information generation module 430, the grasping point association information includes the X-axis and Y-axis information of the grasping point.
  • In the grasping point information generation module 430, the grasping point information includes the three-dimensional information of the grasping point.
  • In the grasping point information generation module 430, the reference information includes Z-axis information.
  • The grasping point information generation module 430 is specifically configured to: preset a reference information adjustment value; adjust the reference information using the adjustment value; and generate the grasping point information of the object based on the adjusted reference information and the grasping point association information.
  • In the grasping point information generation module 430, the reference object information includes a point cloud and/or a depth map of the reference object.
  • In the grasping point information generation module 430, the reference object has a qualified point cloud.
  • In the grasping point information generation module 430, the reference objects include other items to be grasped and/or the material frame.
  • Each module corresponds to the corresponding steps in the method embodiment; for the working principle of each module, reference may likewise be made to the description of those steps, which will not be repeated here.
  • The present application also provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, the method in any one of the above embodiments is implemented.
  • The computer program stored in the computer-readable storage medium in the embodiments of the present application can be executed by the processor of the electronic device.
  • The computer-readable storage medium can be a storage medium built into the electronic device or a pluggable storage medium of the electronic device. The computer-readable storage medium in the embodiments of the present application therefore has high flexibility and reliability.
  • The electronic device may be a control system/electronic system configured in a car, a mobile terminal (for example, a smart mobile phone), a personal computer (PC, such as a desktop or notebook computer), a tablet computer, a server, or the like; the embodiments of the present application do not limit the specific implementation of the electronic device.
  • The electronic device may include: a processor 1202, a communication interface 1204, a memory 1206, and a communication bus 1208.
  • the processor 1202 , the communication interface 1204 , and the memory 1206 communicate with each other through the communication bus 1208 .
  • the communication interface 1204 is used to communicate with network elements of other devices such as clients or other servers.
  • the processor 1202 is configured to execute the program 1210, and specifically, may execute relevant steps in the foregoing method embodiments.
  • The program 1210 may include program code, which includes computer operation instructions.
  • The processor 1202 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present application.
  • the one or more processors included in the electronic device may be of the same type, such as one or more CPUs, or may be different types of processors, such as one or more CPUs and one or more ASICs.
  • the memory 1206 is used to store the program 1210 .
  • The memory 1206 may include a high-speed RAM memory, and may also include a non-volatile memory, such as at least one disk memory.
  • Program 1210 may be downloaded and installed from a network via communication interface 1204, and/or installed from removable media.
  • When the program 1210 is executed, the processor 1202 may be made to perform the various operations in the foregoing method embodiments.
  • a "computer-readable medium” may be any device that can contain, store, communicate, propagate or transmit a program for use in or in conjunction with an instruction execution system, device or device.
  • computer-readable media include the following: electrical connection with one or more wires (electronic device), portable computer disk case (magnetic device), random access memory (RAM), Read Only Memory (ROM), Erasable and Editable Read Only Memory (EPROM or Flash Memory), Fiber Optic Devices, and Portable Compact Disc Read Only Memory (CDROM).
  • the computer-readable medium may even be paper or other suitable medium on which the program may be printed, as it may be possible, for example, by optically scanning the paper or other medium, followed by editing, interpretation or other suitable means if necessary. Processing to obtain programs electronically and store them in computer memory.
  • The processor can be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on.
  • a general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.
  • each part of the embodiments of the present application may be realized by hardware, software, firmware or a combination thereof.
  • For example, various steps or methods may be implemented by software or firmware stored in memory and executed by a suitable instruction execution system.
  • If implemented in hardware, as in another embodiment, they can be implemented by any one or a combination of the following techniques known in the art: discrete logic circuits, ASICs with suitable combinational logic gates, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), and so on.
  • each functional unit in each embodiment of the present application may be integrated into one processing module, each unit may exist separately physically, or two or more units may be integrated into one module.
  • the above-mentioned integrated modules can be implemented in the form of hardware or in the form of software function modules. If the integrated modules are realized in the form of software function modules and sold or used as independent products, they can also be stored in a computer-readable storage medium.
  • the storage medium mentioned above may be a read-only memory, a magnetic disk or an optical disk, and the like.

Abstract

Disclosed in the present application are a grip point information acquisition method and apparatus, an electronic device, and a storage medium. The grip point information acquisition method comprises: obtaining a non-point cloud image containing an object to be gripped; processing the non-point cloud image to obtain a mask of the object; processing the mask of the object to obtain grip point association information; and on the basis of the grip point association information, obtaining grip point information for controlling a robot to grip the object. According to the present application, under the condition that a point cloud of an object to be gripped cannot be obtained, grip point information of the object can be obtained by means of other image data such as a color picture, so that a robot or a gripper can grip the object directly depending on the grip point information without the aid of the point cloud of the object, and the object gripping problem in a point cloud missing environment is effectively solved.

Description

抓取点信息获取方法、装置、电子设备和存储介质Capture point information acquisition method, device, electronic device and storage medium
本申请要求于2021年11月10日提交中国专利局、申请号为CN 202111329097.7、申请名称为“抓取点信息获取方法、装置、电子设备和存储介质”的中国专利申请的优先权,以及2021年11月10日提交中国专利局、申请号为CN 202111329085.4、申请名称为“图像数据处理方法、装置、电子设备和存储介质”的中国专利申请的优先权,以及2021年11月10日提交中国专利局、申请号为CN 202111338191.9、申请名称为“图像数据处理方法、装置、电子设备和存储介质”的中国专利申请的优先权,以及2021年11月10日提交中国专利局、申请号为CN 202111329083.5、申请名称为“抓取点信息获取方法、装置、电子设备和存储介质”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。This application is required to be submitted to the China Patent Office on November 10, 2021, and the application number is CN 202111329097.7, the priority of the Chinese patent application titled "grabbing point information acquisition method, device, electronic equipment and storage medium", and submitted to the China Patent Office on November 10, 2021, the application number is CN 202111329085.4, the application title is The priority of the Chinese patent application for "image data processing method, device, electronic equipment and storage medium", and submitted to the Chinese Patent Office on November 10, 2021, the application number is CN 202111338191.9, and the application name is "image data processing method, device , electronic equipment and storage medium" and the priority of the Chinese patent application for ", and the application number CN 202111329083.5 submitted to the Chinese Patent Office on November 10, 2021, and the title of the application is "Catch point information acquisition method, device, electronic equipment and storage Medium” Chinese patent application priority, the entire content of which is incorporated in this application by reference.
技术领域technical field
本申请实施例涉及机械手臂或夹具的自动控制、程序控制领域,尤其涉及一种订单处理方法、装置、设备、系统、介质及产品。The embodiments of the present application relate to the field of automatic control and program control of a robot arm or a fixture, and in particular to an order processing method, device, equipment, system, medium and product.
背景技术Background technique
近年来电商、快递等行业快速崛起,为物流行业创造了良好的发展机遇。随着人工智能、机器视觉等技术的成熟,在仓储、搬运、分拣这些物流最基础的部分,自动化设备越来越多,工业机器人代替人工已是大势所趋。而夹具作为工业机器人的重要组成部分,越来越多的被应用于物流行业、分拣、3C行业及食品上料中。In recent years, the rapid rise of e-commerce, express delivery and other industries has created good development opportunities for the logistics industry. With the maturity of technologies such as artificial intelligence and machine vision, there are more and more automation equipment in the most basic parts of logistics such as warehousing, handling, and sorting. It is the general trend for industrial robots to replace labor. As an important part of industrial robots, fixtures are more and more used in logistics industry, sorting, 3C industry and food feeding.
目前,机械手臂或夹具的自动控制,需要基于待抓取物品的点云数据。通常的做法是先使用3D相机采集点云数据,再根据点云数据确定机械手臂运动的轨迹点,或者抓取点等信息,之后基于轨迹点或抓取点等信息控制机械手臂执行在适当的位置以适当的轨迹执行抓取操作。机械手臂在立体空间进行操作,其运动轨迹是在长,宽,高三个方向上的立体轨迹,因而轨迹点或者抓取点也应当具有X轴,Y轴和Z轴三个维度的信息,3D相机采集的点云数据即包括三个维度的信息。然而,在一些工业场景中,很难获得清晰的抓取对象的点云信息,例如在化妆品行业中,使用玻璃,特别是黑色玻璃瓶子,盛放液体。这种玻璃瓶子,自身光信号弱,并且反光材质容易被周围物体的多次反射干扰,而且,玻璃透明,会有很多漫反射和多次反射的数据,很难采集到合适的点云数据。具体地,在这种工业场景下,可能无法获取待抓取物品的点云,也可能采集到的待抓取物品的点云有缺失,或者其它点云不佳的情况,这导致机器人要么识别不出要抓取的物品,要么基于错误的点云计算出错误的抓取点,并使用错误的抓取点执行抓取,导致抓取不到,甚至瓶子掉落。目前现有技术中还没有针对这种点云缺失的工业场景下,基于机器人视觉自动控制机器人执行抓取的方案。At present, the automatic control of the robotic arm or fixture needs to be based on the point cloud data of the object to be grasped. The usual practice is to first use a 3D camera to collect point cloud data, and then determine the trajectory point or grasping point of the robot arm based on the point cloud data, and then control the robot arm to execute at an appropriate position based on the trajectory point or grasping point and other information. position to perform the grab operation with the appropriate trajectory. The robot arm operates in a three-dimensional space, and its motion track is a three-dimensional track in the three directions of length, width, and height. Therefore, the track point or grab point should also have three-dimensional information of the X-axis, Y-axis, and Z-axis. 3D The point cloud data collected by the camera includes information in three dimensions. However, in some industrial scenarios, it is difficult to obtain clear point cloud information of grasped objects, such as in the cosmetics industry, where glass, especially black glass bottles, are used to hold liquids. This kind of glass bottle has a weak light signal, and the reflective material is easily interfered by multiple reflections from surrounding objects. Moreover, the glass is transparent, so there will be a lot of diffuse reflection and multiple reflection data, making it difficult to collect suitable point cloud data. Specifically, in this industrial scenario, it may not be possible to obtain the point cloud of the object to be grasped, or the collected point cloud of the object to be grasped may be missing, or other poor point clouds may cause the robot to either recognize If the item to be grasped is not found, or the wrong grasping point is calculated based on the wrong point cloud, and the wrong grasping point is used to perform the grasping, the bottle cannot be grasped or even the bottle falls. At present, there is no solution in the prior art to automatically control the robot to perform grasping based on the robot vision in the industrial scene where the point cloud is missing.
Technical Solution
In view of the above problems, the present application is proposed to overcome, or at least partially solve, them. Specifically, first, when the point cloud of the item to be grasped cannot be obtained, the present application can acquire the grasping point information of the item with the help of other image data such as color images, so that a robot or gripper can grasp the item by relying directly on the grasping point information, without needing the item's point cloud; this effectively solves the problem of grasping items in environments where the point cloud is missing. Second, the present application proposes a method for correcting the mask and deriving grasping point association information from it, so that when an accurate item mask cannot be obtained, the inaccurate mask can be corrected to obtain a mask that is as accurate as possible and, further, accurate grasping point information; this effectively avoids the problem that an inaccurate mask leads to an inaccurate grasping point, which in turn leads to failed grasps or items being dropped during grasping. Third, the present application proposes a method by which the robot automatically converts input grasping point association information into grasping point information: according to the environmental characteristics of the item to be grasped, it automatically acquires reference information capable of converting the grasping point association information into grasping point information, and obtains the grasping point information for the robot to grasp based on that reference information. This solution makes it possible, when grasping point information is missing, to supplement the existing information to obtain complete grasping point information while reducing manual intervention.
All the solutions disclosed in the claims and the description of the present application have one or more of the above innovations and, correspondingly, can solve one or more of the above technical problems. Specifically, the present application provides a grasping point information acquisition method and apparatus, an electronic device, and a storage medium.
A grasping point information acquisition method according to an embodiment of the present application includes:
acquiring a non-point-cloud image containing an item to be grasped;
processing the non-point-cloud image to obtain a mask of the item to be grasped;
processing the mask of the item to be grasped to obtain grasping point association information; and
obtaining, based on the grasping point association information, grasping point information used to control a robot to grasp the item to be grasped, where the grasping point association information is parameter information used to determine the grasping point information, and the grasping point information represents the parameter information required by the robot to grasp the item to be grasped.
In another embodiment, an image data processing method for mask correction and acquisition of grasping point association information is provided, including:
performing morphological processing on the mask of the item to be grasped; and
further processing the mask of the item to be grasped to obtain a corrected mask of the item, and obtaining the grasping point association information based on the corrected mask.
In yet another embodiment, a method for obtaining a corrected mask of an item and deriving grasping point association information is provided, including:
obtaining a circumscribed rectangle of the mask of the item to be grasped;
generating an inscribed circle of the circumscribed rectangle, based on the circumscribed rectangle of the mask of the item to be grasped; and
obtaining a corrected mask of the item to be grasped based on the inscribed circle, and obtaining the grasping point association information based on the corrected mask.
In yet another embodiment, another method for obtaining a corrected mask of an item and deriving grasping point association information is provided, including:
processing the mask of the item to be grasped with a circle detection algorithm; and
obtaining a corrected mask of the item to be grasped based on the processing result of the circle detection algorithm, and obtaining the grasping point association information based on the corrected mask.
In yet another embodiment, a further method for obtaining a corrected mask of an item and deriving grasping point association information is provided, including:
processing the mask of the item to be grasped with a template matching algorithm, based on a pre-saved template of the item to be grasped; and
obtaining a corrected mask of the item to be grasped based on the processing result of the template matching algorithm, and obtaining the grasping point association information based on the corrected mask.
In yet another embodiment, a method for converting grasping point association information into grasping point information is provided, including:
acquiring reference object information of the item to be grasped;
processing the reference object information of the item to be grasped to obtain reference information of the item to be grasped, where the reference information is information determined from the reference object information of the item to be grasped; and
generating the grasping point information of the item to be grasped based on the reference information of the item to be grasped and the grasping point association information.
A grasping point information acquisition apparatus according to an embodiment of the present application includes:
an image acquisition module, configured to acquire a non-point-cloud image containing an item to be grasped;
a mask generation module, configured to process the non-point-cloud image to obtain a mask of the item to be grasped;
a mask processing module, configured to process the mask of the item to be grasped to obtain grasping point association information; and
a grasping point information generation module, configured to obtain, based on the grasping point association information, grasping point information used to control a robot to grasp the item to be grasped, where the grasping point association information is parameter information used to determine the grasping point information, and the grasping point information represents the parameter information required by the robot to grasp the item to be grasped.
An electronic device according to an embodiment of the present application includes a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the computer program, the grasping point information acquisition method of any of the above embodiments is implemented.
A computer-readable storage medium according to an embodiment of the present application stores a computer program; when the computer program is executed by a processor, the grasping point information acquisition method of any of the above embodiments is implemented.
Additional aspects and advantages of the present application will be given in part in the following description, and in part will become apparent from the following description or be learned through practice of the present application.
Description of the Drawings
The above and/or additional aspects and advantages of the present application will become apparent and easy to understand from the description of the embodiments in conjunction with the following drawings, in which:
Fig. 1 is a schematic flowchart of a grasping point information acquisition method for scenarios with poor point clouds according to some embodiments of the present application;
Fig. 2 is a schematic flowchart of a method for obtaining a corrected mask and grasping point association information according to some embodiments of the present application;
Fig. 3 is a schematic flowchart of a method for converting grasping point association information into grasping point information according to some embodiments of the present application;
Fig. 4 is a schematic diagram of an item to be grasped and of an item mask obtained with a non-dedicated deep learning network according to some embodiments of the present application;
Fig. 5 is a schematic structural diagram of a grasping point information acquisition apparatus according to some embodiments of the present application;
Fig. 6 is a schematic structural diagram of an electronic device according to some embodiments of the present application.
Embodiments of the Present Invention
Exemplary embodiments of the present application will be described in more detail below with reference to the accompanying drawings. Although exemplary embodiments of the present application are shown in the drawings, it should be understood that the present application may be embodied in various forms and should not be limited by the embodiments set forth herein. Rather, these embodiments are provided so that the present application will be understood more thoroughly and its scope can be fully conveyed to those skilled in the art.
Fig. 1 shows a schematic flowchart of a method for acquiring grasping point information of an item according to an embodiment of the present application. As shown in Fig. 1, the method includes:
Step S100: acquiring a non-point-cloud image containing an item to be grasped;
Step S110: processing the non-point-cloud image to obtain a mask of the item to be grasped;
Step S120: processing the mask of the item to be grasped to obtain grasping point association information, the grasping point association information being parameter information used to determine the grasping point information;
Step S130: obtaining, based on the grasping point association information, grasping point information used to control a robot to grasp the item to be grasped.
Regarding step S100: in industrial scenarios, a robot usually needs to grasp a large number of items, so there may be multiple items to be grasped rather than just one.
In a preferred embodiment, the items to be grasped may be a group of black cosmetic bottles placed in a material bin, from which the robot needs to pick the bottles and transport them elsewhere. Because black glass returns a weak light signal of its own and is easily disturbed by multiple reflections from surrounding objects, producing a large amount of diffuse-reflection and multiple-reflection data, it is quite likely that no point cloud can be obtained when photographing with an industrial camera, and the 3D point cloud data normally required for robot control is difficult to collect. The method of the present application is therefore particularly suitable for such scenarios: it identifies the item to be grasped from image data other than point cloud data, computes the grasping point information, and controls the robot to perform the grasp.
The non-point-cloud image data in this embodiment may preferably be a color picture of the item, such as a 2D color image. Unlike a point cloud, a 2D color image allows the items it contains to be identified fairly clearly; even an item for which no point cloud can be generated, such as a black transparent glass bottle, can be clearly photographed and recognized. The image may be captured with an industrial camera by placing the group of items to be grasped under the vision sensor, thereby obtaining the image data of the items to be grasped.
Regarding step S110: any existing method may be used to compute the mask of the item.
Preferably, the mask of the item to be grasped may be generated based on a deep learning network. Image recognition is a routine application of deep learning networks, and general-purpose deep learning networks capable of image recognition already exist in the prior art. As a specific implementation, an existing pre-built deep learning network for recognizing items or extracting item masks may be used, and the non-point-cloud image may be fed into the network for processing to obtain the mask; to improve recognition accuracy, the network may be trained in advance.
The acquired 2D color image is input into the deep learning network, which processes the color image, identifies the region of interest in which a mask is to be generated, generates an image mask in that region, and uses the image mask to cover the original image region. In this way, the mask of the item to be grasped is obtained on the basis of the color image.
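As an illustrative sketch of this step only, the snippet below runs a color image through a pretrained, general-purpose instance segmentation network and thresholds the predicted soft mask into a binary item mask. The choice of torchvision's Mask R-CNN, the 0.5 score and mask thresholds, and the helper name are assumptions for illustration; the embodiments above do not prescribe a specific network.

```python
import numpy as np
import torch
import torchvision
from torchvision.transforms.functional import to_tensor

def extract_item_mask(color_image_rgb: np.ndarray) -> np.ndarray:
    """Hypothetical helper: binary mask for the most confident detection."""
    # A pretrained, general-purpose instance segmentation network
    # (not dedicated to this scenario).
    model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
    model.eval()
    with torch.no_grad():
        pred = model([to_tensor(color_image_rgb)])[0]
    if len(pred["scores"]) == 0 or pred["scores"][0] < 0.5:
        raise RuntimeError("no item detected in the color image")
    # Soft mask (H, W) in [0, 1] -> binary mask, 255 inside the item region.
    soft = pred["masks"][0, 0].numpy()
    return (soft > 0.5).astype(np.uint8) * 255
```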
Regarding step S120: after the mask of the item is obtained, the grasping point association information of the item can be computed within the mask region. Here, grasping point association information is information that, compared with the grasping point information, lacks some information and therefore cannot by itself achieve the robot's grasping purpose, or information that cannot be used by the robot directly but can be used to compute the grasping point information.
The specific position of the grasping point depends on the gripper used and the item to be grasped; for example, the grasping point may include the center point of the item's graspable region. For instance, when the item to be grasped is a black cosmetic bottle, the center point of the bottle mouth may be chosen as the grasping point.
When a non-customized deep learning network, usable in a variety of situations, processes a captured photo to recognize the items in it and extract item masks, the generated mask usually does not fit the photographed item exactly: sometimes it slightly exceeds the item's graspable region, sometimes it is slightly smaller than it, and the shape of the mask usually does not coincide with the shape of the graspable region.
Fig. 4 shows an item to be grasped and a typical mask of that item generated by a deep learning network. Specifically, Fig. 4 shows the scenario in which the item to be grasped is the above-mentioned cosmetic bottle: the circular part is the mouth region of the glass bottle to be grasped, i.e., the grasping region of the robot arm, and the shaded part is the mask extracted by a general-purpose deep learning network. It can be seen that in this case the item's mask is inaccurate, so the grasping point computed from it is naturally also inaccurate, and an inaccurate grasping point may lead to problems such as failing to grasp the bottle or dropping it during grasping.
One solution is to design a deep learning network dedicated to this scenario and train it repeatedly to improve the accuracy of its results, so that an accurate mask and grasping point are obtained once an image is fed into the network.
Another solution is to process the inaccurate mask and obtain the grasping point information from the processed mask.
The present application preferably uses the latter solution. Specifically, the present application provides a lower-cost and more general method for mask correction and grasping point acquisition, which is one of the key points of the present application.
Fig. 2 shows a schematic flowchart of an image data processing method for mask correction and acquisition of grasping point association information according to an embodiment of the present application. As shown in Fig. 2, the method includes:
Step S200: performing morphological processing on the mask of the item to be grasped;
Step S210: further processing the mask of the item to be grasped to obtain a corrected mask of the item, and obtaining the grasping point association information based on the corrected mask.
Regarding step S200: as shown in Fig. 4, with an item mask obtained by a general deep learning network, the mask region usually cannot fit the object outline perfectly (the grasping region in Fig. 4 is the bottle mouth region, and the gap between the mask and the actual bottle mouth is clearly visible); it appears skewed, and it may contain many holes inside. For ordinary item recognition applications this matters little, but the present application targets industrial item-grasping scenarios with high accuracy requirements, where such errors are intolerable. The obtained mask therefore needs to be processed. The first step is morphological processing, which changes the geometric form of the mask region.
In one implementation, the morphological processing may be dilation. After the non-point-cloud image information is obtained, dilation is applied to the image to fill in defects such as missing or irregular regions. For example, for every pixel in the mask, a certain number of surrounding pixels, e.g. 8 to 25 pixels, may be set to the same color as that pixel. This step effectively fills in the neighborhood of each pixel, so if parts of the item mask are missing, the operation fills in all the missing parts. After this processing, the item mask becomes complete, with no gaps, while the mask as a whole also becomes slightly "fatter" due to the dilation; moderate dilation facilitates the subsequent image processing operations.
In one implementation, the morphological processing may also be an opening operation or a closing operation. Both combine dilation with erosion: dilation fills in the missing parts, and erosion removes the over-filled parts, which better improves the geometric accuracy of the mask region.
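The following minimal sketch shows what this morphological step could look like with OpenCV; the 5x5 elliptical kernel is an assumed size chosen for illustration, not a value prescribed by the embodiments.

```python
import cv2
import numpy as np

def morph_clean(mask: np.ndarray, use_closing: bool = True) -> np.ndarray:
    """Fill holes and gaps in a binary item mask (values 0/255)."""
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))  # assumed size
    if use_closing:
        # Closing = dilation followed by erosion: fills holes, then erodes
        # the over-filled border back toward the original outline.
        return cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
    # Plain dilation: fills missing parts but leaves the mask slightly "fatter".
    return cv2.dilate(mask, kernel)
```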
Regarding step S210: the present application discloses three processing approaches for obtaining a corrected mask of the item and deriving the grasping point association information.
The first approach includes:
Step S220: obtaining a circumscribed rectangle of the mask of the item to be grasped;
Step S221: generating an inscribed circle of the circumscribed rectangle, based on the circumscribed rectangle of the mask of the item to be grasped;
Step S222: obtaining a corrected mask of the item to be grasped based on the inscribed circle, and obtaining the grasping point association information based on the corrected mask.
Regarding step S220: any circumscribed rectangle algorithm may be used to compute a circumscribed rectangle of the mask.
As a specific implementation, the four corner points of the circumscribed rectangle may be generated based on the mask of the item to be grasped, and the rectangle may then be generated from the corner points. Specifically, the X and Y coordinate values of every pixel in the mask are computed, and the minimum X value, minimum Y value, maximum X value, and maximum Y value are selected. These four values are then combined into point coordinates: the minimum X and minimum Y values form the coordinate (Xmin, Ymin), the maximum X and Y values form (Xmax, Ymax), the minimum X and maximum Y values form (Xmin, Ymax), and the maximum X and minimum Y values form (Xmax, Ymin). Taking the points (Xmin, Ymin), (Xmax, Ymax), (Xmin, Ymax), and (Xmax, Ymin) as the four corner points of the circumscribed rectangle and connecting them yields the circumscribed rectangle.
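A minimal sketch of this corner-point computation, assuming a binary mask whose nonzero pixels lie inside the item region:

```python
import numpy as np

def bounding_rect_corners(mask: np.ndarray):
    """Axis-aligned circumscribed rectangle of a binary mask's nonzero pixels."""
    ys, xs = np.nonzero(mask)
    x_min, x_max = xs.min(), xs.max()
    y_min, y_max = ys.min(), ys.max()
    # The four corner points described above.
    return (x_min, y_min), (x_max, y_max), (x_min, y_max), (x_max, y_min)
```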
Regarding step S221: the key point of the present application is to use an inscribed circle algorithm as one link in computing the corrected mask and the grasping point information; no improvement is made to the inscribed circle algorithm itself. The specific inscribed circle algorithm is therefore not limited, and any inscribed circle algorithm may be used in the present application, as long as the chosen algorithm can obtain the inscribed circle of the above circumscribed rectangle.
Regarding step S222: after the inscribed circle is obtained, the part of the mask enclosed by the inscribed circle can be computed and taken as the corrected mask of the item to be grasped. The outline of the corrected mask so obtained has the same shape and size as the inscribed circle. The position of the circle's center is computed on the inscribed-circle corrected mask and the center information is obtained; this center information serves as the grasping point association information. Specifically, the two-dimensional position of the center, for example its X-axis and Y-axis values, may be used as the X-axis and Y-axis position information of the grasping point.
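Continuing the sketch above: for an axis-aligned rectangle, one inscribed circle is the disk centered at the rectangle's center with radius equal to half the shorter side; the disk is then taken as the corrected mask and its center gives the 2D grasping point association information. This concrete choice of inscribed circle is an assumption for illustration.

```python
import numpy as np

def corrected_mask_and_center(mask: np.ndarray):
    """Corrected mask = disk inscribed in the mask's bounding rectangle."""
    ys, xs = np.nonzero(mask)
    x_min, x_max = xs.min(), xs.max()
    y_min, y_max = ys.min(), ys.max()
    cx, cy = (x_min + x_max) / 2.0, (y_min + y_max) / 2.0
    r = min(x_max - x_min, y_max - y_min) / 2.0
    yy, xx = np.mgrid[0:mask.shape[0], 0:mask.shape[1]]
    disk = (xx - cx) ** 2 + (yy - cy) ** 2 <= r ** 2
    corrected = disk.astype(np.uint8) * 255  # outline matches the inscribed circle
    return corrected, (cx, cy)               # (cx, cy): grasping point X/Y info
```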
With a corrected mask obtained in this way, the mask shape is already consistent with the shape of the bottle mouth.
The second approach includes:
Step S230: processing the mask of the item to be grasped with a circle detection algorithm;
Step S231: obtaining a corrected mask of the item to be grasped based on the processing result of the circle detection algorithm, and obtaining the grasping point association information based on the corrected mask.
Regarding step S230: a circle detection algorithm, also called a circle finding algorithm, can be used to detect circular features in irregular shapes and find the circles they contain. Commonly used algorithms include the circle Hough transform, the randomized Hough transform, and randomized circle detection. The focus of this embodiment is on using a circle detection algorithm to find circles in the morphologically processed mask, without limiting which specific circle detection algorithm is used. Since the bottle mouth itself is circular and the collected mask contains part of the features of the mouth's shape, the circle found in the mask region is roughly the position of the bottle mouth.
Regarding step S231: after the circle in the mask is found by the circle detection algorithm, the part of the mask enclosed by that circle can be taken as the corrected mask of the item to be grasped. Its center is then computed, and the center information serves as the grasping point association information. As in the first approach, the grasping point association information may be the two-dimensional position of the circle's center, used as the X-axis and Y-axis position information of the grasping point.
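A minimal sketch of this approach using OpenCV's Hough-gradient circle detector; the parameter values (dp, minDist, param1/param2, radius bounds) are assumptions that would need tuning for a particular scene.

```python
import cv2
import numpy as np

def detect_mouth_circle(mask: np.ndarray):
    """Find a circle in the processed mask; return (cx, cy, r) or None."""
    blurred = cv2.GaussianBlur(mask, (5, 5), 0)  # HoughCircles prefers smoothed input
    circles = cv2.HoughCircles(
        blurred, cv2.HOUGH_GRADIENT,
        dp=1, minDist=50,            # assumed values
        param1=100, param2=15,       # edge / accumulator thresholds (assumed)
        minRadius=10, maxRadius=0)   # 0 = no upper bound on the radius
    if circles is None:
        return None
    cx, cy, r = circles[0][0]        # strongest circle ~ the bottle mouth
    return float(cx), float(cy), float(r)
```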
The second approach does not need to compute a circumscribed rectangle and an inscribed circle that do not originally exist; it only needs to find the circular part within the existing mask region, so the computation is more accurate. In addition, the first and second approaches are well suited to industrial scenarios in which the region to be grasped is circular, and can significantly improve the accuracy of the determined grasping point association information.
The third approach includes:
Step S240: processing the mask of the item to be grasped with a template matching algorithm, based on a pre-saved template of the item to be grasped;
Step S241: obtaining a corrected mask of the item to be grasped based on the processing result of the template matching algorithm, and obtaining the grasping point association information based on the corrected mask.
Regarding step S240: the template of the item to be grasped may be a template of the item as a whole or a template of the item's grasping region. For example, in a scenario where the item to be grasped is a black glass cosmetic bottle and the gripper needs to grasp the bottle mouth, the template may be a three-dimensional template built for the whole bottle, or a template built only for the graspable region of the bottle mouth.
After the uncorrected mask region is obtained, a matching algorithm is used to match the pre-stored template within the mask region. Put simply, the template is like a known small image, and the template matching algorithm searches for the target in a larger image that contains it: it is known that the sought target is present in the image and has the same size, orientation, and image elements as the template, and the template matching algorithm can find the target, i.e. the small image, in the larger image and determine its pose.
This embodiment does not restrict the specific matching algorithm. Since the mask itself loses color information, the emphasis is on matching by shape rather than by color, so the present application preferably uses a shape-based matching algorithm.
In addition, weighing matching efficiency against accuracy, a match can be considered successful when the shape similarity reaches 70-95% during template matching. The specific value can be selected and adjusted according to the needs of the actual application scenario. Of course, those skilled in the art may also set a specific value or value range for the above shape similarity according to the actual matching accuracy requirements.
Regarding step S241: after the shape matching the pre-stored template is found in the mask, the mask enclosed by that shape can be taken as the corrected mask, and the grasping point association information is computed from it.
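As a hedged sketch of shape-based matching, the snippet below compares the largest contour of the mask against a pre-stored template contour using OpenCV's Hu-moment shape distance, accepting the match when a similarity score derived from the distance exceeds a threshold. The mapping of the distance to a 0-1 "similarity" and the 0.7 default (the lower end of the 70-95% range above) are assumptions for illustration, not the specific matching algorithm of the embodiments.

```python
import cv2
import numpy as np

def match_template_shape(mask: np.ndarray, template_contour: np.ndarray,
                         min_similarity: float = 0.7):
    """Return the matched mask contour if its shape resembles the template."""
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    candidate = max(contours, key=cv2.contourArea)
    # Hu-moment distance: 0 means identical shapes; map it into (0, 1].
    dist = cv2.matchShapes(candidate, template_contour,
                           cv2.CONTOURS_MATCH_I1, 0.0)
    similarity = 1.0 / (1.0 + dist)   # assumed similarity mapping
    return candidate if similarity >= min_similarity else None
```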
Because a template is used, the item can be matched and grasped regardless of the shape in which it appears within the region to be grasped; this is not limited to scenarios where the item's grasping region is circular. Correspondingly, when different items are grasped, the positions of the grasping points also differ.
In one embodiment, the grasping point may be the center point of the corrected mask, and the grasping point association information may be the two-dimensional position of that center point, i.e., the X-axis and Y-axis position information of the grasping point.
In engineering practice, the third approach achieves a high standard in both accuracy and computation speed, and can be used for grasping arbitrary items.
Those skilled in the art will understand that although the above preferred embodiments of the present application are described in terms of obtaining the corrected mask from a morphologically processed mask, this is not limiting. When the mask has no defects, or only minor ones, the corrected mask may also be obtained directly from the mask derived from the non-point-cloud image, without morphological processing, and the grasping point association information determined from it. Those skilled in the art can selectively apply morphological processing to the mask before correcting it, according to the degree of mask defects in the actual application.
In step S130, as stated above, the grasping point association information may be incomplete compared with the grasping point information and therefore cannot be used directly by the robot to perform grasping. For this scenario, in one possible implementation, the dimensional information missing from the grasping point association information may be preset, and the grasping point information used to control the robot to grasp the item is then obtained based on the grasping point association information and the preset missing dimensional information.
As a specific example, the grasping point association information may be two-dimensional information of the grasping point (such as X-axis and Y-axis information), while the grasping point information required by the robot may be three-dimensional or higher-dimensional information. For example, the grasping point information may be three-dimensional information that additionally includes Z-axis information, or four-dimensional information that additionally includes a rotation angle or gripping depth; alternatively, the grasping point association information may be three-dimensional information of the grasping point while the grasping point information required by the robot is four-dimensional or higher-dimensional.
In this case, in order to convert the grasping point association information into grasping point information, the information missing from the association information can be preset. For instance, when the grasping point association information is two-dimensional (or three-dimensional), the missing third-dimension (or fourth-dimension) information is preset by prior detection or manual input. For example, when the item to be grasped is a black glass cosmetic bottle for which grasping point information cannot be obtained, the specific height information can be entered manually in advance; or, when the item to be grasped is a fragile item placed at a specific angle, its corresponding grasping angle can be entered manually.
In this way, after the non-point-cloud image has been processed by the foregoing scheme and the grasping point association information has been obtained, the association information can be completed and converted into grasping point information for the robot to use, further based on the preset information.
This approach requires the missing dimensional information to be entered manually before grasping. In settings where the structure and placement of the items are fairly uniform, manually entering the uniform dimensional information can significantly improve processing efficiency.
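A trivial sketch of this completion step, assuming the missing dimension is a manually preset height in the robot's working coordinate system:

```python
PRESET_Z = 0.08  # assumed manually entered bottle-mouth height, in meters

def complete_grasp_point(assoc_xy: tuple[float, float],
                         preset_z: float = PRESET_Z) -> tuple[float, float, float]:
    """Combine 2D grasping point association info with a preset Z value."""
    x, y = assoc_xy
    return (x, y, preset_z)
```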
Furthermore, the present application also proposes a method for converting grasping point association information into grasping point information that requires no manual intervention and is also applicable to settings where the structure and placement of the items are less uniform.
Fig. 3 shows a schematic flowchart of a method for converting grasping point association information into grasping point information according to an embodiment of the present application. As shown in Fig. 3, the method includes:
Step S300: acquiring reference object information of the item to be grasped;
Step S310: processing the reference object information of the item to be grasped to obtain reference information of the item to be grasped, the reference information being information determined from the reference object information of the item to be grasped;
Step S320: generating the grasping point information of the item to be grasped based on the reference information of the item to be grasped and the grasping point association information.
Regarding step S300: in scenarios where the point cloud of the item to be grasped is poor, an item with an acceptable point cloud can be used as a reference object to obtain the information missing from the grasping point information of the item to be grasped. For example, when the X-axis and Y-axis information of the grasping point can be obtained, an item whose point cloud allows its Z-axis information to be recognized can serve as the reference object.
As a specific implementation, the grasping point association information may be two-dimensional information of the grasping point; for example, it may include the grasping point's X-axis and Y-axis information. In this case, the corresponding grasping point information may be three-dimensional, or four-dimensional or higher. Suppose the control system in use needs the grasping point's X-axis, Y-axis, and Z-axis information before it can control the robot's grasping position; then after the two-dimensional information of the grasping point is obtained, grasping cannot be performed on that information alone, and the two-dimensional information must be converted into three-dimensional grasping point information in a subsequent step. The information obtainable from a non-point-cloud image is usually two-dimensional (rather than one-dimensional); by combining the grasping point association information with the missing height dimension, the corresponding three-dimensional information of the grasping point, i.e. the grasping point information, can be obtained.
Specifically, the reference object may include other items to be grasped and/or the material bin. The reference object may be an item close to the item to be grasped, or other items of the same kind placed together with it. In particular, in industrial scenarios where a large number of items to be grasped, e.g. cosmetic bottles, are placed in a material bin, the bin's point cloud is usually complete, and as long as the bin as a whole is not strongly deformed, its height is consistent at every position; that is, at every position the bin's height is the same as the height of the items to be grasped or differs from it by a fixed amount, which makes the bin a suitable reference object. Therefore, when the point cloud of the item to be grasped cannot be obtained in this scenario, the item's 2D color image data and the point cloud data of the recognized bin can be collected for the subsequent steps.
In other implementations, the reference object should have an acceptable point cloud. When shooting with a camera, because the multiple items to be grasped are at different positions, the point cloud quality of each item differs within the overall point cloud data captured at a given position. In other words, suitable point cloud data may be unobtainable for some of the items to be grasped while being obtainable for others. In this case, the items with better point clouds can be selected as reference objects for the other items to be grasped.
Point cloud information can be acquired with a 3D industrial camera. A 3D industrial camera is generally equipped with two lenses that capture the group of items to be grasped from different angles; after processing, a three-dimensional image of the object can be presented. The group of items to be grasped is placed under the vision sensor and both lenses shoot simultaneously; from the relative pose parameters of the two resulting images, a general binocular stereo vision algorithm computes the X, Y, and Z coordinate values and the coordinate orientation of each point of the object, which are then converted into the point cloud data of the group of items to be grasped. In a specific implementation, components such as laser detectors, visible light detectors such as LEDs, infrared detectors, and radar detectors may also be used to generate the point cloud; the present application does not limit the specific implementation.
As an example, a two-dimensional color image corresponding to the three-dimensional item region, together with a depth map corresponding to that color image, may also be acquired along the depth direction perpendicular to the item. Here, the two-dimensional color image corresponds to the image of the planar region perpendicular to the preset depth direction; each pixel of the depth map corresponds one-to-one to a pixel of the color image, and the value of each pixel is that pixel's depth value. In one implementation, the acquired reference object information may be the reference object's point cloud or the reference object's depth map.
Regarding step S310: the reference information is information determined from the reference object information of the item to be grasped. Take the industrial scenario in which multiple black glass cosmetic bottles to be grasped are arranged in a material bin as an example. After the overall point cloud of the item group is obtained, the clearer reference object point clouds within it can be further identified; for example, the point cloud of the material bin, or the point clouds of those bottle mouths whose point clouds are relatively clear, can be identified from the overall point cloud. The identified point cloud is then processed to extract its height information or Z-axis information as the reference information. Similarly, depth information (corresponding to Z-axis information) can be extracted from the reference object's depth map as the reference information. Although this embodiment takes height information as the example of the information missing from the grasping point's two-dimensional information, those skilled in the art will understand that when the missing information is not height information, the reference information need not be height information either.
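A minimal sketch of extracting such reference height information, assuming the reference object's point cloud is an (N, 3) array of X/Y/Z coordinates; the use of the median as the robust statistic is an assumption for illustration.

```python
import numpy as np

def reference_height(ref_points: np.ndarray) -> float:
    """Extract a Z-axis reference value from a reference object's point cloud.

    ref_points: (N, 3) array of X, Y, Z coordinates belonging to the
    reference object (e.g. the material bin rim or a clear bottle mouth).
    """
    z = ref_points[:, 2]
    return float(np.median(z))  # median is robust to stray outlier points
```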
Regarding step S320: a specific method of generating the grasping point information of the item to be grasped is to preset a reference information adjustment value, adjust the reference information with that adjustment value, and then generate the grasping point information of the item based on the adjusted reference information and the item's grasping point association information. The reference information adjustment value can be set according to the original difference between the reference information and the information missing from the grasping point association information. For example, in the height dimension, this original difference is the height difference between the reference object's point cloud and the point cloud of the item to be grasped; two examples follow.
If the point cloud of a bottle mouth is used, then because the bottles in the bin are of the same type, the height of that point cloud is the same as the mouth height of all the bottles; in this case no reference information adjustment value needs to be set, or the adjustment value is 0. After the mouth height information is obtained, it is combined with the two-dimensional grasping point association information to obtain the three-dimensional grasping point information, and the gripper can then perform the grasp based on that three-dimensional grasping point information.
If the point cloud of the material bin is used, the height obtained from the point cloud may or may not be the same as that of the bottles. If the heights differ, i.e. there is a height difference between the reference object's point cloud and the item's point cloud, an adjustment value can be preset according to that height difference.
After the bin's height information and the two-dimensional grasping point association information are obtained, the three-dimensional grasping point information can be determined in combination with the adjustment value. For example, if the bin height is 10 cm and the adjustment value is -2 cm, the bottle mouth height is computed as 10 - 2 = 8 cm, which, combined with the grasping point's X-axis and Y-axis information, yields the three-dimensional grasping point information.
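The worked example above in a few lines of code, with the 10 cm bin height and -2 cm adjustment taken from the text and the X/Y values assumed for illustration:

```python
def grasp_point_from_reference(assoc_xy, ref_height, adjustment=0.0):
    """Adjust the reference height and combine it with the 2D grasp point."""
    x, y = assoc_xy
    return (x, y, ref_height + adjustment)

# Bin rim measured at 10 cm, bottle mouths sit 2 cm lower: Z = 8 cm.
point_3d = grasp_point_from_reference((123.0, 456.0), ref_height=10.0,
                                      adjustment=-2.0)
print(point_3d)  # (123.0, 456.0, 8.0)
```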
The robot or gripper appearing in the above embodiments may include various general-purpose grippers, i.e., grippers whose structure has been standardized and which have a wide range of applications, for example three-jaw and four-jaw chucks for lathes, and flat-nose vises and indexing heads for milling machines. As another example, by the clamping power source used, grippers can be divided into manually clamped grippers, pneumatically clamped grippers, hydraulically clamped grippers, gas-hydraulic linked grippers, electromagnetic grippers, vacuum grippers, and so on, or other bionic devices capable of picking up items. The present application does not limit the specific type of gripper, as long as the item-grasping operation can be achieved.
In addition, it should be noted that although each embodiment of the present application has a specific combination of features, further combinations and cross-combinations of these features among the embodiments are also feasible.
According to the above embodiments, first, when the point cloud of the item to be grasped cannot be obtained, the present application can acquire the item's grasping point information with the help of other image data such as color images, so that a robot or gripper can grasp the item by relying directly on that grasping point information without needing the item's point cloud, effectively solving the problem of grasping items in environments where the point cloud is missing. Second, the present application proposes three methods for correcting the mask and deriving grasping point association information, so that when an accurate item mask cannot be obtained, the inaccurate mask can be corrected to obtain a mask that is as accurate as possible and to acquire the grasping point information, effectively avoiding the problem that an inaccurate mask leads to an inaccurate grasping point and, in turn, to failed grasps or dropped items. Third, the present application proposes a method by which the robot automatically converts input grasping point association information into grasping point information: according to the environmental characteristics of the item to be grasped, it automatically acquires reference information capable of converting the association information into grasping point information, and determines the grasping point information for the robot to grasp based on that reference information. This solution makes it possible, when grasping point information is missing, to supplement the existing information to obtain complete grasping point information while reducing manual intervention.
Fig. 5 shows a grasping point information acquisition apparatus according to yet another embodiment of the present application. The apparatus includes:
an image acquisition module 400, configured to acquire a non-point-cloud image containing an item to be grasped;
a mask generation module 410, configured to process the non-point-cloud image to obtain a mask of the item to be grasped;
a mask processing module 420, configured to process the mask of the item to be grasped to obtain grasping point association information;
a grasping point information generation module 430, configured to obtain, based on the grasping point association information, grasping point information used to control a robot to grasp the item to be grasped, where the grasping point association information is parameter information used to determine the grasping point information, and the grasping point information represents the parameter information required by the robot to grasp the item to be grasped.
可选地,图像获取模块400具体包括,非点云图像包括彩色图像。Optionally, the image acquisition module 400 specifically includes that the non-point cloud images include color images.
可选地,掩膜生成模块410具体用于,基于深度学习对非点云图像进行处理,生成待抓取物品的掩膜。Optionally, the mask generation module 410 is specifically configured to process non-point cloud images based on deep learning to generate a mask of the object to be captured.
可选地,掩膜生成模块410具体用于,预先构建深度学习网络,将非点云图像输入深度学习网络进行处理。Optionally, the mask generation module 410 is specifically configured to pre-build a deep learning network, and input non-point cloud images into the deep learning network for processing.
可选地,掩膜处理模块420具体包括,抓取点包括物品可抓取区域的中心点。Optionally, the mask processing module 420 specifically includes that the grabbing point includes a center point of the grabable area of the item.
可选地,掩膜处理模块420具体用于,对待抓取物品的掩膜进行形态学处理。Optionally, the mask processing module 420 is specifically configured to perform morphological processing on the mask of the object to be grabbed.
可选地,掩膜处理模块420具体包括,形态学处理包括形态学的膨胀处理。Optionally, the mask processing module 420 specifically includes that the morphological processing includes morphological dilation processing.
可选地,掩膜处理模块420具体用于:获取待抓取物品的掩膜的外接矩形;基于待抓取物品的掩膜的外接矩形,生成该外接矩形的内接圆;基于内接圆获取待抓取物品的矫正掩膜和/或抓取点关联信息。Optionally, the mask processing module 420 is specifically configured to: obtain the circumscribed rectangle of the mask of the item to be captured; generate an inscribed circle of the circumscribed rectangle based on the circumscribed rectangle of the mask of the item to be captured; Obtain the correction mask of the item to be grasped and/or the associated information of the grasping point.
可选地,掩膜处理模块420具体用于:基于待抓取物品的掩膜,生成外接矩形的4个角点;基于角点生成外接矩形。Optionally, the mask processing module 420 is specifically configured to: generate four corner points of a circumscribed rectangle based on the mask of the item to be grasped; generate a circumscribed rectangle based on the corner points.
可选地,掩膜处理模块420具体用于:使用圆检测算法对待抓取物品的掩膜进行处理;基于圆检测算法的处理结果获取待抓取物品的矫正掩膜和/或抓取点关联信息。Optionally, the mask processing module 420 is specifically configured to: use a circle detection algorithm to process the mask of the item to be captured; obtain a corrected mask and/or capture point association of the item to be captured based on the processing result of the circle detection algorithm information.
可选地,掩膜处理模块420具体包括,圆检测算法包括圆霍夫变换算法、随机霍夫变换算法和/或随机圆检测算法。Optionally, the mask processing module 420 specifically includes that the circle detection algorithm includes a circle Hough transform algorithm, a random Hough transform algorithm and/or a random circle detection algorithm.
Optionally, the mask processing module 420 is specifically configured to: process the mask of the item to be grasped using a template matching algorithm based on a pre-saved template of the item, and obtain a corrected mask of the item to be grasped and/or the grasping-point-associated information based on the processing result of the template matching algorithm.
Optionally, in the mask processing module 420, the matching algorithm includes a shape-based matching algorithm.
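For illustration only, the sketch below uses OpenCV's intensity-based matchTemplate as a simple stand-in for a full shape-based matcher; the file names and the use of normalized cross-correlation are assumptions of the example.

```python
import cv2

mask = cv2.imread("mask.png", cv2.IMREAD_GRAYSCALE)           # item mask
template = cv2.imread("template.png", cv2.IMREAD_GRAYSCALE)   # pre-saved item template

# Slide the template over the mask; the best-match location anchors it.
result = cv2.matchTemplate(mask, template, cv2.TM_CCOEFF_NORMED)
_, score, _, top_left = cv2.minMaxLoc(result)

h, w = template.shape
center = (top_left[0] + w // 2, top_left[1] + h // 2)  # 2D grasping point candidate
```

A low match score would indicate that the mask is too distorted to correct by this route, in which case another of the optional processing branches could be tried.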
Optionally, the grasping point information generation module 430 is specifically configured to: preset the dimension information missing from the grasping-point-associated information, and obtain the grasping point information used to control the robot to grasp the item to be grasped based on the grasping-point-associated information and the preset missing dimension information.
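A trivial sketch of this option, with all numbers chosen purely for illustration: the 2D information from the mask is completed with a preset value for the missing Z dimension.

```python
# (x, y): 2D grasping-point-associated information from the mask,
# assumed already converted to robot base coordinates.
x, y = 0.12, -0.04
Z0 = 0.25                    # preset value for the missing Z dimension (assumption)
grasp_point = (x, y, Z0)     # 3D grasping point information passed to the robot
```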
Optionally, the grasping point information generation module 430 is specifically configured to: acquire reference object information of the item to be grasped; process the reference object information to obtain reference information of the item to be grasped, the reference information being determined from the reference object information; and generate the grasping point information of the item to be grasped according to the reference information and the grasping-point-associated information.
Optionally, in the grasping point information generation module 430, the grasping-point-associated information includes two-dimensional information of the grasping point.
Optionally, in the grasping point information generation module 430, the grasping-point-associated information includes the X-axis and Y-axis information of the grasping point.
Optionally, in the grasping point information generation module 430, the grasping point information includes three-dimensional information of the grasping point.
Optionally, in the grasping point information generation module 430, the reference information includes Z-axis information.
Optionally, the grasping point information generation module 430 is specifically configured to: preset a reference information adjustment value; adjust the reference information using the adjustment value; and generate the grasping point information of the item to be grasped based on the adjusted reference information and the grasping-point-associated information.
Optionally, in the grasping point information generation module 430, the reference object information includes a point cloud and/or a depth map of the reference object.
Optionally, in the grasping point information generation module 430, the reference object has a qualified point cloud.
Optionally, in the grasping point information generation module 430, the reference objects include other items to be grasped and/or the material bin.
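Putting the reference-object options together, a hypothetical sketch: Z is estimated from the point cloud of a reference object that images well (for example the material bin rim or a neighboring item with a qualified point cloud), adjusted by a preset offset, and merged with the 2D grasping-point-associated information. The function name, the median estimator, and all numbers are assumptions of the example.

```python
import numpy as np

def grasp_point_from_reference(xy, reference_points, z_offset=0.0):
    """Complete a 2D grasping point with Z information taken from a
    reference object's point cloud; z_offset is the preset reference
    information adjustment value (e.g. the known height difference
    between the reference surface and the item's top face)."""
    z_ref = float(np.median(reference_points[:, 2]))  # robust height estimate
    return np.array([xy[0], xy[1], z_ref + z_offset])

# Example: simulated bin-rim points at ~0.30 m; items sit 0.05 m below the rim.
rim = np.random.normal([0.0, 0.0, 0.30], [0.10, 0.10, 0.002], size=(500, 3))
print(grasp_point_from_reference((0.12, -0.04), rim, z_offset=-0.05))
```

The median is used rather than the mean so that stray reflections in the reference point cloud do not bias the estimated height; this choice is ours, not the application's.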
In the device embodiment shown in FIG. 5 above, only the main functions of the modules are described. The full functions of each module correspond to the respective steps of the method embodiment, and the working principle of each module can likewise be understood by referring to the description of the corresponding steps in the method embodiment, which is not repeated here.
The present application further provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, the method of any of the above embodiments is implemented. It should be noted that the computer program stored on the computer-readable storage medium of the embodiments of the present application can be executed by the processor of an electronic device. In addition, the computer-readable storage medium may be a storage medium built into the electronic device or a storage medium that can be removably plugged into the electronic device, so the computer-readable storage medium of the embodiments of the present application offers high flexibility and reliability.
FIG. 6 shows a schematic structural diagram of an electronic device according to an embodiment of the present application. The electronic device may be a control system/electronic system configured in a vehicle, a mobile terminal (for example, a smartphone), a personal computer (PC, for example, a desktop or notebook computer), a tablet computer, a server, or the like; the specific embodiments of the present application do not limit the specific implementation of the electronic device.
As shown in FIG. 6, the electronic device may include: a processor 1202, a communications interface 1204, a memory 1206, and a communication bus 1208.
Wherein:
The processor 1202, the communications interface 1204, and the memory 1206 communicate with one another through the communication bus 1208.
The communications interface 1204 is used to communicate with network elements of other devices, such as clients or other servers.
The processor 1202 is configured to execute the program 1210 and may specifically perform the relevant steps of the above method embodiments.
Specifically, the program 1210 may include program code, and the program code includes computer operation instructions.
The processor 1202 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present application. The one or more processors included in the electronic device may be processors of the same type, such as one or more CPUs, or processors of different types, such as one or more CPUs and one or more ASICs.
The memory 1206 is used to store the program 1210. The memory 1206 may include a high-speed RAM memory and may also include a non-volatile memory, for example at least one disk memory.
The program 1210 may be downloaded and installed from a network through the communications interface 1204 and/or installed from a removable medium. When executed by the processor 1202, the program may cause the processor 1202 to perform the operations of the above method embodiments.
In the description of this specification, descriptions referring to the terms "one embodiment", "some embodiments", "exemplary embodiment", "example", "specific example", or "some examples" mean that a specific feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic expressions of the above terms do not necessarily refer to the same embodiment or example. Moreover, the described specific features, structures, materials, or characteristics may be combined in a suitable manner in any one or more embodiments or examples.
Any process or method description in a flowchart, or otherwise described herein, may be understood as representing a module, segment, or portion of code that includes one or more executable instructions for implementing specific logical functions or steps of the process, and the scope of the preferred embodiments of the present application includes additional implementations in which functions may be performed out of the order shown or discussed, including in a substantially simultaneous manner or in the reverse order depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the present application belong.
The logic and/or steps represented in the flowcharts or otherwise described herein may, for example, be considered an ordered listing of executable instructions for implementing logical functions, and may be embodied in any computer-readable medium for use by, or in connection with, an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processing module, or another system that can fetch instructions from an instruction execution system, apparatus, or device and execute them). For the purposes of this specification, a "computer-readable medium" may be any means that can contain, store, communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium include the following: an electrical connection having one or more wires (an electronic device), a portable computer diskette (a magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). In addition, the computer-readable medium may even be paper or another suitable medium on which the program can be printed, since the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it in a suitable manner if necessary, and then stored in a computer memory.
The processor may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
It should be understood that the parts of the embodiments of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented by software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented by any one of, or a combination of, the following techniques known in the art: a discrete logic circuit having logic gate circuits for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gate circuits, a programmable gate array (PGA), a field-programmable gate array (FPGA), and the like.
Those of ordinary skill in the art will appreciate that all or part of the steps carried by the methods of the above embodiments can be completed by instructing the relevant hardware through a program; the program may be stored in a computer-readable storage medium and, when executed, includes one of, or a combination of, the steps of the method embodiments.
In addition, the functional units in the embodiments of the present application may be integrated into one processing module, each unit may exist separately physically, or two or more units may be integrated into one module. The above integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disk, or the like.
Although the embodiments of the present application have been shown and described above, it should be understood that the above embodiments are exemplary and are not to be construed as limiting the present application; those of ordinary skill in the art may make changes, modifications, substitutions, and variations to the above embodiments within the scope of the present application.

Claims (19)

  1. A grasping point information acquisition method, comprising:
    acquiring a non-point-cloud image containing an item to be grasped;
    processing the non-point-cloud image to obtain a mask of the item to be grasped;
    processing the mask of the item to be grasped to obtain grasping-point-associated information; and
    obtaining, based on the grasping-point-associated information, grasping point information used to control a robot to grasp the item to be grasped, wherein the grasping-point-associated information is parameter information used to determine the grasping point information, and the grasping point information represents the parameter information required by the robot to grasp the item to be grasped.
  2. The grasping point information acquisition method according to claim 1, wherein the processing the non-point-cloud image to obtain the mask of the item to be grasped comprises:
    processing the non-point-cloud image based on deep learning to generate the mask of the item to be grasped.
  3. The grasping point information acquisition method according to claim 1, wherein the grasping point comprises a center point of a graspable area of the item.
  4. The grasping point information acquisition method according to claim 1, wherein the processing the mask of the item to be grasped to obtain the grasping-point-associated information comprises:
    obtaining a circumscribed rectangle of the mask of the item to be grasped;
    generating an inscribed circle of the circumscribed rectangle based on the circumscribed rectangle; and
    obtaining a corrected mask of the item to be grasped based on the inscribed circle, and obtaining the grasping-point-associated information based on the corrected mask.
  5. The grasping point information acquisition method according to claim 4, wherein the obtaining the circumscribed rectangle of the mask of the item to be grasped comprises:
    generating four corner points of the circumscribed rectangle based on the mask of the item to be grasped; and
    generating the circumscribed rectangle based on the corner points.
  6. The grasping point information acquisition method according to claim 1, wherein the processing the mask of the item to be grasped to obtain the grasping-point-associated information comprises:
    processing the mask of the item to be grasped using a circle detection algorithm; and
    obtaining a corrected mask of the item to be grasped based on a processing result of the circle detection algorithm, and obtaining the grasping-point-associated information based on the corrected mask.
  7. The grasping point information acquisition method according to claim 1, wherein the processing the mask of the item to be grasped to obtain the grasping-point-associated information comprises:
    processing the mask of the item to be grasped using a template matching algorithm based on a pre-saved template of the item to be grasped; and
    obtaining a corrected mask of the item to be grasped based on a processing result of the template matching algorithm, and obtaining the grasping-point-associated information based on the corrected mask.
  8. The grasping point information acquisition method according to claim 7, wherein the matching algorithm comprises a shape-based matching algorithm.
  9. The grasping point information acquisition method according to any one of claims 4 to 8, wherein the mask of the item to be grasped is a morphologically processed mask.
  10. The grasping point information acquisition method according to claim 9, wherein the morphological processing comprises morphological dilation.
  11. The grasping point information acquisition method according to any one of claims 1 to 8, wherein the obtaining, based on the grasping-point-associated information, the grasping point information used to control the robot to grasp the item to be grasped comprises:
    presetting dimension information missing from the grasping-point-associated information; and
    obtaining the grasping point information used to control the robot to grasp the item to be grasped based on the grasping-point-associated information and the preset missing dimension information.
  12. The grasping point information acquisition method according to any one of claims 1 to 8, wherein the obtaining, based on the grasping-point-associated information, the grasping point information used to control the robot to grasp the item to be grasped comprises:
    acquiring reference object information of the item to be grasped;
    processing the reference object information of the item to be grasped to obtain reference information of the item to be grasped, the reference information being information determined according to the reference object information of the item to be grasped; and
    generating the grasping point information of the item to be grasped according to the reference information of the item to be grasped and the grasping-point-associated information.
  13. The grasping point information acquisition method according to claim 12, wherein the generating the grasping point information of the item to be grasped according to the reference information and the grasping-point-associated information comprises:
    presetting a reference information adjustment value;
    adjusting the reference information using the reference information adjustment value; and
    generating the grasping point information of the item to be grasped based on the adjusted reference information and the grasping-point-associated information of the item to be grasped.
  14. The grasping point information acquisition method according to claim 12, wherein the reference object information comprises a point cloud or a depth map of the reference object.
  15. The grasping point information acquisition method according to claim 12, wherein the reference object has a qualified point cloud.
  16. The grasping point information acquisition method according to claim 12, wherein the reference object comprises other items to be grasped and/or a material bin.
  17. A grasping point information acquisition apparatus, comprising:
    an image acquisition module, configured to acquire a non-point-cloud image containing an item to be grasped;
    a mask generation module, configured to process the non-point-cloud image to obtain a mask of the item to be grasped;
    a mask processing module, configured to process the mask of the item to be grasped to obtain grasping-point-associated information; and
    a grasping point information generation module, configured to obtain, based on the grasping-point-associated information, grasping point information used to control a robot to grasp the item to be grasped, wherein the grasping-point-associated information is parameter information used to determine the grasping point information, and the grasping point information represents the parameter information required by the robot to grasp the item to be grasped.
  18. An electronic device, comprising: a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the grasping point information acquisition method according to any one of claims 1 to 16.
  19. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the grasping point information acquisition method according to any one of claims 1 to 16.
PCT/CN2022/131217 2021-11-10 2022-11-10 Grip point information acquisition method and apparatus, electronic device, and storage medium WO2023083273A1 (en)

Applications Claiming Priority (8)

Application Number Priority Date Filing Date Title
CN202111329097.7A CN114022342A (en) 2021-11-10 2021-11-10 Acquisition method and device for acquisition point information, electronic equipment and storage medium
CN202111329083.5 2021-11-10
CN202111329085.4 2021-11-10
CN202111329083.5A CN114022341A (en) 2021-11-10 2021-11-10 Acquisition method and device for acquisition point information, electronic equipment and storage medium
CN202111338191.9A CN114092428A (en) 2021-11-10 2021-11-10 Image data processing method, image data processing device, electronic equipment and storage medium
CN202111338191.9 2021-11-10
CN202111329097.7 2021-11-10
CN202111329085.4A CN114037595A (en) 2021-11-10 2021-11-10 Image data processing method, image data processing device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
WO2023083273A1 true WO2023083273A1 (en) 2023-05-19

Family

ID=86335117

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/131217 WO2023083273A1 (en) 2021-11-10 2022-11-10 Grip point information acquisition method and apparatus, electronic device, and storage medium

Country Status (1)

Country Link
WO (1) WO2023083273A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116958265A (en) * 2023-09-19 2023-10-27 交通运输部天津水运工程科学研究所 Ship pose measurement method and system based on binocular vision

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110889460A (en) * 2019-12-06 2020-03-17 中山大学 Mechanical arm specified object grabbing method based on cooperative attention mechanism
WO2021069084A1 (en) * 2019-10-11 2021-04-15 Toyota Motor Europe Methods and systems for determining the 3d-locations, the local reference frames and the grasping patterns of grasping points of an object
CN112802105A (en) * 2021-02-05 2021-05-14 梅卡曼德(北京)机器人科技有限公司 Object grabbing method and device
CN112802107A (en) * 2021-02-05 2021-05-14 梅卡曼德(北京)机器人科技有限公司 Robot-based control method and device for clamp group
CN113001552A (en) * 2021-03-16 2021-06-22 中国科学院自动化研究所 Robot operation cooperative grabbing method, system and equipment for impurity targets
CN114022341A (en) * 2021-11-10 2022-02-08 梅卡曼德(北京)机器人科技有限公司 Acquisition method and device for acquisition point information, electronic equipment and storage medium
CN114022342A (en) * 2021-11-10 2022-02-08 梅卡曼德(北京)机器人科技有限公司 Acquisition method and device for acquisition point information, electronic equipment and storage medium
CN114037595A (en) * 2021-11-10 2022-02-11 梅卡曼德(北京)机器人科技有限公司 Image data processing method, image data processing device, electronic equipment and storage medium
CN114092428A (en) * 2021-11-10 2022-02-25 梅卡曼德(北京)机器人科技有限公司 Image data processing method, image data processing device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
US11967113B2 (en) Method and system for performing automatic camera calibration for a scanning system
JP4309439B2 (en) Object take-out device
CN110580725A (en) Box sorting method and system based on RGB-D camera
US10380767B2 (en) System and method for automatic selection of 3D alignment algorithms in a vision system
CN111721259A (en) Underwater robot recovery positioning method based on binocular vision
US11966996B2 (en) Composite three-dimensional blob tool and method for operating the same
CN115609591B (en) Visual positioning method and system based on 2D Marker and compound robot
CN111784655B (en) Underwater robot recycling and positioning method
CN114092428A (en) Image data processing method, image data processing device, electronic equipment and storage medium
WO2023083273A1 (en) Grip point information acquisition method and apparatus, electronic device, and storage medium
CN116213306A (en) Automatic visual identification method and sorting system
WO2023082417A1 (en) Grabbing point information obtaining method and apparatus, electronic device, and storage medium
CN109863365B (en) Method, electronic device and system for picking up objects from container
JP2021033712A (en) Image processing device, imaging apparatus, robot and robot system
CN114022342A (en) Acquisition method and device for acquisition point information, electronic equipment and storage medium
JPS63311485A (en) Automatic calibration device
CN114037595A (en) Image data processing method, image data processing device, electronic equipment and storage medium
US20230297068A1 (en) Information processing device and information processing method
CN116175542A (en) Grabbing control method, grabbing control device, electronic equipment and storage medium
CN113793349A (en) Target detection method and device, computer readable storage medium and electronic equipment
CN113759346A (en) Laser radar calibration method and device, electronic equipment and storage medium
JPH10213420A (en) Formation of reference image used in pattern matching
US20230368414A1 (en) Pick and place systems and methods
EP4332900A1 (en) Automatic bin detection for robotic applications
US11562527B2 (en) Labeling method, labeling device using the same, pick-and-place system using the same, pick-and-place method using the same and non-transitory computer readable medium using the same

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22892066

Country of ref document: EP

Kind code of ref document: A1