CN116175542A - Grabbing control method, grabbing control device, electronic equipment and storage medium - Google Patents

Grabbing control method, grabbing control device, electronic equipment and storage medium

Info

Publication number
CN116175542A
Authority
CN
China
Prior art keywords
mask
grabbed
grabbing
value
article
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111429144.5A
Other languages
Chinese (zh)
Other versions
CN116175542B (en)
Inventor
崔致豪
丁有爽
邵天兰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Mech Mind Robotics Technologies Co Ltd
Original Assignee
Mech Mind Robotics Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mech Mind Robotics Technologies Co Ltd filed Critical Mech Mind Robotics Technologies Co Ltd
Priority to CN202111429144.5A
Publication of CN116175542A
Application granted
Publication of CN116175542B
Legal status: Active
Anticipated expiration

Classifications

    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B25: HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J: MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00: Programme-controlled manipulators
    • B25J9/16: Programme controls
    • B25J9/1602: Programme controls characterised by the control system, structure, architecture
    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B25: HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J: MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J19/00: Accessories fitted to manipulators, e.g. for monitoring, for viewing; Safety devices combined with or specially adapted for use in connection with manipulators
    • B25J19/02: Sensing devices
    • B25J19/021: Optical sensing devices
    • B25J19/023: Optical sensing devices including video camera means
    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B25: HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J: MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00: Programme-controlled manipulators
    • B25J9/16: Programme controls
    • B25J9/1679: Programme controls characterised by the tasks executed
    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B25: HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J: MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00: Programme-controlled manipulators
    • B25J9/16: Programme controls
    • B25J9/1694: Programme controls characterised by use of sensors other than normal servo-feedback from position, speed or acceleration sensors, perception control, multi-sensor controlled systems, sensor fusion
    • B25J9/1697: Vision controlled systems
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00: Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/02: Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Robotics (AREA)
  • Mechanical Engineering (AREA)
  • Automation & Control Theory (AREA)
  • Multimedia (AREA)
  • Manipulator (AREA)

Abstract

The application discloses a grabbing control method, a grabbing control device, electronic equipment and a storage medium. The grabbing control method comprises the following steps: acquiring a mask of the grippable region of at least one article to be grabbed; for each article to be grabbed, acquiring a feature value of at least one feature of the mask of its grippable region; normalizing each of the acquired feature values to obtain at least one normalized feature value; and calculating a grabbing priority value for each article to be grabbed based on its at least one normalized feature value and a preset weight value, so that when the articles are grabbed, the grabbing order can be controlled according to the grabbing priority values. Compared with the traditional method, this grabbing and sorting method improves the sorting accuracy, and because it does not process features of the whole article, it does not obviously reduce the operation speed even when multiple factors are considered.

Description

Grabbing control method, grabbing control device, electronic equipment and storage medium
Technical Field
The present application relates to the field of automatic control of a robot arm or a gripper, program control B25J, and more particularly, to a gripping control method, apparatus, electronic device, and storage medium.
Background
Robots have the basic characteristics of perception, decision-making and execution. They can assist or even replace human beings in finishing dangerous, heavy and complex work, improve working efficiency and quality, serve human life, and enlarge or extend the range of human activity and capability. With the development of industrial automation and computer technology, robots are entering the stages of mass production and practical application. Industrial robots have found widespread use in industrial settings, where they can perform repetitive or dangerous work in place of humans. Traditional industrial robot design focuses on the design and manufacture of robot hardware, which is not "intelligent" by itself. When such a robot is used in an industrial field, technicians need to plan in advance the hardware equipment, the production line, the material positions, the task paths of the robot, and so on for the whole industrial field. For example, if articles are to be sorted and carried, field workers need to sort the different types of articles and place them neatly into material frames of uniform specification; before the robot can operate, the production line, the material frames, the carrying positions and the like need to be determined, and a fixed motion path, a fixed grabbing position, a fixed rotation angle and a fixed clamp are set for the robot according to the determined information.
As an improvement on conventional robot technology, intelligent program-controlled robots based on robot vision have been developed. However, the current "intelligence" is still fairly simple: the main implementation is to acquire image data related to a task through a vision acquisition device such as a camera, obtain 3D point cloud information from the image data, and then plan the operation of the robot based on the point cloud information, including movement speed and movement track, so as to control the robot to execute the task. Existing robot control schemes do not work well when they encounter complex tasks. For example, in supermarket and logistics scenes, a large number of stacked articles must be processed; in such scattered and unordered scenes the mechanical arm must, with the help of vision equipment, locate and identify the positions of the articles in sequence, pick up the articles using suction cups, clamps or other bionic instruments, and place the picked articles at the corresponding positions according to a certain rule through mechanical arm movement, track planning and similar operations. When a robot is used to perform grabbing in such an industrial scene, several difficulties arise: the number of objects to be grabbed is too large and the lighting is uneven, so the point cloud quality of some objects is poor and the grabbing effect is affected; the articles are of many kinds, are not placed in order and face in all directions, so the grabbing points differ from article to article and the grabbing position of the clamp is difficult to determine; and with stacked articles, grabbing one article easily sends other articles flying. In such industrial scenes there are many factors influencing how difficult an object is to grab, and the effect of the traditional grabbing and sorting methods is not good enough. In addition, when the grabbing algorithm is made more complex, more barriers are created for site workers: when a problem occurs, they have difficulty finding out why it occurred and how to adjust parameters to solve it, and the robot provider often has to send an expert to assist.
Disclosure of Invention
The present invention has been made in view of the above problems, and aims to overcome or at least partially solve them. Firstly, the method controls the clamp to grab the articles to be grabbed based on the masks of their grippable regions, combined with steps such as press-stacking detection, pose prediction and grabbing ordering, so that in a dense scene where many articles to be grabbed are piled together, the grabbing mode of each article can be accurately identified and all the articles can be grabbed in an orderly fashion according to a specific sequence; compared with existing grabbing schemes, this effectively avoids situations such as sending other articles flying when grabbing in a dense scene, and improves the grabbing accuracy. Secondly, the grabbing and sorting method disclosed by the invention performs a comprehensive ordering according to the features of the mask of the grippable region of each article to be grabbed, which improves the sorting accuracy compared with traditional methods; because features of the whole article are not processed, the operation speed is not obviously reduced even when many factors are considered. Thirdly, the invention provides a method for calculating the degree to which an article to be grabbed is overlapped, based on the graphic features of its grippable region. Compared with traditional calculation methods, this method gives a specific overlap value rather than judging the overlap condition of the object; it does not give a conclusion as to whether the object is overlapped, and cannot accurately and qualitatively reflect whether the overlap will affect grabbing of the article, but it has a higher processing speed, quantitatively outputs the overlap value of each article to be grabbed, can be used for other purposes, and is particularly suitable for scenes with requirements on operation speed or scenes in which a comprehensive ordering is made according to several features. Finally, the invention also provides a method for visually displaying the parameters and image data related to the grabbing control method to the user, so that the user can intuitively see the various parameters used in the robot's grabbing process without knowing the robot's operating principle, understand why the robot executes tasks in a certain way, and thus determine how to adjust the robot's parameters so that it operates according to the user's needs.
All of the solutions disclosed in the claims and the description of the present application have one or more of the innovations described above, and accordingly, one or more of the technical problems described above can be solved. Specifically, the application provides a grabbing control method, a grabbing control device, electronic equipment and a storage medium.
The grabbing control method of the embodiment of the application comprises the following steps:
acquiring a mask of a grabbing area of at least one object to be grabbed;
for each object to be grabbed in at least one object to be grabbed, acquiring a characteristic value of at least one characteristic of a mask in a grabbed area of the object to be grabbed;
performing normalization processing on each of the acquired feature values of the at least one feature to obtain at least one normalized feature value;
and calculating the grabbing priority value of each article to be grabbed based on at least one normalized characteristic value and a preset weight value of each article to be grabbed, so that when at least one article to be grabbed is grabbed, the grabbing sequence can be controlled according to the grabbing priority value.
In some embodiments, the mask of the grippable region is characterized by: mask height, clamp size, number of point clouds in the mask, mask diagonal degree, mask stacking degree, mask size and/or pose direction.
In some embodiments, mask height feature values of the mask of the grippable region are calculated based on depth values of the grippable region.
In some embodiments, the clamp size is determined based on a mapping relationship between a preset clamp and the clamp size.
In some embodiments, the mask diagonal is determined based on the angle between the diagonal of the circumscribed rectangle of the mask and one side of the circumscribed rectangle.
In some embodiments, the priority value is calculated according to the following formula:
P = Σ_{i=1}^{n} ω_i · X_i
wherein P is the priority value of the article to be grabbed, n is the number of features, ω_i is the weight of the i-th feature, and X_i is the feature value of the i-th feature.
The grasping control device according to an embodiment of the present application includes:
the mask acquisition module is used for acquiring a mask of a grabbed area of at least one article to be grabbed;
the characteristic value acquisition module is used for acquiring the characteristic value of at least one characteristic of the mask in the grabbing area of each article to be grabbed in the at least one article to be grabbed;
the feature value normalization module is used for performing normalization processing on each of the acquired feature values of the at least one feature to obtain at least one normalized feature value;
the priority value calculating module is used for calculating the grabbing priority value of each article to be grabbed based on at least one normalized characteristic value and a preset weight value of each article to be grabbed, so that when at least one article to be grabbed is grabbed, the grabbing sequence can be controlled according to the grabbing priority value.
In some embodiments, the mask of the grippable region is characterized by: mask height, clamp size, number of point clouds in the mask, mask diagonal degree, mask stacking degree, mask size and/or pose direction.
In some embodiments, mask height feature values of the mask of the grippable region are calculated based on depth values of the grippable region.
In some embodiments, the clamp size is determined based on a mapping relationship between a preset clamp and the clamp size.
In some embodiments, the mask diagonal is determined based on the angle between the diagonal of the circumscribed rectangle of the mask and one side of the circumscribed rectangle.
In some embodiments, the priority value calculation module calculates the priority value according to the following formula:
P = Σ_{i=1}^{n} ω_i · X_i
wherein P is the priority value of the article to be grabbed, n is the number of features, ω_i is the weight of the i-th feature, and X_i is the feature value of the i-th feature.
The electronic device of the embodiment of the application comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor realizes the grabbing control method of any embodiment when executing the computer program.
The computer-readable storage medium of the embodiments of the present application has stored thereon a computer program which, when executed by a processor, implements the grab control method of any of the embodiments described above.
Additional aspects and advantages of the application will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the application.
Drawings
The foregoing and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings, in which:
FIG. 1 is a flow chart of a grip control method of certain embodiments of the present application;
FIG. 2 is a schematic illustration of frame parameters according to certain embodiments of the present application;
FIG. 3 is a schematic illustration of mask pretreatment according to certain embodiments of the present application;
FIGS. 4a and 4b are schematic illustrations of article segments according to article size and article length in accordance with certain embodiments of the present application;
FIG. 5 is a flow chart of a method of determining a grasping order according to certain embodiments of the present application;
FIG. 6 is a schematic view of the diagonal of an article according to certain embodiments of the present application;
FIG. 7 is a flow chart of a method of calculating the degree of article overlap according to certain embodiments of the present application;
FIG. 8 is a schematic illustration of an article mask overlay condition according to certain embodiments of the present application;
FIG. 9 is a schematic illustration of the impact of the pose orientation of an object to be grabbed on grabbing according to some embodiments of the present application;
FIG. 10 is a flow diagram of a method of visualizing parameters for a grip in accordance with certain embodiments of the present application;
FIGS. 11a and 11b are schematic illustrations of a visualization menu and of the visual image presented to the user after height and suction cup size are selected, in accordance with certain embodiments of the present application;
FIG. 12 is a schematic structural view of a grip control device according to certain embodiments of the present application;
FIG. 13 is a schematic view of a grasping sequence determining device according to certain embodiments of the present application;
FIG. 14 is a schematic view of the structure of an article fold level computing device according to certain embodiments of the present application;
FIG. 15 is a schematic structural view of a grasping parameter visualization device according to certain embodiments of the present application;
fig. 16 is a schematic structural view of an electronic device according to some embodiments of the present application.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
In the description of the specific embodiments, it should be understood that the terms "center," "longitudinal," "transverse," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," "outer," and the like indicate orientations or positional relationships based on the orientation or positional relationships shown in the drawings, merely to facilitate describing the invention and simplify the description, and do not indicate or imply that the devices or elements referred to must have a specific orientation, be configured and operated in a specific orientation, and therefore should not be construed as limiting the invention.
Furthermore, the terms "first," "second," "third," and the like are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first", "a second", etc. may explicitly or implicitly include one or more such feature. In the description of the present invention, unless otherwise indicated, the meaning of "a plurality" is two or more.
The invention can be used in industrial robot control scenes based on visual identification. A typical vision-based industrial robot control scene includes devices for capturing images, control devices such as production line hardware and the PLC of the production line, robot components for performing tasks, and the operating system or software controlling these devices. The devices for capturing images may include 2D or 3D smart/non-smart industrial cameras, which, depending on function and application scene, may include area-scan cameras, line-scan cameras, black-and-white cameras, color cameras, CCD cameras, CMOS cameras, analog cameras, digital cameras, visible-light cameras, infrared cameras, ultraviolet cameras, and the like; the production line can be a packaging line, a sorting line, a logistics line, a processing line, or any other line that needs robots; the robot parts used for performing tasks in the industrial scene may be biomimetic robots, such as humanoid or dog-like robots, or conventional industrial robots such as mechanical arms; the industrial robot may be an operation type robot, a program-controlled robot, a teaching-reproduction robot, a numerically controlled robot, a sensory-controlled robot, an adaptive-control robot, a learning-control robot, an intelligent robot, or the like; according to its working principle, the mechanical arm can be a ball-and-socket mechanical arm, a multi-joint mechanical arm, a rectangular-coordinate mechanical arm, a cylindrical-coordinate mechanical arm, a polar-coordinate mechanical arm and the like, and according to its function it can be a grabbing arm, a palletizing arm, a welding arm or a general industrial arm; the end of the mechanical arm can be provided with an end effector, which, depending on the task, can be a robot clamp, a robot gripper, a robot tool quick-change device, a robot collision sensor, a robot rotary connector, a robot pressure tool, a compliance device, a robot spray gun, a robot burr-cleaning tool, a robot arc-welding gun, a robot electric-welding gun and the like; the robot clamp can be any of various universal clamps, where a universal clamp is a clamp with a standardized structure and a wide application range, such as the three-jaw and four-jaw chucks used on lathes or the flat tongs and index heads used on milling machines. As another example, according to the clamping power source used, the clamp may be a manual clamp, a pneumatic clamp, a hydraulic clamp, a gas-liquid linkage clamp, an electromagnetic clamp, a vacuum clamp, or another bionic device capable of picking up an article. The devices for collecting images, the control devices such as production line hardware and the production line PLC, the robot parts for executing tasks, and the operating system or software controlling these devices can communicate based on TCP, HTTP and gRPC (Google Remote Procedure Call) protocols, so as to transmit various control instructions or commands.
The operating system or software may be disposed in any electronic device, typically such electronic devices include industrial computers, personal computers, notebook computers, tablet computers, cell phones, etc., which may communicate with other devices or systems by wired or wireless means. Further, the gripping appearing in the present invention refers to any gripping action capable of controlling an article to change the position of the article in a broad sense, and is not limited to gripping the article in a narrow sense in a "gripping" manner, in other words, gripping the article in a manner such as suction, lifting, tightening, or the like, and also falls within the scope of the gripping of the present invention. The articles to be grasped in the present invention may be cartons, plastic soft packs (including but not limited to snack packages, milk tetra pillow packages, milk plastic packages, etc.), cosmeceutical bottles, cosmeceuticals, and/or irregular toys, etc., which may be placed in a floor, tray, conveyor belt, and/or material basket.
Fig. 1 shows a schematic flow chart of a grip control method according to an embodiment of the invention, as shown in fig. 1, comprising the steps of:
step S100, obtaining image data comprising one or more objects to be grabbed;
step S110, processing the image data to generate one or more masks of the grippable areas of the object to be grippable, and preprocessing the masks;
step S120, detecting whether one or more objects to be grabbed have a press-fit condition or not;
step S130, estimating the position and the posture of one or more objects to be grabbed;
step S140, configuring a clamp for the object to be grabbed according to the attribute of the object to be grabbed;
step S150, determining the order of gripping the one or more objects to be gripped by using the gripper based on the gripping characteristics of the one or more objects to be gripped.
First, it should be understood that it is the above steps themselves, rather than any specific order of the steps, that enable the present invention to grip any article in any industrial scene with a good gripping effect. In other words, although the steps are numbered in this embodiment so that the method appears to be performed in the numbered order, there is in practice no strict order between them, and the same effect can be achieved in an actual industrial scene without performing them in the specific order described above. Therefore, the present invention does not strictly limit the execution order of the steps; any scheme that includes these steps falls within the scope of the present invention.
For step S100, the present invention may be applied to an industrial scene including one or more objects to be gripped, in which all the objects are gripped one by one using a jig and the gripped objects are discharged to a specific position. The type of image data and the acquisition method are not limited in this embodiment. As an example, the acquired image data may include a point cloud or an RGB color map. The point cloud information may be acquired through a 3D industrial camera, which is generally equipped with two lenses that capture the group of objects to be grabbed from different angles; after processing, a three-dimensional image of the objects can be obtained. The group of objects to be grabbed is placed below the vision sensor and photographed by the two lenses at the same time; using a general binocular stereoscopic vision algorithm and the relative attitude parameters of the two images, the X, Y and Z coordinate values and coordinate directions of each point on the objects to be grabbed are calculated and converted into point cloud data of the object group. In specific implementations, the point cloud can also be generated using elements such as a laser detector, a visible-light detector such as an LED, an infrared detector or a radar detector, and the invention is not limited in this respect.
The point cloud data acquired in this way is three-dimensional. To filter out the dimension that has little influence on grabbing, reduce the amount of data to be processed, and thereby increase processing speed and efficiency, the acquired three-dimensional point cloud data of the object group to be grabbed can be orthographically projected onto a two-dimensional plane.
As an example, a depth map corresponding to the orthographic projection may also be generated. A two-dimensional color map corresponding to the three-dimensional object region and a depth map corresponding to the two-dimensional color map may be acquired in a direction perpendicular to the depth of the object. Wherein the two-dimensional color map corresponds to an image of a planar area perpendicular to a preset depth direction; each pixel point in the depth map corresponding to the two-dimensional color map corresponds to each pixel point in the two-dimensional color map one by one, and the value of each pixel point is the depth value of the pixel point.
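As an illustration only, the projection described above can be sketched as follows; this is a minimal sketch assuming the point cloud is an N x 3 NumPy array already expressed in a camera-aligned coordinate system whose Z axis is the depth direction (larger Z closer to the camera), and the cell resolution is an arbitrary illustrative parameter rather than a value taken from the patent.

```python
import numpy as np

def project_to_depth_map(points, resolution=1.0):
    """Orthographically project an N x 3 point cloud onto the XY plane,
    keeping for each grid cell the Z value of the point closest to the camera."""
    xy = np.floor(points[:, :2] / resolution).astype(int)
    xy -= xy.min(axis=0)                      # shift cell indices so they start at 0
    h, w = xy.max(axis=0) + 1
    depth = np.full((h, w), np.nan)
    for (ix, iy), z in zip(xy, points[:, 2]):
        if np.isnan(depth[ix, iy]) or z > depth[ix, iy]:
            depth[ix, iy] = z                 # keep the highest surface in each cell
    return depth
```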
Articles to be grasped are often piled in boxes for transportation to the site, and such boxes for piled articles are often called material frames, and when grasping is performed, a mechanical arm or a clamp may touch the material frames during movement, so that the material frames and the placement positions of the articles in the material frames have important influence on grasping. As a preferred embodiment, parameters of the frame may be obtained. As shown in fig. 2, the frame data may be processed to extract or generate auxiliary parameters that have an effect on grabbing, such parameters including: the height of the material frame, the width of the material frame, the length of the material frame, and the grid obtained by dividing the width and the length of the material frame. It should be understood that the height, width and length are all determined values, and the dividing mode and number of the grids are determined by the skilled person according to the actual conditions of the used fixture, the grabbing mode, the characteristics of the objects to be grabbed and the like, and the grids can be used for conveniently calibrating the positions of the objects to be grabbed. The frame data may be preset or acquired by a camera.
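Purely as an illustration of how the frame parameters and the grid could be represented, the sketch below uses hypothetical names and an assumed 4 x 3 division; the patent does not prescribe particular values.

```python
from dataclasses import dataclass

@dataclass
class MaterialFrame:
    length: float   # mm
    width: float    # mm
    height: float   # mm

    def grid_cells(self, rows, cols):
        """Split the frame footprint into rows x cols cells, which can be used
        to calibrate the positions of articles inside the frame."""
        cell_l, cell_w = self.length / rows, self.width / cols
        return [(r * cell_l, c * cell_w, cell_l, cell_w)
                for r in range(rows) for c in range(cols)]

frame = MaterialFrame(length=600, width=400, height=300)
cells = frame.grid_cells(4, 3)
```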
For step S110, the grippable area of the article refers to a portion of the surface of the article that can be gripped by the fixture, in an industrial scene, the articles to be gripped may be placed in a orderly and orderly manner, and at this time, the grippable area of each article is substantially the same, and the manner of determining the grippable area is also relatively simple; it is also possible to pile together in a chaotic and unordered manner, where the grippable area of each item is random and it is necessary to determine the grippable area in a complex manner. The present embodiment is not limited to a specific use scenario and a specific method of determining the graspable region, as long as the graspable region can be acquired.
One possible way of determining the grippable region and generating the mask is as follows. First, after acquiring image data including one or more objects to be grabbed, the image data is processed so that each pixel in the image is identified; for example, for a 256 × 256 image, 256 × 256 = 65,536 pixels should be identified. All the pixels in the image are then classified based on the features of each pixel, where the features mainly refer to the RGB values of the pixel; in practical applications the RGB color image can be converted to a grayscale image so that classification can be performed on the grayscale values. For the classification, the target classes can be determined in advance. For example, if the RGB image obtained by photographing contains a large pile of beverage cans, food boxes and a material frame, and the purpose is to generate masks for the beverage cans, food boxes and frame, the predetermined classes may be beverage can, food box and material frame. Each class can be given a label, which can be a number (for example beverage can = 1, food box = 2, material frame = 3) or a color (for example beverage cans red, food boxes blue, material frame green), so that after classification the beverage cans are marked with 1 or red, the food boxes with 2 or blue and the material frame with 3 or green in the resulting image. In this embodiment the mask of the grippable region of the object is to be generated, so only the grippable region needs a class, for example blue; the blue region in the processed image is then the mask of the grippable region of the object to be grabbed. Next, a channel of image output is created for each class; the role of the channel is to extract, as output, all features related to that class from the input image. For example, after a channel is created for the grippable-region class, the acquired RGB color image is fed into the channel, and an image from which the features of the grippable region have been extracted can be obtained at its output. Finally, the feature image of the grippable region obtained in this way is combined with the original RGB image to generate composite image data in which the grippable-region mask is identified.
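The per-pixel classification described above is, in effect, a segmentation of the image; the sketch below only illustrates the last step, turning a per-pixel class map into a binary mask of the grippable region and overlaying it on the RGB image. The class map itself would come from whatever classifier is used, which is outside this sketch, and the label value is a hypothetical example.

```python
import numpy as np

GRIPPABLE_CLASS = 2   # hypothetical label id assigned to the "grippable region" class

def grippable_mask(class_map):
    """class_map: H x W array of per-pixel class labels -> boolean mask."""
    return class_map == GRIPPABLE_CLASS

def overlay_mask(rgb, mask, color=(0, 0, 255)):
    """Mark the grippable region on a copy of the RGB image (here in blue)."""
    out = rgb.copy()
    out[mask] = color
    return out
```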
Masks generated in this way are sometimes unsuitable: some masks have a size or shape that makes subsequent processing inconvenient, and in some regions a mask may be generated even though the clamp cannot actually perform a grab at that location. An unsuitable mask can have a significant impact on subsequent processing, so the resulting masks are preprocessed for the further steps. As shown in fig. 3, the preprocessing of the mask may include the following.
1. Dilating the mask to fill defects such as missing or irregular parts of the mask image. For example, for each pixel on the mask, a certain number of surrounding points, e.g. 8-25 points, may be set to the same color as that pixel. This amounts to filling in around each pixel, so if there is a defect in the object mask the missing part is filled in completely; afterwards the mask is complete and without defects, and because of the dilation it becomes slightly "fat", which helps the subsequent image processing operations.
2. Judging whether the area of the mask meets a predetermined condition, and removing the mask if it does not. First, smaller mask regions are likely to be erroneous: because of the continuity of the image data, a grippable region will typically include a large number of pixels with similar features, and a mask region formed by a few scattered pixels may not be a real grippable region. Second, the robot end effector, i.e. the clamp, needs a foothold of a certain area when executing the grabbing task; if the grippable region is too small, the clamp cannot land in it at all and the object cannot be grabbed, so a mask that is too small is meaningless. The predetermined condition may be set according to the size of the clamp and the size of the noise, and its value may be a definite size, a number of pixels, or a ratio. For example, the condition may be set to 0.1%, i.e. when the ratio of the mask area to the whole image area is less than 0.1% the mask is considered unusable and is removed from the image.
3. Judging whether the number of point clouds in the mask is less than a preset minimum number. The number of points reflects the quality of the camera acquisition: if the number of points in a grippable region is too small, that region was not captured accurately enough. The point cloud may be used to control the clamp when performing the grab, and too few points can affect that control process. Therefore, a minimum number of points that a mask region should contain can be set, for example: when fewer than 10 points are covered by a grippable region, the mask is removed from the image data, or points are randomly added to the region until the number reaches 10.
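A sketch of the three preprocessing rules above, assuming OpenCV-style binary masks (uint8 arrays) and that the number of point-cloud points per mask is supplied by the caller; the 0.1% ratio and the minimum of 10 points are the example thresholds quoted above, not mandated values.

```python
import cv2
import numpy as np

def preprocess_mask(mask, n_points, min_area_ratio=0.001, min_points=10):
    """Return the dilated mask, or None if the mask should be discarded."""
    # 1. Dilate to fill small defects; the mask becomes slightly "fatter".
    mask = cv2.dilate(mask, np.ones((3, 3), np.uint8), iterations=1)
    # 2. Discard masks whose area is too small for the clamp to land on
    #    (here: less than 0.1% of the whole image).
    if np.count_nonzero(mask) / mask.size < min_area_ratio:
        return None
    # 3. Discard masks whose region contains too few point-cloud points.
    if n_points < min_points:
        return None
    return mask
```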
For step S120, the present invention may be used to perform gripping in a scene where a large number of objects to be gripped are stacked. In this case the object to which a certain grippable region belongs may be pressed under other objects, which can cause abnormal gripping, including carrying other objects away (gripping a lower object and sending the object stacked on it flying), double picking (gripping several objects at once), and the like; these often result in damaged objects or objects flying out of the material frame. To avoid such abnormal gripping, it is necessary to detect whether one or more articles to be gripped are pressed by others. The invention is not limited to a specific detection method. The maximum number of objects reported by the stacking detection can be preset according to the needs of the actual use scene; for example, it can be set to 10, and when more than 10 stacked objects are detected only the first 10 are selected as output. In this way the operation speed can be effectively improved.
For step S130, the pose of an object to be grabbed refers to its position and its posture; the posture may be flat, upright or tilted placement, together with the placement angle and rotation angle. The pose of an object directly influences how difficult it is to grab. In one embodiment, when the posture of the object is determined, the positional and orientation relationships between the object and the material frame are also determined, and the grabbing strategies for the objects are ordered accordingly. According to the requirements of the actual use scene, the maximum number of output poses can be limited; for example, the number of grabbing poses output to the robot can be limited to 5, in which case only the poses of 5 objects to be grabbed are output.
For step S140, the attribute of the object may also be an image attribute of the mask of the object to be grabbed, where the image attribute of the mask of the object to be grabbed refers to a feature on the visualized image that the mask of the object to be grabbed has when displayed in a graph. The properties of the object may include the size of the object to be grasped (the size of the object mask may also be used), the height (the length of the object mask may also be used), the shape, etc. In an alternative embodiment, all items to be grabbed may be segmented based on the attributes of the items. Assuming that the attribute is the size of the article, the segmentation limit is 20mm and 40mm, as shown in fig. 4a, after segmentation, the article with the size between 0 and 20mm can be classified as a small article, the article with the size between 21 and 40mm can be classified as a medium article, and the article with the size above 40mm can be classified as a large article. After the segmentation, the clamp can be configured according to the segmentation condition of the article, for example, when the article is segmented into small articles, the clamp is configured for the small articles; when the article is segmented into medium-sized articles, a common sucker clamp is configured for the articles; when the article is segmented into large articles, a powerful sucker clamp is configured for the large articles; as shown in fig. 4b, all the objects to be grasped can be segmented according to the height of the objects, the segmentation limit is 80mm, the objects smaller than 80mm are low objects, the objects larger than 80mm are high objects, and different clamps are configured according to classification. It should be understood that in the present invention, the different jigs do not include only different kinds of jigs, and if two jigs are the same kind but different in size, such two jigs also belong to different jigs, for example, when the object to be grasped is classified into a small object, a medium object and a large object in sections, suction cup jigs may be configured for each of them, wherein the small object is configured with a small suction cup jig, the medium object is configured with a medium suction cup jig, and the large object is configured with a large suction cup jig.
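Purely as an illustration of the segmentation idea, using the 20 mm / 40 mm and 80 mm boundaries quoted above and a hypothetical clamp mapping (a real system would read the mapping from its configuration):

```python
def clamp_for_article(size_mm, height_mm):
    """Pick a clamp class from the article's segmented size and height."""
    if size_mm <= 20:
        size_class = "small"
    elif size_mm <= 40:
        size_class = "medium"
    else:
        size_class = "large"
    height_class = "low" if height_mm < 80 else "high"
    # Hypothetical clamp names; the actual configuration depends on the site.
    clamp_table = {
        "small": "small suction cup",
        "medium": "medium suction cup",
        "large": "large suction cup",
    }
    return clamp_table[size_class], height_class
```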
For step S150, in a grabbing scene where a large number of objects to be grabbed are stacked, grabbing them in a random order easily leads to failed grabs, damaged objects, objects being sent flying, and so on, so the grabbing should be performed in a certain order. Existing grabbing and sorting methods usually sort based on the features of the objects' point clouds; when the point cloud data are poor this approach cannot be carried out, yet conditions that degrade the point cloud are quite common in factory scenes, for example poor illumination, or objects such as glass, so the applicability is not good enough. In addition, existing sorting schemes generally consider only one or two features of the objects to be grabbed and sort with simple sorting logic; because the factors considered are not comprehensive enough, the sorting results are often not accurate enough, and when on-site staff find the sorting inaccurate there is no way to make the grabbing order match their expectations by adjusting parameters, so grabbing based on that order gives poor results. To solve these problems, the invention provides a method for comprehensively determining the grabbing order of all objects to be grabbed based on several features of their grippable regions; it improves the sorting accuracy and the freedom to adjust the grabbing order, does not obviously reduce the operation speed, and has strong applicability, which is one of the key points of the invention.
Fig. 5 shows a flow diagram of a method of processing image data to determine a grabbing order according to an embodiment of the invention. As shown in fig. 5, the method includes:
step S200, obtaining a mask of a grabbing area of at least one object to be grabbed;
step S210, for each object to be grabbed in at least one object to be grabbed, acquiring at least one characteristic value of a mask of a grabbed area of the object to be grabbed;
step S220, for each of the at least one obtained characteristic value, performing normalization processing to obtain at least one normalized characteristic value;
step S230, calculating a grabbing priority value of each article to be grabbed based on at least one normalized characteristic value and a preset weight value of each article to be grabbed, so that when at least one article to be grabbed is grabbed, the grabbing sequence can be controlled according to the grabbing priority value.
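Putting steps S200 to S230 together, a minimal sketch of the ordering could look like the following; min-max normalization is assumed here as one common choice, since the patent does not prescribe a particular normalization, and all names are illustrative.

```python
def order_by_priority(feature_table, weights):
    """feature_table: one list of raw feature values per article to be grabbed.
    Returns article indices sorted from highest to lowest grabbing priority."""
    cols = list(zip(*feature_table))
    # Min-max normalize each feature column to [0, 1].
    norm_cols = []
    for col in cols:
        lo, hi = min(col), max(col)
        norm_cols.append([(v - lo) / (hi - lo) if hi > lo else 0.0 for v in col])
    rows = list(zip(*norm_cols))
    # Weighted sum P = sum_i(w_i * x_i) using the preset weights.
    priorities = [sum(w * x for w, x in zip(weights, row)) for row in rows]
    return sorted(range(len(rows)), key=lambda i: priorities[i], reverse=True)
```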
For step S200, the method for acquiring the mask in the grippable region in step S110 may be used to acquire the mask, which is not described herein.
After the mask of the grippable region is acquired, the features of the mask related to grabbing need to be acquired in step S210. In the course of their research, the inventors found that the following features of the mask are most likely to affect grabbing: mask height, clamp size, number of point clouds in the mask, mask diagonal degree, mask stacking degree, mask size and pose direction. The grabbing order can be determined by combining one or more of these features, as required by the actual application scene. Among them, the four features of mask height, mask size, mask stacking degree and pose direction have the greatest influence on gripping. As a preferred embodiment, all of the above features may be considered together to determine the grabbing order. The meaning of each feature, its effect on grabbing and the way it is acquired are described below:
Mask height
The mask height refers to the height of the mask of the grippable region of an object to be grabbed, and can also be taken as its Z coordinate value. The mask height reflects the height of the object's grabbing surface. Since many objects to be grabbed are stacked together, the objects on the upper layer should be grabbed first: this prevents the upper-layer objects from being scattered when a lower-layer object pressed under them is grabbed, and also prevents upper-layer objects from being knocked down and interfering with the grabbing of lower-layer objects, so an upper-layer object is clearly better to grab than a lower-layer one. The mask height can be obtained from a depth map or from the point cloud at the mask position. In one embodiment, a point cloud including one or more objects to be grabbed is obtained first; the point cloud is a set of points in a preset coordinate system, and for convenience in calculating the height value the camera can shoot from directly above the objects to be grabbed. The points included in the mask region are then obtained based on the mask region, and the pose key point of the grippable region represented by the mask and the depth value of that key point are calculated; the three-dimensional pose information of the object is used to describe the pose of the object to be gripped in the three-dimensional world. The pose key point is a point that reflects the three-dimensional position features of the grippable region. The calculation can be performed as follows:
Firstly, three-dimensional position coordinates of each data point of a mask area are obtained, and position information of pose key points of the grippable area corresponding to the mask is determined according to a preset operation result corresponding to the three-dimensional position coordinates of each data point. For example, assuming that 100 data points are included in the point cloud of the mask region, three-dimensional position coordinates of the 100 data points are obtained, an average value of the three-dimensional position coordinates of the 100 data points is calculated, and a data point corresponding to the average value is used as a pose key point of the grippable region corresponding to the mask region. Of course, the above-mentioned preset operation method may be, in addition to the averaging, center of gravity calculation, maximum value calculation, minimum value calculation, or the like, which is not limited to the present invention. Then, the direction with the smallest variation and the direction with the largest variation among the 100 data points are found. The direction with the smallest variation is taken as a Z-axis direction (namely, a depth direction consistent with the shooting direction of a camera), the direction with the largest variation is taken as an X-axis direction, and a Y-axis direction is determined through a right-hand coordinate system, so that three-dimensional state information of position information of the pose key point is determined, and the direction characteristics of the pose key point in a three-dimensional space are reflected.
And finally, calculating pose key points of the object grippable areas corresponding to the mask areas and depth values of the pose key points. The depth value of the pose key point is a coordinate value of the object gripable region corresponding to a depth coordinate axis, wherein the depth coordinate axis is set according to a photographing direction of a camera, a gravity direction or a direction of a vertical line of a plane where the gripable region is located. Accordingly, the depth value is used to reflect the position of the grippable region at the depth coordinate axis. In specific implementation, the origin and direction of the depth coordinate axis can be flexibly set by a person skilled in the art, and the setting mode of the origin of the depth coordinate axis is not limited in the invention. For example, when the depth coordinate axis is set according to the photographing direction of the camera, the origin of the depth coordinate axis may be the position where the camera is located, and the direction of the depth coordinate axis is the direction from the camera to the object, so that the depth value of the mask in each graspable region corresponds to the opposite number of the distance from the graspable region to the camera, that is, the farther from the camera, the lower the depth value of the mask, and the depth value is taken as the mask height characteristic value.
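A sketch of the mask-height feature under the camera-at-origin convention described above, with the pose key point taken as the mean point (the averaging choice used as the example above); function names and the commented-out axis computation are illustrative assumptions.

```python
import numpy as np

def mask_height_feature(mask_points):
    """mask_points: N x 3 points inside the mask, in camera coordinates
    (depth axis pointing from the camera toward the objects).
    The depth value of the pose key point is the negative of its distance
    along the depth axis, so objects farther from the camera score lower."""
    key_point = mask_points.mean(axis=0)
    # Optionally, the principal directions could be read off the covariance of
    # the points: the direction of smallest variation plays the role of Z.
    # eigvals, eigvecs = np.linalg.eigh(np.cov(mask_points.T))
    return -key_point[2]
```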
Clamp size
The clamp size refers to the size of the clamp configured for a given article to be grasped. Since the grippable region of the article is on the surface of the object, and gripping the article essentially means controlling the clamp to perform the gripping operation within the grippable region, the clamp size can also be counted as a feature of the mask of the grippable region. The influence of the clamp size on grabbing mainly lies in whether the clamp may accidentally bump into articles other than the one it is gripping. For example, when objects are stacked, a large suction cup is more likely than a small one to collide with other objects during gripping, causing the suction cup to shake or the objects to shift, which may make the grab fail. In an actual industrial scene, the kind of clamps used by each system can be determined in advance, i.e. the clamp size is known before the actual gripping, so the clamp size in this embodiment can be obtained from the configured clamp and a mapping between clamps and their sizes that is established and stored in advance.
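Since the clamp for each article is configured before grabbing, the clamp-size feature reduces to a lookup in a pre-stored mapping; the sketch below uses made-up clamp names and sizes purely for illustration.

```python
# Hypothetical clamp-to-size mapping (e.g. suction cup diameter in mm),
# established and stored in advance for the clamps actually installed.
CLAMP_SIZES = {"small suction cup": 20, "medium suction cup": 40, "large suction cup": 60}

def clamp_size_feature(configured_clamp):
    return CLAMP_SIZES[configured_clamp]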
Number of point clouds in mask
The number of the point clouds in the mask refers to the number of the point clouds covered by the mask in the grabbing area of a certain object to be grabbed. The number of point clouds reflects the quality of the acquisition of the camera, and if the number of point clouds in a certain grippable area is too small, the point clouds may be due to light reflection or shielding, which indicates that the shooting of the area is inaccurate, and may affect the control process of the clamp. Therefore, the grabbing priority of the objects with more point clouds in the mask can be set to be higher, and grabbing is preferentially executed. The number of the point clouds can be obtained by calculating the number of the point clouds covered by the mask in the grippable region.
Mask diagonal degree
As shown in fig. 6, the diagonal degree of the mask refers to the inclination of the diagonal line of the mask. An object to be grabbed whose mask has a high diagonal degree is "fat" and relatively easy to grab, while one whose mask has a low diagonal degree is relatively thin and relatively difficult to grab. To calculate the diagonal degree of the mask, the minimum circumscribed rectangle of the mask may be computed first; the corner points of the circumscribed rectangle are taken as the corner points of the mask. The angle X° between the line joining two mutually diagonal corner points and a side of the circumscribed rectangle (for example, the side parallel to X in FIG. 6) reflects the diagonal degree, and as a preferred implementation the diagonal degree of the mask can be taken as |45° - X°|.
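A sketch using OpenCV's minimum-area rectangle as one way of obtaining the circumscribed rectangle; the computed angle plays the role of X° in the |45° - X°| expression above, and in a real implementation the angle convention would need to be checked against the definition in FIG. 6.

```python
import cv2
import numpy as np

def mask_diagonal_degree(mask):
    """mask: binary uint8 image of the grippable region."""
    ys, xs = np.nonzero(mask)
    pts = np.column_stack([xs, ys]).astype(np.float32)
    (_, _), (w, h), _ = cv2.minAreaRect(pts)     # minimum circumscribed rectangle
    # Angle between the rectangle's diagonal and its longer side.
    x_deg = np.degrees(np.arctan2(min(w, h), max(w, h)))
    return abs(45.0 - x_deg)
```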
Mask stacking degree
The mask stacking degree refers to the degree to which the mask of the grippable region of an article to be grasped is covered by other articles. Whereas typical stacking detection only determines whether an article is covered, the stacking degree in this embodiment is a calculated value, the "stacking degree value". This specific value can be used to order all the objects to be grabbed (an object with a low stacking degree value gets a high grabbing priority value), something that ordinary stacking detection, which does not quantify the stacking, cannot do. In addition, conventional stacking detection aims to determine whether a stacked object affects gripping, which requires identifying the type of the object to be gripped (for some objects stacking affects gripping, for others it does not get in the way), the gripping points, the specific stacking positions (if the stacked objects are at the edge, gripping is not affected even if a lot is stacked there; if they are in the middle, even a single pressed point may cause the stacked objects to be carried away), and so on. Although its accuracy is high, its operation speed is low, and it is not suitable for industrial scenes with relatively loose requirements on error rate but high requirements on speed. The inventors developed a method for determining the stacking degree of an object to be grabbed based only on the graphic features of the mask of its grippable region and of the objects stacked on it; the method outputs a definite stacking degree value that can be used elsewhere, determines the stacking degree purely from graphic features, is particularly suitable for industrial scenes with high requirements on operation speed, and is one of the key points of the invention.
Fig. 7 shows a flow diagram of a method of determining the mask stacking degree of the grippable region of an article to be grabbed, in accordance with one embodiment of the present invention. As shown in fig. 7, the method includes:
step S300, obtaining a mask of a grabbing area of at least one object to be grabbed;
step S310, for each article to be grasped, calculating the area S1 of the mask of the grippable region of the article;
step S320, for each article to be grasped, generating the circumscribed rectangle of the mask of the grippable region of the article and calculating the area S2 of the circumscribed rectangle;
step S330, for each article to be grasped, calculating the mask stacking degree C of the article according to the following formula:
C = 1 - S1/S2
for step S300, the method for acquiring the mask in the grippable region in step S110 may be used to acquire the mask, which is not described herein.
For step S310, as shown in fig. 8, suppose there is an article to be grasped whose grippable-region mask is in fact square, but because another, square article lies on top of part of the grippable region, the mask detected when the camera photographs from directly above is U-shaped. To calculate the degree to which the square article overlaps the article to be grabbed, the area of the U-shaped region can be calculated first. In one embodiment the area can be calculated geometrically; for example, the U-shaped region can be divided into three rectangular regions whose areas are calculated and summed. As a preferred embodiment, the area of the mask can also be calculated from the pixels contained in the mask of the grippable region. In this embodiment all pixels are first assigned the same value, for example 0. Then all pixels in the image are scanned one by one, from left to right and from top to bottom. If a pixel has the features of the mask (such as its color), it is further checked, in sequence, whether the points above-right, above-left and to the left of it lack the mask features; if they do, the label count is increased by 1 (at this point the first point of the mask is labeled 1), otherwise the count is unchanged; if the pixel does not have the mask features it is skipped. If the point above-right and the point to the left of the current pixel carry different labels, while the points above-right and above-left do not have the mask features, the label of the current point is set to the same value as the point above-right, and all pixels labeled the same as the point to the left are relabeled to the same value as the point above-right. All pixels in the image are traversed and labeled in this way. The labeled points are exactly the pixels contained in the mask; their number is counted, and the area of the mask is calculated from the area of a single pixel and the number of pixels. This second method is very general and can be used to calculate the area of a region of any shape. When it is used, the total number of pixels included in the mask of the grippable region can be taken directly as the area of the mask.
For step S320, any circumscribed-rectangle algorithm may be used to find the circumscribed rectangle of the mask. As a specific implementation, the X coordinate and the Y coordinate of every pixel in the mask may first be collected, and the minimum X value, minimum Y value, maximum X value and maximum Y value selected; these four values are then combined into the coordinates of four points, i.e. the minimum X value and minimum Y value form the coordinates (Xmin, Ymin), the maximum X value and maximum Y value form (Xmax, Ymax), the minimum X value and maximum Y value form (Xmin, Ymax), and the maximum X value and minimum Y value form (Xmax, Ymin). Connecting the points (Xmin, Ymin), (Xmax, Ymin), (Xmax, Ymax) and (Xmin, Ymax) in sequence as the four corner points of the circumscribed rectangle yields the circumscribed rectangle. Its area is then calculated in a manner similar to the mask-area calculation of step S310, which is not repeated here.
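The corner-point construction of the circumscribed rectangle can likewise be sketched as follows. This is an illustrative example assuming the binary-mask representation of the previous sketch; circumscribed_rectangle is a name introduced here for illustration only.

import numpy as np

def circumscribed_rectangle(mask: np.ndarray):
    """Axis-aligned circumscribed rectangle of a binary mask.

    Returns the four corner points (Xmin, Ymin), (Xmax, Ymin),
    (Xmax, Ymax), (Xmin, Ymax) and the rectangle area S2 in pixels.
    """
    ys, xs = np.nonzero(mask)                     # coordinates of every mask pixel
    x_min, x_max = int(xs.min()), int(xs.max())
    y_min, y_max = int(ys.min()), int(ys.max())
    corners = [(x_min, y_min), (x_max, y_min), (x_max, y_max), (x_min, y_max)]
    # measured in pixels so that it shares the dimension of the pixel-count area
    s2 = (x_max - x_min + 1) * (y_max - y_min + 1)
    return corners, s2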
For step S330, after the area S1 of the mask of the graspable region of the article to be grasped has been obtained in step S310 and the area S2 of the circumscribed rectangle of that mask has been obtained in step S320, the ratio S1/S2 can be calculated and subtracted from the constant 1 to obtain the mask stacking degree value. Note that S1 and S2 must share the same dimension: if the number of pixels is taken as the area of the mask, the area of the circumscribed rectangle should also be measured in number of pixels.
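Combining the two sketches above, the mask stacking degree of step S330 can be computed as follows; this reuses the illustrative mask_area and circumscribed_rectangle helpers and is not the claimed implementation.

def mask_stacking_degree(mask) -> float:
    """Mask stacking degree C = 1 - S1/S2 of step S330."""
    s1 = mask_area(mask)                      # S1: mask area (pixel count)
    _, s2 = circumscribed_rectangle(mask)     # S2: circumscribed-rectangle area
    return 1.0 - s1 / s2

print(round(mask_stacking_degree(mask), 2))   # 0.24 for the U-shaped example above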
It should be understood that the method of calculating the mask stacking degree in the present invention may be used alone to determine the grasping order, or may be combined with other features of the present invention to calculate the grasping order. It is particularly suitable for use in the above method for determining the grasping order, where it is combined with the other features. In addition, the method of calculating the stacking degree in this embodiment does not actually consider whether the stacking affects grasping; it is therefore preferable to first perform stacking detection as in step S130 of the present invention, remove the articles that cannot be grasped because of stacking, and then apply the method of this embodiment to calculate the stacking degree of each remaining article.
Mask size
The size of the mask of the graspable region may be taken as the area of the mask. A large mask area indicates that the graspable region of the article is large, so the gripper can grasp it more easily; conversely, if the graspable region is small, grasping is more difficult. The area of the mask may be calculated by a method similar to step S310, which is not repeated here.
Orientation of pose
A plurality of articles to be grasped are piled in the material frame, each article has its own pose, and the poses change after every grasp. The pose of an article, in particular the pose of its graspable region, determines where the gripper should be placed and in what posture it should grasp the article. Existing grasping methods do not specifically consider the orientation of the articles, so the grasping order is not determined from it; however, the orientation of an article (or of its graspable region) does influence the grasping effect. As shown in fig. 9, if the graspable region of an article faces the frame opening, the article is clearly easier to grasp, whereas if the graspable region is inclined towards the frame wall, grasping is relatively difficult; the influence of orientation on the grasping difficulty is especially obvious for articles located near the edge of the frame. The position and orientation of an article can be obtained by image processing or by feeding the image data into a neural network, and the pose feature value of the article is then calculated from how difficult it is to grasp the article in that position and orientation; understandably, the harder the article is to grasp, the lower its pose feature value.
For step S220, the feature values obtained in the above manner may have different dimensions, for example: the mask height value is a length, e.g. -182 mm; the number of points of the point cloud within the mask is a count, e.g. 100; the mask diagonal degree is an angle, e.g. 45°. Values of different dimensions cannot be computed together directly, so the values of each feature need to be normalized. Normalization maps the different dimensions into a uniform interval; for example, the values of every feature may be normalized into the interval [0, 10]. In a specific embodiment, assuming that the mask height value of one article to be grasped is -100 mm and that of another is -120 mm, -100 mm may be normalized to 8 and -120 mm to 6, so the normalized mask height values of the two articles are 8 and 6 respectively. For another example, if the mask diagonal degree of one article is 30° and that of another is 15°, 30° may be normalized to 6 and 15° to 3, so the normalized mask diagonal values of the two articles are 6 and 3 respectively.
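As an illustration of such normalization, a simple min-max scaling into [0, 10] could look as follows. The patent does not prescribe a particular normalization formula, so the mapping below is an assumption made for illustration; it maps the extreme values of each feature to 0 and 10 and therefore produces slightly different numbers than the example values above.

import numpy as np

def normalize_feature(values, low=0.0, high=10.0):
    """Min-max normalization of one feature across all articles to [low, high]."""
    values = np.asarray(values, dtype=float)
    span = values.max() - values.min()
    if span == 0.0:
        return np.full(values.shape, (low + high) / 2.0)   # all articles identical
    return low + (values - values.min()) / span * (high - low)

print(normalize_feature([-100.0, -120.0]))   # mask heights -> [10.  0.]
print(normalize_feature([30.0, 15.0]))       # diagonal degrees -> [10.  0.]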
For step S230, after the normalized feature values of the respective features are obtained, a weight may be preset for each feature, and the priority value P of each article to be grasped is calculated from the feature values and the corresponding weights. The priority value may be calculated according to the following formula:
P = Σ_{i=1}^{n} ω_i·X_i
where P is the priority value of an article to be grasped, n is the number of features, ω_i is the weight of the i-th feature, and X_i is the feature value of the i-th feature. For example, suppose a grasping task needs to grasp two articles, and the mask height, gripper size, number of points of the point cloud within the mask, mask diagonal degree, mask stacking degree, mask size and pose orientation are used as features. Before the grasping order is determined, weights are preset for the features, for example: mask height weight 3, gripper size weight 1, point-cloud count weight 2, mask diagonal degree weight 0, mask stacking degree weight 1, mask size weight 2 and pose orientation weight 3. Next, the normalized feature values of the first article to be grasped are obtained, for example: mask height value 5, gripper size value 6, point-cloud count 4, mask diagonal degree value 9, mask stacking degree value 6, mask size value 3 and pose orientation value 2; the priority value of the first article is then P1 = 3×5 + 1×6 + 2×4 + 0×9 + 1×6 + 2×3 + 3×2 = 47. Then the normalized feature values of the second article to be grasped are obtained, for example: mask height value 3, gripper size value 5, point-cloud count 2, mask diagonal degree value 2, mask stacking degree value 5, mask size value 6 and pose orientation value 5; the priority value of the second article is P2 = 3×3 + 1×5 + 2×2 + 0×2 + 1×5 + 2×6 + 3×5 = 50. Since P2 > P1, that is, the grasping priority value of the second article is higher than that of the first article, the gripper grasps the second article first when the grasping task is executed and then grasps the first article.
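The weighted-sum calculation for the two example articles can be reproduced with the following sketch; the weights and feature values are taken from the example above, and priority_value is a name introduced here for illustration only.

import numpy as np

# Preset weights: mask height, gripper size, point-cloud count,
# mask diagonal degree, mask stacking degree, mask size, pose orientation.
weights = np.array([3, 1, 2, 0, 1, 2, 3], dtype=float)

# Normalized feature values of the two articles from the example above.
article_1 = np.array([5, 6, 4, 9, 6, 3, 2], dtype=float)
article_2 = np.array([3, 5, 2, 2, 5, 6, 5], dtype=float)

def priority_value(features):
    """P = sum_i omega_i * X_i, the weighted sum of normalized feature values."""
    return float(np.dot(weights, features))

p1, p2 = priority_value(article_1), priority_value(article_2)
print(p1, p2)        # 47.0 50.0 -> the second article is grasped first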
In an actual industrial scenario, field staff are generally allowed to set various parameters of the robot for a specific grasping task. However, the field staff are usually not familiar with the grasping principle, so when a problem is found they neither know where it comes from nor how to modify the settings to solve it. For example, when a pile of stacked articles is being grasped and other articles are carried out of the frame together with the grasped one, the field staff may judge that the gripper should simply grasp the article on the upper layer first, but they cannot tell why the robot considered the priority value of an article on the lower layer to be higher, nor do they know how to set the weights to change the robot's grasping order. To solve this problem, the inventor developed a set of methods for visually displaying the graphics and parameters involved in the grasping process to the field staff so that they can operate on them as needed, which is also one of the key points of the present invention.
FIG. 10 shows a flow diagram of a method of visualizing graphics and parameters in a grabbing process in accordance with one embodiment of the invention. As shown in fig. 10, the method includes:
step S400, obtaining image data comprising one or more objects to be grabbed;
step S410, outputting the image data and an operable control to form an interactive interface, wherein the control is operable by a user to select a grabbing auxiliary image and display the selected grabbing auxiliary image to the user;
step S420, responding to the operation of the control by the user, and acquiring grabbing auxiliary data corresponding to the grabbing auxiliary image selected by the user;
step S430, a grabbing auxiliary layer is generated based on the acquired grabbing auxiliary data;
step S440, combining the grabbing auxiliary layer with the image data comprising the one or more articles to be grabbed to generate the grabbing auxiliary image selected by the user.
For step S400, image data including one or more objects to be grabbed may be acquired in a similar manner as in step S100, and no further description is given here.
For step S410, the captured picture and the control may be output to a display and presented to the user. The interaction between the user and the robot may be performed by touch operation, by voice operation, or by conventional devices such as a mouse and a keyboard, which the present invention does not limit. An interactive interface is a channel for information exchange between a person and the computer system: the user inputs information and operates the system through it, and the computer provides information through it for the user to read, analyse and judge; each interactive interface therefore includes both the displayed information and the controls that the user can operate. The control that governs the visualization may be displayed together with the image on one interactive interface, or the control and the image may be divided over two interfaces, with the image interface providing an entry for switching to the control interface and the control interface providing an entry for switching back to the image interface; when the user operates the entry, the display switches to the control interface or the image interface accordingly. As shown in fig. 11a, operations related to visualization may be selected on the control interface, including: turning on the visualization, displaying the outlines of stacked articles, and choosing the visualized attribute. The visualized attribute may include any of the parameters output in any of the previous embodiments; the selectable attributes in fig. 11a include: ALL, display by pose height, display by suction-cup size, display by stacking degree, display by transparency, and display by pose orientation. ALL refers to the overall score, which may be the priority value output in step S230. It is readily seen that the pose orientation, the suction-cup size and the stacking degree are all values output by the scheme that determines the order in which the articles are grasped.
For step S420, the user may select the value of interest according to his own needs. For example, when the user finds that the robot does not grasp in the order expected by the user, the ALL control may be selected to display the grasp priority value of each object to be grasped to determine the difference between the actual grasp order and the grasp order expected by the user, and then select the specific visual attribute separately to determine which attribute affects the grasp order. When the user selects a certain visual option, the system searches and invokes the corresponding data. As a preferred embodiment, the system may obtain the parameters selected by the user and the mask of the grippable region in response to the selection by the user, and use the parameters and the mask of the grippable region together as auxiliary data, for example, when the user selects "display by suction cup size", the system may call the mask of the grippable region generated during the execution of step S110 and the value of the suction cup size obtained during the execution of step S210 at the same time; similarly, when the user selects "display according to pose height", the mask of the grippable region generated during the execution of step S110 and the mask height feature value acquired during the execution of step S210 are invoked.
For step S430, the data invoked in step S420 are combined to generate a visual layer viewable by the user. Taking as an example the case where the user has selected "display by pose height" or "display by suction-cup size" and the grabbing auxiliary data also includes the mask of the graspable region: when the user selects "display by pose height", the masks of all the articles to be grasped in the original image and the mask height feature value of each article are invoked, and a layer is generated in which each mask height feature value is placed beside the corresponding mask; when the user selects "display by suction-cup size", the masks of all the articles to be grasped in the original image and the suction-cup size feature value of each article are invoked, and a layer is generated in which each suction-cup size feature value is placed beside the corresponding mask.
For step S440, the grabbing auxiliary layer generated in step S430 is combined with the originally captured image data and visually presented to the user. The layer generated in step S430 may first be processed, adjusting attributes such as its colour, transparency and contrast, and then all pixels of the auxiliary layer and all pixels of the original image data are combined in order from left to right and from top to bottom to generate the synthesized image data. As shown in fig. 11b, the synthesized image shows each article to be grasped, the mask covering its graspable region, and the user-selected "pose height" value or "suction-cup size" value shown next to the mask.
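A minimal sketch of such layer composition is given below. It assumes the auxiliary layer has already been drawn as an image of the same size as the captured picture (coloured masks and feature-value text rendered into it), blends it onto the original image with an adjustable transparency, and uses the illustrative name compose_assist_image; it is not the claimed implementation.

import numpy as np

def compose_assist_image(image: np.ndarray,
                         assist_layer: np.ndarray,
                         alpha: float = 0.5) -> np.ndarray:
    """Blend a grabbing auxiliary layer onto the captured image (step S440).

    image:        H x W x 3 uint8 image of the articles to be grasped.
    assist_layer: H x W x 3 uint8 layer holding the coloured masks and the
                  feature values drawn beside them; all-zero pixels are empty.
    alpha:        transparency of the layer where it is non-empty.
    """
    out = image.astype(float)
    drawn = assist_layer.any(axis=2)             # pixels the layer actually uses
    blended = (1.0 - alpha) * out + alpha * assist_layer.astype(float)
    out[drawn] = blended[drawn]
    return np.clip(out, 0, 255).astype(np.uint8)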
In addition, it should be noted that although each embodiment of the present invention has a specific combination of features, further combinations and cross combinations of these features between embodiments are also possible.
According to the above embodiments, first, the gripper is controlled to grasp the articles to be grasped on the basis of the masks of their graspable regions, combined with steps such as stacking detection, pose estimation and grasping ordering, so that in a dense scene in which many articles are piled together the grasping manner of each article can be identified accurately and all articles can be grasped in an orderly way according to a specific sequence; compared with existing grasping schemes this effectively avoids other articles being carried away when one article is grasped in a dense scene and improves the grasping accuracy. Second, the grasping-ordering method of the present invention sorts comprehensively according to the features of the mask of the graspable region of each article, which improves the ordering accuracy compared with conventional methods; and because the features of the whole article are not processed, the computation speed does not drop noticeably even when many factors are considered. Third, the present invention provides a method for calculating the stacking degree of an article based on the graphic features of its graspable region; compared with conventional calculation methods it is fast and gives a specific stacking value rather than merely judging whether the article is stacked, and although its accuracy is somewhat lower, the calculation is simple and quick, the value can be used elsewhere, and it is particularly suitable for scenes with requirements on computation speed or scenes in which the ordering is made comprehensively from several features. Fourth, the present invention also provides a method for visually displaying the parameters and image data involved in the grasping control method to the user, so that the user can intuitively see the various parameters of the robot's grasping process without understanding its operating principle, determine what the robot bases its grasping on, and thus determine how to adjust the robot's parameters, which solves the problem that in conventional grasping schemes the user can only guess when adjusting parameters.
Fig. 12 shows a grip control device according to still another embodiment of the present invention, the device including:
an image data obtaining module 500, configured to obtain image data including one or more objects to be grabbed, i.e. to implement step S100;
the mask prediction module 510 is configured to process the image data, generate one or more masks of the grabbed areas of the object to be grabbed, and perform preprocessing on the masks, i.e. is configured to implement step S110;
the stacking detection module 520 is configured to detect whether one or more objects to be grabbed have stacking conditions, i.e. is configured to implement step S120;
the pose estimation module 530 is configured to estimate a position and a pose of one or more objects to be grasped, i.e. to implement step S130;
the clamp configuration module 540 is configured to configure a clamp for the object to be grabbed according to the attribute of the object to be grabbed, so that the clamp suitable for grabbing the object to be grabbed can be used for grabbing when grabbing the object to be grabbed, namely, the step S140 is implemented;
a gripping and sorting module 550, configured to determine, based on the gripping characteristics of the one or more objects to be gripped, the order in which the gripper grips the one or more objects, so that the gripper can grip the objects in the determined order, i.e. to implement step S150.
Optionally, the device further comprises a frame parameter acquisition module, which is used for processing the frame data to obtain parameters of the frame. Articles to be grasped are often piled in boxes for transportation to the site, and such boxes for piled articles are often called material frames, and when grasping is performed, a mechanical arm or a clamp may touch the material frames during movement, so that the material frames and the placement positions of the articles in the material frames have important influence on grasping. As a preferred embodiment, parameters of the frame may be obtained. As shown in fig. 2, the frame data may be processed to extract or generate auxiliary parameters that have an effect on grabbing, such parameters including: the height of the material frame, the width of the material frame, the length of the material frame, and the grid obtained by dividing the width and the length of the material frame. It should be understood that the height, width and length are all determined values, and the dividing mode and number of the grids are determined by the skilled person according to the actual conditions of the used fixture, the grabbing mode, the characteristics of the objects to be grabbed and the like, and the grids can be used for conveniently calibrating the positions of the objects to be grabbed. The frame data may be preset or acquired by a camera.
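As an illustration of the grid parameter mentioned above, the floor of the material frame could be divided into cells as in the following sketch; the function name frame_grid and the particular choice of rows and columns are assumptions made for illustration, since the manner and number of divisions are left to the skilled person.

def frame_grid(frame_length: float, frame_width: float, rows: int, cols: int):
    """Divide the floor of the material frame into a rows x cols grid.

    Each cell is returned with its (row, col) index and the x/y ranges it
    covers in frame coordinates, so that the position of an article inside
    the frame can be calibrated against the grid.
    """
    cell_w = frame_width / cols
    cell_l = frame_length / rows
    cells = []
    for r in range(rows):
        for c in range(cols):
            cells.append({
                "index": (r, c),
                "x_range": (c * cell_w, (c + 1) * cell_w),
                "y_range": (r * cell_l, (r + 1) * cell_l),
            })
    return cells

# e.g. a 600 mm x 400 mm frame split into a 3 x 2 grid of 6 cells
print(len(frame_grid(600.0, 400.0, rows=3, cols=2)))   # 6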
Fig. 13 shows a grip control device according to still another embodiment of the present invention, the device including:
a mask acquiring module 600, configured to acquire a mask of a grabbed area of at least one object to be grabbed, i.e. to implement step S200;
a feature value obtaining module 610, configured to obtain, for each of the at least one object to be grabbed, a feature value of at least one feature of the mask in the grabbed area, that is, the feature value is used to implement step S210;
a feature value normalization module 620, configured to perform normalization processing on each of the obtained feature values of the at least one feature, to obtain at least one normalized feature value, that is, to implement step S220;
the priority value calculating module 630 is configured to calculate, based on at least one normalized feature value and a preset weight value of each object to be grabbed, a grabbing priority value of the object to be grabbed, so that when at least one object to be grabbed is grabbed, a grabbing sequence can be controlled according to the grabbing priority value, that is, the method is used to implement step S230.
Fig. 14 shows an image data processing apparatus according to still another embodiment of the present invention, the apparatus including:
the mask acquiring module 700 is configured to acquire a mask of a grabbed area of at least one object to be grabbed, i.e. to implement step S300;
a mask area calculation module 710, for calculating, for each article to be grasped, the area S1 of the mask of the graspable region of the article, i.e. for implementing step S310;
a circumscribed rectangle processing module 720, for generating, for each article to be grasped, a circumscribed rectangle of the mask of the graspable region of the article and calculating the area S2 of the circumscribed rectangle, i.e. for implementing step S320;
the stacking degree calculating module 730 is configured to calculate, for each article to be grabbed, a mask stacking degree C of the article to be grabbed according to the following formula:
C = 1 - S1/S2, i.e. for implementing step S330.
Fig. 15 shows an image data processing apparatus according to still another embodiment of the present invention, the apparatus including:
an image data obtaining module 800, configured to obtain image data including one or more objects to be grabbed, i.e. to implement step S400;
the interactive interface display module 810 is configured to output the image data and an operable control to form an interactive interface, where the control is operable by a user to select to capture an auxiliary image and display the selected capture auxiliary image to the user, that is, to implement step S410;
an auxiliary data obtaining module 820, configured to obtain, in response to the operation of the control by the user, capturing auxiliary data corresponding to the capturing auxiliary image selected by the user, that is, to implement step S420;
An auxiliary layer generating module 830, configured to generate a grabbing auxiliary layer based on the acquired grabbing auxiliary data, that is, to implement step S430;
an auxiliary image generation module 840 for combining the capture auxiliary image layer with image data comprising one or more items to be captured to generate a user selected capture auxiliary image, i.e. for implementing step S440.
It should be understood that in the above embodiment of the apparatus shown in fig. 12 to 15, only the main functions of the modules are described, and all the functions of each module correspond to the corresponding steps in the method embodiment, and the working principle of each module may refer to the description of the corresponding steps in the method embodiment. For example, the auxiliary image generation module 840 is used to implement the method of step S440 in the above-described embodiment, indicating that the content for describing and explaining step S440 is also the content for describing and explaining the function of the auxiliary image generation module 840. In addition, although the correspondence between functions of the functional modules and the method is defined in the above embodiments, those skilled in the art will understand that the functions of the functional modules are not limited to the correspondence, that is, a specific functional module may also implement other method steps or a part of the method steps. For example, the above embodiment describes the method for implementing step S440 by the auxiliary image generation module 840, however, the auxiliary image generation module 840 may be used to implement the method or a part of the method of steps S400, S410, S420 or S430 as the actual situation requires.
The present application also provides a computer readable storage medium having stored thereon a computer program which when executed by a processor implements the method of any of the above embodiments. It should be noted that, the computer program stored in the computer readable storage medium according to the embodiment of the present application may be executed by the processor of the electronic device, and in addition, the computer readable storage medium may be a storage medium built in the electronic device or may be a storage medium capable of being plugged into the electronic device in a pluggable manner, so that the computer readable storage medium according to the embodiment of the present application has higher flexibility and reliability.
Fig. 16 shows a schematic structural diagram of an electronic device according to an embodiment of the present invention, which may be a control system/electronic system configured in an automobile, a mobile terminal (e.g., a smart mobile phone, etc.), a personal computer (PC, e.g., a desktop computer or a notebook computer, etc.), a tablet computer, a server, etc., and the specific embodiment of the present invention does not limit the specific implementation of the electronic device.
As shown in fig. 16, the electronic device may include: a processor 1202, a communication interface (Communications Interface) 1204, a memory 1206, and a communication bus 1208.
Wherein:
the processor 1202, the communication interface 1204, and the memory 1206 communicate with each other via a communication bus 1208.
A communication interface 1204 for communicating with network elements of other devices, such as clients or other servers, etc.
The processor 1202 is configured to execute the program 1210, and may specifically perform relevant steps in the method embodiments described above.
In particular, program 1210 may include program code including computer operating instructions.
The processor 1202 may be a central processing unit CPU, or a specific integrated circuit ASIC (Application Specific Integrated Circuit), or one or more integrated circuits configured to implement embodiments of the present invention. The one or more processors included in the electronic device may be the same type of processor, such as one or more CPUs; but may also be different types of processors such as one or more CPUs and one or more ASICs.
Memory 1206 for storing program 1210. The memory 1206 may comprise high-speed RAM memory or may further comprise non-volatile memory (non-volatile memory), such as at least one disk memory.
Program 1210 may be downloaded and installed from a network and/or from a removable medium via communications interface 1204. The program, when executed by the processor 1202, may cause the processor 1202 to perform the operations of the method embodiments described above.
In general terms, the invention comprises the following steps:
a grip control method, the control method comprising at least the steps of:
a step of acquiring image data including one or more objects to be grasped;
processing the image data to generate one or more masks of the grippable areas of the object to be grippable, and preprocessing the masks;
detecting whether one or more objects to be grabbed have a press-fit condition or not;
estimating the position and the posture of one or more objects to be grabbed;
according to the attribute of the object to be grabbed, configuring a clamp for the object to be grabbed, so that the clamp suitable for grabbing the object to be grabbed can be used for grabbing the object to be grabbed when the object to be grabbed is grabbed;
determining an order in which the one or more items to be grasped are grasped using the jig based on grasping characteristics of the one or more items to be grasped, so that the jig can grasp the items in the determined order.
Optionally, the method further comprises the step of processing the frame data to obtain parameters of the frame.
Optionally, the preprocessing the mask includes: the method comprises the steps of expanding a mask, preprocessing the mask based on a preset minimum area of the mask, and/or preprocessing the mask based on the minimum number of point clouds in the preset mask.
Optionally, the step of detecting whether the one or more objects to be grabbed have a press-fit condition further includes: and outputting the result of the press-fit detection according to the preset maximum number of press-fit detection.
Optionally, the step of estimating the position and the posture of the one or more objects to be grabbed further includes: and outputting an estimated result according to the preset maximum estimated quantity.
Optionally, the attribute of the object includes an image attribute of an object mask.
A grip control device comprising:
the image data acquisition module is used for acquiring image data comprising one or more objects to be grabbed;
the mask prediction module is used for processing the image data, generating one or more masks of the grabbing areas of the objects to be grabbed, and preprocessing the masks;
the folding detection module is used for detecting whether one or more objects to be grabbed have folding conditions or not;
the pose estimation module is used for estimating the positions and the poses of one or more objects to be grabbed;
the clamp configuration module is used for configuring a clamp for the object to be grabbed according to the attribute of the object to be grabbed, so that the clamp suitable for grabbing the object to be grabbed can be used for grabbing when the object to be grabbed is grabbed;
And the grabbing sequencing module is used for determining the sequence of grabbing the one or more articles to be grabbed by using the clamp based on grabbing characteristics of the one or more articles to be grabbed, so that the clamp can grab the articles according to the determined sequence.
Optionally, the apparatus further comprises: and the material frame parameter acquisition module is used for processing the material frame data to acquire the parameters of the material frame.
Optionally, the preprocessing the mask includes: the method comprises the steps of expanding a mask, preprocessing the mask based on a preset minimum area of the mask, and/or preprocessing the mask based on the minimum number of point clouds in the preset mask.
Optionally, the press-fit detection module is further configured to: and outputting the result of the press-fit detection according to the preset maximum number of press-fit detection.
Optionally, the pose estimation module is further configured to: and outputting an estimated result according to the preset maximum estimated quantity.
Optionally, the attribute of the object includes an image attribute of an object mask.
A grip control method comprising:
acquiring a mask of a grabbing area of at least one object to be grabbed;
for each object to be grabbed in at least one object to be grabbed, acquiring a characteristic value of at least one characteristic of a mask in a grabbed area of the object to be grabbed;
Performing normalization processing on each of the acquired feature values of the at least one feature to obtain at least one normalized feature value;
and calculating the grabbing priority value of each article to be grabbed based on at least one normalized characteristic value and a preset weight value of each article to be grabbed, so that when at least one article to be grabbed is grabbed, the grabbing sequence can be controlled according to the grabbing priority value.
Optionally, the features of the mask of the grippable region include: mask height, clamp size, number of point clouds in the mask, mask diagonal degree, mask stacking degree, mask size and/or pose direction.
Optionally, a mask height feature value of the mask of the grippable region is calculated based on the depth value of the grippable region.
Optionally, the clamp size is determined based on a mapping relationship between a preset clamp and the clamp size.
Optionally, the diagonal degree of the mask is determined based on an included angle between a diagonal line of the circumscribed rectangle of the mask and one side of the circumscribed rectangle.
Optionally, the priority value is calculated according to the following formula:
P = Σ_{i=1}^{n} ω_i·X_i
wherein P is the priority value of the object to be grabbed, n is the number of features, ω_i is the weight of the i-th feature, and X_i is the feature value of the i-th feature.
A grip control device comprising:
the mask acquisition module is used for acquiring a mask of a grabbed area of at least one article to be grabbed;
the characteristic value acquisition module is used for acquiring the characteristic value of at least one characteristic of the mask in the grabbing area of each article to be grabbed in the at least one article to be grabbed;
the feature value normalization module is used for performing normalization processing on each of the acquired feature values of the at least one feature to obtain at least one normalized feature value;
the priority value calculating module is used for calculating the grabbing priority value of each article to be grabbed based on at least one normalized characteristic value and a preset weight value of each article to be grabbed, so that when at least one article to be grabbed is grabbed, the grabbing sequence can be controlled according to the grabbing priority value.
Optionally, the features of the mask of the grippable region include: mask height, clamp size, number of point clouds in the mask, mask diagonal degree, mask stacking degree, mask size and/or pose direction.
Optionally, a mask height feature value of the mask of the grippable region is calculated based on the depth value of the grippable region.
Optionally, the clamp size is determined based on a mapping relationship between a preset clamp and the clamp size.
Optionally, the diagonal degree of the mask is determined based on an included angle between a diagonal line of the circumscribed rectangle of the mask and one side of the circumscribed rectangle.
Optionally, the priority value calculating module calculates the priority value according to the following formula:
P = Σ_{i=1}^{n} ω_i·X_i
wherein P is the priority value of the object to be grabbed, n is the number of features, ω_i is the weight of the i-th feature, and X_i is the feature value of the i-th feature.
An image data processing method, comprising:
acquiring a mask of a grabbing area of at least one object to be grabbed;
for each article to be grasped, calculating the area S1 of the mask of the graspable area of the article;
for each article to be grasped, generating a circumscribed rectangle of the mask of the graspable area of the article and calculating the area S2 of the circumscribed rectangle;
For each article to be grasped, the mask folding degree C of the article to be grasped is calculated by the following formula:
C = 1 - S1/S2
the mask stacking degree C can be used for determining the grabbing sequence of the objects to be grabbed so as to control the clamp to grab the objects to be grabbed.
Optionally, the area of the mask and/or the area of the bounding rectangle is calculated based on a geometrical method.
Optionally, the area of the mask is calculated based on the pixel points contained in the mask, and/or the area of the circumscribed rectangle is calculated based on the pixel points contained in the circumscribed rectangle.
Optionally, the generating the circumscribed rectangle of the mask of the grippable region of the article includes: and acquiring X coordinate values and Y coordinate values of each pixel point of the mask, and calculating the circumscribed rectangle based on the minimum X value, the minimum Y value, the maximum X value and the maximum Y value.
An image data processing apparatus comprising:
the mask acquisition module is used for acquiring a mask of a grabbed area of at least one article to be grabbed;
a mask area calculating module, for calculating, for each article to be grasped, the area S1 of the mask of the graspable area of the article;
a circumscribed rectangle processing module, for generating, for each article to be grasped, a circumscribed rectangle of the mask of the graspable area of the article and calculating the area S2 of the circumscribed rectangle;
The stacking degree calculating module is used for calculating the mask stacking degree C of each article to be grabbed through the following formula:
C = 1 - S1/S2
the mask stacking degree C can be used for determining the grabbing sequence of the objects to be grabbed so as to control the clamp to grab the objects to be grabbed.
Optionally, the area of the mask and/or the area of the bounding rectangle is calculated based on a geometrical method.
Optionally, the area of the mask is calculated based on the pixel points contained in the mask, and/or the area of the circumscribed rectangle is calculated based on the pixel points contained in the circumscribed rectangle.
Optionally, the circumscribed rectangle processing module is further configured to: and acquiring X coordinate values and Y coordinate values of each pixel point of the mask, and calculating the circumscribed rectangle based on the minimum X value, the minimum Y value, the maximum X value and the maximum Y value.
An image data processing method, comprising:
acquiring image data comprising one or more items to be grabbed;
outputting the image data and an operable control to form an interactive interface, wherein the control is operable by a user to select a grabbing auxiliary image and display the selected grabbing auxiliary image to the user;
responding to the operation of the control by the user, acquiring grabbing auxiliary data corresponding to the grabbing auxiliary image selected by the user;
generating a grabbing auxiliary layer based on the acquired grabbing auxiliary data;
the capture assistance layer is combined with image data comprising one or more items to be captured to generate a user selected capture assistance image.
Optionally, the image data and the operable control are in the same interactive interface.
Optionally, the image data and the operable control are in different interaction interfaces.
Optionally, the different interaction interfaces are switched in response to a user operation.
Optionally, the capturing auxiliary data includes: a value associated with the user selected capture assistance image and a mask of the grippable region of the item to be captured.
Optionally, the combining the capture auxiliary layer with image data including one or more objects to be captured includes: after adjusting the color, transparency and/or contrast of the grabbing auxiliary layer, the adjusted grabbing auxiliary layer is combined with image data comprising one or more objects to be grabbed.
An image data processing apparatus comprising:
the image data acquisition module is used for acquiring image data comprising one or more objects to be grabbed;
the interactive interface display module is used for outputting the image data and an operable control to form an interactive interface, wherein the control can be operated by a user to select to capture the auxiliary image and display the selected auxiliary image to the user;
The auxiliary data acquisition module is used for responding to the operation of the control by the user and acquiring grabbing auxiliary data corresponding to the grabbing auxiliary image selected by the user;
the auxiliary layer generation module is used for generating a grabbing auxiliary layer based on the acquired grabbing auxiliary data;
an auxiliary image generation module for combining the capture auxiliary image layer with image data comprising one or more items to be captured to generate a user selected capture auxiliary image.
Optionally, the image data and the operable control are in the same interactive interface.
Optionally, the image data and the operable control are in different interaction interfaces.
Optionally, the different interaction interfaces are switched in response to a user operation.
Optionally, the capturing auxiliary data includes: a value associated with the user selected capture assistance image and a mask of the grippable region of the item to be captured.
Optionally, the auxiliary image generating module is further configured to: after adjusting the color, transparency and/or contrast of the grabbing auxiliary layer, the adjusted grabbing auxiliary layer is combined with image data comprising one or more objects to be grabbed.
In the description of the present specification, reference to the terms "one embodiment," "some embodiments," "illustrative embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and further implementations are included within the scope of the preferred embodiment of the present application in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the embodiments of the present application.
Logic and/or steps represented in the flowcharts or otherwise described herein, e.g., a ordered listing of executable instructions for implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, system that includes a processing module, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). In addition, the computer readable medium may even be paper or other suitable medium on which the program is printed, as the program may be electronically captured, via, for instance, optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner, if necessary, and then stored in a computer memory.
The processor may be a central processing unit (Central Processing Unit, CPU), but may also be other general purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), off-the-shelf programmable gate arrays (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, or the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
It is to be understood that portions of embodiments of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, may be implemented using any one or combination of the following techniques, as is well known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application specific integrated circuits having suitable combinational logic gates, programmable Gate Arrays (PGAs), field Programmable Gate Arrays (FPGAs), and the like.
Those of ordinary skill in the art will appreciate that all or a portion of the steps carried out in the method of the above-described embodiments may be implemented by a program to instruct related hardware, where the program may be stored in a computer readable storage medium, and where the program, when executed, includes one or a combination of the steps of the method embodiments.
Furthermore, each functional unit in the embodiments of the present application may be integrated in one processing module, or each unit may exist alone physically, or two or more units may be integrated in one module. The integrated modules may be implemented in hardware or in software functional modules. The integrated modules may also be stored in a computer readable storage medium if implemented in the form of software functional modules and sold or used as a stand-alone product.
The above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, or the like.
Although the embodiments of the present application have been shown and described above, it will be understood that the above embodiments are illustrative and not to be construed as limiting the application, and that variations, modifications, alternatives and variations may be made to the embodiments described above by those of ordinary skill in the art within the scope of the application.

Claims (14)

1. A grip control method, characterized by comprising:
acquiring a mask of a grabbing area of at least one object to be grabbed;
for each object to be grabbed in at least one object to be grabbed, acquiring a characteristic value of at least one characteristic of a mask in a grabbed area of the object to be grabbed;
performing normalization processing on each of the acquired feature values of the at least one feature to obtain at least one normalized feature value;
and calculating the grabbing priority value of each article to be grabbed based on at least one normalized characteristic value and a preset weight value of each article to be grabbed, so that when at least one article to be grabbed is grabbed, the grabbing sequence can be controlled according to the grabbing priority value.
2. The grip control method according to claim 1, wherein the characteristics of the mask of the grippable region include: mask height, clamp size, number of point clouds in the mask, mask diagonal degree, mask stacking degree, mask size and/or pose direction.
3. The method according to claim 2, wherein the mask height feature value of the mask of the grippable region is calculated based on the depth value of the grippable region.
4. The grip control method according to claim 2, wherein the jig size is determined based on a mapping relationship between a preset jig and the jig size.
5. The grip control method according to claim 2, wherein the diagonal degree of the mask is determined based on an angle between a diagonal line of the circumscribed rectangle of the mask and one side of the circumscribed rectangle.
6. The grip control method according to any one of claims 1 to 5, characterized in that the priority value is calculated according to the following formula:
P = Σ_{i=1}^{n} ω_i·X_i
wherein P is the priority value of the object to be grabbed, n is the number of features, ω_i is the weight of the i-th feature, and X_i is the feature value of the i-th feature.
7. A grip control device, characterized by comprising:
the mask acquisition module is used for acquiring a mask of a grabbed area of at least one article to be grabbed;
the characteristic value acquisition module is used for acquiring the characteristic value of at least one characteristic of the mask in the grabbing area of each article to be grabbed in the at least one article to be grabbed;
the feature value normalization module is used for performing normalization processing on each of the acquired feature values of the at least one feature to obtain at least one normalized feature value;
The priority value calculating module is used for calculating the grabbing priority value of each article to be grabbed based on at least one normalized characteristic value and a preset weight value of each article to be grabbed, so that when at least one article to be grabbed is grabbed, the grabbing sequence can be controlled according to the grabbing priority value.
8. The grip control device of claim 7, wherein the mask of the grippable region is characterized by: mask height, clamp size, number of point clouds in the mask, mask diagonal degree, mask stacking degree, mask size and/or pose direction.
9. The grip control device according to claim 8, wherein the mask height feature value of the mask of the grippable region is calculated based on the depth value of the grippable region.
10. The grip control device according to claim 8, wherein the jig size is determined based on a mapping relationship between a preset jig and the jig size.
11. The grip control device of claim 8, wherein the mask diagonal is determined based on an angle between a diagonal of the circumscribed rectangle of the mask and one side of the circumscribed rectangle.
12. The grip control device according to any one of claims 7 to 11, wherein the priority value calculation module calculates the priority value according to the following formula:
P = Σ_{i=1}^{n} ω_i·X_i
wherein P is the priority value of the object to be grabbed, n is the number of features, ω_i is the weight of the i-th feature, and X_i is the feature value of the i-th feature.
13. An electronic device, comprising: memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the grab control method according to any of claims 1 to 6 when the computer program is executed.
14. A computer-readable storage medium, on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the grab control method of any of claims 1 to 6.
CN202111429144.5A 2021-11-28 2021-11-28 Method, device, electronic equipment and storage medium for determining clamp grabbing sequence Active CN116175542B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111429144.5A CN116175542B (en) 2021-11-28 2021-11-28 Method, device, electronic equipment and storage medium for determining clamp grabbing sequence

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111429144.5A CN116175542B (en) 2021-11-28 2021-11-28 Method, device, electronic equipment and storage medium for determining clamp grabbing sequence

Publications (2)

Publication Number Publication Date
CN116175542A true CN116175542A (en) 2023-05-30
CN116175542B CN116175542B (en) 2024-01-26

Family

ID=86440834

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111429144.5A Active CN116175542B (en) 2021-11-28 2021-11-28 Method, device, electronic equipment and storage medium for determining clamp grabbing sequence

Country Status (1)

Country Link
CN (1) CN116175542B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117067219A (en) * 2023-10-13 2023-11-17 广州朗晴电动车有限公司 Sheet metal mechanical arm control method and system for trolley body molding

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107370764A (en) * 2017-09-05 2017-11-21 北京奇艺世纪科技有限公司 A kind of audio/video communication system and audio/video communication method
CN111144322A (en) * 2019-12-28 2020-05-12 广东拓斯达科技股份有限公司 Sorting method, device, equipment and storage medium
CN111344118A (en) * 2017-11-17 2020-06-26 奥卡多创新有限公司 Control device and method for a robot system for positioning items and calculating an appropriate gripping point for each item
CN112802105A (en) * 2021-02-05 2021-05-14 梅卡曼德(北京)机器人科技有限公司 Object grabbing method and device
DE102020128653A1 (en) * 2019-11-13 2021-05-20 Nvidia Corporation Determination of reaching for an object in disarray
CN112883881A (en) * 2021-02-25 2021-06-01 中国农业大学 Disordered sorting method and device for strip-shaped agricultural products
CN113420746A (en) * 2021-08-25 2021-09-21 中国科学院自动化研究所 Robot visual sorting method and device, electronic equipment and storage medium
CN113592855A (en) * 2021-08-19 2021-11-02 山东大学 Heuristic deep reinforcement learning-based autonomous grabbing and boxing method and system

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107370764A (en) * 2017-09-05 2017-11-21 北京奇艺世纪科技有限公司 A kind of audio/video communication system and audio/video communication method
CN111344118A (en) * 2017-11-17 2020-06-26 奥卡多创新有限公司 Control device and method for a robot system for positioning items and calculating an appropriate gripping point for each item
DE102020128653A1 (en) * 2019-11-13 2021-05-20 Nvidia Corporation Determination of reaching for an object in disarray
CN111144322A (en) * 2019-12-28 2020-05-12 广东拓斯达科技股份有限公司 Sorting method, device, equipment and storage medium
CN112802105A (en) * 2021-02-05 2021-05-14 梅卡曼德(北京)机器人科技有限公司 Object grabbing method and device
CN112883881A (en) * 2021-02-25 2021-06-01 中国农业大学 Disordered sorting method and device for strip-shaped agricultural products
CN113592855A (en) * 2021-08-19 2021-11-02 山东大学 Heuristic deep reinforcement learning-based autonomous grabbing and boxing method and system
CN113420746A (en) * 2021-08-25 2021-09-21 中国科学院自动化研究所 Robot visual sorting method and device, electronic equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
何若涛; 陈龙新; 廖亚军; 陈十力: "Three-dimensional object detection and grasping based on RGB-D images" (基于RGB-D图像的三维物体检测与抓取), Mechanical Engineering & Automation (机械工程与自动化), no. 05 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117067219A (en) * 2023-10-13 2023-11-17 广州朗晴电动车有限公司 Sheet metal mechanical arm control method and system for trolley body molding
CN117067219B (en) * 2023-10-13 2023-12-15 广州朗晴电动车有限公司 Sheet metal mechanical arm control method and system for trolley body molding

Also Published As

Publication number Publication date
CN116175542B (en) 2024-01-26

Similar Documents

Publication Publication Date Title
US10124489B2 (en) Locating, separating, and picking boxes with a sensor-guided robot
JP5429614B2 (en) Box-shaped workpiece recognition apparatus and method
JP6415026B2 (en) Interference determination apparatus, interference determination method, and computer program
CN110580725A (en) Box sorting method and system based on RGB-D camera
JP4309439B2 (en) Object take-out device
CN111844019B (en) Method and device for determining grabbing position of machine, electronic device and storage medium
JP6749034B1 (en) Post-detection refinement based on edges and multidimensional corners
JP7481427B2 (en) Removal system and method
JP7377627B2 (en) Object detection device, object grasping system, object detection method, and object detection program
JP2004090183A (en) Article position and orientation detecting device and article taking-out device
JP2018144144A (en) Image processing device, image processing method and computer program
JP6632656B2 (en) Interference determination device, interference determination method, and computer program
CN116529760A (en) Grabbing control method, grabbing control device, electronic equipment and storage medium
WO2021039775A1 (en) Image processing device, image capturing device, robot, and robot system
JP7201189B2 (en) Robotic system for object size detection
CN116175542A (en) Grabbing control method, grabbing control device, electronic equipment and storage medium
CN116188559A (en) Image data processing method, device, electronic equipment and storage medium
JP7066671B2 (en) Interference determination device, interference determination method, program and system
JP2018146347A (en) Image processing device, image processing method, and computer program
CN116197885B (en) Image data filtering method, device, equipment and medium based on press-fit detection
CN116197887A (en) Image data processing method, device, electronic equipment and storage medium
CN116197888B (en) Method and device for determining position of article, electronic equipment and storage medium
CN116175541B (en) Grabbing control method, grabbing control device, electronic equipment and storage medium
CN116175540B (en) Grabbing control method, device, equipment and medium based on position and orientation
JP5332873B2 (en) Bag-like workpiece recognition device and method

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant