CN112837370A - Object stacking judgment method and device based on 3D bounding box and computing equipment - Google Patents


Info

Publication number
CN112837370A
Authority
CN
China
Prior art keywords
objects
bounding box
coordinate
determining
stacking
Prior art date
Legal status
Pending
Application number
CN202110217390.8A
Other languages
Chinese (zh)
Inventor
魏海永
李辉
黄体森
盛文波
丁有爽
邵天兰
Current Assignee
Mech Mind Robotics Technologies Co Ltd
Original Assignee
Mech Mind Robotics Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by Mech Mind Robotics Technologies Co Ltd filed Critical Mech Mind Robotics Technologies Co Ltd
Priority to CN202110217390.8A
Publication of CN112837370A
Legal status: Pending

Classifications

    • G: PHYSICS
      • G06: COMPUTING; CALCULATING OR COUNTING
        • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
          • G06T 7/00: Image analysis
            • G06T 7/70: Determining position or orientation of objects or cameras
              • G06T 7/73: Determining position or orientation using feature-based methods
                • G06T 7/74: Feature-based methods involving reference images or patches
          • G06T 1/00: General purpose image data processing
            • G06T 1/0014: Image feed-back for automatic industrial control, e.g. robot with camera
          • G06T 2207/00: Indexing scheme for image analysis or image enhancement
            • G06T 2207/10: Image acquisition modality
              • G06T 2207/10028: Range image; Depth image; 3D point clouds
            • G06T 2207/20: Special algorithmic details
              • G06T 2207/20081: Training; Learning
              • G06T 2207/20084: Artificial neural networks [ANN]
            • G06T 2207/30: Subject of image; Context of image processing
              • G06T 2207/30108: Industrial image inspection
                • G06T 2207/30164: Workpiece; Machine component

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Robotics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method, an apparatus and a computing device for judging object stacking based on 3D bounding boxes, wherein the method comprises the following steps: acquiring pose information of a plurality of objects in a current scene and the point clouds corresponding to the objects; generating a 3D bounding box for each object according to its pose information and point cloud; judging whether the 3D bounding boxes corresponding to any two objects have an overlapping region; if so, determining that the two objects have a stacking relationship, extracting the point cloud in the overlapping region of the two 3D bounding boxes, and determining the pressed object in the stacking relationship according to that point cloud; if not, determining that the two objects have no stacking relationship. This scheme achieves effective judgment of object stacking, improves the processing efficiency of stacking judgment, and accurately identifies the pressed object among a plurality of objects, thereby improving the accuracy of stacking judgment.

Description

Object stacking judgment method and device based on 3D bounding box and computing equipment
Technical Field
The invention relates to the technical field of computers, and in particular to a 3D bounding box-based object stacking judgment method and apparatus, and to a computing device.
Background
With the development of industrial intelligence, it is becoming more and more common for robots to operate on objects (e.g., industrial parts, boxes, etc.) in place of humans. In robot operation, it is generally necessary to grasp an object and move it from one location to another, such as grasping the object from a conveyor belt and placing it on a pallet or in a cage car, or grasping the object from a pallet and placing it on a conveyor belt or another pallet as required. However, in the prior art the stacking relationship among objects is ignored when selecting the object to be grasped; if a pressed object, one with other objects stacked on top of it, is selected as the target, the objects located above it may fall when the robot grasps it, and are easily damaged.
Disclosure of Invention
In view of the above, the present invention is proposed in order to provide a 3D bounding box based object stack determination method, apparatus and computing device that overcome or at least partially solve the above problems.
According to an aspect of the present invention, there is provided a 3D bounding box-based object stack determination method, including:
acquiring pose information of a plurality of objects in a current scene and point clouds corresponding to the objects;
generating a 3D bounding box corresponding to a plurality of objects according to the pose information of the plurality of objects and the point clouds corresponding to the plurality of objects;
judging whether the 3D bounding boxes corresponding to any two objects have an overlapping region;
if yes, determining that the two objects have a stacking relationship, extracting the point cloud in the overlapping region of the 3D bounding boxes corresponding to the two objects, and determining the pressed object in the stacking relationship according to that point cloud; if not, determining that the two objects have no stacking relationship.
According to another aspect of the present invention, there is provided a 3D bounding box-based object stack determination apparatus, including:
the acquisition module is suitable for acquiring pose information of a plurality of objects in a current scene and the point clouds corresponding to the plurality of objects;
the bounding box generating module is suitable for generating a 3D bounding box corresponding to a plurality of objects according to the pose information of the plurality of objects and the point clouds corresponding to the plurality of objects;
the overlapping judgment module is suitable for judging whether the 3D bounding boxes corresponding to any two objects have overlapping areas or not;
the processing module is suitable for determining that the two objects have a stacking relation if the overlapping judging module judges that the overlapping area exists, extracting point clouds in the overlapping area of the 3D bounding boxes corresponding to the two objects, and determining a pressed object in the stacking relation according to the point clouds in the overlapping area; and if the overlapping judgment module judges that no overlapping area exists, determining that the two objects do not have a stacking relation.
According to yet another aspect of the present invention, there is provided a computing device comprising: a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface communicate with one another through the communication bus;
the memory is used for storing at least one executable instruction, and the executable instruction enables the processor to execute the operation corresponding to the object stacking judgment method based on the 3D bounding box.
According to still another aspect of the present invention, there is provided a computer storage medium having at least one executable instruction stored therein, the executable instruction causing a processor to perform operations corresponding to the 3D bounding box-based object stack determination method as described above.
According to the technical scheme provided by the invention, the 3D bounding box corresponding to each object is generated according to the object's pose information and point cloud. Compared with the object itself, the 3D bounding box has a regular shape and a simple structure, so judging whether the bounding boxes corresponding to any two objects have an overlapping region provides an effective judgment of object stacking, effectively improves the processing efficiency of stacking judgment, and simplifies its processing flow. Furthermore, the pressed object can be accurately determined from the plurality of objects according to the point cloud in the overlapping region of the 3D bounding boxes, which improves the accuracy of stacking judgment and avoids the situation in which a pressed object is mistakenly grasped as the target object and the objects above it fall.
The foregoing description is only an overview of the technical solutions of the present invention. In order that the technical means of the present invention may be understood more clearly, and that the above and other objects, features and advantages of the present invention may become more readily apparent, embodiments of the invention are described below.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
FIG. 1 is a flow chart diagram illustrating a method for determining object stacking based on 3D bounding boxes according to an embodiment of the present invention;
FIG. 2 is a flow chart illustrating a method for determining object stacking based on 3D bounding boxes according to another embodiment of the present invention;
FIG. 3 shows a schematic diagram of a 3D bounding box corresponding to an object;
fig. 4 is a block diagram showing a configuration of a 3D bounding box-based object stack determination apparatus according to an embodiment of the present invention;
FIG. 5 illustrates a schematic structural diagram of a computing device according to an embodiment of the invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
Fig. 1 is a flowchart illustrating a 3D bounding box-based object stack determination method according to an embodiment of the present invention, as shown in fig. 1, the method includes the following steps:
step S101, obtaining pose information of a plurality of objects in a current scene and point clouds corresponding to the objects.
The current scene contains a plurality of objects. A scene image and a depth image of the current scene may be captured by a 3D camera disposed above the scene, for example directly or obliquely above it, configured to simultaneously acquire information about the current scene within its field of view. Specifically, the 3D camera may include a visible-light detector such as a laser detector or an LED, an infrared detector and/or a radar detector, which sense the current scene to obtain the depth image. The scene image may specifically be an RGB image, and its pixels correspond one-to-one to those of the depth image. By processing the scene image and the depth image, the point cloud corresponding to the scene image can be obtained conveniently. The point cloud contains the pose information of each 3D point, which may specifically include the point's coordinate values on the three spatial axes X, Y and Z, its directions along those axes, and other information. By applying a series of processing steps such as instance segmentation and matching to the scene image and its point cloud, the pose information of the plurality of objects in the current scene and the point clouds corresponding to those objects can be obtained. In step S101, the pose information and point clouds obtained by the above processing are acquired.
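As a minimal sketch of how such a depth image is back-projected into a point cloud, assuming a pinhole camera model with illustrative intrinsics fx, fy, cx, cy (the patent does not specify the camera model or calibration):

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    # Back-project a depth image (in meters) into an organized point
    # cloud with a pinhole model; fx, fy, cx, cy are assumed to come
    # from the 3D camera's calibration (hypothetical values below).
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1)  # shape (H, W, 3), one 3D point per pixel

# A flat surface 1 m from the camera yields points with Z = 1.0.
cloud = depth_to_point_cloud(np.ones((4, 4)), fx=500.0, fy=500.0, cx=2.0, cy=2.0)
```

Because the cloud stays organized per pixel, each pixel of the RGB scene image corresponds one-to-one to a 3D point, which is what allows segmentation masks on the scene image to be matched against the point cloud.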
And S102, generating a 3D bounding box corresponding to a plurality of objects according to the pose information of the plurality of objects and the point clouds corresponding to the plurality of objects.
The pose information of an object may specifically include the coordinate values of the object's center on the three spatial axes X, Y and Z, and the object's own orientation along those axes. To perform stacking judgment conveniently and accurately, a 3D bounding box enclosing the corresponding point cloud can be generated for each object; the 3D bounding box may be a cuboid, a cube or another polyhedron. Since the 3D bounding box has a more regular shape and a simpler structure than the object itself, stacking judgment based on 3D bounding boxes achieves faster processing.
Step S103, judging whether the 3D bounding boxes corresponding to any two objects have overlapping areas or not; if yes, go to step S104; if not, go to step S105.
In an actual scene containing multiple objects, the objects are likely to be stacked on one another. To prevent a pressed object from being mistakenly selected as the target object to be grasped, this embodiment identifies the stacking relationship between objects by judging whether the 3D bounding boxes corresponding to any two objects have an overlapping region; the stacking relationship reflects which of any two objects is the pressed object underneath. Specifically, each object has its own coordinate system, and its 3D bounding box is expressed in that coordinate system. For any two objects, the 3D bounding box of one object can be converted into the coordinate system of the other; then, in that common coordinate system, whether the two bounding boxes have an overlapping region can be determined conveniently by judging whether any 3D point in one object's bounding box lies inside the other object's bounding box.
And step S104, determining that the two objects have a stacking relation, extracting point clouds in an overlapping area of the 3D bounding boxes corresponding to the two objects, and determining a pressed object in the stacking relation according to the point clouds in the overlapping area.
If step S103 determines that the 3D bounding boxes corresponding to the two objects have an overlapping region, it is determined that the two objects have a stacking relationship. To identify which of the two is the pressed object underneath, the point cloud in the overlapping region of the two bounding boxes is extracted, and the pressed object in the stacking relationship is determined according to that point cloud. For example, the coordinates of the center of the overlap point cloud are calculated in the coordinate systems of the two objects respectively, and the Z-axis coordinate values are compared to determine which of the two objects is located higher and which lower, thereby determining the pressed object in the stacking relationship.
Step S105, it is determined that there is no stacking relationship between the two objects.
In the case where it is determined in step S103 that there is no overlapping area in the 3D bounding boxes corresponding to the two objects, it is determined that there is no stacking relationship between the two objects.
According to the 3D bounding box-based object stacking judgment method provided by this embodiment, the 3D bounding box corresponding to each object is generated from the object's pose information and point cloud. Compared with the object itself, the 3D bounding box has a regular shape and a simple structure, so judging whether the bounding boxes of any two objects have an overlapping region provides an effective stacking judgment, improves its processing efficiency, and simplifies its processing flow. Moreover, the pressed object can be accurately determined from the plurality of objects according to the point cloud in the overlapping region of the 3D bounding boxes, which improves the accuracy of stacking judgment and avoids the situation in which a pressed object is mistakenly grasped as the target object and the objects above it fall.
Fig. 2 is a flowchart illustrating a 3D bounding box-based object stack determination method according to another embodiment of the present invention, as shown in fig. 2, the method includes the following steps:
step S201, obtaining pose information of a plurality of objects in the current scene and a plurality of point clouds corresponding to the objects.
The method can acquire the pre-processed pose information of the plurality of objects in the current scene and the corresponding point clouds. Specifically, before step S201, a scene image and a depth image of the current scene may be captured by a 3D camera and processed to obtain the point cloud corresponding to the scene image. To determine the point clouds of the individual objects from the scene point cloud, the scene image can be processed by a trained deep-learning segmentation model to obtain an instance segmentation result for each object in the image; the point cloud of each object is then determined by matching the scene point cloud against the segmentation results. Finally, the point cloud of each object is matched against preset template point clouds in a template library to determine the object's pose information. A preset template point cloud is a point cloud, determined in advance, that corresponds to a known object and serves as the matching reference.
Step S202, aiming at each object, calculating the maximum value and the minimum value of the point cloud corresponding to the object in each coordinate axis direction of the pose information of the object.
The point cloud corresponding to an object contains the coordinate values of its 3D points on the three spatial axes X, Y and Z, and the pose information of the object contains the coordinate values of the object's center on those three axes together with the object's own axis directions. For each object, the maximum and minimum values of its point cloud on each of the three axes are calculated. Specifically, the maximum and minimum values of the object's point cloud on the X axis may be denoted maxX and minX; on the Y axis, maxY and minY; and on the Z axis, maxZ and minZ.
Step S203, a 3D bounding box corresponding to the object is generated according to the maximum value and the minimum value corresponding to each coordinate axis direction.
The 3D bounding box corresponding to the object is constructed in the object's own coordinate system, with its length, width and height parallel to the X, Y and Z axes of the object's pose respectively. The start and end positions of the length, width and height are set according to the corresponding minimum and maximum values on the three axes, thereby generating the 3D bounding box corresponding to the object.
Fig. 3 is a schematic diagram of the 3D bounding box corresponding to an object. As shown in Fig. 3, with the object's center (i.e., the origin O) as the center of the box, the start and end positions of the box along the X axis are set according to the minimum value minX and maximum value maxX of the object's point cloud on the X axis; that is, the length of the box spans the interval (minX, maxX). Similarly, the width of the box spans the interval (minY, maxY) set from the point cloud's minimum and maximum values on the Y axis, and the height spans the interval (minZ, maxZ) set from its minimum and maximum values on the Z axis.
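Steps S202 and S203 can be sketched together as follows; the NumPy representation and the function name are illustrative assumptions, not the patent's implementation:

```python
import numpy as np

def bounding_box(points):
    # Steps S202/S203: the axis-aligned 3D bounding box of an object's
    # point cloud in the object's own coordinate frame, returned as the
    # corners (minX, minY, minZ) and (maxX, maxY, maxZ).
    pts = np.asarray(points, dtype=float)
    return pts.min(axis=0), pts.max(axis=0)

lo, hi = bounding_box([[0, 0, 0], [2, 1, 3], [1, 0.5, 2]])
```

The box's length, width and height then correspond to the intervals (minX, maxX), (minY, maxY) and (minZ, maxZ) described above.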
Optionally, after the 3D bounding box is generated from the maximum and minimum values on each coordinate axis, it may be adjusted appropriately; for example, the box may be expanded according to a preset expansion parameter. The preset expansion parameter can be set by a person skilled in the art according to actual needs and is not limited here. For example, with a preset expansion parameter of 1.1, the length, width and height of the box are each expanded 1.1 times; since the box is centered at the origin, the expanded length, width and height correspond to the intervals (1.1·minX, 1.1·maxX), (1.1·minY, 1.1·maxY) and (1.1·minZ, 1.1·maxZ).
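The optional expansion step might be sketched as below. The `expand_box` name is hypothetical, and this sketch dilates the box about its center; the patent scales the interval endpoints directly, which is equivalent here because the box is centered at the origin O:

```python
def expand_box(lo, hi, factor=1.1):
    # Dilate a bounding box about its center by a preset expansion
    # parameter (1.1 in the patent's example). Scaling about the center
    # also generalizes to boxes not centered at the origin.
    center = [(a + b) / 2 for a, b in zip(lo, hi)]
    half = [(b - a) / 2 * factor for a, b in zip(lo, hi)]
    return ([c - h for c, h in zip(center, half)],
            [c + h for c, h in zip(center, half)])

lo, hi = expand_box([-1, -1, -1], [1, 1, 1], factor=1.1)
```

A slightly enlarged box makes the subsequent overlap test more tolerant of point-cloud noise at object boundaries.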
Step S204, judging whether the 3D bounding boxes corresponding to any two objects have overlapping areas or not; if yes, go to step S205; if not, go to step S207.
To distinguish the two objects in any pair, this embodiment refers to one of them as the first object and the other as the second object, and calls their coordinate systems the first coordinate system and the second coordinate system respectively. To judge conveniently whether the two objects' 3D bounding boxes have an overlapping region, a plurality of 3D points may be generated inside the first object's bounding box at preset intervals, so that the box is filled with 3D points, and the coordinates of these points are then converted into the second coordinate system of the second object. This embodiment is suitable for stacking judgment between objects of the same size; specifically, the first and second objects have identical dimensions, and during the coordinate conversion the first object's bounding box can be converted so as to lie on the same height plane as the second object's bounding box.
After the coordinate conversion is completed, it is judged whether any of the converted points lies inside the second object's bounding box. Specifically, for each converted 3D point with coordinates (x, y, z), given that the length, width and height of the second object's bounding box correspond to the intervals (minX, maxX), (minY, maxY) and (minZ, maxZ) respectively, it is judged whether (x, y, z) falls within the space formed by those three intervals. If so, the 3D point lies inside the second object's bounding box; if not, it lies outside.
If some 3D points are judged to lie inside the second object's bounding box, the two bounding boxes overlap in position; it is then determined that they have an overlapping region, the points inside the second object's bounding box are taken as the 3D points of the overlapping region, and all of them are collected to obtain the point cloud in the overlapping region. If no converted 3D point lies inside the second object's bounding box, the first object's bounding box does not overlap the second object's in position, and it is determined that the two bounding boxes have no overlapping region.
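The overlap test of step S204 (filling the first object's bounding box with 3D points at a preset interval, converting them into the second coordinate system, and testing membership in the second box's intervals) might be sketched as follows; the 4x4 homogeneous transform `T_2_from_1`, the grid step and all names are illustrative assumptions:

```python
import numpy as np

def box_overlap_points(lo1, hi1, T_2_from_1, lo2, hi2, step=0.01):
    # Fill the first object's box with a regular grid of 3D points.
    axes = [np.arange(a, b + 1e-9, step) for a, b in zip(lo1, hi1)]
    grid = np.stack(np.meshgrid(*axes, indexing="ij"), -1).reshape(-1, 3)
    # Convert the points into the second object's coordinate system.
    homog = np.hstack([grid, np.ones((len(grid), 1))])
    pts2 = (T_2_from_1 @ homog.T).T[:, :3]
    # Keep the points falling inside the second box's XYZ intervals:
    # these form the point cloud of the overlapping region (if any).
    inside = np.all((pts2 >= lo2) & (pts2 <= hi2), axis=1)
    return pts2[inside]

# Two unit boxes whose frames are offset by 0.5 m along X overlap in a slab.
T = np.eye(4)
T[0, 3] = -0.5
overlap = box_overlap_points([0, 0, 0], [1, 1, 1], T, [0, 0, 0], [1, 1, 1], step=0.25)
```

An empty result means the boxes do not overlap and the two objects have no stacking relationship.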
Step S205, determining that the two objects have a stacking relationship, extracting point clouds in an overlapping region of the 3D bounding boxes corresponding to the two objects, and calculating a first coordinate of a point cloud center of the point clouds in the overlapping region in a first coordinate system of the first object and a second coordinate of the point cloud center in a second coordinate system of the second object.
If step S204 determines that the 3D bounding boxes corresponding to the two objects have an overlapping region, indicating that one object is pressed above the other, it is determined that the two objects have a stacking relationship. To determine which of the two is the pressed object underneath, the point cloud in the overlapping region of the two bounding boxes is extracted, and the first coordinate (x1, y1, z1) of its point-cloud center in the first coordinate system of the first object and the second coordinate (x2, y2, z2) of that center in the second coordinate system of the second object are calculated respectively.
And step S206, determining the pressed object in the stacking relation according to the first coordinate and the second coordinate.
If the first object and the second object are the same size, their 3D bounding boxes are also the same size. Since the overlapping region is part or all of the second object's bounding box in the second coordinate system, the height of the overlapping region reflects the height of the second object's bounding box. Taking the Z-axis direction shown in Fig. 3 as an example, step S206 may judge whether the Z-axis coordinate value of the first coordinate is smaller than that of the second coordinate. If so, the overlapping region lies below the first object's bounding box, i.e., the second object's bounding box is below the first object's, and the second object is determined to be the pressed object in the stacking relationship. If not, the overlapping region lies above the first object's bounding box, i.e., the second object's bounding box is above the first object's, and the first object is determined to be the pressed object in the stacking relationship.
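The decision rule of step S206 reduces to a single comparison; a minimal sketch (the function name is hypothetical, and the Z axis is assumed to point upward as in Fig. 3):

```python
def pressed_object(z1, z2):
    # z1: Z of the overlap point-cloud center in the first object's frame;
    # z2: the same center's Z in the second object's frame.
    # Per step S206, z1 < z2 means the overlapping region, and hence the
    # second object's bounding box, lies below the first object's box.
    return "second" if z1 < z2 else "first"
```

The comparison works because the same physical point has a low Z in the frame of the object sitting higher and a high Z in the frame of the object pressed underneath.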
Step S207, it is determined that there is no stacking relationship between the two objects.
If step S204 determines that the 3D bounding boxes corresponding to the two objects have no overlapping region, neither object is pressed on top of the other, so it is determined that no stacking relationship exists between them.
Step S208: determining a target object from the plurality of objects, converting the pose information of the target object into the robot coordinate system, and transmitting the converted pose information of the target object to the robot.
Steps S201 to S207 make it straightforward to judge whether any two of the objects in the current scene have a stacking relationship. Because a pressed object is not suitable as the current grasping target, once the stacking relationships among the objects have been identified, the pressed objects can be screened out and, from the remaining objects, the one closest to the plane of the camera selected as the target object. Since the pose information of each object is determined in the camera coordinate system, the pose information of the target object must be converted into the robot coordinate system with a preset conversion algorithm so that the robot can locate the target; the converted pose information is then transmitted to the robot, which performs the grasping operation on the target object accordingly.
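The screening and hand-off of step S208 might be sketched as below, assuming poses are stored as 4x4 matrices and that a smaller camera-frame Z translation means closer to the camera plane (both are assumptions; all identifiers are illustrative):

```python
import numpy as np

def pick_target(poses_cam, stacking_pairs, cam_to_robot):
    # `poses_cam`: object id -> 4x4 pose in the camera frame.
    # `stacking_pairs`: (upper_id, pressed_id) pairs from steps S201-S207.
    # Pressed objects are screened out; among the rest, the object with the
    # smallest camera-frame Z translation becomes the target.
    pressed = {pressed_id for _, pressed_id in stacking_pairs}
    candidates = {i: T for i, T in poses_cam.items() if i not in pressed}
    target_id = min(candidates, key=lambda i: candidates[i][2, 3])
    pose_robot = cam_to_robot @ candidates[target_id]  # camera -> robot frame
    return target_id, pose_robot

def pose_at_z(z):
    T = np.eye(4)
    T[2, 3] = z
    return T

# Object 0 presses object 2, so object 2 is excluded even though it is closest.
poses = {0: pose_at_z(0.5), 1: pose_at_z(0.6), 2: pose_at_z(0.4)}
target_id, pose_robot = pick_target(poses, {(0, 2)}, np.eye(4))
```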
According to the 3D bounding box-based object stacking judgment method provided by this embodiment, converting the coordinates of the 3D points generated in the first object's bounding box into the second coordinate system of the second object makes it easy to determine whether the two objects' 3D bounding boxes overlap. If an overlapping region exists, the two objects are judged to have a stacking relationship, and comparing the coordinates of the point cloud center of the overlapping region in each object's own coordinate system conveniently identifies the pressed object. Object stacking is thus judged efficiently and accurately, the situation where a pressed object is mistakenly grasped as the target and the object above it falls is avoided, and the robot's grasping errors are reduced.
Fig. 4 is a block diagram of a 3D bounding box-based object stacking judgment apparatus according to an embodiment of the present invention. As shown in fig. 4, the apparatus includes: an acquisition module 401, a bounding box generation module 402, an overlap determination module 403, and a processing module 404.
The acquisition module 401 is adapted to acquire pose information of a plurality of objects in the current scene and the point clouds corresponding to the plurality of objects.
The bounding box generation module 402 is adapted to generate the 3D bounding boxes corresponding to the plurality of objects according to the pose information of the plurality of objects and the point clouds corresponding to the plurality of objects.
The overlap determination module 403 is adapted to judge whether the 3D bounding boxes corresponding to any two objects have an overlapping region.
The processing module 404 is adapted to: if the overlap determination module 403 determines that an overlapping region exists, determine that the two objects have a stacking relationship, extract the point cloud within the overlapping region of the two objects' 3D bounding boxes, and determine the pressed object in the stacking relationship according to that point cloud; if the overlap determination module 403 determines that no overlapping region exists, determine that the two objects have no stacking relationship.
Optionally, the bounding box generation module 402 is further adapted to: for each object, calculate the maximum and minimum values of the object's point cloud along each coordinate axis direction of the object's pose information, and generate the 3D bounding box corresponding to the object according to the maximum and minimum values in each coordinate axis direction.
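A minimal sketch of this bounding box construction, assuming a 4x4 object-to-world pose; the `margin` parameter plays the role of the preset expansion parameters described in the next paragraph (all names are hypothetical):

```python
import numpy as np

def object_bounding_box(points_world, pose, margin=0.0):
    # Map the cloud into the object's own frame (inverse of its assumed
    # 4x4 object-to-world pose), then take the min/max along each coordinate
    # axis; `margin` optionally expands the box on every side.
    pts_h = np.c_[points_world, np.ones(len(points_world))]
    pts_obj = (np.linalg.inv(pose) @ pts_h.T).T[:, :3]
    return pts_obj.min(axis=0) - margin, pts_obj.max(axis=0) + margin

pts = np.array([[0.0, 0.0, 0.0], [1.0, 2.0, 3.0], [0.5, 1.0, 1.5]])
box_min, box_max = object_bounding_box(pts, np.eye(4), margin=0.1)
```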
Optionally, the bounding box generation module 402 is further adapted to expand the 3D bounding box corresponding to the object according to preset expansion parameters.
Optionally, the any two objects include a first object and a second object, and the overlap determination module 403 is further adapted to: generate a plurality of 3D points within the 3D bounding box corresponding to the first object at a preset interval, and convert the coordinates of these 3D points into the second coordinate system of the second object; judge whether any of the 3D points lies within the 3D bounding box corresponding to the second object; if so, determine that the two objects' 3D bounding boxes have an overlapping region and take those 3D points as the 3D points within the overlapping region; if not, determine that the two objects' 3D bounding boxes have no overlapping region.
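The grid-sampling overlap test could look like the following sketch, where `step` stands in for the preset interval and each box is given by its min/max corners in its own object frame (all identifiers are assumptions):

```python
import numpy as np

def boxes_overlap(box1_min, box1_max, pose1, box2_min, box2_max, pose2,
                  step=0.25):
    # Generate a grid of 3D points inside the first box at a preset interval
    # (`step`), transform them first-frame -> world -> second-frame, and keep
    # those that land inside the second box.
    axes = [np.arange(lo, hi + 1e-9, step)
            for lo, hi in zip(box1_min, box1_max)]
    grid = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1).reshape(-1, 3)
    pts_h = np.c_[grid, np.ones(len(grid))]
    pts2 = (np.linalg.inv(pose2) @ pose1 @ pts_h.T).T[:, :3]
    inside = np.all((pts2 >= box2_min) & (pts2 <= box2_max), axis=1)
    return bool(inside.any()), pts2[inside]

unit_min, unit_max = np.zeros(3), np.ones(3)
shifted = np.eye(4); shifted[0, 3] = 0.5   # second box shifted 0.5 m along X
far = np.eye(4); far[0, 3] = 2.0           # second box entirely clear

overlaps, _ = boxes_overlap(unit_min, unit_max, np.eye(4),
                            unit_min, unit_max, shifted)
no_overlap, _ = boxes_overlap(unit_min, unit_max, np.eye(4),
                              unit_min, unit_max, far)
```

The surviving points in `pts2` are exactly the "3D points in the overlapping region" that the later centroid comparison operates on.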
Optionally, the first object and the second object are objects of the same size, and the processing module 404 is further adapted to: calculate a first coordinate of the point cloud center of the point cloud within the overlapping region in the first coordinate system of the first object and a second coordinate in the second coordinate system of the second object, and determine the pressed object in the stacking relationship according to the first coordinate and the second coordinate.
Optionally, the processing module 404 is further adapted to: judge whether the Z-axis coordinate value of the first coordinate is smaller than that of the second coordinate; if so, determine the second object as the pressed object in the stacking relationship; if not, determine the first object as the pressed object in the stacking relationship.
According to the 3D bounding box-based object stacking judgment apparatus provided by this embodiment, converting the coordinates of the 3D points generated in the first object's bounding box into the second coordinate system of the second object makes it easy to determine whether the two objects' 3D bounding boxes overlap. If an overlapping region exists, the two objects are judged to have a stacking relationship, and comparing the coordinates of the point cloud center of the overlapping region in each object's own coordinate system conveniently identifies the pressed object. Object stacking is thus judged efficiently and accurately, the situation where a pressed object is mistakenly grasped as the target and the object above it falls is avoided, and the robot's grasping errors are reduced.
The invention further provides a nonvolatile computer storage medium storing at least one executable instruction that causes the 3D bounding box-based object stacking judgment method in any of the above method embodiments to be executed.
Fig. 5 is a schematic structural diagram of a computing device according to an embodiment of the present invention, and the specific embodiment of the present invention does not limit the specific implementation of the computing device.
As shown in fig. 5, the computing device may include: a processor (processor)502, a Communications Interface 504, a memory 506, and a communication bus 508.
Wherein:
the processor 502, communication interface 504, and memory 506 communicate with one another via a communication bus 508.
The communication interface 504 is used to communicate with network elements of other devices, such as clients or other servers.
The processor 502 is configured to execute the program 510, and may specifically execute the relevant steps in the embodiment of the object stack determination method based on the 3D bounding box.
In particular, program 510 may include program code that includes computer operating instructions.
The processor 502 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement an embodiment of the present invention. The computing device includes one or more processors, which may be of the same type, such as one or more CPUs, or of different types, such as one or more CPUs and one or more ASICs.
The memory 506 is used to store a program 510. The memory 506 may comprise high-speed RAM memory and may also include non-volatile memory, such as at least one disk memory.
The program 510 may be specifically configured to cause the processor 502 to execute the 3D bounding box-based object stack determination method in any of the above-described method embodiments. For specific implementation of each step in the program 510, reference may be made to corresponding steps and corresponding descriptions in units in the foregoing embodiment for determining object stacking based on a 3D bounding box, which are not described herein again. It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described devices and modules may refer to the corresponding process descriptions in the foregoing method embodiments, and are not described herein again.
The algorithms and displays presented herein are not inherently related to any particular computer, virtual machine, or other apparatus. Various general purpose systems may also be used with the teachings herein. The required structure for constructing such a system will be apparent from the description above. Moreover, the present invention is not directed to any particular programming language. It is appreciated that a variety of programming languages may be used to implement the teachings of the present invention as described herein, and any descriptions of specific languages are provided above to disclose the best mode of the invention.
In the description provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be interpreted as reflecting an intention that: that the invention as claimed requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
Those skilled in the art will appreciate that the modules in the device in an embodiment may be adaptively changed and disposed in one or more devices different from the embodiment. The modules or units or components of the embodiments may be combined into one module or unit or component, and furthermore they may be divided into a plurality of sub-modules or sub-units or sub-components. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Furthermore, those skilled in the art will appreciate that while some embodiments described herein include some features included in other embodiments, rather than other features, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the claims, any of the claimed embodiments may be used in any combination.
The various component embodiments of the invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will appreciate that a microprocessor or Digital Signal Processor (DSP) may be used in practice to implement some or all of the functionality of some or all of the components in accordance with embodiments of the present invention. The present invention may also be embodied as apparatus or device programs (e.g., computer programs and computer program products) for performing a portion or all of the methods described herein. Such programs implementing the present invention may be stored on computer-readable media or may be in the form of one or more signals. Such a signal may be downloaded from an internet website or provided on a carrier signal or in any other form.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The usage of the words first, second and third, etcetera do not indicate any ordering. These words may be interpreted as names.

Claims (14)

1. A method for determining object stacking based on 3D bounding boxes, the method comprising:
acquiring pose information of a plurality of objects in a current scene and point clouds corresponding to the objects;
generating a 3D bounding box corresponding to a plurality of objects according to the pose information of the plurality of objects and the point clouds corresponding to the plurality of objects;
judging whether the 3D bounding boxes corresponding to any two objects have overlapping areas or not;
if yes, determining that the two objects have a stacking relation, extracting point clouds in an overlapping area of the 3D bounding boxes corresponding to the two objects, and determining a pressed object in the stacking relation according to the point clouds in the overlapping area; if not, determining that the two objects do not have a stacking relation.
2. The method of claim 1, wherein the generating a plurality of object-corresponding 3D bounding boxes from pose information of a plurality of objects and a plurality of object-corresponding point clouds further comprises:
for each object, calculating a maximum value and a minimum value of the point cloud corresponding to the object in each coordinate axis direction of the pose information of the object;
and generating a 3D bounding box corresponding to the object according to the maximum value and the minimum value corresponding to each coordinate axis direction.
3. The method of claim 2, wherein after generating the 3D bounding box corresponding to the object according to the maximum and minimum values corresponding to the coordinate axis directions, the method further comprises:
and performing expansion processing on the 3D bounding box corresponding to the object according to preset expansion parameters.
4. The method of any of claims 1-3, wherein the any two objects comprise: a first object and a second object;
the judging whether the 3D bounding boxes corresponding to any two objects have overlapping areas further comprises:
generating a plurality of 3D points in the 3D bounding box corresponding to the first object according to a preset interval, and converting the coordinates of the plurality of 3D points in the 3D bounding box corresponding to the first object into a second coordinate system of the second object;
judging whether a 3D point located in a 3D bounding box corresponding to the second object exists or not;
if so, determining that the 3D bounding boxes corresponding to the two objects have an overlapping region, and taking the 3D point as a 3D point in the overlapping region of the 3D bounding boxes corresponding to the two objects;
if not, determining that the 3D bounding boxes corresponding to the two objects do not have an overlapping area.
5. The method of claim 4, wherein the first object and the second object are identically sized objects;
the determining a pressed object in the stacking relationship from the point cloud within the overlap region further comprises:
calculating a first coordinate of a point cloud center of the point cloud within the overlap region under a first coordinate system of the first object and a second coordinate under a second coordinate system of the second object;
and determining the pressed object in the stacking relation according to the first coordinate and the second coordinate.
6. The method of claim 5, wherein the determining the pressed object in the stacked relationship from the first and second coordinates further comprises:
judging whether the Z-axis coordinate value in the first coordinate is smaller than the Z-axis coordinate value in the second coordinate;
if yes, determining the second object as a pressed object in the stacking relation; if not, determining the first object as the pressed object in the stacking relation.
7. An object stack determination apparatus based on a 3D bounding box, the apparatus comprising:
the acquisition module is suitable for acquiring pose information of a plurality of objects in a current scene and a plurality of point clouds corresponding to the objects;
the bounding box generating module is suitable for generating a 3D bounding box corresponding to a plurality of objects according to the pose information of the plurality of objects and the point clouds corresponding to the plurality of objects;
the overlapping judgment module is suitable for judging whether the 3D bounding boxes corresponding to any two objects have overlapping areas or not;
the processing module is suitable for determining that the two objects have a stacking relation if the overlapping judging module judges that an overlapping area exists, extracting point clouds in the overlapping area of the 3D bounding boxes corresponding to the two objects, and determining a pressed object in the stacking relation according to the point clouds in the overlapping area; and if the overlapping judging module judges that no overlapping area exists, determining that the two objects do not have a stacking relation.
8. The apparatus of claim 7, wherein the bounding box generation module is further adapted to:
for each object, calculating a maximum value and a minimum value of the point cloud corresponding to the object in each coordinate axis direction of the pose information of the object;
and generating a 3D bounding box corresponding to the object according to the maximum value and the minimum value corresponding to each coordinate axis direction.
9. The apparatus of claim 8, wherein the bounding box generation module is further adapted to:
and performing expansion processing on the 3D bounding box corresponding to the object according to preset expansion parameters.
10. The apparatus of any of claims 7-9, wherein the any two objects comprise: a first object and a second object;
the overlap determination module is further adapted to:
generating a plurality of 3D points in the 3D bounding box corresponding to the first object according to a preset interval, and converting the coordinates of the plurality of 3D points in the 3D bounding box corresponding to the first object into a second coordinate system of the second object;
judging whether a 3D point located in a 3D bounding box corresponding to the second object exists or not;
if so, determining that the 3D bounding boxes corresponding to the two objects have an overlapping region, and taking the 3D point as a 3D point in the overlapping region of the 3D bounding boxes corresponding to the two objects;
if not, determining that the 3D bounding boxes corresponding to the two objects do not have an overlapping area.
11. The apparatus of claim 10, wherein the first object and the second object are identically sized objects;
the processing module is further adapted to:
calculating a first coordinate of a point cloud center of the point cloud within the overlap region under a first coordinate system of the first object and a second coordinate under a second coordinate system of the second object;
and determining the pressed object in the stacking relation according to the first coordinate and the second coordinate.
12. The apparatus of claim 11, wherein the processing module is further adapted to:
judging whether the Z-axis coordinate value in the first coordinate is smaller than the Z-axis coordinate value in the second coordinate;
if yes, determining the second object as a pressed object in the stacking relation; if not, determining the first object as the pressed object in the stacking relation.
13. A computing device, comprising: the system comprises a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface complete mutual communication through the communication bus;
the memory is used for storing at least one executable instruction, and the executable instruction causes the processor to execute the operation corresponding to the 3D bounding box based object stacking judgment method according to any one of claims 1 to 6.
14. A computer storage medium having stored therein at least one executable instruction for causing a processor to perform operations corresponding to the 3D bounding box based object stack determination method as claimed in any one of claims 1 to 6.
CN202110217390.8A 2021-02-26 2021-02-26 Object stacking judgment method and device based on 3D bounding box and computing equipment Pending CN112837370A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110217390.8A CN112837370A (en) 2021-02-26 2021-02-26 Object stacking judgment method and device based on 3D bounding box and computing equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110217390.8A CN112837370A (en) 2021-02-26 2021-02-26 Object stacking judgment method and device based on 3D bounding box and computing equipment

Publications (1)

Publication Number Publication Date
CN112837370A true CN112837370A (en) 2021-05-25

Family

ID=75933680

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110217390.8A Pending CN112837370A (en) 2021-02-26 2021-02-26 Object stacking judgment method and device based on 3D bounding box and computing equipment

Country Status (1)

Country Link
CN (1) CN112837370A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113284129A (en) * 2021-06-11 2021-08-20 梅卡曼德(北京)机器人科技有限公司 Box pressing detection method and device based on 3D bounding box

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105469406A (en) * 2015-11-30 2016-04-06 东北大学 Bounding box and space partitioning-based virtual object collision detection method
US20160163067A1 (en) * 2014-12-05 2016-06-09 Symbol Technologies, Inc. Apparatus for and method of estimating dimensions of an object associated with a code in automatic response to reading the code
US20180286111A1 (en) * 2017-03-30 2018-10-04 Kabushiki Kaisha Square Enix (Also Trading As Square Enix Co., Ltd.) Cross determining program, cross determining method, and cross determining apparatus
CN110377776A (en) * 2018-07-23 2019-10-25 北京京东尚科信息技术有限公司 The method and apparatus for generating point cloud data
CN111080653A (en) * 2019-11-06 2020-04-28 广西大学 Method for simplifying multi-view point cloud by using region segmentation and grouping random simplification method
CN111754515A (en) * 2019-12-17 2020-10-09 北京京东尚科信息技术有限公司 Method and device for sequential gripping of stacked articles
CN112292689A (en) * 2019-12-23 2021-01-29 商汤国际私人有限公司 Sample image acquisition method and device and electronic equipment

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
DANFEI XU ET AL.: "PointFusion: Deep Sensor Fusion for 3D Bounding Box Estimation", 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 135-140 *
ZHANG XINYI: "Research on Grasping Methods for Unknown Stacked Objects Based on 3D Reconstruction and Relational Reasoning", China Master's Theses Full-text Database, Information Science and Technology *
ZHENG LIUPO: "Object Recognition and Localization Technology for Robot Grasping", China Master's Theses Full-text Database, Information Science and Technology, pages 27-58 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113284129A (en) * 2021-06-11 2021-08-20 梅卡曼德(北京)机器人科技有限公司 Box pressing detection method and device based on 3D bounding box
CN113284129B (en) * 2021-06-11 2024-06-18 梅卡曼德(北京)机器人科技有限公司 3D bounding box-based press box detection method and device

Similar Documents

Publication Publication Date Title
CN112837371B (en) Object grabbing method and device based on 3D matching and computing equipment
JP7433609B2 (en) Method and computational system for object identification
JP6529302B2 (en) INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND PROGRAM
US9576363B2 (en) Object picking system, object detecting device, object detecting method
CN105217324A (en) A kind of novel de-stacking method and system
CN113284178B (en) Object stacking method, device, computing equipment and computer storage medium
JP4766269B2 (en) Object detection method, object detection apparatus, and robot equipped with the same
US11403764B2 (en) Method and computing system for processing candidate edges
CN114783068A (en) Gesture recognition method, gesture recognition device, electronic device and storage medium
CN114310892B (en) Object grabbing method, device and equipment based on point cloud data collision detection
CN111369611B (en) Image pixel depth value optimization method, device, equipment and storage medium thereof
JP2010210511A (en) Recognition device of three-dimensional positions and attitudes of objects, and method for the same
CN112837370A (en) Object stacking judgment method and device based on 3D bounding box and computing equipment
CN115249324A (en) Method and device for determining position to be stacked in stack shape and computing equipment
CN116228854B (en) Automatic parcel sorting method based on deep learning
CN113284129B (en) 3D bounding box-based press box detection method and device
US20220230459A1 (en) Object recognition device and object recognition method
CN114972495A (en) Grabbing method and device for object with pure plane structure and computing equipment
US11900652B2 (en) Method and computing system for generating a safety volume list for object detection
JP7161857B2 (en) Information processing device, information processing method, and program
JP6512852B2 (en) Information processing apparatus, information processing method
WO2022137509A1 (en) Object recognition device, object recognition method, non-transitory computer-readable medium, and object recognition system
WO2023140266A1 (en) Picking device and image generation program
JP5757157B2 (en) Method, apparatus and program for calculating head position and axis direction for detection object
McAtee et al. Simulation scan comparison for process monitoring using 3D scanning in manufacturing environments

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information
CB02 Change of applicant information

Address after: Room 1100, 1st Floor, No. 6 Chuangye Road, Shangdi Information Industry Base, Haidian District, Beijing 100085

Applicant after: MECH-MIND (BEIJING) ROBOTICS TECHNOLOGIES CO.,LTD.

Address before: 100085 1001, floor 1, building 3, No.8 Chuangye Road, Haidian District, Beijing

Applicant before: MECH-MIND (BEIJING) ROBOTICS TECHNOLOGIES CO.,LTD.