CN113160324A

CN113160324A - Bounding box generation method and device, electronic equipment and computer readable medium

Info

Publication number: CN113160324A
Application number: CN202110348075.9A
Authority: CN
Inventors: 沈蕾
Original assignee: Beijing Jingdong Qianshi Technology Co Ltd
Current assignee: Beijing Jingdong Qianshi Technology Co Ltd
Priority date: 2021-03-31
Filing date: 2021-03-31
Publication date: 2021-07-23
Anticipated expiration: 2041-03-31
Also published as: CN113160324B

Abstract

Embodiments of the present disclosure disclose a bounding box generation method, a bounding box generation apparatus, an electronic device, and a medium. One embodiment of the method comprises: in response to detecting that items related to a target image exist, determining at least one bounding box on the target image, wherein each bounding box in the at least one bounding box represents position information of one of the items related to the target image; generating a point cloud set related to each bounding box in the at least one bounding box to obtain a point cloud set group; determining, in each point cloud set in the point cloud set group, at least one point cloud related to the corresponding item as a first target point cloud set to obtain a first target point cloud set group; and determining the bounding box associated with each first target point cloud set in the first target point cloud set group as a target bounding box to obtain a target bounding box set. This embodiment can accurately and effectively generate bounding boxes that are more closely related to each item.

Description

Bounding box generation method and device, electronic equipment and computer readable medium

Technical Field

The embodiment of the disclosure relates to the technical field of computers, in particular to a bounding box generation method and device, electronic equipment and a computer readable medium.

Background

Currently, there are many tasks involved in vision-based object recognition and positioning, such as robot arm unstacking, robot arm in-box picking, robot arm line assembly, vision-based navigation and positioning, and so on. These tasks often require predicting the target bounding box of the item of interest for subsequent tasks.

The target bounding box is generally generated as follows: the target image is input into a pre-trained deep learning network, which outputs the relevant bounding box information. However, generating the bounding box in this manner often has the following technical problem:

the bounding box output by the deep network often cannot locate the task-related item accurately and effectively, so the task is completed less efficiently.

Disclosure of Invention

This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.

Some embodiments of the present disclosure propose a bounding box generation method, apparatus, electronic device and computer readable medium to solve one or more of the technical problems mentioned in the background section above.

In a first aspect, some embodiments of the present disclosure provide a bounding box generation method, including: in response to detecting that items related to a target image exist, determining at least one bounding box on the target image, wherein each bounding box in the at least one bounding box represents position information of one of the items related to the target image; generating a point cloud set related to each bounding box in the at least one bounding box to obtain a point cloud set group; determining, in each point cloud set in the point cloud set group, at least one point cloud related to the corresponding item as a first target point cloud set to obtain a first target point cloud set group; and determining the bounding box associated with each first target point cloud set in the first target point cloud set group as a target bounding box to obtain a target bounding box set.

Optionally, the generating a point cloud set related to each bounding box in the at least one bounding box includes: generating the point cloud set related to each bounding box in the at least one bounding box by using an extrinsic parameter matrix of a camera.

Optionally, the determining at least one point cloud in each point cloud set in the point cloud set group, which is related to the corresponding item, as a first target point cloud set includes: determining the center point of the bounding box related to the point cloud set; selecting at least one point cloud in the point cloud set adjacent to the center point as a second target point cloud set; and generating the first target point cloud set according to the second target point cloud set.

Optionally, the selecting at least one point cloud in the point cloud set adjacent to the central point as a second target point cloud set includes: constructing a tree model related to the point cloud set; and selecting at least one point cloud in the point cloud set adjacent to the central point as the second target point cloud set according to the tree model.

Optionally, the method further includes: adding the related information of the center point to a first list.

Optionally, the generating the first target point cloud set according to the second target point cloud set includes: for each center point associated with the first list, performing the following first target point cloud set generation steps: in response to traversed second target point clouds existing in the second target point cloud set, removing the traversed second target point clouds from the second target point cloud set to obtain a third target point cloud set; in response to determining that a third target point cloud satisfying a target condition set exists in the third target point cloud set, screening at least one point cloud satisfying the target condition set from the third target point cloud set as a fourth target point cloud set, wherein the target condition set includes: the included angle between the normals of the point cloud and the center point is smaller than a first threshold, and the curvature between the point cloud and the center point is smaller than a second threshold; adding the related information of each fourth target point cloud in the fourth target point cloud set to the first list and to a second list; removing the related information of the center point from the first list; and in response to every second target point cloud in the second target point cloud set being a traversed point cloud and/or the first list having no associated point cloud, determining the point cloud set corresponding to the second list as the first target point cloud set.

Optionally, the method further includes: in response to the first list having at least one associated point cloud, determining each of the at least one point cloud associated with the first list as a center point, and continuing to perform the first target point cloud set generation steps.

Optionally, following the removal step that yields the third target point cloud set, the method further includes: in response to no traversed second target point cloud existing in the second target point cloud set, determining the second target point cloud set as the third target point cloud set.

Optionally, the method further includes: controlling the corresponding mechanical equipment to carry the items according to the target bounding box set.

In a second aspect, some embodiments of the present disclosure provide a bounding box generation apparatus, including: a first determining unit configured to determine at least one bounding box on a target image in response to detecting that items related to the target image exist, wherein each bounding box in the at least one bounding box represents position information of one of the items related to the target image; a generating unit configured to generate a point cloud set related to each bounding box in the at least one bounding box to obtain a point cloud set group; a second determining unit configured to determine, in each point cloud set in the point cloud set group, at least one point cloud related to the corresponding item as a first target point cloud set to obtain a first target point cloud set group; and a third determining unit configured to determine the bounding box associated with each first target point cloud set in the first target point cloud set group as a target bounding box to obtain a target bounding box set.

Optionally, the generating unit is further configured to: generate the point cloud set related to each bounding box in the at least one bounding box by using an extrinsic parameter matrix of a camera.

Optionally, the second determining unit is further configured to: determine the center point of the bounding box related to the point cloud set; select at least one point cloud in the point cloud set adjacent to the center point as a second target point cloud set; and generate the first target point cloud set according to the second target point cloud set.

Optionally, the second determination unit is further configured to: constructing a tree model related to the point cloud set; and selecting at least one point cloud in the point cloud set adjacent to the central point as the second target point cloud set according to the tree model.

Optionally, the second determining unit is further configured to: add the related information of the center point to a first list.

Optionally, the second determining unit is further configured to: for each center point associated with the first list, perform the following first target point cloud set generation steps: in response to traversed second target point clouds existing in the second target point cloud set, removing the traversed second target point clouds from the second target point cloud set to obtain a third target point cloud set; in response to determining that a third target point cloud satisfying a target condition set exists in the third target point cloud set, screening at least one point cloud satisfying the target condition set from the third target point cloud set as a fourth target point cloud set, wherein the target condition set includes: the included angle between the normals of the point cloud and the center point is smaller than a first threshold, and the curvature between the point cloud and the center point is smaller than a second threshold; adding the related information of each fourth target point cloud in the fourth target point cloud set to the first list and to a second list; removing the related information of the center point from the first list; and in response to every second target point cloud in the second target point cloud set being a traversed point cloud and/or the first list having no associated point cloud, determining the point cloud set corresponding to the second list as the first target point cloud set.

Optionally, the apparatus is further configured to: control the corresponding mechanical equipment to carry the items according to the target bounding box set.

In a third aspect, some embodiments of the present disclosure provide an electronic device, comprising: one or more processors; a storage device having one or more programs stored thereon, which when executed by one or more processors, cause the one or more processors to implement the method as described in any of the implementations of the first aspect.

In a fourth aspect, some embodiments of the disclosure provide a computer readable medium having a computer program stored thereon, where the program when executed by a processor implements a method as described in any of the implementations of the first aspect.

The above embodiments of the present disclosure have the following beneficial effects: the bounding box generation method of some embodiments of the present disclosure can accurately and effectively generate bounding boxes that are more closely related to each item. Specifically, the bounding box output by a deep network often cannot locate the task-related item accurately and effectively, so the task is completed less efficiently. Based on this, the bounding box generation method of some embodiments of the present disclosure may first determine, in response to detecting that items related to the target image exist, at least one bounding box on the target image, for use in subsequently generating bounding boxes more closely related to each item. Each bounding box in the at least one bounding box represents position information of one of the items related to the target image. At this stage, only approximate position information of each item is determined. Then, a point cloud set related to each bounding box in the at least one bounding box is generated to obtain a point cloud set group. Generating the point cloud sets approximately models the three-dimensional structure of the item corresponding to each bounding box, for use in subsequently generating the bounding boxes of the items related to the target image. Further, at least one point cloud related to the corresponding item in each point cloud set in the point cloud set group is determined as a first target point cloud set, so as to accurately determine the three-dimensional structure information of each item, thereby obtaining a first target point cloud set group. Finally, the bounding box associated with each first target point cloud set in the first target point cloud set group is determined as a target bounding box, so as to generate a bounding box more closely related to each item, thereby obtaining a target bounding box set.

Drawings

The above and other features, advantages and aspects of various embodiments of the present disclosure will become more apparent by referring to the following detailed description when taken in conjunction with the accompanying drawings. Throughout the drawings, the same or similar reference numerals denote the same or similar elements. It should be understood that the drawings are schematic and that elements and components are not necessarily drawn to scale.

FIG. 1 is a schematic diagram of one application scenario of a bounding box generation method according to some embodiments of the present disclosure;

FIG. 2 is a flow diagram of some embodiments of a bounding box generation method according to the present disclosure;

FIGS. 3-4 are schematic diagrams of point clouds in some embodiments of a bounding box generation method according to the present disclosure;

FIG. 5 is a flow diagram of further embodiments of a bounding box generation method according to the present disclosure;

FIG. 6 is a structural schematic diagram of some embodiments of a bounding box generation apparatus according to the present disclosure;

FIG. 7 is a structural schematic diagram of an electronic device suitable for use in implementing some embodiments of the present disclosure.

Detailed Description

Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it is to be understood that the disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the disclosure are for illustration purposes only and are not intended to limit the scope of the disclosure.

It should be noted that, for convenience of description, only the relevant portions of the related invention are shown in the drawings. The embodiments and features of the embodiments in the present disclosure may be combined with each other without conflict.

It should be noted that the terms "first", "second", and the like in the present disclosure are only used for distinguishing different devices, modules or units, and are not used for limiting the order or interdependence of the functions performed by the devices, modules or units.

It is noted that references to "a", "an", and "the" modifications in this disclosure are intended to be illustrative rather than limiting, and that those skilled in the art will recognize that "one or more" may be used unless the context clearly dictates otherwise.

The names of messages or information exchanged between devices in the embodiments of the present disclosure are for illustrative purposes only, and are not intended to limit the scope of the messages or information.

The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.

FIG. 1 is a schematic diagram of one application scenario of a bounding box generation method according to some embodiments of the present disclosure.

In the application scenario of fig. 1, the electronic device 101 may first determine at least one bounding box 103 on the target image 102 in response to detecting that items related to the target image 102 exist. Each bounding box in the at least one bounding box 103 represents position information of one of the items related to the target image 102. In this application scenario, the items include: item 1021 and item 1022. The at least one bounding box 103 includes: bounding box 1031 and bounding box 1032. Then, a point cloud set related to each bounding box in the at least one bounding box 103 is generated, resulting in a point cloud set group 104. In this application scenario, the point cloud set group 104 includes: a point cloud set 1041 corresponding to the bounding box 1031, and a point cloud set 1042 corresponding to the bounding box 1032. Further, at least one point cloud related to the corresponding item in each point cloud set in the point cloud set group 104 is determined as a first target point cloud set, and a first target point cloud set group 105 is obtained. In this application scenario, the first target point cloud set group 105 includes: a first target point cloud set 1051 corresponding to item 1021, and a first target point cloud set 1052 corresponding to item 1022. Finally, the bounding box associated with each first target point cloud set in the first target point cloud set group 105 is determined as a target bounding box, so as to obtain a target bounding box set 106. In this application scenario, the target bounding box set 106 includes: a bounding box 1061 corresponding to the item 1021, and a bounding box 1062 corresponding to the item 1022.

The electronic device 101 may be hardware or software. When the electronic device is hardware, it may be implemented as a distributed cluster formed by a plurality of servers or terminal devices, or as a single server or a single terminal device. When the electronic device is software, it may be installed in the hardware devices listed above. It may be implemented as multiple pieces of software or software modules, for example to provide distributed services, or as a single piece of software or software module. This is not particularly limited herein.

It should be understood that the number of electronic devices in fig. 1 is merely illustrative. There may be any number of electronic devices, as desired for implementation.

With continued reference to fig. 2, a flow 200 of some embodiments of a bounding box generation method according to the present disclosure is shown. The bounding box generation method comprises the following steps:

Step 201, determining at least one bounding box on the target image.

In some embodiments, an executing subject of the bounding box generation method described above (e.g., the electronic device illustrated in fig. 1) may determine at least one bounding box on the target image. Wherein the bounding box of the at least one bounding box represents the position information of each object related to the target image. The target image may be an image obtained by photographing the article stored in the transfer box with the target camera. The transfer box may be a box for transporting goods. The shape of the surrounding frame is a shape suitable for being grabbed by related mechanical equipment, for example, the shape of the surrounding frame can be a rectangle.

As an example, the execution subject described above may determine the at least one bounding box on the target image by the following steps:

First, a target image captured by a target camera is acquired.

Second, the target image is input into a pre-trained target detection network to obtain at least one bounding box on the target image. The target detection network may include: the SSD (Single Shot MultiBox Detector) algorithm, the R-CNN (Region-based Convolutional Neural Networks) algorithm, the Fast R-CNN (Fast Region-based Convolutional Neural Networks) algorithm, the SPP-NET (Spatial Pyramid Pooling Network) algorithm, the YOLO (You Only Look Once) algorithm, the FPN (Feature Pyramid Networks) algorithm, and the DCN (Deformable ConvNets) algorithm.

Step 202, generating a point cloud set related to each bounding box in the at least one bounding box to obtain a point cloud set group.

In some embodiments, the executing subject may generate a point cloud set related to each bounding box in the at least one bounding box, resulting in a point cloud set group. The point cloud set related to a bounding box may be the point cloud set of the item and the background covered by that bounding box. For example, suppose the target image is an image of item A stored in box A. The bounding box of item A may be larger than the 2D contour of item A, so the bounding box may also contain part of box A. The point cloud set related to the bounding box may then be the point cloud set of item A together with the part of box A covered by the bounding box.

As an example, the execution body may first determine the contents of the item and the background through the bounding box. Then, a set of point clouds associated with each of the at least one bounding box is generated by controlling the associated laser device to scan.

As an example, the above-mentioned point cloud set related to the bounding box may be as shown in fig. 3.

In some optional implementations of some embodiments, the executing subject may generate the point cloud set related to each bounding box in the at least one bounding box by using an extrinsic parameter matrix of the camera. The extrinsic parameter matrix of the camera may be the extrinsic parameter matrix between the 2D camera and the 3D camera. The extrinsic parameter matrix of the camera may be preset.
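One possible reading of this optional implementation can be sketched as follows: the points captured by the 3D camera are transformed into the 2D camera frame with the extrinsic matrix, projected onto the image, and kept when they fall inside a bounding box. This is only an illustrative sketch. The pinhole intrinsic matrix, the 4x4 homogeneous form of the extrinsic matrix, and the names `transform`, `project`, and `points_in_box` are assumptions that do not appear in the source.

```python
from typing import List, Sequence, Tuple

Point3D = Tuple[float, float, float]
Box2D = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels
Matrix = Sequence[Sequence[float]]


def transform(extrinsic: Matrix, p: Point3D) -> Point3D:
    """Apply a 4x4 homogeneous extrinsic matrix (3D camera -> 2D camera) to a point."""
    x, y, z = p
    cx, cy, cz = (r[0] * x + r[1] * y + r[2] * z + r[3] for r in extrinsic[:3])
    return (cx, cy, cz)


def project(intrinsic: Matrix, p_cam: Point3D) -> Tuple[float, float]:
    """Pinhole projection with a 3x3 intrinsic matrix (an assumption, not in the source)."""
    x, y, z = p_cam
    u = intrinsic[0][0] * x / z + intrinsic[0][2]
    v = intrinsic[1][1] * y / z + intrinsic[1][2]
    return (u, v)


def points_in_box(points: List[Point3D], extrinsic: Matrix,
                  intrinsic: Matrix, box: Box2D) -> List[Point3D]:
    """Return the 3D points whose image projection lies inside the bounding box."""
    x_min, y_min, x_max, y_max = box
    selected = []
    for p in points:
        p_cam = transform(extrinsic, p)
        if p_cam[2] <= 0:  # behind the 2D camera, cannot be projected
            continue
        u, v = project(intrinsic, p_cam)
        if x_min <= u <= x_max and y_min <= v <= y_max:
            selected.append(p)
    return selected
```

Calling `points_in_box` once per bounding box would yield one point cloud set per box, which together form the point cloud set group. Each set still contains background points, which is why step 203 refines it further.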

Step 203, determining at least one point cloud in each point cloud set in the point cloud set group, which is related to each article in each article, as a first target point cloud set, so as to obtain a first target point cloud set group.

In some embodiments, the executing subject may determine, in each point cloud set in the point cloud set group, at least one point cloud related to the corresponding item as a first target point cloud set, resulting in a first target point cloud set group. As an example, the executing subject may first determine the three-dimensional structure information of each item in a world coordinate system. Then, at least one point cloud in each point cloud set in the point cloud set group that is related to the three-dimensional structure information of the corresponding item is determined as a first target point cloud set, to obtain the first target point cloud set group. As yet another example, the executing subject may use a dense point cloud generation network to determine, in each point cloud set in the point cloud set group, at least one point cloud related to the corresponding item as a first target point cloud set, resulting in the first target point cloud set group. The dense point cloud generation network may be DensePCR (Dense 3D Point Cloud Reconstruction).

As an example, the first target point cloud set may be as shown in fig. 4.

In some optional implementations of some embodiments, the determining at least one point cloud in each point cloud set of the point cloud set group, which is related to each item of the respective items, as a first target point cloud set may include:

firstly, determining the center point of a surrounding frame related to the point cloud set. The center point may be coordinate information of a center position.

Second, at least one point cloud in the point cloud set that is adjacent to the center point is selected as a second target point cloud set. As an example, the executing subject may first compute the cosine distance between every two point clouds in the point cloud set, resulting in a cosine distance set. Then, at least one point cloud in the point cloud set whose cosine distance is smaller than a preset threshold is selected as the second target point cloud set.

And thirdly, generating the first target point cloud set according to the second target point cloud set. As an example, the executing entity may generate the first target point cloud set according to the second target point cloud set in various ways.

Optionally, the selecting at least one point cloud in the point cloud set adjacent to the central point as a second target point cloud set may include:

firstly, a tree model related to the point cloud set is constructed. Wherein the tree model may be a k-d tree. As an example, the execution subject may construct a tree model related to the point cloud set according to a construction manner of a k-d tree.

And secondly, selecting at least one point cloud in the point cloud set adjacent to the central point as a second target point cloud set according to the tree model. As an example, the executing entity may select at least one point cloud in the point cloud set adjacent to the central point as the second target point cloud set by traversing the tree model.
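The two steps above can be sketched with a small k-d tree written in plain Python. The source does not define "adjacent", so the fixed search radius used here is an assumption, as are the names `Node`, `build_kdtree`, and `radius_search` and the treatment of the center point as a 3D coordinate.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float, float]


class Node:
    """One k-d tree node: a point, its splitting axis, and two subtrees."""

    def __init__(self, point: Point, axis: int,
                 left: Optional["Node"], right: Optional["Node"]) -> None:
        self.point = point
        self.axis = axis
        self.left = left
        self.right = right


def build_kdtree(points: List[Point], depth: int = 0) -> Optional[Node]:
    """Build a k-d tree by splitting on the median along x, y, z in turn."""
    if not points:
        return None
    axis = depth % 3
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return Node(pts[mid], axis,
                build_kdtree(pts[:mid], depth + 1),
                build_kdtree(pts[mid + 1:], depth + 1))


def radius_search(node: Optional[Node], center: Point, radius: float,
                  found: Optional[List[Point]] = None) -> List[Point]:
    """Collect every stored point within `radius` of `center`."""
    if found is None:
        found = []
    if node is None:
        return found
    if sum((a - b) ** 2 for a, b in zip(node.point, center)) <= radius ** 2:
        found.append(node.point)
    diff = center[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    radius_search(near, center, radius, found)
    # The far side can only hold matches if the splitting plane is within reach.
    if abs(diff) <= radius:
        radius_search(far, center, radius, found)
    return found
```

The tree is built once per point cloud set and can then be queried for every center point, which keeps repeated neighbor lookups cheaper than scanning the whole set each time.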

Optionally, the foregoing steps further include:

the execution body adds the relevant information of the center point into a first list. Wherein the first list may be a pre-established, empty list.

Optionally, the generating the first target point cloud set according to the second target point cloud set may include:

for each center point associated with the first list, performing the following first target point cloud set generation steps:

A first substep of, in response to traversed second target point clouds existing in the second target point cloud set, removing the traversed second target point clouds from the second target point cloud set, so as to obtain a third target point cloud set. A traversed second target point cloud may be a point cloud that has already been screened from the tree model.

A second substep of, in response to determining that a third target point cloud satisfying a target condition set exists in the third target point cloud set, screening at least one point cloud satisfying the target condition set from the third target point cloud set as a fourth target point cloud set. Wherein the set of target conditions comprises: the normal included angle between the point cloud and the central point is smaller than a first threshold value, and the curvature between the point cloud and the central point is smaller than a second threshold value.

And a third substep of adding the related information of each fourth target point cloud in the fourth target point cloud set to the first list and the second list respectively. Wherein the second list may be a pre-established, empty list. The related information may be position coordinate information corresponding to the fourth target point cloud.

A fourth substep of removing the information related to the center point from the first list.

And a fifth substep, in response to that a second target point cloud in the second target point cloud set is a traversed point cloud and/or that the first list does not have at least one point cloud associated therewith, determining the point cloud set corresponding to the second list as the first target point cloud set.

Optionally, in response to the first list having the associated at least one point cloud, the executing entity determines each point cloud in the associated at least one point cloud of the first list as a central point, and continues the target point cloud set generating step.

Optionally, in response to that there is no traversed second target point cloud in the second target point cloud set, the executing entity may determine the second target point cloud set as a third target point cloud set.
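The substeps above resemble a region-growing procedure and can be sketched as follows. This is a minimal sketch under several assumptions that are not stated in the source: normals and curvatures are precomputed per point, the curvature condition is applied to the candidate point's own curvature, neighbors are found by a brute-force radius search standing in for the tree model, every retrieved neighbor counts as traversed, and the seed is the index of the point chosen as the initial center point. The names `angle_between` and `grow_region` are hypothetical.

```python
import math
from collections import deque
from typing import List, Sequence, Tuple

Vec = Tuple[float, float, float]


def angle_between(n1: Vec, n2: Vec) -> float:
    """Included angle (radians) between two normal vectors."""
    dot = sum(a * b for a, b in zip(n1, n2))
    norm = math.sqrt(sum(a * a for a in n1)) * math.sqrt(sum(b * b for b in n2))
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def grow_region(points: Sequence[Vec], normals: Sequence[Vec],
                curvatures: Sequence[float], seed: int, radius: float,
                max_angle: float, max_curvature: float) -> List[int]:
    """Return indices of the points grown from `seed` (the second list)."""
    first_list = deque([seed])  # center points still to be expanded
    second_list: List[int] = []  # accepted points, i.e. the first target set
    traversed = {seed}
    while first_list:
        center = first_list[0]
        # Second target set: points adjacent to the current center point.
        second = [i for i in range(len(points))
                  if i != center and math.dist(points[i], points[center]) <= radius]
        # Third target set: drop points that were already traversed.
        third = [i for i in second if i not in traversed]
        traversed.update(third)
        # Fourth target set: normal angle and curvature below the thresholds.
        fourth = [i for i in third
                  if angle_between(normals[i], normals[center]) < max_angle
                  and curvatures[i] < max_curvature]
        first_list.extend(fourth)
        second_list.extend(fourth)
        first_list.popleft()  # remove the processed center point
    return second_list
```

Following the text, the second list starts empty, so the seed itself is not part of the result. The loop ends when the first list has no remaining center point, at which point the second list holds the first target point cloud set.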

Step 204, determining the bounding box related to each first target point cloud set in the first target point cloud set group as a target bounding box to obtain a target bounding box set.

In some embodiments, the executing subject may determine the bounding box associated with each first target point cloud set in the first target point cloud set group as a target bounding box, resulting in a target bounding box set. As an example, the executing subject may use an extrinsic parameter matrix of the camera to determine the bounding box associated with each first target point cloud set in the first target point cloud set group as a target bounding box, so as to obtain the target bounding box set. Here, the extrinsic parameter matrix of the camera may be the extrinsic parameter matrix between the 3D camera and the 2D camera. The extrinsic parameter matrix of the camera may be preset.
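As an illustration of how a target bounding box might be derived from a first target point cloud set, the sketch below projects the points into the 2D image and takes the smallest axis-aligned rectangle that contains them. The source does not specify this construction. The pinhole intrinsic matrix, the 4x4 extrinsic form, and the names `project_point` and `target_box` are assumptions.

```python
from typing import List, Sequence, Tuple

Point3D = Tuple[float, float, float]
Matrix = Sequence[Sequence[float]]


def project_point(p: Point3D, extrinsic: Matrix, intrinsic: Matrix) -> Tuple[float, float]:
    """Map a 3D-camera point to pixel coordinates of the 2D camera."""
    # Extrinsic: 4x4 homogeneous transform from the 3D camera to the 2D camera.
    x, y, z = (r[0] * p[0] + r[1] * p[1] + r[2] * p[2] + r[3] for r in extrinsic[:3])
    # Intrinsic: 3x3 pinhole matrix (an assumption, not given in the source).
    u = intrinsic[0][0] * x / z + intrinsic[0][2]
    v = intrinsic[1][1] * y / z + intrinsic[1][2]
    return (u, v)


def target_box(points: List[Point3D], extrinsic: Matrix,
               intrinsic: Matrix) -> Tuple[float, float, float, float]:
    """Smallest axis-aligned pixel rectangle (x_min, y_min, x_max, y_max)."""
    if not points:
        raise ValueError("the first target point cloud set is empty")
    pixels = [project_point(p, extrinsic, intrinsic) for p in points]
    us = [u for u, _ in pixels]
    vs = [v for _, v in pixels]
    return (min(us), min(vs), max(us), max(vs))
```

Because the first target point cloud set excludes background points, a rectangle built this way would be expected to fit the item more tightly than the detector's original bounding box.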

With further reference to fig. 5, a flow 500 of further embodiments of a bounding box generation method according to the present disclosure is shown. The bounding box generation method comprises the following steps:

step 501, in response to detecting that there are various items related to the target image, determining at least one bounding box on the target image.

Step 502, generating a point cloud set related to each bounding box in the at least one bounding box to obtain a point cloud set group.

Step 503, determining at least one point cloud in each point cloud set in the point cloud set group, which is related to each article in each article, as a first target point cloud set, so as to obtain a first target point cloud set group.

Step 504, determining the bounding box associated with each first target point cloud set in the first target point cloud set group as a target bounding box, so as to obtain a target bounding box set.

In some embodiments, for the specific implementation of steps 501 to 504 and the technical effects thereof, reference may be made to steps 201 to 204 in the embodiment corresponding to fig. 2, which are not described herein again.

Step 505, controlling the relevant mechanical equipment to carry the items according to the target bounding box set.

In some embodiments, an executing subject (e.g., the electronic device shown in fig. 1) may control the relevant mechanical equipment to carry the items according to the target bounding box set. As an example, the mechanical equipment may be a robotic arm.

As can be seen from fig. 5, compared with the description of some embodiments corresponding to fig. 2, the flow 500 of the bounding box generation method in some embodiments corresponding to fig. 5 highlights the specific step of controlling the relevant mechanical equipment to carry the respective items according to the target bounding box set. The scheme described in these embodiments can therefore carry each item related to the target image more conveniently and accurately, which can greatly improve task completion efficiency.

With further reference to fig. 6, as an implementation of the methods shown in the above figures, the present disclosure provides some embodiments of a bounding box generation apparatus. These apparatus embodiments correspond to the method embodiments shown in fig. 2, and the apparatus may be applied in various electronic devices.

As shown in fig. 6, a bounding box generation apparatus 600 includes: a first determining unit 601, a generating unit 602, a second determining unit 603, and a third determining unit 604. The first determining unit 601 is configured to: in response to detecting that respective items related to the target image exist, determine at least one bounding box on the target image, wherein a bounding box in the at least one bounding box represents the position information of an item of the respective items related to the target image. The generating unit 602 is configured to: generate a point cloud set related to each bounding box in the at least one bounding box to obtain a point cloud set group. The second determining unit 603 is configured to: determine, in each point cloud set in the point cloud set group, at least one point cloud related to each item of the respective items as a first target point cloud set, to obtain a first target point cloud set group. The third determining unit 604 is configured to: determine the bounding box associated with each first target point cloud set in the first target point cloud set group as a target bounding box, to obtain a target bounding box set.
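The division into four units can be pictured as four pluggable callables chained in order. The sketch below is only a structural illustration under that assumption: the class name, field types, and the early return when nothing is detected are hypothetical choices, and the actual units 601 to 604 may equally be implemented in hardware.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class BoundingBoxGenerationApparatus:
    """Hypothetical software arrangement of units 601 to 604."""
    first_determining_unit: Callable[[Any], List[Any]]         # image -> bounding boxes
    generating_unit: Callable[[List[Any]], List[Any]]          # boxes -> point cloud set group
    second_determining_unit: Callable[[List[Any]], List[Any]]  # -> first target point cloud set group
    third_determining_unit: Callable[[List[Any]], List[Any]]   # -> target bounding box set

    def run(self, target_image: Any) -> List[Any]:
        boxes = self.first_determining_unit(target_image)
        if not boxes:
            # No item detected on the target image, so nothing to generate.
            return []
        point_cloud_set_group = self.generating_unit(boxes)
        first_target_group = self.second_determining_unit(point_cloud_set_group)
        return self.third_determining_unit(first_target_group)
```

Keeping each unit behind a plain callable means any one stage (for example, the detector or the point cloud screening) can be swapped without touching the others.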

In some optional implementations of some embodiments, the generating unit 602 of the apparatus 600 described above may be further configured to: generate a point cloud set related to each bounding box in the at least one bounding box by using the extrinsic parameter matrix of the camera.
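A minimal sketch of this implementation, under stated assumptions: every 3D point is transformed into the 2D camera frame with the extrinsic matrix, projected onto the image, and assigned to a bounding box if its pixel falls inside that box. The pinhole intrinsics, the 4x4 homogeneous matrix layout, and the names below are illustrative assumptions, since the text only names the extrinsic matrix.

```python
from typing import List, NamedTuple, Sequence, Tuple

Point3D = Tuple[float, float, float]
Box2D = Tuple[float, float, float, float]  # (u_min, v_min, u_max, v_max)


class Intrinsics(NamedTuple):
    """Assumed pinhole parameters of the 2D camera (not given in the text)."""
    fx: float
    fy: float
    cx: float
    cy: float


def points_in_box(
    points: List[Point3D], box: Box2D, extrinsic: Sequence[Sequence[float]], k: Intrinsics
) -> List[Point3D]:
    """Return the points whose image projection lies inside `box`."""
    selected = []
    for p in points:
        # 3D camera frame -> 2D camera frame (assumed 4x4 homogeneous matrix).
        x, y, z = (r[0] * p[0] + r[1] * p[1] + r[2] * p[2] + r[3] for r in extrinsic[:3])
        if z <= 0:
            continue  # behind the 2D camera, cannot appear in the image
        u = k.fx * x / z + k.cx
        v = k.fy * y / z + k.cy
        if box[0] <= u <= box[2] and box[1] <= v <= box[3]:
            selected.append(p)
    return selected


def point_cloud_set_group(points, boxes, extrinsic, k):
    """One point cloud set per bounding box."""
    return [points_in_box(points, box, extrinsic, k) for box in boxes]
```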

In some optional implementations of some embodiments, the second determining unit 603 of the apparatus 600 described above may be further configured to: determine a central point of the bounding box related to the point cloud set; select at least one point cloud in the point cloud set adjacent to the central point as a second target point cloud set; and generate the first target point cloud set according to the second target point cloud set.

In some optional implementations of some embodiments, the second determining unit 603 of the apparatus 600 described above may be further configured to: construct a tree model related to the point cloud set; and select, according to the tree model, at least one point cloud in the point cloud set adjacent to the central point as the second target point cloud set.
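The text does not say which tree model is used. Assuming a k-d tree, a common choice for nearest-neighbor queries on point clouds, the selection of points adjacent to the central point could look like the sketch below. The function names and the parameter k (how many neighbors to keep) are hypothetical.

```python
import heapq
from typing import List, Optional, Tuple

Point = Tuple[float, float, float]


class Node:
    def __init__(self, point: Point, axis: int, left: "Optional[Node]", right: "Optional[Node]"):
        self.point, self.axis, self.left, self.right = point, axis, left, right


def build_kd_tree(points: List[Point], depth: int = 0) -> Optional[Node]:
    """Recursively split the points at the median along x, y, z in turn."""
    if not points:
        return None
    axis = depth % 3
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return Node(
        ordered[mid], axis,
        build_kd_tree(ordered[:mid], depth + 1),
        build_kd_tree(ordered[mid + 1:], depth + 1),
    )


def squared_distance(a: Point, b: Point) -> float:
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def k_nearest(root: Optional[Node], center: Point, k: int) -> List[Point]:
    """Return up to k points closest to `center`, nearest first."""
    if k <= 0:
        return []
    heap: list = []  # max-heap emulated with negated squared distances

    def visit(node: Optional[Node]) -> None:
        if node is None:
            return
        d = squared_distance(node.point, center)
        if len(heap) < k:
            heapq.heappush(heap, (-d, node.point))
        elif d < -heap[0][0]:
            heapq.heapreplace(heap, (-d, node.point))
        diff = center[node.axis] - node.point[node.axis]
        near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
        visit(near)
        # Search the far side only if the splitting plane is closer than
        # the current k-th best candidate.
        if len(heap) < k or diff * diff < -heap[0][0]:
            visit(far)

    visit(root)
    return [p for _, p in sorted(heap, reverse=True)]
```

Here the result of `k_nearest(tree, center, k)` would play the role of the second target point cloud set for that center.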

In some optional implementations of some embodiments, the second determining unit 603 of the apparatus 600 described above may be further configured to: add the related information of the central point to a first list.

In some optional implementations of some embodiments, the second determining unit 603 of the apparatus 600 described above may be further configured to: for each center point associated with the first list, perform the following first target point cloud set generation steps: in response to a traversed second target point cloud existing in the second target point cloud set, removing the traversed second target point cloud from the second target point cloud set to obtain a third target point cloud set; in response to determining that a third target point cloud satisfying a target condition set exists in the third target point cloud set, screening at least one point cloud satisfying the target condition set from the third target point cloud set as a fourth target point cloud set, wherein the target condition set includes: the normal included angle between the point cloud and the central point is smaller than a first threshold value, and the curvature between the point cloud and the central point is smaller than a second threshold value; adding the related information of each fourth target point cloud in the fourth target point cloud set to the first list and to a second list respectively; removing the related information of the central point from the first list; and, in response to every second target point cloud in the second target point cloud set being a traversed point cloud and/or the first list having no associated point cloud, determining the point cloud set corresponding to the second list as the first target point cloud set.

In some optional implementations of some embodiments, the second determining unit 603 of the apparatus 600 described above may be further configured to: in response to the first list having at least one associated point cloud, determine each point cloud associated with the first list as a center point, and continue to perform the first target point cloud set generation steps.

In some optional implementations of some embodiments, the second determining unit 603 of the apparatus 600 described above may be further configured to: in response to no traversed second target point cloud existing in the second target point cloud set, determine the second target point cloud set as the third target point cloud set.
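Taken together, the list-based steps above resemble a region-growing procedure, and can be sketched as follows. Several points are assumptions made for illustration: points are referred to by integer index, the starting center is the index of a point near the bounding box center, neighbors are precomputed (for example with the tree model), per-point normals and curvatures are already available, the curvature condition is read as the candidate's own curvature being below the second threshold, and every examined neighbor is marked as traversed.

```python
import math
from collections import deque
from typing import Dict, List, Sequence, Tuple

Vector = Tuple[float, float, float]


def angle_between(n1: Vector, n2: Vector) -> float:
    """Angle in radians between two non-zero normal vectors."""
    dot = sum(a * b for a, b in zip(n1, n2))
    norm = math.sqrt(sum(a * a for a in n1)) * math.sqrt(sum(b * b for b in n2))
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def grow_first_target_set(
    seed: int,
    neighbors: Dict[int, List[int]],
    normals: Sequence[Vector],
    curvatures: Sequence[float],
    first_threshold: float,   # maximum normal angle, in radians
    second_threshold: float,  # maximum curvature
) -> List[int]:
    first_list = deque([seed])    # center points still to be processed
    second_list: List[int] = []   # accepted points (the first target point cloud set)
    traversed = {seed}
    while first_list:
        center = first_list[0]
        second_set = neighbors[center]
        # Third set: neighbors that have not been traversed yet.
        third_set = [i for i in second_set if i not in traversed]
        # Fourth set: neighbors that satisfy the target condition set.
        fourth_set = [
            i for i in third_set
            if angle_between(normals[i], normals[center]) < first_threshold
            and curvatures[i] < second_threshold
        ]
        traversed.update(third_set)
        first_list.extend(fourth_set)   # accepted points become new centers
        second_list.extend(fourth_set)
        first_list.popleft()            # remove the processed center
    return second_list
```

The loop stops once the first list is empty, which matches the condition under which the second list is taken as the first target point cloud set. Following the text, only screened points are appended to the second list, so the starting center itself is not part of the returned set in this sketch.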

In some optional implementations of some embodiments, the apparatus 600 further includes a control unit (not shown). The control unit may be configured to: control the relevant mechanical equipment to carry the respective items according to the target bounding box set.

It will be understood that the units described in the apparatus 600 correspond to the respective steps in the method described with reference to fig. 2. As such, the operations, features, and resulting benefits described above for the method are equally applicable to the apparatus 600 and the units contained therein, and are not described here again.

Referring now to FIG. 7, a schematic diagram of an electronic device (e.g., the electronic device of FIG. 1) 700 suitable for use in implementing some embodiments of the present disclosure is shown. The electronic device shown in fig. 7 is only an example, and should not bring any limitation to the function and the range of use of the embodiments of the present disclosure.

As shown in fig. 7, the electronic device 700 may include a processing means (e.g., central processing unit, graphics processor, etc.) 701 that may perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 702 or a program loaded from a storage means 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data necessary for the operation of the electronic apparatus 700 are also stored. The processing device 701, the ROM 702, and the RAM 703 are connected to each other by a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.

Generally, the following devices may be connected to the I/O interface 705: input devices 706 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output device 707 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 708 including, for example, magnetic tape, hard disk, etc.; and a communication device 709. The communication means 709 may allow the electronic device 700 to communicate wirelessly or by wire with other devices to exchange data. While fig. 7 illustrates an electronic device 700 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided. Each block shown in fig. 7 may represent one device or may represent multiple devices as desired.

In particular, according to some embodiments of the present disclosure, the processes described above with reference to the flow diagrams may be implemented as computer software programs. For example, some embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such embodiments, the computer program may be downloaded and installed over a network via communications device 709, or may be installed from storage device 708, or may be installed from ROM 702. The computer program, when executed by the processing device 701, performs the above-described functions defined in the methods of some embodiments of the present disclosure.

It should be noted that the computer readable medium described above in some embodiments of the present disclosure may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In some embodiments of the disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In some embodiments of the present disclosure, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. 
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.

In some embodiments, clients and servers may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed network.

The computer readable medium may be embodied in the electronic device, or may exist separately without being assembled into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: in response to detecting that respective items related to the target image exist, determine at least one bounding box on the target image, wherein a bounding box in the at least one bounding box represents position information of an item of the respective items related to the target image; generate a point cloud set related to each bounding box in the at least one bounding box to obtain a point cloud set group; determine, in each point cloud set in the point cloud set group, at least one point cloud related to each item of the respective items as a first target point cloud set to obtain a first target point cloud set group; and determine the bounding box associated with each first target point cloud set in the first target point cloud set group as a target bounding box to obtain a target bounding box set.

Computer program code for carrying out operations for embodiments of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The units described in some embodiments of the present disclosure may be implemented by software, and may also be implemented by hardware. The described units may also be provided in a processor, which may be described as: a processor including a first determining unit, a generating unit, a second determining unit, and a third determining unit. In some cases, the names of the units do not limit the units themselves; for example, the generating unit may also be described as "a unit that generates a point cloud set related to each bounding box in the at least one bounding box to obtain a point cloud set group".

The functions described herein above may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), Complex Programmable Logic Devices (CPLDs), and the like.

The foregoing description is only exemplary of the preferred embodiments of the disclosure and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the invention in the embodiments of the present disclosure is not limited to the specific combination of the above-mentioned features, and also encompasses other embodiments formed by any combination of the above-mentioned features or their equivalents without departing from the above-mentioned inventive concept, for example, technical solutions formed by replacing the above features with (but not limited to) features having similar functions disclosed in the embodiments of the present disclosure.

Claims

1. A bounding box generation method comprises the following steps:

in response to detecting that respective items related to a target image exist, determining at least one bounding box on the target image, wherein a bounding box in the at least one bounding box represents position information of an item of the respective items related to the target image;

generating a point cloud set related to each bounding box in the at least one bounding box to obtain a point cloud set group;

determining, in each point cloud set in the point cloud set group, at least one point cloud related to each item of the respective items as a first target point cloud set to obtain a first target point cloud set group;

determining the bounding box associated with each first target point cloud set in the first target point cloud set group as a target bounding box to obtain a target bounding box set.

2. The method of claim 1, wherein the generating a set of point clouds associated with each of the at least one bounding box comprises:

generating a point cloud set related to each bounding box in the at least one bounding box by using the extrinsic parameter matrix of the camera.

3. The method of claim 1, wherein said determining at least one point cloud in each point cloud set of the set of point cloud sets associated with each of the respective items as a first target point cloud set comprises:

determining a center point of a bounding box associated with the point cloud set;

selecting at least one point cloud in the point cloud set adjacent to the central point as a second target point cloud set;

and generating the first target point cloud set according to the second target point cloud set.

4. The method of claim 3, wherein the selecting at least one point cloud of the set of point clouds adjacent to the center point as a second set of target point clouds comprises:

constructing a tree model associated with the point cloud set;

and selecting at least one point cloud in the point cloud set adjacent to the central point as the second target point cloud set according to the tree model.

5. The method of claim 3, wherein the method further comprises:

adding the related information of the central point to a first list.

6. The method of claim 5, wherein the generating the first target point cloud set from the second target point cloud set comprises:

for each center point associated with the first list, performing the following first target point cloud set generating steps:

in response to a traversed second target point cloud existing in the second target point cloud set, removing the traversed second target point cloud from the second target point cloud set to obtain a third target point cloud set;

in response to determining that a third target point cloud satisfying a set of target conditions exists in the third set of target point clouds, screening at least one point cloud satisfying the set of target conditions from the third set of target point clouds as a fourth set of target point clouds, wherein the set of target conditions includes: the normal included angle between the point cloud and the central point is smaller than a first threshold value, and the curvature between the point cloud and the central point is smaller than a second threshold value;

adding relevant information of each fourth target point cloud in the fourth target point cloud set to the first list and to a second list respectively;

removing the relevant information of the central point from the first list;

and in response to every second target point cloud in the second target point cloud set being a traversed point cloud and/or the first list having no associated point cloud, determining the point cloud set corresponding to the second list as the first target point cloud set.

7. The method of claim 6, wherein the method further comprises:

in response to the first list having at least one associated point cloud, determining each point cloud associated with the first list as a center point, and continuing to perform the first target point cloud set generating steps.

8. The method of claim 6, wherein after the removing the traversed second target point cloud from the second set of target point clouds resulting in a third set of target point clouds in response to the presence of the traversed second target point cloud in the second set of target point clouds, the method further comprises:

in response to no traversed second target point cloud existing in the second target point cloud set, determining the second target point cloud set as the third target point cloud set.

9. The method of claim 1, wherein the method further comprises:

controlling related mechanical equipment to carry the respective items according to the target bounding box set.

10. A bounding box generation apparatus comprising:

a first determination unit configured to determine at least one bounding box on the target image in response to detecting that there is a respective item related to the target image, wherein a bounding box of the at least one bounding box represents location information of an item of the respective item related to the target image;

a generating unit configured to generate a point cloud set associated with each bounding box of the at least one bounding box, resulting in a point cloud set group;

a second determining unit configured to determine at least one point cloud in each point cloud set in the point cloud set group, which is related to each item in the respective items, as a first target point cloud set, resulting in a first target point cloud set group;

and a third determining unit configured to determine the bounding box associated with each first target point cloud set in the first target point cloud set group as a target bounding box, so as to obtain a target bounding box set.

11. An electronic device, comprising:

one or more processors;

a storage device having one or more programs stored thereon,

wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-9.

12. A computer-readable medium, on which a computer program is stored, wherein the program, when executed by a processor, implements the method of any one of claims 1-9.