CN115719377A - Automatic acquisition system for pose estimation data set with six degrees of freedom - Google Patents

Automatic acquisition system for pose estimation data set with six degrees of freedom

Info

Publication number
CN115719377A
Authority
CN
China
Prior art keywords
point cloud
frame
tree
pose
dimensional
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211482124.9A
Other languages
Chinese (zh)
Inventor
陈鹏
孙翰翔
包倍源
陈海永
刘卫朋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hebei University of Technology
Original Assignee
Hebei University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hebei University of Technology filed Critical Hebei University of Technology
Priority to CN202211482124.9A priority Critical patent/CN115719377A/en
Publication of CN115719377A publication Critical patent/CN115719377A/en
Pending legal-status Critical Current

Landscapes

  • Image Analysis (AREA)

Abstract

The embodiment of the invention discloses an automatic acquisition system for a six-degree-of-freedom pose estimation data set, comprising a data acquisition platform and a data processing device. The data acquisition platform uses a depth camera to capture an RGBD image sequence of a target scene. A data labeling software algorithm installed in the data processing device processes the RGBD image sequence of the target scene, performs three-dimensional reconstruction of the target object to obtain a three-dimensional model, and, based on that model, automatically labels the segmentation mask information and six-degree-of-freedom pose information of the objects in the target scene. The invention can automatically label the pose information, segmentation mask information and three-dimensional model information of the objects in a scene from an RGB-D image sequence acquired by a depth camera, and the resulting data set can be used to train and test deep-learning-based robot grasping neural network models.

Description

Automatic acquisition system for pose estimation data set with six degrees of freedom
Technical Field
The invention relates to the technical field of robotics, in particular to an automatic acquisition system for a six-degree-of-freedom pose estimation data set, suitable for labeling training or test data sets for deep learning network models for robot grasping.
Background
With the rapid development of science and technology and of industrial modernization, the robot industry is meeting new development opportunities and its market share grows day by day; large numbers of robots are applied to production tasks such as sorting, assembly and feeding, and play an important role in many industries. Compared with traditional manual operation, robots offer high operating accuracy, strong system stability and a high return on investment.
With the rapid rise of artificial intelligence technology and the continuous iteration of intelligent hardware, computer vision and robotics are ever more closely linked; using a camera as its eye, a robot can acquire visual information about a scene and interact with the external environment.
Robot grasping is a key link in robot sorting, assembly, feeding and similar operations, and a key technology for making the robot's workflow automatic and intelligent. Robot grasping can be subdivided into three subtasks: grasp detection, grasp planning and grasp control. Grasp detection is the basis of grasp planning and grasp control: image and depth information of the grasping scene is collected by cameras, in particular by optical instruments such as depth cameras; a three-dimensional point cloud of the scene is constructed; and the position and attitude of the target object are estimated by an algorithm to form a grasp description that guides the robot. A common grasp detection approach therefore registers the three-dimensional model of the target object with its actual point cloud in order to estimate the object's pose. Classical algorithms include the template-matching-based LineMOD algorithm and the voting-based point pair feature algorithm.
With the popularization of artificial intelligence technology, more and more deep-learning-based pose estimation algorithms are applied to unordered robot grasping tasks, for example robot grasping methods based on an improved KeypointRCNN model or on PVNet. These methods usually require a large amount of data to train the network model to the desired pose estimation accuracy, and the richness and labeling quality of the data set directly affect the performance of the network model.
Data sets dedicated to six-degree-of-freedom pose estimation, such as LineMOD, HomebrewedDB and HOPE, have emerged. However, because the labeling must be completed entirely or partially by hand, the work is time-consuming and labor-intensive, and large, broadly applicable labeled data sets are hard to obtain. For example, the LineMOD data set labels only 15 objects with about 1000 pose labels and segmentation mask labels each, the HomebrewedDB data set labels only 33 objects in 13 scenes, and the HOPE data set labels only 28 toy objects in 50 scenes. Open-source production tools for six-degree-of-freedom object pose estimation data sets, such as ObjectDataSetTools, can automate the production process; that is, pose information, segmentation mask information and the object's three-dimensional model can be labeled automatically from a given RGB-D image sequence. However, such a tool can only label a single object and cannot handle complex scenes or occlusion between objects, and because no normal-direction consistency processing is performed during three-dimensional reconstruction, the reconstruction result often exhibits deformation, tearing and similar artifacts, greatly reducing the labeling quality of the pose estimation data set.
Disclosure of Invention
In view of the technical defects mentioned in the background, the embodiments of the invention aim to automatically label the six-degree-of-freedom pose data of objects in a captured scene, laying a foundation for unordered robot grasping. The six-degree-of-freedom pose data mainly comprise the pose information of the object, the segmentation mask information of the object in the image, and the three-dimensional model information of the object.
In order to achieve the above object, an embodiment of the present invention provides an automatic acquisition system for a pose estimation data set with six degrees of freedom, including a data acquisition platform and a data processing device. The data acquisition platform is used for shooting an RGBD image sequence of a target scene by using a depth camera;
and a data annotation software algorithm is installed in the data processing equipment and is used for processing the RGBD image sequence of the target scene, performing three-dimensional reconstruction on the target object to obtain a three-dimensional model, and automatically annotating segmentation mask information and six-degree-of-freedom position and posture information of the object in the target scene based on the three-dimensional model.
As a specific implementation of the invention, the data acquisition platform comprises an electric turntable and a mechanical arm; the target object is placed on the electric turntable, which rotates according to control instructions received from an upper computer; the depth camera is installed at the end of the mechanical arm and captures RGBD images of the target object from all angles to obtain the RGBD image sequence.
Further, the electric turntable receives control instructions issued by the upper computer through an RS-232 communication module; a control instruction includes the rotation angle and rotation speed of the electric turntable.
As a concrete implementation, the electric turntable comprises a stepping motor, a belt, an acrylic plate, a bottom plate and a circular slide rail; the stepping motor is installed on the bottom plate, the belt is installed above the circular slide rail, the circular slide rail is fixed to the top of the bottom plate by two studs, the acrylic plate is mounted above the belt by two studs, and an ArUco marker is provided on the acrylic plate.
As a specific implementation manner of the present invention, the processing of the RGBD image sequence of the target scene by the data processing device using a data annotation software algorithm specifically includes:
firstly, point cloud alignment: from the RGBD image sequence, find the three-dimensional coordinates of the 4 corner points of each of the m ArUco markers in the k-th frame image, recorded as X_k = {x_ki | i = 1, 2, …, 4m}; then determine the frame-to-frame correspondences of X_k; according to the X_k of each frame image and its correspondences with the other frame images, compute the transformation matrix T_1k between the k-th frame point cloud and the first frame point cloud using a global point cloud registration algorithm; if the image sequence contains n frames in total, a point cloud transformation set T_1 = {T_1k | k = 1, 2, …, n} is formed, i.e. after the k-th frame point cloud is transformed by T_1k it is aligned with the 1st frame point cloud;
and step two, three-dimensional reconstruction: perform plane fitting on X_k with a random sample consensus algorithm and filter out the points near the plane in each frame of point cloud; transform the pose of the k-th frame point cloud according to T_1k in the set T_1; stitch the transformed point clouds into a complete object point cloud and smooth it to remove noise, obtaining the reconstructed point cloud M; segment the reconstructed point cloud M with a Euclidean clustering segmentation algorithm to obtain the point clouds O = {o_i | i = 1, 2, …, c} of the c objects in the scene;
thirdly, surface triangulation: calculate the normal vectors of all object point clouds and unify their directions with a normal propagation method based on a tree-structured hierarchical Riemannian graph; triangulate O with a Poisson surface reconstruction algorithm to obtain the three-dimensional model m_i, i = 1, 2, …, c, of each object;
fourthly, segmentation information labeling: project the triangular faces of each object's three-dimensional model m_i, i = 1, 2, …, c, one by one onto the camera plane to obtain the segmentation mask sequence of the object;
fifthly, pose information labeling: calculate for each object's three-dimensional model m_i its oriented bounding box and the pose transformation matrix T_{m_i} of the bounding box; according to the pose transformation relation T_1k between the first-frame and k-th-frame point clouds, calculate the six-degree-of-freedom pose information T^k_{m_i} of the object model m_i in the k-th frame point cloud.
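The oriented bounding box of step five can be sketched with a PCA-based construction; this is an illustrative assumption, since the text does not specify the exact bounding-box algorithm, and all names below are made up for the sketch:

```python
import numpy as np

def oriented_bounding_box_pose(points):
    """PCA-based oriented bounding box of an Nx3 point cloud.
    Returns a 4x4 pose matrix T mapping box-aligned coordinates into
    the original frame, plus the box side lengths (illustrative sketch,
    not the patented implementation)."""
    centroid = points.mean(axis=0)
    centered = points - centroid
    # Principal axes of the point cloud = eigenvectors of the covariance.
    cov = np.cov(centered.T)
    eigvals, eigvecs = np.linalg.eigh(cov)
    R = eigvecs[:, ::-1]              # largest-variance axis first
    if np.linalg.det(R) < 0:          # keep a right-handed frame
        R[:, -1] *= -1
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = centroid
    coords = centered @ R             # points in box-aligned coordinates
    extent = coords.max(axis=0) - coords.min(axis=0)
    return T, extent
```

Transforming the model by the inverse of T then moves it to the origin with its principal axes aligned to the coordinate axes, as described in step five.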
In the surface triangulation step, the normal propagation method based on the tree-structured hierarchical Riemannian graph unifies the normal vectors of all object point clouds as follows:
(1) Using principal component analysis, estimate the normals of the point cloud with a larger neighborhood radius to obtain rough normal vectors;
(2) Select the point with the smallest curvature as the root node of the minimum spanning tree and mark it;
(3) Calculate the distance from all unmarked points to the tree, and find the point p_i closest to the tree and the tree node closest to p_i;
(4) Add p_i to the tree as a new node connected to its nearest tree node, generating a new edge, and mark p_i;
(5) Repeat steps 3 and 4 until all points in the point cloud have been added to the minimum spanning tree;
(6) Traverse all edges of the minimum spanning tree and calculate the angle between the normal vectors of the two nodes each edge connects; if the angle is greater than 90 degrees, invert the normal vector of the newly added node, thereby obtaining rough normal vectors with consistent orientation;
(7) Using principal component analysis, estimate the normals of the point cloud with a smaller neighborhood radius to obtain accurate normal vectors;
(8) Calculate the angle between the accurate and rough normal vector of each point; if it is greater than 90 degrees, invert the accurate normal vector, finally obtaining accurate normal vectors with consistent orientation.
Compared with the prior art, the invention has the following beneficial effects: the three-dimensional model information, six-degree-of-freedom pose information and segmentation mask information of a target can be obtained automatically simply by capturing an image sequence of the target placed on the data acquisition hardware platform with an RGB-D camera; no 3D scanner and no manual intervention in the shooting process are needed. When generating the three-dimensional model of an object, normal orientation with the normal propagation method based on the tree-structured hierarchical Riemannian graph ensures that the normal directions of the object's surface point cloud are consistent, avoiding the deformation, tearing and similar artifacts of the three-dimensional model caused by inconsistent normal directions, so a more accurate three-dimensional model is obtained. On this basis, perspective projection with the triangulated three-dimensional model yields more accurate image segmentation mask information.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings needed in the detailed description are briefly introduced below.
FIG. 1 is a schematic diagram of a data acquisition platform provided by an embodiment of the invention;
FIG. 2 is a schematic illustration of data acquisition according to the present invention;
FIG. 3 is a flow chart of the data annotation software algorithm of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
The automatic acquisition system for a six-degree-of-freedom pose estimation data set comprises a data acquisition platform (also called the data acquisition hardware platform) and a data processing device. The data acquisition hardware platform uses a depth camera to capture an RGBD image sequence of the target scene. The platform is an electric turntable carrying an ArUco marker, on which the object to be measured is placed; the turntable communicates with an upper computer through an RS-232 communication module. The upper computer issues the rotation angle and speed information as commands, and the electric turntable rotates automatically according to those commands.
And a data labeling software algorithm is installed in the data processing equipment and is used for processing the RGBD image sequence of the target scene, performing three-dimensional reconstruction on the target object to obtain a three-dimensional model, and automatically labeling the segmentation mask information and the six-degree-of-freedom pose information of the object in the target scene based on the three-dimensional model.
As shown in fig. 1, the data acquisition hardware platform provided by the invention is an electric turntable comprising a stepping motor 1, a belt 2, an acrylic plate 3, a bottom plate 4, a circular slide rail 5, studs 6-9 and a motor wire 10. The acrylic plate carries an ArUco marker and is fixed at one end to the circular slide rail; its other end is connected to the belt pulley, the pulley is connected to the stepping motor through the belt, and the stepping motor drives the turntable's rotation. The hardware communicates with the upper computer through an RS-232 communication module; the upper computer issues the rotation angle and speed information as commands, and the turntable rotates accordingly. The data set acquisition process with this hardware platform (see fig. 2) is as follows: mount the depth camera to the end of the mechanical arm and place the target object on the electric turntable. Each time the turntable rotates through a fixed angle, capture an RGB-D image with the depth camera; after the turntable has rotated 360 degrees, the camera has captured multiple RGB-D images around the object. Then change the end position of the mechanical arm, thereby changing the camera's viewing angle, and repeat the previous shooting process. After several passes at different viewing angles, RGB-D images of the object from all angles are obtained. On this basis, the data annotation software algorithm provided by the invention labels the three-dimensional model information, six-degree-of-freedom pose information and image segmentation mask information of the object.
Referring to fig. 3 again, the data annotation software algorithm provided by the embodiment of the present invention is mainly applied to automatically annotate three-dimensional model information, pose information with six degrees of freedom, and image segmentation mask information of a target object. The method comprises the following specific steps:
s1, point cloud alignment, namely aligning each frame of point cloud with a first frame of point cloud by using an ArUCo mark;
s2, performing three-dimensional reconstruction, namely splicing the aligned point clouds to obtain complete point clouds of the scene;
s3, triangularizing the surface, namely segmenting scene point clouds by using a clustering segmentation algorithm to obtain point clouds of all objects in the scene, on the basis, performing normal vector unification on the object point clouds by using a Riemannian chart method phase propagation algorithm based on tree layering, and triangularizing the object point clouds by using a Poisson surface reconstruction algorithm to obtain a three-dimensional model of the objects;
s4, labeling segmentation information, and projecting the triangular surfaces of the three-dimensional model of the object to an imaging plane of a camera one by one to obtain a segmentation mask of the object;
and S5, acquiring a three-dimensional model of the object, marking pose information, moving the object model to the origin of a three-dimensional coordinate system and aligning the three coordinate axes by calculating a direction bounding box and a pose transformation matrix of the object model, and obtaining the pose information of the object in each frame of point cloud by using the pose transformation relation between image frames.
The following details each part:
the first step is as follows: and aligning the point clouds.
The invention first finds the three-dimensional coordinates of the ArUco marker corner points in each frame of image as follows. An ArUco marker is a binary square fiducial marker that can be used for camera pose estimation; it consists of a wide black border and an internal binary matrix that determines its identifier. The method detects the two-dimensional coordinates of the four corner points of each ArUco marker in the image sequence, then maps them into the depth map using the pixel correspondence between the RGB image and the depth map, thereby obtaining the three-dimensional coordinates of the corner points. If b ArUco markers are placed in total, a corner point set X_k = {x_i | i = 1, 2, …, 4b} is obtained from the k-th frame image.
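The mapping from detected 2D corner pixels to 3D camera coordinates via the aligned depth map can be sketched as below. The 2D corner detection itself (e.g. with OpenCV's ArUco module) is assumed to have been done already, and all names and parameters are illustrative:

```python
import numpy as np

def corners_to_3d(corners_uv, depth_map, fx, fy, cx, cy, depth_scale=1.0):
    """Back-project 2D corner pixels (u, v) into 3D camera coordinates
    using an aligned depth map and pinhole intrinsics (sketch)."""
    points = []
    for u, v in corners_uv:
        # Depth is looked up at the corner pixel; the RGB image and
        # depth map are assumed aligned, so they share intrinsics.
        z = depth_map[int(round(v)), int(round(u))] * depth_scale
        x = (u - cx) * z / fx
        y = (v - cy) * z / fy
        points.append((x, y, z))
    return np.array(points)
```

Collecting the back-projected corners of all b markers in frame k gives the corner set X_k used for registration.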
Let the set of point clouds obtained from the RGB-D image sequence be P = {P_i | i = 1, 2, …, n}. The point clouds in P are aligned by a multiway registration algorithm, as follows. First, pairwise registration between the point clouds in P is performed: for two point clouds P_i and P_j, the corresponding corner coordinate sets X_i and X_j obtained from the ArUco markers are coarsely registered with a random sample consensus algorithm, yielding a preliminary rigid transformation between P_i and P_j. On this basis, an ICP algorithm performs fine registration of all three-dimensional points of P_i and P_j, giving a transformation matrix T_ij such that T_ij·P_i aligns with P_j. Then P_i is taken as the i-th node of a pose graph; the transformation matrices of adjacent point clouds, {T_ij | i = 1, 2, …, n-1; j = i+1}, serve as the odometry edges of the pose graph, and the transformation matrices of non-adjacent point clouds, {T_ij | i = 1, 2, …, n-1; j ≠ i+1}, serve as its loop-closure edges. The pose graph is constructed and then optimized with a robust optimization algorithm. The transformation matrix from the k-th frame point cloud to the 1st frame point cloud is extracted from the optimized pose graph, forming the transformation matrix set T_1 = {T_1k | k = 1, 2, …, n}; thus the k-th frame point cloud, transformed by T_1k, is aligned with the 1st frame point cloud.
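The coarse registration of the corresponding corner sets X_i and X_j reduces to estimating a least-squares rigid transform between paired 3D points. A minimal sketch of that core step (without the RANSAC outlier rejection and ICP refinement described above, which would wrap around it) follows; the function name is an assumption:

```python
import numpy as np

def rigid_transform(A, B):
    """Least-squares rigid transform T (4x4) with T·A ≈ B for paired
    Nx3 point sets A and B (Kabsch/SVD method, sketch)."""
    ca, cb = A.mean(axis=0), B.mean(axis=0)
    H = (A - ca).T @ (B - cb)                 # cross-covariance matrix
    U, S, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))    # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cb - R @ ca
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T
```

The resulting T would serve as the initial guess handed to ICP for fine registration of the full point clouds.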
The second step is that: and (4) three-dimensional reconstruction.
Using the RANSAC algorithm, plane fitting is performed on the three-dimensional corner coordinate set X_k obtained from the ArUco markers in the k-th RGB-D image, and the points belonging to the plane are filtered out of the point cloud P_k, so that only the object points remain in the scene point cloud. The resulting object point cloud sequence is denoted P_s = {P_sk | k = 1, 2, …, n}; each frame of P_s is transformed according to T_1 to align it with the 1st frame, and the transformed point clouds are fused by a voting algorithm to obtain the complete object point cloud M. The fused point cloud M is then resampled with a moving least squares algorithm to eliminate unsmooth regions and holes in M. The resampling algorithm reconstructs missing surface parts by high-order polynomial interpolation of surrounding data points, which also alleviates the double-wall artifacts caused by fusing multiple scans and makes the surface of the reconstructed object model smoother. The pseudo-code for the three-dimensional reconstruction of the object surface is shown below.
[Pseudo-code rendered as images in the original publication.]
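The plane-filtering part of this reconstruction step can be sketched with a minimal RANSAC plane fit followed by removal of the plane points. Threshold and iteration count here are illustrative assumptions, not values from the text:

```python
import numpy as np

def ransac_plane(points, n_iters=200, threshold=0.01, rng=None):
    """Fit a plane n·p + d = 0 to an Nx3 point set via RANSAC (sketch).
    Returns (n, d) and a boolean inlier mask."""
    rng = np.random.default_rng(rng)
    best_inliers, best = None, None
    for _ in range(n_iters):
        p0, p1, p2 = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(p1 - p0, p2 - p0)
        norm = np.linalg.norm(n)
        if norm < 1e-12:
            continue                      # degenerate (collinear) sample
        n = n / norm
        d = -n @ p0
        inliers = np.abs(points @ n + d) < threshold
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_inliers, best = inliers, (n, d)
    return best, best_inliers

def remove_plane(cloud, n, d, threshold=0.01):
    """Keep only points farther than `threshold` from the fitted plane."""
    return cloud[np.abs(cloud @ n + d) >= threshold]
```

In the pipeline above, the plane would be fitted to the marker corner set X_k and the filter applied to the full frame point cloud P_k, leaving only object points.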
The third step: surface triangularization
Considering that several objects are usually placed in a scene, the objects contained in M must be segmented to obtain the point cloud set O = {o_i | i = 1, 2, …, c} of the individual objects. This is realized with a Euclidean clustering point cloud segmentation algorithm.
To reconstruct the curved surface of each object, the point cloud O is triangulated to obtain the object's three-dimensional model. The invention uses the Poisson surface reconstruction algorithm. Poisson surface reconstruction is an implicit-function method that combines the advantages of global and local methods: using implicit fitting, it solves a Poisson equation to obtain an implicit function representing the surface described by the point cloud model, then extracts an isosurface from it, yielding a surface model with geometric entity information. The Poisson surface reconstruction algorithm requires a point cloud with normals, and the normals must be consistently oriented. Point cloud normals can be estimated by principal component analysis, which approximates the local surface shape from the positions of each sampling point and its neighbors. However, the normal directions obtained this way are often not consistent: the normal of a sampling point may point opposite to those of its neighbors, and the three-dimensional surface reconstructed from such normals may be deformed or torn. The invention therefore proposes a normal propagation method based on the minimum spanning tree to unify the object's normal vectors, providing accurate and consistent normals to the surface reconstruction algorithm.
The normal propagation algorithm based on the minimum spanning tree comprises the following specific steps:
(1) Using principal component analysis, estimate the normals of the point cloud with a larger neighborhood radius to obtain rough normal vectors.
(2) Select the point with the smallest curvature as the root node of the minimum spanning tree and mark it.
(3) Calculate the distance from all unmarked points to the tree, and find the point p_i closest to the tree and the tree node closest to p_i.
(4) Add p_i to the tree as a new node connected to its nearest tree node, generating a new edge, and mark p_i.
(5) Repeat steps 3 and 4 until all points in the point cloud have been added to the minimum spanning tree.
(6) Traverse all edges of the minimum spanning tree and calculate the angle between the normal vectors of the two nodes each edge connects. If the angle is greater than 90 degrees, invert the normal vector of the newly added node; rough normal vectors with consistent orientation are thus obtained.
(7) Using principal component analysis, estimate the normals of the point cloud with a smaller neighborhood radius to obtain accurate normal vectors.
(8) Calculate the angle between the accurate and rough normal vector of each point; if it is greater than 90 degrees, invert the accurate normal vector, finally obtaining accurate normal vectors with consistent orientation.
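The core of steps (2) to (6), building the minimum spanning tree and propagating orientation from the root, can be sketched as below. The PCA normal estimation of steps (1) and (7) and the curvature-based root selection are assumed done elsewhere, and the brute-force Prim construction is an illustrative simplification:

```python
import numpy as np
from collections import deque

def orient_normals_mst(points, normals, root=0):
    """Build a minimum spanning tree over the points (Prim's algorithm,
    brute force) and propagate normal orientation from the root,
    flipping any normal whose angle with its parent exceeds 90 degrees
    (sketch of steps 2-6)."""
    n = len(points)
    normals = normals.copy()
    in_tree = np.zeros(n, dtype=bool)
    in_tree[root] = True
    dist = np.linalg.norm(points - points[root], axis=1)
    parent = np.full(n, root)
    children = [[] for _ in range(n)]
    for _ in range(n - 1):
        i = int(np.argmin(np.where(in_tree, np.inf, dist)))  # nearest point to tree
        in_tree[i] = True
        children[parent[i]].append(i)
        d_new = np.linalg.norm(points - points[i], axis=1)
        closer = d_new < dist
        dist = np.where(closer, d_new, dist)
        parent = np.where(closer & ~in_tree, i, parent)
    # Propagate orientation outward from the root along tree edges.
    queue = deque([root])
    while queue:
        i = queue.popleft()
        for j in children[i]:
            if normals[i] @ normals[j] < 0:   # angle > 90 degrees
                normals[j] = -normals[j]
            queue.append(j)
    return normals
```

Step (8) is then a single elementwise comparison of the accurate normals against these oriented rough normals, flipping where the dot product is negative.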
On the basis, an object point cloud set O = { O ] is subjected to Poisson surface reconstruction algorithm i Point cloud o in | i =1,2, \8230;, c } i Surface reconstruction is carried out, and then three-dimensional models m of all objects can be obtained i
The fourth step: annotating segmentation information
The three-dimensional model of an object consists of many triangular faces, which can be projected onto the camera plane when the camera intrinsic parameters are known, so that a segmentation mask of the object is obtained. Because the RGB image and the depth map have been aligned, they share the same camera intrinsic parameters. According to the pinhole camera model, a vertex of a triangular face is projected into the two-dimensional image by equation (1).
u = f_x · X/Z + c_x,  v = f_y · Y/Z + c_y    (1)
In equation (1), f_x, f_y, c_x and c_y are the intrinsic parameters of the camera, X, Y and Z are the three-dimensional coordinates, and u and v are the two-dimensional coordinates of the point projected onto the image. The three vertices of a triangular face are each projected into the image coordinate system, and the triangle formed by the three projected points is filled in, giving the projection of the face on the image plane. Projecting all triangular faces of the object's three-dimensional model yields the segmentation mask of the object in the image.
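The projection of equation (1) and the triangle filling can be sketched as below; the half-plane rasterization is one simple way to fill the projected triangles and is an illustrative choice, not taken from the text:

```python
import numpy as np

def project_points(P, fx, fy, cx, cy):
    """Pinhole projection, equation (1): Nx3 points (X, Y, Z) -> (u, v)."""
    u = fx * P[:, 0] / P[:, 2] + cx
    v = fy * P[:, 1] / P[:, 2] + cy
    return np.stack([u, v], axis=1)

def rasterize_mask(vertices, faces, fx, fy, cx, cy, h, w):
    """Project each triangular face and fill it in an h x w binary mask
    using a signed-area (half-plane) inside test (sketch)."""
    mask = np.zeros((h, w), dtype=bool)
    uv = project_points(vertices, fx, fy, cx, cy)
    ys, xs = np.mgrid[0:h, 0:w]
    pix = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(float)
    for f in faces:
        a, b, c = uv[f[0]], uv[f[1]], uv[f[2]]
        def edge(p, q):  # signed area of triangle (p, q, pixel)
            return (q[0] - p[0]) * (pix[:, 1] - p[1]) - (q[1] - p[1]) * (pix[:, 0] - p[0])
        e0, e1, e2 = edge(a, b), edge(b, c), edge(c, a)
        inside = ((e0 >= 0) & (e1 >= 0) & (e2 >= 0)) | ((e0 <= 0) & (e1 <= 0) & (e2 <= 0))
        mask |= inside.reshape(h, w)
    return mask
```

The union over all faces of a model m_i gives that object's segmentation mask; in practice a per-triangle bounding box would limit the pixel test rather than scanning the full image.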
The fifth step: and acquiring a three-dimensional model of the object and marking pose information.
The transformation matrices T_1 = {T_1k | k = 1, 2, …, n} between each frame point cloud and the first frame have been obtained, and on this basis each frame is aligned with the first frame, so the reconstructed three-dimensional model m_i of an object is aligned with that object in the first-frame point cloud. From m_i, the oriented bounding box of the object model and its pose transformation matrix T_{m_i} are calculated; the object's three-dimensional model is then transformed by T_{m_i}^{-1} so that it moves to the origin of the three-dimensional coordinate system and aligns with the three coordinate axes. The model transformed by T_{m_i}^{-1} is stored as the final labeling result.
Further, from the pose transformation matrix T_1k between the k-th frame point cloud and the first frame point cloud, the pose labeling information T^k_{m_i} of the object model m_i in the k-th frame can be calculated as T^k_{m_i} = T_1k^{-1} · T_{m_i}.
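The per-frame pose can then be obtained by matrix composition. The composition T^k = T_1k^{-1} · T_m below is a reconstruction of the equation that appears only as an image in the original text, so treat the exact form as an assumption; the consistency property it is tested against follows from T_1k mapping frame-k coordinates into frame 1:

```python
import numpy as np

def pose_in_frame_k(T_1k, T_m):
    """Object pose in the k-th frame from the frame-k-to-frame-1
    alignment T_1k and the first-frame object pose T_m (assumed
    composition, sketch)."""
    return np.linalg.inv(T_1k) @ T_m
```

With this convention, mapping a model point into frame k and then through T_1k lands on the same frame-1 coordinates as applying T_m directly.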
In conclusion, the automatic data set acquisition platform is used as follows: first, the object is placed on the electric turntable and several ArUco markers are placed around it; the turntable then begins to rotate, and every time it rotates through a fixed angle the depth camera captures an RGBD image, collecting an RGBD image sequence of the scene. The software algorithm then performs three-dimensional reconstruction of the objects in the scene from the RGBD images and labels the segmentation mask information and pose information.
Compared with the prior art, the invention has the following beneficial effects: the three-dimensional model information, six-degree-of-freedom pose information and segmentation mask information of a target can be obtained automatically simply by shooting an image sequence of the target placed on the data acquisition hardware platform with an RGB-D camera; no 3D scanner and no manual intervention in the shooting process are needed. During generation of the object's three-dimensional model, normal orientation is performed with a normal propagation method based on a tree-structured hierarchical Riemannian graph, which ensures consistent normal directions over the point cloud of the object surface, avoids deformation, tearing and similar defects of the three-dimensional model caused by inconsistent normals, and yields a more accurate three-dimensional model. On this basis, perspective projection with the triangulated three-dimensional model gives more accurate image segmentation mask information.
While the invention has been described with reference to specific embodiments, the invention is not limited thereto, and various equivalent modifications and substitutions can be easily made by those skilled in the art within the technical scope of the invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (6)

1. The automatic acquisition system of the pose estimation data set with six degrees of freedom comprises a data acquisition platform and data processing equipment, and is characterized in that the data acquisition platform is used for shooting an RGBD image sequence of a target scene by using a depth camera;
and a data labeling software algorithm is installed in the data processing equipment and is used for processing the RGBD image sequence of the target scene, performing three-dimensional reconstruction on the target object to obtain a three-dimensional model, and automatically labeling the segmentation mask information and the six-degree-of-freedom pose information of the object in the target scene based on the three-dimensional model.
2. The automatic acquisition system of claim 1, wherein the data acquisition platform comprises an electric turntable and a mechanical arm; a target object is placed on the electric turntable, and the electric turntable receives a control command issued by an upper computer and rotates according to the control command; the depth camera is mounted at the end of the mechanical arm and can shoot RGBD images of the target object at various angles to obtain the RGBD image sequence.
3. The automatic acquisition system of claim 2, wherein the electric turntable receives a control command issued by the upper computer through an RS-232 communication module; the control command includes a rotation angle and a rotation speed of the electric turntable.
4. The automatic acquisition system of claim 2, wherein the electric turntable comprises a stepping motor, a belt, an acrylic plate, a bottom plate and a circular slide rail; the stepping motor is mounted on the bottom plate, the belt is mounted above the circular slide rail, the circular slide rail is mounted above the bottom plate through studs, the acrylic plate is mounted above the belt through studs, and the acrylic plate is provided with ArUco markers.
5. The automatic acquisition system of claim 1, wherein the data processing device processes the RGBD image sequence of the target scene using a data annotation software algorithm, specifically comprising:
Step one, point cloud alignment: find the three-dimensional coordinates of the 4 corner points of each of the m ArUco markers in the kth frame image of the RGBD image sequence, recorded as X_k = {x_ki | i = 1, 2, …, 4m}, and then determine the frame-to-frame correspondences of X_k; according to the X_k of each frame image and its correspondence with the other frame images, use a global point cloud registration algorithm to calculate the transformation matrix T_1k between the kth frame point cloud and the first frame point cloud; if there are n frames in the image sequence in total, a point cloud transformation set T_1 = {T_1k | k = 1, 2, …, n} is formed, i.e. after the kth frame point cloud is transformed by T_1k it is aligned with the 1st frame point cloud;
Step two, three-dimensional reconstruction: perform plane fitting on X_k with a random sample consensus algorithm and filter out the points near the plane in each frame point cloud; transform the pose of the kth frame point cloud according to T_1k in the set T_1; splice the transformed point clouds into a complete point cloud of the object, and perform point cloud smoothing to remove noise, obtaining a reconstructed point cloud M; segment the reconstructed point cloud M with a Euclidean clustering segmentation algorithm to obtain the point clouds O = {o_i | i = 1, 2, …, c} of the c objects in the scene;
Step three, surface triangulation: calculate the normal vectors of each object point cloud and unify their directions with a normal propagation method based on a tree-structured hierarchical Riemannian graph; triangulate O with a Poisson surface reconstruction algorithm to obtain the three-dimensional model m_i, i = 1, 2, …, c, of each object;
Step four, segmentation information labeling: project the triangular faces of the object three-dimensional models m_i, i = 1, 2, …, c, one by one onto the camera plane to obtain the segmentation mask sequence of each object;
Step five, pose information labeling: calculate the oriented bounding box of each object's three-dimensional model m_i and the bounding box's pose transformation matrix T_mi; according to the pose transformation relation T_1k between the first frame point cloud and the kth frame point cloud, calculate the six-degree-of-freedom pose information of the object model m_i in the kth frame point cloud as T_mi^k = T_1k^(-1)·T_mi.
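The global registration in step one, which recovers T_1k from corresponding ArUco corner points, can be sketched with a closed-form least-squares rigid alignment (the Kabsch/SVD method; this is a generic stand-in for illustration, the claim does not specify the registration algorithm):

```python
import numpy as np

def rigid_transform(src, dst):
    """Least-squares rigid transform (Kabsch): finds T with dst ≈ R @ src + t.

    src, dst are (n, 3) arrays of corresponding points (e.g. ArUco corners
    seen in two frames). Returns a 4x4 homogeneous transform."""
    src_c = src - src.mean(axis=0)
    dst_c = dst - dst.mean(axis=0)
    H = src_c.T @ dst_c                     # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst.mean(axis=0) - R @ src.mean(axis=0)
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T
```

In practice the per-pair estimates would be refined jointly over all frames (global registration), but the closed-form solve above is the core building block.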
6. The automatic acquisition system of claim 5, wherein in the surface triangulation step, the normal propagation method based on the tree-structured hierarchical Riemannian graph unifies the normal vectors of the object point clouds as follows:
(1) Perform normal estimation on the point cloud by principal component analysis with a larger neighborhood radius to obtain rough normal vectors;
(2) Select the point with the minimum curvature as the root node of the minimum spanning tree and mark it;
(3) Calculate the distances from all unmarked points to the tree, and find the point p_i closest to the tree and the tree node nearest to p_i;
(4) Add p_i to the tree as a new node connected to its nearest tree node, generating a new edge, and mark p_i;
(5) Repeat steps (3) and (4) until all points in the point cloud have been added to the minimum spanning tree;
(6) Traverse all edges of the minimum spanning tree and calculate the angle between the normal vectors of the nodes at the two ends of each edge; if the angle is larger than 90 degrees, invert the newly added node's normal vector, thereby obtaining rough normal vectors with consistent directions;
(7) Perform normal estimation on the point cloud by principal component analysis with a smaller neighborhood radius to obtain precise normal vectors;
(8) Calculate the angle between the precise and rough normal vectors of each point; if the angle is larger than 90 degrees, invert the precise normal vector, finally obtaining precise normal vectors with consistent directions.
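The propagation in steps (2)–(6) can be sketched as Prim's minimum-spanning-tree construction over Euclidean distances, flipping each newly added point's normal when it opposes its parent's (a minimal illustration assuming an arbitrary root in place of the minimum-curvature point; all names are illustrative):

```python
import numpy as np

def orient_normals_mst(points, normals, root=0):
    """Grow an MST with Prim's algorithm over Euclidean distance; flip each
    normal whose angle to its parent's normal exceeds 90° (dot product < 0)."""
    n = len(points)
    normals = normals.copy()
    in_tree = np.zeros(n, dtype=bool)
    in_tree[root] = True
    # dist[i]/parent[i]: distance to and index of the nearest tree node so far
    dist = np.linalg.norm(points - points[root], axis=1)
    parent = np.full(n, root)
    for _ in range(n - 1):
        # extract the unmarked point closest to the tree (step (3))
        i = int(np.argmin(np.where(in_tree, np.inf, dist)))
        in_tree[i] = True                       # add node and edge (step (4))
        if normals[i] @ normals[parent[i]] < 0:  # step (6): opposing normals
            normals[i] = -normals[i]
        # relax distances through the new node
        d_new = np.linalg.norm(points - points[i], axis=1)
        closer = d_new < dist
        dist = np.where(closer, d_new, dist)
        parent = np.where(closer, i, parent)
    return normals
```

Because each point is oriented against a parent that is already in the tree, a single pass suffices; steps (7)–(8) then repeat the flip test against these rough normals using the finer PCA estimate.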
CN202211482124.9A 2022-11-24 2022-11-24 Automatic acquisition system for pose estimation data set with six degrees of freedom Pending CN115719377A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211482124.9A CN115719377A (en) 2022-11-24 2022-11-24 Automatic acquisition system for pose estimation data set with six degrees of freedom

Publications (1)

Publication Number Publication Date
CN115719377A true CN115719377A (en) 2023-02-28

Family

ID=85256281

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211482124.9A Pending CN115719377A (en) 2022-11-24 2022-11-24 Automatic acquisition system for pose estimation data set with six degrees of freedom

Country Status (1)

Country Link
CN (1) CN115719377A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116763295A (en) * 2023-08-11 2023-09-19 北京市农林科学院智能装备技术研究中心 Livestock scale measuring method, electronic equipment and storage medium
CN116763295B (en) * 2023-08-11 2024-02-06 北京市农林科学院智能装备技术研究中心 Livestock scale measuring method, electronic equipment and storage medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination