CN113393524A - Target pose estimation method combining deep learning and contour point cloud reconstruction - Google Patents
- Publication number: CN113393524A
- Application number: CN202110676959.7A
- Authority: CN (China)
- Legal status: Granted
Classifications
- G06T7/73: Determining position or orientation of objects or cameras using feature-based methods
- G06N3/045: Combinations of networks
- G06T7/13: Edge detection
- G06T7/85: Stereo camera calibration
- G06T2207/10012: Stereo images
- G06T2207/20081: Training; Learning
Abstract
The invention relates to the technical field of target pose estimation, and in particular to a target pose estimation method combining deep learning and contour point cloud reconstruction, comprising the following steps: S1, calibrating the binocular vision system and performing stereo rectification; S2, recognizing the targets in the left and right camera images with the trained target detection network model and obtaining the boundary regions of the targets; S3, detecting straight line segments within the detected target boundary regions in both images with the LSD algorithm; S4, matching the straight line segments by combining the output of the deep-learning target detection network with a multi-constraint method; S5, reconstructing the contour point cloud of the target; and S6, estimating the pose of the target. By applying the YOLOv4 deep learning algorithm to the left and right camera images and combining it with contour point cloud reconstruction, the method keeps the computation time and computational load of stereo matching small, and the use of ordinary cameras greatly reduces cost.
Description
Technical Field
The invention relates to the technical field of target pose estimation, in particular to a target pose estimation method combining deep learning and contour point cloud reconstruction.
Background
Pose estimation aims to acquire the three-dimensional coordinates and the three-dimensional rotation vector of a target in the camera coordinate system. In many cases, an accurate estimate of the target's 6D pose is all that a machine needs for its next operation or decision: in intelligent-robot tasks, the 6D pose of the target provides useful information for grasping and motion planning, and in virtual reality applications it is the key to supporting virtual interaction between objects.
The existing 6D pose estimation methods are mainly point cloud registration methods, which can handle targets with complex shapes and weak textures and offer good accuracy and robustness. Depending on how the point cloud data is acquired, they can be divided into methods based on binocular vision and methods based on depth cameras.
Most existing binocular-vision methods first solve a disparity map of the scene with the SGBM (Semi-Global Block Matching) stereo matching algorithm, then reconstruct the point cloud of the scene from the disparity map and segment the target, and finally register it against a template point cloud to obtain the target's pose; because the point cloud of the entire scene is reconstructed, the computation time of stereo matching is excessive. Depth-camera schemes first acquire the point cloud of the target, compute three-dimensional features for every point, and then estimate the pose from those features; depth cameras, however, cost considerably more than ordinary cameras. How to provide a pose estimation method with a small computational load, low cost and high precision is therefore a problem urgently needing a solution from those skilled in the art.
Disclosure of Invention
The technical problem to be solved by the invention is addressed as follows: the target boundary regions in the left and right camera images are obtained with a YOLOv4 deep learning algorithm; straight line segments on the target are detected and matched with the LSD algorithm and a multi-constraint method; the contour point cloud of the target is reconstructed; and the pose of the target is estimated. As a result, the computation time of stereo matching is short, the computational load is small, and acquiring the images with ordinary cameras greatly reduces development cost.
The technical scheme adopted by the invention is as follows: a target pose estimation method combining deep learning and contour point cloud reconstruction comprises the following steps:
S1, calibrating the binocular vision system with Zhang Zhengyou's checkerboard calibration method, performing stereo rectification of the binocular cameras with the Bouguet algorithm based on the calibrated parameters, selecting multiple kinds of targets as analysis objects, training them with a YOLOv4 network, and establishing detection network models for the various targets;
s2, recognizing the targets in the left camera image and the right camera image by using the trained target detection network model, and obtaining the boundary area of the targets;
s3, carrying out straight line segment detection on the boundary area of the target detected in the left and right camera images by using an LSD algorithm;
S4, matching the straight line segments by combining the category and boundary region output by the deep-learning target detection network with a multi-constraint method;
s5, reconstructing contour point cloud of the target;
and S6, performing pose estimation on the target by using a point cloud registration method.
Further, step S3 includes:
S31, taking out the end points of all straight line segments and grouping end points whose Euclidean distances are smaller than a set threshold d into the same group;
S32, for any particular group of end points: if the group contains two or more end points, merging them into a single point, denoted P_r, calculated as
P_r = (1 / C(n,2)) * Σ P_i,
where P_i is the intersection of the extended lines of the straight line segments to which any two end points in the current group respectively belong, n is the number of end points in the group, and C(n,2) is the number of ways to choose 2 end points from the n end points;
S33, using P_r as the common end point of the straight line segments on which the group's end points lie, thereby obtaining the optimized, reconstructed straight line segments.
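The endpoint-merging rule of steps S31-S33 can be sketched as follows. This is a minimal illustration, not the patented implementation: the helper names (`line_intersection`, `merge_group`) are invented, and the merged point P_r is computed, as the description states, by averaging the pairwise intersections of the supporting lines of the group's segments.

```python
import itertools

def line_intersection(seg_a, seg_b):
    """Intersection of the infinite lines through two segments ((x1,y1),(x2,y2)); None if parallel."""
    (x1, y1), (x2, y2) = seg_a
    (x3, y3), (x4, y4) = seg_b
    d = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if abs(d) < 1e-12:
        return None
    t = ((x1 - x3) * (y3 - y4) - (y1 - y3) * (x3 - x4)) / d
    return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))

def merge_group(segments):
    """P_r of step S32: average of the C(n,2) pairwise intersections of the group's supporting lines."""
    pts = [p for a, b in itertools.combinations(segments, 2)
           if (p := line_intersection(a, b)) is not None]
    if not pts:
        return None
    return (sum(x for x, _ in pts) / len(pts), sum(y for _, y in pts) / len(pts))
```

For two segments whose supporting lines y = x and y = -x meet at the origin, `merge_group` returns a point at (0, 0); that point then replaces both nearby end points (step S33).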
Further, step S4 includes:
S41, calculating the length s of every optimized, reconstructed straight line segment and its angle θ with the positive direction of the image's horizontal axis;
S42, for a straight line segment l_l to be matched in the left camera image, recording the target boundary region Rect_l to which it belongs, finding the boundary region Rect_r of the same target in the right camera image, and taking out all straight line segments in Rect_r as the straight line segment set L_r;
S43, removing from L_r every straight line segment whose midpoint abscissa is greater than the midpoint abscissa of l_l, obtaining a new straight line segment set L'_r;
S44, calculating, for l_l and each straight line segment l_r (l_r ∈ L'_r), the vertical error E_e, the length error E_s and the angle error E_θ:
E_e = |y_ls - y_rs| + |y_le - y_re|, E_s = |s_l - s_r|, E_θ = |θ_l - θ_r|,
where y_ls and y_rs are the ordinates of the starting end points of l_l and l_r, y_le and y_re are the ordinates of their terminating end points; s_l and s_r are the lengths of l_l and l_r; and θ_l and θ_r are their angles with the positive direction of the image's horizontal axis;
S45, stacking E_e, E_s and E_θ into a matching error vector E = [E_e E_s E_θ] and normalizing each value in E;
S46, computing the matching error value E_total of l_l against each straight line segment in L'_r, and taking the straight line segment with the smallest E_total as the match of l_l.
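The multi-constraint matching of steps S41-S46 can be sketched as follows, under stated assumptions: the exact error formulas and the weighting of the normalized error components are not reproduced in this text, so the definitions below (vertical, length and angle differences, max-normalized over the candidate set, then summed with equal weights) are plausible stand-ins rather than the patent's exact formulas.

```python
def match_segment(seg_l, candidates):
    """Pick the candidate in the right image minimizing a combined error (steps S44-S46).
    Segments are dicts: ys/ye (endpoint ordinates), xm (midpoint abscissa), s (length), theta (angle)."""
    # S43: in a rectified pair, the match lies at a smaller or equal abscissa in the right image
    cands = [c for c in candidates if c["xm"] <= seg_l["xm"]]
    if not cands:
        return None

    def errors(c):
        e_e = abs(seg_l["ys"] - c["ys"]) + abs(seg_l["ye"] - c["ye"])  # vertical error E_e
        e_s = abs(seg_l["s"] - c["s"])                                  # length error E_s
        e_t = abs(seg_l["theta"] - c["theta"])                          # angle error E_theta
        return (e_e, e_s, e_t)

    errs = [errors(c) for c in cands]
    # S45: normalize each error component over the candidate set, then sum (S46)
    maxima = [max(col) or 1.0 for col in zip(*errs)]
    totals = [sum(e / m for e, m in zip(e3, maxima)) for e3 in errs]
    return cands[totals.index(min(totals))]
```

A candidate with nearly identical ordinates, length and angle then beats one with large differences, and candidates lying to the right of the left-image segment are excluded before scoring.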
Further, step S5 includes:
S51, let the starting and terminating end points of a pair of matched straight line segments in the left and right camera images be p_ls(u_ls, v_ls), p_le(u_le, v_le) and p_rs(u_rs, v_rs), p_re(u_re, v_re) respectively; substituting u_ls, u_rs and v_ls of p_ls and p_rs for u_l, u_r and v_l in formula (4) reconstructs the starting end point P_s(x_s, y_s, z_s) of the three-dimensional straight line segment, and substituting u_le, u_re and v_le of p_le and p_re for u_l, u_r and v_l in formula (4) reconstructs its terminating end point P_e(x_e, y_e, z_e):
X_c = (u_l - u_0) · b / (u_l - u_r), Y_c = (v_l - v_0) · b / (u_l - u_r), Z_c = f · b / (u_l - u_r), (4)
where u_l and u_r are the abscissas of the point to be reconstructed in the left and right camera images, v_l is its ordinate in the left camera image, b is the baseline distance of the binocular camera, (u_0, v_0) are the coordinates of the left camera's optical axis centre, f is the focal length of the two cameras, and (X_c, Y_c, Z_c) are the three-dimensional coordinates of the reconstructed point in the left camera coordinate system;
S52, from the two reconstructed three-dimensional end points P_s and P_e, computing the spatial line equation L(x, y, z) that they represent:
(x - x_s)/(x_e - x_s) = (y - y_s)/(y_e - y_s) = (z - z_s)/(z_e - z_s), (5)
whose direction vector is n(x_e - x_s, y_e - y_s, z_e - z_s), subsequently normalized to the unit vector n_unit(x_unit, y_unit, z_unit);
S53, substituting the starting end point P_s(x_s, y_s, z_s) of the three-dimensional straight line segment for (x_{i-1}, y_{i-1}, z_{i-1}) in formula (6) and iterating step by step towards the terminating end point P_e(x_e, y_e, z_e) generates the point cloud of the spatial straight line segment:
(x_i, y_i, z_i) = (x_{i-1} + ΔS · x_unit, y_{i-1} + ΔS · y_unit, z_{i-1} + ΔS · z_unit), (6)
where (x_i, y_i, z_i) are the coordinates of the current point in the iteration, (x_{i-1}, y_{i-1}, z_{i-1}) those of the previous point, and ΔS is a preset iteration step, i.e. the spatial distance between adjacent points in the discretized three-dimensional point cloud;
S54, generating the point clouds of all matched straight line segments to obtain the contour point cloud of the target.
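The reconstruction of steps S51-S53 can be sketched with the standard rectified-stereo triangulation relations and a fixed-step walk along the segment; the function names are illustrative, and the triangulation follows the usual disparity formulas for a rectified pair.

```python
import math

def triangulate(ul, ur, vl, b, f, u0, v0):
    """Formula (4): disparity-based reconstruction in the left-camera frame (rectified pair)."""
    d = ul - ur                       # disparity
    zc = f * b / d
    xc = (ul - u0) * b / d
    yc = (vl - v0) * b / d
    return (xc, yc, zc)

def sample_segment(ps, pe, step):
    """Steps S52-S53: walk from Ps towards Pe along the unit direction vector in increments dS."""
    n = [e - s for s, e in zip(ps, pe)]
    length = math.sqrt(sum(c * c for c in n))
    unit = [c / length for c in n]
    pts, t = [list(ps)], step
    while t <= length:
        pts.append([s + t * u for s, u in zip(ps, unit)])
        t += step
    return pts
```

With focal length 500 px, baseline 0.1 m and principal point (320, 240), a disparity of 100 px places the point 0.5 m in front of the left camera; sampling a 1 m segment at ΔS = 0.25 m yields five points including both end points.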
Further, step S6 includes:
S61, offline, generating the complete contour point cloud of the target to be measured from its CAD model as the template point cloud, and computing the Fast Point Feature Histograms (FPFH) of the template point cloud;
S62, taking the template point cloud generated in step S61 as the source point cloud P and the reconstructed contour point cloud of the target as the target point cloud Q, and computing the FPFH of Q;
S63, randomly selecting k sampling points from the source point cloud P, where k is an integer greater than 3, finding for each sampling point several points in the target point cloud Q with similar FPFH, and randomly selecting one of them as the corresponding point of that sampling point;
S64, computing the transformation matrix of these point correspondences and then evaluating the transformation error with a Huber penalty function, recorded as the sum of H(e_i) over all point pairs, where H(e_i) is calculated as
H(e_i) = (1/2) e_i² if |e_i| ≤ t_e; H(e_i) = (1/2) t_e (2|e_i| - t_e) otherwise,
where t_e is a preset value and e_i is the distance difference of the i-th point pair after transformation;
S65, repeating steps S63 and S64 until a preset number of iterations is reached, and finally taking the transformation matrix with the smallest transformation error as the initial transformation matrix;
S66, applying the initial transformation matrix to the source point cloud P to obtain a new source point cloud P';
S67, for each point in the new source point cloud P', finding the point in the target point cloud Q with the smallest Euclidean distance as its corresponding point, then computing the transformation matrix and the corresponding error E(R, T):
E(R, T) = (1/n) Σ_{i=1..n} ||q_i - (R p_i + T)||²,
where E(R, T) is the error between the new source point cloud P' and the target point cloud Q under the transformation (R, T), and p_i and q_i are the coordinates of corresponding points in P' and Q;
S68, applying the transformation matrix obtained in step S67 to P' to obtain a new source point cloud P'', and computing the error E(R, T) between P'' and Q;
S69, repeating steps S67 and S68 until E(R, T) is smaller than a preset error value or the number of iterations reaches a preset maximum, finally solving the rotation-translation matrix between the two point clouds and decomposing it into the three-dimensional coordinates and the three-dimensional rotation vector, i.e. the pose of the target.
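Steps S67-S69 are essentially point-to-point ICP. A compact numpy sketch follows, under stated assumptions: the coarse FPFH/SAC-IA alignment of S61-S66 is omitted, so the clouds are assumed already roughly aligned, and the per-iteration rigid transform is solved with the standard SVD (Kabsch) least-squares step.

```python
import numpy as np

def icp_step(P, Q):
    """One S67-style iteration: nearest-neighbour correspondences, then the least-squares
    rigid transform (R, T) via SVD; returns R, T and the mean squared error E(R, T)."""
    d2 = ((P[:, None, :] - Q[None, :, :]) ** 2).sum(-1)
    Qc = Q[d2.argmin(axis=1)]            # closest point in Q for every point of P
    mp, mq = P.mean(0), Qc.mean(0)
    H = (P - mp).T @ (Qc - mq)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:             # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    T = mq - R @ mp
    E = np.mean(np.sum((Qc - (P @ R.T + T)) ** 2, axis=1))
    return R, T, E

def icp(P, Q, iters=20, tol=1e-10):
    """Steps S67-S68 repeated until E(R, T) or the iteration count meets the stop condition (S69)."""
    Rtot, Ttot = np.eye(3), np.zeros(3)
    for _ in range(iters):
        R, T, E = icp_step(P @ Rtot.T + Ttot, Q)
        Rtot, Ttot = R @ Rtot, R @ Ttot + T
        if E < tol:
            break
    return Rtot, Ttot
```

For a small rigid displacement of a well-separated point set, the first nearest-neighbour assignment is already correct and the routine recovers the ground-truth rotation and translation in one step.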
The invention has the beneficial effects that:
1. a target pose estimation method based on binocular vision, combining deep learning and contour point cloud reconstruction, is proposed and realized;
2. the binocular-camera scheme costs less than a depth-camera scheme;
3. reconstructing and registering only the contour point cloud of the target is computationally more efficient than processing a dense point cloud of the target, while still guaranteeing adequate precision.
Drawings
FIG. 1 is a flow chart of the target pose estimation method combining deep learning and contour point cloud reconstruction according to the present invention;
FIG. 2 is a graph of the YOLOv4 network's loss value versus training round;
FIG. 3 shows confidence results of the YOLOv4 network on the test set;
FIG. 4 compares straight line segments before and after the optimized reconstruction;
FIG. 5 shows the final target detection, recognition and straight-line-segment matching results;
FIG. 6 illustrates the process of generating the point clouds of all matched straight line segments;
FIG. 7 is the target contour point cloud reconstructed by the present invention;
FIG. 8 is the CAD-generated complete contour point cloud in the target coordinate system;
FIG. 9 is the final point cloud registration result.
Detailed Description
The invention will be further described with reference to the accompanying drawings and embodiments. The drawings are simplified schematic diagrams that illustrate the basic structure of the invention, and therefore show only the structures relevant to the invention.
As shown in FIG. 1: S1, the binocular cameras are calibrated and stereo-rectified, multiple kinds of targets are selected as analysis objects and trained with the YOLOv4 network, and detection network models for the various targets are established.
The implementation is verified on three representative targets: a square type, a slice type and an angle type. 400 images are collected for each type of target, 1200 in total, and the labelled samples are divided into a training set, a validation set and a test set in the ratio 8:1:1. The training set is fed into the YOLOv4 network with an initial network momentum of 0.9, an initial learning rate of 0.001 and a training batch size of 8; the backbone network is first frozen for 30 rounds of warm-up training, after which the learning rate is set to 0.0001 and the unfrozen full network is trained for another 30 rounds. The validation samples are evaluated after every training round and the loss value computed; as shown in FIG. 2, the network converges after 60 rounds of training. Testing the trained model on the test set gives the confidence results partially shown in FIG. 3, demonstrating that the trained model can accurately identify and frame the targets.
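The 8:1:1 split described above can be reproduced as follows; this is a trivial sketch, and the function name and seed are arbitrary illustrations, not part of the patent.

```python
import random

def split_dataset(samples, seed=42):
    """8:1:1 train/validation/test split of the 1200 labelled images described in the embodiment."""
    samples = list(samples)
    random.Random(seed).shuffle(samples)
    n = len(samples)
    n_train, n_val = (n * 8) // 10, n // 10
    return (samples[:n_train],
            samples[n_train:n_train + n_val],
            samples[n_train + n_val:])
```

For 1200 samples this yields 960 training, 120 validation and 120 test images.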
The binocular vision system is calibrated with Zhang Zhengyou's checkerboard calibration method, and the binocular cameras are stereo-rectified with the Bouguet stereo rectification method based on the calibrated parameters.
S2, field images are acquired with the stereo-rectified binocular cameras, and all targets in the left and right camera images are recognized with the trained YOLOv4 network to obtain the boundary regions of the targets.
S3, carrying out straight line segment detection on the target boundary areas detected in the left camera image and the right camera image by using an LSD algorithm;
Because the LSD algorithm assumes that each pixel belongs to at most one straight line segment, two or more intersecting straight line segments are generally broken apart at their intersection when detected; the segment end points therefore need to be recalculated and merged to optimize and reconstruct the segments. First the end points of all straight line segments are taken out, end points whose distance is smaller than the set threshold are merged, and the merged end point P_r is used as the common end point of these straight line segments, i.e. their intersection, yielding the optimized, reconstructed straight line segments. The process is shown in FIG. 4: before optimization, the plain LSD algorithm breaks intersecting straight line segments apart at the intersection, whereas after optimization the discontinuous contour of the target is correctly connected.
S4, matching the straight line segments by combining the category and boundary region output by the deep-learning target detection network with a multi-constraint method;
S41, calculating the length s of every optimized, reconstructed straight line segment and its angle θ with the positive direction of the image's horizontal axis;
S42, for a straight line segment l_l to be matched in the left camera image, recording the target boundary region Rect_l to which it belongs, finding the boundary region Rect_r of the same target in the right camera image, and taking out all straight line segments in Rect_r as the straight line segment set L_r;
S43, removing from L_r every straight line segment whose midpoint abscissa is greater than that of l_l, obtaining a new straight line segment set L'_r;
S44-S46, computing the matching error value E_total of l_l against each straight line segment l_r (l_r ∈ L'_r), and taking the straight line segment with the smallest E_total as the match of l_l.
The final target detection, recognition and straight-line-segment matching results are shown in FIG. 5; the numbers in the figure are the YOLOv4 recognition confidences for the square-type, slice-type and angle-type targets in the left and right camera images. For clarity, only the matching result of the square-type target is displayed; the method correctly matches the straight line segments between the left and right camera images.
S5, reconstructing contour point cloud of the target;
S52, a three-dimensional spatial line equation is obtained from the two reconstructed three-dimensional end points of each matched straight line segment, and iterative sampling along the spatial line generates the point clouds of all matched straight line segments; the process is shown in FIG. 6, and the finally reconstructed target contour point cloud is shown in FIG. 7. As FIG. 7 shows, the contour point cloud of the target is accurately reconstructed and its position in space matches the actual position. Because the method matches and reconstructs only the contour of the target, its computational load is smaller and its efficiency higher than reconstructing the whole scene.
S6, performing pose estimation on the target by using a point cloud registration method;
A complete contour point cloud defined in the target coordinate system, shown in FIG. 8, is generated with CAD and then registered against the reconstructed contour point cloud to obtain the pose of the target; the final registration result is shown in FIG. 9. After registration the two point clouds approximately coincide, indicating that the pose estimate is correct and of high precision.
For comparison, a registration experiment with a dense point cloud of the target surface was also carried out and timed, and compared against the contour point cloud registration of the method; the results are shown in Table 1:
TABLE 1
Because the contour point cloud contains far fewer points while retaining the target's structural information to the greatest extent, the average processing speed is improved by a factor of about 50.
The absolute errors between the manually measured actual pose and the pose computed by the algorithm described here are shown in Table 2:
TABLE 2
The position error in every direction estimated by the method is smaller than 0.7 mm and the attitude error smaller than 0.9°, meeting the requirements of practical application.
The beneficial effects of the invention are: a target pose estimation method based on binocular vision, combining deep learning and contour point cloud reconstruction, is proposed and realized; the binocular-camera scheme costs less than a depth-camera scheme; and reconstructing and registering only the contour point cloud of the target is computationally more efficient than processing a dense point cloud of the target, while still guaranteeing adequate precision.
In light of the foregoing description of the preferred embodiment of the present invention, many modifications and variations will be apparent to those skilled in the art without departing from the spirit and scope of the invention. The technical scope of the present invention is not limited to the content of the specification, and must be determined according to the scope of the claims.
Claims (5)
1. A target pose estimation method combining deep learning and contour point cloud reconstruction is characterized by comprising the following steps:
s1, calibrating and performing stereo correction on the binocular camera, selecting various targets as analysis objects, training the various targets through a YOLOv4 network, and establishing various target detection network models;
s2, recognizing the targets in the left camera image and the right camera image by using the trained target detection network model, and obtaining the boundary area of the targets;
s3, carrying out straight line segment detection on the boundary area of the target detected in the left and right camera images by using an LSD algorithm;
S4, matching the straight line segments by combining the category and boundary region output by the deep-learning target detection network with a multi-constraint method;
s5, reconstructing contour point cloud of the target;
and S6, performing pose estimation on the target by using a point cloud registration method.
2. The method for estimating the pose of an object by combining deep learning and reconstruction of a contour point cloud according to claim 1, wherein the step S3 comprises:
S31, taking out the end points of all straight line segments and grouping end points whose Euclidean distances are smaller than a set threshold d into the same group;
S32, for any particular group of end points: if the group contains two or more end points, merging them into a single point, denoted P_r, calculated as
P_r = (1 / C(n,2)) * Σ P_i,
where P_i is the intersection of the extended lines of the straight line segments to which any two end points in the current group respectively belong, n is the number of end points in the group, and C(n,2) is the number of ways to choose 2 end points from the n end points;
S33, using P_r as the common end point of the straight line segments on which the group's end points lie, thereby obtaining the optimized, reconstructed straight line segments.
3. The method for estimating the pose of an object by combining deep learning and reconstruction of a contour point cloud according to claim 1, wherein the step S4 comprises:
S41, calculating the length s of every optimized, reconstructed straight line segment and its angle θ with the positive direction of the image's horizontal axis;
S42, for a straight line segment l_l to be matched in the left camera image, recording the target boundary region Rect_l to which it belongs, finding the boundary region Rect_r of the same target in the right camera image, and taking out all straight line segments in Rect_r as the straight line segment set L_r;
S43, removing from L_r every straight line segment whose midpoint abscissa is greater than the midpoint abscissa of l_l, obtaining a new straight line segment set L'_r;
S44, calculating, for l_l and each straight line segment l_r (l_r ∈ L'_r), the vertical error E_e, the length error E_s and the angle error E_θ:
E_e = |y_ls - y_rs| + |y_le - y_re|, E_s = |s_l - s_r|, E_θ = |θ_l - θ_r|,
where y_ls and y_rs are the ordinates of the starting end points of l_l and l_r, y_le and y_re are the ordinates of their terminating end points; s_l and s_r are the lengths of l_l and l_r; and θ_l and θ_r are their angles with the positive direction of the image's horizontal axis;
S45, stacking E_e, E_s and E_θ into a matching error vector E = [E_e E_s E_θ] and normalizing each value in E;
S46, computing the matching error value E_total of l_l against each straight line segment in L'_r, and taking the straight line segment with the smallest E_total as the match of l_l.
4. The method for estimating the pose of an object by combining deep learning and reconstruction of a contour point cloud according to claim 1, wherein the step S5 comprises:
S51, let the starting and terminating end points of a pair of matched straight line segments in the left and right camera images be p_ls(u_ls, v_ls), p_le(u_le, v_le) and p_rs(u_rs, v_rs), p_re(u_re, v_re) respectively; substituting u_ls, u_rs and v_ls of p_ls and p_rs for u_l, u_r and v_l in formula (4) reconstructs the starting end point P_s(x_s, y_s, z_s) of the three-dimensional straight line segment, and substituting u_le, u_re and v_le of p_le and p_re for u_l, u_r and v_l in formula (4) reconstructs its terminating end point P_e(x_e, y_e, z_e):
X_c = (u_l - u_0) · b / (u_l - u_r), Y_c = (v_l - v_0) · b / (u_l - u_r), Z_c = f · b / (u_l - u_r), (4)
where u_l and u_r are the abscissas of the point to be reconstructed in the left and right camera images, v_l is its ordinate in the left camera image, b is the baseline distance of the binocular camera, (u_0, v_0) are the coordinates of the left camera's optical axis centre, f is the focal length of the two cameras, and (X_c, Y_c, Z_c) are the three-dimensional coordinates of the reconstructed point in the left camera coordinate system;
S52, from the two reconstructed three-dimensional end points P_s and P_e, computing the spatial line equation L(x, y, z) that they represent:
(x - x_s)/(x_e - x_s) = (y - y_s)/(y_e - y_s) = (z - z_s)/(z_e - z_s), (5)
whose direction vector is n(x_e - x_s, y_e - y_s, z_e - z_s), subsequently normalized to the unit vector n_unit(x_unit, y_unit, z_unit);
S53, substituting the starting end point P_s(x_s, y_s, z_s) of the three-dimensional straight line segment for (x_{i-1}, y_{i-1}, z_{i-1}) in formula (6) and iterating step by step towards the terminating end point P_e(x_e, y_e, z_e) generates the point cloud of the spatial straight line segment:
(x_i, y_i, z_i) = (x_{i-1} + ΔS · x_unit, y_{i-1} + ΔS · y_unit, z_{i-1} + ΔS · z_unit), (6)
where (x_i, y_i, z_i) are the coordinates of the current point in the iteration, (x_{i-1}, y_{i-1}, z_{i-1}) those of the previous point, and ΔS is a preset iteration step, i.e. the spatial distance between adjacent points in the discretized three-dimensional point cloud;
S54, generating the point clouds of all matched straight line segments to obtain the contour point cloud of the target.
5. The method for estimating the pose of an object by combining deep learning and reconstruction of a contour point cloud according to claim 1, wherein the step S6 comprises:
S61, in an off-line state, generating the complete contour point cloud of the target to be detected from a CAD model as the template point cloud, and calculating the Fast Point Feature Histogram (FPFH) of the template point cloud;
S62, taking the template point cloud generated in step S61 as the source point cloud P, taking the reconstructed contour point cloud of the target as the target point cloud Q, and calculating the FPFH of the target point cloud Q;
S63, randomly selecting k sampling points from the source point cloud P, wherein k is an integer greater than 3; searching the target point cloud Q for the points whose FPFH is similar to that of each sampling point, and then randomly selecting one of them as the corresponding point of that sampling point;
S64, calculating the transformation matrix of the point correspondences, and then computing the transformation error with a Huber penalty function, recorded as Σ_i H(e_i), wherein H(e_i) is calculated as:

H(e_i) = (1/2)·e_i²   when ‖e_i‖ ≤ t_e
H(e_i) = (1/2)·t_e·(2‖e_i‖ − t_e)   when ‖e_i‖ > t_e

in the formula, t_e is a preset threshold value and e_i represents the distance difference of the i-th point pair after the transformation;
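The Huber-penalized transformation error of step S64 can be sketched as follows, assuming the standard Huber penalty form (the function name is illustrative):

```python
import numpy as np

def huber_total_error(e, t_e):
    """Sum of Huber penalties over the point-pair distance differences e_i:
    quadratic for |e_i| <= t_e, linear beyond the threshold t_e, so that
    large outlier pairs do not dominate the alignment error."""
    e = np.abs(np.asarray(e, float))
    quad = 0.5 * e**2                        # quadratic branch, |e_i| <= t_e
    lin = 0.5 * t_e * (2.0 * e - t_e)        # linear branch, |e_i| > t_e
    return np.sum(np.where(e <= t_e, quad, lin))
```

The two branches meet at |e_i| = t_e, so the penalty is continuous; an inlier-like residual of 0.5 with t_e = 1 contributes 0.125, while an outlier residual of 2.0 contributes only 1.5 rather than 2.0.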
S65, repeating steps S63 and S64 until the preset number of iterations is reached, and finally taking the transformation matrix that minimizes the transformation error as the initial transformation matrix;
S66, applying the initial transformation matrix to the source point cloud P to obtain a new source point cloud P';
S67, for each point in the new source point cloud P', finding the point with the smallest Euclidean distance in the target point cloud Q as its corresponding point, and then calculating the transformation matrix and the corresponding error E(R, T):

E(R, T) = (1/n)·Σ_{i=1..n} ‖q_i − (R·p_i + T)‖²

wherein E(R, T) represents the error between the new source point cloud P' and the target point cloud Q under the transformation matrix (R, T), n is the number of corresponding point pairs, and p_i and q_i are respectively the coordinates of each point in the source point cloud P' and the target point cloud Q;
S68, applying the transformation matrix obtained in step S67 to the source point cloud P' to obtain a new source point cloud P'', and calculating the error E(R, T) between P'' and the target point cloud Q;
S69, repeating steps S67 and S68 until E(R, T) or the number of iterations satisfies the set condition, finally solving the rotation-translation matrix between the two point clouds and decomposing it into a three-dimensional coordinate and a three-dimensional rotation vector, namely the pose of the target.
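Each ICP update (steps S67–S68) solves for the rigid transform minimizing E(R, T) over the current correspondences. The closed-form SVD (Kabsch) solution is sketched below, assuming the correspondences have already been paired; the function name is illustrative and the patent does not prescribe this particular solver:

```python
import numpy as np

def best_rigid_transform(P, Q):
    """Least-squares rigid transform (R, T) minimizing
    E(R, T) = (1/n) * sum ||q_i - (R p_i + T)||^2
    for paired point sets P (source) and Q (target), via SVD."""
    P, Q = np.asarray(P, float), np.asarray(Q, float)
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)                # 3x3 cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.linalg.det(Vt.T @ U.T)])  # guard reflections
    R = Vt.T @ D @ U.T                       # optimal rotation
    T = cq - R @ cp                          # optimal translation
    return R, T
```

Alternating this solver with nearest-neighbor correspondence search (step S67) and re-applying the result (step S68) is the classic point-to-point ICP loop; the final accumulated (R, T) decomposes into the three-dimensional position and rotation of the target.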
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110676959.7A CN113393524B (en) | 2021-06-18 | 2021-06-18 | Target pose estimation method combining deep learning and contour point cloud reconstruction |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113393524A true CN113393524A (en) | 2021-09-14 |
CN113393524B CN113393524B (en) | 2023-09-26 |
Family
ID=77621867
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110676959.7A Active CN113393524B (en) | 2021-06-18 | 2021-06-18 | Target pose estimation method combining deep learning and contour point cloud reconstruction |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113393524B (en) |
Citations (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106909877A (en) * | 2016-12-13 | 2017-06-30 | 浙江大学 | A kind of vision based on dotted line comprehensive characteristics builds figure and localization method simultaneously |
US20170212516A1 (en) * | 2016-01-27 | 2017-07-27 | National Institute Of Advanced Industrial Science And Technology | Position control system and position control method for an unmanned surface vehicle |
CN108133458A (en) * | 2018-01-17 | 2018-06-08 | 视缘(上海)智能科技有限公司 | A kind of method for automatically split-jointing based on target object spatial point cloud feature |
CN108305277A (en) * | 2017-12-26 | 2018-07-20 | 中国航天电子技术研究院 | A kind of heterologous image matching method based on straightway |
CN109035200A (en) * | 2018-06-21 | 2018-12-18 | 北京工业大学 | A kind of bolt positioning and position and posture detection method based on the collaboration of single binocular vision |
WO2019005999A1 (en) * | 2017-06-28 | 2019-01-03 | Magic Leap, Inc. | Method and system for performing simultaneous localization and mapping using convolutional image transformation |
US20190026919A1 (en) * | 2016-01-20 | 2019-01-24 | Canon Kabushiki Kaisha | Information processing apparatus, information processing method, and program |
US20190102601A1 (en) * | 2017-09-21 | 2019-04-04 | Lexset.Ai Llc | Detecting one or more objects in an image, or sequence of images, and determining a category and one or more descriptors for each of the one or more objects, generating synthetic training data, and training a neural network with the synthetic training data |
CN109934862A (en) * | 2019-02-22 | 2019-06-25 | 上海大学 | A kind of binocular vision SLAM method that dotted line feature combines |
CN110782494A (en) * | 2019-10-16 | 2020-02-11 | 北京工业大学 | Visual SLAM method based on point-line fusion |
CN110825743A (en) * | 2019-10-31 | 2020-02-21 | 北京百度网讯科技有限公司 | Data importing method and device of graph database, electronic equipment and medium |
US20200218929A1 (en) * | 2017-09-22 | 2020-07-09 | Huawei Technologies Co., Ltd. | Visual slam method and apparatus based on point and line features |
CN111462210A (en) * | 2020-03-31 | 2020-07-28 | 华南理工大学 | Monocular line feature map construction method based on epipolar constraint |
CN111768449A (en) * | 2019-03-30 | 2020-10-13 | 北京伟景智能科技有限公司 | Object grabbing method combining binocular vision with deep learning |
CN112967217A (en) * | 2021-03-11 | 2021-06-15 | 大连理工大学 | Image splicing method based on linear feature matching and constraint |
Non-Patent Citations (2)
Title |
---|
JIANWEI REN: "An improved binocular LSD_SLAM method for object localization", 2020 IEEE INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND COMPUTER APPLICATIONS (ICAICA), DALIAN, CHINA, pages 30 *
RONG SHEN: "Research on binocular vision SLAM based on integrated point-line features", CHINA MASTER'S THESES FULL-TEXT DATABASE, INFORMATION SCIENCE AND TECHNOLOGY, no. 01, pages 138 - 1565 *
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116091706A (en) * | 2023-04-07 | 2023-05-09 | 山东建筑大学 | Three-dimensional reconstruction method for multi-mode remote sensing image deep learning matching |
CN117237451A (en) * | 2023-09-15 | 2023-12-15 | 南京航空航天大学 | Industrial part 6D pose estimation method based on contour reconstruction and geometric guidance |
CN117237451B (en) * | 2023-09-15 | 2024-04-02 | 南京航空航天大学 | Industrial part 6D pose estimation method based on contour reconstruction and geometric guidance |
Also Published As
Publication number | Publication date |
---|---|
CN113393524B (en) | 2023-09-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Fan et al. | Pothole detection based on disparity transformation and road surface modeling | |
CN109166149B (en) | Positioning and three-dimensional line frame structure reconstruction method and system integrating binocular camera and IMU | |
David et al. | Simultaneous pose and correspondence determination using line features | |
CN110853075B (en) | Visual tracking positioning method based on dense point cloud and synthetic view | |
CN107392947B (en) | 2D-3D image registration method based on contour coplanar four-point set | |
CN110853100B (en) | Structured scene vision SLAM method based on improved point-line characteristics | |
CN109960402B (en) | Virtual and real registration method based on point cloud and visual feature fusion | |
CN107492107B (en) | Object identification and reconstruction method based on plane and space information fusion | |
CN111998862B (en) | BNN-based dense binocular SLAM method | |
CN113393524B (en) | Target pose estimation method combining deep learning and contour point cloud reconstruction | |
EP3185212B1 (en) | Dynamic particle filter parameterization | |
CN112163588A (en) | Intelligent evolution-based heterogeneous image target detection method, storage medium and equipment | |
Fanani et al. | Keypoint trajectory estimation using propagation based tracking | |
JPH07103715A (en) | Method and apparatus for recognizing three-dimensional position and attitude based on visual sense | |
CN111882663A (en) | Visual SLAM closed-loop detection method achieved by fusing semantic information | |
CN116662600A (en) | Visual positioning method based on lightweight structured line map | |
CN106056599B (en) | A kind of object recognition algorithm and device based on Object Depth data | |
CN112991372B (en) | 2D-3D camera external parameter calibration method based on polygon matching | |
Kang et al. | 3D urban reconstruction from wide area aerial surveillance video | |
CN112419496A (en) | Semantic map construction method based on deep learning | |
CN112598736A (en) | Map construction based visual positioning method and device | |
CN111915632A (en) | Poor texture target object truth value database construction method based on machine learning | |
Fan et al. | Simple but effective scale estimation for monocular visual odometry in road driving scenarios | |
Wang et al. | Stereo Rectification Based on Epipolar Constrained Neural Network | |
Del-Tejo-Catalá et al. | Probabilistic pose estimation from multiple hypotheses |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||