CN113393524A - Target pose estimation method combining deep learning and contour point cloud reconstruction - Google Patents

Target pose estimation method combining deep learning and contour point cloud reconstruction

Info

Publication number
CN113393524A
Authority
CN
China
Prior art keywords
point cloud
straight line
target
point
line segment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110676959.7A
Other languages
Chinese (zh)
Other versions
CN113393524B (en)
Inventor
陈从平
姚威
张力
江高勇
周正旺
丁坤
张屹
戴国洪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changzhou University
Original Assignee
Changzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date: 2021-06-18
Publication date: 2021-09-14
Application filed by Changzhou University
Priority to CN202110676959.7A
Publication of CN113393524A
Application granted
Publication of CN113393524B
Legal status: Active

Classifications

    • G06T 7/73: Image analysis; determining position or orientation of objects or cameras using feature-based methods
    • G06N 3/045: Neural network architectures; combinations of networks
    • G06T 7/13: Image analysis; segmentation and edge detection
    • G06T 7/85: Camera calibration; stereo camera calibration
    • G06T 2207/10012: Image acquisition modality; stereo images
    • G06T 2207/20081: Special algorithmic details; training and learning

Abstract

The invention relates to the technical field of target pose estimation, and in particular to a target pose estimation method combining deep learning and contour point cloud reconstruction, comprising the following steps: S1, calibrate the binocular vision system and perform stereo rectification; S2, recognize the targets in the left and right camera images with the trained target detection network model and obtain the target boundary regions; S3, perform straight line segment detection on the detected target boundary regions in the left and right camera images with the LSD algorithm; S4, match the straight line segments by combining the output of the deep-learning target detection network with a multi-constraint method; S5, reconstruct the contour point cloud of the target; and S6, estimate the pose of the target. Using the left and right cameras, the YOLOv4 deep-learning algorithm and contour point cloud reconstruction, the method keeps the stereo-matching computation time and the computational load small, and the use of ordinary cameras greatly reduces cost.

Description

Target pose estimation method combining deep learning and contour point cloud reconstruction
Technical Field
The invention relates to the technical field of target pose estimation, in particular to a target pose estimation method combining deep learning and contour point cloud reconstruction.
Background
Pose estimation aims to obtain the three-dimensional coordinates and the three-dimensional rotation vector of a target in the camera coordinate system. In many cases, only an accurately estimated 6D pose of the target enables a machine to carry out its next operation and decision. For example, in intelligent-robot tasks, recognizing the 6D pose of a target provides useful information for grasping and motion planning; in virtual-reality applications, the 6D pose of a target is the key to supporting virtual interaction between arbitrary objects.
Existing 6D pose estimation methods are mainly point cloud registration methods, which can handle targets with complex shapes and weak textures and offer good accuracy and robustness. According to how the point cloud data are acquired, point cloud registration methods can be divided into binocular-vision-based methods and depth-camera-based methods.
Most existing binocular-vision-based methods first solve the disparity map of the scene with the SGBM (Semi-Global Block Matching) stereo-matching algorithm, then reconstruct the scene point cloud from the disparity map and segment the target, and finally register it against a template point cloud to obtain the target pose; because the point cloud of the whole scene is reconstructed, the stereo-matching computation time is excessive. Depth-camera-based schemes first acquire the point cloud of the target, compute the three-dimensional features of every point in the point cloud, and then estimate the pose from those features, but the depth-camera hardware is comparatively expensive. Therefore, how to provide a pose estimation method with a small computational load, low cost and high accuracy is a problem urgently needing to be solved by those skilled in the art.
Disclosure of Invention
To solve the above technical problem, the invention obtains the target boundary regions in the left and right camera images with the YOLOv4 deep-learning algorithm, performs straight line segment detection and matching on the target with the LSD algorithm and a multi-constraint method, reconstructs the contour point cloud of the target, and estimates the target pose, so that the stereo-matching computation time and the computational load are small; in addition, acquiring the images with ordinary cameras greatly reduces development cost.
The technical solution adopted by the invention is as follows: a target pose estimation method combining deep learning and contour point cloud reconstruction comprises the following steps:
S1, calibrate the binocular vision system with Zhang Zhengyou's checkerboard calibration method, stereo-rectify the binocular cameras with the Bouguet algorithm based on the calibrated parameters, select multiple kinds of targets as analysis objects, train them with a YOLOv4 network, and establish the target detection network models (a brief calibration and rectification sketch follows this list);
S2, recognize the targets in the left and right camera images with the trained target detection network model and obtain the target boundary regions;
S3, perform straight line segment detection on the detected target boundary regions in the left and right camera images with the LSD algorithm;
S4, match the straight line segments by combining the category and boundary region output by the deep-learning target detection network with a multi-constraint method;
S5, reconstruct the contour point cloud of the target;
S6, estimate the pose of the target with a point cloud registration method.
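As a non-limiting illustration of step S1, the Python sketch below calibrates and rectifies a binocular rig with OpenCV; the corner arrays, image size and function names are assumptions, and OpenCV's stereoRectify follows Bouguet's method for calibrated cameras.

```python
# Sketch of step S1 (assumed inputs: objpoints/imgpoints_l/imgpoints_r are checkerboard
# corner arrays gathered beforehand; image_size is (width, height)).
import cv2

def calibrate_and_rectify(objpoints, imgpoints_l, imgpoints_r, image_size):
    # Zhang Zhengyou checkerboard calibration of each camera individually
    _, K1, D1, _, _ = cv2.calibrateCamera(objpoints, imgpoints_l, image_size, None, None)
    _, K2, D2, _, _ = cv2.calibrateCamera(objpoints, imgpoints_r, image_size, None, None)
    # Stereo calibration: rotation R and translation T between the two cameras
    _, K1, D1, K2, D2, R, T, _, _ = cv2.stereoCalibrate(
        objpoints, imgpoints_l, imgpoints_r, K1, D1, K2, D2, image_size,
        flags=cv2.CALIB_FIX_INTRINSIC)
    # Stereo rectification (Bouguet's method for calibrated rigs) and remap tables
    R1, R2, P1, P2, Q, _, _ = cv2.stereoRectify(K1, D1, K2, D2, image_size, R, T)
    map_l = cv2.initUndistortRectifyMap(K1, D1, R1, P1, image_size, cv2.CV_32FC1)
    map_r = cv2.initUndistortRectifyMap(K2, D2, R2, P2, image_size, cv2.CV_32FC1)
    return map_l, map_r, Q

# Rectified frames are then obtained with cv2.remap(frame, *map_l, cv2.INTER_LINEAR).
```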
Further, step S3 includes:
S31, take out the end points of all straight line segments and group the end points whose Euclidean distances are smaller than a set threshold d into the same group;
S32, for any specific group of end points, if the number of end points in the group is greater than or equal to 2, merge the end points in the group into a single point, denoted P_r, which is calculated as follows:
P_r = (1 / C(n, 2)) · Σ_i P_i      (1)
where P_i is the intersection point of the extension lines of the straight line segments to which any two end points in the current group respectively belong, n is the number of end points in the group, and C(n, 2) is the number of combinations of 2 end points selected from the n end points;
S33, use P_r as the common end point of the straight line segments to which the grouped end points belong, obtaining the optimized and reconstructed straight line segments.
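As an illustration of steps S31 to S33, the Python sketch below (helper names, the threshold d and the parallel-line tolerance are assumptions, not the patent's code) merges grouped end points into the mean intersection point P_r of formula (1):

```python
# Sketch of the end-point merging in S31-S33.
import itertools
import numpy as np

def line_intersection(seg_a, seg_b):
    """Intersection of the infinite lines through two segments (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = seg_a
    x3, y3, x4, y4 = seg_b
    d = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if abs(d) < 1e-9:                          # parallel lines: no usable intersection
        return None
    px = ((x1 * y2 - y1 * x2) * (x3 - x4) - (x1 - x2) * (x3 * y4 - y3 * x4)) / d
    py = ((x1 * y2 - y1 * x2) * (y3 - y4) - (y1 - y2) * (x3 * y4 - y3 * x4)) / d
    return np.array([px, py])

def merge_endpoints(segments, d_thresh=5.0):
    """segments: list of (x1, y1, x2, y2). End points closer than d_thresh are grouped
    and replaced by P_r, the mean of the pairwise extension-line intersections."""
    endpoints = [(si, ei, np.array(seg[2 * ei:2 * ei + 2], float))
                 for si, seg in enumerate(segments) for ei in (0, 1)]
    segments = [list(map(float, s)) for s in segments]
    used = [False] * len(endpoints)
    for i, (_, _, p_i) in enumerate(endpoints):
        if used[i]:
            continue
        group = [i] + [j for j in range(i + 1, len(endpoints))
                       if not used[j] and np.linalg.norm(p_i - endpoints[j][2]) < d_thresh]
        if len(group) < 2:
            continue
        inters = []
        for a, b in itertools.combinations(group, 2):
            p = line_intersection(segments[endpoints[a][0]], segments[endpoints[b][0]])
            if p is not None:
                inters.append(p)
        if not inters:
            continue
        p_r = np.mean(inters, axis=0)          # formula (1)
        for j in group:                        # S33: common end point of the group
            si, ei, _ = endpoints[j]
            segments[si][2 * ei:2 * ei + 2] = p_r.tolist()
            used[j] = True
    return segments
```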
Further, step S4 includes:
S41, calculate the lengths s of all optimized and reconstructed straight line segments and their included angles θ with the positive direction of the image's horizontal axis;
S42, for a straight line segment l_l to be matched in the left camera image, record the target boundary region Rect_l to which it belongs, then find the boundary region Rect_r of the same target in the right camera image and take out all straight line segments in Rect_r, denoted as the straight line segment set L_r;
S43, remove from the set L_r the straight line segments whose midpoint abscissa is greater than the midpoint abscissa of l_l, obtaining a new straight line segment set L'_r;
S44, calculate, for l_l and each straight line segment l_r (l_r ∈ L'_r), the horizontal error E_e, the length error E_s and the angle error E_θ:
E_e = |y_ls - y_rs| + |y_le - y_re|,  E_s = |s_l - s_r|,  E_θ = |θ_l - θ_r|      (2)
where y_ls and y_rs are the ordinates of the starting end points of l_l and l_r respectively, and y_le and y_re are the ordinates of their terminating end points; s_l and s_r are the lengths of l_l and l_r; θ_l and θ_r are the included angles of l_l and l_r with the positive direction of the image's horizontal axis;
S45, concatenate E_e, E_s and E_θ into a matching error vector E = [E_e  E_s  E_θ] and normalize each value in E;
S46, calculate the matching error value E_total between l_l and each straight line segment in the set L'_r:
E_total = E_e + E_s + E_θ  (computed from the normalized components of E)      (3)
The straight line segment with the smallest E_total is taken as the matching straight line segment of l_l.
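A minimal Python sketch of the multi-constraint matching of steps S41 to S46 follows; the per-component max-normalization and the equal-weight sum used for E_total are assumptions where the patent's exact formulas are not reproduced.

```python
# Sketch of S41-S46: segments are (x1, y1, x2, y2) with a consistent end-point ordering
# (e.g. after the optimization step above); all candidates lie in the same target's box.
import numpy as np

def seg_features(seg):
    """Length s and angle theta (with the image x-axis) of one segment."""
    x1, y1, x2, y2 = seg
    return np.hypot(x2 - x1, y2 - y1), np.arctan2(y2 - y1, x2 - x1)

def match_segment(l_left, right_segments):
    """Return the index of the right-image segment best matching l_left, or None."""
    xm_left = (l_left[0] + l_left[2]) / 2.0
    # S43: keep only right segments whose midpoint abscissa is not larger (positive disparity)
    candidates = [(i, s) for i, s in enumerate(right_segments)
                  if (s[0] + s[2]) / 2.0 <= xm_left]
    if not candidates:
        return None
    s_l, th_l = seg_features(l_left)
    errors = []
    for _, s in candidates:
        s_r, th_r = seg_features(s)
        e_e = abs(l_left[1] - s[1]) + abs(l_left[3] - s[3])   # ordinate (row) error E_e
        e_s = abs(s_l - s_r)                                   # length error E_s
        e_t = abs(th_l - th_r)                                 # angle error E_theta
        errors.append([e_e, e_s, e_t])
    E = np.asarray(errors)
    E = E / (E.max(axis=0) + 1e-9)     # S45: normalize each error component (assumed)
    e_total = E.sum(axis=1)            # S46: aggregate matching error (assumed equal weights)
    return candidates[int(np.argmin(e_total))][0]
```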
Further, step S5 includes:
S51, let the starting and terminating end points of a pair of matched straight line segments in the left and right camera images be p_ls(u_ls, v_ls), p_le(u_le, v_le) and p_rs(u_rs, v_rs), p_re(u_re, v_re) respectively; substitute the values u_ls, u_rs and v_ls of p_ls and p_rs for u_l, u_r and v_l in formula (4) to reconstruct the starting end point P_s(x_s, y_s, z_s) of the three-dimensional space straight line segment; then substitute the values u_le, u_re and v_le of p_le and p_re for u_l, u_r and v_l in formula (4) to reconstruct the terminating end point P_e(x_e, y_e, z_e):
X_c = b·(u_l - u_0) / (u_l - u_r),  Y_c = b·(v_l - v_0) / (u_l - u_r),  Z_c = b·f / (u_l - u_r)      (4)
Wherein u isl、urFor the abscissa, v, of the point to be reconstructed in the left and right camera images, respectivelylFor the ordinate of the point to be reconstructed in the left camera image, b is the base-line distance of the binocular camera, (u)0,v0) Is the coordinate value of the center of the optical axis of the left camera, f is the focal length of the two cameras, (X)c,Yc,Zc) Three-dimensional coordinates of the reconstructed point in a left camera coordinate system;
S52, from the two reconstructed three-dimensional end points P_s and P_e, calculate the spatial straight-line equation L(x, y, z) that they define:
(x - x_s) / (x_e - x_s) = (y - y_s) / (y_e - y_s) = (z - z_s) / (z_e - z_s)      (5)
the direction vector of the straight line L(x, y, z) is n(x_e - x_s, y_e - y_s, z_e - z_s), which is then normalized to the unit vector n_unit(x_unit, y_unit, z_unit);
S53, substitute the starting end point P_s(x_s, y_s, z_s) of the three-dimensional space straight line segment for (x_{i-1}, y_{i-1}, z_{i-1}) in formula (6), then iterate step by step toward the terminating end point P_e(x_e, y_e, z_e) to generate the point cloud of the spatial straight line segment:
x_i = x_{i-1} + ΔS·x_unit,  y_i = y_{i-1} + ΔS·y_unit,  z_i = z_{i-1} + ΔS·z_unit      (6)
where (x_i, y_i, z_i) are the coordinates of the current point in the iteration, (x_{i-1}, y_{i-1}, z_{i-1}) are the coordinates of the previous point, and ΔS is a preset iteration step length, i.e., the spatial distance between adjacent points of the discretized three-dimensional point cloud;
and S54, generating point clouds of all the matched straight line segments to obtain the contour point cloud of the target.
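The Python sketch below illustrates steps S51 to S54, triangulating the matched end points with formula (4) and sampling the reconstructed segment with step ΔS; function names, the camera-parameter dictionary and the default step are assumptions.

```python
# Sketch of S51-S54 for one matched pair of straight line segments.
import numpy as np

def triangulate(u_l, v_l, u_r, b, f, u0, v0):
    """Rectified-stereo triangulation of one point, formula (4)."""
    disparity = u_l - u_r
    return np.array([b * (u_l - u0) / disparity,
                     b * (v_l - v0) / disparity,
                     b * f / disparity])

def segment_point_cloud(p_left, p_right, cam, step=1.0):
    """p_left / p_right: ((u_s, v_s), (u_e, v_e)) end points of a matched segment pair.
    cam: dict with baseline 'b', focal length 'f' and principal point 'u0', 'v0'.
    Returns an (N, 3) array of points sampled along the reconstructed 3D segment."""
    (uls, vls), (ule, vle) = p_left
    (urs, _), (ure, _) = p_right
    P_s = triangulate(uls, vls, urs, cam["b"], cam["f"], cam["u0"], cam["v0"])
    P_e = triangulate(ule, vle, ure, cam["b"], cam["f"], cam["u0"], cam["v0"])
    direction = P_e - P_s
    length = np.linalg.norm(direction)
    n_unit = direction / length
    n_steps = int(length // step)      # formula (6): march from P_s toward P_e with step ΔS
    points = [P_s + i * step * n_unit for i in range(n_steps + 1)] + [P_e]
    return np.array(points)

# S54: the target contour point cloud is the concatenation over all matched segment pairs:
# contour = np.vstack([segment_point_cloud(pl, pr, cam) for pl, pr in matched_pairs])
```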
Further, step S6 includes:
s61, in an off-line state, generating a complete contour point cloud of the target to be detected as a template point cloud by using CAD, and calculating a Fast Point Feature Histogram (FPFH) of the template point cloud;
s62, taking the template point cloud generated in the step S61 as a source point cloud P, taking the reconstructed contour point cloud of the target as a target point cloud Q, and calculating the FPFH of the target point cloud Q;
S63, randomly select k sampling points from the source point cloud P, where k is an integer greater than 3, search the target point cloud Q for points whose FPFH features are similar to those of the sampling points, and then randomly select one of them as the corresponding point of each sampling point;
S64, calculate the transformation matrix from these point correspondences, then evaluate the transformation error with a Huber penalty function, recorded as
Σ_{i=1..k} H(e_i)      (7)
Wherein H (e)i) The calculation formula is as follows:
H(e_i) = (1/2)·e_i^2  for |e_i| ≤ t_e;  H(e_i) = (1/2)·t_e·(2|e_i| - t_e)  for |e_i| > t_e      (8)
where t_e is a preset threshold value and e_i is the distance difference of the i-th point pair after transformation;
s65, repeatedly executing the steps S63 and S64 until the preset iteration times are reached, and finally taking the transformation matrix which enables the transformation error to be minimum as an initial transformation matrix;
s66, applying the initial transformation matrix to the source point cloud P to obtain a new source point cloud P';
S67, for each point in the new source point cloud P', find the point with the smallest Euclidean distance in the target point cloud Q as its corresponding point, then calculate the transformation matrix and the corresponding error E(R, T):
E(R, T) = (1/n) · Σ_{i=1..n} || q_i - (R·p_i + T) ||^2      (9)
where E(R, T) is the error between the new source point cloud P' and the target point cloud Q under the transformation matrix (R, T), p_i and q_i are the coordinates of corresponding points in the source point cloud P' and the target point cloud Q respectively, and n is the number of corresponding point pairs;
S68, apply the transformation matrix obtained in step S67 to the source point cloud P' to obtain an updated source point cloud, and calculate its error E(R, T) with respect to the target point cloud Q;
S69, repeat steps S67 and S68 until E(R, T) or the iteration count meets the set condition (i.e., E(R, T) falls below a preset error value, or steps S67 and S68 have been repeated a preset number of times); finally solve the rotation-and-translation matrix between the two point clouds and decompose it into a three-dimensional coordinate and a three-dimensional rotation vector, i.e., the pose of the target.
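As an illustration of the coarse-to-fine registration in steps S61 to S69, the sketch below uses Open3D's FPFH-based RANSAC registration followed by point-to-point ICP; this library pipeline stands in for the patent's own sampling and Huber-error scheme, and the voxel size, thresholds and parameter names are assumptions (the exact API varies slightly across Open3D versions).

```python
# Sketch: template (CAD contour) as source point cloud P, reconstructed contour as target Q.
import open3d as o3d

def estimate_pose(template_pcd, contour_pcd, voxel=2.0):
    def preprocess(pcd):
        down = pcd.voxel_down_sample(voxel)
        down.estimate_normals(
            o3d.geometry.KDTreeSearchParamHybrid(radius=voxel * 2, max_nn=30))
        fpfh = o3d.pipelines.registration.compute_fpfh_feature(
            down, o3d.geometry.KDTreeSearchParamHybrid(radius=voxel * 5, max_nn=100))
        return down, fpfh

    src, src_fpfh = preprocess(template_pcd)   # source point cloud P (S61)
    tgt, tgt_fpfh = preprocess(contour_pcd)    # target point cloud Q (S62)

    # Coarse registration from FPFH correspondences (analogous to S63-S65)
    coarse = o3d.pipelines.registration.registration_ransac_based_on_feature_matching(
        src, tgt, src_fpfh, tgt_fpfh, mutual_filter=True,
        max_correspondence_distance=voxel * 1.5,
        estimation_method=o3d.pipelines.registration.TransformationEstimationPointToPoint(False),
        ransac_n=4,
        checkers=[o3d.pipelines.registration.CorrespondenceCheckerBasedOnDistance(voxel * 1.5)],
        criteria=o3d.pipelines.registration.RANSACConvergenceCriteria(100000, 0.999))

    # Fine registration: point-to-point ICP (analogous to S66-S69)
    fine = o3d.pipelines.registration.registration_icp(
        src, tgt, voxel * 0.8, coarse.transformation,
        o3d.pipelines.registration.TransformationEstimationPointToPoint())

    T = fine.transformation                    # 4x4 rotation-and-translation matrix
    return T[:3, :3], T[:3, 3]                 # rotation matrix R and translation vector t
```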
The beneficial effects of the invention are:
1. A target pose estimation method based on binocular vision and combining deep learning with contour point cloud reconstruction is proposed and implemented;
2. Using a binocular camera scheme costs less than a depth camera scheme;
3. The contour point cloud of the target is reconstructed and registered, which is computationally more efficient than processing a dense point cloud of the target while still guaranteeing adequate accuracy.
Drawings
FIG. 1 is a flow chart of a target pose estimation method combining deep learning and contour point cloud reconstruction according to the present invention;
FIG. 2 is a graph of training epochs versus loss value for the YOLOv4 network of the present invention;
FIG. 3 shows confidence results of the YOLOv4 network of the present invention on the test set;
FIG. 4 is a comparison of straight line segments before and after the optimized reconstruction of the present invention;
FIG. 5 shows the final target detection/recognition and straight line segment matching results of the present invention;
FIG. 6 illustrates the process of generating the point clouds of all matched straight line segments in the present invention;
FIG. 7 is the target contour point cloud reconstructed by the present invention;
FIG. 8 is the CAD-generated complete contour point cloud in the target coordinate system according to the present invention;
FIG. 9 is the final point cloud registration result of the present invention.
Detailed Description
The invention will be further described with reference to the accompanying drawings and examples, which are simplified schematic drawings and illustrate only the basic structure of the invention in a schematic manner, and therefore only show the structures relevant to the invention.
As shown in Fig. 1: S1, calibrate and stereo-rectify the binocular cameras, select multiple kinds of targets as analysis objects, train them with the YOLOv4 network, and establish the target detection network models;
The embodiment is verified with three representative target types (square-type, slice-type and angle-type) as objects. 400 images are collected per type, 1200 in total, and the labeled samples are divided into a training set, a validation set and a test set at a ratio of 8:1:1. The training set is input into the YOLOv4 network to start learning, with the initial momentum set to 0.9, the initial learning rate to 0.001 and the training batch size to 8; the backbone network is frozen for 30 warm-up training epochs, then the learning rate is set to 0.0001 and the unfrozen whole network is trained for another 30 epochs. The validation samples are input after every training epoch for recognition and verification and the loss value is computed; as shown in Fig. 2, the network converges after 60 epochs. The test set is then evaluated with the trained model; part of the confidence results predicted by the YOLOv4 network are shown in Fig. 3, showing that the trained model accurately recognizes and boxes the targets.
The binocular vision system is calibrated with Zhang Zhengyou's checkerboard calibration method, and the binocular cameras are stereo-rectified with the Bouguet rectification method based on the calibrated parameters.
S2: acquire field images with the stereo-rectified binocular cameras, and recognize all targets in the left and right camera images with the trained YOLOv4 network to obtain the target boundary regions.
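For step S2, a possible inference path is OpenCV's DNN module with the trained YOLOv4 weights; the file names, input size and thresholds below are assumptions, and any YOLOv4 inference framework could be substituted.

```python
# Sketch of running the trained YOLOv4 detector on a rectified camera frame.
import cv2

net = cv2.dnn.readNetFromDarknet("yolov4_targets.cfg", "yolov4_targets.weights")
model = cv2.dnn_DetectionModel(net)
model.setInputParams(size=(416, 416), scale=1 / 255.0, swapRB=True)

def detect_targets(image, conf_thresh=0.5, nms_thresh=0.4):
    """Return (class_id, confidence, bounding_box) for each detected target."""
    class_ids, scores, boxes = model.detect(image, conf_thresh, nms_thresh)
    return list(zip(class_ids, scores, boxes))

# boxes_left  = detect_targets(rectified_left)
# boxes_right = detect_targets(rectified_right)
```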
S3: perform straight line segment detection on the target boundary regions detected in the left and right camera images with the LSD algorithm;
because the LSD algorithm assumes that each pixel can belong to at most one straight line segment, intersecting straight line segments are generally broken at the intersection point when two or more of them are detected, so the end points of the straight line segments need to be recalculated and merged to optimize and reconstruct them. First, the end points of all straight line segments are taken out, end points whose mutual distance is smaller than a set threshold are merged, and the merged end point P_r is used as the common end point of these straight line segments, i.e., their intersection:
P_r = (1 / C(n, 2)) · Σ_i P_i      (1)
This yields the optimized and reconstructed straight line segments; the process is shown in Fig. 4. Before optimization, the plain LSD algorithm breaks intersecting straight line segments at the intersection point, whereas after optimization the previously discontinuous target contour is correctly connected.
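A corresponding LSD call, feeding the end-point optimization above, might look as follows; OpenCV's line segment detector is assumed to be available, which depends on the OpenCV version/build, and another LSD implementation could be substituted.

```python
# Sketch of S3: detect straight line segments inside one target boundary region (grayscale ROI).
import cv2

def detect_segments(gray_roi):
    lsd = cv2.createLineSegmentDetector()
    lines = lsd.detect(gray_roi)[0]            # N x 1 x 4 array of (x1, y1, x2, y2), or None
    return [] if lines is None else [tuple(l[0]) for l in lines]
```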
S4: match the straight line segments by combining the category and boundary region output by the deep-learning target detection network with a multi-constraint method;
S41, calculate the lengths s of all optimized and reconstructed straight line segments and their included angles θ with the positive direction of the image's horizontal axis;
S42, for a straight line segment l_l to be matched in the left camera image, record the target boundary region Rect_l to which it belongs, then find the boundary region Rect_r of the same target in the right camera image and take out all straight line segments in Rect_r, denoted as the straight line segment set L_r;
S43, remove from the set L_r the straight line segments whose midpoint abscissa is greater than the midpoint abscissa of l_l, obtaining a new straight line segment set L'_r;
S46, calculate the matching error value E_total between l_l and each straight line segment l_r (l_r ∈ L'_r), and take the straight line segment with the smallest E_total as the matching straight line segment of l_l;
The final target detection/recognition and straight line segment matching results are shown in Fig. 5; the numerical values in the figure are the YOLOv4 recognition confidences for the square-type, slice-type and angle-type target pictures of the left and right cameras. For clarity of display, only the matching result of the square-type target is shown in the figure; the method correctly matches the straight line segments in the left and right camera images.
S5, reconstructing contour point cloud of the target;
S52: the three-dimensional space line equation is obtained by reconstructing the two three-dimensional end points of each matched straight line segment, and the three-dimensional space line is sampled iteratively to generate the point clouds of all matched straight line segments; the generation process is shown in Fig. 6, and the finally reconstructed target contour point cloud is shown in Fig. 7. As Fig. 7 shows, the contour point cloud of the target is accurately reconstructed and its position in space matches the actual position. Because the method matches and reconstructs only the contour of the target, its computational load is smaller and its efficiency higher than reconstructing the whole scene.
S6, performing pose estimation on the target by using a point cloud registration method;
A complete contour point cloud defined in the target coordinate system, shown in Fig. 8, is generated from the CAD model and then registered against the reconstructed contour point cloud to obtain the pose of the target; the final point cloud registration result is shown in Fig. 9. After registration the two point clouds approximately coincide, indicating that the pose estimation result is correct and of high accuracy.
In addition, a registration experiment with the dense point cloud of the target surface is carried out and its running time recorded, and it is compared with the contour point cloud registration of this method; the results are shown in Table 1:
TABLE 1
[Table 1 is rendered as an image in the original publication and is not reproduced here.]
It can be seen that the average processing speed is improved by about 50 times, because the contour point cloud contains far fewer points while retaining the target's structural information to the greatest extent.
The errors between the manually measured actual pose and the pose computed by the present algorithm are calculated and their absolute values taken; the results are shown in Table 2:
TABLE 2
[Table 2 is rendered as an image in the original publication and is not reproduced here.]
The position errors estimated by the method are smaller than 0.7 mm in every direction and the attitude errors are smaller than 0.9°, which meets the requirements of practical applications.
The beneficial effects of the invention are: a target pose estimation method based on binocular vision and combining deep learning with contour point cloud reconstruction is proposed and implemented; using a binocular camera scheme costs less than a depth camera scheme; and the contour point cloud of the target is reconstructed and registered, which is computationally more efficient than processing a dense point cloud of the target while still guaranteeing adequate accuracy.
In light of the foregoing description of the preferred embodiment of the present invention, many modifications and variations will be apparent to those skilled in the art without departing from the spirit and scope of the invention. The technical scope of the present invention is not limited to the content of the specification, and must be determined according to the scope of the claims.

Claims (5)

1. A target pose estimation method combining deep learning and contour point cloud reconstruction is characterized by comprising the following steps:
S1, calibrate the binocular cameras and perform stereo rectification, select multiple kinds of targets as analysis objects, train them with a YOLOv4 network, and establish the target detection network models;
S2, recognize the targets in the left and right camera images with the trained target detection network model and obtain the target boundary regions;
S3, perform straight line segment detection on the detected target boundary regions in the left and right camera images with the LSD algorithm;
S4, match the straight line segments by combining the category and boundary region output by the deep-learning target detection network with a multi-constraint method;
S5, reconstruct the contour point cloud of the target;
S6, estimate the pose of the target with a point cloud registration method.
2. The method for estimating the pose of an object by combining deep learning and reconstruction of a contour point cloud according to claim 1, wherein the step S3 comprises:
S31, take out the end points of all straight line segments and group the end points whose Euclidean distances are smaller than a set threshold d into the same group;
S32, for any specific group of end points, if the number of end points in the group is greater than or equal to 2, merge the end points in the group into a single point, denoted P_r, which is calculated as follows:
P_r = (1 / C(n, 2)) · Σ_i P_i      (1)
where P_i is the intersection point of the extension lines of the straight line segments to which any two end points in the current group respectively belong, n is the number of end points in the group, and C(n, 2) is the number of combinations of 2 end points selected from the n end points;
S33, use P_r as the common end point of the straight line segments to which the grouped end points belong, obtaining the optimized and reconstructed straight line segments.
3. The method for estimating the pose of an object by combining deep learning and reconstruction of a contour point cloud according to claim 1, wherein the step S4 comprises:
S41, calculate the lengths s of all optimized and reconstructed straight line segments and their included angles θ with the positive direction of the image's horizontal axis;
S42, for a straight line segment l_l to be matched in the left camera image, record the target boundary region Rect_l to which it belongs, then find the boundary region Rect_r of the same target in the right camera image and take out all straight line segments in Rect_r, denoted as the straight line segment set L_r;
S43, remove from the set L_r the straight line segments whose midpoint abscissa is greater than the midpoint abscissa of l_l, obtaining a new straight line segment set L'_r;
S44, calculate, for l_l and each straight line segment l_r (l_r ∈ L'_r), the horizontal error E_e, the length error E_s and the angle error E_θ:
E_e = |y_ls - y_rs| + |y_le - y_re|,  E_s = |s_l - s_r|,  E_θ = |θ_l - θ_r|      (2)
where y_ls and y_rs are the ordinates of the starting end points of l_l and l_r respectively, and y_le and y_re are the ordinates of their terminating end points; s_l and s_r are the lengths of l_l and l_r; θ_l and θ_r are the included angles of l_l and l_r with the positive direction of the image's horizontal axis;
S45, concatenate E_e, E_s and E_θ into a matching error vector E = [E_e  E_s  E_θ] and normalize each value in E;
S46, calculate the matching error value E_total between l_l and each straight line segment in the set L'_r:
E_total = E_e + E_s + E_θ  (computed from the normalized components of E)      (3)
The straight line segment with the smallest E_total is taken as the matching straight line segment of l_l.
4. The method for estimating the pose of an object by combining deep learning and reconstruction of a contour point cloud according to claim 1, wherein the step S5 comprises:
S51, let the starting and terminating end points of a pair of matched straight line segments in the left and right camera images be p_ls(u_ls, v_ls), p_le(u_le, v_le) and p_rs(u_rs, v_rs), p_re(u_re, v_re) respectively; substitute the values u_ls, u_rs and v_ls of p_ls and p_rs for u_l, u_r and v_l in formula (4) to reconstruct the starting end point P_s(x_s, y_s, z_s) of the three-dimensional space straight line segment; then substitute the values u_le, u_re and v_le of p_le and p_re for u_l, u_r and v_l in formula (4) to reconstruct the terminating end point P_e(x_e, y_e, z_e):
X_c = b·(u_l - u_0) / (u_l - u_r),  Y_c = b·(v_l - v_0) / (u_l - u_r),  Z_c = b·f / (u_l - u_r)      (4)
where u_l and u_r are the abscissas of the point to be reconstructed in the left and right camera images respectively, v_l is the ordinate of the point to be reconstructed in the left camera image, b is the baseline distance of the binocular cameras, (u_0, v_0) are the coordinates of the optical-axis center of the left camera, f is the focal length of the two cameras, and (X_c, Y_c, Z_c) are the three-dimensional coordinates of the reconstructed point in the left camera coordinate system;
S52, from the two reconstructed three-dimensional end points P_s and P_e, calculate the spatial straight-line equation L(x, y, z) that they define; the calculation formula is as follows:
(x - x_s) / (x_e - x_s) = (y - y_s) / (y_e - y_s) = (z - z_s) / (z_e - z_s)      (5)
the direction vector of the straight line L(x, y, z) is n(x_e - x_s, y_e - y_s, z_e - z_s), which is then normalized to the unit vector n_unit(x_unit, y_unit, z_unit);
S53, substitute the starting end point P_s(x_s, y_s, z_s) of the three-dimensional space straight line segment for (x_{i-1}, y_{i-1}, z_{i-1}) in formula (6), then iterate step by step toward the terminating end point P_e(x_e, y_e, z_e) to generate the point cloud of the spatial straight line segment:
x_i = x_{i-1} + ΔS·x_unit,  y_i = y_{i-1} + ΔS·y_unit,  z_i = z_{i-1} + ΔS·z_unit      (6)
where (x_i, y_i, z_i) are the coordinates of the current point in the iteration, (x_{i-1}, y_{i-1}, z_{i-1}) are the coordinates of the previous point, and ΔS is a preset iteration step length, i.e., the spatial distance between adjacent points of the discretized three-dimensional point cloud;
and S54, generating point clouds of all the matched straight line segments to obtain the contour point cloud of the target.
5. The method for estimating the pose of an object by combining deep learning and reconstruction of a contour point cloud according to claim 1, wherein the step S6 comprises:
s61, in an off-line state, generating a complete contour point cloud of the target to be detected as a template point cloud by using CAD, and calculating a Fast Point Feature Histogram (FPFH) of the template point cloud;
s62, taking the template point cloud generated in the step S61 as a source point cloud P, taking the reconstructed contour point cloud of the target as a target point cloud Q, and calculating the FPFH of the target point cloud Q;
S63, randomly select k sampling points from the source point cloud P, where k is an integer greater than 3, search the target point cloud Q for points whose FPFH features are similar to those of the sampling points, and then randomly select one of them as the corresponding point of each sampling point;
S64, calculate the transformation matrix from these point correspondences, then evaluate the transformation error with a Huber penalty function, recorded as
Σ_{i=1..k} H(e_i)      (7)
where H(e_i) is calculated as follows:
H(e_i) = (1/2)·e_i^2  for |e_i| ≤ t_e;  H(e_i) = (1/2)·t_e·(2|e_i| - t_e)  for |e_i| > t_e      (8)
where t_e is a preset threshold value and e_i is the distance difference of the i-th point pair after transformation;
s65, repeatedly executing the steps S63 and S64 until the preset iteration times are reached, and finally taking the transformation matrix which enables the transformation error to be minimum as an initial transformation matrix;
s66, applying the initial transformation matrix to the source point cloud P to obtain a new source point cloud P';
S67, for each point in the new source point cloud P', find the point with the smallest Euclidean distance in the target point cloud Q as its corresponding point, then calculate the transformation matrix and the corresponding error E(R, T):
E(R, T) = (1/n) · Σ_{i=1..n} || q_i - (R·p_i + T) ||^2      (9)
where E(R, T) is the error between the new source point cloud P' and the target point cloud Q under the transformation matrix (R, T), p_i and q_i are the coordinates of corresponding points in the source point cloud P' and the target point cloud Q respectively, and n is the number of corresponding point pairs;
S68, apply the transformation matrix obtained in step S67 to the source point cloud P' to obtain an updated source point cloud, and calculate its error E(R, T) with respect to the target point cloud Q;
S69, repeat steps S67 and S68 until E(R, T) or the iteration count meets the set condition; finally solve the rotation-and-translation matrix between the two point clouds and decompose it into a three-dimensional coordinate and a three-dimensional rotation vector, i.e., the pose of the target.
Application CN202110676959.7A, priority date 2021-06-18, filing date 2021-06-18: Target pose estimation method combining deep learning and contour point cloud reconstruction. Status: Active; granted as CN113393524B.

Priority Applications (1)

CN202110676959.7A (priority date 2021-06-18, filing date 2021-06-18): Target pose estimation method combining deep learning and contour point cloud reconstruction; granted as CN113393524B.

Publications (2)

CN113393524A (application publication): 2021-09-14
CN113393524B (granted publication): 2023-09-26

Family

ID=77621867

Family Applications (1)

CN202110676959.7A (priority/filing date 2021-06-18): Target pose estimation method combining deep learning and contour point cloud reconstruction; Active, granted as CN113393524B.

Country Status (1)

CN: CN113393524B

Patent Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190026919A1 (en) * 2016-01-20 2019-01-24 Canon Kabushiki Kaisha Information processing apparatus, information processing method, and program
US20170212516A1 (en) * 2016-01-27 2017-07-27 National Institute Of Advanced Industrial Science And Technology Position control system and position control method for an unmanned surface vehicle
CN106909877A (en) * 2016-12-13 2017-06-30 浙江大学 A kind of vision based on dotted line comprehensive characteristics builds figure and localization method simultaneously
WO2019005999A1 (en) * 2017-06-28 2019-01-03 Magic Leap, Inc. Method and system for performing simultaneous localization and mapping using convolutional image transformation
US20190102601A1 (en) * 2017-09-21 2019-04-04 Lexset.Ai Llc Detecting one or more objects in an image, or sequence of images, and determining a category and one or more descriptors for each of the one or more objects, generating synthetic training data, and training a neural network with the synthetic training data
US20200218929A1 (en) * 2017-09-22 2020-07-09 Huawei Technologies Co., Ltd. Visual slam method and apparatus based on point and line features
CN108305277A (en) * 2017-12-26 2018-07-20 中国航天电子技术研究院 A kind of heterologous image matching method based on straightway
CN108133458A (en) * 2018-01-17 2018-06-08 视缘(上海)智能科技有限公司 A kind of method for automatically split-jointing based on target object spatial point cloud feature
CN109035200A (en) * 2018-06-21 2018-12-18 北京工业大学 A kind of bolt positioning and position and posture detection method based on the collaboration of single binocular vision
CN109934862A (en) * 2019-02-22 2019-06-25 上海大学 A kind of binocular vision SLAM method that dotted line feature combines
CN111768449A (en) * 2019-03-30 2020-10-13 北京伟景智能科技有限公司 Object grabbing method combining binocular vision with deep learning
CN110782494A (en) * 2019-10-16 2020-02-11 北京工业大学 Visual SLAM method based on point-line fusion
CN110825743A (en) * 2019-10-31 2020-02-21 北京百度网讯科技有限公司 Data importing method and device of graph database, electronic equipment and medium
CN111462210A (en) * 2020-03-31 2020-07-28 华南理工大学 Monocular line feature map construction method based on epipolar constraint
CN112967217A (en) * 2021-03-11 2021-06-15 大连理工大学 Image splicing method based on linear feature matching and constraint

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JIANWEI REN: "An improved binocular LSD_SLAM method for object localization", 2020 IEEE International Conference on Artificial Intelligence and Computer Applications (ICAICA), Dalian, China, page 30 *
RONG SHEN: "Research on binocular vision SLAM based on integrated point-line features", China Master's Theses Full-text Database, Information Science and Technology, no. 01, pages 138-1565 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116091706A (en) * 2023-04-07 2023-05-09 山东建筑大学 Three-dimensional reconstruction method for multi-mode remote sensing image deep learning matching
CN117237451A (en) * 2023-09-15 2023-12-15 南京航空航天大学 Industrial part 6D pose estimation method based on contour reconstruction and geometric guidance
CN117237451B (en) * 2023-09-15 2024-04-02 南京航空航天大学 Industrial part 6D pose estimation method based on contour reconstruction and geometric guidance

Also Published As

CN113393524B: 2023-09-26


Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant