WO2021063128A1 - Method for determining pose of active rigid body in single-camera environment, and related apparatus


Info

Publication number
WO2021063128A1
Authority
WO
WIPO (PCT)
Application number
PCT/CN2020/110254
Inventor
王越
许秋子
Original Assignee
深圳市瑞立视多媒体科技有限公司
Application filed by 深圳市瑞立视多媒体科技有限公司 filed Critical 深圳市瑞立视多媒体科技有限公司
Publication of WO2021063128A1 publication Critical patent/WO2021063128A1/en

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods

Definitions

  • The invention relates to the technical field of computer vision, and in particular to a method and related equipment for pose positioning of an active rigid body in a single-camera environment.
  • In traditional optical motion capture, an ultra-high-power near-infrared light source in the motion-capture camera emits infrared light onto passive marker points; the markers, coated with a highly reflective material, reflect the irradiated infrared light back toward the camera.
  • This reflected infrared light, together with ambient light carrying background information, passes through a low-distortion lens and reaches the camera's infrared narrow-band-pass filter unit. Since the pass band of the filter unit matches the band of the infrared light source, the ambient light carrying redundant background information is filtered out, and only the infrared light carrying the marker-point information passes through and is recorded by the camera's photosensitive element.
  • The photosensitive element then converts the light signal into an image signal and outputs it to the control circuit; the image-processing unit in the control circuit uses a Field-Programmable Gate Array (FPGA) to preprocess the image signal in hardware and finally outputs the 2D coordinate information of the marker points to the tracking software.
  • In such a system, the server must receive the 2D data from every camera in the multi-camera setup and then apply the principle of multi-view vision: using the matching relationships between the 2D point clouds and the pre-calibrated pose relationships between the cameras, it calculates the 3D coordinates in three-dimensional space and, on that basis, the motion information of the rigid body in space.
  • This method relies on the collaborative work of multiple cameras so that rigid bodies can be identified and tracked in a relatively large space, which makes the motion-capture system expensive and difficult to maintain.
  • The main purpose of the present invention is to provide a pose positioning method and related equipment for an active rigid body in a single-camera environment, aiming to solve the technical problem of high cost and difficult maintenance caused by the use of multi-camera systems in current passive or active motion-capture methods.
  • To achieve the above objective, the present invention provides a method for pose positioning of an active rigid body in a single-camera environment.
  • The method includes the following steps:
  • acquiring the two-dimensional space point coordinates of two adjacent frames captured by a monocular camera, the two-dimensional space point codes corresponding to those coordinates, and the camera parameters of the camera;
  • matching, according to the two-dimensional space point codes, the two-dimensional space point coordinates of the two adjacent frames to obtain multiple sets of two-dimensional space feature pairs, constructing a system of linear equations from the feature pairs and the camera parameters, and solving for the essential matrix;
  • decomposing the essential matrix by a singular value decomposition algorithm to obtain multiple sets of rotation matrices and translation matrices;
  • estimating three-dimensional space point coordinates from the two-dimensional space feature pairs and the multiple sets of rotation and translation matrices, detecting the depth values of those coordinates, defining the set of rotation and translation matrices whose depth values are positive as the target rotation matrix and the target translation matrix, and determining the rigid body pose according to the target rotation matrix and the target translation matrix.
  • Preferably, determining the rigid body pose according to the target rotation matrix and the target translation matrix includes:
  • summing the distances between all three-dimensional space points and averaging to obtain the three-dimensional average distance; acquiring the rigid body coordinates, summing the distances between all rigid-body marker points, and averaging to obtain the rigid-body average distance; and optimizing the target translation matrix by an optimization formula to obtain an optimized target translation matrix, the rigid body pose being determined according to the target rotation matrix and the optimized target translation matrix;
  • the optimization formula is:
  • where L1 is the three-dimensional average distance, L2 is the rigid-body average distance, T is the target translation matrix before optimization, and T′ is the target translation matrix after optimization.
  • Preferably, before acquiring the rigid body coordinates, summing the distances between all rigid-body marker points, and averaging to obtain the rigid-body average distance, the method includes:
  • converting all three-dimensional space point coordinates in the same frame into rigid body coordinates in the rigid-body coordinate system, to obtain the rigid body coordinates of each marker point in each frame.
  • Preferably, matching the multiple cameras in pairs and obtaining the three-dimensional space point coordinates of each marker point in each frame, according to the spatial position data of each camera pair and the multiple two-dimensional space point coordinates in the same frame, includes:
  • matching in pairs all the cameras that captured the same marker point, and, from the two two-dimensional space point coordinates captured by each matched camera pair in the same frame, solving the least-squares problem by singular value decomposition to calculate a set of three-dimensional space point coordinates;
  • Preferably, converting all three-dimensional space point coordinates in the same frame into rigid body coordinates in the rigid-body coordinate system, to obtain the rigid body coordinates of each marker point in each frame, includes:
  • calculating, for each marker point in the same frame, the difference between the origin and the corresponding three-dimensional space point coordinates, to obtain the rigid body coordinates of each marker point in each frame.
  • Preferably, estimating the three-dimensional space point coordinates from the two-dimensional space feature pairs and the multiple sets of rotation and translation matrices includes:
  • letting the two cameras be camera 1 and camera 2, and the two two-dimensional space point coordinates captured in the same frame be A(a1, a2) and B(b1, b2);
  • letting the rotation matrix of camera 1 be R1(R11, R12, R13), a 3×3 matrix, and its translation matrix be T1(T11, T12, T13), a 3×1 matrix;
  • letting the rotation matrix of camera 2 be R2(R21, R22, R23), a 3×3 matrix, and its translation matrix be T2(T21, T22, T23), a 3×1 matrix;
  • the three-dimensional space point coordinates are then obtained by triangulation from these quantities.
  • Preferably, detecting the depth values of the three-dimensional space point coordinates and defining the set of rotation and translation matrices whose depth values are positive as the target rotation matrix and the target translation matrix includes:
  • for each set of estimated three-dimensional space point coordinates, detecting whether the corresponding depth value is positive and, if so, defining the corresponding set of rotation and translation matrices as the target rotation matrix and the target translation matrix.
  • To achieve the above objective, the present invention also provides a pose positioning device for an active rigid body in a single-camera environment, including:
  • an essential-matrix calculation module, used to obtain the two-dimensional space point coordinates of two adjacent frames captured by the monocular camera, the two-dimensional space point codes corresponding to those coordinates, and the camera parameters of the camera; to match, according to the codes, the two-dimensional space point coordinates of the two adjacent frames to obtain multiple sets of two-dimensional space feature pairs; and to construct a system of linear equations from the feature pairs and the camera parameters and solve for the essential matrix;
  • a rotation- and translation-matrix calculation module, used to decompose the essential matrix by a singular value decomposition algorithm to obtain multiple sets of rotation matrices and translation matrices;
  • a rigid-body pose determination module, used to estimate the three-dimensional space point coordinates from the two-dimensional space feature pairs and the multiple sets of rotation and translation matrices, detect the depth values of those coordinates, define the set of rotation and translation matrices whose depth values are positive as the target rotation matrix and the target translation matrix, and determine the rigid body pose according to the target rotation matrix and the target translation matrix.
  • To achieve the above objective, the present invention also provides a device for pose positioning of an active rigid body in a single-camera environment.
  • The device includes a memory, a processor, and a pose positioning program for an active rigid body in a single-camera environment that is stored in the memory and can run on the processor.
  • When the pose positioning program is executed by the processor, the steps of the pose positioning method for an active rigid body in a single-camera environment described above are implemented.
  • To achieve the above objective, the present invention also provides a computer-readable storage medium that stores a pose positioning program for an active rigid body in a single-camera environment.
  • When the pose positioning program is executed by a processor, the steps of the pose positioning method for an active rigid body in a single-camera environment described above are implemented.
  • In the present invention, the essential matrix is solved by matching the feature points in the coordinates of two adjacent frames; the essential matrix is then decomposed by a singular value decomposition algorithm to obtain multiple sets of rotation and translation matrices; and by detecting the depth values of the feature points, the final target rotation matrix and translation matrix are determined.
  • The whole process does not depend on the rigid body structure: the required matching data can be obtained from the codes and coordinates alone to calculate the rigid body pose information.
  • The present invention can therefore realize the tracking and positioning of an active optical rigid body at lower cost, which is a clear advantage over a complex multi-camera environment.
  • In addition, because the present invention matches the feature points of two adjacent frames each time, each tracking step of the active optical rigid body can calculate the motion pose of the current frame relative to the initial frame, thereby avoiding the cumulative-error problem common in monocular camera tracking and further improving tracking accuracy.
  • FIG. 1 is a schematic structural diagram of the operating environment of an active rigid body pose positioning device in a single-camera environment related to a solution of an embodiment of the present invention
  • FIG. 2 is a flowchart of a method for positioning an active rigid body in a single-camera environment in an embodiment of the present invention
  • FIG. 3 is a detailed flowchart of step S3 in an embodiment of the present invention.
  • FIG. 4 is a detailed flowchart of step S302 in an embodiment of the present invention.
  • Fig. 5 is a structural diagram of an active rigid body pose positioning device in a single-camera environment in an embodiment of the present invention.
  • the active rigid body pose positioning device in a single-camera environment includes: a processor 1001, such as a CPU, a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005.
  • the communication bus 1002 is used to implement connection and communication between these components.
  • the user interface 1003 may include a display screen (Display) and an input unit such as a keyboard (Keyboard), and the network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface).
  • the memory 1005 may be a high-speed RAM memory, or a non-volatile memory (non-volatile memory), such as a magnetic disk memory.
  • the memory 1005 may also be a storage device independent of the aforementioned processor 1001.
  • The hardware structure of the active rigid body pose positioning device in the single-camera environment shown in FIG. 1 does not constitute a limitation on the device: it may include more or fewer components than shown, combine certain components, or arrange the components differently.
  • the memory 1005 which is a computer-readable storage medium, may include an operating system, a network communication module, a user interface module, and a pose positioning program for an active rigid body in a single-camera environment.
  • the operating system is a program that manages and controls the pose positioning equipment and software resources of the active rigid body in the single-camera environment, and supports the operation of the pose positioning program of the active rigid body in the single-camera environment and other software and/or programs.
  • The network interface 1004 is mainly used to access the network; the user interface 1003 is mainly used to detect confirmation commands and edit commands; and the processor 1001 can be used to call the pose positioning program of the active rigid body in the single-camera environment stored in the memory 1005 and execute the operations of the following method embodiments.
  • FIG. 2 is a flowchart of a method for positioning an active rigid body in a single-camera environment in an embodiment of the present invention. As shown in FIG. 2, a method for positioning an active rigid body in a single-camera environment includes The following steps:
  • Step S1, solving the essential matrix: obtain the two-dimensional space point coordinates of the two adjacent frames captured by the monocular camera, the two-dimensional space point codes corresponding to those coordinates, and the camera parameters of the camera; according to the codes, match the two-dimensional space point coordinates of the two adjacent frames to obtain multiple sets of two-dimensional space feature pairs; and construct a system of linear equations from the feature pairs and the camera parameters to solve for the essential matrix.
  • The marker points in this step are generally set at different positions on the rigid body.
  • The two-dimensional space coordinate information of the marker points is captured by the monocular camera to determine the spatial point data, which includes the two-dimensional space point coordinates and the corresponding two-dimensional space point codes.
  • In this embodiment there are eight marker points on the rigid body, which can be eight light-emitting LEDs; a rigid body therefore usually contains eight spatial point data, and each frame of monocular camera data contains eight marker points.
  • The code of a given marker point is the same in different frames, and the codes of different marker points in the same frame are different.
  • A two-dimensional space feature pair is the projection of the same marker point onto the monocular camera in two adjacent frames.
  • Since a rigid body contains eight marker points, it has eight sets of two-dimensional space feature pairs.
  • The camera parameters of the monocular camera need to be calibrated in advance, namely the camera's optical center, focal length, distortion parameters, and so on; these parameters form a matrix, recorded as matrix M, which is used in the essential-matrix calculation.
  • The principle of the epipolar geometric constraint is adopted: a system of linear equations is constructed from the multiple sets of two-dimensional space feature pairs and the camera parameters, and solved for the essential matrix E.
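The linear system just described can be sketched in code. The following is an illustrative eight-point-style estimate in Python with NumPy; the function and variable names are my own, the intrinsic matrix `K` stands in for the calibrated parameter matrix M, and this is a sketch of the standard technique rather than the patent's exact formulation.

```python
import numpy as np

def estimate_essential_matrix(pts1, pts2, K):
    """Estimate the essential matrix E from >= 8 coded-marker matches.

    pts1, pts2: (N, 2) pixel coordinates of the same markers in two
    adjacent frames, paired by their codes; K: 3x3 intrinsic matrix
    (the calibrated camera parameters, 'matrix M').
    """
    K_inv = np.linalg.inv(K)

    def normalize(pts):
        # Pixel coordinates -> normalized camera coordinates.
        h = np.hstack([pts, np.ones((len(pts), 1))])
        return (K_inv @ h.T).T

    x1, x2 = normalize(np.asarray(pts1)), normalize(np.asarray(pts2))
    # The epipolar constraint x2^T E x1 = 0 gives one row of A e = 0
    # per feature pair.
    A = np.stack([np.outer(p2, p1).ravel() for p1, p2 in zip(x1, x2)])
    # e is the right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    E = Vt[-1].reshape(3, 3)
    # Project onto the essential-matrix manifold: rank 2 with two
    # equal nonzero singular values.
    U, _, Vt = np.linalg.svd(E)
    return U @ np.diag([1.0, 1.0, 0.0]) @ Vt
```

With eight LED markers per rigid body, the eight feature pairs are exactly the minimum this formulation needs; with exact correspondences the epipolar residuals of the recovered E are numerically zero.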
  • Step S2 Decompose the essential matrix: Decompose the essential matrix through a singular value decomposition algorithm to obtain multiple sets of rotation matrices and translation matrices.
  • In this step, the motion information of the rigid body is recovered from the essential matrix: the rotation matrix R and the translation matrix T.
  • This is obtained in this step by singular value decomposition (SVD).
  • The decomposition yields a total of four possible solutions (R, T), that is, four sets of rotation matrices and translation matrices, of which only one is correct: the one that gives positive depth in front of the monocular camera (the depth value is a positive number). The next step, detecting the depth information, is therefore required.
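The SVD decomposition into four candidate solutions can be sketched as follows. Sign and ordering conventions vary between references; this mirrors the common recipe, not necessarily the patent's exact one, and the names are my own.

```python
import numpy as np

def decompose_essential(E):
    """Decompose an essential matrix into the four candidate (R, T)
    pairs, as in step S2. T is recovered only as a direction; its
    absolute scale cannot be determined from a monocular view."""
    U, _, Vt = np.linalg.svd(E)
    # Force proper rotations (determinant +1).
    if np.linalg.det(U) < 0:
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0., -1., 0.],
                  [1.,  0., 0.],
                  [0.,  0., 1.]])
    R1, R2 = U @ W @ Vt, U @ W.T @ Vt
    t = U[:, 2]  # null direction of E^T
    return [(R1, t), (R1, -t), (R2, t), (R2, -t)]
```

The two rotations and two translation signs give exactly the four (R, T) combinations the text mentions; only the depth test that follows singles out the physically valid one.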
  • Step S3, determining the pose of the rigid body: estimate the three-dimensional space point coordinates from the two-dimensional space feature pairs and the multiple sets of rotation and translation matrices; detect the depth values of those coordinates; define the set of rotation and translation matrices whose depth values are positive as the target rotation matrix and the target translation matrix; and determine the rigid body pose according to the target rotation matrix and the target translation matrix.
  • After the essential matrix is decomposed by singular value decomposition in step S2, four possible solutions are obtained, so this step must determine the correct one among them. First the three-dimensional space point coordinates are estimated, and the depth value of each feature point is detected from those coordinates; only the set of solutions (R, T) whose depth values are positive is the final target (R, T).
  • In step S3, estimating the three-dimensional space point coordinates through the two-dimensional space feature pairs and the multiple sets of rotation and translation matrices further includes:
  • Let the two cameras be camera 1 and camera 2, and the two two-dimensional space point coordinates captured in the same frame be A(a1, a2) and B(b1, b2).
  • Let the rotation matrix of camera 1 be R1(R11, R12, R13), a 3×3 matrix, and its translation matrix be T1(T11, T12, T13), a 3×1 matrix; let the rotation matrix of camera 2 be R2(R21, R22, R23), a 3×3 matrix, and its translation matrix be T2(T21, T22, T23), a 3×1 matrix.
  • The coordinates of a three-dimensional space point C(c1, c2, c3) in the same frame are then obtained by triangulation from these quantities.
  • By substituting multiple different rotation and translation matrices, that is, multiple sets of (R1, T1, R2, T2) data pairs, multiple different three-dimensional space point coordinates are obtained.
  • For example, if four sets of rotation and translation matrices are obtained in step S2, four different three-dimensional space point coordinates can be estimated in this step, but only one of them, C, has a coordinate value c3 greater than 0; the R and T corresponding to that point C are the final target data.
  • The matched sets of two-dimensional space feature pairs are combined with the four possible solutions (R, T), and the corresponding three-dimensional space coordinates (x, y, z) are estimated by the above method according to the triangulation principle, providing accurate data for the subsequent detection of the depth value z.
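The triangulation of C(c1, c2, c3) from the two projections can be sketched with the standard linear (DLT) formulation; the patent shows its formula only as an image, so this is a stand-in for the same computation, with names of my own. A and B are taken here as normalized (calibration-corrected) image coordinates.

```python
import numpy as np

def triangulate(A, B, R1, T1, R2, T2):
    """Linear triangulation of one 3D point C from its projections
    A (camera 1) and B (camera 2), given each camera's rotation and
    translation. A, B are normalized image coordinates (a1, a2) and
    (b1, b2)."""
    P1 = np.hstack([R1, T1.reshape(3, 1)])   # 3x4 projection, camera 1
    P2 = np.hstack([R2, T2.reshape(3, 1)])   # 3x4 projection, camera 2
    a1, a2 = A
    b1, b2 = B
    # Each view contributes two rows of the homogeneous system M [C;1] = 0.
    M = np.stack([
        a1 * P1[2] - P1[0],
        a2 * P1[2] - P1[1],
        b1 * P2[2] - P2[0],
        b2 * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(M)
    C_h = Vt[-1]
    return C_h[:3] / C_h[3]   # (c1, c2, c3)
```

Running this once per candidate (R, T) yields the several candidate 3D points whose depth values are inspected next.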
  • In step S3, detecting the depth values of the three-dimensional space point coordinates and defining the set of rotation and translation matrices whose depth values are positive as the target rotation matrix and the target translation matrix includes:
  • for each set of estimated three-dimensional space point coordinates, detecting whether the corresponding depth value is positive and, if so, defining the corresponding set of rotation and translation matrices as the target rotation matrix and the target translation matrix.
  • Multiple depth values z are obtained in the above manner; the solutions (R, T) whose depth value z is zero or negative are eliminated, and the solution (R, T) whose depth value z is positive is retained as the final target data, with which the rigid body pose is determined.
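The depth test that picks the target (R, T) out of the four candidates can be sketched as follows; each candidate is paired with a point C triangulated under that candidate, and the point must lie in front of both cameras. The helper and its name are my own, not the patent's.

```python
import numpy as np

def select_target_pose(candidates):
    """Given candidate (R, T, C) triples, where C is a 3D point
    triangulated under that (R, T), keep the candidate whose depth is
    positive in front of both cameras."""
    for R, T, C in candidates:
        depth1 = C[2]                  # depth in camera 1's frame
        depth2 = (R @ C + T)[2]        # depth in camera 2's frame
        if depth1 > 0 and depth2 > 0:
            return R, T                # target rotation and translation
    return None
```

In practice all eight marker points can be checked rather than one, which makes the selection robust to a single noisy triangulation.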
  • In step S3, after the set of rotation and translation matrices with positive depth values is defined as the target rotation matrix and the target translation matrix, and before the rigid body pose is determined from them, the method further includes, as shown in FIG. 3:
  • Step S301, calculating the three-dimensional average distance: sum the distances between all pairs of three-dimensional space points and take the average value to obtain the three-dimensional average distance.
  • where D is the distance between two three-dimensional space points, (a1, a2, a3) are the three-dimensional space point coordinates of point 1, and (b1, b2, b3) are the three-dimensional space point coordinates of point 2.
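Step S301's distance formula is shown only as an image in the source; assuming D is the ordinary Euclidean distance sqrt((a1-b1)² + (a2-b2)² + (a3-b3)²), the three-dimensional average distance can be sketched as:

```python
import numpy as np
from itertools import combinations

def average_pairwise_distance(points):
    """Sum the Euclidean distance over every pair of 3D points and
    take the mean (step S301 / L1; also usable for the rigid-body
    average distance of step S302 / L2)."""
    dists = [np.linalg.norm(np.asarray(a) - np.asarray(b))
             for a, b in combinations(points, 2)]
    return sum(dists) / len(dists)
```

With eight markers this averages over 28 pairs, which damps the effect of any single noisy point.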
  • Step S302, calculating the rigid-body average distance: acquire the rigid body coordinates, sum the distances between all rigid-body marker points in those coordinates, and take the average to obtain the rigid-body average distance.
  • A distance calculation formula similar to that in step S301 can be used: calculate the distance between each pair of rigid-body marker points in the rigid-body coordinate system, then sum and take the average.
  • The rigid body coordinates in this step can be obtained by directly measuring the rigid body coordinates of the marker points, or by a multi-camera system in the following way, as shown in FIG. 4; accurate rigid body coordinates are then obtained with only one initialization, without repeated calculation:
  • Step S30201, acquiring data: acquire the two-dimensional space point coordinates of two adjacent frames captured by multiple cameras, the two-dimensional space point codes corresponding to those coordinates, and the spatial position data of the multiple cameras; classify the two-dimensional space point coordinates with the same code as the same kind, attributed to the same marker point.
  • The marker points in this step are generally set at different positions on the rigid body.
  • The two-dimensional space coordinate information of the marker points is captured by the multiple cameras, and the spatial point data is determined through the preset rigid-body encoding technology; the spatial point data includes the two-dimensional space point coordinates and the corresponding two-dimensional space point codes.
  • The spatial position data is obtained by calibrating and calculating the spatial position relationship of each camera in advance.
  • In this embodiment there are eight marker points on the rigid body, which can be eight light-emitting LEDs, so a rigid body usually contains eight spatial point data.
  • Each frame of data from a single camera contains the spatial point data of the eight marker points.
  • The code of a given marker point is the same in different frames.
  • Therefore, the spatial point data with the same spatial point code across all cameras can be grouped together as the same kind; these spatial point data are regarded as projections of the same marker point in space onto the different cameras.
  • Step S30202, calculating the three-dimensional space data: match the multiple cameras in pairs, and obtain the three-dimensional space point coordinates of each marker point in each frame according to the spatial position data of each camera pair and the multiple two-dimensional space point coordinates of the same frame.
  • The processing in this step is performed on each frame of data of each marker point.
  • The multiple cameras that captured the marker point are matched in pairs.
  • For each camera pair, the least-squares problem is solved by singular value decomposition (SVD) to obtain a set of three-dimensional space point data.
  • If the rigid body includes eight marker points, the eight three-dimensional space point codes and three-dimensional space point coordinates of the eight marker points are obtained through this step.
  • This step further includes:
  • Let the two matched cameras be camera 1 and camera 2, and the two two-dimensional space point coordinates captured in the same frame be A(a1, a2) and B(b1, b2).
  • Let the rotation matrix of camera 1 be R1(R11, R12, R13), a 3×3 matrix, and its translation matrix be T1(T11, T12, T13), a 3×1 matrix; let the rotation matrix of camera 2 be R2(R21, R22, R23), a 3×3 matrix, and its translation matrix be T2(T21, T22, T23), a 3×1 matrix.
  • The coordinates of a three-dimensional space point C(c1, c2, c3) in the same frame are then obtained by triangulation from these quantities.
  • From the two-dimensional space point coordinates captured by all the pairwise-matched cameras, one set of three-dimensional space point coordinates is finally calculated.
  • A threshold range, a preset coordinate parameter, is also applied: if a three-dimensional space point coordinate is found to deviate from the threshold range, it is considered erroneous data and eliminated.
  • Step S30203, calculating the rigid body coordinates: convert all three-dimensional space point codes and three-dimensional space point coordinates in the same frame into rigid body coordinates in the rigid-body coordinate system, obtaining the rigid body coordinates of each marker point in each frame.
  • Through the above steps, the three-dimensional space point data corresponding to each marker point is obtained, and the multiple three-dimensional space point data corresponding to the multiple marker points form a rigid body. If the rigid body currently in use has eight light-emitting LEDs, the rigid body contains eight three-dimensional space point data, and the three-dimensional space point coordinates in these eight data can be transformed into rigid body coordinates in the rigid-body coordinate system.
  • This step further includes:
  • The average value of each dimension of the three-dimensional space point coordinates corresponding to all the marker points in the same frame is calculated to obtain the coordinate average, and this coordinate average is recorded as the origin of the rigid-body coordinate system, serving as the reference for the three-dimensional space point coordinates corresponding to all the marker points.
  • For example, step S30202 obtains eight sets of three-dimensional space point coordinates; the average of each dimension of these eight coordinates is calculated to obtain the coordinate average.
  • The coordinate average is taken as the origin of the rigid-body coordinate system, and the difference between each three-dimensional space point coordinate and the origin is calculated; the resulting difference is the rigid body coordinate of each marker point.
  • That is, the difference from the origin is calculated for the three-dimensional space point coordinates corresponding to each of the eight marker points: for each dimension, the difference between the coordinate and the corresponding dimension of the origin is computed, finally yielding eight rigid body coordinates.
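The centroid-and-difference computation described above can be sketched as:

```python
import numpy as np

def to_rigid_body_coords(points3d):
    """Step S30203 sketch: take the per-dimension mean of the marker
    points' 3D coordinates as the origin of the rigid-body frame,
    then subtract it from each point."""
    pts = np.asarray(points3d, dtype=float)
    origin = pts.mean(axis=0)          # coordinate average = origin
    return pts - origin                # difference from the origin
```

By construction the resulting rigid body coordinates have zero mean, so the origin of the rigid-body frame is the centroid of the markers.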
  • In this way, multiple cameras capture multiple two-dimensional space point coordinates; sets of three-dimensional space point data are obtained through the solution algorithm; and after integration, averaging, and optimization of the multiple three-dimensional space point data, more accurate three-dimensional space point data is finally obtained and converted into rigid body coordinate data in the rigid-body coordinate system, providing definite and accurate data for the subsequent calculation of the rigid-body average distance.
  • Step S303, optimization: optimize the target translation matrix by the optimization formula to obtain the optimized target translation matrix, and determine the rigid body pose according to the target rotation matrix and the optimized target translation matrix.
  • The optimization formula is:
  • where L1 is the three-dimensional average distance, L2 is the rigid-body average distance, T is the target translation matrix before optimization, and T′ is the target translation matrix after optimization.
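The optimization formula itself appears in the source only as an image. A scale correction consistent with the variable definitions above (a reconstruction, not the patent's verbatim formula) would be:

```latex
T' = \frac{L_2}{L_1}\, T
```

That is, the monocular translation, whose absolute scale is ambiguous, is rescaled by the ratio of the known rigid-body average distance L2 to the average distance L1 of the triangulated three-dimensional points.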
  • Since the translation obtained from a monocular camera is determined only up to scale, the translation amount may vary, and there is no guarantee that the translation matrix T is accurate, true data.
  • Therefore, the three-dimensional space point coordinates of the rigid body are estimated by the triangulation principle, and the target translation matrix is optimized according to the estimated three-dimensional space point coordinates and the rigid body coordinates in the rigid-body coordinate system.
  • The optimized target translation matrix obtained through the above optimization formula makes the final rigid body pose more accurate and more authentic.
  • In the pose positioning method of this embodiment, the active optical rigid body carries coding information, so that motion-capture tracking and positioning no longer depend on the rigid body structure; matching two-dimensional space feature pairs can be obtained directly from the coding information to solve the rigid body pose.
  • The invention can thus realize the tracking and positioning of a rigid body at lower cost, which is a clear advantage over a complex multi-camera environment.
  • because the encoding information of the active optical rigid body is used to match the two adjacent frames, each time the active optical rigid body is tracked and positioned, the motion pose of the current frame relative to the initial frame can be calculated, thereby avoiding the cumulative error problem common in monocular camera tracking and further improving the tracking accuracy.
  • a device for positioning an active rigid body in a single-camera environment is proposed. As shown in FIG. 5, the device includes:
  • the essential matrix calculation module is used to obtain the two-dimensional space point coordinates of two adjacent frames captured by the monocular camera, the two-dimensional space point codes corresponding to the two-dimensional space point coordinates, and the camera parameters of the camera; to match the two-dimensional space point coordinates of the two adjacent frames according to the two-dimensional space point codes to obtain multiple sets of two-dimensional spatial feature pairs; and to construct a linear equation system from the multiple sets of two-dimensional spatial feature pairs and the camera parameters and solve for the essential matrix;
  • the rotation matrix and translation matrix calculation module is used to decompose the essential matrix through the singular value decomposition algorithm to obtain multiple sets of rotation matrices and translation matrices;
  • the rigid body pose determination module is used to estimate three-dimensional space point coordinates from the two-dimensional spatial feature pairs and the multiple sets of rotation matrices and translation matrices, detect the depth values of the three-dimensional space point coordinates, define the set of rotation matrix and translation matrix whose depth values are positive as the target rotation matrix and the target translation matrix, and determine the rigid body pose according to the target rotation matrix and the target translation matrix.
  • for content of this device embodiment that has already been described in the foregoing method embodiments, this embodiment does not repeat the description.
  • a device for positioning an active rigid body in a single-camera environment includes a memory, a processor, and a pose positioning program for an active rigid body in a single-camera environment that is stored in the memory and can run on the processor.
  • when the pose positioning program is executed by the processor, the steps in the pose positioning method of the active rigid body in the single-camera environment of the foregoing embodiments are implemented.
  • a computer-readable storage medium stores a pose positioning program for an active rigid body in a single-camera environment; when the program is executed by a processor, the steps in the pose positioning method of the active rigid body in the single-camera environment of the foregoing embodiments are implemented.
  • the storage medium may be a non-volatile storage medium.
  • the program can be stored in a computer-readable storage medium, and the storage medium can include: read-only memory (ROM), random access memory (RAM), magnetic disk, optical disk, etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Length Measuring Devices By Optical Means (AREA)

Abstract

The present invention relates to the technical field of computer vision, and particularly to a method for determining the pose of an active rigid body in a single-camera environment, and a related apparatus. The method comprises: acquiring the two-dimensional spatial point coordinates and two-dimensional spatial point codes of two adjacent frames, and a camera parameter; matching the two-dimensional spatial point coordinates according to the two-dimensional spatial point codes to obtain multiple two-dimensional spatial feature pairs; constructing a set of linear equations from the multiple two-dimensional spatial feature pairs and the camera parameter, and solving to obtain an essential matrix; decomposing the essential matrix by means of a singular value decomposition algorithm to obtain multiple rotation matrices and translation matrices; and estimating three-dimensional spatial point coordinates, detecting depth values, determining a target rotation matrix and a target translation matrix, and determining the pose of a rigid body according to the target rotation matrix and the target translation matrix. The invention achieves tracking and positioning of an active light-emitting rigid body in a single-camera environment at low cost, and is thus more advantageous than employing a complicated multi-camera setup.

Description

Method for determining pose of active rigid body in single-camera environment, and related apparatus

Technical field
The present invention relates to the technical field of computer vision, and in particular to a method for positioning an active rigid body in a single-camera environment and related equipment.
Background
The traditional optical motion capture method uses an ultra-high-power near-infrared light source inside the motion capture camera to emit infrared light onto passive marker points. The marker points, coated with a highly reflective material, reflect the infrared light, and this infrared light, together with ambient light carrying background information, passes through a low-distortion lens to reach the camera's infrared narrow band-pass filter unit. Since the pass band of the infrared narrow band-pass filter unit matches the band of the infrared light source, the ambient light carrying redundant background information is filtered out, leaving only the infrared light carrying the marker point information to pass through and be recorded by the camera's photosensitive element. The photosensitive element then converts the light signal into an image signal and outputs it to the control circuit, where an image processing unit uses a Field Programmable Gate Array (FPGA) to preprocess the image signal in hardware, and finally outputs the 2D coordinate information of the marker points to the tracking software.
In a traditional optical motion capture system, whether for active or passive rigid body tracking, the system server needs to receive the 2D data of each camera in a multi-camera system and then, using the principle of multi-view vision, compute 3D coordinates in three-dimensional space from the matching relationships between the 2D point clouds and the pre-calibrated pose relationships between the cameras, and on this basis solve the motion information of the rigid body in space. This approach relies on the collaborative work of multiple cameras so that rigid bodies can be identified and tracked over a relatively large space, which leads to the high cost and difficult maintenance of the motion capture system.
Summary of the invention
The main purpose of the present invention is to provide a pose positioning method for an active rigid body in a single-camera environment and related equipment, aiming to solve the technical problems of high cost and difficult maintenance caused by the use of multi-camera systems in current passive or active motion capture methods.
To achieve the above objective, the present invention provides a method for positioning an active rigid body in a single-camera environment. The method includes the following steps:
obtaining the two-dimensional space point coordinates of two adjacent frames captured by a monocular camera, the two-dimensional space point codes corresponding to the two-dimensional space point coordinates, and the camera parameters of the camera; matching the two-dimensional space point coordinates of the two adjacent frames according to the two-dimensional space point codes to obtain multiple sets of two-dimensional spatial feature pairs; constructing a linear equation system from the multiple sets of two-dimensional spatial feature pairs and the camera parameters; and solving for the essential matrix;
decomposing the essential matrix through a singular value decomposition algorithm to obtain multiple sets of rotation matrices and translation matrices;
estimating three-dimensional space point coordinates from the two-dimensional spatial feature pairs and the multiple sets of rotation matrices and translation matrices, detecting the depth values of the three-dimensional space point coordinates, defining the set of rotation matrix and translation matrix whose depth values are positive as the target rotation matrix and the target translation matrix, and determining the rigid body pose according to the target rotation matrix and the target translation matrix.
Optionally, the determining the rigid body pose according to the target rotation matrix and the target translation matrix includes:
summing the distances between all three-dimensional space points in the three-dimensional space point coordinates and taking the average to obtain the three-dimensional average distance;
obtaining the rigid body coordinates, summing the distances between all rigid body marker points in the rigid body coordinates and taking the average to obtain the rigid body average distance;
optimizing the target translation matrix by an optimization formula to obtain the optimized target translation matrix, and determining the rigid body pose according to the target rotation matrix and the optimized target translation matrix;
the optimization formula being:
T′ = (L2/L1) · T
where L1 is the three-dimensional average distance, L2 is the rigid body average distance, T is the target translation matrix before optimization, and T′ is the target translation matrix after optimization.
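As an illustrative sketch of this optimization step (the helper names are ours and NumPy usage is an assumption, not verbatim from the patent):

```python
import numpy as np

def average_pairwise_distance(points):
    """Mean Euclidean distance over all pairs of points (points: N x 3)."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    dists = [np.linalg.norm(pts[i] - pts[j])
             for i in range(n) for j in range(i + 1, n)]
    return float(np.mean(dists))

def rescale_translation(T, triangulated_points, rigid_body_points):
    """Apply T' = (L2/L1) * T: L1 is the average distance of the
    triangulated (up-to-scale) points, L2 the known rigid-body average
    distance, so the ratio restores the metric scale of T."""
    L1 = average_pairwise_distance(triangulated_points)
    L2 = average_pairwise_distance(rigid_body_points)
    return np.asarray(T, dtype=float) * (L2 / L1)
```

Because a monocular reconstruction is a uniformly scaled copy of the true marker layout, the ratio L2/L1 is exactly the missing scale factor of the translation.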
Optionally, before the obtaining the rigid body coordinates, summing the distances between all rigid body marker points in the rigid body coordinates and taking the average to obtain the rigid body average distance, the method includes:
obtaining the two-dimensional space point coordinates of two adjacent frames captured by multiple cameras, the two-dimensional space point codes corresponding to the two-dimensional space point coordinates, and the spatial position data of the multiple cameras; classifying multiple two-dimensional space point coordinates with the same two-dimensional space point code into the same class and marking them under the same marker point;
matching the multiple cameras in pairs, and obtaining the three-dimensional space point coordinates of each marker point in each frame according to the spatial position data of the two cameras and the multiple two-dimensional space point coordinates of the same class in the same frame;
converting all three-dimensional space point coordinates of the same frame into rigid body coordinates in the rigid body coordinate system to obtain the rigid body coordinates of each marker point in each frame.
Optionally, the matching the multiple cameras in pairs and obtaining the three-dimensional space point coordinates of each marker point in each frame according to the spatial position data of the two cameras and the multiple two-dimensional space point coordinates of the same class in the same frame includes:
matching in pairs all cameras that have captured the same marker point, and, for the two two-dimensional space point coordinates captured in the same frame by the two matched cameras, solving a least-squares problem by singular value decomposition to obtain a set of three-dimensional space point coordinates;
judging whether each three-dimensional space point coordinate is within a preset threshold range, and if it exceeds the threshold range, eliminating that three-dimensional space point coordinate to obtain the set of three-dimensional space point coordinates after elimination;
calculating the average of the set of three-dimensional space point coordinates, and optimizing it by the Gauss-Newton method to obtain the three-dimensional space point coordinates of the marker point.
Optionally, the converting all three-dimensional space point coordinates of the same frame into rigid body coordinates in the rigid body coordinate system to obtain the rigid body coordinates of each marker point in each frame includes:
calculating the coordinate average of the three-dimensional space point coordinates corresponding to the multiple marker points of the same frame, and recording the coordinate average as the origin of the rigid body coordinate system;
calculating the difference between the origin and the three-dimensional space point coordinates corresponding to each marker point of the same frame, respectively, to obtain the rigid body coordinates of each marker point in each frame.
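A minimal sketch of this centroid-based conversion (the function name is ours; taking the difference as point minus origin is an assumption about the sign convention):

```python
import numpy as np

def to_rigid_body_coords(frame_points):
    """Convert the same-frame 3D marker coordinates into rigid-body
    coordinates: the centroid of all markers becomes the origin."""
    pts = np.asarray(frame_points, dtype=float)
    origin = pts.mean(axis=0)   # coordinate average = rigid-body origin
    return pts - origin         # per-marker rigid-body coordinates
```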
Optionally, the estimating three-dimensional space point coordinates from the two-dimensional spatial feature pairs and the multiple sets of rotation matrices and translation matrices includes:
Let the two cameras be camera 1 and camera 2, and let the two two-dimensional space point coordinates captured in the same frame be A(a1, a2) and B(b1, b2). The rotation matrix of camera 1 is R1(R11, R12, R13), where R1 is a 3*3 matrix, and its translation matrix is T1(T11, T12, T13), where T1 is a 3*1 matrix. The rotation matrix of camera 2 is R2(R21, R22, R23) and its translation matrix is T2(T21, T22, T23); likewise, R2 is a 3*3 matrix and T2 is a 3*1 matrix. The three-dimensional space point coordinates can be obtained by the following method:
1) according to the intrinsic parameters and distortion parameters of the two cameras, convert the pixel coordinates A(a1, a2), B(b1, b2) into camera coordinates A′(a1′, a2′), B′(b1′, b2′);
2) construct the least-squares matrices X and Y, where X is a 4*3 matrix and Y is a 4*1 matrix; the first row of X is a1′*R13-R11, the second row of X is a2′*R13-R12, the third row of X is b1′*R23-R21, and the fourth row of X is b2′*R23-R22; the first row of Y is T11-a1′*T13, the second row of Y is T12-a2′*T13, the third row of Y is T21-b1′*T23, and the fourth row of Y is T22-b2′*T23;
3) according to the equation X*C=Y and the constructed matrices X and Y, use singular value decomposition (SVD) to obtain a three-dimensional space point coordinate C(c1, c2, c3);
4) according to the multiple different rotation matrices and translation matrices R1, T1, R2, T2, obtain multiple different three-dimensional space point coordinates.
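Steps 2) and 3) above can be sketched as follows (the function name is ours, and `np.linalg.lstsq`, itself SVD-based, stands in for the explicit SVD solve):

```python
import numpy as np

def triangulate(a, b, R1, T1, R2, T2):
    """Least-squares triangulation of one marker from two views.
    a, b: camera coordinates (a1', a2') and (b1', b2');
    R1, R2: 3x3 rotation matrices (rows R11..R13, R21..R23);
    T1, T2: length-3 translation vectors."""
    X = np.vstack([
        a[0] * R1[2] - R1[0],   # a1'*R13 - R11
        a[1] * R1[2] - R1[1],   # a2'*R13 - R12
        b[0] * R2[2] - R2[0],   # b1'*R23 - R21
        b[1] * R2[2] - R2[1],   # b2'*R23 - R22
    ])
    Y = np.array([
        T1[0] - a[0] * T1[2],   # T11 - a1'*T13
        T1[1] - a[1] * T1[2],   # T12 - a2'*T13
        T2[0] - b[0] * T2[2],   # T21 - b1'*T23
        T2[1] - b[1] * T2[2],   # T22 - b2'*T23
    ])
    C, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return C  # three-dimensional point coordinates (c1, c2, c3)
```

Each row of X*C=Y restates one component of the projection constraint a′ = (R·C + T) normalized by depth, so four rows from two views over-determine the three unknowns.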
Optionally, the detecting the depth values of the three-dimensional space point coordinates and defining the set of rotation matrix and translation matrix whose depth values are positive as the target rotation matrix and the target translation matrix includes:
according to the estimated three-dimensional space point coordinates, detecting whether the depth value corresponding to each three-dimensional space point coordinate is positive, and if so, defining the corresponding set of rotation matrix and translation matrix as the target rotation matrix and the target translation matrix.
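A sketch of the positive-depth test (illustrative; it assumes the first view is the reference frame and (R, T) maps points into the second view):

```python
import numpy as np

def has_positive_depth(points_3d, R, T):
    """True if every estimated 3D point lies in front of both cameras:
    z > 0 in the reference view and z > 0 after applying (R, T)."""
    for p in np.asarray(points_3d, dtype=float):
        if p[2] <= 0 or (R @ p + T)[2] <= 0:
            return False
    return True
```

Running this check over the candidate (R, T) decompositions singles out the one physically valid solution.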
Further, to achieve the above objective, the present invention also provides a pose positioning device for an active rigid body in a single-camera environment, including:
an essential matrix calculation module, used to obtain the two-dimensional space point coordinates of two adjacent frames captured by the monocular camera, the two-dimensional space point codes corresponding to the two-dimensional space point coordinates, and the camera parameters of the camera; match the two-dimensional space point coordinates of the two adjacent frames according to the two-dimensional space point codes to obtain multiple sets of two-dimensional spatial feature pairs; construct a linear equation system from the multiple sets of two-dimensional spatial feature pairs and the camera parameters; and solve for the essential matrix;
a rotation matrix and translation matrix calculation module, used to decompose the essential matrix through a singular value decomposition algorithm to obtain multiple sets of rotation matrices and translation matrices;
a rigid body pose determination module, used to estimate three-dimensional space point coordinates from the two-dimensional spatial feature pairs and the multiple sets of rotation matrices and translation matrices, detect the depth values of the three-dimensional space point coordinates, define the set of rotation matrix and translation matrix whose depth values are positive as the target rotation matrix and the target translation matrix, and determine the rigid body pose according to the target rotation matrix and the target translation matrix.
To achieve the above objective, the present invention also provides a pose positioning device for an active rigid body in a single-camera environment. The device includes a memory, a processor, and a pose positioning program for an active rigid body in a single-camera environment that is stored in the memory and can run on the processor; when the pose positioning program is executed by the processor, the steps of the pose positioning method for an active rigid body in a single-camera environment described above are implemented.
To achieve the above objective, the present invention also provides a computer-readable storage medium storing a pose positioning program for an active rigid body in a single-camera environment; when the pose positioning program is executed by a processor, the steps of the pose positioning method for an active rigid body in a single-camera environment described above are implemented.
In the pose positioning method for an active rigid body in a single-camera environment provided by the present invention, in the process of determining the rigid body pose, the essential matrix is solved by matching the feature points in the coordinates of two adjacent frames; the essential matrix is decomposed by a singular value decomposition algorithm to obtain multiple sets of rotation matrices and translation matrices; and by detecting the depth values of the feature points, the final target rotation matrix and translation matrix are determined. The whole process does not depend on the rigid body structure; the required matching data can be obtained from the codes and coordinates alone to solve the rigid body pose information. In a single-camera environment, the present invention can realize tracking and positioning of an active optical rigid body at a lower cost, a clear advantage over a complex multi-camera environment. In addition, because the present invention matches the feature points of two adjacent frames every time, each tracking and positioning of the active optical rigid body can calculate the motion pose of the current frame relative to the initial frame, thereby avoiding the cumulative error problem common in monocular camera tracking and further improving the tracking accuracy.
Description of the drawings
By reading the following detailed description of the preferred embodiments, various other advantages and benefits will become clear to those of ordinary skill in the art. The drawings are only for the purpose of illustrating the preferred embodiments and are not considered a limitation of the present invention.
FIG. 1 is a schematic structural diagram of the operating environment of a pose positioning device for an active rigid body in a single-camera environment according to an embodiment of the present invention;
FIG. 2 is a flowchart of a pose positioning method for an active rigid body in a single-camera environment in an embodiment of the present invention;
FIG. 3 is a detailed flowchart of step S3 in an embodiment of the present invention;
FIG. 4 is a detailed flowchart of step S302 in an embodiment of the present invention;
FIG. 5 is a structural diagram of a pose positioning device for an active rigid body in a single-camera environment in an embodiment of the present invention.
Detailed description of the embodiments
In order to make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit it.
Those skilled in the art can understand that, unless specifically stated, the singular forms "a", "an", "said" and "the" used herein may also include plural forms. It should be further understood that the word "comprising" used in the specification of the present invention refers to the presence of the described features, integers, steps, operations, elements and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
Referring to FIG. 1, a schematic structural diagram of the operating environment of the pose positioning device for an active rigid body in a single-camera environment according to an embodiment of the present invention.
As shown in FIG. 1, the pose positioning device for an active rigid body in a single-camera environment includes: a processor 1001, such as a CPU, a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used to implement connection and communication between these components. The user interface 1003 may include a display screen (Display) and an input unit such as a keyboard (Keyboard), and the network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface). The memory 1005 may be a high-speed RAM memory or a non-volatile memory, such as a magnetic disk memory. Optionally, the memory 1005 may also be a storage device independent of the aforementioned processor 1001.
Those skilled in the art can understand that the hardware structure of the pose positioning device shown in FIG. 1 does not constitute a limitation on the device, which may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
As shown in FIG. 1, the memory 1005, which is a computer-readable storage medium, may include an operating system, a network communication module, a user interface module, and a pose positioning program for an active rigid body in a single-camera environment. The operating system is a program that manages and controls the device's hardware and software resources and supports the running of the pose positioning program and other software and/or programs.
In the hardware structure of the device shown in FIG. 1, the network interface 1004 is mainly used to access the network; the user interface 1003 is mainly used to detect confirmation commands, edit commands, and the like; and the processor 1001 can be used to call the pose positioning program for an active rigid body in a single-camera environment stored in the memory 1005 and perform the operations of the following embodiments of the pose positioning method.
Referring to FIG. 2, a flowchart of a pose positioning method for an active rigid body in a single-camera environment in an embodiment of the present invention. As shown in FIG. 2, the method includes the following steps:
Step S1, solving the essential matrix: obtain the two-dimensional space point coordinates of two adjacent frames captured by the monocular camera, the two-dimensional space point codes corresponding to the two-dimensional space point coordinates, and the camera parameters of the camera; match the two-dimensional space point coordinates of the two adjacent frames according to the two-dimensional space point codes to obtain multiple sets of two-dimensional spatial feature pairs; construct a linear equation system from the multiple sets of two-dimensional spatial feature pairs and the camera parameters; and solve for the essential matrix.
The marker points in this step are generally arranged at different positions on the rigid body. When the rigid body moves within the capture range of the camera, the monocular camera captures the two-dimensional space coordinate information of the marker points to determine the space point data, which includes the two-dimensional space point coordinates and the corresponding two-dimensional space point codes. Usually, eight marker points are provided on the rigid body; the marker points may be eight light-emitting LED lamps. Therefore, a rigid body usually contains eight space point data, and each frame of monocular camera data contains the space point data of the eight marker points; the code of the same marker point is the same across frames, and the codes of different marker points in the same frame are different. On this basis, all two-dimensional space points in two adjacent frames captured by the monocular camera can be matched: two two-dimensional space points with the same two-dimensional space point code form one set of two-dimensional spatial feature pairs and are regarded as the projections of the same marker point onto the monocular camera in the two adjacent frames. When the rigid body contains eight marker points, there are eight sets of two-dimensional spatial feature pairs.
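The code-based matching can be sketched as follows (the data layout, a dict mapping each point code to its (u, v) coordinates, is our assumption):

```python
def match_feature_pairs(prev_frame, curr_frame):
    """Match the 2D points of two adjacent frames by their point codes.
    Each frame is a dict {code: (u, v)}; returns the coordinate pairs
    ((u_prev, v_prev), (u_curr, v_curr)) for codes seen in both frames."""
    common = sorted(prev_frame.keys() & curr_frame.keys())
    return [(prev_frame[c], curr_frame[c]) for c in common]
```

With eight LED markers visible in both frames, this yields the eight two-dimensional spatial feature pairs used to solve the essential matrix.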
Before the monocular camera captures space point data, the camera parameters of the monocular camera need to be calibrated, that is, the camera's optical center, focal length, distortion parameters, and so on. These camera parameters form a matrix, denoted as matrix M, which is used in the essential matrix calculation. When solving the essential matrix, this step adopts the epipolar geometry constraint principle and constructs a linear equation system from the multiple sets of two-dimensional spatial feature pairs and the camera parameters to solve for the essential matrix, as follows:
To solve for the essential matrix, the fundamental matrix F is computed first. Each two-dimensional spatial feature pair (p1, p2), written in homogeneous coordinates, satisfies the epipolar constraint
p2^T F p1 = 0
The fundamental matrix F is obtained from the multiple sets of two-dimensional spatial feature pairs. Since F = M^(-T) E M^(-1) and the matrix M corresponding to the camera parameters is known, the essential matrix E can be obtained.
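The construction above can be sketched as follows. This is a hedged illustration rather than the patent's implementation: it uses the classic normalized eight-point algorithm to estimate F from the feature pairs, then forms E = M^T F M, which follows from F = M^(-T) E M^(-1). All function names are illustrative.

```python
import numpy as np

def _normalize(pts):
    # Hartley normalization: center the points and scale the mean distance
    # to sqrt(2) for numerical stability (an implementation detail assumed
    # here, not stated in the patent).
    pts = np.asarray(pts, dtype=float)
    centroid = pts.mean(axis=0)
    d = np.sqrt(((pts - centroid) ** 2).sum(axis=1)).mean()
    s = np.sqrt(2.0) / d
    t = np.array([[s, 0.0, -s * centroid[0]],
                  [0.0, s, -s * centroid[1]],
                  [0.0, 0.0, 1.0]])
    ph = np.column_stack([pts, np.ones(len(pts))])
    return (t @ ph.T).T, t

def eight_point_fundamental(pts1, pts2):
    """Estimate F from N >= 8 matched feature pairs in pixel coordinates."""
    n1, t1 = _normalize(pts1)
    n2, t2 = _normalize(pts2)
    # each pair gives one row of the linear system [x2 y2 1] F [x1 y1 1]^T = 0
    a = np.array([[x2*x1, x2*y1, x2, y2*x1, y2*y1, y2, x1, y1, 1.0]
                  for (x1, y1, _), (x2, y2, _) in zip(n1, n2)])
    _, _, vt = np.linalg.svd(a)
    f = vt[-1].reshape(3, 3)          # null-space vector, reshaped to 3x3
    u, s, vt = np.linalg.svd(f)
    s[2] = 0.0                        # enforce rank 2
    f = u @ np.diag(s) @ vt
    return t2.T @ f @ t1              # undo the normalization

def essential_from_fundamental(f, m):
    # from F = M^(-T) E M^(-1) it follows that E = M^T F M
    return m.T @ f @ m
```

In practice the eight feature pairs of one rigid body are passed in as two (8, 2) arrays, and the resulting F can be verified against the epipolar constraint for every pair.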
Step S2, decomposing the essential matrix: the essential matrix is decomposed by a singular value decomposition algorithm to obtain multiple sets of rotation matrices and translation matrices.
After the essential matrix is obtained, the motion information of the rigid body, namely the rotation matrix R and the translation matrix T, is recovered from it by singular value decomposition (SVD) in this step. Decomposing the essential matrix E obtained in step S1 by SVD yields four possible solutions (R, T) in total, that is, four sets of rotation and translation matrices, of which only one correct solution gives positive depth in the monocular camera (a positive depth value). The next step of checking the depth information is therefore required.
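The four candidate solutions can be recovered with the standard SVD recipe from multi-view geometry, sketched below. The recipe (E = U diag(1,1,0) V^T with the auxiliary matrix W) is a textbook construction assumed here, not quoted from the patent.

```python
import numpy as np

def decompose_essential(e):
    """Return the four candidate (R, t) pairs encoded in an essential matrix."""
    u, _, vt = np.linalg.svd(e)
    # flip signs so both factors correspond to proper rotations (det = +1)
    if np.linalg.det(u) < 0:
        u = -u
    if np.linalg.det(vt) < 0:
        vt = -vt
    w = np.array([[0.0, -1.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0]])
    r1 = u @ w @ vt
    r2 = u @ w.T @ vt
    t = u[:, 2]          # translation direction, known only up to sign/scale
    return [(r1, t), (r1, -t), (r2, t), (r2, -t)]
```

Exactly one of the four candidates places the triangulated points in front of both views, which is the depth check performed in step S3.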
Step S3, determining the rigid-body pose: the three-dimensional space point coordinates are estimated from the two-dimensional spatial feature pairs and the multiple sets of rotation and translation matrices; the depth values of the three-dimensional space point coordinates are checked; the set of rotation and translation matrices whose depth value is positive is defined as the target rotation matrix and the target translation matrix; and the rigid-body pose is determined from the target rotation matrix and the target translation matrix.
After the essential matrix is decomposed by singular value decomposition in step S2, four possible solutions are obtained, so this step must single out the correct one among them. The three-dimensional space point coordinates are estimated first, and the depth values of the feature points are checked from those coordinates; only the set of solutions (R, T) with a positive depth value is the final target (R, T).
In one embodiment, in step S3, estimating the three-dimensional space point coordinates from the two-dimensional spatial feature pairs and the multiple sets of rotation and translation matrices further includes:
Let the two cameras be camera 1 and camera 2, and let the two two-dimensional space point coordinates captured in the same frame be A(a1, a2) and B(b1, b2). The rotation matrix of camera 1 is R1(R11, R12, R13), a 3×3 matrix, and its translation matrix is T1(T11, T12, T13), a 3×1 matrix; the rotation matrix of camera 2 is R2(R21, R22, R23) and its translation matrix is T2(T21, T22, T23); likewise, R2 is a 3×3 matrix and T2 is a 3×1 matrix (here R11, R12, R13 denote the three rows of R1, and similarly for R2). A three-dimensional space point coordinate C(c1, c2, c3) in the same frame is obtained as follows:
1) Using the intrinsic and distortion parameters of the two cameras, convert the pixel coordinates A(a1, a2), B(b1, b2) into camera coordinates A′(a1′, a2′), B′(b1′, b2′);
2) Construct the least-squares matrices X and Y, where X is a 4×3 matrix and Y is a 4×1 matrix. The first row of X is a1′*R13−R11, the second row is a2′*R13−R12, the third row is b1′*R23−R21, and the fourth row is b2′*R23−R22; the first row of Y is T11−a1′*T13, the second row is T12−a2′*T13, the third row is T21−b1′*T23, and the fourth row is T22−b2′*T23;
3) From the equation X*C = Y and the constructed matrices X and Y, one of the three-dimensional space point coordinates C(c1, c2, c3) can be solved by SVD;
Finally, multiple different three-dimensional space point coordinates are obtained from the multiple different rotation and translation matrices, that is, from the several rotation and translation data pairs such as (R1, T1) and (R2, T2).
For example, if four sets of rotation and translation matrices are obtained in step S2, four different three-dimensional space point coordinates can be estimated in this step, but only one of them, C, has a coordinate value c3 greater than 0; the R and T corresponding to that three-dimensional space point coordinate C are the final target data.
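Steps 1) to 3) and the depth check can be sketched together as follows. The matrix rows follow the construction given above; the helper names and the convention of checking depth in both views are illustrative assumptions.

```python
import numpy as np

def triangulate_pair(a, b, r1, t1, r2, t2):
    """Least-squares solution of X*C = Y for one 3D point C.
    a, b: normalized camera coordinates (a1', a2') and (b1', b2')."""
    x = np.array([a[0] * r1[2] - r1[0],     # rows follow the construction above
                  a[1] * r1[2] - r1[1],
                  b[0] * r2[2] - r2[0],
                  b[1] * r2[2] - r2[1]])
    y = np.array([t1[0] - a[0] * t1[2],
                  t1[1] - a[1] * t1[2],
                  t2[0] - b[0] * t2[2],
                  t2[1] - b[1] * t2[2]])
    c, *_ = np.linalg.lstsq(x, y, rcond=None)
    return c

def pick_pose_with_positive_depth(a, b, candidates):
    """Depth (cheirality) check: keep the (R, T) whose triangulated point
    lies in front of both views (first view assumed at the origin)."""
    for r, t in candidates:
        c = triangulate_pair(a, b, np.eye(3), np.zeros(3), r, t)
        if c[2] > 0 and r[2] @ c + t[2] > 0:
            return r, t, c
    return None
```

Running the check over all candidate (R, T) sets from step S2 leaves exactly one solution with positive depth, which becomes the target rotation and translation.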
In this embodiment, the matched sets of two-dimensional spatial feature pairs are combined with the four possible solutions (R, T), and the corresponding three-dimensional space coordinate data (x, y, z) are estimated in the above manner according to the triangulation principle, providing accurate data for the subsequent check of the depth value z.
In one embodiment, in step S3, checking the depth values of the three-dimensional space point coordinates and defining the set of rotation and translation matrices with a positive depth value as the target rotation matrix and the target translation matrix includes:
checking, from the estimated three-dimensional space point coordinates, whether the depth value corresponding to each coordinate is positive, and if so, defining the corresponding set of rotation and translation matrices as the target rotation matrix and the target translation matrix.
In this embodiment, multiple depth values z are obtained by the above solution; the solutions (R, T) whose depth value z is zero or negative are discarded, and the solution (R, T) whose depth value z is positive is retained as the final target data, from which the rigid-body pose is determined.
In one embodiment, in step S3, after the set of rotation and translation matrices with a positive depth value has been defined as the target rotation matrix and the target translation matrix, and before the rigid-body pose is determined from them, the method includes, as shown in Figure 3:
Step S301, calculating the three-dimensional average distance: the distances between all three-dimensional space points in the three-dimensional space point coordinates are summed and averaged to obtain the three-dimensional average distance.
When calculating the three-dimensional average distance, a three-dimensional space point 1 may be chosen at random, and the distance between point 1 and any other three-dimensional space point 2 is computed with the following formula:
D = sqrt((a1 − b1)² + (a2 − b2)² + (a3 − b3)²)
where D is the distance between the two three-dimensional space points, (a1, a2, a3) are the coordinates of three-dimensional space point 1, and (b1, b2, b3) are the coordinates of three-dimensional space point 2.
The distance between point 2 and any other three-dimensional space point 3 that has not yet taken part in the calculation is then computed, and so on until all three-dimensional space points have taken part; all the distances are then summed and averaged. Alternatively, after all points have taken part, the last point 8 may additionally be paired with the first randomly chosen point 1 to compute one more distance, and all the distances are then summed and averaged.
For example, when the rigid body has eight marker points, there are eight three-dimensional space points in the three-dimensional space point coordinates; eight distance values are computed in the above manner, added together, and divided by eight to obtain the three-dimensional average distance.
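Assuming the chained variant described above (consecutive points, closing the loop from the last point back to the first), a minimal sketch is:

```python
import math

# Sketch of step S301 under the chained interpretation: visit the points in
# sequence, take the distance between consecutive points (wrapping from the
# last back to the first), then average. Names are illustrative.

def chain_average_distance(points):
    """points: list of (x, y, z) tuples; returns the average of the
    consecutive-pair distances, including the wrap-around pair."""
    n = len(points)
    total = 0.0
    for i in range(n):
        a = points[i]
        b = points[(i + 1) % n]           # wrap around: last -> first
        total += math.dist(a, b)          # sqrt(sum((ai - bi)^2))
    return total / n

# Unit-cube corners ordered so that every consecutive edge has length 1
cube_loop = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0),
             (0, 1, 1), (1, 1, 1), (1, 0, 1), (0, 0, 1)]
print(chain_average_distance(cube_loop))  # 1.0
```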
Step S302, calculating the rigid-body average distance: the rigid-body coordinates are obtained, and the distances between all rigid-body marker points in those coordinates are summed and averaged to obtain the rigid-body average distance.
When calculating the rigid-body average distance, a distance formula similar to that of step S301 may be used: the distance between the coordinates of each pair of rigid-body marker points in the rigid-body coordinate system is computed, and the distances are then summed and averaged.
The rigid-body coordinates in this step may be obtained by actually measuring the rigid-body coordinates of the marker points, as shown in Figure 4, or by a multi-camera system in the following way, in which a single initialization yields accurate rigid-body coordinates without repeated calculation:
Step S30201, acquiring data: the two-dimensional space point coordinates of two adjacent frames captured by multiple cameras, the two-dimensional space point codes corresponding to those coordinates, and the spatial position data of the multiple cameras are acquired; two-dimensional space point coordinates with the same code are grouped into the same class and attributed to the same marker point.
The marker points in this step are generally placed at different positions on the rigid body. The two-dimensional coordinate information of the marker points is captured by multiple cameras, and the spatial point data are determined through a preset rigid-body encoding technique; the spatial point data comprise the two-dimensional space point coordinates and the corresponding two-dimensional space point codes. The spatial position data are obtained from the spatial position relationship of the cameras established by calibration. Typically, eight marker points are provided on the rigid body, and they may be eight light-emitting LEDs, so a rigid body usually contains eight items of spatial point data. In the information captured by the multiple cameras, each frame from a single camera contains the spatial point data of the eight marker points; the same marker point carries the same code in different frames, and different marker points carry different codes within the same frame. On this basis, the spatial point data carrying the same code across all cameras can be grouped together as one class and regarded as the projections of the same marker point onto the different cameras.
Step S30202, calculating the three-dimensional space data: the multiple cameras are matched in pairs, and the three-dimensional space point coordinates of each marker point in each frame are obtained from the spatial position data of each camera pair and the multiple two-dimensional space point coordinates of the same class in the same frame.
Each frame of data of each marker point is processed in this step. During processing, the cameras that captured the marker point are matched in pairs, and a set of three-dimensional space point data is obtained by solving a least-squares problem through singular value decomposition (SVD), using the triangulation principle of multi-view geometry.
For example, when the rigid body includes eight marker points, the eight three-dimensional space point codes and three-dimensional space point coordinates of the eight marker points are obtained in this step.
This step further includes:
(1) Solving by least squares: all cameras that captured the same marker point are matched in pairs; for the two two-dimensional space point coordinates captured by a matched camera pair in the same frame, a three-dimensional space point is solved by the least-squares method via singular value decomposition, using the triangulation principle of multi-view geometry. After all pairwise camera matches have been traversed, a set of three-dimensional space points is obtained; this set constitutes the three-dimensional space point coordinates of the marker point.
Let the two cameras be camera 1 and camera 2, and let the two two-dimensional space point coordinates captured in the same frame be A(a1, a2) and B(b1, b2). The rotation matrix of camera 1 is R1(R11, R12, R13), a 3×3 matrix, and its translation matrix is T1(T11, T12, T13), a 3×1 matrix; the rotation matrix of camera 2 is R2(R21, R22, R23) and its translation matrix is T2(T21, T22, T23); likewise, R2 is a 3×3 matrix and T2 is a 3×1 matrix. A three-dimensional space point coordinate C(c1, c2, c3) in the same frame is obtained as follows:
1) Using the intrinsic and distortion parameters of the two cameras, convert the pixel coordinates A(a1, a2), B(b1, b2) into camera coordinates A′(a1′, a2′), B′(b1′, b2′);
2) Construct the least-squares matrices X and Y, where X is a 4×3 matrix and Y is a 4×1 matrix. The first row of X is a1′*R13−R11, the second row is a2′*R13−R12, the third row is b1′*R23−R21, and the fourth row is b2′*R23−R22; the first row of Y is T11−a1′*T13, the second row is T12−a2′*T13, the third row is T21−b1′*T23, and the fourth row is T22−b2′*T23.
3) From the equation X*C = Y and the constructed matrices X and Y, a three-dimensional space point coordinate C can be solved by SVD.
In this step, the two two-dimensional space point coordinates captured by every pairwise camera match are finally solved, yielding a set of three-dimensional space point coordinates.
(2) Removing coordinates outside the threshold: whether each three-dimensional space point coordinate lies within a preset threshold range is checked; coordinates exceeding the threshold range are removed, yielding the set of three-dimensional space point coordinates after removal.
After the multiple three-dimensional space point coordinates are obtained, it is necessary to check whether they lie within the preset threshold range, i.e., within a small threshold distance; this threshold range is a coordinate parameter preset in advance. A three-dimensional space point coordinate found to deviate from the threshold range is regarded as erroneous data and removed.
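One simple interpretation of this rejection-and-averaging step is sketched below: a candidate 3D point is discarded when any of its coordinates deviates from the per-axis median by more than the preset threshold, and the survivors are averaged. The use of the median and the threshold value are assumptions for illustration, not taken from the patent.

```python
import statistics

def filter_and_average(points, threshold):
    """points: candidate 3D points from different camera pairs.
    Discards candidates whose coordinates deviate from the per-axis median
    by more than `threshold`, then averages the remaining candidates."""
    medians = [statistics.median(axis) for axis in zip(*points)]
    kept = [p for p in points
            if all(abs(c - m) <= threshold for c, m in zip(p, medians))]
    # average the surviving candidates axis by axis
    return tuple(sum(axis) / len(kept) for axis in zip(*kept))

candidates = [(1.0, 2.0, 3.0), (1.1, 2.1, 2.9),
              (0.9, 1.9, 3.1), (9.0, 9.0, 9.0)]   # last one is an outlier
print(filter_and_average(candidates, threshold=0.5))  # approx. (1.0, 2.0, 3.0)
```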
(3) Calculating the average: the average of the set of three-dimensional space point coordinates is calculated and refined by the Gauss-Newton method to obtain the three-dimensional space point coordinates of the marker point.
The average of all three-dimensional space point coordinates remaining after the erroneous data have been removed is computed, averaging each dimension separately, to obtain the coordinate C′(c1′, c2′, c3′). The Gauss-Newton method is then applied to refine this coordinate, finally yielding the three-dimensional space point coordinate C(c1, c2, c3) of the marker point:
1) From the R and T of each camera, the following values are computed for C′ and summed to give g0 and H0:
The projection of the three-dimensional space point coordinate C′ onto each camera is computed, matched to the nearest actual image coordinate, and the residual with respect to that nearest image coordinate is calculated;
The 3D coordinate q of C′ in each camera's coordinate system is computed from that camera's R and T, defining:
Figure PCTCN2020110254-appb-000003
and returning D*R;
Given a 3D point p(x, y, z) in the coordinate system of camera I and its imaging coordinates (u, v) on that camera, then
Figure PCTCN2020110254-appb-000004
The corresponding Jacobian matrix is
Figure PCTCN2020110254-appb-000005
Taking the 3D point in the world coordinate system as the variable, there is
Figure PCTCN2020110254-appb-000006
According to the Gauss-Newton algorithm, the gradient is computed:
Figure PCTCN2020110254-appb-000007
Figure PCTCN2020110254-appb-000008
2) Compute:
Figure PCTCN2020110254-appb-000009
3) The refined three-dimensional space point coordinate C(c1, c2, c3) is finally obtained.
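Since the concrete expressions above survive only as equation images in the source, the following is a generic Gauss-Newton sketch of the same refinement idea: iterate on C′ to minimize the reprojection residual over all cameras. The pinhole model with identity intrinsics and all names are assumptions.

```python
import numpy as np

def project(c, r, t):
    """Project world point c into a camera with pose (R, T)."""
    q = r @ c + t                      # point in camera coordinates
    return q[:2] / q[2], q

def refine_point(c0, cams, obs, iters=10):
    """cams: list of (R, T) per camera; obs: matching observed (u, v)."""
    c = np.asarray(c0, dtype=float)
    for _ in range(iters):
        h = np.zeros((3, 3))           # approximate Hessian  J^T J
        g = np.zeros(3)                # gradient             J^T r
        for (r, t), uv in zip(cams, obs):
            (u, v), q = project(c, r, t)
            res = np.array([u, v]) - uv
            x, y, z = q
            # Jacobian of (u, v) with respect to the camera-frame point q
            j_q = np.array([[1.0 / z, 0.0, -x / z**2],
                            [0.0, 1.0 / z, -y / z**2]])
            j = j_q @ r                # chain rule: dq/dC = R
            h += j.T @ j
            g += j.T @ res
        c -= np.linalg.solve(h, g)     # Gauss-Newton update
    return c
```

With noise-free observations from two or more cameras, the iteration converges to the true point from a nearby initial guess such as the averaged coordinate C′.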
Step S30203, calculating the rigid-body coordinates: all three-dimensional space point codes and coordinates of the same frame are converted into rigid-body coordinates in the rigid-body coordinate system, giving the rigid-body coordinates of each marker point in each frame.
Step S2 yields the three-dimensional space point data corresponding to each marker point, and the multiple items of three-dimensional space point data corresponding to the multiple marker points form one rigid body; if the rigid body currently in use has eight light-emitting LEDs, it contains eight items of three-dimensional space point data. The three-dimensional space point coordinates in the multiple items of data, e.g., the eight items, can be converted into rigid-body coordinates in the rigid-body coordinate system.
This step further includes:
(1) Calculating the average: the coordinate average of the three-dimensional space point coordinates corresponding to the multiple marker points of the same frame is computed, and this coordinate average is taken as the origin of the rigid-body coordinate system.
When determining the rigid-body coordinates, the origin of the rigid-body coordinate system is determined first. In this step, the average of each dimension of the three-dimensional space point coordinates corresponding to all marker points in the same frame is computed to obtain the coordinate average, which is recorded as the origin of the rigid-body coordinate system and serves as the reference for the three-dimensional space point coordinates of all marker points.
For example, when the rigid body contains eight marker points, step S2 yields eight items of three-dimensional space point coordinate data; the average of each dimension of these eight items is computed to obtain the coordinate average.
(2) Calculating the differences: the difference between the origin and the three-dimensional space point coordinate corresponding to each marker point of the same frame is computed, giving the rigid-body coordinates of each marker point in each frame.
With the coordinate average as the origin of the rigid-body coordinate system, the difference between each three-dimensional space point coordinate and the origin is computed; the resulting difference is the rigid-body coordinate of that marker point.
For example, when the rigid body contains eight marker points, the difference between the three-dimensional space point coordinates corresponding to the eight marker points and the origin is computed dimension by dimension, finally giving eight rigid-body coordinates.
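A minimal sketch of steps (1) and (2): take the per-axis mean of the marker points as the rigid-body origin, then express each marker relative to it. The function name is illustrative.

```python
def rigid_body_coords(points):
    """points: per-frame 3D coordinates of the markers.
    Returns each marker expressed relative to the centroid (the origin of
    the rigid-body coordinate system)."""
    n = len(points)
    origin = tuple(sum(axis) / n for axis in zip(*points))
    return [tuple(c - o for c, o in zip(p, origin)) for p in points]

markers = [(1.0, 1.0, 1.0), (3.0, 1.0, 1.0), (3.0, 3.0, 1.0), (1.0, 3.0, 1.0)]
print(rigid_body_coords(markers)[0])  # (-1.0, -1.0, 0.0)
```

By construction, the rigid-body coordinates sum to zero on every axis, which makes them independent of where the rigid body happens to be in the capture volume.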
In this embodiment, multiple two-dimensional space point coordinates are captured by multiple cameras; a set of three-dimensional space point data is solved by the specific algorithm above; and after integration, averaging, and refinement of the multiple items of three-dimensional space point data, comparatively accurate three-dimensional space point data are finally obtained and converted into rigid-body coordinate data in the rigid-body coordinate system, providing definite and precise data for the subsequent calculation of the rigid-body average distance.
Step S303, optimization: the target translation matrix is optimized through the optimization formula to obtain the optimized target translation matrix, and the rigid-body pose is determined from the target rotation matrix and the optimized target translation matrix. The optimization formula is:
T′ = (L2 / L1) × T
where L1 is the three-dimensional average distance, L2 is the rigid-body average distance, T is the target translation matrix before optimization, and T′ is the optimized target translation matrix.
Under a monocular camera, after the target rotation matrix R and the target translation matrix T of the rigid body have been estimated, many different translation amounts remain possible for the same rotation angle of the rigid body, so it cannot be fully guaranteed that the translation matrix T is accurate and true. To obtain more reliable rigid-body pose information and thereby determine the rigid-body motion, after the three-dimensional space point coordinates of the rigid body have been estimated by the triangulation principle, the target translation matrix is optimized from the estimated three-dimensional space point coordinates and the rigid-body coordinates in the rigid-body coordinate system. The optimized target translation matrix obtained through the above optimization formula makes the final rigid-body pose more accurate and more faithful.
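Under the assumption that the optimization formula rescales the translation by the ratio L2/L1 (the physically measured rigid-body scale over the triangulated scale), which is the natural way to resolve the monocular scale ambiguity described above, the correction reduces to a one-liner:

```python
# Assumed form of the scale correction: T' = (L2 / L1) * T. The ratio of
# the known rigid-body average distance L2 to the triangulated average
# distance L1 fixes the unknown monocular scale.

def rescale_translation(t, l1, l2):
    """t: target translation vector; l1: 3D average distance from
    triangulation; l2: average distance measured on the physical rigid body."""
    s = l2 / l1
    return [s * x for x in t]

print(rescale_translation([0.2, 0.1, 1.0], l1=0.05, l2=0.1))  # [0.4, 0.2, 2.0]
```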
In the pose positioning method for an active rigid body in a single-camera environment of this embodiment, the active optical rigid body carries coded information, so motion-capture tracking and positioning no longer depend on the rigid-body structure; matchable two-dimensional spatial feature pairs can be obtained directly from the coded information to solve the rigid-body pose. In a single-camera environment, the invention can track and position a rigid body at a lower cost, which is a clear advantage over a complex multi-camera environment. Moreover, because the two adjacent frames are matched according to the coded information of the active optical rigid body, each tracking of the active optical rigid body can compute the motion pose of the current frame relative to the initial frame, which avoids the cumulative-error problem common in monocular camera tracking and further improves tracking accuracy.
In one embodiment, an apparatus for pose positioning of an active rigid body in a single-camera environment is provided. As shown in Figure 5, the apparatus includes:
an essential-matrix calculation module, configured to acquire the two-dimensional space point coordinates of two adjacent frames captured by a monocular camera, the two-dimensional space point codes corresponding to those coordinates, and the camera parameters of the camera; to match the two-dimensional space point coordinates of the two adjacent frames according to the codes to obtain multiple sets of two-dimensional spatial feature pairs; and to construct a system of linear equations from the feature pairs and the camera parameters and solve for the essential matrix;
a rotation- and translation-matrix calculation module, configured to decompose the essential matrix by a singular value decomposition algorithm to obtain multiple sets of rotation matrices and translation matrices;
a rigid-body pose determination module, configured to estimate the three-dimensional space point coordinates from the two-dimensional spatial feature pairs and the multiple sets of rotation and translation matrices, to check the depth values of the three-dimensional space point coordinates, to define the set of rotation and translation matrices whose depth value is positive as the target rotation matrix and the target translation matrix, and to determine the rigid-body pose from the target rotation matrix and the target translation matrix.
Since the description is based on the same embodiment as the above pose positioning method for an active rigid body in a single-camera environment, the embodiment of the pose positioning apparatus in a single-camera environment is not described at length here.
In one embodiment, a device for pose positioning of an active rigid body in a single-camera environment is provided. The device includes a memory, a processor, and a pose positioning program for an active rigid body in a single-camera environment that is stored in the memory and executable on the processor; when the program is executed by the processor, the steps of the pose positioning method for an active rigid body in a single-camera environment of the foregoing embodiments are implemented.
In one embodiment, a computer-readable storage medium stores a pose positioning program for an active rigid body in a single-camera environment; when the program is executed by a processor, the steps of the pose positioning method for an active rigid body in a single-camera environment of the foregoing embodiments are implemented. The storage medium may be a non-volatile storage medium.
本领域普通技术人员可以理解上述实施例的各种方法中的全部或部分步骤是可以通过程序来指令相关的硬件来完成，该程序可以存储于一计算机可读存储介质中，存储介质可以包括：只读存储器(ROM,Read Only Memory)、随机存取存储器(RAM,Random Access Memory)、磁盘或光盘等。Those of ordinary skill in the art will understand that all or part of the steps of the methods in the above embodiments may be implemented by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, and the storage medium may include a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, and the like.
以上所述实施例的各技术特征可以进行任意的组合，为使描述简洁，未对上述实施例中的各个技术特征所有可能的组合都进行描述，然而，只要这些技术特征的组合不存在矛盾，都应当认为是本说明书记载的范围。The technical features of the above embodiments may be combined arbitrarily. For conciseness, not all possible combinations of these technical features have been described; however, as long as a combination of these technical features involves no contradiction, it should be considered within the scope of this specification.
以上所述实施例仅表达了本发明一些示例性实施例，其描述较为具体和详细，但并不能因此而理解为对本发明专利范围的限制。应当指出的是，对于本领域的普通技术人员来说，在不脱离本发明构思的前提下，还可以做出若干变形和改进，这些都属于本发明的保护范围。因此，本发明专利的保护范围应以所附权利要求为准。The above embodiments express only some exemplary embodiments of the present invention, and their descriptions are relatively specific and detailed, but they should not therefore be construed as limiting the patent scope of the present invention. It should be noted that those of ordinary skill in the art may make several modifications and improvements without departing from the concept of the present invention, all of which fall within the protection scope of the present invention. Therefore, the protection scope of the present invention patent shall be subject to the appended claims.

Claims (10)

  1. 一种单相机环境中主动式刚体的位姿定位方法,其特征在于,所述方法包括以下步骤:A method for positioning an active rigid body in a single-camera environment, characterized in that the method includes the following steps:
    获取单目相机捕捉的相邻两帧的二维空间点坐标、所述二维空间点坐标对应的二维空间点编码和所述相机的相机参数，根据所述二维空间点编码，将相邻两帧的所述二维空间点坐标进行匹配，得到多组二维空间特征对，将多组所述二维空间特征对和所述相机参数构造线性方程组，求解出本质矩阵；Obtain the two-dimensional space point coordinates of two adjacent frames captured by a monocular camera, the two-dimensional space point codes corresponding to the two-dimensional space point coordinates, and the camera parameters of the camera; according to the two-dimensional space point codes, match the two-dimensional space point coordinates of the two adjacent frames to obtain multiple sets of two-dimensional space feature pairs; construct a system of linear equations from the multiple sets of two-dimensional space feature pairs and the camera parameters, and solve for the essential matrix;
    通过奇异值分解算法分解所述本质矩阵,得到多组旋转矩阵和平移矩阵;Decompose the essential matrix through a singular value decomposition algorithm to obtain multiple sets of rotation matrices and translation matrices;
    通过所述二维空间特征对、多组所述旋转矩阵和所述平移矩阵，估算出三维空间点坐标，检测三维空间点坐标的深度值，将深度值为正数的那组所述旋转矩阵和平移矩阵定义为目标旋转矩阵和目标平移矩阵，根据所述目标旋转矩阵和所述目标平移矩阵确定刚体位姿。Estimate the three-dimensional space point coordinates from the two-dimensional space feature pairs and the multiple sets of rotation and translation matrices, detect the depth values of the three-dimensional space point coordinates, define the set of rotation and translation matrices whose depth values are positive as the target rotation matrix and the target translation matrix, and determine the rigid body pose according to the target rotation matrix and the target translation matrix.
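The chain of steps in claim 1 — decompose the essential matrix by SVD into candidate rotation/translation pairs, triangulate, and keep the candidate with positive depths — can be sketched in NumPy as below. This is an illustrative reconstruction, not the patented implementation: the four-candidate decomposition E = U·diag(1,1,0)·Vᵀ with R ∈ {UWVᵀ, UWᵀVᵀ}, t = ±u₃ is the standard textbook form assumed here, and all function names are the author's own.

```python
import numpy as np

W = np.array([[0., -1., 0.],
              [1.,  0., 0.],
              [0.,  0., 1.]])

def decompose_essential(E):
    """Decompose an essential matrix into the four candidate (R, t) pairs."""
    U, _, Vt = np.linalg.svd(E)
    if np.linalg.det(U) < 0:           # keep rotations proper (det = +1)
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    R1, R2 = U @ W @ Vt, U @ W.T @ Vt
    t = U[:, 2]                        # translation up to sign and scale
    return [(R1, t), (R1, -t), (R2, t), (R2, -t)]

def triangulate(p1, p2, R, t):
    """Linear (DLT) triangulation of one normalized-coordinate match,
    with camera 1 = [I | 0] and camera 2 = [R | t]."""
    P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = np.hstack([R, t.reshape(3, 1)])
    A = np.vstack([p1[0] * P1[2] - P1[0],
                   p1[1] * P1[2] - P1[1],
                   p2[0] * P2[2] - P2[0],
                   p2[1] * P2[2] - P2[1]])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]

def select_pose(E, matches):
    """Return the candidate whose triangulated points lie in front of both
    cameras (the positive-depth test of claim 1), or None."""
    for R, t in decompose_essential(E):
        pts = [triangulate(p1, p2, R, t) for p1, p2 in matches]
        if all(X[2] > 0 and (R @ X + t)[2] > 0 for X in pts):
            return R, t
    return None
```

In practice the essential matrix itself would come from the linear system built from the coded feature pairs, as the claim describes; here it is assumed given.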
  2. 根据权利要求1所述的单相机环境中主动式刚体的位姿定位方法，其特征在于，所述根据所述目标旋转矩阵和所述目标平移矩阵确定刚体位姿，包括：The pose positioning method for an active rigid body in a single-camera environment according to claim 1, wherein determining the rigid body pose according to the target rotation matrix and the target translation matrix comprises:
    将所述三维空间点坐标内的所有三维空间点之间的距离求和后取平均值,得到三维平均距离;Sum the distances between all three-dimensional space points in the three-dimensional space point coordinates and take an average value to obtain a three-dimensional average distance;
    获取刚体坐标,将所述刚体坐标内的所有刚体标记点之间的距离求和后取平均值,得到刚体平均距离;Obtain rigid body coordinates, sum the distances between all rigid body mark points in the rigid body coordinates and take an average value to obtain the average rigid body distance;
    通过优化公式将所述目标平移矩阵进行优化，得到优化后的目标平移矩阵，根据所述目标旋转矩阵和优化后的所述目标平移矩阵确定刚体位姿；Optimize the target translation matrix by an optimization formula to obtain the optimized target translation matrix, and determine the rigid body pose according to the target rotation matrix and the optimized target translation matrix;
    所述优化公式为:The optimization formula is:
    Figure PCTCN2020110254-appb-100001
    其中,L1为所述三维平均距离,L2为所述刚体平均距离,T为优化前的所述目标平移矩阵,T′为优化后的所述目标平移矩阵。Wherein, L1 is the three-dimensional average distance, L2 is the average distance of the rigid body, T is the target translation matrix before optimization, and T′ is the target translation matrix after optimization.
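The formula itself is embedded as an image (Figure PCTCN2020110254-appb-100001) and is not reproduced in this text. From the variable definitions, it is presumably the scale correction T′ = T · (L2 / L1): monocular reconstruction recovers translation only up to an unknown scale, and the known rigid-body size fixes that scale. A NumPy sketch under that assumption (all names are the author's own):

```python
import numpy as np
from itertools import combinations

def mean_pairwise_distance(points):
    """Average distance over all point pairs: L1 when fed the triangulated
    3D points, L2 when fed the known rigid-body marker coordinates."""
    dists = [np.linalg.norm(a - b) for a, b in combinations(points, 2)]
    return float(np.mean(dists))

def rescale_translation(T, triangulated_pts, rigid_pts):
    """Presumed optimization of claim 2: scale the target translation
    matrix by L2 / L1 so the reconstruction matches the rigid-body size."""
    L1 = mean_pairwise_distance(triangulated_pts)
    L2 = mean_pairwise_distance(rigid_pts)
    return T * (L2 / L1)
```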
  3. 根据权利要求2所述的单相机环境中主动式刚体的位姿定位方法，其特征在于，所述获取刚体坐标，将所述刚体坐标内的所有刚体标记点之间的距离求和后取平均值，得到刚体平均距离前，包括：The pose positioning method for an active rigid body in a single-camera environment according to claim 2, wherein before acquiring the rigid body coordinates, summing the distances between all rigid body marker points in the rigid body coordinates, and averaging to obtain the rigid body average distance, the method comprises:
    获取多个相机捕捉的相邻两帧的二维空间点坐标、所述二维空间点坐标对应的二维空间点编码和多个所述相机的空间位置数据，将所述二维空间点编码相同的多个所述二维空间点坐标分为同类，且标记于同一个标记点下；Obtain the two-dimensional space point coordinates of two adjacent frames captured by multiple cameras, the two-dimensional space point codes corresponding to the two-dimensional space point coordinates, and the spatial position data of the multiple cameras; classify the two-dimensional space point coordinates sharing the same two-dimensional space point code into one class, and assign them to the same marker point;
    将多个所述相机两两进行匹配，根据两个所述相机的空间位置数据及同类同帧的多个所述二维空间点坐标，得到每个所述标记点每帧的三维空间点坐标；Match the multiple cameras in pairs, and obtain the three-dimensional space point coordinates of each marker point in each frame according to the spatial position data of the two matched cameras and the multiple two-dimensional space point coordinates of the same class in the same frame;
    将同帧的所有三维空间点坐标，转化为刚体坐标系下的刚体坐标，得到每个所述标记点每帧的刚体坐标。Convert all three-dimensional space point coordinates of the same frame into rigid body coordinates in the rigid body coordinate system, obtaining the rigid body coordinates of each marker point in each frame.
  4. 根据权利要求3所述的单相机环境中主动式刚体的位姿定位方法，其特征在于，所述将多个所述相机两两进行匹配，根据两个所述相机的空间位置数据及同类同帧的多个所述二维空间点坐标，得到每个所述标记点每帧的三维空间点坐标，包括：The pose positioning method for an active rigid body in a single-camera environment according to claim 3, wherein matching the multiple cameras in pairs and obtaining the three-dimensional space point coordinates of each marker point in each frame according to the spatial position data of the two matched cameras and the multiple two-dimensional space point coordinates of the same class in the same frame comprises:
    将捕捉到的同一个标记点的所有相机进行两两匹配，对匹配的两个相机在同帧中捕捉到的两个所述二维空间点坐标，通过奇异值分解求解最小二乘法方法，解算得到一组三维空间点坐标；Match in pairs all cameras that captured the same marker point; for the two two-dimensional space point coordinates captured in the same frame by a matched pair of cameras, solve a least-squares problem by singular value decomposition to obtain a set of three-dimensional space point coordinates;
    判断所述三维空间点坐标是否处于预设的阈值范围内，若超过所述阈值范围，则剔除所述三维空间点坐标，得到剔除后的一组所述三维空间点坐标；Determine whether the three-dimensional space point coordinates are within a preset threshold range; if they exceed the threshold range, eliminate those three-dimensional space point coordinates to obtain the set of three-dimensional space point coordinates remaining after elimination;
    计算一组所述三维空间点坐标的平均值，通过高斯牛顿法优化，得到所述标记点的三维空间点坐标。Calculate the average of the set of three-dimensional space point coordinates and refine it with the Gauss-Newton method to obtain the three-dimensional space point coordinates of the marker point.
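The steps of claim 4 after the per-pair triangulation — reject candidates outside the preset threshold range, then average the survivors — might look like the sketch below. Interpreting the threshold as a per-axis coordinate range is an assumption, and the final Gauss-Newton reprojection refinement is deliberately left out; the function name is the author's own.

```python
import numpy as np

def fuse_candidates(candidates, lo=-10.0, hi=10.0):
    """Fuse one marker's per-camera-pair triangulations: drop any candidate
    with a coordinate outside [lo, hi] (assumed threshold semantics), then
    average what remains as the initial estimate for Gauss-Newton refinement."""
    kept = [np.asarray(c) for c in candidates
            if np.all((np.asarray(c) >= lo) & (np.asarray(c) <= hi))]
    if not kept:
        return None                    # every candidate was an outlier
    return np.mean(kept, axis=0)
```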
  5. 根据权利要求3所述的单相机环境中主动式刚体的位姿定位方法，其特征在于，所述将同帧的所有三维空间点坐标，转化为刚体坐标系下的刚体坐标，得到每个所述标记点每帧的刚体坐标，包括：The pose positioning method for an active rigid body in a single-camera environment according to claim 3, wherein converting all three-dimensional space point coordinates of the same frame into rigid body coordinates in the rigid body coordinate system to obtain the rigid body coordinates of each marker point in each frame comprises:
    计算同帧的多个所述标记点对应的所述三维空间点坐标的坐标平均值，将所述坐标平均值记为刚体坐标系下的原点；Calculate the coordinate average of the three-dimensional space point coordinates corresponding to the multiple marker points of the same frame, and record this coordinate average as the origin of the rigid body coordinate system;
    分别计算原点与同帧的每个所述标记点对应的所述三维空间点坐标之间的差值，得到每个所述标记点每帧的刚体坐标。Calculate the difference between the origin and the three-dimensional space point coordinates corresponding to each marker point of the same frame, obtaining the rigid body coordinates of each marker point in each frame.
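Claim 5's conversion can be written directly: the per-frame centroid of the marker coordinates becomes the rigid-body origin, and each marker's rigid coordinate is its offset from that origin. A minimal NumPy sketch — the sign convention (point minus origin) is an assumption, since the claim only speaks of "the difference between the origin and the point":

```python
import numpy as np

def to_rigid_coordinates(frame_points):
    """Convert one frame's 3D marker coordinates to rigid-body coordinates:
    origin = centroid of the markers, rigid coordinate = offset from it."""
    pts = np.asarray(frame_points, dtype=float)
    origin = pts.mean(axis=0)          # coordinate average of the markers
    return pts - origin
```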
  6. 根据权利要求1所述的单相机环境中主动式刚体的位姿定位方法，其特征在于，所述通过所述二维空间特征对、多组所述旋转矩阵和所述平移矩阵，估算出三维空间点坐标，包括：The pose positioning method for an active rigid body in a single-camera environment according to claim 1, wherein estimating the three-dimensional space point coordinates from the two-dimensional space feature pairs and the multiple sets of rotation and translation matrices comprises:
    设两个相机分别为相机1和相机2，在同帧中捕捉到的两个二维空间点坐标分别为A(a1,a2),B(b1,b2)，所述相机1的旋转矩阵为R1(R11,R12,R13)，平移矩阵为T1(T11,T12,T13)，所述相机2的旋转矩阵为R2(R21,R22,R23)，平移矩阵为T2(T21,T22,T23)，其中，所述R1、R2为3*3的矩阵，所述T1、T2为3*1的矩阵，通过下述方法得到三维空间点坐标：Suppose the two cameras are camera 1 and camera 2, and the two two-dimensional space point coordinates captured in the same frame are A(a1, a2) and B(b1, b2); the rotation matrix of camera 1 is R1(R11, R12, R13) and its translation matrix is T1(T11, T12, T13); the rotation matrix of camera 2 is R2(R21, R22, R23) and its translation matrix is T2(T21, T22, T23), wherein R1 and R2 are 3*3 matrices and T1 and T2 are 3*1 matrices. The three-dimensional space point coordinates are obtained as follows:
    根据所述两个相机的内参和畸变参数，将像素坐标A(a1,a2),B(b1,b2)转化为相机坐标A′(a1′,a2′),B′(b1′,b2′)；According to the intrinsic parameters and distortion parameters of the two cameras, convert the pixel coordinates A(a1, a2), B(b1, b2) into the camera coordinates A′(a1′, a2′), B′(b1′, b2′);
    构造最小二乘法矩阵X和Y,其中X为4*3的矩阵,Y为4*1的矩阵,X矩阵第一行为a1′*R13-R11,X矩阵第二行为a2′*R13-R12,X矩阵第三行为b1′*R23-R21,X矩阵第四行为b2′*R23-R22;Y矩阵第一行为T11-a1′*T13,Y矩阵第二行为T12-a2′*T13,Y矩阵第三行为T21-b1′*T23,Y矩阵第四行为T22-b2′*T23;Construct the least squares matrix X and Y, where X is a 4*3 matrix, Y is a 4*1 matrix, the first row of X matrix is a1′*R13-R11, the second row of X matrix is a2′*R13-R12, The third row of X matrix is b1′*R23-R21, the fourth row of X matrix is b2′*R23-R22; the first row of Y matrix is T11-a1′*T13, the second row of Y matrix is T12-a2′*T13, Y matrix The third row is T21-b1′*T23, and the fourth row of Y matrix is T22-b2′*T23;
    根据等式X*C=Y和所述矩阵X、矩阵Y,利用奇异值分解求得一个三维空间点坐标C(c1,c2,c3);According to the equation X*C=Y and the matrix X and matrix Y, a three-dimensional space point coordinate C (c1, c2, c3) is obtained by using singular value decomposition;
    根据多个不同的旋转矩阵和平移矩阵,得到多个不同的三维空间点坐标。According to multiple different rotation matrices and translation matrices, multiple different three-dimensional space point coordinates are obtained.
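Claim 6's construction is concrete enough to transcribe directly: build the 4*3 matrix X and the 4*1 vector Y row by row as specified and solve X*C = Y in the least-squares sense (`numpy.linalg.lstsq` performs the SVD-based solve the claim calls for). The function name and argument layout below are the author's own:

```python
import numpy as np

def triangulate_two_view(a, b, R1, T1, R2, T2):
    """Solve X*C = Y for the 3D point C, with rows built exactly as in the
    claim: a = A'(a1', a2'), b = B'(b1', b2') are camera coordinates,
    R1, R2 are 3*3 rotations, T1, T2 are length-3 translations."""
    a1, a2 = a
    b1, b2 = b
    X = np.vstack([a1 * R1[2] - R1[0],      # row 1: a1'*R13 - R11
                   a2 * R1[2] - R1[1],      # row 2: a2'*R13 - R12
                   b1 * R2[2] - R2[0],      # row 3: b1'*R23 - R21
                   b2 * R2[2] - R2[1]])     # row 4: b2'*R23 - R22
    Y = np.array([T1[0] - a1 * T1[2],       # row 1: T11 - a1'*T13
                  T1[1] - a2 * T1[2],       # row 2: T12 - a2'*T13
                  T2[0] - b1 * T2[2],       # row 3: T21 - b1'*T23
                  T2[1] - b2 * T2[2]])      # row 4: T22 - b2'*T23
    C, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return C
```

Running this once per candidate (R, T) pair yields the multiple candidate 3D points mentioned in the final step of the claim.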
  7. 根据权利要求1所述的单相机环境中主动式刚体的位姿定位方法，其特征在于，所述检测三维空间点坐标的深度值，将深度值为正数的那组所述旋转矩阵和平移矩阵定义为目标旋转矩阵和目标平移矩阵，包括：The pose positioning method for an active rigid body in a single-camera environment according to claim 1, wherein detecting the depth values of the three-dimensional space point coordinates and defining the set of rotation and translation matrices whose depth values are positive as the target rotation matrix and the target translation matrix comprises:
    根据估算出的所述三维空间点坐标，检测所述三维空间点坐标对应的深度值是否为正数，若是，则将对应的那组所述旋转矩阵和平移矩阵定义为目标旋转矩阵和目标平移矩阵。According to the estimated three-dimensional space point coordinates, detect whether the depth value corresponding to the three-dimensional space point coordinates is positive; if so, define the corresponding set of rotation and translation matrices as the target rotation matrix and the target translation matrix.
  8. 一种单相机环境中主动式刚体的位姿定位装置,其特征在于,所述装置包括:An active rigid body pose positioning device in a single-camera environment, characterized in that the device includes:
    计算本质矩阵模块，用于获取单目相机捕捉的相邻两帧的二维空间点坐标、所述二维空间点坐标对应的二维空间点编码和所述相机的相机参数，根据所述二维空间点编码，将相邻两帧的所述二维空间点坐标进行匹配，得到多组二维空间特征对，将多组所述二维空间特征对和所述相机参数构造线性方程组，求解出本质矩阵；An essential matrix calculation module, configured to obtain the two-dimensional space point coordinates of two adjacent frames captured by a monocular camera, the two-dimensional space point codes corresponding to the two-dimensional space point coordinates, and the camera parameters of the camera; match, according to the two-dimensional space point codes, the two-dimensional space point coordinates of the two adjacent frames to obtain multiple sets of two-dimensional space feature pairs; and construct a system of linear equations from the multiple sets of two-dimensional space feature pairs and the camera parameters to solve for the essential matrix;
    计算旋转矩阵和平移矩阵模块，用于通过奇异值分解算法分解所述本质矩阵，得到多组旋转矩阵和平移矩阵；A rotation and translation matrix calculation module, configured to decompose the essential matrix by a singular value decomposition algorithm to obtain multiple sets of rotation matrices and translation matrices;
    确定刚体位姿模块，用于通过所述二维空间特征对、多组所述旋转矩阵和所述平移矩阵，估算出三维空间点坐标，检测三维空间点坐标的深度值，将深度值为正数的那组所述旋转矩阵和平移矩阵定义为目标旋转矩阵和目标平移矩阵，根据所述目标旋转矩阵和所述目标平移矩阵确定刚体位姿。A rigid-body-pose determination module, configured to estimate the three-dimensional space point coordinates from the two-dimensional space feature pairs and the multiple sets of rotation and translation matrices, detect the depth values of the three-dimensional space point coordinates, define the set of rotation and translation matrices whose depth values are positive as the target rotation matrix and the target translation matrix, and determine the rigid body pose according to the target rotation matrix and the target translation matrix.
  9. 一种单相机环境中主动式刚体的位姿定位设备,其特征在于,所述设备包括:An active rigid body pose positioning device in a single-camera environment, characterized in that the device includes:
    存储器、处理器以及存储在所述存储器上并可在所述处理器上运行的单相机环境中主动式刚体的位姿定位程序，所述单相机环境中主动式刚体的位姿定位程序被所述处理器执行时实现如权利要求1至7中任一项所述的单相机环境中主动式刚体的位姿定位方法的步骤。A memory, a processor, and a pose positioning program for an active rigid body in a single-camera environment that is stored in the memory and executable on the processor, wherein the program, when executed by the processor, implements the steps of the pose positioning method for an active rigid body in a single-camera environment according to any one of claims 1 to 7.
  10. 一种计算机可读存储介质，其特征在于，所述计算机可读存储介质上存储有单相机环境中主动式刚体的位姿定位程序，所述单相机环境中主动式刚体的位姿定位程序被处理器执行时实现如权利要求1至7中任一项所述的单相机环境中主动式刚体的位姿定位方法的步骤。A computer-readable storage medium, characterized in that the computer-readable storage medium stores a pose positioning program for an active rigid body in a single-camera environment, and the program, when executed by a processor, implements the steps of the pose positioning method for an active rigid body in a single-camera environment according to any one of claims 1 to 7.
PCT/CN2020/110254 2019-09-30 2020-08-20 Method for determining pose of active rigid body in single-camera environment, and related apparatus WO2021063128A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910938118.1A CN110689577B (en) 2019-09-30 2019-09-30 Active rigid body pose positioning method in single-camera environment and related equipment
CN201910938118.1 2019-09-30

Publications (1)

Publication Number Publication Date
WO2021063128A1 true WO2021063128A1 (en) 2021-04-08

Family

ID=69111063

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/110254 WO2021063128A1 (en) 2019-09-30 2020-08-20 Method for determining pose of active rigid body in single-camera environment, and related apparatus

Country Status (2)

Country Link
CN (2) CN114170307A (en)
WO (1) WO2021063128A1 (en)


Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114170307A (en) * 2019-09-30 2022-03-11 深圳市瑞立视多媒体科技有限公司 Active rigid body pose positioning method in single-camera environment and related equipment
CN113744347B (en) * 2020-04-02 2023-06-16 深圳市瑞立视多媒体科技有限公司 Method, device, equipment and storage medium for calibrating sweeping field and simultaneously calibrating field in large space environment
CN113392909B (en) * 2021-06-17 2022-12-27 深圳市睿联技术股份有限公司 Data processing method, data processing device, terminal and readable storage medium
CN113473210A (en) * 2021-07-15 2021-10-01 北京京东方光电科技有限公司 Display method, apparatus and storage medium
CN118298113A (en) * 2024-06-05 2024-07-05 知行汽车科技(苏州)股份有限公司 Three-dimensional reconstruction method, device, equipment and medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080174594A1 (en) * 2007-01-22 2008-07-24 Sharp Laboratories Of America, Inc. Method for supporting intuitive view specification in the free-viewpoint television application
CN103759670A (en) * 2014-01-06 2014-04-30 四川虹微技术有限公司 Object three-dimensional information acquisition method based on digital close range photography
CN107341814A (en) * 2017-06-14 2017-11-10 宁波大学 The four rotor wing unmanned aerial vehicle monocular vision ranging methods based on sparse direct method
CN108648270A (en) * 2018-05-12 2018-10-12 西北工业大学 Unmanned plane real-time three-dimensional scene reconstruction method based on EG-SLAM
CN110689577A (en) * 2019-09-30 2020-01-14 深圳市瑞立视多媒体科技有限公司 Active rigid body pose positioning method in single-camera environment and related equipment
CN110689584A (en) * 2019-09-30 2020-01-14 深圳市瑞立视多媒体科技有限公司 Active rigid body pose positioning method in multi-camera environment and related equipment

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102564350A (en) * 2012-02-10 2012-07-11 华中科技大学 Plane structured light and light pen-based precise three-dimensional measurement method for complex part
CN102768767B (en) * 2012-08-06 2014-10-22 中国科学院自动化研究所 Online three-dimensional reconstructing and locating method for rigid body
CN103759716B (en) * 2014-01-14 2016-08-17 清华大学 The dynamic target position of mechanically-based arm end monocular vision and attitude measurement method
CN108151713A (en) * 2017-12-13 2018-06-12 南京航空航天大学 A kind of quick position and orientation estimation methods of monocular VO
CN109141396B (en) * 2018-07-16 2022-04-26 南京航空航天大学 Unmanned aerial vehicle pose estimation method with fusion of auxiliary information and random sampling consistency algorithm

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113610979A (en) * 2021-07-12 2021-11-05 深圳市瑞立视多媒体科技有限公司 Method and equipment for early warning similarity between rigid bodies and optical motion capture system
CN113610979B (en) * 2021-07-12 2023-12-01 深圳市瑞立视多媒体科技有限公司 Method and equipment for early warning similarity between rigid bodies and optical motion capturing system
CN113850873A (en) * 2021-09-24 2021-12-28 成都圭目机器人有限公司 Offset position calibration method of linear array camera under carrying platform positioning coordinate system
CN113850873B (en) * 2021-09-24 2024-06-07 成都圭目机器人有限公司 Offset position calibration method of linear array camera under carrying platform positioning coordinate system
CN115100287A (en) * 2022-04-14 2022-09-23 美的集团(上海)有限公司 External reference calibration method and robot
CN114742904A (en) * 2022-05-23 2022-07-12 轻威科技(绍兴)有限公司 Calibration method and device of commercial stereo camera set after interference points are eliminated
CN114742904B (en) * 2022-05-23 2024-07-02 轻威科技(绍兴)有限公司 Calibration method and device for commercial three-dimensional computer unit with interference points removed
CN117523678A (en) * 2024-01-04 2024-02-06 广东茉莉数字科技集团股份有限公司 Virtual anchor distinguishing method and system based on optical action data
CN117523678B (en) * 2024-01-04 2024-04-05 广东茉莉数字科技集团股份有限公司 Virtual anchor distinguishing method and system based on optical action data

Also Published As

Publication number Publication date
CN110689577B (en) 2022-04-01
CN110689577A (en) 2020-01-14
CN114170307A (en) 2022-03-11

Similar Documents

Publication Publication Date Title
WO2021063128A1 (en) Method for determining pose of active rigid body in single-camera environment, and related apparatus
WO2021063127A1 (en) Pose positioning method and related equipment of active rigid body in multi-camera environment
CN106780601B (en) Spatial position tracking method and device and intelligent equipment
JP6855587B2 (en) Devices and methods for acquiring distance information from a viewpoint
TWI624170B (en) Image scanning system and method thereof
CN106875435B (en) Method and system for obtaining depth image
CN112150528A (en) Depth image acquisition method, terminal and computer readable storage medium
CN107808398B (en) Camera parameter calculation device, calculation method, program, and recording medium
TWI393980B (en) The method of calculating the depth of field and its method and the method of calculating the blurred state of the image
JP2004340840A (en) Distance measuring device, distance measuring method and distance measuring program
US11030478B1 (en) System and method for correspondence map determination
CN109640066B (en) Method and device for generating high-precision dense depth image
CN112200056B (en) Face living body detection method and device, electronic equipment and storage medium
JP2017117386A (en) Self-motion estimation system, control method and program of self-motion estimation system
JP7489253B2 (en) Depth map generating device and program thereof, and depth map generating system
US10049454B2 (en) Active triangulation calibration
JP6288770B2 (en) Face detection method, face detection system, and face detection program
CN111160233B (en) Human face in-vivo detection method, medium and system based on three-dimensional imaging assistance
CN105427302B (en) A kind of three-dimensional acquisition and reconstructing system based on the sparse camera collection array of movement
US11166005B2 (en) Three-dimensional information acquisition system using pitching practice, and method for calculating camera parameters
US11195290B2 (en) Apparatus and method for encoding in structured depth camera system
CN110232715B (en) Method, device and system for self calibration of multi-depth camera
KR101866107B1 (en) Coding Device, Device and Method and Depth Information Compensation by Plane Modeling
WO2019116518A1 (en) Object detection device and object detection method
CN115375772B (en) Camera calibration method, device, equipment and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20871576

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20871576

Country of ref document: EP

Kind code of ref document: A1