CN115880344A - Binocular stereo matching data set parallax truth value acquisition method - Google Patents


Info

Publication number
CN115880344A
CN115880344A (application CN202211448064.9A)
Authority
CN
China
Prior art keywords
camera
binocular
depth
data set
parallax
Prior art date
Legal status
Pending
Application number
CN202211448064.9A
Other languages
Chinese (zh)
Inventor
应义斌
王清玉
周鸣川
刘炜
娄明照
蒋焕煜
Current Assignee
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date
Filing date
Publication date
Application filed by Zhejiang University (ZJU)
Priority to CN202211448064.9A
Publication of CN115880344A
Legal status: Pending

Landscapes

  • Length Measuring Devices By Optical Means (AREA)

Abstract

The invention discloses a method for acquiring ground-truth disparity for binocular stereo matching data sets. The left and right cameras of a binocular camera acquire the left and right views of a scene, and a structured light depth camera acquires the scene's depth map; a joint camera calibration yields the intrinsic and extrinsic parameters of each camera, from which the relative extrinsics between the depth camera and the binocular left camera are computed. Using the intrinsics, extrinsics, and depth values, the pixel coordinates of every pixel of the depth map in the left-view pixel coordinate system are computed, registering the depth map to the left view. The depth values of the depth map are then converted into disparity values using the intrinsics obtained by binocular calibration of the binocular camera, generating the disparity map that serves as the ground truth of the data set. The invention provides a method for acquiring ground truth for the binocular stereo matching task and constructing data sets; the resulting data sets can be used for transfer learning and fine-tuning of deep learning models, ultimately enabling binocular three-dimensional reconstruction of specific scenes.

Description

Binocular stereo matching data set parallax truth value acquisition method
Technical Field
The invention relates to a method for acquiring disparity parameters for binocular data sets, and in particular to a method for constructing binocular stereo matching data sets for plant binocular three-dimensional reconstruction and phenotype measurement in the field of agricultural engineering.
Background
With the development of artificial intelligence and robot vision technology in recent years, deep-learning-based stereo matching methods have far surpassed traditional algorithms on public binocular benchmarks. However, training a deep learning model requires a large data set as support, and acquiring the ground truth (disparity map) of a binocular stereo matching data set remains a major problem in academia.
To address this problem, some researchers acquire disparity maps with computer vision simulation software, but models trained on such virtual data sets generalize poorly; for scenes such as outdoor reconstruction and autonomous driving, others use depth values acquired by LiDAR as ground truth and register them with the binocular left camera to generate disparity maps, but the resulting disparity maps have large errors and low density; for indoor reconstruction scenes, still others build structured light measurement systems from a projector and a binocular camera and use depth values obtained by encoding and decoding as ground truth, but such systems are complex, time-consuming, and labor-intensive, and are not suited to automatically and rapidly constructing large-scale data sets.
Disclosure of Invention
In view of these problems, the invention aims to provide a fast, semi-automatic method for acquiring ground-truth disparity for binocular stereo matching data sets, so as to construct large-scale, high-quality binocular stereo matching data sets for specific applications and scenes, which can be used for transfer learning and fine-tuning of supervised deep learning models to achieve high-precision depth perception based on binocular vision.
To this end, the invention provides a method that uses a structured light depth camera and a binocular camera to acquire high-precision, high-density ground-truth disparity, enabling fast, automatic construction of large-scale binocular stereo matching data sets and supporting transfer learning and fine-tuning of deep-learning stereo matching models in a specified scene to realize binocular reconstruction.
To achieve the above purpose, the invention adopts the following technical scheme, whose implementation steps comprise camera joint calibration, image registration, and parallax calculation with parallax map generation.
A structured light depth camera and a binocular camera respectively acquire images of a specified scene; these images, together with the parameters of the binocular camera's left camera and of the structured light depth camera, are further processed to obtain the ground-truth disparity of a binocular stereo matching data set for the specified scene.
The innovation of the invention lies in addressing the difficulty of accurately obtaining ground-truth disparity for conventional binocular image data: the high-precision depth map of the structured light depth camera is registered with the image acquired by the left camera and converted into a disparity map, thereby accurately acquiring the ground-truth disparity of the binocular stereo matching data set.
Step a, build an imaging platform with a binocular camera and a structured light depth camera; the imaging platform constitutes the data set construction system. Adjust the relative pose of the two cameras until it is suitable, and keep it unchanged throughout the subsequent calibration, image registration, and data set construction.
The binocular camera adopts a ZED binocular camera;
the structured light depth camera adopts a Mech-Mind high-precision structured light depth camera.
Step b, obtain the intrinsic and extrinsic parameters of the binocular camera's left camera and of the structured light depth camera as calibration results, using a checkerboard calibration board and Zhang Zhengyou's calibration method, and then compute the relative extrinsic parameters between the left camera and the structured light depth camera, comprising a rotation matrix and a translation matrix;
step c, capture a specified scene with the left and right cameras of the binocular camera to obtain its left and right views, and capture the same scene with the structured light depth camera to obtain its depth map;
step d, using the calibration results and the relative extrinsics, traverse each pixel in the depth map and compute its pixel coordinates in the left view for registration; that is, establish the correspondence between each pixel of the depth map and the left view, realizing image registration;
step e, convert the depth values of the depth map into disparity values using the binocular camera's intrinsic calibration results and generate the disparity map; apply size normalization (e.g., post-processing operations such as cropping) to the generated disparity map and to the left and right views originally acquired by the binocular camera; take the size-normalized left and right views as the binocular views of the binocular stereo matching data set, i.e., the input signals, and the size-normalized disparity map as the ground-truth disparity of the data set, i.e., the supervision signal.
This process is repeated continuously to construct a large-scale binocular reconstruction data set.
Step d comprises the following sub-steps:
first, convert the coordinates of the current pixel in the depth map's pixel coordinate system into coordinates in the depth camera's coordinate system, using the pixel's depth value in the depth map and the depth camera's intrinsic matrix;
second, convert the coordinates of the current pixel in the depth camera's coordinate system into coordinates in the coordinate system of the binocular camera's left camera, using the relative extrinsics obtained in step b;
finally, convert the coordinates of the current pixel in the binocular left camera's coordinate system into coordinates in the left view's pixel coordinate system, using the pixel's depth value and the binocular left camera's intrinsic matrix.
In step e, the binocular camera's intrinsic calibration results comprise the focal length and the baseline distance, and the disparity value is computed by the following conversion:

$$d_i = \frac{b_{ZED} \cdot f_{ZED}}{z_i^{ZED}}$$

wherein $b_{ZED}$ denotes the baseline distance of the binocular camera, $f_{ZED}$ denotes the focal length of the binocular camera, $z_i^{ZED}$ denotes the depth value of the i-th pixel in the depth map, and $d_i$ is the converted disparity value.
The invention has the beneficial effects that:
the invention generates the parallax truth value of the binocular stereo matching data set by a semi-automatic means through the proposed camera combined calibration and image registration mode.
The method for constructing the binocular stereo matching data set obtains the parallax truth value construction data set, can greatly reduce the cost of manpower and material resources in the construction process of the binocular stereo matching data set, and can ensure that a large-scale stereo matching data set is quickly constructed in a real scene.
Compared with data sets constructed by other methods, the real disparity map of the data set constructed by the method has the advantages of high precision, high density and the like, can be used for transfer learning and fine tuning of a supervised deep learning model in specific tasks and requirements, and realizes high-precision binocular stereo matching, three-dimensional reconstruction and robot depth perception in specific scenes (such as plant three-dimensional reconstruction and phenotype measurement in the field of agricultural engineering or robot indoor grabbing operation in the field of robots).
Drawings
FIG. 1 is the general flow diagram of the method;
FIG. 2 is a schematic diagram of the image registration process of the invention;
FIG. 3 is a schematic diagram of the imaging platform setup;
FIG. 4 shows a data set (comprising left views, right views, and ground-truth disparity) constructed with this method, taking plants (seedlings of spinach, tomato, pepper, and pumpkin) as an example;
FIG. 5 shows the performance of representative stereo matching algorithms (traditional matching algorithms BM and SGM; deep learning algorithms PSMNet and GwcNet) on a test data set.
In the figures: 1, ZED binocular camera; 2, Mech-Mind high-precision structured light depth camera; 3, camera connecting piece.
Detailed Description
The invention is further illustrated with reference to the following figures and examples.
The hardware of the invention comprises a ZED binocular camera, a Mech-Mind structured light depth camera, camera connecting pieces, and an imaging platform. Because the structured light depth camera differs from the binocular camera in parameters such as resolution and field of view, its depth map cannot be used directly as ground truth for the binocular camera; the pixel correspondence between the depth map and the image acquired by the binocular left camera must therefore be computed.
On this imaging platform, the intrinsics and extrinsics of the binocular camera's left camera and of the structured light depth camera are first obtained with a checkerboard calibration board and Zhang Zhengyou's calibration method, and the relative extrinsics between the two cameras (comprising a rotation matrix and a translation matrix) are computed from these parameters. Each pixel in the pixel coordinate system of the structured light camera's depth map is then traversed, its coordinates in the pixel coordinate system of the binocular camera's left camera are solved, and the pixel's depth value is converted into the corresponding disparity value using the binocular camera's intrinsics, forming the disparity map. The disparity map, together with the subsequently cropped left and right views acquired by the binocular camera, constitutes a data set for the binocular stereo matching task.
As shown in fig. 1 and fig. 2, the method of the invention comprises three steps: camera joint calibration, image registration, and parallax calculation with parallax map generation, wherein:
1) Camera joint calibration:
First, the binocular camera and the structured light depth camera each acquire a view of a scene containing the checkerboard calibration board. Zhang Zhengyou's calibration method yields the extrinsic matrices of the structured light depth camera and of the binocular camera's left camera in the world coordinate system (comprising rotation matrices $R_{mech}$, $R_{ZED}$ and translation vectors $t_{mech}$, $t_{ZED}$), together with the respective intrinsic matrices $K_{mech}$, $K_{ZED}$. From these extrinsics, the relative rotation matrix $R_{mech \to ZED}$ and translation vector $t_{mech \to ZED}$ from the camera coordinate system of the structured light depth camera to that of the binocular camera's left camera are obtained:

$$R_{mech \to ZED} = R_{ZED} \, (R_{mech})^{-1}$$

$$t_{mech \to ZED} = t_{ZED} - R_{ZED} \, (R_{mech})^{-1} \, t_{mech}$$

wherein $R_{mech}$ and $R_{ZED}$ denote the rotation matrices of the Mech-Mind structured light depth camera and of the ZED binocular camera, respectively.
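This computation can be illustrated with a short sketch (a minimal illustration under the notation above; the function name and the use of OpenCV's Rodrigues conversion are assumptions for illustration, not part of the patent):

```python
import cv2
import numpy as np

def relative_extrinsics(rvec_mech, tvec_mech, rvec_zed, tvec_zed):
    """R_mech->ZED and t_mech->ZED from each camera's world-frame extrinsics,
    e.g. the rvecs/tvecs returned by cv2.calibrateCamera for a shared
    checkerboard view."""
    R_mech, _ = cv2.Rodrigues(rvec_mech)  # rotation vector -> 3x3 matrix
    R_zed, _ = cv2.Rodrigues(rvec_zed)
    R_rel = R_zed @ R_mech.T              # R_mech is orthonormal, so inverse == transpose
    t_rel = tvec_zed.reshape(3) - R_rel @ tvec_mech.reshape(3)
    return R_rel, t_rel
```

Both cameras must observe the checkerboard in the same world pose so that their two extrinsics share one world coordinate system.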
2) Image registration:
referring to the steps shown in fig. 2, the key point of obtaining the real disparity map is to calculate the corresponding position of each pixel point in the depth map under the binocular camera left view pixel coordinate system. For this, all the pixels in each row and each column in the depth map need to be traversed for calculation.
Here, the ith pixel in the depth map is taken as an example for description.
First, using the depth value of the ith pixel
Figure BDA0003950243040000041
With internal reference of a structured light depth camera>
Figure BDA0003950243040000042
Figure BDA0003950243040000043
Pixel coordinates of ith pixel of depth map acquired by structured light depth camera
Figure BDA0003950243040000044
Conversion into coordinates in the camera coordinate system of a structured light depth camera @>
Figure BDA0003950243040000045
/>
Figure BDA0003950243040000046
Wherein the content of the first and second substances,
Figure BDA0003950243040000047
respectively representing the coordinate values of the ith pixel on the X, Y and Z axes of the camera coordinate system of the structured light depth camera, f x,mech 、f y,mech Denotes the focal length, u, of the structured light depth camera in the X and Y directions, respectively 0,mech 、v 0,mech Respectively represents the main point pixel coordinate values of the structured light depth camera in the u and v directions>
Figure BDA0003950243040000051
And coordinate values of the ith pixel in the pixel coordinate systems u and v of the structured light depth camera are respectively shown.
Next, using the relative conversion relation R between the two cameras obtained in step 1) mech→ZED And t mech→ZED Coordinates in the camera coordinate system of the depth camera may be determined
Figure BDA0003950243040000052
Coordinate under camera coordinates converted into binocular left camera
Figure BDA0003950243040000053
Figure BDA0003950243040000054
Wherein the content of the first and second substances,
Figure BDA0003950243040000055
and coordinate values of the ith pixel on X, Y and Z axes of a camera coordinate system of a left camera of the binocular camera are respectively represented.
Finally, utilizing the internal reference of the left camera of the binocular camera obtained by calibration in the step 1)
Figure BDA0003950243040000056
Figure BDA0003950243040000057
Depth value of corresponding pixel/>
Figure BDA0003950243040000058
Calculating the pixel coordinate of the pixel point under the field of view of the binocular left camera->
Figure BDA0003950243040000059
Figure BDA00039502430400000510
Wherein f is x,ZED 、f y,ZED Denotes focal lengths, u, of the left camera of the binocular camera in X and Y directions, respectively 0,ZED 、u 0,ZED Respectively representing principal point pixel coordinate values of a left camera of the binocular camera in u and v directions,
Figure BDA00039502430400000511
and coordinate values of the ith pixel in the pixel coordinate systems u and v of the left camera of the binocular camera are respectively represented.
The coordinates of the ith pixel in the pixel coordinate system of the depth camera may then be determined
Figure BDA00039502430400000512
Conversion into coordinates in the pixel coordinate system of a binocular left camera>
Figure BDA00039502430400000513
By traversing each pixel in the depth map, the pixel coordinates of each pixel in the depth map in the binocular left view can be obtained.
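This per-pixel traversal can be vectorized; the following sketch (array names, the nearest-pixel rounding, and the masking of zero-depth readings are illustrative assumptions) back-projects every valid depth-map pixel into 3D, transforms it into the left camera's frame, and reprojects it onto the left view's pixel grid:

```python
import numpy as np

def register_depth_to_left_view(depth, K_mech, K_zed, R_rel, t_rel, left_shape):
    """Warp a structured-light depth map onto the binocular left camera's
    pixel grid, returning a left-view-aligned (sparse) depth image."""
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]
    z = depth.astype(np.float64)
    valid = z > 0  # zero depth = missing structured-light reading

    # Back-project depth-map pixels into the depth camera's 3D frame.
    x = (u - K_mech[0, 2]) * z / K_mech[0, 0]
    y = (v - K_mech[1, 2]) * z / K_mech[1, 1]
    pts = np.stack([x[valid], y[valid], z[valid]])      # 3 x N

    # Rigid transform into the binocular left camera's frame.
    pts = R_rel @ pts + t_rel.reshape(3, 1)

    # Project with the left camera's intrinsics.
    u_l = np.round(K_zed[0, 0] * pts[0] / pts[2] + K_zed[0, 2]).astype(int)
    v_l = np.round(K_zed[1, 1] * pts[1] / pts[2] + K_zed[1, 2]).astype(int)

    # Scatter into the left view's grid (no z-buffering: later points overwrite).
    out = np.zeros(left_shape)
    inside = (u_l >= 0) & (u_l < left_shape[1]) & (v_l >= 0) & (v_l < left_shape[0])
    out[v_l[inside], u_l[inside]] = pts[2][inside]      # depth in the left frame
    return out
```

Here `K_mech` and `K_zed` are the 3x3 intrinsic matrices and `R_rel`, `t_rel` the relative extrinsics from step 1); handling occlusions would require z-buffering, omitted here for brevity.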
3) Parallax calculation and parallax map generation:
before the operation of the step is carried out, the ZeD binocular camera is subjected to additional binocular calibration again by using the chessboard calibration plate and Zhang Zhen calibration method, and the internal parameters (including binocular baseline distance b) of the ZED binocular camera are obtained ZED Focal length f of camera ZED )。
Followed by internal reference to binocular camera according to binocular vision (bag)Including baseline distance and focal length), the depth value of the corresponding pixel point can be determined
Figure BDA00039502430400000514
Converted into a disparity value d i
Figure BDA0003950243040000061
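The conversion is a masked per-pixel division; a minimal sketch under the same assumptions as above (treating zero depth as an invalid measurement is an assumption, not specified by the patent):

```python
import numpy as np

def depth_to_disparity(depth_left, baseline_zed, focal_zed):
    """d_i = b_ZED * f_ZED / z_i^ZED, applied per pixel of the registered map."""
    disparity = np.zeros_like(depth_left, dtype=np.float64)
    valid = depth_left > 0
    disparity[valid] = baseline_zed * focal_zed / depth_left[valid]
    return disparity
```

With the baseline in the same length unit as the depth values and the focal length in pixels, the resulting disparity is expressed in pixels.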
Finally, because the field of view of the structured light depth camera is smaller than that of the binocular camera, the edges of the left and right views acquired by the binocular camera are cropped so that their field of view matches that of the disparity map. The cropped left and right views serve as the input of the binocular stereo matching data set, and the registered disparity map serves as the ground truth of the data set, providing supervision during deep learning model training.
The plant binocular reconstruction data set constructed by the method of the invention is shown in fig. 4; the prediction results of a deep learning model trained on this data set are shown in fig. 5.
The invention thus provides an ingenious method for acquiring high-precision, high-density ground-truth disparity for the binocular stereo matching task and for constructing data sets; the resulting data sets can be used for transfer learning and fine-tuning of deep learning models, finally achieving binocular three-dimensional reconstruction of specific scenes (e.g., robotic grasping, plant phenotype measurement).

Claims (4)

1. A binocular stereo matching data set parallax truth value acquisition method, characterized by comprising the following steps:
a structured light depth camera and a binocular camera respectively acquire images of a specified scene, and these images, together with the parameters of the binocular camera's left camera and of the structured light depth camera, are further processed to obtain the ground-truth disparity of a binocular stereo matching data set for the specified scene.
2. The binocular stereo matching data set parallax truth value acquisition method according to claim 1, characterized in that the method specifically comprises the following steps:
step a, building an imaging platform with a binocular camera and a structured light depth camera;
step b, obtaining the intrinsic and extrinsic parameters of the binocular camera's left camera and of the structured light depth camera as calibration results, using a checkerboard calibration board and Zhang Zhengyou's calibration method, and then computing the relative extrinsic parameters between the left camera and the structured light depth camera, comprising a rotation matrix and a translation matrix;
step c, capturing a specified scene with the binocular camera to obtain its left and right views, and capturing the same scene with the structured light depth camera to obtain its depth map;
step d, using the calibration results and the relative extrinsics, traversing each pixel in the depth map and computing its pixel coordinates in the left view for registration;
step e, converting the depth values of the depth map into disparity values using the binocular camera's intrinsic parameters to generate the disparity map, applying size normalization to the generated disparity map and to the left and right views originally acquired by the binocular camera, taking the size-normalized left and right views as the binocular views of the binocular stereo matching data set, and taking the size-normalized disparity map as the ground-truth disparity of the binocular stereo matching data set.
3. The binocular stereo matching data set parallax truth value acquisition method according to claim 2, characterized in that step d comprises the following sub-steps:
first, converting the coordinates of the current pixel in the depth map's pixel coordinate system into coordinates in the depth camera's coordinate system, using the pixel's depth value in the depth map and the depth camera's intrinsic matrix;
second, converting the coordinates of the current pixel in the depth camera's coordinate system into coordinates in the coordinate system of the binocular camera's left camera, using the relative extrinsics obtained in step b;
finally, converting the coordinates of the current pixel in the binocular left camera's coordinate system into coordinates in the left view's pixel coordinate system, using the pixel's depth value and the binocular left camera's intrinsic matrix.
4. The binocular stereo matching data set parallax truth value acquisition method according to claim 2, characterized in that in step e, the intrinsic parameters of the binocular camera comprise the focal length and the baseline distance, and the disparity value is computed by the following conversion:

$$d_i = \frac{b_{ZED} \cdot f_{ZED}}{z_i^{ZED}}$$

wherein $b_{ZED}$ denotes the baseline distance of the binocular camera, $f_{ZED}$ denotes the focal length of the binocular camera, $z_i^{ZED}$ denotes the depth value of the i-th pixel in the depth map, and $d_i$ is the converted disparity value.
CN202211448064.9A 2022-11-18 2022-11-18 Binocular stereo matching data set parallax truth value acquisition method Pending CN115880344A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211448064.9A CN115880344A (en) 2022-11-18 2022-11-18 Binocular stereo matching data set parallax truth value acquisition method


Publications (1)

Publication Number Publication Date
CN115880344A 2023-03-31

Family

ID=85760227

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211448064.9A Pending CN115880344A (en) 2022-11-18 2022-11-18 Binocular stereo matching data set parallax truth value acquisition method

Country Status (1)

Country Link
CN (1) CN115880344A (en)



Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105869167A (en) * 2016-03-30 2016-08-17 天津大学 High-resolution depth map acquisition method based on active and passive fusion
CN113222945A (en) * 2021-05-19 2021-08-06 西安电子科技大学 Depth information measuring method based on binocular event camera
CN115035247A (en) * 2022-06-06 2022-09-09 中国计量大学 Mars scene binocular data set generation method based on virtual reality

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
刘娇丽; 李素梅; 李永达; 刘富岩: "High-resolution depth acquisition based on the fusion of TOF and stereo matching" (基于TOF与立体匹配相融合的高分辨率深度获取), Information Technology (信息技术), no. 12, 25 December 2016 *
徐晟: "Research and Implementation of Depth Perception Technology Based on Binocular Stereo Vision" (基于双目立体视觉的深度感知技术研究及实现), South China University of Technology (华南理工大学), 15 September 2018, page 5 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116866522A (en) * 2023-07-11 2023-10-10 广州市图威信息技术服务有限公司 Remote monitoring method
CN116866522B (en) * 2023-07-11 2024-05-17 广州市图威信息技术服务有限公司 Remote monitoring method
CN117372647A (en) * 2023-10-26 2024-01-09 天宫开物(深圳)科技有限公司 Rapid construction method and system of three-dimensional model for building
CN117315033A (en) * 2023-11-29 2023-12-29 上海仙工智能科技有限公司 Neural network-based identification positioning method and system and storage medium
CN117315033B (en) * 2023-11-29 2024-03-19 上海仙工智能科技有限公司 Neural network-based identification positioning method and system and storage medium
CN117456124A (en) * 2023-12-26 2024-01-26 浙江大学 Dense SLAM method based on back-to-back binocular fisheye camera
CN117456124B (en) * 2023-12-26 2024-03-26 浙江大学 Dense SLAM method based on back-to-back binocular fisheye camera

Similar Documents

Publication Publication Date Title
CN109615652B (en) Depth information acquisition method and device
CN115880344A (en) Binocular stereo matching data set parallax truth value acquisition method
Teller et al. Calibrated, registered images of an extended urban area
CN109919911B (en) Mobile three-dimensional reconstruction method based on multi-view photometric stereo
CN111028155B (en) Parallax image splicing method based on multiple pairs of binocular cameras
CN114399554B (en) Calibration method and system of multi-camera system
CN104537707B (en) Image space type stereoscopic vision moves real-time measurement system online
CN109712232B (en) Object surface contour three-dimensional imaging method based on light field
CN113129430B (en) Underwater three-dimensional reconstruction method based on binocular structured light
CN104463969B (en) A kind of method for building up of the model of geographical photo to aviation tilt
CN108053373A (en) One kind is based on deep learning model fisheye image correcting method
WO2024045632A1 (en) Binocular vision and imu-based underwater scene three-dimensional reconstruction method, and device
CN109920000B (en) Multi-camera cooperation-based dead-corner-free augmented reality method
CN111461963B (en) Fisheye image stitching method and device
WO2020237492A1 (en) Three-dimensional reconstruction method, device, apparatus, and storage medium
CN104794713A (en) Greenhouse crop digital-imaging method based on ARM and binocular vision
CN114066983A (en) Intelligent supplementary scanning method based on two-axis rotary table and computer readable storage medium
CN114283203A (en) Calibration method and system of multi-camera system
CN112634379B (en) Three-dimensional positioning measurement method based on mixed vision field light field
CN105374067A (en) Three-dimensional reconstruction method based on PAL cameras and reconstruction system thereof
CN116129037B (en) Visual touch sensor, three-dimensional reconstruction method, system, equipment and storage medium thereof
CN111854636A (en) Multi-camera array three-dimensional detection system and method
CN113724337A (en) Camera dynamic external parameter calibration method and device without depending on holder angle
CN111429571A (en) Rapid stereo matching method based on spatio-temporal image information joint correlation
CN114935316B (en) Standard depth image generation method based on optical tracking and monocular vision

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination