CN114298965A - Binocular vision system-based interframe matching detection method and system and intelligent terminal - Google Patents


Info

Publication number
CN114298965A
Authority
CN
China
Prior art keywords
image
matching
coordinate system
world coordinate
area
Prior art date
Legal status
Pending
Application number
CN202111224174.2A
Other languages
Chinese (zh)
Inventor
裴姗姗
孙钊
肖志鹏
王欣亮
Current Assignee
Beijing Smarter Eye Technology Co Ltd
Original Assignee
Beijing Smarter Eye Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Smarter Eye Technology Co Ltd filed Critical Beijing Smarter Eye Technology Co Ltd
Priority to CN202111224174.2A
Publication of CN114298965A

Landscapes

  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses an interframe matching detection method based on a binocular vision system, together with a corresponding system and an intelligent terminal. The method comprises the following steps: respectively acquiring a top-view gray-scale image and a top-view segmentation image of two adjacent frames in the same road scene; detecting a candidate region to be matched in the previous frame through the top-view segmentation image; calculating the initial moving distance between the two frames and the initial estimated position in the next frame from the vehicle speed and the timestamps, and obtaining a search area in the next frame based on that initial estimated position; performing template matching between the candidate region to be matched and the search area, and calculating the matching position deviation; and correcting the initial moving distance with the matching position deviation to obtain the interframe matching result. The method improves the registration precision of data between adjacent frames in driver-assistance image processing, thereby providing more accurate image processing data for the driver-assistance system.

Description

Binocular vision system-based interframe matching detection method and system and intelligent terminal
Technical Field
The invention relates to the technical field of automatic driving assistance, in particular to a binocular vision system-based interframe matching detection method and system and an intelligent terminal.
Background
With the development of automatic driving technology, the requirements on the safety and comfort of driver-assistance vehicles are increasingly high. In driver assistance, the quality of image processing, especially the registration precision of data between adjacent frames, has a great influence on the control effect and is directly related to the safety and comfort of the vehicle.
Therefore, providing an interframe matching detection method based on a binocular vision system, so as to improve the registration precision of data between adjacent frames in driver-assistance image processing and provide more accurate image processing data for the driver-assistance system, is an urgent problem for those skilled in the art to solve.
Disclosure of Invention
Therefore, the embodiment of the invention provides a binocular vision system-based interframe matching detection method, a binocular vision system-based interframe matching detection system and an intelligent terminal, so that the registration accuracy of data between adjacent frames in image processing of auxiliary driving can be improved, and more accurate image processing data can be provided for an auxiliary driving system.
In order to achieve the above object, the embodiments of the present invention provide the following technical solutions:
the invention provides a binocular vision system-based interframe matching detection method, which is characterized by comprising the following steps:
respectively acquiring a top gray-scale image and a top segmentation image of two adjacent frames in the same road scene;
detecting a candidate region to be matched of a previous frame through the top view segmentation graph;
calculating an initial moving distance between two frames and an initial estimation position of a next frame through the vehicle speed and the timestamp, and acquiring a search area of the next frame based on the initial estimation position of the next frame;
performing template matching on the candidate area to be matched and the search area, and calculating the deviation of a matching position;
and correcting the initial moving distance by using the matching position deviation to obtain an interframe matching result.
Further, acquiring the top-view gray-scale images and top-view segmentation images of the two temporally adjacent frames of the same road scene specifically includes:
acquiring left and right views of the same road scene, and processing the left and right views to obtain a dense disparity map of the road scene;
converting image information of a target area into three-dimensional point cloud information under a world coordinate system based on the dense parallax map, and fitting a road surface model based on the three-dimensional point cloud information;
a target area is defined in the dense disparity map, an image of the target area is input into a trained semantic segmentation model, and two-dimensional image information after segmentation is obtained;
and converting the gray-scale image of the detection area to an XOZ projection plane to generate the top view gray-scale image and converting the segmentation image of the detection area to the XOZ projection plane to generate the top view segmentation image based on the homography transformation of the two-dimensional image information and the three-dimensional point cloud information.
Further, the converting the image information of the target area into three-dimensional point cloud information under a world coordinate system based on the dense disparity map specifically includes:
converting the image coordinate system of the dense parallax image into a world coordinate system based on a binocular stereo vision system imaging model and a pinhole imaging model;
taking a target area under a real world coordinate system as a reference, and intercepting the target area from the dense parallax image;
converting the image information in the target area into three-dimensional point cloud information according to the following formula:
Z = b·f / disp
X = (u − cx)·Z / f
Y = (v − cy)·Z / f

wherein:
b is the distance from the optical center of the left camera to the optical center of the right camera in the binocular stereoscopic vision imaging system;
f is the focal length of the cameras in the binocular stereoscopic vision imaging system;
cx and cy are the image coordinates of the camera principal point in the binocular stereoscopic vision imaging system;
(u, v) is an image coordinate point within the target region;
disp is the disparity value at the image point (u, v);
X is the lateral distance of the three-dimensional point from the camera in the world coordinate system;
Y is the longitudinal distance of the three-dimensional point from the camera in the world coordinate system;
Z is the depth distance of the three-dimensional point from the camera in the world coordinate system.
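As a concrete sketch, the disparity-to-point-cloud conversion described above (depth from disparity, lateral and longitudinal coordinates from the pinhole model) can be written in Python with NumPy. The calibration values used below (b, f, cx, cy) are hypothetical placeholders, not parameters from the invention:

```python
import numpy as np

def disparity_to_pointcloud(disp, b, f, cx, cy):
    """Convert a dense disparity map to 3D points (X, Y, Z) in the
    camera-centred world coordinate system, following Z = b*f/disp,
    X = (u - cx)*Z/f, Y = (v - cy)*Z/f. Illustrative sketch only;
    b, f, cx, cy would come from stereo calibration."""
    h, w = disp.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    valid = disp > 0                       # zero disparity carries no depth
    Z = np.zeros_like(disp, dtype=float)
    Z[valid] = b * f / disp[valid]         # depth from disparity
    X = (u - cx) * Z / f                   # lateral distance
    Y = (v - cy) * Z / f                   # longitudinal (vertical) distance
    return np.stack([X, Y, Z], axis=-1)

# Toy example: 2x2 disparity map, baseline 0.12 m, focal length 1000 px,
# principal point (1, 1) — all hypothetical values.
pts = disparity_to_pointcloud(np.array([[10.0, 20.0], [0.0, 40.0]]),
                              b=0.12, f=1000.0, cx=1.0, cy=1.0)
print(pts[0, 0, 2])  # Z = 0.12 * 1000 / 10 = 12.0
```

Pixels with zero disparity are left at depth 0 here; a production pipeline would mask them out instead.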
Further, a road surface model equation fitted based on the three-dimensional point cloud information is as follows:
cosα·X + cosβ·Y + cosγ·Z = D

wherein:
cosα is the direction cosine of the angle between the road surface normal vector and the x coordinate axis of the world coordinate system;
cosβ is the direction cosine of the angle between the road surface normal vector and the y coordinate axis of the world coordinate system;
cosγ is the direction cosine of the angle between the road surface normal vector and the z coordinate axis of the world coordinate system;
D is the distance from the origin of the world coordinate system to the road surface plane.
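The road surface model above is a plane in normal (direction-cosine) form. A minimal least-squares fit of such a plane to the point cloud can be sketched as follows; the SVD-based fit and the upward orientation of the normal are illustrative choices, not the invention's prescribed fitting method (a robust variant such as RANSAC would typically be used to reject non-road points):

```python
import numpy as np

def fit_road_plane(pts):
    """Least-squares plane fit: returns the unit normal
    (cos_alpha, cos_beta, cos_gamma) and the distance D from the origin,
    so that n . p = D for points p on the plane. Sketch only."""
    centroid = pts.mean(axis=0)
    # The right singular vector of the centred points with the smallest
    # singular value is the plane normal.
    _, _, vt = np.linalg.svd(pts - centroid)
    n = vt[-1]
    if n[1] < 0:               # orient the normal toward +y (assumption)
        n = -n
    D = float(n @ centroid)
    return n, D

# Points lying on the plane Y = 1.5 (an idealised flat road)
pts = np.array([[0.0, 1.5, 5.0], [1.0, 1.5, 7.0],
                [-1.0, 1.5, 9.0], [0.5, 1.5, 12.0]])
n, D = fit_road_plane(pts)
print(np.round(n, 6), round(D, 6))  # normal ≈ (0, 1, 0), D ≈ 1.5
```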
Further, homography transformation based on the two-dimensional image information and the three-dimensional point cloud information is completed by using the following formula:
s · [x′, y′, 1]^T = H · [u, v, 1]^T

wherein:
(X, Z) are respectively the lateral distance and the depth of the point from the camera;
(u, v) is an image coordinate point within the detection area;
(x′, y′) is the corresponding projection plane coordinate point within the detection area, i.e. the image of (X, Z) on the XOZ projection plane;
s is the homogeneous scale factor;
H is the homography transformation matrix.
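Applying a homography of this form to image points can be sketched as follows; the matrix H used here is a hypothetical example for illustration, not a calibrated image-to-top-view transformation:

```python
import numpy as np

def warp_points(H, uv):
    """Apply homography H to image points (u, v), returning projection-plane
    points (x', y') after dividing out the homogeneous scale s."""
    uv1 = np.hstack([uv, np.ones((len(uv), 1))])  # to homogeneous coords
    xyw = uv1 @ H.T                               # s*[x', y', 1]^T = H*[u, v, 1]^T
    return xyw[:, :2] / xyw[:, 2:3]               # divide out the scale

# Hypothetical H: halve the image coordinates and shift x' by 10
H = np.array([[0.5, 0.0, 10.0],
              [0.0, 0.5,  0.0],
              [0.0, 0.0,  1.0]])
print(warp_points(H, np.array([[100.0, 40.0]])))  # [[60. 20.]]
```

Warping the whole gray-scale or segmentation image to the XOZ plane would apply the same transform per pixel (e.g. via an image-warping routine) rather than per point.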
Further, the detecting the candidate region to be matched of the previous frame through the top view segmentation map specifically includes:
setting the pixel value marked as a preset mark on the overlook segmentation image of the previous frame as 1, and setting other positions as 0, and generating a binary image according to the pixel value;
and detecting a connected domain on the binary image, and expanding a region with a preset size to the periphery as a candidate region by taking the central position of the lower boundary of the connected domain as a reference.
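The two steps above (binarisation on preset labels, then connected-domain detection and expansion around the centre of the lower boundary) can be sketched as follows. The label values (2, 3, 4) and the window half-size are illustrative assumptions, and a simple BFS stands in for whatever connected-component routine the implementation actually uses:

```python
import numpy as np
from collections import deque

def candidate_regions(seg, labels=(2, 3, 4), half=8):
    """Binarise the top-view segmentation map on the preset labels, find
    4-connected components, and return a fixed-size window (y0, y1, x0, x1)
    around the centre of each component's lower boundary. Sketch only."""
    binary = np.isin(seg, labels)
    seen = np.zeros_like(binary, dtype=bool)
    h, w = binary.shape
    regions = []
    for r in range(h):
        for c in range(w):
            if binary[r, c] and not seen[r, c]:
                comp, q = [], deque([(r, c)])
                seen[r, c] = True
                while q:                          # BFS over one component
                    y, x = q.popleft()
                    comp.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w \
                                and binary[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            q.append((ny, nx))
                ys = max(y for y, _ in comp)      # lower boundary row
                xs = [x for y, x in comp if y == ys]
                cx = sum(xs) // len(xs)           # centre of lower boundary
                regions.append((max(0, ys - half), min(h, ys + half + 1),
                                max(0, cx - half), min(w, cx + half + 1)))
    return regions

seg = np.zeros((20, 20), dtype=int)
seg[5:8, 5:8] = 3                                 # one labelled blob
print(candidate_regions(seg))                     # [(0, 16, 0, 15)]
```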
Further, the acquiring a search area of the next frame based on the initial estimated position of the next frame specifically includes:
obtaining an initial estimation position of a candidate region in a next frame based on the initial moving distance and the candidate region;
and taking the initial estimation position as a reference, and expanding a region with a preset size to the periphery as a search region.
The invention also provides a binocular vision system-based interframe matching detection system, which comprises:
the image acquisition unit is used for respectively acquiring a top gray-scale image and a top segmentation image of two adjacent frames in the same road scene;
a candidate region acquisition unit, configured to detect a candidate region to be matched of a previous frame through the top view segmentation map;
a search area acquisition unit for calculating an initial moving distance between two frames and an initial estimated position of a subsequent frame by a vehicle speed and a time stamp, and acquiring a search area of the subsequent frame based on the initial estimated position of the subsequent frame;
the position deviation acquiring unit is used for performing template matching on the candidate area to be matched and the search area and calculating the matching position deviation;
and the matching result output unit is used for correcting the initial moving distance by using the matching position deviation so as to obtain an interframe matching result.
The present invention also provides an intelligent terminal, including: the device comprises a data acquisition device, a processor and a memory;
the data acquisition device is used for acquiring data; the memory is to store one or more program instructions; the processor is configured to execute one or more program instructions to perform the method as described above.
The present invention also provides a computer readable storage medium having embodied therein one or more program instructions for executing the method as described above.
The invention provides a binocular vision system-based interframe matching detection method, which comprises the steps of respectively obtaining a top view gray scale image and a top view segmentation image of two adjacent frames in the same road scene, detecting a candidate area to be matched of the previous frame through the top view segmentation image, calculating an initial moving distance between the two frames and an initial estimation position of the next frame through a vehicle speed and a time stamp, and obtaining a search area of the next frame based on the initial estimation position of the next frame; and performing template matching on the candidate area to be matched and the search area, calculating a matching position deviation, and finally correcting the initial moving distance by using the matching position deviation to obtain an inter-frame matching result. Therefore, the method obtains the position deviation by matching the candidate area to be matched with the search area, so that the initial moving distance is corrected and compensated by using the position deviation, the registration precision between two adjacent frames is higher, the registration precision of data between the adjacent frames in the image processing of the auxiliary driving is improved, and more accurate image processing data is provided for the auxiliary driving system.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below. It should be apparent that the drawings in the following description are merely exemplary, and that other embodiments can be derived from the drawings provided by those of ordinary skill in the art without inventive effort.
The structures, proportions, and sizes shown in this specification are only intended to match the content disclosed herein, for understanding and reading by those skilled in the art, and are not intended to limit the conditions under which the invention can be implemented. Any structural modification, change in proportion, or adjustment of size that does not affect the effects achievable by the invention shall still fall within the scope covered by the technical content disclosed herein.
FIG. 1 is a flowchart of a binocular vision system based interframe matching detection method according to an embodiment of the present invention;
fig. 2 is a block diagram of a specific embodiment of the binocular vision system-based interframe matching detection system provided by the invention.
Detailed Description
The present invention is described below in terms of particular embodiments, and other advantages and effects of the invention will be readily apparent to those skilled in the art from this disclosure. It should be understood that the described embodiments are merely some, not all, of the embodiments of the invention, and are not intended to limit it to the particular forms disclosed. All other embodiments obtained by a person skilled in the art from the given embodiments without creative effort shall fall within the protection scope of the present invention.
The binocular vision system-based interframe matching detection method provided by the invention can improve the registration precision of data between adjacent frames in the image processing of auxiliary driving.
In a specific embodiment, as shown in fig. 1, the interframe matching detection method provided by the present invention includes the following steps:
s1: and respectively acquiring top gray-scale views and top segmentation views of two adjacent frames in the same road scene.
In step S1, the obtaining of the top gray-scale image and the top segmentation image of two adjacent frames in the same road scene includes:
s11: and acquiring left and right views of the same road scene, and processing the left and right views to obtain a dense disparity map of the road scene.
That is to say, the left and right views of the same road scene are acquired through the binocular stereo vision sensor, and the left and right views are processed to obtain the dense disparity map of the road scene.
In this embodiment, the coordinate system of the binocular stereo camera is taken as a reference system, the optical axis direction of the left eye camera is a Z-axis distance direction, the baseline direction of the binocular stereo camera is an X-axis transverse direction, and the vertical direction is a Y-axis longitudinal direction.
S12: and converting the image information of the target area into three-dimensional point cloud information under a world coordinate system based on the dense parallax map, and fitting a road surface model based on the three-dimensional point cloud information.
Specifically, a target area in an image is intercepted by taking the target area in a real world coordinate system as a reference, and the image area of the target area is converted into three-dimensional point cloud information pts in the world coordinate system; and the image area information completes the conversion from an image coordinate system to a world coordinate system according to the imaging model of the binocular stereoscopic vision system and the pinhole imaging model.
In order to improve the accuracy of the three-dimensional point cloud information and further ensure the accuracy of the subsequent calculation result, in step S12, the converting the image information of the target area into the three-dimensional point cloud information under the world coordinate system based on the dense disparity map specifically includes:
converting the image coordinate system of the dense parallax image into a world coordinate system based on a binocular stereo vision system imaging model and a pinhole imaging model;
taking a target area under a real world coordinate system as a reference, and intercepting the target area from the dense parallax image;
converting the image information in the target area into three-dimensional point cloud information according to the following formula:
Z = b·f / disp
X = (u − cx)·Z / f
Y = (v − cy)·Z / f

wherein:
b is the distance from the optical center of the left camera to the optical center of the right camera in the binocular stereoscopic vision imaging system;
f is the focal length of the cameras in the binocular stereoscopic vision imaging system;
cx and cy are the image coordinates of the camera principal point in the binocular stereoscopic vision imaging system;
(u, v) is an image coordinate point within the target region;
disp is the disparity value at the image point (u, v);
X is the lateral distance of the three-dimensional point from the camera in the world coordinate system;
Y is the longitudinal distance of the three-dimensional point from the camera in the world coordinate system;
Z is the depth distance of the three-dimensional point from the camera in the world coordinate system.
In step S12, the road surface model equation fitted based on the three-dimensional point cloud information is:
cosα·X + cosβ·Y + cosγ·Z = D

wherein:
cosα is the direction cosine of the angle between the road surface normal vector and the x coordinate axis of the world coordinate system;
cosβ is the direction cosine of the angle between the road surface normal vector and the y coordinate axis of the world coordinate system;
cosγ is the direction cosine of the angle between the road surface normal vector and the z coordinate axis of the world coordinate system;
D is the distance from the origin of the world coordinate system to the road surface plane.
S13: and defining a target area in the dense disparity map, inputting the image of the target area into a trained semantic segmentation model, and obtaining segmented two-dimensional image information.
In order to obtain an accurate semantic segmentation model, the terrain conditions possibly occurring in the road can be analyzed, the terrain common scene categories are classified, and then various scenes are shot to obtain a plurality of training images. And then, labeling the interested region for each training image to obtain a mask image. For example, the pixel value of the bridge joint is marked as 0, the pixel value of the common road surface is marked as 1, the pixel value of the road surface mark is marked as 2, the pixel value of the deceleration strip is marked as 3, the pixel value of the manhole cover is marked as 4, and the pixel value of the accumulated water is marked as 5, so that the mask image uniquely corresponding to each training image can be obtained.
S14: and converting the gray-scale image of the detection area to an XOZ projection plane to generate the top view gray-scale image and converting the segmentation image of the detection area to the XOZ projection plane to generate the top view segmentation image based on the homography transformation of the two-dimensional image information and the three-dimensional point cloud information.
In step S14, homography transformation based on the two-dimensional image information and the three-dimensional point cloud information is completed using the following formula:
s · [x′, y′, 1]^T = H · [u, v, 1]^T

wherein:
(X, Z) are respectively the lateral distance and the depth of the point from the camera;
(u, v) is an image coordinate point within the detection area;
(x′, y′) is the corresponding projection plane coordinate point within the detection area, i.e. the image of (X, Z) on the XOZ projection plane;
s is the homogeneous scale factor;
H is the homography transformation matrix.
The grayscale image of the detection area can be converted to the XOZ projection plane through homography to generate a top view grayscale image, and meanwhile, the segmentation image of the detection area is converted to the XOZ projection plane to generate a top view segmentation image.
S2: and detecting a candidate region to be matched of a previous frame through the top view segmentation graph.
It should be understood that, in this embodiment, the previous frame refers to the previous frame in two adjacent frames, and for convenience of description, the previous frame is set as the t-1 frame; the next frame refers to the next frame of the two adjacent frames, and for convenience of description, the next frame is set as the t frame.
In step S2, in order to improve the accuracy of the delimiting of the candidate region, the detecting the candidate region to be matched of the previous frame through the top view segmentation map specifically includes the following steps:
s21: setting the pixel value marked as a preset mark on the overlook segmentation image of the previous frame as 1, and setting other positions as 0, and generating a binary image according to the pixel value;
s22: and detecting a connected domain on the binary image, and expanding a region with a preset size to the periphery as a candidate region by taking the central position of the lower boundary of the connected domain as a reference.
For example, in a specific usage scenario, candidate regions with pixel values 2, 3, and 4 are detected on the top view of frame t−1. Specifically, on the top-view segmentation image of frame t−1, the pixels whose values are 2, 3, or 4 are set to 1 and all other positions are set to 0, generating a binary image; a connected domain is then detected on the binary image, and a fixed-size region is expanded around the centre position of the lower boundary of the connected domain as the candidate region.
S3: calculating an initial moving distance between two frames and an initial estimation position of a next frame through the vehicle speed and the timestamp, and acquiring a search area of the next frame based on the initial estimation position of the next frame;
the method for acquiring the search area of the next frame based on the initial estimation position of the next frame specifically comprises the following steps:
s31: obtaining an initial estimation position of a candidate region in a next frame based on the initial moving distance and the candidate region;
s32: and taking the initial estimation position as a reference, and expanding a region with a preset size to the periphery as a search region.
In the specific use scenario described above, the initial movement distance between two frames is calculated from the vehicle speed information, the t-frame timestamp information, and the t-1-frame timestamp information.
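That calculation is simple arithmetic: distance = speed × Δt, converted into top-view pixels. A sketch, assuming the speed is in metres per second and the top view has a fixed metric resolution (the 0.05 m/pixel value is a hypothetical choice):

```python
def initial_shift_px(speed_mps, t_prev, t_curr, metres_per_px):
    """Initial inter-frame movement from vehicle speed and the two frame
    timestamps, expressed in top-view pixels. Units are illustrative
    assumptions: speed in m/s, timestamps in seconds."""
    return speed_mps * (t_curr - t_prev) / metres_per_px

# 10 m/s, 50 ms between frames, top-view resolution 0.05 m/px
shift = initial_shift_px(10.0, 1.000, 1.050, 0.05)
print(shift)  # ≈ 10 px
```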
S4: and performing template matching on the candidate area to be matched and the search area, and calculating the matching position deviation.
Specifically, by the initial moving distance calculated in step S3 and the determined candidate region, the initial estimated position of the candidate region in the t-th frame may be obtained. That is, a fixed-size region is expanded around as a search region with the initial estimated position as a reference, and a matching position deviation is calculated by performing template matching using the candidate region and the search region.
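Template matching of the candidate region against the search area can be sketched with a brute-force sum-of-squared-differences search; this is illustrative only, since the patent does not specify the similarity measure (a normalised correlation score would be a common alternative). The offset returned relative to the initial estimate is the matching position deviation used in step S5:

```python
import numpy as np

def match_offset(template, search):
    """Slide the candidate-region template over the search region and
    return the (row, col) offset of the best match, scored by sum of
    squared differences. Minimal sketch."""
    th, tw = template.shape
    sh, sw = search.shape
    best, best_pos = None, (0, 0)
    for y in range(sh - th + 1):
        for x in range(sw - tw + 1):
            ssd = np.sum((search[y:y + th, x:x + tw] - template) ** 2)
            if best is None or ssd < best:
                best, best_pos = ssd, (y, x)
    return best_pos

rng = np.random.default_rng(0)
search = rng.random((16, 16))
template = search[5:9, 3:7].copy()        # known ground-truth offset (5, 3)
print(match_offset(template, search))     # (5, 3)
```

The deviation between this matched position and the initial estimated position is then used to correct and compensate the initial moving distance.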
S5: and correcting and compensating the initial moving distance by using the matching position deviation to obtain an interframe matching result.
In the above specific embodiment, the inter-frame matching detection method based on the binocular vision system provided by the present invention respectively obtains the top view grayscale image and the top view segmentation image of two adjacent frames in the same road scene, detects the candidate area to be matched of the previous frame through the top view segmentation image, calculates the initial moving distance between the two frames and the initial estimated position of the next frame through the vehicle speed and the timestamp, and obtains the search area of the next frame based on the initial estimated position of the next frame; and performing template matching on the candidate area to be matched and the search area, calculating a matching position deviation, and finally correcting the initial moving distance by using the matching position deviation to obtain an inter-frame matching result. Therefore, the method obtains the position deviation by matching the candidate area to be matched with the search area, so that the initial moving distance is corrected and compensated by using the position deviation, the registration precision between two adjacent frames is higher, the registration precision of data between the adjacent frames in the image processing of the auxiliary driving is improved, and more accurate image processing data is provided for the auxiliary driving system.
In addition to the above method, the present invention also provides a binocular vision system-based interframe matching detection system, which, in one embodiment, as shown in fig. 2, includes:
the image acquisition processing unit 100 is specifically configured to:
acquiring left and right views of the same road scene, and processing the left and right views to obtain a dense disparity map of the road scene;
converting image information of a target area into three-dimensional point cloud information under a world coordinate system based on the dense parallax map, and fitting a road surface model based on the three-dimensional point cloud information;
a target area is defined in the dense disparity map, an image of the target area is input into a trained semantic segmentation model, and two-dimensional image information after segmentation is obtained;
and converting the gray-scale image of the detection area to an XOZ projection plane to generate the top view gray-scale image and converting the segmentation image of the detection area to the XOZ projection plane to generate the top view segmentation image based on the homography transformation of the two-dimensional image information and the three-dimensional point cloud information.
The converting the image information of the target area into three-dimensional point cloud information under a world coordinate system based on the dense disparity map specifically comprises:
converting the image coordinate system of the dense parallax image into a world coordinate system based on a binocular stereo vision system imaging model and a pinhole imaging model;
taking a target area under a real world coordinate system as a reference, and intercepting the target area from the dense parallax image;
converting the image information in the target area into three-dimensional point cloud information according to the following formula:
Z = b·f / disp
X = (u − cx)·Z / f
Y = (v − cy)·Z / f

wherein:
b is the distance from the optical center of the left camera to the optical center of the right camera in the binocular stereoscopic vision imaging system;
f is the focal length of the cameras in the binocular stereoscopic vision imaging system;
cx and cy are the image coordinates of the camera principal point in the binocular stereoscopic vision imaging system;
(u, v) is an image coordinate point within the target region;
disp is the disparity value at the image point (u, v);
X is the lateral distance of the three-dimensional point from the camera in the world coordinate system;
Y is the longitudinal distance of the three-dimensional point from the camera in the world coordinate system;
Z is the depth distance of the three-dimensional point from the camera in the world coordinate system.
The road surface model equation based on the three-dimensional point cloud information fitting is as follows:
cosα·X + cosβ·Y + cosγ·Z = D

wherein:
cosα is the direction cosine of the angle between the road surface normal vector and the x coordinate axis of the world coordinate system;
cosβ is the direction cosine of the angle between the road surface normal vector and the y coordinate axis of the world coordinate system;
cosγ is the direction cosine of the angle between the road surface normal vector and the z coordinate axis of the world coordinate system;
D is the distance from the origin of the world coordinate system to the road surface plane.
Wherein homography transformation based on the two-dimensional image information and the three-dimensional point cloud information is completed by using the following formula:
(x′, z′, 1)ᵀ = H · (u, v, 1)ᵀ (up to a homogeneous scale factor)
wherein:
X and Z are, respectively, the lateral distance and the depth of a point from the camera;
u and v are the image coordinates of a point within the detection area;
x′ and z′ are the coordinates of the corresponding point on the XOZ projection plane within the detection area;
H is the homography transformation matrix.
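Applying a 3×3 homography of this kind to a batch of image points can be sketched as follows (illustrative Python; the matrix H itself would come from the calibration and road-model fitting described above):

```python
import numpy as np

def warp_to_top_view(points_uv, H):
    """Apply a 3x3 homography H to N x 2 image points (u, v) and return the
    corresponding XOZ projection-plane points after homogeneous normalization."""
    pts = np.hstack([points_uv, np.ones((len(points_uv), 1))])  # to homogeneous
    proj = pts @ H.T
    return proj[:, :2] / proj[:, 2:3]                           # divide out w
```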
The candidate area obtaining unit 200 is specifically configured to:
setting to 1 each pixel marked with the preset label on the top-view segmentation image of the previous frame, setting all other positions to 0, and generating a binary image from these pixel values;
and detecting connected components on the binary image and, taking the center position of each component's lower boundary as a reference, expanding outward by a preset size to form a candidate region.
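The binarization and connected-component step of this unit can be sketched as follows (illustrative Python; `target_label`, `half_w`, and `half_h` stand in for the unspecified preset mark and preset region size):

```python
import numpy as np

def candidate_regions(seg, target_label, half_w=8, half_h=8):
    """Binarize the top-view segmentation (1 where pixel == target_label),
    find 4-connected components, and expand a fixed-size window around the
    center of each component's lower boundary.  Boxes are (top, bottom, left, right)."""
    binary = (seg == target_label)
    h, w = binary.shape
    seen = np.zeros_like(binary, dtype=bool)
    regions = []
    for r in range(h):
        for c in range(w):
            if binary[r, c] and not seen[r, c]:
                stack, comp = [(r, c)], []
                seen[r, c] = True
                while stack:                      # flood fill one component
                    y, x = stack.pop()
                    comp.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w and binary[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            stack.append((ny, nx))
                bottom = max(y for y, _ in comp)          # lower boundary row
                xs = [x for y, x in comp if y == bottom]
                cx = sum(xs) // len(xs)                   # center of lower boundary
                regions.append((max(0, bottom - half_h), min(h, bottom + half_h + 1),
                                max(0, cx - half_w), min(w, cx + half_w + 1)))
    return regions
```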
The search area obtaining unit 300 is specifically configured to:
obtaining an initial estimated position of the candidate region in the next frame based on the initial moving distance and the candidate region;
and taking the initial estimated position as a reference, expanding outward by a preset size to form the search region.
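A possible sketch of the search-region construction follows (the box layout, the pixel scale of the top view, and the padding value are assumptions; the patent leaves the preset size unspecified):

```python
def search_region(candidate, speed_mps, dt_s, metres_per_px, pad=4):
    """Shift the previous-frame candidate box by the initial ego-motion estimate
    (vehicle speed x frame interval, converted to top-view pixels) and pad it to
    form the next frame's search window.  Boxes are (top, bottom, left, right)."""
    shift_px = round(speed_mps * dt_s / metres_per_px)   # initial moving distance
    top, bottom, left, right = candidate
    # Forward motion shifts ground features along the depth axis of the top view;
    # padding on every side gives template matching room to find the true offset.
    return (top + shift_px - pad, bottom + shift_px + pad, left - pad, right + pad)
```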
In the above embodiment, the binocular-vision-based inter-frame matching detection system provided by the invention obtains the top-view grayscale image and the top-view segmentation image of two adjacent frames in the same road scene, detects the candidate area to be matched in the previous frame from the top-view segmentation image, calculates the initial moving distance between the two frames and the initial estimated position in the next frame from the vehicle speed and the timestamps, and obtains the search area of the next frame from that initial estimated position. It then performs template matching between the candidate area and the search area, calculates the matching position deviation, and finally corrects the initial moving distance with that deviation to obtain the inter-frame matching result. Because the position deviation is obtained by matching the candidate area against the search area, the initial moving distance can be corrected and compensated, so the registration accuracy between adjacent frames is higher; this improves the registration accuracy of inter-frame data in driver-assistance image processing and thereby provides more accurate image data to the driver-assistance system.
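The template-matching step at the heart of the method can be illustrated with an exhaustive sum-of-squared-differences search (an assumed similarity measure; the patent does not name one). The returned best position, minus the expected position, gives the matching position deviation used to correct the initial moving distance:

```python
import numpy as np

def match_offset(template, search):
    """Exhaustive SSD template matching: slide `template` over every position
    of `search` and return the (dy, dx) of the best match relative to the
    top-left corner of the search window."""
    th, tw = template.shape
    sh, sw = search.shape
    best, best_pos = None, (0, 0)
    for dy in range(sh - th + 1):
        for dx in range(sw - tw + 1):
            ssd = float(np.sum((search[dy:dy + th, dx:dx + tw] - template) ** 2))
            if best is None or ssd < best:
                best, best_pos = ssd, (dy, dx)
    return best_pos
```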
The present invention also provides an intelligent terminal, including: the device comprises a data acquisition device, a processor and a memory;
the data acquisition device is configured to acquire data; the memory is configured to store one or more program instructions; and the processor is configured to execute the one or more program instructions to perform the method described above.
In correspondence with the above embodiments, embodiments of the present invention also provide a computer storage medium containing one or more program instructions, wherein the one or more program instructions are used by the inter-frame matching detection system to execute the method described above.
In an embodiment of the invention, the processor may be an integrated circuit chip having signal processing capability. The processor may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
The various methods, steps, and logic blocks disclosed in the embodiments of the present invention may be implemented or performed by such a processor. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the method disclosed in connection with the embodiments of the present invention may be executed directly by a hardware decoding processor, or by a combination of hardware and software modules within a decoding processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or registers. The processor reads the information in the storage medium and completes the steps of the method in combination with its hardware.
The storage medium may be a memory, for example, which may be volatile memory or nonvolatile memory, or which may include both volatile and nonvolatile memory.
The nonvolatile Memory may be a Read-Only Memory (ROM), a Programmable ROM (PROM), an Erasable PROM (EPROM), an Electrically Erasable PROM (EEPROM), or a flash Memory.
The volatile memory may be a random access memory (RAM), which serves as an external cache. By way of example and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), and direct Rambus RAM (DRRAM).
The storage media described in connection with the embodiments of the invention are intended to comprise, without being limited to, these and any other suitable types of memory.
Those skilled in the art will appreciate that the functionality described in the present invention may be implemented in a combination of hardware and software in one or more of the examples described above. When software is applied, the corresponding functionality may be stored on or transmitted over as one or more instructions or code on a computer-readable medium. Computer-readable media includes both computer storage media and communication media including any medium that facilitates transfer of a computer program from one place to another. A storage media may be any available media that can be accessed by a general purpose or special purpose computer.
The above embodiments are only for illustrating the embodiments of the present invention and are not to be construed as limiting the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made on the basis of the embodiments of the present invention shall be included in the scope of the present invention.

Claims (10)

1. A binocular vision system-based interframe matching detection method is characterized by comprising the following steps:
respectively acquiring a top-view grayscale image and a top-view segmentation image of two adjacent frames in the same road scene;
detecting a candidate region to be matched of a previous frame through the top view segmentation graph;
calculating an initial moving distance between two frames and an initial estimation position of a next frame through the vehicle speed and the timestamp, and acquiring a search area of the next frame based on the initial estimation position of the next frame;
performing template matching on the candidate area to be matched and the search area, and calculating the deviation of a matching position;
and correcting the initial moving distance by using the matching position deviation to obtain an interframe matching result.
2. The inter-frame matching detection method according to claim 1, wherein the obtaining of the top-view grayscale image and the top-view segmentation image of two adjacent frames in the same road scene respectively specifically includes:
acquiring left and right views of the same road scene, and processing the left and right views to obtain a dense disparity map of the road scene;
converting image information of a target area into three-dimensional point cloud information under a world coordinate system based on the dense parallax map, and fitting a road surface model based on the three-dimensional point cloud information;
a target area is defined in the dense disparity map, an image of the target area is input into a trained semantic segmentation model, and two-dimensional image information after segmentation is obtained;
and converting the gray-scale image of the detection area to an XOZ projection plane to generate the top view gray-scale image and converting the segmentation image of the detection area to the XOZ projection plane to generate the top view segmentation image based on the homography transformation of the two-dimensional image information and the three-dimensional point cloud information.
3. The method according to claim 2, wherein the converting image information of the target area into three-dimensional point cloud information in a world coordinate system based on the dense disparity map specifically comprises:
converting the image coordinate system of the dense parallax image into a world coordinate system based on a binocular stereo vision system imaging model and a pinhole imaging model;
taking a target area under a real world coordinate system as a reference, and intercepting the target area from the dense parallax image;
converting the image information in the target area into three-dimensional point cloud information according to the following formula:
Z = b · f / disp
X = (u − cx) · Z / f
Y = (v − cy) · Z / f
b is the distance from the optical center of a left camera to the optical center of a right camera in the binocular stereoscopic vision imaging system;
f is the focal length of a camera in the binocular stereoscopic vision imaging system;
cx and cy are image coordinates of a camera principal point in the binocular stereoscopic vision imaging system;
u and v are the image coordinates of a point within the target region;
disp is the disparity value at the image point (u, v);
X is the lateral distance of a three-dimensional point from the camera in the world coordinate system;
Y is the longitudinal distance of the three-dimensional point from the camera in the world coordinate system;
and Z is the depth distance of the three-dimensional point from the camera in the world coordinate system.
4. The interframe matching detection method of claim 2, wherein a road surface model equation fitted based on the three-dimensional point cloud information is:
cos α · X + cos β · Y + cos γ · Z = D
wherein:
cos α is the direction cosine of the angle between the road surface normal vector and the x axis of the world coordinate system;
cos β is the direction cosine of the angle between the road surface normal vector and the y axis of the world coordinate system;
cos γ is the direction cosine of the angle between the road surface normal vector and the z axis of the world coordinate system;
and D is the distance from the origin of the world coordinate system to the plane of the road surface.
5. The interframe matching detection method of claim 2, wherein homography transformation based on the two-dimensional image information and three-dimensional point cloud information is accomplished using the following formula:
(x′, z′, 1)ᵀ = H · (u, v, 1)ᵀ (up to a homogeneous scale factor)
wherein:
X and Z are, respectively, the lateral distance and the depth of a point from the camera;
u and v are the image coordinates of a point within the detection area;
x′ and z′ are the coordinates of the corresponding point on the XOZ projection plane within the detection area;
H is the homography transformation matrix.
6. The method according to claim 1, wherein the detecting the candidate region to be matched of the previous frame through the top view segmentation map specifically includes:
setting to 1 each pixel marked with the preset label on the top-view segmentation image of the previous frame, setting all other positions to 0, and generating a binary image from these pixel values;
and detecting connected components on the binary image and, taking the center position of each component's lower boundary as a reference, expanding outward by a preset size to form a candidate region.
7. The method according to claim 6, wherein the obtaining a search area of a subsequent frame based on the initial estimated position of the subsequent frame specifically includes:
obtaining an initial estimated position of the candidate region in the next frame based on the initial moving distance and the candidate region;
and taking the initial estimated position as a reference, expanding outward by a preset size to form the search region.
8. An interframe matching detection system based on a binocular vision system, the system comprising:
the image acquisition unit is used for respectively acquiring a top-view grayscale image and a top-view segmentation image of two adjacent frames in the same road scene;
a candidate region acquisition unit, configured to detect a candidate region to be matched of a previous frame through the top view segmentation map;
a search area acquisition unit for calculating an initial moving distance between two frames and an initial estimated position of a subsequent frame by a vehicle speed and a time stamp, and acquiring a search area of the subsequent frame based on the initial estimated position of the subsequent frame;
the position deviation acquiring unit is used for performing template matching on the candidate area to be matched and the search area and calculating the matching position deviation;
and the matching result output unit is used for correcting the initial moving distance by using the matching position deviation so as to obtain an interframe matching result.
9. An intelligent terminal, characterized in that, intelligent terminal includes: the device comprises a data acquisition device, a processor and a memory;
the data acquisition device is used for acquiring data; the memory is to store one or more program instructions; the processor, configured to execute one or more program instructions to perform the method of any of claims 1-7.
10. A computer-readable storage medium having one or more program instructions embodied therein for performing the method of any of claims 1-7.
CN202111224174.2A 2021-10-21 2021-10-21 Binocular vision system-based interframe matching detection method and system and intelligent terminal Pending CN114298965A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111224174.2A CN114298965A (en) 2021-10-21 2021-10-21 Binocular vision system-based interframe matching detection method and system and intelligent terminal


Publications (1)

Publication Number Publication Date
CN114298965A true CN114298965A (en) 2022-04-08

Family

ID=80964570


Country Status (1)

Country Link
CN (1) CN114298965A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination