CN114926521A - Stereo matching method and system based on binocular camera


Info

Publication number
CN114926521A
Authority
CN
China
Prior art keywords: parallax, cost, matching, pixel, calculation
Prior art date
Legal status
Pending (assumed; not a legal conclusion)
Application number
CN202210709579.3A
Other languages
Chinese (zh)
Inventor
葛方海 (Ge Fanghai)
杨超 (Yang Chao)
朱海涛 (Zhu Haitao)
刘永才 (Liu Yongcai)
王鹏 (Wang Peng)
Current Assignee
Beijing Smarter Eye Technology Co Ltd
Original Assignee
Beijing Smarter Eye Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Smarter Eye Technology Co Ltd filed Critical Beijing Smarter Eye Technology Co Ltd
Priority to CN202210709579.3A
Publication of CN114926521A

Classifications

    • G06T 7/593: Depth or shape recovery from stereo images
    • G06N 3/045: Neural networks; combinations of networks
    • G06T 5/20: Image enhancement or restoration using local operators
    • G06T 7/337: Image registration using feature-based methods involving reference images or patches
    • G06T 7/73: Determining position or orientation of objects or cameras using feature-based methods
    • G06T 2207/10004: Still image; photographic image
    • G06T 2207/10012: Stereo images
    • G06T 2207/20024: Filtering details
    • G06T 2207/20084: Artificial neural networks [ANN]
    • G06T 2207/20228: Disparity calculation for image-based rendering

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a stereo matching method and system based on a binocular camera, wherein the method comprises the following steps: collecting continuous multi-frame left and right eye images of a target area; performing parallax matching cost calculation on each pair of left and right eye images to obtain a plurality of parallax matching cost values; aggregating the parallax matching cost values in multiple directions, and calculating the argmax (the argument of the maximum) to obtain the parallax map corresponding to the maximum parallax matching cost value; and normalizing the parallax map and performing sub-pixel parallax calculation to obtain the stereo matching result. The method can be effectively deployed on a deep-learning chip, satisfies both runtime and quality requirements, generalizes better, achieves higher sub-pixel precision, and improves the algorithmic precision and generalization capability of stereo matching, thereby providing an algorithmic basis for improving the accuracy of automatic driving data.

Description

Stereo matching method and system based on binocular camera
Technical Field
The invention relates to the technical field of driver assistance, and in particular to a stereo matching method and system based on a binocular camera.
Background
With people's growing demand for safer and more convenient travel, intelligent driving technology is developing vigorously, and the ability to perceive and understand the environment is the basis and precondition of an automobile's intelligent systems. An intelligent vehicle acquires views through a binocular camera, analyzes them to perceive the surrounding environment, and provides information to the control system to monitor the driving situation.
When depth estimation is performed on stereo images acquired by a binocular camera, stereo matching is required. Its purpose is to estimate the correspondence of all pixel points between the two rectified images, and the quality of this matching directly affects the accuracy of automatic-driving information recognition.
Disclosure of Invention
Therefore, the embodiment of the invention provides a stereo matching method and system based on binocular cameras to improve the algorithm precision and generalization capability of stereo matching, thereby providing an algorithm basis for improving the accuracy of automatic driving data.
In order to achieve the above object, the embodiments of the present invention provide the following technical solutions:
a binocular camera based stereo matching method, the method comprising:
collecting continuous multi-frame left and right eye images in a target area;
performing parallax matching cost calculation on each of the left and right eye images to obtain a plurality of parallax matching cost values;
aggregating all the parallax matching cost values in multiple directions, and calculating the argmax (the argument of the maximum) to obtain a parallax map corresponding to the maximum parallax matching cost value;
and carrying out normalization processing on the disparity map, and carrying out sub-pixel disparity calculation to obtain a stereo matching result.
Further, after acquiring the continuous multi-frame left and right eye images in the target area, the method further comprises:
performing Gaussian filtering calculation on the left and right eye images by using a convolution operator of the deep learning network, so as to extract high-dimensional feature information of the images and improve the accuracy of the matching cost calculation.
Further, the disparity matching cost calculation is performed on each of the left and right eye images by using the following structural similarity (SSIM) formula:

$$\mathrm{SSIM}(x, y) = \frac{(2\mu_x \mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)}$$

where $x$ is the gray value of a pixel point of the left eye image, $y$ is the gray value of a pixel point of the right eye image, $\mu_x$ is the mean of $x$, $\mu_y$ is the mean of $y$, $\sigma_x^2$ is the variance of $x$, $\sigma_y^2$ is the variance of $y$, $\sigma_{xy}$ is the covariance of $x$ and $y$, and $C_1$ and $C_2$ are constants.

Further, the constants $C_1$ and $C_2$ are calculated using the following formula:

$$C_1 = (K_1 L)^2, \qquad C_2 = (K_2 L)^2$$

where $L$ is the dynamic range of the pixel values, $K_1 = 0.01$, and $K_2 = 0.03$.
further, a plurality of directions of aggregation are performed on each of the disparity matching cost values using the following formula:
Figure 599015DEST_PATH_IMAGE014
wherein the content of the first and second substances,
Figure 880961DEST_PATH_IMAGE015
representing the aggregate cost value in the direction of propagation of r, representing the direction of propagation,
Figure 268080DEST_PATH_IMAGE016
the value of the matching cost is represented,
Figure 637881DEST_PATH_IMAGE017
representing the aggregate cost value of p pixel points on d parallax in the r propagation direction,
Figure 528477DEST_PATH_IMAGE018
representing the aggregation cost of the previous pixel point of the p pixel points in the r propagation direction on the d parallax,
Figure 856690DEST_PATH_IMAGE019
representing the aggregation cost of the previous pixel point of the p pixel points in the r propagation direction on the d-1 parallax,
Figure 465526DEST_PATH_IMAGE020
representing the aggregation cost of the previous pixel point of the p pixel points in the r propagation direction on the d +1 parallax,
Figure 639018DEST_PATH_IMAGE021
and the aggregation cost of the previous pixel point of the P pixel points in the r propagation direction in the parallax i with the minimum aggregation cost is represented, and P1 and P2 represent penalty terms.
Further, the normalizing the disparity map and performing sub-pixel disparity calculation to obtain a stereo matching result specifically includes:
based on the disparity map, performing softmax normalization processing on each pixel point to obtain a normalization result corresponding to the position;
multiplying the parallax value of each point by the normalization result of the corresponding position to obtain a sub-pixel parallax value;
and taking the sub-pixel parallax value as a stereo matching result.
The invention also provides a stereo matching system based on the binocular camera, which comprises:
the image acquisition unit is used for acquiring continuous multi-frame left and right eye images in a target area;
the cost value calculation unit is used for performing parallax matching cost calculation on each left and right eye image to obtain a plurality of parallax matching cost values;
the cost value aggregation unit is used for aggregating each parallax matching cost value in multiple directions and calculating the argmax (the argument of the maximum) to obtain a parallax map corresponding to the maximum parallax matching cost value;
and the result output unit is used for carrying out normalization processing on the disparity map and carrying out sub-pixel disparity calculation to obtain a stereo matching result.
Further, the image acquisition unit is specifically configured to:
and performing Gaussian filtering calculation on the left and right eye images by using a convolution operator of the deep learning network, so as to extract high-dimensional feature information of the images and improve the accuracy of the matching cost calculation.
The present invention also provides an intelligent terminal, including: the device comprises a data acquisition device, a processor and a memory;
the data acquisition device is used for acquiring data; the memory is to store one or more program instructions; the processor is configured to execute one or more program instructions to perform the method as described above.
The present invention also provides a computer readable storage medium having embodied therein one or more program instructions for executing the method as described above.
According to the binocular camera-based stereo matching method, continuous multi-frame left and right eye images of a target area are collected, and parallax matching cost calculation is performed on each pair of left and right eye images to obtain a plurality of parallax matching cost values; the parallax matching cost values are aggregated in multiple directions and the argmax is calculated to obtain the parallax map corresponding to the maximum parallax matching cost value; and the parallax map is normalized and sub-pixel parallax calculation is performed to obtain the stereo matching result. Thus, through matching cost calculation, matching cost aggregation, parallax calculation, and parallax refinement, the binocular camera-based stereo matching method provided by the invention can be effectively deployed on a deep-learning chip, satisfies both runtime and quality requirements, generalizes better, achieves higher sub-pixel precision, and improves the algorithmic precision and generalization capability of stereo matching, thereby providing an algorithmic basis for improving the accuracy of automatic driving data.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below. It should be apparent that the drawings in the following description are merely exemplary and that other implementation drawings may be derived from the provided drawings by those of ordinary skill in the art without inventive effort.
The structures, ratios, sizes, and the like shown in this specification are used only to complement the disclosed content for the understanding of those skilled in the art; they do not limit the conditions under which the invention can be implemented and carry no limiting technical significance of their own. Any structural modification, change of proportion, or adjustment of size that does not affect the effects achievable by the invention still falls within the scope covered by the disclosed technical content.
Fig. 1 is a flowchart of a binocular camera-based stereo matching method according to an embodiment of the present invention;
fig. 2 is a block diagram illustrating an embodiment of the binocular camera based stereo matching system according to the present invention.
Detailed Description
The present invention is described by way of particular embodiments, and other advantages and features of the invention will become apparent to those skilled in the art from the following disclosure. It is to be understood that the described embodiments are merely exemplary and are not intended to limit the invention to them. All other embodiments that a person skilled in the art can derive from the embodiments given herein without creative effort fall within the protection scope of the present invention.
In order to suit a deep-learning chip while maintaining the generalization capability and precision of the algorithm, the binocular camera-based stereo matching method provided by the invention comprises matching cost calculation, matching cost aggregation, parallax calculation, and parallax refinement: SSIM features are computed on the images using convolution to obtain the matching cost values, the matching costs are then aggregated using convolution and max pooling, and the final sub-pixel parallax is obtained with soft argmax. This ensures the precision and generalization capability of the algorithm while allowing effective deployment on a deep-learning chip, so as to provide an algorithmic basis for improving the accuracy of automatic driving data.
In one embodiment, as shown in fig. 1, the binocular camera-based stereo matching method provided by the present invention includes the following steps:
s1: acquiring continuous multi-frame left and right eye images in a target area, and performing Gaussian filtering calculation on the left and right eye images by using a convolution operator of a deep learning network to extract high-dimensional feature information of the images and improve the accuracy of matching cost calculation in order to improve the image effect.
S2: and carrying out parallax matching cost calculation on each left and right eye image to obtain a plurality of parallax matching cost values.
Specifically, in order to ensure the accuracy of the parallax matching cost values, the disparity matching cost is preferably calculated for each pair of left and right eye images using the following structural similarity (SSIM) formula:

$$\mathrm{SSIM}(x, y) = \frac{(2\mu_x \mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)}$$

where $x$ is the gray value of a pixel point of the left eye image, $y$ is the gray value of a pixel point of the right eye image, $\mu_x$ is the mean of $x$, $\mu_y$ is the mean of $y$, $\sigma_x^2$ is the variance of $x$, $\sigma_y^2$ is the variance of $y$, $\sigma_{xy}$ is the covariance of $x$ and $y$, and $C_1$ and $C_2$ are constants.

The constants $C_1$ and $C_2$ are calculated using the following formula:

$$C_1 = (K_1 L)^2, \qquad C_2 = (K_2 L)^2$$

where $L$ is the dynamic range of the pixel values, $K_1 = 0.01$, and $K_2 = 0.03$.
specifically, the structural similarity range is-1 to 1, and as an implementation of the structural similarity theory, the structural similarity index defines structural information from the perspective of image composition as being independent of brightness and contrast, reflects attributes of structures of objects in a scene, and models distortion as a combination of three different factors of brightness, contrast, and structure.
S3: Aggregating all the parallax matching cost values in multiple directions, and calculating the argmax (the argument of the maximum) to obtain a parallax map corresponding to the maximum parallax matching cost value.
In an actual use scenario, the disparity matching cost values may be aggregated in multiple directions by using the following formula:

$$L_r(p, d) = C(p, d) + \min\big(L_r(p - r, d),\ L_r(p - r, d - 1) + P_1,\ L_r(p - r, d + 1) + P_1,\ \min_i L_r(p - r, i) + P_2\big) - \min_i L_r(p - r, i)$$

where $r$ denotes the propagation direction, $C(p, d)$ denotes the matching cost value, $L_r(p, d)$ denotes the aggregate cost value of pixel point $p$ at disparity $d$ in the propagation direction $r$, $L_r(p - r, d)$ denotes the aggregate cost at disparity $d$ of the pixel point preceding $p$ in the propagation direction $r$, $L_r(p - r, d - 1)$ and $L_r(p - r, d + 1)$ denote the aggregate cost of that preceding pixel point at disparities $d - 1$ and $d + 1$, $\min_i L_r(p - r, i)$ denotes the aggregate cost of that preceding pixel point at the disparity $i$ with the minimum aggregate cost over all disparities, and $P_1$ and $P_2$ denote penalty terms.
In particular, cost aggregation can be performed along each of the different directions, and the results are combined into the final aggregated cost value; in the invention, the same propagation process is completed using convolution operations. Taking the left-to-right direction as an example: the matching cost volume $C(p, d)$ obtained in the previous step is divided into $H$ slices of size $W \times D_{\max}$, where $D_{\max}$ is the maximum disparity. For each slice, the additive penalty terms $P_1$ and $P_2$ are converted into multiplicative form, the convolution kernel weights are composed from the penalty terms, and a convolution kernel with the corresponding penalty weights is set for each disparity $d$, finally yielding the propagated aggregate cost $L_r(p, d)$.
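A minimal left-to-right pass of the aggregation recursion can be sketched as follows. This is the classical semi-global matching form, where lower cost is better; the patent instead propagates an SSIM-style score with convolution operators on a deep-learning chip, so this plain-loop version is only an illustration, and the P1, P2 values are arbitrary:

```python
import numpy as np

def aggregate_left_to_right(cost, P1=10.0, P2=120.0):
    """One-direction SGM aggregation over an H x W x D cost volume."""
    H, W, D = cost.shape
    L = np.zeros_like(cost, dtype=np.float64)
    L[:, 0, :] = cost[:, 0, :]  # first column: no predecessor
    for x in range(1, W):
        prev = L[:, x - 1, :]                       # L_r(p - r, .)
        prev_min = prev.min(axis=1, keepdims=True)  # min_i L_r(p - r, i)
        # candidate transitions: same d, d-1 (+P1), d+1 (+P1), any d (+P2)
        d_minus = np.pad(prev, ((0, 0), (1, 0)), constant_values=np.inf)[:, :D] + P1
        d_plus = np.pad(prev, ((0, 0), (0, 1)), constant_values=np.inf)[:, 1:] + P1
        best = np.minimum(np.minimum(prev, d_minus),
                          np.minimum(d_plus, prev_min + P2))
        # subtracting prev_min bounds the growth of the aggregated values
        L[:, x, :] = cost[:, x, :] + best - prev_min
    return L
```

Summing the outputs of several such passes (left-right, right-left, top-down, and so on) gives the final aggregated cost volume from which the disparity map is read out.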
S4: and carrying out normalization processing on the disparity map, and carrying out sub-pixel disparity calculation to obtain a stereo matching result.
Step S4 specifically includes:
based on the disparity map, performing softmax normalization processing on each pixel point to obtain a normalization result corresponding to the position;
multiplying the parallax value of each point by the normalization result of the corresponding position to obtain a sub-pixel parallax value;
and taking the sub-pixel parallax value as a stereo matching result.
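The three sub-steps above amount to a soft argmax over the disparity axis. A minimal sketch, assuming an H x W x D volume of aggregated matching scores where a larger value means a better match (as with the SSIM-style cost):

```python
import numpy as np

def soft_argmax_disparity(scores):
    """Sub-pixel disparity map from an H x W x D score volume.

    softmax normalizes each pixel's scores over the D disparity hypotheses;
    the expectation sum_d d * softmax(scores)_d then yields a continuous
    (sub-pixel) disparity value per pixel, and is differentiable.
    """
    shifted = scores - scores.max(axis=2, keepdims=True)  # numerical stability
    weights = np.exp(shifted)
    weights /= weights.sum(axis=2, keepdims=True)         # softmax over d
    d = np.arange(scores.shape[2], dtype=np.float64)
    return (weights * d).sum(axis=2)
```

Because every operation here (softmax, elementwise multiply, sum) is a standard network operator, this refinement step maps directly onto a deep-learning chip.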
In an actual use scene, the purpose of stereo matching is to estimate the correspondence of all pixel points between two rectified images. Given a pair of rectified stereo images, the purpose of disparity estimation is to calculate the disparity d of each pixel in the reference image. Disparity refers to the horizontal displacement between a pair of corresponding points in the reference image and the target image. For a point with pixel coordinates (x, y) in the reference image, if the corresponding pixel point is found at (x - d, y) in the target image, the depth of the point can be calculated as f × b / d, where f is the focal length of the camera and b is the baseline, i.e. the distance between the two cameras.
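The depth relation at the end of this paragraph is simple enough to check numerically; the focal length and baseline in the example comment are made-up values for illustration, not parameters from the patent:

```python
def depth_from_disparity(d_px, f_px, baseline_m):
    """Depth Z = f * b / d for a rectified stereo pair.

    d_px: disparity in pixels, f_px: focal length in pixels,
    baseline_m: distance between the two cameras in meters.
    """
    if d_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return f_px * baseline_m / d_px

# Hypothetical rig: f = 800 px, b = 0.12 m, disparity d = 4 px gives Z = 24 m.
```

Since depth is inversely proportional to disparity, a fixed disparity error grows quadratically in depth error with distance, which is why the sub-pixel precision emphasized by the method matters most for distant objects.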
In the above specific embodiment, the binocular camera-based stereo matching method provided by the invention collects continuous multi-frame left and right eye images of a target area and performs parallax matching cost calculation on each pair of left and right eye images to obtain a plurality of parallax matching cost values; aggregates the parallax matching cost values in multiple directions and calculates the argmax to obtain the parallax map corresponding to the maximum parallax matching cost value; and normalizes the parallax map and performs sub-pixel parallax calculation to obtain the stereo matching result. Thus, through matching cost calculation, matching cost aggregation, parallax calculation, and parallax refinement, the binocular camera-based stereo matching method provided by the invention can be effectively deployed on a deep-learning chip, satisfies both runtime and quality requirements, generalizes better, achieves higher sub-pixel precision, and improves the algorithmic precision and generalization capability of stereo matching, thereby providing an algorithmic basis for improving the accuracy of automatic driving data.
In addition to the above method, the present invention also provides a binocular camera-based stereo matching system, as shown in fig. 2, the system comprising:
the image acquisition unit 100 is used for acquiring continuous multiframe left and right eye images in a target area;
a cost value calculation unit 200, configured to perform disparity matching cost calculation on each of the left and right eye images to obtain a plurality of disparity matching cost values;
a cost value aggregation unit 300, configured to aggregate each disparity matching cost value in multiple directions and calculate the argmax (the argument of the maximum), so as to obtain a disparity map corresponding to the maximum disparity matching cost value;
and a result output unit 400, configured to perform normalization processing on the disparity map, and perform sub-pixel disparity calculation to obtain a stereo matching result.
Wherein, the image acquisition unit is specifically used for:
and performing Gaussian filtering calculation on the left and right eye images by using a convolution operator of the deep learning network, so as to extract high-dimensional feature information of the images and improve the accuracy of the matching cost calculation.
In the above specific embodiment, the binocular camera-based stereo matching system provided by the invention collects continuous multi-frame left and right eye images of a target area and performs parallax matching cost calculation on each pair of left and right eye images to obtain a plurality of parallax matching cost values; aggregates the parallax matching cost values in multiple directions and calculates the argmax to obtain the parallax map corresponding to the maximum parallax matching cost value; and normalizes the parallax map and performs sub-pixel parallax calculation to obtain the stereo matching result. Thus, through matching cost calculation, matching cost aggregation, parallax calculation, and parallax refinement, the binocular camera-based stereo matching system provided by the invention can be effectively deployed on a deep-learning chip, satisfies both runtime and quality requirements, generalizes better, achieves higher sub-pixel precision, and improves the algorithmic precision and generalization capability of stereo matching, thereby providing an algorithmic basis for improving the accuracy of automatic driving data.
The present invention also provides an intelligent terminal, including: the device comprises a data acquisition device, a processor and a memory;
the data acquisition device is used for acquiring data; the memory is to store one or more program instructions; the processor is configured to execute one or more program instructions to perform the method as described above.
In correspondence with the above embodiments, the present invention also provides a computer-readable storage medium containing one or more program instructions. The one or more program instructions are used by the binocular camera-based stereo matching system to execute the method described above.
In an embodiment of the invention, the processor may be an integrated circuit chip having signal processing capability. The Processor may be a general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete Gate or transistor logic device, discrete hardware component.
The various methods, steps, and logic blocks disclosed in the embodiments of the present invention may be implemented or performed. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the methods disclosed in connection with the embodiments of the present invention may be executed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software module may be located in RAM, flash memory, ROM, PROM, EPROM, registers, or other storage media well known in the art. The processor reads the information in the storage medium and completes the steps of the method in combination with its hardware.
The storage medium may be a memory, for example, which may be volatile memory or nonvolatile memory, or may include both volatile and nonvolatile memory.
The nonvolatile Memory may be a Read-Only Memory (ROM), a Programmable ROM (PROM), an Erasable PROM (EPROM), an Electrically Erasable PROM (EEPROM), or a flash Memory.
The volatile memory may be a Random Access Memory (RAM), which serves as an external cache. By way of example and not limitation, many forms of RAM are available, such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), SyncLink DRAM (SLDRAM), and Direct Rambus RAM (DRRAM).
The storage media described in connection with the embodiments of the invention are intended to comprise, without being limited to, these and any other suitable types of memory.
Those skilled in the art will appreciate that the functionality described in the present invention may be implemented in hardware, software, or a combination of both in one or more of the examples described above. When implemented in software, the corresponding functions may be stored on, or transmitted over, a computer-readable medium as one or more instructions or code. Computer-readable media include both computer storage media and communication media, including any medium that facilitates transfer of a computer program from one place to another. A storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer.
The above embodiments are only for illustrating the embodiments of the present invention and are not to be construed as limiting the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made on the basis of the embodiments of the present invention shall be included in the scope of the present invention.

Claims (10)

1. A stereo matching method based on binocular cameras is characterized by comprising the following steps:
collecting continuous multi-frame left and right eye images in a target area;
performing parallax matching cost calculation on each of the left and right eye images to obtain a plurality of parallax matching cost values;
aggregating all the parallax matching cost values in multiple directions, and calculating the argmax (the argument of the maximum) to obtain a parallax map corresponding to the maximum parallax matching cost value;
and carrying out normalization processing on the disparity map, and carrying out sub-pixel disparity calculation to obtain a stereo matching result.
2. The stereo matching method according to claim 1, wherein a plurality of consecutive frames of left and right eye images in the target region are collected, and thereafter further comprising:
and performing Gaussian filtering calculation on the left and right eye images by using a convolution operator of the deep learning network, so as to extract high-dimensional feature information of the images and improve the accuracy of the matching cost calculation.
3. The stereo matching method according to claim 1, wherein the disparity matching cost calculation is performed for each of the left and right eye images using the following formula:
SSIM(x, y) = [(2·μ_x·μ_y + C1)(2·σ_xy + C2)] / [(μ_x² + μ_y² + C1)(σ_x² + σ_y² + C2)]
wherein x is the gray value of a pixel point of the left eye image, y is the gray value of the corresponding pixel point of the right eye image, μ_x is the mean of x, μ_y is the mean of y, σ_x² is the variance of x, σ_y² is the variance of y, σ_xy is the covariance of x and y, and C1 and C2 are constants.
4. The stereo matching method according to claim 3, wherein the constants C1 and C2 are calculated using the following formulas:
C1 = (K1·L)², C2 = (K2·L)²
where L is the dynamic range of the pixel values, K1 = 0.01, and K2 = 0.03.
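The SSIM-based matching cost of claims 3-4 can be sketched as follows. This is an illustrative reading of the claims, not the patent's implementation; the patch size and the use of 1 − SSIM as the cost value are assumptions:

```python
import numpy as np

def ssim_cost(x, y, L=255, K1=0.01, K2=0.03):
    """SSIM between two equally sized grayscale patches x and y.
    Returns 1 - SSIM so that identical patches give cost 0."""
    C1, C2 = (K1 * L) ** 2, (K2 * L) ** 2   # stabilizing constants
    mx, my = x.mean(), y.mean()             # means
    vx, vy = x.var(), y.var()               # variances
    cov = ((x - mx) * (y - my)).mean()      # covariance
    ssim = ((2 * mx * my + C1) * (2 * cov + C2)) / \
           ((mx ** 2 + my ** 2 + C1) * (vx + vy + C2))
    return 1.0 - ssim
```

In a full stereo pipeline this cost would be evaluated for each candidate disparity d, comparing a left-image patch at (u, v) against the right-image patch at (u − d, v).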
5. The stereo matching method according to claim 1, wherein the disparity matching cost values are aggregated in a plurality of directions using the following formula:
L_r(p, d) = C(p, d) + min( L_r(p−r, d), L_r(p−r, d−1) + P1, L_r(p−r, d+1) + P1, min_i L_r(p−r, i) + P2 ) − min_i L_r(p−r, i)
wherein r represents the propagation direction, C(p, d) represents the matching cost value, L_r(p, d) represents the aggregated cost value of pixel point p at disparity d in the r propagation direction, L_r(p−r, d) represents the aggregated cost of the previous pixel point of p in the r propagation direction at disparity d, L_r(p−r, d−1) represents the aggregated cost of the previous pixel point at disparity d−1, L_r(p−r, d+1) represents the aggregated cost of the previous pixel point at disparity d+1, min_i L_r(p−r, i) represents the minimum aggregated cost of the previous pixel point of p in the r propagation direction over all disparities i, and P1 and P2 represent penalty terms.
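The recurrence above can be sketched for a single propagation direction (left to right along a scanline). This is an illustrative sketch of semi-global-style aggregation, not the patent's implementation; the penalty values P1 and P2 are assumed defaults:

```python
import numpy as np

def aggregate_1d(cost, P1=10.0, P2=120.0):
    """Aggregate a cost slice of shape (W, D) along one scanline,
    i.e. the r = (1, 0) propagation direction of the recurrence.
    P1 penalizes a 1-disparity change, P2 any larger jump."""
    W, D = cost.shape
    L = np.empty((W, D), dtype=float)
    L[0] = cost[0]                      # no predecessor at the border
    for p in range(1, W):
        prev = L[p - 1]
        min_prev = prev.min()           # min_i L_r(p-r, i)
        minus = np.roll(prev, 1)        # L_r(p-r, d-1)
        minus[0] = np.inf               # no d-1 neighbour at d = 0
        plus = np.roll(prev, -1)        # L_r(p-r, d+1)
        plus[-1] = np.inf               # no d+1 neighbour at d = D-1
        L[p] = cost[p] + np.minimum.reduce([
            prev,                       # same disparity, no penalty
            minus + P1,
            plus + P1,
            np.full(D, min_prev + P2),  # arbitrary jump, penalty P2
        ]) - min_prev                   # keep values bounded
    return L
```

A full multi-direction aggregation would run this recurrence along several directions (e.g. 4 or 8) and sum the resulting L_r volumes before selecting the best disparity per pixel.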
6. The stereo matching method according to claim 1, wherein normalizing the disparity map and performing sub-pixel disparity calculation to obtain a stereo matching result specifically comprises:
performing softmax normalization on each pixel point of the disparity map to obtain a normalization result for the corresponding position;
multiplying the disparity value of each point by the normalization result of the corresponding position to obtain a sub-pixel disparity value;
and taking the sub-pixel disparity value as the stereo matching result.
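The softmax-weighted sub-pixel step of claim 6 corresponds to the common "soft argmax" technique: a softmax over the (negated) aggregated costs yields per-disparity weights, and the weighted sum of disparity values gives a sub-pixel estimate. A minimal sketch under that reading, not the patent's implementation:

```python
import numpy as np

def subpixel_disparity(agg_cost):
    """agg_cost: (H, W, D) aggregated cost volume (lower = better).
    Softmax over -cost turns each pixel's cost curve into a
    probability distribution over disparities; the expectation
    sum_d d * p(d) is the sub-pixel disparity."""
    neg = -agg_cost
    neg = neg - neg.max(axis=-1, keepdims=True)   # numerical stability
    w = np.exp(neg)
    w /= w.sum(axis=-1, keepdims=True)            # softmax normalization
    d = np.arange(agg_cost.shape[-1], dtype=float)
    return (w * d).sum(axis=-1)                   # expected disparity
```

When one disparity's cost is clearly lowest, the softmax weights concentrate there and the result approaches the integer argmin; near ties between neighbouring disparities interpolate smoothly between them.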
7. A binocular camera based stereo matching system, the system comprising:
the image acquisition unit is used for acquiring continuous multiframe left and right eye images in a target area;
the cost value calculation unit is used for performing parallax matching cost calculation on each left and right eye image to obtain a plurality of parallax matching cost values;
the cost value aggregation unit is used for aggregating the parallax matching cost values in multiple directions and obtaining, through an argmax (maximum independent variable point set) calculation, the disparity map corresponding to the maximum parallax matching cost value;
and the result output unit is used for carrying out normalization processing on the disparity map and carrying out sub-pixel disparity calculation to obtain a stereo matching result.
8. The stereo matching system according to claim 7, wherein the image acquisition unit is specifically configured to:
perform Gaussian filtering calculation on the left and right eye images by using a convolution operator of a deep learning network, so as to extract high-dimensional feature information of the images and improve the accuracy of the matching cost calculation.
9. An intelligent terminal, characterized in that, intelligent terminal includes: the device comprises a data acquisition device, a processor and a memory;
the data acquisition device is used for acquiring data; the memory is to store one or more program instructions; the processor, for executing one or more program instructions to perform the method of any one of claims 1-6.
10. A computer-readable storage medium having one or more program instructions embodied therein for performing the method of any of claims 1-6.
CN202210709579.3A 2022-06-22 2022-06-22 Stereo matching method and system based on binocular camera Pending CN114926521A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210709579.3A CN114926521A (en) 2022-06-22 2022-06-22 Stereo matching method and system based on binocular camera

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210709579.3A CN114926521A (en) 2022-06-22 2022-06-22 Stereo matching method and system based on binocular camera

Publications (1)

Publication Number Publication Date
CN114926521A true CN114926521A (en) 2022-08-19

Family

ID=82815338

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210709579.3A Pending CN114926521A (en) 2022-06-22 2022-06-22 Stereo matching method and system based on binocular camera

Country Status (1)

Country Link
CN (1) CN114926521A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115100267A (en) * 2022-08-29 2022-09-23 北京中科慧眼科技有限公司 Stereo matching method and system based on deep learning operator
CN117078984A (en) * 2023-10-17 2023-11-17 腾讯科技(深圳)有限公司 Binocular image processing method and device, electronic equipment and storage medium
CN117078984B (en) * 2023-10-17 2024-02-02 腾讯科技(深圳)有限公司 Binocular image processing method and device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN114926521A (en) Stereo matching method and system based on binocular camera
CN111210429B (en) Point cloud data partitioning method and device and obstacle detection method and device
CN111160232B (en) Front face reconstruction method, device and system
CN110926408A (en) Short-distance measuring method, device and system based on characteristic object and storage medium
CN113128347B (en) Obstacle target classification method and system based on RGB-D fusion information and intelligent terminal
CN111582054A (en) Point cloud data processing method and device and obstacle detection method and device
CN113965742B (en) Dense disparity map extraction method and system based on multi-sensor fusion and intelligent terminal
US20220277470A1 (en) Method and system for detecting long-distance target through binocular camera, and intelligent terminal
US20200191971A1 (en) Method and System for Vehicle Detection Using LIDAR
CN114119777B (en) Stereo matching method and system based on deep learning
CN111553946A (en) Method and device for removing ground point cloud and obstacle detection method and device
CN113140002B (en) Road condition detection method and system based on binocular stereo camera and intelligent terminal
CN115329111A (en) Image feature library construction method and system based on point cloud and image matching
JP2023021087A (en) Binocular image matching method, apparatus, device, and storage medium
CN113034666B (en) Stereo matching method based on pyramid parallax optimization cost calculation
CN113792752A (en) Image feature extraction method and system based on binocular camera and intelligent terminal
CN113792583A (en) Obstacle detection method and system based on drivable area and intelligent terminal
CN112489097A (en) Stereo matching method based on mixed 2D convolution and pseudo 3D convolution
CN114998412B (en) Shadow region parallax calculation method and system based on depth network and binocular vision
CN113763303B (en) Real-time ground fusion method and system based on binocular stereo vision and intelligent terminal
CN114972470A (en) Road surface environment obtaining method and system based on binocular vision
CN116258758A (en) Binocular depth estimation method and system based on attention mechanism and multistage cost body
CN115100621A (en) Ground scene detection method and system based on deep learning network
CN114049510A (en) Binocular camera stereo matching algorithm and system based on loss function and intelligent terminal
WO2021087812A1 (en) Method for determining depth value of image, image processor and module

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination