CN112233163A - Depth estimation method and device for laser radar stereo camera fusion and medium thereof - Google Patents
Depth estimation method and device for laser radar stereo camera fusion and medium thereof
- Publication number
- CN112233163A (application number CN202011464746.XA)
- Authority
- CN
- China
- Prior art keywords
- radar
- image
- left image
- right image
- current frame
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/50—Depth or shape recovery
- G06T7/521—Depth or shape recovery from laser ranging, e.g. using interferometry; from the projection of structured light
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/50—Image enhancement or restoration using two or more images, e.g. averaging or subtraction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/80—Geometric correction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/50—Depth or shape recovery
- G06T7/55—Depth or shape recovery from multiple images
- G06T7/593—Depth or shape recovery from multiple images from stereo images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10032—Satellite or aerial image; Remote sensing
- G06T2207/10044—Radar image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20212—Image combination
- G06T2207/20221—Image fusion; Image merging
Abstract
The invention discloses a depth estimation method, device, and medium for laser radar and stereo camera fusion, wherein the method comprises the following steps: acquiring a current frame left image and a current frame right image of a stereo camera; acquiring a radar left image and a radar right image; fusing the current frame left image with the radar left image to obtain a first left image; fusing the current frame right image with the radar right image to obtain a first right image; inputting the first left image into a binary neural network for feature extraction and aggregating to obtain a first feature left image; inputting the first right image into a binary neural network for feature extraction and aggregating to obtain a first feature right image; acquiring the initial matching cost between the first feature left image and the first feature right image; optimizing the initial matching cost and extracting a disparity map based on cross-based radar trust aggregation and a semi-global stereo matching algorithm; and performing depth estimation according to the disparity map. The method can obtain accurate and reliable depth prediction and can be widely applied in the technical field of image processing.
Description
Technical Field
The invention relates to the field of image processing and computer vision, and in particular to a depth estimation method and device for laser radar and stereo camera fusion, and a medium thereof.
Background
The laser radar is one of the important sensors for environment perception in mobile robots and autonomous driving, and is well suited to perceiving complex traffic environments. The depth measurements it provides are highly accurate, but its resolution is low: the resulting depth map is very sparse and small targets are easily missed. Binocular stereo vision is an important branch of computer vision and is widely applied in unmanned driving technology, but because it is strongly affected by environmental factors such as field of view and illumination, the accuracy of the depth map it produces is low. Existing methods based on deep neural networks cannot meet the requirement of real-time and accurate depth estimation, and a suitable solution for fusing radar measurements with a stereo matching algorithm is lacking.
Disclosure of Invention
The present invention is directed to solving at least one of the problems of the prior art. Therefore, the invention provides a depth estimation method and device for laser radar stereo camera fusion and a medium thereof.
The technical scheme adopted by the invention is as follows:
in one aspect, an embodiment of the present invention includes a depth estimation method for laser radar stereo camera fusion, including:
acquiring a current frame left image and a current frame right image of a stereo camera;
acquiring a radar left image and a radar right image, wherein the radar left image and the current frame left image correspond to images of the same part of the same object, and the radar right image and the current frame right image correspond to images of the same part of the same object;
fusing the current frame left image and the radar left image to obtain a first left image;
fusing the current frame right image and the radar right image to obtain a first right image;
inputting the first left image into a binary neural network for feature extraction, and aggregating to obtain a first feature left image;
inputting the first right image into a binary neural network for feature extraction, and aggregating to obtain a first feature right image;
acquiring an initial matching cost between the first characteristic left image and the first characteristic right image;
optimizing the initial matching cost and extracting a disparity map based on cross-based radar trust aggregation and a semi-global stereo matching algorithm;
and performing depth estimation according to the disparity map.
Further, the method further comprises:
and simultaneously shooting calibration objects in different postures and different positions by using the stereo camera and the laser radar.
Further, the stereo camera includes a left camera and a right camera, and after acquiring a current frame left image and a current frame right image of the stereo camera, the method further includes:
carrying out deformation correction on the current frame left image according to the distortion parameter of the left camera;
and carrying out deformation correction on the right image of the current frame according to the distortion parameter of the right camera.
Further, the step of acquiring a radar left image and a radar right image specifically includes:
acquiring a mapping chart shot by the laser radar;
compressing the map and dividing the map into a radar left image and a radar right image.
Further, the fusing the current frame left image and the radar left image to obtain a first left image specifically includes:
and fusing the current frame left image and the radar left image along a fusion channel according to the image size to obtain a first left image.
Further, the fusing the current frame right image and the radar right image to obtain a first right image specifically includes:
and fusing the current frame right image and the radar right image along a fusion channel according to the image size to obtain a first right image.
Further, the step of obtaining an initial matching cost between the first feature left image and the first feature right image specifically includes:
calculating a similarity measure between the first feature left image and the first feature right image by a weighted hamming distance method;
and acquiring an initial matching cost between the first characteristic left image and the first characteristic right image according to the similarity measurement.
Further, the step of optimizing the initial matching cost based on cross-based radar trust aggregation and a semi-global stereo matching algorithm includes:
determining a first target point in the radar left image, and drawing a cross-shaped graph through the first target point, wherein the first target point is any effective point in the radar left image, and the effective point is a point of which the point value is greater than zero;
acquiring a first distance through a first formula, wherein the first distance is the longest distance extending from a second target point in the vertical direction or the horizontal direction, and the second target point is the point in the current frame left image corresponding to the first target point; the first formula is: r* = max_{r∈[1,R]} ( r · ∏_{i∈[1,r]} δ(p, p_i) ), where r* represents the first distance, p represents the coordinates of the second target point, p_i represents the coordinates of the i-th point in the vertical or horizontal direction from the second target point, R is the maximum search range, and δ(p, p_i) is an indicator function indicating whether the difference in pixel intensity between coordinates p and p_i is less than a threshold; wherein δ is calculated by a second formula: δ(p_1, p_2) = 1 if |I(p_1) − I(p_2)| ≤ τ, and δ(p_1, p_2) = 0 otherwise; in the formula, I(p_1) represents the pixel intensity at coordinate p_1, I(p_2) represents the pixel intensity at coordinate p_2, |I(p_1) − I(p_2)| represents the absolute difference between the pixel intensities at coordinates p_1 and p_2, and τ represents the threshold on the pixel intensity difference;
optimizing the initial matching cost through a third formula according to the first distance r*, wherein the third formula is: C*(q, d) = 0 if |q − p| ≤ r*, δ(p, q) = 1 and d = d_p, and C*(q, d) = C(q, d) otherwise; in the formula, q represents a point coordinate, d_p represents the disparity at coordinate p in the radar left image, the coordinate p of the first target point is numerically identical to the coordinate of the second target point, |q − p| represents the distance between point coordinate q and coordinate p in the vertical or horizontal direction, δ(p, q) is the indicator function indicating whether the difference in pixel intensity between coordinate p and point coordinate q is less than the threshold, C*(q, d) represents the optimized matching cost of point coordinate q at disparity d, and C(q, d) represents the initial matching cost of point coordinate q at disparity d obtained by the weighted Hamming distance method;
and extracting the disparity map by a semi-global stereo matching algorithm according to the optimized matching cost.
On the other hand, the embodiment of the invention also comprises a depth estimation device for the fusion of the laser radar stereo camera, which comprises the following steps:
at least one processor;
at least one memory for storing at least one program;
when the at least one program is executed by the at least one processor, the at least one processor is caused to implement the depth estimation method.
In another aspect, the embodiments of the present invention further include a computer-readable storage medium on which a program executable by a processor is stored, the program executable by the processor being used for implementing the depth estimation method when being executed by the processor.
The invention has the beneficial effects that:
(1) by effectively fusing the laser radar and the stereo camera, accurate and reliable depth prediction can be obtained;
(2) by utilizing the binary neural network to extract the features of the two images simultaneously, accuracy is ensured while the speed is greatly improved;
(3) by means of cross-based radar trust aggregation, the depth information obtained by laser radar shooting is utilized to the maximum extent, thereby achieving a good improvement in accuracy.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a flowchart illustrating steps of a method for depth estimation with laser radar stereo camera fusion according to an embodiment of the present invention;
fig. 2 is a block diagram of a depth estimation method for lidar stereo camera fusion according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a binary neural network according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of cross-based radar trust aggregation according to an embodiment of the present invention;
fig. 5 is a schematic structural diagram of a depth estimation device with a laser radar stereo camera fused according to an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention.
In the description of the present invention, it should be understood that the orientation or positional relationship referred to in the description of the orientation, such as the upper, lower, front, rear, left, right, etc., is based on the orientation or positional relationship shown in the drawings, and is only for convenience of description and simplification of description, and does not indicate or imply that the device or element referred to must have a specific orientation, be constructed and operated in a specific orientation, and thus, should not be construed as limiting the present invention.
In the description of the present invention, "several" means one or more, "a plurality" means two or more, and terms such as "greater than", "less than" and "exceeding" are understood as excluding the stated number, while terms such as "above", "below" and "within" are understood as including the stated number. If "first" and "second" are used, they are only for the purpose of distinguishing technical features and are not to be understood as indicating or implying relative importance, implicitly indicating the number of technical features indicated, or implicitly indicating the precedence of the technical features indicated.
In the description of the present invention, unless otherwise explicitly limited, terms such as arrangement, installation, connection and the like should be understood in a broad sense, and those skilled in the art can reasonably determine the specific meanings of the above terms in the present invention in combination with the specific contents of the technical solutions.
The embodiments of the present application will be further explained with reference to the drawings.
Referring to fig. 1, an embodiment of the present invention includes a method for depth estimation of lidar stereo camera fusion, including but not limited to the following steps:
s1, acquiring a current frame left image and a current frame right image of a stereo camera;
s2, acquiring a radar left image and a radar right image, wherein the radar left image and the current frame left image correspond to images of the same part of the same object, and the radar right image and the current frame right image correspond to images of the same part of the same object;
s3, fusing the current frame left image and the radar left image to obtain a first left image;
s4, fusing the current frame right image and the radar right image to obtain a first right image;
s5, inputting the first left image into a binary neural network for feature extraction, and aggregating to obtain a first feature left image;
s6, inputting the first right image into a binary neural network for feature extraction, and aggregating to obtain a first feature right image;
s7, acquiring initial matching cost between the first characteristic left image and the first characteristic right image;
s8, optimizing the initial matching cost and extracting a disparity map based on cross-based radar trust aggregation and a semi-global stereo matching algorithm;
and S9, carrying out depth estimation according to the disparity map.
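As a minimal illustration of steps S8-S9, the sketch below takes an already-aggregated cost volume, extracts a disparity map by winner-take-all, and converts it to depth with the standard stereo relation Z = f * B / d. The focal length, baseline, and the random cost volume are placeholders, not values from the patent.

```python
import numpy as np

def disparity_to_depth(disparity, focal_length_px, baseline_m, eps=1e-6):
    """Convert a disparity map (in pixels) to metric depth using Z = f * B / d."""
    depth = np.zeros_like(disparity, dtype=np.float32)
    valid = disparity > eps                      # zero disparity is treated as "unknown"
    depth[valid] = focal_length_px * baseline_m / disparity[valid]
    return depth

# Winner-take-all disparity from an H x W x D cost volume (stand-in for the optimized cost),
# followed by depth estimation.
H, W, D = 8, 10, 64
cost_volume = np.random.rand(H, W, D).astype(np.float32)
disparity_map = np.argmin(cost_volume, axis=2).astype(np.float32)
depth_map = disparity_to_depth(disparity_map, focal_length_px=720.0, baseline_m=0.54)
```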
As an optional embodiment, the method further comprises:
s0. uses the stereo camera and laser radar to shoot the calibration objects with different postures and positions at the same time.
As an optional implementation manner, the stereo camera includes a left camera and a right camera, and after acquiring the current frame left image and the current frame right image of the stereo camera, the method further includes:
carrying out deformation correction on the current frame left image according to the distortion parameter of the left camera;
and carrying out deformation correction on the right image of the current frame according to the distortion parameter of the right camera.
In this embodiment, the stereo camera is a binocular camera. A binocular camera generally includes two monocular cameras used for imaging, referred to as the left camera and the right camera; the two monocular cameras are arranged on the same plane of the binocular camera, with a distance between them greater than a certain value. In practical applications, binocular cameras are widely used in fields such as robotics, unmanned vehicles, and security monitoring. Specifically, the binocular camera can capture images at certain time intervals, and the images captured at a given moment comprise a left image and a right image respectively captured by the left camera and the right camera of the binocular camera, namely the left image and the right image of a certain frame.
In this embodiment, after the left image and the right image are obtained by shooting, correction processing needs to be performed respectively according to distortion parameters of the camera.
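As an illustration of this correction step, the sketch below applies OpenCV's cv2.undistort using a camera matrix and distortion coefficients obtained from calibration; the numeric values and the array shape are hypothetical placeholders, not the patent's calibration results.

```python
import cv2
import numpy as np

def correct_distortion(image, camera_matrix, dist_coeffs):
    """Remove lens distortion from one camera's image using its calibration parameters."""
    return cv2.undistort(image, camera_matrix, dist_coeffs)

# Hypothetical calibration values for illustration only.
K_left = np.array([[720.0,   0.0, 640.0],
                   [  0.0, 720.0, 360.0],
                   [  0.0,   0.0,   1.0]])
dist_left = np.array([-0.15, 0.05, 0.0, 0.0, 0.0])     # k1, k2, p1, p2, k3

left_raw = np.zeros((720, 1280, 3), dtype=np.uint8)    # placeholder for the captured left frame
left_corrected = correct_distortion(left_raw, K_left, dist_left)
```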
In step S2, that is, the step of acquiring the radar left image and the radar right image specifically includes:
s201, obtaining a mapping map shot by the laser radar;
s202, compressing the mapping map, and dividing the mapping map into a radar left image and a radar right image.
In this embodiment, the laser radar and the stereo camera are used to simultaneously shoot calibration objects in different postures and at different positions. The laser radar is a radar system that detects characteristic quantities of a target, such as its position and velocity, by emitting a laser beam. In terms of working principle, there is no fundamental difference from microwave radar: a detection signal (laser beam) is transmitted toward the target, the received signal (target echo) reflected from the target is compared with the transmitted signal and, after appropriate processing, relevant information about the target can be obtained, such as the target's distance, azimuth, height, speed, attitude and even shape, so that targets such as aircraft and missiles can be detected, tracked and identified. Specifically, the laser radar consists of a laser transmitter, an optical receiver, a turntable, an information processing system, and the like; the laser converts electric pulses into optical pulses for transmission, and the optical receiver restores the optical pulses reflected from the target into electric pulses that are sent to a display.
In the embodiment, an image of the same part of the same marker, which is shot by the laser radar and corresponds to the image shot by the left camera of the binocular camera, is selected as a radar left image; selecting an image of the same part of the same marker, which is shot by the laser radar and corresponds to the image shot by the right camera of the binocular camera, as a radar right image; in this embodiment, the image captured by the laser radar is a sparse map.
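The patent does not detail how the lidar map is split into the radar left and right images. A common way to obtain such per-camera sparse maps, sketched below under that assumption, is to project the lidar point cloud into each camera view using the camera intrinsics K and the lidar-to-camera extrinsic transform from calibration; both matrices here are hypothetical.

```python
import numpy as np

def project_lidar_to_image(points_xyz, K, T_cam_from_lidar, image_shape):
    """Project an N x 3 lidar point cloud into one camera view, producing a sparse depth image."""
    h, w = image_shape
    pts_h = np.hstack([points_xyz, np.ones((points_xyz.shape[0], 1))])   # homogeneous coordinates
    pts_cam = (T_cam_from_lidar @ pts_h.T).T[:, :3]                      # points in the camera frame
    pts_cam = pts_cam[pts_cam[:, 2] > 0.1]                               # keep points in front of the camera
    uvw = (K @ pts_cam.T).T
    u = (uvw[:, 0] / uvw[:, 2]).astype(int)
    v = (uvw[:, 1] / uvw[:, 2]).astype(int)
    depth = np.zeros((h, w), dtype=np.float32)                           # 0 marks invalid points
    valid = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    depth[v[valid], u[valid]] = pts_cam[valid, 2]
    return depth
```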
As an optional implementation manner, step S3, that is, fusing the current frame left image and the radar left image to obtain a first left image, is specifically:
and fusing the current frame left image and the radar left image along a fusion channel according to the image size to obtain a first left image.
As an optional implementation manner, step S4, that is, fusing the current frame right image and the radar right image to obtain a first right image, is specifically:
and fusing the current frame right image and the radar right image along a fusion channel according to the image size to obtain a first right image.
In this embodiment, steps S3 and S4 are each performed along the fusion channel according to the image size: the current frame left image and the radar left image are fused to obtain the first left image, and the current frame right image and the radar right image are fused to obtain the first right image.
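A minimal sketch of this channel-wise fusion, assuming the radar image is stored as a single-channel sparse depth map of the same height and width as the RGB frame (the array shapes are placeholders):

```python
import numpy as np

def fuse_along_channels(rgb_image, radar_map):
    """Concatenate an H x W x 3 RGB image and an H x W sparse radar map into an H x W x 4 input."""
    assert rgb_image.shape[:2] == radar_map.shape, "the two images must have the same size"
    return np.concatenate([rgb_image.astype(np.float32),
                           radar_map[..., None].astype(np.float32)], axis=-1)

left_rgb = np.zeros((360, 640, 3), dtype=np.uint8)       # placeholder current frame left image
radar_left = np.zeros((360, 640), dtype=np.float32)      # placeholder radar left image
first_left = fuse_along_channels(left_rgb, radar_left)   # 4-channel first left image
```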
As an optional implementation manner, step S7, that is, the step of obtaining the initial matching cost between the first feature left image and the first feature right image, specifically includes:
s701, calculating similarity measurement between the first characteristic left image and the first characteristic right image through a weighted Hamming distance method;
s702, according to the similarity measurement, obtaining an initial matching cost between the first characteristic left image and the first characteristic right image.
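A sketch of how such a cost volume could be built from binarized feature maps is shown below. The per-bit weights are an assumption (the patent does not specify how they are chosen); setting them all to 1 reduces the measure to the plain Hamming distance.

```python
import numpy as np

def weighted_hamming_cost_volume(feat_left, feat_right, weights, max_disp):
    """Build an H x W x D cost volume from binary (0/1) feature maps.

    feat_left, feat_right: H x W x C arrays of binarized features.
    weights: length-C per-bit weights; all-ones gives the plain Hamming distance.
    """
    h, w, c = feat_left.shape
    cost = np.full((h, w, max_disp), np.inf, dtype=np.float32)  # columns with no match stay infinite
    for d in range(max_disp):
        diff = feat_left[:, d:, :] != feat_right[:, :w - d, :]
        cost[:, d:, d] = (diff * weights).sum(axis=-1)
    return cost

# Example: 64 binary feature channels with uniform weights.
feat_l = np.random.rand(48, 64, 64) > 0.5
feat_r = np.random.rand(48, 64, 64) > 0.5
cost = weighted_hamming_cost_volume(feat_l, feat_r, weights=np.ones(64), max_disp=32)
```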
Specifically, referring to fig. 2, which is a block diagram of the depth estimation method for lidar and stereo camera fusion, the specific process comprises the following steps:
(1) the RGB left image captured by the binocular camera and the corresponding radar left image are taken as the first input, and the RGB right image captured by the binocular camera and the corresponding radar right image are taken as the parallel second input;
(2) the RGB left image and the corresponding radar left image are fused along the channel dimension to obtain a first left image, and the RGB right image and the corresponding radar right image are fused along the channel dimension to obtain a first right image;
(3) inputting the first left image into a binary neural network for feature extraction, and aggregating to obtain a first feature left image; inputting the first right image into a binary neural network for feature extraction, and aggregating to obtain a first feature right image;
(4) calculating a similarity matrix of the first characteristic left image and the first characteristic right image through a weighted Hamming distance;
(5) continuing to perform cost aggregation processing, and refining and aggregating the result by using the depth information of the images shot by the laser radar in the cost aggregation process;
(6) finally, after SGM (the semi-global stereo matching algorithm), the refined disparity map is obtained.
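For reference, the sketch below shows a common textbook form of the SGM recurrence along a single path direction (left to right); the penalty values p1 and p2 are assumptions, and a full implementation aggregates over several directions before the winner-take-all step.

```python
import numpy as np

def sgm_aggregate_left_to_right(cost, p1=1.0, p2=8.0):
    """Aggregate an H x W x D matching-cost volume along one SGM path (left to right).

    Full SGM repeats this along several directions (horizontal, vertical, diagonal),
    sums the per-direction results, and extracts the disparity by winner-take-all.
    """
    h, w, d = cost.shape
    agg = np.empty_like(cost, dtype=np.float32)
    agg[:, 0, :] = cost[:, 0, :]
    for x in range(1, w):
        prev = agg[:, x - 1, :]                               # H x D costs of the previous column
        prev_min = prev.min(axis=1, keepdims=True)            # H x 1 best cost per row
        same = prev                                           # same disparity, no penalty
        plus_one = np.concatenate([prev[:, 1:], np.full((h, 1), np.inf)], axis=1) + p1
        minus_one = np.concatenate([np.full((h, 1), np.inf), prev[:, :-1]], axis=1) + p1
        jump = np.broadcast_to(prev_min + p2, prev.shape)     # any larger disparity change
        agg[:, x, :] = cost[:, x, :] + np.minimum.reduce([same, plus_one, minus_one, jump]) - prev_min
    return agg
```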
Referring to fig. 3, in this process the binary neural network is a highly quantized network in which the floating point weights are represented as +1 or -1, so as to achieve maximum model compression. The binary neural network comprises a floating point convolution layer, binary convolution layers, scaling layers, normalization layers, binarized neurons and Hardtanh. The feature extraction network comprises four groups of layers, which are respectively: the first group, comprising a floating point convolution layer, a normalization layer, a binarized neuron and Hardtanh; the second group, comprising a binary convolution layer, a scaling layer, a normalization layer and a binarized neuron; the third group, which has the same structure as the second group (a binary convolution layer, a scaling layer, a normalization layer and a binarized neuron); and the fourth group, comprising a binary convolution layer, a scaling layer and a normalization layer. The first group has no binary convolution layer in order to ensure that accuracy is not excessively degraded.
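A hedged PyTorch sketch of such a four-group extractor is shown below. The channel widths, kernel sizes and the exact ordering of operations inside each group are assumptions for illustration; the scaling layer is folded into the affine parameters of BatchNorm2d, and the binarized neuron is modeled as a sign function with a straight-through gradient.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BinaryConv2d(nn.Conv2d):
    """Convolution whose weights are binarized to +1 / -1 in the forward pass."""
    def forward(self, x):
        w_bin = torch.sign(self.weight) + (self.weight - self.weight.detach())  # straight-through estimator
        return F.conv2d(x, w_bin, self.bias, self.stride, self.padding, self.dilation, self.groups)

def binarize(x):
    """Binarized neuron: sign activation with a straight-through gradient."""
    return torch.sign(x) + (x - x.detach())

class BinaryFeatureExtractor(nn.Module):
    """Four groups of layers; only the first group keeps a floating-point convolution."""
    def __init__(self, in_ch=4, ch=32):
        super().__init__()
        self.group1 = nn.Sequential(nn.Conv2d(in_ch, ch, 3, padding=1), nn.BatchNorm2d(ch))
        self.group2 = nn.Sequential(BinaryConv2d(ch, ch, 3, padding=1, bias=False), nn.BatchNorm2d(ch))
        self.group3 = nn.Sequential(BinaryConv2d(ch, ch, 3, padding=1, bias=False), nn.BatchNorm2d(ch))
        self.group4 = nn.Sequential(BinaryConv2d(ch, ch, 3, padding=1, bias=False), nn.BatchNorm2d(ch))
        self.hardtanh = nn.Hardtanh()

    def forward(self, x):
        x = binarize(self.hardtanh(self.group1(x)))  # group 1: float conv + norm, then Hardtanh and binarization
        x = binarize(self.group2(x))                 # group 2: binary conv + scaling/norm, then binarization
        x = binarize(self.group3(x))                 # group 3: same structure as group 2
        return self.group4(x)                        # group 4: binary conv + scaling/norm, no binarization
```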
In this embodiment, the binary neural network is equivalent to a binary feature extractor that can jointly represent multidimensional information as a high-level bitwise feature vector. By also encoding the depth information captured by the lidar, more accurate feature information can be obtained than by relying on optical appearance alone.
Referring to fig. 4, in step S8, namely, regarding the method of cross-based radar trust aggregation, the purpose is to better utilize accurate depth information obtained by lidar shooting; the method does not need to establish a local region for each pixel and aggregate all candidate disparities, but only needs to update a small amount of the specific disparities of the pixels at the vertical intersection of sparse keypoints (e.g., radar points). Then, after the aggregation, the influence of the key points is automatically expanded to the neighbors. The method can improve the accuracy of depth estimation.
Specifically, a first target point is determined in the radar left image, a cross-shaped graph is drawn through the first target point, the first target point is any effective point in the radar left image, and the effective point is a point of which the point value is greater than zero;
acquiring a first distance through a first formula, wherein the first distance is the longest distance extending from a second target point in the vertical direction or the horizontal direction, and the second target point is the point in the current frame left image corresponding to the first target point; the first formula is: r* = max_{r∈[1,R]} ( r · ∏_{i∈[1,r]} δ(p, p_i) ), where r* represents the first distance, p represents the coordinates of the second target point, p_i represents the coordinates of the i-th point in the vertical or horizontal direction from the second target point, R is the maximum search range, and δ(p, p_i) is an indicator function indicating whether the difference in pixel intensity between coordinates p and p_i is less than a threshold; wherein δ is calculated by a second formula: δ(p_1, p_2) = 1 if |I(p_1) − I(p_2)| ≤ τ, and δ(p_1, p_2) = 0 otherwise; in the formula, I(p_1) represents the pixel intensity at coordinate p_1, I(p_2) represents the pixel intensity at coordinate p_2, |I(p_1) − I(p_2)| represents the absolute difference between the pixel intensities at coordinates p_1 and p_2, and τ represents the threshold on the pixel intensity difference;
optimizing the initial matching cost through a third formula according to the first distance, wherein the third formula is: C*(q, d) = 0 if |q − p| ≤ r*, δ(p, q) = 1 and d = d_p, and C*(q, d) = C(q, d) otherwise; in the formula, q represents a point coordinate, d_p represents the disparity at coordinate p in the radar left image, the coordinate p of the first target point is numerically identical to the coordinate of the second target point, |q − p| represents the distance between point coordinate q and coordinate p in the vertical or horizontal direction, δ(p, q) is the indicator function indicating whether the difference in pixel intensity between coordinate p and point coordinate q is less than the threshold, C*(q, d) represents the optimized matching cost of point coordinate q at disparity d, and C(q, d) represents the initial matching cost of point coordinate q at disparity d obtained by the weighted Hamming distance method;
and extracting the disparity map by a semi-global stereo matching algorithm according to the optimized matching cost.
In this embodiment, the radar left image contains sparse radar points; points with a value greater than 0 are valid points, and the values of most points are 0, which are invalid points. All valid points in the radar left image can be traversed, and a cross-shaped graph is drawn with each valid point as its center. Because the current frame left image obtained by the stereo camera corresponds exactly to the radar left image, a point corresponding to each valid point in the radar left image also exists in the current frame left image, and the coordinate values of the corresponding points are identical; therefore, the point corresponding to the valid point (the second target point) can be obtained in the current frame left image, and a cross-shaped graph can be drawn with this point as the center. A first distance is then calculated. The formula for the first distance can be understood as follows: search left, right, up and down from the point corresponding to the valid point (the second target point), and find the longest distance such that the pixel intensity difference between every point on the path and the second target point is smaller than the threshold.
The cost of any point on an arm of the cross whose pixel intensity difference with respect to the radar point is not larger than the threshold is set to 0 at the radar disparity; otherwise, the matching cost obtained by the weighted Hamming distance method is used. In this way, the sparse radar points can be used to effectively update the costs of surrounding points, avoiding the situation in which only the cost of the radar point itself is updated and the radar point, differing too much from the surrounding pixels, is treated as an outlier and ignored or repeatedly updated during cost aggregation. The cross-based radar trust aggregation method described in this embodiment thus spreads the information of the key points into the whole area.
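The following Python sketch illustrates the mechanism just described; the threshold tau and the maximum arm length max_arm are assumed values, and each arm pixel is compared against the centre (radar-point) pixel as in the description above.

```python
import numpy as np

def arm_length(intensity, y, x, step, tau=10.0, max_arm=17):
    """Length of one cross arm: walk from (y, x) in direction `step` while the
    intensity difference to the centre pixel stays below the threshold tau."""
    h, w = intensity.shape
    r = 0
    while r < max_arm:
        yy, xx = y + (r + 1) * step[0], x + (r + 1) * step[1]
        if not (0 <= yy < h and 0 <= xx < w):
            break
        if abs(float(intensity[yy, xx]) - float(intensity[y, x])) >= tau:
            break
        r += 1
    return r

def radar_trust_aggregation(cost, intensity, radar_disp, tau=10.0, max_arm=17):
    """Zero the matching cost at the radar disparity along the cross arms of each
    valid radar point (value > 0); all other entries keep their initial cost."""
    out = cost.copy()
    ys, xs = np.nonzero(radar_disp > 0)
    for y, x in zip(ys, xs):
        d = int(round(radar_disp[y, x]))
        if d >= cost.shape[2]:
            continue
        for step in [(0, 1), (0, -1), (1, 0), (-1, 0)]:
            r = arm_length(intensity, y, x, step, tau, max_arm)
            for k in range(r + 1):
                out[y + k * step[0], x + k * step[1], d] = 0.0
    return out
```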
In summary, the depth estimation method for laser radar stereo camera fusion described in this embodiment has the following advantages:
(1) by effectively fusing the laser radar and the stereo camera, accurate and reliable depth prediction can be obtained;
(2) by utilizing the binary neural network to extract the features of the two images simultaneously, accuracy is ensured while the speed is greatly improved;
(3) by means of cross-based radar trust aggregation, the depth information obtained by laser radar shooting is utilized to the maximum extent, thereby achieving a good improvement in accuracy.
Referring to fig. 5, an embodiment of the present invention further provides a depth estimation apparatus 200 for laser radar stereo camera fusion, which specifically includes:
at least one processor 210;
at least one memory 220 for storing at least one program;
the at least one program, when executed by the at least one processor 210, causes the at least one processor 210 to implement the method as shown in fig. 1.
The memory 220, which is a non-transitory computer-readable storage medium, may be used to store non-transitory software programs and non-transitory computer-executable programs. The memory 220 may include high speed random access memory and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, memory 220 may optionally include remote memory located remotely from processor 210, and such remote memory may be connected to processor 210 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
It will be understood that the device structure shown in fig. 5 is not intended to be limiting of device 200, and may include more or fewer components than shown, or some components may be combined, or a different arrangement of components.
In the apparatus 200 shown in fig. 5, the processor 210 may retrieve the program stored in the memory 220 and execute, but is not limited to, the steps of the embodiment shown in fig. 1.
The above-described embodiments of the apparatus 200 are merely illustrative, and the units illustrated as separate components may or may not be physically separate, may be located in one place, or may be distributed over a plurality of network units. Some or all of the modules can be selected according to actual needs to achieve the purposes of the embodiments.
Embodiments of the present invention also provide a computer-readable storage medium, which stores a program executable by a processor, and the program executable by the processor is used for implementing the method shown in fig. 1 when being executed by the processor.
The embodiment of the application also discloses a computer program product or a computer program, which comprises computer instructions, and the computer instructions are stored in a computer readable storage medium. The computer instructions may be read by a processor of a computer device from a computer-readable storage medium, and executed by the processor to cause the computer device to perform the method illustrated in fig. 1.
It will be understood that all or some of the steps and systems of the methods disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). The term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data, as is well known to those of ordinary skill in the art. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. In addition, communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media as known to those skilled in the art.
The embodiments of the present invention have been described in detail with reference to the accompanying drawings, but the present invention is not limited to the above embodiments, and various changes can be made within the knowledge of those skilled in the art without departing from the gist of the present invention.
Claims (10)
1. A depth estimation method for laser radar stereo camera fusion is characterized by comprising the following steps:
acquiring a current frame left image and a current frame right image of a stereo camera;
acquiring a radar left image and a radar right image, wherein the radar left image and the current frame left image correspond to images of the same part of the same object, and the radar right image and the current frame right image correspond to images of the same part of the same object;
fusing the current frame left image and the radar left image to obtain a first left image;
fusing the current frame right image and the radar right image to obtain a first right image;
inputting the first left image into a binary neural network for feature extraction, and aggregating to obtain a first feature left image;
inputting the first right image into a binary neural network for feature extraction, and aggregating to obtain a first feature right image;
acquiring an initial matching cost between the first characteristic left image and the first characteristic right image;
optimizing the initial matching cost and extracting a disparity map based on cross-based radar trust aggregation and a semi-global stereo matching algorithm;
and performing depth estimation according to the disparity map.
2. The lidar stereo camera fused depth estimation method according to claim 1, characterized in that the method further comprises:
and simultaneously shooting calibration objects in different postures and different positions by using the stereo camera and the laser radar.
3. The lidar stereo camera fused depth estimation method according to claim 1, characterized in that, after acquiring a current frame left image and a current frame right image of the stereo camera, the method further comprises:
carrying out deformation correction on the current frame left image according to the distortion parameter of the left camera;
and carrying out deformation correction on the right image of the current frame according to the distortion parameter of the right camera.
4. The lidar stereo camera fused depth estimation method according to claim 1, characterized in that the step of acquiring the radar left image and the radar right image specifically comprises:
acquiring a mapping chart shot by the laser radar;
compressing the map and dividing the map into a radar left image and a radar right image.
5. The lidar stereo camera fused depth estimation method according to claim 1, characterized in that fusing the current frame left image and the radar left image to obtain a first left image specifically comprises:
and fusing the current frame left image and the radar left image along a fusion channel according to the image size to obtain a first left image.
6. The lidar stereo camera fused depth estimation method according to claim 1, characterized in that fusing the current frame right image and the radar right image to obtain a first right image specifically comprises:
and fusing the current frame right image and the radar right image along a fusion channel according to the image size to obtain a first right image.
7. The lidar stereo camera fused depth estimation method according to claim 1, characterized in that the step of obtaining an initial matching cost between the first feature left image and the first feature right image specifically comprises:
calculating a similarity measure between the first feature left image and the first feature right image by a weighted hamming distance method;
and acquiring an initial matching cost between the first characteristic left image and the first characteristic right image according to the similarity measurement.
8. The lidar stereo camera fused depth estimation method according to claim 1, characterized in that the step of optimizing the initial matching cost based on cross-based radar trust aggregation and a semi-global stereo matching algorithm comprises:
determining a first target point in the radar left image, and drawing a cross-shaped graph through the first target point, wherein the first target point is any effective point in the radar left image, and the effective point is a point of which the point value is greater than zero;
acquiring a first distance through a first formula, wherein the first distance is the longest distance extending from a second target point in the vertical direction or the horizontal direction, and the second target point is the point in the current frame left image corresponding to the first target point; the first formula is: r* = max_{r∈[1,R]} ( r · ∏_{i∈[1,r]} δ(p, p_i) ), where r* represents the first distance, p represents the coordinates of the second target point, p_i represents the coordinates of the i-th point in the vertical or horizontal direction from the second target point, R is the maximum search range, and δ(p, p_i) is an indicator function indicating whether the difference in pixel intensity between coordinates p and p_i is less than a threshold; wherein δ is calculated by a second formula: δ(p_1, p_2) = 1 if |I(p_1) − I(p_2)| ≤ τ, and δ(p_1, p_2) = 0 otherwise; in the formula, I(p_1) represents the pixel intensity at coordinate p_1, I(p_2) represents the pixel intensity at coordinate p_2, |I(p_1) − I(p_2)| represents the absolute difference between the pixel intensities at coordinates p_1 and p_2, and τ represents the threshold on the pixel intensity difference;
optimizing the initial matching cost through a third formula according to the first distance, wherein the third formula is: C*(q, d) = 0 if |q − p| ≤ r*, δ(p, q) = 1 and d = d_p, and C*(q, d) = C(q, d) otherwise; in the formula, q represents a point coordinate, d_p represents the disparity at coordinate p in the radar left image, the coordinate p of the first target point is numerically identical to the coordinate of the second target point, |q − p| represents the distance between point coordinate q and coordinate p in the vertical or horizontal direction, δ(p, q) is the indicator function indicating whether the difference in pixel intensity between coordinate p and point coordinate q is less than the threshold, C*(q, d) represents the optimized matching cost of point coordinate q at disparity d, and C(q, d) represents the initial matching cost of point coordinate q at disparity d obtained by the weighted Hamming distance method;
and extracting the disparity map by a semi-global stereo matching algorithm according to the optimized matching cost.
9. A depth estimation device for laser radar stereo camera fusion is characterized by comprising:
at least one processor;
at least one memory for storing at least one program;
the at least one program, when executed by the at least one processor, causes the at least one processor to implement the method of any one of claims 1-8.
10. A computer-readable storage medium, characterized in that a program executable by a processor is stored thereon, the processor-executable program being for implementing the method of any one of claims 1 to 8 when executed by the processor.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011464746.XA CN112233163B (en) | 2020-12-14 | 2020-12-14 | Depth estimation method and device for laser radar stereo camera fusion and medium thereof |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011464746.XA CN112233163B (en) | 2020-12-14 | 2020-12-14 | Depth estimation method and device for laser radar stereo camera fusion and medium thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112233163A true CN112233163A (en) | 2021-01-15 |
CN112233163B CN112233163B (en) | 2021-03-30 |
Family
ID=74124881
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011464746.XA Active CN112233163B (en) | 2020-12-14 | 2020-12-14 | Depth estimation method and device for laser radar stereo camera fusion and medium thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112233163B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113281779A (en) * | 2021-05-20 | 2021-08-20 | 中山大学 | 3D object rapid detection method, device, equipment and medium |
CN114140507A (en) * | 2021-10-28 | 2022-03-04 | 中国科学院自动化研究所 | Depth estimation method, device and equipment integrating laser radar and binocular camera |
CN114862931A (en) * | 2022-05-31 | 2022-08-05 | 小米汽车科技有限公司 | Depth distance determination method and device, vehicle, storage medium and chip |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109255811A (en) * | 2018-07-18 | 2019-01-22 | 南京航空航天大学 | A kind of solid matching method based on the optimization of confidence level figure parallax |
CN110517309A (en) * | 2019-07-19 | 2019-11-29 | 沈阳工业大学 | A kind of monocular depth information acquisition method based on convolutional neural networks |
CN110517303A (en) * | 2019-08-30 | 2019-11-29 | 的卢技术有限公司 | A kind of fusion SLAM method and system based on binocular camera and millimetre-wave radar |
CN110942477A (en) * | 2019-11-21 | 2020-03-31 | 大连理工大学 | Method for depth map fusion by using binocular camera and laser radar |
CN111028285A (en) * | 2019-12-03 | 2020-04-17 | 浙江大学 | Depth estimation method based on binocular vision and laser radar fusion |
US20200175315A1 (en) * | 2018-11-30 | 2020-06-04 | Qualcomm Incorporated | Early fusion of camera and radar frames |
CN111415305A (en) * | 2020-03-10 | 2020-07-14 | 桂林电子科技大学 | Method for recovering three-dimensional scene, computer-readable storage medium and unmanned aerial vehicle |
CN111563442A (en) * | 2020-04-29 | 2020-08-21 | 上海交通大学 | Slam method and system for fusing point cloud and camera image data based on laser radar |
-
2020
- 2020-12-14 CN CN202011464746.XA patent/CN112233163B/en active Active
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109255811A (en) * | 2018-07-18 | 2019-01-22 | 南京航空航天大学 | A kind of solid matching method based on the optimization of confidence level figure parallax |
US20200175315A1 (en) * | 2018-11-30 | 2020-06-04 | Qualcomm Incorporated | Early fusion of camera and radar frames |
CN110517309A (en) * | 2019-07-19 | 2019-11-29 | 沈阳工业大学 | A kind of monocular depth information acquisition method based on convolutional neural networks |
CN110517303A (en) * | 2019-08-30 | 2019-11-29 | 的卢技术有限公司 | A kind of fusion SLAM method and system based on binocular camera and millimetre-wave radar |
CN110942477A (en) * | 2019-11-21 | 2020-03-31 | 大连理工大学 | Method for depth map fusion by using binocular camera and laser radar |
CN111028285A (en) * | 2019-12-03 | 2020-04-17 | 浙江大学 | Depth estimation method based on binocular vision and laser radar fusion |
CN111415305A (en) * | 2020-03-10 | 2020-07-14 | 桂林电子科技大学 | Method for recovering three-dimensional scene, computer-readable storage medium and unmanned aerial vehicle |
CN111563442A (en) * | 2020-04-29 | 2020-08-21 | 上海交通大学 | Slam method and system for fusing point cloud and camera image data based on laser radar |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113281779A (en) * | 2021-05-20 | 2021-08-20 | 中山大学 | 3D object rapid detection method, device, equipment and medium |
CN114140507A (en) * | 2021-10-28 | 2022-03-04 | 中国科学院自动化研究所 | Depth estimation method, device and equipment integrating laser radar and binocular camera |
CN114862931A (en) * | 2022-05-31 | 2022-08-05 | 小米汽车科技有限公司 | Depth distance determination method and device, vehicle, storage medium and chip |
Also Published As
Publication number | Publication date |
---|---|
CN112233163B (en) | 2021-03-30 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication ||
SE01 | Entry into force of request for substantive examination ||
GR01 | Patent grant ||