WO2020119467A1 - Method and device for generating a high-precision dense depth image - Google Patents

Method and device for generating a high-precision dense depth image (高精度稠密深度图像的生成方法和装置)

Info

Publication number
WO2020119467A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
pixel
pixels
matched
matching
Prior art date
Application number
PCT/CN2019/121495
Other languages
English (en)
French (fr)
Inventor
宋展
黄舒兰
Original Assignee
深圳先进技术研究院
Priority date
Filing date
Publication date
Application filed by 深圳先进技术研究院
Publication of WO2020119467A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N 13/10 Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N 13/106 Processing image signals
    • H04N 13/128 Adjusting depth or disparity
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N 13/20 Image signal generators
    • H04N 13/257 Colour aspects
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N 13/20 Image signal generators
    • H04N 13/271 Image signal generators wherein the generated image signals comprise depth maps or disparity maps

Definitions

  • the present application relates to the technical field of image processing, and in particular to a method and device for generating a high-precision dense depth image.
  • image data such as depth images is used in image recognition and processing, scene understanding, augmented and virtual reality, robot navigation, and other application fields.
  • accordingly, people have higher and higher requirements on the accuracy and resolution of depth images.
  • however, the image data collected by a depth camera often has relatively low resolution and accuracy, and erroneous information is prone to appear; for example, unreliable image data tends to appear on special materials or at the edges of objects.
  • such erroneous information in the image data collected by the depth camera is also introduced into the matching algorithm and is transmitted and diffused as matching proceeds, so that errors often exist in the final depth image.
  • as a result, the obtained depth image is relatively sparse, its resolution is not high, and its accuracy is relatively poor; that is, the existing methods often suffer from the technical problem that the determined depth image has large errors and low accuracy.
  • the embodiments of the present application provide a method and a device for generating a high-precision dense depth image, so as to solve the technical problem in the existing methods that the determined depth image has large errors and low accuracy, and to achieve the technical effect of obtaining a dense depth image with high precision and high resolution.
  • An embodiment of the present application provides a method for generating a high-precision dense depth image, including:
  • acquiring a first image, a second image, and a third image, wherein the first image is image data containing the target object acquired through the left camera, the second image is image data containing the target object acquired through the right camera, and the third image is image data containing the target object acquired through a depth camera;
  • determining a disparity map according to the first image, the second image, and the third image;
  • determining the preset encoding of the pixels in the matching window of the pixel to be matched according to the gray value of the pixel to be matched in the first image, the gray values of the pixels in the matching window of the pixel to be matched, and a preset encoding rule;
  • determining, from the second image, the matching pixel corresponding to the pixel to be matched in the first image according to the gray value of the pixel to be matched, the preset encoding of the pixels in the matching window of the pixel to be matched, and the disparity map;
  • determining the first depth image according to the pixels to be matched in the first image and the matching pixels in the second image corresponding to the pixels to be matched in the first image.
  • in some embodiments, determining the preset encoding of the pixels in the matching window of the pixel to be matched according to the gray value of the pixel to be matched in the first image, the gray values of the pixels in the matching window of the pixel to be matched, and the preset encoding rule includes:
  • determining the preset encoding of a pixel whose gray value in the matching window of the pixel to be matched is less than or equal to the gray value of the pixel to be matched as 1; and
  • determining the preset encoding of a pixel whose gray value in the matching window of the pixel to be matched is greater than the gray value of the pixel to be matched as 0.
  • in some embodiments, determining, from the second image, the matching pixel corresponding to the pixel to be matched according to the gray value of the pixel to be matched, the preset encoding of the pixels in the matching window of the pixel to be matched, and the disparity map includes:
  • selecting multiple pixels from the second image as test pixels according to the first coordinate of the pixel to be matched and the disparity map;
  • determining the gray value of each test pixel and the preset encoding of the pixels in the matching window of the test pixel;
  • calculating the matching cost between the pixel to be matched and each test pixel according to the gray value of the pixel to be matched, the preset encoding of the pixels in the matching window of the pixel to be matched, the gray value of the test pixel, and the preset encoding of the pixels in the matching window of the test pixel;
  • determining the test pixel with the smallest matching cost value as the matching pixel corresponding to the pixel to be matched in the first image.
  • in some embodiments, calculating the matching cost between the pixel to be matched and the test pixel includes calculating it according to the following formula:

    C = \sum_{k=1}^{n} \left( c_l^{(k)} \oplus c_r^{(k)} \right) + \left| g_l - g_r \right|

  • where C represents the matching cost of the pixel to be matched with the test pixel, g_l represents the gray value of the pixel to be matched in the first image, g_r represents the gray value of the test pixel in the second image, c_l^{(k)} represents the preset encoding of pixel number k in the matching window of the pixel to be matched in the first image, c_r^{(k)} represents the preset encoding of pixel number k in the matching window of the test pixel in the second image, \oplus represents the XOR operation, and n is the total number of pixels in the matching window.
  • in some embodiments, the method further includes: generating a correction weight according to the disparity map, and
  • determining a second depth image according to the correction weight and the first depth image.
  • in some embodiments, determining the second depth image based on the correction weight and the first depth image includes calculating the data values of the pixels in the second depth image according to the following formula:

    q_i = \sum_{j} W_{ij}(I) \, p_j

  • where q_i represents the data value of pixel number i in the second depth image, W_{ij}(I) represents the correction weight, I represents the disparity map, and p_j represents the data value of pixel number j in the corresponding preset window in the first depth image.
  • the correction weight is determined according to the following formula:

    W_{ij}(I) = \frac{1}{|\omega|^2} \sum_{k:(i,j) \in \omega_k} \left( 1 + \frac{(I_i - \mu_k)(I_j - \mu_k)}{\sigma_k^2 + \lambda + \varepsilon} \right)

  • where I_i and I_j represent the data values of two adjacent pixels in the corresponding preset window \omega_k in the disparity map, \mu_k represents the average of the data values of the pixels in the corresponding preset window in the disparity map, \sigma_k^2 represents the variance of the data values of the pixels in the corresponding preset window in the disparity map, \lambda represents the penalty value, and \varepsilon represents the disturbance value.
  • in some embodiments, the method further includes:
  • detecting whether there is a blank area in the first depth image, wherein the blank area is an area including a plurality of pixels with a data value of 0; and
  • when it is determined that there is a blank area, acquiring the data values of the pixels in the non-blank area connected to the blank area in the first depth image, and modifying the data values of the pixels in the blank area accordingly.
  • An embodiment of the present application also provides a high-precision dense depth image generation device, including:
  • An acquisition module for acquiring a first image, a second image, and a third image, wherein the first image is image data containing the target object acquired through the left camera, the second image is image data containing the target object acquired through the right camera, and the third image is image data containing the target object acquired through the depth camera;
  • a first determining module configured to determine a disparity map according to the first image, the second image, and the third image
  • the second determination module is configured to determine the preset encoding of the pixels in the matching window of the pixel to be matched according to the gray value of the pixel to be matched in the first image, the gray values of the pixels in the matching window of the pixel to be matched, and a preset encoding rule;
  • the third determining module is configured to determine, from the second image, the matching pixel corresponding to the pixel to be matched in the first image according to the gray value of the pixel to be matched in the first image, the preset encoding of the pixels in the matching window of the pixel to be matched, and the disparity map;
  • the fourth determining module is configured to determine the first depth image according to the pixels to be matched in the first image and the matching pixels in the second image corresponding to the pixels to be matched in the first image.
  • An embodiment of the present application further provides an electronic device, including a processor and a memory for storing processor-executable instructions.
  • when the processor executes the instructions, a first image, a second image, and a third image are acquired.
  • the first image is image data including the target object acquired through the left camera
  • the second image is image data including the target object acquired through the right camera
  • the third image is image data containing the target object acquired through a depth camera; a disparity map is determined according to the first image, the second image, and the third image; the preset encoding of the pixels in the matching window of the pixel to be matched is determined according to the gray value of the pixel to be matched in the first image, the gray values of the pixels in the matching window of the pixel to be matched, and the preset encoding rule; the matching pixel corresponding to the pixel to be matched in the first image is determined from the second image according to the gray value of the pixel to be matched, the preset encoding of the pixels in the matching window of the pixel to be matched, and the disparity map; and the first depth image is determined according to the pixels to be matched in the first image and the matching pixels in the second image corresponding to the pixels to be matched in the first image.
  • An embodiment of the present application also provides a computer-readable storage medium on which computer instructions are stored. When the instructions are executed, a first image, a second image, and a third image are acquired, wherein the first image is image data containing the target object acquired through the left camera, the second image is image data containing the target object acquired through the right camera, and the third image is image data containing the target object acquired through a depth camera; a disparity map is determined according to the first image, the second image, and the third image; the preset encoding of the pixels in the matching window of the pixel to be matched is determined according to the gray value of the pixel to be matched in the first image, the gray values of the pixels in the matching window of the pixel to be matched, and the preset encoding rule; the matching pixel corresponding to the pixel to be matched in the first image is determined from the second image according to the gray value of the pixel to be matched, the preset encoding of the pixels in the matching window of the pixel to be matched, and the disparity map; and the first depth image is determined according to the pixels to be matched in the first image and the matching pixels in the second image corresponding to the pixels to be matched in the first image.
  • in the embodiments of the present application, the preset encodings of the pixels in the matching window adjacent to the pixel to be matched in the first image are acquired and used according to the preset encoding rule, combined with the gray value of the pixel to be matched, with the disparity map used as a constraint; more accurate matching pixels are determined from the second image by matching so as to determine the depth image. This reduces the matching error caused by factors such as differences in gray information due to lighting, thereby solving the technical problem in the existing methods that the determined depth image has large errors and low accuracy, and achieving the technical effect of obtaining a dense depth image with higher accuracy and resolution.
  • FIG. 1 is a processing flowchart of a method for generating a high-precision dense depth image according to an embodiment of the present application
  • FIG. 2 is a schematic diagram of an example of a method for generating a high-precision dense depth image according to an embodiment of the present application
  • FIG. 3 is a schematic diagram of an example of a method for generating a high-precision dense depth image according to an embodiment of the present application
  • FIG. 4 is a schematic diagram of an example of a method for generating a high-precision dense depth image according to an embodiment of the present application
  • FIG. 5 is a structural diagram of a device for generating a high-precision dense depth image according to an embodiment of the present application
  • FIG. 6 is a schematic structural diagram of an electronic device based on the method for generating a high-precision dense depth image provided by an embodiment of the present application.
  • the existing method also introduces the above error information into the matching process when image data collected by a depth camera such as a ToF camera is introduced to participate in matching, and this error information is transmitted and amplified during matching, resulting in inaccurate matching results;
  • the resulting depth image therefore has lower accuracy and poorer resolution, often appears sparse, and cannot meet higher processing requirements.
  • in view of this, this application considers that the image data collected by depth cameras such as ToF cameras is often susceptible to environmental lighting and the like, so that the grayscale information (or grayscale values) of the pixels in the obtained image data is often inaccurate. Further analysis shows that the existing methods introduce such image data during implementation and rely heavily on the affected grayscale information during matching, so that the error information carried in the image data is transmitted and amplified during matching, leading to inaccurate matching and affecting the accuracy and resolution of the final depth image.
  • accordingly, this application considers that the image data collected by the depth camera can be used to guide the matching and improve matching efficiency, while the dependence on pixel grayscale information during matching is reduced, thereby reducing the transmission and amplification of the errors carried in that image data and protecting the resolution and accuracy of the depth image.
  • specifically, the preset encoding of the pixels in the matching window adjacent to the pixel to be matched can be determined according to the preset encoding rule; instead of the gray values of the pixels in the matching window, the corresponding preset encodings, combined with the gray value of the pixel to be matched, are used for binocular matching to find the corresponding matching point and generate a depth image. This reduces the dependence on gray information, improves matching accuracy, solves the technical problem in the existing methods that the determined depth image has large errors and low accuracy, and achieves the technical effect of obtaining a dense depth image with higher accuracy and resolution.
  • the embodiments of the present application provide a high-precision dense depth image generation method.
  • the method for generating a high-precision dense depth image provided by the embodiments of the present application may include the following steps during specific implementation.
  • S11 Acquire a first image, a second image, and a third image, where the first image is image data containing a target object acquired through a left camera, the second image is image data containing the target object acquired through a right camera, and the third image is image data containing the target object acquired through a depth camera.
  • the first image and the second image may be specifically understood as a color image (also called RGB image) or a black-and-white image containing the target object.
  • the first image may specifically be image data for a target object captured and captured by a left camera (or an independent left camera, referred to as l) in a binocular camera (or binocular stereo system).
  • the second image may specifically be image data of the same target object captured and captured by the right camera (or independent right camera, denoted as r) in the binocular camera at the same time.
  • of course, the first image may alternatively be image data captured by the right camera of a binocular camera for the target object, and
  • the second image may be image data of the same target object captured at the same time by the left camera of the binocular camera. This application is not limited.
  • the third image may specifically be image data of the same target object captured and acquired by the depth camera at the same time.
  • the above third image carries depth information, but the accuracy is poor and the resolution is low, which can be regarded as an initial depth image.
  • the above-mentioned depth camera may specifically include a camera capable of acquiring a depth image, such as a ToF (Time of Flight) camera.
  • the ToF cameras listed above are only for better illustrating the implementation of the present application.
  • the third image may also be image data acquired by a depth camera other than the ToF camera. This application is not limited.
  • the above-mentioned depth camera is different from the ordinary camera, that is, different from the above-mentioned left camera or right camera.
  • specifically, when the ToF camera works, a built-in transmitter emits continuous near-infrared pulses toward the target object, and a sensor then receives the light pulses reflected by the object; by comparing the phase difference between the emitted light pulses and the reflected light pulses, the transmission delay of the light pulses is calculated, from which the distance of the target object relative to the transmitter (that is, a kind of depth information) is obtained, finally yielding image data containing depth information. Therefore, the third image itself can be understood as a kind of depth image.
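  • as a rough numerical illustration of this principle (the helper name and the 10 ns example are hypothetical, not from the present application), the round-trip delay recovered from the phase difference gives the distance as d = c·τ/2:

```python
C_LIGHT = 299_792_458.0  # speed of light in m/s

def tof_distance(delay_s: float) -> float:
    """Distance from the round-trip transmission delay: d = c * tau / 2.

    The delay is recovered from the phase difference between the emitted
    and reflected near-infrared pulses; halving accounts for the round trip.
    """
    return C_LIGHT * delay_s / 2.0

# Example: a 10 ns round-trip delay corresponds to roughly 1.5 m.
print(tof_distance(10e-9))  # ~1.4990 m
```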
  • however, limited by hardware conditions, the resolution of the depth image obtained in this way (i.e., the third image) is often not as good as that of an ordinary color image, and information such as the depth values and gray values in the depth image is easily affected by external noise.
  • in addition, since the area corresponding to one pixel may cover the surfaces of different objects, the depth values at the edges of the target object are prone to errors.
  • the first image, the second image, and the third image are acquired synchronously, and are directed to the image data of the same target object at the same time.
  • refer to FIG. 2 for a schematic diagram of an example of a method for generating a high-precision dense depth image according to an embodiment of the present application: the left camera, the right camera, and the depth camera for acquiring the first image, the second image, and the third image are arranged according to preset layout rules.
  • the left camera, the right camera and the depth camera may be arranged at the same horizontal position.
  • specifically, the imaging origin coordinates of the left camera and the right camera may be made consistent, with the lens optical axes parallel, the imaging planes coplanar, and the epipolar lines aligned, which is convenient for subsequent data processing (for example, binocular matching).
  • in this way, the epipolar line can subsequently be used as a constraint to find matching pixels in the corresponding line, reducing the two-dimensional search to a one-dimensional search, narrowing the matching search range, and improving processing efficiency.
  • the method further includes: jointly calibrating the left camera, the right camera, and the depth camera to determine the camera intrinsic parameters (in-camera parameters) and extrinsic parameters (out-camera parameters).
  • the above intrinsic parameters can be understood as the respective internal operating parameters of the left camera, the right camera, and the depth camera, which can be recorded as K.
  • the intrinsic parameters may include one or more of the following operating parameters: focal length, imaging origin, and distortion coefficient.
  • the above extrinsic parameters can be understood as positional parameters that define the relative positional relationships between the left camera and the right camera and between the two cameras and the depth camera, which can be written as R and t.
  • the aforementioned extrinsic parameters may include one or more of the following position parameters: rotation vector, translation vector, and so on.
  • the extrinsic parameters listed above are only schematic illustrations.
  • during specific implementation, the above extrinsic parameters may also include other types of parameters. This application is not limited.
  • the above joint calibration of the left camera, the right camera, and the depth camera to determine the intrinsic parameters and extrinsic parameters, during specific implementation,
  • may include the following: the same chessboard image is acquired through the left camera and the right camera respectively, and the intrinsic and extrinsic parameters of the left camera and the right camera are calculated from the acquired chessboard images.
  • the position vector of the projection point in each acquired chessboard image can be expressed by the following formula:

    s \, m = K \left( R M + t \right)

  • where m is the position vector of the projection point in the image, s is a scale factor, K is the intrinsic parameter matrix of the left camera or the right camera, R is the rotation between the two camera coordinate systems, t is the translation vector between the two camera coordinate systems, and M is a three-dimensional coordinate point.
  • in this way, appropriate intrinsic and extrinsic parameters can be determined, so that the distortion of the first image and the second image can be eliminated and the two images can be row-aligned according to the camera mounting positions.
  • as a result, the imaging origin coordinates of the first image and the second image are unified, the imaging planes of the two images are coplanar, and the epipolar lines are aligned, which can further reduce the matching search range and improve processing efficiency when the image data is processed subsequently.
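  • for a concrete sense of this calibration step, a hedged sketch using OpenCV is given below (the function names are OpenCV's; the variable names, point lists, and flag choice are assumptions for illustration, not details prescribed by the present application):

```python
import cv2

def calibrate_stereo(objpoints, imgpoints_l, imgpoints_r, image_size):
    """Jointly estimate intrinsics K and extrinsics (R, t) of a left/right pair.

    objpoints: per-view (N, 3) chessboard corner coordinates M;
    imgpoints_l / imgpoints_r: the matching 2-D corners m detected in the
    left and right images (e.g., via cv2.findChessboardCorners).
    """
    # Per-camera intrinsic parameters first (K and distortion coefficients).
    _, K_l, d_l, _, _ = cv2.calibrateCamera(objpoints, imgpoints_l, image_size, None, None)
    _, K_r, d_r, _, _ = cv2.calibrateCamera(objpoints, imgpoints_r, image_size, None, None)
    # Then the rotation R and translation t between the two cameras.
    _, K_l, d_l, K_r, d_r, R, t, _, _ = cv2.stereoCalibrate(
        objpoints, imgpoints_l, imgpoints_r, K_l, d_l, K_r, d_r, image_size,
        flags=cv2.CALIB_FIX_INTRINSIC)
    # Rectification aligns the epipolar lines row-for-row in both images.
    R1, R2, P1, P2, Q, _, _ = cv2.stereoRectify(K_l, d_l, K_r, d_r, image_size, R, t)
    return K_l, d_l, K_r, d_r, R, t, (R1, R2, P1, P2, Q)
```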
  • the method further includes the following content: preprocessing the third image.
  • preprocessing includes at least filtering processing.
  • considering that the third image is image data obtained by a depth camera such as a ToF camera,
  • its accuracy is often poor and its resolution low, so the image data at the edges of the target object often has large errors and is unreliable.
  • specifically, the image data representing the edges of the target object in the third image may be detected first, and the image data at the edges of the target object may be filtered out, thereby reducing the errors subsequently introduced by such image data and further improving processing accuracy; a sketch of this step follows.
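  • a minimal Python sketch of such edge filtering, assuming a simple gradient-magnitude criterion (the threshold and the criterion itself are illustrative assumptions, not prescribed by the present application):

```python
import numpy as np

def filter_depth_edges(depth, grad_thresh=50.0):
    """Invalidate unreliable depth pixels near object edges.

    Pixels whose local depth gradient is large are assumed to lie on
    object edges, where ToF measurements are least reliable, and are
    set to 0 (treated as invalid) before the depth map is used further.
    """
    gy, gx = np.gradient(depth.astype(np.float64))
    grad_mag = np.hypot(gx, gy)
    filtered = depth.copy()
    filtered[grad_mag > grad_thresh] = 0  # mark edge pixels as invalid
    return filtered
```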
  • S12 Determine a disparity map according to the first image, the second image, and the third image.
  • the above-mentioned parallax map may also be referred to as initial parallax, and is a parallax map that is not obtained by binocular matching.
  • This kind of disparity map has relatively low precision and relatively poor accuracy, but it can reflect some of the overall information to a certain extent. Therefore, the disparity map can be used as a constraint to assist in the matching process.
  • the above determination of the disparity map based on the first image, the second image, and the third image may include the following: recovering a three-dimensional point cloud from the third image; projecting the three-dimensional point cloud into the first image according to the intrinsic and extrinsic parameters to obtain a first projected image; projecting the three-dimensional point cloud into the second image according to the intrinsic and extrinsic parameters to obtain a second projected image; and determining the disparity map based on the first projected image and the second projected image.
  • the left camera, the right camera, and the depth camera are arranged according to certain rules, and through joint calibration and the corresponding adjustment and correction, the optical axes are parallel to each other; that is, the first image, the second image, and the third image are aligned in the u-axis direction, and there is only an offset in the v-axis direction.
  • the camera coordinate system can be set at a center symmetrical position between the left camera and the right camera, by projecting the coordinates of the three-dimensional point that carries the depth information and is recovered based on the third image to the first image, In the second image, the first projected image and the second projected image of the coordinates of the corresponding two-dimensional point are obtained to facilitate subsequent data processing.
  • the pixels in the third image are respectively projected into the first image and the second image to obtain the first projected image and the second projected image, wherein the pixels in the projected images are two-dimensional points corresponding to the recovered three-dimensional points.
  • the disparity map is determined based on the first projected image and the second projected image; in a specific implementation, this may include taking the difference of the second coordinate values of the points with the same name in the first projected image and the second projected image to obtain the disparity map.
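  • a compact numpy sketch of this projection-and-difference step (the function and variable names are hypothetical, and a rectified setup in which same-name points differ only in the second coordinate is assumed, as described above):

```python
import numpy as np

def project_points(K, R, t, pts3d):
    """Project 3-D points into a camera via the pinhole model s*m = K(R M + t).

    K is the 3x3 intrinsic matrix, (R, t) the extrinsics, and pts3d an
    (N, 3) point cloud recovered from the third (depth) image.
    Returns (N, 2) pixel coordinates after perspective division.
    """
    cam = R @ pts3d.T + t.reshape(3, 1)  # camera coordinates
    uv = K @ cam                         # homogeneous pixel coordinates
    return (uv[:2] / uv[2]).T

def initial_disparity(K_l, R_l, t_l, K_r, R_r, t_r, pts3d):
    """Initial disparity of each same-name point pair.

    The two projections of one 3-D point are same-name points; per the
    text, disparity is the difference of their second coordinates.
    """
    uv_l = project_points(K_l, R_l, t_l, pts3d)
    uv_r = project_points(K_r, R_r, t_r, pts3d)
    return uv_l[:, 1] - uv_r[:, 1]  # offset along the second coordinate only
```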
  • S13 Determine the preset encoding of the pixels in the matching window of the pixel to be matched according to the gray value of the pixel to be matched in the first image, the gray values of the pixels in the matching window of the pixel to be matched, and a preset encoding rule.
  • the disparity map may be used as a constraint to guide binocular stereo matching between the first image and the second image. That is, for example, the corresponding matching pixel points of each pixel to be matched on the first image in the second image can be determined based on the first image, and then the subsequent matching is completed to obtain the corresponding depth image.
  • binocular matching based on the first image is only a schematic illustration. During specific implementation, binocular matching may also be performed based on the second image. This application is not limited.
  • the above disparity map is obtained by projecting a three-dimensional point cloud recovered from the third image, and the third image is image data obtained by a depth camera such as a ToF camera; therefore, the gray values of the pixels in the above disparity map are often inaccurate and carry a certain error.
  • the existing depth image generation methods often do not take this error into account, but directly use the above disparity map to search for matches between the pixels to be matched and the matching pixels, so the gray-value error is transferred into the matching process, which affects matching accuracy and in turn lowers the accuracy of the subsequently determined depth image.
  • in contrast, in the present application, since the gray values of the pixels in the disparity map carry errors and are not accurate enough, a preset encoding determined by the differences in gray values relative to the pixel to be matched is introduced in the vicinity of the pixel to be matched (that is, in the matching window) to replace the gray values, thereby avoiding over-reliance on gray values when determining matching pixels and preventing the gray-value error from being transferred into the matching process and affecting subsequent matching accuracy.
  • the above determination of the preset encoding of the pixels in the matching window of the pixel to be matched, according to the gray value of the pixel to be matched in the first image, the gray values of the pixels in the matching window of the pixel to be matched, and a preset encoding rule, may include the following contents during specific implementation:
  • the above matching window can be understood as a range area that is adjacent to the pixel to be matched and does not include the pixel to be matched, and is composed of other pixels.
  • FIG. 3 for a schematic diagram of an embodiment of a method for generating a high-precision dense depth image according to an embodiment of the present application.
  • the pixel to be matched is located at the center of the matching window, and the eight pixels surrounding it are the pixels in the matching window of the pixel to be matched.
  • the pixels listed in the matching window of the pixels to be matched mentioned above are only a schematic illustration.
  • the pixels in the matching window of the pixels to be matched may also include other numbers of pixels distributed in other ways. This application is not limited.
  • encoding is performed according to a preset encoding rule.
  • for example, the gray value of the pixel at the first position in the matching window is 6, which is less than or equal to the gray value 7 of the pixel to be matched; therefore, the preset code corresponding to the first pixel can be determined to be 1.
  • the gray value of the pixel at the second position in the matching window is 8, which is greater than the gray value 7 of the pixel to be matched, so the preset code corresponding to the second pixel can be determined to be 0.
  • in this way, the preset encodings of the 8 pixels in the matching window of the pixel to be matched in the first image can be determined as: 1, 0, 0, 0, 1, 1, 1, 0.
  • further, the system can leave the position of the pixel to be matched empty, arrange the preset codes of the pixels in the matching window according to the positions of the pixels, and record a vector characterizing the feature sequence, namely: (1,0,0,0,1,1,1,0).
  • each bit value in the vector corresponds to a preset encoding of a pixel at a position in the matching window of the pixel to be matched.
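  • for illustration only, a minimal Python sketch of this encoding rule follows (the 3×3 window, the row-major neighbour order, and the helper name are assumptions for the example, not requirements of the present application):

```python
import numpy as np

def preset_encoding(img, y, x, radius=1):
    """Census-style preset encoding of the matching window around (y, x).

    Per the preset encoding rule: a neighbour whose gray value is less
    than or equal to the center pixel's gray value is encoded as 1,
    otherwise as 0; the center position itself is left empty.
    """
    center = img[y, x]
    code = []
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dy == 0 and dx == 0:
                continue  # skip the pixel to be matched itself
            code.append(1 if img[y + dy, x + dx] <= center else 0)
    return np.array(code, dtype=np.uint8)
```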
  • in the above manner, the preset encoding of the pixels in the matching window of each pixel to be matched in the first image may be determined according to the preset encoding rules; the subsequent matching can then be based on the preset encodings rather than the error-prone gray values, which can effectively reduce matching errors and improve matching accuracy.
  • the matching pixel in the second image corresponding to the pixel to be matched in the first image may specifically be understood as the pixel in the second image that indicates the same actual position as the pixel to be matched in the first image.
  • such pixels indicating the same actual position may also be referred to as same-name points.
  • during matching, the preset encodings of the pixels in the matching window of the pixel to be matched are used in combination with the gray value of the pixel to be matched; that is, the gray values of the pixels are not all used to search for the corresponding matching pixels, which reduces the matching error caused by gray-value errors and improves matching accuracy.
  • specifically, determining the matching pixel corresponding to the pixel to be matched in the first image from the second image may include the following:
  • S1 Screen out multiple pixels from the second image as test pixels according to the first coordinate of the pixel to be matched and the disparity map; S2 Determine the gray value of each test pixel and the preset encoding of the pixels in the matching window of the test pixel;
  • the parallax map has errors, it can reflect the overall characteristic trend and can be used as a guide and reference. Therefore, in the specific implementation, the disparity map can be used as a guide and reference to determine the possible range of the matching pixels corresponding to the pixels to be matched in the second image; and then further from the above range according to the first coordinates Filter out multiple test pixels.
  • the first coordinate may specifically be understood as the line coordinate, that is, u.
  • specifically, the pixels in the second image that lie on the same line as the pixel to be matched, that is, pixels with the same u value, can be used as test pixels, i.e., pixels that may be the matching pixel and are to be further tested and determined. This avoids a traversal search over all pixels in the second image, reduces the matching search range, and improves processing efficiency.
  • then, for each of the multiple test pixels, the preset encoding of the pixels in its matching window can be determined separately, in the same manner as the preset encoding of the pixels in the matching window of the pixels to be matched, so that based on the preset encodings, combined only with the gray value of the test pixel, the most suitable pixel in the second image can be found as the matching pixel.
  • the above matching cost can be specifically understood as a parameter that can reflect the degree of similarity between the test pixel and the pixel to be matched.
  • the smaller the matching cost between a test pixel and the pixel to be matched, the higher their degree of similarity, and the greater the probability that the test pixel is the matching pixel corresponding to the pixel to be matched.
  • conversely, the greater the matching cost between a test pixel and the pixel to be matched, the lower their degree of similarity, and the smaller the probability that the test pixel is the matching pixel corresponding to the pixel to be matched.
  • specifically, the preset codes of the pixels in the matching window of the pixel to be matched and the preset codes of the pixels in the matching window of the test pixel can be used in place of the corresponding gray values, and an XOR operation can be performed to determine the degree of similarity between the matching window adjacent to the pixel to be matched and the matching window adjacent to the test pixel; this similarity serves as the first item in the matching cost, reduces the impact of the third image's low accuracy and poor resolution on the matching process, and retains more accurate structural information of the local texture in the image.
  • during specific implementation, the following may be included: the preset encoding of the pixel at each position in the matching window of the pixel to be matched is XOR-ed with the preset encoding of the pixel at the same position in the matching window of the test pixel, and the results at all positions are accumulated;
  • the total accumulation result is taken as the first item of data in the matching cost.
  • further, the absolute value of the difference between the gray value of the pixel to be matched and the gray value of the test pixel is introduced as the second item of data in the matching cost; it plays a smoothing role, making the subsequently obtained image relatively smoother and the effect relatively better.
  • specifically, the matching cost between the pixel to be matched and the test pixel can be calculated according to the following formula:

    C = \sum_{k=1}^{n} \left( c_l^{(k)} \oplus c_r^{(k)} \right) + \left| g_l - g_r \right|

  • where C can specifically be expressed as the matching cost of the pixel to be matched with the test pixel, g_l as the gray value of the pixel to be matched in the first image, g_r as the gray value of the test pixel in the second image, c_l^{(k)} as the preset encoding of pixel number k in the matching window of the pixel to be matched in the first image, c_r^{(k)} as the preset encoding of pixel number k in the matching window of the test pixel in the second image, and n as the total number of pixels in the matching window.
  • the above symbol \oplus can be used to characterize the XOR operation: when the values on both sides of the symbol are the same, the result is 0; when the values on both sides of the symbol are different, the result is 1, so that a smaller accumulated value indicates more similar matching windows and a smaller matching cost.
  • the matching cost between each test pixel in the plurality of test pixels in the second image and the pixel to be matched in the first image may be calculated in the above manner.
  • the above matching costs may then be further compared, and
  • the test pixel with the smallest matching cost value, that is, the one with the highest degree of similarity, is selected as the matching pixel in the second image corresponding to the pixel to be matched.
  • the corresponding matching pixel in the second image of each pixel to be matched in the first image can be determined, so that the matching search can be completed relatively quickly and accurately.
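  • under the assumptions above, a sketch of the cost computation and winner selection is given below (it reuses the hypothetical preset_encoding helper from the earlier sketch; the disparity-guided search radius is likewise an illustrative assumption):

```python
import numpy as np

def matching_cost(img_l, img_r, yl, xl, yr, xr):
    """Matching cost C = sum_k (c_l^(k) XOR c_r^(k)) + |g_l - g_r|.

    Combines the Hamming distance between the preset encodings of the
    two matching windows with the absolute gray-value difference of the
    two center pixels.
    """
    code_l = preset_encoding(img_l, yl, xl)
    code_r = preset_encoding(img_r, yr, xr)
    hamming = int(np.count_nonzero(code_l != code_r))  # XOR accumulation
    return hamming + abs(int(img_l[yl, xl]) - int(img_r[yr, xr]))

def find_match(img_l, img_r, y, x, d_init, search=4):
    """Pick the test pixel on the same line with the smallest matching cost.

    d_init is the initial disparity taken from the ToF-guided disparity
    map; only a small window around it on the same line is tested.
    """
    best_x, best_cost = None, float("inf")
    for d in range(d_init - search, d_init + search + 1):
        xr = x - d
        if 1 <= xr < img_r.shape[1] - 1:  # keep the 3x3 window in bounds
            cost = matching_cost(img_l, img_r, y, x, y, xr)
            if cost < best_cost:
                best_cost, best_x = cost, xr
    return best_x, best_cost
```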
  • S15 Determine the first depth image according to the pixels to be matched in the first image and the matching pixels in the second image corresponding to the pixels to be matched in the first image.
  • specifically, the first image and the second image are subjected to the above stereo matching processing, and a disparity map with a better effect is obtained; according to this disparity map, a more accurate first depth image carrying depth information is further obtained.
  • since the matching search for the first depth image does not rely entirely on the error-prone grayscale information, but instead introduces a preset code and combines the gray values of the pixels to be matched and the test pixels to determine the matching pixels, the error of the first depth image is relatively small and its accuracy relatively high.
  • in this way, compared with the existing methods, the present application acquires and uses, according to the preset encoding rule, the preset encodings of the pixels in the matching window adjacent to the pixel to be matched in the first image, combines them with the gray value of the pixel to be matched, and uses the disparity map as a constraint; more accurate matching pixels are determined from the second image through matching so as to determine the depth image, which reduces the matching error caused by differences in gray information due to lighting, thereby solving the technical problem in the existing methods that the determined depth image has large errors and low accuracy, and achieving the technical effect of obtaining a dense depth image with higher accuracy and resolution.
  • the method may also include the following:
  • considering that the obtained first depth image may still have some glitches and be insufficiently smooth, the following may be done in order to make the obtained depth image smoother and to further improve its accuracy.
  • the disparity map obtained based on a depth camera such as a ToF camera can also be used as a guide to correct and adjust the first depth image to obtain a smoother and more accurate depth image.
  • the disparity map based on the third image is also called an initial disparity map, denoted as I.
  • since the data values of the pixels in the above disparity map are obtained based on the gray values in the third image, they may themselves carry errors, and the resolution is relatively low.
  • therefore, the disparity map may be used only to generate the weight values for correction and adjustment, so that the first depth image is directionally adjusted without the disparity map excessively participating in the calculation of specific pixel data values in the depth image;
  • this ensures that the data values of the corrected pixels are affected as little as possible by the errors of the data values in the disparity map, maintaining high resolution and accuracy.
  • the above determination of the second depth image based on the correction weight and the first depth image, when specifically implemented, may include calculating the data values of the pixels in the second depth image according to the following formula:

    q_i = \sum_{j} W_{ij}(I) \, p_j

  • where q_i can specifically be expressed as the data value of pixel number i in the second depth image, W_{ij}(I) as the correction weight, I as the disparity map, and p_j as the data value of pixel number j in the corresponding preset window in the first depth image.
  • specifically, the correction weight may be determined according to the following formula:

    W_{ij}(I) = \frac{1}{|\omega|^2} \sum_{k:(i,j) \in \omega_k} \left( 1 + \frac{(I_i - \mu_k)(I_j - \mu_k)}{\sigma_k^2 + \lambda + \varepsilon} \right)

  • where I_i and I_j can specifically be expressed as the data values of two adjacent pixels in the corresponding preset window \omega_k in the disparity map, \mu_k as the average of the data values of the pixels in the corresponding preset window in the disparity map, \sigma_k^2 as the variance of the data values of the pixels in the corresponding preset window in the disparity map, \lambda as the penalty value, and \varepsilon as the disturbance value.
  • the preset window may be specifically understood as a range area in the disparity map centered on the pixel corresponding to the pixel in the second depth image.
  • the shape or size of the above preset window may be set according to specific conditions. This application is not limited.
  • the specific value of the disturbance value may be a very small value to ensure that the denominator is not zero.
  • the specific values of the above disturbance value and penalty value can be flexibly set according to specific conditions and accuracy requirements. This application is not limited.
  • it should be noted that the above data value is different from the gray value and can be understood as a kind of parameter data that also contains depth information.
  • the correction weight determined in the above manner has the following property: at the edge positions of the target object, the difference between the data values I_i and I_j of two adjacent pixels is large, and I_i and I_j lie on opposite sides of the edge, so that (I_i - \mu_k) and (I_j - \mu_k) have different signs and the value of (I_i - I_j) is relatively large; therefore, only weaker adjustments and corrections are made to the data values of pixels near edges in the second depth image.
  • in this way, the weights for pixels in non-edge areas are relatively large and the smoothing effect is relatively more obvious, while the weights for pixels in edge areas are relatively small and the smoothing effect is relatively weak, which serves to maintain the boundaries in the image. That is, the depth image can be smoothed in a more targeted and accurate manner while boundary information is retained.
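  • a compact sketch of this edge-aware correction follows, assuming the weight takes the guided-image-filter form implied by the definitions above (the window radius and the single regularizing term eps, standing in for the penalty and disturbance values, are illustrative choices):

```python
import numpy as np
from scipy.ndimage import uniform_filter

def guided_correction(disparity, depth, radius=4, eps=1e-3):
    """Edge-aware correction of the first depth image guided by the disparity map.

    Uses the guided-filter identity q = a*I + b, which is algebraically
    equivalent to q_i = sum_j W_ij(I) p_j with guided-filter weights:
    where the guide (disparity) is flat, the depth is smoothed strongly;
    across guide edges the weights shrink, preserving object boundaries.
    """
    I = disparity.astype(np.float64)
    p = depth.astype(np.float64)
    size = 2 * radius + 1
    mean_I = uniform_filter(I, size)
    mean_p = uniform_filter(p, size)
    corr_Ip = uniform_filter(I * p, size)
    corr_II = uniform_filter(I * I, size)
    var_I = corr_II - mean_I * mean_I      # sigma_k^2 per window
    cov_Ip = corr_Ip - mean_I * mean_p
    a = cov_Ip / (var_I + eps)             # eps keeps the denominator nonzero
    b = mean_p - a * mean_I
    mean_a = uniform_filter(a, size)
    mean_b = uniform_filter(b, size)
    return mean_a * I + mean_b             # the corrected (second) depth image
```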
  • the specific implementation of the method may also include the following:
  • S1 Detect whether there is a blank area in the first depth image, wherein the blank area is an area including a plurality of pixels with a data value of 0; S2 When it is determined that there is a blank area in the first depth image, acquire the data values of the pixels in the non-blank area connected to the blank area;
  • S3 Modify the data values of the pixels in the blank area according to the data values of the pixels in the non-blank area connected to the blank area in the first depth image.
  • the above-mentioned blank area can be specifically understood as a range area including a plurality of consecutive pixels with a data value of 0.
  • FIG. 4 for a schematic diagram of an example of a method for generating a high-precision dense depth image according to an embodiment of the present application.
  • since a third image with lower accuracy and poorer resolution is still used in the process of acquiring the first depth image, or because the first image and the second image themselves contain data errors, parts of the depth image may lack sufficient texture information, resulting in the appearance of blank areas.
  • specifically, the data values of the pixels in the non-blank area of the depth image that are connected to the blank area and whose data values are not 0 can be used to fill in the data values of the pixels in the adjacent blank area.
  • for example, the data value 3 of the pixel in the first row and second column of the non-blank area, which is connected to the pixel in the first row and third column of the blank area, can be used to fill that blank-area pixel.
  • the pixels in each blank area are respectively filled in correspondingly, so that a complete and accurate depth map is obtained, and the accuracy of the depth map is further improved.
  • it should be noted that the depth image obtained by the above method usually has high accuracy at edge positions, so blank areas rarely appear there, and even if they do appear, they are not necessarily caused by errors; at non-edge positions, such as the interior of the target object, a blank area usually has a higher probability of having been introduced by error, and in that case it is relatively more suitable to fill the blank area using the above method.
  • as to detecting whether a blank area is located at an edge position of the target object, this can be determined by detecting whether the gradient of the data values on the two sides of the boundary between the blank area and the non-blank area is greater than a preset threshold: if the gradient of the data values on both sides of the boundary is greater than the preset threshold, it can be determined that the blank area is located at an edge position of the target object; if the gradient is less than or equal to the preset threshold, it can be determined that the blank area is not located at an edge position of the target object (a code sketch of the filling procedure follows this discussion).
  • the obtained first depth image usually has good accuracy at the edge position itself. Therefore, when it is determined that the blank area is located at the edge position of the target object, the blank area is not filled.
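  • for illustration, a minimal sketch of the filling step is given below (the 4-neighbourhood, the iteration scheme, and the function name are assumptions; the edge test described above is noted but omitted from the loop for brevity):

```python
import numpy as np

def fill_blank_areas(depth, max_iters=50):
    """Fill blank areas (connected zero-valued pixels) in a depth image.

    A zero pixel inherits the data value of a nonzero 4-neighbour,
    iterating until stable, so each blank region is flooded inward from
    the adjacent non-blank pixels. Blank areas judged to lie at object
    edges (data-value gradient across the border above a preset
    threshold) should be left unfilled, as described in the text.
    """
    filled = depth.astype(np.float64).copy()
    h, w = filled.shape
    for _ in range(max_iters):
        changed = False
        for y in range(h):
            for x in range(w):
                if filled[y, x] != 0:
                    continue
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w and filled[ny, nx] != 0:
                        filled[y, x] = filled[ny, nx]  # inherit neighbour value
                        changed = True
                        break
        if not changed:
            break
    return filled
```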
  • in this way, the method for generating a high-precision dense depth image provided by the embodiments of the present application acquires and uses, according to a preset encoding rule, the preset encodings of the pixels in the matching window adjacent to the pixel to be matched in the first image, combines them with the gray value of the pixel to be matched, and uses the disparity map as a constraint; more accurate matching pixels are determined from the second image through matching to determine the depth image, reducing the matching error caused by differences in gray information due to lighting, which solves the technical problem in the existing methods that the determined depth image has large errors and low accuracy and achieves the technical effect of obtaining a dense depth image with higher accuracy and resolution. The method also determines correction weights according to the disparity map and then uses these correction weights to guide the correction of the first depth image, so that error data caused by the poor accuracy of the image obtained by the depth camera is not introduced during the smoothing of the first depth image, and a depth image with higher accuracy and smoothness can be obtained.
  • based on the same inventive concept, an embodiment of the present application also provides a device for generating a high-precision dense depth image, as described in the following embodiments. Since the principle by which this device solves the problem is similar to that of the method for generating a high-precision dense depth image, the implementation of the device can refer to the implementation of the method, and repeated details are not described here.
  • as used below, the term "unit" or "module" may refer to a combination of software and/or hardware that achieves a predetermined function.
  • although the devices described in the following embodiments are preferably implemented in software, implementation in hardware or in a combination of software and hardware is also possible and conceived. Please refer to FIG. 5.
  • the device may specifically include: an acquisition module 51, a first determination module 52, a second determination module 53, a third determination module 54, and a fourth determination module 55.
  • this structure is described in detail below.
  • the obtaining module 51 can be specifically used to obtain a first image, a second image, and a third image, wherein the first image is image data containing the target object obtained through the left camera, the second image is image data containing the target object obtained through the right camera, and the third image is image data containing the target object obtained through the depth camera;
  • the first determining module 52 may be specifically configured to determine a disparity map according to the first image, the second image, and the third image;
  • the second determination module 53 may be specifically configured to determine the preset encoding of the pixels in the matching window of the pixel to be matched according to the gray value of the pixel to be matched in the first image, the gray values of the pixels in the matching window of the pixel to be matched, and a preset encoding rule;
  • the third determining module 54 may be specifically configured to determine, from the second image, the matching pixel corresponding to the pixel to be matched in the first image according to the gray value of the pixel to be matched in the first image, the preset encoding of the pixels in the matching window of the pixel to be matched, and the disparity map;
  • the fourth determining module 55 may be specifically configured to determine the first depth image according to the pixels to be matched in the first image and the matching pixels in the second image corresponding to the pixels to be matched in the first image.
  • the second determining module 53 may specifically include the following structural units:
  • the first comparison unit may specifically be used to compare the gray values of the pixels in the matching window of the pixels to be matched with the gray values of the pixels to be matched in the first image;
  • the first determining unit may be specifically configured to determine, according to the comparison result, the preset encoding of pixels whose gray value in the matching window of the pixel to be matched is less than or equal to the gray value of the pixel to be matched as 1, and the preset encoding of pixels whose gray value in the matching window of the pixel to be matched is greater than the gray value of the pixel to be matched as 0.
  • the third determining module 54 may specifically include the following structural units:
  • the screening unit may be specifically configured to screen out a plurality of pixels from the second image as test pixels based on the first coordinates of the pixels to be matched and the disparity map;
  • the second determining unit may specifically be used to determine the gray value of the test pixel and the preset encoding of the pixel in the matching window of the test pixel;
  • the first calculation unit may be specifically used to calculate the matching cost between the pixel to be matched and the test pixel according to the gray value of the pixel to be matched, the preset encoding of the pixels in the matching window of the pixel to be matched, the gray value of the test pixel, and the preset encoding of the pixels in the matching window of the test pixel;
  • the third determining unit may specifically be used to determine the test pixel with the smallest matching cost value as the matching pixel corresponding to the pixel to be matched in the first image.
  • specifically, the matching cost of the pixel to be matched and the test pixel may be calculated according to the following formula:

    C = \sum_{k=1}^{n} \left( c_l^{(k)} \oplus c_r^{(k)} \right) + \left| g_l - g_r \right|

  • where C can specifically be expressed as the matching cost of the pixel to be matched with the test pixel, g_l as the gray value of the pixel to be matched in the first image, g_r as the gray value of the test pixel in the second image, c_l^{(k)} as the preset encoding of pixel number k in the matching window of the pixel to be matched in the first image, c_r^{(k)} as the preset encoding of pixel number k in the matching window of the test pixel in the second image, and n as the total number of pixels in the matching window.
  • the apparatus may further specifically include a fifth determining module, configured to determine the second depth image according to the first depth image.
  • the fifth determination module may specifically include the following structural units:
  • the first generating unit may be specifically configured to generate a correction weight based on the disparity map
  • the fourth determining unit may be specifically configured to determine the second depth image based on the correction weight and the first depth image.
  • specifically, the above fourth determining unit may calculate the data values of the pixels in the second depth image according to the following formula:

    q_i = \sum_{j} W_{ij}(I) \, p_j

  • where q_i can specifically be expressed as the data value of pixel number i in the second depth image, W_{ij}(I) as the correction weight, I as the disparity map, and p_j as the data value of pixel number j in the corresponding preset window in the first depth image.
  • specifically, the above fourth determining unit may determine the correction weight according to the following formula:

    W_{ij}(I) = \frac{1}{|\omega|^2} \sum_{k:(i,j) \in \omega_k} \left( 1 + \frac{(I_i - \mu_k)(I_j - \mu_k)}{\sigma_k^2 + \lambda + \varepsilon} \right)

  • where I_i and I_j can specifically be expressed as the data values of two adjacent pixels in the corresponding preset window \omega_k in the disparity map, \mu_k as the average of the data values of the pixels in the corresponding preset window in the disparity map, \sigma_k^2 as the variance of the data values of the pixels in the corresponding preset window in the disparity map, \lambda as the penalty value, and \varepsilon as the disturbance value.
  • in some embodiments, the device may further include a filling module, specifically configured to: detect whether there is a blank area in the first depth image, wherein the blank area is an area including a plurality of pixels with a data value of 0; when it is determined that there is a blank area in the first depth image, acquire the data values of the pixels in the non-blank area connected to the blank area in the first depth image; and modify the data values of the pixels in the blank area according to the data values of the pixels in the non-blank area connected to the blank area in the first depth image.
  • system, device, module, or unit explained in the above embodiments may be specifically implemented by a computer chip or entity, or by a product having a certain function.
  • the functions are divided into various units and described separately.
  • the functions of each unit may be implemented in one or more software and/or hardware.
  • adjectives such as "first" and "second" are only used to distinguish one element or action from another element or action, and do not require or imply any actual such relationship or order between them. Where circumstances permit, a reference to an element, component, or step (etc.) should not be interpreted as being limited to only one of the elements, components, or steps, but may refer to one or more of the elements, components, or steps, etc.
  • the device for generating a high-precision dense depth image provided by an embodiment of the present application, through the second determination module, the third determination module, and the fourth determination module, acquires and uses, according to the preset encoding rules, the preset encodings of the pixels in the matching window adjacent to the pixel to be matched in the first image, combines them with the gray value of the pixel to be matched, and uses the disparity map as a constraint to determine more accurate matching pixels from the second image through matching so as to determine the depth image; this reduces the matching error caused by differences in grayscale information due to illumination, thereby solving the technical problem in the existing methods that the determined depth image has large errors and low accuracy, and achieving the technical effect of obtaining a dense depth image with higher accuracy and resolution. The device also determines, through the correction module, correction weights based on the disparity map obtained from the third image, and then uses these correction weights to guide the correction of the first depth image, so that error data caused by the poor accuracy of the image obtained by the depth camera is not introduced in the process of smoothing the first depth image, and a depth image with higher accuracy can be obtained.
An embodiment of the present application also provides an electronic device. For its composition, refer to FIG. 6, a schematic structural diagram of an electronic device based on the method for generating a high-precision dense depth image provided by an embodiment of the present application. The electronic device may specifically include an input device 61, a processor 62, and a memory 63.

The input device 61 may specifically be used to input a first image, a second image, and a third image, where the first image is image data containing a target object acquired through the left camera, the second image is image data containing the target object acquired through the right camera, and the third image is image data containing the target object acquired through the depth camera.

The processor 62 may specifically be configured to: determine a disparity map according to the first image, the second image, and the third image; determine the preset encodings of the pixels in the matching window of a pixel to be matched according to the gray value of the pixel to be matched in the first image, the gray values of the pixels in its matching window, and the preset encoding rule; determine, from the second image, the matching pixel corresponding to the pixel to be matched in the first image according to the gray value of the pixel to be matched, the preset encodings of the pixels in its matching window, and the disparity map; and determine the first depth image according to the pixels to be matched in the first image and the corresponding matching pixels in the second image.

The memory 63 may specifically be used to store the first image, the second image, and the third image input through the input device 61, as well as the program instructions used by the processor 62.
In this implementation, the input device may specifically be one of the main means of information exchange between the user and the computer system. The input device may include a keyboard, a mouse, a camera, a scanner, a light pen, a handwriting tablet, a voice input device, and the like; it is used to input raw data, and the programs that process those data, into the computer. The input device may also receive data transmitted from other modules, units, or devices.
The processor may be implemented in any suitable way. For example, the processor may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (such as software or firmware) executable by the (micro)processor, logic gates, switches, an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a programmable logic controller, an embedded microcontroller, and so on.

The memory may specifically be a memory device used in modern information technology for storing information, and it may comprise multiple levels: in a digital system, anything that can store binary data can serve as memory; in an integrated circuit, a circuit with a storage function but no physical form is also called a memory, such as a RAM or a FIFO; in a system, a storage device with a physical form is also called a memory, such as a memory stick or a TF card.
An embodiment of the present application also provides a computer storage medium based on the method for generating a high-precision dense depth image, the computer storage medium storing computer program instructions that, when executed, implement: acquiring a first image, a second image, and a third image, where the first image is image data containing the target object acquired through the left camera, the second image is image data containing the target object acquired through the right camera, and the third image is image data containing the target object acquired through the depth camera; determining a disparity map according to the first image, the second image, and the third image; determining the preset encodings of the pixels in the matching window of a pixel to be matched according to the gray value of the pixel to be matched in the first image, the gray values of the pixels in its matching window, and the preset encoding rule; determining, from the second image, the matching pixel corresponding to the pixel to be matched in the first image according to the gray value of the pixel to be matched, the preset encodings of the pixels in its matching window, and the disparity map; and determining the first depth image according to the pixels to be matched in the first image and the corresponding matching pixels in the second image.
In this implementation, the above storage medium includes, but is not limited to, random access memory (Random Access Memory, RAM), read-only memory (Read-Only Memory, ROM), cache (Cache), a hard disk drive (Hard Disk Drive, HDD), or a memory card (Memory Card). The memory may be used to store the computer program instructions. A network communication unit may be an interface, configured according to a standard prescribed by a communication protocol, for performing network connection and communication.
In a specific implementation scenario, the method and apparatus for generating a high-precision dense depth image provided by the embodiments of the present application are applied to obtain a high-precision, dense depth image.

S1: Obtaining the initial disparity map.

In this implementation, it can be assumed that the two RGB cameras (i.e., the left camera l and the right camera r) have been rectified, with their optical axes parallel to each other and their epipolar lines row-aligned, so that the images obtained by the two RGB cameras (i.e., the first image and the second image) differ only by an offset along one image axis. The camera coordinate system is placed at the position centrally symmetric between the two cameras. According to projective geometry, the depth map obtained by the depth camera (i.e., the third image) can be used to restore the three-dimensional point cloud in its coordinate system. Any 3D point of this point cloud has coordinates X = [x, y, z]^T in the camera coordinate system and is imaged on the left and right image planes (i.e., the third image is projected onto the first image and the second image). Using the intrinsic and extrinsic camera parameters obtained by joint calibration, the 2D coordinates of the point in the left and right image coordinate systems are x_l = [u_l, v_l]^T and x_r = [u_r, v_r]^T, and the initial disparity of the pair in the binocular stereo system can be expressed as d_0 = v_r - v_l, which yields the initial disparity map.
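To make the projection step concrete, the following minimal sketch shows how ToF points, once expressed in the camera coordinate system, can be projected into both rectified views and differenced into an initial disparity. It is an illustration under stated assumptions: the intrinsics `K_l`, `K_r` and per-camera poses `(R_l, t_l)`, `(R_r, t_r)` are taken to come from the joint calibration described above, and all function and variable names are illustrative rather than the patent's reference implementation.

```python
import numpy as np

def initial_disparity(points_3d, K_l, K_r, R_l, t_l, R_r, t_r):
    """Project ToF-restored 3D points X = [x, y, z]^T into the left and
    right images and return the initial disparity d0 = v_r - v_l."""
    def project(P, K, R, t):
        Pc = (R @ P.T + t.reshape(3, 1)).T   # rig frame -> camera frame
        uv = (K @ Pc.T).T                    # pinhole projection
        return uv[:, :2] / uv[:, 2:3]        # homogeneous divide
    x_l = project(points_3d, K_l, R_l, t_l)  # [u_l, v_l] per point
    x_r = project(points_3d, K_r, R_r, t_r)  # [u_r, v_r] per point
    return x_r[:, 1] - x_l[:, 1]             # d0 = v_r - v_l
```

Splatting these per-point disparities onto the left image grid would give the initial disparity map used as the matching constraint below.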
S2: Matching cost calculation.

A rectangular window is constructed with the point to be matched (i.e., the pixel to be matched in the first image) as its center, and the gray value of each of the center point's neighboring pixels (i.e., of the pixels in the matching window of the pixel to be matched) is compared with that of the center point: a pixel whose gray value is less than the center point's is coded (preset encoding) as 1, and a pixel whose gray value is greater than the center point's is coded as 0. These values are then connected, in order of pixel position, into a vector that serves as the feature sequence of the point. The sum of the number of positions at which this sequence differs from that of a candidate point in the other image is taken as one term of the matching cost; this non-parametric transform preserves the local texture structure of the image and reduces mismatches caused by illumination differences. The absolute difference between the gray values of the centers of the two windows to be matched is added as a second term, which plays a smoothing role.
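A minimal sketch of this census-like encoding is given below. It follows the scenario wording (a neighbor strictly darker than the center is coded 1, a brighter one 0; the formal definition in the claims uses less-than-or-equal), assumes a square window whose center lies at least `radius` pixels from the image border, and uses illustrative names rather than the patent's reference code.

```python
import numpy as np

def preset_encoding(gray, u, v, radius=1):
    """Binary feature sequence for the pixel at (u, v): each neighbor in
    the (2*radius+1)^2 window is coded 1 if darker than the center,
    else 0; the center pixel itself is skipped."""
    center = gray[u, v]
    bits = []
    for du in range(-radius, radius + 1):
        for dv in range(-radius, radius + 1):
            if du == 0 and dv == 0:
                continue  # the pixel to be matched carries no code
            bits.append(1 if gray[u + du, v + dv] < center else 0)
    return np.array(bits, dtype=np.uint8)
```

With `radius=1` this yields the eight-element feature vector described above for a 3x3 matching window.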
The matching cost can be calculated according to the following formula (reconstructed in standard notation from the symbol definitions; the operator ⊕ denotes XOR, contributing 1 when the two encodings differ and 0 when they agree):

$$C=\sum_{k=1}^{n}\left(c_{l}^{k}\oplus c_{r}^{k}\right)+\left|I_{l}-I_{r}\right|$$

where C is the matching cost between the pixel to be matched and the test pixel, I_l is the gray value of the pixel to be matched in the first image, I_r is the gray value of the test pixel in the second image, c_l^k is the preset encoding of the pixel numbered k in the matching window of the pixel to be matched in the first image, c_r^k is the preset encoding of the pixel numbered k in the matching window of the test pixel in the second image, and n is the total number of pixels in the matching window.
The corresponding point (point with the same name) of each point to be matched can then be determined according to the matching cost, by taking the test pixel with the minimum cost, and the depth image (i.e., the first depth image) is obtained through the corresponding matching processing.
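The cost and the winner-take-all selection can be sketched as follows, reusing `preset_encoding` from the previous sketch. The XOR term is computed as the Hamming distance between the two feature sequences, and candidate test pixels are drawn from the same row within a band around the column predicted by the initial disparity; the band half-width `search` and all names are assumptions for illustration.

```python
import numpy as np

def matching_cost(gray_l, gray_r, u, v_l, v_r, radius=1):
    """C = (number of differing window codes) + |center gray difference|."""
    code_l = preset_encoding(gray_l, u, v_l, radius)
    code_r = preset_encoding(gray_r, u, v_r, radius)
    hamming = int(np.count_nonzero(code_l != code_r))  # XOR term
    return hamming + abs(int(gray_l[u, v_l]) - int(gray_r[u, v_r]))

def best_match(gray_l, gray_r, u, v_l, d0, search=8, radius=1):
    """Winner-take-all over test pixels in row u, near the column
    predicted by the initial disparity: v_r ~ v_l + d0."""
    v_pred = v_l + int(round(d0))
    lo = max(radius, v_pred - search)
    hi = min(gray_r.shape[1] - 1 - radius, v_pred + search)
    costs = {v_r: matching_cost(gray_l, gray_r, u, v_l, v_r, radius)
             for v_r in range(lo, hi + 1)}
    return min(costs, key=costs.get)  # column of the minimum-cost pixel
```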
S3: Guided matching strategy (correcting the first depth image to obtain the second depth image).

Let the initial disparity map converted from the ToF image (i.e., the third image) be denoted I, let the result of binocular matching (i.e., the first depth image) be p, and let the output (i.e., the second depth image) be q. Guided matching can then be performed according to the following formulas (reconstructed in standard notation from the symbol definitions; the weight takes the form of a guided-image-filter kernel):

$$q_{i}=\sum_{j}W_{ij}(I)\,p_{j}$$

$$W_{ij}(I)=\frac{1}{|\omega|^{2}}\sum_{k:(i,j)\in\omega_{k}}\left(1+\frac{(I_{i}-\mu_{k})(I_{j}-\mu_{k})}{\sigma_{k}+\varepsilon+\tau}\right)$$

where q_i is the data value of the pixel numbered i in the second depth image, W_ij(I) is the correction weight, I is the disparity map, p_j is the data value of the pixel numbered j in the corresponding preset window in the first depth image, I_i and I_j are the data values of two adjacent pixels in the corresponding preset window in the disparity map, μ_k is the average of the data values of the pixels in that window, σ_k is the variance of those data values, ε is the penalty value, and τ is the disturbance value, a small perturbation that ensures the denominator is not zero.

At the edge of an object (i.e., of the target object), the difference between I_i and I_j is large, and because I_i and I_j lie on the two sides of the boundary, (I_i - μ_k) and (I_j - μ_k) have opposite signs; otherwise the difference is small and the signs agree.
The weight value at an edge is therefore much smaller than the weight value at a flat position: pixels in flat areas are given larger weights, where the smoothing effect is more pronounced, while pixels on the two sides of a boundary are given smaller weights, where the smoothing effect is weak, which serves to preserve the boundary.
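The sketch below applies such correction weights to smooth the first depth image under the guidance of the initial ToF disparity. It follows the guided-image-filter form reconstructed above; the placement of ε and τ in the denominator, the non-negativity clip, and the per-pixel renormalization (used in place of the 1/|ω|² factor) are assumptions made for a stable, readable illustration, not the patent's prescribed computation.

```python
import numpy as np

def guided_correction(p, I, radius=2, eps=1e-2, tau=1e-6):
    """Smooth the binocular depth p guided by the initial disparity I:
    q_i = sum_j W_ij(I) * p_j with the edge-aware kernel described above.
    Direct O(N * window^2) form, written for clarity rather than speed."""
    H, W = I.shape
    q = np.zeros_like(p, dtype=np.float64)
    pad = radius
    Ip = np.pad(I.astype(np.float64), pad, mode='edge')
    pp = np.pad(p.astype(np.float64), pad, mode='edge')
    for y in range(H):
        for x in range(W):
            gi = Ip[y:y + 2 * pad + 1, x:x + 2 * pad + 1]  # guide window
            pj = pp[y:y + 2 * pad + 1, x:x + 2 * pad + 1]  # depth window
            mu, var = gi.mean(), gi.var()
            Ii = Ip[y + pad, x + pad]
            # Edge-aware weights: opposite signs of (I_i - mu) and
            # (I_j - mu) across a boundary shrink the weight there.
            w = 1.0 + (Ii - mu) * (gi - mu) / (var + eps + tau)
            w = np.clip(w, 0.0, None)       # keep weights non-negative
            q[y, x] = (w * pj).sum() / w.sum()
    return q
```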
S4: Filling featureless (blank) regions.

When the surface texture information of the object is insufficient, featureless regions may appear. The initial disparity map can be used to judge whether a featureless region lies on an object edge; if it does not, the data values of the object's interior surface (i.e., of the pixels in the non-blank area connected to the blank area) are used to fill these holes, yielding a more accurate depth image.

In this implementation, because local encoding (i.e., the preset encoding) replaces the gray value of the center pixel as the similarity measure of the window to be matched, while the gray-value difference of the window centers is still used, the non-parametric transform and the parametric transform are combined: pixel gray information is used reasonably without being relied on excessively, which improves processing accuracy.
As for the fusion matching strategy: considering the low accuracy and unreliability of the ToF depth map, it is used only as a guide rather than a dependency. This not only eliminates the effects of inconsistent light intensity received by the left and right cameras at different viewing angles, of differences caused by camera gain and level changes, and of differing noise across image acquisition channels, but also yields a dense disparity map with clear edges and smooth interiors.

The matching cost calculation above combines the advantages of the census transform and of absolute differences, preserving smooth continuity while eliminating the effects of illumination differences. Because the specific gray values and data values of the ToF data are not used directly, they are neither taken as "seed points" nor used for hierarchical matching by value, so their local errors are not amplified. In featureless areas, the holes are filled not with the initial disparity values converted from the ToF depth but with the disparity values produced by binocular matching itself under ToF guidance. During guided matching, the weight term is designed from the initial disparity obtained by ToF (small in edge areas, large in flat areas), which further reduces noise and smooths the result.
The above implementation can be based on a mobile intelligent terminal, realizing accurate 3D reconstruction with a dual-camera + ToF module; it can also be built as a larger module with a longer working distance, for uses such as robot 3D visual perception and guidance.
Through the above scenario example, the method and device for generating a high-precision dense depth image provided by the embodiments of the present application are verified: by obtaining, according to a preset encoding rule, the preset encodings of the pixels in the matching window adjacent to the pixel to be matched in the first image, combining them with the gray value of the pixel to be matched, and using the disparity map as a constraint, a more accurate matching pixel is determined from the second image through matching so as to determine the depth image. This reduces the matching error caused by illumination-induced differences in grayscale information, indeed solves the technical problems of existing methods, namely large errors and low accuracy in the determined depth image, and achieves the technical effect of obtaining a dense depth image with higher accuracy and resolution.
The devices or modules described in the above embodiments may be implemented by a computer chip or an entity, or by a product having a certain function. For convenience, the above devices are described with their functions divided into various modules; when implementing the present application, the functions of the modules may be realized in one or more pieces of software and/or hardware, or a module realizing one function may be implemented by a combination of several sub-modules. The device embodiments described above are merely illustrative: the division into modules is only a division by logical function, and other divisions are possible in actual implementation. For example, multiple modules or components may be combined or integrated into another system, or some features may be ignored or not executed.
Those skilled in the art also know that, in addition to implementing a controller purely as computer-readable program code, it is entirely possible to logically program the method steps so that the controller achieves the same functions in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller can therefore be regarded as a hardware component, and the means included within it for realizing various functions can also be regarded as structures within the hardware component, or even as both software modules for implementing a method and structures within a hardware component.
The present application may be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, classes, and the like that perform specific tasks or implement specific abstract data types. The present application may also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communication network, and program modules may be located in both local and remote computer storage media, including storage devices.
From the description of the above embodiments, those skilled in the art can clearly understand that the present application can be implemented by means of software plus the necessary general-purpose hardware platform. Based on this understanding, the technical solution of the present application in essence, or the part of it that contributes to the prior art, can be embodied in the form of a software product. The computer software product can be stored in a storage medium such as a ROM/RAM, a magnetic disk, or an optical disc, and includes a number of instructions to cause a computer device (which may be a personal computer, a mobile terminal, a server, a network device, etc.) to execute the methods described in the embodiments of the present application or in certain parts of the embodiments.

Abstract

An embodiment of the present application provides a method and device for generating a high-precision dense depth image. The method includes: acquiring a first image, a second image, and a third image; determining a disparity map from these images; determining the preset encodings of the pixels in the matching window of a pixel to be matched according to the gray value of the pixel to be matched, the gray values of the pixels in the matching window, and a preset encoding rule; determining a matching pixel from the second image according to the gray value of the pixel to be matched in the first image, the preset encodings of the pixels in its matching window, and the disparity map; and then determining a first depth image. By obtaining and using, according to the preset encoding rule, the preset encodings of the pixels in the matching window adjacent to the pixel to be matched, combining them with the gray value of the pixel to be matched, and taking the disparity map as a constraint, a matching pixel is determined from the second image through matching and a depth image is then determined, thereby solving the technical problem that depth images determined by existing methods have large errors and low accuracy.


Claims (11)

  1. A method for generating a high-precision dense depth image, characterized in that it comprises:
    acquiring a first image, a second image, and a third image, wherein the first image is image data containing a target object acquired through a left camera, the second image is image data containing the target object acquired through a right camera, and the third image is image data containing the target object acquired through a depth camera;
    determining a disparity map according to the first image, the second image, and the third image;
    determining preset encodings of the pixels in a matching window of a pixel to be matched according to the gray value of the pixel to be matched in the first image, the gray values of the pixels in the matching window of the pixel to be matched, and a preset encoding rule;
    determining, from the second image, a matching pixel corresponding to the pixel to be matched in the first image according to the gray value of the pixel to be matched in the first image, the preset encodings of the pixels in the matching window of the pixel to be matched, and the disparity map; and
    determining a first depth image according to the pixel to be matched in the first image and the matching pixel in the second image corresponding to the pixel to be matched in the first image.
  2. The method according to claim 1, characterized in that determining the preset encodings of the pixels in the matching window of the pixel to be matched according to the gray value of the pixel to be matched in the first image, the gray values of the pixels in the matching window of the pixel to be matched, and the preset encoding rule comprises:
    comparing the gray values of the pixels in the matching window of the pixel to be matched with the gray value of the pixel to be matched in the first image respectively; and
    according to the comparison results, setting to 1 the preset encoding of each pixel in the matching window whose gray value is less than or equal to the gray value of the pixel to be matched, and setting to 0 the preset encoding of each pixel in the matching window whose gray value is greater than the gray value of the pixel to be matched.
  3. The method according to claim 1, characterized in that determining, from the second image, the matching pixel corresponding to the pixel to be matched in the first image according to the gray value of the pixel to be matched in the first image, the preset encodings of the pixels in the matching window of the pixel to be matched, and the disparity map comprises:
    screening out a plurality of pixels from the second image as test pixels according to the first coordinate of the pixel to be matched and the disparity map;
    determining the gray value of each test pixel and the preset encodings of the pixels in the matching window of each test pixel;
    calculating the matching cost between the pixel to be matched and each test pixel according to the gray value of the pixel to be matched, the preset encodings of the pixels in the matching window of the pixel to be matched, the gray value of the test pixel, and the preset encodings of the pixels in the matching window of the test pixel; and
    determining the test pixel with the minimum matching cost as the matching pixel corresponding to the pixel to be matched in the first image.
  4. The method according to claim 3, characterized in that the matching cost between the pixel to be matched and the test pixel is calculated according to the following formula (reconstructed from the symbol definitions, with ⊕ denoting XOR):

    $$C=\sum_{k=1}^{n}\left(c_{l}^{k}\oplus c_{r}^{k}\right)+\left|I_{l}-I_{r}\right|$$

    wherein C denotes the matching cost between the pixel to be matched and the test pixel, I_l the gray value of the pixel to be matched in the first image, I_r the gray value of the test pixel in the second image, c_l^k the preset encoding of the pixel numbered k in the matching window of the pixel to be matched in the first image, c_r^k the preset encoding of the pixel numbered k in the matching window of the test pixel in the second image, and n the total number of pixels in the matching window.
  5. The method according to claim 1, characterized in that, after determining the first depth image according to the pixel to be matched in the first image and the matching pixel in the second image corresponding to the pixel to be matched in the first image, the method further comprises:
    generating correction weights according to the disparity map; and
    determining a second depth image according to the correction weights and the first depth image.
  6. The method according to claim 5, characterized in that determining the second depth image according to the correction weights and the first depth image comprises calculating the data values of the pixels in the second depth image according to the following formula:

    $$q_{i}=\sum_{j}W_{ij}(I)\,p_{j}$$

    wherein q_i denotes the data value of the pixel numbered i in the second depth image, W_ij(I) the correction weight, I the disparity map, and p_j the data value of the pixel numbered j in the corresponding preset window in the first depth image.
  7. The method according to claim 6, characterized in that the correction weight is determined according to the following formula (reconstructed in the form of a guided-image-filter kernel):

    $$W_{ij}(I)=\frac{1}{|\omega|^{2}}\sum_{k:(i,j)\in\omega_{k}}\left(1+\frac{(I_{i}-\mu_{k})(I_{j}-\mu_{k})}{\sigma_{k}+\varepsilon+\tau}\right)$$

    wherein I_i and I_j denote the data values of two adjacent pixels in the corresponding preset window in the disparity map, μ_k the average of the data values of the pixels in the corresponding preset window in the disparity map, σ_k the variance of the data values of the pixels in the corresponding preset window in the disparity map, ε the penalty value, and τ the disturbance value.
  8. The method according to claim 1, characterized in that, after determining the first depth image according to the pixel to be matched in the first image and the matching pixel in the second image corresponding to the pixel to be matched in the first image, the method further comprises:
    detecting whether a blank area exists in the first depth image, wherein a blank area is a region containing multiple pixels whose data value is 0;
    when it is determined that a blank area exists in the first depth image, acquiring the data values of the pixels in the non-blank area of the first depth image that are connected to the blank area; and
    modifying the data values of the pixels in the blank area according to the data values of the pixels in the non-blank area of the first depth image that are connected to the blank area.
  9. A device for generating a high-precision dense depth image, characterized in that it comprises:
    an acquisition module, configured to acquire a first image, a second image, and a third image, wherein the first image is image data containing a target object acquired through a left camera, the second image is image data containing the target object acquired through a right camera, and the third image is image data containing the target object acquired through a depth camera;
    a first determination module, configured to determine a disparity map according to the first image, the second image, and the third image;
    a second determination module, configured to determine the preset encodings of the pixels in the matching window of a pixel to be matched according to the gray value of the pixel to be matched in the first image, the gray values of the pixels in the matching window of the pixel to be matched, and a preset encoding rule;
    a third determination module, configured to determine, from the second image, the matching pixel corresponding to the pixel to be matched in the first image according to the gray value of the pixel to be matched in the first image, the preset encodings of the pixels in the matching window of the pixel to be matched, and the disparity map; and
    a fourth determination module, configured to determine a first depth image according to the pixel to be matched in the first image and the matching pixel in the second image corresponding to the pixel to be matched in the first image.
  10. An electronic device, comprising a processor and a memory for storing processor-executable instructions, characterized in that the processor, when executing the instructions, implements the steps of the method according to any one of claims 1 to 8.
  11. A computer-readable storage medium having computer instructions stored thereon, characterized in that the instructions, when executed, implement the steps of the method according to any one of claims 1 to 8.
PCT/CN2019/121495 2018-12-12 2019-11-28 Method and device for generating a high-precision dense depth image WO2020119467A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811519461.4 2018-12-12
CN201811519461.4A CN109640066B (zh) 2018-12-12 2018-12-12 Method and device for generating a high-precision dense depth image

Publications (1)

Publication Number Publication Date
WO2020119467A1 true WO2020119467A1 (zh) 2020-06-18

Family

ID=66073325

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/121495 WO2020119467A1 (zh) Method and device for generating a high-precision dense depth image

Country Status (2)

Country Link
CN (1) CN109640066B (zh)
WO (1) WO2020119467A1 (zh)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109640066B (zh) * 2018-12-12 2020-05-22 深圳先进技术研究院 Method and device for generating a high-precision dense depth image
CN111656404B (zh) * 2019-05-30 2024-03-01 深圳市大疆创新科技有限公司 Image processing method and system, and movable platform
CN110335211B (zh) * 2019-06-24 2021-07-30 Oppo广东移动通信有限公司 Depth image correction method, terminal device, and computer storage medium
CN110782412B (zh) * 2019-10-28 2022-01-28 深圳市商汤科技有限公司 Image processing method and apparatus, processor, electronic device, and storage medium
CN113034585B (zh) * 2021-04-25 2023-02-28 歌尔光学科技有限公司 Offset state testing method, testing device, and storage medium
WO2023225825A1 (zh) * 2022-05-23 2023-11-30 上海玄戒技术有限公司 Method and apparatus for generating a position difference map, electronic device, chip, and medium
CN115049980A (zh) * 2022-06-16 2022-09-13 威海经济技术开发区天智创新技术研究院 Image-based target object determination method and apparatus, and electronic device


Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101312539B (zh) * 2008-07-03 2010-11-10 浙江大学 Hierarchical image depth extraction method for three-dimensional television
CN101720047B (zh) * 2009-11-03 2011-12-21 上海大学 Method for obtaining depth images through multi-view stereo matching based on color segmentation
KR20140039649A (ko) * 2012-09-24 2014-04-02 삼성전자주식회사 Multi-view image generation method and multi-view image display apparatus
WO2017101108A1 (en) * 2015-12-18 2017-06-22 Boe Technology Group Co., Ltd. Method, apparatus, and non-transitory computer readable medium for generating depth maps

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102523464A (zh) * 2011-12-12 2012-06-27 上海大学 Depth image estimation method for binocular stereoscopic video
CN105574838A (zh) * 2014-10-15 2016-05-11 上海弘视通信技术有限公司 Image registration and stitching method and apparatus for multi-view cameras
US20170374352A1 (en) * 2016-06-22 2017-12-28 Intel Corporation Depth image provision apparatus and method
CN108520554A (zh) * 2018-04-12 2018-09-11 无锡信捷电气股份有限公司 Binocular three-dimensional dense mapping method based on ORB-SLAM2
CN109640066A (zh) 2018-12-12 2019-04-16 深圳先进技术研究院 Method and device for generating a high-precision dense depth image

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WANG, WEI ET AL: "Multi-view Dense Depth Map Estimation through Match Propagation", ACTA AUTOMATICA SINICA, vol. 40, no. 12, 31 December 2014 (2014-12-31), XP009521512, ISSN: 0254-4156 *

Also Published As

Publication number Publication date
CN109640066B (zh) 2020-05-22
CN109640066A (zh) 2019-04-16

Similar Documents

Publication Publication Date Title
WO2020119467A1 (zh) Method and device for generating a high-precision dense depth image
TWI729995B (zh) Generating merged and fused three-dimensional point clouds based on captured images of a scene
US11010924B2 Method and device for determining external parameter of stereoscopic camera
EP2992508B1 Diminished and mediated reality effects from reconstruction
WO2018127007A1 (zh) Depth map acquisition method and system
WO2018119889A1 (zh) Three-dimensional scene positioning method and apparatus
KR100513055B1 (ko) Apparatus and method for generating a three-dimensional scene model through fusion of disparity maps and depth maps
JP6417702B2 (ja) Image processing apparatus, image processing method, and image processing program
EP3135033B1 (en) Structured stereo
WO2022127918A1 (zh) Stereo calibration method, apparatus, and system for a binocular camera, and binocular camera
JP6883608B2 (ja) Depth data processing system capable of optimizing depth data through image registration on depth maps
CN110176032B (zh) Three-dimensional reconstruction method and apparatus
CN111368717B (zh) Gaze determination method and apparatus, electronic device, and computer-readable storage medium
US20170132803A1 (en) Apparatus and method for processing a depth image
CN111160232B (zh) Frontal face reconstruction method, apparatus, and system
WO2018216341A1 (ja) Information processing apparatus, information processing method, and program
CN113034568A (zh) Machine vision depth estimation method, apparatus, and system
CN110619660A (zh) Object localization method and apparatus, computer-readable storage medium, and robot
WO2022135588A1 (zh) Image correction method, apparatus, and system, and electronic device
CN111739071B (zh) Fast iterative registration method based on initial values, and medium, terminal, and device
CN116029996A (zh) Stereo matching method and apparatus, and electronic device
CN112184811A (zh) Structural calibration method and apparatus for a monocular spatial structured-light system
CN116129037A (zh) Visuotactile sensor and three-dimensional reconstruction method, system, device, and storage medium therefor
CN111882655A (zh) Three-dimensional reconstruction method, apparatus, system, computer device, and storage medium
JP2019091122A (ja) Depth map filtering apparatus, depth map filtering method, and program

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19894712

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 03.11.2021)

122 Ep: pct application non-entry in european phase

Ref document number: 19894712

Country of ref document: EP

Kind code of ref document: A1