WO2022021912A1 - A low-power stereo matching system and method for acquiring depth information - Google Patents


Info

Publication number
WO2022021912A1
Authority
WO
WIPO (PCT)
Prior art keywords
area, pixel, cost, value, target
Prior art date
Application number
PCT/CN2021/083603
Other languages
English (en)
French (fr)
Inventor
安丰伟
付宇哲
董平成
陈卓宇
李卓奥
Original Assignee
南方科技大学
Priority date
Filing date
Publication date
Application filed by 南方科技大学
Publication of WO2022021912A1 publication Critical patent/WO2022021912A1/zh

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06T — IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 — Image analysis
    • G06T7/50 — Depth or shape recovery
    • G06T7/55 — Depth or shape recovery from multiple images
    • G06T7/593 — Depth or shape recovery from multiple images from stereo images
    • G06T7/30 — Determination of transform parameters for the alignment of images, i.e. image registration
    • Y — GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 — TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D — CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 — Reducing energy consumption in communication networks
    • Y02D30/70 — Reducing energy consumption in communication networks in wireless communication networks

Definitions

  • the present application relates to the technical field of stereo vision, and in particular to a low-power stereo matching system and a method for acquiring depth information.
  • Stereo matching is a major research direction in computer vision. It extracts and reconstructs the three-dimensional information of a target object from two-dimensional images, and is widely used in intelligent robot systems, unmanned vehicle systems, industrial measurement and other fields. In a stereo vision system, the accuracy and speed of the stereo matching algorithm therefore directly affect the quality of 3D reconstruction. The algorithm finds a one-to-one correspondence between pixels by establishing an energy cost function, and estimates the disparity value of each pixel by minimizing that function. However, on current stereo matching test platforms, the large amount of computation and resources required by complex high-precision algorithms often makes them difficult to port to hardware.
  • the present application provides a low-power stereo matching system, which includes a census module, a multi-path cost aggregation module, and a depth calculation module, and further includes a partition optimization module disposed between the census module and the multi-path cost aggregation module;
  • the census module is used to input the left and right images of the target, and output multiple initial pixel cost values of the target pixels;
  • the census module includes a left census transcoding unit for converting the pixel stream of the left image, a right census transcoding unit for converting the pixel stream of the right image, and a Hamming distance calculation unit for determining a plurality of initial pixel cost values of each target pixel; wherein the left image or the right image serves as the target image, and the target image includes several target pixels;
  • the partition optimization module includes a plurality of area units for determining the minimum area cost value representing each area and the position of the best matching point within the area of that minimum area cost value; each area unit includes two first minimum subunits arranged in parallel and a second minimum subunit commonly connected to the output ends of the two first minimum subunits;
  • the multi-path cost aggregation module includes several first-in-first-out units for transmitting the cost aggregation value of each area and assisting in updating the cost of each area, several cost aggregation modules for determining the path cost values in the different preset path directions corresponding to each area and aggregating the multiple path cost values of the same area, and an addition aggregation module for determining the multiple energy function values of each area;
  • the depth calculation module includes an energy minimum search module for determining the minimum energy function value among the plurality of energy function values of each region, a shift module for determining the disparity region of the minimum energy function value of the corresponding region, and an addition module for determining the depth information of each target pixel from the disparity region output by the shift module and the position of the region's best matching point output by the partition optimization module for the corresponding region.
  • the right census transcoding unit includes a right line buffer unit, a right census window unit and a right census comparison unit;
  • the left census transcoding unit includes a left line buffer unit, a left census window unit and a left census comparison unit, wherein the right line buffer unit and the left line buffer unit have the same structure, while the right census window unit and the right census comparison unit differ in structure from the left census window unit and the left census comparison unit, respectively.
  • when the left image is taken as the target image and the right image as the non-target image: the left census window unit includes a left sliding window for traversing the left image, and the left sliding window includes several registers; the left census comparison unit includes a left comparison window, and the left comparison window includes a plurality of comparators; the right census window unit includes right disparity windows for traversing each disparity pixel within the preset disparity-threshold area around the target pixel in the right image, the number of right disparity windows equals the preset disparity threshold, and each right disparity window includes several registers; the right census comparison unit includes a preset-disparity-threshold number of right comparison windows, each of which includes several comparators;
  • when the right image is taken as the target image and the left image as the non-target image: the right census window unit includes a right sliding window for traversing the right image, and the right sliding window includes several registers; the right census comparison unit includes a right comparison window, and the right comparison window includes several comparators; the left census window unit includes left disparity windows for traversing each disparity pixel within the preset disparity-threshold area around the target pixel in the left image, the number of left disparity windows equals the preset disparity threshold, and each left disparity window includes several registers; the left census comparison unit includes a preset-disparity-threshold number of left comparison windows, each of which includes several comparators.
  • each first minimum subunit includes a first comparator, a first data multiplexer and a first position multiplexer; the first data multiplexer is used to output the minimum initial pixel cost value among the plurality of input initial pixel cost values, and the first position multiplexer is used to output the position information corresponding to that minimum initial pixel cost value;
  • the second minimum subunit includes a second comparator, a second data multiplexer, two second position multiplexers and an AND gate unit; the second data multiplexer is used to obtain the minimum initial pixel cost value from the output data of the two first minimum subunits, and the second position multiplexers are used to output the best-matching position information corresponding to that minimum initial pixel cost value.
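The two-level comparator tree described above (two first minimum subunits feeding a second minimum subunit) can be sketched in software. This is a minimal Python sketch, not the patent's hardware; the function names `min_with_pos` and `region_min` are assumptions for illustration.

```python
def min_with_pos(costs, offset=0):
    """First minimum subunit: a comparator plus two multiplexers
    selecting the smaller cost value and its position index."""
    best, best_pos = costs[0], offset
    for i, c in enumerate(costs[1:], start=1):
        if c < best:
            best, best_pos = c, offset + i
    return best, best_pos

def region_min(costs):
    """Second minimum subunit: combine the two first-minimum results
    into the region's minimum cost and best-matching position."""
    half = len(costs) // 2
    lo = min_with_pos(costs[:half], 0)
    hi = min_with_pos(costs[half:], half)
    return lo if lo[0] <= hi[0] else hi
```

For example, `region_min([5, 3, 7, 2, 9, 4])` selects cost 2 at position 3, which is what the second data multiplexer and position multiplexers would forward to the multi-path cost aggregation module.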
  • each FIFO unit includes three transmission units and one direction FIFO unit, the total number of direction FIFO units being the same as the number of cost aggregation modules.
  • the present application also provides a method for obtaining depth information by a low-power stereo matching system, the method comprising the following steps:
  • the left image or the right image is used as the target image, and the target image includes several target pixels;
  • performing a Hamming distance calculation between the first binary code stream of the target pixel and each second binary code stream of the disparity pixels, respectively, so as to obtain multiple initial pixel cost values;
  • determining, from the target pixel and all disparity pixels, the first binary code stream of the target pixel and the second binary code stream of each disparity pixel, and performing a Hamming distance calculation between the first binary code stream and each second binary code stream to obtain a plurality of initial pixel cost values of the target pixel, specifically includes:
  • comparing the grayscale value of the target pixel with the grayscale values of all neighboring pixels of the target pixel, and outputting the first binary code stream of the target pixel, specifically includes:
  • calculating the Hamming distance between the first binary code stream and each second binary code stream, and determining the multiple initial pixel cost values of the target pixel, specifically includes:
  • the location of the best regional matching point of the minimum regional cost value specifically includes:
  • the position of the best matching point in the area of the minimum area cost value in the target image is obtained.
  • determining, based on the minimum area cost value of each area and multiple preset path directions, the path cost value in each preset path direction corresponding to the area, and aggregating the path cost values of the multiple preset path directions of the same area to obtain the energy function value of each area, specifically includes:
  • the preset path directions include 0°, 45°, 90° and 135°;
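The four preset path directions and the addition-aggregation step can be sketched as follows. This is an illustrative Python sketch; the pixel-step offsets assume row-major image coordinates, and the names `PATH_DIRECTIONS` and `aggregate` are assumptions, not terms from the patent.

```python
# Step offsets (drow, dcol) for the four aggregation paths, assuming
# row-major coordinates: 0° is the horizontal path, 90° the vertical
# one, and 45° / 135° the two diagonals.
PATH_DIRECTIONS = {
    0:   (0, -1),   # left neighbour on the same row
    45:  (-1, 1),   # upper-right diagonal
    90:  (-1, 0),   # pixel directly above
    135: (-1, -1),  # upper-left diagonal
}

def aggregate(path_costs):
    """Addition-aggregation step: the energy function value of a
    region is the sum of its path cost values over all directions."""
    return sum(path_costs.values())
```

With per-direction path costs `{0: 3, 45: 1, 90: 2, 135: 4}`, the energy function value of the region is their sum, 10.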
  • determining, based on the energy function values corresponding to all regions and the best-matching point position of each region, the disparity region corresponding to the minimum energy function value among all the energy function values, and obtaining the depth information of the target pixel based on that disparity region and the best-matching point position of the region where the minimum energy function value is located, specifically includes:
  • the depth information of the target pixel is obtained based on the position of the best matching point in the target pixel's area and the optimal parallax area.
  • the present application provides a low-power stereo matching system and a method for acquiring depth information;
  • the system includes a census module, a partition optimization module, a multi-path cost aggregation module and a depth calculation module connected in sequence.
  • This application uses the Census algorithm as the initial pixel cost function, and adds a partition optimization module between the census module and the multi-path cost aggregation module to simplify the initial pixel cost values. Based on sub-region processing and the optimal disparity position, it reduces the number of initial pixel cost values passed into the multi-path cost aggregation module without affecting accuracy, thereby reducing the algorithm's time and resource consumption while preserving precision.
  • FIG. 1 is a structural block diagram of a low power consumption stereo matching system provided by the present application.
  • FIG. 2 is a structural block diagram of a census module in the low power consumption stereo matching system of the present application.
  • FIG. 3 is a structural block diagram of a region unit in the partition optimization module of the present application.
  • FIG. 4 is a structural block diagram of the first minimum sub-unit in the partition optimization module of the present application.
  • FIG. 5 is a structural block diagram of the second smallest sub-unit in the partition optimization module of the present application.
  • FIG. 6 is a structural block diagram of a multi-path cost aggregation module of the present application.
  • FIG. 7 is an example diagram of four directions of a target pixel of the present application.
  • FIG. 8 is a flowchart of a method for acquiring depth information in a low-power stereo matching system provided by the present application.
  • FIG. 9 is a flowchart of step S30 of a method for acquiring depth information by a low-power stereo matching system.
  • FIG. 10 is a flowchart of step S40 of the method for acquiring depth information by the low-power stereo matching system.
  • FIG. 11 is a flowchart of step S50 of a method for acquiring depth information by a low-power stereo matching system
  • FIG. 12 is a left image of an object in an application scene.
  • FIG. 13 is a right image of the same object in an application scene.
  • FIG. 14 is an application scene outputting a depth map of the same target.
  • the present application provides a low-power stereo matching system and a method for obtaining depth information.
  • the present application is further described in detail below with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are only used to explain the present application, but not to limit the present application.
  • the basic principle of the stereo matching system is as follows: calibrated cameras at different positions shoot the same target to obtain two-dimensional images from different angles; the depth information of a spatial point is then obtained from the difference between its corresponding pixel positions in the two images.
  • an image composed of depth information is called a depth image.
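The conversion from a pixel-position difference (disparity) to depth is the standard pinhole-stereo triangulation Z = f·B/d. A minimal sketch, assuming a focal length in pixels and a baseline in metres (calibration parameters not specified by the patent):

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Standard pinhole-stereo triangulation: depth Z = f * B / d,
    with focal length f in pixels, baseline B in metres, and
    disparity d in pixels."""
    if disparity_px <= 0:
        return float("inf")  # zero disparity => point at infinity
    return focal_px * baseline_m / disparity_px
```

For example, with f = 500 px, B = 0.1 m and d = 10 px, the point lies 5 m from the cameras; larger disparities correspond to closer points.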
  • the inventor's research shows that the general stereo matching algorithm can be divided into a global stereo matching algorithm, a semi-global stereo matching algorithm (SGM), and a local stereo matching algorithm according to different energy cost functions.
  • the global stereo matching algorithm mainly adopts the global optimization theory method to estimate the disparity, establishes the global energy function, and obtains the optimal disparity value by minimizing the global energy function.
  • the local matching algorithm mainly uses the local optimization method to estimate the disparity value, likewise via energy minimization.
  • the semi-global stereo matching algorithm uses the constraint information of each pixel and its neighbors (a characteristic of local stereo matching) and applies dynamic programming to approximate two-dimensional smoothness constraints (a characteristic of global stereo matching) by one-dimensional smoothness constraints along multiple directions. It then combines the data along each one-dimensional path and introduces depth-dependent penalty factors as a smoothing term, so that the algorithm is robust to noise while maintaining accuracy.
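The per-path dynamic programming with penalty factors described above follows the textbook SGM recurrence. The sketch below is the classic formulation (Hirschmüller's SGM), not the patent's exact hardware; the penalties `p1`/`p2` and the function name are assumptions.

```python
def path_cost_step(prev, cost, p1, p2):
    """One step of the classic SGM recurrence along a 1D path:
    L(d) = C(d) + min(Lp(d), Lp(d-1)+P1, Lp(d+1)+P1, min(Lp)+P2) - min(Lp)
    where Lp is the previous pixel's path cost over all disparities d,
    P1 penalises small disparity changes and P2 large ones."""
    m = min(prev)
    out = []
    for d, c in enumerate(cost):
        candidates = [prev[d], m + p2]
        if d > 0:
            candidates.append(prev[d - 1] + p1)
        if d + 1 < len(prev):
            candidates.append(prev[d + 1] + p1)
        out.append(c + min(candidates) - m)
    return out
```

Subtracting `min(Lp)` keeps the path costs bounded as the recurrence runs along the image, which is what makes a fixed-width hardware implementation feasible.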
  • the present application provides a low-power stereo matching system and a method for obtaining depth information.
  • using the Census algorithm as the initial pixel cost function, a partition optimization module is added between the census module and the multi-path cost aggregation module to simplify the initial pixel cost values. Based on sub-region processing and the optimal disparity position, the number of initial pixel cost values passed into the multi-path cost aggregation module is reduced without affecting accuracy, thereby cutting the algorithm's time and resource consumption while preserving precision.
  • FIG. 1 is a structural block diagram of a low-power stereo matching system provided by the present application.
  • the low-power stereo matching system includes a census module 1, a partition optimization module 2, a multi-path cost aggregation module 3 and a depth calculation module 4. The partition optimization module 2 simplifies the number of initial pixel cost values matched in the census module 1, thereby reducing the number of initial pixel cost values passed into the multi-path cost aggregation module 3; aggregation is then performed based on the optimal disparity position and the multiple path costs, and finally high-precision depth information of each target pixel is obtained.
  • the census module 1 (also known as the Census module) is used to input the left and right images of the target, and output multiple initial pixel cost values of the target pixels.
  • the census module 1 includes a left census transcoding unit 11 for converting the pixel stream of the left image, a right census transcoding unit 12 for converting the pixel stream of the right image, and a Hamming distance calculation unit 13 for determining the initial pixel cost values of each target pixel.
  • the Census transformation is implemented by the left census transcoding unit 11 and the right census transcoding unit 12 respectively: each encodes the grayscale values of the input pixels of its image into a binary code stream. The left census transcoding unit 11 encodes the grayscale values of the pixels of the input left image into a left pixel stream, and the right census transcoding unit 12 encodes the grayscale values of the pixels of the input right image into a right pixel stream, so as to capture the ordering relationship between the grayscale value of each neighborhood pixel and that of the central pixel; the output of the initial pixel cost value of each target pixel is then realized by the Hamming distance calculation unit 13.
  • the initial pixel cost value measures the similarity of pixel blocks in the two images obtained by the binocular camera. For example: a disparity search range is preset; if the calibrated right camera is used as the reference, a pixel is selected in turn in the image acquired by the right camera, and the matching cost is computed against the pixel at the same coordinates in the image acquired by the calibrated left camera and all pixels within the disparity range to its right, yielding the initial pixel cost values.
  • the lower the initial pixel cost value, the better the match between the pixel in the right image and the corresponding pixel in the left image at the given disparity.
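The disparity search just described can be sketched as computing, for one target pixel in the right image, the Hamming distance to every candidate within the disparity range in the left image. A minimal Python sketch operating on precomputed census bit lists; the function names and data layout are assumptions for illustration.

```python
def initial_costs(census_right_row, census_left_row, col, max_disp):
    """Initial pixel cost values for the target pixel at column `col`
    of a right-image row: Hamming distance from its census code to
    each candidate within the disparity search range to its right
    in the corresponding left-image row."""
    def hamming(a, b):
        return sum(x ^ y for x, y in zip(a, b))
    target = census_right_row[col]
    costs = []
    for d in range(max_disp):
        if col + d < len(census_left_row):  # stay inside the image
            costs.append(hamming(target, census_left_row[col + d]))
    return costs
```

The lowest value in the returned list marks the best-matching disparity; in the patent this winner-take-all selection over sub-regions is what the partition optimization module performs.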
  • FIG. 2 is a structural block diagram of the census module 1 in the low-power stereo matching system.
  • the right census transcoding unit 12 includes a right line buffer unit 121 , a right census window unit (not shown in the figure) and a right census comparison unit (not shown in the figure).
  • the left census transcoding unit 11 includes The left line buffer unit 111, the left census window unit (not shown in the figure) and the left census comparison unit (not shown in the figure), wherein the right line buffer unit 121 and the left line buffer unit 111 have the same structure, the right line The census window unit (not shown in the figure) and the right census comparison unit (not shown in the figure) are respectively different in structure from the left census window unit (not shown in the figure) and the left census comparison unit (not shown in the figure).
  • the right line buffer unit 121, the right census window unit (not shown in the figure), the right census comparison unit (not shown in the figure) and the Hamming distance calculation unit 13 form a four-stage pipeline, so that every four cycles the Census transforms of one right-image pixel and its Hamming distance values (that is, initial pixel cost values) against all pixels in the preset disparity search range are completed.
  • the right line buffer unit 121 (linebuffer) is used to sequentially buffer two consecutive lines of pixel value information of the right image pixel stream. It includes a first line buffer unit 1211 (linebuffer1) and a second line buffer unit 1212 (linebuffer2); the two have the same depth, and each consists of a first-in, first-out queue.
  • the right image is input to the right line buffer unit, and two lines of pixel value information of the right image are sequentially buffered by the right line buffer unit in a first-in, first-out pixel-stream manner.
  • each pixel has a grayscale value, and the depth of the line buffers depends on the number of columns of the input image.
  • taking the input right image as an example, assume the right image is 640*480: there are 640 pixels in a line, so the depths of linebuffer1 and linebuffer2 are both 640. The right image is input starting from the upper-left corner and ending at the lower-right corner, with one pixel grayscale value passed into linebuffer1 per clock cycle in the form of a pixel stream.
  • after 640 cycles, the first line of pixels of the right image has been transferred and linebuffer1 is filled; after 1280 cycles, the transmission of the pixel value information of the second row of the right image is completed.
  • linebuffer1 then holds the pixel value information of the second row of the right image
  • linebuffer2 then holds the pixel value information of the first row of the right image.
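The FIFO behavior of linebuffer1/linebuffer2 can be sketched with a bounded deque: each cycle one pixel is pushed and, once the buffer holds a full row, the oldest pixel spills into the next buffer. A software sketch only; the helper names are assumptions.

```python
from collections import deque

def make_linebuffer(width):
    """A FIFO line buffer holding exactly one image row, as in
    linebuffer1/linebuffer2 (depth = number of image columns)."""
    return deque(maxlen=width)

def push(lb, pixel):
    """Push one pixel per cycle; once the buffer is full, return the
    evicted oldest pixel (which would feed the next line buffer)."""
    evicted = lb[0] if len(lb) == lb.maxlen else None
    lb.append(pixel)
    return evicted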
  • the right census window unit (not shown in the figure) includes a right sliding window 1221 for traversing the right image.
  • the right census window unit (not shown in the figure) traverses the entire right image with the right sliding window 1221 and outputs the grayscale values of all the pixels contained in the right sliding window 1221.
  • the right sliding window 1221 is composed of several registers, and its size is predefined according to user needs; a 3*3 window matches the 8-bit code streams with minimal overhead.
  • the right census window unit (not shown in the figure) is a three-stage pipeline structure.
  • in this embodiment of the present application, the right sliding window 1221 preferably includes 9 registers forming a 3*3 matrix, with each layer of the sliding window including 3 registers.
  • the first layer of the sliding window is composed of register 1, register 2 and register 3.
  • the second layer of the sliding window is composed of register 4, register 5 and register 6.
  • the third layer of the sliding window is composed of register 7, register 8 and register 9.
  • register 5 holds the center pixel of the right sliding window 1221, while registers 1, 2, 3, 4, 6, 7, 8 and 9 hold the 8 pixels in the neighborhood of that center pixel; these 8 pixels are therefore called the neighbor pixels of the center pixel.
  • the center pixel is the target pixel.
  • the right census comparison unit (not shown) takes the current center pixel as a reference pixel while the right sliding window 1221 traverses the right image area, and compares the grayscale value of the reference pixel with the grayscale value of each pixel in the neighborhood of the center pixel of the right sliding window 1221: if the grayscale value of a neighborhood pixel is less than or equal to the grayscale value of the reference pixel, the right census comparison unit (not shown in the figure) outputs 0; if the grayscale value of the neighborhood pixel is greater than the grayscale value of the reference pixel, the right census comparison unit outputs 1.
  • the right census comparison unit corresponds to the right census window unit (not shown in the figure) and is likewise a three-level cascade; the right census comparison unit (not shown in the figure) includes a right comparison window 1231, and the right comparison window 1231 includes several comparators.
  • in this embodiment of the present application, the right comparison window 1231 preferably includes 8 comparators, also arranged as a 3*3 matrix: the first-layer and third-layer comparison windows each include 3 comparators, namely comparator 1, comparator 2 and comparator 3 for the first layer and comparator 7, comparator 8 and comparator 9 for the third layer, while the second-layer comparison window includes only 2 comparators, namely comparator 4 and comparator 6.
  • the output terminal of register 5 of the right sliding window 1221 is commonly connected to one input terminal of each of the 8 comparators.
  • the other 8 registers of the right sliding window 1221 correspond one-to-one with the 8 comparators of the right comparison window 1231: the output terminal of register 1 is connected to the input terminal of comparator 1, the output terminal of register 2 to the input terminal of comparator 2, the output terminal of register 3 to the input terminal of comparator 3, the output terminal of register 4 to the input terminal of comparator 4, the output terminal of register 6 to the input terminal of comparator 6, the output terminal of register 7 to the input terminal of comparator 7, the output terminal of register 8 to the input terminal of comparator 8, and the output terminal of register 9 to the input terminal of comparator 9.
  • the output values of the 8 comparators are concatenated bit by bit from left to right and from top to bottom, so that the right census comparison unit (not shown in the figure) yields an 8-bit string, that is, a binary code stream composed of 0s and 1s (also called the right binary code stream).
  • the pixel stream of the right image passes through the right census window unit (not shown in the figure), which extracts the grayscale value of each target pixel and the grayscale values of its neighboring pixels, and through the right census comparison unit (not shown in the figure), which compares the grayscale value of each target pixel with the grayscale value of each pixel in its neighborhood; the right census comparison unit thus outputs, in bit order, the right binary code stream corresponding to each target pixel of the right image.
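The 3*3 census transform performed by the window and comparison units can be sketched in a few lines of Python: each of the 8 neighbors is compared against the center pixel, emitting 0 when the neighbor's grayscale value is less than or equal to the center's and 1 otherwise, scanned left to right and top to bottom. A software sketch only; the function name is an assumption.

```python
def census_3x3(img, r, c):
    """Census transform of the 3x3 window centered at (r, c):
    one bit per neighbor, scanned left-to-right, top-to-bottom;
    0 if the neighbor's gray value <= center, 1 otherwise."""
    center = img[r][c]
    bits = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # the center pixel itself produces no bit
            bits.append(1 if img[r + dr][c + dc] > center else 0)
    return bits  # 8-bit binary code stream as a list of 0/1
```

On a window whose grayscale values increase left to right and top to bottom, the four pixels before the center compare as 0 and the four after it as 1, giving the code stream `0 0 0 0 1 1 1 1`.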
  • the left census transcoding unit 11 and the right census transcoding unit 12 use the same Census transformation method, that is, converting image pixels into a binary code stream.
  • the left census transcoding unit 11 includes a left line buffer unit 111, a left census window unit (not shown in the figure) and a left census comparison unit (not shown in the figure), wherein the left line buffer unit 111 and the right line buffer unit 121 have the same structure, while the left census window unit (not shown in the figure) and the left census comparison unit (not shown in the figure) differ in structure from the right census window unit (not shown in the figure) and the right census comparison unit (not shown in the figure), respectively.
  • the left line buffer unit 111 is likewise used to sequentially buffer two consecutive lines of pixel value information of the left image pixel stream, and includes a first line buffer unit 1111 and a second line buffer unit 1112.
  • the left line buffer unit 111 works in the same way as the right line buffer unit 121; see the implementation of the right line buffer unit 121 described above.
  • the left census window unit (not shown in the figure) includes left disparity windows 1121 for traversing each disparity pixel within the preset disparity-threshold region around the target pixel in the left image; that is, when the calibrated right image is the target image, each target pixel in the right image corresponds to a preset-disparity-threshold number of left disparity windows 1121 (also known as Disparity Range windows).
  • the number of left disparity windows 1121 equals the preset disparity threshold.
  • the preset disparity threshold is preferably 96, which obtains the optimal matching of the target pixels at low overhead.
  • the preset disparity threshold is not limited, however, and can be set according to user requirements, such as 48, 24, and so on.
  • a target pixel 1 in the right image corresponds to one right sliding window and one right comparison window.
  • for target pixel 1, the left image is searched within the preset disparity-threshold area (i.e., the neighborhood) around the corresponding position.
  • target pixel 1 thus has 96 disparity pixels, that is, 96 left disparity windows 1121.
  • the left disparity window 1121 of the present application is therefore not a single window: there is one window for each value within the preset disparity threshold.
  • this concurrent processing not only improves the search rate, but also further improves the matching accuracy of the target pixel.
  • each of the left disparity windows 1121 includes several registers; in this embodiment of the present application, each of the left disparity windows 1121 includes 9 registers, and the left census comparison unit (not marked in the figure) includes The preset parallax threshold number of left contrast windows 1131 , that is to say, the number of left parallax windows 1121 is the same as the number of left contrast windows 1131 , both are preset parallax thresholds, for example, 96 left contrast windows 1131 .
  • Each left contrast window 1131 includes 8 comparators, and each left contrast window 1131 corresponds to the registers of one left disparity window 1121:
  • The output of register 5 (the center) of the left disparity window 1121 is commonly connected to one input of each of the eight comparators.
  • The remaining 8 registers of the left disparity window 1121, excluding register 5, correspond one-to-one with the 8 comparators of the left contrast window 1131: the output of register 1 is connected to the input of comparator 1, the output of register 2
  • to the input of comparator 2, the output of register 3 to the input of comparator 3, the output of register 4 to the input of comparator 4, the output of register 6 to the input of comparator 5,
  • the output of register 7 to the input of comparator 6, the output of register 8 to the input of comparator 7, and the output of register 9 to the input of comparator 8.
  • The output values of the 8 comparators are concatenated bit by bit, from left to right and top to bottom, and an 8-bit bit string is obtained through the left census comparison unit (not shown in the figure), i.e., a binary code stream composed of 0s and 1s.
  • A target pixel in the right image outputs one right binary code stream, while the left image concurrently outputs 96 left binary code streams through the 96 left disparity windows 1121 and 96 left contrast windows 1131.
  • The Hamming distance calculation unit 13 is configured to receive the bit strings output by the left census transcoding unit 11 and the right census transcoding unit 12, and to compute the Hamming distance between the one right binary code stream of a target pixel and each of its left binary code streams; the Hamming distance calculation unit 13 thereby determines the multiple initial pixel cost values of each target pixel.
  • The Hamming distance value is the initial pixel cost value; it is the number of corresponding bit positions at which the two bit strings differ (i.e., one is 1 and the other is 0).
  • The Hamming distance calculation unit 13 includes a plurality of XOR gates arranged in parallel and an adder (not shown in the figure) commonly connected to the outputs of all the XOR gates.
  • The number of XOR gates preferably matches the width of the 8-bit strings, i.e., 8: XOR gate 1, XOR gate 2, ..., XOR gate 8.
  • Each XOR gate has two input terminals and one output terminal; the two inputs of each XOR gate are respectively connected to
  • the output of one comparator of the left census comparison unit (not shown in the figure) and the output of the corresponding comparator of the right census comparison unit (not shown in the figure), and the XOR outputs are connected to the adder. That is, for a target pixel, the one right binary code stream and the 96 left binary code streams are fed in parallel, bit by bit, to the two inputs of the XOR gates, the XOR operation is performed in each gate, and the results are passed to the adder.
  • The adder then outputs the 96 initial pixel cost values of the target pixel.
  • After the right camera reads the image data, the right-image pixel stream (pixelstream) is passed in turn into linebuffer1 (the first line buffer unit) and linebuffer2 (the second line buffer unit).
  • When linebuffer2 is full, as the pixelstream continues into linebuffer1, the incoming pixel, the rightmost data of linebuffer1, and the rightmost data of linebuffer2 respectively enter the three registers of stage 1 of the census window unit, and are shifted to the right as subsequent clocks arrive. Once the census window unit is filled, on the next clock cycle the 8 registers other than register 5 are passed into the 8 comparators at the corresponding positions of the census comparison unit, and register 5 is passed into the census comparison unit as the common reference.
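As a rough software stand-in for the line-buffer mechanism just described (an illustrative sketch only, not the hardware: the circuit shifts pixels through registers on each clock, while this generator simply indexes the buffered rows), the two buffered lines plus the incoming line can be modelled as emitting one 3×3 window per pixel position:

```python
def windows_3x3(image):
    """Software stand-in for linebuffer1/linebuffer2 plus the 3x3 census
    window: as pixels stream in row by row, emit each complete 3x3
    neighbourhood once enough rows and columns have been buffered."""
    height, width = len(image), len(image[0])
    for row in range(1, height - 1):
        for col in range(1, width - 1):
            yield [[image[row + dr][col + dc] for dc in (-1, 0, 1)]
                   for dr in (-1, 0, 1)]
```

Each emitted window corresponds to one filled state of the census window unit, with the window center playing the role of register 5.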
  • Each target pixel has 96 initial pixel cost values. For example, for a given pixel in the right image, several candidate pixels in the left image correspond to it, giving the matching cost under different disparity values; the number of cost values is the preset disparity range value. These values are also known as the initial pixel costs of the pixel.
  • The partition optimization module 2 divides the multiple initial pixel cost values of each target pixel into multiple areas and optimizes each area to obtain the minimum area cost value of each area and the position of the area's best matching point corresponding to that minimum area cost value; its purpose is to reduce the amount of initial pixel cost data output by the census module.
  • The optimization process divides the multiple initial pixel cost values of the target pixel into multiple areas, each containing several initial pixel cost values; the partition optimization module 2 includes a plurality of area units 20, each determining the minimum area cost value of its area and the position of the area's best matching point for that minimum area cost value.
  • Each area unit 20 uses a tree structure to process all the initial pixel cost values C1 to C4 it contains. Please refer to FIG. 3.
  • FIG. 3 is a structural block diagram of each area unit. As shown in FIG. 3, each area unit 20 includes two first minimum subunits 21 (min_first) arranged in parallel and a second minimum subunit 22 (min_second) commonly connected to the outputs of the two first minimum subunits 21.
  • FIG. 4 is a structural block diagram of the first minimum subunit 21,
  • and FIG. 5 is a structural block diagram of the second minimum subunit 22.
  • The first minimum subunit 21 includes a first comparator, a first data multiplexer (mux), and a first position multiplexer (mux); the first data multiplexer outputs the smaller of any two initial pixel cost values, and the first position multiplexer outputs the position value corresponding to that smaller cost value (the position of the best matching pixel point, represented by 1 bit).
  • The first comparator includes two input terminals and one output terminal,
  • the first data multiplexer includes three input terminals and one output terminal,
  • and the first position multiplexer includes three input terminals and one output terminal.
  • The two input terminals of the first comparator and two input terminals of the first data multiplexer are connected to any two initial pixel cost values, such as C1 and C2.
  • The third input of the first data multiplexer and the third input of the first position multiplexer are commonly connected to the output of the first comparator, and the output of the first data multiplexer
  • and the output of the first position multiplexer are commonly connected to the inputs of the second minimum subunit.
  • If c1 ≤ c2, the first data multiplexer outputs c1 and the first position multiplexer outputs 0, representing the best matching pixel position; otherwise, if c1 > c2, the first data multiplexer outputs c2 and the first position multiplexer outputs 1, representing the best matching position.
  • The second minimum subunit 22 includes a second comparator, a second data multiplexer (mux), two second position multiplexers (mux), and an AND gate unit.
  • The second data multiplexer outputs the minimum initial pixel cost value among the output data of the two first minimum subunits,
  • and the two second position multiplexers output the best matching position value corresponding to that minimum initial pixel cost value.
  • The second comparator includes two input terminals and one output terminal,
  • the second data multiplexer includes three input terminals and one output terminal,
  • the two second position multiplexers each include three input terminals and one output terminal,
  • and the AND gate unit includes two input terminals and one output terminal.
  • The two inputs of the second comparator and two inputs of the second data multiplexer are jointly connected to the minimum initial pixel cost values output by the two first minimum subunits (min_first), such as c1 and c2; the third input of the second data multiplexer and the third inputs of the two second position multiplexers are connected to the output of the second comparator, and the outputs of the two second position multiplexers
  • are connected to the two inputs of the AND gate unit. The two inputs of one second position multiplexer are fed 0 and 1, and the two inputs of the other second
  • position multiplexer are fed p1 and p2, the position bits output by the two first minimum subunits.
  • If c1 ≤ c2, the second data multiplexer outputs c1, and the two second position multiplexers output p1 and 0 respectively, representing the best matching position; otherwise, if c1 > c2, the second data multiplexer outputs c2, and the two second position multiplexers output p2 and 1 respectively, representing the best matching position.
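The two-stage minimum tree described above can be sketched in software as follows (a hypothetical model: `min_first` and `min_second` mirror the subunit names, and the 1-bit positions concatenate into a 2-bit best-match index within the area):

```python
def min_first(c1, c2):
    """First-stage unit: comparator plus data/position multiplexers.
    Returns the smaller cost and a 1-bit position (0 -> c1, 1 -> c2)."""
    return (c1, 0) if c1 <= c2 else (c2, 1)

def min_second(a, b):
    """Second-stage unit: combines two (cost, position) pairs and
    prepends one more position bit, giving a 2-bit match position."""
    (ca, pa), (cb, pb) = a, b
    return (ca, (0 << 1) | pa) if ca <= cb else (cb, (1 << 1) | pb)

def area_unit(c1, c2, c3, c4):
    """Tree of two parallel min_first units feeding one min_second unit."""
    return min_second(min_first(c1, c2), min_first(c3, c4))
```

For example, `area_unit(7, 3, 9, 5)` selects cost 3 at in-area position 1, matching the described hardware behaviour of keeping both the minimum cost and its location.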
  • The initial pixel cost values are thus simplified by the partition optimization module 2: the number of initial pixel cost values passed into the multi-path cost aggregation module 3 is reduced without affecting accuracy, thereby reducing resource overhead and speeding up processing.
  • FIG. 6 is a structural block diagram of the multi-path cost aggregation module.
  • The multi-path cost aggregation module 3 (SGM module) calculates, for each minimum area cost value, the path cost along a plurality of preset path directions, obtaining
  • one path cost value per preset path direction, and aggregates all the path cost values of the preset path directions to obtain the energy function value of the corresponding area of the target pixel.
  • The multi-path cost aggregation module 3 includes several first-in first-out units 31 (FIFO units), which pass on the cost aggregation value of each area and assist in updating each area's cost, and several cost aggregation modules 32, which determine the path cost values of each area along the different preset path directions.
  • The number of FIFO units 31 is preferably 4, and the number of cost aggregation modules 32 is likewise preferably 4: cost aggregation module 1 (Agg1), cost aggregation module 2 (Agg2), cost aggregation module 3 (Agg3), and cost aggregation module 4 (Agg4).
  • Each FIFO unit 31 includes three transmission units and one direction FIFO unit; the number of direction FIFO units is the same as the number of cost aggregation modules 32, corresponding to cost aggregation in 4 directions.
  • Each transmission unit consists of 24 registers; therefore, the FIFO structure as a whole comprises direction FIFO units for the four directions 0°, 45°, 90°, and 135°, and 12 transmission units.
  • The four cost aggregation modules form a four-layer parallel two-stage pipeline structure: the first layer is the aggregation of the second cost aggregation module Agg2 into itself (Agg2 to Agg2), the second layer is from the second cost aggregation module Agg2 to the fourth cost aggregation module Agg4, the third layer is from the second cost aggregation module Agg2 to the third cost aggregation module Agg3, and the fourth layer is from the first cost aggregation module Agg1 to the fourth cost aggregation module Agg4.
  • Since the first line of each frame of the image only needs cost aggregation in the 0° direction, as shown in FIG. 6, the line position is tracked by a counter, and
  • the 24 area cost values are passed into the FIFO unit instead of Agg4.
  • Once the area cost value of the last pixel in the first line has been passed into Agg2,
  • the area cost values of the pixels in the second line begin to be passed into Agg4.
  • The output of Agg3 skips the first two transmission units in the FIFO unit and is passed directly into the direction FIFO unit of the same direction.
  • In this way, the cost aggregation values in the cost aggregation modules 32 are passed on in sequence, helping the cost aggregation modules 32 update the cost.
  • The addition aggregation module (Sum) 33 is composed of adders. After updating in the 4 directions, Agg1 contains 24*4 area aggregation values, also called path cost values. Taking one area as an example, the addition aggregation module 33 adds the path cost values of the area's four directions through the adders to obtain the total path cost value, and thus obtains the 24 energy function values of the 24 areas.
  • The depth calculation module 4 determines the disparity region corresponding to the smallest of the multiple energy function values and, based on the position of the area's best matching point and the disparity region, obtains the depth information.
  • the depth calculation module 4 includes a minimum energy search module 41, a shift module 42 and an addition module 43.
  • The minimum energy search module 41 finds the minimum energy value, that is, it determines the minimum among the multiple energy function values of each target pixel's areas.
  • The shift module 42 determines the disparity region of the minimum energy function of the corresponding area; a multiplication (×4) is simulated by a shift operation to form the disparity region corresponding to the minimum energy function. The addition module 43 uses
  • the position of the area's best matching point output by the partition optimization module 2 and the disparity region corresponding to the minimum energy function output by the shift module 42 to calculate the depth information of each target pixel.
  • The depth calculation module 4 is a three-stage pipeline structure: the calculation from the 24 aggregated energy values to the depth information is completed in three clock cycles, of which the first two find the minimum of the 24 energy values and the third completes the depth information calculation.
  • The minimum energy search module 41 is a two-stage pipeline implemented as a tree structure, where each clock cycle processes three comparison levels; the input is the 24 energy function values and the output is the minimum value.
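The tree-structured minimum search can be sketched as a pairwise reduction (an illustrative software model of the pipelined comparator tree, not the hardware itself; the hardware spreads the reduction levels over two clock cycles):

```python
def tree_min(values):
    """Tree-structured minimum search, as in the pipelined energy
    minimum module: reduce the list pairwise until one value remains."""
    vals = list(values)
    while len(vals) > 1:
        nxt = [min(vals[k], vals[k + 1]) for k in range(0, len(vals) - 1, 2)]
        if len(vals) % 2:           # odd element passes through unchanged
            nxt.append(vals[-1])
        vals = nxt
    return vals[0]
```

With 24 input energy values the reduction needs five comparator levels, which matches a two-stage pipeline processing about three levels per cycle.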
  • the present application also provides a method for obtaining depth information by a low-power stereo matching system.
  • FIG. 8 illustrates the method for obtaining depth information by the low-power stereo matching system provided by the present application.
  • S10: Collect a left image and a right image of the target, where the left image or the right image is used as the target image, and the target image includes several target pixels.
  • The left image and the right image of the same target are collected by a binocular camera.
  • the binocular camera simulates human eyes and consists of two monocular cameras, namely a left camera and a right camera. Each monocular camera shoots the same target to output an image, corresponding to the left image and the right image respectively. Therefore, the left image and the right image are images corresponding to the same target (eg, the same pixel block) at different angles.
  • The binocular camera shoots the same object target, and the resulting images serve as the target image and the non-target image;
  • for example, the right image may serve as the target image and the left image as the non-target image.
  • The left and right images are input to the low-power stereo matching system to obtain high-precision depth information for each target pixel.
  • A parallax threshold, that is, the parallax search range of the same target pixel, is preset.
  • For the same target pixel in the calibrated target image,
  • all the parallax pixels within the parallax threshold range of the target pixel in the corresponding neighborhood undergo matching cost computation and a series of calculations and optimizations to obtain the depth information of that target pixel.
  • S20: For each target pixel among the plurality of target pixels in the target image, determine the preset-parallax-threshold number of parallax pixels corresponding to the target pixel in the non-target image, and, based on the target pixel and all parallax pixels, determine the first
  • binary code stream of the target pixel and the second binary code stream of each parallax pixel; then perform the Hamming distance calculation between the first binary code stream and each second binary code stream to obtain the target pixel's multiple initial pixel cost values.
  • A preset parallax threshold is obtained and a target pixel in the target image is determined. In the image area of the target image, all the neighborhood pixels of the target pixel itself are determined; that is, for a target pixel 1 in the right image, the right sliding window 1221 searches the right image area for the other 8 neighborhood pixels of target pixel 1. The gray value of target pixel 1 and the gray values of all its neighborhood pixels are then obtained, and the gray value of the target pixel is compared with those of all its neighborhood pixels through the right comparison window 1231.
  • If the gray value of a neighborhood pixel is less than or equal to the gray value of the target pixel, 0 is output; if the gray value of a neighborhood pixel is greater than the gray value of the target pixel, 1 is output. The 8 comparison results are connected bitwise (that is, from top to bottom, left to right), and the first binary code stream of the target pixel (that is, the above-mentioned right binary code stream) is output.
  • Based on target pixel 1 and the preset parallax threshold, all parallax pixels within the preset parallax threshold area around the same target pixel position are searched in the image area of the non-target image; that is, in the image area of the left image, the 96 parallax pixels within the preset parallax threshold of the pixel at the same coordinates are found, so one target pixel 1 corresponds to 96 parallax pixels, i.e., to 96 left disparity windows 1121. Then, based on all the parallax pixels, with each parallax pixel as the center, the corresponding left disparity window 1121 searches the image area of the left image for the 8 neighborhood pixels of each parallax pixel.
  • Each parallax pixel is also used as a reference pixel, and the gray value of each reference pixel is compared with those of all its neighborhood pixels. If the gray value of a neighborhood pixel is less than or equal to the gray value of the reference pixel, 0 is output; if it is greater, 1 is output. The comparison results of each parallax pixel are then connected bitwise (that is, from top to bottom, left to right), and the second binary code stream of each parallax pixel is output, i.e., the second binary code streams of the target pixel, 96 in total (the above-mentioned left binary code streams).
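The neighbourhood comparison above can be sketched as a small software model (illustrative only; the bit order follows the top-to-bottom, left-to-right convention stated in the description, and `census_8bit` is a hypothetical name):

```python
def census_8bit(patch):
    """Census transform of a 3x3 patch: compare the centre grey value
    with its 8 neighbours, scanned top-to-bottom, left-to-right; a
    neighbour strictly greater than the centre contributes a 1 bit,
    otherwise a 0 bit."""
    centre = patch[1][1]
    bits = 0
    for r in range(3):
        for c in range(3):
            if r == 1 and c == 1:
                continue  # the centre pixel itself is not compared
            bits = (bits << 1) | (1 if patch[r][c] > centre else 0)
    return bits  # the 8-bit binary code stream
```

Applying this to the window around the target pixel yields the first binary code stream, and applying it to each of the 96 candidate windows yields the second binary code streams.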
  • The binary code stream conversion can be expressed by formula (1) and formula (2):

    Cs(u, v) = ⊗_{i ∈ [−n′, n′]} ⊗_{j ∈ [−m′, m′]} ξ( I(u, v), I(u + i, v + j) )    (1)

    ξ(x, y) = 0 if y ≤ x, and 1 if y > x    (2)

  • where (u, v) is the center pixel coordinate, ⊗ denotes the bitwise concatenation over the window indices i ∈ [−n′, n′] and j ∈ [−m′, m′], ξ(x, y) is the 0/1 comparison function, x and y are the two values to be compared, Cs(u, v) is the 8-bit bit string obtained after the census transformation at coordinates (u, v), and I(u, v) is the pixel value at coordinates (u, v).
  • The first binary code stream and the 96 second binary code streams are XORed respectively. If a bit of the first binary code stream differs from the corresponding bit of a second binary code stream, 1 is output; if the two bits are the same, 0 is output. The adder (not marked in the figure) then counts the number of bits equal to 1 in each of the 96 XOR results; each count is an initial pixel cost value, so 96 initial pixel cost values are output.
  • Formula (3) expresses this: the XOR operation is performed on the two bit strings, and the number of bits equal to 1 in the XOR result (the Hamming distance) is counted.
  • C(u, v, d) is the initial matching cost of the pixel (u, v) at disparity level d, d ∈ [0, disparityrange − 1], where disparityrange represents the disparity range.
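A minimal sketch of formula (3), assuming the census codes are held as 8-bit integers and the 96 left code streams stand in for the disparity search range (`initial_pixel_costs` is a hypothetical name):

```python
def hamming(a: int, b: int) -> int:
    """XOR the two census bit strings, then count the 1 bits
    (the positions where the two codes differ)."""
    return bin(a ^ b).count("1")

def initial_pixel_costs(right_code: int, left_codes: list) -> list:
    """C(u, v, d) for d in [0, disparityrange - 1]: one Hamming
    distance per candidate disparity (here disparityrange = 96)."""
    return [hamming(right_code, code) for code in left_codes]
```

One call per target pixel reproduces the 96 initial pixel cost values that the XOR-gate array and adder produce in hardware.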
  • S30: Determine several areas of the target pixel and the several initial pixel cost values of each area based on a preset area cost threshold, and determine, based on the several initial pixel cost values of each area, the minimum area cost value of each area and the position of the area's best matching point for each minimum area cost value.
  • the regional cost threshold is preset.
  • The area cost threshold is preferably 4; that is, each area is defined to include 4 initial pixel cost values. Taking one target pixel as an example, its 96 initial pixel cost values, with 4 initial pixel cost
  • values per area, can be divided into 24 area units. The output of the partition optimization module 2 is therefore 24 area cost values and 24 area best matching point positions. In this way, the amount of data input to the multi-path cost aggregation module 3 after optimization is reduced by a factor of at least 4, thereby reducing overall resource consumption.
  • Each area unit 20 processes 4 initial pixel cost values C1, C2, C3, C4; the area cost value is represented by the smallest initial pixel cost value in the area unit, and the position corresponding to that smallest value is retained: the position of the minimum initial pixel cost value becomes the best matching point of area unit i, that is, the position of the best matching pixel point.
  • The area unit i is expressed by formula (4) and formula (5) as:

    c′_i = min( c_{i−3}, c_{i−2}, c_{i−1}, c_i )    (4)

    p′_i = arg min( c_{i−3}, c_{i−2}, c_{i−1}, c_i )    (5)

  • where min is the minimum-value function,
  • c_{i−3}, c_{i−2}, c_{i−1}, c_i are the four initial pixel cost values passed in,
  • and p′_i is the position of the best matching pixel point in the area unit.
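Formulas (4) and (5) can be sketched together as follows (an illustrative model; `partition_optimize` is a hypothetical name, and ties resolve to the first minimum):

```python
def partition_optimize(costs, region_size=4):
    """Formulas (4) and (5): split the initial pixel cost values into
    regions of 4 and keep, per region, the minimum cost c'_i and the
    in-region position p'_i of the best matching point."""
    mins, positions = [], []
    for i in range(0, len(costs), region_size):
        region = costs[i:i + region_size]
        best = min(range(len(region)), key=region.__getitem__)
        mins.append(region[best])
        positions.append(best)
    return mins, positions
```

With 96 input costs this yields exactly the 24 area cost values and 24 best matching point positions that are forwarded to the multi-path cost aggregation module.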
  • the step S30 specifically includes:
  • The smallest of the multiple initial pixel cost values in each of the several areas is selected after two stages of minimum processing, which further improves the matching accuracy and ensures the accuracy of the data.
  • S40: The multi-path cost aggregation module 3 selects, for each one of the 24 area cost values optimized by the partition optimization module 2, the four directions 0°, 45°, 90°, and 135°,
  • performs the path cost calculation in each of these four directions, and obtains the path cost values in the four directions 0°, 45°, 90°, and 135°.
  • These path cost values are aggregated to obtain the energy function of this area.
  • The same method is used to calculate the multi-path cost of the other areas, and the 24 energy function values corresponding to the 24 areas are obtained, thereby determining the 24 energy function values of each target pixel.
  • Agg2 calculates the minimum of the aggregated values in the 0° direction, the minimum of the aggregated values in the 90° direction, and the minimum of the aggregated values in the 135° direction, while Agg1 calculates the minimum of the aggregated values in the 45° direction.
  • The minimum path cost value among the aggregated values in these four different directions enters the single-direction path cost calculation formula (6).
  • Formula (6) for a single direction is:

    L_r(p, i) = C(p, i) + min( L_r(p−r, i), L_r(p−r, i−1) + P1, L_r(p−r, i+1) + P1, min_j L_r(p−r, j) + P2 ) − min_j L_r(p−r, j)    (6)

  • where L_r(p, i) is the path cost value of pixel p in path direction r under disparity region i;
  • the first term C(p, i) is the area cost value of pixel p in disparity region i;
  • p−r is the previous pixel of pixel p along path direction r;
  • L_r(p−r, i) is the path cost value of pixel p−r under disparity region i in path direction r;
  • L_r(p−r, i−1) is the path cost value of pixel p−r under disparity region i−1 in path direction r;
  • L_r(p−r, i+1) is the path cost value of pixel p−r under disparity region i+1 in path direction r;
  • P1, P2 are the preset penalty coefficients;
  • min_j L_r(p−r, j) is the minimum path cost of point p−r over all disparity regions j on the path.
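One update step of formula (6) can be sketched as follows (an illustrative model; boundary disparity regions simply omit the out-of-range neighbour terms, and the previous minimum is subtracted, as in standard semi-global matching, to keep the values bounded):

```python
def path_cost_step(area_cost, prev_costs, P1, P2):
    """Formula (6): update L_r(p, i) for every disparity region i from
    the previous pixel's path costs along the same direction r.
    area_cost[i] is C(p, i); prev_costs[i] is L_r(p - r, i)."""
    n = len(prev_costs)
    prev_min = min(prev_costs)  # min_j L_r(p - r, j)
    out = []
    for i in range(n):
        candidates = [prev_costs[i]]
        if i > 0:
            candidates.append(prev_costs[i - 1] + P1)
        if i < n - 1:
            candidates.append(prev_costs[i + 1] + P1)
        candidates.append(prev_min + P2)
        out.append(area_cost[i] + min(candidates) - prev_min)
    return out
```

Running this once per pixel and per direction reproduces the per-direction cost update that the Agg modules perform in hardware.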
  • Agg2 in the first clock cycle corresponds to the previous pixel in the 0° direction for Agg2 in the second clock cycle;
  • Agg2 in the first clock cycle corresponds to
  • the previous pixel in the 90° direction for Agg4 in the second clock cycle; Agg2 in the first clock cycle corresponds to the previous pixel in the 135° direction for Agg3 in the second clock cycle;
  • and Agg1 in the first clock cycle corresponds to the previous pixel in the 45° direction for Agg4 in the second clock cycle.
  • Agg2 obtains the minimum of the 0°-direction aggregated values calculated by Agg2 in the previous clock cycle, together with all 24 of Agg2's 0°-direction aggregated values from the previous clock cycle, and updates the cost according to the path cost calculation formula above;
  • Agg4 obtains the minimum of the 90°-direction aggregated values calculated by Agg2 in the previous clock cycle together with all 24 of Agg2's 90°-direction aggregated values from that cycle, and obtains the minimum of the 45°-direction aggregated values calculated by Agg1 in the previous clock cycle together with all 24 of Agg1's 45°-direction aggregated values, and then updates the cost according to the path cost calculation formula.
  • Agg3 obtains the minimum of the 135°-direction aggregated values computed by Agg2 in the previous clock cycle, together with all 24 of Agg2's 135°-direction aggregated values from the previous clock cycle.
  • Through Agg1, Agg4 not only completes the cost update in the 45° direction but also comes to contain the updated cost values of all four directions, so that the cost updates of the four directions are completed within this one cost aggregation module.
  • The path cost values in the four directions of the area are accumulated by formula (7) to form the energy function of disparity region i for target pixel p, and finally the 24 energy function values are obtained:

    E(p, i) = Σ_r L_r(p, i)    (7)
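Formula (7) can be sketched as a per-region sum over the four directions (illustrative; `path_costs_by_direction` is a hypothetical name holding one list of 24 path cost values per direction):

```python
def energy(path_costs_by_direction):
    """Formula (7): E(p, i) = sum over the four directions r of
    L_r(p, i) -- one energy function value per disparity region i."""
    return [sum(vals) for vals in zip(*path_costs_by_direction)]
```

This is the software counterpart of the adder-based Sum module 33.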
  • The step S40 specifically includes:
  • S41: Select multiple preset path directions, where the preset path directions include 0°, 45°, 90°, and 135°;
  • S42: For each of the multiple areas, calculate the path cost of the area's minimum area cost value in the multiple preset path directions to obtain the minimum path cost value of the area in each preset path direction;
  • S43: Aggregate the multiple minimum path cost values of the area to obtain the energy function value of the area.
  • The depth calculation module 4 obtains the depth information of each target pixel through formula (8), in which the factor 4 is implemented by the shift module:

    d(p) = 4 × i + p′_i    (8)

  • where i is the disparity region corresponding to the minimum energy function and p′_i is the position of the area's best matching point in that region.
  • From the energy functions E(p, i) of the disparity regions of target pixel p, the disparity region i corresponding to the minimum energy function is found; from the fourth step,
  • the best matching position p′_i of disparity region i of pixel p is obtained, and the depth information of target pixel p is then computed.
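Formula (8) can be sketched as follows (an illustrative model; the ×4 mirrors the shift module and the 2-bit position p′_i comes from the partition optimization module; `depth_from_energies` is a hypothetical name):

```python
def depth_from_energies(energies, best_positions):
    """Formula (8): pick the disparity region i with the minimum energy,
    multiply by 4 (the shift module's simulated *4), and add the
    region's best matching point position p'_i to recover the disparity
    from which the depth information follows."""
    i = min(range(len(energies)), key=energies.__getitem__)
    return 4 * i + best_positions[i]
```

For instance, a minimum energy in region 1 with in-region position 2 recovers disparity 6, undoing the 4-to-1 reduction performed by the partition optimization module.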
  • the step S50 specifically includes:
  • the depth map of the target can be finally obtained.
  • FIGS. 12-14 illustrate an application scenario of the method for acquiring depth information by the low-power stereo matching system of the present application.
  • the present application provides a method for acquiring depth information in a low-power stereo matching system.
  • The method for acquiring depth information by the low-power stereo matching system includes the following steps: collecting a left image and a right image of a target, where the left image or the right image serves as the target image and the target image includes several target pixels; for each target pixel among the several target pixels in the target image, determining the preset-parallax-threshold number of parallax pixels corresponding to the target pixel in the non-target image, determining, based on the target pixel and all the parallax pixels, the first binary code stream of the target pixel and the second binary code stream of each parallax pixel, and performing the Hamming distance calculation between them to obtain the target pixel's multiple initial pixel cost values; determining the several areas of the target pixel, the minimum area cost value of each area, and the position of each area's best matching point; determining
  • the path cost value of each area in each preset path direction and aggregating the path cost values of the multiple preset path directions of the same area to obtain the energy function value of each area; and, based on the energy function values of all areas and the best matching point position of each area, determining the parallax region corresponding to the minimum energy function value among all energy function values and obtaining the depth information of the target pixel based on the best matching point position of the area where the minimum energy function value lies and that parallax region.
  • This application uses the Census algorithm as the initial pixel cost calculation function and adds a partition optimization module between the census module and the multi-path cost aggregation module to simplify the initial pixel cost values. Based on region-wise processing and the optimal parallax position, the number of initial pixel cost values passed into the multi-path cost aggregation module is reduced without affecting accuracy, thereby reducing the algorithm's run time and resource consumption while ensuring precision.


Abstract

A low-power stereo matching system and a method for obtaining depth information. The system comprises a census module (1), a partition optimization module (2), a multi-path cost aggregation module (3), and a depth calculation module (4) connected in sequence. The Census algorithm is used as the initial pixel cost calculation function, and the partition optimization module (2) is inserted between the census module (1) and the multi-path cost aggregation module (3) to simplify the initial pixel cost values. On the basis of region-wise processing and the optimal disparity position, the number of initial pixel cost values passed into the multi-path cost aggregation module (3) is reduced without affecting accuracy, thereby lowering the algorithm's run time and resource consumption while preserving precision.

Description

A Low-Power Stereo Matching System and Method for Obtaining Depth Information

Technical Field
The present application relates to the technical field of stereo vision, and in particular to a low-power stereo matching system and a method for obtaining depth information.
Background Art
Stereo matching is a major research direction in computer vision. It extracts and reconstructs the three-dimensional information of a target object from two-dimensional image information, and is widely used in intelligent robot systems, driverless vehicle systems, industrial measurement, and other fields. In a stereo vision system, the accuracy and speed of the stereo matching algorithm therefore directly affect the quality of three-dimensional reconstruction. The algorithm establishes an energy cost function to find one-to-one correspondences between pixels, and estimates pixel disparity values by minimizing this energy cost function. However, existing stereo matching test platforms are difficult to port to hardware because of the large amount of computation and resources required by complex high-precision algorithms.
Therefore, the prior art still needs improvement.
Summary of the Invention
In view of this, it is necessary to provide a low-power stereo matching system and a method for obtaining depth information that address the technical problems of high resource overhead and difficult porting in existing stereo matching.
To solve the above technical problems, the technical solution adopted by the present application is as follows:
In a first aspect, the present application provides a low-power stereo matching system, which includes a census module, a multi-path cost aggregation module, and a depth calculation module, and further includes a partition optimization module arranged between the census module and the multi-path cost aggregation module.
The census module is configured to receive a left image and a right image of a target and to output multiple initial pixel cost values for each target pixel. The census module includes a left census transcoding unit for converting the pixel stream of the left image, a right census transcoding unit for converting the pixel stream of the right image, and a Hamming distance calculation unit for determining the multiple initial pixel cost values of each target pixel; the left image or the right image serves as the target image, which includes several target pixels.
The partition optimization module includes multiple area units, each configured to determine the minimum area cost value that represents the area cost value and the position of the area's best matching point for that minimum area cost value. Each area unit includes two first minimum subunits arranged in parallel and a second minimum subunit commonly connected to the outputs of the two first minimum subunits.
The multi-path cost aggregation module includes several first-in first-out units for passing on the cost aggregation value of each area and assisting in updating each area's cost, several cost aggregation modules for determining the path cost values of each area in the different preset path directions and separately aggregating the multiple path cost values of the same area, and an addition aggregation module for determining the multiple energy function values of each area.
The depth calculation module includes an energy minimum search module for determining the minimum energy function value among the multiple energy function values of each area, a shift module for determining the disparity region of that minimum energy function in the corresponding area, and an addition module for computing the depth information of each target pixel based on each disparity region from the shift module and the corresponding area's best matching point position from the partition optimization module.
Further, the right census transcoding unit includes a right line buffer unit, a right census window unit, and a right census comparison unit, and the left census transcoding unit includes a left line buffer unit, a left census window unit, and a left census comparison unit. The right line buffer unit has the same structure as the left line buffer unit, while the right census window unit and the right census comparison unit differ in structure from the left census window unit and the left census comparison unit, respectively.
Further, when the left image serves as the target image and the right image as the non-target image, the left census window unit includes one left sliding window for traversing the left image, the sliding window containing several registers; the left census comparison unit includes one left comparison window containing several comparators. The right census window unit includes right disparity windows for traversing each disparity pixel within the preset disparity threshold area around the target pixel in the right image; the number of right disparity windows equals the preset disparity threshold, and each right disparity window contains several registers. The right census comparison unit includes a preset-disparity-threshold number of right comparison windows, each containing several comparators.
Alternatively, when the right image serves as the target image and the left image as the non-target image, the right census window unit includes one right sliding window for traversing the right image, containing several registers; the right census comparison unit includes one right comparison window containing several comparators. The left census window unit includes left disparity windows for traversing each disparity pixel within the preset disparity threshold area around the target pixel in the left image; the number of left disparity windows equals the preset disparity threshold, and each left disparity window contains several registers. The left census comparison unit includes a preset-disparity-threshold number of left comparison windows, each containing several comparators.
Further, each first minimum subunit includes a first comparator, a first data multiplexer, and a first position multiplexer; the first data multiplexer outputs the minimum of the input initial pixel cost values, and the first position multiplexer outputs the position information corresponding to that minimum initial pixel cost value. The second minimum subunit includes a second comparator, a second data multiplexer, two second position multiplexers, and an AND gate unit; the second data multiplexer obtains the minimum initial pixel cost value among the outputs of the two first minimum subunits, and the second position multiplexers output the best matching position information corresponding to that minimum initial pixel cost value.
Further, each first-in first-out unit includes three transmission units and one direction first-in first-out unit; the number of direction first-in first-out units is the same as the number of cost aggregation modules.
In a second aspect, the present application further provides a method for a low-power stereo matching system to obtain depth information, which includes the following steps:
collecting a left image and a right image of a target, where the left image or the right image serves as the target image and the target image includes several target pixels;
for each of the several target pixels in the target image, determining the preset-disparity-threshold number of disparity pixels corresponding to that target pixel in the non-target image, and, based on the target pixel and all the disparity pixels, determining the first binary code stream of the target pixel and the second binary code stream of each disparity pixel, then performing a Hamming distance calculation between the first binary code stream and each second binary code stream to obtain multiple initial pixel cost values for the target pixel;
based on a preset area cost threshold, determining several areas of the target pixel and the several initial pixel cost values of each area, and, based on the several initial pixel cost values of each of the areas, determining the minimum area cost value of each area and the position of the area's best matching point for each minimum area cost value;
based on multiple preset path directions for each area's minimum area cost value, determining the path cost value of each area in each preset path direction, and aggregating the path cost values in the multiple preset path directions of the same area to obtain the energy function value of each area;
based on the energy function values of all areas and the best matching point position of each area, determining the disparity region corresponding to the minimum energy function value among all energy function values, and obtaining the depth information of the target pixel based on the best matching point position of the area where the minimum energy function value lies and that disparity region.
进一步地,所述基于该目标像素及所有视差像素,确定该目标像素的第一二进制码流和每个视差像素的第二二进制码流,并将第一二进制码流分别与每个第二二进制码流进行汉明距离计算,得到该目标像素的多个初始像素代价值具体包括:
获取预设视差阈值并确定目标图像中的一目标像素;
在目标图像的图像区域确定该目标像素自身的所有邻域像素以及在非目标图像的图像区域搜索距离该相同目标像素一预设视差阈值区域内所有视差像素;
基于所有视差像素,确定每个视差像素自身的所有邻域像素;
比较目标像素与该目标像素自身对应的所有邻域像素的灰度值大小,输出该目标像素的第一二进制码流;
将每一个视差像素均作为参考像素,比较每个参考像素与每个参考像素对应的所有邻域像素的灰度值大小,输出该目标像素的多个第二二进制码流;
计算所述第一二进制码流与每个第二二进制码流的汉明距离,确定该目标像素的多个初始像素代价值。
进一步地,所述比较目标像素与该目标像素自身对应的所有邻域像素的灰度值大小,输出该目标像素的第一二进制码流具体包括:
当某邻域像素的灰度值小于或等于目标像素的灰度值,其比较结果为0,则输出0;
当某邻域像素的灰度值大于目标像素的灰度值,其比较结果为1,则输出1;
将所有比较结果按位输出,得到该目标像素的第一二进制码流。
进一步地,所述计算所述第一二进制码流与每个第二二进制码流的汉明距离,确定该目标像素的多个初始像素代价值具体包括:
将所述第一二进制码流并行与每个第二二进制码流进行异或运算,得到多个异或运算结果;
统计每个异或运算结果的比特位中为1的个数,得到该目标像素的多个初始像素代价值。
进一步地,所述确定该目标像素的若干区域以及每个区域的若干初始像素代价值,并基于若干区域中每个区域的若干初始像素代价值,确定每个区域的最小区域代价值以及每个所述最小区域代价值的区域最佳匹配点位置具体包括:
获取预设区域代价阈值;
基于所述预设区域代价阈值划分目标像素的多个初始像素代价值,确定该目标像素的若干区域以及对应区域的多个初始像素代价值;
选取若干区域中每个区域的多个初始像素代价值中值最小的初始代价值,作为对应区域的最小区域代价值;
基于对应区域的最小区域代价值,获取目标图像中所述最小区域代价值的区域最佳匹配点位置。
进一步地,所述基于每个区域的最小区域代价值的多个预设路径方向,确定每个区域对应预设路径方向上的路径代价值,并将同一区域的多个预设路径方向上对应的路径代价值进行聚合,得到每个区域的能量函数值具体包括:
选定多个预设路径方向;其中,所述预设路径方向包括0°、45°、90°以及135°;
针对多个区域中的每一个区域,计算该区域的最小区域代价值在多个预设路径方向上的路径代价,得到该区域对应预设路径方向上的最小路径代价值;
将该区域的多个最小路径代价值聚合处理,得到该区域的能量函数值。
进一步地,所述基于所有区域对应的能量函数值以及所有区域中每个区域对应的区域最佳匹配点位置,确定所有能量函数值中最小能量函数值对应的视差区域,并基于该最小能量函数值所在区域对应的区域最佳匹配点位置以及所述视差区域,得到该目标像素的深度信息具体包括:
获取所有区域对应的能量函数值;
选取出所有能量函数值中最小值,并对该最小能量函数值采用模拟乘法操作,确定该最小能量函数值对应的最优视差区域;
接收该最优视差区域对应的区域最佳匹配点位置;
基于该目标像素的区域最佳匹配点位置以及最优视差区域,得到该目标像素的深度信息。
有益效果:与现有技术相比,本申请提供了一种低功耗立体匹配系统及获取深度信息的方法,该系统包括依次连接的普查模块、分区优化模块、多路径代价聚合模块以及深度计算模块。本申请通过使用Census算法作为初始像素代价值计算函数,并在Census普查模块与多路径代价聚合模块之间加入分区优化模块简化初始像素代价值,基于分区域处理以及最优视差位置,在不影响精度的情况下减少传入多路径代价聚合模块的初始像素代价值数量,进而降低算法耗时以及资源消耗,并确保精度。
附图说明
图1为本申请提供的一种低功耗立体匹配系统的结构框图。
图2为本申请中低功耗立体匹配系统中普查模块的结构框图。
图3为本申请分区优化模块中区域单元的结构框图。
图4为本申请分区优化模块中第一最小子单元的结构框图。
图5为本申请分区优化模块中第二最小子单元的结构框图。
图6为本申请多路径代价聚合模块的结构框图。
图7为本申请目标像素四个方向示例图。
图8为本申请提供的一种低功耗立体匹配系统获取深度信息的方法的流程图。
图9为低功耗立体匹配系统获取深度信息的方法步骤S30的流程图。
图10为低功耗立体匹配系统获取深度信息的方法步骤S40的流程图。
图11为低功耗立体匹配系统获取深度信息的方法步骤S50的流程图。
图12为一应用场景一目标的左图像。
图13为一应用场景同一目标的右图像。
图14为一应用场景输出同一目标的深度图。
具体实施方式
本申请提供一种低功耗立体匹配系统及获取深度信息的方法,为使本申请的目的、技术方案及效果更加清楚、明确,以下参照附图并举实施例对本申请进一步详细说明。应当理解,此处所描述的具体实施例仅用以解释本申请,并不用于限定本申请。
本技术领域技术人员可以理解,除非特意声明,这里使用的单数形式“一”、“一个”、“所述”和“该”也可包括复数形式。应该进一步理解的是,本申请的说明书中使用的措辞“包括”是指存在所述特征、整数、步骤、操作、元件和/或组件,但是并不排除存在或添加一个或多个其他特征、整数、步骤、操作、元件、组件和/或它们的组。应该理解,当我们称元件被“连接”或“耦接”到另一元件时,它可以直接连接或耦接到其他元件,或者也可以存在中间元件。此外,这里使用的“连接”或“耦接”可以包括无线连接或无线耦接。这里使用的措辞“和/或”包括一个或更多个相关联的列出项的全部或任一单元和全部组合。
本技术领域技术人员可以理解,除非另外定义,这里使用的所有术语(包括技术术语和科学术语),具有与本申请所属领域中的普通技术人员的一般理解相同的意义。还应该理解的是,诸如通用字典中定义的那些术语,应该被理解为具有与现有技术的上下文中的意义一致的意义,并且除非像这里一样被特定定义,否则不会用理想化或过于正式的含义来解释。
该立体匹配系统的基本原理为:利用标定后不同位置的摄像头对同一物体目标进行拍摄,以得到不同角度的二维图像;利用同一空间点在二维图片中对应像素位置的不同以得到深度信息,由深度信息组成的图像称为深度图像。
经发明人研究表明,通用立体匹配算法按能量代价函数的不同可分为全局立体匹配算法、半全局立体匹配算法(SGM)以及局部立体匹配算法。其中,全局立体匹配算法主要采用全局优化理论方法估计视差,建立全局能量函数,通过最小化全局能量函数得到最优视差值,其结果比较精准,但计算量大,运行时间长。局部立体匹配算法主要采用局部优化方法进行视差值估计,同样通过能量最小化方法进行视差估计,其运行速度快,但能量函数中只有数据项而没有平滑项,导致精度较差。半全局立体匹配算法既用到了像素点本身以及其邻点的约束信息(局部立体匹配算法的特点),又使用了动态规划思想,由多个方向的一维平滑约束来模拟二维平滑约束(全局立体匹配算法的特点),再将各个一维路径上的数据合并,并引入随深度不同而不同的惩罚因子作为平滑项,从而在确保精度的情况下使算法对噪声有较强的鲁棒性。
目前立体匹配测试平台中大多算法都基于半全局立体匹配思想,但复杂高精度算法所需的计算量与资源消耗量较大,使其难以在硬件上移植,而现在众多应用均要求硬件系统具备实时性。因此,为了解决上述问题,本申请提供一种低功耗立体匹配系统及获取深度信息的方法,通过使用Census算法作为初始像素代价值计算函数,并在Census普查模块与多路径代价聚合模块之间加入分区优化模块简化初始像素代价值,基于分区域处理以及最优视差位置,在不影响精度的情况下减少传入多路径代价聚合模块的初始像素代价值数量,进而降低算法耗时以及资源消耗,并确保精度。
请参阅图1,图1是本申请提供的一种低功耗立体匹配系统的结构框图,如图1所示,所述低功耗立体匹配系统包括普查模块1、分区优化模块2、多路径代价聚合模块3以及深度计算模块4,这样,通过分区优化模块2简化普查模块1中初始像素代价值的匹配数量,从而减少传入多路径代价聚合模块3的初始像素代价值数量,并基于最优视差位置以及多路径代价聚合,最终得到每个目标像素高精度的深度信息。
相应地,所述普查模块1(又称Census模块)用于输入目标的左图像和右图像,输出目标像素的多个初始像素代价值。如图1所示,所述普查模块1包括用于转换左图像的像素流的左普查转码单元11、用于转换右图像的像素流的右普查转码单元12以及用于确定每个目标像素的多个初始像素代价值的汉明距离计算单元13;其中,以左图像或右图像作为目标图像,所述目标图像包括若干目标像素。
也就是说,通过左普查转码单元11和右普查转码单元12分别实现Census变换,即左普查转码单元11和右普查转码单元12分别将对应输入的左图像和右图像中像素点的灰度值分别编码成二进制码流,其中,左普查转码单元11将输入的左图像中像素点的灰度值编码成左像素流,右普查转码单元12将输入的右图像中像素点的灰度值编码成右像素流,以此来获取邻域像素灰度值相对于中心像素灰度值的大小关系,而通过汉明距离计算单元13实现每个目标像素的初始像素代价值的输出。
本申请实施例中,所述初始像素代价值指的是双目摄像头获取的两幅图像中像素块的相似程度。例如:预先设定一个视差搜索范围,若以标定后的右摄像头为基准,则在右摄像头获取的图像中依次选定一个像素点,并在标定后的左摄像头获取的图片中相同坐标像素点及其右边视差范围内的所有像素点进行匹配代价的计算,获取初始像素代价值。通常,初始像素代价值越低,说明右图像中像素点在所设定的视差下与左图像对应的像素点越匹配。
请参阅图2,图2是低功耗立体匹配系统中普查模块1的结构框图。如图2所示,所述右普查转码单元12包括右行缓存单元121、右普查窗口单元(图未标示)以及右普查对比单元(图未标示),所述左普查转码单元11包括左行缓存单元111、左普查窗口单元(图未标示)以及左普查对比单元(图未标示),其中,所述右行缓存单元121与所述左行缓存单元111的结构相同,所述右普查窗口单元(图未标示)和所述右普查对比单元(图未标示)分别与所述左普查窗口单元(图未标示)、所述左普查对比单元(图未标示)的结构不同。
所述右行缓存单元121、右普查窗口单元(图未标示)、右普查对比单元(图未标示)以及汉明距离计算单元13组成一个四级流水线结构,使得右图像中每个像素点在预设的视差搜索范围内的所有Census变换以及汉明距离值(也即初始像素代价值)的计算可在四个周期内完成。
进一步地,所述右行缓存单元121(linebuffer)用于依次缓存右图像像素流中连续两行像素值信息,其包括第一行缓存单元1211(linebuffer1)和第二行缓存单元1212(linebuffer2),所述第一行缓存单元1211和所述第二行缓存单元1212的深度值相同,且均由一个先进先出队列构成。这样,将所述右图像输入至右行缓存单元,通过所述右行缓存单元以像素流先进先出的传入方式依次缓存所述右图像中两行像素值信息。需要说明的是,每一个像素有一个灰度值,行缓存单元的深度取决于输入图像的列数。
例如:以输入右图像为例,假设右图像的像素为640*480,则一行有640个像素点,此时linebuffer1与linebuffer2的深度均为640;右图像输入以左上角为初始传入点,右下角为结束传入点,以像素流的方式一个时钟周期传入一个像素灰度值至Linebuffer1中,经过640个周期右图像第一行像素传输完毕且填满linebuffer1;到1280个周期时,右图像第二行像素值信息传输完毕,此时,linebuffer1中填充的是右图像第二行像素值信息,linebuffer2中填充的是右图像第一行像素值信息。
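上述行缓存的工作方式可用如下Python示意代码说明(仅为软件层面的草图,类名与接口均为假设,并非本申请的硬件实现):

```python
from collections import deque

class LineBuffer:
    """两级行缓存的示意实现:以先进先出方式缓存图像中连续两行像素值。"""

    def __init__(self, width):
        self.width = width
        self.line1 = deque(maxlen=width)  # linebuffer1:缓存较新的一行
        self.line2 = deque(maxlen=width)  # linebuffer2:缓存较旧的一行

    def push(self, pixel):
        # 新像素进入linebuffer1;linebuffer1满时,其最早传入的数据移入linebuffer2
        if len(self.line1) == self.width:
            self.line2.append(self.line1.popleft())
        self.line1.append(pixel)
```

例如,宽度为4的行缓存依次传入8个像素后,linebuffer2中缓存的是第一行,linebuffer1中缓存的是第二行。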
请继续参阅图2,标定右图像为目标图像,因此,如图2所示,所述右普查窗口单元(图未标示)包括一个用于遍历右图像的右滑动窗口1221,所述右普查窗口单元(图未标示)通过该右滑动窗口1221对整个右图像进行遍历操作,输出该右滑动窗口1221所包含的所有像素的灰度值。该右滑动窗口1221由若干个寄存器组成,该右滑动窗口1221的尺寸大小根据用户需要预先定义,在本申请实施例中,所述右滑动窗口1221较佳为3*3的矩形窗口,与比特流最小8位数吻合,开销最小。所述右普查窗口单元(图未标示)是一个三级流水线结构,因此,所述右滑动窗口1221在本申请实施例中较佳包括9个寄存器,形成3*3矩阵,每一层滑动窗口均包括3个相同的寄存器,第一层滑动窗口由寄存器1,寄存器2,寄存器3构成,第二层滑动窗口由寄存器4、寄存器5、寄存器6构成,第三层滑动窗口由寄存器7,寄存器8,寄存器9构成。需要说明的是,寄存器5用于处理该右滑动窗口1221对应的中心像素,而寄存器1,寄存器2,寄存器3,寄存器4,寄存器6,寄存器7,寄存器8,寄存器9用于处理该右滑动窗口1221的中心像素对应邻域内的8个像素。因此,这8个像素又称为该中心像素的邻域像素。中心像素即为目标像素。
请继续参阅图2,所述右普查对比单元(图未标示)用于将该右滑动窗口1221遍历右图像区域过程中每次的中心像素作为参考像素,并将该参考像素的灰度值与该右滑动窗口1221的该中心像素对应的邻域中每一个像素的灰度值进行比较,若该邻域像素的灰度值小于或等于参考像素的灰度值,则右普查对比单元(图未标示)输出0,若该邻域像素的灰度值大于参考像素的灰度值,则右普查对比单元(图未标示)输出1。所述右普查对比单元(图未标示)与右普查窗口单元(图未标示)相对应,其同样由三层级联而成,所述右普查对比单元(图未标示)包括一个右对比窗口1231,所述右对比窗口1231包括若干个比较器。所述右对比窗口1231在本申请实施例中较佳包括8个比较器,同样形成3*3矩阵,但第一层对比窗口和第三层对比窗口均包括3个相同的比较器,即第一层对比窗口由比较器1,比较器2,比较器3构成;第三层对比窗口由比较器7,比较器8,比较器9构成,但第二层对比窗口仅由2个比较器构成,即比较器4和比较器6。
进一步地,右滑动窗口1221的寄存器5的输出端共同连接8个比较器的输入端。而除右滑动窗口1221的寄存器5外的其余8个寄存器均与所述右对比窗口1231的8个比较器一一对应,即寄存器1的输出端连接比较器1的输入端,寄存器2的输出端连接比较器2的输入端,寄存器3的输出端连接比较器3的输入端,寄存器4的输出端连接比较器4的输入端,寄存器6的输出端连接比较器6的输入端,寄存器7的输出端连接比较器7的输入端,寄存器8的输出端连接比较器8的输入端,寄存器9的输出端连接比较器9的输入端。
接着将8个比较器的输出值从左到右、从上到下的顺序按位连接,通过右普查对比单元(图未标示)得到一个8bit的比特串,即由0和1组成的二进制码流(又称右二进制码流)。
这样,右图像的像素流通过所述右普查窗口单元(图未标示)获取每个目标像素的灰度值以及对应目标像素的邻域像素的灰度值,通过所述右普查对比单元(图未标示)对每个目标像素的灰度值与对应目标像素的邻域每个像素的灰度值进行大小比较,从而通过所述右普查对比单元按位顺序输出所述右图像每个目标像素对应的右二进制码流。
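上述Census变换过程可用如下Python示意代码概括(仅为软件层面的草图,函数名为假设,并非本申请的硬件实现;窗口固定为3*3,按从左到右、从上到下的顺序拼接比特):

```python
def census_3x3(img, u, v):
    """对以(u,v)为中心的3*3窗口做Census变换,返回8bit比特串:
    邻域灰度值小于或等于中心取0,大于中心取1。"""
    center = img[u][v]
    bits = ""
    for i in (-1, 0, 1):
        for j in (-1, 0, 1):
            if i == 0 and j == 0:
                continue  # 跳过中心像素本身,仅比较8个邻域像素
            bits += "1" if img[u + i][v + j] > center else "0"
    return bits
```

例如,对3*3图像[[1,2,3],[4,5,6],[7,8,9]]的中心像素调用该函数,得到比特串"00001111"。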
同样的,所述左普查转码单元11与所述右普查转码单元12采用同样的Census变换方法,即将左图像的图像像素转换为二进制码流。所述左普查转码单元11包括左行缓存单元111、左普查窗口单元(图未标示)以及左普查对比单元(图未标示),其中,所述左行缓存单元111与所述右行缓存单元121的结构相同,所述左普查窗口单元(图未标示)和所述左普查对比单元(图未标示)分别与所述右普查窗口单元(图未标示)、所述右普查对比单元(图未标示)的结构不同。
具体地,所述左行缓存单元111同样用于依次缓存左图像像素流中连续两行像素值信息,其包括第一行缓存单元1111和第二行缓存单元1112,所述左行缓存单元111与所述右行缓存单元121采用相同的方法,具体参照上述所述右行缓存单元121实现过程。
然而,所述左普查窗口单元(图未标示)包括用于遍历左图像中距离目标像素预设视差阈值区域内每个视差像素的左视差窗口1121,也就是说,在标定右图像为目标图像中,右图像中每一个目标像素都对应预设视差阈值个左视差窗口1121(又称Disparity Range窗口)。所述左视差窗口1121的个数为预设视差阈值。在本申请实施例中,所述预设视差阈值较佳为96个,可以获取该目标像素最优匹配,开销小。该预设视差阈值并非限定的,可根据用户需求自行设定,如48个,24个等。
举例:右图像中某目标像素1,对应一个右滑动窗口,一个右对比窗口,与此同时,搜索左图像的该目标像素1,确定距离该目标像素预设视差阈值区域(即邻域),则该邻域内具有预设视差阈值(96)个视差像素,每个视差像素都对应一个左视差窗口1121,因此,在本申请实施例中,当确定右图像中一个目标像素1,则左图像该目标像素1具有96个视差像素,即96个左视差窗口1121。
因此,本申请的左视差窗口1121并非只有一个,而是有预设视差阈值个窗口与之对应,本申请对其并发处理,不仅提高了搜索速率,而且进一步提高了目标像素的匹配精度。
进一步地,每个所述左视差窗口1121均包括若干个寄存器;在本申请实施例中,每个所述左视差窗口1121均包括9个寄存器,所述左普查对比单元(图未标示)包括预设视差阈值个左对比窗口1131,也就是说,左视差窗口1121的个数与左对比窗口1131的个数相同,均为预设视差阈值个,例如96个左对比窗口1131。每个所述左对比窗口1131均包括8个比较器,每个所述左对比窗口1131均与对应所述左视差窗口1121的寄存器对应:
例如:以一个左视差窗口1121和一个左对比窗口1131为例,左视差窗口1121的寄存器5的输出端共同连接8个比较器的输入端。而除左视差窗口1121的寄存器5外的其余8个寄存器均与所述左对比窗口1131的8个比较器一一对应,即寄存器1的输出端连接比较器1的输入端,寄存器2的输出端连接比较器2的输入端,寄存器3的输出端连接比较器3的输入端,寄存器4的输出端连接比较器4的输入端,寄存器6的输出端连接比较器6的输入端,寄存器7的输出端连接比较器7的输入端,寄存器8的输出端连接比较器8的输入端,寄存器9的输出端连接比较器9的输入端。接着将8个比较器的输出值按从左到右、从上到下的顺序按位连接,通过左普查对比单元(图未标示)得到一个8bit的比特串,即由0和1组成的二进制码流。这样,右图像中一个目标像素,输出一个右二进制码流,对应左图像中并发通过96个左视差窗口1121和96个左对比窗口1131输出96个左二进制码流。
请继续参阅图1和图2,所述汉明距离(Hamming Distance)计算单元13用于接收所述左普查转码单元11和所述右普查转码单元12分别输出的比特串,通过计算汉明距离的方式计算目标像素的一个右二进制码流与每个左二进制码流之间的初始像素代价值,从而通过该汉明距离计算单元13确定每个目标像素的多个初始像素代价值。在本申请实施例中,该汉明距离值即为初始像素代价值,其指的是两个比特串的对应位不相同(即一个为1,另一个为0)的数量。所述汉明距离计算单元13包括并列设置的多个异或门以及与所有异或门的输出端共接的加法器(图未标示)。在本申请实施例中,该异或门的个数较佳为8个,与对比窗口的8个比较器一一对应,依次记为异或门1、异或门2、异或门3、异或门4、异或门6、异或门7、异或门8、异或门9(无异或门5),每个异或门具有两个输入端和一个输出端,异或门的两个输入端分别连接左普查对比单元(图未标示)的一个比较器的输出端和右普查对比单元(图未标示)的对应比较器的输出端,每个异或门的输出端与加法器连接。也就是说,对于一个目标像素来说,一个右二进制码流和96个左二进制码流并行输入至各异或门的输入端,通过每个异或门进行异或运算后输出至所述加法器,从而输出一个目标像素的96个初始像素代价值。
为了便于理解上述获取每个目标像素的初始像素代价值的过程,下面以一具体实施例加以说明:
以输入右图为例,右摄像头读取到图像数据后,通过右图像像素流pixelstream依次传入linebuffer1(第一行缓存单元)和linebuffer2(第二行缓存单元),当linebuffer2充满以后,pixelstream在进入linebuffer1的同时,pixelstream、linebuffer1最右边的数据和linebuffer2最右边的数据分别进入Census窗口单元里stage1(状态1)的三个寄存器,并在接下来的每个时钟周期到来时依次向右传递;当Census窗口单元填满之后,下一个时钟周期到来时,Census窗口单元里除寄存器5外的8个寄存器的数据,分别传入Census对比单元里对应位置的8个比较器,同时寄存器5的数据传入Census对比单元里的8个比较器中进行比较,接着将Census对比单元里的8个比较器的比较结果,按照从左到右、从上到下的顺序进入Hamming计算单元,与左图普查对比单元113(Census)传来的96个结果分别进行异或操作,最终输出96个Hamming distance(汉明距离),即每个目标像素具有96个初始像素代价值。例如:对于右图像某一像素点,在左图像中将有若干个像素点与之对应,以模拟不同视差取值下对应的代价值,其代价值数量为预设的视差范围值,此代价值也被称为像素点的初始像素代价值。
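上述汉明距离的计算过程可用如下Python示意代码概括(仅为软件层面的草图,函数名为假设,并非本申请的硬件实现;假设普查码流以等长比特串表示):

```python
def hamming(code_a, code_b):
    """两个等长比特串的汉明距离:逐位异或后统计结果中1的个数。"""
    return sum(a != b for a, b in zip(code_a, code_b))

def initial_costs(target_code, disparity_codes):
    """目标像素的码流与视差范围内各视差像素的码流逐一求汉明距离,
    得到该目标像素的多个初始像素代价值(代价值越小匹配越好)。"""
    return [hamming(target_code, c) for c in disparity_codes]
```

例如,"00001111"与"00000000"的汉明距离为4;实际应用中disparity_codes包含预设视差阈值(如96)个码流,则输出96个初始像素代价值。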
请继续参阅图1,所述分区优化模块2用于将每个目标像素的多个初始像素代价值划分为多个区域,并对每个区域进行优化处理,得到每个区域的最小区域代价值以及对应最小区域代价值的区域最佳匹配点位置,其目的是减少所述普查模块输出的初始像素代价值的数据量。所述优化处理指的是将目标像素的多个初始像素代价值划分为多个区域,每个区域包含多个初始像素代价值,所述分区优化模块2包括多个用于确定能够代表区域代价值的最小区域代价值以及该最小区域代价值的区域最佳匹配点位置的区域单元20。
每个区域单元20使用树状结构处理区域单元包含的所有初始像素代价值C1~C4,请参阅图3,图3为每个区域单元的结构框图,如图3所示,每个所述区域单元20均包括并行设置的两个第一最小子单元21(min_first)和与两个第一最小子单元21的输出端共接的第二最小子单元22(min_second)。
请参阅图4和图5,图4为第一最小子单元21的结构框图,图5为第二最小子单元22的结构框图。如图4所示,所述第一最小子单元21均包括第一比较器、第一数据复用器mux以及第一位置复用器mux,所述第一数据复用器用于输出任意2个初始像素代价值中最小的初始像素代价值,所述第一位置复用器用于输出该最小的初始像素代价值对应的位置值(最佳匹配像素点位置,用1bit表示)。具体应用时,所述第一比较器包括两个输入端和一个输出端,所述第一数据复用器包括三个输入端和一个输出端,所述第一位置复用器包括三个输入端和一个输出端。如图3和图4所示,所述第一比较器的两个输入端和第一数据复用器的两个输入端共接任意2个初始像素代价值如C1、C2,所述第一数据复用器的第三个输入端和所述第一位置复用器的第三个输入端共接所述第一比较器的输出端,所述第一数据复用器的输出端和所述第一位置复用器的输出端共接第二最小子单元的输入端。
例如:若输入c1、c2,当c1<c2时,第一数据复用器输出c1,且第一位置复用器输出0,代表最佳匹配像素点位置;反之,若c1>c2,第一数据复用器输出c2,且第一位置复用器输出1,代表最佳匹配位置。
同样地如图5所示,所述第二最小子单元22包括一个第二比较器、一个第二数据复用器mux、两个第二位置复用器mux以及一个与门单元,所述第二数据复用器用于输出两个第一最小子单元的输出数据中最小初始像素代价值,2个第二位置复用器用于输出该最小初始像素代价值对应的最佳匹配位置值,用2bit表示。具体地,所述第二比较器包括3个输入端和1个输出端,所述第二数据复用器包括三个输入端和一个输出端,两个第二位置复用器均包括三个输入端和一个输出端,所述与门单元包括两个输入端和一个输出端。
具体实施时,所述第二比较器的两个输入端和所述第二数据复用器的两个输入端共接两个第一最小子单元min_first分别输出的最小初始像素代价值,如C1和C2;所述第二数据复用器的第三输入端、两个第二位置复用器的第三输入端共接所述第二比较器的输出端,两个第二位置复用器的输出端共接所述与门单元的两个输入端,其中一个第二位置复用器的两个输入端输入0和1,另一个第二位置复用器的两个输入端对应输入p1和p2。
例如:若输入c1、c2、p1、p2,当c1<c2时,第二数据复用器输出c1,两个第二位置复用器分别输出0与p1,拼接为2bit的最佳匹配位置;反之,若c1>c2,第二数据复用器输出c2,两个第二位置复用器分别输出1与p2,拼接为2bit的最佳匹配位置。
这样,基于分区域处理以及最优视差位置,通过分区优化模块2简化初始像素代价值,在不影响精度的情况下减少传入多路径代价聚合模块3的初始像素代价值数量,从而减少资源开销,加快处理速率。
请参阅图6,图6为多路径代价聚合模块的结构框图。如图1和图6所示,所述多路径代价聚合模块3(SGM模块)用于对每个最小区域代价值的多个预设路径方向进行路径代价计算,得到对应预设路径方向的多个路径代价值,并将对应预设路径方向的所有路径代价值进行聚合,得到目标像素的对应区域的能量函数值。所述多路径代价聚合模块3包括用于传递每个区域的代价聚合值并辅助每个区域代价更新的若干先进先出单元31(FIFO单元)、用于确定每个区域对应的不同预设路径方向上的路径代价值并分别聚合同一区域的多个路径代价值的若干代价聚合模块32以及用于确定每个区域的多个能量函数值的加法聚合模块33。
在本申请实施例中,所述先进先出单元31的个数较佳为4个,所述代价聚合模块32的个数较佳也为4个,分别为代价聚合模块1(Agg1)、代价聚合模块2(Agg2)、代价聚合模块3(Agg3)、代价聚合模块4(Agg4)。其中,每个所述先进先出单元31均包括三个传输单元以及一个方向先进先出单元,所述方向先进先出单元的个数与所述代价聚合模块32的个数相同,对应于4个方向上代价聚合。这4个方向为目标像素的0°、45°、90°以及135°,如图7所示。每个传输单元由24个寄存器构成,因此,FIFO单元由0°、45°、90°以及135°四个方向先进先出单元以及12个传输单元构成。
进一步地,如图6所示,四个代价聚合模块构成四层平行二级流水线结构:第一层为第二代价聚合模块agg2自身聚合,由第二代价聚合模块agg2到第二代价聚合模块agg2,第二层为第二代价聚合模块Agg2到第四代价聚合模块Agg4,第三层为第二代价聚合模块Agg2到第三代价聚合模块Agg3,第四层为第一代价聚合模块Agg1到第四代价聚合模块Agg4。
需要说明的是,由于每一帧图像的第一行只需要进行0°方向的代价聚合(如图6所示),因此通过一个计数器counter计数。当图像第一行到来时,24个区域代价值传入FIFO单元而不是Agg4,当第一行最后一个像素的区域代价值传入Agg2后,第二行的像素的区域代价值开始传入Agg4而不是FIFO单元,这是因为第二行以后的像素的区域代价值都需要进行4个方向的聚合;同时Agg3的输出跳过FIFO单元里的前两个传输单元而直接传入相同方向对应的第三个传输单元,这是因为当第二行像素的区域代价值从Agg4传入时,相当于替代了FIFO单元的两个深度,因此只需从第三个传输单元传入。通过FIFO单元,可将代价聚合模块32中的代价聚合值依次传递下去,帮助代价聚合模块32进行代价的更新。
请继续参阅图1和图6,所述加法聚合模块(Sum)33由加法器构成,Agg1中含有4个方向的更新后的24*4个区域聚合值,也称为路径代价值,以一个区域为例,加法聚合模块33通过加法器将区域的4个方向对应的路径代价值相加得到总路径代价值,继而得到24个区域的24个能量函数。
请继续参阅图1和图6,所述深度计算模块4用于确定多个能量函数值中最小能量函数值对应的视差区域,并基于所述区域最佳匹配点位置以及所述视差区域,得到深度信息。所述深度计算模块4包括能量最小查找模块41、移位模块42和加法模块43,所述能量最小查找模块41用于寻找最小能量值,即确定每个区域的多个能量函数值中最小能量函数值,所述移位模块42用于确定对应区域该最小能量函数的视差区域,即采用模拟乘法操作(*4),形成最小能量函数对应的视差区域,所述加法模块43用于基于分区优化模块2输出的区域最佳匹配点位置和移位模块42输出的最小能量函数对应的视差区域,计算出每个目标像素的深度信息。
所述深度计算模块4为三级流水线结构,即在三个时钟周期完成24个聚合能量值到深度信息的计算,其中前两个时钟周期完成24个能量值中最小值的寻找,第三个时钟周期完成深度信息计算。
能量最小查找模块41为两级流水线结构,以树状结构实现,其中每一个时钟周期处理三个对比阶层,输入为24个能量函数值,输出其中的最小值。
基于上述低功耗立体匹配系统,本申请还提供一种低功耗立体匹配系统获取深度信息的方法,请参阅图8,图8是本申请提供的低功耗立体匹配系统获取深度信息的方法的具体实施例的流程图。如图8所示,所述低功耗立体匹配系统获取深度信息的方法包括以下步骤:
S10、采集目标的左图像和右图像;其中,以左图像或右图像作为目标图像,所述目标图像包括若干目标像素。
在本申请实施例中,通过双目摄像头采集同一目标的左图像和右图像。该双目摄像头是模拟人的双眼,由两个单目摄像头即左摄像头和右摄像头组成,每个单目摄像头对同一目标拍摄以输出一图像,分别对应左图像和右图像。因此,所述左图像与所述右图像为同一目标(例如同一像素块)不同角度下对应的图像。在拍摄前,预先标定某一单目摄像头以使得其所拍摄的图像作为目标图像,如标定双目摄像头中的右单目摄像头作为目标摄像头,然后双目摄像头对同一物体目标进行拍摄,得到作为目标图像的右图像,作为非目标图像的左图像。继而将左右图像输入至低功耗立体匹配系统,得到每个目标像素高精度的深度信息。
由于同一物体目标下的左图像和右图像在同一空间点具有一距离视差,因此,预先设定一视差阈值(即同一目标像素的视差搜索范围),通过标定的目标图像中的目标像素与非目标图像中该目标像素对应视差阈值范围内的所有视差像素点进行匹配代价计算及一系列优化,即可得到该目标像素的深度信息。
S20、针对目标图像中若干目标像素中每一个目标像素,确定非目标图像中该目标像素对应的预设视差阈值个视差像素,并基于该目标像素及所有视差像素,确定该目标像素的第一二进制码流和每个视差像素的第二二进制码流,并将第一二进制码流分别与每个第二二进制码流进行汉明距离计算,得到该目标像素的多个初始像素代价值。
在本申请实施例中,获取预设视差阈值并确定目标图像中的一目标像素;在目标图像的图像区域,确定目标像素自身的所有邻域像素,即对右图像中一目标像素1,通过右滑动窗口1221搜索右图像区域中该目标像素1的其他8个邻域像素。然后获取目标像素1的灰度值以及该目标像素1的所有邻域像素的灰度值,通过右对比窗口1231比较目标像素与该目标像素自身对应的所有邻域像素的灰度值大小,若某邻域像素的灰度值小于或等于目标像素的灰度值,输出0,若某邻域像素的灰度值大于目标像素的灰度值,输出1,将8个比较结果按位(即从左到右、从上到下)连接,输出该目标像素的第一二进制码流(即上述右二进制码流)。
与此同时,基于该目标像素1以及预设视差阈值,在非目标图像的图像区域搜索距离该相同目标像素一预设视差阈值区域内的所有视差像素,即在左图像的图像区域中距离该相同坐标像素预设视差阈值内的96个视差像素,也就是说,一个目标像素1对应96个视差像素,即对应96个左视差窗口1121。然后基于所有视差像素,以每个视差像素为中心,通过对应的左视差窗口1121,在左图像的图像区域搜索每个视差像素自身的8个邻域像素。通过96个左对比窗口1131,同样将每一个视差像素均作为参考像素,比较每个参考像素与每个参考像素对应的所有邻域像素的灰度值大小,若某邻域像素的灰度值小于或等于参考像素的灰度值,输出0,若某邻域像素的灰度值大于参考像素的灰度值,输出1,然后将每个视差像素的比较结果按位(即从左到右、从上到下)连接,输出每个视差像素的第二二进制码流,也就是该目标像素的第二二进制码流,共96个第二二进制码流(即上述左二进制码流)。
二进制码流转换可用公式(1)和公式(2)表示:

$$\xi(x,y)=\begin{cases}0, & y\le x\\ 1, & y>x\end{cases}\qquad(1)$$

$$C_s(u,v)=\bigotimes_{i=-n'}^{n'}\bigotimes_{j=-m'}^{m'}\xi\big(I(u,v),\,I(u+i,v+j)\big),\quad (i,j)\ne(0,0)\qquad(2)$$

其中,(u,v)为中心像素坐标,⊗为位拼接符,将得到的0、1字符拼接成一个8bit比特串,其中,n′=1,m′=1,(u+i,v+j)为邻域像素坐标,i∈[-n′,n′],j∈[-m′,m′],ξ(x,y)为0、1比较公式,x、y为相比较的两个数值,C_s(u,v)为坐标(u,v)下普查变换后得到的8bit比特串,I(u,v)为坐标(u,v)下的像素值。
接着,计算所述第一二进制码流与96个第二二进制码流的汉明距离,确定该目标像素的96个初始像素代价值。即将所述第一二进制码流并行与96个第二二进制码流分别进行异或运算,若所述第一二进制码流的某位数据与第二二进制码流的对应位数据不同,则输出1,若相同,则输出0,然后通过加法器(图未标示)统计96个异或运算结果中比特位为1的个数,该个数值即为初始像素代价值,即输出96个初始像素代价值。
该异或运算和统计用公式(3)表示:

$$C(u,v,d)=\operatorname{count}\big(C_s(u,v)\oplus C_s^{d}(u,v)\big)\qquad(3)$$

公式(3)表示将两个比特串进行异或运算,再统计异或运算结果的比特位中为1的个数。其中,⊕为异或运算符,count为统计比特位中1的个数的函数,$C_s(u,v)$为目标图像中像素(u,v)经普查变换得到的比特串,$C_s^{d}(u,v)$为非目标图像中该像素在视差d下对应视差像素的比特串,C(u,v,d)为像素点(u,v)在视差d下的初始匹配代价,d∈[0,disparityrange-1],其中,disparityrange表示视差范围。
S30、基于预设区域代价阈值,确定该目标像素的若干区域以及每个区域的若干初始像素代价值,并基于若干区域中每个区域的若干初始像素代价值,确定每个区域的最小区域代价值以及每个所述最小区域代价值的区域最佳匹配点位置。
在本申请实施例中,预先设置区域代价阈值。所述区域代价阈值较佳为4,即定义每个区域包括4个初始像素代价值。以一个目标像素为例,该目标像素的96个初始像素代价值按每4个初始像素代价值为一区域,可划分为24个区域单元,因此,通过分区优化模块2输出24个区域代价值与24个区域最佳匹配点位置。这样,经所述分区优化模块2优化后输入至多路径代价聚合模块3的数据量将减少至少4倍,由此减少整体的资源消耗量。
每个区域单元20由4个初始像素代价值组成,用该区域单元内最小的初始像素代价值表示该区域代价值,并保留该最小的初始像素代价值对应的位置,该位置即为区域单元i内的最佳匹配像素点位置。该区域单元i用公式(4)和公式(5)表示为:

$$p'_i=\mathrm{position}\big(\min(c_i,c_{i-1},c_{i-2},c_{i-3})\big)\qquad(4)$$

$$c'_i=\min(c_i,c_{i-1},c_{i-2},c_{i-3})\qquad(5)$$

其中,min为寻找最小值函数,c_{i-3}、c_{i-2}、c_{i-1}、c_i分别代表传入的4个初始像素代价值,c′_i代表该区域单元的区域代价值,p′_i代表区域单元内最佳匹配像素点位置。
因此,如图9所示,所述步骤S30具体包括:
S31,获取预设区域代价阈值;
S32,基于所述预设区域代价阈值划分目标像素的多个初始像素代价值,确定该目标像素的若干区域以及对应区域的多个初始像素代价值;
S33,选取若干区域中每个区域的多个初始像素代价值中值最小的初始代价值,作为对应区域的最小区域代价值;
S34,基于对应区域的最小区域代价值,获取目标图像中所述最小区域代价值的区域最佳匹配点位置。
需要说明的是,本申请中选取若干区域中每个区域的多个初始像素代价值中值最小的初始代价值是经过二次最小处理,更进一步提高匹配精度,保证数据准确。
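上述分区优化过程(96个初始像素代价值按4个一组划分为24个区域,取每个区域的最小值及其区域内位置)可用如下Python示意代码概括(仅为软件层面的草图,函数名为假设,并非本申请的并行硬件实现):

```python
def partition_optimize(costs, region_size=4):
    """按预设区域代价阈值划分初始像素代价值,
    返回每个区域的最小区域代价值及该最小值的区域最佳匹配点位置。"""
    region_costs, best_positions = [], []
    for i in range(0, len(costs), region_size):
        region = costs[i:i + region_size]
        m = min(region)
        region_costs.append(m)
        best_positions.append(region.index(m))  # 区域内位置0~3,用2bit即可表示
    return region_costs, best_positions
```

例如,代价序列[5,2,7,9,1,1,3,0]划分为两个区域后,区域代价值为[2,0],区域最佳匹配点位置为[1,3]。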
S40、基于每个区域的最小区域代价值的多个预设路径方向,确定每个区域对应预设路径方向上的路径代价值,并将同一区域的多个预设路径方向上对应的路径代价值进行聚合,得到每个区域的能量函数值。
在本申请实施例中,所述多路径代价聚合模块3对分区优化模块2优化后的24个区域代价值,针对其中一个区域的一个区域代价值,选择0°、45°、90°以及135°这4个方向分别进行路径代价计算,得到0°、45°、90°以及135°这4个方向的路径代价值,将0°、45°、90°以及135°这4个方向的路径代价值进行聚合,得到该区域的能量函数,相同方法计算其他区域的多路径代价,得到24个区域对应的24个能量函数值,从而确定每个目标像素的24个能量函数值。
具体实施时,第一个时钟周期在Agg2中,分别计算0°方向聚合值的最小值、90°方向聚合值的最小值和135°方向聚合值中的最小值,同时Agg1计算45°方向聚合值的最小值。这四个不同方向聚合值中的最小路径代价值即为下述路径代价计算公式(6)中单个方向上的min_j L_r(p-r,j):

$$L_r(p,i)=C(p,i)+\min\Big(L_r(p-r,i),\;L_r(p-r,i-1)+P_1,\;L_r(p-r,i+1)+P_1,\;\min_j L_r(p-r,j)+P_2\Big)-\min_j L_r(p-r,j)\qquad(6)$$

其中,L_r(p,i)表示像素点p在路径方向r上视差区域i下的路径代价值,公式第一项C(p,i)为像素p在视差区域i时的区域代价值,p-r点为像素点p在路径方向r上的前一个像素点,L_r(p-r,i)表示像素点p-r在路径方向r上视差区域i下的路径代价值,L_r(p-r,i-1)表示像素点p-r在路径方向r上视差区域i-1下的路径代价值,L_r(p-r,i+1)表示像素点p-r在路径方向r上视差区域i+1下的路径代价值,P_1、P_2为预设的惩罚系数,min_j L_r(p-r,j)表示像素点p-r在路径方向r上任意视差区域j下的最小路径代价值。
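上述公式(6)的单方向路径代价更新可用如下Python示意代码表示(仅为软件层面的草图,函数与参数名为假设,并非本申请的硬件实现;当i位于视差区域边界、缺少i-1或i+1项时,按现有候选项取最小):

```python
def path_cost(C_p, L_prev, i, P1, P2):
    """像素p在单一路径方向r上、视差区域i下的路径代价L_r(p,i)。
    C_p为p在区域i的区域代价值C(p,i),
    L_prev为前一像素p-r在该方向上所有视差区域的路径代价列表。"""
    m = min(L_prev)                       # min_j L_r(p-r, j)
    candidates = [L_prev[i], m + P2]      # L_r(p-r,i) 与 min_j(...)+P2
    if i > 0:
        candidates.append(L_prev[i - 1] + P1)              # L_r(p-r,i-1)+P1
    if i < len(L_prev) - 1:
        candidates.append(L_prev[i + 1] + P1)              # L_r(p-r,i+1)+P1
    return C_p + min(candidates) - m      # 减去m防止路径代价随行程累积增大
```

该实现与公式(6)逐项对应:候选项分别为L_r(p-r,i)、L_r(p-r,i±1)+P1以及min_j L_r(p-r,j)+P2。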
需要说明的是,由于第一个时钟周期内的Agg2对应于第二个时钟周期内Agg2的0°方向上前一个pixel,第一个时钟周期内的Agg2对应于第二个时钟周期内Agg4的90°方向上前一个pixel,第一个时钟周期内的Agg2对应于第二个时钟周期内Agg3的135°方向上前一个pixel,第一个时钟周期内的Agg1对应于第二个时钟周期内Agg4的45°方向上前一个pixel,因此在第二个时钟周期内Agg2获得上一个时钟周期中Agg2计算得到的0°方向聚合值中的最小值,以及所有上个时钟周期中Agg2的0°方向24个聚合值,同时根据前文路径代价计算公式进行代价的更新;Agg4获得上一个时钟周期中Agg2计算的90°方向聚合值中的最小值,以及所有上个时钟周期中Agg2的90°方向24个聚合值,同时获得上一个时钟周期Agg1计算的45°方向聚合值中的最小值,以及所有上个时钟周期中Agg1的45°方向24个聚合值,再根据路径代价计算公式进行代价的更新;Agg3获得上一个时钟周期中Agg2计算的135°方向聚合值中的最小值,以及所有上个时钟周期中Agg2的135°方向24个聚合值。在第二个时钟周期以后,Agg1不仅使Agg4完成了45°方向的代价更新,同时含有了4个方向都更新过的代价值,从而在此代价聚合模块中,可完成四个方向的代价更新。
接着,针对24个区域中每一个区域,将该区域的4个方向路径代价值通过公式(7)进行累加,形成对于目标像素p在视差区域i下的能量函数:

$$E(p,i)=\sum_{r}L_r(p,i)\qquad(7)$$

最终得到24个能量函数值,即目标像素对应的24个能量函数值。
也就是说,如图10所示,所述步骤S40具体包括:
S41,选定多个预设路径方向;其中,所述预设路径方向包括0°、45°、90°以及135°;
S42,针对多个区域中的每一个区域,计算该区域的最小区域代价值在多个预设路径方向上的路径代价,得到该区域对应预设路径方向上的最小路径代价值;
S43,将该区域的多个最小路径代价值聚合处理,得到该区域的能量函数值。
S50、基于所有区域对应的能量函数值以及所有区域中每个区域对应的区域最佳匹配点位置,确定所有能量函数值中最小能量函数值对应的视差区域,并基于该最小能量函数值所在区域对应的区域最佳匹配点位置以及所述视差区域,得到该目标像素的深度信息。
在本申请实施例中,所述深度计算模块4通过公式(8)计算得到每个目标像素的深度信息:

$$d_{final}=4\times i+p'_i\qquad(8)$$

其中,i为最小能量函数值对应的视差区域,p′_i为该区域的区域最佳匹配点位置。

也就是说,对于目标像素p的每个视差区域的能量函数E(p,i),找出最小能量函数值对应的视差区域i,并获取该视差区域i的区域最佳匹配点位置p′_i,继而求出目标像素p的深度信息。
因此,如图11所示,所述步骤S50具体包括:
S51,获取所有区域对应的能量函数值;
S52,选取出所有能量函数值中最小值,并对该最小能量函数值采用模拟乘法操作,确定该最小能量函数值对应的最优视差区域;
S53,接收该最优视差区域对应的区域最佳匹配点位置;
S54,基于该目标像素的区域最佳匹配点位置以及最优视差区域,得到该目标像素的深度信息。
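上述步骤S51-S54可用如下Python示意代码概括(仅为软件层面的草图,函数名为假设;其中乘4的模拟乘法以左移两位实现,对应所述移位模块):

```python
def depth_from_energies(energies, best_positions):
    """由24个能量函数值与各区域的区域最佳匹配点位置求目标像素的深度信息:
    先找出最小能量函数值对应的最优视差区域i,再按公式(8)计算d_final。"""
    i = energies.index(min(energies))    # 最小能量函数值对应的最优视差区域
    return (i << 2) + best_positions[i]  # d_final = 4*i + p'_i(左移两位模拟*4)
```

例如,能量值[9,3,7]与对应位置[0,2,1]得到深度信息4*1+2=6。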
进一步地,基于深度信息以及目标图像,最终可获取该目标的深度图。
图12-图14示例了本申请该低功耗立体匹配系统获取深度信息的方法的一应用场景。
综上所述,本申请提供了一种低功耗立体匹配系统获取深度信息的方法。所述低功耗立体匹配系统获取深度信息的方法包括以下步骤:采集目标的左图像和右图像;其中,以左图像或右图像作为目标图像,所述目标图像包括若干目标像素;针对目标图像中若干目标像素中每一个目标像素,确定非目标图像中该目标像素对应的预设视差阈值个视差像素,并基于该目标像素及所有视差像素,确定该目标像素的第一二进制码流和每个视差像素的第二二进制码流,并将第一二进制码流分别与每个第二二进制码流进行汉明距离计算,得到该目标像素的多个初始像素代价值;基于预设区域代价阈值,确定该目标像素的若干区域以及每个区域的若干初始像素代价值,并基于若干区域中每个区域的若干初始像素代价值,确定每个区域的最小区域代价值以及每个所述最小区域代价值的区域最佳匹配点位置;基于每个区域的最小区域代价值的多个预设路径方向,确定每个区域对应预设路径方向上的路径代价值,并将同一区域的多个预设路径方向上对应的路径代价值进行聚合,得到每个区域的能量函数值;基于所有区域对应的能量函数值以及所有区域中每个区域对应的区域最佳匹配点位置,确定所有能量函数值中最小能量函数值对应的视差区域,并基于该最小能量函数值所在区域对应的区域最佳匹配点位置以及所述视差区域,得到该目标像素的深度信息。本申请通过使用Census算法作为初始像素代价值计算函数,并在Census普查模块与多路径代价聚合模块之间加入分区优化模块简化初始像素代价值,基于分区域处理以及最优视差位置,在不影响精度的情况下减少传入多路径代价聚合模块的初始像素代价值数量,进而降低算法耗时以及资源消耗,并确保精度。
最后应说明的是:以上实施例仅用以说明本申请的技术方案,而非对其限制;尽管参照前述实施例对本申请进行了详细的说明,本领域的普通技术人员应当理解:其依然可以对前述各实施例所记载的技术方案进行修改,或者对其中部分技术特征进行等同替换;而这些修改或者替换,并不使相应技术方案的本质脱离本申请各实施例技术方案的精神和范围。

Claims (12)

  1. 一种低功耗立体匹配系统,其包括普查模块、多路径代价聚合模块以及深度计算模块,其特征在于,还包括:设置在所述普查模块和所述多路径代价聚合模块之间的分区优化模块;
    所述普查模块用于输入目标的左图像和右图像,输出目标像素的多个初始像素代价值;所述普查模块包括用于转换左图像的像素流的左普查转码单元、用于转换右图像的像素流的右普查转码单元以及用于确定每个目标像素的多个初始像素代价值的汉明距离计算单元;其中,以左图像或右图像作为目标图像,所述目标图像包括若干目标像素;
    所述分区优化模块包括多个用于确定能够代表区域代价值的最小区域代价值以及该最小区域代价值的区域最佳匹配点位置的区域单元,每个所述区域单元均包括并行设置的两个第一最小子单元和与两个第一最小子单元的输出端共接的第二最小子单元;
    所述多路径代价聚合模块包括用于传递每个区域的代价聚合值并辅助每个区域代价更新的若干先进先出单元、用于确定每个区域对应的不同预设路径方向上的路径代价值并分别聚合同一区域的多个路径代价值的若干代价聚合模块以及用于确定每个区域的多个能量函数值的加法聚合模块;
    所述深度计算模块包括用于确定每个区域的多个能量函数值中最小能量函数值的能量最小查找模块、用于确定对应区域该最小能量函数的视差区域的移位模块以及用于基于所述移位模块的每个视差区域以及所述分区优化模块的对应区域的区域最佳匹配点位置所确定的每个目标像素的深度信息的加法模块。
  2. 根据权利要求1所述的低功耗立体匹配系统,其特征在于,所述右普查转码单元包括右行缓存单元、右普查窗口单元以及右普查对比单元,所述左普查转码单元包括左行缓存单元、左普查窗口单元以及左普查对比单元,其中,所述右行缓存单元与所述左行缓存单元的结构相同,所述右普查窗口单元和所述右普查对比单元分别与所述左普查窗口单元、所述左普查对比单元的结构不同。
  3. 根据权利要求2所述的低功耗立体匹配系统,其特征在于,当以左图像为目标图像,以右图像为非目标图像时,所述左普查窗口单元包括一个用于遍历左图像的左滑动窗口,该左滑动窗口包括若干个寄存器,所述左普查对比单元包括一个左对比窗口,所述左对比窗口包括若干个比较器;所述右普查窗口单元包括用于遍历右图像中距离目标像素预设视差阈值区域内每个视差像素的右视差窗口,所述右视差窗口的个数为预设视差阈值,每个所述右视差窗口均包括若干个寄存器;所述右普查对比单元包括预设视差阈值个右对比窗口,每个所述右对比窗口包括若干个比较器。
  4. 根据权利要求1所述的低功耗立体匹配系统,其特征在于,每个所述第一最小子单元均包括第一比较器、第一数据复用器以及第一位置复用器,所述第一数据复用器用于输出所输入的多个初始像素代价值中的最小初始像素代价值,所述第一位置复用器用于输出所述最小初始像素代价值对应的位置信息;所述第二最小子单元包括一个第二比较器、一个第二数据复用器、两个第二位置复用器以及一个与门单元,所述第二数据复用器用于获取两个所述第一最小子单元输出数据中最小初始像素代价值,所述第二位置复用器用于输出该最小初始像素代价值对应的最佳匹配位置信息。
  5. 根据权利要求1所述的低功耗立体匹配系统,其特征在于,每个所述先进先出单元均包括三个传输单元以及一个方向先进先出单元,所述方向先进先出单元的个数与所述代价聚合模块的个数相同。
  6. 一种如权利要求1-5任一项所述的低功耗立体匹配系统获取深度信息的方法,其特征在于,所述低功耗立体匹配系统获取深度信息的方法包括以下步骤:
    采集目标的左图像和右图像;其中,以左图像或右图像作为目标图像,所述目标图像包括若干目标像素;
    针对目标图像中若干目标像素中每一个目标像素,确定非目标图像中该目标像素对应的预设视差阈值个视差像素,并基于该目标像素及所有视差像素,确定该目标像素的第一二进制码流和每个视差像素的第二二进制码流,并将第一二进制码流分别与每个第二二进制码流进行汉明距离计算,得到该目标像素的多个初始像素代价值;
    基于预设区域代价阈值,确定该目标像素的若干区域以及每个区域的若干初始像素代价值,并基于若干区域中每个区域的若干初始像素代价值,确定每个区域的最小区域代价值以及每个所述最小区域代价值的区域最佳匹配点位置;
    基于每个区域的最小区域代价值的多个预设路径方向,确定每个区域对应预设路径方向上的路径代价值,并将同一区域的多个预设路径方向上对应的路径代价值进行聚合,得到每个区域的能量函数值;
    基于所有区域对应的能量函数值以及所有区域中每个区域对应的区域最佳匹配点位置,确定所有能量函数值中最小能量函数值对应的视差区域,并基于该最小能量函数值所在区域对应的区域最佳匹配点位置以及所述视差区域,得到该目标像素的深度信息。
  7. 根据权利要求6所述的低功耗立体匹配系统获取深度信息的方法,其特征在于,所述基于该目标像素及所有视差像素,确定该目标像素的第一二进制码流和每个视差像素的第二二进制码流,并将第一二进制码流分别与每个第二二进制码流进行汉明距离计算,得到该目标像素的多个初始像素代价值具体包括:
    获取预设视差阈值并确定目标图像中的一目标像素;
    在目标图像的图像区域确定该目标像素自身的所有邻域像素以及在非目标图像的图像区域搜索距离该相同目标像素一预设视差阈值区域内所有视差像素;
    基于所有视差像素,确定每个视差像素自身的所有邻域像素;
    比较目标像素与该目标像素自身对应的所有邻域像素的灰度值大小,输出该目标像素的第一二进制码流;
    将每一个视差像素均作为参考像素,比较每个参考像素与每个参考像素对应的所有邻域像素的灰度值大小,输出该目标像素的多个第二二进制码流;
    计算所述第一二进制码流与每个第二二进制码流的汉明距离,确定该目标像素的多个初始像素代价值。
  8. 根据权利要求7所述的低功耗立体匹配系统获取深度信息的方法,其特征在于,所述比较目标像素与该目标像素自身对应的所有邻域像素的灰度值大小,输出该目标像素的第一二进制码流具体包括:
    当某邻域像素的灰度值小于或等于目标像素的灰度值,其比较结果为0,则输出0;
    当某邻域像素的灰度值大于目标像素的灰度值,其比较结果为1,则输出1;
    将所有比较结果按位输出,得到该目标像素的第一二进制码流。
  9. 根据权利要求7或8所述的低功耗立体匹配系统获取深度信息的方法,其特征在于,所述计算所述第一二进制码流与每个第二二进制码流的汉明距离,确定该目标像素的多个初始像素代价值具体包括:
    将所述第一二进制码流并行与每个第二二进制码流进行异或运算,得到多个异或运算结果;
    统计每个异或运算结果的比特位中为1的个数,得到该目标像素的多个初始像素代价值。
  10. 根据权利要求6所述的低功耗立体匹配系统获取深度信息的方法,其特征在于,所述确定该目标像素的若干区域以及每个区域的若干初始像素代价值,并基于若干区域中每个区域的若干初始像素代价值,确定每个区域的最小区域代价值以及每个所述最小区域代价值的区域最佳匹配点位置具体包括:
    获取预设区域代价阈值;
    基于所述预设区域代价阈值划分目标像素的多个初始像素代价值,确定该目标像素的若干区域以及对应区域的多个初始像素代价值;
    选取若干区域中每个区域的多个初始像素代价值中值最小的初始代价值,作为对应区域的最小区域代价值;
    基于对应区域的最小区域代价值,获取目标图像中所述最小区域代价值的区域最佳匹配点位置。
  11. 根据权利要求6所述的低功耗立体匹配系统获取深度信息的方法,其特征在于,所述基于每个区域的最小区域代价值的多个预设路径方向,确定每个区域对应预设路径方向上的路径代价值,并将同一区域的多个预设路径方向上对应的路径代价值进行聚合,得到每个区域的能量函数值具体包括:
    选定多个预设路径方向;其中,所述预设路径方向包括0°、45°、90°以及135°;
    针对多个区域中的每一个区域,计算该区域的最小区域代价值在多个预设路径方向上的路径代价,得到该区域对应预设路径方向上的最小路径代价值;
    将该区域的多个最小路径代价值聚合处理,得到该区域的能量函数值。
  12. 根据权利要求6所述的低功耗立体匹配系统获取深度信息的方法,其特征在于,所述基于所有区域对应的能量函数值以及所有区域中每个区域对应的区域最佳匹配点位置,确定所有能量函数值中最小能量函数值对应的视差区域,并基于该最小能量函数值所在区域对应的区域最佳匹配点位置以及所述视差区域,得到该目标像素的深度信息具体包括:
    获取所有区域对应的能量函数值;
    选取出所有能量函数值中最小值,并对该最小能量函数值采用模拟乘法操作,确定该最小能量函数值对应的最优视差区域;
    接收该最优视差区域对应的区域最佳匹配点位置;
    基于该目标像素的区域最佳匹配点位置以及最优视差区域,得到该目标像素的深度信息。
PCT/CN2021/083603 2020-07-31 2021-03-29 一种低功耗立体匹配系统及获取深度信息的方法 WO2022021912A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010762904.3 2020-07-31
CN202010762904.3A CN112070821B (zh) 2020-07-31 2020-07-31 一种低功耗立体匹配系统及获取深度信息的方法

Publications (1)

Publication Number Publication Date
WO2022021912A1 true WO2022021912A1 (zh) 2022-02-03

Family

ID=73657313

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/083603 WO2022021912A1 (zh) 2020-07-31 2021-03-29 一种低功耗立体匹配系统及获取深度信息的方法

Country Status (2)

Country Link
CN (1) CN112070821B (zh)
WO (1) WO2022021912A1 (zh)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114677261A (zh) * 2022-05-27 2022-06-28 绍兴埃瓦科技有限公司 一种视差处理电路和视差处理系统及其方法
CN114723967A (zh) * 2022-03-10 2022-07-08 北京的卢深视科技有限公司 视差图优化方法、人脸识别方法、装置、设备及存储介质
CN115100153A (zh) * 2022-06-29 2022-09-23 武汉工程大学 基于双目匹配的管内检测方法、装置、电子设备及介质
CN116129037A (zh) * 2022-12-13 2023-05-16 珠海视熙科技有限公司 视触觉传感器及其三维重建方法、系统、设备及存储介质
CN116228601A (zh) * 2023-05-08 2023-06-06 山东中都机器有限公司 一种火车双向平煤的平煤效果视觉监控方法
CN116958134A (zh) * 2023-09-19 2023-10-27 青岛伟东包装有限公司 基于图像处理的塑料膜挤出质量评估方法

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112070821B (zh) * 2020-07-31 2023-07-25 南方科技大学 一种低功耗立体匹配系统及获取深度信息的方法
CN113329219B (zh) * 2021-05-07 2022-06-14 华南理工大学 多输出参数可动态配置深度相机
CN113436057B (zh) * 2021-08-27 2021-11-19 绍兴埃瓦科技有限公司 数据处理方法及双目立体匹配方法

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108460792A (zh) * 2016-12-12 2018-08-28 南京理工大学 一种基于图像分割的高效聚焦立体匹配方法
CN109255811A (zh) * 2018-07-18 2019-01-22 南京航空航天大学 一种基于可信度图视差优化的立体匹配方法
CN110310320A (zh) * 2019-07-09 2019-10-08 南京美基森信息技术有限公司 一种双目视觉匹配代价聚合优化方法
CN110473217A (zh) * 2019-07-25 2019-11-19 沈阳工业大学 一种基于Census变换的双目立体匹配方法
US10554947B1 (en) * 2015-12-16 2020-02-04 Marvell International Ltd. Method and apparatus for stereo vision matching including disparity refinement based on matching merit values
CN112070821A (zh) * 2020-07-31 2020-12-11 南方科技大学 一种低功耗立体匹配系统及获取深度信息的方法

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104427324A (zh) * 2013-09-02 2015-03-18 联咏科技股份有限公司 视差计算方法及其立体匹配装置
US10818025B2 (en) * 2017-01-26 2020-10-27 Samsung Electronics Co., Ltd. Stereo matching method and apparatus
CN107220997B (zh) * 2017-05-22 2020-12-25 成都通甲优博科技有限责任公司 一种立体匹配方法及系统
CN109743562B (zh) * 2019-01-10 2020-12-25 中国科学技术大学 基于Census算法的匹配代价计算电路结构及其工作方法

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10554947B1 (en) * 2015-12-16 2020-02-04 Marvell International Ltd. Method and apparatus for stereo vision matching including disparity refinement based on matching merit values
CN108460792A (zh) * 2016-12-12 2018-08-28 南京理工大学 一种基于图像分割的高效聚焦立体匹配方法
CN109255811A (zh) * 2018-07-18 2019-01-22 南京航空航天大学 一种基于可信度图视差优化的立体匹配方法
CN110310320A (zh) * 2019-07-09 2019-10-08 南京美基森信息技术有限公司 一种双目视觉匹配代价聚合优化方法
CN110473217A (zh) * 2019-07-25 2019-11-19 沈阳工业大学 一种基于Census变换的双目立体匹配方法
CN112070821A (zh) * 2020-07-31 2020-12-11 南方科技大学 一种低功耗立体匹配系统及获取深度信息的方法

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114723967A (zh) * 2022-03-10 2022-07-08 北京的卢深视科技有限公司 视差图优化方法、人脸识别方法、装置、设备及存储介质
CN114723967B (zh) * 2022-03-10 2023-01-31 合肥的卢深视科技有限公司 视差图优化方法、人脸识别方法、装置、设备及存储介质
CN114677261A (zh) * 2022-05-27 2022-06-28 绍兴埃瓦科技有限公司 一种视差处理电路和视差处理系统及其方法
CN114677261B (zh) * 2022-05-27 2022-08-26 绍兴埃瓦科技有限公司 一种视差处理电路和视差处理系统及其方法
CN115100153A (zh) * 2022-06-29 2022-09-23 武汉工程大学 基于双目匹配的管内检测方法、装置、电子设备及介质
CN116129037A (zh) * 2022-12-13 2023-05-16 珠海视熙科技有限公司 视触觉传感器及其三维重建方法、系统、设备及存储介质
CN116129037B (zh) * 2022-12-13 2023-10-31 珠海视熙科技有限公司 视触觉传感器及其三维重建方法、系统、设备及存储介质
CN116228601A (zh) * 2023-05-08 2023-06-06 山东中都机器有限公司 一种火车双向平煤的平煤效果视觉监控方法
CN116228601B (zh) * 2023-05-08 2023-07-21 山东中都机器有限公司 一种火车双向平煤的平煤效果视觉监控方法
CN116958134A (zh) * 2023-09-19 2023-10-27 青岛伟东包装有限公司 基于图像处理的塑料膜挤出质量评估方法
CN116958134B (zh) * 2023-09-19 2023-12-19 青岛伟东包装有限公司 基于图像处理的塑料膜挤出质量评估方法

Also Published As

Publication number Publication date
CN112070821A (zh) 2020-12-11
CN112070821B (zh) 2023-07-25

Similar Documents

Publication Publication Date Title
WO2022021912A1 (zh) 一种低功耗立体匹配系统及获取深度信息的方法
US11954879B2 (en) Methods, systems and apparatus to optimize pipeline execution
CN109800692B (zh) 一种基于预训练卷积神经网络的视觉slam回环检测方法
EP3872764A1 (en) Method and apparatus for constructing map
CN101625768A (zh) 一种基于立体视觉的三维人脸重建方法
WO2021051526A1 (zh) 多视图3d人体姿态估计方法及相关装置
CN110243390B (zh) 位姿的确定方法、装置及里程计
CN110517309A (zh) 一种基于卷积神经网络的单目深度信息获取方法
US20190164296A1 (en) Systems and methods for determining a confidence measure for a motion vector
CN104240217B (zh) 双目摄像头图像深度信息获取方法及装置
CN112465704B (zh) 一种全局-局部自适应优化的全景光场拼接方法
CN106952304A (zh) 一种利用视频序列帧间相关性的深度图像计算方法
US20090315976A1 (en) Message propagation- based stereo image matching system
CN111553296B (zh) 一种基于fpga实现的二值神经网络立体视觉匹配方法
CN107220932B (zh) 基于词袋模型的全景图像拼接方法
CN110428461B (zh) 结合深度学习的单目slam方法及装置
Niu et al. Boundary-aware RGBD salient object detection with cross-modal feature sampling
CN115695763A (zh) 一种三维扫描系统
CN214587004U (zh) 一种立体匹配加速电路、图像处理器及三维成像电子设备
CN110782480A (zh) 一种基于在线模板预测的红外行人跟踪方法
CN112399162A (zh) 一种白平衡校正方法、装置、设备和存储介质
Zhou et al. Effective dual-feature fusion network for transmission line detection
CN114071015A (zh) 一种联动抓拍路径的确定方法、装置、介质及设备
Ding et al. Improved real-time correlation-based FPGA stereo vision system
CN116704432A (zh) 基于分布不确定性的多模态特征迁移人群计数方法及装置

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21851562

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21851562

Country of ref document: EP

Kind code of ref document: A1

122 Ep: pct application non-entry in european phase

Ref document number: 21851562

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 31.08.2023)

122 Ep: pct application non-entry in european phase

Ref document number: 21851562

Country of ref document: EP

Kind code of ref document: A1