WO2022021912A1 - A low-power stereo matching system and method for acquiring depth information - Google Patents
A low-power stereo matching system and method for acquiring depth information
- Publication number
- WO2022021912A1 (PCT/CN2021/083603)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- area
- pixel
- cost
- value
- target
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/50—Depth or shape recovery
- G06T7/55—Depth or shape recovery from multiple images
- G06T7/593—Depth or shape recovery from multiple images from stereo images
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/30—Determination of transform parameters for the alignment of images, i.e. image registration
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D30/00—Reducing energy consumption in communication networks
- Y02D30/70—Reducing energy consumption in communication networks in wireless communication networks
Definitions
- the present application relates to the technical field of stereo vision, and in particular, to a low-power stereo matching system and a method for acquiring depth information.
- Stereo matching is a main research direction in the field of computer vision. It extracts and reconstructs the three-dimensional information of a target object from two-dimensional images, and is widely used in intelligent robot systems, unmanned vehicle systems, industrial measurement and other fields. In a stereo vision system, the accuracy and speed of the stereo matching algorithm therefore directly affect the quality of 3D reconstruction. Stereo matching finds a one-to-one correspondence between pixels by establishing an energy cost function, and estimates the disparity value of each pixel by minimizing that energy cost function. However, on current stereo matching test platforms, complex high-precision algorithms are often difficult to port to hardware because of the large amount of computation and resources they require.
- the present application provides a low-power stereo matching system, which includes a census module, a multi-path cost aggregation module, and a depth calculation module, and further includes a partition optimization module disposed between the census module and the multi-path cost aggregation module;
- the census module is used to receive the left image and the right image of the target as input and to output multiple initial pixel cost values of the target pixels;
- the census module includes a left census transcoding unit for converting the pixel stream of the left image, a right census transcoding unit for converting the pixel stream of the right image, and a Hamming distance calculation unit for determining a plurality of initial pixel cost values of each target pixel; wherein the left image or the right image is used as the target image, and the target image includes several target pixels;
- the partition optimization module includes a plurality of area units for determining the minimum area cost value representing each area and the position of the best matching point in the area corresponding to the minimum area cost value, and each of the area units includes two first minimum subunits arranged in parallel and a second minimum subunit commonly connected to the output ends of the two first minimum subunits;
- the multi-path cost aggregation module includes several first-in-first-out units for transmitting the cost aggregation value of each area and assisting in updating the cost of each area, several cost aggregation modules for determining the path cost values in the different preset path directions corresponding to each area, and an addition aggregation module for aggregating the multiple path cost values of the same area to determine the multiple energy function values of each area;
- the depth calculation module includes an energy minimum search module for determining the minimum energy function value among the plurality of energy function values of each region, a shift module for determining the disparity region of the minimum energy function value of the corresponding region, and an addition module for determining the depth information of each target pixel from the disparity region output by the shift module and the position of the region best matching point output by the partition optimization module for the corresponding region.
- the right census transcoding unit includes a right row cache unit, a right census window unit and a right census comparison unit;
- the left census transcoding unit includes a left row cache unit, a left census window unit and a left census comparison unit, wherein the structure of the right row cache unit is the same as that of the left row cache unit, and the structures of the right census window unit and the right census comparison unit are respectively different from those of the left census window unit and the left census comparison unit.
- when the left image is taken as the target image and the right image as the non-target image, the left census window unit includes a left sliding window for traversing the left image, the left sliding window includes several registers, the left census comparison unit includes a left comparison window, and the left comparison window includes several comparators;
- the right census window unit includes right disparity windows for traversing each disparity pixel within a preset disparity threshold area from the target pixel in the right image, the number of the right disparity windows is the preset disparity threshold, and each of the right disparity windows includes several registers;
- the right census comparison unit includes a preset-disparity-threshold number of right comparison windows, and each of the right comparison windows includes several comparators;
- when the right image is taken as the target image and the left image as the non-target image, the right census window unit includes a right sliding window for traversing the right image, the right sliding window includes several registers, the right census comparison unit includes a right comparison window, and the right comparison window includes several comparators;
- the left census window unit includes left disparity windows for traversing each disparity pixel within the preset disparity threshold area from the target pixel in the left image, the number of the left disparity windows is the preset disparity threshold, and each of the left disparity windows includes several registers;
- the left census comparison unit includes a preset-disparity-threshold number of left comparison windows, and each of the left comparison windows includes several comparators.
- each of the first minimum subunits includes a first comparator, a first data multiplexer and a first position multiplexer; the first data multiplexer is used to output the minimum initial pixel cost value among a plurality of inputted initial pixel cost values, and the first position multiplexer is used to output the position information corresponding to the minimum initial pixel cost value;
- the second minimum subunit includes a second comparator, a second data multiplexer, two second position multiplexers and an AND gate unit; the second data multiplexer is used to obtain the minimum initial pixel cost value in the output data of the two first minimum subunits, and the two second position multiplexers are used to output the best matching position information corresponding to the minimum initial pixel cost value.
- each of the FIFO units includes three transmission units and one direction FIFO unit, and the number of the direction FIFO units is the same as the number of the cost aggregation modules.
- the present application also provides a method for obtaining depth information by a low-power stereo matching system, the method comprising the following steps:
- the left image or the right image is used as the target image, and the target image includes several target pixels;
- determining the first binary code stream of the target pixel and the second binary code stream of each disparity pixel, and performing the Hamming distance calculation on the first binary code stream and each second binary code stream respectively, so as to obtain multiple initial pixel cost values;
- determining, based on the target pixel and all parallax pixels, the first binary code stream of the target pixel and the second binary code stream of each parallax pixel, and performing the Hamming distance calculation on the first binary code stream and each second binary code stream to obtain a plurality of initial pixel cost values of the target pixel, which specifically includes:
- comparing the grayscale value of the target pixel with the grayscale values of all neighboring pixels corresponding to the target pixel, and outputting the first binary code stream of the target pixel, which specifically includes:
- calculating the Hamming distance between the first binary code stream and each second binary code stream, and determining the multiple initial pixel cost values of the target pixel, which specifically includes:
- the location of the best regional matching point of the minimum regional cost value specifically includes:
- the position of the best matching point in the area of the minimum area cost value in the target image is obtained.
- determining, based on the minimum area cost value of each area and multiple preset path directions, the path cost value in each preset path direction corresponding to each area, aggregating the path cost values of the multiple preset path directions of the same area, and obtaining the energy function value of each area, which specifically includes:
- the preset path directions include 0°, 45°, 90° and 135°;
- determining, based on the energy function values corresponding to all the regions and the region best matching point position corresponding to each region, the disparity region corresponding to the minimum energy function value among all the energy function values, and obtaining the depth information of the target pixel based on that disparity region and the position of the best matching point in the region where the minimum energy function value is located, which specifically includes:
- the depth information of the target pixel is obtained based on the position of the best matching point in the target pixel's area and the optimal parallax area.
- the present application provides a low-power stereo matching system and a method for acquiring depth information
- the system includes a census module, a partition optimization module, a multi-path cost aggregation module and a depth calculation module connected in sequence.
- This application uses the Census algorithm as the initial pixel cost calculation function and adds a partition optimization module between the census module and the multi-path cost aggregation module to simplify the initial pixel cost values. Based on sub-region processing and the optimal parallax position, the number of initial pixel cost values passed into the multi-path cost aggregation module is reduced without affecting accuracy, thereby reducing algorithm time and resource consumption while maintaining accuracy.
- FIG. 1 is a structural block diagram of a low power consumption stereo matching system provided by the present application.
- FIG. 2 is a structural block diagram of a census module in the low power consumption stereo matching system of the present application.
- FIG. 3 is a structural block diagram of a region unit in the partition optimization module of the present application.
- FIG. 4 is a structural block diagram of the first minimum sub-unit in the partition optimization module of the present application.
- FIG. 5 is a structural block diagram of the second smallest sub-unit in the partition optimization module of the present application.
- FIG. 6 is a structural block diagram of a multi-path cost aggregation module of the present application.
- FIG. 7 is an example diagram of four directions of a target pixel of the present application.
- FIG. 8 is a flowchart of a method for acquiring depth information in a low-power stereo matching system provided by the present application.
- FIG. 9 is a flowchart of step S30 of a method for acquiring depth information by a low-power stereo matching system.
- FIG. 10 is a flowchart of step S40 of the method for acquiring depth information by the low-power stereo matching system.
- FIG. 11 is a flowchart of step S50 of a method for acquiring depth information by a low-power stereo matching system
- FIG. 12 is a left image of an object in an application scene.
- FIG. 13 is a right image of the same object in an application scene.
- FIG. 14 is a depth map of the same target output in an application scene.
- the present application provides a low-power stereo matching system and a method for obtaining depth information.
- the present application is further described in detail below with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are only used to explain the present application, but not to limit the present application.
- the basic principle of the stereo matching system is: calibrated cameras at different positions capture the same target to obtain two-dimensional images from different angles; the depth information of a spatial point is then obtained from the difference between its corresponding pixel positions in the two two-dimensional images.
- an image composed of depth information is called a depth image.
- the inventor's research shows that the general stereo matching algorithm can be divided into a global stereo matching algorithm, a semi-global stereo matching algorithm (SGM), and a local stereo matching algorithm according to different energy cost functions.
- the global stereo matching algorithm mainly adopts the global optimization theory method to estimate the disparity, establishes the global energy function, and obtains the optimal disparity value by minimizing the global energy function.
- the local matching algorithm mainly uses the local optimization method to estimate the disparity value, and also uses the energy minimization method to estimate the disparity value.
- the semi-global stereo matching algorithm uses the constraint information of the pixel itself and its neighborhood (a characteristic of the local stereo matching algorithm), and uses the dynamic programming idea to approximate the two-dimensional smoothness constraint (a characteristic of the global stereo matching algorithm) by one-dimensional smoothness constraints in multiple directions. It then combines the data on each one-dimensional path and introduces a depth-dependent penalty factor as a smoothing term, so that the algorithm is strongly robust to noise while ensuring accuracy.
- the present application provides a low-power stereo matching system and a method for obtaining depth information.
- the Census algorithm is used as the initial pixel cost calculation function;
- the partition optimization module is added between the census module and the multi-path cost aggregation module to simplify the initial pixel cost values. Based on sub-region processing and the optimal disparity position, the number of initial pixel cost values passed into the multi-path cost aggregation module is reduced without affecting the accuracy, thereby reducing algorithm time and resource consumption while ensuring accuracy.
- FIG. 1 is a structural block diagram of a low-power stereo matching system provided by the present application.
- the low-power stereo matching system includes a census module 1, a partition optimization module 2, a multi-path cost aggregation module 3 and a depth calculation module 4. In this way, the number of initial pixel cost values output by census module 1 is reduced by partition optimization module 2 before being passed into multi-path cost aggregation module 3; the multi-path costs are then aggregated based on the optimal disparity position, and finally high-precision depth information is obtained for each target pixel.
- the census module 1 (also known as the Census module) is used to receive the left image and the right image of the target as input and to output multiple initial pixel cost values of the target pixels.
- the census module 1 includes a left census transcoding unit 11 for converting the pixel stream of the left image, a right census transcoding unit 12 for converting the pixel stream of the right image, and a Hamming distance calculation unit 13 for determining a plurality of initial pixel cost values of each target pixel.
- the Census transformation is implemented by the left census transcoding unit 11 and the right census transcoding unit 12 respectively; that is, the left census transcoding unit 11 encodes the grayscale values of the pixels in the input left image into a left binary code stream, and the right census transcoding unit 12 encodes the grayscale values of the pixels in the input right image into a right binary code stream, so as to capture the size relationship between the grayscale value of each neighborhood pixel and the grayscale value of the central pixel. The initial pixel cost values of each target pixel are then output by the Hamming distance calculation unit 13.
- the initial pixel cost value refers to the degree of similarity between pixel blocks in the two images obtained by the binocular camera. For example: a parallax search range is preset; if the calibrated right camera is used as the reference, pixels are selected in sequence in the image acquired by the right camera, and for each of them the matching cost is calculated against the pixel at the same coordinates in the image acquired by the calibrated left camera and all pixels within the parallax range to its right, which yields the initial pixel cost values.
- the lower the initial pixel cost value, the better the match between the pixel in the right image and the corresponding pixel in the left image under the given parallax.
- FIG. 2 is a structural block diagram of the census module 1 in the low-power stereo matching system.
- the right census transcoding unit 12 includes a right line buffer unit 121 , a right census window unit (not shown in the figure) and a right census comparison unit (not shown in the figure).
- the left census transcoding unit 11 includes a left line buffer unit 111, a left census window unit (not shown in the figure) and a left census comparison unit (not shown in the figure), wherein the right line buffer unit 121 and the left line buffer unit 111 have the same structure, and the right census window unit (not shown in the figure) and the right census comparison unit (not shown in the figure) differ in structure from the left census window unit (not shown in the figure) and the left census comparison unit (not shown in the figure), respectively.
- the right line buffer unit 121, the right census window unit (not shown in the figure), the right census comparison unit (not shown in the figure) and the Hamming distance calculation unit 13 form a four-stage pipeline structure, so that every four cycles the Census transform of a right-image pixel and the Hamming distance values (that is, the initial pixel cost values) of all pixels within the preset parallax search range are calculated.
- the right line buffer unit 121 (linebuffer) is used to sequentially buffer two consecutive lines of pixel value information in the right image pixel stream, which includes a first line buffer unit 1211 (linebuffer1) and a second line buffer unit 1212 (linebuffer2) , the depth values of the first line buffer unit 1211 and the second line buffer unit 1212 are the same, and both consist of a first-in, first-out queue.
- the right image is input to the right line buffer unit, and the two lines of pixel value information in the right image are sequentially buffered by the right line buffer unit in a first-in, first-out manner of pixel flow.
- each pixel has a gray value, and the depth of the line buffers depends on the number of columns of the input image.
- Taking the input right image as an example, assume that the right image is 640*480 pixels, so that one row contains 640 pixels and the depths of linebuffer1 and linebuffer2 are both 640. The right image is input starting from the upper left corner and ending at the lower right corner, and one pixel gray value is passed into linebuffer1 per clock cycle in the form of a pixel stream.
- Once the first row of pixels of the right image has been transferred, linebuffer1 is filled; at 1280 cycles, the transmission of the pixel value information of the second row of the right image is completed:
- linebuffer1 is then filled with the pixel value information of the second row of the right image,
- and linebuffer2 is filled with the pixel value information of the first row of the right image.
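- As an illustrative software analogy only (not part of the patent hardware; the class name LineBuffer and the method push_pixel are our own), the behaviour of the two 640-deep line buffers can be sketched as follows: each incoming pixel enters linebuffer1, and the pixel it displaces spills into linebuffer2, so that after two full rows linebuffer1 holds the second row and linebuffer2 holds the first.

```python
from collections import deque

class LineBuffer:
    """Software sketch of the two 640-deep FIFOs (linebuffer1 / linebuffer2)."""

    def __init__(self, width=640):
        self.width = width
        self.lb1 = deque(maxlen=width)  # most recently received row (filling)
        self.lb2 = deque(maxlen=width)  # row received before that

    def push_pixel(self, gray):
        # When lb1 is full, its oldest pixel spills into lb2 (one value per clock).
        if len(self.lb1) == self.width:
            spilled = self.lb1.popleft()
            self.lb2.append(spilled)
        self.lb1.append(gray)

# Feed a 640x480 image row by row, pixel by pixel (upper-left to lower-right).
buf = LineBuffer(width=640)
image = [[(r * 640 + c) % 256 for c in range(640)] for r in range(480)]
for row in image[:2]:          # after 1280 pixels (two full rows) ...
    for gray in row:
        buf.push_pixel(gray)
# ... lb1 holds row 2 and lb2 holds row 1, as described above.
assert list(buf.lb1) == image[1] and list(buf.lb2) == image[0]
```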
- the right census window unit (not shown in the figure) includes a right sliding window 1221 for traversing the right image.
- the right census window unit (not shown in the figure) performs a traversal operation on the entire right image through the right sliding window 1221 and outputs the grayscale values of all the pixels contained in the right sliding window 1221.
- the right sliding window 1221 is composed of several registers, and the size of the right sliding window 1221 is predefined according to user needs; the chosen size keeps the binary code streams at least 8 bits long while incurring minimal overhead.
- the right census window unit (not shown in the figure) is a three-stage pipeline structure.
- the right sliding window 1221 preferably includes 9 registers in this embodiment of the present application, forming a 3*3 matrix in which each layer of the sliding window includes 3 identical registers.
- the first layer of the sliding window is composed of register 1, register 2, and register 3.
- the second layer of the sliding window is composed of register 4, register 5, and register 6.
- the third layer of the sliding window is composed of register 7, register 8, and register 9.
- register 5 is used to process the center pixel of the right sliding window 1221, while register 1, register 2, register 3, register 4, register 6, register 7, register 8, and register 9 are used to process the 8 pixels in the neighborhood of the center pixel of the right sliding window 1221; these 8 pixels are therefore also called the neighbor pixels of the center pixel.
- the center pixel is the target pixel.
- the right census comparison unit (not shown in the figure) is used to take the center pixel at each step of the traversal of the right image by the right sliding window 1221 as a reference pixel and to compare the gray value of the reference pixel with the gray value of each pixel in the neighborhood of the center pixel of the right sliding window 1221: if the gray value of a neighborhood pixel is less than or equal to the gray value of the reference pixel, the right census comparison unit (not shown in the figure) outputs 0; if the gray value of the neighborhood pixel is greater than the gray value of the reference pixel, the right census comparison unit (not shown in the figure) outputs 1.
- the right census comparison unit corresponds to the right census window unit (not shown in the figure) and is also a three-level cascade; the right census comparison unit (not shown in the figure) includes a right comparison window 1231, and the right comparison window 1231 includes several comparators.
- the right comparison window 1231 preferably includes 8 comparators in the embodiment of the present application, also arranged as a 3*3 matrix: the first layer comparison window and the third layer comparison window each include 3 identical comparators, the first layer comparison window being composed of comparator 1, comparator 2, and comparator 3 and the third layer comparison window being composed of comparator 7, comparator 8, and comparator 9, while the second layer comparison window is composed of only 2 comparators, namely comparator 4 and comparator 6.
- the output terminal of register 5 of the right sliding window 1221 is commonly connected to the input terminals of the eight comparators.
- the other 8 registers of the right sliding window 1221 (all except register 5) correspond one-to-one with the 8 comparators of the right comparison window 1231: the output of register 1 is connected to the input of comparator 1, the output of register 2 to the input of comparator 2, the output of register 3 to the input of comparator 3, the output of register 4 to the input of comparator 4, the output of register 6 to the input of comparator 6, the output of register 7 to the input of comparator 7, the output of register 8 to the input of comparator 8, and the output of register 9 to the input of comparator 9.
- the output values of the 8 comparators are concatenated bit by bit from left to right and from top to bottom, so that an 8-bit bit string, that is, a binary code stream composed of 0s and 1s (also called the right binary code stream), is obtained through the right census comparison unit (not shown in the figure).
- in summary, the pixel stream of the right image passes through the right census window unit (not shown in the figure) to obtain the gray value of each target pixel and the gray values of the neighboring pixels corresponding to that target pixel; the right census comparison unit (not shown in the figure) then compares the gray value of each target pixel with the gray value of each pixel in its neighborhood, and outputs, in bit order, the right binary code stream corresponding to each target pixel of the right image.
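- As a hedged software illustration of the transform just described (the function name census_3x3 is ours, not the patent's), the 3*3 census code of one pixel can be computed by comparing the eight neighbours with the centre and concatenating the results from top to bottom and left to right:

```python
def census_3x3(img, u, v):
    """Return the 8-bit census code of pixel (u, v): one bit per neighbour,
    0 if the neighbour's gray value is <= the centre's, 1 if it is greater."""
    centre = img[u][v]
    bits = 0
    for du in (-1, 0, 1):            # top to bottom
        for dv in (-1, 0, 1):        # left to right
            if du == 0 and dv == 0:  # skip the centre pixel (register 5)
                continue
            neighbour = img[u + du][v + dv]
            bits = (bits << 1) | (1 if neighbour > centre else 0)
    return bits                      # 8-bit value

# Tiny example: one 3*3 patch, census code of its centre pixel.
patch = [[10, 20, 30],
         [40, 25, 10],
         [ 5, 60, 25]]
print(format(census_3x3(patch, 1, 1), "08b"))  # -> 00110010
```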
- the left census transcoding unit 11 and the right census transcoding unit 12 use the same Census transformation method, that is, convert the image pixels of the left image into a binary code stream.
- the left census transcoding unit 11 includes a left line cache unit 111, a left census window unit (not shown in the figure) and a left census comparison unit (not shown in the figure), wherein the left line cache unit 111 has the same structure as the right line cache unit 121, while the left census window unit (not shown in the figure) and the left census comparison unit (not shown in the figure) have structures different from those of the right census window unit (not shown in the figure) and the right census comparison unit (not shown in the figure), respectively.
- the left line buffer unit 111 is also used to sequentially buffer two consecutive lines of pixel value information in the left image pixel stream, which includes a first line buffer unit 1111 and a second line buffer unit 1112.
- the left line buffer unit 111 works in the same way as the right line buffer unit 121; for details, refer to the implementation process of the right line buffer unit 121 described above.
- the left census window unit (not shown in the figure) includes left disparity windows 1121 for traversing each disparity pixel within the preset disparity threshold region from the target pixel in the left image; that is, when the right image is calibrated as the target image, each target pixel in the right image corresponds to a preset-disparity-threshold number of left disparity windows 1121 (also known as Disparity Range windows).
- the number of the left parallax windows 1121 is a preset parallax threshold.
- the preset parallax threshold is preferably 96, with which the best match of the target pixels can be obtained at low overhead.
- the preset parallax threshold is not limited, however, and can be set according to user requirements, for example 48, 24, and so on.
- a target pixel 1 in the right image corresponds to a right sliding window and a right comparison window.
- in the left image, the search is performed within the preset disparity threshold area (i.e., the neighborhood) starting from the pixel with the same coordinates as target pixel 1.
- target pixel 1 therefore has 96 disparity pixels, that is, 96 left disparity windows 1121.
- in other words, there is not a single left parallax window 1121 in the present application; there is one window for each of the preset-parallax-threshold disparity positions.
- this concurrent processing not only improves the search rate but also further improves the matching accuracy of the target pixel.
- each of the left disparity windows 1121 includes several registers; in this embodiment of the present application, each of the left disparity windows 1121 includes 9 registers, and the left census comparison unit (not marked in the figure) includes a preset-parallax-threshold number of left comparison windows 1131. That is to say, the number of left disparity windows 1121 is the same as the number of left comparison windows 1131, both equal to the preset parallax threshold, for example 96 left comparison windows 1131.
- each of the left comparison windows 1131 includes 8 comparators, and each of the left comparison windows 1131 corresponds to the registers of one left disparity window 1121:
- the output of register 5 of each left disparity window 1121 is commonly connected to the inputs of the eight comparators of the corresponding left comparison window 1131.
- the remaining 8 registers of the left disparity window 1121 (all except register 5) correspond one-to-one with the 8 comparators of the left comparison window 1131: the output of register 1 is connected to the input of comparator 1, the output of register 2 to the input of comparator 2, the output of register 3 to the input of comparator 3, the output of register 4 to the input of comparator 4, the output of register 6 to the input of comparator 6, the output of register 7 to the input of comparator 7, the output of register 8 to the input of comparator 8, and the output of register 9 to the input of comparator 9.
- the output values of the 8 comparators are concatenated bit by bit from left to right and from top to bottom, so that an 8-bit bit string, that is, a binary code stream composed of 0s and 1s, is obtained through the left census comparison unit (not shown in the figure).
- thus, a target pixel in the right image outputs one right binary code stream, while the left image concurrently outputs 96 left binary code streams through the 96 left disparity windows 1121 and 96 left comparison windows 1131.
- the Hamming Distance calculation unit 13 is configured to receive the bit strings respectively output by the left census transcoding unit 11 and the right census transcoding unit 12, and to calculate the Hamming distance between the right binary code stream of a target pixel and each of its left binary code streams, so that the Hamming distance calculation unit 13 determines a plurality of initial pixel cost values for each target pixel.
- the Hamming distance value is the initial pixel cost value, which refers to the number of corresponding bits of the two bit strings that are different (ie, one is 1 and the other is 0).
- the Hamming distance calculation unit 13 includes a plurality of XOR gates arranged in parallel and an adder (not shown in the figure) that is commonly connected with the output ends of all the XOR gates.
- the number of the XOR gates is preferably 9, denoted XOR gate 1 through XOR gate 9.
- each XOR gate has two input terminals and one output terminal; the two input terminals of each XOR gate are respectively connected to the output terminal of one comparator of the left census comparison unit (not shown in the figure) and the output terminal of the corresponding comparator of the right census comparison unit (not shown in the figure), and the output terminals of all the XOR gates are connected to the adder. That is to say, for a target pixel, the right binary code stream and the 96 left binary code streams are input in parallel to the inputs of the XOR gates, the XOR operation is performed by each XOR gate, and the outputs of all the XOR gates are fed into the adder.
- the adder then outputs the 96 initial pixel cost values of the target pixel.
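- Continuing the software sketch above (the helper names hamming and initial_costs and the layout of the census arrays are illustrative assumptions, not the patent's), the XOR gates plus adder amount to an XOR followed by a pop-count between the target pixel's census code and each of the 96 candidate census codes from the other image:

```python
def hamming(a, b):
    """Number of differing bits between two census codes (XOR + pop-count)."""
    return bin(a ^ b).count("1")

def initial_costs(census_right, census_left, u, v, disparity_range=96):
    """Initial pixel cost values of right-image target pixel (u, v):
    one Hamming distance per candidate disparity d in [0, disparity_range).
    Assumes v + d stays inside the left image."""
    target = census_right[u][v]
    costs = []
    for d in range(disparity_range):
        candidate = census_left[u][v + d]   # candidate shifted by d in the left image
        costs.append(hamming(target, candidate))
    return costs   # the smaller the value, the better the match

# Minimal demo with a 1x8 row of census codes and a 4-level disparity range.
census_r = [[0b00110010, 0, 0, 0, 0, 0, 0, 0]]
census_l = [[0b00110010, 0b00111010, 0b11110000, 0b00000000, 0, 0, 0, 0]]
print(initial_costs(census_r, census_l, 0, 0, disparity_range=4))  # [0, 1, 3, 3]
```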
- after the right camera reads the image data, the data is passed into linebuffer1 (the first line buffer unit) and linebuffer2 (the second line buffer unit) in turn through the right image pixel stream (pixelstream).
- when linebuffer2 is full, while the pixelstream continues to enter linebuffer1, the pixelstream, the rightmost data of linebuffer1 and the rightmost data of linebuffer2 respectively enter the three registers of stage 1 in the Census window unit and are shifted to the right as subsequent clocks arrive; after the Census window unit has been filled, on the next clock cycle the contents of the 8 registers of the Census window unit other than register 5 are passed into the 8 comparators at the corresponding positions in the Census comparison unit, and the content of register 5 is passed into the Census comparison unit.
- each target pixel has 96 initial pixel cost values. For example, for a certain pixel in the right image, several pixels in the left image correspond to it under different parallax values, each producing a cost value; the number of cost values equals the preset parallax range, and these values are also known as the initial pixel costs of the pixel.
- the partition optimization module 2 is used to divide the multiple initial pixel cost values of each target pixel into multiple regions, and optimize each region to obtain the minimum region cost value of each region. and the position of the best matching point in the area corresponding to the minimum area cost value, the purpose of which is to reduce the data amount of the initial pixel cost value output by the census module.
- the optimization process refers to dividing the plurality of initial pixel cost values of a target pixel into a plurality of regions, each containing several initial pixel cost values; the partition optimization module 2 includes a plurality of area units 20 for determining the minimum area cost value of each region and the position of the best matching point in the region corresponding to the minimum area cost value.
- Each area unit 20 uses a tree structure to process all the initial pixel cost values C1 to C4 contained in the area unit. Please refer to FIG. 3 .
- FIG. 3 is a structural block diagram of each area unit. As shown in FIG. 3, each area unit 20 includes two first minimum subunits 21 (min_first) arranged in parallel and a second minimum subunit 22 (min_second) commonly connected to the output ends of the two first minimum subunits 21.
- FIG. 4 is a structural block diagram of the first smallest sub-unit 21
- FIG. 5 is a structural block diagram of the second smallest sub-unit 22 .
- the first minimum subunit 21 includes a first comparator, a first data multiplexer mux and a first position multiplexer mux; the first data multiplexer is used to output the smaller of any two input initial pixel cost values, and the first position multiplexer is used to output the position value corresponding to that minimum initial pixel cost value (the position of the best matching pixel point, represented by 1 bit).
- the first comparator includes two input ends and one output end
- the first data multiplexer includes three input ends and one output end
- the first position multiplexer includes three input terminals and one output terminal.
- the two input terminals of the first comparator and two input terminals of the first data multiplexer are connected to any two initial pixel cost values, such as c1 and c2.
- the third input terminal of the first data multiplexer and the third input terminal of the first position multiplexer are commonly connected to the output terminal of the first comparator,
- and the output terminal of the first data multiplexer and the output terminal of the first position multiplexer are connected to the input terminals of the second minimum subunit.
- if c1 ≤ c2, the first data multiplexer outputs c1 and the first position multiplexer outputs 0, representing the best matching pixel position; otherwise, if c1 > c2, the first data multiplexer outputs c2 and the first position multiplexer outputs 1, representing the best matching position.
- the second minimum subunit 22 includes a second comparator, a second data multiplexer mux, two second position multiplexers mux, and an AND gate unit.
- the second data multiplexer is used to output the minimum initial pixel cost value in the output data of the two first minimum subunits,
- and the two second position multiplexers are used to output the best matching position value corresponding to that minimum initial pixel cost value.
- the second comparator includes three input ends and one output end
- the second data multiplexer includes three input ends and one output end
- the two second position multiplexers each include three input terminals and one output terminal,
- the AND gate unit includes two input terminals and an output terminal.
- the two input terminals of the second comparator and two input terminals of the second data multiplexer are jointly connected to the minimum initial pixel cost values respectively output by the two first minimum subunits min_first, such as c1 and c2; the third input terminal of the second data multiplexer and the third input terminals of the two second position multiplexers are connected to the output terminal of the second comparator, and the output terminals of the two second position multiplexers are connected to the two input terminals of the AND gate unit.
- the two input terminals of one second position multiplexer are fed 0 and 1, and the two input terminals of the other second position multiplexer are fed the position values p1 and p2 output by the two first minimum subunits.
- if c1 ≤ c2, the second data multiplexer outputs c1 and the two second position multiplexers output p1 and 0 respectively, representing the best matching position; otherwise, if c1 > c2, the second data multiplexer outputs c2 and the two second position multiplexers output p2 and 1 respectively, representing the best matching position.
- in this way, the initial pixel cost values are simplified by the partition optimization module 2, and the number of initial pixel cost values passed into the multi-path cost aggregation module 3 is reduced without affecting the accuracy, thereby reducing the resource overhead and speeding up processing.
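- A minimal software sketch of one area unit 20 follows (the helper names min_first and area_unit and the exact bit order of the 2-bit position are our assumptions): two first-level minimums over the pairs (c1, c2) and (c3, c4), then a second-level minimum that also forms the 2-bit position of the best matching point, mirroring the min_first/min_second tree described above.

```python
def min_first(c_a, c_b):
    """First-level minimum: returns (min cost, 1-bit position of the winner)."""
    return (c_a, 0) if c_a <= c_b else (c_b, 1)

def area_unit(c1, c2, c3, c4):
    """One area unit: minimum of the 4 initial pixel cost values plus the
    2-bit position of the best matching point inside the area (0..3)."""
    left_cost, left_pos = min_first(c1, c2)     # first min_first subunit
    right_cost, right_pos = min_first(c3, c4)   # second min_first subunit
    # min_second: keep the smaller partial minimum; the position is
    # "which half won" (high bit) combined with that half's 1-bit position.
    if left_cost <= right_cost:
        return left_cost, (0 << 1) | left_pos
    return right_cost, (1 << 1) | right_pos

# Example: 4 initial pixel cost values of one area.
print(area_unit(7, 3, 5, 6))   # (3, 1): cost 3 at position 1 within the area
```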
- FIG. 6 is a structural block diagram of the multi-path cost aggregation module.
- the multi-path cost aggregation module 3 (SGM module) is used to calculate, for the minimum area cost value of each area, the path cost along a plurality of preset path directions, obtaining one path cost value per preset path direction,
- and to aggregate all the path cost values corresponding to the preset path directions so as to obtain the energy function value of the corresponding area of the target pixel.
- the multi-path cost aggregation module 3 includes several first-in, first-out units 31 (FIFO units) for transmitting the cost aggregation value of each area and assisting in updating the cost of each area, and several cost aggregation modules 32 for determining the path cost values in the different preset path directions corresponding to each area.
- the number of FIFO units 31 is preferably 4, and the number of cost aggregation modules 32 is also preferably 4, namely cost aggregation module 1 (Agg1), cost aggregation module 2 (Agg2), cost aggregation module 3 (Agg3), and cost aggregation module 4 (Agg4).
- each of the FIFO units 31 includes three transmission units and one direction FIFO unit, and the number of direction FIFO units is the same as the number of cost aggregation modules 32, corresponding to the cost aggregation in the 4 directions.
- each transmission unit consists of 24 registers; the FIFO structure therefore consists of the direction FIFO units for the four directions 0°, 45°, 90° and 135° and 12 transmission units.
- the four cost aggregation modules form a four-layer parallel two-stage pipeline structure: the first layer is the aggregation of the second cost aggregation module Agg2 onto itself, the second layer runs from the second cost aggregation module Agg2 to the fourth cost aggregation module Agg4, the third layer runs from the second cost aggregation module Agg2 to the third cost aggregation module Agg3, and the fourth layer runs from the first cost aggregation module Agg1 to the fourth cost aggregation module Agg4.
- since the first line of each frame of image only needs to perform cost aggregation in the 0° direction, as shown in FIG. 6, a counter keeps count so that
- the 24 area cost values are passed into the FIFO unit instead of Agg4;
- when the area cost value of the last pixel of the first line is passed into Agg2,
- the area cost values of the pixels of the second line begin to be passed into Agg4.
- the output of Agg3 skips the first two transmission units in the FIFO unit and is passed directly into the direction FIFO unit of the corresponding direction.
- in this way, the cost aggregation values in the cost aggregation modules 32 can be passed on in sequence, helping the cost aggregation modules 32 update their costs.
- the addition aggregation module (Sum) 33 is composed of adders. After updating in the 4 directions, Agg1 contains 24*4 area aggregation values, also called path cost values. Taking one region as an example, the addition aggregation module 33 adds the path cost values corresponding to the four directions of that region through the adder to obtain the total path cost value, and in this way obtains the 24 energy function values of the 24 regions.
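- A brief software analogy of the addition aggregation module 33 (the array layout is our assumption): given the 24*4 updated path cost values, the energy function value of each area is simply the sum of its four directional path costs.

```python
def sum_energy(path_costs):
    """path_costs[r][i]: path cost of area i along direction r (4 x 24 values).
    Returns the 24 energy function values E[i] = sum over the four directions."""
    num_areas = len(path_costs[0])
    return [sum(path_costs[r][i] for r in range(4)) for i in range(num_areas)]

# Example with 4 directions x 24 areas of dummy path costs.
dummy = [[r + i for i in range(24)] for r in range(4)]
print(sum_energy(dummy)[:3])   # [6, 10, 14]
```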
- the depth calculation module 4 is used to determine the parallax region corresponding to the smallest energy function value among the multiple energy function values and, based on the position of the best matching point in the region and the parallax region, to obtain the depth information.
- the depth calculation module 4 includes a minimum energy search module 41, a shift module 42 and an addition module 43.
- the minimum energy search module 41 is used to find the minimum energy value, that is, to determine the minimum energy function value among the multiple energy function values of each region.
- the shift module 42 is used to determine the disparity region of the minimum energy function of the corresponding region; that is, a shift is used to emulate a multiplication (*4) and form the disparity region corresponding to the minimum energy function. The addition module 43 then calculates the depth information of each target pixel from the position of the best matching point in the region output by the partition optimization module 2 and the disparity region corresponding to the minimum energy function output by the shift module 42.
- the depth calculation module 4 is a three-stage pipeline structure; that is, the calculation from the 24 aggregated energy values to the depth information is completed in three clock cycles: the first two clock cycles find the minimum of the 24 energy values, and the third clock cycle completes the depth information calculation.
- the energy minimum search module 41 is a two-stage pipeline structure implemented as a tree, in which each clock cycle processes three comparison levels; its input is the 24 energy function values and its output is the minimum value.
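- An illustrative software counterpart of this tree-structured search (the pairwise pairing scheme is our assumption): the 24 energy function values are reduced pairwise, level by level, while tracking the index of the winning area; this takes 5 comparison levels in total, split here as the 3 + 2 levels of the two pipeline stages.

```python
def tree_argmin(energies):
    """Pairwise tournament over the energy values; returns (min value, area index)."""
    candidates = list(enumerate(energies))          # (area index, energy)
    while len(candidates) > 1:
        nxt = []
        for k in range(0, len(candidates) - 1, 2):  # compare neighbours in pairs
            a, b = candidates[k], candidates[k + 1]
            nxt.append(a if a[1] <= b[1] else b)
        if len(candidates) % 2:                      # odd element passes through
            nxt.append(candidates[-1])
        candidates = nxt
    index, value = candidates[0]
    return value, index

energies = [50, 42, 47, 31, 60, 33] + [55] * 18     # 24 dummy energy values
print(tree_argmin(energies))                        # (31, 3): minimum at area 3
```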
- the present application also provides a method for obtaining depth information by a low-power stereo matching system.
- FIG. 8 is a flowchart of the method for obtaining depth information by the low-power stereo matching system provided by the present application.
- S10 Collect a left image and a right image of the target; wherein, the left image or the right image is used as the target image, and the target image includes several target pixels.
- the left image and the right image of the same target are collected by the binocular cameras.
- the binocular camera simulates human eyes and consists of two monocular cameras, namely a left camera and a right camera. Each monocular camera shoots the same target to output an image, corresponding to the left image and the right image respectively. Therefore, the left image and the right image are images corresponding to the same target (eg, the same pixel block) at different angles.
- the binocular camera shoots the same object target and obtains the images; in this embodiment, the right image is used as the target image and the left image as the non-target image.
- the left and right images are input to the low-power stereo matching system to obtain high-precision depth information for each target pixel.
- a parallax threshold, that is, the parallax search range of a target pixel, is preset; for the same target pixel in the calibrated target image, the matching cost and a series of calculations and optimizations are performed against all the parallax pixels within the parallax threshold range in the corresponding neighborhood, so as to obtain the depth information of that target pixel.
- for each of the plurality of target pixels in the target image, a preset-parallax-threshold number of parallax pixels corresponding to the target pixel is determined in the non-target image; based on the target pixel and all parallax pixels, the first binary code stream of the target pixel and the second binary code stream of each parallax pixel are determined, and the Hamming distance calculation is performed on the first binary code stream and each second binary code stream respectively to obtain multiple initial pixel cost values of the target pixel.
- first, a preset parallax threshold is obtained and a target pixel in the target image is determined. In the image area of the target image, all the neighborhood pixels of the target pixel are determined; that is, for a target pixel 1 in the right image, the right sliding window 1221 searches the right image area for the other 8 neighbor pixels of target pixel 1. The gray value of target pixel 1 and the gray values of all its neighborhood pixels are then obtained, and the gray value of the target pixel is compared, through the right comparison window 1231, with the gray values of all the neighboring pixels corresponding to the target pixel itself.
- if the gray value of a neighborhood pixel is less than or equal to the gray value of the target pixel, 0 is output; if the gray value of a neighborhood pixel is greater than the gray value of the target pixel, 1 is output. The 8 comparison results are concatenated bitwise (that is, from top to bottom and from left to right), and the first binary code stream of the target pixel (that is, the right binary code stream described above) is output.
- next, based on target pixel 1 and the preset parallax threshold, all parallax pixels within the preset parallax threshold area are searched in the image area of the non-target image; that is, in the image area of the left image, the 96 disparity pixels within the preset disparity threshold of the pixel at the same coordinates are found, so that one target pixel 1 corresponds to 96 disparity pixels and hence to 96 left disparity windows 1121. Then, with each disparity pixel as the center, the corresponding left disparity window 1121 searches the image area of the left image for the 8 neighboring pixels of that disparity pixel.
- each disparity pixel is likewise used as a reference pixel, and the gray value of each reference pixel is compared with the gray values of all of its neighboring pixels: if the gray value of a neighborhood pixel is less than or equal to the gray value of the reference pixel, 0 is output; if the gray value of a neighborhood pixel is greater than the gray value of the reference pixel, 1 is output. The comparison results of each parallax pixel are then concatenated bitwise (that is, from top to bottom and from left to right), and the second binary code stream of each disparity pixel is output, giving a total of 96 second binary code streams for the target pixel (that is, the left binary code streams described above).
- the binary code stream conversion can be expressed by formula (1) and formula (2): the census transform Cs(u, v) is the bit-wise concatenation, over the window offsets i ∈ [-n', n'] and j ∈ [-m', m'], of ξ(I(u, v), I(u+i, v+j)), where ξ(x, y) is the comparison function that outputs 0 when y ≤ x and 1 when y > x.
- here (u, v) is the center pixel coordinate,
- x and y are the two gray values to be compared,
- Cs(u, v) is the 8-bit bit string obtained after the census transformation at coordinates (u, v), and I(u, v) is the pixel value at coordinates (u, v).
- the first binary code stream is XORed with each of the 96 second binary code streams. If a certain bit of the first binary code stream differs from the corresponding bit of a second binary code stream, 1 is output; if the bits are the same, 0 is output. The adder (not marked in the figure) then counts the number of bits equal to 1 in each of the 96 XOR results; each count is an initial pixel cost value, so 96 initial pixel cost values are output.
- formula (3) indicates that the two bit strings are XORed and the number of bits equal to 1 in the XOR result is counted.
- C(u, v, d) is the initial matching cost of the pixel (u, v) under the disparity d, d ∈ [0, disparityrange - 1], where disparityrange represents the disparity range.
- S30 Determine several areas of the target pixel and the several initial pixel cost values of each area based on a preset area cost threshold, and determine, based on the several initial pixel cost values of each of the several areas, the minimum area cost value of each area and the location of the area best matching point of each minimum area cost value.
- the regional cost threshold is preset.
- in this embodiment of the present application, the area cost threshold is preferably 4; that is, each area is defined to include 4 initial pixel cost values. Taking one target pixel as an example, its 96 initial pixel cost values, grouped 4 per area, can be divided into 24 area units. Therefore, the output of the partition optimization module 2 is 24 area cost values and 24 area best matching point positions. In this way, the amount of data input into the multi-path cost aggregation module 3 after optimization by the partition optimization module 2 is reduced by a factor of at least 4, thereby reducing the overall resource consumption.
- Each area unit 20 processes 4 initial pixel cost values c1, c2, c3 and c4; the area cost value is the smallest initial pixel cost value in the area unit, and the position corresponding to that smallest initial pixel cost value is retained: it becomes the best matching point of area unit i, that is, the position of the best matching pixel point.
- the area unit i is expressed by formula (4) and formula (5) as: the area cost value of area unit i is min(c_{i-3}, c_{i-2}, c_{i-1}, c_i), and p'_i is the position at which this minimum is attained;
- min is the function for finding the minimum value,
- c_{i-3}, c_{i-2}, c_{i-1}, c_i respectively represent the four initial pixel cost values passed in, and
- p'_i represents the position of the best matching pixel point in the area unit.
- the step S30 specifically includes:
- in this way, the smallest initial cost value among the multiple initial pixel cost values in each of the several regions is selected through two levels of minimum processing, which further improves the matching accuracy and ensures the accuracy of the data.
- for one of the 24 area cost values optimized by the partition optimization module 2, the multi-path cost aggregation module 3 selects the four directions 0°, 45°, 90° and 135°,
- performs the path cost calculation in each of these four directions, and obtains the path cost values in the four directions 0°, 45°, 90° and 135°;
- these path cost values are then aggregated to obtain the energy function value of this area.
- the same method is used to calculate the multipath cost of other areas, and 24 energy function values corresponding to 24 areas are obtained, thereby determining the 24 energy function values of each target pixel.
- Agg2 calculates the minimum value of the aggregated values in the 0° direction, the minimum value of the aggregated values in the 90° direction, and the minimum value of the aggregated values in the 135° direction, while Agg1 calculates the minimum value of the aggregated values in the 45° direction.
- the minimum path cost value among the aggregated values in each of these four directions is the minimum term used in the single-direction path cost calculation formula (6).
- L_r(p, i) denotes the path cost value of pixel p in path direction r at disparity area i;
- the first term C(p, i) of the formula is the area cost value of pixel p at disparity area i;
- p−r denotes the previous pixel of pixel p along path direction r;
- L_r(p−r, i) denotes the path cost value of pixel p−r at disparity area i in path direction r;
- L_r(p−r, i−1) denotes the path cost value of pixel p−r at disparity area i−1 in path direction r;
- L_r(p−r, i+1) denotes the path cost value of pixel p−r at disparity area i+1 in path direction r;
- P_1, P_2 are preset penalty coefficients;
- min_j L_r(p−r, j) denotes the minimum path cost of point p−r over all disparity areas j along the path.
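- Formula (6) itself is shown only as an image in the published text; given the terms defined above it matches the standard semi-global-matching path cost recurrence, reconstructed here as an assumed reading:

```latex
% Assumed reconstruction of formula (6) from the term definitions above;
% the original formula is shown only as an image in the published text.
L_r(p, i) = C(p, i)
  + \min\!\Big( L_r(p-r,\, i),\;
                L_r(p-r,\, i-1) + P_1,\;
                L_r(p-r,\, i+1) + P_1,\;
                \min_{j} L_r(p-r,\, j) + P_2 \Big)
  - \min_{j} L_r(p-r,\, j)
```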
- Agg2 in the first clock cycle corresponds to the previous pixel in the 0° direction of Agg2 in the second clock cycle;
- Agg2 in the first clock cycle corresponds to the previous pixel in the 90° direction of Agg4 in the second clock cycle;
- Agg2 in the first clock cycle corresponds to the previous pixel in the 135° direction of Agg3 in the second clock cycle;
- Agg1 in the first clock cycle corresponds to the previous pixel in the 45° direction of Agg4 in the second clock cycle.
- Agg2 obtains the minimum of the 0°-direction aggregated values calculated by Agg2 in the previous clock cycle, together with all 24 of Agg2's 0°-direction aggregated values from that cycle, and updates the cost according to the path cost calculation formula above;
- Agg4 obtains the minimum of the 90°-direction aggregated values calculated by Agg2 in the previous clock cycle together with all 24 of Agg2's 90°-direction aggregated values from that cycle, and also obtains the minimum of the 45°-direction aggregated values calculated by Agg1 in the previous clock cycle together with all 24 of Agg1's 45°-direction aggregated values from that cycle, and then updates the cost according to the path cost calculation formula.
- Agg3 obtains the minimum of the 135°-direction aggregated values calculated by Agg2 in the previous clock cycle, together with all 24 of Agg2's 135°-direction aggregated values from that cycle.
- Agg1 not only enables Agg4 to complete the cost update in the 45° direction, but also means that the updated cost values for all four directions are available, so that the cost updates in all four directions can be completed within this cost aggregation module.
- The path cost values of the region in the four directions are accumulated by formula (7) to form the energy function of disparity area i for the target pixel p; finally 24 energy function values are obtained.
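- Formula (7), likewise shown only as an image, is by this description the sum of the four directional path costs; a reconstruction assumed from the text:

```latex
% Assumed reconstruction of formula (7): the energy of disparity area i at target pixel p
% is the accumulation of the path cost values over the four path directions.
E(p, i) = \sum_{r \,\in\, \{0^\circ,\, 45^\circ,\, 90^\circ,\, 135^\circ\}} L_r(p, i)
```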
- the step S40 specifically includes:
- S41: Select multiple preset path directions, where the preset path directions include 0°, 45°, 90° and 135°;
- S42: For each of the several areas, calculate the path costs of that area's minimum area cost value along the multiple preset path directions, obtaining the minimum path cost value of the area in each preset path direction;
- S43: Aggregate the multiple minimum path cost values of the area to obtain the energy function value of the area.
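- A compact software sketch of steps S41–S43 follows, using the semi-global recurrence assumed for formula (6) above; the image-wide scan order, the penalty values and the helper names are illustrative assumptions (the hardware instead pipelines Agg1–Agg4 with first-in-first-out units):

```python
import numpy as np

P1, P2 = 1.0, 8.0  # illustrative penalty coefficients

def aggregate_direction(cost, dy, dx):
    """Path cost L_r for one direction r = (dy, dx) over an (H, W, D) area-cost volume."""
    H, W, D = cost.shape
    L = np.zeros_like(cost, dtype=np.float64)
    ys = range(H) if dy >= 0 else range(H - 1, -1, -1)
    xs = range(W) if dx >= 0 else range(W - 1, -1, -1)
    for y in ys:
        for x in xs:
            py, px = y - dy, x - dx
            if 0 <= py < H and 0 <= px < W:
                prev = L[py, px]
                min_prev = prev.min()
                minus = np.roll(prev, 1)    # L_r(p-r, i-1)
                minus[0] = np.inf           # no i-1 at the first disparity area
                plus = np.roll(prev, -1)    # L_r(p-r, i+1)
                plus[-1] = np.inf           # no i+1 at the last disparity area
                L[y, x] = cost[y, x] + np.minimum.reduce(
                    [prev, minus + P1, plus + P1, np.full(D, min_prev + P2)]) - min_prev
            else:
                L[y, x] = cost[y, x]        # first pixel on the path
    return L

def energy(cost):
    """S41-S43: aggregate path costs along 0°, 45°, 90°, 135° and sum them (formula (7))."""
    directions = [(0, 1), (1, 1), (1, 0), (1, -1)]   # 0°, 45°, 90°, 135° scan directions
    return sum(aggregate_direction(cost, dy, dx) for dy, dx in directions)

# Example: a 10x10 image with 24 area cost values per pixel -> 24 energy values per pixel
E = energy(np.random.rand(10, 10, 24))
```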
- the depth calculation module 4 obtains the depth information of each target pixel through formula (8).
- p′_i is the area best-match point position within the region.
- From the energy functions E(p, i) of the disparity areas of the target pixel p, the disparity area i corresponding to the minimum energy function is found, and the best matching position p′_i of that disparity area i of pixel p is taken from the fourth step.
- With the best matching position p′_i and the disparity area i, the depth information of the target pixel p is obtained.
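- Formula (8) is not reproduced in the text. One natural reading, stated here only as an assumption, is that the final disparity recombines the winning area index with the in-area best-match offset, after which depth follows from the usual stereo triangulation:

```latex
% Assumed decoding, not the literal formula (8): with 24 areas of 4 disparities each,
% the winning disparity recombines the optimal area index i* with the in-area offset p'_{i*};
% depth would then follow from triangulation with focal length f and baseline B.
i^{*} = \operatorname*{arg\,min}_{i} E(p, i), \qquad
d(p) = 4\, i^{*} + p'_{i^{*}}, \qquad
Z(p) = \frac{f \cdot B}{d(p)}
```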
- the step S50 specifically includes:
- The depth map of the target can finally be obtained.
- FIGS. 12-14 illustrate an application scenario of the method for acquiring depth information with the low-power stereo matching system of the present application.
- the present application provides a method for acquiring depth information in a low-power stereo matching system.
- The method for acquiring depth information with the low-power stereo matching system includes the following steps: collecting a left image and a right image of a target, where the left image or the right image serves as the target image and the target image includes several target pixels; for each target pixel among the several target pixels of the target image, determining the preset-disparity-threshold number of disparity pixels corresponding to that target pixel in the non-target image, determining, based on the target pixel and all of its disparity pixels, the first binary code stream of the target pixel and the second binary code stream of each disparity pixel, and computing the Hamming distance between the first binary code stream and each second binary code stream to obtain the multiple initial pixel cost values of the target pixel.
- Based on a preset area cost threshold, several areas of the target pixel and the several initial pixel cost values of each area are determined, and from these the minimum area cost value of each area and the area best-match point position of each minimum area cost value; based on multiple preset path directions, the path cost value of each area in each preset path direction is determined, and the path cost values of the same area over the multiple preset path directions are aggregated to obtain the energy function value of each area; based on the energy function values of all areas and the area best-match point position of each area, the disparity area corresponding to the minimum energy function value among all energy function values is determined, and the depth information of the target pixel is obtained from the area best-match point position of the area where the minimum energy function value lies and that disparity area.
- This application uses the Census algorithm as the initial pixel cost calculation function and adds a partition optimization module between the census module and the multi-path cost aggregation module to simplify the initial pixel cost values. By processing the costs per sub-region and keeping the optimal disparity position, the number of initial pixel cost values passed into the multi-path cost aggregation module is reduced without affecting accuracy, which reduces algorithm time and resource consumption while maintaining precision.
Landscapes
- Engineering & Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Image Processing (AREA)
Abstract
Description
Claims (12)
- A low-power stereo matching system, comprising a census module, a multi-path cost aggregation module and a depth calculation module, characterized in that it further comprises: a partition optimization module arranged between the census module and the multi-path cost aggregation module; the census module is configured to receive a left image and a right image of a target and to output multiple initial pixel cost values of each target pixel; the census module comprises a left census transcoding unit for converting the pixel stream of the left image, a right census transcoding unit for converting the pixel stream of the right image, and a Hamming distance calculation unit for determining the multiple initial pixel cost values of each target pixel; wherein the left image or the right image serves as the target image, and the target image includes several target pixels; the partition optimization module comprises multiple area units for determining the minimum area cost value that represents the cost value of an area and the area best-match point position of that minimum area cost value, each area unit comprising two first minimum sub-units arranged in parallel and a second minimum sub-unit connected to the outputs of the two first minimum sub-units; the multi-path cost aggregation module comprises several first-in-first-out units for transferring the cost aggregation values of each area and assisting the cost update of each area, several cost aggregation modules for determining the path cost values of each area in different preset path directions and separately aggregating the multiple path cost values of the same area, and an addition aggregation module for determining the multiple energy function values of each area; the depth calculation module comprises an energy-minimum search module for determining the minimum energy function value among the multiple energy function values of the areas, a shift module for determining the disparity area corresponding to that minimum energy function, and an addition module for determining the depth information of each target pixel based on each disparity area from the shift module and the area best-match point position of the corresponding area from the partition optimization module.
- The low-power stereo matching system according to claim 1, characterized in that the right census transcoding unit comprises a right line buffer unit, a right census window unit and a right census comparison unit, and the left census transcoding unit comprises a left line buffer unit, a left census window unit and a left census comparison unit, wherein the right line buffer unit has the same structure as the left line buffer unit, and the right census window unit and the right census comparison unit differ in structure from the left census window unit and the left census comparison unit, respectively.
- The low-power stereo matching system according to claim 2, characterized in that, when the left image is the target image and the right image is the non-target image, the left census window unit comprises one left sliding window for traversing the left image, the left sliding window comprising several registers, and the left census comparison unit comprises one left comparison window, the left comparison window comprising several comparators; the right census window unit comprises right disparity windows for traversing, in the right image, each disparity pixel within a preset-disparity-threshold region from the target pixel, the number of right disparity windows being the preset disparity threshold, and each right disparity window comprising several registers; the right census comparison unit comprises a preset-disparity-threshold number of right comparison windows, each right comparison window comprising several comparators.
- The low-power stereo matching system according to claim 1, characterized in that each first minimum sub-unit comprises a first comparator, a first data multiplexer and a first position multiplexer, the first data multiplexer being configured to output the minimum initial pixel cost value among the multiple input initial pixel cost values, and the first position multiplexer being configured to output the position information corresponding to that minimum initial pixel cost value; the second minimum sub-unit comprises one second comparator, one second data multiplexer, two second position multiplexers and one AND-gate unit, the second data multiplexer being configured to obtain the minimum initial pixel cost value from the output data of the two first minimum sub-units, and the second position multiplexers being configured to output the best matching position information corresponding to that minimum initial pixel cost value.
- The low-power stereo matching system according to claim 1, characterized in that each first-in-first-out unit comprises three transmission units and one directional first-in-first-out unit, and the number of directional first-in-first-out units is the same as the number of cost aggregation modules.
- A method for acquiring depth information with the low-power stereo matching system according to any one of claims 1 to 5, characterized in that the method for acquiring depth information with the low-power stereo matching system comprises the following steps: collecting a left image and a right image of a target, wherein the left image or the right image serves as the target image and the target image includes several target pixels; for each target pixel among the several target pixels of the target image, determining the preset-disparity-threshold number of disparity pixels corresponding to that target pixel in the non-target image, determining, based on the target pixel and all the disparity pixels, a first binary code stream of the target pixel and a second binary code stream of each disparity pixel, and performing a Hamming distance calculation between the first binary code stream and each second binary code stream to obtain multiple initial pixel cost values of the target pixel; based on a preset area cost threshold, determining several areas of the target pixel and several initial pixel cost values of each area, and, based on the several initial pixel cost values of each of the several areas, determining the minimum area cost value of each area and the area best-match point position of each minimum area cost value; based on multiple preset path directions of the minimum area cost value of each area, determining the path cost value of each area in the corresponding preset path directions, and aggregating the path cost values of the same area over the multiple preset path directions to obtain the energy function value of each area; based on the energy function values corresponding to all areas and the area best-match point position corresponding to each of the areas, determining the disparity area corresponding to the minimum energy function value among all energy function values, and obtaining the depth information of the target pixel based on the area best-match point position of the area where the minimum energy function value lies and the disparity area.
- The method for acquiring depth information with the low-power stereo matching system according to claim 6, characterized in that determining, based on the target pixel and all the disparity pixels, the first binary code stream of the target pixel and the second binary code stream of each disparity pixel, and performing the Hamming distance calculation between the first binary code stream and each second binary code stream to obtain the multiple initial pixel cost values of the target pixel specifically comprises: obtaining a preset disparity threshold and determining one target pixel in the target image; determining, in the image region of the target image, all neighbourhood pixels of that target pixel, and searching, in the image region of the non-target image, for all disparity pixels within a preset-disparity-threshold region from that same target pixel; determining, based on all the disparity pixels, all neighbourhood pixels of each disparity pixel; comparing the grey values of the target pixel with all of its own neighbourhood pixels and outputting the first binary code stream of the target pixel; taking each disparity pixel as a reference pixel, comparing the grey values of each reference pixel with all neighbourhood pixels of that reference pixel, and outputting the multiple second binary code streams of the target pixel; and calculating the Hamming distance between the first binary code stream and each second binary code stream to determine the multiple initial pixel cost values of the target pixel.
- The method for acquiring depth information with the low-power stereo matching system according to claim 7, characterized in that comparing the grey values of the target pixel with all neighbourhood pixels corresponding to that target pixel and outputting the first binary code stream of the target pixel specifically comprises: when the grey value of a neighbourhood pixel is less than or equal to the grey value of the target pixel, the comparison result is 0 and a 0 is output; when the grey value of a neighbourhood pixel is greater than the grey value of the target pixel, the comparison result is 1 and a 1 is output; and outputting all comparison results bit by bit to obtain the first binary code stream of the target pixel.
- The method for acquiring depth information with the low-power stereo matching system according to claim 7 or 8, characterized in that calculating the Hamming distance between the first binary code stream and each second binary code stream to determine the multiple initial pixel cost values of the target pixel specifically comprises: performing an XOR operation between the first binary code stream and each second binary code stream in parallel to obtain multiple XOR results; and counting, for each XOR result, the number of bits that are not 1 to obtain the multiple initial pixel cost values of the target pixel.
- The method for acquiring depth information with the low-power stereo matching system according to claim 6, characterized in that determining the several areas of the target pixel and the several initial pixel cost values of each area, and determining, based on the several initial pixel cost values of each of the several areas, the minimum area cost value of each area and the area best-match point position of each minimum area cost value specifically comprises: obtaining a preset area cost threshold; dividing the multiple initial pixel cost values of the target pixel based on the preset area cost threshold to determine the several areas of the target pixel and the multiple initial pixel cost values of the corresponding areas; selecting, for each of the several areas, the smallest of its multiple initial pixel cost values as the minimum area cost value of the corresponding area; and obtaining, based on the minimum area cost value of the corresponding area, the area best-match point position of that minimum area cost value in the target image.
- The method for acquiring depth information with the low-power stereo matching system according to claim 6, characterized in that determining, based on the multiple preset path directions of the minimum area cost value of each area, the path cost value of each area in the corresponding preset path directions, and aggregating the path cost values of the same area over the multiple preset path directions to obtain the energy function value of each area specifically comprises: selecting multiple preset path directions, wherein the preset path directions include 0°, 45°, 90° and 135°; for each of the multiple areas, calculating the path costs of the minimum area cost value of that area along the multiple preset path directions to obtain the minimum path cost value of that area in the corresponding preset path directions; and aggregating the multiple minimum path cost values of that area to obtain the energy function value of that area.
- The method for acquiring depth information with the low-power stereo matching system according to claim 6, characterized in that determining, based on the energy function values corresponding to all areas and the area best-match point position corresponding to each of the areas, the disparity area corresponding to the minimum energy function value among all energy function values, and obtaining the depth information of the target pixel based on the area best-match point position of the area where the minimum energy function value lies and the disparity area specifically comprises: obtaining the energy function values corresponding to all areas; selecting the minimum among all energy function values and applying a simulated multiplication operation to that minimum energy function value to determine the optimal disparity area corresponding to it; receiving the area best-match point position corresponding to the optimal disparity area; and obtaining the depth information of the target pixel based on the area best-match point position of the target pixel and the optimal disparity area.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010762904.3 | 2020-07-31 | ||
CN202010762904.3A CN112070821B (zh) | 2020-07-31 | 2020-07-31 | 一种低功耗立体匹配系统及获取深度信息的方法 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022021912A1 true WO2022021912A1 (zh) | 2022-02-03 |
Family
ID=73657313
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/083603 WO2022021912A1 (zh) | 2020-07-31 | 2021-03-29 | 一种低功耗立体匹配系统及获取深度信息的方法 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN112070821B (zh) |
WO (1) | WO2022021912A1 (zh) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112070821B (zh) * | 2020-07-31 | 2023-07-25 | 南方科技大学 | 一种低功耗立体匹配系统及获取深度信息的方法 |
CN113329219B (zh) * | 2021-05-07 | 2022-06-14 | 华南理工大学 | 多输出参数可动态配置深度相机 |
CN113436057B (zh) * | 2021-08-27 | 2021-11-19 | 绍兴埃瓦科技有限公司 | 数据处理方法及双目立体匹配方法 |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104427324A (zh) * | 2013-09-02 | 2015-03-18 | 联咏科技股份有限公司 | 视差计算方法及其立体匹配装置 |
US10818025B2 (en) * | 2017-01-26 | 2020-10-27 | Samsung Electronics Co., Ltd. | Stereo matching method and apparatus |
CN107220997B (zh) * | 2017-05-22 | 2020-12-25 | 成都通甲优博科技有限责任公司 | 一种立体匹配方法及系统 |
CN109743562B (zh) * | 2019-01-10 | 2020-12-25 | 中国科学技术大学 | 基于Census算法的匹配代价计算电路结构及其工作方法 |
- 2020-07-31 CN CN202010762904.3A patent/CN112070821B/zh active Active
- 2021-03-29 WO PCT/CN2021/083603 patent/WO2022021912A1/zh active Application Filing
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10554947B1 (en) * | 2015-12-16 | 2020-02-04 | Marvell International Ltd. | Method and apparatus for stereo vision matching including disparity refinement based on matching merit values |
CN108460792A (zh) * | 2016-12-12 | 2018-08-28 | 南京理工大学 | 一种基于图像分割的高效聚焦立体匹配方法 |
CN109255811A (zh) * | 2018-07-18 | 2019-01-22 | 南京航空航天大学 | 一种基于可信度图视差优化的立体匹配方法 |
CN110310320A (zh) * | 2019-07-09 | 2019-10-08 | 南京美基森信息技术有限公司 | 一种双目视觉匹配代价聚合优化方法 |
CN110473217A (zh) * | 2019-07-25 | 2019-11-19 | 沈阳工业大学 | 一种基于Census变换的双目立体匹配方法 |
CN112070821A (zh) * | 2020-07-31 | 2020-12-11 | 南方科技大学 | 一种低功耗立体匹配系统及获取深度信息的方法 |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114723967A (zh) * | 2022-03-10 | 2022-07-08 | 北京的卢深视科技有限公司 | 视差图优化方法、人脸识别方法、装置、设备及存储介质 |
CN114723967B (zh) * | 2022-03-10 | 2023-01-31 | 合肥的卢深视科技有限公司 | 视差图优化方法、人脸识别方法、装置、设备及存储介质 |
CN114677261A (zh) * | 2022-05-27 | 2022-06-28 | 绍兴埃瓦科技有限公司 | 一种视差处理电路和视差处理系统及其方法 |
CN114677261B (zh) * | 2022-05-27 | 2022-08-26 | 绍兴埃瓦科技有限公司 | 一种视差处理电路和视差处理系统及其方法 |
CN115100153A (zh) * | 2022-06-29 | 2022-09-23 | 武汉工程大学 | 基于双目匹配的管内检测方法、装置、电子设备及介质 |
CN116129037A (zh) * | 2022-12-13 | 2023-05-16 | 珠海视熙科技有限公司 | 视触觉传感器及其三维重建方法、系统、设备及存储介质 |
CN116129037B (zh) * | 2022-12-13 | 2023-10-31 | 珠海视熙科技有限公司 | 视触觉传感器及其三维重建方法、系统、设备及存储介质 |
CN116228601A (zh) * | 2023-05-08 | 2023-06-06 | 山东中都机器有限公司 | 一种火车双向平煤的平煤效果视觉监控方法 |
CN116228601B (zh) * | 2023-05-08 | 2023-07-21 | 山东中都机器有限公司 | 一种火车双向平煤的平煤效果视觉监控方法 |
CN116958134A (zh) * | 2023-09-19 | 2023-10-27 | 青岛伟东包装有限公司 | 基于图像处理的塑料膜挤出质量评估方法 |
CN116958134B (zh) * | 2023-09-19 | 2023-12-19 | 青岛伟东包装有限公司 | 基于图像处理的塑料膜挤出质量评估方法 |
Also Published As
Publication number | Publication date |
---|---|
CN112070821A (zh) | 2020-12-11 |
CN112070821B (zh) | 2023-07-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2022021912A1 (zh) | 一种低功耗立体匹配系统及获取深度信息的方法 | |
US11954879B2 (en) | Methods, systems and apparatus to optimize pipeline execution | |
CN109800692B (zh) | 一种基于预训练卷积神经网络的视觉slam回环检测方法 | |
EP3872764A1 (en) | Method and apparatus for constructing map | |
CN101625768A (zh) | 一种基于立体视觉的三维人脸重建方法 | |
WO2021051526A1 (zh) | 多视图3d人体姿态估计方法及相关装置 | |
CN110243390B (zh) | 位姿的确定方法、装置及里程计 | |
CN110517309A (zh) | 一种基于卷积神经网络的单目深度信息获取方法 | |
US20190164296A1 (en) | Systems and methods for determining a confidence measure for a motion vector | |
CN104240217B (zh) | 双目摄像头图像深度信息获取方法及装置 | |
CN112465704B (zh) | 一种全局-局部自适应优化的全景光场拼接方法 | |
CN106952304A (zh) | 一种利用视频序列帧间相关性的深度图像计算方法 | |
US20090315976A1 (en) | Message propagation- based stereo image matching system | |
CN111553296B (zh) | 一种基于fpga实现的二值神经网络立体视觉匹配方法 | |
CN107220932B (zh) | 基于词袋模型的全景图像拼接方法 | |
CN110428461B (zh) | 结合深度学习的单目slam方法及装置 | |
Niu et al. | Boundary-aware RGBD salient object detection with cross-modal feature sampling | |
CN115695763A (zh) | 一种三维扫描系统 | |
CN214587004U (zh) | 一种立体匹配加速电路、图像处理器及三维成像电子设备 | |
CN110782480A (zh) | 一种基于在线模板预测的红外行人跟踪方法 | |
CN112399162A (zh) | 一种白平衡校正方法、装置、设备和存储介质 | |
Zhou et al. | Effective dual-feature fusion network for transmission line detection | |
CN114071015A (zh) | 一种联动抓拍路径的确定方法、装置、介质及设备 | |
Ding et al. | Improved real-time correlation-based FPGA stereo vision system | |
CN116704432A (zh) | 基于分布不确定性的多模态特征迁移人群计数方法及装置 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 21851562; Country of ref document: EP; Kind code of ref document: A1 |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 21851562; Country of ref document: EP; Kind code of ref document: A1 |
| 32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 31.08.2023) |