CN103400390A - Hardware acceleration structure adopting variable supporting area stereo matching algorithm - Google Patents

Publication number
CN103400390A
Authority
CN
China
Legal status
Granted
Application number
CN201310349180XA
Other languages
Chinese (zh)
Other versions
CN103400390B (en)
Inventor
单羿
郝宇辰
汪玉
王文强
杨华中
Current Assignee
Tsinghua University
Original Assignee
Tsinghua University
Application filed by Tsinghua University
Priority to CN201310349180.XA
Publication of CN103400390A
Application granted
Publication of CN103400390B
Legal status: Active

Abstract

The invention provides a hardware acceleration structure for a variable support region stereo matching algorithm. The variable support region algorithm produces a suitable matching region for each pixel and markedly improves matching accuracy. The key problem it raises is that cost accumulation occupies a large amount of line-buffer hardware resources. To address this, the structure accumulates first along the column direction and then along the row direction, which reduces the number of line buffers needed when the disparities are processed in parallel; the line buffers are used only to buffer the images and output them in column form, which saves hardware storage resources. In addition, the structure adopts the integral image idea, which makes the accumulation suitable for hardware implementation, and uses mixed parallel computation to reuse data in both the column and row directions during accumulation. By these means, high-accuracy real-time stereo matching of high-definition images becomes practical.

Description

Hardware acceleration structure for a variable support region stereo matching algorithm
Technical field
The invention belongs to the field of computer and electronic technology, and specifically relates to a hardware acceleration structure for a variable support region stereo matching algorithm.
Background art
Stereo matching is an important topic in image processing, and binocular stereo matching is one of its more traditional methods. Binocular stereo matching uses two cameras at offset positions to capture a pair of images, finds the correspondence between points in the two images, and uses the positional relationship of corresponding points to derive the three-dimensional information of points in space.
The main purpose of a binocular stereo matching algorithm is to find the correspondence between points in the two images. Before matching, the images are rectified so that corresponding points lie in the same row. Because the camera positions differ, the points in one image are shifted to the right as a whole relative to the other. The image that is shifted right is taken as the reference image, denoted L; the other is the candidate image, denoted R.
Stereo matching methods fall broadly into two classes: global methods and local methods. Global methods consider the two-dimensional connectivity of the image and perform matching by optimizing a global energy function. Local methods compute the matching cost of each pixel one by one, accumulate it over a limited region, and select, from all disparity channels, the disparity with the minimum accumulated cost as the match. The way global methods compute gives them high time complexity and makes them poorly suited to hardware implementation. Some researchers have implemented global methods in hardware, but the system performance remains far from real time and cannot be applied to practical scenes. By contrast, local methods require much less computation, their computation pattern is relatively fixed, and every disparity channel is computed in the same way, so exploiting hardware features such as parallel computation is an effective way to reach real-time performance. The accuracy shortcomings of local methods can also be compensated by additional algorithms, so the accuracy gap between local and global methods is gradually narrowing.
The basic idea of a local method is as follows: each point in the reference image is compared with the points within a certain range in the candidate image, and the most similar point in the candidate image is taken as the match. First, according to the application scenario, a maximum positional offset between corresponding points in the two images is agreed upon, denoted D_MAX, which limits the search range for the match. A two-dimensional coordinate system is set up on the image, with the x coordinate increasing from left to right and the y coordinate increasing from top to bottom. For a point P at (x, y) in the reference image, all points {Q_i(x - i, y)} within the corresponding matching range in the candidate image are selected (where i = 0, 1, 2, ..., D_MAX), and the degree of match between P and each Q_i is compared. The best-matching point Q_k is taken as the match of P, and k is called the disparity of P. Repeating this step for every point in the reference image yields the matching relationship for the whole image.
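The search just described can be sketched for a single image row as follows. This is a minimal illustration, not the patented hardware: the function names, the absolute-difference cost and the handling of candidates that fall outside the image are all assumptions made for the example.

```python
def best_disparity(ref_row, cand_row, x, d_max, cost):
    """Return the disparity k whose candidate Q_k = cand_row[x - k] best matches ref_row[x].

    The search covers i = 0 .. d_max, as in the text above.
    """
    best_k, best_cost = 0, None
    for i in range(d_max + 1):
        if x - i < 0:              # assumption: stop when the candidate leaves the image
            break
        c = cost(ref_row[x], cand_row[x - i])
        if best_cost is None or c < best_cost:
            best_k, best_cost = i, c
    return best_k


def abs_diff(a, b):
    """Placeholder single-pixel cost (an assumption; the patent uses Mini-Census costs)."""
    return abs(a - b)


# The reference row is the candidate row shifted right by three pixels.
cand = [10, 20, 30, 40, 50, 60, 70, 80]
ref = [0, 0, 0] + cand[:5]
print(best_disparity(ref, cand, 5, 4, abs_diff))
```

Any cost function with the same two-argument shape can be substituted for `abs_diff`.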
Because of image noise, matching based on a single point is prone to mismatches. An image block of a certain size around the point, called the support region, is therefore selected, and matching over the pixels in this region reduces the influence of noise. The variable region algorithm constructs a cross-shaped region for each pixel according to the brightness relationship of the pixels in its neighbourhood. That is, a maximum support arm length L_MAX is set, and the support cross of each pixel is described by a four-tuple {l, r, u, d} of the left, right, up and down arm lengths, each between 0 and L_MAX. After accumulation along the arm directions, the support crosses of neighbouring pixels finally form an irregular support region, which is at most a square with side length L_MAX*2+1. Compared with fixed support region algorithms, the variable region algorithm effectively removes pixels that lie at depth discontinuities, so that the pixels in the support region come from the same depth as far as possible, which improves matching accuracy.
As for how the cost is computed, the traditional method of directly taking the absolute difference may be affected by glare or radiometric differences. The Mini-Census transform converts the absolute brightness of a pixel into relative information. For a 5*5 pixel matrix, the Mini-Census transform compares the brightness of the central pixel with that of six selected surrounding pixels, recording 1 for "less than" and 0 otherwise. The comparison yields a 6-dimensional 0-1 vector, which serves as the transformed value of the central pixel, and matching is performed by computing the Hamming distance. On the one hand, switching from absolute to relative records copes effectively with uneven illumination; on the other hand, the transformed value takes local structure into account, which improves the reliability of matching.
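A minimal sketch of the transform and the Hamming-distance cost is given below. The text does not say which six positions of the 5*5 window are sampled, nor which operand of the comparison is the "smaller" one, so the `OFFSETS` table and the comparison direction here are hypothetical choices made only for illustration.

```python
# Hypothetical (dy, dx) sample positions inside the 5*5 window; the source
# does not specify which six positions are used.
OFFSETS = [(-2, 0), (-1, -1), (-1, 1), (1, -1), (1, 1), (2, 0)]


def mini_census(img, y, x):
    """Return the 6-bit Mini-Census vector of pixel (y, x) as an integer.

    Assumed convention: a bit is 1 when the neighbour is darker than the centre.
    The caller must keep (y, x) at least two pixels away from the image border.
    """
    centre = img[y][x]
    bits = 0
    for dy, dx in OFFSETS:
        bits = (bits << 1) | (1 if img[y + dy][x + dx] < centre else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two census vectors (the matching cost)."""
    return bin(a ^ b).count("1")


img = [[5 * r + c + 1 for c in range(5)] for r in range(5)]   # values 1..25
brighter = [[v + 100 for v in row] for row in img]            # uniform brightness offset
# The census vector is unchanged by the brightness offset, so the cost is 0.
print(hamming(mini_census(img, 2, 2), mini_census(brighter, 2, 2)))
```

The last two lines show the property the paragraph relies on: adding a constant to every brightness value leaves the relative comparisons, and hence the vector, unchanged.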
Once the support regions and Mini-Census transforms of all pixels in the images have been computed, for a pair of points in the left and right images, the intersection of their support regions is taken and the matching cost (Hamming distance) of the point pairs in this region is computed point by point. These costs are then accumulated over the support region and finally divided by the number of pixels in the region to give the normalized matching cost of the current point pair. As described above, for each pixel in the left image, the matching cost is computed against the pixels in the right image within the range 0 to D_MAX - 1. Once these D_MAX normalized matching costs have been obtained, the disparity corresponding to the minimum cost is chosen as the best match.
In a hardware accelerator, the most direct approach is to compute the D_MAX disparity channels in parallel. However, the variable support region stereo matching algorithm also poses new challenges for the hardware accelerator. First, because the region is variable in both dimensions, hardware computation is inconvenient, and how to reuse the results computed for neighbouring pixels is a difficulty. Second, the variability of the region in the column direction means that the commonly used row-direction sliding window, in which a new row of results replaces the oldest row in the column direction, is no longer usable. To accumulate in the column direction, a large number of line buffers must be used in the computation of each disparity channel; each has a length equal to the image width, and L_MAX*2+1 of them are needed in the column direction. Consider an application with an image size of 1920*1080, a disparity range of 256 and L_MAX of 15, in which the unit stored in the line buffers is derived from 6-bit vectors. The required storage then reaches (log6+log(L_MAX*2+1))*1920*(L_MAX*2+1)*256 = 121.90 Mbits, where the logarithms are taken to base 2 and rounded up, giving 3 + 5 = 8 bits per entry. This figure far exceeds the resources of the most advanced field programmable gate arrays (FPGAs) currently available. Optimization for hardware is therefore imperative.
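The storage estimate above can be reproduced with a few lines of arithmetic. The sketch assumes, as the stated result of 121.90 Mbits implies, that each logarithm is a base-2 logarithm rounded up to a whole number of bits; the function and parameter names are illustrative only.

```python
from math import ceil, log2


def row_first_buffer_bits(width, d_max, l_max, census_bits=6):
    """Line-buffer storage, in bits, for the row-first accumulation order.

    Assumption: each entry needs ceil(log2(census_bits)) bits for one Hamming
    distance plus ceil(log2(2 * l_max + 1)) bits of growth from accumulation.
    """
    column_len = 2 * l_max + 1
    bits_per_entry = ceil(log2(census_bits)) + ceil(log2(column_len))
    return bits_per_entry * width * column_len * d_max


bits = row_first_buffer_bits(width=1920, d_max=256, l_max=15)
print(bits, "bits =", round(bits / 1e6, 2), "Mbits")
```

With the figures from the text (width 1920, 256 disparities, L_MAX of 15) this evaluates to 121,896,960 bits, which rounds to the 121.90 Mbits quoted above.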
Summary of the invention
The present invention aims to solve at least one of the above technical problems at least to some extent, or at least to provide a useful commercial choice. To this end, the object of the invention is to propose a hardware acceleration structure for a variable support region stereo matching algorithm with fast matching speed and high matching accuracy.
The hardware acceleration structure for a variable support region stereo matching algorithm according to an embodiment of the invention comprises: a memory controller, connected to an off-chip DRAM, for reading the original left and right images from the DRAM and writing the final processed disparity image to the DRAM; a Mini-Census transform module, connected to the memory controller, for applying the Mini-Census transform to the left and right images separately and computing the Mini-Census vector of each pixel; a variable support region construction module, connected to the memory controller, for constructing support regions for the left and right images separately and computing the support arm length of each pixel in the four directions; a stereo matching module, connected to the Mini-Census transform module and the variable support region construction module, for performing stereo matching and cost accumulation; and a disparity selection module, connected to the stereo matching module, for selecting and outputting the disparity with the minimum normalized matching cost.
In an embodiment of the invention, the structure further comprises two Mini-Census vector line buffers, connected to the Mini-Census transform module, which respectively convert the transformed left and right images into columns of length L_MAX*2+1 and output them column by column to the Hamming distance computation module, where L_MAX is the maximum support arm length preset by the algorithm.
In an embodiment of the invention, in the stereo matching module, the column output by the right Mini-Census vector line buffer is delayed to produce D_MAX - 1 delayed copies, where D_MAX is the maximum positional offset in the binocular images. Each copy is matched against the current column output by the left Mini-Census vector line buffer, forming D_MAX fully independent matching and cost accumulation submodules. In each submodule, the Hamming distance between the left and right columns is first computed to obtain the initial matching cost. The integral image algorithm is then used to accumulate along the column direction, and the required part is computed from the two column-direction support arm lengths, so that the accumulation result for the corresponding column is obtained every clock cycle. Accumulation along the row direction also uses the integral image algorithm: the column result obtained in each cycle is first added to the data written to the RAM in the previous cycle and the sum is stored in the RAM; once the corresponding row-direction support arm lengths are available, the accumulated values at the two ends are read from the RAM by address offset and subtracted to obtain the required row-direction accumulation result.
In an embodiment of the invention, a hierarchical adder tree is used for the column-direction accumulation, so that the accumulation of a column of pixels completes within three clock cycles.
In the hardware acceleration structure according to an embodiment of the invention, for the row-direction accumulation, a dual-port RAM of length L_MAX*2+1 is first used as a circular buffer for the accumulated data input in each clock cycle; once the row-direction support arm lengths are available, the data at the corresponding positions are read from the RAM and subtracted, with the base address of the RAM moved continually to compute new pixels. As can be seen from the above, the invention has at least the following beneficial effects: i) compared with accumulating first along the row direction and then along the column direction, the invention greatly reduces the system's use of storage resources, improves the data path and achieves pipelined processing; ii) using the integral image algorithm lets the system process any support region in the same amount of time, which improves the reliability and predictability of the system; iii) compared with reusing data in the row direction only, the invention obtains additional parallelism at the cost of a small amount of extra resources, increases data reuse, and provides a new tuning lever in system design. Overall, the hardware acceleration structure of the invention can produce suitable matching regions and markedly improve matching accuracy, making high-accuracy real-time stereo matching of high-definition images a reality.
Additional aspects and advantages of the invention will be given in part in the following description, and in part will become apparent from the description or be learned through practice of the invention.
Brief description of the drawings
The above and/or additional aspects and advantages of the invention will become apparent and easy to understand from the following description of the embodiments in conjunction with the accompanying drawings, in which:
Fig. 1 is an overall block diagram of the hardware acceleration structure for a variable support region stereo matching algorithm according to an embodiment of the invention;
Fig. 2 is a block diagram of the hardware structure that accumulates first along the row direction and then along the column direction;
Fig. 3 is a block diagram of the hardware structure that accumulates first along the column direction and then along the row direction;
Fig. 4 is a schematic diagram of the row-direction accumulation structure;
Fig. 5 shows the application of row parallelism to the column-direction accumulation;
Fig. 6 is a hardware structure diagram of the column-direction accumulation; and
Fig. 7 is a hardware structure diagram of the row-direction accumulation.
Detailed description of the embodiments
Embodiments of the invention are described in detail below. Examples of the embodiments are shown in the drawings, in which the same or similar reference numerals denote the same or similar elements, or elements with the same or similar functions, throughout. The embodiments described below with reference to the drawings are exemplary, are intended to explain the invention, and are not to be construed as limiting it.
To help those skilled in the art better understand the invention, its principle is first briefly introduced.
To address the problems described in the background, the invention makes improvements in three respects: i) the order of cost accumulation is changed, greatly reducing the use of line buffers; ii) the integral image algorithm is incorporated, enabling data reuse between the computations of neighbouring pixels and hence pipelined processing in hardware; iii) a mixed parallel scheme is introduced, providing a lever for trading off between the storage and register resources of the hardware.
As noted above, the variability of the region in the column direction means that the traditional sliding-window filtering approach must be abandoned. To accommodate this variability, line buffers are needed to convert the image from row-wise input into column-wise input so that accumulation can be completed according to the support arm lengths. If accumulation still followed the conventional order of row direction first and column direction second, as shown in Fig. 2, such line buffers for the accumulation results would be needed in every disparity channel, and the resulting storage overhead would be hard to bear. To address this, the invention proposes an accumulation order of column direction first and row direction second, as shown in Fig. 3: the data of the left and right images first pass through a line buffer so that the column-direction accumulation can be carried out, and the row-direction accumulation is then completed within each disparity channel. Overall, the change of order moves the line buffers that were distributed across the disparity channels to a single point before the channels, reducing the number of line buffers from D_MAX to two.
When computing the column-direction and row-direction accumulations, the invention uses the integral image algorithm: the costs within the range 0 to L_MAX*2+1 are summed cumulatively in advance, and the corresponding entries are then selected according to the support arm lengths of the point under consideration and subtracted to obtain the accumulated cost of the pixels within a given segment. For row-direction accumulation, because the column-direction accumulation results arrive one after another, the structure of Fig. 4 is used: values are accumulated in place, only the most recent L_MAX*2+1 results are kept at any time, and once the support arm lengths are available the corresponding cumulative sums are subtracted. The integral image algorithm lets the data of the row-direction accumulation be reused in the computation of neighbouring pixels; pixels within L_MAX*2+1 of each other no longer need to be accumulated again from scratch, which saves internal computation and bandwidth. Moreover, under the integral image algorithm, whatever the shape of a pixel's support region, the system completes the computation in the same amount of time. On this basis the system can process any support region in a pipelined fashion.
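The one-dimensional form of this idea can be sketched as follows: cumulative sums are built once, and any segment defined by two arm lengths is then obtained with two lookups and one subtraction, regardless of how long the segment is. The function names and the sample costs are illustrative assumptions.

```python
def prefix_sums(costs):
    """Return S where S[k] = costs[0] + ... + costs[k - 1] and S[0] = 0."""
    sums = [0]
    for c in costs:
        sums.append(sums[-1] + c)
    return sums


def segment_sum(sums, centre, arm_before, arm_after):
    """Sum of costs[centre - arm_before .. centre + arm_after], inclusive.

    The work is the same for every arm length: two reads and one subtraction.
    """
    return sums[centre + arm_after + 1] - sums[centre - arm_before]


costs = [3, 1, 4, 1, 5, 9, 2]
sums = prefix_sums(costs)
print(segment_sum(sums, centre=3, arm_before=2, arm_after=1))   # 1 + 4 + 1 + 5
print(segment_sum(sums, centre=4, arm_before=1, arm_after=2))   # reuses the same sums
```

The second call shows the reuse between neighbouring pixels: a different centre and different arms need no new additions, only different lookups.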
In addition, the invention introduces a mixed parallel scheme combining disparity parallelism and row parallelism. Disparity parallelism makes full use of hardware resources by computing the disparity channels, which have no data dependence on one another, in parallel. Row parallelism increases data reuse in the column accumulation by computing several rows at the same time. Because adjacent rows share a large amount of reusable data when the column accumulation is computed, this shared data can be exploited by computing several rows simultaneously. In Fig. 5, three extra pixels are added to the original single-row structure, together with the corresponding logic that selects and subtracts according to the support arm lengths, so that the column accumulations of four rows are computed simultaneously. The row accumulation modules increase correspondingly, so that the row accumulations of four rows are completed simultaneously. The system thus goes from computing the disparity of one pixel from one row per clock cycle to computing the disparities of four pixels from four rows per clock cycle. This means that, with the total parallelism kept constant, the disparity parallelism can be reduced to one quarter of the original, saving 75% of the hardware resources in this part. Although the added column accumulation and row accumulation structures and line buffers raise the storage overhead, the register resources saved overall are still considerable.
The specific embodiments of the invention are described in detail below with reference to the drawings. Fig. 1 is an overall block diagram of the hardware acceleration structure for a variable support region stereo matching algorithm according to an embodiment of the invention. As shown in Fig. 1, the hardware acceleration structure comprises: a memory controller 100, connected to an off-chip DRAM, for reading the original left and right images from the DRAM and writing the final processed disparity image to the DRAM; a Mini-Census transform module 200, connected to the memory controller 100, for applying the Mini-Census transform to the left and right images and computing the Mini-Census vector of each pixel; a variable support region construction module 300, connected to the memory controller 100, for constructing support regions for the left and right images and computing the support arm length of each pixel in the four directions; a stereo matching module 400, connected to the Mini-Census transform module 200 and the variable support region construction module 300, for performing stereo matching and cost accumulation; and a disparity selection module 500, connected to the stereo matching module 400, for selecting and outputting the disparity with the minimum normalized matching cost.
In brief, in the binocular stereo matching hardware acceleration structure of the invention, the original images and the processed disparity image are stored in off-chip dynamic random access memory (DRAM). The original images are streamed into the stereo matching system and pass in turn through the memory controller 100, the Mini-Census transform module 200, the variable support region construction module 300, the stereo matching module 400 and the disparity selection module 500; the processed disparity image is then streamed from the stereo matching system back to the DRAM. The detailed process is as follows:
First, the left and right images input to the system each pass through one of two identical line buffers, which convert the image arriving in row order into columns of length L_MAX*2+1 for output. In the Mini-Census transform module 200 and the variable support region construction module 300, an image matrix is obtained through delays, and the Mini-Census transform and the support region construction are carried out on this matrix. During support region construction, the brightness of the pixels is used to estimate whether different pixels come from the same depth. A support cross threshold t is set; in a given direction, if the brightness difference between the pixel currently under consideration and the central pixel is less than t, the support arm length is increased by one; otherwise the distance from the pixel preceding the current one to the central pixel is taken as the support arm length in that direction. After the Mini-Census transform module 200 and the variable support region construction module 300, the Mini-Census vector and the four support arm lengths of each pixel are obtained in sequence.
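The arm-growing rule can be sketched for one direction of one row as follows. This is a software illustration under stated assumptions: the function name is hypothetical, the arm is assumed to stop at the image border, and the cap at L_MAX follows the definition of the maximum support arm length given in the background.

```python
def arm_length(row, centre, step, threshold, l_max):
    """Grow one support arm from `centre` in direction `step` (+1 or -1).

    The arm is extended while the brightness difference to the central pixel
    stays below `threshold`, up to at most `l_max` pixels.
    """
    length = 0
    while length < l_max:
        pos = centre + (length + 1) * step
        if pos < 0 or pos >= len(row):                    # assumption: stop at the border
            break
        if abs(row[pos] - row[centre]) >= threshold:      # likely a different depth
            break
        length += 1
    return length


row = [100, 12, 11, 10, 10, 13, 50, 10]
left = arm_length(row, centre=3, step=-1, threshold=5, l_max=15)
right = arm_length(row, centre=3, step=+1, threshold=5, l_max=15)
print(left, right)
```

Applying the same function along a column gives the up and down arms, which completes the four-tuple {l, r, u, d} for the pixel.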
The Mini-Census vector data of each pixel are rearranged by the Mini-Census vector line buffers into columns of length L_MAX*2+1 for output. In the stereo matching module 400, the column output by the right Mini-Census vector line buffer is delayed to produce D_MAX - 1 delayed copies, each of which is matched against the current column output by the left Mini-Census vector line buffer, forming D_MAX fully independent matching and cost accumulation modules. In these modules, the Hamming distance between the left and right columns is first computed to obtain the initial matching cost. The integral image algorithm is then used to accumulate along the column direction, and the required part is computed from the two column-direction support arm lengths, so that the accumulation result for the corresponding column is obtained every clock cycle. During row-direction accumulation, the column result obtained in each cycle is first added to the data written to the RAM in the previous cycle and the sum is stored in the RAM; once the corresponding row-direction support arm lengths are available, the accumulated values at the two ends are read from the RAM by address offset and subtracted to obtain the required row-direction accumulation result. The number of pixels in the support region is counted in the same way, only with a different bit width.
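For a single disparity channel, the column-first, row-second accumulation can be sketched in software as two passes of cumulative sums. This is only an illustration of the order of operations, not of the streaming hardware: it assumes the arm lengths passed in already describe the support region to use (for example after intersecting the left and right regions) and that they never reach outside the image.

```python
def aggregate(cost, up, down, left, right):
    """Accumulate `cost` over each pixel's support region, column pass first.

    All arguments are 2-D lists indexed [y][x]. Returns (total, count) grids,
    so that total[y][x] / count[y][x] is the normalized matching cost.
    """
    h, w = len(cost), len(cost[0])
    col_sum = [[0] * w for _ in range(h)]
    col_cnt = [[0] * w for _ in range(h)]
    # Column pass: cumulative sums down each column, one subtraction per pixel.
    for x in range(w):
        prefix = [0]
        for y in range(h):
            prefix.append(prefix[-1] + cost[y][x])
        for y in range(h):
            top, bottom = y - up[y][x], y + down[y][x]
            col_sum[y][x] = prefix[bottom + 1] - prefix[top]
            col_cnt[y][x] = bottom - top + 1
    total = [[0] * w for _ in range(h)]
    count = [[0] * w for _ in range(h)]
    # Row pass: cumulative sums of the column results along each row.
    for y in range(h):
        psum, pcnt = [0], [0]
        for x in range(w):
            psum.append(psum[-1] + col_sum[y][x])
            pcnt.append(pcnt[-1] + col_cnt[y][x])
        for x in range(w):
            first, last = x - left[y][x], x + right[y][x]
            total[y][x] = psum[last + 1] - psum[first]
            count[y][x] = pcnt[last + 1] - pcnt[first]
    return total, count
```

The pixel count is accumulated by exactly the same two passes as the cost, which mirrors the remark above that only the bit width differs.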
After the above accumulation, the accumulated cost RawCost and the pixel count PixCount of all D_MAX disparity channels are obtained in one cycle. In the disparity selection module 500, a tree-shaped comparison structure selects, from the results of the D_MAX channels, the disparity with the minimum normalized matching cost. In each pairwise comparison, division can be avoided by the following substitution:
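$$\frac{\mathrm{RawCost}_i}{\mathrm{PixCount}_i} \sim \frac{\mathrm{RawCost}_j}{\mathrm{PixCount}_j} \iff \mathrm{RawCost}_i \times \mathrm{PixCount}_j \sim \mathrm{RawCost}_j \times \mathrm{PixCount}_i$$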
In this way, the disparity value of one pixel in the left image is computed in each clock cycle. After the disparity is obtained, its accuracy is further improved by interpolation and median filtering, and it is then output to the DRAM for storage.
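The division-free comparison and the tree-shaped selection can be sketched as below. Each candidate is a hypothetical tuple (disparity, RawCost, PixCount) with PixCount greater than zero; the tie-breaking rule (keep the first of the pair) is an assumption, since the text does not specify one.

```python
def better(a, b):
    """Return the candidate with the smaller RawCost / PixCount, without dividing.

    Cross-multiplication is valid because both pixel counts are positive.
    """
    _, cost_a, cnt_a = a
    _, cost_b, cnt_b = b
    return a if cost_a * cnt_b <= cost_b * cnt_a else b


def select_disparity(candidates):
    """Pairwise (tree-shaped) reduction; returns the winning disparity."""
    level = list(candidates)
    while len(level) > 1:
        nxt = [better(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:          # an unpaired candidate moves up unchanged
            nxt.append(level[-1])
        level = nxt
    return level[0][0]


# Normalized costs are 3.0, 2.0, 1.6 and 1.8, so disparity 2 is selected.
channels = [(0, 30, 10), (1, 12, 6), (2, 40, 25), (3, 9, 5)]
print(select_disparity(channels))
```

For D_MAX candidates the reduction has about log2(D_MAX) levels, which is what makes the tree form attractive for a pipelined comparator.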
The column-direction accumulation involves adding up a column of pixels, which is difficult to complete within one clock cycle. The invention uses a hierarchical adder tree that completes the accumulation of a column of pixels within three clock cycles (structure shown in Fig. 6). First, according to the frequency requirement of the actual application, eight pixels are chosen as one group and accumulated with a simple adder tree in the first clock cycle. Then, in the second clock cycle, the partial sums of the groups are accumulated with an adder tree to obtain the totals. Once the support arm lengths are available, the corresponding totals are fetched and subtracted, and the corresponding partial sums are added, to complete the accumulation over a given segment. With row parallelism, a few more pixels can simply be added to the original hardware structure, with identical logic added to the select-and-subtract part, so that the column accumulations of several rows are computed simultaneously and output in every cycle.
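One possible reading of the three stages is sketched below in software. The split into group-local running sums, totals of the preceding groups, and a final select-and-subtract step is an interpretation of the paragraph above rather than a description of the actual circuit in Fig. 6; all function names are hypothetical.

```python
GROUP = 8   # group size named in the text


def stage1_group_prefix(column):
    """Stage 1: running sums that restart at every group boundary."""
    out = []
    for i, v in enumerate(column):
        out.append(v if i % GROUP == 0 else out[-1] + v)
    return out


def stage2_group_offsets(local):
    """Stage 2: offsets[g] is the total of all groups before group g."""
    offsets, total = [], 0
    for start in range(0, len(local), GROUP):
        offsets.append(total)
        end = min(start + GROUP, len(local))
        total += local[end - 1]         # last running sum of a group is its total
    return offsets


def stage3_segment(local, offsets, top, bottom):
    """Stage 3: sum of column[top..bottom] from one subtraction of two lookups."""
    def upto(k):                        # sum of column[0..k], or 0 when k < 0
        return 0 if k < 0 else offsets[k // GROUP] + local[k]
    return upto(bottom) - upto(top - 1)


column = list(range(1, 32))             # 31 entries, i.e. L_MAX*2+1 with L_MAX = 15
local = stage1_group_prefix(column)
offsets = stage2_group_offsets(local)
print(stage3_segment(local, offsets, top=5, bottom=20))
```

Keeping each stage shallow (at most eight operands per adder tree here) is what would allow each stage to fit in one clock cycle at the target frequency.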
Because the column accumulation data are input cycle by cycle, dozens of cycles are needed to complete the row accumulation for one pixel. As shown in Fig. 7, the invention uses a hardware structure completely different from that of the column accumulation: a dual-port RAM of length L_MAX*2+1 is first used as a circular buffer for the data input in each clock cycle. Once the row-direction support arm lengths are available, the data at the corresponding positions are read from the RAM and subtracted. The base address of the RAM must be moved continually to compute new pixels. Because only the difference between two accumulated column-direction results is needed, the word width of the RAM can be reduced to the maximum bit width of the row accumulation result, and overflow need not be a concern. To support row parallelism, several such row accumulation structures are used in the invention to accumulate the results of several rows simultaneously.
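The circular buffer and the remark about overflow can be illustrated with the following software model. It is a sketch under assumptions: the class and method names are hypothetical, the buffer stores the running sum of all earlier inputs in each slot, and one extra register holds the newest running sum. Because subtraction is done modulo the word size, the running sum may wrap around freely as long as each window sum itself fits in the word.

```python
class RowAccumulator:
    """Circular buffer of running sums with a fixed word width."""

    def __init__(self, l_max, word_bits):
        self.depth = 2 * l_max + 1
        self.mask = (1 << word_bits) - 1
        self.ram = [0] * self.depth
        self.count = 0        # number of column results received so far
        self.running = 0      # wrapped sum of all results received so far

    def push(self, column_result):
        # Slot k holds the wrapped sum of results 0 .. k-1.
        self.ram[self.count % self.depth] = self.running
        self.running = (self.running + column_result) & self.mask
        self.count += 1

    def window_sum(self, first, last):
        """Sum of results first..last; requires count - depth <= first <= last < count."""
        lo = self.ram[first % self.depth]
        hi = self.running if last + 1 == self.count else self.ram[(last + 1) % self.depth]
        return (hi - lo) & self.mask    # wrap-around cancels in the difference


acc = RowAccumulator(l_max=2, word_bits=8)
for _ in range(10):
    acc.push(40)                        # the true running sum (400) exceeds 8 bits
print(acc.window_sum(5, 9), acc.window_sum(6, 8))
```

In the example the running sum wraps past 255, yet both window sums come out right because each of them is below 256.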
Based on the hardware acceleration structure described above, the invention achieves real-time processing under a variety of resolution and disparity range configurations; the test results are shown in Table 1.
Table 1: Experimental results
Image size    Disparity range    Frames per second    MDE/s
1920*1080     256                47.6                 25242
1024*768      128                129                  13076
640*480       64                 357                  7028
352*288       64                 1121                 7279
Note: MDE/s stands for million disparity estimations per second.
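The relationship between the columns of Table 1 can be checked with a short calculation. The formula used here, image width times height times disparity range times frame rate, divided by one million, is the usual reading of the unit and is an assumption rather than something stated in the text; the frame rates in the table appear to be rounded, so only approximate agreement should be expected.

```python
def mde_per_second(width, height, disparities, fps):
    """Million disparity estimations per second under the assumed definition."""
    return width * height * disparities * fps / 1e6


# (width, height, disparity range, frames per second, MDE/s reported in Table 1)
table1 = [
    (1920, 1080, 256, 47.6, 25242),
    (1024, 768, 128, 129, 13076),
    (640, 480, 64, 357, 7028),
    (352, 288, 64, 1121, 7279),
]
for width, height, disparities, fps, reported in table1:
    computed = mde_per_second(width, height, disparities, fps)
    print(f"{width}*{height}: computed {computed:.0f}, reported {reported}")
```

Under this assumed definition, each computed value agrees with the reported figure to within about one percent.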
After the first column direction that the present invention proposes, the accumulation order of line direction has reduced the hardware spending of system significantly, makes the Stereo matching system of a full HD 1920*1080 resolution and very large disparity range still can realize on existing most of FPGA device.Experimental data shows,, if adopt the accumulation mode of first line direction rank rear direction, will use and approximately be five times in storage resources used in the present invention, and this existing FPGA device is difficult to support.In addition, the present invention has used the parallel mode of mixing, and is keeping under the constant condition of total degree of parallelism, and adjustment parallax degree of parallelism and row degree of parallelism can be realized the allotment between logical resource and storage resources.
From the above, the present invention has at least the following beneficial effects: (i) compared with the row-first, column-second accumulation order, the present invention significantly reduces the system's use of storage resources, improves the data path, and achieves pipelined processing; (ii) by using the integral image algorithm, the system processes any support region in the same amount of time, which improves the reliability and predictability of the system; (iii) compared with schemes that reuse data only in the row direction, the present invention obtains a new dimension of parallelism at the cost of a small amount of additional resources, improves data reuse, and provides a new tuning parameter in system design. Taken together, the hardware acceleration structure of the present invention can produce a suitable matching region, significantly improves matching precision, and makes high-precision real-time stereo matching of high-definition images a reality.
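The column-first, row-second accumulation and the constant per-pixel work of the integral image approach can be sketched in software as follows. This is a minimal model, not the hardware: all function names are hypothetical, the per-pixel arm lengths are supplied by the caller, and the arms are assumed to stay inside the image. Each pixel costs two reads and one subtraction per direction, regardless of its arm lengths, which is the property behind effect (ii).

```python
def prefix(values):
    """Running sums with a leading zero: out[k] = sum(values[:k])."""
    out = [0]
    for v in values:
        out.append(out[-1] + v)
    return out


def aggregate_column_then_row(cost, up, down, left, right):
    """Accumulate along columns first, then along rows.

    cost, up, down, left, right are lists of rows (same shape). The arm
    arrays give each pixel's support arm length in that direction.
    """
    h, w = len(cost), len(cost[0])
    # Step 1: column direction, using each pixel's own vertical arms.
    vert = [[0] * w for _ in range(h)]
    for x in range(w):
        p = prefix([cost[y][x] for y in range(h)])
        for y in range(h):
            vert[y][x] = p[y + down[y][x] + 1] - p[y - up[y][x]]
    # Step 2: row direction, using the anchor pixel's horizontal arms.
    agg = [[0] * w for _ in range(h)]
    for y in range(h):
        p = prefix(vert[y])
        for x in range(w):
            agg[y][x] = p[x + right[y][x] + 1] - p[x - left[y][x]]
    return agg


def aggregate_brute_force(cost, up, down, left, right):
    """Reference result: sum every pixel of the support region directly."""
    h, w = len(cost), len(cost[0])
    agg = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for qx in range(x - left[y][x], x + right[y][x] + 1):
                for qy in range(y - up[y][qx], y + down[y][qx] + 1):
                    agg[y][x] += cost[qy][qx]
    return agg
```

Comparing the two functions on a small grid is a convenient way to check that the support region implied by the accumulation order matches the intended one.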
Any process or method described in a flowchart or otherwise described herein may be understood as representing a module, segment, or portion of code that comprises one or more executable instructions for implementing the steps of a specific logical function or process. The scope of the preferred embodiments of the present invention includes other implementations in which functions may be performed out of the order shown or discussed, including substantially simultaneously or in the reverse order depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the present invention belong.
In the description of this specification, reference to the terms "an embodiment", "some embodiments", "an example", "a specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, such terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in a suitable manner in any one or more embodiments or examples.
Although embodiments of the present invention have been shown and described above, it is to be understood that these embodiments are exemplary and are not to be construed as limiting the present invention. Those of ordinary skill in the art may change, modify, replace, and vary the above embodiments within the scope of the present invention without departing from its principle and purpose.

Claims (5)

1. A hardware acceleration structure for a variable support region stereo matching algorithm, characterized by comprising:
a memory controller, connected to an off-chip DRAM, for reading the original left and right images from the DRAM and for writing the final processed disparity image to the DRAM;
a Mini-Census transform module, connected to the memory controller, for applying the Mini-Census transform to the left and right images respectively and computing the Mini-Census vector of each pixel;
a variable support region construction module, connected to the memory controller, for constructing the support region for the left and right images respectively and computing the support arm length of each pixel in four directions;
a stereo matching module, connected to the Mini-Census transform module and the variable support region construction module, for performing stereo matching and matching cost accumulation; and
a disparity selection module, connected to the stereo matching module, for selecting and outputting the disparity with the minimum normalized matching cost.
2. The hardware acceleration structure for a variable support region stereo matching algorithm according to claim 1, characterized by further comprising:
two Mini-Census vector line buffers, connected to the Mini-Census transform module, each used to convert one of the transformed left and right images into column form, with L_MAX*2+1 pixels per column, and to output it to a Hamming distance calculation module, where L_MAX is the maximum support arm length preset by the algorithm.
3. The hardware acceleration structure for a variable support region stereo matching algorithm according to claim 2, characterized in that, in the stereo matching module, the column image output by the right Mini-Census vector line buffer is delayed to produce D_MAX – 1 delayed copies, where D_MAX is the maximum positional offset in the binocular images; each is matched against the current column of the column image output by the left Mini-Census vector line buffer, forming D_MAX fully independent matching and cost accumulation submodules; in each matching and cost accumulation submodule, the Hamming distance between the left and right column images is first computed to obtain the initial matching cost; the integral image algorithm is then used to accumulate along the column direction, and the required portion is computed from the two column-direction support arm lengths, so that the accumulation result of the corresponding column is obtained in each clock cycle; during accumulation in the row direction, the integral image algorithm is still used: the column result obtained in each cycle is first added to the data written to the RAM in the previous cycle, and the result is stored in the RAM; after the corresponding row-direction support arm lengths have been obtained, the accumulation results at the two ends are read from the RAM according to the address offsets and subtracted, to obtain the required row-direction accumulation result.
4. The hardware acceleration structure for a variable support region stereo matching algorithm according to claim 3, characterized in that a hierarchical adder tree is used during the column-direction accumulation, so that the accumulation of one column of pixels is completed within three clock cycles.
5. The hardware acceleration structure for a variable support region stereo matching algorithm according to claim 3, characterized in that, during the row-direction accumulation, a dual-port RAM of length L_MAX*2+1 is first used as a circular buffer for the accumulated data input in each clock cycle; after the row-direction support arm lengths have been obtained, the data at the corresponding positions are read from the RAM and subtracted, wherein the base address of the RAM is moved continually so that new pixels can be computed.
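The matching and disparity selection steps recited in the claims can be sketched for a single scanline as below. This is a simplified software illustration under several assumptions: the function names are hypothetical, the Mini-Census vectors are taken as ready-made integers (the transform itself is not modelled), only row-direction arms are used, and disparities whose window would leave the scanline are skipped. In this simplified form the region size does not depend on the disparity, so the normalization does not change which disparity wins; it is kept only to mirror the wording of claim 1.

```python
def hamming(a, b):
    """Hamming distance between two census vectors stored as integers."""
    return bin(a ^ b).count("1")


def select_disparities(left_census, right_census, arm_left, arm_right, d_max):
    """Pick, per pixel, the disparity with the minimum normalized cost.

    left_census / right_census: census vectors of one scanline.
    arm_left / arm_right: row-direction support arm lengths per pixel.
    d_max: number of candidate disparities (0 .. d_max - 1).
    """
    result = []
    for x in range(len(left_census)):
        lo, hi = x - arm_left[x], x + arm_right[x]
        best_d, best_cost = 0, None
        for d in range(d_max):
            if lo - d < 0:
                break  # the right-image window would leave the scanline
            total = sum(
                hamming(left_census[i], right_census[i - d])
                for i in range(lo, hi + 1)
            )
            normalized = total / (hi - lo + 1)
            if best_cost is None or normalized < best_cost:
                best_d, best_cost = d, normalized
        result.append(best_d)
    return result


# Toy scanline: the left view equals the right view shifted by two pixels.
right_line = [1, 2, 4, 8, 16, 32, 3, 5]
left_line = [7, 7] + right_line[:-2]
width = len(left_line)
arms_l = [min(x, 1) for x in range(width)]
arms_r = [min(width - 1 - x, 1) for x in range(width)]
disparities = select_disparities(left_line, right_line, arms_l, arms_r, d_max=4)
```

In hardware the loop over `d` corresponds to the D_MAX independent submodules working in parallel on delayed copies of the right image; the sequential loop here is only a functional stand-in.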
CN201310349180.XA 2013-08-12 2013-08-12 The hardware acceleration structure of variable supporting zone Stereo Matching Algorithm Active CN103400390B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310349180.XA CN103400390B (en) 2013-08-12 2013-08-12 The hardware acceleration structure of variable supporting zone Stereo Matching Algorithm

Publications (2)

Publication Number Publication Date
CN103400390A true CN103400390A (en) 2013-11-20
CN103400390B CN103400390B (en) 2016-02-24

Family

ID=49564002

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310349180.XA Active CN103400390B (en) 2013-08-12 2013-08-12 The hardware acceleration structure of variable supporting zone Stereo Matching Algorithm

Country Status (1)

Country Link
CN (1) CN103400390B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100142828A1 (en) * 2008-12-10 2010-06-10 Electronics And Telecommunications Research Institute Image matching apparatus and method
CN103220545A (en) * 2013-04-28 2013-07-24 上海大学 Hardware implementation method of stereoscopic video real-time depth estimation system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
丁菁汀 等: "基于FPGA的立体视觉匹配的高性能实现", 《电子与信息学报》, vol. 33, no. 3, 31 March 2011 (2011-03-31) *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110706146A (en) * 2019-09-26 2020-01-17 北京机电工程研究所 Image processing method and device
CN113436057A (en) * 2021-08-27 2021-09-24 绍兴埃瓦科技有限公司 Data processing method and binocular stereo matching method
CN113436057B (en) * 2021-08-27 2021-11-19 绍兴埃瓦科技有限公司 Data processing method and binocular stereo matching method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant