CN116563087B

CN116563087B - Gradient calculation and caching device and resource-saving cost calculation method

Info

Publication number: CN116563087B
Application number: CN202310544020.4A
Authority: CN
Inventors: 李可; 董平成
Original assignee: Shenzhen Juyuan Shixin Technology Co ltd
Current assignee: Shenzhen Juyuan Shixin Technology Co ltd
Priority date: 2023-05-12
Filing date: 2023-05-12
Publication date: 2024-05-14
Anticipated expiration: 2043-05-12
Also published as: CN116563087A

Abstract

The invention discloses a gradient calculation and caching device and a resource-saving cost calculation method. The device comprises a plurality of line caches and a window cache; the window cache comprises a plurality of registers arranged in rows and columns, and each line cache corresponds to one row of registers. The input data of the first line cache is a left-eye gray scale image or a right-eye gray scale image, and the line cache lengths for the two gray scale images are different; the input data of each other line cache is determined based on one register of the row corresponding to the previous line cache. A columns of registers in the window cache are used for gradient calculation and form B gradient calculation modules; each gradient calculation module comprises several consecutive rows of registers, and A and B are determined based on the window size. The row of registers corresponding to each gradient calculation module contains gradient information, which is used for calculating the matching cost. By arranging a plurality of gradient calculation modules, the invention avoids caching a large amount of gradient information and solves the problem that existing hardware architectures for cost calculation consume a large amount of resources and cache space.

Description

Gradient calculation and caching device and resource-saving cost calculation method

Technical Field

The invention relates to the field of binocular stereo matching, in particular to a gradient calculation and caching device and a resource-saving cost calculation method.

Background

Binocular stereo vision is a technology that achieves three-dimensional depth perception by imitating the principle of human vision. It aims to find corresponding points in images of the same scene captured by a left camera and a right camera, and to generate a disparity map for the reference image using the principle of similar triangles. It is widely applied in fields such as unmanned aerial vehicles, autonomous driving and object detection. Currently, the mainstream binocular algorithms are global matching, semi-global matching and local matching. Among them, semi-global stereo matching balances computation and accuracy well, but its extremely high complexity still makes hardware implementation challenging. Gradient calculation and cost calculation are important steps in binocular stereo matching. Cost calculation, which measures the correlation between a pixel to be matched and its candidate pixels, directly affects the accuracy and efficiency of binocular stereo matching. Specifically, gradient calculation requires a convolution using a window and the Sobel operator, and the census transform commonly used in cost calculation also requires a window. The usual approach is to cache the input gray scale data to form the window required for gradient calculation and to pass on the gradient information once it has been calculated. The gradient information then also enters a window through line caches in preparation for cost calculation. Three lines of gray scale cache and three lines of gradient information cache are therefore needed, and the gradient information must be cached for the x direction, the y direction and the total gradient, so existing hardware architectures for cost calculation consume a large amount of resources and cache space. A hardware architecture whose fused cost carries a large amount of information while consuming few resources is urgently needed.

Accordingly, there is a need for improvement and development in the art.

Disclosure of Invention

The technical problem to be solved by the invention is that existing hardware architectures for cost calculation consume a large amount of resources and cache space.

The technical scheme adopted by the invention for solving the problems is as follows:

In a first aspect, an embodiment of the present invention provides a gradient computing and caching apparatus, where the apparatus includes a plurality of line caches and a window cache; the window cache comprises a plurality of registers arranged in rows and columns, and each row cache corresponds to one row of registers;

The input data of the first line cache is a left-eye or right-eye gray scale image, and the lengths of the line caches respectively corresponding to the left-eye gray scale image and the right-eye gray scale image are different; the input data of each line cache except the first line cache is determined based on one of the registers corresponding to the previous line cache;

A columns of registers in the window cache are used for gradient calculation, and these A columns of registers form B gradient calculation modules; each gradient calculation module comprises several consecutive rows of registers, wherein the values of A and B are determined based on a preset window size;

The row of registers corresponding to each gradient calculation module contains gradient information of the pixel points, and the gradient information is used for calculating the matching cost of the pixel points.

In one embodiment, the gradient calculation process of each gradient calculation module is independent from each other.

In one embodiment, the total number of registers is determined based on a preset disparity range, the window size, and the total number of lines of the line cache.

In a second aspect, an embodiment of the present invention further provides a resource-saving cost calculation method, where the method includes:

acquiring the gray scale images corresponding to the left eye and the right eye respectively;

inputting the left-eye and right-eye gray scale images into the gradient calculation and caching device according to any one of the above, so as to obtain the gradient information corresponding to each of the two gray scale images;

for each pair of corresponding pixel points in the two gray scale images, determining the gradient cost of the pixel point according to the gradient information, and determining the Hamming distance of the pixel point through a census transform based on the gradient information;

normalizing the gradient cost and the Hamming distance separately, and performing weighted fusion of the normalized gradient cost and the normalized Hamming distance to obtain the matching cost of the pixel point.

In one embodiment, the determining the gradient cost of the pixel point according to the gradient information includes:

acquiring the gradients of the pixel point along the x direction and the y direction in the two gray scale images according to the gradient information;

and determining the gradient cost of the pixel point as the sum of the absolute value of the gradient difference in the x direction and the absolute value of the gradient difference in the y direction.

In one embodiment, the census bit string of each pixel in the census transformation is determined based on the gradient magnitude relationship between the pixel and pixels within a preset range around the pixel.

In one embodiment, the normalization of the gradient cost is performed by linear fitting and the normalization of the hamming distance is performed by a selector.

In one embodiment, the method further comprises:

calculating disparity according to the matching cost to obtain a disparity map.

In a third aspect, an embodiment of the present invention further provides a terminal, where the terminal includes a memory and one or more processors; the memory stores one or more programs; the programs comprise instructions for performing the resource-saving cost calculation method according to any one of the above; the processors are configured to execute the programs.

In a fourth aspect, an embodiment of the present invention further provides a computer readable storage medium having a plurality of instructions stored thereon, where the instructions are adapted to be loaded and executed by a processor to implement the steps of any of the above-described resource-saving cost calculation methods.

The beneficial effects of the invention are as follows. An embodiment of the invention discloses a gradient calculation and caching device and a resource-saving cost calculation method. The device comprises a plurality of line caches and a window cache; the window cache comprises a plurality of registers arranged in rows and columns, and each line cache corresponds to one row of registers. The input data of the first line cache is a left-eye gray scale image or a right-eye gray scale image, and the line cache lengths for the two gray scale images are different; the input data of each other line cache is determined based on one register of the row corresponding to the previous line cache. A columns of registers in the window cache are used for gradient calculation and form B gradient calculation modules; each gradient calculation module comprises several consecutive rows of registers, and A and B are determined based on the window size. The row of registers corresponding to each gradient calculation module contains gradient information, which is used for calculating the matching cost. By arranging a plurality of gradient calculation modules, the invention avoids caching a large amount of gradient information and solves the problem that existing hardware architectures for cost calculation consume a large amount of resources and cache space.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments described in the present invention, and other drawings may be obtained according to the drawings without inventive effort to those skilled in the art.

Fig. 1 is a schematic diagram of a gradient computing and buffering device according to an embodiment of the present invention.

Fig. 2 is a timing diagram of gradient repeat calculation provided in an embodiment of the present invention.

Fig. 3 is a flow chart of a resource-saving cost calculation method according to an embodiment of the present invention.

Fig. 4 is a diagram of a cost calculation architecture according to an embodiment of the present invention.

Fig. 5 is a schematic diagram of e-exponent normalization by linear fitting according to an embodiment of the present invention.

Fig. 6 is a schematic diagram of main steps of binocular stereo matching according to an embodiment of the present invention.

Fig. 7 is a schematic block diagram of a terminal according to an embodiment of the present invention.

Detailed Description

The invention discloses a gradient calculation and caching device and a resource-saving cost calculation method. To make the purposes, technical solutions and effects of the invention clearer and more definite, the invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.

As used herein, the singular forms "a", "an", "said" and "the" are intended to include the plural forms as well, unless expressly stated otherwise, as understood by those skilled in the art. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.

It will be understood by those skilled in the art that all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs unless defined otherwise. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.

In view of the above-mentioned drawbacks of the prior art, the present invention provides a gradient calculation and caching device, as shown in fig. 1, where the device includes a plurality of line caches and a window cache; the window cache comprises a plurality of registers arranged in rows and columns, and each line cache corresponds to one row of registers.

In short, because both gradient calculation and cost calculation require windows, this embodiment shares the line caches and the window cache between the two steps, which saves resources and avoids caching a huge amount of gradient information. Several gradient calculation modules are provided, and each of them performs gradient calculation independently: a pixel passes from the first gradient calculation module into the second one without its gradient information being cached, so the gradient is simply recalculated where it is needed. The calculated gradients are passed onward, so the other registers in the window cache hold gradient information that can be used for calculating the matching cost between pixel points. Because a pixel in the left image is matched against pixels to its left in the right image, whereas a pixel in the right image is matched against pixels to its right in the left image, the data flow through the line caches differs, and the gradient calculation and caching devices for the left image and the right image therefore have line caches of different lengths. By arranging a plurality of gradient calculation modules, this embodiment avoids caching a large amount of gradient information and solves the problem that existing hardware architectures for cost calculation consume a large amount of resources and cache space.
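The shared line-cache and window-cache data flow can be illustrated with a small software model. This is a minimal sketch, not the circuit of fig. 1: the class name `WindowCache`, the FIFO length of `image_width - cols`, the choice of the last register of each row as the tap that feeds the next line cache, and the zero-initialized contents are all assumptions made for illustration. It also does not model the different line cache lengths used for the left and right images.

```python
from collections import deque


class WindowCache:
    """Toy model: `rows` line caches, each feeding one row of `cols` registers."""

    def __init__(self, image_width, rows, cols):
        # Assumed FIFO length: together with the `cols` registers it gives
        # exactly one image line of delay per row.
        fifo_len = image_width - cols
        self.fifos = [deque([0] * fifo_len) for _ in range(rows)]
        self.regs = [[0] * cols for _ in range(rows)]

    def clock(self, pixel):
        """Advance one clock: shift `pixel` in and return the register window."""
        # Sample every input before any register changes, as in synchronous logic.
        # Row r+1 is fed from the last register of row r (assumed tap position).
        inputs = [pixel] + [row[-1] for row in self.regs[:-1]]
        for fifo, row, value in zip(self.fifos, self.regs, inputs):
            fifo.append(value)
            row.insert(0, fifo.popleft())  # line-cache output enters the first register
            row.pop()                      # the oldest register value leaves the row
        return [list(row) for row in self.regs]


# Stream pixel indices 0..20 of a 6-pixel-wide image through a 3x3 window.
cache = WindowCache(image_width=6, rows=3, cols=3)
for index in range(21):
    window = cache.clock(index)
```

After the loop, each register row holds three horizontally adjacent pixels, and consecutive rows are exactly one image line (6 pixels) apart, which is the property both the gradient window and the cost window rely on.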

For example, as shown in fig. 1, taking the left-eye gray scale image as an example, there are five line caches in total. The input gray scale image enters the line caches, and the cached data is passed to the registers in preparation for calculating gradients and costs. The preset window size is 3×3, so the three columns of registers closest to the line caches (i.e., A=3) are used for gradient calculation; that is, the leftmost 3×5 blue window in the figure (three columns by five rows) holds the registers for gradient calculation. Any three consecutive rows of registers within this 3×5 blue window form one gradient calculation module, so there are three gradient calculation modules (i.e., B=3) that operate simultaneously. Each gradient calculation module works as follows: a Sobel convolution is performed around the center pixel of the module, so that the three center pixels each yield gradients in the X and Y directions, and the total gradient is obtained by adding the gradients in the two directions. The calculated gradients are passed onward, and the corresponding registers in the window cache hold gradient information (shown in the figure as registers with vertical stripes) that can be used for calculating the matching cost. It should be noted that the calculation processes of the gradient calculation modules are independent of one another. As shown in fig. 2, each pixel of the 3×3 window data becomes in turn the center pixel of a gradient calculation module; after one line period, pixel a22 passes from the first gradient calculation module to the second, and because gradient information is not cached, the gradient of a22 is recalculated.
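The three gradient calculation modules can be sketched in software as three overlapping 3×3 Sobel windows taken from the five-row, three-column gradient region. This is an illustrative sketch only: the function names are hypothetical, the kernels are the standard Sobel operators, and computing the total gradient as |gx| + |gy| is an assumption, since the text only says that the two directional gradients are added.

```python
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def sobel(window3x3):
    """Return (gx, gy, total) for the center pixel of a 3x3 window."""
    gx = sum(SOBEL_X[r][c] * window3x3[r][c] for r in range(3) for c in range(3))
    gy = sum(SOBEL_Y[r][c] * window3x3[r][c] for r in range(3) for c in range(3))
    return gx, gy, abs(gx) + abs(gy)  # assumed form of the total gradient


def gradient_modules(region):
    """Run one module per group of three consecutive rows of the region."""
    # A 5-row region yields 5 - 3 + 1 = 3 modules, matching B = 3 in the text.
    return [sobel(region[top:top + 3]) for top in range(len(region) - 2)]


# Five rows by three columns of gray values with a purely horizontal ramp.
region = [[0, 1, 2] for _ in range(5)]
results = gradient_modules(region)
```

Each module reads its own three rows and shares no state with the others, so a pixel that was the center of one module on one line is simply recomputed when it becomes the center of the next module one line later.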

In one implementation, the total number of registers is determined based on a preset disparity range, the window size, and the total number of lines of the line cache.

Specifically, the registers are arranged in rows and columns, and the number of registers in one row can be calculated from the disparity range and the window size used for gradient calculation and the census transform. Each row of registers corresponds to one line cache, so the number of register rows is determined by the total number of line caches. The total number of registers is obtained by multiplying the number of registers in one row by the number of register rows.

For example, assuming that the disparity range is 256 and that the window sizes for gradient calculation and the census transform are both 3×3, there are (256+3)×5 registers in total. Because 256 windows of size 3×3 are available in the window cache, the preliminary costs (the gradient cost and the Hamming distance determined by the census transform) for the left image and the right image can be calculated simultaneously. This pipelines gradient calculation and cost calculation while keeping the cost calculations for the left and right images synchronized, which improves resource utilization.
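The register count in this example can be checked with a few lines of arithmetic. The helper below is a sketch; its name and the generalization to other parameter values are assumptions, and only the 256, 3 and 5 case comes from the text.

```python
def total_registers(disparity_range, window_width, line_caches):
    """Registers per row times the number of register rows."""
    registers_per_row = disparity_range + window_width
    return registers_per_row * line_caches


count = total_registers(disparity_range=256, window_width=3, line_caches=5)
print(count)  # 1295
```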

In one implementation, the window cache includes two types of registers with different bit widths: the registers that contain gradient information are wider than the registers that do not, which saves resources. As shown in fig. 1, a solid gray register holds only the input gray scale data, which is 8 bits. A register with vertical stripes holds four parts, namely the gray scale data, the x-direction gradient, the y-direction gradient and the total gradient, for 41 bits in total.
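The source does not break the 41 bits down into fields. One breakdown that is consistent with that total, assuming standard 3×3 Sobel kernels on 8-bit data, signed two's-complement directional gradients and an unsigned total of |gx| + |gy|, is 8 + 11 + 11 + 11. The sketch below derives those widths; the helper names are hypothetical.

```python
def signed_bits(max_abs):
    """Two's-complement width that holds every integer in [-max_abs, max_abs]."""
    return max_abs.bit_length() + 1


def unsigned_bits(max_value):
    """Width that holds every integer in [0, max_value]."""
    return max_value.bit_length()


gray_bits = 8
# The positive Sobel weights sum to 4, so |gx| and |gy| are at most 4 * 255.
max_gradient = 4 * (2 ** gray_bits - 1)
gx_bits = signed_bits(max_gradient)
gy_bits = signed_bits(max_gradient)
total_bits = unsigned_bits(2 * max_gradient)  # upper bound on |gx| + |gy|
register_bits = gray_bits + gx_bits + gy_bits + total_bits
```

Under these assumptions the directional gradients need 11 bits each, the total gradient needs 11 bits, and the striped register comes to 41 bits, matching the figure quoted in the text.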

Based on the above embodiment, the present invention further provides a resource-saving cost calculation method, as shown in fig. 3, where the method includes:

Step S100, acquiring the gray scale images corresponding to the left eye and the right eye respectively;

Step S200, inputting the left-eye and right-eye gray scale images into the gradient calculation and caching device according to any one of the above, so as to obtain the gradient information corresponding to each of the two gray scale images;

Step S300, for each pair of corresponding pixel points in the two gray scale images, determining the gradient cost of the pixel point according to the gradient information, and determining the Hamming distance of the pixel point through a census transform based on the gradient information;

Step S400, normalizing the gradient cost and the Hamming distance separately, and performing weighted fusion of the normalized gradient cost and the normalized Hamming distance to obtain the matching cost of the pixel point.

Specifically, cost calculation in binocular stereo matching measures the correlation between a pixel to be matched and its candidate pixels; the larger the cost, the less similar the corresponding pixel points in the left-eye and right-eye gray scale images. In this embodiment, to improve the accuracy and reliability of the matching cost, the left-eye and right-eye gray scale images are first input into their respective gradient calculation and caching devices to obtain the gradient information of the two gray scale images. The gradient cost and the Hamming distance of corresponding pixel points in the two gray scale images are then calculated from this gradient information. The gradient cost reflects how similar the gradients of the pixel to be matched and the candidate pixel are (the gradient difference may comprise a horizontal gradient difference and a vertical gradient difference). The Hamming distance is determined by a census transform applied to the gradient information, and a gradient-based census transform can highlight regions of disparity discontinuity. The gradient cost and the Hamming distance are normalized separately and then fused, and the fusion result determines the final matching cost of the corresponding pixel points in the left-eye and right-eye gray scale images.

In one implementation, the gradient cost of the pixel point is determined from the gradient information as follows.

In short, for each pixel point p, its gradient cost at disparity d is determined with reference to the gradients in the horizontal direction and the vertical direction. The gradient cost is the sum of the absolute value of the gradient difference of pixel point p in the x direction and the absolute value of the gradient difference of pixel point p in the y direction:

$$C_{GRAD}(p, d) = \left| \nabla_x I_L(p) - \nabla_x I_R(p - d) \right| + \left| \nabla_y I_L(p) - \nabla_y I_R(p - d) \right|$$

where $\nabla_x$ and $\nabla_y$ denote the gradients of pixel p in the x and y directions, and $I_L$ and $I_R$ denote the left and right images, respectively.
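The gradient cost described above, the absolute x-gradient difference plus the absolute y-gradient difference at disparity d, can be written as a short function. This is a sketch under assumptions: the function name, the storage of gradients as nested lists of `(gx, gy)` tuples, and the explicit range check are illustrative choices, not part of the source.

```python
def gradient_cost(grad_left, grad_right, x, y, d):
    """Gradient cost of left pixel (x, y) against right pixel (x - d, y)."""
    if x - d < 0:
        raise ValueError("candidate pixel lies outside the right image")
    gx_left, gy_left = grad_left[y][x]
    gx_right, gy_right = grad_right[y][x - d]
    # Sum of absolute gradient differences in the x and y directions.
    return abs(gx_left - gx_right) + abs(gy_left - gy_right)


# One image row, three pixels, each stored as (gx, gy).
grad_left = [[(0, 0), (1, 2), (5, -3)]]
grad_right = [[(4, -1), (0, 0), (0, 0)]]
cost = gradient_cost(grad_left, grad_right, x=2, y=0, d=2)
```

Here the left pixel has gradients (5, -3) and the candidate has (4, -1), so the cost is |5 - 4| + |-3 - (-1)| = 3.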

In one implementation, the census bit string of each pixel in the census transformation is determined based on a gradient magnitude relationship between the pixel and pixels within a preset range around the pixel.

Specifically, the conventional census transform represents a pixel window with a bit string determined by the gray value magnitude relationship between the center pixel and the surrounding pixels; in other words, it operates on the gray values of the original image. This embodiment improves on the conventional census transform by comparing the gradient magnitudes of the pixels instead, which improves the matching accuracy in regions of repeated texture. That is, the census transform represents a window of pixels with a bit string determined by the gradient magnitude relationship between the center pixel and the surrounding pixels. Once the window size is fixed, each pixel has a corresponding census bit string. The census codes of the left-eye and right-eye images are calculated, and the Hamming distance between two codes is used to judge the similarity of a pixel pair.
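A gradient-based census code and its Hamming distance can be sketched as follows. The function names, the raster scan order, the exclusion of the center pixel, and the convention that a bit is 1 when the neighbor's gradient is smaller than the center's are all assumptions; the source only states that the bits follow the gradient magnitude relationship.

```python
def census_bits(grad, x, y, radius=1):
    """Census bit string of pixel (x, y), built from gradient magnitudes."""
    center = grad[y][x]
    bits = 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dx == 0 and dy == 0:
                continue  # skipping the center pixel is an assumption
            # Assumed convention: 1 when the neighbor is smaller than the center.
            bits = (bits << 1) | int(grad[y + dy][x + dx] < center)
    return bits


def hamming(a, b):
    """Number of bit positions in which two census codes differ."""
    return bin(a ^ b).count("1")


grad = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]  # total gradient per pixel
code = census_bits(grad, x=1, y=1)
distance = hamming(code, 0b10100101)
```

With this toy input, the four neighbors scanned before the center are smaller than it and the four after are larger, so the code is `0b11110000`.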

In one implementation, the improved census transform may be represented by the following formula:

$$C_I(p) = \bigotimes_{q \in N_p} \xi\big(I_{GRAD}(p), I_{GRAD}(q)\big)$$

where $C_I(p)$ is the census bit string of pixel p in image I, $\bigotimes$ denotes bitwise concatenation, $N_p$ denotes the neighborhood of p (typically a 3×3 or 5×5 window), $I_{GRAD}$ denotes the gradient of the gray scale image I, and $\xi$ outputs one bit according to the magnitude relationship of its two arguments. After the census transform, the Hamming distance is used to calculate the similarity of two pixels:

$$C(p, d) = \mathrm{Hamming}\big[C_L(p), C_R(p - d)\big]$$

In one implementation, the gradient cost and the Hamming distance are each normalized and then fused by weighting to obtain the final matching cost, where λ is the regularization coefficient. In one implementation, $\lambda_{GRAD}$ and $\lambda_{Census}$ are set to 32 and 16, respectively, which simplifies the hardware calculation while preserving accuracy.
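The fusion expression itself is not reproduced in this text. One form that is consistent with the description (exponential normalization with a regularization coefficient λ per cost term) is the robust function used in AD-Census style matching. The function $\rho$, the symbols $C_{GRAD}$ and $C_{Census}$ for the gradient cost and the census Hamming distance, and the unweighted sum are assumptions here, not taken from the source:

```latex
\rho(c, \lambda) = 1 - \exp\!\left(-\frac{c}{\lambda}\right), \qquad
C(p, d) = \rho\big(C_{Census}(p, d), \lambda_{Census}\big)
        + \rho\big(C_{GRAD}(p, d), \lambda_{GRAD}\big)
```

Under this assumed form each term lies in $[0, 1)$, so neither cost dominates the sum, and choosing powers of two such as 32 and 16 for λ turns the division $c / \lambda$ into a bit shift.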

In one implementation, the normalization of the gradient cost is performed by linear fitting and the normalization of the hamming distance is performed by a selector.

Specifically, in software the normalization for cost fusion can be calculated with the exponential function (e-exponent); however, implementing the e-exponent in hardware requires significant resources. To save resources, this embodiment analyzes the characteristics of the two cost values and, without affecting precision, normalizes them with a lookup table and with linear fitting respectively, then fuses the two normalized values to obtain the final matching cost.

In one implementation, normalizing the Hamming distance determined by the census transform through a selector specifically includes: determining, according to the preset window size, the possible integer values of the Hamming distance calculated within the window; placing the normalized value corresponding to each integer value into a lookup table; and using a selector on the lookup table to obtain the normalized Hamming distance.

For example, as shown in the upper right of fig. 4, the Hamming distance calculated in a 3×3 window can only take the 10 integer values 0 to 9. The corresponding normalized values are placed into a lookup table, and the normalized result is obtained with a simple selector.
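The lookup-table normalization can be sketched as a 10-entry table indexed by the Hamming distance. The normalization function `1 - exp(-h / lambda)` and the 8-bit fixed-point scaling are assumptions made for illustration; the table size of 10 and the value 16 for the census coefficient come from the text.

```python
import math

LAMBDA_CENSUS = 16   # coefficient given in the text
FRACTION_BITS = 8    # assumed fixed-point precision of the stored values

# Precompute one normalized value per possible Hamming distance (0 to 9).
HAMMING_LUT = [
    round((1 - math.exp(-h / LAMBDA_CENSUS)) * (1 << FRACTION_BITS))
    for h in range(10)
]


def normalize_hamming(h):
    """Model of the selector: pick the precomputed entry for distance h."""
    return HAMMING_LUT[h]
```

Because the exponential is evaluated only when the table is built, the run-time path is a single multiplexer over ten constants.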

In one implementation, normalizing the gradient cost through linear fitting specifically includes: dividing the exponent part into a plurality of intervals according to its number of bits, and generating a linear fitting point for each interval, where the intervals close to 0 are narrower, and the fitting points are chosen such that the binary data of each interval can be identified with simple NAND gates.

Specifically, the normalization of the gradient cost must be implemented using an e-exponent. The e-exponent can be implemented in hardware in a number of ways. One is to separate the integer and fractional parts of the exponent, store the e-exponent results for the fractional part in a lookup table, and then shift the lookup table data according to the integer part, but this approach is not applicable to the calculation required here. A Taylor expansion of the e-exponent is another available method, but one exponential operation needs six floating-point multipliers and has a large delay, so it cannot balance resources and precision. This embodiment therefore normalizes the gradient cost by linear fitting. The exponent part is divided into a plurality of intervals according to its number of bits, and because the function changes faster near 0, the intervals near 0 are narrower. In addition, to avoid comparators, each linear fitting point is chosen so that the binary data of its interval has a particular form, preferably one that can be identified with simple NAND gates.

For example, as shown in fig. 5, the exponent part is divided into 11 intervals according to its number of bits, and the intervals near 0 are narrower. The selected fitting points also satisfy the required property: for example, the range represented by R0 is [0,0.125], and the corresponding binary values are 0.00000-0.00010 (five-bit binary fraction), so the range can be identified with a simple NAND gate. All ranges can be expressed in this way, which saves the considerable resources that multi-bit comparators would require. The k and b of each fitted straight line are calculated in advance, and once the range has been determined, the normalized gradient cost (GRAD) is obtained by shifting and adding.
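The idea of selecting a precomputed line segment from the upper bits of the input, without magnitude comparators, can be sketched as follows. This is not the 11-interval fit of fig. 5: the power-of-two interval edges, the chord-based choice of k and b, the target function `1 - exp(-x)`, and all names are assumptions, and the slopes are kept as floating-point values rather than being quantized into shift-and-add form.

```python
import math

FRAC_BITS = 5  # five fractional bits, as in the text's example
# Assumed interval edges (powers of two), narrower near 0 where the curve bends most.
EDGES = [0.0, 0.125, 0.25, 0.5, 1.0, 2.0, 4.0, 8.0]


def line_through(x0, x1):
    """Slope k and intercept b of the chord of 1 - exp(-x) on [x0, x1]."""
    y0, y1 = 1 - math.exp(-x0), 1 - math.exp(-x1)
    k = (y1 - y0) / (x1 - x0)
    return k, y0 - k * x0


# k and b are computed once, ahead of time, for every interval.
SEGMENTS = [line_through(lo, hi) for lo, hi in zip(EDGES, EDGES[1:])]
MAX_FIXED = int(EDGES[-1] * (1 << FRAC_BITS)) - 1


def segment_index(x_fixed):
    """Pick the interval from the position of the leading 1 bit."""
    # With five fractional bits, values below 0.125 have a bit length of at most 2.
    return max(0, x_fixed.bit_length() - 2)


def normalize(x_fixed):
    """Approximate 1 - exp(-x) for a fixed-point input with FRAC_BITS fraction bits."""
    x_fixed = min(x_fixed, MAX_FIXED)  # saturate inputs beyond the last interval
    k, b = SEGMENTS[segment_index(x_fixed)]
    return k * (x_fixed / (1 << FRAC_BITS)) + b


worst_error = max(
    abs(normalize(v) - (1 - math.exp(-v / (1 << FRAC_BITS))))
    for v in range(MAX_FIXED + 1)
)
```

Because every edge is a power of two, the interval is decided by which upper bits are zero, which is the kind of decode that simple gates can perform; the accuracy then depends only on how many intervals are used and where they are placed.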

In one implementation, as shown in fig. 6, the method further includes: calculating disparity according to the matching cost to obtain a disparity map.

Specifically, binocular stereo matching mainly comprises four steps: initial cost calculation, cost aggregation, post-processing (disparity optimization) and disparity calculation. In this embodiment, post-processing and disparity calculation are performed using the fused matching cost, so that an accurate disparity map can be obtained.

The invention has the following advantages:

1. With the recalculation architecture, a large amount of storage is saved at the cost of a small amount of logic resources.

2. The nonlinear function is implemented with a piecewise linear fitting architecture based on a lookup table.

3. The designed hardware architecture is easy to implement and consumes few resources; compared with software calculation, the precision of the disparity map is greatly improved and edges are more distinct.

Based on the above embodiment, the present invention also provides a terminal, and a functional block diagram thereof may be shown in fig. 7. The terminal comprises a processor, a memory, a network interface and a display screen which are connected through a system bus. Wherein the processor of the terminal is adapted to provide computing and control capabilities. The memory of the terminal includes a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The network interface of the terminal is used for communicating with an external terminal through a network connection. The computer program, when executed by a processor, implements a resource-efficient cost calculation method. The display screen of the terminal may be a liquid crystal display screen or an electronic ink display screen.

It will be appreciated by those skilled in the art that the functional block diagram shown in fig. 7 is merely a block diagram of some of the structures associated with the present inventive arrangements and is not limiting of the terminal to which the present inventive arrangements may be applied, and that a particular terminal may include more or less components than those shown, or may combine some of the components, or have a different arrangement of components.

In one implementation, the memory of the terminal stores one or more programs; the one or more programs are configured to be executed by one or more processors and include instructions for performing the resource-saving cost calculation method.

Those skilled in the art will appreciate that implementing all or part of the above described methods may be accomplished by way of a computer program stored on a non-transitory computer readable storage medium, which when executed, may comprise the steps of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in embodiments provided herein may include non-volatile and/or volatile memory. The nonvolatile memory can include Read Only Memory (ROM), programmable ROM (PROM), electrically Programmable ROM (EPROM), electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double Data Rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), synchronous link (SYNCHLINK) DRAM (SLDRAM), memory bus (Rambus) direct RAM (RDRAM), direct memory bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM), among others.

In summary, the invention discloses a gradient calculation and caching device and a resource-saving cost calculation method, wherein the device comprises a plurality of line caches and a window cache; the window cache comprises a plurality of registers arranged in rows and columns, and each line cache corresponds to one row of registers; the input data of the first line cache is a left-eye grayscale image or a right-eye grayscale image, and the line cache lengths for the two grayscale images are different; the input data of each other line cache is determined based on one register in the row corresponding to the previous line cache; A columns of registers in the window cache are used for gradient calculation, and these A columns form B gradient calculation modules, each gradient calculation module comprising registers in a plurality of consecutive rows, where A and B are determined based on the size of the window; the rows of registers corresponding to the respective gradient calculation modules contain gradient information and are used for calculating the matching cost. By arranging a plurality of gradient calculation modules, the invention avoids caching a large amount of gradient information and solves the problem that current hardware architectures for cost calculation consume a large amount of resources and cache space.
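The data flow summarised above, in which pixels stream through chained line caches into a register window where gradients are computed on the fly instead of being stored, can be illustrated with a small software model. The following Python sketch is illustrative only and is not the hardware of the invention: the 3x3 window size, the horizontal Sobel operator, the single gradient calculation module, and the names `WIN` and `stream_sobel_x` are all assumptions introduced here, and the differing line cache lengths for the left-eye and right-eye images are not modelled.

```python
from collections import deque

WIN = 3  # assumed window size (rows and columns of registers)


def stream_sobel_x(image):
    """Stream a grayscale image pixel by pixel through WIN-1 line caches
    and a WIN x WIN register window, returning the horizontal Sobel
    gradient keyed by the (row, col) of the window centre."""
    height, width = len(image), len(image[0])
    # Each line cache delays the pixel stream by exactly one image row.
    line_caches = [deque([0] * width, maxlen=width) for _ in range(WIN - 1)]
    # window[r][0] is the newest register of row r; row 0 is the newest row.
    window = [[0] * WIN for _ in range(WIN)]
    gradients = {}
    for y in range(height):
        for x in range(width):
            # Row 0 receives the incoming pixel; row r receives the oldest
            # entry of line cache r-1, i.e. the pixel r rows above.
            taps = [image[y][x]] + [cache[0] for cache in line_caches]
            # Each line cache is fed with the value entering the previous row
            # (a full deque with maxlen drops its oldest entry on append).
            for r, cache in enumerate(line_caches):
                cache.append(taps[r])
            # Shift every register row by one column.
            for r in range(WIN):
                window[r] = [taps[r]] + window[r][:-1]
            # Emit a gradient once the window lies fully inside the image.
            if y >= WIN - 1 and x >= WIN - 1:
                gx = sum(w * (window[r][0] - window[r][WIN - 1])
                         for r, w in enumerate((1, 2, 1)))
                gradients[(y - 1, x - 1)] = gx
    return gradients
```

In this model only WIN-1 rows of raw pixels are buffered and no gradient image is ever stored, which is the resource saving the summary describes; a fuller model would instantiate several such gradient computations side by side within one wider window.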

It is to be understood that the application of the invention is not limited to the examples described above; those skilled in the art may make modifications and variations in light of the above teachings, and all such modifications and variations are intended to fall within the scope of the appended claims.

Claims

1. A gradient calculation and caching device, characterized by comprising a plurality of line caches and a window cache; the window cache comprises a plurality of registers arranged in rows and columns, and each line cache corresponds to one row of registers; the window cache comprises two types of registers having different bit widths, wherein the bit width of the registers containing gradient information is larger than that of the registers not containing gradient information;

The input data of the first line cache is a left-eye or right-eye gray scale image, and the lengths of the line caches respectively corresponding to the left-eye gray scale image and the right-eye gray scale image are different; the input data of each line cache except the first line cache is determined based on the register corresponding to the previous line cache, wherein the register does not contain gradient calculation;

2. The gradient calculation and caching device according to claim 1, wherein the gradient calculation processes of the gradient calculation modules are independent of each other.

3. The gradient calculation and caching device according to claim 1, wherein the total number of registers is determined based on a preset disparity range, the window size, and the total number of line caches.

4. A resource-saving cost calculation method, the method comprising:

Inputting the left-eye and right-eye grayscale images into the gradient calculation and caching device according to any one of claims 1-3 to obtain gradient information corresponding to the left-eye and right-eye grayscale images respectively;

5. The resource-saving cost calculation method according to claim 4, wherein determining the gradient cost of the pixel point according to the gradient information comprises:

6. The resource-saving cost calculation method according to claim 4, wherein the census bit string of each pixel point in the census transformation is determined based on a gradient magnitude relation between the pixel point and the pixel points in a surrounding preset range.

7. The resource-saving cost calculation method according to claim 4, wherein the normalization processing of the gradient cost is performed by linear fitting, and the normalization processing of the Hamming distance is performed by a selector.

8. The resource-saving cost calculation method according to claim 4, further comprising:

9. A terminal comprising a memory and one or more processors; the memory stores one or more programs; the programs comprise instructions for performing the resource-saving cost calculation method according to any one of claims 4 to 8; the processor is configured to execute the programs.

10. A computer-readable storage medium having stored thereon a plurality of instructions adapted to be loaded and executed by a processor to implement the steps of the resource-saving cost calculation method according to any one of claims 4 to 8.