CN111583092A

CN111583092A - Variational optical flow FPGA implementation method, system, storage medium and terminal

Info

Publication number: CN111583092A
Application number: CN202010234746.4A
Authority: CN
Inventors: 贾媛; 李鑫; 李娇娇; 宋彬; 王养利; 李云松
Original assignee: Xidian University
Current assignee: Xidian University
Priority date: 2020-03-30
Filing date: 2020-03-30
Publication date: 2020-08-25
Anticipated expiration: 2040-03-30
Also published as: CN111583092B

Abstract

The invention belongs to the technical field of computer vision processing and discloses a variational optical flow FPGA (field programmable gate array) implementation method, system, storage medium and terminal. Two input frames are preprocessed, including color space conversion and denoising, and two preprocessed images are output. The horizontal and vertical gradients of the current pixel of the preprocessed image are calculated from its neighborhood pixels, while the gradient in the time direction is calculated from the pixels at corresponding positions in the two frames. The parameters of the large linear equation system required by the variational optical flow calculation are computed from the outputs of the image preprocessing model and the image gradient calculation model, and the final optical flow output is solved by an iterative computation model. The invention can greatly shorten the development time of optical flow applications on FPGA hardware; the modules are mutually independent and easy to trim, extend and maintain; and the operation speed of the variational optical flow algorithm is greatly improved, so that real-time application can be achieved.

Description

Variational optical flow FPGA implementation method, system, storage medium and terminal

Technical Field

The invention belongs to the technical field of computer vision processing, and particularly relates to a variational optical flow FPGA (field programmable gate array) implementation method, system, storage medium and terminal.

Background

Computer vision is currently a major research subject concerned with how to represent information using images. As a branch of computer vision, optical flow estimation is a widely used method of representing pixel motion and is one of the most basic and widespread problems in the field. The mainstream optical flow estimation methods at present are based on variational optimization. Variational optimization describes optical flow estimation as the minimization of a target energy function, and solving that function turns the problem into the solution of a very large system of linear equations. A typical variational optical flow technique includes preprocessing, construction of the linear equation system, solution of the equations and other steps, all of which are computationally complex. Variational optical flow is commonly implemented in high-level languages such as C/C++ and Matlab; because of its extremely high computational complexity, such implementations run slowly, which limits the application scenarios. To address this, the algorithm is generally divided into several modules that run simultaneously on a general-purpose processor using multithreading, which improves processor utilization. An FPGA (field programmable gate array) can effectively run a plurality of modules in parallel and is a good choice for hardware acceleration.
However, for variational optical flow, the traditional FPGA development flow is difficult to implement, its architecture design is hard, and it cannot easily keep up with rapid iteration of the algorithm. In view of these problems, a new method is needed that conveniently increases the computation speed of the variational optical flow algorithm and allows the algorithm to be implemented quickly.

Through the above analysis, the problems and defects of the prior art are as follows:

(1) Current variational optical flow techniques have high computational complexity; implementations in high-level languages run slowly, which limits the application scenarios.

(2) The traditional FPGA development flow is difficult to implement, its architecture design is hard, and it cannot easily keep up with rapid iteration of the algorithm.

The difficulty in solving the above problems and defects is:

(1) Algorithms designed in high-level languages generally process data frame by frame, while algorithms implemented on FPGAs generally process data pixel by pixel, so a frame-based algorithm has to be converted into one that processes a pixel stream.

(2) Traditional multithreaded acceleration decomposes an algorithm into several steps that can be parallelized and then runs them simultaneously on several threads. An FPGA describes an algorithm as a data flow structure built from registers and logic circuits, which is equivalent to computing with many more threads at once, so the algorithm has to be decomposed into finer-grained calculation steps.

(3) Traditional FPGA development describes the hardware structure of an algorithm in a hardware description language. This requires experienced hardware engineers, and a large amount of time is usually spent on communication between the FPGA development engineers and the algorithm engineers.

The significance of solving the problems and the defects is as follows:

(1) Directly, implementing the variational optical flow algorithm on FPGA hardware expands its range of application in embedded scenarios.

(2) Indirectly, an FPGA implementation of the variational optical flow algorithm helps computer vision algorithms that depend on optical flow results to be deployed more quickly in embedded scenarios.

(3) In addition, implementing the optical flow algorithm on the FPGA with a model-based design method makes simulation and debugging of the algorithm easy, allows efficient hardware description code to be generated automatically, and avoids the unexpected errors that arise when code is written by hand. The model-based design method can also serve as a useful reference for FPGA implementations of other computer vision algorithms.

Disclosure of Invention

Aiming at the problems in the prior art, the invention provides a variational optical flow FPGA implementation method, system, storage medium and terminal.

The invention is realized as a variational optical flow FPGA implementation method comprising the following steps:

in the first step, two frames of input images are preprocessed, including image color space conversion and image denoising, and two preprocessed images are output;

in the second step, the horizontal gradient and the vertical gradient of the current pixel of the preprocessed image are calculated from its neighborhood pixels; meanwhile, the gradient in the time direction is calculated from the pixels at corresponding positions in the two frames of images;

in the third step, the parameters of the large linear equation system required by the variational optical flow calculation are calculated from the outputs of the image preprocessing model and the image gradient calculation model;

in the fourth step, the final optical flow output is solved by an iterative computation model.

Further, the first step includes: when the input is a color image, it is converted into a gray image by a color space conversion module, and if the input image is already a gray image, the original input is kept; an image line cache module extracts the window column vectors, a delay cache unit then buffers the column vectors of several beats (clock cycles) and outputs the sliding window corresponding to the current pixel; once the sliding window is obtained, a matrix multiplication unit multiplies the window pixels element by element with the filter template pixels, and an accumulation unit sums all elements of the products, which gives the final output for the pixel corresponding to the window.

Further, the second step includes: the image gradient calculation produces the horizontal x-direction gradient, the vertical y-direction gradient and the time t-direction gradient of the two frames of images; the gradient in the x direction is calculated by using a delay unit to subtract the gray values of the left and right adjacent pixels; the gradient in the y direction is calculated by subtracting the upper and lower adjacent pixels extracted by a line cache unit.

Further, the third step includes: parameter calculation computes the parameters of the large linear equation system that must be constructed for the variational optical flow calculation, during which the value of the Ψ function needs to be calculated. With I_x denoting the gradient of the image in the x direction, I_y the gradient in the y direction, I_t the gradient in the time direction, u the initialized x-direction optical flow, v the initialized y-direction optical flow, together with a constant, the Ψ function is expressed as:

The linear equations finally required by the variational optical flow have five groups of coefficients, A₁₁, A₁₂, A₂₂, B₁ and B₂, which are expressed as:

The Ψ function is implemented as an independent unit in the parameter calculation module; the other parameters are connected together by signal lines and output after calculation. The required inputs are I_x, I_y, I_t, u and v.

Further, the fourth step includes: the optical flow solution requires the value of a function Φ, whose calculation uses the gradients of the initial optical flow. Denoting the gradients of the current initial optical flow (u, v) in the x and y directions as u_x, u_y, v_x and v_y, the Φ function is expressed as:

Further, the optical flow solving solves the constructed large linear equation system, using A₁₁, A₁₂, A₂₂, B₁, B₂ and Φ calculated by the parameter calculation module as parameters of the equations; the solving algorithm adopted is the red-black SOR method, and, based on the principle of red-black SOR grouping, the SOR calculation model is designed so that it can be configured as an odd-group calculation model or an even-group calculation model; when a model is selected to take part in the odd-group operation, it judges, as the pixel stream passes through the module, whether each pixel belongs to the group; if the current pixel belongs to the group, the current iteration value is updated and output, otherwise the input value is kept unchanged and output after the corresponding delay; the odd-group calculation model and the even-group calculation model are connected end to end to form a complete single iteration model.

Further, the variational optical flow FPGA implementation method builds a multi-iteration model from the single iteration model in one of two forms:

(1) a single iteration model and a storage controller are used; the output of a single iteration is stored directly in memory by the storage controller, read back from memory in the next cycle and passed through the calculation module for one more iteration, so that the iterative solution of the equations is carried out under the control of the storage controller;

(2) a plurality of single iteration models are cascaded to perform multiple iterations, the output of one iteration being connected directly to the input of the next iteration module.

It is another object of the present invention to provide a program storage medium for receiving user input, the stored computer program causing an electronic device to perform the steps comprising:

Another object of the present invention is to provide a variational optical flow FPGA implementation system for carrying out the variational optical flow FPGA implementation method, including:

the image preprocessing module is used for completing the preprocessing function of two frames of input images, comprises an image color space conversion module and an image denoising module, and outputs two preprocessed images;

the image gradient calculation module comprises an image window extraction module and a gradient calculation module and is used for calculating the horizontal gradient and the vertical gradient of the current pixel through the image neighborhood pixels; meanwhile, calculating the gradient in the time direction through pixels at corresponding positions in the two frames of images;

the parameter calculation module is used for calculating parameters of a large linear equation set required by the variational optical flow calculation according to the output of the image preprocessing model and the image gradient calculation model;

and the optical flow calculating module is used for solving the final optical flow output through an iterative computation model.

Another object of the invention is to provide a computer vision terminal equipped with the variational optical flow FPGA implementation system.

Taken together, the above technical solutions give the invention the following advantages and positive effects. The invention adopts a model-based FPGA design method and produces the hardware description code for the target FPGA platform by code generation, which can greatly shorten the development time of optical flow applications on FPGA hardware. The invention adopts a modular design in which every module uses a uniform data interface; the modules are mutually independent and easy to trim, extend and maintain. The model of the invention uses a pipelined structure, which greatly improves the operation speed of the variational optical flow algorithm and makes real-time application achievable.

Drawings

In order to illustrate the technical solutions of the embodiments of the present application more clearly, the drawings used in the embodiments are briefly described below. The drawings described below show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.

Fig. 1 is a flowchart of a variational optical flow FPGA implementation method according to an embodiment of the present invention.

Fig. 2 is a schematic structural diagram of a variational optical flow FPGA implementation system provided in an embodiment of the present invention.

In the figure: 1. image preprocessing module; 2. image gradient calculation module; 3. parameter calculation module; 4. optical flow calculation module.

Fig. 3 is a flowchart of an implementation method of a variational optical flow FPGA according to an embodiment of the present invention.

Fig. 4 is a schematic design diagram of a denoising module according to an embodiment of the present invention.

Fig. 5 is a schematic design diagram of an x-direction gradient calculation module and a y-direction gradient calculation module according to an embodiment of the present invention.

Fig. 6 is a schematic design diagram of a t-direction gradient calculation module according to an embodiment of the present invention.

Fig. 7 is a schematic design diagram of the Ψ function according to an embodiment of the present invention.

Fig. 8 is a schematic design diagram of a large linear equation set parameter calculation module according to an embodiment of the present invention.

Fig. 9 is a schematic design diagram of a Φ function according to an embodiment of the present invention.

Fig. 10 is a schematic design diagram of a Φ parameter calculation module of an entire image according to an embodiment of the present invention.

FIG. 11 is a schematic design diagram of a single iteration model in an optical flow calculation module according to an embodiment of the present invention.

FIG. 12 is a schematic diagram of a model according to implementation (1) of multiple iterations in an optical flow calculation module according to an embodiment of the present invention.

FIG. 13 is a schematic diagram of a model according to implementation (2) of multiple iterations in the optical flow calculation module according to an embodiment of the present invention.

Fig. 14 is simulation result example 1 of the algorithm model implemented in a specific implementation according to an embodiment of the present invention.

Fig. 15 is simulation result example 2 of the algorithm model implemented in a specific implementation according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

Aiming at the problems in the prior art, the invention provides a variational optical flow FPGA implementation method, system, storage medium and terminal, which are described in detail below with reference to the attached drawings.

As shown in fig. 1, the variational optical flow FPGA implementation method provided by the present invention includes the following steps:

S101: preprocessing two frames of input images, including image color space conversion and image denoising, and outputting two preprocessed images;

S102: calculating the horizontal gradient and the vertical gradient of the current pixel of the preprocessed image from its neighborhood pixels; meanwhile, calculating the gradient in the time direction from the pixels at corresponding positions in the two frames of images;

S103: calculating the parameters of the large linear equation system required by the variational optical flow calculation from the outputs of the image preprocessing model and the image gradient calculation model;

S104: solving the final optical flow output by an iterative computation model.

As shown in fig. 2, the variational optical flow FPGA implementation system provided by the present invention includes:

The image preprocessing module 1, which preprocesses the two frames of input images and comprises an image color space conversion module and an image denoising module; its output is the two preprocessed images.

The image gradient calculation module 2 comprises an image window extraction module and a gradient calculation module, and is used for calculating the horizontal gradient and the vertical gradient of the current pixel through the image neighborhood pixels. Meanwhile, the gradient in the time direction is calculated through the pixels at the corresponding positions in the two frames of images.

The parameter calculation module 3, which calculates the parameters of the large linear equation system required by the variational optical flow calculation from the outputs of the image preprocessing model and the image gradient calculation model.

And the optical flow calculating module 4 is used for solving the final optical flow output through an iterative computation model.

The technical solution of the present invention is further described below with reference to the accompanying drawings.

As shown in fig. 3, over the whole optical flow calculation process the input image data pass in sequence through the image preprocessing module, the image gradient calculation module, the parameter calculation module and the optical flow solving module, and the final optical flow result is then output. In the specific implementation of the invention, the model and the system are built on the Simulink platform.

First consider how data are passed between modules. In the embodiment of the invention, the model input and output interfaces are implemented with a pixel stream interface (Streaming Pixel Interface). The pixel stream interface transmits serial pixel stream data and comprises a pixel data bus and a pixel control bus. The pixel data bus is an abstract bus signal whose format is not fixed in the Simulink model; it can be a floating-point number, a fixed-point number, an integer, a vector and so on. The control bus comprises 5 Boolean signals that indicate the validity of each pixel and its relative position in the frame. Through the pixel stream interface, the data and control outputs of one module can easily be connected to the inputs of another module.
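The pixel stream interface can be illustrated with a minimal Python sketch. The patent states only that the control bus carries 5 Boolean signals; the field names below (h_start, h_end, v_start, v_end, valid) are an assumption modelled on the pixel control bus convention of the MathWorks Vision HDL Toolbox, and the helper functions frame_to_stream and stream_to_frame are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Tuple


@dataclass(frozen=True)
class PixelControl:
    """Five Boolean flags that travel with every pixel (names are assumed)."""
    h_start: bool  # first pixel of a line
    h_end: bool    # last pixel of a line
    v_start: bool  # first pixel of the frame
    v_end: bool    # last pixel of the frame
    valid: bool    # the pixel carries real image data


def frame_to_stream(frame: List[List[int]]) -> Iterator[Tuple[int, PixelControl]]:
    """Serialise a frame row by row into (pixel, control) pairs."""
    rows, cols = len(frame), len(frame[0])
    for r, row in enumerate(frame):
        for c, pixel in enumerate(row):
            yield pixel, PixelControl(
                h_start=(c == 0),
                h_end=(c == cols - 1),
                v_start=(r == 0 and c == 0),
                v_end=(r == rows - 1 and c == cols - 1),
                valid=True,
            )


def stream_to_frame(stream: Iterable[Tuple[int, PixelControl]]) -> List[List[int]]:
    """Rebuild the frame from the control flags alone, without a known size."""
    frame: List[List[int]] = []
    line: List[int] = []
    for pixel, ctrl in stream:
        if not ctrl.valid:
            continue          # ignore blanking intervals
        if ctrl.h_start:
            line = []         # a new line begins
        line.append(pixel)
        if ctrl.h_end:
            frame.append(line)
    return frame
```

Because every pixel carries its own position flags, a downstream module in this sketch needs no knowledge of the frame size, which is what lets modules be chained freely.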

The preprocessing module typically includes image color space conversion and filtering. When the input is a color image, it is first converted into a gray image by a color space conversion module; if the input image is already a gray image, the original input is kept. The image after color space conversion is passed to a filtering module, which is usually implemented as a sliding window filter. As shown in fig. 4, in the embodiment of the present invention an image line cache module extracts the window column vectors, and a delay cache unit then buffers the column vectors of several beats (clock cycles) and outputs the sliding window corresponding to the current pixel. Once the sliding window is obtained, a matrix multiplication unit multiplies the window pixels element by element with the filter template pixels, and an accumulation unit sums all elements of the products, which gives the final output for the pixel corresponding to the window.
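The line cache, delay cache and multiply-accumulate structure described above can be sketched in software as follows. This is an illustrative model under stated assumptions, not the patent's Simulink design: the window is fixed at 3x3, border pixels are simply not output, the kernel weights are integers with no normalisation, and the name stream_filter_3x3 is hypothetical.

```python
from collections import deque
from typing import List


def stream_filter_3x3(frame: List[List[int]], kernel: List[List[int]]) -> List[List[int]]:
    """Filter a frame pixel by pixel, the way a streaming design would.

    Two line buffers supply the pixels above the current one (the column
    vector); a three-entry delay chain of column vectors forms the window.
    """
    rows, cols = len(frame), len(frame[0])
    line1 = [0] * cols          # line buffer holding row r-1
    line2 = [0] * cols          # line buffer holding row r-2
    window = deque(maxlen=3)    # last three column vectors (delay cache)
    out = [[0] * (cols - 2) for _ in range(rows - 2)]

    for r in range(rows):
        window.clear()          # the window must not straddle two lines
        for c in range(cols):
            pixel = frame[r][c]
            column = (line2[c], line1[c], pixel)   # top, middle, bottom
            line2[c], line1[c] = line1[c], pixel   # shift the line buffers
            window.append(column)
            if r >= 2 and len(window) == 3:
                # Element-wise multiply with the template, then accumulate.
                acc = 0
                for j, col in enumerate(window):
                    for i, value in enumerate(col):
                        acc += value * kernel[i][j]
                # The window is centred on (r-1, c-1).
                out[r - 2][c - 2] = acc
    return out
```

In this sketch the result for a pixel becomes available one line and one pixel after that pixel entered the module, which mirrors the latency a line-buffered hardware filter introduces.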

The image gradient calculation module produces the horizontal x-direction gradient, the vertical y-direction gradient and the time t-direction gradient of the two frames of images. As shown in fig. 5, the gradient in the x direction in the embodiment of the present invention is calculated by using a delay unit to subtract the gray values of the left and right adjacent pixels; the gradient in the y direction is calculated by subtracting the upper and lower adjacent pixels extracted by a line cache unit. As shown in fig. 6, the gradient in the t direction is implemented as the subtraction of corresponding pixels of the two frames of images.
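A minimal Python sketch of the three gradients follows. Several details are assumptions because the text does not fix them: the differences are left unscaled (no factor 1/2), border pixels are left at zero, the spatial gradients are taken on the second frame, and the temporal gradient is second frame minus first frame. The function name gradients is hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]


def gradients(prev: Image, curr: Image) -> Tuple[Image, Image, Image]:
    """Return (I_x, I_y, I_t) computed in streaming order."""
    rows, cols = len(curr), len(curr[0])
    ix = [[0] * cols for _ in range(rows)]
    iy = [[0] * cols for _ in range(rows)]
    it = [[0] * cols for _ in range(rows)]
    line1 = [0] * cols   # line buffer: row r-1
    line2 = [0] * cols   # line buffer: row r-2

    for r in range(rows):
        d1 = d2 = 0      # two-stage pixel delay: columns c-1 and c-2
        for c in range(cols):
            p = curr[r][c]
            if c >= 2:
                # right neighbour minus left neighbour of pixel (r, c-1)
                ix[r][c - 1] = p - d2
            d2, d1 = d1, p
            if r >= 2:
                # lower neighbour minus upper neighbour of pixel (r-1, c)
                iy[r - 1][c] = p - line2[c]
            line2[c], line1[c] = line1[c], p
            # temporal gradient: same position in the two frames
            it[r][c] = p - prev[r][c]
    return ix, iy, it
```

The x gradient needs only two registers, whereas the y gradient needs two full lines of storage; this difference in buffering is why the text assigns them to a delay unit and a line cache unit respectively.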

The parameter calculation module calculates the parameters of the large linear equation system that must be constructed for the variational optical flow calculation. In the embodiment of the invention, the value of the Ψ function needs to be calculated during the variational optical flow calculation. With I_x denoting the gradient of the image in the x direction, I_y the gradient in the y direction, I_t the gradient in the time direction, u the initialized x-direction optical flow, v the initialized y-direction optical flow, together with a constant, the Ψ function is expressed as:

Based on equation (1), the Ψ function calculation unit implemented by the embodiment of the present invention is shown in fig. 7. The linear equations finally required by the variational optical flow have five groups of coefficients, A₁₁, A₁₂, A₂₂, B₁ and B₂, which are expressed as:

The linear system building block implemented according to equation (2) is shown in fig. 8. The Ψ function is implemented as an independent unit in the parameter calculation module; the other parameters are connected together by signal lines and output after calculation. The required inputs are I_x, I_y, I_t, u and v. To keep the data aligned, delay registers are inserted for delay balancing. The control signal is likewise output after being delayed by the corresponding number of beats.
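Delay balancing can be illustrated with a small Python sketch. The patent's own formulas are not reproduced here; instead the example uses a hypothetical pair of paths, a 3-cycle multiplier computing x*x and a direct path carrying x, to show why the faster path needs matching delay registers before the two are combined. The class Delay, the function run and the latency of 3 cycles are all assumptions for illustration.

```python
from collections import deque
from typing import List


class Delay:
    """N-stage register chain: the output is the input from N cycles earlier."""

    def __init__(self, stages: int, initial: int = 0) -> None:
        self._stages = stages
        self._regs = deque([initial] * stages, maxlen=max(stages, 1))

    def step(self, value: int) -> int:
        if self._stages == 0:
            return value            # no registers: plain wire
        oldest = self._regs[0]
        self._regs.append(value)    # shifts the chain by one stage
        return oldest


def run(samples: List[int], balance: bool) -> List[int]:
    """Add x and x*x, where the multiplier path takes 3 cycles (assumed)."""
    mult_pipe = Delay(3)                     # models the multiplier latency
    match = Delay(3 if balance else 0)       # balancing registers on the fast path
    out = []
    for x in samples:
        slow = mult_pipe.step(x * x)
        fast = match.step(x)
        out.append(slow + fast)
    return out
```

With balancing, each output combines x and x*x from the same input sample; without it, the sum mixes values from different pixels. The control signals need the same delay so that the valid flag still lines up with the data it describes.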

In the embodiment of the invention, the optical flow calculation module also needs the value of the Φ function, whose calculation is related to the smoothness term in the optical flow equation. Its calculation uses the gradients of the initial optical flow: denoting the gradients of the current initial optical flow (u, v) in the x and y directions as u_x, u_y, v_x and v_y, the Φ function is expressed as:

According to formula (3), the Φ function calculation sub-module implemented in the embodiment of the present invention is shown in fig. 9, and the complete Φ value calculation sub-module that incorporates it is shown in fig. 10. The image gradient calculation sub-module is used in the left part of the Φ function calculation sub-module.

The optical flow solving module solves the constructed large linear equation system, using A₁₁, A₁₂, A₂₂, B₁, B₂ and Φ calculated by the parameter calculation module as parameters of the equations. The solving algorithm adopted in the embodiment of the invention is red-black SOR (red-black successive over-relaxation). Based on the principle of red-black SOR grouping, the SOR calculation model is designed so that it can be configured as either an odd-group or an even-group calculation model. The calculations for the odd and even groups are identical, so a single input parameter signal indicates whether the current model takes part in the odd-group or the even-group operation. Once a model is selected to take part in the odd-group operation, it judges, as the pixel stream passes through the module, whether each pixel belongs to the group. If the current pixel belongs to the group, the current iteration value is updated and output; otherwise the input value is kept unchanged and output after the corresponding delay. The odd-group and even-group calculation models are connected end to end to form a complete single iteration model, as shown in fig. 11.
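The odd/even structure can be sketched in Python on a stand-in problem. Because the patent's coefficient formulas are not reproduced in this text, the example solves a scalar model system 4*u[r][c] minus the sum of the four neighbours equals b[r][c] with fixed border values, rather than the coupled (u, v) optical flow system. The group rule (r + c) % 2, the relaxation factor and the names half_sweep and single_iteration are assumptions.

```python
from typing import List

Grid = List[List[float]]


def half_sweep(u: Grid, b: Grid, group: int, omega: float) -> Grid:
    """One configurable SOR stage: group=1 for the odd set, 0 for the even set.

    In-group pixels are relaxed; all other pixels pass through unchanged.
    """
    rows, cols = len(u), len(u[0])
    out = [row[:] for row in u]                 # default: pass-through
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            if (r + c) % 2 != group:
                continue                        # not an in-group pixel
            # All four neighbours belong to the other group, so they can be
            # read from the stage input without any ordering dependency.
            gauss_seidel = (u[r - 1][c] + u[r + 1][c]
                            + u[r][c - 1] + u[r][c + 1] + b[r][c]) / 4.0
            out[r][c] = (1.0 - omega) * u[r][c] + omega * gauss_seidel
    return out


def single_iteration(u: Grid, b: Grid, omega: float) -> Grid:
    """Odd stage followed by even stage, connected end to end."""
    return half_sweep(half_sweep(u, b, 1, omega), b, 0, omega)
```

The comment inside half_sweep is the reason red-black ordering suits a pixel pipeline: within one stage no updated value is needed by another pixel of the same stage, so pixels can be processed in plain raster order.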

In the embodiment of the invention, the multi-iteration model is built from the single iteration model in one of two forms. (1) As shown in fig. 12, a single iteration model and a storage controller are used: the output of a single iteration is stored directly in memory by the storage controller, read back from memory in the next cycle and passed through the calculation module for one more iteration. The iterative solution of the equations is thus carried out under the control of the storage controller. (2) As shown in fig. 13, a plurality of single iteration models are cascaded to perform multiple iterations, the output of one iteration being connected directly to the input of the next iteration module.
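The two forms can be contrasted with a short Python sketch. The function single_iteration below is a hypothetical stand-in (a simple contraction toward 2.0), not the patent's SOR stage; only the way the iterations are chained is the point of the example.

```python
from functools import reduce
from typing import Callable, List

State = List[float]


def single_iteration(state: State) -> State:
    """Hypothetical stand-in for one complete SOR iteration."""
    return [0.5 * x + 1.0 for x in state]


def iterate_with_memory(state: State, n: int) -> State:
    """Form (1): one compute stage reused n times through a frame store."""
    memory = list(state)                     # written by the storage controller
    for _ in range(n):
        memory = single_iteration(memory)    # read, compute, write back
    return memory


def cascade(n: int) -> Callable[[State], State]:
    """Form (2): n copies of the stage wired output to input."""
    stages = [single_iteration] * n

    def pipeline(state: State) -> State:
        return reduce(lambda data, stage: stage(data), stages, state)

    return pipeline
```

Both forms compute the same result. In hardware terms, the first trades throughput and memory bandwidth for a single copy of the logic, while the second spends n copies of the logic to keep the pixel stream moving without a round trip through memory.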

The technical effects of the present invention will be described in detail with reference to experiments.

The simulation outputs of the model implemented by the embodiment of the present invention are shown in fig. 14 and fig. 15, where the leftmost image is the input image, the middle image is the result of the optical flow algorithm implemented in Matlab, and the rightmost image is the simulation result of the model implemented in the embodiment. It can be seen that the simulation output of the model differs only slightly from the Matlab-based implementation; the difference arises because the FPGA-oriented model computes with fixed-point numbers.
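The origin of such small differences can be shown with a minimal fixed-point sketch in Python. The patent does not state the word lengths it uses, so the 8 and 16 fractional bits below, and the helper names to_fixed, from_fixed and fixed_mul, are assumptions chosen only for illustration.

```python
def to_fixed(x: float, frac_bits: int) -> int:
    """Quantise a real value to an integer with frac_bits fractional bits."""
    return round(x * (1 << frac_bits))


def from_fixed(q: int, frac_bits: int) -> float:
    """Convert a fixed-point integer back to a real value."""
    return q / (1 << frac_bits)


def fixed_mul(a: int, b: int, frac_bits: int) -> int:
    """Multiply two fixed-point values and drop the extra fractional bits.

    The right shift truncates (rounds toward negative infinity), which is
    one source of the small error relative to floating point.
    """
    return (a * b) >> frac_bits


def product_error(x: float, y: float, frac_bits: int) -> float:
    """Absolute error of the fixed-point product against the float product."""
    q = fixed_mul(to_fixed(x, frac_bits), to_fixed(y, frac_bits), frac_bits)
    return abs(from_fixed(q, frac_bits) - x * y)
```

For example, product_error(0.3, 0.7, 8) is a little under 0.003, and the error shrinks as more fractional bits are used, at the cost of wider multipliers on the FPGA.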

It should be noted that the embodiments of the present invention can be realized by hardware, software, or a combination of software and hardware. The hardware portion may be implemented using dedicated logic; the software portion may be stored in a memory and executed by a suitable instruction execution system, such as a microprocessor or specially designed hardware. Those skilled in the art will appreciate that the apparatus and methods described above may be implemented using computer-executable instructions and/or embodied in processor control code, such code being provided, for example, on a carrier medium such as a disk, CD- or DVD-ROM, in programmable memory such as read-only memory (firmware), or on a data carrier such as an optical or electronic signal carrier. The apparatus and its modules of the present invention may be implemented by hardware circuits such as very large scale integrated circuits or gate arrays, semiconductors such as logic chips and transistors, or programmable hardware devices such as field programmable gate arrays and programmable logic devices; by software executed by various types of processors; or by a combination of hardware circuits and software, e.g., firmware.

The above description is only a specific embodiment of the present invention and does not limit its scope of protection; any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the scope of protection defined by the appended claims.

Claims

1. A variational optical flow FPGA implementation method, characterized by comprising the following steps:

2. The method of claim 1, wherein the first step comprises: when the input is a color image, it is converted into a gray image by a color space conversion module, and if the input image is already a gray image, the original input is kept; an image line cache module extracts the window column vectors, a delay cache unit then buffers the column vectors of several beats (clock cycles) and outputs the sliding window corresponding to the current pixel; once the sliding window is obtained, a matrix multiplication unit multiplies the window pixels element by element with the filter template pixels, and an accumulation unit sums all elements of the products, which gives the final output for the pixel corresponding to the window.

3. The method of claim 1, wherein the second step comprises: the image gradient calculation produces the horizontal x-direction gradient, the vertical y-direction gradient and the time t-direction gradient of the two frames of images; the gradient in the x direction is calculated by using a delay unit to subtract the gray values of the left and right adjacent pixels; the gradient in the y direction is calculated by subtracting the upper and lower adjacent pixels extracted by a line cache unit.

4. The method of claim 1, wherein the third step comprises: parameter calculation computes the parameters of the large linear equation system that must be constructed for the variational optical flow calculation, during which the value of the Ψ function needs to be calculated. With I_x denoting the gradient of the image in the x direction, I_y the gradient in the y direction, I_t the gradient in the time direction, u the initialized x-direction optical flow, v the initialized y-direction optical flow, together with a constant, the Ψ function is expressed as:

5. The method of claim 1, wherein the fourth step comprises: the optical flow solution requires the value of a function Φ, whose calculation uses the gradients of the initial optical flow. Denoting the gradients of the current initial optical flow (u, v) in the x and y directions as u_x, u_y, v_x and v_y, the Φ function is expressed as:

6. The FPGA implementation method of claim 5, wherein the optical flow solving solves the constructed large linear equation system, using A₁₁, A₁₂, A₂₂, B₁, B₂ and Φ calculated by the parameter calculation module as parameters of the equations; the solving algorithm adopted is the red-black SOR method, and, based on the principle of red-black SOR grouping, the SOR calculation model is designed so that it can be configured as an odd-group calculation model or an even-group calculation model; when a model is selected to take part in the odd-group operation, it judges, as the pixel stream passes through the module, whether each pixel belongs to the group; if the current pixel belongs to the group, the current iteration value is updated and output, otherwise the input value is kept unchanged and output after the corresponding delay; the odd-group calculation model and the even-group calculation model are connected end to end to form a complete single iteration model.

7. The variational optical flow FPGA implementation method of claim 1, wherein the method builds a multi-iteration model from the single iteration model in one of two forms:

8. A program storage medium for receiving user input, the stored computer program causing an electronic device to perform the steps comprising:

9. A variational optical flow FPGA implementation system for carrying out the variational optical flow FPGA implementation method according to any one of claims 1 to 7, the variational optical flow FPGA implementation system comprising:

the optical flow calculating module is used for solving the final optical flow output through an iterative computation model;

the input and output interfaces of the variational optical flow FPGA implementation system are implemented with pixel stream interfaces; the pixel stream interface transmits serial pixel stream data and comprises a pixel data bus and a pixel control bus; the control bus comprises 5 Boolean signals that indicate the validity of each pixel and its relative position in the frame.

10. A computer vision terminal, characterized in that it carries the variational optical flow FPGA implementation system of claim 9.