CN113421184A - Real-time image processing method based on FPGA - Google Patents


Info

Publication number
CN113421184A
Authority
CN
China
Prior art keywords: image, source image, pixel, pixel point, FPGA
Prior art date
Legal status: Pending (the legal status is an assumption and is not a legal conclusion)
Application number: CN202110688723.5A
Original language: Chinese (zh)
Inventor
于非
赵征
高岩
Current Assignee
Shanghai Tong Guan Intelligent Technology Co Ltd
Shanghai Eyevolution Technology Co ltd
Original Assignee
Shanghai Tong Guan Intelligent Technology Co Ltd
Shanghai Eyevolution Technology Co ltd
Application filed by Shanghai Tong Guan Intelligent Technology Co Ltd and Shanghai Eyevolution Technology Co ltd
Priority to CN202110688723.5A
Publication of CN113421184A
Status: Pending


Classifications

    • G06T3/02
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00General purpose image data processing
    • G06T1/60Memory management

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Image Processing (AREA)

Abstract

The invention provides a real-time image processing method based on an FPGA (field-programmable gate array), comprising the following steps: cache one frame of the source image line by line into the off-chip DDR memory of the FPGA; once the cached portion reaches a preset number of lines n, apply the rotation, translation, scaling, cropping and flipping operations to it simultaneously to obtain image data of a target image, and begin reading out that image data while line n+1 of the source frame is still being cached. Here n is a positive integer smaller than the total number of lines in one frame of the source image, so output starts before a full frame has been stored in memory. This shortens the output delay: the target image can begin to be output without waiting for an entire source frame to be stored, improving real-time performance. The rotation, translation, scaling, cropping and flipping operations are also performed on the source image simultaneously rather than as a chain of separate modules.

Description

Real-time image processing method based on FPGA
Technical Field
The invention relates to the field of image processing, in particular to a real-time image processing method based on an FPGA (field programmable gate array).
Background
In the field of image processing, operations such as flipping, zooming, rotating, translating, and cropping an image are often required (e.g., in image stabilization, 3D imaging, VR, and 4K/8K high-precision stitching). Here, flipping refers to mirroring the image left-to-right; zooming refers to resizing the image about its center; rotating refers to clockwise or counterclockwise rotation about the image center; translating refers to moving the image up, down, left, or right; and cropping refers to keeping the central part of the image while setting the surrounding area to a fixed color such as black. Fig. 1 shows a source image 1 undergoing flipping, zooming, rotating, translating, and cropping to obtain a target image 2.
At present, of the three operations of rotation, translation and scaling, rotation is the most common, and is typically limited to: 1. fixed rotations of 90°, 180°, etc., as performed for example by video format converters; 2. small-angle rotations, as performed for example by digital anti-shake. Neither realizes real-time rotation of the picture by a variable angle.
Current rotation and translation operations generally cannot start until one full frame of the source image has been stored in memory, so the output image is delayed by at least one frame (e.g., CN106530209A); scaling is mostly implemented as a separate functional module that likewise starts only after a full source frame has been saved (e.g., CN112017107A). Moreover, simply chaining existing flip, zoom, rotate, translate, and crop modules in sequence causes an even larger output delay. In practical applications the processing must be configured to the actual camera setup: the rotation angle often cannot be a simple 90° or 180° but must be user-defined; the magnification cannot be fixed at 1.5x, 2x, etc., but must be user-defined; and the numbers of pixels for translation and cropping must likewise be set according to the actual situation.
Disclosure of Invention
One of the objectives of the present invention is to provide a real-time image processing method based on FPGA, which can reduce delay and improve real-time performance, and simultaneously complete operations such as flipping, zooming, rotating, translating, and cropping in a single mapping process.
It is another object of the present invention that the parameters of the flipping, zooming, rotating, translating, cropping operations are adjustable.
In order to solve the above problems, the present invention provides a real-time image processing method based on an FPGA, comprising the following steps:
Step S1: caching one frame of the source image line by line into an off-chip memory DDR of the FPGA;
Step S2: after the cache of one frame of the source image in the off-chip memory DDR reaches a preset number of lines n, performing the rotation, translation, scaling, cropping and flipping operations simultaneously on the cached source image to obtain image data of a target image, and reading out the image data of the target image while line n+1 of the source frame is being cached into the off-chip memory DDR,
where n is a positive integer whose value is less than the total number of lines of one frame of the source image.
Optionally, step S1 includes:
establishing a composite affine transformation matrix M to combine the scaling, rotation and translation operations of the source image;
calculating the composite affine transformation matrix M according to the parameters of the scaling, rotation and translation operations of the source image; and
starting to cache the source image line by line into the off-chip memory DDR of the FPGA.
Further, the affine transformation matrix M is characterized as follows:

    M = [  α    β    (1−α)·cx − β·cy + tx ]      (1)
        [ −β    α    β·cx + (1−α)·cy + ty ]

where α, β, tx, ty are intermediate variables satisfying:

α = s·cos(r)  (2)
β = s·sin(r)  (3)
tx = (p·tan(r) + h)·s  (4)
ty = p·s  (5)

where p is the upward or downward coordinate shift of the pixel points of the source image; h is the leftward or rightward coordinate shift of the pixel points of the source image; r is the rotation angle of the source image, with −90° < r < 90°; s is the zoom ratio of the source image; cx is the horizontal coordinate of the center point C of the source image; and cy is the vertical coordinate of the center point C of the source image.
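As a software cross-check, the construction of M from the parameters s, r, p, h, cx, cy can be sketched as follows (a minimal Python model of formulas (1)-(5), not the FPGA implementation; the function and variable names are illustrative):

```python
import math

def affine_matrix(s, r_deg, p, h, cx, cy):
    """Build the 2x3 composite affine matrix M of formulas (1)-(5).

    s: zoom ratio; r_deg: rotation angle in degrees (-90 < r < 90);
    p: vertical shift; h: horizontal shift; (cx, cy): source-image centre.
    """
    r = math.radians(r_deg)
    alpha = s * math.cos(r)              # formula (2)
    beta = s * math.sin(r)               # formula (3)
    tx = (p * math.tan(r) + h) * s       # formula (4)
    ty = p * s                           # formula (5)
    m13 = (1 - alpha) * cx - beta * cy + tx
    m23 = beta * cx + (1 - alpha) * cy + ty
    return [[alpha, beta, m13],
            [-beta, alpha, m23]]

# With s = 1, r = 0 and no shifts, M degenerates to the identity mapping.
M = affine_matrix(1.0, 0.0, 0.0, 0.0, cx=960, cy=540)
```

Note that this matches the usual rotate-and-scale-about-centre convention: the third column moves the rotation origin to the image centre C before adding the translation (tx, ty).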
Optionally, step S2 includes:
Step S21: when the cache of one frame of the source image in the off-chip memory DDR reaches the preset number of lines n, establishing the relationship between the coordinates of a pixel point of the target image and the coordinates of the corresponding pixel point Q in the source image;
Step S22: if the pixel point Q is within the effective range of the off-chip memory DDR cache, finding the four pixel points surrounding Q in the off-chip memory DDR, and storing the calculated pixel value of the target-image pixel point into a second memory; and
while line n+1 of one frame of the source image is being cached into the off-chip memory DDR, reading the pixel value of the target-image pixel point out of the second memory, completing the output of one pixel value of the target image.
Further, step S21 includes:
preparing to process the image data of the first pixel of the target image;
determining whether the number of lines of the source image cached in the off-chip memory DDR has reached the preset number of lines n:
if not, continuing to cache the source image; if so, judging whether the pixel point lies in the range to be cropped. If the pixel point is within the range to be cropped, its image data is invalid data and its pixel value is directly output as a set fixed value; if it is outside the range to be cropped, establishing the relationship between the coordinates (x, y) of the pixel point and the coordinates (xs, ys) of the corresponding pixel point Q in the source image.
Further, the coordinates (xs, ys) of the pixel point Q satisfy the following formulas:

xs = m11·x + m12·y + m13  (7)
ys = m21·x + m22·y + m23  (8)

where m11 = m22 = α; m12 = β; m13 = (1−α)·cx − β·cy + tx; m21 = −β; m23 = β·cx + (1−α)·cy + ty; and α, β, tx, ty are intermediate variables satisfying:

α = s·cos(r)  (2)
β = s·sin(r)  (3)
tx = (p·tan(r) + h)·s  (4)
ty = p·s  (5)

where p is the upward or downward coordinate shift of the pixel points of the source image; h is the leftward or rightward coordinate shift of the pixel points of the source image; r is the rotation angle of the source image, with −90° < r < 90°; s is the zoom ratio of the source image; cx is the horizontal coordinate of the center point C of the source image; and cy is the vertical coordinate of the center point C of the source image.
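A minimal software model of the inverse mapping of formulas (7)-(8) (illustrative Python, not the FPGA pipeline; the coefficient names follow the text):

```python
def target_to_source(x, y, m11, m12, m13, m21, m22, m23):
    """Map a target-image pixel (x, y) back to its fractional source
    coordinates (xs, ys) per formulas (7) and (8)."""
    xs = m11 * x + m12 * y + m13   # formula (7)
    ys = m21 * x + m22 * y + m23   # formula (8)
    return xs, ys

# With the identity coefficients (alpha = 1, beta = 0, no translation),
# every target pixel maps onto itself.
print(target_to_source(3, 4, 1, 0, 0, 0, 1, 0))   # -> (3, 4)
```

Because the mapping runs from target to source, (xs, ys) is generally fractional, which is why the surrounding four source pixels are later blended by bilinear interpolation.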
Further, after judging the range of pixel points to be cropped, the method further includes:
calculating the coordinates, in the source image, of the pixel point Q corresponding to the pixel point to be output in the target image; and
if a flipping operation is required, further establishing the relationship between the coordinates (x, y) of the pixel point to be output and the coordinates (xs′, ys) of the corresponding pixel point Q in the source image.
Further, the coordinates (xs′, ys) of the pixel point Q satisfy the following formulas:

xs′ = w − 1 − xs  (9)
ys = m21·x + m22·y + m23  (8)

where m11 = m22 = α; m12 = β; m13 = (1−α)·cx − β·cy + tx; m21 = −β; m23 = β·cx + (1−α)·cy + ty; and α, β, tx, ty are intermediate variables satisfying:

α = s·cos(r)  (2)
β = s·sin(r)  (3)
tx = (p·tan(r) + h)·s  (4)
ty = p·s  (5)

where p is the upward or downward coordinate shift of the pixel points of the source image; h is the leftward or rightward coordinate shift of the pixel points of the source image; r is the rotation angle of the source image, with −90° < r < 90°; s is the zoom ratio of the source image; cx is the horizontal coordinate of the center point C of the source image; and cy is the vertical coordinate of the center point C of the source image.
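The horizontal flip of formula (9) only remaps the x-coordinate; a one-line sketch (assuming, as the text implies but does not state explicitly, that w is the source-image width in pixels):

```python
def flip_x(xs, w):
    """Formula (9): mirror a source x-coordinate about the vertical
    centre line of a w-pixel-wide image."""
    return w - 1 - xs

# The leftmost and rightmost columns of a 1920-wide image swap places.
print(flip_x(0, 1920), flip_x(1919, 1920))   # -> 1919 0
```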
Further, storing the calculated pixel values of the pixel points of the target image into a second memory specifically includes:
converting the coordinates of the four pixel points surrounding the pixel point Q into memory addresses, and reading the pixel value of each pixel point from its memory address;
storing the read pixel values into the second memory; and
obtaining the pixel value of the pixel point Q by interpolation.
Further, the pixel value (rs, gs, bs) of the pixel point Q satisfies the following formulas:

rs = (r1·(y0+1−ys) + r3·(ys−y0))·(x0+1−xs) + (r2·(y0+1−ys) + r4·(ys−y0))·(xs−x0)  (10)
gs = (g1·(y0+1−ys) + g3·(ys−y0))·(x0+1−xs) + (g2·(y0+1−ys) + g4·(ys−y0))·(xs−x0)  (11)
bs = (b1·(y0+1−ys) + b3·(ys−y0))·(x0+1−xs) + (b2·(y0+1−ys) + b4·(ys−y0))·(xs−x0)  (12)

where the coordinates of the four pixel points surrounding the pixel point Q are P1(x0, y0), P2(x0+1, y0), P3(x0, y0+1) and P4(x0+1, y0+1), and their pixel values are (r1, g1, b1), (r2, g2, b2), (r3, g3, b3) and (r4, g4, b4) respectively.
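Formulas (10)-(12) are ordinary bilinear interpolation. A reference sketch in Python (illustrative only; `pixel(x, y)` is a hypothetical accessor returning the (r, g, b) tuple of a source pixel):

```python
def bilinear(xs, ys, pixel):
    """Bilinear interpolation per formulas (10)-(12): blend the four
    neighbours P1..P4 of the fractional source coordinate (xs, ys)."""
    x0, y0 = int(xs), int(ys)      # P1(x0, y0) is the top-left neighbour
    fx, fy = xs - x0, ys - y0      # fx = xs - x0, fy = ys - y0
    p1, p2 = pixel(x0, y0), pixel(x0 + 1, y0)
    p3, p4 = pixel(x0, y0 + 1), pixel(x0 + 1, y0 + 1)
    # per channel: (c1*(1-fy) + c3*fy)*(1-fx) + (c2*(1-fy) + c4*fy)*fx,
    # which is formula (10)/(11)/(12) rewritten with fx and fy
    return tuple(
        (c1 * (1 - fy) + c3 * fy) * (1 - fx)
        + (c2 * (1 - fy) + c4 * fy) * fx
        for c1, c2, c3, c4 in zip(p1, p2, p3, p4)
    )

# On a smooth ramp image, interpolation reproduces the coordinates.
ramp = lambda x, y: (x, y, 0)
print(bilinear(1.25, 2.75, ramp))   # -> (1.25, 2.75, 0.0)
```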
Further, the pixel value (rs′, gs′, bs′) of the pixel point Q satisfies the following formulas:

rs′ = (r1·(y0+1−ys) + r3·(ys−y0))·(x0+1−xs′) + (r2·(y0+1−ys) + r4·(ys−y0))·(xs′−x0)  (13)
gs′ = (g1·(y0+1−ys) + g3·(ys−y0))·(x0+1−xs′) + (g2·(y0+1−ys) + g4·(ys−y0))·(xs′−x0)  (14)
bs′ = (b1·(y0+1−ys) + b3·(ys−y0))·(x0+1−xs′) + (b2·(y0+1−ys) + b4·(ys−y0))·(xs′−x0)  (15)

where the coordinates of the four pixel points surrounding the pixel point Q are P1(x0, y0), P2(x0+1, y0), P3(x0, y0+1) and P4(x0+1, y0+1), and their pixel values are (r1, g1, b1), (r2, g2, b2), (r3, g3, b3) and (r4, g4, b4) respectively.
Further, the second memory is an on-chip memory of the FPGA.
Further, step S2 also includes:
Step S23: while line n+1 of one frame of the source image is being cached into the off-chip memory DDR, reading the pixel value of the target-image pixel point to be output out of the second memory, completing the output of one pixel value of the target image;
processing the next pixel point (x+1, y) of the current row until all pixel points of the current row have been processed;
Step S24: repeating steps S21 to S23 to continue processing a new row, and continuing to read out the image data of the target image; and
Step S25: after one full frame of the source image has been cached into the off-chip memory DDR, directly performing the rotation, translation, scaling, cropping and flipping operations simultaneously on the source image to obtain the image data of the remaining part of the target image, reading it out at the same time and thereby completing the output of the target image.
Compared with the prior art, the invention has the following beneficial effects:
the invention provides a real-time image processing method based on an FPGA (field programmable gate array), which comprises the following steps of: step S1: caching a frame of source images into an off-chip memory DDR of the FPGA line by line; step S2: after the cache of one frame of the source image in the off-chip memory DDR reaches a preset line number n, the source image in the off-chip memory DDR is subjected to rotation, translation, scaling, clipping and overturning operations at the same time to obtain image data of a target image, and the image data of the target image is read out when the cache of one frame of the source image in the off-chip memory DDR is performed on the n +1 th line, wherein n is a positive integer, and the value of n is smaller than the total line number of one frame of the source image. The invention realizes that the output is started when less than one frame of image is completely stored in the memory by the value of n being less than the total line number of one frame of source image, thereby shortening the time of output delay, starting the output of the target image without waiting for the source image to be completely stored in the memory by one frame, and improving the real-time property; the rotation, translation, scaling, cropping and flipping operations are also performed simultaneously on the source image.
Furthermore, the real-time image processing method is based on the Xilinx platform, and the zooming, rotating and translating parameters are dynamically set through a Linux application program of the PS end, so that the zooming, rotating and translating parameters can be dynamically modified, and the real-time variable effect of the zooming, rotating and translating parameters is achieved.
Drawings
FIG. 1 is a schematic representation of an image before and after flipping, zooming, rotating, translating, and cropping;
FIG. 2 is a schematic diagram of an image of a 3D usage scene before and after flipping, zooming, rotating, translating, and cropping;
fig. 3 is a schematic flowchart of a real-time image processing method based on an FPGA according to an embodiment of the present invention;
FIG. 4 is a detailed processing flow diagram of step S2 according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a packet storage of image data of a source image according to an embodiment of the present invention;
fig. 6 is a schematic diagram illustrating bilinear interpolation to calculate pixel values of pixel points of a target image that need to be output according to an embodiment of the present invention.
Detailed Description
An FPGA-based real-time image processing method according to the present invention is described in further detail below with reference to the accompanying drawings, in which preferred embodiments of the invention are shown. Those skilled in the art may modify the invention described herein while still achieving its advantageous effects; accordingly, the following description should be construed broadly and not as limiting the invention.
In the interest of clarity, not all features of an actual implementation are described. In the following description, well-known functions or constructions are not described in detail since they would obscure the invention in unnecessary detail. It should be appreciated that in the development of any such actual embodiment, numerous implementation-specific details may be set forth in order to achieve the developer's specific goals, such as compliance with system-related and business-related constraints, which may vary from one implementation to another. Moreover, it will be appreciated that such a development effort might be complex and time-consuming, but would nevertheless be a routine undertaking for those of ordinary skill in the art.
In order to make the objects and features of the present invention more comprehensible, embodiments of the present invention are described in detail below with reference to the accompanying drawings. It is to be noted that the drawings are in a very simplified form and that non-precision ratios are used for convenience and clarity only to aid in the description of the embodiments of the invention.
As shown in fig. 3 and 4, the present embodiment provides a real-time image processing method based on an FPGA, including the following steps:
Step S1: caching one frame of the source image line by line into an off-chip memory DDR of the FPGA;
Step S2: after the cache of one frame of the source image in the off-chip memory DDR reaches a preset number of lines n, performing the rotation, translation, scaling, cropping and flipping operations simultaneously on the n cached lines of the source image to obtain image data of a target image, and reading out the image data of the target image while line n+1 of the source frame is being cached into the off-chip memory DDR,
where the value of n is less than the total number of lines of one frame of the source image.
The following describes a real-time image processing method based on FPGA according to this embodiment in detail with reference to fig. 2 to 6.
Step S1 is executed first: one frame of the source image is buffered line by line into the off-chip memory DDR of the FPGA.
The method specifically comprises the following steps:
first, an affine transformation matrix M is established to combine the scaling, rotation and translation operations of the source image. In detail, a composite affine transformation matrix M is established to combine the scaling, rotation and translation operations of the source images. In this embodiment, in this step, since the FPGA is provided with an ARM, the scaling, rotation, and translation parameters can be dynamically set by the Linux application program at the PS end, so that the scaling, rotation, and translation parameters can be dynamically modified, thereby achieving the effect of real-time variable scaling, rotation, and translation parameters. In other embodiments, since the FPGA does not have an ARM in itself, the parameters can be set by other means, such as I2C communication, etc.
The affine transformation matrix M is as follows:

    M = [  α    β    (1−α)·cx − β·cy + tx ]      (1)
        [ −β    α    β·cx + (1−α)·cy + ty ]

where α, β, tx, ty are intermediate variables satisfying:

α = s·cos(r)  (2)
β = s·sin(r)  (3)
tx = (p·tan(r) + h)·s  (4)
ty = p·s  (5)

where p is the upward or downward coordinate shift of the pixel points of the source image (upward shifts are positive, downward shifts negative); h is the leftward or rightward coordinate shift of the pixel points of the source image (leftward shifts are negative, rightward shifts positive); r is the rotation angle of the source image (clockwise positive, counterclockwise negative), with −90° < r < 90°; s is the zoom ratio of the source image (s > 1 zooms in, 0 < s < 1 zooms out); cx is the horizontal coordinate of the center point C of the source image; cy is the vertical coordinate of the center point C of the source image.
For ease of description, the elements of the matrix M are denoted

    M = [ m11  m12  m13 ]      (6)
        [ m21  m22  m23 ]
Because the scaling, rotation and translation operations do not commute, formulas (1)-(5) depend on the order in which the operations are applied; they are derived here for the order scaling, then rotation, then translation. If a different order is required, formulas (1)-(5) must be adjusted accordingly to generate the corresponding composite affine transformation matrix M.
In this embodiment the source image is a 2D image. In other embodiments the method applies to 3D: for example, for a Side-by-Side 3D image (two images spliced horizontally or vertically into one, as shown in fig. 2), it suffices to generate one composite affine transformation matrix M for each of the two halves of source image 1; each half then uses its own matrix M in the subsequent processing flow to obtain target image 2.
Then, the composite affine transformation matrix M is calculated from the parameters of the scaling, rotation and translation operations of the source image, yielding the fixed values m11, m12, m13, m21, m22 and m23.
The source image format is then converted to a target format, for example to RGB format.
Then, the source image is buffered into an off-chip memory DDR of the FPGA line by line. Specifically, the source image converted into the RGB format is buffered in the off-chip memory DDR of the FPGA line by line, and when the buffer does not reach the preset line number, the buffered source image is not calculated, and therefore, no target image is generated and read.
The reason why the present embodiment caches the source image into the off-chip memory DDR of the FPGA instead of the on-chip memory of the FPGA (e.g., BRAM, URAM) is as follows: the FPGA's on-chip memory is small and cannot cache enough lines; moreover, to improve the efficiency of reading image data during the rotation, translation, scaling, cropping and flipping operations, the image data of the source image needs to be stored in groups, which further increases the amount of buffered data.
For example, when caching the image data of row m of the source image, each data point of row m (i.e., the coordinates and pixel value of a pixel point) is combined with the corresponding data point of row m+1 into a new data point; for instance, the data points of pixel points (0, m) and (0, m+1) are merged into one new data point. As shown in fig. 5, taking image data in RGB format as an example, each data point of a row is stored in 32 bits, and the grouped storage combines the pixels of two vertically adjacent rows into a new 64-bit data point. In the subsequent steps, the pixel values of the pixels surrounding the source-image pixel point corresponding to a target-image coordinate can then be read out of the off-chip memory DDR sequentially in one access: extra memory is traded for reading efficiency.
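The grouped storage can be modelled in software as packing the vertically adjacent pixels of rows m and m+1 into one 64-bit word (a sketch; the 0x00RRGGBB layout and the upper-pixel-in-the-high-half choice are assumptions, the text only specifies 32-bit data points grouped into 64-bit data points):

```python
def pack_rgb32(r, g, b):
    """One 8-bit-per-channel RGB pixel in a 32-bit word (assumed layout
    0x00RRGGBB)."""
    return (r << 16) | (g << 8) | b

def pack_pair(upper, lower):
    """Merge the row-m pixel and the row-(m+1) pixel at the same column
    into a single 64-bit DDR word, so that one read returns both
    vertical neighbours needed by the bilinear interpolation."""
    return (pack_rgb32(*upper) << 32) | pack_rgb32(*lower)

word = pack_pair((0x11, 0x22, 0x33), (0x44, 0x55, 0x66))
print(hex(word))   # -> 0x11223300445566
```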
Then, step S2 is executed: after the cache of one frame of the source image in the off-chip memory DDR reaches the preset number of lines n, the rotation, translation, scaling, cropping and flipping operations are performed simultaneously on the n cached lines to obtain image data of the target image, and the image data of the target image begins to be read out while line n+1 of the source frame is being cached. In this step the five operations are applied to the source image simultaneously, and the parameters of the flipping and cropping operations can also be modified dynamically, further achieving real-time variable parameters. Because the preset number of lines is less than the total number of lines of a source frame, the delay before the processed target-image pixel values are read out of the second memory equals the caching time of the preset number of lines; output thus starts before a full frame has been stored in memory, shortening the output delay.
The present embodiment targets small rotation angles of the source image, e.g., a rotation angle r satisfying −90° < r < 90°, so that output of the target image can begin before one frame of the source image has been fully cached into the off-chip memory DDR. The closer the rotation angle is to 90°, the larger the value of n must be and the less the output delay can be shortened; the closer the angle is to 0°, the smaller the value of n can be and the more the delay can be shortened. A rotation angle close to 0° is therefore preferred.
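The dependence of n on the rotation angle can be illustrated numerically (not part of the patent: a hypothetical estimate of how deep into the source the inverse mapping of formula (8) reaches while producing target row 0; since ys is linear in x, only the two row endpoints need checking):

```python
import math

def rows_needed_first_line(m21, m23, w):
    """Deepest source row touched by target row 0 (y = 0), per
    ys = m21*x + m23, plus one extra row for the lower bilinear
    neighbour; returned as a row count."""
    deepest = max(m21 * x + m23 for x in (0, w - 1))
    return max(0, math.floor(deepest)) + 2   # row index -> row count

# No rotation (m21 = 0): only 2 source rows are needed before output
# can start.  A 5-degree rotation of a 1920-wide image reaches much
# deeper into the source, so n must be larger.
beta = math.sin(math.radians(5.0))
print(rows_needed_first_line(0.0, 0.0, 1920))
print(rows_needed_first_line(-beta, beta * 960, 1920))
```

This matches the qualitative statement above: the estimate grows with |r| and collapses to a couple of rows as r approaches 0°.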
The method specifically comprises the following steps:
Step S21: when the cache of one frame of the source image in the off-chip memory DDR reaches the preset number of lines n, the relationship between the coordinates of the pixel points of the target image and the coordinates of the corresponding pixel points in the source image is established. Specifically, the coordinates (x, y) of the target-image pixel point to be output are recorded first; since the target image must be read out in order, calculation and read-out start from the first pixel point of the first row (x = 0, y = 0), so the image data of the first pixel point of the target image is prepared for processing. Then it is confirmed whether the number of source-image lines cached in the off-chip memory DDR has reached the preset number of lines n; if not, caching of the source image continues. If the preset number of lines n has been reached, the next step is executed: judging whether the pixel point lies in the range to be cropped. If so, its image data is invalid data and its pixel value directly outputs a set fixed value (e.g., black, with pixel value 0); if not, the relationship between the coordinates (x, y) of the pixel point and the coordinates (xs, ys) of the corresponding pixel point Q in the source image is established, where (xs, ys) satisfies:

xs = m11·x + m12·y + m13  (7)
ys = m21·x + m22·y + m23  (8)

From this relationship, the source-image coordinates of the pixel point Q corresponding to a specific target-image pixel point to be output (e.g., the first pixel point of the first row) are calculated. Then, if a flipping operation is required, the relationship between the coordinates (x, y) of the pixel point to be output and the coordinates (xs′, ys) of the corresponding pixel point Q in the source image is further established, where (xs′, ys) satisfies:

xs′ = w − 1 − xs  (9)
ys = m21·x + m22·y + m23  (8)
Step S22: if the pixel point Q is within the effective range of the off-chip memory DDR cache, the four pixel points surrounding Q, with coordinates P1(x0, y0), P2(x0+1, y0), P3(x0, y0+1) and P4(x0+1, y0+1), are found in the off-chip memory DDR, and the calculated pixel value of the target-image pixel point is stored into a second memory.
In detail, storing the calculated pixel value of the target-image pixel point into the second memory proceeds as follows:
First, the coordinates of the four pixel points P1(x0, y0), P2(x0+1, y0), P3(x0, y0+1) and P4(x0+1, y0+1) are converted into memory addresses, and the pixel value of each pixel point is read from its address.
The read pixel values are then stored in a second memory, such as an on-chip memory of the FPGA, specifically a BRAM, which allows random reading of the pixel data. In this embodiment the read pixel values are stored in two second memories, i.e., a dual BRAM operated in ping-pong fashion to improve reading efficiency.
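The dual-BRAM ping-pong can be modelled in a few lines (a software sketch of the scheduling idea only; real BRAM ports, depths and handshakes are not represented):

```python
class PingPongBuffer:
    """Two buffers with alternating roles: the interpolation stage reads
    the buffer filled during the previous phase while the DDR-fetch
    stage fills the other, hiding the fetch latency."""
    def __init__(self):
        self.bufs = ([], [])
        self.write_idx = 0               # buffer currently being filled

    def write(self, item):
        self.bufs[self.write_idx].append(item)

    def swap(self):
        self.write_idx ^= 1              # roles exchange
        self.bufs[self.write_idx].clear()    # new write buffer starts empty

    def read_all(self):
        return list(self.bufs[self.write_idx ^ 1])   # the read-side buffer

pp = PingPongBuffer()
pp.write("P1"); pp.write("P2")   # fetch stage fills buffer 0
pp.swap()                        # phase boundary
print(pp.read_all())             # -> ['P1', 'P2']
```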
Then, as shown in fig. 6, the pixel value of the pixel point Q is obtained through interpolation calculation, and further, the pixel value of the pixel point Q is obtained through bilinear interpolation calculation. In the embodiment, taking processing an image in RGB format as an example, the pixel point Q, P1、P2、P3、P4The corresponding pixel values are respectively (r)s,gs,bs)、(r1,g1,b1)、(r2,g2,b2)、(r3,g3,b3)、(r4,g4,b4). Since the processing modes of each channel are the same, if only rotation, translation, scaling, clipping, and other operations are performed, the pixel value (r) of the pixel point Q is obtaineds,gs,bs) The following formula is satisfied:
rs=(r1*(y0+1-ys)+r3*(ys-y0))*(x0+1-xs) +(r2*(y0+1-ys)+r4*(ys-y0))*(xs-x0) (10)
gs=(g1*(y0+1-ys)+g3*(ys-y0))*(x0+1-xs) +(g2*(y0+1-ys)+g4*(ys-y0))*(xs-x0) (11)
bs=(b1*(y0+1-ys)+b3*(ys-y0))*(x0+1-xs) +(b2*(y0+1-ys)+b4*(ys-y0))*(xs-x0) (12)。
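A minimal Python sketch of the bilinear interpolation in formulas (10)-(12); the function name and argument order are assumptions made for illustration:

```python
# Bilinear interpolation per formulas (10)-(12): q = (xs, ys) is the mapped
# source coordinate; p1..p4 are the RGB triples at (x0,y0), (x0+1,y0),
# (x0,y0+1) and (x0+1,y0+1). The same arithmetic is applied per channel.
def bilinear_rgb(q, p1, p2, p3, p4, x0, y0):
    xs, ys = q
    out = []
    for c in range(3):  # R, G, B channels are processed identically
        v = ((p1[c] * (y0 + 1 - ys) + p3[c] * (ys - y0)) * (x0 + 1 - xs)
             + (p2[c] * (y0 + 1 - ys) + p4[c] * (ys - y0)) * (xs - x0))
        out.append(v)
    return tuple(out)
```

When Q coincides with P1 the result is exactly P1's value, and when all four neighbors are equal the result equals that common value, as expected of a weighted average.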
If the rotation, translation, scaling, clipping and flipping operations are all performed, the pixel value (rs′,gs′,bs′) of the pixel point Q satisfies the following formulas:
rs′=(r1*(y0+1-ys)+r3*(ys-y0))*(x0+1-xs′) +(r2*(y0+1-ys)+r4*(ys-y0))*(xs′-x0) (13)
gs′=(g1*(y0+1-ys)+g3*(ys-y0))*(x0+1-xs′) +(g2*(y0+1-ys)+g4*(ys-y0))*(xs′-x0) (14)
bs′=(b1*(y0+1-ys)+b3*(ys-y0))*(x0+1-xs′) +(b2*(y0+1-ys)+b4*(ys-y0))*(xs′-x0) (15)
If the pixel point Q is outside the effective range of the off-chip memory DDR cache, the pixel point needing to be output in the target image (the pixel point in the target image corresponding to the pixel point Q in the source image) directly outputs a set fixed value (for example black, whose pixel value is 0).
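The boundary handling can be sketched as follows (a hedged illustration only: the function names are invented, and the exact bound depends on the interpolation footprint, here left strictly inside the image so a 2x2 neighborhood always exists):

```python
# If the mapped point Q falls outside the cached source region, emit a fixed
# background value (black here) instead of interpolating.
BLACK = (0, 0, 0)

def sample_or_fill(xs, ys, width, height, sample):
    """sample(xs, ys) fetches an interpolated pixel; out-of-range -> BLACK."""
    # Strict upper bound keeps room for the (x0+1, y0+1) neighbor.
    if 0 <= xs < width - 1 and 0 <= ys < height - 1:
        return sample(xs, ys)
    return BLACK
```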
Because the pixel values of the pixel points of the target image are stored in the second memory while the source image is stored in the off-chip memory DDR of the FPGA, this combined storage scheme requires the data stream to be buffered for only one clock cycle, and the pipelined mode allows the pixel value of each pixel point of the target image needing to be output to be calculated from the pixel values of its neighboring pixel points.
Step S23: while the (n+1)-th line of the frame of the source image is being cached in the off-chip memory DDR, start reading out from the second memory the pixel value of the pixel point of the processed target image needing to be output, thereby completing the pixel value output of one pixel point of the target image. Then process the next pixel (x+1, y) in the current row, until all pixel points of the current row have been processed.
Step S24: repeat steps S21 to S23, starting the processing for each new line in turn and continuing to read out the image data of the target image. This step runs from the moment the cached portion of the source image frame reaches the preset line number n+1 until the whole frame of the source image has been cached in the off-chip memory DDR.
In this step it is necessary to wait until the number of cached lines of the source image frame reaches the preset number n before execution; in general, however, the output speed of the target image and the caching speed of the source image should be substantially the same, so no actual waiting is needed. The pixel points of the next line are then processed in the same way, until all pixel points of the target image needing to be output have been processed.
Step S25: after the whole frame of the source image has been cached in the off-chip memory DDR, the rotation, translation, scaling, clipping and flipping operations are directly and simultaneously performed on the source image to obtain the image data of the remaining portion of the target image, and this image data is read out to complete the output of the target image. That is, the rotation, translation, scaling, clipping and flipping operations are performed on the source image simultaneously to obtain the pixel values of the remaining lines of the target image, which are read out in sequence; the number of remaining lines equals the preset line number n. In this step the whole frame has already been cached and no further source image data is being input, so each pixel point of the remaining lines can be processed immediately, without waiting for the number of lines cached in the off-chip memory DDR to reach a preset value.
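The overall low-latency schedule of steps S1-S25 can be sketched in software, under the simplifying assumption that exactly one target line is produced per newly cached source line (the real design overlaps these in hardware):

```python
# Line-level schedule: after n source lines are buffered, each newly cached
# line releases one target line; the final n target lines are produced after
# the whole frame has been cached (step S25).
def schedule(total_lines, n):
    events = []
    for line in range(total_lines):
        events.append(("cache", line))
        if line >= n:                       # output starts at the (n+1)-th cached line
            events.append(("output", line - n))
    for line in range(total_lines - n, total_lines):
        events.append(("output", line))     # remaining n lines, no waiting needed
    return events
```

This shows why the delay between source caching and target output is only n lines rather than a full frame.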
Real-time image processing with the Xilinx ZCU106-based development kit proceeds as follows. First, the rotation, translation and scaling parameters can be set dynamically through a Linux application on the PS side; once the parameters are set on the PS side, the scaling, rotation and translation operations on the source image are combined according to the affine transformation matrix M. Then a video signal is input through the HDMI Rx interface, and the image format of the source image is converted into RGB format through the VPSS scaler IP. The format-converted source image is then cached in the DDR. Next, the relationship between the coordinates of the pixel points of the target image and the coordinates of the corresponding pixel points in the source image is established according to the parameters of the scaling, rotation and translation operations and the requirements of the clipping and flipping operations; the pixel values of the pixel points of the target image are calculated from the pixel values of the pixel points around the corresponding pixel points, and the rotation, translation, scaling, clipping and flipping operations are performed on the source image simultaneously, so that the video stream is processed. The processed image data is stored in the DDR, then transferred to the Mixer IP by DMA through a GStreamer program configured on the PS side, and the video is output through the HDMI Tx interface. The video stream conforms to the Xilinx format (PPC = 2), covering the input and output of image data, the parsing of image data, the conversion of clock domains using FIFOs, and the like.
In summary, the present invention provides a real-time image processing method based on an FPGA. First, it reduces the delay between the output of the target image and the caching of the source image: the output of the target image can start without waiting for a whole frame of the source image to be stored in memory, which improves real-time performance. Second, operations such as flipping, scaling, rotation, translation and clipping can be performed simultaneously while the video stream is being processed. Finally, the parameters of the flipping, scaling, rotation, translation and clipping operations can be modified dynamically, so that the parameters are variable in real time.
It is to be understood that while the present invention has been disclosed in connection with its preferred embodiments, the disclosure is not to be considered as limiting. It will be apparent to those skilled in the art that many changes, modifications and equivalent substitutions can be made to the embodiments of the invention without departing from its scope. Therefore, any simple modification, equivalent change or refinement made to the above embodiments according to the technical essence of the present invention remains within the scope of protection of the technical solution of the present invention, unless it departs from the content of the technical solution of the present invention.

Claims (14)

1. A real-time image processing method based on FPGA is characterized by comprising the following steps:
step S1: caching a frame of source images into an off-chip memory DDR of the FPGA line by line;
step S2: after the caching of the one frame of the source image in the off-chip memory DDR reaches a preset line number n, performing rotation, translation, scaling, clipping and flipping operations on the source image in the off-chip memory DDR simultaneously to obtain image data of a target image, and reading out the image data of the target image while the (n+1)-th line of the one frame of the source image is being cached in the off-chip memory DDR,
wherein n is a positive integer whose value is less than the total number of lines of one frame of the source image.
2. The real-time image processing method based on FPGA of claim 1, wherein step S1 includes:
establishing an affine transformation matrix M to combine scaling, rotation and translation operations of the source image;
calculating the composite affine transformation matrix M according to the parameters of the scaling, rotation and translation operations of the source image; and
starting to cache the source image into the off-chip memory DDR of the FPGA line by line.
3. The FPGA-based real-time image processing method of claim 2, wherein said affine transformation matrix M is as follows:
M = [ α    β    (1-α)*cx - β*cy + tx
     -β    α    β*cx + (1-α)*cy + ty ]    (1)
wherein α, β, tx and ty are intermediate variables that satisfy the following formulas:
α=s*cos(r) (2)
β=s*sin(r) (3)
tx=(p*tan(r)+h)*s (4)
ty=p*s (5)
wherein p is the upward or downward coordinate shift amount of the pixel points of the source image; h is the leftward or rightward coordinate shift amount of the pixel points of the source image; r is the rotation angle of the source image, with -90° < r < 90°; s is the zoom ratio of the source image; cx is the horizontal coordinate of the center point C of the source image; and cy is the vertical coordinate of the center point C of the source image.
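As an illustrative aid (not part of the claims; the function name and parameter order are assumptions of this sketch), the intermediate variables of formulas (2)-(5) and the matrix coefficients named in claim 7 can be computed as:

```python
import math

# Build the composite affine matrix coefficients from the scale s, rotation
# r (degrees, -90 < r < 90), shifts p and h, and the image center (cx, cy).
def affine_matrix(s, r_deg, p, h, cx, cy):
    r = math.radians(r_deg)
    alpha = s * math.cos(r)                # formula (2)
    beta = s * math.sin(r)                 # formula (3)
    tx = (p * math.tan(r) + h) * s         # formula (4)
    ty = p * s                             # formula (5)
    m11 = m22 = alpha
    m12, m21 = beta, -beta
    m13 = (1 - alpha) * cx - beta * cy + tx
    m23 = beta * cx + (1 - alpha) * cy + ty
    return (m11, m12, m13, m21, m22, m23)
```

With s = 1, r = 0 and no shifts the coefficients reduce to the identity mapping, which is a quick sanity check of the formulas.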
4. The FPGA-based real-time image processing method of claim 2, wherein the real-time image processing method is based on a Xilinx platform and the scaling, rotation and translation parameters are set dynamically through a Linux application on the PS side.
5. The real-time image processing method based on FPGA of claim 1, wherein step S2 includes:
step S21: when the caching of a frame of source image in the off-chip memory DDR reaches a preset line number n, establishing a relation between coordinates of pixel points of a target image and coordinates of corresponding pixel points Q in the source image;
step S22: if the pixel point Q is within the effective range of the DDR cache of the off-chip memory, four pixel points around the pixel point Q are searched in the DDR cache of the off-chip memory, and the calculated pixel values of the pixel points of the target image are stored in a second memory; and
and when the (n+1)-th line of the one frame of the source image is being cached in the DDR, reading out the pixel values of the pixel points of the target image from the second memory to complete the pixel value output of one pixel point of the target image.
6. The real-time image processing method based on FPGA of claim 5, wherein step S21 includes:
preparing to process the image data of the first pixel point of the target image;
determining whether the number of lines of the source image cached in the off-chip memory DDR reaches a preset number of lines n:
if the number of cached lines has not reached the preset number n, continuing to cache the source image; if it has reached the preset number n, judging whether the pixel point is within the range needing to be clipped; if the pixel point is within the range needing to be clipped, the image data of the pixel point is invalid data and the pixel value of the pixel point is directly output as a set fixed value; and if the pixel point is outside the range needing to be clipped, establishing the relationship between the coordinates (x, y) of the pixel point and the coordinates (xs, ys) of the pixel point Q corresponding to it in the source image.
7. The FPGA-based real-time image processing method of claim 6, wherein the coordinates (xs, ys) of the pixel point Q satisfy the following formulas:
xs=m11*x+m12*y+m13 (7)
ys=m21*x+m22*y+m23 (8);
wherein m11 = m22 = α; m12 = β; m13 = (1-α)*cx - β*cy + tx; m21 = -β; m23 = β*cx + (1-α)*cy + ty; and α, β, tx and ty are intermediate variables that satisfy the following formulas:
α=s*cos(r) (2)
β=s*sin(r) (3)
tx=(p*tan(r)+h)*s (4)
ty=p*s (5)
wherein p is the upward or downward coordinate shift amount of the pixel points of the source image; h is the leftward or rightward coordinate shift amount of the pixel points of the source image; r is the rotation angle of the source image, with -90° < r < 90°; s is the zoom ratio of the source image; cx is the horizontal coordinate of the center point C of the source image; and cy is the vertical coordinate of the center point C of the source image.
8. The real-time image processing method based on FPGA of claim 6, wherein after determining the range of clipping of the pixel point, further comprising:
calculating the coordinates of pixel points Q corresponding to pixel points needing to be output in the target image in the source image; and
and if the flipping operation is needed, further establishing the relationship between the coordinates (x, y) of the pixel point needing to be output and the coordinates (xs′, ys) of the pixel point Q corresponding to it in the source image.
9. The FPGA-based real-time image processing method of claim 8, wherein the coordinates (xs′, ys) of the pixel point Q satisfy the following formulas:
xs′=w-1-xs (9)
ys=m21*x+m22*y+m23 (8);
wherein m11 = m22 = α; m12 = β; m13 = (1-α)*cx - β*cy + tx; m21 = -β; m23 = β*cx + (1-α)*cy + ty; and α, β, tx and ty are intermediate variables that satisfy the following formulas:
α=s*cos(r) (2)
β=s*sin(r) (3)
tx=(p*tan(r)+h)*s (4)
ty=p*s (5)
wherein p is the upward or downward coordinate shift amount of the pixel points of the source image; h is the leftward or rightward coordinate shift amount of the pixel points of the source image; r is the rotation angle of the source image, with -90° < r < 90°; s is the zoom ratio of the source image; cx is the horizontal coordinate of the center point C of the source image; and cy is the vertical coordinate of the center point C of the source image.
10. The FPGA-based real-time image processing method of claim 7 or 9, wherein storing the calculated pixel values of the pixel points of the target image in a second memory specifically comprises:
converting the coordinates of four pixel points around the pixel point Q into memory addresses, and reading the pixel value of each pixel point from the memory addresses;
storing the read pixel values of the pixel points into a second memory; and
and obtaining the pixel value of the pixel point Q through interpolation calculation.
11. The FPGA-based real-time image processing method of claim 7,
the pixel value (rs, gs, bs) of the pixel point Q satisfies the following formulas:
rs=(r1*(y0+1-ys)+r3*(ys-y0))*(x0+1-xs)+(r2*(y0+1-ys)+r4*(ys-y0))*(xs-x0) (10)
gs=(g1*(y0+1-ys)+g3*(ys-y0))*(x0+1-xs)+(g2*(y0+1-ys)+g4*(ys-y0))*(xs-x0) (11)
bs=(b1*(y0+1-ys)+b3*(ys-y0))*(x0+1-xs)+(b2*(y0+1-ys)+b4*(ys-y0))*(xs-x0) (12)
wherein the coordinates of the four pixel points around the pixel point Q are P1(x0,y0), P2(x0+1,y0), P3(x0,y0+1) and P4(x0+1,y0+1) respectively; and the pixel values of the four pixel points around the pixel point Q are (r1,g1,b1), (r2,g2,b2), (r3,g3,b3) and (r4,g4,b4) respectively.
12. The FPGA-based real-time image processing method of claim 9, wherein the pixel value (rs′, gs′, bs′) of the pixel point Q satisfies the following formulas:
rs′=(r1*(y0+1-ys)+r3*(ys-y0))*(x0+1-xs′)+(r2*(y0+1-ys)+r4*(ys-y0))*(xs′-x0) (13)
gs′=(g1*(y0+1-ys)+g3*(ys-y0))*(x0+1-xs′)+(g2*(y0+1-ys)+g4*(ys-y0))*(xs′-x0) (14)
bs′=(b1*(y0+1-ys)+b3*(ys-y0))*(x0+1-xs′)+(b2*(y0+1-ys)+b4*(ys-y0))*(xs′-x0) (15)
wherein the coordinates of the four pixel points around the pixel point Q are P1(x0,y0), P2(x0+1,y0), P3(x0,y0+1) and P4(x0+1,y0+1) respectively; and the pixel values of the four pixel points around the pixel point Q are (r1,g1,b1), (r2,g2,b2), (r3,g3,b3) and (r4,g4,b4) respectively.
13. The FPGA-based real-time image processing method of claim 5, wherein the second memory is an on-chip memory of the FPGA.
14. The real-time image processing method based on FPGA of claim 2, wherein step S2 further comprises:
step S23: while the (n+1)-th line of the one frame of the source image is being cached in the off-chip memory DDR, reading out from the second memory the pixel value of the pixel point of the target image needing to be output, to complete the pixel value output of one pixel point of the target image;
processing the next pixel (x +1, y) in the current row until the pixel point processing of the current row is completed;
step S24: repeating steps S21 to S23 to continue the processing for a new line, and continuing to read out the image data of the target image; and
step S25: after the one frame of the source image has been cached in the off-chip memory DDR, directly and simultaneously performing rotation, translation, scaling, clipping and flipping operations on the source image to obtain image data of the remaining portion of the target image, and reading out the image data of the remaining portion of the target image, so as to complete the output of the target image.
CN202110688723.5A 2021-06-21 2021-06-21 Real-time image processing method based on FPGA Pending CN113421184A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110688723.5A CN113421184A (en) 2021-06-21 2021-06-21 Real-time image processing method based on FPGA

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110688723.5A CN113421184A (en) 2021-06-21 2021-06-21 Real-time image processing method based on FPGA

Publications (1)

Publication Number Publication Date
CN113421184A true CN113421184A (en) 2021-09-21

Family

ID=77789713

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110688723.5A Pending CN113421184A (en) 2021-06-21 2021-06-21 Real-time image processing method based on FPGA

Country Status (1)

Country Link
CN (1) CN113421184A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023045168A1 (en) * 2021-09-22 2023-03-30 苏州浪潮智能科技有限公司 Image processing method and apparatus, and related assembly
US11930198B2 (en) 2021-09-22 2024-03-12 Inspur Suzhou Intelligent Technology Co., Ltd. Image processing method and apparatus, and related assembly
CN114449245A (en) * 2022-01-28 2022-05-06 上海瞳观智能科技有限公司 Real-time two-way video processing system and method based on programmable chip
CN114449170A (en) * 2022-01-28 2022-05-06 安徽爱观视觉科技有限公司 Real-time image processing system based on programmable chip
CN114449170B (en) * 2022-01-28 2024-02-13 安徽爱观视觉科技有限公司 Real-time image processing system based on programmable chip
CN114449245B (en) * 2022-01-28 2024-04-05 上海瞳观智能科技有限公司 Real-time two-way video processing system and method based on programmable chip

Similar Documents

Publication Publication Date Title
CN113421184A (en) Real-time image processing method based on FPGA
JP5172863B2 (en) Banana codec
CN109658337B (en) FPGA implementation method for real-time electronic despinning of images
US8766992B2 (en) Methods and apparatus for image processing at pixel rate
JP2007282245A (en) Image synthesizing apparatus and method
JP2005044098A (en) Image processor and image processing method
CN110493525A (en) Zoom image determines method and device, storage medium, terminal
CN104363385A (en) Line-oriented hardware implementing method for image fusion
CN110246081B (en) Image splicing method and device and readable storage medium
JP2004040260A (en) Pixel block data generator and pixel block data generating method
WO2024027583A1 (en) Image processing method and apparatus, and electronic device and readable storage medium
US10021299B2 (en) System and method for image stitching
KR20220066917A (en) Image dewarping system
JP4028306B2 (en) Digital imaging device
KR100934211B1 (en) How to create a panoramic image on a mobile device
CN109146793B (en) Pipelined image chroma format conversion scaling rotation superposition system
CN115601223B (en) Image preprocessing device, method and chip
US20130162881A1 (en) Imaging device
JP2017175181A (en) Image processing system and its control method and program
Blasinski et al. FPGA architecture for real-time barrel distortion correction of colour images
JP2004112579A (en) Resolution converting circuit, digital still camera using it, and digital video camera
Yang et al. FPGA Implementation of Image Super-Resolution Based on Bicubic Interpolation and CNN
CN113689335B (en) Image processing method and device, electronic equipment and computer readable storage medium
Xu et al. Study of a FPGA real-time multi-cameras cylindrical panorama video system with low latency and high performance
Luo et al. Improved LUT-based image warping for video cameras

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination