CN111818339A - Multi-core processing method of Webp image compression algorithm based on FPGA

Info

Publication number: CN111818339A
Application number: CN202010664400.8A
Authority: CN
Inventors: 栗梦祥
Original assignee: Fengyi Technology Shanghai Co ltd
Current assignee: Fengyi Technology Shanghai Co ltd
Priority date: 2020-07-10
Filing date: 2020-07-10
Publication date: 2020-10-23

Abstract

The invention relates to the technical field of image processing, and in particular to a multi-core processing method of a Webp image compression algorithm based on an FPGA (field-programmable gate array). In the method, the DDR (double data rate) on-chip cache sends an image to a color space conversion stage, where the image is converted from its three RGB (red, green and blue) channels into correspondingly analyzed YUV macroblocks; the YUV macroblocks are then analyzed, predicted and quantized to obtain a parameter matrix, and finally the parameter matrix is written back to the DDR on-chip cache. The invention implements WebP on an FPGA and adopts a CPU-style multi-core design in which multiple cores work together, pushing speed to the limit and enabling pipelined processing across pictures.

Description

Multi-core processing method of Webp image compression algorithm based on FPGA

Technical Field

The invention relates to the technical field of image processing, in particular to a multi-core processing method of a Webp image compression algorithm based on an FPGA.

Background

Among the many compression algorithms, WebP stands out. WebP is an image file format developed by Google that aims to speed up image loading and to save large amounts of server bandwidth and storage space. It achieves a better compression ratio than the JPEG (Joint Photographic Experts Group) format and supports both lossy and lossless compression; at the same quality, a WebP image is about 40% smaller than the corresponding JPEG image. Although WebP is relatively new, it is well supported alongside JPEG on the market today, which makes it practical to use; a good format without good compatibility would be hard to popularize and would increase the difficulty of use and operation. According to an analysis of current domestic browser market share and WebP compatibility, more than 50% of domestic users can experience WebP directly.

Image compression solutions currently on the market have low compression efficiency and insufficient precision, and produce files that are too large, which leads to slow network transmission and heavy storage pressure. Most WebP compression implementations are based on a CPU (central processing unit), which is costly and cannot make use of the CPU's idle time. Existing FPGA-based implementations are slow and inelegant in structure; they do not fully exploit the heterogeneous computing capability and high computation speed of the FPGA, so their compression efficiency is low. Implementations on the market also omit quantization, and their prediction results are not accurate enough.

Disclosure of Invention

In view of the above technical problems, the invention provides a multi-core processing method of a Webp image compression algorithm based on an FPGA. The invention implements WebP on the FPGA and adopts a CPU-style multi-core design in which multiple cores work together, pushing speed to the limit while also enabling pipelined processing across pictures.

A multi-core processing method of a Webp image compression algorithm based on an FPGA is characterized by comprising the following steps:

the DDR on-chip cache sends the image to a color space conversion stage; the image is converted from its three RGB channels into correspondingly analyzed YUV macroblocks; the YUV macroblocks are then analyzed, predicted and quantized to obtain a parameter matrix; and finally the parameter matrix is written back to the DDR on-chip cache.

The multi-core processing method of the FPGA-based Webp image compression algorithm is characterized in that analyzing, predicting and quantizing the YUV macroblocks yields, in addition to the parameter matrix, analysis parameters and an analysis center point.

The multi-core processing method of the FPGA-based Webp image compression algorithm is characterized in that the parameter matrix, the analysis parameters and the analysis center point are used in the quantization calculation; the quantized result is then scored, and finally encoded and written to a file.

The multi-core processing method of the FPGA-based Webp image compression algorithm is characterized in that the encoded file is written back to the DDR on-chip cache.

The multi-core processing method of the Webp image compression algorithm based on the FPGA is characterized in that the prediction and quantization step mainly comprises the following steps: prediction, DCT, quantization, inverse quantization, IDCT.

The multi-core processing method of the FPGA-based Webp image compression algorithm is characterized in that after the IDCT processing is finished, scoring is carried out, and the method then returns to the prediction step to form a closed loop.

The technical scheme has the following advantages or beneficial effects:

1) the multi-core structure is elegant and allows processing speed to be allocated flexibly and matched to demand;

2) the well-designed pipeline structure keeps the DSPs working at full load, giving high throughput and low latency and making maximum use of the heterogeneous computing characteristics of the FPGA;

3) the method can be deployed during the idle time of the CPU, making good use of spare CPU time without affecting normal work.

Drawings

The invention and its features, aspects and advantages will become more apparent from reading the following detailed description of non-limiting embodiments with reference to the accompanying drawings. Like reference symbols in the various drawings indicate like elements. The drawings are not necessarily to scale, emphasis instead being placed upon illustrating the principles of the invention.

FIG. 1 is a core architecture diagram of the present invention;

FIG. 2 is a flow diagram of an FPGA implementation of predictive quantization;

FIG. 3 is a diagram of the color space conversion calculation process and its evaluation.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

An FPGA offers high-speed computation at low power consumption and low cost, making it well suited to processing large, high-throughput workloads. It is a form of heterogeneous computing, and there is a trend toward combining CPUs and FPGAs. As shown in FIG. 1 and FIG. 2, the multi-core processing method of a Webp image compression algorithm based on an FPGA provided by the present invention can be summarized as follows: the DDR on-chip cache sends the image to a color space conversion stage; the image is converted from its three RGB channels into correspondingly analyzed YUV macroblocks (Y represents luminance and UV represents chrominance); the YUV macroblocks are then analyzed, predicted and quantized to obtain a parameter matrix; and finally the parameter matrix is written back to the DDR on-chip cache.
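The color space conversion and macroblock split summarized above can be illustrated with a minimal software sketch. The patent does not give the conversion coefficients or the macroblock size, so the full-range BT.601 coefficients and the 16x16 block size used here are assumptions, and the function names are hypothetical.

```python
def _clamp8(x):
    """Round to the nearest integer and clamp to the 8-bit range 0..255."""
    return max(0, min(255, int(round(x))))


def rgb_to_yuv(r, g, b):
    """Convert one 8-bit RGB pixel to YUV.

    Assumption: full-range BT.601 coefficients; the patent does not
    state which coefficients its FPGA design uses.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.168736 * r - 0.331264 * g + 0.5 * b + 128
    v = 0.5 * r - 0.418688 * g - 0.081312 * b + 128
    return _clamp8(y), _clamp8(u), _clamp8(v)


def split_into_macroblocks(plane, size=16):
    """Split a 2-D plane (list of rows) into size x size blocks, row-major.

    Assumption: the plane width and height are multiples of `size`.
    """
    height, width = len(plane), len(plane[0])
    blocks = []
    for top in range(0, height, size):
        for left in range(0, width, size):
            blocks.append([row[left:left + size]
                           for row in plane[top:top + size]])
    return blocks


def image_to_y_macroblocks(rgb_image, size=16):
    """Build the luminance (Y) plane of an RGB image and cut it into macroblocks."""
    y_plane = [[rgb_to_yuv(*pixel)[0] for pixel in row] for row in rgb_image]
    return split_into_macroblocks(y_plane, size)
```

For example, a 32x16 image yields two 16x16 luminance macroblocks. The U and V planes could be handled the same way, typically after chroma subsampling, which this sketch leaves out.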

In the technical scheme of the invention, analyzing, predicting and quantizing the YUV macroblocks yields a parameter matrix, analysis parameters and an analysis center point. These are used in the quantization calculation; the quantized result is then scored and finally encoded and written to a file, and the encoded file is written back to the DDR on-chip cache.

Preferably, the prediction and quantization step mainly includes prediction, DCT, quantization, inverse quantization and IDCT; after the IDCT processing is finished, scoring is performed, and the process then returns to the prediction step to form a closed loop.
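The closed loop described above (prediction, DCT, quantization, inverse quantization, IDCT, scoring, then back to prediction) can be sketched in software as follows. This is an illustration under stated assumptions, not the patented FPGA datapath: it uses a 4x4 block, a floating-point orthonormal DCT-II, a single uniform quantization step `q`, and the sum of squared errors as the score. None of these choices is specified in the patent, and all function names are hypothetical.

```python
import math

N = 4  # block size (assumption for illustration)


def _dct_matrix(n=N):
    """Orthonormal DCT-II basis matrix; row k is the k-th basis function."""
    return [[(math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n))
             * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
             for i in range(n)] for k in range(n)]


def _matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


C = _dct_matrix()
CT = [list(col) for col in zip(*C)]  # transpose of C


def dct2(block):
    """2-D DCT: C * X * C^T."""
    return _matmul(_matmul(C, block), CT)


def idct2(coeffs):
    """2-D inverse DCT: C^T * Y * C."""
    return _matmul(_matmul(CT, coeffs), C)


def encode_block(block, prediction, q):
    """One pass of the loop for a single candidate prediction."""
    residual = [[block[i][j] - prediction[i][j] for j in range(N)]
                for i in range(N)]
    coeffs = dct2(residual)                                # DCT
    levels = [[round(c / q) for c in row] for row in coeffs]    # quantize
    dequant = [[lv * q for lv in row] for row in levels]   # inverse quantize
    recon_res = idct2(dequant)                             # IDCT
    recon = [[prediction[i][j] + recon_res[i][j] for j in range(N)]
             for i in range(N)]
    # Score (assumption): sum of squared reconstruction errors.
    score = sum((block[i][j] - recon[i][j]) ** 2
                for i in range(N) for j in range(N))
    return levels, recon, score


def choose_prediction(block, candidates, q=8):
    """Closed loop: try each candidate prediction and keep the best score."""
    best = None
    for name, prediction in candidates.items():
        levels, _, score = encode_block(block, prediction, q)
        if best is None or score < best[1]:
            best = (name, score, levels)
    return best
```

Each iteration of `choose_prediction` corresponds to one trip around the loop: the score computed after the IDCT decides whether the next candidate prediction replaces the current best.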

In more detail, WebP compression mainly comprises the following steps. Color space conversion converts the conventional RGB color space into a luminance-chrominance color space (i.e., YUV). The subsequent analysis stage analyzes the converted image: each component is analyzed under different prediction modes (DC, TM, etc.), the results are statistically analyzed, and an analysis matrix for the image is finally obtained, namely a parameter matrix, analysis parameters and an analysis center point. These parameters are then used in the quantization calculation; the quantized result is scored and finally encoded and written to a file.
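The DC and TM prediction modes mentioned above can be illustrated with a short sketch. The formulas follow the usual VP8-style definitions (DC: rounded average of the neighbouring row above and column to the left; TM: left + above - top-left, clamped to 8 bits). How the patent's analysis stage scores the modes is not described, so the sum of absolute differences used here is an assumption, as are the function names.

```python
def dc_predict(above, left):
    """DC mode: every pixel is the rounded average of the above row and left column."""
    n = len(above)
    dc = (sum(above) + sum(left) + n) // (2 * n)
    return [[dc] * n for _ in range(n)]


def tm_predict(above, left, top_left):
    """TM mode: pixel (y, x) = left[y] + above[x] - top_left, clamped to 0..255."""
    return [[max(0, min(255, left[y] + above[x] - top_left))
             for x in range(len(above))]
            for y in range(len(left))]


def sad(block, prediction):
    """Sum of absolute differences between a block and its prediction."""
    return sum(abs(p - q)
               for row_b, row_p in zip(block, prediction)
               for p, q in zip(row_b, row_p))


def best_mode(block, above, left, top_left):
    """Pick the mode with the lowest SAD (scoring rule is an assumption)."""
    candidates = {
        "DC": dc_predict(above, left),
        "TM": tm_predict(above, left, top_left),
    }
    return min(candidates, key=lambda mode: sad(block, candidates[mode]))
```

A block with a horizontal gradient that continues the row above it is matched exactly by TM, while DC suits flat regions; collecting such per-block decisions over a whole image is one way the statistical analysis could be organized.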

As shown in FIG. 1, the technical scheme of the invention divides the whole algorithm into three main lines. The first main line moves the image from DDR to the on-chip cache, performs the color space conversion and writes the converted result back to DDR; at the same time, it pre-analyzes the converted data to obtain the parameter matrix, which is finally written back to DDR as well. FIG. 3 shows the calculation process of the color space conversion together with a detailed evaluation of its resource and cycle costs; this step is critical because it determines the throughput of the first main line.

FIG. 2 shows the overall architecture of the second main line. The core design concepts are again pipelining, time-division multiplexing and decoupling: pipelining ensures high throughput for the whole framework, and decoupling makes the whole design multi-core. The second main line determines the overall processing speed. If high-speed compression is required, the second main line can be replicated several times, which is equivalent to multiple threads working simultaneously without interfering with one another and greatly improves parallelism. If resources are sufficient, any part can be replicated, and the processing speed increases by a corresponding multiple.

The third main line performs the final encoding and file writing. It follows the same decoupled, multi-core design as the rest of the framework, can be replicated as required, and can work together with the FPGA during the idle time of the CPU.
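The idea of replicating the second main line can be shown with a software analogy. The sketch below is only a structural illustration: worker threads stand in for replicated FPGA cores, `predict_quantize` and `entropy_encode` are hypothetical placeholders for the second and third main lines, and Python threads will not reproduce the hardware speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def predict_quantize(block, q=8):
    """Placeholder for main line 2: predict with the block mean, quantize the residual."""
    flat = [p for row in block for p in row]
    mean = sum(flat) // len(flat)
    return mean, [round((p - mean) / q) for p in flat]


def entropy_encode(results):
    """Placeholder for main line 3: collect per-block results in their original order."""
    return [(mean, tuple(levels)) for mean, levels in results]


def run_pipeline(blocks, num_cores=1):
    """Distribute independent macroblocks over `num_cores` copies of main line 2.

    `pool.map` returns results in input order, so the output does not
    depend on how many copies are used.
    """
    with ThreadPoolExecutor(max_workers=num_cores) as pool:
        results = list(pool.map(predict_quantize, blocks))
    return entropy_encode(results)
```

Because the blocks are processed independently, `run_pipeline(blocks, 1)` and `run_pipeline(blocks, 4)` return the same result; only the degree of parallelism changes, which mirrors the decoupling the description relies on.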

In summary, because the three main lines are decoupled and designed as multiple cores, each main line and the framework as a whole are highly flexible and can be configured freely, ultimately achieving high throughput, multi-core operation and low latency.

Those skilled in the art will appreciate that variations can be implemented by combining the prior art with the above embodiments; the details are not described herein. Such variations do not affect the essence of the present invention.

The above description is of the preferred embodiment of the invention. It is to be understood that the invention is not limited to the particular embodiments described above, and that devices and structures not described in detail are to be understood as implemented in a manner common in the art. Using the methods and techniques disclosed above, those skilled in the art can make many possible variations and modifications to the disclosed embodiments, or modify them into equivalent embodiments, without departing from the scope or affecting the spirit of the invention. Therefore, any simple modification, equivalent change or adaptation made to the above embodiments according to the technical essence of the present invention still falls within the scope of protection of the technical solution of the present invention, provided that it does not depart from the content of that technical solution.

Claims

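1. A multi-core processing method of a Webp image compression algorithm based on an FPGA is characterized by comprising the following steps: the DDR on-chip cache sends the image to a color space conversion stage; the image is converted from its three RGB channels into correspondingly analyzed YUV macroblocks; the YUV macroblocks are then analyzed, predicted and quantized to obtain a parameter matrix; and finally the parameter matrix is written back to the DDR on-chip cache.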

2. The multi-core processing method of the FPGA-based Webp image compression algorithm according to claim 1, wherein analyzing, predicting and quantizing the YUV macroblocks yields, in addition to the parameter matrix, analysis parameters and an analysis center point.

3. The multi-core processing method of the FPGA-based Webp image compression algorithm as claimed in claim 2, wherein the parameter matrix, the analysis parameters and the analysis center point are used in the quantization calculation, and the quantized result is then scored and finally encoded and written to a file.

4. The multi-core processing method of the FPGA-based Webp image compression algorithm as claimed in claim 3, wherein the encoded file is written back to the DDR on-chip cache.

5. The multi-core processing method of the FPGA-based Webp image compression algorithm according to claim 1, wherein the step of predicting and quantizing mainly comprises: prediction, DCT, quantization, inverse quantization, IDCT.

6. The multi-core processing method of the FPGA-based Webp image compression algorithm as recited in claim 5, wherein after the IDCT processing is finished, scoring is performed, and the method then returns to the prediction step to form a closed loop.