CN104683806A

CN104683806A - High-speed FPGA implementation method for an MQ arithmetic encoder based on deep pipelining

Info

Publication number: CN104683806A
Application number: CN201510091224.2A
Authority: CN
Inventors: 陶宏江; 张柯; 金龙旭; 张然峰; 郝贤鹏
Original assignee: Changchun Institute of Optics Fine Mechanics and Physics of CAS
Current assignee: Changchun Institute of Optics Fine Mechanics and Physics of CAS
Priority date: 2015-02-28
Filing date: 2015-02-28
Publication date: 2015-06-03
Anticipated expiration: 2035-02-28
Also published as: CN104683806B

Abstract

The invention relates to a high-speed FPGA (Field Programmable Gate Array) implementation method for an MQ arithmetic encoder based on deep pipelining, and belongs to the fields of computers and digital image processing. To improve the execution speed of the MQ coding algorithm, the original four-stage pipeline structure is extended and its workload is distributed more evenly, yielding a new fast MQ arithmetic coding implementation based on a six-stage pipeline. Distributing the workload reasonably reduces the necessary serial work within a single pipeline stage; increasing the number of pipeline stages reduces the maximum work performed in any one stage. By analyzing the control signals of the dependency loop between the CX table update and the update of the arithmetic coding interval A, a new three-stage decomposition is obtained using a predictive prefetch technique and a multi-index-value analysis and selection technique; this forms the first three pipeline stages. A new register allocation scheme speeds up both the addition and the multiplexing, so that the overall execution speed of the MQ encoder is improved.

Description

High-speed FPGA implementation method for an MQ arithmetic encoder based on deep pipelining

Technical field

The present invention provides a high-speed implementation method for an MQ arithmetic encoder on an FPGA system, based on a multi-stage deep pipeline. It effectively increases the coding rate when the MQ coding algorithm of JPEG2000 is implemented on an FPGA, and belongs to the fields of computers and digital image processing.

Background

The JPEG2000 image compression algorithm is a new-generation still-image compression standard. It offers excellent compression performance and also supports selectable modes such as lossy compression, lossless compression, and specified-region compression, together with strong error resilience. Because of these properties, the JPEG2000 algorithm has been applied in a growing number of fields.

Despite these advantages, JPEG2000 has high algorithmic complexity. In particular, the control structure and the arithmetic units of its MQ arithmetic coding algorithm are both fairly complex and have strong serial dependencies between successive operations, which makes JPEG2000 encoding slow.

To improve the execution speed of MQ arithmetic coding, and thereby the overall speed of JPEG2000 encoding, existing FPGA and VLSI implementations of the MQ encoder all use pipelining. The commonly used implementation is a four-stage pipeline structure, described in detail in Michael Dyer, David Taubman, Saeid Nooshabadi, and Amit Kumar Gupta, "Concurrency Techniques for Arithmetic Coding in JPEG2000", IEEE Transactions on Circuits and Systems, 2006, 53(6). However, this structure does not decompose deeply enough the two strongly coupled dependency loops that limit the execution speed of the MQ encoder: the loop between the CX table update and the update of the arithmetic coding interval A (shown in Fig. 1), and the coded-data output loop (shown in Fig. 2). As a result, the first pipeline stage contains a two-level table lookup, and the fourth stage contains an 18-bit addition and a shift operation implemented with a 21:1 multiplexer. These operations take a long time and therefore reduce the operating speed of the MQ encoder.

Summary of the invention

To improve the execution speed of the MQ coding algorithm, the present invention provides a high-speed FPGA implementation method for an MQ arithmetic encoder based on deep pipelining, comprising the following steps:

Step 1: in the first pipeline stage, the input (CX, D) data pair is used to look up the corresponding MQ encoder probability table index and MPS value; this stage updates and maintains the CX lookup table during operation and performs preliminary handling of lookup/update conflicts when the same CX value is input several times in succession;

Step 2: in the second pipeline stage, the probability table index is used to look up the probability value; the stage first determines, according to whether the same CX value has been input two or three times in succession, whether the index is taken from the input of the first pipeline stage, the input of the third pipeline stage, the NLPS of the previous lookup result, or the NMPS of the previous lookup result;

Step 3: in the third pipeline stage, the current arithmetic coding interval A is repartitioned according to the probability value output by the second pipeline stage; A is set to A-Qe or Qe depending on whether the input data D equals MPS, and the normalization operation is performed;

Step 4: in the fourth pipeline stage, the value of the output register C is determined from the output of the third pipeline stage: in the MPS state, if A<Qe then C is not updated, otherwise C equals C+Qe; in the LPS state, if A<Qe then C equals C+Qe, otherwise C is not updated;

Step 5: in the fifth pipeline stage, the high-order part of register C is managed; according to the JPEG2000 standard, CH has variable length during data output, so CT counts the number of data bits in CH, and the updated value of CH and the output values of B0 and B1 are determined from CT and the shift count from the fourth pipeline stage;

Step 6: in the sixth pipeline stage, the last output byte B of the JPEG2000 standard is managed: if B equals 0xff, BOut0 is output directly as B; otherwise BOut0 is B plus the carry from the fifth pipeline stage; BOut1 and B are updated according to the number of bytes output by the fifth pipeline stage.

The inventive concept of the present invention can be summarized as follows:

1. By distributing the workload reasonably, the necessary serial work within a single pipeline stage is reduced;

2. By increasing the number of pipeline stages, the maximum work performed in each pipeline stage is reduced;

3. By analyzing the control signals of the dependency loop between the CX table update and the update of the arithmetic coding interval A, a new three-stage decomposition is obtained using the predictive prefetch technique and the multi-index-value analysis and selection technique; this forms the first three pipeline stages;

4. To address the low coding speed caused by the 18-bit addition and the 21:1 multiplexer in the data output stage of the existing four-stage pipeline algorithm, a new register allocation scheme is proposed that speeds up both the addition and the multiplexing, thereby improving the overall execution speed of the MQ encoder.

The beneficial effects of the invention are as follows. By analyzing the bottlenecks of existing MQ encoders and applying the predictive prefetch technique, the multi-index-value analysis and selection technique, and a new byte-data output scheme, this solution increases the number of pipeline stages, reduces the maximum execution time of each pipeline stage, and improves the overall execution speed of the MQ encoder. While raising the execution speed, it keeps resource utilization unchanged, which makes it suitable for FPGA-based coding systems with strict performance and resource-occupation requirements.

Brief description of the drawings

Fig. 1 is a flow chart of the dependency loop between the CX table update and the update of the arithmetic coding interval A.

Fig. 2 is a flow chart of the coded-data output dependency loop.

Fig. 3 is a structural block diagram of the six-stage pipeline implementation method.

Fig. 4 is the workflow diagram of the first pipeline stage.

Fig. 5 is the workflow diagram of the second pipeline stage.

Fig. 6 is the workflow diagram of the third pipeline stage.

Fig. 7 is the workflow diagram of the fourth pipeline stage.

Fig. 8 is the workflow diagram of the fifth pipeline stage.

Fig. 9 is the workflow diagram of the sixth pipeline stage.

Detailed description

The present invention is described in further detail below with reference to the accompanying drawings.

As shown in Fig. 3, the high-speed FPGA implementation method for an MQ arithmetic encoder based on deep pipelining of the present invention adopts a six-stage pipelined MQ encoder implementation and comprises the following steps:

Step 1: the first pipeline stage looks up the MQ encoder probability table index and MPS value for the input (CX, D) data in the CX table, and generates the data-conflict control signals for the case where the same CX value is input two or three times in succession.

In this pipeline stage, distributed RAM implements the lookup table from CX to the MQ encoder probability table index and MPS value. Register Cx1 holds the CX value input in the previous clock cycle; register Cx2 holds the CX value input one clock cycle before; register Cx3 holds the CX value input two clock cycles before. Register D1 holds the input data D; register Ni1 holds the probability table index output by the lookup; register Mps holds the MPS value output by the lookup; register MpsEq holds the result of comparing the input data value D with the MPS currently output by the lookup. Cx12E holds the result of comparing the current input CX with the value saved in Cx1; Cx13E holds the result of comparing the current input CX with the value saved in Cx2.

In the first pipeline stage, the probability table index and MPS value corresponding to each input CX are obtained by table lookup, and the CX data are saved in Cx1, Cx2 and Cx3. When the CX value equals Cx3, the CX table update value from the third pipeline stage is selected as the output and saved in Ni and Mps; when the CX value is not equal to Cx3, the CX table value output by the lookup is selected as the output and saved in Ni and Mps. The input data D and Mps are combined by a logical AND operation and the result is saved in MpsEq. The result of comparing CX with Cx1 is saved in Cx12E, and the result of comparing CX with Cx2 is saved in Cx13E. The implementation flow of this pipeline stage is shown in Fig. 4.
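The first-stage behaviour described above can be illustrated with a small software model. This is only a sketch under stated assumptions: the class name `Stage1`, the method `clock`, the argument `stage3_update`, and the reading of Cx1, Cx2 and Cx3 as a simple three-register delay chain are all hypothetical, and the table is initialised with placeholder values rather than the JPEG2000 initial states.

```python
class Stage1:
    """Behavioural model of the first pipeline stage (illustrative only)."""

    def __init__(self, num_contexts=19):
        # CX table: one (probability table index, MPS) pair per context.
        # Placeholder contents; real initial states come from the standard.
        self.table = [(0, 0)] * num_contexts
        # Delay chain of previously input CX values (assumed register timing).
        self.cx1 = self.cx2 = self.cx3 = None

    def clock(self, cx, stage3_update=None):
        """Process one input CX and return (index, mps, cx12e, cx13e)."""
        # Conflict flags: same CX as the values held in Cx1 and Cx2.
        cx12e = cx == self.cx1
        cx13e = cx == self.cx2
        # If CX matches Cx3, the table entry may be stale, so the value
        # forwarded from the third stage is used instead of the table read.
        if cx == self.cx3 and stage3_update is not None:
            index, mps = stage3_update
        else:
            index, mps = self.table[cx]
        # Advance the delay chain.
        self.cx3, self.cx2, self.cx1 = self.cx2, self.cx1, cx
        return index, mps, cx12e, cx13e


stage = Stage1()
stage.table[5] = (7, 1)
first = stage.clock(5)                         # plain table read
second = stage.clock(5)                        # same CX as previous cycle
third = stage.clock(5)                         # same CX three times in a row
fourth = stage.clock(5, stage3_update=(9, 0))  # forwarded value wins
```

The point of the forwarding path is that a table read issued while an update for the same context is still in flight further down the pipeline would otherwise return an outdated index.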

Step 2: the second pipeline stage uses the probability table index to look up the probability value.

In this pipeline stage, a ROM implements the probability table PET, which stores the standard JPEG2000 probability values. Register Pb holds the output probability value; register Ni2 holds the index used for the lookup; register Ni holds the predicted index; register Lz holds the left-shift count; register Mps2 holds the predicted Mps value; register MpsMatch2 holds the updated Mps comparison result; register MpsSelect2 holds the Mps value selection control information; register MED holds the MpsEq value of the previous clock cycle; register MED2 holds the MpsEq value from one clock cycle before.

The main task of the second pipeline stage is to predict the index used for the probability table lookup and, through the lookup, to output the probability value and the next index values of the index table. First, the corresponding signal value is selected as the ME result according to the contents of Cx12E and Cx13E. Then the selection value for the probability table lookup index is obtained from Cx12E, Cx13E and ME, and a multiplexer selects the corresponding index for the lookup that yields the probability value. The implementation flow of this pipeline stage is shown in Fig. 5.
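As a rough illustration of the index multiplexer followed by the ROM lookup, the following sketch selects one of the four candidate indices and reads the probability value together with the two next-index values. The function name `stage2_lookup` and the encoding of `select` are hypothetical; the text does not spell out how the select value is derived from Cx12E, Cx13E and ME, so it is treated here as an input. The `PET` list is only an excerpt holding what are believed to be the first six entries of the JPEG2000 probability table and should be checked against the standard before use.

```python
# Excerpt of the probability table: (Qe, NMPS, NLPS) per index.
# Assumed to match the first six entries of the JPEG2000 table.
PET = [
    (0x5601, 1, 1),
    (0x3401, 2, 6),
    (0x1801, 3, 9),
    (0x0AC1, 4, 12),
    (0x0521, 5, 29),
    (0x0221, 38, 33),
]


def stage2_lookup(select, ni_stage1, ni_stage3, last_nmps, last_nlps):
    """Pick the lookup index with a 4:1 multiplexer, then read the ROM.

    select (hypothetical encoding):
      0 -> index from the first stage
      1 -> index forwarded from the third stage
      2 -> NLPS of the previous lookup result
      3 -> NMPS of the previous lookup result
    Returns (index, qe, nmps, nlps).
    """
    candidates = (ni_stage1, ni_stage3, last_nlps, last_nmps)
    index = candidates[select]
    qe, nmps, nlps = PET[index]
    return index, qe, nmps, nlps


# No conflict: use the index delivered by the first stage.
print(stage2_lookup(0, ni_stage1=0, ni_stage3=3, last_nmps=1, last_nlps=1))
# Same CX repeated: predict the index from the previous lookup's NMPS.
print(stage2_lookup(3, ni_stage1=0, ni_stage3=3, last_nmps=2, last_nlps=6))
```

Keeping the next-index values of the previous lookup at hand is what lets the stage avoid waiting for the CX table to be rewritten when the same context arrives in consecutive cycles.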

Step 3: the third pipeline stage computes the coding interval A.

In this pipeline stage, register ARN holds the normalization signal of the coding interval A; register CxUpdate holds the update information used to update the CX table in the first pipeline stage; register MpsUpdate holds the update information used to update the MPS signals in the first and second pipeline stages; register Lz3 holds the left-shift count; register Pb3 holds the output probability value; register CAS holds the output selection.

First, according to the current pipeline state, the stage decides whether to use the probability value output by the second pipeline stage or the previous probability value. The probability value is then subtracted from the coding interval A to repartition the current arithmetic coding interval. Finally, A is set to A-Qe or Qe depending on whether the input data D equals MPS, and the normalization operation is performed. The implementation flow of this pipeline stage is shown in Fig. 6.
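The interval update and normalization can be sketched in software as follows. The text only names the two candidates A-Qe and Qe; the additional comparison of A-Qe against Qe (the conditional exchange) and the rule that A is renormalized until bit 15 is set are taken from the conventional MQ coder procedure and are assumptions here. The function name `update_interval` and its return tuple are hypothetical.

```python
def update_interval(a, qe, is_mps):
    """Repartition the 16-bit interval A and renormalize it.

    Returns (new_a, shift, add_qe_to_c), where shift is the number of
    left shifts applied and add_qe_to_c tells the next stage whether
    Qe must be added to the code register C.
    """
    a_sub = a - qe
    swap = a_sub < qe  # conditional exchange (assumed from the MQ coder)
    if is_mps:
        if a_sub & 0x8000:
            # Interval still normalized: no shift needed.
            return a_sub, 0, True
        new_a = qe if swap else a_sub
        add = not swap
    else:
        new_a = a_sub if swap else qe
        add = swap
    # Normalization: shift left until bit 15 is set, counting the shifts.
    shift = 0
    while not (new_a & 0x8000):
        new_a <<= 1
        shift += 1
    return new_a, shift, add


# MPS and LPS coded from the same starting interval.
print([hex(v) if i == 0 else v
       for i, v in enumerate(update_interval(0x8000, 0x5601, True))])
print([hex(v) if i == 0 else v
       for i, v in enumerate(update_interval(0x8000, 0x5601, False))])
```

The shift count returned here corresponds to the left-shift count that the text carries forward in registers such as Lz3, since later stages need it to shift C by the same amount.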

Step 4: the fourth pipeline stage determines the value of the output register C from the output of the third pipeline stage.

Register C holds the temporary output result; register BPC holds the output carry bit; register BPO0 holds the high byte of the carry output; register BPO1 holds the low byte of the carry output; register CCRA holds the right-aligned output data; Lz4 holds the left-shift count.

Unlike the four-stage pipeline structure of the MQ algorithm, which uses a 16-bit register C, the structure of this patent uses a 19-bit register C. The MQ algorithm uses a 28-bit register C: its low 16 bits have a fixed working length and are used for the addition with the probability value Pb, its high 9 bits have a variable length and are used for the output operation, and 3 bits of data remain. The update rule applied in this stage is: in the MPS state, if A<Pb then C is not updated, otherwise C equals C+Pb; in the LPS state, if A<Pb then C equals C+Pb, otherwise C is not updated. The implementation flow of this pipeline stage is shown in Fig. 7.
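The update rule for C can be written as a short function. This is a sketch, not the patented circuit: the name `stage4_update_c` is hypothetical, `a` is assumed to be the interval value that the rule compares against the probability value, and only the fixed-length low 16 bits of C are modelled, with the carry out of bit 15 returned separately.

```python
def stage4_update_c(c_low, pb, a, is_mps):
    """Apply the C update rule to the low 16 bits of C.

    MPS state: C is unchanged if a < pb, otherwise C = C + pb.
    LPS state: C = C + pb if a < pb, otherwise C is unchanged.
    Returns (new_c_low, carry).
    """
    a_lt_pb = a < pb
    add = (not a_lt_pb) if is_mps else a_lt_pb
    if not add:
        return c_low, 0
    total = c_low + pb
    # Keep 16 bits and report the carry separately so that a later
    # stage can propagate it into the output bytes.
    return total & 0xFFFF, total >> 16


print(stage4_update_c(0x1234, 0x5601, 0x29FF, is_mps=True))   # unchanged
print(stage4_update_c(0x1234, 0x5601, 0x29FF, is_mps=False))  # C + Pb
print(stage4_update_c(0xFFFF, 0x0001, 0x9000, is_mps=True))   # carry out
```

Separating the carry from the sum is one way to keep the adder short; the carry can then be resolved where the output bytes are formed.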

Step 5: the fifth pipeline stage manages the high-order part CH[7:0] of register C.

Register Bc holds the output result; register Len holds the output length; register CH holds the high-order data of C; register CT holds the number of valid bits in CH.

According to the JPEG2000 standard, the length of CH varies between 0 and 7 during data output, so CT counts the number of data bits in CH. The updated value of CH and the output value of Bc are determined from CT and the shift count from the fourth pipeline stage, and the number of bits retained in CH is used to update the CT counter. The implementation flow of this pipeline stage is shown in Fig. 8.
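The bookkeeping of CH and CT can be pictured as a variable-length bit accumulator. The sketch below is a simplification under stated assumptions: the function name `stage5_accumulate` and the argument `new_bits` are hypothetical, carry propagation is left out, and the bit stuffing that JPEG2000 applies after a 0xFF byte is ignored.

```python
def stage5_accumulate(ch, ct, new_bits, shift):
    """Append `shift` new bits to the `ct` valid bits held in `ch`.

    Whole bytes are emitted as soon as eight bits are available.
    Returns (new_ch, new_ct, out_bytes) with new_ct in the range 0..7.
    """
    # Concatenate the retained bits with the bits shifted out of C.
    acc = (ch << shift) | (new_bits & ((1 << shift) - 1))
    total = ct + shift
    out_bytes = []
    while total >= 8:
        total -= 8
        out_bytes.append((acc >> total) & 0xFF)
    # Keep only the bits that did not fill a complete byte.
    acc &= (1 << total) - 1
    return acc, total, out_bytes


print(stage5_accumulate(0b1, 1, 0b10, 2))        # still fewer than 8 bits
print(stage5_accumulate(0b101, 3, 0b11011, 5))   # exactly one byte
print(stage5_accumulate(0x7F, 7, 0x1FF, 9))      # two bytes at once
```

Because up to two bytes can complete in a single step, a hardware version needs two byte outputs and a length indication, which matches the roles the text assigns to the output result and output length registers.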

Step 6: the sixth pipeline stage manages the last output byte B of the JPEG2000 standard.

Register B holds the temporary data; register BOut0 holds the high-order output; register BOut1 holds the low-order output.

If B equals 0xff, BOut0 is output directly as B; otherwise BOut0 is B plus the carry bit from the fifth pipeline stage. BOut1 and B are updated according to the number of bytes output by the fifth pipeline stage. The implementation flow of this pipeline stage is shown in Fig. 9.
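The rule for BOut0 is small enough to state directly in code. The function name `stage6_bout0` is hypothetical, and the update of BOut1 and B is omitted because the text does not give its details.

```python
def stage6_bout0(b, carry):
    """Return BOut0 for the pending byte B and the incoming carry bit.

    A byte equal to 0xFF is output unchanged; any other byte absorbs
    the carry from the fifth stage.
    """
    if b == 0xFF:
        return b
    # b is at most 0xFE here, so adding a 1-bit carry cannot overflow.
    return b + carry


print(hex(stage6_bout0(0xFF, 1)))  # 0xff passes through
print(hex(stage6_bout0(0x12, 1)))  # carry absorbed
print(hex(stage6_bout0(0x12, 0)))  # no carry
```

Holding back one byte in B is what makes this possible: a carry produced later can still be added before the byte leaves the encoder.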

Table 1 compares the speed and resource usage of the present invention with those of a four-stage pipeline implementation on the same FPGA model.

Table 1:

Claims

1. A high-speed FPGA implementation method for an MQ arithmetic encoder based on deep pipelining, characterized in that it comprises the following steps: