CN104683806B

CN104683806B - High-speed FPGA implementation method for an MQ arithmetic encoder based on deep pipelining

Info

Publication number: CN104683806B
Application number: CN201510091224.2A
Authority: CN
Inventors: 陶宏江; 张柯; 金龙旭; 张然峰; 郝贤鹏
Original assignee: Changchun Institute of Optics Fine Mechanics and Physics of CAS
Current assignee: Changchun Institute of Optics Fine Mechanics and Physics of CAS
Priority date: 2015-02-28
Filing date: 2015-02-28
Publication date: 2017-12-26
Anticipated expiration: 2035-02-28
Also published as: CN104683806A

Abstract

A high-speed FPGA implementation method for an MQ arithmetic encoder based on deep pipelining, belonging to the field of computers and digital image processing. To raise the execution speed of the MQ coding algorithm, the method starts from the existing four-stage pipeline structure and, by distributing the workload more evenly and extending the pipeline, proposes a new high-speed MQ arithmetic coding implementation based on a six-stage pipeline. Distributing the workload sensibly reduces the amount of work that must be done serially within any one pipeline stage. Increasing the number of pipeline stages reduces the maximum workload of each stage. Analysing the control signals of the dependency loop between the CX table update and the update of the arithmetic coding interval A yields a new three-stage decomposition (the first three pipeline stages) built on a look-ahead predictive fetch technique and a multi-index-value analysis-and-selection technique. A new register allocation scheme speeds up both the addition and the multiplexing, thereby raising the overall execution speed of the MQ encoder.

Description

High-speed FPGA implementation method for an MQ arithmetic encoder based on deep pipelining

Technical field

The present invention proposes a high-speed implementation method, based on a multi-stage deep pipeline, for an MQ arithmetic encoder in an FPGA system. It effectively raises the coding speed when the MQ coding algorithm of JPEG2000 is implemented on an FPGA, and belongs to the field of computers and digital image processing.

Background art

JPEG2000 is a new-generation still-image compression standard. It offers excellent compression performance and supports options such as lossy compression, lossless compression and region-of-interest compression, together with strong resilience to bit errors. Because of these properties, JPEG2000 is being applied in more and more fields.

Despite these advantages, JPEG2000 has high algorithmic complexity. In particular, the control structure and the arithmetic units of the MQ arithmetic coding algorithm specific to JPEG2000 are both fairly complex and show strong serial behaviour and dependence between consecutive operations, which makes JPEG2000 encoding slow.

To raise the execution speed of MQ arithmetic coding, and with it the overall speed of JPEG2000 encoding, existing FPGA and VLSI implementations of the MQ encoder all use a pipelined organisation. The pipeline most commonly used today is a four-stage structure, described in detail in Michael Dyer, David Taubman, Saeid Nooshabadi, and Amit Kumar Gupta, "Concurrency Techniques for Arithmetic Coding in JPEG2000," IEEE Transactions on Circuits and Systems [J], 2006, 53(6). That structure, however, does not decompose deeply enough the two strong dependency loops that limit the execution speed of the MQ encoder: the loop between the CX table update and the update of the arithmetic coding interval A (Fig. 1), and the coded-data output loop (Fig. 2). As a result, the first pipeline stage contains a two-level table lookup, and the fourth stage contains an 18-bit addition and a shift implemented with a 21:1 MUX. These operations take a long time and lower the operating speed of the MQ encoder.

Summary of the invention

To raise the execution speed of the MQ coding algorithm, the high-speed FPGA implementation method of the present invention for an MQ arithmetic encoder based on deep pipelining comprises the following steps:

Step 1: in the first pipeline stage, look up the input (CX, D) pair to find the corresponding MQ encoder probability-table index and MPS value, update and maintain the CX lookup table during operation, and perform preliminary handling of the update/lookup conflict that arises when the same CX value is input several times in succession;

Step 2: in the second pipeline stage, perform the lookup from probability-table index to probability value. First, depending on whether the same CX value has been input two or three times in succession, the index used for the probability lookup is chosen from among the input from the first pipeline stage, the input from the third pipeline stage, the NMPS of the previous lookup result, and the NLPS of the previous lookup result;

Step 3: in the third pipeline stage, re-partition the current arithmetic coding interval A according to the probability value output by the second pipeline stage: depending on whether the input data D equals MPS, A is set to A-Qe or to Qe, and the normalization operation is performed;

Step 4: in the fourth pipeline stage, determine the value of the output register C from the output of the third pipeline stage. In the MPS state, C is not updated if A<Qe, and otherwise C becomes C+Qe; in the LPS state, C becomes C+Qe if A<Qe, and otherwise C is not updated;

Step 5: in the fifth pipeline stage, manage the high-order part CH of register C. According to the JPEG2000 standard, the length of CH varies during data output; CT counts the number of data bits in CH, and CT together with the shift count from the fourth pipeline stage determines the updated value of CH and the output values B0 and B1;

Step 6: in the sixth pipeline stage, manage the last output byte B defined by the JPEG2000 standard. If B equals 0xff, BOut0 is output directly as B; otherwise BOut0 is B plus the carry from the fifth pipeline stage. At the same time, BOut1 and B are updated according to the number of bytes output by the fifth pipeline stage.

The inventive concept of the present invention can be summarized as follows:

1. Distributing the workload sensibly reduces the amount of work that must be done serially within any one pipeline stage.

2. Increasing the number of pipeline stages reduces the maximum workload of each stage.

3. Analysing the control signals of the dependency loop between the CX table update and the update of the arithmetic coding interval A yields a new three-stage decomposition, namely the first three pipeline stages, built on the look-ahead predictive fetch technique and the multi-index-value analysis-and-selection technique.

4. In the existing four-stage pipeline algorithm, the data output stage is slow because of its 18-bit addition and 21:1 MUX. A new register allocation scheme is proposed that speeds up both the addition and the multiplexing, thereby raising the overall execution speed of the MQ encoder.
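The reasoning behind points 1 and 2 can be illustrated with a small calculation: in a synchronous pipeline the clock period cannot be shorter than the delay of the slowest stage, so splitting the slowest stages raises the attainable clock frequency. The sketch below is in Python; the function name and every stage delay are hypothetical values chosen for illustration and are not measurements from this patent.

```python
def max_clock_mhz(stage_delays_ns):
    """Upper bound on the clock frequency (MHz) set by the slowest pipeline stage."""
    return 1000.0 / max(stage_delays_ns)


# Hypothetical combinational delays per stage, in nanoseconds (illustrative only).
four_stage = [10.0, 5.0, 6.0, 10.0]           # long first and last stages
six_stage = [5.0, 5.0, 6.0, 5.0, 5.0, 5.0]    # the long stages have been split

print("four-stage bound:", max_clock_mhz(four_stage), "MHz")
print("six-stage bound: ", max_clock_mhz(six_stage), "MHz")
```

With these made-up numbers the bound is set by the 10 ns stages in the first case and by the 6 ns stage in the second, which is the effect the deeper pipeline aims for.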

The beneficial effects of the invention are as follows. By analysing the bottlenecks of existing MQ encoders, this scheme uses the look-ahead predictive fetch technique, the multi-index-value analysis-and-selection technique and a new byte-data output scheme to increase the number of pipeline stages and reduce the maximum execution time of each stage, which raises the overall execution speed of the MQ encoder. The speed gain is obtained without changing resource utilization, so the scheme suits FPGA-based coding systems with strict requirements on performance and resource usage.

Brief description of the drawings

Fig. 1 is a flow chart of the dependency loop between the CX table update and the update of the arithmetic coding interval A.

Fig. 2 is a flow chart of the coded-data output dependency loop.

Fig. 3 is a structural block diagram of the six-stage pipeline implementation method.

Fig. 4 is a workflow diagram of the first pipeline stage.

Fig. 5 is a workflow diagram of the second pipeline stage.

Fig. 6 is a workflow diagram of the third pipeline stage.

Fig. 7 is a workflow diagram of the fourth pipeline stage.

Fig. 8 is a workflow diagram of the fifth pipeline stage.

Fig. 9 is a workflow diagram of the sixth pipeline stage.

Detailed description of the embodiment

The present invention is described in further detail below with reference to the accompanying drawings.

As shown in Fig. 3, the high-speed FPGA implementation method of the present invention for an MQ arithmetic encoder based on deep pipelining uses a six-stage pipelined MQ encoder and comprises the following steps:

Step 1: the first pipeline stage uses the input (CX, D) pair to look up the corresponding MQ encoder probability-table index and MPS value in the CX table, and generates the data-conflict control signals for the case where the same CX value is input two or three times in succession.

In this pipeline stage, the lookup table from CX to the MQ encoder probability-table index and MPS value is implemented in distributed RAM. Register Cx1 holds the CX value input in the previous clock cycle; register Cx2 holds the CX value input one clock cycle earlier; register Cx3 holds the CX value input two clock cycles earlier; register D1 holds the input data D; register Ni1 holds the probability-table index output by the lookup; register Mps holds the MPS value output by the lookup; register MpsEq holds the result of comparing the input data D with the MPS value output by the current lookup; Cx12E holds the result of comparing the current input CX with the value saved in Cx1; and Cx13E holds the result of comparing the current input CX with the value saved in Cx2.

In the first pipeline stage, the probability-table index and MPS value are obtained by looking up each input CX, and the CX value is saved into Cx1, Cx2 and Cx3. When the CX value equals Cx3, the CX-table update value from the third pipeline stage is selected as the output and stored in Ni and Mps; when the CX value does not equal Cx3, the value output by the CX-table lookup is selected as the output and stored in Ni and Mps. A logical AND of the input data D and Mps is stored in MpsEq; the result of comparing CX with Cx1 is stored in Cx12E; and the result of comparing CX with Cx2 is stored in Cx13E. The detailed flow of this stage is shown in Fig. 4.
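The first-stage behaviour can be sketched as a cycle-level software model. This is a minimal Python sketch under stated assumptions, not the patented circuit: the class name `Stage1`, the `clock` method, the `NO_CX` marker and the `s3_update` argument are hypothetical; Cx1, Cx2 and Cx3 are modelled as a simple shift chain; the third-stage update is assumed to be written back to the table entry addressed by Cx3; and MpsEq is modelled as an equality test, following its description as a comparison result. The table size of 19 is the number of contexts used by JPEG2000.

```python
from dataclasses import dataclass

NO_CX = -1  # hypothetical marker meaning "no context stored yet"


@dataclass
class Stage1:
    table: list          # table[cx] = (probability-table index, MPS)
    cx1: int = NO_CX     # delayed copies of the input context (assumed shift chain)
    cx2: int = NO_CX
    cx3: int = NO_CX

    def clock(self, cx, d, s3_update=None):
        """Model one clock edge; s3_update is the (index, MPS) pair fed back
        from the third stage, or None when there is no update."""
        if s3_update is not None and cx == self.cx3:
            ni, mps = s3_update           # bypass the stale table entry
        else:
            ni, mps = self.table[cx]      # ordinary CX-table lookup
        if s3_update is not None and self.cx3 != NO_CX:
            self.table[self.cx3] = s3_update   # assumed write-back target
        out = {
            "ni": ni,
            "mps": mps,
            "mps_eq": d == mps,           # assumption: equality comparison
            "cx12e": cx == self.cx1,      # same context as stored in Cx1
            "cx13e": cx == self.cx2,      # same context as stored in Cx2
        }
        self.cx1, self.cx2, self.cx3 = cx, self.cx1, self.cx2
        return out


stage = Stage1(table=[(0, 0)] * 19)
for cx, d in [(5, 1), (5, 0), (5, 0)]:
    print(stage.clock(cx, d))
print(stage.clock(5, 0, s3_update=(3, 1)))   # update forwarded because CX == Cx3
```

The point of the sketch is the bypass: when the incoming context matches the one whose update is arriving from the third stage, the fresh value is used instead of the table contents, which is how the update/lookup conflict is avoided.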

Step 2: the second pipeline stage performs the lookup from probability-table index to probability value.

In this pipeline stage, the probability table PET is implemented in ROM and holds the JPEG2000 standard probability values. Register Pb holds the output probability value; register Ni2 holds the index used for the lookup; register Ni holds the predicted index; register Lz holds the left-shift count; register Mps2 holds the predicted Mps value; register MpsMatch2 holds the updated value of the Mps comparison result; register MpsSelect2 holds the Mps value selection control information; register MED holds the MpsEq value of the previous clock cycle; and register MED2 holds the MpsEq value one clock cycle earlier.

The main task of the second pipeline stage is to predict the index used for the probability-table lookup and, through the lookup, to output the probability value and the next index values for that index. First, the signal value selected according to the contents of Cx12E and Cx13E is taken as the ME result; then Cx12E, Cx13E and ME together yield the select value for the probability-table lookup index, and a MUX selects the corresponding index, which is looked up to obtain the probability value. The detailed flow of this stage is shown in Fig. 5.
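The index selection can be pictured as a four-way multiplexer. The text names the four candidates and the three control signals but does not give the truth table, so the priority used in the Python sketch below is an assumption made only for illustration, and the function and argument names are hypothetical: a match with the immediately preceding context takes the NMPS or NLPS of the previous lookup according to ME, a match with the context before that takes the value fed back from the third stage, and otherwise the first-stage index is used.

```python
def select_lookup_index(cx12e, cx13e, me,
                        idx_stage1, idx_stage3, nmps_last, nlps_last):
    """Four-way select of the probability-table index (assumed priority)."""
    if cx12e:
        # Same context as the previous symbol: predict from the last lookup.
        return nmps_last if me else nlps_last
    if cx13e:
        # Same context two symbols back: take the third-stage feedback.
        return idx_stage3
    # No conflict: the first-stage table lookup is already up to date.
    return idx_stage1


print(select_lookup_index(False, False, True, 4, 9, 5, 29))
print(select_lookup_index(True, False, False, 4, 9, 5, 29))
```

Whatever the exact truth table is, the structure stays the same: the select value is computed from signals registered in the previous stage, so only one multiplexer sits in front of the ROM lookup.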

Step 3: the third pipeline stage computes the coding interval A.

In this pipeline stage, register ARN holds the normalization signal for the coding interval A; register CxUpdate holds the update information for the CX table in the first pipeline stage; register MpsUpdate holds the update information for the mps signals in the first and second pipeline stages; register Lz3 holds the left-shift count; register Pb3 holds the output probability value; and register CAS holds the output selection.

First, according to the current pipeline state, the stage decides whether to use the probability value output by the second pipeline stage or the previous probability value. It then subtracts the probability value from the coding interval A to re-partition the current arithmetic coding interval A. Finally, depending on whether the input data D equals MPS, A is set to A-Qe or to Qe and the normalization operation is performed. The flow of this stage is shown in Fig. 6.
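The interval update and normalization can be sketched in Python as follows. The sketch follows the encoder procedure of the standard JPEG2000 MQ coder, including its conditional exchange (the A<Qe test that the fourth-stage rule also relies on), rather than only the basic selection stated above; the function name and the returned tuple are hypothetical, and the choice between the current and the previous probability value is not modelled.

```python
def stage3_interval(a, qe, mps_eq):
    """Return (normalized A, left-shift count, A<Qe flag) for one decision.

    a      -- current 16-bit interval, 0x8000 <= a <= 0xFFFF
    qe     -- probability estimate of the less probable symbol
    mps_eq -- True when the input bit D equals MPS
    """
    a_sub = a - qe
    a_lt_qe = a_sub < qe                 # conditional-exchange test
    if mps_eq:
        a_new = qe if a_lt_qe else a_sub
    else:
        a_new = a_sub if a_lt_qe else qe
    lz = 0
    while a_new < 0x8000:                # renormalize: shift until bit 15 is set
        a_new <<= 1
        lz += 1
    return a_new, lz, a_lt_qe


a, lz, flag = stage3_interval(0x8000, 0x5601, mps_eq=True)
print(hex(a), lz, flag)
```

In hardware the loop would be replaced by a leading-zero count followed by a single shift, which is why the shift count is kept in its own register and passed down the pipeline.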

Step 4: in the fourth pipeline stage, the value of the output register C is determined from the output of the third pipeline stage.

Register C temporarily holds the output result; register BPC holds the output carry bit; register BPO0 holds the high byte of the carry output; register BPO1 holds the low byte of the carry output; register CCRA holds the right-aligned output data; and Lz4 holds the left-shift count.

Unlike the four-stage pipeline structure of the MQ algorithm, which uses a 16-bit register C, the structure of this patent uses a 19-bit register C. The MQ algorithm uses a 28-bit register C: its low 16 bits, which are added to the probability value Pb, have a fixed working length; its high 9 bits, which are used for the output operation, have a variable length; and 3 bits remain in between. In the MPS state, C is not updated when A<Pb and otherwise becomes C+Pb; in the LPS state, C becomes C+Pb when A<Pb and otherwise is not updated. The flow of this stage is shown in Fig. 7.
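The update rule for C can be sketched in Python as below. Several points are assumptions rather than statements from the text: A in the test A<Pb is taken to be the interval after the probability has been subtracted, as in the standard MQ coder; the 19-bit register is read as the low 16 bits plus the 3 middle bits; a carry out of bit 18 and the bits shifted out during normalization are returned separately for the next stage; and all names are hypothetical.

```python
MASK19 = (1 << 19) - 1


def stage4_update_c(c, pb, is_mps, a_lt_pb, lz):
    """Return (new 19-bit C, carry out of bit 18, bits shifted out of the top)."""
    # MPS state: add Pb unless A < Pb.  LPS state: add Pb only when A < Pb.
    add_pb = (not a_lt_pb) if is_mps else a_lt_pb
    total = c + (pb if add_pb else 0)
    carry = total >> 19                      # carry into the high-order part
    shifted = (total & MASK19) << lz         # normalization shift
    bits_out = shifted >> 19                 # lz bits handed to the next stage
    return shifted & MASK19, carry, bits_out


print(stage4_update_c(0x00100, 0x5601, is_mps=True, a_lt_pb=False, lz=0))
print(stage4_update_c(0x7FFFF, 0x0001, is_mps=False, a_lt_pb=True, lz=2))
```

Keeping the adder to 19 bits means the carry chain no longer runs through the variable-length output bits, which is the motivation the text gives for the narrower register.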

Step 5: the fifth pipeline stage manages the high-order part CH[7:0] of register C.

Register Bc holds the output result; register Len holds the output length; register CH holds the high-order data of C; and register CT holds the number of valid bits in CH.

According to the JPEG2000 standard, the length of CH varies between 0 and 7 bits during data output. CT counts the number of data bits in CH; CT together with the shift count from the fourth pipeline stage determines the updated value of CH and the output value of Bc, and the CT counter is updated with the number of bits retained in CH. The flow of this stage is shown in Fig. 8.
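The bookkeeping of CH and CT can be sketched in Python as a bit accumulator. This is a deliberately simplified illustration: it only shows how the bit count and the incoming shift count decide how many whole bytes leave the stage and how many bits stay behind, and it omits the carry handling and the bit stuffing after a 0xff byte that JPEG2000 requires. The function name and arguments are hypothetical.

```python
def stage5_collect(ch, ct, bits_in, lz):
    """Append lz new bits to the ct bits held in CH and emit whole bytes.

    ch      -- bits currently retained (0 <= ct <= 7 of them are valid)
    bits_in -- the lz bits arriving from the previous stage
    Returns (new CH, new CT, list of output bytes).
    """
    acc = (ch << lz) | bits_in
    total = ct + lz
    out = []
    while total >= 8:                    # at most two bytes when lz <= 15
        total -= 8
        out.append((acc >> total) & 0xFF)
    ch_new = acc & ((1 << total) - 1)    # bits that stay in CH
    return ch_new, total, out


print(stage5_collect(0b101, 3, 0b11111, 5))
print(stage5_collect(0x7F, 7, 0x7FFF, 15))
```

Because at most 7 bits are retained and the shift count of a 16-bit interval is at most 15, no more than two bytes can be produced per cycle, which matches the two byte outputs and the byte count passed to the last stage.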

Step 6: the sixth pipeline stage manages the last output byte B defined by the JPEG2000 standard.

Register B holds the temporary data; register BOut0 holds the high-order output; and register BOut1 holds the low-order output.

If B equals 0xff, BOut0 is output directly as B; otherwise BOut0 is B plus the carry bit from the fifth pipeline stage. At the same time, BOut1 and B are updated according to the number of bytes output by the fifth pipeline stage. The flow of this stage is shown in Fig. 9.
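The handling of the pending byte B can be sketched in Python as follows. The rule for BOut0 is taken from the text; how B and BOut1 are updated for zero, one or two incoming bytes is not spelled out, so the mapping below is an assumption, and the function name and return convention are hypothetical.

```python
def stage6_output(b, carry, new_bytes):
    """Return (next B, BOut0, BOut1); an output is None when nothing is emitted.

    b         -- pending byte held in register B
    carry     -- carry bit (0 or 1) from the previous stage
    new_bytes -- list of 0, 1 or 2 bytes delivered by the previous stage
    """
    # A byte equal to 0xff passes unchanged; otherwise it absorbs the carry.
    head = b if b == 0xFF else b + carry
    if len(new_bytes) == 0:
        return head, None, None              # nothing leaves; B keeps the carry
    if len(new_bytes) == 1:
        return new_bytes[0], head, None      # old B leaves, new byte is held
    return new_bytes[1], head, new_bytes[0]  # two bytes leave, last one is held


print(stage6_output(0x12, 1, [0x34, 0x56]))
print(stage6_output(0xFF, 1, [0x34]))
```

Holding back one byte in B is what allows a late carry to be added without revisiting bytes that have already been written out.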

Table 1 compares the speed and resource usage of the present invention with those of a four-stage-pipeline implementation on the same FPGA model.

Table 1：

Claims

1. A high-speed FPGA implementation method for an MQ arithmetic encoder based on deep pipelining, characterized in that it comprises the following steps:

Step 1: in the first pipeline stage, look up the input (CX, D) pair to find the corresponding MQ encoder probability-table index and MPS value, update and maintain the CX lookup table during operation, and perform preliminary handling of the update/lookup conflict that arises when the same CX value is input several times in succession;

Step 2: in the second pipeline stage, perform the lookup from probability-table index to probability value. First, depending on whether the same CX value has been input two or three times in succession, the index used for the probability lookup is chosen from among the input from the first pipeline stage, the input from the third pipeline stage, the NMPS of the previous lookup result, and the NLPS of the previous lookup result;

Step 3: in the third pipeline stage, re-partition the current arithmetic coding interval A according to the probability value output by the second pipeline stage: depending on whether the input data D equals MPS, A is set to A-Qe or to Qe, and the normalization operation is performed;

Step 4: in the fourth pipeline stage, determine the value of the output register C from the output of the third pipeline stage. In the MPS state, C is not updated if A<Qe, and otherwise C becomes C+Qe; in the LPS state, C becomes C+Qe if A<Qe, and otherwise C is not updated;

Step 5: in the fifth pipeline stage, manage the high-order part CH of register C. According to the JPEG2000 standard, the length of CH varies during data output; register CT counts the number of data bits in the high-order part CH of register C, and register CT together with the shift count from the fourth pipeline stage determines the updated value of CH and the output value of register Bc;

Step 6: in the sixth pipeline stage, manage the last output byte B defined by the JPEG2000 standard. Register B holds the temporary data; register BOut0 holds the high-order output; register BOut1 holds the low-order output. If B equals 0xff, BOut0 is output directly as B; otherwise BOut0 is B plus the carry from the fifth pipeline stage. At the same time, BOut1 and B are updated according to the number of bytes output by the fifth pipeline stage.