CN102523455B

CN102523455B - Multi-thread arithmetic coding circuit and method based on standard JPEG 2000

Info

Publication number: CN102523455B
Application number: CN201210000505.9A
Authority: CN
Inventors: 郝跃; 邸志雄; 逄杰; 史江义; 马佩军; 田映辉; 龚章芯
Original assignee: Xidian University
Current assignee: Xidian University
Priority date: 2012-01-02
Filing date: 2012-01-02
Publication date: 2014-04-02
Anticipated expiration: 2032-01-02
Also published as: CN102523455A

Abstract

The invention discloses a multi-thread arithmetic coding circuit and method based on the JPEG 2000 standard, mainly aimed at the large area, low coding efficiency and low throughput of conventional multi-context arithmetic coders. While keeping the compression result identical to that of standard JPEG 2000, the circuit introduces a command generating subunit, a command register and a comparator into the 'command generating and index forecasting unit' of the arithmetic coder; the comparator generates the command that controls the thread coding mode. At the same time, an interval adjustment selector, a probability estimate selector and an index selector are controlled by the command value to allocate the previous coding result to the thread currently to be coded, which reduces logic complexity. In addition, the lookup tables (LUTs) are split and expanded: the primary LUT stores all possible index values, and the secondary LUT stores only probability estimate values. Simulation results show that the invention has a small area and high throughput and can be applied to high-performance image processing chips.

Description

Multithreading arithmetic coding circuit and method based on JPEG2000 standard

Technical field

The invention belongs to the field of microelectronics and relates to chip design, in particular to a coding method and circuit structure for an arithmetic encoder conforming to the JPEG2000 standard. It is mainly used in the design of digital image coding chips.

Background technology

As a new-generation still-image compression coding standard, JPEG2000 has been widely applied in fields such as the Internet and image transmission. Compared with traditional JPEG, JPEG2000 offers efficient compression and good error resilience; it supports quality scalability, resolution scalability and region-of-interest coding, and supports both lossless and lossy compression within the same framework.

The JPEG2000 algorithm consists mainly of the following modules: discrete wavelet transform (DWT), embedded block coding with optimized truncation (EBCOT), quantization, bit-plane coding, the MQ arithmetic encoder, and code stream control. The wavelet transform and block coding improve the error resilience of the generated code stream, while embedded coding makes flexible code stream control possible and allows lossless and lossy compression to be handled compatibly. Bit-plane coding and the MQ arithmetic encoder are the two most complex modules in JPEG2000; together they account for more than half of the total encoding time. Current bit-plane coders are far faster than the MQ arithmetic encoder: a bit-plane coder can output several context/decision pairs in parallel, whereas the MQ algorithm, because the coding of each context depends on earlier coding results, can only process the contexts serially. The coding rate of the MQ arithmetic encoder has therefore become the bottleneck that limits JPEG2000 processing speed.

In JPEG2000, the MQ encoder reads the code-block context CX and binary decision D produced by the bit-plane coding module and encodes them into the compressed code stream CD of each independent code block. The coding algorithm consists mainly of three steps: looking up the index and probability estimate tables, renormalization, and code stream output.

The index and probability estimate tables form a two-level lookup table: the first level is the index table and the second level is the probability estimate table. JPEG2000 has 19 possible contexts CX, numbered 0 to 18. The first-level index table is a dynamically updated lookup table that stores the index value currently associated with each of the 19 contexts; each CX can take any of 47 different index values. The second-level probability estimate table is a fixed, read-only lookup table. For each index value it stores the next more-probable-symbol index NMPS (6 bits), the next less-probable-symbol index NLPS (6 bits), a SWITCH flag (1 bit) and the probability estimate Qe (16 bits), 29 bits in total. The context CX currently received by the MQ encoder passes through the two lookups in turn to obtain the correct probability estimate Qe.
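The two-level lookup described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the circuit of the invention: the names QE_TABLE, index_table and lookup_qe are hypothetical, only the first four rows of the 47-row probability estimate table are reproduced (as recalled from the JPEG2000 standard, and to be checked against it), and every context starts at index 0 for simplicity.

```python
# Second level: fixed table, one row per index value.
# Each row is (Qe, NMPS, NLPS, SWITCH). Excerpt only: the real table has 47 rows.
QE_TABLE = [
    (0x5601, 1, 1, 1),
    (0x3401, 2, 6, 0),
    (0x1801, 3, 9, 0),
    (0x0AC1, 4, 12, 0),
]

NUM_CONTEXTS = 19  # contexts CX = 0..18

# First level: mutable table holding the current index of every context.
# Assumption: all contexts start at index 0 (the standard uses other
# initial values for some contexts).
index_table = [0] * NUM_CONTEXTS


def lookup_qe(cx):
    """Return (index, Qe) for context cx via the two-level lookup."""
    index = index_table[cx]          # first level: context -> index
    qe, _nmps, _nlps, _switch = QE_TABLE[index]  # second level: index -> Qe
    return index, qe


if __name__ == "__main__":
    index_table[5] = 2               # pretend context 5 has adapted to index 2
    print(lookup_qe(0), lookup_qe(5))
```

Only the first level changes while coding; the second level is read-only, which is why a hardware design can share it between threads.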

In the MQ arithmetic encoder, the interval adjustment register A holds the width of the current subinterval and the code register C holds the start of the subinterval. Register A is 16 bits wide and register C is 28 bits wide. After the correct probability estimate Qe is obtained, A and C are updated according to the coding interval: A is updated to A-Qe or Qe, and C is updated to C+Qe or left unchanged. To keep the value of A within [0.75, 1.5), A is shifted left repeatedly. Each time A is shifted left by one bit, the built-in counter CT is decremented by 1, and shifting stops once A >= 0x8000. Whenever A is shifted left, C is shifted left by the same number of bits; when CT reaches 0, the high bits of C are output as compressed data CD.

Traditional MQ coding methods follow the JPEG2000 standard literally and do not exploit the parallelism available in hardware. They can only code serially, processing one context CX per clock cycle; patent applications No. 03129690.4 and No. 200410026019.X are examples. In recent years, several papers have proposed circuit structures and methods that process multiple contexts CX in parallel. For example, Dyer, in the paper "Concurrency techniques for arithmetic Coding in JPEG2000", proposes a circuit structure that processes two contexts CX in parallel in one clock cycle. Although this improves coding efficiency and circuit performance to some extent, two problems remain. First, to process multiple contexts CX the technique adds a large number of multiplexers and storage elements, which increases circuit area and logic complexity. Second, if the shift count of interval adjustment register A during normalization is not zero, the computation of the context CX stalls, which limits the processing efficiency of the circuit.

Summary of the invention

The object of the invention is to overcome the deficiencies of the prior art described above by proposing a coding method and coding circuit for a multi-thread arithmetic encoder based on the JPEG2000 standard. Without changing the coding result, the invention exploits the relations between contexts CX to simplify the lookup of index values and probability estimates and to reduce the number of multiplexers and storage elements, thereby reducing circuit area and logic complexity. It also adjusts the renormalization computation so that the circuit does not stall when the shift count of interval adjustment register A is non-zero, improving coding efficiency and throughput.

To achieve the above object, the multi-thread arithmetic coding circuit based on the JPEG2000 standard according to the invention comprises:

An instruction generation and index prediction unit, comprising an instruction generation subunit, a command register instr and an index prediction subunit. The instruction generation subunit writes the command register instr according to the results of pairwise equality comparisons between the contexts CX of all threads, where each thread corresponds to one context CX. The command register instr controls the index table lookup and the normalization of interval adjustment register A in the index selection and normalization unit. The index prediction subunit predicts the index transient value of each context CX;

An index selection and normalization unit, comprising a data allocation subunit and a thread data flow subunit. The data allocation subunit distributes to each thread the coding result of the previous context CX_dff that it requires; the coding result comprises the index value index_dff, the probability estimate Qe_dff and the interval adjustment register value A_dff. The thread data flow subunit uses the values distributed by the data allocation subunit and the value of the command register instr to select the index value and probability estimate Qe of the current context CX of each thread, and then computes the renormalization result a of interval adjustment register A, where each thread corresponds to one context CX;

A code register normalization unit, comprising the code register C and a renormalization subunit that renormalizes the code register C;

A code stream output unit, which outputs a code stream conforming to the JPEG2000 standard according to the renormalization result of the code register C.

The instruction generation subunit contains comparators and the command register instr. The comparators determine whether the current contexts CX are pairwise equal, and whether each current context CX is equal to each of the previous contexts CX_dff. If the total number of threads is n, the total number of comparators is $T = C_n^2 + n^2$. The bit width W of the command register instr is determined by the thread count n and satisfies $W = C_n^2 + n^2$, where the low $C_n^2$ bits indicate whether the current contexts CX are pairwise equal, and the high $n^2$ bits indicate whether each current context CX in the current threads is equal to each of the previous contexts CX_dff.

The data allocation subunit comprises three kinds of selectors: the interval adjustment selector a_mux, the probability estimate selector q_mux and the index selector i_mux. Each thread has one interval adjustment selector, one probability estimate selector and one index selector. The inputs of the interval adjustment selector are the renormalization results A_dff of interval adjustment register A for all previous contexts CX_dff, and its output is the thread control signal a_pre assigned to the thread. The inputs of the probability estimate selector are the probability estimates Qe_dff of all previous contexts CX_dff, and its output is the possible probability value qe_pre assigned to the thread. The inputs of the index selector are the index values i_dff of all previous contexts CX_dff, and its output is the possible index value i_pre assigned to the thread. The control signal of all three selectors is the output of the command register instr.

The index prediction subunit contains a two-level lookup table shared by all threads. In the first-level table, the address is the current context CX and the content is the index value, the next MPS coding state, the next LPS coding state, and the successors of those two states in turn down to level (n-1), where n is the total number of threads. The data width is $DW = 6 \times (2^{n+1} - 1)$ bits. In the second-level table, the address is the lookup result of the first-level table, the content is the probability estimate Qe, and the data width is 16 bits.
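The sizing relations given in the description ($T = C_n^2 + n^2$ comparators, a command register of the same width W, and a first-level table width $DW = 6 \times (2^{n+1} - 1)$) can be evaluated for a given thread count. The sketch below assumes that $C_n^2$ denotes the number of 2-combinations of n; the function names are hypothetical.

```python
from math import comb


def comparator_count(n):
    """T: pairwise comparisons among n current contexts plus n*n
    comparisons of current contexts against previous contexts."""
    return comb(n, 2) + n * n


def instr_width(n):
    """W: one command register bit per comparison result."""
    return comb(n, 2) + n * n


def first_level_lut_width(n):
    """DW in bits, using the relation DW = 6 * (2^(n+1) - 1)."""
    return 6 * (2 ** (n + 1) - 1)


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, comparator_count(n), instr_width(n), first_level_lut_width(n))
```

For two threads this gives 5 comparators, a 5-bit command register and a 42-bit first-level entry, which shows how quickly the first-level table widens as threads are added.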

To achieve the above object, the multi-thread arithmetic coding method based on the JPEG2000 standard according to the invention comprises:

1) Instruction generation step:

1a) The comparators in the instruction generation subunit compare all contexts CX of the current n threads pairwise, where each thread corresponds to one context. The 2nd thread corresponds to bit 0 of the command register, the 3rd thread to bits 1 and 2, and the 4th thread to bits 3, 4 and 5. If a comparison result is equal, the corresponding bit in instr[$C_n^2$-1:0] is set to 1; otherwise it is set to 0;

1b) The current contexts CX are compared pairwise with all previous contexts CX_dff, where each thread corresponds to one context. The i-th thread corresponds to bits [$C_n^2+i(n-1)+n-2$, $C_n^2+i(n-1)-1$] within the high $n^2$ bits of the command register, where $n^2$ is the number of pairwise comparisons between the current contexts and the previous contexts, n is the total number of threads, and i is an integer with i ∈ [1, n]. If a comparison result is equal, the corresponding bit in the high $n^2$ bits of the command register instr is set to 1; otherwise it is set to 0;

2) Index prediction step:

First, the first-level lookup table is looked up with the current context. For the i-th thread, the lookup result is the index transient value together with the MPS coding states and LPS coding states for the next coding through the i-th next coding, 2i+1 values in total. Then, depending on whether the data to be encoded D[j] equals the coding state symbol MPS[j], the probability encoding flag of the j-th next coding is determined from the first-level lookup result: if they are equal, it is the MPS coding state; if they are not equal, it is the LPS coding state. The final result of the index prediction step is the index transient value and the probability encoding flags of the next coding through the i-th next coding, i+1 values in total. Here j takes every integer from 1 to i, i is an integer with i ∈ [1, n], and n is the total number of threads;

3) Index determination and normalization step:

3a) The data allocation subunit, according to the value of the command register, distributes the index values, probability estimates and interval adjustment register values of the previous contexts to each current thread; the allocation results are the possible index value, the possible probability value and the thread control value;

3b) The index value of each thread is determined from the possible index value, the possible probability value and the thread control value. For the i-th thread, first check whether bits [2i(i-1)+i, 2i(i-1)] of the command register are equal to zero. If they are, the index value of the thread is the index transient value. Otherwise, the index value is determined by comparing the thread control value with the possible probability value: if a_pre >= i × qe_pre + 2^15, the index value of the thread is the possible index value. Otherwise, test whether a_pre >= l × qe_pre + 2^15 holds; if it holds, the index value is the probability encoding flag of the (i-l)-th next coding; if it does not hold, loop: decrement l by 1 and test a_pre >= l × qe_pre + 2^15 again, leaving the loop when the condition holds and continuing otherwise. Here l is an integer with initial value (i-1) and range [i-1, 0], a_pre is the thread control value, qe_pre is the possible probability value, i is an integer with i ∈ [1, n], and n is the total number of threads;

3c) The second-level lookup table is looked up with the index value obtained in the previous step to obtain the probability estimate Qe of the current context CX;

4) Interval adjustment register normalization step for the first thread:

First, compute the first intermediate variable of interval adjustment register A: A1 = A - Qe. Then shift A1 left according to the value of its two most significant bits to obtain the second intermediate variable A2: if the two most significant bits of A1 have the value 3 or 2, A2 = A1; if the value is 1, A2 = A1 << 1; if the value is 0, A2 = A1 << 2. Finally, use the formula selection flag sel to determine the renormalized value of the interval adjustment register: when sel is 1, the renormalization result is A2; otherwise, the renormalization result is Qe;

5) Code register normalization step for the first thread:

First, compute the first intermediate variable of code register C: C1 = C + Qe;

Then, shift C1 left according to the value of the two most significant bits of the first intermediate variable A1 of interval adjustment register A to obtain the second intermediate variable C2 of code register C: if the two most significant bits of A1 have the value 3 or 2, C2 = C1; if the value is 1, C2 = C1 << 1; if the value is 0, C2 = C1 << 2;

Finally, use the formula selection flag sel to determine the renormalized value of the code register: when sel is 1, the renormalization result of the code register is C2; otherwise, it is C;

6) Repetition step: repeat step 4) and step 5) to normalize the interval adjustment registers and code registers of the other threads;

7) Code stream output step: the normalized value of code register C is passed to the code stream output unit to obtain the final coded data.

The data allocation subunit distributes the index values, probability estimates and interval adjustment register values of the previous contexts to each current thread according to whether the corresponding bits of the command register are zero. For the i-th thread, if bits [$C_n^2+i(n-1)+n-2$, $C_n^2+i(n-1)-1$] of the command register are all zero, the index value, probability estimate and interval adjustment register value of the n-th thread of the previous round are assigned to the current thread i. Otherwise, if bit $C_n^2+i(n-1)-1+k$ of the command register is 1, the index value, probability estimate and interval adjustment register value of the k-th thread of the previous round are assigned to the current thread i. Here $C_n^2$ is the number of pairwise comparisons among all contexts CX of the current n threads, n is the total number of threads, and i and k are integers with i ∈ [1, n] and k ∈ [1, n].
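The allocation rule can be sketched in Python as below. The function name and data layout are hypothetical: select_bits stands for the n command register bits that belong to the current thread (element k-1 is set when the current context equals the context of previous-round thread k), and prev_results holds the (index, Qe, A) triple of each previous-round thread. The text does not say what happens when several bits are set; this sketch assumes the highest-numbered matching thread wins.

```python
def allocate_previous(select_bits, prev_results):
    """Pick the previous-round result to hand to the current thread.

    select_bits[k-1] is True when the current context equals the context
    of previous-round thread k. prev_results[k-1] is that thread's
    (index, qe, a) triple.
    """
    n = len(prev_results)
    chosen = n                      # all bits zero: use thread n of the previous round
    for k in range(1, n + 1):
        if select_bits[k - 1]:
            chosen = k              # assumption: the highest matching k wins
    return prev_results[chosen - 1]


if __name__ == "__main__":
    prev = [(3, 0x0AC1, 0x9000), (1, 0x3401, 0xC000)]
    print(allocate_previous([False, False], prev))  # falls back to thread 2
    print(allocate_previous([True, False], prev))   # takes thread 1
```

In hardware this corresponds to the three multiplexers a_mux, q_mux and i_mux of one thread sharing the same select lines.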

The formula selection signal sel is produced by the consolidated expression sel = (D == MPS(CX)) ⊙ (M > 2Qe), where D is the data to be encoded, CX is the context of the data to be encoded, ⊙ is the exclusive-NOR (XNOR) operator, M is the value of interval adjustment register A, MPS() is the lookup function for the coding state symbol MPS, and Qe is the probability estimate.
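A minimal Python sketch of this expression follows. It reads ⊙ as XNOR, which is the conventional meaning of the symbol and an assumption about the intended operator; the function name formula_select and the parameter mps_cx (the value returned by MPS(CX)) are hypothetical.

```python
def formula_select(d, mps_cx, a, qe):
    """Return sel = (D == MPS(CX)) XNOR (A > 2*Qe) as 0 or 1."""
    coded_symbol_is_mps = (d == mps_cx)
    interval_is_wide = (a > 2 * qe)
    # XNOR of two booleans is simply their equality.
    return int(coded_symbol_is_mps == interval_is_wide)


if __name__ == "__main__":
    print(formula_select(d=1, mps_cx=1, a=0x8000, qe=0x1801))
    print(formula_select(d=0, mps_cx=1, a=0x8000, qe=0x1801))
```

Under this reading, sel is 1 exactly when the register update takes the A-Qe branch (an MPS with a wide interval, or an LPS with a narrow one), and 0 when A is replaced by Qe.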

The invention has the following advantages:

Because the invention introduces an instruction generation subunit and a command register into the instruction generation and index prediction unit of the arithmetic encoder, it makes full use of the equality relations between the contexts CX of the threads, which reduces the number of multiplexers and storage elements and thus the circuit area. Because the coding method also adopts a data allocation method and an index determination method that treat different threads differently, the renormalization computation is adjusted, the complexity of the decision logic is reduced, and coding efficiency and throughput are improved. Simulation results show that the invention significantly reduces circuit area and significantly increases the operating frequency of the arithmetic encoder.

Brief description of the drawings

Fig. 1 is a structural block diagram of the multi-thread arithmetic encoder of the invention;

Fig. 2 is a circuit diagram of the instruction generation and index prediction unit in the arithmetic encoder of the invention;

Fig. 3 is a circuit diagram of the index selection and normalization unit in the arithmetic encoder of the invention;

Fig. 4 is a circuit diagram of the data allocation subunit in the arithmetic encoder of the invention;

Fig. 5 is the overall flow chart of the coding method of the multi-thread arithmetic encoder of the invention;

Fig. 6 is the sub-flow chart of index determination and normalization in the invention.

Embodiment

With reference to Fig. 1, the arithmetic encoder of the invention comprises an instruction generation and index prediction unit, an index selection and normalization unit, a code register normalization unit and a code stream output unit, wherein:

The instruction generation and index prediction unit, whose structure is shown in Fig. 2, comprises an instruction generation subunit, a command register instr and an index prediction subunit. The instruction generation subunit writes the command register instr according to the results of pairwise equality comparisons between the contexts CX of all threads, where each thread corresponds to one context CX. It contains comparators that determine whether the current contexts CX are pairwise equal and whether each current context CX is equal to each of the previous contexts CX_dff. If the total number of threads is n, the total number of comparators is $T = C_n^2 + n^2$. The bit width W of the command register instr is determined by the thread count n and satisfies $W = C_n^2 + n^2$, where the low $C_n^2$ bits indicate whether the current contexts CX are pairwise equal, and the high $n^2$ bits indicate whether each current context CX in the current threads is equal to each of the previous contexts CX_dff. The index prediction subunit contains a two-level lookup table shared by all threads. In the first-level table, the address is the current context CX and the content is the index value, the next MPS coding state, the next LPS coding state, and the successors of those two states in turn down to level (n-1), where n is the total number of threads; the data width is $DW = 6 \times (2^{n+1} - 1)$ bits. In the second-level table, the address is the lookup result of the first-level table, the content is the probability estimate Qe, and the data width is 16 bits;

The index selection and normalization unit, whose structure is shown in Fig. 3, comprises a data allocation subunit and a thread data flow subunit. The data allocation subunit, whose structure is shown in Fig. 4, distributes to each thread the coding result of the previous context CX_dff that it requires; the coding result comprises the index value index_dff, the probability estimate Qe_dff and the interval adjustment register value A_dff. It comprises three kinds of selectors: the interval adjustment selector a_mux, the probability estimate selector q_mux and the index selector i_mux, with one selector of each kind per thread. The inputs of the interval adjustment selector are the renormalization results A_dff of interval adjustment register A for all previous contexts CX_dff, and its output is the thread control signal a_pre assigned to the thread. The inputs of the probability estimate selector are the probability estimates Qe_dff of all previous contexts CX_dff, and its output is the possible probability value qe_pre assigned to the thread. The inputs of the index selector are the index values i_dff of all previous contexts CX_dff, and its output is the possible index value i_pre assigned to the thread. The control signal of all three selectors is the output of the command register instr;

The code register normalization unit comprises the code register C and a renormalization subunit that renormalizes the code register C;

With reference to Fig. 5, the arithmetic coding method of the invention based on JPEG2000 comprises the following steps:

Step 1, instruction generation.

1a) The comparators in the instruction generation subunit compare all contexts CX of the current n threads pairwise, where each thread corresponds to one context. The 2nd thread corresponds to bit 0 of the command register, the 3rd thread to bits 1 and 2, and the 4th thread to bits 3, 4 and 5. If a comparison result is equal, the corresponding bit in instr[$C_n^2$-1:0] is set to 1; otherwise it is set to 0;

1b) The current contexts CX are compared pairwise with all previous contexts CX_dff, where each thread corresponds to one context. The i-th thread corresponds to bits [$C_n^2+i(n-1)+n-2$, $C_n^2+i(n-1)-1$] within the high $n^2$ bits of the command register, where $n^2$ is the number of pairwise comparisons between the current contexts and the previous contexts, n is the total number of threads, and i is an integer with i ∈ [1, n]. If a comparison result is equal, the corresponding bit in the high $n^2$ bits of the command register instr is set to 1; otherwise it is set to 0.
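The low bits written in step 1a can be sketched in Python as follows. The bit assignment per thread follows the text (thread 2 uses bit 0, thread 3 uses bits 1 and 2, thread 4 uses bits 3 to 5); the order of bits within one thread's group is not stated, so this sketch assumes it ascends with the number of the thread being compared against. The high bits of step 1b are not modelled, and the function name is hypothetical.

```python
def low_instr_bits(cx):
    """Build the low C(n,2) bits of instr from the current contexts.

    cx[0] is the context of thread 1, cx[1] of thread 2, and so on.
    """
    instr = 0
    bit = 0
    for t in range(1, len(cx)):        # threads 2..n (0-based t)
        for s in range(t):             # every earlier thread
            if cx[t] == cx[s]:
                instr |= 1 << bit      # equal contexts: set the bit
            bit += 1                   # assumption: ascending bit order per thread
    return instr


if __name__ == "__main__":
    # Threads 1, 2 and 4 share context 5; thread 3 has context 7.
    print(bin(low_instr_bits([5, 5, 7, 5])))
```

With four threads and contexts [5, 5, 7, 5], bits 0, 3 and 4 are set, so the low six bits read 011001.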

Step 2, index prediction.

First, the first-level lookup table is looked up with the current context. For the i-th thread, the lookup result is the index transient value together with the MPS coding states and LPS coding states for the next coding through the i-th next coding, 2i+1 values in total;

Then, depending on whether the data to be encoded D[j] equals the coding state symbol MPS[j], the probability encoding flag of the j-th next coding is determined from the first-level lookup result: if they are equal, it is the MPS coding state; if they are not equal, it is the LPS coding state. The final result of the index prediction step is the index transient value and the probability encoding flags of the next coding through the i-th next coding, i+1 values in total. Here j takes every integer from 1 to i, i is an integer with i ∈ [1, n], and n is the total number of threads.
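The selection in step 2 reduces 2i+1 looked-up values to i+1 values. A minimal Python sketch, assuming the first-level lookup result is already available as an index transient value plus one (MPS state, LPS state) pair per future coding; the function name and argument layout are hypothetical.

```python
def predict_flags(index_pos, candidates, d, mps):
    """Select one state per future coding.

    index_pos  : index transient value from the first-level table
    candidates : candidates[j-1] = (mps_state, lps_state) for the j-th next coding
    d, mps     : d[j-1] is the decision D[j], mps[j-1] the symbol MPS[j]
    Returns [index_pos, flag_1, ..., flag_i], i + 1 values in total.
    """
    result = [index_pos]
    for j, (mps_state, lps_state) in enumerate(candidates):
        # Equal decision and MPS symbol: follow the MPS branch, else the LPS branch.
        result.append(mps_state if d[j] == mps[j] else lps_state)
    return result


if __name__ == "__main__":
    # Thread i = 2: 2*2 + 1 = 5 values in, 2 + 1 = 3 values out.
    print(predict_flags(3, [(4, 12), (5, 29)], d=[1, 0], mps=[1, 1]))
```

The state numbers in the usage example are arbitrary illustrative values.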

Step 3, index determination and normalization.

3a) The index values, probability estimates and interval adjustment register values of the previous contexts are distributed to each current thread according to whether the corresponding bits of the command register are zero. For the i-th thread, if bits [$C_n^2+i(n-1)+n-2$, $C_n^2+i(n-1)-1$] of the command register are all zero, the index value, probability estimate and interval adjustment register value of the n-th thread of the previous round are assigned to the current thread i. Otherwise, if bit $C_n^2+i(n-1)-1+k$ of the command register is 1, the index value, probability estimate and interval adjustment register value of the k-th thread of the previous round are assigned to the current thread i. Here $C_n^2$ is the number of pairwise comparisons among all contexts CX of the current n threads, n is the total number of threads, and i and k are integers with i ∈ [1, n] and k ∈ [1, n].

3b) The index value of each thread is determined from the possible index value, the possible probability value and the thread control value:

With reference to Fig. 6, this step is implemented as follows. For the i-th thread, first check whether bits [2i(i-1)+i, 2i(i-1)] of the command register are equal to zero. If they are, the index value of the thread is the index transient value. Otherwise, the index value is determined by comparing the thread control value with the possible probability value: if a_pre >= i × qe_pre + 2^15, the index value of the thread is the possible index value. Otherwise, test whether a_pre >= l × qe_pre + 2^15 holds; if it holds, the index value is the probability encoding flag of the (i-l)-th next coding; if it does not hold, loop: decrement l by 1 and test a_pre >= l × qe_pre + 2^15 again, leaving the loop when the condition holds and continuing otherwise. Here l is an integer with initial value (i-1) and range [i-1, 0], a_pre is the thread control value, qe_pre is the possible probability value, i is an integer with i ∈ [1, n], and n is the total number of threads;
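The decision procedure of step 3b can be sketched in Python as below. It is an illustration under assumptions: the check of the command register bits is reduced to a boolean bits_nonzero, flags[k-1] stands for the probability encoding flag of the k-th next coding produced by the index prediction step, and the function name is hypothetical. Because a normalized a_pre is at least 2^15, the test with l = 0 always succeeds, so the loop terminates.

```python
def select_index(i, bits_nonzero, a_pre, qe_pre, i_pre, index_pos, flags):
    """Choose the index value for thread i (1-based)."""
    if not bits_nonzero:
        return index_pos                      # no matching context: use the transient value
    if a_pre >= i * qe_pre + (1 << 15):
        return i_pre                          # the possible index value is still valid
    for l in range(i - 1, -1, -1):            # l = i-1, i-2, ..., 0
        if a_pre >= l * qe_pre + (1 << 15):
            return flags[i - l - 1]           # flag of the (i-l)-th next coding
    raise ValueError("a_pre is below 2^15, i.e. not normalized")


if __name__ == "__main__":
    flags = [11, 12]                          # arbitrary illustrative state numbers
    print(select_index(2, True, 0xA000, 0x0AC1, 9, 3, flags))
    print(select_index(2, True, 0x9000, 0x0AC1, 9, 3, flags))
    print(select_index(2, True, 0x8000, 0x0AC1, 9, 3, flags))
```

The three calls exercise the three outcomes for thread 2: the possible index value, the flag of the first next coding, and the flag of the second next coding.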

Step 4, interval adjustment register normalization for the first thread.

First, compute the formula selection signal sel, which is produced by the consolidated expression sel = (D == MPS(CX)) ⊙ (M > 2Qe), where D is the data to be encoded, CX is the context of the data to be encoded, ⊙ is the exclusive-NOR (XNOR) operator, M is the value of interval adjustment register A, MPS() is the lookup function for the coding state symbol MPS, and Qe is the probability estimate.

Next, compute the first intermediate variable of interval adjustment register A: A1 = A - Qe;

Then, shift A1 left according to the value of its two most significant bits to obtain the second intermediate variable A2: if the two most significant bits of A1 have the value 3 or 2, A2 = A1; if the value is 1, A2 = A1 << 1; if the value is 0, A2 = A1 << 2;

Finally, use the formula selection flag sel to determine the renormalized value of the interval adjustment register: when sel is 1, the renormalization result is A2; otherwise, the renormalization result is Qe.

Step 5, code register normalization for the first thread.

First, compute the first intermediate variable of code register C: C1 = C + Qe;

Then, shift C1 left according to the value of the two most significant bits of the first intermediate variable A1 of interval adjustment register A to obtain the second intermediate variable C2 of code register C: if the two most significant bits of A1 have the value 3 or 2, C2 = C1; if the value is 1, C2 = C1 << 1; if the value is 0, C2 = C1 << 2;
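Steps 4 and 5 can be sketched together in Python. This is a simplified illustration: sel is taken as an input, A is treated as a 16-bit value, C is kept as an unbounded Python integer (the byte output of its high bits is not modelled), and the final selection for C follows the summary of the invention (C2 when sel is 1, otherwise C). The function name is hypothetical.

```python
def renorm_first_thread(a, c, qe, sel):
    """Return the renormalized (A, C) pair for one coding decision."""
    a1 = (a - qe) & 0xFFFF            # first intermediate variable of A
    c1 = c + qe                       # first intermediate variable of C
    top2 = a1 >> 14                   # two most significant bits of A1
    if top2 >= 2:
        shift = 0
    elif top2 == 1:
        shift = 1
    else:
        shift = 2
    a2 = (a1 << shift) & 0xFFFF       # second intermediate variable of A
    c2 = c1 << shift                  # C is shifted by the same amount as A
    return (a2, c2) if sel else (qe, c)


if __name__ == "__main__":
    a, c = renorm_first_thread(0x8000, 0, 0x5601, sel=1)
    print(hex(a), hex(c))
```

The shift amount is derived from A1 alone, so the shifts of A and C can be computed in parallel rather than one bit per iteration as in the serial renormalization loop.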

Step 6, repeat step 4 and step 5 to normalize the interval adjustment registers and code registers of the other threads;

Step 7, pass the normalized value of code register C from step 6 to the code stream output unit to obtain the final coded data.

Effect of the present invention can further illustrate by following emulation:

Emulation 1, the present invention uses VerilogHDL language to carry out the description of register transfer rtl code to whole circuit, by the data that C language compilation program is done before arithmetic encoder encodes, prepare, on the NC-verilog instrument of use Candence company, complete functional simulation, picture to a 400*400 pixel is encoded, and simulation result coding is correct.

Simulation 2: the circuit is synthesized with the Design Compiler tool from Synopsys using the 0.18 μm CMOS standard cell library of SMIC. The area after synthesis is 307100.27 square microns, the maximum clock frequency is 287 MHz, and the throughput is 574 Msymbols/sec. By comparison, the arithmetic encoder in the document "Concurrency techniques for arithmetic coding in JPEG2000" has an area of 384817.91 square microns, an operating frequency of 211.86 MHz, and a processing capability of only 388.34 Msymbols/sec.
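The figures quoted in Simulation 2 can be compared with a short calculation. The sketch below only restates the numbers given above; the dictionary keys and function names are illustrative, and the derived ratios are not values stated in the source.

```python
# Figures quoted in Simulation 2 (area in square microns, clock in MHz,
# throughput in Msymbols/sec).
proposed = {"area": 307100.27, "freq": 287.0, "throughput": 574.0}
reference = {"area": 384817.91, "freq": 211.86, "throughput": 388.34}


def symbols_per_cycle(design):
    """Average number of symbols encoded per clock cycle."""
    return design["throughput"] / design["freq"]


def area_saving(new, old):
    """Fraction of area saved by `new` relative to `old`."""
    return 1.0 - new["area"] / old["area"]


def throughput_gain(new, old):
    """Fractional throughput increase of `new` relative to `old`."""
    return new["throughput"] / old["throughput"] - 1.0


print(f"symbols/cycle (proposed):  {symbols_per_cycle(proposed):.2f}")
print(f"symbols/cycle (reference): {symbols_per_cycle(reference):.2f}")
print(f"area saving:     {area_saving(proposed, reference):.1%}")
print(f"throughput gain: {throughput_gain(proposed, reference):.1%}")
```

On these numbers the proposed design encodes two symbols per clock cycle, uses roughly 20% less area, and delivers roughly 48% more throughput than the reference encoder.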

Claims

1. A multi-thread arithmetic coding circuit based on the JPEG2000 standard, comprising:

an instruction generation and index prediction unit, for providing index transient values to the index selection and normalization unit, comprising an instruction generation subunit and an index prediction subunit; the instruction generation subunit writes the instruction register instr according to the results of pairwise equality comparisons among the contexts CX of all threads, wherein each thread corresponds to one context CX; the instruction register instr is used to control the index table lookup and the normalization of the interval adjustment register A in the index selection and normalization unit; the index prediction subunit is used to predict the index transient value index_pos and the probability coding flag of each context CX;

2. The coding circuit according to claim 1, characterized in that: the instruction generation subunit contains comparators and the instruction register instr; the comparators are used to determine whether the current contexts CX are pairwise equal to one another, and whether each current context CX is pairwise equal to each of the previous contexts CX_dff; if the total number of threads is n, the total number of comparators is $T = C_n^2 + n^2$; the bit width W of the instruction register instr is determined by the number of threads n and satisfies $W = C_n^2 + n^2$, wherein the low $C_n^2$ bits indicate whether the current contexts CX are pairwise equal to one another, and the high $n^2$ bits indicate whether, in each current thread, the context CX is pairwise equal to each of the previous contexts CX_dff.

3. The coding circuit according to claim 1, characterized in that: the data allocation subunit comprises three types of selectors, namely the interval adjustment selector a_mux, the probability estimate selector q_mux and the index selector i_mux; each thread corresponds to one interval adjustment selector, one probability estimate selector and one index selector.

4. The coding circuit according to claim 3, characterized in that: the input signal of the interval adjustment selector is the interval adjustment register A_dff, and its output signal is the thread control signal a_pre allocated to this thread; the input signal of the probability estimate selector is the probability estimate Qe_dff, and its output signal is the possible probability value qe_pre allocated to this thread; the input signal of the index selector is the index value i_dff, and its output signal is the possible index value i_pre allocated to this thread; the control signal of the interval adjustment selector, the probability estimate selector and the index selector is the output signal of the instruction register instr.

5. The coding circuit according to claim 1, characterized in that: the index prediction subunit contains a two-stage lookup table, and all threads share this lookup table;

in the first-stage lookup table, the data address is the current context CX, and the data content is the index value, the next high-probability coding environment, the next low-probability coding environment, and, successively, the next-level coding environments of the next high-probability and next low-probability coding environments, down to the probability coding environments of the next (n-1)-th level, wherein n is the total number of threads; the data bit width is $DW = 6 \times (2^{n+1} - 1)$, wherein n is the total number of threads;

in the second-stage lookup table, the data address is the lookup result of the first-stage lookup table, the data content is the probability estimate Qe, and the data bit width is 16.

6. An arithmetic coding method based on the JPEG2000 standard, comprising:

1) instruction generation step:

1a) using the comparators in the instruction generation subunit, compare all contexts CX of the current n threads pairwise, wherein each thread corresponds to one context: the 2nd thread corresponds to bit 0 of the instruction register, the 3rd thread corresponds to bits 1 and 2, and the 4th thread corresponds to bits 3, 4 and 5; if a comparison result is equal, set the corresponding bit in bits $[C_n^2 - 1 : 0]$ of the instruction register instr to 1, otherwise set it to 0;

1b) compare the current contexts CX pairwise with all of the previous contexts CX_dff, wherein each thread corresponds to one context, and the i-th thread corresponds to bits $[C_n^2 + i(n-1) + n - 2,\ C_n^2 + i(n-1) - 1]$ among the high $n^2$ bits of the instruction register; $n^2$ is the number of pairwise comparisons between the current contexts and the previous contexts, n is the total number of threads, $i \in [1, n]$, and i is an integer; if a comparison result is equal, set the corresponding bit among the high $n^2$ bits of the instruction register instr to 1, otherwise set it to 0;

2) index prediction step:

first, look up the first-stage lookup table according to the current context; for the i-th thread, the lookup result is the index transient value together with the high-probability and low-probability coding environments for the next coding through the i-th next coding, 2i+1 values in total; then, according to whether the data to be encoded D[j] is equal to the coding environment symbol MPS[j], determine the probability coding flag of the j-th next coding from the result of the first-stage lookup table: if they are equal, the probability coding flag of the j-th next coding is the high-probability coding environment; if they are not equal, the probability coding flag of the j-th next coding is the low-probability coding environment; the final result of the index prediction step is the index transient value together with the probability coding flags for the next coding through the i-th next coding, i+1 values in total; wherein MPS is the coding environment symbol, j takes all integer values from 1 to i, $i \in [1, n]$, i is an integer, and n is the total number of threads;

3) index determination and normalization step:

3a) using the data allocation subunit, allocate the coding results of the previous contexts CX_dff to each current thread according to the value of the instruction register; the coding results comprise the index value index_dff, the probability estimate Qe_dff and the interval adjustment register A_dff, and the allocation results are the possible index value, the possible probability value and the thread control value;

3b) according to the possible index value, the possible probability value and the thread control value, determine the index value of each thread: for the i-th thread, first determine whether bits [2i(i-1)+i, 2i(i-1)] of the instruction register are equal to zero; if they are equal to zero, the index value of this thread is the index transient value; otherwise, the index value is determined by comparing the thread control value with the possible probability value: if $\text{a\_pre} \ge i \times \text{qe\_pre} + 2^{15}$, the index value of this thread is the possible index value; otherwise, determine whether $\text{a\_pre} \ge l \times \text{qe\_pre} + 2^{15}$ holds; if it holds, the index value is the probability coding flag of the (i-l)-th next coding; if it does not hold, execute a loop: decrement l by 1 and determine whether $\text{a\_pre} \ge l \times \text{qe\_pre} + 2^{15}$ holds; if it holds, exit the loop, and if it does not hold, continue the loop; wherein the initial value of l is (i-1) and its range is [i-1, 0], a_pre is the thread control value, qe_pre is the possible probability value, $i \in [1, n]$, i is an integer, and n is the total number of threads;

4) interval adjustment register normalization step of the first thread:

first, calculate the first intermediate variable of the interval adjustment register A: A1 = A - Qe; then, according to the value of the two most significant bits of A1, shift A1 left to obtain the second intermediate variable A2 of A: if the two most significant bits of A1 have the value 3 or 2, A2 = A1; if they have the value 1, A2 = A1 << 1; if they have the value 0, A2 = A1 << 2; finally, the formula selection flag sel determines the renormalized value of the interval adjustment register: when sel is 1, the renormalization result is A2; otherwise the renormalization result is Qe;

5) code register normalization step of the first thread:

first, calculate the first intermediate variable of the code register C: C1 = C + Qe;

then, according to the value of the two most significant bits of the first intermediate variable A1 of the interval adjustment register A, shift C1 left to obtain the second intermediate variable C2 of the code register C: if the two most significant bits of A1 have the value 3 or 2, C2 = C1; if they have the value 1, C2 = C1 << 1; if they have the value 0, C2 = C1 << 2;

6) repeat step 4) and step 5) to complete the normalization of the interval adjustment registers and code registers of the other threads;

7. The arithmetic coding method according to claim 6, wherein in step 3a) the data allocation subunit allocates the coding results of the previous contexts CX_dff to each current thread according to whether the corresponding bits of the instruction register are equal to zero, the coding results comprising the index value index_dff, the probability estimate Qe_dff and the interval adjustment register A_dff: for the i-th thread, if bits $[C_n^2 + i(n-1) + n - 2,\ C_n^2 + i(n-1) - 1]$ of the instruction register are equal to zero, the index value, probability estimate and interval adjustment register value of the n-th thread of the previous coding are allocated to the current thread i; otherwise, if bit $C_n^2 + i(n-1) - 1 + k$ of the instruction register is 1, the index value, probability estimate and interval adjustment register value of the k-th thread of the previous coding are allocated to the current thread i; wherein $C_n^2$ denotes the number of pairwise comparisons among all contexts CX of the current n threads, n is the total number of threads, $i \in [1, n]$, i is an integer, $k \in [1, n]$, and k is an integer.

8. The arithmetic coding method according to claim 6, wherein the formula selection signal sel in step 4) is produced by the consolidated expression D==MPS(CX) ⊙ (M>2Qe), wherein D is the data to be encoded, CX is the context of the data to be encoded, ⊙ is the XOR operator, M is the value of the interval adjustment register A, MPS() is the lookup function for the coding environment symbol MPS, and Qe is the probability estimate.
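The sizing relations stated in claims 2, 5 and 6 can be checked numerically with the following sketch. The function names are hypothetical. The closed form in `low_bit_range` is an assumption that generalizes the three bit assignments listed in step 1a) (thread 2: bit 0; thread 3: bits 1 and 2; thread 4: bits 3 to 5) and is not stated in the claims.

```python
from math import comb


def instr_width(n):
    """Bit width W of instr (equal to the comparator count T) for n threads."""
    return comb(n, 2) + n * n


def first_stage_lut_width(n):
    """Data bit width DW of the first-stage lookup table for n threads."""
    return 6 * (2 ** (n + 1) - 1)


def low_bit_range(thread):
    """(lsb, msb) of the low instr bits for a thread numbered 2 or higher.

    Assumed generalization of the examples in step 1a).
    """
    lsb = comb(thread - 1, 2)
    return lsb, lsb + thread - 2


for n in (2, 3, 4):
    print(n, instr_width(n), first_stage_lut_width(n))
```

For n = 2 this gives a 5-bit instruction register and a 42-bit first-stage table entry; for n = 4 it gives 22 bits and 186 bits respectively.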