CN104683806A - High-speed FPGA implementation method for an MQ arithmetic encoder based on deep pipelining - Google Patents

High-speed FPGA implementation method for an MQ arithmetic encoder based on deep pipelining

Info

Publication number
CN104683806A
Authority
CN
China
Prior art keywords
pipeline
level
value
input
register
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510091224.2A
Other languages
Chinese (zh)
Other versions
CN104683806B (en)
Inventor
陶宏江
张柯
金龙旭
张然峰
郝贤鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changchun Institute of Optics Fine Mechanics and Physics of CAS
Original Assignee
Changchun Institute of Optics Fine Mechanics and Physics of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changchun Institute of Optics Fine Mechanics and Physics of CAS
Priority to CN201510091224.2A
Publication of CN104683806A
Application granted
Publication of CN104683806B
Expired - Fee Related
Anticipated expiration

Landscapes

  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention relates to a high-speed FPGA (Field Programmable Gate Array) implementation method for an MQ arithmetic encoder based on deep pipelining, and belongs to the field of computers and digital image processing. To raise the execution speed of the MQ coding algorithm, the original four-stage pipeline structure is extended and its workload redistributed, giving a new fast MQ arithmetic coding implementation based on a six-stage pipeline. Distributing the workload sensibly reduces the unavoidably serial work within any one pipeline stage, and increasing the number of pipeline stages reduces the maximum workload within each stage. By analyzing the control signals of the dependency loop between the CX table update and the update of the arithmetic coding interval A, a new three-stage decomposition is obtained using a look-ahead prefetch technique and a multi-index-value analysis and selection technique; this decomposition forms the first three pipeline stages. A new register allocation scheme speeds up both the addition and the multiplexing, so that the overall execution speed of the MQ encoder is improved.

Description

High-speed FPGA implementation method for an MQ arithmetic encoder based on deep pipelining
Technical field
The present invention proposes a high-speed implementation method, based on a multi-stage deep pipeline, for an MQ arithmetic encoder in an FPGA system. It effectively increases the coding speed when an FPGA is used to implement the MQ coding algorithm of JPEG2000, and belongs to the field of computers and digital image processing.
Background
The JPEG2000 image compression algorithm is a new-generation still-image compression standard. It offers excellent compression performance, supports selectable modes such as lossy compression, lossless compression, and compression of specified regions, and has strong error resilience. Because of these properties, JPEG2000 is being applied in a growing number of fields.
Despite these advantages, JPEG2000 has high algorithmic complexity. In particular, the control structure and the arithmetic units of its MQ arithmetic coding algorithm are both complicated, and the algorithm is strongly serial with tight dependencies between consecutive operations, so JPEG2000 encoding is very slow.
To increase the execution speed of MQ arithmetic coding, and with it the overall speed of JPEG2000 encoding, existing FPGA and VLSI implementations of the MQ encoder all use a pipelined organization. The pipeline structure in common use today has four stages; it is described in detail in Michael Dyer, David Taubman, Saeid Nooshabadi, and Amit Kumar Gupta, "Concurrency Techniques for Arithmetic Coding in JPEG2000," IEEE Transactions on Circuits and Systems, 2006, 53(6). However, this structure does not decompose deeply enough the two strongly dependent loops that limit MQ encoder speed: the dependency loop between the CX table update and the update of the arithmetic coding interval A (Fig. 1), and the coded-data output dependency loop (Fig. 2). As a result, the first pipeline stage contains a two-level table lookup, and the fourth stage contains an 18-bit addition and a shift operation implemented with a 21:1 multiplexer. These operations take a long time and therefore lower the operating speed of the MQ encoder.
Summary of the invention
To increase the execution speed of the MQ coding algorithm, the present invention provides a high-speed FPGA implementation method for an MQ arithmetic encoder based on deep pipelining, comprising the following steps:
Step 1: in the first pipeline stage, the input (CX, D) data pair is used to look up the corresponding MQ encoder probability-table index and MPS value. The lookup table for the CX data is updated and maintained during operation, and lookup/update conflicts that arise when the same CX value is input several times in succession receive preliminary handling.
Step 2: in the second pipeline stage, the probability value is looked up from the probability-table index. First, depending on whether the same CX value has been input two or three times in succession, the index used for the probability lookup is selected from among the input from the first pipeline stage, the input from the third pipeline stage, the NMPS of the previous lookup result, and the NLPS of the previous lookup result.
Step 3: in the third pipeline stage, the current arithmetic coding interval A is re-partitioned according to the probability value output by the second stage. A is set to A-Qe or Qe depending on whether the input data D equals MPS, and the normalization operation is performed.
Step 4: in the fourth pipeline stage, the value of the output register C is determined from the output of the third stage. In the MPS case, C is left unchanged if A<Qe and is otherwise set to C+Qe; in the LPS case, C is set to C+Qe if A<Qe and is otherwise left unchanged.
Step 5: in the fifth pipeline stage, the high-order part of register C is managed. According to the JPEG2000 standard, CH has a variable length during data output. CT counts the number of data bits in CH, and the updated value of CH and the output values of B0 and B1 are determined from CT together with the shift count from the fourth stage.
Step 6: in the sixth pipeline stage, the final output byte B defined by the JPEG2000 standard is managed. If B equals 0xff, BOut0 is output directly as B; otherwise BOut0 is B plus the carry from the fifth stage. At the same time, BOut1 and B are updated according to the number of bytes output by the fifth stage.
The inventive concept can be summarized as follows:
1. Distributing the workload sensibly reduces the unavoidably serial work within any one pipeline stage.
2. Increasing the number of pipeline stages reduces the maximum workload within each stage.
3. Analyzing the control signals of the dependency loop between the CX table update and the update of the arithmetic coding interval A yields a new three-stage decomposition, built on a look-ahead prefetch technique and a multi-index-value analysis and selection technique; this decomposition forms the first three pipeline stages.
4. In the existing four-stage pipeline, the data output stage contains an 18-bit addition and a 21:1 multiplexer, which lowers the coding speed. A new register allocation scheme is proposed that speeds up both the addition and the multiplexing, and so raises the overall execution speed of the MQ encoder.
The beneficial effects of the invention are as follows. Based on an analysis of the bottlenecks of existing MQ encoders, the scheme uses the look-ahead prefetch technique, the multi-index-value analysis and selection technique, and a new byte-data output scheme to increase the number of pipeline stages, reduce the maximum execution time of each stage, and raise the overall execution speed of the MQ encoder. While raising execution speed, the scheme keeps resource utilization unchanged, and it is suited to FPGA-based coding systems with strict performance and resource requirements.
Brief description of the drawings
Fig. 1: flow chart of the dependency loop between the CX table update and the update of the arithmetic coding interval A.
Fig. 2: flow chart of the coded-data output dependency loop.
Fig. 3: structural block diagram of the six-stage pipeline implementation method.
Fig. 4: workflow of the first pipeline stage.
Fig. 5: workflow of the second pipeline stage.
Fig. 6: workflow of the third pipeline stage.
Fig. 7: workflow of the fourth pipeline stage.
Fig. 8: workflow of the fifth pipeline stage.
Fig. 9: workflow of the sixth pipeline stage.
Detailed description
The present invention is described in further detail below with reference to the accompanying drawings.
As shown in Fig. 3, the high-speed FPGA implementation method of the present invention for an MQ arithmetic encoder based on deep pipelining uses a six-stage pipelined MQ encoder and comprises the following steps:
Step 1: the first pipeline stage uses the input (CX, D) data to find the corresponding MQ encoder probability-table index and MPS value in the CX table, and generates the data-conflict control signals for the cases where the same CX value is input two or three times in succession.
In this stage, distributed RAM implements the lookup table from CX to the MQ encoder probability-table index and MPS value. Register Cx1 holds the CX value input in the previous clock cycle; register Cx2 holds the CX value input one clock cycle earlier; register Cx3 holds the CX value input two clock cycles earlier. Register D1 holds the input data D; register Ni1 holds the probability-table index returned by the lookup; register Mps holds the MPS value returned by the lookup; register MpsEq holds the result of comparing the input data D with the MPS value currently returned by the lookup. Cx12E holds the result of comparing the current input CX with the value stored in Cx1, and Cx13E holds the result of comparing the current input CX with the value stored in Cx2.
In the first pipeline stage, each input CX is looked up to obtain the corresponding probability-table index and MPS value, and the CX data is saved into Cx1, Cx2, and Cx3. When the CX value equals Cx3, the CX-table update value from the third pipeline stage is selected as the output and stored in Ni and Mps; when the CX value differs from Cx3, the value returned by the table lookup is selected as the output and stored in Ni and Mps. The input data D and Mps are combined by a logical AND operation and the result is stored in MpsEq. The result of comparing CX with Cx1 is stored in Cx12E, and the result of comparing CX with Cx2 is stored in Cx13E. The detailed flow of this stage is shown in Fig. 4.
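The first-stage behaviour described above can be sketched as a small software model. This is a minimal sketch under stated assumptions, not the patent's circuit: the class name `Stage1`, the method `clock`, and the argument `fwd` are hypothetical; Cx1 to Cx3 are modelled as a three-deep shift register; MpsEq is modelled as an equality test between D and MPS, following the register description; and the table is simply initialised to index 0 and MPS 0 for 19 contexts.

```python
from dataclasses import dataclass, field


@dataclass
class Stage1:
    """Software model of the stage-1 context lookup (hypothetical names)."""
    # One (probability-table index, MPS) entry per context; the size 19 and
    # the all-zero initial state are assumptions made for this sketch.
    table: list = field(default_factory=lambda: [(0, 0)] * 19)
    cx1: int = -1  # CX input 1 cycle ago (-1 means "nothing seen yet")
    cx2: int = -1  # CX input 2 cycles ago
    cx3: int = -1  # CX input 3 cycles ago

    def clock(self, cx, d, fwd=None):
        """Process one (CX, D) pair.

        fwd is the (index, MPS) update arriving from stage 3, if any.
        Returns (Ni, Mps, MpsEq, Cx12E, Cx13E).
        """
        cx12e = cx == self.cx1  # same context as the previous symbol
        cx13e = cx == self.cx2  # same context as the symbol before that
        if fwd is not None and cx == self.cx3:
            ni, mps = fwd             # bypass: use the value being written back
        else:
            ni, mps = self.table[cx]  # normal table lookup
        mps_eq = d == mps             # assumption: equality comparison
        # Shift the CX history by one cycle.
        self.cx3, self.cx2, self.cx1 = self.cx2, self.cx1, cx
        return ni, mps, mps_eq, cx12e, cx13e
```

Feeding the same context on consecutive cycles raises Cx12E on the second cycle and Cx13E on the third, which is the conflict information the later stages rely on.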
Step 2: the second pipeline stage looks up the probability value from the probability-table index.
In this stage, a ROM implements the probability table PET, which stores the standard JPEG2000 probability values. Register Pb holds the output probability value; register Ni2 holds the index used for the lookup; register Ni holds the predicted index value; register Lz holds the left-shift count; register Mps2 holds the predicted Mps value; register MpsMatch2 holds the updated Mps comparison result; register MpsSelect2 holds the Mps selection control information; register MED holds the MpsEq value of the previous clock cycle; register MED2 holds the MpsEq value from one clock cycle earlier.
The main work of the second pipeline stage is to predict the index into the probability lookup table and to output, by table lookup, the probability value and the next index value from the index table. First, the appropriate signal value is selected as the ME result according to the contents of Cx12E and Cx13E. Then the candidate for the probability-table index is chosen according to Cx12E, Cx13E, and ME, and a multiplexer selects the corresponding index value, which is used to look up the probability value. The detailed flow of this stage is shown in Fig. 5.
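For orientation, the NMPS and NLPS candidates that the second stage chooses between come from the probability-state update rule of the MQ coder. The sketch below shows that rule in software form as it is usually described for JPEG2000; it is not the selection logic of Fig. 5. The function name `next_state` is hypothetical, and only the first four table rows are listed, with values that should be checked against the standard before use.

```python
# First rows of the probability table as (Qe, NMPS, NLPS, SWITCH).
# Only four rows are listed; verify the values against the standard.
PET = [
    (0x5601, 1, 1, 1),
    (0x3401, 2, 6, 0),
    (0x1801, 3, 9, 0),
    (0x0AC1, 4, 12, 0),
]


def next_state(index, mps, coded_mps, renormalized):
    """Return the (index, MPS) pair to write back to the CX table."""
    _qe, nmps, nlps, switch = PET[index]
    if coded_mps:
        # On the MPS path the state only advances when A was renormalized.
        return (nmps, mps) if renormalized else (index, mps)
    # On the LPS path the state always moves to NLPS, and the sense of
    # MPS is inverted when the SWITCH flag of the current row is set.
    return nlps, (1 - mps if switch else mps)
```

Because both successors of a row are known as soon as the row is read, a repeated context can be served from these candidates without waiting for the table write-back, which appears to be the motivation for the multiplexer described above.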
Step 3: the third pipeline stage computes the coding interval A.
In this stage, register ARN holds the normalization signal of the coding interval A; register CxUpdate holds the update information used to update the CX table in the first pipeline stage; register MpsUpdate holds the update information used to update the mps signal in the first and second pipeline stages; register Lz3 holds the left-shift count; register Pb3 holds the output probability value; register CAS holds the output selection.
First, according to the current pipeline state, it is decided whether to use the probability value output by the second stage or the previous probability value. The probability value is then subtracted from the coding interval A to re-partition the current arithmetic coding interval. Finally, A is set to A-Qe or Qe depending on whether the input data D equals MPS, and the normalization operation is performed. The flow of this stage is shown in Fig. 6.
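The interval update and normalization of this stage can be illustrated with the usual MQ coding rule. This is a minimal sketch, assuming a 16-bit interval register with bit 15 set on entry; the function name `update_interval` and its return tuple are hypothetical, and `coded_mps` stands for the case where D equals MPS.

```python
def update_interval(a, qe, coded_mps):
    """Return (new_a, shift, add_qe) for one coding decision.

    a is the 16-bit interval register with bit 15 set on entry, qe is the
    probability value, and add_qe tells the next stage whether C += Qe.
    """
    a_sub = a - qe
    if coded_mps and (a_sub & 0x8000):
        return a_sub, 0, True          # MPS, no renormalization needed
    small = a_sub < qe                 # conditional-exchange test
    if coded_mps:
        new_a, add_qe = (qe, False) if small else (a_sub, True)
    else:
        new_a, add_qe = (a_sub, True) if small else (qe, False)
    shift = 16 - new_a.bit_length()    # left shifts needed to set bit 15
    return (new_a << shift) & 0xFFFF, shift, add_qe
```

The returned shift count plays the role of the left-shift count held in Lz3, and the flag corresponds to the C update decision of Step 4.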
Step 4: in the fourth pipeline stage, the value of the output register C is determined from the output of the third stage.
Register C holds the intermediate output result; register BPC holds the output carry bit; register BPO0 holds the high byte of the carry output; register BPO1 holds the low byte of the carry output; register CCRA holds the right-aligned output data; Lz4 holds the left-shift count.
Unlike the four-stage pipeline structure of the MQ algorithm, which uses a 16-bit register C, the structure in this patent uses a 19-bit register C. The MQ algorithm uses a 28-bit register C: its low 16 bits are used for the addition with the probability value Pb and have a fixed working length, and its high 9 bits are used for the output operation and have a variable length, which leaves 3 further bits (16 + 3 = 19). In the MPS case, if A<Pb in the algorithm, C is left unchanged; otherwise C is set to C+Pb. In the LPS case, if A<Pb in the algorithm, C is set to C+Pb; otherwise C is left unchanged. The flow of this stage is shown in Fig. 7.
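One way to read the 19-bit register is that this stage keeps only the low part of C and hands the carry and the shifted-out bits to the next stage. The sketch below illustrates that reading and checks it against a single wide register. It is an interpretation made for illustration, not the circuit of Fig. 7; the names `stage4`, `reference`, and `LOW_BITS` are hypothetical.

```python
LOW_BITS = 19  # width of the low part of C kept in this stage


def stage4(c_low, qe, add_qe, shift):
    """Return (new_c_low, out_bits, carry) for one symbol."""
    total = c_low + (qe if add_qe else 0)
    carry = total >> LOW_BITS              # carry out of the 19-bit adder
    total &= (1 << LOW_BITS) - 1
    shifted = total << shift
    out_bits = shifted >> LOW_BITS         # the `shift` bits pushed upward
    return shifted & ((1 << LOW_BITS) - 1), out_bits, carry


def reference(c_full, qe, add_qe, shift):
    """Same update on one wide register, used only for comparison."""
    return (c_full + (qe if add_qe else 0)) << shift
```

If the high part is updated as `((c_high + carry) << shift) | out_bits`, the pair of registers holds the same value as the wide register, so the adder in this stage never needs to be wider than 19 bits.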
Step 5: the fifth pipeline stage manages the high-order part CH[7:0] of register C.
Register Bc holds the output result; register Len holds the output length; register CH holds the high-order data of C; register CT holds the number of valid bits in CH.
According to the JPEG2000 standard, the length of CH varies between 0 and 7 bits during data output. CT counts the number of data bits in CH. The updated value of CH and the output value of Bc are determined from CT together with the shift count from the fourth stage, and the number of bits retained in CH is used to update the CT counter. The flow of this stage is shown in Fig. 8.
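The bookkeeping described here resembles a bit accumulator: bits arrive in groups whose size is the shift count, whole bytes are released, and the remainder stays in CH with its length in CT. The sketch below shows only that idea. It is a simplification under stated assumptions: the function name `accumulate` is hypothetical, and carry propagation and the 7-bit byte that follows an 0xFF byte are deliberately not modelled.

```python
def accumulate(ch, ct, bits, n):
    """Append n new bits to the ct bits held in ch.

    Returns (new_ch, new_ct, out_bytes), where out_bytes lists the
    complete bytes released this cycle, oldest first.
    """
    acc = (ch << n) | bits        # pending bits followed by the new bits
    total = ct + n
    out_bytes = []
    while total >= 8:             # release complete bytes, oldest first
        total -= 8
        out_bytes.append((acc >> total) & 0xFF)
    return acc & ((1 << total) - 1), total, out_bytes
```

With at most 7 bits pending and a shift count of at most 15, no more than two bytes can be released in one cycle, which fits the two output values B0 and B1 named in the step summary.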
Step 6: the sixth pipeline stage manages the final output byte B defined by the JPEG2000 standard.
Register B holds the temporary data; register BOut0 holds the high-order output; register BOut1 holds the low-order output.
If B equals 0xff, BOut0 is output directly as B; otherwise BOut0 is B plus the carry bit from the fifth stage. At the same time, BOut1 and B are updated according to the number of bytes output by the fifth stage. The flow of this stage is shown in Fig. 9.
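The byte rule stated above can be written down directly. The sketch below is a minimal illustration, assuming that the pending byte B is released whenever at least one new byte arrives; the function name `stage6`, the handling of a cycle with no new byte, and the ordering of BOut1 are assumptions, and the shortened byte that follows 0xFF is not modelled.

```python
def stage6(b, carry, new_bytes):
    """Return (bytes_out, new_b).

    b is the pending byte B, carry is 0 or 1 from the previous stage, and
    new_bytes holds the zero, one, or two bytes arriving this cycle.
    """
    resolved = b if b == 0xFF else b + carry   # rule for BOut0
    if not new_bytes:
        return [], resolved                    # nothing to release yet
    # The pending byte is released first (BOut0), followed by all but the
    # newest arriving byte (BOut1); the newest byte becomes the pending B.
    return [resolved] + list(new_bytes[:-1]), new_bytes[-1]
```

Holding one byte back in B is what allows a late carry to be absorbed before the byte leaves the encoder.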
Table 1 compares the speed and resource usage of the present invention with those of a four-stage pipeline implementation on the same FPGA model.
Table 1:

Claims (1)

1. A high-speed FPGA implementation method for an MQ arithmetic encoder based on deep pipelining, characterized in that it comprises the following steps:
Step 1: in the first pipeline stage, the input (CX, D) data pair is used to look up the corresponding MQ encoder probability-table index and MPS value. The lookup table for the CX data is updated and maintained during operation, and lookup/update conflicts that arise when the same CX value is input several times in succession receive preliminary handling.
Step 2: in the second pipeline stage, the probability value is looked up from the probability-table index. First, depending on whether the same CX value has been input two or three times in succession, the index used for the probability lookup is selected from among the input from the first pipeline stage, the input from the third pipeline stage, the NMPS of the previous lookup result, and the NLPS of the previous lookup result.
Step 3: in the third pipeline stage, the current arithmetic coding interval A is re-partitioned according to the probability value output by the second stage. A is set to A-Qe or Qe depending on whether the input data D equals MPS, and the normalization operation is performed.
Step 4: in the fourth pipeline stage, the value of the output register C is determined from the output of the third stage. In the MPS case, C is left unchanged if A<Qe and is otherwise set to C+Qe; in the LPS case, C is set to C+Qe if A<Qe and is otherwise left unchanged.
Step 5: in the fifth pipeline stage, the high-order part of register C is managed. According to the JPEG2000 standard, CH has a variable length during data output. CT counts the number of data bits in CH, and the updated value of CH and the output values of B0 and B1 are determined from CT together with the shift count from the fourth stage.
Step 6: in the sixth pipeline stage, the final output byte B defined by the JPEG2000 standard is managed. If B equals 0xff, BOut0 is output directly as B; otherwise BOut0 is B plus the carry from the fifth stage. At the same time, BOut1 and B are updated according to the number of bytes output by the fifth stage.
CN201510091224.2A 2015-02-28 2015-02-28 High-speed FPGA implementation method for an MQ arithmetic encoder based on deep pipelining Expired - Fee Related CN104683806B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510091224.2A CN104683806B (en) 2015-02-28 2015-02-28 High-speed FPGA implementation method for an MQ arithmetic encoder based on deep pipelining

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510091224.2A CN104683806B (en) 2015-02-28 2015-02-28 High-speed FPGA implementation method for an MQ arithmetic encoder based on deep pipelining

Publications (2)

Publication Number Publication Date
CN104683806A (en) 2015-06-03
CN104683806B (en) 2017-12-26

Family

ID=53318291

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510091224.2A Expired - Fee Related CN104683806B (en) 2015-02-28 2015-02-28 High-speed FPGA implementation method for an MQ arithmetic encoder based on deep pipelining

Country Status (1)

Country Link
CN (1) CN104683806B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108900840A (en) * 2018-07-10 2018-11-27 珠海亿智电子科技有限公司 H264 macroblock-level bit rate control method for hardware implementation
US11967119B2 (en) 2019-07-22 2024-04-23 Zhejiang Dahua Technology Co., Ltd. Systems and methods for coding

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1477879A (en) * 2003-07-03 2004-02-25 复旦大学 High-speed low power consumption MQ encoder applicable to JPEG2000 standard
CN1564196A (en) * 2004-04-07 2005-01-12 西安交通大学 VLSI implementation method of a synchronous pipelined arithmetic coder
CN101820549A (en) * 2010-03-19 2010-09-01 西安电子科技大学 High-speed real-time processing arithmetic entropy coding system based on JPEG2000
CN102088607A (en) * 2011-02-28 2011-06-08 西安电子科技大学 MQ coding method and circuit based on the JPEG2000 standard
CN103248896A (en) * 2013-05-15 2013-08-14 中国科学院光电技术研究所 MQ arithmetic coder

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
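M. AHMADVAND ET AL.: "A NEW PIPELINED ARCHITECTURE FOR JPEG2000 MQ-CODER", 《PROCEEDINGS OF THE WORLD CONGRESS ON ENGINEERING AND COMPUTER SCIENCE》 *
MICHAEL DYER: "CONCURRENCY TECHNIQUES FOR ARITHMETIC CODING IN JPEG2000", 《IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS》 *
何国栋 et al.: "Optimization and FPGA implementation of the JPEG2000 MQ coding algorithm" (JPEG2000MQ编码算法的优化和FPGA实现), 《技术纵横》 *
刘彬: "Research and FPGA implementation of the MQ coding module in JPEG2000" (JEPG2000中的MQ编码模块的研究与FPGA实现), 《中国优秀硕士学位论文全文数据库》 *
雷磊: "Design and implementation of a JPEG2000 MQ encoder" (JPEG2000MQ编码器的设计与实现), 《图形、图像与多媒体》 *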

Also Published As

Publication number Publication date
CN104683806B (en) 2017-12-26

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20171226

Termination date: 20200228
