CN108184127B - Configurable multi-size DCT (discrete cosine transform) transformation hardware multiplexing architecture - Google Patents

Publication number
CN108184127B
Authority
CN
China
Prior art keywords
butterfly
layer
data
unit
data processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810039762.0A
Other languages
Chinese (zh)
Other versions
CN108184127A (en)
Inventor
陈志峰
郑静宜
杨秀芝
吴林煌
施隆照
郑明魁
陈建
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fuzhou University
Original Assignee
Fuzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fuzhou University filed Critical Fuzhou University
Priority to CN201810039762.0A priority Critical patent/CN108184127B/en
Publication of CN108184127A publication Critical patent/CN108184127A/en
Application granted granted Critical
Publication of CN108184127B publication Critical patent/CN108184127B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/625 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding using discrete cosine transform [DCT]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation

Abstract

The invention relates to a configurable multi-size DCT transform hardware multiplexing architecture. The architecture comprises: a judgment and data rearrangement module, which decides, according to the size of the DCT transform, whether the data input into the multiplexing architecture needs to be rearranged; a K-layer butterfly data processing module, which performs K layers of butterfly data processing on the data output by the judgment and data rearrangement module; and a final-stage vector inner product module, which multiplies the even-position data vector output by the last-layer butterfly data processing module by the corresponding core matrix coefficients, adds the products, and outputs the result. The architecture is implemented both as an FPGA-based digital logic hardware circuit and as an ASIC-based digital logic hardware circuit; it is simple, effective and reconfigurable, and can be widely applied to multi-size DCT transforms in various video compression coding standards.

Description

Configurable multi-size DCT (discrete cosine transform) transformation hardware multiplexing architecture
Technical Field
The invention relates to the technical field of DCT (discrete cosine transform) transformation of video compression coding, in particular to a configurable multi-size DCT hardware multiplexing architecture.
Background
The Discrete Cosine Transform (DCT) is an important module in video encoders. In current mainstream video compression coding standards, the DCT is usually required to support several transform sizes; for example, the DCT of HEVC has four sizes: 4 × 4, 8 × 8, 16 × 16 and 32 × 32. As video resolution moves toward 4K/8K, the maximum transform size supported by the DCT in future video compression coding standards will also increase.
The two-dimensional DCT (2D-DCT) in video compression coding standards can be written in matrix form as $Z = CXC^{T}$, where X is the residual matrix generated by the prediction coding module, C is the integer coefficient matrix specified by the standard, and Z is the transformed data. The 2D-DCT is usually implemented in two steps: the input residual matrix is first subjected to a 1D-DCT row by row, i.e. $Y = XC^{T}$, and the intermediate result Y is then subjected to a 1D-DCT column by column, i.e. $Z = CY$, so that a 2D-DCT requires only two 1D-DCTs. For a single-size DCT, existing DCT/IDCT hardware designs typically save resources by implementing constant multiplication with shifts and additions instead of multipliers. However, as the transform size continues to increase, the register area and power consumed by the shift operations also increase; moreover, the shift-and-add method can only implement constant multiplication and is not flexible enough when DCTs of different sizes in a video compression coding standard must share hardware. Some researchers have proposed multiplexing architectures suitable for 4-, 8-, 16- and 32-point DCTs, but these architectures suffer from low utilization of some modules, excessive complexity and resource consumption in the hardware implementation, or inflexible throughput configuration.
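The row-column decomposition described above can be illustrated with a minimal Python sketch. It uses the 4-point HEVC core matrix as C; the helper names (`matmul`, `transpose`, `dct_2d`) and the sample block `X` are illustrative assumptions, and the intermediate shifts and rounding that a real encoder applies between the two passes are deliberately left out.

```python
# HEVC 4-point core matrix (integer approximation of the DCT-II basis).
C4 = [
    [64,  64,  64,  64],
    [83,  36, -36, -83],
    [64, -64, -64,  64],
    [36, -83,  83, -36],
]

def matmul(a, b):
    """Plain integer matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def dct_2d(x, c):
    """Compute Z = C X C^T as two 1D passes."""
    y = matmul(x, transpose(c))   # Y = X C^T: 1D-DCT of every row of X
    return matmul(c, y)           # Z = C Y:   1D-DCT of every column of Y

# Hypothetical 4 x 4 residual block used only for illustration.
X = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
Z = dct_2d(X, C4)
```

Because both passes are the same 1D operation, a single 1D-DCT datapath can serve the row pass and the column pass, which is why the architecture only has to provide a 1D-DCT.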
Disclosure of Invention
The invention aims to provide a configurable multi-size DCT hardware multiplexing architecture, which can realize multi-size 1D-DCT, has high resource utilization rate, can flexibly call core matrix coefficients with different sizes to multiply so as to realize DCT with different sizes, and can realize different throughputs under the condition of different configuration parameters.
In order to achieve the purpose, the technical scheme of the invention is as follows: a configurable multi-size DCT transform hardware multiplexing architecture, comprising:
a judgment and data rearrangement module, which decides, according to the size of the DCT transform, whether the data input into the multiplexing architecture needs to be rearranged; for the DCT transform of the maximum size, the data input into the module is not rearranged and is output directly; for DCT transforms smaller than the maximum size, the data input into the module is rearranged so that the arranged data satisfies the rule of the subsequent butterfly operations, which makes it possible to process multiple rows of input data in parallel and thereby fully use the interface resources of the multiplexing architecture;
a K-layer butterfly data processing module, which performs K layers of butterfly data processing on the data output by the judgment and data rearrangement module; each layer first performs a butterfly operation on the data input into its butterfly unit; the even-position data output by the butterfly unit serves as the input of the next layer, while the odd-position data output by the butterfly unit is fed to the multiplication unit of the current layer and multiplied by the corresponding core matrix coefficients, and the products are added by the addition unit of the current layer and then output;
and a final-stage vector inner product module, which multiplies the even-position data vector output by the last-layer butterfly data processing module by the corresponding core matrix coefficients, adds the products, and outputs the result.
In an embodiment of the present invention, the K-layer butterfly data processing module consists of K butterfly data processing layers, connected so that the even-position data output by the butterfly unit of the previous layer is the input of the butterfly unit of the current layer. Each of the K layers includes:
a k-th layer butterfly unit, which performs a butterfly operation on the even-position data output by the butterfly unit of the previous layer or, if k = 1, on the data output by the judgment and data rearrangement module; the even-position data it outputs is the input of the next layer, and the odd-position data it outputs is the input of the multiplication unit of the current layer;
a k-th layer multiplication unit, which multiplies the odd-position data output by the k-th layer butterfly unit by the corresponding core matrix coefficients; the number of multipliers in the multiplication unit is configurable so as to realize different data throughputs;
a k-th layer addition unit, which adds the data output by the k-th layer multiplication unit pairwise, stage by stage, the result being an output of the multiplexing architecture;
where 1 ≤ k ≤ K.
In an embodiment of the present invention, the last-stage vector inner product module is configured to perform a vector inner product operation, and includes:
the final-stage multiplication unit is used for multiplying the even-number position data output by the K-th-layer butterfly unit by the corresponding core matrix coefficient;
and the final-stage addition unit is used for adding the data output by the final-stage multiplication unit pairwise and taking the data as the output of the multiplexing architecture.
In an embodiment of the present invention, the configurable multi-size DCT transform hardware multiplexing architecture is implemented by using two ways, namely, an FPGA-based digital logic hardware circuit and an ASIC-based digital logic hardware circuit.
Compared with the prior art, the invention has the following beneficial effects: the configurable multi-size DCT transform hardware multiplexing architecture improves the realization architecture of the traditional DCT transform, can effectively improve the resource utilization rate of the internal module of the whole multiplexing architecture, is compatible with DCT transforms of various sizes, and can flexibly configure the throughput of the DCT transform hardware multiplexing architecture; in addition, the configurable DCT conversion hardware multiplexing architecture is respectively realized by adopting a digital logic hardware circuit based on FPGA and a digital logic hardware circuit based on ASIC, is simple, effective and reconfigurable, and can be widely applied to multi-size DCT conversion in various video compression coding standards.
Drawings
FIG. 1 is a block diagram of an embodiment of the present invention.
FIG. 2 shows the arrangement rule of the $S_{max}$ input data when an S-point ($S < S_{max}$) 1D-DCT is performed in an embodiment of the present invention.
FIG. 3 is the internal structure diagram of the $(\log_2 S_{max} - 1)$-stage addition in an embodiment of the present invention.
Detailed Description
The technical scheme of the invention is specifically explained below with reference to the accompanying drawings.
The invention discloses a configurable multi-size DCT (discrete cosine transform) hardware multiplexing architecture comprising a judgment and data rearrangement module, a K-layer butterfly data processing module and a final-stage vector inner product module. The judgment and data rearrangement module decides, according to the size of the DCT transform, whether the data input into the multiplexing architecture needs to be rearranged: for the DCT transform of the maximum size, the data is not rearranged and is output directly; for DCT transforms smaller than the maximum size, the data is rearranged so that it satisfies the rule of the subsequent butterfly operations, which makes it possible to process multiple rows of input data in parallel and thereby fully use the interface resources of the multiplexing architecture. The K-layer butterfly data processing module performs K layers of butterfly data processing on the data output by the judgment and data rearrangement module: each layer performs a butterfly operation on the data input into its butterfly unit; the even-position data output by the butterfly unit is the input of the next layer, while the odd-position data is fed to the multiplication unit of the current layer and multiplied by the corresponding core matrix coefficients, and the products are added by the addition unit and then output. The final-stage vector inner product module multiplies the even-position data vector output by the butterfly unit of the last layer by the corresponding core matrix coefficients; the products are added pairwise, stage by stage, by the addition unit and then output.
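The data flow of the three modules can be sketched in software for a single maximum-size transform. The following is a minimal Python model, not the hardware design itself: it assumes the 8-point HEVC core matrix as the largest transform, and the function names (`butterfly`, `dct_1d_layered`, `dct_1d_direct`, `inner`) are hypothetical. Each layer sends the differences (odd-position data) to a multiply-and-add step and passes the sums (even-position data) to the next layer; the final stage is a plain vector inner product.

```python
# HEVC 8-point core matrix; the left half of its even rows is the 4-point matrix.
C8 = [
    [64,  64,  64,  64,  64,  64,  64,  64],
    [89,  75,  50,  18, -18, -50, -75, -89],
    [83,  36, -36, -83, -83, -36,  36,  83],
    [75, -18, -89, -50,  50,  89,  18, -75],
    [64, -64, -64,  64,  64, -64, -64,  64],
    [50, -89,  18,  75, -75, -18,  89, -50],
    [36, -83,  83, -36, -36,  83, -83,  36],
    [18, -50,  75, -89,  89, -75,  50, -18],
]

def inner(a, b):
    """Vector inner product (multiplication unit followed by addition unit)."""
    return sum(p * q for p, q in zip(a, b))

def butterfly(v):
    """One butterfly layer: sums (even-position data) and differences (odd-position data)."""
    half = len(v) // 2
    even = [v[i] + v[-1 - i] for i in range(half)]
    odd = [v[i] - v[-1 - i] for i in range(half)]
    return even, odd

def dct_1d_direct(x, core):
    """Reference 1D transform: full matrix-vector product."""
    return [inner(row, x) for row in core]

def dct_1d_layered(x, core, num_layers):
    """1D transform via num_layers butterfly layers plus a final inner product."""
    out = [0] * len(x)
    data, stride = list(x), 1
    for _ in range(num_layers):
        even, odd = butterfly(data)
        half = len(odd)
        # Odd rows of the current core matrix act on the odd-position data.
        for r in range(half):
            out[stride * (2 * r + 1)] = inner(core[2 * r + 1][:half], odd)
        # Even rows (left half) form the core matrix of the half-size transform.
        core = [core[2 * r][:half] for r in range(half)]
        data, stride = even, stride * 2
    # Final-stage vector inner product on the remaining even-position data.
    for r, row in enumerate(core):
        out[stride * r] = inner(row, data)
    return out
```

With `num_layers=2` the call `dct_1d_layered(x, C8, 2)` should agree with `dct_1d_direct(x, C8)` for any 8-element integer vector `x`, because the even rows of the core matrix are symmetric and the odd rows antisymmetric about the centre.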
Fig. 1 is a block diagram of the structure of an embodiment of the present invention. In this embodiment, the maximum size of the multi-size DCT transform supported by the multiplexing architecture is denoted $S_{max}$ and the minimum size $S_{min}$. When an S-point ($S \le S_{max}$) DCT transform is performed, $r(S) = S_{max}/S$ rows (or columns) of data are taken from the S × S image block each time and concatenated into one row of $N_{in}$ ($N_{in} = S_{max}$) data that is input into the multiplexing architecture, each datum being $w_{in}$ bits wide. The multiplexing architecture contains K layers of butterfly data processing modules, where $K = \min(\log_2 S_{min}, \log_2 S_{max}) = \log_2 S_{min}$. The whole multiplexing architecture is implemented on FPGA and ASIC hardware and comprises a judgment and data rearrangement module 11, a K-layer butterfly data processing module 12 and a final-stage vector inner product module 13.
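As a small worked example of these parameters, the sketch below computes the number of rows packed into one input vector and the number of butterfly layers. It assumes $r(S) = S_{max}/S$, which follows from the relation $S \cdot r(S) = S_{max}$ stated in the description, and power-of-two transform sizes; the function names are hypothetical.

```python
def rows_per_input(s, s_max):
    """r(S): how many S-point rows fill one S_max-wide input vector."""
    if s_max % s != 0:
        raise ValueError("S must divide S_max")
    return s_max // s

def butterfly_layers(s_min):
    """K = log2(S_min) for a power-of-two minimum transform size."""
    if s_min < 1 or s_min & (s_min - 1):
        raise ValueError("S_min must be a power of two")
    return s_min.bit_length() - 1

# HEVC transform sizes as an example: S_min = 4, S_max = 32.
hevc_rows = {s: rows_per_input(s, 32) for s in (4, 8, 16, 32)}
hevc_layers = butterfly_layers(4)
```

For the HEVC sizes this gives 8, 4, 2 and 1 rows per input vector for S = 4, 8, 16 and 32, and K = 2 butterfly layers.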
The judgment and data rearrangement module 11 decides, according to the size of the DCT transform, whether the data input into the multiplexing architecture needs to be rearranged. For the DCT transform of the maximum size, the data input into the module is not rearranged and is output directly; for DCT transforms smaller than the maximum size, the data input into the module is rearranged so that the arranged data satisfies the rule of the subsequent butterfly operations. The rule followed during rearrangement is shown in Fig. 2, where the order of the arrows is the order of the rearranged data vector [equation image GDA0002418695240000041]. The square array is the input matrix $X'_{(r \times S)}$ composed of the r(S) row input vectors $x = [x_0, x_1, ..., x_{S-1}]$ of length S before rearrangement; the elements of $X'_{(r \times S)}$ are $x'_{a,b}$, and they correspond one-to-one to the elements [equation image GDA0002418695240000042] of the rearranged vector. To let all K butterfly layers process the r(S) row vectors x in parallel, the rearranged vector [equation image GDA0002418695240000043] is divided into $2^{K}$ parts; the element [equation image GDA0002418695240000044] belongs to part [equation image GDA0002418695240000045] and is element number [equation image GDA0002418695240000046] of that part. Within each part, all rows are traversed in raster order, and to ensure head-to-tail symmetry the scanning orders of two adjacent parts are opposite: when c is even, the mapping is [equation image GDA0002418695240000047], and when c is odd, it is [equation image GDA0002418695240000048].
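The exact index formulas of the rearrangement are contained in equation images that are not reproduced in this text, so the following Python sketch is only one possible arrangement that satisfies the stated requirements, not the patent's own formula. It assumes that each of the $2^{K}$ parts holds a contiguous block of columns from every row, that even-numbered parts visit the rows top-down and odd-numbered parts bottom-up, and that columns keep their order inside a row; `rearrange` and `pairing_is_valid` are hypothetical names. The check confirms that every full-width butterfly layer pairs element b of a row with element (width - 1 - b) of the same row.

```python
def rearrange(rows, num_layers):
    """Interleave r rows of length S into one vector of length r * S (assumed rule)."""
    r, s = len(rows), len(rows[0])
    parts = 2 ** num_layers
    m = s // parts                      # columns of each row that fall into one part
    out = []
    for c in range(parts):
        # Adjacent parts scan the rows in opposite directions.
        order = range(r) if c % 2 == 0 else range(r - 1, -1, -1)
        for a in order:
            out.extend(rows[a][c * m:(c + 1) * m])
    return out

def pairing_is_valid(s, s_max, num_layers):
    """Check that mirrored positions always hold matching elements of one row."""
    r = s_max // s
    labels = [[(a, b) for b in range(s)] for a in range(r)]
    vec, width = rearrange(labels, num_layers), s
    for _ in range(num_layers):
        half = len(vec) // 2
        for i in range(half):
            (a1, b1), (a2, b2) = vec[i], vec[-1 - i]
            if a1 != a2 or b1 + b2 != width - 1:
                return False
        # The even-position outputs keep the positions of the first half.
        vec, width = vec[:half], width // 2
    return True

# Two 4-point rows packed for S_max = 8 with K = 2 layers.
example = rearrange([[0, 1, 2, 3], [10, 11, 12, 13]], 2)
```

Under this assumed rule a single row of maximum size comes out unchanged, which is consistent with the statement that the maximum-size transform needs no rearrangement.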
The K-layer butterfly data processing module 12 performs K layers of butterfly data processing on the rearranged data; the even-position data output by the butterfly unit of each layer is the input of the next layer, and the horizontal ellipsis in Fig. 1 stands for the intermediate layers. By configuring the number of multipliers in the multiplication units of the K-layer butterfly data processing module, the throughput T(S) of the multiplexing architecture can be set flexibly within the range $S_{min} \cdot r(S) \le T(S) \le S \cdot r(S) = S_{max}$. Each butterfly data processing layer comprises 3 sub-modules. The first, second, (K-1)-th and K-th layers are described below as examples.
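As a worked example of the throughput range, take the HEVC sizes $S_{min} = 4$ and $S_{max} = 32$ and use $r(S) = S_{max}/S$, which follows from $S \cdot r(S) = S_{max}$. The unit of T(S) is not stated in this passage; reading it as output data per processing pass is an assumption.

```latex
\begin{align*}
S = 4:  &\quad r(4) = 8,  & 4 \cdot 8 = 32 &\le T(4) \le 32 \\
S = 8:  &\quad r(8) = 4,  & 4 \cdot 4 = 16 &\le T(8) \le 32 \\
S = 16: &\quad r(16) = 2, & 4 \cdot 2 = 8  &\le T(16) \le 32 \\
S = 32: &\quad r(32) = 1, & 4 \cdot 1 = 4  &\le T(32) \le 32
\end{align*}
```

The upper bound is the same for every size, while the lower bound shrinks as S grows, so the number of multipliers matters most for the large transforms.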
The first layer butterfly data processing module comprises 3 sub-modules which are respectively:
(1) First-layer butterfly unit 1201: this module performs a butterfly operation on the rearranged data and outputs two parts, the even-position data and the odd-position data; each part contains half as many data as the butterfly unit's input, and the bit width of each datum increases by 1 bit.
(2) First-layer multiplication unit 1202: the module multiplies the odd position data output by the first-layer butterfly unit 1201 by the corresponding core matrix coefficient.
(3) First-layer addition unit 1203: this module adds the vector data output by the first-layer multiplication unit 1202 pairwise, stage by stage, with $(\log_2 S_{max} - 1)$ stages of addition in total, as shown in Fig. 3. For the $S_{max}$-point transform, one datum is output by the adder of the last stage of the addition unit; for the $S_{max}/2$-point transform, two data are output by the adders of the second-to-last stage; smaller transform sizes follow by analogy. Similarly, the k-th layer ($1 \le k \le K$) addition unit has $(\log_2 S_{max} - k)$ stages of addition in total.
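The staged pairwise addition can be modelled with a short Python sketch. It records the outputs of every stage so that a result can be tapped at the last stage or at an earlier one; `adder_tree` is a hypothetical name, the input length is assumed to be a power of two, and how the products of different rows are routed to the adder inputs is not specified here, so the sketch simply assumes that each row's products occupy a contiguous block.

```python
def adder_tree(values):
    """Add values pairwise, stage by stage, and return the outputs of every stage."""
    stages = []
    level = list(values)
    while len(level) > 1:
        level = [level[i] + level[i + 1] for i in range(0, len(level), 2)]
        stages.append(level)
    return stages

# First layer with S_max = 32: the multiplication unit delivers 16 products.
products = list(range(16))          # placeholder values for illustration only
stages = adder_tree(products)
num_stages = len(stages)            # log2(32) - 1 stages
full_size_output = stages[-1]       # one datum for the S_max-point transform
half_size_outputs = stages[-2]      # two data when two half-size rows share the tree
```

Tapping an earlier stage instead of the last one is what lets the same adders serve several transform sizes: each stage skipped doubles the number of outputs.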
The second layer butterfly data processing module comprises 3 sub-modules which are respectively:
(1) second tier butterfly unit 1204: the module performs butterfly operation on the even position data output by the first-layer butterfly unit 1201, and outputs two parts of data of the even position and the odd position.
(2) Second-layer multiplication unit 1205: which multiplies the odd position data output by the second-tier butterfly unit 1204 by the corresponding core matrix coefficients.
(3) Second-layer addition unit 1206: this module adds the vector data output by the second-layer multiplication unit 1205 pairwise, stage by stage, with $(\log_2 S_{max} - 2)$ stages of addition in total.
The 3 sub-modules contained in the K-1 layer butterfly data processing module are respectively:
(1) layer K-1 butterfly unit 1207: the module carries out butterfly operation on even position data output by the K-2 layer butterfly unit and outputs the data into two parts of data of an even position and an odd position.
(2) Layer K-1 multiplication unit 1208: the module multiplies the odd position data output by the K-1 butterfly unit 1207 by the corresponding core matrix coefficients.
(3) Layer K-1 addition unit 1209: this module adds the vector data output by the (K-1)-th layer multiplication unit 1208 pairwise, stage by stage, with $(\log_2 S_{max} - (K-1))$ stages of addition in total.
The K-th layer butterfly data processing module comprises 3 sub-modules which are respectively:
(1) layer K butterfly unit 1210: the module performs butterfly operation on the even position data output by the K-1 layer butterfly unit 1207, and outputs two parts of data of the even position and the odd position.
(2) K-th layer multiplication unit 1211: this module multiplies the odd position data output by the K-th butterfly unit 1210 by the corresponding core matrix coefficients.
(3) Layer K addition unit 1212: this module adds the vector data output by the K-th layer multiplication unit 1211 pairwise, stage by stage, with $(\log_2 S_{max} - K)$ stages of addition in total.
The last-stage vector inner product module 13 is configured to multiply the data vector output by the last-layer (i.e., K-th-layer) butterfly data processing module by the corresponding core matrix coefficient, add the multiplied results by the addition unit, and output the result. The module comprises 2 submodules, wherein each submodule is respectively as follows:
(1) final multiplication unit 131: this module multiplies the even position data output by the K-th butterfly unit 1210 by the corresponding core matrix coefficients.
(2) Last-stage addition unit 132: this module adds the vector data output by the final multiplication unit 131 pairwise, stage by stage, with $(\log_2 S_{max} - K)$ stages of addition in total.
The above are preferred embodiments of the present invention. Any change made according to the technical scheme of the present invention whose functional effect does not go beyond the scope of that technical scheme falls within the protection scope of the present invention.

Claims (2)

1. A configurable multi-size DCT transform hardware multiplexing architecture, comprising:
the judgment and data rearrangement module judges whether the data input into the multiplexing framework needs to be rearranged according to the size of DCT transformation; for the DCT transform with the maximum size, the data input into the module does not need to be rearranged and is directly output; for DCT transformation smaller than the maximum size, rearranging the data input into the module to ensure that the arranged data meets the rule of subsequent butterfly operation, and providing guarantee for realizing parallel processing of multiple rows of input data, thereby fully utilizing interface resources of a multiplexing architecture;
the K-layer butterfly data processing module is used for carrying out K-layer butterfly data processing on the data processed by the judgment and data rearrangement module; each layer of butterfly shape data processing module firstly carries out butterfly operation on data input into the butterfly unit, even position data output after the calculation of the butterfly unit is used as the input of the next layer of butterfly shape data processing module, odd position data output after the calculation of the butterfly unit is used as the input of the multiplication unit of the current layer and multiplied by the corresponding core matrix coefficient, and the multiplied results are added through the addition unit of the current layer and then output;
the last-stage vector inner product module multiplies the even-number position data vectors output by the last-layer butterfly data processing module by the corresponding core matrix coefficients, adds the multiplied results, and then outputs the result;
the K-layer butterfly data processing module comprises K butterfly data processing layers, connected so that the even-position data output by the butterfly unit of the previous layer is the input of the butterfly unit of the current layer; each of the K layers comprises:
the k-layer butterfly unit is used for performing butterfly operation on even position data output by the butterfly processing unit in the previous layer of butterfly data processing module, and if k is 1, performing butterfly operation on data output by the judgment and data rearrangement module; the even position data output after operation is used as the input of the next butterfly data processing module, and the odd position data output after operation is used as the input of the multiplication unit of the current layer;
the k-layer multiplication unit multiplies the odd-number position data output by the k-layer butterfly unit by the corresponding core matrix coefficient, and the number of multipliers contained in the multiplication unit is configurable so as to realize different data throughputs;
the k-th layer addition unit is used for performing pairwise addition on the data output by the k-th layer multiplication unit step by step, and the result is used as the output of the multiplexing framework;
wherein 1 ≤ k ≤ K;
the last-stage vector inner product module is used for finishing vector inner product operation and comprises the following steps:
the final-stage multiplication unit is used for multiplying the even-number position data output by the K-th-layer butterfly unit by the corresponding core matrix coefficient;
and the final-stage addition unit is used for adding the data output by the final-stage multiplication unit pairwise and taking the data as the output of the multiplexing architecture.
2. A configurable multi-size DCT-transform hardware multiplexing architecture according to claim 1, wherein said configurable multi-size DCT-transform hardware multiplexing architecture is implemented using both FPGA-based digital logic hardware circuits and ASIC-based digital logic hardware circuits.
CN201810039762.0A 2018-01-13 2018-01-13 Configurable multi-size DCT (discrete cosine transform) transformation hardware multiplexing architecture Active CN108184127B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810039762.0A CN108184127B (en) 2018-01-13 2018-01-13 Configurable multi-size DCT (discrete cosine transform) transformation hardware multiplexing architecture

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810039762.0A CN108184127B (en) 2018-01-13 2018-01-13 Configurable multi-size DCT (discrete cosine transform) transformation hardware multiplexing architecture

Publications (2)

Publication Number Publication Date
CN108184127A CN108184127A (en) 2018-06-19
CN108184127B true CN108184127B (en) 2020-06-12

Family

ID=62550603

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810039762.0A Active CN108184127B (en) 2018-01-13 2018-01-13 Configurable multi-size DCT (discrete cosine transform) transformation hardware multiplexing architecture

Country Status (1)

Country Link
CN (1) CN108184127B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112218082B (en) * 2020-12-04 2021-03-16 北京电信易通信息技术股份有限公司 Reconfigurable multi-video coding acceleration design-based method and system
CN114007079A (en) * 2021-10-09 2022-02-01 上海为旌科技有限公司 Conversion circuit, method, device and encoder

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6308193B1 (en) * 1998-01-30 2001-10-23 Hyundai Electronics Ind. Co., Ltd. DCT/IDCT processor
CN101634981A (en) * 2009-08-25 2010-01-27 浙江大学 Method and device for processing discrete cosine transform
CN102857756A (en) * 2012-07-19 2013-01-02 西安电子科技大学 Transfer coder adaptive to high efficiency video coding (HEVC) standard
CN103369326A (en) * 2013-07-05 2013-10-23 西安电子科技大学 Transition coder applicable to HEVC ( high efficiency video coding) standards
CN104581174A (en) * 2015-01-22 2015-04-29 复旦大学 High-throughput DCT and IDCT hardware multiplexing structure suitable for HEVC standard

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9049452B2 (en) * 2011-01-25 2015-06-02 Mediatek Singapore Pte. Ltd. Method and apparatus for compressing coding unit in high efficiency video coding

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
《H.264/AVC中整数DCT变换量化模块的Verilog设计》;沈劲桐等;《计算机与现代化》;20130228(第210期);全文 *

Also Published As

Publication number Publication date
CN108184127A (en) 2018-06-19

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant