CN1268231A - Variable block size 2-dimensional inverse discrete cosine transform engine - Google Patents

Variable block size 2-dimensional inverse discrete cosine transform engine

Info

Publication number
CN1268231A
Authority
CN
China
Prior art keywords
idct
butterfly operation
processor
serial
plurality
Prior art date
Application number
CN 98808477
Other languages
Chinese (zh)
Inventor
K. D. Easton
Original Assignee
Qualcomm Incorporated
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority to US91809097A
Application filed by Qualcomm Incorporated
Publication of CN1268231A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/14 Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
    • G06F17/147 Discrete orthonormal transforms, e.g. discrete cosine transform, discrete sine transform, and variations therefrom, e.g. modified discrete cosine transform, integer transforms approximating the discrete cosine transform

Abstract

A variable block size 2-D IDCT engine (10) that can compute an arbitrary mix of transforms. A first 1-D IDCT processor (20a) computes the transforms of the data block column by column and stores the intermediate results in a transpose memory. A second 1-D IDCT processor (20b) computes the transforms of the intermediate results row by row. Different mixes of transforms are readily performed by properly ordering the input data, selectively combining the input data before the butterfly stages, and controlling the additions and multiplications at each butterfly stage. Butterflies that are not needed are placed in bypass mode. The butterflies can be implemented with serial adders (56) and bit-serial multipliers, which greatly simplifies the hardware design and reduces the routing required between successive butterfly stages. The overall pipelined architecture lets the IDCT engine sustain a throughput of one pixel per clock cycle.

Description

Variable block size 2-dimensional inverse discrete cosine transform engine

BACKGROUND OF THE INVENTION

I. Field of the Invention

The present invention relates to digital signal processing. More particularly, it relates to a novel and improved variable block size 2-dimensional (2-D) inverse discrete cosine transform (IDCT) engine.

II. Description of the Related Art

The 2-dimensional discrete cosine transform (DCT) and inverse discrete cosine transform (IDCT) are important signal processing operations in digital image compression. One application of such compression is high definition television (HDTV). In HDTV, the analog video waveform is conditioned and digitized by an analog-to-digital converter (ADC). The resulting sampled data is then processed digitally to reduce the amount of data that must be transmitted and/or stored while preserving high image quality. A key element of the compression process is the 2-D discrete cosine transform, which transforms an N×N block of sampled data, or image, from the time domain to the frequency domain. The transformed data can be processed further with block codes such as Huffman codes and run-length codes, and/or with error-correcting codes such as convolutional codes and Reed-Solomon codes. An exemplary HDTV image compression scheme is disclosed in U.S. Pat. Nos. 5,452,104, 5,107,345 and 5,021,891, all three entitled "Adaptive Block Size Image Compression Method and System", and in U.S. Pat. No. 5,576,767, entitled "Interframe Video Encoding and Decoding System". All four patents are assigned to the assignee of the present invention and are incorporated herein by reference.

What is transmitted and/or stored is the digitally encoded video waveform. At the receiver, the inverse digital signal processing is performed to reconstruct the individual pixels of the original image. The reconstructed image is provided to a digital-to-analog converter (DAC), which converts it back to an analog video waveform that can be displayed on a monitor or television set.

An important element of the decoding process is the inverse discrete cosine transform, which transforms the frequency domain data back to the time domain. The IDCT engine must run at a high throughput rate to reconstruct the original image in real time. In addition, the IDCT engine typically resides in a consumer product, so cost is a major consideration. The IDCT engine therefore needs to operate at high speed with low complexity.

Digital image compression systems typically process the video signal frame by frame. Each video frame is further divided into N×N blocks. In most compression systems, the block size is fixed by the system design to simplify the implementation of the DCT and IDCT engines.

Allowing the block size to vary can improve compression system performance under certain conditions, permitting better compression of the image and/or higher quality of the reconstructed image. A variable block size can exploit certain characteristics of the image. In the prior art, variable block size DCT and IDCT engines are designed with a bank of transform processors of different sizes. Each processor computes the transform for a different block size on the same data block. The transforms output by the different processors are then combined into the desired composite transform block. This approach is inflexible because it requires a large amount of hardware and complex coordination among the hardware modules.

SUMMARY OF THE INVENTION

The present invention is a novel and improved variable block size 2-dimensional (2-D) inverse discrete cosine transform (IDCT) engine. In accordance with the invention, the N×N data block is transformed column by column by a first 1-D IDCT processor. The intermediate results from the first IDCT processor are stored temporarily in a transpose memory. Once all columns have been processed, the intermediate results are transformed row by row by a second 1-D IDCT processor. The output of the second IDCT processor is the transformed output of the IDCT engine.

It is an object of the present invention to provide a 2-D IDCT engine that can compute an arbitrary mix of transforms within an N×N block. In the exemplary embodiment, each block can be a single 16×16 transform or a mix of any combination of 8×8, 4×4, and/or 2×2 transforms. In the exemplary embodiment, a 21-bit control signal precisely describes the desired partition and directs the IDCT engine to compute the appropriate combination of transforms. In the present invention, different combinations of transforms are readily performed by properly ordering the input data, selectively combining the data before the butterfly stages, and controlling the additions and multiplications at each butterfly stage. Butterflies that are not needed are placed in bypass mode.

It is another object of the present invention to simplify the design of the 2-D IDCT engine by performing the computation serially. Because only one bit of the data is processed at a time, serial adders and bit-serial multipliers greatly simplify the design. Serial computation also greatly simplifies the cross routing between successive butterfly stages. Because of the pipelined architecture of the IDCT engine, the throughput rate is maintained at one transformed point, or pixel, per clock cycle. This is the same throughput rate as for parallel computation. Only the processing delay increases, because of the serial nature of the computation.

It is yet another object of the present invention to reduce the memory requirement. For the 2-D IDCT, the data block is first transformed column by column by a 1-D IDCT processor, and the intermediate results are stored temporarily, by column, in the transpose memory. The second 1-D transform is performed only after all columns have been transformed. Because of the pipelined architecture of the IDCT engine, intermediate results are written to the memory by column while, in parallel, intermediate results are read from the memory by row. To avoid overwriting memory locations that hold data still needed, the memory is transposed for successive N×N blocks, that is, it alternates between column-major and row-major order. Using a read-modify-write cycle, an intermediate result is read from a memory location and a new result is written to the same location within the same clock cycle. The transpose memory reduces the memory requirement to a single bank the same size as one N×N block.
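The alternating column-major and row-major addressing described above can be illustrated with a small software model. This is only a sketch under stated assumptions: the class name `TransposeMemory`, the method `exchange_block`, and the flat-list storage are hypothetical and are not taken from the patent, and the hardware read-modify-write cycle is modeled as a read followed by a write to the same address.

```python
class TransposeMemory:
    """Software model of a single N x N bank shared by two pipelined passes.

    Hypothetical illustration: names and layout are assumptions.
    """

    def __init__(self, n):
        self.n = n
        self.cells = [None] * (n * n)   # one bank, the size of one block
        self.flip = False               # toggles the addressing order per block

    def _addr(self, i, j):
        # i: index of the incoming column, j: element within that column.
        # The mapping alternates every block so that the same address
        # sequence writes columns of the new block and reads rows of the old.
        return j * self.n + i if not self.flip else i * self.n + j

    def exchange_block(self, columns):
        """Write a block given as N columns; return the previous block as N rows."""
        n = self.n
        previous_rows = []
        for i in range(n):
            row = []
            for j in range(n):
                a = self._addr(i, j)
                row.append(self.cells[a])      # read the old intermediate result
                self.cells[a] = columns[i][j]  # write the new one to the same cell
            previous_rows.append(row)
        self.flip = not self.flip
        return previous_rows


# Usage: block A = [[1, 2], [3, 4]] is passed in as its columns.
mem = TransposeMemory(2)
mem.exchange_block([[1, 3], [2, 4]])
rows_of_a = mem.exchange_block([[5, 7], [6, 8]])   # columns of block B
```

In this model every cell is read just before it is overwritten, which is why one bank suffices instead of two ping-pong buffers.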

BRIEF DESCRIPTION OF THE DRAWINGS

The features, objects, and advantages of the present invention will become more apparent from the detailed description set forth below when taken in conjunction with the drawings, in which like reference characters identify corresponding parts throughout and wherein:

FIGS. 1A, 1B and 1C show, respectively, an exemplary N×N image, a partitioned image, and the tree diagram corresponding to the partitioned image;
FIG. 2 is a block diagram of an exemplary variable block size 2-D IDCT engine of the present invention;
FIGS. 3A-3D are diagrams of the 2-point, 4-point, 8-point, and 16-point IDCT trellises, respectively, of the present invention;
FIG. 4 is a block diagram of an exemplary 1-D IDCT processor of the present invention;
FIGS. 5A and 5B are, respectively, an illustration of a serial butterfly and a block diagram of an exemplary embodiment of the serial butterfly;
FIGS. 6A and 6B are block diagrams of an exemplary bit-serial multiplier, shown at the word level and at the bit level, respectively;
FIG. 7 is a block diagram of an exemplary serial adder; and
FIG. 8 is a block diagram of an exemplary I/O buffer.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

The discrete cosine transform (DCT) and the inverse discrete cosine transform (IDCT) are important, complementary digital signal processing operations. The DCT transforms sampled data from the time domain to the frequency domain according to equation (1).

Here, N is the dimension of the transform, C(0) = 1/√2, and C(k) = 1 for k = 1, 2, 3, ..., N-1. The DCT is typically performed on the sampled data as one of a series of digital signal processing operations. Other operations, including quantization, data compression, and error-correction coding, are performed on the transformed data. An exemplary digital image compression technique is discussed in detail in the aforementioned U.S. Pat. No. 5,452,104.

The IDCT transforms the data from the frequency domain back to the time domain according to equation (2).
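The formulas for equations (1) and (2) are not reproduced in this text. As a point of reference only, one common 1-D DCT/IDCT pair that is consistent with the stated constants C(0) = 1/√2 and C(k) = 1 is given below. The 2/N scale factor and its placement on the forward transform are assumptions, not taken from the patent; with this choice the two transforms are exact inverses of each other.

```latex
% Assumed normalization, consistent with C(0) = 1/\sqrt{2}, C(k) = 1 for k >= 1
\begin{align}
X(k) &= \frac{2\,C(k)}{N} \sum_{n=0}^{N-1} x(n)\,
        \cos\!\left[\frac{(2n+1)k\pi}{2N}\right],
        & k &= 0, 1, \dots, N-1 \tag{1} \\
x(n) &= \sum_{k=0}^{N-1} C(k)\,X(k)\,
        \cos\!\left[\frac{(2n+1)k\pi}{2N}\right],
        & n &= 0, 1, \dots, N-1 \tag{2}
\end{align}
```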

The DCT and the IDCT are both separable transforms, which means that a 2-D transform can be decomposed into two 1-D transforms. A 2-D IDCT of a data block can be computed by first performing a 1-D IDCT on each column of the block. The intermediate results of this first IDCT are stored temporarily in a memory element. A second 1-D IDCT is then performed on each row of the intermediate results. The output of the second IDCT comprises the reconstructed pixels of the original image.
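The column-then-row decomposition can be sketched in a few lines of floating-point Python. This is an illustrative model, not the hardware: the function names `idct_1d` and `idct_2d` are hypothetical, and the 1-D kernel assumes the unscaled form x(n) = Σ C(k)·X(k)·cos((2n+1)kπ/2N) with C(0) = 1/√2 and C(k) = 1 otherwise.

```python
import math


def idct_1d(X):
    """Direct O(N^2) 1-D IDCT (assumed normalization, see lead-in)."""
    N = len(X)
    out = []
    for n in range(N):
        acc = 0.0
        for k in range(N):
            c = 1.0 / math.sqrt(2.0) if k == 0 else 1.0
            acc += c * X[k] * math.cos((2 * n + 1) * k * math.pi / (2 * N))
        out.append(acc)
    return out


def idct_2d(block):
    """2-D IDCT of a square block as two passes of 1-D IDCTs."""
    N = len(block)
    # Pass 1: transform each column; cols[c][r] is the intermediate
    # result for row r, column c (the stored, transposed form).
    cols = [idct_1d([block[r][c] for r in range(N)]) for c in range(N)]
    # Pass 2: read the intermediate results by row and transform each row.
    return [idct_1d([cols[c][r] for c in range(N)]) for r in range(N)]


# Usage: a block with only a DC coefficient reconstructs to a flat image.
dc_only = [[0.0] * 4 for _ in range(4)]
dc_only[0][0] = 4.0
flat = idct_2d(dc_only)
```

Because the transform is separable, running the row pass first and the column pass second gives the same result.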

Referring to the figures, FIG. 1A illustrates an exemplary data block. Data block 2 has dimension N×N, where N is a power of two, that is, N = 2^x for an integer x = 1, 2, 3, .... When this condition holds, equations (1) and (2) can be simplified significantly. In the exemplary embodiment N equals 16, but the invention extends readily to other values of N.

FIG. 2 shows an exemplary block diagram of the 2-D IDCT engine 10 of the present invention. In the exemplary embodiment, the input data block, comprising the IDCT coefficients, is provided to IDCT processor 20a by column. IDCT processors 20a and 20b are identical 1-D IDCT processors that perform the IDCT on their input data according to equation (2). The intermediate results from IDCT processor 20a are provided to memory element 22, where they are stored temporarily by column. The intermediate results are then provided to IDCT processor 20b by row. IDCT processor 20b performs the 1-D IDCT and provides the transformed output, that is, the reconstructed image, to the subsequent digital signal processing block (not shown in FIG. 2). Alternatively, the data block can be provided to IDCT processor 20a by row and the intermediate results to IDCT processor 20b by column. In the exemplary embodiment, IDCT processors 20a and 20b are pipelined so that both processors 20 are active simultaneously.

An important property of the IDCT is that a larger transform can be built from smaller ones by arranging the input data points, computing sums of selected combinations of the data points, and performing serial butterflies on the outputs of two smaller transforms. The serial butterfly is described in detail below. Thus a 16-point IDCT is a butterfly of two 8-point IDCTs, each 8-point IDCT is a butterfly of two 4-point IDCTs, and each 4-point IDCT is a butterfly of two 2-point IDCTs. This property is well known in the art and is best illustrated with a trellis diagram. FIG. 3A shows the trellis of the 2-point IDCT, FIG. 3B the trellis of the 4-point IDCT, and FIG. 3C the trellis of the 8-point IDCT. The 2-point IDCT consists of a single butterfly stage. As shown in FIGS. 3A-3B, the 4-point IDCT comprises a stage of two 2-point IDCTs, a serial addition stage preceding the 2-point IDCT stage, and a butterfly stage following it. Likewise, as shown in FIGS. 3B-3C, the 8-point IDCT comprises a stage of two 4-point IDCTs, a serial addition stage preceding the 4-point IDCT stage, and a butterfly stage following it.
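The way a larger IDCT is assembled from two half-size IDCTs can be checked numerically with a recursive sketch in the style of Lee's decomposition. It is a floating-point illustration under assumptions, not the patent's bit-serial trellis: the function names are hypothetical, the normalization is x(n) = Σ C(k)·X(k)·cos((2n+1)kπ/2N) with C(0) = 1/√2, and the butterfly constant 1/(2·cos((2n+1)π/2N)) follows from that choice.

```python
import math


def idct_direct(X):
    """Reference O(N^2) 1-D IDCT (assumed normalization, see lead-in)."""
    N = len(X)
    return [
        sum((1.0 / math.sqrt(2.0) if k == 0 else 1.0) * X[k]
            * math.cos((2 * n + 1) * k * math.pi / (2 * N))
            for k in range(N))
        for n in range(N)
    ]


def _idct_core(Y):
    """IDCT of Y where the DC term has already been scaled by C(0)."""
    N = len(Y)
    if N == 1:
        return [Y[0]]
    half = N // 2
    # Upper half-size transform: the even-indexed inputs, reordered.
    even = [Y[2 * j] for j in range(half)]
    # Lower half-size transform: odd-indexed inputs, selectively summed
    # (the "addition stage" that precedes the smaller transforms).
    odd = [Y[1]] + [Y[2 * j - 1] + Y[2 * j + 1] for j in range(1, half)]
    g = _idct_core(even)
    h = _idct_core(odd)
    out = [0.0] * N
    for n in range(half):
        c = 1.0 / (2.0 * math.cos((2 * n + 1) * math.pi / (2 * N)))
        out[n] = g[n] + c * h[n]          # butterfly: Z1 = X1 + C*X2
        out[N - 1 - n] = g[n] - c * h[n]  # butterfly: Z2 = X1 - C*X2
    return out


def idct_recursive(X):
    """N-point IDCT, N a power of two, built from two N/2-point IDCTs."""
    N = len(X)
    if N < 1 or N & (N - 1):
        raise ValueError("length must be a power of two")
    Y = list(X)
    Y[0] *= 1.0 / math.sqrt(2.0)  # apply C(0) once, at the top level
    return _idct_core(Y)


# Usage: compare the recursive result with the direct sum for N = 16.
coeffs = [float((7 * k) % 11 - 5) for k in range(16)]
max_err = max(abs(a - b)
              for a, b in zip(idct_direct(coeffs), idct_recursive(coeffs)))
```

For N = 16 the recursion has three levels of input additions (sizes 16, 8, 4) and four levels of butterflies (sizes 2, 4, 8, 16), each with eight butterflies in total, which matches the stage counts stated for the 16-point trellis.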

FIG. 3D shows a diagram of the 16-point IDCT trellis 100 of the present invention. Trellis 100 is derived from B. G. Lee and is described in detail in the book by K. R. Rao entitled "Discrete Cosine Transform: Algorithms, Advantages, Applications" (Academic Press, 1990). The 16-point IDCT trellis 100 comprises three stages of serial additions and four stages of serial butterflies, with eight serial butterflies in each butterfly stage. In prior art IDCT processors, the interconnections between successive stages are fixed, which restricts the processor to performing only the 16-point IDCT. In the present invention, the stages are interconnected through a reconfigurable trellis cross-connect.

As shown in FIG. 3D, the 16-point IDCT 110 is a butterfly of two 8-point IDCTs 108, with the data points to the lower 8-point IDCT selectively combined. The two 8-point IDCTs 108 are butterflies of four 4-point IDCTs 106, with the data points to the lower 4-point IDCTs selectively combined. The four 4-point IDCTs 106 are butterflies of eight 2-point IDCTs 104, with the data points to the lower 2-point IDCTs selectively combined. The reconfigurable trellis cross-connect, together with the bypass mode of the serial butterflies, allows the 2-D IDCT engine 10 to compute any mix of transforms within an N×N block. Any combination of 2-point, 4-point, 8-point, and 16-point IDCTs can be performed by properly ordering the input data, selectively combining the input data before the butterfly stages, and controlling the additions and multiplications at each stage of the trellis. For example, IDCT processor 20 can perform two 8-point transforms; eight 2-point transforms; one 8-point transform and two 4-point transforms; or one 8-point, one 4-point, and two 2-point transforms. No combining stage is needed to assemble the different transform outputs into a composite transform block, because this happens automatically when IDCT engine 10 is configured for the appropriate mix of transforms. Whenever a smaller transform is computed, the serial adders and butterflies that are needed only for the higher-order transforms are replaced by delay latches. As a result, the outputs of each IDCT processor 20 are always time-aligned, regardless of the mix of transforms.

In the exemplary embodiment, a serial butterfly operates on two input bit streams and provides two output bit streams. The serial butterfly comprises one greatly simplified bit-serial multiplier and two serial adders. The serial architecture of IDCT processor 20 allows the cross routing between successive serial butterfly stages to be implemented with data paths only one bit wide.

In the exemplary embodiment, IDCT engine 10 computes the transform of a 16×16 block in 256 clock cycles. On each clock cycle, one IDCT coefficient is provided to IDCT engine 10 and one output pixel is taken from it. IDCT processors 20a and 20b are pipelined so that both are active simultaneously. Each IDCT processor 20 receives one input data point and provides one transformed data point per clock cycle.

I. IDCT Processor

FIG. 4 shows an exemplary block diagram of IDCT processor 20 of the present invention. Every 16 clock cycles (N = 16), the sixteen I/O buffers 52 receive 16 input data points, one data point per clock cycle and one data point per I/O buffer 52. The order in which the data points are loaded into the I/O buffers 52 depends on the mix of transforms being performed and is controlled by controller 26 through the 4-bit WRITE_ENABLE signal. Each data point comprises q bits, which are loaded in parallel into the I/O buffer 52 selected by WRITE_ENABLE. The I/O buffers 52 then shift serially, least significant bit (LSB) first and one bit per clock cycle, moving all 16 data points out together through cross routing 54 to serial adders 56. An I/O buffer 52 can be implemented as a parallel-to-serial shift register, as described below.

Serial adders 56 receive the data bits and perform serial addition on them in the manner described below. Serial adders 56 are enabled by ADD_ENABLE, which in the exemplary embodiment comprises 7 bits and corresponds to the first three stages of additions shown in trellis 100 of FIG. 3D. Each serial addition is represented by a small circle 112 (for simplicity, only one circle is labeled). The first stage has seven serial additions 112, which require 4 bits to enable or disable. The second stage has two sets of three serial additions 112, which require 2 bits of control. The third stage has four sets of single serial additions 112, which require 1 bit of control. Serial adders 56 can be controlled by the 7-bit ADD_ENABLE signal to compute the serial additions 112 required by the first three stages of trellis 100 in FIG. 3D. The bank of 16 serial adders 56 in FIG. 4 is a symbolic representation of the functions required by the first three stages of trellis 100.
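A serial adder of the kind referred to above processes one bit per clock, LSB first, and carries one bit of state between clocks. The sketch below models that behavior in software. It is an assumption-laden illustration rather than the circuit of FIG. 7: the helper names are hypothetical, words are q-bit two's complement, and the sum wraps modulo 2^q.

```python
def to_bits_lsb_first(value, q):
    """q-bit two's complement representation of value, LSB first."""
    return [(value >> i) & 1 for i in range(q)]


def from_bits_lsb_first(bits):
    """Inverse of to_bits_lsb_first; the last bit is the sign bit."""
    q = len(bits)
    magnitude = sum(b << i for i, b in enumerate(bits))
    return magnitude - (1 << q) if bits[-1] else magnitude


def serial_add(a_bits, b_bits):
    """Add two equal-length LSB-first bit streams, one bit per 'clock'."""
    carry = 0  # models the single carry flip-flop
    out = []
    for a, b in zip(a_bits, b_bits):
        out.append(a ^ b ^ carry)                # full-adder sum bit
        carry = (a & b) | (carry & (a ^ b))      # full-adder carry out
    return out                                   # final carry is dropped


# Usage: 5 + (-3) with 8-bit words.
total = from_bits_lsb_first(
    serial_add(to_bits_lsb_first(5, 8), to_bits_lsb_first(-3, 8)))
```

Because each step needs only the two current bits and the stored carry, the hardware cost does not grow with the word length q; only the latency does.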

The outputs of serial adders 56 are provided to the first stage 58 of eight serial butterflies, which performs the functions shown within the eight 2-point IDCTs 104 in FIG. 3D. Each serial butterfly is illustrated by the block diagram in FIG. 5B. A serial butterfly receives two serial input streams X1 and X2 and generates two serial outputs Z1 = X1 + C·X2 and Z2 = X1 - C·X2, where C is a fixed scalar determined by the position of the butterfly in IDCT trellis 100. An exemplary embodiment of the serial butterfly is described in detail below. In the exemplary embodiment, the first stage 58 of serial butterflies is always enabled, so that IDCT processor 20 performs at least a 2-point transform. The outputs of the first stage 58 are provided through cross routing 60 to the second stage 62 of serial butterflies. Cross routing 60 interconnects the first two stages as shown in IDCT trellis 100. In the exemplary embodiment, the second stage 62 of serial butterflies can be selectively enabled to provide 4-point transforms. Because there are four sets of butterflies (see FIG. 3D), 4 control bits are needed to enable each set individually. These 4 control bits are part of the control signal labeled MAP in FIG. 4.
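At the word level, the butterfly and its enable control reduce to a few lines. The sketch below is a hypothetical model: the function name `butterfly` and the `enabled` flag are illustrative, and treating a disabled butterfly as a plain pass-through of both inputs is an assumption about what bypass mode does to the data values.

```python
def butterfly(x1, x2, c, enabled=True):
    """Word-level butterfly: Z1 = X1 + C*X2, Z2 = X1 - C*X2.

    When not enabled, both inputs pass through unchanged
    (assumed bypass behavior).
    """
    if not enabled:
        return x1, x2
    product = c * x2          # the single multiplication
    return x1 + product, x1 - product   # the two additions


# Usage: one enabled and one bypassed butterfly with C = 0.5.
z_enabled = butterfly(3.0, 2.0, 0.5)
z_bypass = butterfly(3.0, 2.0, 0.5, enabled=False)
```

Computing the product once and reusing it for both outputs mirrors the stated structure of one multiplier feeding two adders.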

The outputs of the second stage 62 of serial butterflies are provided through cross routing 64 to the third stage 66. The third stage 66 comprises two sets of four serial butterflies each, as shown within the two 8-point IDCTs 108 in FIG. 3D. Each set can be enabled individually by one of 2 control bits. The outputs of the third stage 66 are provided through cross routing 68 to the fourth stage 70. The fourth stage 70 comprises one set of eight serial butterflies, as shown within the 16-point IDCT 110 in FIG. 3D. These serial butterflies can be selectively enabled by 1 control bit. The serially transformed data from the fourth stage 70 comprises the output of IDCT processor 20.

The 1-bit serial transformed data from the fourth stage 70 is routed to a bank of serial-to-parallel output buffers. In the exemplary embodiment, IDCT processor 20 provides the IDCT output word-serially, one output word per clock cycle. The output buffers can be combined with the input buffers to form the I/O buffers 52 described in detail below.

II. Controller

Referring to FIG. 2, controller 26 provides control signals to IDCT processors 20a and 20b and to memory element 22. These control signals synchronize IDCT processors 20a and 20b with memory element 22 and determine the composition of the reconstructed image. Controller 26 receives an address input and a PQR input. The address input informs controller 26 of the start of a data block. The PQR input comprises three commands, P, Q, and R, that inform controller 26 of the desired block partition. In the exemplary embodiment, R equal to "1" indicates that the 16×16 block is to be divided into smaller 8×8 transform blocks, Q equal to "1" indicates that an 8×8 block is to be divided into smaller 4×4 transform blocks, and P equal to "1" indicates that a 4×4 block is to be divided into smaller 2×2 transform blocks. In the exemplary embodiment, each block can be partitioned independently of the other blocks in the image. Thus R requires 1 control bit, because a 16×16 block contains only one 16×16 transform block; Q requires 4 control bits, because a 16×16 block can contain four 8×8 transform blocks; and P requires 16 control bits, because a 16×16 block can contain sixteen 4×4 transform blocks. FIG. 1B shows an exemplary partition of data block 4, and FIG. 1C shows an exemplary representation, a tree diagram, of the PQR control corresponding to that partition. The 21-bit PQR control can be provided to controller 26 serially or in parallel.

The PQR input is a 2-D representation of the desired block partition. Controller 26 parses the PQR input into 1-D row and column control signals. These row and column control signals are then used to generate the control signals that direct IDCT processors 20a and 20b to perform the corresponding mix of transforms. For the exemplary partition 4 shown in FIG. 1B, controller 26 directs IDCT processor 20a to perform two 4-point transforms and one 8-point transform on each of the first four columns of data. For the next two columns, controller 26 directs IDCT processor 20a to perform one 4-point transform, two 2-point transforms, and one 8-point transform. Processing continues until all columns have been processed. The intermediate results from IDCT processor 20a are stored by column in memory element 22.
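How a 2-D PQR partition turns into a 1-D transform mix for a given column can be sketched as follows. Several details here are assumptions made for illustration and are not specified in the text: the function names, the raster ordering of the quadrants (top-left, top-right, bottom-left, bottom-right), the indexing of P as four bits per 8×8 quadrant, and the rule that Q and P bits under an unsplit parent are ignored. The example partition is hypothetical and is not claimed to be the one in FIG. 1B.

```python
def pqr_to_blocks(R, Q, P):
    """Expand PQR bits into (row, col, size) transform blocks of a 16x16 block.

    R: one bit; Q: four bits; P: sixteen bits (bit ordering is assumed).
    """
    if len(Q) != 4 or len(P) != 16:
        raise ValueError("Q needs 4 bits and P needs 16 bits")
    if not R:
        return [(0, 0, 16)]
    blocks = []
    for q in range(4):
        r8, c8 = 8 * (q // 2), 8 * (q % 2)
        if not Q[q]:
            blocks.append((r8, c8, 8))
            continue
        for s in range(4):
            r4, c4 = r8 + 4 * (s // 2), c8 + 4 * (s % 2)
            if not P[4 * q + s]:
                blocks.append((r4, c4, 4))
                continue
            for t in range(4):
                blocks.append((r4 + 2 * (t // 2), c4 + 2 * (t % 2), 2))
    return blocks


def column_mix(blocks, col):
    """Sizes of the 1-D transforms along column `col`, from top to bottom."""
    hits = sorted((r, size) for r, c, size in blocks if c <= col < c + size)
    return [size for _, size in hits]


# Usage: split the 16x16 block, split its top-left 8x8, and split one 4x4.
R_bit = 1
Q_bits = [1, 0, 0, 0]
P_bits = [0] * 16
P_bits[3] = 1   # hypothetical choice: the 4x4 at rows 4-7, columns 4-7
parts = pqr_to_blocks(R_bit, Q_bits, P_bits)
mix_col0 = column_mix(parts, 0)   # two 4-point and one 8-point transform
mix_col4 = column_mix(parts, 4)   # one 4-point, two 2-point, one 8-point
```

The same function applied to rows instead of columns would give the mix for the second processor; only the roles of the row and column coordinates swap.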

In the same manner, controller 26 directs IDCT processor 20b to perform the corresponding mix of transforms on each row of intermediate results output from memory element 22. After all columns have been processed by IDCT processor 20a, controller 26 directs IDCT processor 20b to perform two 4-point transforms and one 8-point transform on each of the first four rows of intermediate results. For the next two rows, controller 26 directs IDCT processor 20b to perform one 4-point transform, two 2-point transforms, and one 8-point transform. Processing continues until all rows have been processed.

Referring to FIG. 4, the control signals that controller 26 generates for IDCT processors 20a and 20b include WRITE_ENABLE, READ_ENABLE, ADD_ENABLE, and MAP. WRITE_ENABLE controls the writing of input data points into the appropriate I/O buffers 52 so that the input data points are arranged in the correct order (see FIG. 3D). READ_ENABLE controls the order in which transformed data is read out of IDCT processor 20. In the exemplary embodiment, the transformed data may be read out of IDCT processor 20 sequentially. ADD_ENABLE controls the first set of serial adders 56, which perform the additions of the first three stages of trellis 100. ADD_ENABLE depends on the required mix of transforms and is generated from the PQR input. MAP controls the last three stages 62, 66, and 70 of serial butterflies to produce the required mix of transforms. MAP is also generated from the PQR input. The second stage 62 requires four control bits to individually enable or disable each of the four sets of butterflies (see FIG. 3D). Likewise, the third stage 66 requires two control bits and the fourth stage 70 requires one control bit. In the exemplary embodiment, the first stage 58 requires no control signal because IDCT processor 20 always performs at least a 2-point transform. If desired, however, a control signal may be generated to allow the first stage 58 to be bypassed.
The 2-D transform of the present invention is performed as two 1-D transforms in series, so controller 26 delays the control signals to IDCT processor 20b relative to those to IDCT processor 20a, keeping the control signals synchronized with the input data.

Controller 26 may be implemented as a combination of combinatorial logic and a state machine. Alternatively, controller 26 may be implemented with a microcontroller or microprocessor running microcode. The various implementations of controller 26 that perform the functions described herein are within the scope of the present invention.

III. Transpose Memory

In the exemplary embodiment, memory element 22 may be implemented as a transpose memory. The 2-D transform is performed by computing a 1-D transform on each column of the input data block, storing the intermediate results, and computing a 1-D transform on each row of the intermediate results. The 1-D transforms on the rows are not performed until all columns have been transformed. In the exemplary embodiment, the two 1-D transforms are pipelined so that both operate in parallel.

Memory element 22 may be implemented as the memory module shown in FIG. 1A. Assume that the intermediate results output by IDCT processor 20a are initially written to memory element 22 by columns. IDCT processor 20b does not operate on the rows of intermediate results until IDCT processor 20a has operated on all columns. Once the last column of memory element 22 has been filled, the intermediate results are provided to IDCT processor 20b by rows. Because of the pipelined structure, however, IDCT processor 20a supplies one column of data for each row of data retrieved by IDCT processor 20b. That column cannot overwrite a previous column, because IDCT processor 20b still needs some of the data points in the previous columns. To solve this problem, the new column of intermediate results is written over the row of data that IDCT processor 20b has just retrieved. In practice, memory element 22 may be implemented with read-modify-write capability so that the same memory location can be read and written in the same clock cycle. Within one clock cycle, a data point can be read from a location in memory element 22 by IDCT processor 20b and a new data point written to that same location by IDCT processor 20a. Implemented in this manner, memory element 22 is transposed, that is, it alternates between row-major and column-major order for successive 16×16 data blocks. This transposition reduces the memory requirement to a single bank of memory.
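The read-then-overwrite scheme just described can be modeled in a few lines. This is a behavioral sketch under stated assumptions, not the hardware: `stream_blocks` is a hypothetical name, each block is a square list of lists, and a single Python loop iteration stands in for one read-modify-write cycle.

```python
def stream_blocks(blocks):
    """Pass column-wise writes and row-wise reads through ONE n x n buffer.

    Each element of the next block's column i is written into the very
    location from which the current block's row i element was just read,
    so the stored orientation flips from block to block.
    """
    n = len(blocks[0])
    mem = [[None] * n for _ in range(n)]
    # Prime the buffer: first block written column by column.
    for j in range(n):
        for i in range(n):
            mem[i][j] = blocks[0][i][j]
    out = []
    transposed = False              # orientation of the block now in mem
    for k in range(len(blocks)):
        nxt = blocks[k + 1] if k + 1 < len(blocks) else None
        rows = []
        for i in range(n):          # read logical row i of block k
            row = []
            for j in range(n):
                a, b = (j, i) if transposed else (i, j)
                row.append(mem[a][b])          # read ...
                if nxt is not None:
                    mem[a][b] = nxt[j][i]      # ... then write column i of next block
            rows.append(row)
        out.append(rows)
        transposed = not transposed
    return out
```

Feeding several blocks through the function returns each block's rows in order, which suggests that one buffer suffices as long as the controller tracks the current orientation.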

Controller 26 provides the control signals needed to operate memory element 22 as a transpose memory. Controller 26 has the required timing information and is able to keep IDCT processors 20a and 20b synchronized with memory element 22 on an input-block basis.

Memory element 22 may be implemented with any of a number of storage elements or memory devices known in the art, such as RAM devices, latches, or other memory devices.

IV. Serial Butterfly

FIGS. 5A and 5B show a serial butterfly. FIG. 5A is an exemplary diagram of the serial butterfly, and FIG. 5B is a block diagram of the same serial butterfly. Serial butterfly 140 operates on two inputs, X1 and X2. Input X1 is delayed by delay element 148 so that the upper and lower signal paths are time-aligned. Input X2 is scaled by 1/(2Cnk) by bit-serial multiplier 150, where Cnk denotes cos(kπ/n). The outputs of delay element 148 and multiplier 150 are provided to serial adders 160a and 160b. Serial adder 160a adds the output of multiplier 150 to the output of delay element 148, and serial adder 160b subtracts the output of multiplier 150 from the output of delay element 148. The outputs of serial adders 160a and 160b constitute serial butterfly outputs Z1 and Z2, respectively. In the present invention, serial adders 160a and 160b are designed so that they can be turned off, allowing Y1 and Y2 to pass through as Z1 and Z2, respectively. In the exemplary embodiment, serial butterfly 140 operates on two input bit streams and provides two output bit streams.
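Numerically, the butterfly described above computes Z1 = X1 + X2/(2Cnk) and Z2 = X1 - X2/(2Cnk), or passes its inputs through when disabled. The sketch below uses floating point rather than bit-serial hardware; the function name and the `enable` flag are illustrative, and the use of a multiplier constant of 1 in bypass mode follows the later discussion of the constant C.

```python
import math


def butterfly(x1, x2, n, k, enable=True):
    """One butterfly with scale factor 1 / (2 * cos(k * pi / n)).

    When enable is False the multiplier constant is taken as 1 and both
    adders are turned off, so the inputs pass straight through.
    """
    c = 1.0 / (2.0 * math.cos(k * math.pi / n)) if enable else 1.0
    y1 = x1                 # delayed copy of X1 (delay element 148)
    y2 = c * x2             # scaled X2 (bit-serial multiplier 150)
    if not enable:
        return y1, y2       # bypass: Z1 = Y1, Z2 = Y2
    return y1 + y2, y1 - y2


z1, z2 = butterfly(1.0, 2.0, n=4, k=1)
print(z1, z2)               # approximately 1 + sqrt(2) and 1 - sqrt(2)
```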

FIGS. 6A and 6B show exemplary block diagrams of bit-serial multiplier 150. FIG. 6A shows bit-serial multiplier 150 at the word level, and FIG. 6B shows the same multiplier 150 at the bit level. The bit-serial multiplication of X by C is performed by successively adding C to an intermediate product and shifting the result by one bit. This is illustrated by the block diagram in FIG. 6A. Latch 212 is cleared by the LD signal, which is active for one out of every 16 clock cycles, to prepare latch 212 for the next multiplication. The LD signal also causes parallel-to-serial shift register 214 to load the product of the just-completed multiplication from the output of adder 210. The product is then shifted serially out of register 214 during the next multiplication.

In the exemplary embodiment, the input data X, the constant C, and the product Y have 16 bits of precision. Sixteen-bit precision produces arithmetic errors smaller than those permitted by IEEE Std 1180-1990, "IEEE Standard Specifications for the Implementations of 8×8 Inverse Discrete Cosine Transform." The 16-bit representation may comprise one sign bit, nine magnitude bits, and six fractional bits. Other representations using fewer or more than 16 bits may be implemented and are within the scope of the present invention.

In the exemplary embodiment, adder 210, latch 212, and register 214 are each implemented with 16 bits. On each clock cycle, one bit of X is shifted into bit-serial multiplier 150, least significant bit first. The constant C is gated by the value of the input bit and the LD signal and added to the intermediate product stored in latch 212. In logic circuit 200, AND gate 204 determines, from the input bit and the LD signal, whether C is to be added to the intermediate product. The intermediate product output by adder 210 is then shifted by one bit and stored back into latch 212 as bits D[14..0]. The least significant bit of the adder 210 output is discarded, and the most significant bit of latch 212 is sign-extended, e.g. D[15] = Co[15], where Co[15] is the carry output of the most significant bit of adder 210. As shown in FIG. 6A, bit-serial multiplier 150 can be implemented with about the same amount of hardware as an accumulator, which is compact for an IC design.
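The add-and-shift loop can be sketched in software as follows. This is an illustrative model with unbounded Python integers, not a description of the 16-bit datapath. One point is an assumption not taken from the text: the sign bit of a two's-complement X is given negative weight by subtracting C on the last step, which is a standard technique but is not described in this section.

```python
def bit_serial_multiply(x, c, width=16):
    """Multiply signed x by constant c, consuming x one bit per step, LSB first.

    After each conditional add the accumulator is shifted right by one bit,
    dropping its least significant bit, so the result is the product scaled
    down by 2**width and truncated toward negative infinity.
    """
    bits = [(x >> i) & 1 for i in range(width)]   # two's-complement bits of x
    acc = 0                                       # models latch 212 after LD clears it
    for i, bit in enumerate(bits):
        if bit:
            # Assumed sign handling: the MSB has weight -2**(width-1).
            acc += -c if i == width - 1 else c
        acc >>= 1                                 # shift right, discarding the LSB
    return acc


print(bit_serial_multiply(1000, 23170))    # same as (1000 * 23170) >> 16
print(bit_serial_multiply(-1000, 23170))   # same as (-1000 * 23170) >> 16
```

Because every step discards a bit by flooring, the truncation error is always in the negative direction.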

FIG. 6B shows bit-serial multiplier 150 in further detail. Adder 210, latch 212, and register 214 are shown in bit-level form. The constant C is gated by the value of input bit X and the LD signal and added to the intermediate product stored in latch 212. Each adder 210 receives its carry input (Ci) from the next less significant bit position and provides its carry output (Co) to the adder 210 of the next more significant bit, forming a standard adder carry chain.

Simply discarding the least significant bit produces a slight negative bias in the two's-complement output product. This can be compensated by adding in the least significant bit ahead of the final adder 210a, which produces a positive bias of one half of a least significant bit in the output product. By alternating between truncation and positive bias in successive multipliers 150, the overall bias can be reduced. The bias is controlled by a ROUND signal, which can be hard-wired high or low according to the desired result.

FIG. 7A shows an exemplary block diagram of serial adder 160. Serial adder 160 receives two inputs, Y1 and Y2, serially, least significant bit first. Serial adder 160 can add the two inputs (Y1+Y2), subtract one input from the other (Y1-Y2), or bypass one input to the output (Z=Y2). Whether addition or subtraction is performed depends on the position of serial adder 160 within the IDCT trellis, e.g. whether serial adder 160 is on the upper or lower branch of the butterfly. The bypass mode allows IDCT processor 20 of the present invention to perform different mixes of transforms.

Inputs Y1 and Y2 are provided serially to AND gate 240 and XOR gate 242, respectively. ADD_EN is also provided to AND gate 240. When ADD_EN is low, the output of AND gate 240 is low and Y1 is not provided to adder 244. When ADD_EN is high, Y1 is provided to adder 244. The INVERT signal is provided to XOR gate 242 and register 246. To perform subtraction, input Y2 is negated and added to the other operand. Negating a two's-complement number requires inverting all bits of the original number and adding "1" at the least significant bit. When the INVERT signal is high, the bit inversion is performed by XOR gate 242. When the LD signal is active and the INVERT signal is high, a "1" is stored at the start of the serial addition and fed to the carry input (Ci) of adder 244, so that "1" is added at the least significant bit of the input number.

On each subsequent clock cycle, the carry output (Co) of the previous one-bit addition is stored in register 246. This carry is added to the next pair of bits of inputs Y1 and Y2. The sum output S of adder 244 is the output of serial adder 160.
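The three modes of the serial adder (add, subtract, bypass) follow from the gating just described. Below is a minimal bit-level sketch; `serial_add` is a hypothetical name, integers stand in for the LSB-first bit streams, and the result wraps to the chosen width as a fixed-width adder would.

```python
def serial_add(y1, y2, width=16, add_en=True, invert=False):
    """Model a one-bit full adder fed LSB first, with a carry register.

    add_en=False gates Y1 off, so the output is Y2 (bypass).
    invert=True flips each Y2 bit and presets the carry to 1,
    which yields Y1 - Y2 in two's complement.
    """
    carry = 1 if invert else 0                 # carry register preset at the start
    z = 0
    for i in range(width):
        a = ((y1 >> i) & 1) if add_en else 0   # AND gate on Y1
        b = ((y2 >> i) & 1) ^ int(invert)      # XOR gate on Y2
        s = a ^ b ^ carry                      # sum bit
        carry = (a & b) | (a & carry) | (b & carry)
        z |= s << i
    if z >= 1 << (width - 1):                  # reinterpret as a signed value
        z -= 1 << width
    return z


print(serial_add(5, 3))                   # addition
print(serial_add(5, 3, invert=True))      # subtraction
print(serial_add(7, -4, add_en=False))    # bypass of Y2
```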

The constant C may be hard-wired or mask-programmable. Because the first stage 58 of butterflies is always performed in the exemplary embodiment, the constant C of the bit-serial multipliers 150 of that stage can be fixed. For the remaining butterfly stages 62, 66, and 70, however, the constant C may be mask-programmable so that multiplier 150 multiplies input X2 either by 1/(2Cnk) or, when serial butterfly 140 is placed in bypass mode, by 1. Multiplier 150 may also be loaded with other values of C to scale or normalize input X2.

As shown in FIG. 7, serial adder 160 can perform addition, subtraction, or bypass on its two inputs. Serial adder 160 can be modified to perform the functions required by serial butterfly 140. For example, referring to FIG. 5B, serial adder 160a performs only addition or bypass. Serial adder 160 of FIG. 7 can therefore be modified by providing Y1 directly to the B input of adder 244, removing XOR gate 242, and providing Y2 to AND gate 240. The INVERT signal can be eliminated because adder 160a performs only addition. Likewise, serial adder 160b performs only subtraction or bypass, so the INVERT signal of serial adder 160 can be tied to a logic-high reference.

Serial adder 160 can also be used to perform the serial addition and bypass required of serial adders 56 in FIG. 4, which implement the serial additions 112 required in the first three stages of trellis 100 shown in FIG. 3D.

Referring to FIG. 5B, delay element 148 can be implemented with a chain of latches. The number of latches is chosen to match the processing delay of multiplier 150.

V. I/O Buffers

In the exemplary embodiment, a bank of 16 I/O buffers 52 in each IDCT processor 20 receives the input data and provides the transformed data. The input and output of IDCT processor 20 are provided in word-serial fashion, i.e. one complete data point per clock cycle. Sixteen data points are loaded into the 16 I/O buffers 52 over 16 clock cycles. Once all I/O buffers 52 are loaded, the 16 data points are provided to the IDCT trellis in bit-serial fashion, one bit per clock cycle. On each clock cycle, I/O buffers 52 also receive the transformed data bits output by the last stage 70 of serial butterflies. The transformed data is provided serially to I/O buffers 52.

FIG. 8 shows an exemplary block diagram of one I/O buffer 52. I/O buffer 52 comprises 16-bit latch 262, 16-bit parallel-to-serial shift register 264, 16-bit latch 266, and output buffer 268. The IDCT input is provided to the latches 262 of all 16 I/O buffers 52. Each I/O buffer 52 latches the IDCT input when commanded by its control signal WR(w). WR(w) is decoded from the WRITE_ENABLE signal originating from controller 26. Latch 262 in each I/O buffer 52 is enabled during only one of every 16 clock cycles. After all 16 data points have been latched by latches 262, the LD signal becomes active and the values held in latches 262 are loaded into registers 264.

For each I/O buffer 52, the data is shifted out serially to cross-routing 54, one bit per clock cycle, least significant bit first. On each clock cycle, one bit of transformed data, least significant bit first, is shifted serially into the most significant bit register 264q. After 16 clock cycles, all 16 data bits have been shifted out to cross-routing 54 and all 16 transformed data bits have been shifted into register 264. Every 16 clock cycles, the LD signal loads the next data point into register 264 and loads the transformed data point into latch 266. The transformed data is held in latch 266 until it is read out through output buffer 268. Output buffers 268 are selectively enabled so that the 16 I/O buffers 52 provide the transformed data serially, one transformed data point per clock cycle. The read order is controlled by the RD(w) signals decoded from READ_ENABLE.
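The simultaneous shift-out and shift-in of register 264 can be illustrated with a short model. The function name `shift_exchange` is hypothetical, and an integer stands in for the 16-bit register; the sketch only shows that after 16 clocks the outgoing word has been fully emitted and the incoming word occupies the register in the correct bit order.

```python
def shift_exchange(outgoing, incoming_bits, width=16):
    """Shift a word out LSB first while shifting new bits in at the MSB.

    Returns the list of bits sent (LSB first) and the final register value.
    """
    reg = outgoing & ((1 << width) - 1)
    sent = []
    for bit in incoming_bits:
        sent.append(reg & 1)                       # LSB leaves toward the trellis
        reg = (reg >> 1) | (bit << (width - 1))    # new bit enters at the MSB
    return sent, reg


incoming = [(0xBEEF >> i) & 1 for i in range(16)]  # transformed word, LSB first
sent, reg = shift_exchange(0x1234, incoming)
print(hex(sum(b << i for i, b in enumerate(sent))), hex(reg))
```

Sharing one register for both directions is what lets a single 16-cycle window serve as both the output phase of one data point and the input phase of the next.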

The block diagram of FIG. 8 shows one embodiment of I/O buffer 52. Other embodiments that perform the same functions described above can be implemented and are within the scope of the present invention.

Although the present invention has been described in the context of a 2-D IDCT engine, the inventive concept can be extended to other transforms, such as the discrete Fourier transform (DFT), inverse discrete Fourier transform (IDFT), fast Fourier transform (FFT), inverse fast Fourier transform (IFFT), discrete cosine transform (DCT), and Hadamard transform. The application of the inventive concept described above to other transforms is within the scope of the present invention.

The foregoing description of the preferred embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without the use of the inventive faculty. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (19)

1. A variable block size IDCT processor, comprising: a bank of adders that receives a plurality of input data points and a first control signal, the first control signal commanding the bank of adders to perform additions on selected combinations of the input data points; a plurality of butterfly stages; and a plurality of cross-routings, one cross-routing disposed between the bank of adders and the first butterfly stage and one cross-routing disposed between successive butterfly stages; wherein the plurality of butterfly stages receives a second control signal commanding the plurality of butterfly stages to perform butterfly operations on selected inputs to the plurality of butterfly stages.
2. The IDCT processor of claim 1, wherein the bank of adders and the plurality of butterfly stages are implemented with serial adders and bit-serial multipliers.
3. The IDCT processor of claim 2, further comprising: a bank of I/O buffers that receives the plurality of input data points in word-serial format and provides the input data points to the bank of adders in bit-serial format.
4. The IDCT processor of claim 3, wherein the plurality of butterfly stages is pipelined so that all stages are active concurrently.
5. The IDCT processor of claim 4, wherein the multiplicands of the bit-serial multipliers are mask-programmable.
6. A variable block size 2-D IDCT engine, comprising: a first IDCT processor that receives input data points; a memory element connected to the first IDCT processor; a second IDCT processor connected to the memory element; and a controller connected to the first IDCT processor, the second IDCT processor, and the memory element and providing control signals to them, the controller receiving an input signal and generating the control signals in accordance with the input signal.
7. The IDCT engine of claim 6, wherein the IDCT processor comprises: a bank of adders that receives the plurality of input data points and a first control signal, the first control signal commanding the bank of adders to perform additions on selected combinations of the input data points; a plurality of butterfly stages; and a plurality of cross-routings, one cross-routing disposed between the bank of adders and the first butterfly stage and one cross-routing disposed between successive butterfly stages; wherein the plurality of butterfly stages receives a second control signal commanding the plurality of butterfly stages to perform butterfly operations on selected inputs to the plurality of butterfly stages.
8. The IDCT engine of claim 7, wherein the bank of adders and the plurality of butterfly stages are implemented with serial adders and bit-serial multipliers.
9. The IDCT engine of claim 8, wherein the IDCT processor further comprises: a bank of I/O buffers that receives the plurality of input data points in word-serial format and provides the input data points to the bank of adders in bit-serial format.
10. The IDCT engine of claim 9, wherein the IDCT processors are pipelined so that both IDCT processors are active concurrently.
11. The IDCT engine of claim 10, wherein the first butterfly stage is always enabled.
12. The IDCT engine of claim 11, wherein the multiplicands of the bit-serial multipliers are mask-programmable.
13. The IDCT engine of claim 12, wherein the IDCT engine has a throughput rate of one pixel output per clock cycle.
14. The IDCT engine of claim 13, wherein the serial adders and bit-serial multipliers have a resolution greater than 8 bits.
15. The IDCT engine of claim 14, wherein the serial adders and bit-serial multipliers have 16-bit resolution.
16. The IDCT engine of claim 15, wherein the memory element comprises a transpose memory.
17. An apparatus for performing a variable block size 2-D IDCT, comprising: first IDCT means for performing a 1-D IDCT on a plurality of input data points; storage means for storing intermediate results output by the first IDCT means; second IDCT means for performing a 1-D IDCT on the intermediate results; and control means for providing control signals to the first IDCT means, the second IDCT means, and the storage means, the control means receiving an input signal and generating the control signals in accordance with the input signal.
18. The apparatus of claim 17, wherein the IDCT means comprises: an adding-means stage that receives a plurality of input data points and a first control signal, the first control signal commanding the adding means to perform additions on selected combinations of the input data points; multiple stages of butterfly means for performing butterfly operations on pairs of input data; and routing means for routing signals between the adding-means stage and the multiple stages of butterfly means; wherein the multiple stages of butterfly means receive a second control signal commanding the multiple stages of butterfly means to perform butterfly operations on selected pairs of inputs to the multiple stages of butterfly means.
19. A transform engine, comprising: a first transform processor that receives input data points; a memory element connected to the first processor; a second transform processor connected to the memory element; and a controller connected to the first transform processor, the second transform processor, and the memory element and providing control signals to them, the controller receiving an input signal and generating the control signals in accordance with the input signal.
20. The transform engine of claim 19, wherein the transform processor comprises: a plurality of butterfly stages; and a plurality of cross-routings, one cross-routing interposed between successive butterfly stages; wherein the plurality of butterfly stages receives a second control signal commanding the plurality of butterfly stages to perform butterfly operations on selected inputs to the plurality of butterfly stages.

Priority Applications (1)

Application Number Priority Date Filing Date Title
US91809097A 1997-08-25 1997-08-25

Publications (1)

Publication Number Publication Date
CN1268231A (en) 2000-09-27

Family

ID=25439787

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 98808477 CN1268231A (en) 1997-08-25 1998-08-24 Variable block size 2-dimensional inverse discrete cosine transform engine

Country Status (4)

Country Link
EP (1) EP1018082A1 (en)
CN (1) CN1268231A (en)
AU (1) AU9030298A (en)
WO (1) WO1999010818A1 (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2003198504A (en) * 2001-12-27 2003-07-11 Mitsubishi Electric Corp Despreading processing method, despread code assignment method, terminal for moving object and base station
JP2003223433A (en) 2002-01-31 2003-08-08 Matsushita Electric Ind Co Ltd Method and apparatus for orthogonal transformation, encoding method and apparatus, method and apparatus for inverse orthogonal transformation, and decoding method and apparatus
US7096245B2 (en) 2002-04-01 2006-08-22 Broadcom Corporation Inverse discrete cosine transform supporting multiple decoding processes
US9110849B2 (en) 2009-04-15 2015-08-18 Qualcomm Incorporated Computing even-sized discrete cosine transforms
US8762441B2 (en) 2009-06-05 2014-06-24 Qualcomm Incorporated 4X4 transform for media coding
US9069713B2 (en) 2009-06-05 2015-06-30 Qualcomm Incorporated 4X4 transform for media coding
US8451904B2 (en) 2009-06-24 2013-05-28 Qualcomm Incorporated 8-point transform for media data coding
US9075757B2 (en) 2009-06-24 2015-07-07 Qualcomm Incorporated 16-point transform for media data coding
US9118898B2 (en) 2009-06-24 2015-08-25 Qualcomm Incorporated 8-point transform for media data coding
US9081733B2 (en) 2009-06-24 2015-07-14 Qualcomm Incorporated 16-point transform for media data coding
US9824066B2 (en) * 2011-01-10 2017-11-21 Qualcomm Incorporated 32-point transform for media data coding

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR2608808B1 (en) * 1986-12-22 1989-04-28 Efcis Integrated digital signal processing circuit

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100504847C (en) 2004-10-12 2009-06-24 联发科技股份有限公司 Method and apparatus for inverse discrete cosine transform implementation
CN101351792B (en) 2005-10-05 2010-12-22 高通股份有限公司 Fast dct algorithm for dsp with vliw architecture
CN101605259A (en) * 2009-05-31 2009-12-16 华亚微电子(上海)有限公司 Device and method for transforming coding and decoding for multimedia data
CN101646080B (en) 2009-06-18 2013-09-25 杭州高特信息技术有限公司 Method for fast switching parallel pipeline IDCT based on AVS and device thereof
CN102065309A (en) * 2010-12-07 2011-05-18 青岛海信信芯科技有限公司 DCT (Discrete Cosine Transform) realizing method and circuit
CN102065309B (en) 2010-12-07 2012-12-05 青岛海信信芯科技有限公司 DCT (Discrete Cosine Transform) realizing method and circuit
CN106663085A (en) * 2014-08-08 2017-05-10 高通股份有限公司 System and method for reusing transform structure for multi-partition transform

Also Published As

Publication number Publication date
EP1018082A1 (en) 2000-07-12
AU9030298A (en) 1999-03-16
WO1999010818A1 (en) 1999-03-04

Similar Documents

Publication Title
US5446651A (en) Split multiply operation
KR100714358B1 (en) Method and system for performing calculation operations and a device
KR100684134B1 (en) Improved apparatus & method for modular multiplication & exponentiation based on Montgomery multiplication
US5053985A (en) Recycling dct/idct integrated circuit apparatus using a single multiplier/accumulator and a single random access memory
CA2099146C (en) Method and arrangement for transformation of signals from a frequency to a time domain
RU2273044C2 (en) Method and device for parallel conjunction of data with shift to the right
US5941940A (en) Digital signal processor architecture optimized for performing fast Fourier Transforms
KR100715770B1 (en) Method and a system for performing calculation operations and a device
CN1149496C (en) Apparatus for adaptively processing video signals
JP2756257B2 (en) Parallel processing system and method
JP2945487B2 (en) Matrix multiplier
Shams et al. NEDA: A low-power high-performance DCT architecture
EP0250152A2 (en) High speed transform circuit
US6240437B1 (en) Long instruction word controlling plural independent processor operations
Peleg et al. Intel MMX for multimedia PCs
JP2531955B2 (en) One-dimensional cosine transform calculation apparatus and an image coding apparatus and decoding apparatus comprising the computing device
US6219688B1 (en) Method, apparatus and system for sum of plural absolute differences
US5500811A (en) Finite impulse response filter
EP0661886A2 (en) Method and apparatus for fast digital signal decoding
US5859788A (en) Modulated lapped transform method
US6288723B1 (en) Method and apparatus for converting data format to a graphics card
Lai et al. A high-performance and memory-efficient VLSI architecture with parallel scanning method for 2-D lifting-based discrete wavelet transform
Chan et al. On the realization of discrete cosine transform using the distributed arithmetic
CA2121197C (en) Inverse discrete cosine transform processor
KR100481067B1 (en) Apparatus for 2-D Discrete Cosine Transform using Distributed Arithmetic Module

Legal Events

Date Code Title Description
C06 Publication
C01 Deemed withdrawal of patent application (patent law 1993)