KR100254393B1

KR100254393B1 - DCT core architecture capable of multiplying by weighting coefficients

Info

Publication number: KR100254393B1
Application number: KR1019960081069A
Authority: KR
Inventors: 임정환; 최종관; 김지흡; 권병섭
Original assignee: 김덕중; 사단법인고등기술연구원연구조합
Priority date: 1996-12-31
Filing date: 1996-12-31
Publication date: 2000-05-01
Also published as: KR19980061695A

Abstract

PURPOSE: A discrete cosine transform (DCT) core structure capable of weighting-coefficient processing is provided. It reduces hardware size by using, in a DCT/IDCT core, memory in which the DCT coefficients are already multiplied by the weighting coefficients, so that no separate multiplier is needed. CONSTITUTION: An input module (30) outputs the data signals of the input matrices for the 8*1 DCT/IDCT and 2*(4*1) DCT/IDCT modes. A 3*1 multiplexer (32) selects a data signal of the input matrix. A butterfly (34) performs addition and subtraction on the input matrix selected by the 3*1 multiplexer (32). A 2*1 multiplexer (36) selects either the result of the butterfly (34) or the data signal of the 8*1 IDCT input matrix. A memory module (50) stores the results of adding and multiplying the cosine coefficients of the data signals and the weighting coefficients. A 4*1 multiplexer (42) selects an output signal of the memory module (50). A bit-serial adder (60) outputs the 8*1 DCT, 2*(4*1) DCT and 2*(4*1) IDCT results. A butterfly (62) performs addition and subtraction on the 8*1 DCT matrix of the bit-serial adder (60). A 2*1 multiplexer (64) selects the output of the butterfly (62) or the output of the bit-serial adder (60). An output module (66) outputs the signal selected by the multiplexer (64).

Description

Discrete Cosine Transform Core Structure with Weight Coefficient Processing

The present invention relates to a DCT core used in digital image processing and, more particularly, to a DCT core structure capable of weighting-coefficient processing.

As is well known, a digitized video signal can be transmitted with higher picture quality than an analog signal. When a video signal composed of a series of frames is represented in digital form, a substantial amount of data must be transmitted. Because the available bandwidth of a conventional transmission channel is limited, the data must be compressed before it can be sent over such a channel.

Because correlation, or redundancy, exists among pixels within a frame and between neighboring frames, a video signal can be compressed without seriously degrading it. Most conventional video coding methods therefore rely on compression techniques built around exploiting or removing this redundancy.

One category of such coding methods consists of transform techniques that exploit the redundancy within a single frame. It includes orthogonal transform methods, which convert a block of digital image data into transform coefficients, for example two-dimensional discrete cosine transform (DCT) coefficients.

In an orthogonal transform method such as the DCT, the video signal of one frame is divided into non-overlapping blocks of equal size, for example 8 * 8 pixel blocks, and each block is transformed from the spatial domain to the frequency domain. As a result, each block yields a set of transform coefficients consisting of one DC coefficient and a number of AC coefficients (63 for an 8 * 8 block). These coefficients represent the amplitudes of the frequency components of the pixels in the block: the DC coefficient represents the average luminance of the pixels in the block, while the remaining AC coefficients represent the strengths of its spatial-frequency components.
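The block transform described above can be sketched in a few lines. This is a minimal illustration, assuming the orthonormal DCT-II normalization (the patent does not state one); the function name `dct_2d` and the sample block are hypothetical.

```python
import math

N = 8  # block size assumed from the 8 * 8 example in the text

def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block (direct, unoptimized form)."""
    def c(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
    out = [[0.0] * N for _ in range(N)]
    for v in range(N):          # vertical frequency index
        for h in range(N):      # horizontal frequency index
            s = 0.0
            for y in range(N):
                for x in range(N):
                    s += (block[y][x]
                          * math.cos((2 * x + 1) * h * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[v][h] = c(h) * c(v) * s
    return out

# Hypothetical sample block of pixel values.
block = [[(3 * x + 5 * y) % 256 for x in range(N)] for y in range(N)]
coeffs = dct_2d(block)

mean = sum(sum(row) for row in block) / (N * N)
dc = coeffs[0][0]                       # the single DC coefficient
ac_count = N * N - 1                    # the remaining AC coefficients
```

With this normalization the DC coefficient equals N times the mean pixel value of the block, which is the sense in which it carries the block's average luminance.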

When the DCT was first adopted in digital image processing applications such as digital VCRs, the DCT and IDCT were mainly implemented by mathematically converting an FFT computed on a rearranged input sequence. Among the algorithms published since then, the fast algorithms of Chen and Lee greatly reduced the computational load, and they implemented the DCT/IDCT with multipliers so as to suit hardware implementation.

Multiplication is the most time-consuming part of the DCT. To improve speed, fast algorithms have been proposed, and attempts have been made to simplify implementation by computing the two-dimensional DCT as two passes of a one-dimensional DCT using row-column decomposition. Implementing these approaches in hardware, however, requires multipliers, which occupy a large area.
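Row-column decomposition can be checked numerically. The sketch below assumes the orthonormal DCT-II normalization and uses hypothetical function names; it compares two passes of a one-dimensional DCT against the direct two-dimensional formula.

```python
import math

N = 8

def dct_1d(x):
    """Orthonormal 1-D DCT-II of a length-N sequence."""
    out = []
    for k in range(N):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / N)
        out.append(scale * sum(
            x[n] * math.cos((2 * n + 1) * k * math.pi / (2 * N))
            for n in range(N)))
    return out

def dct_2d_row_column(block):
    """2-D DCT as two passes of the 1-D DCT: rows first, then columns."""
    rows = [dct_1d(r) for r in block]                      # pass 1: rows[y][h]
    cols = [dct_1d([rows[y][h] for y in range(N)])         # pass 2: cols[h][v]
            for h in range(N)]
    return [[cols[h][v] for h in range(N)] for v in range(N)]

def dct_2d_direct(block):
    """Direct evaluation of the 2-D DCT-II double sum, for comparison only."""
    def c(k):
        return math.sqrt((1.0 if k == 0 else 2.0) / N)
    return [[c(h) * c(v) * sum(
                block[y][x]
                * math.cos((2 * x + 1) * h * math.pi / (2 * N))
                * math.cos((2 * y + 1) * v * math.pi / (2 * N))
                for y in range(N) for x in range(N))
             for h in range(N)] for v in range(N)]

block = [[(7 * x * y + 11 * x + 3) % 256 for x in range(N)] for y in range(N)]
a = dct_2d_row_column(block)
b = dct_2d_direct(block)
max_diff = max(abs(a[v][h] - b[v][h]) for v in range(N) for h in range(N))
```

Because the two passes are identical one-dimensional transforms, a single 8*1 core can serve both, which is why the hardware discussed here is described in terms of 8*1 operations.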

The distributed arithmetic (DA) algorithm was proposed to overcome this drawback. It performs the multiplications of the DCT/IDCT without using multipliers, which makes possible an implementation that is efficient in both chip area and speed.
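The principle of distributed arithmetic can be shown with a small software model. This is a sketch under stated assumptions, not the circuit of the patent: it assumes a four-term inner product with fixed coefficients and two's-complement inputs, and the names `build_rom` and `da_inner_product`, the sample coefficients, and the 9-bit word length are illustrative.

```python
import math

def build_rom(coeffs):
    """Precompute every possible sum of coefficients.

    Entry `addr` holds the sum of coeffs[k] for each bit k set in addr,
    so a K-term inner product needs a ROM of 2**K words.
    """
    k_terms = len(coeffs)
    return [sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1)
            for addr in range(1 << k_terms)]

def da_inner_product(rom, xs, bits):
    """Compute sum(coeffs[k] * xs[k]) using only lookups, shifts and adds.

    Each step takes one bit position from every input, uses those bits as
    the ROM address, and accumulates the looked-up partial sum.
    """
    acc = 0.0
    for b in range(bits):
        addr = 0
        for k, x in enumerate(xs):
            # Python's >> on negative ints is arithmetic, so this reads
            # the two's-complement bit at position b.
            addr |= ((x >> b) & 1) << k
        term = rom[addr] * (1 << b)
        # The most significant bit carries negative weight in two's complement.
        acc += -term if b == bits - 1 else term
    return acc

coeffs = [math.cos(2 * math.pi / 16), math.cos(6 * math.pi / 16),
          -math.cos(6 * math.pi / 16), -math.cos(2 * math.pi / 16)]
xs = [100, -37, 5, -128]                      # fit in 9-bit two's complement
rom = build_rom(coeffs)
da_result = da_inner_product(rom, xs, bits=9)
direct_result = sum(c * x for c, x in zip(coeffs, xs))
```

The multiplications by the fixed coefficients are replaced by one table lookup per bit position, so the cost moves from multipliers to ROM words and an accumulator.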

In a digital video device, a DCT core using the DA scheme is configured so that each DCT coefficient output is multiplied by a different weighting coefficient (W) according to its frequency component. Examples of these weighting coefficients are listed below.

In (8*8) mode:

    W(h,v) = 1/4            if h = v = 0
    W(h,v) = w(h)w(v)/2     otherwise

In (2*4*8) mode:

    W(h,v) = 1/4            if h = v = 0
    W(h,v) = w(h)w(v)/2     otherwise

where

    w(0) = 1
    w(1) = CS₄/(4CS₇CS₂)
    w(2) = CS₄/(2CS₆)
    w(3) = 1/(2CS₅)
    w(4) = 7/8
    w(5) = CS₄/CS₃
    w(6) = CS₄/CS₂
    w(7) = CS₄/CS₁

Here, CSᵢ = cos(iπ/16), and h and v are parameters of the DCT: they denote the horizontal index (h) and the vertical index (v) of the elements in the source-information matrix and the transform-coefficient matrix, as those of ordinary skill in the art will readily understand.
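The weighting coefficients listed above can be evaluated directly. The short sketch below only transcribes the listed formulas; the names `CS`, `w` and `W` mirror the symbols in the text, and nothing else is assumed.

```python
import math

def CS(i):
    """CS_i = cos(i * pi / 16), as defined in the text."""
    return math.cos(i * math.pi / 16)

# One-dimensional weights w(0)..w(7), transcribed from the list above.
w = [
    1.0,
    CS(4) / (4 * CS(7) * CS(2)),
    CS(4) / (2 * CS(6)),
    1 / (2 * CS(5)),
    7 / 8,
    CS(4) / CS(3),
    CS(4) / CS(2),
    CS(4) / CS(1),
]

def W(h, v):
    """Two-dimensional weight W(h, v) for horizontal index h, vertical index v."""
    if h == 0 and v == 0:
        return 0.25
    return w[h] * w[v] / 2

for i, value in enumerate(w):
    print(f"w({i}) = {value:.5f}")
```

The weights decrease as the index grows, so higher-frequency coefficients are attenuated more strongly than low-frequency ones.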

FIG. 1 shows a conventional DCT core using the DA scheme, configured to multiply the DCT transform coefficients by the weighting coefficient (W).

The DCT block (10) processes the input data (x₀ to x₇), supplied in matrix form, in 8*1 or 2*4*1 DCT mode and outputs the result as output data (y₀ to y₇) in matrix form. The weighting coefficient processor (20), which includes a ROM storing the multiplication and addition results of the coefficient data, multiplies the output matrix data (y₀ to y₇) by the weighting coefficient (W).

For example, the 8*1 DCT matrix operation implemented with the DA scheme can be expressed by [Equation 1] below.

In the above equation, [symbol definition missing from the source].

The 8*1-mode DCT transform coefficient y₂ can then be calculated as in [Equation 2] below.

The transform coefficient y₂ calculated by [Equation 2] is then multiplied by the weighting coefficient W(2) in the weighting coefficient processor (20), giving the weighted output y'₂.

However, a DCT core of this kind, which uses the DA scheme and then multiplies the DCT transform coefficients by the weighting coefficient (W), needs a separate multiplier for the weighting multiplication. The hardware therefore grows considerably, and the chip area and volume increase.

The present invention was devised in view of the above, and its object is to provide a DA-type DCT/IDCT core capable of processing weighting coefficients without using a separate multiplier.

According to the present invention for achieving the above object, there is provided a DA-type DCT/IDCT core capable of weighting-coefficient processing, comprising: an input module that processes a video signal in 8-bit units and outputs the data signals (x₀ to x₇) and (y₀ to y₇) of the input matrices for the 8*1 DCT/IDCT and 2*4*1 DCT/IDCT modes; a first multiplexer that selects the input matrix signal of the 8*1 DCT, 2*4*1 DCT or 2*4*1 IDCT mode output from the input module; a first butterfly that performs addition and subtraction on the input matrix selected by the first multiplexer; a second multiplexer that selects either the add/subtract result of the first butterfly or the data signal of the input matrix in 8*1 IDCT mode; a memory module that stores the results of adding and multiplying the cosine coefficients for (x₀ to x₇) and (y₀ to y₇) and the weighting coefficient (W), and that performs a matrix operation between the signal selected by the second multiplexer and the stored addition and multiplication results; a bit-serial adder that accumulates, over all bits, the weighted bit-wise matrix operation results output from the memory module and outputs the 8*1 DCT, 2*(4*1) DCT and 2*(4*1) IDCT results; a second butterfly that performs addition and subtraction on the 8*1 DCT matrix of the bit-serial adder; and a third multiplexer that selects the output of the second butterfly or the output of the bit-serial adder.

The above and other objects and advantages of the present invention will become clearer from the preferred embodiment described below with reference to the accompanying drawings.

FIG. 1 is a hardware block diagram of the weighting-coefficient processing performed in the prior art, and

FIG. 2 is a block diagram of the DA-type DCT/IDCT core used in the present invention.

<Explanation of reference numerals for the main parts of the drawings>

30: input module
32, 36, 42, 64: multiplexers
34, 62: butterflies
50: memory module
60: bit-serial adder
66: output module

Hereinafter, a preferred embodiment of the present invention is described in detail with reference to the accompanying drawings.

FIG. 2 shows the configuration of a shared 8*1 DCT/IDCT and 2*(4*1) DCT/IDCT core capable of weighting-coefficient processing according to a preferred embodiment of the present invention.

The shared 8*1 DCT/IDCT and 2*(4*1) DCT/IDCT core used in the present invention is described in Patent Application No. 96-81068, entitled "Discrete Cosine Transform Core Structure", which was filed on the same date as the present application and assigned to the same assignee.

The shared 8*1 DCT/IDCT and 2*(4*1) DCT/IDCT core includes: an input module (30) that processes a video signal in 8-bit units and outputs the data signals (x₀ to x₇) and (y₀ to y₇) of the input matrices for the 8*1 DCT/IDCT and 2*4*1 DCT/IDCT modes; a 3*1 multiplexer (32) that selects the data signal of the input matrix of the 8*1 DCT, 2*4*1 DCT or 2*4*1 IDCT mode output from the input module (30); a butterfly (34) that performs addition and subtraction on the input matrix selected by the 3*1 multiplexer (32); a 2*1 multiplexer (36) that selects either the add/subtract result of the butterfly (34) or the data signal of the input matrix in 8*1 IDCT mode; a memory module (50) that stores the results of adding and multiplying the cosine coefficients for (x₀ to x₇) and (y₀ to y₇) and the weighting coefficient (W), and that performs a matrix operation between the signal selected by the 2*1 multiplexer (36) and the stored addition and multiplication results; a 4*1 multiplexer (42) that selects an output signal of the memory module (50); a bit-serial adder (BSA) (60) that accumulates, over all bits, the weighted bit-wise matrix operation results output from the 4*1 multiplexer (42) and outputs the 8*1 DCT, 2*(4*1) DCT and 2*(4*1) IDCT results; a butterfly (62) that performs addition and subtraction on the 8*1 DCT matrix of the bit-serial adder (60); a 2*1 multiplexer (64) that selects the output of the butterfly (62) or the output of the bit-serial adder (60); and an output module (66) that outputs the signal selected by the multiplexer (64).

That is, the 3*1 multiplexer (32) connected to the input module (30) determines whether the input signal is processed in 8*1 DCT/IDCT mode or in 2*(4*1) DCT/IDCT mode, according to an 8*1 mode selection control signal or a 2*(4*1) mode selection control signal supplied by a controller not shown in the drawing. The 2*1 multiplexer (36), under a DCT/IDCT selection control signal supplied by the same controller, selects either the add/subtract result output by the butterfly (34) or the 8*1 IDCT-mode data signal, and thereby determines whether a DCT or an IDCT is performed.

In this way, the butterfly (34) that is used in common by the 8*1 DCT, 2*(4*1) DCT and 2*(4*1) IDCT can be shared. Accordingly, the 2*1 multiplexer (64) at the back end selects the output of the back-end butterfly (62) only when the 8*1 IDCT is performed.
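Why a butterfly sits before the coefficient stage in the forward transform and after it in the inverse transform can be seen from the symmetry of the cosine basis. The sketch below covers the 8*1 mode only and is not the circuit of FIG. 2; it uses an unnormalized DCT-II, and the function names are hypothetical.

```python
import math

def c(k, n):
    """Cosine basis value cos((2n + 1) * k * pi / 16)."""
    return math.cos((2 * n + 1) * k * math.pi / 16)

def dct8_input_butterfly(x):
    """Unnormalized 8-point DCT-II with a butterfly on the inputs.

    Even outputs depend only on the sums x[n] + x[7 - n] and odd outputs
    only on the differences, so each output needs four terms instead of eight.
    """
    s = [x[n] + x[7 - n] for n in range(4)]
    d = [x[n] - x[7 - n] for n in range(4)]
    return [sum((s if k % 2 == 0 else d)[n] * c(k, n) for n in range(4))
            for k in range(8)]

def idct8_output_butterfly(y):
    """Matching inverse with a butterfly on the outputs.

    The even-index and odd-index partial sums are formed first; their sum
    gives x[n] and their difference gives x[7 - n].
    """
    a = [y[0] / 8] + [y[k] / 4 for k in range(1, 8)]
    x = [0.0] * 8
    for n in range(4):
        even = sum(a[k] * c(k, n) for k in (0, 2, 4, 6))
        odd = sum(a[k] * c(k, n) for k in (1, 3, 5, 7))
        x[n] = even + odd
        x[7 - n] = even - odd
    return x

x = [12, -7, 100, 55, -128, 127, 3, -64]
y = dct8_input_butterfly(x)
x_back = idct8_output_butterfly(y)
round_trip_error = max(abs(p - q) for p, q in zip(x, x_back))
```

In both directions the coefficient stage works on four-term sums, which keeps each lookup table in a DA implementation small.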

The memory module (50) includes memories (52), (54), (56) and (58), such as ROMs, which respectively store the results of adding and multiplying the cosine coefficients and the weighting coefficient (W) required for the 8*1 DCT, 8*1 IDCT, 2*4*1 DCT and 2*4*1 IDCT operations.

According to the present invention, the weighting coefficient can be separated into a horizontal (h) factor and a vertical (v) factor, except in the case h = v = 0. Therefore, if the products of the cosine coefficients and the weighting coefficients are computed in advance and stored in the ROMs for the one-dimensional DCT/IDCT, weighting-coefficient processing can be performed along with the DCT/IDCT without a separate multiplier. The exceptional case h = v = 0 can be handled by adding only a 1-bit right-shift circuit (for the DCT) and a 1-bit left-shift circuit (for the IDCT).
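The separability argument can be verified numerically. In the sketch below, the split of the factor 1/2 into the vertical pass and the use of 1/W(h,v) as the inverse weight are assumptions made for illustration; the patent states only that the weight separates and that the h = v = 0 case needs a 1-bit shift. The function names are hypothetical.

```python
import math

def CS(i):
    return math.cos(i * math.pi / 16)

w = [1.0, CS(4) / (4 * CS(7) * CS(2)), CS(4) / (2 * CS(6)), 1 / (2 * CS(5)),
     7 / 8, CS(4) / CS(3), CS(4) / CS(2), CS(4) / CS(1)]

def W(h, v):
    """Two-dimensional weight as listed in the text."""
    return 0.25 if h == 0 and v == 0 else w[h] * w[v] / 2

def separable_dct_weight(h, v):
    horizontal = w[h]        # folded into the ROM of the horizontal 1-D pass
    vertical = w[v] / 2      # folded into the ROM of the vertical pass (assumed split)
    weight = horizontal * vertical
    if h == 0 and v == 0:
        weight /= 2          # models the extra 1-bit right shift for the DC term
    return weight

def separable_idct_weight(h, v):
    weight = (1 / w[h]) * (2 / w[v])   # assumed inverse weighting, 1 / W(h, v)
    if h == 0 and v == 0:
        weight *= 2          # models the extra 1-bit left shift for the DC term
    return weight

dct_error = max(abs(separable_dct_weight(h, v) - W(h, v))
                for h in range(8) for v in range(8))
idct_error = max(abs(separable_idct_weight(h, v) * W(h, v) - 1.0)
                 for h in range(8) for v in range(8))
```

The product w(0)w(0)/2 equals 1/2 while W(0,0) is 1/4, so the DC term differs from the separable product by exactly a factor of two, which is why a single-bit shift is enough.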

For example, referring to [Equation 2] above for the 8*1-mode DCT transform coefficient y₂, y'₂ = W(2)*y₂. Because the memory (52) for the 8*1 DCT mode stores the multiplication and addition results of W(2)C₂, W(2)C₆, -W(2)C₆ and -W(2)C₂, the weighting coefficient is applied without a separate multiplication.
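The effect of folding the weight into the ROM can be modeled in software. The sketch assumes that y₂ has the usual even-part form C₂a₀ + C₆a₁ - C₆a₂ - C₂a₃ with aᵢ = xᵢ + x₇₋ᵢ and Cₖ = cos(kπ/16), with normalization omitted, and that W(2) is the one-dimensional weight w(2); the helper names and sample data are hypothetical.

```python
import math

def C(k):
    return math.cos(k * math.pi / 16)

W2 = C(4) / (2 * C(6))    # assumed to be the one-dimensional weight w(2)

def build_rom(coeffs):
    """ROM word `addr` holds the sum of coeffs[k] for each bit k set in addr."""
    return [sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1)
            for addr in range(1 << len(coeffs))]

def da_dot(rom, xs, bits):
    """Bit-by-bit lookup and accumulate over two's-complement inputs."""
    acc = 0.0
    for b in range(bits):
        addr = 0
        for k, x in enumerate(xs):
            addr |= ((x >> b) & 1) << k
        term = rom[addr] * (1 << b)
        acc += -term if b == bits - 1 else term   # sign bit has negative weight
    return acc

x = [12, -7, 100, 55, -128, 127, 3, -64]          # 8-bit samples
a = [x[i] + x[7 - i] for i in range(4)]           # butterfly sums, 9-bit range

plain_rom = build_rom([C(2), C(6), -C(6), -C(2)])
weighted_rom = build_rom([W2 * C(2), W2 * C(6), -W2 * C(6), -W2 * C(2)])

conventional = W2 * da_dot(plain_rom, a, bits=9)  # DA result, then a separate multiply
folded = da_dot(weighted_rom, a, bits=9)          # weight already inside the ROM
difference = abs(conventional - folded)
```

Both ROMs have the same number of words, so the weighting is obtained by changing the stored values rather than by adding a multiplier stage.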

The weighted coefficient data output from the memories (52), (54), (56) and (58) is therefore selected by the multiplexer (42) and processed by the following bit-serial adder (60) and butterfly (62).

As described above, the DA-type DCT/IDCT core according to the present invention uses memories in which the DCT coefficients are already multiplied by the weighting coefficients. This replaces the multiplier otherwise needed for weighting-coefficient processing and so minimizes the hardware size and chip area.

Claims

A DCT core structure capable of weighting-coefficient processing, comprising: an input module that processes a video signal in 8-bit units and outputs the data signals (x₀ to x₇) and (y₀ to y₇) of the input matrices for the 8*1 DCT/IDCT and 2*4*1 DCT/IDCT modes; a first multiplexer that selects the input matrix signal of the 8*1 DCT, 2*4*1 DCT or 2*4*1 IDCT mode output from the input module; a first butterfly that performs addition and subtraction on the input matrix selected by the first multiplexer; a second multiplexer that selects either the add/subtract result of the first butterfly or the data signal of the input matrix in 8*1 IDCT mode; a memory module that stores the results of adding and multiplying the cosine coefficients for (x₀ to x₇) and (y₀ to y₇) and the weighting coefficient (W), and that performs a matrix operation between the signal selected by the second multiplexer and the stored addition and multiplication results; a bit-serial adder that accumulates, over all bits, the weighted bit-wise matrix operation results output from the memory module and outputs the 8*1 DCT, 2*(4*1) DCT and 2*(4*1) IDCT results; a second butterfly that performs addition and subtraction on the 8*1 DCT matrix of the bit-serial adder; and a third multiplexer that selects the output of the second butterfly or the output of the bit-serial adder.