KR20030030404A

KR20030030404A - The multiplier circuit

Info

Publication number: KR20030030404A
Application number: KR1020010062532A
Authority: KR
Inventors: 변현일
Original assignee: (주)씨앤에스 테크놀로지
Priority date: 2001-10-11
Filing date: 2001-10-11
Publication date: 2003-04-18
Also published as: KR100431354B1

Abstract

PURPOSE: A multiplier circuit is provided that removes the portion of a conventional multiplication circuit that computes unnecessary output bits, thereby reducing both the delay from data input to data output and the power consumption. CONSTITUTION: The circuit comprises a multiplier (401), an overflow checker (402), and a selector (403). The multiplier (401) receives N-bit data and M-bit data and multiplies them while limiting the product to L bits, where L does not exceed N+M. The overflow checker (402) determines whether the output of the multiplier (401) has overflowed by examining (N+M)-L carry outputs. The selector (403) outputs the largest number representable in L bits if an overflow has occurred, and otherwise outputs the L-bit multiplication result.

Description

The multiplier circuit

The present invention relates to a multiplier circuit with a limited-size output. More specifically, when an N-bit × M-bit multiplication is known to produce a result that fits in fewer than (N+M) bits, for example within L bits, the circuit computes only L bits of the result rather than all (N+M) bits. This reduces the delay from input to output and the hardware area needed for implementation, and consequently the power consumption.

FIG. 1 is a block diagram of a conventional multiplier that multiplies an N-bit number by an M-bit number to obtain an L-bit output. Here L is a natural number no greater than N+M.

As shown in FIG. 1, a general carry-save multiplier (101) multiplies an N-bit number by an M-bit number and outputs an (N+M)-bit result, and a saturator (102) converts the (N+M)-bit number into an L-bit number.

That is, if the (N+M)-bit number is outside the range representable in L bits, the saturator (102) outputs the largest number representable in L bits as the final output; otherwise it passes the output of the carry-save multiplier (101) through unchanged as the final output.
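The conventional arrangement of FIG. 1 can be sketched behaviorally as follows. This is only an illustration, assuming unsigned operands; the function names `saturate` and `conventional_multiply` are hypothetical and do not come from the patent.

```python
def saturate(value: int, l_bits: int) -> int:
    """Clamp a non-negative value to the largest number representable in l_bits."""
    max_value = (1 << l_bits) - 1
    return max_value if value > max_value else value


def conventional_multiply(x: int, y: int, n_bits: int, m_bits: int, l_bits: int) -> int:
    """Compute the full (N+M)-bit product, then saturate it to L bits."""
    if not (0 <= x < (1 << n_bits) and 0 <= y < (1 << m_bits)):
        raise ValueError("operand does not fit in the stated input width")
    product = x * y                   # role of the carry-save multiplier (101)
    return saturate(product, l_bits)  # role of the saturator (102)
```

With 9-bit and 5-bit inputs and L = 10, a product such as 500 × 2 passes through unchanged, while 511 × 31 is replaced by 1023. Note that the full 14-bit product is formed in both cases, which is the inefficiency the invention targets.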

FIG. 2 is a detailed block diagram of the general carry-save multiplier shown in FIG. 1, for an example in which the two inputs are 9 bits and 5 bits wide, respectively. FIG. 3 defines the symbols used for the multiplier of FIG. 2, and FIG. 4 is a detailed block diagram of the saturator shown in FIG. 1.

In the first input, X8 is the most significant bit and X0 is the least significant bit. In the second input, Y4 is the most significant bit and Y0 is the least significant bit.

If P denotes the result of the general carry-save multiplier (101) shown in FIG. 1, then

$$P = X \times Y = X \times \left(Y_0 + 2Y_1 + 2^2 Y_2 + 2^3 Y_3 + 2^4 Y_4\right) = XY_0 + 2XY_1 + 2^2 XY_2 + 2^3 XY_3 + 2^4 XY_4 \qquad \text{(Equation 1)}$$

Equation 1 above holds, and the terms of its final expression correspond, in order, to (201), (202), (203), (204), and (205) in FIG. 2.

Adding the partial products (201) to (205) in order yields P. Looking first at the part that adds (201) and (202), half adders are used. They add the bit positions where (201) and (202) overlap vertically; because only two inputs are involved there, half adders are used instead of full adders.

From the addition of (202) and (203) onward, however, full adders are used, because adding the vertically overlapping bit positions of (202) and (203) must also include the carry outputs produced when (201) and (202) were added.

After the additions have proceeded in the same way through the vertically overlapping positions of (204) and (205), what remains is to add the sum outputs and carry outputs of the full adders below (205), offset from each other by one bit position. This addition differs from the additions of (201) to (205) in that no carry output needs to be extracted except at the most significant position.

The vector merge adder (206) that performs this addition can therefore be built from full adders, but a more complex implementation such as the carry-lookahead scheme may be used to increase speed.

Unlike a multiplier that passes carry outputs to the left when adding the partial products (201) to (205), the carry-save multiplier (101) passes carry outputs diagonally when adding the partial products, so the critical path runs through the vector merge adder (206) mentioned above. This has the advantage that the speed of the whole multiplier can be tuned by adjusting the speed of the vector merge adder (206).
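The row-by-row addition described above can be modeled at bit level. The sketch below is an assumption about how the cells of FIG. 2 are arranged (the figure itself is not available here): the first added row uses half adders, later rows use full adders whose carries move one column left into the next row, and a final addition plays the role of the vector merge adder. The name `carry_save_multiply` and the `counts` dictionary are hypothetical.

```python
def carry_save_multiply(x: int, y: int, n: int, m: int):
    """Bit-level model of an unsigned n x m carry-save array multiplier.

    Returns (product, counts), where counts tallies the cells that were used.
    """
    xb = [(x >> i) & 1 for i in range(n)]
    yb = [(y >> j) & 1 for j in range(m)]
    width = n + m
    counts = {"and": n * m, "half": 0, "full": 0}

    sums = [0] * width     # sum bits, indexed by column (bit weight)
    carries = [0] * width  # saved carry bits, indexed by the column they feed
    for i in range(n):     # first partial product passes straight through
        sums[i] = xb[i] & yb[0]

    for k in range(1, m):  # add partial product k, shifted left by k columns
        new_sums = sums[:]            # columns below k are finished product bits
        new_carries = [0] * width
        for c in range(k, k + n - 1):
            a = xb[c - k] & yb[k]
            b = sums[c]
            if k == 1:                # only two inputs: half adder
                s, cout = a ^ b, a & b
                counts["half"] += 1
            else:                     # also absorbs the saved carry: full adder
                cin = carries[c]
                s = a ^ b ^ cin
                cout = (a & b) | (a & cin) | (b & cin)
                counts["full"] += 1
            new_sums[c] = s
            new_carries[c + 1] = cout  # carry is saved for the next row
        new_sums[k + n - 1] = xb[n - 1] & yb[k]  # top bit of the row has no adder
        sums, carries = new_sums, new_carries

    # Vector merge adder: combine the remaining sum and carry vectors.
    product = sum((sums[c] + carries[c]) << c for c in range(width))
    return product, counts
```

For the 9-bit by 5-bit case, this model uses 8 half adders and 24 full adders and reproduces the ordinary integer product for every input pair.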

The multiplier of FIG. 2 is an example for specific input widths. In the general case, a multiplier that multiplies an N-bit number by an M-bit number and outputs an (N+M)-bit number requires the following hardware elements.

- AND gates: N×M

- Half adders: (N-1)

- Full adders: (N-1)×(M-2)

- Vector merge adder: (N-1) bits

The hardware size of the vector merge adder depends on how it is implemented, and it grows as the bit width increases.

In the multiplier of FIG. 2, the critical path passes in order through one AND gate (the one that produces X1Y0), one half adder, three full adders, and the vector merge adder. In the general case, the critical path passes through the following elements.

- 1 AND gate

- 1 half adder

- (M-2) full adders

- 1 vector merge adder
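The element counts and critical-path composition listed above can be collected in a small helper for checking specific widths. The function name `conventional_resources` and the dictionary keys are hypothetical; the formulas are the ones stated in the text.

```python
def conventional_resources(n: int, m: int) -> dict:
    """Hardware elements of a conventional n x m carry-save multiplier, per the text."""
    return {
        "and_gates": n * m,
        "half_adders": n - 1,
        "full_adders": (n - 1) * (m - 2),
        "vector_merge_adder_bits": n - 1,
        # Elements traversed by the critical path, in order.
        "critical_path": {
            "and_gates": 1,
            "half_adders": 1,
            "full_adders": m - 2,
            "vector_merge_adders": 1,
        },
    }
```

For N = 9 and M = 5 this gives 45 AND gates, 8 half adders, 24 full adders, an 8-bit vector merge adder, and 3 full adders on the critical path, matching the FIG. 2 example.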

If the result (209) of the multiplier of FIG. 2 always falls within, say, a 10-bit range, then using FIG. 2 to produce a 14-bit result and discarding the upper 4 bits is inefficient, because the portion (210) of FIG. 2 ends up serving no purpose.

Accordingly, the present invention removes the portion (210), which is unnecessary when the result of the multiplier of FIG. 2 is guaranteed not to exceed the 10-bit range, and instead provides a multiplier circuit with overflow handling that indicates when the multiplication result has exceeded the 10-bit range.

That is, the present invention addresses the problem described above, and its object is to provide a multiplier circuit that reduces hardware area, input-to-output delay, and power consumption by removing the part of the multiplier that computes unneeded output bits.

To achieve the above object, the present invention provides,

in a multiplier circuit that multiplies an N-bit number by an M-bit number and outputs an L-bit number (L ≤ N+M),

a multiplier that receives N-bit and M-bit inputs and has a limited-size output;

an overflow determiner that receives (N+M)-L carry outputs from the multiplier and determines whether an overflow has occurred; and

a selector that, depending on whether the overflow determiner has detected an overflow, outputs as the final output either the largest number representable in L bits or the L output bits unchanged.

FIG. 1 is a block diagram of a conventional multiplier that multiplies an N-bit number by an M-bit number to obtain an L-bit output.

FIG. 2 is a detailed block diagram of the general carry-save multiplier shown in FIG. 1.

FIG. 3 defines the symbols used for the multiplier shown in FIG. 2.

FIG. 4 is a detailed block diagram of the saturator shown in FIG. 1.

FIG. 5 is a block diagram of a multiplier circuit according to the present invention.

FIG. 6 is a detailed block diagram of the multiplier shown in FIG. 5.

FIG. 7 is a detailed block diagram of the overflow determiner shown in FIG. 5.

FIG. 8 is a block diagram of an MPEG-4 video texture decoder to which the present invention is applied.

Hereinafter, the configuration and operation of an embodiment of the present invention are described in detail with reference to the accompanying drawings.

FIG. 5 is a block diagram of a multiplier circuit according to the present invention.

Referring to FIG. 5, the circuit consists of a multiplier (401) with a limited-size output, an overflow determiner (402), and a selector (403).

The limited-output multiplier (401) receives an N-bit input and an M-bit input and outputs their product. It differs from the general carry-save multiplier (101) of FIG. 1, which produces an (N+M)-bit output, in that its output is limited to L bits.

When input A and input B are, for example, 9 bits and 5 bits wide, 14 bits are generally needed to represent the result. Here, however, it is known in advance that the result can always be represented in 10 bits, so the part corresponding to the 11th to 14th bits of the conventional multiplier output is removed. The relationship of Equation 2 below must be satisfied.

$$\mathrm{MAX}(M, N) \le L < M + N \qquad \text{(Equation 2)}$$

Here MAX(M, N) is M if M > N, and N otherwise. Equation 2 requires L to be smaller than M+N because only then is any part of the carry-save multiplier (101) of FIG. 1 saved.

If L = M+N, the limited-output multiplier (401) of FIG. 5 is identical to the carry-save multiplier (101) of FIG. 1. The other requirement in Equation 2, that L be no smaller than MAX(M, N), must always hold independently of the present invention.

If L were smaller than MAX(M, N), it would mean that MAX(M, N) is larger than necessary. For example, if the two inputs of a multiplier are 9 bits (M) and 5 bits (N) wide and the output is known always to fit within 7 bits (L), then the 9-bit input must itself be representable in 7 bits. L must therefore never be smaller than MAX(M, N).
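The constraint on L discussed above can be written as a one-line check. This is a minimal sketch; the function name `output_width_is_valid` is hypothetical.

```python
def output_width_is_valid(n: int, m: int, l: int) -> bool:
    """True when MAX(M, N) <= L < M + N, the range in which the invention saves hardware."""
    return max(m, n) <= l < m + n
```

For 9-bit and 5-bit inputs, L = 10 satisfies the condition, L = 14 does not (nothing would be removed), and L = 7 does not (the 9-bit input would be wider than necessary).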

When the multiplication in the limited-output multiplier (401) finishes, (N+M)-L carry outputs are produced. These are passed to the overflow determiner (402), which determines whether the output of the multiplier (401) has overflowed. An overflow means that the output of the limited-output multiplier (401) has become a value that cannot be fully represented in L bits.

The present invention is of course intended for situations in which the output of the limited-output multiplier (401) is certain to be representable in L bits. Nevertheless, it also accounts for the case in which an error in one or both inputs of the multiplier (401) causes the output to exceed the L-bit range, because even when such an error occurs, limiting the output of the multiplier (401) to the L-bit range better suits the purpose for which the multiplier is used.

The overflow determiner (402) and the selector (403) exist because the part of the multiplier (401) corresponding to the unused output bits has been removed. It might therefore seem that the newly introduced overflow determiner (402) and selector (403) cancel out the hardware area saved in the limited-output multiplier (401), but this is not the case.

The reason is that, although the overflow determiner (402) and the selector (403) are new, the saturator (102) of FIG. 1 has been removed. Moreover, the hardware area of the saturator (102) in the conventional multiplier exactly equals the combined hardware area of the overflow determiner (402) and the selector (403) in the proposed multiplier.

Whether the present invention reduces the multiplier area can therefore be judged purely by comparing the area of the carry-save multiplier (101) of FIG. 1 with that of the limited-output multiplier (401) of FIG. 5.

When the overflow determiner (402) determines that an overflow has occurred, the selector (403) selects the largest number representable in L bits and sends it out as the final output. The largest number representable in L bits is $2^L - 1$, which in binary is L consecutive ones. When the overflow determiner (402) determines that no overflow has occurred, the selector (403) passes the L output bits of the limited-output multiplier (401) through unchanged as the final output.
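The three blocks of FIG. 5 can be sketched at a behavioral level as below. This is an illustration under stated assumptions, not the patent's circuit: operands are taken as unsigned, the limited multiplier is modeled as summing only the partial-product bits whose weight is below 2^L, and the overflow flag is modeled as "that retained sum carried out of bit L-1". All function names are hypothetical.

```python
def limited_multiply(x: int, y: int, n_bits: int, m_bits: int, l_bits: int):
    """Model of multiplier (401): return (low L bits, overflow flag)."""
    if not (0 <= x < (1 << n_bits) and 0 <= y < (1 << m_bits)):
        raise ValueError("operand does not fit in the stated input width")
    mask = (1 << l_bits) - 1
    kept = 0
    for j in range(m_bits):
        if (y >> j) & 1:
            # Partial-product bits at weight 2**L or above are never formed.
            kept += (x << j) & mask
    return kept & mask, kept > mask


def select_output(low_bits: int, overflow: bool, l_bits: int) -> int:
    """Model of selector (403): all ones on overflow, else the L bits as-is."""
    return (1 << l_bits) - 1 if overflow else low_bits


def multiplier_circuit(x: int, y: int, n_bits: int, m_bits: int, l_bits: int) -> int:
    low, overflow = limited_multiply(x, y, n_bits, m_bits, l_bits)
    return select_output(low, overflow, l_bits)
```

In this model, every product that fits in L bits is returned exactly, and 511 × 3 with L = 10 is replaced by 1023. The flag here reflects only carries out of the retained columns, so the sketch says nothing about inputs whose excess lies entirely in the discarded partial-product bits.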

FIG. 6 shows the multiplier of FIG. 2 with the portion (210), which serves output bits that will not be used, removed. It is an example in which a 9-bit number and a 5-bit number are input and the output is always within the 10-bit range. The improvements of FIG. 6 over FIG. 2 are as follows.

First, the number of AND gates is reduced by 6.

Second, the number of full adders is reduced by 3.

Third, the width of the vector merge adder is reduced from 8 bits to 5 bits.

Of these, the third in particular enables a larger reduction in hardware area when the vector merge adder is implemented in a more complex way in order to reduce the output delay of the whole multiplier.

In FIG. 6, the carry output of the leftmost full adder in the full-adder row below partial product (503), the carry output of the leftmost full adder in the row below (504), the carry output of the leftmost full adder in the row below (505), and the carry output of the vector merge adder are all connected to the inputs of the overflow determiner (402) of FIG. 5. The number of these signals, 4, equals the sum of the two input widths, 14, minus the limited output width, 10.

FIG. 7 is a detailed block diagram of the overflow determiner shown in FIG. 5.

The overflow determiner shown in FIG. 7 uses the inputs it receives from the limited-output multiplier (401) of FIG. 5 to determine whether the output of the multiplier (401) is a meaningful value, that is, whether it lies within the 10-bit range.

It does so by taking the logical OR of all the signals it receives. If any of the signals received by the overflow determiner (402) is nonzero, the 11th to 14th bits would have to be present in the output of the limited-output multiplier (401) for the result to be correct.
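The carry signals and their OR can be illustrated with a bit-level model of a truncated carry-save array. The cell arrangement is an assumption (FIG. 6 is not available here): every cell in column L or above is omitted, and any carry that would have entered column L is collected as an overflow signal. The names `limited_carry_save_multiply` and `overflow_detected` are hypothetical, and the model assumes MAX(M, N) ≤ L.

```python
def limited_carry_save_multiply(x: int, y: int, n: int, m: int, l: int):
    """Truncated n x m carry-save array keeping only columns below l.

    Returns (low l bits, list of carry signals sent to the overflow determiner).
    """
    xb = [(x >> i) & 1 for i in range(n)]
    yb = [(y >> j) & 1 for j in range(m)]
    sums = [0] * l
    carries = [0] * l
    overflow_carries = []
    for i in range(n):
        sums[i] = xb[i] & yb[0]

    for k in range(1, m):
        new_sums = sums[:]
        new_carries = [0] * l
        for c in range(k, min(k + n - 1, l)):   # adders at columns >= l are omitted
            a, b, cin = xb[c - k] & yb[k], sums[c], carries[c]
            s = a ^ b ^ cin                      # cin is 0 in the half-adder row
            cout = (a & b) | (a & cin) | (b & cin)
            new_sums[c] = s
            if c + 1 < l:
                new_carries[c + 1] = cout
            else:
                overflow_carries.append(cout)    # leftmost carry of this row
        if k + n - 1 < l:
            new_sums[k + n - 1] = xb[n - 1] & yb[k]
        sums, carries = new_sums, new_carries

    # Vector merge adder over the retained columns; its carry-out is the last signal.
    total = sum((sums[c] + carries[c]) << c for c in range(l))
    overflow_carries.append(total >> l)
    return total & ((1 << l) - 1), overflow_carries


def overflow_detected(carry_signals) -> bool:
    """Overflow determiner (402): logical OR of all received carry signals."""
    return any(carry_signals)
```

For N = 9, M = 5, L = 10 this model yields exactly 4 carry signals, consistent with 14 minus 10, and none of them is set whenever the true product fits in 10 bits.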

The multiplier of FIG. 6 is an example for specific input widths. In the general case, a multiplier that multiplies an N-bit number by an M-bit number and outputs an L-bit number (L ≤ N+M) requires the following hardware elements.

- Vector merge adder: (L-M) bits

The hardware size of the vector merge adder depends on how it is implemented, and it grows as the bit width increases.

Here MAX(A, B) denotes the larger of A and B, that is, the one that is not smaller. In the expressions for the hardware elements above, the term following '-' is the amount by which the multiplier of the present invention is reduced relative to the conventional multiplier of FIG. 2.
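Only the vector merge adder expression survives in this text, so the sketch below does not reproduce the patent's own expressions for the other elements. Instead it counts cells by enumeration, under the assumption that the limited multiplier simply omits every AND gate and adder cell located in column L or above of the conventional array. The function name `limited_resources` is hypothetical, and the results are checked only against the two worked examples given in the description.

```python
def limited_resources(n: int, m: int, l: int) -> dict:
    """Cell counts for an n x m array truncated to l output bits (enumeration model)."""
    # AND gate for bit i of X and bit j of Y sits in column i + j.
    and_gates = sum(1 for j in range(m) for i in range(n) if i + j < l)
    # Half adders occupy columns 1 .. n-1 of the first added row.
    half_adders = sum(1 for c in range(1, n) if c < l)
    # Full adders of row k occupy columns k .. k+n-2.
    full_adders = sum(1 for k in range(2, m) for c in range(k, k + n - 1) if c < l)
    return {
        "and_gates": and_gates,
        "half_adders": half_adders,
        "full_adders": full_adders,
        "vector_merge_adder_bits": l - m,   # expression stated in the text
    }
```

For N = 9, M = 5, L = 10 the enumeration gives 39 AND gates, 21 full adders, and a 5-bit vector merge adder, that is, 6 fewer AND gates and 3 fewer full adders than the conventional array, as stated for FIG. 6.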

In the multiplier of FIG. 6, the critical path passes in order through one AND gate (the one that produces X1Y0), one half adder, three full adders, and the carry output of the vector merge adder. In the general case, the critical path passes through the following elements.

- 1 AND gate

- 1 half adder

- (M-2) full adders

- 1 vector merge adder

The difference between the critical paths of FIG. 2 and FIG. 6 is the width of the vector merge adder. For a given vector merge adder implementation, how much faster the multiplier of the present invention becomes depends on how much the limited number of output bits narrows the vector merge adder.

FIG. 8 is a block diagram of an MPEG-4 video texture decoder to which the present invention is applied. The invention is particularly beneficial when applied to digital video decoders, especially MPEG-4 video texture decoders.

Referring to FIG. 8, the decoder consists of a variable length decoder (variable length decoding; 701), an inverse scanner (inverse scan; 702), an AC/DC coefficient compensator (inverse AC/DC prediction; 703), an inverse quantizer (inverse quantization; 704), an inverse discrete cosine transformer (inverse DCT; 705), and a motion compensator (motion compensation; 706).

The present invention can be applied to the AC/DC coefficient compensator (703) and the inverse quantizer (704) of FIG. 8. To predict an AC coefficient, the AC/DC coefficient compensator (703) must first multiply the AC coefficient received from the inverse scanner (702) by the quantization coefficient for the prediction direction. The absolute value of the AC coefficient from the inverse scanner (702) can be represented in 10 bits, and the quantization coefficient can be represented in 5 bits.

The product of the two, however, must be representable in 10 bits. This is the case N=10, M=5, L=10 in FIG. 5. Compared with the conventional multiplier, the number of AND gates falls from 50 to 40 by Equation 3, the number of half adders stays at 9 by Equation 4, the number of full adders falls from 27 to 21 by Equation 5, and the width of the vector merge adder falls from 9 bits to 5 bits by Equation 6.

Depending on the situation, applying the present invention can reduce the hardware area of the vector merge adder to as little as one tenth, as in the following case.

Suppose that, without the present invention, the input-to-output delay of the vector merge adder must be 3 ns or less in order to keep the output delay of the whole multiplier below a certain value.

Suppose further that, in some semiconductor process, a 9-bit vector merge adder has area 58 and delay 4.51 ns when implemented as a ripple-carry adder, and area 347 and delay 1.33 ns when implemented as a carry-lookahead adder. Because the delay of the vector merge adder must be 3 ns or less, the much larger carry-lookahead implementation has to be used.

With the present invention, however, the width of the vector merge adder falls from 9 bits to 5 bits. Suppose that, in the same process and under the same conditions, a 5-bit ripple-carry vector merge adder has area 32 and delay 2.67 ns. Since this delay is below 3 ns, the ripple-carry vector merge adder can be used, so applying the invention reduces the area of the vector merge adder from 347 to 32, to about 1/10.8 of its original size.
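The selection argument above amounts to choosing the smallest adder that meets the delay budget. The sketch below restates it with the figures quoted in the text; the `AdderOption` type and `choose_adder` function are hypothetical, and no figure for a 5-bit carry-lookahead adder is included because the text gives none.

```python
from typing import NamedTuple, Sequence


class AdderOption(NamedTuple):
    name: str
    area: float      # area in the unit used by the text (unspecified)
    delay_ns: float


def choose_adder(options: Sequence[AdderOption], max_delay_ns: float) -> AdderOption:
    """Return the smallest-area option whose delay meets the budget."""
    feasible = [o for o in options if o.delay_ns <= max_delay_ns]
    if not feasible:
        raise ValueError("no adder implementation meets the delay budget")
    return min(feasible, key=lambda o: o.area)


# Figures quoted in the description for one hypothetical process.
nine_bit = [AdderOption("ripple-carry", 58, 4.51), AdderOption("carry-lookahead", 347, 1.33)]
five_bit = [AdderOption("ripple-carry", 32, 2.67)]

before = choose_adder(nine_bit, 3.0)   # carry-lookahead is the only feasible choice
after = choose_adder(five_bit, 3.0)    # ripple-carry now meets the budget
area_ratio = before.area / after.area
```

With a 3 ns budget, the 9-bit case is forced to the carry-lookahead adder (area 347), while the 5-bit case can use the ripple-carry adder (area 32), giving the ratio of roughly 10.8 cited in the text.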

In the inverse quantizer (704), an 11-bit input is multiplied by a 5-bit quantization coefficient during inverse quantization, and the result must also be 11 bits, so the present invention can be applied there as well to reduce hardware area.

As described above, the multiplier circuit with a limited-size output according to the present invention has the following advantages.

First, by removing the part of the multiplier that computes unneeded output bits, the invention reduces hardware area, input-to-output delay, and power consumption.

Second, suppose that without the invention a general multiplier structure like that of FIG. 2 has too long an input-to-output delay, so that the delay would have to be reduced with a more complex implementation such as the Booth algorithm. If applying the invention achieves the same delay reduction with a carry-save multiplier like that of FIG. 6, even more hardware resources are saved.

Third, applying the invention narrows the vector merge adder, so the hardware area always decreases if its implementation is unchanged. Moreover, in cases where the vector merge adder would otherwise need a more complex implementation to shorten the multiplier's delay, the reduced width removes the need for that added complexity.

Claims

A multiplier circuit that multiplies an N-bit number by an M-bit number and outputs an L-bit number (L ≤ N+M), comprising:

a multiplier that receives N-bit and M-bit inputs and has a limited-size output;

an overflow determiner that receives (N+M)-L carry outputs from the multiplier and determines whether an overflow has occurred; and

a selector that, depending on whether the overflow determiner has detected an overflow, selects either the largest number representable in L bits or the L output bits unchanged and outputs it as the final output.

The multiplier circuit of claim 1, wherein the multiplier is of the carry-save type.

The multiplier circuit of claim 1 or 2, wherein the following expressions are satisfied in implementing a carry-save multiplier that outputs an L-bit number (L ≤ N+M).

- Vector merge adder: (L-M) bits

The multiplier circuit of claim 1, wherein, when the overflow determiner determines that an overflow has occurred, the selector selects the largest number representable in L bits and outputs it as the final output.

The multiplier circuit of claim 1, wherein, when the overflow determiner and the selector determine that the output of the multiplier is outside the L-bit range, the final output is replaced with the largest number representable in L bits.