KR100386979B1

KR100386979B1 - Method of parallelizing a bit-serial multiplier over a Galois field and a bit serial-parallel multiplier using the same

Info

Publication number: KR100386979B1
Application number: KR10-2000-0028305A
Authority: KR
Inventors: 최영민; 박만혁
Original assignee: 주식회사데이콤
Priority date: 2000-05-25
Filing date: 2000-05-25
Publication date: 2003-06-09
Also published as: KR20010107087A

Abstract

The present invention relates to a method of parallelizing a bit-serial multiplier over a Galois field, and to a bit serial-parallel multiplier over a Galois field that uses the method to improve operation speed with a simple configuration.

The method of the present invention is a method of parallelizing a bit-serial multiplier over a Galois field that multiplies a multiplier g[N-1:0] in GF(2^N) by a transformed multiplicand x'[N-1:0] and outputs a transformed result z'[N-1:0]. The shift register that stores the multiplicand is parallelized into a predetermined number of groups according to the parallelization depth, and the combinational circuit that computes the sum of products of the multiplicand and the multiplier is likewise parallelized into a predetermined number of circuits according to the parallelization depth. This reduces the number of clocks the multiplier needs for an operation and thereby improves operation speed.

Accordingly, by parallelizing the bit-serial multiplier over a Galois field, the present invention shortens the time required for a multiplication without adding much hardware. Because the multiplier is parallelized only as far as the system in which it is used requires, the parallelization depth can be tuned, so that computation time and hardware area can both be optimized.

Description

Method of parallelizing a bit-serial multiplier over a Galois field and a bit serial-parallel multiplier using the same

The present invention relates to a multiplier over a Galois field, and more particularly to a method of parallelizing a bit-serial multiplier and to a bit serial-parallel multiplier over a Galois field that uses the method to improve operation speed with a simple configuration.

In general, multiplication over a Galois field is implemented with either a parallel multiplier or a serial multiplier. The serial multiplier is widely used because it occupies little hardware area, but it requires more computation time than a parallel multiplier. The parallel multiplier, on the other hand, reduces computation time, but its hardware is more complex and occupies a large area when implemented on a chip, which increases cost. In other words, the conventional choice has been between a bit-serial multiplier to reduce hardware area and a parallel multiplier to reduce computation time. Over a Galois field, however, a serial multiplier needs N clocks to perform a multiplication, while a parallel multiplier needs memory to store N*N bit elements and therefore occupies a large amount of hardware.

To solve the above problems, an object of the present invention is to provide a method of parallelizing a bit-serial multiplier over a Galois field that compensates for the drawbacks of both the parallel multiplier and the bit-serial multiplier, and a serial-parallel multiplier using the method.

FIG. 1 is a diagram showing the configuration of a general bit-serial multiplier;

FIG. 2 is a diagram showing the concept of a bit serial-parallel multiplier (parallelization depth 2) according to the present invention;

FIG. 3 is a detailed block diagram of the bit serial-parallel multiplier shown in FIG. 2; and

FIG. 4 is a diagram showing the concept of a bit serial-parallel multiplier (parallelization depth 4) according to the present invention.

* Explanation of reference numerals for main parts of the drawings *

301~309: shift registers
311~318: registers
321~328, 331~338: Galois field multipliers
341, 342, 351, 352: Galois field adders

To achieve the above object, the method of the present invention is a method of parallelizing a bit-serial multiplier over a Galois field that multiplies a multiplier g[N-1:0] in GF(2^N) by a transformed multiplicand x'[N-1:0] and outputs a transformed result z'[N-1:0], characterized in that the shift register that stores the multiplicand is parallelized into a predetermined number of groups according to the parallelization depth, and the combinational circuit that computes the sum of products of the multiplicand and the multiplier is parallelized into a predetermined number of circuits according to the parallelization depth, thereby reducing the number of clocks required for the operation of the multiplier and improving operation speed.

To achieve the above object, the apparatus of the present invention is a multiplier over a Galois field that multiplies a multiplier g[7:0] in GF(2⁸) by a transformed multiplicand x'[7:0] and outputs a transformed result z'[7:0], characterized by comprising: a register unit having, in parallel, a first shift register group in which four shift registers (x'[1], x'[3], x'[5], x'[7]) are connected in series and shift their stored values every clock, and a second shift register group in which five shift registers (x'[0], x'[2], x'[4], x'[6], x'[8]) are connected in series and shift their stored values every clock, the transformed multiplicand being initially loaded into registers x'[1] to x'[8]; a first combinational circuit that computes the sum of products of the outputs of registers x'[0] to x'[7] and the multiplier bits g[0] to g[7] and outputs a result z'[t] every clock; and a second combinational circuit that computes the sum of products of the outputs of registers x'[1] to x'[8] and the multiplier bits g[0] to g[7] and outputs a result z'[t+1] every clock.

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings.

The present invention parallelizes the shift register used in a bit-serial multiplier over the Galois field GF(2^N) to reduce the number of clocks needed for a multiplication, and thereby shortens computation time without adding much hardware. A bit-serial multiplier over GF(2^N) needs N clocks to perform a multiplication; according to the present invention, each doubling of the parallelization depth P halves the number of clocks required. In exchange, 2*P - 3 additional registers and additional combinational circuits that compute the sum of products are required. Unless N is large, the additional hardware remains modest up to a parallelization depth of N/2.
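The clock-count relationship stated above can be tabulated with a short sketch. The function name `clocks_for_depth` is hypothetical, the sketch counts computation clocks only (it does not model loading or any initial clock), and it assumes the depth P divides N evenly.

```python
def clocks_for_depth(n: int, depth: int) -> int:
    """Computation clocks for one GF(2^n) multiplication at parallelization depth `depth`.

    Assumption (not from the source): depth must divide n exactly, so that
    every clock produces exactly `depth` result bits.
    """
    if depth < 1 or n % depth != 0:
        raise ValueError("depth must be a positive divisor of n")
    return n // depth


if __name__ == "__main__":
    # GF(2^8): each doubling of the depth halves the clock count.
    for depth in (1, 2, 4):
        print(f"P={depth}: {clocks_for_depth(8, depth)} clocks")
```

For N = 8 this gives 8, 4, and 2 computation clocks at depths 1, 2, and 4.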

FIG. 1 shows a typical implementation of a bit-serial multiplier over GF(2⁸) whose primitive polynomial is x⁸+x⁴+x³+x²+1. In FIG. 1, g[7:0] is the multiplier, x'[7:0] is the transformed form of the multiplicand, and z'[7:0] is the transformed form of the result. Reference numerals 101 to 108 denote shift registers that are loaded with the multiplicand and then shift on each clock; 111 to 118 denote registers that are loaded with and hold the multiplier; and 121 to 128 denote Galois field multipliers that multiply the multiplicand bits output by the shift registers by the multiplier bits output by the registers. Reference numeral 131 denotes a Galois field adder that sums shift register outputs and feeds the sum back into the shift register, and 132 denotes a Galois field adder that sums the outputs of the Galois field multipliers and outputs the multiplication result. In Galois field arithmetic, a Galois field adder can be implemented with an exclusive-OR (XOR) gate and a Galois field multiplier with an AND gate.

In this configuration, the multiplier operates in two phases: a loading phase, in which the multiplier and multiplicand values are loaded into their registers, and a computation phase, in which z'[0] through z'[7] are output in order, one per clock. That is, once the multiplier and the transformed multiplicand are stored as initial values, the sum-of-products circuit of FIG. 1 (121 to 128, 132) yields the first bit of the result, z'[0]. Each time a clock (CLK) is applied to the shift registers (101 to 108), they move to a new state, from which the next output bit is obtained. Once all output bits have been obtained by continuing to apply the clock, a new multiplier and transformed multiplicand are loaded and the next multiplication proceeds in the same way. With this circuit, a multiplication takes eight clocks after the registers are initialized with the multiplier and the multiplicand, and requires eight registers and two sum-of-products circuits.
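The load-then-clock behaviour of the FIG. 1 multiplier can be modelled in a few lines. This is a minimal sketch under stated assumptions, not the patent's circuit: the feedback taps x'[0]^x'[2]^x'[3]^x'[4] are inferred from the primitive polynomial x⁸+x⁴+x³+x²+1, the shift direction and bit ordering (index 0 first) are chosen for illustration, and the conversion between ordinary and transformed forms is not modelled. The function name `serial_multiply` is hypothetical.

```python
def serial_multiply(g, x):
    """Bit-serial model: one transformed result bit per clock, eight clocks.

    g: multiplier bits g[0]..g[7]; x: transformed multiplicand bits x'[0]..x'[7].
    Returns the list [z'[0], ..., z'[7]].
    """
    state = list(x)              # shift registers after the loading phase
    result = []
    for _ in range(8):           # computation phase: one output bit per clock
        z = 0
        for g_bit, x_bit in zip(g, state):
            z ^= g_bit & x_bit   # AND = field multiply, XOR = field add
        result.append(z)
        # Assumed feedback, derived from x^8 + x^4 + x^3 + x^2 + 1.
        feedback = state[0] ^ state[2] ^ state[3] ^ state[4]
        state = state[1:] + [feedback]
    return result


if __name__ == "__main__":
    print(serial_multiply([1, 1, 0, 0, 0, 0, 0, 0], [1, 0, 0, 0, 0, 0, 0, 0]))
```

With g = [1,0,0,0,0,0,0,0] the model simply replays the loaded multiplicand bits, which is a convenient sanity check for the shift direction.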

FIG. 2 is a conceptual diagram in which the shift register of FIG. 1 is parallelized to a parallelization depth of 2 according to the present invention. In FIG. 2, g[7:0] is the multiplier, x'[7:0] is the transformed form of the multiplicand, and z'[t] and z'[t+1] are the transformed form of the result. Reference numerals 201 to 209 denote shift registers: 201 to 204 are connected in series, and 205 to 209 are connected in series. In x'[1]^x'[3]^x'[4]^x'[5], "^" denotes the exclusive-OR operation, that is, addition, so the expression indicates that the sum of x'[1], x'[3], x'[4], and x'[5] is fed back into shift register 201. Likewise, x'[2]^x'[4]^x'[5]^x'[6] indicates that the sum of x'[2], x'[4], x'[5], and x'[6] is fed back into shift register 205. Reference numerals 211 to 218 and 231 to 238 denote the multiplication of each multiplicand bit stored in a register by the corresponding multiplier bit; the outputs of 211 to 218 are all summed in Galois field adder 220 and output as z'[t], and the outputs of 231 to 238 are all summed in Galois field adder 240 and output as z'[t+1].

In the multiplier of the present invention parallelized to depth 2, once the multiplier and the transformed multiplicand are stored in g[0] to g[7] and x'[1] to x'[8], the sum-of-products circuits yield the first output bits of the result, z'[0] and z'[1]. When a clock is then applied, the two combinational circuits move the shift registers (201 to 209) to a new state, and the two sum-of-products circuits yield the next two bits of the result, z'[2] and z'[3]. Once all eight output bits have been obtained by continuing to apply the clock (z'[4] and z'[5], then z'[6] and z'[7]), a new multiplier and transformed multiplicand are loaded in preparation for the next multiplication. Because the multiplier built from the circuit of FIG. 2 outputs two result bits (z'[t], z'[t+1]) every clock, it needs half as many clocks as the circuit of FIG. 1; adding the one clock needed initially, a multiplication completes in five clocks in total.
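A depth-2 model makes the two-bits-per-clock behaviour concrete. The sketch below uses the two feedback expressions given for FIG. 2 (x'[1]^x'[3]^x'[4]^x'[5] and x'[2]^x'[4]^x'[5]^x'[6]) and shifts by two positions per clock. Several points are assumptions made for illustration rather than details taken from the source: the nine registers are modelled as one list `regs[0..8]`, the multiplicand is placed in `regs[0..7]` with the ninth value pre-computed from the recurrence x'[k+8] = x'[k]^x'[k+2]^x'[k+3]^x'[k+4], and the initial clock mentioned in the text is not modelled. The name `multiply_depth2` is hypothetical.

```python
def multiply_depth2(g, x):
    """Depth-2 serial-parallel model: two transformed result bits per clock.

    g: multiplier bits g[0]..g[7]; x: transformed multiplicand bits (8 bits).
    Returns [z'[0], ..., z'[7]] after four computation clocks.
    """
    regs = list(x)
    # Assumed initialisation of the ninth register from the recurrence.
    regs.append(regs[0] ^ regs[2] ^ regs[3] ^ regs[4])

    result = []
    for _ in range(4):                       # 8 bits / 2 bits per clock
        z_t = 0
        z_t1 = 0
        for i in range(8):
            z_t ^= g[i] & regs[i]            # first sum of products: x'[0..7]
            z_t1 ^= g[i] & regs[i + 1]       # second sum of products: x'[1..8]
        result += [z_t, z_t1]

        # Feedback expressions as given for FIG. 2.
        fb_odd = regs[1] ^ regs[3] ^ regs[4] ^ regs[5]
        fb_even = regs[2] ^ regs[4] ^ regs[5] ^ regs[6]
        # Every register takes the value two positions ahead of it.
        regs = regs[2:] + [fb_odd, fb_even]
    return result


if __name__ == "__main__":
    print(multiply_depth2([0, 1, 0, 0, 0, 0, 0, 0], [1, 0, 0, 0, 0, 0, 0, 0]))
```

Under these assumptions the model produces the same eight bits as a one-bit-per-clock shift register driven by the same recurrence, in half the computation clocks.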

FIG. 3 is an example in which the bit serial-parallel multiplier shown in FIG. 2 is implemented in hardware logic. Referring to FIG. 3, 301 to 309 are shift registers corresponding to 201 to 209 of FIG. 2, and 211 to 218 and 231 to 238 of FIG. 2 are implemented, as shown in FIG. 3, with the same structure of two Galois field multipliers and one register. Reference numeral 341 of FIG. 3 is the Galois field adder corresponding to 220 of FIG. 2, and 342 is the Galois field adder corresponding to 240. The expression x'[1]^x'[3]^x'[4]^x'[5] of FIG. 2 is implemented by Galois field adder 351, which sums x'[1], x'[3], x'[4], and x'[5], and x'[2]^x'[4]^x'[5]^x'[6] is implemented by Galois field adder 352, which sums x'[2], x'[4], x'[5], and x'[6].

In FIG. 3, during the loading phase the multiplicand is loaded into shift registers 302 to 309 and the multiplier is loaded into registers 311 to 318. In the computation phase, on every clock (CLK) the shift register values shift left while the two sum-of-products combinational circuits output the two results z'[t] and z'[t+1]. Because the depth-2 multiplier according to the present invention outputs two result bits per clock, the result is obtained in four clocks, so overall processing speed is improved with little additional hardware.

FIG. 4 is a diagram showing the concept of a bit serial-parallel multiplier (parallelization depth 4) according to the present invention. Referring to FIG. 4, once the multiplier and the transformed multiplicand are stored in g[0] to g[7] and x'[3] to x'[10], the sum-of-products circuits yield the first output bits of the result, z'[0], z'[1], z'[2], and z'[3]. When the next clock is applied in this state, the four combinational circuits move the shift registers to a new state, and the four sum-of-products circuits yield the next four bits of the result, z'[4], z'[5], z'[6], and z'[7].

When the next clock is applied in this state, a new multiplier and transformed multiplicand are loaded in preparation for the next multiplication.

Because the multiplier built from the circuit of FIG. 4 outputs four result bits (z'[t], z'[t+1], z'[t+2], z'[t+3]) every clock, it needs one quarter as many clocks as the circuit of FIG. 1; adding the one clock needed initially, a multiplication completes in three clocks in total.
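The depth-4 case can be sketched in the same style, with eleven registers and four sums of products per clock. The text does not give the feedback expressions for FIG. 4, so the four used here are an assumption derived from the recurrence x'[k+8] = x'[k]^x'[k+2]^x'[k+3]^x'[k+4] implied by the primitive polynomial. The pre-computation of the three extra register values at load time is also an assumption, the initial clock is not modelled, and the name `multiply_depth4` is hypothetical.

```python
def multiply_depth4(g, x):
    """Depth-4 serial-parallel model: four transformed result bits per clock.

    g: multiplier bits g[0]..g[7]; x: transformed multiplicand bits (8 bits).
    Returns [z'[0], ..., z'[7]] after two computation clocks.
    """
    regs = list(x)
    # Assumed initialisation: extend the 8 loaded bits to 11 register values.
    for k in range(3):
        regs.append(regs[k] ^ regs[k + 2] ^ regs[k + 3] ^ regs[k + 4])

    result = []
    for _ in range(2):                         # 8 bits / 4 bits per clock
        for offset in range(4):                # four sum-of-products circuits
            z = 0
            for i in range(8):
                z ^= g[i] & regs[i + offset]   # circuit `offset` reads x'[offset..offset+7]
            result.append(z)

        # Assumed feedback: the four values that follow the current window.
        feedback = [
            regs[k] ^ regs[k + 2] ^ regs[k + 3] ^ regs[k + 4]
            for k in range(3, 7)
        ]
        # Every register takes the value four positions ahead of it.
        regs = regs[4:] + feedback
    return result


if __name__ == "__main__":
    print(multiply_depth4([0, 1, 0, 0, 0, 0, 0, 0], [1, 0, 0, 0, 0, 0, 0, 0]))
```

Each feedback value depends only on registers inside the current eleven-register window, which is why a single clock can advance the state by four positions.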

As described above, by parallelizing the bit-serial multiplier over a Galois field, the present invention shortens the time required for a multiplication without adding much hardware. Because the multiplier is parallelized only as far as the system in which it is used requires, the parallelization depth can be tuned, so that computation time and hardware area can both be optimized.

Claims

A method of parallelizing a bit-serial multiplier over a Galois field that multiplies a multiplier g[N-1:0] in GF(2^N) by a transformed multiplicand x'[N-1:0] and outputs a transformed result z'[N-1:0], the method comprising:

parallelizing the shift register that stores the multiplicand into a predetermined number of groups according to the parallelization depth; and

parallelizing the combinational circuit that computes the sum of products of the multiplicand and the multiplier into a predetermined number of circuits according to the parallelization depth,

thereby reducing the number of clocks required for the operation of the multiplier and improving operation speed.

A bit serial-parallel multiplier over a Galois field that multiplies a multiplier g[7:0] in GF(2⁸) by a transformed multiplicand x'[7:0] and outputs a transformed result z'[7:0], comprising:

a register unit having, in parallel, a first shift register group in which four shift registers (x'[1], x'[3], x'[5], x'[7]) are connected in series and shift their stored values every clock, and a second shift register group in which five shift registers (x'[0], x'[2], x'[4], x'[6], x'[8]) are connected in series and shift their stored values every clock, the transformed multiplicand being initially loaded into registers x'[1] to x'[8];

a first combinational circuit that computes the sum of products of the outputs of registers x'[0] to x'[7] and the multiplier bits g[0] to g[7] and outputs a result z'[t] every clock; and

a second combinational circuit that computes the sum of products of the outputs of registers x'[1] to x'[8] and the multiplier bits g[0] to g[7] and outputs a result z'[t+1] every clock.

A bit serial-parallel multiplier over a Galois field, comprising:

a register unit having, in parallel, a first shift register group in which two shift registers (x'[3], x'[7]) are connected in series and shift their stored values every clock, a second shift register group in which three shift registers (x'[0], x'[4], x'[8]) are connected in series and shift their stored values every clock, a third shift register group in which three shift registers (x'[1], x'[5], x'[9]) are connected in series and shift their stored values every clock, and a fourth shift register group in which three shift registers (x'[2], x'[6], x'[10]) are connected in series and shift their stored values every clock, the transformed multiplicand being initially loaded into registers x'[3] to x'[10];

a first combinational circuit that computes the sum of products of the outputs of registers x'[0] to x'[7] and the multiplier bits g[0] to g[7] and outputs a result z'[t] every clock;

a second combinational circuit that computes the sum of products of the outputs of registers x'[1] to x'[8] and the multiplier bits g[0] to g[7] and outputs a result z'[t+1] every clock;

a third combinational circuit that computes the sum of products of the outputs of registers x'[2] to x'[9] and the multiplier bits g[0] to g[7] and outputs a result z'[t+2] every clock; and

a fourth combinational circuit that computes the sum of products of the outputs of registers x'[3] to x'[10] and the multiplier bits g[0] to g[7] and outputs a result z'[t+3] every clock.