KR102182299B1

KR102182299B1 - Device and Method for Performing Shift Operation

Info

Publication number: KR102182299B1
Application number: KR1020190089292A
Authority: KR
Inventors: 한정호
Original assignee: 에스케이텔레콤 주식회사
Priority date: 2019-07-24
Filing date: 2019-07-24
Publication date: 2020-11-24

Abstract

Disclosed are a shift operation device capable of simultaneously shifting a plurality of data having various lengths in parallel by using one shifter to reduce a hardware area and power consumption required for shifting the plurality of data, and an operation method thereof. According to one aspect of the present invention, the shift operation device capable of performing a shift operation on at least two pieces of input data each including a plurality of bits comprises: a mixer part that alternately arranges the bits of the at least two pieces of input data to output mixed data; an offset generation part that generates a shift offset value obtained by multiplying the shift amount of the at least two pieces of input data by the number of the at least two pieces of input data; a shift part that performs a shift operation on the mixed data by using the offset value to generate shifted data; and a splitter part that separates bits of the shifted data to generate at least two pieces of output data corresponding to the at least two pieces of input data.

Description

Device and Method for Performing Shift Operation

Technical Field

The present embodiment relates to a shift module that shifts a plurality of data, each including a plurality of bits, in parallel using one shifter, and to a method of operating the same.

The content described in this section merely provides background information on the present invention and does not constitute prior art.

In general, a processor can simultaneously process a plurality of data having various lengths in order to increase data throughput per unit time. For example, a 32-bit processor can process one 32-bit data item or four 8-bit data items simultaneously.

Such a processor may include an internal shifter for shifting data. The shifter inside the processor shifts data, but the maximum length of data it can shift is fixed. In addition, unlike the processor, a shifter generally cannot simultaneously process a plurality of data that are shorter than the maximum length it can shift.

To solve this problem, a conventional processor includes several shifters with different maximum shiftable data lengths, so that a plurality of data having various lengths can be processed in parallel.

FIG. 1 is a diagram illustrating the structure of a processor including a conventional shifter capable of shifting a plurality of data.

Referring to FIG. 1, a memory 100 and a processor 110 are illustrated, and the data to be shifted is stored in the memory 100. The processor 110 includes a controller 112, a register file 114, an arithmetic logic unit (ALU) 116, and a shifter 118.

The register file 114 of the processor 110 receives data from the memory 100 and stores it in a register. Data stored in the register file 114 may have various lengths. For example, one 32-bit data item, two 16-bit data items, or four 8-bit data items may be stored in the register file 114.

The controller 112 may control the operation of moving data stored in the register file 114 to the ALU 116 or the shifter 118 according to a specific program instruction. The ALU 116 or the shifter 118 may perform a specific operation on the data received from the register file 114 and may store the result data in the register file 114. When there are a plurality of data items, the processor 110 may perform the specific operation on each data item using a plurality of ALUs 116 or a plurality of shifters 118 and then store the results in the register file 114.

Because a conventional processor includes a plurality of shifters in this way, the hardware area it requires grows as more shifters are included. Conventional processors generally had few data items to compute in parallel, so the increase in hardware area and power consumption was not a prominent problem even when several shifters were included. However, in processors used for neural networks, artificial intelligence, deep learning, and the like, the number of data items to be processed simultaneously increases exponentially. If such a processor includes as many shifters as there are data items to be processed simultaneously, its hardware area and power consumption also increase exponentially.

In addition, when data is shifted using a plurality of shifters, the path that carries data between the shifters becomes longer, so the logical depth, which indicates the operation speed of the shifter, increases. As a result, the operation speed of the shifter decreases.

Accordingly, there is a need for a shifter that can simultaneously shift a plurality of data having various lengths in parallel using only one shifter.

A main object of embodiments of the present invention is to reduce the hardware area and power consumption required to shift a plurality of data by providing a shift operation device capable of simultaneously shifting a plurality of data having various lengths in parallel using one shifter, and an operating method thereof.

Another object of other embodiments of the present invention is to reduce the logical depth inside the shifter by shifting a plurality of data in parallel using one shifter.

According to one aspect of the present invention, there is provided a shift operation device capable of performing a shift operation on at least two input data each including a plurality of bits, the device comprising: a mixer unit that alternately arranges the bits of the at least two input data and outputs mixed data; an offset generator that generates a shift offset value obtained by multiplying the shift amount of the at least two input data by the number of the at least two input data; a shift unit that performs a shift operation on the mixed data using the offset value to generate shifted data; and a splitter unit that separates the bits of the shifted data to generate at least two output data corresponding to the at least two input data.

According to another aspect of the present embodiment, there is provided a method of operating a shift operation device capable of performing a shift operation on at least two input data each including a plurality of bits, the method comprising: generating mixed data by alternately arranging the bits of the at least two input data; generating a shift offset value obtained by multiplying the shift amount of the at least two input data by the number of the at least two input data; generating shifted data by performing a shift operation on the mixed data using the offset value; and generating at least two output data corresponding to the at least two input data by separating the bits of the shifted data.

As described above, according to the present embodiment, a plurality of data having various lengths are shifted simultaneously in parallel using one shifter, so that the hardware area and power consumption required to shift the plurality of data in parallel can be minimized.

In addition, according to the present embodiment, shifting a plurality of data in parallel using one shifter has the effect of reducing the logical depth inside the shifter.

FIG. 1 is a diagram illustrating the structure of a processor including a conventional shifter capable of shifting a plurality of data.
FIGS. 2A, 2B, 2C, 2D, and 2E are diagrams illustrating various shift operation methods of a conventional shifter.
FIGS. 3A and 3B are diagrams illustrating shift devices that shift a plurality of data using two or more shifters.
FIG. 4 is a diagram for explaining the structure and operating method of a funnel shifter capable of supporting various shift operations.
FIG. 5 is a diagram illustrating the structure of an internal shifter of a funnel shifter capable of supporting various shift operations.
FIGS. 6A, 6B, and 7 are diagrams showing the structures of a right shifter and a left shifter.
FIGS. 8A, 8B, and 8C are diagrams illustrating the configuration of a shift operation device according to an embodiment of the present invention and the data used for the shift operation.
FIGS. 9A, 9B, 9C, 9D, 9E, and 9F are diagrams respectively illustrating the ASR, LSR, ROR, LSL, and ROL operation processes of a shift operation device according to an embodiment of the present invention.
FIG. 10 is a flowchart illustrating a method of operating a shift operation device according to an embodiment of the present invention.

Hereinafter, some embodiments of the present invention will be described in detail with reference to exemplary drawings. In adding reference numerals to the elements of each drawing, it should be noted that the same elements are given the same numerals wherever possible, even when they appear in different drawings. In addition, in describing the present invention, a detailed description of a related known configuration or function is omitted when it is determined that it may obscure the subject matter of the present invention.

In addition, terms such as first, second, A, B, (a), and (b) may be used in describing the elements of the present invention. These terms serve only to distinguish one element from another, and the nature, sequence, or order of the elements is not limited by them. Throughout the specification, when a part is said to "include" or "comprise" an element, this means that it may further include other elements rather than excluding them, unless specifically stated otherwise. In addition, terms such as "unit" and "module" used in the specification refer to a unit that processes at least one function or operation, which may be implemented in hardware, in software, or in a combination of hardware and software.

FIGS. 2A, 2B, 2C, 2D, and 2E are diagrams illustrating various shift operation methods of a conventional shifter.

A shift operation is one of the bit-processing instructions for computer registers and moves the bits of a number or code expressed in binary to the left or right. Shift operations are generally used in processors, digital signal processors, and the like.

Shift operations are mainly classified into five operations according to the shift direction and the handling of the shifted bit values. The five operations are ASR (arithmetic shift right), LSR (logical shift right), ROR (rotate right), LSL (logical shift left), and ROL (rotate left).

Hereinafter, the process of shifting two data items that are 16 bits long and consist of arbitrary binary values is described, but the number of data items, the data length, and the bit values are only examples and are not limiting. The data length may be 8 bits, 16 bits, 32 bits, or the like, and the bit values may differ from the examples shown in the drawings. In addition, the data below is described as 2's complement data, but this is only an example and is not limiting; 1's complement may also be used.

In the following, A[n] denotes the n-th bit of data A, A[m:n] denotes the bits of data A from the m-th bit to the n-th bit, and x{y} denotes data consisting of x bits each having the value y.

FIG. 2A illustrates 16-bit data that has been LSR-shifted to the right by 2. LSR is a shift method that shifts the data to the right and fills the upper bit positions with as many 0-valued bits as the shift amount. org_value is the 16-bit data before the shift. shifted_value is the data obtained by shifting org_value to the right by 2 and placing 0-valued bits, as many as the shift amount of 2, in the upper bit positions of the shifted data (shifted_value[15:14] = 2{0}).

FIG. 2B illustrates 16-bit data that has been ASR-shifted to the right by 2. ASR is a shift method that shifts the data to the right and fills the upper bit positions with as many copies of the most significant bit of the pre-shift data as the shift amount. shifted_value is the data obtained by shifting org_value to the right by 2 and placing org_value[15], the most significant bit of the pre-shift data, in the upper bit positions of the shifted data as many times as the shift amount of 2 (shifted_value[15:14] = 2{org_value[15]}). When org_value is expressed in 2's complement, the most significant bit is the sign bit.

FIG. 2C illustrates 16-bit data that has been ROR-shifted to the right by 2. ROR is a shift method that moves the lower bits corresponding to the shift amount to the upper bit positions. shifted_value is the data obtained by shifting org_value to the right by 2 and moving org_value[1:0], the lower bits of the pre-shift data corresponding to the shift amount of 2, to the upper bit positions of the shifted data (shifted_value[15:14] = org_value[1:0]).

FIG. 2D illustrates 16-bit data that has been LSL-shifted to the left by 2. LSL is a shift method that shifts the data to the left and fills the lower bit positions with as many 0-valued bits as the shift amount. shifted_value is the data obtained by shifting org_value to the left by 2 and placing 0-valued bits, as many as the shift amount of 2, in the lower bit positions of the shifted data (shifted_value[1:0] = 2{0}).

FIG. 2E illustrates 16-bit data that has been ROL-shifted to the left by 2. ROL is a shift method that moves the upper bits corresponding to the shift amount to the lower bit positions. shifted_value is the data obtained by shifting org_value to the left by 2 and moving org_value[15:14], the upper bits of the pre-shift data corresponding to the shift amount of 2, to the lower bit positions of the shifted data (shifted_value[1:0] = org_value[15:14]).
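The five shift operations described with reference to FIGS. 2A to 2E can be sketched in software as follows. This is only an illustrative model, not the circuitry of the drawings: the function names, the default 16-bit width, and the sample value 0xB006 are assumptions introduced here, and shift amounts are assumed to lie between 0 and width - 1.

```python
def lsr(value, amount, width=16):
    """Logical shift right: upper bit positions are filled with 0."""
    mask = (1 << width) - 1
    return (value & mask) >> amount


def asr(value, amount, width=16):
    """Arithmetic shift right: upper bit positions are filled with the MSB."""
    mask = (1 << width) - 1
    value &= mask
    shifted = value >> amount
    if value >> (width - 1):
        # Set the top `amount` bits to 1 (copies of the sign bit).
        shifted |= mask & ~(mask >> amount)
    return shifted


def ror(value, amount, width=16):
    """Rotate right: the lower `amount` bits move to the upper positions."""
    mask = (1 << width) - 1
    value &= mask
    return ((value >> amount) | (value << (width - amount))) & mask


def lsl(value, amount, width=16):
    """Logical shift left: lower bit positions are filled with 0."""
    mask = (1 << width) - 1
    return (value << amount) & mask


def rol(value, amount, width=16):
    """Rotate left: the upper `amount` bits move to the lower positions."""
    mask = (1 << width) - 1
    value &= mask
    return ((value << amount) | (value >> (width - amount))) & mask


if __name__ == "__main__":
    org_value = 0xB006  # hypothetical sample: 1011 0000 0000 0110
    for op in (lsr, asr, ror, lsl, rol):
        print(op.__name__, format(op(org_value, 2), "016b"))
```

Each function masks its result to the given width, so the same sketch can be reused for 8-bit or 32-bit data by passing a different `width`.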

FIGS. 3A and 3B are diagrams illustrating shift devices that shift a plurality of data using two or more shifters.

Referring to FIG. 3A, a first shift device 300 includes a 16-bit shifter 302 capable of shifting 16-bit data, an 8-bit shifter 304 capable of shifting 8-bit data, and a concatenation unit 306 capable of concatenating two 8-bit data items and outputting 16-bit data.

When IN[15:0] is a single 16-bit data item, the first shift device 300 passes IN[15:0] to the 16-bit shifter 302, and the 16-bit shifter 302 may output OUT[15:0], the result of performing the shift operation on IN[15:0].

When IN[15:0] is two concatenated 8-bit data items (IN[7:0] = Data_A[7:0], IN[15:8] = Data_B[7:0]), the first shift device 300 separates IN[15:0] into two 8-bit data items. One 8-bit data item may be shifted by the 16-bit shifter 302, and the other may be shifted by the 8-bit shifter 304. In this case, the 8-bit data to be input to the 16-bit shifter 302 must first be converted into 16-bit data through sign extension or zero extension. The two 8-bit data items shifted by the 16-bit shifter 302 and the 8-bit shifter 304 are concatenated in the concatenation unit 306 and output as OUT[15:0].

Referring to FIG. 3B, a second shift device 310 may include a left shifter 312 and a right shifter 314. The left shifter 312 and the right shifter 314 may each include two internal 8-bit shifters in order to shift IN[15:0]. The left shifter 312 may perform an LSL or ROL operation, and the right shifter 314 may perform an LSR, ASR, or ROR operation.

When IN[15:0] is two concatenated 8-bit data items, the second shift device 310 separates IN[15:0] into two 8-bit data items and passes them to the left shifter 312 and the right shifter 314. That is, the left shifter 312 and the right shifter 314 each receive data with a total length of 16 bits. The two 8-bit shifters inside the left shifter 312 each receive one of the two 8-bit data items and perform a shift operation on it. The left shifter 312 may concatenate and output the two 8-bit data items produced by its two 8-bit shifters. The right shifter 314 operates on the same principle as the left shifter 312. Through this process, the second shift device 310 can perform shift operations on two 8-bit data items in parallel.

When IN[15:0] is a single 16-bit data item, the second shift device 310 may separate IN[15:0] into two 8-bit data items and then perform shift operations on each 8-bit data item in parallel. In this case, the logical depth of the second shift device 310 increases because of the path that carries data between the two 8-bit shifters. In addition, because only one of the left shifter 312 and the right shifter 314 operates during a shift operation, utilization is low relative to the number of shifters.

FIG. 4 is a diagram for explaining the structure and operating method of a funnel shifter capable of supporting various shift operations.

Referring to FIG. 4, a funnel shifter 400 and a mux control signal table are illustrated. The funnel shifter 400 may include an internal shifter 410. When the number of input data items is m and the length of each input data item is n bits, the internal shifter 410 can receive data of up to 2×m×n bits and perform a shift operation on it. The funnel shifter 400 may perform ASR, LSR, ROR, LSL, and ROL operations using the internal shifter 410.

In addition, the funnel shifter 400 may include multiplexers (muxes) that allow different values to be fed to the upper n bits and the lower n bits of the internal shifter 410. The funnel shifter 400 may also include a mux that uses the shift amount to deliver a shift offset value to the internal shifter 410. In addition to the muxes, the funnel shifter 400 may include a subtractor to generate an offset value appropriate for the shift operation.

Hereinafter, 16{0} denotes 16-bit data in which every bit is 0, and 16{IN[15]} denotes 16-bit data in which every bit has the value of IN[15]. The internal shifter 410 may receive any one of 16{0}, 16{IN[15]}, or IN[15:0] at the upper bit positions, and 16{0} or IN[15:0] at the lower bit positions. The internal shifter 410 may also receive an offset value in the form of binary data. The offset value is bounded by the maximum length of the input data. In FIG. 4, because IN[15:0] is at most 16 bits long, the offset value fits in 4 bits; its largest value is 1111 in binary, that is, 15 in decimal.

The ASR operation process is described below. LSR, ROR, LSL, and ROL operate on the same principle.

The internal shifter 410 receives 16{IN[15]} at the upper bit positions and IN[15:0] at the lower bit positions. The internal shifter 410 thus receives 32-bit data in total and can select and output 16-bit data from it. The selected 16-bit data is the 16-bit window that starts at the position offset to the left of the least significant bit (LSB) of the 32-bit data by the offset value (sh). As a result, the selected 16-bit data is IN[15:0] shifted to the right by the shift amount, with sign bits placed in the upper bit positions corresponding to the shift amount. That is, the selected 16-bit data sh_out[15:0] satisfies sh_out[15-sh:0] = IN[15:sh] and sh_out[15:15-sh+1] = sh{IN[15]}.
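The window selection described above can be modeled in a few lines. This is a behavioral sketch under stated assumptions rather than the circuit of FIG. 4: the names `funnel_select` and `funnel_asr` and the sample value are hypothetical, and only the right-shift cases built from the upper inputs 16{IN[15]}, 16{0}, and IN[15:0] are shown.

```python
def funnel_select(upper, lower, sh, width=16):
    """Concatenate upper:lower into 2*width bits and return the width-bit
    window that starts `sh` bits above the LSB."""
    mask = (1 << width) - 1
    combined = ((upper & mask) << width) | (lower & mask)
    return (combined >> sh) & mask


def funnel_asr(value, sh, width=16):
    """ASR: the upper half is width copies of the sign bit, i.e. 16{IN[15]}."""
    mask = (1 << width) - 1
    sign = (value >> (width - 1)) & 1
    upper = mask if sign else 0
    return funnel_select(upper, value, sh, width)


if __name__ == "__main__":
    data_in = 0xB006  # hypothetical sample input
    print(format(funnel_asr(data_in, 2), "016b"))              # upper = 16{IN[15]}
    print(format(funnel_select(0, data_in, 2), "016b"))        # upper = 16{0}: LSR
    print(format(funnel_select(data_in, data_in, 2), "016b"))  # upper = IN: ROR
```

Feeding a different upper half into the same selector is what lets one shifter cover several operations; only the data presented to the window changes.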

FIG. 5 is a diagram illustrating the structure of an internal shifter of a funnel shifter capable of supporting various shift operations.

Referring to FIG. 5, the structure of the internal shifter 410 and the control of its muxes are shown. The internal shifter 410 receives input[31:0], which is twice as long as IN[15:0]. sh_amt[3:0] is the shift amount expressed as a 4-bit binary number, and sh_amt[3] is the most significant bit of sh_amt[3:0]. The topmost mux receives input[31:8] and input[22:0] as inputs; it outputs input[31:8] when sh_amt[3] is 1 and input[22:0] when sh_amt[3] is 0. The four muxes are all controlled in this manner. In FIG. 5, because the shift amount sh_amt[3:0] is 0, the final output data mux_L0_out[15:0] has the same value as IN[15:0].
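The staged mux control can be approximated as a logarithmic shifter in which each bit of sh_amt decides whether a stage drops 8, 4, 2, or 1 low-order bits. The sketch below is an assumption-laden model: the function name is hypothetical, and the exact bit ranges carried between stages in FIG. 5 are not modeled, only the selection behavior.

```python
def internal_shifter(data32, sh_amt, out_width=16):
    """Model four mux stages controlled by sh_amt[3], sh_amt[2], sh_amt[1]
    and sh_amt[0]; each stage either passes its input through or selects
    the copy shifted right by 8, 4, 2 or 1 bits."""
    x = data32 & 0xFFFFFFFF
    for bit in (3, 2, 1, 0):
        if (sh_amt >> bit) & 1:
            x >>= 1 << bit  # select the shifted input of this stage
    return x & ((1 << out_width) - 1)


if __name__ == "__main__":
    # Upper half 16{1} and lower half 0xB006, as in an ASR of a negative value.
    data = 0xFFFFB006
    print(format(internal_shifter(data, 0), "016b"))  # same value as IN[15:0]
    print(format(internal_shifter(data, 2), "016b"))
```

With four stages, any shift amount from 0 to 15 is reached after passing through exactly four muxes, which is why the depth grows with the logarithm of the data length.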

FIGS. 6A, 6B, and 7 are diagrams showing the structures of the right shifter and the left shifter.

FIGS. 6A and 6B show the internal structure of the right shifter 314 illustrated in FIG. 3B. The number of muxes and the logical depth of the right shifter 314 are described later in Table 1.

FIG. 7 shows the internal structure of the left shifter 312 illustrated in FIG. 3B. The number of muxes and the logical depth of the left shifter 312 are described later in Table 1.

FIGS. 8A, 8B, and 8C are diagrams illustrating the configuration of a shift operation device according to an embodiment of the present invention and the data used for the shift operation.

Referring to FIG. 8A, a shift operation device according to an embodiment of the present invention may include a mixer unit 810, an offset generator 820, a shift unit 830, and a splitter unit 840. The shift operation device according to an embodiment of the present invention can shift at least two input data, each including a plurality of bits, in parallel. When shifting a single input data item, the shift operation device according to another embodiment of the present invention may shift the input data using only the shift unit 830. Additionally, the shift operation device according to another embodiment of the present invention may operate within the processor 110 illustrated in FIG. 1.

믹서부(810)는 시프트 연산 장치에 입력되는 입력 데이터를 혼합할 수 있다. 입력 데이터들의 개수가 2개 이상인 경우, 믹서부(810)는 2개 이상의 입력 데이터들의 각 비트들을 교번 배열함으로써 입력 데이터들을 혼합할 수 있다. 여기서, 교번 배열은 입력 데이터들의 비트 중 비트 인덱스가 동일한 비트를 인접하게 배열하되, 상기 비트 인덱스의 오름차순 및 교번 순서로 배열하는 것이다. 비트 인덱스의 오름차순은 LSB에서 MSB 방향의 순서를 의미한다.The mixer unit 810 may mix input data input to the shift calculating device. When the number of input data is two or more, the mixer unit 810 may mix the input data by alternately arranging the bits of the two or more input data. Here, in the alternating arrangement, bits having the same bit index among the bits of the input data are arranged adjacently, but arranged in an ascending order and an alternating order of the bit index. The ascending order of the bit index means the order from LSB to MSB.

도 8b를 참조하면, 본 발명의 일 실시예에 따른 믹서부(810)가 2개의 8-비트 데이터를 교번 배열하여 16-비트 길이를 가진 혼합된 데이터가 예시된다.Referring to FIG. 8B, mixed data having a 16-bit length by alternately arranging two 8-bit data by the mixer unit 810 according to an embodiment of the present invention is illustrated.

The offset generator 820 may generate a shift offset value for the mixed data by using the shift amount of the input data. The offset generator 820 may include a subtractor in addition to muxes in order to generate an offset value suitable for a left shift operation.

The offset value generated by the offset generator 820 may be expressed as either a decimal number or a binary number. When the number of bits of the mixed data is 2 to the power of n, the maximum bit length of the offset value expressed in binary is n bits.

The offset generator 820 according to an embodiment of the present invention may generate, as the offset value, the shift amount of the input data multiplied by the number of pieces of input data.

In addition, the offset generator 820 according to another embodiment of the present invention may generate, as the offset value, the number of bits of the mixed data minus the product of the shift amount of the input data and the number of pieces of input data.

In addition, according to an embodiment of the present invention, when the number of pieces of input data is 2 to the power of n (n being a non-negative integer) and the shift amount is expressed as binary data, the offset value may be the shift amount left-shifted by n. In this case, the offset generator 820 may generate the offset value using a hard-wired structure, without adding separate hardware for implementing a logic structure such as a mux.
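The offset rules above can be sketched in a few lines of Python. This is only an illustrative model, not the patented circuit: the function names `shift_offset` and `shift_offset_pow2` and their parameters are hypothetical, and the `left` flag stands in for the embodiment that subtracts the scaled shift amount from the mixed-data width.

```python
def shift_offset(shift: int, num_inputs: int, bits_per_input: int,
                 left: bool = False) -> int:
    """Offset applied to the mixed word for a given per-input shift amount.

    Hypothetical helper: right-direction offset is shift * num_inputs;
    the left-direction variant subtracts that product from the mixed width.
    """
    if not 0 <= shift < bits_per_input:
        raise ValueError("shift amount must be smaller than the input width")
    mixed_bits = num_inputs * bits_per_input
    scaled = shift * num_inputs
    return mixed_bits - scaled if left else scaled


def shift_offset_pow2(shift: int, log2_num_inputs: int) -> int:
    """When the number of inputs is 2**n, multiplying is a left shift by n.

    In hardware this is pure wiring (append n zero bits), which is why no
    extra mux logic is needed for this case.
    """
    return shift << log2_num_inputs


# Two 8-bit inputs, each shifted by 2 (the example used later in the text).
print(shift_offset(2, 2, 8))             # 4
print(shift_offset(2, 2, 8, left=True))  # 12
print(shift_offset_pow2(2, 1))           # 4
```

The two right-direction forms agree whenever the number of inputs is a power of two, so the multiplier can be replaced by wiring in that case.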

The shift unit 830 may receive the data mixed by the mixer unit 810, shift the mixed data by the offset value, and output the shifted data. The shift unit 830 may perform at least one of the ASR, LSR, ROR, LSL, and ROL operations on the mixed data.

The maximum length of data that the shift unit 830 can shift is the sum of the numbers of bits of all the input data. That is, the maximum length of data that the shift unit 830 can shift is the number of bits of each piece of input data multiplied by the number of pieces of input data.

The shift unit 830 according to an embodiment of the present invention includes a shifter that can shift in only one direction. The shift unit 830 according to an embodiment of the present invention may include a funnel shifter, whose operation is described later with reference to FIGS. 9A to 9F.

The splitter unit 840 is a component that receives the data shifted by the shift unit 830 and, reversing the operation of the mixer unit 810, separates the shifted data into as many pieces as there are pieces of input data and outputs them. That is, the splitter unit 840 may separate the bits of the shifted data to generate output data corresponding to the number of pieces of input data.

For example, when the input data is two pieces of 8-bit data, the mixed data is 16-bit data in which the input data is alternately arranged, and the splitter unit 840 may separate the mixed data into the bits with even bit indices and the bits with odd bit indices to output two pieces of 8-bit data.

FIG. 8B illustrates the splitter unit 840 according to an embodiment of the present invention separating mixed data with a length of 16 bits, together with the two resulting pieces of 8-bit data.

Hereinafter, the data processing performed by the mixer unit 810 and the splitter unit 840 according to an embodiment of the present invention is described using relational expressions.

When the number of pieces of input data is N, each piece of input data has L bits, and the n-th piece of input data is written D_n[L-1:0] (n = 0, 1, ..., N-1), the data IN received by the mixer unit 810 satisfies Equation 1.

When the output of the mixer unit 810 is Mout, the input of the splitter unit 840 is shifted_Mout, the output of the splitter unit 840 is Sout, and 'm mod N' denotes the remainder of m divided by N, Mout satisfies Equation 2 and Sout satisfies Equation 3.
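As an illustration of the relations described above, the following Python sketch models the mixer and the splitter as bit interleaving and de-interleaving. The index mapping used here, bit m of the mixed word taken from bit (m div N) of input (m mod N), is an assumption inferred from the description of the alternate arrangement; it is not a transcription of Equations 2 and 3. The function names `mix` and `split` are hypothetical.

```python
def mix(inputs: list[int], bits: int) -> int:
    """Interleave the bits of N inputs, LSB first.

    Assumed mapping: mixed[m] = inputs[m % N][m // N].
    """
    n = len(inputs)
    mixed = 0
    for m in range(n * bits):
        bit = (inputs[m % n] >> (m // n)) & 1
        mixed |= bit << m
    return mixed


def split(mixed: int, n: int, bits: int) -> list[int]:
    """Inverse of mix: output[m % N][m // N] = mixed[m]."""
    outputs = [0] * n
    for m in range(n * bits):
        bit = (mixed >> m) & 1
        outputs[m % n] |= bit << (m // n)
    return outputs


a = 0b00011001   # 25
b = 0b11001000   # -56 in 8-bit two's complement
m = mix([a, b], 8)
print(hex(m))            # 0xa1c1
print(split(m, 2, 8))    # [25, 200]
```

With two inputs, the even-indexed bits of the mixed word belong to the first input and the odd-indexed bits to the second, which matches the even/odd separation described for the splitter unit 840.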

FIGS. 9A, 9B, 9C, 9D, 9E, and 9F are diagrams illustrating the ASR, LSR, ROR, LSL, and ROL operations of the shift operation device according to an embodiment of the present invention.

FIG. 9A illustrates two pieces of 8-bit data to be shifted by the shift operation device according to an embodiment of the present invention.

Assuming that the two pieces of data A and B are expressed in two's complement, A represents 25 and B represents -56. The 16-bit data is formed by concatenating the two, with A placed in the lower bit positions and B in the upper bit positions. In the following, the input data is described as the A and B data, but this is only an example; the number of pieces of data, the number of bits, and the bit values may differ. In addition, the A and B data are each described as being shifted by 2 bits.

The operation of the shift operation device according to the embodiment of the present invention described below may be performed by storing code that includes the operation in a memory and having a processor located outside the shift operation device execute the code stored in the memory.

FIG. 9B illustrates the shift operation device according to an embodiment of the present invention performing an ASR (arithmetic shift right) operation.

The mixer unit 810 receives IN[15:0], the 16-bit data formed by concatenating the A and B data. The mixer unit 810 outputs the mixed data M[15:0] using Equation 2. M[15:0] is 16-bit data in which the bits of the A and B data alternate, arranged from the LSB in ascending order of bit index.

The shift unit 830 according to an embodiment of the present invention may shift the mixed data M[15:0] to the right by 4 (shifter_output[11:0] = M[15:4]) and then output data in which the sign bits (most significant bits) of the A and B data are arranged in the upper bit positions of the shifted data (shifter_output[15:12] = 4{M[15], M[14]}, that is, M[15] and M[14] repeated to fill four bits).

When the shift unit 830 according to an embodiment of the present invention is a funnel shifter, the shift unit 830 receives data twice as long as the mixed data. Referring to FIG. 8C, the shift unit 830 receives the mixed data M[15:0] at the lower bit positions and, at the upper bit positions, 16-bit data in which M[15] and M[14] are repeated. Here, M[15] and M[14] are the sign bit of the B data and the sign bit of the A data, respectively. From the 32-bit data, the shift unit 830 selects and outputs 16 bits starting at the bit located the offset value away from the least significant bit. Here, the offset value is 4, which is the shift amount 2 multiplied by the number of pieces of input data, 2. As a result, the shift unit 830 may output data in which the mixed data M[15:0] is shifted right by 4 and the sign bits of A and B are repeated in the upper bit positions over as many bits as the offset value.

The splitter unit 840 receives the data output from the shift unit 830 and separates it to output as many pieces of data as there are pieces of input data. That is, the splitter unit 840 may output two pieces of 8-bit data using Equation 3.

Comparing the input data IN[15:0] with the output data OUT[15:0], the lower 8 bits of the output data, OUT[7:0], are the A data after an ASR by 2, and the upper 8 bits, OUT[15:8], are the B data after an ASR by 2. In this way, the shift operation device according to an embodiment of the present invention can shift at least two pieces of data in parallel.
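The ASR example can be checked numerically with a small Python model of the "shift, then fill with sign bits" description. This is a sketch under assumptions: two 8-bit inputs, the interleaving M[2i] = A[i] and M[2i+1] = B[i] inferred from the text, and hypothetical helper names (`mix2`, `split2`, `asr_pair`, `to_signed8`).

```python
def mix2(a: int, b: int) -> int:
    """Interleave two 8-bit values: M[2i] = A[i], M[2i+1] = B[i] (assumed)."""
    return sum((((a >> i) & 1) << (2 * i)) | (((b >> i) & 1) << (2 * i + 1))
               for i in range(8))


def split2(m: int) -> tuple[int, int]:
    """De-interleave: even-indexed bits form A, odd-indexed bits form B."""
    a = sum(((m >> (2 * i)) & 1) << i for i in range(8))
    b = sum(((m >> (2 * i + 1)) & 1) << i for i in range(8))
    return a, b


def asr_pair(a: int, b: int, shift: int) -> tuple[int, int]:
    """Arithmetic-shift both 8-bit values right by `shift` in one wide shift."""
    m = mix2(a, b)
    offset = shift * 2                      # shift amount x number of inputs
    sign_pair = (m >> 14) & 0b11            # {M[15], M[14]} = {sign of B, sign of A}
    fill = 0
    for k in range(0, offset, 2):           # repeat the sign pair over `offset` bits
        fill |= sign_pair << k
    shifted = (m >> offset) | (fill << (16 - offset))
    return split2(shifted & 0xFFFF)


def to_signed8(x: int) -> int:
    """Interpret an 8-bit pattern as a two's-complement value."""
    return x - 256 if x & 0x80 else x


a_out, b_out = asr_pair(25, 0b11001000, 2)   # A = 25, B = -56
print(a_out, to_signed8(b_out))              # 6 -14
```

The results agree with shifting each value separately: 25 >> 2 is 6 and -56 >> 2 is -14.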

FIGS. 9A and 9C illustrate the shift operation device according to an embodiment of the present invention performing an LSR (logical shift right) operation.

In the following, the operations of the mixer unit 810 and the splitter unit 840 are the same as described with reference to FIG. 9B, so their description is omitted.

The shift unit 830 according to an embodiment of the present invention may shift the mixed data M[15:0] to the right by 4 (shifter_output[11:0] = M[15:4]) and then output data in which bits with the value 0 are arranged in the upper bit positions of the shifted data (shifter_output[15:12] = 4{0}).

When the shift unit 830 according to an embodiment of the present invention is a funnel shifter, the shift unit 830 receives data twice as long as the mixed data. Referring to FIG. 8C, the shift unit 830 receives the mixed data M[15:0] at the lower bit positions and 16-bit data whose bits are all 0 at the upper bit positions. From the 32-bit data, the shift unit 830 selects and outputs 16 bits starting at the bit located the offset value away from the least significant bit. Here, the offset value is 4, which is the shift amount 2 multiplied by the number of pieces of input data, 2. As a result, the shift unit 830 may output data in which the mixed data M[15:0] is shifted right by 4 and four bits with the value 0 are arranged in the upper bit positions.

Comparing the input data IN[15:0] with the output data OUT[15:0], the lower 8 bits of the output data, OUT[7:0], are the A data after an LSR by 2, and the upper 8 bits, OUT[15:8], are the B data after an LSR by 2.

FIGS. 9A and 9D illustrate the shift operation device according to an embodiment of the present invention performing an ROR (rotate shift right) operation.

The shift unit 830 according to an embodiment of the present invention may shift the mixed data M[15:0] to the right by 4 (shifter_output[11:0] = M[15:4]) and then output data in which the lower bits of the mixed data are arranged in the upper bit positions of the shifted data (shifter_output[15:12] = M[3:0]).

When the shift unit 830 according to an embodiment of the present invention is a funnel shifter, the shift unit 830 receives data twice as long as the mixed data. Referring to FIG. 8C, the shift unit 830 receives the mixed data M[15:0] at both the lower bit positions and the upper bit positions. From the 32-bit data, the shift unit 830 selects and outputs 16 bits starting at the bit located the offset value away from the least significant bit. Here, the offset value is 4, which is the shift amount 2 multiplied by the number of pieces of input data, 2. As a result, the shift unit 830 may output data in which the mixed data M[15:0] is shifted right by 4 and the lower 4 bits of the mixed data are moved to the upper bit positions.

Comparing the input data IN[15:0] with the output data OUT[15:0], the lower 8 bits of the output data, OUT[7:0], are the A data after an ROR by 2, and the upper 8 bits, OUT[15:8], are the B data after an ROR by 2.

FIGS. 9A and 9E illustrate the shift operation device according to an embodiment of the present invention performing an LSL (logical shift left) operation.

The shift unit 830 according to an embodiment of the present invention may shift the mixed data M[15:0] to the left by 4 (shifter_output[15:4] = M[11:0]) and then output data in which bits with the value 0 are arranged in the lower bit positions of the shifted data (shifter_output[3:0] = 4{0}).

When the shift unit 830 according to an embodiment of the present invention is a funnel shifter, the shift unit 830 receives data twice as long as the mixed data. Referring to FIG. 8C, the shift unit 830 receives the mixed data M[15:0] at the upper bit positions and 16-bit data whose bits are all 0 at the lower bit positions. From the 32-bit data, the shift unit 830 selects and outputs 16 bits starting at the bit located the offset value away from the least significant bit. Here, the offset value is 12: the number of bits of the mixed data, 16, minus 4, which is the shift amount 2 multiplied by the number of pieces of input data, 2. As a result, the shift unit 830 may output data in which the mixed data M[15:0] is shifted left by 4 and four bits with the value 0 are arranged in the lower bit positions.

Comparing the input data IN[15:0] with the output data OUT[15:0], the lower 8 bits of the output data, OUT[7:0], are the A data after an LSL by 2, and the upper 8 bits, OUT[15:8], are the B data after an LSL by 2.

FIGS. 9A and 9F illustrate the shift operation device according to an embodiment of the present invention performing an ROL (rotate shift left) operation.

The shift unit 830 according to an embodiment of the present invention may shift the mixed data M[15:0] to the left by 4 (shifter_output[15:4] = M[11:0]) and then output data in which the upper bits of the mixed data are arranged in the lower bit positions of the shifted data (shifter_output[3:0] = M[15:12]).

When the shift unit 830 according to an embodiment of the present invention is a funnel shifter, the shift unit 830 receives data twice as long as the mixed data. Referring to FIG. 8C, the shift unit 830 receives the mixed data M[15:0] at both the lower bit positions and the upper bit positions. From the 32-bit data, the shift unit 830 selects and outputs 16 bits starting at the bit located the offset value away from the least significant bit. Here, the offset value is 12: the number of bits of the mixed data, 16, minus 4, which is the shift amount 2 multiplied by the number of pieces of input data, 2. As a result, the shift unit 830 may output data in which the mixed data M[15:0] is shifted left by 4 and the upper 4 bits of the mixed data are moved to the lower bit positions.

Comparing the input data IN[15:0] with the output data OUT[15:0], the lower 8 bits of the output data, OUT[7:0], are the A data after an ROL by 2, and the upper 8 bits, OUT[15:8], are the B data after an ROL by 2.
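The five operations described with reference to FIGS. 9B to 9F differ only in what is fed to the upper and lower halves of the funnel shifter and in how the offset is formed. The following Python sketch collects those choices in one place. It is a behavioral model under assumptions, not the circuit of FIG. 8C: the interleaving mapping and all names (`mix`, `split`, `funnel_inputs`, `parallel_shift`) are hypothetical.

```python
BITS = 8            # bits per input
N = 2               # number of inputs
WIDTH = BITS * N    # bits in the mixed word
MASK = (1 << WIDTH) - 1


def mix(inputs):
    """Assumed interleaving: mixed[m] = inputs[m % N][m // N]."""
    return sum(((inputs[m % N] >> (m // N)) & 1) << m for m in range(WIDTH))


def split(word):
    """Inverse of mix."""
    outs = [0] * N
    for m in range(WIDTH):
        outs[m % N] |= ((word >> m) & 1) << (m // N)
    return outs


def funnel_inputs(op, m, shift):
    """Return (upper half, lower half, offset) for the one-direction funnel shifter."""
    scaled = shift * N
    if op == "ASR":
        signs = m >> (WIDTH - N)                      # sign bit of every input
        upper = sum(signs << k for k in range(0, WIDTH, N))
        return upper, m, scaled
    if op == "LSR":
        return 0, m, scaled
    if op == "ROR":
        return m, m, scaled
    if op == "LSL":
        return m, 0, WIDTH - scaled
    if op == "ROL":
        return m, m, WIDTH - scaled
    raise ValueError(f"unknown operation: {op}")


def parallel_shift(op, inputs, shift):
    """Shift every input by `shift` using a single wide right-selecting shift."""
    upper, lower, offset = funnel_inputs(op, mix(inputs), shift)
    wide = (upper << WIDTH) | lower                   # 2 x WIDTH bits
    return split((wide >> offset) & MASK)


print(parallel_shift("ASR", [25, 200], 2))   # [6, 242]
print(parallel_shift("ROR", [25, 200], 2))   # [70, 50]
```

Because the left-direction operations are expressed as a right selection at offset WIDTH minus the scaled shift amount, the model needs only one shift direction, in line with the one-direction shifter described for the shift unit 830.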

Table 1 compares the number of muxes and the logical depth of the funnel shifter, the known shifters, and the shift operation device according to an embodiment of the present invention.

[Table 1]

| Data | Shifter | Area (# of 2-to-1 mux): pre-mux | shifter-mux | post-mux | offset | total | Logical depth |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 data | funnel shifter, 16-bit | 48 (16×2+16) | 75 (23+19+17+16) | 0 | 4 | 127 | 6 |
| 1 data | funnel shifter, 8-bit | 24 (8×2+8) | 27 (8+9+10) | 0 | 3 | 54 | 5 |
| 2 data | known shifter, FIG. 3A | 16 | 181 (127+54) | 16 | 0 | 213 (100%) | 8 (1+6+1) |
| 2 data | known shifter, FIG. 3B | 0 | 203 (96+111) | 16 | 0 | 219 (102.82%) | 11 (10+1) |
| 2 data | present invention | 96 (16×4+16×2) | 75 (23+19+17+16) | 16 | 8 (4+4) | 195 (91.55%) | 8 (3+4+1) |

Referring to Table 1 and FIG. 4, the number of muxes a funnel shifter needs to shift one piece of data can be calculated. Here, one mux receives two bits and outputs one bit. Regarding the pre-mux entry for the funnel shifter, the funnel shifter 400 may supply 16-bit data to each of the upper bit positions and the lower bit positions of the internal shifter 410. Two 2-to-1 muxes per bit are used to supply one of 16{0}, 16{IN[15]}, or IN[15:0] to the upper bit positions, so supplying 16-bit data to the upper bit positions takes 16×2=32 muxes. Sixteen muxes are used to supply 16-bit data to the lower bit positions by selecting 16{0} or IN[15:0]. Therefore, a total of 48 pre-muxes are used.

In addition, referring to the shifter-mux entry and FIG. 5, and as described with reference to FIG. 5, the internal shifter 410 needs 75 muxes to select, from the 32-bit data, the 16-bit data shifted by the offset value. This is the sum of the 23 muxes needed to output 23 bits in the first stage, the 19 muxes needed to output 19 bits in the second stage, the 17 muxes needed to output 17 bits in the third stage, and the 16 muxes needed to output 16 bits in the fourth stage.
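The 23+19+17+16 breakdown is what one would expect from a logarithmic shifter whose four stages shift by 8, 4, 2, and 1: each stage must still carry enough extra bits to cover the largest shift the later stages can apply. The following Python sketch reproduces the 16-bit figures under that assumption. The stage ordering and the function name `funnel_stage_muxes` are hypothetical, since FIG. 5 is not reproduced here.

```python
def funnel_stage_muxes(out_bits: int, offset_bits: int) -> list[int]:
    """2-to-1 mux count per stage of an assumed logarithmic right-selecting shifter.

    Stage k conditionally shifts by 2**(offset_bits - 1 - k). Its output must be
    wide enough for the final result plus the largest shift still to come.
    """
    counts = []
    for k in range(offset_bits):
        remaining_max_shift = 2 ** (offset_bits - 1 - k) - 1
        counts.append(out_bits + remaining_max_shift)
    return counts


stages = funnel_stage_muxes(out_bits=16, offset_bits=4)
print(stages, sum(stages))   # [23, 19, 17, 16] 75
```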

Referring to the offset entry and FIG. 4, four muxes are used to supply the offset value, expressed as 4-bit data, to the internal shifter.

Referring to the logical depth entry, the logical depth of the 16-bit funnel shifter is 6: a logical depth of 1 for the input stage that supplies two pieces of 16-bit data to the internal shifter, a logical depth of 1 for the stage that supplies the offset value to the internal shifter, and a logical depth of 4 for shifting the 16-bit data inside the internal shifter.

Referring to Table 1 and FIG. 3A, the first shift device 300 uses 16 pre-muxes to select, for the 16-bit shifter 302, between IN[15:0] and 16-bit data on which sign/zero extension has been performed. In addition, the 16-bit shifter 302 and the 8-bit shifter 304 use 127 and 54 internal muxes, respectively, so a total of 181 shifter-muxes are used. In addition, 16 post-muxes are used to select and output either the 16-bit output data of the 16-bit shifter 302 or the 16-bit output data of the connection unit 306. The first shift device 300 uses a total of 213 muxes, and this count is taken as 100% when comparing the mux counts of the other shifters. The logical depth of the first shift device 300 is 8.

Referring to Table 1 and FIGS. 3B, 6A, and 6B, the second shift device 310 uses a total of 219 muxes, requiring 102.82% of the mux count and area of the first shift device 300. The logical depth of the second shift device 310 is 11.

Referring to Table 1 and FIG. 8A, in the shift operation device according to an embodiment of the present invention, a total of 16×3=48 2-to-1 muxes are used to supply one of 16{0}, 16{IN[15]}, M[15:0], or 16{M[15], M[14]} to the upper bit positions of the shift unit 830, that is, to narrow the four candidates to two and then select one of those two. Likewise, 16×2=32 muxes are used to supply data to the lower bit positions. The shift operation device therefore requires a total of 96 pre-muxes.

Referring to the shifter-mux entry, the shift operation device according to an embodiment of the present invention uses a total of 75 shifter-muxes, and uses 16 post-muxes to select between the output data of the shift unit 830 and the output data of the splitter unit 840.

The shift operation device according to an embodiment of the present invention uses a total of 195 2-to-1 muxes to shift two pieces of 8-bit data in parallel. That is, the shift operation device can be implemented with only 91.55% of the muxes used by the conventional first shift device 300. This is the result of the comparison for two pieces of input data; the more pieces of input data there are, the smaller the proportion of muxes with which the shift operation device according to an embodiment of the present invention can shift data in parallel.
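The totals and percentages quoted for the two-data rows of Table 1 follow from simple arithmetic, which the short Python check below reproduces. The dictionary keys and variable names are hypothetical labels chosen for this sketch; the per-column counts are copied from Table 1.

```python
# (pre-mux, shifter-mux, post-mux, offset) per design, taken from Table 1.
mux_counts = {
    "known shifter, FIG. 3A": (16, 181, 16, 0),
    "known shifter, FIG. 3B": (0, 203, 16, 0),
    "present invention": (96, 75, 16, 8),
}

totals = {name: sum(cols) for name, cols in mux_counts.items()}
baseline = totals["known shifter, FIG. 3A"]   # taken as 100%

for name, total in totals.items():
    print(f"{name}: {total} muxes, {100 * total / baseline:.2f}%")
# known shifter, FIG. 3A: 213 muxes, 100.00%
# known shifter, FIG. 3B: 219 muxes, 102.82%
# present invention: 195 muxes, 91.55%
```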

In addition, the logical depth of the shift operation device according to an embodiment of the present invention is 8 in total. This is the sum of a logical depth of 1 for the input stage of the shift unit 830, 2 for the offset stage, 4 for the internal stage of the shift unit 830, and 1 for the post-mux stage. This equals the logical depth of the conventional first shift device 300, which means that the shift operation device according to an embodiment of the present invention has the same operation speed as the conventional shifter while shifting a plurality of pieces of data with a smaller hardware area and lower power consumption. This is the result of the comparison for shifting two pieces of data in parallel; as the number of pieces of input data increases, the shift operation device according to an embodiment of the present invention may operate faster than the conventional shifter.

도 10은 본 발명의 실시예에 따른 시프트 연산 장치의 동작 방법을 예시한 순서도이다.10 is a flow chart illustrating a method of operating a shift calculating device according to an embodiment of the present invention.

우선, 본 발명의 실시예에 따른 시프트 연산 장치는 입력 데이터의 개수 및 비트 수에 관한 정보를 수집한다(S1000). First, the shift calculating apparatus according to an embodiment of the present invention collects information on the number of input data and the number of bits (S1000).

시프트 연산 장치는 수집한 입력 데이터의 개수 및 비트 수를 고려하여, 시프트부(830)가 시프트할 수 있는 데이터의 최대 길이를 설정한 뒤, 시프트부(830)를 논리 구조를 통해 구현할 수 있다(S1002). 본 발명의 일 실시예에 따른 시프트부(830)는 입력 데이터 개수가 m이고 입력 데이터의 길이가 n-비트일 때, 최대 2×m×n 비트 데이터를 입력 받아 시프트 연산을 수행할 수 있다.The shift calculator may set the maximum length of data that can be shifted by the shift unit 830 in consideration of the number of collected input data and the number of bits, and then implement the shift unit 830 through a logical structure ( S1002). When the number of input data is m and the length of the input data is n-bits, the shift unit 830 according to an embodiment of the present invention may receive data of up to 2×m×n bits and perform a shift operation.

시프트 연산 장치는 수집한 입력 데이터의 개수 및 비트 수를 고려하여, 믹서부(810) 및 스플리터부(840)를 구현한다(S1004). 믹서부(810)는 입력 데이터의 개수에 각 비트 수를 곱한 값을 혼합할 수 있는 최대 비트 길이로 설정할 수 있다. 또한, 스플리터부(840)는 입력 데이터의 개수에 각 비트 수를 곱한 값을 분리할 수 있는 최대 비트 길이로 설정할 수 있다. The shift calculating apparatus implements the mixer unit 810 and the splitter unit 840 in consideration of the number of collected input data and the number of bits (S1004). The mixer 810 may set a value obtained by multiplying the number of input data by the number of bits to be a maximum bit length that can be mixed. In addition, the splitter unit 840 may set a value obtained by multiplying the number of input data by the number of each bit as a maximum bit length that can be separated.

The shift operation device implements the offset generation unit 820, which generates an offset value from the collected number of input data and the shift amount of the input data (S1006). The offset value generated by the offset generation unit 820 may differ depending on whether the operation is ASR, LSR, ROR, LSL, or ROL.
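The basic offset rule stated in the abstract and claims (shift amount multiplied by the number of input data, which reduces to a left shift when that number is a power of two) can be sketched as follows. The function name `shift_offset` is hypothetical, and the sketch does not model any per-operation adjustment of the offset for ASR, LSR, ROR, LSL, or ROL.

```python
def shift_offset(shift_amount: int, num_inputs: int) -> int:
    """Offset applied to the mixed word: shift amount times number of inputs."""
    if num_inputs < 2 or shift_amount < 0:
        raise ValueError("need at least two inputs and a non-negative shift amount")
    if num_inputs & (num_inputs - 1) == 0:
        # num_inputs == 2**k: the multiplication is a left shift by k bits
        k = num_inputs.bit_length() - 1
        return shift_amount << k
    # General case (not a power of two): plain multiplication
    return shift_amount * num_inputs
```

With two inputs, a per-input shift of 3 becomes an offset of 6 on the mixed word, because every original bit position now occupies two adjacent positions.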

To perform an ASR, LSR, ROR, LSL, or ROL operation, the shift operation device may generate a data table and an input mux structure for the input data and offset values of the shift unit 830 (S1008). That is, the shift unit 830 may refer to the data table of FIG. 8C and perform the shift operation on data and an offset value that differ for each operation.
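One possible way to realise the per-operation selection of data and offset is a funnel-style shifter with a double-width input, which is consistent with the 2×m×n-bit input stated for the shift unit 830. The sketch below is an assumption: the data table of FIG. 8C is not reproduced here and may differ, and the names `funnel` and `shift_mixed` are hypothetical. It operates on an already mixed word, with `offset` being the shift amount multiplied by the number of inputs m.

```python
def funnel(hi: int, lo: int, offset: int, width: int) -> int:
    """Return `width` bits of the 2*width-bit word {hi, lo}, starting at bit `offset`."""
    mask = (1 << width) - 1
    return ((((hi & mask) << width) | (lo & mask)) >> offset) & mask

def shift_mixed(op: str, mixed: int, offset: int, m: int, n: int) -> int:
    """Apply one of five operations to a mixed word of m inputs, n bits each."""
    width = m * n
    if not 0 <= offset <= width:
        raise ValueError("offset out of range")
    mixed &= (1 << width) - 1
    if op == "LSR":    # zeros enter from the upper side
        hi, lo, off = 0, mixed, offset
    elif op == "ASR":  # interleaved MSBs (top m bits) are replicated above the data
        msbs = mixed >> (width - m)
        hi = 0
        for i in range(n):
            hi |= msbs << (i * m)
        lo, off = mixed, offset
    elif op == "ROR":  # lower bits wrap around to the upper side
        hi, lo, off = mixed, mixed, offset
    elif op == "LSL":  # zeros enter from the lower side
        hi, lo, off = mixed, 0, width - offset
    elif op == "ROL":  # upper bits wrap around to the lower side
        hi, lo, off = mixed, mixed, width - offset
    else:
        raise ValueError(f"unknown operation: {op}")
    return funnel(hi, lo, off, width)
```

In this sketch the input mux only chooses what is placed in the upper and lower halves of the double-width word and whether the offset or its complement with respect to the word width is used, so a single right-shifting structure serves all five operations.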

The shift operation device may generate a signal that controls the input mux structure so that the shift unit 830 receives data in accordance with the data table, and may control the mux with that signal (S1010).

The shift unit 830 may receive, through the mux, the data mixed by the mixer unit 810, perform the shift operation, and then pass the shifted data to the splitter unit 840 (S1012).

The splitter unit 840 separates the shifted data received from the shift unit 830 into as many pieces as there are input data, operating in the reverse manner of the mixer unit 810, and outputs them (S1014). That is, the splitter unit 840 may separate the bits of the shifted data alternately, one output per input data, and output the results.
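The mix, shift, and split steps (S1012 to S1014) can be sketched end to end for the logical right shift case. This is a minimal software model under stated assumptions, not the hardware of the embodiment: the names `mix`, `split`, and `parallel_lsr` are hypothetical, and placing bit i of input j at position i×m + j is one assumed ordering that satisfies "bits with the same index are adjacent, in ascending order of bit index".

```python
def mix(values: list[int], n: int) -> int:
    """Interleave bits: bit i of input j goes to position i*m + j (assumed order)."""
    m = len(values)
    mixed = 0
    for i in range(n):
        for j, v in enumerate(values):
            mixed |= ((v >> i) & 1) << (i * m + j)
    return mixed

def split(mixed: int, m: int, n: int) -> list[int]:
    """Inverse of mix: de-interleave one m*n-bit word into m values of n bits."""
    outputs = [0] * m
    for i in range(n):
        for j in range(m):
            outputs[j] |= ((mixed >> (i * m + j)) & 1) << i
    return outputs

def parallel_lsr(values: list[int], n: int, shift_amount: int) -> list[int]:
    """Logically right-shift every n-bit input by shift_amount using one shift."""
    m = len(values)
    offset = shift_amount * m          # offset = shift amount x number of inputs
    shifted = mix(values, n) >> offset # one shift on the mixed word
    return split(shifted, m, n)
```

Because each original bit index occupies m adjacent positions in the mixed word, shifting that word by `shift_amount * m` moves every input by `shift_amount` at once. For instance, `parallel_lsr([0b10110100, 0b01101001], 8, 3)` gives the same values as shifting each 8-bit input right by 3 separately.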

Although FIG. 10 describes steps S1000 to S1014 as being executed sequentially, this merely illustrates the technical idea of an embodiment of the present invention. In other words, a person of ordinary skill in the technical field to which an embodiment of the present invention belongs could, without departing from the essential characteristics of the embodiment, change the order shown in FIG. 10 or execute one or more of steps S1000 to S1014 in parallel, applying various modifications and variations. FIG. 10 is therefore not limited to a time-series order.

Meanwhile, the steps shown in FIG. 10 can be implemented as computer-readable code on a computer-readable recording medium. The computer-readable recording medium includes all types of recording devices that store data readable by a computer system. That is, the computer-readable recording medium includes storage media such as magnetic storage media (e.g., ROM, floppy disk, hard disk) and optical reading media (e.g., CD-ROM, DVD). In addition, the computer-readable recording medium may be distributed over computer systems connected through a network, so that the computer-readable code is stored and executed in a distributed manner.

The above description merely illustrates the technical idea of the present embodiment, and a person of ordinary skill in the technical field to which the present embodiment belongs will be able to make various modifications and variations without departing from its essential characteristics. Accordingly, the embodiments are intended to explain, not to limit, the technical idea of the present embodiment, and the scope of that technical idea is not limited by these embodiments. The scope of protection of the present embodiment should be interpreted according to the claims below, and all technical ideas within an equivalent scope should be construed as falling within the scope of rights of the present embodiment.

100: memory 110: processor
300: first shift device 310: second shift device
400: funnel shifter 810: mixer unit
820: offset generation unit 830: shift unit
840: splitter unit

Claims

A shift operation device capable of performing a shift operation on at least two pieces of input data each including a plurality of bits, the device comprising:
a mixer unit that outputs mixed data by alternately arranging the bits of the at least two pieces of input data;
an offset generation unit that generates a shift offset value obtained by multiplying a shift amount of the at least two pieces of input data by the number of the at least two pieces of input data;
a shift unit that generates shifted data by performing a shift operation on the mixed data using the offset value; and
a splitter unit that generates at least two pieces of output data corresponding to the at least two pieces of input data by separating the bits of the shifted data.

The shift operation device of claim 1,
wherein, in the alternating arrangement, bits of the input data having the same bit index are arranged adjacent to one another, and the bits are arranged alternately in ascending order of the bit index.

The shift operation device of claim 1,
wherein, when the number of the input data is 2 to the power of n (n being a non-negative integer), the offset value is a value obtained by left-shifting the shift amount, expressed as binary data, by n.

The shift operation device of claim 1,
wherein the shifted data is data obtained by shifting the mixed data to the right by the offset value and alternately arranging the most significant bits of the input data at the upper bit positions corresponding to the offset value.

The shift operation device of claim 1,
wherein the shifted data is data obtained by shifting the mixed data to the right by the offset value and arranging bits having a value of 0 at the upper bit positions corresponding to the offset value.

The shift operation device of claim 1,
wherein the shifted data is data obtained by moving the lower bits of the mixed data corresponding to the offset value to the upper bit positions.

The shift operation device of claim 1,
wherein the shifted data is data obtained by shifting the mixed data to the left by the offset value and arranging bits having a value of 0 at the lower bit positions corresponding to the offset value.

The shift operation device of claim 1,
wherein the shifted data is data obtained by moving the upper bits of the mixed data corresponding to the offset value to the lower bit positions.

An operating method of a shift operation device capable of performing a shift operation on at least two pieces of input data each including a plurality of bits, the method comprising:
generating mixed data by alternately arranging the bits of the at least two pieces of input data;
generating a shift offset value obtained by multiplying a shift amount of the at least two pieces of input data by the number of the at least two pieces of input data;
generating shifted data by performing a shift operation on the mixed data using the offset value; and
generating at least two pieces of output data corresponding to the at least two pieces of input data by separating the bits of the shifted data.

The method of claim 9,
wherein, in the alternating arrangement, bits of the input data having the same bit index are arranged adjacent to one another, and the bits are arranged alternately in ascending order of the bit index.

The method of claim 9,
wherein, when the number of the input data is 2 to the power of n (n being a non-negative integer), the offset value is a value obtained by left-shifting the shift amount, expressed as binary data, by n.

The method of claim 9,
wherein the shifted data is data obtained by shifting the mixed data to the right by the offset value and alternately arranging the most significant bits of the input data at the upper bit positions corresponding to the offset value.

The method of claim 9,
wherein the shifted data is data obtained by shifting the mixed data to the right by the offset value and arranging bits having a value of 0 at the upper bit positions corresponding to the offset value.

The method of claim 9,
wherein the shifted data is data obtained by moving the lower bits of the mixed data corresponding to the offset value to the upper bit positions.

The method of claim 9,
wherein the shifted data is data obtained by shifting the mixed data to the left by the offset value and arranging bits having a value of 0 at the lower bit positions corresponding to the offset value.

The method of claim 9,
wherein the shifted data is data obtained by moving the upper bits of the mixed data corresponding to the offset value to the lower bit positions.