KR100279740B1

KR100279740B1 - Code distribution method and apparatus for parallel processing

Info

Publication number: KR100279740B1
Application number: KR1019970067114A
Authority: KR
Inventors: 김정기
Original assignee: 정선종; 한국전자통신연구원 (Electronics and Telecommunications Research Institute)
Priority date: 1997-12-09
Filing date: 1997-12-09
Publication date: 2001-02-01
Also published as: KR19990048440A

Abstract

The present invention provides a code distribution technique for use in parallel computers. The purpose of a parallel computer is to reduce execution time by dividing a large amount of work among several microprocessors (hereinafter, processors). This requires a method of distributing data efficiently across the processors and a method of processing that data in parallel. The object of the present invention is to provide a method and apparatus for distributing data stored in code form across the processors. When code-form data is input, it is distributed and stored across the parallel processors in a form that is efficient for retrieval, and at search time the codes to be searched are likewise searched in a distributed manner. The search characteristic addressed by the invention is that the search targets are not only the code given as the query but all codes that contain the query code. Such codes are called equivalent codes. Search efficiency can be improved by dividing these equivalent codes evenly among the processors at the time the codes are stored. To this end, codes are distributed to the processors in a folding pattern so that equivalent codes are searched across the processors as evenly as possible. This reduces execution time when data with equivalent-code characteristics is stored and retrieved on a parallel computer.

Description

Code distribution method and apparatus for parallel processing

The present invention relates to a method and apparatus for distributing data in ASCII code or binary code form across a plurality of processors so that the data can be processed in parallel.

Parallel computers were developed because processing a given amount of data on a computer with a single processor takes a long time. A parallel computer has several processors that divide a given amount of data among themselves and process it simultaneously, which yields faster results. However, it is very difficult to divide both the amount of data and the execution time equally. Moreover, correlations among the data items aggravate the problem. Many techniques for parallelizing data have therefore been developed, and the area remains the subject of considerable research effort.

As for prior work on code distribution for parallel processing, no related domestic research could be found. Abroad, one example is the paper by Zezula et al., "Hamming Filter: A Dynamic Signature File Organization for Parallel Stores," published on pages 314-327 of the proceedings of the 19th Very Large Database Conference held in Ireland in 1993. That paper uses the Hamming code, a linear error correcting code, to distribute codes.

The Hamming code, proposed in 1973 for data transmission, appends a code to the data before it is sent; after the data is received, the appended code is used to locate where an error occurred during transmission. The Hamming filter arranges such a Hamming code as a matrix, called the check matrix, and determines the processor number from the result of multiplying a code by the check matrix. The check matrix is built according to the Hamming code generation rules: the number of rows is the base-2 logarithm of the number of processors, and the number of columns is the number of processors minus 1. That is, with four processors a (3x2) check matrix (3 columns by 2 rows) is obtained, and 3 bits of the code are used to determine the processor number.
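As a rough illustration of the check-matrix idea described above (not the exact construction from the Zezula et al. paper), the sketch below multiplies 3 bits of a code by a 2-row, 3-column binary matrix modulo 2 and reads the 2-bit result as a processor number for four processors. The matrix entries, the name `hamming_filter_processor`, and the choice of the low 3 bits are assumptions made only for illustration.

```python
# Illustrative only: a 2-row x 3-column binary check matrix for 4 processors.
# Its columns are the three nonzero 2-bit vectors, as in a Hamming parity-check
# matrix; the actual matrix used by the Hamming filter is not given here.
CHECK_MATRIX = [
    [0, 1, 1],
    [1, 0, 1],
]


def hamming_filter_processor(code: int) -> int:
    """Map the low 3 bits of `code` to a processor number in 0..3 (assumed bit choice)."""
    bits = [(code >> i) & 1 for i in range(3)]  # the 3 bits fed to the matrix
    result = [sum(h * b for h, b in zip(row, bits)) % 2 for row in CHECK_MATRIX]
    return result[0] * 2 + result[1]  # read the two result bits as a number
```

With a different number of processors the matrix shape changes, which is the dependency the following paragraph criticizes.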

However, the Hamming filter has the drawback that it achieves the best code distribution only when the check matrix has the form determined above. In other cases it is less efficient, and the check matrix must be recomputed every time the number of processors changes.

The main purpose of parallel processing in data processing is to reduce execution time by dividing a given amount of work among several processors. Such parallel processing requires a method of allocating data equally to the processors and a method of dividing execution equally among them. The measures for this are data skew and execution skew.

Data skew measures how unevenly the amount of data to be processed is spread over the processors; the smaller the value, the more efficient the parallel execution. Execution skew measures how unevenly the amount of work to be executed is spread over the processors; the smaller the value, the more efficient the parallel execution and hence the shorter the execution time.

Accordingly, an object of the present invention is to reduce the execution time of a task by minimizing data skew and execution skew when a given amount of data is processed in parallel. To achieve this object, the invention provides a technique for allocating data to the parallel processors so that data skew and execution skew are minimized.

The characteristic of the data to be processed in parallel in the present invention is that, when a search is performed, the equivalent codes of the query must be processed simultaneously by the parallel processors. This stems from the requirement that a search must find not only the data that match the query exactly but all data that contain the query. This requirement arises mainly in key-based searches that use hashing. Here, a key is the part of a data item's code that is used for searching. The equivalent keys are all keys that contain the key of the query in the sense of a bitwise logical AND, and are defined as follows.

EK(QK) = { K_i | (K_i & QK) = QK, 0 ≤ i < n }

Here, QK is the key of the query and n is the number of all keys that can be generated at the length of the query key; when the length of the query key is L, n is 2 to the power L. For example, if the query key is 1010, the equivalent keys are 1?1?, that is, 1010, 1011, 1110, and 1111. Distributing these equivalent keys evenly over the processing nodes minimizes execution skew.
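The definition of EK(QK) can be checked with a few lines of Python. This is a minimal sketch that simply enumerates all 2^L keys; the function name `equivalent_keys` and the representation of keys as integers are assumptions for illustration.

```python
def equivalent_keys(query_key: int, key_length: int) -> list[int]:
    """Return every key K_i in [0, 2**key_length) with (K_i & QK) == QK."""
    n = 1 << key_length  # n = 2^L keys of length L
    return [k for k in range(n) if (k & query_key) == query_key]


keys = equivalent_keys(0b1010, 4)
print([format(k, "04b") for k in keys])  # ['1010', '1011', '1110', '1111']
```

Brute-force enumeration is only practical for short keys; it is used here because it mirrors the set definition directly.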

In the parallel computer architecture used in the present invention, the processors for parallel processing are logically divided into a front-end processor and back-end processors.

When data is stored, the front-end processor receives the data from the user, determines the processor in which the data should be stored, and passes the data to that processor. Each back-end processor is equipped with a disk that stores the actual data, a processor for it, and memory, and is responsible for storing and managing the data.

When data is searched, the front-end processor receives a query from the user, computes the equivalent keys, passes the query to the processors that hold the equivalent keys, and then collects the query results and delivers them to the user. The back-end processors perform the actual parallel search for the query received from the front-end processor and return the results to it. These processors are connected by a high-speed network.

Accordingly, the present invention distributes code-form data on the basis of its code, thereby minimizing the skew in data volume and the execution skew and maximizing the efficiency of parallel processing.

FIG. 1 is a diagram showing the order in which codes are distributed to the parallel processors by the folding method when the number of processors is 8.

FIG. 2 is a flowchart for determining the number of the processor in which an input code is to be stored according to the present invention.

FIG. 3 is a flowchart for determining the allocation direction of the folding method used to determine the processor in which a code is to be stored according to the present invention.

FIG. 4 is a block diagram showing how the folding chip is connected when the method for allocating codes to processors according to the present invention is designed as a chip.

FIG. 5 is a block diagram of the internal structure of the folding chip when the method for allocating codes to processors according to the present invention is designed as a chip.

＜Description of reference numerals for the main parts of the drawings＞

100 : folding chip
110 : code folding step (FS) calculation unit
120 : initial R_K calculation unit
130 : EPROM
140 : iterative R_K calculation unit
150 : allocation direction determination unit
160 : processor number determination unit
200 : demultiplexer
300 : processor

The code distribution technique according to the present invention is a method of distributing data efficiently when it is stored on the parallel processors, so as to minimize execution skew when the data is searched.

This distribution is performed in the front-end processor. To distribute the codes, the basic ordering of the codes is set to the directions ↓↑↑↓. The direction of each arrow is called the allocation direction of the codes; ↓ is called the forward direction and ↑ the reverse direction.

Beyond this, that is, past the fourth step, a further set of arrows appears on the inside, as shown in FIG. 1. This process is defined as folding. That is, each time the arrangement of codes reaches a multiple of 4, folding occurs once more on the inside, and this is repeated until no further folding is possible.

In the column of FIG. 1 that contains 69, the shortest arrows indicate the allocation direction of two codes, the longer arrows indicate the positions at which the short arrows are placed in order of magnitude, and likewise the longest arrow indicates the overall allocation direction. In this case repeated folding produces three levels of arrows, and no further folding can occur. The number of folds (NF), that is, the maximum number of arrow levels in one column, depends on the number of processors (NP) and is given by NF = log2(NP). For example, on a parallel computer with 8 processors the maximum number of folds is log2(8) = 3. Once the ↓↑↑↓ allocation directions of the first arrows have been used up, the overall code allocation direction of the next column is also determined by the ↓↑↑↓ pattern. That is, in the second column the allocation direction is forward for 32, 33, 34, 35 starting from the middle and forward for 36, 37, 38, 39 starting from the top. The large arrows in FIG. 1 represent this process. For example, since the allocation direction of the column containing 0 is forward, the overall allocation direction of the column containing 36 is reverse.
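The unfolded base pattern and the fold count can be sketched as follows. This covers only the first 4 × NP codes, where no inner folding has yet occurred; the function names are assumptions, and NP is assumed to be a power of two.

```python
def max_folds(num_processors: int) -> int:
    """NF = log2(NP), assuming NP is a power of two."""
    return num_processors.bit_length() - 1


def base_pattern_processor(code: int, num_processors: int) -> int:
    """Processor for one of the first 4*NP codes, using only the basic order."""
    directions = [0, 1, 1, 0]  # the basic order ↓↑↑↓ (0 = forward, 1 = reverse)
    group, offset = divmod(code, num_processors)
    if group >= 4:
        raise ValueError("only the first 4*NP codes follow the unfolded pattern")
    if directions[group] == 0:
        return offset                          # forward: count down the processors
    return (num_processors - 1) - offset       # reverse: count back up


print(max_folds(8))  # 3
# Codes 0..7 run forward, 8..15 and 16..23 run in reverse, 24..31 run forward.
print([base_pattern_processor(k, 8) for k in range(8, 16)])  # [7, 6, 5, 4, 3, 2, 1, 0]
```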

The flow by which a code is input and distributed to a processor is shown in FIG. 2 and FIG. 3. FIG. 5 is an internal block diagram of the folding chip according to the present invention. As shown, the chip comprises: a code folding step calculation unit (110) that calculates the code folding step (FS) from the input code (K), the number of processors (N_P), and the folding unit (FU); an initial R_K calculation unit (120) that calculates the initial R_K from the code (K) and the number of processors (N_P); an EPROM (130) that stores in advance the folding unit FU, the lowest folding code FB, and the folding bit pattern BP; an iterative R_K calculation unit (140) that calculates R_K by iteration from the lowest folding code FB, the code folding step FS, and the R_K value; an allocation direction determination unit (150) that determines the allocation direction from the code (K), the number of processors (N_P), and the folding bit pattern BP; and a processor number determination unit (160) that determines and outputs the processor number (PN) from the R_K value of the iterative R_K calculation unit (140) and the allocation direction AD of the allocation direction determination unit (150).

The folding chip of the present invention configured as above operates as shown in the flowchart of FIG. 2. When code-form data is input, a first step is performed in which the folding unit (FU), the lowest code (FB), and the folding length (FL) are set to their initial values. First, the folding unit FU, the basic unit of folding, is initialized to a predetermined constant; since a new folding form appears after every four folds, FU = 4 is set (S1). Next, the lowest code FB is set to the initial value 0 (S2); this is the lowest code of the fold, that is, the smallest code value in one column. Then the folding length FL is initialized to the number of processors, FL = N_P (S3).

After the first step, a second step is performed to obtain the folding step FS, which is the number of times the code is folded. The code value (K) is divided by the product of the folding unit (FU) and the number of processors (N_P), and the remainder of that quotient modulo the base-2 logarithm of the number of processors (N_P) is taken as the code folding step FS (S4). That is, FS is obtained by [Equation 1] below.

FS = (K / (N_P * FU)) % log₂(N_P)

Then a third step (S5) is performed in which the remainder of the code (K) divided by the number of processors (N_P) is obtained as the allocation number R_K. R_K is the processor number indicating to which of the processors the code is to be allocated.

Next, a fourth step is performed in which the following process is repeated as many times as the code folding step FS to obtain the final processor allocation number R_K: the folding length FL is halved and the lowest code FB is updated; the lowest code FB is subtracted from the allocation number R_K to give a new allocation number R_K; the complement of the new allocation number R_K with respect to the halved folding length FL is taken; and the lowest code FB is added back to give the allocation number R_K.

In the fourth step, the folding length (FL) is first divided by 2 (FL = FL/2) to obtain the updated folding length (FL) (S6). This is because each fold halves the folding length.

Then it is checked whether the allocation number R_K is greater than or equal to the sum of the lowest code FB and the updated folding length FL (R_K ≥ FB + FL) (S7).

If it is, the updated folding length FL is added to the lowest code FB (FB = FB + FL) to obtain a new lowest code FB (S8).

The lowest code FB (updated in S8 if the condition held) is subtracted from the allocation number R_K (R_K = R_K - FB) to obtain a new allocation number R_K (S9).

Once the new allocation number R_K has been obtained, its complement with respect to the halved folding length FL is taken: R_K = (FL - 1) - R_K (S10).

The lowest code FB is then added to this allocation number to obtain the allocation number R_K again (S11).

The allocation number R_K is recomputed in this way, and the process is repeated with the current FL and FB as many times as the code folding step FS, finally giving the allocation number R_K.
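Steps 1 to 4 (S1 to S11) can be sketched in Python as below. This is an illustrative reading of the flowchart description, not a reference implementation: the function name is an assumption, the division in the FS formula is taken to be integer division, and N_P is assumed to be a power of two greater than 1.

```python
def folded_allocation_number(code: int, num_processors: int) -> int:
    """Return R_K after the folding iterations, before the direction is applied."""
    fu = 4                                             # S1: folding unit
    fb = 0                                             # S2: lowest code of the fold
    fl = num_processors                                # S3: folding length
    log_np = num_processors.bit_length() - 1           # log2(N_P) for a power of two
    fs = (code // (num_processors * fu)) % log_np      # S4: folding step
    r_k = code % num_processors                        # S5: initial allocation number
    for _ in range(fs):
        fl //= 2                                       # S6: halve the folding length
        if r_k >= fb + fl:                             # S7
            fb += fl                                   # S8: move the lowest code up
        r_k -= fb                                      # S9
        r_k = (fl - 1) - r_k                           # S10: complement within FL
        r_k += fb                                      # S11
    return r_k


print(folded_allocation_number(79, 8))  # 5 (FS = 2, R_K goes 7 -> 4 -> 5)
```

Each iteration reverses the order of the codes inside the current half-length block, which is why the mapping stays one-to-one within a group of N_P codes.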

Thereafter, a fifth step is performed in which the allocation direction AD is obtained from the pattern for allocating codes (S12). The flow for obtaining the code allocation direction AD is shown in FIG. 3.

Once the code allocation direction AD has been obtained, a sixth step is performed: it is determined whether AD is forward (S13); if forward, the processor number is set to the allocation number R_K (S14); if reverse, the complement of R_K with respect to the number of processors, PN = (N_P - 1) - R_K, is taken as the processor number (PN) (S15); and the processor number (PN) is output (S16).

FIG. 3 is a flowchart for obtaining the allocation direction AD. As shown, when the code (K) is input, the bit pattern used to decide whether the allocation direction is forward or reverse is set to the hexadecimal value BP = 0x6996 (S31). This is the bit pattern for determining the fold; a bit of 0 means forward and a bit of 1 means reverse.

The code K is then divided by the number of processors N_P to obtain the quotient D_K = K/N_P (S32), and the hexadecimal representation of the quotient D_K is obtained (S33). This is because BP is 16 bits long, so that each hexadecimal digit selects one bit of BP. Next, the allocation direction AD is initialized to forward, AD = 0 (S34).

Then, for each hexadecimal digit D_i of D_K, the operation

AD = AD ⊕ BP[D_i]

is performed (S35), where D_i is a hexadecimal digit of D_K, BP[D_i] is the bit of BP selected by that digit, and ⊕ denotes exclusive OR.

The allocation direction AD obtained in this way is output (S36). In the sixth step of FIG. 2 it is then determined whether the allocation direction is forward (AD = 0), and the processor number PN is set either to the allocation number R_K or to the complement of R_K with respect to the number of processors, and is output.
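The direction computation of FIG. 3 can be sketched as follows. The function name is an assumption, and so is the choice of indexing BP from its least significant bit; because 0x6996 reads the same from either end (bit i equals bit 15 - i), that choice does not affect the result.

```python
BP = 0x6996  # folding bit pattern (S31): bit value 0 = forward, 1 = reverse


def allocation_direction(code: int, num_processors: int) -> int:
    """Return AD for a code: 0 means forward, 1 means reverse."""
    d_k = code // num_processors            # S32: quotient D_K
    ad = 0                                  # S34: start as forward
    while d_k:
        digit = d_k & 0xF                   # S33: next hexadecimal digit D_i
        ad ^= (BP >> digit) & 1             # S35: XOR in the pattern bit for that digit
        d_k >>= 4
    return ad


print(allocation_direction(79, 8))  # 0: D_K = 9 and bit 9 of 0x6996 is 0
print(allocation_direction(36, 8))  # 1: D_K = 4 and bit 4 of 0x6996 is 1
```

A quotient of 0 skips the loop and yields forward, which matches bit 0 of the pattern being 0.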

The folding step (FS) obtained from the number of processors (N_P) in step S4 of the second step indicates how many folds occur for the input code: 0 means the basic fold, 1 means one inner fold, and 2 means two inner folds.

The lowest code (FB) and the length (FL) of the fold are determined by this number of folds. They correspond to the starting position and the length of the arrow determined by the code.

In the allocation direction determination unit of FIG. 3, forward is encoded as 0 and reverse as 1, which gives the allocation pattern as the hexadecimal number 0x6996. The quotient (D_K) is computed for the code, its hexadecimal digits are computed, the allocation direction for each digit is looked up in the allocation pattern, and the results are XORed to determine the overall direction.

The processor number of a code is determined according to these fold directions. For example, when the code is 79, the folding step (FS) is (79 / (8*4)) % log2(8) = 2 and the remainder (R_K) is 7. The loop is executed twice. In the first iteration the folding length (FL) is halved to 4; since R_K (= 7) is greater than FB (= 0) + FL (= 4), the lowest code (FB) becomes 4, and recomputation gives R_K = 4. In the second iteration the folding length (FL) is halved again to 2; since R_K (= 4) is not greater than or equal to FB (= 4) + FL (= 2), the lowest code (FB) remains 4, and recomputation gives R_K = 5. The process then moves on to the allocation direction determination unit of FIG. 3. For the code K (= 79) the quotient (D_K) is K/N_P = 79/8 = 9, whose hexadecimal value is 0x9. Since the quotient has one hexadecimal digit, the loop is executed once. Bit 9 of BP = 0x6996 is 0, so AD = 0 ⊕ 0 = 0 and the direction is forward. The allocation direction is therefore forward and the processor number is R_K itself, namely 5. This agrees with FIG. 1, in which code 79 is allocated to processor 5.
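Putting the six steps together gives the compact sketch below, which reproduces the worked example for code 79. It is an illustrative reading of FIG. 2 and FIG. 3 under the same assumptions as the description (integer division, N_P a power of two greater than 1, BP indexed by hexadecimal digit); the function name and the default argument are not from the patent.

```python
BP = 0x6996  # folding bit pattern: 0 = forward, 1 = reverse


def processor_number(code: int, num_processors: int, fu: int = 4) -> int:
    """Return PN, the processor that stores `code`."""
    log_np = num_processors.bit_length() - 1
    fs = (code // (num_processors * fu)) % log_np          # folding step
    fb, fl, r_k = 0, num_processors, code % num_processors
    for _ in range(fs):                                    # fold FS times
        fl //= 2
        if r_k >= fb + fl:
            fb += fl
        r_k = (fl - 1) - (r_k - fb) + fb                   # S9 to S11 in one expression
    d_k, ad = code // num_processors, 0
    while d_k:                                             # XOR pattern bits per hex digit
        ad ^= (BP >> (d_k & 0xF)) & 1
        d_k >>= 4
    return r_k if ad == 0 else (num_processors - 1) - r_k  # forward or complemented


print(processor_number(79, 8))                             # 5
# The equivalent keys of query 1010 (10, 11, 14, 15) land on four different processors.
print([processor_number(k, 8) for k in (10, 11, 14, 15)])  # [5, 4, 1, 0]
```

In this sketch every run of N_P consecutive codes starting at a multiple of N_P is spread over all N_P processors exactly once, so the stored data volume stays balanced.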

The algorithm of this code distribution technique is used as the means of determining the processor number when a code is inserted. It is also used as the means of determining the processor in which a newly created block is to be stored when data is stored in blocks to make searching efficient. When data is stored in blocks in this way, a block that overflows is split, and the representative key of the block is lengthened by one digit and split as well. A processor must then be chosen to store the newly created key and block, and the code distribution technique can be used here too. The code distribution method developed in the present invention is suited to parallel computer systems in which the number of processors is a power of 2. Since the processors of parallel computer systems are generally provided in powers of 2, using this code distribution technique is reasonable.

When the code distribution technique of the present invention is designed in hardware, codes can be allocated to the processors by the structure shown in FIG. 4. The structure comprises a folding chip (100) that analyzes each code (K) and determines and controls the folding length, processor number, and allocation direction; a demultiplexer (200) that outputs the input code K to the corresponding processor according to the folding length, direction, and processor number from the folding chip (100); and a plurality (N_P) of processors (0, 1, 2, ..., P) (300) that process the codes (K) distributed by the demultiplexer (200).

The demultiplexer (200) consists of as many demultiplexers as the length of the code K, each having one input and as many outputs as the number of processors (N_P), so that the input code is output to the designated processor under the control of the folding chip (100).

Thus, as many 1xN_P demultiplexers as the key length of the code are required, and the code is distributed according to the processor number produced by the folding chip.

The block diagram of the internal structure of the folding chip (100) is shown in FIG. 5, and its operation follows the flowchart of FIG. 2. Three units can operate independently of one another: the code folding step calculation unit (110), which receives the code (K) and the number of processors (N_P) and receives the folding unit (FU) from the EPROM (Erasable Programmable Read-Only Memory) (130) that stores the constants; the initial R_K calculation unit (120); and the allocation direction determination unit (150), which receives the folding bit pattern (BP) from the EPROM (130). The iterative R_K calculation unit (140) can operate only after it has received the code folding step (FS), R_K, and the lowest folding code (FB) from the EPROM (130). The processor number determination unit (160) can operate only after it has received R_K from the iterative R_K calculation unit (140) and the allocation direction (AD) from the allocation direction determination unit (150). The resulting output is the processor number.

The code distribution technique of the present invention uses the same algorithm regardless of the number of processors, so the chip design does not change either. In contrast, the conventional Hamming filter uses a check matrix that depends on the number of processors, so the chip must be redesigned whenever the number of processors changes.

The code distribution technique developed in the present invention stores code-type data evenly across the processors in terms of data volume. When a search is performed, the equivalent codes of the query are searched by the parallel processors in a distributed manner, so parallel processing is carried out effectively. Moreover, because neither the algorithm nor the chip configuration changes when the number of processors changes, codes can be distributed simply by supplying the number of processors.

Claims

A code distribution method for parallel processing, comprising:

a first step of, when code-type data is input, initializing the folding unit (FU), which determines after how many codes a new folding form appears, to an arbitrarily chosen constant, initializing the folding length (FL) to the number of processors (N_P), and initializing the lowest code (FB) to the initial value "0";

a second step of obtaining, from the number of processors and the folding unit, the folding step (FS), which indicates how many times folding is applied to the input code, as

FS = (K / (N_P * FU)) % log₂(N_P);

a third step of dividing the code value K by the number of processors (N_P) and taking the remainder as R_K;

a fourth step of obtaining the final processor allocation number R_K by repeating the following as many times as the code folding step (FS): halving the folding length (FL) and updating the lowest code (FB), subtracting the lowest code (FB) from the allocation number R_K to obtain a new allocation number R_K, taking the complement of the new allocation number R_K, and adding the lowest code (FB) back to obtain the allocation number R_K;

a fifth step of determining the overall allocation direction (AD) by expressing the allocation pattern, with forward as 0 and reverse as 1, as the hexadecimal value '0x6996', obtaining the quotient (D_K) of the code as a hexadecimal value, determining the direction for each hexadecimal digit, and then applying an exclusive OR operation; and

a step of determining the processor number (PN) from the allocation direction (AD), the processor number being the allocation number R_K itself when the direction is forward and the complement of the allocation number R_K with respect to the number of processors, PN = (N_P - 1) - R_K, when the direction is reverse, and outputting the processor number (PN) to which the currently input code K is assigned.
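The second, third, and final steps are plain integer arithmetic and can be sketched as follows. The function names are hypothetical, the division in the FS formula is assumed to be integer division, N_P is assumed to be a power of two of at least 2 so that log₂(N_P) is a positive integer, and the value FU = 4 in the example is arbitrary. The folded R_K and the direction AD are taken as given inputs to the final step.

```python
def folding_step(K, N_P, FU):
    """Second step: FS = (K / (N_P * FU)) % log2(N_P), with integer division."""
    if N_P < 2 or N_P & (N_P - 1):
        raise ValueError("this sketch assumes N_P is a power of two, at least 2")
    log2_np = N_P.bit_length() - 1  # exact log2 for a power of two
    return (K // (N_P * FU)) % log2_np


def initial_allocation(K, N_P):
    """Third step: R_K is the remainder of K divided by N_P."""
    return K % N_P


def processor_number(R_K, AD, N_P):
    """Final step: keep R_K when forward (AD = 0), else take its complement."""
    return R_K if AD == 0 else (N_P - 1) - R_K


# Example with K = 77, N_P = 8 and an arbitrarily chosen FU = 4.
print(folding_step(77, 8, 4))      # 2
print(initial_allocation(77, 8))   # 5
print(processor_number(5, 1, 8))   # 2
```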

The method as claimed in claim 1, wherein the fourth step comprises:

a step (S6) of dividing the folding length (FL) by 2 (FL = FL / 2) to obtain an updated folding length (FL);

a step (S7) of comparing whether the allocation number R_K is greater than or equal to the sum of the lowest code (FB) and the updated folding length (FL) (R_K ≥ FB + FL);

a step (S8) of, when the comparison holds, adding the updated folding length (FL) to the lowest code (FB) (FB = FB + FL) to obtain a new lowest code (FB);

a step (S9) of subtracting the new FB from the allocation number R_K (R_K = R_K - FB) to obtain a new allocation number R_K;

a step (S10) of, once the new allocation number R_K is obtained, taking its complement with respect to the halved folding length (FL), R_K = (FL - 1) - R_K; and

a step (S11) of adding the R_K from step S10 to the updated lowest code (FB) to obtain the allocation number R_K again,

wherein steps S6 to S11 are repeated as many times as the code folding step (FS), each repetition using the allocation number R_K, FL, and FB that result from step S11, to obtain the final allocation number R_K.
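Steps S6 to S11 can be sketched as a short loop. The function name `fold_allocation` is hypothetical, and the sketch assumes that step S8 is applied only when the comparison in step S7 holds and that FL and FB carry over from one repetition to the next.

```python
def fold_allocation(R_K, FS, N_P):
    """Apply steps S6 to S11 FS times to the initial allocation number R_K."""
    FL = N_P  # folding length starts at the number of processors
    FB = 0    # lowest code starts at 0
    for _ in range(FS):
        FL //= 2                # S6: halve the folding length
        if R_K >= FB + FL:      # S7: is R_K in the upper half of its block?
            FB += FL            # S8: move the lowest code to that half
        R_K -= FB               # S9: offset of R_K inside the block
        R_K = (FL - 1) - R_K    # S10: complement within the block
        R_K += FB               # S11: back to an absolute allocation number
    return R_K


# With 8 processors, two folding steps permute the allocation numbers 0..7.
print([fold_allocation(r, 2, 8) for r in range(8)])  # [2, 3, 0, 1, 6, 7, 4, 5]
```

Under these assumptions each repetition reverses the order of the allocation numbers inside blocks of length FL, so the result is always a permutation of 0 to N_P - 1.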

The method according to claim 1, wherein the fifth step of obtaining the allocation direction (AD) from the pattern by which codes are to be allocated comprises:

a step (S31) of, when the code K is input, providing the hexadecimal bit pattern BP = 0x6996, which determines whether the allocation direction is forward or reverse;

a step (S32) of dividing the code K by the number of processors N_P to obtain the quotient of the code, D_K = K / N_P;

a step (S33) of obtaining the hexadecimal representation of the quotient D_K;

a step (S34) of setting the initial value of the allocation direction (AD) to forward, AD = 0;

a step (S35) of obtaining the AD value by an exclusive OR operation, where D_i denotes a hexadecimal digit of D_K; and

a step (S36) of outputting the allocation direction (AD) thus obtained.
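Steps S31 to S36 can be sketched as follows. The exact formula of step S35 is not reproduced in the text, so this sketch assumes that each hexadecimal digit D_i selects bit D_i of BP and that AD is the exclusive OR of all selected bits. The function name `allocation_direction` is hypothetical, and the division in step S32 is assumed to be integer division.

```python
BP = 0x6996  # bit i gives the direction for hex digit i: 0 forward, 1 reverse


def allocation_direction(K, N_P, bit_pattern=BP):
    """Return AD (0 = forward, 1 = reverse) for code K, under the assumed S35."""
    D_K = K // N_P                       # S32: quotient of the code
    AD = 0                               # S34: start with forward
    while D_K > 0:                       # S33: walk the hex digits of D_K
        D_i = D_K & 0xF                  # current hexadecimal digit
        AD ^= (bit_pattern >> D_i) & 1   # S35 (assumed form): XOR in bit D_i
        D_K >>= 4
    return AD                            # S36


print(allocation_direction(8, 8))    # D_K = 1, so AD = 1 (reverse)
print(allocation_direction(77, 8))   # D_K = 9, so AD = 0 (forward)
```

Bit i of 0x6996 equals the parity of the number of 1 bits in i, so under this assumption AD is the parity of the number of 1 bits in D_K.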

4. The method according to any one of claims 1 to 3,

wherein the code distribution method distributes not only the query code but also its equivalent codes across the processors in folded form.

A code distribution apparatus for parallel processing, comprising: a folding chip (100) that analyzes each input code (K) to determine and control the folding length, the processor number to be assigned, and the allocation direction; a demultiplexer (200) that outputs the input code (K) to a processor under the control of the folding chip (100); and a plurality (N_P) of processors (0, 1, 2, ... P) (300) among which the codes (K) supplied from the demultiplexer (200) are distributed,

wherein the demultiplexer (200) comprises as many demultiplexers as the number of processors (300), each demultiplexer having one input and as many outputs as the number of processors (N_P), and outputs the input code to the designated processor under the control of the folding chip (100).

The apparatus according to claim 5, wherein the folding chip (100) comprises:

a code folding step calculation unit (110) that receives the code (K) and the number of processors (N_P) as input and reads the folding unit (FU) from an EPROM (130) storing constants; an initial R_K calculation unit (120); an allocation direction determination unit (150) that reads the folding bit pattern (BP) from the EPROM (130) storing constants and determines the allocation direction;

an iterative R_K calculation unit (140) that receives the code folding step (FS) from the code folding step calculation unit (110), the R_K value from the initial R_K calculation unit (120), and the lowest folding code (FB) from the EPROM (130) storing constants, and calculates the R_K value by iteration; and

a processor number determination unit (160) that receives R_K from the iterative R_K calculation unit (140) and the allocation direction (AD) from the allocation direction determination unit (150) and determines the processor number (PN).