KR101871871B1

KR101871871B1 - Data clustering apparatus using G-OPTICS

Info

Publication number: KR101871871B1
Application number: KR1020160170160A
Authority: KR
Inventors: 노웅기
Original assignee: 가천대학교 산학협력단
Priority date: 2016-12-14
Filing date: 2016-12-14
Publication date: 2018-06-28
Also published as: KR20180068454A

Abstract

The present invention provides a data clustering apparatus using G-OPTICS. The apparatus includes a storage unit that stores a G-OPTICS algorithm and a control unit that clusters data using the G-OPTICS algorithm. According to the present invention, data can be clustered with the GPU-based G-OPTICS algorithm, which outperforms existing clustering algorithms.

Description

DATA CLUSTERING APPARATUS USING G-OPTICS

The present invention relates to a data clustering apparatus using G-OPTICS and, more particularly, to a data clustering technique using the G-OPTICS algorithm, which uses a GPU to improve the performance of OPTICS.

Clustering is the process of creating groups of similar objects, called clusters, within a dataset, and it is an important data mining problem that has long been studied extensively. Among the many clustering algorithms, density-based algorithms are widely used because they form clusters of arbitrary shape, filter out noise easily, and do not require the number of clusters to be fixed in advance. DBSCAN is a representative density-based algorithm, and various studies have sought to improve its performance so that it handles large datasets more efficiently.

Like many other data mining algorithms, DBSCAN has the weakness that its clustering result is sensitive to parameter values. OPTICS is an algorithm that addresses this weakness: it produces an ordering of the objects that is equivalent to the clustering result for any ε′ smaller than the ε given as a parameter. Like DBSCAN, OPTICS must compute, for each object in the dataset, the distance to every other object, so it has a high complexity of O(N²), where N is the size of the dataset.

F-OPTICS is one of the attempts to improve the performance of OPTICS. As shown in FIG. 1, OPTICS extracts, one at a time, the object q with the minimum reachability distance from Seeds, which is the union of the ε-neighbor sets N(p) of the previously processed objects p, and appends it to the ordering O. The reachability distance of an object q with respect to an object p is defined by formulas (1) and (2). Here, MinPts-distance(p) is the distance from p to its MinPts-th nearest object, and D(p, q) is the distance between the two objects p and q.

The most time-consuming operation in OPTICS is finding the ε-neighbor set N(p) for every object p in the dataset, which requires computing the distance to every other object. In addition, computing the core distance of p requires (partially) sorting the distance values to obtain the MinPts smallest ones. Existing algorithms for improving the performance of OPTICS have focused on this part. DeLi-Clu performs a MinPts-nearest neighbor join using a multidimensional index structure such as an R-tree and was shown experimentally to be up to 20 times or more faster than OPTICS. Its drawback is that, because of the curse of dimensionality affecting multidimensional indexes, performance degrades sharply as the data dimension increases.

F-OPTICS obtains a sampled neighbor set for an object p instead of the actual ε-neighbor set, as follows. All objects in the dataset are projected onto an arbitrary line L to obtain a one-dimensional sequence S. From S, arbitrary close objects within MinPts positions of p are selected. This is repeated for several arbitrary lines, and the set of close objects obtained is called the sampled neighbor set. By reducing the cost of finding ε-neighbors in this way, F-OPTICS achieved up to a 38-fold performance improvement over DeLi-Clu.

The object ordering produced by F-OPTICS is similar to that of OPTICS but not identical. That is, for a specific ε′ (ε′ ≤ ε), the clustering result obtained by F-OPTICS may differ from that of OPTICS or DBSCAN. To reduce the number of parameters, F-OPTICS always sets ε = ∞. However, because the cost of finding ε-neighbors grows as ε increases, setting an excessively large ε increases the execution cost of the algorithm.
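Formulas (1) and (2), which the background cites for the core distance and the reachability distance, are not reproduced in this text. For orientation only, the standard OPTICS definitions that the description appears to follow can be written as below. The notation N_ε(p) for the ε-neighbor set and the UNDEFINED value are assumptions taken from the usual OPTICS formulation, not from this document.

```latex
\begin{align*}
\text{core-dist}_{\varepsilon,\mathit{MinPts}}(p) &=
\begin{cases}
\text{UNDEFINED} & \text{if } |N_\varepsilon(p)| < \mathit{MinPts},\\
\mathit{MinPts}\text{-distance}(p) & \text{otherwise},
\end{cases}\\[4pt]
\text{reach-dist}_{\varepsilon,\mathit{MinPts}}(q, p) &=
\begin{cases}
\text{UNDEFINED} & \text{if } |N_\varepsilon(p)| < \mathit{MinPts},\\
\max\bigl(\text{core-dist}_{\varepsilon,\mathit{MinPts}}(p),\; D(p, q)\bigr) & \text{otherwise}.
\end{cases}
\end{align*}
```

Under these definitions, an object that lies inside the dense core of p has a reachability distance equal to the core distance of p, while an object farther away has a reachability distance equal to its actual distance D(p, q).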

Korean Patent Laid-Open Publication No. 10-2015-0065433 (Algorithm storage device and clustering device including the same)

The present invention has been made in view of the above problems, and an object thereof is to provide a data clustering apparatus using the G-OPTICS algorithm, which uses a GPU to improve the performance of OPTICS.

A data clustering apparatus using G-OPTICS according to one aspect of the present invention includes a storage unit that stores a G-OPTICS algorithm and a control unit that clusters data using the G-OPTICS algorithm.

Preferably, the apparatus may further include an output unit that outputs the result of the control unit clustering the data with the G-OPTICS algorithm.

Preferably, the G-OPTICS algorithm may be an implementation of the OPTICS algorithm using a GPU.

According to the present invention, data can be clustered with the GPU-based G-OPTICS algorithm, which outperforms existing clustering algorithms.

FIG. 1 is a diagram illustrating the F-OPTICS algorithm, a conventional data clustering algorithm.
FIG. 2 is a diagram illustrating the configuration of a data clustering apparatus using G-OPTICS according to an embodiment of the present invention.
FIG. 3 is a diagram illustrating an example structure of a GPU used by the data clustering apparatus using G-OPTICS according to an embodiment of the present invention.
FIG. 4 is a diagram illustrating the G-OPTICS algorithm used by the data clustering apparatus using G-OPTICS according to an embodiment of the present invention.
FIG. 5 is a diagram illustrating the data structure of G-OPTICS according to an embodiment of the present invention.
FIG. 6 is a diagram illustrating candidate objects of the G-OPTICS algorithm according to an embodiment of the present invention.
FIG. 7 is a diagram showing graphs of the results of an experiment comparing the performance of F-OPTICS and G-OPTICS.

To fully understand the present invention, its operational advantages, and the objects achieved by practicing it, reference should be made to the accompanying drawings, which illustrate preferred embodiments of the present invention, and to the content described in them. The specific structural and functional descriptions presented in the embodiments are given only to describe embodiments according to the concept of the present invention, and such embodiments may be implemented in various forms. Likewise, the invention should not be construed as limited to the embodiments described in this specification and should be understood to include all modifications, equivalents, and substitutes falling within its spirit and technical scope.

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings. The same reference numerals in the drawings denote the same members.

FIG. 2 is a diagram illustrating the configuration of a data clustering apparatus using G-OPTICS according to an embodiment of the present invention. The apparatus includes a storage unit (10) and a control unit (20), and may further include an output unit (30).

The storage unit (10) stores the G-OPTICS algorithm described below and may be implemented in the form of a database. The control unit (20) clusters data using the G-OPTICS algorithm and may be implemented as a central processing unit such as a CPU or an AP. The output unit (30) can output the data clustering result of the control unit (20) and may be implemented as a display device such as a monitor.

G-OPTICS improves the OPTICS algorithm by using a GPU. FIG. 3 illustrates an example structure of the GPU used by the data clustering apparatus using G-OPTICS according to an embodiment of the present invention, FIG. 4 illustrates the G-OPTICS algorithm used by the apparatus, FIG. 5 illustrates the data structure of G-OPTICS, and FIG. 6 illustrates the candidate objects of the G-OPTICS algorithm. The G-OPTICS algorithm and the data clustering apparatus using it are described below with reference to these figures.

G-OPTICS, presented according to one aspect of the present invention, uses a GPU to improve the performance of OPTICS. FIG. 3 is a simplified view of the thread and memory structure of a GPU. A GPU executes millions of threads simultaneously, and all threads execute the same function (called a kernel function) on different data. GPU threads are divided into blocks that each contain the same number of threads. Threads within one block share data through shared memory, whereas threads in different blocks must share data through device memory. Shared memory has a small capacity (for example, 48 KB) but is much faster to access than device memory (>1 TB/s). To increase the performance of an algorithm on a GPU, the number of threads performing mutually independent work should be maximized and shared memory should be used as much as possible.

For algorithmic correctness, G-OPTICS inherits the basic structure of OPTICS, and it improves performance by parallelizing the tasks within OPTICS that have no dependencies on one another and require much execution time. Algorithms 3 and 4 in FIG. 4 summarize the G-OPTICS algorithm; the parts of Algorithm 3 that run in parallel are lines (1)-(5), (14), and (16). In lines (1)-(5) of Algorithm 3, the ε-neighbor set N(p_i) of every object p_i in the dataset D is found as follows.

The operation that requires the most computation time in OPTICS is finding the ε-neighbors (lines (7) and (14) of Algorithm 1 of the conventional OPTICS shown in FIG. 1), so improving this part can greatly affect the performance of the whole algorithm. G-OPTICS first builds the data structure L_j (0 ≤ j < d′ ≤ d) shown in FIG. 5, where d is the data dimension. L_j is obtained by creating, from all objects p_i (0 ≤ i < N), an array of pairs (p_{i,j}, p_i.id) of the j-th coordinate value p_{i,j} and the ID p_i.id of p_i, and then sorting this array by p_{i,j}. Here N is the dataset size (N = |D|).

To find the ε-neighbors of an object p_i, G-OPTICS first finds, for every j, the set C_j of all candidate objects in L_j whose coordinate differs from p_{i,j} by at most ε. In FIG. 5, these are all objects from p_u to p_v. Next, it computes the actual distance between p_i and every candidate object in the intersection C(p_i) = ∩_{0 ≤ j < d′} C_j of all C_j, and selects only the objects within distance ε. In FIG. 6, the region containing the candidate objects is shown in gray, with d′ = 2. Because this region is very small compared with the whole dataset region, the number of distance computations performed by G-OPTICS drops sharply and performance can improve greatly.

G-OPTICS finds the ε-neighbors of one object p_i in each thread block and computes the distances to all objects in the candidate set C(p_i) in parallel. In addition, G-OPTICS runs hundreds of thread blocks concurrently, which improves performance further. The value of d′ can be set to any value not greater than d.
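The candidate filtering described above can be sketched sequentially in Python. This is a minimal CPU illustration under stated assumptions, not the GPU implementation of the invention: the function names `build_sorted_lists` and `eps_neighbors` are hypothetical, Euclidean distance is assumed for D, and the object itself is assumed to count as a member of its own ε-neighbor set.

```python
from bisect import bisect_left, bisect_right
from math import dist


def build_sorted_lists(points, d_prime):
    """For each dimension j < d_prime, build L_j: (coordinate, ID) pairs sorted by coordinate."""
    return [sorted((p[j], i) for i, p in enumerate(points)) for j in range(d_prime)]


def eps_neighbors(points, lists, i, eps):
    """Return the sorted IDs of all objects within distance eps of points[i]."""
    candidates = None
    for j, L_j in enumerate(lists):
        # Binary search for the run of L_j whose j-th coordinate lies in [x - eps, x + eps].
        lo = bisect_left(L_j, (points[i][j] - eps, -1))
        hi = bisect_right(L_j, (points[i][j] + eps, len(points)))
        C_j = {obj_id for _, obj_id in L_j[lo:hi]}
        # C(p_i) is the intersection of the per-dimension candidate sets C_j.
        candidates = C_j if candidates is None else candidates & C_j
    # Only the surviving candidates need an exact distance computation
    # (Euclidean distance is an assumption of this sketch).
    return sorted(q for q in candidates if dist(points[i], points[q]) <= eps)


points = [(0.0, 0.0), (0.25, 0.0), (0.0, 0.25), (1.0, 1.0), (0.25, 0.25)]
lists = build_sorted_lists(points, d_prime=2)
print(eps_neighbors(points, lists, 0, eps=0.3))
```

Because the final step always checks the exact distance, a smaller d′ only enlarges the candidate set and does not change the result. On a GPU, the exact distance checks for one object would be spread across the threads of one block, as the description states.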

To obtain the core distance of an object p by formula (1) above, the distances D(p, q) to all objects q in N(p) must be sorted in ascending order to find the distance of the MinPts-th nearest object. Most well-known efficient sorting algorithms, such as quick sort, heap sort, and merge sort, are sequential algorithms that run on a single thread and are difficult to parallelize. G-OPTICS therefore uses the bitonic sort algorithm [6] to sort on the GPU. In a parallel environment, algorithms such as quick sort, heap sort, and merge sort have O(n) complexity, whereas bitonic sort has a much lower complexity of O(log² n), where n is the number of candidate objects in C(p_i).

After the core distance of p has been obtained, formula (2) is used to compute the reachability distance with respect to p for every object q in N(p). This value is used in line (2) of the Update function of Algorithm 4 in FIG. 4. Computing the core distance of p and the reachability distance of each q (∈ N(p)), and setting the state of p to unprocessed, are performed in the same thread block that finds N(p), so they add little overhead and can be carried out efficiently (lines (2)-(4) of Algorithm 3 in FIG. 4). Another advantage of performing these tasks in the same thread block is that performance can be improved further by using shared memory.
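To make the sorting step concrete, the following is a sequential Python simulation of a bitonic sorting network, followed by a core distance lookup built on it. It is a sketch under assumptions rather than the patented GPU code: `bitonic_sort` and `core_distance` are hypothetical names, the input is padded to a power of two with infinity, and returning `None` for an undefined core distance is a choice made here for illustration.

```python
import math


def bitonic_sort(values):
    """Sort ascending with a bitonic network; returns a new list."""
    n = len(values)
    size = 1
    while size < n:
        size *= 2
    a = list(values) + [math.inf] * (size - n)  # pad to a power of two
    k = 2
    while k <= size:          # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:         # compare-exchange distance within this merge step
            # Each index i below touches a disjoint pair (i, i ^ j), so on a GPU
            # every i could be handled by its own thread in the same step.
            for i in range(size):
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    if (ascending and a[i] > a[partner]) or (
                        not ascending and a[i] < a[partner]
                    ):
                        a[i], a[partner] = a[partner], a[i]
            j //= 2
        k *= 2
    return a[:n]


def core_distance(neighbor_distances, min_pts):
    """Return the MinPts-th smallest distance, or None if there are too few neighbors."""
    if len(neighbor_distances) < min_pts:
        return None
    return bitonic_sort(neighbor_distances)[min_pts - 1]


print(bitonic_sort([5, 1, 4, 2, 3]))
print(core_distance([0.0, 0.4, 0.1, 0.3], min_pts=3))
```

The two nested `while` loops run log₂(size)·(log₂(size)+1)/2 times in total, which is where the O(log² n) parallel step count comes from when the inner `for` loop is executed by concurrent threads.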

The Update function in line (14) of Algorithm 3 of G-OPTICS in FIG. 4 is performed as follows. The Update function of OPTICS inserts every object q in N(p) at the appropriate position in Seeds, or adjusts its position, so that Seeds stays sorted in ascending order of reachability distance. However, this requires comparing the reachability distance of each q with those of all objects in Seeds, so it can have O(N²) complexity in the worst case. G-OPTICS instead processes each object q in a different concurrent thread and inserts it into Seeds regardless of reachability-distance order, giving O(1) complexity. Finding the object with the minimum reachability distance in Seeds is done in line (16) of Algorithm 3 in FIG. 4. To do so, the objects in Seeds are distributed among several concurrent threads, each thread finds the object with the minimum reachability distance among its share, and then the object with the minimum reachability distance among the per-thread results is selected.
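The unordered insertion and the two-stage minimum search can be illustrated with a short sequential Python sketch. Everything here is an assumption made for illustration: Seeds is modeled as a dictionary from object ID to reachability distance, the names `update_seeds` and `extract_min` are hypothetical, the check that an object is still unprocessed is omitted, and the "threads" are simulated by slicing the entries into chunks.

```python
def update_seeds(seeds, neighbors, new_reach):
    """Store the smaller reachability distance for each neighbor; no ordering is maintained."""
    for q in neighbors:  # each q could be handled by a separate concurrent thread
        r = new_reach[q]
        if q not in seeds or r < seeds[q]:
            seeds[q] = r  # constant-time insert or overwrite


def extract_min(seeds, num_chunks=4):
    """Remove and return the ID with the smallest reachability distance (seeds must be non-empty)."""
    items = list(seeds.items())
    # Stage 1: each simulated thread scans its own share of Seeds.
    chunks = [items[c::num_chunks] for c in range(num_chunks)]
    partial = [min(chunk, key=lambda kv: kv[1]) for chunk in chunks if chunk]
    # Stage 2: reduce the per-thread minima to a single global minimum.
    q, _ = min(partial, key=lambda kv: kv[1])
    del seeds[q]
    return q


seeds = {}
update_seeds(seeds, [1, 2, 3], {1: 0.5, 2: 0.2, 3: 0.9})
update_seeds(seeds, [1, 3], {1: 0.7, 3: 0.1})
print(extract_min(seeds))
```

The trade-off sketched here is that insertion becomes cheap because no sorted order is kept, while the cost moves to the minimum search, which is a reduction and therefore parallelizes well.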

FIG. 7 shows graphs of the results of an experiment comparing the performance of F-OPTICS and G-OPTICS. An experimental example is described below; in all experiments described here, d′ = 4. Because F-OPTICS is up to 520 times and 38 times faster than OPTICS and DeLi-Clu, respectively, G-OPTICS was compared with F-OPTICS. The hardware platform was a workstation equipped with an Intel Core i7-3960X CPU, 32 GB of main memory, a 256 GB SSD, and an Nvidia GeForce GTX 1080 GPU, and G-OPTICS was implemented with CUDA Toolkit 7.5 on Microsoft Windows 7 64-bit Edition. For F-OPTICS, an open-source implementation was used. The data used in the experiments were synthetic, and the coordinate values of all objects were assumed to lie in the range [0.0, 1.0]. The default parameter values were N = 64K, ε = 0.1, MinPts = 4, and d = 8.

Graphs (a) to (d) in FIG. 7 show the results of experiments that vary the dataset size (N), the threshold (ε), MinPts, and the data dimension (d), respectively. In each graph, the actual execution time is shown as a bar chart on a log scale, and the performance improvement ratio of G-OPTICS is shown as a line chart. In each experiment, all parameters other than the one being varied were set to their default values. As shown in FIG. 7, G-OPTICS improved performance by up to 90.2 times compared with F-OPTICS. In FIG. 7(a), the improvement ratio decreases as the dataset grows because the cost in G-OPTICS of finding the ε-neighbors of every object p and of sorting increases rapidly. In FIG. 7(b), as ε increases, the size of the ε-neighbor candidate set C(p) in G-OPTICS increases, so the execution time increases and the improvement ratio decreases. In FIG. 7(c), the execution times of both G-OPTICS and F-OPTICS hardly change with MinPts, and the improvement ratio stays almost the same. In FIG. 7(d), the execution time of F-OPTICS increases as the data dimension increases, and the improvement ratio of G-OPTICS increases accordingly.

The present invention has been described in detail above with reference to preferred embodiments. However, the present invention is not limited to these embodiments, and its technical idea extends to the range within which anyone with ordinary skill in the art to which the invention pertains can make various modifications or changes without departing from the gist of the invention as claimed in the following claims.

Claims

1. A data clustering apparatus using G-OPTICS, comprising:
a storage unit for storing a G-OPTICS algorithm that implements an OPTICS algorithm using a GPU; and
a control unit for clustering data using the G-OPTICS algorithm,
wherein the GPU executes millions of threads simultaneously, all threads executing the same function on different data, and
wherein the GPU threads are divided into blocks containing the same number of threads, threads within one block share data using shared memory, and data sharing between threads in different blocks uses device memory.

2. The apparatus according to claim 1,
further comprising an output unit for outputting a result of the control unit clustering the data using the G-OPTICS algorithm.

3. (Deleted)