KR20180120551A - Method and apparatus for frequent pattern mining

Info

Publication number: KR20180120551A
Application number: KR1020170099641A
Authority: KR
Inventors: 김민수; 전강욱
Original assignee: 재단법인대구경북과학기술원
Priority date: 2017-04-27
Filing date: 2017-08-07
Publication date: 2018-11-06
Also published as: KR101969219B1

Abstract

Disclosed are a frequent pattern mining method and an apparatus using the same. According to an embodiment, the apparatus copies different relative memory addresses of candidate item sets from a main memory to the respective device memories of GPUs, copies at least one identical transaction block required for calculating the supports of the candidate item sets from the main memory to each of the device memories, and synchronizes the partial supports computed by the GPUs to update the supports of the candidate item sets.

Description

METHOD AND APPARATUS FOR FREQUENT PATTERN MINING

The following embodiments relate to a method and apparatus for mining frequent patterns.

In data mining, Frequent Itemset Mining (FIM) is a fundamental technique. It is widely used in fields such as market basket analysis, web usage mining, social network analysis, intrusion detection, bioinformatics, and recommendation systems. However, as data sizes grow, existing methods are becoming less applicable because of their relatively slow processing speed. The deluge of data generated by automated systems for diagnostic or analytical purposes makes it difficult to apply frequent item set mining in many real applications.

Many sequential frequent item set mining methods have been proposed. However, these methods are limited by the performance obtainable from a single thread and sometimes fail to find frequent item sets in big data within a reasonable amount of time. In terms of computation time, frequent item set mining still faces problems that have not been fully solved.

Many sequential frequent item set mining methods use a single CPU (Central Processing Unit) thread. However, because CPU clock speeds are generally no longer increasing, these methods have a fundamental limit on mining performance.

To overcome these limitations, various parallel methods have been proposed that use multiple CPU cores, multiple machines, or the many cores of a GPU (Graphics Processing Unit). However, these methods also have drawbacks such as relatively slow performance, a limited size of processable data, and poor scalability due to workload skewness.

To overcome the performance limits, numerous parallel item set mining methods have been proposed. They fall into three groups: (1) (CPU-based) multi-thread methods, (2) distributed methods, and (3) GPU-based methods. Because GPU-based methods presuppose multi-threading, the term "multi-thread" may be omitted for them.

The first group, multi-thread methods, focuses on using a multi-core CPU to improve on single-thread performance. Multi-thread methods can indeed improve performance compared with sequential frequent item set mining methods. However, the theoretical performance of a CPU is far below that of a recent GPU, and the gap between CPU and GPU performance keeps widening, so the multi-thread approach is hard to regard as a good choice in terms of performance.

The second group, distributed methods, attempts to accelerate performance by using multiple machines. In general, a serious drawback of distributed methods is the large overhead of network communication.

The third group, GPU-based methods, focuses on improving performance by using the many cores of a GPU. Continued progress in GPU technology keeps raising the theoretical computing performance of modern computers. Because the theoretical computing performance of a GPU is much higher than that of a CPU, GPUs are becoming increasingly important for a wide range of problems, including frequent pattern mining. However, existing GPU-based methods are restricted in data size by the limited GPU memory. In general, GPU memory is much smaller than main memory, and most GPU-based methods can find frequent patterns only in data that is stored in GPU memory. Among GPU-based methods, Frontier Expansion is a state-of-the-art technique that can process data sets larger than GPU memory. However, it still cannot process data as large as CPU-based methods can, because it cannot keep the large amount of data at the intermediate levels of the pattern space in GPU memory.

Conventional FIM methods tend to require computation-intensive processing in proportion to the size of the data set. They may also fail to complete mining because of large intermediate data. Even if the input data set fits in memory, the intermediate data may grow too large to fit. This is more pronounced when FIM is performed on a GPU because of the small capacity of the GPU's device memory.

Existing parallel methods suffer from workload skewness. Workload skewness is very common and has a large effect on the performance of parallel computing. Existing parallel methods divide the search space of the patterns to be explored into multiple pieces (e.g., equivalence classes) and assign each piece to a processor (or machine). Each subtree of the enumeration tree tends to carry a different amount of workload (size). As a result, existing parallel methods scale poorly with the number of CPUs, machines, or GPUs. That is, their speed-up ratio does not grow proportionally even when the number of processors increases.

A data mining technique is needed that remedies the inefficiency of wasting GPU clocks by keeping a large amount of intermediate-level data in GPU memory, the constraint of a large data transfer overhead between main memory and GPU memory, and the constraint of poor scalability of mining techniques with respect to the number of GPUs.

The frequent pattern mining method according to an embodiment aims to solve the problem that the amount of computation on the GPUs grows as the size of the data set grows.

The frequent pattern mining method according to an embodiment aims to reduce the possibility that pattern mining fails because the intermediate data is too large for the GPU memory.

The frequent pattern mining method according to an embodiment aims to solve the inefficiency, in terms of GPU architecture performance and memory usage, that arises from keeping and using patterns at the intermediate levels of the enumeration tree.

The frequent pattern mining method according to an embodiment aims to solve the problem of workload skewness among GPUs.

A frequent pattern mining method according to an embodiment includes: generating blocks from bit vectors corresponding to frequent 1-item sets; copying different relative memory addresses of candidate k-item sets from the main memory of a CPU (Central Processing Unit) to the respective device memories of GPUs (Graphics Processing Units); copying, from the main memory to each of the device memories, at least one identical block, among the blocks, that is required for calculating the supports of the candidate k-item sets; and updating the supports of the candidate k-item sets by synchronizing the partial supports calculated by the GPUs.
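
The data flow of these steps can be illustrated with a minimal sketch in plain Python rather than GPU code. Everything concrete here is an assumption made for illustration: each "device" is simulated by an ordinary function call, each bit vector segment is packed into a Python integer, a relative memory address is modelled as a row index inside a block, candidates are dealt out round-robin, and the names `partial_supports` and `mine_supports` are hypothetical.

```python
def popcount(value):
    """Count the 1 bits in a non-negative integer."""
    return bin(value).count("1")


def partial_supports(block, addresses):
    """Work of one device on one block.

    `block` is a list of packed bit-vector segments, one row per frequent
    1-item set. `addresses` holds one tuple of row indices per candidate.
    """
    results = []
    for rows in addresses:
        acc = block[rows[0]]
        for row in rows[1:]:
            acc &= block[row]          # intersect the transaction sets
        results.append(popcount(acc))  # transactions containing the candidate
    return results


def mine_supports(blocks, candidates, num_devices):
    """Distribute candidates, broadcast each block, and merge partial supports."""
    # Outer operand: every device receives a DIFFERENT share of the candidates.
    shares = [candidates[d::num_devices] for d in range(num_devices)]
    supports = {candidate: 0 for candidate in candidates}
    for block in blocks:
        # Inner operand: the SAME block is handed to every device.
        for share in shares:
            for candidate, part in zip(share, partial_supports(block, share)):
                supports[candidate] += part  # synchronization of partial supports
    return supports


# Two blocks over three frequent 1-item sets (rows 0, 1, 2).
blocks = [[0b1101, 0b1011, 0b0110], [0b11, 0b01, 0b11]]
candidates = [(0, 1), (0, 2), (1, 2), (0, 1, 2)]
supports = mine_supports(blocks, candidates, num_devices=2)
```

Because every device sees every block and only the candidate list is split, the result in this sketch does not depend on the number of simulated devices.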

According to an embodiment, copying the different relative memory addresses may include: copying the relative memory address of a first candidate k-item set to the device memory of a first GPU; and copying the relative memory address of a second candidate k-item set to the device memory of a second GPU. Copying the at least one identical block may include: copying a first block among the blocks to the device memory of the first GPU; and copying the first block to the device memory of the second GPU.

A frequent pattern mining method according to an embodiment further includes generating transaction blocks by vertically partitioning a transaction bitmap that includes the bit vectors, the transaction bitmap being represented in a vertical bitmap layout, and the blocks may be the transaction blocks.
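
As a rough illustration of this step, the sketch below builds one bit vector per frequent item and cuts all vectors at the same transaction boundaries. It assumes that "vertical partitioning" means splitting the bitmap along the transaction axis so that every block holds a segment of every bit vector; that reading, the list-of-bits representation, and the names `build_bitmap` and `partition_vertically` are assumptions, not details confirmed by the text.

```python
def build_bitmap(transactions, frequent_items):
    """One bit vector per frequent item, one position per transaction."""
    return {
        item: [1 if item in transaction else 0 for transaction in transactions]
        for item in frequent_items
    }


def partition_vertically(bitmap, block_width):
    """Cut every bit vector at the same transaction boundaries."""
    length = len(next(iter(bitmap.values())))
    blocks = []
    for start in range(0, length, block_width):
        # Each block keeps the same rows (items) but only a range of transactions.
        blocks.append(
            {item: bits[start:start + block_width] for item, bits in bitmap.items()}
        )
    return blocks


transactions = [{"A", "B"}, {"A"}, {"B", "C"}, {"A", "B"}, {"A", "B", "C"}]
bitmap = build_bitmap(transactions, ["A", "B"])
transaction_blocks = partition_vertically(bitmap, block_width=2)
```

Every block produced this way has the same row layout, which is what lets a single address for an item be reused across blocks.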

According to an embodiment, copying the different relative memory addresses may include: generating the relative memory addresses of the candidate k-item sets using a dictionary that maps each frequent 1-item set of the transaction bitmap to a relative memory address; and copying each of the generated relative memory addresses to the respective device memory as an outer operand.

According to an embodiment, generating the relative memory addresses may include: identifying the frequent 1-item sets included in a candidate k-item set; and generating the relative memory address of the candidate k-item set by combining the relative memory addresses of the identified frequent 1-item sets.
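
A small sketch of the dictionary lookup follows. It assumes, purely for illustration, that a relative memory address is a byte offset from the start of a block and that the bit vector segments of the frequent 1-item sets are stored back to back; the names `build_address_dictionary` and `candidate_addresses` and the 4-byte segment size are hypothetical.

```python
def build_address_dictionary(frequent_items, vector_bytes):
    """Map each frequent 1-item set to its byte offset inside a block."""
    return {item: index * vector_bytes for index, item in enumerate(frequent_items)}


def candidate_addresses(candidate, dictionary):
    """Combine the offsets of the frequent 1-item sets that make up a candidate."""
    return tuple(dictionary[item] for item in sorted(candidate))


dictionary = build_address_dictionary(["A", "B", "C", "D"], vector_bytes=4)
addresses = candidate_addresses({"A", "C", "D"}, dictionary)
```

Under these assumptions the offsets are independent of which block is currently loaded, so the candidate addresses can be copied to a device once and reused for each block.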

According to an embodiment, copying the at least one identical block may include copying one of the transaction blocks to each of the device memories as an inner operand. Updating the supports may include: calculating, by each of the GPUs, partial supports corresponding to its relative memory addresses using the transaction block and those relative memory addresses; and updating the supports of the candidate k-item sets by synchronizing the partial supports corresponding to the relative memory addresses.

A frequent pattern mining method according to an embodiment further includes: generating fragments by dividing the frequent 1-item sets; generating, for each fragment, item sets that are all combinations of the frequent 1-item sets in that fragment; calculating bit vectors corresponding to the generated item sets; and generating fragment blocks by vertically partitioning a transaction bitmap that includes the calculated bit vectors, the transaction bitmap being represented in a vertical bitmap layout, and dividing it by fragment. The blocks may be the fragment blocks.
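
The fragment steps can be sketched as follows. The sketch assumes that fragments are consecutive fixed-size groups of frequent items and that each bit vector is packed into a Python integer; the fragment size and the names `make_fragments` and `fragment_bit_vectors` are hypothetical choices for illustration.

```python
from itertools import combinations


def make_fragments(frequent_items, fragment_size):
    """Divide the frequent 1-item sets into consecutive groups."""
    return [
        frequent_items[i:i + fragment_size]
        for i in range(0, len(frequent_items), fragment_size)
    ]


def fragment_bit_vectors(fragment, item_vectors):
    """Bit vector of every non-empty combination of items in one fragment."""
    vectors = {}
    for size in range(1, len(fragment) + 1):
        for combo in combinations(fragment, size):
            acc = item_vectors[combo[0]]
            for item in combo[1:]:
                acc &= item_vectors[item]  # transactions containing all items
            vectors[frozenset(combo)] = acc
    return vectors


item_vectors = {"A": 0b1101, "B": 0b1011, "C": 0b0110, "D": 0b1110}
fragments = make_fragments(["A", "B", "C", "D"], fragment_size=2)
precomputed = [fragment_bit_vectors(fragment, item_vectors) for fragment in fragments]
```

A fragment of m items yields 2^m - 1 precomputed vectors, so in a sketch like this the fragment size trades memory for fewer AND operations later.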

According to an embodiment, copying the different relative memory addresses may include: generating the relative memory addresses of the candidate k-item sets using a dictionary that maps each item set of the transaction bitmap to a relative memory address; and copying each of the relative memory addresses to the respective device memory as an outer operand.

According to an embodiment, generating the relative memory addresses may include: identifying the item sets included in a candidate k-item set; and generating the relative memory address of the candidate k-item set by combining the relative memory addresses of the identified item sets.
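
One way to read "identifying the item sets included in a candidate" is to intersect the candidate with each fragment and look up the precomputed item set for each non-empty intersection. The sketch below follows that reading, which is an assumption rather than something the text spells out; the row-index addresses and the names `decompose` and `candidate_addresses` are likewise hypothetical.

```python
def decompose(candidate, fragments):
    """Split a candidate into its non-empty intersection with each fragment."""
    parts = []
    for fragment in fragments:
        part = frozenset(candidate) & frozenset(fragment)
        if part:
            parts.append(part)
    return parts


def candidate_addresses(candidate, fragments, dictionary):
    """Combine the addresses of the precomputed item sets covering a candidate."""
    return [dictionary[part] for part in decompose(candidate, fragments)]


fragments = [["A", "B"], ["C", "D"]]
# Hypothetical dictionary: precomputed item set -> relative address (row index).
dictionary = {
    frozenset({"A"}): 0, frozenset({"B"}): 1, frozenset({"A", "B"}): 2,
    frozenset({"C"}): 3, frozenset({"D"}): 4, frozenset({"C", "D"}): 5,
}
addresses = candidate_addresses({"A", "B", "D"}, fragments, dictionary)
```

Here the 3-item candidate needs only two addresses, because {A, B} is already available as one precomputed bit vector.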

According to an embodiment, copying the at least one identical block may include copying at least one of the fragment blocks to each of the device memories as an inner operand. Updating the supports may include: calculating, by each of the GPUs, partial supports corresponding to its relative memory addresses using the fragment block and those relative memory addresses; and updating the supports of the candidate k-item sets by synchronizing the partial supports corresponding to the relative memory addresses.

A frequent pattern mining method according to an embodiment includes: mining frequent 1-item sets from transaction data; generating transaction blocks by vertically partitioning a transaction bitmap that includes bit vectors corresponding to the frequent 1-item sets, the transaction bitmap being represented in a vertical bitmap layout; calculating the supports of candidate k-item sets from the transaction blocks using GPUs; and mining frequent k-item sets based on the supports.

According to an embodiment, calculating the supports may include: generating the relative memory addresses of the candidate k-item sets using a dictionary that maps each frequent 1-item set of the transaction bitmap to a relative memory address; copying each of the relative memory addresses to the respective device memory of the GPUs as an outer operand; copying one of the transaction blocks to each of the device memories as an inner operand; calculating, by each of the GPUs, partial supports corresponding to its relative memory addresses using the transaction block and those relative memory addresses; and updating the supports of the candidate k-item sets by synchronizing the partial supports.

According to an embodiment, calculating the supports may further include: copying a second transaction block to each of the device memories as an inner operand; calculating, by each of the GPUs, second partial supports corresponding to its relative memory addresses using the second transaction block and those relative memory addresses; and updating the supports of the candidate k-item sets by synchronizing the second partial supports.

A frequent pattern mining method according to an embodiment includes: mining frequent 1-item sets from transaction data; generating fragments by dividing the frequent 1-item sets; generating, for each fragment, item sets that are all combinations of the frequent 1-item sets in that fragment; calculating bit vectors corresponding to the generated item sets; generating fragment blocks by vertically partitioning a transaction bitmap that includes the calculated bit vectors, the transaction bitmap being represented in a vertical bitmap layout, and dividing it by fragment; calculating the supports of candidate k-item sets from the fragment blocks using GPUs; and mining frequent k-item sets based on the supports.

According to an embodiment, calculating the supports may include: generating the relative memory addresses of the candidate k-item sets using a dictionary that maps each item set of the transaction bitmap to a relative memory address; copying each of the relative memory addresses to the respective device memory of the GPUs as an outer operand; copying at least one of the fragment blocks to each of the device memories as an inner operand; calculating, by each of the GPUs, partial supports corresponding to its relative memory addresses using the fragment block and those relative memory addresses; and updating the supports of the candidate k-item sets by synchronizing the partial supports.

According to an embodiment, calculating the supports may further include: copying at least one second fragment block to each of the device memories as an inner operand; calculating, by each of the GPUs, second partial supports corresponding to its relative memory addresses using the second fragment block and those relative memory addresses; and updating the supports of the candidate k-item sets by synchronizing the second partial supports.

A frequent pattern mining method according to an embodiment includes: mining frequent 1-item sets from transaction data; selecting a TFL (Traversal from the First Level) strategy or an HIL (Hopping from Intermediate Level) strategy; generating blocks from bit vectors corresponding to the frequent 1-item sets based on the selected strategy; calculating the supports of candidate k-item sets from the blocks using GPUs; and mining frequent k-item sets based on the supports.

A frequent pattern mining apparatus according to an embodiment includes a CPU and a main memory. The CPU may generate blocks from bit vectors corresponding to frequent 1-item sets; copy different relative memory addresses of candidate k-item sets from the main memory to the respective device memories of GPUs; copy, from the main memory to each of the device memories, at least one identical block, among the blocks, that is required for calculating the supports of the candidate k-item sets; and update the supports of the candidate k-item sets by synchronizing the partial supports calculated by the GPUs.

The frequent pattern mining method according to an embodiment can process large-scale data quickly and maximize computational efficiency by making full use of the computing power of GPUs.

The frequent pattern mining method according to an embodiment can provide a robust FIM technique by not materializing intermediate data.

The frequent pattern mining method according to an embodiment can use the GPU architecture and memory efficiently by performing a large amount of computation on a small amount of data (for example, mining the frequent 1-item sets at the first level of the enumeration tree).

The frequent pattern mining method according to an embodiment can perform a moderate amount of computation on a moderate amount of data in order to improve processing performance on data sets that contain long patterns.

The frequent pattern mining method according to an embodiment can mitigate workload skewness by distributing the relative memory addresses of the candidate item sets among the GPUs and streaming the transaction blocks to all of the GPUs.

The frequent pattern mining method according to an embodiment can resolve workload skewness and thereby improve mining performance linearly as the number of GPUs increases.

FIG. 1 is a diagram illustrating a frequent pattern mining method according to an embodiment.
FIG. 2 is a diagram illustrating a transaction block according to an embodiment.
FIG. 3 is a diagram illustrating the operation of the TFL strategy according to an embodiment.
FIG. 4 is a diagram illustrating a kernel function according to an embodiment.
FIG. 5 is a diagram illustrating a fragment block according to an embodiment.
FIG. 6 is a diagram illustrating the operation of the HIL strategy according to an embodiment.
FIG. 7 is a diagram illustrating an operation that utilizes multiple GPUs according to an embodiment.
FIG. 8 is a diagram illustrating an example configuration of a frequent pattern mining apparatus according to an embodiment.
FIG. 9 is a flowchart illustrating a frequent pattern mining method according to an embodiment.
FIG. 10 is a flowchart illustrating system initialization according to an embodiment.
FIG. 11 is a flowchart illustrating an operation of calculating supports according to an embodiment.
FIG. 12 is a flowchart illustrating an operation of calculating partial supports according to an embodiment.

Specific structural or functional descriptions of the embodiments are disclosed for illustrative purposes only, and the embodiments may be modified and practiced in various forms. Accordingly, the embodiments are not limited to the particular forms disclosed, and the scope of this specification includes modifications, equivalents, and alternatives that fall within the technical idea.

Terms such as "first" and "second" may be used to describe various components, but such terms should be interpreted only as distinguishing one component from another. For example, a first component may be referred to as a second component, and similarly a second component may be referred to as a first component.

When a component is referred to as being "connected" to another component, it may be directly connected or coupled to that other component, but it should be understood that intervening components may also be present.

Singular expressions include plural expressions unless the context clearly indicates otherwise. In this specification, terms such as "comprise" or "have" are intended to specify the presence of the stated features, numbers, steps, operations, components, parts, or combinations thereof, and should be understood not to preclude the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Unless otherwise defined, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the relevant art. Terms such as those defined in commonly used dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the relevant art and, unless explicitly defined in this specification, are not to be interpreted in an idealized or overly formal sense.

Hereinafter, embodiments are described in detail with reference to the accompanying drawings. The same reference numerals in the drawings denote the same members.

Overview

A frequent pattern mining apparatus according to an embodiment can mine frequent patterns from large-scale data using GPUs. The frequent pattern mining apparatus is an apparatus that mines frequent patterns from data; for example, it can mine frequent item sets from transaction data, and it may be implemented as a software module, a hardware module, or a combination thereof.

Frequent Itemset Mining (FIM) is a technique for finding frequent item sets in given data. Given a transaction database D = {t_1, t_2, ..., t_n}, where each transaction t_i is a subset of the items in I[1, N], FIM determines the set F of all item sets that occur as a subset of at least a predefined fraction minsup of the transactions in D. In the following, however, the notion of support is used instead of a fraction, and the support of an item set means the number of times that item set occurs.

In transaction data D organized as a database as in Table 1, Tid is the identifier corresponding to a transaction, and a transaction is the set of items corresponding to a Tid. For example, the transaction whose Tid is 1 is {Beer, Nuts, Diaper}.

Given transaction data D as in Table 1 and a minimum support minsup, FIM finds all itemsets in D whose support is greater than or equal to minsup. Support means the number of occurrences of an itemset in D, and an itemset that occurs at least minsup times is called a frequent itemset. For example, if minsup is 3 for the transaction data D of Table 1, FIM finds the itemsets {Beer}, {Diaper}, and {Beer, Diaper} from the input D. The itemsets found, {Beer}, {Diaper}, and {Beer, Diaper}, are output as frequent itemsets. A frequent itemset containing n items is called a frequent n-itemset. In the transaction data D of Table 1, {Beer} and {Diaper} are frequent 1-itemsets, and {Beer, Diaper} is a frequent 2-itemset.
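The definition of support can be illustrated with a minimal brute-force sketch in Python. The database below is a hypothetical stand-in, not Table 1: only the transaction with Tid 1 is taken from the text, and the remaining rows are assumptions chosen so that the same three itemsets come out as frequent at minsup = 3. The function names are illustrative only.

```python
from itertools import combinations

# Hypothetical toy database. Only Tid 1 comes from the text; Tids 2-5 are assumed.
D = {
    1: {"Beer", "Nuts", "Diaper"},
    2: {"Beer", "Coffee", "Diaper"},
    3: {"Beer", "Diaper", "Eggs"},
    4: {"Nuts", "Eggs", "Milk"},
    5: {"Coffee", "Milk"},
}

def support(itemset, db):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in db.values() if itemset <= t)

def frequent_itemsets(db, minsup):
    """Brute-force FIM: test every non-empty itemset over the items in db."""
    items = sorted(set().union(*db.values()))
    result = {}
    for n in range(1, len(items) + 1):
        for combo in combinations(items, n):
            s = support(set(combo), db)
            if s >= minsup:
                result[frozenset(combo)] = s
    return result

F = frequent_itemsets(D, minsup=3)
```

Exhaustive enumeration is exponential in the number of items and is shown only to make the definition concrete.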

The GPU-based frequent itemset mining technique according to an embodiment is referred to as GMiner, and a method of performing this technique is referred to as a frequent pattern mining method. The frequent pattern mining apparatus may mine frequent patterns by performing the frequent pattern mining method according to GMiner.

GMiner can improve mining speed by making full use of the computational power of GPUs. GMiner can handle large-scale data as CPU-based methods do, and it can solve the above-described problems of conventional GPU-based methods. GMiner proposes the following concept: instead of keeping and using patterns at the intermediate levels of the enumeration tree, GMiner mines frequent patterns from the first level of the enumeration tree. Because GMiner does not materialize intermediate data, it can provide an FIM technique that is highly effective and robust in terms of the performance and memory usage of GPU-based methods. According to an embodiment, GMiner proposes the Traversal from the First Level (TFL) strategy.

Under the TFL strategy, the frequent pattern mining apparatus does not keep a projected database or the frequent itemsets of the intermediate levels (of the enumeration tree) in the device memory of the GPU. Instead, it finds all frequent itemsets using only the frequent 1-itemsets of the first level (denoted F₁). Under the TFL strategy, the apparatus can improve mining performance while reducing the usage of the GPU's device memory.

The TFL strategy may suit GPU architectures, in which the gap between processor speed and memory speed is very large. On a GPU architecture, mining frequent n-itemsets by performing a large amount of computation on the relatively small set F₁ can be faster than mining frequent n-itemsets by performing a small amount of computation on the relatively large set of (n-1)-itemsets. Under the TFL strategy, the frequent pattern mining apparatus can maximize the performance improvement of state-of-the-art parallel methods, including multi-threaded, distributed, and GPU-based techniques.

According to an embodiment, GMiner proposes not only the TFL strategy but also the Hopping from Intermediate Level (HIL) strategy, in order to improve mining performance on data sets that contain long patterns. Under the HIL strategy, the frequent pattern mining apparatus can reduce the amount of computation by using more of the GPU's device memory, and can thereby improve mining performance for long patterns.

In addition to using memory efficiently and speeding up mining, the frequent pattern mining apparatus can mitigate the workload skew that exists in conventional parallel methods. As a result, the performance of GMiner can improve linearly as the number of GPUs increases. To mitigate workload skew, GMiner proposes the concepts of a transaction block and a relative memory address.

A transaction block is a fixed-size chunk of the bitwise representation of transactions, and relative memory addresses are an array representation of candidate itemsets. For parallel processing, the frequent pattern mining apparatus does not divide the search space of the enumeration tree into subtrees. Instead, it divides the array of relative memory addresses into multiple subarrays of the same size. The apparatus then stores one subarray on each GPU and performs mining by streaming the transaction blocks to all GPUs. In this way, every GPU receives the same amount of workload.
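The even split of the candidate array can be sketched as follows, with GPUs simulated by plain Python loops. This is an illustration under stated assumptions, not the patented implementation: candidates are identified by tuples of item names rather than by relative memory addresses, each block is a dictionary of bit strings, and the rows for items B and C are hypothetical (only the rows for A follow the bit vector "100111001010" given for FIG. 2).

```python
def split_evenly(candidates, num_gpus):
    """Split the candidate array into num_gpus contiguous, near-equal subarrays."""
    base, extra = divmod(len(candidates), num_gpus)
    parts, start = [], 0
    for g in range(num_gpus):
        size = base + (1 if g < extra else 0)
        parts.append(candidates[start:start + size])
        start += size
    return parts

def partial_supports(part, block):
    """What one simulated GPU computes: supports of its candidates in one block."""
    out = {}
    for cand in part:
        joined = int(block[cand[0]], 2)
        for item in cand[1:]:
            joined &= int(block[item], 2)   # bitwise AND of the item rows
        out[cand] = bin(joined).count("1")  # population count
    return out

candidates = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "B", "C")]
blocks = [  # rows for B and C are assumed
    {"A": "1001", "B": "1100", "C": "0011"},
    {"A": "1100", "B": "1010", "C": "0011"},
    {"A": "1010", "B": "1100", "C": "0011"},
]

parts = split_evenly(candidates, num_gpus=2)
supports = {c: 0 for c in candidates}
for block in blocks:          # every block is streamed to all GPUs
    for part in parts:        # each simulated GPU handles only its own subarray
        for cand, ps in partial_supports(part, block).items():
            supports[cand] += ps
```

Because every subarray has (nearly) the same length and every GPU sees the same blocks, the number of AND operations per GPU is balanced regardless of the shape of the enumeration tree.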

Hereinafter, an overview of the frequent pattern mining method according to an embodiment is described with reference to FIG. 1, embodiments related to the TFL strategy are described with reference to FIGS. 2 to 4, embodiments related to the HIL strategy are described with reference to FIGS. 5 and 6, and an embodiment of operation using multiple GPUs is described with reference to FIG. 7. The components of the frequent pattern mining apparatus according to an embodiment are described with reference to FIG. 8, and the operations of the frequent pattern mining method according to an embodiment are described with reference to FIGS. 9 to 12.

FIG. 1 is a diagram for explaining a frequent pattern mining method according to an embodiment.

Referring to FIG. 1, the frequent pattern mining apparatus may load transaction data and mine frequent 1-itemsets (101) from the loaded transaction data. Here, the transaction data is the data targeted for frequent pattern mining and includes at least one transaction corresponding to at least one item. As described above, the apparatus may mine frequent patterns using the GMiner technique, and the operations described below may be performed by the GMiner technique.

The frequent pattern mining apparatus may generate blocks (103) from bit vectors (102) corresponding to the frequent 1-itemsets (101). The apparatus may process operations related to frequent pattern mining using a CPU (104). According to an embodiment, the apparatus includes the CPU (104), a main memory (105), and GPUs (106 and 107). The GPUs (106 and 107) include device memories (108 and 109), respectively. However, the apparatus may also be implemented by at least one of the CPU (104) and the GPUs (106 and 107).

The frequent pattern mining apparatus may control the main memory (105) and the GPUs (106 and 107) using the CPU (104). For example, it may control data read/write operations associated with the main memory (105) and with the device memories (108 and 109) of the GPUs (106 and 107), and it may control instructions or processing related to the AND operations of the GPUs (106 and 107). The apparatus may write the generated blocks (103) to the main memory (105). Details of generating the blocks (103) are described later.

The frequent pattern mining apparatus may copy different relative memory addresses (110 and 111) of candidate k-itemsets from the main memory (105) of the CPU (104) to the respective device memories (108 and 109) of the GPUs (106 and 107). For example, the apparatus may copy relative memory address RA₁ (110) to device memory (108) and relative memory address RA₂ (111) to device memory (109). A relative memory address is an address for identifying one of the candidate k-itemsets and may be expressed based on the items in that candidate k-itemset. Details of generating and copying the relative memory addresses (110 and 111) are described later.

The frequent pattern mining apparatus may copy, from the main memory (105) to each of the device memories (108 and 109), at least one identical block, among the blocks (103), that is required to compute the supports of the candidate k-itemsets. For example, the apparatus may copy block TB₁ (112), one of the blocks TB₁ and TB₂ (103), from the main memory (105) to both device memory (108) and device memory (109). Details of copying the blocks (103) are described later.

The frequent pattern mining apparatus may update the supports of the candidate k-itemsets by synchronizing the partial supports (113 and 114) computed by the GPUs (106 and 107). GPU₁ (106) may compute the partial support PS₁ (113) of the candidate k-itemset corresponding to relative memory address RA₁, based on RA₁ and block TB₁ copied to device memory (108). Likewise, GPU₂ (107) may compute the partial support PS₂ (114) of the candidate k-itemset corresponding to relative memory address RA₂, based on RA₂ and block TB₁ copied to device memory (109). The apparatus may copy the partial supports PS₁ (113) and PS₂ (114) computed by the GPUs (106 and 107) to the main memory. The apparatus may update the supports of the candidate k-itemsets based on the partial supports copied to the main memory. The apparatus may then compare the updated supports with a predefined minimum support to mine at least one frequent k-itemset from among the candidate k-itemsets. Details of computing the partial supports (113 and 114) and updating the supports are described later.

"TFL (Traversal from the First Level) Strategy"

The Traversal from the First Level (TFL) strategy mines frequent patterns from the first level of the enumeration tree, that is, F₁. Using the TFL strategy, the frequent pattern mining apparatus can perform frequent itemset mining on large-scale data at high speed. Under the TFL strategy, the apparatus does not keep the frequent itemsets of the intermediate levels of the enumeration tree or a projected database (transaction data) in the GPU's device memory. Instead, it finds all frequent itemsets using only F₁.

Through the TFL strategy, the frequent pattern mining apparatus can greatly reduce the usage of the GPU's device memory and can therefore process large-scale data without running out of memory. The apparatus performs pattern mining while transaction data is being streamed from the main memory to the GPU's device memory, which can eliminate the data transfer overhead between the main memory and the device memory. Streaming here includes the concept of streaming copy. The apparatus can solve the workload skew problem by using a block-based streaming scheme. Transaction blocks, the block-based streaming technique, and the algorithm of the TFL strategy are each described below.

"Transaction Blocks"

FIG. 2 is a diagram for explaining transaction blocks according to an embodiment.

To fully exploit the computational power of GPUs, building regular memory access patterns and keeping data structures simple are important for workload balance among thousands of GPU cores and for coalesced memory access. Compared with those of CPUs, the memory hierarchy and arithmetic logic units (ALUs) of GPUs can be inefficient at handling complex, variable-size data structures such as sets, lists, maps, and combinations thereof. A GPU has only limited device memory, and this limited memory size can constrain the use of GPUs to mine frequent itemsets from large-scale and/or dense data sets.

In consideration of computational efficiency, the frequent pattern mining apparatus may adopt a vertical bitmap layout for data representation. The horizontal layout and the (non-bitmap) vertical layout are complex and irregular and may therefore be unsuitable for improving GPU computational efficiency. FIG. 2 shows transaction data (201) expressed in the horizontal layout, that is, in a horizontal format.

According to an embodiment, the frequent pattern mining apparatus may perform bitwise AND operations on a large-scale bitmap using the vertical bitmap layout. Because the apparatus can perform the bitwise AND operations on GPUs, it can improve processing speed compared with using CPUs. Because the apparatus uses the vertical bitmap layout, it can partition the input database into sub-databases such that each fits in the main memory or in the GPU's device memory.

Input data D expressed in the vertical bitmap layout is referred to as a transaction bitmap. The frequent pattern mining apparatus may mine the frequent 1-itemsets F₁ from the transaction data and store F₁ as a transaction bitmap expressed in the vertical bitmap layout.

Referring to FIG. 2, the transaction bitmap (202) includes bit vectors respectively corresponding to the frequent 1-itemsets and is expressed in the vertical bitmap layout. For example, the bit vector corresponding to frequent 1-itemset A is "100111001010". According to an embodiment, the frequent pattern mining apparatus may obtain or directly generate a transaction bitmap containing the bit vectors of F₁.
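The conversion from a horizontal layout to a vertical bitmap can be sketched as follows. The twelve transactions are hypothetical, since the contents of FIG. 2 are not reproduced in the text. They are chosen only so that the column for item A equals the bit vector "100111001010" quoted above, and the function name is illustrative.

```python
def to_vertical_bitmap(transactions, frequent_items):
    """Map each frequent item to a bit string with one bit per transaction."""
    return {
        item: "".join("1" if item in t else "0" for t in transactions)
        for item in frequent_items
    }

# Hypothetical horizontal data; only A's resulting column is taken from the text.
transactions = [
    {"A", "B"}, {"B"}, {"C"}, {"A", "C"}, {"A", "B"}, {"A"},
    {"B", "C"}, {"C"}, {"A", "B"}, {"B"}, {"A", "C"}, {"C"},
]
bitmap = to_vertical_bitmap(transactions, ["A", "B", "C"])
```

Each row of the resulting bitmap has exactly |D| bits, which is the regular, fixed-width structure that the preceding paragraphs describe as suitable for GPUs.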

According to an embodiment, the frequent pattern mining apparatus may vertically partition the transaction bitmap (202) by a predefined size. The apparatus may generate transaction blocks TB₁, TB₂, and TB₃ (203, 204, and 205) based on the vertically partitioned transaction bitmap. The apparatus may generate the transaction blocks in consideration of the size of the device memories of the GPUs. "Definition 1" below defines the vertical partitioning of a transaction bitmap.

Definition 1. (Transaction bitmap partitioning) Transaction bitmap partitioning means vertically dividing a transaction bitmap TB into R non-overlapping partitions of the same width. Here, the partitions are denoted TB_{1:R}, and TB_k denotes the k-th partition of the transaction bitmap (1 ≤ k ≤ R).

A transaction block generated by vertical partitioning is a partition produced by the transaction bitmap partitioning of "Definition 1". The k-th transaction block of TB_{1:R} is denoted TB_k. Because the frequent pattern mining apparatus starts mining from the frequent 1-itemsets F₁, the size of TB is |F₁| × |D| in bits, where |D| is the total number of transactions. For example, in the transaction bitmap (202) of FIG. 2, |F₁| = 7 and |D| = 12. The width of a single partition of the transaction bitmap is denoted W, so the size of TB_k is |F₁| × W. For example, the width W of the transaction blocks in FIG. 2 is 4, so the size of TB₁ (203) is 7 × 4. To guarantee that every transaction block has width W, the apparatus may perform preprocessing that pads TB_R, the last transaction block of TB_{1:R}, with at least one 0 when its width is smaller than W.
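The partitioning and zero padding described above can be sketched as follows, assuming the bitmap is held as a dictionary of bit strings (an illustrative representation, not the one used by the apparatus). With W = 4, the bit vector of A from FIG. 2 splits into three blocks without padding. W = 5 is an assumed value used only to show the padding of the last block.

```python
def partition_bitmap(bitmap, W):
    """Split every bit vector into R blocks of width W, zero-padding the last one."""
    length = len(next(iter(bitmap.values())))
    R = -(-length // W)  # ceiling division
    blocks = []
    for k in range(R):
        block = {
            item: bits[k * W:(k + 1) * W].ljust(W, "0")  # pad TB_R with 0s if short
            for item, bits in bitmap.items()
        }
        blocks.append(block)
    return blocks

blocks = partition_bitmap({"A": "100111001010"}, W=4)
padded = partition_bitmap({"A": "100111001010"}, W=5)
```

Padding with zeros cannot change any support, because a 0 bit contributes nothing to the population count of an AND result.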

According to an embodiment, the frequent pattern mining apparatus may set the parameter W small enough that each TB_k fits in the GPU's device memory. The apparatus may allocate each transaction block consecutively in the main memory or store it as a chunk in secondary storage, such as a disk page. Although an embodiment in which the transaction bitmap is expressed in a vertical format has been described, the apparatus may also generate transaction blocks by partitioning transactions expressed in a horizontal format, depending on design intent, system environment, or efficiency. Likewise, although an embodiment in which transaction blocks are partitioned to the same width has been described, the apparatus may also generate transaction blocks of different widths, depending on design intent, system environment, or efficiency.

A frequent 1-itemset x (x ∈ F₁) has a bit vector of length |D| in TB. Referring to FIG. 2, the frequent 1-itemset {A} corresponds to the bit vector "100111001010" of length 12. By vertical partitioning, each bit vector in TB is divided into R bit vectors of length W. The bit vector of x within TB_k is denoted TB_k(x). Referring to FIG. 2, TB₁({A}) is "1001". As described above, TB contains only the bit vectors corresponding to frequent 1-itemsets. Therefore, if x is a frequent n-itemset, x has n bit vectors in TB_k, namely {TB_k(i) | i ∈ x}. "Definition 2" below defines the set of physical pointers to the bit vectors of a frequent itemset x in the transaction bitmap.

Definition 2. (Relative memory address) The relative memory address of an item i is denoted RA(i) and is defined as the distance, in bytes, from the starting memory address of TB_k to the starting memory address of TB_k(i). The set of relative memory addresses of a frequent itemset x is denoted RA(x) and is defined as {RA(i) | i ∈ x}.
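Definition 2 can be illustrated with a minimal sketch. It assumes a layout that the text does not specify: each block stores one W-bit row per frequent item, back to back, in a fixed item order, with W a multiple of 8 so that rows start on byte boundaries. The item names, the value W = 32, and the function names are all hypothetical.

```python
def relative_addresses(frequent_items, W):
    """RA(i): byte offset of item i's row from the start of any block TB_k.

    Assumes rows of W bits are stored contiguously in the order of
    `frequent_items` (a hypothetical layout chosen for this sketch).
    """
    if W % 8:
        raise ValueError("W must be a multiple of 8 bits in this sketch")
    row_bytes = W // 8
    return {item: idx * row_bytes for idx, item in enumerate(frequent_items)}

def RA(itemset, ra_table):
    """RA(x) = {RA(i) | i in x}."""
    return {ra_table[i] for i in itemset}

ra_table = relative_addresses(["A", "B", "C", "D"], W=32)
ra_x = RA({"A", "C"}, ra_table)
```

Under this layout the offsets depend only on the item order and on W, not on k, so the same set of addresses identifies an itemset in every block, and distinct items always map to distinct addresses.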

The concept of a relative memory address provides fast access to the memory location of an item (or the memory locations of an itemset) within a single transaction block in the main memory or in the GPU's device memory. The frequent pattern mining apparatus may use RA(x) as a kind of identifier of itemset x. Let |x| be the number of items in x and |RA(x)| the number of distinct memory addresses in RA(x). Because each item i ∈ x has a unique memory address in TB_k, |x| = |RA(x)|. RA(x) of a frequent itemset x is the same in every TB_k (1 ≤ k ≤ R). That is, because the size of TB_k is fixed, RA(x) always consists of the same relative memory addresses.

"Nested-Loop Streaming"

The frequent pattern mining apparatus may find frequent itemsets at each level of the itemset lattice by repeating two main steps: candidate generation and testing (support counting). Of the two steps, the testing step can be far more computationally intensive than the candidate generation step. The apparatus may focus on accelerating the testing step using GPUs and may perform the candidate generation step using CPUs.
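The generate-and-test loop can be sketched as follows. The text does not say how candidates are generated, so the Apriori-style join used here is an assumption, and the testing step is a plain CPU scan standing in for the GPU-side support counting. The toy transactions are hypothetical except for the first one, which is the Tid 1 transaction quoted earlier.

```python
from itertools import combinations

def generate_candidates(frequent_k):
    """Assumed Apriori-style join: union pairs of frequent k-itemsets that differ
    in one item, keeping candidates whose k-subsets are all frequent."""
    frequent_k = set(frequent_k)
    k = len(next(iter(frequent_k)))
    out = set()
    for a, b in combinations(frequent_k, 2):
        u = a | b
        if len(u) == k + 1 and all(
            frozenset(s) in frequent_k for s in combinations(u, k)
        ):
            out.add(u)
    return out

def mine(transactions, minsup):
    """Repeat candidate generation and testing (support counting) level by level."""
    items = set().union(*transactions)
    level = {frozenset([i]) for i in items
             if sum(i in t for t in transactions) >= minsup}
    frequent = set(level)
    while level:
        candidates = generate_candidates(level)              # CPU side
        level = {c for c in candidates                       # testing step
                 if sum(c <= t for t in transactions) >= minsup}
        frequent |= level
    return frequent

toy = [
    {"Beer", "Nuts", "Diaper"}, {"Beer", "Coffee", "Diaper"},
    {"Beer", "Diaper", "Eggs"}, {"Nuts", "Eggs", "Milk"}, {"Coffee", "Milk"},
]
result = mine(toy, minsup=3)
```

Each pass of the `while` loop corresponds to one level of the itemset lattice.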

According to an embodiment, to fully exploit the massive parallelism of GPUs and achieve workload balance, the frequent pattern mining apparatus may perform mining using breadth-first search (BFS) traversal rather than depth-first search (DFS) traversal. However, the apparatus may also use DFS traversal, depending on design intent or system efficiency.

When BFS traversal is used for pattern mining, the number of frequent itemsets at a given level may be too large for them to be kept in the limited device memory of a GPU and used for support counting of the candidate itemsets of the next level. This problem can worsen as the number of transactions grows, but the frequent pattern mining apparatus can solve it using the TFL strategy.

The frequent pattern mining apparatus can mine frequent itemsets from large-scale data sets within the limited device memory of a GPU without degrading performance. The entire set of frequent 1-itemsets is referred to as the first level of the itemset lattice, and the other levels of the itemset lattice are referred to as intermediate levels. If the frequent itemsets of the intermediate levels are materialized to reduce computational overhead, a large amount of intermediate data is generated, which can cause out-of-memory problems. This tendency is more pronounced with GPUs, because the capacity of a GPU's device memory is more limited than that of the main memory.

Using the TFL strategy, the frequent pattern mining apparatus can test all candidate itemsets of the intermediate levels using only the first level, that is, F₁. A GPU has very powerful computing capability, for example for massive bitwise operations, but its device memory is small relative to the main memory, so the apparatus performs mining using only F₁. On a GPU architecture, testing candidate (n+1)-itemsets using the frequent 1-itemsets is much faster than testing candidate (n+1)-itemsets using the frequent n-itemsets (that is, F_n). The reason is that copying F_n to the GPU's device memory incurs a much larger data transfer overhead than copying F₁, and accessing F_n in the device memory incurs more non-coalesced memory accesses than accessing F₁.
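Testing a candidate from the first level alone can be sketched as follows: the support of any candidate is obtained by ANDing the bit vectors of its items and counting the 1 bits, so no bit vector of an intermediate level is ever stored. The bit vector for A is the one given for FIG. 2. Those for B and C are hypothetical, and the function runs on the CPU purely for illustration.

```python
from functools import reduce

def support_from_f1(itemset, f1_bits):
    """Support of `itemset`, computed only from first-level (F1) bit vectors."""
    vectors = [int(f1_bits[i], 2) for i in itemset]
    joined = reduce(lambda a, b: a & b, vectors)  # n-1 ANDs for an n-itemset
    return bin(joined).count("1")                 # population count

f1_bits = {
    "A": "100111001010",   # from the text
    "B": "110010101100",   # assumed
    "C": "001100110011",   # assumed
}
```

A 3-itemset costs two AND operations here instead of the single AND needed when the bit vector of a frequent 2-itemset is available, which is the trade of extra computation for lower memory use that the paragraph above describes.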

To mine large-scale data, a new itemset mining technique on GPUs, referred to as nested-loop streaming, is proposed. A single series of the candidate generation and testing steps is called an iteration. The frequent pattern mining apparatus may perform nested-loop streaming in each iteration. In nested-loop streaming, the apparatus copies information about the candidate itemsets to the GPUs as the outer operand. Specifically, instead of the candidate itemsets themselves, the apparatus copies the relative memory addresses of the candidate itemsets to the GPUs as the outer operand.

Let C_L denote the candidate itemsets at level L. The frequent pattern mining apparatus may copy RA(C_L) = {RA(x) | x ∈ C_L} to the GPUs. Where there is no ambiguity, RA(C_L) is written simply as RA. The apparatus may copy the transaction blocks of the first level, i.e., TB_{1:R}, to the GPUs as the inner operand. The outer operand RA, the inner operand TB_{1:R}, or both may not fit in the GPU device memory. The apparatus may therefore partition the outer operand RA into RA_{1:Q} and copy each RA_j (1 ≤ j ≤ Q) to the GPUs one at a time. For each RA_j, the apparatus may stream each piece of the inner operand, i.e., each transaction block TB_k (1 ≤ k ≤ R), to the GPUs. At most intermediate levels, the outer operand RA may be much smaller than the inner operand TB. In particular, if the entire RA can be kept in GPU memory (i.e., Q = 1), the apparatus may simply stream each TB_k to the GPUs.

For each pair of RA_j and TB_k, the frequent pattern mining apparatus may calculate the partial supports of x ∈ RA_j within TB_k. Let PS_{j,k} denote the partial supports for the pair RA_j, TB_k. Definition 3 defines the partial support of an itemset x.
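The nested-loop streaming order described above can be sketched on the host side as follows. This is a minimal CPU-only illustration, not the patented implementation: Python integers stand in for bit vectors, relative addresses are list indices, and the names `nested_loop_streaming`, `partial_support`, `ra_partitions` and `tb_blocks` are assumptions introduced for the example.

```python
def partial_support(tb_k, ra_x):
    """AND the bit vectors addressed by ra_x inside tb_k and count the 1s."""
    bits = tb_k[ra_x[0]]
    for addr in ra_x[1:]:
        bits &= tb_k[addr]
    return bin(bits).count("1")


def nested_loop_streaming(ra_partitions, tb_blocks):
    """Return {(j, k): PS_{j,k}} for every pair of RA_j and TB_k (1-based)."""
    ps = {}
    for j, ra_j in enumerate(ra_partitions, start=1):    # outer operand
        for k, tb_k in enumerate(tb_blocks, start=1):    # inner operand, streamed
            ps[(j, k)] = [partial_support(tb_k, ra_x) for ra_x in ra_j]
    return ps


# Hypothetical data: Q = 2 partitions of relative addresses, R = 2 blocks
# of three 4-bit vectors each (Python ints stand in for bit vectors).
ra_partitions = [[(0, 1), (1, 2)], [(0, 1, 2)]]
tb_blocks = [[0b1101, 0b1011, 0b0111], [0b0110, 0b1110, 0b0011]]
ps = nested_loop_streaming(ra_partitions, tb_blocks)
```

Each RA_j is held fixed while all R blocks pass through, which mirrors copying RA_j once and streaming the TB_k past it.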

Definition 3. (Partial support) σ_x(TB_k) is defined as the partial support of an itemset x within a given transaction block TB_k. The total support of x over the whole transaction bitmap TB_{1:R} is σ(x) = Σ_{k=1}^{R} σ_x(TB_k).

To calculate the partial support σ_x(TB_k) of an itemset x = {i₁, ..., i_n}, the frequent pattern mining apparatus may perform the bitwise AND operation (n − 1) times among the bit vectors {TB_k(i) | i ∈ x} and count the number of 1s in the resultant bit vector. Because RA(x) contains the relative memory addresses of x within TB_k in the GPU device memory, the apparatus can efficiently access the locations of the bit vectors TB_k(x). The function that applies the series of (n − 1) bitwise AND operations to an itemset x is denoted ∩{TB_k(x)}, and the function that counts the number of 1s in a given bit vector is denoted count(·). Then σ_x(TB_k) = count(∩{TB_k(x)}).
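As a small worked illustration of σ_x(TB_k) = count(∩{TB_k(x)}), the sketch below stores each bit vector as a Python integer. The block contents and the names `partial_support` and `tb_k` are hypothetical and chosen only for this example.

```python
from functools import reduce


def partial_support(block, itemset):
    """Return count(AND of the bit vectors of `itemset`) within one block.

    `block` maps an item to its bit vector, stored as a Python int
    (bit t is 1 if transaction t of the block contains the item).
    """
    vectors = [block[item] for item in itemset]
    # n bit vectors are combined with (n - 1) bitwise AND operations.
    result = reduce(lambda a, b: a & b, vectors)
    # count(.) is the number of 1s in the resultant bit vector.
    return bin(result).count("1")


# Hypothetical transaction block TB_k covering 8 transactions.
tb_k = {"A": 0b10110110, "B": 0b11100111, "C": 0b00110101}
support_abc = partial_support(tb_k, ["A", "B", "C"])
```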

"TFL Algorithm"

FIG. 3 is a diagram illustrating the operation of the TFL strategy according to an embodiment.

The frequent pattern mining apparatus may use the algorithm of the TFL strategy. The overall procedure of the algorithm is described with reference to FIG. 3. Referring to FIG. 3, the outer operand RA_{1:Q} and the inner operand TB_{1:R} reside in the main memory 301, and Q = 1 is assumed for ease of explanation. The buffer for RA_j, denoted RABuf, and the buffer for TB_k, denoted TBBuf, reside in the device memory 302 of the GPU. The set of candidate itemsets at the current level of the itemset lattice is denoted C_L.

The frequent pattern mining apparatus may map each itemset x in C_L 305 to its relative memory address RA(x) using the dictionary dict 304, and may convert C_L 305 into RA 306 using the mapping result. Here, the dictionary dict 304 maps each frequent 1-itemset x ∈ F₁ to RA(x) within a transaction block TB_k. If RA is larger than RABuf in the GPU device memory, RA is logically divided into Q partitions, i.e., RA_{1:Q}, such that each partition fits in RABuf.

According to an embodiment, the frequent pattern mining apparatus may generate the relative memory addresses RA 306 of the candidate k-itemsets C_L 305 using the dictionary dict 304, which maps the frequent 1-itemsets F₁ 303 of the transaction bitmap to relative memory addresses (321). The apparatus may identify the frequent 1-itemsets contained in a candidate k-itemset and combine the relative memory addresses of the identified frequent 1-itemsets to generate the relative memory address of the candidate k-itemset. For example, the dictionary dict 304 may contain the relative memory addresses (e.g., indices) corresponding to the respective frequent 1-itemsets 303, and the apparatus may combine the addresses 0, 1 and 2 recorded in the dictionary dict 304 to generate the relative memory address {0, 1, 2} of the candidate 3-itemset {A, B, C}.
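The dictionary lookup for the TFL strategy can be sketched as below. It assumes, for illustration only, that the relative memory address of a frequent 1-itemset is simply its index in F₁; the helper names `build_dict` and `to_relative_addresses` are not from the source.

```python
def build_dict(frequent_items):
    """Map each frequent 1-itemset to a relative address (here: its index)."""
    return {item: addr for addr, item in enumerate(frequent_items)}


def to_relative_addresses(candidates, item_to_addr):
    """Convert each candidate itemset into the addresses of its items."""
    return [tuple(sorted(item_to_addr[item] for item in x)) for x in candidates]


f1 = ["A", "B", "C", "D"]                 # hypothetical frequent 1-itemsets
item_to_addr = build_dict(f1)
ra = to_relative_addresses([{"A", "B", "C"}, {"B", "D"}], item_to_addr)
```

With this data the candidate {A, B, C} maps to the addresses (0, 1, 2), matching the example in the text.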

The frequent pattern mining apparatus may copy RA_j to RABuf and then copy each TB_k to TBBuf in a streaming fashion over the PCI-E bus. According to an embodiment, the apparatus may copy each of the relative memory addresses to each of the device memories of the GPUs as the outer operand. Referring to FIG. 3, since Q = 1 and a single device memory 302 are assumed, the apparatus may copy the relative memory addresses RA 306 as the outer operand to RABuf in the device memory 302 of the GPU (322).

According to an embodiment, the frequent pattern mining apparatus may copy one of the transaction blocks to each of the device memories of the GPUs as the inner operand. Referring to FIG. 3, the apparatus may copy TB₁ 307 among the transaction blocks to TBBuf in the device memory 302 of the GPU (323).

To store the partial supports of all candidate itemsets corresponding to RA 306 for each transaction block TB_k, the frequent pattern mining apparatus maintains a two-dimensional array, denoted PSArray, in the main memory 301. The size of the array is determined by R and |RA|, where |RA| is the number of candidate itemsets. The apparatus also allocates a buffer for partial supports, denoted PSBuf, in the device memory 302 of the GPU.

According to an embodiment, each GPU may calculate the partial supports corresponding to the relative memory addresses using the transaction block and the relative memory addresses copied to its device memory. Each GPU may calculate the partial supports 311 corresponding to the candidate itemsets 309 in RABuf using a GPU kernel function for bitwise AND operations, denoted K_TFL, and store the values in PSBuf. Referring to FIG. 3, since Q = 1 and a single device memory 302 are assumed, the apparatus may calculate, on the GPU, the bit vectors 310 corresponding to the relative memory addresses 309 using the transaction block 308 copied to TBBuf and the relative memory addresses 309 copied to RABuf (324). The apparatus may calculate the partial supports 311 corresponding to the relative memory addresses 309 from the bit vectors 310 and record the calculated partial supports 311 in PSBuf (325).

According to an embodiment, the frequent pattern mining apparatus may update the supports of the candidate k-itemsets by synchronizing the partial supports corresponding to the relative memory addresses. Referring to FIG. 3, the apparatus first stores the partial supports calculated by the GPU cores in PSBuf in the device memory 302 of the GPU, and then copies the partial supports 311 in PSBuf back to PSArray in the main memory 301. Here, the apparatus may copy the values for TB_k to the k-th column of PSArray. The apparatus may copy the partial supports 311 recorded in PSBuf to PSArray (326), synchronize the partial supports corresponding to RA 306 based on the copied partial supports 312 (327), and update the supports 313 of C_L 305.

The frequent pattern mining apparatus may aggregate the partial supports of each itemset x in PSArray to obtain σ(x) = Σ_{k=1}^{R} σ_x(TB_k). The apparatus can then find the frequent L-itemsets F_L, i.e., the itemsets whose support is greater than or equal to a given threshold minsup.
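The aggregation and threshold test can be sketched as follows, assuming PSArray is laid out with one row per candidate itemset and one column per transaction block (consistent with copying the values for TB_k into the k-th column). The function names and sample numbers are illustrative assumptions.

```python
def aggregate_supports(ps_array):
    """sigma(x) = sum over k of sigma_x(TB_k): sum each row of PSArray."""
    return [sum(row) for row in ps_array]


def frequent_itemsets(candidates, ps_array, minsup):
    """Keep the candidates whose total support is at least minsup."""
    supports = aggregate_supports(ps_array)
    return [(x, s) for x, s in zip(candidates, supports) if s >= minsup]


# Hypothetical level with three candidates and R = 3 transaction blocks.
candidates = [("A", "B"), ("A", "C"), ("B", "C")]
ps_array = [[2, 1, 3],
            [0, 1, 0],
            [1, 1, 1]]
f_l = frequent_itemsets(candidates, ps_array, minsup=3)
```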

Algorithm 1 is the pseudocode of the TFL strategy. In the initialization step, the frequent pattern mining apparatus loads the transaction data D into the main memory (MM), allocates PSArray in MM, and allocates three buffers, TBBuf, RABuf and PSBuf, in the device memory (DM) of the GPU (Lines 1-3). The apparatus then converts D into a set of transaction blocks TB_{1:R} using F₁ such that each transaction block fits in TBBuf (Lines 4-5). The apparatus builds the dictionary dict for mapping x to RA(x) (Line 6). Because the TFL strategy uses only TB_{1:R}, which is built for F₁, as input data, the dictionary does not change during itemset mining. The main loop consists of a generation step (Lines 10-11) and a testing step (Lines 12-20). The apparatus greatly improves the performance of the testing step by streaming the transaction blocks of F₁ to overcome the limited amount of GPU memory, while exploiting GPU computing for fast and massively parallel calculation of the partial supports (Lines 12-18).

The frequent pattern mining apparatus usually calls the kernel function K_TFL multiple times rather than once (Line 16). This is because the number of GPU blocks that can be specified when calling K_TFL is limited. The K_TFL function calculates the partial support of a single itemset using a single GPU block. For example, if the number of GPU blocks, denoted maxBlk, is set to 16K, a single call of K_TFL can calculate the partial supports of 16K itemsets at once. Thus, if |RA_j| = 100M, the K_TFL function is called ⌈100M / 16K⌉ times. That is, for the same transaction block TB_k in TBBuf, the apparatus repeatedly executes the kernel function while changing the portion of RA_j being processed. In terms of data copying, RA is the outer operand and TB is the inner operand. In terms of kernel function calls, however, the roles are reversed: TB_k stays fixed as the outer operand while the portions of RA_j are iterated as the inner operand.
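The number of kernel calls needed for one RA_j follows from a ceiling division, as sketched below. The helper name `kernel_call_offsets` is hypothetical, and reading 16K as 16 × 1024 and 100M as 10^8 is an assumption, since the text does not say whether binary or decimal multiples are meant.

```python
def kernel_call_offsets(num_candidates, max_blk):
    """Starting candidate index of each kernel call over one RA_j."""
    num_calls = (num_candidates + max_blk - 1) // max_blk   # ceiling division
    return [call * max_blk for call in range(num_calls)]


# Assumption: 100M = 10**8 candidates and 16K = 16 * 1024 GPU blocks per call.
offsets = kernel_call_offsets(100_000_000, 16 * 1024)
num_calls = len(offsets)
```

Each offset marks where a call resumes in RA_j, so the second call of a run with 10000 candidates and 1000 blocks per call starts at index 1000.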

Algorithm 2 is the pseudocode of the GPU kernel function of GMiner. The kernel function of Algorithm 2 can be used for the HIL strategy as well as the TFL strategy. It takes RA_j, TB_k, doneIdx and maxThr as inputs. Here, doneIdx is the index of the last candidate processed previously in RA_j, and is needed to identify the portion of RA_j that the current call of K_TFL should process. For example, if |RA_j| = 10000 and maxBlk = 1000, doneIdx is 1000 in the second call of K_TFL. The input maxThr is the maximum number of threads in a single GPU block and, like maxBlk, can be specified when calling K_TFL. BID and TID are the IDs of the current GPU block and GPU thread, respectively, and are system variables determined automatically. Because many GPU blocks run concurrently, some of them may have no corresponding candidate itemset to test. For example, if |RA_j| = 100 and maxBlk = 200, 100 GPU blocks have no itemset and must not perform the kernel function. Therefore, if the current GPU block has no corresponding itemset, the kernel function returns immediately (Lines 1-2). For better performance, the kernel function prepares two variables, can and sup, in the shared memory of the GPU, because both are accessed frequently. The variable can holds the itemset whose partial support the current GPU block BID calculates, and the vector sup is initialized to 0.

FIG. 4 is a diagram illustrating a kernel function according to an embodiment.

The main loop of K_TFL repeatedly performs bitwise AND operations in parallel (Lines 5-8). Under the current GPU architecture, a single GPU thread can efficiently perform a bitwise AND on a single-precision width, i.e., 32 bits. A single GPU block can therefore perform a bitwise AND on maxThr × 32 bits simultaneously. The width W of a transaction block may be much larger than maxThr × 32 bits.

Referring to FIG. 4, an example of K_TFL is shown for maxThr = 2 and can = {0, 1, 3}. Here, for convenience, a GPU thread is assumed to perform a bitwise AND on 4 bits. Because the length of the candidate itemset is 3, threads 1 and 2 each perform the bitwise AND operation twice over {TB(0), TB(1), TB(3)} and store the resulting bits in bitV. The kernel repeats this process W / (maxThr × 32) times. The number of 1s in bitV can be easily counted using a so-called popCount function and stored in the sup vector. The popCount function may be named popc(). Referring to FIG. 4, the partial supports are accumulated in the sup vector W / (maxThr × 32) times. The kernel function may aggregate the values in sup into a single partial support within TB_k for the candidate itemset can using the parallelReduction function (Line 9).
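The behavior of one GPU block can be imitated on the CPU to make the loop structure concrete. The sketch below is a sequential Python simulation under stated assumptions, not Algorithm 2 itself: threads are simulated by an inner loop, each bit vector is a list of 4-bit words as in the FIG. 4 example, `bin(...).count("1")` stands in for popCount, and `sum` stands in for parallelReduction. The data values are made up.

```python
def k_tfl_block(ra_j, tb_k, done_idx, max_thr, bid):
    """Simulate one GPU block; return None if the block has no candidate."""
    if done_idx + bid >= len(ra_j):
        return None                          # early return for idle blocks
    can = ra_j[done_idx + bid]               # candidate handled by this block
    sup = [0] * max_thr                      # one accumulator per thread
    num_words = len(tb_k[0])                 # words per bit vector
    for base in range(0, num_words, max_thr):    # one pass per repetition
        for tid in range(max_thr):               # parallel threads on a GPU
            w = base + tid
            if w >= num_words:
                continue
            bit_v = tb_k[can[0]][w]
            for addr in can[1:]:                 # (n - 1) bitwise ANDs
                bit_v &= tb_k[addr][w]
            sup[tid] += bin(bit_v).count("1")    # popCount stand-in
    return sum(sup)                              # parallelReduction stand-in


# Hypothetical block: four bit vectors of W = 16 bits, stored as 4-bit words.
tb_k = [
    [0b1011, 0b1110, 0b0111, 0b1101],   # TB(0)
    [0b1101, 0b0110, 0b1111, 0b1001],   # TB(1)
    [0b0000, 0b1111, 0b0000, 0b1111],   # TB(2)
    [0b1001, 0b1110, 0b0101, 0b1011],   # TB(3)
]
ra_j = [(0, 1, 3)]
support = k_tfl_block(ra_j, tb_k, done_idx=0, max_thr=2, bid=0)
idle = k_tfl_block(ra_j, tb_k, done_idx=0, max_thr=2, bid=1)
```

With two simulated threads and four words per vector, the outer loop runs twice, which corresponds to the repetition count described in the text.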

"HIL (Hopping from Intermediate Level) Strategy"

The TFL strategy described above can find all frequent itemsets of a large-scale database using only GPUs with a limited amount of GPU memory (device memory). Although the TFL strategy performs well in most cases, a strategy that also performs well when the frequent itemsets become long may be required. For this purpose, the HIL (Hopping from Intermediate Level) strategy is proposed, which uses fragment blocks to make the performance of the testing step more scalable with respect to the length of the itemsets.

"Fragment Blocks"

FIG. 5 is a diagram illustrating fragment blocks according to an embodiment.

According to the HIL strategy, the frequent pattern mining apparatus horizontally partitions each transaction block TB_k into disjoint fragment blocks. The fragment size is defined as the number of frequent 1-itemsets belonging to a single fragment block. The fragment size is fixed and is denoted H. Each transaction block therefore contains a total of S = ⌈|F₁| / H⌉ fragment blocks.

According to the HIL strategy, the frequent pattern mining apparatus may materialize all frequent itemsets within each fragment block. Here, materializing an itemset x means generating the bit vector of x in the corresponding fragment block. Because the fragment size is H, up to 2^H − 1 frequent itemsets can be materialized in each fragment block. The height of a fragment block may therefore be set to 2^H − 1 instead of H.
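Materializing one fragment can be sketched as follows. As a simplification, the sketch materializes every non-empty subset of the fragment's items (the upper bound of 2^H − 1 itemsets) rather than only the frequent ones, and it uses Python integers as bit vectors. The names `fragment_itemsets` and `materialize` and the sample bit patterns are assumptions.

```python
from itertools import combinations


def fragment_itemsets(fragment_items):
    """All non-empty subsets of a fragment's items, by increasing size."""
    return [subset
            for n in range(1, len(fragment_items) + 1)
            for subset in combinations(fragment_items, n)]


def materialize(fragment_items, item_bits):
    """Build the bit vector of every itemset of the fragment by bitwise AND."""
    block = {}
    for itemset in fragment_itemsets(fragment_items):
        bits = item_bits[itemset[0]]
        for item in itemset[1:]:
            bits &= item_bits[item]
        block[itemset] = bits
    return block


# Hypothetical fragment {C, D} with H = 2 and 4 transactions.
block = materialize(["C", "D"], {"C": 0b1101, "D": 0b0111})
```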

According to the HIL strategy, the frequent pattern mining apparatus partitions the transaction bitmap TB vertically and horizontally into fragment blocks, each of which has a width of W and a height of 2^H − 1. Each fragment block has its own ID within its transaction block, denoted FID. The apparatus may allocate each fragment block contiguously in main memory (or store it as a chunk in secondary storage, such as a disk page). The l-th fragment block of TB_k is denoted TB_{k,l}. A fragment block TB_{k,l} consists of 2^H − 1 bit vectors, and the set of itemsets corresponding to these bit vectors is denoted itemsets(TB_{k,l}).

Referring to FIG. 5, a total of R × 3 fragment blocks are shown, where H = 2. According to an embodiment, the frequent pattern mining apparatus may divide the frequent 1-itemsets 501 to generate the fragments 502, 503 and 504. For each of the fragments 502, 503 and 504, the apparatus may generate the itemsets 505, 506 and 507, which are all combinations of the frequent 1-itemsets in that fragment. For example, the apparatus may generate the itemsets 506, namely {C}, {D} and {C, D}, which are all combinations of the frequent 1-itemsets {C} and {D} in fragment 2 503. The apparatus may calculate the bit vectors corresponding to the itemsets 505, 506 and 507. Here, the transaction bitmap containing the bit vectors is represented in a vertical bitmap layout, and the apparatus may vertically partition this transaction bitmap and divide it by the fragments 502, 503 and 504 to generate the fragment blocks TB_{1,1}, TB_{1,2}, TB_{1,3}, TB_{R,1}, TB_{R,2} and TB_{R,3} (509, 510, 511, 512, 513, 514). For example, the transaction block TB₁ is divided into three fragment blocks {TB_{1,1}, TB_{1,2}, TB_{1,3}}, and each fragment block contains 2² − 1 = 3 bit vectors. In FIG. 5, the itemsets corresponding to TB_{1,2} are {{C}, {D}, {C, D}}. According to the HIL strategy, the bit vectors of the n-itemsets (n > 1) in a fragment block may be initialized to 0s before materialization.

In terms of space complexity, the HIL strategy requires O(⌈|F₁| / H⌉ × (2^H − 1) × |D|) bits of space for the transaction bitmap. Compared with full materialization of the frequent itemsets at the intermediate levels, the fragment blocks of the HIL strategy use far less memory. For example, assume |D| = 10M and |F₁| = 500. Full materialization up to the third level then requires up to (C(500, 1) + C(500, 2) + C(500, 3)) × 10M bits, which is about 2.08 × 10^14 bits (roughly 23.7 TiB). In contrast, the fragment blocks of the HIL strategy with H = 5 require only ⌈500 / 5⌉ × (2^5 − 1) × 10M = 3.1 × 10^10 bits (roughly 3.6 GiB).
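The space comparison can be checked with a few lines of arithmetic. The sketch assumes that 10M means 10^7 transactions, that one bit is stored per transaction per materialized itemset, and that full materialization covers every itemset of sizes 1 to 3; the function names are illustrative.

```python
from math import comb


def full_materialization_bits(num_items, max_level, num_transactions):
    """Bits needed if every itemset up to max_level gets a bit vector."""
    num_itemsets = sum(comb(num_items, n) for n in range(1, max_level + 1))
    return num_itemsets * num_transactions


def fragment_block_bits(num_items, h, num_transactions):
    """Bits needed for S = ceil(|F1| / H) fragments of height 2^H - 1."""
    s = (num_items + h - 1) // h
    return s * (2 ** h - 1) * num_transactions


# Assumption: |D| = 10M is read as 10**7 transactions, |F1| = 500.
full_bits = full_materialization_bits(500, 3, 10**7)
hil_bits = fragment_block_bits(500, 5, 10**7)
ratio = full_bits / hil_bits
```

Under these assumptions the fragment blocks need several thousand times less space than full materialization up to the third level.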

For materialization, the frequent pattern mining apparatus according to the HIL strategy performs frequent itemset mining from level 2 to level H for each of the R × S fragment blocks. As the database grows, the number of fragment blocks can become very large. To materialize a large number of blocks quickly, the nested-loop streaming technique described above is used. Because the fragment blocks are not correlated with one another, they can be materialized independently.

Algorithm 3 is the algorithm for materializing the fragment blocks. All fragment blocks are used as input. First, the frequent pattern mining apparatus calculates the maximum number of fragment blocks that can be materialized at once using RABuf in the device memory (DM) (Line 1). Let the calculated maximum number be Q. The apparatus then executes the main loop Q times. The apparatus may stream the fragment blocks whose FIDs lie between start and end in all transaction blocks, i.e., TB_{1:R,[start:end]}. That is, a total of R × (end − start + 1) fragment blocks are transferred to the GPU memory (device memory) in a streaming fashion. To map the itemsets in these blocks to their relative memory addresses, the apparatus builds the dictionary dict (Lines 5-6). The apparatus then maps only the itemsets in F_H − F₁, because F₁ does not need to be materialized (Line 7). The apparatus executes the kernel function K_HIL concurrently while streaming the fragment blocks (Lines 9-12). Here, K_HIL is basically the same as K_TFL, except that instead of calculating partial supports it stores the bit vectors bitV at the corresponding positions of the fragment blocks in TBBuf. The pseudocode of K_HIL is omitted. After the calls of K_HIL complete, the apparatus copies the updated fragment blocks TB_{k,[start:end]} back to the main memory. Because materialization using GPU computing is very fast, its elapsed time is almost negligible.

"HIL Algorithm"

FIG. 6 is a diagram illustrating the operation of the HIL strategy according to an embodiment.

According to the HIL strategy, the frequent pattern mining apparatus can reduce the number of bitwise AND operations by using the fragment blocks, which are a kind of precomputed result. Unlike the TFL strategy, the HIL strategy dynamically determines the set of fragments at each level in order to reduce the amount of the transaction bitmap transferred to the GPUs.

Algorithm 4 is the pseudocode of the HIL strategy. First, the frequent pattern mining apparatus materializes the fragment blocks using Algorithm 3 (Lines 1-2). After generating the candidate itemsets C_L, the apparatus finds the minimal set of fragments, denoted B, that contains all the itemsets in C_L (Line 7). When the level L is low, most fragments are selected into B. As L increases, however, the number of fragments containing C_L decreases, so the apparatus can reduce the overhead of transferring fragment blocks. Because the set of fragments changes at each level, the relative memory addresses of the candidate itemsets in C_L may also change. The apparatus therefore builds dict at each level using only the itemsets in B and converts C_L into RA_{1:Q} (Lines 8-9). When streaming the transaction bitmap to the GPUs, the apparatus may copy only the relevant fragment blocks TB_{k,l} with l ∈ B instead of the entire transaction block TB_k.

Referring to FIG. 6, assume that C_L = {{A, B, E}, {B, E, F}} 605 under the HIL strategy. The frequent pattern mining apparatus can easily identify the fragments B = {1, 3} that contain all the itemsets in C_L 605 (611).
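Finding the set B amounts to collecting the fragment IDs of every item that appears in some candidate. The sketch below assumes the fragments of FIG. 5 and FIG. 6 ({A, B}, {C, D}, {E, F}) with 1-based FIDs; the function name `covering_fragments` is hypothetical.

```python
def covering_fragments(candidates, fragments):
    """Return the sorted FIDs (1-based) of fragments touched by the candidates."""
    item_to_fid = {item: fid
                   for fid, items in enumerate(fragments, start=1)
                   for item in items}
    return sorted({item_to_fid[item] for x in candidates for item in x})


fragments = [["A", "B"], ["C", "D"], ["E", "F"]]       # H = 2
c_l = [{"A", "B", "E"}, {"B", "E", "F"}]
b = covering_fragments(c_l, fragments)
```

Every fragment in the result holds at least one item of some candidate, so none can be dropped, and fragment 2 is never transferred for this level.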

According to an embodiment, the frequent pattern mining apparatus may generate the relative memory addresses 606 of the candidate k-itemsets 605 (e.g., k = 3) using the dictionary dict 604, which maps the itemsets 603 of the transaction bitmap to relative memory addresses (613). For example, the apparatus may identify the itemsets {A, B} and {E} contained in the candidate k-itemset {A, B, E} (612). The apparatus may then combine the relative memory addresses of the itemsets {A, B} and {E} to generate the relative memory address {2, 3} of the candidate k-itemset {A, B, E} (613).

Referring to FIG. 6, the apparatus builds the dictionary dict using only the first and third fragments, so RA({E}) becomes 3 instead of 6. When an itemset x ∈ C_L is converted into RA(x), RA(x) is shorter under the HIL strategy than under the TFL strategy. That is, obtaining the partial supports requires fewer bitwise AND operations under the HIL strategy than under the TFL strategy. For example, the length of RA({A, B, E}) is 2 (that is, {2, 3}) under the HIL strategy, whereas it is 3 under the TFL strategy.
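The address conversion in the FIG. 6 example can be sketched as follows. This is only an illustration: the fragment contents ({A, B}, {C, D}, {E, F}), the zero-based consecutive address layout, and the names `FRAGMENTS`, `build_dict`, and `relative_address` are assumptions chosen so that the sketch reproduces the values stated in the description (B = {1, 3}, RA({A, B, E}) = {2, 3}, and RA({E}) = 3 rather than 6).

```python
from itertools import combinations

# Hypothetical fragments of frequent 1-items (assumed; the text only gives B = {1, 3}).
FRAGMENTS = {1: ("A", "B"), 2: ("C", "D"), 3: ("E", "F")}


def build_dict(fragment_ids):
    """Map every non-empty itemset inside the chosen fragments to a relative address."""
    addr = {}
    for fid in sorted(fragment_ids):
        items = FRAGMENTS[fid]
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                addr[frozenset(combo)] = len(addr)  # next free address
    return addr


def relative_address(candidate, fragment_ids, addr):
    """Split a candidate by fragment and look up the address of each part."""
    ra = []
    for fid in sorted(fragment_ids):
        part = frozenset(candidate) & frozenset(FRAGMENTS[fid])
        if part:
            ra.append(addr[part])
    return ra


full = build_dict({1, 2, 3})   # dictionary over all fragments
reduced = build_dict({1, 3})   # dictionary over B = {1, 3} only
ra_abe = relative_address({"A", "B", "E"}, {1, 3}, reduced)
ra_bef = relative_address({"B", "E", "F"}, {1, 3}, reduced)
```

Under these assumptions, `ra_abe` has two entries because {A, B} is already materialized as a single bit vector, which is why a single AND suffices where the TFL strategy would combine three 1-itemset vectors.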

According to an embodiment, the apparatus may copy each of the relative memory addresses as an outer operand from the main memory to each of the device memories. Referring to FIG. 6, the apparatus may copy the relative memory addresses 606 from the main memory 601 to the device memory 602 of the GPU (614). According to an embodiment, the apparatus may copy at least one of the fragment blocks as an inner operand to each of the device memories. Referring to FIG. 6, the apparatus may copy the fragment blocks TB_{1,1} 607 and TB_{1,3} 608 to the device memory 602 of the GPU (615). As described above, each RA_j is copied to RABuf in the device memory 602 of the GPU, and then the first set of fragment blocks in B, {TB_{1,1}, TB_{1,3}} (607, 608), is streamed to TBBuf in the device memory 602. Next, the second set of fragment blocks, {TB_{2,1}, TB_{2,3}}, is streamed. As described for the TFL strategy, each GPU may use a fragment block and its relative memory addresses to calculate the partial supports corresponding to those addresses. The apparatus may update the supports of the candidate k-itemsets by synchronizing the partial supports corresponding to the relative memory addresses. Calculating and updating the supports duplicates what was described for the TFL strategy, so the description is omitted.

"Exploiting Multiple GPUs"

FIG. 7 is a diagram for explaining an operation of utilizing multiple GPUs according to an embodiment.

The frequent pattern mining apparatus according to GMiner can readily exploit multiple GPUs to improve performance. To use several GPUs, the apparatus may copy different portions of the outer operand to different GPUs and copy the same transaction (or fragment) blocks to all GPUs. This technique is referred to as transaction bitmap sharing, and it applies to both the TFL and HIL strategies.

FIG. 7 shows the data flow of the multi-GPU technique for an embodiment with two GPUs; the number of GPUs can be increased. The apparatus may copy RA₁ from the main memory 701 of the CPU to the device memory 702 of GPU₁ and copy RA₂ to the device memory 703 of GPU₂. The apparatus may copy the same TB₁ from the main memory 701 of the CPU to both the device memory 702 of GPU₁ and the device memory 703 of GPU₂. While the kernel function of GPU₁ calculates the partial supports of RA₁, the kernel function of GPU₂ may calculate the partial supports of RA₂. Because RA₁ and RA₂ have no itemsets in common, the results of the two kernel functions can be copied back to PSArray in the main memory 701 without conflict.
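The data flow of transaction bitmap sharing can be emulated on the CPU as a minimal sketch. Here plain Python integers stand in for bit vectors, `partial_supports` stands in for the GPU kernel function, and the round-robin split of the relative addresses into RA₁ and RA₂ is an assumption; the description only requires that the chunks be disjoint. All data values are made up for illustration.

```python
def partial_supports(ra_chunk, block):
    """Emulate one GPU kernel: AND the addressed bit vectors, then count set bits."""
    out = []
    for addresses in ra_chunk:
        bits = block[addresses[0]]
        for a in addresses[1:]:
            bits &= block[a]
        out.append(bin(bits).count("1"))
    return out


# One transaction block shared by every GPU: entry i is the bit vector stored
# at relative address i, covering 8 transactions (illustrative values).
tb1 = [0b10110110, 0b11100111, 0b01110101, 0b10010011]

# Relative addresses of four candidate 2-itemsets, split into disjoint chunks.
ra = [[0, 1], [0, 2], [1, 2], [2, 3]]
num_gpus = 2
chunks = [ra[g::num_gpus] for g in range(num_gpus)]  # RA_1 and RA_2 (assumed split)

# Each "GPU" receives a different chunk but the same block.
results = [partial_supports(chunk, tb1) for chunk in chunks]

# Copy the results back into one array; the chunks are disjoint,
# so no slot is written by more than one GPU.
ps_array = [0] * len(ra)
for g, res in enumerate(results):
    ps_array[g::num_gpus] = res
```

Because each chunk costs roughly one AND per address per block, chunks of equal size carry about the same amount of work regardless of the data they index.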

Because RA_j and RA_k are independent tasks, the apparatus can achieve scalability with respect to the number of GPUs using the technique described above. The technique can also resolve the workload imbalance found in distributed and parallel computing methods. The amounts of computation for RA_j and RA_k are not skewed as long as RA_j and RA_k have the same size, because the computation depends heavily on the number of bitwise AND operations and uses no complex or irregular data structures. Therefore, regardless of the characteristics of the datasets used, the apparatus can exhibit a stable speed-up ratio when using multiple GPUs.

FIG. 8 is a diagram illustrating an example configuration of a frequent pattern mining apparatus according to an embodiment.

Referring to FIG. 8, the frequent pattern mining apparatus includes a CPU 801, a main memory 850, and GPUs 860. Here, the CPU 801 and the GPUs 860 may be separate, independent entities, and the frequent pattern mining apparatus may be implemented by the CPU 801 or the GPUs 860. Data transfer among the CPU 801, the main memory 850, and the GPUs 860 may be performed through a PCI-E interface.

The CPU 801 includes a loop controller 810, a streaming controller 820, a workload balancing manager 830, and a synchronization manager 840. The loop controller 810 performs loop-related operations such as the nested-loop streaming described above; the streaming controller 820 performs the streaming-related operations described above; the workload balancing manager 830 performs workload-balancing operations such as generating the transaction blocks or fragment blocks; and the synchronization manager 840 performs synchronization-related operations such as synchronizing the supports. The main memory 850 stores bit vector data 851, relative address data 852, support data 853, and dictionary data 854, which hold information on the bit vectors, relative addresses, supports, and dictionary described above, respectively.

The GPUs 860 include GPU₁ 860a through GPU_n 860n. GPU₁ 860a includes cores 861a and a device memory 862a, and GPU_n 860n includes cores 861n and a device memory 862n. As described above, GPU₁ 860a through GPU_n 860n may calculate the partial supports. The hardware shown in FIG. 8 may carry out the techniques and strategies described above, and redundant description is omitted.

FIG. 9 is a flowchart for explaining a frequent pattern mining method according to an embodiment.

Referring to FIG. 9, the frequent pattern mining apparatus may perform system initialization (901). System initialization is the initialization operation for carrying out the TFL or HIL strategy described above; it includes, for example, generating the transaction blocks or fragment blocks, and is described with reference to FIG. 10.

The apparatus may set k to 2 (902) and generate the k-itemsets C_k (903). The apparatus may then determine whether C_k exists (904). For example, when the only items are A and B, the apparatus may determine that C₃ does not exist and terminate.

The apparatus may calculate the supports of C_k through AND operations on the GPUs (905). The embodiments described above apply to calculating the supports of C_k, and further description is given with reference to FIGS. 11 and 12.

The apparatus may determine whether the support of C_k is greater than or equal to the minimum support (906). When the support of C_k is greater than or equal to the minimum support, the apparatus may add C_k to the frequent patterns F (907). When the support of C_k is less than the minimum support, the apparatus may assign k+1 to k (908), generate the candidate itemsets of the next level, and repeat the operations described above.
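One reading of the level-wise loop of FIG. 9 is sketched below. The candidate generation rule (a standard Apriori-style join with subset pruning), the CPU-side support counting that stands in for the GPU step 905, and the names `mine` and `support` are assumptions for illustration; this section of the description does not spell out how C_k is generated.

```python
from itertools import combinations


def mine(transactions, min_support):
    """Level-wise mining loop; supports are counted on the CPU as a stand-in for step 905."""

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t)

    items = sorted({i for t in transactions for i in t})
    level = [frozenset([i]) for i in items if support(frozenset([i])) >= min_support]
    frequent = {s: support(s) for s in level}  # F_1 from initialization (901)
    k = 2                                      # step 902
    while True:
        # Step 903: join frequent (k-1)-itemsets and keep unions of size k whose
        # (k-1)-subsets are all frequent (assumed Apriori pruning).
        prev = set(level)
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        candidates = {c for c in candidates
                      if all(frozenset(s) in prev for s in combinations(c, k - 1))}
        if not candidates:                     # step 904: C_k does not exist
            return frequent
        level = [c for c in candidates if support(c) >= min_support]  # steps 905-906
        frequent.update({c: support(c) for c in level})               # step 907
        k += 1                                 # step 908


db = [{"A", "B", "E"}, {"A", "B"}, {"B", "E", "F"}, {"A", "B", "E"}]
result = mine(db, min_support=2)
```

In this sketch the loop ends when no candidate of size k can be formed, which corresponds to the check in step 904.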

FIG. 10 is a flowchart for explaining system initialization according to an embodiment.

Referring to FIG. 10, the apparatus may start the initialization (1001) and load the transaction data (1002). The apparatus may create GPU streams (1003); for example, it may assign to a GPU a stream for processing the transaction (or fragment) blocks that correspond to a relative address. The apparatus may allocate buffers (1004). The buffers described above may be allocated in the main memory of the CPU or in the device memory of a GPU.

The apparatus may scan the transaction data (1005) and generate the frequent 1-itemsets F₁ (1006). That is, the apparatus may mine the frequent 1-itemsets from the transaction data.

The apparatus may select the TFL strategy or the HIL strategy (1007). When the HIL strategy is selected, the apparatus may generate fragments from the frequent 1-itemsets (1008). For each fragment, the apparatus may generate the itemsets P, which are all combinations of the frequent 1-itemsets within that fragment (1009). The apparatus may calculate the bit vectors of P (1010) and partition the bit vectors of P into R fragment blocks (1011).
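Steps 1008 to 1010 can be sketched as follows, with Python integers standing in for bit vectors. The fixed fragment size, the choice to cut the sorted item list into consecutive fragments, and the name `fragment_bit_vectors` are assumptions; the partitioning into R fragment blocks (1011) is left out of the sketch.

```python
from itertools import combinations


def fragment_bit_vectors(f1_vectors, fragment_size):
    """Split frequent 1-items into fragments and precompute a bit vector per in-fragment itemset."""
    items = sorted(f1_vectors)
    fragments = [items[i:i + fragment_size]
                 for i in range(0, len(items), fragment_size)]   # step 1008
    table = []
    for frag in fragments:
        vectors = {}
        for size in range(1, len(frag) + 1):
            for combo in combinations(frag, size):               # step 1009: itemsets P
                bits = f1_vectors[combo[0]]
                for item in combo[1:]:
                    bits &= f1_vectors[item]                     # step 1010: AND of members
                vectors[combo] = bits
        table.append(vectors)
    return table


# Bit vectors of four frequent 1-items over 6 transactions (illustrative values).
f1 = {"A": 0b110101, "B": 0b111100, "E": 0b010111, "F": 0b000110}
table = fragment_bit_vectors(f1, fragment_size=2)
```

A fragment of n items yields 2^n - 1 itemsets, so the fragment size trades extra memory for fewer AND operations at mining time.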

When the TFL strategy is selected, the apparatus may calculate the bit vectors of F₁ (1012) and partition the bit vectors of F₁ into R transaction blocks (1013). Description that overlaps with the above is omitted.
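Steps 1012 and 1013 can be illustrated with a small sketch that builds a vertical bitmap and cuts it into R blocks. The list-of-bits representation, the equal-width cuts along the transaction axis, and the names `vertical_bitmap` and `partition` are assumptions for illustration.

```python
def vertical_bitmap(transactions, items):
    """One bit list per item; position t is 1 when transaction t contains the item."""
    return {i: [1 if i in t else 0 for t in transactions] for i in items}


def partition(bitmap, r):
    """Cut every bit vector at the same transaction boundaries into r blocks."""
    n = len(next(iter(bitmap.values())))
    width = -(-n // r)  # ceiling division
    return [{i: bits[b * width:(b + 1) * width] for i, bits in bitmap.items()}
            for b in range(r)]


db = [{"A", "B"}, {"B", "E"}, {"A", "B", "E"}, {"A"}, {"B", "E"}, {"A", "E"}]
bitmap = vertical_bitmap(db, ["A", "B", "E"])   # step 1012
blocks = partition(bitmap, r=2)                 # step 1013
```

Each block holds the same items over a different range of transactions, so a support is the sum of the partial supports obtained from the individual blocks.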

FIG. 11 is a flowchart for explaining an operation of calculating supports according to an embodiment.

Referring to FIG. 11, the apparatus may generate the relative memory addresses of C_k (1101) and copy them, chunk by chunk, as the outer operand to the device memories of the GPUs (1102). The apparatus may perform the inner loop process (1103) and synchronize the partial supports copied back from the GPUs (1104). The inner loop process, in which the GPUs calculate the partial supports, is described below with reference to FIG. 12. The process of FIG. 11 may be referred to as the outer loop process.

To determine whether synchronization of the partial supports is complete for all relative memory addresses RA_j (1 ≤ j ≤ Q), the apparatus may determine whether i is less than Q. When i is less than Q, the apparatus may repeat the calculation and synchronization of the partial supports. When i is greater than or equal to Q, the apparatus may proceed to step 906.

FIG. 12 is a flowchart for explaining an operation of calculating partial supports according to an embodiment.

Referring to FIG. 12, the apparatus may copy the bit vector data serving as the inner operand to the device memories of the GPUs (1201). Here, the bit vector data may be a transaction block or a fragment block as described above. The apparatus may execute a user-defined kernel on the GPUs (1202). Here, the user-defined kernel may be the operation that uses the kernel function described above.

The apparatus may copy the partial supports from the device memories of the GPUs to the main memory (1203). To determine whether the partial supports for all transaction blocks TB_k (1 ≤ k ≤ R) have been copied to the main memory, the apparatus may determine whether i is less than R. When i is less than R, the apparatus may repeat the calculation and copying of the partial supports. When i is greater than or equal to R, the apparatus may synchronize the GPU threads (1204). That is, once a GPU has processed all transaction blocks for one relative memory address, the apparatus may synchronize the threads of the GPUs before processing the next relative memory address.
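The outer loop of FIG. 11 and the inner loop of FIG. 12 can be emulated on the CPU as a rough sketch. Python integers stand in for bit vectors, `kernel` stands in for the user-defined GPU kernel, and memory copies and thread synchronization are reduced to ordinary function calls; the names and data values are illustrative assumptions.

```python
def kernel(ra_chunk, block):
    """Stand-in for the GPU kernel: AND the addressed vectors and count the 1 bits."""
    counts = []
    for addresses in ra_chunk:
        bits = block[addresses[0]]
        for a in addresses[1:]:
            bits &= block[a]
        counts.append(bin(bits).count("1"))
    return counts


def nested_loop_supports(ra_chunks, blocks):
    """Outer loop over RA_j (FIG. 11), inner loop over TB_k (FIG. 12)."""
    supports = []
    for ra_j in ra_chunks:                                    # outer operand (1102)
        partials = [kernel(ra_j, tb_k) for tb_k in blocks]    # inner loop (1201-1203)
        supports.extend(sum(col) for col in zip(*partials))   # synchronize (1104)
    return supports


# Two transaction blocks of 4 transactions each; list index = relative address.
blocks = [[0b1011, 0b1110, 0b0111], [0b1101, 0b0101, 0b1100]]
ra_chunks = [[[0, 1], [0, 2]], [[1, 2], [0, 1, 2]]]
supports = nested_loop_supports(ra_chunks, blocks)
```

The outer operand is loaded once per chunk while the blocks are streamed past it, so the bitmap is re-read Q times while each address chunk is transferred only once.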

The embodiments described above may be implemented with hardware components, software components, and/or a combination of hardware and software components. For example, the apparatuses, methods, and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable gate array (FPGA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. A processing device may run an operating system (OS) and one or more software applications that run on the operating system. The processing device may also access, store, manipulate, process, and generate data in response to execution of the software. For ease of understanding, the description sometimes refers to a single processing device, but a person of ordinary skill in the art will appreciate that a processing device may include a plurality of processing elements and/or multiple types of processing elements. For example, the processing device may include a plurality of processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.

The software may include a computer program, code, instructions, or a combination of one or more of these, and may configure the processing device to operate as desired or may instruct the processing device independently or collectively. The software and/or data may be embodied permanently or temporarily in any type of machine, component, physical device, virtual equipment, computer storage medium or device, or transmitted signal wave, in order to be interpreted by the processing device or to provide instructions or data to the processing device. The software may be distributed over networked computer systems and stored or executed in a distributed manner. The software and data may be stored on one or more computer-readable recording media.

The method according to an embodiment may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the embodiments, or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include machine code such as that produced by a compiler, as well as high-level language code that a computer can execute using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules to perform the operations of the embodiments, and vice versa.

Although the embodiments have been described with reference to a limited set of drawings, a person of ordinary skill in the art may apply various technical modifications and variations based on the above. For example, appropriate results may be achieved even if the described techniques are performed in an order different from the described method, and/or the components of the described systems, structures, devices, circuits, and the like are coupled or combined in a form different from the described method, or are replaced or substituted by other components or equivalents.

Therefore, other implementations, other embodiments, and equivalents of the claims also fall within the scope of the following claims.

Claims

A frequent pattern mining method comprising:
generating blocks from bit vectors corresponding to frequent 1-itemsets;
copying different relative memory addresses of candidate k-itemsets from a main memory of a central processing unit (CPU) to respective device memories of graphics processing units (GPUs);
copying at least one identical block, among the blocks, required for calculating supports of the candidate k-itemsets from the main memory to each of the device memories; and
updating the supports of the candidate k-itemsets by synchronizing partial supports calculated by the GPUs.

The method of claim 1, wherein copying the different relative memory addresses comprises:
copying the relative memory address of a first candidate k-itemset to the device memory of a first GPU; and
copying the relative memory address of a second candidate k-itemset to the device memory of a second GPU, and
wherein copying the at least one identical block comprises:
copying a first block among the blocks to the device memory of the first GPU; and
copying the first block to the device memory of the second GPU.

The method of claim 1, further comprising:
generating transaction blocks by vertically partitioning a transaction bitmap that includes the bit vectors, the transaction bitmap being represented in a vertical bitmap layout,
wherein the blocks are the transaction blocks.

The method of claim 3, wherein copying the different relative memory addresses comprises:
generating the relative memory addresses of the candidate k-itemsets using a dictionary that maps each frequent 1-itemset of the transaction bitmap to a relative memory address; and
copying each of the generated relative memory addresses as an outer operand to each of the device memories.

The method of claim 4, wherein generating the relative memory addresses comprises:
identifying the frequent 1-itemsets included in a candidate k-itemset; and
combining the relative memory addresses of the identified frequent 1-itemsets to generate the relative memory address of the candidate k-itemset.

The method of claim 4, wherein copying the at least one identical block comprises copying any one of the transaction blocks as an inner operand to each of the device memories, and
wherein updating the supports comprises:
calculating, by each of the GPUs, partial supports corresponding to the respective relative memory addresses using the transaction block and the respective relative memory addresses; and
updating the supports of the candidate k-itemsets by synchronizing the partial supports corresponding to the relative memory addresses.

The method of claim 1, further comprising:
dividing the frequent 1-itemsets into fragments;
generating, for each of the fragments, itemsets that are all combinations of the frequent 1-itemsets in the fragment;
calculating bit vectors corresponding to the generated itemsets; and
generating fragment blocks by vertically partitioning a transaction bitmap that includes the calculated bit vectors, the transaction bitmap being represented in a vertical bitmap layout, and dividing the transaction bitmap by fragment,
wherein the blocks are the fragment blocks.

The method of claim 7, wherein copying the different relative memory addresses comprises:
generating the relative memory addresses of the candidate k-itemsets using a dictionary that maps each itemset of the transaction bitmap to a relative memory address; and
copying each of the relative memory addresses as an outer operand to each of the device memories.

The method of claim 8, wherein generating the relative memory addresses comprises:
identifying the itemsets included in a candidate k-itemset; and
combining the relative memory addresses of the identified itemsets to generate the relative memory address of the candidate k-itemset.

The method of claim 8, wherein copying the at least one identical block comprises copying at least one of the fragment blocks as an inner operand to each of the device memories, and
wherein updating the supports comprises:
calculating, by each of the GPUs, partial supports corresponding to the respective relative memory addresses using the fragment block and the respective relative memory addresses; and
updating the supports of the candidate k-itemsets by synchronizing the partial supports corresponding to the relative memory addresses.

A frequent pattern mining method comprising:
mining frequent 1-itemsets from transaction data;
generating transaction blocks by vertically partitioning a transaction bitmap that includes bit vectors corresponding to the frequent 1-itemsets, the transaction bitmap being represented in a vertical bitmap layout;
calculating, using GPUs, supports of candidate k-itemsets from the transaction blocks; and
mining frequent k-itemsets based on the supports.

The method of claim 11, wherein calculating the supports comprises:
generating relative memory addresses of the candidate k-itemsets using a dictionary that maps each frequent 1-itemset of the transaction bitmap to a relative memory address;
copying each of the relative memory addresses as an outer operand to each of the device memories of the GPUs;
copying any one of the transaction blocks as an inner operand to each of the device memories;
calculating, by each of the GPUs, partial supports corresponding to the respective relative memory addresses using the transaction block and the respective relative memory addresses; and
updating the supports of the candidate k-itemsets by synchronizing the partial supports.

The method of claim 12, wherein calculating the supports further comprises:
copying a second transaction block as an inner operand to each of the device memories;
calculating, by each of the GPUs, second partial supports corresponding to the respective relative memory addresses using the second transaction block and the respective relative memory addresses; and
updating the supports of the candidate k-itemsets by synchronizing the second partial supports.

A frequent pattern mining method comprising:
mining frequent 1-itemsets from transaction data;
dividing the frequent 1-itemsets into fragments;
generating, for each of the fragments, itemsets that are all combinations of the frequent 1-itemsets in the fragment;
calculating bit vectors corresponding to the generated itemsets;
generating fragment blocks by vertically partitioning a transaction bitmap that includes the calculated bit vectors, the transaction bitmap being represented in a vertical bitmap layout, and dividing the transaction bitmap by fragment;
calculating, using GPUs, supports of candidate k-itemsets from the fragment blocks; and
mining frequent k-itemsets based on the supports.

The method of claim 14, wherein calculating the supports comprises:
generating relative memory addresses of the candidate k-itemsets using a dictionary that maps each itemset of the transaction bitmap to a relative memory address;
copying each of the relative memory addresses as an outer operand to each of the device memories of the GPUs;
copying at least one of the fragment blocks as an inner operand to each of the device memories;
calculating, by each of the GPUs, partial supports corresponding to the respective relative memory addresses using the fragment block and the respective relative memory addresses; and
updating the supports of the candidate k-itemsets by synchronizing the partial supports.

16. The method of claim 15,
wherein the step of calculating the supports further comprises:
copying at least one second fragment block into each of the device memories as an inner operand;
calculating, by each of the GPUs, second partial supports corresponding to the respective relative memory addresses using the second fragment block and the respective relative memory addresses; and
synchronizing the second partial supports to update the supports of the candidate k-itemsets.
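
The dictionary lookup that turns a candidate k-itemset into relative memory addresses of fragment-block rows might look like the following Python sketch. It assumes each address is simply a row index into a single fragment block, and the names `relative_addresses`, `support`, `itemset_to_address`, and `rows` are hypothetical.

```python
def relative_addresses(candidate, fragments, itemset_to_address):
    """Split a candidate by fragment and look up one address per non-empty piece."""
    addresses = []
    for fragment in fragments:
        piece = tuple(item for item in fragment if item in candidate)
        if piece:
            addresses.append(itemset_to_address[piece])
    return addresses

def support(addresses, rows):
    """AND the addressed rows and count the transactions that remain."""
    bits = rows[addresses[0]]
    for address in addresses[1:]:
        bits &= rows[address]
    return bin(bits).count("1")

# Toy data (assumed): two fragments and the bit vector of each of their itemsets.
fragments = [("A", "B"), ("C", "D")]
itemset_to_address = {("A",): 0, ("B",): 1, ("A", "B"): 2,
                      ("C",): 3, ("D",): 4, ("C", "D"): 5}
rows = [0b1011, 0b0111, 0b0011, 0b1110, 0b0101, 0b0100]

addresses = relative_addresses({"A", "B", "D"}, fragments, itemset_to_address)
abd_support = support(addresses, rows)
```

Here the 3-itemset {A, B, D} resolves to two addresses, one for (A, B) and one for (D), so a single AND is needed instead of two.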

17. A frequent pattern mining method comprising:
mining frequent 1-itemsets from transaction data;
selecting a TFL (Traversal from the First Level) strategy or an HIL (Hopping from Intermediate Level) strategy;
generating blocks from bit vectors corresponding to the frequent 1-itemsets based on the selected strategy;
calculating, using GPUs, supports of candidate k-itemsets from the blocks; and
mining frequent k-itemsets based on the supports.

18. The method of claim 17,
wherein the step of generating the blocks comprises,
if the TFL strategy is selected, generating transaction blocks by vertically partitioning a transaction bitmap that includes the bit vectors, the transaction bitmap being represented in a vertical bitmap layout.
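
As a rough illustration of vertical partitioning, the Python sketch below cuts each item's bit vector into fixed-width transaction ranges. It assumes that every transaction block holds all frequent 1-items for one range of transactions; the function name `vertical_partition`, the block width, and the toy data are hypothetical.

```python
def vertical_partition(item_bits, num_transactions, block_width):
    """Cut every item's bit vector into consecutive ranges of block_width transactions."""
    mask = (1 << block_width) - 1
    blocks = []
    for start in range(0, num_transactions, block_width):
        # One transaction block: the same transaction range for every item.
        blocks.append({item: (bits >> start) & mask
                       for item, bits in item_bits.items()})
    return blocks

# Toy data (assumed): bit t is 1 when transaction t contains the item.
item_bits = {"A": 0b110101, "B": 0b011011}
transaction_blocks = vertical_partition(item_bits, num_transactions=6, block_width=3)
```

Because the blocks cover disjoint transaction ranges, the support of an itemset over the whole bitmap equals the sum of its supports over the individual blocks.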

19. The method of claim 18,
wherein the step of calculating the supports comprises:
generating relative memory addresses of the candidate k-itemsets using a dictionary that maps a frequent 1-itemset of the transaction bitmap to a relative memory address;
copying each of the relative memory addresses as an outer operand into each of device memories of the GPUs;
copying any one of the transaction blocks as an inner operand into each of the device memories;
calculating, by each of the GPUs, partial supports corresponding to the respective relative memory addresses using the transaction block and the respective relative memory addresses; and
synchronizing the partial supports to update the supports of the candidate k-itemsets.

20. The method of claim 19,
wherein the step of calculating the supports further comprises:
copying a second transaction block into each of the device memories as an inner operand;
calculating, by each of the GPUs, second partial supports corresponding to the respective relative memory addresses using the second transaction block and the respective relative memory addresses; and
synchronizing the second partial supports to update the supports of the candidate k-itemsets.
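
The outer-operand and inner-operand flow of the two preceding claims can be sketched in plain Python, with the GPUs simulated by a sequential loop. This is an assumption-laden illustration rather than the claimed implementation: relative memory addresses are modeled as row indices within a transaction block, and `partial_supports`, `count_supports`, and the toy data are hypothetical.

```python
def partial_supports(address_lists, block):
    """Stand-in for one GPU kernel: AND the addressed rows, then count set bits."""
    counts = []
    for addresses in address_lists:
        bits = block[addresses[0]]
        for address in addresses[1:]:
            bits &= block[address]
        counts.append(bin(bits).count("1"))
    return counts

def count_supports(candidates, item_to_address, transaction_blocks, num_gpus):
    # Outer operand: each candidate becomes a list of relative row addresses.
    address_lists = [[item_to_address[item] for item in c] for c in candidates]
    # Each simulated GPU receives a different slice of the candidates.
    shares = [list(range(g, len(candidates), num_gpus)) for g in range(num_gpus)]
    supports = [0] * len(candidates)
    for block in transaction_blocks:      # inner operand: same block for every GPU
        for share in shares:
            partial = partial_supports([address_lists[i] for i in share], block)
            for i, count in zip(share, partial):
                supports[i] += count      # synchronization: add partial supports
    return supports

# Toy data (assumed): rows A, B, C in two blocks of four transactions each.
item_to_address = {"A": 0, "B": 1, "C": 2}
transaction_blocks = [
    [0b1011, 0b0111, 0b1110],
    [0b0001, 0b0011, 0b0010],
]
candidates = [("A", "B"), ("A", "C"), ("B", "C")]
supports = count_supports(candidates, item_to_address, transaction_blocks, num_gpus=2)
```

Because the addresses are relative to the start of a block, the same address lists remain valid when the second transaction block replaces the first in device memory; only the inner operand has to be copied again.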

21. The method of claim 17,
wherein the step of generating the blocks comprises, if the HIL strategy is selected:
fragmenting the frequent 1-itemsets to generate fragments;
generating, for each of the fragments, itemsets that are all combinations of the frequent 1-itemsets in the fragment;
calculating bit vectors corresponding to the generated itemsets; and
vertically partitioning a transaction bitmap that includes the calculated bit vectors, the transaction bitmap being represented in a vertical bitmap layout, to generate fragment blocks by dividing the transaction bitmap by fragment.

22. The method of claim 21,
wherein the step of calculating the supports comprises:
generating relative memory addresses of the candidate k-itemsets using a dictionary that maps an itemset of the transaction bitmap to a relative memory address;
copying each of the relative memory addresses as an outer operand into each of device memories of the GPUs;
copying at least one of the fragment blocks into each of the device memories as an inner operand;
calculating, by each of the GPUs, partial supports corresponding to the respective relative memory addresses using the fragment block and the respective relative memory addresses; and
synchronizing the partial supports to update the supports of the candidate k-itemsets.

23. The method of claim 22,
wherein the step of calculating the supports further comprises:
copying at least one second fragment block into each of the device memories as an inner operand;
calculating, by each of the GPUs, second partial supports corresponding to the respective relative memory addresses using the second fragment block and the respective relative memory addresses; and
synchronizing the second partial supports to update the supports of the candidate k-itemsets.

24. A computer program stored on a recording medium for executing, in combination with hardware, the method of any one of claims 1 to 23.

25. A frequent pattern mining apparatus comprising:
a CPU; and
a main memory,
wherein the CPU:
generates blocks from bit vectors corresponding to frequent 1-itemsets;
copies each of different relative memory addresses of candidate k-itemsets from the main memory to each of device memories of GPUs;
copies at least one identical block, required for calculating supports of the candidate k-itemsets, from the main memory to each of the device memories; and
updates the supports of the candidate k-itemsets by synchronizing partial supports calculated by the GPUs.