KR20180119888A

KR20180119888A - Hybrid Sharding system

Info

Publication number: KR20180119888A
Application number: KR1020170053564A
Authority: KR
Inventors: 이종정; 서광익; 이승원; 김성민; 박준호
Original assignee: 주식회사 알티베이스
Priority date: 2017-04-26
Filing date: 2017-04-26
Publication date: 2018-11-05
Also published as: KR102008446B1

Abstract

According to a preferred embodiment of the present invention, a hybrid sharding system is provided that includes at least one shard DB for storing distributed data, and a hybrid sharding unit. The hybrid sharding unit includes a shard library and a meta node. The system can thereby process a large-capacity database in a distributed manner.

Description

Hybrid Sharding System

The present invention relates to a method of sharding a database. More particularly, it relates to a hybrid sharding system that integrates server-side sharding and client-side sharding.

With the explosive growth of data in recent business environments, distributed database technology, which has advanced together with distributed data processing and storage technology, is attracting attention. Among the various data processing technologies, sharding divides large volumes of data easily, so large-scale data can be processed by scaling out with low-cost systems instead of introducing a high-performance system.

In the database field, sharding refers to storing and querying data distributed across physically separate databases by horizontal partitioning; that is, the horizontal partitioning of one database into individual partitions called shards. Compared with managing one large database, each shard can receive more computing resources, so data is processed faster. When replication is applied, another shard can continue to provide service even if one shard fails, so reliability also improves.

KR 10-1544356 B1

A server-side sharding system has the problem that performance gains become hard to expect as more data nodes are added, and a client-side sharding system has the drawback that rebuilding the system is costly when the data distribution policy changes.

According to a preferred embodiment of the present invention, a hybrid sharding system includes at least one shard DB that stores distributed data, and a hybrid sharding unit. The hybrid sharding unit includes a meta node and a shard library. The meta node analyzes a user query to determine whether it is a shard query containing a shard object and, if so, distributes the data across the at least one shard DB based on a shard key. The shard library is installed in library form in an application on a client terminal and acts as a coordinator between the application and the at least one shard DB: it forwards the user query to the meta node, receives the information of the at least one shard DB registered in the meta node, and connects the client terminal to the at least one shard DB.

Preferably, the hybrid sharding unit further includes a selection unit that selects a server-side sharding mode or a client-side sharding mode.

Preferably, when the shard library in the hybrid sharding unit interprets a user query, it asks the meta node, on the first access only, for the table information held in each shard DB, thereby inquiring which shard DB holds the data. After receiving the answer from the meta node, the shard library connects directly to each shard DB.

According to another preferred embodiment of the present invention, a method of performing sharding in a hybrid sharding system includes: forwarding a user query to a meta node from a shard library that is installed in library form in an application on a client terminal and acts as a coordinator between the application and at least one shard DB; receiving, at the shard library, the information of the at least one shard DB registered in the meta node and connecting the client terminal to the at least one shard DB; analyzing the user query at the meta node to determine whether it is a shard query containing a shard object; and, if the meta node determines that it is a shard query, distributing the data across the at least one shard DB based on a shard key.

According to a preferred embodiment of the present invention, the hybrid sharding apparatus improves storage capacity and throughput per unit time, so that a large-capacity database can be processed in a distributed manner.

FIG. 1 shows a hybrid sharding system 100 according to a preferred embodiment of the present invention.
FIG. 2 shows an example of a system that performs server-side sharding.
FIG. 3 shows an example of a system that performs client-side sharding.
FIG. 4 shows how a hybrid sharding system operates, according to a preferred embodiment of the present invention.
FIG. 5 is a system diagram showing the steps for performing sharding in a conventional sharding system.
FIG. 6 shows the steps for performing sharding in a hybrid sharding system according to a preferred embodiment of the present invention.
FIG. 7 is a flowchart of performing sharding in a hybrid sharding system according to a preferred embodiment of the present invention.

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings. The advantages and features of the present invention, and the methods of achieving them, will become clear from the embodiments described in detail below together with the accompanying drawings.

Sharding is a scale-out technique that distributes data previously stored in a single database across multiple databases for storage and processing. Because sharding divides large volumes of data easily, large-scale data can be processed by scaling out with low-cost systems rather than introducing a high-performance system.

Sharding techniques can generally be divided into server-side sharding, in which a coordinator separates and processes the data, and client-side sharding, in which the application separates and processes the data.

FIG. 1 shows a hybrid sharding system 100 according to a preferred embodiment of the present invention. In a preferred embodiment, the hybrid sharding system can support the server-side sharding function and the client-side sharding function at the same time. It can also be implemented so that only the server-side sharding function or only the client-side sharding function is selected, as needed.

In a preferred embodiment of the present invention, server-side sharding in the hybrid sharding system 100 can still improve overall performance as the number of shard DBs 130, 132, 134, 136 increases, and the application on the client terminal need not be modified even when the data distribution policy changes.

In a preferred embodiment of the present invention, client-side sharding in the hybrid sharding system 100 can be implemented merely by replacing the shard-dedicated library, without modifying the existing application source or existing SQL.

In a preferred embodiment of the present invention, the hybrid sharding system 100 includes a client terminal 110; at least one application 112, 114, 116 installed on the client terminal 110; a shard library 113, 115, 117 installed for each application 112, 114, 116; a meta node 120; and at least one shard DB 130, 132, 134, 136 that stores distributed data.

In a preferred embodiment of the present invention, the hybrid sharding system 100 includes a hybrid sharding unit. The hybrid sharding unit includes the meta node 120 and at least one shard library 113, 115, 117.

In a preferred embodiment of the present invention, the meta node 120 manages data node and sharding information, analyzes user queries, and, when the server-side sharding function is performed as in the example of FIG. 2, acts as a coordinator, for example by providing integrated queries. It also redistributes data among the shard DBs.

In a preferred embodiment of the present invention, the at least one shard library 113, 115, 117 is installed in library form on the client terminal to perform the sharding function, and provides the same API as existing ODBC.

In a preferred embodiment of the present invention, the at least one shard library 113, 115, 117 also acts as a coordinator between the applications 112, 114, 116 installed on the client terminal and the shard DBs 130, 132, 134, 136.

In a preferred embodiment of the present invention, the hybrid sharding unit makes it possible to support the server-side sharding function and the client-side sharding function at the same time. In addition to the hybrid sharding unit, the shard manager may further have a selection function for choosing between the server-side sharding function and the client-side sharding function.

FIG. 2 shows an example of a system that performs server-side sharding. Server-side sharding as in FIG. 2 requires a coordinator 220 that integrates the partitioned shard DBs 230, 232, 234 for compatibility with the applications installed on the client terminals 200, 212, 214.

The coordinator 220 identifies the location of the data for a query requested by the application, reconnects to the relevant shard DB 230, 232, 234, re-executes the query, and returns the result to the application.

Such a server-side sharding system has the advantage that the applications installed on the client terminals need no modification when the data distribution policy changes. However, the load concentrates on the coordinator, so the performance gain diminishes as shard DB nodes are added.

FIG. 3 shows an example of a system that performs client-side sharding. In client-side sharding as in FIG. 3, the application installed on the client terminal 310, 312, 314 knows in advance which shard DB 330, 332, 334 holds the data, so it can connect directly to that shard DB. Client-side sharding therefore needs no separate coordinator, and overall throughput increases as the number of shard DB nodes increases.

However, with client-side sharding the application is difficult to write, and when the data distribution policy changes, the client-side application must be modified and the data redistributed, so rebuilding the system is costly.

FIG. 4 shows how a hybrid sharding system operates, according to a preferred embodiment of the present invention.

In a preferred embodiment of the present invention, the hybrid sharding system can implement both the server-side sharding function and the client-side sharding function.

First, the server-side sharding function of the hybrid sharding system is described with reference to FIG. 4.

Server-side sharding in the hybrid sharding system has the advantage that any application program installed on the client terminal can be used without modification. When the server-side sharding function is used, shard connections are managed as follows.

The application 412 installed on the client terminal 410 attempts to connect to the meta node 420 through the shard library 413. The connection is made in the same way as an ordinary database connection.

The meta node 420 creates a session. The application 412 then submits a user query containing a shard object to the meta node 420.

An example of determining whether a query is a shard query containing a shard object is as follows.

/* Create the table on each node after node configuration is complete */
CREATE TABLE t1 (id INTEGER, name VARCHAR(50));

/* Set T1 as a shard table */
EXEC DBMS_SHARD.SET_SHARD_TABLE('SYS', 'T1', 'R', 'ID', 'NODE1');
EXEC DBMS_SHARD.SET_SHARD_RANGE('SYS', 'T1', 3, 'NODE2');
EXEC DBMS_SHARD.SET_SHARD_RANGE('SYS', 'T1', 6, 'NODE3');

/* Insert data on each node */
INSERT INTO t1 VALUES (1, 'Kim');
INSERT INTO t1 VALUES (2, 'Lee');
INSERT INTO t1 VALUES (3, 'Park');
INSERT INTO t1 VALUES (4, 'Choi');
INSERT INTO t1 VALUES (5, 'Jeong');
INSERT INTO t1 VALUES (6, 'Kang');
INSERT INTO t1 VALUES (7, 'Joe');
INSERT INTO t1 VALUES (8, 'Yoon');
INSERT INTO t1 VALUES (9, 'Jang');

/* Query test */
iSQL> SELECT * FROM t1 WHERE id = 2; -- runs normally because the query can be answered by one specific node
ID          NAME
------------------------
2           Lee
1 row selected.

iSQL> SELECT * FROM t1; -- error: t1 is a shard table, so a plain single query fails
[ERR-E1385 : The shard table is only available inside the shard view.
0001 : SELECT * FROM T1
]

iSQL> SHARD SELECT * FROM t1; -- use the "SHARD" syntax to query all of the distributed data
ID          NAME
7           Joe
8           Yoon
9           Jang
1           Kim
2           Lee
3           Park
4           Choi
5           Jeong
6           Kang
9 rows selected.

iSQL> SELECT * FROM t1 WHERE id = 2 OR id = 3; -- runs normally because the query can be answered by one specific node
ID          NAME
2           Lee
3           Park
2 rows selected.

iSQL> SELECT COUNT(*) FROM t1; -- error: the counts of all nodes must be combined, so a plain single query fails
0001 : SELECT COUNT(*) FROM T1
]

iSQL> SHARD SELECT COUNT(*) FROM t1; -- the result must be gathered from all nodes, so the "SHARD" syntax is used
COUNT(*)
-----------------------
3
3 rows selected.

iSQL> SELECT SUM(c1) FROM SHARD(SELECT COUNT(*) c1 FROM t1);
SUM(C1)
-----------------------
9
1 row selected.
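The session above is consistent with a range scheme in which each SET_SHARD_RANGE value is an inclusive upper bound and the node named in SET_SHARD_TABLE receives every remaining key. That reading is inferred from the result ordering and the per-node counts, not stated in the text. A minimal Python sketch under that assumption, with all names hypothetical:

```python
from bisect import bisect_left

# Hypothetical model of the range configuration in the example above.
# Assumption: each bound is an inclusive upper limit, and NODE1 (the node
# given to SET_SHARD_TABLE) is the default for keys above every bound.
RANGE_BOUNDS = [3, 6]
RANGE_NODES = ["NODE2", "NODE3"]
DEFAULT_NODE = "NODE1"


def node_for(shard_key: int) -> str:
    """Return the node that stores the row with this shard key."""
    idx = bisect_left(RANGE_BOUNDS, shard_key)
    return RANGE_NODES[idx] if idx < len(RANGE_NODES) else DEFAULT_NODE


def distribute(rows):
    """Group (id, name) rows by the node that would store them."""
    placement = {}
    for row in rows:
        placement.setdefault(node_for(row[0]), []).append(row)
    return placement


rows = [(1, "Kim"), (2, "Lee"), (3, "Park"), (4, "Choi"), (5, "Jeong"),
        (6, "Kang"), (7, "Joe"), (8, "Yoon"), (9, "Jang")]
placement = distribute(rows)

# Per-node COUNT(*) followed by a global SUM, mirroring
# SELECT SUM(c1) FROM SHARD(SELECT COUNT(*) c1 FROM t1).
per_node_counts = {node: len(r) for node, r in placement.items()}
total = sum(per_node_counts.values())
```

Under this assumption ids 2 and 3 land on the same node, which matches the query with `id = 2 OR id = 3` running without the SHARD keyword.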

The meta node 420 creates, for each session, shard connections to all of the shard DBs 430, 432, 434, 436, 438 registered in the meta node. When the session ends, the shard connections are closed as well.
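The session-scoped lifetime of shard connections can be sketched as follows. This is an illustration only: `ShardConnection` and `meta_node_session` are hypothetical stand-ins, not part of the described system or of any real driver.

```python
from contextlib import contextmanager


class ShardConnection:
    """Stand-in for a connection from the meta node to one shard DB."""

    def __init__(self, shard_name: str) -> None:
        self.shard_name = shard_name
        self.open = True

    def close(self) -> None:
        self.open = False


@contextmanager
def meta_node_session(registered_shards):
    """Open one shard connection per registered shard DB for a session.

    All connections are closed when the session ends, including when the
    body raises.
    """
    connections = [ShardConnection(name) for name in registered_shards]
    try:
        yield connections
    finally:
        for conn in connections:
            conn.close()


with meta_node_session(["shard430", "shard432", "shard434"]) as conns:
    open_during_session = [c.open for c in conns]
open_after_session = [c.open for c in conns]
```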

The hybrid sharding system manages shard connections as described above (S410), and a user query entered in the process is analyzed as follows (S420).

The meta node 420 analyzes the user query requested by the application 412. If the user query is a shard query, an analysis result is generated, and query optimization is performed on the basis of that result to generate a plan tree. The meta node 420 can distinguish shard queries from non-shard queries and handle each accordingly. A user query that is not a shard query is processed by the meta node 420 itself, acting as coordinator.
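The distinction between shard queries and other queries can be illustrated with a deliberately naive sketch. The catalog `SHARD_TABLES`, the regular expression, and the returned labels are assumptions made for illustration; a real meta node would parse the SQL properly and build a plan tree.

```python
import re

# Hypothetical catalog of shard objects known to the meta node.
SHARD_TABLES = {"T1"}


def referenced_tables(sql: str) -> set:
    """Very naive extraction of table names after FROM/INTO/UPDATE/JOIN."""
    pattern = r"\b(?:FROM|INTO|UPDATE|JOIN)\s+([A-Za-z_][A-Za-z0-9_]*)"
    return {name.upper() for name in re.findall(pattern, sql, flags=re.IGNORECASE)}


def is_shard_query(sql: str) -> bool:
    """A query counts as a shard query if it references a shard object."""
    return bool(referenced_tables(sql) & SHARD_TABLES)


def handle(sql: str) -> str:
    """Route a query: plan it for the shard DBs, or run it on the meta node."""
    if is_shard_query(sql):
        return "plan-and-dispatch-to-shards"
    return "execute-on-meta-node"
```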

When a shard query is executed, the meta node 420 executes the generated plan tree. If the plan is inspected after the query has run, the plans of the shard SQL executed on each shard DB 430, 432, 434, 436, 438 can be viewed. The meta node 420 returns the result of the shard query to the application 412.

Next, the client-side sharding function of the hybrid sharding system is described with reference to FIG. 4.

In a preferred embodiment of the present invention, the client-side sharding function of the hybrid sharding system uses the shard-dedicated library 413 and the meta node 420, so neither the application nor its SQL needs to be changed.

In a preferred embodiment of the present invention, when the hybrid sharding system implements the client-side sharding function, the meta node 420 performs the analysis only when the application prepares a query for the first time (442), generating meta information that includes the schema information of the shard DBs. On its first connection to the meta node 420, the application 412 learns which tables exist in the shard DBs 430, 432, 434 through a shard schema lookup. Analysis is required only this first time; no further analysis is needed.

The meta node 420 can execute the query repeatedly using only the generated meta information and the bind information of the application 412. As a result, the performance scalability of client-side sharding is retained while the application does not have to be modified or rewritten.
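The analyze-once behaviour can be sketched with a small cache. Where the meta information is cached, and its shape, are assumptions here: the sketch keeps it in a hypothetical `ShardLibrary` object keyed by the SQL text.

```python
class MetaNode:
    """Stand-in meta node that counts how often it analyzes a query."""

    def __init__(self):
        self.analysis_count = 0

    def analyze(self, sql):
        self.analysis_count += 1
        # Hypothetical meta information: which parameter holds the shard key.
        return {"sql": sql, "shard_key_param": 0}


class ShardLibrary:
    """Caches per-query meta information after the first prepare."""

    def __init__(self, meta_node):
        self.meta_node = meta_node
        self._cache = {}

    def prepare(self, sql):
        if sql not in self._cache:
            self._cache[sql] = self.meta_node.analyze(sql)
        return self._cache[sql]

    def execute(self, sql, params):
        meta = self.prepare(sql)
        # Only the cached meta information and the bind values are needed.
        return params[meta["shard_key_param"]]


meta = MetaNode()
lib = ShardLibrary(meta)
keys = [lib.execute("SELECT * FROM t1 WHERE id = ?", [i]) for i in (1, 2, 3)]
```

Three executions of the same statement trigger a single analysis; only the bind value changes between runs.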

The meta node 420 analyzes the user query and, if it is a shard query containing a shard object, distributes the data across the at least one shard DB 430, 432, 434, 436, 438 based on the shard key 450. In a preferred embodiment of the present invention, the shard key 450 may be applied using a Range, List, or Hash scheme, among others.
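The three shard key schemes named above can be sketched as simple resolver functions. The text does not specify bound semantics or a hash function, so the inclusive range bounds and the use of CRC32 below are assumptions chosen only to keep the example deterministic.

```python
import zlib
from bisect import bisect_left


def range_node(key: int, bounds, nodes, default: str) -> str:
    """Range: the first bound >= key picks the node (inclusive bounds assumed)."""
    idx = bisect_left(bounds, key)
    return nodes[idx] if idx < len(nodes) else default


def list_node(key: str, mapping, default: str) -> str:
    """List: explicit key values are mapped to nodes."""
    return mapping.get(key, default)


def hash_node(key: str, nodes) -> str:
    """Hash: a stable hash of the key, modulo the node count, picks the node."""
    return nodes[zlib.crc32(key.encode("utf-8")) % len(nodes)]
```

A range or list scheme keeps related keys together on one node, while a hash scheme spreads keys evenly at the cost of locality.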

When the hybrid sharding system implements the client-side sharding function and the application 412 calls the SQLDriverConnect() function (S414), the shard library 413 connects to the meta node 420. The shard library 413 receives the information of all shard DBs 430, 432, 434, 436, 438, which serve as data nodes, registered in the meta node 420. It then connects to all of the shard DBs 430, 432, 434, 436, 438 and, when every connection succeeds, notifies the application 412 that the connection succeeded. If the connection to even one of the shard DBs 430, 432, 434, 436, 438 fails, however, it closes the connections to the shard DBs that had already succeeded and notifies the application 412 that the connection failed.
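The all-or-nothing connection rule can be sketched as follows. `FakeShard`, `ShardConnectError`, and `connect_all` are hypothetical names; the sketch does not call any real ODBC function.

```python
class ShardConnectError(Exception):
    """Raised when any shard DB cannot be reached."""


class FakeShard:
    """Stand-in for a shard DB endpoint; `reachable` simulates failures."""

    def __init__(self, name: str, reachable: bool = True) -> None:
        self.name = name
        self.reachable = reachable
        self.connected = False

    def connect(self) -> None:
        if not self.reachable:
            raise ConnectionError(f"cannot reach {self.name}")
        self.connected = True

    def disconnect(self) -> None:
        self.connected = False


def connect_all(shards) -> list:
    """Connect to every shard DB, or to none of them."""
    opened = []
    for shard in shards:
        try:
            shard.connect()
        except ConnectionError as exc:
            # Roll back the connections that had already succeeded.
            for done in opened:
                done.disconnect()
            raise ShardConnectError(str(exc)) from exc
        opened.append(shard)
    return opened
```

Rolling back on the first failure means the application only ever sees a fully connected set of shard DBs or a single connection error.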

Once the shard connections are created, the application 412 calls the SQLPrepare() function (442). The shard library 413 forwards the user query to the meta node 420. The meta node 420 analyzes whether the user query received from the application 412 is a shard query and passes the analysis result to the shard library 413.

If the user query cannot be executed by the shard library 413, an error message is returned to the application 412. The user query analysis result may include whether the user query is a shard query, the list of shard DBs on which the shard query can be executed if it is one, and the host variables related to the shard key together with how their bind values are to be interpreted.
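One possible shape for the analysis result is sketched below. The field names and the `check_executable` helper are hypothetical; the text lists only what kind of information the result may contain, not its layout.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class QueryAnalysis:
    """Hypothetical shape of the result the meta node returns for a query."""

    is_shard_query: bool
    # Shard DBs on which the shard query can be executed.
    candidate_shards: List[str] = field(default_factory=list)
    # Position of the host variable bound to the shard key, if any.
    shard_key_param: Optional[int] = None
    # How bind values are interpreted, e.g. "range", "list" or "hash".
    key_method: Optional[str] = None
    # Set when the shard library cannot execute the query.
    error: Optional[str] = None


def check_executable(analysis: QueryAnalysis) -> None:
    """Surface the error to the application if the query cannot run."""
    if analysis.error is not None:
        raise RuntimeError(analysis.error)
```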

Once the shard query has been analyzed, the shard library 413 performs SQLPrepare() (442) on the shard DBs included in the user query analysis result. When the application 412 calls the SQLBindParameter() function (444), the shard library performs SQLBindParameter() (444) on the shard DBs included in the user query analysis result.

When the application 412 calls SQLExecute() (446), the shard library 413 finds the value related to the shard key among the bound values, interprets that bind value, and selects the shard DB 430, 432, 434, 436, 438 on which to execute the shard query. It performs SQLExecute() (446) on the selected shard DB and returns the result to the application 412.
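The bind-then-route step can be sketched as a statement object that defers shard selection until execution. `ShardStatement`, `make_shard`, and `pick` are hypothetical, the shard DBs are in-memory fakes, and the row placement assumes inclusive range bounds of 3 and 6 with NODE1 as the default node.

```python
class ShardStatement:
    """Client-side statement that defers shard selection until execute."""

    def __init__(self, sql, shards, key_param, pick_shard):
        self.sql = sql
        self.shards = shards          # name -> callable(sql, params) -> result
        self.key_param = key_param    # 1-based index of the shard key parameter
        self.pick_shard = pick_shard  # callable(key value) -> shard name
        self.params = {}

    def bind_parameter(self, index, value):
        """Record a bind value, as SQLBindParameter would for each shard."""
        self.params[index] = value

    def execute(self):
        """Pick the shard from the shard-key bind value and run there only."""
        key_value = self.params[self.key_param]
        target = self.pick_shard(key_value)
        ordered = [self.params[i] for i in sorted(self.params)]
        return target, self.shards[target](self.sql, ordered)


def make_shard(rows):
    """Fake shard DB that answers 'SELECT ... WHERE id = ?' from a dict."""
    return lambda sql, params: rows.get(params[0])


shards = {
    "NODE1": make_shard({7: "Joe", 8: "Yoon", 9: "Jang"}),
    "NODE2": make_shard({1: "Kim", 2: "Lee", 3: "Park"}),
    "NODE3": make_shard({4: "Choi", 5: "Jeong", 6: "Kang"}),
}


def pick(key):
    """Assumed inclusive range bounds 3 and 6, with NODE1 as the default."""
    return "NODE2" if key <= 3 else "NODE3" if key <= 6 else "NODE1"


stmt = ShardStatement("SELECT name FROM t1 WHERE id = ?", shards, 1, pick)
stmt.bind_parameter(1, 5)
result = stmt.execute()
```

Rebinding the parameter and executing again reuses the same prepared statement and simply routes to a different shard.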

FIG. 5 is a system diagram of sharding in a conventional sharding system.

Conventionally, the client 510 had to send the user query to the coordinator 520 (S510) and receive the result from the coordinator 520 (S511), and the coordinator 520 had to forward the user query again to the data nodes 530, 532, 534 (S512) and receive the result (S513). Every data processing operation therefore had to cross the network twice.

In contrast, in the hybrid sharding system according to a preferred embodiment of the present invention shown in FIG. 6, communication with the meta node is required only at the beginning (S610, S611). For subsequent data processing, the client terminal can access the data nodes, or shard nodes, 630, 632, 634 directly (S612, S613), which reduces communication cost and leaves performance gains unconstrained even as shard nodes are added.
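The difference in network cost can be made concrete with a simple count of round trips. This is a rough model under stated assumptions: each request and response pair counts as one round trip, every operation touches a single data node, and the meta node is consulted once.

```python
def conventional_round_trips(operations: int) -> int:
    """Each operation crosses client->coordinator and coordinator->data node."""
    return 2 * operations


def hybrid_round_trips(operations: int, meta_lookups: int = 1) -> int:
    """One up-front meta node exchange, then one direct hop per operation."""
    return meta_lookups + operations
```

For 100 operations this model gives 200 round trips for the conventional path and 101 for the hybrid path, so the one-time lookup is amortized quickly.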

FIG. 7 is a flowchart of a method of performing sharding in a hybrid sharding system according to a preferred embodiment of the present invention.

The shard library of the client terminal (see 413 in FIG. 4) forwards the user query to the meta node (see 420 in FIG. 4) (S710). The shard library receives the information of the at least one shard DB registered in the meta node and connects the client terminal to the at least one shard DB (S720). The meta node then analyzes the user query to determine whether it is a shard query containing a shard object and, if so, distributes the data across the at least one shard DB based on the shard key (S730).

The present invention can also be embodied as computer-readable code on a computer-readable recording medium. The computer-readable recording medium includes every kind of recording device that stores data readable by a computer system. Examples include ROM, RAM, CD-ROM, magnetic tape, floppy disks, and optical data storage devices. The computer-readable recording medium can also be distributed over computer systems connected by a network, so that the computer-readable code is stored and executed in a distributed manner.

Preferred embodiments have been disclosed above in the drawings and the specification. Although specific terms are used here, they are used only to describe the present invention and not to limit its meaning or the scope of the invention set out in the claims. Those of ordinary skill in the art will therefore understand that various modifications and equivalent embodiments are possible. Accordingly, the true technical scope of protection of the present invention should be determined by the technical idea of the appended claims.

Claims

1. A hybrid sharding system comprising:
at least one shard DB for storing distributed data; and
a hybrid sharding unit, wherein the hybrid sharding unit includes:
a meta node that analyzes a user query to determine whether the query is a shard query including a shard object and, in the case of a shard query, distributes data to each of the at least one shard DB based on a shard key; and
a shard library that is installed in library form in an application of a client terminal, serves as a coordinator between the application and the at least one shard DB, transmits the user query to the meta node, receives information on the at least one shard DB registered in the meta node, and connects the client terminal to the at least one shard DB.

2. The hybrid sharding system of claim 1, wherein the hybrid sharding unit
further comprises a selection unit that selects a server-side sharding mode or a client-side sharding mode.

3. The hybrid sharding system of claim 1, wherein
the shard library, when it first accesses the meta node to analyze a user query, requests from the meta node the table information of each of the shard DBs and inquires in which shard DB the data resides, and, upon receiving a response from the meta node, directly accesses each of the shard DBs.

4. The hybrid sharding system of claim 1, wherein, when the hybrid sharding unit is implemented in a server-side sharding mode, the meta node can execute compound statements that include additional grouping, ordering, and joins between data nodes, and can perform data transfer and data redistribution between the shard DBs.

5. The hybrid sharding system of claim 1, wherein, when the hybrid sharding unit is implemented in a server-side sharding mode, the application of the client terminal accesses the meta node, the meta node creates a session, and, when the shard query is requested, a shard connection is generated per session for each of the at least one shard DB registered in the meta node.

6. The hybrid sharding system of claim 5, wherein the meta node
distinguishes the case where the user query is a shard query from the case where it is not, generates a plan tree by optimizing the shard query analysis result, and returns the result of executing the plan tree to the application of the client terminal.

7. The hybrid sharding system of claim 1, wherein, when the hybrid sharding unit is implemented in a client-side sharding mode, the shard library installed in the application of the client terminal accesses the meta node and generates a shard connection when connecting to each of the at least one shard DB.

8. The hybrid sharding system of claim 7, wherein the meta node returns a shard query analysis result to the application of the client terminal when the user query is a shard query, and the shard query analysis result includes an indication of whether the user query is a shard query, a list of shard DBs on which the query can be performed, a host variable associated with the shard key, and a value of a void.

9. A method of performing sharding in a hybrid sharding system, the method comprising:
transmitting a user query to a meta node from a shard library that is installed in library form in an application of a client terminal and serves as a coordinator between the application and at least one shard DB;
receiving, in the shard library, information on the at least one shard DB registered in the meta node and connecting the client terminal to the at least one shard DB;
analyzing the user query in the meta node to determine whether the query is a shard query including a shard object; and
when the meta node determines that the query is a shard query, distributing data to each of the at least one shard DB based on a shard key.

10. The method of claim 9, further comprising
selecting a server-side sharding mode or a client-side sharding mode.

11. The method of claim 9, wherein
the shard library, when it first accesses the meta node to analyze a user query, requests from the meta node the table information of each of the shard DBs and inquires in which of the at least one shard DB the data resides, and, upon receiving a response from the meta node, directly connects to each of the shard DBs.