KR20190079354A

KR20190079354A - Partitioned space based spatial data object query processing apparatus and method, storage media storing the same

Info

Publication number: KR20190079354A
Application number: KR1020170181486A
Authority: KR
Inventors: 정성원; 조범준
Original assignee: 서강대학교산학협력단
Priority date: 2017-12-27
Filing date: 2017-12-27
Publication date: 2019-07-05
Also published as: KR102005343B1

Abstract

The present invention relates to an apparatus for processing a spatial data object based on partition spaces, and to a method thereof. The apparatus comprises: a partition space generation unit generating a plurality of partition spaces by partitioning a data space including at least one spatial data object; and an index tree generation unit generating an index tree based on minimum boundary rectangle (MBR) information including all information on the plurality of partition spaces and the spatial data objects included in the corresponding partition spaces. Accordingly, the apparatus can efficiently partition the data space and perform efficient query processing using an appropriate table schema.

Description

Partitioned space based spatial data object query processing apparatus and method, and recording medium storing the same

The present invention relates to spatial data object query processing technology and, more particularly, to a partitioned-space-based spatial data object query processing apparatus and method that can partition a data space efficiently and process queries efficiently using an appropriate table schema.

The key to effective similarity query processing for multidimensional data such as spatial data is to store spatially adjacent objects in physically close locations and, at search time, to access only the data belonging to the required region. However, cloud-computing-based frameworks commonly used to handle big data, such as Hadoop, do not take the spatial proximity between objects into account, so query processing retrieves data that includes a large number of false positives.

Spatial data object queries may include range queries and k-nearest neighbor (kNN) queries, and indexing methods that use a NoSQL database management system such as HBase can provide an effective framework for fast random data access and efficient data updates on a distributed file system. A range query takes a query point q and a query radius r as input and returns the set of data objects whose distance from q is less than r. A kNN query takes a query point q and a number of nearest neighbors k as input and returns the set of the k data objects closest to q.
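The two query types can be illustrated with a brute-force sketch that scans every object. It assumes two-dimensional point objects and Euclidean distance; the function names `range_query` and `knn_query` are illustrative and do not come from the patent, which processes these queries through an index tree rather than a full scan.

```python
import heapq
import math


def range_query(points, q, r):
    """Return the points whose Euclidean distance from q is less than r."""
    return [p for p in points if math.dist(p, q) < r]


def knn_query(points, q, k):
    """Return the k points closest to q, nearest first."""
    return heapq.nsmallest(k, points, key=lambda p: math.dist(p, q))


points = [(1.0, 1.0), (2.0, 2.0), (5.0, 5.0), (9.0, 1.0)]
in_range = range_query(points, q=(0.0, 0.0), r=3.0)
nearest = knn_query(points, q=(0.0, 0.0), k=3)
```

A full scan like this touches every object, which is the cost that spatial indexing aims to avoid.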

The problem with indexing in HBase is that it is designed to support only one-dimensional row keys. Linearization techniques are therefore commonly used to store spatial data in HBase systems. Such methods subdivide the data space into grid-shaped cells and order the cells sequentially using the z-order. The row key of a spatial data item can be generated from the z-order number of its cell.
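A minimal sketch of z-order linearization follows. It assumes a square grid of 2^bits by 2^bits cells and places the x bit in the lower position of each bit pair; that ordering, the fixed-width binary string key, and the names `z_order` and `row_key` are assumptions made for illustration, not details given in the text.

```python
def z_order(col, row, bits):
    """Interleave the bits of a cell's (col, row) index into one z-order number."""
    z = 0
    for i in range(bits):
        z |= ((col >> i) & 1) << (2 * i)      # x bit goes to the even position
        z |= ((row >> i) & 1) << (2 * i + 1)  # y bit goes to the odd position
    return z


def row_key(x, y, cell_size, bits):
    """Map a point to the zero-padded binary z-order number of its grid cell."""
    col, row = int(x // cell_size), int(y // cell_size)
    return format(z_order(col, row, bits), f"0{2 * bits}b")


# A 100 x 100 data space cut into a 4 x 4 grid of 25 x 25 cells.
key = row_key(30, 10, cell_size=25, bits=2)
```

Zero-padding keeps the lexicographic order of the string keys identical to the numeric z-order, which matters for a store that sorts rows by key.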

FIG. 4 is an exemplary diagram illustrating how a range query is processed using a linearization technique. Referring to FIG. 4, when a range query 410 is executed, the minimum and maximum z-order values within the query range can be calculated. The query can then be processed by retrieving the rows whose row keys fall within that z-order range. However, because spatial proximity cannot be fully guaranteed, the results of a linearization technique can often include many false positives 430.
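The effect can be reproduced on a small grid. The sketch below assumes a 4 x 4 grid, a rectangular query expressed in cell coordinates (a simplification of the radius-based query in the figure), and a bit interleaving with x in the lower position; none of these specifics are taken from FIG. 4.

```python
def z_order(col, row, bits):
    """Interleave the bits of (col, row); x occupies the even bit positions."""
    z = 0
    for i in range(bits):
        z |= ((col >> i) & 1) << (2 * i)
        z |= ((row >> i) & 1) << (2 * i + 1)
    return z


BITS = 2  # 4 x 4 grid
cells = [(c, r) for r in range(4) for c in range(4)]

# Query rectangle in cell coordinates, bounds inclusive.
col_lo, col_hi, row_lo, row_hi = 1, 2, 1, 2
z_min = z_order(col_lo, row_lo, BITS)
z_max = z_order(col_hi, row_hi, BITS)

# A key-range scan returns every cell whose z-order number lies in [z_min, z_max].
scanned = [cell for cell in cells if z_min <= z_order(*cell, BITS) <= z_max]
hits = [(c, r) for (c, r) in scanned
        if col_lo <= c <= col_hi and row_lo <= r <= row_hi]
false_positives = [cell for cell in scanned if cell not in hits]
```

In this configuration the scan covers ten cells, of which only four lie inside the query rectangle; the other six are false positives that must be read and then discarded.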

FIG. 5 is an exemplary diagram illustrating how a range query is processed using a multidimensional indexing layer. Referring to FIG. 5, when a range query 510 is executed, false positives can be reduced by first searching the index layer 530 and accessing only the rows whose row keys are associated with the relevant regions. However, because the unit of space partitioning in such methods is large, query processing may still need to access a large number of false positives.

Korean Patent Application Publication No. 10-2016-0004781 (January 13, 2016) relates to an ontology query processing method using a big data framework. It stores the class type as a prefix for each node stored in a table, subdivides the schema so that classes and properties are each defined as separate table column families, and analyzes queries against the improved table schema of an HBase-based RDF store, applying HBase filters to drastically reduce the amount of data that must be searched so that queries can be processed efficiently. It can also process nested triple pattern queries that existing query processors could not handle.

Korean Registered Patent No. 10-1117709 (February 10, 2012) relates to a multidimensional histogram method using a minimum data-skew cover of a space partitioning tree, and to a recording medium storing a program for executing the method. The method determines a minimum data-skew cover based on the skew of the data objects within each partitioned space of a space partitioning tree, which is formed by partitioning a given space into spaces of various sizes, and then generates histogram buckets based on that cover. Unlike conventional multidimensional histogram methods, it thereby secures accurate selectivity estimates for region queries even when data objects are not uniformly distributed.

Korean Patent Application Publication No. 10-2016-0004781 (January 13, 2016)
Korean Registered Patent No. 10-1117709 (February 10, 2012)

An embodiment of the present invention aims to provide a partitioned-space-based spatial data object query processing apparatus and method that can partition a data space efficiently and process queries efficiently using an appropriate table schema.

An embodiment of the present invention aims to provide a partitioned-space-based spatial data object query processing apparatus and method that can generate an index tree based on a plurality of partition spaces produced by recursive quadrant partitioning and on the minimum boundary rectangles of spatial data objects.

An embodiment of the present invention aims to provide a partitioned-space-based spatial data object query processing apparatus and method that can effectively reduce false positives by processing similarity queries using an efficient index tree.

Among the embodiments, a partitioned-space-based spatial data object query processing apparatus includes a partition space generation unit that generates a plurality of partition spaces by partitioning a data space including at least one spatial data object, and an index tree generation unit that generates an index tree based on information about the plurality of partition spaces and on minimum boundary rectangle (MBR) information, where each MBR encloses all of the spatial data objects contained in the corresponding partition space.

The partition space generation unit may recursively re-partition each of the plurality of partition spaces about its center point until the density of spatial data objects in that partition space is at or below a specific threshold.

The index tree generation unit may generate the index tree based on a table composed of row data, each row including a key generated from the information about the partition space and a value generated from the minimum boundary rectangle information.

Each time a partition space is generated, the index tree generation unit may mark, for each axis, the partition space on the side nearer the origin as 0 and the one on the farther side as 1, and generate the key by concatenating the bits for the axes.

When a partition space is recursively re-partitioned, the index tree generation unit may represent the result by concatenating the previous partition information with the information about the newly created sub-partition.

The index tree generation unit may generate the index tree by creating an index table that stores the row data for internal nodes and a data table that stores the row data for leaf nodes.

The partitioned-space-based spatial data object query processing apparatus may further include a query processing unit that processes spatial data object queries using the index tree.

The query processing unit may use the index tree to process a range query, which includes a query point and a query radius, or a kNN query, which includes a query point and a number of nearest neighbors.

Among the embodiments, a partitioned-space-based spatial data object query processing method, performed in a partitioned-space-based spatial data object query processing apparatus, includes (a) generating a plurality of partition spaces by partitioning a data space including at least one spatial data object, and (b) generating an index tree based on information about the plurality of partition spaces and on minimum boundary rectangle (MBR) information, where each MBR encloses all of the spatial data objects contained in the corresponding partition space.

Step (b) may be a step of generating the index tree based on a table composed of row data, each row including a key generated from the information about the partition space and a value generated from the minimum boundary rectangle information.

Step (b) may be a step of marking, each time a partition space is generated and for each axis, the partition space on the side nearer the origin as 0 and the one on the farther side as 1, and generating the key by concatenating the bits for the axes.

Step (b) may be a step of representing, when a partition space is recursively re-partitioned, the result by concatenating the previous partition information with the information about the newly created sub-partition.

Step (b) may be a step of generating the index tree by creating an index table that stores the row data for internal nodes and a data table that stores the row data for leaf nodes.

The partitioned-space-based spatial data object query processing method may further include (c) processing a spatial data object query using the index tree.

Step (c) may be a step of using the index tree to process a range query, which includes a query point and a query radius, or a kNN query, which includes a query point and a number of nearest neighbors.

Among the embodiments, a recording medium is a computer-executable recording medium that records a spatial data object query processing method performed in a partitioned-space-based spatial data object query processing apparatus, the method including a process of generating a plurality of partition spaces by partitioning a data space including at least one spatial data object, and a process of generating an index tree based on information about the plurality of partition spaces and on minimum boundary rectangle (MBR) information, where each MBR encloses all of the spatial data objects contained in the corresponding partition space.

The disclosed technology may have the following effects. However, this does not mean that a specific embodiment must include all of the following effects or only the following effects, so the scope of the disclosed technology should not be understood as limited thereby.

The partitioned-space-based spatial data object query processing apparatus and method according to an embodiment of the present invention can generate an index tree based on a plurality of partition spaces produced by recursive quadrant partitioning and on the minimum boundary rectangles of spatial data objects.

The partitioned-space-based spatial data object query processing apparatus and method according to an embodiment of the present invention can effectively reduce false positives by processing similarity queries using an efficient index tree.

FIG. 1 is a diagram illustrating a partitioned-space-based spatial data object query processing system according to an embodiment of the present invention.
FIG. 2 is a block diagram illustrating the spatial data object query processing apparatus of FIG. 1.
FIG. 3 is a flowchart illustrating how the spatial data object query processing apparatus of FIG. 1 processes a spatial data object query.
FIG. 4 is an exemplary diagram illustrating how a range query is processed using a linearization technique.
FIG. 5 is an exemplary diagram illustrating how a range query is processed using a multidimensional indexing layer.
FIG. 6 is an exemplary diagram illustrating the space partitioning process performed by the spatial data object query processing apparatus.
FIG. 7 is an exemplary diagram illustrating the structure of an index tree node generated by the index tree generation unit of FIG. 2.
FIG. 8 is an exemplary diagram illustrating the table layout for the index tree nodes generated by the index tree generation unit of FIG. 2.
FIG. 9 is an exemplary diagram illustrating the range query processing performed by the spatial data object query processing apparatus.

The description of the present invention is merely an embodiment for structural or functional explanation, so the scope of the present invention should not be construed as limited by the embodiments described herein. That is, because the embodiments may be modified in various ways and may take various forms, the scope of the present invention should be understood to include equivalents capable of realizing its technical idea. Likewise, the objects or effects presented in the present invention do not mean that a specific embodiment must include all of them or only them, so the scope of the present invention should not be understood as limited thereby.

Meanwhile, the terms used in this application should be understood as follows.

Terms such as "first" and "second" are used to distinguish one component from another, and the scope of rights should not be limited by these terms. For example, a first component may be termed a second component, and similarly a second component may be termed a first component.

When a component is described as being "connected" to another component, it may be directly connected to that other component, but it should be understood that other components may exist in between. In contrast, when a component is described as being "directly connected" to another component, it should be understood that no other component exists in between. Other expressions describing relationships between components, such as "between" and "directly between" or "adjacent to" and "directly adjacent to", should be interpreted in the same way.

A singular expression should be understood to include the plural unless the context clearly indicates otherwise. Terms such as "include" or "have" specify the presence of a stated feature, number, step, operation, component, part, or combination thereof, and should be understood not to preclude the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Identification codes for steps (for example, a, b, c) are used for convenience of description and do not describe the order of the steps; unless a specific order is clearly stated in context, the steps may occur in an order different from the one given. That is, the steps may occur in the stated order, may be performed substantially simultaneously, or may be performed in reverse order.

The present invention can be implemented as computer-readable code on a computer-readable recording medium, and the computer-readable recording medium includes every kind of recording device that stores data readable by a computer system. Examples of computer-readable recording media include ROM, RAM, CD-ROM, magnetic tape, floppy disks, and optical data storage devices. The computer-readable recording medium may also be distributed over computer systems connected by a network, so that the computer-readable code is stored and executed in a distributed manner.

Unless defined otherwise, all terms used herein have the same meaning as commonly understood by a person of ordinary skill in the art to which the present invention belongs. Terms defined in commonly used dictionaries should be interpreted as consistent with their meaning in the context of the related art and, unless explicitly defined in this application, should not be interpreted as having an idealized or excessively formal meaning.

FIG. 1 is a diagram illustrating a partitioned-space-based spatial data object query processing system according to an embodiment of the present invention.

Referring to FIG. 1, the partitioned-space-based spatial data object query processing system (hereinafter, the spatial data object query processing system) 100 may include a user terminal 110, a spatial data object query processing apparatus 130, and a database 150.

The user terminal 110 may be a computing device through which a spatial data object query can be entered and the corresponding response checked. It may be implemented as a smartphone, a laptop, or a desktop computer, but is not limited to these and may also be implemented as various other devices such as a tablet PC. The user terminal 110 may be connected to the spatial data object query processing apparatus 130 over a network, and a plurality of user terminals 110, including user terminal 1 (110a) through user terminal n (110c), may be connected to the spatial data object query processing apparatus 130 at the same time.

The spatial data object query processing apparatus 130 may be implemented as a server, in the form of a computer or a program, that receives spatial data object queries and processes them using an index tree generated through space partitioning. The spatial data object query processing apparatus 130 may be connected wirelessly to the user terminal 110 via Bluetooth, WiFi, or the like, and may exchange data with the user terminal 110 over a network.

The spatial data object query processing apparatus 130 may be implemented to include the database 150 or may be implemented independently of it. When implemented independently of the database 150, the spatial data object query processing apparatus 130 may be connected to the database 150 by wire or wirelessly to exchange data.

The database 150 is a storage device that can store the various information needed for spatial data object query processing. The database 150 may store information about at least one range query or kNN query received from the user terminal 110 together with the result obtained by processing that query. It is not limited to this and may also store information that the spatial data object query processing apparatus 130 collects or processes in various forms while processing spatial data object queries.

In one embodiment, the database 150 may be an HBase instance capable of storing and managing spatial data. Here, HBase refers to Apache HBase, an open-source non-relational distributed database for the Hadoop platform. HBase has a data model similar to Google's Bigtable, which provides fast random access to large volumes of structured data, and can provide real-time random read/write access to data in HDFS (Hadoop Distributed File System).

The database 150 may consist of at least one independent sub-database, each storing information belonging to a specific range, or it may be an integrated database in which at least one independent sub-database is combined into one. When it consists of independent sub-databases, the sub-databases may be connected to one another wirelessly via Bluetooth, WiFi, or the like and may exchange data with one another over a network. When configured as an integrated database, the database 150 may include a control unit that combines the sub-databases into one and manages the data exchange and control flow among them.

FIG. 2 is a block diagram illustrating the spatial data object query processing apparatus of FIG. 1.

Referring to FIG. 2, the spatial data object query processing apparatus 130 may include a partition space generation unit 210, an index tree generation unit 230, a query processing unit 250, and a control unit 270.

The partition space generation unit 210 may generate a plurality of partition spaces by partitioning a data space including at least one spatial data object. Here, the data space may be the multidimensional space in which the spatial data objects are defined. A spatial data object may be defined to include an attribute value for each dimension, and the partition space generation unit 210 may define the data space using the range of values that each attribute of a spatial data object can take.

For example, when the data space is a square in a two-dimensional coordinate system whose sides extend a length of 100 from the origin along each axis, the partition space generation unit 210 may divide it into four partition spaces along the lines that pass through the center point (50, 50) of the space parallel to each axis. The four resulting partition spaces may all be squares of the same size.

In one embodiment, the partition space generation unit 210 may recursively re-partition each of the plurality of partition spaces about its center point until the density of spatial data objects in it is at or below a specific threshold. Here, the density of spatial data objects may be the number of spatial data objects present in a given partition space. For example, when re-partitioning continues until the number of spatial data objects in a partition space is 4 or fewer, the partition space generation unit 210 may re-partition a partition space containing 9 spatial data objects into four partition spaces and, if any of the resulting partition spaces contains four or more spatial data objects, may perform the re-partitioning process again on that partition space.
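The recursive quartering can be sketched as follows. The sketch assumes point objects, half-open square partition spaces, and a stopping rule of "at most `max_objects` points per partition space"; the `max_depth` guard against coincident points and the function name `partition` are hypothetical additions for illustration.

```python
def partition(points, x0, y0, size, max_objects=4, max_depth=16):
    """Recursively quarter the square [x0, x0+size) x [y0, y0+size).

    Splitting stops once a partition space holds at most max_objects points
    (or max_depth is exhausted, a guard against coincident points).
    Returns the leaf partition spaces as (x0, y0, size, points) tuples.
    """
    if len(points) <= max_objects or max_depth == 0:
        return [(x0, y0, size, points)]
    half = size / 2
    leaves = []
    # Quadrant origins: lower-left, lower-right, upper-left, upper-right.
    for qx, qy in ((x0, y0), (x0 + half, y0),
                   (x0, y0 + half), (x0 + half, y0 + half)):
        inside = [(x, y) for (x, y) in points
                  if qx <= x < qx + half and qy <= y < qy + half]
        leaves.extend(partition(inside, qx, qy, half, max_objects, max_depth - 1))
    return leaves


# Nine objects in a 100 x 100 data space; five of them fall in the
# lower-left quadrant, so only that quadrant is split a second time.
points = [(10, 10), (20, 20), (30, 30), (40, 40), (12, 45),
          (60, 20), (70, 80), (80, 70), (20, 70)]
leaves = partition(points, 0, 0, 100)
```

Because only dense regions are split again, sparse regions stay as large partition spaces while crowded regions end up finely divided.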

The index tree generation unit 230 may generate an index tree based on information about the partition spaces and on minimum boundary rectangle (MBR) information for the spatial data objects in each partition space. Here, the minimum boundary rectangle may be the rectangle of smallest area among those that enclose all of the spatial data objects in the partition space. The index tree generation unit 230 may generate the index tree using information about each partition space and about the MBR enclosing all of the spatial data objects present in it.
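For point objects, an axis-aligned MBR is simply the per-axis minimum and maximum of the coordinates. The following sketch assumes two-dimensional points and an `(x_min, y_min, x_max, y_max)` tuple layout, both chosen here for illustration.

```python
def mbr(points):
    """Smallest axis-aligned rectangle enclosing all points,
    returned as (x_min, y_min, x_max, y_max)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def area(rect):
    x_min, y_min, x_max, y_max = rect
    return (x_max - x_min) * (y_max - y_min)


# Objects inside the partition space [0, 50) x [0, 50).
objects = [(10, 10), (20, 20), (12, 18)]
rect = mbr(objects)
```

Here the MBR covers an area of 100, compared with 2500 for the enclosing partition space, so testing a query against the MBR instead of the whole partition space rules out far more of the region that contains no objects.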

In one embodiment, the index tree generation unit 230 may generate the index tree based on a table composed of row data, each row including a key generated from the partition space information and a value generated from the minimum boundary rectangle information. For example, the index tree generation unit 230 may generate row data consisting of a (key, value) pair and store it as one row of an HBase table.

In one embodiment, each time a partition space is generated, the index tree generation unit 230 may mark, for each axis, the partition space on the side nearer the origin as 0 and the partition space on the side farther from the origin as 1, and may concatenate the bits for the axes to generate the key value. For example, a partition space that lies on the far side of the origin along the x-axis and on the near side along the y-axis is marked '1' and '0' respectively, and these bits are concatenated so that the partition space is finally denoted '10'.

In one embodiment, when a partition space is recursively re-partitioned, the index tree generation unit 230 may denote the result by concatenating the previous partition information with the information on the newly re-partitioned space. For example, when the partition space denoted '10' is re-partitioned, the four resulting partition spaces may be denoted '00', '01', '10', and '11' and, prefixed with the previous partition information '10', are finally denoted '1000', '1001', '1010', and '1011'.
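The key encoding can be illustrated with a short sketch. The function names `quadrant_bits` and `partition_key` are hypothetical, the data space is assumed to be a square anchored at the origin, and a coordinate exactly on a center line is treated as the far side (an assumption; the description leaves this open).

```python
def quadrant_bits(x: float, y: float, cx: float, cy: float) -> str:
    """One bit per axis: '0' on the side nearer the origin, '1' on the farther side."""
    return ("1" if x >= cx else "0") + ("1" if y >= cy else "0")


def partition_key(x: float, y: float, size: float, depth: int,
                  x0: float = 0.0, y0: float = 0.0) -> str:
    """Concatenate the per-level quadrant bits of the cell containing (x, y)."""
    key = ""
    for _ in range(depth):
        half = size / 2
        cx, cy = x0 + half, y0 + half
        bits = quadrant_bits(x, y, cx, cy)
        key += bits                 # previous partition info + current quadrant
        if bits[0] == "1":
            x0 = cx                 # descend into the far half along x
        if bits[1] == "1":
            y0 = cy                 # descend into the far half along y
        size = half
    return key


# In an 80 x 80 space, (50, 10) is far along x and near along y at the first split.
print(partition_key(50, 10, 80, 1))  # prints 10
print(partition_key(70, 30, 80, 2))  # prints 1011
```

Because a child key always extends its parent's key by two bits, the key of a parent partition is a prefix of the keys of all partitions below it.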

In one embodiment, the index tree generation unit 230 may generate the index tree by creating an index table that stores row data for internal nodes and a data table that stores row data for leaf nodes. The row data for an internal node may include a key value generated from the information on the partition space generated by the partition space generation unit 210, the number of spatial data objects in the tree rooted at that internal node, information on the child nodes of that internal node, and the MBR information associated with each child node. The row data for a leaf node may include a key value generated from the information on the partition space associated with that leaf node, and information on the spatial data objects in that partition space.
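The two row layouts can be modeled in memory as shown below. This sketch does not use any HBase API; the class names `IndexRow` and `DataRow`, their field names, the sample values, and the helper `count_objects` are all hypothetical and serve only to show which pieces of information each row carries.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

MBR = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax), assumed layout
Point = Tuple[float, float]


@dataclass
class IndexRow:                      # one internal node
    rowkey: str                      # key derived from the partition space
    num_objects: int                 # objects in the subtree rooted here
    children: Dict[str, MBR] = field(default_factory=dict)  # child key -> child MBR


@dataclass
class DataRow:                       # one leaf node
    rowkey: str
    objects: List[Point] = field(default_factory=list)


index_table = {
    "Root": IndexRow("Root", 7, {"00": (5, 5, 30, 30), "10": (50, 10, 50, 10)}),
    "00": IndexRow("00", 6, {"0000": (5, 5, 15, 15), "0011": (20, 20, 30, 30)}),
}
data_table = {
    "0000": DataRow("0000", [(5, 5), (10, 10), (15, 15)]),
    "0011": DataRow("0011", [(20, 20), (25, 25), (30, 30)]),
    "10": DataRow("10", [(50, 10)]),
}


def count_objects(key: str) -> int:
    """Recount the objects under a node to check the stored num_objects value."""
    if key in data_table:
        return len(data_table[key].objects)
    return sum(count_objects(child) for child in index_table[key].children)
```

Keeping leaf rows in a separate table means the index table stays small, and a leaf can be fetched directly by its partition key.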

The query processing unit 250 may process a spatial data object query using the index tree generated by the index tree generation unit 230. Here, the index tree may correspond to a Q-MBR (Quadrant-based MBR) tree. The Q-MBR and the Q-MBR tree that uses it are described in more detail with reference to FIGS. 6 to 8.

In one embodiment, the query processing unit 250 may use the index tree generated by the index tree generation unit 230 to process a range query comprising a query point and a query radius, or a kNN query comprising a query point and a number of nearest neighbors. More specifically, during index traversal the query processing unit 250 may search the Q-MBR tree in breadth-first order (Breadth First Search, BFS) and compute the row keys (Rowkey) to be loaded in the next iteration.

In one embodiment, the query processing unit 250 may use three sets, N, I, and D, for the index search of a range query. Here, N may store the nodes loaded for traversal in the current iteration. I may correspond to the set of row keys to be loaded from the index table in the next iteration. D may correspond to the set of row keys to be loaded from the data table.

The query processing unit 250 may insert the row key of the root node into I and then start the traversal by loading that node from the index table. If the maximum depth of the index tree is known, the query processing unit 250 may insert multiple row keys for lower-level nodes into I instead of only the single row key of the root node.

When the current node is an internal node, the query processing unit 250 may compute the minimum distance between the MBR of each child node and the query point q, and may insert into I the row key of every child whose distance is within the query radius r. After visiting all nodes in N, the query processing unit 250 may prefetch additional nodes using the row keys stored in I and store the result in N for the next iteration.

When the index traversal reaches a leaf node, the query processing unit 250 may store the row key of that leaf node in D and load the associated spatial data objects in the final stage of query processing. When no row keys remain in I, the query processing unit 250 may load the spatial data objects stored in the leaf nodes listed in D and answer the range query by computing the distance between each spatial data object and the query point q.

The query processing unit 250 may insert into the result set only those spatial data objects whose distance from the query point q is less than or equal to the query radius r. The query processing unit 250 may provide information on the spatial data objects in the result set to the user terminal 110 as the query processing result.
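The range query procedure built on the sets N, I, and D can be sketched over in-memory dictionaries standing in for the index table and the data table. Several points are assumptions made for illustration: the dictionary layout (each index row maps child row keys to child MBRs), the rule that a child key absent from the index table denotes a leaf, the default root key `"Root"`, and point-shaped spatial data objects. Each pass of the `while` loop corresponds to one batched load (one iteration) in the description.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]
MBR = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax), assumed layout


def min_dist(q: Point, mbr: MBR) -> float:
    """Minimum Euclidean distance from q to an axis-aligned rectangle (0 if inside)."""
    xmin, ymin, xmax, ymax = mbr
    dx = max(xmin - q[0], 0.0, q[0] - xmax)
    dy = max(ymin - q[1], 0.0, q[1] - ymax)
    return math.hypot(dx, dy)


def range_query(index_table: Dict[str, Dict[str, MBR]],
                data_table: Dict[str, List[Point]],
                q: Point, r: float, root: str = "Root") -> List[Point]:
    I = [root]          # row keys to load from the index table next
    D: List[str] = []   # leaf row keys to load from the data table at the end
    while I:
        N = [index_table[key] for key in I]   # nodes loaded for this iteration
        I = []
        for children in N:
            for key, mbr in children.items():
                if min_dist(q, mbr) > r:
                    continue                  # child cannot hold a qualifying object
                (I if key in index_table else D).append(key)
    # Final stage: load the leaf objects and keep those within the radius.
    return [p for key in D for p in data_table[key] if math.dist(p, q) <= r]
```

Pruning on the MBR distance only rules out whole subtrees; the exact distance check in the last line is still needed because an MBR within range may contain objects outside it.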

In one embodiment, the query processing unit 250 may maintain two priority queues, N and Q, for kNN query processing. Here, N may store nodes in ascending order of the minimum distance from the query point to their MBRs, and Q may store the candidate set of the kNN in descending order.

The query processing unit 250 may process a kNN query in two stages. In the first stage, the query processing unit 250 may traverse the Q-MBR tree using an approximate range r to find the row keys of leaf nodes. Here, the approximate range may correspond to the minimum distance that guarantees enough spatial data objects to find the desired number of neighbors.

During index traversal, the query processing unit 250 may examine the nodes stored in N one by one to determine whether each has a child node whose minimum distance is less than r. If the minimum distance of a child node is less than r, the query processing unit 250 may insert the row key of that child node into the list I. The query processing unit 250 may set the range r to a very large value at the start, and may update r to the maximum distance of the node in N once the number of objects in nodes closer than the current node exceeds k.

When the current node is a leaf node, the query processing unit 250 may store its row key in a list L so that the leaf nodes can be loaded together. When no nodes remain in N, the query processing unit 250 may load the nodes whose row keys are stored in I. The query processing unit 250 may end the first stage when N holds no nodes and I holds no row keys. After the index search, the query processing unit 250 may load the spatial data objects of the leaf nodes stored in L and evaluate their distances to identify the query result set. The query processing unit 250 may store the evaluated spatial data objects in Q and return the query result set when no spatial data objects remain to be evaluated.
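The two-stage kNN procedure can be sketched as follows, again over in-memory dictionaries. This is one possible reading of the description rather than the claimed method. The assumptions are: each index row maps a child row key to a pair `(MBR, object count)`; a child key absent from the index table denotes a leaf; objects are points; and r is tightened to the largest maximum-distance among the nearest child entries that together hold at least k objects. Children are kept when their minimum distance is at most r (rather than strictly less) so that ties are not lost.

```python
import heapq
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]
MBR = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax), assumed layout


def min_dist(q: Point, m: MBR) -> float:
    return math.hypot(max(m[0] - q[0], 0.0, q[0] - m[2]),
                      max(m[1] - q[1], 0.0, q[1] - m[3]))


def max_dist(q: Point, m: MBR) -> float:
    return math.hypot(max(abs(q[0] - m[0]), abs(q[0] - m[2])),
                      max(abs(q[1] - m[1]), abs(q[1] - m[3])))


def knn_query(index_table: Dict[str, Dict[str, Tuple[MBR, int]]],
              data_table: Dict[str, List[Point]],
              q: Point, k: int, root: str = "Root") -> List[Point]:
    r = math.inf                 # approximate range, very large at the start
    I, L = [root], []            # index row keys to load; leaf row keys found
    while I:                     # stage 1: traverse the index level by level
        entries = [e for key in I for e in index_table[key].items()]
        I = []
        N = sorted(entries, key=lambda e: min_dist(q, e[1][0]))  # nearest first
        seen, bound = 0, 0.0
        for _, (mbr, count) in N:
            seen += count
            bound = max(bound, max_dist(q, mbr))
            if seen >= k:        # at least k objects lie within `bound` of q
                r = min(r, bound)
                break
        for key, (mbr, _) in N:
            if min_dist(q, mbr) <= r:
                (I if key in index_table else L).append(key)
    Q: List[Tuple[float, Point]] = []   # stage 2: max-heap of the k best candidates
    for key in L:
        for p in data_table[key]:
            d = math.dist(p, q)
            if len(Q) < k:
                heapq.heappush(Q, (-d, p))
            elif d < -Q[0][0]:
                heapq.heapreplace(Q, (-d, p))
    return [p for _, p in sorted(Q, reverse=True)]  # nearest neighbor first
```

Storing negated distances makes Python's min-heap behave as the descending-order candidate queue Q, so the current worst candidate is always at the top and can be replaced in logarithmic time.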

The control unit 270 may control the overall operation of the spatial data object query processing apparatus 130 and manage the control flow or data flow among the partition space generation unit 210, the index tree generation unit 230, and the query processing unit 250.

FIG. 3 is a flowchart illustrating the process by which the spatial data object query processing apparatus of FIG. 1 processes a spatial data object query.

Referring to FIG. 3, the spatial data object query processing apparatus 130 may, through the partition space generation unit 210, partition a data space including at least one spatial data object to generate a plurality of partition spaces (step S310). In one embodiment, the partition space generation unit 210 may recursively re-partition each of the plurality of partition spaces about its center point until the density of spatial data objects in that partition space falls to or below a specific threshold.

The spatial data object query processing apparatus 130 may, through the index tree generation unit 230, generate an index tree based on information on the partition spaces and on minimum boundary rectangle information enclosing all spatial data objects included in each partition space (step S330). In one embodiment, the index tree generation unit 230 may generate the index tree based on a table composed of row data, each row including a key generated from the partition space information and a value generated from the minimum boundary rectangle information.

In one embodiment, each time a partition space is generated, the index tree generation unit 230 may mark, for each axis, the partition space on the side nearer the origin as 0 and the partition space on the side farther from the origin as 1, and may concatenate the bits for the axes to generate the key value. In one embodiment, when a partition space is recursively re-partitioned, the index tree generation unit 230 may denote the result by concatenating the previous partition information with the information on the newly re-partitioned space. In one embodiment, the index tree generation unit 230 may generate the index tree by creating an index table that stores row data for internal nodes and a data table that stores row data for leaf nodes.

The spatial data object query processing apparatus 130 may, through the query processing unit 250, process a spatial data object query using the index tree generated by the index tree generation unit 230 (step S350). In one embodiment, the query processing unit 250 may use the index tree to process a range query comprising a query point and a query radius, or a kNN query comprising a query point and a number of nearest neighbors.

FIG. 6 is an exemplary diagram illustrating the space partitioning process performed by the spatial data object query processing apparatus.

Referring to FIG. 6, the spatial data object query processing apparatus 130 may, through the partition space generation unit 210, define a data space 610 formed by two axes that cross at right angles at the origin 611, each with a maximum extent of 80. The partition space generation unit 210 may divide the data space 610 into four about its center point (40, 40) to generate a plurality of partition spaces. The partition space generation unit 210 may recursively re-partition a partition space until the density of its spatial data objects falls to or below a specific threshold. In FIG. 6, the density threshold may be 4. That is, the partition space generation unit 210 may recursively re-partition the space until the number of spatial data objects in any single partition space is four or fewer.

After partitioning the data space 610, the spatial data object query processing apparatus 130 may generate a minimum boundary rectangle (MBR) 613 for the spatial data objects in each quadrant. The spatial data object query processing apparatus 130 may generate a Q-MBR (Quadrant-based MBR) 630 based on the information on the partition space 631 and the MBR 633, and may store it in an HBase database. The spatial data object query processing apparatus 130 may store the Q-MBR 630 in an HBase table and use it as a building block of a hierarchical index tree.

The information on the partition space 631 may be used as the row key in order to reduce update cost. Each column value may be stored as a separate key-value pair, so updating a row key can trigger a large number of insertions for the corresponding key-value pairs. The frequently updated MBR 633 information therefore cannot be used as the row key. The worst case arises when an update requires creating a new row key for the grouped spatial data objects and regenerating all of their key-value pairs. Accordingly, the MBR 633 information may be stored in columns, where updates are relatively inexpensive, and may be used for distance calculations during spatial query processing.

FIG. 7 is an exemplary diagram illustrating the structure of the index tree nodes generated by the index tree generation unit of FIG. 2.

Referring to FIG. 7, the spatial data object query processing apparatus 130 may, through the index tree generation unit 230, generate an index tree capable of creating and managing Q-MBRs. Here, the index tree may correspond to a Q-MBR tree and may be implemented as HBase tables or as an in-memory index. The Q-MBR tree may be similar in structure to a quadtree.

In one embodiment, the index tree generation unit 230 may generate the index tree by creating an index table that stores row data for internal nodes (Internal Node) and a data table that stores row data for leaf nodes (Leaf Node). An internal node may be stored in the index table (Index Table) and may consist of the node's quadrant information (Quadrant), the MBRs of its child nodes (MBRs of children), and the number of spatial data objects contained in its subtree (Number of objects). A leaf node may be stored in the data table (Data Table) and may consist of its quadrant (Quadrant), the number of spatial data objects in the leaf node (Number of objects), and a list of those spatial data objects.

FIG. 8 is an exemplary diagram illustrating the table layout for the index tree nodes generated by the index tree generation unit of FIG. 2.

Referring to FIG. 8, the spatial data objects stored in the data table can be accessed directly using quadrant information instead of pointers to the spatial data objects in a leaf node's region. The tables in FIG. 8 contain an example of the hierarchical Q-MBR index structure for the spatial data objects shown in FIG. 6. Each Q-MBR may correspond to a leaf node of the index tree, and the Q-MBR of a parent node may be expressed as a quadrant and an MBR that enclose the Q-MBRs of its child nodes.

The '#objects' value under the 'Meta family' entry of the index table indicates the number of spatial data objects in the tree rooted at the corresponding node. For example, the node whose 'Rowkey' value is 'Root' has a '#objects' value of 15, so the tree contains 15 spatial data objects.

FIG. 9 is an exemplary diagram illustrating the range query processing performed by the spatial data object query processing apparatus.

Referring to FIG. 9, the spatial data object query processing apparatus 130 may process a range query through the query processing unit 250. The query processing unit 250 may begin by inserting the row key of R0 into I and loading the node into N. Because the two child nodes R2 and R3 overlap the query range, the query processing unit 250 may insert the row keys of R2 and R3 into I in the first iteration and load both nodes together from the index table. Likewise, the row keys of R9 and R6 may be inserted and the nodes loaded in the second iteration. Because R9 and R6 are leaf nodes, the query processing unit 250 may examine the spatial data objects of R9 and R6 in the next iteration to answer the range query. The result set R contains the two spatial data objects p1 and p2, and since no nodes remain to be visited, the query processing unit 250 may end the search.

Although the present invention has been described above with reference to preferred embodiments, those skilled in the art will understand that various modifications and changes may be made to the invention without departing from the spirit and scope of the invention as set forth in the claims below.

100: Partition space based spatial data object query processing system
110: User terminal    130: Spatial data object query processing apparatus
150: Database
210: Partition space generation unit    230: Index tree generation unit
250: Query processing unit    270: Control unit
410: Range query    430: False positive
510: Range query    530: Index layer
610: Data space    611: Origin
613: Minimum boundary rectangle    630: Q-MBR
631: Partition space    633: MBR

Claims

1. A partition space based spatial data object query processing apparatus comprising:
a partition space generation unit that partitions a data space including at least one spatial data object to generate a plurality of partition spaces; and
an index tree generation unit that generates an index tree based on information on the plurality of partition spaces and on minimum boundary rectangle (MBR) information enclosing all spatial data objects included in each corresponding partition space.

2. The apparatus of claim 1, wherein the partition space generation unit recursively re-partitions each of the plurality of partition spaces about the center point of that partition space until the density of the spatial data objects therein falls to or below a specific threshold.

3. The apparatus of claim 1, wherein the index tree generation unit generates the index tree based on a table composed of row data including a key value generated based on the information on the partition spaces and a value generated based on the minimum boundary rectangle information.

4. The apparatus of claim 3, wherein, each time a partition space is generated, the index tree generation unit marks, for each axis, the partition space on the side nearer the origin as 0 and the partition space on the side farther from the origin as 1, and concatenates the bits for the axes to generate the key value.

5. The apparatus of claim 4, wherein, when a partition space is recursively re-partitioned, the index tree generation unit denotes the result by concatenating the previous partition information with the information on the newly re-partitioned space.

6. The apparatus of claim 3, wherein the index tree generation unit generates the index tree by creating an index table that stores the row data for internal nodes and a data table that stores the row data for leaf nodes.

7. The apparatus of claim 1, further comprising a query processing unit that processes a spatial data object query using the index tree.

8. The apparatus of claim 7, wherein the query processing unit uses the index tree to process a range query comprising a query point and a query radius, or a kNN query comprising a query point and a number of nearest neighbors.

9. A partition space based spatial data object query processing method performed by a partition space based spatial data object query processing apparatus, the method comprising:
(a) partitioning a data space including at least one spatial data object to generate a plurality of partition spaces; and
(b) generating an index tree based on information on the plurality of partition spaces and on minimum boundary rectangle (MBR) information enclosing all spatial data objects included in each corresponding partition space.

10. The method of claim 9, wherein step (b) comprises generating the index tree based on a table composed of row data including a key value generated based on the information on the partition spaces and a value generated based on the minimum boundary rectangle information.

11. The method of claim 10, wherein step (b) comprises, each time a partition space is generated, marking, for each axis, the partition space on the side nearer the origin as 0 and the partition space on the side farther from the origin as 1, and concatenating the bits for the axes to generate the key value.

12. The method of claim 11, wherein step (b) comprises, when a partition space is recursively re-partitioned, denoting the result by concatenating the previous partition information with the information on the newly re-partitioned space.

13. The method of claim 10, wherein step (b) comprises generating the index tree by creating an index table that stores the row data for internal nodes and a data table that stores the row data for leaf nodes.

14. The method of claim 9, further comprising (c) processing a spatial data object query using the index tree.

15. The method of claim 14, wherein step (c) comprises using the index tree to process a range query comprising a query point and a query radius, or a kNN query comprising a query point and a number of nearest neighbors.

16. A computer-executable recording medium recording a spatial data object query processing method performed by a partition space based spatial data object query processing apparatus, the method comprising:
partitioning a data space including at least one spatial data object to generate a plurality of partition spaces; and
generating an index tree based on information on the plurality of partition spaces and on minimum boundary rectangle (MBR) information enclosing all spatial data objects included in each corresponding partition space.