KR100923723B1

KR100923723B1 - Method for clustering similar trajectories of moving objects in road network databases

Info

Publication number: KR100923723B1
Application number: KR1020070026351A
Authority: KR
Inventors: 김상욱; 백지행; 원정임; 박경린
Original assignee: 제주대학교 산학협력단
Priority date: 2007-03-16
Filing date: 2007-03-16
Publication date: 2009-10-27
Also published as: KR20080084504A

Abstract

The present invention relates to a method for clustering similar trajectories in a road network space, comprising: (a) storing trajectory data of moving objects in the road network space in a database; (b) determining whether a similarity measurement function value already exists for the trajectory data stored in the database; (c) if no such value is found in step (b), measuring the similarity with the similarity measurement function and storing it; (d) finding the two trajectories for which the similarity measurement function value is largest; (e) mapping the trajectories into k dimensions using the two trajectories found in step (d) as the reference; (f) clustering the result mapped in step (e); and (g) storing the clusters formed in step (f) in the database. The method enables effective similar-trajectory search and clustering for moving objects in a road network.

Road network, moving object, similar trajectory, similarity function, DSL function, summary information

Description

Method for clustering similar trajectories of moving objects in road network databases

FIG. 1 is a schematic diagram of the segments that represent the trajectories of moving objects according to an embodiment of the present invention.

FIG. 2 is a flowchart for measuring the similarity of moving object trajectories in a road network space according to the present invention.

FIG. 3 is a flowchart of the process of clustering similar trajectories of moving objects in a road network space according to the present invention.

FIG. 4 is an example of clustering similar trajectories of moving objects on a road network according to the present invention.

FIG. 5 shows the process of summarizing cluster information during the clustering of similar trajectories of moving objects according to the present invention.

FIG. 6 is a flowchart for searching for trajectories similar to a query trajectory of a moving object in a road network space according to the present invention.

The present invention relates to a method for clustering similar trajectories in a road network space. More specifically, it relates to a clustering method that makes it possible to efficiently retrieve the trajectory cluster similar to a given query trajectory from a large volume of trajectory information recorded as moving objects travel through the road network over time.

Recently, advances in technologies such as mobile communication, telematics, and GPS have increased interest in ways to make effective use of the location information of moving objects.

A moving object periodically transmits its location to a server. These data have the character of spatio-temporal data, in which spatial location information changes over time.

User queries against a moving object database fall broadly into two types: past-time queries, which retrieve the history of a moving object's past movement paths, and future-time queries, which predict and retrieve a moving object's future movement. Future-time queries can be used in a variety of services that depend on predicting future conditions, such as location-based services, traffic information systems, and aircraft control systems.

These services are based on trajectory information of moving objects measured by satellite-based GPS at regular time intervals.

Among techniques for efficiently storing and managing this trajectory information, there have been active attempts to search for or cluster trajectories similar to the trajectory of a given moving object and to analyze the results together with road information, user information, and the like.

Most of these attempts target trajectory information expressed as a sequence of two-dimensional spatial coordinates (x, y) in Euclidean space, and they use the Euclidean distance, a distance measure on spatial coordinates, as the similarity measure for finding similar trajectories.

To measure the similarity between moving object trajectories in Euclidean space, techniques using distance functions such as EU (Euclidean distance), DTW (dynamic time warping distance), ERP (edit distance with real penalty), LCSS (longest common subsequence), and EDR (edit distance on real sequence) have been proposed.

Given two trajectories of the same length, the EU method computes the Euclidean distance between the k-dimensional spatio-temporal coordinates that make up the trajectories.

This method requires the two trajectories being compared to have the same length. Because trajectories in real applications rarely have the same length, the method is not suitable in practice.

The DTW method allows a particular spatio-temporal coordinate value in a trajectory to be repeated any number of times so that the similarity between trajectories of different lengths can be measured. The ERP method allows gaps within similar trajectories and can measure the similarity between two trajectories of different lengths, but it is very sensitive to noise. EU, DTW, and ERP are all noise-sensitive similarity measures, so their accuracy is low on real trajectories, where noise is likely to be introduced while the trajectory is acquired or represented.

The LCSS and EDR methods address this problem. To reduce sensitivity to noise, they treat two coordinates as matched when the difference between their spatio-temporal coordinate values is smaller than a given tolerance. However, the LCSS method is a similarity measure that does not allow gaps within a trajectory, and its accuracy is low. The EDR method uses edit distance to measure similarity and, unlike LCSS, allows gaps within a trajectory.

Because all of these methods are based on Euclidean space, they are not suitable for measuring the similarity between trajectories in the road network space that the present invention targets.

A trajectory in a road network space is expressed as a sequence of the road segments that the moving object has passed through, and the same road segment never appears repeatedly in immediate succession within a trajectory. DTW and ERP, which rely on repetition, therefore cannot be applied. LCSS and EDR can measure similarity over the identifiers of the road segments that make up a trajectory, but they must repeat the similarity measurement for sub-trajectories, which can degrade performance.

The present invention is proposed to solve the problems described above. Its object is to present a similarity measurement function for the trajectory information of moving objects stored in a database, one that does not depend on the conventional Euclidean space, and to provide a clustering method based on that function.

Another object is to provide a query processing method that quickly finds the cluster corresponding to the query trajectory of a moving object.

A further object is to provide a clustering search method that presents to the user, together with the cluster found to be similar to a moving object's trajectory, the user information and road information associated with that cluster.

To achieve the above objects, a method for measuring the similarity of moving object trajectories in a road network according to the present invention comprises: (a) storing the trajectory of a moving object in the road network space in a database as trajectory data that includes segment identifiers and segment lengths; (b) determining whether a value computed by the similarity measurement function already exists for the stored trajectory and the trajectories already in the database; (c) if no such value exists, computing the similarity measurement function value with the similarity measurement function; (d) storing the value computed in step (c) in the database as the similarity between the stored trajectory and the trajectories already in the database; and (e) if step (b) finds that a value has already been computed for the stored trajectory, storing that previously computed value as the similarity.

In this method for measuring the similarity of moving object trajectories in a road network, the similarity measurement function for any two trajectories T_i and T_j may be the function DSN(T_i, T_j), Equation (1) of the detailed description.

Alternatively, in this method for measuring the similarity of moving object trajectories in a road network, the similarity measurement function for any two trajectories T_i and T_j may be the function DSL(T_i, T_j), Equation (2) of the detailed description.

A method for clustering similar trajectories of moving objects in a road network space according to the present invention comprises: (a) storing trajectory data of moving objects in the road network space in a database; (b) determining whether a similarity measurement function value already exists for the trajectory data stored in the database; (c) if no such value is found in step (b), measuring the similarity with the similarity measurement function and storing it; (d) finding the two trajectories for which the similarity measurement function value is largest; (e) mapping the trajectories into k dimensions using the two trajectories found in step (d) as the reference; (f) clustering the result mapped in step (e); and (g) storing the clusters formed in step (f) in the database.

In this clustering method, the similarity measurement function for any two trajectories T_i and T_j may be the function DSL(T_i, T_j), Equation (2) of the detailed description.

The clustering method may further comprise: (i) determining whether the cluster information has been summarized; (j) if step (i) finds that it has not, summarizing the cluster information with a frequency calculation formula; and (k) storing the cluster summary information in the database.

In the clustering method, the cluster summary information includes, for each cluster, segment summary information consisting of the segment list of each trajectory contained in the cluster and a segment weight that indicates how frequently that segment list occurs within the cluster.

In the clustering method, the frequency calculation formula is Equation (3) of the detailed description.

The clustering method may further comprise: (m-1) submitting a query trajectory to the database; (m-2) extracting the segments that make up the query trajectory; (m-3) determining whether any segment of the query trajectory matches a segment in the trajectory cluster summary information stored in the database; (m-4) if step (m-3) finds matching clusters, computing the sum of the segment weights for each matching cluster; (m-5) finding the cluster with the largest weight sum; and (m-6) determining that cluster to be the cluster to which the query trajectory belongs.
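Steps (m-1) to (m-6) can be sketched as a weighted lookup. This is a minimal illustration, not the patented implementation: it assumes a simplified summary in which each cluster maps individual segment identifiers to a weight, and the names `assign_cluster` and `summaries` are hypothetical.

```python
from typing import Dict, List, Optional


def assign_cluster(query_segments: List[str],
                   summaries: Dict[str, Dict[str, float]]) -> Optional[str]:
    """Return the cluster whose matching segment weights sum highest.

    `summaries` is an assumed layout: cluster id -> {segment id: weight}.
    Returns None when no cluster shares a segment with the query.
    """
    query = set(query_segments)              # (m-2) segments of the query
    best_id, best_score = None, 0.0
    for cluster_id, weights in summaries.items():
        # (m-3)/(m-4): sum the weights of segments that match the query
        score = sum(w for seg, w in weights.items() if seg in query)
        if score > best_score:               # (m-5) keep the largest sum
            best_id, best_score = cluster_id, score
    return best_id                           # (m-6) cluster of the query


# Hypothetical summaries for two clusters.
summaries = {
    "C1": {"s1": 0.5, "s2": 0.5, "s3": 0.25},
    "C2": {"s3": 1.0, "s9": 0.5},
}
print(assign_cluster(["s1", "s2", "s3"], summaries))  # C1 (1.25 vs 1.0)
```

Only the cluster summaries are scanned, not the stored trajectories, which is what makes the lookup cheap relative to computing a similarity against every trajectory.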

The clustering method may further comprise presenting to the user the cluster found to be similar to the query trajectory, together with the user information and road information associated with that cluster.

Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings.

As shown in FIG. 1, the present invention expresses the trajectory of a moving object as T_i = {(S_1, L_1), ..., (S_n, L_n)}. Here T_i is the identifier of the trajectory, S_j is the identifier of a segment, and L_j is the length of that segment.

User information, road information, and the like are stored and managed together with the trajectory information as additional information.

In the technique proposed by the present invention, a trajectory consists of a list of segment identifiers, each represented as a string.

Referring to FIG. 2, a new trajectory of a moving object in the road network space is first stored in the database as trajectory data that includes segment identifiers and segment lengths (S21).

Next, it is determined whether a value computed by the similarity measurement function already exists for the stored trajectory and the trajectories already in the database (S22).

If step S22 finds no previously computed value for the stored trajectory, the similarity measurement function is used to compute the similarity value between the stored trajectory and the trajectories already in the database (S23).

The value computed in step S23 is stored in the database as the similarity between the stored trajectory and the trajectories already in the database (S24).

If step S22 finds that a value has already been computed for the stored trajectory, that previously computed value is stored as the similarity between the stored trajectory and the trajectories already in the database (S25).
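The flow of FIG. 2 amounts to a compute-once lookup of pairwise similarity values. The sketch below illustrates that idea with an in-memory dictionary standing in for the database. The class `SimilarityStore`, its methods, and the placeholder measure `overlap_distance` are all hypothetical names introduced for illustration.

```python
from typing import Callable, Dict, FrozenSet, List

Trajectory = List[str]  # assumed here: a trajectory as a list of segment ids


class SimilarityStore:
    """In-memory stand-in for the database used in steps S21 to S25."""

    def __init__(self, measure: Callable[[Trajectory, Trajectory], float]):
        self.measure = measure
        self.trajectories: Dict[str, Trajectory] = {}
        self.similarity: Dict[FrozenSet[str], float] = {}
        self.computed = 0  # counts how often the measure actually ran

    def add(self, tid: str, trajectory: Trajectory) -> Dict[str, float]:
        self.trajectories[tid] = trajectory                    # S21
        return {other: self.lookup(tid, other)
                for other in self.trajectories if other != tid}

    def lookup(self, a: str, b: str) -> float:
        key = frozenset((a, b))          # the pair is unordered
        if key not in self.similarity:                         # S22
            value = self.measure(self.trajectories[a],
                                 self.trajectories[b])         # S23
            self.similarity[key] = value                       # S24
            self.computed += 1
        return self.similarity[key]                            # S25


def overlap_distance(a: Trajectory, b: Trajectory) -> float:
    # Placeholder measure, not the patent's function.
    return len(set(a) ^ set(b)) / (len(a) + len(b))


store = SimilarityStore(overlap_distance)
store.add("T1", ["s1", "s2"])
store.add("T2", ["s1", "s3"])
store.add("T3", ["s4"])
print(store.computed)  # 3 pairs computed, each exactly once
```

A repeated `store.lookup("T2", "T1")` returns the stored value without calling the measure again.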

In general, the ED (edit distance) function, which is widely used as a distance function between strings, can be used to measure the similarity between two given trajectories.
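For reference, edit distance over segment identifier lists can be computed with the usual dynamic program. The sketch below is the standard Levenshtein recurrence applied to lists instead of character strings; the function name and the sample segment identifiers are illustrative only.

```python
from typing import Sequence


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two sequences of segment ids."""
    prev = list(range(len(b) + 1))       # distance from empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]                       # distance to empty prefix of b
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


# Two hypothetical trajectories that differ in one segment.
print(edit_distance(["s1", "s2", "s3"], ["s1", "s4", "s3"]))  # 1
```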

However, when the two trajectories being compared have different numbers of segments, the similarity value is dominated by the trajectory with more segments.

The present invention proposes the similarity measurement function DSN (dissimilarity with number) of Equation (1) between trajectories T_i and T_j.

(1)


For example, the similarity values given by the DSN function for the trajectories of FIG. 1 are as follows.

DSN(T1, T2) = 0.56, DSN(T1, T3) = 0.4, DSN(T1, T4) = 0.56, DSN(T2, T3) = 0.64, DSN(T2, T4) = 0.6, DSN(T3, T4) = 0.64

With the ED function, the similarity of T1 to each of T2, T3, and T4 comes out as the same value, 3.

With the proposed DSN function, by contrast, the values above show that T3 is the trajectory most similar to T1.

T1 and T3 do in fact share more common segments, so the DSN value is more reliable than the ED value. This is because the DSN function computes the distance between two trajectories using the sum of their segment counts.

In other words, DSN reduces the influence that the trajectory with more segments has on the similarity value, an influence to which the ED function is subject.
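The expression for Equation (1) is not reproduced in this text. The sketch below therefore assumes one form that agrees with the description: the number of segments not shared by the two trajectories, divided by the sum of their segment counts. Both the formula and the sample trajectories `a` and `b` are assumptions for illustration, not the patent's definition or the trajectories of FIG. 1.

```python
from typing import List


def dsn(ti: List[str], tj: List[str]) -> float:
    """Assumed count-based dissimilarity in [0, 1].

    non-common segments / (segments in ti + segments in tj).
    Assumes each segment id occurs at most once per trajectory.
    """
    total = len(ti) + len(tj)
    if total == 0:
        return 0.0
    non_common = len(set(ti) ^ set(tj))  # symmetric difference
    return non_common / total


# Hypothetical trajectories with 4 and 5 segments, 2 of them shared.
a = ["s1", "s2", "s3", "s4"]
b = ["s1", "s2", "s5", "s6", "s7"]
print(round(dsn(a, b), 2))  # 0.56, i.e. 5 non-common out of 9
```

Under this assumed form, identical trajectories score 0 and trajectories with no shared segment score 1, so smaller values mean greater similarity.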

However, because DSN measures similarity only from whether segment identifiers match and from segment counts, it cannot capture how close two trajectories actually are in terms of distance.

For example, under the DSN function the similarity between T1 and T2 has the same value as the similarity between T1 and T4. This is because the number of non-common segments between T1 and T2 equals the number of non-common segments between T1 and T4.

As FIG. 1 shows, however, the trajectory T2 lies closer to T1 than T4 does, so T1 and T2 can be said to be more similar.

Even for the same road, the number of segments in a trajectory can vary with the criterion used to divide the road into segments. Because the proposed DSN function depends on the number of segments, the similarity value changes when a single segment is represented as several smaller segments.

For example, for the two trajectories T1 and T2 in FIG. 1, DSN(T1, T2) = 0.56. If segment s6 of trajectory T2 were divided into three segments, DSN(T1, T2) would become 0.64.

The similarity of two trajectories therefore cannot be judged from segment counts alone. To solve this problem, the present invention proposes the similarity measurement function DSL (dissimilarity with length) of Equation (2), which uses segment lengths.

(2)

The similarity values given by the DSL function for the trajectories of FIG. 1 are as follows.

DSL(T1, T2) = 0.56, DSL(T1, T3) = 0.4, DSL(T1, T4) = 0.65, DSL(T2, T3) = 0.59, DSL(T2, T4) = 0.68, DSL(T3, T4) = 0.7

Whereas DSN(T1, T2) and DSN(T1, T4) are equal, DSL(T1, T2) and DSL(T1, T4) differ. The DSL function thus shows that trajectory T2 is more similar to trajectory T1 than trajectory T4 is. The proposed DSL function judges two trajectories to be more similar when they share many common segments and the difference in segment lengths is small.

The proposed method is also unaffected by changes in segment count caused by the criterion used to divide segments. For example, if segment s6 is divided into three when the similarity of T1 and T2 is measured, the three resulting segments still have a combined length of 4, the same as before the division. DSL(T1, T2) is therefore 0.56 both before and after the division.
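The expression for Equation (2) is likewise not reproduced in this text. The sketch below assumes a length-based analogue: the total length of the segments not shared by the two trajectories, divided by the total length of both trajectories. The formula, the segment lengths, and the names `t1`, `t2`, and `t2_split` are assumptions chosen to illustrate the invariance under segment division, not values from FIG. 1.

```python
from typing import List, Tuple

Trajectory = List[Tuple[str, float]]  # (segment id, segment length)


def dsl(ti: Trajectory, tj: Trajectory) -> float:
    """Assumed length-based dissimilarity in [0, 1].

    length of non-common segments / (length of ti + length of tj).
    Assumes each segment id occurs at most once per trajectory.
    """
    li, lj = dict(ti), dict(tj)
    total = sum(li.values()) + sum(lj.values())
    if total == 0:
        return 0.0
    non_common = (sum(l for s, l in li.items() if s not in lj)
                  + sum(l for s, l in lj.items() if s not in li))
    return non_common / total


t1 = [("s1", 2.0), ("s2", 3.0), ("s3", 1.0), ("s4", 2.0)]
t2 = [("s1", 2.0), ("s2", 3.0), ("s5", 1.0), ("s6", 4.0), ("s7", 1.0)]
# Same road, but s6 (length 4) is recorded as three shorter segments.
t2_split = [("s1", 2.0), ("s2", 3.0), ("s5", 1.0),
            ("s6a", 1.0), ("s6b", 1.0), ("s6c", 2.0), ("s7", 1.0)]

print(dsl(t1, t2) == dsl(t1, t2_split))  # True: division leaves DSL unchanged
```

Because only lengths enter the ratio, dividing a non-common segment changes neither the numerator nor the denominator, whereas a count-based measure would change.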

The similarity measurement function proposed in the present invention makes it possible to cluster similar trajectories of moving objects.

Clustering similar trajectories also makes it possible to provide users with more useful information.

Trajectory clustering means grouping trajectories according to the similarity between them. General clustering methods form clusters by repeatedly updating a center point, which can be regarded as the center of gravity of the cluster.

Referring to FIG. 3, new trajectory data of a moving object in the road network space is first stored in the database (S31).

It is then determined whether the database holds a similarity measurement function value for the stored trajectory data (S32).

If step S32 finds no similarity measurement function value for the trajectory data in the database, a new similarity is measured with the similarity measurement function and stored in the database as the similarity measurement function value for that trajectory data (S33).

Among all similarity measurement function values stored in the database, the two trajectories with the largest value are found (S34).

The trajectories are mapped into k dimensions using the two trajectories found in step S34 as the reference (S35).

Next, clustering is performed on the result mapped in step S35 (S36).

Finally, the clusters formed in step S36 are stored in the database (S37).

When clusters are formed from the relative distances between trajectories, the notion of a center point becomes ambiguous. To solve this problem, the present invention uses FastMap to represent each trajectory as a point in k-dimensional space and then clusters the points that correspond to the full set of trajectories.

To map trajectories of different lengths into a space of one common dimensionality, the present invention uses the similarity value between two trajectories measured by the proposed DSL function. The measured DSL value is treated as the distance between the two trajectories.

FastMap is a technique that, given n objects and a distance function between objects, maps the objects to points in k dimensions.

With FastMap, trajectories can easily be mapped to points in k-dimensional space using the measured distance between each pair of trajectories.

The procedure for converting trajectories to k-dimensional points with FastMap is as follows.

First, find the two trajectories with the largest DSL value. Call these two trajectories T_maxa and T_maxb.

Second, map each trajectory to a point T'_i relative to T_maxa and T_maxb using the following equation.

Third, recompute the distances between the points T'_i mapped in the second step using the equation below.

Finally, repeat the above steps to add further dimensions to the mapping.
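The equations for the second and third steps are not reproduced in this text. The sketch below fills them in with the projection and residual-distance formulas of the standard FastMap method, which is an assumption about what the missing equations contain. The pivot pair is chosen as the exact farthest pair, as the first step describes. The function name `fastmap` and the clamping of negative residuals to zero are choices made for this illustration.

```python
import math
from typing import Callable, List


def fastmap(n: int, dist: Callable[[int, int], float], k: int) -> List[List[float]]:
    """Map n objects to k-dimensional points from pairwise distances."""
    coords = [[0.0] * k for _ in range(n)]

    def d2(i: int, j: int, dim: int) -> float:
        # Squared distance left over after the first `dim` coordinates.
        r = dist(i, j) ** 2 - sum((coords[i][c] - coords[j][c]) ** 2
                                  for c in range(dim))
        return max(r, 0.0)  # clamp: the distance need not be Euclidean

    for dim in range(k):
        # Step 1: the pair of objects that are farthest apart.
        a, b, best = 0, 0, 0.0
        for i in range(n):
            for j in range(i + 1, n):
                v = d2(i, j, dim)
                if v > best:
                    a, b, best = i, j, v
        if best == 0.0:
            break  # all residual distances are zero
        d_ab = math.sqrt(best)
        # Step 2: project every object onto the line through a and b.
        for i in range(n):
            coords[i][dim] = (d2(a, i, dim) + best - d2(b, i, dim)) / (2 * d_ab)
        # Step 3 is implicit: d2(..., dim + 1) uses the new coordinate.
    return coords


# Points on a line are recovered exactly in one dimension.
positions = [0.0, 1.0, 3.0, 7.0]
points = fastmap(len(positions), lambda i, j: abs(positions[i] - positions[j]), k=2)
print([p[0] for p in points])
```

In the setting described here, `dist(i, j)` would return the stored DSL value for trajectories i and j.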

After FastMap has converted each trajectory to a point in k dimensions, the present invention clusters these k-dimensional points with the k-medoids method. The k-medoids method finds k clusters among n objects, starting from arbitrarily chosen representative objects (medoids), one per cluster.

The representative object of a cluster is the member whose average (or total) distance to the other members of the cluster is smallest. The k-medoids clustering method partitions the objects into a given number of clusters so as to minimize the sum of the distances between each object and the representative object of its cluster. Compared with the k-means method, it has the advantage of being less sensitive to noise.
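A minimal k-medoids sketch follows. It uses the simple alternating scheme (assign each point to its nearest medoid, then re-pick each cluster's medoid), which is one common variant and not necessarily the one intended in the patent. The function name, the `init` and `seed` parameters, and the sample points are hypothetical. `math.dist` requires Python 3.8 or later.

```python
import math
import random
from typing import List, Optional, Sequence, Tuple


def k_medoids(points: Sequence[Sequence[float]], k: int,
              init: Optional[List[int]] = None,
              max_iter: int = 100, seed: int = 0) -> Tuple[List[int], List[int]]:
    """Return (medoid indices, cluster label per point)."""
    n = len(points)
    medoids = list(init) if init is not None else random.Random(seed).sample(range(n), k)

    def d(i: int, j: int) -> float:
        return math.dist(points[i], points[j])

    labels = [0] * n
    for _ in range(max_iter):
        # Assignment: each point joins its nearest medoid.
        labels = [min(range(k), key=lambda c: d(i, medoids[c])) for i in range(n)]
        # Update: the medoid is the member with the smallest total
        # distance to the other members of its cluster.
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if members:
                new_medoids.append(min(members,
                                       key=lambda m: sum(d(m, o) for o in members)))
            else:
                new_medoids.append(medoids[c])  # keep medoid of an empty cluster
        if new_medoids == medoids:
            break  # converged
        medoids = new_medoids
    return medoids, labels


# Two well-separated groups of hypothetical 2-D points.
pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
medoids, labels = k_medoids(pts, k=2, init=[0, 3])
print(medoids, labels)  # [0, 3] [0, 0, 0, 1, 1, 1]
```

The result depends on the starting medoids, so in practice the procedure is often run from several random starts and the partition with the smallest total distance is kept.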

FIG. 4 shows an example in which the trajectories of FIG. 1 are mapped to two-dimensional points using FastMap and then grouped into three clusters by the k-medoids method.

The clustering result shows that T1 and T3, which have the highest similarity, are grouped into the same cluster.

Referring to FIG. 5, once a cluster has been constructed, it is first determined whether the information of the cluster has been summarized (S51).

If it is determined in step S51 that the cluster information has not been summarized, the cluster information is summarized using the frequency calculation formula (S52).

Next, the cluster summary information is stored in the database (S53).

If the cluster information has already been summarized in step S51, it is stored in the database as the cluster information (S54).

In the present invention, for each cluster C obtained through the clustering process, segment summary information S = {<(SegList_1), Weight_1>, ..., <(SegList_n), Weight_n>} and user summary information U = {User_1, ..., User_m} are constructed and then stored and managed together.

Here, SegList_j denotes a segment list of the trajectories included in the cluster, and Weight_j denotes the frequency of occurrence of that segment list within the cluster, obtained by Equation (3).

(3)

For example, if clustering the trajectories of FIG. 1 yields C1 = {T1, T3} and the users of these trajectories are user1 and user2, respectively, then the segment summary information is S = {<(s1, s3, s4), 1>, <(s2, s8, s9, s10), 0.5>} and the user summary information is U = {user1, user2}.
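The summary construction can be sketched as follows. Equation (3) is not reproduced in this text, so the sketch assumes that a segment's weight is the fraction of trajectories in the cluster that contain it, which agrees with the weights 1 and 0.5 in the example. The segment lists given for T1 and T3 are hypothetical, since FIG. 1 is not available here; they were chosen only so that the output matches the example. The function names are illustrative.

```python
from collections import Counter, defaultdict

def summarize_cluster(trajectories):
    """Build (segment weights, user list) for one cluster.

    trajectories maps a user to the list of segment ids of that user's
    trajectory. Weight = share of trajectories containing the segment
    (an assumed reading of Equation (3)).
    """
    n = len(trajectories)
    counts = Counter(seg for segs in trajectories.values() for seg in set(segs))
    weights = {seg: count / n for seg, count in counts.items()}
    return weights, sorted(trajectories)

def group_by_weight(weights):
    """Group segments that share a weight, giving the <(SegList), Weight> form."""
    groups = defaultdict(list)
    for seg, w in weights.items():
        groups[w].append(seg)
    # Sort ids such as "s2" before "s10" (shorter ids first, then alphabetical).
    return {w: tuple(sorted(segs, key=lambda s: (len(s), s))) for w, segs in groups.items()}

# Hypothetical segment lists for T1 (user1) and T3 (user2).
c1 = {
    "user1": ["s1", "s2", "s3", "s4"],
    "user2": ["s1", "s3", "s4", "s8", "s9", "s10"],
}
weights, users = summarize_cluster(c1)
summary = group_by_weight(weights)
```

With these inputs, `summary` groups s1, s3, s4 under weight 1.0 and s2, s8, s9, s10 under weight 0.5, and `users` is the list of the two users.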

Each step of clustering with the above similarity function is carried out as shown in FIGS. 3 and 5.

An efficient query processing technique is proposed that uses the segment summary information and user summary information of the clusters constructed as described above.

Referring to FIG. 6, a user's query trajectory for a moving object in the road network space is input to the database (S61).

Next, the segments constituting the query trajectory input to the database are extracted (S62).

It is then determined whether any of the extracted segments of the query trajectory match segments in the trajectory cluster summary information stored in the database (S63).

If matching clusters are found in step S63, the sum of the segment weights of each matching cluster is computed (S64).

Next, the cluster with the largest weight sum is found and is determined to be the cluster to which the query trajectory belongs (S65, S66).

If no matching cluster is found in step S63, the query trajectory does not match any cluster stored in the database, so it is determined that there is no cluster to which the query trajectory belongs (S67).

For example, when the query trajectory Q = {s1, s2, s10, s9, s8, s5} is submitted to the database against the segment summary information of the clusters constructed from FIG. 1, the segment summary information of each cluster is searched for each segment of Q.

For each cluster, the weights of the matching segments in its segment summary information are summed, and the cluster with the largest sum is selected.

In this example, segments s1, s2, s10, s9, and s8 of Q are present in cluster C1, so the sum of their weights is 1 + 0.5 + 0.5 + 0.5 + 0.5 = 3. Segments s1 and s5 of Q are present in cluster C2, so the weight sum is 1 + 1 = 2, and segment s1 is present in cluster C3, so the weight sum is 1. The cluster matched to the query trajectory Q is therefore C1, which has the largest weight sum.
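The worked example above can be checked with a short sketch of steps S62 to S67. The summary for C1 is taken from the earlier example; for C2 and C3 only the segments mentioned in this example are listed, so their full contents are an assumption. The function name `match_cluster` and the dictionary layout are illustrative.

```python
def match_cluster(query_segments, cluster_summaries):
    """Return (cluster id, weight sum) of the best-matching cluster.

    cluster_summaries maps a cluster id to {segment id: weight}.
    Returns (None, 0.0) when no query segment appears in any summary (S67).
    """
    best_id, best_sum = None, 0.0
    for cluster_id, weights in cluster_summaries.items():
        # S64: sum the weights of the query segments found in this cluster.
        total = sum(weights.get(seg, 0.0) for seg in query_segments)
        # S65: keep the cluster with the largest weight sum.
        if total > best_sum:
            best_id, best_sum = cluster_id, total
    return best_id, best_sum

summaries = {
    "C1": {"s1": 1.0, "s3": 1.0, "s4": 1.0, "s2": 0.5, "s8": 0.5, "s9": 0.5, "s10": 0.5},
    "C2": {"s1": 1.0, "s5": 1.0},  # only the segments named in the example
    "C3": {"s1": 1.0},             # only the segment named in the example
}
query = ["s1", "s2", "s10", "s9", "s8", "s5"]
print(match_cluster(query, summaries))  # prints ('C1', 3.0)
```

Only the stored summaries are consulted, so the lookup cost depends on the number of clusters and query segments rather than on the number of stored trajectories.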

Thus, for a given query trajectory, the matching cluster can be found without the additional overhead of reconstructing the clusters.

After the query has been processed according to the procedure of FIG. 6 described above, the user information, road information, and other data associated with the retrieved cluster are presented to the user together.

For example, suppose cluster C1 contains users A, B, and C, who moved along similar trajectories in the Seoul area, and cluster C2 contains users A and B, who moved along similar trajectories in the Busan area. When user C visits the Busan area, the routes that users A and B, who belong to cluster C1 together with user C, traveled in the Busan area can be recommended to user C.
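One way to express this recommendation scenario in code is sketched below. The record layout (a `region` label, a `users` set, and a per-user `routes` mapping for each cluster) and the route values are hypothetical; the patent only states that user information and road information are stored with each cluster.

```python
def recommend_routes(target_user, home_region, visit_region, clusters):
    """Suggest routes in visit_region taken by the target user's cluster peers.

    Peers are users who share a cluster with target_user in home_region.
    Returns {peer: route} for peers that have a cluster in visit_region.
    """
    peers = set()
    for cluster in clusters:
        if cluster["region"] == home_region and target_user in cluster["users"]:
            peers |= cluster["users"] - {target_user}
    recommendations = {}
    for cluster in clusters:
        if cluster["region"] == visit_region:
            for user in sorted(cluster["users"] & peers):
                recommendations[user] = cluster["routes"][user]
    return recommendations

# Hypothetical data mirroring the example: C1 in Seoul, C2 in Busan.
clusters = [
    {"region": "Seoul", "users": {"A", "B", "C"},
     "routes": {"A": ["s1", "s3"], "B": ["s1", "s4"], "C": ["s1", "s3", "s4"]}},
    {"region": "Busan", "users": {"A", "B"},
     "routes": {"A": ["b1", "b2"], "B": ["b1", "b3"]}},
]
suggested = recommend_routes("C", "Seoul", "Busan", clusters)
```

Here `suggested` holds the Busan routes of users A and B, the peers of user C in the Seoul cluster.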

According to the method for clustering similar trajectories in a road network of the present invention, applying a similarity measure that takes the road network space into account enables effective similar-trajectory search and clustering for moving objects in the road network.

In addition, the segment summary information of the constructed clusters can be used to efficiently find the cluster that matches a user's query trajectory.

Although the present invention has been described in detail only with respect to the specific embodiments set forth above, it will be apparent to those skilled in the art that various changes and modifications are possible within the technical scope of the invention, and such changes and modifications naturally fall within the appended claims.

Claims

(a) storing the trajectory of the moving object on the road network space as a trajectory data including a segment identifier, a length of the segment, user information, and road information in a database;

(b) determining whether there is a measured value calculated by a similarity measurement function for the stored specific trajectory and the trajectory already stored in the database;

(c) if, as a result of step (b), there is no measured value calculated by the similarity measurement function for the stored specific trajectory, calculating a similarity measurement function value between the stored specific trajectory and the trajectories already stored in the database using the similarity measurement function;

(d) storing the function value measured in step (c) in the database as a similarity value between the specific trajectory and the trajectory already stored in the database; and

(e) if the determination in step (b) indicates that a value has already been calculated by the similarity measurement function for the stored specific trajectory, storing the already calculated similarity measurement function value for the stored specific trajectory and the trajectory already stored in the database as the similarity value.

The method of claim 1,

wherein, given any two trajectories T_i and T_j, the similarity measurement function is DSN(T_i, T_j), defined by the following equation

A method for measuring the similarity of a moving object trajectory in a road network space, characterized in that.

The method of claim 1,

wherein, given any two trajectories T_i and T_j, the similarity measurement function is DSL(T_i, T_j), defined by the following equation

(b) determining whether there is a similarity measure function value in the database that matches the trajectory data;

(c) if, as a result of the determination in step (b), no matching similarity measurement function value for the trajectory data is found in the database, measuring a new similarity for the trajectory data using the similarity measurement function and storing it in the database as the similarity measurement function value for the trajectory data;

(d) retrieving any two trajectories having the largest value of the similarity measurement function value among all similarity measurement function values stored in the database;

(e) mapping the trajectories in k dimensions based on the two trajectories retrieved in step (d);

(f) performing clustering on the result mapped in step (e); and

(g) storing the cluster configured in step (f) in a database.

The method of claim 4, wherein

given any two trajectories T_i and T_j, the similarity measurement function is DSL(T_i, T_j), defined by the following equation

A method for clustering similar trajectories of moving objects in a road network space.

The method of claim 4, wherein

(i) determining whether information of the cluster is summarized;

(j) if it is determined in step (i) that the information of the cluster is not summarized, summarizing the cluster information by a frequency calculation formula; And

(k) storing the cluster summary information in a database, the method of clustering similar trajectories of moving objects in a road network space.

The method of claim 6,

wherein the cluster summary information is segment summary information for each cluster and includes a segment list of each trajectory included in the cluster and a segment weight indicating the frequency of occurrence of the segment list in the cluster.

The method of claim 6,

The frequency calculation formula

The method of claim 6,

(m-1) inputting a query trajectory into the database;

(m-2) extracting segments constituting the query trajectory;

(m-3) determining whether the segment of the query trajectory and the segment of the trajectory cluster summary information stored in the database match each other;

(m-4) obtaining a sum of segment weights of the matching clusters when the matching clusters are present in the step (m-3);

(m-5) searching for the cluster having the largest sum of the weights; and

(m-6) determining the retrieved cluster as the cluster to which the query trajectory belongs.

The method of claim 9,

further comprising presenting to the user the cluster found to be similar to the query trajectory, together with the user information and the road information associated with that cluster.