JP2012113625A

JP2012113625A - Information processing apparatus, information processing method, and program

Info

Publication number: JP2012113625A
Application number: JP2010263820A
Authority: JP
Inventors: Daisuke Mochizuki; 大介望月
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2010-11-26
Filing date: 2010-11-26
Publication date: 2012-06-14
Also published as: US20120136911A1; CN102591903A

Abstract

PROBLEM TO BE SOLVED: To achieve high-speed clustering processing with a reduced processing amount. SOLUTION: An information processing apparatus 100 includes: an N-ary value generation unit 101 that, for data 1011 having position information in a feature space 1001 defined by D-dimensional coordinates (D = 2, 3, ...), generates an N-ary value in which the coordinate values of the respective dimensions, each expressed as an N-ary number (N = 2, 3, ...) with a predetermined number of digits, are arranged one digit at a time for each dimension in order; and a clustering unit 103 that classifies the data 1011 sharing the upper k digits (k = 1, 2, ...) of the N-ary value into the same cluster 1021.

Description

The present invention relates to an information processing apparatus, an information processing method, and a program.

Techniques are known for clustering data in a feature space based on the position information that the data carries. Data classified into the same cluster are considered to be close to each other in the feature space, that is, to have similar features as represented by that space. As a technique using such clustering, Patent Document 1, for example, describes adding the position information of the shooting location to image data and classifying the image data by shooting location through clustering based on that position information.

Patent Document 1: JP 2010-140383 A

However, because clustering calculates the mutual distances between the pieces of data having position information, the processing load of the distance calculation tends to be large. It also tends to require a large amount of memory. Improving the processing speed of clustering has therefore been a challenge.

The present invention has been made in view of the above problem, and an object of the present invention is to provide a new and improved information processing apparatus, information processing method, and program capable of high-speed clustering processing with a reduced processing amount.

To solve the above problem, according to one aspect of the present invention, there is provided an information processing apparatus including: an N-ary value generation unit that, for data having position information in a feature space defined by D-dimensional coordinates (D = 2, 3, ...), generates an N-ary value in which the coordinate values of the respective dimensions, each expressed as an N-ary number (N = 2, 3, ...) with a predetermined number of digits, are arranged one digit at a time for each dimension in order; and a clustering unit that classifies the data sharing the upper k digits (k = 1, 2, ...) of the N-ary value into the same cluster.

When k = D × m (m = 1, 2, ...), the clustering unit may classify the data sharing the upper k digits of the N-ary value into the same cluster at the m-th layer of an N^D-ary tree-structured cluster.
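The digit arrangement and prefix-based classification described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the apparatus: coordinates are taken to be non-negative integers, the first dimension is assumed to contribute its digit first, and the names `to_digits`, `interleave`, and `cluster_by_prefix` are hypothetical.

```python
def to_digits(value, base, width):
    """Return `value` as `width` base-`base` digits, most significant first."""
    digits = []
    for _ in range(width):
        value, d = divmod(value, base)
        digits.append(d)
    if value:
        raise ValueError("value does not fit in the given number of digits")
    return digits[::-1]


def interleave(coords, base, width):
    """Arrange one digit per dimension in turn, from the most significant digit down."""
    per_dim = [to_digits(c, base, width) for c in coords]
    return [per_dim[dim][pos] for pos in range(width) for dim in range(len(coords))]


def cluster_by_prefix(points, base, width, k):
    """Group points whose interleaved N-ary values share the same upper k digits."""
    clusters = {}
    for p in points:
        prefix = tuple(interleave(p, base, width)[:k])
        clusters.setdefault(prefix, []).append(p)
    return clusters


# D = 2, N = 2, three digits per coordinate, k = D * m with m = 1,
# i.e. the first layer of a 2^2-ary (quad) tree.
points = [(1, 2), (3, 0), (5, 6), (7, 7)]
clusters = cluster_by_prefix(points, base=2, width=3, k=2)
print(clusters)
```

With k = D × m, each additional m adds one digit from every dimension to the prefix, which is why each layer splits a cluster into at most N^D children.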

The clustering unit may include a clustering sorting unit that sorts the data in order of the N-ary value, and may identify the data to be classified into the same cluster from the result of the sorting.

The clustering unit may generate cluster specifying information that identifies one cluster by the first position at which data classified into that cluster appears in the sorting result and by the number of data items classified into that cluster.
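As a rough sketch of such cluster specifying information, the snippet below scans a sorted list of N-ary values and records, for each run of values sharing the upper k digits, the first position and the number of items. The representation of the values as digit strings and the name `cluster_spec` are assumptions made for illustration only.

```python
from itertools import groupby


def cluster_spec(sorted_keys, k):
    """Return (prefix, first_position, count) for each run sharing the upper k digits."""
    specs = []
    position = 0
    for prefix, run in groupby(sorted_keys, key=lambda digits: digits[:k]):
        count = len(list(run))
        specs.append((prefix, position, count))
        position += count
    return specs


# Binary values of six digits each, sorted so that equal prefixes are contiguous.
keys = sorted(["000110", "001010", "110110", "111111", "110001"])
specs = cluster_spec(keys, k=2)
print(specs)
```

Because a cluster is a contiguous slice of the sorted list, two integers per cluster are enough to recover its members, and no per-item cluster label has to be stored.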

The information processing apparatus may further include a merge unit that merges clusters determined to be adjacent to each other in a first direction in the feature space, the merge unit including: a merge sorting unit that sorts the clusters with respect to the first direction based on the result of a first rank determination process based on the D-dimensional coordinates; and an adjacency determination unit that determines whether the clusters sorted with respect to the first direction are adjacent to each other in the first direction.

The merge sorting unit may sort the clusters with respect to a second direction in the feature space based on the result of a second rank determination process based on the D-dimensional coordinates, the adjacency determination unit may determine whether the clusters sorted with respect to the second direction are adjacent to each other in the second direction, and the merge unit may further merge the clusters determined to be adjacent to each other in the second direction.

The feature space may be the ground surface, the D-dimensional coordinates may be two-dimensional coordinates consisting of latitude and longitude, the cluster may be a region that encloses the position information of the data contained in a grid defined on the ground surface by the two-dimensional coordinates, and the first rank determination process may be a process of sorting the grids with respect to the first direction and giving each cluster contained in a grid the sort order of that grid as its rank.

The feature space may be a three-dimensional space, the D-dimensional coordinates may be three-dimensional coordinates constituting an orthogonal coordinate system, and the cluster may be a region that encloses the position information of the data contained in a block defined in the three-dimensional space by the three-dimensional coordinates.

To solve the above problem, according to another aspect of the present invention, there is provided an information processing method including: a step of generating, for data having position information in a feature space defined by D-dimensional coordinates (D = 2, 3, ...), an N-ary value in which the coordinate values of the respective dimensions, each expressed as an N-ary number (N = 2, 3, ...) with a predetermined number of digits, are arranged one digit at a time for each dimension in order; and a step of classifying the data sharing the upper k digits (k = 1, 2, ...) of the N-ary value into the same cluster.

To solve the above problem, according to still another aspect of the present invention, there is provided a program that causes a computer to execute: a process of generating, for data having position information in a feature space defined by D-dimensional coordinates (D = 2, 3, ...), an N-ary value in which the coordinate values of the respective dimensions, each expressed as an N-ary number (N = 2, 3, ...) with a predetermined number of digits, are arranged one digit at a time for each dimension in order; and a process of classifying the data sharing the upper k digits (k = 1, 2, ...) of the N-ary value into the same cluster.

The program may further cause the computer to execute: a process of sorting the clusters with respect to a first direction in the feature space based on the result of a first rank determination process based on the D-dimensional coordinates; a process of determining whether the clusters sorted with respect to the first direction are adjacent to each other in the first direction; and a process of merging the clusters determined to be adjacent to each other in the first direction.

The process of merging the clusters may include a process of calculating the distance between the clusters and a process of merging the clusters when the calculated distance is equal to or less than a predetermined threshold.

The process of merging the clusters may include a process of calculating the distance between the clusters, a process of storing the clusters as merge candidate clusters when the calculated distance is equal to or less than a predetermined threshold, and a process of merging the stored merge candidate clusters in ascending order of the distance between them.
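The candidate-then-merge variant can be illustrated with the following sketch. Several details are assumptions that this section does not specify: each cluster is represented by a single center point, the distance is Euclidean, the pairs to be examined are supplied by the caller (for example, pairs already judged adjacent), and merged groups are tracked with a union-find table. The name `merge_by_distance` is hypothetical.

```python
import math


def merge_by_distance(centers, candidate_pairs, threshold):
    """Store pairs within `threshold` as candidates, then merge them closest first."""
    # Keep only the pairs whose distance is at or below the threshold.
    candidates = []
    for a, b in candidate_pairs:
        d = math.dist(centers[a], centers[b])
        if d <= threshold:
            candidates.append((d, a, b))

    # Union-find parent table: every cluster starts as its own group.
    parent = {name: name for name in centers}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Merge the stored candidates in ascending order of distance.
    for _, a, b in sorted(candidates):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for name in centers:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())


centers = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (5.0, 0.0), "D": (5.5, 0.0)}
pairs = [("A", "B"), ("B", "C"), ("C", "D")]
merged = merge_by_distance(centers, pairs, threshold=2.0)
print(merged)
```

The sketch does not recompute cluster distances after each merge; whether the described process does so is not stated in this section.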

As described above, the present invention enables high-speed clustering processing with a reduced processing amount.

FIG. 1 is a diagram showing an example of the relationship among content, clusters, and grids in the first embodiment of the present invention.
FIG. 2 is a diagram showing an example of the hierarchical structure of grids in the first embodiment of the present invention.
FIG. 3 is a diagram showing an example of a clustering result in the first embodiment of the present invention.
FIG. 4 is a diagram for explaining grid-based clustering in the first embodiment of the present invention in comparison with general distance-based clustering.
FIG. 5 is a diagram for explaining grid-based clustering in the first embodiment of the present invention in comparison with general distance-based clustering.
FIG. 6 is a block diagram showing the configuration of the information processing apparatus according to the first embodiment of the present invention.
FIG. 7 is a diagram for explaining clustering in the first embodiment of the present invention.
FIG. 8 is a diagram for explaining the merging of clusters in the first embodiment of the present invention.
FIG. 9 is a flowchart showing clustering and merge processing in the first embodiment of the present invention.
FIG. 10 is a diagram for explaining clustering in the first embodiment of the present invention.
FIG. 11 is a diagram for explaining cluster specifying information in the first embodiment of the present invention.
FIG. 12 is a flowchart showing merge-related processing in the first embodiment of the present invention.
FIG. 13 is a diagram showing an example of merge setting information in the first embodiment of the present invention.
FIG. 14 is a flowchart showing merge setting selection processing in the first embodiment of the present invention.
FIG. 15 is a flowchart showing search-order merge processing in the first embodiment of the present invention.
FIG. 16 is a flowchart showing full-match merge processing in the first embodiment of the present invention.
FIG. 17 is a diagram showing a grid search in the horizontal direction in the first embodiment of the present invention.
FIG. 18 is a diagram showing a grid search in the vertical direction in the first embodiment of the present invention.
FIG. 19 is a diagram showing a grid search in the diagonally lower-right direction in the first embodiment of the present invention.
FIG. 20 is a diagram showing a grid search in the diagonally upper-right direction in the first embodiment of the present invention.
FIG. 21 is a diagram for explaining a one-direction search in the first embodiment of the present invention.
FIG. 22 is a diagram for explaining a two-direction search in the first embodiment of the present invention.
FIG. 23 is a diagram for explaining a four-direction search in the first embodiment of the present invention.
FIG. 24 is a flowchart showing neighborhood search merge processing (without upper-level search) in the first embodiment of the present invention.
FIG. 25 is a flowchart showing adjacency search processing (without upper-level search) in the first embodiment of the present invention.
FIG. 26 is a flowchart showing neighborhood search merge processing (with upper-level search) in the first embodiment of the present invention.
FIG. 27 is a diagram showing an example of an upper-level grid list in the first embodiment of the present invention.
FIG. 28 is a diagram showing an example of an upper-level grid list in the first embodiment of the present invention.
FIG. 29 is a diagram for explaining adjacency search processing (with upper-level search) in the first embodiment of the present invention.
FIG. 30 is a flowchart showing adjacency search processing (with upper-level search) in the first embodiment of the present invention.
FIG. 31 is a diagram for explaining adjacency search processing (with upper-level search) in the first embodiment of the present invention.
FIG. 32 is a diagram showing an example of merge target grids in neighborhood search merge processing (with upper-level search) in the first embodiment of the present invention.
FIG. 33 is a diagram for explaining an outline of distance-order sorting in the first embodiment of the present invention.
FIG. 34 is a flowchart showing distance-order merge processing in the first embodiment of the present invention.
FIG. 35 is a diagram showing an example of the relationship among content, clusters, and blocks in the second embodiment of the present invention.
FIG. 36 is a diagram showing an example of the display of content and clusters in the second embodiment of the present invention.
FIG. 37 is a diagram for explaining the division of the ground surface by blocks in the second embodiment of the present invention.
FIG. 38 is a diagram for explaining the division of the ground surface by blocks in the second embodiment of the present invention.
FIG. 39 is a diagram for explaining the division of the ground surface by blocks in the second embodiment of the present invention.
FIG. 40 is a diagram for explaining clustering in the second embodiment of the present invention.
FIG. 41 is a block diagram for explaining the hardware configuration of the information processing apparatus according to the embodiments of the present invention.

Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings. In this specification and the drawings, components having substantially the same functional configuration are denoted by the same reference numerals, and redundant description is omitted.

The description is given in the following order.
1. First embodiment
1-1. Overview of grid-based position clustering
1-2. Configuration of the information processing apparatus
1-3. Details of clustering and merge processing
2. Second embodiment
2-1. Overview of block-based position clustering
3. Hardware configuration of the information processing apparatus according to the embodiments of the present invention
4. Summary

<1. First Embodiment>
In the first embodiment of the present invention, the ground surface corresponds to the feature space. In this embodiment, position information on the ground surface is represented by the two-dimensional coordinates of latitude and longitude. A cluster in this embodiment is a region that encloses the position information of the content contained in a grid defined on the ground surface by latitude and longitude.

(1-1. Overview of grid-based position clustering)
First, an overview of clustering in the first embodiment of the present invention is given with reference to FIGS. 1 to 5. Clustering in this embodiment classifies content having position information into clusters with reference to a grid defined by the two-dimensional coordinates of latitude and longitude, and can therefore be called grid-based position clustering.

(About the grid)
FIG. 1 is a diagram showing an example of the relationship among content, clusters, and grids in the first embodiment of the present invention. FIG. 1 shows a ground surface 1001, content 1011, clusters 1021, and grids 1031.

The ground surface 1001 is all or part of the surface of the earth. In this embodiment, the ground surface 1001 is treated as a two-dimensional plane on which position information is represented by the two-dimensional coordinates of latitude and longitude.

The content 1011 is data having position information that identifies a position on the ground surface 1001. The content 1011 may be the position information itself, or may be data to which position information has been attached as supplementary information. The content 1011 may be, for example, image data to which the position information of the shooting location has been attached.

The cluster 1021 is a region containing pieces of content 1011 that are close to each other on the ground surface 1001. The cluster 1021 is illustrated as a substantially rectangular shape, but may have another shape. The cluster 1021 may be a figure circumscribing the content 1011 contained in the cluster 1021.

The grid 1031 is a grid set on the ground surface 1001. The grid 1031 may be a rectangular region defined on the ground surface 1001 by ranges of latitude and longitude. As described later, the size of the grid 1031 may be set appropriately according to clustering conditions such as the number of pieces of content 1011 and the size of the region to be clustered.

As illustrated, in this embodiment, content 1011 contained in the same grid 1031 is classified into the same cluster 1021. Except when clusters 1021 are merged, the region of one cluster 1021 into which content 1011 is classified is contained in the region of one grid 1031. In other words, in the grid-based position clustering of this embodiment, whether content is contained in the same grid 1031 is the basic criterion for clustering.

In general distance-based position clustering, by contrast, the distance between pieces of content is calculated and compared with a predetermined threshold or with the distance between other pieces of content. Because the distance is calculated for every combination of content, the processing load of the distance calculation becomes large. In addition, when the distance between pieces of content is compared with the distance between other pieces of content, a large storage area is used to hold the calculated distances temporarily.
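As a rough illustration of why the pairwise approach is costly (a standard counting argument, not a figure taken from this document), the number of distance calculations for n pieces of content is the number of unordered pairs:

```latex
\binom{n}{2} = \frac{n(n-1)}{2},
\qquad \text{e.g. } n = 1000 \;\Rightarrow\; \frac{1000 \times 999}{2} = 499{,}500 .
```

The count grows quadratically in n, and so does the storage needed if every distance is retained for later comparison.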

In the grid-based position clustering of this embodiment, on the other hand, the position information of the content 1011 itself indicates the grid 1031 that contains the content 1011. As described later, when the content 1011 is sorted in order of an N-ary value in which the latitude and longitude values are arranged alternately one digit at a time, the pieces of content 1011 contained in the region of the same grid 1031 are adjacent to each other in the sorting result. Whether pieces of content 1011 that are adjacent in the sorting result belong to the same grid 1031 can be determined, for example, by whether they share the upper k digits (k = 1, 2, ...) of the N-ary value. As noted above, content 1011 contained in the same grid 1031 is classified into the same cluster 1021. In the grid-based position clustering of this embodiment, therefore, the main processing of clustering is sorting the numerical values generated from the position information of the content 1011. Sorting imposes a smaller processing load than the distance calculation described above and requires fewer operations. Accordingly, the grid-based position clustering of this embodiment improves processing speed and reduces the storage area used.
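For the binary, two-dimensional case, the sorting step might look like the sketch below. It is only an illustration under assumptions not fixed by this section: 16 binary digits per coordinate, latitude contributing its digit before longitude, a linear mapping of latitude from [-90, 90] and longitude from [-180, 180], and the hypothetical names `quantize`, `grid_key`, and `same_grid`. The sample coordinates are approximate.

```python
BITS = 16  # assumed number of binary digits per coordinate


def quantize(value, low, high, bits=BITS):
    """Map a value in [low, high] to an integer in [0, 2**bits - 1]."""
    scaled = int((value - low) / (high - low) * (1 << bits))
    return min(scaled, (1 << bits) - 1)


def grid_key(lat, lon, bits=BITS):
    """Interleave latitude and longitude bits, latitude first, most significant first."""
    y = quantize(lat, -90.0, 90.0, bits)
    x = quantize(lon, -180.0, 180.0, bits)
    key = 0
    for i in reversed(range(bits)):
        key = (key << 1) | ((y >> i) & 1)
        key = (key << 1) | ((x >> i) & 1)
    return key


def same_grid(key_a, key_b, level, bits=BITS):
    """True if two keys share their upper 2 * level bits (same grid at that level)."""
    shift = 2 * (bits - level)
    return (key_a >> shift) == (key_b >> shift)


contents = {
    "tokyo_tower": (35.6586, 139.7454),
    "tokyo_station": (35.6812, 139.7671),
    "osaka_castle": (34.6873, 135.5262),
}
keys = {name: grid_key(lat, lon) for name, (lat, lon) in contents.items()}
ordered = sorted(contents, key=keys.get)  # the sort is the main clustering step
print(ordered)
print(same_grid(keys["tokyo_tower"], keys["tokyo_station"], level=8))
```

After sorting, a single pass that compares each key with its predecessor using `same_grid` is enough to find cluster boundaries, so no pairwise distance is computed.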

(About the hierarchical structure of the grid)
FIG. 2 is a diagram showing an example of the hierarchical structure of the grid in the first embodiment of the present invention. FIG. 2 shows a level 0 grid 1032, level 1 grids 1033, and level 2 grids 1034.

The level 0 grid 1032 is the topmost grid of the hierarchy and covers the entire ground surface 1001. That is, at the top of the hierarchy, the entire ground surface 1001 is contained in a single grid.

The level 1 grids 1033 are obtained by dividing the level 0 grid 1032 in two in both the latitude direction and the longitude direction. In other words, the level 1 grids 1033 divide the entire ground surface 1001, which is the region of the level 0 grid 1032, into four.

The level 2 grids 1034 are obtained by dividing each level 1 grid 1033 in two in both the latitude direction and the longitude direction. In other words, the level 2 grids 1034 divide the region of each level 1 grid 1033 into four. The level 2 grids 1034 thus divide the entire ground surface 1001, which is the region of the level 0 grid 1032, into 16.

The hierarchy of grids is extended to lower levels in the same way. Specifically, grids of progressively finer regions can be defined: level 3 grids that divide the region of each level 2 grid 1034 into four, level 4 grids that divide the region of each level 3 grid into four, and so on. By adjusting the grid level used for clustering, the granularity of clustering can be balanced against the processing load.
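To make the trade-off between level and granularity concrete, the following sketch tabulates the number of grids and the size of one grid at each level. It assumes that the level 0 grid spans latitude -90 to 90 and longitude -180 to 180 degrees, which the text implies but does not state numerically; `grid_stats` is a hypothetical name.

```python
def grid_stats(level):
    """Return (number of grids, latitude span, longitude span) at the given level."""
    per_axis = 2 ** level      # each level halves both the latitude and longitude ranges
    count = per_axis ** 2      # 4 ** level grids in total
    return count, 180.0 / per_axis, 360.0 / per_axis


for level in range(5):
    count, lat_span, lon_span = grid_stats(level)
    print(f"level {level}: {count:4d} grids, {lat_span:7.3f} x {lon_span:7.3f} degrees")
```

Each step down the hierarchy quadruples the number of grids while halving the side lengths, so a deeper level gives finer clusters at the cost of more clusters to handle.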

Thus, in this embodiment, dividing a grid of a given level in two in both the latitude direction and the longitude direction yields the grids one level below. In other words, the grids of a lower level divide the region of a grid of the level above into four. The hierarchy of the grids 1031 in this embodiment therefore has a quadtree structure in which the level 0 grid 1032 is the root node and the grids of each level below are nodes, and the clusters 1021 defined according to the grids 1031 have the same quadtree structure.

In general distance-based position clustering, when a tree structure of clusters is defined, storage is consumed to hold the information on that tree structure. In the grid-based position clustering of this embodiment, by contrast, the quadtree structure of the grids 1031 is uniquely determined as described above. As long as the level of each grid 1031 is retained, the tree structure of the clusters 1021 can easily be derived from the quadtree structure of the grids 1031.
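One way to see why no explicit tree needs to be stored is sketched below. It assumes, for illustration only, that a grid is identified by a pair of its level and an integer holding the upper 2 × level bits of a value formed by interleaving the latitude and longitude bits; the names `ancestor` and `children` are hypothetical.

```python
def ancestor(cell, levels_up=1):
    """Return the (level, key) of the grid `levels_up` levels above `cell`."""
    level, key = cell
    if levels_up > level:
        raise ValueError("cannot go above the level 0 grid")
    # Dropping two bits (one latitude bit, one longitude bit) moves up one level.
    return level - levels_up, key >> (2 * levels_up)


def children(cell):
    """Return the four grids one level below `cell`."""
    level, key = cell
    return [(level + 1, (key << 2) | quadrant) for quadrant in range(4)]


cell = (2, 0b1101)            # a level 2 grid
print(ancestor(cell))         # its level 1 parent
print(ancestor(cell, 2))      # the level 0 root
print(children(ancestor(cell)))
```

Parent and child relations follow from bit shifts on the identifier alone, so the only per-grid information that has to be kept is the level.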

(About the clustering result)
FIG. 3 is a diagram showing an example of a clustering result in the first embodiment of the present invention. FIG. 3 shows a map 1002, content icons 1012, cluster regions 1022, cluster centers 1023, and grid lines 1035.

The map 1002 is an image representing part or all of the ground surface 1001. The map 1002 is displayed to show the user the positions of the content 1011 and the regions of the clusters 1021 that result from clustering the content 1011. The region of the ground surface 1001 represented by the map 1002 may be set according to the range in which the content 1011 exists on the ground surface 1001, or according to a user operation.

The content icon 1012 is displayed at the position on the map 1002 corresponding to the position of the content 1011 on the ground surface 1001. The content icon 1012 is illustrated as a pin-shaped icon, but is not limited to this and may take various shapes. The content icon 1012 may also display part or all of the information contained in the content 1011, such as text or an image.

The cluster region 1022 is displayed at the position on the map 1002 corresponding to the region of the cluster 1021 on the ground surface 1001. The cluster region 1022 may be displayed in the same shape as the region of the cluster 1021, or may be slightly larger than the region of the cluster 1021, for example to avoid overlapping the content icons 1012 and to make them easier to see.

The cluster center 1023 is displayed at the position on the map 1002 corresponding to the center of the cluster region 1022 or to the center of the cluster 1021 on the ground surface 1001. The cluster center 1023 may be displayed, for example, to show the user summary information extracted from the content 1011 contained in the cluster 1021 (such as a representative image or thumbnail image if the content 1011 is image data), but need not be displayed.

The grid lines 1035 indicate the grids 1031 used in the clustering that classified the content 1011 into the clusters 1021. The grid lines 1035 are not displayed on the map 1002, which shows the clustering result. However, the grid lines 1035 may be displayed as reference information, for example when the user changes the clustering granularity setting.

(Comparison with distance-based clustering: advantages of the grid-based approach)
FIG. 4 is a diagram for explaining grid-based clustering in the first embodiment of the present invention in comparison with general distance-based clustering. FIG. 4 shows (a) the case where content 1011a to 1011k is clustered on a distance basis and (b) the case where it is clustered on a grid basis as in this embodiment.

（ａ）に示されるように、コンテンツ１０１１ａ〜１０１１ｋを距離ベースでクラスタリングした場合、コンテンツ１０１１ａ〜１０１１ｅがクラスタ１０２１ａに分類され、コンテンツ１０１１ｆ〜１０１１ｊがクラスタ１０２１ｂに分類され、コンテンツ１０１１ｋがクラスタ１０２１ｃに分類される。図示されているように、クラスタ１０２１ａ〜１０２１ｃの領域を楕円形で図示すると、クラスタ１０２１ａの領域とクラスタ１０２１ｂの領域とはその一部で重複しており、クラスタ１０２１ｃの領域はクラスタ１０２１ｂの領域に包含されている。距離ベースのクラスタリングの場合、例えば距離の算出および比較の手順によっては、このように互いに入り組んだ領域のクラスタ構造が生成されうる。 As shown in (a), when the contents 1011a to 1011k are clustered on a distance basis, the contents 1011a to 1011e are classified into the cluster 1021a, the contents 1011f to 1011j are classified into the cluster 1021b, and the content 1011k is classified into the cluster 1021c. As illustrated, when the areas of the clusters 1021a to 1021c are drawn as ellipses, the area of the cluster 1021a and the area of the cluster 1021b partially overlap, and the area of the cluster 1021c is contained in the area of the cluster 1021b. In distance-based clustering, depending on, for example, the procedure used to calculate and compare distances, a cluster structure with such intertwined areas can be generated.

一方、（ｂ）に示されるように、コンテンツ１０１１ａ〜１０１１ｋをグリッドベースでクラスタリングした場合、コンテンツ１０１１ａ，１０１１ｂがクラスタ１０２１ｄに分類され、コンテンツ１０１１ｃがクラスタ１０２１ｅに分類され、コンテンツ１０１１ｄ〜１０１１ｇがクラスタ１０２１ｆに分類され、コンテンツ１０１１ｈ〜１０１１ｋがクラスタ１０２１ｇに分類される。図示されているように、クラスタ１０２１ｄ〜１０２１ｇは、互いに入り組むことなく、明確に分離されている。上述のように、グリッドベースの位置クラスタリングの場合、コンテンツ１０１１が同じグリッド１０３１に含まれるか否かがクラスタリングの基本的な基準になるため、クラスタ１０２１の領域は原則としてグリッド１０３１の領域に含まれる。従って、互いに入り組んだ領域のクラスタ構造は形成されないであろう。 On the other hand, as shown in (b), when the contents 1011a to 1011k are clustered on a grid basis, the contents 1011a and 1011b are classified into the cluster 1021d, the content 1011c is classified into the cluster 1021e, the contents 1011d to 1011g are classified into the cluster 1021f, and the contents 1011h to 1011k are classified into the cluster 1021g. As illustrated, the clusters 1021d to 1021g are clearly separated without intertwining with each other. As described above, in grid-based position clustering, whether or not contents 1011 are included in the same grid 1031 is the basic criterion for clustering, so the area of a cluster 1021 is, in principle, contained in the area of a grid 1031. Therefore, a cluster structure with intertwined areas will not be formed.

（距離ベースのクラスタリングとの比較〜グリッドベースの弱点） (Comparison with distance-based clustering: weaknesses of the grid-based approach)
図５は、本発明の第１の実施形態におけるクラスタリングと、一般的な距離ベースのクラスタリングとを比較して説明するための別の例を示す図である。図５には、コンテンツ１０１１ｍ〜１０１１ｑを、距離ベースでクラスタリングした場合（ａ）と、本実施形態においてグリッドベースでクラスタリングした場合（ｂ）とが図示されている。 FIG. 5 is a diagram illustrating another example for comparing and explaining clustering according to the first embodiment of the present invention and general distance-based clustering. FIG. 5 illustrates a case where the contents 1011m to 1011q are clustered on a distance basis (a) and a case where they are clustered on a grid basis in this embodiment (b).

（ａ）に示されるように、コンテンツ１０１１ｍ〜１０１１ｑを距離ベースでクラスタリングした場合、コンテンツ１０１１ｍ，１０１１ｎがクラスタ１０２１ｈに分類され、コンテンツ１０１１ｏ〜１０１１ｑがクラスタ１０２１ｉに分類される。距離ベースのクラスタリングの場合、基本的には、このように距離の近いコンテンツ１０１１同士が同じクラスタ１０２１に分類される。 As shown in (a), when the contents 1011m to 1011q are clustered on a distance basis, the contents 1011m and 1011n are classified into the cluster 1021h, and the contents 1011o to 1011q are classified into the cluster 1021i. In distance-based clustering, contents 1011 that are close to each other are basically classified into the same cluster 1021 in this way.

一方、（ｂ）に示されるように、コンテンツ１０１１ｍ〜１０１１ｑをグリッドベースでクラスタリングした場合、コンテンツ１０１１ｍ，１０１１ｎがクラスタ１０２１ｊに分類され、コンテンツ１０１１ｏがクラスタ１０２１ｋに分類され、コンテンツ１０１１ｐ，１０１１ｑがクラスタ１０２１ｍに分類される。上述のように、グリッドベースの位置クラスタリングの場合、コンテンツ１０１１が同じグリッド１０３１に含まれるか否かがクラスタリングの基本的な基準になるため、異なるグリッド１０３１に含まれるコンテンツ１０１１は、コンテンツ１０１１同士の距離が近くても同じクラスタ１０２１に分類されない場合がある。 On the other hand, as shown in (b), when the contents 1011m to 1011q are clustered on a grid basis, the contents 1011m and 1011n are classified into the cluster 1021j, the content 1011o is classified into the cluster 1021k, and the contents 1011p and 1011q are classified into the cluster 1021m. As described above, in grid-based position clustering, whether or not contents 1011 are included in the same grid 1031 is the basic criterion for clustering, so contents 1011 included in different grids 1031 may not be classified into the same cluster 1021 even if the distance between them is short.

さらに、（ｂ）では、グリッド境界１０３６および上位グリッド境界１０３７が図示されている。グリッド境界１０３６は、グリッド１０３１の境界である。上位グリッド境界１０３７は、グリッド１０３１の境界であり、かつ、４つのグリッド１０３１を含む上位レベルのグリッドの境界である。図示されている例において、コンテンツ１０１１ｍ，１０１１ｎはグリッド１０３１ａに含まれ、コンテンツ１０１１ｏはグリッド１０３１ｅに含まれ、コンテンツ１０１１ｐ，１０１１ｑはグリッド１０３１ｃに含まれる。 Further, in (b), a grid boundary 1036 and an upper grid boundary 1037 are shown. The grid boundary 1036 is a boundary of the grid 1031. The upper grid boundary 1037 is a boundary of the grid 1031 and a boundary of an upper level grid including the four grids 1031. In the illustrated example, the contents 1011m and 1011n are included in the grid 1031a, the content 1011o is included in the grid 1031e, and the contents 1011p and 1011q are included in the grid 1031c.

ここで、グリッド１０３１の上位グリッドでのクラスタリングが実行された場合を考える。図示されているように、グリッド１０３１ａとグリッド１０３１ｅとは、同じ上位グリッドに含まれる。そのため、上位グリッドでのクラスタリングが実行された場合、コンテンツ１０１１ｍ，１０１１ｎ，１０１１ｏは、同じクラスタ１０２１ｎに分類される。一方、グリッド１０３１ｃは、グリッド１０３１ａ，１０３１ｅとは異なる上位グリッドに含まれる。そのため、上位グリッドでのクラスタリングが実行されても、コンテンツ１０１１ｐ，１０１１ｑのクラスタはクラスタ１０２１ｍで変化しない。 Here, consider a case where clustering is performed on the upper grid of the grid 1031. As illustrated, the grid 1031a and the grid 1031e are included in the same upper grid. Therefore, when clustering on the upper grid is executed, the contents 1011m, 1011n, and 1011o are classified into the same cluster 1021n. On the other hand, the grid 1031c is included in an upper grid different from that of the grids 1031a and 1031e. For this reason, even if clustering on the upper grid is executed, the cluster of the contents 1011p and 1011q remains the cluster 1021m.

このように、グリッドベースの位置クラスタリングは、処理が非常に高速であるという大きな利点を有するが、グリッド１０３１の境界が間にある場合、距離が近いコンテンツ１０１１同士であっても同じクラスタ１０２１に分類されない場合がありうることに留意すべきである。このような場合には、後述するマージ処理によって、クラスタリングの結果を自然な形に近づけることが可能である。なお、後述するように、このマージ処理も、グリッドベースのクラスタリングの性質を活かした高速な処理によって実現されうる。 As described above, grid-based position clustering has the great advantage that processing is very fast. It should be noted, however, that when a boundary of the grid 1031 lies between them, even contents 1011 that are close to each other may not be classified into the same cluster 1021. In such a case, the merge process described later can bring the clustering result closer to a natural form. As will be described later, this merge process can also be realized by high-speed processing that takes advantage of the properties of grid-based clustering.
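The boundary effect described above can be illustrated with a minimal Python sketch. It assumes 3-digit binary coordinates and a longitude-first digit interleaving; the helper names `interleave` and `common_prefix_digits` are hypothetical and are not taken from the specification.

```python
def interleave(lat_bits, lon_bits, digits):
    """Arrange the digits alternately, longitude digit first, from the top digit down."""
    code = 0
    for i in reversed(range(digits)):
        code = (code << 1) | ((lon_bits >> i) & 1)
        code = (code << 1) | ((lat_bits >> i) & 1)
    return code


def common_prefix_digits(a, b, total_digits):
    """Number of upper binary digits that two interleaved values share."""
    return total_digits - (a ^ b).bit_length()


# Two points in neighboring columns (longitude 011 and 100) straddle the
# highest-level grid boundary, so they share no upper digits at all.
near = common_prefix_digits(interleave(0, 0b011, 3), interleave(0, 0b100, 3), 6)

# Two points three columns apart (longitude 000 and 011) lie in the same
# first-layer grid, so they share the upper two digits.
far = common_prefix_digits(interleave(0, 0b000, 3), interleave(0, 0b011, 3), 6)

print(near, far)  # 0 2
```

In this sketch the closer pair ends up in different clusters at every layer, which is the situation the merge process is intended to correct.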

（１−２．情報処理装置の構成） (1-2. Configuration of information processing apparatus)
次に、図６〜図８を参照して、本発明の第１の実施形態に係る情報処理装置の構成について説明する。 Next, the configuration of the information processing apparatus according to the first embodiment of the present invention will be described with reference to FIGS. 6 to 8.

図６は、本発明の第１の実施形態に係る情報処理装置の構成を示すブロック図である。図６では、情報処理装置１００と、情報処理装置１００に主に含まれるＮ進数値生成部１０１、クラスタリング部１０３、マージ部１０７、入力部１１３、表示制御部１１５、および表示部１１７が図示されている。 FIG. 6 is a block diagram showing the configuration of the information processing apparatus according to the first embodiment of the present invention. FIG. 6 illustrates the information processing apparatus 100 and its main components: an N-ary value generation unit 101, a clustering unit 103, a merge unit 107, an input unit 113, a display control unit 115, and a display unit 117.

情報処理装置１００は、上述のコンテンツ１０１１をデータとして扱う。コンテンツ１０１１は、例えば、静止画コンテンツもしくは動画コンテンツなどの画像コンテンツ、またはユーザ同士が各種情報の共有を行うためにサーバ等に登録した各種の文字情報もしくは画像情報等などでありうる。また、コンテンツ１０１１は、メール、楽曲、スケジュール、電子マネー使用履歴、通話履歴、コンテンツ視聴履歴、観光情報や地域情報、ニュースや天気予報、着信音モード履歴等のコンテンツであってもよい。以下の説明では、静止画コンテンツまたは動画コンテンツなどの画像コンテンツを例として説明する。しかし、情報処理装置１００では、特徴空間における位置を表す位置情報が例えばメタデータとして添付されているデータであれば、任意の情報やコンテンツデータを扱うことが可能である。 The information processing apparatus 100 handles the above-described content 1011 as data. The content 1011 can be, for example, image content such as still image content or moving image content, or various kinds of character information or image information registered in a server or the like so that users can share information with each other. The content 1011 may also be content such as mail, music, schedules, electronic money usage history, call history, content viewing history, sightseeing and area information, news and weather forecasts, and ringtone mode history. In the following description, image content such as still image content or moving image content is used as an example. However, the information processing apparatus 100 can handle arbitrary information and content data, provided that position information representing a position in the feature space is attached to the data, for example as metadata.

また、上述のようなコンテンツデータや各種情報を表すデータは、情報処理装置１００の内部に格納されていることが好ましい。しかしながら、情報処理装置１００の外部に設けられたサーバ等の装置にデータ本体が格納されており、情報処理装置１００には、これらのデータ本体に対応するメタデータが格納されていてもよい。以下では、情報処理装置１００が、コンテンツデータや各種情報を表すデータをメタデータとともに格納している場合を例として説明する。 Moreover, it is preferable that the above-described content data and data representing various types of information are stored inside the information processing apparatus 100. However, the data body may be stored in a device such as a server provided outside the information processing apparatus 100, and the metadata corresponding to these data bodies may be stored in the information processing apparatus 100. Hereinafter, a case where the information processing apparatus 100 stores content data and data representing various types of information together with metadata will be described as an example.

（Ｎ進数値生成部） (N-ary value generation unit)
Ｎ進数値生成部１０１は、例えば、ＣＰＵ（Central Processing Unit）、ＲＯＭ（Read Only Memory）、ＲＡＭ（Random Access Memory）等によって実現される。Ｎ進数値生成部１０１は、緯度および経度の２次元の座標によって規定される地表面１００１の位置情報を有するコンテンツ１０１１について、所定の桁数の２進数で表現された緯度および経度の値を、緯度および経度について順に１桁ずつ配列した２進数値を生成する。 The N-ary value generation unit 101 is realized by, for example, a CPU (Central Processing Unit), a ROM (Read Only Memory), a RAM (Random Access Memory), and the like. For the content 1011 having position information on the ground surface 1001 defined by two-dimensional coordinates of latitude and longitude, the N-ary value generation unit 101 generates a binary value in which the latitude and longitude values, each expressed as a binary number with a predetermined number of digits, are arranged one digit at a time, alternating between latitude and longitude.

例えば、所定の桁数として２９桁を設定した場合、Ｎ進数値生成部１０１は、緯度および経度の値をそれぞれ２９桁の２進数値で表現する。ここで、２進数で表現された緯度を“ａ_２８ａ_２７ａ_２６・・・ａ_０”とし、２進数で表現された経度を“ｂ_２８ｂ_２７ｂ_２６・・・ｂ_０”とすると、これらの値を各次元について順に１桁ずつ配列した２進数値は、“ｂ_２８ａ_２８ｂ_２７ａ_２７ｂ_２６ａ_２６・・・ｂ_０ａ_０”という５８桁の２進数値になる。なお、所定の桁数が２９桁である場合、緯度の最小分解能は、地球の半周を２００００ｋｍとすると２００００ｋｍ／２^２９＝０．０４ｍ相当になる。また、この場合、経度の最小分解能は、地球の全周を４００００ｋｍとすると４００００ｋｍ／２^２９＝０．０７ｍ相当になる。所定の桁数は、例えば、必要な最小分解能、および情報処理装置１００で扱われるデータ単位のサイズ（例えば３２ビット、６４ビットなど）を考慮した適切な値に設定されうる。 For example, when 29 digits are set as the predetermined number of digits, the N-ary value generation unit 101 expresses the latitude and longitude values as 29-digit binary values. Here, if the latitude expressed in binary is "a_28 a_27 a_26 ... a_0" and the longitude expressed in binary is "b_28 b_27 b_26 ... b_0", the binary value in which these values are arranged one digit at a time for each dimension in order is the 58-digit binary value "b_28 a_28 b_27 a_27 b_26 a_26 ... b_0 a_0". When the predetermined number of digits is 29, the minimum resolution of latitude is equivalent to 20000 km / 2^29 ≈ 0.04 m, taking half the circumference of the earth as 20000 km. Likewise, the minimum resolution of longitude is equivalent to 40000 km / 2^29 ≈ 0.07 m, taking the full circumference of the earth as 40000 km. The predetermined number of digits can be set to an appropriate value in consideration of, for example, the required minimum resolution and the size of the data unit handled by the information processing apparatus 100 (for example, 32 bits or 64 bits).

このように、Ｎ進数値生成部１０１が緯度および経度の値から１つの２進数値を生成することによって、２次元座標である位置情報を１つの数値として保持することが可能になる。また、２進数値において緯度の２進数値の値と経度の２進数値の値とが順に配列されていることによって、後述するようにクラスタリングの処理が容易になる。 In this way, because the N-ary value generation unit 101 generates a single binary value from the latitude and longitude values, position information given as two-dimensional coordinates can be held as a single numerical value. In addition, because the digits of the binary latitude value and the digits of the binary longitude value are arranged alternately in that binary value, the clustering process is simplified, as described later.
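The generation of the interleaved binary value can be sketched in Python as follows. This is a minimal illustration rather than the implementation of the N-ary value generation unit 101: the linear mapping of degrees onto a fixed number of binary digits, the coordinate ranges, and the function names `to_fixed_binary`, `interleave`, and `position_code` are all assumptions introduced here.

```python
def to_fixed_binary(value, lo, hi, digits):
    """Quantize a value in [lo, hi] to an integer of `digits` binary digits.

    The linear mapping is an assumption; the text only states that each
    coordinate is expressed with a predetermined number of digits.
    """
    cells = 1 << digits
    index = int((value - lo) / (hi - lo) * cells)
    return min(max(index, 0), cells - 1)  # keep the upper bound in the last cell


def interleave(lat_bits, lon_bits, digits):
    """Build b_{d-1} a_{d-1} ... b_0 a_0 (a: latitude digits, b: longitude digits)."""
    code = 0
    for i in reversed(range(digits)):
        code = (code << 1) | ((lon_bits >> i) & 1)
        code = (code << 1) | ((lat_bits >> i) & 1)
    return code


def position_code(lat_deg, lon_deg, digits=29):
    """Single 2*digits-digit value holding both coordinates."""
    lat_bits = to_fixed_binary(lat_deg, -90.0, 90.0, digits)
    lon_bits = to_fixed_binary(lon_deg, -180.0, 180.0, digits)
    return interleave(lat_bits, lon_bits, digits)


# 3-digit example: latitude 111 and longitude 000 interleave to 010101.
print(format(interleave(0b111, 0b000, 3), "06b"))  # 010101
```

With 29 digits per coordinate the result fits in 58 bits, so it can be held in a single 64-bit integer, which matches the remark about choosing the digit count to suit the data unit size.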

（クラスタリング部） (Clustering unit)
クラスタリング部１０３は、ＣＰＵ、ＲＯＭ、ＲＡＭ等によって実現される。クラスタリング部１０３は、後述するクラスタリング用ソート部１０５を含む。クラスタリング部１０３は、Ｎ進数値生成部１０１が生成した２進数値の上位ｋ桁（ｋ＝１，２，・・・）が共通するコンテンツ１０１１を同一のクラスタに分類する。ｋ＝２×ｍ（ｍ＝１，２，・・・）である場合、２進数値の上位ｋ桁が共通するコンテンツ１０１１が分類されるクラスタは、２^２＝４分木構造クラスタのｍ層目になる。 The clustering unit 103 is realized by a CPU, a ROM, a RAM, and the like. The clustering unit 103 includes a clustering sort unit 105 described later. The clustering unit 103 classifies contents 1011 whose binary values generated by the N-ary value generation unit 101 share the same upper k digits (k = 1, 2, ...) into the same cluster. When k = 2 × m (m = 1, 2, ...), the cluster into which contents 1011 sharing the upper k digits are classified is at the m-th layer of a cluster structure that forms a 2^2 = 4-ary tree (quadtree).

クラスタリング用ソート部１０５は、クラスタリング部１０３に含まれ、ＣＰＵ、ＲＯＭ、ＲＡＭ等によって実現される。クラスタリング用ソート部１０５は、コンテンツ１０１１を、Ｎ進数値生成部１０１が生成した２進数値の順にソートする。後述するように、クラスタリング部１０３は、クラスタリング用ソート部１０５によるソートの結果から、同一のクラスタに分類されるコンテンツ１０１１を特定する。 The clustering sort unit 105 is included in the clustering unit 103 and is realized by a CPU, a ROM, a RAM, and the like. The clustering sort unit 105 sorts the contents 1011 in the order of the binary values generated by the N-ary value generation unit 101. As will be described later, the clustering unit 103 identifies the contents 1011 classified into the same cluster from the result of sorting by the clustering sort unit 105.

ここで、図７を参照して、クラスタリング部１０３の機能について説明する。図７は、本発明の第１の実施形態におけるクラスタリングについて説明するための図である。図７では、地表面１００１上に定義されたグリッド１０３１、グリッド１０３１の上位レベルのグリッドである上位グリッド１０４１、および、クラスタリングの対象になるコンテンツ１０１１ｕ〜１０１１ｗが図示されている。 Here, the function of the clustering unit 103 will be described with reference to FIG. 7. FIG. 7 is a diagram for explaining clustering in the first embodiment of the present invention. FIG. 7 illustrates a grid 1031 defined on the ground surface 1001, an upper grid 1041 that is a higher-level grid of the grid 1031, and contents 1011u to 1011w to be clustered.

なお、図１を参照して説明したように、本実施形態において、コンテンツ１０１１が分類されるクラスタ１０２１は、コンテンツ１０１１が属するグリッド１０３１によって定義される。すなわち、クラスタリングの段階では、コンテンツ１０１１が属するグリッド１０３１が特定されれば、自動的にコンテンツ１０１１が分類されるクラスタ１０２１が特定される。それゆえ、クラスタリング部１０３がコンテンツ１０１１をクラスタ１０２１に分類することは、コンテンツ１０１１が属するグリッド１０３１を決定することと実質的に同じである。従って、図７を用いた説明では、クラスタリング部１０３が、コンテンツ１０１１が属するグリッド１０３１を特定する処理について主に説明している。 As described with reference to FIG. 1, in this embodiment, the cluster 1021 into which the content 1011 is classified is defined by the grid 1031 to which the content 1011 belongs. That is, in the clustering stage, once the grid 1031 to which the content 1011 belongs is specified, the cluster 1021 into which the content 1011 is classified is automatically specified. Therefore, classifying the content 1011 into a cluster 1021 by the clustering unit 103 is substantially the same as determining the grid 1031 to which the content 1011 belongs. Accordingly, the description using FIG. 7 mainly concerns the processing by which the clustering unit 103 specifies the grid 1031 to which the content 1011 belongs.

図７では、簡単のために、コンテンツ１０１１の位置情報の緯度および経度がそれぞれ３桁の２進数値で表現される例を示している。この例において、Ｎ進数値生成部１０１は、コンテンツ１０１１の緯度および経度を、それぞれ０００〜１１１の範囲の３桁の２進数値で表現し、これらの２進数値を順に１桁ずつ配置して、００００００〜１１１１１１の範囲の６桁の２進数値を生成する。従って、経度が０００、緯度が１１１であるコンテンツ１０１１ｕには、“０１０１０１”という２進数値が対応付けられる。また、経度が０００、緯度が１０１であるコンテンツ１０１１ｖには、“０１０００１”という２進数値が対応付けられる。さらに、経度が００１、緯度が１１０であるコンテンツ１０１１ｗには、“０１０１１０”という２進数値が対応付けられる。 For the sake of simplicity, FIG. 7 shows an example in which the latitude and longitude of the position information of the content 1011 are each expressed as a 3-digit binary value. In this example, the N-ary value generation unit 101 expresses the latitude and longitude of the content 1011 as 3-digit binary values in the range of 000 to 111, and arranges these binary values one digit at a time, longitude digit first, to generate a 6-digit binary value in the range of 000000 to 111111. Accordingly, the binary value "010101" is associated with the content 1011u, whose longitude is 000 and latitude is 111. The binary value "010001" is associated with the content 1011v, whose longitude is 000 and latitude is 101. Further, the binary value "010110" is associated with the content 1011w, whose longitude is 001 and latitude is 110.

一方、図示されている例において、地表面１００１は、４つの上位グリッド１０４１に分割され、各上位グリッド１０４１は、さらに、４つのグリッド１０３１に分割される。つまり、図示されている例において、グリッドは、４分木構造を有する。地表面全体を、この４分木構造の０層目のグリッドとすると、上位グリッド１０４１は１層目のグリッドであり、グリッド１０３１は２層目のグリッドである。上述のように、本実施形態のクラスタリングでは、クラスタ１０２１とグリッド１０３１とは対応している。そのため、図示されている例において、クラスタもグリッドと同様の４分木構造を有する。具体的には、地表面１００１にあるコンテンツ１０１１の全体を含むクラスタが４分木構造のルートノード、すなわち０層目に相当し、各上位グリッド１０４１に含まれるコンテンツ１０１１を含むクラスタが１層目のクラスタ、各グリッド１０３１に含まれるコンテンツ１０１１を含むクラスタが２層目のクラスタになる。 On the other hand, in the illustrated example, the ground surface 1001 is divided into four upper grids 1041, and each upper grid 1041 is further divided into four grids 1031. That is, in the illustrated example, the grid has a quadtree structure. If the entire ground surface is the 0th layer grid of this quadtree structure, the upper grid 1041 is the 1st layer grid and the grid 1031 is the 2nd layer grid. As described above, in the clustering of this embodiment, the cluster 1021 and the grid 1031 correspond to each other. Therefore, in the illustrated example, the cluster also has a quadtree structure similar to the grid. Specifically, the cluster including the entire content 1011 on the ground surface 1001 corresponds to the root node of the quadtree structure, that is, the 0th layer, and the cluster including the content 1011 included in each upper grid 1041 is the first layer. And the cluster including the content 1011 included in each grid 1031 is the second layer cluster.

ここで、「Ｎ進数値生成部１０１が生成した２進数値の上位ｋ桁が共通する複数のコンテンツ１０１１が同一のクラスタに分類される」ことについて説明する。 Here, it will be described that “a plurality of contents 1011 having the same upper k digits of binary values generated by the N-ary value generation unit 101 are classified into the same cluster”.

例えば、図の左上にある上位グリッド１０４１には、経度が０００〜０１１、緯度が１００〜１１１であるコンテンツ１０１１が属する。これらの範囲において、経度の１桁目は“０”であり、緯度の１桁目は“１”である。従って、コンテンツ１０１１に対応付けられた２進数値の上位２桁は、“０１”となり、この上位グリッド１０４１に属するコンテンツ１０１１の２進数値は、“０１ｘｘｘｘ”と表される。同様に、他の上位グリッドに属するコンテンツ１０１１の２進数値は、“１１ｘｘｘｘ”、“００ｘｘｘｘ”、および“１０ｘｘｘｘ”と表される。つまり、４分木構造グリッドの１層目のグリッドである上位グリッド１０４１について、コンテンツ１０１１がどの上位グリッド１０４１に属するかは、コンテンツ１０１１に対応付けられた２進数値の上位２桁によって決定される。従って、２進数値の上位２桁が共通する複数のコンテンツ１０１１は、同一の上位グリッド１０４１に属し、同一の上位グリッド１０４１に対応する同一のクラスタ、すなわち４分木構造クラスタの１層目において同一のクラスタに分類される。 For example, contents 1011 with longitudes 000 to 011 and latitudes 100 to 111 belong to the upper grid 1041 at the upper left of the figure. In these ranges, the most significant digit of the longitude is "0" and the most significant digit of the latitude is "1". Accordingly, the upper two digits of the binary value associated with such content 1011 are "01", and the binary value of the content 1011 belonging to this upper grid 1041 is written as "01xxxx". Similarly, the binary values of the contents 1011 belonging to the other upper grids are written as "11xxxx", "00xxxx", and "10xxxx". That is, for the upper grids 1041, which form the first layer of the quadtree-structured grid, the upper grid 1041 to which a content 1011 belongs is determined by the upper two digits of the binary value associated with that content 1011. Therefore, a plurality of contents 1011 whose binary values share the upper two digits belong to the same upper grid 1041 and are classified into the same cluster corresponding to that upper grid 1041, that is, into the same cluster in the first layer of the quadtree-structured clusters.

また、例えば、図の左上端にあるグリッド１０３１には、経度が０００または００１、緯度が１１０または１１１であるコンテンツ１０１１が属する。これらの範囲において、経度の上位２桁は“００”であり、緯度の上位２桁は“１１”である。従って、コンテンツ１０１１に対応付けられた２進数値の上位４桁は“０１０１”となり、このグリッド１０３１に属するコンテンツ１０１１の２進数値は、“０１０１ｘｘ”と表される。同様に、他のグリッドに属するコンテンツ１０１１の２進数値も、上位４桁が特定された形で表される。つまり、４分木構造グリッドの２層目のグリッドであるグリッド１０３１について、コンテンツ１０１１がどのグリッド１０３１に属するかは、コンテンツ１０１１に対応付けられた２進数値の上位４桁によって決定される。従って、２進数値の上位４桁が共通する複数のコンテンツ１０１１は、同一のグリッド１０３１に属し、同一のグリッド１０３１に対応する同一のクラスタ、すなわち４分木構造クラスタの２層目の同一のクラスタに分類される。 Also, for example, content 1011 with longitude 000 or 001 and latitude 110 or 111 belongs to the grid 1031 at the upper left corner of the figure. In these ranges, the upper two digits of longitude are “00” and the upper two digits of latitude are “11”. Accordingly, the upper 4 digits of the binary value associated with the content 1011 are “0101”, and the binary value of the content 1011 belonging to the grid 1031 is represented as “0101xx”. Similarly, the binary value of the content 1011 belonging to another grid is also expressed in a form in which the upper 4 digits are specified. In other words, regarding the grid 1031 which is the second layer of the quadtree structure grid, the grid 1031 to which the content 1011 belongs is determined by the upper 4 digits of the binary value associated with the content 1011. Therefore, the plurality of contents 1011 having the same upper 4 digits of the binary value belong to the same grid 1031 and correspond to the same grid 1031, that is, the same cluster in the second layer of the quadtree structure cluster. are categorized.
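The rule that the upper 2 × m digits select the cluster at the m-th layer can be sketched as follows. The function names and the content labels "p", "q", and "r" are hypothetical, and the example assumes 3 digits per coordinate (6-digit interleaved values).

```python
from collections import defaultdict


def cluster_key(code, layer, digits):
    """Upper k = 2 * layer digits of an interleaved value of 2 * digits digits."""
    return code >> (2 * (digits - layer))


def cluster_by_layer(codes, layer, digits):
    """Group content labels by the cluster key of the given quadtree layer."""
    clusters = defaultdict(list)
    for label, code in codes.items():
        clusters[cluster_key(code, layer, digits)].append(label)
    return dict(clusters)


codes = {"p": 0b010101, "q": 0b010110, "r": 0b010001}

# Layer 1 (upper 2 digits): all three share "01" and fall into one cluster.
print(cluster_by_layer(codes, 1, 3))  # {1: ['p', 'q', 'r']}

# Layer 2 (upper 4 digits): "0101" holds p and q, "0100" holds r.
print(cluster_by_layer(codes, 2, 3))  # {5: ['p', 'q'], 4: ['r']}
```

Changing the clustering granularity then amounts to changing `layer`, with no need to recompute the interleaved values.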

さらに、図７を参照して、クラスタリングの具体例として、コンテンツ１０１１ｕ〜１０１１ｗのクラスタリングについて説明する。まず、コンテンツ１０１１ｕは、２進数値“０１０１０１”に対応付けられている。この２進数値の上位４桁は“０１０１”であるため、コンテンツ１０１１ｕは、“０１０１ｘｘ”で表される２進数値に対応する図の左上端にあるグリッド１０３１に属すると判定される。 Furthermore, with reference to FIG. 7, the clustering of the contents 1011u to 1011w will be described as a specific example of clustering. First, the content 1011u is associated with the binary value “010101”. Since the upper 4 digits of this binary value are “0101”, it is determined that the content 1011u belongs to the grid 1031 at the upper left corner of the figure corresponding to the binary value represented by “0101xx”.

次に、コンテンツ１０１１ｖは、２進数値“０１０００１”に対応付けられている。この２進数値の上位４桁は“０１００”であるため、コンテンツ１０１１ｖは、“０１００ｘｘ”で表される２進数値に対応するグリッド１０３１、すなわち、コンテンツ１０１１ｕとは異なるグリッド１０３１に属すると判定される。 Next, the content 1011v is associated with the binary value “010001”. Since the upper 4 digits of this binary value are “0100”, it is determined that the content 1011v belongs to the grid 1031 corresponding to the binary value represented by “0100xx”, that is, the grid 1031 different from the content 1011u. The

次に、コンテンツ１０１１ｗは、２進数値“０１０１１０”に対応付けられている。この２進数値の上位４桁は“０１０１”であるため、コンテンツ１０１１ｗは、“０１０１ｘｘ”で表される２進数値に対応する図の左上端にあるグリッド１０３１、つまりコンテンツ１０１１ｕと同じグリッドに属すると判定される。 Next, the content 1011w is associated with the binary value "010110". Since the upper 4 digits of this binary value are "0101", the content 1011w is determined to belong to the grid 1031 at the upper left corner of the figure corresponding to the binary value written as "0101xx", that is, the same grid as the content 1011u.

ここで、クラスタリング用ソート部１０５が、コンテンツ１０１１ｕ〜１０１１ｗを、２進数値の順にソートした場合を考える。コンテンツ１０１１ｕ〜１０１１ｗを、２進数値の順にソートすると、以下のようになる。 Here, consider the case where the clustering sort unit 105 sorts the contents 1011u to 1011w in order of binary value. Sorting the contents 1011u to 1011w in order of binary value gives the following:
コンテンツ１０１１ｖ“０１０００１” Content 1011v "010001"
コンテンツ１０１１ｕ“０１０１０１” Content 1011u "010101"
コンテンツ１０１１ｗ“０１０１１０” Content 1011w "010110"
同一のグリッドに属するコンテンツ、すなわち、同一のクラスタに分類されるコンテンツであるコンテンツ１０１１ｕとコンテンツ１０１１ｗとは、このソート結果において互いに隣接する。従って、クラスタリング部１０３は、クラスタリング用ソート部１０５によるソートの結果から、同一のクラスタ１０２１に分類されるコンテンツ１０１１を特定しうる。 The contents that belong to the same grid, that is, the content 1011u and the content 1011w that are classified into the same cluster, are adjacent to each other in this sort result. Therefore, the clustering unit 103 can identify the contents 1011 classified into the same cluster 1021 from the result of sorting by the clustering sort unit 105.
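The sort-then-scan idea can be sketched with Python's standard `sorted` and `itertools.groupby`. This is only an illustration using the three 6-digit values listed above; the function name `clusters_from_sorted` is hypothetical.

```python
from itertools import groupby


def clusters_from_sorted(items, k, total_digits):
    """Sort (label, code) pairs by code, then cut the list wherever the upper k digits change."""
    shift = total_digits - k
    ordered = sorted(items, key=lambda item: item[1])
    return [
        [label for label, _ in run]
        for _, run in groupby(ordered, key=lambda item: item[1] >> shift)
    ]


items = [("1011u", 0b010101), ("1011w", 0b010110), ("1011v", 0b010001)]

# Upper 4 digits: 1011v stands alone, 1011u and 1011w are neighbors in the
# sorted order and end up in the same cluster.
print(clusters_from_sorted(items, 4, 6))  # [['1011v'], ['1011u', '1011w']]
```

Because values sharing a prefix are contiguous after sorting, a single linear pass over the sorted list is enough to recover the clusters at any chosen k.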

なお、クラスタリング部１０３およびクラスタリング用ソート部１０５の処理の詳細については後述する。 Details of the processing of the clustering unit 103 and the clustering sort unit 105 will be described later.

（マージ部） (Merge unit)
マージ部１０７は、ＣＰＵ、ＲＯＭ、ＲＡＭ等によって実現される。マージ部１０７は、後述するマージ用ソート部１０９および隣接判定部１１１を含む。マージ部１０７は、隣接判定部１１１によって、地表面のある方向について互いに隣接すると判定されたクラスタ１０２１を処理対象にする。マージ部１０７が実行するマージ処理には、後述するように探索順マージと距離順マージがある。また、マージ部１０７は、マージ処理の対象とするクラスタ１０２１を決定するために、マージ用ソート部１０９および隣接判定部１１１の処理結果を利用する場合がある。さらに、マージ部１０７は、マージ処理の条件を設定するために、記憶部１１９に格納された所定のマージ設定情報を参照してもよい。 The merge unit 107 is realized by a CPU, a ROM, a RAM, and the like. The merge unit 107 includes a merge sort unit 109 and an adjacency determination unit 111 described later. The merge unit 107 processes clusters 1021 that the adjacency determination unit 111 has determined to be adjacent to each other in a certain direction on the ground surface. The merge processing executed by the merge unit 107 includes search-order merging and distance-order merging, as described later. In addition, the merge unit 107 may use the processing results of the merge sort unit 109 and the adjacency determination unit 111 to determine the clusters 1021 to be merged. Further, the merge unit 107 may refer to predetermined merge setting information stored in the storage unit 119 in order to set the conditions for the merge process.

マージ用ソート部１０９は、マージ部１０７に含まれ、ＣＰＵ、ＲＯＭ、ＲＡＭ等によって実現される。マージ用ソート部１０９は、緯度および経度に基づく順位決定処理の結果に基づいて、地表面のある方向についてクラスタ１０２１をソートする。マージ用ソート部１０９は、ソートの結果を隣接判定部１１１に提供する。ここで、本実施形態において、順位決定処理は、グリッド１０３１を地表面における方向、例えば東西方向、南北方向、北西−南東方向または南西−北東方向など、についてソートし、各グリッド１０３１に含まれるクラスタ１０２１に、グリッド１０３１のソート順を順位として与える処理でありうる。 The merge sort unit 109 is included in the merge unit 107 and is realized by a CPU, a ROM, a RAM, and the like. The merge sort unit 109 sorts the clusters 1021 along a certain direction on the ground surface based on the result of a rank determination process that uses latitude and longitude. The merge sort unit 109 provides the sort result to the adjacency determination unit 111. Here, in this embodiment, the rank determination process may be a process of sorting the grids 1031 along a direction on the ground surface, such as the east-west, north-south, northwest-southeast, or southwest-northeast direction, and assigning the sort order of each grid 1031 as a rank to the cluster 1021 included in that grid 1031.

隣接判定部１１１は、マージ部１０７に含まれ、ＣＰＵ、ＲＯＭ、ＲＡＭ等によって実現される。隣接判定部１１１は、マージ用ソート部１０９によって地表面のある方向にソートされたクラスタが、当該方向で互いに隣接するか否かを判定する。隣接判定部１１１は、判定の結果をマージ部１０７に提供する。ここで、本実施形態において、隣接判定処理は、クラスタ１０２１を含むグリッド１０３１が隣接しているか否かを判定する処理でありうる。 The adjacency determination unit 111 is included in the merge unit 107 and is realized by a CPU, a ROM, a RAM, and the like. The adjacency determination unit 111 determines whether or not clusters sorted in a certain direction on the ground surface by the merge sorting unit 109 are adjacent to each other in the direction. The adjacency determination unit 111 provides the determination result to the merge unit 107. Here, in the present embodiment, the adjacent determination process may be a process of determining whether or not the grid 1031 including the cluster 1021 is adjacent.
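One way the sort and adjacency determination could work for the east-west direction is sketched below. The details of these processes are described later in the specification, so this is only an assumed illustration: the interleaved cluster key is split back into latitude and longitude grid indices, the grids are sorted row by row, and two consecutive grids are treated as adjacent when they are in the same row and their columns differ by one. The function names are hypothetical, and wrap-around at the edge of the longitude range is ignored.

```python
def grid_index(key, layer):
    """Split an interleaved cluster key of 2 * layer digits into (lat_index, lon_index)."""
    lat = lon = 0
    for i in reversed(range(layer)):
        lon = (lon << 1) | ((key >> (2 * i + 1)) & 1)  # longitude digit comes first
        lat = (lat << 1) | ((key >> (2 * i)) & 1)
    return lat, lon


def adjacent_pairs_east_west(keys, layer):
    """Sort grids by row, then column, and report pairs that touch in the east-west direction."""
    cells = sorted(grid_index(key, layer) + (key,) for key in keys)
    pairs = []
    for (lat1, lon1, key1), (lat2, lon2, key2) in zip(cells, cells[1:]):
        if lat1 == lat2 and lon2 - lon1 == 1:
            pairs.append((key1, key2))
    return pairs


# Second-layer grids "0101", "0111", and "0100": the first two share a row
# and sit in neighboring columns; the third is one row below.
print(adjacent_pairs_east_west([0b0101, 0b0111, 0b0100], 2))  # [(5, 7)]
```

Note that grids "0111" and "1101" are also reported as adjacent even though they belong to different upper grids, which is exactly the case where merging across a grid boundary becomes useful.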

ここで、図８を参照して、マージ部１０７の機能について説明する。図８は、本発明の第１の実施形態におけるクラスタのマージについて説明するための図である。図８では、経度方向について互いに隣接するグリッド１０３１ｘ，１０３１ｙと、それぞれのグリッドに含まれるクラスタ１０２１ｘ，１０２１ｙが図示されている。 Here, the function of the merge unit 107 will be described with reference to FIG. 8. FIG. 8 is a diagram for explaining cluster merging according to the first embodiment of the present invention. FIG. 8 shows grids 1031x and 1031y adjacent to each other in the longitude direction, and clusters 1021x and 1021y included in the respective grids.

図示された例において、マージ部１０７は、クラスタ１０２１ｘとクラスタ１０２１ｙのように、対応するグリッド１０３１が隣接しているクラスタ１０２１を、互いに隣接するクラスタ１０２１として、このようなクラスタ１０２１の間で、クラスタ間の距離ｄを算出してもよい。クラスタ間の距離ｄは、例えば、それぞれのクラスタ１０２１の中心間の距離であってもよい。マージ部１０７は、クラスタ間の距離ｄが所定の閾値以下である場合に、クラスタ１０２１ｘとクラスタ１０２１ｙとをマージしてクラスタ１０２１ｚとする。 In the illustrated example, the merge unit 107 may treat clusters 1021 whose corresponding grids 1031 are adjacent, such as the cluster 1021x and the cluster 1021y, as clusters 1021 adjacent to each other, and calculate the distance d between such clusters 1021. The distance d between clusters may be, for example, the distance between the centers of the respective clusters 1021. When the distance d between the clusters is equal to or less than a predetermined threshold, the merge unit 107 merges the cluster 1021x and the cluster 1021y into a cluster 1021z.

このように、マージ部１０７は、クラスタ１０２１が緯度または経度の方向で互いに隣接する場合に、これらのクラスタ１０２１についてクラスタ間の距離を算出してもよい。また、マージ部１０７は、クラスタ１０２１が互いに隣接するか否かとは関係なく、クラスタ間の距離を算出してもよい。さらに、マージ部１０７は、クラスタ間の距離が所定の閾値以下である場合に、これらのクラスタ１０２１をすぐにはマージせず、マージ候補クラスタとして記憶部１１９に記憶してもよい。この場合、マージ部１０７は、その後、マージ候補クラスタのうちクラスタ間の距離が小さいマージ候補クラスタから順にクラスタ１０２１をマージする。 As described above, the merge unit 107 may calculate the distance between the clusters 1021 when the clusters 1021 are adjacent to each other in the latitude or longitude direction. The merging unit 107 may calculate the distance between the clusters regardless of whether the clusters 1021 are adjacent to each other. Further, the merge unit 107 may store the clusters 1021 in the storage unit 119 as merge candidate clusters instead of immediately merging these clusters 1021 when the distance between the clusters is equal to or smaller than a predetermined threshold. In this case, the merging unit 107 then merges the clusters 1021 in order from the merge candidate cluster having the smallest distance between the clusters among the merge candidate clusters.
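The candidate-based merge in distance order can be sketched as follows. This is a simplified illustration under several assumptions: positions are treated as planar (x, y) points, the cluster center is taken to be the centroid of its points, distances are computed once before any merging, and the names `centroid` and `merge_by_distance` are hypothetical.

```python
import math


def centroid(points):
    """Mean position of a list of (x, y) points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def merge_by_distance(clusters, adjacent_pairs, threshold):
    """Merge adjacent clusters whose center distance is within the threshold, closest first.

    clusters: dict mapping a cluster key to its list of (x, y) points.
    adjacent_pairs: list of (key, key) pairs judged adjacent beforehand.
    """
    candidates = []
    for a, b in adjacent_pairs:
        d = math.dist(centroid(clusters[a]), centroid(clusters[b]))
        if d <= threshold:
            candidates.append((d, a, b))  # stored as a merge candidate

    merged = {key: list(points) for key, points in clusters.items()}
    owner = {key: key for key in clusters}  # original key -> key of the cluster holding it
    for _, a, b in sorted(candidates):
        keep, absorb = owner[a], owner[b]
        if keep == absorb:
            continue  # already merged through another pair
        merged[keep].extend(merged.pop(absorb))
        for key in list(owner):
            if owner[key] == absorb:
                owner[key] = keep
    return merged


clusters = {"x": [(0.9, 0.5), (0.8, 0.5)], "y": [(1.1, 0.5)], "z": [(3.0, 0.5)]}
result = merge_by_distance(clusters, [("x", "y"), ("y", "z")], threshold=0.5)
print(sorted(result))  # ['x', 'z']
```

Here the clusters "x" and "y" play the roles of the clusters 1021x and 1021y in FIG. 8: their centers are within the threshold, so they are merged, while "z" is left as it is.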

なお、マージ部１０７、マージ用ソート部１０９、および隣接判定部１１１の処理の詳細については後述する。 Details of the processes of the merge unit 107, the merge sort unit 109, and the adjacency determination unit 111 will be described later.

（入力部） (Input unit)
図６に戻って、入力部１１３は、本実施形態に係る情報処理装置１００が備える入力装置の一例である。この入力部１１３は、例えば、ＣＰＵ、ＲＯＭ、ＲＡＭ、入力装置等により実現される。入力部１１３は、情報処理装置１００が備えるキーボード、マウス、タッチパネル等になされたユーザ操作を、なされたユーザ操作に対応する電気的な信号に変換して、Ｎ進数値生成部１０１および表示制御部１１５に通知する。例えば、ユーザによって、クラスタリングの実行を指示する操作や、クラスタリングの粒度の変更を指示する操作がなされた場合、入力部１１３は、かかる指示を表す情報を生成して、Ｎ進数値生成部１０１などに出力する。 Returning to FIG. 6, the input unit 113 is an example of an input device included in the information processing apparatus 100 according to the present embodiment. The input unit 113 is realized by, for example, a CPU, a ROM, a RAM, an input device, and the like. The input unit 113 converts a user operation performed on a keyboard, a mouse, a touch panel, or the like included in the information processing apparatus 100 into an electrical signal corresponding to that operation, and notifies the N-ary value generation unit 101 and the display control unit 115. For example, when the user performs an operation instructing execution of clustering or an operation instructing a change in the granularity of clustering, the input unit 113 generates information representing the instruction and outputs it to the N-ary value generation unit 101 and other units.

（表示制御部） (Display control unit)
表示制御部１１５は、例えば、ＣＰＵ、ＲＯＭ、ＲＡＭ等により実現される。表示制御部１１５は、例えば、入力部１１３から、クラスタリング結果の表示を指示するユーザ操作がなされた旨の通知を受けると、クラスタリング部１０３およびマージ部１０７によって記憶部１１９などに格納されている、コンテンツ１０１１のクラスタリング結果を取得する。その後、表示制御部１１５は、例えば、図３を参照して説明したようなクラスタリング結果の画像を構成し、この画像を後述する表示部１１７に表示させる表示制御を行ってもよい。 The display control unit 115 is realized by, for example, a CPU, a ROM, a RAM, and the like. For example, on receiving a notification from the input unit 113 that a user operation instructing display of the clustering result has been performed, the display control unit 115 acquires the clustering result of the content 1011 that the clustering unit 103 and the merge unit 107 have stored in the storage unit 119 or the like. Thereafter, the display control unit 115 may, for example, compose an image of the clustering result as described with reference to FIG. 3 and perform display control to display this image on the display unit 117 described later.

(Display unit)
The display unit 117 is an example of a display device included in the information processing apparatus 100 according to the present embodiment. The display unit 117 displays various contents that the information processing apparatus 100 can execute, execution screens of various applications, and the like. The display unit 117 may also display various objects used to control the operation of contents, the execution state of applications, and the like. Under the control of the display control unit 115, the display screen of the display unit 117 displays various information, for example an image of the clustering result such as the one described with reference to FIG. 3.

(Storage unit)
The storage unit 119 is an example of a storage device included in the information processing apparatus 100 according to the present embodiment. The storage unit 119 may store various content data held by the information processing apparatus 100, metadata associated with the content data, and the like. The storage unit 119 may also store the binary values generated by the N-ary value generation unit 101, the result of the clustering unit 103 classifying the content 1011 into the clusters 1021, and the result of the merge unit 107 merging the clusters 1021. The storage unit 119 may further store execution data corresponding to the various applications that the display control unit 115 uses to display information on the display unit 117. In addition, the storage unit 119 stores, as appropriate, various parameters and intermediate results that need to be saved while the information processing apparatus 100 performs processing, as well as various databases. Each processing unit of the information processing apparatus 100 according to the present embodiment can freely read from and write to the storage unit 119.

(Supplementary notes on the information processing apparatus)
Note that the information processing apparatus 100 according to the present embodiment may be any apparatus having a function of acquiring the position information associated with the content 1011 from the content 1011 itself or from an additional data file. Examples of the information processing apparatus 100 include imaging devices such as digital still cameras and digital video cameras, multimedia content viewers with built-in storage, portable information terminals capable of recording, storing, and browsing content, content management and browsing services linked with a map service on a network, application software for personal computers, portable game terminals having a photo data management function, camera-equipped mobile phones having a storage device, storage devices, and digital home appliances and game machines having a photo data management function.

An example of the functions of the information processing apparatus 100 according to the present embodiment has been described above. Each of the above components may be configured using general-purpose members or circuits, or may be configured by hardware specialized for the function of that component. Alternatively, a CPU or the like may perform all of the functions of the components. The configuration to be used can therefore be changed as appropriate according to the technical level at the time the present embodiment is carried out.

A computer program for realizing each function of the information processing apparatus according to the present embodiment as described above can be created and installed on a personal computer or the like. A computer-readable recording medium storing such a computer program can also be provided. The recording medium is, for example, a magnetic disk, an optical disk, a magneto-optical disk, or a flash memory. The computer program may also be distributed without using a recording medium, for example via a network.

(1-3. Details of clustering and merge processing)
Next, details of the clustering process and the merge process in the first embodiment of the present invention will be described with reference to FIGS. 9 to 29.

FIG. 9 is a flowchart showing the clustering and merge processing in the first embodiment of the present invention. As described with reference to FIG. 6, in the information processing apparatus 100 according to the present embodiment, the merge unit 107 executes the merge process on the clusters 1021 identified as a result of clustering by the clustering unit 103. Accordingly, in the processing flow, the clustering process is executed first (step S101), and the merge-related process is executed next (step S103). The merge-related process is a generic term for the merge process together with related processing such as parameter setting. Details of the clustering process and the merge-related process will be described later.

Table 1 shows examples of combinations of the grid level set in the clustering process and the merge threshold set in the merge process. For example, when the combination of grid level 10 and a merge threshold of 50 km is selected, the level 10 grid is used in the clustering process. In the hierarchical grid structure described with reference to FIG. 2, each grid is divided into four at every level. The level 10 grid is therefore the grid obtained by dividing the entire surface of the earth, which is the range of the level 0 grid, into 4^10 parts. The merge threshold is the threshold for the distance d between clusters described with reference to FIG. 8. In the above case, clusters 1021 contained in mutually adjacent grids 1031 are merged if the distance d between them is 50 km or less.
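As a rough illustration of the grid arithmetic above, the following C sketch computes how many grids exist at a given level. The function names `grid_count` and `divisions_per_axis` are hypothetical and do not appear in the source. The per-axis figure assumes that each level halves both the latitude range and the longitude range, which is consistent with one binary digit per dimension per level.

```c
#include <stdint.h>

/* Number of grids at a given level when every grid is split into 4 per level.
 * Precondition: level < 32, so that the result fits in 64 bits. */
uint64_t grid_count(unsigned level)
{
    return (uint64_t)1 << (2u * level);   /* 4^level = 2^(2*level) */
}

/* Number of divisions along one axis at that level (assumes each level
 * halves the latitude range and the longitude range). */
uint64_t divisions_per_axis(unsigned level)
{
    return (uint64_t)1 << level;          /* 2^level */
}
```

Under these assumptions, level 10 yields 4^10 = 1,048,576 grids, that is, 1,024 divisions along each axis.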

The grid level and the merge threshold described above can each be set to an arbitrary value. The combinations shown in Table 1 are examples in which, near 40 degrees north latitude, a grid is contained within a circle whose radius is the merge threshold. If the grid level is made large relative to the merge threshold (that is, if the grid size is made small), grids may be judged non-adjacent at the grid adjacency determination stage, and clusters may then fail to be merged even though the distance between them is at or below the merge threshold. However, when the range of the grid adjacency determination can be widened by using the four-direction search or the upper-level search described later, reducing the grid size brings the clustering result closer to a natural shape, at the cost of more computation.

(Details of the clustering process)
The clustering process performed by the clustering unit 103 will be described in more detail with reference to FIG. 10. FIG. 10 is a diagram for explaining clustering in the first embodiment of the present invention. FIG. 10 shows the state after the clustering sorting unit 105 has sorted the content 1011 in the order of the binary values generated by the N-ary value generation unit 101 and associated with the content 1011. In the sorted result, the content 1011 is arranged in the order of the clusters into which it is classified, both for the first-layer clusters 1024 and for the second-layer clusters 1025. For simplicity, FIG. 10 again shows an example in which the latitude and the longitude of the position information of the content 1011 are each represented by a 3-digit binary value.

As described above, in the clustering of the present embodiment, a plurality of contents 1011 whose binary values generated by the N-ary value generation unit 101 share the same upper k digits are classified into the same cluster. When k = 2 × m (m = 1, 2, ...), the cluster into which the content 1011 sharing the upper k digits is classified is a cluster in layer m of a quadtree (2^2 = 4-ary tree) structure. For example, a plurality of contents 1011 sharing the upper two digits of the binary value are classified into the same cluster in the first layer of the quadtree structure. In other words, the first-layer clusters of the quadtree structure are the four clusters corresponding to the binary values represented by "00xxxx", "01xxxx", "10xxxx", and "11xxxx". Likewise, the second-layer clusters of the quadtree structure are the 16 clusters corresponding to the binary values represented by "0000xx", "0001xx", "0010xx", ..., "1111xx".
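The prefix rule above can be sketched in C as a right shift followed by a comparison. This is only an illustration: the helper names `upper_digits` and `same_cluster`, and the `total_bits` parameter (the full width of the code, 6 in the simplified example), are assumptions introduced here.

```c
#include <stdint.h>

/* Return the upper k digits of a binary code that is total_bits wide.
 * Precondition: 0 < k <= total_bits <= 64. */
uint64_t upper_digits(uint64_t code, unsigned total_bits, unsigned k)
{
    return code >> (total_bits - k);
}

/* Two codes belong to the same layer-m quadtree cluster when their
 * upper k = 2*m digits are identical. */
int same_cluster(uint64_t a, uint64_t b, unsigned total_bits, unsigned m)
{
    unsigned k = 2u * m;
    return upper_digits(a, total_bits, k) == upper_digits(b, total_bits, k);
}
```

For instance, with 6-digit codes, "000100" (0x04) and "000111" (0x07) share the prefix "0001" and fall in the same second-layer cluster, whereas "000010" (0x02) and "000100" share only the first-layer prefix "00".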

In the example of FIG. 10, a first-layer cluster 1024a corresponding to "00xxxx" and a first-layer cluster 1024b corresponding to "01xxxx" are shown as examples of the first-layer clusters 1024. As examples of the second-layer clusters 1025, a second-layer cluster 1025a corresponding to "0000xx", a second-layer cluster 1025b corresponding to "0001xx", a second-layer cluster 1025c corresponding to "0010xx", and a second-layer cluster 1025d corresponding to "0011xx" are shown.

Taking the second-layer clusters as an example, the content 1011 associated with "000010", which is at the head of the sorted binary values, is classified into the second-layer cluster 1025a. The contents 1011 associated with the next four binary values, "000100" through "000111", are classified into the second-layer cluster 1025b. The content 1011 associated with the next value, "001001", is classified into the second-layer cluster 1025c, and the content 1011 associated with the value after that, "001110", is classified into the second-layer cluster 1025d.

As for the first-layer clusters, the contents 1011 associated with the seven binary values from "000010", at the head of the sorted values, through "001110" are classified into the first-layer cluster 1024a. The contents 1011 associated with the next four binary values, "010011" through "011101", are classified into the first-layer cluster 1024b.

Thus, in the clustering of the present embodiment, when the content 1011 is sorted in the order of the binary values generated by the N-ary value generation unit 101, the content 1011 ends up arranged cluster by cluster. In other words, the clustering process of the present embodiment is realized by sorting the content 1011 in the order of the binary values generated by the N-ary value generation unit 101.

In general distance-based position clustering, pairs of nearby contents are searched for, so the number of operations equals the number of content combinations: when the number of contents is N, the number of operations is O(N^2). In the clustering of the present embodiment, by contrast, the clustering process is essentially a sort, so the number of operations is O(N log N), which is smaller. Moreover, whereas a single operation in general distance-based position clustering is a distance calculation between two-dimensional coordinates, a single operation in the clustering of the present embodiment is a magnitude comparison of two numbers, so the load per operation is also lower.
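To make the "clustering is a sort" point concrete, here is a minimal C sketch that orders items by their code using the standard library's `qsort`. The `Item` layout follows the structure shown later in this description; the function names are hypothetical. Note that the C standard does not itself guarantee a complexity for `qsort`, so the O(N log N) figure is an assumption about the sorting algorithm actually used.

```c
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    uint32_t id;       /* content ID */
    uint64_t geocode;  /* interleaved binary value for the position */
} Item;

/* One "operation" of the clustering: a plain magnitude comparison. */
static int compare_geocode(const void *pa, const void *pb)
{
    const Item *a = pa, *b = pb;
    return (a->geocode > b->geocode) - (a->geocode < b->geocode);
}

/* After this call, items that share any code prefix are contiguous. */
void sort_items_by_geocode(Item *items, size_t count)
{
    qsort(items, count, sizeof items[0], compare_geocode);
}
```

The comparator uses two comparisons rather than a subtraction so that 64-bit differences cannot overflow the `int` return value.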

FIG. 11 is a diagram for explaining the cluster specifying information in the first embodiment of the present invention. FIG. 11 shows an array of content specifying information, which specifies the content 1011, and an array of cluster specifying information, which specifies the clusters 1021.

In the present embodiment, the clustering unit 103 generates cluster specifying information that specifies a cluster 1021 by means of two values taken from the result of the sort by the clustering sorting unit 105: the first position at which content 1011 classified into that cluster 1021 appears, and the number of contents 1011 classified into that cluster 1021.

In the illustrated example, the content 1011 is defined by a data structure called Item. Item may be, for example, a data structure such as the following.

struct Item {
    uint32_t id;       /* unique ID of the content 1011 */
    uint64_t geocode;  /* binary value from the N-ary value generation unit 101 */
};

Here, id is a unique ID assigned to identify each content 1011, and geocode is the binary value generated by the N-ary value generation unit 101 for that content 1011. In the illustrated array of Items, the Items are sorted according to geocode.

In the illustrated example, the cluster 1021 is defined by a data structure called Cluster. Cluster is, for example, a data structure such as the following.

struct Cluster {
    uint64_t clusterid;   /* unique ID of the cluster 1021 */
    uint32_t latcode;     /* latitude code of the corresponding grid 1031 */
    uint32_t lngcode;     /* longitude code of the corresponding grid 1031 */
    float latitude;       /* latitude, longitude, halfEW, halfNS: */
    float longitude;      /*   define the area of the cluster 1021 */
    float halfEW;
    float halfNS;
    uint32_t numLeaves;   /* number of contents 1011 in the cluster */
    struct Item *pLeaves; /* first content of the cluster in the sorted Item array */
};

Here, clusterid is a unique ID assigned to identify each cluster 1021. latcode and lngcode are the latitude and longitude codes of the grid 1031 corresponding to the cluster 1021. For example, if the grid 1031 corresponding to the cluster 1021 corresponds to the binary value "100111", latcode is "011" and lngcode is "101". latitude, longitude, halfEW, and halfNS are information for defining the area of the cluster 1021.
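The relationship between the interleaved value and the two per-axis codes can be sketched as a de-interleaving loop. In the sketch below, the function name `split_code` and the `digits` parameter are hypothetical, and the digit order (longitude digit first, then latitude digit, starting from the most significant digit) is inferred from the "100111" example above rather than stated explicitly in this passage.

```c
#include <stdint.h>

/* Split an interleaved code of 2*digits bits into latitude and longitude codes.
 * Assumed digit order from the most significant bit: lng, lat, lng, lat, ...
 * Precondition: 0 < digits <= 32. */
void split_code(uint64_t code, unsigned digits,
                uint32_t *latcode, uint32_t *lngcode)
{
    uint32_t lat = 0, lng = 0;
    for (unsigned i = 0; i < digits; i++) {
        unsigned shift = 2u * (digits - 1u - i);   /* position of the lat bit */
        lng = (lng << 1) | (uint32_t)((code >> (shift + 1u)) & 1u);
        lat = (lat << 1) | (uint32_t)((code >> shift) & 1u);
    }
    *latcode = lat;
    *lngcode = lng;
}
```

Applied to "100111" (0x27) with three digits per axis, this produces latcode "011" and lngcode "101", matching the example in the text.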

numLeaves is the number of contents 1011 classified into the cluster 1021. pLeaves is a pointer indicating the position of the first content 1011 classified into the cluster 1021 within the array of Items that the clustering sorting unit 105 has sorted by geocode. These two elements are the information for specifying the contents 1011 classified into the cluster 1021. As in the example described with reference to FIG. 8, the Item array sorted by geocode is arranged in the order of the clusters into which the contents 1011 defined by the Items are classified. The contents 1011 classified into the cluster 1021 can therefore be specified by the information "starting where in the array (pLeaves) and how many (numLeaves)".

In general clustering, when contents are classified into a cluster, the information defining the cluster holds information that specifies each classified content, for example an array of content IDs. In that case, as the number of contents classified into the cluster increases, the content ID array grows, and so does the information defining the cluster. In the clustering of the present embodiment, by contrast, the contents 1011 classified into the cluster 1021 are specified by the "starting where and how many" information described above, so the size of the information defining the cluster 1021 stays small even when the number of contents 1011 classified into it increases.
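The "starting where and how many" representation can be sketched as a single pass over the sorted array. The following C code is an illustration under stated assumptions: `build_clusters`, `total_bits`, and `max_clusters` are hypothetical names, the Cluster structure is reduced to the fields needed here, and using the shared prefix as `clusterid` is a choice made for this sketch, not something the source specifies.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t id;
    uint64_t geocode;
} Item;

typedef struct {
    uint64_t clusterid;   /* here: the shared upper-k-digit prefix (assumption) */
    uint32_t numLeaves;   /* how many items */
    Item    *pLeaves;     /* where the run starts in the sorted array */
} Cluster;

/* items must already be sorted by geocode.
 * Precondition: 0 < k <= total_bits <= 64.
 * Writes at most max_clusters entries and returns the number written. */
size_t build_clusters(Item *items, size_t count, unsigned total_bits,
                      unsigned k, Cluster *out, size_t max_clusters)
{
    size_t n = 0;
    for (size_t i = 0; i < count; i++) {
        uint64_t prefix = items[i].geocode >> (total_bits - k);
        if (n > 0 && out[n - 1].clusterid == prefix) {
            out[n - 1].numLeaves++;        /* same run: extend current cluster */
        } else {
            if (n == max_clusters)
                break;                     /* output buffer is full */
            out[n].clusterid = prefix;
            out[n].numLeaves = 1;
            out[n].pLeaves = &items[i];    /* first item of a new run */
            n++;
        }
    }
    return n;
}
```

Each cluster record has a fixed size regardless of how many items it covers, which is the storage advantage described above.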

(Details of the merge-related process)
FIG. 12 is a flowchart showing the merge-related process in the first embodiment of the present invention. In the merge-related process, whether to perform the merge process is determined by whether merge setting information (config) is set. When the merge process is performed, either the search-order merge or the distance-order merge is executed according to the content of the merge setting information (config). The merge setting information (config) is also used to set parameters for the search-order merge and the distance-order merge.

First, the merge unit 107 executes the merge setting selection process (step S201). In this process, the merge setting information (config) is selected. Details of the merge setting selection process will be described later.

Next, the merge unit 107 determines whether data is set in the merge setting information (config) (step S203).

If the merge setting information (config) is set in step S203, the merge unit 107 determines whether the distance-order merge flag (sortPair) of the merge setting information (config) is "true", that is, whether the distance-order merge is enabled (step S205).

If the distance-order merge flag (sortPair) is "false" in step S205, that is, if the distance-order merge is not enabled, the merge unit 107 executes the search-order merge process (step S207). Details of the search-order merge process will be described later.

If, on the other hand, the distance-order merge flag (sortPair) is "true" in step S205, that is, if the distance-order merge is enabled, the merge unit 107 executes the distance-order merge process (step S209). Details of the distance-order merge process will be described later.

If the merge setting information (config) is null in step S203, that is, if no data is set, the merge unit 107 ends the process without executing either the search-order merge process or the distance-order merge process.
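The branching in steps S203 through S209 can be summarized in a small C sketch. The type and function names (`MergeConfig`, `MergeKind`, `choose_merge`) are hypothetical, and the configuration is reduced to the one flag that matters for this decision.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    bool sortPair;   /* true: distance-order merge, false: search-order merge */
} MergeConfig;

typedef enum {
    MERGE_NONE,
    MERGE_SEARCH_ORDER,
    MERGE_DISTANCE_ORDER
} MergeKind;

/* Mirrors steps S203 and S205: decide which merge process, if any, runs. */
MergeKind choose_merge(const MergeConfig *config)
{
    if (config == NULL)                /* S203: nothing set, skip merging */
        return MERGE_NONE;
    if (config->sortPair)              /* S205: distance-order merge enabled */
        return MERGE_DISTANCE_ORDER;   /* S209 */
    return MERGE_SEARCH_ORDER;         /* S207 */
}
```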

(Details of the merge setting selection process)
FIG. 13 is a diagram showing an example of the merge setting information in the first embodiment of the present invention. FIG. 13 shows, as elements of the merge setting information, the maximum applicable grid count (maxGrid), the search method (searchType), the upper-level search (upperLevel), the distance-order merge (sortPair), and the distance calculation.

The merge setting information may be stored in the storage unit 119 of the information processing apparatus 100, for example in the form of a table as illustrated. Each piece of merge setting information is stored as a merge setting record 1051, and each merge setting record 1051 may be identified by an index. Each element of the merge setting information is described below.

The maximum applicable grid count (maxGrid) is the largest number of grids for which the merge setting information can be used. The merge setting information (config) used for the merge process is selected from the merge setting information for which the number of grids is at or below this maximum applicable grid count. The number of grids here may be the number of grids 1031 that contain a cluster 1021 as a result of the clustering process. Therefore, even when the grid level is large (the grid size is small), merge setting information with a relatively small maximum applicable grid count may be selected if the number of contents 1011 is small or if the contents are concentrated in particular grids.

The search method (searchType) specifies the search method used in the merge process. "Full Match" is a search method in which all combinations of the grids 1031 containing clusters 1021 are subject to the merge process. The merge process using this search method is hereinafter referred to as the full match merge process. "4 Dir", "2 Dir", and "1 Dir" denote a four-direction search, a two-direction search, and a one-direction search, respectively; these are search methods in which grids adjacent in specific directions are subject to the merge process. The merge process using these search methods is hereinafter referred to as the neighbor search merge process. Details of the full match merge process and the neighbor search merge process will be described later.

The upper-level search (upperLevel) specifies whether an upper-level search is executed and, if so, how many levels up the search goes. "2" and "1" indicate that the upper-level search is executed, up to two levels higher and one level higher, respectively. "0" (disable) indicates that the upper-level search is not executed. In the full match merge process, all combinations of the grids 1031 containing clusters 1021 are subject to the merge process, so an upper-level search is unnecessary, and the upper-level search (upperLevel) is therefore not defined. Details of the upper-level search will be described later.

The distance-order merge flag (sortPair) specifies whether the distance-order merge is executed. The distance-order merge is a merge method in which, among the pairs of grids 1031 containing clusters 1021, pairs are merged in ascending order of the distance between their clusters 1021. When the distance-order merge is executed, clusters 1021 that are close to each other can be merged reliably, but the grid pairs must be held temporarily, which occupies a corresponding amount of storage. "true" indicates that the distance-order merge is executed, and "false" indicates that the search-order merge described later is executed instead. Details of the distance-order merge will be described later.

The distance calculation specifies the distance calculation method used when calculating the distance between clusters. When "great-circle distance" is used and the coordinates (longitude, latitude) of the centers of two clusters 1021 are (lon1, lat1) and (lon2, lat2), the distance d between the clusters is calculated by Equation (1).

When "approximate great-circle distance" is used, the distance d between the clusters in the same situation is calculated by Equation (2).
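Equations (1) and (2) themselves are not reproduced in this text. For orientation only, the block below gives two standard formulas that are commonly used for these purposes: the spherical law of cosines for the great-circle distance and the equirectangular approximation for short distances. They are offered as assumptions and may differ from the equations in the original publication. R denotes the Earth's radius (a symbol introduced here), and all angles are in radians.

```latex
% Great-circle distance (spherical law of cosines); stand-in, not the source's Equation (1)
d = R \arccos\bigl( \sin(\mathrm{lat1}) \sin(\mathrm{lat2})
      + \cos(\mathrm{lat1}) \cos(\mathrm{lat2}) \cos(\mathrm{lon2} - \mathrm{lon1}) \bigr)

% Approximate great-circle distance (equirectangular); stand-in, not the source's Equation (2)
d \approx R \sqrt{ \Bigl( (\mathrm{lon2} - \mathrm{lon1})
      \cos\frac{\mathrm{lat1} + \mathrm{lat2}}{2} \Bigr)^{2}
      + (\mathrm{lat2} - \mathrm{lat1})^{2} }
```

The approximation replaces the trigonometric evaluation of the exact form with one cosine and a square root, which is the kind of trade-off that would suit the simpler merge settings.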

Such merge setting information may, for example, be set in advance and stored in the storage unit 119. The illustrated merge setting information is configured so that a smaller maximum applicable grid count (maxGrid) corresponds to a more sophisticated merge, and a larger maximum applicable grid count (maxGrid) corresponds to a simpler merge. No merge setting information is defined for the case where the number of grids exceeds 50000, and in that case the merge process is not executed. The reason is that the load of the merge process increases with the number of grids, so the merge settings are changed according to the number of grids in order to bound the maximum computational load.
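One way to picture a merge setting record 1051 is as a plain C structure with one field per element described above. This is a sketch only: the type names, enumerator names, field widths, and the helper `uses_upper_level_search` are all assumptions, and the actual values in FIG. 13 are not reproduced here.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum {
    SEARCH_FULL_MATCH,   /* "Full Match" */
    SEARCH_4_DIR,        /* "4 Dir" */
    SEARCH_2_DIR,        /* "2 Dir" */
    SEARCH_1_DIR         /* "1 Dir" */
} SearchType;

typedef enum {
    DIST_GREAT_CIRCLE,
    DIST_APPROX_GREAT_CIRCLE
} DistanceCalc;

typedef struct {
    uint32_t     maxGrid;     /* maximum applicable grid count */
    SearchType   searchType;  /* search method */
    uint8_t      upperLevel;  /* 0 disables; not meaningful for full match */
    bool         sortPair;    /* true: distance-order merge */
    DistanceCalc distance;    /* distance calculation method */
} MergeSetting;

/* Upper-level search applies only to neighbor search settings with depth > 0. */
bool uses_upper_level_search(const MergeSetting *s)
{
    return s->searchType != SEARCH_FULL_MATCH && s->upperLevel > 0;
}
```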

FIG. 14 is a flowchart showing the merge setting selection process in the first embodiment of the present invention. In the merge setting selection process, the merge setting information (config) used for the merge process is selected from the merge setting information described with reference to FIG. 13. As noted above, there are also cases in which no merge setting information (config) is set, which specifies that the merge process is not to be executed.

まず、マージ部１０７は、クラスタ１０２１が含まれるグリッド１０３１のリストであるグリッドリスト（glist）の長さ、すなわちグリッド数が、マージ設定リストmlistの末尾の要素tailでの適用最大グリッド数（maxGrid）以下であるか否かを判定する（ステップＳ３０１）。 First, the merging unit 107 has a length of a grid list (glist) that is a list of grids 1031 including the cluster 1021, that is, the number of grids, the maximum number of grids applied (maxGrid) in the tail element of the merge setting list mlist. It is determined whether or not the following is true (step S301).

ステップＳ３０１において、グリッド数がマージ設定リストmlistの末尾の要素tailでの適用最大グリッド数（maxGrid）以下であると判定された場合、マージ部１０７は、マージ設定リストmlistの各要素について順に、以下のステップＳ３０５およびステップＳ３０７を繰り返す（ステップＳ３０３：マージ設定リストループ）。 In step S301, when it is determined that the number of grids is equal to or less than the maximum number of applied grids (maxGrid) in the tail element tail of the merge setting list mlist, the merge unit 107 sequentially performs the following on each element of the merge setting list mlist. Step S305 and step S307 are repeated (step S303: merge setting list loop).

ステップＳ３０５は、グリッドリスト（glist）の長さlengthを、マージ設定リストmlistのインデックスｉの要素の適用最大グリッド数（maxGrid）と比較するステップである。 Step S305 is a step of comparing the length length of the grid list (glist) with the maximum number of applied grids (maxGrid) of the element of index i in the merge setting list mlist.

If it is determined in step S305 that the length of the grid list (glist) is equal to or less than the applied maximum number of grids (maxGrid), the merge unit 107 sets the element at index i of the merge setting list mlist as the merge setting information (config) (step S307). If merge setting information (config) has already been set, it is overwritten with the newly set element at index i of the merge setting list mlist.

On the other hand, if it is determined in step S305 that the length of the grid list (glist) exceeds the applied maximum number of grids (maxGrid), the merge setting information (config) is left unchanged and the merge setting list loop continues.

ステップＳ３０３のマージ設定リストループが終了すると、マージ部１０７はマージ設定選択処理を終了する。 When the merge setting list loop in step S303 ends, the merge unit 107 ends the merge setting selection process.

一方、ステップＳ３０１において、グリッド数がマージ設定リストmlistの末尾の要素tailでの適用最大グリッド数（maxGrid）を超えると判定された場合、マージ部１０７は、マージ設定情報（config）にnullを設定する（ステップＳ３０９）。nullは、データが設定されない状態を示す。上述のように、マージ設定情報（config）がnullである場合、マージ部１０７は後続のマージ処理を実行しない。 On the other hand, when it is determined in step S301 that the number of grids exceeds the maximum number of applicable grids (maxGrid) in the tail element of the merge setting list mlist, the merge unit 107 sets null in the merge setting information (config). (Step S309). null indicates a state in which no data is set. As described above, when the merge setting information (config) is null, the merge unit 107 does not execute subsequent merge processing.
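The selection flow of FIG. 14 can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment. The `MergeConfig` type, the function name, and the table values are hypothetical; only the ceiling of 50000 grids comes from the description. The sketch also assumes that mlist is ordered by ascending maxGrid and that the loop runs from the tail toward the head, so that the final overwrite in step S307 leaves the entry with the smallest maxGrid that still fits. The description does not state the traversal order.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class MergeConfig:
    max_grid: int      # applied maximum number of grids (maxGrid)
    search_type: str   # search method (searchType)
    upper_level: int   # upper level search (upperLevel)


def select_merge_config(n_grids: int,
                        mlist: Sequence[MergeConfig]) -> Optional[MergeConfig]:
    # S301 / S309: more grids than the tail entry allows, so config is null.
    if not mlist or n_grids > mlist[-1].max_grid:
        return None
    config = None
    # S303: merge setting list loop.
    # Assumption: traverse from tail to head so the last overwrite is the
    # entry with the smallest maxGrid that still covers n_grids.
    for entry in reversed(mlist):
        if n_grids <= entry.max_grid:   # S305
            config = entry              # S307 (overwrites any earlier choice)
    return config


# Hypothetical table: only the 50000 ceiling is taken from the description.
MLIST = [
    MergeConfig(100, "Full Match", 0),
    MergeConfig(5000, "4 Dir", 1),
    MergeConfig(50000, "1 Dir", 0),
]
```

With this table, a grid list of 50 grids would select the "Full Match" entry, and a grid list of more than 50000 grids would yield `None`, meaning that no merge is performed.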

(Details of search order merge processing)
The search order merge process includes searching for clusters 1021 whose mutual distance is equal to or less than a predetermined threshold, and merging such clusters 1021 sequentially in the order in which they are found. In the full match merge process described later, the distance between clusters 1021 is calculated for all combinations of the grids 1031 that contain a cluster 1021. In the neighborhood search merge process, also described later, the grids 1031 that contain a cluster 1021 are sorted in a specific direction, and the distance is calculated between the clusters 1021 contained in grids 1031 that are adjacent in that direction.

図１５は、本発明の第１の実施形態における探索順マージ処理を示すフローチャートである。探索順マージでは、マージ設定情報（config）の内容に応じて、フルマッチマージ処理、近傍探索マージ処理のいずれかが実行される。また、近傍探索マージ処理が実行される場合、マージ設定情報（config）の内容に応じて、上位探索あり、または上位探索なしのいずれかで実行される。 FIG. 15 is a flowchart showing search order merge processing according to the first embodiment of the present invention. In the search order merge, either the full match merge process or the neighbor search merge process is executed according to the contents of the merge setting information (config). Further, when the neighborhood search merge process is executed, it is executed with or without an upper search according to the contents of the merge setting information (config).

まず、マージ部１０７は、マージ設定情報（config）の探索手法（searchType）が、“Full Match”であるか否かを判定する（ステップＳ４０１）。 First, the merge unit 107 determines whether or not the search method (searchType) of the merge setting information (config) is “Full Match” (step S401).

ステップＳ４０１において、マージ設定情報（config）の探索手法（searchType）が、“Full Match”である場合、マージ部１０７は、フルマッチマージ処理を実行する（ステップＳ４０３）。フルマッチマージ処理のさらなる詳細については後述する。 If the search method (searchType) of the merge setting information (config) is “Full Match” in step S401, the merge unit 107 executes a full match merge process (step S403). Further details of the full match merge process will be described later.

一方、ステップＳ４０１において、マージ設定情報（config）の探索手法（searchType）が、“Full Match”ではない場合、マージ部１０７は、さらに、マージ設定情報（config）の上位レベル探索（upperLevel）が、“０”であるか否かを判定する（ステップＳ４０５）。 On the other hand, when the search method (searchType) of the merge setting information (config) is not “Full Match” in step S401, the merge unit 107 further performs an upper level search (upperLevel) of the merge setting information (config), It is determined whether or not it is “0” (step S405).

ステップＳ４０５において、マージ設定情報（config）の上位レベル探索（upperLevel）が“０”である場合、マージ部１０７は、上位探索なしの近傍探索マージ処理を実行する（ステップＳ４０７）。この上位探索なし近傍探索マージ処理の詳細については、後述する。 If the upper level search (upperLevel) of the merge setting information (config) is “0” in step S405, the merge unit 107 executes a neighbor search merge process without an upper search (step S407). The details of the neighborhood search merging process without upper search will be described later.

一方、ステップＳ４０５において、マージ設定情報（config）の上位レベル探索（upperLevel）が“０”ではない場合、マージ部１０７は、上位探索ありの近傍探索マージ処理を実行する（ステップS４０９）。この上位探索あり近傍探索マージ処理の詳細については、後述する。 On the other hand, if the upper level search (upperLevel) of the merge setting information (config) is not “0” in step S405, the merge unit 107 executes a neighbor search merge process with an upper search (step S409). The details of the neighborhood search merging process with upper search will be described later.

以上のように、マージ部１０７は、フルマッチマージ処理、上位探索なし近傍探索マージ処理、または上位探索あり近傍探索マージ処理のいずれかのマージ処理を実行して、探索順マージ処理を終了する。 As described above, the merge unit 107 executes any one of the full match merge process, the neighborhood search merge process without upper search, or the neighbor search merge process with upper search, and ends the search order merge process.
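The branching of FIG. 15 amounts to a small dispatch on two fields of the merge setting information. The following sketch only returns a label for the process that would be run. The function name and the labels are hypothetical, and upperLevel is assumed to be held as an integer.

```python
def choose_merge_process(search_type: str, upper_level: int) -> str:
    """Return which merge process FIG. 15 would execute (labels are illustrative)."""
    if search_type == "Full Match":        # S401
        return "full_match"                # S403
    if upper_level == 0:                   # S405
        return "neighbor_without_upper"    # S407
    return "neighbor_with_upper"           # S409
```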

(Details of full match merge processing)
FIG. 16 is a flowchart showing the full match merge process according to the first embodiment of the present invention. In the full match merge process, all combinations of the grids 1031 that contain a cluster 1021 are searched.

まず、ステップＳ５０１において、マージ部１０７は、クラスタ１０２１が含まれるグリッド１０３１のリストであるグリッドリスト（glist）の各要素について順に、以下のステップＳ５０３を繰り返す（マージするグリッドループ）。なお、このステップＳ５０１のループにおいて処理対象となっているグリッドリスト（glist）の要素を、グリッドリスト（glist）のインデックスｉの要素とする。 First, in step S501, the merging unit 107 sequentially repeats the following step S503 for each element of the grid list (glist) that is a list of the grid 1031 including the cluster 1021 (grid loop for merging). Note that the element of the grid list (glist) that is the processing target in the loop of step S501 is the element of the index i of the grid list (glist).

ステップＳ５０３において、マージ部は、グリッドリスト（glist）の、現在処理対象としているインデックスｉの要素の次の要素である要素から順に、以下のステップＳ５０５〜Ｓ５０９を繰り返す（マージされるグリッドループ）。なお、このステップＳ５０３のループにおいて処理対象となっているグリッドリスト（glist）の要素を、グリッドリスト（glist）のインデックスｊの要素とする。 In step S503, the merge unit repeats the following steps S505 to S509 in order starting from the element that is the element next to the element of the index i that is currently processed in the grid list (glist) (a grid loop to be merged). Note that the element of the grid list (glist) that is the processing target in the loop of step S503 is the element of the index j of the grid list (glist).

In the subsequent step S505, the merge unit 107 executes a process (distance) for calculating the distance d between the element at index i (the merging grid) and the element at index j (the grid to be merged) of the grid list (glist). Here, the distance between the element at index i and the element at index j may be, for example, the distance between the centers of the cluster 1021 contained in the grid 1031 at index i and the cluster 1021 contained in the grid 1031 at index j.

続くステップＳ５０７は、ステップＳ５０５において算出された距離ｄを、所定の閾値ｔｈと比較するステップである。ここで、閾値ｔｈは、例えば、表１を参照して説明されたマージ閾値でありうる。 The subsequent step S507 is a step of comparing the distance d calculated in step S505 with a predetermined threshold th. Here, the threshold th may be, for example, the merge threshold described with reference to Table 1.

If it is determined in step S507 that the distance d is equal to or less than the threshold th, the merge unit 107 merges the element at index i and the element at index j of the grid list (glist) (merge: step S509). Specifically, the merge unit 107 generates a new cluster 1021 by merging the cluster 1021 corresponding to the grid 1031 at index i with the cluster 1021 corresponding to the grid 1031 at index j, and associates this new cluster 1021 with the grids 1031 with which the clusters 1021 were associated before the merge. That is, after the merge, the cluster 1021 corresponding to the grid 1031 at index i and the cluster 1021 corresponding to the grid 1031 at index j are both this new cluster 1021.

On the other hand, if it is determined in step S507 that the distance d exceeds the threshold th, the merge unit 107 does not merge the element at index i and the element at index j of the grid list (glist); it increments the index j and refers to the next element of the grid list (glist) (step S503).

ステップＳ５０３のループによってグリッドリスト（glist）の要素が末尾まで参照された場合、マージ部１０７は、インデックスｉをインクリメントしてグリッドリスト（glist）の次の要素を参照する（ステップＳ５０１）。 When the element of the grid list (glist) is referred to the end by the loop of step S503, the merge unit 107 increments the index i and refers to the next element of the grid list (glist) (step S501).

ステップＳ５０１のループによってグリッドリスト（glist）の要素が末尾まで参照された場合、マージ部１０７は、フルマッチマージの処理を終了する。 When the elements of the grid list (glist) are referred to the end by the loop of step S501, the merge unit 107 ends the full match merge process.

The process in step S509 described above is one example of a specific process for the case where the merge unit 107 is implemented as software. In this case, the merge unit 107 refers to a single grid list (glist) stored in the storage unit 119 through both the index i and the index j, and updates the contents of the grid list (glist) as it goes. When the merge unit 107 is implemented by means other than software, or by software whose specification differs from the above example, details such as how the grids 1031 to be processed are referenced and when a cluster merge is reflected in the data may be designed as appropriate, as long as the substantive processing follows the above flowchart.
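One possible software realization of the loops of FIG. 16 is sketched below. It makes several assumptions that are not part of the description: each element of glist is represented directly by the cluster associated with that grid, the distance is a plain Euclidean distance between cluster centers, the merged center is a count-weighted mean, and a guard skips pairs that already share a cluster. The `Cluster` type and all function names are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Cluster:
    x: float   # center x (for example, longitude)
    y: float   # center y (for example, latitude)
    n: int     # number of data items (hypothetical field)


def distance(a: Cluster, b: Cluster) -> float:
    # Assumed metric: Euclidean distance between cluster centers.
    return math.hypot(a.x - b.x, a.y - b.y)


def merge(a: Cluster, b: Cluster) -> Cluster:
    # Assumed rule: the new center is the count-weighted mean of the two centers.
    n = a.n + b.n
    return Cluster((a.x * a.n + b.x * b.n) / n,
                   (a.y * a.n + b.y * b.n) / n, n)


def full_match_merge(glist: list, th: float) -> None:
    """glist[k] is the cluster associated with grid k; it is updated in place."""
    for i in range(len(glist)):                # S501: merging grid loop
        for j in range(i + 1, len(glist)):     # S503: merged grid loop
            ci, cj = glist[i], glist[j]
            if ci is cj:                       # added guard: already one cluster
                continue
            if distance(ci, cj) <= th:         # S505 and S507
                new = merge(ci, cj)            # S509
                # Associate the new cluster with every grid that held ci or cj.
                for k, c in enumerate(glist):
                    if c is ci or c is cj:
                        glist[k] = new
```

Because every pair (i, j) is visited, the number of distance calculations grows quadratically with the number of grids, which is consistent with reserving this process for small grid counts.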

(Details of neighborhood search merge processing)
Next, the neighborhood search merge process in this embodiment will be described. In the neighborhood search merge process, the merge sort unit 109 sorts the clusters 1021 in a certain direction on the ground surface based on the result of a first rank determination process that uses latitude and longitude. The adjacency determination unit 111 then determines whether the clusters sorted in that direction are adjacent to each other in that direction. The merge unit 107 calculates the distance between clusters that have been determined to be adjacent to each other in that direction. The merge unit 107, the merge sort unit 109, and the adjacency determination unit 111 may also merge clusters by the same process for other directions on the ground surface.

ここで、本実施形態において、地表面における方向は、例えば、水平方向、垂直方向、斜め右下方向、または斜め右上方向でありうる。マージ用ソート部１０９を含むマージ部１０７は、これらの方向のうちの１つについて、以下の近傍探索マージ処理を実行してもよい。この場合の近傍探索は、１方向探索とも呼ばれる。また、マージ用ソート部１０９を含むマージ部１０７は、上記の方向のうちの２つについて、以下の近傍探索マージ処理を実行してもよい。この場合の近傍探索は、２方向探索とも呼ばれる。また、マージ用ソート部１０９を含むマージ部１０７は、上記の４つの方向について、以下の近傍探索マージ処理を実行してもよい。この場合の近傍探索は、４方向探索とも呼ばれる。 Here, in the present embodiment, the direction on the ground surface can be, for example, the horizontal direction, the vertical direction, the diagonally lower right direction, or the diagonally upper right direction. The merge unit 107 including the merge sort unit 109 may perform the following neighborhood search merge process for one of these directions. The neighborhood search in this case is also called a one-way search. Further, the merging unit 107 including the merging sort unit 109 may execute the following neighborhood search merging process for two of the above directions. The neighborhood search in this case is also called a two-way search. Further, the merge unit 107 including the merge sort unit 109 may execute the following neighborhood search merge process in the above four directions. The neighborhood search in this case is also called a four-way search.

In this embodiment, as the first rank determination process, the merge sort unit 109 may sort the grids 1031 in a certain direction on the ground surface and give the resulting sort order, as a rank, to the cluster 1021 contained in each grid 1031. As described with reference to FIG. 1, the grid 1031 and the cluster 1021 it contains correspond one-to-one in this embodiment, so the sort order given to a grid can be used directly as the rank of its cluster. Because a grid 1031 is a region defined by known boundaries, sorting grids in a specific direction on the ground surface is easy. The configuration of this embodiment, which ranks the clusters 1021 based on the sort result of the grids 1031, can therefore speed up the sorting of the clusters 1021.

FIGS. 17A to 17D are diagrams showing the directions in which the grid search is executed in the neighborhood search merge process according to the first embodiment of the present invention. In the neighborhood search merge process, the grid list (glist), which is the list of grids 1031 containing a cluster 1021, is sorted in a specific direction, and grids 1031 that are adjacent to each other in the sorted grid list (glist) are subjected to the merge process. FIGS. 17A to 17D show the index given to each grid when the sort order of the grid list (glist) is defined with respect to each direction. The arrangement of the grids themselves is the same in FIGS. 17A to 17D.

図１７Ａは、グリッドリスト（glist）の探索が水平方向に実行される場合を示す図である。地表面１００１での水平方向は、東西方向とも呼ばれる。なお、以下の図では、経度方向がｘ軸方向、緯度方向がｙ軸方向として図示されている。ここでの水平方向とは、ｘ軸方向のことである。 FIG. 17A is a diagram illustrating a case where a search for a grid list (glist) is performed in the horizontal direction. The horizontal direction on the ground surface 1001 is also called the east-west direction. In the following drawings, the longitude direction is illustrated as the x-axis direction, and the latitude direction is illustrated as the y-axis direction. Here, the horizontal direction is the x-axis direction.

In this case, the grid list (glist) is sorted in the horizontal direction. The sort order of the grid list (glist) is determined, for example, by defining the ordering of two grids with coordinates (x1, y1) and (x2, y2) as follows.

if (y2 ≠ y1)
    order by y1, y2 (the smaller comes first)
else
    order by x1, x2 (the smaller comes first)

FIG. 17B is a diagram showing the case where the search of the grid list (glist) is executed in the vertical direction. The vertical direction on the ground surface 1001 is also called the north-south direction. In this case, the grid list (glist) is sorted in the vertical direction. The sort order of the grid list (glist) is determined, for example, by defining the ordering of two grids with coordinates (x1, y1) and (x2, y2) as follows.

if (x2 ≠ x1)
    order by x1, x2 (the smaller comes first)
else
    order by y1, y2 (the smaller comes first)

FIG. 17C is a diagram showing the case where the search of the grid list (glist) is executed in the diagonally lower right direction. The diagonally lower right direction on the ground surface 1001 is also called the northwest-southeast direction. In this case, the grid list (glist) is sorted in the diagonally lower right direction. The sort order of the grid list (glist) is determined, for example, by defining the ordering of two grids with coordinates (x1, y1) and (x2, y2) as follows.

sum1 = x1 + y1
sum2 = x2 + y2
if (sum1 ≠ sum2)
    order by sum1, sum2 (the smaller comes first)
else
    order by y1, y2 (the larger comes first)

FIG. 17D is a diagram showing the case where the search of the grid list (glist) is executed in the diagonally upper right direction. The diagonally upper right direction on the ground surface 1001 is also called the southwest-northeast direction. In this case, the grid list (glist) is sorted in the diagonally upper right direction. The sort order of the grid list (glist) is determined, for example, by defining the ordering of two grids with coordinates (x1, y1) and (x2, y2) as follows.

Let y1' and y2' be the distances from the maximum y coordinate to y1 and y2, respectively.
sum1 = x1 + y1'
sum2 = x2 + y2'
if (sum1 ≠ sum2)
    order by sum1, sum2 (the smaller comes first)
else
    order by y1, y2 (the smaller comes first)
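The four orderings described for FIGS. 17A to 17D can each be written as a sort key on the grid coordinates. The sketch below assumes that a grid is represented by an integer pair (x, y) and that the maximum y coordinate is supplied by the caller. The function name and the direction labels are hypothetical.

```python
def sort_key(direction: str, max_y: int):
    """Return a key function over (x, y) grid coordinates for one search direction."""
    keys = {
        # FIG. 17A: compare y first, then x (smaller first).
        "horizontal": lambda g: (g[1], g[0]),
        # FIG. 17B: compare x first, then y (smaller first).
        "vertical": lambda g: (g[0], g[1]),
        # FIG. 17C: compare x + y, then y with the larger first.
        "lower_right": lambda g: (g[0] + g[1], -g[1]),
        # FIG. 17D: compare x + y' where y' = max_y - y, then y (smaller first).
        "upper_right": lambda g: (g[0] + (max_y - g[1]), g[1]),
    }
    return keys[direction]


# Usage: sort a 2 x 2 block of grids along the diagonally upper right direction.
grids = [(x, y) for x in range(2) for y in range(2)]
ordered = sorted(grids, key=sort_key("upper_right", max_y=1))
```

Expressing each ordering as a tuple key lets a general-purpose sort be reused for every direction, so only the key changes between the one-way, two-way, and four-way searches.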

図１８Ａ〜図１８Ｃは、本発明の第１の実施形態における近傍探索マージ処理での、１方向探索、２方向探索、および４方向探索について説明するための図である。本実施形態における近傍探索マージ処理では、図１７Ａ〜図１７Ｄを参照して説明したグリッド探索の方向を選択し、または組み合わせた、１方向探索、２方向探索、および４方向探索の３種類の探索手法が設定されている。 18A to 18C are diagrams for describing a one-way search, a two-way search, and a four-way search in the neighborhood search merge process according to the first embodiment of the present invention. In the neighborhood search merging process according to the present embodiment, three types of searches, that is, a one-way search, a two-way search, and a four-way search, in which the grid search directions described with reference to FIGS. 17A to 17D are selected or combined. The method is set.

図１８Ａは、１方向探索を示す図である。１方向探索では、グリッド探索の方向が１つだけ選択される。図示された例では、水平方向が選択されている。この場合、あるグリッドに対して水平方向で隣接する２つのグリッドが、マージ処理の対象になる。探索の方向が１つであるため、マージ処理におけるグリッドリスト（glist）のソート回数も１回でよい。なお、選択されるグリッド探索の方向は、水平方向には限られず、垂直方向、斜め右下方向、または斜め右上方向のいずれかであってもよい。このような１方向探索の場合、ソート回数は１回であり、マージ処理での最大距離計算回数は、クラスタの数をＮとすると、約Ｎ（回）になる。 FIG. 18A is a diagram illustrating a one-way search. In the one-way search, only one grid search direction is selected. In the illustrated example, the horizontal direction is selected. In this case, two grids that are adjacent to a certain grid in the horizontal direction are to be merged. Since the search direction is one, the grid list (glist) may be sorted once in the merge process. Note that the grid search direction to be selected is not limited to the horizontal direction, and may be any one of the vertical direction, the diagonally lower right direction, and the diagonally upper right direction. In such a one-way search, the number of sorts is one, and the maximum number of distance calculations in the merge process is approximately N (times), where N is the number of clusters.

図１８Ｂは、２方向探索を示す図である。２方向探索では、グリッド探索の方向が２つ組み合わされる。図示された例では、水平方向と垂直方向とが組み合わされている。この場合、あるグリッドに対して水平方向、または垂直方向で隣接する４つのグリッドが、マージ処理の対象になる。探索の方向が２つであるため、マージ処理におけるグリッドリスト（glist）のソート回数は２回になる。なお、組み合わされるグリッド探索の方向は、水平方向と垂直方向とには限られず、これらの方向に代えて、斜め右下方向または斜め右上方向のいずれか、または両方が組み合わされてもよい。このような２方向探索の場合、ソート回数は２回であり、マージ処理での最大距離計算回数は、クラスタの数をＮとすると、約２Ｎ（回）になる。 FIG. 18B is a diagram illustrating a two-way search. In the two-way search, two grid search directions are combined. In the illustrated example, the horizontal direction and the vertical direction are combined. In this case, four grids adjacent to a certain grid in the horizontal direction or the vertical direction are subjected to the merge process. Since there are two search directions, the grid list (glist) is sorted twice in the merge process. Note that the grid search directions to be combined are not limited to the horizontal direction and the vertical direction, and instead of these directions, either the diagonally lower right direction or the diagonally upper right direction, or both may be combined. In such a two-way search, the number of times of sorting is 2, and the maximum number of distance calculations in the merge process is about 2N (times), where N is the number of clusters.

図１８Ｃは、４方向探索を示す図である。４方向探索では、グリッド探索の方向が４つ組み合わされる。本実施形態では、水平方向、垂直方向、斜め右下方向、および斜め右上方向の４つが組み合わされる。この場合、あるグリッドに対して水平方向、垂直方向、または斜め方向で近傍に位置するグリッドが、マージ処理の対象になる。ここで、斜め方向で隣にあるグリッドまでの距離が水平方向または垂直方向で隣接するグリッドよりも長くなることを考慮して、水平方向および垂直方向については、図示されているように２つ隣にあるグリッドまでを「隣接するグリッド」としてマージ処理の対象にしてもよい。この場合、マージ処理の対象になるのは、水平方向、垂直方向、および斜め方向の近傍にある１２のグリッドである。このような４方向探索の場合、ソート回数は４回であり、マージ処理での最大距離計算回数は、クラスタの数をＮとすると、約４Ｎ（回）になる。 FIG. 18C is a diagram illustrating a four-way search. In the four-direction search, four grid search directions are combined. In the present embodiment, the horizontal direction, the vertical direction, the diagonally lower right direction, and the diagonally upper right direction are combined. In this case, a grid located in the vicinity in a horizontal direction, a vertical direction, or an oblique direction with respect to a certain grid is a target of the merge process. Here, considering that the distance to the adjacent grid in the diagonal direction is longer than the grid adjacent in the horizontal direction or the vertical direction, the horizontal direction and the vertical direction are adjacent to each other as shown in the figure. The grids up to the grid may be subject to merge processing as “adjacent grids”. In this case, 12 grids in the vicinity of the horizontal direction, the vertical direction, and the diagonal direction are targeted for the merge process. In such a four-way search, the number of sorts is four, and the maximum number of distance calculations in the merge process is about 4N (times), where N is the number of clusters.

Note that the maximum number of distance calculations for the one-way search, the two-way search, and the four-way search is O(N) in every case. Since even a fast sort is O(N log N), the sorting may come to dominate the computational cost when the number of clusters becomes very large.

図１９は、本発明の第１の実施形態における近傍探索マージ処理（上位探索なし）を示すフローチャートである。なお、近傍探索マージ処理（上位探索あり）については、上位探索の詳細とともに後述する。 FIG. 19 is a flowchart showing neighborhood search merge processing (no upper search) in the first embodiment of the present invention. The neighborhood search merge process (with upper search) will be described later together with details of the upper search.

まず、マージ部１０７は、方向（dir）を“水平”として、隣接探索処理（上位探索なし）を実行する（ステップＳ６０１）。隣接探索処理（上位探索なし）については後述する。 First, the merging unit 107 sets the direction (dir) to “horizontal” and executes an adjacent search process (no upper search) (step S601). The neighbor search process (without upper search) will be described later.

Subsequently, the merge unit 107 determines whether the search method (searchType) of the merge setting information (config) is “2 Dir” or “4 Dir” (step S603). If the search method (searchType) is neither “2 Dir” nor “4 Dir”, the merge unit 107 determines that a one-way search is specified and ends the process.

ここで、ステップＳ６０３において、探索手法（searchType）が“2 Dir”または“4 Dir”であった場合、マージ部１０７は、方向（dir）を“垂直”として、隣接探索処理（上位探索なし）を実行する（ステップＳ６０５）。 If the search method (searchType) is “2 Dir” or “4 Dir” in step S603, the merge unit 107 sets the direction (dir) to “vertical” and performs an adjacent search process (no upper search). Is executed (step S605).

続いて、マージ部１０７は、マージ設定情報（config）の探索手法（searchType）が“4 Dir”であるか否かを判定する（ステップＳ６０７）。ここで、マージ設定情報（config）の探索手法（searchType）が“4 Dir”ではない場合、マージ部１０７は、２方向探索が指定されているものと判断し、処理を終了する。 Subsequently, the merge unit 107 determines whether the search method (searchType) of the merge setting information (config) is “4 Dir” (step S607). Here, when the search method (searchType) of the merge setting information (config) is not “4 Dir”, the merge unit 107 determines that a two-way search is designated, and ends the process.

If the search method (searchType) is “4 Dir” in step S607, the merge unit 107 determines that a four-way search is specified. In this case, the merge unit 107 executes the adjacent search process (no upper search) with the direction (dir) set to “diagonally lower right” (step S609), then executes the adjacent search process (no upper search) with the direction (dir) set to “diagonally upper right” (step S611), and ends the process.
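The flow of FIG. 19 determines, from searchType alone, the sequence of directions for which the adjacent search process is run. A minimal sketch follows, in which the function name and the direction labels are hypothetical.

```python
def directions_for(search_type: str) -> list:
    """List the directions searched, in order, for a given searchType."""
    dirs = ["horizontal"]                        # S601: always executed
    if search_type in ("2 Dir", "4 Dir"):        # S603
        dirs.append("vertical")                  # S605
    if search_type == "4 Dir":                   # S607
        dirs += ["lower_right", "upper_right"]   # S609, S611
    return dirs
```

The length of the returned list equals the number of sorts performed, that is, one, two, or four.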

図２０は、本発明の第１の実施形態における隣接探索処理（上位探索なし）を示すフローチャートである。なお、隣接探索処理（上位探索あり）については、上位探索の詳細とともに後述する。 FIG. 20 is a flowchart showing the adjacent search process (no upper search) in the first embodiment of the present invention. The adjacent search process (with upper search) will be described later together with details of the upper search.

まず、隣接判定部１１１が、マージ設定情報（config）の探索手法（searchType）が“4 Dir”であり、かつ、方向（dir）が“水平”または“垂直”であるか否かを判定する（ステップＳ７０１）。 First, the adjacency determination unit 111 determines whether the search method (searchType) of the merge setting information (config) is “4 Dir” and the direction (dir) is “horizontal” or “vertical”. (Step S701).

ここで、ステップＳ７０１において、マージ設定情報（config）の探索手法（searchType）が“4 Dir”であり、かつ、方向（dir）が“水平”または“垂直”である場合、隣接判定部１１１は、隣接判定閾値ｔｈ＿ｎとして２を設定する（ステップＳ７０３）。本実施形態において、隣接判定閾値ｔｈ＿ｎは、グリッド間隔を単位として設定されうる。隣接判定閾値ｔｈ＿ｎは、後述するグリッド間の隣接判定に用いられる閾値であり、マージ判定に用いられる所定の閾値ｔｈとは別個に設定されうる。 Here, in step S701, when the search method (searchType) of the merge setting information (config) is “4 Dir” and the direction (dir) is “horizontal” or “vertical”, the adjacency determining unit 111 Then, 2 is set as the adjacency determination threshold th_n (step S703). In the present embodiment, the adjacency determination threshold th_n can be set in units of grid intervals. The adjacency determination threshold th_n is a threshold used for adjacency determination between grids to be described later, and can be set separately from a predetermined threshold th used for merge determination.

On the other hand, if in step S701 the search method (searchType) of the merge setting information (config) is not “4 Dir”, that is, it is “1 Dir” or “2 Dir”, or if the direction (dir) is neither “horizontal” nor “vertical”, the adjacency determination unit 111 sets the adjacency determination threshold th_n to 1 (step S705). In this embodiment, the adjacent search process is executed only when the search method (searchType) of the merge setting information (config) is “1 Dir”, “2 Dir”, or “4 Dir”.

The adjacency determination threshold th_n is set according to the search method (searchType) of the merge setting information (config) for the following reason. As described with reference to FIGS. 18A to 18C, in the one-way search and the two-way search, the grids adjacent to a given grid in the horizontal or vertical direction (at a distance of one grid) are subject to the merge process, whereas in the four-way search, grids up to two positions away in the horizontal and vertical directions (at a distance of two grids) can be treated as “adjacent grids” and be subject to the merge process.
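Steps S701 to S705 reduce to a single condition. The following sketch uses a hypothetical function name and direction labels; the threshold is expressed in grid intervals, as in the description.

```python
def adjacency_threshold(search_type: str, direction: str) -> int:
    """Return th_n, in units of the grid interval, for one adjacent search."""
    if search_type == "4 Dir" and direction in ("horizontal", "vertical"):  # S701
        return 2   # S703: grids up to two positions away count as adjacent
    return 1       # S705: only the immediately neighboring grid counts
```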

Subsequently, the merge sort unit 109 temporarily sorts (tmpSort) the grid list (glist) in the direction (dir) (step S707). The direction (dir) is specified as a parameter when the adjacent search process (no upper search) is executed, for example in steps S601, S605, S609, and S611 of FIG. 19. As described with reference to FIGS. 17A to 17D, the temporary sort (tmpSort) may be a process of temporarily giving an index to each grid of the grid list (glist). Specifically, each grid of the grid list (glist) is temporarily given an index as described with reference to FIG. 17A when the direction (dir) is “horizontal”, FIG. 17B when it is “vertical”, FIG. 17C when it is “diagonally lower right”, and FIG. 17D when it is “diagonally upper right”. The loop process of step S709 described later is executed according to these temporarily given indexes. Afterwards, the indexes of the grid list (glist) are restored to the original indexes, which have been saved.

In the subsequent step S709, the merge unit 107 and the adjacency determination unit 111 repeat the following steps S711 to S717 in order from the first element of the grid list (glist), according to the indexes set in step S707 (loop over the grids to be merged). The element of the grid list (glist) being processed in the loop of step S709 is referred to as the element at index i of the grid list (glist).

In step S711, the adjacency determination unit 111 determines whether the element at index i of the grid list (glist) and the next element (the element at index i+1) are adjacent. In the present embodiment, when the direction (dir) is “horizontal”, for example, adjacency between grids can be determined by whether the vertical positions of the two grids 1031 (the y coordinates in the example of FIG. 17A) are the same and the difference between their horizontal positions (the x coordinates in the example of FIG. 17A) is equal to or smaller than the adjacency determination threshold th_n. The adjacency determination of step S711 therefore has a small processing load compared with processing that involves operations such as calculating the distance between arbitrary coordinates.
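The adjacency test of step S711 for the “horizontal” direction can be sketched as follows. This is a minimal illustration, not the apparatus's actual implementation: the `Grid` type and the function name are hypothetical, and grid positions are assumed to be integer grid coordinates.

```python
from typing import NamedTuple


class Grid(NamedTuple):
    """Hypothetical grid record; x and y are integer grid coordinates (assumption)."""
    x: int  # horizontal position
    y: int  # vertical position


def is_adjacent_horizontal(g1: Grid, g2: Grid, th_n: int) -> bool:
    """Adjacent if both grids are in the same row and their horizontal
    positions differ by at most th_n. Only integer comparisons are needed,
    so this is cheaper than a distance calculation between coordinates."""
    return g1.y == g2.y and abs(g1.x - g2.x) <= th_n
```

With th_n = 1 only directly neighboring grids in a row pass the test; with th_n = 2 a grid two positions away also passes.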

If it is determined in step S711 that the element at index i of the grid list (glist) and the next element (the element at index i+1) are adjacent, the merge unit 107 executes a process (distance) of calculating the distance d between the element at index i and the element at index i+1 of the grid list (glist) (step S713). Here, the distance d between the element at index i and the element at index i+1 can be, for example, the center distance between the cluster 1021 included in the grid 1031 that is the element at index i and the cluster 1021 included in the grid 1031 that is the element at index i+1.

The subsequent step S715 is a step of comparing the distance d calculated in step S713 with a predetermined threshold th. Here, the predetermined threshold th can be, for example, the merge threshold described with reference to Table 1.

If it is determined in step S715 that the distance d is equal to or smaller than the threshold th, the merge unit 107 merges (merge) the element at index i and the element at index i+1 of the grid list (glist) (step S717). Here, the merge unit 107 generates a new cluster 1021 by merging the cluster 1021 corresponding to the grid 1031 that is the element at index i of the grid list (glist) with the cluster 1021 corresponding to the grid 1031 that is the element at index i+1, and associates this new cluster 1021 with the grids 1031 with which the clusters 1021 were associated before the merge. That is, after this merge, the cluster 1021 corresponding to the grid 1031 at index i and the cluster 1021 corresponding to the grid 1031 at index i+1 are both this new cluster 1021.

On the other hand, if it is determined in step S711 that the element at index i of the grid list (glist) and the next element (the element at index i+1) are not adjacent, the merge unit 107 skips steps S713 to S717, increments the index i, and refers to the next element of the grid list (glist) (step S709). In this way, the adjacency determination between grids (step S711), which has a relatively small processing load, decides whether the calculation of the distance d (step S713), which has a relatively large processing load, is executed at all, and this reduces the load of the processing as a whole.
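The loop of steps S709 to S717 for the “horizontal” direction can be sketched as follows. This is a hedged illustration under several assumptions that are not taken from the text: the `Cluster` and `Grid` classes are hypothetical, a cluster's center is taken to be the mean of its points, the temporary sort orders grids row by row, and a guard skips grids that already share one cluster.

```python
import math
from dataclasses import dataclass


@dataclass
class Cluster:
    points: list  # (x, y) positions; a hypothetical cluster representation

    def center(self):
        n = len(self.points)
        return (sum(p[0] for p in self.points) / n,
                sum(p[1] for p in self.points) / n)


@dataclass
class Grid:
    gx: int           # horizontal grid coordinate
    gy: int           # vertical grid coordinate
    cluster: Cluster  # cluster currently associated with this grid


def adjacent_search_horizontal(glist, th, th_n=1):
    """Returns the number of distance calculations actually performed."""
    # Temporary sort (S707): assumed row-major order for the horizontal direction.
    order = sorted(glist, key=lambda g: (g.gy, g.gx))
    distance_calls = 0
    for cur, nxt in zip(order, order[1:]):            # S709
        # S711: cheap adjacency test on grid coordinates.
        if cur.gy != nxt.gy or abs(cur.gx - nxt.gx) > th_n:
            continue
        a, b = cur.cluster, nxt.cluster
        if a is b:  # guard added in this sketch: already the same cluster
            continue
        distance_calls += 1                            # S713
        if math.dist(a.center(), b.center()) <= th:    # S715
            merged = Cluster(a.points + b.points)      # S717
            # Re-associate every grid that pointed at either old cluster.
            for g in glist:
                if g.cluster is a or g.cluster is b:
                    g.cluster = merged
    return distance_calls
```

Because the distance is computed only for consecutive elements of the sorted list that pass the adjacency test, at most N - 1 distance calculations occur per direction.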

In addition, because the adjacent search process described above calculates inter-cluster distances only between grids that are adjacent in the sorted grid list (glist), the maximum number of distance calculations for a given direction (dir) is N - 1, where N is the number of clusters. Note, however, that the cost of the sort processing, which is O(N log N) as described above, is added to this.

(Details of upper search processing)
FIG. 21 is a flowchart showing the neighborhood search merge process (with upper search) in the first embodiment of the present invention.

First, the merge unit 107 generates an upper grid list (ulist) (step S801). The upper grid list (ulist) generated here is described with reference to FIGS. 22 and 23.

FIGS. 22 and 23 are diagrams for explaining the upper grid list generated in the upper search in the first embodiment of the present invention. Here, the case where the upper search is executed in the horizontal direction is described as an example. The upper search can be executed in the same manner in the vertical direction, the diagonally lower right direction, and the diagonally upper right direction.

As shown in FIG. 22, in this example, the grids numbered 0 to 11 each belong to one of the six upper grids I to VI. For grids and upper grids arranged in this way, as shown in FIG. 23, the upper grid list (ulist) is the list obtained by extracting, from the grid list (glist) in which grids 0 to 11 are arranged, the first grid among the grids belonging to each of the upper grids I to VI (the grids belonging to each upper grid are sorted in the horizontal direction). As described later, the upper grid list (ulist) can be sorted, for example, in the horizontal direction.
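The construction of the upper grid list can be sketched as follows. This is an illustrative sketch only: the function name and the tuple representation of grids are hypothetical, and the mapping from a grid to its upper grid assumes integer grid coordinates aligned so that each upper grid spans two grids per axis (the embodiment states that the upper grid interval is twice the grid interval).

```python
from collections import defaultdict


def build_upper_grid_list(glist):
    """glist: list of (gx, gy) grid coordinates.
    Returns (ulist, groups): ulist holds the first grid of each upper grid,
    groups maps each upper grid key to its member grids."""
    groups = defaultdict(list)
    # Assumed horizontal ordering: row by row, left to right.
    for gx, gy in sorted(glist, key=lambda g: (g[1], g[0])):
        groups[(gx // 2, gy // 2)].append((gx, gy))
    # The head of each group stands in for its upper grid.
    ulist = [members[0] for members in groups.values()]
    return ulist, dict(groups)
```

Only occupied grids appear in glist, so an upper grid may hold fewer than four grids, as in the example of FIG. 22 where twelve grids are spread over six upper grids.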

Returning to FIG. 21, the merge unit 107 next executes merge processing within each upper grid (step S803). The merge processing within an upper grid executed here is described with reference to FIG. 24.

FIG. 24 shows an example of merge processing within an upper grid, for upper grid I in the example of FIG. 22. As shown in FIG. 24, the round-robin combinations of the grids belonging to upper grid I, one pair in this example, are targeted. The maximum number of distance calculations in such merge processing within upper grids is approximately N/4 × 6 = 1.5N, where N is the number of clusters; this figure corresponds to every upper grid containing four grids and hence six pairs.
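The round-robin comparison within one upper grid can be sketched as follows. This is a minimal sketch under assumptions: the function name and the dictionary of cluster centers are hypothetical, and the sketch only reports which pairs fall within the merge threshold rather than performing the merge.

```python
import math
from itertools import combinations


def merge_candidates_within(centers, th):
    """centers: {grid id: (x, y) cluster center} for the grids of ONE upper grid.
    Returns every pair of grids whose cluster centers are within th."""
    return [(a, b)
            for a, b in combinations(sorted(centers), 2)
            if math.dist(centers[a], centers[b]) <= th]
```

For an upper grid that holds four grids, `combinations` yields six pairs, which is where the factor of 6 in N/4 × 6 comes from; an upper grid with two grids yields a single pair.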

Returning to FIG. 21, the merge unit 107 then executes the adjacent search process (with upper search) with the direction (dir) set to “horizontal” (step S805). The adjacent search process (with upper search) is described later.

Subsequently, the merge unit 107 determines whether the search method (searchType) of the merge setting information (config) is “2 Dir” or “4 Dir” (step S807). If the search method (searchType) is neither “2 Dir” nor “4 Dir”, the merge unit 107 determines that the one-way search is designated and ends the process.

If the search method (searchType) is “2 Dir” or “4 Dir” in step S807, the merge unit 107 executes the adjacent search process (with upper search) with the direction (dir) set to “vertical” (step S809).

Subsequently, the merge unit 107 determines whether the search method (searchType) of the merge setting information (config) is “4 Dir” (step S811). If the search method (searchType) is not “4 Dir”, the merge unit 107 determines that the two-way search is designated and ends the process.

If the search method (searchType) is “4 Dir” in step S811, the merge unit 107 determines that the four-way search is designated. In this case, the merge unit 107 executes the adjacent search process (with upper search) with the direction (dir) set to “diagonally lower right” (step S813), then executes the adjacent search process (with upper search) with the direction (dir) set to “diagonally upper right” (step S815), and ends the process.
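The branching of steps S805 to S815 amounts to choosing which directions are searched for each search method. A minimal sketch, in which the function name is hypothetical and the direction labels are the English renderings used in this description:

```python
def directions_for(search_type):
    """Returns the directions searched, in order, for a given searchType."""
    dirs = ["horizontal"]                       # S805: always executed
    if search_type in ("2 Dir", "4 Dir"):       # S807
        dirs.append("vertical")                 # S809
    if search_type == "4 Dir":                  # S811
        dirs.append("diagonally lower right")   # S813
        dirs.append("diagonally upper right")   # S815
    return dirs
```

Any search method other than “2 Dir” or “4 Dir” falls through to the one-way case, mirroring the flowchart.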

FIG. 25 is a flowchart showing the adjacent search process (with upper search) in the first embodiment of the present invention.

First, the adjacency determination unit 111 determines whether the search method (searchType) of the merge setting information (config) is “4 Dir” and the direction (dir) is “horizontal” or “vertical” (step S901).

If, in step S901, the search method (searchType) of the merge setting information (config) is “4 Dir” and the direction (dir) is “horizontal” or “vertical”, the adjacency determination unit 111 sets the adjacency determination threshold th_n to 2 (step S903). In the present embodiment, the adjacency determination threshold th_n can be set in units of the upper grid interval. Note that the upper grid interval is twice the grid interval. The adjacency determination threshold th_n is the threshold used for the adjacency determination between upper grids described later, and can be set separately from the predetermined threshold th used for the merge determination.

On the other hand, if in step S901 the search method (searchType) of the merge setting information (config) is not “4 Dir”, that is, it is “1 Dir” or “2 Dir”, or if the direction (dir) is neither “horizontal” nor “vertical”, the adjacency determination unit 111 sets the adjacency determination threshold th_n to 1 (step S905). In the present embodiment, the adjacent search process is executed when the search method (searchType) of the merge setting information (config) is “1 Dir”, “2 Dir”, or “4 Dir”.
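The threshold selection of steps S901 to S905 can be written compactly. The function name below is hypothetical; the values 2 and 1 are the ones given in the text.

```python
def adjacency_threshold(search_type, direction):
    """Returns th_n: 2 for horizontal/vertical passes of the four-way
    search (S903), otherwise 1 (S905)."""
    if search_type == "4 Dir" and direction in ("horizontal", "vertical"):
        return 2
    return 1
```

Note that the diagonal passes of the four-way search use th_n = 1, as do all passes of the one-way and two-way searches.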

Subsequently, as described with reference to FIG. 23, the merge sorting unit 109 sorts (sort) the upper grid list (ulist) by the direction (dir) (step S907). Here, the direction (dir) is specified as a parameter when the adjacent search process (with upper search) is executed in, for example, step S805, step S809, step S813, and step S815 of FIG. 21. The sort (sort) can be a process in which the index assignment described with reference to FIGS. 17A to 17D for the grids of the grid list (glist) is applied to the first grid of each upper grid included in the upper grid list (ulist).

In the subsequent step S909, the merge unit 107 and the adjacency determination unit 111 repeat the following steps S913 to S923 in order from the first element of the upper grid list (ulist), according to the result of the sort in step S907 (upper grid list loop). The element of the upper grid list (ulist) being processed in the loop of step S909 is referred to as the element at index i of the upper grid list (ulist).

In step S911, the adjacency determination unit 111 determines whether the upper grid corresponding to the element at index i of the upper grid list (ulist) and the upper grid corresponding to the next element (the element at index i+1) are adjacent. In the present embodiment, when the direction (dir) is “horizontal”, for example, adjacency between upper grids can be determined by whether the vertical positions of the respective grids 1031 (the y coordinates in the example of FIG. 17A) are the same and the difference between their horizontal positions (the x coordinates in the example of FIG. 17A) is equal to or smaller than the adjacency determination threshold th_n. The adjacency determination between upper grids in step S911 therefore has a small processing load compared with processing that involves operations such as calculating the distance between arbitrary coordinates.

If it is determined in step S911 that the upper grid corresponding to the element at index i of the upper grid list (ulist) and the upper grid corresponding to the next element (the element at index i+1) are adjacent, then in step S913 the merge unit 107 repeats the following steps S915 to S921 for each of the grids (subgrids) belonging to the upper grid corresponding to the element at index i of the upper grid list (ulist) (the first upper grid) (subgrid loop of the first upper grid). The grid (subgrid) being processed in the loop of step S913 is referred to as a.

Further, in step S915, the merge unit 107 repeats the following steps S917 to S921 for each of the grids (subgrids) belonging to the upper grid corresponding to the element at index i+1 of the upper grid list (ulist) (the second upper grid) (subgrid loop of the second upper grid). The grid (subgrid) being processed in the loop of step S915 is referred to as b.

In step S917, the merge unit 107 executes a process (distance) of calculating the distance d between grid a and grid b. Here, the distance d between grid a and grid b can be, for example, the center distance between the cluster included in grid a and the cluster included in grid b.

The subsequent step S919 is a step of comparing the distance d calculated in step S917 with a predetermined threshold th. Here, the predetermined threshold th can be, for example, the merge threshold described with reference to Table 1.

If it is determined in step S919 that the distance d is equal to or smaller than the threshold th, the merge unit 107 merges (merge) the cluster included in grid a and the cluster included in grid b (step S921). Here, the merge unit 107 generates a new cluster by merging the cluster corresponding to grid a with the cluster corresponding to grid b, and associates this new cluster with both grid a and grid b. That is, after this merge, the cluster corresponding to grid a and the cluster corresponding to grid b are both this new cluster.

On the other hand, if it is determined in step S911 that the upper grid corresponding to the element at index i of the upper grid list (ulist) and the upper grid corresponding to the next element (the element at index i+1) are not adjacent, the merge unit 107 skips steps S913 to S921, increments the index i, and refers to the next element of the upper grid list (ulist) (step S909). In this way, the adjacency determination between upper grids (step S911), which has a relatively small processing load, decides whether the calculation of the distance d (step S917), which has a relatively large processing load, is executed at all, and this reduces the load of the processing as a whole.
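The nested loops of steps S909 to S921 for the “horizontal” direction can be sketched as follows. This is a simplified illustration, not the embodiment's implementation: the function name and data layout are hypothetical, upper grids are keyed by integer coordinates in units of the upper grid interval, and the sketch records the pairs that satisfy the merge condition instead of merging clusters and updating their centers.

```python
import math


def upper_adjacent_search_horizontal(upper, centers, th, th_n=1):
    """upper:   {(ux, uy): [grid ids belonging to that upper grid]}
    centers: {grid id: (x, y) center of the cluster in that grid}
    Returns the (a, b) grid pairs whose clusters would be merged."""
    # S907: assumed row-major order of upper grids for the horizontal direction.
    ulist = sorted(upper, key=lambda u: (u[1], u[0]))
    to_merge = []
    for cur, nxt in zip(ulist, ulist[1:]):                    # S909
        # S911: cheap adjacency test between upper grids.
        if cur[1] != nxt[1] or abs(cur[0] - nxt[0]) > th_n:
            continue
        for a in upper[cur]:                                  # S913
            for b in upper[nxt]:                              # S915
                d = math.dist(centers[a], centers[b])         # S917
                if d <= th:                                   # S919
                    to_merge.append((a, b))                   # S921 (recorded only)
    return to_merge
```

Distance calculations happen only inside the two inner loops, so upper grids that fail the adjacency test contribute no distance calculations at all.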

The processing of steps S913 to S921 described above is now explained further with reference to FIG. 26. FIG. 26 shows an example of the processing of steps S913 to S921 when the direction (dir) is “horizontal” and upper grids I and III in the example of FIG. 22 are the first upper grid and the second upper grid, respectively.

In the illustrated example, the grids targeted by the loop of step S913 (grid a) are the grids belonging to upper grid I, and the grids targeted by the loop of step S915 (grid b) are the grids belonging to upper grid III. Accordingly, as shown in FIG. 26, the grid combinations subject to the merge processing between grid a and grid b in steps S917 to S921 are the round-robin combinations of each grid belonging to upper grid I with each grid belonging to upper grid III, six pairs in total. The maximum number of distance calculations in such merge processing between upper grids is approximately N/4 × 4^2 = 4N, where N is the number of clusters; this figure corresponds to both upper grids containing four grids each, giving 4^2 = 16 pairs.

FIG. 27 is a diagram showing, for the case where the neighborhood search merge process (with upper search) is executed in the first embodiment of the present invention, the grids targeted by merge processing (shown with light hatching) with respect to a given grid (shown with dark hatching). FIG. 27 shows an example in which the search method (searchType) of the merge setting information (config) is “4 Dir” and the upper level search (upperLevel) is 1.

In this case, the 12 upper grids (drawn with bold lines) in the neighborhood along the four directions are targeted by merge processing in the upper search. In addition, the three other grids in the upper grid to which the given grid belongs are targeted by merge processing within the upper grid. Accordingly, the neighborhood search merge process including the upper search targets a total of 51 grids (12 × 4 + 3). Because this is a four-way search, the number of sorts is four, and the maximum number of distance calculations is (4 × 4N) + 1.5N = 17.5N, where N is the number of clusters.
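The counts quoted above can be reproduced with a short calculation. This is only an arithmetic check; the function names are hypothetical, and the assumption that each upper grid holds four grids follows from the upper grid interval being twice the grid interval.

```python
def max_distance_calcs(n_clusters, n_directions=4, grids_per_upper=4):
    """Upper bound on distance calculations with upper search enabled."""
    n_upper = n_clusters / grids_per_upper
    # Between upper grids: every grid of one against every grid of the next,
    # once per search direction (4 x 4N for the four-way search).
    between = n_directions * n_upper * grids_per_upper ** 2
    # Within an upper grid: all pairs of its grids (6 pairs for 4 grids).
    within = n_upper * (grids_per_upper * (grids_per_upper - 1) // 2)
    return between + within


def target_grid_count(neighbor_upper_grids=12, grids_per_upper=4):
    """Grids compared against one given grid: all grids of the neighboring
    upper grids plus the other grids of its own upper grid."""
    return neighbor_upper_grids * grids_per_upper + (grids_per_upper - 1)
```

For N = 100 clusters, the bound evaluates to 1600 + 150 = 1750 distance calculations, that is, 17.5N.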

The upper search process described above may be executed with the one-way search or the two-way search, but in view of the shape of the search range in the present embodiment, it is preferably executed with the four-way search. Whether or not the upper search process is executed does not change the number of sorts, but executing it increases the number of distance calculations.

(Details of distance order merge processing)
The distance-order merge process includes a process of searching for clusters 1021 whose mutual distance is equal to or smaller than a predetermined threshold and storing such clusters 1021 as merge candidate clusters, and a process of merging the stored merge candidate clusters in ascending order of their mutual distance.

FIG. 28 is a diagram for explaining an overview of distance-order sorting in the first embodiment of the present invention. FIG. 28 shows clusters 1021s to 1021u, each included in one of three grids 1031 that are adjacent in the horizontal direction (taken as the search direction in this example). Here, the center distance between cluster 1021s and cluster 1021t is d1, and the center distance between cluster 1021t and cluster 1021u is d2. Both d1 and d2 are below the predetermined threshold th for merge processing, and d1 > d2.

When the adjacent search process described above is executed in the direction shown in the figure, the combination of cluster 1021s and cluster 1021t becomes the target of merge processing first. Because the center distance d1 between cluster 1021s and cluster 1021t is below the predetermined threshold th for merge processing, cluster 1021s and cluster 1021t are merged into cluster 1021v. Here, the center distance between cluster 1021v and cluster 1021u is d3. It is assumed that d3 is larger than d1 and d2 and exceeds the predetermined threshold th for merge processing.

After this, cluster 1021v and cluster 1021u become the target of merge processing. However, because the center distance d3 between cluster 1021v and cluster 1021u exceeds the predetermined threshold th for merge processing, cluster 1021v and cluster 1021u are not merged, and cluster 1021w, in which clusters 1021s to 1021u would be merged, is not generated.

Thus, when merge processing is executed in search order, the following problem can arise: although the center distance d2 between cluster 1021t and cluster 1021u is smaller than the center distance d1 between cluster 1021s and cluster 1021t, cluster 1021s and cluster 1021t are merged while cluster 1021u is left unmerged.
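The problem can be reproduced with a one-dimensional toy example. The sketch below is illustrative only: the function name and the numeric positions are made up, and the center of a merged cluster is assumed to be the mean of its members' positions.

```python
from statistics import mean


def merge_in_search_order(centers, th):
    """Walks the clusters left to right and merges each one into the
    current cluster whenever the center distance is within th."""
    result = [[centers[0]]]
    for c in centers[1:]:
        current = result[-1]
        if abs(mean(current) - c) <= th:
            result[-1] = current + [c]   # merged; the center shifts
        else:
            result.append([c])
    return result
```

With s = 0, t = 10, u = 19 and th = 12, the distances are d1 = 10 and d2 = 9, both within the threshold. Search-order merging first joins s and t into v with center 5, after which d3 = 14 exceeds the threshold, so u stays separate even though t and u were the closer pair.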

Distance-order merging is executed to solve this problem. In the present embodiment, as described above, the merge setting information (config) includes a distance-order merge flag (sortPair), and distance-order merging is executed when this flag (sortPair) is “true”.

FIG. 29 is a flowchart showing the distance-order merge process in the first embodiment of the present invention.

First, the merge unit 107 executes the search-order merge process described with reference to FIG. 15 (step S1001). In the distance-order merge process, however, the full match merge process or the neighborhood search merge process called from the search-order merge process executes, in place of the merge (merge), a process (add) of adding the clusters included in the respective grids to a pair list (pairList) as merge candidate clusters. For example, step S717 of the adjacent search process (without upper search) described with reference to FIG. 20 is rewritten as follows for the distance-order merge process.

“If it is determined in step S715 that the distance d is equal to or smaller than the threshold th, the merge unit 107 adds (add) the cluster included in the element at index i of the grid list (glist) and the cluster included in the element at index i+1 to the pair list (pairList) as merge candidate clusters.”

Subsequently, the merge unit 107 sorts the pair list (pairList), whose elements are pairs of clusters, in order of the inter-cluster distance d (step S1003). For this purpose, in step S1001, the inter-cluster distance d may be added to the pair list (pairList) together with the information on the paired clusters.

Subsequently, in step S1005, the merge unit 107 repeats step S1007 in order from the head of the pair list (pairList), following the order produced by the sort in step S1003 (pair list loop). The index of the element of the pair list (pairList) being processed in the loop of step S1005 is referred to as k. Because the pair list is sorted in order of the inter-cluster distance d and the merge processing follows that order, merge candidate clusters are merged in ascending order of their mutual distance.

In step S1007, for the pair of clusters that is the element at index k of the pair list (pairList), a new cluster is generated by merging the two clusters (merge). The information on this new cluster may include information on the original clusters that were merged. If, in step S1007, one of the two clusters being processed has already been merged with another cluster in an earlier iteration of the loop of step S1005, the merge can be executed between the other cluster being processed and that already merged cluster. If, in step S1007, both clusters being processed have already been merged with other clusters in earlier iterations of the loop of step S1005, the merge unit 107 may skip the merge and move on to the next element of the pair list (pairList) (step S1005).
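Steps S1003 to S1007 can be sketched as follows. This is a hedged illustration rather than the embodiment's implementation: the function name and the `(d, a, b)` tuple layout of the pair list are hypothetical, clusters are represented by string identifiers, and the optional skip described above (both clusters already merged) is always taken.

```python
def distance_order_merge(pair_list):
    """pair_list: [(d, a, b)] merge candidates collected in step S1001.
    Returns the merged groups as sorted lists of original cluster ids."""
    group = {}      # original cluster id -> its current merged group
    merged = set()  # ids that have already taken part in a merge
    # S1003: sort candidates by inter-cluster distance, smallest first.
    for _d, a, b in sorted(pair_list, key=lambda p: p[0]):   # S1005
        if a in merged and b in merged:
            continue  # both already merged elsewhere: skip this pair
        # S1007: merge; if one side was merged earlier, its whole group joins.
        new_group = group.get(a, {a}) | group.get(b, {b})
        for member in new_group:
            group[member] = new_group
        merged |= new_group
    return sorted(sorted(g) for g in {frozenset(g) for g in group.values()})
```

For the example of FIG. 28, the candidates are (d1, s, t) and (d2, t, u) with d1 > d2. The pair t and u is merged first, and s is then merged with the already merged cluster, so all three end up in one cluster.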

Note that enabling distance-order merging consumes a storage area for holding the pair list (pairList). The size of the pair list (pairList) is the number of merge candidate clusters that may be merged. In the worst case (the case where the size is largest), the size of the pair list (pairList) therefore equals the maximum number of distance calculations, and it grows as the range of the adjacency determination widens. Here, the maximum number of distance calculations increases with the number of grids to be processed, the number of neighborhood search directions, and the number of upper levels on which the upper level search is executed. Accordingly, in an environment where the available storage area is limited, it is desirable to enable distance-order merging only when the number of grids to be processed is at or below a certain level, for example by using the merge setting information described with reference to FIG. 13.

以上、本発明の第１の実施形態におけるマージ処理について説明した。一般的なマージ処理では、クラスタ全体の組み合わせに対してそれぞれ距離計算の処理が発生し、処理負荷が高い。また、一般的なマージ処理では、処理負荷を調整することが難しい。これに対して、本実施形態におけるマージ処理では、例えばクラスタ１０２１を含むグリッド１０３１の数に応じて、マージをするか否か、探索順マージか距離順マージか、フルマッチマージか近傍探索マージかといったマージの設定を選択し、マージの精度と処理負荷とを調整することが可能である。近傍探索マージでは、マージのためのクラスタ間の距離計算をするクラスタを、ある方向についてソートした場合に隣接するクラスタに限定することによって、距離計算の回数を減らし、処理負荷を低減させている。 Heretofore, the merging process in the first embodiment of the present invention has been described. In a general merge process, a distance calculation process occurs for each combination of the entire cluster, and the processing load is high. In general merge processing, it is difficult to adjust the processing load. On the other hand, in the merge processing according to the present embodiment, for example, according to the number of grids 1031 including the cluster 1021, whether to merge, whether search order merge or distance order merge, full match merge or neighborhood search merge. It is possible to select the merge setting and adjust the accuracy and processing load of the merge. In the neighbor search merge, the number of distance calculations is reduced and the processing load is reduced by limiting the clusters that calculate the distance between clusters for merging to adjacent clusters when sorting in a certain direction.

＜２．第２の実施形態＞ <2. Second Embodiment>
本発明の第２の実施形態では、地球を含む３次元空間が特徴空間に相当する。また、本実施形態において、３次元空間における位置情報は、例えばｘ座標、ｙ座標、およびｚ座標のような、３次元の直交座標系を用いて表される。また、本実施形態において、クラスタは、ｘ座標、ｙ座標、およびｚ座標によって３次元空間内に定義されるブロックに含まれるコンテンツの位置情報を包含する領域である。 In the second embodiment of the present invention, a three-dimensional space including the earth corresponds to the feature space. In the present embodiment, position information in the three-dimensional space is expressed using a three-dimensional orthogonal coordinate system, for example with an x coordinate, a y coordinate, and a z coordinate. In the present embodiment, a cluster is a region that encompasses the position information of the content included in a block defined in the three-dimensional space by the x coordinate, the y coordinate, and the z coordinate.

なお、本発明の第２の実施形態は、クラスタリングが２次元座標によって定義されるグリッドではなく、３次元座標によって定義されるブロックに基づいて行われる点において本発明の第１の実施形態と相違するが、それ以外の構成は第１の実施形態と略同一であるため、ここでは詳細説明を省略する。 Note that the second embodiment of the present invention is different from the first embodiment of the present invention in that clustering is performed based on blocks defined by three-dimensional coordinates rather than grids defined by two-dimensional coordinates. However, since the other configuration is substantially the same as that of the first embodiment, detailed description thereof is omitted here.

（２−１．ブロックベースの位置クラスタリングの概要） (2-1. Overview of block-based position clustering)
図３０Ａ〜図３４を参照して、本発明の第２の実施形態におけるクラスタリングの概要について説明する。本実施形態におけるクラスタリングは、３次元の直交座標系を用いて定義されるブロックを基準にして、位置情報を有するコンテンツをクラスタに分類するものであり、ブロックベースの位置クラスタリングともいえるものである。 With reference to FIGS. 30A to 34, an outline of clustering in the second embodiment of the present invention will be described. Clustering in the present embodiment classifies content having position information into clusters on the basis of blocks defined using a three-dimensional orthogonal coordinate system, and can be called block-based position clustering.

（ブロックについて） (About blocks)
図３０Ａは、本発明の第２の実施形態におけるコンテンツ、クラスタ、およびブロックの関係の例を示す図である。図３０Ａには、３次元空間２００１、コンテンツ２０１１、クラスタ２０２１、およびブロック２０３１が図示されている。 FIG. 30A is a diagram illustrating an example of the relationship among content, clusters, and blocks according to the second embodiment of the present invention. FIG. 30A shows a three-dimensional space 2001, content 2011, a cluster 2021, and a block 2031.

３次元空間２００１は、地球の一部または全部を含む空間である。本実施形態において、３次元空間２００１は、ｘ座標、ｙ座標、およびｚ座標という３次元の座標によって位置情報が表される３次元空間である。 A three-dimensional space 2001 is a space including part or all of the earth. In the present embodiment, the three-dimensional space 2001 is a three-dimensional space in which position information is represented by three-dimensional coordinates such as an x coordinate, a y coordinate, and a z coordinate.

コンテンツ２０１１は、３次元空間２００１内の位置を特定する位置情報を有するデータである。コンテンツ２０１１は、位置情報そのものであってもよく、また、何らかの情報に対する付加的な情報として位置情報が付加されたデータであってもよい。コンテンツ２０１１は、例えば、撮影場所の位置情報が付加された画像データでありうる。 The content 2011 is data having position information for specifying a position in the three-dimensional space 2001. The content 2011 may be position information itself, or may be data to which position information is added as additional information for some information. The content 2011 can be, for example, image data to which position information on the shooting location is added.

クラスタ２０２１は、３次元空間２００１において互いに近い位置にあるコンテンツ２０１１を含む領域である。クラスタ２０２１は、直方体として図示されているが、その他の形状であってもよい。また、クラスタ２０２１は、コンテンツ２０１１に外接する立体であってもよい。図示されている例において、コンテンツ２０１１は、地表面上に位置する。そのため、コンテンツ２０１１は、クラスタ２０２１の地表面による切断面２０２１ｓ上にある。 The cluster 2021 is an area including the contents 2011 that are close to each other in the three-dimensional space 2001. The cluster 2021 is illustrated as a rectangular parallelepiped, but may have other shapes. The cluster 2021 may be a solid that circumscribes the content 2011. In the illustrated example, the content 2011 is located on the ground surface. For this reason, the content 2011 is on the cut surface 2021s of the ground surface of the cluster 2021.

ブロック２０３１は、３次元空間２００１に設定されたブロックである。ブロック２０３１は、３次元空間２００１内でｘ座標、ｙ座標およびｚ座標の範囲によって定義される直方体形状の領域でありうる。ブロック２０３１の大きさは、コンテンツ２０１１の数、およびクラスタリングの対象となる領域の大きさなど、クラスタリングの条件に応じて、適切な大きさに設定されうる。 A block 2031 is a block set in the three-dimensional space 2001. The block 2031 may be a rectangular parallelepiped region defined by the x-coordinate, y-coordinate, and z-coordinate ranges in the three-dimensional space 2001. The size of the block 2031 can be set to an appropriate size according to the clustering conditions such as the number of contents 2011 and the size of the area to be clustered.

図示されているように、本実施形態においては、同じブロック２０３１に含まれるコンテンツ２０１１が、同じクラスタ２０２１に分類される。つまり、本実施形態におけるブロックベースの位置クラスタリング処理においては、同じブロック２０３１に含まれるか否かが、クラスタリングの基本的な基準になる。図示されたコンテンツ２０１１は、そのすべてが同じブロック２０３１に含まれるため、同じクラスタ２０２１に分類される。 As shown in the figure, in the present embodiment, the contents 2011 included in the same block 2031 are classified into the same cluster 2021. That is, in the block-based position clustering process in the present embodiment, whether or not they are included in the same block 2031 is a basic criterion for clustering. The illustrated contents 2011 are all included in the same block 2031 and are therefore classified into the same cluster 2021.

なお、図示されている例では、コンテンツ２０１１はいずれも地表面上に位置している。しかし、コンテンツ２０１１は、例えば地球の内側＝地下に位置していてもよい。この場合、コンテンツ２０１１は、クラスタ２０２１の地表面による切断面２０２１ｓよりも奥側にある。また、コンテンツ２０１１は、地球の外側＝空中に位置していてもよい。この場合、コンテンツ２０１１は、クラスタ２０２１の地表面による切断面２０２１ｓよりも手前側にある。クラスタリングの基準が、地球の内側と外側とにまたがるブロック２０３１であることによって、そのような場合でも、コンテンツ２０１１に対してクラスタ２０２１を設定することが可能である。 In the illustrated example, the contents 2011 are all located on the ground surface. However, the content 2011 may be located, for example, inside the earth, that is, underground. In this case, the content 2011 lies behind the cut surface 2021s of the cluster 2021 formed by the ground surface. The content 2011 may also be located outside the earth, that is, in the air. In this case, the content 2011 lies in front of the cut surface 2021s of the cluster 2021 formed by the ground surface. Because the clustering criterion is the block 2031, which spans both the inside and the outside of the earth, the cluster 2021 can be set for the content 2011 even in such cases.

図３０Ｂは、本発明の第２の実施形態におけるコンテンツおよびクラスタの表示の例を示す図である。図３０Ｂでは、図３０Ａの例におけるコンテンツ２０１１およびクラスタ２０２１が、地表面上の図形として表示された例が示されている。 FIG. 30B is a diagram showing an example of content and cluster display according to the second embodiment of the present invention. FIG. 30B shows an example in which the content 2011 and the cluster 2021 in the example of FIG. 30A are displayed as graphics on the ground surface.

図示された例では、コンテンツ２０１１と、クラスタ２０２１の地表面による切断面２０２１ｓに外接する円であるクラスタ表示領域２０２１ｄとが表示されている。このように、クラスタ表示領域２０２１ｄは、クラスタ２０２１の地表面による切断面２０２１ｓの外接図形でありうる。また、上述のようにコンテンツ２０１１が地下や空中に位置する場合には、クラスタ表示領域２０２１ｄは立体的な図形であってもよい。 In the illustrated example, the content 2011 and the cluster display area 2021d that is a circle circumscribing the cut surface 2021s by the ground surface of the cluster 2021 are displayed. As described above, the cluster display area 2021d can be a circumscribed figure of the cut surface 2021s by the ground surface of the cluster 2021. As described above, when the content 2011 is located underground or in the air, the cluster display area 2021d may be a three-dimensional figure.

本実施形態におけるブロックベースの位置クラスタリングでは、コンテンツ２０１１の位置情報そのものが、コンテンツ２０１１が含まれるブロック２０３１を表している。上述の第１の実施形態と同様にして、コンテンツ２０１１を、ｘ座標、ｙ座標およびｚ座標の値を順に１桁ずつ配列したＮ進数値の順にソートすると、同一のブロック２０３１の領域に含まれるコンテンツ２０１１は、ソートの結果において互いに隣接する。ソートの結果において隣接したコンテンツ２０１１同士が同一のブロック２０３１に含まれるか否かは、例えば、上記のＮ進数値の上位ｋ桁（ｋ＝１，２，・・・）が共通するか否かによって判定されうる。ここで、上述のように、同一のブロック２０３１に含まれるコンテンツ２０１１は、同一のクラスタ２０２１に分類されるコンテンツである。それゆえ、本実施形態におけるブロックベースの位置クラスタリングでは、コンテンツ２０１１の位置情報から生成した数値をソートすることが、クラスタリングの主な処理になる。 In the block-based position clustering of this embodiment, the position information of the content 2011 itself indicates the block 2031 that contains the content 2011. As in the first embodiment described above, when the contents 2011 are sorted in order of the N-ary values obtained by arranging the x-coordinate, y-coordinate, and z-coordinate values one digit at a time in turn, the contents 2011 included in the region of the same block 2031 are adjacent to each other in the sorted result. Whether contents 2011 that are adjacent in the sorted result are included in the same block 2031 can be determined, for example, by whether the upper k digits (k = 1, 2, ...) of their N-ary values are common. Here, as described above, the contents 2011 included in the same block 2031 are contents classified into the same cluster 2021. Therefore, in the block-based position clustering of this embodiment, sorting the numerical values generated from the position information of the contents 2011 is the main clustering operation.
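
The sort-then-compare idea in the preceding paragraph can be sketched as follows. The sketch assumes N = 2 and that each content item has already been reduced to an integer holding its interleaved value; the function name `cluster_by_prefix` and its parameters are illustrative and are not taken from the specification.

```python
from itertools import groupby


def cluster_by_prefix(codes, total_digits, k):
    """Group integer codes whose upper k binary digits are common.

    codes:        interleaved binary values, one per content item (assumed).
    total_digits: number of binary digits in each code.
    k:            number of upper digits that must match within a cluster.
    """
    shift = total_digits - k
    ordered = sorted(codes)            # sorting makes same-prefix codes adjacent
    # Adjacent codes with the same upper k digits fall into the same group.
    return [list(group) for _, group in groupby(ordered, key=lambda c: c >> shift)]
```

For example, with 6-digit codes and k = 3, the codes `0b000101` and `0b000011` end up in one group because both begin with `000`. No distance between positions is computed at any point; only integer comparisons are used.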

（ブロックの階層構造について） (About the hierarchical structure of blocks)
本実施形態におけるブロック２０３１も、第１の実施形態におけるグリッド１０３１と同様に、階層構造にすることが可能である。ここでは、中心が地球の中心と一致し、地球全体を包含するブロックがレベル０ブロックである場合を例として説明する。この場合、レベル１ブロックは、レベル０ブロックをｘ座標、ｙ座標、およびｚ座標についてそれぞれ２分割したブロックである。また、レベル２ブロックは、レベル１ブロックをｘ座標、ｙ座標、およびｚ座標についてそれぞれ２分割したブロックである。さらに、同様にして、レベル２ブロックの領域をｘ座標、ｙ座標、およびｚ座標についてそれぞれ２分割したレベル３ブロック、レベル３ブロックの領域をｘ座標、ｙ座標、およびｚ座標についてそれぞれ２分割したレベル４ブロック、・・・というように、さらに下位のレベルのブロックが定義されうる。 Like the grid 1031 in the first embodiment, the block 2031 in the present embodiment can also have a hierarchical structure. Here, a case where the level 0 block is a block whose center coincides with the center of the earth and which encompasses the entire earth will be described as an example. In this case, a level 1 block is a block obtained by dividing the level 0 block in two along each of the x coordinate, the y coordinate, and the z coordinate. A level 2 block is a block obtained by dividing a level 1 block in two along each of the x coordinate, the y coordinate, and the z coordinate. In the same way, still lower-level blocks can be defined: level 3 blocks obtained by dividing the region of a level 2 block in two along each of the x, y, and z coordinates, level 4 blocks obtained by dividing the region of a level 3 block in two along each of the x, y, and z coordinates, and so on.

図３１は、本発明の第２の実施形態におけるブロックによる地表面の分割について説明するための図である。レベル０ブロックの中心が地球の中心と一致している場合、地表面２０１０は、図示されているように、レベル１ブロックによって８つの領域２０３２ａ〜２０３２ｈに分割される。図３１は、地球を外部から見た図であり、レベル１ブロックによって分割された領域のうち２０３２ａ〜２０３２ｅが図示されている。 FIG. 31 is a diagram for explaining the division of the ground surface by the blocks in the second embodiment of the present invention. If the center of the level 0 block coincides with the center of the earth, the ground surface 2010 is divided into eight regions 2032a-2032h by the level 1 block as shown. FIG. 31 is a view of the earth as seen from the outside, and 2032a to 2032e are shown in the region divided by the level 1 block.

図３２は、本発明の第２の実施形態におけるブロックによる地表面の分割について説明するための図である。図３２では、地球を平面に展開した図であり、レベル１ブロックによって分割された地表面２０１０の８つの領域２０３２ａ〜２０３２ｈと、レベル１ブロックをさらに分割したレベル２ブロックによって分割された地表面２０１０の領域２０３３とが図示されている。本実施形態において、レベル１ブロックは、レベル２ブロックによって８つに分割されるが、レベル２ブロックのうち１つは地球の内側にあるため、地表面２０１０と交わらない。それゆえ、レベル１ブロックによって分割された地表面２０１０の８つの領域２０３２ａ〜２０３２ｈは、それぞれ、レベル２ブロックによって分割されて、８−１＝７つの領域２０３３になる。 FIG. 32 is a diagram for explaining division of the ground surface by blocks according to the second embodiment of the present invention. FIG. 32 is a diagram in which the earth is developed in a plane. The eight regions 2032a to 2032h of the ground surface 2010 divided by the level 1 block and the ground surface 2010 divided by the level 2 block obtained by further dividing the level 1 block. The region 2033 is shown. In the present embodiment, the level 1 block is divided into eight by the level 2 block, but one of the level 2 blocks is inside the earth and thus does not intersect the ground surface 2010. Therefore, the eight regions 2032a to 2032h of the ground surface 2010 divided by the level 1 block are respectively divided by the level 2 block to become 8-1 = 7 regions 2033.

図３３は、本発明の第２の実施形態におけるブロックによる地表面の分割について説明するための図である。図３３では、上記のレベル０ブロック〜レベル２ブロックと同様にして順次定義される下位のブロックのうち、レベル５ブロックによって分割された地表面２０１０の領域が図示されている。図示された例のように、レベル５ブロックは、例えば、日本の地域ごとにコンテンツ２０１１を分類するのに適したサイズになる。本実施形態におけるブロックベースの位置クラスタリングにおいても、クラスタリング処理に用いられるブロックのレベルを調整することによって、クラスタリングの粒度と処理負荷とをバランスさせることが可能である。 FIG. 33 is a diagram for explaining the division of the ground surface by blocks in the second embodiment of the present invention. FIG. 33 shows the regions of the ground surface 2010 divided by level 5 blocks, among the lower-level blocks that are defined successively in the same manner as the level 0 to level 2 blocks described above. As in the illustrated example, a level 5 block has a size suitable, for example, for classifying the content 2011 by region of Japan. In the block-based position clustering of this embodiment as well, the granularity of clustering and the processing load can be balanced by adjusting the level of the blocks used for the clustering process.

このように、本実施形態においては、あるレベルのブロックをｘ座標、ｙ座標、およびｚ座標についてそれぞれ２分割したブロックが、１つ下位のレベルのブロックになる。換言すれば、下位のレベルのブロックは、上位のレベルのブロックの領域を８分割する。従って、本実施形態におけるブロック２０３１の階層構造は、レベル０ブロックをルートノードとし、以下の各レベルのブロックをノードとする８分木構造を有し、ブロック２０３１に従って定義されるクラスタ２０２１もまた、同様の８分木構造を有するといえる。 As described above, in the present embodiment, a block obtained by dividing a block at a certain level into two with respect to the x coordinate, the y coordinate, and the z coordinate is a block at the next lower level. In other words, the lower level block divides the area of the upper level block into eight. Therefore, the hierarchical structure of the block 2031 in the present embodiment has an octree structure in which a level 0 block is a root node and blocks of the following levels are nodes, and a cluster 2021 defined according to the block 2031 is also: It can be said that it has the same octree structure.

ここで、一般的な距離ベースの位置クラスタリングでは、クラスタの木構造が定義された場合、その木構造の情報を保持する記憶領域が消費される。一方、本実施形態におけるブロックベースの位置クラスタリングでは、上述のようにブロック２０３１の８分木構造が一意に定まっているため、ブロック２０３１がどのレベルのブロックであるかという情報が保持されていれば、ブロック２０３１の８分木構造に基づいて、クラスタ２０２１の木構造を容易に把握することが可能である。 Here, in general distance-based position clustering, when a tree structure of a cluster is defined, a storage area that holds information on the tree structure is consumed. On the other hand, in the block-based position clustering in the present embodiment, since the octree structure of the block 2031 is uniquely determined as described above, if information on which level the block 2031 is is retained. Based on the octree structure of the block 2031, the tree structure of the cluster 2021 can be easily grasped.
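
Because the octree is fixed by the way blocks are subdivided, the ancestors of a block can be derived arithmetically rather than stored. The following sketch assumes N = 2 and an interleaved binary value of the kind described earlier, in which each group of three binary digits selects one of eight child blocks. The function names `block_path` and `parent_prefix` are hypothetical.

```python
def block_path(code, total_digits, level):
    """Return the child indices (0-7) chosen at levels 1..level.

    code:         interleaved binary value of a content item (assumed layout:
                  three binary digits, one per axis, for each level).
    total_digits: number of binary digits in code (a multiple of 3).
    level:        depth of the block of interest in the octree.
    """
    if total_digits % 3 != 0 or not 0 <= 3 * level <= total_digits:
        raise ValueError("level is out of range for this code length")
    prefix = code >> (total_digits - 3 * level)   # upper 3*level digits
    return [(prefix >> (3 * (level - 1 - i))) & 0b111 for i in range(level)]


def parent_prefix(prefix):
    """Drop the last three binary digits to get the parent block's prefix."""
    return prefix >> 3
```

Under these assumptions, storing only the level of a block together with the value of any content inside it is enough to recover the whole chain of enclosing blocks.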

（グリッドベースとの比較） (Comparison with grid-based clustering)
上述の第１の実施形態におけるグリッドベースの位置クラスタリングでは、地表面を、緯度および経度という２次元の座標によって位置情報が表される２次元平面として扱う。この場合、グリッドサイズは、例えば、緯度１度×経度１度、といったように定義される。しかし、周知のように、緯度１度あたりの距離が約１１１ｋｍでほぼ一定であるのに対して、経度１度あたりの距離は、高緯度になるほど短くなる。具体的には、緯度０度の赤道上での経度１度あたりの距離が約１１１ｋｍであるのに対し、緯度６０度での経度１度あたりの距離は約５５．７ｋｍである。従って、グリッドサイズを実際の距離で表すと、赤道付近では１１１ｋｍ×１１１ｋｍであるのに対し、緯度６０度付近では１１１ｋｍ×５５．７ｋｍと、面積では半分程度になってしまう。 In the grid-based position clustering of the first embodiment described above, the ground surface is handled as a two-dimensional plane in which position information is represented by the two-dimensional coordinates of latitude and longitude. In this case, the grid size is defined as, for example, 1 degree of latitude × 1 degree of longitude. However, as is well known, while the distance per degree of latitude is almost constant at about 111 km, the distance per degree of longitude becomes shorter as the latitude becomes higher. Specifically, the distance per degree of longitude on the equator (latitude 0 degrees) is about 111 km, whereas the distance per degree of longitude at latitude 60 degrees is about 55.7 km. Therefore, when the grid size is expressed as an actual distance, it is 111 km × 111 km near the equator but 111 km × 55.7 km near latitude 60 degrees, which is about half the area.
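
The shrinking of a one-degree grid cell with latitude can be checked with a short calculation. The sketch below assumes a spherical earth and the rounded figure of 111 km per degree from the text, so it yields 55.5 km at latitude 60 degrees rather than the approximately 55.7 km quoted above. The name `grid_size_km` is illustrative.

```python
import math

KM_PER_DEGREE = 111.0  # approximate length of one degree of latitude, from the text


def grid_size_km(latitude_deg, dlat_deg=1.0, dlon_deg=1.0):
    """Return (north-south, east-west) extent in km of a lat/lon grid cell.

    Spherical approximation: one degree of longitude shrinks by cos(latitude).
    """
    north_south = KM_PER_DEGREE * dlat_deg
    east_west = KM_PER_DEGREE * math.cos(math.radians(latitude_deg)) * dlon_deg
    return north_south, east_west


ns0, ew0 = grid_size_km(0.0)
ns60, ew60 = grid_size_km(60.0)
area_ratio = (ns60 * ew60) / (ns0 * ew0)   # about 0.5: half the area at 60 degrees
```

The ratio of about one half matches the statement that a grid cell near latitude 60 degrees covers roughly half the area of one near the equator.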

これに対して、本実施形態におけるブロックベースの位置クラスタリングでは、地表面２０１０上のコンテンツ２０１１を、３次元空間２００１に定義されたブロック２０３１で囲むことによってクラスタ２０２１を設定する。従って、ブロック２０３１のサイズは、緯度の高低によって変化しない。なお、ブロック２０３１の地表面２０１０との交わり方の違いによって、地表面２０１０で占める領域のサイズは、同緯度のブロック２０３１の間でも互いに異なりうる。このようなブロックベースの位置クラスタリングにマージ処理を合わせて行うことによって、地球全体にわたって、緯度によらずに均一な粒度のクラスタリングが実現される。 On the other hand, in the block-based position clustering in this embodiment, the cluster 2021 is set by surrounding the content 2011 on the ground surface 2010 with the block 2031 defined in the three-dimensional space 2001. Therefore, the size of the block 2031 does not change depending on the latitude. Note that the size of the area occupied by the ground surface 2010 may be different between the blocks 2031 at the same latitude, depending on the way the blocks 2031 intersect with the ground surface 2010. By performing merge processing together with such block-based position clustering, clustering with a uniform granularity can be realized over the entire earth regardless of latitude.

（ブロックに対応付けられるＮ進数値） (N-ary values associated with blocks)
図３４は、本発明の第２の実施形態におけるクラスタリングについて説明するための図である。各ブロック２０３１に付されている番号は、これらのブロックをＮ進数値の大小でソートした場合のインデックスである。 FIG. 34 is a diagram for explaining clustering in the second embodiment of the present invention. The number attached to each block 2031 is its index when the blocks are sorted by the magnitude of their N-ary values.

本実施形態においても、情報処理装置１００のＮ進数値生成部１０１によって、Ｎ進数値が生成される。本実施形態におけるＮ進数値は、３次元空間２００１において３次元の座標によって表される位置情報を有するデータについて、所定の桁数の２進数で表現された各次元の座標の値を各次元について順に１桁ずつ配列した２進数値でありうる。 Also in this embodiment, an N-ary value is generated by the N-ary value generation unit 101 of the information processing apparatus 100. The N-ary value in the present embodiment is the value of the coordinate of each dimension expressed by a binary number of a predetermined number of digits for data having position information represented by three-dimensional coordinates in the three-dimensional space 2001 for each dimension. It may be a binary value arranged one digit at a time.

例えば、所定の桁数として２１桁を設定した場合、Ｎ進数値生成部１０１は、ｘ座標、ｙ座標、およびｚ座標の値をそれぞれ２１桁の２進数値で表現する。ここで、２進数で表現されたｘ座標を“ｘ_２０ｘ_１９ｘ_１８・・・ｘ_０”とし、２進数で表現された経度を“ｙ_２０ｙ_１９ｙ_１８・・・ｙ_０”とし、２進数で表現されたｚ座標を“ｚ_２０ｚ_１９ｚ_１８・・・ｚ_０”とすると、これらの値を各次元について順に１桁ずつ配列した２進数値は、“ｘ_２０ｙ_２０ｚ_２０ｘ_１９ｙ_１９ｚ_１９ｘ_１８ｙ_１８ｚ_１８・・・ｘ_０ｙ_０ｚ_０”という６３桁の２進数値になる。なお、所定の桁数が２１桁である場合、ブロック２０３１の対角線方向での最小分解能は１１ｍ相当になる。所定の桁数は、例えば、必要な最小分解能、および情報処理装置１００で扱われるデータ単位のサイズ（例えば３２ビット、６４ビットなど）を考慮した適切な値に設定されうる。 For example, when 21 digits is set as the predetermined number of digits, the N-ary value generation unit 101 expresses each of the x-coordinate, y-coordinate, and z-coordinate values as a 21-digit binary value. Here, if the x coordinate expressed in binary is "x_20 x_19 x_18 ... x_0", the y coordinate expressed in binary is "y_20 y_19 y_18 ... y_0", and the z coordinate expressed in binary is "z_20 z_19 z_18 ... z_0", then the binary value in which these values are arranged one digit at a time in order across the dimensions is the 63-digit binary value "x_20 y_20 z_20 x_19 y_19 z_19 x_18 y_18 z_18 ... x_0 y_0 z_0". When the predetermined number of digits is 21, the minimum resolution in the diagonal direction of the block 2031 is equivalent to about 11 m. The predetermined number of digits can be set to an appropriate value in consideration of, for example, the required minimum resolution and the size of the data unit handled by the information processing apparatus 100 (for example, 32 bits or 64 bits).
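
The digit-by-digit arrangement described above is a bit interleave. The following sketch shows it for N = 2 and checks the stated resolution. It assumes that the coordinates have already been quantized to non-negative integers, and, for the resolution estimate only, that the side of the level 0 block equals a mean earth diameter of about 12,742 km; the specification does not state this figure. The names `interleave3` and `EARTH_DIAMETER_M` are illustrative.

```python
import math


def interleave3(x, y, z, digits=21):
    """Arrange the binary digits of x, y, z one digit at a time, x first.

    The result reads x_(d-1) y_(d-1) z_(d-1) ... x_0 y_0 z_0 and has
    3 * digits binary digits (63 for digits=21).
    """
    code = 0
    for i in range(digits - 1, -1, -1):        # most significant digit first
        for value in (x, y, z):
            code = (code << 1) | ((value >> i) & 1)
    return code


EARTH_DIAMETER_M = 12_742_000.0                # assumed side of the level 0 block
cell_side_m = EARTH_DIAMETER_M / 2 ** 21       # side of the smallest block
cell_diagonal_m = cell_side_m * math.sqrt(3)   # diagonal of the smallest block
```

Under these assumptions the smallest block has a side of roughly 6 m and a diagonal of roughly 10.5 m, which is in line with the figure of about 11 m given in the text. A 63-digit value also fits in a 64-bit data unit, which is consistent with the remark about data unit sizes.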

図示された例では、３次元空間２００１において、各次元の座標が３桁の２進数値で表現される。ここで、ブロック２０３１は、この２進数値のうち上位の２桁によって規定されるブロックである。かかるブロック２０３１は、３次元空間２００１を（２^３）^２＝６４分割する。図では、各ブロック２０３１に０〜６３のインデックスが与えられている。このインデックスは、当該ブロック２０３１に含まれうるコンテンツ２０１１のとりうる２進数値の順にブロック２０３１をソートした場合のソート順に対応する。この場合における、インデックス０〜８，５５〜６３のブロック２０３１に含まれうるコンテンツ２０１１の、座標およびＮ進数値生成部１０１によって生成される２進数値は、以下の表２のようになる。なお、表中「００ｘ」「００ｙ」「００ｚ」という表記「ｘ」「ｙ」「ｚ」は、任意の値（０または１）を示す。 In the illustrated example, the coordinate of each dimension in the three-dimensional space 2001 is expressed as a three-digit binary value. Here, the block 2031 is a block defined by the upper two digits of this binary value. Such blocks 2031 divide the three-dimensional space 2001 into (2^3)^2 = 64 parts. In the figure, each block 2031 is given an index from 0 to 63. This index corresponds to the sort order obtained when the blocks 2031 are sorted in order of the binary values that the content 2011 included in each block 2031 can take. In this case, the coordinates of the content 2011 that can be included in the blocks 2031 with indexes 0 to 8 and 55 to 63, and the binary values generated for it by the N-ary value generation unit 101, are as shown in Table 2 below. In the table, the "x", "y", and "z" in the notations "00x", "00y", and "00z" indicate an arbitrary value (0 or 1).
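
The relationship between three-digit coordinates and the 64 block indexes can be reproduced with a small sketch. It assumes, as an interpretation of the figure description, that a block's index is the interleave of the upper two binary digits of each coordinate; the function name `block_index` and its parameters are hypothetical.

```python
def block_index(x, y, z, coord_digits=3, block_digits=2):
    """Return the sort index of the block containing point (x, y, z).

    Only the upper block_digits binary digits of each coordinate are kept,
    then interleaved one digit at a time in x, y, z order.
    """
    drop = coord_digits - block_digits          # low digits ignored by the block
    bx, by, bz = x >> drop, y >> drop, z >> drop
    index = 0
    for i in range(block_digits - 1, -1, -1):
        for value in (bx, by, bz):
            index = (index << 1) | ((value >> i) & 1)
    return index


# Every combination of 3-digit coordinates maps onto one of the 64 blocks.
all_indexes = {block_index(x, y, z)
               for x in range(8) for y in range(8) for z in range(8)}
```

With this mapping, the lowest binary digit of each coordinate (the "x", "y", "z" placeholders in the table notation) has no effect on the block index, and the blocks with indexes 0 to 7 are exactly those whose coordinates all begin with the digit 0.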

本実施形態において、クラスタリング部１０３は、Ｎ進数値生成部１０１が生成した２進数値の上位ｋ桁（ｋ＝１，２，・・・）が共通する複数のコンテンツ２０１１を同一のクラスタに分類する。ｋ＝３×ｍ（ｍ＝１，２，・・・）である場合、２進数値の上位ｋ桁が共通するコンテンツ２０１１が分類されるクラスタは、２^３＝８分木構造クラスタのｍ層目になる。 In the present embodiment, the clustering unit 103 classifies a plurality of contents 2011 that share the upper k digits (k = 1, 2, ...) of the binary values generated by the N-ary value generation unit 101 into the same cluster. When k = 3 × m (m = 1, 2, ...), the cluster into which the contents 2011 sharing the upper k digits of the binary value are classified is at the m-th layer of the 2^3 = 8-ary tree (octree) structure of clusters.

ここで、表２を参照すると、インデックス０〜７のブロック２０３１に含まれるコンテンツ２０１１は、いずれも２進数値の上位３桁が“０００”で共通している。図３４を参照すると、これらのコンテンツ２０１１が含まれるブロック２０３１は、図中の左下奥に位置する８つのブロック２０３１であることがわかる。図示されているように、これらのブロック２０３１は、１階層上位のブロックである上位ブロック２０４１を構成する８つのブロック２０３１である。従って、例えば、インデックス１のブロック２０３１に含まれるコンテンツ２０１１と、インデックス５のブロック２０３１に含まれるコンテンツ２０１１とは、同じ上位ブロック２０４１に含まれる。 Here, referring to Table 2, the contents 2011 included in the blocks 2031 with indexes 0 to 7 all share "000" as the upper three digits of their binary values. Referring to FIG. 34, it can be seen that the blocks 2031 containing these contents 2011 are the eight blocks 2031 located at the lower left rear of the figure. As illustrated, these are the eight blocks 2031 that make up an upper block 2041, which is the block one level higher in the hierarchy. Therefore, for example, the content 2011 included in the block 2031 with index 1 and the content 2011 included in the block 2031 with index 5 are included in the same upper block 2041.

本実施形態においても、上述の第１の実施形態におけるグリッドとクラスタとの関係と同様に、また図３０Ａおよび図３０Ｂを参照して説明したように、ブロック２０３１はそれぞれクラスタ２０２１と対応付けられている。従って、上記の例から、２進数値の上位ｋ桁が共通するコンテンツ２０１１が同一のクラスタ２０２１に分類されることがわかる。 In the present embodiment as well, each block 2031 is associated with a cluster 2021, in the same way as the relationship between grids and clusters in the first embodiment described above and as described with reference to FIGS. 30A and 30B. Therefore, it can be seen from the above example that contents 2011 sharing the upper k digits of their binary values are classified into the same cluster 2021.

なお、本実施形態のクラスタリングおよびマージ処理に関するその他の処理は、上述の第１の実施形態と同様に行われうる。 Note that other processing related to clustering and merge processing of the present embodiment can be performed in the same manner as in the first embodiment.

＜３．本発明の実施形態に係る情報処理装置のハードウェア構成＞ <3. Hardware configuration of the information processing apparatus according to the embodiment of the present invention>
次に、図３５を参照しながら、本発明の実施形態に係る情報処理装置１００のハードウェア構成について、詳細に説明する。図３５は、本発明の実施形態に係る情報処理装置１００のハードウェア構成を説明するためのブロック図である。 Next, the hardware configuration of the information processing apparatus 100 according to the embodiment of the present invention will be described in detail with reference to FIG. 35. FIG. 35 is a block diagram for explaining the hardware configuration of the information processing apparatus 100 according to the embodiment of the present invention.

情報処理装置１００は、主に、ＣＰＵ９０１と、ＲＯＭ９０３と、ＲＡＭ９０５と、を備える。また、情報処理装置１００は、更に、ホストバス９０７と、ブリッジ９０９と、外部バス９１１と、インターフェース９１３と、入力装置９１５と、出力装置９１７と、ストレージ装置９１９と、ドライブ９２１と、接続ポート９２３と、通信装置９２５とを備える。 The information processing apparatus 100 mainly includes a CPU 901, a ROM 903, and a RAM 905. The information processing apparatus 100 further includes a host bus 907, a bridge 909, an external bus 911, an interface 913, an input device 915, an output device 917, a storage device 919, a drive 921, and a connection port 923. And a communication device 925.

ＣＰＵ９０１は、演算処理装置および制御装置として機能し、ＲＯＭ９０３、ＲＡＭ９０５、ストレージ装置９１９、またはリムーバブル記録媒体９２７に記録された各種プログラムに従って、情報処理装置１００内の動作全般またはその一部を制御する。ＲＯＭ９０３は、ＣＰＵ９０１が使用するプログラムや演算パラメータ等を記憶する。ＲＡＭ９０５は、ＣＰＵ９０１の実行において使用するプログラムや、その実行において適宜変化するパラメータ等を一次記憶する。これらはＣＰＵバス等の内部バスにより構成されるホストバス９０７により相互に接続されている。 The CPU 901 functions as an arithmetic processing unit and a control unit, and controls all or part of the operation of the information processing apparatus 100 according to various programs recorded in the ROM 903, the RAM 905, the storage device 919, or the removable recording medium 927. The ROM 903 stores programs, calculation parameters, and the like used by the CPU 901. The RAM 905 temporarily stores programs used in execution by the CPU 901, parameters that change as appropriate during that execution, and the like. These are connected to each other by the host bus 907, which is constituted by an internal bus such as a CPU bus.

ホストバス９０７は、ブリッジ９０９を介して、ＰＣＩ（Peripheral Component Interconnect/Interface）バスなどの外部バス９１１に接続されている。 The host bus 907 is connected to an external bus 911 such as a PCI (Peripheral Component Interconnect / Interface) bus via a bridge 909.

入力装置９１５は、例えば、マウス、キーボード、タッチパネル、ボタン、スイッチおよびレバーなどユーザが操作する操作手段である。また、入力装置９１５は、例えば、赤外線やその他の電波を利用したリモートコントロール手段（いわゆる、リモコン）であってもよいし、情報処理装置１００の操作に対応した携帯電話やＰＤＡ等の外部接続機器９２９であってもよい。さらに、入力装置９１５は、例えば、上記の操作手段を用いてユーザにより入力された情報に基づいて入力信号を生成し、ＣＰＵ９０１に出力する入力制御回路などから構成されている。情報処理装置１００のユーザは、この入力装置９１５を操作することにより、情報処理装置１００に対して各種のデータを入力したり処理動作を指示したりすることができる。 The input device 915 is an operation unit operated by the user, such as a mouse, a keyboard, a touch panel, a button, a switch, and a lever. Further, the input device 915 may be, for example, remote control means (so-called remote control) using infrared rays or other radio waves, or an external connection device such as a mobile phone or a PDA corresponding to the operation of the information processing device 100. 929 may be used. Furthermore, the input device 915 includes an input control circuit that generates an input signal based on information input by a user using the above-described operation means and outputs the input signal to the CPU 901, for example. The user of the information processing apparatus 100 can input various data and instruct processing operations to the information processing apparatus 100 by operating the input device 915.

出力装置９１７は、取得した情報をユーザに対して視覚的または聴覚的に通知することが可能な装置で構成される。このような装置として、ＣＲＴディスプレイ装置、液晶ディスプレイ装置、プラズマディスプレイ装置、ＥＬディスプレイ装置およびランプなどの表示装置や、スピーカおよびヘッドホンなどの音声出力装置や、プリンタ装置、携帯電話、ファクシミリなどがある。出力装置９１７は、例えば、情報処理装置１００が行った各種処理により得られた結果を出力する。具体的には、表示装置は、情報処理装置１００が行った各種処理により得られた結果を、テキストまたはイメージで表示する。他方、音声出力装置は、再生された音声データや音響データ等からなるオーディオ信号をアナログ信号に変換して出力する。 The output device 917 is configured by a device capable of visually or audibly notifying acquired information to the user. Examples of such devices include CRT display devices, liquid crystal display devices, plasma display devices, EL display devices and display devices such as lamps, audio output devices such as speakers and headphones, printer devices, mobile phones, and facsimiles. For example, the output device 917 outputs results obtained by various processes performed by the information processing apparatus 100. Specifically, the display device displays the results obtained by various processes performed by the information processing device 100 as text or images. On the other hand, the audio output device converts an audio signal composed of reproduced audio data, acoustic data, and the like into an analog signal and outputs the analog signal.

ストレージ装置９１９は、情報処理装置１００の記憶部の一例として構成されたデータ格納用の装置である。ストレージ装置９１９は、例えば、ＨＤＤ（Hard Disk Drive）等の磁気記憶部デバイス、半導体記憶デバイス、光記憶デバイス、または光磁気記憶デバイス等により構成される。このストレージ装置９１９は、ＣＰＵ９０１が実行するプログラムや各種データ、および外部から取得した各種のデータなどを格納する。 The storage device 919 is a data storage device configured as an example of a storage unit of the information processing device 100. The storage device 919 includes, for example, a magnetic storage device such as an HDD (Hard Disk Drive), a semiconductor storage device, an optical storage device, or a magneto-optical storage device. The storage device 919 stores programs executed by the CPU 901, various data, various data acquired from the outside, and the like.

ドライブ９２１は、記録媒体用リーダライタであり、情報処理装置１００に内蔵、あるいは外付けされる。ドライブ９２１は、装着されている磁気ディスク、光ディスク、光磁気ディスク、または半導体メモリ等のリムーバブル記録媒体９２７に記録されている情報を読み出して、ＲＡＭ９０５に出力する。また、ドライブ９２１は、装着されている磁気ディスク、光ディスク、光磁気ディスク、または半導体メモリ等のリムーバブル記録媒体９２７に記録を書き込むことも可能である。リムーバブル記録媒体９２７は、例えば、ＤＶＤメディア、ＨＤ−ＤＶＤメディア、Ｂｌｕ−ｒａｙメディア等である。また、リムーバブル記録媒体９２７は、コンパクトフラッシュ（登録商標）（Compact Flash：ＣＦ）、フラッシュメモリ、または、ＳＤメモリカード（Secure Digital memory card）等であってもよい。また、リムーバブル記録媒体９２７は、例えば、非接触型ＩＣチップを搭載したＩＣカード（Integrated Circuit card）または電子機器等であってもよい。 The drive 921 is a recording medium reader / writer, and is built in or externally attached to the information processing apparatus 100. The drive 921 reads information recorded on a removable recording medium 927 such as a mounted magnetic disk, optical disk, magneto-optical disk, or semiconductor memory, and outputs the information to the RAM 905. In addition, the drive 921 can write a record on a removable recording medium 927 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory. The removable recording medium 927 is, for example, a DVD medium, an HD-DVD medium, a Blu-ray medium, or the like. The removable recording medium 927 may be a compact flash (CF), a flash memory, an SD memory card (Secure Digital memory card), or the like. The removable recording medium 927 may be, for example, an IC card (Integrated Circuit card) on which a non-contact IC chip is mounted, an electronic device, or the like.

接続ポート９２３は、機器を情報処理装置１００に直接接続するためのポートである。接続ポート９２３の一例として、ＵＳＢ（Universal Serial Bus）ポート、ＩＥＥＥ１３９４ポート、ＳＣＳＩ（Small Computer System Interface）ポート等がある。接続ポート９２３の別の例として、ＲＳ−２３２Ｃポート、光オーディオ端子、ＨＤＭＩ（High-Definition Multimedia Interface）ポート等がある。この接続ポート９２３に外部接続機器９２９を接続することで、情報処理装置１００は、外部接続機器９２９から直接各種のデータを取得したり、外部接続機器９２９に各種のデータを提供したりする。 The connection port 923 is a port for directly connecting a device to the information processing apparatus 100. Examples of the connection port 923 include a USB (Universal Serial Bus) port, an IEEE 1394 port, and a SCSI (Small Computer System Interface) port. As another example of the connection port 923, there are an RS-232C port, an optical audio terminal, an HDMI (High-Definition Multimedia Interface) port, and the like. By connecting the external connection device 929 to the connection port 923, the information processing apparatus 100 acquires various data directly from the external connection device 929 or provides various data to the external connection device 929.

通信装置９２５は、例えば、通信網９３１に接続するための通信デバイス等で構成された通信インターフェースである。通信装置９２５は、例えば、有線または無線ＬＡＮ（Local Area Network）、Ｂｌｕｅｔｏｏｔｈ（登録商標）、またはＷＵＳＢ（Wireless USB）用の通信カード等である。また、通信装置９２５は、光通信用のルータ、ＡＤＳＬ（Asymmetric Digital Subscriber Line）用のルータ、または、各種通信用のモデム等であってもよい。この通信装置９２５は、例えば、インターネットや他の通信機器との間で、例えばＴＣＰ／ＩＰ等の所定のプロトコルに則して信号等を送受信することができる。また、通信装置９２５に接続される通信網９３１は、有線または無線によって接続されたネットワーク等により構成され、例えば、インターネット、家庭内ＬＡＮ、赤外線通信、ラジオ波通信または衛星通信等であってもよい。 The communication device 925 is, for example, a communication interface configured by a communication device or the like for connecting to the communication network 931. The communication device 925 is, for example, a communication card for wired or wireless LAN (Local Area Network), Bluetooth (registered trademark), or WUSB (Wireless USB). The communication device 925 may also be a router for optical communication, a router for ADSL (Asymmetric Digital Subscriber Line), or a modem for various types of communication. The communication device 925 can transmit and receive signals and the like to and from the Internet or other communication devices in accordance with a predetermined protocol such as TCP/IP. The communication network 931 connected to the communication device 925 is configured by a network connected by wire or wirelessly, and may be, for example, the Internet, a home LAN, infrared communication, radio wave communication, or satellite communication.

An example of a hardware configuration capable of realizing the functions of the information processing apparatus 100 according to the embodiment of the present invention has been described above. Each of the above components may be configured using general-purpose members, or may be configured by hardware specialized for the function of that component. The hardware configuration to be used can therefore be changed as appropriate according to the technical level at the time the embodiment is carried out.

<4. Summary>
(Example of configuration and effect of embodiment)
The embodiment of the present invention described above provides an information processing apparatus including: an N-ary value generation unit that, for data having position information in a feature space defined by D-dimensional coordinates (D = 2, 3, ...), generates an N-ary value in which the coordinate values of the respective dimensions, each expressed as an N-ary number (N = 2, 3, ...) having a predetermined number of digits, are arranged one digit at a time for each dimension in order; and a clustering unit that classifies the data sharing the upper k digits (k = 1, 2, ...) of the N-ary value into the same cluster.
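The digit arrangement and the upper-k-digit comparison described above can be sketched as follows. This is only an illustrative sketch: the function names `to_digits`, `interleave`, and `cluster_key` are hypothetical, and the coordinates are assumed to have already been quantized to non-negative integers that fit in the chosen number of digits.

```python
def to_digits(value, base, width):
    """Return `value` as exactly `width` base-`base` digits, most significant first."""
    if not 0 <= value < base ** width:
        raise ValueError("value does not fit in the requested number of digits")
    digits = []
    for _ in range(width):
        value, digit = divmod(value, base)
        digits.append(digit)
    return digits[::-1]


def interleave(coords, base, width):
    """Arrange the digits of every coordinate one digit per dimension, in order."""
    per_dim = [to_digits(c, base, width) for c in coords]
    # Take digit i of every dimension before moving on to digit i + 1.
    return [per_dim[d][i] for i in range(width) for d in range(len(coords))]


def cluster_key(coords, base, width, k):
    """Upper k digits of the interleaved value; equal keys mean the same cluster."""
    return tuple(interleave(coords, base, width)[:k])


# D = 2, N = 2, four digits per coordinate (assumed values for illustration).
code = interleave([5, 3], base=2, width=4)  # 0101 and 0011 -> [0, 0, 1, 0, 0, 1, 1, 1]
same = cluster_key([5, 3], 2, 4, k=4) == cluster_key([4, 2], 2, 4, k=4)  # True
```

In this sketch, the points (5, 3) and (4, 2) share the upper four digits and therefore fall into the same cluster, whereas a distant point such as (12, 3) does not.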

According to such a configuration, the clustering of data having position information can be replaced by sorting the N-ary values associated with the data. Distance calculations between pieces of position information are replaced by numerical magnitude comparisons, and the number of operations itself is reduced. As a result, clustering requires fewer processor and storage resources, for example, and runs faster.

According to such a configuration, a hierarchical structure of clusters can be constructed easily. Furthermore, because the entire hierarchical structure need not be held, storage resources can be saved.
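One way to see why the whole hierarchy need not be stored: when k = D × m, the upper k digits of an N-ary value identify the cluster at the m-th layer of an N^D-ary tree, so any layer can be derived on demand from the value itself. The following is a minimal sketch under the assumption that the N-ary value is held as a string with one character per digit; the names `layer_prefix` and `ancestors` are hypothetical.

```python
def layer_prefix(code, dims, layer):
    """Cluster identifier at `layer`: the upper dims * layer digits of `code`."""
    k = dims * layer
    if not 0 <= k <= len(code):
        raise ValueError("layer is outside the range covered by the code")
    return code[:k]


def ancestors(code, dims):
    """Cluster identifiers from layer 1 down to the deepest layer of `code`."""
    depth = len(code) // dims
    return [layer_prefix(code, dims, m) for m in range(1, depth + 1)]


# D = 2 and an 8-digit binary value (assumed for illustration).
chain = ancestors("00100111", dims=2)
# chain == ["00", "0010", "001001", "00100111"]
```

Each entry is a prefix of the next, so the parent of a cluster is obtained by dropping the last D digits of its identifier.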

According to such a configuration, the data classified into the same cluster can be identified easily from the result of the sorting.

According to such a configuration, the cluster specifying information need not include information that individually specifies each piece of data classified into a cluster. Fewer resources are therefore required to generate and store the cluster specifying information, and the clustering process is accelerated.
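As a rough sketch of this idea, data sharing the same upper k digits become contiguous once the N-ary values are sorted, so each cluster can be described by the first position at which it appears and the number of data it contains. The function name `cluster_runs` and the string representation of the N-ary values (equal length, one character per digit) are assumptions made for illustration.

```python
from itertools import groupby


def cluster_runs(codes, k):
    """Sort the codes and describe each cluster as (prefix, first position, count)."""
    ordered = sorted(codes)  # equal-length digit strings sort in numeric order
    runs = []
    position = 0
    for prefix, group in groupby(ordered, key=lambda c: c[:k]):
        count = sum(1 for _ in group)
        runs.append((prefix, position, count))
        position += count
    return ordered, runs


ordered, runs = cluster_runs(
    ["00100111", "10100101", "00100100", "00110000"], k=4
)
# runs == [("0010", 0, 2), ("0011", 2, 1), ("1010", 3, 1)]
```

The members of a cluster are then recovered as `ordered[first:first + count]`, without storing a separate list of members per cluster.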

According to such a configuration, merging is performed only on clusters determined to be adjacent as a result of the sorting. Compared with attempting to merge all clusters, fewer resources are required, and the clustering process, including the merging of the generated clusters, is accelerated.

According to such a configuration, whether clusters are adjacent is determined for two directions in the feature space, so that clusters located near each other can be merged more reliably.

According to such a configuration, data having position information on the ground surface can be clustered at high speed using a grid defined by latitude and longitude. Further, by giving ranks to the clusters through sorting of this grid, the sorting of the clusters can be accelerated.

According to such a configuration, data having position information in a three-dimensional space can be clustered at high speed using blocks defined by an orthogonal coordinate system. In addition, when the ground surface is divided by blocks, the data can be clustered into clusters that are free of latitude-dependent distortion on the ground surface.

In the embodiment of the present invention, the process of merging the clusters may include a process of calculating the distance between the clusters and a process of merging the clusters when the calculated distance is equal to or less than a predetermined threshold.

According to such a configuration, the amount of processing required for merging can be reduced further, and the clustering process, including the merging of the generated clusters, is accelerated further.
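A threshold-based merge of this kind might look like the sketch below. The text does not fix how the distance between clusters is measured, so the Euclidean distance between centroids is used here purely as an assumption, and the name `merge_neighbours` is hypothetical. The clusters are assumed to be already sorted along one direction, so that only consecutive clusters are compared.

```python
import math


def merge_neighbours(clusters, threshold):
    """Merge consecutive clusters whose centroids lie within `threshold`.

    `clusters` is a list of point lists, already sorted along one direction.
    """
    def centroid(points):
        n = len(points)
        return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

    merged = [list(clusters[0])] if clusters else []
    for cluster in clusters[1:]:
        if math.dist(centroid(merged[-1]), centroid(cluster)) <= threshold:
            merged[-1].extend(cluster)  # close enough: fold into the previous cluster
        else:
            merged.append(list(cluster))
    return merged


result = merge_neighbours([[(0, 0), (1, 0)], [(2, 0)], [(10, 0)]], threshold=2)
# result == [[(0, 0), (1, 0), (2, 0)], [(10, 0)]]
```

Because each cluster is compared only with its predecessor in the sorted order, the number of distance calculations grows linearly with the number of clusters in this sketch.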

According to such a configuration, clusters that are closer to each other can be merged reliably, and the accuracy of the clustering, including the merging of the generated clusters, can be improved.

(Modification of the embodiment)
In the first embodiment, the feature space is the ground surface, and in the second embodiment, the feature space is a three-dimensional space including the earth. However, the present invention is not limited to such examples. For example, the feature space need not be a real space; it may be a color space such as RGB or YUV, or a higher-order feature space for expressing image features.

In the first embodiment, D = 2 and N = 2, and in the second embodiment, D = 3 and N = 2. However, the present invention is not limited to such examples. For example, D may be 4 or more; that is, the data may have position information in a feature space defined by coordinates of four or more dimensions. As an example, when D = 4 and the coordinates of each dimension are expressed as 16-digit binary numbers, the N-ary value generation unit generates a 16 × 4 = 64-digit binary value. The N-ary value generation unit may also generate an N-ary value using an arbitrary radix, such as octal, decimal, or hexadecimal. As an example, when D = 2 and the coordinates of each dimension are expressed as 29-digit hexadecimal numbers, the N-ary value generation unit generates a 29 × 2 = 58-digit hexadecimal value, and the data is classified into clusters of a 16^2 = 256-ary tree structure.
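The digit counts and tree arities in these examples follow from two simple relations: the generated value has (digits per coordinate) × D digits, and each layer of the hierarchy branches N^D ways. The short sketch below reproduces the figures given above; the function name `code_shape` is hypothetical.

```python
def code_shape(dims, base, width):
    """Return (total digits of the generated value, branching factor of the tree)."""
    total_digits = width * dims   # one digit per dimension, repeated `width` times
    branching = base ** dims      # children per node in the N^D-ary tree
    return total_digits, branching


binary_4d = code_shape(dims=4, base=2, width=16)   # (64, 16)
hex_2d = code_shape(dims=2, base=16, width=29)     # (58, 256)
```

For the first and second embodiments, the same relations give 2^2 = 4 and 2^3 = 8 branches per layer, respectively.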

In the first embodiment, the two-dimensional coordinates are latitude and longitude, and in the second embodiment, the three-dimensional coordinates form an orthogonal coordinate system consisting of an x coordinate, a y coordinate, and a z coordinate. However, the present invention is not limited to such examples. The two-dimensional, three-dimensional, and D-dimensional coordinates may belong to an orthogonal coordinate system, an oblique coordinate system, or a polar coordinate system.

The preferred embodiments of the present invention have been described in detail above with reference to the accompanying drawings, but the present invention is not limited to such examples. It is clear that a person having ordinary knowledge in the technical field to which the present invention pertains could conceive of various changes or modifications within the scope of the technical idea described in the claims, and it is understood that these also naturally belong to the technical scope of the present invention.

DESCRIPTION OF SYMBOLS
100 Information processing apparatus
101 N-ary value generation unit
103 Clustering unit
105 Merge unit
107 Sort unit
109 Adjacency determination unit
119 Storage unit
1001 Ground surface
1011, 2011 Content
1021, 2021 Cluster
1031 Grid
2001 Three-dimensional space
2031 Block

Claims

An information processing apparatus comprising:
an N-ary value generation unit that, for data having position information in a feature space defined by D-dimensional coordinates (D = 2, 3, ...), generates an N-ary value in which the coordinate values of the respective dimensions, each expressed as an N-ary number (N = 2, 3, ...) having a predetermined number of digits, are arranged one digit at a time for each dimension in order; and
a clustering unit that classifies the data sharing the upper k digits (k = 1, 2, ...) of the N-ary value into the same cluster.

The information processing apparatus according to claim 1, wherein,
when k = D × m (m = 1, 2, ...), the clustering unit classifies the data sharing the upper k digits of the N-ary value into the same cluster at the m-th layer of N^D-ary tree structure clusters.

The information processing apparatus according to claim 1 or 2, wherein
the clustering unit includes a clustering sort unit that sorts the data in order of the N-ary value, and identifies the data classified into the same cluster from the result of the sorting.

The information processing apparatus according to claim 3, wherein
the clustering unit generates cluster specifying information that specifies one cluster based on the first position at which the data classified into the one cluster appears in the result of the sorting and on the number of pieces of the data classified into the one cluster.

The information processing apparatus according to claim 1, further comprising:
a merging sort unit that sorts the clusters with respect to a first direction in the feature space based on a result of a first rank determination process based on the D-dimensional coordinates;
an adjacency determination unit that determines whether or not the clusters sorted with respect to the first direction are adjacent to each other in the first direction; and
a merge unit that merges the clusters determined to be adjacent to each other in the first direction.

The information processing apparatus according to claim 5, wherein
the merging sort unit further sorts the clusters with respect to a second direction in the feature space based on a result of a second rank determination process based on the D-dimensional coordinates,
the adjacency determination unit further determines whether or not the clusters sorted with respect to the second direction are adjacent to each other in the second direction, and
the merge unit further merges the clusters determined to be adjacent to each other in the second direction.

The information processing apparatus according to claim 5, wherein
the feature space is a ground surface,
the D-dimensional coordinates are two-dimensional coordinates consisting of latitude and longitude,
the cluster is a region including the position information of the data contained in a grid defined on the ground surface by the two-dimensional coordinates, and
the first rank determination process is a process of sorting the grids with respect to the first direction and giving the sort order of each grid as a rank to the clusters contained in that grid.

The information processing apparatus according to claim 1, wherein
the feature space is a three-dimensional space,
the D-dimensional coordinates are three-dimensional coordinates constituting an orthogonal coordinate system, and
the cluster is a region including the position information of the data contained in a block defined in the three-dimensional space by the three-dimensional coordinates.

An information processing method comprising:
generating, for data having position information in a feature space defined by D-dimensional coordinates (D = 2, 3, ...), an N-ary value in which the coordinate values of the respective dimensions, each expressed as an N-ary number (N = 2, 3, ...) having a predetermined number of digits, are arranged one digit at a time for each dimension in order; and
classifying the data sharing the upper k digits (k = 1, 2, ...) of the N-ary value into the same cluster.

A program that causes a computer to execute:
a process of generating, for data having position information in a feature space defined by D-dimensional coordinates (D = 2, 3, ...), an N-ary value in which the coordinate values of the respective dimensions, each expressed as an N-ary number (N = 2, 3, ...) having a predetermined number of digits, are arranged one digit at a time for each dimension in order; and
a process of classifying the data sharing the upper k digits (k = 1, 2, ...) of the N-ary value into the same cluster.

The program according to claim 10, further causing the computer to execute:
a process of sorting the clusters with respect to a first direction in the feature space based on a result of a first rank determination process based on the D-dimensional coordinates;
a process of determining whether or not the clusters sorted with respect to the first direction are adjacent to each other in the first direction; and
a process of merging the clusters determined to be adjacent to each other in the first direction.

The program according to claim 11, wherein the process of merging the clusters includes:
a process of calculating a distance between the clusters; and
a process of merging the clusters when the calculated distance is equal to or less than a predetermined threshold.

The program according to claim 11, wherein the process of merging the clusters includes:
a process of calculating a distance between the clusters;
a process of storing the clusters as merge candidate clusters when the calculated distance is equal to or less than a predetermined threshold; and
a process of merging the stored merge candidate clusters in ascending order of the distance between them.