JP5930551B2

JP5930551B2 - Residence point extraction method, residence point extraction device, and residence point extraction program

Info

Publication number: JP5930551B2
Application number: JP2014047201A
Authority: JP
Inventors: 伊藤 淳; 数原 良彦; 戸田 浩之; 鷲崎 誠司
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2014-03-11
Filing date: 2014-03-11
Publication date: 2016-06-08
Anticipated expiration: 2034-03-11
Also published as: JP2015170338A

Description

The present invention relates to a technique for obtaining a user's stay points from position information of a user terminal.

Position information acquired from feature phones, smartphones, and the like by means of GPS, wireless base stations, Wi-Fi, and so on is collected as positioning data and used to analyze users' geographical behavior. To understand where a user has visited, techniques exist that use the collected positioning data to extract points where the user stayed for a long time as stay points. Specifically, Mean-Shift (Non-Patent Document 1), a kernel-density-based clustering technique, is used to extract as stay points those places where positioning points, that is, the points indicated by the longitude and latitude at each positioning time, are dense.

Non-Patent Document 1: Dorin Comaniciu and one other, "Mean Shift: A Robust Approach Toward Feature Space Analysis", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 24, No. 5, May 2002, pp. 603-619

However, the conventional technique assumes that positioning points are concentrated at stay points. When the positioning interval varies, for example because a feature phone suppresses positioning while the user is not moving in order to save battery, the number of positioning points near a stay point decreases. In addition, the accuracy of positioning by GPS and the like varies with environmental factors such as whether the terminal is indoors or outdoors and the radio wave conditions. Because the conventional technique treats all positioning points equally, an extracted stay point can be pulled toward positioning points with large measurement errors and thus be displaced. These two problems may lower the extraction accuracy of stay points.

The present invention has been made in view of the above circumstances, and aims to improve the accuracy of extracting the stay points where a user stayed.

To solve the above problems, the stay point extraction method of claim 1 comprises, performed by a computer: a first step of storing position information collected from a user terminal in storage means as positioning data; a second step of reading each item of positioning data from the storage means and calculating the positioning accuracy of each item using the positioning error data included in it; a third step of treating each positioning point in the positioning data as a cluster and acquiring, for each cluster, a plurality of positioning points that lie within a certain region of the cluster and whose positioning times fall within a certain time; a fourth step of calculating, from the positioning accuracy and positioning data of the plurality of positioning points, a weighted centroid with the positioning accuracy as the weight, and setting it as the center point of the cluster; a fifth step of merging clusters whose center points are separated by no more than a certain distance; a sixth step of arranging the positioning points of the clusters contained in a merged cluster in order of positioning time and splitting the merged cluster when the interval between the positioning times of two adjacent positioning points is equal to or longer than a certain time; and a seventh step of extracting the center points of the merged clusters and of the clusters split after merging as the user's stay points.

According to the present invention, the positioning accuracy of each item of positioning data is calculated from the positioning error data included in it; each positioning point is treated as a cluster; for any given cluster, a plurality of positioning points lying within a certain region and within a certain positioning-time interval are acquired; a weighted centroid is calculated from their positioning accuracy and positioning data with the positioning accuracy as the weight; and that weighted centroid is set as the cluster's center point and extracted as the user's stay point. Even when positioning accuracy varies with environmental factors, this suppresses the displacement of stay points caused by inaccurate positioning points, so the stay points where the user stayed can be extracted accurately.

The stay point extraction method of claim 2 is the method of claim 1, further comprising, between the second step and the third step, a step of adding new positioning data between items of positioning data whose positioning times are separated by more than a certain time and whose positioning points are separated by less than a certain distance.

According to the present invention, new positioning data is added between items of positioning data whose positioning times are separated by more than a certain time and whose positioning points are separated by less than a certain distance. Even when the positioning interval varies, many items of positioning data can thus be interpolated at places where a stay is likely, so the stay points where the user stayed can be extracted still more accurately.

The stay point extraction device of claim 3 comprises: storage means for storing position information collected from a user terminal as positioning data; calculation means for reading each item of positioning data from the storage means and calculating the positioning accuracy of each item using the positioning error data included in it; acquisition means for treating each positioning point in the positioning data as a cluster and acquiring, for each cluster, a plurality of positioning points that lie within a certain region of the cluster and whose positioning times fall within a certain time; calculation means for calculating, from the positioning accuracy and positioning data of the plurality of positioning points, a weighted centroid with the positioning accuracy as the weight, and setting it as the center point of the cluster; merging means for merging clusters whose center points are separated by no more than a certain distance; splitting means for arranging the positioning points of the clusters contained in a merged cluster in order of positioning time and splitting the merged cluster when the interval between the positioning times of two adjacent positioning points is equal to or longer than a certain time; and extraction means for extracting the center points of the merged clusters and of the clusters split after merging as the user's stay points.

The stay point extraction device of claim 4 is the device of claim 3, further comprising interpolation means for adding new positioning data between items of positioning data whose positioning times are separated by more than a certain time and whose positioning points are separated by less than a certain distance.

The stay point extraction program of claim 5 causes a computer to execute the stay point extraction method of claim 1 or 2.

According to the present invention, the accuracy of extracting the stay points where a user stayed can be improved.

FIG. 1 is a diagram showing the functional block configuration of the stay point extraction device.
FIG. 2 is a diagram showing a configuration example of the positioning data DB.
FIG. 3 is a diagram showing the processing flow of the positioning data extraction unit.
FIG. 4 is a diagram showing the processing flow of the positioning data interpolation unit.
FIG. 5 is a diagram showing an example of linear interpolation of positioning data.
FIG. 6 is a diagram showing the processing flow of the stay point extraction unit.
FIG. 7 is a diagram showing a configuration example of the stay point DB.
FIG. 8 is a diagram showing an example of stay point extraction.

An embodiment of the present invention is described below with reference to the drawings.

FIG. 1 is a diagram showing the functional block configuration of a stay point extraction device 1 according to the present embodiment. The stay point extraction device 1 comprises a positioning data DB 11, a positioning data extraction unit 12, a positioning data interpolation unit 13, a stay point extraction unit 14, and a stay point DB 15. The function of each unit is described in detail below.

The positioning data DB 11 is a functional unit that collects and stores, as positioning data, position information acquired from user terminals such as feature phones and smartphones by means of GPS, mobile wireless base stations, Wi-Fi, and the like.

FIG. 2 shows a configuration example of the positioning data DB 11. Positioning data consists of a user ID (uid), positioning time (time), longitude (longitude), latitude (latitude), positioning method (method), positioning error (accuracy), and so on. The positioning error is the degree of inaccuracy of a position fix measured by GPS or the like, and is a parameter value already included in the collected position information. In the configuration example of FIG. 2 it is expressed in meters. Other information, such as altitude or acceleration, may also be included in the positioning data.
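As an illustration of the record layout just described, the fields of FIG. 2 could be held in a small Python structure. This is only a sketch: the class name `PositioningRecord` and the sample values are hypothetical, and only the field names come from the text.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class PositioningRecord:
    """One row of the positioning data DB (field names follow FIG. 2)."""
    uid: str            # user ID
    time: datetime      # positioning time
    longitude: float    # degrees
    latitude: float     # degrees
    method: str         # positioning method, e.g. "GPS" or "Wi-Fi"
    accuracy: float     # positioning error in meters (larger means less precise)


# Hypothetical sample row, for illustration only.
rec = PositioningRecord("u001", datetime(2014, 1, 1, 0, 0), 139.7, 35.6, "GPS", 30.0)
```

Note that the field named accuracy holds the positioning error, so a larger value means a less reliable fix.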

The positioning data extraction unit 12 extracts from the positioning data DB 11 the positioning data corresponding to the user IDs listed in an input target user ID list. In doing so, it calculates the positioning accuracy from the positioning error according to the procedure of FIG. 3, described later, attaches it to the extracted positioning data, and outputs the result.

The positioning data interpolation unit 13 takes as input the positioning data output from the positioning data extraction unit 12 and, according to the procedure of FIG. 4, described later, linearly interpolates new positioning data between items of positioning data whose positioning times are separated by more than a certain time and whose positioning points are separated by less than a certain distance, adds it to the original positioning data, and outputs the result. A positioning point is the point indicated by the longitude and latitude at the time of positioning.

The stay point extraction unit 14 takes as input the positioning data output from the positioning data interpolation unit 13 and, according to the procedure of FIG. 6, described later, extracts stay points using a kernel-density-based clustering technique in which positioning points are weighted according to their positioning accuracy. The extracted stay points are output to and stored in the stay point DB 15.

Next, the operations of the positioning data extraction unit 12, the positioning data interpolation unit 13, and the stay point extraction unit 14 are described in detail. The positioning data extraction unit 12 is described first. FIG. 3 is a diagram showing the processing flow of the positioning data extraction unit 12.

First, in step S101, the user IDs included in a target user ID list given in advance are taken as input, and all positioning data corresponding to an unprocessed user ID is acquired from the positioning data DB 11. The subsequent step S102 is executed for each user ID.

Next, in step S102, the positioning accuracy A(a) of each item of positioning data is calculated by equation (1) from the positioning error a included in each item of positioning data acquired in step S101.

Here, A is the function that gives the positioning accuracy, a is the value of the positioning error, A_0 is the positioning accuracy when the positioning error is zero, and λ is the decay constant. With equation (1), the positioning accuracy A peaks at A_0 and decays exponentially as the positioning error grows. The strength of the decay can be adjusted through the decay constant λ. A_0 and λ can be set freely by hand in advance, but both must be positive. For example, A_0 = 1 and λ = 0.01 are used.

Any function other than equation (1) may be used, provided that the positioning accuracy decreases as the positioning error increases and never takes a negative value. A function that takes into account information other than the positioning error a may also be used, for example one that changes the values of A_0 and the decay constant λ according to the positioning method, such as GPS, wireless base station, or Wi-Fi.
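Equation (1) itself is not reproduced in this text. From the description (a peak of A_0 at zero error and exponential decay governed by λ), one form consistent with it would be A(a) = A_0 · exp(−λ·a). The sketch below assumes that form; the function name `positioning_accuracy` and the input checks are illustrative additions, not part of the patent.

```python
import math


def positioning_accuracy(error_m: float, a0: float = 1.0, decay: float = 0.01) -> float:
    """Assumed form of equation (1): A(a) = A_0 * exp(-lambda * a).

    error_m is the positioning error a in meters; a0 and decay correspond to
    A_0 and lambda, which the text requires to be positive.
    """
    if a0 <= 0 or decay <= 0:
        raise ValueError("a0 and decay must both be positive")
    if error_m < 0:
        raise ValueError("positioning error must be non-negative")
    return a0 * math.exp(-decay * error_m)


# With the example constants A_0 = 1 and lambda = 0.01:
weight_precise = positioning_accuracy(10.0)    # small error, weight close to 1
weight_coarse = positioning_accuracy(500.0)    # large error, weight close to 0
```

With these constants, a 10 m error keeps most of the weight, while a 500 m error contributes almost nothing to the weighted centroid computed later.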

The positioning accuracy calculated in this way is attached to the positioning data acquired in step S101, and the result is output to the positioning data interpolation unit 13 as new positioning data.

Then, in step S103, it is determined whether any unprocessed user ID remains. If so, the process returns to step S101, and steps S101 and S102 are performed for all user IDs. If not, the process ends.

Next, the operation of the positioning data interpolation unit 13 is described. FIG. 4 is a diagram showing the processing flow of the positioning data interpolation unit 13.

First, in step S201, the positioning data output from the positioning data extraction unit 12 is taken as input, and all positioning data corresponding to an unprocessed user ID is acquired. The subsequent steps S202 to S205 are executed for each user ID.

Next, in step S202, unprocessed adjacent positioning data is acquired. Specifically, from the positioning data acquired in step S201, each positioning point and the positioning point temporally adjacent to it are acquired as adjacent positioning data. Here, the set D of positioning data d is assumed to be arranged in ascending order of positioning time as {d_0, d_1, ..., d_{n-1} | n > 2, n ∈ N}, and the set D' of adjacent positioning data is expressed as {(d_0, d_1), (d_1, d_2), ..., (d_{n-2}, d_{n-1}) | n > 2, n ∈ N}. Here n is the total number of items of positioning data for the user ID currently being processed. The adjacent positioning data is then processed in ascending order of positioning time.

Next, in step S203, using a positioning time interval threshold T_lerp and a positioning distance interval threshold L_lerp set by hand in advance, it is determined whether the positioning times t_i, t_{i+1} and the positioning points l_i, l_{i+1} included in the adjacent positioning data d_i, d_{i+1} (0 ≤ i < n−1; i ∈ Z) satisfy the condition of equation (2).

Here, dist is a function that outputs, in meters, the shortest distance on the spheroid between two given positioning points l_i and l_{i+1}. For example, Hubeny's distance formula is used.
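For reference, Hubeny's formula can be sketched as below. The text does not say which reference ellipsoid is used, so the WGS84 constants here are an assumption, as is the function name `hubeny_distance`. The formula is an approximation that works well over the short distances relevant to stay point detection.

```python
import math

# Assumed ellipsoid: WGS84 (the patent text does not name one).
WGS84_A = 6378137.0             # semi-major axis in meters
WGS84_E2 = 0.00669437999014     # first eccentricity squared


def hubeny_distance(lon1: float, lat1: float, lon2: float, lat2: float) -> float:
    """Approximate distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi1 - phi2                      # latitude difference in radians
    d_lam = math.radians(lon1 - lon2)        # longitude difference in radians
    phi_m = (phi1 + phi2) / 2.0              # mean latitude
    w = math.sqrt(1.0 - WGS84_E2 * math.sin(phi_m) ** 2)
    m = WGS84_A * (1.0 - WGS84_E2) / w ** 3  # meridian radius of curvature
    n = WGS84_A / w                          # prime vertical radius of curvature
    return math.hypot(d_phi * m, d_lam * n * math.cos(phi_m))


# One degree of latitude near 35 degrees north is roughly 111 km.
one_degree_lat = hubeny_distance(139.0, 35.0, 139.0, 36.0)
```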

If the condition of equation (2) is satisfied, the suppression of the number of positioning operations described above is likely to be at work, so the process proceeds to step S204 and new positioning data is added between the two items of adjacent positioning data d_i and d_{i+1}. If the condition is not satisfied, the process proceeds to step S205 without performing step S204.

Next, in step S204, new positioning data is linearly interpolated at intervals given by an interpolation interval threshold S_lerp. Specifically, when interpolating linearly between adjacent positioning data d_i and d_{i+1}, the interpolated positioning data D_{i,i+1} calculated by equation (3) is added as new positioning data.

That is, the line segment between two adjacent positioning points is divided into S_lerp parts, and an interpolated positioning point is added at each division point. This adds (S_lerp − 1) interpolated positioning points. Here lon_i denotes the longitude of positioning point l_i and lat_i its latitude. FIG. 5 shows an example of linear interpolation of positioning data, for the case of T_lerp = 15, L_lerp = 200, and S_lerp = 4.
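Equations (2) and (3) are not reproduced in this text. The sketch below assumes the condition t_{i+1} − t_i > T_lerp and dist(l_i, l_{i+1}) < L_lerp, followed by evenly spaced interpolation of time, longitude, and latitude. The names `interpolate_pair` and `flat_dist`, the rough meters-per-degree constants, the reading of T_lerp = 15 as minutes, and the sample coordinates are all assumptions. How the positioning accuracy of an interpolated point is set is not stated in the text, so it is left out here.

```python
import math
from datetime import datetime, timedelta


def flat_dist(lon0, lat0, lon1, lat1):
    """Rough planar stand-in for dist, in meters (illustration only)."""
    mid = math.radians((lat0 + lat1) / 2)
    return math.hypot((lon1 - lon0) * 111_320 * math.cos(mid), (lat1 - lat0) * 110_540)


def interpolate_pair(p, q, t_lerp, l_lerp, s_lerp, dist=flat_dist):
    """Return the s_lerp - 1 interpolated fixes between p and q.

    p and q are (time, longitude, latitude) tuples in time order. An empty
    list is returned when the assumed condition of equation (2) is not met.
    """
    (t0, lon0, lat0), (t1, lon1, lat1) = p, q
    if not (t1 - t0 > t_lerp and dist(lon0, lat0, lon1, lat1) < l_lerp):
        return []
    fixes = []
    for j in range(1, s_lerp):
        f = j / s_lerp  # fraction of the way from p to q
        fixes.append((t0 + (t1 - t0) * f,
                      lon0 + (lon1 - lon0) * f,
                      lat0 + (lat1 - lat0) * f))
    return fixes


# Thresholds of the FIG. 5 example; the coordinates are made up.
p = (datetime(2014, 1, 1, 0, 0), 140.000, 40.0)
q = (datetime(2014, 1, 1, 0, 20), 140.001, 40.0)
new_fixes = interpolate_pair(p, q, timedelta(minutes=15), 200.0, 4)
```

Here the two fixes are 20 minutes and under 100 m apart, so three interpolated fixes are produced at 00:05, 00:10, and 00:15.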

Next, in step S205, it is determined whether any unprocessed adjacent positioning data remains. If so, the process returns to step S202, and steps S202 to S204 are performed for all unprocessed adjacent positioning data. If not, the process proceeds to step S206.

Next, in step S206, it is determined whether any unprocessed user ID remains. If so, the process returns to step S201, and steps S201 to S205 are performed for all user IDs. If not, the process proceeds to step S207.

Finally, in step S207, the positioning data to which linear interpolation has been applied by the processing so far, or the positioning data as it stands if no linear interpolation was needed, is output to the stay point extraction unit 14.

Next, the operation of the stay point extraction unit 14 is described. FIG. 6 is a diagram showing the processing flow of the stay point extraction unit 14.

First, in step S301, the positioning data output from the positioning data interpolation unit 13 is taken as input, and all positioning data corresponding to an unprocessed user ID is processed. The subsequent steps S302 to S311 are executed for each user ID.

Next, in step S302, every positioning point l included in the positioning data d acquired in step S301 is initialized as a cluster, and the loop count is set to zero. Initialization here means setting the center point d_i^c of each cluster to the value of the positioning point l_i itself. That is, the initial value of the center point d_i^c of the i-th cluster is the positioning data d_i.

A cluster is the minimum unit of a range in which the user is considered to have stayed at a given position. In steps S302 to S308, each positioning point is treated as a cluster; then, in steps S309 and S310, clusters are merged and merged clusters are split, so that the stay points at which the user remained are extracted for each fixed period of time.

Next, in step S303, the plurality of positioning points lying within a radius R (in meters) and within a time T (in seconds), both set by hand in advance, of the center of an unprocessed cluster (hereinafter, the target cluster) are acquired. Specifically, the set K of indices of the positioning points within radius R and time T is calculated by equation (4).

Here, n is the total number of items of positioning data for the user ID currently being processed.
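Equation (4) is not reproduced in this text. A plausible reading is K = { k | dist(d_i^c, l_k) ≤ R and |t_i^c − t_k| ≤ T, 0 ≤ k < n }. The sketch below assumes that reading, with inclusive bounds and an absolute time difference. To stay short it uses positions already projected to meters on a local plane in place of longitude and latitude with dist; the name `neighbor_indices` and the sample data are hypothetical.

```python
import math


def neighbor_indices(center, points, radius_m, window_s):
    """Index set K under the assumed reading of equation (4).

    center and each element of points are (t_seconds, x_m, y_m) tuples.
    """
    ct, cx, cy = center
    return [k for k, (t, x, y) in enumerate(points)
            if math.hypot(x - cx, y - cy) <= radius_m and abs(t - ct) <= window_s]


# Made-up fixes: one is close in space but far in time, one is far in space.
points = [(0, 0.0, 0.0), (300, 30.0, 40.0), (7200, 10.0, 0.0), (600, 500.0, 0.0)]
K = neighbor_indices(points[0], points, radius_m=100.0, window_s=1800)
```

The third fix is only 10 m away but two hours later, so it is excluded. This is what lets the later steps keep separate visits to the same place apart.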

Next, in step S304, using the positioning accuracy A(a_k) and the positioning data d_k of the plurality of positioning points l_k acquired in step S303, the weighted centroid d'_i^c, with the positioning accuracy as the weight, is calculated by equation (5).

The method of calculating the weighted centroid d'_i^c is now described concretely. Suppose, for example, that three items of positioning data d_k are acquired in step S303 and that the actual data of the acquired items d_1 to d_3 is as follows. The data structure on the right-hand side gives, from left to right, the positioning time, the longitude and latitude, and the positioning accuracy.

d_1 = {2014/1/1 00:00, [140, 40], 0.5}
d_2 = {2014/1/1 00:10, [141, 41], 0.3}
d_3 = {2014/1/1 00:20, [142, 42], 0.2}

Using these positioning data d_1 to d_3, the weighted centroids of the positioning time, longitude, latitude, and positioning accuracy are each calculated, and together they form the weighted centroid of the target cluster. For the positioning time t, for example, t = (0.5 × [2014/1/1 00:00] + 0.3 × [2014/1/1 00:10] + 0.2 × [2014/1/1 00:20]) ÷ (0.5 + 0.3 + 0.2) = [2014/1/1 00:07] is the weighted positioning time.

Similarly, for longitude lon and latitude lat, [lon, lat] = (0.5 × [140, 40] + 0.3 × [141, 41] + 0.2 × [142, 42]) ÷ (0.5 + 0.3 + 0.2) = [140.7, 40.7] is the weighted longitude and latitude.

By substituting the positioning accuracy and positioning data into equation (5) in this way, the weighted centroid of multidimensional information, such as the longitude, latitude, and time included in the positioning data, is calculated. Information other than longitude, latitude, and time, such as altitude, may also be included in the positioning data, in which case its centroid may be added as well.
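Equation (5) is not reproduced in this text, but the worked example implies the usual weighted mean, d'_i^c = Σ_{k∈K} A(a_k)·d_k ÷ Σ_{k∈K} A(a_k). The sketch below assumes that form and reruns the example with d_1 to d_3. The function name `weighted_centroid` is hypothetical, and the centroid of the accuracy component is left out for brevity.

```python
from datetime import datetime, timedelta


def weighted_centroid(records):
    """Weighted mean of time, longitude and latitude.

    records is a sequence of (time, longitude, latitude, weight) tuples,
    where weight is the positioning accuracy A(a_k).
    """
    records = list(records)
    total = sum(w for _, _, _, w in records)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    t0 = records[0][0]
    # Average times as offsets from the first record so datetimes can be weighted.
    secs = sum((t - t0).total_seconds() * w for t, _, _, w in records) / total
    lon = sum(x * w for _, x, _, w in records) / total
    lat = sum(y * w for _, _, y, w in records) / total
    return t0 + timedelta(seconds=secs), lon, lat


# d_1 to d_3 from the worked example.
d = [
    (datetime(2014, 1, 1, 0, 0), 140.0, 40.0, 0.5),
    (datetime(2014, 1, 1, 0, 10), 141.0, 41.0, 0.3),
    (datetime(2014, 1, 1, 0, 20), 142.0, 42.0, 0.2),
]
t_c, lon_c, lat_c = weighted_centroid(d)
```

Up to floating-point rounding, this reproduces the values in the text: a weighted time of 2014/1/1 00:07 and a weighted position of [140.7, 40.7].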

Next, in step S305, the center point d_i^c of the target cluster is moved to the weighted centroid d'_i^c calculated in step S304. Moving here means changing the center point d_i^c of the target cluster to the weighted centroid d'_i^c. The center of the target cluster thus moves to a position that reflects the positioning accuracy.

Next, in step S306, it is determined whether any unprocessed cluster remains. If so, the process returns to step S303, and steps S303 to S305 are performed for all clusters. If not, the process proceeds to step S307.

Next, in step S307, since the process of changing the center d_i^c of each cluster to the weighted centroid d'_i^c has been performed once for all clusters, the loop count is incremented and all clusters are marked unprocessed again.

Next, in step S308, it is determined whether the current loop count has reached a loop limit threshold I set by hand in advance. If it has not, the process returns to step S303, and steps S303 to S305 are performed again, this time taking the center points moved in step S305 as the reference.

In the processing up to this point, if there are, for example, 100 items of positioning data, 100 clusters are initially generated in step S302, and in steps S303 to S305 the center point of each cluster moves to a position that reflects the positioning accuracy, so the 100 clusters gather at the places where positioning data is dense. When the loop count has reached the loop limit threshold I, the process proceeds to step S309.
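The loop of steps S302 to S308 can be sketched roughly as follows. This is an illustration under simplifying assumptions rather than the patented implementation: positions are (t, x, y) tuples already projected to meters on a local plane, the neighbor test and weighted mean follow the assumed readings of equations (4) and (5), and the name `shift_centers` and the sample data are hypothetical.

```python
import math


def shift_centers(points, weights, radius, window, max_loops):
    """Move every cluster center toward dense, high-accuracy data.

    points is a list of (t, x, y) tuples and weights holds the positioning
    accuracy of each point. Returns the final center of each cluster.
    """
    centers = list(points)                    # S302: each point starts as a cluster
    for _ in range(max_loops):                # S307/S308: repeat up to I times
        moved = []
        for ct, cx, cy in centers:            # S303 to S306: visit every cluster
            near = [k for k, (t, x, y) in enumerate(points)
                    if math.hypot(x - cx, y - cy) <= radius and abs(t - ct) <= window]
            if not near:                      # nothing in range: leave the center as is
                moved.append((ct, cx, cy))
                continue
            total = sum(weights[k] for k in near)
            moved.append(tuple(                # S304/S305: move to the weighted centroid
                sum(points[k][j] * weights[k] for k in near) / total
                for j in range(3)))
        centers = moved
    return centers


# Two made-up groups of fixes, far apart in both space and time.
points = [(0, 0.0, 0.0), (60, 2.0, 0.0), (120, 0.0, 2.0),
          (10000, 1000.0, 1000.0), (10060, 1002.0, 1000.0)]
centers = shift_centers(points, [1.0] * 5, radius=50.0, window=600, max_loops=5)
```

After the loop, the first three centers coincide and the last two coincide, which is the gathering of clusters at dense places that the text describes.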

Next, in step S309, the gathered clusters are merged so that only as many clusters remain as there are places where positioning data is dense. Specifically, clusters whose center points are separated by no more than a cluster center distance threshold C, set by hand in advance, are merged. The distance between cluster center points is calculated with the function dist. The center point of the new merged cluster is calculated as a weighted centroid based on equation (5), where the elements of the index set K are the union of the positioning points included in the clusters being merged.
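One way to sketch the merge of step S309 is with a small union-find over cluster centers. The text does not say whether merging is transitive (A near B and B near C joining all three), so the transitive behavior here is an assumption. The name `merge_clusters`, the planar coordinates in meters, and the sample data are also hypothetical. Recomputing each merged center with equation (5) over the returned member sets is left to the caller.

```python
import math


def merge_clusters(centers, members, threshold):
    """Group clusters whose centers lie within threshold of each other.

    centers is a list of (x, y) positions in meters; members is a list of
    sets of positioning point indices, one per cluster. Returns the merged
    member sets (unions of the original sets).
    """
    parent = list(range(len(centers)))

    def find(i):
        # Follow parent links to the group representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(centers)):
        for j in range(i + 1, len(centers)):
            if math.dist(centers[i], centers[j]) <= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i, m in enumerate(members):
        groups.setdefault(find(i), set()).update(m)
    return list(groups.values())


# Three made-up clusters: the first two are 1 m apart, the third is far away.
merged = merge_clusters([(0.0, 0.0), (1.0, 0.0), (100.0, 100.0)],
                        [{0}, {1}, {2}], threshold=5.0)
```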

Next, in step S310, a cluster is split if its positioning data contain a time gap larger than the time interval threshold S, which is set manually in advance. That is, when the positioning points in the cluster are examined in ascending order of time and the interval between time h and time h+1 is larger than the time interval threshold S, the cluster is separated into one cluster made up of the positioning data up to time h and another made up of the positioning data from time h+1 onward. The center point of each of the two separated clusters is set to the center point of the original cluster. This processing makes it possible to distinguish, from a temporal viewpoint, clusters that were merged in step S309 on the basis of distance alone, such as visits to the same place on the outbound and return trips.
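A minimal sketch of the time-based split in step S310 follows. The text describes a split into two clusters at one gap; splitting at every gap larger than S is an assumption made here for generality. The names `split_by_time_gap` and `gap_threshold` (for the threshold S) are hypothetical.

```python
def split_by_time_gap(center, members, gap_threshold):
    """Split one cluster wherever consecutive positioning times differ by more than S.

    members are (t, x, y, w) tuples; every resulting cluster keeps the
    center point of the original cluster, as described for step S310.
    """
    if not members:
        return []
    ordered = sorted(members, key=lambda p: p[0])  # ascending time
    groups = [[ordered[0]]]
    for prev, cur in zip(ordered, ordered[1:]):
        if cur[0] - prev[0] > gap_threshold:
            groups.append([])  # start a new cluster after the gap
        groups[-1].append(cur)
    return [(center, group) for group in groups]
```

For example, a morning visit and an evening visit to the same station end up as two clusters with the same center but different time ranges.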

Next, in step S311, the center point of each remaining cluster is output as a stay point of the user and stored in the stay point DB 15. FIG. 7 shows a configuration example of the stay point DB 15, illustrating a case where a user stayed at one point. The stay start time (begin_time) and stay end time (end_time) of a stay point are defined as the minimum and maximum times, respectively, of the positioning points contained in the cluster. The stay duration (stay_time) is obtained by subtracting begin_time from end_time. The longitude (longitude) and latitude (latitude) are those of the cluster's center point, that is, the user's stay point.
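The record written in step S311 could be assembled as in the sketch below. The field names come from the description of FIG. 7. The function name `to_stay_record`, the dictionary representation, and the (longitude, latitude) ordering of the center tuple are assumptions for illustration.

```python
def to_stay_record(center, members):
    """Build one stay point record from a cluster.

    center is (longitude, latitude); members are (t, x, y, w) tuples.
    """
    times = [t for t, _, _, _ in members]
    begin, end = min(times), max(times)
    longitude, latitude = center
    return {
        "begin_time": begin,       # earliest positioning time in the cluster
        "end_time": end,           # latest positioning time in the cluster
        "stay_time": end - begin,  # duration of the stay
        "longitude": longitude,
        "latitude": latitude,
    }
```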

Then, in step S312, it is determined whether any unprocessed user ID remains. If one does, the process returns to step S301, so that steps S301 to S311 are performed for all user IDs. If none remains, the process ends.

As described above, according to the present embodiment, the positioning data extraction unit 12 calculates the positioning accuracy of each piece of positioning data using the positioning error included in that data. The stay point extraction unit 14 treats each positioning point in the positioning data as a cluster, acquires the plurality of positioning points that lie within the radius R and within the time T of the target cluster, calculates a weighted centroid with the positioning accuracy as the weight using the positioning accuracy and positioning data of those points, sets the weighted centroid as the center point of the target cluster, and extracts that center point as a stay point of the user. Therefore, even when the positioning accuracy varies with environmental factors, the location of a stay point can be kept from shifting under the influence of inaccurate positioning points.

Further, according to the present embodiment, the positioning data interpolation unit 13 adds new positioning data between pieces of positioning data whose positioning time interval is larger than the positioning time interval threshold and whose positioning points are closer together than the positioning distance interval threshold. Therefore, even when the acquisition interval of the positioning data varies, a large amount of positioning data can be interpolated at points where a stay is likely.

By performing the two main processes described above, the stay points where the user remained can be extracted accurately, as shown in FIG. 8.

Finally, the stay point extraction device 1 described in the present embodiment can be realized by a computer equipped with a memory and a CPU. The operations of the stay point extraction device 1 can also be implemented as a program, which can be installed on and executed by a computer or distributed via a communication network.

1 ... Stay point extraction device
11 ... Positioning data DB
12 ... Positioning data extraction unit
13 ... Positioning data interpolation unit
14 ... Stay point extraction unit
15 ... Stay point DB
S101 to S103, S201 to S207, S301 to S312 ... Steps

Claims

1. A stay point extraction method performed by a computer, comprising:
a first step of storing position information collected from a user terminal in a storage means as positioning data;
a second step of reading each piece of positioning data from the storage means and calculating the positioning accuracy of each piece of positioning data using positioning error data included in that positioning data;
a third step of treating each positioning point in the positioning data as a cluster and acquiring, for each cluster, a plurality of positioning points that lie within a certain area from the cluster and whose positioning times are within a certain time of the cluster;
a fourth step of calculating, using the positioning accuracy and positioning data of the plurality of positioning points, a weighted centroid with the positioning accuracy as the weight, and setting the weighted centroid as the center point of each cluster;
a fifth step of merging clusters whose center points are separated by a certain distance or less;
a sixth step of arranging the positioning points of the clusters included in the merged cluster in order of positioning time and separating the merged cluster when the interval between the positioning times of two adjacent positioning points is equal to or longer than a certain time; and
a seventh step of extracting, as stay points of the user, the center points of the merged cluster and of the clusters separated after merging.

2. The stay point extraction method according to claim 1, further comprising, between the second step and the third step, a step of adding new positioning data between pieces of positioning data whose positioning time interval is larger than a certain time and whose positioning points are closer together than a certain distance.

3. A stay point extraction device, comprising:
storage means for storing position information collected from a user terminal as positioning data;
calculation means for reading each piece of positioning data from the storage means and calculating the positioning accuracy of each piece of positioning data using positioning error data included in that positioning data;
acquisition means for treating each positioning point in the positioning data as a cluster and acquiring, for each cluster, a plurality of positioning points that lie within a certain area from the cluster and whose positioning times are within a certain time of the cluster;
calculating means for calculating, using the positioning accuracy and positioning data of the plurality of positioning points, a weighted centroid with the positioning accuracy as the weight, and setting the weighted centroid as the center point of each cluster;
merging means for merging clusters whose center points are separated by a certain distance or less;
separating means for arranging the positioning points of the clusters included in the merged cluster in order of positioning time and separating the merged cluster when the interval between the positioning times of two adjacent positioning points is equal to or longer than a certain time; and
extraction means for extracting, as stay points of the user, the center points of the merged cluster and of the clusters separated after merging.

4. The stay point extraction device according to claim 3, further comprising interpolation means for adding new positioning data between pieces of positioning data whose positioning time interval is larger than a certain time and whose positioning points are closer together than a certain distance.

5. A stay point extraction program that causes a computer to execute the stay point extraction method according to claim 1.