JP2000305940A

JP2000305940A - Method and device for retrieving time series data and storage medium storing time series data retrieval program

Info

Publication number: JP2000305940A
Application number: JP11114161A
Authority: JP
Inventors: Kazuhiro Otsuka; 和弘大塚; Tsutomu Horikoshi; 力堀越; Haruhiko Kojima; 治彦児島
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1999-04-21
Filing date: 1999-04-21
Publication date: 2000-11-02

Abstract

PROBLEM TO BE SOLVED: To retrieve similar data series with high accuracy and stability by automatically and objectively integrating the dissimilarities of plural feature values, which can express various characteristics of the data, according to a query from a user, and setting an overall dissimilarity. SOLUTION: Time-series data is input (S1) and stored in a database; plural first feature vectors expressing properties of the time-series data are calculated (S2) from the stored data at fixed time intervals; arbitrary query data given by the user is input (S3); and second feature vectors are calculated (S4) from the query data. The dissimilarity between the query data and the data at each time in the database is then calculated (S5) from the first and second feature vectors; the dissimilarities are integrated (S6) into an overall dissimilarity between the query data and the data at each time in the database; and the data at the applicable times are selected and output from the database.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION 1. Field of the Invention: The present invention relates to a time-series data retrieval method and apparatus and to a storage medium storing a time-series data retrieval program. More particularly, it relates to techniques for retrieving, from a large volume of time-series data accumulated in a database, data similar to arbitrary time-series data supplied by a user, and to a time-series data retrieval method and apparatus and a storage medium storing a time-series data retrieval program that are applied to video database search, to the analysis of time-series signals recording phenomena such as weather, earthquakes, and markets, and to data mining.

[0002]

2. Description of the Related Art: As a conventional technique for retrieving time-series data, an example that targets time-series images is given in Yasufumi Fujimoto, Hidehiko Iwasa, Naokazu Yokoya, and Haruo Takemura, "Video Retrieval Based on Similarity of Trajectories in Eigenspace", IEICE Technical Report, PRMU96-110, pp. 49-56, 1996.

[0003] In this method, the gray values of the pixels in each frame of a time-series image are expressed as a one-dimensional feature vector. A matrix is formed whose columns are these vectors, computed for every frame of the image sequences that make up the time-series image database. The mean of the feature vectors over all image sequences is subtracted from this matrix, and the covariance matrix of the result is computed. The eigenvalue problem of the covariance matrix is solved, the eigenvectors corresponding to the large eigenvalues are taken as the basis of an eigenspace, and the trajectory of each image sequence projected onto the eigenspace is computed. The distance between the eigenspace trajectory of the query image sequence and the eigenspace trajectory of each image sequence stored in the database is used as a measure of dissimilarity between time-series images, and image sequences with small distances are returned as the search result.
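
The eigenspace construction described above can be written compactly as follows. This is only a sketch of the cited prior-art procedure: the symbols, the 1/N normalization of the covariance matrix, the number K of retained eigenvectors, and the frame-by-frame sum used for the trajectory distance D are assumptions, since the text says only that a distance between trajectories is used.

```latex
\begin{aligned}
\bar{x} &= \frac{1}{N}\sum_{t=1}^{N} x_t, \qquad X = [\,x_1-\bar{x},\ \dots,\ x_N-\bar{x}\,] \\
C &= \frac{1}{N} X X^{\top}, \qquad C e_k = \lambda_k e_k \quad (\lambda_1 \ge \lambda_2 \ge \dots) \\
z_t &= [\,e_1,\ \dots,\ e_K\,]^{\top}(x_t-\bar{x}) \\
D(q, s) &= \sum_{\tau=1}^{T} \bigl\lVert z^{(q)}_{\tau} - z^{(s)}_{\tau} \bigr\rVert
\end{aligned}
```

Here x_t is the gray-value vector of frame t, z_t is its point on the eigenspace trajectory, and q and s denote the query sequence and a stored sequence of equal length T.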

[0004] When time-series images are the target, image features such as the motion vector field of patterns in the image, color, and texture are also commonly used as feature vectors, in addition to the feature vector describing the gray-value distribution mentioned above. A method widely used in the past computes, for each of these feature vectors, the dissimilarity between the query time-series image and the time-series images in the database as the difference between feature vectors, weights the dissimilarities obtained for the individual feature vectors, sums them to form an overall dissimilarity measure, and retrieves from the database the time-series images for which that measure is small.
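
The conventional weighted-sum scheme can be illustrated with a minimal sketch. The feature names, the Euclidean distance, and the weight values are assumptions chosen for illustration; the text does not fix any of them.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def weighted_dissimilarity(query_feats, db_feats, weights):
    """Weighted sum of per-feature distances (the conventional scheme).

    query_feats / db_feats: dict mapping a feature name to its vector.
    weights: dict mapping the same names to hand-chosen weights (assumed).
    """
    return sum(weights[name] * euclidean(query_feats[name], db_feats[name])
               for name in weights)

# Hypothetical two-feature example: gray values and motion.
query = {"gray": [0.0, 0.0], "motion": [1.0, 1.0]}
frame = {"gray": [3.0, 4.0], "motion": [1.0, 1.0]}
weights = {"gray": 0.5, "motion": 2.0}
total = weighted_dissimilarity(query, frame, weights)
```

The weights here have to be supplied by hand, which is the burden the following paragraphs identify as the problem.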

[0005]

Problems to Be Solved by the Invention: However, the conventional method described above uses plural feature vectors, such as image gray values, motion vector field, color, and texture, computes for each of them the dissimilarity between the query data specified by the user and the data in the database, and weights these dissimilarities to compute an overall dissimilarity. Doing so requires setting the dissimilarity measure and determining the weighting coefficient for each feature. Conventionally, the dissimilarity measure has mainly been determined empirically in advance by the designer, and the weights have been set manually by the designer or the user.

[0006] Moreover, even within the same database, the characteristics that should be emphasized generally change with the characteristics of the data, so the weighting between features must be changed from time to time during retrieval. The retrieval operation is therefore inefficient, and retrieval accuracy is unstable and degraded. If the features are instead narrowed down to a small number, such as the gray-level distribution of the image, the problem of determining the weights between features is eased, but the ability to express the diverse properties contained in the original data declines, and as a result it becomes difficult to retrieve the similar data the user wants.

[0007] The present invention has been made in view of the above points. Its object is to provide a time-series data retrieval method and apparatus, and a storage medium storing a time-series data retrieval program, that can retrieve similar data series with high accuracy and stability by automatically and objectively integrating the dissimilarities of plural features capable of expressing various characteristics of the data, in accordance with the characteristics of the user's query data, and setting an overall dissimilarity.

[0008]

Means for Solving the Problems: FIG. 1 is a diagram for explaining the principle of the present invention. The present invention (claim 1) is a time-series data retrieval method for retrieving data similar to a user's query data from a time-series database that records time-series data, in which: time-series data forming a time series is input and stored in a database (step 1); plural first feature vectors expressing properties of the time-series data are calculated from the stored time-series data at fixed time intervals (step 2); arbitrary query data given by the user is input (step 3); second feature vectors expressing properties of the query data are calculated (step 4); the dissimilarity between the query data and the data at each time in the database is calculated from the first feature vectors and the second feature vectors (step 5); the dissimilarities are integrated to calculate an overall dissimilarity between the query data and the data at each time in the database (step 6); and the data at the corresponding times are selected from the database in ascending order of overall dissimilarity and output (step 7).
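
Steps 1 to 7 can be sketched end to end as below. This is an illustrative outline only: the feature extractors, the Euclidean distance, and the `integrate` callable (here a plain sum) are hypothetical placeholders, not the integration model that the later claims describe.

```python
import math

def feature_vectors(sample):
    """Hypothetical feature extractors: several small vectors per sample."""
    mean = sum(sample) / len(sample)
    spread = max(sample) - min(sample)
    return {"mean": [mean], "spread": [spread]}

def distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def retrieve(series, query, integrate, top_k=3):
    # Steps 1-2: store the series and compute first feature vectors per time.
    database = list(series)
    first = [feature_vectors(s) for s in database]
    # Steps 3-4: take the query and compute its second feature vectors.
    second = feature_vectors(query)
    # Step 5: per-feature dissimilarity at every time in the database.
    per_feature = [{k: distance(f[k], second[k]) for k in second} for f in first]
    # Step 6: integrate into one overall dissimilarity per time.
    overall = [integrate(d) for d in per_feature]
    # Step 7: output times in ascending order of overall dissimilarity.
    order = sorted(range(len(database)), key=lambda t: overall[t])
    return [(t, database[t], overall[t]) for t in order[:top_k]]

series = [[0, 0, 0], [1, 2, 3], [5, 5, 5], [1, 2, 4]]
hits = retrieve(series, query=[1, 2, 3],
                integrate=lambda d: sum(d.values()), top_k=2)
```

Passing `integrate` as a parameter keeps step 6 replaceable, so a learned model can be substituted for the plain sum without touching the other steps.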

[0009] In the present invention (claim 2), the overall dissimilarity is calculated as follows. A partial data set, namely a set of data in the database whose dissimilarity from the query data is small, is selected. For each time in the selected partial data set, a first dissimilarity between the data at that time and the data at other times in the database is calculated, and a second dissimilarity between the two data after the respective times is calculated. The relationship between the first dissimilarity, calculated for each time in the selected partial data set against the data at other times, and the second dissimilarity between the two data after those times is modeled as the expected value of the second dissimilarity expressed as a function of the first dissimilarity. The first dissimilarities are then integrated by applying them to the model, which gives the overall dissimilarity between the query data and the data at each time in the database.

[0010] In the present invention (claim 3), when the first dissimilarity between the query data and the data at each time in the database is calculated for each feature vector, or when the second dissimilarity between the data at two times after those times is calculated, the following is done for each feature vector: for the two sequences of feature vectors corresponding to the two times concerned and the times preceding them, the sum of the distances between the feature vectors of the two sequences, or their average distance, is calculated while preserving the temporal order.
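
A minimal sketch of such an order-preserving sequence distance follows. The window length, the Euclidean distance, and the function name are assumptions for illustration; the text specifies only that distances are summed or averaged with temporal order preserved.

```python
import math

def distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def backward_sequence_distance(feats, t1, t2, length, average=False):
    """Compare the `length` vectors ending at t1 with those ending at t2.

    feats: list of feature vectors of one feature type, indexed by time.
    The k-th vector before t1 is paired with the k-th vector before t2,
    so temporal order is preserved. `length` is an assumed window size.
    """
    if min(t1, t2) - length + 1 < 0:
        raise ValueError("window reaches before the start of the series")
    total = sum(distance(feats[t1 - k], feats[t2 - k]) for k in range(length))
    return total / length if average else total

feats = [[0.0], [1.0], [2.0], [4.0], [5.0], [6.0]]
d_sum = backward_sequence_distance(feats, t1=2, t2=5, length=3)
d_avg = backward_sequence_distance(feats, t1=2, t2=5, length=3, average=True)
```

The same pairing applied forward in time (indices `t1 + k` and `t2 + k`) would give the distance between the partial sequences after the two times.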

[0011] In the present invention (claim 4), when the first dissimilarity between the data after each time in the selected partial data set and the data in the database is calculated, the following is done for an arbitrary single type of feature vector: for the two partial sequences of feature vectors corresponding to the times after the two specified times, the sum of the distances between the feature vectors of the two sequences, or their average distance, is calculated while preserving the temporal order.

[0012] In the present invention (claim 5), when the relationship between the first dissimilarity, calculated for each time in the selected partial data set against the data at other times, and the second dissimilarity between the two data after those times is modeled as the expected value of the second dissimilarity expressed as a function of the first dissimilarity, the relationship between the dissimilarity between the data at two times and the expected value of the dissimilarity between the two data after those times is approximated by a logistic model whose variable is the first dissimilarity, and the coefficients of the model are obtained by regression analysis.
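
One way to fit such a logistic model is sketched below. The specific form y = 1 / (1 + exp(-(a + b x))), the assumption that the expected second dissimilarity is scaled to lie strictly between 0 and 1, and the use of least squares on the logit are all illustrative choices; the text states only that a logistic model is fitted by regression analysis.

```python
import math

def fit_logistic(d1, d2_mean):
    """Least-squares fit of y = 1 / (1 + exp(-(a + b * x))).

    d1: first-dissimilarity values; d2_mean: expected second dissimilarity
    at each d1, assumed to lie strictly between 0 and 1.
    The fit is a linear regression on logit(y) = a + b * x.
    """
    z = [math.log(y / (1.0 - y)) for y in d2_mean]
    n = len(d1)
    mx, mz = sum(d1) / n, sum(z) / n
    sxx = sum((x - mx) ** 2 for x in d1)
    sxz = sum((x - mx) * (v - mz) for x, v in zip(d1, z))
    b = sxz / sxx
    a = mz - b * mx
    return a, b

def predict(a, b, x):
    """Expected second dissimilarity predicted for first dissimilarity x."""
    return 1.0 / (1.0 + math.exp(-(a + b * x)))

# Synthetic data generated from a known curve (a = -2, b = 4).
xs = [0.1 * i for i in range(1, 10)]
ys = [predict(-2.0, 4.0, x) for x in xs]
a, b = fit_logistic(xs, ys)
```

Once fitted per feature type, `predict` maps each feature's first dissimilarity onto a common scale, which is what makes the per-feature values comparable before they are combined.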

[0013] In the present invention (claim 6), when the relationship between the first dissimilarity, calculated for each time in the selected partial data set against the data at other times, and the second dissimilarity between the two data after those times is modeled as the expected value of the second dissimilarity expressed as a function of the first dissimilarity, the expected value of the dissimilarity between the two data after each time is calculated for the subset of data in which the second dissimilarity between the data at two times is at or below an arbitrary threshold, this expected value is approximated by a logistic model whose variable is the threshold on the first dissimilarity, and the coefficients of the model are obtained by regression analysis.

[0014] In the present invention (claim 7), time-series images are input as the time-series data. In the present invention (claim 8), with time-series images as the target data, the feature vector expressing the properties of the data is calculated for each unit time as the gray values of the pixels in the image, as the average luminance value obtained by spatially averaging those gray values, or as a vector whose elements are the averages of the gray values of the image within each cell of a mesh into which the image plane is divided.
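
The mesh-based feature can be sketched as follows. The list-of-lists image representation and the assumption that the image size is divisible by the mesh size are simplifications for illustration.

```python
def mesh_feature(image, rows, cols):
    """Average gray value in each cell of a rows x cols mesh.

    image: list of equal-length rows of gray values. The image height and
    width are assumed to be divisible by rows and cols respectively.
    """
    h, w = len(image), len(image[0])
    ch, cw = h // rows, w // cols
    feature = []
    for r in range(rows):
        for c in range(cols):
            cell = [image[y][x]
                    for y in range(r * ch, (r + 1) * ch)
                    for x in range(c * cw, (c + 1) * cw)]
            feature.append(sum(cell) / len(cell))
    return feature

image = [
    [0, 0, 10, 10],
    [0, 0, 10, 10],
    [20, 20, 30, 30],
    [20, 20, 30, 30],
]
vec = mesh_feature(image, rows=2, cols=2)
mean_luminance = mesh_feature(image, rows=1, cols=1)
```

With a 1 x 1 mesh the same function yields the spatially averaged luminance, and with one cell per pixel it returns the raw gray values, so the three alternatives named in the claim differ only in mesh resolution.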

[0015] In the present invention (claim 9), when the time-series data is a time-series image, a feature vector relating to the distribution of the gray values of the image is selected as the arbitrary single type of feature vector. In the present invention (claim 10), when the partial data set in the database is selected, the distance between the second feature vector of the query data and the first feature vector of the data at each time in the database is calculated for each type of feature vector, and the data at the times for which every such distance is smaller than a predetermined value are selected.
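
The selection of the partial data set in claim 10 can be sketched as a per-feature threshold test. The feature names, threshold values, and Euclidean distance are assumptions for illustration.

```python
import math

def distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def select_candidates(query_feats, db_feats, thresholds):
    """Times whose distance to the query is below the threshold
    for every feature type listed in `thresholds`."""
    return [t for t, feats in enumerate(db_feats)
            if all(distance(feats[k], query_feats[k]) < thresholds[k]
                   for k in thresholds)]

# Hypothetical database of four times with two feature types each.
query = {"gray": [1.0], "motion": [0.0]}
db = [
    {"gray": [1.2], "motion": [0.1]},
    {"gray": [5.0], "motion": [0.0]},
    {"gray": [0.9], "motion": [3.0]},
    {"gray": [1.0], "motion": [0.4]},
]
candidates = select_candidates(query, db, {"gray": 0.5, "motion": 0.5})
```

A time is kept only if it passes the test for every feature type, so a close match on one feature cannot compensate for a large distance on another.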

[0016] FIG. 2 is a diagram showing the principle configuration of the present invention. The present invention (claim 11) is a time-series data retrieval apparatus for retrieving data similar to a user's query data from a time-series database that records time-series data, comprising: data series input means 10 for inputting the time-series data that constitutes the database and storing it in a database 90; first feature vector calculating means 20 for calculating, from the time-series data stored in the database 90, plural first feature vectors expressing properties of the data at fixed time intervals; search query data series input means 30 for inputting arbitrary query data given by the user; second feature vector calculating means 40 for calculating, from the query data, second feature vectors expressing properties of that data; first dissimilarity calculating means 50 for calculating the dissimilarity between the query data and the data at each time in the database 90 from each first feature vector and the second feature vectors; second dissimilarity calculating means 60 for integrating the dissimilarities to calculate an overall dissimilarity between the query data and the data at each time in the database; search means 70 for selecting from the database 90 the data at the corresponding times in ascending order of overall dissimilarity; and output means 80 for outputting the data selected by the search means 70.

[0017] In the present invention (claim 12), the second dissimilarity calculating means 60 comprises: means for selecting a partial data set, namely a set of data in the database 90 whose dissimilarity from the query data is small; means for calculating, for each time in the selected data set, a first dissimilarity between the data at that time and the data at other times in the database; means for calculating a second dissimilarity between the two data after the respective times; means for modeling the relationship between the first dissimilarity, calculated for each time in the selected partial data set against the data at other times, and the second dissimilarity between the two data after those times, as the expected value of the second dissimilarity expressed as a function of the first dissimilarity; and means for integrating the first dissimilarities by applying them to the model and calculating the overall dissimilarity between the query data and the data at each time in the database 90.

[0018] In the present invention (claim 13), the means for calculating the first dissimilarity and the means for calculating the second dissimilarity include means for calculating, for each of the second feature vectors, the sum of the distances between the feature vectors of two sequences, or their average distance, while preserving the temporal order, the two sequences being the feature vectors corresponding to the two times concerned and the times preceding them.

[0019] In the present invention (claim 14), the means for calculating the first dissimilarity includes means for calculating, for an arbitrary single type of feature vector, the sum of the distances between the feature vectors of two partial sequences, or their average distance, while preserving the temporal order, the two partial sequences being the feature vectors corresponding to the times after the two specified times.

[0020] In the present invention (claim 15), the means for modeling the relationship between the first dissimilarity, calculated for each time in the selected partial data set against the data at other times, and the second dissimilarity between the two data after those times, as the expected value of the second dissimilarity expressed as a function of the first dissimilarity, includes means for approximating the relationship between the dissimilarity between the data at two times and the expected value of the dissimilarity between the two data after those times by a logistic model whose variable is the first dissimilarity, and obtaining the coefficients of the model by regression analysis.

[0021] In the present invention (claim 16), the means for modeling the relationship between the first dissimilarity, calculated for each time in the selected partial data set against the data at other times, and the second dissimilarity between the two data after those times, as the expected value of the second dissimilarity expressed as a function of the first dissimilarity, includes means for calculating the expected value of the dissimilarity between the two data after each time for the subset of data in which the second dissimilarity between the data at two times is at or below an arbitrary threshold, approximating it by a logistic model whose variable is the threshold on the first dissimilarity, and obtaining the coefficients of the model by regression analysis.

[0022] In the present invention (claim 17), the data series input means 10 inputs time-series images as the time-series data. In the present invention (claim 18), the first feature vector calculating means 20 includes means for calculating, when time-series images are input as the time-series data, the feature vector for each unit time as the gray values of the pixels in the image, as the average luminance value obtained by spatially averaging those gray values, or as a vector whose elements are the averages of the gray values of the image within each cell of a mesh into which the image plane is divided.

[0023] In the present invention (claim 19), the means for calculating the first dissimilarity includes means for selecting, when the time-series data is a time-series image, a feature vector relating to the distribution of the gray values of the image as the arbitrary single type of feature vector. In the present invention (claim 20), the means for selecting the partial data set in the database includes means for calculating, for each type of feature vector, the distance between the second feature vector of the query data and the first feature vector of the data at each time in the database, and selecting the data at the times for which every such distance is smaller than a predetermined value.

[0024] The present invention (claim 21) is a storage medium storing a time-series data retrieval program for retrieving data similar to a user's query data from a time-series database that records time-series data, the program comprising: a data series input process that inputs the time-series data constituting the database and stores it in the database; a first feature vector calculation process that calculates, from the time-series data stored in the database, plural first feature vectors expressing properties of the data at fixed time intervals; a search query data series input process that inputs arbitrary query data given by the user; a second feature vector calculation process that calculates, from the query data, second feature vectors expressing properties of that data; a first dissimilarity calculation process that calculates the dissimilarity between the query data and the data at each time in the database from each first feature vector and the second feature vectors; a second dissimilarity calculation process that integrates the dissimilarities to calculate an overall dissimilarity between the query data and the data at each time in the database; a search process that selects from the database the data at the corresponding times in ascending order of overall dissimilarity; and an output process that outputs the data selected by the search process.

[0025] In the present invention (claim 22), the second dissimilarity calculation process comprises: a process of selecting a partial data set, namely a set of data in the database whose dissimilarity from the query data is small; a process of calculating, for each time in the selected data set, a first dissimilarity between the data at that time and the data at other times in the database; a process of calculating a second dissimilarity between the two data after the respective times; a process of modeling the relationship between the first dissimilarity, calculated for each time in the selected partial data set against the data at other times, and the second dissimilarity between the two data after those times, as the expected value of the second dissimilarity expressed as a function of the first dissimilarity; and a process of integrating the first dissimilarities by applying them to the model and calculating the overall dissimilarity between the query data and the data at each time in the database.

[0026] In the present invention (claim 23), the process of calculating the first dissimilarity and the process of calculating the second dissimilarity include a process of calculating, for each of the second feature vectors, the sum of the distances between the feature vectors of two sequences, or their average distance, while preserving the temporal order, the two sequences being the feature vectors corresponding to the two times concerned and the times preceding them.

[0027] In the present invention (claim 24), the process of calculating the first dissimilarity includes a process of calculating, for an arbitrary single type of feature vector, the sum of the distances between the feature vectors of two partial sequences, or their average distance, while preserving the temporal order, the two partial sequences being the feature vectors corresponding to the times after the two specified times.

[0028] In the present invention (claim 25), the process of modeling the relationship between the first dissimilarity, calculated for each time in the selected partial data set against the data at other times, and the second dissimilarity between the two data after those times, as the expected value of the second dissimilarity expressed as a function of the first dissimilarity, includes a process of approximating the relationship between the dissimilarity between the data at two times and the expected value of the dissimilarity between the two data after those times by a logistic model whose variable is the first dissimilarity, and obtaining the coefficients of the model by regression analysis.

[0029] In the present invention (claim 26), the process of modeling the relationship between the first dissimilarity, calculated for each time in the selected partial data set against the data at other times, and the second dissimilarity between the two data after those times, as the expected value of the second dissimilarity expressed as a function of the first dissimilarity, includes a process of calculating the expected value of the dissimilarity between the two data after each time for the subset of data in which the second dissimilarity between the data at two times is at or below an arbitrary threshold, approximating it by a logistic model whose variable is the threshold on the first dissimilarity, and obtaining the coefficients of the model by regression analysis.

[0030] In the present invention (claim 27), the data series input process inputs time-series images as the time-series data. In the present invention (claim 28), the first feature vector calculation process includes a process of calculating, when time-series images are input as the time-series data, the feature vector for each unit time as the gray values of the pixels in the image, as the average luminance value obtained by spatially averaging those gray values, or as a vector whose elements are the averages of the gray values of the image within each cell of a mesh into which the image plane is divided.

[0031] In the present invention (claim 29), the process of calculating the first dissimilarity includes a process of selecting, when the time-series data is a time-series image, a feature vector relating to the distribution of the gray values of the image as the arbitrary single type of feature vector. In the present invention (claim 30), the process of selecting the partial data set in the database includes a process of calculating, for each type of feature vector, the distance between the feature vector of the query data and the feature vector of the data at each time in the database, and selecting the data at the times for which every such distance is smaller than a predetermined value.

[0032] As described above, the present invention converts the data at each time of the time-series data into a plurality of feature vectors and judges the degree of similarity between the original data from the differences between those feature vectors. Consequently, even for very large data such as video, processing is carried out on feature vectors with far fewer dimensions than the original data, which reduces the computational cost of a search. In addition, because desirable features that express various properties of the data are selected in advance according to the type of data and the intended use, the search matches the intended use more accurately than directly computing the similarity between highly redundant original data. Presenting the degree of similarity for each feature to the user at search time also lets the user grasp each property of the retrieved data.

[0033] The present invention also calculates the dissimilarity between the query data and the data in the database for each of a plurality of feature vectors and integrates those dissimilarities into an overall dissimilarity for the data as a whole. The user is therefore spared the work of manually setting weights for the dissimilarity of each feature, and searches can be carried out efficiently and quickly. Differences in search results arising from differences between individual users are also eliminated, so that searches can be executed with consistently high accuracy.

[0034] When setting the weights for the dissimilarity of each feature, the present invention models the relationship between the per-feature dissimilarity between two pieces of data at the time being searched and the expected value of the dissimilarity at subsequent times. It is therefore possible to give priority, in the search, to time-series data in the database whose temporal change after the time of the query data also matches.

[0035] Further, when the above relationship is modeled, selecting a set of data series in the database that are similar to the query data allows the model to be built from data series whose trend of change resembles that of the query time series. A search scale matched to the characteristics of the query data at each time can therefore be used.

[0036] The present invention also selects the one feature vector that matters most for the purpose of the search, and can select search candidates in ascending order of the expected average dissimilarity between the future values of the selected feature vector for the query time-series data and the future values of that feature vector for the candidate time-series data. Thus, in applications where past similar time-series data stored in the database are retrieved from the time-series data observed at present, and the temporal change of the retrieved data is used as a reference for predicting how the current data will change, choosing the feature that represents the most important property as this feature vector makes it possible to retrieve from the database time-series data whose future values of that feature are also similar.

[0037] For example, when the target is a time series of weather data images observing precipitation, feature vectors such as the spatial distribution of gray levels in the weather radar image, the velocity vector field, and the texture of the pattern surface are used. Selecting the feature vector for the spatial distribution of gray levels as the above feature vector and performing the search retrieves past radar images whose future spatial distribution of gray levels is also similar, so that upcoming changes in the current weather can be predicted from the retrieved radar images.

[0038] The present invention further models the relationship between the dissimilarity between data at two times and the dissimilarity between the two pieces of data after those times by expressing the expected value of the latter as a function of the former using a logistic model. This accurately approximates the property that, as the dissimilarity for each feature increases, the dissimilarity between the data after that time also grows. In addition, the coefficient applied to each variable is determined by regression analysis using a logistic model whose variables are the dissimilarities for the individual feature vectors, or the thresholds imposed on those dissimilarities. Testing the significance of each coefficient therefore makes it possible to select the useful feature vectors.

[0039]

[Embodiments of the invention]

FIG. 3 shows the configuration of the time-series data search device of the present invention. The device shown in the figure comprises an input unit 100, a processing unit 200, and an output unit 300. The input unit 100 comprises a data sequence input unit 101, which inputs the time-series data sequences to be stored in the database, and a search query data sequence input unit 102, which inputs the time-series data sequence that serves as the search query.

[0040] The processing unit 200 comprises: an original data sequence storage unit 201, which stores the data sequences input through the data sequence input unit 101; a feature extraction unit 202, which calculates feature vectors of the data at fixed time intervals from the data sequences stored in the original data sequence storage unit 201; a feature amount sequence storage unit 203, which stores the various feature vectors calculated by the feature extraction unit 202; a search scale setting unit 204, which sets the scale of similarity between the query data and the data in the database (the original data sequence storage unit 201); and a search unit 205, which takes as input the feature vectors calculated by the feature extraction unit 202 for the data input from the search query data sequence input unit 102 and searches for data sequences stored in the original data sequence storage unit 201 that are similar to the query data, using the criterion set by the search scale setting unit 204 as a function of the dissimilarities of the feature vectors.

[0041] The output unit 300 outputs the image sequences of the search results output from the processing unit 200 to a display device, a file device, or the like. FIG. 4 is a flowchart showing the outline of the operation of the time-series data search device of the present invention.

Step 101) First, a time-series data set for building the database is input through the data sequence input unit 101 and accumulated in the original data sequence storage unit 201 of the processing unit 200.

[0042] Step 102) The feature extraction unit 202 calculates feature vectors at fixed time intervals from the input time-series data and stores them in the feature amount sequence storage unit 203.

Step 103) Time-series data for the search query is input from the search query data sequence input unit 102, and the feature extraction unit 202 calculates its feature vectors.

[0043] Step 104) The search scale setting unit 204 sets the similarity scale for the search from the query data input from the search query data sequence input unit 102 and the data sequences in the original data sequence storage unit 201.

Step 105) The search unit 205 retrieves data similar to the query data sequence from the database, and the output unit 300 outputs it.
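The flow of steps 101 to 105 can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the implementation described in the patent: the names `build_index` and `search` are hypothetical, the feature extractor and the dissimilarity are passed in as plain callables, and step 104 (setting the search scale) is collapsed into whatever `dissimilarity` function the caller supplies.

```python
from typing import Callable, List, Sequence

Feature = List[float]


def build_index(frames: Sequence[Sequence[float]],
                extract: Callable[[Sequence[float]], Feature]) -> List[Feature]:
    """Steps 101-102: keep the series and compute one feature vector per time step."""
    return [extract(frame) for frame in frames]


def search(index: List[Feature],
           query_frames: Sequence[Sequence[float]],
           extract: Callable[[Sequence[float]], Feature],
           dissimilarity: Callable[[List[Feature], List[Feature]], float],
           top_k: int) -> List[int]:
    """Steps 103-105: featurize the query, score every window, return the best end times."""
    query = [extract(frame) for frame in query_frames]   # step 103
    length = len(query)
    scored = []
    for t in range(length - 1, len(index)):
        window = index[t - length + 1:t + 1]              # stored sub-sequence ending at t
        scored.append((dissimilarity(query, window), t))  # stands in for steps 104-105
    scored.sort()                                         # most similar first
    return [t for _, t in scored[:top_k]]
```

With a toy one-dimensional series and an L1 distance as the stand-in dissimilarity, `search` returns the end times of the stored windows that best match the query, most similar first.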

[0044]

[Examples]

An embodiment of the present invention is described below with reference to the drawings. Taking time-series images as an example of time-series data, the operation of the time-series data search device is described concretely, focusing on the processing unit 200. The original data sequence storage unit 201 in FIG. 3 is a database that accumulates the time-series data sequences input through the data sequence input unit 101. The target here is time-series images. The database can accept and store an image sequence that continues without temporal breaks as well as a collection of image sequences that are each continuous over an arbitrary interval, and the corresponding image can be retrieved by specifying the time or the frame number of an image frame.

[0045] The feature extraction unit 202 calculates, at fixed time intervals, a plurality of feature vectors expressing the properties of the data, both from the data sequences stored in the original data sequence storage unit 201 and from the data sequence input through the search query data sequence input unit 102. As an example for time-series images, three feature vectors are calculated here: a mesh feature representing the spatial distribution of the gray values of the patterns in the image, a velocity vector field representing the distribution of pattern motion, and a temporal texture feature expressing the fine motion and texture of the pattern surface.

[0046] FIG. 5 is a diagram for explaining examples of feature vectors when time-series images are the target data in an embodiment of the present invention. The mesh feature is obtained by dividing the image I(i, j, t) obtained at a time t, as in FIG. 5(a), into a mesh as in FIG. 5(b); it is the one-dimensional feature vector x₁(t) whose elements are the averages of the gray values I(i, j, t) of the pixels in each mesh cell. In addition, the average velocity vector of the patterns contained in each mesh cell of FIG. 5(b) is calculated to give a velocity vector field as in FIG. 5(c), and the one-dimensional vector whose elements are the components of that field is taken as the velocity vector field feature vector x₂(t).
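As an illustration of the mesh feature, the following minimal Python sketch averages the gray values inside each cell of a regular grid and flattens the result into a one-dimensional vector. The function name `mesh_feature` and the requirement that the image size be divisible by the mesh size are assumptions made for this sketch; the patent does not specify how partial cells are handled.

```python
from typing import List, Sequence


def mesh_feature(image: Sequence[Sequence[float]], rows: int, cols: int) -> List[float]:
    """Return the per-cell mean gray value of `image`, flattened row by row."""
    height, width = len(image), len(image[0])
    if height % rows or width % cols:
        # Simplification for this sketch only: no partial cells.
        raise ValueError("image size must be divisible by the mesh size")
    cell_h, cell_w = height // rows, width // cols
    feature = []
    for r in range(rows):
        for c in range(cols):
            cell = [image[i][j]
                    for i in range(r * cell_h, (r + 1) * cell_h)
                    for j in range(c * cell_w, (c + 1) * cell_w)]
            feature.append(sum(cell) / len(cell))
    return feature
```

A 4 x 4 image divided into a 2 x 2 mesh therefore yields a four-element vector, which is far smaller than the original pixel array.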

[0047] The temporal texture feature is an image feature expressing the fine motion and texture of the pattern surface, and it can quantify the nature of non-rigid motion patterns in which patterns appear and disappear. It is obtained by calculating the probability density distribution of the motion components of the patterns contained in a number of image frames and deriving from that distribution image features such as the diversity of motion contained in a local spatio-temporal region of the time-series image and the regularity of the arrangement of image elements. Here, a vector whose elements are quantities such as the dominant speed, the uniformity of motion, the occlusion ratio, the directionality of the contour arrangement, the coarseness of the contour arrangement, and the contour contrast is calculated at fixed time intervals and used as the feature vector x₃(t).

[0048] The feature vectors obtained for the data in the original data sequence storage unit 201 are stored in the feature amount sequence storage unit 203. At that point each feature vector is normalized with respect to its mean and variance, and the feature vectors obtained from the query data sequence input from the search query data sequence input unit 102 are normalized with the same parameters.
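The normalization step can be sketched as follows. The text does not say whether the mean and variance are taken per vector element or over the whole vector, so this sketch assumes per-element statistics estimated from the database and reused unchanged for the query; `fit_normalizer` and `normalize` are hypothetical names.

```python
import math
from typing import List, Sequence, Tuple


def fit_normalizer(vectors: Sequence[Sequence[float]]) -> Tuple[List[float], List[float]]:
    """Estimate per-element mean and standard deviation from the database vectors."""
    n = len(vectors)
    dims = len(vectors[0])
    means = [sum(v[d] for v in vectors) / n for d in range(dims)]
    stds = []
    for d in range(dims):
        variance = sum((v[d] - means[d]) ** 2 for v in vectors) / n
        stds.append(math.sqrt(variance) or 1.0)  # guard against constant elements
    return means, stds


def normalize(vector: Sequence[float], means: Sequence[float],
              stds: Sequence[float]) -> List[float]:
    """Apply the stored parameters; the same call is used for database and query vectors."""
    return [(x - m) / s for x, m, s in zip(vector, means, stds)]
```

Reusing the database parameters for the query keeps the two sets of feature vectors on the same scale, so distances between them remain comparable.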

[0049] The feature amount sequence storage unit 203 stores the feature vectors calculated by the feature extraction unit 202 and, in response to requests from the search scale setting unit 204 and the search unit 205, outputs the feature vector corresponding to the requested time. The search scale setting unit 204 calculates, for each type of feature vector, the dissimilarity with the feature vectors of the query data calculated by the feature extraction unit 202, and sets the search scale used to integrate those dissimilarities into the overall dissimilarity between the query data and the data at each time in the database.

[0050] The following method is described as one example. Suppose that the query data is a data sequence of L time steps, {T-L+1, …, T-1, T}, that includes the image observed at time T; this data sequence is denoted T. The following shows how to calculate the dissimilarity D(T, t) = {D_k(T, t) | k = 1, 2, 3} for each feature vector k between the query data at time T and the partial data sequence {t-L+1, …, t-1, t} at time t stored in the original data sequence storage unit 201. In this example, DP matching is used to compensate for temporal stretching and compression between the two partial sequences, and the dissimilarity is calculated as follows.

[0051]

[Math 1]

[0052] Here η is a constant and |·| is the Euclidean norm. V_k(T) is the speed of pattern change of the query data sequence and can be calculated as

$$V_k(T) = \lvert x_k(T) - x_k(T-1) \rvert \qquad (2)$$

Using this V_k(T), a normalized dissimilarity d_k(m, n) that does not depend on the speed of pattern change between the two times is calculated, and the dissimilarity D_k(T, t) is obtained as the value that minimizes the sum of d_k(m, n) over the time step width L of the query data. This method of calculating the dissimilarity is also used in the search unit 205.
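A DP-matching dissimilarity of this kind can be sketched as below. The exact formula the text refers to is not reproduced in this copy, so two assumptions are made here and should not be read as the patent's definition: the local cost is taken to be d_k(m, n) = |x_k(m) - x_k(n)| / (V_k(T) + η), and the standard three-way dynamic-programming recurrence is used for the alignment. The names `change_speed` and `dp_dissimilarity` are hypothetical.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def euclid(a: Vector, b: Vector) -> float:
    """Euclidean norm of the difference of two feature vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def change_speed(query: Sequence[Vector]) -> float:
    """V_k(T) = |x_k(T) - x_k(T-1)|, as in equation (2)."""
    return euclid(query[-1], query[-2])


def dp_dissimilarity(query: Sequence[Vector], candidate: Sequence[Vector],
                     eta: float = 1e-6) -> float:
    """Minimum accumulated cost of aligning the two sequences in temporal order."""
    speed = change_speed(query)
    n, m = len(query), len(candidate)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Assumed local cost: distance normalized by the query's change speed.
            local = euclid(query[i - 1], candidate[j - 1]) / (speed + eta)
            cost[i][j] = local + min(cost[i - 1][j],      # query step repeated
                                     cost[i][j - 1],      # candidate step repeated
                                     cost[i - 1][j - 1])  # both advance
    return cost[n][m]
```

Because the recurrence only ever moves forward in both sequences, the alignment preserves temporal order while still tolerating local stretching or compression.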

[0053] Next, using the method defined above, the dissimilarity between the query T and the data at each time in the original data sequence storage unit 201 is calculated for each feature vector k. Then, as the candidates for the data sequences to be retrieved, the set S_T of data sequences whose dissimilarities for the individual feature vectors satisfy the thresholds ξ = {ξ₁, ξ₂, ξ₃} given by the user is obtained.

[0054]

[Math 2]
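Selecting the candidate set S_T amounts to a per-feature threshold test. The sketch below assumes that a time t is kept only when every per-feature dissimilarity is strictly smaller than its threshold, following the wording "smaller than a predetermined value" used for claim 30; the function name `candidate_set` and the dictionary layout are hypothetical.

```python
from typing import Dict, List, Sequence


def candidate_set(dissimilarities: Dict[int, Sequence[float]],
                  thresholds: Sequence[float]) -> List[int]:
    """Return the times t whose dissimilarities D_k(T, t) are below xi_k for every k."""
    return sorted(
        t for t, values in dissimilarities.items()
        if all(d < xi for d, xi in zip(values, thresholds))
    )
```

For example, with thresholds (0.3, 0.3, 0.4), a time whose dissimilarities are (0.5, 0.1, 0.1) is rejected because the first feature alone exceeds its threshold.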

[0055] Next, the elements of the search candidate set S_T are used to set a scale for integrating the dissimilarities of the individual feature vectors into an overall dissimilarity. One example of such a method follows. First, for each time t ∈ S_T in the search candidate set, the dissimilarity D_k(t, τ) between the data at time t and the data at another time τ in the original data sequence storage unit 201 is calculated, together with the dissimilarity y(t, τ) between the two sequences at the times after t and τ, for the feature vector of interest. This dissimilarity y(t, τ) is called the prediction error. Here, the mesh feature x₁(t), described above as an example of the feature vectors of the feature extraction unit 202, is taken as the feature vector of interest. The prediction error y(t, τ) is then calculated as follows.

[0056]

[Math 3]

[0057] Next, from the pairs of dissimilarity D_k(t, τ) and prediction error y(t, τ) obtained for each element of the data set S_T selected in this way, the expected value e(ε) of the prediction error y(t, τ) is calculated over the pairs that satisfy the thresholds ε = (ε₁, ε₂, ε₃) imposed on the dissimilarities.

[0058]

[Math 4]
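The two quantities just described can be sketched in Python. Both formulas are images that are not reproduced in this copy, so the forms used here are assumptions: the prediction error is taken as the mean Euclidean distance between the feature vectors of interest over a hypothetical look-ahead `horizon`, and the expected value is taken as the plain average of y over the pairs whose dissimilarities are all at or below the thresholds.

```python
import math
from typing import List, Optional, Sequence, Tuple

Vector = Sequence[float]


def prediction_error(features: Sequence[Vector], t: int, tau: int, horizon: int) -> float:
    """Assumed form of y(t, tau): mean distance between x(t+h) and x(tau+h), h = 1..horizon."""
    total = 0.0
    for h in range(1, horizon + 1):
        a, b = features[t + h], features[tau + h]
        total += math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))
    return total / horizon


def expected_prediction_error(pairs: List[Tuple[Sequence[float], float]],
                              eps: Sequence[float]) -> Optional[float]:
    """Assumed form of e(eps): average y over pairs with D_k <= eps_k for every k."""
    selected = [y for d, y in pairs if all(dk <= ek for dk, ek in zip(d, eps))]
    if not selected:
        return None  # no pair satisfies the thresholds
    return sum(selected) / len(selected)
```

Evaluating `expected_prediction_error` on a grid of threshold vectors gives the sample points to which a regression model can later be fitted.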

[0059] Further, this expected value e(ε) of the prediction error is regarded as a function of the dissimilarity thresholds ε and is modeled by regression analysis. An example using a logistic model as one such model is shown here.

[0060]

[Math 5]

[0061] Here g(ε) is a polynomial in the thresholds ε_k, and b₀ is a coefficient. The expected value of the prediction error calculated with such a model,

[0062]

[Math 6]

[0063] is called the expected prediction error. For the set of search candidates satisfying given thresholds ε, the expected prediction error is an estimate of the average dissimilarity, in terms of the feature of interest, between the future data of those candidates and the future values of the query data. If the properties of the data at every time in the database are uniform, the search candidate set S_T used to model the expected value of the prediction error may instead be taken to be all the data in the database.
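One way to fit such a logistic model is sketched below. The model equation itself is not reproduced in this copy, so the sketch rests on labeled assumptions: the model is taken as e(ε) = b₀ / (1 + exp(-g(ε))), g(ε) is restricted to a first-order polynomial a₀ + Σ a_k ε_k, and b₀ is treated as a known upper asymptote so that the fit reduces to ordinary least squares on the logit. The names `solve`, `fit_logistic`, and `expected_error` are hypothetical.

```python
import math
from typing import List, Sequence


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_logistic(eps_samples: Sequence[Sequence[float]],
                 e_samples: Sequence[float], b0: float) -> List[float]:
    """Least-squares fit of g(eps) = a0 + sum(a_k * eps_k) on logit(e / b0)."""
    rows = [[1.0, *eps] for eps in eps_samples]
    z = [math.log(e / (b0 - e)) for e in e_samples]  # requires 0 < e < b0
    p = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    atz = [sum(r[i] * zi for r, zi in zip(rows, z)) for i in range(p)]
    return solve(ata, atz)


def expected_error(coeffs: Sequence[float], eps: Sequence[float], b0: float) -> float:
    """Evaluate the assumed logistic model at the threshold vector eps."""
    g = coeffs[0] + sum(a * e for a, e in zip(coeffs[1:], eps))
    return b0 / (1.0 + math.exp(-g))
```

A fit of this kind also yields one coefficient per feature, which is what a significance test on the coefficients would examine when deciding which features are useful.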

[0064] The model of equation (6) obtained in this way is output to the search unit 205. The search unit 205 selects the data sequences in the original data sequence storage unit 201 that are similar to the query data sequence input from the search query data sequence input unit 102 and outputs them to the output unit 300. The following method is shown as one example.

[0065] As explained for the search scale setting unit 204, for the search candidates in the search candidate set S_T obtained for the query data sequence at time T, the expected prediction error between the query data sequence and each candidate data sequence is calculated as the overall dissimilarity, using the search scale (the expected prediction error model) set by the search scale setting unit 204. The candidates are sorted in order of similarity, those satisfying the conditions specified by the user are selected, and they are output to the output unit 300 as the search results.

[0066] Here, as an example of the conditions specified by the user, consider the maximum number K of data sequences in the search results and a threshold e_TH on the expected prediction error that the set of retrieved data sequences must satisfy. The data sequences of the search results are then obtained as follows.

[0067]

[Math 7]
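The selection rule can be sketched as a sort followed by two filters. Since the formula is not reproduced in this copy, the sketch assumes the simplest reading: candidates are ranked by ascending expected prediction error, those above e_TH are dropped, and at most K are returned. The name `select_results` and the dictionary input are hypothetical.

```python
from typing import Dict, List


def select_results(expected_errors: Dict[int, float], k: int, e_th: float) -> List[int]:
    """Return up to k candidate times with expected prediction error <= e_th, best first."""
    ranked = sorted(expected_errors.items(), key=lambda item: item[1])
    return [t for t, err in ranked if err <= e_th][:k]
```

Raising K without changing e_TH can only add candidates that already satisfy the error threshold, so the quality bound on the result set is preserved.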

[0068] A concrete example using weather radar images follows. FIG. 6 shows an example of a search query data sequence given by a user in an embodiment of the present invention. The figure shows the first and last frames of the time-series images input as the search query data sequence. The unit time step is one hour, and the search query data sequence contains three hours of data, consisting of 36 frames.

[0069] As described for the feature extraction unit 202, three types of feature vector were used: the mesh feature, the velocity vector field, and the temporal texture feature. The original data sequence storage unit 201 held approximately 9000 hours (120,000 frames) of time-series images. The search scale set by the search scale setting unit 204 for the query data of FIG. 6 was

[0070]

[Math 8]

[0071] and is illustrated in FIG. 7. The points in the figure are the calculated sample points; compared with the model, drawn as a curved surface, the correlation coefficient is 0.989, showing that the modeling was successful. FIG. 8 shows the top five search results obtained for the search query data sequence of FIG. 6. The time-series images given by the user as the search query show a line-shaped pattern approaching land while its shape breaks up slightly, and it can be confirmed that patterns with the same tendency were obtained in the retrieved time-series images.
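A goodness-of-fit figure like the correlation coefficient quoted above can be computed as sketched below. The text does not say which correlation coefficient was used, so this sketch assumes the Pearson coefficient between the sampled expected values and the model's predictions at the same thresholds; `pearson` is a hypothetical helper name.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)
```

A value close to 1 indicates that the fitted surface tracks the sample points closely.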

[0072] The present invention can be realized on a computer, or an equivalent device, that has a hard disk or similar device from which stored data can be freely read, a device such as a display for showing and outputting the desired information, and a central processing unit or the like that controls them according to a predetermined procedure. To do so, a processing program or the equivalent describing some or all of the processing of each unit in the embodiment above, or the procedure or algorithm shown in the flowchart of FIG. 4, is supplied to the computer, which is made to control and execute it. The processing program or its equivalent can be recorded and distributed on a CD-ROM, floppy disk (FD), magneto-optical disk (MO), or a similar storage medium that the computer can read when executing it.

[0073] The present invention is not limited to the embodiment above and can be modified and applied in various ways within the scope of the claims.

[0074]

[Effects of the invention]

As described above, according to the present invention, a search for similar time-series data calculates feature vectors expressing the properties of the data, calculates the dissimilarity between the query data and the data in the database for each feature vector, integrates those dissimilarities automatically and objectively to set an overall dissimilarity between the query data and the data in the database, and uses that scale to select similar time-series data from the database. This makes simple, efficient, and accurate search possible even for complex data such as video recordings of natural phenomena.

[0075] Further, in the above method of integrating the dissimilarities of the individual feature vectors, the single most important feature vector is selected from the plurality of feature vectors, and the future dissimilarity between the query data and the candidate data is modeled as a function of the dissimilarities for the various feature vectors. Search candidates can therefore be selected that are similar to the query time-series data not only at present but also at future times, which is effective for applications that predict future changes in data on the basis of past data.

[Brief description of the drawings]

FIG. 1 is a diagram for explaining the principle of the present invention.

FIG. 2 is a diagram of the principle configuration of the present invention.

FIG. 3 is a configuration diagram of the time-series data search device of the present invention.

FIG. 4 is a flowchart showing the outline of the operation of the time-series data search device of the present invention.

FIG. 5 is a diagram for explaining examples of feature vectors when time-series images are the target data in an embodiment of the present invention.

FIG. 6 is an example of a search query data sequence given by a user in an embodiment of the present invention.

FIG. 7 is a diagram for explaining the search scale in an embodiment of the present invention.

FIG. 8 is an example of the data sequences of the search results obtained for a query data sequence in an embodiment of the present invention.

[Explanation of symbols]

10 Data series input means
20 First feature vector calculation means
30 Search query data series input means
40 Second feature vector calculation means
50 First dissimilarity calculation means
60 Second dissimilarity calculation means
70 Search means
80 Output means
90 Database
100 Input unit
101 Data sequence input unit
102 Search query data sequence input unit
200 Processing unit
201 Original data sequence storage unit
202 Feature extraction unit
203 Feature amount sequence storage unit
204 Search scale setting unit
205 Search unit
300 Output unit

Continued from the front page: (72) Inventor: Haruhiko Kojima, c/o Nippon Telegraph and Telephone Corporation, 3-19-2 Nishi-Shinjuku, Shinjuku-ku, Tokyo. F-terms (reference): 5B075 ND12 NK07 PQ02 PR06 UU40; 5L096 FA32 FA37 FA66 FA67 GA19 GA51 HA04 HA08 JA03 JA11

Claims

[Claims]

1. A time-series data search method for retrieving, from a time-series database that records time-series data, data similar to query data from a user, the method comprising: inputting time-series data forming a time series and storing it in the database; calculating, from the stored time-series data at predetermined time intervals, a plurality of first feature vectors expressing the properties of the time-series data; inputting arbitrary query data given by a user; calculating second feature vectors from the query data; calculating, from the first feature vectors and the second feature vectors, the dissimilarity between the query data and the data at each time in the database; integrating the dissimilarities to calculate an overall dissimilarity between the query data and the data at each time in the database; and selecting from the database, and outputting, the data at the corresponding times in ascending order of overall dissimilarity.

2. The time-series data search method according to claim 1, wherein calculating the overall dissimilarity comprises: selecting, from the database, a partial data set consisting of data having a small dissimilarity to the query data; calculating, for each time in the selected partial data set, a first dissimilarity between the data at that time and the data at other times in the database; calculating a second dissimilarity between the two pieces of data after the respective times; modeling the association between the first dissimilarity between data at other times, calculated for each time included in the selected partial data set, and the second dissimilarity between the two pieces of data after the time, as an expected value of the second dissimilarity expressed as a function of the first dissimilarity; and applying the first dissimilarity to the model to calculate the overall dissimilarity between the query data and the data at each time in the database.

3. The time-series data search method according to claim 1, wherein, when the first dissimilarity between the query data and the data at each time in the database is calculated for each feature vector, or when the second dissimilarity between the data at two times after the time is calculated for each feature vector, the sum or the average of the distances between corresponding feature vectors is calculated for two sequences of feature vectors, each corresponding to one time and the times before it, while preserving the temporal order.
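As a non-limiting illustration of the order-preserving comparison recited in claim 3, the sketch below pairs the k-th vector of one sequence with the k-th vector of the other and returns either the sum or the average of the pairwise distances. The names `window_before` and `sequence_dissimilarity`, the fixed window length, and the Euclidean distance are assumptions made for this example only.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def window_before(series, t, length):
    """Feature vectors for time t and the (length - 1) times preceding it."""
    start = t - length + 1
    if start < 0:
        raise ValueError("not enough history before time t")
    return series[start:t + 1]


def sequence_dissimilarity(seq_a, seq_b, average=False):
    """Sum (or average) of distances between vectors at matching positions.

    Position k of seq_a is only ever compared with position k of seq_b,
    so the temporal order of both sequences is preserved.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    total = sum(euclidean(u, v) for u, v in zip(seq_a, seq_b))
    return total / len(seq_a) if average else total


# Hypothetical one-dimensional feature series over five times.
series = [[0.0], [1.0], [2.0], [3.0], [5.0]]
a = window_before(series, 2, 3)  # times 0..2
b = window_before(series, 4, 3)  # times 2..4
d_sum = sequence_dissimilarity(a, b)
d_avg = sequence_dissimilarity(a, b, average=True)
```

The average form is convenient when windows of different lengths are compared across experiments, since it does not grow with the window length.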

4. The time-series data search method according to claim 2, wherein, when the first dissimilarity between the data after the time and the data in the database is calculated for each time in the selected partial data set, the sum or the average of the distances between corresponding feature vectors is calculated, for any one type of feature vector, for two subsequences of the feature vectors corresponding to the times after two times, while preserving the temporal order.

5. The described time-series data search method, wherein, when the association between the first dissimilarity between data at other times, calculated for each time included in the selected partial data set, and the second dissimilarity between the two pieces of data after the time is modeled as an expected value of the second dissimilarity expressed as a function of the first dissimilarity, the relationship between that dissimilarity and the expected value of the dissimilarity between the two pieces of data after the time is approximated by a logistic model having the first dissimilarity as a variable, and the coefficients of the model are obtained by regression analysis.
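A minimal sketch of fitting such a logistic model is given below for illustration. The claim states only that a logistic model with the first dissimilarity as its variable is used and that its coefficients are obtained by regression analysis; the specific form y = upper / (1 + exp(-(a + b x))) with a known upper asymptote `upper`, the logit linearization, and the ordinary least-squares fit are assumptions of this example, as are the names `fit_logistic` and `predict`.

```python
import math


def predict(x, a, b, upper):
    """Assumed logistic model: expected second dissimilarity at first dissimilarity x."""
    return upper / (1.0 + math.exp(-(a + b * x)))


def fit_logistic(xs, ys, upper):
    """Estimate (a, b) by least squares on the logit-transformed targets.

    Requires 0 < y < upper for every y, so that log(y / (upper - y)) exists.
    After the transform the model is linear: z = a + b * x.
    """
    zs = [math.log(y / (upper - y)) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_z = sum(zs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxz = sum((x - mean_x) * (z - mean_z) for x, z in zip(xs, zs))
    b = sxz / sxx
    a = mean_z - b * mean_x
    return a, b


# Hypothetical noise-free data generated from known coefficients.
true_a, true_b, upper = -2.0, 0.5, 10.0
xs = [float(i) for i in range(9)]
ys = [predict(x, true_a, true_b, upper) for x in xs]
a_hat, b_hat = fit_logistic(xs, ys, upper)
```

With noisy data a nonlinear least-squares or maximum-likelihood fit may be preferable to the linearized fit shown here; the choice is left open by the claim.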

6. The time-series data search method according to claim 2, wherein, when the association between the first dissimilarity between data at other times, calculated for each time included in the selected partial data set, and the second dissimilarity between the two pieces of data after the time is modeled as an expected value of the second dissimilarity expressed as a function of the first dissimilarity, the expected value of the dissimilarity between the two pieces of data after each time is calculated for a subset of data whose dissimilarity is equal to or less than an arbitrary threshold, the expected value is approximated by a logistic model having the threshold of the dissimilarity as a variable, and the coefficients of the model are obtained by regression analysis.
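The thresholded expected value of claim 6 might be tabulated as in the following sketch, which is illustrative only. It assumes that the thresholded quantity is the first dissimilarity and that the expected value is estimated by the sample mean of the second dissimilarity over the pairs at or below each threshold; the function name `expected_below_threshold` and the sample data are hypothetical. The resulting (threshold, expected value) points are what a logistic model in the threshold would then be fitted to.

```python
def expected_below_threshold(pairs, thresholds):
    """Mean second dissimilarity over pairs whose first dissimilarity <= threshold.

    pairs:      [(first dissimilarity, second dissimilarity), ...]
    thresholds: candidate threshold values
    Returns [(threshold, mean)] and skips thresholds that select no pairs.
    """
    points = []
    for th in thresholds:
        selected = [d2 for d1, d2 in pairs if d1 <= th]
        if selected:
            points.append((th, sum(selected) / len(selected)))
    return points


# Hypothetical (first, second) dissimilarity pairs.
pairs = [(1.0, 2.0), (2.0, 4.0), (3.0, 9.0)]
points = expected_below_threshold(pairs, [0.5, 1.0, 2.0, 3.0])
```

The threshold 0.5 selects no pairs and is therefore omitted from `points` rather than being assigned an arbitrary value.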

7. The time-series data search method according to claim 1, wherein a time-series image is input as the time-series data.

8. The time-series data search method according to claim 7, wherein, when a feature vector expressing the properties of the data is calculated with the time-series image as the target data, the feature vector is calculated for each unit time as the gray values of the pixels in the image, as an average luminance value obtained by spatially averaging the gray values, or as a vector whose elements are the averages of the gray values of the image contained in each mesh cell obtained by dividing the image plane into a mesh.
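Two of the image features named in claim 8 can be sketched as follows, purely as an illustration. The sketch assumes a grayscale frame stored as a list of rows and a mesh that divides the frame evenly; the names `mean_luminance` and `mesh_feature` are hypothetical.

```python
def mean_luminance(image):
    """Average gray value over the whole frame (a one-element feature)."""
    pixels = [p for row in image for p in row]
    return sum(pixels) / len(pixels)


def mesh_feature(image, mesh_rows, mesh_cols):
    """Vector of per-cell average gray values, cells read in row-major order."""
    height, width = len(image), len(image[0])
    if height % mesh_rows or width % mesh_cols:
        raise ValueError("mesh must divide the image evenly in this sketch")
    cell_h, cell_w = height // mesh_rows, width // mesh_cols
    feature = []
    for i in range(mesh_rows):
        for j in range(mesh_cols):
            cell = [
                image[r][c]
                for r in range(i * cell_h, (i + 1) * cell_h)
                for c in range(j * cell_w, (j + 1) * cell_w)
            ]
            feature.append(sum(cell) / len(cell))
    return feature


# Hypothetical 4x4 grayscale frame.
frame = [
    [0, 0, 10, 10],
    [0, 0, 10, 10],
    [20, 20, 30, 30],
    [20, 20, 30, 30],
]
luminance = mean_luminance(frame)
mesh = mesh_feature(frame, 2, 2)
```

Applying either function to every frame of a time-series image yields one feature vector per unit time, which is the form the dissimilarity calculations operate on.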

9. The time-series data search method according to claim 4, wherein, when the time-series data is a time-series image, a feature vector relating to the distribution of gray values of the image is selected as the arbitrary one type of feature vector.

10. The described time-series data search method, wherein, when the partial data set in the database is selected, the distance between the second feature vector of the query data and the first feature vector of the data at each time in the database is calculated for each of the various feature vectors, and the data at times at which each distance is smaller than a predetermined value are selected.
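The selection of the partial data set in claim 10 can be illustrated by the sketch below. The function name `select_candidates`, the dictionary layout, the Euclidean distance, and the per-feature thresholds are assumptions of this example; the claim requires only that every per-feature distance be smaller than a predetermined value.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def select_candidates(db_features, query_features, thresholds):
    """Times at which every per-feature distance to the query is below its threshold.

    db_features:    {feature name: [vector at time 0, vector at time 1, ...]}
    query_features: {feature name: query vector}
    thresholds:     {feature name: predetermined value for that feature}
    """
    n_times = len(next(iter(db_features.values())))
    return [
        t
        for t in range(n_times)
        if all(
            euclidean(db_features[name][t], query_features[name]) < thresholds[name]
            for name in query_features
        )
    ]


# Hypothetical data: two feature types observed at three times.
db = {
    "mean": [[0.0], [5.0], [1.0]],
    "mesh": [[0.0, 0.0], [4.0, 4.0], [1.0, 0.0]],
}
query = {"mean": [1.0], "mesh": [1.0, 0.0]}
candidates = select_candidates(db, query, {"mean": 1.5, "mesh": 1.5})
```

Because a time is kept only when all feature types agree, tightening any single threshold can only shrink the candidate set.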

11. A time-series data search device for retrieving data similar to a user's query data from a time-series database that records time-series data, the device comprising: data series input means for inputting the time-series data constituting the database and storing it in the database; first feature vector calculation means for calculating, from the time-series data stored in the database, a plurality of first feature vectors expressing the properties of the data at regular time intervals; search query data series input means for inputting arbitrary query data given by the user; second feature vector calculation means for calculating, from the query data, a second feature vector expressing a property of the data; first dissimilarity calculation means for calculating a dissimilarity between the query data and the data at each time in the database from the first feature vectors and the second feature vector; second dissimilarity calculation means for integrating the dissimilarities to calculate an overall dissimilarity between the query data and the data at each time in the database; search means for selecting from the database the data at the corresponding times in ascending order of overall dissimilarity; and output means for outputting the data selected by the search means.

12. The time-series data search device according to claim 11, wherein the second dissimilarity calculation means comprises: means for selecting, from the database, a partial data set consisting of data having a small dissimilarity to the query data; means for calculating, for each time in the selected partial data set, a first dissimilarity between the data at that time and the data at other times in the database; means for calculating a second dissimilarity between the two pieces of data after the respective times; means for modeling the association between the first dissimilarity between data at other times, calculated for each time included in the selected partial data set, and the second dissimilarity between the two pieces of data after the time, as an expected value of the second dissimilarity expressed as a function of the first dissimilarity; and means for calculating the overall dissimilarity between the query data and the data at each time in the database through application of the model.

13. The time-series data search device according to claim 11 or 12, wherein the means for calculating the first dissimilarity and the means for calculating the second dissimilarity include means for calculating, for each of the feature vectors, the sum or the average of the distances between corresponding feature vectors of two sequences of feature vectors, each corresponding to one time and the times before it, while preserving the temporal order.

14. The time-series data search device according to claim 12, wherein the means for calculating the first dissimilarity includes means for calculating, for two subsequences of the feature vectors corresponding to the times after two times, the sum or the average of the distances between corresponding feature vectors while preserving the temporal order.

15. The time-series data search device according to claim 12, wherein the means for modeling the association between the first dissimilarity between data at other times, calculated for each time included in the selected partial data set, and the second dissimilarity between the two pieces of data after the time, as an expected value of the second dissimilarity expressed as a function of the first dissimilarity, includes means for approximating the relationship between that dissimilarity and the expected value of the dissimilarity between the two pieces of data after the time by a logistic model having the first dissimilarity as a variable, and for obtaining the coefficients of the model by regression analysis.

16. The time-series data search device according to claim 12, wherein the means for modeling the association between the first dissimilarity between data at other times, calculated for each time included in the selected partial data set, and the second dissimilarity between the two pieces of data after the time, as an expected value of the second dissimilarity expressed as a function of the first dissimilarity, includes means for calculating, for a subset of data whose dissimilarity is equal to or less than an arbitrary threshold, the expected value of the dissimilarity between the two pieces of data after each time, approximating the expected value by a logistic model having the threshold of the dissimilarity as a variable, and obtaining the coefficients of the model by regression analysis.

17. The time-series data search device according to claim 2, 13, or 14, wherein the data series input means inputs a time-series image as the time-series data.

18. The time-series data search device according to claim 11, wherein the first feature vector calculation means includes means for calculating, when a time-series image is input as the time-series data, the feature vector for each unit time as the gray values of the pixels in the image, as an average luminance value obtained by spatially averaging the gray values, or as a vector whose elements are the averages of the gray values of the image contained in each mesh cell obtained by dividing the image plane into a mesh.

19. The time-series data search device according to 4, wherein the means for calculating the first dissimilarity includes means for selecting, when the time-series data is a time-series image, a feature vector relating to the distribution of gray values of the image as the arbitrary one type of feature vector.

20. The time-series data search device according to claim 12, wherein the means for selecting the partial data set in the database includes means for calculating, for each of the various feature vectors, the distance between the second feature vector of the query data and the first feature vector of the data at each time in the database, and for selecting the data at times at which each distance is smaller than a predetermined value.

21. A storage medium storing a time-series data search program for retrieving data similar to a user's query data from a time-series database that records time-series data, the program comprising: a data series input process of inputting the time-series data constituting the database and storing it in the database; a first feature vector calculation process of calculating, from the time-series data stored in the database, a plurality of first feature vectors expressing the properties of the data at regular time intervals; a search query data series input process of inputting arbitrary query data given by the user; a second feature vector calculation process of calculating, from the query data, a second feature vector expressing a property of the data; a first dissimilarity calculation process of calculating a dissimilarity between the query data and the data at each time in the database from each of the first feature vectors and the second feature vector; a second dissimilarity calculation process of integrating the dissimilarities to calculate an overall dissimilarity between the query data and the data at each time in the database; a search process of selecting the data from the database; and an output process of outputting the data selected by the search process.

22. The storage medium storing the time-series data search program according to claim 21, wherein the second dissimilarity calculation process includes: a process of selecting, from the database, a partial data set consisting of data having a small dissimilarity to the query data; a process of calculating, for each time in the selected partial data set, a first dissimilarity between the data at that time and the data at other times in the database; a process of calculating a second dissimilarity between the two pieces of data after the respective times; a process of modeling the association between the first dissimilarity between data at other times, calculated for each time included in the selected partial data set, and the second dissimilarity between the two pieces of data after the time, as an expected value of the second dissimilarity expressed as a function of the first dissimilarity; and a process of calculating the overall dissimilarity between the query data and the data at each time in the database through application of the model.

23. The storage medium storing the time-series data search program according to claim 21 or 22, wherein the process of calculating the first dissimilarity and the process of calculating the second dissimilarity include a process of calculating, for each of the feature vectors, the sum or the average of the distances between corresponding feature vectors of two sequences of feature vectors, each corresponding to one time and the times before it, while preserving the temporal order.

24. The storage medium storing the time-series data search program according to claim 22, wherein the process of calculating the first dissimilarity includes a process of calculating, for two subsequences of the feature vectors corresponding to the times after two times, the sum or the average of the distances between corresponding feature vectors while preserving the temporal order.

25. The storage medium storing the time-series data search program according to claim 22, wherein the process of modeling the association between the first dissimilarity between data at other times, calculated for each time included in the selected partial data set, and the second dissimilarity between the two pieces of data after the time, as an expected value of the second dissimilarity expressed as a function of the first dissimilarity, includes a process of approximating the relationship between the dissimilarity between the data at two times and the expected value of the dissimilarity between the two pieces of data after the time by a logistic model having the first dissimilarity as a variable, and obtaining the coefficients of the model by regression analysis.

26. The storage medium storing the time-series data search program according to claim 22, wherein the process of modeling the association between the first dissimilarity between data at other times, calculated for each time included in the selected partial data set, and the second dissimilarity between the two pieces of data after the time, as an expected value of the second dissimilarity expressed as a function of the first dissimilarity, includes a process of calculating, for a subset of data whose dissimilarity is equal to or less than an arbitrary threshold, the expected value of the dissimilarity between the two pieces of data after each time, approximating the expected value by a logistic model having the threshold of the dissimilarity as a variable, and obtaining the coefficients of the model by regression analysis.

27. The storage medium storing the time-series data search program according to claim 1, 22, or 23, wherein the data series input process inputs a time-series image as the time-series data.

28. The storage medium storing the time-series data search program according to claim 21, wherein the first feature vector calculation process includes a process of calculating, when a time-series image is input as the time-series data, the feature vector for each unit time as the gray values of the pixels in the image, as an average luminance value obtained by spatially averaging the gray values, or as a vector whose elements are the averages of the gray values of the image contained in each mesh cell obtained by dividing the image plane into a mesh.

29. The storage medium storing the time-series data search program according to claim 24, wherein the process of calculating the first dissimilarity includes a process of selecting, when the time-series data is a time-series image, a feature vector related to the time-series data as the arbitrary one type of feature vector.

30. The storage medium storing the time-series data search program according to claim 22, wherein the process of selecting the partial data set in the database includes a process of calculating each distance between a feature vector of the query data and the feature vector of the data at each time in the database, and selecting the data at times at which each distance is smaller than a predetermined value.