JP4887661B2

JP4887661B2 - Learning device, learning method, and computer program

Info

Publication number: JP4887661B2
Application number: JP2005141957A
Authority: JP
Inventors: 健一日台; 雅博藤田
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2005-05-13
Filing date: 2005-05-13
Publication date: 2012-02-29
Anticipated expiration: 2025-05-13
Also published as: JP2006318319A

Description

The present invention relates to a learning device, a learning method, and a computer program that approximate a function taking time-series data of state values from the past to the present as its argument and outputting the state value at the next time. In particular, it relates to a learning device, a learning method, and a computer program that automatically generate information that cannot be obtained directly from the given learning samples.

More specifically, the present invention relates to a learning device, a learning method, and a computer program that perform predictive learning on time-series data that does not follow a Markov process and approximate a function that outputs the state value at the next time. In particular, it relates to a learning device, a learning method, and a computer program that learn a prediction function for non-Markov time-series data by a technique other than a recurrent neural network.

For example, when humans experience similar events again and again, or repeatedly observe similar facts, they predict that something similar may happen again, or infer that some rule governing these events exists. Acquiring new knowledge and skills on the basis of past experience in this way is called "learning".

Now that information technology (IT) has matured, research and development aimed at realizing this kind of learning mechanism on computer systems is widespread. For example, a learner is trained with the state values up to the current time as input, and the state value at the next time is then estimated or automatically generated from the learning result. In other words, the learner is equivalent to approximating a function that takes time-series data of state values from the past to the present as its argument and outputs the state value at the next time.

Many learners build a model of a Markov process, in which the probability law of the future is determined solely by the current state regardless of past history. However, the learning target, that is, the function to be approximated, may not follow a Markov process (or may be a Markov process of second or higher order). A sine wave, for example, can follow the same current state value with either an increase or a decrease, so the future state cannot be generated from the current state alone.
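The sine-wave ambiguity can be checked numerically. The following is a minimal sketch and not part of the patent; the sampling period of 20 steps and the helper names are assumptions chosen only for illustration. It collects every value that follows a given sample value and separates the successors into larger and smaller ones.

```python
import math

def sine_series(num_steps, period=20):
    """Sample sin(2*pi*t/period) at integer times t (period is an assumed value)."""
    return [math.sin(2.0 * math.pi * t / period) for t in range(num_steps)]

def successors_of(series, value, tol=1e-9):
    """Return the next-step values following every occurrence of `value`."""
    return [series[t + 1] for t in range(len(series) - 1)
            if abs(series[t] - value) < tol]

xs = sine_series(40)
current = xs[2]                      # a value reached on the rising slope
nexts = successors_of(xs, current)   # the same value recurs on the falling slope
rising = [v for v in nexts if v > current]
falling = [v for v in nexts if v < current]
```

Both `rising` and `falling` are non-empty here, which is the sense in which the current value alone does not determine the next one.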

For such non-Markov processes there appears to be no option other than predictive learning. A representative learning mechanism for predictive learning of non-Markov processes is the recurrent neural network (see, for example, Non-Patent Document 1).

For example, a robot equipped with a recurrent neural network as its learning mechanism moves a movable object in the external world with a controllable part of its own body, perceives the environment in which the object is placed and the object's motion through its perceptual sensors, and learns the relationship between how it moves each of its joints and how the object moves. It can further predict the object's motion and, through novelty rewarding, teach itself motions that move the object (see, for example, Patent Document 1).

However, because recurrent neural networks rely on error backpropagation, learning takes a long time.

In addition, a recurrent neural network trained with error backpropagation requires far more context-unit (context information) dimensions than would seem inherently necessary, which may affect its generalization performance.

Patent Document 1: JP 2002-59384 A
Non-Patent Document 1: Elman, J. L., "Finding structure in time", Cognitive Science, vol. 14, 1990, pp. 179-211

An object of the present invention is to provide an excellent learning device, learning method, and computer program capable of automatically generating information that cannot be obtained directly from the given learning samples.

A further object of the present invention is to provide an excellent learning device, learning method, and computer program capable of performing predictive learning on time-series data that does not follow a Markov process and of approximating a function that outputs the state value at the next time.

A further object of the present invention is to provide an excellent learning device, learning method, and computer program capable of learning a prediction function for non-Markov time-series data by a technique other than a recurrent neural network.

The present invention has been made in view of the above problems. A first aspect thereof is a learning device that approximates a time-series prediction function F for predicting the state z_{t+1} at the next time t+1 from the state z_t at a given time t, where the state at each time t consists of the information x_t of the learning target at that time and context information c_t, the learning device comprising:
data input means for inputting, as the past states {z_t}, the time-series information {x_t} of the learning target and the time series {c_t} of context information at each time t (t = 1...T) up to the current time T;
function learning means for learning the time-series prediction function F from the input past states {z_t} according to a predetermined learning algorithm;
prediction means for predicting the learning samples {x_t} at each time t up to the current time T, using the learned time-series prediction function F and the initial state value z_1;
error calculation means for calculating the error between the learning samples {x_t} at each time t input by the data input means and the predicted values of the learning samples at each time t generated by the prediction means; and
determination means for determining, based on the error, whether learning of the time-series prediction function F by the function learning means has finished.

The present invention relates to a learning device that learns the time-series prediction function F of a non-Markov process. A recurrent neural network based on error backpropagation is commonly used to learn a prediction function for non-Markov time-series information, but there are concerns that learning takes a long time and that the very large number of context dimensions may affect generalization. The learning device according to the present invention therefore uses a continuous-value function approximation technique as the learning algorithm for F. Support Vector Regression (hereinafter, SVR) is a representative continuous-value function approximation technique that is guaranteed to converge to a global solution in a short time.

The learning device according to the present invention approximates a time-series prediction function F for predicting or automatically generating the state at the next time t+1 from the state at the current time t. Context information is used to solve the time-series prediction problem for time-series information that is a non-Markov process. The state at a given time therefore consists of the learning sample at that time and the context information at the same time. If the learning sample is n-dimensional and the context information is m-dimensional, F is a time-series prediction function with (n+m)-dimensional input and (n+m)-dimensional output.

In this case, what the learning algorithm learns is the time-series prediction function F, but because the context information {c_t} is unknown, {c_t} must be estimated together with the learning of F. In the present invention, estimation of the context information {c_t} and learning of the function F are therefore repeated alternately so as to approach the ideal solution asymptotically.

First, the time-series information {x_t} of the learning target and the time series {c_t} of context information at each time t (t = 1...T) up to the current time T are input as the past states {z_t}, and the time-series prediction function F is learned from these input states according to the SVR learning algorithm.

Next, to evaluate the learning result, the states {z_t} up to the current time T are predicted using the learned time-series prediction function F and the initial state value z_1. The error e between the learning samples {x_t} actually input at each time t and the values predicted for each time t by the learned function F is then calculated, and whether learning has finished is determined by whether e is at or below a threshold.

When it is determined that learning has not finished, the context information {c_t} at each time t is corrected and the time-series prediction function F is learned again. The context information {c_t} at each time t can be corrected on the basis of the calculated error e. Specifically, {c_t} can be corrected by changing it in the direction of the gradient vector obtained by partially differentiating the error e with respect to {c_t}.

A second aspect of the present invention is a computer program written in a computer-readable format so as to execute, on a computer system, processing for approximating a time-series prediction function F for predicting the state z_{t+1} at the next time t+1 from the state z_t at a given time t, where the state at each time t consists of the information x_t of the learning target at that time and context information c_t, the computer program causing the computer system to execute:
a data input procedure for inputting, as the past states {z_t}, the time-series information {x_t} of the learning target and the time series {c_t} of context information at each time t (t = 1...T) up to the current time T;
a function learning procedure for learning the time-series prediction function F from the input past states according to a learning algorithm based on a continuous-value function approximation technique;
a prediction procedure for predicting the learning samples {x_t} at each time t up to the current time T, using the learned time-series prediction function F and the initial state value z_1;
an error calculation procedure for calculating the error between the learning samples {x_t} at each time t input in the data input procedure and the predicted values of the learning samples at each time t generated in the prediction procedure;
a determination procedure for determining, based on the error, whether learning of the time-series prediction function F in the function learning procedure has finished;
a context correction procedure for correcting the context information {c_t} at each time t based on the error calculated in the error calculation procedure; and
an iterative learning procedure for, when the determination procedure determines that learning has not finished, causing the function learning procedure to learn the time-series prediction function F again using the states {z_t} at each time t that include the context information corrected in the context correction procedure.

The computer program according to the second aspect of the present invention defines a computer program written in a computer-readable format so as to implement predetermined processing on a computer system. In other words, installing the computer program according to the second aspect on a computer system produces a cooperative effect on that system and provides the same operation and effects as the learning device according to the first aspect.

According to the present invention, it is possible to provide an excellent learning device, learning method, and computer program capable of automatically generating information that cannot be obtained directly from the given learning samples.

According to the present invention, it is also possible to provide an excellent learning device, learning method, and computer program capable of learning a prediction function for non-Markov time-series data by a technique other than a recurrent neural network.

According to the present invention, it is also possible to provide an excellent learning device, learning method, and computer program capable of solving the time-series prediction problem for non-Markov processes by using context information.

According to the present invention, it is also possible to provide an excellent learning device, learning method, and computer program capable of learning a prediction function for non-Markov time-series data using a continuous-value function approximation technique that is guaranteed to converge to a global solution in a short time.

With the learning method according to the present invention, learning can be completed faster than with a recurrent neural network using error backpropagation, and learning can be made to converge with a smaller number of context dimensions m.

Further objects, features, and advantages of the present invention will become apparent from the more detailed description based on the embodiments described below and the accompanying drawings.

Embodiments of the present invention are described in detail below with reference to the drawings.

In the present invention, a continuous-value function approximation technique is used in place of a recurrent neural network with error backpropagation to learn the prediction function for non-Markov time-series information. SVR is a representative continuous-value function approximation technique that is guaranteed to converge to a global solution in a short time.

SVR estimates a real-valued function f: R^n → R with n-dimensional input and one-dimensional output in the form shown in equation (1) below.

Here, z ∈ R^n is the input vector, s_i ∈ R^n are the support vectors, K(·) is the kernel function, and b is a scalar called the bias term.
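From the symbols defined above, together with the coefficients θ_j named in the next paragraph, equation (1) presumably takes the standard SVR expansion. The following is a reconstruction under that assumption; M, the number of support vectors, and the argument order of K are assumed notation.

```latex
f(z) = \sum_{i=1}^{M} \theta_i \, K(s_i, z) + b
```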

Given T learning samples {(z_k, y_k) | k = 1...T} (that is, one for each time t = 1...T) and a kernel function K(·), the SVR learning algorithm can uniquely determine the s_j, θ_j, and b that explain them well.

Now consider the problem of predicting multidimensional time-series information. When a time series {x_t ∈ R^n | t = 1...T} is given as learning samples, treating each learning sample x as an n-dimensional vector as in equation (2) below allows a time-series prediction function F: R^n → R^n with n-dimensional input and n-dimensional output to be constructed with SVR.

However, although this function F can predict time-series information of a Markov process, it cannot predict time-series information of a non-Markov process (or of a Markov process of second or higher order).

Two methods are commonly used to solve the time-series prediction problem for non-Markov processes. One uses time-delayed inputs, and the other uses context information. The present embodiment uses the latter. The mechanism for introducing context information is described in detail below.

The time-series prediction function F defined by equation (2) above takes the n-dimensional learning sample x_t at the current time t as input and outputs the n-dimensional learning sample x_{t+1} predicted for the next time t+1. To introduce context information, the learning sample is extended by defining an (n+m)-dimensional state {z_t} consisting of the n-dimensional learning sample {x_t} and m-dimensional context information {c_t}. The function F then predicts the state z_{t+1} at the next time t+1 from the state z_t at the current time t. The function F to be learned by the SVR learning algorithm is therefore expressed as in equation (3) below.
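The state extension described above can be sketched as follows. This is an illustrative sketch only: the function names and toy values are assumptions, and the pairing of z_t with z_{t+1} is simply the supervised form implied by "F predicts z_{t+1} from z_t". Because each SVR has a scalar output, one would presumably fit one regressor per output component on these pairs.

```python
def build_states(xs, cs):
    """Concatenate each n-dim sample x_t with its m-dim context c_t into z_t."""
    if len(xs) != len(cs):
        raise ValueError("xs and cs must cover the same time steps")
    return [list(x) + list(c) for x, c in zip(xs, cs)]

def training_pairs(zs):
    """Pair each state z_t with its successor z_{t+1} (T states give T-1 pairs)."""
    return list(zip(zs[:-1], zs[1:]))

xs = [[0.0], [0.5], [1.0], [0.5]]   # n = 1, T = 4 (toy values)
cs = [[0.1, -0.2]] * 4              # m = 2, placeholder context values
zs = build_states(xs, cs)           # each z_t has n + m = 3 components
pairs = training_pairs(zs)          # (input state, target state) examples
```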

In this case, what the learning algorithm learns is the time-series prediction function F, but because the context information {c_t} is unknown, {c_t} must be estimated together with the learning of F. In the present embodiment, estimation of the context information {c_t} and learning of the function F are therefore repeated alternately so as to approach the ideal solution asymptotically. The algorithm for function learning and context correction is as follows.

(1) Input the time-series information of the learning target and the time series of context information.
Here, the time-series information {x_t} of the learning target is n-dimensional and is a non-Markov process that cannot be predicted from the immediately preceding learning sample alone. Because the context information {c_t} is unknown, randomly generated values are used as its initial values. As described later, the context information is corrected repeatedly until learning finishes; the context information generated at the i-th iteration is written {c^{(i)}_t}. The context information {c_t} is m-dimensional (m is arbitrary).

(2) Following an SVR-based algorithm, learn the time-series prediction function F that predicts the (n+m)-dimensional state {z^{(i)}_t} consisting of the time-series information {x_t} of the learning target and the time series {c^{(i)}_t} of context information.

(3) Using the learned function F: R^{n+m} → R^{n+m} and the initial state value z^{(i)}_1, predict the past states {z_t}.

(4) Calculate the error between the predicted time-series information of the learning target and the time-series information actually given. For example, compute the squared error e; if e is at or below a threshold, learning is determined to have finished.

(5) If the error e between the predicted time-series information of the learning target and the time-series information actually given does not fall within the threshold, correct the context information and learn the function F again. The context information is corrected using the calculated error e.

Until the error of the predicted time-series information of the learning target falls to or below the threshold, i is incremented (i ← i+1) and correction of the context information {c_t} and learning of the function F are repeated alternately.
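The control flow of steps (1) to (5) above can be sketched as a loop with pluggable parts. This is a skeleton under stated assumptions, not the patent's implementation: the callables `fit`, `rollout_error`, and `context_gradient` are hypothetical interfaces, `max_iters` is an added safety cap, and the update moves against the gradient as in steepest descent. The toy stand-ins at the bottom are not an SVR; they exist only so the loop can be run.

```python
import random

def alternate_learning(xs, m, fit, rollout_error, context_gradient,
                       threshold=1e-3, alpha=0.1, max_iters=100, seed=0):
    """Alternate between fitting F and correcting the context series.

    fit(zs) -> F, rollout_error(F, z1, xs) -> float and
    context_gradient(F, zs, xs) -> T x m list are caller-supplied
    (hypothetical interfaces, not defined by the source).
    """
    rng = random.Random(seed)
    # Step (1): the context is unknown, so start from random values.
    cs = [[rng.uniform(-1.0, 1.0) for _ in range(m)] for _ in xs]
    history, F = [], None
    for _ in range(max_iters):             # max_iters is an added safety cap
        zs = [list(x) + c for x, c in zip(xs, cs)]
        F = fit(zs)                         # step (2): learn F from {z_t}
        e = rollout_error(F, zs[0], xs)     # steps (3)-(4): predict and score
        history.append(e)
        if e <= threshold:                  # learning has finished
            break
        grads = context_gradient(F, zs, xs)
        # Step (5): move each c_t against the error gradient, then refit.
        cs = [[ci - alpha * gi for ci, gi in zip(c, g)]
              for c, g in zip(cs, grads)]
    return F, cs, history

# Toy stand-ins so the loop can be exercised; they are NOT an SVR.
n = 1
toy_fit = lambda zs: zs
toy_error = lambda F, z1, xs: sum(v * v for z in F for v in z[n:])
toy_grad = lambda F, zs, xs: [[2.0 * v for v in z[n:]] for z in zs]

xs = [[0.0], [1.0], [0.0], [-1.0], [0.0]]
F, cs, history = alternate_learning(xs, 2, toy_fit, toy_error, toy_grad)
```

In a real setting `fit` would train one regressor per state component, and `context_gradient` would supply the partial derivatives discussed in the remainder of the description.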

In equation (8) above, the context information {c^{(i)}_t} is estimated by changing {c^{(i)}_t} in the direction of the gradient vector obtained by partially differentiating the error e with respect to the i-th estimate {c^{(i)}_t}. This is the same approach as the steepest descent method. The gradient vector at the i-th iteration is expressed as in equation (9) below.

How this gradient vector is obtained is described below. From here on, the i-th iteration is assumed and the index (i) is omitted (the variable i is reused for another purpose).

Consider the case where the kernel function K(·) is a Gaussian RBF function. Its definition is given by equation (10) below.
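The gradient computation that follows relies on differentiating the kernel with respect to its input. As a hedged sketch, the code below assumes the common parametrization K(s, z) = exp(-||s - z||^2 / (2σ^2)); the width parameter `sigma` and the function names are assumptions, and the source may parametrize the kernel differently. A central-difference check is included to confirm the analytic derivative.

```python
import math

def rbf_kernel(s, z, sigma=1.0):
    """Gaussian RBF: exp(-||s - z||^2 / (2 * sigma^2)); sigma is an assumed width."""
    d2 = sum((si - zi) ** 2 for si, zi in zip(s, z))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def rbf_grad_wrt_input(s, z, sigma=1.0):
    """Partial derivatives of K(s, z) with respect to each component of z."""
    k = rbf_kernel(s, z, sigma)
    return [k * (si - zi) / sigma ** 2 for si, zi in zip(s, z)]

def numeric_grad(s, z, sigma=1.0, eps=1e-6):
    """Central-difference approximation used to check the analytic derivative."""
    grads = []
    for i in range(len(z)):
        zp = list(z)
        zm = list(z)
        zp[i] += eps
        zm[i] -= eps
        diff = rbf_kernel(s, zp, sigma) - rbf_kernel(s, zm, sigma)
        grads.append(diff / (2.0 * eps))
    return grads

s, z = [0.5, -1.0, 2.0], [0.0, 0.3, 1.5]
analytic = rbf_grad_wrt_input(s, z, sigma=0.8)
numeric = numeric_grad(s, z, sigma=0.8)
```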

First, consider the gradient with respect to the context information c_{T-1} at time t = T-1. The forward propagation from c_{T-1} to the error function e_T is given by equation (11) below.

In the above equation, M_k is the number of support vectors in the k-th function f_k. For convenience, the notation of equation (12) below is also introduced.

From the above, partially differentiating the error e_T with respect to each component {c_{T-1,i} | i = 1...m} of c_{T-1} gives equation (13) below.

For reference, the individual partial derivatives are given by equation (14) below.

From the above, the update amount Δc_{T-1,i} for c_{T-1} = {c_{T-1,i} | i = 1...m} is given by equation (15) below, where α is an arbitrary learning coefficient.
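The chain of steps just described can be illustrated with a worked derivation. This is a sketch under assumptions rather than a reproduction of equations (11) to (15): it assumes a squared error with a factor 1/2, a Gaussian kernel of width σ, per-component coefficients θ_{k,j} and support vectors s_{k,j} for the k-th function f_k, and that the context occupies components n+1 to n+m of the state, so that z_{T-1,n+i} = c_{T-1,i}.

```latex
\begin{aligned}
\hat{x}_{T,k} &= f_k(z_{T-1}) = \sum_{j=1}^{M_k} \theta_{k,j}\, K(s_{k,j}, z_{T-1}) + b_k,
\qquad
e_T = \frac{1}{2} \sum_{k=1}^{n} \bigl(\hat{x}_{T,k} - x_{T,k}\bigr)^2 \\
\frac{\partial e_T}{\partial c_{T-1,i}}
 &= \sum_{k=1}^{n} \bigl(\hat{x}_{T,k} - x_{T,k}\bigr)
    \sum_{j=1}^{M_k} \theta_{k,j}\, K(s_{k,j}, z_{T-1})\,
    \frac{s_{k,j,n+i} - c_{T-1,i}}{\sigma^2} \\
\Delta c_{T-1,i} &= -\alpha\, \frac{\partial e_T}{\partial c_{T-1,i}}
\end{aligned}
```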

So far, we have considered the update of the context information c_{T-1}, which directly affects the error e_T at the final time. For {c_t | t = 1...T-2}, the error function e can likewise be partially differentiated with respect to c_t. However, the context information c_t at time t affects not only the error e_{t+1} at time t+1 but all future errors from t+1 through T. The update amount Δc_{t,i} is therefore given by equation (16) below.
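Read literally, the statement that c_t influences every error from t+1 through T suggests an update that accumulates those contributions. The following is a sketch of that reading, with e_τ denoting the assumed per-time error at time τ; the exact form of equation (16) in the source may differ.

```latex
\Delta c_{t,i} = -\alpha \sum_{\tau = t+1}^{T} \frac{\partial e_{\tau}}{\partial c_{t,i}}
```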

The following are examples of modifications to the algorithm that updates the context information.

(1) The expression for obtaining c^{(i+1)}_t in the algorithm is replaced by equation (17) below.

(2) Applying a low-pass filter in the time direction to the context information {c_t} makes it possible to extract context information with a larger time scale than the learning samples {x_t}.

(3) By giving the low-pass filter a different pass frequency for each variable and letting these variables coexist, phenomena on different time scales can be deliberately separated.
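Modifications (2) and (3) above can be illustrated with a simple filter. The source does not specify the filter design, so the sketch below assumes a first-order recursive (exponential smoothing) low-pass applied along time, with one hypothetical smoothing coefficient per context variable; a smaller coefficient corresponds to a lower pass frequency.

```python
def low_pass_contexts(cs, smoothing):
    """Filter each context variable along time with a first-order low-pass.

    cs: list of T context vectors (each of length m).
    smoothing: m coefficients in (0, 1]; smaller values pass only slower changes.
    """
    if not cs:
        return []
    out = [list(cs[0])]
    for c in cs[1:]:
        prev = out[-1]
        out.append([a * cv + (1.0 - a) * pv
                    for a, cv, pv in zip(smoothing, c, prev)])
    return out

# Two context variables fed the same alternating input but filtered differently.
raw = [[1.0, 1.0] if t % 2 == 0 else [-1.0, -1.0] for t in range(8)]
filtered = low_pass_contexts(raw, smoothing=[1.0, 0.1])
```

With these coefficients the first variable passes through unchanged while the second follows only the slow trend, which is one way different time scales could be kept apart.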

FIG. 1 shows the functional configuration of a learning device 1 according to an embodiment of the present invention. The learning device 1 shown in the figure comprises an input unit 11, an initialization unit 12, a function approximation unit 13, a prediction unit 14, an error calculation unit 15, a determination unit 16, and a context correction unit 17. The learning device 1 may be designed as a dedicated hardware device, but it can also be configured by running, on a general-purpose computer system, a computer program that implements each functional module.

The input unit 11 inputs the n-dimensional time-series information {x_t} to be learned. The learning target is time-series information of a non-Markov process that cannot be predicted from the immediately preceding learning sample alone. The initialization unit 12 randomly generates the initial values of the m-dimensional context information {c_t}.

The n-dimensional learning data input from the input unit 11 and the m-dimensional context data are supplied to the function approximation unit 13 as the (n+m)-dimensional states {z_t}. Using the SVR learning algorithm, the function approximation unit 13 learns, that is, approximates, the time-series prediction function F for predicting the state z_{t+1} at the next time t+1 from the state z_t at a given time t.

The prediction unit 14 predicts the state {z_t} at each time t using the approximated prediction function F and the initial state value z^{(i)}_1.

The error calculation unit 15 calculates the error e between the learning data predicted by the prediction unit 14 and the learning data actually input from the input unit 11.

The determination unit 16 compares the error e calculated by the error calculation unit 15 with a threshold and, if the error e is at or below the threshold, determines that learning has ended. The prediction function F at the time of this end determination is output as the learning result of the learning apparatus 1.
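The error calculation and the end determination can be sketched as below. The text here does not fix the form of e; this sketch assumes the root mean squared error, the measure used for the learning curve in the experiment, and the function names and sample values are hypothetical.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between predicted and actual learning data."""
    if len(predicted) != len(actual):
        raise ValueError("series must have the same length")
    total = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(total / len(actual))

def learning_finished(e, threshold):
    """End determination: learning ends once e is at or below the threshold."""
    return e <= threshold

# Hypothetical values: every prediction is off by 0.1
actual = [0.0, 1.0, 0.0, -1.0]
predicted = [0.1, 0.9, 0.1, -0.9]
e = rmse(predicted, actual)
done = learning_finished(e, threshold=0.2)
```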

When the error e between the predicted and the actually given time series information of the learning target does not fall within the threshold, the context correction unit 17 corrects the context information {c_t} and the prediction function F is learned again. The context correction unit 17 uses the error e calculated by the error calculation unit 15 to correct the context information. Specifically, it changes the context information {c_t} in the direction of the gradient vector obtained by partially differentiating the calculated error e with respect to the context information {c_t} (as described above). Correction of the context information {c_t} and learning of the function F are repeated alternately until the error of the predicted time series falls to or below the threshold.
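One correction step of this kind might look like the following sketch. Several points are assumptions rather than statements of the text: the gradient is estimated by central finite differences, the step is taken against the gradient so that e decreases, the step size `rate` is arbitrary, and `toy_error` is a hypothetical quadratic that stands in for the real prediction error (which depends on {c_t} through F and the rollout).

```python
def numeric_gradient(error_fn, c, h=1e-6):
    """Central-difference estimate of the partial derivatives de/dc_i
    for a flat list of context values c."""
    grad = []
    for i in range(len(c)):
        up = c[:i] + [c[i] + h] + c[i + 1:]
        down = c[:i] + [c[i] - h] + c[i + 1:]
        grad.append((error_fn(up) - error_fn(down)) / (2 * h))
    return grad

def correct_context(error_fn, c, rate=0.1):
    """One correction step along the gradient; the sign is chosen so that
    the error decreases (an assumption about the sign convention)."""
    g = numeric_gradient(error_fn, c)
    return [ci - rate * gi for ci, gi in zip(c, g)]

def toy_error(c):
    """Hypothetical error with a known minimum at c = [1, -2]."""
    return (c[0] - 1.0) ** 2 + (c[1] + 2.0) ** 2

c = [0.0, 0.0]
for _ in range(50):
    c = correct_context(toy_error, c)
```

In the apparatus, each such correction is followed by relearning F on the corrected states rather than by further correction steps in isolation.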

FIG. 2 shows, in flowchart form, the processing procedure by which the learning apparatus 1 learns the time series prediction function F while correcting the context information. The procedure is described below with reference to the figure.

First, learning data is input from the input unit 11, and the initialization unit 12 generates initial values of the context data (step S1).

Next, the function approximation unit 13 generates the prediction function F, that is, performs function approximation, with reference to the learning data and the context data (step S2).

Next, the prediction unit 14 tries predicting the learning data according to the generated prediction function F, starting from the initial value of the learning data (step S3).

Next, the error calculation unit 15 computes the difference between the learning data predicted with the prediction function F and the learning data actually input from the input unit 11, and thereby obtains the prediction error (step S4).

The result of this error calculation is passed to the determination unit 16. Based on the calculated error, the determination unit 16 determines whether the prediction function generated by the function approximation unit 13 is a sufficiently good approximation (step S5).

If the determination unit 16 makes an end determination here, the learning ends.

If, on the other hand, no end determination is made, the context correction unit 17 corrects the context data according to the error (step S6). The process then returns to step S1 and the prediction function F is learned again. Correction of the context information {c_t} and learning of the function F are repeated alternately until the error of the predicted time series falls to or below the threshold.
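The control flow of steps S1 to S6 can be summarized in a short skeleton. It is a sketch under stated assumptions: the six callables are hypothetical placeholders for the units of FIG. 1, `max_rounds` is a safety limit that the text does not mention, and the corrected context is assumed to be kept when the loop re-enters, so the random initialization runs only once. The stubs at the bottom exist only to exercise the loop and do no real learning.

```python
def alternate_learning(x, init_context, fit, predict, error, correct,
                       threshold, max_rounds=100):
    """Alternate learning of F and correction of the context (steps S1 to S6)."""
    c = init_context(len(x))                 # S1: initial context values
    for rounds in range(1, max_rounds + 1):
        F = fit(x, c)                        # S2: function approximation
        x_hat = predict(F, x, c)             # S3: trial prediction
        e = error(x_hat, x)                  # S4: prediction error
        if e <= threshold:                   # S5: end determination
            return F, c, e, rounds
        c = correct(c, F, x, e)              # S6: context correction
    return F, c, e, max_rounds

# Hypothetical stubs: the "prediction" simply echoes the context, and the
# "correction" moves the context halfway toward the data on each round.
x = [1.0, 2.0, 3.0]
F, c, e, rounds = alternate_learning(
    x,
    init_context=lambda T: [0.0] * T,
    fit=lambda x, c: None,
    predict=lambda F, x, c: list(c),
    error=lambda x_hat, x: max(abs(p - a) for p, a in zip(x_hat, x)),
    correct=lambda c, F, x, e: [ci + 0.5 * (xi - ci) for ci, xi in zip(c, x)],
    threshold=0.1,
)
```

With these stubs the error halves on every round, so the loop stops as soon as it drops to or below the threshold.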

Finally, an experimental example is described in which the learning mechanism according to this embodiment is applied to time series prediction of a sine wave.

FIG. 3 shows the learning sample {x_t ∈ R | t = 1…60} used in this experiment. The horizontal axis is the time t and the vertical axis is the value x_t. The learning sample shown consists of three periods of a sine wave.
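A sample of this shape can be generated as below; it also illustrates why such a series does not follow a Markov process. Only the 60 samples and the three periods come from the text; unit amplitude, zero phase, and uniform sampling are assumptions of this sketch.

```python
import math

PERIOD = 20  # 60 samples spanning three periods
x = [math.sin(2 * math.pi * t / PERIOD) for t in range(1, 61)]

# The same value occurs on the rising flank (t = 2) and on the falling
# flank (t = 8), yet the values that follow differ, so x_t alone does
# not determine x_{t+1}.
t_rise, t_fall = 2, 8
same_value = abs(x[t_rise - 1] - x[t_fall - 1]) < 1e-9
next_rise = x[t_rise]   # x at t = 3
next_fall = x[t_fall]   # x at t = 9
```

The context information c_t is what lets a function of z_t = (x_t, c_t) tell these two situations apart.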

FIG. 4 shows the initial state before learning. The upper part of the figure shows the learning sample {x_t ∈ R | t = 1…60}, the same three periods of a sine wave as in FIG. 3, together with the initial values {c⁽¹⁾_t ∈ R | t = 1…60} of the context information generated randomly at each time t. The lower part shows the predicted time series of the learning sample and the predicted context information.

In the first round of estimating the context information {c_t} and learning the function F, the context information {c⁽¹⁾_t} is random, so the predicted values of the learning data differ from the learning data actually input, as shown in the figure. The purpose of alternately repeating estimation of the context information {c_t} and learning of the function F is to bring these two time series into agreement.

FIG. 5 shows how learning of the prediction function F converges as estimation of the context information {c_t} and learning of the function F are repeated alternately. As shown in the figure, the context information {c⁽ⁱ⁾_t} becomes a waveform with the same period as the learning sample {x_t}, shifted from it by half a phase.

The second panel from the top of the figure shows that the predicted values of the learning sample have the same waveform as the actual learning sample {x_t}. Moreover, the sine wave continues to be predicted from time t = 61 onward, for which no learning was performed.

The third panel from the top shows the result of running prediction with the function F while adding noise to the predicted values of the learning sample. As in the noise-free case, the prediction is drawn into the same waveform as the learning sample.

The fourth panel from the top shows the learning curve, with the number of steps on the horizontal axis and the RMSE (Root Mean Squared Error) on the vertical axis. The figure shows that learning converges in 10 iterations.

As described above, the learning method according to the present invention can learn a prediction function for time series data of a non-Markov process using context information, following a continuous value function approximation method for which convergence to the global solution in a short time is guaranteed. It should be fully appreciated that learning can be completed faster than with a recurrent neural network using error backpropagation, and that learning can converge with a smaller number of dimensions m of context information.

The present invention has been described in detail above with reference to a specific embodiment. It is obvious, however, that those skilled in the art can modify or substitute the embodiment without departing from the gist of the present invention.

This specification has mainly described an embodiment in which the prediction function F is learned according to SVR, but the gist of the present invention is not limited to this. The present invention can equally be applied to learning apparatuses that use, for example, a learning algorithm based on a continuous value function approximation method other than SVR, or some other learning algorithm.

In short, the present invention has been disclosed by way of example, and the contents of this specification should not be interpreted restrictively. To determine the gist of the present invention, the claims should be taken into consideration.

FIG. 1 is a diagram showing the functional configuration of the learning apparatus 1 according to an embodiment of the present invention.
FIG. 2 is a flowchart showing the processing procedure by which the learning apparatus 1 learns the time series prediction function F while correcting the context information.
FIG. 3 is a diagram for explaining an experimental example in which the learning mechanism according to the present invention is applied to time series prediction of a sine wave.
FIG. 4 is a diagram for explaining an experimental example in which the learning mechanism according to the present invention is applied to time series prediction of a sine wave.
FIG. 5 is a diagram for explaining an experimental example in which the learning mechanism according to the present invention is applied to time series prediction of a sine wave.

Explanation of symbols

1 … Learning apparatus
11 … Input unit
12 … Initialization unit
13 … Function approximation unit
14 … Prediction unit
15 … Error calculation unit
16 … Determination unit
17 … Context correction unit

Claims

A learning apparatus that approximates a time series prediction function F for predicting the state z_{t+1} at the next time t+1 from the state z_t at a given time t, wherein the state z_t at each time t consists of learning target information x_t and context information c_t at that time t, the apparatus comprising:
data input means for inputting, as the past states {z_t}, the time series {x_t} of learning target information at each time t (where t = 1…T) up to the current time T, and for either randomly generating the time series {c_t} of context information up to the current time T or inputting a randomly generated time series {c_t} of context information;
function learning means for learning the time series prediction function F according to a predetermined learning algorithm using the input past states {z_t};
prediction means for predicting the learning sample {x_t} at each time t up to the current time T using the time series prediction function F obtained by the learning and the initial state value z₁;
error calculation means for calculating an error between the learning sample {x_t} at each time t input by the data input means and the predicted value of the learning sample at each time t generated by the prediction means;
context correction means for correcting the context information {c_t} by changing it in the direction of the gradient vector obtained by partially differentiating the error e calculated by the error calculation means with respect to the context information {c_t}; and
determination means for determining, based on the error, whether learning of the time series prediction function F by the function learning means has ended,
wherein, when the determination means determines that learning has not ended, the states {z_t} at each time t including the context information corrected by the context correction means are given to the function learning means, and the time series prediction function F is learned again.

The learning apparatus according to claim 1,
wherein the function learning means learns the time series prediction function F according to a learning algorithm based on a continuous value function approximation method.

The learning apparatus according to claim 2,
wherein the function learning means learns the time series prediction function F according to a learning algorithm based on Support Vector Regression.

A learning method for approximating a time series prediction function F for predicting the state z_{t+1} at the next time t+1 from the state z_t at a given time t, wherein the state z_t at each time t consists of learning target information x_t and context information c_t at that time t, the method comprising:
a data input step of inputting, as the past states {z_t}, the time series {x_t} of learning target information at each time t (where t = 1…T) up to the current time T, and of either randomly generating the time series {c_t} of context information up to the current time T or inputting a randomly generated time series {c_t} of context information;
a function learning step of learning the time series prediction function F according to a learning algorithm based on a continuous value function approximation method using the input past states {z_t};
a prediction step of predicting the learning sample {x_t} at each time t up to the current time T using the time series prediction function F obtained by the learning and the initial state value z₁;
an error calculation step of calculating an error between the learning sample {x_t} at each time t input in the data input step and the predicted value of the learning sample at each time t generated in the prediction step;
a context correction step of correcting the context information {c_t} by changing it in the direction of the gradient vector obtained by partially differentiating the error e calculated in the error calculation step with respect to the context information {c_t}; and
a determination step of determining, based on the error, whether learning of the time series prediction function F in the function learning step has ended,
wherein, when it is determined in the determination step that learning has not ended, the time series prediction function F is learned again in the function learning step using the states {z_t} at each time t including the context information corrected in the context correction step.

A computer program written in a computer-readable format so as to execute, on a computer system, processing for approximating a time series prediction function F for predicting the state z_{t+1} at the next time t+1 from the state z_t at a given time t, wherein the state z_t at each time t consists of learning target information x_t and context information c_t at that time t, the computer program causing the computer system to execute:
a data input procedure of inputting, as the past states {z_t}, the time series {x_t} of learning target information at each time t (where t = 1…T) up to the current time T, and of either randomly generating the time series {c_t} of context information up to the current time T or inputting a randomly generated time series {c_t} of context information;
a function learning procedure of learning the time series prediction function F according to a learning algorithm based on a continuous value function approximation method using the input past states {z_t};
a prediction procedure of predicting the learning sample {x_t} at each time t up to the current time T using the time series prediction function F obtained by the learning and the initial state value z₁;
an error calculation procedure of calculating an error between the learning sample {x_t} at each time t input in the data input procedure and the predicted value of the learning sample at each time t generated in the prediction procedure;
a context correction procedure of correcting the context information {c_t} by changing it in the direction of the gradient vector obtained by partially differentiating the error e calculated in the error calculation procedure with respect to the context information {c_t};
a determination procedure of determining, based on the error, whether learning of the time series prediction function F in the function learning procedure has ended; and
a procedure of, when it is determined in the determination procedure that learning has not ended, repeating the learning of the time series prediction function F in the function learning procedure using the states {z_t} at each time t including the context information corrected in the context correction procedure.