JP2009140454A - Data processor, data processing method, and program

Info

Publication number: JP2009140454A
Application number: JP2007319213A
Authority: JP
Inventors: Masato Ito; 真人伊藤; Kuniaki Noda; 邦昭野田
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2007-12-11
Filing date: 2007-12-11
Publication date: 2009-06-25

Abstract

PROBLEM TO BE SOLVED: To make it possible to perform, in advance, an operation desired by a user. SOLUTION: A state data acquiring part 101 acquires state data, which is time-series data representing a state. An operation data acquiring part 102 acquires desired operation data, which is time-series data corresponding to the operation desired by the user. A prediction learning part 103 learns the dynamics of the state data and the desired operation data. A predicting part 105 obtains a predicted value of the desired operation data on the basis of the dynamics, with the state data as an input. An operation data output part 106 outputs the predicted value of the desired operation data. The present invention is applicable to electronic apparatuses operated by a user, such as a PC, a TV, and a game machine. COPYRIGHT: (C)2009,JPO&INPIT

Description

The present invention relates to a data processing device, a data processing method, and a program, and in particular to a data processing device, a data processing method, and a program that make it possible, for example, to carry out in advance an operation that a user is about to perform.

For example, in the user interface of an electronic device such as a PC (Personal Computer) or a TV (television receiver), even for an operation that the user has performed before, the user has had to repeat the mouse or remote commander operations in the same way as in the past.

User interfaces that use the user's line of sight have also been proposed (see, for example, Non-Patent Document 1).

With such a user interface, the user can operate a computer simply by moving the line of sight, without operating a mouse or a remote commander.

Takehiko Ohno, "Interface Using Gaze" (「視線を用いたインタフェース」), Information Processing (情報処理), Vol. 44, No. 7, July 2003

The previously proposed gaze-based user interfaces detect a "wavering" (undecided) state, estimate how well a word is understood, and so on, from specific gaze movement patterns that are common to many users. They support the user's work by associating operations of the electronic device with particular detection or estimation results in what is, so to speak, a fixed manner.

With the previously proposed user interfaces, it has therefore been difficult, for example, to recognize gaze movements specific to each individual user and, in response to those movements, to perform the operation that the user is about to perform.

The present invention has been made in view of this situation, and makes it possible to perform in advance the operation that the user is about to perform, that is, the operation desired by the user.

A data processing device or program according to one aspect of the present invention is a data processing device that processes time-series data, or a program that causes a computer to function as such a data processing device, the data processing device including: situation data acquisition means for acquiring situation data, which is time-series data representing a situation; operation data acquisition means for acquiring desired operation data, which is time-series data corresponding to an operation desired by a user; learning means for learning the dynamics of the situation data and the desired operation data; prediction means for obtaining, on the basis of the dynamics, a predicted value of the desired operation data with the situation data as input; and output means for outputting the predicted value of the desired operation data.

A data processing method according to one aspect of the present invention is a data processing method for a data processing device that processes time-series data, the method including the steps of: acquiring situation data, which is time-series data representing a situation, and acquiring desired operation data, which is time-series data corresponding to an operation desired by a user; learning the dynamics of the situation data and the desired operation data; obtaining, on the basis of the dynamics, a predicted value of the desired operation data with the situation data as input; and outputting the predicted value of the desired operation data.

In one aspect of the present invention, situation data, which is time-series data representing a situation, and desired operation data, which is time-series data corresponding to an operation desired by a user, are acquired, and the dynamics of the situation data and the desired operation data are learned. Then, on the basis of the dynamics, a predicted value of the desired operation data is obtained with the situation data as input, and the predicted value of the desired operation data is output.

The data processing device may be an independent device, or may be an internal block constituting a single device.

The program can be provided by being transmitted via a transmission medium or by being recorded on a recording medium.

According to one aspect of the present invention, an operation desired by a user can be performed in advance.

Embodiments of the present invention are described below. The correspondence between the constituent features of the present invention and the embodiments described in the specification or the drawings is, by way of example, as follows. This description is intended to confirm that embodiments supporting the present invention are described in the specification or the drawings. Accordingly, even if an embodiment is described in the specification or the drawings but is not described here as corresponding to a constituent feature of the present invention, this does not mean that the embodiment does not correspond to that constituent feature. Conversely, even if an embodiment is described here as corresponding to a constituent feature, this does not mean that the embodiment does not correspond to constituent features other than that one.

A data processing device or program according to one aspect of the present invention is
a data processing device that processes time-series data (for example, the data processing device of FIG. 1), or a program that causes a computer to function as such a data processing device, the data processing device including:
situation data acquisition means (for example, the situation data acquisition unit 101 in FIG. 1) for acquiring situation data, which is time-series data representing a situation;
operation data acquisition means (for example, the operation data acquisition unit 102 in FIG. 1) for acquiring desired operation data, which is time-series data corresponding to an operation desired by a user;
learning means (for example, the prediction learning unit 103 in FIG. 1) for learning the dynamics of the situation data and the desired operation data;
prediction means (for example, the prediction unit 105 in FIG. 1) for obtaining, on the basis of the dynamics, a predicted value of the desired operation data with the situation data as input; and
output means (for example, the operation data output unit 106 in FIG. 1) for outputting the predicted value of the desired operation data.

A data processing method according to one aspect of the present invention is
a data processing method for a data processing device that processes time-series data, the method including the steps of:
acquiring situation data, which is time-series data representing a situation, and acquiring desired operation data, which is time-series data corresponding to an operation desired by a user (for example, steps S101 and S102 in FIG. 3);
learning the dynamics of the situation data and the desired operation data (for example, step S104 in FIG. 3);
obtaining, on the basis of the dynamics, a predicted value of the desired operation data with the situation data as input (for example, step S113 in FIG. 4); and
outputting the predicted value of the desired operation data (for example, step S114 in FIG. 4).

Embodiments of the present invention are described below with reference to the drawings.

FIG. 1 is a block diagram showing a configuration example of one embodiment of a data processing device to which the present invention is applied.

The data processing device of FIG. 1 forms part of an electronic device that a user operates, such as a PC or a TV.

In FIG. 1, the data processing device is made up of a situation data acquisition unit 101, an operation data acquisition unit 102, a prediction learning unit 103, a dynamics learning model storage unit 104, a prediction unit 105, and an operation data output unit 106.

The situation data acquisition unit 101 acquires situation data s^i_t, s^i_{t+1}, ..., s^i_{t+T}, which is time-series data representing a situation, and supplies it to the prediction learning unit 103 and the prediction unit 105.

Here, s^i_t denotes the sample value at time t of the i-th (i = 1, 2, ..., I) type of situation data.

The situation data acquisition unit 101 acquires situation data representing various situations on, for example, a PC or TV, such as the positions of the mouse cursor, icons, windows, buttons displayed in a GUI (Graphical User Interface), and the user's line of sight.

The operation data acquisition unit 102 acquires desired operation data a^j_t, a^j_{t+1}, ..., a^j_{t+T}, which is time-series data corresponding to the operation desired by the user, and supplies it to the prediction learning unit 103.

Here, a^j_t denotes the sample value at time t of the j-th (j = 1, 2, ..., J) type of desired operation data.

The operation data acquisition unit 102 acquires time-series data corresponding to user operations, such as the user's operation of keys on the keyboard of a PC or TV, the user's window selection operations, the user's operation of buttons displayed in a GUI, the user's operation of mouse buttons, and the user's operation of scroll bars. In other words, it acquires time-series data representing what state the user's operation is in (including the state in which no operation is being performed) as desired operation data representing the operation that the user desires at that time.

Which data are used as the situation data and the desired operation data can, for example, be set in the data processing device of FIG. 1 in advance, or can be set by the user.

When the user is asked to set which data are used as the situation data and the desired operation data, it suffices to display a list of data that can serve as situation data or desired operation data, such as the mouse cursor position and the user's key operations mentioned above, and to have the user select from that list the data to be used as the situation data and the data to be used as the desired operation data.

The prediction learning unit 103 learns the dynamics of the situation data from the situation data acquisition unit 101 and the desired operation data from the operation data acquisition unit 102, using the dynamics learning model stored in the dynamics learning model storage unit 104.

That is, the prediction learning unit 103 trains the dynamics learning model by updating the parameters of the dynamics learning model stored in the dynamics learning model storage unit 104, using the situation data from the situation data acquisition unit 101 and the desired operation data from the operation data acquisition unit 102. As a result, the dynamics learning model stored in the dynamics learning model storage unit 104 acquires the dynamics of the situation data from the situation data acquisition unit 101 and the desired operation data from the operation data acquisition unit 102.

The dynamics learning model storage unit 104 stores a dynamics learning model, which is a model capable of acquiring dynamics.

As the dynamics learning model, it is possible to adopt, for example, an NN (Neural Network) such as an RNN (Recurrent Neural Network), an FNN (Feed Forward Neural Network), or an RNN-PB (Recurrent Neural Net with Parametric Bias), an SVR (Support Vector Regression), or any other model capable of acquiring dynamics.

One model capable of acquiring a large number of dynamics is a dynamics storage network, which is made up of a plurality of nodes, each of which holds dynamics. The dynamics storage network is described later.

On the basis of the dynamics acquired by the dynamics learning model stored in the dynamics learning model storage unit 104, the prediction unit 105 takes the situation data s^i_t, s^i_{t+1}, ..., s^i_{t+T} supplied from the situation data acquisition unit 101 as input, obtains predicted values a'^j_t, a'^j_{t+1}, ..., a'^j_{t+T} of the desired operation data and, where necessary, predicted values s'^i_t, s'^i_{t+1}, ..., s'^i_{t+T} of the situation data, and supplies them to the operation data output unit 106.

When the prediction unit 105 obtains the predicted values s'^i_t, s'^i_{t+1}, ..., s'^i_{t+T} of the situation data, it also supplies the situation data (the true values) s^i_t, s^i_{t+1}, ..., s^i_{t+T} from the situation data acquisition unit 101 to the operation data output unit 106.

The operation data output unit 106 outputs the predicted values a'^j_t, a'^j_{t+1}, ..., a'^j_{t+T} of the desired operation data from the prediction unit 105 to an interface (module) that accepts operation data (data corresponding to, that is, representing, operations performed by the user) of the electronic device, such as a PC or TV, of which the data processing device of FIG. 1 forms a part.

When the operation data output unit 106 outputs the predicted values a'^j_t, a'^j_{t+1}, ..., a'^j_{t+T} of the desired operation data from the prediction unit 105 to the interface that accepts operation data of the PC, TV, or the like, the PC, TV, or the like performs processing in accordance with those predicted values.

When the prediction unit 105 supplies, in addition to the predicted values a'^j_t, a'^j_{t+1}, ..., a'^j_{t+T} of the desired operation data, the predicted values s'^i_t, s'^i_{t+1}, ..., s'^i_{t+T} and the true values s^i_t, s^i_{t+1}, ..., s^i_{t+T} of the situation data, the operation data output unit 106 obtains the prediction error e^i_t = |s^i_t - s'^i_t| of the predicted value of the situation data. The operation data output unit 106 then outputs the predicted values a'^j_t, a'^j_{t+1}, ..., a'^j_{t+T} of the desired operation data from the prediction unit 105 only when, for example, each of the prediction errors e^i_t, e^i_{t+1}, ..., e^i_{t+T}, or their total sum, is equal to or less than (or less than) a predetermined threshold.

In this case, when the prediction error is large, the operation data output unit 106 does not output the predicted values a'^j_t, a'^j_{t+1}, ..., a'^j_{t+T} of the desired operation data from the prediction unit 105, so the PC, TV, or the like does not perform processing in accordance with those predicted values. In other words, the PC, TV, or the like performs processing in accordance with the predicted values of the desired operation data only when the prediction error is small.

The control of the output of the desired operation data based on the prediction error, as described above, can be enabled or disabled in accordance with, for example, a user operation.
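The output control described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation in the specification: the function name `gate_operation_output`, its parameters, and the choice to return `None` when the output is suppressed are all hypothetical.

```python
def gate_operation_output(true_states, predicted_states, predicted_ops,
                          threshold, use_sum=False, gating_enabled=True):
    """Return the predicted operation data only if the situation prediction is good.

    true_states / predicted_states: samples s_t ... s_{t+T} and s'_t ... s'_{t+T}
    of one type of situation data. All names here are illustrative assumptions.
    """
    if not gating_enabled:
        # The user has disabled error-based control: always pass the prediction on.
        return list(predicted_ops)

    # Prediction error e_t = |s_t - s'_t| for every time step.
    errors = [abs(s - p) for s, p in zip(true_states, predicted_states)]

    if use_sum:
        acceptable = sum(errors) <= threshold             # test the total sum
    else:
        acceptable = all(e <= threshold for e in errors)  # test each error

    # None stands for "nothing is output to the operation interface".
    return list(predicted_ops) if acceptable else None
```

With `use_sum=False` every individual error must stay within the threshold; with `use_sum=True` only the total is tested, which tolerates one larger error as long as the rest are small.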

Next, the situation data acquired by the situation data acquisition unit 101 of FIG. 1 and the desired operation data acquired by the operation data acquisition unit 102 are described further with reference to FIG. 2.

FIG. 2A shows the trajectory of the mouse cursor position on the screen of a PC and the state in which the icon at the upper left of the PC screen has been clicked.

In FIG. 2A, the user moves the mouse cursor from the lower right of the PC screen toward the upper left and clicks the icon at the upper left of the PC screen.

FIG. 2B shows the situation data acquired by the situation data acquisition unit 101 when the operation of FIG. 2A is performed.

In FIG. 2B, the horizontal axis represents time t, and the vertical axis represents the coordinates (x, y) in an xy coordinate system whose origin is the lower-left point of the screen of FIG. 2A.

As shown in FIG. 2B, the situation data acquisition unit 101 acquires the coordinates (mouse_x_t, mouse_y_t) representing the trajectory of the mouse cursor position as the i-th type of situation data s^i_t, s^i_{t+1}, ..., s^i_{t+T}, and acquires the coordinates (icon_x_t, icon_y_t) representing the position of the icon at the upper left of the screen of FIG. 2A as the i'-th type of situation data s^{i'}_t, s^{i'}_{t+1}, ..., s^{i'}_{t+T}.

FIG. 2C shows the desired operation data acquired by the operation data acquisition unit 102 when the operation of FIG. 2A is performed.

In FIG. 2C, the horizontal axis represents time t, and the vertical axis represents the off and on states of the mouse button.

As shown in FIG. 2C, the operation data acquisition unit 102 acquires time-series data representing the on or off operation state of the mouse button as the j-th desired operation data a^j_t, a^j_{t+1}, ..., a^j_{t+T}.
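To make the shape of the data in FIG. 2 concrete, the following sketch builds a small synthetic example of the two time series. The screen coordinates, the straight-line cursor path, the number of samples, and the function name `build_example_series` are all assumptions made for illustration; the specification does not give numeric values.

```python
def build_example_series(num_steps=5):
    """Build synthetic situation data and desired operation data for FIG. 2.

    Coordinates use an origin at the lower-left corner of the screen, as in
    FIG. 2B. The concrete numbers are illustrative assumptions only.
    """
    icon = (10.0, 90.0)    # icon at the upper left of the screen
    start = (90.0, 10.0)   # cursor starts at the lower right

    situation = []   # one (mouse_x, mouse_y, icon_x, icon_y) sample per time step
    operation = []   # mouse button state per time step: 0 = off, 1 = on
    for t in range(num_steps):
        frac = t / (num_steps - 1)
        mouse_x = start[0] + frac * (icon[0] - start[0])
        mouse_y = start[1] + frac * (icon[1] - start[1])
        situation.append((mouse_x, mouse_y, icon[0], icon[1]))
        # The button is pressed only once the cursor has reached the icon.
        operation.append(1 if t == num_steps - 1 else 0)
    return situation, operation
```

Each element of `situation` combines two types of situation data (cursor position and icon position), and `operation` is one type of desired operation data that stays off until the click at the end of the movement.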

The data processing device of FIG. 1 performs two kinds of processing: a learning process, in which the dynamics learning model is trained by updating the parameters of the dynamics learning model stored in the dynamics learning model storage unit 104 using the situation data and the desired operation data, and a prediction process, in which the dynamics learning model that has acquired the dynamics through the learning process is used to obtain predicted values of the desired operation data (and also predicted values of the situation data) with the situation data as input.

FIG. 3 is a flowchart illustrating the learning process performed by the data processing device of FIG. 1.

The learning process is started, for example, periodically or at irregular timing. In step S101, the situation data acquisition unit 101 acquires situation data and supplies it to the prediction learning unit 103, and the process proceeds to step S102.

In step S102, the operation data acquisition unit 102 acquires desired operation data and supplies it to the prediction learning unit 103, and the process proceeds to step S103.

In step S103, the prediction learning unit 103 reads the parameters of the dynamics learning model stored in the dynamics learning model storage unit 104, and the process proceeds to step S104.

In step S104, the prediction learning unit 103 updates the parameters of the dynamics learning model read from the dynamics learning model storage unit 104 in step S103, using the situation data from the situation data acquisition unit 101 and the desired operation data from the operation data acquisition unit 102, and the process proceeds to step S105.

That is, suppose, for example, that a three-layer NN having an input layer, a hidden layer (intermediate layer), and an output layer, each made up of an arbitrary number of units corresponding to neurons, is adopted as the dynamics learning model. The prediction learning unit 103 then inputs, to the units of the input layer, the situation data s^i_t at time t from the situation data acquisition unit 101 and the desired operation data a^i_t at time t from the operation data acquisition unit 102 as the input data (vector) X_t at time t.

As a result, the units of the hidden layer perform a weighted addition on the input data X_t input to the input layer, using the connection weights that connect the units (neurons) to one another, then evaluate a nonlinear function with the result of the weighted addition as its argument, and output the result to the units of the output layer.

The units of the output layer output, as the output data for the input data X_t, the predicted value X'_{t+1} of the input data X_{t+1} at the next time t+1, that is, in this case, the predicted value s'^i_{t+1} of the situation data at time t+1 and the predicted value a'^i_{t+1} of the desired operation data at time t+1.
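The forward computation just described can be sketched as below. This is only an illustration under assumptions: the specification does not name the nonlinear function, so `tanh` is used here as a stand-in, bias terms are omitted for brevity, and the function name `predict_next` and the weight layout (one row of weights per unit) are hypothetical.

```python
import math

def predict_next(situation_t, operation_t, w_hidden, w_output):
    """One forward pass of a three-layer NN: X_t -> X'_{t+1}.

    w_hidden: one row of input weights per hidden unit.
    w_output: one row of hidden-layer weights per output unit.
    tanh is an assumed nonlinearity; the source only says "nonlinear function".
    """
    # Input vector X_t = (situation data at time t, desired operation data at time t).
    x_t = list(situation_t) + list(operation_t)

    # Hidden layer: weighted addition followed by the nonlinear function.
    hidden = [math.tanh(sum(w * x for w, x in zip(row, x_t))) for row in w_hidden]

    # Output layer: same kind of computation applied to the hidden outputs.
    x_next = [math.tanh(sum(w * h for w, h in zip(row, hidden))) for row in w_output]

    # Split X'_{t+1} back into predicted situation data and predicted operation data.
    n_s = len(situation_t)
    return x_next[:n_s], x_next[n_s:]
```

Because the output has the same layout as the input, the predicted vector can later be compared with the true X_{t+1} during learning or fed back in during prediction.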

In accordance with, for example, the BP (back-propagation) method, the prediction learning unit 103 repeats, until the NN parameters converge, a calculation that updates the connection weights (the NN parameters) by an amount corresponding to the prediction error, so as to reduce the prediction error of the predicted value X'_{t+1}, which is output for the input data X_t, with respect to the true value (the input data X_{t+1} at time t+1).

The NN parameters are updated using the parameters read from the dynamics learning model storage unit 104 in step S103 as their initial values.

By learning in this way to predict the time-series data X_{t+1} at the next time t+1 from the time-series data X_t at time t (prediction learning), the NN serving as the dynamics learning model can learn the time evolution law of the time-series data and acquire its dynamics.
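Prediction learning with back-propagation can be sketched as follows. The sketch rests on several assumptions that are not in the specification: `tanh` units in both layers, a squared-error loss, per-sample gradient updates with a fixed learning rate, a fixed number of epochs in place of a convergence test, and no bias terms. All function names are hypothetical.

```python
import math
import random

def init_weights(n_in, n_hidden, n_out, seed=0):
    """Small random initial weights (used when no stored parameters exist yet)."""
    rng = random.Random(seed)
    w_hid = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    w_out = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
    return w_hid, w_out

def forward(x, w_hid, w_out):
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w_hid]
    y = [math.tanh(sum(w * hj for w, hj in zip(row, h))) for row in w_out]
    return h, y

def total_error(series, w_hid, w_out):
    """Sum of squared prediction errors of X'_{t+1} against the true X_{t+1}."""
    err = 0.0
    for x, target in zip(series[:-1], series[1:]):
        _, y = forward(x, w_hid, w_out)
        err += 0.5 * sum((yk - tk) ** 2 for yk, tk in zip(y, target))
    return err

def train(series, w_hid, w_out, lr=0.1, epochs=500):
    """Update the weights in place so that X_t predicts X_{t+1}."""
    for _ in range(epochs):
        for x, target in zip(series[:-1], series[1:]):
            h, y = forward(x, w_hid, w_out)
            # Error signal at the output units (derivative of tanh is 1 - y^2).
            d_out = [(yk - tk) * (1.0 - yk * yk) for yk, tk in zip(y, target)]
            # Back-propagate through the output weights before they change.
            d_hid = [
                sum(d_out[k] * w_out[k][j] for k in range(len(w_out)))
                * (1.0 - h[j] * h[j])
                for j in range(len(h))
            ]
            for k, row in enumerate(w_out):
                for j in range(len(row)):
                    row[j] -= lr * d_out[k] * h[j]
            for j, row in enumerate(w_hid):
                for i in range(len(row)):
                    row[i] -= lr * d_hid[j] * x[i]
    return w_hid, w_out
```

Each element of `series` is one input vector X_t (situation data followed by desired operation data), scaled into the range of `tanh`. In the device described here, the initial weights would come from the stored model parameters rather than from `init_weights`.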

In step S105, the prediction learning unit 103 writes (saves) the parameters of the dynamics learning model updated in step S104 to the dynamics learning model storage unit 104, overwriting the previous parameters, and the process ends.
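The flow of steps S101 to S105 can be summarized in a short sketch. The function `run_learning_process`, the dictionary used as a stand-in for the dynamics learning model storage unit 104, and the injected callables are hypothetical; the actual parameter update (for example, back-propagation) is passed in rather than fixed here.

```python
def run_learning_process(acquire_situation, acquire_operation, store, update_parameters):
    """One pass of the learning process of FIG. 3 (steps S101 to S105).

    store is a dict standing in for the model storage unit; the three
    callables are assumptions that stand in for the units of FIG. 1.
    """
    situation = acquire_situation()            # S101: acquire situation data
    operation = acquire_operation()            # S102: acquire desired operation data
    params = store["params"]                   # S103: read the stored parameters
    new_params = update_parameters(params, situation, operation)  # S104: update
    store["params"] = new_params               # S105: overwrite the stored parameters
    return new_params
```

Because the stored parameters are read at the start and overwritten at the end, each run of the learning process continues from the result of the previous run.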

FIG. 4 is a flowchart illustrating the prediction process performed by the data processing device of FIG. 1.

The prediction process is started, for example, periodically or at irregular timing. In step S111, the situation data acquisition unit 101 acquires situation data and supplies it to the prediction unit 105, and the process proceeds to step S112.

In step S112, the prediction unit 105 reads the parameters of the dynamics learning model stored in the dynamics learning model storage unit 104, and the process proceeds to step S113.

In step S113, on the basis of the dynamics acquired by the dynamics learning model stored in the dynamics learning model storage unit 104, the prediction unit 105 obtains predicted values of the desired operation data with the situation data supplied from the situation data acquisition unit 101 as input, and supplies them to the operation data output unit 106, and the process proceeds to step S114.

すなわち、例えば、いま、ダイナミクス学習モデルが、図３で説明した３層型NNであるとすると、予測部１０５は、入力層のユニットに、状況データ取得部１０１からの時刻tの状況データsⁱ _tと、時刻tの所望操作データaⁱ _tとしての、例えば、乱数やあらかじめ決められた値等とを、時刻tの入力データX_tとして入力する。 That is, for example, if the dynamics learning model is the three-layer NN described with reference to FIG. 3, the prediction unit 105 sends the status data s ^{i at} the time t from the status data acquisition unit 101 to the unit of the input layer. and _t, as desired operation data a ⁱ _t at time t, for example, the random number and a predetermined value such as, inputs as input data X _t at time t.

これにより、隠れ層のユニットでは、入力層に入力される入力データX_tを対象として、ダイナミクス学習モデルのパラメータとしての結合重みを用いた重み付け加算が行われ、さらに、その重み付け加算の結果を引数とする非線形関数の演算が行われて、その演算結果が、出力層のユニットに出力される。 Thus, the units of the hidden layer, as the target input data X _t which is input to the input layer, is performed weighting addition using a link weight as the parameter dynamics learning model, further argument the result of the weighted addition And the result of the calculation is output to the output layer unit.

The units of the output layer take the results from the hidden layer units as input and perform the same kind of computation as the hidden layer units. The result is output as the predicted value X'_{t+1} of the input data X_{t+1} at time t+1, the time following that of the input data X_t; that is, as the predicted value s'^i_{t+1} of the situation data at time t+1 and the predicted value a'^i_{t+1} of the desired operation data at time t+1.

The predicted value a'^i_{t+1} of the desired operation data at time t+1, output from the units of the output layer, is then supplied from the prediction unit 105 to the operation data output unit 106.
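
The forward computation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `forward` and `predict_next` are hypothetical, tanh stands in for the unspecified nonlinear function, and bias terms are omitted. The description does not fix any of these choices.

```python
import math

def forward(x, w_hidden, w_output):
    """One forward pass through a three-layer network (input, hidden, output).

    w_hidden and w_output are lists of weight rows, one row per unit.
    tanh is an assumed nonlinearity; biases are left out for brevity.
    """
    # Hidden layer: weighted addition of the inputs, then a nonlinear function.
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    # Output layer: the same kind of computation applied to the hidden results.
    return [math.tanh(sum(w * h for w, h in zip(row, hidden))) for row in w_output]

def predict_next(s_t, a_t, w_hidden, w_output):
    """Map X_t = (s_t, a_t) to the predictions (s'_{t+1}, a'_{t+1})."""
    x_t = list(s_t) + list(a_t)          # input data X_t
    x_next = forward(x_t, w_hidden, w_output)  # predicted X'_{t+1}
    n_state = len(s_t)
    return x_next[:n_state], x_next[n_state:]
```

Here the output layer is assumed to have as many units as the input layer, so that the prediction X'_{t+1} splits into a situation part and an operation part in the same way as X_t.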

Note that after the situation data s^i_t at time t from the situation data acquisition unit 101 has been input to the units of the input layer, the situation data s^i_{t+1} at the next time t+1 from the situation data acquisition unit 101 is input. At that point, the predicted value a'^i_{t+1} of the desired operation data at time t+1 output from the units of the output layer is used, for example, as the desired operation data a^i_t at time t (that is, it is input to the units of the input layer).

In the case described above, the prediction unit 105 uses the dynamics learning model to obtain, for the input data X_t at time t, the predicted value X'_{t+1} of the input data X_{t+1} at the next time t+1. By repeatedly feeding a predicted value obtained with the dynamics learning model back in as the input of the dynamics learning model, however, the prediction unit 105 can also obtain predicted values X'_{t+2}, X'_{t+3}, ... for times further in the future than t+1, in addition to the predicted value X'_{t+1}. The same applies to the prediction unit 165 (FIG. 9) described later.
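
The two feedback schemes described above (reusing the predicted operation together with a newly observed situation, and feeding the whole prediction back to look further ahead) might be sketched as follows. The function `rollout` and its parameters are hypothetical names; `model` is assumed to be any callable that maps an input vector X_t to a prediction X'_{t+1} of the same length.

```python
def rollout(model, observed_states, initial_action, extra_steps):
    """Return the list of predicted operation vectors.

    observed_states: situation vectors s_t, s_{t+1}, ... actually acquired.
    initial_action:  placeholder operation for the first step (for example a
                     random or predetermined value).
    extra_steps:     how many further steps to predict without observations.
    """
    if not observed_states:
        raise ValueError("at least one observed state is required")
    n_state = len(observed_states[0])
    action = list(initial_action)
    predicted_actions = []
    x_pred = None
    # Phase 1: pair each observed situation with the previously predicted operation.
    for state in observed_states:
        x_pred = model(list(state) + action)
        action = list(x_pred[n_state:])
        predicted_actions.append(action)
    # Phase 2: no new observations, so the whole prediction becomes the next input.
    for _ in range(extra_steps):
        x_pred = model(list(x_pred))
        predicted_actions.append(list(x_pred[n_state:]))
    return predicted_actions
```

In the second phase the predicted situation is fed back along with the predicted operation, so errors can accumulate as the horizon grows; how far ahead to predict is left open here.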

In step S114, the operation data output unit 106 outputs the predicted values a'^j_t, a'^j_{t+1}, ..., a'^j_{t+T} of the desired operation data from the prediction unit 105 to the interface that accepts operation data in the electronic device, such as a PC or TV, of which the data processing device of FIG. 1 forms a part, and the process ends.

As a result, the PC, TV, or other device performs processing in accordance with the predicted values a'^j_t, a'^j_{t+1}, ..., a'^j_{t+T} of the desired operation data.

Note that the prediction process of FIG. 4 can be performed while the learning process of FIG. 3 is not being performed, or in parallel with the learning process of FIG. 3.

As described above, the situation data acquisition unit 101 acquires situation data, which is time-series data representing a situation, and the operation data acquisition unit 102 acquires desired operation data, which is time-series data corresponding to an operation desired by the user. The prediction learning unit 103 learns the dynamics of the situation data and the desired operation data; the prediction unit 105 obtains predicted values of the desired operation data based on those dynamics, with the situation data as input; and the operation data output unit 106 outputs the predicted values of the desired operation data. Consequently, for each user, the operation that the user is about to perform, that is, the operation the user desires, can be performed in advance.

That is, suppose for example that the user moves the mouse cursor from the lower right of the PC screen toward the upper left and clicks the icon at the upper left of the PC screen, as shown in FIG. 2A. In that case, as shown in FIG. 2B, the situation data acquisition unit 101 acquires the coordinates (mouse_x_t, mouse_y_t) representing the trajectory of the mouse cursor position as the i-th type of situation data s^i_t, s^i_{t+1}, ..., s^i_{t+T}, and acquires the coordinates (icon_x_t, icon_y_t) representing the position of the icon at the upper left of the screen in FIG. 2A as the i'-th type of situation data s^{i'}_t, s^{i'}_{t+1}, ..., s^{i'}_{t+T}.

Further, as shown in FIG. 2C, the operation data acquisition unit 102 acquires operation data representing the operation of turning on, at a given timing, a mouse button that had been off, as the j-th desired operation data a^j_t, a^j_{t+1}, ..., a^j_{t+T}.

In this case, the prediction learning unit 103 trains the dynamics learning model using the time-series data of the coordinates (mouse_x_t, mouse_y_t) representing the trajectory of the mouse cursor moving from the lower right of the PC screen toward the upper left, the time-series data of the coordinates (icon_x_t, icon_y_t) representing the position of the icon at the upper left of the PC screen, and the time-series data representing the operation of turning on, at a given timing, a mouse button that had been off.

The dynamics learning model thereby acquires the dynamics of the time-series data of the coordinates (mouse_x_t, mouse_y_t) representing the trajectory of the mouse cursor moving from the lower right of the PC screen toward the upper left, of the time-series data of the coordinates (icon_x_t, icon_y_t) representing the position of the icon at the upper left of the PC screen, and of the time-series data representing the operation of turning on, at a given timing, a mouse button that had been off.
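
As a rough illustration of the data used in this example, the three time series can be combined into per-time-step input vectors X_t and paired with their successors X_{t+1} as training targets. The function name `make_training_pairs`, the vector layout, and the 0/1 encoding of the mouse button are assumptions made for this sketch; the training algorithm itself is not shown.

```python
def make_training_pairs(mouse_xy, icon_xy, button):
    """Build (X_t, X_{t+1}) pairs from three aligned time series.

    mouse_xy: list of (mouse_x_t, mouse_y_t) cursor coordinates.
    icon_xy:  list of (icon_x_t, icon_y_t) icon coordinates.
    button:   list of button states (assumed encoding: 0 = off, 1 = on).
    """
    # One vector per time step: situation data first, operation data last.
    frames = [
        [mx, my, ix, iy, b]
        for (mx, my), (ix, iy), b in zip(mouse_xy, icon_xy, button)
    ]
    # Each frame is paired with the frame that follows it.
    return list(zip(frames[:-1], frames[1:]))
```

For instance, a cursor moving from (9, 9) through (5, 5) to (1, 1) toward an icon at (0, 0), with the button turned on only at the last step, yields two pairs, the second of which maps `[5, 5, 0, 0, 0]` to `[1, 1, 0, 0, 1]`.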

Thereafter, when the situation data acquisition unit 101 acquires the time-series data of the coordinates (mouse_x_t, mouse_y_t) representing the trajectory of the mouse cursor moving from the lower right of the PC screen toward the upper left, and the time-series data of the coordinates (icon_x_t, icon_y_t) representing the position of the icon at the upper left of the PC screen, and supplies them to the prediction unit 105, the prediction unit 105 obtains predicted values of the desired operation data (and also predicted values of the situation data) having the dynamics acquired by the dynamics learning model. In other words, it obtains predicted values of the time-series data representing the operation of turning on, at a given timing, a mouse button that had been off, and these are output to the interface of the PC via the operation data output unit 106.

Therefore, when an icon is present at the upper left of the PC screen and the mouse cursor traces a trajectory from the lower right of the PC screen toward the upper left, predicted values of the desired operation data corresponding to the operation of clicking the icon at the upper left of the PC screen are obtained. The PC thus performs the processing that follows from that operation ahead of time, before the user actually clicks the icon at the upper left of the PC screen.

Note that a dynamics learning model capable of acquiring dynamics, such as an NN or SVR, has a so-called generalization capability. Even when given time-series data that does not match the time-series data used in the learning process, a dynamics learning model with this capability outputs predicted values of reasonably high accuracy, provided that the dynamics of the given time-series data are similar to those of the time-series data used in the learning process.

FIGS. 5 and 6 show other examples of situation data and desired operation data.

That is, FIG. 5 shows the trajectory of the line of sight of a user reading an article displayed in a web browser on the PC screen, and a state in which, after that trajectory was traced, an operation to scroll the web browser (its display) was performed.

In FIG. 5, suppose that the situation data acquisition unit 101 acquires the coordinates representing the trajectory of the user's line of sight as the i-th type of situation data s^i_t, s^i_{t+1}, ..., s^i_{t+T}, and that the operation data acquisition unit 102 acquires the operation data corresponding to the operation of scrolling the web browser as the j-th desired operation data a^j_t, a^j_{t+1}, ..., a^j_{t+T}. The prediction learning unit 103 then trains the dynamics learning model using the time-series data of the coordinates representing the trajectory of the user's line of sight and the time-series data representing the operation of scrolling the web browser.

The dynamics learning model thereby acquires the dynamics of the time-series data of the coordinates representing the trajectory of the user's line of sight and of the time-series data representing the operation of scrolling the web browser.

Thereafter, when the situation data acquisition unit 101 acquires time-series data of coordinates representing a trajectory of the user's line of sight such as that shown in FIG. 5 and supplies it to the prediction unit 105, the prediction unit 105 obtains predicted values of the desired operation data (and also predicted values of the situation data) having the dynamics acquired by the dynamics learning model. In other words, it obtains predicted values of the time-series data representing the operation of scrolling the web browser, and these are output to the interface of the PC via the operation data output unit 106.

Therefore, when the user's line of sight on the web browser traces a trajectory such as that shown in FIG. 5, predicted values of the time-series data representing the operation of scrolling the web browser are obtained. The PC thus performs the processing that follows from that operation, namely scrolling the web browser, ahead of time, before the user actually performs the scrolling operation.

FIG. 6 shows the screen (display screen) of a TV.

That is, in FIG. 6 the TV screen is divided left and right into a main screen and a sub screen. The main screen displays the image of the program selected by the user, and the sub screen displays, arranged vertically, four reduced images obtained by reducing the images of four programs being broadcast on other channels at the same time.

FIG. 6 further shows the trajectory of the user's line of sight going back and forth between the image of the program on the main screen and the reduced image of the second program from the top of the sub screen, and shows that, after that trajectory was traced, an operation was performed to select the program whose reduced image is displayed second from the top of the sub screen.

In FIG. 6, suppose that the situation data acquisition unit 101 acquires the coordinates representing the trajectory of the user's line of sight as the i-th type of situation data s^i_t, s^i_{t+1}, ..., s^i_{t+T}, and that the operation data acquisition unit 102 acquires the operation data corresponding to the operation of selecting the program whose reduced image is displayed second from the top of the sub screen as the j-th desired operation data a^j_t, a^j_{t+1}, ..., a^j_{t+T}. The prediction learning unit 103 then trains the dynamics learning model using the time-series data of the coordinates representing the trajectory of the user's line of sight and the time-series data representing the operation of selecting the program whose reduced image is displayed second from the top of the sub screen.

The dynamics learning model thereby acquires the dynamics of the time-series data of the coordinates representing the trajectory of the user's line of sight (the trajectory going back and forth between the image of the program on the main screen and the reduced image of the second program from the top of the sub screen) and of the time-series data representing the operation of selecting the program whose reduced image is displayed second from the top of the sub screen (time-series data representing the program selection operation and the absence of any selection operation before and after it).

Thereafter, when the situation data acquisition unit 101 acquires time-series data of coordinates representing a trajectory of the user's line of sight such as that shown in FIG. 6 and supplies it to the prediction unit 105, the prediction unit 105 obtains predicted values of the desired operation data (and also predicted values of the situation data) having the dynamics acquired by the dynamics learning model. In other words, it obtains predicted values of the time-series data representing the operation of selecting the program whose reduced image is displayed second from the top of the sub screen, and these are output to the interface of the TV via the operation data output unit 106.

Therefore, when the user's line of sight on the screen traces a trajectory such as that shown in FIG. 6, predicted values of the time-series data representing the operation of selecting the program whose reduced image is displayed second from the top of the sub screen are obtained. The TV thus performs the processing that follows from that operation, namely displaying on the main screen the image of the program whose reduced image is displayed second from the top of the sub screen, ahead of time, before the user actually performs the selection operation.

Note that in the prediction process of FIG. 4, in addition to obtaining the predicted values a'^j_t, a'^j_{t+1}, ..., a'^j_{t+T} of the desired operation data with the situation data s^i_t, s^i_{t+1}, ..., s^i_{t+T} supplied from the situation data acquisition unit 101 as input, as described above, the prediction unit 105 (FIG. 1) can also obtain the predicted values s'^j_t, s'^j_{t+1}, ..., s'^j_{t+T} of the situation data and supply them to the operation data output unit 106 together with the situation data (the true values) s^i_t, s^i_{t+1}, ..., s^i_{t+T} supplied from the situation data acquisition unit 101.

In this case, the operation data output unit 106 uses the predicted values s'^j_t, s'^j_{t+1}, ..., s'^j_{t+T} of the situation data from the prediction unit 105 and the true values s^i_t, s^i_{t+1}, ..., s^i_{t+T} of the situation data to obtain the prediction error e^i_t = |s^i - s'^j| of the predicted values of the situation data. It outputs the predicted values a'^j_t, a'^j_{t+1}, ..., a'^j_{t+T} of the desired operation data from the prediction unit 105 only when each of the prediction errors e^j_t, e^j_{t+1}, ..., e^j_{t+T}, or the sum of all of them, is less than or equal to a predetermined threshold.

This makes it possible, when the predicted values s'^j_t, s'^j_{t+1}, ..., s'^j_{t+T} of the situation data are unreliable, and hence the predicted values a'^j_t, a'^j_{t+1}, ..., a'^j_{t+T} of the desired operation data are also unreliable, to prevent processing from being performed in accordance with those unreliable predicted values of the desired operation data, that is, to prevent processing that the user does not intend.
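
The gating described above might be sketched as follows. The function name `gate_predicted_actions` and its parameters are hypothetical, and each situation value is assumed to be a scalar compared with the prediction for the same time step; for vector-valued situation data a norm would take the place of the absolute value.

```python
def gate_predicted_actions(s_true, s_pred, a_pred, threshold, use_sum=False):
    """Return a_pred only when the situation predictions look reliable.

    s_true, s_pred: true and predicted situation values over t .. t+T.
    a_pred:         predicted desired operation data over the same span.
    use_sum:        if True, compare the sum of all errors with the threshold;
                    otherwise every individual error must pass.
    """
    # Prediction error per time step: |true value - predicted value|.
    errors = [abs(s - p) for s, p in zip(s_true, s_pred)]
    if use_sum:
        reliable = sum(errors) <= threshold
    else:
        reliable = all(e <= threshold for e in errors)
    # Withhold the predicted operations when the predictions are unreliable.
    return a_pred if reliable else None
```

Returning `None` here stands for "output nothing", so the device simply waits for the user's own operation.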

Next, FIG. 7 is a block diagram showing a configuration example of another embodiment of a data processing device to which the present invention is applied.

The data processing device of FIG. 7 forms, for example, part of a game device (or of game software).

Here, the game device of which the data processing device of FIG. 7 forms a part is assumed to be a game device for a game, such as soccer or baseball, in which the user operates a plurality of characters.

In FIG. 7, the data processing device comprises a teaching character selection unit 151 and N character motion assist modules 152_1, 152_2, ..., 152_N, the same number as the N (plural) characters operated by the user.

The teaching character selection unit 151 selects, for example in response to a user operation, the character whose motion is to be learned from among the N characters, and controls the character motion assist module 152_n responsible for assisting that character's motion so that it performs the learning process.

As described above, N character motion assist modules 152_1 to 152_N are provided, the same number as the N characters operated by the user. The character motion assist module 152_n is responsible for assisting the motion of the n-th of the N characters.

That is, the character motion assist module 152_n learns the motion of the n-th character and assists the motion of the n-th character in accordance with the learning result.

FIG. 8 shows a game screen displayed by the game device of which the data processing device of FIG. 7 forms a part.

That is, FIG. 8 shows a game screen of a soccer game.

In FIG. 8, the characters displayed as soccer players are four teammate characters FC#1, FC#2, FC#3, and FC#4, which are some of the characters on the user's team, and three enemy characters RC#1, RC#2, and RC#3, which are some of the characters on the enemy team controlled by the game device.

FIG. 8 also shows a soccer ball. During the game, as a rule, for example, the teammate character FC#n nearest the ball moves in response to the user's operations, and the other teammate characters FC#n' are moved by the game device.

In FIG. 8, of the teammate characters FC#1 to FC#4, the teammate character FC#1 is nearest the ball. The teammate character FC#1 therefore moves in response to the user's operations, and the other teammate characters FC#2 to FC#4 move under the control of the game device.
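
The control rule described above (the teammate nearest the ball follows the user, the rest follow the game device) can be illustrated with a short sketch. The function name `split_control`, the use of two-dimensional coordinates, and the Euclidean distance are assumptions; the description only says "nearest".

```python
import math

def split_control(teammate_positions, ball_position):
    """Return (user_controlled, device_controlled) teammate names.

    teammate_positions: dict mapping a character name to its (x, y) position.
    ball_position:      (x, y) position of the ball.
    """
    # The teammate with the smallest distance to the ball follows the user.
    nearest = min(
        teammate_positions,
        key=lambda name: math.dist(teammate_positions[name], ball_position),
    )
    # Every other teammate is moved by the game device.
    others = [name for name in teammate_positions if name != nearest]
    return nearest, others
```

With the ball at (0, 0) and FC#1 at (1, 1) while the other teammates stand farther away, this returns FC#1 as the user-controlled character, matching the situation in FIG. 8.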

How the game device moves the other teammate characters FC#n' is programmed in advance in the game software. Therefore, in FIG. 8 for example, even if the user, while operating the teammate character FC#1, wants each of the other teammate characters FC#2 to FC#4 to move as indicated by the arrows, they will not move that way unless the game software has been programmed accordingly.

The character motion assist module 152_n therefore learns the motion of the n-th character and assists the motion of the n-th character in accordance with the learning result, thereby causing the n-th character to perform the motion the user desires (the motion corresponding to the operation the user would have performed had the user been operating that character).

FIG. 9 shows a configuration example of the character motion assist module 152_n of FIG. 7.

In FIG. 9, the character motion assist module 152_n comprises a situation data acquisition unit 161, an operation data acquisition unit 162, a prediction learning unit 163, a dynamics learning model storage unit 164, a prediction unit 165, and an operation data output unit 166.

Like the situation data acquisition unit 101 of FIG. 1, the situation data acquisition unit 161 acquires situation data s^i_t, s^i_{t+1}, ..., s^i_{t+T}, which is time-series data representing a situation, and supplies it to the prediction learning unit 163 and the prediction unit 165.

The situation data acquisition unit 161, however, acquires situation data representing various situations in the soccer game, such as the positions of the teammate characters FC#1 to FC#4, the position of the ball, the positions of the goals, and the positions of the enemy characters RC#1 to RC#3.

Like the operation data acquisition unit 102 of FIG. 1, the operation data acquisition unit 162 acquires desired operation data a^j_t, a^j_{t+1}, ..., a^j_{t+T}, which is time-series data corresponding to an operation desired by the user, and supplies it to the prediction learning unit 163.

The operation data acquisition unit 162, however, acquires desired operation data representing operations that cause the n-th character to perform the motion the user desires at that moment, such as a user operation that moves the n-th character (a teammate character) handled by the character motion assist module 152_n of FIG. 9, or a user operation that makes the n-th character shoot, pass, or dribble.

Like the prediction learning unit 103 of FIG. 1, the prediction learning unit 163 learns the dynamics of the situation data from the situation data acquisition unit 161 and of the desired operation data from the operation data acquisition unit 162, using the dynamics learning model stored in the dynamics learning model storage unit 164.

Like the dynamics learning model storage unit 104 of FIG. 1, the dynamics learning model storage unit 164 stores a dynamics learning model.

Like the prediction unit 105 of FIG. 1, the prediction unit 165 takes the situation data s^i_t, s^i_{t+1}, ..., s^i_{t+T} supplied from the situation data acquisition unit 161 as input and, based on the dynamics acquired by the dynamics learning model stored in the dynamics learning model storage unit 164, obtains the predicted values a'^j_t, a'^j_{t+1}, ..., a'^j_{t+T} of the desired operation data and, as necessary, the predicted values s'^j_t, s'^j_{t+1}, ..., s'^j_{t+T} of the situation data, and supplies them to the operation data output unit 166.

When the prediction unit 165 obtains the predicted values s'^j_t, s'^j_{t+1}, ..., s'^j_{t+T} of the situation data, it also supplies the situation data (the true values) s^i_t, s^i_{t+1}, ..., s^i_{t+T} from the situation data acquisition unit 161 to the operation data output unit 166.

Like the operation data output unit 106 of FIG. 1, the operation data output unit 166 outputs the predicted values a'^j_t, a'^j_{t+1}, ..., a'^j_{t+T} of the desired operation data from the prediction unit 165 to the interface (module) that accepts operation data in the game device of which the data processing device of FIG. 7 forms a part.

When the operation data output unit 166 outputs the predicted values a'^j_t, a'^j_{t+1}, ..., a'^j_{t+T} of the desired operation data from the prediction unit 165 to the interface of the game device that accepts operation data, the game device performs processing that moves the n-th character in accordance with those predicted values.

Note that, like the operation data output unit 106 of FIG. 1, the operation data output unit 166 can obtain the prediction error e^i_t = |s^i - s'^j| of the predicted values of the situation data and output the predicted values a'^j_t, a'^j_{t+1}, ..., a'^j_{t+T} of the desired operation data from the prediction unit 165 only when each of the prediction errors e^j_t, e^j_{t+1}, ..., e^j_{t+T}, or the sum of all of them, is less than or equal to (or less than) a predetermined threshold.

Next, the situation data acquired by the situation data acquisition unit 161 of FIG. 9 and the desired operation data acquired by the operation data acquisition unit 162 will be described further with reference to FIG. 10.

FIG. 10A shows a game screen in the learning mode.

That is, the game device of which the data processing device of FIG. 7 forms a part has two operation modes: a normal mode, in which a soccer match is played against an enemy team controlled by the game device (or by another user), and a learning mode, in which the motions of the characters on the user's team are learned.

FIG. 10A shows the game screen when the operation mode is the learning mode. The operation mode is switched, for example, in response to a user operation.

図１０Ａは、図８の場合と同様のゲーム画面を示している。 FIG. 10A shows the same game screen as in FIG. 8.

したがって、図１０Ａのゲーム画面では、ユーザのチームの一部のキャラクタである４人の味方キャラクタFC#1ないしFC#4と、ゲーム装置が操る敵のチームの一部のキャラクタである３人の敵キャラクタRC#1ないしRC#3が存在する状況であって、かつ、味方キャラクタFC#1ないしFC#4のうちの、味方キャラクタFC#1が、ボールに最も近い位置に存在する状況になっている。 Accordingly, the game screen of FIG. 10A shows a situation in which four teammate characters FC#1 to FC#4, which are some of the characters of the user's team, and three enemy characters RC#1 to RC#3, which are some of the characters of the enemy team controlled by the game device, are present, and in which, among the teammate characters FC#1 to FC#4, the teammate character FC#1 is located closest to the ball.

図１０Ａの状況において、ユーザは、ボールに最も近い味方キャラクタFC#1以外の他の味方キャラクタFC#2ないしFC#4それぞれを、矢印で示すように移動させたい場合、例えば、味方キャラクタFC#2を、動作の学習対象のキャラクタとして選択し、その味方キャラクタFC#2を、矢印で示すように移動させる操作を行う。 In the situation of FIG. 10A, when the user wants to move each of the teammate characters FC#2 to FC#4, other than the teammate character FC#1 closest to the ball, as indicated by the arrows, the user selects, for example, the teammate character FC#2 as the character whose movement is to be learned, and performs an operation of moving the teammate character FC#2 as indicated by the arrow.

この場合、教示キャラクタ選択部１５１（図７）は、味方キャラクタFC#2の動作の補助を担当する、例えば、2番目のキャラクタ動作補助モジュール１５２₂を、学習処理を行うように制御する。キャラクタ動作補助モジュール１５２₂は、教示キャラクタ選択部１５１の制御に従い、状況データ取得部１６１において、状況データを取得するとともに、操作データ取得部１６２において、所望操作データ（ここでは、味方キャラクタFC#2を、図１０Ａに矢印で示すように移動させる操作に対応する操作データ）を取得し、その状況データ及び所望操作データのダイナミクスを学習する学習処理を行う。 In this case, the teaching character selection unit 151 (FIG. 7) controls the module responsible for assisting the movement of the teammate character FC#2, for example, the second character motion assist module 152₂, so that it performs the learning process. Under the control of the teaching character selection unit 151, the character motion assist module 152₂ acquires situation data in the situation data acquisition unit 161, acquires desired operation data (here, operation data corresponding to the operation of moving the teammate character FC#2 as indicated by the arrow in FIG. 10A) in the operation data acquisition unit 162, and performs a learning process of learning the dynamics of the situation data and the desired operation data.

図１０Ｂは、図１０Ａのゲーム画面において、ユーザが、味方キャラクタFC#2を、矢印で示すように移動させる操作を行った場合に、味方キャラクタFC#2の動作の補助を担当するキャラクタ動作補助モジュール１５２₂の状況データ取得部１６１が取得する状況データを示している。 FIG. 10B shows the situation data acquired by the situation data acquisition unit 161 of the character motion assist module 152₂, which is responsible for assisting the movement of the teammate character FC#2, when the user performs the operation of moving the teammate character FC#2 as indicated by the arrow on the game screen of FIG. 10A.

なお、図１０Ｂにおいて、横軸は、時刻tを表し、縦軸は、図１０Ａのゲーム画面の左下の点を原点とするxy座標系における座標(x,y)を表している。 In FIG. 10B, the horizontal axis represents time t, and the vertical axis represents coordinates (x, y) in the xy coordinate system with the lower left point of the game screen of FIG. 10A as the origin.

状況データ取得部１６１は、例えば、図１０Ｂに示すように、ボールの位置の軌跡を表す座標(ball_x_t,ball_y_t)を、i番目の種類の状況データsⁱ _t,sⁱ _t+1,・・・,sⁱ _t+Tとして取得する。 For example, as shown in FIG. 10B, the situation data acquisition unit 161 acquires the coordinates (ball_x_t, ball_y_t) representing the trajectory of the position of the ball as the i-th type of situation data s^i_t, s^i_{t+1}, ..., s^i_{t+T}.

図１０Ｃは、図１０Ａのゲーム画面において、ユーザが、味方キャラクタFC#2を、矢印で示すように移動させる操作を行った場合に、味方キャラクタFC#2の動作の補助を担当するキャラクタ動作補助モジュール１５２₂の操作データ取得部１６２が取得する所望操作データを示している。 FIG. 10C shows the desired operation data acquired by the operation data acquisition unit 162 of the character motion assist module 152₂, which is responsible for assisting the movement of the teammate character FC#2, when the user performs the operation of moving the teammate character FC#2 as indicated by the arrow on the game screen of FIG. 10A.

なお、図１０Ｃにおいて、横軸は、時刻tを表し、縦軸は、動作の学習対象のキャラクタである味方キャラクタFC#2の図１０Ａのゲーム画面の左下の点を原点とするxy座標系における座標(x,y)を表している。 In FIG. 10C, the horizontal axis represents time t, and the vertical axis represents the coordinates (x, y) of the teammate character FC#2, which is the character whose movement is to be learned, in the xy coordinate system whose origin is the lower left point of the game screen of FIG. 10A.

操作データ取得部１６２は、図１０Ｃに示すように、ユーザの操作に従って移動する、動作の学習対象のキャラクタである味方キャラクタFC#2の位置の軌跡を表す座標(p_move_x_t,p_move_y_t)を、j番目の所望操作データa^j _t,a^j _t+1,・・・,a^j _t+Tとして取得する。 As shown in FIG. 10C, the operation data acquisition unit 162 acquires the coordinates (p_move_x_t, p_move_y_t), which represent the trajectory of the position of the teammate character FC#2 (the character whose movement is to be learned) as it moves according to the user's operation, as the j-th desired operation data a^j_t, a^j_{t+1}, ..., a^j_{t+T}.
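As a rough illustration of how the two time series of FIGS. 10B and 10C might be held together for learning, the following Python sketch pairs the ball trajectory with the character trajectory. The class name `Episode` and its `samples` method are hypothetical names introduced for this example only; the patent does not prescribe a data layout.

```python
from dataclasses import dataclass
from typing import List, Tuple

# (x, y) with the origin at the lower left point of the game screen.
Point = Tuple[float, float]


@dataclass
class Episode:
    situation: List[Point]  # ball positions (ball_x_t, ball_y_t) for t .. t+T
    operation: List[Point]  # character positions (p_move_x_t, p_move_y_t) for t .. t+T

    def __post_init__(self):
        # Both series are sampled at the same time steps.
        if len(self.situation) != len(self.operation):
            raise ValueError("situation and operation must cover the same time steps")

    def samples(self):
        """Return one (ball_x, ball_y, p_move_x, p_move_y) tuple per time step."""
        return [s + a for s, a in zip(self.situation, self.operation)]
```

For example, `Episode([(0.0, 0.0), (1.0, 0.5)], [(2.0, 2.0), (2.5, 2.0)]).samples()` yields one four-component vector per time step, which is the kind of joint sequence a dynamics learning model could be trained on.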

以下、同様に、ユーザは、味方キャラクタFC#3及びFC#4それぞれを、動作の学習対象のキャラクタとして選択し、矢印で示すように移動させる操作を行う。これにより、味方キャラクタFC#3及びFC#4の動作の補助を担当する、例えば、3番目のキャラクタ動作補助モジュール１５２₃、及び４番目のキャラクタ動作補助モジュール１５２₄それぞれでは、キャラクタ動作補助モジュール１５２₂の場合と同様に、状況データと所望操作データとが取得される。 Thereafter, in the same manner, the user selects each of the teammate characters FC#3 and FC#4 as the character whose movement is to be learned and performs the operation of moving it as indicated by the arrow. As a result, the modules responsible for assisting the movements of the teammate characters FC#3 and FC#4, for example, the third character motion assist module 152₃ and the fourth character motion assist module 152₄, each acquire situation data and desired operation data in the same way as the character motion assist module 152₂.

次に、図７のデータ処理装置の動作について説明する。 Next, the operation of the data processing apparatus in FIG. 7 will be described.

図７のデータ処理装置では、図１のデータ処理装置と同様に、キャラクタ動作補助モジュール１５２_n（図９）において、ダイナミクス学習モデル記憶部１６４に記憶されたダイナミクス学習モデルのパラメータを、状況データ及び所望操作データを用いて更新する、ダイナミクス学習モデルの学習の処理（学習処理）と、学習処理によってダイナミクスを獲得したダイナミクス学習モデルを用い、状況データを入力として、所望操作データの予測値（さらには、状況データの予測値）を求める予測の処理（予測処理）とが行われる。 In the data processing device of FIG. 7, as in the data processing device of FIG. 1, the character motion assist module 152_n (FIG. 9) performs a learning process for the dynamics learning model, in which the parameters of the dynamics learning model stored in the dynamics learning model storage unit 164 are updated using the situation data and the desired operation data, and a prediction process, in which the dynamics learning model that has acquired dynamics through the learning process is used to obtain, with the situation data as input, predicted values of the desired operation data (and also predicted values of the situation data).

図１１は、図７のデータ処理装置で行われる学習処理を説明するフローチャートである。 FIG. 11 is a flowchart illustrating the learning process performed by the data processing device of FIG. 7.

学習処理は、例えば、動作モードが学習モードとされたときに行われる。 The learning process is performed, for example, when the operation mode is set to the learning mode.

教示キャラクタ選択部１５１は、ステップＳ１５１において、ユーザが、動作を学習させるキャラクタを選択する操作を行うのを待って、そのキャラクタの動作の補助を担当するキャラクタ動作補助モジュール１５２_nを、学習を行う学習対象モジュールとして選択し、処理は、ステップＳ１５２に進む。 In step S151, the teaching character selection unit 151 waits for the user to perform an operation of selecting the character whose movement is to be learned, selects the character motion assist module 152_n responsible for assisting the movement of that character as the learning target module that performs learning, and the process proceeds to step S152.

ステップＳ１５２では、学習対象モジュールとして選択されたキャラクタ動作補助モジュール１５２_n（図９）において、状況データ取得部１６１が、状況データを取得し、予測学習部１６３に供給して、処理は、ステップＳ１５３に進む。 In step S152, in the character motion assist module 152_n (FIG. 9) selected as the learning target module, the situation data acquisition unit 161 acquires situation data and supplies it to the prediction learning unit 163, and the process proceeds to step S153.

ステップＳ１５３では、操作データ取得部１６２が、所望操作データを取得し、予測学習部１６３に供給して、処理は、ステップＳ１５４に進む。 In step S153, the operation data acquisition unit 162 acquires the desired operation data, supplies it to the prediction learning unit 163, and the process proceeds to step S154.

ステップＳ１５４では、予測学習部１６３が、ダイナミクス学習モデル記憶部１６４に記憶されたダイナミクス学習モデルのパラメータを読み出し、処理は、ステップＳ１５５に進む。 In step S154, the prediction learning unit 163 reads the parameters of the dynamics learning model stored in the dynamics learning model storage unit 164, and the process proceeds to step S155.

ステップＳ１５５では、予測学習部１６３が、ステップＳ１５４でダイナミクス学習モデル記憶部１６４から読み出したダイナミクス学習モデルのパラメータを、状況データ取得部１６１からの状況データ、及び、操作データ取得部１６２からの所望操作データを用いて更新して、処理は、ステップＳ１５６に進む。 In step S155, the prediction learning unit 163 updates the parameters of the dynamics learning model read from the dynamics learning model storage unit 164 in step S154, using the situation data from the situation data acquisition unit 161 and the desired operation data from the operation data acquisition unit 162, and the process proceeds to step S156.

ステップＳ１５６では、予測学習部１６３は、ステップＳ１５５で更新したダイナミクス学習モデルのパラメータを、ダイナミクス学習モデル記憶部１６４に上書きの形で書き込み、処理は終了する。 In step S156, the prediction learning unit 163 overwrites the dynamics learning model storage unit 164 with the parameters of the dynamics learning model updated in step S155, and the process ends.
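The read, update, and overwrite cycle of steps S154 to S156 can be sketched as below. To keep the example short, a toy linear model a = w * s + b with a least-mean-squares update is swapped in for the dynamics learning model; the patent itself uses models such as an RNN or a dynamics storage network. The names `learning_step`, `store`, and `lr` are hypothetical.

```python
def learning_step(store, situation, operation, lr=0.1):
    """One pass through steps S154-S156 for a toy model a = w * s + b.

    store:     dict holding the model parameters (stands in for the
               dynamics learning model storage unit).
    situation: scalar situation values acquired in step S152.
    operation: scalar desired operation values acquired in step S153.
    """
    # S154: read the current parameters from storage.
    w, b = store["w"], store["b"]
    # S155: update the parameters using the acquired data
    # (one sequential least-mean-squares sweep; an assumption of this sketch).
    for s, a in zip(situation, operation):
        err = (w * s + b) - a
        w -= lr * err * s
        b -= lr * err
    # S156: overwrite the stored parameters with the updated ones.
    store["w"], store["b"] = w, b
    return store
```

Calling `learning_step` repeatedly on data generated by a = 2s + 1 moves the stored parameters toward w = 2 and b = 1, which mirrors how repeated teaching in the learning mode refines the stored model.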

図１２は、図７のデータ処理装置で行われる予測処理を説明するフローチャートである。 FIG. 12 is a flowchart illustrating the prediction process performed by the data processing device of FIG. 7.

予測処理は、例えば、動作モードが通常モードとされたときに、N個のキャラクタ動作補助モジュール１５２₁ないし１５２_Nそれぞれ（但し、ユーザが操作しているキャラクタを担当するキャラクタ動作補助モジュール１５２_n’は、除外することができる）において行われる。 The prediction process is performed, for example, when the operation mode is set to the normal mode, in each of the N character motion assist modules 152₁ to 152_N (although the character motion assist module 152_n' responsible for the character that the user is operating can be excluded).

キャラクタ動作補助モジュール１５２_nでは、ステップＳ１６１において、状況データ取得部１６１が、状況データを取得し、予測部１６５に供給して、処理は、ステップＳ１６２に進む。 In the character motion assist module 152_n, in step S161, the situation data acquisition unit 161 acquires situation data and supplies it to the prediction unit 165, and the process proceeds to step S162.

ステップＳ１６２では、予測部１６５が、ダイナミクス学習モデル記憶部１６４に記憶されたダイナミクス学習モデルのパラメータを読み出し、処理は、ステップＳ１６３に進む。 In step S162, the prediction unit 165 reads the parameters of the dynamics learning model stored in the dynamics learning model storage unit 164, and the process proceeds to step S163.

ステップＳ１６３では、予測部１６５は、ダイナミクス学習モデル記憶部１６４に記憶されたダイナミクス学習モデルが獲得したダイナミクスに基づき、状況データ取得部１６１から供給される状況データを入力として、所望操作データの予測値を求め、操作データ出力部１６６に供給して、処理は、ステップＳ１６４に進む。 In step S163, based on the dynamics acquired by the dynamics learning model stored in the dynamics learning model storage unit 164, the prediction unit 165 obtains predicted values of the desired operation data with the situation data supplied from the situation data acquisition unit 161 as input, supplies them to the operation data output unit 166, and the process proceeds to step S164.

ステップＳ１６４では、操作データ出力部１６６が、予測部１６５からの所望操作データの予測値を、図７のデータ処理装置が一部を構成しているゲーム装置の操作データを受け付けるインタフェースに出力し、処理は終了する。 In step S164, the operation data output unit 166 outputs the predicted values of the desired operation data from the prediction unit 165 to the interface that receives operation data of the game device of which the data processing device of FIG. 7 forms a part, and the process ends.

これにより、ゲーム装置では、キャラクタ動作補助モジュール１５２_nが担当するキャラクタを、所望操作データの予測値に従って動作させる処理が行われる。 As a result, the game device performs a process of moving the character handled by the character motion assist module 152_n according to the predicted values of the desired operation data.
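The prediction flow of steps S161 to S164, run over all modules except the one for the character the user is operating, might look like the following sketch. The function `run_prediction`, the dictionary of per-character predictor callables, and the character identifiers are hypothetical stand-ins for the character motion assist modules.

```python
def run_prediction(modules, situation, user_controlled=None):
    """Return {character_id: predicted operation} for the assisted characters.

    modules:         maps a character id to a callable that turns situation
                     data into a predicted operation (stands in for one
                     character motion assist module with its learned model).
    situation:       situation data acquired in step S161.
    user_controlled: id of the character the user is operating; its module
                     is skipped, as the text allows.
    """
    outputs = {}
    for char_id, predict in modules.items():
        if char_id == user_controlled:
            continue  # the user moves this character directly
        # S162-S163: use the learned model to predict the desired operation.
        outputs[char_id] = predict(situation)
    # S164: the caller hands these values to the game's operation-data interface.
    return outputs
```

Keeping one predictor per character matches the one-module-per-character arrangement, so each teammate can reproduce the movement that was taught to it individually.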

以上のように、キャラクタ動作補助モジュール１５２_nでは、状況データ取得部１６１において、状況を表す時系列データである状況データを取得するとともに、操作データ取得部１６２において、ユーザが所望する操作に対応する時系列データである所望操作データを取得し、予測学習部１６３において、状況データ及び所望操作データのダイナミクスを学習し、予測部１６５において、ダイナミクスに基づき、状況データを入力として、所望操作データの予測値を求め、操作データ出力部１６６において、所望操作データの予測値を出力するので、ユーザごとに、ユーザが行いたい操作、つまり、ユーザが所望する操作を先取りして行うことができる。 As described above, in the character motion assist module 152_n, the situation data acquisition unit 161 acquires situation data, which is time-series data representing the situation, the operation data acquisition unit 162 acquires desired operation data, which is time-series data corresponding to the operation desired by the user, the prediction learning unit 163 learns the dynamics of the situation data and the desired operation data, the prediction unit 165 obtains predicted values of the desired operation data based on the dynamics with the situation data as input, and the operation data output unit 166 outputs the predicted values of the desired operation data. Therefore, for each user, the operation that the user wants to perform, that is, the operation desired by the user, can be performed in advance.

すなわち、例えば、図８や図１０に示したように、ユーザが、味方キャラクタFC#1を操作している最中に、他の味方キャラクタFC#2ないしFC#4それぞれを、矢印で示すように移動する操作を行いたいときに、そのように移動させることができる。 That is, for example, as shown in FIGS. 8 and 10, when the user, while operating the teammate character FC#1, wants to move each of the other teammate characters FC#2 to FC#4 as indicated by the arrows, those characters can be moved in that way.

なお、上述したように、操作データ出力部１６６において、状況データの予測値の予測誤差eⁱ _t=|sⁱ-s'^j|を求め、図１の操作データ出力部１０６と同様に、予測誤差e^j _t,e^j _t+1,・・・,e^j _t+Tに応じて、所望操作データの予測値a'^j _t,a'^j _t+1,・・・,a'^j _t+Tの出力を制御することにより、ユーザが意図しない処理が行われること、つまり、キャラクタが意図しない動作を行うことを防止することができる。 As described above, by having the operation data output unit 166 obtain the prediction error e^i_t = |s^i - s'^j| of the predicted value of the situation data and, like the operation data output unit 106 of FIG. 1, control the output of the predicted values a'^j_t, a'^j_{t+1}, ..., a'^j_{t+T} of the desired operation data according to the prediction errors e^j_t, e^j_{t+1}, ..., e^j_{t+T}, it is possible to prevent processing not intended by the user, that is, to prevent a character from performing an unintended movement.

次に、ダイナミクス学習モデルとしては、上述したように、多数のダイナミクスを保持することができるダイナミクス記憶ネットワークを採用することができる。 Next, as the dynamics learning model, as described above, a dynamics storage network that can hold a large number of dynamics can be employed.

そこで、以下では、例えば、自律型ロボット等の自律的に行動する自律エージェントへの適用を例に、ダイナミクス記憶ネットワークについて説明する。 Therefore, in the following, a dynamics storage network will be described taking an example of application to an autonomous agent that acts autonomously such as an autonomous robot.

自律型ロボット等の自律エージェントは、様々なセンサ信号に基づいて、どのように振る舞うべきか、つまり、とるべき行動を決定し、その行動に応じたモータ信号を生成することで、自律的に行動する。 An autonomous agent such as an autonomous robot acts autonomously by determining, on the basis of various sensor signals, how it should behave, that is, the action to take, and generating a motor signal corresponding to that action.

ここで、センサ信号とは、例えば、カメラが、センシングとしての撮像を行うことで出力する画像信号や、マイク（マイクロフォン）が、センシングとしての集音を行うことで出力する音声信号等である。また、モータ信号とは、例えば、自律エージェントの腕や脚等を駆動するモータに与えられる信号や、音声合成装置に対して与えられる、音声合成に必要な信号等である。 Here, the sensor signal is, for example, an image signal output when the camera performs imaging as sensing, an audio signal output when the microphone (microphone) collects sound as sensing, or the like. The motor signal is, for example, a signal given to a motor that drives an arm or leg of an autonomous agent, a signal necessary for speech synthesis, etc. given to a speech synthesizer.

自律エージェントは、とるべき行動を決定するときに、センサ信号に基づいて、周囲の状態（例えば、何らかの物体がある位置等）や、自律エージェントの状態（例えば、腕や脚の状態等）等の状況を認識する。この、状況を認識することを、以下、適宜、認知ともいう。 When deciding the action to take, the autonomous agent recognizes, based on the sensor signals, the situation, such as the state of its surroundings (for example, the position of some object) and the state of the autonomous agent itself (for example, the state of its arms and legs). Recognizing the situation in this way is hereinafter also referred to as cognition, as appropriate.

また、自律エージェントは、認知（認識）の結果に基づき、とるべき行動（動作）を決定し、その行動に応じたモータ信号を生成する。このモータ信号が、自律エージェントの腕や脚等を駆動するモータに与えられることで、自律エージェントは、腕や脚等を動かす行動をとる。 Further, the autonomous agent determines the action (motion) to take based on the result of cognition (recognition), and generates a motor signal corresponding to that action. When this motor signal is given to the motors that drive the arms, legs, and the like of the autonomous agent, the autonomous agent takes the action of moving its arms, legs, and the like.

ここで、以下、適宜、とるべき行動に応じたモータ信号を生成することを、単に、行動ともいう。 Here, the generation of a motor signal corresponding to an action to be taken as appropriate is hereinafter simply referred to as an action.

また、以下、適宜、状況を認識し、その認識結果に基づき、とるべき行動を決定して、その行動に応じたモータ信号を生成する認識生成、つまり、認知を行い、その認知の結果に基づき、行動することを、認知行動ともいい、認知行動をモデル化したモデルを、認知行動モデルという。 Also, hereinafter, recognition generation, in which the situation is recognized, the action to take is determined based on the recognition result, and a motor signal corresponding to that action is generated, that is, performing cognition and acting based on the result of that cognition, is also referred to as cognitive behavior as appropriate, and a model of cognitive behavior is referred to as a cognitive behavior model.

自律エージェントの認知行動は、時間発展法則により定められる力学系（dynamical systems）として記述することができ、様々な行動はその力学系が持つ特定のアトラクタダイナミクス（attractor dynamics）によって実現できることが知られている。例えば、人を模した二足型ロボットの歩行運動は、系の運動状態が様々な初期状態からある特定の周期軌道に落ち着くことを特徴とするリミットサイクルダイナミクス（limit cycle dynamics）として記述することができる。 The cognitive behavior of an autonomous agent can be described as a dynamical system defined by a law of time evolution, and it is known that various behaviors can be realized by specific attractor dynamics of that dynamical system. For example, the walking motion of a biped robot modeled on a human can be described as limit cycle dynamics, which is characterized by the motion state of the system settling from various initial states onto a specific periodic trajectory.

また、自律エージェントとしての、例えば、アームロボットがある対象物に対して手先を伸ばすようなリーチング運動は、様々な初期状態からある特定の固定点に落ち着くことを特徴とする不動点ダイナミクス（fixed-point dynamics）として記述することができる。さらに、全ての運動は、不動点ダイナミクスで実現可能な離散運動（discrete movement）とリミットサイクルダイナミクスで実現可能な周期運動（cyclic movement）の組み合わせにより実現できるとも言われている。 Also, a reaching motion of an autonomous agent, for example, an arm robot extending its hand toward an object, can be described as fixed-point dynamics, which is characterized by settling from various initial states to a specific fixed point. Furthermore, it is said that all movements can be realized by a combination of discrete movements, which can be realized by fixed-point dynamics, and cyclic movements, which can be realized by limit cycle dynamics.
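The two kinds of attractor dynamics mentioned above can be illustrated numerically. The sketch below uses textbook example systems chosen for this illustration, not taken from the patent: a first-order system that relaxes to a goal (fixed-point dynamics, like a reaching motion) and the Hopf normal form, whose trajectories settle onto a circle of radius 1 (limit cycle dynamics, like a periodic gait). Both are integrated with the forward Euler method, and all parameter values are arbitrary.

```python
import math


def simulate_fixed_point(x0, goal=1.0, k=2.0, dt=0.01, steps=2000):
    """Integrate dx/dt = -k (x - goal); x settles at goal from any start."""
    x = x0
    for _ in range(steps):
        x += dt * (-k * (x - goal))
    return x


def simulate_limit_cycle(x0, y0, omega=2.0, dt=0.001, steps=20000):
    """Integrate the Hopf normal form and return the final radius.

    dx/dt = (1 - r^2) x - omega y,  dy/dt = (1 - r^2) y + omega x,
    with r^2 = x^2 + y^2. Any nonzero start is drawn onto the circle
    r = 1, on which the state keeps rotating at angular velocity omega.
    """
    x, y = x0, y0
    for _ in range(steps):
        r2 = x * x + y * y
        dx = (1.0 - r2) * x - omega * y
        dy = (1.0 - r2) * y + omega * x
        x, y = x + dt * dx, y + dt * dy
    return math.hypot(x, y)
```

Starting the first system from different initial values, or the second from inside or outside the unit circle, leads to the same goal or the same orbit, which is the property the text describes as settling from various initial states.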

したがって、ダイナミクスを学習するダイナミクス学習モデルは、認知行動モデルとして利用することができる。 Therefore, a dynamics learning model for learning dynamics can be used as a cognitive behavior model.

ダイナミクス学習モデルの１つである、例えば、RNNは、ネットワークに回帰ループで結合されるコンテキストユニットを持ち、そこに内部状態を保持することによって、理論的には、任意の力学系を近似可能である。 For example, an RNN, which is one type of dynamics learning model, has context units connected to the network by a recurrent loop, and by holding an internal state there, it can in theory approximate an arbitrary dynamical system.

但し、１つのRNNでは、多数のダイナミクスを獲得（学習）することは、学習の収束性などから難しいことがある。 However, it may be difficult to acquire (learn) a large number of dynamics with one RNN because of the convergence of learning.

これに対して、ダイナミクス記憶ネットワークによれば、多数のダイナミクスを容易に獲得することができる。 On the other hand, according to the dynamics storage network, a large number of dynamics can be easily acquired.

そこで、図１３は、ダイナミクス記憶ネットワークによって、時系列データのダイナミクスを獲得する学習をし、その学習結果を用いて、時系列データの認識及び生成を行うデータ処理装置の構成例を示している。 FIG. 13 shows an example of the configuration of a data processing apparatus that learns to acquire dynamics of time-series data using a dynamics storage network, and recognizes and generates time-series data using the learning results.

図１３のデータ処理装置では、観測することができる観測信号が、信号入力部１１に入力される。観測信号は、例えば音や画像の信号、LED(Light Emitting Diode)の明るさ、モータの回転角度や回転角速度などであり、図１３のデータ処理装置が、例えば、自律エージェントの認知行動に利用されることとすると、その自律エージェントに対して入出力し得る信号が、観測信号となり得る。 In the data processing device of FIG. 13, an observation signal that can be observed is input to the signal input unit 11. The observation signal is, for example, a sound or image signal, the brightness of an LED (Light Emitting Diode), or the rotation angle or rotational angular velocity of a motor. If the data processing device of FIG. 13 is used, for example, for the cognitive behavior of an autonomous agent, any signal that can be input to or output from that autonomous agent can be an observation signal.

ここで、ダイナミクス学習モデルとして、ダイナミクス記憶ネットワークが採用される場合、状況データ、及び所望操作データが、観測信号に相当する。 Here, when the dynamics storage network is adopted as the dynamics learning model, the situation data and the desired operation data correspond to the observation signal.

信号入力部１１は、観測される観測信号に対応する電気信号を出力する。具体的には、信号入力部１１は、例えば、観測信号が音の信号の場合は、センサとしてのマイクに対応し、観測信号が画像信号の場合は、センサとしてのカメラに対応する。また、モータの回転角度や回転速度の計測装置なども、信号入力部１１に対応する。 The signal input unit 11 outputs an electrical signal corresponding to the observed signal. Specifically, for example, the signal input unit 11 corresponds to a microphone as a sensor when the observation signal is a sound signal, and corresponds to a camera as a sensor when the observation signal is an image signal. Further, a measuring device for the rotation angle and rotation speed of the motor also corresponds to the signal input unit 11.

ここで、以下、適宜、信号入力部１１に入力される信号も、信号入力部１１が出力する信号も、観測信号という。 Here, hereinafter, a signal input to the signal input unit 11 and a signal output from the signal input unit 11 are also referred to as observation signals as appropriate.

なお、観測信号は、時間的に定常的な定常信号であっても良いし、時間的に変化する（定常的でない）非定常信号であっても良い。 Note that the observation signal may be a stationary signal that is stationary in time, or an unsteady signal that changes in time (not stationary).

以下では、例えば、センサモータ信号を観測信号とする。センサモータ信号とは、例えば、図示せぬ自律型ロボットが有するカメラやマイクその他のセンサが出力するセンサ信号と、自律型ロボットの腕や脚等を駆動するモータに与えられるモータ信号とを、一定の時間間隔でサンプリングして得られる、同一のサンプル点（時刻）のサンプル値をコンポーネントとするベクトルの時系列である。 In the following, for example, a sensor motor signal is used as the observation signal. The sensor motor signal is, for example, a time series of vectors whose components are the sample values at the same sample point (time), obtained by sampling, at fixed time intervals, the sensor signals output by the camera, microphone, and other sensors of an autonomous robot (not shown) and the motor signals given to the motors that drive the arms, legs, and the like of the autonomous robot.

信号入力部１１は、時系列データである観測信号を、逐次、適当な長さに区切って出力する。すなわち、信号入力部１１は、観測信号としてのセンサモータ信号から、例えば、１００サンプル（点）を、１サンプルずつシフトしながら抽出し、その１００サンプルの時系列データを、特徴量抽出部１２に供給する。 The signal input unit 11 sequentially divides the observation signal, which is time-series data, into segments of an appropriate length and outputs them. That is, the signal input unit 11 extracts, for example, 100 samples (points) from the sensor motor signal serving as the observation signal while shifting by one sample at a time, and supplies the time-series data of those 100 samples to the feature amount extraction unit 12.
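The windowing described above (100 samples, shifted by one sample each time) can be sketched as a simple sliding window. The function name `sliding_windows` and its parameters are hypothetical; the defaults follow the example values in the text.

```python
def sliding_windows(signal, window=100, shift=1):
    """Split a time series into overlapping windows.

    signal: sequence of samples (scalars or sensor-motor vectors).
    window: number of samples per extracted segment.
    shift:  number of samples to advance between consecutive segments.
    Returns a list of segments; empty if the signal is shorter than window.
    """
    if window <= 0 or shift <= 0:
        raise ValueError("window and shift must be positive")
    last_start = len(signal) - window
    return [signal[i:i + window] for i in range(0, last_start + 1, shift)]
```

With a 103-sample signal and the defaults, this yields four segments starting at samples 0, 1, 2, and 3; as the text notes, the window length would in practice be tuned to the signal being observed.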

なお、センサモータ信号のサンプリングの時間間隔や、信号入力部１１がセンサモータ信号から抽出するサンプルの数（サンプル数）は、観測信号とするセンサモータ信号に応じて適切に調整される。 Note that the sampling time interval of the sensor motor signal and the number of samples (number of samples) extracted from the sensor motor signal by the signal input unit 11 are appropriately adjusted according to the sensor motor signal used as the observation signal.

特徴量抽出部１２は、信号入力部１１から供給される観測信号から特徴量を抽出し、その特徴量の時系列を、学習部１３、認識部１４、及び生成部１５に供給する。 The feature amount extraction unit 12 extracts a feature amount from the observation signal supplied from the signal input unit 11, and supplies a time series of the feature amount to the learning unit 13, the recognition unit 14, and the generation unit 15.

すなわち、観測信号が、例えば、音声信号である場合には、特徴量抽出部１２は、その音声信号の一定時間分ごとに、周波数分析その他の音響処理を施し、音声認識等で広く利用されている、例えば、メルケプストラムなどの音声の特徴量を抽出する。そして、特徴量抽出部１２は、観測信号から抽出した特徴量を、時系列に出力し、これにより、特徴量抽出部１２から学習部１３、認識部１４、及び生成部１５に対して、特徴量の時系列データが供給される。 That is, when the observation signal is, for example, an audio signal, the feature amount extraction unit 12 applies frequency analysis and other acoustic processing to each fixed-length portion of the audio signal, and extracts audio feature amounts that are widely used in speech recognition and the like, such as the mel cepstrum. The feature amount extraction unit 12 then outputs the feature amounts extracted from the observation signal in time series, so that time-series data of the feature amounts is supplied from the feature amount extraction unit 12 to the learning unit 13, the recognition unit 14, and the generation unit 15.

学習部１３は、特徴量抽出部１２からの時系列データに基づき、ダイナミクスを学習する学習処理を行い、認識部１４及び生成部１５は、特徴量抽出部１２からの時系列データに基づき、学習処理の結果を利用して、時系列データを認識する認識処理や、時系列データを生成する生成処理、時系列データを認識し、その認識結果に応じて、時系列データを生成する認識生成処理を行う。 The learning unit 13 performs a learning process of learning dynamics based on the time-series data from the feature amount extraction unit 12. Based on the time-series data from the feature amount extraction unit 12 and using the result of the learning process, the recognition unit 14 and the generation unit 15 perform a recognition process of recognizing time-series data, a generation process of generating time-series data, and a recognition generation process of recognizing time-series data and generating time-series data according to the recognition result.

すなわち、学習部１３は、特徴量抽出部１２からの時系列データに基づき、後述するネットワーク記憶部１６に記憶されたダイナミクス記憶ネットワークの各ダイナミクスを自己組織的に更新する学習処理を行う。 That is, based on the time-series data from the feature amount extraction unit 12, the learning unit 13 performs a learning process that updates, in a self-organizing manner, each of the dynamics of the dynamics storage network stored in the network storage unit 16 described later.

ここで、学習処理では、ダイナミクス記憶ネットワークのパラメータの更新が行われる。パラメータの更新は、学習とも呼ばれる。 Here, in the learning process, the parameters of the dynamics storage network are updated. The parameter update is also called learning.

学習部１３による学習処理の詳細は後述するが、学習処理では、基本的には、ラベル（正解ラベル）の付与されていない時系列データを、ダイナミクス記憶ネットワークに対して、繰り返し与えていく（供給していく）と、ダイナミクス記憶ネットワークが、その時系列データの中の特徴的なダイナミクスを自己組織的に獲得していく教師なし学習が実行される。その結果、ダイナミクス記憶ネットワークには、そこに与えられた時系列データの代表的なダイナミクスが記憶される。 The details of the learning process by the learning unit 13 will be described later, but basically the learning process is unsupervised learning: as time-series data to which no label (correct answer label) is attached is repeatedly given (supplied) to the dynamics storage network, the dynamics storage network acquires the characteristic dynamics in that time-series data in a self-organizing manner. As a result, the dynamics storage network stores the representative dynamics of the time-series data given to it.

ここで、ダイナミクス記憶ネットワークは、例えば、後述するように、力学系近似モデルの１つであるRNNによって、ダイナミクスを保持する。例えば、ある時刻tのデータの入力に対して、次の時刻t+1のデータを出力するRNN（の後述する入力層の入力ユニット）に対して入力される、ある時刻tのデータを、入力データというとともに、その時刻tのデータに対してRNNが出力する時刻t+1のデータを、出力データということとすると、学習部１３、認識部１４、及び生成部１５に対して、特徴量抽出部１２から供給される時系列データは、入力データである。 Here, as described later, the dynamics storage network holds dynamics by means of, for example, an RNN, which is one kind of dynamical system approximation model. Suppose, for example, that the data at a time t that is input to an RNN (to the input units of its input layer, described later) that outputs the data at the next time t+1 in response to the input of data at time t is called input data, and that the data at time t+1 that the RNN outputs for the data at time t is called output data. Then the time-series data supplied from the feature amount extraction unit 12 to the learning unit 13, the recognition unit 14, and the generation unit 15 is input data.

認識部１４は、入力データ、つまり、特徴量抽出部１２から供給される時系列データを認識の対象として、認識処理を行う。 The recognition unit 14 performs a recognition process using the input data, that is, the time-series data supplied from the feature amount extraction unit 12 as a recognition target.

すなわち、認識部１４は、ネットワーク記憶部１６のダイナミクス記憶ネットワークが記憶しているダイナミクスの中で、特徴量抽出部１２から供給される時系列データに最も適合するダイナミクスを決定し、そのダイナミクスを表す情報を、入力データとしての時系列データの認識結果として出力する。 That is, the recognizing unit 14 determines the dynamics most suitable for the time-series data supplied from the feature amount extracting unit 12 among the dynamics stored in the dynamics storage network of the network storage unit 16 and represents the dynamics. Information is output as a recognition result of time-series data as input data.

生成部１５には、特徴量抽出部１２から時系列データが供給される他、制御信号が供給される。生成部１５は、ネットワーク記憶部１６のダイナミクス記憶ネットワークが記憶しているダイナミクスの中から、時系列データの生成に用いるダイナミクスを、そこに供給される制御信号に従って決定し、そのダイナミクスを有する時系列データを、特徴量抽出部１２から供給される時系列データを必要に応じて用いて生成する生成処理を行う。 In addition to the time-series data supplied from the feature amount extraction unit 12, the generation unit 15 is also supplied with a control signal. The generation unit 15 determines the dynamics used for generating the time series data from the dynamics stored in the dynamics storage network of the network storage unit 16 according to the control signal supplied thereto, and the time series having the dynamics A generation process for generating data using the time-series data supplied from the feature amount extraction unit 12 as necessary is performed.

なお、生成部１５が生成処理を行うことによって得られる時系列データは、必要な処理が施されて出力される。 The time-series data obtained by the generation unit 15 performing the generation process is output after being subjected to necessary processes.

ネットワーク記憶部１６は、ダイナミクス記憶ネットワークを記憶する。 The network storage unit 16 stores a dynamics storage network.

ダイナミクス記憶ネットワークは、ダイナミクスを１つのノードに保持し、複数のノードによって構成される。 The dynamics storage network holds dynamics in one node and is composed of a plurality of nodes.

ここで、ダイナミクスは、時間変化する力学系を表すもので、例えば、具体的な関数によって表現することができる。ダイナミクス記憶ネットワークでは、時系列データの時間変化の特徴が、ダイナミクスとして記憶される。 Here, the dynamics represents a dynamic system that changes with time, and can be expressed by a specific function, for example. In the dynamics storage network, the temporal change characteristics of time-series data are stored as dynamics.

なお、本実施の形態では、ダイナミクス記憶ネットワークのノードにおいて、例えば、内部状態量を持つ力学系近似モデルによってモデル化されたダイナミクスを保持することとする。この場合、ダイナミクス記憶ネットワークは、内部状態量を持つ力学系近似モデルをノードとするネットワーク（内部状態量を持つ力学系近似モデルを保持（記憶）するノードによって構成されるネットワーク）である。 In this embodiment, each node of the dynamics storage network holds, for example, dynamics modeled by a dynamical system approximation model having an internal state quantity. In this case, the dynamics storage network is a network whose nodes are dynamical system approximation models having an internal state quantity (a network composed of nodes that hold (store) dynamical system approximation models having an internal state quantity).

ここで、内部状態量を持つ（力学系近似）モデルとは、例えば、ある入力があると、その入力に応じて出力をするモデルを考えた場合に、外部から観測することができる入力と出力とは別に、外部からは観測されない（できない）、モデルの内部の状態を表す内部状態量を有するモデルである。内部状態量を持つモデルでは、入力の他に、内部状態量をも用いて出力が求められるため、同一の入力があっても、内部状態量が異なると、異なる出力が得られる。 Here, a (dynamical system approximation) model having an internal state quantity is, when one considers a model that produces an output in response to a given input, a model that has, apart from the input and output that can be observed from outside, an internal state quantity that represents the internal state of the model and is not (cannot be) observed from outside. In a model having an internal state quantity, the output is obtained using the internal state quantity in addition to the input, so even for the same input, different outputs are obtained if the internal state quantity differs.
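A minimal Python illustration of this property follows: the same input produces different outputs because a hidden state carries over between calls. The class `StatefulModel` and its leaky-accumulator update are hypothetical and chosen only to make the point; they are not the model used in the patent.

```python
class StatefulModel:
    """Toy model whose output depends on the input and on a hidden state."""

    def __init__(self, decay=0.5):
        self.decay = decay
        # Internal state quantity: not part of the observable input or output.
        self.state = 0.0

    def step(self, x):
        """Consume one input value and return one output value."""
        # The new state mixes the previous state with the current input.
        self.state = self.decay * self.state + x
        return self.state
```

Feeding the input 1.0 twice returns 1.0 and then 1.5, because the second call starts from a different internal state even though the observable input is identical.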

内部状態記憶部１７は、ネットワーク記憶部１６に記憶されたダイナミクス記憶ネットワークの内部状態量を記憶する。内部状態記憶部１７に記憶された内部状態量は、認識生成処理において、時系列データの認識時に、適宜更新され、時系列データの生成時に、必要に応じて利用される。この認識生成処理によって、自律エージェントの認知行動を実現することができる。 The internal state storage unit 17 stores the internal state amount of the dynamics storage network stored in the network storage unit 16. The internal state quantity stored in the internal state storage unit 17 is appropriately updated when the time series data is recognized in the recognition generation process, and is used as necessary when generating the time series data. By this recognition generation process, the cognitive behavior of the autonomous agent can be realized.

次に、図１４は、図１３のネットワーク記憶部１６に記憶されたダイナミクス記憶ネットワークの例を、模式的に示している。 Next, FIG. 14 schematically shows an example of the dynamics storage network stored in the network storage unit 16 of FIG. 13.

ダイナミクス記憶ネットワークは、複数のノードとリンクによって構成される。 The dynamics storage network is composed of a plurality of nodes and links.

ノードは、ダイナミクスを保持（記憶）する。 The node holds (stores) dynamics.

リンクは、ノードどうしの間に結合関係を与える。 A link provides a coupling relationship between nodes.

図１４では、ダイナミクス記憶ネットワークは、９個のノードN₁ないしN₉を有し、各ノードN_i(i=1,2,・・・,9)には、９個のノードN₁ないしN₉が２次元の格子状に配置されるように、縦方向及び横方向に隣接するノードとの間にリンクが与えられている。 In FIG. 14, the dynamics storage network has nine nodes N₁ to N₉, and each node N_i (i = 1, 2, ..., 9) is given links to the nodes adjacent to it in the vertical and horizontal directions so that the nine nodes N₁ to N₉ are arranged in a two-dimensional grid.

すなわち、図１４では、リンクによって、９個のノードN₁ないしN₉に、２次元の配置構造が与えられている。 That is, in FIG. 14, the links give the nine nodes N₁ to N₉ a two-dimensional arrangement structure.

ここで、ダイナミクス記憶ネットワークにおいては、ノードN_iの配置構造に応じて、ノードN_iの位置を表す座標系を定義することができる。すなわち、例えば、図１４に示すように、２次元の配置構造のノードN_iについては、２次元座標系を定義し、その２次元座標系上の座標によって、ノードN_iの位置を表すことができる。 Here, in the dynamics storage network, a coordinate system representing the position of a node N_i can be defined according to the arrangement structure of the nodes N_i. That is, for example, as shown in FIG. 14, for nodes N_i in a two-dimensional arrangement structure, a two-dimensional coordinate system can be defined, and the position of a node N_i can be expressed by coordinates in that two-dimensional coordinate system.

例えば、いま、図１４のダイナミクス記憶ネットワークについて、左下のノードN₇の位置を原点(0,0)とするとともに、左から右方向をx軸とし、下から上方向をy軸とする２次元座標系を定義して、リンクの長さを0.5とすると、図１４のダイナミクス記憶ネットワークにおいて、例えば、右上のノードN₃の位置の座標は、(1,1)となる。 For example, for the dynamics storage network of FIG. 14, define a two-dimensional coordinate system whose origin (0,0) is the position of the lower-left node N₇, whose x axis runs from left to right, and whose y axis runs from bottom to top. If the link length is 0.5, the coordinates of the upper-right node N₃ are then (1,1).

また、ダイナミクス記憶ネットワークを構成する任意の２つのノードN_i及びN_jそれぞれが保持するダイナミクスどうしが類似している（近い）度合いを表す尺度として、ノードN_iとN_jとの間の距離を導入する。 In addition, the distance between nodes N_i and N_j is introduced as a measure of how similar (close) the dynamics held by any two nodes N_i and N_j of the dynamics storage network are.

いま、ノードN_iとN_jとの間の距離として、ノードN_iとN_jとの間のユークリッド距離を採用することとすると、例えば、左下のノードN₇と、右上のノードN₃との間の距離は、√((0-1)²+(0-1)²)=√2となる。 If the Euclidean distance between nodes N_i and N_j is adopted as the distance between them, then, for example, the distance between the lower-left node N₇ and the upper-right node N₃ is √((0-1)²+(0-1)²)=√2.
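
The coordinate and distance example above can be checked with a short sketch. It assumes the nine nodes are numbered row by row starting from the upper left (so that N₃ is the upper-right node and N₇ the lower-left node, as the text states); the function names `node_position` and `node_distance` are hypothetical and are not part of the described apparatus.

```python
import math

LINK_LENGTH = 0.5  # link length used in the FIG. 14 example


def node_position(i, cols=3, rows=3, link=LINK_LENGTH):
    """Return (x, y) of node N_i (1-based).

    Assumption: nodes are numbered row by row from the upper left,
    and the lower-left node sits at the origin.
    """
    row_from_top, col = divmod(i - 1, cols)
    x = col * link
    y = (rows - 1 - row_from_top) * link
    return (x, y)


def node_distance(i, j):
    """Euclidean distance between nodes N_i and N_j on the lattice."""
    (xi, yi), (xj, yj) = node_position(i), node_position(j)
    return math.hypot(xi - xj, yi - yj)


print(node_position(7))     # lower-left node
print(node_position(3))     # upper-right node
print(node_distance(7, 3))  # distance between them
```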

図１５は、図１３のネットワーク記憶部１６に記憶されたダイナミクス記憶ネットワークのノードN_iの構成例を模式的に示している。 FIG. 15 schematically shows a configuration example of a node N_i of the dynamics storage network stored in the network storage unit 16 of FIG. 13.

ノードN_iは、例えば、内部状態量を持ち、力学系を表すことができる力学系近似モデルを有する。 The node N_i has, for example, a dynamic system approximation model that has an internal state quantity and can represent a dynamic system.

図１５では、内部状態量を持つ力学系近似モデルとして、例えば、RNNが採用されている。 In FIG. 15, for example, RNN is adopted as a dynamic system approximation model having an internal state quantity.

図１５において、力学系近似モデルとしてのRNNは、３層型NNに、その出力層から入力層への回帰ループを持たせたものとなっており、その回帰ループによって、内部状態量が保持される。 In FIG. 15, the RNN serving as the dynamic system approximation model is a three-layer NN provided with a recurrent loop from its output layer to its input layer, and the internal state quantity is held by that recurrent loop.

すなわち、図１５において、力学系近似モデルとしてのRNNは、入力層、隠れ層（中間層）、及び出力層の３層により構成されている。入力層、隠れ層、及び出力層は、それぞれ任意の数の、ニューロンに相当するユニットにより構成される。 That is, in FIG. 15, the RNN as a dynamic system approximation model is composed of three layers: an input layer, a hidden layer (intermediate layer), and an output layer. Each of the input layer, the hidden layer, and the output layer is configured by an arbitrary number of units corresponding to neurons.

図１５において、入力層は、入力ユニット、及びコンテキストユニットを有する。 In FIG. 15, the input layer has an input unit and a context unit.

入力ユニットには、時刻tの入力データ（ベクトル）X_tとしての時系列データが入力される。 Time-series data is input to the input unit as the input data (vector) X_t at time t.

コンテキストユニットには、例えば、出力層の一部のユニットが出力するデータが、内部状態量であるコンテキストとしてフィードバックされる。すなわち、図１５のRNNでは、コンテキストユニットと、出力層の一部のユニットとが、回帰ループによって接続されており、コンテキストユニットには、出力層の一部のユニットが出力するデータが、回帰ループを介して、コンテキストとして入力される。 For example, data output by some of the units in the output layer is fed back to the context unit as a context, which is the internal state quantity. That is, in the RNN of FIG. 15, the context unit and some of the units in the output layer are connected by the recurrent loop, and the data output by those output-layer units is input to the context unit as the context via the recurrent loop.

ここで、時刻tの入力データX_tが入力ユニットに入力されるときに、コンテキストユニットに入力される時刻tのコンテキストC_tは、１時刻前の時刻t-1の入力データX_t-1に対して、出力層の一部のユニットが出力したデータである。したがって、時刻tの入力データX_tの入力に対して出力層の一部のユニットが出力したデータは、次の時刻t+1のコンテキストC_t+1となる。 Here, the context C_t at time t, which is input to the context unit when the input data X_t at time t is input to the input unit, is the data that some of the units in the output layer output in response to the input data X_{t-1} at time t-1, one time step earlier. Accordingly, the data that those output-layer units output in response to the input data X_t at time t becomes the context C_{t+1} at the next time t+1.

隠れ層のユニットは、入力層に入力される入力データX_t、及びコンテキストC_tを対象として、ニューロンとしてのユニットどうしを結合する結合重み（結合荷重）を用いた重み付け加算を行い、その重み付け加算の結果を引数とする非線形関数の演算を行って、その演算結果を、出力層のユニットに出力する。 The units of the hidden layer perform a weighted addition of the input data X_t and the context C_t input to the input layer, using the connection weights that couple the units serving as neurons, evaluate a nonlinear function whose argument is the result of that weighted addition, and output the result to the units of the output layer.

出力層の一部のユニットからは、上述したように、次の時刻t+1のコンテキストC_t+1となるデータが出力され、入力層のコンテキストユニットにフィードバックされる。また、出力層の残りのユニットからは、例えば、入力データX_tに対する出力データとして、その入力データX_tの次の時刻t+1の入力データX_t+1の予測値X'_t+1が出力される。 As described above, some of the units in the output layer output the data that becomes the context C_{t+1} at the next time t+1, and this data is fed back to the context unit in the input layer. The remaining units in the output layer output, for example, the predicted value X'_{t+1} of the input data X_{t+1} at the next time t+1 as the output data for the input data X_t.
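
One time step of the RNN of FIG. 15 can be sketched as follows. This is only an illustration under stated assumptions: the text does not name the nonlinear function, so `tanh` is assumed, it is also applied at the output layer, bias terms are omitted, and the weights are random. The names `make_rnn` and `rnn_step` are hypothetical.

```python
import math
import random


def make_rnn(n_in, n_ctx, n_hidden, seed=0):
    """Build random connection weights (assumed initialisation)."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {
        "w_hidden": matrix(n_hidden, n_in + n_ctx),  # input layer -> hidden layer
        "w_out": matrix(n_in + n_ctx, n_hidden),     # hidden layer -> output layer
        "n_in": n_in,
    }


def rnn_step(rnn, x_t, c_t):
    """Map (X_t, C_t) to (X'_{t+1}, C_{t+1})."""
    layer_in = list(x_t) + list(c_t)  # input units followed by context units
    hidden = [math.tanh(sum(w * v for w, v in zip(row, layer_in)))
              for row in rnn["w_hidden"]]
    out = [math.tanh(sum(w * h for w, h in zip(row, hidden)))
           for row in rnn["w_out"]]
    n_in = rnn["n_in"]
    # The first n_in output units give the prediction; the rest give the
    # context that is fed back to the context units at the next time step.
    return out[:n_in], out[n_in:]


rnn = make_rnn(n_in=2, n_ctx=3, n_hidden=4)
context = [0.0, 0.0, 0.0]
for x in ([0.1, -0.2], [0.2, -0.1], [0.3, 0.0]):
    prediction, context = rnn_step(rnn, x, context)
```

Carrying `context` from one call to the next is what plays the role of the recurrent loop in this sketch.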

以上のようなRNNにおいて、入力データとしての時系列データを、RNNの学習用の学習データとして用い、時刻tの時系列データX_tから、次の時刻t+1の時系列データX_t+1を予測することを学習（予測学習(prediction learning)）することにより、学習データ（時系列データ）の時間発展法則を学習することができる。 In an RNN of this kind, the time-series data given as input data is used as learning data for training the RNN. By learning to predict the time-series data X_{t+1} at the next time t+1 from the time-series data X_t at time t (prediction learning), the RNN can learn the time evolution law of the learning data (time-series data).

ここで、RNNのような内部状態量を持つ力学系近似モデルのパラメータを求める学習の方法としては、例えば、BPTT(Back-Propagation Through Time)法を採用することができる。BPTT法については、例えば、D. E. Rumelhart, G. E. Hinton & R. E. Williams, 1986 "Learning internal representations by error propagation", In D. E. Rumelhart & J. McClelland, "Parallel distributed processing, pp. 318-364, Cambridge, MA: MIT Pressや、R. J. Williams and D. Zipser, "A learning algorithm for continually running fully recurrent neural networks", Neural Computation, 1:270-280, 1989等に記載されている。 Here, for example, the BPTT (Back-Propagation Through Time) method can be adopted as a learning method for obtaining the parameters of a dynamic system approximation model having an internal state quantity, such as an RNN. The BPTT method is described in, for example, D. E. Rumelhart, G. E. Hinton & R. E. Williams, 1986, "Learning internal representations by error propagation", in D. E. Rumelhart & J. McClelland, "Parallel distributed processing", pp. 318-364, Cambridge, MA: MIT Press, and in R. J. Williams and D. Zipser, "A learning algorithm for continually running fully recurrent neural networks", Neural Computation, 1:270-280, 1989.

学習部１３は、力学系近似モデルとしてのRNNが保持するダイナミクスが、ダイナミクス記憶ネットワークの学習に用いられる時系列データである学習データの影響を受けるように、RNNのパラメータである結合重みを更新するRNNの学習を行う。 The learning unit 13 trains the RNN by updating the connection weights, which are the parameters of the RNN, so that the dynamics held by the RNN serving as the dynamic system approximation model are influenced by the learning data, that is, the time-series data used for learning of the dynamics storage network.

なお、学習部１３は、力学系近似モデルが保持するダイナミクスが、学習データの影響を受ける度合いを強くしたり弱くしたりするための調整機能を有している。 The learning unit 13 has an adjustment function for increasing or decreasing the degree to which the dynamics held by the dynamical approximate model is affected by the learning data.

すなわち、ダイナミクス記憶ネットワークの学習では、学習データが入力されるたびに、ダイナミクス記憶ネットワークを構成するノードが有する力学系近似モデルのパラメータが少しずつ更新される。このパラメータの更新時に、学習部１３は、ノードごとに、そのノードが保持するダイナミクスを更新する程度、つまり、学習データを、ノードが保持するダイナミクスに影響させる程度を表す学習重みを決定する。 That is, in learning of the dynamics storage network, every time learning data is input, the parameters of the dynamic system approximation model of the nodes constituting the dynamics storage network are updated little by little. At the time of updating this parameter, the learning unit 13 determines, for each node, a learning weight that represents the degree to which the dynamics held by the node are updated, that is, the degree to which the learning data affects the dynamics held by the node.

学習部１３は、学習重みに応じて、ノードが保持するダイナミクスを、学習データのダイナミクスに近くなるように、自己組織的に更新する The learning unit 13 updates the dynamics held by the node in a self-organized manner so as to be close to the dynamics of the learning data according to the learning weight.

すなわち、学習部１３は、ノードの力学系近似モデルが、例えば、RNNである場合には、そのRNNが保持するダイナミクスが、学習データの影響を受ける度合いを、学習重みに応じて調整しながら、RNNのパラメータを、BPTT法により更新する。 That is, when the dynamic system approximation model of a node is, for example, an RNN, the learning unit 13 updates the parameters of the RNN by the BPTT method while adjusting, according to the learning weight, the degree to which the dynamics held by the RNN are influenced by the learning data.

学習部１３において、RNNが保持するダイナミクスが学習データの影響を受ける度合いの調整は、例えば、BPTT法によるRNNのパラメータ（結合重み）の更新時の、パラメータを計算する繰り返し回数を、学習重みに応じて制限することや、パラメータを更新する程度に影響を与える予測誤差を、学習重みに応じて補正すること等によって行われる。 In the learning unit 13, the degree to which the dynamics held by the RNN are influenced by the learning data is adjusted by, for example, limiting the number of iterations of the parameter calculation according to the learning weight when the RNN parameters (connection weights) are updated by the BPTT method, or correcting, according to the learning weight, the prediction error that determines how much the parameters are updated.

すなわち、BPTT法によるRNNのパラメータの更新では、例えば、入力データX_tに対する出力データとしての、その入力データX_tの次の時刻t+1の入力データX_t+1の予測値X'_t+1の、真値（時刻t+1の入力データX_t+1）に対する予測誤差が小さくなるように、RNNのパラメータとしての結合重みを、予測誤差に応じた値だけ更新する計算が、RNNのパラメータが収束するまで繰り返し行われる。 That is, when the RNN parameters are updated by the BPTT method, a calculation that updates the connection weights, which are the RNN parameters, by an amount corresponding to the prediction error is repeated until the parameters converge. The aim is to reduce the prediction error of the predicted value X'_{t+1} (the output data for the input data X_t, predicting the input data X_{t+1} at the next time t+1) with respect to the true value (the input data X_{t+1} at time t+1).

学習部１３は、例えば、学習重みが小さいほど、パラメータの計算の繰り返し回数を少なくすることで、RNNが保持するダイナミクスが学習データの影響を受ける度合いを小に調整する。 For example, the learning unit 13 adjusts the degree to which the dynamics held by the RNN is affected by the learning data to be small by decreasing the number of parameter calculation iterations as the learning weight is small.

あるいは、学習部１３は、例えば、学習重みが小さいほど、予測誤差を小さい値に補正することで、RNNが保持するダイナミクスが学習データの影響を受ける度合いを小に調整する。 Alternatively, for example, the learning unit 13 adjusts the degree of the dynamics held by the RNN to be affected by the learning data to be small by correcting the prediction error to a smaller value as the learning weight is smaller.

いずれにしても、学習重みが大きいときには、RNNのパラメータは、RNNが保持するダイナミクスが学習データの影響を大きく受けるように更新される。また、学習重みが小さいときには、RNNのパラメータは、RNNが保持するダイナミクスが学習データの影響をあまり受けないように（少ししか受けないように）更新される。 In any case, when the learning weight is large, the parameters of the RNN are updated so that the dynamics held by the RNN are greatly affected by the learning data. When the learning weight is small, the RNN parameters are updated so that the dynamics held by the RNN are not significantly affected by the learning data (only a little).
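
The two adjustment methods described above can be sketched as simple functions of the learning weight α. The text only states that a smaller α means fewer iterations or a smaller corrected error; the linear scaling used here, the default `max_iterations=100`, and the function names are all assumptions made for illustration.

```python
def limited_iterations(alpha, max_iterations=100):
    """Number of BPTT update iterations allowed for learning weight alpha.

    Assumption: the limit scales linearly with alpha.
    """
    return round(alpha * max_iterations)


def corrected_error(prediction_error, alpha):
    """Prediction error after correction by learning weight alpha.

    Assumption: the correction is a plain multiplication by alpha.
    """
    return alpha * prediction_error


for alpha in (1.0, 0.5, 0.25):
    print(alpha, limited_iterations(alpha), corrected_error(0.8, alpha))
```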

次に、学習重みの決定の方法について説明する。 Next, a method for determining the learning weight will be described.

学習部１３は、ダイナミクス記憶ネットワークのノードの中から、学習データに最も適合するダイナミクスを保持するノードである勝者ノードを決定し、その勝者ノードから各ノードまでの距離に応じて、各ノードが保持するダイナミクスを更新する程度を表す学習重みを決定する。 The learning unit 13 determines, from among the nodes of the dynamics storage network, the winner node, which is the node holding the dynamics that best match the learning data, and determines, according to the distance from the winner node to each node, a learning weight representing the degree to which the dynamics held by that node are updated.

すなわち、学習部１３は、特徴量抽出部１２から１サンプルの特徴量が供給されると、その１サンプルの特徴量と、特徴量抽出部１２から直前に供給された過去のT-1サンプルの特徴量とによって、Tサンプルの特徴量（サンプル値）の時系列データを、観測される（た）観測時系列データとして生成する。 That is, when a feature amount of one sample is supplied from the feature amount extraction unit 12, the learning unit 13 calculates the feature amount of the one sample and the past T-1 sample supplied immediately before from the feature amount extraction unit 12. Based on the feature amount, time-series data of the feature amount (sample value) of the T sample is generated as observed time-series data.

ここで、この場合、特徴量抽出部１２が出力する特徴量の時系列データを、１サンプルずつシフトしながら、Tサンプルずつ逐次抽出して得られる時系列データが、観測時系列データとなる。 Here, in this case, the time-series data obtained by sequentially extracting the T-samples while shifting the time-series data of the feature values output from the feature-value extraction unit 12 by one sample becomes the observation time-series data.

なお、以下では、時刻tの（観測）時系列データX_tとは、例えば、時刻t-T+1のサンプル値から、時刻tのサンプル値までのTサンプルのサンプル値X_t-T+1,X_t-T+2,・・・,X_t-1,X_tを意味することとする。 In the following, the (observed) time-series data X_t at time t means, for example, the T sample values X_{t-T+1}, X_{t-T+2}, ..., X_{t-1}, X_t from the sample value at time t-T+1 to the sample value at time t.
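
The extraction of T samples at a time while shifting by one sample can be sketched with a fixed-length buffer. The generator name `sliding_windows` is hypothetical; the sketch only illustrates the windowing, not the feature extraction itself.

```python
from collections import deque


def sliding_windows(samples, T):
    """Yield the last T samples each time a new sample arrives."""
    window = deque(maxlen=T)  # the oldest sample drops out automatically
    for x in samples:
        window.append(x)
        if len(window) == T:
            yield list(window)


for w in sliding_windows([1, 2, 3, 4, 5], T=3):
    print(w)
```

No window is produced until T samples have arrived, which corresponds to waiting for the first T-1 past samples.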

学習部１３は、時刻tの観測時系列データを生成すると、その時刻tの観測時系列データを学習データとして、ダイナミクス記憶ネットワークの各ノード（が有するダイナミクス）が学習データに適合する度合いを表すスコアを求める。 After generating the observed time-series data at time t, the learning unit 13 uses it as learning data and obtains, for each node of the dynamics storage network, a score representing the degree to which the node (the dynamics it holds) matches the learning data.

すなわち、いま、学習データとしての時刻tの時系列データとしてのTサンプルのサンプル値X_t-T+1,X_t-T+2,・・・,X_t-1,X_tのうちの時刻tのサンプル値X_tを、ノードが有する力学系近似モデルとしてのRNNに入力したときに、そのRNNが出力する時刻t+1のサンプル値X_t+1の予測値X'_t+1の、時刻t+1のサンプル値（の真値）X_t+1に対する予測誤差δ(t)が、例えば、式δ(t)=|X'_t+1-X_t+1|²で定義されることとすると、学習部１３は、時刻tの時系列データとしてのTサンプルのサンプル値X_t-T+1,X_t-T+2,・・・,X_t-1,X_tについての予測誤差δ(t-T+1),δ(t-T+2),・・・,δ(t-1),δ(t)の、例えば、加算値（総和）E_t(=δ(t-T+1)+δ(t-T+2)+・・・+δ(t-1)+δ(t))を、学習データとしての時刻tの時系列データ（の全体）に対するノードのスコアとして求める。 That is, suppose that when the sample value X_t at time t, among the T sample values X_{t-T+1}, X_{t-T+2}, ..., X_{t-1}, X_t that make up the time-series data at time t used as learning data, is input to the RNN serving as the dynamic system approximation model of a node, the prediction error δ(t) of the predicted value X'_{t+1} output by the RNN with respect to the true sample value X_{t+1} at time t+1 is defined by, for example, δ(t)=|X'_{t+1}-X_{t+1}|². The learning unit 13 then obtains, as the score of the node for the whole of the time-series data at time t, for example the sum E_t (=δ(t-T+1)+δ(t-T+2)+...+δ(t-1)+δ(t)) of the prediction errors δ(t-T+1), δ(t-T+2), ..., δ(t-1), δ(t) for the T sample values X_{t-T+1}, X_{t-T+2}, ..., X_{t-1}, X_t.
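
The score E_t can be sketched as a sum of squared prediction errors over the window. Because δ(t) compares the prediction against X_{t+1}, this sketch takes that sample as a separate argument, which is one possible reading of the text. The one-step predictor is passed in as a function `step(x, context) -> (prediction, next_context)`; that interface and the names used are hypothetical.

```python
def prediction_error(predicted, actual):
    """delta = |X' - X|^2 for vector-valued samples."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual))


def node_score(step, context, window, next_sample):
    """Sum of prediction errors over the T samples in `window`.

    Returns the score and the context reached after the last sample.
    A smaller score means a better match.
    """
    targets = window[1:] + [next_sample]  # true value for each prediction
    score = 0.0
    for x, target in zip(window, targets):
        predicted, context = step(x, context)
        score += prediction_error(predicted, target)
    return score, context


# Toy predictor standing in for a node's RNN: it predicts x + 1.
toy_step = lambda x, c: ([x[0] + 1.0], c)
score, _ = node_score(toy_step, None, [[0.0], [1.0], [2.0]], [4.0])
print(score)
```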

なお、この場合、スコアが小さいほど、予測値が真値に近いことを表す。そこで、以下、適宜、スコアが小さいことを、スコアが良い、又は高いともいい、スコアが大きいことを、スコアが悪い、又は低いともいう。 In this case, the smaller the score, the closer the predicted value is to the true value. Therefore, hereinafter, as appropriate, a score that is small is also referred to as a good or high score, and a score that is large is also referred to as a poor or low score.

RNNのような内部状態量を持つ力学系近似モデルについては、その内部状態量を適切な値とすることで、スコアはより良くなる。 For a dynamical approximate model with an internal state quantity such as RNN, the score becomes better by setting the internal state quantity to an appropriate value.

そのため、学習部１３は、スコアの計算にあたっては、予測誤差を最小化するように、BPTT法によって、内部状態量としてのRNNのコンテキストを調整した後、そのコンテキストを更新しながら、スコアを計算する。 Therefore, in calculating the score, the learning unit 13 adjusts the RNN context as the internal state quantity by the BPTT method so as to minimize the prediction error, and then calculates the score while updating the context. .

そして、学習部１３は、ダイナミクス記憶ネットワークのノードの中から、スコアが最も良いRNNを有するノードを、学習データに最も適合するダイナミクスを保持する勝者ノードに決定する。 Then, the learning unit 13 determines a node having an RNN having the best score among the nodes of the dynamics storage network as a winner node that holds the dynamics most suitable for the learning data.

さらに、学習部１３は、ダイナミクス記憶ネットワークの各ノードと、勝者ノードとの間の距離dを求める。 Further, the learning unit 13 obtains a distance d between each node of the dynamics storage network and the winner node.

なお、ノードN_iとN_jとの間の距離としては、図１４で説明したノードN_iとN_jとの間のユークリッド距離の他、例えば、ノードN_iとN_jとのスコアの差（の絶対値）を採用することが可能である。この場合、スコアがより良いノードが、勝者ノードとの間の距離がより近いノードとなる。 As the distance between nodes N_i and N_j, in addition to the Euclidean distance between nodes N_i and N_j described with reference to FIG. 14, it is also possible to adopt, for example, the (absolute value of the) difference between the scores of nodes N_i and N_j. In this case, a node with a better score is a node closer to the winner node.

また、任意のノードN_iと勝者ノードとの間の距離としては、ノードN_iのスコアそのもの（又は、逆数等）を採用することが可能である。 As the distance between an arbitrary node N_i and the winner node, it is also possible to adopt the score of node N_i itself (or its reciprocal or the like).
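
Winner selection and the score-based distance can be sketched as follows, assuming the scores are held in a dictionary keyed by node name (a hypothetical data layout) and that a smaller score is better, as defined above.

```python
def pick_winner(scores):
    """Return the node whose score is smallest (best)."""
    return min(scores, key=scores.get)


def score_distances(scores, winner):
    """Distance of every node from the winner as |score difference|."""
    return {node: abs(s - scores[winner]) for node, s in scores.items()}


scores = {"N1": 0.8, "N2": 0.1, "N3": 0.4}  # made-up example scores
winner = pick_winner(scores)
distances = score_distances(scores, winner)
print(winner, distances)
```

With this distance, the winner always has distance 0 and nodes with better scores lie closer to it.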

学習部１３は、ダイナミクス記憶ネットワークの各ノードの、勝者ノードとの間の距離dを求めると、距離dの増加に対して学習重みαが減少する関係を表す曲線（以下、距離／重み曲線という）に従って、ノードの学習重みαを決定する。 After obtaining the distance d between each node of the dynamics storage network and the winner node, the learning unit 13 determines the learning weight α of each node according to a curve in which the learning weight α decreases as the distance d increases (hereinafter referred to as the distance/weight curve).

すなわち、図１６は、距離／重み曲線の例を示している。 That is, FIG. 16 shows an example of a distance / weight curve.

図１６の距離／重み曲線において、横軸（左から右方向）は、学習重みαを示しており、縦軸（上から下方向）は、勝者ノードからの距離dを示している。 In the distance / weight curve of FIG. 16, the horizontal axis (from left to right) represents the learning weight α, and the vertical axis (from top to bottom) represents the distance d from the winner node.

図１６の距離／重み曲線によれば、勝者ノードとの距離dが近いノードほど、大きな学習重みαが決定され、距離dが遠いノードほど、小さな学習重みαが決定される。 According to the distance/weight curve of FIG. 16, a node closer to the winner node (smaller distance d) is assigned a larger learning weight α, and a node farther away (larger distance d) is assigned a smaller learning weight α.

ここで、図１６では、縦軸に沿って、ダイナミクス記憶ネットワークを構成する６個のノードN₁'ないしN₆'が、各ノードN_i'と勝者ノードとの距離dに対応する位置（縦軸の位置）に記載されている。 In FIG. 16, the six nodes N₁' to N₆' constituting the dynamics storage network are plotted along the vertical axis, each at the position (on the vertical axis) corresponding to the distance d between that node N_i' and the winner node.

図１６では、ダイナミクス記憶ネットワークを構成する６個のノードN₁'ないしN₆'が、その順で、勝者ノードとの距離dが近いノードになっている。ダイナミクス記憶ネットワークを構成する６個のノードN₁'ないしN₆'のうち、勝者ノードとの距離dが最も近いノード、即ち、勝者ノードとの距離が０のノードであるノードN₁'は、勝者ノード（となっているノード）である。 In FIG. 16, the six nodes N₁' to N₆' constituting the dynamics storage network are, in that order, increasingly distant from the winner node. Among these six nodes, the node N₁', which is closest to the winner node, that is, whose distance d to the winner node is 0, is the winner node itself.

図１６の距離／重み曲線は、例えば、式（１）によって与えられる。 The distance / weight curve in FIG. 16 is given by, for example, Expression (1).

α = γ^{d/Δ} ... (1)

ここで、式（１）において、γは０＜γ＜１の範囲の減衰係数であり、Δは、勝者ノードを中心として各ノードの学習重みαを調整するための変数（以下、適宜、調整変数という）である。 Here, in Expression (1), γ is an attenuation coefficient in a range of 0 <γ <1, and Δ is a variable for adjusting the learning weight α of each node around the winner node (hereinafter, appropriately adjusted). Variable).

調整変数をΔを大きい値から少しずつ０に近づけていくと、学習重みαは勝者ノードから離れるにしたがってより小さい値となる。基本的には、調整変数Δは、学習の開始時は大きくし、時間の経過とともに小さくなるように調整される。 As the adjustment variable Δ is brought gradually from a large value toward 0, the learning weight α takes smaller and smaller values with increasing distance from the winner node. Basically, the adjustment variable Δ is set large at the start of learning and is adjusted to decrease as time passes.

式（１）の学習重みαに基づき、勝者ノードのパラメータ（ノードが有する力学系近似モデルのパラメータ）は、学習データの影響を最も強く受けるように更新され、勝者ノードから離れるにしたがって、学習データの影響が小さくなるように、他のノード（勝者ノード以外のノード）のパラメータの更新が行われる。 Based on the learning weight α in Expression (1), the parameter of the winner node (the parameter of the dynamic system approximation model that the node has) is updated so as to be most affected by the learning data, and the learning data increases as the distance from the winner node increases. The parameters of other nodes (nodes other than the winner node) are updated so as to reduce the influence of.
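
A distance/weight curve with the properties described above can be sketched as follows. The exponential form α = γ^(d/Δ) is an assumption chosen to match the stated roles of the attenuation coefficient γ and the adjustment variable Δ; the function name and default values are hypothetical.

```python
def learning_weight(d, gamma=0.5, delta=1.0):
    """Learning weight alpha for a node at distance d from the winner.

    Assumed form: alpha = gamma ** (d / delta), with 0 < gamma < 1 and delta > 0.
    """
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must lie strictly between 0 and 1")
    if delta <= 0.0:
        raise ValueError("delta must be positive")
    return gamma ** (d / delta)


# Large delta early in learning: neighbours are updated almost as much as the winner.
print([learning_weight(d, delta=4.0) for d in range(4)])
# Small delta later in learning: the update concentrates on the winner.
print([learning_weight(d, delta=0.25) for d in range(4)])
```

Under this form the winner (d = 0) always receives α = 1, and shrinking Δ makes the weight fall off more steeply with distance.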

次に、図１７のフローチャートを参照して、図１３の学習部１３による学習処理について説明する。 Next, the learning process by the learning unit 13 in FIG. 13 will be described with reference to the flowchart in FIG. 17.

学習部１３は、ステップＳ１１において、ネットワーク記憶部１６に記憶されたダイナミクス記憶ネットワークのすべてのパラメータの初期化を行う。具体的には、ダイナミクス記憶ネットワークの各ノードの内部状態量を持つ力学系近似モデル（図１５）のパラメータに適当な値が初期値として付与される。 In step S11, the learning unit 13 initializes all parameters of the dynamics storage network stored in the network storage unit 16. Specifically, an appropriate value is assigned as an initial value to the parameter of the dynamic system approximation model (FIG. 15) having the internal state quantity of each node of the dynamics storage network.

ここで、ダイナミクス記憶ネットワークのノードが有する力学系近似モデルが、例えば、RNNである場合には、ステップＳ１１では、そのRNNのユニットに入力される信号に与えられる結合重み等を、力学系近似モデルのパラメータとして、そのパラメータに適当な初期値がセットされる。 Here, when the dynamic system approximation model of a node of the dynamics storage network is, for example, an RNN, in step S11 the connection weights and the like applied to the signals input to the units of the RNN are treated as the parameters of the dynamic system approximation model, and appropriate initial values are set for those parameters.

その後、特徴量抽出部１２から学習部１３に対して、１サンプルの特徴量が供給されるのを待って、処理は、ステップＳ１１からステップＳ１２に進み、学習部１３は、学習データを生成する。 Thereafter, after waiting for one sample of the feature amount to be supplied from the feature amount extraction unit 12 to the learning unit 13, the process proceeds from step S11 to step S12, and the learning unit 13 generates learning data. .

すなわち、学習部１３は、特徴量抽出部１２から１サンプルの特徴量が供給されると、その１サンプルの特徴量と、特徴量抽出部１２から直前に供給されたT-1サンプルの特徴量とによって、Tサンプルの特徴量（サンプル値）の時系列データを、観測時系列データとして生成する。 That is, when a feature amount of one sample is supplied from the feature amount extraction unit 12, the learning unit 13 and the feature amount of the T-1 sample supplied immediately before from the feature amount extraction unit 12 As a result, time-series data of T sample feature values (sample values) is generated as observed time-series data.

その後、処理は、ステップＳ１２からステップＳ１３に進み、学習部１３は、直前のステップＳ１２で生成した観測時系列データを学習データとして、その学習データに対する、ネットワーク記憶部１６に記憶されたダイナミクス記憶ネットワークの各ノードのスコアの計算を、ノードが有する、内部状態量を持つ力学系近似モデルの内部状態量を更新しながら行う。 Thereafter, the process proceeds from step S12 to step S13, and the learning unit 13 uses the observation time series data generated in the immediately preceding step S12 as learning data, and the dynamics storage network stored in the network storage unit 16 for the learning data. The score of each node is calculated while updating the internal state quantity of the dynamic system approximation model having the internal state quantity of the node.

ここで、内部状態量を持つ力学系近似モデルが、例えば、RNNである場合には、所定の基準値を基準として値を変えていく（更新していく）変数の値のうちの、スコアを最も良くする値が、内部状態量としてのRNNのコンテキストの初期値に決定され、コンテキストを初期値から更新しながら、スコアの計算が行われる。 Here, when the dynamic system approximation model having the internal state quantity is, for example, RNN, the score of the variable values whose values are changed (updated) based on a predetermined reference value is calculated. The best value is determined as the initial value of the RNN context as the internal state quantity, and the score is calculated while updating the context from the initial value.

なお、コンテキストの初期値の決定に用いる所定の基準値としては、例えば、ランダムな値や、前回のRNNのパラメータの更新時に求められた、コンテキストの最終的な更新値（以下、適宜、前回更新値という）などを採用することができる。 As the predetermined reference value used for determining the initial value of the context, it is possible to adopt, for example, a random value, or the final updated value of the context obtained when the RNN parameters were last updated (hereinafter referred to as the previous update value as appropriate).

例えば、今回のRNNのパラメータの更新時に学習部１３で生成された観測時系列データと、前回のRNNのパラメータの更新時に学習部１３で生成された観測時系列データとが、何らの関係もないことが分かっている場合には、コンテキストの初期値の決定に用いる所定の基準値としては、ランダムな値を採用することができる。 For example, the observation time series data generated by the learning unit 13 when the RNN parameter is updated this time and the observation time series data generated by the learning unit 13 when the previous RNN parameter is updated have no relationship. If it is known, a random value can be adopted as the predetermined reference value used for determining the initial value of the context.

また、例えば、今回のRNNのパラメータの更新時に学習部１３で生成された観測時系列データと、前回のRNNのパラメータの更新時に学習部１３で生成された観測時系列データとが、連続する時系列データなどのように、何らかの関係を有することが分かっている場合には、コンテキストの初期値の決定に用いる所定の基準値としては、前回更新値を採用することができる。なお、前回更新値を、コンテキストの初期値の決定に用いる所定の基準値として採用する場合には、前回更新値を、そのまま、コンテキストの初期値に決定することができる。 Also, for example, when the observation time series data generated by the learning unit 13 at the time of the current RNN parameter update and the observation time series data generated by the learning unit 13 at the previous update of the RNN parameter are continuous. If it is known that there is some relationship, such as series data, the previous update value can be used as the predetermined reference value used to determine the initial value of the context. When the previous update value is adopted as a predetermined reference value used for determining the initial value of the context, the previous update value can be determined as the initial value of the context as it is.

ネットワーク記憶部１６に記憶されたダイナミクス記憶ネットワークのすべてのノードのスコアが求められると、処理は、ステップＳ１３からステップＳ１４に進み、学習部１３は、ダイナミクス記憶ネットワークを構成するノードそれぞれのスコアを比較することによって、最もスコアの良いノードを、学習データに最も適合するノードである勝者ノードに決定して、処理は、ステップＳ１５に進む。 When the scores of all the nodes of the dynamics storage network stored in the network storage unit 16 are obtained, the process proceeds from step S13 to step S14, and the learning unit 13 compares the scores of the nodes constituting the dynamics storage network. As a result, the node having the best score is determined as the winner node that is the most suitable node for the learning data, and the process proceeds to step S15.

ステップＳ１５では、学習部１３は、ネットワーク記憶部１６に記憶されたダイナミクス記憶ネットワークの各ノードの学習重みを、図１６で説明したように、勝者ノードを中心として決定する。 In step S15, the learning unit 13 determines the learning weight of each node of the dynamics storage network stored in the network storage unit 16 with the winner node as the center, as described with reference to FIG. 16.

その後、処理は、ステップＳ１５からステップＳ１６に進み、学習部１３が、学習データを用い、ネットワーク記憶部１６に記憶されたダイナミクス記憶ネットワークの各ノードが有する、内部状態量を持つ力学系近似モデルのパラメータの更新を、学習重みに応じて行う。 Thereafter, the process proceeds from step S15 to step S16, where the learning unit 13 uses the learning data and the dynamical system approximation model having the internal state quantity that each node of the dynamics storage network stored in the network storage unit 16 has. The parameter is updated according to the learning weight.

ここで、内部状態量を持つ力学系近似モデルが、例えば、RNNである場合には、ステップＳ１６でのパラメータの更新は、BPTT法によりパラメータ（結合重み）を計算する繰り返し回数を、学習重みに応じて制限して行われる。すなわち、学習重みが小さいほど、繰り返し回数は、小さい値に制限される。 Here, when the dynamic system approximation model having the internal state quantity is, for example, an RNN, the parameter update in step S16 is performed with the number of iterations for calculating the parameters (connection weights) by the BPTT method limited according to the learning weight. That is, the smaller the learning weight, the smaller the value to which the number of iterations is limited.

なお、勝者ノードのパラメータだけを更新する方法はWTA(winner-take-all)と呼ばれる学習方法であり、勝者ノードの近傍のノードに対してもパラメータの更新を行う方法はSMA(soft-max adaptation)と呼ばれる学習方法である。学習部１３は、SMAで、ダイナミクス記憶ネットワーク（のノードが有する力学系近似モデル）のパラメータの更新を行う。 Note that updating only the parameters of the winner node is a learning method called WTA (winner-take-all), whereas also updating the parameters of nodes in the vicinity of the winner node is a learning method called SMA (soft-max adaptation). The learning unit 13 updates the parameters of the dynamics storage network (of the dynamic system approximation models held by its nodes) by SMA.

すなわち、図１６で説明したように、学習重みは、勝者ノードとの距離が近い、勝者ノードの近傍にあるノードについてほど大きな値に決定され、逆に、勝者ノードとの距離が遠いノードについてほど小さな値に決定される。その結果、勝者ノードの近傍にあるノードについては、学習データの影響をより強く受けるように、ノードのパラメータを更新し、勝者ノードとの距離が遠いノードについては、学習データの影響をあまり受けないように、ノードのパラメータを更新する近傍競合学習が行われる。 That is, as described with reference to FIG. 16, the learning weight is determined to be larger for a node that is closer to the winner node and closer to the winner node, and conversely, for a node that is farther from the winner node. Decided to a small value. As a result, the node parameters are updated so that the nodes near the winner node are more affected by the learning data, and the nodes far from the winner node are less affected by the learning data. As described above, neighborhood competitive learning for updating the parameter of the node is performed.

その後、特徴量抽出部１２から学習部１３に対して、１サンプルの特徴量が新たに供給されるのを待って、処理は、ステップＳ１６からステップＳ１２に戻り、以下、ステップＳ１２ないしＳ１６の処理が繰り返される。 Thereafter, after waiting for a new one-sample feature amount to be supplied from the feature amount extraction unit 12 to the learning unit 13, the process returns from step S16 to step S12, and the processes of steps S12 to S16 are then repeated.
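
The loop of steps S11 to S16 can be sketched end to end with a deliberately simplified stand-in. In this sketch each node holds a single coefficient `a` and predicts x_{t+1} = a * x_t, updated by plain gradient descent; this replaces the RNN and the BPTT method and is not the model described in the text. The nodes are placed on a line, the weight curve γ^(d/Δ) and the linear iteration limit are assumptions, and Δ is kept fixed for brevity. All names are hypothetical.

```python
import random


def score(a, window):
    """Sum of squared one-step prediction errors inside the window."""
    return sum((a * x - nxt) ** 2 for x, nxt in zip(window, window[1:]))


def train(samples, n_nodes=5, T=4, gamma=0.5, delta=1.0,
          max_iters=20, lr=0.05, seed=0):
    rng = random.Random(seed)
    params = [rng.uniform(-1.0, 1.0) for _ in range(n_nodes)]  # S11: initialise
    for end in range(T, len(samples) + 1):
        window = samples[end - T:end]                          # S12: learning data
        scores = [score(a, window) for a in params]            # S13: node scores
        winner = min(range(n_nodes), key=scores.__getitem__)   # S14: winner node
        for i in range(n_nodes):
            alpha = gamma ** (abs(i - winner) / delta)         # S15: learning weight
            for _ in range(round(alpha * max_iters)):          # S16: limited updates
                grad = sum(2 * (params[i] * x - nxt) * x
                           for x, nxt in zip(window, window[1:]))
                params[i] -= lr * grad
    return params


# Toy data following x_{t+1} = 0.9 * x_t.
samples = [0.9 ** k for k in range(30)]
print(train(samples))
```

Nodes near the winner receive more update iterations than distant nodes, which is the neighbourhood behaviour that distinguishes SMA from WTA.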

次に、図１８のフローチャートを参照して、図１３の認識部１４による認識処理について説明する。 Next, the recognition process by the recognition unit 14 in FIG. 13 will be described with reference to the flowchart in FIG. 18.

ステップＳ３１において、認識部１４は、認識処理に用いる認識データを生成する。 In step S31, the recognition unit 14 generates recognition data used for recognition processing.

すなわち、認識部１４は、例えば、特徴量抽出部１２から、Tサンプルの特徴量（サンプル値）が供給されるのを待って、そのTサンプルの特徴量の時系列である観測時系列データを、認識データとする。 That is, the recognition unit 14 waits, for example, for T samples of feature amounts (sample values) to be supplied from the feature amount extraction unit 12, and uses the observed time-series data, which is the time series of those T samples of feature amounts, as the recognition data.

そして、処理は、ステップＳ３１からステップＳ３２に進み、認識部１４は、認識データに対するダイナミクス記憶ネットワークの各ノードのスコアの計算を、図１７の学習処理の場合と同様に、ノードが有する、内部状態量を持つ力学系近似モデルの内部状態量を更新しながら行う。 Then, the process proceeds from step S31 to step S32, and the recognition unit 14 calculates the score of each node of the dynamics storage network for the recognition data while updating the internal state quantity of the node's dynamic system approximation model, in the same manner as in the learning process of FIG. 17.

ダイナミクス記憶ネットワークのすべてのノードのスコアが求められると、処理は、ステップＳ３２からステップＳ３３に進み、認識部１４は、ダイナミクス記憶ネットワークを構成するノードそれぞれのスコアを比較することによって、最もスコアの良いノードを、認識データに最も適合するノードである勝者ノードに決定して、処理は、ステップＳ３４に進む。 When the scores of all the nodes of the dynamics storage network are obtained, the process proceeds from step S32 to step S33, and the recognition unit 14 compares the scores of the respective nodes constituting the dynamics storage network to obtain the best score. The node is determined to be the winner node that is the most suitable node for the recognition data, and the process proceeds to step S34.

ステップＳ３５では、認識部１４は、勝者ノードを表す情報を、認識データの認識結果として出力して、処理は終了する。 In step S35, the recognition unit 14 outputs information representing the winner node as a recognition result of the recognition data, and the process ends.

ここで、認識部１４が出力した認識結果は、図１３のデータ処理装置の外部に出力することができる。また、認識部１４が出力した認識結果は、制御信号として、生成部１５に供給することができる。 Here, the recognition result output by the recognition unit 14 can be output to the outside of the data processing apparatus of FIG. 13. The recognition result output by the recognition unit 14 can also be supplied to the generation unit 15 as a control signal.

次に、図１９のフローチャートを参照して、図１３の生成部１５による生成処理について説明する。 Next, generation processing by the generation unit 15 in FIG. 13 will be described with reference to the flowchart in FIG. 19.

図１７の学習処理によれば、ダイナミクス記憶ネットワークの各ノードは、内部状態量を持つ力学系近似モデルによってダイナミクスを学習し、記憶（獲得）するが、その後は、その各ノードの内部状態量を持つ力学系近似モデルから、その力学系近似モデルによってモデル化されたダイナミクスを有する時系列データ（ダイナミクスとして獲得された時系列パターンの時系列データ）を生成することができる。 According to the learning process of FIG. 17, each node of the dynamics storage network learns and stores (acquires) the dynamics using a dynamical approximate model having an internal state quantity. Thereafter, the internal state quantity of each node is determined. It is possible to generate time series data having dynamics modeled by the dynamic system approximation model (time series data of a time series pattern acquired as dynamics) from the dynamic system approximation model.

内部状態量を持つ力学系近似モデルとしてRNNを用いた場合には、所定の内部状態量をRNNに与えることで、そのRNNを有するノードに保持されるダイナミクスから時系列データを容易に生成することができる。 When RNN is used as a dynamical system approximation model with an internal state quantity, time series data can be easily generated from the dynamics held in the node having the RNN by giving the predetermined internal state quantity to the RNN. Can do.

具体的には、RNNの入力にある時刻tの状態ベクトルを与えると、次の時刻t+1の状態ベクトルが出力される。したがって、この操作を所定の時間ステップ（サンプル点）分だけ行うことで、ダイナミクス記憶ネットワークの各ノードから、その所定の時間ステップ分に相当するサンプル数の時系列データを生成することができる。 Specifically, when a state vector at a certain time t is given to the input of the RNN, the state vector at the next time t+1 is output. Therefore, by performing this operation for a predetermined number of time steps (sample points), time-series data with the number of samples corresponding to those time steps can be generated from each node of the dynamics storage network.

すなわち、図１９のステップＳ５１において、生成部１５は、ダイナミクス記憶ネットワークのノードのうちの、どのダイナミクスに対応するノードから時系列データを生成するかを決定する。 That is, in step S51 of FIG. 19, the generation unit 15 determines from which node of the dynamics storage network, that is, the node corresponding to which dynamics, the time-series data is to be generated.

ここで、時系列データの生成に用いられるノードを、以下、適宜、生成ノードともいう。生成処理では、生成部１５は、例えば、ダイナミクス記憶ネットワークのノードの中から、１個のノードをランダムに選択し、そのノードを、生成ノードに決定する。あるいは、生成部１５は、例えば、ユーザからの指示等に応じて供給される制御信号に基づいて、ダイナミクス記憶ネットワークのノードの中から、生成ノードとするノードを決定する。 Here, a node used for generating time-series data is hereinafter also referred to as a generation node as appropriate. In the generation process, the generation unit 15, for example, randomly selects one node from among the nodes of the dynamics storage network and determines that node to be the generation node. Alternatively, the generation unit 15 determines the node to be used as the generation node from among the nodes of the dynamics storage network based on, for example, a control signal supplied in response to an instruction from the user or the like.

生成ノードが決定されると、処理は、ステップＳ５１からステップＳ５２に進み、生成部１５は、生成ノードが保持する内部状態量を持つ力学系近似モデルのパラメータに基づき、時系列データを、力学系近似モデルの内部状態量を更新しながら生成して、処理は、ステップＳ５３に進む。 When the generation node has been determined, the process proceeds from step S51 to step S52, where the generation unit 15 generates time-series data based on the parameters of the dynamic system approximation model having an internal state quantity held by the generation node, while updating the internal state quantity of that model. The process then proceeds to step S53.

ステップＳ５３では、生成部１５は、生成ノードの力学系近似モデルから生成された時系列データ（以下、適宜、生成時系列データともいう）を必要に応じて変換し、出力して、処理は終了する。 In step S53, the generation unit 15 converts, as necessary, the time-series data generated from the dynamic system approximation model of the generation node (hereinafter also referred to as generation time-series data as appropriate) and outputs it, and the process ends.

ここで、学習部１３が学習処理に用いる学習データとしての観測時系列データは、センサモータ信号の特徴量であるため、生成部１５が生成する生成時系列データも、センサモータ信号の特徴量である。生成時系列データとしてのセンサモータ信号の特徴量は、生成部１５が、ステップＳ５３において、センサモータ信号に変換し、そのセンサモータ信号のうちのモータ信号が、例えば、自律エージェントに供給される。 Here, since the observation time-series data used as learning data by the learning unit 13 in the learning process is a feature amount of the sensor motor signal, the generation time-series data generated by the generation unit 15 is also a feature amount of the sensor motor signal. In step S53, the generation unit 15 converts the feature amount of the sensor motor signal serving as the generation time-series data into a sensor motor signal, and the motor signal in that sensor motor signal is supplied to, for example, an autonomous agent.

なお、力学系近似モデルが、例えば、RNNである場合、生成部１５での生成時系列データの生成時には、内部状態量としてのRNNのコンテキストユニット（図１５）に入力されるコンテキストの初期値、及び入力ユニット（図１５）に入力されるデータの初期値として、例えば、ランダムな値が用いられる。 When the dynamic system approximation model is, for example, an RNN, random values, for example, are used when the generation unit 15 generates the generation time-series data, both as the initial value of the context input to the context unit of the RNN (FIG. 15) as the internal state quantity and as the initial value of the data input to the input unit (FIG. 15).

また、ある時刻t+1においてRNNの入力ユニット（図１５）に入力されるデータとしては、直前の時刻tにおいてRNNの出力層から出力された、時刻t+1のデータの予測値が用いられる。 Also, as data input to the RNN input unit (FIG. 15) at a certain time t + 1, the predicted value of the data at time t + 1 output from the RNN output layer at the immediately previous time t is used. .
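The generation process of steps S51 and S52 can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: `TinyRNN` is a hypothetical one-input model with a single context unit and hand-picked weights, not the RNN of FIG. 15, and `select_generation_node` and `generate` are names introduced here. The sketch shows the three points described above: the generation node is chosen at random or from a control signal, the initial input and initial context are random values, and the prediction output at time t is fed back as the input at time t+1.

```python
import math
import random
from typing import List, Optional, Tuple


class TinyRNN:
    """Hypothetical recurrent model with one input and one context unit."""

    def __init__(self, w_in: float, w_ctx: float, v_in: float, v_ctx: float) -> None:
        self.w_in, self.w_ctx, self.v_in, self.v_ctx = w_in, w_ctx, v_in, v_ctx

    def step(self, x: float, context: float) -> Tuple[float, float]:
        """Return (prediction of the next sample, updated context)."""
        new_context = math.tanh(self.v_in * x + self.v_ctx * context)
        prediction = math.tanh(self.w_in * x + self.w_ctx * new_context)
        return prediction, new_context


def select_generation_node(num_nodes: int, rng: random.Random,
                           control_signal: Optional[int] = None) -> int:
    """S51: use the node named by the control signal, else pick one at random."""
    return control_signal if control_signal is not None else rng.randrange(num_nodes)


def generate(model: TinyRNN, steps: int, rng: random.Random) -> List[float]:
    """S52: closed-loop generation starting from random input and context."""
    x = rng.uniform(-1.0, 1.0)        # random initial input
    context = rng.uniform(-1.0, 1.0)  # random initial context (internal state)
    series = []
    for _ in range(steps):
        # The prediction for time t+1 becomes the input at time t+1.
        x, context = model.step(x, context)
        series.append(x)
    return series


rng = random.Random(0)
nodes = [TinyRNN(0.8, 0.5, 0.3, 0.9), TinyRNN(-0.6, 0.4, 0.7, 0.2)]
generation_node = select_generation_node(len(nodes), rng, control_signal=1)
series = generate(nodes[generation_node], steps=5, rng=rng)
print(generation_node, series)
```

The conversion of step S53 (from feature amounts back to a sensor motor signal) depends on the feature extraction used and is left out of the sketch.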

次に、図２０のフローチャートを参照して、図１３の認識部１４、及び生成部１５による認識生成処理について説明する。 Next, the recognition generation process performed by the recognition unit 14 and the generation unit 15 in FIG. 13 will be described with reference to the flowchart in FIG. 20.

上述したように、認識生成処理によれば、自律エージェントの認知行動を実現することができる。 As described above, the recognition generation process makes it possible to realize the cognitive behavior of an autonomous agent.

認識部１４、及び生成部１５において、内部状態量を持つ力学系近似モデルによってダイナミクスを学習したダイナミクス記憶ネットワークを用いて、認識生成を行う場合、図１８の認識処理と図１９の生成処理を逐次的に組み合わせるだけでは、力学系近似モデルの内部状態量を考慮した認識生成を行うことは困難である。 When the recognition unit 14 and the generation unit 15 perform recognition generation using a dynamics storage network whose dynamics have been learned by dynamic system approximation models having internal state quantities, it is difficult to perform recognition generation that takes the internal state quantities of the dynamic system approximation models into account merely by combining the recognition process of FIG. 18 and the generation process of FIG. 19 sequentially.

そこで、認識部１４、及び生成部１５は、内部状態記憶部１７において、図１８の認識処理において更新された力学系近似モデルの内部状態量（内部状態）を記憶し、その内部状態量を図１９の生成処理において用いることで、観測信号から得られる時刻tの観測時系列データに対して、次の時刻t+1の観測時系列データの予測値を生成する認識生成処理を行う。 Therefore, the recognition unit 14 and the generation unit 15 store, in the internal state storage unit 17, the internal state quantity (internal state) of the dynamic system approximation model updated in the recognition process of FIG. 18 and use that internal state quantity in the generation process of FIG. 19, thereby performing a recognition generation process that generates, for the observation time-series data at time t obtained from the observation signal, a predicted value of the observation time-series data at the next time t+1.

すなわち、認識生成処理では、ステップＳ７１において、認識部１４が、図１８のステップＳ３１の場合と同様に、特徴量抽出部１２からの、Tサンプルの特徴量（サンプル値）の時系列である観測時系列データを、認識データとする。 That is, in the recognition generation process, in step S71, the recognition unit 14 takes, as recognition data, the observation time-series data, which is a time series of T samples of feature amounts (sample values) from the feature amount extraction unit 12, as in step S31 of FIG. 18.

その後、処理は、ステップＳ７１からステップＳ７２に進み、認識部１４は、認識データに対する、ダイナミクス記憶ネットワークの各ノードのスコアの計算を、図１７の学習処理の場合と同様に、ノードが有する、内部状態量を持つ力学系近似モデルの内部状態量を更新しながら行う。 Thereafter, the process proceeds from step S71 to step S72, where the recognition unit 14 calculates the score of each node of the dynamics storage network for the recognition data while updating the internal state quantity of the dynamic system approximation model that the node has, as in the learning process of FIG. 17.

但し、ステップＳ７２のスコアの計算では、認識部１４は、内部状態記憶部１７から前回更新されて記憶されている内部状態量を読み込み、その内部状態記憶部１７から読み込んだ値を、力学系近似モデルの内部状態量（例えば、RNNのコンテキスト）の初期値とする。 However, in the score calculation of step S72, the recognition unit 14 reads from the internal state storage unit 17 the internal state quantity that was updated and stored the previous time, and uses the value read as the initial value of the internal state quantity of the dynamic system approximation model (for example, the context of the RNN).

ダイナミクス記憶ネットワークのすべてのノードのスコアが求められると、処理は、ステップＳ７２からステップＳ７３に進み、認識部１４は、ダイナミクス記憶ネットワークを構成するノードそれぞれのスコアを比較することによって、最もスコアの良いノードを、認識データに最も適合するノードである勝者ノードに決定する。 When the scores of all the nodes of the dynamics storage network have been obtained, the process proceeds from step S72 to step S73, where the recognition unit 14 compares the scores of the nodes constituting the dynamics storage network and determines the node with the best score to be the winner node, that is, the node that best matches the recognition data.

さらに、ステップＳ７３では、認識部１４は、勝者ノードが決定されたときの内部状態量の更新値（更新された内部状態量）と、その勝者ノードが決定されたときの内部状態量の初期値とを、内部状態記憶部１７に保存する（記憶させる）。 Further, in step S73, the recognition unit 14 saves (stores) in the internal state storage unit 17 the updated value of the internal state quantity (the updated internal state quantity) at the time the winner node was determined, together with the initial value of the internal state quantity at that time.

ここで、内部状態記憶部１７に記憶された内部状態量の更新値は、認識部１４での次回のスコアの計算を行うステップＳ７２において、力学系近似モデルの内部状態量（例えば、RNNのコンテキスト）の初期値として用いられる。 Here, the updated value of the internal state quantity stored in the internal state storage unit 17 is used as the initial value of the internal state quantity of the dynamic system approximation model (for example, the context of the RNN) in step S72, in which the recognition unit 14 performs the next score calculation.

また、内部状態記憶部１７に記憶された内部状態量の初期値は、生成部１５において、時系列データの生成時に用いられる。 The initial value of the internal state quantity stored in the internal state storage unit 17 is used in the generation unit 15 when generating time-series data.

その後、認識部１４は、勝者ノードを表す情報を出力し、処理は、ステップＳ７３からステップＳ７４に進む。認識部１４が出力した情報は、制御信号として、生成部１５に供給される。 Thereafter, the recognition unit 14 outputs information representing the winner node, and the process proceeds from step S73 to step S74. Information output from the recognition unit 14 is supplied to the generation unit 15 as a control signal.

ステップＳ７４では、生成部１５は、ダイナミクス記憶ネットワークのノードのうちの、認識部１４から制御信号として供給される情報が表す勝者ノードを、生成ノードとして、その生成ノードが保持する内部状態量を持つ力学系近似モデルのパラメータに基づき、生成時系列データを、力学系近似モデルの内部状態量を更新しながら生成して、処理は、ステップＳ７５に進む。 In step S74, the generation unit 15 takes as the generation node the winner node, among the nodes of the dynamics storage network, indicated by the information supplied as a control signal from the recognition unit 14, and generates generation time-series data based on the parameters of the dynamic system approximation model having an internal state quantity held by that generation node, while updating the internal state quantity of the model. The process then proceeds to step S75.

すなわち、生成部１５は、内部状態記憶部１７の記憶値を、ネットワーク記憶部１６に記憶されたダイナミクス記憶ネットワークの生成ノードの力学系近似モデルの内部状態量の初期値として読み込む。 That is, the generation unit 15 reads the stored value of the internal state storage unit 17 as an initial value of the internal state quantity of the dynamic system approximation model of the generation node of the dynamics storage network stored in the network storage unit 16.

つまり、生成部１５は、内部状態記憶部１７の記憶値のうちの、生成ノードが認識部１４において勝者ノードに決定されたときの内部状態量の初期値を読み出し、生成ノードの力学系近似モデルの内部状態量の初期値にセットする。 That is, the generation unit 15 reads, from among the values stored in the internal state storage unit 17, the initial value of the internal state quantity at the time the generation node was determined to be the winner node by the recognition unit 14, and sets it as the initial value of the internal state quantity of the dynamic system approximation model of the generation node.

さらに、生成部１５は、特徴量抽出部１２から供給される特徴量の時系列から、認識部１４がステップＳ７１で生成するのと同一の認識データを生成し、その認識データを、生成ノードの力学系近似モデルに与え、その力学系近似モデルの内部状態量を更新しながら、生成時系列データを生成する。 Further, the generation unit 15 generates, from the time series of feature amounts supplied from the feature amount extraction unit 12, the same recognition data that the recognition unit 14 generates in step S71, supplies that recognition data to the dynamic system approximation model of the generation node, and generates the generation time-series data while updating the internal state quantity of that model.

具体的には、力学系近似モデルが、例えば、RNNである場合、RNNのコンテキストユニット（図１５）に対して、内部状態記憶部１７の記憶値のうちの、生成ノードが認識部１４において勝者ノードに決定されたときのコンテキストの初期値が、生成時系列データを生成するときのコンテキストの初期値として入力される。 Specifically, when the dynamic system approximation model is, for example, an RNN, the initial value of the context at the time the generation node was determined to be the winner node by the recognition unit 14, from among the values stored in the internal state storage unit 17, is input to the context unit of the RNN (FIG. 15) as the initial value of the context used when generating the generation time-series data.

さらに、RNNの入力ユニット（図１５）に対して、認識データが入力される。 Furthermore, recognition data is input to the input unit of the RNN (FIG. 15).

そして、力学系近似モデルの内部状態量を更新しながら、認識データとしての観測時系列データの次の時刻の観測時系列データの予測値としての生成時系列データが生成される。 Then, while updating the internal state quantity of the dynamic system approximate model, generation time series data as a predicted value of observation time series data at the next time of observation time series data as recognition data is generated.

ステップＳ７５では、生成部１５は、生成ノードの力学系近似モデルから生成された生成時系列データを、図１９のステップＳ５３の場合と同様に、必要に応じて変換し、出力して、処理は、ステップＳ７１に戻り、以下、ステップＳ７１ないしＳ７５の処理が繰り返される。 In step S75, the generation unit 15 converts and outputs the generated time series data generated from the dynamic system approximate model of the generation node as necessary in the same manner as in step S53 in FIG. Returning to step S71, the processes of steps S71 to S75 are repeated.

ここで、生成部１５が生成する生成時系列データは、図１９で説明したように、センサモータ信号の特徴量であるが、そのセンサモータ信号の特徴量は、生成部１５が、ステップＳ７５において、センサモータ信号に変換する。そして、そのセンサモータ信号のうちのモータ信号が、例えば、自律エージェントに供給される。 Here, the generation time-series data generated by the generation unit 15 is a feature amount of the sensor motor signal, as described for FIG. 19; in step S75, the generation unit 15 converts that feature amount into a sensor motor signal. The motor signal in that sensor motor signal is then supplied to, for example, an autonomous agent.

以上のような、図２０ステップＳ７１ないしＳ７５の認識生成処理が、例えば、１時刻ごとに行われることで、ロボットは認知行動を行う。 When the recognition generation process of steps S71 to S75 in FIG. 20 described above is performed, for example, at every time step, the robot performs cognitive behavior.
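One cycle of the recognition generation process (steps S71 to S74) can be sketched in Python as below. The sketch rests on several assumptions that are not taken from this document: `Node` is a hypothetical toy predictor with a single context unit, the score is a mean squared one-step prediction error, the internal state storage is modeled as a dictionary keyed by node index that keeps an `initial` and an `updated` context for every node, and the window slides by one sample per cycle.

```python
import math
from typing import Dict, List, Sequence, Tuple


class Node:
    """Toy recurrent predictor with one context unit (weights are hypothetical)."""

    def __init__(self, w_in: float, w_ctx: float) -> None:
        self.w_in, self.w_ctx = w_in, w_ctx

    def run(self, data: Sequence[float], context: float) -> Tuple[List[float], float]:
        """Feed the data sample by sample; return (next-step predictions, final context)."""
        predictions = []
        for x in data:
            context = math.tanh(self.w_ctx * context + 0.5 * x)
            predictions.append(self.w_in * x + 0.1 * context)
        return predictions, context


def score(predictions: Sequence[float], data: Sequence[float]) -> float:
    """Mean squared error of the prediction made at t against the sample at t+1."""
    pairs = list(zip(predictions[:-1], data[1:]))
    return sum((p - y) ** 2 for p, y in pairs) / len(pairs)


def recognition_generation(nodes: Sequence[Node],
                           store: Dict[int, Dict[str, float]],
                           window: Sequence[float]) -> Tuple[int, float]:
    # S72: score every node, starting from the context saved by the previous cycle.
    results = []
    for i, node in enumerate(nodes):
        initial = store[i]["updated"]
        predictions, updated = node.run(window, initial)
        results.append((score(predictions, window), initial, updated))
    # S73: determine the winner, then save the initial and updated contexts.
    winner = min(range(len(nodes)), key=lambda i: results[i][0])
    for i, (_, initial, updated) in enumerate(results):
        store[i] = {"initial": initial, "updated": updated}
    # S74: rerun the winner from the saved initial context on the same window;
    # its last output is the predicted value for the next time step.
    predictions, _ = nodes[winner].run(window, store[winner]["initial"])
    return winner, predictions[-1]


nodes = [Node(w_in=1.0, w_ctx=0.5), Node(w_in=2.0, w_ctx=0.5)]
store = {i: {"initial": 0.0, "updated": 0.0} for i in range(len(nodes))}
observed = [0.1, 0.2, 0.4, 0.8, 1.6]  # a doubling sequence
T = 3                                  # window length (S71)
history = []
for start in range(len(observed) - T + 1):
    history.append(recognition_generation(nodes, store, observed[start:start + T]))
print(history)
```

Keeping both values in the store mirrors the two uses described above: the `updated` value seeds the next score calculation, while the `initial` value lets the generation side replay the same window from the same starting state as the recognition side.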

図１のデータ処理装置において、ダイナミクス学習モデル記憶部１０４に記憶させるダイナミクス学習モデルとして、ダイナミクス記憶ネットワークを採用する場合には、予測学習部１０３において、図１７の学習処理を行うとともに、予測部１０５において、図２０の認識生成処理を予測処理として行うことで、多数のダイナミクスを獲得し、その多数のダイナミクスそれぞれを有する時系列データとしての所望操作データ（及び状況データ）の予測値を得ることができる。 In the case of adopting a dynamics storage network as a dynamics learning model to be stored in the dynamics learning model storage unit 104 in the data processing apparatus of FIG. 1, the prediction learning unit 103 performs the learning process of FIG. In FIG. 20, by performing the recognition generation process of FIG. 20 as a prediction process, a large number of dynamics can be acquired, and a predicted value of desired operation data (and situation data) as time-series data having each of the large numbers of dynamics can be obtained. it can.
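Because the prediction yields predicted values of both the desired operation data and the situation data, the predicted situation can be compared with the situation actually observed before the predicted operation is output; the claims of this document describe such a threshold condition. The following Python sketch illustrates that gating step only. The mean squared error measure, the function names, and the numeric values are assumptions for illustration.

```python
from typing import List, Optional, Sequence


def prediction_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Mean squared error between predicted and true situation data (assumed measure)."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


def gated_operation_output(pred_situation: Sequence[float],
                           true_situation: Sequence[float],
                           pred_operation: Sequence[float],
                           threshold: float) -> Optional[List[float]]:
    """Return the predicted operation only if the situation prediction error
    is at or below the threshold; otherwise return None (no output)."""
    if prediction_error(pred_situation, true_situation) <= threshold:
        return list(pred_operation)
    return None


pred_situation = [0.5, 0.5]   # hypothetical predicted situation data
true_situation = [0.5, 0.7]   # hypothetical observed situation data
pred_operation = [1.0, 0.0]   # hypothetical predicted desired operation data

accepted = gated_operation_output(pred_situation, true_situation, pred_operation, threshold=0.05)
rejected = gated_operation_output(pred_situation, true_situation, pred_operation, threshold=0.01)
print(accepted, rejected)
```

With these toy values the situation prediction error is about 0.02, so the operation is output under the looser threshold and withheld under the stricter one.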

次に、上述した一連の処理は、ハードウェアにより行うこともできるし、ソフトウェアにより行うこともできる。一連の処理をソフトウェアによって行う場合には、そのソフトウェアを構成するプログラムが、汎用のコンピュータ等にインストールされる。 Next, the series of processes described above can be performed by hardware or software. When a series of processing is performed by software, a program constituting the software is installed in a general-purpose computer or the like.

そこで、図２１は、上述した一連の処理を実行するプログラムがインストールされるコンピュータの一実施の形態の構成例を示している。 Thus, FIG. 21 shows a configuration example of an embodiment of a computer in which a program for executing the series of processes described above is installed.

プログラムは、コンピュータに内蔵されている記録媒体としてのハードディスク３０５やROM３０３に予め記録しておくことができる。 The program can be recorded in advance on a hard disk 305 or a ROM 303 as a recording medium built in the computer.

あるいはまた、プログラムは、フレキシブルディスク、CD-ROM(Compact Disc Read Only Memory)，MO(Magneto Optical)ディスク，DVD(Digital Versatile Disc)、磁気ディスク、半導体メモリなどのリムーバブル記録媒体３１１に、一時的あるいは永続的に格納（記録）しておくことができる。このようなリムーバブル記録媒体３１１は、いわゆるパッケージソフトウエアとして提供することができる。 Alternatively, the program can be stored (recorded) temporarily or permanently on a removable recording medium 311 such as a flexible disk, a CD-ROM (Compact Disc Read Only Memory), an MO (Magneto Optical) disk, a DVD (Digital Versatile Disc), a magnetic disk, or a semiconductor memory. Such a removable recording medium 311 can be provided as so-called packaged software.

なお、プログラムは、上述したようなリムーバブル記録媒体３１１からコンピュータにインストールする他、ダウンロードサイトから、ディジタル衛星放送用の人工衛星を介して、コンピュータに無線で転送したり、LAN(Local Area Network)、インターネットといったネットワークを介して、コンピュータに有線で転送し、コンピュータでは、そのようにして転送されてくるプログラムを、通信部３０８で受信し、内蔵するハードディスク３０５にインストールすることができる。 In addition to being installed on the computer from the removable recording medium 311 described above, the program can be transferred wirelessly to the computer from a download site via an artificial satellite for digital satellite broadcasting, or transferred by wire to the computer via a network such as a LAN (Local Area Network) or the Internet. The computer can receive the program transferred in this way with the communication unit 308 and install it on the built-in hard disk 305.

コンピュータは、CPU(Central Processing Unit)３０２を内蔵している。CPU３０２には、バス３０１を介して、入出力インタフェース３１０が接続されており、CPU３０２は、入出力インタフェース３１０を介して、ユーザによって、キーボードや、マウス、マイク等で構成される入力部３０７が操作等されることにより指令が入力されると、それに従って、ROM(Read Only Memory)３０３に格納されているプログラムを実行する。あるいは、また、CPU３０２は、ハードディスク３０５に格納されているプログラム、衛星若しくはネットワークから転送され、通信部３０８で受信されてハードディスク３０５にインストールされたプログラム、またはドライブ３０９に装着されたリムーバブル記録媒体３１１から読み出されてハードディスク３０５にインストールされたプログラムを、RAM(Random Access Memory)３０４にロードして実行する。これにより、CPU３０２は、上述したフローチャートにしたがった処理、あるいは上述したブロック図の構成により行われる処理を行う。そして、CPU３０２は、その処理結果を、必要に応じて、例えば、入出力インタフェース３１０を介して、LCD(Liquid Crystal Display)やスピーカ等で構成される出力部３０６から出力、あるいは、通信部３０８から送信、さらには、ハードディスク３０５に記録等させる。 The computer includes a CPU (Central Processing Unit) 302. An input/output interface 310 is connected to the CPU 302 via a bus 301. When a command is input via the input/output interface 310 by the user operating the input unit 307, which includes a keyboard, a mouse, a microphone, and the like, the CPU 302 executes a program stored in the ROM (Read Only Memory) 303 accordingly. Alternatively, the CPU 302 loads into the RAM (Random Access Memory) 304 and executes a program stored on the hard disk 305, a program transferred from a satellite or a network, received by the communication unit 308, and installed on the hard disk 305, or a program read from the removable recording medium 311 mounted in the drive 309 and installed on the hard disk 305. The CPU 302 thereby performs the processing according to the flowcharts described above or the processing performed by the configurations in the block diagrams described above. Then, as necessary, the CPU 302 outputs the processing result from the output unit 306, which includes an LCD (Liquid Crystal Display), a speaker, and the like, via the input/output interface 310, transmits it from the communication unit 308, or records it on the hard disk 305.

なお、本明細書において、コンピュータに各種の処理を行わせるためのプログラムを記述する処理ステップは、必ずしもフローチャートとして記載された順序に沿って時系列に処理する必要はなく、並列的あるいは個別に実行される処理（例えば、並列処理あるいはオブジェクトによる処理）も含むものである。 In this specification, the processing steps describing the program that causes the computer to perform the various processes do not necessarily have to be processed in time series in the order described in the flowcharts; they also include processes executed in parallel or individually (for example, parallel processing or object-based processing).

また、プログラムは、１のコンピュータにより処理されるものであっても良いし、複数のコンピュータによって分散処理されるものであっても良い。さらに、プログラムは、遠方のコンピュータに転送されて実行されるものであっても良い。 Further, the program may be processed by one computer or may be distributedly processed by a plurality of computers. Furthermore, the program may be transferred to a remote computer and executed.

なお、本発明の実施の形態は、上述した実施の形態に限定されるものではなく、本発明の要旨を逸脱しない範囲において種々の変更が可能である。 The embodiment of the present invention is not limited to the above-described embodiment, and various modifications can be made without departing from the gist of the present invention.

本発明を適用したデータ処理装置の第１実施の形態の構成例を示すブロック図である。 FIG. 1 is a block diagram showing a configuration example of a first embodiment of a data processing apparatus to which the present invention is applied.
状況データ及び所望操作データの例を示す図である。 FIG. 2 is a diagram showing an example of situation data and desired operation data.
学習処理を説明するフローチャートである。 FIG. 3 is a flowchart explaining a learning process.
予測処理を説明するフローチャートである。 FIG. 4 is a flowchart explaining a prediction process.
PCの画面上のwebブラウザを示す図である。 FIG. 5 is a diagram showing a web browser on the screen of a PC.
TVの表示画面を示す図である。 FIG. 6 is a diagram showing a display screen of a TV.
本発明を適用したデータ処理装置の第２実施の形態の構成例を示すブロック図である。 FIG. 7 is a block diagram showing a configuration example of a second embodiment of a data processing apparatus to which the present invention is applied.
ゲーム画面を示す図である。 FIG. 8 is a diagram showing a game screen.
キャラクタ動作補助モジュール１５２_nの構成例を示すブロック図である。 FIG. 9 is a block diagram showing a configuration example of a character motion assist module 152_n.
状況データ及び所望操作データの例を示す図である。 FIG. 10 is a diagram showing an example of situation data and desired operation data.
学習処理を説明するフローチャートである。 FIG. 11 is a flowchart explaining a learning process.
予測処理を説明するフローチャートである。 FIG. 12 is a flowchart explaining a prediction process.
本発明を適用したデータ処理装置の一実施の形態の構成例を示すブロック図である。 FIG. 13 is a block diagram showing a configuration example of an embodiment of a data processing apparatus to which the present invention is applied.
ダイナミクス記憶ネットワークの例を、模式的に示す図である。 FIG. 14 is a diagram schematically showing an example of a dynamics storage network.
ノードの構成例を、模式的に示す図である。 FIG. 15 is a diagram schematically showing a configuration example of a node.
学習処理での学習重みの決定の方法を説明する図である。 FIG. 16 is a diagram explaining a method of determining learning weights in the learning process.
学習処理を説明するフローチャートである。 FIG. 17 is a flowchart explaining a learning process.
認識処理を説明するフローチャートである。 FIG. 18 is a flowchart explaining a recognition process.
生成処理を説明するフローチャートである。 FIG. 19 is a flowchart explaining a generation process.
認識生成処理を説明するフローチャートである。 FIG. 20 is a flowchart explaining a recognition generation process.
本発明を適用したコンピュータの一実施の形態の構成例を示すブロック図である。 FIG. 21 is a block diagram showing a configuration example of an embodiment of a computer to which the present invention is applied.

Explanation of symbols

１１信号入力部，１２特徴量抽出部，１３学習部，１４認識部，１５生成部，１６ネットワーク記憶部，１７内部状態記憶部，１０１状況データ取得部，１０２操作データ取得部，１０３予測学習部，１０４ダイナミクス学習モデル記憶部，１０５予測部，１０６操作データ出力部，１５１教示キャラクタ選択部，１５２₁ないし１５２_N キャラクタ動作補助モジュール，１６１状況データ取得部，１６２操作データ取得部，１６３予測学習部，１６４ダイナミクス学習モデル記憶部，１６５予測部，１６６操作データ出力部，３０１バス，３０２ CPU，３０３ ROM，３０４ RAM，３０５ハードディスク，３０６出力部，３０７入力部，３０８通信部，３０９ドライブ，３１０入出力インタフェース，３１１リムーバブル記録媒体
11 signal input unit, 12 feature amount extraction unit, 13 learning unit, 14 recognition unit, 15 generation unit, 16 network storage unit, 17 internal state storage unit, 101 situation data acquisition unit, 102 operation data acquisition unit, 103 prediction learning unit, 104 dynamics learning model storage unit, 105 prediction unit, 106 operation data output unit, 151 teaching character selection unit, 152_1 to 152_N character motion assist modules, 161 situation data acquisition unit, 162 operation data acquisition unit, 163 prediction learning unit, 164 dynamics learning model storage unit, 165 prediction unit, 166 operation data output unit, 301 bus, 302 CPU, 303 ROM, 304 RAM, 305 hard disk, 306 output unit, 307 input unit, 308 communication unit, 309 drive, 310 input/output interface, 311 removable recording medium

Claims

A data processing apparatus that processes time-series data, comprising:
situation data acquisition means for acquiring situation data, which is time-series data representing a situation;
operation data acquisition means for acquiring desired operation data, which is time-series data corresponding to an operation desired by the user;
learning means for learning the dynamics of the situation data and the desired operation data;
prediction means for obtaining, based on the dynamics, a predicted value of the desired operation data with the situation data as an input; and
output means for outputting the predicted value of the desired operation data.

The prediction means also obtains a predicted value of the situation data together with a predicted value of the desired operation data,
The data processing apparatus according to claim 1, wherein the output means outputs the predicted value of the desired operation data when a prediction error of the predicted value of the situation data with respect to a true value of the situation data is equal to or less than a predetermined threshold.

The data processing apparatus according to claim 1, wherein the learning unit learns the dynamics of the situation data and the desired operation data using a dynamics learning model that is a model capable of acquiring dynamics.

The data processing apparatus according to claim 3, wherein the dynamics learning model is an RNN (Recurrent Neural Network), an FNN (Feed Forward Neural Network), SVR (Support Vector Regression), or an RNN-PB (Recurrent Neural Net with Parametric Bias).

The data processing apparatus according to claim 3, wherein the dynamics learning model is a dynamics storage network configured by a plurality of nodes and holding dynamics in each of the plurality of nodes.

The data processing apparatus according to claim 5, wherein
the learning means updates the dynamics of each node of the dynamics storage network in a self-organizing manner based on the situation data and the desired operation data, and
the prediction means
determines a winner node, which is the node holding the dynamics that best match the situation data,
determines the winner node to be a generation node, which is a node used for generating time-series data, and
generates time-series data having the dynamics held by the generation node as the predicted value of the desired operation data.

A data processing method of a data processing apparatus that processes time-series data, the method including the steps of:
acquiring situation data, which is time-series data representing a situation, and acquiring desired operation data, which is time-series data corresponding to an operation desired by the user;
learning the dynamics of the situation data and the desired operation data;
obtaining, based on the dynamics, a predicted value of the desired operation data with the situation data as an input; and
outputting the predicted value of the desired operation data.

A program that causes a computer to function as a data processing apparatus that processes time-series data, the program causing the computer to function as:
situation data acquisition means for acquiring situation data, which is time-series data representing a situation;
operation data acquisition means for acquiring desired operation data, which is time-series data corresponding to an operation desired by the user;
learning means for learning the dynamics of the situation data and the desired operation data;
prediction means for obtaining, based on the dynamics, a predicted value of the desired operation data with the situation data as an input; and
output means for outputting the predicted value of the desired operation data.