JP2022018477A

JP2022018477A - Learning execution device, program, and learning execution method

Info

Publication number: JP2022018477A
Application number: JP2020121597A
Authority: JP
Inventors: 裕子石若; Yuko ISHIWAKA; 智博吉田; Tomohiro Yoshida; 忠輝伊藤; Tadateru Ito
Original assignee: SoftBank Corp
Current assignee: SoftBank Corp
Priority date: 2020-07-15
Filing date: 2020-07-15
Publication date: 2022-01-27
Anticipated expiration: 2040-07-15
Also published as: JP2023130362A; JP7379742B2; JP7379750B2; JP7237891B2; JP2023085258A

Abstract

To provide a learning execution device, a program, and a learning execution method that execute learning for causing a muscle model, which models muscle movement, to perform a target motion. SOLUTION: A learning execution device 100 that communicates with a plurality of communication terminals via a network comprises: an information storage unit that stores a muscle model in which a muscle is operated by contracting muscle fibers connected to a motor neuron included in a motor unit, according to a firing pattern of a plurality of interneurons to each of which a motor unit is connected; a motion setting unit that sets a target motion of the muscle model; and a learning execution unit that learns a firing pattern achieving the target motion by executing learning that rewards, among a plurality of firing patterns, a firing pattern for which the motion of the muscle model is closer to the target motion. SELECTED DRAWING: Figure 4

Description

The present invention relates to a learning execution device, a program, and a learning execution method.

In the field of CG (computer graphics), simulation methods based on muscle contraction are known (see, for example, Non-Patent Documents 1 to 6). Conventional simulation methods use the so-called Hill-type model, the so-called CPG (central pattern generator), and the like.
[Prior Art Documents]
[Non-Patent Documents]
[Non-Patent Document 1] Thomas Geijtenbeek, Michiel van de Panne, A. F. v. d. s. Flexible muscle-based locomotion for bipedal creatures. ACM Transactions on Graphics, (206), 2013.
[Non-Patent Document 2] Jack M. Wang, Samuel R. Hamner, S. L. V. K. Optimizing locomotion controllers using biologically-based actuators and objectives. ACM Trans. Graph, 31(4), 2012.
[Non-Patent Document 3] Yoonsang Lee, Moon Seok Park, T. K. J. L. Locomotion control for many-muscle humanoids. ACM Transactions on Graphics, 33(6), 2014.
[Non-Patent Document 4] Sehee Min, Jungdam Won, S. L. J. P. J. L. Softcon: simulation and control of soft-bodied animals with biomimetic actuators. ACM Transactions on Graphics, 38(6):208:1-208:12, 2019.
[Non-Patent Document 5] Cecilia Laschi, Matteo Cianchetti, B. M. L. m. M. F. P. D. Soft robot arm inspired by the octopus. Advanced Robotics, 26(7):709-727, 2012.
[Non-Patent Document 6] Jungdam Won, Jongho Park, K. K. J. L. How to train your dragon: Example-guided control of flapping flight. ACM Transactions on Graphics, 36(4):1:1-1:12, 2017.

According to a first aspect of the present invention, a learning execution device is provided. The learning execution device may include a storage unit that stores a muscle model in which a muscle is operated by contracting muscle fibers connected to a motor neuron included in a motor unit, according to a firing pattern of a plurality of interneurons to each of which a motor unit is connected. The learning execution device may include a motion setting unit that sets a target motion of the muscle model. The learning execution device may include a learning execution unit that learns a firing pattern achieving the target motion by executing learning that rewards, among a plurality of firing patterns, a firing pattern for which the motion of the muscle model is closer to the target motion.

The learning execution unit may learn a firing pattern that achieves the target motion by repeating the following: operating the muscle model according to each of the plurality of firing patterns, generating a new plurality of firing patterns based on a firing pattern for which the motion of the muscle model is closer to the target motion, and operating the muscle model according to each of the newly generated firing patterns. The initial plurality of firing patterns may be generated randomly, or may be generated based on an already learned firing pattern.

When the muscle model is operated based on a firing pattern, the learning execution unit may grow the motor unit whose muscle fibers were contracted. The muscle model may include fast-twitch motor units and slow-twitch motor units, and when the muscle model is operated based on a firing pattern, the learning execution unit may grow the fast-twitch motor units and the slow-twitch motor units according to different criteria. The information storage unit may store, for each motor unit, a first parameter indicating whether the motor unit is fast-twitch or slow-twitch, a second parameter indicating the energy available for contraction, a maximum value of the second parameter, a third parameter indicating self-recovery power, and a maximum value of the third parameter, and the learning execution unit may execute learning using the first parameter, the second parameter, the maximum value of the second parameter, the third parameter, and the maximum value of the third parameter. The learning execution unit may subtract a predetermined value from the second parameter each time the motor unit contracts and, while the third parameter is not 0, restore the second parameter as time passes. The information storage unit may store a fourth parameter indicating the amount of energy consumed each time the motor unit contracts when the motor unit is fast-twitch, and the learning execution unit may execute learning using the first parameter, the second parameter, the maximum value of the second parameter, the third parameter, the maximum value of the third parameter, and the fourth parameter. When the motor unit is fast-twitch, the learning execution unit may subtract the value of the fourth parameter from the second parameter each time the motor unit contracts; when the motor unit is slow-twitch, it may subtract a value other than the value of the fourth parameter from the second parameter each time the motor unit contracts.

When the learning execution unit determines that the muscle fibers have recovered after determining that they were damaged, it may increase the maximum value of the second parameter and the value of the fourth parameter if the motor unit is fast-twitch, and may increase the maximum value of the third parameter if the motor unit is slow-twitch. In that case, the learning execution unit need not increase the maximum value of the third parameter if the motor unit is fast-twitch, and need not increase the maximum value of the second parameter if the motor unit is slow-twitch. The learning execution unit may determine that the muscle fibers are damaged when the second parameter reaches 0. The information storage unit may store, for each motor unit, a fifth parameter related to use of the motor unit, and the learning execution unit may raise the level of the motor unit as the fifth parameter increases. The higher the level of the motor unit, the harder the learning execution unit may make it to increase the maximum value of the second parameter and the value of the fourth parameter when the motor unit is fast-twitch, and to increase the maximum value of the third parameter when the motor unit is slow-twitch.

The learning execution unit may learn the firing pattern such that, after one motor unit contracts, that motor unit cannot contract again until a predetermined refractory period has elapsed. The learning execution unit may learn the firing pattern with the refractory period made shorter as the temperature of the motor unit becomes higher. When the motion of the muscle model operated according to a time series of the plurality of firing patterns achieves the target motion, the learning execution unit may execute the learning by updating the firing pattern of a state that precedes, by a predetermined time, the state in which the target motion was achieved.
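The energy, damage, and growth rules described above can be illustrated with a minimal Python sketch. It is only one possible reading of the text, not the claimed implementation. The class name EnergyState, the constants SLOW_COST and GROWTH, the linear recovery rule, and the criterion that a unit counts as "recovered" once its energy is back at the maximum are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class EnergyState:
    fast: bool            # first parameter: True = fast-twitch, False = slow-twitch
    energy: float         # second parameter: energy available for contraction
    max_energy: float     # maximum value of the second parameter
    recovery: float       # third parameter: self-recovery power
    max_recovery: float   # maximum value of the third parameter
    cost: float = 2.0     # fourth parameter: energy used per fast-twitch contraction
    damaged: bool = False


SLOW_COST = 1.0   # assumed per-contraction cost for slow-twitch units
GROWTH = 1.1      # assumed growth factor applied after recovery from damage


def contract(unit: EnergyState) -> None:
    """Spend energy for one contraction; flag damage when energy reaches 0."""
    spend = unit.cost if unit.fast else SLOW_COST
    unit.energy = max(0.0, unit.energy - spend)
    if unit.energy == 0.0:
        unit.damaged = True


def rest(unit: EnergyState, dt: float = 1.0) -> None:
    """Restore energy over time while recovery power is non-zero, then grow."""
    if unit.recovery != 0:
        unit.energy = min(unit.max_energy, unit.energy + unit.recovery * dt)
    # Assumed recovery criterion: a damaged unit has recovered once energy is full.
    if unit.damaged and unit.energy >= unit.max_energy:
        unit.damaged = False
        if unit.fast:
            unit.max_energy *= GROWTH    # fast-twitch: raise the energy ceiling
            unit.cost *= GROWTH          # and the per-contraction cost
        else:
            unit.max_recovery *= GROWTH  # slow-twitch: raise the recovery ceiling
```

In this sketch a fast-twitch unit gains capacity but also spends more per contraction, while a slow-twitch unit only gains recovery headroom, which mirrors the asymmetric growth criteria in the text.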

According to a second aspect of the present invention, a learning execution device is provided. The learning execution device may include an information storage unit that stores, for each of a plurality of muscle fibers included in a muscle, a first parameter indicating whether the muscle fiber is fast-twitch or slow-twitch, a second parameter indicating the energy available for contraction, a maximum value of the second parameter, a third parameter indicating self-recovery power, and a maximum value of the third parameter. The learning execution device may include a learning execution unit that learns a model of the muscle by executing learning using the first parameter, the second parameter, the maximum value of the second parameter, the third parameter, and the maximum value of the third parameter.

The information storage unit may store a fourth parameter indicating the amount of energy consumed each time the muscle fiber contracts when the muscle fiber is fast-twitch, and the learning execution unit may execute learning using the first parameter, the second parameter, the maximum value of the second parameter, the third parameter, the maximum value of the third parameter, and the fourth parameter. The learning execution unit may learn the model of the muscle by subtracting a predetermined value from the second parameter each time the muscle fiber contracts; restoring the second parameter as time passes while the third parameter is not 0; and, when it determines that the muscle fiber has recovered after determining that it was damaged, increasing the maximum value of the second parameter and the value of the fourth parameter if the muscle fiber is fast-twitch, and increasing the maximum value of the third parameter if the muscle fiber is slow-twitch. When the muscle fiber is fast-twitch, the learning execution unit may subtract the value of the fourth parameter from the second parameter each time the muscle fiber contracts; when the muscle fiber is slow-twitch, it may subtract a value other than the value of the fourth parameter from the second parameter each time the muscle fiber contracts. When the learning execution unit determines that the muscle fiber has recovered after determining that it was damaged, it need not increase the maximum value of the third parameter if the muscle fiber is fast-twitch, and need not increase the maximum value of the second parameter if the muscle fiber is slow-twitch.

According to a third aspect of the present invention, a program for causing a computer to function as the learning execution device is provided.

According to a fourth aspect of the present invention, a learning execution method executed by a computer is provided. The learning execution method may include a motion setting step of setting a target motion of a muscle model in which a muscle is operated by contracting muscle fibers connected to a motor neuron included in a motor unit, according to a firing pattern of a plurality of interneurons to each of which a motor unit is connected. The learning execution method may include a learning execution step of learning a firing pattern that achieves the target motion by executing learning that rewards, among a plurality of firing patterns, a firing pattern for which the motion of the muscle model is closer to the target motion.

According to a fifth aspect of the present invention, a learning execution method executed by a computer is provided. The learning execution method may include a storing step of storing, for each of a plurality of muscle fibers included in a muscle, a first parameter indicating whether the muscle fiber is fast-twitch or slow-twitch, a second parameter indicating the energy available for contraction, a maximum value of the second parameter, a third parameter indicating self-recovery power, and a maximum value of the third parameter. The learning execution method may include a learning execution step of learning a model of the muscle by executing learning using the first parameter, the second parameter, the maximum value of the second parameter, the third parameter, and the maximum value of the third parameter.

Note that the above summary of the invention does not enumerate all of the necessary features of the present invention. Subcombinations of these feature groups may also constitute inventions.

An example of the learning execution device 100 is schematically shown.
An example of the muscle model 300 is schematically shown.
An example of the firing pattern 400 is schematically shown.
An example of the functional configuration of the learning execution device 100 is schematically shown.
A specific example of the muscle model 300 is schematically shown.
An example of the hardware configuration of a computer 1200 that functions as the learning execution device 100 is schematically shown.

Hereinafter, the present invention will be described through embodiments of the invention, but the following embodiments do not limit the invention according to the claims. Moreover, not all combinations of the features described in the embodiments are essential to the solution provided by the invention.

FIG. 1 schematically shows an example of the learning execution device 100. The learning execution device 100 executes learning for causing a muscle model, which models muscle movement, to perform a target motion.

The muscle model corresponds to, for example, some of the muscles of a person. The muscle model may correspond to all of the muscles of a person. The muscle model is not limited to a person and may correspond to any organism that has muscles. The muscle model may also correspond to a CG character or the like.

The learning execution device 100 according to the present embodiment stores, for example, a muscle model in which a muscle is operated by contracting muscle fibers connected to a motor neuron included in a motor unit, according to a firing pattern of a plurality of interneurons to each of which a motor unit is connected. An interneuron (介在ニューロン) is also called インターニューロン. A motor neuron (運動ニューロン) is also called モーターニューロン. A motor unit (運動単位) is also called モーターユニット.

The learning execution device 100 learns a firing pattern with which the muscle model achieves the target motion. For example, the learning execution device 100 learns the firing pattern by executing learning that rewards, among a plurality of randomly generated firing patterns, those for which the motion of the muscle model is close to the target motion.
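As a rough illustration of this generate, evaluate, and reward cycle, the following Python sketch uses a simple elitist mutation search. This search is a stand-in chosen for brevity and is not stated in the text to be the actual learning algorithm. The toy simulate function (which merely counts ON signals per time step), the distance measure, and all names and constants are hypothetical.

```python
import random

N_NEURONS, N_STEPS = 4, 8  # assumed toy sizes


def simulate(pattern):
    """Toy stand-in for the muscle model: motion = number of ON signals per step."""
    return [sum(row[t] for row in pattern) for t in range(N_STEPS)]


def distance(motion, target):
    """Smaller means the motion is closer to the target motion."""
    return sum(abs(m - g) for m, g in zip(motion, target))


def random_pattern(rng):
    return [[rng.randint(0, 1) for _ in range(N_STEPS)] for _ in range(N_NEURONS)]


def mutate(pattern, rng, rate=0.1):
    """Derive a new candidate by flipping each bit with a small probability."""
    return [[bit ^ (rng.random() < rate) for bit in row] for row in pattern]


def learn(target, generations=200, population=20, seed=0):
    rng = random.Random(seed)
    # Start from randomly generated firing patterns.
    candidates = [random_pattern(rng) for _ in range(population)]
    best = min(candidates, key=lambda p: distance(simulate(p), target))
    for _ in range(generations):
        # "Reward" the closest pattern by making it the parent of the next batch.
        candidates = [best] + [mutate(best, rng) for _ in range(population - 1)]
        best = min(candidates, key=lambda p: distance(simulate(p), target))
    return best
```

Calling learn([0, 1, 2, 3, 4, 3, 2, 1]) returns a 4 x 8 ON/OFF matrix. Because the current best pattern is always kept among the candidates, the distance to the target never increases from one generation to the next.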

The Hill-type model, CPG, and the like are known as conventional simulation methods based on muscle contraction. In the conventional methods, motion is simulated with manually set parameters, so making the muscle model perform a different motion requires setting all of the parameters by hand again. In contrast, the learning execution device 100 according to the present embodiment can automatically learn a firing pattern capable of achieving the target motion, which eliminates the need to set parameters individually for each type of motion.

While learning proceeds, the learning execution device 100 may grow the muscles of the muscle model when the muscle model is operated based on a firing pattern. For example, when the muscle model is operated based on a firing pattern, the learning execution device 100 grows the motor units whose muscle fibers were contracted. In the conventional methods, depending on how the parameters were set, the resulting motion could differ from the motion of real muscles. In contrast, the learning execution device 100 according to the present embodiment also takes muscle growth into account and can thereby achieve more realistic motion.

The learning execution device 100 may be applied to various fields. For example, the learning execution device 100 learns a firing pattern that causes a CG character to perform an arbitrary motion, and generates a CG animation of the character performing that motion.

Conventionally, an animation had to be crafted by hand to make a character perform an arbitrary motion. With the learning execution device 100 according to the present embodiment, a CG animation of a character performing an arbitrary motion can be generated automatically, for example by learning the firing pattern of the interneurons so that the target motion is performed while the muscles of the muscle model grow. For example, when a dance motion is set as the target motion, a CG animation in which the character performs that dance can be generated automatically. By learning the firing patterns from the interneurons and reproducing motion through the same control pathway as in a real organism, the learning execution device 100 according to the present embodiment can achieve realistic motion.

In addition, with the conventional technology, when, for example, the dance motion of a human who is eight heads tall is applied to a character that is three heads tall, the motions may fail to correspond and the result may look unnatural. In contrast, the learning execution device 100 according to the present embodiment executes learning that takes into account the muscle structure and growth of the three-heads-tall character, and can thereby make that character move naturally.

The learning execution device 100 displays the generated CG animation on, for example, a display included in the learning execution device 100. The learning execution device 100 may also cause the communication terminal 200 to display the generated CG animation by transmitting it to the communication terminal 200 via the network 20.

The communication terminal 200 may be a PC (Personal Computer), a tablet terminal, a smartphone, or the like. The learning execution device 100 and the communication terminal 200 may communicate with each other via the network 20. The network 20 may include the Internet. The network 20 may include a LAN (Local Area Network). The network 20 may include a mobile communication network. The mobile communication network may conform to any of the 3G (3rd Generation) communication system, the LTE (Long Term Evolution) communication system, the 5G (5th Generation) communication system, and the 6G (6th Generation) communication system and later communication systems.

The learning execution device 100 may also be applied to, for example, the field of rehabilitation. For example, the learning execution device 100 registers a muscle model of a person undergoing walking rehabilitation and registers walking as the target motion. It then proceeds with learning the firing pattern of the interneurons and records the motions and muscle growth up to the point where walking becomes possible. This makes it possible to explore appropriate motions on the way to being able to walk.

The learning execution device 100 may also be applied to, for example, the field of sports science. For example, the learning execution device 100 registers a muscle model of an athlete and registers an ideal form or the like as the target motion. It then proceeds with learning the firing pattern of the interneurons and records the motions and muscle growth up to the point where the ideal form is acquired. This makes it possible to explore training methods.

Note that the learning execution device 100 may apply muscle growth to muscle models other than the muscle model that operates a muscle according to the firing pattern of interneurons. For example, the learning execution device 100 applies muscle growth to a muscle model based on the Hill-type model. As another example, the learning execution device 100 applies muscle growth to a muscle model that uses CPG. As yet another example, the learning execution device 100 applies muscle growth to a muscle model that uses DQN. The learning execution device 100 may also apply muscle growth to any other existing model.

FIG. 2 schematically shows an example of the muscle model 300. The muscle model 300 includes a plurality of interneurons 320 in a spinal cord 310 and a plurality of motor units 330 connected to each of the plurality of interneurons 320. One motor unit 330 includes a motor neuron 340 and muscle fibers 350 connected to the motor neuron 340. A plurality of muscle fibers 350 are connected to one motor neuron 340.

FIG. 3 schematically shows an example of the firing pattern 400. The firing pattern 400 indicates ON 402 and OFF 404 states of the plurality of interneurons 320 in time series. Applying the firing pattern 400 to the muscle model 300 causes signals to be input in time series from the interneurons 320 to the respective motor units 330, and the muscle fibers 350 of a motor unit 330 contract in response to ON 402. Various muscle movements are achieved in this way.
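The relationship between the muscle model 300 and the firing pattern 400 can be pictured with a small Python sketch. It assumes, purely for illustration, that the firing pattern is a matrix pattern[i][t] of 1 (ON) and 0 (OFF) values for interneuron i at time step t, and that every motor unit attached to an interneuron contracts once per ON signal. The class and field names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class MotorUnit:
    fiber_count: int          # muscle fibers attached to this unit's motor neuron
    contractions: int = 0     # how often the unit has contracted so far


@dataclass
class MuscleModel:
    # units_of[i] lists the motor units driven by interneuron i
    units_of: list = field(default_factory=list)


def apply_pattern(model, pattern):
    """Drive the model with pattern[i][t]; return the active interneurons per step."""
    timeline = []
    for t in range(len(pattern[0])):
        active = [i for i, row in enumerate(pattern) if row[t] == 1]
        for i in active:
            for unit in model.units_of[i]:
                unit.contractions += 1   # ON signal: the unit's fibers contract
        timeline.append(active)
    return timeline
```

For example, a model with two interneurons could be built as MuscleModel(units_of=[[MotorUnit(3)], [MotorUnit(2), MotorUnit(5)]]) and driven with the pattern [[1, 0, 1], [0, 0, 1]], in which both interneurons fire together only at the last time step.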

FIG. 4 schematically shows an example of the functional configuration of the learning execution device 100. The learning execution device 100 includes an information storage unit 102, an input reception unit 104, a data reception unit 106, a motion setting unit 108, a learning execution unit 110, and a display control unit 112.

The information storage unit 102 stores various types of information. The information storage unit 102 may store a muscle model. The information storage unit 102 may store a muscle model that operates a muscle by contracting the muscle fibers 350 connected to the motor neuron 340 included in a motor unit 330, according to the firing pattern of a plurality of interneurons 320 to each of which a motor unit 330 is connected.

情報格納部１０２は、筋肉モデル３００に含まれる複数の運動単位３３０のそれぞれについて、関連するパラメータを格納してよい。情報格納部１０２は、運動単位３３０が、速筋であるか遅筋であるかを示すタイプパラメータを格納してよい。タイプパラメータは、第１パラメータの一例であってよい。 The information storage unit 102 may store related parameters for each of the plurality of motor units 330 included in the muscle model 300. The information storage unit 102 may store a type parameter indicating whether the motor unit 330 is a fast muscle or a slow muscle. The type parameter may be an example of the first parameter.

情報格納部１０２は、収縮可能なエネルギーを示すパラメータであるＨＰを格納してよい。ＨＰは、第２パラメータの一例であってよい。情報格納部１０２は、ＨＰの最大値を示すＭＡＸＨＰを格納してよい。 The information storage unit 102 may store HP, which is a parameter indicating contractile energy. HP may be an example of the second parameter. The information storage unit 102 may store MAXHP indicating the maximum value of HP.

情報格納部１０２は、自己回復力を示すパラメータであるＭＰを格納してよい。ＭＰは、第３パラメータの一例であってよい。情報格納部１０２は、ＭＰの最大値を示すＭＡＸＭＰを格納してよい。 The information storage unit 102 may store MP, which is a parameter indicating self-healing power. MP may be an example of the third parameter. The information storage unit 102 may store MAXMP indicating the maximum value of MP.

情報格納部１０２は、筋繊維３５０が速筋である場合に、筋繊維３５０が収縮する毎に消費されるエネルギー量を示す第４パラメータを格納してよい。本例では、情報格納部１０２は、第４パラメータの一例である筋繊維３５０の直径を示すＤＩＡＭを格納する。情報格納部１０２は、運動単位３３０の使用に関連するパラメータであるＥＸＰを格納してよい。ＥＸＰは、例えば、運動単位３３０が使用されるたびに増加するパラメータであってよい。ＥＸＰは、例えば、運動単位３３０が使用された回数に関連するパラメータであってよい。ＥＸＰは、運動単位３３０が使用された回数そのものであってもよい。ＥＸＰは、第５パラメータの一例であってよい。 The information storage unit 102 may store a fourth parameter indicating the amount of energy consumed each time the muscle fiber 350 contracts when the muscle fiber 350 is a fast muscle. In this example, the information storage unit 102 stores a DIAM indicating the diameter of the muscle fiber 350, which is an example of the fourth parameter. The information storage unit 102 may store EXP, which is a parameter related to the use of the motor unit 330. EXP may be, for example, a parameter that increases each time the motor unit 330 is used. EXP may be, for example, a parameter related to the number of times the motor unit 330 has been used. The EXP may be the number of times the motor unit 330 has been used. EXP may be an example of the fifth parameter.

入力受付部１０４は、各種入力を受け付ける。入力受付部１０４は、学習実行装置１００が備える入力デバイスを介した入力を受け付けてよい。 The input receiving unit 104 receives various inputs. The input receiving unit 104 may receive an input via an input device included in the learning execution device 100.

データ受信部１０６は、ネットワーク２０を介して各種データを受信する。データ受信部１０６は、例えば、通信端末２００から、筋肉モデル３００を受信して情報格納部１０２に格納する。また、データ受信部１０６は、例えば、通信端末２００から、運動単位３３０のパラメータを受信して、情報格納部１０２に格納する。 The data receiving unit 106 receives various data via the network 20. The data receiving unit 106 receives, for example, the muscle model 300 from the communication terminal 200 and stores it in the information storage unit 102. Further, the data receiving unit 106 receives, for example, the parameter of the motor unit 330 from the communication terminal 200 and stores it in the information storage unit 102.

動作設定部１０８は、筋肉モデルの目標動作を設定する。動作設定部１０８は、例えば、入力受付部１０４が受け付けた入力に従って、筋肉モデル３００の目標動作を設定してよい。動作設定部１０８は、データ受信部１０６が通信端末２００から受信した設定指示に従って、筋肉モデル３００の目標動作を設定してよい。 The motion setting unit 108 sets the target motion of the muscle model. The motion setting unit 108 may set the target motion of the muscle model 300 according to the input received by the input reception unit 104, for example. The motion setting unit 108 may set the target motion of the muscle model 300 according to the setting instruction received from the communication terminal 200 by the data reception unit 106.

The learning execution unit 110 executes learning. The learning execution unit 110 may learn a firing pattern. The learning execution unit 110 may learn a firing pattern that realizes the target motion through learning that rewards, among a plurality of firing patterns, those for which the motion of the muscle model 300 is closer to the target motion. The learning execution unit 110 uses, for example, reinforcement learning. The learning execution unit 110 may use a DQN (Deep Q-Network). The learning execution unit 110 may use a GA (Genetic Algorithm). The learning execution unit 110 may use any other learning method.

学習実行部１１０は、例えば、ある目標動作を実現する発火パターンを学習する場合に、まず、ランダムに複数の発火パターンを発生させる。学習実行部１１０は、ランダムに発生させた複数の発火パターンのそれぞれに従って筋肉モデル３００を動作させ、筋肉モデル３００の動作が目標動作により近い発火パターンに基づいて複数の発火パターンを発生させる。学習実行部１１０は、発生させた複数の発火パターンのそれぞれに従って筋肉モデル３００を動作させ、筋肉モデル３００の動作が目標動作により近い発火パターンに基づいて複数の発火パターンを発生させる。学習実行部１１０は、これらを繰り返すことによって、目標動作を実現する発火パターンを学習してよい。 For example, when learning a firing pattern that realizes a certain target motion, the learning execution unit 110 randomly generates a plurality of firing patterns. The learning execution unit 110 operates the muscle model 300 according to each of the plurality of randomly generated firing patterns, and generates a plurality of firing patterns based on the firing patterns in which the movement of the muscle model 300 is closer to the target movement. The learning execution unit 110 operates the muscle model 300 according to each of the generated firing patterns, and generates a plurality of firing patterns based on the firing patterns in which the movement of the muscle model 300 is closer to the target movement. By repeating these, the learning execution unit 110 may learn the firing pattern that realizes the target motion.
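
The generate, evaluate, and regenerate cycle described above can be sketched as a small search loop. This is a minimal sketch under stated assumptions: `simulate_angle` is a toy stand-in for the muscle model 300 (it is not the model of the specification), firing patterns are flat bit lists, and new patterns are produced by random bit flips around the best pattern of the previous round.

```python
import random


def simulate_angle(pattern):
    """Toy stand-in for the muscle model: maps a pattern to a joint angle in degrees."""
    return 10.0 * sum(pattern)


def mutate(pattern, rng, flip_prob=0.1):
    """Derive a new pattern by flipping each bit with a small probability."""
    return [bit ^ 1 if rng.random() < flip_prob else bit for bit in pattern]


def learn_pattern(target_angle, n_bits=12, pop_size=20, generations=30, seed=0):
    rng = random.Random(seed)

    def error(pattern):
        return abs(simulate_angle(pattern) - target_angle)

    # Step 1: start from randomly generated firing patterns.
    population = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # Step 2: keep the pattern whose motion is closest to the target.
        best = min(population, key=error)
        history.append(error(best))
        # Step 3: generate the next set of patterns based on it (the best is kept).
        population = [best] + [mutate(best, rng) for _ in range(pop_size - 1)]
    best = min(population, key=error)
    history.append(error(best))
    return best, history


best_pattern, error_history = learn_pattern(target_angle=60.0)
```

Because the best pattern is carried over into each new set, the recorded error never increases from one round to the next.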

The learning execution unit 110 may generate a plurality of firing patterns based on already-learned firing patterns. For example, the learning execution unit 110 generates a plurality of firing patterns based on the firing pattern learned for the target motion of bending the knee to 20 degrees and holding it, the firing pattern learned for the target motion of bending the knee to 60 degrees and holding it, and the firing pattern learned for the target motion of bending the knee to 90 degrees and holding it. This makes it easy to prepare a plurality of firing patterns for, for example, the target motion of bending the knee to an arbitrary angle and holding it, and shortens the overall time required compared with generating firing patterns at random.

学習実行部１１０は、学習を進める間、発火パターンに基づいて筋肉モデル３００を動作させた場合に、筋繊維３５０を収縮させた運動単位３３０を成長させてよい。 The learning execution unit 110 may grow the motor unit 330 in which the muscle fiber 350 is contracted when the muscle model 300 is operated based on the firing pattern while the learning is progressing.

筋肉モデル３００は、速筋の運動単位３３０と、遅筋の運動単位３３０とを含んでよい。学習実行部１１０は、発火パターンに基づいて筋肉モデル３００を動作させた場合に、速筋の運動単位３３０と遅筋の運動単位３３０とを異なる基準に従って成長させてよい。 The muscle model 300 may include a fast muscle motor unit 330 and a slow muscle motor unit 330. When the muscle model 300 is operated based on the firing pattern, the learning execution unit 110 may grow the motor unit 330 of the fast muscle and the motor unit 330 of the slow muscle according to different criteria.

学習実行部１１０は、運動単位３３０が収縮する毎にＨＰから予め定められた値を減算してよく、ＭＰが０でない間は、時間経過に伴ってＨＰを回復させてよい。学習実行部１１０は、運動単位３３０が速筋である場合には、運動単位３３０が収縮する毎にＨＰからＤＩＡＭを減算してよい。学習実行部１１０は、運動単位３３０が遅筋である場合には、運動単位３３０が収縮する毎にＨＰから１を減算してよい。なお、これに限らず、学習実行部１１０は、運動単位３３０が速筋である場合に、運動単位３３０が収縮する毎にＨＰからＤＩＡＭ以外の値を減算してもよい。また、学習実行部１１０は、運動単位３３０が遅筋である場合に、運動単位３３０が収縮する毎にＨＰから、例えばＤＩＡＭの値等の、１以外の値を減算してもよい。学習実行部１１０は、ＭＰが０でない間は、時間経過に伴ってＨＰを回復させてよい。学習実行部１１０は、ＭＰが０になった場合、ＨＰの回復を行わなくてよい。学習実行部１１０は、時間経過に伴って、ＭＰを回復させてよい。 The learning execution unit 110 may subtract a predetermined value from the HP each time the motor unit 330 contracts, and may recover the HP with the passage of time while the MP is not 0. When the motor unit 330 is a fast muscle, the learning execution unit 110 may subtract DIAM from the HP each time the motor unit 330 contracts. When the motor unit 330 is a slow muscle, the learning execution unit 110 may subtract 1 from the HP each time the motor unit 330 contracts. Not limited to this, the learning execution unit 110 may subtract a value other than DIAM from the HP every time the motor unit 330 contracts when the motor unit 330 is a fast muscle. Further, the learning execution unit 110 may subtract a value other than 1 from the HP every time the motor unit 330 contracts, for example, the value of DIAM, when the motor unit 330 is a slow muscle. The learning execution unit 110 may recover HP with the passage of time while MP is not 0. The learning execution unit 110 does not have to recover the HP when the MP becomes 0. The learning execution unit 110 may recover MP with the passage of time.

When the learning execution unit 110 determines that the muscle fiber 350 has been damaged and subsequently determines that the muscle fiber 350 has recovered, it may increase MAXHP and DIAM if the motor unit 330 is a fast muscle, and may increase MAXMP if the motor unit 330 is a slow muscle.

When the learning execution unit 110 determines that the muscle fiber 350 has been damaged and subsequently determines that it has recovered, the learning execution unit 110 need not increase MAXMP if the motor unit 330 is a fast muscle, and need not increase MAXHP if the motor unit 330 is a slow muscle. For example, the learning execution unit 110 may determine that the muscle fiber 350 is damaged when HP reaches 0, and may determine that the muscle fiber 350 has recovered when HP reaches MAXHP or rises above a predetermined threshold.
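
The HP and MP bookkeeping and the damage-then-recovery growth rule can be illustrated with a small class. This is only a sketch: the class name, the per-step gains `hp_gain` and `mp_gain`, the growth increment, and the choice of "HP reaches MAXHP" as the recovery test are assumptions. How MP is consumed is not stated in this passage, so the sketch does not model it.

```python
from dataclasses import dataclass


@dataclass
class MotorUnit:
    is_fast: bool          # type parameter: fast muscle or slow muscle
    hp: float              # energy available for contraction
    max_hp: float
    mp: float              # self-recovery power
    max_mp: float
    diam: float = 1.0      # fiber diameter, used as the cost for fast muscles
    damaged: bool = False

    def contract(self):
        """Spend HP for one contraction; return False if the unit cannot contract."""
        if self.damaged or self.hp <= 0:
            return False
        cost = self.diam if self.is_fast else 1.0
        self.hp = max(0.0, self.hp - cost)
        if self.hp == 0.0:
            self.damaged = True  # HP reached 0: treated as damage
        return True

    def rest(self, hp_gain=1.0, mp_gain=1.0, growth=1.0):
        """Advance one time step without contraction (gains are assumed values)."""
        if self.mp > 0:  # HP recovers only while MP is not 0
            self.hp = min(self.max_hp, self.hp + hp_gain)
        self.mp = min(self.max_mp, self.mp + mp_gain)  # MP recovers over time
        if self.damaged and self.hp >= self.max_hp:
            self.damaged = False
            if self.is_fast:
                self.max_hp += growth  # fast muscle: MAXHP and DIAM grow
                self.diam += growth
            else:
                self.max_mp += growth  # slow muscle: MAXMP grows
```

With this sketch, a fast unit that is driven to HP 0 and then rested back to MAXHP ends up with a larger MAXHP and DIAM, while a slow unit ends up with a larger MAXMP only.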

学習実行部１１０は、ＥＸＰの増加に伴って、運動単位３３０のレベルを向上させてよい。学習実行部１１０は、例えば、レベル毎に定められたＥＸＰの値を登録しておき、ＥＸＰの値がレベルに対応するＥＸＰの値を超えた場合に、運動単位３３０のレベルを向上させる。より高いレベルに対して、より多いＥＸＰの値が登録されてよい。 The learning execution unit 110 may improve the level of the motor unit 330 as the EXP increases. For example, the learning execution unit 110 registers an EXP value determined for each level, and improves the level of the motor unit 330 when the EXP value exceeds the EXP value corresponding to the level. More EXP values may be registered for higher levels.

The higher the level of the motor unit 330, the harder the learning execution unit 110 may make it to increase MAXHP and DIAM when the motor unit 330 is a fast muscle, and to increase MAXMP when the motor unit 330 is a slow muscle.
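
One way to picture the level mechanism is a table of EXP thresholds plus a growth increment that shrinks with level. The threshold values and the `1 / (1 + level)` damping are purely illustrative assumptions; the text only states that higher levels have larger registered EXP values and make growth harder.

```python
import bisect

# Hypothetical EXP thresholds: exceeding LEVEL_EXP[k] lifts the unit to level k + 1.
LEVEL_EXP = [10, 30, 70, 150]


def level_for(exp):
    """Number of thresholds that EXP has strictly exceeded."""
    return bisect.bisect_left(LEVEL_EXP, exp)


def growth_increment(level, base=1.0):
    """Assumed damping: the same recovery event yields less growth at higher levels."""
    return base / (1 + level)
```

For example, an EXP of exactly 10 still maps to level 0, and 11 maps to level 1, which matches the "exceeds" wording above.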

After contracting the muscle fiber 350 of the motor unit 330, the learning execution unit 110 may prevent that muscle fiber 350 from contracting again until a predetermined refractory period has elapsed. The information storage unit 102 may store the temperature of each of the plurality of motor units 330. The learning execution unit 110 may raise the temperature of a motor unit 330 the more it is used, and may lower its temperature over time while it is not used. The learning execution unit 110 may shorten the refractory period as the temperature of the motor unit 330 rises.
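
The interaction between use, temperature, and the refractory period can be sketched as follows. The functional form of `refractory_ms`, the heating and cooling amounts, and the class name are assumptions chosen only to show the qualitative behavior described above.

```python
def refractory_ms(temperature, base_ms=5.0, min_ms=1.0, sensitivity=0.1):
    """Refractory period that shrinks as temperature rises (assumed form)."""
    return max(min_ms, base_ms / (1.0 + sensitivity * temperature))


class ThermalUnit:
    """Tracks the temperature of one motor unit and when it may contract again."""

    def __init__(self):
        self.temperature = 0.0
        self.ready_at_ms = 0.0

    def step(self, now_ms, fire, heat=1.0, cool=0.5):
        """Try to contract at now_ms; return True only if a contraction happened."""
        if fire and now_ms >= self.ready_at_ms:
            self.temperature += heat  # use warms the unit
            self.ready_at_ms = now_ms + refractory_ms(self.temperature)
            return True
        # Not used in this step (idle or still refractory): the unit cools down.
        self.temperature = max(0.0, self.temperature - cool)
        return False
```

A fire request that arrives inside the refractory window is simply ignored, so a warmer unit can respond to more closely spaced signals than a cold one.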

When the motion of the muscle model, operated according to a time series of firing patterns, achieves the target motion, the learning execution unit 110 may execute learning by updating the firing pattern of the state a predetermined time before the state in which the target motion was achieved. Between the generation of a firing pattern and the actual movement of the muscle there are various time delays, such as the refractory period and inertia, so updating the firing pattern at the instant the reward is obtained may be undesirable. In contrast, the learning execution unit 110 updates the firing pattern of the state a predetermined time before the state in which the target motion was achieved, which can improve learning accuracy.

The predetermined time may be freely settable and may be changeable. The learning execution unit 110 may use different times for fast muscles and slow muscles. For example, when the motor unit 330 is a fast muscle, the learning execution unit 110 may update the firing pattern of the state 20 ms before the state in which the target motion was achieved, and when the motor unit 330 is a slow muscle, it may update the firing pattern of the state 40 ms before that state.
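
Selecting which past firing pattern to update can be sketched as a lookup into a time-stamped history. The 20 ms and 40 ms values are the examples given above; the history format and the function name are assumptions for illustration.

```python
import bisect

# Example look-back times from the text, keyed by motor unit type.
DELAY_MS = {"fast": 20, "slow": 40}


def pattern_to_update(history, goal_time_ms, unit_type):
    """Pick the history entry in effect DELAY_MS[unit_type] before the goal.

    history is a list of (time_ms, pattern) tuples sorted by time.
    Returns None when the history does not reach back far enough.
    """
    target = goal_time_ms - DELAY_MS[unit_type]
    times = [t for t, _ in history]
    idx = bisect.bisect_right(times, target) - 1  # latest entry at or before target
    return history[idx] if idx >= 0 else None


history = [(0, "p0"), (10, "p1"), (20, "p2"), (30, "p3"), (40, "p4"), (50, "p5")]
fast_entry = pattern_to_update(history, goal_time_ms=50, unit_type="fast")
slow_entry = pattern_to_update(history, goal_time_ms=50, unit_type="slow")
```

With the goal reached at 50 ms, the fast-muscle update targets the pattern recorded at 30 ms and the slow-muscle update targets the one recorded at 10 ms.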

The learning execution unit 110 may generate display data using the learned firing pattern. For example, the learning execution unit 110 generates a CG animation in which an arbitrary character is moved by the firing pattern. The learning execution unit 110 may generate display data showing data on the motion of the muscle model 300 and the growth of its muscles from the start of learning until the target motion is realized. The learning execution unit 110 may also generate display data showing data on the motion of the muscle model 300 and the growth of its muscles from the start of learning until an ideal form is realized.

表示制御部１１２は、学習実行部１１０による学習結果に関連する各種表示を制御する。表示制御部１１２は、例えば、学習実行部１１０によって生成された表示データを、学習実行装置１００が備えるディスプレイに表示させる。表示制御部１１２は、学習実行部１１０によって生成された表示データを、ネットワーク２０を介して通信端末２００に送信し、通信端末２００が備えるディスプレイに表示させてもよい。 The display control unit 112 controls various displays related to the learning result by the learning execution unit 110. The display control unit 112 displays, for example, the display data generated by the learning execution unit 110 on the display included in the learning execution device 100. The display control unit 112 may transmit the display data generated by the learning execution unit 110 to the communication terminal 200 via the network 20 and display it on the display provided in the communication terminal 200.

情報格納部１０２は、既知のモデルに従った筋肉モデルを格納してもよい。情報格納部１０２は、例えば、ヒルタイプモデルに基づく筋肉モデルを格納する。情報格納部１０２は、ＣＰＧを用いた筋肉モデルを格納してもよい。情報格納部１０２は、ＤＱＮを用いた筋肉モデルを格納してもよい。 The information storage unit 102 may store a muscle model according to a known model. The information storage unit 102 stores, for example, a muscle model based on a hill type model. The information storage unit 102 may store a muscle model using CPG. The information storage unit 102 may store a muscle model using DQN.

For each of the plurality of muscle fibers included in the muscle of a muscle model that follows a known model, the information storage unit 102 may store the type parameter, HP, MAXHP, MP, MAXMP, and DIAM.

学習実行部１１０は、筋繊維が収縮する毎に、筋繊維が速筋である場合にＨＰからＤＩＡＭを減算し、筋繊維が遅筋である場合にＨＰから１を減算し、ＭＰが０でない間は、時間経過に伴ってＨＰを回復させ、筋繊維が損傷したと判定した後、筋繊維が回復したと判定した場合に、筋繊維が速筋である場合には、ＭＡＸＨＰ及びＤＩＡＭを増加させ、筋繊維が遅筋である場合には、ＭＡＸＭＰを増加させてよい。 Each time the muscle fiber contracts, the learning execution unit 110 subtracts DIAM from HP when the muscle fiber is a fast muscle, subtracts 1 from HP when the muscle fiber is a slow muscle, and MP is not 0. In the meantime, HP is restored with the passage of time, and after it is determined that the muscle fiber is damaged, when it is determined that the muscle fiber has recovered, MAXHP and DIAM are increased if the muscle fiber is a fast muscle. If the muscle fibers are slow muscles, MAXMP may be increased.

学習実行部１１０は、ＭＰが０になった場合、ＨＰの回復を行わなくてよい。学習実行部１１０は、時間経過に伴って、ＭＰを回復させてよい。学習実行部１１０は、筋繊維が損傷したと判定した後、ＨＰが回復した場合において、筋繊維が速筋である場合、ＭＡＸＭＰは増大させなくてよい。学習実行部１１０は、筋繊維が損傷したと判定した後、ＨＰが回復した場合において、筋繊維が遅筋である場合、ＭＡＸＨＰは増大させなくてよい。 The learning execution unit 110 does not have to recover the HP when the MP becomes 0. The learning execution unit 110 may recover MP with the passage of time. The learning execution unit 110 does not have to increase MAXMP when the muscle fiber is a fast muscle in the case where the HP is recovered after determining that the muscle fiber is damaged. The learning execution unit 110 does not have to increase MAXHP when the muscle fiber is a slow muscle in the case where the HP is recovered after determining that the muscle fiber is damaged.

The learning execution unit 110 may raise the level of a muscle fiber as its EXP increases. For example, the learning execution unit 110 registers an EXP value for each level and raises the level of the muscle fiber when its EXP exceeds the EXP value corresponding to that level. Larger EXP values may be registered for higher levels. The higher the level of the muscle fiber, the harder the learning execution unit 110 may make it to increase MAXHP and DIAM when the muscle fiber is a fast muscle, and to increase MAXMP when the muscle fiber is a slow muscle.

学習実行部１１０は、筋繊維を収縮させた後、予め定められた不応期を経過するまで、当該筋繊維が収縮できないようにしてよい。情報格納部１０２は、複数の筋繊維のそれぞれの温度を格納してもよい。学習実行部１１０は、筋繊維が使用されるほど、筋繊維の温度を高くしてよく、筋繊維が使用されなければ、時間経過に伴って、筋繊維の温度を低くしてよい。学習実行部１１０は、筋繊維の温度が高いほど不応期を短くしてよい。 After contracting the muscle fiber, the learning execution unit 110 may prevent the muscle fiber from contracting until a predetermined refractory period elapses. The information storage unit 102 may store the temperature of each of the plurality of muscle fibers. The learning execution unit 110 may raise the temperature of the muscle fiber as the muscle fiber is used, and may lower the temperature of the muscle fiber with the passage of time if the muscle fiber is not used. The learning execution unit 110 may shorten the refractory period as the temperature of the muscle fiber is higher.

図５は、筋肉モデル３００の具体例を概略的に示す。図５では、人間の腱３７２、膝３７４、及び骨３７６に対応する筋肉３６０の筋肉モデル３００を例示する。上述の通り、脊髄３１０内には複数の介在ニューロン３２０が存在する。脊髄３１０は、学習器とみなすことも可能である。複数の介在ニューロン３２０のそれぞれは、発火と非発火の２つの状態をとり得る。運動単位３３０には、運動ニューロン３４０と、運動ニューロン３４０に接続された筋繊維３５０とが含まれる。１つの運動ニューロン３４０には、複数の筋繊維３５０が接続される。運動ニューロン３４０には、速筋と遅筋との２つの種類があってよい。運動ニューロン３４０は、サイズが大きい場合、速筋であってよく、サイズが小さい場合、遅筋であってよい。運動ニューロン３４０は、例えば、サイズが閾値より大きい場合、速筋であり、サイズが閾値より小さいばあい、遅筋である。筋繊維３５０は、速筋繊維と遅筋繊維との２つの種類があってよい。筋肉３６０は、筋繊維３５０の集合体である。本例において、学習実行部１１０は、１つのモデルとして、２つの筋肉（伸筋と屈筋）が接続された膝関節に対して、単純な動きを発火パターンで制御する。運動単位３３０には、速筋及び遅筋の２つの種類があってよく、学習実行部１１０は、速筋と遅筋とでそれぞれ異なる成長を行わせてよい。 FIG. 5 schematically shows a specific example of the muscle model 300. FIG. 5 illustrates a muscle model 300 of muscles 360 corresponding to human tendons 372, knees 374, and bones 376. As described above, there are a plurality of interneurons 320 in the spinal cord 310. The spinal cord 310 can also be regarded as a learning device. Each of the plurality of interneurons 320 can be in two states, firing and non-firing. Motor unit 330 includes motor neurons 340 and muscle fibers 350 connected to motor neurons 340. A plurality of muscle fibers 350 are connected to one motor neuron 340. There may be two types of motor neurons 340, fast muscles and slow muscles. The motor neuron 340 may be a fast muscle if it is large in size and a slow muscle if it is small in size. The motor neuron 340 is, for example, a fast muscle when the size is larger than the threshold, and a slow muscle when the size is smaller than the threshold. There may be two types of muscle fibers 350, fast muscle fibers and slow muscle fibers. Muscle 360 is an aggregate of muscle fibers 350. In this example, the learning execution unit 110 controls a simple movement with a firing pattern for a knee joint in which two muscles (extensor muscle and flexor muscle) are connected as one model. 
There may be two types of motor units 330, fast muscles and slow muscles, and the learning execution unit 110 may cause the fast muscles and the slow muscles to grow differently.

To control muscles using firing patterns, the action potentials of the neurons must be calculated. When firing an interneuron 320, the learning execution unit 110 may calculate the action potential according to, for example, the Hodgkin-Huxley model (A.L. Hodgkin, A. A quantitative description of membrane current and its application to conduction and excitation in nerve, from the physiological laboratory. University of Cambridge, pp. 500-544, 1952.). The calculated action potential is distributed to the connected motor neurons 340 according to Kirchhoff's law.

学習実行部１１０は、拡張したヒルタイプモデルを用いてよく、発火している運動ニューロン３４０の活動電位の合算を筋肉モデルの入力信号としてよい。筋肉モデルにおいて、筋肉の収縮力が計算され、物理法則に従って、筋肉の収縮力から膝関節のトルクに変換し、膝を動かして、関節角度が変化する。学習実行部１１０は、運動結果を関節角度として出力してよい。関節角度が目標角度を達成した場合、学習実行部１１０は、発火パターンに報酬を与えてよい。 The learning execution unit 110 may use the expanded hill type model, and the sum of the action potentials of the firing motor neurons 340 may be used as the input signal of the muscle model. In the muscle model, the contractile force of the muscle is calculated, and according to the laws of physics, the contractile force of the muscle is converted into the torque of the knee joint, and the knee is moved to change the joint angle. The learning execution unit 110 may output the exercise result as a joint angle. When the joint angle reaches the target angle, the learning execution unit 110 may reward the firing pattern.

As described above, the learning execution unit 110 may use an extended Hill-type model. The conventional Hill-type model consists of a contractile element (CE) of the muscle, a parallel elastic element (PEE) arranged in parallel with the CE, and a series elastic element (SEE) arranged in series with it. To reduce muscle spasms caused by the spring constants, the extended model adds a damping coefficient to the tendon force calculation of the conventional Hill-type model. In the Hill-type model, a muscle fiber receives current from a motor neuron and converts it into force using the PEE, SEE, and CE.

lopt is the CE length optimized to obtain the maximum CE force, A is the muscle activity ratio, and lce is the length of the CE. Several equations have been proposed to approximate this function. For example, the Rosen and Kuo model (Deshpande, P.-H. K. . A. D. Contribution of passive properties of muscle-tendon units to the metacarpophalangeal joint torque of the index finger. IEEE, pp. 288-294, 2010.) may be applied.

Ｖｃｅは、ＣＥの収縮速度であり、Ｖｍａｘは、ＣＥの最大収縮速度である。ＰＥが発生する力であるＦｐｅの式は次のとおりである。 Vce is the contraction rate of CE, and Vmax is the maximum contraction rate of CE. The formula of Fpe, which is the force generated by PE, is as follows.

Ｋｐｅは、ＰＥのばね定数であり、ｌｐｅは、ＰＥの長さであり、ｌｐｅ_ｒｅｓｔはＰＥの平衡長であり、ｄｐｅは、ＰＥの減衰係数であり、Ｖｐｅは、ＰＥの終端速度である。ＳＥＥの力であるＦｓｅの式は次の通りである。 Kpe is the spring constant of PE, lpe is the length of PE, lpe_rest is the equilibrium length of PE, dpe is the damping coefficient of PE, and Vpe is the terminal velocity of PE. The formula of Fse, which is the power of SEE, is as follows.

ｋｓｅは、ＳＥＥのばね定数であり、ｌｓｅは、ＳＥＥの長さであり、ｌｓｅ_ｒｅｓｔはＳＥＥの平衡長であり、ｄｓｅは、ＳＥＥの減衰係数であり、Ｖｓｅは、ＳＥＥの終端速度である。 kse is the spring constant of SEE, lse is the length of SEE, lse_rest is the equilibrium length of SEE, dse is the damping coefficient of SEE, and Vse is the terminal velocity of SEE.
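
The equations for Fpe and Fse are not reproduced in this text. Purely as an assumption, a linear spring-damper form that uses exactly the symbols defined above would read as follows; the sign conventions and any nonlinear terms in the original equations may differ.

```latex
% Assumed linear spring-damper forms; not taken from the original equations.
F_{pe} = K_{pe}\,\bigl(l_{pe} - l_{pe\_rest}\bigr) + d_{pe}\,V_{pe}
\qquad
F_{se} = k_{se}\,\bigl(l_{se} - l_{se\_rest}\bigr) + d_{se}\,V_{se}
```

In this reading, the damping terms d_pe V_pe and d_se V_se are what the extended model adds to suppress the oscillation that a pure spring term would produce.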

When a motor unit 330 receives an action potential, muscle contraction is triggered. The time it takes for the contraction to reach its peak force is called the contraction time. A slow-muscle motor unit 330 has a long contraction time and a small maximum contractile force. A fast-muscle motor unit 330 has a short contraction time and a large maximum contractile force. One muscle is composed of a plurality of fast-muscle motor units 330 and a plurality of slow-muscle motor units 330. A Hill-type model made up of these motor units 330 is therefore adopted.

N is the number of fast-muscle motor units 330, M is the number of slow-muscle motor units 330, Fce_f_i is the contraction force of the i-th fast muscle, and Fce_s_j is the contraction force of the j-th slow muscle.
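
The equation that these symbols belong to is not reproduced in this text. Under the assumption that the contractile force of the whole muscle is the sum of the contributions of its motor units, it could be written as follows, where the name F_ce for the total is itself an assumption.

```latex
% Assumed form: total contractile force as the sum over fast and slow motor units.
F_{ce} = \sum_{i=1}^{N} F_{ce\_f\_i} + \sum_{j=1}^{M} F_{ce\_s\_j}
```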

遅筋の運動単位３３０及び速筋の運動単位３３０の生物学的特性が、本実施形態に係る筋肉モデルによってモデル化される。運動ニューロン３４０と筋繊維３５０で構成される運動単位３３０の成長モデルでは、すべての筋繊維３５０に、筋収縮に使用できるエネルギー値（ＨＰ）がある。収縮の度に、速筋のＨＰの値を、筋繊維３５０の直径に等しい値だけ減少させてよい。また、収縮の度に、遅筋のＨＰの値を、１だけ減少させてよい。継続的な筋肉の収縮によりＨＰが減少し、ＨＰが０になると、筋断裂が発生する。筋断裂が発生すると、回復しなければ、介在ニューロン３２０から電気信号を受信した場合でも、筋繊維３５０を再び収縮させることはできない。一方、介在ニューロン３２０からの信号の間隔が十分に大きければ、筋繊維３５０は自然に回復することができる。本モデルにおいて、自己回復力を示すＭＰが０でない限り、筋繊維３５０は、時間の経過とともに回復する。これらによって、学習実行部１１０は、様々な発火パターンを学習することができる。 The biological properties of the slow muscle motor unit 330 and the fast muscle motor unit 330 are modeled by the muscle model according to the present embodiment. In a growth model of motor unit 330 composed of motor neurons 340 and muscle fibers 350, all muscle fibers 350 have an energy value (HP) that can be used for muscle contraction. With each contraction, the HP value of the fast muscle may be reduced by a value equal to the diameter of the muscle fiber 350. Further, the HP value of the slow muscle may be decreased by 1 each time the contraction occurs. HP decreases due to continuous muscle contraction, and when HP becomes 0, muscle rupture occurs. When a muscle rupture occurs, the muscle fiber 350 cannot be contracted again even if an electric signal is received from the interneuron 320 unless it recovers. On the other hand, if the signal spacing from the interneuron 320 is large enough, the muscle fiber 350 can recover spontaneously. In this model, the muscle fiber 350 recovers over time unless the MP showing self-healing power is 0. As a result, the learning execution unit 110 can learn various firing patterns.

運動単位３３０は、使用されるたびにＥＸＰを取得し、成長を促進する。本モデルにおいては、成長のレベルを表すためにＬＶを定義している。遅筋の運動単位と速筋の運動単位には、異なる成長規則がある。速筋の運動単位３３０の場合、ＭＡＸＨＰ及び筋繊維３５０の直径のパラメータが増加する。当該ルールは、生物学的な成長ルールに基づいている。 Motor unit 330 acquires EXP each time it is used and promotes growth. In this model, LV is defined to represent the level of growth. Slow muscle motor units and fast muscle motor units have different growth rules. For the fast muscle motor unit 330, the MAXHP and muscle fiber 350 diameter parameters increase. The rules are based on biological growth rules.

速筋の運動単位３３０には、筋繊維の周囲に衛星細胞が存在する。筋断裂が発生すると、衛星細胞が分裂し、速筋の筋繊維３５０のサイズが増加する。太い筋繊維３５０ほど強度は高くなるが、より多くのＨＰを必要とする。 In the motor unit 330 of the fast muscle, satellite cells are present around the muscle fiber. When muscle rupture occurs, satellite cells divide and the size of fast muscle fibers 350 increases. The thicker the muscle fiber 350, the higher the strength, but it requires more HP.

遅筋の筋繊維３５０は、サイズが増加しないが、自己回復力が増加する。生物学的な成長ルールによれば、遅筋の筋繊維３５０の周囲の毛細血管の数が増加するため、遅筋の筋繊維３５０に輸送される酸素の量が増加する。 Slow muscle fibers 350 do not increase in size but increase self-healing power. According to biological growth rules, the number of capillaries surrounding the slow muscle fiber 350 increases, thus increasing the amount of oxygen transported to the slow muscle fiber 350.

本モデルでは、筋繊維３５０の疲労を示すパラメータであるＳＰをさらに含んでもよい。遅筋の筋繊維３５０のみにおいて、成長に伴ってＳＰの値が減少する。すなわち、遅筋の筋繊維は、より長く使用されることができる。 In this model, SP, which is a parameter indicating the fatigue of the muscle fiber 350, may be further included. Only in the slow muscle fiber 350, the SP value decreases with growth. That is, slow muscle fibers can be used longer.

表１は、各パラメータの説明を示し、表２は、アルゴリズムの一例を示す。 Table 1 shows a description of each parameter, and Table 2 shows an example of the algorithm.

学習実行部１１０は、介在ニューロン３２０の発火パターンを学習するために、Ｑラーニングを使用してよい。学習プロセスは、各介在ニューロン３２０をエージェントとするマルチエージェントシステム学習に基づいてよい。各エージェントは、その環境を監視する。環境とは、介在ニューロン３２０と運動ニューロン３４０との接続性、及び運動単位３３０のパラメータとして定義されてよい。学習中、初期接続設定は変更されないが、運動単位のパラメータは変更可能であってよい。 The learning execution unit 110 may use Q-learning to learn the firing pattern of the interneuron 320. The learning process may be based on multi-agent system learning with each interneuron 320 as an agent. Each agent monitors its environment. The environment may be defined as the connectivity between the interneuron 320 and the motor neuron 340, and the parameters of the motor unit 330. During learning, the initial connection settings are not changed, but the motor unit parameters may be changeable.

各介在ニューロン３２０は、複数の運動ニューロン３４０に接続されており、速筋又は遅筋のいずれかに接続される。なお、運動単位３３０の筋繊維３５０が速筋であるか遅筋であるかは、接続している運動ニューロンのサイズによって決まる。この原理は生物学に由来する。エージェントは、エージェント間で状態情報を共有できる。これは、ミエリン接続による情報共有と同等である。 Each interneuron 320 is connected to a plurality of motor neurons 340 and is connected to either fast or slow muscles. Whether the muscle fiber 350 of the motor unit 330 is a fast muscle or a slow muscle is determined by the size of the connected motor neuron. This principle comes from biology. Agents can share state information between agents. This is equivalent to information sharing via a myelin connection.

For the state-action pairs in Q-learning, Qi = (si : ai), where si denotes the state of each agent.

M is the total of the motor neurons 340 connected to the interneuron 320, and O is the total of the motor neurons 340 connected to the other interneurons 320. Each agent performs an action ai, which is either to fire (1) or not to fire (0).

Each interneuron 320 monitors all parameters of each connected motor unit, as well as the total energy of the motor units held by the other interneurons 320 with which it shares information, and decides whether to fire. Based on the firing of the interneuron 320, the electrical signal of the connected motor units is calculated using the Hodgkin-Huxley model and is used to calculate the input signal. Next, using the extended Hill-type model, the angle calculated from the contraction of the muscle is fed back to the agent.
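One way to turn this observation into a Q-learning state is sketched below. The source does not specify the state encoding, so everything here is an assumption made for illustration: the `UnitParams` fields, the function name `build_state`, and the bucketing step used to discretize values into a hashable tuple.

```python
from typing import NamedTuple, Sequence, Tuple


class UnitParams(NamedTuple):
    energy: float    # contractile energy of the motor unit
    recovery: float  # self-recovery capacity of the motor unit


def build_state(
    own_units: Sequence[UnitParams],
    other_units: Sequence[UnitParams],
    step: float = 1.0,
) -> Tuple[int, ...]:
    """Combine the agent's own unit parameters with the shared energy total."""

    def bucket(value: float) -> int:
        # Discretize so the state can be used as a Q-table key.
        return int(value // step)

    own = tuple(bucket(v) for unit in own_units for v in unit)
    shared_energy = bucket(sum(unit.energy for unit in other_units))
    return own + (shared_energy,)
```

The agent sees every parameter of its own motor units but only an aggregate for the units held by the other interneurons, which keeps the state size independent of how many other agents exist.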

There may be two types of reward: immediate and delayed. As the immediate reward, the agent receives rgoal each time the knee joint achieves the target angle. The agent continues to receive the reward for as long as the knee joint continues to achieve the target angle.

As the delayed reward, the sum of the remaining HP of all agents is distributed evenly to all agents as a reward at the end of the episode. This contributes to cooperative behavior that produces efficient movement.
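The two reward types can be sketched as follows. The function names, the default value of `r_goal`, and the angular tolerance `tol` used to decide that the target angle is "achieved" are assumptions for illustration; "distributed evenly" is read here as each agent receiving an equal share of the total.

```python
from typing import List, Sequence


def immediate_reward(angle: float, target: float,
                     r_goal: float = 1.0, tol: float = 1.0) -> float:
    """Return r_goal while the knee angle is within tol of the target, else 0."""
    return r_goal if abs(angle - target) <= tol else 0.0


def delayed_rewards(remaining_hp: Sequence[float]) -> List[float]:
    """At episode end, split the total remaining HP equally among all agents."""
    n = len(remaining_hp)
    if n == 0:
        return []
    share = sum(remaining_hp) / n
    return [share] * n
```

Because every agent receives the same share regardless of its own remaining HP, an agent benefits when the group as a whole conserves energy, which is the cooperative incentive the text describes.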

FIG. 6 schematically shows an example of the hardware configuration of a computer 1200 that functions as the learning execution device 100. A program installed on the computer 1200 can cause the computer 1200 to function as one or more "units" of the device according to the present embodiment, or cause the computer 1200 to execute operations associated with the device according to the present embodiment or with the one or more "units", and/or cause the computer 1200 to execute a process according to the present embodiment or steps of that process. Such a program may be executed by the CPU 1212 to cause the computer 1200 to perform specific operations associated with some or all of the blocks of the flowcharts and block diagrams described herein.

The computer 1200 according to the present embodiment includes a CPU 1212, a RAM 1214, and a graphic controller 1216, which are interconnected by a host controller 1210. The computer 1200 also includes input/output units such as a communication interface 1222, a storage device 1224, a DVD drive, and an IC card drive, which are connected to the host controller 1210 via an input/output controller 1220. The DVD drive may be a DVD-ROM drive, a DVD-RAM drive, or the like. The storage device 1224 may be a hard disk drive, a solid state drive, or the like. The computer 1200 also includes legacy input/output units such as a ROM 1230 and a keyboard, which are connected to the input/output controller 1220 via an input/output chip 1240.

The CPU 1212 operates according to programs stored in the ROM 1230 and the RAM 1214, thereby controlling each unit. The graphic controller 1216 acquires image data generated by the CPU 1212 in a frame buffer or the like provided in the RAM 1214 or in the graphic controller itself, and causes the image data to be displayed on a display device 1218.

The communication interface 1222 communicates with other electronic devices via a network. The storage device 1224 stores programs and data used by the CPU 1212 in the computer 1200. The DVD drive reads programs or data from a DVD-ROM or the like and provides them to the storage device 1224. The IC card drive reads programs and data from an IC card and/or writes programs and data to the IC card.

The ROM 1230 stores a boot program or the like executed by the computer 1200 at activation, and/or programs that depend on the hardware of the computer 1200. The input/output chip 1240 may also connect various input/output units to the input/output controller 1220 via a USB port, a parallel port, a serial port, a keyboard port, a mouse port, or the like.

The program is provided by a computer-readable storage medium such as a DVD-ROM or an IC card. The program is read from the computer-readable storage medium, installed in the storage device 1224, the RAM 1214, or the ROM 1230, which are also examples of computer-readable storage media, and executed by the CPU 1212. The information processing described in these programs is read by the computer 1200 and brings about cooperation between the programs and the various types of hardware resources described above. A device or method may be constituted by realizing the operation or processing of information through use of the computer 1200.

For example, when communication is performed between the computer 1200 and an external device, the CPU 1212 may execute a communication program loaded in the RAM 1214 and, based on the processing described in the communication program, instruct the communication interface 1222 to perform communication processing. Under the control of the CPU 1212, the communication interface 1222 reads transmission data stored in a transmission buffer area provided in a recording medium such as the RAM 1214, the storage device 1224, a DVD-ROM, or an IC card, and transmits the read transmission data to the network, or writes reception data received from the network to a reception buffer area or the like provided on the recording medium.

The CPU 1212 may also cause all or a necessary portion of a file or database stored in an external recording medium such as the storage device 1224, the DVD drive (DVD-ROM), or an IC card to be read into the RAM 1214, and may perform various types of processing on the data in the RAM 1214. The CPU 1212 may then write the processed data back to the external recording medium.

Various types of information, such as various types of programs, data, tables, and databases, may be stored in the recording medium and subjected to information processing. The CPU 1212 may perform various types of processing on data read from the RAM 1214, including the various types of operations, information processing, conditional judgment, conditional branching, unconditional branching, and information search/replacement described throughout the present disclosure and specified by the instruction sequence of a program, and writes the results back to the RAM 1214. The CPU 1212 may also search for information in a file, database, or the like in the recording medium. For example, when a plurality of entries, each having an attribute value of a first attribute associated with an attribute value of a second attribute, are stored in the recording medium, the CPU 1212 may search the plurality of entries for an entry whose attribute value of the first attribute matches a specified condition, read the attribute value of the second attribute stored in that entry, and thereby acquire the attribute value of the second attribute associated with the first attribute that satisfies the predetermined condition.
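The attribute search described in this paragraph amounts to filtering entries on the first attribute and reading out the associated second attribute. A minimal sketch follows, in which the function name `lookup_second_attribute` and the representation of entries as pairs are assumptions for illustration.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def lookup_second_attribute(
    entries: Iterable[Tuple[A, B]],
    condition: Callable[[A], bool],
) -> List[B]:
    """Return the second attribute of every entry whose first attribute matches."""
    return [second for first, second in entries if condition(first)]
```

For instance, with entries pairing a joint name with a target angle, passing a condition that matches one joint name returns the angles stored for that joint.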

The programs or software modules described above may be stored in a computer-readable storage medium on or near the computer 1200. A recording medium such as a hard disk or RAM provided in a server system connected to a dedicated communication network or the Internet can also be used as the computer-readable storage medium, thereby providing the program to the computer 1200 via the network.

The blocks in the flowcharts and block diagrams of the present embodiment may represent stages of a process in which operations are performed or "units" of a device responsible for performing the operations. Specific stages and "units" may be implemented by dedicated circuitry, by programmable circuitry supplied with computer-readable instructions stored on a computer-readable storage medium, and/or by a processor supplied with computer-readable instructions stored on a computer-readable storage medium. Dedicated circuitry may include digital and/or analog hardware circuits, and may include integrated circuits (ICs) and/or discrete circuits. Programmable circuitry may include reconfigurable hardware circuits, such as field-programmable gate arrays (FPGAs) and programmable logic arrays (PLAs), that include AND, OR, XOR, NAND, NOR, and other logical operations, flip-flops, registers, and memory elements.

A computer-readable storage medium may include any tangible device capable of storing instructions to be executed by a suitable device. As a result, a computer-readable storage medium having instructions stored therein constitutes an article of manufacture that includes instructions which can be executed to create means for performing the operations specified in the flowcharts or block diagrams. Examples of computer-readable storage media may include electronic storage media, magnetic storage media, optical storage media, electromagnetic storage media, and semiconductor storage media. More specific examples of computer-readable storage media may include a floppy (registered trademark) disk, a diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an electrically erasable programmable read-only memory (EEPROM), a static random access memory (SRAM), a compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a Blu-ray (registered trademark) disc, a memory stick, and an integrated circuit card.

Computer-readable instructions may include assembler instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or either source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk, JAVA (registered trademark), and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages.

Computer-readable instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or another programmable data processing device, or to programmable circuitry, either locally or via a local area network (LAN) or a wide area network (WAN) such as the Internet, so that the processor or the programmable circuitry executes the computer-readable instructions to create means for performing the operations specified in the flowcharts or block diagrams. Examples of processors include computer processors, processing units, microprocessors, digital signal processors, controllers, and microcontrollers.

Although the present invention has been described above using embodiments, the technical scope of the present invention is not limited to the scope described in the above embodiments. It will be apparent to those skilled in the art that various changes or improvements can be made to the above embodiments. It is clear from the description of the claims that forms with such changes or improvements may also be included in the technical scope of the present invention.

It should be noted that the operations, procedures, steps, stages, and other processing in the devices, systems, programs, and methods shown in the claims, the specification, and the drawings may be executed in any order, unless the order is explicitly indicated by terms such as "before" or "prior to", and unless the output of an earlier process is used in a later process. Even if the operation flows in the claims, the specification, and the drawings are described using terms such as "first" and "next" for convenience, this does not mean that they must be carried out in that order.

20 network, 100 learning execution device, 102 information storage unit, 104 input reception unit, 106 data reception unit, 108 motion setting unit, 110 learning execution unit, 112 display control unit, 200 communication terminal, 300 muscle model, 310 spinal cord, 320 interneuron, 330 motor unit, 340 motor neuron, 350 muscle fiber, 360 muscle, 372 tendon, 374 knee, 376 bone, 400 firing pattern, 402 on, 404 off, 1200 computer, 1210 host controller, 1212 CPU, 1214 RAM, 1216 graphic controller, 1218 display device, 1220 input/output controller, 1222 communication interface, 1224 storage device, 1230 ROM, 1240 input/output chip

Claims

A learning execution device comprising:
an information storage unit that stores a muscle model in which a muscle is operated by contracting muscle fibers connected to motor neurons included in motor units, according to a firing pattern of a plurality of interneurons each connected to motor units;
a motion setting unit that sets a target motion of the muscle model; and
a learning execution unit that learns a firing pattern that realizes the target motion by executing learning that rewards, among a plurality of firing patterns, a firing pattern for which the motion of the muscle model is closer to the target motion.

The learning execution device according to claim 1, wherein the learning execution unit learns the firing pattern that realizes the target motion by operating the muscle model according to each of a plurality of firing patterns, generating a plurality of firing patterns based on a firing pattern for which the motion of the muscle model is closer to the target motion, and repeating the operating of the muscle model according to each of the generated plurality of firing patterns and the generating of a plurality of firing patterns based on a firing pattern for which the motion of the muscle model is closer to the target motion.

The learning execution device according to claim 2, wherein the learning execution unit learns the firing pattern that realizes the target motion by operating the muscle model according to each of a plurality of randomly generated firing patterns, generating a plurality of firing patterns based on a firing pattern for which the motion of the muscle model is closer to the target motion, and repeating the operating of the muscle model according to each of the generated plurality of firing patterns and the generating of a plurality of firing patterns based on a firing pattern for which the motion of the muscle model is closer to the target motion.

The learning execution device according to claim 2, wherein the learning execution unit learns the firing pattern that realizes the target motion by operating the muscle model according to each of a plurality of firing patterns generated based on an already learned firing pattern, generating a plurality of firing patterns based on a firing pattern for which the motion of the muscle model is closer to the target motion, and repeating the operating of the muscle model according to each of the generated plurality of firing patterns and the generating of a plurality of firing patterns based on a firing pattern for which the motion of the muscle model is closer to the target motion.

The learning execution device according to any one of claims 1 to 4, wherein the learning execution unit grows a motor unit whose muscle fibers were contracted when the muscle model was operated based on the firing pattern.

The learning execution device according to claim 5, wherein the muscle model includes fast muscle motor units and slow muscle motor units, and
the learning execution unit grows the fast muscle motor units and the slow muscle motor units according to different criteria when the muscle model is operated based on the firing pattern.

The learning execution device according to claim 6, wherein the information storage unit stores, for each motor unit, a first parameter indicating whether the motor unit is a fast muscle or a slow muscle, a second parameter indicating contractile energy, a maximum value of the second parameter, a third parameter indicating self-recovery capacity, and a maximum value of the third parameter, and
the learning execution unit executes the learning using the first parameter, the second parameter, the maximum value of the second parameter, the third parameter, and the maximum value of the third parameter.

The learning execution device according to claim 7, wherein the learning execution unit subtracts a predetermined value from the second parameter each time the motor unit contracts, and restores the second parameter with the passage of time while the third parameter is not 0.

The learning execution device according to claim 8, wherein the information storage unit stores, when the motor unit is a fast muscle, a fourth parameter indicating the amount of energy consumed each time the motor unit contracts, and
the learning execution unit executes the learning using the first parameter, the second parameter, the maximum value of the second parameter, the third parameter, the maximum value of the third parameter, and the fourth parameter.

The learning execution device according to claim 9, wherein, when the motor unit is a fast muscle, the learning execution unit subtracts the value of the fourth parameter from the second parameter each time the motor unit contracts, and, when the motor unit is a slow muscle, the learning execution unit subtracts a value other than the value of the fourth parameter from the second parameter each time the motor unit contracts.

The learning execution device according to claim 9 or 10, wherein, when the learning execution unit determines that the muscle fiber has been damaged and subsequently determines that the muscle fiber has recovered, the learning execution unit increases the maximum value of the second parameter and the value of the fourth parameter if the motor unit is a fast muscle, and increases the maximum value of the third parameter if the motor unit is a slow muscle.

The learning execution device according to claim 11, wherein, when the learning execution unit determines that the muscle fiber has been damaged and subsequently determines that the muscle fiber has recovered, the learning execution unit does not increase the maximum value of the third parameter if the motor unit is a fast muscle.

The learning execution device according to claim 11 or 12, wherein, when the learning execution unit determines that the muscle fiber has been damaged and subsequently determines that the muscle fiber has recovered, the learning execution unit does not increase the maximum value of the second parameter if the motor unit is a slow muscle.

The learning execution device according to any one of claims 11 to 13, wherein the learning execution unit determines that the muscle fiber is damaged when the second parameter becomes 0.

The learning execution device according to any one of claims 11 to 14, wherein the information storage unit stores, for each motor unit, a fifth parameter related to the use of the motor unit, and
the learning execution unit raises the level of the motor unit as the fifth parameter increases, and, the higher the level of the motor unit, makes it more difficult to increase the maximum value of the second parameter and the value of the fourth parameter when the motor unit is a fast muscle, and makes it more difficult to increase the maximum value of the third parameter when the motor unit is a slow muscle.

The learning execution device according to any one of claims 1 to 15, wherein the learning execution unit learns the firing pattern such that, after one motor unit contracts, the one motor unit does not contract until a predetermined refractory period has elapsed.

The learning execution device according to claim 16, wherein the learning execution unit learns the firing pattern with the refractory period made shorter as the temperature of the motor unit becomes higher.

The learning execution device according to any one of claims 1 to 17, wherein, when the motion of the muscle model operated according to a plurality of firing patterns in time series achieves the target motion, the learning execution unit executes the learning by updating the firing patterns of the states going back a predetermined time from the firing pattern of the state in which the target motion was achieved.

A learning execution device comprising:
an information storage unit that stores, for each of a plurality of muscle fibers included in a muscle, a first parameter indicating whether the muscle fiber is a fast muscle or a slow muscle, a second parameter indicating contractile energy, a maximum value of the second parameter, a third parameter indicating self-recovery capacity, and a maximum value of the third parameter; and
a learning execution unit that learns a model of the muscle by executing learning using the first parameter, the second parameter, the maximum value of the second parameter, the third parameter, and the maximum value of the third parameter.

The learning execution device according to claim 19, wherein the information storage unit stores, when the muscle fiber is a fast muscle, a fourth parameter indicating the amount of energy consumed each time the muscle fiber contracts, and
the learning execution unit executes the learning using the first parameter, the second parameter, the maximum value of the second parameter, the third parameter, the maximum value of the third parameter, and the fourth parameter.

The learning execution device according to claim 20, wherein the learning execution unit learns the model of the muscle by subtracting a predetermined value from the second parameter each time the muscle fiber contracts, restoring the second parameter with the passage of time while the third parameter is not 0, and, when it determines that the muscle fiber has been damaged and subsequently determines that the muscle fiber has recovered, increasing the maximum value of the second parameter and the value of the fourth parameter if the muscle fiber is a fast muscle and increasing the maximum value of the third parameter if the muscle fiber is a slow muscle.

The learning execution device according to claim 21, wherein, when the muscle fiber is a fast muscle, the learning execution unit subtracts the value of the fourth parameter from the second parameter each time the muscle fiber contracts, and, when the muscle fiber is a slow muscle, the learning execution unit subtracts a value other than the value of the fourth parameter from the second parameter each time the muscle fiber contracts.

The learning execution device according to claim 21 or 22, wherein, when the learning execution unit determines that the muscle fiber has been damaged and subsequently determines that the muscle fiber has recovered, the learning execution unit does not increase the maximum value of the third parameter if the muscle fiber is a fast muscle.

The learning execution device according to any one of claims 21 to 23, wherein, when the learning execution unit determines that the muscle fiber has been damaged and subsequently determines that the muscle fiber has recovered, the learning execution unit does not increase the maximum value of the second parameter if the muscle fiber is a slow muscle.

A program for operating a computer as the learning execution device according to any one of claims 1 to 24.

A learning execution method executed by a computer, comprising:
a motion setting step of setting a target motion of a muscle model in which a muscle is operated by contracting muscle fibers connected to motor neurons included in motor units, according to a firing pattern of a plurality of interneurons each connected to motor units; and
a learning execution step of learning a firing pattern that realizes the target motion by executing learning that rewards, among a plurality of firing patterns, a firing pattern for which the motion of the muscle model is closer to the target motion.

A learning execution method executed by a computer, comprising:
a storage step of storing, for each of a plurality of muscle fibers included in a muscle, a first parameter indicating whether the muscle fiber is a fast muscle or a slow muscle, a second parameter indicating contractile energy, a maximum value of the second parameter, a third parameter indicating self-recovery capacity, and a maximum value of the third parameter; and
a learning execution step of learning a model of the muscle by executing learning using the first parameter, the second parameter, the maximum value of the second parameter, the third parameter, and the maximum value of the third parameter.