JP6781101B2

JP6781101B2 - Non-linear system control method, biped robot control device, biped robot control method and its program

Info

Publication number: JP6781101B2
Application number: JP2017088614A
Authority: JP
Inventors: 敏之大塚; 想太郎片山; 康之佐藤; 将弘土井
Original assignee: Kyoto University; Toyota Motor Corp
Current assignee: Kyoto University; Toyota Motor Corp
Priority date: 2017-04-27
Filing date: 2017-04-27
Publication date: 2020-11-04
Anticipated expiration: 2037-04-27
Also published as: JP2018185747A

Description

The present invention relates to a control method for a nonlinear system, a control device for a bipedal walking robot, a control method for a bipedal walking robot, and a program therefor.

When controlling a system with low stability, such as a bipedal walking robot, it is effective to use model predictive control (receding horizon control), which performs control while predicting the future behavior of the system up to a finite time ahead. Model predictive control is a feedback control that, at every control cycle (sampling period), solves an optimal control problem from the current time to a finite time in the future and determines the control input value.
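The receding-horizon idea described above can be illustrated with a minimal sketch. The scalar plant, the cost weights, the candidate input set, and all function names below are hypothetical and chosen only for illustration; the exhaustive search stands in for whatever optimizer a real controller would use.

```python
import itertools

def step(x, u, dt):
    """One Euler step of a hypothetical scalar plant dx/dt = x - x**3 + u."""
    return x + dt * (x - x**3 + u)

def horizon_cost(x0, inputs, dt):
    """Stage costs summed over the prediction horizon, plus a terminal cost."""
    x, cost = x0, 0.0
    for u in inputs:
        cost += dt * (x**2 + 0.1 * u**2)
        x = step(x, u, dt)
    return cost + x**2

def mpc_input(x0, dt, horizon=4, candidates=(-1.0, -0.5, 0.0, 0.5, 1.0)):
    """Solve the finite-horizon problem by exhaustive search over the
    candidate inputs and return only the first input of the best sequence."""
    best = min(itertools.product(candidates, repeat=horizon),
               key=lambda seq: horizon_cost(x0, seq, dt))
    return best[0]

def run(x0=0.8, dt=0.1, cycles=50):
    """Closed loop: re-solve the horizon problem at every control cycle."""
    x = x0
    for _ in range(cycles):
        u = mpc_input(x, dt)   # optimize over the horizon from the current state
        x = step(x, u, dt)     # apply only the first input, then repeat
    return x
```

Only the first element of each optimized sequence is applied, and the problem is solved again from the newly measured state, which is what makes the scheme a feedback control.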

Because a bipedal walking robot is a highly nonlinear system that involves walking motions and the like, its feedback control is preferably performed by nonlinear model predictive control. Nonlinear model predictive control, however, generally requires a large amount of computation time. It has therefore been difficult to determine the optimal solution of the control input value in real time using nonlinear model predictive control.

In relation to this technique, Non-Patent Document 1 discloses a technique called the C/GMRES method (continuation/generalized minimum residual method), which makes it possible to perform nonlinear model predictive control in real time. The C/GMRES method is an algorithm that combines the continuation method with the GMRES method. For a system whose state changes continuously, it exploits the continuity of the optimal solution and tracks the optimal solution by computing its rate of change. Using the C/GMRES method makes it possible to control a system in real time even with nonlinear model predictive control.
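The idea of tracking a solution through its rate of change can be sketched on a scalar problem. This is not the C/GMRES algorithm itself: the condition `F`, the stabilizing gain `zeta`, and the step size are assumptions for illustration, the derivatives are written analytically, and a scalar division replaces the GMRES linear solve that the full method uses.

```python
import math

def F(u, t):
    """Hypothetical optimality condition whose root u*(t) moves continuously with t."""
    return u**3 + u - math.sin(t)

def track(u0, t_end, dt=0.01, zeta=10.0):
    """Follow the root of F(u, t) = 0 by integrating its rate of change."""
    u, t = u0, 0.0
    for _ in range(round(t_end / dt)):
        F_u = 3.0 * u**2 + 1.0   # partial derivative of F with respect to u
        F_t = -math.cos(t)       # partial derivative of F with respect to t
        # Impose dF/dt = -zeta * F along the trajectory, i.e.
        # F_u * u_dot + F_t = -zeta * F, and solve for u_dot.
        u_dot = (-zeta * F(u, t) - F_t) / F_u
        u += dt * u_dot          # explicit Euler update of the tracked solution
        t += dt
    return u, t
```

Because no iteration to convergence is performed at each step, the cost per control cycle is fixed, which is the property that makes this style of method attractive for real-time use. The approach relies on the solution varying continuously with time.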

Toshiyuki OHTSUKA and Hironori A. FUJII, "Real-Time Receding-Horizon Control Algorithm for Nonlinear Systems", Transactions of the Society of Instrument and Control Engineers, December 1997, Vol. 33, No. 12, pp. 1131-1139

When a nonlinear system, that is, a system having nonlinearity, moves while making physical contact with its surrounding environment, the contact may cause its state to change discontinuously. The technique of Non-Patent Document 1, on the other hand, assumes that the state of the system changes continuously. It is therefore difficult to use the technique of Non-Patent Document 1 to control a nonlinear system that may make physical contact. Consequently, it has been difficult to control, in real time, a nonlinear system with discontinuous state changes using nonlinear model predictive control.

The present invention provides a control method for a nonlinear system, a control device for a bipedal walking robot, a control method for a bipedal walking robot, and a program therefor, each capable of controlling a nonlinear system with discontinuous state changes in real time.

A control method for a nonlinear system according to the present invention includes: an acquisition step of acquiring a state parameter indicating the state of the nonlinear system; a calculation step of calculating, based on the acquired state parameter and using a model predictive control algorithm, a control input value for controlling the nonlinear system; and a control step of controlling the nonlinear system using the calculated control input value. In the calculation step, a constraint parameter that constrains the state of the nonlinear system so that the state changes discontinuously at a specified timing is used to calculate, for each control cycle of the nonlinear system, the rate of change of the optimal solution of the control input value over a predetermined evaluation interval of the model predictive control algorithm; the rate of change is used to calculate the optimal solution of the control input value for the control cycle following that control cycle; and the current control input value is calculated from the optimal solution.

As described above, by using a constraint parameter that constrains the state of the nonlinear system so that the state changes discontinuously at a specified timing, the present invention can cause a discontinuous state change to occur at the expected timing. The theory of nonlinear model predictive control can therefore be applied easily to the control of the nonlinear system, and the C/GMRES method can also be applied. The present invention thus makes it possible to control a nonlinear system with discontinuous state changes in real time.

A control device for a bipedal walking robot according to the present invention controls the operation of a bipedal walking robot capable of walking on two legs, and includes: state acquisition means for acquiring a state parameter indicating a state related to the walking of the bipedal walking robot; calculation means for calculating, based on the acquired state parameter and using a model predictive control algorithm, a control input value for controlling the operation of the bipedal walking robot; and control means for controlling the operation of the bipedal walking robot using the calculated control input value. The calculation means uses a constraint parameter that constrains the state of the bipedal walking robot so that the swing leg of the two legs lands at a specified timing to calculate, for each control cycle of the bipedal walking robot, the rate of change of the optimal solution of the control input value over a predetermined evaluation interval of the model predictive control algorithm; uses the rate of change to calculate the optimal solution of the control input value for the control cycle following that control cycle; and calculates the current control input value from the optimal solution.

A control method for a bipedal walking robot according to the present invention controls the operation of a bipedal walking robot capable of walking on two legs, and includes: an acquisition step of acquiring a state parameter indicating a state related to the walking of the bipedal walking robot; a calculation step of calculating, based on the acquired state parameter and using a model predictive control algorithm, a control input value for controlling the operation of the bipedal walking robot; and a control step of controlling the operation of the bipedal walking robot using the calculated control input value. In the calculation step, a constraint parameter that constrains the state of the bipedal walking robot so that the swing leg of the two legs lands at a specified timing is used to calculate, for each control cycle of the bipedal walking robot, the rate of change of the optimal solution of the control input value over a predetermined evaluation interval of the model predictive control algorithm; the rate of change is used to calculate the optimal solution of the control input value for the control cycle following that control cycle; and the current control input value is calculated from the optimal solution.

A program according to the present invention implements a control method for a bipedal walking robot that controls the operation of a bipedal walking robot capable of walking on two legs. The program causes a computer to execute: an acquisition step of acquiring a state parameter indicating a state related to the walking of the bipedal walking robot; a calculation step of calculating, based on the acquired state parameter and using a model predictive control algorithm, a control input value for controlling the operation of the bipedal walking robot; and a control step of controlling the operation of the bipedal walking robot using the calculated control input value. In the calculation step, a constraint parameter that constrains the state of the bipedal walking robot so that the swing leg of the two legs lands at a specified timing is used to calculate, for each control cycle of the bipedal walking robot, the rate of change of the optimal solution of the control input value over a predetermined evaluation interval of the model predictive control algorithm; the rate of change is used to calculate the optimal solution of the control input value for the control cycle following that control cycle; and the current control input value is calculated from the optimal solution.

As described above, by using a constraint parameter that constrains the state of the bipedal walking robot so that the swing leg lands at a specified timing, the present invention can cause a discontinuous state change, such as the landing of the swing leg, to occur at the expected timing. The theory of nonlinear model predictive control can therefore be applied easily to the control of the bipedal walking robot, and the C/GMRES method can also be applied. The present invention thus makes it possible to control, in real time, the operation of a bipedal walking robot with discontinuous state changes.

Preferably, the constraint parameter is included in the evaluation function used in the model predictive control algorithm. This makes it possible to handle the optimization problem in the same way as in nonlinear model predictive control without discontinuous state changes.

Preferably, the constraint parameter specifies the posture of the bipedal walking robot when the swing leg lands at the timing. This makes it possible to control the bipedal walking robot so that the swing leg lands in the expected posture.

Preferably, the constraint parameter specifies target angles of the joints of the two legs when the swing leg lands at the timing. This makes it possible to control the joints so that the swing leg lands at the expected joint angles.

Preferably, the constraint parameter includes an adjustable gain. This makes it possible to stabilize the control regardless of the performance of the control device.

According to the present invention, it is possible to provide a control method for a nonlinear system, a control device for a bipedal walking robot, a control method for a bipedal walking robot, and a program therefor, each capable of controlling a nonlinear system with discontinuous state changes in real time.

A schematic diagram showing the robot system according to Embodiment 1.
A functional block diagram showing the configuration of the robot system according to Embodiment 1.
A diagram for explaining nonlinear model predictive control.
A diagram for explaining the update of the input series.
A diagram for explaining a method of applying the robot according to Embodiment 1 to a compass-type model.
A diagram showing an example in which the robot according to Embodiment 1 is applied to a compass-type model.
A diagram for explaining a state jump.
A diagram showing the state of the robot immediately before the collision of the swing leg link.
A diagram showing the state of the robot immediately after the collision of the swing leg link.
A flowchart showing the robot control method performed by the control device according to Embodiment 1.
A diagram showing the robot according to Embodiment 2.
A diagram showing a state in which the robot according to Embodiment 2 is applied to a knee-flexion model.
A diagram showing an example in which the robot according to Embodiment 2 is applied to a knee-flexion model.
A diagram showing a simulation result of applying the nonlinear model predictive control algorithm to the nonlinear system according to the present embodiment.
A diagram showing a simulation result of applying the nonlinear model predictive control algorithm to the nonlinear system according to the present embodiment.
A diagram showing a simulation result of applying the nonlinear model predictive control algorithm to the nonlinear system according to the present embodiment.
A diagram showing a simulation result of applying the nonlinear model predictive control algorithm to the nonlinear system according to the present embodiment.
A diagram showing a simulation result of applying the nonlinear model predictive control algorithm to the nonlinear system according to the present embodiment.
A diagram showing a simulation result of applying the nonlinear model predictive control algorithm to the nonlinear system according to the present embodiment.
A diagram showing a graph of the steady-state control input in the simulation results.

(Embodiment 1)
Hereinafter, embodiments of the present invention will be described with reference to the drawings. In the drawings, the same elements are denoted by the same reference numerals, and redundant description is omitted as necessary.

FIG. 1 is a schematic diagram showing a robot system 1 according to Embodiment 1. FIG. 2 is a functional block diagram showing the configuration of the robot system 1 according to Embodiment 1. The robot system 1 includes a robot 100 and a control device 2 that controls the operation of the robot.

The robot 100 has a body 102 and two legs, a right leg 110R and a left leg 110L. The robot 100 is a bipedal walking robot capable of walking on its two legs (the right leg 110R and the left leg 110L). The right leg 110R and the left leg 110L are provided at the lower part of the body 102 of the robot 100. As shown in FIG. 1, the forward direction of the robot 100 is the X-axis direction and the upward direction is the Y-axis direction. In the following, "R" is appended to the reference numerals of components of the right leg 110R and "L" to those of the left leg 110L; where left and right need not be distinguished, "R" and "L" may be omitted as appropriate.

The right leg 110R has, in order from the side closest to the body 102, a hip joint portion 120R, an upper leg portion 112R, a knee joint portion 122R, a lower leg portion 114R, an ankle joint portion 124R, and a foot portion 116R. Similarly, the left leg 110L has, in order from the side closest to the body 102, a hip joint portion 120L, an upper leg portion 112L, a knee joint portion 122L, a lower leg portion 114L, an ankle joint portion 124L, and a foot portion 116L. A sole sensor 118 is provided on the bottom of each of the foot portion 116R and the foot portion 116L. The sole sensor 118 detects the load applied to the bottom of the foot portion 116.

The hip joint portion 120R and the hip joint portion 120L are attached to the lower part of the body 102. The upper leg portion 112R and the upper leg portion 112L are connected to the body 102 via the hip joint portion 120R and the hip joint portion 120L, respectively. In other words, the right leg 110R and the left leg 110L are connected to the body 102 via the hip joint portion 120R and the hip joint portion 120L, respectively.

The upper leg portion 112R and the lower leg portion 114R are connected via the knee joint portion 122R. Similarly, the upper leg portion 112L and the lower leg portion 114L are connected via the knee joint portion 122L. The lower leg portion 114R and the foot portion 116R are connected via the ankle joint portion 124R. Similarly, the lower leg portion 114L and the foot portion 116L are connected via the ankle joint portion 124L.

The hip joint portion 120 rotates about an axis perpendicular to the XY plane (that is, a horizontal axis in the lateral direction of the robot 100). This allows the right leg 110R and the left leg 110L to move back and forth. The robot 100 can therefore walk by alternately moving the right leg 110R and the left leg 110L forward.

The knee joint portion 122 rotates about an axis perpendicular to the XY plane. This allows the right leg 110R and the left leg 110L to bend at the knee joint portion 122. The ankle joint portion 124 also rotates about an axis perpendicular to the XY plane. This allows the foot portion 116 to move up and down relative to the lower leg portion 114.

As shown in FIG. 2, each joint portion of the robot 100 (the hip joint portion 120, the knee joint portion 122, and the ankle joint portion 124) has an angle sensor 130 and a motor 140. The angle sensor 130 is, for example, an encoder, and detects the joint angle of the joint portion. The motor 140 functions as an actuator that moves the joint portion. Each joint portion may also have a torque sensor 136 that detects the torque of its motor 140. A camera for detecting the surroundings of the robot 100 may be built into the body 102.

The control device 2 functions, for example, as a computer. The control device 2 may be mounted inside the robot 100 (for example, in the body 102). Alternatively, the control device 2 may be physically separate from the robot 100, in which case it may be communicably connected to the robot 100 by wire or wirelessly. The control device 2 controls the operation of the robot 100, in particular the operation of the right leg 110R and the left leg 110L. More specifically, the control device 2 controls the postures of the right leg 110R and the left leg 110L by controlling the torque of the motor of each joint portion. That is, in the robot system 1, the control device 2 functions as a master device and the robot 100 functions as a slave device.

As its main hardware configuration, the control device 2 has a CPU (Central Processing Unit) 4, a ROM (Read Only Memory) 6, and a RAM (Random Access Memory) 8. The CPU 4 functions as an arithmetic unit that performs control processing, arithmetic processing, and the like. The ROM 6 stores the control programs, arithmetic programs, and the like executed by the CPU 4. The RAM 8 temporarily stores processing data and the like.

The control device 2 also has a state acquisition unit 12, a nonlinear model predictive control unit 14, and a servo control unit 16 (hereinafter referred to as "the components"). Each component can be realized, for example, by the CPU 4 executing a program stored in the ROM 6. Each component may also be realized by recording the necessary program on any non-volatile recording medium and installing it as needed. The components are not limited to being realized by software as described above, and may be realized by hardware such as circuit elements.

The state acquisition unit 12 functions as state acquisition means that acquires data (state parameters) indicating the current walking-related state of the robot 100. The state acquisition unit 12 acquires the detected values from the sensors (the angle sensor 130, the sole sensor 118, and the torque sensor 136). The state acquisition unit 12 then outputs the acquired detected values (and values obtained from the detected values) to the nonlinear model predictive control unit 14. A "value obtained from a detected value" may be, for example, the joint angular velocity (the amount of change, or time derivative, of the joint angle) when the detected value is a joint angle detected by the angle sensor 130. In this case, the state parameters may indicate the joint angles and the joint angular velocities.
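As a minimal sketch of how a value might be derived from detected values, the joint angular velocity can be approximated by a backward difference of two successive encoder readings. The function names, the ordering of the state vector, and the use of a plain finite difference (with no filtering) are assumptions for illustration and are not specified in the source.

```python
def joint_velocities(prev_angles, curr_angles, dt):
    """Backward-difference estimate of joint angular velocity [rad/s]."""
    if dt <= 0.0:
        raise ValueError("dt must be positive")
    return [(c - p) / dt for p, c in zip(prev_angles, curr_angles)]

def state_parameters(prev_angles, curr_angles, dt):
    """Hypothetical state vector: joint angles followed by joint angular velocities."""
    return list(curr_angles) + joint_velocities(prev_angles, curr_angles, dt)
```

For example, `state_parameters([0.0, 0.1], [0.01, 0.1], 0.01)` stacks the two current angles and their estimated velocities into one list of four values. In practice a raw difference amplifies encoder noise, so some smoothing would usually be added.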

The nonlinear model predictive control unit 14 functions as calculation means that calculates a control input value (input value) for controlling the operation of the robot 100. The nonlinear model predictive control unit 14 takes at least part of the detected values (and the values obtained from the detected values) from the state acquisition unit 12 as state parameters and, based on those state parameters, calculates the control input value for controlling the operation of the robot 100 using a model predictive control algorithm. The nonlinear model predictive control unit 14 outputs the calculated control input value to the servo control unit 16. Details will be described later. The nonlinear model predictive control unit 14 may also receive necessary command values (stride, walking period, and so on) from a host controller (not shown) external to the robot system 1.

The servo control unit 16 functions as control means that controls the operation of the robot 100 using the control input value calculated by the nonlinear model predictive control unit 14. The servo control unit 16 controls each joint portion of the robot 100 so that the calculated control input value is achieved. The servo control unit 16 may also have the function of a servo amplifier. When performing torque control, the servo control unit 16 controls the motor 140 of each joint so that the torque of each joint portion (joint torque) equals the calculated control input value. In doing so, the servo control unit 16 may perform feedback control using the torque value detected by the torque sensor 136 of each joint portion.

(Model predictive control)
Here, an outline of the model predictive control (nonlinear model predictive control) technique performed by the nonlinear model predictive control unit 14 according to the present embodiment will be described. Nonlinear model predictive control is a control in which, for a nonlinear system, the optimal input (the optimal solution of the control input value) up to a finite time in the future is obtained at each sampling time, and the initial value of the obtained input is used as the actual input. Nonlinear model predictive control has three advantages: it is a nonlinear optimal control, it is a feedback control, and constraint conditions are easy to incorporate.

Because nonlinear model predictive control is a feedback control, it is robust against disturbances, and a wide variety of constraint conditions can be combined. Because of these characteristics, nonlinear model predictive control is expected to be introduced into many systems. With conventional iterative methods such as Newton's method, however, it has been difficult to converge to the optimal solution within the sampling period.

In recent years, the C/GMRES method was devised as an effective numerical method for this problem. Using the C/GMRES method, it has become possible to solve the optimal control problem up to a finite time in the future within the sampling period. As described below, however, there are currently cases in which nonlinear model predictive control based on the C/GMRES method cannot be applied.

すなわち、ロボット１００等の非線形システムの多くは、状態が不連続に変化する事象（「状態ジャンプ」と称する）を伴い得る。この状態ジャンプを含む問題に対し、現状のＣ／ＧＭＲＥＳ法による非線形モデル予測制御を適用することは困難であった。状態ジャンプを伴うシステムを直接最適化しようとすると、状態ジャンプの時刻及び入力を同時に最適化する必要がある。また、状態ジャンプにかかるラグランジュ乗数がさらに追加されることになる。 That is, many non-linear systems such as the robot 100 may be accompanied by an event (referred to as "state jump") in which the state changes discontinuously. It has been difficult to apply the current nonlinear model predictive control by the C / GMRES method to the problem including the state jump. When trying to optimize a system with state jumps directly, it is necessary to optimize the time and input of state jumps at the same time. In addition, the Lagrange multiplier for the state jump will be added.

また、これらの問題をＣ／ＧＭＲＥＳ法を用いて数値的に解くには、Ｃ／ＧＭＲＥＳ法のアルゴリズムを大きく変える必要がある。そこで、本願の発明者らは、ペナルティ関数（ペナルティ項）を非線形モデル予測制御の評価関数に加えるという手法を見出した。ペナルティ関数を評価関数に加えることで、状態ジャンプが発生する時刻（本実施の形態においては遊脚が着地する時刻）を指定できる。また、ラグランジュ乗数を増やさずにＣ／ＧＭＲＥＳ法を適用することができる。 Further, in order to solve these problems numerically by using the C / GMRES method, it is necessary to significantly change the algorithm of the C / GMRES method. Therefore, the inventors of the present application have found a method of adding a penalty function (penalty term) to the evaluation function of nonlinear model predictive control. By adding the penalty function to the evaluation function, the time when the state jump occurs (in the present embodiment, the time when the swing leg lands) can be specified. In addition, the C / GMRES method can be applied without increasing the Lagrange multiplier.

まず、非線形モデル予測制御の概要について説明する。制御対象として、以下の式（１）で示すような状態方程式で表される非線形システムを考える。
First, the outline of the nonlinear model predictive control will be described. As a control target, consider a nonlinear system represented by the equation of state as shown by the following equation (1).

ただし，ｘ（ｔ）は状態ベクトルであり、ｕ（ｔ）は制御入力ベクトル（制御入力値を示すベクトル）である。また、Ｒ^ｎ及びＲ^ｍは、それぞれ、ｎ次元実数ベクトル全体の集合及びｍ次元実数ベクトル全体の集合を示す。非線形モデル予測制御とは、式（１）で表されるシステムに対し、各時刻ｔにおいて、以下の式（２）で表される評価関数を最小にする入力ｕ_ｏｐｔ（ｔ＋τ）を求め、その初期値ｕ_ｏｐｔ（ｔ）を時刻ｔにおける実際の制御入力値ｕ（ｔ）とする制御である。
Here, x(t) is the state vector and u(t) is the control input vector (a vector indicating the control input values). R^n and R^m denote the set of all n-dimensional real vectors and the set of all m-dimensional real vectors, respectively. Nonlinear model predictive control is a control that, for the system represented by equation (1), obtains at each time t the input u_opt(t + τ) that minimizes the evaluation function represented by the following equation (2), and uses its initial value u_opt(t) as the actual control input value u(t) at time t.

ここで、Ｔは時刻ｔにおける評価区間の長さである。関数φ（ｘ）は終端コストと呼ばれるスカラー値関数である。関数Ｌ（ｘ，ｕ）はステージコストと呼ばれるスカラー値関数である。τは評価区間における時間パラメータであって、０≦τ≦Ｔである。なお、Ｔは、通常、正のスカラー値Ｔ_ｆ及びα（α＞０）を用いて以下の式（３）のように与えられる。
Here, T is the length of the evaluation interval at time t. The function φ(x) is a scalar-valued function called the terminal cost. The function L(x, u) is a scalar-valued function called the stage cost. τ is a time parameter on the evaluation interval, with 0 ≤ τ ≤ T. T is usually given by the following equation (3) using positive scalar values T_f and α (α > 0).
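The equation images for (1) to (3) are not reproduced in this text. A reconstruction consistent with the definitions above is sketched below. The function name f, the integration variable s, and in particular the exponential form of (3) are assumptions (the latter is the horizon schedule commonly used with the C/GMRES method), not the patent's confirmed notation.

```latex
\begin{align}
\dot{x}(t) &= f\bigl(x(t), u(t)\bigr), \qquad x(t) \in \mathbb{R}^n,\; u(t) \in \mathbb{R}^m \tag{1} \\
J &= \varphi\bigl(x(t+T)\bigr) + \int_{t}^{t+T} L\bigl(x(s), u(s)\bigr)\, ds \tag{2} \\
T &= T_f \bigl(1 - e^{-\alpha t}\bigr) \tag{3}
\end{align}
```

Under this assumed form of (3), the evaluation interval has length zero at t = 0 and grows smoothly toward T_f, which fits the statement that T is determined by T_f and α.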

このように、非線形モデル予測制御は、各時刻ｔで状態に基づいた最適入力を求めているため、フィードバック制御となっている。
また、以下の式（４）で表されるベクトル関数を拘束条件として与えることもできる。
なお、拘束条件については、等式拘束条件だけでなく、不等式拘束条件を組み込むこともできる。 As described above, the nonlinear model prediction control is feedback control because the optimum input based on the state is obtained at each time t.
Further, a vector function represented by the following equation (4) can be given as a constraint condition.
As for the constraint condition, not only the equality constraint condition but also the inequality constraint condition can be incorporated.

図３は、非線形モデル予測制御を説明するための図である。図３に示すように、非線形モデル予測制御では、現在時刻ｔ_０においてＴ秒後までの期間のモデル挙動を予測して最適化計算を行って入力ｕ_ｏｐｔ（ｔ_０＋τ）を求める。そして、その初期値ｕ_ｏｐｔ（ｔ_０）を現在時刻ｔ_０における実際の制御入力値ｕ（ｔ_０）とする。 FIG. 3 is a diagram for explaining nonlinear model predictive control. As shown in FIG. 3, in nonlinear model predictive control, the model behavior over the period up to T seconds ahead is predicted at the current time t_0, and an optimization calculation is performed to obtain the input u_opt(t_0 + τ). Its initial value u_opt(t_0) is then used as the actual control input value u(t_0) at the current time t_0.

同様に、制御周期であるサンプリング周期Δｔ秒後に、その時点での現在時刻ｔ_１においてＴ秒後までの期間のモデル挙動を予測して最適化計算を行って入力ｕ_ｏｐｔ（ｔ_１＋τ）を求める。そして、その初期値ｕ_ｏｐｔ（ｔ_１）を現在時刻ｔ_１における実際の制御入力値ｕ（ｔ_１）とする。以下同様に、サンプリング周期ごとに、制御入力値ｕ（ｔ）が算出されることとなる。 Similarly, after the sampling period of Δt seconds, which is the control period, the model behavior over the period up to T seconds ahead is predicted at the then-current time t_1, and an optimization calculation is performed to obtain the input u_opt(t_1 + τ). Its initial value u_opt(t_1) is then used as the actual control input value u(t_1) at the current time t_1. In the same manner, the control input value u(t) is calculated for each sampling period.
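The receding-horizon procedure described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy model dx/dt = sin(x) + u, the cost weights, and the crude grid search over constant inputs, which stands in for the optimization and is not the C/GMRES method.

```python
import math

def simulate_cost(x, u, n_steps, dtau):
    """Predict the toy model over the horizon with a constant input u and return the cost."""
    cost = 0.0
    for _ in range(n_steps):
        cost += (x * x + 0.1 * u * u) * dtau   # stage cost L(x, u)
        x += (math.sin(x) + u) * dtau          # Euler step of the toy model dx/dt = sin(x) + u
    return cost + 5.0 * x * x                  # terminal cost phi(x)

def mpc_input(x, n_steps=10, dtau=0.05):
    """Return the best candidate input (crude grid search, not C/GMRES)."""
    candidates = [0.1 * k for k in range(-30, 31)]
    return min(candidates, key=lambda u: simulate_cost(x, u, n_steps, dtau))

def run_closed_loop(x0, n_samples=100, dt=0.05):
    x = x0
    for _ in range(n_samples):
        u = mpc_input(x)                 # solve the finite-horizon problem at the current time
        x += (math.sin(x) + u) * dt      # apply only the initial value of the input, then repeat
    return x

x_final = run_closed_loop(1.0)
```

The point of the sketch is the loop structure: the horizon problem is solved again at every sampling period from the measured state, which is what makes the scheme a feedback control.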

非線形モデル予測制御で各時刻ｔにおいて解くべき問題は、評価区間上の時刻τについて以下の式（５）〜（８）に示すような最適制御問題である。
ただし、時刻ｔにおける評価区間上の時刻τ（０≦τ≦Ｔ）の状態変数ベクトルと制御入力ベクトルとを、それぞれ、ｘ^＊（τ；ｔ）＝ｘ^＊（ｔ＋τ）、ｕ^＊（τ；ｔ）＝ｕ^＊（ｔ＋τ）とした。また、添字＊は、評価区間上の値であることを示す。 The problem to be solved at each time t in the nonlinear model predictive control is the optimal control problem as shown in the following equations (5) to (8) for the time τ on the evaluation interval.
Here, the state variable vector and the control input vector at time τ (0 ≤ τ ≤ T) on the evaluation interval at time t are written as x*(τ; t) = x*(t + τ) and u*(τ; t) = u*(t + τ), respectively. The superscript * indicates that the value is on the evaluation interval.

式（７）のＪを汎関数とみなして変分法を用いて停留条件を求めると、最適制御の必要条件（オイラー・ラグランジュ方程式）が、以下の式（９）〜（１４）のように得られる。
When J in equation (7) is regarded as a functional and the stationarity conditions are obtained by the calculus of variations, the necessary conditions for optimal control (Euler-Lagrange equations) are obtained as the following equations (9) to (14).

ここで、
は、ハミルトン関数と呼ばれるスカラー値関数であり、λ^＊（τ；ｔ）は式（９）に対する随伴変数、μ^＊（τ；ｔ）は式（１４）に対するラグランジュ乗数である。 here,
is a scalar-valued function called the Hamiltonian, where λ*(τ; t) is the adjoint variable for equation (9) and μ*(τ; t) is the Lagrange multiplier for equation (14).

一方、実際の数値計算は、すべて離散化して行われる。したがって、式（９）〜（１４）は、すべて離散近似して扱わなければならない。そこで、評価区間（０≦τ≦Ｔ）をＮステップに離散近似することを考える。その際の評価区間の時間刻みを、以下の式（１６）で示すようにする。
On the other hand, all actual numerical calculations are performed discretely. Therefore, equations (9) to (14) must all be treated as discrete approximations. Therefore, consider discretely approximating the evaluation interval (0 ≦ τ ≦ T) to N steps. The time step of the evaluation section at that time is shown by the following equation (16).

その上で、評価区間上のｉ番目（１≦ｉ≦Ｎ）のステップ、つまり時刻ｔ＋ｉΔτにおける状態を、
と表す。ｕ^＊ _ｉ（ｔ）、λ^＊ _ｉ（ｔ）及びμ^＊ _ｉ（ｔ）についても同様に表される。なお、Ｔ及びＮは、予め定められた値である。したがって、Δτも、予め定められた値である。 Then, the i-th (1 ≦ i ≦ N) step on the evaluation interval, that is, the state at the time t + iΔτ, is set.
It is expressed as. The same applies to u ^* _i (t), λ ^* _i (t) and μ ^* _i (t). In addition, T and N are predetermined values. Therefore, Δτ is also a predetermined value.

この条件の下で式（９）〜（１４）を離散近似すると、以下の式（１７）〜（２２）で示すような、離散近似されたオイラー・ラグランジュ方程式が得られる。
Discrete approximation of equations (9) to (14) under these conditions gives Euler-Lagrange equations discretely approximated as shown in equations (17) to (22) below.

但し、離散近似されたハミルトン関数は、以下の式（２３）で定義される。
However, the discretely approximated Hamiltonian function is defined by the following equation (23).

以上より，非線形モデル予測制御における最適入力を求める問題というのは、上記の式（１７）〜（２２）を解いて、ｉ＝０からｉ＝Ｎについて、ｘ^＊ _ｉ（ｔ）、ｕ^＊ _ｉ（ｔ）、λ^＊ _ｉ（ｔ）及びμ^＊ _ｉ（ｔ）を求めるという問題に帰着される。
ここで、ｘ^＊ _ｉ（ｔ）は、上記式（１７），（１８）より陽に求められる。また、λ^＊ _ｉ（ｔ）は、求められたｘ^＊ _ｉ（ｔ）と上記式（１９），（２０）とから陽に求められる。したがって、ｘ^＊ _ｉ（ｔ）、ｕ^＊ _ｉ（ｔ）、λ^＊ _ｉ（ｔ）及びμ^＊ _ｉ（ｔ）のうちの本質的な未知量は、以下の式（２４）で表されるベクトルＵ（ｔ）で定義される。
なお、「：＝」は、定義を意味する等号である。つまり、上記式（２４）において、左辺Ｕ（ｔ）は、右辺のベクトルで定義される。 From the above, the problem of finding the optimal input in nonlinear model predictive control reduces to solving equations (17) to (22) above to obtain x*_i(t), u*_i(t), λ*_i(t) and μ*_i(t) for i = 0 to i = N.
Here, x*_i(t) is obtained explicitly from equations (17) and (18) above. Further, λ*_i(t) is obtained explicitly from the obtained x*_i(t) and equations (19) and (20) above. Therefore, the essential unknowns among x*_i(t), u*_i(t), λ*_i(t) and μ*_i(t) are collected in the vector U(t) defined by the following equation (24).
Note that ":=" is an equals sign denoting a definition. That is, in equation (24) above, the left side U(t) is defined by the vector on the right side.

そして、このＵ（ｔ）は、以下の式（２５）で示される方程式を解くことによって得られる。
Then, this U (t) is obtained by solving the equation represented by the following equation (25).

次に、Ｃ／ＧＭＲＥＳ法について説明する。ニュートン法などの反復法では、式（２５）を各サンプリング時間内で解くのは難しい。一方、Ｕ（ｔ）が時刻ｔに関して連続であれば、Ｕ（ｔ）は、サンプリング周期Δｔごとに、以下の式（２６）で示されるようにして更新され得る。
Next, the C / GMRES method will be described. In iterative methods such as Newton's method, it is difficult to solve equation (25) within each sampling time. On the other hand, if U (t) is continuous with respect to time t, U (t) can be updated for each sampling period Δt as shown by the following equation (26).

したがって、Ｕ（ｔ）を求めるためには、ｔ＝０ではＵ（０）を求め、ｔ＞０では、Ｕ（ｔ）の時間微分、つまりＵ（ｔ）の変化量である
を求めればよい。なお、式（２６）の計算は、上記のＵ（ｔ）の変化量を数値積分することに対応する。 Therefore, in order to obtain U (t), when t = 0, U (0) is obtained, and when t> 0, the time derivative of U (t), that is, the amount of change in U (t).
It suffices to obtain this quantity. The calculation of equation (26) corresponds to numerically integrating the above amount of change in U(t).

ここで、
を求めるために、式（２５）が全てのｔで成り立つことを考慮して、上記式（２５）と等価である、以下の式（２７）で表される方程式を扱うことを考える。 here,
In order to obtain this quantity, note that equation (25) holds for all t, and consider handling the equation represented by the following equation (27), which is equivalent to equation (25) above.

さらに、式（２７）は、以下の式（２８）で示すように書き換えられ得る。
但し、ζは正の実数で、安定化パラメータと呼ばれる。 Further, the equation (27) can be rewritten as shown by the following equation (28).
However, ζ is a positive real number and is called a stabilization parameter.

式（２８）の全微分を実行して整理すると、Ｕの変化量
は、次の式（２９）で表される連立方程式を解くことで得られる。 When the total derivative of Eq. (28) is executed and rearranged, the amount of change in U
is obtained by solving the system of linear equations represented by the following equation (29).

したがって、ｔ＞０のとき、各時刻で解くべき問題は、上記式（２９）で示される連立方程式のみとなる。さらに、式（２９）で示される連立方程式の数値解法として、連立方程式から少ない反復回数で高精度な解を得ることが可能なＧＭＲＥＳ法を用いることができる。 Therefore, when t> 0, the only problem to be solved at each time is the simultaneous equations represented by the above equation (29). Further, as a numerical solution method of the simultaneous equations represented by the equation (29), the GMRES method capable of obtaining a highly accurate solution from the simultaneous equations with a small number of iterations can be used.

上述した手法が、非線形モデル予測制御の実時間最適化アルゴリズムである。このＵ（ｔ）の連続性を利用した変形法（連続変形法；Continuation method）とＧＭＲＥＳ法を組み合わせたアルゴリズムを、Ｃ／ＧＭＲＥＳ法と称する。 The method described above is a real-time optimization algorithm for nonlinear model predictive control. An algorithm that combines the transformation method (Continuation method) utilizing the continuity of U (t) and the GMRES method is called a C / GMRES method.
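The continuation idea behind equations (26) to (29) can be illustrated with a scalar Python sketch. The residual F(U, x) = U^3 + U - x, the drift dx/dt = 1 and all parameter values are hypothetical. Because U is a scalar here, the linear equation for the rate of change of U is solved by a single division, whereas the C/GMRES method would apply GMRES to the vector-valued case.

```python
def residual(U, x):
    """Hypothetical scalar optimality residual F(U, x); the root U depends on the state x."""
    return U ** 3 + U - x

def continuation_step(U, x, x_dot, dt, zeta, h=1e-6):
    """One update U(t + dt) = U(t) + U_dot * dt, with U_dot chosen so that dF/dt = -zeta * F."""
    F_now = residual(U, x)
    F_shift = residual(U, x + h * x_dot)              # state advanced by h, U held fixed
    rhs = -zeta * F_now - (F_shift - F_now) / h       # -zeta*F - F_x * x_dot (forward difference)
    dF_dU = (residual(U + h, x + h * x_dot) - F_shift) / h   # forward-difference estimate of F_U
    U_dot = rhs / dF_dU                               # scalar stand-in for the GMRES solve
    return U + U_dot * dt

dt, zeta = 0.01, 50.0
U, x = 0.0, 0.0                # F(0, 0) = 0, so U(0) is already a solution
for _ in range(200):
    U = continuation_step(U, x, x_dot=1.0, dt=dt, zeta=zeta)
    x += 1.0 * dt              # the state drifts with dx/dt = 1

final_residual = abs(residual(U, x))
```

No iterative root search is run at any step: U is only integrated forward, and the stabilization parameter zeta pulls the residual back toward zero when integration errors accumulate.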

次に、状態ジャンプを考慮した非線形モデル予測制御について説明する。状態ｘ（ｔ）∈Ｒ^ｎが以下の式（３０）で表される条件を満たすと、その直後より状態ジャンプが生じるシステムを仮定する。
ただし、ξ（ｘ（ｔ））∈Ｒ^ｌはベクトル値である。 Next, the nonlinear model predictive control considering the state jump will be described. When the state x (t) ∈ R ⁿ satisfies the condition expressed by the following equation (30), a system in which a state jump occurs immediately after that is assumed.
However, ξ (x (t)) ∈ R ^l is a vector value.

ここで、状態ジャンプが、時刻ｔ_ｊの前後で起こるとする。つまり、ｔ_ｊの直前及び直後の時刻をそれぞれ「ｔ_ｊ−」及び「ｔ_ｊ＋」と表すと，ここで仮定されるシステムは、以下の式（３１）を満たしている。
Here, suppose that a state jump occurs around time t_j. That is, when the times immediately before and immediately after t_j are written as "t_j−" and "t_j+", respectively, the system assumed here satisfies the following equation (31).

また、ここで仮定されるシステムは、状態ジャンプ直後の状態ｘ（ｔ_ｊ＋）が、以下の式（３２）で表されるように、状態ジャンプ直前の状態ｘ（ｔ_ｊ−）から陽に求められるとする。
Further, in the system assumed here, the state x(t_j+) immediately after the state jump is obtained explicitly from the state x(t_j−) immediately before the state jump, as expressed by the following equation (32).

さらに、状態ジャンプとともにシステムの状態方程式が切り替わり得るとする。このとき、状態ジャンプの前後の状態方程式は、以下の式（３３），（３４）で示すように記述される。
なお、状態方程式は必ずしも切り替わる必要はなく、ｆ_１（ｘ，ｕ）＝ｆ_２（ｘ，ｕ）であってもよい。また、説明の明確化のため、ここで扱うシステムでは、式（４）で示したような拘束条件は存在しないと仮定する。 Furthermore, it is assumed that the equation of state of the system can be switched with the state jump. At this time, the equations of state before and after the state jump are described as shown by the following equations (33) and (34).
The equation of state does not necessarily have to be switched, and f ₁ (x, u) = f ₂ (x, u) may be used. Further, for the sake of clarification of the explanation, it is assumed that the constraint condition shown in the equation (4) does not exist in the system dealt with here.

非線形モデル予測制御の問題を考えるために、以下の説明では、評価区間上でのジャンプ時刻をτ_ｊとする。つまり、τ_ｊは、以下の式（３５），（３６）を満たす。
To consider the nonlinear model predictive control problem, in the following description the jump time on the evaluation interval is denoted τ_j. That is, τ_j satisfies the following equations (35) and (36).

一般的に、上記のシステムを最適化するためには、変分法により導かれた停留条件を数値的に解けばよい。システムの方程式が上記式（３３），（３４），（３２）で与えられ、ｔ_ｊが固定されていないとき、停留条件は、以下の式（３７）〜（４８）で与えられる。
但し、ν∈Ｒ^ｌは、は式（３１）に対するラグランジュ乗数である。 In general, in order to optimize the above system, the stationarity conditions derived by the calculus of variations may be solved numerically. When the system equations are given by equations (33), (34) and (32) above and t_j is not fixed, the stationarity conditions are given by the following equations (37) to (48).
Here, ν ∈ R^l is the Lagrange multiplier for equation (31).

また、ハミルトン関数Ｈ_１，Ｈ_２は、それぞれ以下の式（４９），（５０）で表される。
The Hamiltonian functions H ₁ and H ₂ are represented by the following equations (49) and (50), respectively.

上記の停留条件の中で、時刻ｔ_ｊに関する条件を示す式は、式（４１）,（４８）である。ここで、上述したように実際には離散化して数値計算することを考慮すると、これらの式からτ_ｊを求めることは困難である。τ_ｊを求めることができないと、評価区間上の状態ｘ^＊（τ；ｔ）及び評価区間上の随伴変数λ^＊（τ；ｔ）を求めることもできない。さらに、νも新たな未知量として追加されている。したがって、仮にτ_ｊが求められたとしても、本質的な未知量を算出するための式として上記の式（２４）をそのまま用いることはできず、新たに未知量及び未知方程式を定義しなければならない。 Among the above stationarity conditions, the equations that express conditions on the time t_j are equations (41) and (48). Considering that the numerical calculation is actually performed after discretization as described above, it is difficult to obtain τ_j from these equations. If τ_j cannot be obtained, neither the state x*(τ; t) nor the adjoint variable λ*(τ; t) on the evaluation interval can be obtained. In addition, ν has been added as a new unknown. Therefore, even if τ_j were obtained, equation (24) above could not be used as it is to compute the essential unknowns, and the unknowns and the equations to be solved would have to be newly defined.

そこで、本実施の形態にかかるアルゴリズムでは、十分大きな正のスカラー値ｐを用いた以下の式（５１）で表されるペナルティ関数が、評価関数に追加されている。
Therefore, in the algorithm according to the present embodiment, a penalty function represented by the following equation (51) using a sufficiently large positive scalar value p is added to the evaluation function.

式（５１）で表されるペナルティ関数を加えた評価関数を最適化すれば、指定した時刻τ_ｊに対して、以下の式（５２）が成り立ち得る。
If the evaluation function to which the penalty function represented by equation (51) has been added is optimized, the following equation (52) can hold at the specified time τ_j.

つまり、上記のペナルティ関数を用いることにより、指定した時刻ｔ_ｊにおいて状態ジャンプを生じさせることが可能となる。言い換えると、上記のペナルティ関数は、指定したタイミングで状態が不連続となるようにシステムの状態を拘束する拘束パラメータである。また、式（５２）で表される拘束条件がペナルティ関数について追加されているため、ラグランジュ乗数νは追加されなくてもよい。したがって、本実施の形態にかかるアルゴリズムでは、非線形モデル予測制御の最適化問題を、状態ジャンプの無い非線形モデル予測制御と同数の未知数の問題として扱うことができる。 In other words, by using the above penalty function, a state jump can be made to occur at the specified time t_j. Put differently, the above penalty function is a constraint parameter that constrains the state of the system so that the state becomes discontinuous at the specified timing. Further, since the constraint condition represented by equation (52) is imposed through the penalty function, the Lagrange multiplier ν need not be added. Therefore, in the algorithm according to the present embodiment, the optimization problem of nonlinear model predictive control can be treated as a problem with the same number of unknowns as nonlinear model predictive control without state jumps.

したがって、状態ジャンプを含む非線形モデル予測制御の問題は、式（５１）を評価関数に加えることにより、状態ジャンプの無い非線形モデル予測制御と同様に扱うことができる。但し、ペナルティ関数を用いるため、あらかじめ適切な状態ジャンプ時刻ｔ_ｊを指定する必要がある。 Therefore, the problem of nonlinear model predictive control including a state jump can be treated in the same manner as nonlinear model predictive control without a state jump by adding equation (51) to the evaluation function. However, because a penalty function is used, an appropriate state jump time t_j must be specified in advance.

このように、本実施の形態にかかるアルゴリズムでは、ペナルティ関数を用いることにより、想定しているタイミングで、不連続な状態変化を起こさせることができる。したがって、元々の非線形モデル予測制御の理論を容易に適用でき、さらにＣ／ＧＭＲＥＳ法を適用することも可能となる。したがって、後述するように、二足歩行ロボットのような非線形システムに対しても実時間で制御を行うことが可能となる。 As described above, in the algorithm according to the present embodiment, by using the penalty function, it is possible to cause a discontinuous state change at the assumed timing. Therefore, the original theory of nonlinear model predictive control can be easily applied, and further, the C / GMRES method can be applied. Therefore, as will be described later, it is possible to control a non-linear system such as a bipedal walking robot in real time.

次に、ペナルティ関数を用いた場合の停留条件について説明する。ここで、仮に、連続時間で停留条件を導出した場合、以下の式（５３），（５４）で表される項が生じる。
Next, the stationarity conditions when the penalty function is used will be described. If the stationarity conditions were derived in continuous time, the terms represented by the following equations (53) and (54) would arise.

上述したように、コンピュータで数値計算を行うためには、停留条件の各方程式を離散近似して考える必要がある。しかしながら、式（５３）,（５４）については、τ_ｊの前後で離散近似を行うことができない。したがって、本実施の形態にかかる問題に対しては、状態方程式及び評価関数を離散近似したのち、変分法より停留条件を導出することとする。 As described above, in order to perform numerical calculations on a computer, each equation of the stationarity conditions must be discretely approximated. However, equations (53) and (54) cannot be discretely approximated across τ_j. Therefore, for the problem of the present embodiment, the state equation and the evaluation function are first discretely approximated, and the stationarity conditions are then derived by the calculus of variations.

まず、以下の式（５５）を満たすステップｉをジャンプステップｉ_ｊと定義する。
First, the step i that satisfies the following equation (55) is defined as the jump step i_j.

ｉ_ｊを用いると、上記の式（３３）,（３４）,（３２）は、それぞれ、以下の式（５６），（５７），（５８）で示すように、離散近似される。
Using i_j, equations (33), (34) and (32) above are discretely approximated as shown in the following equations (56), (57) and (58), respectively.
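A discretized prediction with a jump step can be sketched in Python as follows. Since equations (56) to (58) are not reproduced in this text, the index convention is an assumption: the first dynamics are used for i < i_j, and at step i_j the jump map is applied before continuing with the second dynamics. The names `f_before`, `f_after` and `jump_map` are hypothetical stand-ins for the two state equations and the explicit jump relation.

```python
def rollout_with_jump(x0, inputs, dtau, i_jump, f_before, f_after, jump_map):
    """Forward-Euler rollout; the state is reset by jump_map when step i_jump is reached."""
    xs = [x0]
    x = x0
    for i, u in enumerate(inputs):
        if i < i_jump:
            x = x + f_before(x, u) * dtau          # dynamics before the jump
        else:
            if i == i_jump:
                x = jump_map(x)                    # discontinuous state change
            x = x + f_after(x, u) * dtau           # dynamics after the jump
        xs.append(x)
    return xs

# Toy scalar example: constant drift, and the jump halves the state.
trajectory = rollout_with_jump(
    x0=1.0, inputs=[0.0] * 4, dtau=0.5, i_jump=2,
    f_before=lambda x, u: 1.0 + u,
    f_after=lambda x, u: 1.0 + u,
    jump_map=lambda x: 0.5 * x,
)
```

In the toy example the state climbs from 1.0 to 2.0, is halved at step 2, and then climbs again, so the discontinuity sits at a step fixed in advance rather than at a time to be optimized.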

また、式（５１）で表されるペナルティ項（ペナルティ関数）は、以下の式（５９）で示すように、離散近似される。
Further, the penalty term (penalty function) represented by the equation (51) is discretely approximated as shown by the following equation (59).

次に、変分法より停留条件を求める。離散近似された本システムに対する最適制御問題とは、式（５９）が追加された、以下の式（６０）で表されるような離散近似された評価関数を最小にする入力の系列ｕ_０（ｔ），・・・，ｕ_Ｎ−１（ｔ）を求める問題である。
Next, the stationarity conditions are obtained by the calculus of variations. The optimal control problem for this discretely approximated system is the problem of finding the input sequence u_0(t), ..., u_{N−1}(t) that minimizes the discretely approximated evaluation function represented by the following equation (60), to which equation (59) has been added.
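As an illustration of a discretized evaluation function with a penalty term, the following Python sketch adds p times the squared norm of ξ at the jump step to the usual terminal and stage costs. The quadratic form of the penalty is an assumption, since equations (51), (59) and (60) are not reproduced in this text, and the function and parameter names are hypothetical.

```python
def discrete_cost(xs, us, dtau, i_jump, p, stage_cost, terminal_cost, xi):
    """Discretized evaluation function plus a quadratic penalty on xi at the jump step."""
    running = sum(stage_cost(x, u) * dtau for x, u in zip(xs[:-1], us))
    penalty = p * sum(c * c for c in xi(xs[i_jump]))   # assumed form: p * ||xi(x_{i_j})||^2
    return terminal_cost(xs[-1]) + running + penalty

cost = discrete_cost(
    xs=[1.0, 0.5, 0.25], us=[0.0, 0.0], dtau=0.5, i_jump=1, p=100.0,
    stage_cost=lambda x, u: x * x + u * u,
    terminal_cost=lambda x: x * x,
    xi=lambda x: [x - 0.5],        # hypothetical jump condition xi(x) = 0 at x = 0.5
)
```

With a large p, any trajectory whose state at the jump step violates ξ = 0 becomes expensive, so the minimizer is pushed toward satisfying the jump condition at the specified step without introducing an additional multiplier.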

ここで、式（５６），（５７），（５８）は、それぞれ、以下の式（６１），（６２），（６３）で示されるような等式拘束条件とみなすことができる。
Here, the equations (56), (57), and (58) can be regarded as equality constraints as shown by the following equations (61), (62), and (63), respectively.

式（６１），（６２）のラグランジュ乗数ベクトルをλ_ｉ＋１（ｔ）、式（６３）のラグランジュ乗数ベクトルを
とする。このとき、ラグランジュ関数は、以下の式（６４）で定義される。 The Lagrange multiplier vector of equations (61) and (62) is λ _{i + 1} (t), and the Lagrange multiplier vector of equation (63) is
At this time, the Lagrangian is defined by the following equation (64).

式（６４）のＪに式（６０）で示されたＪを代入すると、以下の式（６５）が得られる。
Substituting J represented by the formula (60) into J of the formula (64), the following formula (65) is obtained.

但し、ハミルトン関数Ｈ_１，Ｈ_２は、それぞれ以下の式（６６），（６７）で定義される。
However, the Hamiltonian functions H ₁ and H ₂ are defined by the following equations (66) and (67), respectively.

また、式（６５）の変分は、以下の式（６８）のように表される。
Further, the variation of the equation (65) is expressed as the following equation (68).

制御入力値が最適であれば、式（６８）において、任意のδｘ_ｉ（ｔ），δｕ_ｉ（ｔ）について、
が成り立つ。ここで、ｘ_０（ｔ）＝ｘ（ｔ）よりδｘ_０（ｔ）＝０であることに注意すると、以下の式（６９）〜（７９）で表される停留条件が導かれる。但し、添字＊は、評価区間上の値であることを示す。 If the control input value is optimal, then in equation (68), for arbitrary δx_i(t) and δu_i(t),
holds. Here, noting that δx_0(t) = 0 because x_0(t) = x(t), the stationarity conditions represented by the following equations (69) to (79) are derived. The superscript * indicates that the value is on the evaluation interval.

ここで、評価区間上の状態ｘ^＊ _ｉ（ｔ）は、上記式（６９），（７０），（７１），（７２）から算出され得る。また、随伴変数λ^＊ _ｉ（ｔ）は、上記式（７３），（７４），（７５），（７６）から算出され得る。したがって、本実施の形態にかかるモデルでは、拘束条件を考えていないため、未知量を示すベクトルＵ（ｔ）は、以下の式（８０）で定義される。
Here, the state x ^* _i (t) on the evaluation interval can be calculated from the above equations (69), (70), (71), and (72). Further, the adjoint variable λ ^* _i (t) can be calculated from the above equations (73), (74), (75) and (76). Therefore, in the model according to the present embodiment, since the constraint condition is not considered, the vector U (t) indicating the unknown quantity is defined by the following equation (80).

このＵ（ｔ）は、以下の式（８１）で示される方程式を解くことによって得られる。
この方程式を、上述したＣ／ＧＭＲＥＳ法を用いて解けばよい。 This U (t) can be obtained by solving the equation represented by the following equation (81).
This equation may be solved using the C / GMRES method described above.

ここで、サンプリング周期ごとの入力系列（最適解）の更新について考える。図４は、入力系列の更新について説明するための図である。図４の矢印Ａに示すように、Ｃ／ＧＭＲＥＳ法では、上記式（２６）で示すように前進差分近似を行っている。ここで、時刻ｔにおける評価区間でのｉ_ｊステップ目の時刻つまり（ｔ＋ｉ_ｊΔτ）が状態ジャンプ前の時刻（つまり時刻ｔ_ｊより前の時刻）であり、時刻ｔ＋Δｔにおける評価区間でのｉ_ｊステップ目の時刻つまり（ｔ＋Δｔ＋ｉ_ｊΔτ）が状態ジャンプ後の時刻（つまり時刻ｔ_ｊより後の時刻）であるとする。このとき、時刻ｔにおけるジャンプステップはｉ_ｊであるが、時刻ｔ＋Δｔにおけるジャンプステップはｉ_ｊ−１である。このとき、
は状態ジャンプ前の入力（以下、式Ｕｊ_ｂと表記）であり、
は状態ジャンプ後の入力（以下、式Ｕｊ_ａと表記）となる。 Here, consider updating the input sequence (optimal solution) at each sampling period. FIG. 4 is a diagram for explaining the update of the input sequence. As shown by arrow A in FIG. 4, in the C/GMRES method, a forward difference approximation is performed as shown by equation (26) above. Suppose that the time of the i_j-th step on the evaluation interval at time t, that is, (t + i_jΔτ), is a time before the state jump (a time earlier than t_j), and that the time of the i_j-th step on the evaluation interval at time t + Δt, that is, (t + Δt + i_jΔτ), is a time after the state jump (a time later than t_j). In this case, the jump step at time t is i_j, whereas the jump step at time t + Δt is i_j − 1. At this time,
is the input before the state jump (hereinafter written as expression Uj_b), and
is the input after the state jump (hereinafter written as expression Uj_a).

したがって、本実施の形態にかかるシステムは、状態ジャンプの前後で状態が不連続であることから、ｉ＝ｉ_ｊのとき、単純に、
とすることはできない。つまり、式（２６）’のように計算すると、上記の状態ジャンプ後の入力Ｕｊ_ａは、最適入力として更新されていないこととなる。 Therefore, in the system according to the present embodiment, since the state is discontinuous before and after the state jump, when i = i_j one cannot simply set
That is, if the calculation is performed as in equation (26)', the input Uj_a after the state jump is not updated as an optimal input.

そこで、本実施の形態においては、状態ジャンプ後の入力Ｕｊ_ａを、ｉ≠ｉ_ｊのｕ^＊ _ｉより近似することと考える。状態ジャンプ後の入力Ｕｊ_ａは、時刻ｔ＋Δｔ＋ｉ_ｊΔτの評価区間上の入力である。適切に近似するためには、図４の矢印Ｂで示すように、この時刻に最も近い時刻における入力、つまり時刻ｔ＋（ｉ_ｊ＋１）Δτの入力
から近似すればよい。 Therefore, in the present embodiment, the input Uj_a after the state jump is approximated from u*_i with i ≠ i_j. The input Uj_a after the state jump is an input on the evaluation interval at time t + Δt + i_jΔτ. For an appropriate approximation, as shown by arrow B in FIG. 4, the input at the time closest to this time, that is, the input at time t + (i_j + 1)Δτ,
may be used as the basis of the approximation.

したがって、時刻ｔにおけるジャンプステップと時刻ｔ＋Δｔにおけるジャンプステップとが異なるとき、状態ジャンプ後の入力Ｕｊ_ａは、以下の式（８２）で示すように更新される。
Therefore, when the jump step at time t differs from the jump step at time t + Δt, the input Uj_a after the state jump is updated as shown by the following equation (82).

但し、ｉ_ｊ＝Ｎ−１のときは、上記式（８２）で示した近似は行えないので、通常通り、以下の式（８３）で示すように入力は更新される。
However, when i_j = N−1, the approximation shown in equation (82) above cannot be performed, so the input is updated as usual, as shown by the following equation (83).
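The update rule around the jump step can be sketched in Python as follows. Equations (82) and (83) are not reproduced in this text, so the sketch adopts the simplest reading of the description: when the jump step changes between sampling periods, the entry at the old jump step is taken from the neighbouring post-jump entry of the old sequence, and when the old jump step is the last step the ordinary update is kept. The function name and arguments are hypothetical.

```python
def update_inputs(u, u_dot, dt, i_jump_old, i_jump_new):
    """Advance the input sequence by one sampling period, patching the entry that crosses the jump."""
    n = len(u)
    updated = [u_i + du_i * dt for u_i, du_i in zip(u, u_dot)]   # ordinary update, as in (26)
    crossed = i_jump_new != i_jump_old
    if crossed and i_jump_old < n - 1:
        # Assumed reading of (82): reuse the nearest post-jump input from the old sequence.
        updated[i_jump_old] = u[i_jump_old + 1]
    return updated

new_u = update_inputs(
    u=[1.0, 1.0, -2.0, -2.0], u_dot=[0.0, 0.0, 0.0, 0.0],
    dt=0.5, i_jump_old=1, i_jump_new=0,
)
```

In the example, the entry at index 1 was a pre-jump input in the old sequence but belongs to the post-jump phase in the new one, so it is replaced by the old post-jump value at index 2 instead of being extrapolated across the discontinuity.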

なお、上式は、ΔｔとΔτとが、以下の式（８４）を満たすことを仮定している。
Δｔ＜２Δτ ・・・（８４）
しかしながら、
２Δτ≦Δｔ・・・（８５）
であるときも、ｕ^＊ _ｉ（ｔ＋Δｔ）の系列は、実時間上のｕ^＊ _ｉ（ｔ）の系列から適切に近似され得る。 The above equation assumes that Δt and Δτ satisfy the following equation (84).
Δt < 2Δτ ・・・ (84)
However, even when
2Δτ ≤ Δt ・・・ (85)
holds, the sequence u*_i(t + Δt) can be appropriately approximated from the sequence u*_i(t) on the real time axis.

なお、Ｃ／ＧＭＲＥＳ法においては、Ｕ（ｔ）の変化量を求める際に，Ｆ（Ｕ，ｘ，ｔ）を用いて、以下の式（８６）で示すように、前進差分近似を行っている。
In the C/GMRES method, when the amount of change in U(t) is obtained, a forward difference approximation using F(U, x, t) is performed as shown by the following equation (86).

しかしながら、差分近似の差分時間をｈとすると，時刻ｔと時刻ｔ＋ｈとの間で状態ジャンプが生じた場合、状態量が不連続に大きく変化してしまい、差分近似を正確に行うことができない。したがって、時刻ｔが以下の式（８７）で示される
ｔ＜ｔ_ｊ≦ｔ＋ｈ・・・（８７）
を満たすとき，以下の式（８８）で示すように後退差分近似を行う。
However, if the step of the difference approximation is h and a state jump occurs between time t and time t + h, the state quantity changes discontinuously by a large amount, and the difference approximation cannot be performed accurately. Therefore, when the time t satisfies the following expression (87),
t < t_j ≤ t + h ・・・ (87)
a backward difference approximation is performed as shown by the following equation (88).
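The switch between forward and backward differences can be illustrated with a small Python sketch. It differentiates a hypothetical scalar signal g(t) rather than the function F(U, x, t) of equations (86) and (88), which are not reproduced in this text. Only the switching condition (87) is taken from the description.

```python
def time_derivative(g, t, h, t_jump):
    """Difference quotient of g at t that avoids differencing across a jump at t_jump."""
    if t < t_jump <= t + h:
        return (g(t) - g(t - h)) / h      # backward difference: both samples lie before the jump
    return (g(t + h) - g(t)) / h          # forward difference otherwise

def g(t):
    # Toy signal with slope 1 and a jump of +5 at t = 1.
    return t + (5.0 if t >= 1.0 else 0.0)

slope_near_jump = time_derivative(g, t=0.9995, h=1e-3, t_jump=1.0)
slope_elsewhere = time_derivative(g, t=0.5, h=1e-3, t_jump=1.0)
```

A forward difference at t = 0.9995 would straddle the jump and report a slope in the thousands. The backward difference keeps both samples on the same side of the discontinuity and recovers the true slope of 1.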

（二足歩行ロボットへの適用）
次に、上述した非線形モデル予測制御を、本実施の形態にかかるロボット１００の動作の制御に適用した例について説明する。なお、実施の形態１においては、ロボット１００がコンパス型モデルである例について説明するが、後述するように、非線形モデル予測制御は、ロボット１００がコンパス型モデルでなくても適用可能である。 (Application to biped robots)
Next, an example in which the above-mentioned nonlinear model predictive control is applied to the control of the operation of the robot 100 according to the present embodiment will be described. In the first embodiment, an example in which the robot 100 is a compass type model will be described, but as will be described later, the nonlinear model prediction control can be applied even if the robot 100 is not a compass type model.

なお、ロボット１００の歩行動作は、遊脚が地面と衝突する（着地する）という動作を含む。この衝突の前後で、ロボット１００の一般化速度が不連続に変化する。つまり、このとき、状態ジャンプが発生する。また、一般的に、歩行動作は、周期的な運動である。したがって、ロボット１００を、予め定められた周期ごとに状態ジャンプを生じさせる（つまり遊脚を着地させる）ように制御を行うことが可能である。なお、「着地」とは、遊脚が地面と衝突（接触）することに限定されない。つまり、「着地」とは、ロボット１００がその上を歩行している面（歩行面）に遊脚が接触することを意味する。 The walking motion of the robot 100 includes a motion in which the swing leg collides with the ground (lands). Before and after this collision, the generalized velocity of the robot 100 changes discontinuously. That is, a state jump occurs at this time. In general, the walking motion is a periodic motion. Therefore, the robot 100 can be controlled so that a state jump occurs (that is, the swing leg lands) at every predetermined period. Note that "landing" is not limited to a collision (contact) of the swing leg with the ground. That is, "landing" means that the swing leg comes into contact with the surface on which the robot 100 is walking (the walking surface).

図５は、実施の形態１にかかるロボット１００をコンパス型モデルに適用する方法を説明するための図である。図５に示した例では、右脚１１０Ｒが支持脚であり、左脚１１０Ｌが遊脚（振り脚）である。制御装置２は、支持脚が地面と点接触していることを模擬するため、支持脚（図５の例では右脚１１０Ｒ）の足首関節部１２４に設けられたトルクセンサ１３６を用いて、支持脚の足首関節部１２４のトルクを０に制御する。また、制御装置２は、右脚１１０Ｒ及び左脚１１０Ｌの膝関節部１２２を、伸展状態でロックするように制御する。つまり、制御装置２は、右脚１１０Ｒ及び左脚１１０Ｌの膝関節部１２２の関節角度が伸展状態に対応する角度（例えば０）となるように、膝関節部１２２のモータ１４０を制御する。さらに、制御装置２は、遊脚（図５の例では左脚１１０Ｌ）の足裏センサ１１８を用いて、遊脚の着地を検出する。このようにして、ロボットシステム１は、コンパス型モデルを模擬することができる。 FIG. 5 is a diagram for explaining a method of applying the robot 100 according to the first embodiment to the compass model. In the example shown in FIG. 5, the right leg 110R is a support leg and the left leg 110L is a swing leg (swing leg). The control device 2 supports by using a torque sensor 136 provided at the ankle joint portion 124 of the support leg (right leg 110R in the example of FIG. 5) in order to simulate that the support leg is in point contact with the ground. The torque of the ankle joint portion 124 of the leg is controlled to 0. Further, the control device 2 controls the knee joint portions 122 of the right leg 110R and the left leg 110L so as to be locked in the extended state. That is, the control device 2 controls the motor 140 of the knee joint portion 122 so that the joint angles of the knee joint portions 122 of the right leg 110R and the left leg 110L are angles (for example, 0) corresponding to the extended state. Further, the control device 2 detects the landing of the swing leg by using the sole sensor 118 of the swing leg (left leg 110L in the example of FIG. 5). In this way, the robot system 1 can simulate a compass model.

図６は、実施の形態１にかかるロボット１００をコンパス型モデルに適用した例を示す図である。図６に示す例では、ロボット１００は、関節１５０と、支持脚リンク１５１と、遊脚リンク１５２とから構成されるコンパス型モデルにモデル化されている。ここで、関節１５０は、胴体１０２及び股関節部１２０に対応する。また、支持脚リンク１５１は、右脚１１０Ｒ及び左脚１１０Ｌのうちの支持脚に対応する。また、遊脚リンク１５２は、右脚１１０Ｒ及び左脚１１０Ｌのうちの遊脚に対応する。 FIG. 6 is a diagram showing an example in which the robot 100 according to the first embodiment is applied to a compass model. In the example shown in FIG. 6, the robot 100 is modeled on a compass model including a joint 150, a support leg link 151, and a swing leg link 152. Here, the joint 150 corresponds to the torso 102 and the hip joint 120. Further, the support leg link 151 corresponds to the support leg of the right leg 110R and the left leg 110L. Further, the swing leg link 152 corresponds to the swing leg of the right leg 110R and the left leg 110L.

関節１５０の質量をｍ_０とする。また、図６の矢印で示すように、関節１５０の周りに、制御入力値として入力トルクｕが入力される。ここで、支持脚リンク１５１及び遊脚リンク１５２の物理的性質は、互いに同じであるとする。支持脚リンク１５１及び遊脚リンク１５２の長さを、ｌとする。また、支持脚リンク１５１及び遊脚リンク１５２の質量を、ｍとする。 Let the mass of the joint 150 be m ₀ . Further, as shown by the arrow in FIG. 6, the input torque u is input as a control input value around the joint 150. Here, it is assumed that the physical properties of the support leg link 151 and the swing leg link 152 are the same as each other. Let l be the length of the support leg link 151 and the swing leg link 152. Further, the mass of the support leg link 151 and the swing leg link 152 is m.

また、鉛直方向に対する支持脚リンク１５１の角度をθ_１とし、鉛直方向に対する遊脚リンク１５２の角度をθ_２とする。但し、図６において時計回り（各リンクの下端を中心に関節１５０が前方に回る方向）を正とする。したがって、図６の状態では、θ_２＜０である。 Further, the angle of the support leg link 151 with respect to the vertical direction is set to θ _1, and the angle of the swing leg link 152 with respect to the vertical direction is set to θ ₂ . However, in FIG. 6, clockwise (direction in which the joint 150 rotates forward around the lower end of each link) is positive. Therefore, in the state of FIG. 6, θ ₂ <0.

次に、図６に例示したコンパス型モデルの歩行動作に関して、以下のような仮定があるとする。
・遊脚リンク１５２と地面９０との衝突（着地）は一瞬である。
・遊脚リンク１５２と地面９０との衝突は完全非弾性衝突である。
・リンクと地面との摩擦係数は∞である。
・両脚（両リンク）が同時に地面９０から力を受けることはない。 Next, it is assumed that the following assumptions are made regarding the walking motion of the compass model illustrated in FIG.
-The collision (landing) between the swing leg link 152 and the ground 90 is instantaneous.
-The collision between the swing leg link 152 and the ground 90 is a completely inelastic collision.
-The coefficient of friction between the link and the ground is ∞.
-Both legs (both links) do not receive force from the ground 90 at the same time.

上記の仮定より、両脚（支持脚リンク１５１及び遊脚リンク１５２）が同時に地面９０に着くことはない。また、衝突時に遊脚リンク１５２の速度は０となる。したがって、本モデルの歩行制御に必要な方程式は、片脚支持期の運動方程式（状態方程式）と、遊脚リンク１５２の衝突時の方程式（衝突方程式）との２つである。 From the above assumption, both legs (support leg link 151 and swing leg link 152) do not reach the ground 90 at the same time. Further, the speed of the swing leg link 152 becomes 0 at the time of a collision. Therefore, there are two equations required for walking control of this model: the equation of motion (state equation) during the one-leg support period and the equation at the time of collision of the swing leg link 152 (collision equation).

片脚（つまり支持脚リンク１５１）だけが地面９０に接触しているとき、ラグランジュの運動方程式より、以下の式（８９）で示す方程式が導き出される。
When only one leg (that is, the support leg link 151) is in contact with the ground 90, the equation shown by the following equation (89) is derived from Lagrange's equation of motion.

但し、ｑは、以下の式（Ｓ１）で示される一般化座標ベクトルである。
また、Ｍ（ｑ）は慣性行列、Ｈ（ｑ、ｑ（ドット））は重力とコリオリ力の項、Ｎｕはｑに対する一般化力である。なお、Ｒ^ｎ×ｍは、ｎ×ｍの実数行列全体の集合を示す。 However, q is a generalized coordinate vector represented by the following equation (S1).
Further, M(q) is the inertia matrix, H(q, q (dot)) is the term for gravity and the Coriolis force, and Nu is the generalized force with respect to q. Note that R^{n×m} denotes the set of all n × m real matrices.

ここで、式（８９）に示した運動方程式の詳細を、以下の式（Ｅ１），（Ｅ２），（Ｅ３）に示す。なお、以下の式において、Ｉ^ｍは、支持脚リンク１５１及び遊脚リンク１５２の重心（重心１５１ｍ及び重心１５２ｍ）周りの慣性モーメントである。また、ｌ_Ｇは、関節１５０から各リンクの重心（重心１５１ｍ及び重心１５２ｍ）までの長さである。
Here, the details of the equation of motion shown in equation (89) are given in the following equations (E1), (E2), and (E3). In these equations, I_m is the moment of inertia of the support leg link 151 and the swing leg link 152 about their centers of gravity (center of gravity 151m and center of gravity 152m). Further, l_G is the length from the joint 150 to the center of gravity of each link (center of gravity 151m and center of gravity 152m).

また、状態ベクトルｘ∈Ｒ^４を、以下の式（Ｓ２）に示す。

この場合、状態方程式は、上記式（８９）より、以下の式（９０）で表される。
なお、本実施の形態かかるコンパス型モデルでは、歩行の拘束条件として、ＺＭＰ（zero moment point）は考慮されないものとする。 Further, the state vector x ∈ R ⁴ is shown in the following equation (S2).

In this case, the equation of state is expressed by the following equation (90) from the above equation (89).
In the compass type model according to the present embodiment, ZMP (zero moment point) is not considered as a walking constraint condition.
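As a rough illustration of how the state equation follows from the equation of motion, the sketch below assumes that equation (89) has the standard form M(q) q̈ + H(q, q̇) = N u and that the state vector stacks q and its time derivative as x = [θ₁, θ₂, θ₁ (dot), θ₂ (dot)]. Neither formula is reproduced in this text, so both are assumptions, and the function and parameter names (`state_derivative`, `inertia`, `bias`, `input_map`) are hypothetical.

```python
def solve_2x2(a, b):
    """Solve the 2x2 linear system a @ y = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if det == 0:
        raise ValueError("inertia matrix is singular")
    y0 = (a[1][1] * b[0] - a[0][1] * b[1]) / det
    y1 = (a[0][0] * b[1] - a[1][0] * b[0]) / det
    return [y0, y1]


def state_derivative(x, u, inertia, bias, input_map):
    """Return dx/dt for x = [theta1, theta2, dtheta1, dtheta2].

    Assumed form of the equation of motion: M(q) ddq + H(q, dq) = N u.
    inertia(q)   -> 2x2 matrix M(q)       (caller-supplied, hypothetical name)
    bias(q, dq)  -> 2-vector H(q, dq)     (caller-supplied, hypothetical name)
    input_map    -> 2-vector N
    """
    q, dq = x[:2], x[2:]
    m = inertia(q)
    h = bias(q, dq)
    # Right-hand side N u - H(q, dq)
    rhs = [input_map[0] * u - h[0], input_map[1] * u - h[1]]
    ddq = solve_2x2(m, rhs)
    # The derivative of the angles is the angular velocity; the derivative of
    # the angular velocity comes from solving the equation of motion.
    return [dq[0], dq[1], ddq[0], ddq[1]]
```

The actual entries of M, H, and N for the compass model are given by equations (E1) to (E3) and are left to the caller here.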

次に、衝突方程式について説明する。衝突方程式の説明の前に、状態ジャンプについて説明する。
図７は、状態ジャンプを説明するための図である。図７は、右脚１１０Ｒ又は左脚１１０Ｌの状態を示す図である。ここでは、右脚１１０Ｒの状態を示すとする。図７は、横軸が右脚１１０Ｒの角度を示し、縦軸が右脚１１０Ｒの角速度を示す、グラフ（位相線図）である。 Next, the collision equation will be described. Before explaining the collision equation, the state jump will be explained.
FIG. 7 is a diagram for explaining a state jump. FIG. 7 is a diagram showing a state of the right leg 110R or the left leg 110L. Here, it is assumed that the state of the right leg 110R is shown. FIG. 7 is a graph (phase diagram) in which the horizontal axis indicates the angle of the right leg 110R and the vertical axis indicates the angular velocity of the right leg 110R.

状態Ｉにおいて、右脚１１０Ｒが地面から離れて遊脚となる。したがって、状態Ｉから、右脚１１０Ｒ（遊脚リンク１５２）の角度及び角速度は、（θ_２，θ_２（ドット））である。このとき、状態Ｉから後述する状態ＩＩまでの期間では、角度及び角速度は、連続的に変化している。そして、状態ＩＩで、遊脚であった右脚１１０Ｒ（遊脚リンク１５２）が地面９０に着地する。そして、直ちに、状態は状態ＩＩＩに移行して右脚１１０Ｒは支持脚（支持脚リンク１５１）となる。このとき、状態ＩＩから状態ＩＩＩに遷移するときに、角度はほとんど変わらないが、角速度が急激に変化する。したがって、状態ＩＩから状態ＩＩＩに遷移する際に、状態ジャンプが発生している。 In state I, the right leg 110R leaves the ground and becomes the swing leg. Therefore, from state I onward, the angle and angular velocity of the right leg 110R (swing leg link 152) are (θ₂, θ₂ (dot)). During the period from state I to state II, described later, the angle and the angular velocity change continuously. Then, in state II, the right leg 110R (swing leg link 152), which was the swing leg, lands on the ground 90. The state immediately shifts to state III, and the right leg 110R becomes the support leg (support leg link 151). In the transition from state II to state III, the angle hardly changes, but the angular velocity changes abruptly. Therefore, a state jump occurs in the transition from state II to state III.

状態ＩＩＩから、右脚１１０Ｒ（支持脚リンク１５１）の角度及び角速度は、（θ_１，θ_１（ドット））である。このとき、状態ＩＩＩから後述する状態ＩＶまでの期間では、角度及び角速度は、連続的に変化している。そして、状態ＩＶで、遊脚であった左脚１１０Ｌ（遊脚リンク１５２）が地面９０に着地する。そして、直ちに、状態は状態Ｉに移行して右脚１１０Ｒは遊脚（遊脚リンク１５２）となる。このとき、状態ＩＶから状態Ｉに遷移するときに、角度はほとんど変わらないが、角速度が急激に変化する。したがって、状態ＩＶから状態Ｉに遷移する際に、状態ジャンプが発生している。
このように、遊脚が地面に着地すると、状態を示すパラメータ（図７の例では脚の角度及び角速度）が、不連続に変化する。この現象が、状態ジャンプである。 From state III onward, the angle and angular velocity of the right leg 110R (support leg link 151) are (θ₁, θ₁ (dot)). During the period from state III to state IV, described later, the angle and the angular velocity change continuously. Then, in state IV, the left leg 110L (swing leg link 152), which was the swing leg, lands on the ground 90. The state immediately shifts to state I, and the right leg 110R becomes the swing leg (swing leg link 152). In the transition from state IV to state I, the angle hardly changes, but the angular velocity changes abruptly. Therefore, a state jump occurs in the transition from state IV to state I.
In this way, when the swing leg lands on the ground, the parameters indicating the state (angle and angular velocity of the leg in the example of FIG. 7) change discontinuously. This phenomenon is a state jump.

次に、衝突前後の一般化座標及び一般化速度の定義について説明する。
図８は、遊脚リンク１５２の衝突直前のロボット１００の状態を示す図である。図８に示すように、衝突直前の一般化座標及び一般化速度は、それぞれ以下の式（Ｇ１）及び式（Ｇ２）で表される。
Next, the definitions of generalized coordinates and generalized velocity before and after the collision will be described.
FIG. 8 is a diagram showing a state of the robot 100 immediately before the collision of the swing leg link 152. As shown in FIG. 8, the generalized coordinates and the generalized velocity immediately before the collision are represented by the following equations (G1) and (G2), respectively.

図９は、遊脚リンク１５２の衝突直後のロボット１００の状態を示す図である。図９に示すように、衝突直後の一般化座標及び一般化速度は、それぞれ以下の式（Ｇ３）及び式（Ｇ４）で表される。
FIG. 9 is a diagram showing a state of the robot 100 immediately after the collision of the swing leg link 152. As shown in FIG. 9, the generalized coordinates and the generalized velocity immediately after the collision are represented by the following equations (G3) and (G4), respectively.

ここで、遊脚リンク１５２と地面９０との衝突について、以下の２つの角運動量についての保存則を適用する。
・衝突前に遊脚であった脚（遊脚リンク１５２）の先端まわりの系全体の角運動量。
・関節１５０まわりの、衝突前に支持脚であった脚（支持脚リンク１５１）の角運動量。
上記の角運動量保存則は、上記式（Ｇ１），（Ｇ２），（Ｇ３），（Ｇ４）より、以下の式（９１）で表される。
Here, the following two conservation laws for angular momentum are applied to the collision between the swing leg link 152 and the ground 90.
- Angular momentum of the entire system about the tip of the leg that was the swing leg before the collision (swing leg link 152).
- Angular momentum, about the joint 150, of the leg that was the support leg before the collision (support leg link 151).
The above law of conservation of angular momentum is expressed by the following equation (91) from the above equations (G1), (G2), (G3) and (G4).

なお、Ｑ^＋及びＱ⁻は、それぞれ以下の式（Ｅ４），（Ｅ５）で表される。
Note that Q ⁺ and Q ⁻ are represented by the following equations (E4) and (E5), respectively.

ここで、衝突前後で、一般化座標は変わらない。したがって、衝突直前の一般化座標ｑ⁻及び衝突直後の一般化座標ｑ^＋について、以下の式（９２）が成り立つ。
Here, the generalized coordinates do not change before and after the collision. Therefore, the following equation (92) holds for the generalized coordinate q ⁻ immediately before the collision and the generalized coordinate q ⁺ immediately after the collision.

また、上記式（９１），（９２）は、衝突直前の状態空間ベクトルｘ⁻及び衝突直後の状態空間ベクトルｘ^＋を用いて、以下の式（９３）で表される。
但し、Ｉ_２は２×２の単位行列である。 Further, the above equations (91) and (92) are expressed by the following equation (93) using the state space vector x ⁻ immediately before the collision and the state space vector x ⁺ immediately after the collision.
However, I ₂ is a 2 × 2 identity matrix.

また、Ｚ（ｑ⁻）は、以下の式（９４）を満たす２×２行列である。
Further, Z (q ⁻ ) is a 2 × 2 matrix satisfying the following equation (94).

次に、実施の形態１にかかるコンパス型モデルに、上述した非線形モデル予測制御を適用することを考える。上述したように、歩行動作とは、「連続して脚を前に出す」という動作である。本モデルにおいては、脚（支持脚リンク１５１及び遊脚リンク１５２）の開き角を目標値に近づける、という評価関数を設定し、遊脚リンク１５２が着地するたびに各リンクの座標を入れ替えることで歩行制御を行う。 Next, consider applying the above-mentioned nonlinear model predictive control to the compass model according to the first embodiment. As described above, walking is the motion of "continuously moving the legs forward". In this model, walking control is performed by setting an evaluation function that brings the opening angle of the legs (support leg link 151 and swing leg link 152) closer to a target value, and by exchanging the coordinates of the two links each time the swing leg link 152 lands.

このとき、上述した評価関数Ｊの終端コストφ及びステージコストＬは、それぞれ、次の式（９５），（９６）のように表される。
但し、ｓ_ｆ、ｑ_０及びｒは、それぞれ重みを表す正のスカラー値である。また、θ_ｒｅｆは、脚の開き角の目標値（目標角度）である。 At this time, the terminal cost φ and the stage cost L of the evaluation function J described above are expressed by the following equations (95) and (96), respectively.
Here, s_f, q_0, and r are positive scalar values representing weights. Further, θ_ref is the target value (target angle) of the leg opening angle.

また、遊脚の着地の度にθ_１とθ_２とを入れ替え、θ_１（ドット）とθ_２（ドット）とを入れ替えることを考慮すると、状態ジャンプの方程式である式（９３）は、以下の式（９７）のように書き換えられる。
In addition, taking into account that θ₁ and θ₂ are exchanged, and θ₁ (dot) and θ₂ (dot) are exchanged, each time the swing leg lands, equation (93), which is the state jump equation, is rewritten as the following equation (97).

但し、Ｉ'_２∈Ｒ^２×２は、以下の行列である。
Here, I'₂ ∈ R^{2×2} is the following matrix.
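The relabeling of the legs at landing can be sketched as follows. This is only an illustration under two assumptions that the text does not spell out: that I'₂ is the 2 × 2 exchange matrix [[0, 1], [1, 0]], and that equation (97) first applies Z(q⁻) to the angular velocities and then swaps the indices of the two legs. The function name `jump_map` is hypothetical, and the matrix Z(q⁻) is supplied by the caller.

```python
def jump_map(x_minus, z):
    """Map the state just before landing to the state just after landing.

    x_minus: [theta1, theta2, dtheta1, dtheta2] immediately before impact.
    z:       2x2 matrix Z(q-) that maps pre-impact to post-impact velocities.
    """
    theta1, theta2, dtheta1, dtheta2 = x_minus
    # Post-impact velocities before relabeling: Z(q-) applied to dq-.
    v1 = z[0][0] * dtheta1 + z[0][1] * dtheta2
    v2 = z[1][0] * dtheta1 + z[1][1] * dtheta2
    # Assumed exchange matrix I'_2 = [[0, 1], [1, 0]]: the former swing leg
    # becomes the support leg, so angles and velocities swap places.
    return [theta2, theta1, v2, v1]
```

With Z equal to the identity, the map reduces to a pure swap of the two legs, which is a convenient sanity check.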

次に、歩行周期をＴ_ｓｔｅｐとすると、ジャンプ時刻ｔ_ｊは、整数ｋを用いて以下の式（９８）で表される。
ｔ_ｊ＝ｋＴ_ｓｔｅｐ・・・（９８）
なお、評価区間中に状態ジャンプが２回以上生じないように、式（３）で示したＴ（ｔ）を設定する。このとき，周期Ｔ_ｓｔｅｐごとに式（９１）で表される脚（遊脚リンク１５２）と地面９０との衝突が起こり、それ以外のときは、式（９７）で表される運動方程式でモデルの状態を記述することができる。つまり、モデルの状態は、以下の式（９９），（１００）で表される。
Next, assuming that the walking cycle is T _step , the jump time t _j is expressed by the following equation (98) using an integer k.
t _j = kT _step ... (98)
Note that T(t) shown in equation (3) is set so that the state jump does not occur more than once within the evaluation interval. In this case, the collision between the leg (swing leg link 152) and the ground 90, expressed by equation (91), occurs every period T_step, and at all other times the state of the model can be described by the equation of motion expressed by equation (97). That is, the state of the model is represented by the following equations (99) and (100).
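The timing rule of equation (98) can be sketched as below. The helper names are hypothetical, and the check that the horizon is shorter than T_step is only one assumed way of ensuring that at most one jump falls inside the evaluation interval; the text itself only states that T(t) is chosen with that goal.

```python
import math


def next_jump_time(t, t_step):
    """Return the first jump time k * t_step that lies strictly after t."""
    k = math.floor(t / t_step) + 1
    return k * t_step


def jump_in_horizon(t, horizon, t_step):
    """Return the jump time inside (t, t + horizon], or None if there is none.

    Assumption: requiring horizon < t_step guarantees that the interval
    contains at most one jump, because jumps are spaced t_step apart.
    """
    if not 0 < horizon < t_step:
        raise ValueError("horizon must be positive and shorter than t_step")
    t_j = next_jump_time(t, t_step)
    return t_j if t_j <= t + horizon else None
```

For example, with T_step = 0.5 and a horizon of 0.25, an interval starting at t = 0.3 contains the jump at 0.5, while an interval starting at t = 0.0 contains none. A practical implementation would also need a tolerance for floating-point rounding when t is very close to a jump time.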

ここで、時刻ｔ_ｊで状態ジャンプ、つまり脚（遊脚リンク１５２）と地面９０との衝突が起こらなければならない。また、支持脚リンク１５１及び遊脚リンク１５２の長さが互いに同じであることから、遊脚リンク１５２が地面９０に着地したとき、θ_１＝−θ_２である。したがって、状態ジャンプが起こる条件は、以下の式（１０１）で表される。
Here, the state jump, that is, the collision between the leg (swing leg link 152) and the ground 90, must occur at time t_j. Further, since the support leg link 151 and the swing leg link 152 have the same length, θ₁ = −θ₂ holds when the swing leg link 152 lands on the ground 90. Therefore, the condition under which the state jump occurs is expressed by the following equation (101).

したがって、評価関数に追加されるペナルティ関数（ペナルティ項）は、以下の式（５１）’で表される。ここで、ｐ_１は、ゲイン（重み）であって、十分大きな正のスカラー値である。このゲインｐ_１は、制御装置２のコンピュータの性能等に応じて、適宜調整可能である。
Therefore, the penalty function (penalty term) added to the evaluation function is represented by the following equation (51)'. Here, p₁ is a gain (weight), which is a sufficiently large positive scalar value. The gain p₁ can be adjusted as appropriate according to, for example, the performance of the computer of the control device 2.

さらに、上記のペナルティ項とは別に、遊脚リンク１５２の着地時の脚の開き角を目標値に近づくけるような項を付け加えることを考える。したがって、以下の式（５１）”で示す項も、ペナルティ項として評価関数に加える。ここで、ｐ_２は、ゲイン（重み）であって、十分大きな正のスカラー値である。このゲインｐ_２は、制御装置２のコンピュータの性能等に応じて、適宜調整可能である。
Further, in addition to the above penalty term, consider adding a term that brings the leg opening angle at the time the swing leg link 152 lands closer to the target value. Accordingly, the term represented by the following equation (51)'' is also added to the evaluation function as a penalty term. Here, p₂ is a gain (weight), which is a sufficiently large positive scalar value. The gain p₂ can be adjusted as appropriate according to, for example, the performance of the computer of the control device 2.

式（５１）’及び式（５１）”を足し合わせると、実施の形態１にかかるペナルティ項は、以下の式（１０２）で表される。これにより、ペナルティ項（拘束パラメータ）は、予め指定したタイミング（時刻ｔ_ｊ）において遊脚が着地するようにロボット１００の状態を指定することとなる。言い換えると、ペナルティ項（拘束パラメータ）は、予め指定したタイミング（時刻ｔ_ｊ）において遊脚が着地したときのロボット１００の姿勢（各関節部の関節角度）を指定することとなる。
したがって、実施の形態１にかかるコンパス型モデルにおける非線形モデル予測制御を用いたロボット１００の制御では、式（５１）で示したペナルティ関数として、上記式（１０２）で表したものが適用される。 When equations (51)' and (51)'' are added together, the penalty term according to the first embodiment is expressed by the following equation (102). As a result, the penalty term (constraint parameter) specifies the state of the robot 100 so that the swing leg lands at a predetermined timing (time t_j). In other words, the penalty term (constraint parameter) specifies the posture (joint angle of each joint portion) of the robot 100 when the swing leg lands at the predetermined timing (time t_j).
Therefore, in the control of the robot 100 using the nonlinear model prediction control in the compass model according to the first embodiment, the penalty function represented by the above equation (102) is applied as the penalty function shown by the equation (51).
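Equations (51)', (51)'' and (102) are not reproduced in this text, so the following is only a sketch of what such a penalty could look like. It assumes quadratic penalties, takes the landing condition from θ₁ = −θ₂ (that is, θ₁ + θ₂ = 0), and assumes that the leg opening angle is θ₁ − θ₂. These forms, and the name `landing_penalty`, are assumptions for illustration rather than the patent's own formulas.

```python
def landing_penalty(theta1, theta2, theta_ref, p1, p2):
    """Assumed quadratic penalty evaluated at the designated landing time t_j.

    The first term penalizes violating the landing condition theta1 + theta2 = 0.
    The second term penalizes deviation of the assumed opening angle
    (theta1 - theta2) from the target theta_ref.
    p1 and p2 are the adjustable positive gains.
    """
    touchdown_error = theta1 + theta2
    opening_error = (theta1 - theta2) - theta_ref
    return 0.5 * p1 * touchdown_error ** 2 + 0.5 * p2 * opening_error ** 2
```

Under these assumptions the penalty is zero only when the swing leg touches down at t_j with exactly the target opening angle, and larger gains push the optimizer harder toward that posture.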

図１０は、実施の形態１にかかる制御装置２によって行われるロボット１００の制御方法を示すフローチャートである。図１０に示した制御方法は、上述した非線形モデル予測制御を用いている。したがって、実施の形態１にかかるロボット１００の制御方法は、式（６９）〜（７９）で表される停留条件から、式（８０）で表されるベクトルＵ（ｔ）を求め、このベクトルＵ（ｔ）の各成分の値を、式（８１）で示される方程式を解くことによって算出する。これにより、ロボット１００の動作を制御するための制御入力値が算出される。ここで、図１０に示したフローチャートにおいて、Ｓ１０２は、状態パラメータを取得する取得ステップに対応し、Ｓ１０４〜Ｓ１１４は、制御入力値を算出する算出ステップに対応し、Ｓ１１６は、ロボット１００の動作を制御する制御ステップに対応する。 FIG. 10 is a flowchart showing a control method of the robot 100 performed by the control device 2 according to the first embodiment. The control method shown in FIG. 10 uses the above-mentioned nonlinear model predictive control. Accordingly, in the control method of the robot 100 according to the first embodiment, the vector U(t) represented by equation (80) is obtained from the stationarity conditions represented by equations (69) to (79), and the value of each component of the vector U(t) is calculated by solving the equation represented by equation (81). As a result, the control input value for controlling the operation of the robot 100 is calculated. In the flowchart shown in FIG. 10, S102 corresponds to the acquisition step of acquiring the state parameter, S104 to S114 correspond to the calculation step of calculating the control input value, and S116 corresponds to the control step of controlling the operation of the robot 100.

まず、制御装置２は、ロボット１００の状態を示す状態ベクトル（状態パラメータ）を取得して、状態観測を行う（ステップＳ１０２）。具体的には、制御装置２の状態取得部１２は、股関節部１２０Ｒ及び股関節部１２０Ｌの角度センサ１３０から、股関節部１２０Ｒ及び股関節部１２０Ｌの関節角度を取得する。そして、状態取得部１２は、これらの関節角度から、現在時刻ｔにおけるθ_１及びθ_２（図６）を算出する。なお、状態取得部１２は、例えば、支持脚である方の脚にかかる股関節部１２０の関節角度と、胴体１０２の傾きとから、鉛直方向に対する支持脚リンク１５１の角度θ_１を取得できる。同様に、状態取得部１２は、例えば、遊脚である方の脚にかかる股関節部１２０の関節角度と、胴体１０２の傾きとから、鉛直方向に対する遊脚リンク１５２の角度θ_２を取得できる。なお、胴体１０２の傾きは、例えばジャイロセンサ等の傾斜センサを用いて取得可能である。 First, the control device 2 acquires a state vector (state parameter) indicating the state of the robot 100 and performs state observation (step S102). Specifically, the state acquisition unit 12 of the control device 2 acquires the joint angles of the hip joint portion 120R and the hip joint portion 120L from the angle sensor 130 of the hip joint portion 120R and the hip joint portion 120L. Then, the state acquisition unit 12 calculates θ ₁ and θ ₂ (FIG. 6) at the current time t from these joint angles. The state acquisition unit 12 can acquire the angle θ ₁ of the support leg link 151 with respect to the vertical direction from, for example, the joint angle of the hip joint portion 120 on the leg that is the support leg and the inclination of the body 102. Similarly, the state acquisition unit 12 can acquire the angle θ ₂ of the swing leg link 152 with respect to the vertical direction from, for example, the joint angle of the hip joint portion 120 on the swing leg and the inclination of the body 102. The inclination of the body 102 can be acquired by using an inclination sensor such as a gyro sensor.

また、状態取得部１２は、θ_１及びθ_２の変化量θ_１（ドット）及びθ_２（ドット）を算出する。例えば、状態取得部１２は、時刻ｔの１つ前のサンプリング周期（制御周期）における時刻ｔ−Δｔにおけるθ_１及びθ_２と時刻ｔにおけるθ_１及びθ_２との差分から、それぞれ変化量θ_１（ドット）及びθ_２（ドット）を算出してもよい。 The state acquisition unit 12 also calculates θ₁ (dot) and θ₂ (dot), the rates of change of θ₁ and θ₂. For example, the state acquisition unit 12 may calculate θ₁ (dot) and θ₂ (dot) from the differences between θ₁ and θ₂ at time t − Δt, one sampling period (control cycle) before time t, and θ₁ and θ₂ at time t.
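The difference-based estimate described above can be sketched as a backward difference over one sampling period. The function name `backward_difference` is hypothetical, and dividing the difference by Δt to obtain a rate is an assumption about how the difference is scaled; sensor noise filtering is not considered.

```python
def backward_difference(theta_now, theta_prev, dt):
    """Estimate angular velocities from two consecutive angle samples.

    theta_now:  [theta1, theta2] at time t.
    theta_prev: [theta1, theta2] at time t - dt (previous control cycle).
    dt:         sampling period (control cycle), must be positive.
    """
    if dt <= 0:
        raise ValueError("dt must be positive")
    return [(now - prev) / dt for now, prev in zip(theta_now, theta_prev)]
```

Stacking the measured angles and these estimates gives the state vector x(t) used as the initial value of the evaluation interval.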

これにより、状態取得部１２は、時刻ｔにおける状態ベクトルｘ（ｔ）を取得する。さらに、状態取得部１２は、式（７２）で示すように、このｘ（ｔ）を、時刻ｔについての評価区間における、状態ベクトルの初期値とする。つまり、ｘ^＊ _０（ｔ）＝ｘ（ｔ）とする。なお、この式（７２）についての処理は、非線形モデル予測制御部１４が行ってもよい。 As a result, the state acquisition unit 12 acquires the state vector x (t) at time t. Further, as shown by the equation (72), the state acquisition unit 12 sets this x (t) as the initial value of the state vector in the evaluation interval for the time t. That is, x ^* ₀ (t) = x (t). The non-linear model prediction control unit 14 may perform the processing for this equation (72).

次に、制御装置２の非線形モデル予測制御部１４は、時刻ｔについての評価区間における状態変数を、各ｉ＝１〜Ｎについて更新する（ステップＳ１０４）。具体的には、非線形モデル予測制御部１４は、式（６９）〜（７１）で示したように、状態変数ｘ^＊ _ｉ（ｔ）を更新する。なお、実施の形態１にかかるコンパス型モデルの遊脚の着地の例では、状態方程式に関するｆは、ジャンプステップの前後で同じであるとする。つまり、ｆ_１（ｘ，ｕ）＝ｆ_２（ｘ，ｕ）である。 Next, the nonlinear model prediction control unit 14 of the control device 2 updates the state variables in the evaluation interval at time t for each i = 1 to N (step S104). Specifically, the nonlinear model prediction control unit 14 updates the state variables x ^* _i (t) as shown by the equations (69) to (71). In the example of landing of the swing leg of the compass type model according to the first embodiment, it is assumed that f related to the equation of state is the same before and after the jump step. That is, f ₁ (x, u) = f ₂ (x, u).

そして、非線形モデル予測制御部１４は、式（９０）に示した状態方程式を用いて、式（６９）〜（７０）で示すように、ｉ≠ｉ_ｊにおける状態変数を更新する。また、非線形モデル予測制御部１４は、ｉ＝ｉ_ｊにおいては、式（９７）に示した状態ジャンプの方程式を用いて、式（７１）で示すように、状態変数を更新する。また、制御装置２のＲＡＭ８は、得られた状態変数ｘ^＊ _ｉ（ｔ）を記憶する。 Then, the nonlinear model prediction control unit 14 updates the state variables for i ≠ i_j, as shown in equations (69) to (70), using the state equation shown in equation (90). For i = i_j, the nonlinear model prediction control unit 14 updates the state variable, as shown in equation (71), using the state jump equation shown in equation (97). Further, the RAM 8 of the control device 2 stores the obtained state variables x*_i(t).

次に、制御装置２は、時刻ｔについての評価区間における随伴変数を、各ｉ＝１〜Ｎについて更新する（ステップＳ１０６）。具体的には、非線形モデル予測制御部１４は、式（７３）〜（７６）で示したように、Ｓ１０４で更新された各ｘ^＊ _ｉを用いて、随伴変数λ^＊ _ｉ（ｔ）を更新する。また、制御装置２のＲＡＭ８は、得られた随伴変数λ^＊ _ｉ（ｔ）を記憶する。 Next, the control device 2 updates the adjoint variables in the evaluation interval for time t for each i = 1 to N (step S106). Specifically, as shown in equations (73) to (76), the nonlinear model prediction control unit 14 updates the adjoint variables λ*_i(t) using each x*_i updated in S104. Further, the RAM 8 of the control device 2 stores the obtained adjoint variables λ*_i(t).

なお、上述したように、ｆ_１（ｘ，ｕ）＝ｆ_２（ｘ，ｕ）であるので、Ｈ_１＝Ｈ_２である。そして、非線形モデル予測制御部１４は、式（９０）に示した状態方程式及び式（９６）に示したステージコストＬの式を用いてハミルトン関数Ｈを算出し、式（７３）〜（７４）で示すように、ｉ≠ｉ_ｊにおける随伴変数を更新する。また、非線形モデル予測制御部１４は、ｉ＝ｉ_ｊにおいては、式（９７）に示した状態ジャンプの方程式、式（１０２）に示したペナルティ関数の式及び式（９６）に示したステージコストＬの式を用いて、式（７５）で示すように、随伴変数を更新する。また、非線形モデル予測制御部１４は、ｉ＝Ｎにおいては、式（９５）に示した終端コストの式を用いて、式（７６）で示すように、随伴変数を更新する。 As described above, since f₁(x, u) = f₂(x, u), H₁ = H₂. The nonlinear model prediction control unit 14 calculates the Hamiltonian H using the state equation shown in equation (90) and the stage cost L shown in equation (96), and updates the adjoint variables for i ≠ i_j as shown in equations (73) to (74). For i = i_j, the nonlinear model prediction control unit 14 updates the adjoint variable as shown in equation (75), using the state jump equation shown in equation (97), the penalty function shown in equation (102), and the stage cost L shown in equation (96). For i = N, the nonlinear model prediction control unit 14 updates the adjoint variable as shown in equation (76), using the terminal cost shown in equation (95).

次に、制御装置２は、ベクトル関数Ｆを導出する（ステップＳ１０８）。具体的には、非線形モデル予測制御部１４は、式（８０）で示されたベクトルＵ（ｔ）を算出するため、Ｓ１０４及びＳ１０６の処理で得られた状態変数ｘ^＊ _ｉ（ｔ）及び随伴変数λ^＊ _ｉ（ｔ）を用いて、式（８１）で示されたベクトル関数Ｆ（Ｕ，ｘ）の方程式を導出する。なお、上述したように、実施の形態１においては、Ｈ_１＝Ｈ_２である。 Next, the control device 2 derives the vector function F (step S108). Specifically, in order to calculate the vector U(t) represented by equation (80), the nonlinear model prediction control unit 14 derives the equation of the vector function F(U, x) represented by equation (81), using the state variables x*_i(t) and the adjoint variables λ*_i(t) obtained in S104 and S106. As described above, H₁ = H₂ in the first embodiment.

次に、制御装置２は、ベクトルＵ（ｔ）の全微分を計算する（ステップＳ１１０）。具体的には、非線形モデル予測制御部１４は、式（２９）を変形した以下の式（１０３）から、Ｃ／ＧＭＲＥＳ法を用いて、Ｕの全微分（Ｕ（ドット））、つまりＵ（ｔ）の変化率を算出する。言い換えると、非線形モデル予測制御部１４は、制御周期ごとに、制御入力値の最適解の変化率を算出する。
これにより、式（８０）から明らかなように、ｉ＝０〜Ｎ−１について、ｕ^＊ _ｉ（ｔ）の時間微分（ｕ^＊ _ｉ（ドット））が得られることとなる。ＲＡＭ８は、得られたｕ^＊ _ｉ（ｔ）の時間微分の値を記憶する。 Next, the control device 2 calculates the total derivative of the vector U (t) (step S110). Specifically, the nonlinear model prediction control unit 14 uses the C / GMRES method from the following equation (103), which is a modification of the equation (29), to obtain the total derivative of U (U (dot)), that is, U ( The rate of change of t) is calculated. In other words, the nonlinear model prediction control unit 14 calculates the rate of change of the optimum solution of the control input value for each control cycle.
As a result, as is clear from the equation (80), the time derivative (u ^* _i (dot)) of u ^* _i (t) can be obtained for i = 0 to N-1. The RAM 8 stores the obtained time derivative value of u ^* _i (t).

次に、制御装置２は、入力系列ｕ^＊ _ｉ（ｔ）の更新を行う（ステップＳ１１２）。具体的には、非線形モデル予測制御部１４は、式（２６）及び式（８２）から、以下の式（１０４）及び（１０５）により、入力系列ｕ^＊ _ｉ（ｔ）の更新を行う。
Next, the control device 2 updates the input sequence u ^* _i (t) (step S112). Specifically, the nonlinear model prediction control unit 14 updates the input series u ^* _i (t) from the equations (26) and (82) by the following equations (104) and (105).

ここで、式（１０４），（１０５）から、非線形モデル予測制御部１４は、ある時刻ｔにおける入力系列ｕ^＊ _ｉ（ｔ）及びｕ^＊ _ｉ（ｔ）の変化率を用いて、次のサンプリング周期Δｔである時刻ｔ＋Δｔにおける入力系列を算出する。したがって、時刻ｔにおける入力系列ｕ^＊ _ｉ（ｔ）を算出するためには、現在の時刻ｔの１つ前のサンプリング周期Δｔの時刻ｔ−Δｔにおける入力系列ｕ^＊ _ｉ（ｔ−Δｔ）及びｕ^＊ _ｉ（ｔ−Δｔ）の時間微分を用いることとなる。これにより、式（８０）で示したＵ（ｔ）の各成分の値が得られる。言い換えると、入力系列ｕ^＊ _ｉ（ｔ）のｉ＝０〜Ｎ−１それぞれの値が得られる。ＲＡＭ８は、入力系列ｕ^＊ _ｉ（ｔ）のｉ＝０〜Ｎ−１それぞれの値を記憶する。 Here, according to equations (104) and (105), the nonlinear model prediction control unit 14 uses the input sequence u*_i(t) at a given time t and its rate of change to calculate the input sequence at time t + Δt, one sampling period Δt later. Accordingly, the input sequence u*_i(t) at time t is calculated from the input sequence u*_i(t − Δt) at time t − Δt, one sampling period Δt before the current time t, and its time derivative. As a result, the value of each component of U(t) represented by equation (80) is obtained. In other words, the values of the input sequence u*_i(t) for i = 0 to N − 1 are obtained. The RAM 8 stores the values of the input sequence u*_i(t) for i = 0 to N − 1.
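The update described above amounts to integrating the input sequence forward by one sampling period. Equations (104) and (105) are not reproduced in this text, so the sketch below assumes a plain explicit Euler step, u_i(t) = u_i(t − Δt) + u_i (dot)(t − Δt) Δt, applied to every element of the sequence; the function name `update_inputs` is hypothetical.

```python
def update_inputs(u_prev, u_dot_prev, dt):
    """Advance the input sequence by one sampling period (assumed Euler step).

    u_prev:     input sequence u_i(t - dt) for i = 0 .. N-1.
    u_dot_prev: time derivatives of that sequence, as obtained in step S110.
    dt:         sampling period.
    """
    if len(u_prev) != len(u_dot_prev):
        raise ValueError("sequence and derivative must have the same length")
    return [u + du * dt for u, du in zip(u_prev, u_dot_prev)]
```

Because only the derivative is solved for in each cycle, the optimization is tracked continuously rather than solved from scratch, which is what makes the per-cycle computation small.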

次に、制御装置２は、入力値（制御入力値）を決定する（ステップＳ１１４）。具体的には、非線形モデル予測制御部１４は、以下の式（１０６）により、入力値ｕ_０を決定する。
つまり、非線形モデル予測制御部１４は、Ｕ（ｔ）の成分のうちの１番目の成分を入力値と決定する。非線形モデル予測制御部１４は、決定された入力値を、サーボ制御部１６に対して出力する。 Next, the control device 2 determines an input value (control input value) (step S114). Specifically, the nonlinear model prediction control unit 14 determines the input value u ₀ by the following equation (106).
That is, the nonlinear model prediction control unit 14 determines the first component of the U (t) components as the input value. The nonlinear model prediction control unit 14 outputs the determined input value to the servo control unit 16.

次に、制御装置２は、ロボット１００の制御を行う（ステップＳ１１６）。具体的には、サーボ制御部１６は、Ｓ１１４で決定された入力値から、各関節部に指示する関節トルクを決定する。さらに具体的には、サーボ制御部１６は、支持脚リンク１５１に対応する脚の股関節部１２０の関節トルクτ_１と、遊脚リンク１５２に対応する脚の股関節部１２０の関節トルクτ_２とを、以下の式（１０７），（１０８）によって決定する。
τ_１＝ｕ_０・・・（１０７）
τ_２＝−ｕ_０・・・（１０８） Next, the control device 2 controls the robot 100 (step S116). Specifically, the servo control unit 16 determines the joint torque instructed to each joint unit from the input value determined in S114. More specifically, the servo control unit 16 applies the joint torque τ ₁ of the hip joint portion 120 of the leg corresponding to the support leg link 151 and the joint torque τ ₂ of the hip joint portion 120 of the leg corresponding to the swing leg link 152. , Determined by the following equations (107) and (108).
τ ₁ = u ₀ ... (107)
τ ₂ = −u ₀ ... (108)
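Steps S114 and S116 can be sketched together as follows, assuming that equation (106) simply selects the first element of the input sequence, as the text states, and then applying equations (107) and (108). The function name `hip_torques` is hypothetical.

```python
def hip_torques(input_sequence):
    """Return (tau1, tau2) for the support-leg and swing-leg hip joints.

    input_sequence: optimized inputs u_i(t) for i = 0 .. N-1; only the first
    element is applied, as in receding horizon control.
    """
    u0 = input_sequence[0]
    # Equations (107) and (108): equal and opposite torques at the hip.
    return u0, -u0
```

The equal and opposite torques reflect that a single hip actuator input u acts between the two legs in the compass model.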

なお、胴体１０２（関節１５０）の姿勢が崩れることを防止するため、サーボ制御部１６は、以下の式（１０７），（１０８）のように関節トルクτ_１及び関節トルクτ_２を決定してもよい。
ここで、θ_{ｔｏｒｓｏ}は、鉛直方向に対する胴体１０２の前方への傾き角度である。また、ｋ_ｐ及びｋ_ｄは、ゲイン（重み）であって、予め定められた定数である。このゲインは、制御装置２のコンピュータの性能に応じて、適宜調整可能である。
サーボ制御部１６は、決定された関節トルクとなるように、各関節部（股関節部１２０）のモータ１４０を制御する。 To prevent the posture of the body 102 (joint 150) from collapsing, the servo control unit 16 may determine the joint torque τ₁ and the joint torque τ₂ as in the following equations (107) and (108).
Here, θ_torso is the forward tilt angle of the body 102 with respect to the vertical direction. Further, k_p and k_d are gains (weights) and are predetermined constants. These gains can be adjusted as appropriate according to the performance of the computer of the control device 2.
The servo control unit 16 controls the motor 140 of each joint portion (hip joint portion 120) so as to obtain the determined joint torque.

以上のように、実施の形態１にかかる制御装置２は、非線形モデル予測制御のアルゴリズムを用いてロボット１００の動作を制御するに際し、指定したタイミングで遊脚が着地するようにロボット１００の状態を拘束するペナルティ関数（拘束パラメータ）を用いている。これにより、想定しているタイミングで、遊脚の着地という不連続な状態変化を起こさせることができる。したがって、元々の非線形モデル予測制御の理論を容易に適用でき、さらにＣ／ＧＭＲＥＳ法を適用することも可能となる。したがって、二足歩行ロボットのような非線形システムに対しても実時間で制御を行うことが可能となる。 As described above, when controlling the operation of the robot 100 using the nonlinear model predictive control algorithm, the control device 2 according to the first embodiment uses a penalty function (constraint parameter) that constrains the state of the robot 100 so that the swing leg lands at a designated timing. This makes it possible to cause the discontinuous state change, namely the landing of the swing leg, at the assumed timing. Therefore, the original theory of nonlinear model predictive control can be applied easily, and the C/GMRES method can also be applied. As a result, even a nonlinear system such as a bipedal walking robot can be controlled in real time.

また、実施の形態１にかかるペナルティ関数は、上記式（１０２）で示すように、予め指定したタイミングにおいて遊脚が着地したときのロボット１００の姿勢を指定している。これにより、想定した姿勢で遊脚を着地させるようにロボット１００を制御することが可能となる。さらに、実施の形態１にかかるペナルティ関数は、予め指定したタイミングにおいて遊脚が着地したときのロボット１００の各関節部の関節角度を指定している。これにより、想定した関節角度で遊脚を着地させるようにロボット１００の関節部を制御することが可能となる。 Further, as shown in the above equation (102), the penalty function according to the first embodiment specifies the posture of the robot 100 when the swing leg lands at a predetermined timing. This makes it possible to control the robot 100 so that the swing leg lands in the assumed posture. Further, the penalty function according to the first embodiment specifies the joint angle of each joint portion of the robot 100 when the swing leg lands at a predetermined timing. This makes it possible to control the joint portion of the robot 100 so that the swing leg lands at the assumed joint angle.

また、式（１０２）で示すように、ペナルティ関数は、調整可能なゲイン（ｐ_１及びｐ_２）を含む。指定されたタイミングで状態変化を確実に起こさせるため、ゲインは十分大きなスカラー値とする必要がある。しかしながら、制御装置２のコンピュータの性能等によっては、ゲインがあまりにも大きすぎると感度が過大となるため、不安定な制御となる可能性がある。したがって、ゲインを調整することにより、制御を安定化させることが可能となる。 Also, as shown in equation (102), the penalty function includes adjustable gains (p ₁ and p ₂ ). The gain must be a sufficiently large scalar value to ensure that the state changes at the specified timing. However, depending on the performance of the computer of the control device 2, if the gain is too large, the sensitivity becomes excessive, which may result in unstable control. Therefore, it is possible to stabilize the control by adjusting the gain.

（実施の形態２）
次に、実施の形態２について説明する。実施の形態２では、ロボット１００がより人間に似たヒューマノイドロボットである点で、実施の形態１と異なる。そして、実施の形態２では、膝を曲げて二足歩行を行うモデル（膝屈曲モデル）について、上述した非線形モデル予測制御を適用する。その他の点については、実施の形態１と実質的に同様であるので、説明を省略する。 (Embodiment 2)
Next, the second embodiment will be described. The second embodiment differs from the first embodiment in that the robot 100 is a humanoid robot more similar to a human. Then, in the second embodiment, the above-mentioned nonlinear model prediction control is applied to a model (knee flexion model) in which the knee is bent to perform bipedal walking. Since other points are substantially the same as those in the first embodiment, the description thereof will be omitted.

図１１は、実施の形態２にかかるロボット１００を示す図である。実施の形態１にかかるロボット１００と同様に、実施の形態２にかかるロボット１００は、胴体１０２と、右脚１１０Ｒと、左脚１１０Ｌとを有する。右脚１１０Ｒ及び左脚１１０Ｌの構成については、実施の形態１のものと同様である。また、胴体１０２は、腰部１０２ａと、胸部１０２ｂと、腰関節部１０２ｃとを有する。また、胴体１０２の上側には、頭部１０４が設けられている。この頭部１０４に、カメラ等のセンサが設けられていてもよい。 FIG. 11 is a diagram showing a robot 100 according to the second embodiment. Similar to the robot 100 according to the first embodiment, the robot 100 according to the second embodiment has a body 102, a right leg 110R, and a left leg 110L. The configurations of the right leg 110R and the left leg 110L are the same as those of the first embodiment. The torso 102 also has a waist 102a, a chest 102b, and a lumbar joint 102c. A head 104 is provided on the upper side of the body 102. A sensor such as a camera may be provided on the head 104.

また、ロボット１００は、胴体１０２（胸部１０２ｂ）の右側及び左側に、それぞれ右腕１６０Ｒ及び左腕１６０Ｌを有する。実施の形態２にかかるロボット１００は、右腕１６０Ｒ及び左腕１６０Ｌを用いて、所定の動作を行うことが可能である。なお、図１１には図示されていないが、右腕１６０Ｒ及び左腕１６０Ｌの先端に、対象物を把持することが可能なエンドエフェクタが設けられていてもよい。 Further, the robot 100 has a right arm 160R and a left arm 160L on the right side and the left side of the body 102 (chest 102b), respectively. The robot 100 according to the second embodiment can perform a predetermined operation by using the right arm 160R and the left arm 160L. Although not shown in FIG. 11, end effectors capable of gripping an object may be provided at the tips of the right arm 160R and the left arm 160L.

右腕１６０Ｒは、胴体１０２に近い方から順に、肩関節部１７０Ｒと、上腕部１６２Ｒと、肘関節部１７２Ｒと、前腕部１６４Ｒとを有する。同様に、左腕１６０Ｌは、胴体１０２に近い方から順に、肩関節部１７０Ｌと、上腕部１６２Ｌと、肘関節部１７２Ｌと、前腕部１６４Ｌとを有する。肩関節部１７０Ｒ及び肩関節部１７０Ｌは、胴体１０２の右側及び左側にそれぞれ取り付けられている。そして、肩関節部１７０Ｒ及び肩関節部１７０Ｌを介して、それぞれ、上腕部１６２Ｒ及び上腕部１６２Ｌが胴体１０２と接続されている。言い換えると、右腕１６０Ｒ及び左腕１６０Ｌは、それぞれ、肩関節部１７０Ｒ及び肩関節部１７０Ｌを介して、胴体１０２と接続されている。 The right arm 160R has, in order from the side closest to the body 102, a shoulder joint portion 170R, an upper arm portion 162R, an elbow joint portion 172R, and a forearm portion 164R. Similarly, the left arm 160L has, in order from the side closest to the body 102, a shoulder joint portion 170L, an upper arm portion 162L, an elbow joint portion 172L, and a forearm portion 164L. The shoulder joint portion 170R and the shoulder joint portion 170L are attached to the right side and the left side of the body 102, respectively. The upper arm portion 162R and the upper arm portion 162L are connected to the body 102 via the shoulder joint portion 170R and the shoulder joint portion 170L, respectively. In other words, the right arm 160R and the left arm 160L are connected to the body 102 via the shoulder joint portion 170R and the shoulder joint portion 170L, respectively.

また、肘関節部１７２Ｒを介して、上腕部１６２Ｒと前腕部１６４Ｒとが接続されている。同様に、肘関節部１７２Ｌを介して、上腕部１６２Ｌと前腕部１６４Ｌとが接続されている。また、肩関節部１７０は、互いに直交した３軸の周りをそれぞれ回転するように構成され得る。また、肘関節部１７２は、１軸の周りを回転する。また、図２に示したように、肩関節部１７０及び肘関節部１７２は、角度センサ１３０と、モータ１４０とを有する。 Further, the upper arm portion 162R and the forearm portion 164R are connected via the elbow joint portion 172R. Similarly, the upper arm portion 162L and the forearm portion 164L are connected via the elbow joint portion 172L. Further, the shoulder joint portion 170 may be configured to rotate around three axes orthogonal to each other. Further, the elbow joint portion 172 rotates around one axis. Further, as shown in FIG. 2, the shoulder joint portion 170 and the elbow joint portion 172 have an angle sensor 130 and a motor 140.

図１２は、実施の形態２にかかるロボット１００を膝屈曲モデルに適用した状態を示す図である。図１２に示した例では、右脚１１０Ｒが支持脚であり、左脚１１０Ｌが遊脚である。制御装置２は、支持脚が地面と点接触していることを模擬するため、支持脚（図１２の例では右脚１１０Ｒ）の足首関節部１２４に設けられたトルクセンサ１３６を用いて、足首関節部１２４のトルクを０に制御する。さらに、制御装置２は、遊脚（図１２の例では左脚１１０Ｌ）の足裏センサ１１８を用いて、遊脚の着地を検出する。ここで、実施の形態１とは異なり、実施の形態２では、膝関節部１２２は、ロックされていない。このようにして、ロボットシステム１は、膝屈曲モデルを模擬することができる。 FIG. 12 is a diagram showing a state in which the robot 100 according to the second embodiment is applied to the knee flexion model. In the example shown in FIG. 12, the right leg 110R is a support leg and the left leg 110L is a free leg. The control device 2 uses a torque sensor 136 provided at the ankle joint portion 124 of the support leg (right leg 110R in the example of FIG. 12) to simulate that the support leg is in point contact with the ground. The torque of the joint portion 124 is controlled to 0. Further, the control device 2 detects the landing of the swing leg by using the sole sensor 118 of the swing leg (left leg 110L in the example of FIG. 12). Here, unlike the first embodiment, in the second embodiment, the knee joint portion 122 is not locked. In this way, the robot system 1 can simulate a knee flexion model.

FIG. 13 is a diagram showing an example in which the robot 100 according to the second embodiment is applied to the knee flexion model. In the example shown in FIG. 13, the robot 100 is modeled as a knee flexion model composed of a support leg lower leg link 201, a support leg upper leg link 202, a torso link 203, a swing leg upper leg link 204, a swing leg lower leg link 205, and joints 211, 212, 213, 214, and 215. Here, the torso link 203 corresponds to the components of the robot 100 from the torso 102 upward.

The support leg lower leg link 201 corresponds to the lower leg 114 of whichever of the right leg 110R and the left leg 110L is the support leg. The support leg upper leg link 202 corresponds to the upper leg 112 of the support leg. The swing leg upper leg link 204 corresponds to the upper leg 112 of whichever of the right leg 110R and the left leg 110L is the swing leg. The swing leg lower leg link 205 corresponds to the lower leg 114 of the swing leg.

The joint 211 corresponds to the ankle joint 124 of the support leg. The joint 212 corresponds to the knee joint 122 of the support leg. The joint 213 corresponds to the hip joint 120. The joint 214 corresponds to the knee joint 122 of the swing leg. The joint 215 corresponds to the ankle joint 124 of the swing leg. Here, the joint 211 is in contact with the ground 90. The position of the joint 211, that is, the position of the tip of the support leg, is denoted (X_b, Y_b). The position of the joint 215, that is, the position of the tip of the swing leg, is denoted (X_c, Y_c).

Here, a parameter k (k = 1 to 5) is introduced to distinguish the links and the joints 211 to 215. k = 1 corresponds to the support leg lower leg link 201 and the joint 211. Similarly, k = 2 corresponds to the support leg upper leg link 202 and the joint 212, k = 3 corresponds to the torso link 203 and the joint 213, k = 4 corresponds to the swing leg upper leg link 204 and the joint 214, and k = 5 corresponds to the swing leg lower leg link 205 and the joint 215.

Accordingly, the support leg lower leg link 201, the support leg upper leg link 202, the torso link 203, the swing leg upper leg link 204, and the swing leg lower leg link 205 may be referred to as links #1, #2, #3, #4, and #5, respectively, and a link may be referred to generically as link #k. Similarly, the joints 211 to 215 may be referred to as joints #1, #2, #3, #4, and #5, respectively, and a joint may be referred to generically as joint #k.

Let the lengths (link lengths) of the support leg lower leg link 201, the support leg upper leg link 202, the torso link 203, the swing leg upper leg link 204, and the swing leg lower leg link 205 be l_1, l_2, l_3, l_4, and l_5, respectively. Let the masses (link masses) of these links be m_1, m_2, m_3, m_4, and m_5, respectively. Let the moments of inertia of these links about their respective centers of gravity be I^m_1, I^m_2, I^m_3, I^m_4, and I^m_5, respectively. Note that l_k, m_k, and I^m_k are predetermined values.

Let the angles (link angles) of the support leg lower leg link 201, the support leg upper leg link 202, the torso link 203, the swing leg upper leg link 204, and the swing leg lower leg link 205 with respect to the vertical direction be θ_1, θ_2, θ_3, θ_4, and θ_5, respectively. The link angles θ_1 to θ_5 can be calculated geometrically and uniquely from the joint angles detected by the angle sensors 130 of the joints of the robot 100 (the hip joints 120, the knee joints 122, and the ankle joints 124). The link angles θ_1 to θ_5 therefore correspond to the joint angles of the joints of the robot 100.

Let the position of the center of gravity 201m of the support leg lower leg link 201 be (X_c1, Y_c1), the position of the center of gravity 202m of the support leg upper leg link 202 be (X_c2, Y_c2), the position of the center of gravity 203m of the torso link 203 be (X_c3, Y_c3), the position of the center of gravity 204m of the swing leg upper leg link 204 be (X_c4, Y_c4), and the position of the center of gravity 205m of the swing leg lower leg link 205 be (X_c5, Y_c5). Further, for each of k = 1 to 5, let r_k be the distance from joint #k to the center of gravity #k (point mass) of link #k. This r_k is a predetermined value.

Here, the position (X_ck, Y_ck) of the center of gravity #k is expressed by the following equations (109) and (110). The second term of equations (109) and (110) is taken to be 0 when k < 4.
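As a rough illustration of how such center-of-gravity positions follow from the link angles, the Python sketch below walks the five-link chain of FIG. 13 from the support leg tip (X_b, Y_b). It is not a transcription of equations (109) and (110): the axis directions, the sign of the angles, and the function name `link_com_positions` are assumptions made for this example. The only detail taken from the text is that r_k is measured from joint #k, which for the swing leg means from the swing knee (joint #4) and the swing ankle (joint #5).

```python
import math

def link_com_positions(theta, l, r, base=(0.0, 0.0)):
    """Joint and centre-of-gravity positions of a planar five-link biped.

    theta[k], l[k], r[k] are the link angle from the vertical, the link
    length and the joint-to-centre-of-gravity distance of link #(k+1).
    Assumed conventions (hypothetical, chosen for this sketch only):
      * X points forward, Y points up, angles are positive toward +X;
      * links #1 to #3 extend upward from joints #1 to #3;
      * links #4 and #5 hang below the hip, and r[3], r[4] are measured
        upward from joint #4 (swing knee) and joint #5 (swing ankle).
    """
    def step(p, length, angle):
        # Move a signed distance along a link tilted by `angle` from vertical.
        return (p[0] + length * math.sin(angle), p[1] + length * math.cos(angle))

    ankle_s = base                                 # joint #1, (X_b, Y_b)
    knee_s = step(ankle_s, l[0], theta[0])         # joint #2
    hip = step(knee_s, l[1], theta[1])             # joint #3
    knee_w = step(hip, -l[3], theta[3])            # joint #4
    ankle_w = step(knee_w, -l[4], theta[4])        # joint #5, (X_c, Y_c)

    joints = [ankle_s, knee_s, hip, knee_w, ankle_w]
    coms = [step(joints[k], r[k], theta[k]) for k in range(5)]
    return joints, coms

if __name__ == "__main__":
    joints, coms = link_com_positions(
        theta=[0.1, 0.05, 0.0, -0.2, -0.1],
        l=[0.4, 0.4, 0.6, 0.4, 0.4],
        r=[0.2, 0.2, 0.3, 0.2, 0.2],
    )
    print("swing leg tip (X_c, Y_c):", joints[4])
    print("centres of gravity:", coms)
```

Under these conventions the torso and both legs branch from the hip, so the swing leg positions depend on θ_1 and θ_2 only through the hip position.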

Differentiating equations (109) and (110) with respect to time gives the velocity of the center of gravity #k, as shown in the following equations (111) and (112).

In this case, the Lagrangian Γ can be expressed as in the following equation (113), where g is the gravitational acceleration.

Lagrange's equation of motion is then expressed as the following equation (114).
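For orientation, a planar link model of this kind typically has a Lagrangian and an equation of motion of the following general shape, written with the symbols defined above. This is the textbook form, given here as an assumption rather than as a transcription of equations (113) and (114); the generalized force symbol \tau is introduced only for this sketch, and its relation to the joint torque vector is left generic.

```latex
\Gamma = \sum_{k=1}^{5} \left[ \tfrac{1}{2}\, m_k \left( \dot{X}_{ck}^{2} + \dot{Y}_{ck}^{2} \right)
       + \tfrac{1}{2}\, I^{m}_{k}\, \dot{\theta}_k^{2} \right]
       - \sum_{k=1}^{5} m_k\, g\, Y_{ck},
\qquad
\frac{d}{dt}\frac{\partial \Gamma}{\partial \dot{q}} - \frac{\partial \Gamma}{\partial q} = \tau
```

The first sum is the translational and rotational kinetic energy of each link about its center of gravity, and the second sum is the gravitational potential energy.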

Note that q and η are the vectors shown below.
Here, η is the joint torque vector. η_1 is the torque of the ankle joint 124. η_2 is the torque of the knee joint 122 of the support leg. η_3 is the torque of the hip joint 120 of the support leg, in other words, the torque of the torso link 203 relative to the support leg. η_4 is the torque of the hip joint 120 of the swing leg, and η_5 is the torque of the knee joint 122 of the swing leg.

The following equation (115) can then be derived from equation (114). That is, in the knee flexion model according to the second embodiment, equation (89) of the compass model is modified as shown in the following equation (115). Note that M(q) and H(q, q̇) can be modified to correspond to the knee flexion model.

Here, N is the matrix represented by the following equation (116).

Further, since equation (91) also holds in the knee flexion model according to the second embodiment, Q^+ and Q^- can also be modified to correspond to the knee flexion model. The terminal cost φ and the stage cost L of the evaluation function J shown in equations (95) and (96) can likewise be modified to correspond to the knee flexion model.
As for the penalty term in the knee flexion model according to the second embodiment, the term shown in equation (102) is modified to the one shown in the following equation (117). Here, q_ref indicates the target posture when the swing leg lands. In other words, q_ref is a vector of target values (target angles) of the link angles θ_1, θ_2, θ_3, θ_4, and θ_5, which correspond to the joint angles of the respective joints, at the moment the swing leg lands. Specifically, q_ref is a q such that, at the moment the swing leg lands, X_c - X_b equals the desired stride length and Y_c = Y_b (the tip of the swing leg is at the same height as the tip of the support leg).

Here, R is the penalty weight matrix represented by the following equation (118). Note that p_1 to p_5 are predetermined values.
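To make the role of q_ref and the weights p_1 to p_5 concrete, the sketch below evaluates a weighted squared deviation of the posture q from the target posture q_ref. It assumes that the penalty is quadratic and that R is diagonal with entries p_1 to p_5; the exact form of equations (117) and (118), including any scale factor or gain, is not taken from the source, and the function name `landing_penalty` is hypothetical.

```python
def landing_penalty(q, q_ref, p):
    """Return (q - q_ref)^T R (q - q_ref) for R = diag(p).

    q and q_ref are sequences of link angles [rad]; p holds the penalty
    weights. A diagonal R is an assumption made for this sketch.
    """
    if not (len(q) == len(q_ref) == len(p)):
        raise ValueError("q, q_ref and p must have the same length")
    return sum(pk * (qk - rk) ** 2 for qk, rk, pk in zip(q, q_ref, p))

if __name__ == "__main__":
    q_ref = [0.16, 0.16, 0.0, -0.16, -0.16]   # illustrative target posture
    q = [0.10, 0.18, 0.02, -0.12, -0.16]      # illustrative posture at landing
    p = [10.0, 10.0, 1.0, 10.0, 10.0]         # illustrative weights
    print(landing_penalty(q, q_ref, p))
```

A larger p_k pulls the corresponding link angle more strongly toward its target at the landing instant, at the expense of the other terms of the evaluation function.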

Then, in the process of S102 in the flowchart shown in FIG. 10, the state acquisition unit 12 acquires the joint angle of each joint from the angle sensors 130 of the hip joints 120, the knee joints 122, and the ankle joints 124. From these joint angles, the state acquisition unit 12 calculates θ_1 to θ_5 (FIG. 13) at the current time t and their amounts of change, and thereby acquires the state vector x(t). The nonlinear model predictive control unit 14 can then perform the calculations of S104 to S114 using the modified functions to calculate the joint torques η_1 to η_5, and the servo control unit 16 can control the motor of each joint based on the calculated joint torques η_1 to η_5.
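The state acquisition in S102 can be pictured with the following sketch, which converts joint angles to link angles and stacks them with their rates of change. Several points are assumptions made for illustration: that each joint angle is the rotation of a link relative to the preceding link in the chain (so that link angles are cumulative sums), that rates are obtained by a backward difference between two samples, and that the state vector lists the angles first and the rates second. The function names are hypothetical.

```python
from itertools import accumulate

def link_angles(joint_angles):
    """Link angles from the vertical, assuming each joint angle is the
    rotation of a link relative to the previous link in the chain
    (support ankle, support knee, support hip, swing hip, swing knee)."""
    return list(accumulate(joint_angles))

def state_vector(theta_now, theta_prev, dt):
    """Stack link angles and backward-difference rates into one list."""
    if dt <= 0:
        raise ValueError("dt must be positive")
    rates = [(a - b) / dt for a, b in zip(theta_now, theta_prev)]
    return list(theta_now) + rates

if __name__ == "__main__":
    prev = link_angles([0.10, 0.05, -0.15, -0.20, 0.10])
    now = link_angles([0.11, 0.05, -0.16, -0.19, 0.10])
    print(state_vector(now, prev, dt=0.001))
```

With five links this gives a ten-element state vector, compared with the four-element vector used for the compass model in the simulation.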

As described above, the state equation, the penalty term, and the like of the compass model according to the first embodiment can be replaced with those of the knee flexion model according to the second embodiment. As a result, the nonlinear model predictive control algorithm used to control the compass model can also be used to control the robot 100 under the knee flexion model according to the second embodiment. Therefore, with the knee flexion model according to the second embodiment as well, nonlinear model predictive control can be performed in real time, as in the first embodiment.

(Simulation results)
Next, the results of a simulation performed on a nonlinear system using the nonlinear model predictive control algorithm according to the present embodiment will be described. In the simulation described below, the nonlinear model predictive control algorithm according to the present embodiment is applied to the robot 100 under the compass model according to the first embodiment.

Table 1 shows the physical parameters of the compass model used in the simulation, and Table 2 shows the weights of the evaluation function of the nonlinear model predictive control used in the simulation. The target value of the leg opening angle is θ_ref = 0.32 [rad], and the walking cycle is T_step = 0.8 [s]. The initial value of the state vector shown in equation (S2) is x(t) = [-0.166, 0.165, 0.6, 0.75]^T.

Table 3 shows the parameters used in the numerical calculation of the C/GMRES method. Here, h_dir is the step h of the forward difference approximation in the GMRES method shown in equation (86), and r_tol is the tolerance on the optimality-condition residual at the start of the simulation. The simulation time was 10 [s]. For the evaluation interval T(t) shown in equation (3), T_f = 0.8 [s] and α = 1.0 were used.
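The effect of T_f and α can be illustrated with the horizon schedule commonly used with the C/GMRES method, in which the evaluation interval starts at zero length and grows smoothly toward T_f. The form T(t) = T_f (1 - exp(-α t)) used below is an assumption, since equation (3) is not reproduced in this part of the description; the function name is hypothetical.

```python
import math

def horizon_length(t, t_f=0.8, alpha=1.0):
    """Evaluation-interval length T(t) = t_f * (1 - exp(-alpha * t)).

    Assumed form; t_f [s] and alpha [1/s] default to the simulation values.
    """
    return t_f * (1.0 - math.exp(-alpha * t))

if __name__ == "__main__":
    for t in (0.0, 0.5, 1.0, 2.0, 5.0, 10.0):
        print(f"t = {t:4.1f} s  ->  T(t) = {horizon_length(t):.4f} s")
```

Under this assumption the horizon approaches the walking cycle T_step = 0.8 [s] within the first few seconds of the 10 [s] simulation.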

FIGS. 14 to 19 are diagrams showing simulation results obtained by applying the nonlinear model predictive control algorithm to the nonlinear system according to the present embodiment. FIG. 20 is a diagram showing a graph of the control input in the steady state in the simulation results. FIG. 14 shows the simulation results obtained by applying the nonlinear model predictive control algorithm according to the present embodiment to the robot 100 under the compass model according to the first embodiment, under the conditions shown in Tables 1 to 3. FIGS. 14 to 17 show the simulation results for θ_1, θ_2, θ̇_1, and θ̇_2, respectively. FIG. 18 shows the simulation result for the control input value u. FIG. 19 shows the simulation result for ||F|| (the error norm), which is the magnitude of the vector F represented by equation (81).

In FIG. 20, state jumps can be seen at the points where the graph rises vertically, near t = 8.0 [s] and near t = 8.8 [s]. Thus, in this simulation, periodic state jumps were successfully produced in the steady state. The graph of u(t) in FIG. 18 also shows that steady walking is simulated from t = 5 [s] onward.

(Modifications)
The present invention is not limited to the above embodiments and can be modified as appropriate without departing from its spirit. For example, each leg of the robot 100 has been described as having a hip joint 120, a knee joint 122, and an ankle joint 124, but the configuration is not limited to this. A leg of the robot 100 may have fewer than three joints or more than three joints. In that case, the state vector and the joint torque vector (control input value) can be changed as appropriate according to the number of joints, and functions such as the state equation and the penalty function can likewise be changed as appropriate according to the number of joints.

In the embodiments described above, an example in which the nonlinear system is a bipedal walking robot has been described, but the nonlinear model predictive control algorithm according to the present embodiment is also applicable to nonlinear systems other than bipedal walking robots. That is, the control method for a nonlinear system according to the present embodiment is applicable to any nonlinear system involving state jumps, such as those exemplified below. As described above, it suffices to designate, in the nonlinear model predictive control, the timing at which the state jump occurs.

For example, the nonlinear system according to the present embodiment may be a robot hand, a robot arm, or the like, such as the arm (right arm 160R or left arm 160L) of the robot 100 shown in FIG. 11. A state jump in this example can occur when the robot hand or robot arm presses against an object such as the surrounding environment or an operation target, when it grips or releases an object, when it strikes or returns an object such as a ball, and so on. A table tennis robot is one example of a nonlinear system that returns an object such as a ball.

Further, for example, the nonlinear system according to the present embodiment may be a humanoid robot, an animal-type robot, or the like that can move with its arms and legs on the floor or the like at the same time. A state jump in this example can occur when the humanoid or animal-type robot moves while its arms and legs simultaneously contact a wall, floor, table, or the like, or when the robot climbs a ladder, a wall, or the like.

Further, for example, the nonlinear system according to the present embodiment may be an unmanned aerial vehicle such as a drone. A state jump in this example can occur when the unmanned aerial vehicle comes into contact with or separates from an object to be operated or inspected, when it grips or releases an object to be transported or captured, and so on.

Further, for example, the nonlinear system according to the present embodiment may be a tool of a processing machine or the like. A state jump in this example can occur when the tool of the processing machine comes into contact with or separates from an object such as a workpiece.

Further, for example, the nonlinear system according to the present embodiment may be an automobile transmission or the like. A state jump in this example can occur when the clutch of the transmission enters an engaged state (a state in which power is transmitted) or a disengaged state (a state in which power is cut off).

Further, for example, the nonlinear system according to the present embodiment may be the power source of a hybrid vehicle or the like. A state jump in this example can occur when the power source of the hybrid vehicle switches between the motor and the engine. Further, for example, the nonlinear system according to the present embodiment may be the battery of an electric vehicle, a hybrid vehicle, or the like. A state jump in this example can occur when the battery switches between charging and discharging.

Further, for example, the nonlinear system according to the present embodiment may be an automated driving system for an automobile or the like. A state jump in this example can occur when the presence or absence of a preceding or following vehicle changes due to a lane change, merging, or the like of the own vehicle. A state jump can also occur when, in a case where a collision with an object is unavoidable, control is performed so as to take the best possible action, taking the post-collision situation into account.

Further, for example, the nonlinear system according to the present embodiment may be an airplane or the like. A state jump in this example can occur when, during takeoff or landing, control is performed so as to optimize the motion both before and after ground contact. A specific case is controlling the engine and the airframe so that the airplane lands along a desired path and decelerates promptly after landing.

Further, for example, the nonlinear system according to the present embodiment may be a train or the like. A state jump in this example can occur when, in coupling trains, control is performed so as to optimize the motion both before and after coupling. A specific case is controlling the motor so as to reduce the impact at the time of coupling and the load on the drive motor.

The control method shown in FIG. 10 can be executed for any nonlinear system such as those described above. In this case, the control method for the nonlinear system obtains the vector U(t) represented by equation (80) from the stationarity conditions represented by equations (69) to (79), and calculates the value of each component of the vector U(t) by solving the equation represented by equation (81). The control input value for controlling the nonlinear system is thereby calculated. In the flowchart shown in FIG. 10, S102 corresponds to an acquisition step of acquiring a state parameter of the nonlinear system, S104 to S114 correspond to a calculation step of calculating a control input value for controlling the nonlinear system, and S116 corresponds to a control step of controlling the nonlinear system. In the calculation step, the control input value is calculated using a constraint parameter that constrains the state of the nonlinear system so that the state changes discontinuously at a designated timing. At this time, for each control cycle of the nonlinear system, the rate of change of the optimal solution of the control input value over a predetermined evaluation interval in the model predictive control algorithm is calculated, the optimal solution of the control input value in the control cycle following that control cycle is calculated using the rate of change, and the current control input value is calculated from the optimal solution.
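The three steps and the use of the rate of change of the optimal solution can be illustrated on a deliberately small problem. The sketch below is not the patent's controller: the scalar residual F(u, x) = u^3 + u - x, the state trajectory x(t) = sin t, and all function names are hypothetical stand-ins, and a scalar division replaces the GMRES solve. What it does show is the continuation idea: in each control cycle the rate of change of the solution is obtained from a forward-difference linear equation, and the solution for the next cycle is obtained by integrating that rate, without re-solving F = 0 from scratch.

```python
import math

def residual(u, x):
    """Toy scalar optimality condition F(u, x) = 0 (hypothetical)."""
    return u ** 3 + u - x

def measured_state(t):
    """Toy state trajectory standing in for the acquisition step."""
    return math.sin(t)

def run(t_end=5.0, dt=0.01, h=1e-3, zeta=50.0):
    t, u = 0.0, 0.0          # u = 0 solves F(u, x(0)) = 0 exactly
    applied = []             # inputs handed to the (absent) plant
    worst = 0.0              # largest residual seen after an update
    while t < t_end:
        x = measured_state(t)                        # acquisition step
        x_dot = (measured_state(t + h) - x) / h      # predicted state change
        # Calculation step: forward-difference form of
        #   dF/du * u_dot = -zeta * F - dF/dx * x_dot
        f = residual(u, x)
        f_shift = residual(u, x + h * x_dot)
        rhs = -zeta * f - (f_shift - f) / h
        slope = (residual(u + h, x + h * x_dot) - f_shift) / h
        u_dot = rhs / slope      # rate of change of the solution
        u += dt * u_dot          # solution for the next control cycle
        applied.append(u)        # control step
        t += dt
        worst = max(worst, abs(residual(u, measured_state(t))))
    return u, worst, applied

if __name__ == "__main__":
    u, worst, _ = run()
    print("final input:", u, "largest residual:", worst)
```

The stabilizing term -zeta * F drives any accumulated residual back toward zero, which is why the solution can be tracked across cycles with a single linear solve per cycle.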

In the above examples, the program can be stored using various types of non-transitory computer readable media and supplied to a computer. Non-transitory computer readable media include various types of tangible storage media. Examples of non-transitory computer readable media include magnetic recording media (for example, flexible disks, magnetic tapes, and hard disk drives), magneto-optical recording media (for example, magneto-optical disks), CD-ROM (Read Only Memory), CD-R, CD-R/W, and semiconductor memories (for example, mask ROM, PROM (Programmable ROM), EPROM (Erasable PROM), flash ROM, and RAM (Random Access Memory)). The program may also be supplied to a computer by various types of transitory computer readable media. Examples of transitory computer readable media include electrical signals, optical signals, and electromagnetic waves. A transitory computer readable medium can supply the program to a computer via a wired communication path such as an electric wire or an optical fiber, or via a wireless communication path.

1 ... robot system, 2 ... control device, 12 ... state acquisition unit, 14 ... nonlinear model predictive control unit, 16 ... servo control unit, 100 ... robot, 102 ... torso, 110L ... left leg, 110R ... right leg, 112 ... upper leg, 114 ... lower leg, 116 ... foot, 118 ... sole sensor, 120 ... hip joint, 122 ... knee joint, 124 ... ankle joint, 130 ... angle sensor, 136 ... torque sensor, 140 ... motor, 150 ... joint, 151 ... support leg link, 152 ... swing leg link, 201 ... support leg lower leg link, 202 ... support leg upper leg link, 203 ... torso link, 204 ... swing leg upper leg link, 205 ... swing leg lower leg link, 211, 212, 213, 214, 215 ... joints

Claims

A control method for a nonlinear system, comprising:
an acquisition step of acquiring a state parameter indicating a state of the nonlinear system;
a calculation step of calculating, based on the acquired state parameter, a control input value for controlling the nonlinear system using a model predictive control algorithm; and
a control step of controlling the nonlinear system using the calculated control input value,
wherein, in the calculation step, using a constraint parameter that constrains the state of the nonlinear system so that the state changes discontinuously at a designated timing, a rate of change of an optimal solution of the control input value over a predetermined evaluation interval in the model predictive control algorithm is calculated for each control cycle of the nonlinear system, the optimal solution of the control input value in the control cycle following that control cycle is calculated using the rate of change, and the current control input value is calculated from the optimal solution.

A control device for a bipedal walking robot, the control device controlling the operation of a bipedal walking robot capable of walking on two legs and comprising:
state acquisition means for acquiring a state parameter indicating a state related to walking of the bipedal walking robot;
calculation means for calculating, based on the acquired state parameter, a control input value for controlling the operation of the bipedal walking robot using a model predictive control algorithm; and
control means for controlling the operation of the bipedal walking robot using the calculated control input value,
wherein the calculation means, using a constraint parameter that constrains the state of the bipedal walking robot so that the swing leg of the two legs lands at a designated timing, calculates, for each control cycle of the bipedal walking robot, a rate of change of an optimal solution of the control input value over a predetermined evaluation interval in the model predictive control algorithm, calculates the optimal solution of the control input value in the control cycle following that control cycle using the rate of change, and calculates the current control input value from the optimal solution.

The control device for a bipedal walking robot according to claim 2, wherein the constraint parameter is included in an evaluation function used in the model predictive control algorithm.

The control device for a bipedal walking robot according to claim 2 or 3, wherein the constraint parameter specifies the posture of the bipedal walking robot when the swing leg lands at the timing.

The control device for a bipedal walking robot according to claim 4, wherein the constraint parameter specifies target angles of the joints of the two legs when the swing leg lands at the timing.

The control device for a bipedal walking robot according to any one of claims 2 to 5, wherein the constraint parameter includes an adjustable gain.

A control method for a bipedal walking robot, the method controlling the operation of a bipedal walking robot capable of walking on two legs and comprising:
an acquisition step of acquiring a state parameter indicating a state related to walking of the bipedal walking robot;
a calculation step of calculating, based on the acquired state parameter, a control input value for controlling the operation of the bipedal walking robot using a model predictive control algorithm; and
a control step of controlling the operation of the bipedal walking robot using the calculated control input value,
wherein, in the calculation step, using a constraint parameter that constrains the state of the bipedal walking robot so that the swing leg of the two legs lands at a designated timing, a rate of change of an optimal solution of the control input value over a predetermined evaluation interval in the model predictive control algorithm is calculated for each control cycle of the bipedal walking robot, the optimal solution of the control input value in the control cycle following that control cycle is calculated using the rate of change, and the current control input value is calculated from the optimal solution.

A program for implementing a control method for a bipedal walking robot that controls the operation of a bipedal walking robot capable of walking on two legs, the program causing a computer to execute:
an acquisition step of acquiring a state parameter indicating a state related to walking of the bipedal walking robot;
a calculation step of calculating, based on the acquired state parameter, a control input value for controlling the operation of the bipedal walking robot using a model predictive control algorithm, in which, using a constraint parameter that constrains the state of the bipedal walking robot so that the swing leg of the two legs lands at a designated timing, a rate of change of an optimal solution of the control input value over a predetermined evaluation interval in the model predictive control algorithm is calculated for each control cycle of the bipedal walking robot, the optimal solution of the control input value in the control cycle following that control cycle is calculated using the rate of change, and the current control input value is calculated from the optimal solution; and
a control step of controlling the operation of the bipedal walking robot using the calculated control input value.