JP6326167B1

JP6326167B1 - Learning device, learning method, learning program, moving image distribution device, activity device, activity program, and moving image generation device

Info

Publication number: JP6326167B1
Application number: JP2017071117A
Authority: JP
Inventors: 量生川上; 寛明齊藤; 慶介大垣
Original assignee: Dwango Co Ltd
Current assignee: Dwango Co Ltd
Priority date: 2017-03-31
Filing date: 2017-03-31
Publication date: 2018-05-16
Anticipated expiration: 2037-03-31
Also published as: JP2018173790A

Abstract

[Problem] To learn the movements of an object. [Solution] A learning device 1 includes: a motion generation unit 22 that generates motion data 16 associating an identifier of an action to be performed by an object O with a target for the combination of movements of the operating parts that performs the action; and an update unit 23 that attempts the movements of the operating parts of the object O according to the motion data 16, outputs an evaluation value for the action, and stores the action identifier, the motion data, and the evaluation value in learning data T in association with one another. The motion generation unit 22 generates new motion data by referring to the learning data T associated with the identifier of the action to be performed by the object. [Selected drawing] FIG. 3

Description

The present invention relates to a learning device, a learning method, a learning program, a moving image distribution device, an activity device, an activity program, and a moving image generation device with which an object having operating parts learns in a virtual region.

In recent years, machine learning has become widespread, and techniques are known for automatically constructing thinking routines for robots and for non-player characters (NPCs) that appear in various games (see, for example, Patent Document 1). Patent Document 1 discloses a method of automatically constructing a thinking routine that determines the action an agent takes in a given situation.

Patent Document 1: Japanese Patent No. 5874292

It is known that the learning of an object called an artificial life form is divided into two stages: a decision-making learning stage that determines the action to take, and a learning stage that learns the series of movements for carrying out that action. However, the technique described in Patent Document 1 does not address the learning of the series of movements at all.

Accordingly, an object of the present invention is to provide a learning device, a learning method, a learning program, a moving image distribution device, an activity device, an activity program, and a moving image generation device with which an object learns movements.

To solve the above problem, a first feature of the present invention relates to a learning device with which an object having operating parts learns. The learning device according to the first feature includes: a motion generation unit that generates motion data associating an identifier of an action to be performed by the object with a target for the combination of movements of the operating parts that performs the action; and an update unit that attempts the movements of the object's operating parts according to the motion data, outputs an evaluation value for the action, and stores the action identifier, the motion data, and the evaluation value in learning data in association with one another. The motion generation unit generates new motion data by referring to the learning data associated with the identifier of the action to be performed by the object.

The object may act in a virtual region that includes a plurality of partial regions, each associated with an environment value identifier for that partial region. The learning data may store the environment value identifier of the partial region in association with the action identifier, the motion data, and the evaluation value, and the motion generation unit may generate new motion data by referring to the learning data associated with the environment value identifier of the partial region in which the object is located.

The motion data may include, for an operating part, a plurality of pairs each associating a movement of the operating part with the time during which that movement is attempted, and the update unit may output the evaluation value after attempting the movements based on the plurality of pairs in the motion data.

The learning device may further include a distribution unit that sequentially generates and distributes moving image data showing the object attempting the movements of its operating parts according to the motion data.

A second feature of the present invention relates to a learning method with which an object having operating parts learns. The learning method according to the second feature includes: a step in which a computer generates motion data associating an identifier of an action to be performed by the object with a target for the combination of movements of the operating parts that performs the action; a step in which the computer attempts the movements of the object's operating parts according to the motion data, outputs an evaluation value for the action, and stores the action identifier, the motion data, and the evaluation value in learning data in association with one another; and a step in which the computer generates new motion data by referring to the learning data associated with the identifier of the action to be performed by the object.

A third feature of the present invention relates to a learning program with which an object having operating parts learns. The learning program according to the third feature causes a computer to function as: a motion generation unit that generates motion data associating an identifier of an action to be performed by the object with a target for the combination of movements of the operating parts that performs the action; and an update unit that attempts the movements of the object's operating parts according to the motion data, outputs an evaluation value for the action, and stores the action identifier, the motion data, and the evaluation value in learning data in association with one another. The motion generation unit generates new motion data by referring to the learning data associated with the identifier of the action to be performed by the object.

A fourth feature of the present invention relates to a moving image distribution device connected to a learning device with which an object having operating parts learns. In the fourth feature, the learning device includes: a motion generation unit that generates motion data associating an identifier of an action to be performed by the object with a target for the combination of movements of the operating parts that performs the action; and an update unit that attempts the movements of the object's operating parts according to the motion data, outputs an evaluation value for the action, and stores the action identifier, the motion data, and the evaluation value in learning data in association with one another. The motion generation unit generates new motion data by referring to the learning data associated with the identifier of the action to be performed by the object. The moving image distribution device uses the learning data generated by the learning device to sequentially generate and distribute moving image data showing the object attempting the movements of its operating parts.

A fifth feature of the present invention relates to an activity device connected to a learning device with which an object having operating parts learns. In the fifth feature, the learning device includes: a motion generation unit that generates motion data associating an identifier of an action to be performed by the object with a target for the combination of movements of the operating parts that performs the action; and an update unit that attempts the movements of the object's operating parts according to the motion data, outputs an evaluation value for the action, and stores the action identifier, the motion data, and the evaluation value in learning data in association with one another. The motion generation unit generates new motion data by referring to the learning data associated with the identifier of the action to be performed by the object. The activity device operates the object using the learning data generated by the learning device.

A sixth feature of the present invention relates to an activity program connected to a learning device with which an object having operating parts learns. In the sixth feature, the learning device includes: a motion generation unit that generates motion data associating an identifier of an action to be performed by the object with a target for the combination of movements of the operating parts that performs the action; and an update unit that attempts the movements of the object's operating parts according to the motion data, outputs an evaluation value for the action, and stores the action identifier, the motion data, and the evaluation value in learning data in association with one another. The motion generation unit generates new motion data by referring to the learning data associated with the identifier of the action to be performed by the object. The activity program causes a computer to operate the object using the learning data generated by the learning device.

A seventh feature of the present invention relates to a moving image generation device connected to a learning device with which an object having operating parts learns. In the seventh feature, the learning device includes: a motion generation unit that generates motion data associating an identifier of an action to be performed by the object with a target for the combination of movements of the operating parts that performs the action; and an update unit that attempts the movements of the object's operating parts according to the motion data, outputs an evaluation value for the action, and stores the action identifier, the motion data, and the evaluation value in learning data in association with one another. The motion generation unit generates new motion data by referring to the learning data associated with the identifier of the action to be performed by the object. The moving image generation device uses the learning data generated by the learning device to sequentially generate moving image data showing the object attempting the movements of its operating parts.

According to the present invention, it is possible to provide a learning device, a learning method, a learning program, a moving image distribution device, an activity device, an activity program, and a moving image generation device with which an object learns movements.

FIG. 1 is a diagram illustrating the system configuration of a learning system according to an embodiment of the present invention.
FIG. 2 is a diagram illustrating an example of an object acting in a virtual region.
FIG. 3 is a diagram illustrating the hardware configuration and functional blocks of a learning device according to the embodiment of the present invention.
FIG. 4 is a diagram illustrating a virtual region and partial regions according to the embodiment of the present invention.
FIG. 5 is a diagram illustrating the data structure of region attribute data according to the embodiment of the present invention, with example data.
FIG. 6 is a diagram illustrating the data structure of environment value data according to the embodiment of the present invention, with example data.
FIG. 7 is a diagram illustrating the data structure of object attribute data according to the embodiment of the present invention, with example data.
FIG. 8 is a diagram illustrating the data structure of operating part attribute data according to the embodiment of the present invention, with example data.
FIG. 9 is a diagram illustrating the data structure of evaluation index data according to the embodiment of the present invention, with example data.
FIG. 10 is a diagram illustrating the data structure of learning data according to the embodiment of the present invention, with example data.
FIG. 11 is a flowchart illustrating the learning process performed by a learning unit according to the embodiment of the present invention.
FIG. 12 is a diagram illustrating an example of motion data according to the embodiment of the present invention.
FIG. 13 is a diagram illustrating an example of the movements of operating parts driven by motion data according to the embodiment of the present invention.
FIG. 14 is a diagram illustrating the system configuration of a learning system according to a modification of the present invention.
FIG. 15 is a diagram illustrating an example of the data flow in the learning system according to the modification of the present invention.

Next, an embodiment of the present invention will be described with reference to the drawings. In the following description of the drawings, identical or similar parts are denoted by identical or similar reference numerals.

(Learning system)
The learning system 9 according to the embodiment of the present invention, shown in FIG. 1, simulates on a computer the activity of an object O called an artificial life form and provides the simulation result as a moving image. In this embodiment in particular, the object O has operating parts. The object O learns motions of its operating parts that yield a better effect under constraints such as the attributes of the object O and the environment values of the region in which the object O is placed.

The description of this embodiment assumes that the object O acts in a virtual region V, but the invention is not limited to this. The object O may be a tangible object such as a robot, with components corresponding to the operating parts and a computer that controls those components.

As shown in FIG. 1, the learning system 9 includes a learning device 1, a moving image distribution device 2, and a terminal 3. The learning device 1, the moving image distribution device 2, and the terminal 3 are connected through a communication network 8 so that they can communicate with one another.

As shown in FIG. 2, the learning device 1 simulates the object O acting in the virtual region V. In the example shown in FIG. 2, the object O has a dog-like shape with four limbs that support its torso and walks on four legs, but its shape is not limited to this. The object O may have a humanoid shape that walks on two legs, or, as shown in FIG. 12 and elsewhere, a shape with four limbs arranged radially.

The moving image distribution device 2 distributes to the terminal 3 a moving image of the movements of the object O learned by the learning device 1.

The terminal 3 plays back the moving image distributed from the moving image distribution device 2. The terminal 3 may also send the learning device 1 an instruction to start or stop learning, or specify the action to be performed by the object O.

This embodiment describes the case where the learning device 1, the moving image distribution device 2, and the terminal 3 are separate devices, but given functions may be implemented on a single device. For example, the learning device 1 may distribute the moving image data to the terminal 3, or the learning device 1 itself may play back the moving image.

(Learning device)
The learning device 1 according to the embodiment of the present invention will be described with reference to FIG. 3.

The learning device 1 according to the embodiment of the present invention simulates the activity of an object, also called an artificial life form.

In the object O, an operating part is a joint that connects the components making up the object O. When a component of the object is itself operable, such as a weapon or a tool, that component is also an operating part. The object attempts a series of motions that bend or rotate these joints in predetermined directions, and learns so as to obtain a better effect for a given action. This embodiment describes the case where the object O has a plurality of operating parts, but it may have only one.

In this embodiment, an "action" is a behavior performed by the object O as a whole, such as moving forward, moving backward, rotating, or jumping. When an operating part of the object O is a tool, "action" includes use of the tool. For example, when the operating part is a gun, "action" includes forward shooting.

In this embodiment, a "motion" is the combination of the series of movements of the operating parts of the object O used to perform an action. The effect of an action is evaluated per motion. A combination of movements of the operating parts of the object O, to be performed within a predetermined time, is generated as a motion. The object O is then made to attempt to move according to the generated motion, under constraints such as the attributes of the object O and the environment values of the region in which the object O is placed.

Motions are evaluated per action. For example, a motion generated for the forward action receives a higher evaluation the farther the object advances as a result of performing it. The learning device 1 generates new motions based on the evaluations of the motions performed for each action, and learns by searching for motions with higher evaluations.

As shown in FIG. 3, the learning device 1 is a general-purpose computer including a storage device 10, a processing device 20, an input device 30, an output device 40, and a communication control device 50. The functions shown in FIG. 3 are realized when the general-purpose computer executes a learning program for carrying out predetermined processing.

The storage device 10 stores the learning program as well as region attribute data 11, environment value data 12, object attribute data 13, operating part attribute data 14, evaluation index data 15, motion data 16, learning data T, and moving image data M. In this embodiment, the region attribute data 11, the environment value data 12, the object attribute data 13, the operating part attribute data 14, and the evaluation index data 15 are reference data used when the object O acts in the virtual region V. The motion data 16, the learning data T, and the moving image data M are data obtained as the object O acts in the virtual region V, and are updated as appropriate.

In this embodiment, the virtual region V includes a plurality of partial regions D, and each partial region D is associated with an environment value identifier for that partial region. As shown in FIG. 4(a), when the object O acts on a plane, the planar virtual region V is divided into a plurality of partial regions D. As shown in FIG. 4(b), when the object O acts in a space, the spatial virtual region V is divided into a plurality of partial regions D. Each partial region D is sized so that the object can move to another partial region D after learning, for example, two to five pieces of motion data. One side of a partial region D is preferably, for example, about 1.5 to 5 times the size of the object O.
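The division of a planar virtual region into partial regions can be sketched as a uniform grid. This is only an illustration under stated assumptions: the `Grid` class, the `cell_side` helper, the square cells, and the default factor of 3.0 are hypothetical and do not appear in the source, which only states the preferred range of about 1.5 to 5 times the object size.

```python
from dataclasses import dataclass
from typing import Tuple


def cell_side(object_size: float, factor: float = 3.0) -> float:
    """Side length of one partial region D (hypothetical helper).

    The source prefers a side of about 1.5 to 5 times the object size;
    the default factor of 3.0 is an arbitrary choice within that range.
    """
    if not 1.5 <= factor <= 5.0:
        raise ValueError("factor should lie in the preferred range 1.5 to 5")
    return object_size * factor


@dataclass(frozen=True)
class Grid:
    """Hypothetical uniform grid over a planar virtual region V."""
    width: float
    height: float
    cell: float  # side length of one partial region D

    def region_id(self, x: float, y: float) -> Tuple[int, int]:
        """Return the (column, row) index of the partial region containing (x, y)."""
        if not (0 <= x < self.width and 0 <= y < self.height):
            raise ValueError("position lies outside the virtual region")
        return (int(x // self.cell), int(y // self.cell))


grid = Grid(width=30.0, height=30.0, cell=cell_side(2.0))
print(grid.region_id(7.0, 13.0))
```

A spatial virtual region as in FIG. 4(b) would add a third coordinate in the same way.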

The region attribute data 11 and the environment value data 12 associate environment values with each partial region D of the virtual region V. Together they correspond to the activity conditions of the region in which the object O acts.

The region attribute data 11 associates an environment value with each partial region D in which the object O acts. As shown in FIG. 5, the region attribute data 11 associates a partial region identifier with an environment value identifier.

The number of environment values used in this embodiment is preferably set smaller than the number of partial regions D used in each virtual region V. In other words, a given environment value is linked to a plurality of partial regions D, and the learning data for that environment value is consulted as the learning data for all of the partial regions D linked to it. By dividing the virtual region V into a plurality of partial regions D and accumulating learning data for each partial region D in this way, the amount of computation in learning is reduced and learning becomes more efficient.

The environment value data 12 associates the environment value identifier assigned to a partial region D with a combination of environment value parameters. As shown in FIG. 6, the environment value data 12 associates an environment value identifier with slope, friction, gravity, adhesion force, water depth, air temperature, and so on. Air temperature, for example, affects the energy consumption of the object O. The environment value data 12 thus associates a combination of environment values with each partial region D. Depending on the attributes of the partial region D, some items in the environment value data 12 may be left unset.
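The two-level lookup described above, from partial region identifier to environment value identifier to parameters, might look like the following sketch. The field names, the identifiers such as `D001` and `E01`, and all numeric values are made up for illustration; the source names the parameters but gives no concrete schema here.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class EnvironmentValue:
    """One row of environment value data; unset items stay None."""
    slope: Optional[float] = None
    friction: Optional[float] = None
    gravity: Optional[float] = None
    adhesion: Optional[float] = None
    water_depth: Optional[float] = None
    temperature: Optional[float] = None


# Region attribute data: partial region identifier -> environment value identifier.
# Identifiers are hypothetical.
region_attributes: Dict[str, str] = {"D001": "E01", "D002": "E01", "D003": "E02"}

# Environment value data: environment value identifier -> parameters (illustrative values).
environment_values: Dict[str, EnvironmentValue] = {
    "E01": EnvironmentValue(slope=0.0, friction=0.8, gravity=9.8, temperature=20.0),
    "E02": EnvironmentValue(slope=15.0, friction=0.3, gravity=9.8, water_depth=0.5),
}


def environment_of(region_id: str) -> EnvironmentValue:
    """Resolve the environment parameters of a partial region in two steps."""
    return environment_values[region_attributes[region_id]]
```

Because `D001` and `D002` share `E01`, anything learned under `E01` can be consulted for both regions, which is the reason for keeping the number of environment values below the number of partial regions.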

The object attribute data 13 and the operating part attribute data 14 associate attributes with the object O and with its operating parts. Together they correspond to the object O's own activity conditions when it acts.

The object attribute data 13 associates attributes with the object O as a whole. As shown in FIG. 7, the object attribute data 13 associates an object identifier that identifies an object O acting in the virtual region V with the type of the object O, the identifiers of the actions the object O can select, and individual values. The individual values include basal metabolism, activity metabolism, stamina, and the like. Even when objects O share the same type, different individual values are associated with each object identifier.

The operating part attribute data 14 associates attributes with the operating parts included in each object O. As shown in FIG. 8, the operating part attribute data 14 associates an object identifier, an operating part identifier, and individual values. The individual values include shape, muscle strength, size, weight, center of gravity, texture, and the like. Even when the type of the object O and the operating part identifier are the same, different individual values are associated with each object identifier. The texture information of each operating part is used when rendering the movement of the object O.

The evaluation index data 15 associates an evaluation function and an evaluation value with each action performed by the object O. As shown in FIG. 9, the evaluation index data 15 associates an action identifier, an action name, an evaluation function, and an evaluation index. Specifically, the "forward" action is evaluated based on the "amount of forward movement", and the evaluation value is better the larger the amount of movement. The "forward shooting" action is evaluated based on the "deviation of the fired bullet", and the evaluation value is better the smaller the deviation. Here, the "deviation of the fired bullet" is the deviation in solid angle between the firing direction of the bullet and the forward direction of the object O. The evaluation value in the evaluation index data 15 may be expressed as a numerical value or as a level obtained by binning numerical values.
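The two evaluation functions named above can be sketched as follows. This is an illustration rather than the patent's implementation: the action identifiers `A01` and `A02` are hypothetical, and the bullet deviation is computed here as the plain angle between two direction vectors, which is an assumed simplification of the "solid angle" deviation in the text.

```python
import math
from typing import Sequence


def forward_displacement(start: Sequence[float], end: Sequence[float],
                         forward: Sequence[float]) -> float:
    """Displacement from start to end, projected onto the forward direction."""
    norm = math.sqrt(sum(f * f for f in forward))
    return sum((e - s) * f for s, e, f in zip(start, end, forward)) / norm


def angular_deviation(shot_dir: Sequence[float], forward: Sequence[float]) -> float:
    """Angle in radians between the firing direction and the forward direction."""
    dot = sum(a * b for a, b in zip(shot_dir, forward))
    na = math.sqrt(sum(a * a for a in shot_dir))
    nb = math.sqrt(sum(b * b for b in forward))
    cos = max(-1.0, min(1.0, dot / (na * nb)))  # clamp against rounding error
    return math.acos(cos)


# Evaluation index data: action identifier -> (action name, function, direction).
EVALUATION_INDEX = {
    "A01": ("forward", forward_displacement, "higher_is_better"),
    "A02": ("forward shooting", angular_deviation, "lower_is_better"),
}


def is_better(action_id: str, new_value: float, old_value: float) -> bool:
    """Compare two raw evaluation values according to the action's index."""
    _, _, direction = EVALUATION_INDEX[action_id]
    if direction == "higher_is_better":
        return new_value > old_value
    return new_value < old_value
```

Keeping the direction next to the function lets the same comparison code serve actions where larger values are better and actions where smaller values are better.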

The motion data 16 is the data for the series of movements that each operating part of the object O is made to perform during movement learning. The motion data 16 associates the identifier of an action to be performed by the object O with a target for the combination of movements of the operating parts that performs the action. For an operating part, the motion data 16 includes a plurality of pairs each associating a movement of the operating part with the time during which that movement is attempted. The motion data 16 is described later with reference to FIG. 12.
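One possible in-memory shape for such motion data is sketched below. The class name, the representation of a "movement" as a target joint angle in degrees, the durations in seconds, and the identifiers `A01`, `J1`, and `J2` are all assumptions made for illustration; the source only specifies pairs of a movement and the time during which it is attempted.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class MotionData:
    """Hypothetical container for one motion tied to an action identifier."""
    action_id: str
    # operating part identifier -> ordered (target angle in degrees, duration in seconds) pairs
    targets: Dict[str, List[Tuple[float, float]]] = field(default_factory=dict)

    def total_duration(self, part_id: str) -> float:
        """Sum of the attempt times for one operating part."""
        return sum(duration for _, duration in self.targets[part_id])

    def fits_within(self, time_limit: float) -> bool:
        """True if every operating part finishes within the time limit."""
        return all(self.total_duration(p) <= time_limit for p in self.targets)


motion = MotionData(
    action_id="A01",
    targets={
        "J1": [(30.0, 0.5), (-10.0, 0.25)],
        "J2": [(45.0, 1.0)],
    },
)
print(motion.total_duration("J1"), motion.fits_within(1.0))
```

The `fits_within` check reflects the statement that a motion is a combination of movements performed within a predetermined time.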

The learning data T is data on the results of operating the object O based on motion data. The learning data T holds the learned results for each combination of action identifier and environment value identifier; specifically, it associates an action identifier, an action name, an environment value identifier, a motion, and an evaluation value with one another.
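
One possible in-memory layout for such learning data is a table keyed by the pair (action identifier, environment value identifier). The sketch below is illustrative only: the class and method names are hypothetical, and it assumes that a larger evaluation value means a better result.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Any, DefaultDict, List, Tuple


@dataclass(frozen=True)
class LearningRecord:
    action_id: str
    action_name: str
    environment_id: str
    motion: Any
    evaluation_value: float


class LearningData:
    """Records grouped by (action identifier, environment value identifier)."""

    def __init__(self) -> None:
        self._records: DefaultDict[Tuple[str, str], List[LearningRecord]] = defaultdict(list)

    def store(self, record: LearningRecord) -> None:
        self._records[(record.action_id, record.environment_id)].append(record)

    def lookup(self, action_id: str, environment_id: str) -> List[LearningRecord]:
        return list(self._records.get((action_id, environment_id), []))

    def best(self, action_id: str, environment_id: str, n: int = 1) -> List[LearningRecord]:
        """The n records with the highest evaluation value (assumed: higher is better)."""
        found = self.lookup(action_id, environment_id)
        return sorted(found, key=lambda r: r.evaluation_value, reverse=True)[:n]
```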

The moving image data M is moving image data showing how the object O learns.

The processing device 20 includes a learning unit 21 and a moving image processing unit 26.

The learning unit 21 includes a motion generation unit 22 and an update unit 23 in order to make the object O learn.

The motion generation unit 22 generates motion data that associates an identifier of an action to be performed by the object O with a target for the combination of movements of the operation units for performing that action. The motion generation unit 22 generates new motion data with reference to the learning data associated with the identifier of the action to be performed by the object O.

In most cases, the motion generation unit 22 generates, for each operation unit of the object O, a motion for which a good evaluation can be expected, based on motions that were evaluated well in the learning data T, which indicates the results of past learning. In addition, at or below a predetermined ratio, the motion generation unit 22 generates a motion at random regardless of the evaluations in the learning data T. Choosing motions at random, and not only motions expected to be evaluated well, yields new evaluations that are hard to foresee; as a result, an even better evaluation may be obtained, which contributes to the evolution of the object O.
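
The mix of reuse and random generation described above resembles an exploration ratio in a search procedure. The following sketch is one way it might look, not the method of the source: the function names, the 0..10 range of target components, the candidate durations, and the mutation of the single best past motion are all assumptions.

```python
import random
from typing import List, Sequence, Tuple

Pair = Tuple[float, Tuple[int, ...]]  # (seconds, target state)
Motion = List[List[Pair]]             # one list of pairs per operation unit


def random_motion(rng: random.Random, n_units: int, n_pairs: int) -> Motion:
    """A motion chosen without looking at any past evaluation."""
    return [
        [(rng.choice((0.5, 1.0, 2.0)), tuple(rng.randint(0, 10) for _ in range(3)))
         for _ in range(n_pairs)]
        for _ in range(n_units)
    ]


def mutate(motion: Motion, rng: random.Random, step: int = 1) -> Motion:
    """Nudge every target component by at most `step`, clamped to 0..10."""
    return [
        [(seconds, tuple(min(10, max(0, c + rng.randint(-step, step))) for c in target))
         for seconds, target in pairs]
        for pairs in motion
    ]


def generate_motion(
    history: Sequence[Tuple[Motion, float]],  # (motion, evaluation value) pairs
    rng: random.Random,
    explore_ratio: float = 0.1,  # assumed value for the "predetermined ratio"
    n_units: int = 8,
    n_pairs: int = 3,
) -> Motion:
    # With no history, or with probability explore_ratio, ignore past evaluations.
    if not history or rng.random() < explore_ratio:
        return random_motion(rng, n_units, n_pairs)
    # Otherwise start from the best-evaluated past motion and vary it slightly.
    best_motion, _ = max(history, key=lambda item: item[1])
    return mutate(best_motion, rng)
```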

The motion generation unit 22 generates new motion data with reference to the learning data T associated with the environment value identifier of the partial region D in which the object O is located. Because the learning data T in the embodiment of the present invention is linked to the environment value identifier of a partial region D, the object O learns further by referring to the results of past learning under the environment value identifier of the partial region in which it is located. Managing the environment values at the location of the object O as a combination of parameters rather than as individual parameters reduces the amount of computation in learning and makes it possible to obtain the desired learning result in a shorter time.
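
To make the lookup concrete, the sketch below resolves a position to a partial region and then to that region's environment value identifier, which together with the action identifier forms the key into the learning data. The square grid, the cell size, the identifiers, and the function names are assumptions for illustration; the source does not fix the shape of a partial region D.

```python
from typing import Dict, Tuple

CELL_SIZE = 10.0  # assumed side length of a square partial region

# Hypothetical table: grid cell of a partial region D -> environment value identifier.
# Two different cells may share one identifier when their environment values match.
ENVIRONMENT_ID_BY_CELL: Dict[Tuple[int, int], str] = {
    (0, 0): "E1",
    (1, 0): "E2",
    (0, 1): "E1",
    (1, 1): "E3",
}


def environment_id_at(x: float, y: float) -> str:
    """Environment value identifier of the partial region containing (x, y)."""
    cell = (int(x // CELL_SIZE), int(y // CELL_SIZE))
    return ENVIRONMENT_ID_BY_CELL[cell]


def learning_key(action_id: str, x: float, y: float) -> Tuple[str, str]:
    """Key under which past learning results would be looked up."""
    return (action_id, environment_id_at(x, y))
```

In this sketch, positions in cells (0, 0) and (0, 1) map to the same key, so results learned in one region are reused in the other without learning per position.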

The update unit 23 attempts the movements of the operation units of the object O according to the motion data 16, outputs an evaluation value for the action, and stores the action identifier, the motion data, and the evaluation value in the learning data T in association with one another. The update unit 23 outputs the evaluation value after attempting the movements based on the plurality of pairs generated as the motion data 16.

Based on the pairs generated as the motion data 16, each associating a movement of an operation unit with the time during which that movement is attempted, the update unit 23 operates the object O so as to achieve the target for the combination of movements of the operation units. In doing so, the update unit 23 operates the object O subject to the environment values of the partial region D in which the object O is located, the attributes of the object O, and the attributes of the operation units as constraints.

Here, depending on the environment values of the partial region in which the object O is located, the attributes of the object O, and the attributes of the operation units, the object O may be unable to achieve a movement and may have to abandon it partway. Even if the object O attempts a movement for the time specified in the motion data 16 and cannot achieve it, the object O goes on to attempt the new movement specified for the next time period.

After attempting the movements for all pairs generated as the motion data 16, the update unit 23 evaluates the action that the object O was made to perform. For example, when the "forward" action has been performed, the update unit 23 calculates the "amount of forward movement" obtained as a result of the object O operating based on the motion data 16. The update unit 23 then converts the calculated "amount of forward movement" into an evaluation value based on the evaluation index data 15 and stores it in the learning data T.

The learning unit 21 repeats the generation of the motion data 16 and the updating of the learning data T until a predetermined condition is satisfied and learning ends. For example, the learning unit 21 ends the processing when an end instruction is input from the input device 30, the communication control device 50, or the like. The learning device 1 may end the processing when a predetermined number of learning iterations or a predetermined learning time is reached, or when a predetermined evaluation value is obtained.

The moving image processing unit 26 includes a moving image generation unit 27 and a moving image distribution unit 28.

The moving image generation unit 27 sequentially generates moving image data showing the object O attempting the movements of its operation units according to the motion data 16. Based on the pairs associating a movement of an operation unit of the object O with the time during which that movement is attempted, the moving image generation unit 27 attempts the movement of each operation unit at each time. It generates moving image data M that depicts, for example, whether each movement was achieved or instead stopped partway, depending on the environment values of the partial region in which the object O is located, the attributes of the object O, and the attributes of the operation units.

The moving image distribution unit 28 distributes the moving image data M generated by the moving image generation unit 27 to the terminal 3. Alternatively, the moving image distribution unit 28 may transmit the moving image data M generated by the moving image generation unit 27 to the moving image distribution device 2 and have the moving image distribution device 2 distribute it.

The learning process performed by the learning unit 21 according to the embodiment of the present invention will be described with reference to FIG. 11.

First, in step S1, the learning unit 21 specifies the action to be performed by the object O and also specifies the position of the object O. The action may be input via the input device 30, input from the terminal 3, or set in advance as a script. In step S2, the learning unit 21 extracts, from the learning data T, the past learning data corresponding to the action specified in step S1 and to the environment value identifier of the position of the object O.

In step S3, the learning unit 21 generates the motion data 16 for each operation unit with reference to the past learning data extracted in step S2. Here, the learning unit 21 generates the motion data 16 based on motion data for which a good evaluation value was obtained in the past learning data T. In addition, at or below a predetermined ratio, for example in 50 or fewer of every 100 generations of motion data, the learning unit 21 generates the motion data 16 regardless of the evaluation values.

In step S4, the learning unit 21 attempts the movement of each operation unit according to the motion data generated in step S3. In step S5, the learning unit 21 evaluates the action performed by the object O using the evaluation function associated with the action, and in step S6 it updates the learning data T.

In step S7, if learning is to continue, the process returns to step S1. If learning is to end, the process ends.
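
Steps S1 to S7 can be sketched as a simple loop. Everything below is a toy stand-in under stated assumptions: a single operation unit, a made-up `toy_forward_movement` in place of the actual simulation, fixed identifiers `A1` and `E1`, an exploration ratio of 0.2, and a fixed iteration count as the end condition.

```python
import random
from typing import Dict, List, Tuple

Motion = List[Tuple[float, int]]  # (seconds, target level 0..10) for one operation unit
LearningData = Dict[Tuple[str, str], List[Tuple[Motion, float]]]


def toy_forward_movement(motion: Motion) -> float:
    """Stand-in for the simulation: higher targets held longer count as more movement."""
    return sum(seconds * level for seconds, level in motion)


def generate(history: List[Tuple[Motion, float]], rng: random.Random,
             explore_ratio: float) -> Motion:
    # No history yet, or an exploration draw: generate regardless of evaluations.
    if not history or rng.random() < explore_ratio:
        return [(1.0, rng.randint(0, 10)) for _ in range(3)]
    # Otherwise vary the best-evaluated past motion slightly.
    best, _ = max(history, key=lambda item: item[1])
    return [(s, min(10, max(0, level + rng.randint(-1, 1)))) for s, level in best]


def learn(iterations: int, seed: int = 0, explore_ratio: float = 0.2) -> LearningData:
    rng = random.Random(seed)
    learning_data: LearningData = {}
    for _ in range(iterations):                         # S7: continue or end
        action_id, environment_id = "A1", "E1"          # S1: action and position (fixed here)
        key = (action_id, environment_id)
        history = learning_data.setdefault(key, [])     # S2: extract past learning data
        motion = generate(history, rng, explore_ratio)  # S3: generate motion data
        measured = toy_forward_movement(motion)         # S4: attempt the movements
        value = measured                                # S5: larger movement is better
        history.append((motion, value))                 # S6: update the learning data
    return learning_data
```

Seeding the generator makes a run repeatable, which is convenient when comparing exploration ratios; the source does not discuss reproducibility.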

Learning of the operation units will be described with reference to FIGS. 12 and 13.

As shown in FIG. 12(a), the object O has a spherical part P1 and four limbs provided around the part P1. One of the limbs includes two cylindrical parts P2 and P3. The part P1 and the part P2 are connected by a joint N1, and the part P2 and the part P3 are connected by a joint N2.

Since the object O shown in FIG. 12(a) is composed of a total of eight joints, the motion data 16 shown in FIG. 12(b) has eight data sets, one for each joint. In the motion data 16 shown in FIG. 12(b), the first row indicates the motion of the 0th joint, the second row indicates the motion of the 1st joint, and the data consists of eight rows in total.

The motion associated with each joint will be described with reference to FIG. 12(c), taking the leading entry [1, [3, 2, 1]] of the 0th joint as an example. [1, [3, 2, 1]] means that the 0th joint attempts to reach the state [3, 2, 1] after 1 second. The state [3, 2, 1] means that the Euler angles are in the state [3, 2, 1]; specifically, the Euler angles are [0.3 * (70 - 30) + 30, 0.2 * (70 - 30) + 30, 0.1 * (70 - 30) + 30], that is, [42, 38, 34]. Here, "30" and "70" are constants set in advance as the range of motion of the 0th joint.
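
The conversion from the entry [1, [3, 2, 1]] to Euler angles can be checked with a few lines of code. The function name `decode_target` is hypothetical, and the sketch assumes that each component n stands for the fraction n / 10 of the joint's range of motion, as the worked values 0.3, 0.2, and 0.1 suggest.

```python
from typing import Sequence, Tuple


def decode_target(levels: Sequence[int], lower: float, upper: float) -> Tuple[float, ...]:
    """Map each level n to lower + (n / 10) * (upper - lower)."""
    return tuple(lower + n * (upper - lower) / 10 for n in levels)


seconds, levels = 1, [3, 2, 1]  # the entry [1, [3, 2, 1]]
angles = decode_target(levels, lower=30.0, upper=70.0)
print(seconds, angles)  # prints: 1 (42.0, 38.0, 34.0)
```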

FIG. 13(a) shows the state at time t = 0, in which the object O has not yet attempted any movement. In FIG. 13(a), the joint N2 (the 0th joint) connects the parts P2 and P3 vertically. FIG. 13(b), on the other hand, shows the state of the Euler angles [3, 2, 1] set in the motion data 16 as the target after 1 second. In FIG. 13(b), the joint N2 connects the parts P2 and P3 bent by a predetermined angle.

FIG. 13(c) shows the result of the object O moving the 0th joint for 1 second toward the state of the Euler angles [3, 2, 1]. Because the speed at which a joint bends depends on the muscle strength and weight of the object O, the joint does not necessarily reach the state of the Euler angles [3, 2, 1] within 1 second; FIG. 13(c) shows a case in which the target state was not reached. In FIG. 13(c), the joint N2 connects the parts P2 and P3 at an angle, but that angle is smaller than the angle of the target state shown in FIG. 13(b).
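
The effect of a time limit on a joint can be illustrated with a one-dimensional sketch. Here the dependence on muscle strength and weight is collapsed into a single assumed parameter `max_speed` (angle units per second); the function names and the numbers are hypothetical.

```python
from typing import List, Tuple


def move_toward(current: float, target: float, max_speed: float, seconds: float) -> float:
    """Move from `current` toward `target`, covering at most max_speed * seconds."""
    reach = max_speed * seconds
    delta = target - current
    if abs(delta) <= reach:
        return target  # target reached within the allotted time
    return current + reach if delta > 0 else current - reach


def run_pairs(start: float, pairs: List[Tuple[float, float]], max_speed: float) -> List[float]:
    """Attempt each (seconds, target) pair in order; a missed target is not retried."""
    angle, trace = start, []
    for seconds, target in pairs:
        angle = move_toward(angle, target, max_speed, seconds)
        trace.append(angle)
    return trace
```

With `max_speed=8.0`, a joint starting at 30 and aiming for 42 within 1 second stops at 38, short of the target, and then proceeds from 38 toward whatever target the next pair specifies.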

In this way, the motion data 16 sets a motion target for each joint at each time. The learning unit 21 moves each joint according to the motion data 16 and, even when a target is not reached, proceeds toward the next target set in the motion data 16. The learning unit 21 attempts the motion targets for all joints and times set in the motion data 16 and evaluates the result.

Although the bending of joints has been described with reference to FIGS. 12 and 13, the movement is not limited to this. For example, when an operation unit is spring-shaped, the distance by which it extends or contracts within a predetermined time may be set as a target in the motion data 16. When an operation unit is shaped like a rotating wheel, the angle through which it rotates within a predetermined time may be set as a target in the motion data 16. When an operation unit is a gun, the act of firing when a predetermined condition is satisfied may be set as a target.

As described above, in the learning device 1 according to the embodiment of the present invention, the object O learns for each action. Accordingly, by having the object O perform various actions and generating the result as moving image data M, the learning device 1 allows a user to observe the object O performing various actions. Furthermore, dividing the virtual region V in which the object O is active into partial regions D removes the need to learn in accordance with the various environment values at each individual position of the object O, which reduces the amount of computation required for learning.

(Modification)
A learning system 9a according to a modification of the present invention will be described with reference to FIGS. 14 and 15. The learning system 9a shown in FIG. 14 includes a learning device 1a, an activity device 5, a moving image generation device 6, and a moving image distribution device 7. Of the components of the learning device 1 shown in FIG. 3, the learning device 1a includes the learning unit 21 but not the moving image processing unit 26.

Each of the activity device 5, the moving image generation device 6, and the moving image distribution device 7 is a general-purpose computer configured to execute the desired processing. The activity device 5 and the moving image generation device 6 cooperate with the learning device 1a to visualize the object O operating according to motion data generated from the learning data generated in the learning device 1a.

When the object O is active in the virtual region V, the activity device 5 renders and displays the object O attempting to move in the virtual region V according to the motion data generated from the learning data generated in the learning device 1a. When the object O is a tangible object, the activity device 5 controls the parts corresponding to the operation units of the object O so that they move according to the motion data generated from the learning data generated in the learning device 1a.

When the object O is active in the virtual region V, the moving image generation device 6 generates moving image data M showing the object O attempting to move in the virtual region V according to the motion data generated from the learning data generated in the learning device 1a. The moving image generation device 6 corresponds to the moving image generation unit 27 of the learning device 1 shown in FIG. 3. The moving image data M generated by the moving image generation device 6 is transmitted to the moving image distribution device 7, which distributes it to terminals (not shown), and each terminal plays back the moving image data M.

Here, the following methods are conceivable for the activity device 5 or the moving image generation device 6 to perform processing using the learning data T of the learning device 1a. The motion data may be generated by the learning device 1b, or by the activity device 5 or the moving image generation device 6. Alternatively, the activity device 5 or the moving image generation device 6 may acquire the results of the learning device 1a attempting the movement of each operation unit according to the motion data, and reproduce the movements of the operation units according to those results.
(1) A method in which, as shown in FIG. 15, the activity device 5 or the moving image generation device 6 acquires from the learning device 1a new motion data generated based on the learning data T, and attempts the movements of the operation units of the object O according to the acquired motion data. Here, the new motion data generated by the learning device 1a is obtained by randomly mutating the learning data T, regardless of evaluation.
(2) A method in which the activity device 5 or the moving image generation device 6 acquires the learning data T from the learning device 1a, extracts motion data from the acquired learning data T, and attempts the movements of the operation units of the object O according to the extracted motion data. Here, the motion data acquired from the learning device 1a is generated according to motion data that received a good evaluation in the learning data T, and is motion data for which a good evaluation can be expected.

The activity device 5, the moving image generation device 6, and the like can be applied to a variety of situations. For example, by generating and distributing moving image data in which the object O acts, the moving image generation device 6 can provide a game in which users observe how the object acts. The game may be one in which a single user makes one object O learn, or one in which multiple users make one object O learn.

In addition, by generating moving image data in which multiple objects act in the virtual region V, the moving image generation device 6 can generate moving image data in which each object moves with its own individuality. Such moving image data can be used for scenes in films, videos, and the like.

Separating the device that generates the learning data T from the device in which the object O acts with reference to the learning data T in this way reduces the processing load on the activity device 5 and the moving image generation device 6. It also makes it possible to reuse learning data T created in advance according to the scene.

(Other embodiments)
Although the present invention has been described above by way of an embodiment and a modification thereof, the statements and drawings that form part of this disclosure should not be understood as limiting the invention. Various alternative embodiments, examples, and operational techniques will be apparent to those skilled in the art from this disclosure.

For example, the learning device described in the embodiment of the present invention may be configured on a single piece of hardware as shown in FIG. 3, or on multiple pieces of hardware according to its functions and processing volume. It may also be implemented on an existing information processing system.

The present invention naturally includes various embodiments and the like that are not described here. Accordingly, the technical scope of the present invention is defined only by the matters specifying the invention in the claims that are reasonable in view of the above description.

Description of Symbols
1 Learning device
2, 7 Moving image distribution device
3 Terminal
5 Activity device
6 Moving image generation device
8 Communication network
9 Learning system
10 Storage device
11 Region attribute data
12 Environment value data
13 Object attribute data
14 Operation unit attribute data
15 Evaluation index data
16 Motion data
20 Processing device
21 Learning unit
22 Motion generation unit
23 Update unit
26 Moving image processing unit
27 Moving image generation unit
28 Moving image distribution unit
D Partial region
M Moving image data
O Object
T Learning data
V Virtual region

Claims

A learning device in which an object having an operation unit learns, wherein
the object is active in a virtual region,
the virtual region includes a plurality of partial regions, each partial region being associated with an environment value identifier of that partial region,
the learning device comprises:
a motion generation unit that generates motion data associating an identifier of an action to be performed by the object with a target for a combination of movements of the operation unit for performing the action; and
an update unit that attempts the movement of the operation unit of the object according to the motion data, outputs an evaluation value for the action, and stores the identifier of the action, the environment value identifier of the partial region in which the object is located, the motion data, and the evaluation value in learning data in association with one another, and
the motion generation unit generates new motion data with reference to the learning data associated with the identifier of the action to be performed by the object and the environment value identifier of the partial region in which the object is located.

The learning device according to claim 1, wherein the motion data includes, for the operation unit, a plurality of pairs each associating a movement of the operation unit with a time during which the movement is attempted, and
the update unit outputs the evaluation value after attempting the movements based on the plurality of pairs generated as the motion data.

The learning device according to claim 1, further comprising a distribution unit that sequentially generates and distributes moving image data showing the object attempting the movement of the operation unit according to the motion data.

A learning method in which an object having an operation unit learns, wherein
the object is active in a virtual region,
the virtual region includes a plurality of partial regions, each partial region being associated with an environment value identifier of that partial region, and the learning method comprises:
a step in which a computer generates motion data associating an identifier of an action to be performed by the object with a target for a combination of movements of the operation unit for performing the action;
a step in which the computer attempts the movement of the operation unit of the object according to the motion data, outputs an evaluation value for the action, and stores the identifier of the action, the environment value identifier of the partial region in which the object is located, the motion data, and the evaluation value in learning data in association with one another; and
a step in which the computer generates new motion data with reference to the learning data associated with the identifier of the action to be performed by the object and the environment value identifier of the partial region in which the object is located.

A learning program by which an object having an operation unit learns, wherein
the object is active in a virtual region,
the virtual region includes a plurality of partial regions, each partial region being associated with an environment value identifier of that partial region,
the learning program causes a computer to function as:
a motion generation unit that generates motion data associating an identifier of an action to be performed by the object with a target for a combination of movements of the operation unit for performing the action; and
an update unit that attempts the movement of the operation unit of the object according to the motion data, outputs an evaluation value for the action, and stores the identifier of the action, the environment value identifier of the partial region in which the object is located, the motion data, and the evaluation value in learning data in association with one another, and
the motion generation unit generates new motion data with reference to the learning data associated with the identifier of the action to be performed by the object and the environment value identifier of the partial region in which the object is located.

A moving image distribution device connected to a learning device in which an object having an operation unit learns, wherein
the object is active in a virtual region,
the virtual region includes a plurality of partial regions, each partial region being associated with an environment value identifier of that partial region,
the learning device comprises:
a motion generation unit that generates motion data associating an identifier of an action to be performed by the object with a target for a combination of movements of the operation unit for performing the action; and
an update unit that attempts the movement of the operation unit of the object according to the motion data, outputs an evaluation value for the action, and stores the identifier of the action, the environment value identifier of the partial region in which the object is located, the motion data, and the evaluation value in learning data in association with one another,
the motion generation unit generates new motion data with reference to the learning data associated with the identifier of the action to be performed by the object and the environment value identifier of the partial region in which the object is located, and
the moving image distribution device
uses the learning data generated by the learning device to sequentially generate and distribute moving image data showing the object attempting the movement of the operation unit.

An activity device connected to a learning device in which an object having an operation unit learns, wherein
the object is active in a virtual region,
the virtual region includes a plurality of partial regions, each partial region being associated with an environment value identifier of that partial region,
the learning device comprises:
a motion generation unit that generates motion data associating an identifier of an action to be performed by the object with a target for a combination of movements of the operation unit for performing the action; and
an update unit that attempts the movement of the operation unit of the object according to the motion data, outputs an evaluation value for the action, and stores the identifier of the action, the environment value identifier of the partial region in which the object is located, the motion data, and the evaluation value in learning data in association with one another,
the motion generation unit generates new motion data with reference to the learning data associated with the identifier of the action to be performed by the object and the environment value identifier of the partial region in which the object is located, and
the activity device
operates the object using the learning data generated by the learning device.

An activity program connected to a learning device in which an object having an operation unit learns, wherein
the object is active in a virtual region,
the virtual region includes a plurality of partial regions, each partial region being associated with an environment value identifier of that partial region,
the learning device comprises:
a motion generation unit that generates motion data associating an identifier of an action to be performed by the object with a target for a combination of movements of the operation unit for performing the action; and
an update unit that attempts the movement of the operation unit of the object according to the motion data, outputs an evaluation value for the action, and stores the identifier of the action, the environment value identifier of the partial region in which the object is located, the motion data, and the evaluation value in learning data in association with one another,
the motion generation unit generates new motion data with reference to the learning data associated with the identifier of the action to be performed by the object and the environment value identifier of the partial region in which the object is located, and
the activity program
causes a computer
to operate the object using the learning data generated by the learning device.

A moving image generation device connected to a learning device in which an object having an operation unit learns, wherein
the object is active in a virtual region,
the virtual region includes a plurality of partial regions, each partial region being associated with an environment value identifier of that partial region,
the learning device comprises:
a motion generation unit that generates motion data associating an identifier of an action to be performed by the object with a target for a combination of movements of the operation unit for performing the action; and
an update unit that attempts the movement of the operation unit of the object according to the motion data, outputs an evaluation value for the action, and stores the identifier of the action, the environment value identifier of the partial region in which the object is located, the motion data, and the evaluation value in learning data in association with one another,
the motion generation unit generates new motion data with reference to the learning data associated with the identifier of the action to be performed by the object and the environment value identifier of the partial region in which the object is located, and
the moving image generation device
uses the learning data generated by the learning device to sequentially generate moving image data showing the object attempting the movement of the operation unit.