JP7432330B2 - Posture correction network learning device and its program, and posture estimation device and its program

Info

Publication number: JP7432330B2
Application number: JP2019169020A
Authority: JP
Inventors: 寛史盛岡; 淳洗井
Original assignee: Japan Broadcasting Corp
Current assignee: Japan Broadcasting Corp
Priority date: 2019-09-18
Filing date: 2019-09-18
Publication date: 2024-02-16
Anticipated expiration: 2039-09-18
Also published as: JP2021047563A

Description

The present invention relates to a posture correction network learning device, and a program therefor, that learns a posture correction network, which is a neural network for correcting the posture of a CG model, and to a posture estimation device, and a program therefor, that estimates the posture of a CG model using the posture correction network.

In recent years, devices have come into practical use that reflect the joints and skeleton of a human body estimated from video (hereinafter, human body bones) in a CG model, in order to animate CG characters in real time or to operate virtual avatars in virtual reality.
Various methods for recognizing human body bones from video already exist.
For example, there is a method that uses a neural network to estimate the positions and connection states of the joints of human body bones in a two-dimensional image (see Non-Patent Document 1). There is also a method that additionally uses a stereo camera to estimate the three-dimensional positions of the joints by triangulation (see Non-Patent Document 2). As shown in FIG. 8, for example, this method estimates the three-dimensional coordinates of predetermined joints p of the human body and the connections between the joints p.
As a method for estimating the postures of the joints of human body bones in a two-dimensional image, there is also a method that learns, as a latent variable model, the postures of joints that cannot be observed directly from the joint coordinates in the two-dimensional image (see Patent Document 1).

Patent Document 1: Japanese Patent Application Publication No. 2010-229783

Non-Patent Document 1: Zhe Cao, et al., “Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields”, CVPR 2017
Non-Patent Document 2: “AI (Artificial Intelligence) Skeleton Detection System”, [online], [retrieved September 2, 2019], Internet <URL: https://www.next-system.com/visionpose>

With conventional methods, it is possible to estimate from video the positions (three-dimensional positions) of the joints of human body bones and the posture (three-dimensional rotation angles) of each joint.
However, conventional methods estimate the three-dimensional rotation angles of each joint from the three-dimensional joint positions alone, and do not take into account rotation about the bone connecting the joints (the roll rotation angle). The roll rotation angle cannot be determined uniquely from three-dimensional positions alone. For example, as shown in FIG. 9, when joints p1 and p2 are connected, their three-dimensional positions can be estimated by conventional methods, but the roll rotation angle about axis A, the bone connecting joints p1 and p2, is indeterminate.
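The indeterminacy of the roll rotation angle can be checked numerically. The following is a minimal sketch, not part of the disclosed device: the function names, the bone length, and the coordinates are illustrative assumptions. It rotates a bone about its own axis A and shows that the position of joint p2 does not change, while a point on the limb surface does move.

```python
import math

def rotate_about_axis(v, axis, angle):
    """Rotate vector v about a unit-length axis by angle (Rodrigues' formula)."""
    c, s = math.cos(angle), math.sin(angle)
    dot = sum(a * b for a, b in zip(axis, v))
    cross = (
        axis[1] * v[2] - axis[2] * v[1],
        axis[2] * v[0] - axis[0] * v[2],
        axis[0] * v[1] - axis[1] * v[0],
    )
    return tuple(v[i] * c + cross[i] * s + axis[i] * dot * (1 - c) for i in range(3))

def child_position(p1, bone_dir, bone_len, roll):
    """Position of joint p2 when the bone at p1 is rolled about its own axis."""
    rolled = rotate_about_axis(bone_dir, bone_dir, roll)
    return tuple(p1[i] + bone_len * rolled[i] for i in range(3))

def surface_marker(p1, bone_dir, side_dir, roll):
    """A point on the limb surface (off the bone axis), which does move with roll."""
    rolled = rotate_about_axis(side_dir, bone_dir, roll)
    return tuple(p1[i] + rolled[i] for i in range(3))

# Illustrative values (assumptions, not taken from the patent).
p1 = (0.0, 0.0, 0.0)
bone_dir = (1.0, 0.0, 0.0)   # unit vector from p1 toward p2 (axis A)
side_dir = (0.0, 1.0, 0.0)   # unit vector perpendicular to the bone

p2_a = child_position(p1, bone_dir, 0.3, 0.0)
p2_b = child_position(p1, bone_dir, 0.3, math.pi / 2)
m_a = surface_marker(p1, bone_dir, side_dir, 0.0)
m_b = surface_marker(p1, bone_dir, side_dir, math.pi / 2)
```

Here `p2_a` and `p2_b` coincide although the roll differs by 90 degrees, whereas `m_a` and `m_b` differ. Joint positions alone therefore carry no information about the roll, but the appearance of the limb surface does.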

Consequently, when joint positions and postures estimated in this way are applied to a three-dimensional CG model, the joint rotation angles of the model may be rendered unnaturally, or the connections between parts may break down. For example, suppose a CG model is to be generated with the arm connected to the shoulder joint p as shown in FIG. 10(a). Because the roll rotation angle about the dotted-line axis is indeterminate in the posture of joint p, a twist may arise between the arm and the torso at joint p, as shown in FIG. 10(b).

The present invention has been made in view of this problem. Its object is to provide a posture correction network learning device, and a program therefor, that learns a posture correction network composed of a neural network that corrects a posture estimated from the three-dimensional position information of the joints of a CG model to a more correct posture, and a posture estimation device, and a program therefor, that estimates the posture of a CG model using the posture correction network.

To solve the above problem, a posture correction network learning device according to the present invention learns a posture correction network composed of a neural network that corrects the postures associated with the positions of the joints of a CG model in three-dimensional space, and comprises posture correction means, rendering means, identification means, and parameter updating means.

In this configuration, the posture correction network learning device uses the posture correction means to compute corrected learning data from posture-indefinite learning data, which is learning data whose posture is indeterminate, by means of the posture correction network. The parameters of the posture correction network are undetermined at first, but are updated successively by the parameter updating means.

The posture correction network learning device then uses the rendering means to render, for both the correct learning data (learning data whose posture is correct) and the corrected learning data, a three-dimensional dummy model in which a predetermined pattern texture is set on each part of the CG model, generating a two-dimensional rendered image for each. As a result, a posture that is wrong because of a twist or similar defect appears as a difference in the arrangement of the pattern textures in the rendered image.

The posture correction network learning device also uses the identification means to compute an identification result from each rendered image by means of an identification network, a neural network that identifies a rendered image generated from the correct learning data as a true value and a rendered image generated from the corrected learning data as a false value. The parameters of the identification network are undetermined at first, but are updated successively by the parameter updating means.

The posture correction network learning device then uses the parameter updating means to update the parameters of the identification network so as to reduce the cross entropy between the true value and the identification result for the rendered image generated from the correct learning data, and to reduce the cross entropy between the false value and the identification result for the rendered image generated from the corrected learning data.
The posture correction network learning device also uses the parameter updating means to update the parameters of the posture correction network so as to increase the cross entropy between the false value and the identification result for the rendered image generated from the corrected learning data.
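The two update rules can be written as scalar objectives. The following is a minimal sketch that assumes the identification result is a probability strictly between 0 and 1 and that the true and false values are the labels 1 and 0; the function names are illustrative. It shows only the quantities to be decreased or increased, not the gradient computation or the optimizer.

```python
import math

TRUE_LABEL = 1.0   # rendered image from correct learning data
FALSE_LABEL = 0.0  # rendered image from corrected learning data

def cross_entropy(prob, label):
    """Binary cross entropy between an identification result in (0, 1) and a label."""
    return -(label * math.log(prob) + (1.0 - label) * math.log(1.0 - prob))

def identification_loss(prob_correct, prob_corrected):
    """Quantity the identification network is updated to decrease."""
    return (cross_entropy(prob_correct, TRUE_LABEL)
            + cross_entropy(prob_corrected, FALSE_LABEL))

def correction_objective(prob_corrected):
    """Quantity the posture correction network is updated to increase."""
    return cross_entropy(prob_corrected, FALSE_LABEL)

# A confident, correct identification network has a lower loss than an undecided one.
confident = identification_loss(0.9, 0.1)
undecided = identification_loss(0.5, 0.5)

# The correction objective grows as the identification network is fooled.
fooled = correction_objective(0.9)
not_fooled = correction_objective(0.1)
```

The two objectives pull the identification result for the corrected image in opposite directions, which is what makes the training adversarial.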

In this way, the identification network is trained to identify correctly whether a rendered image is true or false, while the posture correction network is trained so that the identification network mistakenly identifies a rendered image produced from the corrected learning data as a true value.
The posture correction network learning device thereby learns the posture correction network as a neural network that correctly corrects the posture of the CG model.

The posture-indefinite learning data may be generated by posture calculation means that calculates, from the positions of the joints of the CG model in three-dimensional space, postures whose joint rotation angles are indeterminate, using inverse kinematics calculation, a method for controlling joints.
The present invention can also be realized as a posture correction network learning program that causes a computer to function as the posture correction network learning device.

To solve the above problem, a posture estimation device according to the present invention estimates the posture of a CG model and comprises posture correction means.

In this configuration, the posture estimation device uses the posture correction means to correct the postures associated with the joint positions by means of a posture correction network composed of a neural network that corrects the postures associated with the positions of the joints of the CG model in three-dimensional space.
The postures to be corrected may be generated by posture calculation means that calculates, from the positions of the joints of the CG model in three-dimensional space, postures whose joint rotation angles are indeterminate, using inverse kinematics calculation, a method for controlling joints.
The present invention can also be realized as a posture estimation program that causes a computer to function as the posture estimation device.

The present invention provides the following advantageous effects.
According to the present invention, it is possible to learn a posture correction network composed of a neural network that accurately corrects the postures of the joints of a CG model in three-dimensional space.
Also according to the present invention, a trained posture correction network can be used to correct joint postures that are indeterminate from the joint positions of the CG model alone, and thus to estimate the correct postures.

FIG. 1 is an explanatory diagram of the processing performed by the posture correction network learning device according to an embodiment of the present invention.
FIG. 2 is a block diagram showing the configuration of the posture correction network learning device according to the embodiment of the present invention.
FIG. 3 is an explanatory diagram of the structure of the dummy model, in which (a) shows an example of the pattern textures set on each part, (b) shows an example image in which the dummy model is rendered with the correct pattern arrangement, and (c) shows an example image in which the dummy model is rendered with an incorrect pattern arrangement.
FIG. 4 is an explanatory diagram of the processing performed by the posture calculation means.
FIG. 5 is a flowchart showing the operation of the posture correction network learning device according to the embodiment of the present invention.
FIG. 6 is an explanatory diagram of the processing performed by the posture estimation device according to an embodiment of the present invention.
FIG. 7 is a block diagram showing the configuration of the posture estimation device according to the embodiment of the present invention.
FIG. 8 is a diagram showing the positions of the joints of a CG model in three-dimensional space as estimated from video.
FIG. 9 is an explanatory diagram of the indeterminacy of the joint postures obtained by conventional methods.
FIG. 10 shows examples of CG generated with joint postures obtained by conventional methods, in which (a) shows the state when the joint postures are connected correctly and (b) shows the state when a twist occurs at a joint.

Embodiments of the present invention are described below with reference to the drawings.
<Processing of the posture correction network learning device>
The processing of the posture correction network learning device 1 (FIG. 2) according to an embodiment of the present invention is described with reference to FIG. 1.

The posture correction network learning device 1 learns a posture correction network Ng, which is composed of a neural network that corrects the postures associated with the positions of the joints of a CG model in three-dimensional space.
As learning data, the posture correction network learning device 1 uses a plurality of pieces of joint position information P and a plurality of pieces of joint position and posture information Q.

The joint position information P is a set {p} of the three-dimensional position coordinates p (hereinafter simply, positions) of the joints, one per joint. The joint position information P may be, for example, joint positions estimated from video by a conventional, general-purpose human body bone recognition method, or positions acquired by motion capture, in which markers are attached at the joint positions of a human body and the marker positions are measured as the joint positions.

The joint position and posture information Q is a set {q, φ}, one element per joint, of the three-dimensional position coordinates (position) q of each joint and the rotation angles φ of each joint about the three axes (hereinafter simply, posture). The postures in the joint position and posture information Q are known to be correct. The information Q can be, for example, human body bone information from CG data in which the joint positions and postures were generated correctly in advance, or information selected from existing CG data so as to include only data in which the joint postures are represented correctly.
The number of joints in the joint position and posture information Q is the same as in the joint position information P.

From the joint position information P and the joint position and posture information Q, the posture correction network learning device 1 successively trains two neural networks, the posture correction network Ng and the identification network Nd, and thereby produces an accurate posture correction network Ng.
The posture correction network Ng and the identification network Nd correspond to the generator and the discriminator, respectively, of a generative adversarial network (GAN) in deep learning.

For each set {p} of joint positions in the joint position information P, the posture correction network learning device 1 calculates a set {θ^} of postures θ^ by inverse kinematics calculation and generates a set of posture parameters {p, θ^}.
The posture correction network learning device 1 then uses the posture correction network Ng to correct the set of postures {θ^} corresponding to the set of joint positions {p} to {θ^~}. It then renders the dummy model Dm, a 3D model, with the corrected set of posture parameters {p, θ^~} to generate a two-dimensional image (rendered image) Ig.
The posture correction network learning device 1 also renders the dummy model Dm with the set of posture parameters {q, φ} of the joint position and posture information Q to generate a two-dimensional image (rendered image) Ir.

The posture correction network learning device 1 trains the identification network Nd to identify the image Ir as "true", that is, as an image with a correct posture.
It also trains the identification network Nd to identify the image Ig as "false", that is, as an image with an incorrect posture, and trains the posture correction network Ng so that the identification network Nd identifies the image Ig as "true".
In this way, the posture correction network learning device 1 can learn a highly accurate posture correction network Ng.

<Configuration of the posture correction network learning device>
Next, the configuration of the posture correction network learning device 1 according to the embodiment of the present invention is described with reference to FIG. 2 (and FIG. 1 as appropriate).
As shown in FIG. 2, the posture correction network learning device 1 comprises storage means 10, posture calculation means 11, learning data selection means 12, posture correction means 13, rendering means 14, identification means 15, and parameter updating means 16.

The storage means 10 stores the various data used by the posture correction network learning device 1. It can be implemented with a general storage medium such as a hard disk or semiconductor memory.
Here the storage means 10 is divided into neural network storage means 10₁, learning data storage means 10₂, and dummy model storage means 10₃, but it may instead be implemented as a single storage medium.

The neural network storage means 10₁ stores the posture correction network Ng and the identification network Nd.
The posture correction network Ng is a neural network that corrects the postures of predetermined joints of the CG model.
Its input data is n sets of three-dimensional joint postures θ^ (= (θx^, θy^, θz^)), one set per joint for a predetermined number n of joints. Its output data is n sets of corrected three-dimensional joint postures θ^~ (= (θx^~, θy^~, θz^~)), the same number as the input. The number of joints is the same as in the joint position information P.

The model structure of the posture correction network Ng is not particularly limited. It can be, for example, a general neural network structure in which a plurality of affine layers, each performing a weighted sum, are connected with an activation function inserted between layers.
The inter-layer weight coefficients, which are the parameters of the posture correction network Ng to be learned, are initialized in advance, for example with random numbers, and are updated by the parameter updating means 16.
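One possible shape for such a network is sketched below in plain Python. The text fixes only the input and output sizes (three angles for each of n joints) and the use of affine layers with activations between them. The hidden width, the tanh activation, the zero biases, the initialization range, and all function names are assumptions made for illustration.

```python
import math
import random

def make_affine(n_in, n_out, rng):
    """Random initial weights and zero biases for one affine layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def affine(x, layer):
    """Weighted sum of the inputs plus a bias, for every output unit."""
    weights, biases = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]

def correct_postures(postures, layers):
    """Map n joint postures (theta_x, theta_y, theta_z) to n corrected postures."""
    x = [angle for joint in postures for angle in joint]  # flatten to 3n values
    for layer in layers[:-1]:
        x = [math.tanh(v) for v in affine(x, layer)]      # activation between layers
    x = affine(x, layers[-1])                             # final layer, no activation
    return [tuple(x[i:i + 3]) for i in range(0, len(x), 3)]

rng = random.Random(0)
n_joints, hidden = 3, 16  # hidden width is an arbitrary illustrative choice
layers = [
    make_affine(3 * n_joints, hidden, rng),
    make_affine(hidden, hidden, rng),
    make_affine(hidden, 3 * n_joints, rng),
]
corrected = correct_postures([(0.1, 0.2, 0.3)] * n_joints, layers)
```

With untrained random weights the output is meaningless. The sketch shows only that the output has the same n-by-3 layout as the input, so corrected postures can be paired with the unchanged joint positions.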

The identification network Nd is a neural network that identifies whether the joint postures of the CG model in an image generated by the rendering means 14 are correct or incorrect.
Its input data is a two-dimensional image. Its output data is a probability value greater than 0 and less than 1 that distinguishes between "true value: 1" and "false value: 0".

The model structure of the identification network Nd is not particularly limited. It can be, for example, a structure in which a plurality of convolution layers, each performing a convolution with a kernel of predetermined size, are connected, with an activation function (sigmoid function) at the final stage that normalizes the output to a value greater than 0 and less than 1.
The kernel weight coefficients, which are the parameters of the identification network Nd to be learned, are initialized in advance, for example with random numbers, and are updated by the parameter updating means 16.
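A toy version of this structure, written without any deep learning library, might look as follows. The text specifies only stacked convolution layers and a final sigmoid. The single-channel image, the stride of 2, the ReLU between layers, the mean pooling before the sigmoid, and all names are assumptions.

```python
import math
import random

def conv2d(image, kernel, stride=2):
    """Valid 2D convolution (cross-correlation) of a grayscale image with one kernel."""
    k = len(kernel)
    out = []
    for r in range(0, len(image) - k + 1, stride):
        row = []
        for c in range(0, len(image[0]) - k + 1, stride):
            row.append(sum(kernel[i][j] * image[r + i][c + j]
                           for i in range(k) for j in range(k)))
        out.append(row)
    return out

def sigmoid(x):
    """Squash a real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def identify(image, kernels):
    """Return a probability: near 1 means 'true', near 0 means 'false'."""
    x = image
    for idx, kernel in enumerate(kernels):
        x = conv2d(x, kernel)
        if idx < len(kernels) - 1:
            x = [[max(0.0, v) for v in row] for row in x]  # ReLU between layers
    pooled = sum(v for row in x for v in row) / (len(x) * len(x[0]))
    return sigmoid(pooled)

rng = random.Random(0)
image = [[rng.random() for _ in range(16)] for _ in range(16)]  # stand-in for Ig or Ir
kernels = [[[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(3)]
           for _ in range(2)]
prob = identify(image, kernels)
```

The ReLU is deliberately omitted after the last convolution so that the pooled value can be negative and the sigmoid output can fall on either side of 0.5.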

The learning data storage means 10₂ stores posture-indefinite learning data Si and correct learning data Sc as the learning data for training the posture correction network Ng and the identification network Nd.

The posture-indefinite learning data Si is a set {p, θ^} of the joint positions (three-dimensional position coordinates) p and the posture θ^ of each joint (rotation angles about the three axes), calculated from the joint position information P by the posture calculation means 11. As explained with reference to FIG. 9, the roll rotation angle of the postures in the posture-indefinite learning data Si is indeterminate.

The correct learning data Sc is the joint position and posture information Q, in which the joint positions and postures are represented correctly: a set {q, φ} of the joint positions (three-dimensional position coordinates) q and the posture φ of each joint (rotation angles about the three axes).

The dummy model storage means 10₃ stores in advance the dummy model Dm, which is a 3D model.
The dummy model Dm is data representing the shape of the CG model as a 3D model (polygon data or the like). The rendering means 14 uses it to generate a two-dimensional image from joint positions and postures.
In the dummy model Dm, each part of the CG model is assigned a pattern texture that identifies that part. The pattern texture varies around the axis (bone) between joints so that at least the state of rotation about that axis can be identified.

For example, as shown in FIG. 3(a), when the wrist p₁, elbow p₂, and shoulder p₃ of the dummy model Dm are the joints, different pattern textures are applied to the parts of the arm corresponding to the biceps brachii BC, the triceps brachii TC, the forearm flexors FF, and the forearm extensors FE.
When the dummy model Dm is rendered into an image, the pattern arrangement is correct if the joint postures are correct, as in FIG. 3(b). If the posture of the shoulder p₃ is incorrect, on the other hand, the biceps brachii BC and triceps brachii TC parts are twisted and the image is wrong, as in FIG. 3(c).

The posture calculation means 11 calculates, from the joint positions (three-dimensional position coordinates) in the joint position information P, the rotation angles about the three axes of three-dimensional space as the posture of each joint. General inverse kinematics calculation can be used to calculate the joint postures from the three-dimensional position coordinates.

For example, as shown in FIG. 4, when the wrist p₁, elbow p₂, and shoulder p₃ are the joints, the posture calculation means 11 sets an arbitrary posture θ₁ as the initial value at a predetermined reference joint (here, the wrist p₁). It then calculates the posture θ₂ at the elbow p₂ by inverse kinematics calculation from the displacement between the three-dimensional position coordinates of the wrist p₁ and the elbow p₂. Likewise, it calculates the posture θ₃ at the shoulder p₃ by inverse kinematics calculation from the displacement between the three-dimensional position coordinates of the elbow p₂ and the shoulder p₃. In FIG. 4, the joint positions p₁, p₂, and p₃ act as constraints and the postures are determined in sequence, but the postures θ₁, θ₂, and θ₃ remain indeterminate about the dotted-line axes. FIG. 4 shows a reduced number of joints to simplify the explanation.
In this way, the posture calculation means 11 calculates the posture at each joint from the three-dimensional position coordinates of the joints in the joint position information P.
For each set of the predetermined number of joints of the CG model, the posture calculation means 11 stores the positions p and postures θ^ in the learning data storage means 10₂ as posture-indefinite learning data Si.
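The text does not specify which inverse kinematics formulation is used. As a simplified stand-in, the sketch below derives each posture in closed form from the displacement between consecutive joints. It assumes a rest bone lying along +x and the rotation order R = Rz(θz) Ry(θy) Rx(θx), so that θx is the roll about the bone. These conventions, the coordinates, and the function names are illustrative assumptions.

```python
import math

def bone_posture(p_from, p_to, roll=0.0):
    """Posture (theta_x, theta_y, theta_z) of the bone from p_from to p_to.

    theta_x is the roll about the bone; the positions do not constrain it,
    so it is simply passed through as an arbitrary value.
    """
    dx, dy, dz = (b - a for a, b in zip(p_from, p_to))
    yaw = math.atan2(dy, dx)
    pitch = math.atan2(-dz, math.hypot(dx, dy))
    return (roll, pitch, yaw)

def bone_direction(posture):
    """Unit direction of the bone implied by a posture (independent of theta_x)."""
    _, pitch, yaw = posture
    return (math.cos(yaw) * math.cos(pitch),
            math.sin(yaw) * math.cos(pitch),
            -math.sin(pitch))

def postures_from_positions(positions, initial=(0.0, 0.0, 0.0)):
    """Walk the chain joint by joint, starting from an arbitrary reference posture."""
    postures = [initial]  # arbitrary posture at the reference joint (the wrist here)
    for prev, cur in zip(positions, positions[1:]):
        postures.append(bone_posture(prev, cur))
    return postures

# Illustrative joint positions in metres (assumptions, not taken from the patent).
wrist, elbow, shoulder = (0.0, 0.0, 0.0), (0.25, 0.0, 0.1), (0.3, 0.2, 0.35)
postures = postures_from_positions([wrist, elbow, shoulder])
```

Any value of `roll` reproduces the same bone direction, which is the indeterminacy that the posture correction network Ng is trained to resolve.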

The learning data selection means 12 selects and reads out learning data stored in the learning data storage means 10₂.
It may select the posture-indefinite learning data Si and the correct learning data Sc alternately, or it may select one kind of learning data only after learning with the other kind has finished. The learning data selection means 12 notifies the parameter updating means 16 of which of the two it has selected, and selects the next learning data at the timing instructed by the parameter updating means 16.

When the learning data selection means 12 selects the posture indefinite learning data Si, it outputs that learning data to the posture correction means 13.
When it selects the correct learning data Sc, it outputs that learning data to the rendering means 14.

The learning data selection means 12 selects new learning data when instructed by the parameter updating means 16 to input the next learning data.
The learning data selection means 12 ends the selection of learning data when a predetermined learning end condition is reached, for example when it has selected the entire learning data a predetermined number of times, or when the parameter updating means 16 notifies it that the amount of change in the parameter update has fallen below a predetermined threshold.
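The selection behavior described above (alternating between Si and Sc, advancing only when asked, and stopping on an end condition) can be sketched as a generator. This is an illustrative sketch only: the function name, the `converged` callable standing in for the threshold notification from the parameter updating means 16, and the assumption that both data sets have the same length are not from the source.

```python
def select_learning_data(si_data, sc_data, max_epochs, converged):
    """Yield (kind, sample) pairs, alternating Si and Sc samples.

    converged is a hypothetical callable that returns True once the
    parameter update has fallen below the predetermined threshold.
    """
    for _ in range(max_epochs):                # one pass = all learning data once
        for si, sc in zip(si_data, sc_data):   # assumes equally sized data sets
            for kind, sample in (("Si", si), ("Sc", sc)):
                if converged():                # end condition 2: update too small
                    return
                yield kind, sample             # next item is produced on request

# Usage with placeholder samples; end condition 1 is the epoch limit.
picks = list(select_learning_data(["i0", "i1"], ["c0", "c1"],
                                  max_epochs=2, converged=lambda: False))
stopped = list(select_learning_data(["i0"], ["c0"],
                                    max_epochs=2, converged=lambda: True))
```

Because a generator only advances when `next()` is called, the caller (playing the role of the parameter updating means 16) controls the timing of each selection.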

The posture correction means 13 uses the posture correction network Ng to correct the joint postures of the posture indefinite learning data Si, which is the learning data calculated by the posture calculation means 11.
The posture correction means 13 inputs the postures of all the joints in the posture indefinite learning data Si into the posture correction network Ng stored in the neural network storage means 10₁, and performs the neural network computation using the inter-layer weight coefficients, which are the parameters to be learned.
The posture correction means 13 outputs the joint positions of the posture indefinite learning data Si together with the corrected postures to the rendering means 14 as corrected learning data Si₂.
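As a rough illustration of this data flow, the sketch below passes the joint postures through a small fully connected network and leaves the joint positions untouched. The layer sizes, the tanh activation, the random initial weights, and the dictionary layout of Si and Si₂ are all assumptions made for the example; the passage above does not specify them.

```python
import math
import random

def dense(x, weights, biases):
    """Fully connected layer: y_j = sum_i w[j][i] * x[i] + b[j]."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]

def init_layer(n_in, n_out, rng):
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def correct_postures(postures, layers):
    """Map joint postures (3 angles per joint) to corrected postures."""
    x = [angle for joint in postures for angle in joint]  # flatten all joints
    for i, (w, b) in enumerate(layers):
        x = dense(x, w, b)
        if i < len(layers) - 1:
            x = [math.tanh(v) for v in x]                 # assumed hidden activation
    return [tuple(x[i:i + 3]) for i in range(0, len(x), 3)]

rng = random.Random(0)
num_joints = 3  # wrist, elbow, shoulder, as in the simplified FIG. 4 example
layers = [init_layer(num_joints * 3, 16, rng), init_layer(16, num_joints * 3, rng)]

# Hypothetical posture indefinite learning data Si.
si = {"positions": [(0.3, 1.0, 0.2), (0.1, 1.2, 0.0), (0.0, 1.5, 0.0)],
      "postures": [(0.0, 0.1, 0.2)] * num_joints}

# Corrected learning data Si2: same positions, corrected postures.
si2 = {"positions": si["positions"],
       "postures": correct_postures(si["postures"], layers)}
```

The weights here are untrained, so the output postures are meaningless; the point is only that the network consumes and produces one posture per joint while the positions pass through unchanged.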

The rendering means 14 uses the positions and postures of the joints of the CG model to generate a two-dimensional image (rendered image) by rendering the dummy model Dm stored in the dummy model storage means 10₃.
The rendering means 14 generates the two-dimensional image of the dummy model Dm from either the joint positions and corrected postures input from the posture correction means 13 (corrected learning data Si₂) or the joint positions and postures input from the learning data selection means 12 (correct learning data Sc).
For example, with a predetermined viewpoint position as the reference, the rendering means 14 maps the pattern textures associated in advance with each part of the dummy model Dm according to the positions and postures of the joints of the CG model, and applies a projection transformation to generate the two-dimensional image.
The rendering means 14 outputs the generated two-dimensional image to the identification means 15.
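The projection transformation mentioned above can be illustrated with a simple pinhole camera model. This sketch projects only joint positions, not a textured dummy model, and the camera placement, focal length, and function name are assumptions; the source states only that a predetermined viewpoint is used.

```python
def project(point, focal_length=1.0, camera_z=-3.0):
    """Pinhole projection of a 3D point onto the image plane of a camera
    placed at (0, 0, camera_z) and looking along the +z axis."""
    x, y, z = point
    depth = z - camera_z
    if depth <= 0:
        raise ValueError("point is behind the camera")
    return (focal_length * x / depth, focal_length * y / depth)

# Hypothetical joint positions (wrist, elbow, shoulder).
joints = [(0.3, 1.0, 0.2), (0.1, 1.2, 0.0), (0.0, 1.5, 0.0)]
image_points = [project(p) for p in joints]
```

A full renderer would apply the same projection to every textured surface point of the dummy model, so that differences in joint posture show up as differences in the visible pattern.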

The identification means 15 uses the identification network Nd to identify whether the image generated by the rendering means 14 is an image with a correct posture.
The identification means 15 inputs the image generated by the rendering means 14 into the identification network Nd stored in the neural network storage means 10₁, and performs the neural network computation using the kernel weight coefficients, which are the parameters to be learned.
The identification means 15 outputs the computation result (identification result) of the identification network Nd, a value greater than 0 and less than 1, to the parameter updating means 16.

The parameter updating means 16 updates the parameters of the identification network Nd so as to reduce both the cross entropy between the identification result for the image Ir generated from the correct learning data Sc and the true value, and the cross entropy between the identification result for the image Ig generated from the corrected learning data Si₂ and the false value. It updates the parameters of the posture correction network Ng so as to increase the cross entropy between the identification result for the image Ig generated from the corrected learning data Si₂ and the false value.

Here, the parameter updating means 16 updates the parameters of the posture correction network Ng and the identification network Nd so as to maximize and minimize, respectively, the loss values of predetermined loss functions.
Two loss functions are used: one for training the identification network Nd (hereinafter, the identification loss function) and one for training the posture correction network Ng (hereinafter, the posture correction loss function).

The identification loss function computes, as its loss value, the sum of the cross entropy between the identification result and the true value when the learning data is the correct learning data Sc, and the cross entropy between the identification result and the false value when the learning data is the corrected learning data Si₂.
The posture correction loss function computes, as its loss value, the cross entropy between the identification result and the false value when the learning data is the corrected learning data Si₂.

Here, let D(Ig) be the identification result that the identification means 15 outputs for the image Ig when the learning data is the corrected learning data Si₂, and let D(Ir) be the identification result that the identification means 15 outputs for the image Ir when the learning data is the correct learning data Sc. Let the true value be "Real (=1)" and the false value be "Fake (=0)", and let L_bce be the function that computes the binary cross entropy.
The learning data selection means 12 notifies the parameter updating means 16 of whether the learning data is the corrected learning data Si₂, obtained by correcting the posture indefinite learning data Si, or the correct learning data Sc.
In this case, the parameter updating means 16 updates the parameters of the identification network Nd so as to minimize the identification loss function L^D shown in equation (1) below.
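Equation (1) is not reproduced in this text. A form consistent with the definitions given here (the sum of two binary cross entropy terms, with L_bce taking a prediction a and a label b) would be the following; it is a reconstruction from the surrounding description, and the standard definition of binary cross entropy is assumed for L_bce.

```latex
L^{D} = L_{bce}\bigl(D(I_r),\ \mathrm{Real}\bigr) + L_{bce}\bigl(D(I_g),\ \mathrm{Fake}\bigr) \tag{1}
\qquad
L_{bce}(a, b) = -\,b \log a - (1 - b) \log(1 - a)
```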

Here, a is a real number in the open interval between 0 and 1 [a ∈ (0, 1)], and b is either 0 or 1 [b ∈ {0, 1}].
With the loss function of equation (1), the parameter updating means 16 trains the identification network Nd to identify the image Ir as "true" and the image Ig as "false".
The parameter updating means 16 also updates the parameters of the posture correction network Ng so as to maximize the posture correction loss function L^G shown in equation (2) below.
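Equation (2) is likewise not reproduced in this text. From the description of the posture correction loss function as the cross entropy between the identification result for Ig and the false value, a consistent form would be the following, assuming the standard binary cross entropy L_bce(a, b) = −b log a − (1 − b) log(1 − a); it is a reconstruction, not the published formula.

```latex
L^{G} = L_{bce}\bigl(D(I_g),\ \mathrm{Fake}\bigr) = -\log\bigl(1 - D(I_g)\bigr) \tag{2}
```

Under this form, L^G grows as D(Ig) approaches 1, so maximizing it pushes the identification network toward judging the corrected image as "true".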

With the loss function of equation (2), the parameter updating means 16 trains the posture correction network Ng so that the identification network Nd does not identify the image Ig as "false".
The parameter updating means 16 updates the parameters of the identification network Nd and the posture correction network Ng by error backpropagation so as to minimize the value of the loss function of equation (1) and maximize that of equation (2), respectively.
After updating the parameters, the parameter updating means 16 instructs the learning data selection means 12 to select the next learning data. When the amount of change in the parameter update falls below a predetermined threshold, the parameter updating means 16 notifies the learning data selection means 12 accordingly.
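The two loss functions described in this section can be written out directly. The sketch below assumes the standard binary cross entropy for L_bce; the function names and the sample identification results are illustrative and not from the source.

```python
import math

REAL, FAKE = 1, 0

def bce(a, b):
    """Binary cross entropy for a prediction a in (0, 1) and a label b in {0, 1}."""
    return -(b * math.log(a) + (1 - b) * math.log(1 - a))

def identification_loss(d_ir, d_ig):
    """Loss minimized when updating the identification network Nd."""
    return bce(d_ir, REAL) + bce(d_ig, FAKE)

def posture_correction_loss(d_ig):
    """Loss maximized when updating the posture correction network Ng."""
    return bce(d_ig, FAKE)

# Hypothetical identification results, chosen only for illustration.
print(identification_loss(0.9, 0.1))   # small: Nd separates Ir from Ig well
print(posture_correction_loss(0.1))    # small: Ng has not yet fooled Nd
print(posture_correction_loss(0.9))    # large: Ig is judged close to "true"
```

The same term, the cross entropy between D(Ig) and the false value, appears in both losses with opposite objectives, which is what sets the two networks against each other during training.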

As described above, the posture correction network learning device 1 can learn a posture correction network Ng that corrects postures so that, even for position and posture information of the joints of a CG model in which the posture is indeterminate, the identification network Nd identifies the result as information with a correct posture.
The posture correction network learning device 1 can be operated by a posture correction network learning program that causes a computer to function as each of the means described above.

<Operation of posture correction network learning device>
Next, the operation of the posture correction network learning device 1 will be described with reference to FIG. 5 (see FIG. 2 as appropriate for the configuration).

In step S1, joint position and posture information Q, in which the positions and postures of the joints of the CG model are correctly represented, is stored in advance in the learning data storage means 10₂ as correct learning data Sc.
In step S2, the posture calculation means 11 calculates the posture of each joint from the joint positions of the CG model, which constitute the joint position information P, by inverse kinematics calculation, a method of controlling the joints of a 3D CG model. The posture calculation means 11 then stores the calculated postures and the joint positions in the learning data storage means 10₂ as posture indefinite learning data Si.

In step S3, the learning data selection means 12 selects and reads out learning data (posture indefinite learning data Si or correct learning data Sc) from the learning data storage means 10₂.
In step S4, the learning data selection means 12 switches the subsequent processing depending on whether it selected the posture indefinite learning data Si or the correct learning data Sc as the learning data.

If the learning data selected in step S3 is the posture indefinite learning data Si (Yes in step S4), then in step S5 the posture correction means 13 performs the neural network computation using the posture correction network Ng to correct the joint postures of the posture indefinite learning data Si and generate corrected learning data Si₂. The posture correction network learning device 1 then proceeds to step S6.
If, on the other hand, the learning data selected in step S3 is the correct learning data Sc (No in step S4), the posture correction network learning device 1 proceeds directly to step S6.

In step S6, the rendering means 14 generates a two-dimensional image by rendering the dummy model Dm, using either the correct learning data Sc selected in step S3 or the corrected learning data Si₂ obtained in step S5 by correcting the posture indefinite learning data Si selected in step S3.

In step S7, the identification means 15 performs the neural network computation using the identification network Nd to identify whether the image generated in step S6 is an image with a correct posture. The identification means 15 calculates the identification result as a value greater than 0 and less than 1, where 0 represents "false" and 1 represents "true".

In step S8, the parameter updating means 16 updates the parameters of the identification network Nd and the posture correction network Ng so as to decrease the predetermined identification loss function and increase the posture correction loss function.
Specifically, as shown in equation (1), when the image generated in step S6 is the image Ir generated from the correct learning data Sc, the parameter updating means 16 updates the parameters of the identification network Nd so as to reduce the cross entropy between the identification result D(Ir) and the value "1" indicating the true value. When the image generated in step S6 is the image Ig generated from the corrected learning data Si₂, obtained by correcting the posture indefinite learning data Si, the parameter updating means 16 updates the parameters of the identification network Nd so as to reduce the cross entropy between the identification result D(Ig) and the value "0" indicating the false value.

In addition, as shown in equation (2), when the image generated in step S6 is the image Ig generated from the corrected learning data Si₂, obtained by correcting the posture indefinite learning data Si, the parameter updating means 16 updates the parameters of the posture correction network Ng so as to increase the cross entropy between the identification result D(Ig) and the value "0" indicating the false value.

In step S9, the learning data selection means 12 determines whether a predetermined end condition for learning has been satisfied. Here, for example, if the learning data selection means 12 has not yet selected the entire learning data the predetermined number of times (No in step S9), it determines that learning is not complete, and the posture correction network learning device 1 returns to step S2 and continues learning.
If, on the other hand, the learning data selection means 12 has selected the entire learning data the predetermined number of times (Yes in step S9), it determines that learning is complete, and the posture correction network learning device 1 ends the operation.
Through the above operation, it is possible to learn a posture correction network Ng that can correct the posture even for position and posture information of the joints of a CG model in which the posture is indeterminate.
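The control flow of steps S3 to S9 can be summarized in a short skeleton, with the learning data of steps S1 and S2 assumed to be prepared beforehand. The callables `correct`, `render`, `identify`, and `update` are hypothetical stand-ins for means 13, 14, 15, and 16, and their signatures are assumptions; no actual network or renderer is implemented here.

```python
def train(si_data, sc_data, epochs, correct, render, identify, update):
    """Control flow of steps S3 to S9 for a fixed number of passes over the data."""
    for _ in range(epochs):                       # S9: predetermined number of passes
        samples = ([("Si", s) for s in si_data] +
                   [("Sc", s) for s in sc_data])
        for kind, sample in samples:              # S3: select learning data
            if kind == "Si":                      # S4: branch on the kind of data
                sample = correct(sample)          # S5: Si -> Si2 via Ng
            image = render(sample)                # S6: render the dummy model Dm
            result = identify(image)              # S7: value in (0, 1) from Nd
            update(kind, result)                  # S8: update Nd (and Ng for Si)

# Usage with placeholder callables that only record the flow.
log = []
train(["i0"], ["c0"], epochs=2,
      correct=lambda s: s + "'",
      render=lambda s: "img(" + s + ")",
      identify=lambda image: 0.5,
      update=lambda kind, result: log.append((kind, result)))
```

Only samples of kind Si pass through the correction step; correct learning data Sc goes straight to rendering, matching the branch at step S4.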

<Processing of posture estimation device>
Next, the processing of the posture estimation device 2 (FIG. 7) according to the embodiment of the present invention will be described with reference to FIG. 6.
The posture estimation device 2 uses the posture correction network Ng, which is a neural network, to estimate the postures of joints from the positions of the joints in three-dimensional space.
Here, the posture estimation device 2 receives the joint position information P as input and corrects the postures.
The joint position information P is the set {p} of joint positions (three-dimensional position coordinates) p for all the joints, and is the same as the joint position information P described with reference to FIG. 1.

For each set {p} of joint positions constituting the joint position information P, the posture estimation device 2 calculates the set {θ^} of postures θ^ by inverse kinematics calculation and generates the set of posture parameters {p, θ^}.
The posture estimation device 2 then uses the posture correction network Ng to correct the set of postures {θ^} corresponding to the set of joint positions {p} into {θ^~}.
The posture correction network Ng is the neural network trained by the posture correction network learning device 1 (FIG. 2).
Rendering the 3D model C with the set of posture parameters {p, θ^~} corrected in this way makes it possible to generate a two-dimensional image Ig in which the posture is drawn naturally.
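The estimation processing is a composition of two stages, which the sketch below makes explicit. The function name and the two callables are hypothetical; the stand-in lambdas in the usage example do not perform real inverse kinematics or network inference.

```python
def estimate_postures(positions, inverse_kinematics, correction_network):
    """Return pairs (p, corrected posture) from the joint positions {p}."""
    initial = inverse_kinematics(positions)    # {theta^}: indeterminate about bones
    corrected = correction_network(initial)    # {theta^~}: output of Ng
    return list(zip(positions, corrected))

# Usage with stand-in stages and hypothetical joint positions.
positions = [(0.3, 1.0, 0.2), (0.1, 1.2, 0.0)]
q = estimate_postures(
    positions,
    inverse_kinematics=lambda ps: [(0.0, 0.0, 0.0) for _ in ps],
    correction_network=lambda thetas: [(x, y, z + 0.1) for x, y, z in thetas],
)
```

The positions are passed through unchanged; only the postures are replaced by the network output.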

<Configuration of posture estimation device>
Next, the configuration of the posture estimation device 2 according to the embodiment of the present invention will be described with reference to FIG. 7 (see FIG. 6 as appropriate).
As shown in FIG. 7, the posture estimation device 2 includes storage means 20, posture calculation means 21, and posture correction means 22.

The storage means 20 stores the posture correction network Ng in advance. The storage means 20 can be a general storage medium such as a hard disk or a semiconductor memory.
The posture correction network Ng is a neural network that corrects the postures of the predetermined joints of the CG model, and has been trained in advance by the posture correction network learning device 1 (FIG. 2).

The posture calculation means 21 calculates, from the joint positions (three-dimensional position coordinates) constituting the joint position information P, the rotation angles about the three axes of three-dimensional space as the posture of each joint. General inverse kinematics calculation may be used to calculate the joint postures from the three-dimensional position coordinates. The posture calculation means 21 can be the same as the posture calculation means 11 of the posture correction network learning device 1.
The posture calculation means 21 outputs the position p and the posture θ^ of each of the predetermined joints of the CG model to the posture correction means 22.

The posture correction means 22 uses the posture correction network Ng to correct the joint postures calculated by the posture calculation means 21.
The posture correction means 22 inputs the joint postures calculated by the posture calculation means 21 into the posture correction network Ng stored in the storage means 20, and performs the neural network computation using the trained inter-layer weight coefficients of the posture correction network Ng.
In this way, the posture correction means 22 can correct a posture calculated by the posture calculation means 21, which is indeterminate about the axis (bone) between joints, into a correct posture.
The posture correction means 22 outputs, as the estimation result, joint position and posture information Q in which each joint position is paired with the estimated joint posture.
The posture estimation device 2 can be operated by a posture estimation program that causes a computer to function as each of the means described above.

As described above, the posture estimation device 2 can estimate the correct postures of joints from the positions of the joints of the CG model.
The operation of the posture estimation device 2 consists simply of running the posture calculation means 21 and the posture correction means 22 in succession, so a detailed description is omitted.

The posture correction network learning device 1 and the posture estimation device 2 according to the embodiment of the present invention have been described above, but the present invention is not limited to this embodiment.
For example, the posture correction network learning device 1 described above receives the joint position information P as input and uses the posture calculation means 11 to calculate the posture corresponding to each joint by inverse kinematics calculation.
However, this posture calculation may instead be performed externally in advance.
In that case, the joint position information P input to the posture correction network learning device 1 may be information in which a posture has been added for each joint, and the posture calculation means 11 may be omitted from the configuration of the posture correction network learning device 1.
The same applies to the posture estimation device 2.

Furthermore, although the positions and postures of joints have been described here using the joints of a human body as an example, they may be the positions and postures of the joints of any animal that has joints, such as a dog or a cat.

1 Posture correction network learning device
10 Storage means
10₁ Neural network storage means
10₂ Learning data storage means
10₃ Dummy model storage means
11 Posture calculation means
12 Learning data selection means
13 Posture correction means
14 Rendering means
15 Identification means
16 Parameter updating means
2 Posture estimation device
20 Storage means
21 Posture calculation means
22 Posture correction means
Ng Posture correction network
Nd Identification network
Si Posture indefinite learning data
Si₂ Corrected learning data
Sc Correct learning data
Dm Dummy model

Claims

A posture correction network learning device that learns a posture correction network, which is a neural network that corrects postures associated with the positions of joints of a CG model in three-dimensional space, the device comprising:
posture correction means that uses the posture correction network to calculate corrected learning data from posture indefinite learning data, which is learning data in which the posture is indefinite;
rendering means that, for each of correct learning data, in which the posture is correct, and the corrected learning data, renders a three-dimensional dummy model in which a predetermined pattern texture is set for each part of the CG model, and generates a two-dimensional rendered image;
identification means that calculates an identification result from the rendered image using an identification network, which is a neural network that identifies the rendered image generated from the correct learning data as a true value and the rendered image generated from the corrected learning data as a false value; and
parameter updating means that updates the parameters of the identification network so as to reduce the cross entropy between the true value and the identification result of the identification means for the rendered image generated from the correct learning data and to reduce the cross entropy between the false value and the identification result of the identification means for the rendered image generated from the corrected learning data, and that updates the parameters of the posture correction network so as to increase the cross entropy between the false value and the identification result of the identification means for the rendered image generated from the corrected learning data.

The posture correction network learning device according to claim 1, further comprising posture calculation means that calculates the posture of each joint from the positions of the joints of the CG model in three-dimensional space by inverse kinematics calculation, which is a method of controlling the joints, and generates the posture indefinite learning data.

A posture correction network learning program for causing a computer to function as the posture correction network learning device according to claim 1 or 2.

A posture estimation device that estimates the posture of a CG model, comprising posture correction means that corrects the posture associated with the positions of the joints, using a posture correction network that is a neural network trained by the posture correction network learning device according to claim 1 and that corrects postures associated with the positions of joints of the CG model in three-dimensional space.

A posture estimation program for causing a computer to function as the posture estimation device according to claim 4.