JP2013257656A

JP2013257656A - Motion similarity calculation device, motion similarity calculation method, and computer program

Info

Publication number: JP2013257656A
Application number: JP2012132229A
Authority: JP
Inventors: Kenho Jo; 建鋒徐; Emi Meido; 絵美明堂; Shigeyuki Sakasawa; 茂之酒澤
Original assignee: KDDI Corp
Current assignee: KDDI Corp
Priority date: 2012-06-11
Filing date: 2012-06-11
Publication date: 2013-12-26
Anticipated expiration: 2032-06-11
Also published as: JP5837860B2

Abstract

[Problem] To improve robustness when calculating the similarity of motion between motion data.
[Solution] A motion similarity calculation device includes: a spatial feature amount calculation unit 12 that calculates, from motion data, a spatial feature amount representing the spatial distribution of motion in a principal component space; a temporal feature amount calculation unit 13 that calculates, from the motion data, a temporal feature amount representing the temporal distribution of motion in the principal component space; and a similarity calculation unit 14 that calculates the similarity of motion using the distance between the spatial feature amounts and the distance between the temporal feature amounts calculated for each of two motion data.
[Selected drawing] FIG. 1

Description

The present invention relates to a motion similarity calculation device, a motion similarity calculation method, and a computer program.

In recent years, known methods of acquiring motion capture data include a method that acquires motion capture data from video of a person with markers attached to joints and other body parts, and a method that uses a depth sensor system combining an RGB camera and a depth sensor to measure the distance from the depth sensor system to the person. In the former method, human skeleton type motion data such as that illustrated in FIG. 6 is defined as the motion capture data. A known example of the latter method is the “Kinect (registered trademark) sensor”.

Here, the human skeleton type motion data illustrated in FIG. 6 is described as data representing the motion of the posture (pose) of a human body. FIG. 6 is a schematic diagram of a definition example of human skeleton type motion data. Human skeleton type motion data is based on the human skeleton and uses bones and the connection points of bones (joints). One joint is taken as the root, and the structure of the bones connected in sequence from the root via the joints is defined as a tree structure. FIG. 6 shows only part of the definition of the human skeleton type motion data. In FIG. 6, joint 100 is the waist and is defined as the root. Joint 101 is the left elbow, joint 102 is the left wrist, joint 103 is the right elbow, joint 104 is the right wrist, joint 105 is the left knee, joint 106 is the left ankle, joint 107 is the right knee, joint 108 is the right ankle, joint 109 is the clavicle, joints 110 and 111 are the shoulders, joint 112 is the neck, and joints 113 and 114 are the hip joints.

Skeleton type motion data is data that records the motion of each joint of a skeleton type object; a human body, an animal, a robot, or the like can serve as the skeleton type object. Position information, angle information, velocity information, acceleration information, and the like of each joint can be used as skeleton type motion data. Here, angle information and acceleration information of the human skeleton are described as examples of human skeleton type motion data.

Human skeleton type angle information data represents a series of human motions as a sequence of poses. It has basic pose data representing the person's basic pose (neutral pose) and per-pose frame data representing each pose in the person's actual motion. The basic pose data has information such as the position of the root and the position of each joint in the basic pose, and the length of each bone. The basic pose is specified by the basic pose data. The frame data represents the amount of movement from the basic pose for each joint. Here, angle information is used as the amount of movement. Each frame data specifies a pose in which the respective amounts of movement are applied to the basic pose. The sequence of poses specified by the frame data thus specifies a series of human motions. Human skeleton type angle information data can be created by motion capture processing from video of a person's motion shot with a camera, or created manually by keyframe animation.

Human skeleton type acceleration information data represents the acceleration of each joint of a person as frame data for each pose and a sequence of multiple poses. It can be recorded with an accelerometer or calculated from video or motion data.

Conventional techniques for calculating the similarity between two motions generally consist of the following two steps, Step 1 and Step 2.

Step 1: The distance between frames of two motion capture data is calculated. For example, Non-Patent Document 1 describes the following two methods for calculating the distance between frames.
(Method 1 for calculating the distance between frames)
In human skeleton type motion data, an angle distance is defined for each joint, and a weighted average is taken over the joints.
(Method 2 for calculating the distance between frames)
The human skeleton type motion data is converted into a latent space, and the Euclidean distance is calculated between coordinates in the latent space.
In the prior art described in Non-Patent Document 2 and Patent Document 1, a feature amount of the motion of each joint is calculated, a distance between the feature amounts is defined, and a weighted average is taken over the joints.

Step 2: All frames are taken into account when calculating the similarity between the two motions. For example, DTW (Dynamic Time Warping), which is the most common method, is used as described in Non-Patent Document 3.
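The two conventional steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the method of any cited document: each frame is assumed to be a list holding one angle in radians per joint, the joint weights are hypothetical, and the DTW recurrence is the textbook form with unit steps.

```python
import math


def frame_distance(angles_a, angles_b, weights):
    """Step 1 (Method 1): weighted average of per-joint angle differences.

    Assumption: each frame holds one angle in radians per joint.
    """
    diffs = []
    for a, b in zip(angles_a, angles_b):
        d = abs(a - b) % (2 * math.pi)
        diffs.append(min(d, 2 * math.pi - d))  # wrap the difference into [0, pi]
    return sum(w * d for w, d in zip(weights, diffs)) / sum(weights)


def dtw_distance(motion_a, motion_b, weights):
    """Step 2: accumulate frame distances along the cheapest warping path."""
    n, m = len(motion_a), len(motion_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(motion_a[i - 1], motion_b[j - 1], weights)
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

Because DTW may repeat frames, a motion and a slowed-down copy of it that repeats one pose get a distance of zero in this sketch.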

Patent Document 1: JP 2010-033163 A

Non-Patent Document 1: B.J.H. van Basten, A. Egges, “Evaluating distance metrics for animation blending,” ICFDG 2009, pp. 199-206.
Non-Patent Document 2: Meinard Muller, Tido Roder, and Michael Clausen, “Efficient content-based retrieval of motion capture data,” ACM SIGGRAPH 2005, pp. 677-685.
Non-Patent Document 3: Armin Bruderlin and Lance Williams, “Motion signal preprocess,” ACM SIGGRAPH 1995, pp. 97-104.
Non-Patent Document 4: Yossi Rubner, Carlo Tomasi, Leonidas J. Guibas, “A Metric for Distributions with Applications to Image Databases,” Proceedings ICCV 1998, pp. 59-66.

However, the conventional motion similarity calculation techniques described above have the following problems.
(1) It is difficult to calculate motion similarity in a common way for motion capture data acquired by different motion capture systems. For example, the conventional techniques cannot be applied when the bone structure definitions of the human skeleton type motion data differ.
(2) When the shooting environment for acquiring the motion capture data (the person who is the subject, the shooting distance, the shooting angle, and so on) differs, it is difficult to calculate motion similarity in a common way.
(3) When the type of input data (for example, human skeleton type motion data, motion capture data acquired by a depth sensor system, RGB image data, or distance image data) differs, it is difficult to calculate motion similarity in a common way.
(4) When the data acquired by a motion capture system or the like contains noise, it is difficult to calculate a robust similarity.

The present invention has been made in view of these circumstances, and its object is to provide a motion similarity calculation device, a motion similarity calculation method, and a computer program with improved robustness.

To solve the above problems, a motion similarity calculation device according to the present invention includes: a spatial feature amount calculation unit that calculates, from motion data, a spatial feature amount representing the spatial distribution of motion in a principal component space; a temporal feature amount calculation unit that calculates, from the motion data, a temporal feature amount representing the temporal distribution of motion in the principal component space; and a similarity calculation unit that calculates the similarity of motion using the distance between the spatial feature amounts and the distance between the temporal feature amounts calculated for each of two motion data.

In the motion similarity calculation device according to the present invention, the spatial feature amount calculation unit calculates a histogram using the distances between frames with respect to the principal components of the motion of each frame in the motion data.

In the motion similarity calculation device according to the present invention, the temporal feature amount calculation unit calculates the distance between a reference frame in the motion data and each of the other frames, obtains extreme values from the calculated inter-frame distances, and interpolates or thins out the inter-frame distance values so that the number of frames between adjacent extreme values becomes a fixed number.

In the motion similarity calculation device according to the present invention, the temporal feature amount calculation unit calculates the distance between a reference frame in the motion data and each of the other frames, obtains extreme values from the calculated inter-frame distances, deletes an extreme value when the interval between extreme values is extremely short, and inserts an appropriate extreme value when the interval between extreme values is extremely long.

In the motion similarity calculation device according to the present invention, the similarity calculation unit calculates a weighted average of the distance between the spatial feature amounts and the distance between the temporal feature amounts.

A motion similarity calculation method according to the present invention includes: a step of calculating, from motion data, a spatial feature amount representing the spatial distribution of motion in a principal component space; a step of calculating, from the motion data, a temporal feature amount representing the temporal distribution of motion in the principal component space; and a step of calculating the similarity of motion using the distance between the spatial feature amounts and the distance between the temporal feature amounts calculated for each of two motion data.

A computer program according to the present invention causes a computer to execute: a step of calculating, from motion data, a spatial feature amount representing the spatial distribution of motion in a principal component space; a step of calculating, from the motion data, a temporal feature amount representing the temporal distribution of motion in the principal component space; and a step of calculating the similarity of motion using the distance between the spatial feature amounts and the distance between the temporal feature amounts calculated for each of two motion data.

According to the present invention, robustness can be improved when calculating the similarity of motion between motion data.

FIG. 1 is a block diagram showing the configuration of a motion similarity calculation device 1 according to an embodiment of the present invention.
FIG. 2 is a block diagram showing the configuration of the spatial feature amount calculation unit 12 shown in FIG. 1.
FIG. 3 is an example of a histogram.
FIG. 4 is a graph for explaining the temporal feature amount calculation processing according to an embodiment of the present invention.
FIG. 5 is a schematic diagram showing the procedure of the principal component analysis method according to an embodiment of the present invention.
FIG. 6 is a definition example of human skeleton type motion data.

Hereinafter, embodiments of the present invention will be described with reference to the drawings.
FIG. 1 is a block diagram showing the configuration of a motion similarity calculation device 1 according to an embodiment of the present invention. In FIG. 1, the motion similarity calculation device 1 has an input unit 11, a spatial feature amount calculation unit 12, a temporal feature amount calculation unit 13, a similarity calculation unit 14, and an output unit 15. The motion similarity calculation device 1 according to the present embodiment calculates the similarity of motion between motion data. Motion data is data representing the motion of a person. Examples of motion data include motion capture data, RGB image data, and distance image data. The following description uses motion capture data as an example of motion data.

[Input unit]
The input unit 11 inputs motion capture data as motion data. Types of motion capture data include, for example, human skeleton type motion data and motion capture data acquired by a depth sensor system.

[Spatial feature amount calculation unit]
FIG. 2 is a block diagram showing the configuration of the spatial feature amount calculation unit 12 shown in FIG. 1. In FIG. 2, the spatial feature amount calculation unit 12 has a physical quantity changing unit 21, a principal component analysis unit 22, and a histogram calculation unit 23. The spatial feature amount calculation unit 12 calculates a spatial feature amount for the motion of the person represented by the motion capture data input from the input unit 11. The spatial feature amount represents the spatial distribution of the motion. In calculating the spatial feature amount, relatively large motions are emphasized and relatively fine motions are ignored. In addition, robustness is improved by the principal component analysis and the histogram calculation.

[Physical quantity changing unit]
The physical quantity changing unit 21 generates, from the input motion capture data, the data to be input to the principal component analysis unit 22. An example in which this data is generated from human skeleton type motion data is described below.

The physical quantity changing unit 21 calculates where each joint is moving relative to the root during the period of motion represented by the human skeleton type motion data. Here, the joint relative position at time t is calculated using human skeleton type angle information data, which is one kind of human skeleton type motion data. The joint relative position is the position of a joint relative to the root.

First, the physical quantity changing unit 21 calculates the joint positions using the basic pose data and the frame data in the human skeleton type angle information data. The basic pose data has information specifying the basic pose, such as the position of the root and the position of each joint in the basic pose, and the length of each bone. The frame data has, for each joint, information on the amount of movement from the basic pose. Here, angle information is used as the amount of movement. In this case, the position p^k(t) of the k-th joint at time t is calculated by equations (1) and (2). p^k(t) is expressed in three-dimensional coordinates. Time t is the time of the frame data. In the present embodiment, time t is treated simply as a “frame index”, so t takes the values 0, 1, 2, ..., T−1, where T is the number of frames included in the human skeleton type angle information data.

Here, the 0th joint (i = 0) is the root. R_axis^{i−1,i}(t) is the coordinate rotation matrix between the i-th joint and its parent joint (the (i−1)-th joint), and is included in the basic pose data. A local coordinate system is defined for each joint, and the coordinate rotation matrix represents the correspondence between the local coordinate systems of joints in a parent-child relationship. R^i(t) is the rotation matrix of the i-th joint in the local coordinate system of the i-th joint, and is the angle information included in the frame data. T^i(t) is the transition matrix between the i-th joint and its parent joint, and is included in the basic pose data. The transition matrix represents the length of the bone between the i-th joint and its parent joint.

Next, the physical quantity changing unit 21 calculates, by equation (3), the relative position p'^k(t) of the k-th joint with respect to the root at time t (the joint relative position).

Here, p^root(t) is the position of the root (the 0th joint) at time t, that is, p^0(t).

As a result, the frame x(t) at time t is expressed as “x(t) = {p'^1(t), p'^2(t), ..., p'^K(t)}^T”, where K is the number of joints excluding the root and ^T denotes the transpose.
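As a rough illustration of how a frame vector x(t) could be assembled, the sketch below subtracts the root position from every other joint position and concatenates the results. Equation (3) is not reproduced in this text, so the subtraction is an assumption based on the description of p'^k(t) as a position relative to the root. The function name and the list-of-tuples input format are hypothetical.

```python
def frame_vector(joint_positions, root_index=0):
    """Build x(t) from absolute 3-D joint positions of one frame.

    joint_positions: list of (x, y, z) tuples, one per joint, root included.
    Returns a flat list of length 3 * K, where K excludes the root.
    """
    root = joint_positions[root_index]
    x = []
    for k, p in enumerate(joint_positions):
        if k == root_index:
            continue  # the root itself is not part of x(t)
        # Assumed form of equation (3): p'^k(t) = p^k(t) - p^root(t)
        x.extend(c - r for c, r in zip(p, root))
    return x
```

Stacking these vectors for all frames as columns gives a matrix with 3 × K rows, which matches the shape the text gives for the joint relative position data.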

[Principal component analysis unit]
The principal component analysis unit 22 performs principal component analysis processing on the joint relative position data generated by the physical quantity changing unit 21. Here, using the frame x(t) at time t, the joint relative position data X is expressed as “X = {x(t1), x(t2), ..., x(tN)}”, where N is the number of frames included in the joint relative position data. X is a matrix of M rows and N columns (where M = 3 × K).

In the principal component analysis processing, principal component analysis is performed on X to convert X into the principal component space.

The principal component analysis method is as follows.
First, by equation (4), an N-row, M-column matrix D is calculated by removing the mean from X.

Next, by equation (5), singular value decomposition is performed on the N-row, M-column matrix D.

Here, U is an N-row, N-column unitary matrix. Σ is an N-row, M-column diagonal matrix whose non-negative diagonal elements are in descending order, and represents the variance of the coordinates in the principal component space. V is an M-row, M-column unitary matrix and holds the coefficients for the principal components.

Next, by equation (6), the N-row, M-column matrix D is converted into the principal component space. The M-row, N-column matrix Y represents the coordinates in the principal component space.

In the principal component analysis processing, the matrix Y representing the coordinates in the principal component space (the principal component coordinate matrix) and the matrix V of coefficients for the principal components (the principal component coefficient matrix) are stored in memory.

The matrix X representing the coordinates in the original space and the principal component coordinate matrix Y can be converted into each other by equations (6) and (7).

The conversion can also be performed with only the top r principal components, by equation (8).

Here, V^r is the M-row, r-column matrix consisting of the top r columns of the principal component coefficient matrix V. Y^r is the r-row, N-column matrix consisting of the top r rows of the principal component coordinate matrix Y. X̃ is the restored M-row, N-column matrix.
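The principal component analysis steps can be sketched with NumPy as below. Equations (4) to (8) are not reproduced in this text, so the concrete forms used here are assumptions chosen to match the stated matrix shapes: D = (X − mean)^T for (4), D = U Σ V^T for (5), Y = (D V)^T for (6), X = mean + V Y for (7), and X̃ = mean + V^r Y^r for (8). The function names are hypothetical.

```python
import numpy as np


def pca_transform(X):
    """X is M x N: one row per coordinate, one column per frame."""
    mean = X.mean(axis=1, keepdims=True)   # M x 1, mean over frames
    D = (X - mean).T                       # N x M, assumed form of eq. (4)
    # Assumed eq. (5): D = U @ Sigma @ V.T, singular values in descending order
    U, s, Vt = np.linalg.svd(D, full_matrices=True)
    V = Vt.T                               # M x M principal component coefficients
    Y = (D @ V).T                          # M x N, assumed form of eq. (6)
    return Y, V, mean


def reconstruct(Y, V, mean, r):
    """Restore an M x N matrix from the top r principal components."""
    Vr = V[:, :r]                          # M x r
    Yr = Y[:r, :]                          # r x N
    return mean + Vr @ Yr                  # assumed form of eq. (8)
```

With r equal to M the reconstruction returns X up to rounding error, and a smaller r discards the components with the least variance.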

It is also possible to apply the principal component analysis processing to only some of the degrees of freedom of the original space. For example, the principal component analysis processing is performed by equations (4), (5), and (6) on an M'-row, N-column matrix X' generated only from the joint relative position data for the legs. It is also possible to apply the principal component analysis processing to only some of the frames in the joint relative position data. For example, the principal component analysis processing is performed by equations (4), (5), and (6) on an M-row, N'-column matrix X' generated only from the first N' frames.

[Histogram calculation unit]
The histogram calculation unit 23 calculates a histogram using the principal component coordinate matrix Y calculated by the principal component analysis unit 22. This histogram is the spatial feature amount according to the present embodiment. In the histogram calculation processing, the histogram is calculated using the r-row, N-column matrix Y^r consisting of the top r rows of the principal component coordinate matrix Y. The value of r can be set arbitrarily.

Like the joint relative position data X, the matrix Y^r has N frames and is expressed as “Y^r = {y^r(t1), y^r(t2), ..., y^r(tN)}”. The histogram calculation unit 23 calculates the Euclidean distance by equation (9) for all combinations of frames of the matrix Y^r.

Next, the histogram calculation unit 23 uses the calculated Euclidean distances d(i, j) to calculate the data of a histogram (frequency distribution chart) with a predetermined number of bins. A histogram is a bar chart representing a frequency distribution. FIG. 3 is an example of a histogram. In the present embodiment, the horizontal axis of the histogram shows the Euclidean distance d(i, j), and the vertical axis shows the frequency of each Euclidean distance d(i, j). The bin widths on the horizontal axis are set at equal intervals based on the maximum and minimum values of the Euclidean distance d(i, j). As a result, for example, a histogram “H = {h(1), h(2), ..., h(64)}” consisting of 64 bins is calculated.
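The histogram calculation can be sketched as follows. The sketch assumes that “all combinations of frames” means every unordered pair i < j, and that equation (9), which is not reproduced in this text, is the ordinary Euclidean distance between two r-dimensional frames. The function name is hypothetical, and at least two frames are required.

```python
import math
from itertools import combinations


def distance_histogram(frames, num_bins=64):
    """frames: list of r-dimensional points, i.e. the columns of Y^r."""
    # Assumed eq. (9): Euclidean distance for every unordered frame pair
    dists = [math.dist(a, b) for a, b in combinations(frames, 2)]
    lo, hi = min(dists), max(dists)
    # Equal-width bins spanning [min, max]; guard against all-equal distances
    width = (hi - lo) / num_bins if hi > lo else 1.0
    hist = [0] * num_bins
    for d in dists:
        idx = min(int((d - lo) / width), num_bins - 1)  # max value goes in last bin
        hist[idx] += 1
    return hist
```

For N frames the histogram counts N(N−1)/2 distances in total, so the pairwise step grows quadratically with the length of the motion.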

This concludes the description of the spatial feature amount calculation unit 12.

[Temporal feature amount calculation unit]
The temporal feature amount calculation unit 13 calculates a temporal feature amount for the motion of the person represented by the motion capture data input from the input unit 11. The temporal feature amount represents the temporal distribution of the motion.

First, the temporal feature amount calculation unit 13 calculates the distance between a reference frame in the motion capture data and each of the other frames. In the present embodiment, the temporally first frame (the first frame) is used as the reference frame. As a result, the distances “D = {d(t2), d(t3), ..., d(tN)}” between the first frame and each of the other frames are calculated. The method described in Non-Patent Document 1 can be used to calculate the inter-frame distance. Alternatively, the distance may be calculated as the Euclidean distance by equation (9), using the principal component coordinate matrix Y calculated by the principal component analysis unit 22.

Next, the temporal feature amount calculation unit 13 obtains extreme values from the calculated inter-frame distances D. At this time, as illustrated in FIG. 4, the first inter-frame distance 301 (the distance between the first frame and the second frame) and the last inter-frame distance 306 (the distance between the first frame and the N-th frame) are also added as extreme values. In the example of FIG. 4, six extreme values 301 to 306 are thus obtained. If the intervals between extreme values vary sharply, it is preferable to smooth out the variation. For this purpose, an extreme value is deleted when the interval between extreme values is extremely short. Conversely, an appropriate extreme value is inserted when the interval between extreme values is extremely long.
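One possible reading of the extreme value processing is sketched below. The text does not quantify “extremely short” or “extremely long”, so the thresholds min_gap and max_gap are hypothetical parameters, as are the function names. Flat plateaus are not treated as extrema in this sketch, and only the gap to the previously kept extremum is checked when deleting.

```python
def find_extrema(d):
    """Indices of local maxima and minima of d, plus the first and last index."""
    idx = [0]
    for i in range(1, len(d) - 1):
        left, right = d[i] - d[i - 1], d[i + 1] - d[i]
        if left * right < 0:          # slope changes sign at i
            idx.append(i)
    if len(d) > 1:
        idx.append(len(d) - 1)
    return idx


def regularize_extrema(idx, min_gap, max_gap):
    """Delete extrema that follow too closely, then fill overly long gaps."""
    kept = [idx[0]]
    for i in idx[1:-1]:
        if i - kept[-1] >= min_gap:   # keep only if far enough from the last kept
            kept.append(i)
    if len(idx) > 1:
        kept.append(idx[-1])          # the end point is always kept
    out = [kept[0]]
    for nxt in kept[1:]:
        start, gap = out[-1], nxt - out[-1]
        if gap > max_gap:
            parts = -(-gap // max_gap)            # ceiling division
            # Insert evenly spaced pseudo-extrema so no gap exceeds max_gap
            out.extend(start + gap * p // parts for p in range(1, parts))
        out.append(nxt)
    return out
```

Inserting evenly spaced indices is only one way to choose an “appropriate” extreme value; the source does not say how the inserted position is selected.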

Next, for the inter-frame distances “D = {d(t2), d(t3), ..., d(tN)}”, the temporal feature amount calculation unit 13 interpolates or thins out the inter-frame distance values so that the number of frames (the number of distance values) between adjacent extreme values becomes a fixed number (a value that can be set arbitrarily). The adjusted inter-frame distances are taken as the temporal feature amount “F = {f(1), f(2), ..., f(N')}”. This temporal feature amount F has the effect of not distinguishing motions of the same shape by their speed.
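The adjustment to a fixed number of values per segment can be sketched as below. Linear interpolation is an assumption, since the text only says the values are interpolated or thinned out. The function name and the samples_per_segment parameter are hypothetical, and the extrema argument is assumed to be a sorted list of indices into d that includes the first and last index.

```python
def resample_segments(d, extrema, samples_per_segment):
    """Resample d so every span between adjacent extrema has the same length.

    Each segment contributes samples_per_segment values (start included,
    end excluded); the value at the final extremum is appended once.
    """
    out = []
    for a, b in zip(extrema, extrema[1:]):
        for s in range(samples_per_segment):
            pos = a + (b - a) * s / samples_per_segment  # fractional index
            lo = int(pos)
            hi = min(lo + 1, len(d) - 1)
            frac = pos - lo
            out.append(d[lo] * (1 - frac) + d[hi] * frac)  # linear interpolation
    out.append(d[extrema[-1]])
    return out
```

In this sketch a slow motion with distances [0, 2, 4, 3, 2, 1, 0] and a faster motion with distances [0, 4, 2, 0] both resample to [0, 1, 2, 3, 4, 3, 2, 1, 0] with four samples per segment, which illustrates the speed independence described above.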

[Similarity calculation unit]
The similarity calculation unit 14 calculates, for the two sets of motion capture data input by the input unit 11, the similarity between the motion of the person represented by one set and the motion of the person represented by the other. This similarity calculation uses, for each set of motion capture data, the spatial feature amount calculated by the spatial feature amount calculation unit 12 and the temporal feature amount calculated by the temporal feature amount calculation unit 13. The similarity calculation method according to the present embodiment is described below.

Here, for the first motion capture data, the spatial feature amount is the histogram H1 = {h1(1), h1(2), ..., h1(64)} and the temporal feature amount is the inter-frame distance sequence F1 = {f1(1), f1(2), ..., f1(N)}. For the second motion capture data, the spatial feature amount is the histogram H2 = {h2(1), h2(2), ..., h2(64)} and the temporal feature amount is the inter-frame distance sequence F2 = {f2(1), f2(2), ..., f2(M)}.

First, the similarity calculation unit 14 calculates the distance between histogram H1 and histogram H2, the spatial feature amounts of the first and second motion capture data. This distance can be calculated using, for example, the distance definition known as EMD (Earth Mover's Distance) described in Non-Patent Document 4. The resulting inter-histogram distance is denoted EMD(H1, H2). A smaller EMD value indicates a greater similarity.
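As a rough illustration of EMD, the sketch below covers only the simplest case: two histograms whose bins form a one-dimensional ordered sequence, with |i - j| as the ground distance between bins. This section does not state the bin layout or ground distance of the 64-bin histograms, so both are assumptions, and the general formulation of Non-Patent Document 4 is not reproduced. The function name `emd_1d` is hypothetical.

```python
def emd_1d(h1, h2):
    """EMD between two equal-length 1-D histograms.

    Each histogram is normalised to unit mass; the ground distance between
    bins i and j is assumed to be |i - j|.
    """
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    s1, s2 = sum(h1), sum(h2)
    carried = 0.0  # mass that still has to be moved past the current bin
    total = 0.0
    for a, b in zip(h1, h2):
        carried += a / s1 - b / s2
        total += abs(carried)
    return total

far = emd_1d([1, 0, 0], [0, 0, 1])   # all mass moved across two bins
same = emd_1d([2, 1, 1], [2, 1, 1])  # identical histograms
```

Identical histograms give a distance of zero, and the value grows as mass has to travel further, which is consistent with smaller values indicating greater similarity.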

Next, the similarity calculation unit 14 calculates the distance D1(F1, F2) by Equation (10), using the temporal feature amounts (inter-frame distances F1 and F2) of the first and second motion capture data.

The similarity calculation unit 14 also calculates the distance D2(F1, F2) by Equation (11), using the same temporal feature amounts F1 and F2. The distance D2(F1, F2) is calculated with the frame at which the distance calculation starts shifted by K frames (K being a value that can be set arbitrarily) in whichever of the first and second motion capture data is temporally longer.

The similarity calculation unit 14 takes the smaller of D1(F1, F2) and D2(F1, F2) (the one indicating the greater similarity) as the distance D(F1, F2) between the temporal feature amounts.
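The selection between the unshifted and shifted distances can be sketched as below. Equations (10) and (11) are not reproduced in this text, so the mean absolute difference over the overlapping length is only a stand-in chosen for illustration; the function names are hypothetical.

```python
def mean_abs_diff(f_a, f_b):
    """Stand-in distance: mean absolute difference over the overlapping part."""
    n = min(len(f_a), len(f_b))
    return sum(abs(x - y) for x, y in zip(f_a, f_b)) / n

def temporal_distance(f1, f2, k):
    """Smaller of the unshifted distance and the distance with a K-frame shift.

    The shift is applied to the longer of the two sequences.
    """
    longer, shorter = (f1, f2) if len(f1) >= len(f2) else (f2, f1)
    if not 0 <= k < len(longer):
        raise ValueError("k must index into the longer sequence")
    d1 = mean_abs_diff(longer, shorter)      # role of D1
    d2 = mean_abs_diff(longer[k:], shorter)  # role of D2
    return min(d1, d2)

# The longer sequence has one extra leading value; shifting by 1 aligns them.
D = temporal_distance([9.0, 1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0], 1)
```

Taking the minimum means that a motion which merely starts a few frames late in the longer sequence is not penalised for the offset.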

Next, the similarity calculation unit 14 calculates the similarity by Equation (12), using the distance between the spatial feature amounts (the inter-histogram distance EMD(H1, H2)) and the distance D(F1, F2) between the temporal feature amounts.

Here, α is a weighting coefficient that can be set arbitrarily in the range from 0 to 1.
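Equation (12) is not reproduced in this text. Claim 5 describes the result as a weighted average of the two distances, so one plausible reading is sketched below; which term α multiplies, and whether the score is further transformed, are assumptions. The function name `combined_score` is hypothetical.

```python
def combined_score(emd_h, d_f, alpha):
    """Weighted average of the spatial distance and the temporal distance.

    Assumption: alpha weights the spatial term and (1 - alpha) the temporal
    term. Under this reading a smaller score means more similar motions.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * emd_h + (1.0 - alpha) * d_f

balanced = combined_score(2.0, 4.0, 0.5)
spatial_only = combined_score(2.0, 4.0, 1.0)
```

Setting α to 0 or 1 reduces the score to the temporal or the spatial distance alone.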

[Output unit]
The output unit 15 outputs the similarity calculated by the similarity calculation unit 14. The similarity of motion for the two sets of motion capture data input by the input unit 11 is thereby output.

The embodiment described above performs the principal component analysis on motion capture data. An example that performs the principal component analysis on other types of motion data is described below with reference to FIG. 5. FIG. 5 is a schematic diagram showing the procedure of a principal component analysis method according to an embodiment of the present invention.

(Step S1) The input unit 11 inputs RGB image data or distance image data as the motion data. RGB image data is moving image data captured by an RGB camera, in which the pixel values of each frame are RGB pixel values. Distance image data is moving image data carrying depth information acquired by a depth sensor system, in which the pixel values of each frame are depth values.

(Step S2) The physical quantity changing unit 21 stores the motion data input by the input unit 11. As shown in FIG. 5, this motion data consists of N frames. The size M of the i-th frame (1 ≤ i ≤ N), that is, its number of pixels, is the product of the frame's vertical pixel count "height" and horizontal pixel count "width": M = height × width. This makes the method applicable to motion data composed of images of any size.

(Step S3) The physical quantity changing unit 21 generates a matrix B'. The columns of B' are the pixel value vectors of the N frames, arranged in the time-series order of the frames in the motion data. The i-th column vector of B' is therefore the pixel value vector Vi of the i-th frame. The elements of Vi are the pixel values of the M pixels that make up the i-th frame, arranged in a predetermined order based on each pixel's position in the frame. B' is thus a matrix of M rows and N columns.
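The construction of B' in step S3 can be sketched as follows. The source says only that pixels are arranged in "a predetermined order"; row-major order, single-valued pixels, and the function name `build_pixel_matrix` are assumptions of this sketch.

```python
def build_pixel_matrix(frames):
    """frames: N frames, each a list of rows of pixel values.

    Returns B' with M rows and N columns, where column i holds the pixel
    value vector of frame i, flattened in row-major order (an assumption).
    """
    columns = [[px for row in frame for px in row] for frame in frames]
    m = len(columns[0])
    if any(len(c) != m for c in columns):
        raise ValueError("all frames must have the same number of pixels")
    return [[col[r] for col in columns] for r in range(m)]

# Three hypothetical 2 x 2 frames, so M = 4 and N = 3.
frames = [[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]]
B_prime = build_pixel_matrix(frames)
```

Each column of the result is one frame, so reading along a row follows a single pixel through time.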

(Step S4) The principal component analysis unit 22 calculates the eigenvalues λi (i = 1, 2, ...) and eigenvectors vi (i = 1, 2, ...) of the variance-covariance matrix S of the matrix B'. The variance-covariance matrix S has M rows and M columns.

(Step S5) Using the eigenvalues λi and eigenvectors vi of the variance-covariance matrix S of the matrix B', the principal component analysis unit 22 generates, for each eigenvalue, time-series data converted into values in the principal component space (principal component scores), that is, a time series of principal component scores. Each time series of principal component scores consists of N values, the same as the number of frames in the motion data.

The histogram calculation unit 23 calculates a histogram using the top r time series among the time series of the i-th principal component scores (i = 1, 2, ...) generated by the principal component analysis unit 22.

Next, an example of step S4 in FIG. 5 is described. Assuming, for example, simple shooting with a camera built into a mobile phone, the size (number of pixels) of a moving image is typically about M = height × width = 320 × 240 ≈ 80,000 pixels, and the number of frames (length of the time series) is about N = 300. That is, M > N is expected. In this case, to reduce the amount of computation, it is preferable not to solve the eigenvalue problem of the M-row, M-column variance-covariance matrix S directly but, as described below, to solve the eigenvalue problem of an N-row, N-column matrix C and, from the result, to calculate only a predetermined number of the eigenvalues and eigenvectors of S, starting from the largest eigenvalue. This example is described below.

The M-row, N-column matrix B' obtained in step S3 is expressed by Equation (13).

The row mean of this matrix B' (the mean over the frame direction) is expressed by Equation (14) as an M-row, N-column matrix mean.

Then, the matrix B is defined by Equation (15): rows are extracted from the matrix B' one at a time, the mean vector in the corresponding row of the matrix mean is subtracted from each, and the results are joined in the row direction.

Using this matrix B, the variance-covariance matrix S is expressed by Equation (16).

The eigenvalue equation is then given by Equation (17).

Multiplying both sides of the eigenvalue equation (17) from the left by the matrix B and setting Bv = u gives Equation (18).

Equation (18) is the eigenvalue equation of the matrix (1/N)BB^T (denoted matrix C). The eigenvector u of this matrix C is obtained first, and the eigenvector v of the variance-covariance matrix S is then obtained from u. The matrix C has N rows and N columns, and since the number of frames N is generally far smaller than the dimension M of the pixel value vector, this requires far less computation than obtaining the eigenvalues directly from the variance-covariance matrix S.

To obtain the eigenvector v of the variance-covariance matrix S from the eigenvector u, Equation (18) is multiplied from the left by the matrix B^T. This gives Equation (19).

Equation (19) shows that B^T u is an eigenvector v of the variance-covariance matrix S. It is not yet normalized, however; after normalization, the eigenvector v of the variance-covariance matrix S is given by Equation (20).
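Equations (16) to (20) are not reproduced in this text. The chain below is a reconstruction from the surrounding prose, not the original equations. It assumes that B holds one mean-subtracted frame per row (N rows, M columns), which is the arrangement under which S has M rows and M columns and C has N rows and N columns as stated. The tags follow the order in which the prose cites the equations.

```latex
\begin{align}
S &= \frac{1}{N} B^{\mathsf T} B \tag{16}\\
S v &= \lambda v \tag{17}\\
\frac{1}{N} B B^{\mathsf T} u &= \lambda u, \qquad u = B v \tag{18}\\
\frac{1}{N} B^{\mathsf T} B \left(B^{\mathsf T} u\right) &= \lambda \left(B^{\mathsf T} u\right) \tag{19}\\
v &= \frac{B^{\mathsf T} u}{\lVert B^{\mathsf T} u \rVert} \tag{20}
\end{align}
```

Under this reading, the left side of (19) is S applied to B^T u, so B^T u satisfies the same eigenvalue equation as v in (17) with the same eigenvalue λ.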

In this way, the eigenvector u of the N-row, N-column matrix C is obtained, and from it the eigenvector v of the M-row, M-column variance-covariance matrix S can be obtained. Of the M eigenvalues of S, those from the N-th largest onward are regarded as 0. With this method, principal component analysis is computationally feasible even for large moving images of the kind generally expected.
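The procedure can be tried on a toy example with the sketch below, which finds only the leading eigenpair. Power iteration is used here purely to keep the sketch free of external libraries; the source does not name an eigenvalue solver. The sketch assumes one flattened frame per row and at least two distinct frames, and the function name is hypothetical.

```python
import math

def top_eigenpair_via_small_matrix(frames, iterations=500):
    """frames: N flattened frames of M pixels each (N much smaller than M).

    Returns (eigenvalue, unit eigenvector v) of the M x M covariance S,
    computed through the N x N matrix C = (1/N) B B^T.
    """
    n, m = len(frames), len(frames[0])
    mean = [sum(f[j] for f in frames) / n for j in range(m)]
    b = [[f[j] - mean[j] for j in range(m)] for f in frames]  # N x M
    c = [[sum(x * y for x, y in zip(bi, bk)) / n for bk in b] for bi in b]
    # Power iteration for the dominant eigenvector u of C. The start vector
    # must not be constant, because constant vectors are in C's null space.
    u = [float(i + 1) for i in range(n)]
    for _ in range(iterations):
        w = [sum(cij * uj for cij, uj in zip(row, u)) for row in c]
        norm = math.sqrt(sum(x * x for x in w))
        u = [x / norm for x in w]
    lam = sum(ui * sum(cij * uj for cij, uj in zip(row, u))
              for ui, row in zip(u, c))
    # Map back to pixel space: v is B^T u, normalised to unit length.
    v = [sum(b[i][j] * u[i] for i in range(n)) for j in range(m)]
    v_norm = math.sqrt(sum(x * x for x in v))
    return lam, [x / v_norm for x in v]

# Four hypothetical frames of five pixels each.
frames = [[0.0, 0.0, 0.0, 0.0, 0.0], [1.0, 2.0, 0.0, 1.0, 3.0],
          [2.0, 4.0, 1.0, 2.0, 6.0], [3.0, 6.0, 0.0, 3.0, 9.0]]
lam, v = top_eigenpair_via_small_matrix(frames)
```

Only N x N products are formed before the final mapping back to pixel space, which is where the saving comes from when M is on the order of tens of thousands and N is in the hundreds.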

The image resolution may be lowered to reduce M, the frame rate may be lowered to reduce N, or both may be done. When the moving image is such that M < N, the eigenvalue problem of the M-row, M-column variance-covariance matrix S may be solved directly.

According to the embodiment described above, a spatial feature amount and a temporal feature amount are calculated for each of two sets of motion data, and the similarity of motion is calculated using the distance between the spatial feature amounts and the distance between the temporal feature amounts. As a result, the similarity of motion can be calculated in a common manner even for motion capture data acquired by different motion capture systems. It can likewise be calculated in a common manner even when the shooting conditions under which the motion capture data was acquired (the person who is the subject, the shooting distance, the shooting angle, and so on) differ, and even when the types of input motion data (for example, human skeleton type motion data, motion capture data acquired by a depth sensor system, RGB image data, or distance image data) differ. The present embodiment thus makes it possible to improve robustness when calculating the similarity of motion between motion data.

The present embodiment also provides the following effect. Techniques for moving a computer graphics (CG) character in step with a user's movements are known. Adding the user's individual movements to the database on which the CG character's movements are based can be expected to make the character's movements feel more familiar. One way to do this is to recognize both movements using the similarity between the user's movement and the movement in the motion capture data in the database, and to add the user's individual movement to the database. According to the present embodiment, the user's movement may be acquired with any motion capture system, which makes this very convenient for the user.

Although an embodiment of the present invention has been described in detail above with reference to the drawings, the specific configuration is not limited to this embodiment and includes design changes and the like within a scope that does not depart from the gist of the present invention.

The motion similarity calculation process may also be performed by recording a program for realizing the functions of the motion similarity calculation device 1 shown in FIG. 1 on a computer-readable recording medium, and having a computer system read and execute the program recorded on the recording medium. The "computer system" referred to here may include an OS and hardware such as peripheral devices.
The "computer-readable recording medium" refers to a writable nonvolatile memory such as a flexible disk, a magneto-optical disk, a ROM, or a flash memory, a portable medium such as a DVD (Digital Versatile Disk), or a storage device such as a hard disk built into a computer system.

The "computer-readable recording medium" also includes media that hold a program for a certain period of time, such as the volatile memory (for example, DRAM (Dynamic Random Access Memory)) inside a computer system that serves as a server or client when the program is transmitted over a network such as the Internet or a communication line such as a telephone line.
The program may be transmitted from a computer system that stores it in a storage device or the like to another computer system via a transmission medium, or by transmission waves in the transmission medium. Here, the "transmission medium" that transmits the program is a medium having the function of transmitting information, such as a network (communication network) like the Internet or a communication line such as a telephone line.
The program may realize only part of the functions described above. It may also be a so-called difference file (difference program), which realizes the functions described above in combination with a program already recorded in the computer system.

Description of symbols: 1 ... motion similarity calculation device, 11 ... input unit, 12 ... spatial feature amount calculation unit, 13 ... temporal feature amount calculation unit, 14 ... similarity calculation unit, 15 ... output unit, 21 ... physical quantity changing unit, 22 ... principal component analysis unit, 23 ... histogram calculation unit

Claims

A spatial feature amount calculating unit that calculates a spatial feature amount representing a spatial distribution of motion in the principal component space from the motion data;
A temporal feature amount calculating unit that calculates a temporal feature amount representing a temporal distribution of motion in the principal component space from the motion data;
A similarity calculation unit that calculates the similarity of motion using the distance between the spatial feature amount calculated for each of the two motion data and the distance between the temporal feature amounts;
A motion similarity calculation device comprising:

The spatial feature amount calculation unit calculates a histogram using distances between frames with respect to the principal components of motion of each frame in the motion data.
The motion similarity calculation apparatus according to claim 1, wherein:

The temporal feature amount calculation unit calculates a distance between a reference frame in the motion data and each of the other frames, obtains an extreme value from the calculated inter-frame distance, and determines the number of frames between adjacent extreme values. Interpolate or decimate inter-frame distance values to be a constant,
The motion similarity calculation apparatus according to claim 1, wherein

The temporal feature amount calculation unit calculates a distance between a reference frame in the motion data and each of the other frames, obtains extreme values from the calculated inter-frame distances, deletes an extreme value when the interval between extreme values is extremely short, and inserts an appropriate extreme value when the interval between extreme values is extremely long.
The motion similarity calculation apparatus according to any one of claims 1 to 3, wherein

The similarity calculation unit calculates a weighted average value for the distance between the spatial feature and the distance between the temporal features.
The motion similarity calculation apparatus according to claim 1, wherein:

Calculating a spatial feature amount representing a spatial distribution of motion in the principal component space from the motion data;
Calculating a temporal feature amount representing a temporal distribution of motion in the principal component space from the motion data;
Calculating a similarity of motion using a distance between the spatial feature amount calculated for each of the two motion data and a distance between the temporal feature amounts;
A motion similarity calculation method comprising:

Calculating a spatial feature amount representing a spatial distribution of motion in the principal component space from the motion data;
Calculating a temporal feature amount representing a temporal distribution of motion in the principal component space from the motion data;
Calculating a similarity of motion using a distance between the spatial feature amount calculated for each of the two motion data and a distance between the temporal feature amounts;
A computer program for causing a computer to execute.